Gaussian Mixture Models (GMMs) are widely used by scientists, e.g. in statistics toolkits and data mining procedures. GMM parameters are usually estimated by Maximum Likelihood (ML) training, more precisely by the Expectation-Maximization (EM) algorithm. Many tasks now work with huge datasets, which makes the estimation process time-consuming (mainly for complex mixture models containing hundreds of components). This paper presents an efficient and robust GPU implementation of the estimation of the GMM statistics used in the EM algorithm, based on NVIDIA's Compute Unified Device Architecture (CUDA). An augmentation of the standard CPU version that utilizes SSE instructions is also proposed. The running times of the presented methods are tested on a large dataset of real speech data from the NIST Speaker Recognition Evaluation (SRE) 2008. Estimation on the GPU proves to be more than 400 times faster than the standard CPU version and 130 times faster than the SSE version; this speed-up is achieved without any approximations in the estimation formulas. The proposed implementation was also compared with implementations developed by other groups worldwide and proved to be the fastest (at least 5 times faster than the best recently published implementation).
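
The abstract does not spell out the statistics it refers to. As a rough illustration only, the sketch below accumulates the usual zeroth-, first- and second-order statistics in one EM pass. It assumes diagonal covariance matrices, which the abstract does not state, and it is a plain single-threaded Python reference, not the CUDA or SSE implementation described in the paper. The function names `log_gauss_diag`, `accumulate_stats` and `update_params`, and the `var_floor` parameter, are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at point x (assumed model)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def accumulate_stats(data, weights, means, variances):
    """E-step: accumulate per-component statistics over all feature vectors.

    Returns (n, f, s, loglik), where n[m] is the sum of posteriors,
    f[m][d] the posterior-weighted sum of x[d], s[m][d] the
    posterior-weighted sum of x[d]**2, and loglik the total log-likelihood.
    """
    num_comp, dim = len(weights), len(data[0])
    n = [0.0] * num_comp
    f = [[0.0] * dim for _ in range(num_comp)]
    s = [[0.0] * dim for _ in range(num_comp)]
    loglik = 0.0
    for x in data:
        logp = [
            math.log(weights[m]) + log_gauss_diag(x, means[m], variances[m])
            for m in range(num_comp)
        ]
        # Log-sum-exp keeps the normalization numerically stable.
        top = max(logp)
        lse = top + math.log(sum(math.exp(lp - top) for lp in logp))
        loglik += lse
        for m in range(num_comp):
            gamma = math.exp(logp[m] - lse)  # posterior of component m given x
            n[m] += gamma
            for d in range(dim):
                f[m][d] += gamma * x[d]
                s[m][d] += gamma * x[d] * x[d]
    return n, f, s, loglik


def update_params(n, f, s, var_floor=1e-6):
    """M-step: turn accumulated statistics into new weights, means, variances.

    Assumes every n[m] > 0; var_floor is a hypothetical safeguard.
    """
    total = sum(n)
    weights = [nm / total for nm in n]
    means = [[fd / nm for fd in fm] for fm, nm in zip(f, n)]
    variances = [
        [max(sd / nm - mu * mu, var_floor) for sd, mu in zip(sm, mum)]
        for sm, mum, nm in zip(s, means, n)
    ]
    return weights, means, variances
```

In this sketch the cost of `accumulate_stats` grows with the product of the number of feature vectors, the number of components and the feature dimension, which is why large datasets and mixtures with hundreds of components make this step the natural target for acceleration. Each feature vector is processed independently before its contribution is summed, so the loop over `data` is the part that lends itself to parallel execution.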