I. Introduction
The objective of speaker identification (SI) is to determine which voice sample from a set of known voice samples best matches the characteristics of an unknown input voice sample [1]. SI is a two-stage procedure consisting of training and testing. In the training stage, speaker-dependent feature vectors are extracted from a training speech signal, and a speaker model is built from each speaker's feature vectors. SI systems typically use Mel-frequency cepstral coefficients (MFCCs) as the feature vectors and a Gaussian mixture model (GMM) of the feature vectors as the speaker model. The GMM is parameterized by the set $\lambda=\{w_{i},\boldsymbol{\mu}_{i},\boldsymbol{\Sigma}_{i}\}$, where $w_{i}$ are the weights, $\boldsymbol{\mu}_{i}$ are the mean vectors, and $\boldsymbol{\Sigma}_{i}$ are the covariance matrices of the Gaussian component densities of the GMM. In the SI testing stage, feature vectors are extracted from a test signal (speaker unknown) and scored against all speaker models using a log-likelihood calculation, and the most likely speaker identity is decided according to
$$\hat{s}=\arg\max_{1\leq s\leq S}\sum_{m=1}^{M^{\prime}}\log p\left({\bf x}_{m}^{\rm test}\vert\lambda_{s}\right),\tag{1}$$
where ${\bf x}_{m}^{\rm test}$ is the $m$th of the $M^{\prime}$ test feature vectors and $\lambda_{s}$ is the model of speaker $s$ among the $S$ known speakers.
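The decision rule in (1) can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes diagonal covariance matrices (a common simplification, whereas the text refers to general covariance matrices), strictly positive component weights, and a hypothetical dictionary layout for each speaker model. The function names and toy values are illustrative only, and the returned index is 0-based rather than running from 1 to S.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance, evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v
        for xi, mu, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, model):
    """log p(x | lambda) for one feature vector, using log-sum-exp for stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, mean, var)  # assumes w > 0
        for w, mean, var in zip(
            model["weights"], model["means"], model["variances"]
        )
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify_speaker(test_vectors, speaker_models):
    """Sum per-frame log-likelihoods for each model and take the arg max, as in (1)."""
    scores = [
        sum(gmm_log_likelihood(x, model) for x in test_vectors)
        for model in speaker_models
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy two-dimensional "speaker models" (hypothetical values, not trained on speech).
speaker_models = [
    {
        "weights": [0.5, 0.5],
        "means": [[0.0, 0.0], [1.0, 1.0]],
        "variances": [[1.0, 1.0], [1.0, 1.0]],
    },
    {
        "weights": [0.5, 0.5],
        "means": [[5.0, 5.0], [6.0, 6.0]],
        "variances": [[1.0, 1.0], [1.0, 1.0]],
    },
]

# Toy test vectors standing in for MFCC frames from an unknown speaker.
test_vectors = [[5.2, 4.9], [5.8, 6.1], [5.5, 5.4]]

best, scores = identify_speaker(test_vectors, speaker_models)
print("identified speaker index:", best)
```

In this toy example the test vectors lie close to the means of the second model, so that model receives the larger total log-likelihood and is selected. Summing log-likelihoods over frames corresponds to treating the test feature vectors as independent given the speaker model.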