Skip to Main Content
The decision-making process of many binary classification systems is based on the likelihood ratio (LR) scores of test patterns. This paper shows that LR scores can be expressed in terms of the similarity between the supervectors (SVs) formed by stacking the mean vectors of Gaussian mixture models corresponding to the test patterns, the target model, and the background model. By interpreting the support vector machine (SVM) kernels as a specific similarity (or discriminant) function between SVs, this paper shows that LR scoring is a special case of SVM scoring and that most sequence kernels can be obtained by assuming a specific form for the similarity function of SVs. This paper further shows that this assumption can be relaxed to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate the positive and negative classes using either regression analysis or SVM training. The idea was applied to both high-and low-level speaker verification. In both cases, results show that the proposed kernels achieve better performance than several state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.