In this letter, we propose a discriminative modeling approach for the speaker verification problem that uses polynomial kernel support vector machines (PK-SVMs). The proposed approach is rooted in an equivalence relationship between the state-of-the-art probabilistic linear discriminant analysis (PLDA) and second-degree polynomial kernel methods. We present two techniques for overcoming the memory and computational challenges that PK-SVMs pose. The first of these, a kernel evaluation simplification trick, eliminates the need to explicitly compute dot products for a huge number of training samples. The second technique makes use of the massively parallel processing power of modern graphics processing units (GPUs). We performed experiments on the Phase I speaker verification track of the DARPA-sponsored Robust Automatic Transcription of Speech (RATS) program. We found that, in the multi-session enrollment experiments, second-degree PK-SVMs outperformed PLDA across all tasks in terms of the official evaluation metric, and third- and fourth-degree PK-SVMs provided a further performance improvement over the second-degree PK-SVMs. Furthermore, for the “30s-30s” task, a linear score combination of the PLDA and PK-SVM based systems provided a 27% improvement relative to the PLDA baseline in terms of the official evaluation metric.
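The link between second-degree polynomial kernels and quadratic scoring functions can be illustrated with a small numerical check. The sketch below is a generic illustration of the standard degree-2 kernel identity, not the letter's kernel evaluation simplification, whose details are not given here. It assumes, purely for illustration, that a verification trial is represented by concatenating an enrollment vector and a test vector; the function names, the dimension, and the random data are all hypothetical.

```python
import numpy as np


def poly_kernel(z1, z2, degree=2, coef0=1.0):
    """Polynomial kernel (z1 . z2 + coef0) ** degree."""
    return (np.dot(z1, z2) + coef0) ** degree


def explicit_degree2_map(z):
    """Explicit feature map for the degree-2 kernel with coef0 = 1.

    The map stacks all pairwise products z_i * z_j, the scaled linear
    terms sqrt(2) * z_i, and a constant 1, so that the dot product of
    two mapped vectors equals (z1 . z2 + 1) ** 2.
    """
    pairwise = np.outer(z, z).ravel()
    return np.concatenate([pairwise, np.sqrt(2.0) * z, [1.0]])


# Hypothetical data: each trial is an (enrollment, test) pair of
# low-dimensional vectors, concatenated into one vector (an assumption).
rng = np.random.default_rng(0)
dim = 4
trial_a = np.concatenate([rng.standard_normal(dim), rng.standard_normal(dim)])
trial_b = np.concatenate([rng.standard_normal(dim), rng.standard_normal(dim)])

# Kernel evaluation: one dot product in the original 2 * dim space.
kernel_value = poly_kernel(trial_a, trial_b)

# Explicit evaluation: dot product in the (2*dim)**2 + 2*dim + 1 space.
mapped_a = explicit_degree2_map(trial_a)
mapped_b = explicit_degree2_map(trial_b)
explicit_value = np.dot(mapped_a, mapped_b)

print(np.isclose(kernel_value, explicit_value))
```

Because any quadratic function of the concatenated trial vector is linear in the explicit features, a linear classifier in that feature space can represent a quadratic score over the pair. The kernel form reaches the same inner product without ever building the much larger explicit vectors.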