Accommodating sample size effect on similarity measures in speaker clustering

Authors:

Alexander Haubold (Department of Computer Science, Columbia University); John R. Kender

We investigate the symmetric Kullback-Leibler (KL2) distance in speaker clustering and its unreported effects for differently sized feature matrices. Speaker data are represented as Mel-frequency cepstral coefficient (MFCC) vectors, and features are compared using the KL2 metric to form clusters of speech segments for each speaker. We make two observations about clustering based on KL2: (1) the accuracy of clustering depends strongly on the absolute lengths of the speech segments and their extracted feature vectors; (2) the accuracy of the similarity measure strongly degrades with the length of the shorter of the two speech segments. These length effects can be attributed to the measure of covariance used in KL2. We demonstrate an empirical correction of this sample-size effect that increases clustering accuracy. We draw parallels to two vector quantization (VQ) based similarity measures, one of which exhibits an equivalent sample-size effect, while the other is less influenced by it.
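The abstract does not state the KL2 formula it uses. As a minimal sketch, the code below assumes each speech segment is modelled as a single full-covariance Gaussian over its MFCC frames and applies the standard closed-form symmetric KL divergence between two Gaussians. The function name `kl2_gaussian`, the `ridge` regularizer, and the full-covariance choice are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def kl2_gaussian(x, y, ridge=1e-6):
    """Symmetric KL divergence between Gaussians fitted to two feature matrices.

    x, y: arrays of shape (n_frames, n_coeffs), e.g. MFCC vectors per frame.
    ridge: small diagonal term (an assumption of this sketch) that keeps the
           covariance estimates invertible for short segments.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or y.ndim != 2 or x.shape[1] != y.shape[1]:
        raise ValueError("x and y must be 2-D with the same number of columns")
    if x.shape[0] < 2 or y.shape[0] < 2:
        raise ValueError("each segment needs at least two frames")

    d = x.shape[1]
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)

    # Sample covariances; these are the terms that become noisy for short segments.
    cov_x = np.cov(x, rowvar=False) + ridge * np.eye(d)
    cov_y = np.cov(y, rowvar=False) + ridge * np.eye(d)
    inv_x = np.linalg.inv(cov_x)
    inv_y = np.linalg.inv(cov_y)

    # KL(p||q) + KL(q||p): the log-determinant terms cancel in the sum.
    trace_term = np.trace(inv_y @ cov_x) + np.trace(inv_x @ cov_y) - 2 * d
    diff = mu_x - mu_y
    mean_term = diff @ (inv_x + inv_y) @ diff
    return 0.5 * (trace_term + mean_term)


if __name__ == "__main__":
    # Synthetic stand-in for MFCC data: both "segments" come from the same source.
    rng = np.random.default_rng(0)
    short_a, short_b = rng.normal(size=(30, 12)), rng.normal(size=(30, 12))
    long_a, long_b = rng.normal(size=(2000, 12)), rng.normal(size=(2000, 12))
    print("KL2, 30 frames each:  ", kl2_gaussian(short_a, short_b))
    print("KL2, 2000 frames each:", kl2_gaussian(long_a, long_b))
```

Under these assumptions, two short segments drawn from the same source would be expected to score a noticeably larger KL2 than two long ones, because covariance estimates from few frames are noisy. That behaviour is consistent with the length dependence the abstract attributes to the covariance term, though the sketch does not reproduce the authors' experiments or their empirical correction.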

Published in:

2008 IEEE International Conference on Multimedia and Expo

Date of Conference:

June 23-26, 2008