
Discriminatively Trained GMMs for Language Classification Using Boosting Methods


3 Author(s): Man-Hung Siu; Xi Yang; Herbert Gish (Speech & Language Processing Dept., BBN Technologies, Cambridge, MA)

In language identification and other speech applications, discriminatively trained models often outperform nondiscriminative models trained with the maximum-likelihood criterion. For instance, discriminative Gaussian mixture models (GMMs) are typically trained by optimizing discriminative criteria that can be computationally expensive and complex to implement. In this paper, we explore a novel approach to discriminative GMM training that uses a variant of the boosting framework (R. Schapire, "The boosting approach to machine learning, an overview," Proc. MSRI Workshop on Nonlinear Estimation and Classification, 2002) from machine learning, in which an ensemble of GMMs is trained sequentially. We have extended the purview of boosting to class-conditional models (as opposed to discriminative models such as classification trees). The effectiveness of our boosting variation comes from its emphasis on the misclassified data, which yields discriminatively trained models. Our variant of boosting also utilizes low-confidence classifications, in addition to misclassified examples, in classifier generation. We further apply our boosting approach to anti-models to achieve additional performance gains. We have applied our discriminative training approach to a variety of language identification experiments on the 12-language NIST 2003 language identification task, and we show that significant performance improvements can be obtained. The experiments include both acoustic and token-based speech models. Our best-performing boosted GMM-based system on the 12-language verification task has a 2.3% EER.
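The paper's exact training procedure is not given in the abstract, but the core idea it describes, sequentially fitting class-conditional GMMs while reweighting the data that the current ensemble misclassifies, can be sketched as follows. This is a minimal illustration using boosting-by-resampling with scikit-learn GMMs on synthetic two-class data; all data, parameter choices, and the reweighting details are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy two-"language" data: each class is a 2-D Gaussian cloud.
# (Stand-in for per-frame acoustic features; purely illustrative.)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(200, 2)),
               rng.normal([1.5, 1.5], 1.0, size=(200, 2))])
y = np.array([0] * 200 + [1] * 200)

n_rounds = 5
weights = np.full(len(X), 1.0 / len(X))
ensemble = []  # list of (alpha, [gmm_class0, gmm_class1])

for _ in range(n_rounds):
    # Boosting by resampling: draw a weighted bootstrap sample, then
    # fit one ML-trained GMM per class on its part of the sample.
    idx = rng.choice(len(X), size=len(X), p=weights)
    models = [GaussianMixture(n_components=2, random_state=0)
              .fit(X[idx[y[idx] == c]]) for c in (0, 1)]

    # Classify by comparing per-class log-likelihoods.
    ll = np.column_stack([m.score_samples(X) for m in models])
    pred = ll.argmax(axis=1)

    # AdaBoost-style classifier weight from the weighted error.
    err = weights[pred != y].sum()
    err = min(max(err, 1e-10), 0.5 - 1e-10)  # keep alpha finite
    alpha = 0.5 * np.log((1 - err) / err)
    ensemble.append((alpha, models))

    # Reweight: emphasize the misclassified examples next round.
    weights *= np.exp(alpha * (pred != y))
    weights /= weights.sum()

# Ensemble decision: alpha-weighted sum of per-class log-likelihoods.
score = sum(a * np.column_stack([m.score_samples(X) for m in ms])
            for a, ms in ensemble)
accuracy = (score.argmax(axis=1) == y).mean()
print(f"training accuracy: {accuracy:.3f}")
```

The paper additionally weights low-confidence (not just misclassified) examples and boosts anti-models; those refinements are omitted here for brevity.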

Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 17, Issue: 1)