A linear projection approach to environment modeling for robust speech recognition

5 Author(s)
Yu Tsao (Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan); Chien-Lin Huang; Matsuda, S.; Hori, C.

A linear projection (LP) function that transforms multiple sets of acoustic models into a single set of acoustic models is proposed for characterizing testing environments in robust automatic speech recognition. The LP function extends the linear regression (LR) function used in maximum likelihood linear regression (MLLR) and maximum a posteriori linear regression (MAPLR) by incorporating local information in the ensemble acoustic space to enhance the environment modeling capacity. To estimate the nuisance parameters of the LP function, we developed maximum likelihood LP (MLLP) and maximum a posteriori LP (MAPLP) and derived a set of integrated prior (IP) densities for MAPLP. The IP densities integrate multiple knowledge sources: the training set, previously seen speech data, the current utterance, and a prepared tree structure. We evaluated the proposed MLLP and MAPLP on the Aurora-2 database under unsupervised model adaptation. Experimental results show that the LP function outperforms the LR function with both ML- and MAP-based estimates across different test conditions. Moreover, because the MAP-based estimate handles overfitting well, MAPLP clearly improves over MLLP. Compared to the baseline result, MAPLP provides a significant 10.99% word error rate reduction.
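The abstract does not give the LP function's exact form, so the following is only a minimal sketch of the contrast it describes. It assumes the LR function is the usual MLLR-style affine map of one mean vector, and that the LP function combines the corresponding mean vectors from several model sets, each with its own matrix. The names `lr_transform` and `lp_transform`, and the summed form itself, are illustrative assumptions, not the paper's definitions.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def vec_add(x, y):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]


def lr_transform(A, b, mu):
    """MLLR-style linear regression on a single mean: mu_hat = A mu + b."""
    return vec_add(matvec(A, mu), b)


def lp_transform(As, b, mus):
    """Assumed linear projection over P model sets (hypothetical form):
    mu_hat = sum_p A_p mu_p + b, where mus[p] is the same Gaussian's
    mean vector taken from the p-th set of acoustic models."""
    out = list(b)
    for A_p, mu_p in zip(As, mus):
        out = vec_add(out, matvec(A_p, mu_p))
    return out


# LR: adapt one mean with an identity matrix and a bias shift.
adapted_lr = lr_transform([[1, 0], [0, 1]], [0.5, -0.5], [1.0, 2.0])

# LP: combine the means of two model sets with equal weighting.
half = [[0.5, 0], [0, 0.5]]
adapted_lp = lp_transform([half, half], [0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]])
```

Under this assumed form, LR is the special case with a single model set. In the paper, the matrices and bias would be estimated by MLLP or MAPLP from adaptation data rather than set by hand as they are here.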

Published in: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Date of Conference: 25-30 March 2012
