By Topic

Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Li Deng ; Microsoft Res., Redmond, WA, USA ; J. Droppo ; A. Acero

We describe a novel algorithm for recursive estimation of nonstationary acoustic noise which corrupts clean speech, and a successful application of the algorithm in the speech feature enhancement framework of noise-normalized SPLICE for robust speech recognition. The noise estimation algorithm makes use of a nonlinear model of the acoustic environment in the cepstral domain. Central to the algorithm is the innovative iterative stochastic approximation technique that improves piecewise linear approximation to the nonlinearity involved and that subsequently increases the accuracy for noise estimation. We report comprehensive experiments on SPLICE-based, noise-robust speech recognition for the AURORA2 task using the results of iterative stochastic approximation. The effectiveness of the new technique is demonstrated in comparison with a more traditional, MMSE noise estimation algorithm under otherwise identical conditions. The word error rate reduction achieved by iterative stochastic approximation for recursive noise estimation in the framework of noise-normalized SPLICE is 27.9% for the multicondition training mode, and 67.4% for the clean-only training mode, respectively, compared with the results using the standard cepstra with no speech enhancement and using the baseline HMM supplied by AURORA2. These represent the best performance in the clean-training category of the September-2001 AURORA2 evaluation. The relative error rate reduction achieved by using the same noise estimate is increased to 48.40% and 76.86%, respectively, for the two training modes after using a better designed HMM system. The experimental results demonstrated the crucial importance of using the newly introduced iterations in improving the earlier stochastic approximation technique, and showed sensitivity of the noise estimation algorithm's performance to the forgetting factor embedded in the algorithm.

Published in:

IEEE Transactions on Speech and Audio Processing  (Volume:11 ,  Issue: 6 )