Source generator equalization and enhancement of spectral properties for robust speech recognition in noise and stress

2 Author(s)
J. H. L. Hansen ; Dept. of Electr. Eng., Duke Univ., Durham, NC, USA ; M. A. Clements

Studies have shown that, depending on speaker task and environmental conditions, recognizers are sensitive to noisy stressful environments. The focus of the study is to achieve robust recognition in diverse environmental conditions through the formulation of feature enhancement and stress equalization algorithms under the framework of source generator theory. The generator framework is considered as a means of modeling production variation under stressful speaking conditions. A multi-dimensional stress equalization procedure is formulated that produces recognition features less sensitive to varying factors caused by stress. A feature enhancement algorithm is employed based on iterative techniques previously derived for enhancement of speech in varying background noise environments. Combined stress equalization and feature enhancement reduces average word error rates by 38.7% across 10 noisy stressful conditions (e.g., noisy loud, angry, and Lombard effect stress conditions). The results suggest that the combination of a flexible source generator framework to address stressed speaking conditions, and a feature enhancement algorithm that adapts based on speech-specific constraints, can be effective in reducing the consequences of stress and noise for robust automatic recognition.

Published in:

IEEE Transactions on Speech and Audio Processing (Volume: 3, Issue: 5)