Acoustic Analysis for Automatic Speech Recognition

Author: O'Shaughnessy, D.; Inst. Nat. de la Rech. Sci. (INRS), Univ. of Quebec, Montreal, QC, Canada

As a pattern recognition application, automatic speech recognition (ASR) requires the extraction of useful features from its input signal, speech. To help determine which features are relevant, human speech production and the acoustic aspects of speech perception are reviewed, with the aim of identifying the acoustic elements likely to be most important for ASR. Common methods of estimating useful aspects of speech spectral envelopes are reviewed from the point of view of efficiency and reliability in mismatched conditions. Because many speech inputs for ASR have noise and channel degradations, ways to improve robustness in speech parameterization are analyzed. While the main focus in ASR is to obtain spectral envelope measures, human speech communication efficiently exploits the manipulation of one's vocal-cord vibration rate [fundamental frequency (F0)], and so F0 extraction and its integration into ASR are also reviewed. For the acoustic analysis it reviews, this work presents modern methods as well as future perspectives on important aspects of speech information processing.

Published in: Proceedings of the IEEE (Volume: 101, Issue: 5)