Comparison of modulation features for phoneme recognition

Authors
Sriram Ganapathy (Department of Electrical and Computer Engineering, Johns Hopkins University, USA); Samuel Thomas; Hynek Hermansky

In this paper, we compare several approaches for extracting modulation frequency features from the speech signal, using a phoneme recognition system. The general framework in these approaches is to decompose the speech signal into a set of sub-bands. Amplitude modulations (AM) in the sub-band signals are then used to derive features for automatic speech recognition (ASR). We then propose a feature extraction technique that uses autoregressive (AR) models of sub-band Hilbert envelopes in relatively long segments of the speech signal. The AR models of the Hilbert envelopes are derived using frequency domain linear prediction (FDLP). Features are formed by converting the FDLP envelopes into static and dynamic modulation frequency components. In phoneme recognition experiments on the TIMIT database, the FDLP-based modulation frequency features provide significant improvements over the other techniques (an average relative improvement of 7.5% over the baseline features). Furthermore, a detailed analysis is performed to determine the relative contribution of the various processing stages in the proposed technique.
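
The core idea of FDLP mentioned in the abstract, fitting an AR model in the frequency domain so that its "spectrum" traces the temporal envelope, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a full-band signal (no sub-band decomposition), a type-II DCT as the frequency-domain representation, and autocorrelation-method linear prediction. The function name `fdlp_envelope`, the parameter `order`, and the small diagonal loading are illustrative choices, not details taken from the paper.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=20):
    """Approximate the squared temporal envelope of x with an AR model.

    Hypothetical helper: linear prediction is applied to the DCT of the
    signal, so the AR model's power response, read along the frequency
    axis, follows the envelope along the time axis.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Frequency-domain representation of the segment (assumed: DCT-II).
    c = dct(x, type=2, norm="ortho")

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny diagonal loading for numerical safety

    # Yule-Walker equations: R a = -r[1:], with R symmetric Toeplitz.
    a_tail = solve_toeplitz(r[:order], -r[1 : order + 1])
    a = np.concatenate(([1.0], a_tail))

    # Prediction error power, used as the gain of the all-pole model.
    gain = r[0] + np.dot(a_tail, r[1 : order + 1])

    # Evaluate the all-pole model on n points in [0, pi); point i
    # corresponds approximately to time sample i of the segment.
    _, h = freqz([1.0], a, worN=n)
    return gain * np.abs(h) ** 2
```

In a sub-band setting, one plausible variant would apply the same procedure to a windowed portion of the DCT coefficients for each band. The model order controls how much temporal detail the envelope retains, and the value of 20 used here is arbitrary.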

Published in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Date of Conference: 14-19 March 2010