A stochastic segment model for phoneme-based continuous speech recognition

2 Author(s)
M. Ostendorf (Dept. of Electr., Comput. & Syst. Eng., Boston Univ., MA, USA); S. Roukos

The authors introduce a novel approach to modeling variable-duration phonemes, called the stochastic segment model. A phoneme X is observed as a variable-length sequence of frames, where each frame is represented by a parameter vector and the length of the sequence is random. The stochastic segment model consists of (1) a time warping of the variable-length segment X into a fixed-length segment Y, called a resampled segment, and (2) a joint density function of the parameters of X, which in this study is a Gaussian density. The segment model represents spectral/temporal structure over the entire phoneme. The model also allows the incorporation in Y of acoustic-phonetic features derived from X, in addition to the usual spectral features that have been used in hidden Markov modeling and dynamic time warping approaches to speech recognition. The authors describe the stochastic segment model, the recognition algorithm, and an iterative training algorithm for estimating segment models from continuous speech. They present several results using segment models in two speaker-dependent recognition tasks and compare the performance of the stochastic segment model to that of hidden Markov models.
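
The two components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions made here for illustration: the time warping is taken to be linear interpolation, the Gaussian is evaluated on the flattened resampled segment Y so that its dimension is fixed, and the function names (`resample_segment`, `gaussian_log_density`, `classify`) and the `models` dictionary are hypothetical.

```python
import numpy as np


def resample_segment(X, m):
    """Warp a variable-length segment X of shape (k, d) to a fixed (m, d) segment Y.

    Assumption: the warping is plain linear interpolation along the time axis.
    """
    X = np.asarray(X, dtype=float)
    k, d = X.shape
    # m evenly spaced positions spanning the original frame indices 0..k-1.
    positions = np.linspace(0, k - 1, m)
    frames = np.arange(k)
    # Interpolate each feature dimension independently.
    return np.stack([np.interp(positions, frames, X[:, j]) for j in range(d)], axis=1)


def gaussian_log_density(Y, mean, cov):
    """Log of a joint Gaussian density evaluated on the flattened segment Y."""
    y = np.asarray(Y, dtype=float).reshape(-1)
    diff = y - mean
    # slogdet and solve avoid forming an explicit determinant or inverse.
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + maha)


def classify(X, models, m):
    """Return the phoneme label whose segment model scores X highest.

    `models` is a hypothetical dict: label -> (mean, cov) with dimension m * d.
    """
    Y = resample_segment(X, m)
    scores = {label: gaussian_log_density(Y, mu, cov) for label, (mu, cov) in models.items()}
    return max(scores, key=scores.get)
```

With `m` resampled frames and `d` features per frame, each model's mean has `m * d` entries and its covariance is `(m * d, m * d)`. Because the covariance spans all resampled frames jointly, it can capture correlation across the whole phoneme, which is the property the abstract attributes to the segment model.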

Published in:

IEEE Transactions on Acoustics, Speech, and Signal Processing (Volume: 37, Issue: 12)