The application of hidden Markov models to speech pattern modeling suffers from two major deficiencies: the classical learning algorithm generates severe underflow problems, and the implicit state occupancy function is inadequate for modeling speech-segment duration. To overcome the numerical problem, the classical joint probability formalism is replaced by a conditional probability formalism. To avoid the unrealistic implicit modeling of the state occupancy, the underlying Markov chain is replaced by a semi-Markov chain, a more general framework where the state occupancy is explicitly modeled by an appropriate probability density function, in the present case a gamma distribution. A particular scheme based on hidden semi-Markov models and an a posteriori probability formalism is presented. The learning algorithm is characterized on an isolated-word-recognition task. Preliminary results are given on demi-syllable modeling in the context of continuous speech decoding.
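The two deficiencies named above can be illustrated with a small sketch. It is not the paper's algorithm: the abstract does not give the a posteriori recursion, so the textbook normalized forward filter stands in for the conditional formalism. The function names, the toy two-state model, and the gamma shape and scale values are all assumptions chosen for illustration.

```python
import math

def geometric_duration_pmf(d, self_loop):
    """Occupancy implied by a plain Markov chain: P(stay exactly d frames)."""
    return self_loop ** (d - 1) * (1.0 - self_loop)

def gamma_duration_pdf(d, shape, scale):
    """Explicit occupancy density; shape and scale here are illustrative values."""
    return d ** (shape - 1) * math.exp(-d / scale) / (math.gamma(shape) * scale ** shape)

def forward_joint(init, trans, emit_probs):
    """Joint formalism: alpha_t(i) = P(o_1..o_t, s_t = i), which shrinks every frame."""
    n = len(init)
    alpha = [init[i] * emit_probs[0][i] for i in range(n)]
    for b in emit_probs[1:]:
        alpha = [b[j] * sum(alpha[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return alpha

def forward_conditional(init, trans, emit_probs):
    """Conditional formalism (assumed stand-in): P(s_t = i | o_1..o_t), renormalized
    at every frame. Returns the final posterior and the accumulated log-likelihood."""
    n = len(init)
    posterior, log_lik = None, 0.0
    for b in emit_probs:
        if posterior is None:
            predicted = list(init)
        else:
            predicted = [sum(posterior[i] * trans[i][j] for i in range(n)) for j in range(n)]
        unnormalized = [predicted[j] * b[j] for j in range(n)]
        norm = sum(unnormalized)
        posterior = [u / norm for u in unnormalized]
        log_lik += math.log(norm)
    return posterior, log_lik

# Hypothetical toy model: 2 states, 2000 frames with small emission likelihoods.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit_probs = [[1e-3, 2e-3]] * 2000

joint = forward_joint(init, trans, emit_probs)
posterior, log_lik = forward_conditional(init, trans, emit_probs)
geometric_mode = max(range(1, 51), key=lambda d: geometric_duration_pmf(d, 0.9))
gamma_mode = max(range(1, 51), key=lambda d: gamma_duration_pdf(d, 4.0, 2.0))
```

In this sketch the joint quantities underflow to zero in double precision on a long observation sequence, while the renormalized posterior stays a valid distribution and the likelihood is carried in the log domain. The geometric occupancy always peaks at a duration of one frame, whereas a gamma density with shape greater than one peaks at a longer duration, which is closer to how speech segments behave.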
Published in:
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Date of Conference: 23-26 May 1989