A major deficiency of standard hidden Markov models (HMMs) is that the spectral and prosodic features are processed uniformly. To combine prosodic cues with acoustic ones more efficiently, a segmental two-level hidden Markov model was recently studied by Suaudeau [Suaudeau 94]. In this paper, we present an adapted version of this model in which the segmental processing is replaced by classical centisecond processing. This new model is called the centisecond two-level hidden semi-Markov model (CTLHSMM). Our approach retains the traditional hierarchical structure of an HMM and facilitates the introduction of other prosodic parameters [Caliope 89] (in particular the energy) at the phonetic level. Experiments on a French database composed of 20 numbers show that this model reduces the recognition error rate.
Published in:
Parallel Computing in Electrical Engineering, 2006. PAR ELEC 2006. International Symposium on
Date of Conference: 13-17 Sept. 2006