This paper compares a newly proposed hybrid connectionist-SCHMM approach [Hutter and Pfister 1994] with other hybrid approaches. In the new approach, a multilayer perceptron (MLP) replaces the conventional codebooks of semi-continuous HMMs (SCHMMs). To this end, the MLP is trained on so-called basic elements (phones and phone parts) so that the network outputs estimate the a posteriori probabilities of these elements, given a context of input vectors. These a posteriori estimates are converted into scaled likelihoods, which then serve as observation probabilities in the framework of classical SCHMMs. The remaining SCHMM parameters are trained with the well-known Baum-Welch algorithm, using the likelihoods estimated by the MLP. The approach compared favorably with other proposed hybrid systems and with classical approaches on an isolated German digit recognition task over telephone lines. It achieved the highest recognition rate of all systems, followed by an approach using LVQ3 optimization of the codebook.
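The abstract does not spell out how posteriors become scaled likelihoods. A common convention in hybrid MLP/HMM systems, assumed here and not confirmed by the abstract, is to divide each posterior P(q|x) by the class prior P(q), which gives p(x|q)/p(x) by Bayes' rule. The sketch below illustrates that step only; the function name, the `floor` parameter, and the numeric values are hypothetical.

```python
def scaled_likelihoods(posteriors, priors, floor=1e-8):
    """Convert posteriors P(q|x) to scaled likelihoods p(x|q)/p(x).

    Assumes the usual hybrid convention of dividing by class priors P(q);
    the abstract does not state the exact conversion used in the paper.
    `floor` (hypothetical) guards against division by a zero prior.
    """
    if len(posteriors) != len(priors):
        raise ValueError("posteriors and priors must have the same length")
    return [p / max(q, floor) for p, q in zip(posteriors, priors)]


# Illustrative values only: three basic elements, one input frame.
posteriors = [0.7, 0.2, 0.1]    # MLP outputs for the frame, summing to 1
priors = [0.5, 0.25, 0.25]      # relative frequencies of the elements in training
scaled = scaled_likelihoods(posteriors, priors)
```

Under this assumption, the scaled values can be used in place of codebook densities because the common factor p(x) is the same for every element within a frame and so does not change which state sequence is preferred.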
Published in:
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on (Volume: 5)
Date of Conference: 9-12 May 1995