Input-output HMMs for sequence processing
Bengio, Y.; Frasconi, P.
Neural Networks, IEEE Transactions on
Volume 7, Issue 5, Sep 1996 Page(s):1231 - 1249
Digital Object Identifier 10.1109/72.536317
Summary: We consider problems of sequence processing and propose a solution
based on a discrete-state model in order to represent past context. We
introduce a recurrent connectionist architecture having a modular
structure that associates a subnetwork with each state. The model has a
statistical interpretation we call input-output hidden Markov model
(IOHMM). It can be trained by the expectation-maximization (EM) or
generalized EM (GEM) algorithms, considering state trajectories as
missing data, which decouples temporal credit assignment and actual
parameter estimation. The model presents similarities to hidden Markov
models (HMMs), but allows us to map input sequences to output sequences,
using the same processing style as recurrent neural networks. IOHMMs are
trained using a more discriminant learning paradigm than HMMs, while
potentially taking advantage of the EM algorithm. We demonstrate that
IOHMMs are well suited for solving grammatical inference problems on a
benchmark problem. Experimental results are presented for the seven
Tomita grammars, showing that these adaptive models can attain excellent
generalization.
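
As a rough illustration of the architecture the summary describes, the
sketch below propagates a distribution over discrete states through an
input sequence, with one transition subnetwork and one output subnetwork
attached to each state. It is a minimal sketch under stated assumptions,
not the authors' implementation: the subnetworks are assumed to be single
linear layers followed by a softmax, and the names iohmm_forward,
trans_weights, out_weights and init_state are hypothetical. Training by
EM or GEM is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, u):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, u)) for row in weights]


def iohmm_forward(inputs, trans_weights, out_weights, init_state):
    """Forward pass of an illustrative IOHMM-style model.

    Assumptions (not taken from the abstract):
      trans_weights[j] is the weight matrix of the transition subnetwork
        attached to previous state j (n_states rows, input_dim columns).
      out_weights[i] is the weight matrix of the output subnetwork
        attached to state i (n_outputs rows, input_dim columns).
      init_state is the initial distribution over states.
    Returns the final state distribution and the per-step expected outputs.
    """
    n = len(init_state)
    zeta = list(init_state)  # current distribution over states
    outputs = []
    for u in inputs:
        # phi[j][i] approximates P(state_t = i | state_{t-1} = j, u_t)
        phi = [softmax(linear(trans_weights[j], u)) for j in range(n)]
        # Mix the input-conditioned transitions by the previous state belief
        zeta = [sum(phi[j][i] * zeta[j] for j in range(n)) for i in range(n)]
        # eta[i] is the output distribution proposed by state i's subnetwork
        eta = [softmax(linear(out_weights[i], u)) for i in range(n)]
        n_out = len(eta[0])
        # Expected output: state-weighted mixture of the subnetwork outputs
        outputs.append(
            [sum(zeta[i] * eta[i][k] for i in range(n)) for k in range(n_out)]
        )
    return zeta, outputs
```

Because the transition probabilities depend on the current input, the
same state can lead to different successors for different input symbols,
which is what lets the model map an input sequence to an output sequence
rather than only modeling an output distribution.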