This paper proposes a system that transforms speech waveforms into animated faces. The system relies on a state-space model to perform the mapping, and an active appearance model is used to create a photorealistic image. The main contribution of the paper is a comparison of a Kalman filter approach and a hidden Markov model (HMM) approach to the mapping. It is shown that even though the HMM can achieve a higher test likelihood than the Kalman filter, the Kalman filter is much easier to train and produces better animation quality.
Multimedia Signal Processing, 2004 IEEE 6th Workshop on
Date of Conference: 29 Sept.-1 Oct. 2004