Pei Yin
Essa, I.
Rehg, J.M.
College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
This paper appears in: Analysis and Modeling of Faces and Gestures, 2003. AMFG 2003. IEEE International Workshop on
Publication Date: 17 Oct. 2003
On page(s): 68-73
ISBN: 0-7695-2010-3
INSPEC Accession Number: 7922723
Posted online: 2003-10-27 09:54:33.0
Abstract
We propose a new approach for combining acoustic and visual measurements to aid in recognizing the lip shapes of a person speaking. Our method relies on computing the maximum likelihoods of (a) an HMM used to model phonemes from the acoustic signal, and (b) an HMM used to model the motion of visual features from video. One significant addition is dynamic analysis with features selected by AdaBoost on the basis of their discriminative ability. This form of integration, which leads to a boosted HMM, lets AdaBoost find the best features first and then uses the HMMs to exploit the dynamic information inherent in the signal.
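As a rough illustration of the likelihood computation described in the abstract, the sketch below scores a sequence under an audio HMM and a visual HMM and picks the class with the highest combined score. It is a minimal sketch under stated assumptions, not the authors' implementation: the observations are assumed to be discrete symbols, the weighted sum of log-likelihoods and the `audio_weight` parameter are hypothetical choices, and the AdaBoost feature-selection stage is not shown.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-observation HMM.

    obs   : list of observation symbol indices
    start : start[s] is the initial probability of state s
    trans : trans[r][s] is the probability of moving from state r to s
    emit  : emit[s][o] is the probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(audio_obs, visual_obs, models, audio_weight=0.5):
    """Return the label whose audio and visual HMMs best explain the data.

    models maps label -> (audio_hmm, visual_hmm), where each HMM is a
    (start, trans, emit) tuple. The weighted sum of log-likelihoods is an
    assumed fusion rule used only for illustration.
    """
    best_label, best_score = None, NEG_INF
    for label, (audio_hmm, visual_hmm) in models.items():
        score = (
            audio_weight * hmm_log_likelihood(audio_obs, *audio_hmm)
            + (1.0 - audio_weight) * hmm_log_likelihood(visual_obs, *visual_hmm)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In a setup like this, one pair of HMMs would be trained per class, and `audio_weight` would be tuned on held-out data to reflect how reliable each modality is.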