Dynamic Audio-Visual Mapping using Fused Hidden Markov Model Inversion Method

Authors:

Le Xin, Jianhua Tao, Tieniu Tan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China (xinle@nlpr.ia.ac.cn)

Abstract:

Realistic audio-visual mapping remains a challenging problem, and keeping the time delay between audio inputs and visual outputs short is also of great importance. In this paper, we present a new dynamic audio-visual mapping approach based on the Fused Hidden Markov Model (Fused HMM) inversion method. The Fused HMM explicitly models the loosely synchronized nature of the two tightly coupled streams of audio speech and visual speech. Given novel audio inputs, an inversion algorithm is derived to synthesize the visual counterparts by maximizing the joint probability distribution of the Fused HMM. When the algorithm is applied to subsets built from the training corpus, it produces realistic synthesized facial animation with relatively short time delay. Experiments on a 3D motion-capture bimodal database show that the synthetic results are comparable to the ground truth.
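The abstract's description of the inversion step suggests the following minimal sketch. It is not the paper's implementation: it assumes a simplified fused HMM in which an audio HMM with diagonal-Gaussian emissions dominates the fusion and each hidden state is coupled to a single visual mean, so that maximizing the joint probability reduces to Viterbi decoding of the audio followed by a state-conditional visual estimate. All names and parameters (start, trans, audio_means, visual_means, and so on) are hypothetical.

import numpy as np

def log_gaussian(x, mean, var):
    # Log density of a diagonal Gaussian, evaluated for every state at once.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def viterbi(audio, start, trans, means, vars_):
    # Most likely hidden state sequence of the audio HMM (standard Viterbi).
    T, S = len(audio), len(start)
    delta = np.zeros((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = np.log(start) + log_gaussian(audio[0], means, vars_)
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(trans)  # scores[i, j]: from state i to state j
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(S)] + log_gaussian(audio[t], means, vars_)
    states = np.zeros(T, dtype=int)
    states[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        states[t] = back[t + 1, states[t + 1]]
    return states

def invert_fused_hmm(audio, start, trans, audio_means, audio_vars, visual_means):
    # "Inversion" under the simplifying assumption above: decode the audio
    # stream, then emit the visual mean coupled to each decoded state.
    states = viterbi(audio, start, trans, audio_means, audio_vars)
    return visual_means[states]

The actual method maximizes the joint distribution over both streams and operates on subsets built from the training corpus to keep the delay short; the sketch collapses that into a single decoding pass purely to make the data flow concrete.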

Published in:

2007 IEEE International Conference on Image Processing (Volume 3)

Date of Conference:

16-19 Sept. 2007