By Topic

Speaker-independent speech-driven facial animation using a hierarchical model

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $31
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Cosker, D.P. ; Univ. of Wales, Cardiff, UK ; Marshall, A.D. ; Rosin, P.L. ; Hicks, Y.A.

We present a system capable of producing video-realistic videos of a speaker given audio only. The audio input signal requires no phonetic labelling and is speaker independent. The system requires only a small training set of video to achieve convincing realistic facial synthesis. The system learns the natural mouth and face dynamics of a speaker to allow new facial poses, unseen in the training video, to be synthesised. To achieve this we have developed a novel approach which utilises a hierarchical and nonlinear PCA model which couples speech and appearance. We show that the model is capable of synthesising videos of a speaker using new audio segments from both previously heard and unheard speakers. The model is highly compact making it suitable for a wide range of real-time applications in multimedia and telecommunications using standard hardware.

Published in:

Visual Information Engineering, 2003. VIE 2003. International Conference on

Date of Conference:

7-9 July 2003