By Topic

A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Borgstrom, B.J. ; Dept. of Electr. Eng., Univ. of California, Los Angeles, CA ; Alwan, Abeer

This paper proposes a novel low-complexity lip contour model for high-level optic feature extraction in noise-robust audiovisual (AV) automatic speech recognition systems. The model is based on weighted least-squares parabolic fitting of the upper and lower lip contours, does not require the assumption of symmetry across the horizontal axis of the mouth, and is therefore realistic. The proposed model does not depend on the accurate estimation of specific facial points, as do other high-level models. Also, we present a novel low-complexity algorithm for speaker normalization of the optic information stream, which is compatible with the proposed model and does not require parameter training. The use of the proposed model with speaker normalization results in noise robustness improvement in AV isolated-word recognition relative to using the baseline high-level model.

Published in:

Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on  (Volume:38 ,  Issue: 6 )