By Topic

An Automatic Lipreading System for Spoken Digits With Limited Training Data

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Wang, S.L. ; Sch. of Inf. Security Eng., Shanghai Jiaotong Univ., Shanghai ; Liew, A. ; Lau, W.H. ; Leung, S.H.

It is well known that visual cues of lip movement contain important speech relevant information. This paper presents an automatic lipreading system for small vocabulary speech recognition tasks. Using the lip segmentation and modeling techniques we developed earlier, we obtain a visual feature vector composed of outer and inner mouth features from the lip image sequence for recognition. A spline representation is employed to transform the discrete-time sampled features from the video frames into the continuous domain. The spline coefficients in the same word class are constrained to have similar expression and are estimated from the training data by the EM algorithm. For the multiple-speaker/speaker-independent recognition task, an adaptive multimodel approach is proposed to handle the variations caused by various talking styles. After building the appropriate word models from the spline coefficients, a maximum likelihood classification approach is taken for the recognition. Lip image sequences of English digits from 0 to 9 have been collected for the recognition test. Two widely used classification methods, HMM and RDA, have been adopted for comparison and the results demonstrate that the proposed algorithm deliver the best performance among these methods.

Published in:

Circuits and Systems for Video Technology, IEEE Transactions on  (Volume:18 ,  Issue: 12 )