Neural network lipreading system for improved speech recognition
Stork, D.G.; Wolff, G.; Levine, E.
Neural Networks, 1992. IJCNN., International Joint Conference on
Volume 2, 7-11 Jun 1992, Page(s): 289-295 vol.2
Digital Object Identifier 10.1109/IJCNN.1992.226994
Summary: A modified time-delay neural network (TDNN) has been designed to
perform automatic lipreading (speech reading) in conjunction with
acoustic speech recognition in order to improve recognition both in
silent environments and in the presence of acoustic noise. The
system is far more robust to acoustic noise and verbal distractors than
is a system not incorporating visual information. Specifically, in the
presence of high-amplitude pink noise, the low recognition rate in the
acoustic-only system (43%) is raised to 75% by the incorporation of
visual information. The system responds to (artificial) conflicting
cross-modal patterns in a way closely analogous to the McGurk effect in
humans. The power of neural techniques is demonstrated in several
difficult domains: pattern recognition; sensory integration; and
distributed approaches toward 'rule-based' (linguistic-phonological)
processing.