Multistream Recognition of Speech: Dealing With Unknown Unknowns

Author: Hynek Hermansky, Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA

The paper discusses an approach for dealing with unexpected acoustic elements in speech. The approach is motivated by observations of human performance on such problems, which indicate that the human speech processing system contains multiple parallel processing streams and that humans can tell when they are receiving correct information. The paper reviews some earlier engineering approaches in multistream automatic speech recognition (ASR) that aimed at processing noisy speech and at dealing with unexpected out-of-vocabulary words. It also reviews some currently active research in multistream ASR, focusing mainly on feedback-based techniques that fuse information from the individual processing streams. The difference between the system's behavior on its training data and its behavior during operation is proposed as a substitute for the human ability of “knowing when knowing.” The most recent results indicate a 9% relative improvement in phoneme recognition error rates on high signal-to-noise ratio speech and relative improvements as high as 30% in moderate noise.
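
The idea of comparing a system's behavior on its training data with its behavior during operation can be illustrated with a small sketch. The abstract does not specify which measure the paper uses, so everything below is an assumption made for illustration: each stream is taken to emit per-frame phoneme posterior vectors, the "behavior" of a stream is summarized by its average posterior, and the mismatch is scored with a symmetric Kullback-Leibler divergence. The function names and the stream names `low_band` and `high_band` are hypothetical.

```python
import math


def mean_posterior(frames):
    """Average a list of per-frame posterior vectors into one distribution."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[k] for frame in frames) / n for k in range(dim)]


def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), for discrete distributions."""
    total = 0.0
    for pk, qk in zip(p, q):
        pk = max(pk, eps)  # floor to avoid log(0)
        qk = max(qk, eps)
        total += (pk - qk) * math.log(pk / qk)
    return total


def pick_stream(train_stats, test_frames_by_stream):
    """Select the stream whose test-time statistics deviate least from training.

    train_stats: stream name -> average posterior measured on training data.
    test_frames_by_stream: stream name -> posterior vectors seen in operation.
    Returns (best stream name, dict of per-stream mismatch scores).
    """
    scores = {
        name: symmetric_kl(train_stats[name], mean_posterior(frames))
        for name, frames in test_frames_by_stream.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical example: the high-band stream is corrupted in operation,
# so its posteriors drift away from what was observed in training.
train_stats = {"low_band": [0.5, 0.3, 0.2], "high_band": [0.4, 0.4, 0.2]}
test_frames = {
    "low_band": [[0.5, 0.3, 0.2], [0.5, 0.3, 0.2]],
    "high_band": [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]],
}
best, scores = pick_stream(train_stats, test_frames)
```

In this toy setup the stream with the smaller mismatch score is treated as the more trustworthy one. A real system would likely use richer statistics than a single average posterior, and could weight the streams by their scores instead of selecting one.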

Published in: Proceedings of the IEEE (Volume: 101, Issue: 5)