Skip to Main Content
A robust speech recognition system is reported based on mathematical models of auditory pathway and also their VLSI implementations. The developed auditory model consists of 3 components, i.e., nonlinear feature extraction at cochlea, binaural processing at superior olivery complex, and top-down attention through backward path. The feature extraction is based on cochlear filter bank and time-frequency masking, which is modeled with lateral inhibition in both time and frequency domain. Unlike the popular binaural processing models based on simple interaural time delay and interaural intensity difference our model incorporates hundreds of time-delays for noisy reverberated signals. The top-down (TD) attention comes from familiarity and/or importance of the sound, and a simple but efficient TD attention model had been developed based on error backpropagation algorithm. These auditory models require intensive computing, and special hardwares had been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.