Current speech recognition systems perform very poorly in the presence of background noise, particularly at signal-to-noise ratios (SNRs) below 10 dB and under certain noise conditions such as cafeteria noise. In this study we investigate acoustic processing based on cochlear models and neural-like processing as a means of arriving at a noise-robust acoustic representation of speech. Unlike previous work, which took cochlear filter parameters from neurophysiological data, we optimize the cochlear filter shapes and thresholds to reduce the noise contribution in the resulting acoustic representations. Results suggest that average SNR improvements on the order of 5-10 dB can be obtained for noise-corrupted signals with SNRs near 0-6 dB under realistic noise such as cafeteria noise. Furthermore, using a neural network to include context and arrive at a lower-dimensional representation can lead to further improvements in SNR.
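The abstract reports its results as SNR improvements in dB but does not say how the SNR is measured. As a minimal sketch, assuming the conventional definition (ten times the base-10 logarithm of the ratio of signal power to noise power), the following shows how an improvement figure of this kind could be computed. The sine-wave stand-in for speech, the Gaussian noise, the helper names, and the "processing" step that simply halves the noise amplitude are all hypothetical placeholders and are not the cochlear model or neural network of the paper.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(signal, noise):
    """Conventional SNR in dB: 10 * log10(signal power / noise power)."""
    return 10.0 * math.log10(power(signal) / power(noise))


def scale_noise_to_snr(signal, noise, target_db):
    """Rescale the noise so that signal and noise sit at the target SNR."""
    gain = math.sqrt(power(signal) / (power(noise) * 10.0 ** (target_db / 10.0)))
    return [gain * v for v in noise]


random.seed(0)
rate = 8000
# Hypothetical stand-in for a clean speech signal: a 440 Hz tone.
clean = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate)]
# Hypothetical stand-in for background noise, scaled to a 3 dB input SNR.
raw_noise = [random.gauss(0.0, 1.0) for _ in range(rate)]
noise = scale_noise_to_snr(clean, raw_noise, 3.0)
snr_in = snr_db(clean, noise)

# Placeholder for any noise-reducing front end: here the residual noise
# amplitude is simply halved, which is NOT the method of the paper.
residual = [0.5 * v for v in noise]
snr_out = snr_db(clean, residual)

improvement = snr_out - snr_in
print(f"input SNR {snr_in:.2f} dB, output SNR {snr_out:.2f} dB, "
      f"improvement {improvement:.2f} dB")
```

Halving the noise amplitude quarters the noise power, so this placeholder yields an improvement of about 6 dB, which falls within the 5-10 dB range quoted above only by construction.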
Published in:
Neural Networks, 1994. IEEE World Congress on Computational Intelligence., 1994 IEEE International Conference on
(Volume: 7)
Date of Conference: 27 Jun-2 Jul 1994