Skip to Main Content
To alleviate the problem of severe degradation of speaker recognition performance under noisy environments because of inadequate and inaccurate speaker-discriminative information, a method of robust feature estimation that can capture both vocal source- and vocal tract-related characteristics from noisy speech utterances is proposed. Spectral subtraction, a simple yet useful speech enhancement technique, is employed to remove the noise-specific components prior to the feature extraction process. It has been shown through analytical derivation, as well as by simulation results, that the proposed feature estimation method leads to robust recognition performance, especially at low signal-to-noise ratios. In the context of Gaussian mixture model-based speaker recognition with the presence of additive white Gaussian noise, the new approach produces consistent reduction of both identification error rate and equal error rate at signal-to-noise ratios ranging from 0 to 15 dB.
Audio, Speech, and Language Processing, IEEE Transactions on (Volume:19 , Issue: 1 )
Date of Publication: Jan. 2011