In speech enhancement, many approaches have been developed for the case where the speech signal is embedded in additive noise, white or coloured. In this paper, we focus on speech contaminated by both convolutive and additive noise. This happens, for instance, in audio conferences or in an auditorium where echoes appear. The approach we propose retrieves the speech using a two-microphone device. It operates in two steps. First, the variances of the additive noises and the finite impulse responses (FIRs), which represent the spatial transformations between the sources and the microphones, are estimated blindly. Then, the filtered versions of the speech, estimated by means of subspace methods, are used to retrieve the source speech. This approach has the advantage of not requiring a Voice Activity Detector (VAD), which is usually needed to estimate the noise variances. In addition, the method can easily be extended to additive coloured noise by introducing a pre-whitening step in each "channel".
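To make the two-microphone signal model concrete, the sketch below simulates one source filtered by two FIR channels and then identifies the channels blindly. The abstract does not say which estimator the authors use, so the identification step here is a swapped-in, well-known technique (the cross-relation method), shown only in the noise-free case. All function names, the filter length `m`, and the random test signals are assumptions made for illustration; noise-variance estimation and the subspace enhancement step are not covered.

```python
import numpy as np

def simulate_two_mics(s, h1, h2, noise_std=0.0, rng=None):
    """Model x_i = h_i * s + n_i for two microphones (white additive noise)."""
    rng = np.random.default_rng() if rng is None else rng
    x1 = np.convolve(s, h1)
    x2 = np.convolve(s, h2)
    x1 = x1 + noise_std * rng.standard_normal(x1.shape)
    x2 = x2 + noise_std * rng.standard_normal(x2.shape)
    return x1, x2

def conv_matrix(x, m):
    """Matrix X such that X @ h is the 'valid' part of the convolution x * h."""
    n_rows = len(x) - m + 1
    return np.array([x[i:i + m][::-1] for i in range(n_rows)])

def cross_relation_estimate(x1, x2, m):
    """Estimate both FIRs (length m, assumed known) up to a common scale factor.

    Without noise, x1 * h2 = x2 * h1, so the stacked vector [h2; h1]
    lies in the null space of [X1, -X2]. It is taken here as the right
    singular vector associated with the smallest singular value.
    """
    A = np.hstack([conv_matrix(x1, m), -conv_matrix(x2, m)])
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    v = vt[-1]
    h2_hat, h1_hat = v[:m], v[m:]
    return h1_hat, h2_hat

def alignment(h1, h2, h1_hat, h2_hat):
    """Absolute cosine between true and estimated stacked filters (1 = same up to scale)."""
    true = np.concatenate([h1, h2])
    est = np.concatenate([h1_hat, h2_hat])
    return abs(true @ est) / (np.linalg.norm(true) * np.linalg.norm(est))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(2000)   # stand-in for a speech signal
    h1 = rng.standard_normal(4)     # hypothetical source-to-microphone FIRs
    h2 = rng.standard_normal(4)
    x1, x2 = simulate_two_mics(s, h1, h2)
    h1_hat, h2_hat = cross_relation_estimate(x1, x2, m=4)
    print("alignment with true filters:", alignment(h1, h2, h1_hat, h2_hat))
```

The filters are recovered only up to a common scale factor, and the estimate relies on the two channels sharing no common zeros. With additive noise the null space is no longer exact, which is one reason the noise variances need to be estimated as well.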
EUROCON 2003. Computer as a Tool. The IEEE Region 8 (Volume: 2)
Date of Conference: 22-24 Sept. 2003