Typical speech enhancement algorithms that operate in the Fourier domain modify only the magnitude component of the noisy speech. The phase component is commonly assumed to be perceptually unimportant and is therefore passed directly to the output. Nevertheless, recent experiments have reported that the Short-Time Fourier Transform (STFT) phase spectrum contributes significantly to speech intelligibility. Motivated by this, we investigated the role of the phase spectrum in speech enhancement using Wiener filtering and Martin's minimum statistics. In this paper, we report results obtained with optimization algorithms that correct the phase of each processed frame, aiming to match the waveform of the zero-phase Wiener-filtered speech to the conventional filter output obtained with the noisy phase characteristic. No a priori information on the original phase is assumed. We show that phase correction yields better results across different noise types. Several optimization criteria are used, with results similar to those obtained when the actual clean speech phase is available. Nearly as good results are also obtained by minimizing the dispersion of the Wiener filter impulse response. The achieved improvement is assessed with several measures, including signal-to-noise ratio (SNR), segmental SNR, and Perceptual Evaluation of Speech Quality (PESQ).
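For orientation, the conventional baseline described above (a Wiener gain applied to the magnitude, with the noisy phase passed through unchanged) might be sketched per frame as follows. This is a minimal illustration and not the phase-correction method of the paper: the function names `wiener_gain` and `enhance_frame`, the gain floor, and the simple power-subtraction SNR estimate are all assumptions made for the example, and the noise power spectrum is assumed to be supplied by some external estimator such as minimum statistics.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Hypothetical helper: per-bin Wiener gain from a crude SNR estimate.

    The SNR is approximated by power subtraction (an assumption for this
    sketch); `floor` is an illustrative lower bound on the gain.
    """
    snr = np.maximum(noisy_power / np.maximum(noise_power, 1e-12) - 1.0, 0.0)
    return np.maximum(snr / (1.0 + snr), floor)

def enhance_frame(frame, noise_power, window):
    """Hypothetical helper: magnitude-only enhancement of one frame.

    Only the magnitude is scaled; the noisy phase is reused as-is,
    which is the conventional behaviour the abstract refers to.
    """
    spectrum = np.fft.rfft(frame * window)
    magnitude = np.abs(spectrum)
    noisy_phase = np.angle(spectrum)          # passed through unmodified
    gain = wiener_gain(magnitude ** 2, noise_power)
    enhanced = gain * magnitude * np.exp(1j * noisy_phase)
    return np.fft.irfft(enhanced, n=len(frame))
```

A phase-correction scheme such as the one studied here would replace `noisy_phase` with an optimized estimate before the inverse transform; the sketch only shows where that substitution would occur.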
Date of Conference: 4-6 Dec. 2010