Skip to Main Content
In this paper, we present a relative transfer function (RTF) identification method for speech sources in reverberant environments. The proposed method is based on the convolutive transfer function (CTF) approximation, which enables to represent a linear convolution in the time domain as a linear convolution in the short-time Fourier transform (STFT) domain. Unlike the restrictive and commonly used multiplicative transfer function (MTF) approximation, which becomes more accurate when the length of a time frame increases relative to the length of the impulse response, the CTF approximation enables representation of long impulse responses using short time frames. We develop an unbiased RTF estimator that exploits the nonstationarity and presence probability of the speech signal and derive an analytic expression for the estimator variance. Experimental results show that the proposed method is advantageous compared to common RTF identification methods in various acoustic environments, especially when identifying long RTFs typical to real rooms.
Audio, Speech, and Language Processing, IEEE Transactions on (Volume:17 , Issue: 4 )
Date of Publication: May 2009