The separation of under-determined convolutive audio mixtures is generally addressed in the time-frequency domain, where the sources exhibit little overlap. Most previous approaches approximate the mixing process by a complex-valued multiplication in each frequency bin. This is equivalent to assuming that the spatial covariance matrix of each source, that is, the covariance of its contribution to all mixture channels, has rank 1. In this paper, we propose to represent each source by a full-rank spatial covariance matrix instead, which better approximates reverberation. We also investigate a possible parameterization of this matrix stemming from the theory of statistical room acoustics. We illustrate the potential of the proposed approach on a stereo reverberant speech mixture.
Published in:
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Date of Conference: 18-21 Oct. 2009
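
The contrast drawn in the abstract between rank-1 and full-rank spatial covariance matrices can be illustrated with a minimal NumPy sketch for one source in one frequency bin of a stereo mixture. The abstract does not give the parameterization itself, so the form used here is an assumption: the rank-1 direct-path term plus a diffuse reverberant term whose inter-microphone coherence follows the classical sinc law for omnidirectional microphones. The function names, the mixing vector `h`, the reverberant power `sigma2_rev`, the microphone spacing, and the sound speed of 343 m/s are all hypothetical illustration values, not taken from the paper.

```python
import numpy as np

def rank1_spatial_cov(h):
    """Spatial covariance implied by narrowband mixing with vector h: h h^H."""
    h = np.asarray(h, dtype=complex).reshape(-1, 1)
    return h @ h.conj().T

def diffuse_coherence(freq_hz, mic_dist_m, c=343.0):
    """2x2 coherence matrix of an ideal diffuse field for two omni microphones.

    Assumed model: off-diagonal entry sin(2*pi*f*d/c) / (2*pi*f*d/c).
    np.sinc(x) is sin(pi*x)/(pi*x), hence the argument 2*f*d/c.
    """
    g = np.sinc(2.0 * freq_hz * mic_dist_m / c)
    return np.array([[1.0, g], [g, 1.0]], dtype=complex)

def full_rank_spatial_cov(h, sigma2_rev, freq_hz, mic_dist_m):
    """Assumed full-rank model: direct-path term plus scaled diffuse term."""
    return rank1_spatial_cov(h) + sigma2_rev * diffuse_coherence(freq_hz, mic_dist_m)

# Hypothetical stereo mixing vector for one source in one frequency bin.
h = np.array([1.0, 0.8 * np.exp(-1j * 0.6)])

R1 = rank1_spatial_cov(h)
R2 = full_rank_spatial_cov(h, sigma2_rev=0.5, freq_hz=1000.0, mic_dist_m=0.05)

# An explicit tolerance keeps round-off from inflating the numerical rank.
print("rank of h h^H:", np.linalg.matrix_rank(R1, tol=1e-10))
print("rank with diffuse term:", np.linalg.matrix_rank(R2, tol=1e-10))
```

Under these assumptions the outer product has a single non-zero eigenvalue, whereas adding any positive amount of diffuse energy makes the matrix positive definite. This is the sense in which a full-rank matrix can account for reverberant energy that a single mixing vector cannot.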