In this paper, we address the problem of underdetermined blind source separation (BSS) of anechoic speech mixtures. We propose a demixing algorithm that exploits the sparsity of certain time-frequency expansions of speech signals. Our algorithm merges ℓ^q-basis-pursuit with ideas based on the degenerate unmixing estimation technique (DUET) [Yılmaz and Rickard, "Blind Source Separation of Speech Mixtures via Time-Frequency Masking," IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830-1847, July 2004]. There are two main novel components to our approach: (1) our algorithm makes use of all available mixtures in the anechoic scenario where both attenuations and arrival delays between sensors are considered, without imposing any structure on the microphone positions, and (2) we illustrate experimentally that the separation performance is improved when one uses ℓ^q-basis-pursuit with q < 1 compared to the q = 1 case. Moreover, we provide a probabilistic interpretation of the proposed algorithm that explains why a choice of 0.1 ≤ q ≤ 0.4 is appropriate in the case of speech. Experimental results on both simulated and real data demonstrate significant gains in separation performance when compared to other state-of-the-art BSS algorithms reported in the literature.
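To make the ℓ^q-basis-pursuit step concrete, the following is a minimal sketch of one way to minimise the sum of |s_i|^q subject to A s = x at a single time-frequency point. It is an illustration under stated assumptions, not the procedure of the paper: the function name `lq_basis_pursuit`, the toy 2 x 3 mixing matrix, and the restriction to real values are all hypothetical choices made here (in the anechoic setting the mixing matrix and the time-frequency coefficients would be complex, and the matrix would first have to be estimated). The sketch relies on the fact that, for 0 < q ≤ 1, a minimiser can be found among solutions that use at most m of the n columns of A, so a small problem can be solved by enumerating column subsets.

```python
from itertools import combinations

import numpy as np


def lq_basis_pursuit(A, x, q=0.3):
    """Minimise sum(|s_i|**q) subject to A @ s = x (hypothetical helper).

    Assumes a real m x n matrix A with n > m and 0 < q <= 1. Every
    m-column subset is tried, so this is only practical for small n.
    """
    m, n = A.shape
    best_s, best_cost = None, np.inf
    for cols in combinations(range(n), m):
        sub = A[:, cols]
        # Skip column subsets that do not form an invertible m x m system.
        if abs(np.linalg.det(sub)) < 1e-12:
            continue
        coeffs = np.linalg.solve(sub, x)
        cost = np.sum(np.abs(coeffs) ** q)
        if cost < best_cost:
            best_cost = cost
            best_s = np.zeros(n)
            best_s[list(cols)] = coeffs
    return best_s


# Toy example: 2 mixtures, 3 sources, only the second source active.
angles = np.deg2rad([0.0, 45.0, 90.0])
A = np.vstack([np.cos(angles), np.sin(angles)])  # unit-norm columns
s_true = np.array([0.0, 1.5, 0.0])
x = A @ s_true
s_hat = lq_basis_pursuit(A, x, q=0.3)
print(np.round(s_hat, 6))
```

In this toy case the sparse representation that uses only the second column has a lower ℓ^q cost than the one that spreads the energy over the first and third columns, so the active source is recovered. The number of subsets grows combinatorially with the number of sources, which is acceptable only when n and m are small.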
Date of Publication: Aug. 2007