The separation of multichannel audio mixtures is often addressed by the masking approach, which consists of representing the mixture signal in the time-frequency domain and associating each time-frequency bin with a small number of active sources. Adaptive time-frequency representations can increase the disjointness of the sources compared to fixed representations. However, results obtained with them have so far been inconclusive. In this paper, we propose a new criterion for the blind estimation of an adapted representation of an instantaneous mixture and explain how to compute the oracle representation leading to the best possible performance given reference source signals. Experimental results suggest that adaptive representations can indeed yield a small improvement in separation performance, but that complementary approaches must be investigated to obtain larger improvements.
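To make the masking idea concrete, the following is a minimal sketch of oracle binary masking on toy data, not the method proposed in the paper. It assumes a simplified setting: a single-channel mixture, exactly one source assigned to each time-frequency bin, and synthetic sparse coefficients standing in for a real time-frequency transform. The function names `oracle_binary_masks` and `apply_masks` and all array sizes are hypothetical, chosen for illustration only.

```python
import numpy as np


def oracle_binary_masks(source_tf):
    """Assign each time-frequency bin to its dominant reference source.

    source_tf: complex array of shape (n_sources, n_freq, n_frames)
    holding the (known) time-frequency coefficients of each source.
    Returns a boolean array of the same shape with exactly one True
    per bin along the source axis.
    """
    power = np.abs(source_tf) ** 2
    dominant = np.argmax(power, axis=0)  # shape (n_freq, n_frames)
    n_sources = source_tf.shape[0]
    return np.stack([dominant == j for j in range(n_sources)])


def apply_masks(mixture_tf, masks):
    """Estimate each source by keeping only the mixture bins assigned to it."""
    return masks * mixture_tf[np.newaxis]


# Toy data (assumption): sparse random coefficients instead of real audio.
rng = np.random.default_rng(0)
n_sources, n_freq, n_frames = 3, 8, 10
shape = (n_sources, n_freq, n_frames)
activity = rng.random(shape) < 0.2  # each source is active in ~20% of bins
coeffs = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
source_tf = activity * coeffs

# Single-channel mixture: the sources simply add up in each bin.
mixture_tf = source_tf.sum(axis=0)

masks = oracle_binary_masks(source_tf)
estimates = apply_masks(mixture_tf, masks)

# Residual error comes only from bins where several sources overlap.
error = np.sum(np.abs(estimates - source_tf) ** 2)
energy = np.sum(np.abs(source_tf) ** 2)
print("relative squared error:", error / energy)
```

In this sketch the estimate is exact in every bin where at most one source is active, which is why a representation that increases the disjointness of the sources is expected to improve masking-based separation.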