1. Introduction
Audio source separation is a challenging task in audio signal processing [1], in which the quality of the reconstructed sources depends strongly on the particular task and on the amount of prior information that can be exploited. Informed source separation (ISS) [2]–[5], which is closely related to spatial audio object coding (SAOC) [6], is a recent trend in source separation in which some side-information about the sources and/or the mixing system is extracted at a stage where the clean sources are still available, e.g., during the mixing of a music recording by a sound engineer. A natural constraint is that this side-information should be small compared with the bitrate that would be needed to encode the sources independently. More precisely, an ISS method consists of an encoding stage, where the side-information is extracted given both the sources and their mixture, and a decoding stage, where the sources are no longer available and are estimated from the mixture given the side-information. Being at the crossroads of source separation and compression [7], ISS usually leads to a much better quality of the reconstructed sources than conventional audio source separation, at the expense of some bitrate required to transmit the side-information. Indeed, the quality of the reconstructed sources can be fully controlled during the encoding stage [5], [7], and perceptual (psychoacoustic) aspects can be taken into account [6], [8].