Informed Audio Source Separation Using Linearly Constrained Spatial Filters

2 Author(s)
Stanislaw Gorlow (LaBRI—CNRS, Université de Bordeaux 1, Talence Cedex, France); Sylvain Marchand

In this work we readdress the issue of audio source separation in an informed scenario, where certain information about the sound sources is embedded into their mixture as an imperceptible watermark. We describe an improved algorithm that follows the linearly constrained minimum-variance filtering approach in the subband domain, in order to obtain perceptually better estimates of the source signals than other published approaches. Like its predecessor, the algorithm imposes no restrictions on the number of simultaneously active sources or on their spectral overlap. Instead, it adapts to a given signal constellation and provides the best possible estimates under the given constraints in linearithmic time. The validity of the approach is demonstrated on a stereo mixture with two levels of sound complexity. Both objective and subjective evaluations show that the proposed algorithm outperforms a reference algorithm by at least one grade. Bearing high perceptual resemblance to the original signals at a fairly tolerable data rate of 10–20 kbps per source, the algorithm thus seems well suited for active-listening applications such as re-mixing or re-spatialization in real time.
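To make the named technique concrete, here is a minimal sketch of a linearly constrained minimum-variance (LCMV) filter using the standard closed-form solution w = R⁻¹C(CᴴR⁻¹C)⁻¹f. All dimensions, signals, and the mixing matrix below are illustrative assumptions, not the authors' setup; the paper applies such filtering per subband with watermark-derived side information, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 2      # mixture channels (stereo), as in the paper's demonstration
N = 3      # sources; LCMV places no restriction on N (hypothetical count)
T = 4096   # samples in this toy example

# Toy instantaneous mixing: N sources into M channels (hypothetical A).
A = rng.standard_normal((M, N))
S = rng.standard_normal((N, T))
X = A @ S                      # observed mixture, shape (M, T)

# Sample covariance of the mixture.
R = (X @ X.conj().T) / T

# Constraints: pass source 0 with unit gain (C holds its mixing column).
C = A[:, [0]]                  # (M, 1) constraint matrix
f = np.array([1.0])            # desired response

# Closed-form LCMV filter: w = R^{-1} C (C^H R^{-1} C)^{-1} f
Rinv_C = np.linalg.solve(R, C)
w = Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

# The filter meets the linear constraint exactly while minimizing the
# output variance contributed by the remaining sources.
print(np.allclose(C.conj().T @ w, f))
```

The informed setting matters here: the constraint matrix C requires knowledge of the sources' spatial signatures, which in this scenario can be conveyed via the embedded watermark rather than blindly estimated.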

Published in:

IEEE Transactions on Audio, Speech, and Language Processing (Volume: 21, Issue: 1)