Recent work on source separation of two-channel mixtures has used spatial cues (cross-channel amplitude and phase difference coefficients) to estimate time-frequency masks for separating the sources. These spatial cues become unreliable as sources increasingly overlap in the time-frequency domain or as the angular separation between sources decreases. We introduce a method to re-estimate the spatial cues for mixtures of harmonic sources. The newly estimated spatial cues are fed back to the system to update each source estimate and the pitch estimate of each source. This procedure is iterated until the difference between the current and previous estimates of the spatial cues falls below a preset threshold. Results on a set of three-source mixtures of musical instruments show that this approach significantly improves the separation performance of two existing time-frequency masking systems.
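As a rough illustration of the two ideas named in the abstract (cross-channel spatial cues and iteration until the cues stop changing), the following minimal sketch may help. It is not the authors' implementation: the function names `spatial_cues` and `iterate_until_stable`, the `update` callback, and the threshold and iteration-limit values are all assumptions made for illustration, and the actual re-estimation step (which in the paper relies on harmonic source and pitch estimates) is left as a caller-supplied placeholder.

```python
import cmath


def spatial_cues(left, right, eps=1e-12):
    """Per-bin cross-channel amplitude ratio and phase difference.

    left, right: equal-length sequences of complex time-frequency
    coefficients, one per bin, for the two channels (hypothetical input
    layout chosen for this sketch).
    """
    cues = []
    for l, r in zip(left, right):
        amplitude_ratio = abs(r) / max(abs(l), eps)
        # Phase of r relative to l, in (-pi, pi].
        phase_difference = cmath.phase(r * l.conjugate())
        cues.append((amplitude_ratio, phase_difference))
    return cues


def iterate_until_stable(cues, update, threshold=1e-3, max_iters=50):
    """Apply `update` repeatedly until the largest change in any cue
    falls below `threshold`, or `max_iters` is reached.

    `update` is a placeholder for the re-estimation step; here it is
    any function mapping a list of (amplitude, phase) pairs to a new
    list of the same length. Phase wrap-around is ignored when
    measuring the change, which is a simplification.
    """
    iterations = 0
    while iterations < max_iters:
        new_cues = update(cues)
        change = max(
            (abs(a - b)
             for new, old in zip(new_cues, cues)
             for a, b in zip(new, old)),
            default=0.0,
        )
        cues = new_cues
        iterations += 1
        if change < threshold:
            break
    return cues, iterations
```

For example, `spatial_cues([1 + 0j], [0 + 2j])` yields one bin with an amplitude ratio of 2 and a phase difference of a quarter turn. Any convergent `update` function can be passed to `iterate_until_stable` to see the stopping rule in action.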
Published in:
Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09. IEEE Workshop on
Date of Conference: 18-21 Oct. 2009