Evaluation of several strategies for single sensor speech/music separation


Authors:

Raphaël Blouet (MIST Technologies Research, 204 rue de Crimée, 75019 Paris, France); Guy Rapaport; Israel Cohen; Cédric Févotte

In this paper we address the application of single-sensor source separation techniques to mixtures of speech and music. Three strategies for source modeling are presented: Gaussian scaled mixture models (GSMM), autoregressive (AR) models, and amplitude factors (AF). Common to all three methods is the use of a codebook of elementary spectral shapes to represent non-stationary signals, handling spectral shape and amplitude information separately. We propose a new system that employs separate models for the speech and music signals: the speech signal proves to be best modeled with the AR-based codebook, while the music signal is best modeled with the AF-based codebook. Experimental results demonstrate the improved performance of the proposed approach for speech/music separation on several evaluation criteria.
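The codebook idea the abstract describes can be sketched in a few lines. The toy Python below is an illustrative assumption, not the authors' implementation: each source gets a small codebook of elementary power-spectral shapes, and for each mixture frame an amplitude-factor-style search fits non-negative amplitudes for every pair of codebook entries, keeping the pair that best explains the mixture before reconstructing the sources with a Wiener-style mask. Codebook sizes, names, and the least-squares amplitude fit are all hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy codebooks of elementary spectral shapes (power spectra),
# one per source; sizes are illustrative only.
n_bins, k_speech, k_music = 32, 4, 4
speech_cb = np.abs(rng.normal(size=(k_speech, n_bins))) + 1e-3
music_cb = np.abs(rng.normal(size=(k_music, n_bins))) + 1e-3

def separate_frame(mix_psd, cb1, cb2):
    """Amplitude-factor style search: for every pair of codebook entries,
    fit non-negative amplitudes by least squares and keep the pair that
    best explains the mixture power spectrum."""
    best = None
    for s1 in cb1:
        for s2 in cb2:
            A = np.stack([s1, s2], axis=1)           # (n_bins, 2)
            amps, *_ = np.linalg.lstsq(A, mix_psd, rcond=None)
            amps = np.clip(amps, 0.0, None)          # keep amplitudes non-negative
            resid = np.sum((A @ amps - mix_psd) ** 2)
            if best is None or resid < best[0]:
                best = (resid, amps[0] * s1, amps[1] * s2)
    _, psd1, psd2 = best
    # Wiener-style mask built from the fitted source power spectra
    total = psd1 + psd2 + 1e-12
    return mix_psd * psd1 / total, mix_psd * psd2 / total

# Toy mixture frame: one speech shape plus one music shape
mix = 2.0 * speech_cb[1] + 0.5 * music_cb[2]
est_speech, est_music = separate_frame(mix, speech_cb, music_cb)
```

Because the masks sum to one, the two estimates always add back up to the mixture frame; the exhaustive pair search is what the paper's codebook strategies replace with proper statistical models (GSMM, AR, AF) per source.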

Published in:

2008 IEEE International Conference on Acoustics, Speech and Signal Processing

Date of Conference:

31 March - 4 April 2008