Towards source-filter based single sensor speech separation

Authors: Stark, M.; Pernkopf, F. (Signal Processing & Speech Communication Laboratory, Graz University of Technology, Graz)

We present a new source-filter based method for separating two speakers who talk simultaneously at equal level and are recorded by a single sensor. First, the relation between the spectrally whitened mixture and the speakers' excitation signals is analyzed; for this, a factorial HMM that also captures time dependencies is exploited. The estimated excitation signals are then combined with the best-fitting vocal tract information taken from a trained dictionary. We report results on the database of Cooke, considering 108 speech mixtures. The average improvement of 2.9 dB in SIR over all data is lower, but not significantly lower, than that of the Gaussian mixture method, which relies on known pitch tracks. Although the performance is currently moderate, we believe this approach is a significant step towards speaker-independent single-sensor speech separation.
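The source-filter decomposition underlying the method, splitting a speech frame into an excitation (source) signal and a vocal-tract (filter) envelope, can be illustrated with a minimal LPC sketch. This is not the authors' implementation; the LPC order, pitch period, and toy filter coefficients below are illustrative assumptions only. Inverse filtering with the estimated all-pole model A(z) yields the spectrally whitened residual (excitation), and filtering the residual through 1/A(z) resynthesizes the frame:

```python
import numpy as np
from scipy.signal import lfilter

def lpc(frame, order):
    """Autocorrelation-method LPC: solve the Yule-Walker equations."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -a))  # A(z) = 1 - sum_k a_k z^{-k}

# Synthetic voiced frame: impulse-train excitation through a toy all-pole
# vocal-tract filter (coefficients are arbitrary, for illustration only).
fs = 8000
n = 400
excitation = np.zeros(n)
excitation[::80] = 1.0                   # ~100 Hz pitch at fs = 8 kHz
vocal_tract = [1.0, -1.3, 0.8]           # toy A(z) of the vocal tract
frame = lfilter([1.0], vocal_tract, excitation)

a = lpc(frame, order=8)
residual = lfilter(a, [1.0], frame)      # inverse filtering -> whitened excitation
resynth = lfilter([1.0], a, residual)    # source-filter resynthesis
```

In the paper's setting, the excitation estimate comes from the factorial HMM on the whitened mixture rather than from a single-speaker residual, and the vocal-tract filter is chosen from a trained dictionary rather than estimated per frame; the sketch only shows the analysis/synthesis identity that the recombination step relies on.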

Published in:

2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)

Date of Conference:

19-24 April 2009