By Topic

Using Spatial Audio Cues from Speech Excitation for Meeting Speech Segmentation

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Cheng, E. ; Sch. of Electr. Comput., & Telecommun. Eng., Wollongong Univ., NSW ; Burnett, I. ; Ritz, C.

Multiparty meetings generally involve stationary participants. Participant location information can thus be used to segment the recorded meeting speech into each speaker's 'turn' for meeting 'browsing'. To represent speaker location information from speech, previous research showed that the most reliable time delay estimates are extracted from the Hubert envelope of the linear prediction residual signal. The authors' past work has proposed the use of spatial audio cues to represent speaker location information. This paper proposes extracting spatial audio cues from the Hubert envelope of the speech residual for indicating changing speaker location for meeting speech segmentation. Experiments conducted on recordings of a real acoustic environment show that spatial cues from the Hubert envelope are more consistent across frequency subbands and can clearly distinguish between spatially distributed speakers, compared to spatial cues estimated from the recorded speech or residual signal

Published in:

Signal Processing, 2006 8th International Conference on  (Volume:4 )

Date of Conference:

16-20 2006