Speech/Music Discrimination for Robust Speech Recognition in Robots

3 Author(s)
Mu Yeol Choi (Department of Electronics Engineering, Pusan National University, Busan 609-735, Korea); Hwa Jeon Song; Hyung Soon Kim

Automatic speech recognition (ASR) is an indispensable technology for communicating with a service robot. In real-world environments, ASR encounters many kinds of sound sources, which must be discriminated to improve ASR performance. In ASR systems, speech is usually detected in the input signal by a voice activity detection (VAD) scheme. Speech and music, however, are not easily discriminated by VAD because they share similar characteristics such as periodicity. In this paper, we add a speech/music discriminator to the front-end of the ASR system so that music streams are not passed to the recognizer as input. Our speech/music discriminator employs the mean of minimum cepstral distances (MMCD) as a feature parameter. Experimental results show that the MMCD parameter outperforms the conventional feature parameter, spectral flux.
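For readers unfamiliar with the baseline feature named in the abstract, the following is a minimal sketch of one common formulation of spectral flux: the L2 distance between the normalized magnitude spectra of successive frames. It is not taken from the paper. The function names, the frame length, the hop size, and the sum-to-one normalization are all illustrative assumptions, and definitions of spectral flux vary across the literature. MMCD is not sketched here because the abstract does not define how it is computed.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_flux(signal, frame_len=64, hop=32):
    """L2 distance between normalized magnitude spectra of successive frames.

    frame_len and hop are hypothetical defaults chosen for illustration,
    not values reported in the paper.
    """
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mags = magnitude_spectrum(signal[start:start + frame_len])
        total = sum(mags) or 1.0  # avoid division by zero on silent frames
        spectra.append([m / total for m in mags])
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(cur, prev)))
        for prev, cur in zip(spectra, spectra[1:])
    ]
```

Under these assumptions, a stationary tone yields flux values near zero, while a signal whose spectrum changes abruptly between frames yields a clear peak at the transition. A discriminator built on this feature would typically summarize the flux over a longer window before making a decision, but that step depends on design choices the abstract does not specify.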

Published in:

RO-MAN 2007 - The 16th IEEE International Symposium on Robot and Human Interactive Communication

Date of Conference:

26-29 Aug. 2007