A fast and robust speech/music discrimination approach

3 Author(s)
Wang, W.Q. ; Coll. of Inf. Sci. & Eng., Chinese Acad. of Sci., China ; Gao, W. ; Ying, D.W.

This paper presents a simple and effective approach to discriminating between speech and music. First, the proposed modified low energy ratio is extracted from each window-level segment as the only feature. Then the system applies a Bayes MAP classifier to decide the audio class of each segment. Finally, because the audio types of neighboring segments are strongly correlated, a novel context-based post-decision method is designed to refine the classification results. The proposed method is evaluated on about 5 hours of audio data, which includes clean and noisy speech from various speakers as well as a wide range of musical content. The experimental results are promising: a classification accuracy of more than 97% is achieved despite the method's low computational complexity.
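
The three stages named in the abstract (one energy-based feature per segment, a MAP decision, and a context-based refinement) can be sketched roughly as follows. This is an illustrative sketch, not the authors' method: the abstract does not define the modified low energy ratio or the post-decision rule, so the code substitutes the conventional low energy ratio (the fraction of frames whose energy falls below a fraction of the segment's mean energy) and a simple neighborhood majority vote. The Gaussian class models, their parameters, the frame length, the threshold, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def low_energy_ratio(samples: Sequence[float], frame_len: int = 4,
                     threshold: float = 0.5) -> float:
    """Conventional low energy ratio (a stand-in for the paper's modified
    version): the fraction of frames whose mean-square energy is below
    `threshold` times the mean frame energy of the segment."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    energies = [sum(x * x for x in frame) / frame_len for frame in frames]
    mean_energy = sum(energies) / len(energies)
    low = sum(1 for e in energies if e < threshold * mean_energy)
    return low / len(energies)


def gaussian_log_pdf(x: float, mean: float, std: float) -> float:
    """Log density of a univariate Gaussian (assumed class-conditional model)."""
    return (-0.5 * math.log(2 * math.pi * std * std)
            - (x - mean) ** 2 / (2 * std * std))


def map_classify(feature: float,
                 models: Dict[str, Tuple[float, float, float]]) -> str:
    """MAP decision: pick the label maximizing log prior + log likelihood.
    `models` maps label -> (prior, mean, std)."""
    return max(
        models,
        key=lambda label: math.log(models[label][0])
        + gaussian_log_pdf(feature, models[label][1], models[label][2]),
    )


def smooth_labels(labels: List[str], radius: int = 1) -> List[str]:
    """Majority vote over neighboring segments (a stand-in for the paper's
    context-based post-decision); on a tie the original label is kept."""
    smoothed = []
    for i, label in enumerate(labels):
        neighborhood = labels[max(0, i - radius):i + radius + 1]
        counts: Dict[str, int] = {}
        for item in neighborhood:
            counts[item] = counts.get(item, 0) + 1
        best = max(counts.values())
        winners = [item for item, count in counts.items() if count == best]
        smoothed.append(label if label in winners else winners[0])
    return smoothed


# Hypothetical class models: speech alternates between loud and quiet frames,
# so its low energy ratio is assumed to be higher than that of music.
MODELS = {"speech": (0.5, 0.5, 0.1), "music": (0.5, 0.1, 0.1)}

bursty = [1.0] * 4 + [0.0] * 8 + [1.0] * 4   # speech-like toy segment
steady = [1.0] * 16                          # music-like toy segment
raw = [map_classify(low_energy_ratio(seg), MODELS)
       for seg in (bursty, bursty, steady, bursty, bursty)]
refined = smooth_labels(raw)
```

In this toy run the isolated "music" decision in the middle of a speech passage is overturned by its neighbors, which is the kind of correction a context-based post-decision step is meant to provide. In practice the class models would be estimated from labeled training audio rather than set by hand.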

Published in:

Proceedings of the 2003 Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing and the Fourth Pacific Rim Conference on Multimedia (Volume 3)

Date of Conference:

15-18 Dec. 2003