Classification of four affective modes in online songs and speeches

Authors:

Chien Hung Chen; Ping Tsung Lu; O. T.-C. Chen
Dept. of Electrical Engineering, National Chung Cheng University, Chiayi, Taiwan

Abstract:

The amount of multimedia content on websites grows dramatically every day, so effectively searching this data and finding what we need has become a critical issue. In this work, four affective modes, exciting/happy, angry, sad, and calm, are investigated in songs and speeches. A song clip is partitioned into its main and refrain parts, each of which is analyzed by tempo, normalized intensity mean, and rhythm regularity. For a speech clip, the standard deviation of fundamental frequencies, the standard deviation of pauses, and the mean of zero-crossing rates are computed to characterize the speaker's emotion. In particular, a Gaussian mixture model is built and used for classification. In our experiments, the average accuracies for the main parts of songs, the refrain parts of songs, and speeches reach 55%, 60%, and 80%, respectively. The proposed method can therefore be employed to analyze songs and speeches downloaded from websites and provide emotion information to users.
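The abstract gives no implementation details, but the pipeline it describes (extract a small feature vector per clip, then score it against Gaussian mixture models) is straightforward to sketch. Below is a minimal, illustrative Python example, assuming numpy and scikit-learn, that approximates the speech branch: a zero-crossing rate, an energy-threshold pause detector, and an autocorrelation pitch estimate stand in for the authors' unspecified feature extraction, and one GMM per affective mode is fit and compared by log-likelihood, a common way to use GMMs as classifiers (the paper does not say whether it trains one model per class). All function names and thresholds here are hypothetical.

import numpy as np
from sklearn.mixture import GaussianMixture

EMOTIONS = ["exciting_happy", "angry", "sad", "calm"]

def zero_crossing_rate(frame):
    # Fraction of adjacent samples whose signs differ.
    return np.mean(np.abs(np.diff(np.sign(frame))) > 0)

def estimate_f0(frame, sr, fmin=75, fmax=400):
    # Crude autocorrelation pitch estimate; returns 0.0 for unvoiced frames.
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    if ac[0] <= 0 or hi >= len(ac):
        return 0.0
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag if ac[lag] > 0.3 * ac[0] else 0.0

def extract_speech_features(signal, sr, frame_len=1024, hop=512):
    # The paper's 3-D speech feature vector:
    # [std of F0, std of pause lengths, mean zero-crossing rate].
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len, hop)]
    zcr_mean = np.mean([zero_crossing_rate(f) for f in frames])

    f0s = [f0 for f0 in (estimate_f0(f, sr) for f in frames) if f0 > 0]
    f0_std = np.std(f0s) if f0s else 0.0

    # Pause detection: a run of low-energy frames counts as one pause,
    # whose length is measured in frames.
    energies = np.array([np.sum(f ** 2) for f in frames])
    silent = energies < 0.1 * np.median(energies)
    pause_lens, run = [], 0
    for s in silent:
        if s:
            run += 1
        elif run:
            pause_lens.append(run)
            run = 0
    if run:
        pause_lens.append(run)
    pause_std = np.std(pause_lens) if pause_lens else 0.0

    return np.array([f0_std, pause_std, zcr_mean])

def train_models(features_by_emotion, n_components=4):
    # Fit one GMM per affective mode on its training feature vectors.
    models = {}
    for emo, X in features_by_emotion.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(np.asarray(X))
        models[emo] = gmm
    return models

def classify(models, feature_vec):
    # Assign the emotion whose GMM yields the highest log-likelihood.
    x = feature_vec.reshape(1, -1)
    return max(models, key=lambda emo: models[emo].score(x))

The song branch would reuse the same GMM machinery, with tempo, normalized intensity mean, and rhythm regularity computed separately for the main and refrain parts. With only four classes and three features per clip, diagonal covariances keep such small models from overfitting limited training data.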

Published in:

19th Annual Wireless and Optical Communications Conference (WOCC 2010)

Date of Conference:

14-15 May 2010