Feature vector classification based speech emotion recognition for service robots

Authors:
Jeong-sik Park (Comput. Sci. Div., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea); Ji-Hwan Kim; Yung-Hwan Oh

This paper proposes an efficient feature vector classification method for Speech Emotion Recognition (SER) in service robots. Because service robots interact with diverse users in various emotional states, two issues must be addressed: acoustically similar characteristics between emotions, and variable speaker characteristics due to different user speaking styles. Each of these issues may cause substantial overlap between emotion models in feature vector space, thus decreasing SER accuracy. The conventional feature vector classification, applied to speaker identification, categorizes feature vectors as overlapped or non-overlapped. Because this method discards all of the overlapped vectors in model reconstruction, it has limitations in constructing robust models when the number of overlapped vectors increases significantly, as in emotion recognition. To reduce the effects of such overlaps, the proposed method classifies overlapped vectors more finely: it selects discriminative vectors among the overlapped vectors and adds them to model reconstruction. In SER experiments using an emotional speech corpus, the proposed classification approach outperformed conventional methods and displayed almost human-level performance. In particular, we achieved commercially applicable performance for two-class (negative vs. non-negative) emotion recognition.
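
The abstract does not say how overlap is measured or how discriminative vectors are chosen, so the following Python sketch only illustrates the general idea under loudly stated assumptions. It assumes diagonal Gaussian emotion models and a log-likelihood gap criterion. The `margin` threshold and every function name are hypothetical and are not taken from the paper. A vector counts as non-overlapped when its own emotion model scores it highest. It counts as overlapped otherwise, and overlapped vectors within `margin` of the best rival model are kept as "discriminative" rather than discarded.

```python
import numpy as np

def fit_diag_gaussian(x):
    """Fit a diagonal Gaussian; a simple stand-in for a real emotion model (assumption)."""
    return x.mean(axis=0), x.var(axis=0) + 1e-6  # small floor avoids zero variance

def log_likelihood(x, model):
    """Per-vector log-likelihood under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def classify_vectors(features, labels, margin=1.0):
    """Tag each vector 'non_overlapped', 'discriminative', or 'discarded'.

    `margin` is a hypothetical threshold on the log-likelihood gap.
    """
    classes = np.unique(labels)
    models = {c: fit_diag_gaussian(features[labels == c]) for c in classes}
    # scores[i, k] = log-likelihood of vector i under the model of classes[k]
    scores = np.stack([log_likelihood(features, models[c]) for c in classes], axis=1)
    rows = np.arange(len(labels))
    own_idx = np.searchsorted(classes, labels)
    own = scores[rows, own_idx]
    rival = scores.copy()
    rival[rows, own_idx] = -np.inf          # mask out the vector's own model
    gap = own - rival.max(axis=1)           # > 0 means the own model wins
    return np.where(gap > 0, "non_overlapped",
                    np.where(gap > -margin, "discriminative", "discarded"))

def reconstruct_models(features, labels, tags):
    """Refit each model on non-overlapped plus discriminative vectors.

    Assumes every class retains at least one vector.
    """
    keep = tags != "discarded"
    return {c: fit_diag_gaussian(features[keep & (labels == c)])
            for c in np.unique(labels)}
```

With `margin=0` the sketch reduces to the conventional scheme described above, in which every overlapped vector is discarded. Larger values of `margin` retain more of the overlapped vectors for model reconstruction.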

Published in:

IEEE Transactions on Consumer Electronics (Volume: 55, Issue: 3)