Improving Automatic Classification of Prosodic Events by Pairwise Coupling

4 Author(s)
César Gonzalez-Ferreras (Department of Computer Science, Universidad de Valladolid, Valladolid, Spain); David Escudero-Mancebo; Carlos Vivaracho-Pascual; Valentín Cardenoso-Payo

This paper presents a system that automatically labels Tones and Break Indices (ToBI) events. The detection (binary classification) of prosodic events has received significantly more attention from researchers than their classification, because classification is intrinsically more difficult. We focus on the classification problem, identifying eight types of pitch accent tones, nine types of boundary tones and five types of break indices. The complex multi-class classification problem is divided into several simpler problems by means of pairwise coupling. We propose combining two-class classifiers to achieve the multi-class classification because two-class problems can be solved with high accuracy. Furthermore, the complementarity between artificial neural networks and decision tree classifiers is exploited to improve the final system by combining their outputs with a fusion method. This proposal, together with feature extraction that includes features such as the Tilt and Bézier parameters, allows us to achieve a total classification accuracy of 70.8% for pitch accents, 84.2% for boundary tones and 74.6% for break indices on the Boston University Radio News Corpus. The analysis of the misclassified samples shows that the types of mistakes the system makes do not differ significantly from the common confusions observed in manual ToBI inter-transcriber tests.
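
The pairwise decomposition and output fusion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the coupling rule or the fusion method, so the sketch assumes a simple averaging rule for coupling and an unweighted mean for fusion. The function names, the three pitch accent labels and all probability values are hypothetical and chosen only for illustration. With k classes there are k(k-1)/2 two-class problems, so eight pitch accent types would give 28 pairs.

```python
from itertools import combinations


def couple_pairwise(pairwise, classes):
    """Combine pairwise estimates into one score per class.

    pairwise[(i, j)] is the estimated P(class i | class is i or j),
    given for every unordered pair in the order produced by
    itertools.combinations(classes, 2).
    Assumption: scores are the normalized sum of pairwise estimates,
    which is only one of several possible coupling rules.
    """
    k = len(classes)
    totals = {c: 0.0 for c in classes}
    for i, j in combinations(classes, 2):
        r_ij = pairwise[(i, j)]
        totals[i] += r_ij          # evidence for i from the (i, j) classifier
        totals[j] += 1.0 - r_ij    # complementary evidence for j
    norm = k * (k - 1) / 2.0       # number of pairs, so scores sum to 1
    return {c: totals[c] / norm for c in classes}


def fuse_average(scores_a, scores_b):
    """Assumed fusion rule: unweighted mean of two classifiers' scores."""
    return {c: 0.5 * (scores_a[c] + scores_b[c]) for c in scores_a}


# Hypothetical three-class example with made-up pairwise outputs.
classes = ["H*", "L*", "L+H*"]
ann_pairs = {("H*", "L*"): 0.8, ("H*", "L+H*"): 0.6, ("L*", "L+H*"): 0.3}
tree_pairs = {("H*", "L*"): 0.7, ("H*", "L+H*"): 0.45, ("L*", "L+H*"): 0.2}

ann_scores = couple_pairwise(ann_pairs, classes)
tree_scores = couple_pairwise(tree_pairs, classes)
fused = fuse_average(ann_scores, tree_scores)
predicted = max(fused, key=fused.get)
```

In this toy example the two sets of pairwise outputs disagree on the second pair, and the fused scores decide the label. A weighted fusion or an iterative coupling rule could be substituted without changing the overall structure.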

Published in:

IEEE Transactions on Audio, Speech, and Language Processing (Volume: 20, Issue: 7)