
Log Likelihood Ratio Based Annotation Verification of a Norwegian Speech Synthesis Database


3 Author(s)
Amdal, I. ; Dept. of Electron. & Telecommun., Norwegian Univ. of Sci. & Technol., Trondheim ; Johnsen, M.H. ; Svendsen, T.

Accurate labeling and segmentation of the unit inventory database is of vital importance to the quality of unit selection text-to-speech synthesis. Misalignments and mismatches between the predicted and pronounced unit sequences require manual correction to achieve natural-sounding synthesis. In this paper we use log-likelihood-ratio-based utterance verification to automatically detect annotation errors in a Norwegian two-speaker synthesis database. Each sentence is assigned a confidence score, and those falling below a threshold can be discarded or manually inspected and corrected. Using equal reject number as the criterion, the transcription sentence error rate was reduced from 9.8% to 2.7%. Insertions are the largest error category, and 95.6% of these were detected. A closer inspection of false rejections was performed to assess (and improve) the phoneme prediction system.
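
The per-sentence confidence scoring and thresholding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the score is the frame-normalized difference between the log likelihood of the annotated transcription (from a forced alignment) and the log likelihood of an unconstrained recognition pass. The names `SentenceScore`, `llr_confidence`, and `flag_for_inspection`, the example values, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SentenceScore:
    sentence_id: str
    forced_loglik: float  # assumed: log likelihood of the annotated transcription
    free_loglik: float    # assumed: log likelihood of an unconstrained recognition pass
    n_frames: int         # number of acoustic frames in the sentence


def llr_confidence(s: SentenceScore) -> float:
    """Frame-normalized log likelihood ratio; lower means less trust in the annotation."""
    return (s.forced_loglik - s.free_loglik) / s.n_frames


def flag_for_inspection(sentences, threshold):
    """Return the ids of sentences whose confidence falls below the threshold."""
    return [s.sentence_id for s in sentences if llr_confidence(s) < threshold]


# Made-up numbers, for illustration only.
sentences = [
    SentenceScore("utt001", -5000.0, -4990.0, 500),
    SentenceScore("utt002", -6200.0, -5900.0, 600),
    SentenceScore("utt003", -3000.0, -2997.0, 300),
]
flagged = flag_for_inspection(sentences, threshold=-0.1)
```

In this sketch the threshold is passed in directly. How the paper sets it (the "equal reject number" criterion) is not detailed in the abstract, so it is left out here.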

Published in:

Signal Processing Symposium, 2006. NORSIG 2006. Proceedings of the 7th Nordic

Date of Conference:

June 2006