By Topic

Generation of F0 contours using a model-constrained data-driven method

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
A. Sakurai ; Texas Instruments Japan Ltd, Tsukuba, Japan ; K. Hirose ; N. Minematsu

Introduces a model-constrained, data-driven method for generating fundamental frequency contours in Japanese text-to-speech synthesis. In the training phase, the parameters of a command-response F0 contour generation model are learned by a prediction module, which can be a neural network or a set of binary regression trees. The input features consist of linguistic information related to accentual phrases that can be automatically derived from text, such as the position of the accentual phrase in the utterance, number of morae, accent type, and parts-of-speech. In the synthesis phase, the prediction module is used to generate appropriate values of model parameters. The use of the parametric model restricts the degrees of freedom of the problem, facilitating data-driven learning. Experimental results show that the method makes it possible to generate quite natural F0 contours with a relatively small training database

Published in:

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on  (Volume:2 )

Date of Conference:

2001