A Japanese TTS system based on multiform units and a speech modification algorithm with harmonics reconstruction

5 Author(s)
Takano, S. ; Cyber Space Labs., NTT, Kanagawa, Japan ; Tanaka, K. ; Mizuno, H. ; Abe, M.
more authors

This paper proposes a new text-to-speech (TTS) system that uses a large number of speech segments to produce very natural and intelligible synthetic speech. It introduces two innovations: new multiform synthesis units, and a new speech modification algorithm based on a vocoder that offers harmonics reconstruction. The multiform units are prepared with various lengths and various F0 contours, which makes it possible to reduce both acoustic discontinuities at concatenation points and unnatural sound. The new speech modification algorithm, in turn, improves the quality of prosody-modified speech, and is extremely effective when synthesizing speech whose prosodic parameters differ considerably from those of the synthesis units. Listening tests confirm that the new synthesis units yield speech with high intelligibility and naturalness, and that the new speech modification algorithm is superior to all other conventional vocoders and waveform-domain algorithms, including TD-PSOLA, especially when modifying F0 upward.

Published in:

IEEE Transactions on Speech and Audio Processing (Volume: 9, Issue: 1)