A Chinese text to speech system based on TD-PSOLA

4 Author(s)
Yunbo Zhu (Dept. of Radio Engineering, Southeast Univ., Nanjing, China); Li Zhao; Y. Xu; Y. Niimi

The paper presents the implementation of a Chinese text-to-speech (hereafter TTS) system based on the Time Domain Pitch-Synchronous OverLap-Add approach (hereafter TD-PSOLA). To obtain natural synthesized speech, it is necessary to precisely extract pitch marks for each monosyllabic speech unit, to predict the length of the syllables in a sentence to be synthesized, and to generate F0 contours for their final portion. The paper concentrates on the last two issues: it proposes a scheme for predicting syllable duration, which yields a relative length error of about 18%, and a scheme for generating the F0 contour. To synthesize a given tonal syllable with a desired duration, a new pattern-scaling algorithm is proposed. A preliminary listening test showed that the intelligibility and naturalness of the synthetic speech were good.
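The abstract does not define the relative length error, so the following is only a minimal sketch of one common reading: the mean of |predicted − actual| / actual over all syllables. The function name `mean_relative_length_error` and the duration values are hypothetical and are not taken from the paper.

```python
def mean_relative_length_error(predicted_ms, actual_ms):
    """Mean of |predicted - actual| / actual over all syllables.

    Assumed definition: the paper's abstract does not state the exact formula.
    Durations are given in milliseconds and actual durations must be positive.
    """
    if len(predicted_ms) != len(actual_ms):
        raise ValueError("predicted and actual must have the same length")
    if not actual_ms:
        raise ValueError("at least one syllable is required")
    if any(a <= 0 for a in actual_ms):
        raise ValueError("actual durations must be positive")
    errors = [abs(p - a) / a for p, a in zip(predicted_ms, actual_ms)]
    return sum(errors) / len(errors)


# Hypothetical syllable durations (ms), for illustration only.
predicted = [180.0, 220.0, 150.0]
actual = [200.0, 200.0, 150.0]
error = mean_relative_length_error(predicted, actual)
print(f"mean relative length error: {error:.1%}")
```

Under this reading, a value of about 0.18 would correspond to the roughly 18% figure quoted in the abstract.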

Published in:

TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering (Volume: 1)

Date of Conference:

28-31 Oct. 2002