The authors describe the training and use of a multilayer perceptron (MLP) that maps the spectra of vowels and nasal consonants, taken from examples spoken by a single speaker, to sets of area parameters for the vocal-tract-modelling filter of a speech synthesizer. Different MLP structures have been investigated, using either PARCOR coefficients or sample values of the spectrum as input data. The trained MLP can be used to estimate the driving parameters for speech synthesis from natural utterances restricted to this phoneme set.
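The mapping described in the abstract can be illustrated with a minimal sketch of an MLP forward pass from one frame of PARCOR coefficients to a set of area parameters. Everything specific here is an assumption made for illustration and is not taken from the paper: the layer sizes (12 inputs, 16 hidden units, 8 outputs), the sigmoid activations, the function names, and the random, untrained weights. The abstract does not state the network dimensions or the training procedure, so none is shown.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(x, weights, biases):
    """Weighted sum plus bias for each unit, passed through the sigmoid."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, layers):
    """Propagate an input vector through every layer in order."""
    for weights, biases in layers:
        x = layer_forward(x, weights, biases)
    return x


# Hypothetical dimensions: PARCOR inputs, hidden units, area parameters.
N_PARCOR, N_HIDDEN, N_AREAS = 12, 16, 8

rng = random.Random(0)
layers = [
    init_layer(N_PARCOR, N_HIDDEN, rng),
    init_layer(N_HIDDEN, N_AREAS, rng),
]

# A stand-in input frame; real PARCOR coefficients also lie in (-1, 1).
parcor_frame = [rng.uniform(-1.0, 1.0) for _ in range(N_PARCOR)]

# Untrained output: normalized area parameters, one per tube section.
areas = mlp_forward(parcor_frame, layers)
```

With trained weights in place of the random ones, `areas` would hold the normalized area parameters estimated for that frame; swapping the input for sampled spectrum values changes only `N_PARCOR` and the input vector.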
Published in:
1989 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-89)
Date of Conference: 23-26 May 1989