This paper proposes a new technique for predicting the pitch contour from a small number of pitch values. The technique is based on the correlation between phonetic evolution and pitch variations during voiced speech. To follow the phonetic evolution, "temporal decomposition" (TD) is used. TD detects "event" functions, which serve as interpolation paths, and their locations (centroids), which serve as the instants at which the pitch values are determined. With the proposed technique, the pitch contour is predicted with an overall error of less than 5% for some of the spectral parameter sets used in TD. We also show that this technique can give a better result than conventional frame-by-frame pitch detection methods, as judged by a perceptually based spectral distance measure of the reconstructed speech.
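The interpolation idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the predicted contour is taken to be a normalized weighted sum of the pitch values sampled at the event centroids, with the event functions as weights; the triangular event functions below are a toy stand-in for real TD output; and the names `triangular_events`, `predict_pitch`, and `mean_relative_error` are hypothetical.

```python
import numpy as np

def triangular_events(centroids, n_frames):
    """Toy stand-in for TD event functions (an assumption, not real TD output).

    Event k equals 1 at its own centroid and falls linearly to 0 at the
    neighbouring centroids, so the events sum to 1 at every frame.
    """
    frames = np.arange(n_frames)
    identity = np.eye(len(centroids))
    return np.stack([np.interp(frames, centroids, identity[k])
                     for k in range(len(centroids))])

def predict_pitch(events, pitch_at_centroids):
    """Interpolate pitch along the event functions.

    events: array of shape (K, N), one row per event function.
    pitch_at_centroids: array of shape (K,), pitch sampled at each centroid.
    The normalization by the summed weights is an assumption of this sketch.
    """
    weights = events.sum(axis=0)
    return (pitch_at_centroids @ events) / weights

def mean_relative_error(predicted, reference):
    """Mean absolute error relative to the reference contour, in percent."""
    return 100.0 * np.mean(np.abs(predicted - reference) / reference)

# Synthetic voiced segment: a slowly varying pitch contour in Hz.
n_frames = 60
frames = np.arange(n_frames)
reference = 120.0 + 15.0 * np.sin(2.0 * np.pi * frames / 80.0)

# Hypothetical centroid locations (frame indices), in increasing order.
centroids = np.array([0, 12, 27, 41, 59])

events = triangular_events(centroids, n_frames)
predicted = predict_pitch(events, reference[centroids])
print(f"mean relative error: {mean_relative_error(predicted, reference):.2f}%")
```

In this sketch only five pitch values are kept for sixty frames; the rest of the contour is recovered from the event functions. With real TD event functions the interpolation paths would follow the spectral evolution of the speech instead of straight lines.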
Published in:
TENCON '96. Proceedings., 1996 IEEE TENCON. Digital Signal Processing Applications (Volume: 1)
Date of Conference: 26-29 Nov 1996