A functional fundamental frequency (F0) model is applied to extract tone peak and gliding features from Mandarin F0 contours, with the aim of automatic prosodic labeling of a large-scale speech corpus. We model the four lexical tones and represent them in parametric form based on the F0 model. We first cluster baseline tone patterns using the LBG (Linde-Buzo-Gray) algorithm, then perform analysis-by-synthesis pattern matching to estimate the underlying tone peaks and tone pattern types from observed F0 contours and phonetic labels with lexical tones. Tone gliding features are re-estimated once the tone peaks have been determined. In an open test of 968 utterances from eight native speakers, 94% of the automatically estimated labels were consistent with the manual labels. Experimental results also indicate that the proposed method is applicable to F0 contour smoothing and tone verification.
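The clustering step can be illustrated with a minimal sketch of the generic LBG algorithm: start from the global centroid, split every codeword into two perturbed copies, and refine with nearest-neighbour reassignment until the average distortion stops improving. The sketch assumes each tone pattern has already been reduced to a fixed-length parameter vector. The function name `lbg`, the additive split offset `eps`, the squared Euclidean distortion, and the stopping rule are illustrative assumptions, not details taken from the paper.

```python
def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _sqdist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _nearest(v, codebook):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: _sqdist(v, codebook[i]))


def lbg(vectors, size, eps=0.01, tol=1e-6, max_iter=100):
    """Grow a codebook of `size` codewords (a power of two) by LBG splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    codebook = [_mean(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split: replace every codeword by two slightly shifted copies.
        codebook = [[c + s * eps for c in cw]
                    for cw in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign every vector to its nearest codeword.
            cells = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                i = _nearest(v, codebook)
                cells[i].append(v)
                distortion += _sqdist(v, codebook[i])
            distortion /= len(vectors)
            # Move each codeword to the centroid of its cell;
            # an empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            # Stop when the relative improvement falls below tol.
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook


# Toy two-dimensional vectors (made-up numbers, not data from the paper).
patterns = [[0, 0], [0, 1], [1, 0], [1, 1],
            [10, 10], [10, 11], [11, 10], [11, 11]]
codebook = lbg(patterns, 2)
```

In a setting like the one described, each resulting codeword would serve as one baseline tone pattern, and the later pattern-matching stage would choose among these patterns for each syllable.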