Tone feature extraction through parametric modeling and analysis-by-synthesis-based pattern matching

Authors:

Jinfu Ni (ATR Spoken Language Translation Res. Labs., Kyoto, Japan); Kawai, H.

A functional fundamental frequency (F0) model is applied to extract tone peak and gliding features from Mandarin F0 contours, with the aim of automatically labeling the prosody of a large-scale speech corpus. The four lexical tones are modeled and represented in parametric form based on the F0 model. Baseline tone patterns are first clustered using the LBG (Linde-Buzo-Gray) algorithm. Analysis-by-synthesis-based pattern matching is then performed to estimate the underlying tone peaks and tone pattern types from the observed F0 contours and from phonetic labels annotated with lexical tones. Tone gliding features are re-estimated once the tone peaks have been determined. In an open test of 968 utterances from eight native speakers, 94% of the automatically estimated labels were consistent with the manual labels. The experimental results also indicate that the proposed method is applicable to F0 contour smoothing and tone verification.
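
To illustrate the clustering step named in the abstract, here is a minimal sketch of the generic LBG (Linde-Buzo-Gray) procedure: start from the global mean, split every codeword into two perturbed copies, refine with nearest-neighbor assignment and centroid updates, and repeat until the codebook reaches the requested size. The abstract does not give the tone parameter vectors, the distortion measure, or the codebook size used in the paper, so the squared Euclidean distance, the additive split offset `eps`, the stopping threshold `tol`, and all function names below are assumptions made for illustration only.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def lbg(vectors, size, eps=0.01, tol=1e-6, max_iter=100):
    """Build a codebook of `size` codewords (a power of two) by LBG splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    codebook = [mean_vector(vectors)]  # start with a single global centroid
    while len(codebook) < size:
        # Split step: replace each codeword with two slightly offset copies.
        codebook = [[c + sign * eps for c in code]
                    for code in codebook for sign in (1, -1)]
        previous = float("inf")
        for _ in range(max_iter):
            # Assignment step: group vectors by nearest codeword.
            cells = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                i = nearest_index(v, codebook)
                cells[i].append(v)
                distortion += squared_distance(v, codebook[i])
            distortion /= len(vectors)
            # Update step: move each codeword to its cell centroid;
            # an empty cell keeps its previous codeword.
            codebook = [mean_vector(cell) if cell else code
                        for cell, code in zip(cells, codebook)]
            # Stop once the average distortion no longer improves noticeably.
            if previous - distortion <= tol * distortion:
                break
            previous = distortion
    return codebook
```

Calling `lbg(patterns, 4)` on a list of equal-length parameter vectors would return four centroid vectors. In the setting of the abstract, each vector would hold the parametric description of one tone pattern, but the exact parameterization is not stated there.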

Published in:

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings (Volume 1)

Date of Conference:

2003