Mandarin vowel synthesis based on 2D and 3D vocal tract model by finite-difference time-domain method

4 Author(s)
Yuguang Wang ; Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China ; Hongcui Wang ; Jianguo Wei ; Jianwu Dang

The finite-difference time-domain (FDTD) method is an effective numerical method for acoustic simulation. This paper focuses on the details of Mandarin vowel synthesis based on 2D and 3D vocal tract models using the FDTD method. To this end, a 3D vocal tract shape and a vocal tract area function were extracted from volumetric MRI images acquired during Mandarin vowel production. 3D and 2D models with staggered FDTD meshes were constructed from the vocal tract shape and its area function, respectively. Vowels were then synthesized by simulating sound wave propagation in the vocal tract using the FDTD method with a two-mass vocal fold model. The formant frequencies of the synthesized vowels were compared with those of real speech. The mean absolute errors of the formant frequencies were 7.77% and 6.07% for the 2D and 3D models, respectively. The results suggest that both models can reproduce speech formants with about the same accuracy. However, the 3D model exhibits more realistic behavior in the high-frequency region because it is based on the complete 3D vocal tract geometry. It was also observed that the bandwidths of real speech can be matched by setting the normal sound absorption coefficient within a proper range.
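
To illustrate what a staggered FDTD mesh means in practice, the following is a minimal sketch, not the authors' implementation. It reduces the problem to a 1D lossless uniform tube, whereas the paper uses 2D and 3D meshes, a two-mass vocal fold source, and wall absorption. Pressure is stored at cell nodes and particle velocity at the half-cell points between them, and the two are updated alternately (leapfrog). The function name `simulate_tube`, the tube length (0.17 m), the sound speed (350 m/s), the air density (1.14 kg/m^3), the Gaussian initial pulse, and the boundary conditions (rigid closed end, ideal open end) are all assumptions chosen for illustration.

```python
import math


def simulate_tube(n_cells=100, dx=0.0017, c=350.0, rho=1.14,
                  courant=0.5, n_steps=2000):
    """1D lossless staggered-grid FDTD in a uniform tube (illustrative only).

    All default values are assumptions, not parameters from the paper.
    Returns the time step and the pressure history at a probe node.
    """
    if not 0.0 < courant <= 1.0:
        raise ValueError("1D scheme is stable only for 0 < courant <= 1")
    dt = courant * dx / c  # time step from the Courant condition

    # Pressure lives on nodes 0..n_cells-1; start with a Gaussian pulse.
    centre, sigma = n_cells // 2, 4.0
    p = [math.exp(-((i - centre) ** 2) / (2.0 * sigma ** 2))
         for i in range(n_cells)]
    p[-1] = 0.0  # ideal open end: pressure release

    # Velocity lives on the half-cell points between pressure nodes.
    u = [0.0] * (n_cells - 1)

    ku = dt / (rho * dx)          # velocity update coefficient
    kp = rho * c * c * dt / dx    # pressure update coefficient

    probe = []
    for _ in range(n_steps):
        # Momentum equation: velocity driven by the pressure gradient.
        for i in range(n_cells - 1):
            u[i] -= ku * (p[i + 1] - p[i])
        # Continuity equation: pressure driven by the velocity divergence.
        p[0] -= kp * (u[0] - 0.0)  # rigid closed end: zero wall velocity
        for i in range(1, n_cells - 1):
            p[i] -= kp * (u[i] - u[i - 1])
        p[-1] = 0.0  # keep the open end at zero pressure
        probe.append(p[1])
    return dt, probe
```

For an ideal closed-open tube of length L, the resonances fall near (2n-1)c/(4L), so a spectrum of the probe signal would be expected to peak near those frequencies. Matching the formants and bandwidths of real speech additionally needs the measured vocal tract geometry and the absorbing wall condition described in the abstract.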

Published in:

Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific

Date of Conference:

3-6 Dec. 2012