Korean pronunciation variation modeling with probabilistic Bayesian networks

5 Author(s)
Sakriani Sakti ; MASTAR Project, Knowledge Creating Communication Research Center, National Institute of Information and Communications Technology (NICT), Japan ; Andrew Finch ; Ryosuke Isotani ; Hisashi Kawai

In Korean, a large proportion of word units are pronounced differently from their written forms, because the language is agglutinative and highly inflective and exhibits severe phonological phenomena and coarticulation effects. This paper reports on an ongoing study of Korean pronunciation modeling in which the mapping between phonemic and orthographic units is modeled by a Bayesian network (BN). The advantage of this graphical model framework is that the probabilistic relationships between these symbols, as well as additional knowledge sources, can be learned in a general and flexible way, so knowledge sources from different domains can be incorporated easily. In this preliminary study, we start with a simple topology in which the additional knowledge includes only the preceding and succeeding contexts of the current phonemic unit. In practice, the proposed BN pronunciation model is applied to our syllable-based Korean large-vocabulary continuous speech recognition (LVCSR) system, where the recognition task is structured as a serial architecture composed of two independent parts. The first part performs standard hidden Markov model (HMM)-based recognition of phonemic syllable units of the actual pronunciation (surface forms). In this way, the lexicon and the out-of-vocabulary rate can be kept small while avoiding high acoustic confusability. The second part then transforms the phonemic syllable surface forms into the desired Korean orthographic recognition units (eumjeol) using the proposed BN pronunciation model. Experimental results show that the proposed BN model can successfully map the phonemic syllable surface forms to eumjeol transcriptions with more than 97% accuracy on average. The results also show that the model enhances our Korean LVCSR system, giving about 25.53% absolute improvement on average over baseline orthographic syllable recognition.
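
The simple topology described in the abstract, where an orthographic unit depends on the current phonemic unit and its preceding and succeeding neighbors, can be illustrated with a small conditional probability table. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the network structure in detail, the training procedure, or any smoothing scheme. The function names (`train`, `posterior`, `decode`), the `BOUNDARY` padding symbol, the back-off to a context-free table, and the three-word toy data set are all hypothetical and chosen only for illustration. The sketch also assumes that phonemic and orthographic syllables are already aligned one to one.

```python
from collections import Counter, defaultdict

BOUNDARY = "#"  # hypothetical padding symbol for the edges of a word


def train(pairs):
    """Count orthographic units given (previous, current, next) phonemic units.

    `pairs` holds (phonemic_syllables, orthographic_syllables) tuples that are
    assumed to be aligned one to one (an assumption of this sketch).
    """
    full = defaultdict(Counter)     # (prev, cur, next) -> orthographic counts
    backoff = defaultdict(Counter)  # cur -> orthographic counts, context ignored
    for phon, orth in pairs:
        padded = [BOUNDARY] + list(phon) + [BOUNDARY]
        for i, target in enumerate(orth):
            prev, cur, nxt = padded[i], padded[i + 1], padded[i + 2]
            full[(prev, cur, nxt)][target] += 1
            backoff[cur][target] += 1
    return full, backoff


def posterior(full, backoff, prev, cur, nxt):
    """Return P(orthographic unit | prev, cur, next) as a dict.

    Falls back to the context-free table for unseen contexts, and to the
    identity mapping for unseen phonemic units (both are assumptions).
    """
    counts = full.get((prev, cur, nxt)) or backoff.get(cur)
    if not counts:
        return {cur: 1.0}
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}


def decode(full, backoff, phon):
    """Pick the most probable orthographic unit for each phonemic syllable."""
    padded = [BOUNDARY] + list(phon) + [BOUNDARY]
    result = []
    for i in range(len(phon)):
        dist = posterior(full, backoff, padded[i], padded[i + 1], padded[i + 2])
        result.append(max(dist, key=dist.get))
    return result


# Toy data for illustration only (not taken from the paper):
# surface pronunciation -> written form.
TOY_PAIRS = [
    (["궁", "물"], ["국", "물"]),  # nasalization: 국물 is pronounced [궁물]
    (["궁", "전"], ["궁", "전"]),  # no change: 궁전 is pronounced [궁전]
    (["구", "거"], ["국", "어"]),  # liaison: 국어 is pronounced [구거]
]

full_table, backoff_table = train(TOY_PAIRS)
print(decode(full_table, backoff_table, ["궁", "물"]))
print(decode(full_table, backoff_table, ["궁", "전"]))
```

In the toy data the same surface syllable 궁 maps to a different written syllable depending on what follows it, which is the kind of ambiguity that the context variables are meant to resolve. A full Bayesian network could add further parent nodes for other knowledge sources; this sketch covers only the three-parent case.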

Published in:

Universal Communication Symposium (IUCS), 2010 4th International

Date of Conference:

18-19 Oct. 2010