Pronunciation modeling for speech technology | IEEE Conference Publication | IEEE Xplore

Pronunciation modeling for speech technology


Abstract:

Written text is based on an orthographic representation of words, i.e. linear sequences of letters. Modern speech technology (automatic speech recognition and text-to-spe...Show More

First Page of the Article

Abstract:

Written text is based on an orthographic representation of words, i.e. linear sequences of letters. Modern speech technology (automatic speech recognition and text-to-speech synthesis) is based on phonetic units representing realization of sounds. A mapping between the orthographic form and phonetic forms representing the pronunciation is thus required. This may be obtained by creating pronunciation lexica and/or rule-based systems for grapheme-to-phoneme conversion. Traditionally, this mapping has been obtained manually, based on phonetic and linguistic knowledge. This approach has a number of drawbacks: i) the pronunciations represent typical pronunciations and will have a limited capacity for describing pronunciation variation due to speaking style and dialectical/accent variations; ii) if multiple pronunciation variants are included, it does not indicate which variants are more significant for the specific application; iii) the description is based on phonetic-knowledge and does not take into account that the units used in speech technology may deviate from the phonetic interpretation; and iv) the description is limited to units with a linguistic interpretation. The paper will present and discuss methods for modeling pronunciation and pronunciation variation specifically for applications in speech technology.
Date of Conference: 11-14 December 2004
Date Added to IEEE Xplore: 27 June 2005
Print ISBN:0-7803-8674-4
Conference Location: Bangalore, India

First Page of the Article


Contact IEEE to Subscribe

References

References is not available for this document.