A system is described that automatically generates phonetic transcriptions for German orthographic words. The generation process consists of two main steps. In the first step, the system segments each word into its morphs (prefixes, stems, and suffixes). This segmentation is essential for transcribing German words, because the pronunciation of a letter also depends on its morphological environment. In the second step, the system transcribes the morphologically segmented words. Several transcriptions can be generated per word, which allows the system to account for pronunciation variants. This feature is motivated by the system's application area: providing phonetic reference units for an automatic large-vocabulary speech recognition system. Statistical evaluations show that the transcription system performs very well linguistically: more than 99 percent of the segmented words are segmented correctly in the first step, and more than 98 percent of the words receive a correct phonetic transcription in the second step.
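The two-step process described above can be illustrated with a minimal sketch. It is not the system's actual method, which the abstract does not specify. The sketch assumes a tiny hand-made morph lexicon (`LEXICON`), an assumed ordering constraint of prefixes before stems before suffixes, and per-morph pronunciation variants written in IPA. The names `LEXICON`, `ORDER`, `segment`, and `transcribe` are all hypothetical.

```python
from itertools import product

# Hypothetical toy lexicon: morph -> (morph class, pronunciation variants).
# The entries and IPA strings are illustrative assumptions only.
LEXICON = {
    "ver": ("prefix", ["fɛɐ"]),
    "steh": ("stem", ["ʃteː"]),   # stem-initial "st" pronounced [ʃt]
    "haus": ("stem", ["haʊs"]),
    "en": ("suffix", ["ən", "n̩"]),  # two variants for the ending
}

# Assumed ordering constraint: prefixes, then stems, then suffixes.
ORDER = {"prefix": 0, "stem": 1, "suffix": 2}


def segment(word, min_rank=0, seen_stem=False):
    """Step 1: return every split of `word` into lexicon morphs.

    A split is accepted only if morph classes never go backwards in
    ORDER and at least one stem is present.
    """
    if not word:
        return [[]] if seen_stem else []
    results = []
    for end in range(1, len(word) + 1):
        morph = word[:end]
        entry = LEXICON.get(morph)
        if entry is None:
            continue
        rank = ORDER[entry[0]]
        if rank < min_rank:
            continue
        for rest in segment(word[end:], rank, seen_stem or rank == 1):
            results.append([morph] + rest)
    return results


def transcribe(morphs):
    """Step 2: combine per-morph variants into whole-word transcriptions."""
    variant_lists = [LEXICON[m][1] for m in morphs]
    return ["".join(combo) for combo in product(*variant_lists)]


if __name__ == "__main__":
    for morphs in segment("verstehen"):
        print("+".join(morphs), transcribe(morphs))
```

In this sketch, "verstehen" is segmented as ver+steh+en, and the two variants listed for the suffix yield two transcriptions for the word. Storing pronunciations per morph is a simplification chosen here to show why segmentation matters: the letters "st" receive their stem-initial pronunciation only because the stem boundary has been identified first.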
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.