Speech synthesizers for computer voice output are most useful when not restricted to a prestored vocabulary. The simplest approach to unrestricted text-to-speech translation uses a small set of letter-to-sound rules, each specifying a pronunciation for one or more letters in some context. Unless this approach yields sufficient intelligibility, routine addition of text-to-speech translation to computer systems is unlikely, since more elaborate approaches, embodying large pronunciation dictionaries or linguistic analysis, require too much of the available computing resources. The work here described demonstrates the practicality of routine text-to-speech translation. A set of 329 letter-to-sound rules has been developed. These translate English text into the international phonetic alphabet (IPA), producing correct pronunciations for approximately 90 percent of the words, or nearly 97 percent of the phonemes, in an average text sample. Most of the remaining words have single errors easily correctable by the listener. Another set of rules translates IPA into the phonetic coding for a particular commercial speech synthesizer. This report describes the technical approach used and the support hardware and software developed. It gives overall performance figures, detailed statistics showing the importance of each rule, and listings of a translation program and another used in rule development.
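As an illustration of what a context-sensitive letter-to-sound rule might look like, the following is a minimal sketch in Python. It is not the rule set or the translation program described in the report: the `Rule` layout, the `translate` function, and the handful of toy rules are all assumptions made for this example. Rules are tried in order, the first match wins, and letters that no rule covers pass through unchanged.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One hypothetical letter-to-sound rule (layout assumed for illustration)."""
    left: str      # text that must immediately precede the letters ("" = anything)
    letters: str   # one or more letters this rule pronounces
    right: str     # text that must immediately follow (" " = end of word)
    phonemes: str  # IPA output for the matched letters


# A few toy rules, ordered from most to least specific. These are NOT the
# 329 rules of the report; they only demonstrate the matching mechanism.
RULES = [
    Rule("", "ch", "", "tʃ"),
    Rule("", "sh", "", "ʃ"),
    Rule("", "c", "e", "s"),   # "c" before "e" is soft
    Rule("", "c", "i", "s"),   # "c" before "i" is soft
    Rule("", "c", "", "k"),    # otherwise "c" is hard
    Rule("", "e", " ", ""),    # word-final "e" is silent (toy simplification)
    Rule("", "a", "", "æ"),
    Rule("", "e", "", "ɛ"),
    Rule("", "i", "", "ɪ"),
]


def translate(word: str) -> str:
    """Scan left to right, applying the first rule whose letters and context match."""
    text = " " + word.lower() + " "  # spaces mark the word boundaries
    out = []
    i = 1
    while i < len(text) - 1:
        for rule in RULES:
            end = i + len(rule.letters)
            if (text.startswith(rule.letters, i)
                    and text[:i].endswith(rule.left)
                    and text.startswith(rule.right, end)):
                out.append(rule.phonemes)
                i = end
                break
        else:
            # No rule matched: pass the letter through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for w in ("cat", "cent", "chin", "ship"):
        print(w, "->", translate(w))
```

Ordering the rules so that specific contexts precede general ones keeps the matcher itself trivial, at the cost of making rule order significant. A practical rule set would need far more rules and richer context patterns than the literal strings used here.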
IEEE Transactions on Acoustics, Speech, and Signal Processing (Volume: 24, Issue: 6)
Date of Publication: Dec 1976