We report on the synthesis of speech in the context of a phonetic vocoder operating at 100 b/s. With each phoneme, the vocoder transmits the duration and a single pitch value. The synthesizer uses a large inventory of diphone "models" to synthesize a desired phoneme string. The diphone inventory has been selected to differentiate between prevocalic and postvocalic allophones of sonorants, to account for changes in vowel color conditioned by postvocalic liquids, to allow exact specification of voice onset time, and to permit synthesis of glottal stops, alveolar flaps, and syllabic consonants. The diphones are extracted from carefully constructed short utterances and are stored as sequences of LPC parameters. During synthesis, the requisite diphone models are time-warped, abutted, and smoothed to produce a complete sequence of LPC parameters from which the speech is synthesized. The algorithms used are described and compared with more conventional methods. Examples of the synthesized speech will be played.
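The time-warp, abut, and smooth steps can be illustrated with a minimal sketch. It is not the algorithm of the paper, which the abstract does not specify. It assumes linear resampling for the time warp and linear interpolation across a few frames at each junction for the smoothing. The names `time_warp`, `abut_and_smooth`, and `span` are hypothetical. Each frame is taken to be a vector of parameters that can be interpolated safely (for example, log area ratios rather than raw predictor coefficients, since interpolating the latter can yield an unstable filter).

```python
from typing import List

Frame = List[float]  # one frame of spectral parameters (assumed interpolable)


def time_warp(frames: List[Frame], target_len: int) -> List[Frame]:
    """Linearly resample a diphone's frames to target_len frames (assumed warp)."""
    if not frames or target_len <= 0:
        return []
    if len(frames) == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    scale = (len(frames) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = min(int(pos), len(frames) - 2)  # keep lo + 1 in range
        frac = pos - lo
        out.append([(1 - frac) * a + frac * b
                    for a, b in zip(frames[lo], frames[lo + 1])])
    return out


def abut_and_smooth(segments: List[List[Frame]], span: int = 2) -> List[Frame]:
    """Concatenate segments, then interpolate linearly across each junction.

    `span` is a hypothetical setting: the number of frames on each side of a
    junction that fall inside the smoothing window.
    """
    seq = [list(f) for seg in segments for f in seg]
    joins, pos = [], 0
    for seg in segments[:-1]:
        pos += len(seg)
        joins.append(pos)  # index of the first frame of the next segment
    for b in joins:
        lo = max(b - span, 0)
        hi = min(b + span - 1, len(seq) - 1)
        if hi - lo < 2:
            continue  # no interior frames to replace
        start, end = seq[lo], seq[hi]
        for k in range(lo + 1, hi):
            t = (k - lo) / (hi - lo)
            seq[k] = [(1 - t) * s + t * e for s, e in zip(start, end)]
    return seq
```

In use, each stored diphone would be warped to the length implied by the transmitted phoneme durations, and the warped segments would then be passed together to `abut_and_smooth`. How a phoneme's duration is divided between the two diphones that span it is a further design choice not covered here.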