The details of the implementation of a syntax-controlled acoustic encoder of a speech understanding system (SUS) are presented. Finite-state automata operating on artificial descriptions of suprasegmentals and global spectral features isolate syllables in continuous speech. A combinational algorithm then tracks the formants for the voiced intervals of each syllable, and other algorithms provide a complete structural description of the spectral and prosodic features of a spoken sentence. Such a description consists of a string of symbols and numerical attributes and represents speech in terms of perceptually significant primitive forms. It contains all the information required to reconstruct the analyzed sentence with a formant synthesizer. It can be used directly for emitting or verifying hypotheses at the lexical level of an SUS, and for automatically learning phonetic features by grammatical inference.
Published in: IEEE Transactions on Acoustics, Speech and Signal Processing (Volume: 24, Issue: 5)
Date of Publication: Oct 1976