The promise of a powerful computing device to help people in productivity as well as in recreation can only be realized with proper human-machine communication. Automatic recognition and understanding of spoken language is the first step toward natural human-machine interaction. Research in this field has produced remarkable results, leading to many exciting expectations and new challenges. We summarize the development of the spoken language technology from both a vertical (chronology) and a horizontal (spectrum of technical approaches) perspective. We highlight the introduction of statistical methods in dealing with language-related problems, as this represents a paradigm shift in the research field of spoken language processing. Statistical methods are designed to allow the machine to learn structural regularities in the speech signal, directly from data, for the purpose of automatic speech recognition and understanding. Research results in spoken language processing have led to a number of successful applications, ranging from dictation software for personal computers and telephone-call processing systems for automatic call routing, to automatic sub-captioning for television broadcasts. We analyze the technical successes that support these applications. Along with an assessment of the state of the art in this broad technical field, we also discuss the limitations of the current technology, and point out the challenges that are ahead. This paper presents an accurate overview of spoken language technology as a basis to inspire future advances.
Published in:
Proceedings of the IEEE
(Volume:88
,
Issue:
8
)
Date of Publication: Aug. 2000