An important concern in speech recognition is the size of the vocabulary that a recognition system can support. Large vocabularies increase both the amount of computation the system must perform and the number of ambiguities it must resolve. Practical applications in general, and dictation tasks in particular, nevertheless require large vocabularies, because restricting the speaker to a limited vocabulary is difficult and inconvenient. This paper describes a new organization of the recognition process, Multilevel Decoding (MLD), that allows the system to support a Very-Large-Size Dictionary (VLSD), one comprising over 100,000 words. This significantly surpasses the capacity of previous speech-recognition systems. With MLD, the effect of dictionary size on recognition accuracy can be studied. In this paper, recognition experiments using 10,000- and 200,000-word dictionaries are compared. The results indicate that recognition using a 200,000-word dictionary is more accurate than recognition using a 10,000-word dictionary when unrecognized words are included in the error rate.
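The qualification about unrecognized words can be made concrete with a small sketch. It assumes that a spoken word missing from the dictionary can never be output correctly, so it always counts as one error; the function name, its parameters, and the counts in the usage lines are hypothetical illustrations, not figures or methods from the paper.

```python
def error_rates(n_words, n_out_of_dictionary, n_in_dictionary_errors):
    """Return (rate including out-of-dictionary words, rate excluding them).

    Assumption (not from the paper): every spoken word that is absent
    from the dictionary is counted as exactly one recognition error.
    """
    if n_words <= 0 or n_out_of_dictionary >= n_words:
        raise ValueError("need at least one in-dictionary word")
    n_in_dictionary = n_words - n_out_of_dictionary
    if n_in_dictionary_errors > n_in_dictionary:
        raise ValueError("more errors than in-dictionary words")
    # Including: out-of-dictionary words are errors over all spoken words.
    including = (n_out_of_dictionary + n_in_dictionary_errors) / n_words
    # Excluding: only words the system could have recognized are scored.
    excluding = n_in_dictionary_errors / n_in_dictionary
    return including, excluding


# Hypothetical counts chosen only to illustrate the effect.
small_dictionary = error_rates(1000, 80, 46)
large_dictionary = error_rates(1000, 5, 70)
```

With these made-up counts the larger dictionary makes more confusions among the words it knows, yet its overall error rate is lower because far fewer spoken words fall outside it. This is the sense in which the two ways of scoring can rank the dictionaries differently.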
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.