A speaker-independent, syntax-directed, connected word recognition system based on hidden Markov models and level building

2 Author(s)
L. Rabiner (AT&T Bell Laboratories, Murray Hill, NJ); S. Levinson

In the last several years, a wide variety of techniques have been developed which make practical the implementation and development of large networks for recognizing connected sequences of words. Included among these techniques are efficient and accurate speech modeling methods (e.g., vector quantization, hidden Markov models) and efficient, optimal network search procedures (i.e., level building). In this paper we show how to integrate these techniques to give a speaker-independent, syntax-directed, connected word recognition system which requires only a modest amount of computation, and whose performance is comparable to that of previous recognizers requiring an order of magnitude more computation. In particular, the recognizer we studied was an airlines information and reservation system using a 129-word vocabulary and a deterministic syntax (grammar) with 144 states, 450 state transitions, and 21 final states, generating more than 6 × 10^9 sentences. An evaluation of the system, using six talkers each speaking 51 test sentences, yielded a sentence accuracy of about 75 percent resulting from a word accuracy of about 93 percent, for an average speaking rate of about 210 words per minute.

Published in:

IEEE Transactions on Acoustics, Speech, and Signal Processing (Volume: 33, Issue: 3)