Some experiments in discrete utterance recognition

Author:
Das, S.; IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

This paper is concerned with the following four aspects of the discrete utterance recognition problem: utterance normalization, dynamic programming (DP) algorithm implementation, boundary error effects, and the importance of the transition states in speech. Performance sensitivity as a function of each aspect of the problem is studied comparatively using several available alternatives, and significant conclusions are drawn regarding each of them. The concept of proportional normalizing is introduced as an effective method of handling the utterance normalization problem. The implications of this method and of the conventional mean normalizing method are discussed. In general, normalization by either of these techniques is seen to be about equally effective and leads to substantial improvement in recognition score. Next, several alternative implementations of the dynamic programming algorithm are investigated and their performances are compared. Finally, improved recognition performance is obtained by introducing a smoothing technique which emphasizes the importance of the transition states in speech. A database consisting of the utterances of the alpha-digit vocabulary produced by several male and female speakers is used to conduct all the experiments.
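
For readers unfamiliar with how dynamic programming is applied to discrete utterance recognition, the following is a minimal sketch of textbook dynamic time warping with nearest-template classification. It is not one of the DP implementations compared in the paper, whose details are not given in this abstract. The function names `dtw_distance` and `recognize`, the Euclidean local distance, and the unconstrained three-way step pattern are all assumptions made for illustration.

```python
import math


def dtw_distance(ref, test):
    """Accumulated DTW distance between two sequences of feature vectors.

    Assumption: frames are equal-length lists of floats and the local
    distance is Euclidean; the paper's actual choices are not stated here.
    """
    n, m = len(ref), len(test)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] holds the minimal accumulated distance for aligning
    # the first i frames of ref with the first j frames of test.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], test[j - 1])
            # Simple symmetric step pattern (assumed): vertical,
            # horizontal, or diagonal predecessor.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )
    return cost[n][m]


def recognize(templates, utterance):
    """Return the label of the stored template closest to the utterance.

    `templates` is a hypothetical dict mapping a word label to one
    reference sequence of feature vectors.
    """
    return min(templates, key=lambda label: dtw_distance(templates[label], utterance))
```

In this sketch the endpoints of both sequences are forced to align, so errors in locating the utterance boundaries propagate directly into the accumulated distance. That is one way to see why boundary error effects are a natural concern for DP-based recognizers.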

Published in:

IEEE Transactions on Acoustics, Speech and Signal Processing (Volume: 30, Issue: 5)