A statistical decision approach to the recognition of connected digits

2 Author(s)
Sambur, M. ; Bell Laboratories, Murray Hill, NJ ; Rabiner, L.

A statistical decision approach to the recognition of connected digits is described in this paper. The method can be either speaker dependent (i.e., each new speaker must first train the system on representative digit strings before using it successfully) or speaker independent. Multiple repetitions of each digit (spoken in connected strings) are used in the training sequence. Repetitions of the same digit are combined by linearly warping the individual reference patterns to the speaker's average length for the digit. The mean and covariance of the recognition parameters across repetitions of the same digit are computed and used in the recognition phase of the system. Once a spoken digit string has been segmented, each digit within the string is recognized using a distance measure based on an expanded form of the principle of minimum residual error. Where a great deal of coarticulation can be anticipated between adjacent digits (i.e., between digits bounded by voiced regions), a second distance metric is employed. This metric includes both the effects of the analysis estimation error and the effects of coarticulation. The analysis parameters used in this system are the coefficients of a 10-pole linear predictive coding (LPC) analysis. For stability, the LPC coefficients are converted to parcor (reflection) coefficients before the linear warping, and the warped parcor coefficients are then converted back to LPC coefficients for recognition. The recognition system was tested on six speakers in the speaker-dependent mode, with recognition accuracies ranging from 97 to 100 percent. It was also tested with 10 new speakers in the speaker-independent mode, with a digit recognition accuracy of 95 percent.
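The warping step in the abstract (LPC to parcor, linear warp, parcor back to LPC) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the predictor polynomial convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (sign conventions differ between texts), and it assumes that "linear warping" means linear interpolation along the time axis. The function names `reflection_to_lpc`, `lpc_to_reflection`, `linear_warp`, and `warp_lpc_track` are hypothetical.

```python
def reflection_to_lpc(k):
    """Step-up recursion: reflection coefficients k_1..k_p -> a_1..a_p."""
    a = []
    for m, km in enumerate(k, start=1):
        # a currently holds the order-(m-1) predictor a_1..a_{m-1}.
        a = [a[i] + km * a[m - 2 - i] for i in range(m - 1)] + [km]
    return a


def lpc_to_reflection(a):
    """Step-down recursion: a_1..a_p -> reflection coefficients k_1..k_p."""
    a = list(a)
    k = [0.0] * len(a)
    for m in range(len(a), 0, -1):
        km = a[m - 1]  # the last coefficient of the order-m predictor is k_m
        if abs(km) >= 1.0:
            raise ValueError("unstable predictor: |k| >= 1")
        k[m - 1] = km
        denom = 1.0 - km * km
        # Recover the order-(m-1) predictor from the order-m predictor.
        a = [(a[i] - km * a[m - 2 - i]) / denom for i in range(m - 1)]
    return k


def linear_warp(frames, target_len):
    """Resample a sequence of parameter vectors to target_len frames
    by linear interpolation along the time axis (an assumed reading
    of 'linear warping')."""
    n = len(frames)
    if n == 0 or target_len <= 0:
        raise ValueError("need at least one frame and a positive target length")
    out = []
    for t in range(target_len):
        pos = t * (n - 1) / (target_len - 1) if target_len > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1.0 - w) * x + w * y for x, y in zip(frames[lo], frames[hi])])
    return out


def warp_lpc_track(lpc_frames, target_len):
    """Warp a track of LPC vectors in the reflection-coefficient domain."""
    refl = [lpc_to_reflection(a) for a in lpc_frames]
    warped = linear_warp(refl, target_len)
    return [reflection_to_lpc(k) for k in warped]
```

Under this interpolation assumption, each warped reflection coefficient is a weighted average of two values with magnitude below 1, so its magnitude also stays below 1 and the reconstructed all-pole filter remains stable. Interpolating the LPC coefficients directly carries no such guarantee.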

Published in:

Acoustics, Speech and Signal Processing, IEEE Transactions on (Volume: 24, Issue: 6)

Date of Publication:

Dec 1976
