This paper describes an implementation of a speaker-independent system that recognizes connected digits. The overall recognition system consists of two separate but interrelated parts. The first part segments the digit string into its individual digits; the second part then recognizes the individual digits based on the results of the segmentation. Segmentation is based on a voiced-unvoiced analysis of the digit string, together with information about the location and amplitude of minima in the energy contour of the utterance. The digit recognition strategy is similar to the algorithm used by Sambur and Rabiner for isolated digits, but with several important modifications to accommodate the imprecision with which the exact digit boundaries can be located. To evaluate the accuracy of the system in segmenting and recognizing digit strings, a series of experiments was conducted. Using high-quality recordings from a soundproof booth, the segmentation accuracy was found to be about 99 percent, and the recognition accuracy was about 91 percent across ten speakers (five male, five female). With recordings made in a noisy computer room, the segmentation accuracy remained close to 99 percent, and the recognition accuracy was about 87 percent across another group of ten speakers (five male, five female).
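As a rough illustration of the energy-contour cue mentioned above, the following sketch computes a short-time energy contour and lists the location and amplitude of its low local minima, which could serve as candidate digit boundaries. It is not the paper's algorithm: the function names, the frame length, the hop size, and the relative threshold are all assumptions made for illustration, and the voiced-unvoiced analysis that the system combines with this cue is not shown.

```python
def short_time_energy(samples, frame_len, hop):
    """Return the sum of squared samples for each analysis frame.

    frame_len and hop are in samples; both are illustrative parameters,
    not values taken from the paper.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def energy_minima(energies, rel_threshold=0.1):
    """Return (frame_index, energy) pairs for low local minima.

    A frame counts as a candidate boundary when its energy is a local
    minimum of the contour and does not exceed rel_threshold times the
    peak energy. The threshold rule is an assumption for this sketch.
    """
    if not energies:
        return []
    peak = max(energies)
    minima = []
    for i in range(1, len(energies) - 1):
        e = energies[i]
        is_local_min = e < energies[i - 1] and e <= energies[i + 1]
        if is_local_min and e <= rel_threshold * peak:
            minima.append((i, e))
    return minima


# Hand-made contour: one deep dip (frame 3) and one shallow dip (frame 7).
contour = [5.0, 9.0, 8.0, 0.5, 7.0, 10.0, 6.0, 3.0, 8.0]
candidates = energy_minima(contour, rel_threshold=0.1)
# candidates == [(3, 0.5)]; the shallow dip at frame 7 is rejected.
```

Keeping both the position and the amplitude of each minimum mirrors the abstract's wording; a fuller system would weigh these candidates against the voiced-unvoiced decisions before committing to a boundary.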