Talker recognition in tandem with talker-independent isolated word recognition

2 Author(s)
Rosenberg, A.E. (AT&T Bell Laboratories, Murray Hill, NJ); Shipley, K.

A talker recognition system operating in tandem with a talker-independent isolated word recognizer is described and evaluated. The word recognizer uses a small set of reference templates for each vocabulary word. Each set is intended to span and typify individual talker templates over a large population of talkers. Word recognition decisions are based on template distance scores obtained by comparing processed input utterances to each set of reference templates. The distribution of distance scores for the templates corresponding to the actual word input has been found to be reasonably consistent for individual talkers, and to vary sufficiently from talker to talker to provide the basis for a talker recognition capability. A system has been implemented to exploit this capability. An evaluation of the system, carried out using a 100-talker database of digit utterances, shows that good talker recognition performance can be obtained for input utterances consisting of sequences of seven or more digits. Identification error rates varying from 3.6 to 14.0 percent for talker populations varying from 10 to 100 talkers are obtained. When the recognizer orders the talkers as candidates for recognition, the correct talker is found, on the average, among the top 0.8 percent of the population. Tested in a talker verification mode, the average error rate is approximately 8 percent.
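The idea of recognizing a talker from word-template distance scores can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the score distributions are modeled or compared, so the sketch assumes that each talker is summarized by an average vector of template distance scores per word, and that mismatch is measured as a squared difference accumulated over the words of the utterance. All names (`profiles`, `accumulated_mismatch`, `rank_talkers`, `verify`), the threshold, and the numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: profile[word] is a talker's average vector of template
# distance scores for that word (one entry per reference template).
Profile = Dict[str, List[float]]
# An utterance is a sequence of (recognized word, observed distance scores).
Utterance = Sequence[Tuple[str, Sequence[float]]]


def accumulated_mismatch(profile: Profile, utterance: Utterance) -> float:
    """Sum squared differences between observed and expected scores.

    Assumption: squared Euclidean mismatch, accumulated over all words.
    The source does not state which measure the system actually uses.
    """
    total = 0.0
    for word, scores in utterance:
        expected = profile[word]
        total += sum((s - e) ** 2 for s, e in zip(scores, expected))
    return total


def rank_talkers(profiles: Dict[str, Profile], utterance: Utterance) -> List[str]:
    """Order candidate talkers from best match to worst (identification)."""
    return sorted(profiles, key=lambda t: accumulated_mismatch(profiles[t], utterance))


def verify(profiles: Dict[str, Profile], claimed: str,
           utterance: Utterance, threshold: float) -> bool:
    """Accept an identity claim if the mismatch is within the threshold."""
    return accumulated_mismatch(profiles[claimed], utterance) <= threshold


# Made-up example data: two talkers, two words, three templates per word.
profiles = {
    "talker_a": {"one": [0.2, 0.6, 0.9], "two": [0.5, 0.3, 0.8]},
    "talker_b": {"one": [0.7, 0.2, 0.5], "two": [0.9, 0.6, 0.2]},
}
utterance = [("one", [0.25, 0.55, 0.85]), ("two", [0.45, 0.35, 0.75])]

ranking = rank_talkers(profiles, utterance)
accepted = verify(profiles, "talker_a", utterance, threshold=0.1)
```

Accumulating the mismatch over every word in the sequence reflects the abstract's observation that performance is good only for longer inputs (seven or more digits): each additional word contributes more evidence about the talker. The ordered list returned by `rank_talkers` corresponds to the candidate ordering from which the abstract's rank statistic is derived.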

Published in:

IEEE Transactions on Acoustics, Speech and Signal Processing (Volume: 33, Issue: 3)