A comparison of three feature vector clustering procedures in a speech recognition paradigm

3 Author(s)
Niles, L. ; Bell Telephone Laboratories, Murray Hill, NJ ; Silverman, H.F. ; Dixon, N.

One possible approach to achieving talker independence in discrete utterance recognition (DUR) is to classify speech feature vectors by using a talker-independent clustering procedure. There are many possible choices of clustering algorithm. This work studied the characteristics of three clustering procedures (Agglomerative, Basic Isodata, and a 'Biased Mean' modification of Basic Isodata) as applied to speech feature vectors. The feature extractor consisted of a six-channel filterbank similar to those used in DUR systems. The speech data was derived from 19 (total) repetitions of a ten-word vocabulary, spoken by 16 different talkers. Various distance functions and feature vector representations were employed. Agglomerative clustering did not produce clusters that corresponded to any apparent classification of speech events. The Biased Mean Isodata procedure did not converge and was therefore not useful. The Basic Isodata algorithm produced clusters that were, to varying degrees, identifiable with classes of speech sounds. Simple classifiers for three such classes, based on these clusters, would classify feature vectors with 5-10% error rates. The best results were obtained by using feature vectors that consisted of the log filter channel energies. These test results are good enough to encourage further development of cluster-based feature vector classifiers.
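
The abstract does not spell out the clustering steps, so the following is only a minimal sketch of how a Basic Isodata style procedure might be applied to log filter channel energies. It assumes the textbook form of Basic Isodata (iterative nearest-mean assignment followed by mean re-estimation, essentially k-means) and squared Euclidean distance; the paper reports trying several distance functions and does not say which worked best. All function names and the synthetic six-channel data are hypothetical and are not taken from the paper.

```python
import math
import random


def log_energies(channel_energies, floor=1e-10):
    """Map raw filter-channel energies to log energies (floor avoids log(0))."""
    return [math.log(max(e, floor)) for e in channel_energies]


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed choice of distance function)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, means):
    """Index of the cluster mean closest to the vector."""
    return min(range(len(means)), key=lambda k: squared_distance(vector, means[k]))


def basic_isodata(vectors, k, max_iters=100, seed=0):
    """Assumed textbook procedure: assign to nearest mean, re-estimate means, repeat."""
    rng = random.Random(seed)
    means = [list(v) for v in rng.sample(vectors, k)]  # initial means drawn from the data
    labels = None
    for _ in range(max_iters):
        new_labels = [nearest(v, means) for v in vectors]
        if new_labels == labels:  # no assignment changed: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous mean
                means[c] = [sum(col) / len(members) for col in zip(*members)]
    return means, labels


def synthetic_frames(n_per_class=30, seed=1):
    """Hypothetical six-channel energies for three made-up sound classes."""
    rng = random.Random(seed)
    profiles = [
        [100.0, 50.0, 10.0, 1.0, 0.5, 0.1],  # energy concentrated in low channels
        [0.1, 0.5, 1.0, 10.0, 50.0, 100.0],  # energy concentrated in high channels
        [0.01] * 6,                          # near-silence
    ]
    return [
        [e * rng.uniform(0.5, 2.0) for e in profile]
        for profile in profiles
        for _ in range(n_per_class)
    ]


if __name__ == "__main__":
    frames = [log_energies(f) for f in synthetic_frames()]
    means, labels = basic_isodata(frames, k=3)
    # A simple cluster-based classifier: label a new frame by its nearest mean.
    new_frame = log_energies([80.0, 40.0, 8.0, 1.2, 0.4, 0.1])
    print("cluster index for new frame:", nearest(new_frame, means))
```

Classifying a frame by its nearest cluster mean is one simple way to build a classifier from the clusters; the abstract says only that the classifiers were "simple", so this choice is also an assumption.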

Published in:

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '83 (Volume: 8)

Date of Conference:

Apr 1983