Speech spectrogram based model adaptation for speaker identification

3 Author(s)
S. Gurbuz ; Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA ; J. N. Gowdy ; Z. Tufekci

Speech signal feature extraction is a challenging research area with great significance to the speaker identification and speech recognition communities. We propose a novel speech spectrogram-based spectral model adaptation algorithm. The system is based on dynamic thresholding of speech spectrograms for text-dependent speaker identification. Given an utterance from a target speaker, we aim to find that speaker among the speakers enrolled in the system. Conceptually, the algorithm attempts to increase the spectral similarity for the target speaker while increasing the spectral dissimilarity for each non-target speaker in the enrollment set. It therefore removes aging- and intersession-dependent spectral variation in the utterance while preserving the speaker-inherent spectral features. The hidden Markov model (HMM) parameters representing each enrolled speaker are adapted for each identification event. The results obtained using speech signals from both the Noisex database and recordings made in a laboratory environment seem promising and demonstrate the robustness of the algorithm for aging- and session-dependent utterances. Additionally, we evaluated the adapted and non-adapted models with data recorded two months after the initial enrollment. The adaptation seems to improve the performance of the system on the aged data from 84% to 91%.
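The abstract does not give the thresholding rule, so the following is only a minimal sketch of what dynamic thresholding of a spectrogram could look like, not the authors' method. It assumes a per-frame threshold set a fixed number of decibels below each frame's spectral peak; the function names, the `drop_db` parameter, and the synthetic test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One row per frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

def dynamic_threshold(spec, drop_db=30.0):
    """Zero every bin more than drop_db below its own frame's peak.

    Assumption: the threshold adapts per frame. The paper's actual
    rule is not stated in the abstract.
    """
    peak = spec.max(axis=1, keepdims=True)
    floor = peak * 10.0 ** (-drop_db / 20.0)
    return np.where(spec >= floor, spec, 0.0)

# Synthetic stand-in for an utterance: a 1 kHz tone at 8 kHz sampling
# with a small amount of seeded noise.
fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 1000 * t) + 1e-3 * rng.standard_normal(fs)

spec = spectrogram(signal)
masked = dynamic_threshold(spec)
```

Because the threshold follows each frame's own peak, the mask is insensitive to overall level changes between recording sessions, which is one plausible reason to prefer a dynamic threshold over a fixed one.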

Published in:

Southeastcon 2000. Proceedings of the IEEE

Date of Conference: