Physiological Feature Extraction for Text Independent Speaker Identification using Non-Uniform Subband Processing

2 Author(s)
Xugang Lu ; Sch. of Inf. Sci., Japan Sci. & Technol. Adv. Inst., Ishikawa, Japan ; Jianwu Dang

Features used for speech recognition should emphasize linguistic information while suppressing speaker differences. For speaker recognition, by contrast, features should carry more speaker-specific information while attenuating linguistic information. In most studies, however, identical acoustic features are used for the different tasks of speaker recognition and speech recognition. In this paper, we propose a new physiological feature extraction method that emphasizes individual information for speaker identification. For this purpose, the physiological features of speakers were analyzed from the point of view of speech production. We found that speaker-specific information is encoded in different frequency regions of the speech sound. The speaker-discriminative information in each frequency region was quantified using Fisher's F-ratio. Based on the F-ratio, we propose a non-uniform sub-band processing strategy to extract a new feature that emphasizes or refines the physiological aspects involved in speech production. We combined the new feature with a Gaussian mixture model (GMM) for the speaker identification task and applied it to the NTT-VR speaker recognition database. Compared with the MFCC feature, the proposed feature reduced the identification error rate by 20.1%.
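As a rough illustration of the F-ratio step described in the abstract, the sketch below computes a per-band Fisher's F-ratio as the variance of the speaker means divided by the average within-speaker variance. The abstract does not give the exact formula or normalization used in the paper, so this common textbook definition is an assumption, and the function name `f_ratio_per_band`, the input layout, and the toy energies are hypothetical.

```python
from statistics import fmean, pvariance


def f_ratio_per_band(energies_by_speaker):
    """Return one F-ratio per frequency band.

    energies_by_speaker maps a speaker label to a list of frames;
    each frame is a list of sub-band energies (same length everywhere).
    Assumed definition: between-speaker variance of the band means
    divided by the mean within-speaker variance of that band.
    """
    speakers = list(energies_by_speaker)
    n_bands = len(energies_by_speaker[speakers[0]][0])
    ratios = []
    for k in range(n_bands):
        # Collect band k across all frames, separately for each speaker.
        per_speaker = [[frame[k] for frame in energies_by_speaker[s]]
                       for s in speakers]
        speaker_means = [fmean(values) for values in per_speaker]
        between = pvariance(speaker_means)
        within = fmean([pvariance(values) for values in per_speaker])
        ratios.append(between / within)
    return ratios


# Toy data (made up): two speakers, two frames each, two bands.
toy = {
    "A": [[1.0, 4.0], [3.0, 6.0]],
    "B": [[5.0, 5.0], [7.0, 7.0]],
}
ratios = f_ratio_per_band(toy)
print(ratios)
```

In this toy case the first band separates the two speakers far better than the second, so it gets the larger F-ratio. A non-uniform sub-band scheme in the spirit of the abstract would allocate finer frequency resolution to such high-ratio regions; how the paper actually maps F-ratios to band layout is not stated here.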

Published in:

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on (Volume: 4)

Date of Conference:

15-20 April 2007