A novel eigen-prosody analysis approach is proposed for robust speaker recognition in a mismatched handset environment. The idea is to convert the prosodic contours of a speaker's speech into sequences of prosody symbols, thereby transforming the speaker recognition problem into a task similar to full-text document retrieval. Experimental results on the HTIMIT corpus show that, even though only a small amount of training/test data is available, a relative error rate reduction of about 32.2% can be achieved compared with the conventional Gaussian mixture model/cepstral mean subtraction approach.
Published in: Electronics Letters (Volume: 40, Issue: 19)
Date of Publication: 16 Sept. 2004
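
The abstract's central idea, turning prosodic contours into symbol sequences and matching speakers as one would retrieve documents, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' method: the three-symbol alphabet (U, D, F), the `flat_threshold` value, the use of bigram counts, the cosine score, and the toy pitch values are all hypothetical. The sketch also leaves out the eigen-analysis step implied by the approach's name, since the abstract does not describe it.

```python
import math
from collections import Counter


def contour_to_symbols(contour, flat_threshold=1.0):
    """Map frame-to-frame pitch changes to symbols (hypothetical alphabet):
    U = rise, D = fall, F = roughly flat."""
    symbols = []
    for prev, cur in zip(contour, contour[1:]):
        delta = cur - prev
        if delta > flat_threshold:
            symbols.append("U")
        elif delta < -flat_threshold:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def ngram_counts(symbols, n=2):
    """Treat the symbol string as a 'document' and count its n-gram 'terms'."""
    return Counter(symbols[i:i + n] for i in range(len(symbols) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(test_contour, enrolled):
    """Return the enrolled speaker whose n-gram profile best matches the query."""
    query = ngram_counts(contour_to_symbols(test_contour))
    return max(enrolled, key=lambda name: cosine(query, enrolled[name]))


# Toy pitch contours in Hz (made-up values, not HTIMIT data).
enrolled = {
    "speaker_a": ngram_counts(contour_to_symbols([100, 105, 110, 115, 112, 118, 124])),
    "speaker_b": ngram_counts(contour_to_symbols([200, 195, 190, 185, 188, 182, 176])),
}
result = identify([120, 126, 131, 137, 135, 140], enrolled)
print(result)
```

Because the symbols encode the direction of pitch movement and not its absolute level, the toy query matches the speaker with a similar rising pattern even though the raw pitch values differ. This hints at why symbolized prosody might be less sensitive to channel effects than spectral features, although the sketch does not model handset mismatch itself.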