Speaker Identification in the Presence of Room Reverberation


Abstract:

Speaker identification (SI) systems based on Gaussian Mixture Models (GMMs) have demonstrated high levels of accuracy when both training and testing signals are acquired in near-ideal conditions. When these same systems are trained and tested with signals acquired over non-ideal channels, such as telephone, they have been shown to have markedly lower accuracy. In this paper, we consider a reverberant test environment and its impact on SI. We measure the degradation in SI accuracy when the system is trained with clean signals but tested with reverberant signals. Next, we propose a method whereby training signals are first filtered with a family of reverberation filters prior to construction of speaker models; the reverberation filters are designed to approximate the expected test room reverberation. Reverberant test signals are then scored against the family of speaker models and an identification is made. Our research demonstrates that by approximating test room reverberation in the training signals, the channel mismatch problem can be reduced and SI accuracy increased.
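
The idea of filtering training signals with a family of reverberation filters can be illustrated with a minimal sketch. This excerpt does not describe how the filters are actually designed, so the code below assumes a common simplification: each room impulse response is modeled as white noise under an exponential envelope that decays by 60 dB at a chosen reverberation time. The function names, the 8 kHz sampling rate, and the reverberation times are all hypothetical.

```python
import numpy as np

def synthetic_rir(rt60, fs=8000, seed=0):
    """Assumed stand-in for a room impulse response: exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    # Amplitude envelope falls by a factor of 10**-3 (60 dB) at t = rt60.
    envelope = 10.0 ** (-3.0 * t / rt60)
    h = rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct-path component
    return h / np.linalg.norm(h)  # unit-energy filter

def reverberate(signal, h):
    """Filter a signal with an impulse response, keeping the original length."""
    return np.convolve(signal, h)[: len(signal)]

def reverberant_training_set(signal, rt60_values, fs=8000):
    """One filtered copy of the training signal per assumed reverberation time."""
    return {rt60: reverberate(signal, synthetic_rir(rt60, fs)) for rt60 in rt60_values}
```

Under these assumptions, a call such as `reverberant_training_set(clean_signal, [0.2, 0.4, 0.8])` would yield three reverberant versions of one speaker's training signal, each of which could then be used to build its own speaker model.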
Date of Conference: 11-13 September 2007
Date Added to IEEE Xplore: 14 January 2008
Conference Location: Baltimore, MD, USA
New Mexico State University, Klipsch School of Electrical and Computer Engineering, Las Cruces, New Mexico, USA

1. INTRODUCTION

In speaker identification (SI), the goal is to identify the most likely speaker of an unknown voice sample, while in speaker verification (SV) the goal is to validate an identity claim based on a voice sample [1]. Our research focuses on the former. SI is a two-stage procedure consisting of training and testing. In the training stage, shown in Fig. 1(a), speaker-dependent feature vectors are extracted from a training speech signal and a speaker model is built from each speaker's feature set. In the testing stage, shown in Fig. 1(b), feature vectors are extracted from a test signal whose speaker is unknown. The test feature set is scored against all speaker models, and the most likely speaker identity is decided.
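
The two-stage procedure can be sketched in a few lines. The following is an illustrative sketch, not the authors' implementation: it assumes feature vectors have already been extracted (the feature type is not specified in this excerpt), uses a small diagonal-covariance GMM trained with a basic EM loop, and decides the speaker by the highest average log-likelihood. All function names, the number of mixture components, and the iteration count are hypothetical.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each row of X under each diagonal Gaussian; shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def logsumexp(a, axis):
    """Numerically stable log of a sum of exponentials along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def train_gmm(X, n_components=2, n_iter=20, seed=0):
    """Training stage: fit a diagonal-covariance GMM to one speaker's features."""
    rng = np.random.default_rng(seed)
    T, _ = X.shape
    means = X[rng.choice(T, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each feature vector.
        log_p = np.log(weights) + log_gauss_diag(X, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / T
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, model):
    """Average per-frame log-likelihood of features X under one speaker model."""
    weights, means, variances = model
    return logsumexp(np.log(weights) + log_gauss_diag(X, means, variances), axis=1).mean()

def identify(X_test, speaker_models):
    """Testing stage: score against every model and pick the most likely speaker."""
    scores = {name: avg_log_likelihood(X_test, m) for name, m in speaker_models.items()}
    return max(scores, key=scores.get)
```

In use, one would build `speaker_models = {name: train_gmm(features) for name, features in training_features.items()}` and then call `identify(test_features, speaker_models)`. Averaging the log-likelihood over frames keeps scores comparable across test signals of different lengths.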

