We present a speaker identification system that uses synchronized speech signals and lip features. We developed an algorithm that automatically extracts lip areas from speaker images, and a neural network system that integrates the two types of signals to identify speakers accurately. We show that the proposed system outperforms systems that use only speech or only lip features in both text-dependent and text-independent speaker identification.
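To make the idea of integrating the two signal types concrete, the following is a minimal sketch of one common approach, feature-level (early) fusion followed by a small neural network. It is not the network described in the paper, whose architecture is not given here. The function names (`fuse_features`, `init_mlp`, `speaker_posteriors`), the feature dimensions, and the untrained random weights are all assumptions made for illustration.

```python
import numpy as np

def fuse_features(speech, lip):
    """Concatenate time-aligned speech and lip feature frames (early fusion).

    Both inputs are (n_frames, n_features) arrays and are assumed to be
    already synchronized, one row per frame.
    """
    speech = np.asarray(speech, dtype=float)
    lip = np.asarray(lip, dtype=float)
    if speech.shape[0] != lip.shape[0]:
        raise ValueError("speech and lip streams must have the same number of frames")
    return np.concatenate([speech, lip], axis=1)

def init_mlp(n_in, n_hidden, n_speakers, seed=0):
    """Create small random weights for a hypothetical one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_speakers)),
        "b2": np.zeros(n_speakers),
    }

def speaker_posteriors(params, fused):
    """Return per-speaker probabilities averaged over all frames."""
    hidden = np.tanh(fused @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    # Subtract the row maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    frame_probs = exp / exp.sum(axis=1, keepdims=True)
    return frame_probs.mean(axis=0)

# Example with placeholder data: 5 frames, 12 speech and 4 lip features.
speech_frames = np.ones((5, 12))
lip_frames = np.zeros((5, 4))
fused = fuse_features(speech_frames, lip_frames)
params = init_mlp(n_in=fused.shape[1], n_hidden=8, n_speakers=3)
probs = speaker_posteriors(params, fused)
predicted_speaker = int(np.argmax(probs))
```

In a real system the weights would be trained on labeled recordings. Fusing at the decision level, by combining the outputs of separate speech and lip classifiers, is an alternative design.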
Proceedings of the 2005 IEEE International Joint Conference on Neural Networks (IJCNN '05), Volume 4
Date of Conference: July 31 to August 4, 2005