This paper describes a new, simple, yet effective approach to speaker verification using video sequences of lip movements. We use motion history images (MHI) to provide a biometric template of a spoken word for each speaker. Class-dependent correlation filters are then created by a weighted optimization over the training MHI samples. Feature extraction is performed by correlating a test MHI against each correlation filter, and a Bayesian classifier is used for the final classification. We carry out an extensive performance evaluation of our approach with respect to the number of training samples and the choice of spoken word. The results clearly show the efficacy of our method.
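A motion history image collapses a video clip into a single grayscale template: pixels that moved recently are bright, and older motion fades linearly toward zero. The abstract does not specify the exact construction used, so the sketch below is a minimal, generic MHI built from thresholded frame differences; the decay duration `tau` and the motion threshold `diff_thresh` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def motion_history_image(frames, tau=30, diff_thresh=25):
    """Build a motion history image from a stack of grayscale frames.

    Pixels whose frame-to-frame absolute difference exceeds
    diff_thresh are reset to tau; all other pixels decay by 1 per
    frame, floored at 0. Recent motion is therefore brightest.
    """
    frames = np.asarray(frames, dtype=np.int16)
    mhi = np.zeros(frames.shape[1:], dtype=np.int16)
    for prev, curr in zip(frames[:-1], frames[1:]):
        motion = np.abs(curr - prev) > diff_thresh
        mhi = np.where(motion, tau, np.maximum(mhi - 1, 0))
    return mhi.astype(np.uint8)

# Toy example: a bright block sweeping right across a dark background,
# standing in for a mouth region moving during speech.
T, H, W = 8, 16, 16
frames = np.zeros((T, H, W), dtype=np.uint8)
for t in range(T):
    frames[t, 6:10, t:t + 4] = 200

mhi = motion_history_image(frames)
# The most recent motion pixels sit at tau; the static background stays 0.
```

In the verification pipeline described above, one such MHI per utterance would serve as the template that is correlated against the class-dependent filters.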