This paper presents a novel technique for tracking the lips and extracting features from them for the purpose of speaker identification. In noisy or otherwise adverse conditions, identification performance based on the speech signal alone can degrade significantly; additional information that complements the speech signal is therefore of particular interest. In our system, syntactic information is derived from chromatic information in the lip region. A model of the lip contour is formed directly from the syntactic information, with no minimization procedure required to refine estimates. Colour features are then extracted from the lips via profiles taken around the lip contour. The lip features are further improved via linear discriminant analysis (LDA). Speaker models based on the Gaussian mixture model (GMM) are built from the lip features. Identification experiments are performed on the M2VTS database, with encouraging results.
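The final identification step can be illustrated with a minimal sketch: each enrolled speaker is represented by a GMM, and a test sequence of feature vectors is assigned to the speaker whose model gives the highest total log-likelihood. The abstract does not specify the covariance structure, the number of mixtures, or the feature dimension, so the diagonal covariances, the toy parameters, and all names below (`log_gauss_diag`, `gmm_log_likelihood`, `identify`, `speaker_models`) are assumptions made for illustration. Model training and the LDA projection are omitted.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of feature vectors under one GMM.

    `gmm` is a list of (weight, mean, variance) components; frames are
    treated as independent, so per-frame log-likelihoods are summed.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total


def identify(frames, speaker_models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )


# Toy two-component models with made-up parameters (not from the paper).
speaker_models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Hypothetical lip feature vectors from one test sequence.
test_frames = [[5.2, 4.9], [5.8, 6.1], [5.5, 5.4]]
print(identify(test_frames, speaker_models))  # prints "speaker_b"
```

In a real system the component weights, means, and variances would be estimated from each speaker's training features, typically with the expectation-maximization algorithm, rather than set by hand as they are here.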
Published in:
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (Volume: 6)
Date of Conference: 12-15 May 1998