Cross-Modality Automatic Face Model Training from Large Video Databases

Authors: X. Song (University of Washington, Seattle, WA); Ching-Yung Lin; Ming-Ting Sun

Abstract:

Face recognition is an important issue in video indexing and retrieval applications. Usually, supervised learning is used to build face models for specific named individuals, but a traditional supervised learning framework requires a large amount of manual labeling. In this paper, we propose an automatic cross-modality training scheme, requiring no supervision, that uses automatic speech recognition of videos to build visual face models. Building on Multiple-Instance Learning algorithms, we introduce the novel concepts of "Quasi-Positive bags" and "Extended Diverse Density", and use them to develop an automatic training scheme. We also propose using the "Relative Sparsity" of a cluster to detect the anchorperson in news videos. Experiments show that our algorithm learns correct models for the persons of interest. The automatically learned models are tested for face recognition on large news video databases, compared with a supervised learning algorithm, and show promising results.
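
The "Extended Diverse Density" mentioned in the abstract builds on the standard Diverse Density criterion from Multiple-Instance Learning (Maron and Lozano-Perez). The Python sketch below illustrates only that underlying criterion, scoring a candidate concept point against positive and negative bags of instances; the noisy-OR instance model, the toy feature vectors, and the function names are assumptions for illustration and do not reproduce the paper's quasi-positive-bag extension.

    import numpy as np

    def instance_prob(t, instance, scale=1.0):
        # Probability that a single instance matches the candidate concept t,
        # modeled here with a Gaussian-like kernel (an assumed choice).
        return np.exp(-scale * np.sum((instance - t) ** 2))

    def bag_prob(t, bag):
        # Noisy-OR: a bag is positive if at least one of its instances matches t.
        return 1.0 - np.prod([1.0 - instance_prob(t, x) for x in bag])

    def diverse_density(t, positive_bags, negative_bags):
        # DD(t) = prod_i Pr(t | B_i^+) * prod_j Pr(t | B_j^-),
        # where a negative bag contributes the probability that none of its
        # instances match t.
        dd = 1.0
        for bag in positive_bags:
            dd *= bag_prob(t, bag)
        for bag in negative_bags:
            dd *= 1.0 - bag_prob(t, bag)
        return dd

    # Toy usage: instances are 2-D feature vectors grouped into bags.
    pos_bags = [np.array([[0.9, 1.1], [5.0, 5.0]]),
                np.array([[1.0, 0.9], [3.0, 7.0]])]
    neg_bags = [np.array([[5.1, 4.9], [3.2, 6.8]])]
    candidate = np.array([1.0, 1.0])
    print(diverse_density(candidate, pos_bags, neg_bags))

A concept point that appears in every positive bag but in no negative bag maximizes this score; the paper's extension adapts the criterion to "Quasi-Positive bags", which are only likely, not guaranteed, to contain a true positive instance.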

Published in:

2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW '04)

Date of Conference:

27 June - 2 July 2004