Scheduled System Maintenance:
On May 6th, single article purchases and IEEE account management will be unavailable from 8:00 AM - 5:00 PM ET (12:00 - 21:00 UTC). We apologize for the inconvenience.
By Topic

Harvesting Image Databases from the Web

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Schroff, F. ; Univ. of Oxford, Oxford ; Criminisi, A. ; Zisserman, A.

The objective of this work is to automatically generate a large number of images for a specified object class (for example, penguin). A multi-modal approach employing both text, meta data and visual features is used to gather many, high-quality images from the Web. Candidate images are obtained by a text based Web search querying on the object identifier (the word penguin). The Web pages and the images they contain are downloaded. The task is then to remove irrelevant images and re-rank the remainder. First, the images are re-ranked using a Bayes posterior estimator trained on the text surrounding the image and meta data features (such as the image alternative tag, image title tag, and image filename). No visual information is used at this stage. Second, the top-ranked images are used as (noisy) training data and a SVM visual classifier is learnt to improve the ranking further. The principal novelty is in combining text/meta-data and visual features in order to achieve a completely automatic ranking of the images. Examples are given for a selection of animals (e.g. camels, sharks, penguins), vehicles (cars, airplanes, bikes) and other classes (guitar, wristwatch), totalling 18 classes. The results are assessed by precision/recall curves on ground truth annotated data and by comparison to previous approaches including those of Berg et al. (on an additional six classes) and Fergus et al.

Published in:

Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Date of Conference:

14-21 Oct. 2007