Skip to Main Content
Tagged Web images provide an abundance of labeled training examples for visual concept learning. However, the performance of automatic training data selection is susceptible to highly inaccurate tags and atypical images. Consequently, manually curated training data sets are still a preferred choice for many image annotation systems. This paper introduces ARTEMIS - a scheme to enhance automatic selection of training images using an instance-weighted mixture modeling framework. An optimization algorithm is derived to learn instance-weights in addition to mixture parameter estimation, essentially adapting to the noise associated with each example. The mechanism of hypothetical local mapping is evoked so that data in diverse mathematical forms or modalities can be cohesively treated as the system maintains tractability in optimization. Finally, training examples are selected from top-ranked images of a likelihood-based image ranking. Experiments indicate that ARTEMIS exhibits higher resilience to noise than several baselines for large training data collection. The performance of ARTEMIS-trained image annotation system is comparable with usage of manually curated data sets.