Easy photo-taking and photo-sharing have made images an increasingly important type of media in people's everyday lives, creating a growing demand for practical image understanding techniques. Traditional computer vision and machine learning methods, which learn models from a set of training data, still handle only hundreds of object categories, a scale far too small for practical use. In recent years, search-based image annotation on large-scale data sets has demonstrated great success. Rather than directly mapping visual features to text, which is inevitably hindered by the semantic gap, it understands the content of an image by propagating the labels of similar images found in a large-scale data set. Since the similarity search is performed among homogeneous data (images compared with images), the difficulty is greatly reduced. This paper summarizes the extensive work on web image annotation using the large-scale metadata and social information available on the Web, and introduces the Arista system, a nonparametric image annotation platform built upon two billion web images. We propose a highly efficient and scalable duplicate-search technique so that the Arista system can be deployed on a few servers. A few interesting applications, such as building a large-scale celebrity face database and text-to-image translation, are also presented in this paper.
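The idea of propagating labels from similar images can be illustrated with a minimal sketch. This is not the Arista implementation, which relies on duplicate search over billions of images; it is a toy nearest-neighbor example under stated assumptions. The feature vectors, the `DATABASE` contents, and the function names `cosine_similarity` and `annotate` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy database: (feature vector, labels) pairs.
# A real system would index billions of images with compact signatures.
DATABASE = [
    ([1.0, 0.0, 0.1], ["beach", "sea"]),
    ([0.9, 0.1, 0.0], ["beach", "sunset"]),
    ([0.0, 1.0, 0.0], ["city", "street"]),
    ([0.1, 0.9, 0.1], ["city", "night"]),
]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate(query, database, k=3, top_n=2):
    """Label a query image by similarity-weighted voting over its k nearest images."""
    # Image-to-image search: rank database images by visual similarity.
    scored = [(cosine_similarity(query, feats), labels) for feats, labels in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Propagate labels from the k most similar images, weighted by similarity.
    votes = Counter()
    for similarity, labels in scored[:k]:
        for label in labels:
            votes[label] += similarity

    return [label for label, _ in votes.most_common(top_n)]


if __name__ == "__main__":
    print(annotate([1.0, 0.05, 0.05], DATABASE, k=2, top_n=1))
```

In this sketch the query is compared only with other images, never directly with text, which is the point made above about searching among homogeneous data. The labels attached to the retrieved neighbors supply the annotation.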