The World Wide Web is the largest publicly available image repository and naturally attracts considerable attention. As an immediate consequence, searching for images on the Web has become an important task. The most direct way to search for images of interest is keyword-based searching. However, because images on the Web are poorly labeled, directly applying standard keyword-based image searching techniques frequently yields poor results. We propose a comprehensive solution to this problem. Our approach considers multiple sources of evidence related to the images. To combine these distinct sources of evidence, we introduce an image retrieval model based on Bayesian belief networks. To evaluate our approach, we perform experiments on a reference collection composed of 54,000 Web images. Our results indicate that retrieval using the text passages surrounding an image is as effective as standard retrieval based on HTML tags. This is an interesting result because current image search engines on the Web usually do not take text passages into consideration. Most importantly, according to our results, combining information derived from text passages with information derived from HTML tags leads to improved retrieval, with relative gains in average precision of roughly 50 percent compared to the results obtained by using each source of evidence in isolation.
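To make the idea of combining evidence concrete, the following is a minimal sketch, not the model described above. It assumes each source of evidence (HTML tags, surrounding text passages) has already produced a belief score in [0, 1] for an image, and it combines them with a noisy-OR, one common disjunctive operator in belief network retrieval models; the abstract does not state which operator is actually used. The function names, source names, and scores are all hypothetical. Average precision is computed in the usual non-interpolated form.

```python
def combine_noisy_or(scores):
    """Combine per-source beliefs in [0, 1] with a noisy-OR (assumed operator)."""
    remaining = 1.0
    for s in scores:
        remaining *= 1.0 - s  # probability that no source supports the image
    return 1.0 - remaining


def rank_images(evidence):
    """Rank image ids by combined belief, highest first.

    `evidence` maps an image id to a dict {source name: belief score}.
    """
    combined = {img: combine_noisy_or(s.values()) for img, s in evidence.items()}
    return sorted(combined, key=combined.get, reverse=True)


def average_precision(ranking, relevant):
    """Non-interpolated average precision of a ranking against a relevant set."""
    hits = 0
    total = 0.0
    for rank, img in enumerate(ranking, start=1):
        if img in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant hit
    return total / len(relevant) if relevant else 0.0


# Hypothetical scores for three images and two sources of evidence.
evidence = {
    "a.jpg": {"html_tags": 0.3, "text_passage": 0.5},
    "b.jpg": {"html_tags": 0.6, "text_passage": 0.0},
    "c.jpg": {"html_tags": 0.1, "text_passage": 0.1},
}
ranking = rank_images(evidence)
ap = average_precision(ranking, relevant={"a.jpg", "c.jpg"})
```

In this toy data, "a.jpg" is outranked by "b.jpg" when only HTML tags are used but moves to the top once the text passage evidence is combined, which illustrates how a second source can change a ranking. It says nothing about the size of the gains reported in the experiments.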