In recent years, with the rapid proliferation of digital images, the need to search and retrieve images accurately, efficiently, and conveniently has become more acute. Automatic annotation of images with their semantic content has attracted increasing attention because it is the preprocessing step for annotation-based image retrieval, which offers users accurate, efficient, and convenient retrieval grounded in image understanding. Various machine learning approaches have been applied to automatic image annotation; however, most of them focus on the relationship between images and annotation words and neglect the relationships among the annotation words themselves. In this paper, we propose a framework that uses language models to represent the word-to-word relation and thereby improves the performance of existing image annotation approaches based on probabilistic models. We also propose a specific language model, the semantic similarity language model, which estimates the semantic similarity among annotation words so that more semantically coherent annotations have a higher probability of being chosen to annotate the image. To illustrate the general idea of using a language model to improve current image annotation systems, we add the language model on top of two specific image annotation models: the translation model (TM) and the cross-media relevance model (CMRM). We test the improved models on a widely used image annotation corpus, the Corel 5K dataset. Our results show that adding the semantic similarity language model significantly improves annotation performance over the original models. The proposed language model can also be applied to other image annotation approaches that use either the word probability conditioned on the image or the word-image joint probability.
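The general idea of layering a word-to-word model on top of an image-conditioned word probability can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method of the paper: the function name `annotate`, the greedy selection, the linear interpolation controlled by `weight`, and the toy probabilities and similarity scores are all hypothetical, since the abstract does not specify how the language model is combined with the base model (TM or CMRM).

```python
from typing import Dict, List, Tuple


def annotate(
    word_given_image: Dict[str, float],
    similarity: Dict[Tuple[str, str], float],
    num_words: int = 3,
    weight: float = 0.5,
) -> List[str]:
    """Greedily pick annotation words, balancing image evidence and coherence.

    word_given_image: P(word | image) from a base model (assumed given).
    similarity: semantic similarity for word pairs, in [0, 1] (assumed given).
    weight: hypothetical interpolation weight for the coherence term.
    """

    def sim(a: str, b: str) -> float:
        # Symmetric lookup; pairs that are not listed count as unrelated.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    chosen: List[str] = []
    candidates = set(word_given_image)

    while candidates and len(chosen) < num_words:

        def score(word: str) -> float:
            if not chosen:
                # First word: only the image evidence is available.
                return word_given_image[word]
            # Average similarity to the words already selected.
            coherence = sum(sim(word, c) for c in chosen) / len(chosen)
            return (1 - weight) * word_given_image[word] + weight * coherence

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        candidates.remove(best)

    return chosen


# Toy inputs, invented purely for illustration.
p_word = {"sky": 0.40, "coral": 0.25, "plane": 0.20, "cloud": 0.15}
sims = {
    ("sky", "cloud"): 0.9,
    ("sky", "plane"): 0.7,
    ("plane", "cloud"): 0.6,
    ("sky", "coral"): 0.1,
    ("coral", "plane"): 0.05,
    ("coral", "cloud"): 0.05,
}

print(annotate(p_word, sims, weight=0.0))  # ['sky', 'coral', 'plane']
print(annotate(p_word, sims, weight=0.5))  # ['sky', 'cloud', 'plane']
```

With `weight=0.0` the sketch reduces to ranking by the base model alone, so the incoherent word "coral" is kept. With `weight=0.5` the coherence term replaces it with "cloud", which is the kind of effect the abstract attributes to the semantic similarity language model.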