Automatic image annotation has become an important yet challenging problem because of the semantic gap. In this paper, we present an approach based on probabilistic latent semantic analysis (PLSA) for this task. To model the training data precisely, each image is first represented as a bag of visual words; a probabilistic structure with two PLSA models is then employed to capture semantic information from the visual and textual modalities, respectively. Furthermore, an adaptive asymmetric learning approach is proposed to fuse the aspects of these two models: for each image document, the aspect distributions of the two models are combined with weights determined by the entropy of the corresponding feature distribution. Consequently, the two models are linked through the same distribution over aspects. Because this structure associates the visual and textual modalities properly, it predicts semantic annotations well for unseen images. We compare our approach with several previous approaches on a standard Corel dataset, and the experimental results show that our approach is more effective and accurate.
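The entropy-weighted fusion of the two aspect distributions can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes an inverse-entropy weighting scheme in which the lower-entropy (more peaked, hence more confident) modality receives the larger weight, and the function and variable names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def fuse_aspect_distributions(p_visual, p_textual):
    """Fuse per-document aspect distributions from the visual and
    textual PLSA models using entropy-derived adaptive weights.

    Assumption: weights are inversely proportional to each
    distribution's entropy, so the more confident modality
    dominates the fused distribution.
    """
    h_v = entropy(p_visual)
    h_t = entropy(p_textual)
    eps = 1e-12  # guard against division by zero for deterministic distributions
    inv_v, inv_t = 1.0 / (h_v + eps), 1.0 / (h_t + eps)
    w_v = inv_v / (inv_v + inv_t)
    w_t = 1.0 - w_v
    fused = w_v * np.asarray(p_visual, dtype=float) + w_t * np.asarray(p_textual, dtype=float)
    return fused / fused.sum()  # renormalize to a proper distribution

# Example: the visual distribution is more peaked, so it gets more weight.
p_v = [0.7, 0.2, 0.1]
p_t = [0.4, 0.3, 0.3]
fused = fuse_aspect_distributions(p_v, p_t)
```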