Automatic image annotation has recently been intensively studied for content-based image retrieval. In this paper, we propose a novel approach to this task. Our approach first segments images into regions, then clusters the regions, and finally learns the associations between concepts and region clusters from a set of training images with pre-assigned concepts. The main focus and contributions of this paper are as follows. First, in the learning stage, we cluster regions into region clusters by incorporating pair-wise constraints derived from the language model underlying the annotations assigned to the training images. Second, in the annotation stage, to alleviate the restriction imposed by the independence assumption between region clusters, we develop a greedy selection and joining algorithm that finds independent sub-sets of region clusters, and we employ a semi-naïve Bayesian (SNB) model to compute the posterior probability of each concept given those independent sub-sets. Experimental results show that our proposed system, utilizing these two strategies, outperforms state-of-the-art techniques on a large image collection.
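To illustrate the annotation-stage idea, the following is a minimal, hedged sketch of a semi-naïve Bayesian classifier with a greedy joining step. It assumes binary presence features for region clusters, uses empirical mutual information as the dependence measure for deciding which clusters to join, and applies Laplace smoothing; the paper's actual selection criterion, feature representation, and algorithmic details may differ, and all names (`greedy_join`, `train_snb`, `snb_posterior`) are illustrative, not from the paper.

```python
import math
from collections import Counter
from itertools import combinations, product

def mutual_information(col_a, col_b):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))
    pa, pb = Counter(col_a), Counter(col_b)
    return sum((c / n) * math.log((c * n) / (pa[a] * pb[b]))
               for (a, b), c in joint.items())

def greedy_join(X, threshold=0.1):
    """Greedily join the most dependent pairs of feature indices; the rest
    remain singletons, yielding (approximately) independent sub-sets."""
    d = len(X[0])
    cols = [[row[i] for row in X] for i in range(d)]
    scored = sorted(((mutual_information(cols[i], cols[j]), i, j)
                     for i, j in combinations(range(d), 2)), reverse=True)
    used, groups = set(), []
    for mi, i, j in scored:
        if mi > threshold and i not in used and j not in used:
            groups.append((i, j))
            used.update((i, j))
    groups += [(k,) for k in range(d) if k not in used]
    return groups

def train_snb(X, y, groups, alpha=1.0):
    """Estimate P(concept) and P(group value | concept) with Laplace smoothing."""
    concepts = sorted(set(y))
    prior = {c: y.count(c) / len(y) for c in concepts}
    cond = {c: {} for c in concepts}
    for c in concepts:
        rows = [x for x, lab in zip(X, y) if lab == c]
        for g in groups:
            counts = Counter(tuple(r[i] for i in g) for r in rows)
            domain = list(product((0, 1), repeat=len(g)))
            total = len(rows) + alpha * len(domain)
            for v in domain:
                cond[c][(g, v)] = (counts[v] + alpha) / total
    return prior, cond

def snb_posterior(x, groups, prior, cond):
    """P(concept | x) ∝ P(concept) · Π_g P(x_g | concept), normalized."""
    log_s = {c: math.log(p) + sum(math.log(cond[c][(g, tuple(x[i] for i in g))])
                                  for g in groups)
             for c, p in prior.items()}
    m = max(log_s.values())
    w = {c: math.exp(s - m) for c, s in log_s.items()}
    z = sum(w.values())
    return {c: v / z for c, v in w.items()}
```

Treating each joined group as a single joint variable relaxes the full independence assumption of naïve Bayes while keeping the posterior computation tractable, which is the intuition behind the SNB model described above.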