Concept similarity has been studied intensively in natural language processing because of its important role in applications such as language modeling and information retrieval. Few studies, however, have measured concept similarity in the visual domain, even though concept-based multimedia information retrieval has attracted considerable attention. In this paper, we present a scalable framework for this purpose, which differs from traditional approaches that explore correlation among concepts in the image/video annotation domain. For each concept, a model based on feature distribution is built from sample images collected from the Internet, and the similarity between two concepts is measured as the similarity between their models. Specifically, a Gaussian mixture model (GMM) is employed to model each concept, and two similarity measurements are investigated. Experimental results on 13,974 images of 16 concepts collected through image search engines demonstrate that the measured similarity between concepts is very close to human perception. In addition, the entropy of GMM cluster distributions can serve as a good indicator for selecting concepts for image/video annotation.
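The idea of comparing concepts through their GMMs can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not name its two similarity measurements, so a Monte Carlo estimate of the symmetric Kullback-Leibler divergence is assumed here as one plausible choice, and "entropy of GMM cluster distributions" is read as the Shannon entropy of the mixture weights. The concept names (`sky`, `sea`, `fire`), the 2-D features, and all parameter values are hypothetical, and the GMMs are given directly rather than fitted to image features.

```python
import math
import random

# Assumption: a diagonal-covariance GMM is a list of components
# (weight, means, variances), with weights summing to 1.

def log_pdf(gmm, x):
    """Log density of point x under a diagonal-covariance GMM."""
    logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        logs.append(log_p)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def sample(gmm, rng):
    """Draw one point: pick a component by weight, then sample its Gaussian."""
    weights = [component[0] for component in gmm]
    _, means, variances = rng.choices(gmm, weights=weights, k=1)[0]
    return [rng.gauss(mu, math.sqrt(var)) for mu, var in zip(means, variances)]

def kl_monte_carlo(p, q, n=2000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) over x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample(p, rng)
        total += log_pdf(p, x) - log_pdf(q, x)
    return total / n

def symmetric_kl(p, q, n=2000, seed=0):
    """Symmetrised divergence; smaller values mean more similar concepts."""
    return kl_monte_carlo(p, q, n, seed) + kl_monte_carlo(q, p, n, seed + 1)

def weight_entropy(gmm):
    """Shannon entropy (in nats) of the mixture weights."""
    return -sum(w * math.log(w) for w, _, _ in gmm if w > 0)

# Hypothetical concept models over a toy 2-D feature space.
sky = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [2.0, 1.0], [1.0, 1.0])]
sea = [(0.5, [0.5, 0.0], [1.0, 1.0]), (0.5, [2.0, 1.5], [1.0, 1.0])]
fire = [(0.5, [8.0, 8.0], [1.0, 1.0]), (0.5, [10.0, 6.0], [1.0, 1.0])]

d_sky_sea = symmetric_kl(sky, sea)
d_sky_fire = symmetric_kl(sky, fire)
print("sky vs sea :", d_sky_sea)
print("sky vs fire:", d_sky_fire)
print("entropy of sky weights:", weight_entropy(sky))
```

In this toy setup the `sky` and `sea` models overlap heavily while `fire` lies far away, so the divergence ranks `sea` as the closer concept. A real pipeline would first fit each GMM to features extracted from the collected images.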