Learning image semantics from both textual and visual information is a challenging research problem in content-based image retrieval. In this paper, we present a statistical natural language processing model for image retrieval that integrates low-level visual features with semantic information provided by WordNet, an online lexical reference system. In our system, the semantic hierarchy of word senses from WordNet is used to strengthen the association between images and the textual description of a concept. A statistical keyword selection algorithm is then applied to choose the most representative keywords for annotating the images of that concept. We test our model on a landscape image database with 10 different concepts. Our experimental results show that the approach substantially improves retrieval accuracy. The results also demonstrate the high potential of our approach for building ontologies of image databases.
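To make the keyword selection idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a tiny hand-written hypernym map (`HYPERNYMS`, hypothetical data standing in for WordNet's sense hierarchy) and a simple frequency-times-specificity score chosen only for illustration; the function names `expand` and `select_keywords` are likewise hypothetical, and visual features are left out entirely.

```python
import math
from collections import Counter

# Toy hypernym map standing in for WordNet's sense hierarchy (hypothetical data).
HYPERNYMS = {
    "oak": "tree",
    "pine": "tree",
    "tree": "plant",
    "lake": "water",
    "river": "water",
}


def expand(word):
    """Return the word followed by its ancestors in the toy hierarchy."""
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain


def select_keywords(descriptions, k=2):
    """Pick the k highest-scoring terms per concept.

    descriptions maps a concept name to a list of word lists, one per image.
    The score (an illustrative assumption) is the fraction of the concept's
    images containing the term, weighted by how few concepts use the term.
    """
    counts = {}
    for concept, docs in descriptions.items():
        counter = Counter()
        for words in docs:
            # Count each expanded term at most once per image.
            counter.update({t for w in words for t in expand(w)})
        counts[concept] = counter

    n_concepts = len(descriptions)
    selected = {}
    for concept, counter in counts.items():
        n_images = len(descriptions[concept])
        scores = {}
        for term, n in counter.items():
            spread = sum(1 for other in counts.values() if term in other)
            scores[term] = (n / n_images) * math.log((1 + n_concepts) / spread)
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[concept] = ranked[:k]
    return selected


descriptions = {
    "forest": [["oak", "sky"], ["pine", "sky"]],
    "lake": [["lake", "sky"], ["river", "sky"]],
}
print(select_keywords(descriptions))
```

In this toy input, walking up the hierarchy lets "oak" and "pine" both contribute to "tree" and "plant", so the shared ancestors outrank the individual words, while "sky" is down-weighted because it appears under every concept.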
Published in:
Computer Vision and Pattern Recognition Workshop, 2004. CVPRW '04. Conference on
Date of Conference: 27 June to 2 July 2004