This paper proposes a new method for mapping image segments to words in three layers for image retrieval. Our main goal is to incorporate higher-level semantics into the retrieval process and thus narrow the gap between the user's interpretation of an image and the low-level visual features automatically extracted from the same image content. The method is based on nonlinear segmentation, together with clustering and statistical learning applied to both visual and textual features, to find semantic relations between visual segment clusters and words at various levels of abstraction. Experiments conducted on a broad domain of natural images show that step-by-step semantic inferencing in image-word mapping helps to improve retrieval performance. The method supports various textual and/or visual browsing and searching schemes and proves useful for effective browsing and retrieval in large image data sets.
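To make the idea of relating visual segment clusters to words concrete, the following is a minimal sketch and not the three-layer method proposed in the paper. It assumes each training image has already been reduced to a list of segment cluster IDs and a list of annotation words, and it estimates a single-level conditional probability P(word | cluster) from plain co-occurrence counts. The function names `estimate_word_given_cluster` and `annotate`, the toy data, and the scoring rule are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def estimate_word_given_cluster(images):
    """Estimate P(word | cluster) from co-occurrence counts.

    `images` is a list of (cluster_ids, words) pairs, one per training image.
    This is an illustrative assumption, not the paper's statistical model.
    """
    cooc = defaultdict(Counter)
    for cluster_ids, words in images:
        # Count each (cluster, word) pair at most once per image.
        for c in set(cluster_ids):
            for w in set(words):
                cooc[c][w] += 1

    probs = {}
    for c, counts in cooc.items():
        total = sum(counts.values())
        probs[c] = {w: n / total for w, n in counts.items()}
    return probs


def annotate(cluster_ids, probs, top_k=2):
    """Rank words for an unseen image by summing P(word | cluster)."""
    scores = Counter()
    for c in cluster_ids:
        for w, p in probs.get(c, {}).items():
            scores[w] += p
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]


# Toy data (hypothetical): cluster IDs stand in for clustered image segments.
training = [
    ([0, 1], ["sky", "sea"]),
    ([0, 2], ["sky", "grass"]),
    ([1], ["sea"]),
]
probs = estimate_word_given_cluster(training)
print(annotate([1], probs, top_k=1))
```

In this sketch a query image is annotated by the words most strongly associated with its segment clusters, and the resulting words could then be matched against a textual query. A layered scheme such as the one the paper describes would presumably repeat this kind of inference across words of increasing abstraction, but that structure is not specified here and is not modeled.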