For the last few years, bag-of-words models have been successfully applied in the field of information retrieval. However, their application to visual content suffers from an important shortcoming: they model images as unordered sets of visual words and ignore their spatial and geometric layout. Visual information is highly organized along the dimensions of an image, and algorithms should exploit this structure to improve performance on several visual processing tasks. In this paper, a generative model is proposed that fuses the local information obtained from visual words with the global geometric layout given by a previous segmentation of the image. Furthermore, the model considers inter-region influences, so topics can spread across the image and thus generate final segmentations in which regions represent semantic concepts. The proposed model is successfully tested on three different tasks.