This paper presents a novel discriminative stochastic method for image categorization and annotation. We first divide each image into blocks on a regular grid and then generate visual keywords by quantizing the features of the image blocks. The traditional Markov chain model is generalized to capture 2-D spatial dependence between visual keywords by defining the "past" as what has already been observed in a row-wise raster scan. The proposed spatial Markov chain model can be trained via maximum-likelihood estimation and then used directly for image categorization. Because this is a purely generative method, we can further improve it through discriminative learning: spatial dependence between visual keywords is incorporated into kernels in two different ways, for use with a support vector machine in a discriminative approach to image categorization. Moreover, a kernel combination is used to handle rotation and multiscale issues. Experiments on several image databases demonstrate that our spatial Markov kernel method achieves promising results for image categorization. When applied to image annotation, which can be regarded as multilabel image categorization, our method also outperforms state-of-the-art techniques.
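To make the pipeline concrete, the sketch below illustrates the two steps the abstract names: quantizing grid blocks into visual keywords, and maximum-likelihood estimation of a spatial Markov chain whose "past" for each block is its left and above neighbours in a row-wise raster scan. This is a minimal illustration, not the paper's implementation: the block feature (mean intensity) and the quantizer (uniform binning in place of a learned codebook such as k-means) are simplifying assumptions, and all function names are hypothetical.

```python
def block_keywords(image, block, K):
    """Quantize each block x block patch of a 2-D intensity image (values 0-255)
    into one of K visual keywords. Uniform binning of the block mean stands in
    for a learned codebook (e.g. k-means over block features)."""
    rows, cols = len(image) // block, len(image[0]) // block
    grid = []
    for r in range(rows):
        row_kw = []
        for c in range(cols):
            vals = [image[r * block + i][c * block + j]
                    for i in range(block) for j in range(block)]
            mean = sum(vals) / len(vals)
            row_kw.append(min(int(mean * K / 256), K - 1))
        grid.append(row_kw)
    return grid


def mle_transitions(grid, K):
    """Maximum-likelihood estimate of P(current keyword | left, above keywords).
    In a row-wise raster scan the left and above neighbours are the 'past'
    already observed when the current block is reached."""
    counts = {}
    for r in range(1, len(grid)):
        for c in range(1, len(grid[0])):
            past = (grid[r][c - 1], grid[r - 1][c])
            counts.setdefault(past, [0] * K)[grid[r][c]] += 1
    probs = {}
    for past, cs in counts.items():
        total = sum(cs)
        probs[past] = [n / total for n in cs]
    return probs
```

The resulting conditional distributions define the generative model; a per-image log-likelihood under them (or the counts themselves) is the kind of quantity that can then be embedded in a kernel for SVM classification.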