This paper presents a new class of 2D string kernels, called spatial mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to image categorization. We first divide each image into blocks on a regular grid, cluster all blocks to obtain visual keywords, and represent each image as a 2D sequence of these keywords. By decomposing each 2D sequence into two parallel 1D sequences (the row-wise and column-wise ones), our spatial mismatch kernels measure 2D sequence similarity based on shared occurrences of k-length 1D subsequences, counted with up to m mismatches. Whereas bag-of-words methods ignore the spatial structure of an image, our spatial mismatch kernels capture the spatial dependencies across visual keywords within the image. Experiments on natural and histological image databases demonstrate that our spatial mismatch kernel methods achieve superior results.
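
As an illustration of the idea, the following is a minimal Python sketch of a (k, m)-mismatch kernel applied to grids of visual keyword indices. It is not the paper's implementation. The function names (`mismatch_neighborhood`, `mismatch_features`, `spatial_mismatch_kernel`) are hypothetical, and two choices are assumptions made here: each row and each column is treated as a separate 1D sequence, and the row-wise and column-wise kernel values are simply summed. The explicit neighborhood enumeration is only practical for small alphabets and small k.

```python
from collections import Counter
from itertools import combinations, product


def mismatch_neighborhood(kmer, m, alphabet):
    """Return all k-mers over `alphabet` within Hamming distance m of `kmer`."""
    k = len(kmer)
    neighbors = {tuple(kmer)}
    for d in range(1, m + 1):
        for positions in combinations(range(k), d):
            # At each chosen position, substitute every symbol except the original.
            choices = [[s for s in alphabet if s != kmer[p]] for p in positions]
            for replacement in product(*choices):
                candidate = list(kmer)
                for p, s in zip(positions, replacement):
                    candidate[p] = s
                neighbors.add(tuple(candidate))
    return neighbors


def mismatch_features(sequences, k, m, alphabet):
    """Count, for each k-mer, how many k-length windows lie within m mismatches of it."""
    features = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            for neighbor in mismatch_neighborhood(seq[i:i + k], m, alphabet):
                features[neighbor] += 1
    return features


def spatial_mismatch_kernel(grid_a, grid_b, k, m, alphabet):
    """Sum of row-wise and column-wise mismatch kernels (assumed combination)."""
    def rows_and_columns(grid):
        rows = [tuple(row) for row in grid]
        columns = [tuple(col) for col in zip(*grid)]
        return rows, columns

    total = 0
    for seqs_a, seqs_b in zip(rows_and_columns(grid_a), rows_and_columns(grid_b)):
        feats_a = mismatch_features(seqs_a, k, m, alphabet)
        feats_b = mismatch_features(seqs_b, k, m, alphabet)
        # Inner product of the two sparse feature vectors.
        total += sum(count * feats_b[kmer] for kmer, count in feats_a.items())
    return total


# Toy 2x2 grids of keyword indices over a two-symbol alphabet.
grid = [[0, 1], [1, 0]]
print(spatial_mismatch_kernel(grid, grid, k=2, m=0, alphabet=[0, 1]))  # 4
print(spatial_mismatch_kernel(grid, grid, k=2, m=1, alphabet=[0, 1]))  # 20
```

A Gram matrix of such kernel values between all image pairs could then be passed to an SVM that accepts precomputed kernels. Allowing m > 0 raises the similarity between grids whose k-length windows differ in only a few keywords, which is the tolerance that the mismatch formulation is meant to provide.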