Skip to Main Content
Some of the most effective recent methods for content-based image classification work by quantizing image descriptors, and accumulating histograms of the resulting visual word codes. Large numbers of descriptors and large codebooks are required for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests-ensembles of randomly created clustering trees-and show that they provide more accurate results, much faster training and testing, and good resistance to background clutter. Second, an efficient image classification method is proposed. It combines ERC-Forests and saliency maps very closely with the extraction of image information. For a given image, a classifier builds a saliency map online and uses it to classify the image. We show in several state-of-the-art image classification tasks that this method can speed up the classification process enormously. Finally, we show that the proposed ERC-Forests can also be used very successfully for learning distance between images. The distance computation algorithm consists of learning the characteristic differences between local descriptors sampled from pairs of same or different objects. These differences are vector quantized by ERC-Forests and the similarity measure is computed from this quantization. The similarity measure has been evaluated on four very different datasets and always outperforms the state-of-the-art competitive approaches.