Most bag-of-visual-words models rely on clustering techniques, such as the K-means algorithm, to construct visual dictionaries. To improve their efficiency in multi-class image classification tasks, we present in this paper a new incremental clustering algorithm, based on weighted averaging and gradient descent, that optimizes visual word detection by using the class labels of the training examples. We show that this supervised vector quantization better reveals concept- or category-specific local feature distributions over the feature space. An extensive comparison with the standard K-means algorithm is carried out on the PASCAL VOC-2007 dataset. The results show that our visual word construction technique is much better suited to learning efficient classifiers with Support Vector Machine and Random Forest algorithms.
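For orientation, the unsupervised K-means baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a standard bag-of-visual-words pipeline and not the supervised algorithm proposed in the paper, whose update rules are not given here. The function names (`kmeans`, `assign`, `bovw_histogram`), the random initialisation, and the fixed iteration count are all assumptions made for the example.

```python
import numpy as np


def assign(descriptors, centroids):
    """Return, for each descriptor, the index of its nearest centroid."""
    # Squared Euclidean distances, shape (n_descriptors, n_centroids).
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; each returned row is one visual word."""
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initialise with k distinct descriptors picked at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[start].copy()
    for _ in range(n_iter):
        labels = assign(descriptors, centroids)
        for j in range(k):
            members = descriptors[labels == j]
            # Leave a centroid unchanged if no descriptor was assigned to it.
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids


def bovw_histogram(descriptors, centroids):
    """Encode one image as a normalised histogram of visual-word counts."""
    labels = assign(np.asarray(descriptors, dtype=float), centroids)
    hist = np.bincount(labels, minlength=len(centroids)).astype(float)
    return hist / hist.sum()
```

In such a pipeline, `kmeans` would be run once on local descriptors pooled from the training images, and `bovw_histogram` would then turn each image's descriptors into a fixed-length vector for a classifier such as an SVM or a Random Forest. Note that the class labels play no role in building this dictionary, which is the limitation the supervised quantization described above is meant to address.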