This paper studies a method for learning a discriminative visual codebook for computer vision tasks such as image categorization and object recognition. Performance on these tasks depends on how the codebook, a table of visual words (i.e., codewords), is constructed. We propose a learning criterion for constructing a discriminative codebook and solve it with a homonym scheme that splits codeword regions by label. The codebook is learned with this scheme so that its histogram can discriminate between objects of different labels. We compare the learned codebook against a traditional k-means codebook on two well-known datasets (Caltech 101 and ETH-80) and on a dataset we constructed from Google Images. We show that the learned codebook consistently outperforms the traditional codebook.
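To make the idea of splitting codeword regions by label concrete, the following is a minimal sketch under stated assumptions, not the algorithm from the paper. It assumes a coarse codebook is already available (for example, from k-means), and it uses a hypothetical `split_by_label` step that replaces each codeword with one sub-codeword per label found in its region. The function names, the toy 2-D descriptors, and the `cat`/`dog` labels are all illustrative.

```python
from collections import defaultdict

def nearest(codebook, x):
    """Index of the codeword closest to descriptor x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)))

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def split_by_label(codebook, descriptors, labels):
    """Hypothetical split step (an assumption, not the paper's criterion):
    create one sub-codeword per (codeword region, label) pair."""
    groups = defaultdict(list)
    for x, y in zip(descriptors, labels):
        groups[(nearest(codebook, x), y)].append(x)
    # Sort the keys so the order of the refined codewords is deterministic.
    return [mean(groups[k]) for k in sorted(groups)]

def histogram(codebook, descriptors):
    """Normalized bag-of-visual-words histogram of one image's descriptors."""
    counts = [0] * len(codebook)
    for x in descriptors:
        counts[nearest(codebook, x)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Toy data: the first coarse region contains descriptors of two labels.
coarse = [(0.0, 0.0), (10.0, 10.0)]  # stand-in for a k-means codebook
descriptors = [(-1.0, 0.0), (-1.0, 0.2), (1.0, 0.0), (1.0, -0.2), (10.0, 10.0)]
labels = ["cat", "cat", "dog", "dog", "cat"]
refined = split_by_label(coarse, descriptors, labels)

cat_image = [(-1.0, 0.1), (-0.9, 0.0)]
dog_image = [(1.0, 0.1), (0.9, 0.0)]

print(histogram(coarse, cat_image), histogram(coarse, dog_image))
print(histogram(refined, cat_image), histogram(refined, dog_image))
```

In this toy setting the coarse codebook assigns both images to the same codeword, so their histograms are identical, whereas the label-split codebook places them in different bins. How the paper decides when and where to split is governed by its learning criterion, which this sketch does not attempt to reproduce.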
Published in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 22-27 May 2011