Skip to Main Content
SIFT features are widely used in content based image retrieval. Typically, a few thousand keypoints are extracted from each image. Image matching involves distance computations across all pairs of SIFT feature vectors from both images, which is quite costly. We show that SIFT features perform surprisingly well even after quantizing each component to binary, when the medians are used as the quantization thresholds. Quantized features preserve both distinctiveness and matching properties. Almost all of the features in our 5.4 million feature test set map to distinct binary patterns after quantization. Furthermore, number of matches between images using both the original and the binary quantized SIFT features are quite similar. We investigate the distribution of SIFT features and observe that the space of 128-D binary vectors has sufficient capacity for the current performance of SIFT features. We use component median values as quantization thresholds and show through vector-to-vector distance comparisons and image-to-image matches that the resulting binary vectors perform comparable to original SIFT vectors. We also discuss computational and storage gains. Binary vector distance computation reduces to bit-wise operations. Square operation is eliminated. Fast and efficient indexing techniques such as the signatures used for chemical databases can also be considered.