Skip to Main Content
One way to find near-matches in large datasets is to use hash functions. In recent years locality-sensitive hash functions for various metrics have been given; for the Hamming metric projecting onto k bits is simple hash function that performs well. In this paper, we investigate alternatives to projection. For various parameters hash functions given by complete decoding algorithms for error-correcting codes work better, and asymptotically random codes perform better than projection.
Date of Publication: March 2010