Optimal Hash Functions for Approximate Matches on the n-Cube
One way to find near-matches in large datasets is to use hash functions. In recent years, locality-sensitive hash functions have been given for various metrics; for the Hamming metric, projecting onto k bits is a simple hash function that performs well. In this paper, we investigate alternatives to projection. For various parameters, hash functions given by complete decoding algorithms for error-correcting codes work better, and asymptotically, random codes perform better than projection.
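
The two kinds of hash function contrasted in the abstract can be sketched in a few lines of Python. This is an illustration only, not the construction analyzed in the paper: the function names, the random choice of coordinates, and the use of the [7,4] Hamming code as the example code are all assumptions made here for concreteness.

```python
import random
from math import comb


def hamming_distance(x, y):
    """Number of coordinates in which two equal-length bit vectors differ."""
    return sum(a != b for a, b in zip(x, y))


def make_projection_hash(n, k, seed=0):
    """Projection hash (hypothetical helper): keep k fixed coordinates of an n-bit vector."""
    rng = random.Random(seed)
    coords = sorted(rng.sample(range(n), k))

    def h(x):
        return tuple(x[i] for i in coords)

    return h, coords


def projection_collision_probability(n, k, d):
    """Chance that two points at Hamming distance d collide when the
    k coordinates are a uniformly random k-subset of the n positions."""
    return comb(n - d, k) / comb(n, k)


def hamming74_decode_hash(x):
    """Decoding hash (illustrative code choice): map a 7-bit vector to the
    nearest codeword of the [7,4] Hamming code via syndrome decoding.

    Positions are numbered 1..7; the syndrome is the XOR of the positions
    holding a 1. Because this code is perfect, every vector lies within
    distance 1 of exactly one codeword, so the decoder is complete.
    """
    syndrome = 0
    for pos, bit in enumerate(x, start=1):
        if bit:
            syndrome ^= pos
    y = list(x)
    if syndrome:
        y[syndrome - 1] ^= 1  # flip the single bit indicated by the syndrome
    return tuple(y)
```

With the projection hash, two vectors collide exactly when they agree on the chosen coordinates. With the decoding hash, two vectors collide when they decode to the same codeword, so every vector within distance 1 of a codeword shares its bucket.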
Published in: Information Theory, IEEE Transactions on (Volume: 56, Issue: 3)
Date of Publication: March 2010