Abstract:
Approximate nearest neighbor (ANN) search over high-dimensional multimedia data has become an essential service for online applications. Returning fast, high-quality results for previously unseen queries is the biggest challenge that most algorithms face. Locality Sensitive Hashing (LSH) is a well-known ANN search algorithm, but it suffers from an inefficient index structure and poor accuracy in distributed settings. Traditional index structures have a most significant bits (MSB) problem: their indexing strategies implicitly assume that the bits at one end of the hash value have higher priority. In this paper, we propose a new content-based index called Random Draw Forest (RDF), which not only applies a content-based partition strategy to reduce the search range for fast query response, but also applies shuffling permutations to the hash values to solve the MSB problem. We also study the trade-off between query efficiency and accuracy after applying our partition strategy. In the experiments, we show the effect of the parameters and the strong performance of RDF compared with other LSH-based methods for online ANN search.
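The MSB problem and the effect of shuffling permutations can be illustrated with a toy example. The sketch below is not the RDF implementation; it only assumes that a prefix-based index ranks candidates by how many leading hash bits they share with the query. The helper names (`common_prefix_len`, `permute`, `best_prefix_scores`), the 8-bit codes, and the choice of permutations are all hypothetical.

```python
import random

def common_prefix_len(a, b):
    """Count the leading positions on which two bit tuples agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def permute(code, perm):
    """Reorder the bits of `code` so that output position j holds code[perm[j]]."""
    return tuple(code[i] for i in perm)

def best_prefix_scores(query, codes, perms):
    """For each stored code, keep the longest prefix match over all bit orders."""
    return [
        max(common_prefix_len(permute(query, p), permute(c, p)) for p in perms)
        for c in codes
    ]

num_bits = 8
identity = list(range(num_bits))

query = (1, 0, 1, 1, 0, 0, 1, 0)
near = (0, 0, 1, 1, 0, 0, 1, 0)  # Hamming distance 1, but differs in the first bit
far = (1, 0, 1, 1, 1, 1, 0, 1)   # Hamming distance 4, but shares the first four bits

# A single fixed bit order favours `far`, although `near` is closer in Hamming distance.
single_order = best_prefix_scores(query, [near, far], [identity])
print(single_order)  # [0, 4]

# Assumption: several bit orders are indexed side by side. The reversed order is
# included so the example is deterministic; the seeded shuffle stands in for a
# randomly drawn permutation.
rng = random.Random(0)
perms = [identity, identity[::-1], rng.sample(identity, num_bits)]
shuffled = best_prefix_scores(query, [near, far], perms)
print(shuffled)  # [7, 4]
```

With one bit order, a single mismatch in the leading bit hides the closest code from a prefix search. Once more than one order is available, the mismatching bit lands late in at least one of them, and the closer code is ranked first.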
Date of Conference: 13-16 September 2018
Date Added to IEEE Xplore: 21 October 2018