We propose an iterative optimization algorithm for the generic class of clustering-based indexing methods for approximate similarity search. Clustering has previously been shown to be a powerful component of approximate search because it reduces the number of data points that must be retrieved. The objective of the proposed algorithm is to maximize the expected search quality given the query distribution. The problem is decomposed into a minimization over three mapping functions, and each fixed-point iteration of the algorithm optimizes one mapping while holding the other two fixed. We demonstrate via experiments on real high-dimensional data sets that the algorithm significantly improves the time/accuracy trade-off over heuristic clustering design.
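As a rough illustration of the general idea only, the sketch below builds a clustering-based index with standard Lloyd (k-means) iterations and answers a query by scanning only the nearest clusters. It is not the algorithm proposed here: the abstract does not name the three mapping functions, so this stand-in alternates over just two mappings (point-to-cluster assignment and cluster centroids) and ignores the query distribution. The names `build_index`, `search`, and `n_probe` are hypothetical and chosen for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def build_index(data, k, iters=20, seed=0):
    """Assumed stand-in: plain Lloyd iterations, alternating two mappings."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    assign = [0] * len(data)
    for _ in range(iters):
        # Step 1: hold the centroids fixed, re-optimize the assignment mapping.
        new_assign = [nearest(p, centroids) for p in data]
        # Step 2: hold the assignment fixed, re-optimize the centroids.
        for j in range(k):
            members = [p for p, a in zip(data, new_assign) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
        if new_assign == assign:  # fixed point reached
            break
        assign = new_assign
    clusters = [[] for _ in range(k)]
    for i, a in enumerate(assign):
        clusters[a].append(i)
    return centroids, clusters


def search(query, data, centroids, clusters, n_probe=1):
    """Scan only the n_probe clusters whose centroids are closest to the query.

    Returns (index of best candidate or None, number of points scanned).
    """
    order = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))
    candidates = [i for j in order[:n_probe] for i in clusters[j]]
    if not candidates:
        return None, 0
    best = min(candidates, key=lambda i: sq_dist(query, data[i]))
    return best, len(candidates)
```

Raising `n_probe` scans more points and trades time for accuracy; with `n_probe` equal to the number of clusters the search degenerates to an exact linear scan. A design that optimized the clusters for the expected query distribution, as proposed here, would replace the two alternating steps in `build_index`.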