This paper proposes novel and effective techniques to estimate a suitable radius for answering k-nearest neighbor (k-NN) queries. The first technique targets datasets where it is possible to learn the distribution of the pairwise distances between the elements, generating a global estimate that applies to the whole dataset. The second technique targets datasets where the first technique cannot be employed, generating local estimates that depend on where the query center is located. The proposed k-NNF() algorithm combines both techniques, achieving remarkable speedups. Experiments on both real and synthetic datasets show that the proposed algorithm can accelerate k-NN queries by more than 26 times compared with the incremental algorithm, and that it takes half the total time of the traditional k-NN() algorithms.
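The general idea of answering a k-NN query through an estimated radius can be illustrated with a minimal sketch. This is not the paper's k-NNF() algorithm: the function names `estimate_global_radius` and `knn_with_radius`, the quantile-based estimator, the radius-doubling fallback, and the linear scan standing in for an index range query are all assumptions made for illustration only.

```python
import math
import random


def estimate_global_radius(points, k, sample_pairs=2000, seed=0):
    """Guess one radius for the whole dataset (hypothetical estimator).

    Assumption: the fraction of sampled pairwise distances below r
    approximates the fraction of elements within r of a query center.
    """
    rng = random.Random(seed)
    n = len(points)
    dists = sorted(
        math.dist(*rng.sample(points, 2)) for _ in range(sample_pairs)
    )
    # Take roughly the k/n quantile of the sampled distance distribution.
    idx = min(len(dists) - 1, math.ceil(sample_pairs * k / n))
    return dists[idx]


def knn_with_radius(points, query, k, radius):
    """Answer a k-NN query with range queries of growing radius.

    A linear scan stands in for an index range query. If the ball holds
    at least k elements, the k nearest are guaranteed to be inside it.
    """
    if k > len(points):
        raise ValueError("k exceeds the dataset size")
    while True:
        within = [
            (d, p)
            for p in points
            if (d := math.dist(p, query)) <= radius
        ]
        if len(within) >= k:
            within.sort(key=lambda pair: pair[0])
            return [p for _, p in within[:k]]
        # Underestimated radius: enlarge it and retry (assumed fallback).
        radius = radius * 2 if radius > 0 else 1e-9


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(1000)]
    r = estimate_global_radius(data, k=10)
    print(knn_with_radius(data, (0.5, 0.5), k=10, radius=r))
```

In this sketch a good estimate matters because an overestimated radius returns many candidates to sort, while an underestimated one forces the range query to be repeated.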
Published in:
Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on
Date of Conference: 9-11 July 2007