Bounded Approximation: A New Criterion for Dimensionality Reduction Approximation in Similarity Search

4 Author(s)
Khanh Vu (Sch. of Electr. Eng. & Comput. Sci., Univ. of Central Florida, Orlando, FL); Hua, K.A.; Hao Cheng; Sheau-Dong Lang

We examine the problem of efficient distance-based similarity search over high-dimensional data. We show that a promising approach to this problem is to reduce dimensions and allow fast approximation. Conventional reduction approaches, however, entail a significant shortcoming: The approximation volume extends across the dataspace, which causes overestimation of retrieval sets and impairs performance. This paper focuses on a new criterion for dimensionality reduction methods: bounded approximation. We show that this requirement can be accomplished by a novel nonlinear transformation scheme that extracts two important parameters from the data. We devise two approximation formulations, namely, rectangular and spherical range search, each corresponding to a closed volume around the original search sphere. We discuss in detail how we can derive tight bounds for the parameters and prove further results, as well as highlight insights into the problems and our proposed solutions. To demonstrate the benefits of the new criterion, we study the effects of (un)boundedness on approximation performance, including selectivity, error toleration, and efficiency. Extensive experiments confirm the superiority of this technique over recent state-of-the-art schemes.
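The overestimation problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's transformation: it assumes the simplest conventional reduction (keeping the first `k` coordinates) and a generic filter-and-refine range search. The names `reduce_dims` and `range_search`, the synthetic data, and the radius are all hypothetical choices made for illustration. Because dropping coordinates can only shrink a Euclidean distance, the filter step never misses a true answer, but its search region is unbounded in the dropped dimensions, so the candidate set can be much larger than the answer set.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_dims(point, k):
    """Assumed conventional reduction: keep only the first k coordinates."""
    return point[:k]


def range_search(data, query, radius, k):
    """Filter in the reduced space, then refine with the full distance."""
    q_red = reduce_dims(query, k)
    # Filter: the reduced distance lower-bounds the full distance, so every
    # true answer survives, along with points that are far from the query
    # only in the dropped dimensions.
    candidates = [p for p in data
                  if euclidean(reduce_dims(p, k), q_red) <= radius]
    # Refine: discard false positives using the original distance.
    answers = [p for p in candidates if euclidean(p, query) <= radius]
    return candidates, answers


# Synthetic 16-dimensional data; all values here are illustrative assumptions.
random.seed(0)
data = [tuple(random.random() for _ in range(16)) for _ in range(2000)]
query = data[0]

candidates, answers = range_search(data, query, radius=0.6, k=4)
exact = [p for p in data if euclidean(p, query) <= 0.6]

print("candidates after filter:", len(candidates))
print("answers after refine:   ", len(answers))
```

The gap between the two printed counts is the refinement work caused by the unbounded approximation volume. A bounded approximation, as proposed in the paper, aims to enclose the original search sphere in a closed volume so that this gap shrinks.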

Published in:

IEEE Transactions on Knowledge and Data Engineering (Volume: 20, Issue: 6)