
Subspace Similarity Search under $L_p$-Norm


2 Author(s)
Xiang Lian ; Dept. of Comput. Sci., Univ. of Texas-Pan American, Edinburg, TX, USA ; Lei Chen

Similarity search has been widely used in many applications such as information retrieval, image data analysis, and time-series matching. Previous work on similarity search usually considers the search problem in the full space. In this paper, however, we tackle the problem of subspace similarity search, which finds all data objects that match a query object in a subspace instead of the original full space. In particular, the query can specify an arbitrary subspace with an arbitrary number of dimensions. Because the number of possible subspaces that users may specify is exponential, we introduce an efficient and effective pruning technique, which assigns scores to data objects with respect to pivots and prunes candidates via these scores. We propose an effective multipivot-based method to preprocess data objects by selecting appropriate pivots, where the entire procedure is guided by a formal cost model such that the pruning power is maximized. The scores of each data object are then organized in sorted lists to facilitate efficient subspace similarity search. Furthermore, much real-world application data, such as image databases, time-series data, and sensory data, often contains noise, and such data can be modeled as uncertain objects. Compared with certain data, efficient query processing on uncertain data is more challenging because of the intensive computation of probability confidences. Thus, it is also crucial to answer subspace queries efficiently and effectively over uncertain objects. Specifically, we define a novel query, namely the probabilistic subspace range query (PSRQ) in an uncertain database, which finds objects that are, with high probability, within a given distance of a query object in any subspace. To address this query, we extend our proposed pruning techniques for precise data to answer PSRQ in arbitrary subspaces. Extensive experiments demonstrate the performance of our proposed approaches.
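
To make the idea of pruning by pivot scores concrete, here is a minimal Python sketch of a subspace range query on precise data. It is an illustration under stated assumptions, not the method of the paper: the abstract does not define the score, so the sketch assumes a per-dimension score |o_i - pivot_i| and prunes with the reverse triangle inequality applied dimension by dimension. The function names (`subspace_lp`, `pivot_scores`, `lower_bound`, `subspace_range_query`) are hypothetical, and the sorted lists and cost-model-guided pivot selection mentioned in the abstract are omitted.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def subspace_lp(a: Point, b: Point, dims: Sequence[int], p: float = 2.0) -> float:
    """L_p distance between a and b restricted to the dimensions in dims."""
    return sum(abs(a[i] - b[i]) ** p for i in dims) ** (1.0 / p)


def pivot_scores(obj: Point, pivot: Point) -> List[float]:
    """Assumed score: per-dimension absolute difference to one pivot."""
    return [abs(x - c) for x, c in zip(obj, pivot)]


def lower_bound(obj_scores: Sequence[float], query_scores: Sequence[float],
                dims: Sequence[int], p: float = 2.0) -> float:
    """Lower bound on the subspace L_p distance.

    In every dimension i, |o_i - q_i| >= | |o_i - c_i| - |q_i - c_i| |,
    so combining the per-dimension bounds under L_p never overestimates.
    """
    return sum(abs(obj_scores[i] - query_scores[i]) ** p for i in dims) ** (1.0 / p)


def subspace_range_query(data: Sequence[Point], pivots: Sequence[Point],
                         query: Point, dims: Sequence[int], radius: float,
                         p: float = 2.0) -> Tuple[List[int], int]:
    """Return (indices of matching objects, number of exact distance checks)."""
    # Preprocessing step: scores of every object with respect to every pivot.
    scores = [[pivot_scores(obj, pv) for obj in data] for pv in pivots]
    q_scores = [pivot_scores(query, pv) for pv in pivots]

    result, refined = [], 0
    for idx, obj in enumerate(data):
        # Use the tightest (largest) lower bound over all pivots.
        lb = max(lower_bound(scores[k][idx], q_scores[k], dims, p)
                 for k in range(len(pivots)))
        if lb > radius:
            continue  # safely pruned without computing the exact distance
        refined += 1
        if subspace_lp(obj, query, dims, p) <= radius:
            result.append(idx)
    return result, refined


data = [(0.0, 0.0, 0.0), (1.0, 5.0, 1.0), (9.0, 9.0, 9.0), (1.5, 0.0, 0.5)]
pivots = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
# Query only dimensions 0 and 2; dimension 1 is ignored.
matches, checked = subspace_range_query(data, pivots, (1.0, 7.0, 1.0), [0, 2], 1.0)
```

Because the bound never exceeds the true subspace distance, pruning cannot discard a true match; it only reduces how many exact distances are computed. In a real index the scores would be computed once offline rather than inside the query function.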

Published in:

Knowledge and Data Engineering, IEEE Transactions on (Volume: 24, Issue: 2)