Cart (Loading....) | Create Account
Close category search window
 

An efficient subspace sampling framework for high-dimensional data reduction, selectivity estimation, and nearest-neighbor search

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

1 Author(s)
Aggarwal, C.C. ; IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Data reduction can improve the storage, transfer time, and processing requirements of very large data sets. One of the challenges of designing effective data reduction techniques is to be able to preserve the ability to use the reduced format directly for a wide range of database and data mining applications. We propose the novel idea of hierarchical subspace sampling in order to create a reduced representation of the data. The method is naturally able to estimate the local implicit dimensionalities of each point very effectively and, thereby, create a variable dimensionality reduced representation of the data. Such a technique is very adaptive about adjusting its representation depending upon the behavior of the immediate locality of a data point. An important property of the subspace sampling technique is that the overall efficiency of compression improves with increasing database size. Because of its sampling approach, the procedure is extremely fast and scales linearly both with data set size and dimensionality. We propose new and effective solutions to problems such as selectivity estimation and approximate nearest-neighbor search. These are achieved by utilizing the locality specific subspace characteristics of the data which are revealed by the subspace sampling technique.

Published in:

Knowledge and Data Engineering, IEEE Transactions on  (Volume:16 ,  Issue: 10 )

Date of Publication:

Oct. 2004

Need Help?


IEEE Advancing Technology for Humanity About IEEE Xplore | Contact | Help | Terms of Use | Nondiscrimination Policy | Site Map | Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest professional association for the advancement of technology.
© Copyright 2014 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.