Fixed-precision approximate continuous aggregate queries in peer-to-peer databases

2 Author(s)
Banaei-Kashani, F.; Comput. Sci. Dept., Univ. of Southern California, Los Angeles, CA, USA; Shahabi, C.

In this paper, we propose an efficient sample-based approach to answering fixed-precision approximate continuous aggregate queries in peer-to-peer databases. First, we define practical semantics for formulating such queries. Second, we propose "Digest", a two-tier system for correct and efficient query answering by sampling. At the top tier, we develop a query evaluation engine that uses samples collected from the peer-to-peer database to continually estimate the running result of the approximate continuous aggregate query with guaranteed precision. For efficient query evaluation, we propose an extrapolation algorithm that predicts the evolution of the running result and adapts the frequency of the sampling occasions accordingly to avoid redundant samples. We also introduce a repeated sampling algorithm that draws on the correlation between the samples at successive sampling occasions and exploits linear regression to minimize the number of samples drawn at each occasion. At the bottom tier, we introduce a distributed algorithm for random sampling (uniform and nonuniform) from peer-to-peer databases with arbitrary network topology and tuple distribution. Our sampling algorithm is based on the Metropolis Markov Chain Monte Carlo method, which guarantees randomness of the sample with an arbitrarily small variation difference from the desired distribution, while remaining comparable to optimal sampling in sampling cost and time. We evaluate the efficiency of Digest via simulation using real data.
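To make the bottom-tier idea concrete, the following is a minimal sketch of a generic Metropolis-Hastings random walk over an undirected peer graph. It is not the paper's Digest algorithm, whose details the abstract does not give. The function name `metropolis_walk`, the adjacency-list representation, the `weight` callback, and the lazy half-step are all assumptions made for illustration. The walk proposes a uniformly chosen neighbor and accepts it with a probability that corrects for node degree, so on a connected graph the node it stops at is distributed approximately in proportion to `weight`, whatever the topology.

```python
import random
from collections import Counter


def metropolis_walk(graph, weight, start, steps, rng):
    """Return the node reached after `steps` Metropolis-Hastings moves.

    graph  : dict mapping each node to a list of its neighbors (undirected,
             connected; hypothetical stand-in for a peer-to-peer overlay)
    weight : function giving the unnormalized target probability of a node
             (must be positive for every node)
    start  : node where the walk begins
    rng    : random.Random instance, passed in for reproducibility
    """
    current = start
    for _ in range(steps):
        # Lazy step (an assumption of this sketch): stay put half the time
        # so the chain cannot be periodic.
        if rng.random() < 0.5:
            continue
        neighbors = graph[current]
        candidate = rng.choice(neighbors)
        # The proposal probability is 1/deg(current), so the acceptance ratio
        # is (weight(candidate) / weight(current)) * (deg(current) / deg(candidate)).
        ratio = (weight(candidate) * len(neighbors)) / (
            weight(current) * len(graph[candidate])
        )
        if rng.random() < min(1.0, ratio):
            current = candidate
    return current


if __name__ == "__main__":
    # Small irregular topology: node 0 is a hub, nodes 3 and 4 are leaves.
    peers = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}
    rng = random.Random(1)
    # Uniform target: every peer gets the same weight.
    draws = Counter(
        metropolis_walk(peers, lambda node: 1.0, 0, 80, rng) for _ in range(2000)
    )
    for node in sorted(peers):
        print(node, draws[node] / 2000)
```

A plain random walk on this graph would visit the hub far more often than the leaves. The degree correction in the acceptance ratio is what removes that bias. Passing a non-constant `weight` gives nonuniform sampling with the same code. The walk length needed for a given accuracy depends on how quickly the chain mixes on the actual overlay, which this sketch does not estimate.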

Published in:

2010 6th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom)

Date of Conference:

9-12 Oct. 2010