Querying Very Large Multi-dimensional Datasets in ADR

5 Author(s)

Applications that make use of very large scientific datasets have become an increasingly important subset of scientific applications. In these applications, datasets are often multi-dimensional, i.e., data items are associated with points in a multi-dimensional attribute space, and access to data items is described by range queries. The basic processing involves mapping input data items to output data items, and some form of aggregation of all the input data items that project to each output data item. We have developed an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of multi-dimensional datasets on distributed-memory parallel architectures with multiple disks attached to each node. In this paper we address efficient execution of range queries on distributed-memory parallel machines within the ADR framework. We present three potential strategies, and evaluate them under different application scenarios and machine configurations. We present experimental results on the scalability and performance of the strategies on a 128-node IBM SP.
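The processing pattern described in the abstract (select input items with a range query, map each one to an output item, aggregate all inputs that project to the same output) can be illustrated with a small sequential sketch. This is not ADR's interface or any of the three strategies evaluated in the paper; the function name `range_query_aggregate`, its parameters, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def range_query_aggregate(items, lo, hi, project, combine, init):
    """Sequential sketch of range query + projection + aggregation.

    items   -- iterable of (point, value); point is a tuple of coordinates
    lo, hi  -- inclusive per-dimension bounds of the range query
    project -- maps an input point to an output key (hypothetical helper)
    combine -- aggregation function applied as combine(accumulator, value)
    init    -- initial accumulator value for every output item
    """
    out = defaultdict(lambda: init)
    for point, value in items:
        # Keep only items whose point lies inside the query box.
        if all(l <= c <= h for c, l, h in zip(point, lo, hi)):
            key = project(point)
            out[key] = combine(out[key], value)
    return dict(out)


# Made-up 3-D input (x, y, t) projected onto a 2-D output grid (x, y),
# keeping the maximum value seen at each output cell.
items = [
    ((0, 0, 1), 5.0),
    ((0, 0, 2), 7.0),
    ((1, 0, 1), 3.0),
    ((5, 5, 1), 9.0),  # outside the query range, so it is skipped
]
result = range_query_aggregate(
    items,
    lo=(0, 0, 0),
    hi=(1, 1, 2),
    project=lambda p: (p[0], p[1]),
    combine=max,
    init=float("-inf"),
)
```

The sketch runs in a single process and holds everything in memory. How the input and the output accumulators are distributed across nodes and disks is the subject of the paper and is not modeled here.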

Published in:

Supercomputing, ACM/IEEE 1999 Conference

Date of Conference:

13-18 Nov. 1999