Exploiting Stream Request Locality to Improve Query Throughput of a Data Integration System

Authors: Rubao Lee (Chinese Academy of Sciences, Beijing); Zhiwei Xu

This paper focuses on the problem of improving throughput of distributed query processing in an RDBMS-based data integration system. Although a buffer pool can be used in an RDBMS to cache disk pages in memory to reduce disk accesses, it cannot be used for data integration queries since its foundation, the memory-disk hierarchy, does not exist. The lack of a data sharing mechanism limits system throughput because unnecessary data requests increase burden on data sources and redundant resultant data transfers waste network bandwidth. To address the problem, we present a new technique called request window, which can detect and exploit data sharing opportunities among concurrent queries. Request window exploits a new stream request locality which reflects common query interests among independent users in a short time period. The existence of such a locality makes it possible to collect a group of related data requests and process them as a batch by request window. Evaluation on a PostgreSQL-based data integration system shows that request window can significantly increase system throughput when running a distributed TPC-H workload.
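The batching idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class name `RequestWindow`, the `fetch` callback, the `width` parameter, and the rule that only byte-identical requests are shared are all assumptions made for illustration, and time is passed in explicitly so the example stays deterministic. Each query is assumed to issue a single request.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Optional


class RequestWindow:
    """Toy batching window (hypothetical, not the paper's design).

    Requests that arrive within `width` time units of the first pending
    request are grouped, and identical requests share a single fetch.
    """

    def __init__(self, fetch: Callable[[Hashable], object], width: float):
        self.fetch = fetch                      # stand-in for a remote data source call
        self.width = width                      # window length in caller-supplied time units
        self.opened_at: Optional[float] = None  # arrival time of the first pending request
        self.pending: Dict[Hashable, List[str]] = defaultdict(list)

    def submit(self, query_id: str, request: Hashable, now: float) -> Dict[str, object]:
        """Register a request; return the previous batch's results if the window expired."""
        results: Dict[str, object] = {}
        if self.opened_at is not None and now - self.opened_at >= self.width:
            results = self.flush()
        if self.opened_at is None:
            self.opened_at = now
        self.pending[request].append(query_id)
        return results

    def flush(self) -> Dict[str, object]:
        """Process the pending batch: one fetch per distinct request, fanned out to all waiters."""
        results: Dict[str, object] = {}
        for request, query_ids in self.pending.items():
            data = self.fetch(request)
            for query_id in query_ids:
                results[query_id] = data
        self.pending = defaultdict(list)
        self.opened_at = None
        return results


# Usage: three concurrent queries, two of which ask for the same data.
calls: List[Hashable] = []


def fetch(request: Hashable) -> str:
    calls.append(request)  # record every trip to the (simulated) data source
    return f"rows for {request}"


window = RequestWindow(fetch, width=1.0)
window.submit("q1", "SELECT * FROM lineitem", now=0.0)
window.submit("q2", "SELECT * FROM lineitem", now=0.4)
window.submit("q3", "SELECT * FROM orders", now=0.7)
results = window.flush()
```

In this sketch the three queries cause only two fetches, because `q1` and `q2` share one result. A real system would also need to recognize overlapping (not just identical) requests and to bound how long a query waits in the window; both are left out here.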

Published in: IEEE Transactions on Computers (Volume: 58, Issue: 10)