Group-by Query Process in Middleware of Large Scale Data Intensive Systems

5 Author(s)
Huaiming Song ; Key Lab. of Comput. Syst. & Archit., Chinese Acad. of Sci., Beijing, China ; Mingyuan An ; Yang Wang ; Weiping Wang

Large-scale data-intensive systems have become common in many fields in recent years, and processing group-by queries over large volumes of data in a cluster based on a shared-nothing architecture is a severe challenge. This paper proposes a design of a parallel query engine (PQE) and its asynchronous improvement (APQE) for group-by queries. PQE and APQE support pipelined query processing and exploit the maximum degree of pipeline parallelism. APQE further eliminates the synchronization overhead of multi-node parallelism and returns part of the final result as early as possible when no data dependency exists. Experimental results demonstrate that, compared to the previous 2-step query engine, PQE and APQE significantly improve the performance of group-by queries on large data sets in a shared-nothing cluster system and scale noticeably better.
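As a rough illustration of the contrast the abstract draws, the sketch below compares a 2-step group-by (every node finishes its local aggregation before any merging starts) with a pipelined variant that merges small partial results as they are produced. This is not the paper's PQE or APQE design. The row shape, the function names (`local_aggregate`, `two_step_group_by`, `pipelined_group_by`), the `batch_size` parameter, and the choice of SUM as the aggregate are all assumptions made for the example, and the "nodes" are simulated as in-memory lists.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical row shape: (group key, value to sum).
Row = Tuple[str, int]


def local_aggregate(rows: Iterable[Row]) -> Dict[str, int]:
    """Sum values per key over one node's rows (a partial aggregate)."""
    partial: Dict[str, int] = defaultdict(int)
    for key, value in rows:
        partial[key] += value
    return dict(partial)


def two_step_group_by(partitions: List[List[Row]]) -> Dict[str, int]:
    """Step 1: aggregate each partition fully. Step 2: merge all partials.

    Merging begins only after every partition has been aggregated.
    """
    partials = [local_aggregate(p) for p in partitions]
    merged: Dict[str, int] = defaultdict(int)
    for partial in partials:
        for key, subtotal in partial.items():
            merged[key] += subtotal
    return dict(merged)


def pipelined_group_by(
    partitions: List[List[Row]], batch_size: int = 2
) -> Dict[str, int]:
    """Merge per-batch partial aggregates as each simulated node emits them."""

    def batches(rows: List[Row]) -> Iterator[Dict[str, int]]:
        # Each node emits a partial aggregate for every batch_size rows.
        for start in range(0, len(rows), batch_size):
            yield local_aggregate(rows[start:start + batch_size])

    merged: Dict[str, int] = defaultdict(int)
    streams = [batches(p) for p in partitions]
    # Round-robin over the node streams until all are exhausted.
    while streams:
        still_open = []
        for stream in streams:
            partial = next(stream, None)
            if partial is None:
                continue  # this node has no more batches
            for key, subtotal in partial.items():
                merged[key] += subtotal
            still_open.append(stream)
        streams = still_open
    return dict(merged)
```

Both functions return the same totals for the same input. The pipelined version only changes when merging happens, which is the property a pipelined engine relies on to overlap work across nodes.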

Published in:

Networking, Architecture, and Storage, 2009. NAS 2009. IEEE International Conference on

Date of Conference:

9-11 July 2009