Adapting Partitioned Continuous Query Processing in Distributed Systems

2 Author(s)
Yali Zhu ; Worcester Polytech. Inst., Worcester ; Rundensteiner, E.A.

Partitioned query processing is an effective method to process continuous queries with large stateful operators in distributed systems. This method typically partitions input data into non-overlapping portions, with each query plan instance installed on a separate machine and processing only one portion of the data. Dynamic redistribution of load among machines is then employed to handle fluctuating stream characteristics. However, existing load redistribution solutions make the implicit assumption that no local query optimization is conducted at runtime on any of the participating machines, i.e., all local query plan instances are static and thus remain identical. This is restrictive for dynamic stream systems, where data partitions may experience significant fluctuations in selectivities or arrival rates over time, thus warranting local plan reoptimization. Local reoptimization raises a new problem: load redistribution must cope with heterogeneous plan shapes across machines. To address this, we propose two new load balancing strategies, along with corresponding protocols, that balance the workload across a set of machines while seamlessly handling the complexity caused by local plan changes. The PTLB strategy is plan-agnostic, requiring no detailed knowledge of the underlying query plan. The MSLB strategy is plan-aware; that is, it rebalances the load by comparing the plan shape differences on the participating machines. All proposed techniques have been implemented in the DCAPE continuous query system. Our experiments demonstrate that applying both query optimization and load balancing yields superior performance compared to applying either adaptation technique alone, as has been the state of the art in the current literature. Our evaluation also compares the relative applicability and efficiency of the two proposed techniques, PTLB and MSLB.
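The general idea of partitioned processing with dynamic load redistribution can be illustrated with a minimal sketch. This is not the PTLB or MSLB protocol, whose details the abstract does not give; it is a generic greedy rebalancer written under stated assumptions. The function names (`partition_of`, `machine_loads`, `rebalance`), the CRC32-based partitioning, the per-partition load figures, and the 20% imbalance threshold are all hypothetical choices made for illustration.

```python
import zlib


def partition_of(key, num_partitions):
    """Map a tuple's key to one of num_partitions non-overlapping partitions.

    Assumption: a deterministic CRC32 hash stands in for whatever
    partitioning function a real system would use.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def machine_loads(placement, partition_load):
    """Sum the load of the partitions currently placed on each machine."""
    return {m: sum(partition_load[p] for p in parts)
            for m, parts in placement.items()}


def rebalance(placement, partition_load, threshold=0.2):
    """Greedily move partitions from the most to the least loaded machine.

    Stops once the relative gap between the two is at most `threshold`
    (a hypothetical tuning knob). Mutates `placement` in place and
    returns the list of (partition, source, destination) moves.
    """
    moves = []
    while True:
        loads = machine_loads(placement, partition_load)
        src = max(loads, key=loads.get)
        dst = min(loads, key=loads.get)
        gap = loads[src] - loads[dst]
        if loads[src] == 0 or gap / loads[src] <= threshold:
            break
        # Only moves that strictly narrow the gap are considered.
        candidates = [p for p in placement[src]
                      if 0 < partition_load[p] < gap]
        if not candidates:
            break
        # Prefer the partition whose load is closest to half the gap.
        best = min(candidates, key=lambda p: abs(partition_load[p] - gap / 2))
        placement[src].remove(best)
        placement[dst].append(best)
        moves.append((best, src, dst))
    return moves


if __name__ == "__main__":
    # Hypothetical load per partition, e.g. tuples per second.
    partition_load = {0: 40, 1: 30, 2: 20, 3: 10, 4: 5, 5: 5}
    placement = {"m1": [0, 1, 2, 3], "m2": [4, 5]}
    print(rebalance(placement, partition_load))
    print(machine_loads(placement, partition_load))
```

This sketch looks only at per-partition load and never inspects a query plan, so it is plan-agnostic in the loose sense of the word. A plan-aware strategy would additionally need information about each machine's local plan shape, which this sketch does not model.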

Published in:

Data Engineering Workshop, 2007 IEEE 23rd International Conference on

Date of Conference:

17-20 April 2007