Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids

1 Author(s)
Hui Li ; SAP AG - SAP Res., Karlsruhe, Germany

Grid computing proves to be a successful paradigm for large-scale distributed data processing, and global eScience grids have been in production for years (e.g., LCG and OSG). The majority of applications running on these production environments can be characterized as massive CPU-intensive batch jobs (or "bag-of-tasks"), sometimes considered the "killer" application for the grid. A deep understanding of its main workload characteristics is not only necessary for realistic performance evaluation of the existing system, but also crucial to generate new insights into better resource allocation schemes. This paper presents a comprehensive statistical analysis of the workloads on production eScience grid environments. We focus on second-order statistics and the scaling behavior of the main job characteristics, namely job arrivals and job runtimes. A range of autocorrelation structures is identified and analyzed, including pseudoperiodicity, short-range dependence (SRD), and long-range dependence (LRD). We further develop mathematical models that are able to capture these salient properties in the workloads. Workload models, in turn, enable us to quantitatively evaluate the performance impacts of autocorrelations in grid scheduling. The results indicate that autocorrelations in workloads result in system performance degradation; the difference can sometimes be as large as several orders of magnitude. Nevertheless, better performance can be achieved at the grid level under bursty local background workloads. Such effects of workloads on systems are extensively analyzed and explained.
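
To make the notion of autocorrelation structure in a workload more concrete, the following is a minimal sketch of a sample autocorrelation function applied to a series of job-arrival counts. It is not taken from the paper: the function name `autocorrelation`, the estimator, and the `arrivals` data are all illustrative assumptions, and the abstract does not say which estimators the author used. The toy series repeats every four intervals, so the autocorrelation is strongly positive at lag 4 and negative at lag 2, which is the kind of signature a periodic arrival pattern leaves.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag.

    Generic textbook estimator (an assumption, not the paper's method):
    lagged products of the mean-centered series, normalized by the
    lag-0 sum of squares.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


# Hypothetical job-arrival counts per time interval, repeating every 4 intervals.
arrivals = [10, 2, 1, 3] * 6
acf = autocorrelation(arrivals, 8)

for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.3f}")
```

On real traces, a slow decay of these values over many lags, rather than a repeating peak, would be one informal hint of long-range dependence; confirming it needs dedicated estimators that this sketch does not implement.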

Published in:

Parallel and Distributed Systems, IEEE Transactions on (Volume: 21, Issue: 4)