By Topic

Reconciling scratch space consumption, exposure, and volatility to achieve timely staging of job input data

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Henry M. Monti ; Dept. of Computer Science, Virginia Tech. Blacksburg, Virginia, USA ; Ali R. Butt ; Sudharshan S. Vazhkudai

Innovative scientific applications and emerging dense data sources are creating a data deluge for high-end computing systems. Processing such large input data typically involves copying (or staging) onto the supercomputer's specialized high-speed storage, scratch space, for sustained high I/O throughput. The current practice of conservatively staging data as early as possible makes the data vulnerable to storage failures, which may entail re-staging and consequently reduced job throughput. To address this, we present a timely staging framework that uses a combination of job startup time predictions, user-specified intermediate nodes, and decentralized data delivery to coincide input data staging with job start-up. By delaying staging to when it is necessary, the exposure to failures and its effects can be reduced. Evaluation using both PlanetLab and simulations based on three years of Jaguar (No. 1 in Top500) job logs show as much as 85.9% reduction in staging times compared to direct transfers, 75.2% reduction in wait time on scratch, and 2.4% reduction in usage/hour.

Published in:

Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on

Date of Conference:

19-23 April 2010