A Combined Dual-stage Framework for Robust Scheduling of Scientific Applications in Heterogeneous Environments with Uncertain Availability

6 Author(s)
Ciorba, F.M. ; Center for Inf. Services & High Performance Comput., Tech. Univ. Dresden, Dresden, Germany ; Hansen, T. ; Srivastava, S. ; Banicescu, I.

Scheduling parallel applications on existing or emerging computing platforms is challenging, and the resulting schedules must be, among other attributes, efficient and robust. This paper proposes a dual-stage framework to evaluate the robustness of efficient resource allocation and dynamic load balancing of scientific applications in heterogeneous computing environments with uncertain availability. The first stage employs robust resource allocation heuristics, while the second stage incorporates robust dynamic loop scheduling techniques. Combined, the two stages form a comprehensive framework that enables, and provides guarantees for, the robust execution of scientific applications in computing systems where uncertainty is caused by various unpredictable perturbations. The paper reports on studies for determining the best techniques to use in each stage that: (a) maximize the probability that the system makespan satisfies a deadline, and (b) minimize the system makespan for every given availability level in the system. The usefulness and benefits of the proposed framework are demonstrated via a small-scale example.
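
Objective (a) can be illustrated with a minimal sketch. This is not the paper's method: it assumes a fixed allocation of work to processors, models each processor's availability as a draw from a small discrete set of levels, and estimates by Monte Carlo sampling the probability that the makespan meets a deadline. The function names, the workload figures, and the availability levels are all hypothetical and chosen only for illustration.

```python
import random


def makespan(work, availability):
    """Finishing time of the slowest processor.

    work[i] is the time processor i needs at full availability (assumed model);
    availability[i] in (0, 1] scales its effective speed.
    """
    return max(w / a for w, a in zip(work, availability))


def prob_deadline_met(work, availability_levels, deadline, trials=10_000, seed=0):
    """Monte Carlo estimate of P(makespan <= deadline).

    availability_levels[i] lists the equally likely availability levels
    of processor i (a hypothetical uncertainty model).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        sampled = [rng.choice(levels) for levels in availability_levels]
        if makespan(work, sampled) <= deadline:
            hits += 1
    return hits / trials


# Hypothetical two-processor allocation.
work = [40.0, 60.0]
availability_levels = [[1.0, 0.5], [1.0, 0.75]]
estimate = prob_deadline_met(work, availability_levels, deadline=70.0)
print(f"Estimated probability of meeting the deadline: {estimate:.3f}")
```

In this toy setting only one of the four equally likely availability combinations (both processors fully available) yields a makespan within the deadline of 70, so the estimate should lie close to 0.25. Comparing such estimates across candidate allocations is one simple way to rank them by robustness under this assumed model.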

Published in:

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International

Date of Conference:

21-25 May 2012