Cart (Loading....) | Create Account
Close category search window
 

Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Mu'alem, A.W. ; Sch. of Comput. Sci. & Eng., Hebrew Univ., Jerusalem, Israel ; Feitelson, D.G.

Scheduling jobs on the IBM SP2 system and many other distributed-memory MPPs is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order in which the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This situation led to the development of the EASY scheduler which uses aggressive backfilling: Small jobs are moved ahead to fill in holes in the schedule, provided they do not delay the first job in the queue. We compare this approach with a more conservative approach in which small jobs move ahead only if they do not delay any job in the queue and show that the relative performance of the two schemes depends on the workload. For workloads typical on SP2 systems, the aggressive approach is indeed better, but, for other workloads, both algorithms are similar. In addition, we study the sensitivity of backfilling to the accuracy of the runtime estimates provided by the users and find a very surprising result. Backfilling actually works better when users overestimate the runtime by a substantial factor

Published in:

Parallel and Distributed Systems, IEEE Transactions on  (Volume:12 ,  Issue: 6 )

Date of Publication:

Jun 2001

Need Help?


IEEE Advancing Technology for Humanity About IEEE Xplore | Contact | Help | Terms of Use | Nondiscrimination Policy | Site Map | Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest professional association for the advancement of technology.
© Copyright 2014 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.