In most parallel supercomputers, submitting a job for execution involves specifying how many processors are to be allocated to the job. When the job is moldable (i.e., there is a choice of how many processors the job uses), an application scheduler called SA can significantly improve job performance by automatically selecting how many processors to use. Since most jobs are moldable, this result has a great impact on the current state of practice in supercomputer scheduling. However, the widespread use of SA can change the nature of the workload processed by supercomputers. When many SAs are scheduling jobs on one supercomputer, the decision made by one SA affects the state of the system and therefore impacts other instances of SA. In this case, the global behavior of the system emerges from the aggregate behavior of all SAs. In particular, it is reasonable to expect the competition for resources to become tougher with multiple SAs, and this tougher competition to reduce the performance improvement attained by each SA individually. This paper investigates this very issue. We found that the increased competition indeed makes it harder for each individual instance of SA to improve job performance. Nevertheless, two other aggregate behaviors override the increased competition when the system load is moderate to heavy. First, as load goes up, SA chooses smaller requests, which increases efficiency; this effectively decreases the offered load and so mitigates long wait times. Second, better job packing and fewer jobs in the system make it easier for incoming jobs to fit in the supercomputer schedule, thus reducing wait times further. As a result, in moderate to heavy load conditions, a single instance of SA benefits from the fact that other jobs are also using SA.
IEEE Transactions on Parallel and Distributed Systems (Volume: 14, Issue: 2)
Date of Publication: Feb 2003