Summary form only given. As grid computing systems gain momentum, integrated support for scheduling and fault tolerance becomes of paramount importance. Unfortunately, fault tolerance has not been factored into the design of most existing grid scheduling strategies. To this end, we propose a fault-tolerant scheduling policy that loosely couples job scheduling with a job replication scheme so that jobs are executed efficiently and reliably. We present a performance evaluation of the proposed fault-tolerant scheduler against a non-fault-tolerant scheduling policy and show that the proposed policy performs reasonably well in the presence of various types of failures.