Skip to Main Content
Computational grids provide a massive source of processing power, providing the means to support processor intensive applications. The strong burstiness and unpredictability of the available resources raise the need to make applications robust against the dynamics of grid environment. The two main techniques that are most suitable to cope with the dynamic nature of the grid are load balancing and job replication. In this work, we develop a load-balancing algorithm by juxtaposes the strong points of neighbor-based and cluster-based load-balancing methods. We then integrate the proposed load-balancing approach with fault-tolerant scheduling namely MinRC and develop a performance-driven fault-tolerant load-balancing algorithm or PD_MinRC for independent jobs. In order to improve system flexibility, reliability, and save system resource, PD_MinRC employs passive replication scheme. Our main objective is to arrive at job assignments that could achieve minimum response time, maximum resource utilization, and a well-balanced load across all the resources involved in a grid. Experiments were conducted to show the applicability of PD_MinRC. One advantage of our approach is the relatively low overhead and robust performance against resource failures and inaccuracies in performance prediction information.