Robust availability of resources is an important issue in distributed computing environments. Resources are typically spread across geographically distributed sites, which raises the probability of failure. This paper therefore proposes a fault tolerance policy for dynamic load balancing in P2P grids to improve the dynamic availability of resources. The proposed policy duplicates jobs on different computing nodes to guard against job or hardware failure. At the same time, it accounts for load balancing among the computing nodes while keeping job turnaround time stable, so it can improve system performance in a varying environment. Experimental results show that the proposed policy can achieve a better job completion rate as the failure rate increases.
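As a rough illustration of the general idea of load-aware job duplication, the following minimal sketch places copies of a job on the least-loaded nodes and treats the job as completed if at least one copy survives. It is not the policy evaluated in the paper: the function names `place_replicas` and `job_completes`, the scalar load model, and the default of two replicas are all assumptions made for this example.

```python
import heapq
from typing import Dict, Iterable, List, Set


def place_replicas(loads: Dict[str, float], job_cost: float, replicas: int = 2) -> List[str]:
    """Pick `replicas` distinct nodes with the lowest current load.

    Hypothetical helper: `loads` maps a node name to a scalar load value,
    and each chosen node's load is increased by `job_cost` in place.
    """
    if replicas > len(loads):
        raise ValueError("not enough nodes for the requested number of replicas")
    # Ties on load are broken by node name so the choice is deterministic.
    chosen = heapq.nsmallest(replicas, loads, key=lambda node: (loads[node], node))
    for node in chosen:
        loads[node] += job_cost
    return chosen


def job_completes(chosen: Iterable[str], failed: Set[str]) -> bool:
    """Assumed completion rule: the job finishes if any replica's node is still alive."""
    return any(node not in failed for node in chosen)


if __name__ == "__main__":
    loads = {"a": 3.0, "b": 1.0, "c": 2.0}
    chosen = place_replicas(loads, job_cost=1.0, replicas=2)
    print(chosen, loads)
    print(job_completes(chosen, failed={"b"}))
```

Choosing the least-loaded nodes for every copy is one simple way to combine duplication with load balancing; the extra copies still add load, so the number of replicas trades completion rate against turnaround time.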