Computational grids allow geographically distributed resources to be shared efficiently, extending the boundaries of what we perceive as distributed computing. Grid nodes, however, are not immune to failure. This raises the problem of how to handle failing nodes and how to schedule and distribute the required work effectively across the participating nodes while, at the same time, providing assurance that the task will complete successfully. Additionally, when a recovery technique is applied to an Economic Grid, the problem of keeping the cost under control arises. In this paper, we propose an enhancement to a fault-tolerant Genetic Algorithm (GA) that uses a checkpoint recovery technique. The enhancement focuses on finding a schedule that tries to minimize the running costs resulting from the overhead of the fault tolerance technique while, at the same time, trying to satisfy the user's quality constraints. The results show that, without taking these factors into account, the running costs of the schedule may become uncontrollable from the grid owner's point of view.