Computer grids have attracted great attention from both the academic and enterprise communities, becoming an attractive alternative for executing applications that demand huge computational power, since they allow the integration of computational resources spread across different administrative domains. The dynamic nature of the grid infrastructure, its high scalability, and its great heterogeneity increase the likelihood of errors, making fault tolerance a major requirement for grid middleware. This paper describes a flexible fault-tolerance mechanism implemented in the InteGrade grid middleware that allows the customization of several fault-tolerance parameters and the combination of different fault-tolerance techniques. This paper also presents several experiments that measure the benefits of our approach across several different execution environment scenarios.