I. Introduction
Rollback-recovery treats a distributed system as a collection of application processes that communicate over a network. The processes have access to a stable storage device that survives all tolerated failures, and they achieve fault tolerance by using this device: recovery information is saved periodically during failure-free execution. When a failure occurs, a failed process uses the saved information to restart the computation from an intermediate state, which reduces the amount of lost computation. The recovery information includes, at a minimum, the states of the participating processes, called checkpoints [6], [7], [10].
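
The checkpoint-and-restart cycle described above can be illustrated with a minimal single-process sketch. It is not taken from the cited works. The names `StableStorage`, `run`, and `demo` are hypothetical, stable storage is approximated by a file in a temporary directory, and the failure is simulated by raising an exception. Message passing and consistency among the checkpoints of several processes are deliberately left out.

```python
import json
import os
import tempfile


class StableStorage:
    """Hypothetical stand-in for stable storage: one JSON file per process."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, pid):
        return os.path.join(self.directory, f"process_{pid}.ckpt")

    def save(self, pid, state):
        # Write to a temporary file and rename it, so that a crash in the
        # middle of a write does not leave a partial checkpoint behind.
        tmp = self._path(pid) + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path(pid))

    def load(self, pid):
        try:
            with open(self._path(pid)) as f:
                return json.load(f)
        except FileNotFoundError:
            return None  # no checkpoint yet: caller starts from the initial state


def run(pid, storage, total_steps, checkpoint_every, crash_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    Returns (result, steps executed in this run). Raises RuntimeError
    when the current step equals crash_at, to simulate a failure.
    """
    # Recovery: resume from the latest checkpoint if one exists.
    state = storage.load(pid) or {"step": 0, "acc": 0}
    executed = 0
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated failure")
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        # Failure-free execution: save the process state at regular intervals.
        if state["step"] % checkpoint_every == 0:
            storage.save(pid, state)
    return state["acc"], executed


def demo():
    with tempfile.TemporaryDirectory() as directory:
        storage = StableStorage(directory)
        try:
            run(0, storage, total_steps=10, checkpoint_every=3, crash_at=7)
        except RuntimeError:
            pass  # the process has failed; its in-memory state is lost
        # Restart: the computation resumes from the checkpoint taken at step 6.
        return run(0, storage, total_steps=10, checkpoint_every=3)
```

In this sketch, `demo()` returns the same sum as an uninterrupted run, while the restarted run re-executes only the steps after the last checkpoint instead of all ten. A shorter checkpoint interval would reduce the lost computation further, at the cost of more writes to storage.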