Although a self-stabilizing system that suffers a transient fault is guaranteed to converge to a legitimate state after a finite number of steps, convergence can be slow if the harmful effects of the fault are allowed to propagate to many processes in the system. Moreover, some safety properties of the system may be violated during convergence. To address these problems, we propose in this paper the concept of a state checksum: redundancy that can be added to the state of a self-stabilizing system so that some classes of faults become visible to the system, which can then limit the propagation of their harmful effects and maintain its safety properties during convergence. To make these concepts concrete, we present a case study of a token ring and show how to use fault-detecting and fault-correcting checksums to detect visible faults, limit the propagation of their harmful effects, and ensure that the safety properties of the ring are maintained during convergence from these faults.
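As a rough illustration of the idea, the sketch below attaches a fault-detecting checksum to each process in a Dijkstra-style K-state token ring. It is not the construction used in the paper: the ring protocol, the toy `checksum` function, the constant `K`, and the names `Proc`, `make_proc`, `has_token`, and `step` are all assumptions chosen for illustration. It shows only detection: a process refuses to act on a predecessor whose stored checksum does not match its value, so a corrupted value is not copied further around the ring.

```python
from dataclasses import dataclass

K = 5  # number of counter values (assumption: at least the ring size)


def checksum(value: int) -> int:
    """Toy checksum (hypothetical); injective on 0..K-1 so any value change is visible."""
    return (3 * value + 1) % 7


@dataclass
class Proc:
    value: int  # the protocol counter
    check: int  # redundancy stored alongside the counter

    def is_valid(self) -> bool:
        return self.check == checksum(self.value)


def make_proc(value: int) -> Proc:
    return Proc(value, checksum(value))


def has_token(ring: list, i: int) -> bool:
    """Dijkstra's privilege rule; ring[i - 1] wraps around for i == 0."""
    pred = ring[i - 1]
    if i == 0:
        return ring[0].value == pred.value
    return ring[i].value != pred.value


def step(ring: list, i: int) -> bool:
    """Let process i move once; return True if it changed its state."""
    pred = ring[i - 1]
    if not pred.is_valid():
        return False  # visible fault in the predecessor: do not propagate it
    if not has_token(ring, i):
        return False
    new_value = (pred.value + 1) % K if i == 0 else pred.value
    ring[i] = make_proc(new_value)
    return True


ring = [make_proc(0) for _ in range(4)]  # legitimate state: only process 0 holds the token
step(ring, 0)                            # process 0 passes the token on
ring[0].value = 3                        # transient fault: value changes, checksum does not
blocked = not step(ring, 1)              # process 1 sees the mismatch and stays put
```

In this sketch a fault that corrupts the value and the checksum consistently would remain invisible, which matches the abstract's point that only some classes of faults become visible. Correcting the fault, rather than merely containing it, would need additional redundancy that is not modeled here.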