In this paper, we first discuss two critical data structures used in communication-induced checkpointing (CIC) protocols and their distinct roles in guaranteeing the Z-cycle-free (ZCF) property: tracking the checkpoint and communication pattern (CCPAT) in a distributed computation that can lead to Z-cycles, and preventing those cycles from forming. We then present our Transitive Dependency Enabled TimeStamp (TDE_TSS) mechanism, which both timestamps each event and yields transitive dependency information upon receipt of a message. Finally, based on this timestamping mechanism, we present our Fully Informed aNd Efficient (FINE) checkpointing algorithm, which not only improves the performance of the Fully Informed (FI) CIC protocol proposed by Helary et al. but also reduces the overhead of piggybacked information.
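
As orientation for readers new to the topic, the general idea of obtaining transitive dependency information from piggybacked data can be sketched as below. This is a generic dependency-vector illustration under simplifying assumptions (a fixed number of processes, reliable message delivery), not the TDE_TSS mechanism or the FINE algorithm, whose details are not given in this abstract. The class `Process` and its methods are hypothetical names introduced only for this sketch.

```python
from typing import List


class Process:
    """Hypothetical process that tracks checkpoint dependencies in a vector."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        # dep[j] = highest checkpoint index of process j that this process
        # is known to depend on, directly or transitively.
        self.dep: List[int] = [0] * n

    def take_checkpoint(self) -> None:
        # A new local checkpoint advances this process's own entry.
        self.dep[self.pid] += 1

    def send(self) -> List[int]:
        # The piggybacked information is a copy of the current vector.
        return list(self.dep)

    def receive(self, piggyback: List[int]) -> None:
        # Component-wise maximum merges the sender's knowledge, so
        # dependencies learned by the sender propagate transitively.
        self.dep = [max(a, b) for a, b in zip(self.dep, piggyback)]


# Usage: p2 never hears from p0 directly, yet learns of p0's checkpoint via p1.
p0, p1, p2 = Process(0, 3), Process(1, 3), Process(2, 3)
p0.take_checkpoint()
p1.receive(p0.send())
p1.take_checkpoint()
p2.receive(p1.send())
print(p2.dep)
```

The cost of this naive scheme is that every message carries a vector with one entry per process, which is the kind of piggybacking overhead the abstract says FINE aims to reduce.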