Skip to Main Content
We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the uniform consensus (UC) and real-time UC problems, assuming processor crashes and reliable communications, considering asynchronous solutions based upon unreliable failure detectors. We introduce the concept of fast failure detectors and show that the problem of building strong or perfect fast failure detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given and illustrated considering deterministic Ethernets. In passing, it is shown that, with our construction of unreliable failure detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound. Finally, we introduce FastUC, a novel solution to UC, that is based upon fast failure detectors.
Date of Publication: Aug 2002