On the Fly Estimation of the Processes that Are Alive in an Asynchronous Message-Passing System

3 Author(s)
Achour Mostefaoui (IRISA, Université de Rennes, Campus de Beaulieu, France); Michel Raynal; Gilles Tredan

It is well known that in an asynchronous system where processes are prone to crash, it is impossible to design a protocol that provides each process with the set of processes that are currently alive. This impossibility arises because a crashed process cannot be distinguished from a process that is very slow or whose communications are very slow. Nevertheless, designing protocols that provide the processes with good approximations of the set of processes that are currently alive remains a real challenge in fault-tolerant distributed computing. This paper proposes such a protocol, plus a second protocol that copes with heterogeneous communication networks. These protocols consider a realistic computation model in which the processes are provided with nonsynchronized local clocks and a function alpha() that takes a local duration Delta as a parameter and returns an integer that is an estimate of the number of processes that could have crashed during that duration Delta. A simulation-based experimental evaluation of the proposed protocols is also presented. These experiments show that the protocols are practically relevant.
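To make the role of alpha() concrete, the following is a minimal illustrative sketch and not the protocol from the paper. It assumes a hypothetical linear crash model for alpha() and a hypothetical query/response round in which a process waits for replies from all but alpha(Delta) members of its previous estimate. The names `make_linear_alpha`, `responses_needed`, and `next_estimate` are invented for this illustration.

```python
import math
from typing import Callable, Iterable, Optional, Set


def make_linear_alpha(crash_rate: float, n: int) -> Callable[[float], int]:
    """Build a hypothetical alpha(): assumes at most `crash_rate` crashes
    per unit of local time, capped at the system size n."""
    def alpha(delta: float) -> int:
        return min(n, math.ceil(crash_rate * max(delta, 0.0)))
    return alpha


def responses_needed(prev_alive: Set[str], delta: float,
                     alpha: Callable[[float], int]) -> int:
    """Replies to wait for: everyone in the previous estimate except the
    alpha(delta) processes that may have crashed during delta."""
    return max(0, len(prev_alive) - alpha(delta))


def next_estimate(prev_alive: Set[str], responders: Iterable[str],
                  delta: float,
                  alpha: Callable[[float], int]) -> Optional[Set[str]]:
    """Return the new alive estimate once enough replies have arrived,
    or None if the process should keep waiting."""
    heard = set(responders) & prev_alive
    if len(heard) >= responses_needed(prev_alive, delta, alpha):
        return heard
    return None


if __name__ == "__main__":
    alpha = make_linear_alpha(crash_rate=0.5, n=5)
    previous = {"p1", "p2", "p3", "p4", "p5"}
    # Delta is measured on the local clock only; no synchronization is assumed.
    print(next_estimate(previous, ["p1", "p2"], 3.0, alpha))
    print(next_estimate(previous, ["p1", "p2", "p4"], 3.0, alpha))
```

In this sketch a longer Delta lets the process proceed with fewer replies, which trades accuracy of the estimate for not blocking on processes that may have crashed.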

Published in:

IEEE Transactions on Parallel and Distributed Systems (Volume: 20, Issue: 6)