An important problem in fault-tolerant distributed computer systems is maintaining agreement among nonfaulty processes in the presence of undiagnosed faults. Approximate agreement relaxes the requirement that the agreed values be numerically identical: processes need only agree with each other to within a predefined numerical tolerance. Convergent voting algorithms that achieve approximate agreement have been studied for two classes of systems, synchronous and asynchronous, and for both completely connected and partially connected systems. Together, the two properties of synchrony and connectivity yield four voting domains. In all studies to date, each voting domain has been treated as a separate problem. This paper shows that, for at least one broad family of voting algorithms, the four domains are special cases of a more general convergent voting problem; analyzes convergent voting under the 3-mode hybrid fault model of Thambidurai and Park; and presents a set of unifying relations applicable to all four voting domains. These relations are used to specify voting algorithms that optimize fault tolerance, convergence rate, or computational overhead in any given voting domain. The task of designing a voting algorithm for a particular fault-tolerant system is thus greatly simplified.
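To make the idea of approximate agreement concrete, the following is a minimal sketch of one round of convergent voting in a synchronous, completely connected system. It uses a generic fault-tolerant midpoint vote (sort the received values, discard the t lowest and t highest, take the midpoint of what remains). This is an illustrative assumption and is not claimed to be the algorithm family analyzed in the paper; the fault behavior is a single sender reporting different values to different receivers, not the 3-mode hybrid fault model. All names (ft_midpoint, run_round, spread) are hypothetical.

```python
from typing import List, Sequence


def ft_midpoint(values: Sequence[float], t: int) -> float:
    """Discard the t lowest and t highest values, then return the
    midpoint of the remaining extremes (an assumed voting function)."""
    if len(values) <= 2 * t:
        raise ValueError("need more than 2*t values to vote")
    kept = sorted(values)[t:len(values) - t]
    return (kept[0] + kept[-1]) / 2.0


def run_round(correct: List[float],
              faulty_reports: List[List[float]],
              t: int) -> List[float]:
    """One synchronous round in a completely connected system.

    correct[i] is the current value of nonfaulty process i, which every
    process receives unchanged. faulty_reports[i] holds the values that
    faulty senders show to process i, and may differ between receivers.
    """
    return [ft_midpoint(correct + faulty_reports[i], t)
            for i in range(len(correct))]


def spread(values: Sequence[float]) -> float:
    """Largest disagreement between any two values."""
    return max(values) - min(values)


if __name__ == "__main__":
    values = [0.0, 1.0, 2.0, 3.0]                   # four nonfaulty processes
    lies = [[-100.0], [-100.0], [100.0], [100.0]]   # one two-faced sender
    tolerance = 0.5                                 # assumed agreement tolerance
    rounds = 0
    # Repeat voting rounds until nonfaulty values agree within tolerance.
    while spread(values) > tolerance and rounds < 10:
        values = run_round(values, lies, t=1)
        rounds += 1
        print(rounds, values, spread(values))
```

In this sketch the spread of the nonfaulty values shrinks each round while every voted value stays within the range of the initial nonfaulty values, which is the convergence behavior that approximate agreement requires.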
Published in: IEEE Transactions on Reliability (Volume 44, Issue 4)
Date of Publication: Dec 1995