The MAFT architecture for distributed fault tolerance
Kieckhafer, R.M.
Walter, C.J.
Finn, A.M.
Thambidurai, P.M.
Allied Signal Corp., Columbia, MD
This paper appears in: Computers, IEEE Transactions on
Publication Date: Apr 1988
Volume: 37
Issue: 4
On page(s): 398-404
ISSN: 0018-9340
References Cited: 22
CODEN: ITCOB4
INSPEC Accession Number: 3161324
Digital Object Identifier: 10.1109/12.2183
Current Version Published: 2002-08-06
Abstract
A description is given of the multicomputer architecture for fault
tolerance (MAFT), a distributed system designed to provide extremely
reliable computation in real-time control systems. MAFT is based on the
physical and functional partitioning of executive functions from
application functions. The implementation of the executive functions in
a special-purpose hardware processor allows the fault-tolerance
functions to be transparent to the application programs and minimizes
overhead. Byzantine agreement and approximate agreement algorithms are
used for critical system parameters. MAFT supports the use of
multiversion hardware and software to tolerate built-in or generic
faults. Graceful degradation and restoration of the application workload
are permitted in response to the exclusion and readmission of nodes,
respectively.
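
The abstract names approximate agreement without describing the algorithm. As a rough illustration only, the sketch below shows one well-known approximate agreement step, the fault-tolerant midpoint: each node sorts the values it received, discards the f lowest and f highest, and takes the midpoint of what remains. This is not taken from the paper; the function names, the parameter max_faulty, and the sample readings are all assumptions made for the example.

```python
def fault_tolerant_midpoint(values, max_faulty):
    """Drop the max_faulty lowest and highest values, then return the
    midpoint of the remaining extremes (hypothetical helper)."""
    if max_faulty < 0:
        raise ValueError("max_faulty must be non-negative")
    if len(values) <= 2 * max_faulty:
        raise ValueError("need more than 2 * max_faulty values")
    ordered = sorted(values)
    # After trimming, every surviving value lies within the range
    # spanned by the values reported by non-faulty nodes.
    trimmed = ordered[max_faulty:len(ordered) - max_faulty]
    return (trimmed[0] + trimmed[-1]) / 2.0


def approximate_agreement_round(views, max_faulty):
    """Apply the midpoint step to each node's own view of the values."""
    return [fault_tolerant_midpoint(view, max_faulty) for view in views]


# Assumed scenario: three correct nodes hold 9.9, 10.0 and 10.2, and one
# faulty node reports a different value to each of them.
views = [
    [9.9, 10.0, 10.2, 1000.0],   # view at the first correct node
    [9.9, 10.0, 10.2, -1000.0],  # view at the second correct node
    [9.9, 10.0, 10.2, 10.0],     # view at the third correct node
]
results = approximate_agreement_round(views, max_faulty=1)
```

In this scenario each result stays inside the range of the correct readings, and the results lie closer together than the readings did. The function itself only needs more than 2f values, but the standard result for synchronous systems is that convergence under Byzantine faults requires more than 3f nodes.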