Reliability of checkpointed real-time systems using time redundancy
Krishna, C.M.; Singh, A.D.
Reliability, IEEE Transactions on
Volume 42, Issue 3, Sep 1993 Page(s):427 - 435
Digital Object Identifier 10.1109/24.257826
Summary: Real-time computers are often used in embedded, life-critical
applications where high reliability is important. A common approach to
making such systems dependable is to vote on redundant processors
executing multiple copies of the same task. The processors
which make up such voted systems are subjected not only to independently
occurring permanent and transient failures, but also to correlated
transients brought about by electromagnetic interference from the
operating environment. To counteract these transients, checkpointing and
time redundancy are required, in addition to processor redundancy. This
work analyzes the use of time and device redundancy in systems subject
to correlated failure. The tradeoffs in checkpoint placement in such a
system are found to be considerably different from those for
non-redundant systems without real-time constraints. The authors compare
fault-tolerant designs with and without a rollback capability, accounting for
the increased hardware-failure rate due to processor duplication when
faults are detected in hardware, and the doubled execution times when
detection is implemented in software.