Skip to Main Content
Existing schemes for software fault-tolerance are based on the ideas of redundancy and diversity. Although being experimentally tested valid, existing fault-tolerant schemes are mainly ad hoc and lack theoretically rigorous foundation. They substantially increase software complexity and incur high development costs. They also impose challenges for real-time concurrent software systems where timing requirements may be stringent and faults in concurrent processes can propagate one another. In This work we treat software fault-tolerance as a robust supervisory control (RSC) problem and propose a RSC approach to software fault-tolerance. In this approach the software component under consideration is treated as a controlled object that is modeled as a generalized Kripke structure or finite-state concurrent system, and an additional safety guarder or supervisor is synthesized and compounded to the software component to guarantee the correctness of the overall software system, which is aimed to satisfy a temporal logic (CTL*) formula, even if faults occur to the software component. The proposed RSC approach requires only a single version of software and is based on a theoretically rigorous foundation. It is essentially an approach of model construction and thus complementary to the approach of model checking. It is a contribution to the theory of supervisory control, software fault-tolerance as well as the emerging area of software cybernetics that explores the interplay between software and control.