Skip to Main Content
Many state-of-the-art approaches on fault-tolerant system design make the simplifying assumption that all faults are detected within a certain time interval. However, based on a detailed experimental analysis, we observe that perfect fault detection is not only an impractical assumption but even if implementable also a suboptimal design decision. This paper presents an approach that takes imperfect fault detection into account. Novel analysis and optimization techniques are developed, which distinguish detectable and undetectable faults in the overall workflow. Besides synthesizing the task schedules, our approach also decides which of the available fault detectors is selected for each task instance. Experimental results show that our approach finds solutions with several orders of magnitude higher reliability than current approaches.