Reliability is a major requirement for most safety-related systems. To meet this requirement, fault-tolerant techniques such as hardware replication and software re-execution are often used. In this paper, we tackle the problem of analysis and optimization of fault-tolerant task scheduling for multiprocessor embedded systems. A set of existing fault and process models is adopted, and a Binary Tree Analysis (BTA) is proposed to compute the system-level reliability in the presence of software/hardware redundancy. The BTA is integrated into a multi-objective evolutionary algorithm via a two-step encoding to perform reliability-aware design optimization. The optimization results contain the mapping of tasks to processing elements, the exact task and message schedule, and the fault-tolerance policy assignment. Based on the observation that permanent faults need to be considered together with transient faults to achieve an optimal system design, we propose a virtual mapping technique that takes both types of faults into account. To the best of our knowledge, this is the first approach in fault-tolerant task scheduling that considers permanent and transient faults in a unified manner. The effectiveness of our approach is illustrated using several case studies.
Date of Conference: 9-14 Oct. 2011