Skip to Main Content
We present online tunable diagnostic and membership protocols for generic time-triggered (TT) systems to detect crashes, send/receive omission faults, and network partitions. Compared to existing diagnostic and membership protocols for TT systems, our protocols do not rely on the single-fault assumption and also tolerate non-fail-silent (Byzantine) faults. They run at the application level and can be added on top of any TT system (possibly as a middleware component) without requiring modifications at the system level. The information on detected faults is accumulated using a penalty/reward algorithm to handle transient faults. After a fault is detected, the likelihood of node isolation can be adapted to different system configurations, including configurations where functions with different criticality levels are integrated. All protocols are formally verified using model checking. Using actual automotive and aerospace parameters, we also experimentally demonstrate the transient fault handling capabilities of the protocols.
Dependable and Secure Computing, IEEE Transactions on (Volume:8 , Issue: 2 )
Date of Publication: March-April 2011