Skip to Main Content
In this paper, we consider distributed systems that can be modeled as finite state machines with known behavior under fault-free conditions, and we study the detection of a general class of faults that manifest themselves as permanent changes in the next-state transition functionality of the system. This scenario could arise in a variety of situations encountered in communication networks, including faults occurred due to design or implementation errors during the execution of communication protocols. In our approach, fault diagnosis is performed by an external observer/diagnoser that functions as a finite state machine and which has access to the input sequence applied to the system but has only limited access to the system state or output. In particular, we assume that the observer/diagnoser is only able to obtain partial information regarding the state of the given system at intermittent time intervals that are determined by certain synchronizing conditions between the system and the observer/diagnoser. By adopting a probabilistic framework, we analyze ways to optimally choose these synchronizing conditions and develop adaptive strategies that achieve a low probability of aliasing, i.e., a low probability that the external observer/diagnoser incorrectly declares the system as fault-free. An application of these ideas in the context of protocol testing/classification is provided as an example.