The authors examine the use of data flow information in algorithm-based fault tolerance for multiprocessor computations. They show that this information can lead to a more efficient design by reducing the number of checks. They investigate both the analysis problem of determining the fault tolerance measures of a given system and the design problem of constructing compact check sets to detect or locate a given number of faults. Specifically, they show that the analysis problem can be solved efficiently when the number of faults is fixed, and they address its computational difficulty when the number of faults is not fixed. For the design problem, the authors consider special classes of data flow graphs and establish upper and lower bounds on the number of checks needed to detect or locate a given number of faults.
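
To illustrate why fixing the number of faults can make the analysis problem tractable, the following is a minimal sketch under a deliberately simplified model. It is not the authors' algorithm. It assumes that each processor corrupts every data element it produces when faulty, and that a check reliably fails only when it covers between 1 and `g` corrupted elements. The names `produces`, `checks`, `g`, `detects`, and `is_t_fault_detecting` are hypothetical. With `n` processors there are O(n^t) fault sets of size at most `t`, so exhaustive enumeration is polynomial in `n` when `t` is fixed.

```python
from itertools import combinations


def detects(fault_set, produces, checks, g=1):
    """Return True if at least one check reliably flags this fault set.

    Assumed model (not from the paper): a faulty processor corrupts all
    data elements it produces, and a check fires reliably only when it
    covers between 1 and g corrupted elements.
    """
    corrupted = set()
    for processor in fault_set:
        corrupted |= produces[processor]
    return any(1 <= len(cover & corrupted) <= g for cover in checks)


def is_t_fault_detecting(produces, checks, t, g=1):
    """Enumerate every nonempty fault set of size at most t.

    The number of such sets is O(n^t) for n processors, which is
    polynomial in n when t is a fixed constant.
    """
    processors = sorted(produces)
    for size in range(1, t + 1):
        for fault_set in combinations(processors, size):
            if not detects(fault_set, produces, checks, g):
                return False
    return True


# Hypothetical system: three processors, one data element each,
# and two checks that overlap on d2.
produces = {"p1": {"d1"}, "p2": {"d2"}, "p3": {"d3"}}
checks = [{"d1", "d2"}, {"d2", "d3"}]

result_t2 = is_t_fault_detecting(produces, checks, t=2)
result_t3 = is_t_fault_detecting(produces, checks, t=3)
```

In this toy system every fault set of size one or two leaves some check with exactly one corrupted element, so `result_t2` is `True`. When all three processors fail, both checks cover two corrupted elements and neither is guaranteed to fire, so `result_t3` is `False`. When `t` is part of the input rather than a constant, the number of fault sets to enumerate is no longer polynomially bounded, which is consistent with the abstract treating that case separately.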