System reliability evaluation, sensitivity analysis, importance measures, failure frequency analysis, and optimal design have become important issues in distributed dependable computing. In previous work, the key technique for evaluating the reliability of a distributed computing system (DCS) has been to find all the minimal file spanning trees (MFSTs) while avoiding repeated computation of redundant MFSTs. However, identifying all the disjoint MFSTs is difficult and very time-consuming for large-scale networks. Although existing algorithms have been shown to work well on medium-scale networks, they have two inherent drawbacks. First, they do not support efficient manipulation of Boolean algebra: the sum-of-disjoint-products method they use is inefficient for large Boolean functions. Second, the tree-based partitioning algorithm does not merge isomorphic subproblems, so redundant computations cannot be avoided. We propose a new, efficient algorithm for the reliability evaluation of a DCS based on recursive merge and binary decision diagrams (BDDs). Using the BDD substitution technique, the algorithm can easily be applied to networks with imperfect nodes. Experimental results show a significant improvement in execution time over previous work.
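To make the role of shared subproblems concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It evaluates the probability that at least one spanning tree is fully operational by Shannon decomposition over the links, caching each residual Boolean function so that identical subproblems are solved only once, which is the principle a BDD exploits. The function name `reliability`, the link names, and the input format (trees given as sets of links) are hypothetical, and the sketch assumes independent link failures and perfect nodes.

```python
from functools import lru_cache


def reliability(trees, p):
    """Probability that at least one tree has all of its links working.

    trees: iterable of sets of link names (hypothetical stand-ins for MFSTs).
    p:     dict mapping each link name to its probability of working.
    Assumes links fail independently and nodes are perfect.
    """

    def minimize(terms):
        # Absorption: drop any term that strictly contains another term.
        return frozenset(t for t in terms if not any(o < t for o in terms))

    @lru_cache(maxsize=None)
    def prob(terms):
        # The cache key is the residual function itself, so two branches
        # that reach the same residual function share one computation.
        if not terms:
            return 0.0  # no tree can be completed any more
        if frozenset() in terms:
            return 1.0  # some tree is already fully satisfied
        # Expand on the smallest link name still present (fixed order).
        x = min(v for t in terms for v in t)
        hi = minimize(frozenset(t - {x} for t in terms))       # x works
        lo = minimize(frozenset(t for t in terms if x not in t))  # x fails
        return p[x] * prob(hi) + (1.0 - p[x]) * prob(lo)

    return prob(minimize(frozenset(frozenset(t) for t in trees)))
```

For example, with every link working with probability 0.9, the trees `{a, b}` and `{a, c}` share link `a`, so the result equals 0.9 times the probability that `b` or `c` works. A real BDD package additionally keeps a canonical node table and supports operations such as variable substitution, which this sketch does not attempt.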
Published in:
Dependable Computing, 2004. Proceedings. 10th IEEE Pacific Rim International Symposium on
Date of Conference: 3-5 March 2004