As computer and communication systems become more complex, it becomes increasingly difficult to analyze their hardware reliability, because simple models can fail to adequately capture subtle but important features. This paper describes several ways the authors have addressed this problem for analyses based upon White's SURE theorem. They show: how reliability analysis based on SURE mathematics can attack very large problems by accepting recomputation in order to reduce memory usage; how such analysis can be parallelized, both on multiprocessors and on networks of ordinary workstations, with excellent performance gains; how the SURE theorem supports efficient Monte Carlo based estimation of reliability; and the advantages of the method. Empirical studies of large models solved using these methods show that they are effective in reducing the solution time of large, complex problems.
Published in: IEEE Transactions on Reliability (Volume: 44, Issue: 1)
Date of Publication: Mar 1995