
IEEE Transactions on Reliability

Issue 4 • October 1990

  • System availability monitoring

    Publication Year: 1990, Page(s):480 - 485
    Cited by:  Papers (12)

    A process set up by Digital to monitor and quantify the availability of its systems is described. The reliability data are collected in an automated manner and stored in a database. The breadth of data gathered provides a unique opportunity to correlate hardware and software failures. In addition, several hypotheses have been tested, e.g. the relationship between crash rate and system load, the int...
    (An illustrative availability calculation in this spirit appears after the results list.)

  • Predicting and eliminating built-in test false alarms

    Publication Year: 1990, Page(s):500 - 505
    Cited by:  Papers (32)

    Failures detected by built-in test equipment (BITE) occur because of BITE measurement noise or bias as well as actual hardware failures. A quantitative approach is proposed for setting built-in test (BIT) measurement limits, and this method is applied to the specific case of a constant-failure-rate system whose BITE measurements are corrupted by Gaussian noise. Guidelines for setting BIT measurement...
    (An illustrative limit-setting sketch under Gaussian noise appears after the results list.)

  • Validating complex computer system availability models

    Publication Year: 1990, Page(s):468 - 479
    Cited by:  Papers (11)  |  Patents (1)

    The authors report on experiences in validating complex computer-system availability models. A validation process, the availability models, and the data-collection process are described. An iteration of the model-validation process is presented, emphasizing discrepancies between the system behavior observed in the data and the behavior implied by the model assumptions. Analysis of data from five sites revealed that interru...

  • Evaluation and design of an ultra-reliable distributed architecture for fault tolerance

    Publication Year: 1990, Page(s):492 - 499
    Cited by:  Papers (18)

    The issues related to the experimental evaluation of an early conceptual prototype of the MAFT (multicomputer architecture for fault tolerance) architecture are discussed. A completely automated testing approach was designed to allow fault-injection experiments to be performed, including stuck-at and memory faults. Over 2000 injection tests were run and the system successfully tolerated all faults...

  • Experimental evaluation of the fault tolerance of an atomic multicast system

    Publication Year: 1990, Page(s):455 - 467
    Cited by:  Papers (26)

    The authors present a study of the validation of a dependable local area network providing multipoint communication services based on an atomic multicast protocol. This protocol is implemented in specialized communication servers that exhibit the fail-silent property, i.e. a kind of halt-on-failure behavior enforced by self-checking hardware. The tests that have been carried out utilize physical ...

  • Simulated fault injection: a methodology to evaluate fault tolerant microprocessor architectures

    Publication Year: 1990, Page(s):486 - 491
    Cited by:  Papers (18)

    A simulation-based fault-injection methodology for validating fault-tolerant microprocessor architectures is described. The approach uses mixed-mode simulation (electrical/logic analysis) and injects transient errors at run time to assess the resulting fault impact. To exemplify the methodology, a fault-tolerant architecture which models the digital aspects of a dual-channel, real-time jet-engine...
    (A toy fault-injection sketch appears after the results list.)

  • A census of Tandem system availability between 1985 and 1990

    Publication Year: 1990, Page(s):409 - 418
    Cited by:  Papers (127)  |  Patents (2)

    A census of customer outages reported to Tandem has been taken; it shows a clear improvement in the reliability of hardware and maintenance. It indicates that software is now the major source of reported outages (62%), followed by system operations (15%). This is a dramatic shift from the statistics for 1985. Even after discounting systematic underreporting of operations and environmental outages, t...
    (The availability sketch after the results list also tallies outage shares by cause.)

  • Error log analysis: statistical modeling and heuristic trend analysis

    Publication Year: 1990, Page(s):419 - 432
    Cited by:  Papers (84)  |  Patents (10)

    Most error-log analysis studies perform a statistical fit to the data assuming a single underlying error process. The authors present the results of an analysis that demonstrates that the log is composed of at least two error processes: transient and intermittent. The mixing of data from multiple processes requires many more events to verify a hypothesis using traditional statistical analysis. Bas...
    (A simple clustering heuristic in this spirit appears after the results list.)

  • A case study of Ethernet anomalies in a distributed computing environment

    Publication Year: 1990, Page(s):433 - 443
    Cited by:  Papers (43)  |  Patents (6)

    Fault detection and diagnosis depend critically on good fault definitions, but the dynamic, noisy, and nonstationary character of networks makes it hard to define what a fault is in a network environment. The authors take the position that a fault or failure is a violation of expectations. In accordance with empirically based expectations, operating behaviors of networks (and other devices) can be...

  • Empirically based analysis of failures in software systems

    Publication Year: 1990, Page(s):444 - 454
    Cited by:  Papers (12)

    An empirical analysis of failures in software systems is used to evaluate several specific issues and questions in software testing, reliability analysis, and reuse. The issues examined include the following: diminishing marginal returns of testing; effectiveness of multiple fault-detection and testing phases; measuring system reliability versus function or component reliability; developer bias re...

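Illustrative Sketches

The availability-monitoring and Tandem-census entries above both reduce logged outage records to availability, MTBF, and outage-share figures. The following Python sketch shows that arithmetic on a hypothetical record format; the field names, cause categories, and numbers are assumptions made for illustration, not data or methods from either paper.

    from dataclasses import dataclass

    @dataclass
    class Outage:
        # One outage record; fields and cause categories are illustrative.
        start_hour: float       # hours since the start of the observation window
        duration_hours: float
        cause: str              # e.g. "software", "hardware", "operations"

    def availability(outages: list[Outage], observed_hours: float) -> float:
        # Fraction of the observation window during which the system was up.
        downtime = sum(o.duration_hours for o in outages)
        return (observed_hours - downtime) / observed_hours

    def mtbf(outages: list[Outage], observed_hours: float) -> float:
        # Mean operating time between failures over the window.
        downtime = sum(o.duration_hours for o in outages)
        return (observed_hours - downtime) / max(len(outages), 1)

    def share_by_cause(outages: list[Outage]) -> dict[str, float]:
        # Fraction of outage events attributed to each cause category.
        counts: dict[str, int] = {}
        for o in outages:
            counts[o.cause] = counts.get(o.cause, 0) + 1
        total = sum(counts.values())
        return {c: n / total for c, n in counts.items()}

    # Hypothetical one-year window with three outages.
    records = [
        Outage(start_hour=800.0,  duration_hours=1.5, cause="software"),
        Outage(start_hour=3100.0, duration_hours=0.5, cause="operations"),
        Outage(start_hour=7200.0, duration_hours=2.0, cause="software"),
    ]
    hours_in_year = 8760.0
    print(f"availability ~ {availability(records, hours_in_year):.5f}")
    print(f"MTBF ~ {mtbf(records, hours_in_year):.0f} h")
    print(share_by_cause(records))
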
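The built-in-test paper above sets BIT measurement limits so that Gaussian measurement noise only rarely triggers a false alarm. A minimal sketch of that trade-off, assuming zero-mean noise with a known standard deviation and using only the Python standard library; the parameter values are illustrative, and this is not the paper's procedure.

    from statistics import NormalDist

    def bit_limit(sigma: float, false_alarm_prob: float) -> float:
        # Symmetric limit +/-L such that a healthy unit, whose measurement error
        # is N(0, sigma), exceeds the limit with probability false_alarm_prob.
        # Two-sided tail: P(|e| > L) = p  =>  L = sigma * z_{1 - p/2}
        z = NormalDist().inv_cdf(1.0 - false_alarm_prob / 2.0)
        return sigma * z

    def missed_detection_prob(limit: float, shift: float, sigma: float) -> float:
        # Probability that a real fault, shifting the measurement mean by 'shift',
        # still falls inside +/-limit and therefore goes undetected.
        faulty = NormalDist(mu=shift, sigma=sigma)
        return faulty.cdf(limit) - faulty.cdf(-limit)

    sigma = 0.2                 # measurement-noise standard deviation (assumed units)
    L = bit_limit(sigma, 1e-4)  # limit for a 1-in-10,000 false-alarm chance per check
    print(f"limit = +/-{L:.3f}")
    print(f"missed-detection prob. for a shift of 1.0 = {missed_detection_prob(L, 1.0, sigma):.2e}")

Wider limits suppress noise-induced alarms but let small real shifts go undetected, which is the trade-off any limit-setting rule has to weigh.
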
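The simulated fault-injection paper above injects transient errors at run time and observes whether the modeled dual-channel architecture tolerates them. At a much coarser level than mixed-mode simulation, the following toy sketch flips a random bit in one channel's result and checks whether a simple compare-and-retry scheme masks the error; the computation, the retry policy, and the trial count are illustrative assumptions, not the paper's experiment.

    import random

    def compute(x: int) -> int:
        # Stand-in for the application computation performed by each channel.
        return (x * 3 + 7) & 0xFFFF      # 16-bit result, purely illustrative

    def inject_transient(value: int, width: int = 16) -> int:
        # Model a transient error as a single random bit flip in a result register.
        return value ^ (1 << random.randrange(width))

    def dual_channel_step(x: int, inject: bool) -> bool:
        # Run both channels; on disagreement, re-execute once (a transient clears).
        a, b = compute(x), compute(x)
        if inject:
            a = inject_transient(a)      # corrupt channel A only
        if a != b:
            a, b = compute(x), compute(x)
        return a == b                    # True if the error was masked

    # Tiny fault-injection campaign: count how many injected transients are tolerated.
    random.seed(0)
    trials = 1000
    tolerated = sum(dual_channel_step(x=i, inject=True) for i in range(trials))
    print(f"{tolerated}/{trials} injected transients tolerated")
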
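The error-log paper above argues that a log mixes at least two error processes, transient and intermittent, and applies heuristic trend analysis to separate them. One very simple heuristic in that spirit, and only in that spirit (it is not the paper's algorithm), flags a device as showing an intermittent trend when several of its errors cluster inside a short window.

    from collections import defaultdict

    def intermittent_devices(events, window_hours=24.0, min_cluster=3):
        # events: iterable of (timestamp_hours, device_id) error-log entries.
        # A device is flagged when min_cluster of its errors fall within
        # window_hours of each other, suggesting a recurring (intermittent)
        # fault rather than isolated transients.
        by_device = defaultdict(list)
        for t, dev in events:
            by_device[dev].append(t)
        flagged = set()
        for dev, times in by_device.items():
            times.sort()
            for i in range(len(times) - min_cluster + 1):
                if times[i + min_cluster - 1] - times[i] <= window_hours:
                    flagged.add(dev)
                    break
        return flagged

    # Hypothetical log: disk7 errors cluster; the other entries look like transients.
    log = [(1.0, "disk7"), (5.5, "mem2"), (7.0, "disk7"), (9.0, "disk7"),
           (140.0, "cpu0"), (400.0, "mem2")]
    print(intermittent_devices(log))     # -> {'disk7'}

The window and cluster-size thresholds are arbitrary here; in practice they would be tuned against field data.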

Aims & Scope

IEEE Transactions on Reliability is concerned with the problems involved in attaining reliability, maintaining it through the life of the system or device, and measuring it.


Meet Our Editors

Editor-in-Chief
W. Eric Wong
University of Texas at Dallas
Advanced Research Center for Software Testing and Quality Assurance

ewong@utdallas.edu