System availability monitoring

5 Author(s)
P. Moran (Digital Equipment Int. BV, Galway, Ireland); P. Gaffney; J. Melody; M. Condon

A process set up by Digital to monitor and quantify the availability of its systems is described. The reliability data are collected in an automated manner and stored in a database. The breadth of data gathered provides a unique opportunity to correlate hardware and software failures. In addition, several hypotheses have been tested, e.g. the relationship between crash rate and system load, the interdependence of crashes, the cause of crashes, and the effect of new releases of the operating system. It is concluded that the process (in operation since 1988) has yielded worthwhile information on the products monitored. The usual availability metrics are calculated regularly for the machines monitored. Trends in system fault occurrence have been identified, leading to suggestions for both software and hardware improvements. The monitoring process and analysis methodology are revised on an ongoing basis to improve the quality of information obtained and to extend the analysis to Digital's new systems. The recently announced VAX9000 mainframe and fault-tolerant VAXft 3000 are two such systems.
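
The abstract does not say which availability metrics are calculated. As an illustration only, the sketch below assumes the conventional definitions of mean time between failures (MTBF), mean time to repair (MTTR), and availability as the fraction of the observation window spent up. The function name, its parameters, and the sample figures are hypothetical and are not taken from the paper.

```python
def availability_metrics(window_hours, outage_durations_hours):
    """Compute conventional availability metrics for one monitored machine.

    window_hours: length of the observation window, in hours (assumed input).
    outage_durations_hours: duration of each recorded outage, in hours.
    These definitions are the textbook ones, not necessarily the paper's.
    """
    downtime = sum(outage_durations_hours)
    uptime = window_hours - downtime
    failures = len(outage_durations_hours)

    # With no recorded failures, MTBF is unbounded and MTTR is zero.
    mtbf = uptime / failures if failures else float("inf")
    mttr = downtime / failures if failures else 0.0

    return {
        "mtbf_hours": mtbf,
        "mttr_hours": mttr,
        "availability": uptime / window_hours,
    }


# Hypothetical example: a 30-day window (720 h) with three outages.
metrics = availability_metrics(720.0, [2.0, 4.0, 6.0])
print(metrics)
```

Under these definitions, availability also equals MTBF / (MTBF + MTTR) whenever at least one failure is recorded, which gives a quick consistency check on the computed values.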

Published in:

IEEE Transactions on Reliability (Volume: 39, Issue: 4)