Scheduled System Maintenance:
Some services will be unavailable Sunday, March 29th through Monday, March 30th. We apologize for the inconvenience.
By Topic

The effect of program behavior on fault observability

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Bowen, N.S. ; IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA ; Pradhan, D.K.

Fault observability based on the behavior of memory references is studied. Traditional studies view memory as one monolithic entity that must completely work to be considered reliable. The usage patterns of a particular program's memory are emphasized here. This paper develops a new model for the successful execution of a program taking into account the usage of the data by extending a cache memory performance model. Three variations, based on well known allocation schemes, are presented (i.e., whether the program's storage is preallocated, dynamically allocated, or constrained in allocation). This is contrasted to traditional memory reliability calculations to show that the actual mean time to failure may be more optimistic when program behavior is considered. It also develops expressions for the probability of unobserved faults. With several studies reporting correlations between increased workloads and increased failure rates, a new theory is proposed here that provides an explanation for this behavior. The model studies several program traces demonstrating that increased workloads could cause an increase of the observed failure rates in the range of 32% to 53%

Published in:

Computers, IEEE Transactions on  (Volume:45 ,  Issue: 8 )