Improved disk-drive failure warnings

Author(s): G. F. Hughes (Center for Magnetic Recording Res., California Univ., San Diego, La Jolla, CA, USA); J. F. Murray; K. Kreutz-Delgado; C. Elkan

Improved methods are proposed for disk-drive failure prediction. The SMART (self-monitoring and reporting technology) failure prediction system is currently implemented in disk drives. Its purpose is to predict the near-term failure of an individual hard disk drive and to issue a backup warning to prevent data loss. Two experimental tests of SMART show only moderate accuracy at low false-alarm rates. (A false-alarm rate of 0.2% of total drives per year implies that 20% of drive returns would be good drives, relative to the ≈1% annual failure rate of drives.) This requirement for very low false-alarm rates is well known in medical diagnostic tests for rare diseases, and the methodology used there suggests ways to improve SMART. Two improved SMART algorithms are proposed. They use the SMART internal drive attribute measurements available in present drives. The present warning algorithm, based on maximum error thresholds, is replaced by distribution-free statistical hypothesis tests. These improved algorithms are computationally simple enough to be implemented in drive microprocessor firmware code. They require only integer sort operations to put several hundred attribute values in rank order. Some tens of these ranks are added up, and the SMART warning is issued if the sum exceeds a prestored limit. These new algorithms were tested on 3744 drives of 2 models. They gave 3-4 times higher correct prediction accuracy than error thresholds on will-fail drives, at a 0.2% false-alarm rate. The highest accuracies achievable are modest (40%-60%). Care was taken to test will-fail drive prediction accuracy on data independent of the algorithm design data. Additional work is needed to verify and apply these algorithms in actual drive design. They can also be useful in drive failure analysis engineering. It might be possible to screen drives in manufacturing using SMART attributes. Marginal drives might be detected before substantial final test time is invested in them, thereby decreasing manufacturing cost and possibly decreasing overall field failure rates.
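
The rank-and-sum step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' firmware code. It assumes a stored set of reference attribute values from known-good drives, a handful of recent readings from the drive under test, and a warning limit chosen offline; the function names, the sample values, and the tie-breaking rule (recent readings sort first on ties) are all hypothetical choices made for illustration.

```python
def rank_sum(reference, recent):
    """Sum of the ranks held by `recent` readings in the pooled, sorted sample.

    Uses only an integer sort and integer addition, in the spirit of the
    abstract. The tie rule is an assumption: on equal values, recent
    readings (tag 0) sort before reference values (tag 1).
    """
    pooled = [(value, 1) for value in reference] + [(value, 0) for value in recent]
    pooled.sort()  # rank order: position 1 is the smallest value
    return sum(rank for rank, (_, tag) in enumerate(pooled, start=1) if tag == 0)


def smart_warning(reference, recent, limit):
    """Issue a warning when the rank sum exceeds a prestored limit (hypothetical)."""
    return rank_sum(reference, recent) > limit


# Hypothetical integer attribute values, e.g. error counts per interval.
reference = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
healthy_recent = [0, 0, 0]
suspect_recent = [10, 11, 12]
LIMIT = 30  # illustrative only; a real limit would be set from the target false-alarm rate

print(smart_warning(reference, healthy_recent, LIMIT))
print(smart_warning(reference, suspect_recent, LIMIT))
```

Because only the ordering of the values matters, the test makes no assumption about the shape of the attribute distribution, which is what makes it distribution-free. In practice the limit would be derived from the acceptable false-alarm rate (0.2% per year in the abstract) rather than picked by hand as it is here.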

Published in:

IEEE Transactions on Reliability (Volume: 51, Issue: 3)