
System-level hardware failure prediction using deep learning

Abstract:

Disk and memory faults are the leading causes of server breakdown. A proactive solution is to predict such hardware failures at runtime and then isolate the hardware at risk and back up the data. However, current model-based predictors are incapable of using discrete time-series data, such as the values of device attributes, which convey high-level information about device behavior. In this paper, we propose a novel deep-learning-based scheme for system-level hardware failure prediction. We normalize the distribution of samples' attributes from different vendors to make use of diverse training sets. We propose a temporal Convolutional Neural Network based model that is insensitive to noise in the time dimension. Finally, we design a loss function to train the model effectively with extremely imbalanced samples. Experimental results from an open S.M.A.R.T. data set and an industrial data set show the effectiveness of the proposed scheme.
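
The abstract names two preprocessing and training ideas without giving their details: normalizing attribute distributions across vendors, and a loss suited to heavily imbalanced samples. The sketch below is only an illustration of how such steps are commonly done, not the paper's method. It assumes per-vendor z-scoring as the normalization and a positively weighted binary cross-entropy as the loss; the function names, the `(vendor, values)` sample layout, and the `pos_weight` parameter are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_vendor(samples):
    """Z-score each attribute within its vendor group.

    `samples` is a list of (vendor, [attribute values]) pairs.
    This layout is an assumption made for illustration.
    """
    by_vendor = defaultdict(list)
    for vendor, values in samples:
        by_vendor[vendor].append(values)

    stats = {}
    for vendor, rows in by_vendor.items():
        columns = list(zip(*rows))
        # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
        stats[vendor] = [(mean(c), pstdev(c) or 1.0) for c in columns]

    return [
        (vendor, [(v - m) / s for v, (m, s) in zip(values, stats[vendor])])
        for vendor, values in samples
    ]


def weighted_bce(y_true, y_prob, pos_weight):
    """Binary cross-entropy with the rare positive (failure) class up-weighted.

    This is a common stand-in for imbalanced data, not the loss from the paper.
    """
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

With these assumptions, attributes reported on different scales by different vendors end up on a shared zero-mean, unit-variance scale, and a missed failure costs `pos_weight` times as much as a false alarm at the same predicted probability.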

Published in: 2019 56th ACM/IEEE Design Automation Conference (DAC)

Date of Conference: 02-06 June 2019

Date Added to IEEE Xplore: 22 August 2019

ISBN Information:

Print on Demand (PoD) ISSN: 0738-100X

Conference Location: Las Vegas, NV, USA
