Impact of Data Leakage in Vibration Signals Used for Bearing Fault Diagnosis


Examples of samples drawn from similar class distributions with and without domain shifts.

Abstract:

Bearing fault diagnosis is a well-developed field and an active area of research in which combining model-free machine learning techniques with vibration data has become a popular approach. However, vibration data from rotating machines can contain domain shifts beyond the causes accepted in this research area (different part models, operating conditions and sensor locations), and these shifts can enable data leakage between training and test datasets. To demonstrate the impact of data leakage, six common bearing diagnosis methods are applied to two datasets using three data splitting methods, and their classification performance is compared. Diagnosis is performed using Principal Component Analysis (PCA), Supervised Principal Component Analysis (SPCA) and Linear Discriminant Analysis (LDA), each in combination with frequency analysis and envelope analysis feature extraction methods. Datasets from McMaster University and Paderborn University are used as experimental data sources. The results differ vastly (over a 40% drop in accuracy) depending on the selected dataset splitting method, revealing a previously unknown domain shift. Although diagnosis methods using frequency response analysis produce strong results on the data from McMaster, these results are not expected to generalize due to possible data leakage. Out of fifty-five previous works using the Paderborn dataset, ten are identified as likely to be affected and only six properly address the problem. Recommendations are given for future experiment design, model creation and model evaluation.
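The effect of the splitting method described in the abstract can be illustrated with a minimal toy sketch. It is not the paper's method or data: the recordings, the offsets, the single scalar feature and the 1-nearest-neighbour classifier are all assumptions chosen for illustration, with each hypothetical recording carrying its own offset to stand in for a recording-specific domain shift.

```python
import random

# Hypothetical recordings: (recording_id, class_label, recording_offset).
# The offset stands in for a recording-specific domain shift (an assumption).
RECORDINGS = [(0, 0, 0.0), (1, 1, 1.0), (2, 0, 2.0),
              (3, 1, 3.0), (4, 0, 4.0), (5, 1, 5.0)]


def make_samples(seed=0, per_recording=50):
    """Return (feature, label, recording_id) tuples with one scalar feature."""
    rng = random.Random(seed)
    return [(offset + rng.gauss(0.0, 0.1), label, rec_id)
            for rec_id, label, offset in RECORDINGS
            for _ in range(per_recording)]


def one_nn_accuracy(train, test):
    """Accuracy of a 1-nearest-neighbour classifier on the scalar feature."""
    correct = 0
    for x, y, _ in test:
        nearest = min(train, key=lambda s: abs(s[0] - x))
        correct += int(nearest[1] == y)
    return correct / len(test)


def random_split(samples, test_fraction=0.2, seed=1):
    """Sample-wise split: samples of one recording can land on both sides."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def group_split(samples, test_recordings=(4, 5)):
    """Recording-wise split: held-out recordings never appear in training."""
    train = [s for s in samples if s[2] not in test_recordings]
    test = [s for s in samples if s[2] in test_recordings]
    return train, test


samples = make_samples()
random_acc = one_nn_accuracy(*random_split(samples))
group_acc = one_nn_accuracy(*group_split(samples))
print(f"sample-wise split accuracy:    {random_acc:.2f}")
print(f"recording-wise split accuracy: {group_acc:.2f}")
```

In this toy setup the classifier can score well on the sample-wise split by recognising the recording rather than the fault class, so the gap between the two accuracies is a rough indicator of leakage. Real vibration data would need proper feature extraction and a grouping variable that reflects how the signals were actually recorded.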
Published in: IEEE Access (Volume: 12)
Page(s): 169879 - 169895
Date of Publication: 13 November 2024
Electronic ISSN: 2169-3536
