A Robust Health Indicator for Rotating Machinery Under Time-Varying Operating Conditions

Bearing is an essential component whose failure leads to costly downtime in operation. Therefore, it is important to establish an accurate health indicator (HI), using which the remaining useful life can be reliably predicted. To date, most of the health assessment for bearing have been focused on the constant operating condition while in practice, it operates under various operating conditions (rotating speed and loading). Motivated by this, this paper proposes a method to extract robust HI which undergoes variable operating conditions. The idea is to cluster the operating conditions regimes, and develop HI based on the Mahalanobis distance using the optimal features subset in each regime. To validate the effectiveness, bearing run-to-fail experiment is performed under variable operating condition, and proposed HI is compared with the traditional statistical features. The remaining useful life is predicted by the data augmentation prognostics algorithm which was to overcome data deficiency problem.


I. INTRODUCTION
Bearings are considered as one of the most critical components in many rotating machineries since their failure leads to the high costs in the maintenance. To prevent these failures, Prognostics and Health Management (PHM) techniques have been actively developed, which is an enabling discipline that uses sensors to extract features, perform diagnosis to assess the health and prognosis to predict the remaining useful life (RUL) [1]. Many review papers have addressed the PHM techniques for bearing applications from anomaly detection to RUL prognosis (e.g., [2]- [4] among others). Among the steps of the PHM, the prognosis is of the most importance since it gives the RUL before failure which directly affects the maintenance decision. The prognostic performance is however dependent on how to construct the health indicator (HI) which reflects the current health state and degradation pattern of the component. Many literature have studied methods for HI construction for the bearing using the vibration data until failure. Traditionally, the HI construction has been driven by the time statistical features The associate editor coordinating the review of this manuscript and approving it for publication was Nagarajan Raghavan . such as the root mean square (RMS), kurtosis, and standard deviation or frequency domain features such as the bearing fault frequencies. These approaches aim to identify the proper health indicator that reflects the degradation of bearing [5]- [12]. More recently, literature have employed machine learning algorithms to better establish the health indicator by integrated use of multiple features [13]- [18]. For more information regarding the HI, a comprehensive review can be found in [19].
Though the aforementioned researches have demonstrated its effectiveness on many designed and actual experiments [5], [20], [21], the drawbacks are that they have mostly dealt with constant operating conditions. The bearings in practice, however, are operated more often under varying operating conditions as found in various industrial applications, e.g. automated industrial robots [22] and milling machines [23]. The systems or components under these conditions can cause large fluctuations in monitoring signal and consequently affect the HIs for the prognosis [24].
Upon a survey of literature, many literatures studied on fault diagnosis of bearing under fluctuating conditions. Xiang et al. [25] developed method based on Teager Computed Order(TCO) spectrum and Stacking Auto-encoder(SAE) to extract features adaptively from vibration under variable speed and load conditions. Aside from this, more studies have been conducted and interested readers are referred to [26]- [30]. However, only a few researchers have performed extracting robust HI for the prognosis study under variable operating conditions. Singh et al. [31] decomposed the vibration signals under variable operating conditions using the Ensemble Empirical Mode Decomposition (EEMD) to reduce the effect of variable conditions in the HI construction. However, the validity of the method was evaluated by the artificially seeded damage, whereas the application was made to the run-tofailure (RTF) data of constant conditions. Huang et al. [32] used the Nuisance attribute projection (NAP) technique to eliminate influence by the variable conditions. The method was however applied to the multiple sets of RTF data for constant condition with different speeds and loads. Saidi et al. [33] considered the high-speed variation of a shaft in the wind turbine, in which the spectral kurtosis (SK) is used as an indication of bearing failure since the SK is less affected to the noise by the speed variations. However, the variation was within 0.04 Hz which is marginal in the ordinary sense.
Based on the survey, the main challenges for the prognosis under variable operating conditions can be summarized as: First, in many literatures, the variable conditions have still meant the constant condition in a single operation, which were performed for several different speeds or loads. Next, even in the variable conditions, their interest was most likely the varying speed while the load is constant. However, the varying load is important since the fault characteristics in the bearing may be greatly affected by this. In the real field, many industrial systems undergo different conditions alternately in an operation, including the load changes. Therefore, the purpose of this study is to propose a robust HI development framework under different load and speed conditions and apply it to bearing RTF experiments performed under varying conditions. The main idea is to cluster the regimes by the operating conditions, identify the most representative features for each regime, and develop a robust HI by measuring the deviations between the current state and healthy state based on the Mahalanobis distance [34]. In addition, the authors conducted RUL prognosis using a method based on the data augmentation to enable the RUL prediction even by a single RTF dataset.
The rest of the paper is outlined as follows. Section 2 addresses the bearing RTF experiment design and description of acquired data. Section 3 proposes the developed HI extraction methodology and compares its performance with conventional features used for bearing HI. Moreover, the result of RUL prediction is addressed based on the developed HI. Lastly, conclusions and potential future works are presented in Section 4.

II. EXPERIMENTAL SETUP
The RTF experiments of the bearing are conducted by the Korea Institute of Machinery and Materials (KIMM) in which the testbed is illustrated in Fig. 1. The testbed consists of four modules including driving, specimen, radial loading and axial loading. In the specimen module, two test bearings are inserted at both sides of the shaft of the test rig. In the radial and axial loading module, maximum of 50 kN is applied to the both directions by the control unit. The test bearing is the taper roller bearing of 30306 series whose inner and outer diameters are 30 mm and 72 mm, respectively. As shown in Fig. 1(b), two test bearings are installed at both sides of the shaft, with vertical and axial acceleration sensors and thermometer installed per bearing.
The experiment is conducted under the two operating conditions being applied per hour alternately as listed in Table 1: Condition 2 is much more severe than condition 1 in terms of loads. The experiment is terminated when the temperature from one of the bearings exceeds 200 • C, the allowed value for the grease. The fault mode for both experiment was grease degradation. The vibration signal is acquired for 60 second at every 10 minutes with 25.6 kHz sampling rate, which corresponds to one cycle. Two datasets are obtained as a result of experiments as given in Fig. 2 where x-axis and y-axis refer to cycles and acceleration('Acc'). In the Fig. 2, two features can be observed: The raw signals fluctuate between the conditions 1 and 2 and there is a great difference in the life of the bearings even under the same operating conditions. The initial 5 cycles data are considered for stabilization and eliminated. The end of cycle for dataset 1 is 63 cycles and dataset 2 is 122 cycles.

III. METHOD AND APPLICATION
This section addresses the procedure of the HI construction and its application using the bearing datasets. The overall flowchart is shown in Fig. 3, which is composed of two phases: training and testing. For the training, dataset 1 is   used to develop the HI as a measure of health degradation. Then the HI is applied to the dataset 2 to figure out whether the HI monitors the fault progression properly. The training phase consists of three steps: First, regime clustering is conducted using k-means clustering algorithm, which clusters the operating conditions into different regimes. In the next step, features exhibiting greater increase over cycles are selected for each regime. For this purpose, time domain features such as RMS, kurtosis and crest factors and the bearing fault frequency amplitudes are extracted in each regime, from which the features showing higher Spearman correlation are selected respectively. In the third step, subset of features are selected from each regime, using which the HI is defined by the Mahalanobis distance between those at the current and initial normal conditions. By performing exhaustive search, the best features subset in each regime are determined that maximizes the correlation of HI among all the possible combination of features.

A. REGIME CLUSTERING OF OPERATING CONDITIONS
Since the bearing is operated under different operating conditions in alternation, they need to be clustered into the separate regimes using the k-means clustering algorithm. The algorithm aims to partition the N observations x = {x 1 , x 2 , . . . , x N } into the k number of clusters that user wants to separate by seeking the centroids u i , (i = 1, 2, . . . , k) for each cluster group S = {S 1 , S 2 , . . . , S k } that minimize the within-cluster sum of squares [35]: The operation conditions for the load and speed are shown in Fig. 4. Using the algorithm, these two conditions are automatically identified as shown by the regime 1 and 2 in

B. FEATURE EXTRACTION IN EACH REGIME
Traditional statistical features in the time domain and characteristic frequencies related to the rotation of bearings in the frequency domain are widely used in the field of vibration signal-based condition monitoring. In this study, 10 statistical features listed in Table 2 and three amplitudes of bearing fault frequencies (the ball-pass frequency of inner race f BPFI , the ball-pass frequency of outer race f BPFO , ball spin frequency f BSF ) are extracted from the raw signal to construct the HI. Results are shown in Fig.6, in which some features show periodic square-like patterns reflecting the alternating conditions, hence, are inappropriate for the HI. Others do not do so but they are not showing any distinct trend over cycles.
In order to establish a good HI, the features are clustered by the two regimes. Then only the features that show greater monotonic increase over time are selected. For this purpose, Spearman correlation is employed, which is widely used as the measure for monotonicity [36]: where X and Y are the cycle and the feature, rgX and rgY represent the rank of X , Y , i.e. the rank variables, and σ rgX and σ rgY mean the standard deviations of the rank variables, respectively. Fig. 7 illustrates the correlation between the features and cycle for each operating regime. From the figure, it is found that the features of high correlation are different in each regime: rms, stdev, peak and P2P are higher in regime 1 whereas the kurtosis is the highest followed by the entropy and BPFO in regime 2. The reason to have high kurtosis in regime 2 is in its unique feature: It is widely used to capture the impulsiveness in the signal, which is more pronounced when the bearing operates under greater load and higher speed (regime 2). On the other hand, under the mild condition (regime 1), the impulsiveness weakens, leading to the poor kurtosis value. Therefore, the useful features can differ by the different operating regimes.

C. HI CONSTRUCTION BY OPTIMAL FEATURES SUBSET
In the next step, a single HI that represents the current health condition is constructed. For this purpose, finite number of features showing higher correlation are selected to formulate a pool of feature candidates in each regime, from which the optimal subset of features are investigated. In this study, 7 features out of total 13 are chosen in the descending order of correlation in each regime, respectively. An arbitrary subset of features (or features vector) are considered within each regime, using which the HI is defined by the Mahalanobis distance: where i denotes the ID of regime, x i = {x i1 , x i2 , . . . , x ip } represents the p number of features vector at the curre90nt cycle, and µ i and S i are the mean and covariance of the features vector in the initial baseline period for the regime i. The baseline period refers to that of the normal condition, which is given by the initial period of two alternating 4996 VOLUME 10, 2022 conditions in this study. Once the current regime is identified, the corresponding x i , µ i and S i are used to obtain the HI in the equation.

FIGURE 8.
Overall procedure of HI construction to find optimal features subset. According to the definition, the Mahalanobis distance indicates how far the current condition deviates from the normal condition. If it increases over time, it is usually regarded that the fault is developing. Then, at every possible subset of features in the candidate pool of each regime, the HI is calculated until the end of cycles. The performance of HI is then assessed by its Spearman correlation with respect to the cycles. After exhaustive search for the features subset, the best one showing the maximum correlation value is determined. The procedure is illustrated in Fig. 8 for more understanding: First, choose an arbitrary feature vector (subset) within the 7 features of each regime. Calculate µ i and S i for the baseline period using the chosen features vector in each regime. HI is then calculated for the features vector at the current cycle by using (2). By repeating this over cycles, time series of HI are obtained. Finally, the correlation of HI with cycles is evaluated. Fig. 9 shows the correlation values of HI for all possible features subsets. Among them, the optimal features subset with the maximum correlation is marked by the star and listed in Table 3. Using the optimal features subset, the calculated HI for dataset 1 is given in Fig. 10. Compared to the traditional features in Fig. 6, the HI shows the health degradation much better in monotonic way regardless of the regime. Though, there is a major fluctuation of HI around end of cycles, the authors  believe this is regarded to bearing degradation characteristics caused by defect propagation counterbalancing with the vibration [12], [37].
In order to evaluate the goodness of developed HI in more quantitative way, the trendability and monotonicity metric are utilized. The trendability measures the linearity between features and operating time (cycle) by [38]: K is the length of a HI, and h k is the HI value at time t k . The monotonicity metric assesses an increasing or decreasing trend of the features as follows [39]: where dH is the differential of feature series. The metrics result for HI of dataset 1 are listed in Table 4 and shown in Fig. 11. The trendability and monotonicity values for proposed HI are 0.7265 and 0.3548, respectively, which is highest VOLUME 10, 2022 compared to other conventional features and demonstrates that the proposed HI is superior to the traditional features.

D. APPLICATION TO THE TEST DATA
Once the features subset is identified by the training data, the HI can be used for any bearing undergoing the same operating conditions. To demonstrate this, the dataset 2 is employed, which is given in Fig. 2(b): It has shown longer operating cycles until failure under the same condition. For this data, the steps of testing phase are applied: regimes are identified based on the k-means algorithm, optimal features subset as listed in Table 3 are extracted and µ and S are calculated in the baseline period for each regime. The HI for the current cycle is obtained by (2). The resulting HI is given in Fig. 12 and y-axis is plotted same scale with HI dataset 1 for visualization. Analogous trend to that of dataset 1 is observed here, showing the health degradation adequately. The trendability and monotonicity measures are also calculated and compared with the traditional features in Fig. 13.

IV. REMAINING USEFUL LIFE PREDICTION
Once the HI is constructed, the remaining useful life (RUL) can be predicted at any instance of cycles by employing a suitable prognosis algorithm. To do this, failure threshold should be available a priori, which will be mentioned in the later text. In general, the prognostic algorithms are grouped into model-based and data-driven approaches. Model-based approach employs a physical crack growth model such as Paris law or empirical degradation model to estimates the unknown parameter based on the monitored HI up to the current time. The model is then extrapolated into the future to predict the RUL [40]. For the data-driven approaches, the RTF HI data obtained in the past are used to predict the RUL. While there have been numerous algorithms toward this direction with the most popular being the Artificial Neural Network (ANN) [38]- [40], the challenge is that the ANN requires large number of RTF data for the training which is hard to obtain in the real industries [44].
To overcome this problem, Kim et al. [45] have proposed a method based on the data augmentation prognostics (DAPROG) using the dynamic time warping (DTW), which enables the RUL prediction even by only a single RTF data. The DTW measures the similarity between the two time-series data by finding the optimal mapping path. If the two signals go through the same degradation, their similarity is close and mapping path shows the linear pattern. The DAPROG exploits this to predict the RUL by treating the RTF data in the past and the target data at the current time for the mapping. The concept is illustrated via a simple example in Fig. 14. In Fig. 14(a), the 20 target data (filled circular markers) until the current time (t = 4) are given, which has degraded to the value 3.3 denoted by the dashed black line. In order to match this with the corresponding portion of the reference data in the past, 13 data (filled star markers) below the dashed line are selected for the mapping. Then the current mapping path is established using the two data as shown in Fig. 14(b). The path is then fitted by a linear function with predictive interval (PI), and the path is extrapolated into the future to generate virtual mapping paths with PI as illustrated in Fig. 14 (c), namely path 1, 2 and 3, representing the 97.5%, median (50%), and 2.5% percentiles, respectively. Based on these paths, the reference points above the dashed black line are mapped into the target points to generate the three virtual curves in the future as shown in Fig. 14 (d). This is used to predict the RUL against the predetermined threshold, e.g., at 8 shown as the red line. Since the end-of-life cycles are 6.58, 6.88 and 7.18, the RULs are those subtracted by the current cycle 4, which are 2.58, 2.88 and 3.18 for those of 2.5%, median and 97.5% percentiles. More details of DAPROG can be found in reference [45].  In order to apply the DAPROG to the test data, which is the dataset 2, failure threshold should be available. Since the dataset have been made by the termination condition that the temperature exceeds 200 • C, the HI value at the end of the cycles for the training dataset 1 (see Fig. 10), is used as the failure threshold, i.e., if the HI reaches this value, it is regarded as the failure. Though the curve shows overall increase representing the fault progression well, it has still high level of fluctuation during the cycle. To reduce this effect, the HI is fitted to the exponential function on an empirical basis that it is widely used in the bearing prediction. The result is shown in Fig. 15, in which the star markers are the HIs at each point of the curve in Fig. 10. Then the failure threshold becomes 530 at the end of the cycles in the figure. DAPROG is performed by utilizing these two time-series HI. The results are given in Fig. 16 at some point in cycles (50, 70, 80 and 90 cycles) in which the training data given by the solid curve at the left are mapped into the test data to generate three virtual curves of 2.5%, median and 97.5% percentiles in the future. For example, at 50 cycles, mapping path is established based on the test data up to the current time (current data), from which the three virtual curves are made to predict its future behavior as shown in the dotted curves (mapped data). Note that the dotted lines continued from the current data denotes the actual test data in the future (test data). Then the three percentiles of EOLs are predicted at 70, 85 and 100, which are the points where the three dotted lines cross the failure threshold given by the horizontal line at 530. The RULs are then the values subtracted by the current cycles 50, which are 20, 35 and 50. Since this is early stage of RTF cycles, the prediction is not good, as the bounds of the curves do not include the true test data. As more data are obtained, however, the prediction accuracy is improved, as can be found by the fact that the width of the bounds are getting narrower, enclosing the true test data. In order to evaluate the accuracy of RUL prediction in more efficient way, Fig. 17 is exploited, which compares the bounds of the predicted RUL against the true RUL. Note that the true EOL is given by 100 cycles based on the Fig. 12. Then at any cycles, the RUL is the EOL subtracted by the current cycles, which constitutes the linear curve with the slope of 45 degree in the Fig. 17. The predicted RUL bounds are then given here by three red dotted curves. From the point little greater than 50 cycles, the true RUL begins to be included VOLUME 10, 2022 within the bounds, which means that the algorithm predicts the RUL accurately with 95% reliability from that point in cycles. However, it should be noted that different prognostics algorithms can be used as well as similarity trajectory based prediction [46] which are applied on similar grounds.

V. CONCLUSION
In the fault diagnosis and prognosis of bearing, effective features extracted from the vibration signal may differ greatly by the operating conditions. Therefore, the health assessment of the bearings when subjected to the variable operating conditions should incorporate this aspect to implement the prognosis successfully. This paper has proposed health assessment approach for the bearing undergoing different operating conditions in alternation. For this purpose, Kmeans clustering and Mahalanobis distance are employed to explore the optimal features subset, using which the HI is calculated. The proposed approaches are successfully demonstrated by the bearing run-to-fail data which are experimented by alternating two different conditions, in which the health degradation in monotonic manner is observed regardless of the conditions. The RUL is also predicted via DAPROG developed by the authors using only one RTF dataset as the training.
There are several aspects to be considered as the future works. Although the proposed methodology proved its feasibility, the operating regime are limited to the periodic patterns with 2 different conditions. For more practical applications, the proposed method will be improved should be extended to for cover non-periodic operating conditions. In addition, as Mahalanobis distance has its own allegorical disadvantage related with singular matrix, distance metric will be replaced with other algorithms which do not have allegorical problem, such as the generative adversarial network (GAN) that has stronger point in imitating complex data distribution. Finally, since different bearing fault modes can lead to varying degradation patterns, considering faulty types of the bearing will enhance the generalization of the proposed method.