Adaptive Multiscale Weighted Permutation Entropy for Rolling Bearing Fault Diagnosis

Fault diagnosis of rolling bearings is of great importance for ensuring high reliability and safety in industrial machinery systems. Entropy measures are useful non-linear indicators for time series complexity analysis and have been widely applied in bearing fault diagnosis in the past decade. In this paper, an improved entropy measure is proposed, named Adaptive Multiscale Weighted Permutation Entropy (AMWPE). Then, a new rolling bearing fault diagnosis method is developed based on the AMWPE and a multi-class SVM. For comparison, an experimental bearing dataset is analyzed using the AMWPE and conventional entropy measures, and a multi-class SVM is then adopted for fault type classification. Further, the robustness of different entropy measures against noise is studied by analyzing noisy signals with various Signal-to-Noise Ratios (SNRs). The experimental results demonstrate the effectiveness of the proposed method in bearing fault diagnosis under different fault types, severity degrees, and SNR levels.


I. INTRODUCTION
Rolling bearings are widely used in the rotating machinery found in commercial and industrial applications. Despite their wide application, rolling bearings are prone to a variety of premature failures caused by factors such as fatigue, lack of lubrication, or overload. A bearing failure can damage the machinery and degrade the performance of the system [1]-[3]. Therefore, fault diagnosis of rolling bearings is important for ensuring the reliability of the machinery, since it enables potential failures to be detected and addressed as early as possible [4].
Vibration monitoring is a useful technique for monitoring machine health conditions. However, interacting components and environmental noise often exist in the operation of industrial machinery systems. Due to instantaneous variations in bearing loads and clearance, as well as other contributions such as non-linear stiffness effects in the bearing and rotor, bearing vibration signals often exhibit non-linear and non-stationary characteristics [5]-[7]. These factors make vibration analysis and feature representation difficult. Traditional feature extraction methods characterize signals in the time or frequency domain only; consequently, they may fail to detect underlying failures because they do not directly analyze the complexity change of the system [8], [9]. Continuing advances in entropy analysis [10] have shown considerable promise for time series complexity analysis by characterizing the complexity change in the system.
The associate editor coordinating the review of this manuscript and approving it for publication was Min Xia.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The most widely used entropy measures include Shannon entropy, approximate entropy, Permutation Entropy (PE), and their variants. Shannon entropy measures the information content of a message in the context of information theory and can quantify the uncertainty in a time series (measurements collected from a system) [11], [12]. Approximate entropy and its improvements (such as sample entropy and fuzzy entropy) estimate the complexity and irregularity of measurements [13]. The PE measure quantifies dynamic changes based on ordinal patterns that originate from the structure of the time series. Due to its theoretical simplicity and fast calculation, PE has been widely applied in time series complexity analysis. Many works have applied the PE measure as a non-linear health indicator to monitor machine health conditions [14]-[16]. Several improvements to the PE algorithm aim to enhance its performance in time series complexity analysis. The main shortcoming of the original PE algorithm is that it neglects amplitude differences in the time series, so that different time series may have the same entropy value. Therefore, some works take the amplitude differences between neighboring elements into account through a "weighting factor". The basic idea is that different amplitude differences yield different weighting factors, which alter the relative frequency of ordinal patterns and thereby change the entropy values. For example, Liu and Wang [17] added an extra parameter to the ordinal pattern type so that time series with different amplitudes produce different pattern types. Azami et al. presented a modified PE method that applies weighted coefficients based on the average and absolute amplitude difference values of neighboring elements [18].
Fadlallah et al. developed Weighted Permutation Entropy (WPE) by taking the variance of neighboring elements into account in the calculation of PE [19]. Among these improvements, the WPE measure has better computational efficiency and requires fewer extra parameters. Note that although the WPE is a refinement of PE, WPE values are extracted from the original time series at a single scale. However, in the fault diagnosis of rotating machinery, rich information related to fault symptoms may exist in the spatial-temporal structure of the time series. For this reason, multiple-scale entropy measures were developed to improve entropy analysis from a multiple-scale perspective.
Aziz and Arif [20] put forth the notion of Multiscale Permutation Entropy (MPE), where PE values are calculated over a range of scales through the coarse-graining procedure. The MPE earns higher reliability and higher fault pattern recognition accuracy than PE for bearing diagnosis [20]. Although MPE outperforms PE in complexity analysis, the coarse-graining procedure is essentially a linear smoothing filter. When the scale factor increases, the MPE value decreases because the data length is greatly reduced in the coarse-grained time series. Also, the high-frequency components in the coarse-grained time series are abandoned, resulting in the loss of high-frequency information in entropy analysis. Some works have developed improved MPE algorithms through enhanced multiple-scale extraction mechanisms. For instance, Composite Multiscale Permutation Entropy (CMPE) [21] and refined CMPE [22] were introduced based on an improved coarse-graining procedure. The CMPE and refined CMPE alleviate the problem of sharply reduced data length to some extent; however, neither of them considers high-frequency information in the analysis of vibration signals. Thus, they may present limited diagnostic performance in identifying bearing health state.
To take high-frequency information into account, some improved scale-extraction mechanisms were later developed. For example, Jiang et al. introduced a hierarchical decomposition for multiple-scale entropy estimation [23], but they did not consider the decreased data length in the calculation of entropy values. Recently, a new entropy measure, termed Fine-to-Coarse Multiscale Permutation Entropy (F2CMPE), was put forward by Huo et al. [24], in which a Fine-to-Coarse (F2C) procedure is proposed. The F2CMPE measure attempts to overcome the two limitations of conventional MPE algorithms, namely the reduced data length and the loss of high-frequency information. It is worth mentioning that traditional entropy measures obtain entropy values from all specified scales; in bearing diagnosis, however, not all scales are closely related to the fault information. Using all scales may introduce redundant information and consume more computational resources, thus reducing the efficiency of entropy analysis in fault diagnosis.
In this paper, an Adaptive Multiscale Weighted Permutation Entropy (AMWPE) measure is proposed for time series complexity analysis. The AMWPE approach aims to yield adaptive multiple-scale time series containing salient fault information for bearing diagnosis through an improved scale-extraction procedure. Also, a new rolling bearing fault diagnosis method is proposed based on the AMWPE and multi-class Support Vector Machine (SVM) techniques. The main contributions of this paper are summarized as follows:
• An improved multiple-scale entropy measure is developed for time series complexity analysis. The efficiency of the AMWPE in feature extraction and time cost is investigated and compared with traditional entropy measures.
• A new rolling bearing fault diagnosis method is presented based on the AMWPE and SVM. The procedure is shown in Fig. 1. A comparative study is performed using different diagnosis methods where the AMWPE and traditional entropy measures are used, respectively, to extract entropy features.
• The robustness of the AMWPE and traditional entropy methods against noise is investigated. Their diagnosis performances are studied and compared through analyzing noisy vibration signals with different Signal-to-Noise Ratios (SNRs).
The rest of this paper is structured as follows: Section II presents the principles of related entropy measures. Section III introduces the proposed AMWPE entropy measure and presents the proposed bearing fault diagnosis method based on the AMWPE and SVM. Section IV discusses the experimental results using the AMWPE and traditional entropy algorithms for bearing diagnosis. Finally, a conclusion is drawn in Section V.

II. RELATED ENTROPY PRINCIPLES
This section briefly introduces the theoretical background of traditional PE and MPE measures and their related improvements.

A. PERMUTATION ENTROPY
Bandt and Pompe [25] introduced the PE approach for measuring the complexity change of time series based on the ordinal pattern. The PE can be interpreted as a quantifier that evaluates the rate of generation of new ordinal patterns in the time series. These ordinal (permutation) patterns naturally originate from the local sequential structure of time series. The principle of PE is briefly described as follows. For a time series $x = \{x_1, x_2, \cdots, x_N\}$, the m-dimensional embedding vector is constructed as

$$X_i = \{x_i, x_{i+\lambda}, \cdots, x_{i+(m-1)\lambda}\}$$

where m is the embedding dimension, λ is the time delay, and 1 ≤ i ≤ N − (m − 1)λ. Each $X_i$ can be mapped onto a specific symbol $\pi_n = (j_1, j_2, \cdots, j_m)$ by ranking its m real values in ascending order. $\pi_n$ is one of the m! possible symbol permutations, and each $X_i$ corresponds to a unique $\pi_n$. Define $P(\pi_n)$ as the relative frequency of each symbol sequence:

$$P(\pi_n) = \frac{1}{k} \sum_{i=1}^{k} \mathbf{1}[\mathrm{type}(X_i) = \pi_n]$$

where $\mathbf{1}[\cdot]$ equals one when the condition holds and zero otherwise, k is no larger than N − (m − 1)λ, and $P(\pi_n) = 0$ only when there are no vectors belonging to the given permutation type $\pi_n$. Then, the PE measure is defined as the Shannon entropy of the probability distribution of permutation types:

$$\mathrm{PE}(x, m, \lambda) = -\sum_{\pi_n} P(\pi_n) \log_2 P(\pi_n)$$

The value of PE lies in the range $[0, \log_2 m!]$. The minimum value of PE is zero, which means that the time series is regular. Usually, a larger PE value denotes that the time series is more irregular and relatively unpredictable.
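The PE definition above can be illustrated with a minimal Python sketch. The function name `permutation_entropy` and its argument names are hypothetical, and the sketch assumes that all N − (m − 1)λ embedding vectors are used and that ties between equal values are broken by order of appearance; neither choice is specified in the text.

```python
import numpy as np
from collections import Counter


def permutation_entropy(x, m=3, lam=1):
    """Permutation entropy (base-2 logarithm) of a 1-D series.

    Illustrative sketch: m is the embedding dimension, lam the time delay.
    """
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (m - 1) * lam  # number of embedding vectors X_i
    if n_vec <= 0:
        raise ValueError("series too short for the chosen m and lam")
    counts = Counter()
    for i in range(n_vec):
        window = x[i : i + (m - 1) * lam + 1 : lam]  # embedding vector X_i
        # Ordinal pattern: indices that sort the window in ascending order.
        # A stable sort breaks ties by order of appearance (an assumption).
        pattern = tuple(np.argsort(window, kind="stable"))
        counts[pattern] += 1
    probs = np.array(list(counts.values()), dtype=float) / n_vec
    return float(-np.sum(probs * np.log2(probs)))


# Small series with m = 3: three distinct patterns occur
# with relative frequencies 2/5, 2/5, and 1/5.
print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], m=3, lam=1))  # about 1.522
```

A strictly increasing series contains a single ordinal pattern, so the sketch returns zero for it, which matches the lower bound stated above.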

B. WEIGHTED PERMUTATION ENTROPY
Fadlallah et al. [19] developed the WPE approach by incorporating amplitude differences into the calculation of the probability distribution of permutation patterns. In contrast to the PE, the WPE takes weighting factors, $w_i$, into account using the variance of neighboring elements. The weighted relative frequency of each permutation $\pi_n$ is calculated as

$$Q(\pi_n) = \frac{\sum_{i=1}^{k} \mathbf{1}[\mathrm{type}(X_i) = \pi_n]\, w_i}{\sum_{i=1}^{k} w_i}$$

where $\mathbf{1}[\cdot]$ equals one when the condition holds and zero otherwise, k is no greater than N − (m − 1)λ, and $Q(\pi_n) = 0$ only when there are no vectors $X_i$ belonging to the given permutation type $\pi_n$. The weight $w_i$ is obtained from the corresponding vector $X_i$ by

$$w_i = \frac{1}{m} \sum_{j=1}^{m} \left(x_{i+(j-1)\lambda} - \bar{x}_i\right)^2$$

where $\bar{x}_i$ is the arithmetic mean of $X_i$. Then, the WPE is obtained as

$$\mathrm{WPE}(x, m, \lambda) = -\sum_{\pi_n} Q(\pi_n) \log_2 Q(\pi_n)$$

The value of WPE also lies in the interval $[0, \log_2 m!]$. The definition of WPE maintains most of PE's properties. The most significant difference lies in the definition of the relative frequency of symbol sequences. Using the weighting factors, the WPE can distinguish vectors that have the same ordinal pattern but different amplitudes, thereby altering the probability distribution of ordinal patterns. Therefore, the WPE is more applicable for measuring the irregularity of time series and has better performance than PE in entropy analysis for bearing diagnosis [26].
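As a rough illustration of the WPE description above, the following Python sketch weights each ordinal pattern by the variance of its embedding vector. The function name `weighted_permutation_entropy` is hypothetical, and returning zero for a constant series (where every weight is zero) is an assumption made here to avoid a division by zero.

```python
import numpy as np


def weighted_permutation_entropy(x, m=3, lam=1):
    """Weighted permutation entropy (base-2 logarithm) of a 1-D series.

    Illustrative sketch: each embedding vector contributes its variance
    as a weight to the frequency of its ordinal pattern.
    """
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (m - 1) * lam
    if n_vec <= 0:
        raise ValueError("series too short for the chosen m and lam")
    weight_sum = {}
    for i in range(n_vec):
        window = x[i : i + (m - 1) * lam + 1 : lam]  # embedding vector X_i
        pattern = tuple(np.argsort(window, kind="stable"))
        w = float(np.mean((window - window.mean()) ** 2))  # weight w_i
        weight_sum[pattern] = weight_sum.get(pattern, 0.0) + w
    total = sum(weight_sum.values())
    if total == 0.0:
        return 0.0  # constant series: assumed to have zero entropy
    q = np.array([v for v in weight_sum.values() if v > 0.0]) / total
    return float(-np.sum(q * np.log2(q)))


# Alternating series with m = 2: two patterns with equal total weight.
print(weighted_permutation_entropy([0, 1, 0, 1, 0, 1, 0], m=2, lam=1))  # 1.0
```

Because large-amplitude vectors carry larger weights, a burst of high-amplitude oscillation shifts the pattern distribution more strongly than low-amplitude noise does, which is the property the text attributes to the WPE.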

C. ORIGINAL AND COMPOSITE MULTISCALE PERMUTATION ENTROPY
Aziz and Arif [20] developed an extension of the PE method by calculating PE values over a range of scales based on the coarse-graining procedure. The coarse-grained time series $y^{(\tau)}$ are obtained by averaging a successively increasing number of data points in non-overlapping windows at a given scale factor τ [27]:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \quad 1 \le j \le \lfloor N/\tau \rfloor$$

When the scale factor τ = 1, the original time series is obtained.
Although the MPE is a refinement of PE, it still lacks consistency in estimating entropy values as the scale increases because of the sharply decreased data length of the coarse-grained time series. For example, the entropy values on adjacent scales can vary widely. To alleviate this problem, Azami [21] proposed the CMPE, an enhancement of MPE, in which a modified procedure is used to generate composite coarse-grained time series. In the CMPE, the k-th coarse-grained time series for a given scale factor τ, $y_k^{(\tau)} = \{y_{k,1}^{(\tau)}, y_{k,2}^{(\tau)}, \cdots\}$, is defined as

$$y_{k,j}^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+k}^{j\tau+k-1} x_i, \quad 1 \le j \le \lfloor (N-k+1)/\tau \rfloor, \quad 1 \le k \le \tau$$

For a specific scale τ, the CMPE value is the average of the PE values obtained from the τ composite coarse-grained time series:

$$\mathrm{CMPE}(x, \tau, m, \lambda) = \frac{1}{\tau} \sum_{k=1}^{\tau} \mathrm{PE}\left(y_k^{(\tau)}, m, \lambda\right)$$

Under the coarse-graining framework, the MPE and CMPE usually perform better and extract more information from the time series than the single-scale PE method. Therefore, many works have employed the MPE and CMPE for feature extraction and health condition recognition in bearing diagnosis [28]-[30]. Moreover, under the multiple-scale framework, the WPE is better suited than the PE to analyzing complex signals. For comparison, when the WPE is applied under the original and modified coarse-graining frameworks, we refer to the two corresponding methods as the Multiscale Weighted Permutation Entropy (MWPE) and Composite Multiscale Weighted Permutation Entropy (CMWPE), respectively.
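The two coarse-graining procedures described in this subsection can be sketched in Python as follows. The function names `coarse_grain` and `composite_coarse_grain` are hypothetical, and the sketch assumes that trailing points that do not fill a complete window are discarded.

```python
import numpy as np


def coarse_grain(x, tau):
    """Average non-overlapping windows of length tau (original MPE scaling)."""
    x = np.asarray(x, dtype=float)
    n = len(x) // tau  # incomplete trailing window is dropped (assumption)
    return x[: n * tau].reshape(n, tau).mean(axis=1)


def composite_coarse_grain(x, tau):
    """Return the tau shifted coarse-grained series used by composite variants.

    The k-th series (k = 0 .. tau-1 here) starts averaging at sample k.
    """
    x = np.asarray(x, dtype=float)
    return [coarse_grain(x[k:], tau) for k in range(tau)]


x = [1, 2, 3, 4, 5, 6, 7]
print(coarse_grain(x, 2))               # [1.5 3.5 5.5]
print(composite_coarse_grain(x, 2)[1])  # [2.5 4.5 6.5]
```

A multiscale entropy value at scale τ would then be an entropy computed on `coarse_grain(x, tau)`, and a composite value would be the mean of the entropies of the τ series returned by `composite_coarse_grain(x, tau)`. The sketch also makes the data-length issue visible: each coarse-grained series has only about N/τ points.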

D. FINE-TO-COARSE MULTISCALE PERMUTATION ENTROPY
The F2CMPE measure is a multiple-scale entropy measure for complexity analysis, in which an improved scale-extraction mechanism is proposed [24]. It not only alleviates the shortcoming of data length reduction in traditional multiple-scale time series but also takes into account high-frequency information in entropy estimation. Therefore, the F2CMPE achieves higher consistency in entropy analysis and is more robust to noise compared with traditional multiple-scale entropy methods [24].
The calculation of F2CMPE relies on a two-step procedure. First, the F2C scale-extraction procedure is applied to generate time series with multiple scales based on Wavelet Packet Decomposition (WPD) analysis. A set of decomposed wavelet coefficients is generated from the original signal. These wavelet coefficients are then reconstructed into sub-signals that have the same data length as the original signal, and the F2C signals are constructed from these sub-signals. Specifically, given the k-th decomposition level, the F2C procedure generates only the $2^{k-1}$ sets of wavelet coefficients that are produced from the approximate coefficients in the first decomposition level. A reconstructed sub-signal is then generated from each branch of wavelet coefficients. The scale factor τ thus equals $2^{k-1}$, and the F2C signals are constructed by consecutively removing one reconstructed sub-signal from the previously obtained F2C signal, commencing from the accumulation of all τ reconstructed sub-signals.
Here, k is the decomposition level and τ is the scale factor. Finally, the F2CMPE value is computed by calculating the PE values over the range of F2C signals.
Though the F2CMPE measure improves on the traditional MPE approaches, the efficiency of its scale-extraction scheme can be enhanced further. First, in the course of generating F2C signals, some reconstructed signals may not be closely related to the characteristic fault symptoms; thus, the use of all reconstructed signals inevitably produces redundant information. Moreover, the WPE algorithm can improve the entropy analysis compared to the original PE method. Therefore, this study proposes an AMWPE measure, aiming to offer better feature representation and computational efficiency, thus improving entropy analysis for bearing diagnosis.

III. THE PROPOSED BEARING DIAGNOSIS METHOD BASED ON THE AMWPE MEASURE AND SVM
This section first introduces the proposed AMWPE algorithm. Then, a new bearing fault diagnosis method is developed based on the AMWPE and SVM techniques.

A. ADAPTIVE MULTISCALE WEIGHTED PERMUTATION ENTROPY
In the AMWPE algorithm, an improved F2C procedure is developed to construct adaptive F2C signals. The advent of failures in the bearing will introduce coupling frequencies and change amplitude magnitudes in bearing vibration signals. Crucial components extracted from raw signals should maintain characteristic symptoms in the waveforms and thus have a high similarity to raw signals in the time domain. Considering this, the adaptive F2C procedure in the AMWPE algorithm selects salient reconstructed sub-signals based on correlation coefficient analysis. These selected sub-signals are closely related to the raw signals and have a high correlation in the time domain. Then, adaptive F2C signals are constructed based on these selected sub-signals, and entropy values are calculated from obtained F2C signals. The improved F2C procedure has two merits. On the one hand, these adaptive F2C signals could incorporate more crucial fault information and less redundancy. On the other hand, the improved F2C procedure can achieve higher computational efficiency compared to the F2CMPE in time series complexity analysis.
In this study, correlation coefficients are used to evaluate the similarity between the reconstructed signals $R_{k,i}$ and the raw signal x in the time domain. Fig. 2 presents the diagram of the AMWPE algorithm, and its detailed calculation steps are described below:
1) Decompose a vibration signal x to the k-th decomposition level using WPD. Select the wavelet coefficients that are decomposed from the approximate coefficients at the first decomposition level in the wavelet tree. Reconstruct these selected wavelet coefficients into sub-signals that have the same data length as x. Thus, a total of $2^{k-1}$ reconstructed signals $\{R_{k,i}, (0 \le i \le 2^{k-1} - 1)\}$ are obtained.
2) Compute the correlation coefficient between each reconstructed sub-signal and the raw signal in the time domain:

$$\rho_i = \frac{E\left[(R_{k,i} - \mu(R_{k,i}))(x - \mu(x))\right]}{\sigma(R_{k,i})\, \sigma(x)}$$

where $\mu(R_{k,i})$, $\mu(x)$, $\sigma(R_{k,i})$, and $\sigma(x)$ denote the means and standard deviations of the reconstructed sub-signal and the original signal, respectively.

3) Calculate the contribution rates from the correlation coefficients by

$$S_i = \frac{\rho_i}{\sum_{j=0}^{2^{k-1}-1} \rho_j} \times 100\%$$

where $0 \le i \le 2^{k-1} - 1$, and a larger $S_i$ indicates that the corresponding sub-signal has a higher correlation with the original signal in the time domain.
4) Rank the contribution rates in descending order. For each signal, let n be the smallest number of reconstructed sub-signals for which the sum of the n largest contribution rates is no less than 90%, namely $\sum_{i=1}^{n} S_i \ge 90\%$, $(n \le 2^{k-1})$. Record the indices of the n selected sub-signals and denote the sub-signals as $\{U_i, (1 \le i \le n)\}$.
5) Apply the obtained sub-signals $U_i$ to construct adaptive F2C signals, commencing from the accumulation of all n selected sub-signals:

$$F2C_\tau = \sum_{i=1}^{n-\tau+1} U_i$$

where 1 ≤ i ≤ n, and 1 ≤ τ ≤ n.
6) Calculate the WPE value of each F2C signal. The AMWPE values are finally obtained by

$$\mathrm{AMWPE}(x, m, \lambda, \tau) = \mathrm{WPE}(F2C_\tau, m, \lambda)$$

The AMWPE analysis consists of wavelet analysis and WPE estimation. In wavelet analysis, appropriate parameters (the mother wavelet and the resolution of the decomposition scale) can produce time-frequency components containing crucial fault information. In this study, we select a six-level (k = 6) wavelet decomposition tree, and therefore 32 sets of wavelet coefficients are obtained in total according to Step 1) of the AMWPE algorithm. Also, a "db4" wavelet is applied, as the Daubechies family of wavelets is well known for its orthogonality and efficiency in filter implementation [24]. Regarding the entropy parameter configuration of the WPE measure, many studies have examined the effect of the embedding dimension m and the time delay λ on the calculation of PE values [31]. Researchers have recommended that the parameters m = 4-7 and λ = 1-3 are suitable for analyzing vibration signals in bearing diagnosis [32].
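The adaptive selection in Steps 2) to 5) can be sketched in Python under several stated assumptions. The function name `adaptive_f2c_signals` is hypothetical; the reconstructed sub-signals are assumed to be already available (for example from a wavelet packet library) and are passed in as rows of an array; absolute correlation values are used so that contribution rates are non-negative; and the least-contributing selected sub-signal is assumed to be the one removed first when building successive F2C signals.

```python
import numpy as np


def adaptive_f2c_signals(x, subsignals, threshold=0.90):
    """Select salient sub-signals by correlation and build adaptive F2C signals.

    Illustrative sketch, not a reference implementation.
    x          : raw signal, shape (N,)
    subsignals : reconstructed sub-signals, shape (n_sub, N)
    Returns the indices of the selected sub-signals and the list of
    F2C signals (the first is the sum of all selected sub-signals).
    """
    x = np.asarray(x, dtype=float)
    R = np.asarray(subsignals, dtype=float)
    # Step 2: Pearson correlation between each sub-signal and the raw signal.
    rho = np.array([np.corrcoef(r, x)[0, 1] for r in R])
    # Step 3: contribution rates (absolute value is an assumption).
    a = np.abs(rho)
    if a.sum() == 0.0:
        raise ValueError("all correlation coefficients are zero")
    S = a / a.sum()
    # Step 4: smallest n whose top-n contribution rates reach the threshold.
    order = np.argsort(S)[::-1]
    cum = np.cumsum(S[order])
    n = min(int(np.searchsorted(cum, threshold)) + 1, len(S))
    selected = order[:n]
    U = R[selected]
    # Step 5: start from the sum of all n selected sub-signals, then drop
    # one sub-signal at a time (removal order is an assumption).
    f2c = [U[: n - tau + 1].sum(axis=0) for tau in range(1, n + 1)]
    return selected, f2c
```

Computing an amplitude-weighted entropy on each returned F2C signal would then give one feature per scale, so a signal whose energy is concentrated in a few sub-signals yields a short feature vector. That is the source of the computational saving described in this subsection.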

B. FAULT PATTERN RECOGNITION USING SUPPORT VECTOR MACHINE
In machine health monitoring, the SVM classifier is a useful technique for distinguishing between various bearing health states [33]. It maps the original pattern space into a high-dimensional feature space and maximizes the margin of separation between the boundaries defined by the data points called support vectors. The decision function f(x) defines a separating hyperplane. For the nonlinear case, the SVM constrained optimization problem is formulated in terms of a weight vector w and a bias b [33]. For multi-class classification, the LIBSVM Matlab Toolbox [34] is used for bearing fault pattern recognition in this study. A detailed discussion on multi-class SVM approaches can be found in Refs. [35], [36].

C. PROPOSED ROLLING BEARING FAULT DIAGNOSIS METHOD
Based on the AMWPE and SVM, the proposed fault diagnosis method for rolling bearings is presented as follows:
1) Collect vibration signals from rolling bearings with various health conditions. For each condition, split the raw data into training and testing data sets;
2) Calculate the AMWPE values from the training data samples. In this study, a 6-level decomposition tree (k = 6) is used and thus τ = 32. For each training sample, calculate the value of n; thus, a vector of n values is obtained from all training samples. Then, specify the maximum n (herein denoted as $n_{\max}$) as the number of features for constructing the training feature vectors $F^{\mathrm{train}}_{n_{\max}}$;
3) Calculate the AMWPE values from the testing data samples and construct the testing feature vectors $F^{\mathrm{test}}_{n_{\max}}$, where $n_{\max}$ is acquired from the training data samples;
4) Apply the training feature vectors $F^{\mathrm{train}}_{n_{\max}}$ to train the SVM-based multi-class model for classifying bearing fault types;
5) Input the testing feature vectors $F^{\mathrm{test}}_{n_{\max}}$ into the obtained model to predict the health label. Thus, the fault pattern of the testing sample can be recognized.
The flowchart of the proposed bearing diagnosis method is shown in Fig. 3.

IV. EXPERIMENTAL ROLLING BEARING DATA ANALYSIS
In this section, the performance of the proposed method for bearing fault diagnosis is investigated. For comparison, the AMWPE and traditional entropy-based methods are used to analyze bearing data, after which entropy feature vectors are then inputted into the multi-class SVM for fault type identification. We start with the analysis of raw vibration signals. Afterwards, noisy signals with various SNRs are analyzed to investigate the robustness of different entropy measures against noise in bearing diagnosis.

A. TEST RIG AND DATA ACQUISITION
The experimental rolling bearing dataset is provided by Case Western Reserve University (CWRU) [37]. The schematic of the test rig is shown in Fig. 4. The tested bearings are 6205-2RS JEM SKF deep groove ball bearings with single-point failures. In this study, bearing signals under ten conditions are considered, all collected from the drive-end channel. They include bearings in normal condition (Norm) and bearings with damage on the inner race (IR), the outer race (OR) at 6 o'clock, and the ball element (BE). Three defect sizes are considered (0.1778 mm, 0.3556 mm, and 0.5334 mm) at a speed of 1730 rpm with a load of 3 HP and a sampling frequency of 12 kHz.
For classification purposes, raw vibration signals are split into training and testing data sets. In this study, there are 29 samples with a data length of 4,096 for each bearing condition, and they are divided into 14 training samples and 15 testing samples. Therefore, for ten bearing conditions, there are 140 training samples and 150 testing samples. Table 1 describes the detailed specification of each bearing condition. Fig. 5 shows the time-domain waveforms of bearing signals under ten health states.
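A split of this kind could be sketched in Python as below. The function name `split_samples` is hypothetical, and the sketch assumes consecutive non-overlapping segments with the first 14 used for training; the text does not state how the segments are cut or assigned.

```python
import numpy as np


def split_samples(signal, length=4096, n_train=14):
    """Cut a 1-D record into non-overlapping samples and split them.

    Illustrative sketch: segmentation and assignment order are assumptions.
    Returns (training samples, testing samples) as 2-D arrays.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal) // length  # leftover points at the end are discarded
    segments = signal[: n * length].reshape(n, length)
    return segments[:n_train], segments[n_train:]


# A record holding at least 29 * 4096 points yields 14 + 15 samples.
record = np.random.default_rng(0).normal(size=29 * 4096 + 100)
train, test = split_samples(record)
print(train.shape, test.shape)  # (14, 4096) (15, 4096)
```

Repeating this for each of the ten bearing conditions gives the 140 training and 150 testing samples mentioned above.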

B. FAULT DIAGNOSIS ANALYSIS BASED ON ORIGINAL VIBRATION SIGNALS
In this study, raw bearing vibration signals are analyzed using the proposed bearing fault diagnosis method. The AMWPE is compared with traditional measures, such as the MPE, MWPE, CMPE, and CMWPE algorithms, to study their performance in bearing fault diagnosis. Their computational efficiency in extracting entropy features from time series is also investigated.
We first evaluate the computation time of the various entropy measures for extracting entropy features from the vibration signal. The experiments run on a PC with an Intel Core i7-3770 quad-core 3.40 GHz processor and 8 GB of RAM under the Windows 7 operating system. Table 2 shows the average time cost of computing the various entropy features under parameters m = 5, λ = 1, and τ = 32.
The time cost is determined by two main factors in the calculation of the multiple-scale entropy value. First, the theoretical differences between the entropy measures lead to different computing resource and time requirements. From Table 2, it can be seen that the traditional MPE and MWPE algorithms consume the least time. This is because the coarse-graining procedure is a linear smoothing operation and thus saves time in signal transformation and generation. Improved MPE measures, such as the CMPE, CMWPE, and F2CMPE, can achieve better performance in fault diagnosis; nevertheless, they consume more time than the MPE. Second, as the data length of the time series increases, the calculation time for each measure also increases. This is because sorting elements and matching templates consume most of the time in the calculation of PE values. Also, an increasing scale factor produces more multiple-scale time series and thus requires more time to yield PE values. With respect to the proposed measure, the results verify that the AMWPE measure achieves higher computational efficiency in entropy analysis than the traditional modified MPE measures.
Experimental data sets are then analyzed using different entropy measures under various parameters (m = 4-5, λ = 1-3, and τ = 32). Fig. 6 shows the fault diagnosis performance on the testing data sets based on entropy extraction, where a multi-class SVM [34] is adopted for recognizing health conditions. The radial basis function is an effective choice of kernel and is applied in this study. Also, two parameters, the cost c and the kernel width parameter g, have to be appropriately specified. In this study, these two hyper-parameters of the SVM model are tuned using the Particle Swarm Optimization (PSO) method based on the training data sets [38]. The PSO is a population-based heuristic method that optimizes a problem using swarm intelligence. It searches for optimized solutions using a population of individuals that are updated iteratively until the optimized c and g values are located. The resulting hyper-parameters yield high bearing diagnosis accuracy. The classification accuracy rate is defined as

$$\mathrm{Accuracy} = \frac{N_t}{N_t + N_f} \times 100\%$$

where $N_t$ and $N_f$ denote the numbers of correctly and incorrectly classified samples, respectively.
From Fig. 6, it can be seen that the diagnosis accuracy of the MPE and MWPE measures is around 98%, and that their performance is not stable when m and λ change. Comparatively, from Fig. 6 (c) and (d), the CMPE and CMWPE approaches perform better than the traditional MPE methods and can obtain an accuracy rate of over 98%. Although they achieve reasonable results under specified parameters, their performance is sensitive to the choice of parameters (i.e., m and λ). For instance, given m = 4, the performance of the diagnosis method using the CMPE decreases when λ increases. Fig. 6 (e) shows that the F2CMPE-based method performs well and gives an accuracy rate of 100% when λ = 1 under all m values. The AMWPE-based method exhibits the best performance on the testing data among all the entropy measures. More specifically, from Fig. 6 (f), the proposed method obtains 100% accuracy when m = 5 for all λ values. Also, the AMWPE shows higher flexibility in parameter selection. To sum up, the experimental results demonstrate that the AMWPE algorithm not only has high computational efficiency in entropy feature extraction but also exhibits better accuracy in bearing fault diagnosis and greater flexibility in parameter selection.

C. ROBUSTNESS ANALYSIS BASED ON NOISY SIGNALS WITH DIFFERENT SNRs
In practical applications, rotating machinery often works in complex environments with strong noise. Therefore, it is necessary to study the robustness of entropy-based analytic models to external disturbances and noise. For this purpose, we add Gaussian white noise to the signals at different Signal-to-Noise Ratios (SNRs). The SNR is defined as the ratio of signal power to background noise power in decibels (dB):

$$\mathrm{SNR} = 10 \log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right)$$

A comparative study was first performed to evaluate various entropy measures for feature extraction. Noisy signals are generated with SNRs ranging from −4 to 14 dB. Fig. 7 shows the waveforms of the original bearing signal in the IR state and of the noisy signals at SNR = 6, 2, and −2 dB, respectively. As shown, the raw signals are contaminated with stronger noise as the SNR decreases, and the waveforms become more complicated. For comparison, we consider noisy bearing vibration signals with SNR = 2 dB. Entropy values are calculated under m = 5, λ = 1, and τ = 32 for all entropy methods. Fig. 8 presents a reduced 2-D feature space of the features extracted by the different entropy measures, obtained using the t-SNE method. The t-SNE technique visualizes high-dimensional data by mapping it to a two-dimensional feature space while still preserving the high-dimensional clustering relationship [39]. From Fig. 8, the MPE and MWPE feature values that represent the ten bearing conditions are completely mixed up, and thus the bearing conditions cannot be differentiated. Although the CMPE and CMWPE feature points spread over a relatively dispersed space, most of the points are blended and are difficult to identify. The F2CMPE feature values are relatively scattered; nonetheless, some data points at the bottom left of the feature space are difficult to distinguish. Comparatively, the AMWPE features display a clear degree of separation, so the bearing conditions are easy to differentiate by observing the feature space.
Moreover, the data points in the AMWPE feature space in each cluster are more compact, compared with traditional approaches. The results verify that the AMWPE algorithm is more robust to the analysis of noisy signals with small SNRs compared with traditional measures.
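Noisy test signals of the kind used in this robustness study can be generated along the following lines. This is a minimal sketch in which the function name `add_awgn` is hypothetical, and the signal power is assumed to be estimated as the mean square of the samples, which the text does not specify.

```python
import numpy as np


def add_awgn(x, snr_db, rng=None):
    """Add white Gaussian noise to x so that the result has the target SNR.

    Illustrative sketch: signal power is estimated as mean(x**2).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    p_signal = np.mean(x ** 2)
    # SNR = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=x.shape)
    return x + noise


# Example: contaminate a sinusoid at SNR = 2 dB with a fixed seed.
t = np.arange(4096) / 12000.0
clean = np.sin(2 * np.pi * 100.0 * t)
noisy = add_awgn(clean, snr_db=2, rng=np.random.default_rng(0))
```

At SNR = 0 dB the noise power equals the signal power, and negative SNR values mean that the noise is stronger than the signal.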
In addition, we compared the diagnosis performance of the different entropy measures on bearing signals with SNR = 2 dB. Fig. 9 shows the confusion matrices, which indicate the numbers of correct and incorrect predictions in identifying the bearing health state. The figure reveals that the diagnosis accuracy results are in line with how well the bearing health conditions are separated in the feature space. That is, an entropy measure that gives a better separation of feature clusters obtains a higher classification accuracy. For example, the accuracy of the diagnosis methods based on the MPE, CMPE, and MWPE is no greater than 90%. The methods using the CMWPE and F2CMPE obtain higher accuracy rates of 90% and 92.7%, respectively. In contrast, the AMWPE-based diagnosis method obtains the highest accuracy rate of 99.3%, which verifies the superiority of the AMWPE measure in analyzing noisy signals for bearing fault diagnosis.
Further experiments were carried out to study the robustness of entropy analysis to noise for bearing diagnosis, with entropy values calculated using different parameters (i.e., m and λ). In this study, the SNR of the tested signals increases from −4 to 14 dB. Three groups of experiments are considered, in which entropy values are calculated under λ = 1-3 with m = 5. Fig. 10 compares the diagnosis accuracy rates resulting from the six entropy measures under different λ values. It can be seen that the traditional MPE and CMPE give relatively low accuracy rates in all three cases. As the noise level decreases, their diagnosis performance increases very slowly. In comparison, the MWPE and CMWPE perform better in the three cases, but their accuracy is no larger than 80% when SNR ≤ 4 dB. This indicates that they are not suitable for signals with high noise levels.
Comparatively, from Fig. 10, it is observed that even when the SNR is low, the diagnosis methods using the F2CMPE and AMWPE measures are superior to the traditional methods. Also, the proposed AMWPE method improves the diagnosis performance compared to the F2CMPE. For example, when the SNR is −4 dB, the AMWPE-based diagnosis method can still reach an accuracy of 70%. Further, in this study, the accuracy rate of the proposed method exceeds 95% when SNR = 0 dB (i.e., the noise power equals the power of the original vibration signal) and continues increasing as the SNR increases. To sum up, the experimental results demonstrate the effectiveness of the proposed bearing diagnosis method in bearing fault detection and identification. The developed AMWPE offers reliable entropy analysis with high flexibility in parameter selection. Also, the proposed method is robust to noisy vibration signals and gives higher diagnostic accuracy rates than the traditional methods.

V. CONCLUSIONS
In this paper, an improved entropy measure named AMWPE is proposed for time series complexity analysis. A new method is then developed based on the AMWPE and SVM for bearing fault diagnosis. Diagnosis performances are studied and compared between different entropy measures for feature extraction in terms of feature representation, computational efficiency, and diagnosis accuracy. Experimental results have verified that the proposed diagnosis method can present reliable and satisfactory diagnostic results. Also, through analyzing noisy signals with different SNRs, the AMWPE method exhibits more robustness in analyzing noisy signals with low SNRs compared to traditional entropy measures for bearing diagnosis. For future work, the proposed method will be applied to diagnose compound faults in industrial-scale machinery in an attempt to study and improve its diagnosis performance. Furthermore, the development of further improved permutation entropy measures for fault diagnosis of rotating machinery can be explored.