Feature Extraction Based on EWT With Scale Space Threshold and Improved MCKD for Fault Diagnosis

To address feature extraction from non-stationary, non-linear and weak fault signals, this paper proposes a new feature extraction method for rolling bearing faults, named STEWT-PGMCKD. It combines the empirical wavelet transform (EWT) with scale space threshold (STEWT) and an improved maximum correlation kurtosis deconvolution (MCKD) with power spectral entropy and grid search (PGMCKD). In the proposed method, the scale space threshold method is designed to solve the problems of falling into local extrema and mode over-decomposition caused by the local-max-min band segmentation of the EWT. The improved EWT decomposes the frequency band of the signal, and correlation analysis between the decomposed modal components and the original signal retains the components with high correlation. Then an adaptive MCKD based on power spectral entropy is proposed to address the dependence of the MCKD processing effect on the filter size $L$ and the deconvolution period $T$. Next, the parameters of the MCKD are optimized by the grid search method. Finally, power spectrum analysis of the enhanced signal is carried out to realize feature extraction and fault diagnosis. The experimental results show that the proposed STEWT-PGMCKD method can effectively extract weak fault information and accurately diagnose rolling bearing faults.


I. INTRODUCTION
As the core component of typical rotating machinery, the rolling bearing plays an important role in the effective operation of the whole mechanical system, and it is also one of the components that fail most frequently. It is therefore necessary to monitor the state of rolling bearings and detect faults in time [1]- [5]. The fault signal of a rolling bearing is a non-stationary, non-linear and noisy signal, and the fault is often located where the signal is not easy to collect. As a result, the fault signal is weakened and disturbed by the external environment during transmission, which makes the fault diagnosis of rolling bearings difficult [6]- [11]. The key is to find a method that can effectively extract fault features under weak signal and strong noise interference [12]- [16].
The associate editor coordinating the review of this manuscript and approving it for publication was Junhua Li.
In recent years, many methods have been proposed by researchers [17]- [21].
For rolling bearing fault feature extraction, empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), wavelet transform (WT) and other time-frequency analysis methods are widely used. Lou and Loparo [22] proposed a bearing fault classification method combining wavelet transform with fuzzy inference. To increase the computation speed of the biorthogonal WT, the lifting wavelet transform was proposed by Sweldens [23], and it was successfully applied to detecting faults in a helicopter's planetary transmission system. Lin and Qu [24] proposed a feature extraction method based on the Morlet wavelet, and this method has been widely applied to mechanical fault diagnosis. Qin et al. [25] proposed the M-band flexible wavelet transform for identifying the underlying fault features in measured signals. However, these methods have certain limitations and are susceptible to interference. EMD is an adaptive time-frequency analysis method proposed by Huang et al. [26] in 1998. It has achieved satisfactory results in dealing with non-linear and non-stationary signals and has been widely used, but its use is limited by mode mixing and end effects. EEMD alleviates mode mixing, but the number of iterations increases significantly, resulting in a sharp increase in computation. Dragomiretskiy and Zosso [27] proposed variational mode decomposition (VMD) in 2014. By constructing and solving a variational problem, the centre frequency and bandwidth of each intrinsic mode function (IMF) are determined, and the signal is adaptively decomposed into K IMFs. This method not only retains the advantages of dealing with non-linear and non-stationary signals, but also overcomes mode mixing and end effects. However, the processing effect of VMD is greatly affected by K: an improper choice of K leads to over-decomposition or under-decomposition.
VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The empirical wavelet transform (EWT) is a new adaptive signal decomposition method proposed by Gilles [28]. It builds on empirical mode decomposition and the wavelet transform. The EWT overcomes two problems: wavelet bases cannot be selected adaptively according to signal characteristics, and EMD lacks a mathematical basis and is prone to mode aliasing and end effects. According to the maxima of the Fourier spectrum, it adaptively selects a wavelet filter bank to extract the different AM-FM components of the signal. Many researchers have carried out work based on the EWT.
Bhattacharyya et al. [29] proposed a decomposition method for EEG signals using the empirical wavelet transform. A sparsity-guided empirical wavelet transform was proposed to automatically establish the Fourier segments required by the EWT for fault diagnosis of rolling element bearings [30]. Zheng et al. [31] proposed a time-frequency analysis method based on an adaptive parameterless empirical wavelet transform and applied it to rotor rubbing fault diagnosis. Ou et al. [32] proposed a rolling bearing fault diagnosis method using improved majorization-minimization-based total variation and the empirical wavelet transform. Pan et al. [33] proposed a modified EWT (MEWT) method based on data-driven adaptive Fourier spectrum segmentation for mechanical fault identification. Yuan et al. [34] proposed a technique that combines the second-order blind identification (SOBI) method with the EWT to delineate closely spaced frequencies. However, because actual mechanical vibration signals are complex and easily disturbed by strong noise, the local-max-min method for interval segmentation is prone to modal redundancy or to falling into local extrema. Therefore, it is necessary to study the optimization strategy of interval segmentation.
Maximum correlated kurtosis deconvolution (MCKD), an algorithm for enhancing the periodic components of a signal, can highlight the periodic impulses in the original signal to the greatest extent. Therefore, it has been widely used in bearing fault diagnosis. Miao et al. [35] proposed an improved MCKD method for fault diagnosis of rolling element bearings. Parameterized time-frequency transform methods were proposed to extract the IRF accurately despite the strong nonstationary properties of the signal [36]. Zhang et al. [37] proposed a bearing fault diagnosis and degradation analysis method based on improved EMD and MCKD. McDonald et al. [38] proposed a new deconvolution method for the detection of gear and bearing faults from vibration data. Hong et al. [39] proposed a method combining customized balanced multiwavelets and adaptive MCKD for compound fault diagnosis of rotating machinery. Jia et al. [40] proposed an improved spectral kurtosis (SK) method based on MCKD to extract the weak fault characteristics of bearings. Tang et al. [41] proposed a diagnosis method based on adaptive maximum correlated kurtosis deconvolution (AMCKD) for accurate identification of compound faults of rolling bearings. Wang et al. [42] proposed an MCKD-CEEMD-ApEn method to denoise and classify the combined failures of slewing bearings. Lyu et al. [43] proposed an improved MCKD method based on the quantum genetic algorithm, named QGA-MCKD, for gear and bearing compound fault diagnosis. Cheng et al. [44] proposed the optimal minimum entropy deconvolution adjusted and the multipoint optimal minimum entropy deconvolution adjusted (MOMEDA) for enhancing the impulse-like components in the fault signal. Zhang et al. [45] proposed a new detection method, named VMD-AMCKD, that combines the complementary advantages of variational mode decomposition (VMD) and adaptive MCKD. Wang et al. [46] applied the multipoint optimal minimum entropy deconvolution adjusted to the fault diagnosis of gearboxes. Shen et al. [47] proposed a signal processing method based on particle swarm optimization, maximum correlated kurtosis deconvolution, variational mode decomposition and fast spectral kurtosis (PSO-MCKD-VMD-FSK) to extract fault characteristics and to address the signal-to-noise ratio and uneven energy distribution problems. Zhang et al. [48] proposed an improved MCKD based on the grasshopper optimization algorithm. Liang et al. [49] employed an optimized Morlet wavelet as the initial filter in the deconvolution process, which improves both the efficiency and the performance of MAKD. However, the MCKD is sensitive to its parameters: only when the deconvolution period matches the corresponding fault period can the MCKD achieve its best effect. Therefore, it is necessary to study the parameter selection of the MCKD. In recent years, some optimization algorithms have been proposed to determine its parameters [50]- [54].
To address the deficiency of the local-max-min method of the EWT for interval segmentation, the EWT with scale space threshold is studied in depth, and an adaptive MCKD based on power spectral entropy is proposed to address the dependence of the MCKD processing effect on the filter size L and the deconvolution period T. The grid search method is used to optimize the parameters of the MCKD. Firstly, the vibration signal is decomposed by the EWT with scale space threshold, and the AM-FM components are analyzed by envelope spectrum. Then the fault feature is enhanced and extracted by the MCKD with power spectral entropy and grid search. The rolling bearing fault signals of the QPZZ-II test bed are used to prove the effectiveness of the proposed STEWT-PGMCKD method.
The main contributions are summarized as follows:
• A novel feature extraction method is proposed that integrates the EWT with scale space threshold and an improved MCKD with power spectral entropy and grid search.
• The scale space threshold method is designed to improve the EWT so that it avoids falling into local extrema and mode over-decomposition.
• The power spectral entropy and the grid search method are combined to adaptively select and optimize the filter size L and the deconvolution period T of the MCKD.
The rest of this paper is arranged as follows. Section 2 presents the empirical wavelet transform with scale space threshold. The maximum correlation kurtosis deconvolution with power spectral entropy and grid search is described in Section 3. Section 4 introduces the novel fault diagnosis method. The experimental results on the rolling bearing fault signals of QPZZ-II are shown and analyzed in Section 5. Finally, the conclusion is given in Section 6.

II. EMPIRICAL WAVELET TRANSFORM WITH SCALE SPACE THRESHOLD
EWT is a new signal processing method proposed by Gilles, which extracts different AM-FM components of signal by adaptively selecting a set of wavelet filter banks according to the Fourier spectrum characteristics of signal. In order to select a suitable wavelet filter, the Fourier spectrum is defined in [0, π] in signal analysis according to Shannon criterion, and the Fourier spectrum is segmented adaptively. After the partition interval is determined, empirical wavelet is defined as band-pass filter in each interval. Gilles constructs empirical wavelet with empirical wavelet function and empirical scaling function according to Meyer wavelet construction method.
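The Meyer-type construction mentioned above can be illustrated with a short sketch. It assumes the scaling and wavelet functions of the standard EWT formulation in [28], which are not written out in this section; the names `beta`, `transition`, `admissible_gamma` and `ewt_filter_bank`, the example boundaries and the value of `gamma` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def beta(x):
    # Polynomial commonly used for Meyer-type transitions; beta(0) = 0 and beta(1) = 1.
    x = np.clip(x, 0.0, 1.0)
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)

def transition(omega, w_n, gamma):
    # Phase of the transition zone [(1 - gamma) w_n, (1 + gamma) w_n] around boundary w_n.
    arg = (np.abs(omega) - (1 - gamma) * w_n) / (2 * gamma * w_n)
    return 0.5 * np.pi * beta(arg)

def admissible_gamma(boundaries):
    # Upper bound on gamma so that neighbouring transition zones do not overlap.
    w = np.concatenate((np.asarray(boundaries, dtype=float), [np.pi]))
    return float(np.min((w[1:] - w[:-1]) / (w[1:] + w[:-1])))

def ewt_filter_bank(boundaries, omega, gamma):
    # boundaries: increasing interior segmentation points in (0, pi).
    rises = [np.sin(transition(omega, w, gamma)) for w in boundaries]
    falls = [np.cos(transition(omega, w, gamma)) for w in boundaries]
    filters = [falls[0]]                          # empirical scaling function (low-pass)
    for n in range(len(boundaries) - 1):
        filters.append(rises[n] * falls[n + 1])   # band-pass empirical wavelets
    filters.append(rises[-1])                     # last band extends up to pi
    return np.array(filters)

omega = np.linspace(0.0, np.pi, 1000)
example_boundaries = [0.5, 1.2, 2.0]              # hypothetical segmentation points
bank = ewt_filter_bank(example_boundaries, omega, gamma=0.2)
```

With a `gamma` below the bound returned by `admissible_gamma`, the squared filter responses sum to one over [0, π], which is the property that lets the components be recombined without loss.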
In the traditional EWT method, the frequency band is divided at the local minimum between two adjacent maxima of the Fourier spectrum, which easily falls into a local minimum and causes modal redundancy. In this paper, the frequency band is divided by the scale space threshold method. The scale space is built on the Fourier transform of the signal, and further operations on it yield the spectrum segmentation information. Based on the scale space representation, the STEWT method classifies the minima of the spectrum of the original signal with the fuzzy C-means method in order to obtain the spectrum division intervals. In this paper, the fuzzy C-means method divides the set of minimum length curves into 2 clusters (meaningful and non-meaningful). The original signal is thus decomposed into several intrinsic mode functions (IMFs). Finally, the Pearson correlation coefficient between each IMF and the original signal is calculated, the IMFs whose coefficient is larger than a set threshold are retained, and the power spectral density is then analyzed to extract the fault characteristic frequency.
Let $f(\omega)$ be the Fourier spectrum of the signal and let $g(\omega; t) = \frac{1}{\sqrt{2\pi t}} e^{-\omega^2/(2t)}$ be the Gaussian kernel; then the scale space is expressed as

$$L(\omega, t) = g(\omega; t) * f(\omega)$$

where $*$ represents convolution and $t$ represents the scale parameter.
In practical application, the Fourier transform $f(\omega)$ of the original signal and the sampled Gaussian kernel function are used to describe the scale space representation, which is shown as follows.

$$L(\omega, t) = \sum_{n=-M}^{M} f(\omega - n)\, g(n; t)$$

When $M$ is large enough, the error of the Gaussian approximation is negligible. In this paper, $M = C\sqrt{t} + 1$ with $3 \le C \le 6$, in order to satisfy the requirement that the error is small enough. Under the initial scale parameter ($\sqrt{t_0} = 0.5$), the minima between the local maxima can be obtained in the scale space representation $L(\omega, t_0)$. At $t = 0$, the number of local minima is recorded as $N_0$. Each of these local minima starts a "scale space curve" $C_i$ ($i \in [1, N_0]$). In the scale space plane $L(\omega, t)$, $N_0$ is also the number of minimum length curves $L_i$. As the scale $t$ increases, the number of local minima decreases until it reaches 0. The length of each minimum length curve is then calculated: starting from the first scale space curve $C_1$, which has length 1 and is represented by a local minimum of $f(\omega)$ at $t = 0$, the scale space curves at the successive scales $t_k$ ($k = 1, 2, \cdots, 2N_0$) are accumulated in turn to give the minimum length curves $L_i$ ($i = 1, 2, \cdots, N_0$). A proportional threshold $T$ can then be set on the lengths $L_i$, so that the minimum length curves longer than $T$ can be identified. This turns the problem into a clustering problem, so the fuzzy C-means method can be used to divide the set $\{L_i\}_{i \in [1, N_0]}$ into two clusters (meaningful and non-meaningful minima).
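The effect of the scale parameter can be sketched as follows: the spectrum is smoothed with a sampled Gaussian kernel truncated at M = C√t + 1, and the local minima are counted at each scale. This is a minimal sketch only. The function names, the choice C = 6, the zero padding at the spectrum edges and the synthetic spectrum are assumptions, and the tracking of individual minima across scales, which gives the curve lengths, is omitted.

```python
import numpy as np

def sampled_gaussian_kernel(t, C=6):
    # Kernel g(n; t) sampled on n = -M..M with M = C*sqrt(t) + 1 (rounded up).
    M = int(np.ceil(C * np.sqrt(t))) + 1
    n = np.arange(-M, M + 1)
    return np.exp(-n**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

def local_minima(v):
    # Indices of strict interior local minima.
    return np.flatnonzero((v[1:-1] < v[:-2]) & (v[1:-1] < v[2:])) + 1

def minima_counts(spectrum, sqrt_t_values, C=6):
    # Number of local minima of the smoothed spectrum at each scale sqrt(t).
    counts = []
    for s in sqrt_t_values:
        kernel = sampled_gaussian_kernel(s**2, C)
        smoothed = np.convolve(spectrum, kernel, mode="same")  # zero-padded edges
        counts.append(int(local_minima(smoothed).size))
    return counts

# Synthetic magnitude spectrum (assumption): two broad modes plus a fine ripple.
k = np.arange(256)
spectrum = (np.exp(-(k - 64)**2 / 800.0) + np.exp(-(k - 192)**2 / 800.0)
            + 0.05 * np.sin(np.pi * k / 2))
counts = minima_counts(spectrum, [0.5, 8.0])
```

At the fine scale the ripple produces many short-lived minima, while at the coarse scale only the valley between the two modes survives; minima that persist over many scales are the candidates for meaningful segmentation points.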

B. THE IMF SELECTION
Due to the strong noise interference in the actual operating environment, the IMFs have the following characteristics after EWT decomposition: 1) the fault information contained in each IMF differs; 2) the IMFs containing fault information still contain interference components, which disturb the power spectrum analysis of the signal, so the fault frequency cannot be extracted directly from the IMFs.
Therefore, this paper introduces the Pearson correlation coefficient: by comparing the correlation between each IMF and the original signal, the IMFs that correlate more strongly with the original signal are retained for further processing. For two sets of data $\{x_i\}$ and $\{y_i\}$, $i = 1, 2, \cdots, n$, the correlation coefficient can be expressed as:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the means of the data. The correlation coefficient $r$ ranges between −1 and 1. If $r = \pm 1$, there is a completely positive or negative linear correlation between $x$ and $y$; if $r = 0$, there is no linear correlation between them. Once the Pearson correlation coefficient of every IMF is obtained, a threshold must be set so that the IMFs carrying more fault information are retained as far as possible. The value of the threshold affects the signal processing results. Based on experimental experience, setting the threshold to 0.2 usually works well.
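The selection rule can be sketched in a few lines. The 0.2 threshold comes from the text; the function names and the synthetic components standing in for IMFs are assumptions made for illustration.

```python
import numpy as np

def pearson(x, y):
    # Pearson correlation coefficient written out from its definition.
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))

def select_imfs(imfs, signal, threshold=0.2):
    # Keep the components whose correlation with the signal exceeds the threshold.
    coeffs = [pearson(imf, signal) for imf in imfs]
    kept = [i for i, r in enumerate(coeffs) if r > threshold]
    return kept, coeffs

# Synthetic stand-ins for IMFs (assumption): two strong components and one weak one.
t = np.arange(1000) / 1000.0
imfs = [np.sin(2 * np.pi * 5 * t),
        0.5 * np.sin(2 * np.pi * 40 * t),
        0.01 * np.sin(2 * np.pi * 90 * t)]
signal = np.sum(imfs, axis=0)
kept, coeffs = select_imfs(imfs, signal)
reconstructed = np.sum([imfs[i] for i in kept], axis=0)
```

In this toy case the weak third component falls below the threshold and is dropped, and the reconstructed signal is the sum of the retained components.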

C. THE STEPS OF EWT WITH SCALE SPACE THRESHOLD
In this paper, the scale space threshold method is used to segment the frequency band, and the decomposed IMF components are screened by correlation analysis so that the components most relevant to the original signal are retained. The specific steps are described as follows.
Step 1. Get the Fourier transform f (ω) of the original signal f (x).
Step 3. At t = 0, the local minima of f (ω) are calculated. Taking these positions as the starting points, the scale space curves at the different scales t k (k = 1, 2, . . . 2N 0 ) are accumulated in turn, and the minimum length curve set {L i } i=[1,N 0 ] is obtained.
Step 4. Fuzzy C-means is used to cluster the minimum length curves in the set {L i } i=[1,N 0 ] . The minimum length curves are divided into two clusters (meaningful and non-meaningful).
Step 5. The minima in the meaningful cluster are taken as the segmentation points of the spectrum segmentation intervals.
Step 6. A band-pass filter based on the wavelet transform is established in each spectrum division interval.
Step 7. The IMF components are obtained by decomposing the vibration signal.
Step 8. Based on the Pearson correlation coefficient method, the correlation coefficient between each component and the original signal is calculated, and a threshold is set to retain the IMF components whose coefficient is larger than the threshold.
Step 9. The IMFs with the larger Pearson correlation coefficients are selected to reconstruct the signal.
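The clustering in Step 4 can be sketched with a small one-dimensional fuzzy C-means routine. This is a minimal sketch, assuming a fuzzifier m = 2 and initial centres at the smallest and largest curve lengths; the function names and the example lengths are hypothetical and do not come from the experiments.

```python
import numpy as np

def fuzzy_cmeans_1d(x, m=2.0, iters=100, tol=1e-9):
    # Two-cluster fuzzy C-means on scalar data.
    x = np.asarray(x, dtype=float)
    centers = np.array([x.min(), x.max()])
    u = np.full((x.size, 2), 0.5)
    for _ in range(iters):
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12   # distances to both centres
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)                   # memberships sum to one
        w = u ** m
        new_centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return centers, u

def meaningful_minima(curve_lengths):
    # Indices of the minima assigned to the cluster with the larger centre.
    centers, u = fuzzy_cmeans_1d(curve_lengths)
    labels = np.argmax(u, axis=1)
    return np.flatnonzero(labels == np.argmax(centers))

# Hypothetical minimum length curve lengths: most minima vanish quickly, three persist.
lengths = [1, 1, 2, 1, 2, 30, 28, 1, 33]
boundaries_idx = meaningful_minima(lengths)
```

The minima whose curves persist over many scales end up in the cluster with the larger centre, and their positions serve as the segmentation points of Step 5.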

III. MAXIMUM CORRELATION KURTOSIS DECONVOLUTION WITH POWER SPECTRAL ENTROPY AND GRID SEARCH

A. THE MCKD
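For a periodic impulse signal $x(n)$, let $h(n)$ be the attenuation response of the transmission path and $e(n)$ the mixed-in noise component. Then the measured signal $y(n)$ collected by the sensor can be expressed as

$$y(n) = h(n) * x(n) + e(n)$$

The MCKD seeks a finite impulse response filter $f = [f_1, f_2, \cdots, f_L]^{\mathsf T}$ that recovers the impulse signal from the measurement, $x(n) = \sum_{k=1}^{L} f_k\, y(n - k + 1)$. The expression for maximizing the correlation kurtosis is described as follows.

$$\max_{f} CK_M(T) = \max_{f} \frac{\sum_{n=1}^{N} \left( \prod_{m=0}^{M} x(n - mT) \right)^2}{\left( \sum_{n=1}^{N} x(n)^2 \right)^{M+1}}$$

The objective is differentiated with respect to the filter coefficients and the derivative is set to zero:

$$\frac{d}{d f_k} CK_M(T) = 0, \quad k = 1, 2, \cdots, L$$

The filter coefficients are then calculated as

$$f = \frac{\|x\|^2}{(M+1)\,\|\beta\|^2} \left( Y_0 Y_0^{\mathsf T} \right)^{-1} \sum_{m=0}^{M} Y_{mT}\, \alpha_m$$

where $T$ and $M$ represent the period of the impulse signal and the number of shifts, $f$ represents the filter vector of the finite impulse response filter, and $L$ represents the filter length. The scale of $f$ does not affect the correlation kurtosis. The remaining quantities in the filter solution are obtained as follows, with samples at non-positive indices taken as zero.

$$Y_r = \begin{bmatrix} y_{1-r} & y_{2-r} & \cdots & y_{N-r} \\ 0 & y_{1-r} & \cdots & y_{N-1-r} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & y_{N-L-r+1} \end{bmatrix}_{L \times N}$$

$$\alpha_m = \begin{bmatrix} x_{1-mT}^{-1} \left( x_1^2\, x_{1-T}^2 \cdots x_{1-MT}^2 \right) \\ \vdots \\ x_{N-mT}^{-1} \left( x_N^2\, x_{N-T}^2 \cdots x_{N-MT}^2 \right) \end{bmatrix}, \qquad \beta = \begin{bmatrix} x_1\, x_{1-T} \cdots x_{1-MT} \\ \vdots \\ x_N\, x_{N-T} \cdots x_{N-MT} \end{bmatrix}$$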
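The objective that the MCKD maximizes can be evaluated directly. The sketch below computes the correlated kurtosis of a sequence for a given period and number of shifts, following the usual definition from [38]; the function name and the synthetic impulse train are assumptions, and the iterative filter update itself is not shown.

```python
import numpy as np

def correlated_kurtosis(x, T, M=1):
    # CK_M(T) with samples before the start of the record taken as zero; T >= 1 in samples.
    x = np.asarray(x, dtype=float)
    prod = x.copy()
    for m in range(1, M + 1):
        shifted = np.zeros_like(x)
        if m * T < x.size:
            shifted[m * T:] = x[:x.size - m * T]   # x(n - mT)
        prod *= shifted
    return float(np.sum(prod**2) / np.sum(x**2) ** (M + 1))

# Synthetic impulse train with a period of 10 samples (assumption).
train = np.zeros(100)
train[::10] = 1.0
ck_matched = correlated_kurtosis(train, T=10, M=1)
ck_mismatched = correlated_kurtosis(train, T=7, M=1)
```

The value is large only when the assumed period matches the spacing of the impulses, which is why the choice of T decides whether the deconvolution can enhance the fault impulses.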

B. THE PARAMETER OPTIMIZATION OF THE MCKD
The filter size L and the deconvolution period T are the two important parameters of the MCKD, especially the deconvolution period T. The selection of T determines whether the MCKD can enhance the fault frequency, and the selection of L affects how clearly the fault frequency appears in the power spectrum. Because signals with different characteristics are processed, the appropriate parameters must be determined according to the characteristics of the signal. The performance of the MCKD therefore depends on the selection of these parameters.
Theoretically, the deconvolution period T can be calculated, and the formula is shown as follows.

$$T = \frac{f_s}{f_0}$$

where $f_s$ is the sampling frequency and $f_0$ is the fault frequency. However, in practical application the best value of T differs slightly from the theoretical value, so it is necessary to optimize the parameter T near the theoretical value. Generally, the selection range of the filter size L is 100-500. Because the filter size L and the deconvolution period T affect each other, it is important to study the adaptive selection of both parameters in order to improve the effect of the MCKD.
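As a worked example, the theoretical period can be computed from the sampling frequency and fault frequencies reported in Section V. The helper names are hypothetical, and the search half-width of 2 samples is an assumption inferred from the ranges used in the experiments.

```python
def deconvolution_period(fs, f0):
    # Theoretical deconvolution period T = fs / f0, rounded to whole samples.
    return int(round(fs / f0))

def search_range(T_theory, half_width=2, step=1):
    # Candidate periods in a small window around the theoretical value.
    return list(range(T_theory - half_width, T_theory + half_width + 1, step))

fs = 12000.0                                        # sampling frequency in Hz
T_inner = deconvolution_period(fs, 194.2)           # inner race fault frequency
T_roller = deconvolution_period(fs, 125.2)          # rolling element fault frequency
inner_candidates = search_range(T_inner)
roller_candidates = search_range(T_roller)
```

For these inputs the theoretical periods are 62 and 96 samples, and the candidate windows are 60 to 64 and 94 to 98.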
The search range of parameter $L$ is $[a, b]$ with step $n_1$, and the search range of parameter $T$ is $[c, d]$ with step $n_2$. All possible combinations of the values of $L$ and $T$ form the grid, which is described as

$$G = \{ (L_i, T_j) \mid L_i = a + i\, n_1 \le b,\; T_j = c + j\, n_2 \le d,\; i, j = 0, 1, 2, \cdots \}$$

Power spectral entropy reflects the distribution of signal energy, so it is used as the criterion for optimizing the parameters. As a parameter optimization method, grid search is simple and accurate, although its computational efficiency is lower than that of other methods. When the precision requirement is low, the computation can be sped up by increasing the step size.
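The proposed parameter optimization evaluation criteria based on power spectral entropy are defined as follows

$$p_i = \frac{S(f_i)}{\sum_{k=1}^{n} S(f_k)}, \qquad H = -\sum_{i=1}^{n} p_i \ln p_i \tag{11}$$

where $S(f_i)$ is the power spectrum of the enhanced signal at the $i$-th frequency point, $p_i$ is its share of the total power, $H$ is the power spectral entropy, and $i = 1, 2, \cdots, n$. The parameter pair on the grid with the minimum power spectral entropy is selected.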
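The search itself can be sketched as below, assuming the criterion is the Shannon entropy of the normalized power spectrum and that the smallest value wins. `enhance` is a placeholder for an MCKD routine with the hypothetical signature `enhance(signal, L, T)`; no MCKD implementation is included, and all names here are illustrative.

```python
import itertools
import numpy as np

def power_spectral_entropy(x):
    # Entropy of the normalized one-sided power spectrum (natural logarithm).
    psd = np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2
    total = psd.sum()
    if total == 0:
        return 0.0
    p = psd / total
    p = p[p > 0]                      # skip empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)))

def grid_search(signal, enhance, L_values, T_values):
    # Evaluate every (L, T) pair and keep the one with the lowest entropy.
    best = None
    for L, T in itertools.product(L_values, T_values):
        h = power_spectral_entropy(enhance(signal, L, T))
        if best is None or h < best[0]:
            best = (h, L, T)
    return best

# Grid matching the ranges reported for the inner race case in Section V.
L_values = range(100, 501, 5)
T_values = range(60, 65)
grid_size = len(L_values) * len(T_values)
```

A call such as `grid_search(x, my_mckd, L_values, T_values)` then returns the lowest entropy found together with the corresponding L and T, where `my_mckd` stands for whichever MCKD implementation is available. With a step of 5 for L and 1 for T this grid has 405 points, which illustrates the computational cost noted above.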

IV. A NOVEL FAULT DIAGNOSIS METHOD
In order to solve the problem that the fault signal of rolling bearings is weak and it is difficult to extract fault feature under strong noise and complex transmission path, a new feature extraction method based on EWT with scale space threshold and MCKD with power spectral entropy and grid search is proposed to realize the fault diagnosis of rolling bearings in this paper. Firstly, the AM-FM modal decomposition of vibration signal is carried out by using STEWT, and the correlation analysis of the decomposed modal components is carried out to remove the modal components whose correlation is less than the threshold. The remaining IMF will be used to reconstruct the signal. Then, the filter size L and impulse signal period T of the PGMCKD are adaptively selected, and then the reconstructed signal is enhanced in order to achieve feature extraction and fault diagnosis.
The specific process of the proposed STEWT-PGMCKD method is shown in Figure 1. The main steps of the proposed STEWT-PGMCKD method in this paper are described in detail as follows.
Step 1. The scale space threshold method is used to determine the frequency band division interval, and the wavelet filter bank is established to decompose the signal.
Step 2. The correlation between each decomposed IMF component and the original signal is analyzed, and the components whose correlation is less than the threshold are removed. The remaining IMFs are used to reconstruct the signal.
Step 3. The search range and step size of L and T in the MCKD are set. The optimization of parameter T takes into account the interference of the rotating frequency. The theoretical value of T is calculated first, and the search is then carried out in a small range near the theoretical value.
Step 4. The grid search method is used to optimize the two parameters of the MCKD, and the power spectral entropy evaluation criterion is used to determine the optimal parameters.
Step 5. The optimized MCKD is used to enhance the reconstructed signal.
Step 6. The enhanced signal is transformed by Hilbert transform, and analyzed by power spectrum in order to identify the fault feature frequency.
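Step 6 can be sketched with NumPy alone: the analytic signal is built through the FFT, its magnitude gives the envelope, and the power spectrum of the envelope is inspected for the fault frequency. The function names and the amplitude-modulated test signal are assumptions for illustration.

```python
import numpy as np

def analytic_signal(x):
    # Analytic signal via the FFT (discrete Hilbert transform construction).
    x = np.asarray(x, dtype=float)
    N = x.size
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def envelope_power_spectrum(x, fs):
    # Power spectrum of the mean-removed Hilbert envelope.
    env = np.abs(analytic_signal(x))
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env)) ** 2 / env.size
    freqs = np.fft.rfftfreq(env.size, d=1.0 / fs)
    return freqs, spec

# Synthetic test signal (assumption): a 3 kHz carrier modulated at 194 Hz.
fs = 12000
t = np.arange(fs) / fs
x = (1.0 + 0.8 * np.cos(2 * np.pi * 194 * t)) * np.sin(2 * np.pi * 3000 * t)
freqs, spec = envelope_power_spectrum(x, fs)
peak_hz = float(freqs[np.argmax(spec)])
```

In this example the largest peak of the envelope spectrum lies at the modulation frequency, which is the role the fault characteristic frequency plays in a real bearing signal.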

V. THE EXPERIMENT RESULTS AND ANALYSIS

A. THE EXPERIMENT PLATFORM
In this paper, the rolling bearing fault signals of the QPZZ-II test bed are used as experimental data. The bearing model is N205, the pitch diameter is 39 mm, there are 13 rolling elements with a diameter of 7.5 mm, the rotating speed is 1500 r/min, and the sampling frequency is 12000 Hz. According to this information, the fault characteristic frequency of the rolling element is about 125.2 Hz, the inner race fault characteristic frequency is about 194.2 Hz, and the rotating frequency $f_r$ is about 25 Hz. The QPZZ-II bearing fault test bed is shown in Figure 2.
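The characteristic frequencies can be checked with the standard kinematic formulas for rolling bearings. This is a sketch under stated assumptions: a contact angle of zero (the text does not give one), no slip, and the rolling element fault frequency taken as twice the roller spin frequency. The function name is hypothetical.

```python
import math

def bearing_frequencies(n_rollers, d, D, rpm, contact_angle_deg=0.0):
    # d: rolling element diameter, D: pitch diameter (same unit), rpm: shaft speed.
    fr = rpm / 60.0
    ratio = d / D * math.cos(math.radians(contact_angle_deg))
    return {
        "rotating": fr,
        "inner_race": 0.5 * n_rollers * fr * (1 + ratio),
        "outer_race": 0.5 * n_rollers * fr * (1 - ratio),
        "rolling_element": (D / d) * fr * (1 - ratio**2),
    }

freqs_n205 = bearing_frequencies(n_rollers=13, d=7.5, D=39.0, rpm=1500)
```

Under these assumptions the rolling element frequency comes out at about 125.2 Hz and the inner race frequency at about 193.8 Hz, close to the values quoted above; the small difference for the inner race may come from geometry details that are not stated here.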

B. THE EXPERIMENT RESULTS
Because the fault feature of the outer race is obvious, only the inner race and rolling element faults are considered in order to prove the effectiveness of the proposed fault diagnosis method.

1) INNER RACE FAULT
The time domain waveform and the power spectrum of the inner race fault signal of the rolling bearing are shown in Figure 3 and Figure 4.
As shown in Figure 3, the signal contains strong noise interference, and there is no obvious periodic component carrying fault information. Likewise, no useful information for diagnosing the inner race fault can be found in the power spectrum in Figure 4.
The proposed STEWT-PGMCKD method is then used for further analysis. Firstly, the EWT with scale space threshold is used to decompose the signal into AM-FM components. The result of the spectrum segmentation is shown in Figure 5. According to the spectrum segmentation, the signal is decomposed into several IMFs. The frequency components of each IMF are obtained by analyzing the power spectrum of the IMFs; the results and the Pearson coefficient of each IMF are shown in Table 1. As can be seen from Table 1, the EWT alone cannot directly reveal the fault frequency of the signal, so it is necessary to select the IMFs that contain more fault information for signal reconstruction and to process the reconstructed signal further.
In this paper, the correlation coefficient threshold is set to 0.2 according to experience. Therefore, IMF2, IMF6 and IMF7 are selected for signal reconstruction.
Then the reconstructed signal is processed by the MCKD with power spectral entropy and grid search. The theoretical value of T is 62, so the search range of T is [60, 64] with a step size of 1. The search range of L is [100, 500] with a step size of 5. According to the principle of minimum power spectral entropy, the optimized parameters [T, L] = [62, 400] are obtained. The final power spectrum is shown in Figure 6. As can be seen from Figure 6, the spectrum of the signal processed by the MCKD with power spectral entropy and grid search shows a component at 5.859 Hz (1/4 of the rotating frequency), the fault frequency at 193.4 Hz and its second harmonic at 386.7 Hz, so the inner race fault of the rolling bearing can be accurately diagnosed. Therefore, the experimental results show that the proposed STEWT-PGMCKD method can effectively diagnose the inner race fault.

2) ROLLING ELEMENT FAULT
The time domain waveform and the power spectrum of the rolling element fault signal of the rolling bearing are shown in Figure 7 and Figure 8. It can be seen from Figure 7 that the signal contains strong noise interference and has no obvious periodic component carrying fault information. Likewise, no useful information for diagnosing the rolling element fault can be found in the power spectrum in Figure 8.
The proposed STEWT-PGMCKD method is then used for further analysis. Firstly, the EWT with scale space threshold is used to decompose the signal into AM-FM components. The result of the spectrum segmentation is shown in Figure 9. The frequency components of each IMF are obtained by analyzing the power spectrum of the IMFs; the results and the Pearson coefficient of each IMF are shown in Table 2. As can be seen from Table 2, the EWT alone again cannot directly reveal the fault frequency of the signal, so IMF2, IMF3, IMF7 and IMF8 are selected for signal reconstruction. Then the reconstructed signal is processed by the MCKD with power spectral entropy and grid search. The theoretical value of T is 96, so the search range of T is [94, 98] with a step size of 1. The search range of L is [100, 500] with a step size of 5. According to the principle of minimum power spectral entropy, the optimized parameters [T, L] = [98, 460] are obtained for the MCKD. The final power spectrum is shown in Figure 10.
It can be seen from Figure 10 that the spectrum of the signal processed by the MCKD with power spectral entropy and grid search shows a component at 5.859 Hz (1/4 of the rotating frequency), the fault feature frequency at 123 Hz, and its second, third, fourth and fifth harmonics, so the rolling element fault of the rolling bearing can be accurately diagnosed. Therefore, the experimental results show that the proposed STEWT-PGMCKD method can effectively diagnose the rolling element fault.

C. THE COMPARISON AND ANALYSIS

1) THE COMPARISON OF SIGNAL DECOMPOSITION METHODS
In order to illustrate the effectiveness of the EWT in signal decomposition, the EWT is compared with the EMD, EEMD and VMD. All the signal enhancement methods adopt the MCKD method with power spectral entropy and grid search.
For the signal decomposed by EMD, some IMFs are selected for signal reconstruction according to the Pearson coefficient, and the reconstructed signal is then enhanced by the MCKD with power spectral entropy and grid search. The search range of L is [100, 500], the theoretical value of T is 96, the search range of T is [94, 98], and the optimized parameters are [T, L] = [94, 110]. The power spectrum is shown in Figure 11. For the signal decomposed by EEMD, some IMFs are likewise selected for signal reconstruction according to the Pearson coefficient, and the reconstructed signal is enhanced by the MCKD with power spectral entropy and grid search. The search range of L is [100, 500], the theoretical value of T is 96, the search range of T is [94, 98], and the optimized parameters are [T, L] = [94, 100]. The final power spectrum is shown in Figure 12.
In Figure 11 and Figure 12, the 1/3 harmonic of the fault frequency can be seen, but it is not obvious. Compared with the power spectrum obtained with the STEWT in Figure 10, the fault feature of the rolling element cannot be extracted clearly. Therefore, the comparison results show that the STEWT is more effective than the EMD and EEMD.
For the signal decomposed by VMD, some IMFs are selected for signal reconstruction according to the Pearson coefficient. Because VMD needs the number of modes K to be preset, this paper sets K to the number of IMFs obtained by EMD. The reconstructed signal is then enhanced by the MCKD with power spectral entropy and grid search. The search range of L is [100, 500], the theoretical value of T is 96, the search range of T is [94, 98], and the optimized parameters are [T, L] = [97, 145]. The power spectrum is shown in Figure 13. Figure 13 shows $f_r/4$ and the fault frequency of 123 Hz, but the fault frequency is not the most prominent frequency component in the figure, and no harmonic of the fault frequency appears. Although VMD is a clear improvement over EMD and EEMD, the fault feature of the rolling element still cannot be extracted as clearly as in the power spectrum obtained with the STEWT in Figure 10. The comparison results show that the STEWT is effective for signal feature extraction.

2) THE COMPARISON OF OPTIMIZATION METHODS FOR MCKD
Firstly, in order to illustrate the necessity of MCKD parameter optimization and the influence of L and T, the reconstructed signal is reprocessed by the MCKD with randomly selected parameters ([T, L] = [66, 250]). The result is shown in Figure 14. It can be seen from the figure that there is no fault information except the rotating frequency, and the distribution of the frequency components is disordered. This shows that the selection of the parameters has a great influence on the signal processing effect of the MCKD.
In order to further verify the effectiveness of the improved MCKD with power spectral entropy and grid search, it is compared with the MCKD and the MCKD with Shannon entropy. For the MCKD, T is 96 and L is 460, the same value of L as in the improved MCKD with power spectral entropy and grid search in this paper.
The MCKD with Shannon entropy uses the same search range and step size as the improved MCKD with power spectral entropy and grid search. Its final parameter optimization result is [T, L] = [95, 315]. The power spectra of the reconstructed signal processed by the two methods are shown in Figure 15 and Figure 16. It can be seen from Figure 15 that the MCKD can extract information related to the rotating frequency and the second harmonic of the fault frequency. Figure 16 shows that the fault frequency (123 Hz) can be extracted by the MCKD with Shannon entropy. However, compared with Figure 10, the extraction effect of the MCKD with Shannon entropy is not obvious. The improved MCKD with power spectral entropy and grid search is more effective for fault diagnosis.

VI. CONCLUSION
To address the problem of feature extraction from rolling bearing fault signals, this paper proposes a feature extraction method based on the EWT with scale space threshold and an improved MCKD with power spectral entropy and grid search to realize the fault diagnosis of rolling bearings. The scale space threshold is designed to improve the EWT, which is used to decompose the vibration signal, and it avoids mode over-decomposition and local extrema in the EWT. The correlation coefficient method is used to analyze the correlation of the decomposed components in order to remove redundant components. Then the power spectral entropy and grid search are used to improve the MCKD, which enhances the retained components in order to strengthen the weak fault signal. The effectiveness of the proposed STEWT-PGMCKD method is demonstrated on the rolling bearing fault signals of QPZZ-II. The comparisons with EMD, EEMD, VMD, MCKD and the MCKD with information entropy show that the proposed STEWT-PGMCKD method is an effective method for fault diagnosis.