Roller Bearing Fault Diagnosis Based on Partial Reconstruction Symplectic Geometry Mode Decomposition and LightGBM

Extracting the characteristic information of roller bearings from strong noise interference has long been an active and challenging problem. Conventional methods such as the Hilbert-Huang Transform (HHT), Local Mean Decomposition (LMD), and Local Characteristic-scale Decomposition (LCD) suffer from over-envelope, under-envelope, frequency confusion, and end-point effects. Symplectic Geometry Mode Decomposition (SGMD) is one of the most efficient approaches for reconstructing the existing modes of a signal. However, the computational efficiency of SGMD drops quickly as the quantity of data increases, and its decomposition precision is degraded by invalid Symplectic Geometric Components (SGCs). To address this, a Regularized Composite Multiscale Fuzzy Entropy (RCMFE) operator is proposed, which estimates the complexity of the initial single components and constrains the residual energy to a minimum. On this basis, this paper presents a Partial Reconstruction Symplectic Geometry Mode Decomposition (PRSGMD) approach. The simulation results indicate that PRSGMD improves not only the precision of SGMD but also its robustness and validity. Finally, a distance evaluation technique (DET) is combined with the more interpretable, tree-based Light Gradient Boosting Machine (LightGBM) for intelligent fault diagnosis of rolling bearings.


I. INTRODUCTION
The rolling bearing is an indispensable part of mechanical systems [1], [2]. A bearing fault triggers a chain of effects that can lead to mechanical damage of varying severity; in severe cases, it can cause a machine breakdown or even an accident. The state of the rolling bearing therefore directly influences the reliability of machine operation [3], [4], [5], [6]. To ensure the safety and stability of the system, rolling bearings need to be monitored and diagnosed in real time [7], [8], [9], [10]. However, because of the complex inner structure of the equipment and its severe operating conditions, the collected vibration signals are often multicomponent and mixed with noise-related vibration modes. The key to accurate diagnosis of a bearing failure is therefore to extract the fault characteristic information from a vibration signal that is contaminated by interference [11], [12], [13].
The associate editor coordinating the review of this manuscript and approving it for publication was Li He.
The conventional approaches to extracting characteristic data are Fourier-transform-based spectrum analysis, the short-time Fourier transform (STFT), the Wigner distribution (WD), and the wavelet transform (WT) [14], [15], [16], [17]. Because they rely on fixed basis functions, these approaches often produce results that do not capture the inherent characteristics of the signal [18], [19]. Many researchers have therefore studied how to select basis functions and their parameters according to the characteristics of the signal. Depending on how the basis functions are identified, adaptive signal decomposition methods can be classified as parametric or non-parametric.
In parametric adaptive decomposition, the optimal parameters or coefficients of the basis function are determined from the characteristics of the signal by a parameter adaptation algorithm and are then used in the decomposition process [20], [21]. In non-parametric adaptive decomposition, the signal is decomposed according to its own characteristics, and the basis has no explicit analytical expression [22], [23], [24].
Because the basis function and its parameters are chosen according to the characteristics of the signal, the decomposition results carry physical significance, and the intrinsic properties of a mechanical fault vibration signal can be extracted effectively. Compared with parametric adaptive decomposition, non-parametric adaptive decomposition does not need to build a dictionary library, so it is more adaptable. Currently, the most commonly used non-parametric adaptive signal analysis techniques include the Hilbert-Huang Transform (HHT) [25], LMD [26], and LCD [27]. These methods share the idea of decomposing a signal into single components that have physical meaning. Compared with other approaches, they offer better adaptability and physical significance, but they have two shortcomings. First, because they rely on fitting the extrema of the signal, they suffer from over-envelope, under-envelope, frequency confusion, and end effects. Second, the physical meaning attached to their definition of a single-component signal lacks a strict mathematical explanation.
SGMD was proposed by Pan et al. It decomposes a signal into a set of symplectic geometry components with independent modes [28]. SGMD solves for the eigenvalues of the Hamiltonian matrix and then reconstructs the single-component signals from the corresponding eigenvectors. SGMD requires no user-defined parameters, so it can reconstruct the existing modes efficiently and remove noise. However, SGMD has a drawback: its computational efficiency decreases rapidly as the amount of data increases, and invalid symplectic geometric components degrade the decomposition precision. To solve this problem, this paper uses Composite Multiscale Fuzzy Entropy (CMFE) to assess the complexity of every initial single component in SGMD efficiently and to overcome the variability of the original index [29]. First, an RCMFE operator is built to estimate the complexity of every initial single component while constraining the residual energy to a minimum; then, in combination with a partial reconstruction threshold, the reconstruction is terminated. This yields a new signal de-noising approach, Partial Reconstruction Symplectic Geometry Mode Decomposition (PRSGMD). Compared with the original SGMD, PRSGMD [30] only needs to process the portion of the initial single components that contains valid modes, so its computational efficiency does not deteriorate as the amount of data grows. At the same time, PRSGMD improves the decomposition precision by eliminating the influence of noise and other invalid modes on the decomposition results. Compared with other methods, PRSGMD performs better in de-noising and in extracting characteristic information.
The gradient boosting decision tree (GBDT) [31] is a typical machine learning model. Its core idea is to train multiple weak classifiers iteratively so that the gap between the target value and the predicted value is continuously reduced, which yields a model that trains well and is hard to overfit. The Light Gradient Boosting Machine (LightGBM) classification tree [32], one of the frameworks that implement the GBDT algorithm, supports efficient parallel training and offers shorter training times, lower memory requirements, improved accuracy, distributed support for handling massive volumes of data quickly, and better interpretability.
This paper first proposes the RCMFE operator to address the decrease in SGMD efficiency and decomposition accuracy as the data volume increases. Then, to address both the decomposition efficiency and the effect of weak, invalid initial single components on the decomposition accuracy, it proposes the Partial Reconstruction Symplectic Geometry Mode Decomposition method. Lastly, to improve the interpretability and diagnostic precision of the model, a LightGBM tree model is applied to identify rolling bearing failures.
The remainder of this paper is arranged as follows. In Section II, the PRSGMD approach is developed on the basis of SGMD. In Section III, the LightGBM and DET algorithms are introduced. In Section IV, simulation signals are used to compare PRSGMD, SGMD, VMD, and EEMD. In Section V, PRSGMD is combined with LightGBM to recognize the fault type of rolling bearings.

II. THE THEORY OF THE PRSGMD
A. SYMPLECTIC GEOMETRY MODE DECOMPOSITION
SGMD uses a symplectic geometry similarity transformation to solve for the eigenvalues of the Hamiltonian matrix and then reconstructs the symplectic geometry components (SGCs) from the corresponding eigenvectors, thereby decomposing a complex signal. SGMD comprises three main steps.

1) PHASE SPACE RECONSTRUCTION
Let the raw time series be $x = (x_1, x_2, \ldots, x_n)$, where $n$ is the length of the data. By Takens' embedding theorem, a time-delay embedding that is topologically equivalent to the one-dimensional signal $x$ gives the multidimensional trajectory matrix $X$:

$$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(d-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_m & x_{m+\tau} & \cdots & x_{m+(d-1)\tau} \end{bmatrix}$$

where $d$ is the embedding dimension, $\tau$ is the delay time, and $m = n-(d-1)\tau$.

2) SYMPLECTIC GEOMETRIC INITIAL SINGLE COMPONENT OBTAINING
To build the Hamiltonian matrix, the symmetric covariance matrix $A$ is first obtained from the trajectory matrix, $A = X^{T}X$, and the Hamiltonian matrix is formed as $M = \begin{bmatrix} A & 0 \\ 0 & -A^{T} \end{bmatrix}$. Decomposing the matrix $A^2$ yields the eigenvector matrix $Q$, where $Q_i$ ($i = 1, 2, \ldots, d$) is the eigenvector of the matrix $A$ corresponding to the eigenvalue $\sigma_i$.
The transformation coefficient matrix $S_i = Q_i^{T}X^{T}$, obtained from the eigenvectors of the unitary matrix and the trajectory matrix, is transformed into the initial single-component matrix $Z$, with $Z_i = Q_i S_i$ and $Z = Z_1 + Z_2 + \cdots + Z_d$.
The initial single-component matrix $Z$ is transformed by diagonal averaging to obtain the symplectic geometry initial single components.
3) SINGLE COMPONENT RECONSTRUCTION
The trajectory matrix decomposition yields $d$ single-component signals. At this point, however, the initial single components are not all independent of one another: several of them may share the same periodic content, the same frequency components, and so on. Consequently, the initial single components have to be reconstructed. SGMD uses periodic similarity as the evaluation index. In PRSGMD, $SGC_1$ is defined as the first component containing a valid symplectic geometry mode of the signal, and $g_1$ is the residual obtained by subtracting $SGC_1$ from the original signal after the decomposition is completed. $SGC_1$ is obtained by reconstructing the highly similar components, which then take no further part in the reconstruction. The remaining components are summed to generate a residual signal, and the normalized mean squared error (NMSE) between the residual signal and the original signal is computed. If this value falls below a given threshold, the decomposition ends; otherwise, the remaining component matrix is taken as the new initial matrix and the process is repeated until the end of the iteration.
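The first two SGMD steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes $\tau = 1$ for the diagonal averaging, and it swaps in `numpy.linalg.eigh` on the symmetric matrix $A = X^{T}X$ in place of the symplectic similarity transformation of the Hamiltonian matrix. The function names are hypothetical.

```python
import numpy as np

def trajectory_matrix(x, d, tau=1):
    """Build the m x d trajectory matrix with m = n - (d - 1) * tau."""
    x = np.asarray(x, dtype=float)
    m = len(x) - (d - 1) * tau
    if m < 1:
        raise ValueError("d and tau are too large for this signal length")
    # Row i holds x[i], x[i + tau], ..., x[i + (d - 1) * tau].
    idx = np.arange(m)[:, None] + tau * np.arange(d)[None, :]
    return x[idx]

def diagonal_average(Z):
    """Average the anti-diagonals of a d x m matrix (tau = 1) into a series."""
    d, m = Z.shape
    total = np.zeros(d + m - 1)
    count = np.zeros(d + m - 1)
    for r in range(d):
        # Entry Z[r, c] corresponds to time index r + c.
        total[r:r + m] += Z[r]
        count[r:r + m] += 1
    return total / count

def initial_single_components(x, d):
    """Return the d initial single components of x (one per row)."""
    X = trajectory_matrix(x, d, tau=1)   # m x d
    A = X.T @ X                          # d x d symmetric matrix
    # Assumption: eigh on A stands in for the symplectic QR step.
    _, Q = np.linalg.eigh(A)
    Q = Q[:, ::-1]                       # largest eigenvalue first
    components = []
    for i in range(d):
        q = Q[:, i:i + 1]                # d x 1 eigenvector Q_i
        S = q.T @ X.T                    # 1 x m coefficient matrix S_i
        Z = q @ S                        # d x m single-component matrix Z_i
        components.append(diagonal_average(Z))
    return np.array(components)
```

Because the eigenvectors are orthonormal, the $Z_i$ sum to $X^{T}$, so the initial single components add back up to the original series. This is a convenient sanity check before any reconstruction step is applied.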

B. COMPOSITE MULTISCALE FUZZY ENTROPY
Composite Multiscale Fuzzy Entropy (CMFE) is applied to estimate and rank the complexity of every initial single component.
CMFE uses fuzzy entropy to overcome the variability of the similarity measure for the SGC components. To reduce the variance introduced by coarse-graining the time series, the fuzzy entropies of all coarse-grained series at the same scale factor are averaged. CMFE is calculated as follows. First, for a given signal $x$ with $N$ data points, the coarse-grained time series $y_k^{(\rho)} = \{y_{k,j}^{(\rho)}\}$ are computed for scale factor $\rho$:

$$y_{k,j}^{(\rho)} = \frac{1}{\rho}\sum_{i=(j-1)\rho+k}^{j\rho+k-1} x_i, \quad 1 \le j \le N/\rho, \quad 1 \le k \le \rho$$

Then, for every scale factor, the fuzzy entropy of each coarse-grained series $y_k^{(\rho)}$ ($1 \le k \le \rho$) is computed, and the $\rho$ entropy values are averaged. With $FE(\cdot)$ denoting the fuzzy entropy of a signal, the CMFE at this scale factor is:

$$CMFE(x, \rho) = \frac{1}{\rho}\sum_{k=1}^{\rho} FE\left(y_k^{(\rho)}\right)$$
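The coarse-graining and averaging described above can be sketched as follows. The fuzzy entropy parameters are assumptions that the text does not specify: embedding dimension `m = 2`, fuzzy power `n = 2`, tolerance `r = 0.15` times the standard deviation of the input, and the membership function $\exp(-(d/r)^n)$, which is one common form. The number of coarse-grained points is truncated for offsets $k > 1$ so that no window runs past the end of the signal.

```python
import numpy as np

def fuzzy_entropy(x, m=2, r=0.2, n=2):
    """Fuzzy entropy with exponential membership exp(-(dist / r) ** n)."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def phi(dim):
        # Use N - m templates for both dimensions so the counts match.
        vecs = np.array([x[i:i + dim] for i in range(N - m)])
        vecs = vecs - vecs.mean(axis=1, keepdims=True)   # remove local baseline
        dist = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        sim = np.exp(-(dist / r) ** n)
        np.fill_diagonal(sim, 0.0)                       # exclude self-matches
        count = N - m
        return sim.sum() / (count * (count - 1))

    return np.log(phi(m)) - np.log(phi(m + 1))

def coarse_grain(x, rho, k):
    """k-th coarse-grained series (k is 1-based) at scale factor rho."""
    x = np.asarray(x, dtype=float)
    n_seg = (len(x) - (k - 1)) // rho
    seg = x[k - 1:k - 1 + n_seg * rho]
    return seg.reshape(n_seg, rho).mean(axis=1)

def cmfe(x, rho=3, m=2, r=None, n=2):
    """Average fuzzy entropy over the rho coarse-grained series."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.15 * np.std(x)   # assumption: tolerance tied to the input signal
    values = [fuzzy_entropy(coarse_grain(x, rho, k), m, r, n)
              for k in range(1, rho + 1)]
    return float(np.mean(values))
```

With these settings, a white-noise series scores a higher CMFE than a smooth sinusoid of the same length, which matches the later observation that noisy components are more complex than components carrying fault information.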

C. PARTIAL RECONSTRUCTION SYMPLECTIC GEOMETRY MODE DECOMPOSITION
After the initial single-component matrix $Z$ is converted by diagonal averaging, SGMD obtains $d$ initial symplectic geometry single components $Y_i$, where $d$ is the embedding dimension, generally set to $n/3$. When SGMD is applied to data of length $n$, it therefore has to iterate over the initial single components repeatedly in order to compare the similarity of each one with the others. As the amount of data increases, the computational load of SGMD rises quickly and the computing time is prolonged, which limits the practicality and validity of SGMD. Moreover, initial single components that contain invalid modes, such as noise, are not differentiated during reconstruction, which reduces the decomposition precision of the SGCs.
The PRSGMD iteration procedure for the signal is described below and illustrated in FIGURE 1.
(3) The initial single-component matrix $Y = [y_1(t), y_2(t), \ldots, y_d(t)]$ is acquired by means of diagonal averaging, where $y_k(t)$ is the $k$th initial single component.
(4) Compute the RCMFE operator for every initial single component.

RCMFE[y
Here, the scale factor of CMFE is generally set to 3, and CMFE measures the complexity of the initial single components at various scales. Characteristic information and noise are usually distributed at different scales. An initial single component dominated by noise is more complex than one that carries the fault characteristic information. The energy of an initial single component can be estimated by $\|y_k(t)\|_2^2$: the larger the component energy, the smaller the decomposition residue.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
First, RCMFE rearranges the initial single-component matrix $Y$. Provided that the decomposed components remain valid, the fault characteristics and the noise are separated by quantifying their complexity. $Y$ is therefore sorted by the RCMFE values of $y_k(t)$ from largest to smallest, giving the rearranged components $y'_k(t)$.
(5) Construct $u = RCMFE(y'_k(t))/RCMFE(r_1(t))$ as the partial reconstruction threshold index, and select the initial single components with $u > 0.001$. This gives a partially reconstructed initial single-component matrix that contains the important modes of the original signal and leaves out a large number of weak, invalid components, which take no part in the reconstruction. The computational load is thereby decreased and the decomposition speed is increased.
(6) Obtain $SGC_i(t)$ from the partially reconstructed initial single-component matrix.
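The sorting and thresholding in step (5) can be illustrated with precomputed scores. The sketch below is hypothetical in two respects: the RCMFE values are passed in as plain numbers because the operator's formula is not reproduced here, and the reference term $RCMFE(r_1(t))$ is assumed to be the score of the top-ranked component, which the surviving text does not state explicitly.

```python
import numpy as np

def select_partial_components(components, scores, threshold=1e-3):
    """Sort components by score (descending) and keep those with u > threshold.

    components: array of shape (d, n), one initial single component per row.
    scores: array of shape (d,), one precomputed RCMFE value per component.
    """
    components = np.asarray(components, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]            # largest score first
    sorted_scores = scores[order]
    # Assumption: the reference score is that of the top-ranked component.
    u = sorted_scores / sorted_scores[0]
    keep = u > threshold
    return components[order][keep], order[keep], u
```

Only the retained rows are passed on to the reconstruction, so the number of pairwise similarity comparisons shrinks with the number of discarded components.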

III. LIGHT GRADIENT BOOSTING DECISION TREES
A. LIGHT GRADIENT BOOSTING DECISION TREES
LightGBM, a gradient boosting tree method in ensemble learning proposed by Microsoft in 2017 [30], is currently one of the best-performing boosting methods. With the traditional GBDT method, the computational complexity and memory usage of the model increase greatly as the volume of sample data and the number of features grow.
LightGBM uses the Gradient-based One-Side Sampling (GOSS) algorithm to decrease the volume of training data. GOSS determines the sampling weights from the gradient values: it retains the data with large gradients (i.e., under-trained samples, which contribute more to the information gain) and randomly samples the data with small gradients while maintaining the original data distribution, which gives a more accurate estimate of the information gain than uniform random sampling. To reduce the number of features during training, LightGBM uses the Exclusive Feature Bundling (EFB) algorithm, which bundles mutually exclusive features in a high-dimensional feature space (features that are never nonzero at the same time, e.g., one-hot encoded features) into a single feature, thereby decreasing the feature dimension and increasing the training speed without affecting accuracy.
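The samples whose gradients rank in the top $a \times 100\%$ by absolute value form subset $A$, while the remaining samples form subset $A^c$. A subset $B$ of size $b \times |A^c|$ is then randomly sampled from $A^c$. Finally, the instances are split on the subset $A \cup B$ according to the estimated variance gain $\tilde{V}_j(d)$, which is calculated as follows:

$$\tilde{V}_j(d) = \frac{1}{n}\left[\frac{\left(\sum_{x_i \in A_l} g_i + \frac{1-a}{b}\sum_{x_i \in B_l} g_i\right)^2}{n_l^j(d)} + \frac{\left(\sum_{x_i \in A_r} g_i + \frac{1-a}{b}\sum_{x_i \in B_r} g_i\right)^2}{n_r^j(d)}\right]$$

where $g_i$ is the gradient of sample $i$, $d$ is the split point of feature $j$, $A_l$, $B_l$ and $A_r$, $B_r$ are the parts of $A$ and $B$ that fall on the left and right of the split, $n_l^j(d)$ and $n_r^j(d)$ are the corresponding sample counts, and $n$ is the total number of samples considered.
High-dimensional features tend to be sparse, and many of them are mutually exclusive in their sparsity patterns. To decrease the feature dimensionality, EFB bundles multiple mutually exclusive features into one by means of a feature scanning scheme. The complexity of histogram construction is thereby reduced from O(data × feature) to O(data × bundle). Because the number of bundles does not exceed the number of features, significantly fewer features have to be scanned. With this approach, LightGBM training is sped up dramatically without losing feature information.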
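The GOSS sampling step described above can be sketched with NumPy. This is an illustrative sketch, not LightGBM's internal code: the function name and default values of `a` and `b` are hypothetical, the subset $B$ has size $b \times |A^c|$ as stated in the text, and the small-gradient samples receive the factor $(1-a)/b$ that appears in the estimated variance gain.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, rng=None):
    """Return the indices of A union B and the weight applied to each gradient."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.asarray(gradients, dtype=float)
    order = np.argsort(-np.abs(g))        # sort by |gradient|, descending
    top_k = int(a * len(g))
    subset_a = order[:top_k]              # large-gradient samples, all kept
    rest = order[top_k:]                  # complement A^c
    n_b = int(b * len(rest))
    subset_b = rng.choice(rest, size=n_b, replace=False)
    weights = np.ones(top_k + n_b)
    weights[top_k:] = (1.0 - a) / b       # amplify the sampled small gradients
    return np.concatenate([subset_a, subset_b]), weights
```

For 100 samples with `a = 0.2` and `b = 0.1`, the call keeps the 20 samples with the largest absolute gradients and draws 8 of the remaining 80, so each boosting iteration evaluates splits on 28 samples instead of 100.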

B. FEATURE VALUE EXTRACTION AND SELECTION
When a rolling bearing fault occurs, the time-domain signal differs from that of the normal condition, and the corresponding spectral distribution and amplitudes differ as well. Statistical parameters of the time and frequency domains are therefore often used to extract fault features, as shown in TABLE 1.
PRSGMD extracts the fault characteristics from the vibration signal of a rolling bearing by adaptively decomposing the raw signal into several intrinsic narrow-band components. Each SGC represents an intrinsic vibration mode, so extracting features from the SGCs is often more effective than extracting them directly from the raw signal. To obtain richer failure information, both time-domain and frequency-domain statistics are computed from the SGCs produced by PRSGMD.
Before the statistical parameters are extracted from the SGCs, the most representative SGCs containing rolling bearing fault information must be identified. Owing to the filtering properties of PRSGMD, the SGCs with lower signal complexity and higher periodicity are always extracted first in the PRSGMD iteration process. Considering these two factors, the first three SGCs are selected for extracting the statistical parameter features.
Although the metrics in TABLE 1 allow the fault classes to be recognized from various perspectives, their sensitivity to different faults is not the same. Some parameters are closely correlated with the faults, while others are not. If all of them are used to train a classifier, the recognition accuracy is reduced. To improve accuracy, the salient features that match the fault information must be selected and the uncorrelated or redundant ones discarded. The distance evaluation technique (DET) is used to choose the important features. The basic principle of DET is to choose the features with low intra-class variance and significant inter-class variability. Let $p_{i,j,k}$ be the $j$th statistical parameter of the $k$th sample in the $i$th category, where $C$ is the number of categories and $N_i$ is the number of samples in category $i$. The effective factor is computed as follows.
First, the mean distance between the samples of the same category is calculated:

$$d_{i,j} = \frac{1}{N_i(N_i-1)}\sum_{l=1}^{N_i}\sum_{k=1}^{N_i}\left|p_{i,j,l}-p_{i,j,k}\right|, \quad l \neq k$$

Then the average distance over all categories is obtained:

$$d_j^{(w)} = \frac{1}{C}\sum_{i=1}^{C} d_{i,j}$$

The mean value of each parameter over the samples of the same category is:

$$u_{i,j} = \frac{1}{N_i}\sum_{k=1}^{N_i} p_{i,j,k}$$

The average distance between the parameter means of the different categories is:

$$d_j^{(b)} = \frac{1}{C(C-1)}\sum_{i=1}^{C}\sum_{e=1}^{C}\left|u_{i,j}-u_{e,j}\right|, \quad i \neq e$$

Finally, the effective factor is obtained:

$$\alpha_j = \frac{d_j^{(b)}}{d_j^{(w)}}$$

To select features, the effective factors are normalized by their maximum value:

$$\bar{\alpha}_j = \frac{\alpha_j}{\max_j(\alpha_j)}$$

where $\bar{\alpha}_j$ is the normalized effective factor of the $j$th statistical parameter, with a value range of 0 to 1.
There is no universal fixed threshold for the effective factor. This paper sets the threshold to 0.5, and a statistical parameter whose normalized effective factor exceeds 0.5 is taken as a significant feature. The significant features differ in their amplitude ranges. Consequently, the significant feature parameters are normalized as follows:

$$\bar{f}_{i,j} = \frac{f_{i,j}}{\max(|f_j|)}, \quad i = 1, 2, \ldots, l, \quad j = 1, 2, \ldots, J'$$

where $f_{i,j}$ is the $j$th significant parameter of the $i$th data sample, $J'$ is the total number of significant features, $l$ is the total number of samples, and $\max(|f_j|)$ is the highest absolute value of the $j$th significant parameter over all categories.
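The DET selection and the subsequent normalization can be sketched as follows. The sketch assumes the basic form of DET (mean absolute pairwise distances within and between categories), a feature matrix with one row per sample and one column per statistical parameter, and hypothetical function names; the threshold of 0.5 is the value stated above.

```python
import numpy as np

def det_effective_factors(features, labels):
    """Normalized effective factor of each feature column (values in 0 to 1)."""
    features = np.asarray(features, dtype=float)   # shape (samples, J)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    within, means = [], []
    for c in classes:
        p = features[labels == c]                  # samples of one category
        n_i = len(p)
        pair = np.abs(p[:, None, :] - p[None, :, :])
        within.append(pair.sum(axis=(0, 1)) / (n_i * (n_i - 1)))
        means.append(p.mean(axis=0))
    d_within = np.mean(within, axis=0)             # average over categories
    u = np.array(means)                            # shape (C, J)
    n_c = len(classes)
    d_between = np.abs(u[:, None, :] - u[None, :, :]).sum(axis=(0, 1))
    d_between /= n_c * (n_c - 1)
    alpha = d_between / d_within
    return alpha / alpha.max()

def select_and_normalize(features, labels, threshold=0.5):
    """Keep features whose factor exceeds the threshold, scaled by max |value|."""
    features = np.asarray(features, dtype=float)
    alpha = det_effective_factors(features, labels)
    keep = alpha > threshold
    selected = features[:, keep]
    return selected / np.abs(selected).max(axis=0), keep, alpha
```

A feature whose category means are well separated relative to its within-category scatter receives a factor close to 1, whereas a feature that is pure noise across categories receives a factor close to 0 and is dropped.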

IV. SIMULATION ANALYSIS
First, a simulated signal $y(t)$ is constructed and decomposed by PRSGMD, SGMD, EEMD, and VMD. The components obtained by the four methods are then verified and compared to demonstrate the superiority of PRSGMD over the other methods.
$y(t)$ contains both a decaying vibration signal and Gaussian white noise. $y(t)$ and its two constituents are shown in the time domain in FIGURE 2. The SNR of the white Gaussian noise (WGN) is −20 dB. PRSGMD, SGMD, VMD, and EEMD decompositions are then performed on $y(t)$.
FIGURE 3 presents the decomposition results, where subgraphs (a), (b), (c), and (d) correspond to the PRSGMD, SGMD, VMD, and EEMD decompositions, respectively. At an SNR of −20 dB, the amplitude of the $IMF_1$ component from EEMD fluctuates far outside the amplitude limit of the original signal $y_1(t)$, which is the worst decomposition result. The $IMF_1$ component from VMD partially exceeds the limit and is influenced by noise. The amplitude of the $SGC_1$ component from SGMD exceeds the amplitude limit of $y_1(t)$, indicating that the strong noise has caused some distortion, although its waveform is largely consistent with that of $y_1(t)$. In contrast, the $SGC_1$ component from PRSGMD closely matches $y_1(t)$ and preserves its overall features.
The instantaneous amplitude (IA) and instantaneous frequency (IF) of the $IMF_1$ and $SGC_1$ components are obtained by applying the Hilbert transform to them, as shown in FIGURE 4. The IA and IF errors are obtained as the absolute differences between the IA and IF of each extracted component and those of $y_1(t)$. Comparing the IA and IF of the first components obtained by EEMD, VMD, SGMD, and PRSGMD with those of the true signal shows that PRSGMD yields a more accurate and more stable $SGC_1$ than the other three methods, which illustrates the better decomposition capability of PRSGMD.
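The IA and IF computation can be sketched with an FFT-based analytic signal. The sketch is self-contained and builds the analytic signal directly with `numpy.fft` rather than calling a signal-processing library; the sampling rate `fs` and the function names are assumptions made for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    if n % 2 == 0:
        h[0] = h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[0] = 1.0
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * h)

def instantaneous_amplitude_frequency(x, fs):
    """Return the IA (same length as x) and the IF in Hz (one sample shorter)."""
    z = analytic_signal(x)
    ia = np.abs(z)
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    return ia, inst_freq
```

The IA and IF errors then follow as the absolute differences between the curves of an extracted component and those of the reference signal. End effects of the Hilbert transform are not treated here.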
To further compare how well the $IMF_1$ and $SGC_1$ components obtained by PRSGMD, SGMD, VMD, and EEMD agree with the true signals, the energy error $E_i$ and the correlation coefficient $r_i$ are used as assessment metrics. The assessment metrics for the first component of each method are shown in TABLE 2. As shown in TABLE 2, compared with the other three methods, the components decomposed by PRSGMD have higher correlation coefficients and lower energy errors, so they are closer to the true signals. Additionally, the computational time $T$ is used to evaluate the computational efficiency. Although PRSGMD decomposes faster than SGMD, it is still slower than the remaining two methods, EEMD and VMD.
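The two assessment metrics can be computed as sketched below. The correlation coefficient is the standard Pearson coefficient. The text does not give the formula for the energy error, so the relative energy difference used here is an assumed definition chosen only for illustration.

```python
import numpy as np

def correlation_coefficient(component, reference):
    """Pearson correlation between a decomposed component and the true signal."""
    return float(np.corrcoef(component, reference)[0, 1])

def energy_error(component, reference):
    """Assumed definition: relative difference between the two signal energies."""
    e_ref = np.sum(np.square(reference))
    e_comp = np.sum(np.square(component))
    return float(abs(e_comp - e_ref) / e_ref)
```

Under this definition a perfect reconstruction gives a correlation coefficient of 1 and an energy error of 0, and a component with twice the amplitude of the reference gives an energy error of 3.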
To verify the noise resistance of the proposed method, the signal $y'(t)$ described in Eq. (20) is constructed. It is composed of a damped signal and white Gaussian noise and simulates an actual fault. FIGURE 6 shows $y'(t)$ and its constituents. The SNR is set to 5 dB, −10 dB, and −20 dB in turn, and PRSGMD, SGMD, VMD, and EEMD are applied to decompose $y'(t)$.
FIGURES 7-9 show the corresponding results for $y'(t)$. To quantify the noise immunity, the relevant assessment measures are presented in TABLE 3. FIGURES 7-9 show that at an SNR of 5 dB, i.e., when the noise is comparatively low, all four decompositions separate the signal from the noise efficiently. Although the VMD result has burrs and is not sufficiently smooth, the waveform of its effective component $IMF_1$ is similar to that of the damped signal in the simulated signal. The decomposition precision is assessed using the correlation coefficients in TABLE 3.
The analysis shows that SGMD, VMD and EEMD have better performance than PRSGMD. In terms of the time-domain mean, the VMD and EEMD residuals agree better with the additive noise, whereas PRSGMD does not separate the signal and the noise completely, so the waveform of its residual contains a weak AM feature. As the Gaussian white noise is strengthened, the efficiency and precision of all these approaches are reduced to different extents. At an SNR of −10 dB, although the VMD component $IMF_1$ generally oscillates, sawtooth-like fluctuations over the whole time domain severely obscure the decay characteristic. Meanwhile, the EEMD component $IMF_{10}$ shows significant waveform fluctuations relative to $y_2(t)$, leading to more severe signal aliasing. The amplitude fluctuations of the EEMD component $IMF_{10}$ are relatively small, but it has no significant waveform features.
129066 VOLUME 11, 2023
In contrast, the $SGC_1$ component of PRSGMD, the improved SGMD method, exhibits only slight distortion at certain peaks and valleys and a slight decrease in its AM/FM characteristics, so it generally retains the original signal $y_2(t)$. The correlation coefficient $r_1$ of PRSGMD is around 0.8, and its energy error $E_1$ is the smallest. PRSGMD can thus achieve good results even in strong noise environments.
Conventional approaches such as HHT, LCD, and LMD suffer from over-envelope, under-envelope, frequency confusion, and end-point effects. SGMD has the merit that it requires no user-defined parameters and can reconstruct the existing modes efficiently and remove noise. However, its computational efficiency drops quickly as the amount of data increases, and invalid symplectic geometric components degrade its decomposition precision. PRSGMD has been put forward to improve both the precision and the computational efficiency of SGMD, although it is still not as efficient as EEMD.
The simulation analysis shows that PRSGMD is more precise than the other methods, and that the performance of VMD and EEMD decreases greatly as the noise strength increases. Although SGMD has some noise resistance, its decomposition cannot meet the requirements under high-intensity noise. PRSGMD is well suited to separating the signal from the noise and achieves good resolution under low-SNR interference. In terms of computing efficiency, PRSGMD has a shorter decomposition time than SGMD, but it is still slower than VMD and EEMD. The PRSGMD approach requires optimization of its filter parameters to decrease the decomposition time and increase the decomposition efficiency.

$$y'(t) = y_2(t) + n(t), \quad y_2(t) = e^{-0.8t}\sin\left[30\pi t + \cos(3\pi t^2)\right] \quad (20)$$
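The noisy test signal of Eq. (20) can be generated as sketched below. The sampling rate, duration, and random seed used in the usage note are assumptions, because the text does not state them, and the exponent in $\cos(3\pi t^2)$ is read from the extracted equation. The noise is rescaled so that the realized SNR equals the target exactly.

```python
import numpy as np

def damped_am_fm(t):
    """Clean component y2(t) of Eq. (20)."""
    return np.exp(-0.8 * t) * np.sin(30 * np.pi * t + np.cos(3 * np.pi * t ** 2))

def add_white_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.standard_normal(len(signal))
    noise *= np.sqrt(noise_power / np.mean(noise ** 2))   # exact target power
    return signal + noise, noise
```

For example, with an assumed sampling rate of 1024 Hz over one second, `t = np.arange(1024) / 1024` and `add_white_noise(damped_am_fm(t), -10, np.random.default_rng(0))` return one realization of $y'(t)$ at −10 dB together with the noise term $n(t)$.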

V. THE APPLICATION OF THE PRSGMD METHOD IN ROLLING BEARING FAULT DIAGNOSIS
To further demonstrate the superiority and practicality of the PRSGMD approach, it is applied to measured bearing vibration signals, and the relevant analyses are carried out. The bearing failure simulation test bench is shown in FIGURE 10. It consists of rolling bearings, motors, acceleration sensors, a load pressurization device, a BK test module, and so on. A double half inner ring bearing of model QJ305M was used for failure simulation, and the vibration sensor was attached with a magnetic mount to the housing of the faulty bearing. In the experiment, the BK vibration system was used for vibration data acquisition, the rotational speed of the shaft was set to 1800 rpm, the sampling frequency of the acquisition card was 8192 Hz, and the sampling experiment was repeated ten times, each time collecting 10 seconds of vibration data. For the different types of bearing failure, the sample data of deep outer ring failure (0.8 mm), deep inner ring failure (0.8 mm), deep cage failure (0.8 mm), rolling element failure, and failure-free operation are used, and the feature values are selected with the DET method. The selected feature vectors for the different failure types are shown in FIGURES 15 (a)-(d) (the significant features are marked with circles). The PRSGMD decomposition results yield 29 significant features, the SGMD results 24, the VMD results 16, and the EEMD results 11, which demonstrates that the PRSGMD method is capable of extracting more information about the failure features.
Along the same lines, for the sample data with different failure levels, the data of deep outer ring failure (0.8 mm), shallow outer ring failure (0.4 mm), and failure-free operation are used, and the feature values are selected with the DET method. The selected feature vectors for the different failure levels are shown in FIGURES 15 (e)-(h) (the significant features are marked with circles). The PRSGMD decomposition results yield 22 significant features, while the SGMD, VMD, and EEMD decomposition results yield 19, 14, and 13 significant features, respectively, all fewer than 20. This also demonstrates that the PRSGMD method is capable of extracting more information about the failure features.
After the significant features were selected, they were extracted according to their feature numbers, and the dataset was divided evenly by label into a training set of 80% (96 groups) and a testing set of 20% (24 groups).
Then, the LightGBM tree model, using the significant features extracted by the four methods PRSGMD, SGMD, VMD, and EEMD, is applied to pattern recognition of the data in TABLE 5. Finally, the validity of PRSGMD is verified through the test. SGMD has the merit that it requires no user-defined parameters and can reconstruct the existing modes efficiently and remove noise. However, its computational efficiency drops quickly as the amount of data increases, and invalid symplectic geometric components degrade its decomposition precision. To resolve these issues, PRSGMD improves the decomposition precision while keeping the analysis unaffected by noise and other invalid modes.
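The label-balanced 80/20 split and the classifier training can be sketched as follows. The layout of eight classes with 15 groups each is an assumption chosen so that the split reproduces 96 training and 24 testing groups; the hyperparameters are illustrative, not the authors' settings. `lightgbm.LGBMClassifier` is the scikit-learn-style interface of the LightGBM Python package and is imported only when training is requested.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.8, rng=None):
    """Split sample indices so that every class keeps the same train/test ratio."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

def train_lightgbm(X_train, y_train, X_test, y_test):
    """Fit a LightGBM classifier and return it with its test accuracy."""
    import lightgbm as lgb   # requires the lightgbm package to be installed
    # Illustrative hyperparameters, not the settings used in the paper.
    model = lgb.LGBMClassifier(n_estimators=100, learning_rate=0.1)
    model.fit(X_train, y_train)
    accuracy = float(np.mean(model.predict(X_test) == y_test))
    return model, accuracy
```

A typical call is `train_idx, test_idx = stratified_split(y)` followed by `train_lightgbm(X[train_idx], y[train_idx], X[test_idx], y[test_idx])`, where `X` holds the normalized significant features and `y` the fault labels.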

VI. CONCLUSION
Conventional HHT, LMD, and LCD suffer from over-envelope, under-envelope, frequency confusion, and end-point effects. SGMD has the merit that it requires no user-defined parameters and can reconstruct the existing modes efficiently and remove noise. However, its computational efficiency drops quickly as the amount of data increases, and invalid symplectic geometric components degrade its decomposition precision. To resolve these issues, PRSGMD improves the decomposition precision while keeping the analysis unaffected by noise and other invalid modes. The specific conclusions are as follows:
(1) The RCMFE operator was built to assess the effectiveness of the initial single components, so that the SGC components could be obtained under the minimum-residual constraint.
(2) Compared with the original SGMD approach, PRSGMD only needs to process the portion of the initial single components that contains important modes, so its operating efficiency does not decrease as the amount of data increases.
(3) Compared with SGMD, EEMD, and VMD, PRSGMD performs better in restraining the end-point effect and mode mixing, in resisting noise, and in improving component orthogonality and precision, and its total time cost is smaller than that of SGMD. Coupled with the LightGBM method, it effectively improves the accuracy of fault identification.
It should be noted that handling the vibration data of rolling bearings under varying operating conditions, and the convergence of PRSGMD in that setting, remain to be explored.
In steps (1) and (2) of the PRSGMD procedure, $n$ is the length of the data, $d$ is the embedding dimension, normally set to $n/3$, $\tau$ is the delay time, and $m = n-(d-1)\tau$; a proper embedding dimension $d$ and delay time $\tau$ are chosen. In step (3), the initial single-component matrix $Y$ is acquired by diagonal averaging.
In the estimated variance gain, $g_i$ is the gradient of sample $i$, $d$ is the split point of feature $j$, and $n_l^j(d)$ and $n_r^j(d)$ denote the numbers of samples whose value of feature $j$ is less than $d$ and greater than or equal to $d$, respectively.

FIGURE 2. The time-domain waveforms of the simulation signal and its components. (a) The signal y(t). (b) The decaying vibration signal. (c) The WGN with SNR = −20 dB.

FIGURE 3. The components of the mixed signal. (a) The PRSGMD components. (b) The SGMD components. (c) The VMD components. (d) The EEMD components.

FIGURE 4. The instantaneous frequency and amplitude of the first component decomposed from the simulation signal. (a) The IA of SGC1 and IMF1. (b) The IF of SGC1 and IMF1.

FIGURE 5. The instantaneous frequency and amplitude errors of the first two components decomposed from the simulation signal. (a) The IA error of SGC1 and IMF1. (b) The IF error of SGC1 and IMF1.
loss. Both PRSGMD and SGMD are more similar in waveform, and their relative correlation factor and EOD can be

FIGURE 10. Double half inner ring bearing failure diagnosis experiment table.

FIGURE 15. Significant features selected by DET. (a)-(d) are the results for the vibration data with different failure types obtained by applying PRSGMD, SGMD, VMD, and EEMD, respectively. (e)-(h) are the results for the data with different failure levels obtained by applying PRSGMD, SGMD, VMD, and EEMD, respectively.
If the end condition is not met, return to step (6) to obtain SGC_i(t). If the condition is met, stop the iteration and complete the decomposition.

TABLE 2. Comparison of assessment metrics for the decomposition results of y(t) by four methods.

TABLE 3. Evaluation metrics of the components of y'(t) obtained by PRSGMD, SGMD, VMD, and EEMD.


TABLE 4. Parameters of the bearing experiment bench.
The bearing parameters and the eight simulated bearing states (shallow and deep failure of the outer ring, shallow and deep failure of the inner ring, shallow and deep failure of the cage, ball failure, and the healthy state) are given in TABLE 4.

The recognition results for the data in TABLE 5 are shown in TABLE 6 and FIGURE 16. The mean recognition accuracy of PRSGMD is 98.43%, while the mean recognition accuracies of SGMD, VMD, and EEMD are 96.86%, 95.29%, and 93.82%, respectively. These results confirm that the PRSGMD method extracts more discriminative failure features.

TABLE 6. Recognition results of different methods.