Introduction
China’s future manufacturing system planning is marked by critical terms such as the “14th Five-Year medium and long-term manufacturing development plan” and “intelligent manufacturing”. These terms represent the direction of China’s manufacturing industry and reflect the global trend in manufacturing development. China has made significant advancements in its manufacturing capabilities, transitioning from being known as the “world’s factory” to becoming an “intelligent manufacturing power”. The continuous upgrading of industries and technologies has been crucial in supporting the steady growth of China’s economy. Rolling bearings serve as vital components in various large equipment and mechanical parts. They perform several functions under challenging working conditions and operating loads to ensure the safe operation of automated systems. With the continuously increasing emphasis on the reliability of industrial products, demand has grown for more accurate and effective fault diagnosis systems. AI technologies [1], such as machine learning, deep learning, and pattern recognition, offer advantages in enhancing fault diagnosis accuracy, predictive maintenance, and fault prognosis. Fault diagnosis systems are critical for promptly identifying potential issues in rolling bearings. Through the early detection of faults, they aid in preventing unexpected failures, minimizing downtime, and optimizing maintenance schedules. Moreover, an accurate and effective fault diagnosis system facilitates proactive maintenance strategies, resulting in improved productivity, reduced costs, and enhanced overall system performance. Against the background of big industrial data, national policy guidelines, and current industry trends, researchers have made significant progress by integrating rolling bearing fault diagnosis with machine learning algorithms, neural networks, and deep learning techniques. This integration [2], [3], [4] has effectively addressed the challenges of modern complex industrial equipment. Indeed, the Deep Belief Network (DBN) is recognized as one of the most representative and widely applicable algorithms in various fields, including fault diagnosis. Researchers have increasingly favoured the DBN [5] owing to its advantages over traditional fault diagnosis methods. Thus, fault diagnosis has evolved into a multi-method fusion pattern recognition process. Although diagnostic models are crucial to the overall process, the significance of signal processing, feature extraction, and other related techniques [6], [7] cannot be overstated.
Typically, the vibration signal of a rolling bearing includes pulse signals, external signals, background noise, etc. The complexity of the vibration signal directly affects all subsequent processing stages, so an appropriate signal analysis method is required. Wavelet decomposition and reconstruction can effectively remove noise, complete the preliminary optimization of the vibration signal, and retain the original signal characteristics, which renders it suitable for vibration signal processing of rolling bearings [8]. Several novel signal processing methods exist [9], [10], [11], among which the most popular are the modal decomposition algorithms. Because the parameters of such algorithms affect the decomposition result and require manual intervention, this study employed wavelet analysis, a mature technique with extensive industrial application experience. In the feature extraction stage, Zhang and Huang [12] used the Empirical Mode Decomposition (EMD) algorithm to decompose the vibration signal into a set of intrinsic mode functions. Kumar et al. [13] proposed a frequency mode based on Variational Mode Decomposition (VMD) of the signal to monitor the bearing health state. Ni et al. [14] identified a superior feature extraction method by comparing VMD, EMD, and Local Mean Decomposition (LMD). Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) [15] has been employed to determine specific fault characteristics, and time-frequency representation (TFR) demodulation analysis has been used to obtain accurate fault characteristics. Regarding pattern recognition, Kang et al. [16] proposed a deep domain adaptation method, wherein convolutional and pooling theories were integrated with the DBN to solve the problem of multi-state identification of rolling bearings. Xu and Tse [17] combined the DBN with the Affinity Propagation (AP) model, which exhibited excellent results compared to traditional fault diagnosis methods. Zhong et al. [18] used an improved fault diagnosis method that combined Ensemble Empirical Mode Decomposition (EEMD) and the DBN to achieve fault diagnosis. An intelligent PCA-DBN fault diagnosis method was proposed [19], which reduced the dimension of complex features before completing the fault diagnosis. Therefore, to obtain good fault diagnosis results with the DBN network model [20], the initial parameters of the model must be improved and optimized. Deng et al. [21] proposed an improved quantum-inspired differential evolution (MSIQDE) algorithm, which avoided premature convergence and improved global search ability, and used MSIQDE’s global optimization capability to optimize the DBN parameters. Furthermore, Gao et al. [22] employed the Salp Swarm Algorithm, an intelligent optimization method, to optimize the DBN, effectively improving its classification accuracy.
The remainder of this paper is organized as follows. Section II presents the methods applied in this study to extract various features. Section III presents the feature extraction and model optimization process used to prepare the multi-domain data sets for the experiments. Section IV presents the assignment of the experimental data sets and compares the results of this paper with those of mainstream methods. Section V summarizes the whole study and outlines future work.
Methods
A. Wavelet Decomposition and Reconstruction
The Short-Time Fourier Transform (STFT) is an evolution of the traditional Fourier Transform (FT). The size and shape of the window function in the STFT remain fixed and independent of time, rendering it unsuitable for analysing time-varying signals. An excessively narrow window yields poor frequency resolution, whereas a wide window yields poor time resolution. This limitation prevents the STFT from tracking the frequency content of non-stationary signals. The wavelet transform [23] differs from the STFT in that it abandons the infinite trigonometric function basis and adopts a finite, decaying wavelet basis, so that frequency information can be obtained while the specific time location of a signal is determined accurately. The specific expression is as follows:\begin{equation*} \textrm {WT}\left ({\alpha,\tau }\right)=\frac {1}{\sqrt {\alpha }}\int _{-\infty }^{+\infty } f\left ({t}\right)\psi ^{\ast }\left ({\frac {t-\tau }{\alpha }}\right)dt \tag{1}\end{equation*}
Eq. (1) shows that, in contrast to the FT, the wavelet transform incorporates two variables: the scale factor α and the translation factor τ. The scale factor stretches or compresses the wavelet basis and thus corresponds to frequency, while the translation factor shifts it along the time axis.
Wavelet noise reduction facilitates decorrelation. In wavelet analysis [24], signal decomposition is performed using the Mallat tower algorithm, resulting in approximate and detailed signals at each decomposition level. This study employed the Daubechies (Db) wavelet owing to its suitability for rolling bearing fault characteristics. The Daubechies wavelet is characterized by its outstanding orthogonality, which effectively reduces information loss during wavelet transformation and inverse operations. Compared to other wavelet functions, it offers a notably refined time resolution in its designated time domain. Furthermore, the smoothness and continuity of the processed signal improve noticeably as the wavelet order increases. Samsingh [25] employed the db4 wavelet to denoise medical images and compared its performance with other techniques; the db4 wavelet exhibited distinctly superior efficacy in those comparisons.
Fig. 2 shows that, when selecting the optimal wavelet basis function, the SNR after Daubechies wavelet denoising rises sharply and then plateaus around db4. All three wavelet functions nearly reach their maximum SNR when the order is set to 4, with only marginal gains observed beyond this point. Employing higher-order wavelet basis functions introduces the potential for signal over-decomposition and increases the computational demand. Given these findings, this study selected the Daubechies db4 wavelet for vibration signal denoising. In addition, the number of decomposition levels should be chosen while considering the trade-off between separation effectiveness and noise reduction during reconstruction. A scale of three decomposition levels was selected to address this trade-off, resulting in a clear separation of noise and signal and an excellent final noise reduction effect after reconstruction. Based on the principles and influencing factors of wavelet noise reduction described above, the original signal comprising 20,480 sampling points was processed using wavelet noise reduction.
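To make the procedure above concrete, the following minimal sketch applies three-level db4 decomposition, thresholding, and reconstruction with the PyWavelets package. The universal soft-threshold rule used here is an illustrative assumption, since this section does not specify the thresholding scheme.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_denoise(signal, wavelet="db4", level=3):
    """Three-level db4 denoising: decompose, soft-threshold the detail
    coefficients, and reconstruct. The universal threshold estimated from
    the finest detail band is an assumed, commonly used rule."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)        # [cA3, cD3, cD2, cD1]
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745             # noise level estimate
    thr = sigma * np.sqrt(2.0 * np.log(len(signal)))
    denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(signal)]

# Example on one 20,480-point record (random stand-in for a vibration sample).
raw = np.random.randn(20480)
clean = wavelet_denoise(raw)
```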
B. Ensemble Empirical Mode Decomposition
Limitations such as mode aliasing and end effects plague the traditional EMD method in signal processing. These deficiencies can introduce spurious periodic components and a loss of physical meaning, ultimately affecting signal decomposition accuracy. A noise-assisted method called EEMD [26] was proposed to address these issues, particularly mode aliasing. EEMD adds random white noise to the time-domain signal, performs EMD on each noise-added copy, and then averages the resulting intrinsic mode functions (IMFs) across the decompositions to obtain a more robust decomposition, effectively separating the different frequency components of the original signal. An IMF is a single-component signal obtained by decomposing the original (noise-reduced) signal. The detailed decomposition steps are depicted in Fig. 3. Consequently, the EEMD method offers a reliable solution for achieving accurate signal decomposition.
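The noise-assisted averaging scheme of Fig. 3 can be sketched as follows. The sketch is written around a caller-supplied plain-EMD routine (emd_func) rather than a specific library API, and it scales the added noise by the signal’s standard deviation, which is an assumption; the trial count and noise amplitude mirror the settings reported later in Section III but are otherwise illustrative.

```python
import numpy as np

def eemd(signal, emd_func, trials=50, noise_scale=0.25, max_imfs=8):
    """Ensemble EMD: add independent white noise to the signal, decompose each
    noisy copy with a plain EMD routine, and average the IMFs over all trials.

    emd_func(x) is assumed to return a (k, len(x)) array of IMFs; only the
    first max_imfs are kept so that the ensemble average is well defined."""
    n = len(signal)
    acc = np.zeros((max_imfs, n))
    for _ in range(trials):
        noisy = signal + noise_scale * np.std(signal) * np.random.randn(n)
        imfs = np.asarray(emd_func(noisy))[:max_imfs]
        acc[: imfs.shape[0]] += imfs
    return acc / trials  # ensemble-averaged IMFs
```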
C. Sparrow Search Algorithm
The Sparrow Search Algorithm (SSA) is a swarm optimization algorithm that simulates the foraging and anti-predation behaviour of sparrow populations and outperforms many existing algorithms in convergence speed, stability, and local optima avoidance. As a new meta-heuristic, SSA [27] has been applied to optimize the operation of microgrids, and an improved version of the algorithm [28] has been utilized to solve time-optimal trajectory problems. Numerous researchers have acknowledged and affirmed the optimization capability of SSA, and its feasibility has been demonstrated through comparisons with ample data from multiple fields. The specific flow of the algorithm is as follows.
The population of n sparrows in a d-dimensional search space and the corresponding fitness function values are expressed as:\begin{align*} X=\begin{bmatrix} x_{1}^{1} & x_{1}^{2} & \cdots & x_{1}^{d} \\ x_{2}^{1} & x_{2}^{2} & \cdots & x_{2}^{d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n}^{1} & x_{n}^{2} & \cdots & x_{n}^{d} \end{bmatrix},\quad F_{x} =\begin{bmatrix} f\left ({\left [{x_{1}^{1}\;\;x_{1}^{2}\;\;\cdots\;\;x_{1}^{d}}\right ]}\right) \\ f\left ({\left [{x_{2}^{1}\;\;x_{2}^{2}\;\;\cdots\;\;x_{2}^{d}}\right ]}\right) \\ \vdots \\ f\left ({\left [{x_{n}^{1}\;\;x_{n}^{2}\;\;\cdots\;\;x_{n}^{d}}\right ]}\right) \end{bmatrix} \tag{2}\end{align*}
In SSA, the efficiency of food discovery is directly proportional to the fitness value. After each iteration of the algorithm, the discoverers in the sparrow population continually search for food and update their positions and directions. The number of discoverers typically accounts for approximately 10–20% of the total population. The expression for updating their position can be described as follows:\begin{align*} X_{i,j}^{t+1} =\begin{cases} \displaystyle X_{i,j}^{t} \cdot \exp \left ({-\frac {i}{\alpha \cdot iter_{max} }}\right), & R_{2} < \textrm {ST} \\ \displaystyle X_{i,j}^{t} +Q\cdot L, & R_{2} \ge \textrm {ST} \end{cases} \tag{3}\end{align*}
where iter_max is the maximum number of iterations, α ∈ (0, 1] is a random number, Q is a random number obeying a normal distribution, L is a 1 × d matrix of ones, R2 ∈ [0, 1] is the warning (alarm) value, and ST ∈ [1/2, 1] is the safety threshold.
The remaining sparrows act as followers, whose position update expression can be described as:\begin{align*} X_{i,j}^{t+1} =\begin{cases} \displaystyle Q\cdot \exp \left ({\frac {X_{worst} -X_{i,j}^{t}}{i^{2}}}\right), & i>\frac {n}{2} \\ \displaystyle X_{p}^{t+1} +\left |{X_{i,j}^{t} -X_{p}^{t+1}}\right |\cdot A^{+}\cdot L, & i\le \frac {n}{2} \end{cases} \tag{4}\end{align*}
where X_worst denotes the current global worst position, X_p denotes the optimal position occupied by the discoverers, and A^{+}=A^{T}(AA^{T})^{-1}, with A a 1 × d matrix whose elements are randomly assigned 1 or −1.
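A minimal sketch of the SSA loop, covering only the discoverer and follower updates of Eqs. (3) and (4) (the scout/vigilance update of the full algorithm is omitted), is given below; the population size, bounds, ratios, and toy fitness function are illustrative assumptions.

```python
import numpy as np

def ssa_minimize(fitness, dim, n=20, iters=50, lb=-5.0, ub=5.0,
                 pd_ratio=0.2, st=0.8):
    """Minimal SSA sketch implementing the discoverer/follower updates of
    Eqs. (3)-(4); parameters are illustrative defaults."""
    X = lb + (ub - lb) * np.random.rand(n, dim)          # initial population, Eq. (2)
    fit = np.array([fitness(x) for x in X])
    n_disc = max(1, int(pd_ratio * n))                   # ~10-20% discoverers
    for _ in range(iters):
        order = np.argsort(fit)                          # sort: best fitness first
        X, fit = X[order], fit[order]
        worst = X[-1].copy()
        r2 = np.random.rand()                            # alarm value R2
        for i in range(n_disc):                          # discoverers, Eq. (3)
            if r2 < st:
                alpha = np.random.uniform(1e-3, 1.0)
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iters))
            else:
                X[i] = X[i] + np.random.randn() * np.ones(dim)   # Q * L
        best_disc = X[0].copy()                          # X_p: best discoverer position
        for i in range(n_disc, n):                       # followers, Eq. (4)
            if i + 1 > n / 2:
                X[i] = np.random.randn() * np.exp((worst - X[i]) / (i + 1) ** 2)
            else:
                A = np.random.choice([-1.0, 1.0], size=dim)
                # elementwise simplification of A+ . L, with A+ = A^T (A A^T)^-1
                X[i] = best_disc + np.abs(X[i] - best_disc) * A / dim
        X = np.clip(X, lb, ub)
        fit = np.array([fitness(x) for x in X])
    best = int(np.argmin(fit))
    return X[best], float(fit[best])

# Toy usage: minimize the sphere function in five dimensions.
best_x, best_f = ssa_minimize(lambda x: float(np.sum(x ** 2)), dim=5)
```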
D. Deep Belief Network
A DBN is a deep learning model constructed by stacking Restricted Boltzmann Machines (RBMs), combining low-level features through successive nonlinear transformations; together with a Backpropagation Neural Network (BPNN) fine-tuning stage, it captures high-level abstract features. It has been extensively used in fields such as classification, prediction, and speech recognition, where it has shown remarkable performance, and it has established itself as a leading approach in fault classification and diagnosis owing to this advantageous combination of features. The RBM serves as the foundation of the DBN network model. Its structure comprises two layers with no connections within each layer. Each node processes its input and independently decides, based on a random judgment, whether to activate. Because the parameters of the RBM are randomly initialized, the activation probability of each neuron can be calculated individually, and multiplying these probabilities yields the activation of the entire layer of neurons. Consequently, the computational complexity is reduced, and connections within the visible layer and within the hidden layer are eliminated; no connections exist between visible units or between hidden units, as shown in Fig. 4.
By leveraging the main idea of unsupervised learning, fault feature identification in the DBN can be achieved by adding a Softmax output layer at the top. This facilitates supervised learning, wherein labels are used to evaluate and analyze the entire dataset, so that effective classification and prediction can be achieved. Regarding the structural characteristics, training a DBN involves establishing the initial model and fixing the weights and biases of each RBM through layer-by-layer unsupervised pre-training, after which the whole network is fine-tuned with labelled data.
Fig. 5 shows a basic RBM structure composed of m visible units and n hidden units. With weights w_ij between visible unit v_i and hidden unit h_j, visible biases b_i, and hidden biases c_j, the energy function of the RBM is defined as:\begin{equation*} E\left ({v,h}\right)=-\left ({\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{n} v_{i} w_{ij} h_{j} +\sum \limits _{i=1}^{m} b_{i} v_{i} +\sum \limits _{j=1}^{n} c_{j} h_{j}}\right) \tag{5}\end{equation*}
Substituting the energy function into the probability density function yields the joint probability distribution of the RBM:\begin{equation*} P\left ({x}\right)=P\left ({v,h}\right)=\frac {1}{z}\,e^{-E\left ({v,h}\right)}=\frac {1}{z}\,e^{\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{n} v_{i} w_{ij} h_{j} +\sum \limits _{i=1}^{m} b_{i} v_{i} +\sum \limits _{j=1}^{n} c_{j} h_{j}} \tag{6}\end{equation*}
where z is the partition function obtained by summing e^{-E(v,h)} over all possible states.
The probability of a state in an RBM is formed by taking the Boltzmann factor e^{-E} of that state and dividing it by the partition function, i.e., the sum of the Boltzmann factors of all possible states. The energy function of the RBM thus follows the Boltzmann distribution, and it facilitates the expression of a joint probability density by relating the probabilities and connections between the two layers. The RBM builds a unified energy model by combining the energy function with the related probability distribution functions, and the shared weights between the two layers determine the joint distribution probability.
Given the state of one layer, the units of the other layer are conditionally independent of each other, so the conditional probabilities can be written as in Eqs. (7) and (8):\begin{align*} P\left ({v\mid h}\right)&=\frac {P\left ({v,h}\right)}{P\left ({h}\right)}=\prod _{i} {P\left ({v_{i}\mid h}\right)} \tag{7}\\ P\left ({h\mid v}\right)&=\frac {P\left ({h,v}\right)}{P\left ({v}\right)}=\frac {\frac {1}{z}\,e^{-E\left ({v,h}\right)}}{\frac {1}{z}\sum \limits _{h} {e^{-E\left ({v,h}\right)}}}=\prod _{j} {P\left ({h_{j}\mid v}\right)} \tag{8}\end{align*}
The neuronal activation probabilities of the visible and hidden layers are defined in Eqs. (9) and (10):\begin{align*} P\left ({v_{i} =1\mid h}\right)&=\textrm {sigmoid}\left ({\sum \limits _{j=1}^{n} {w_{ij} h_{j} +b_{i}}}\right) \tag{9}\\ P\left ({h_{j} =1\mid v}\right)&=\textrm {sigmoid}\left ({\sum \limits _{i=1}^{m} {w_{ij} v_{i} +c_{j}}}\right) \tag{10}\end{align*}
The sigmoid activation function provides the nonlinearity of the hidden layers in the DBN network. Instead of passing on a linear combination of the previous layer’s outputs, each hidden node transforms its input through a nonlinear function such as the sigmoid; this nonlinearity gives the model its powerful expressive capability. Considering that the selected experimental task involved multi-classification, and that among the candidates the sigmoid function yielded the maximum accuracy, it was chosen as the activation function. Fig. 4 shows the complete training process of a DBN network, beginning with the unsupervised pre-training from the input layer to the top layer. After optimizing the initial parameters of each layer, reverse supervised fine-tuning was conducted in combination with labelled data. During the reconstruction phase, the activation state of the hidden layer served as the input during the backward transmission process. Similar to weight adjustment in the forward transmission process, the reconstruction errors were backpropagated and the weights adjusted accordingly. Through continuous iterative learning, the errors were minimized until convergence was reached.
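As an illustration of how Eqs. (9) and (10) drive the layer-wise pre-training, the following sketch performs one contrastive-divergence (CD-1) update for a single Bernoulli RBM; it is a generic sketch rather than the authors’ exact implementation, and the layer sizes and learning rate are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.01):
    """One contrastive-divergence (CD-1) update for a Bernoulli RBM,
    using the conditional probabilities of Eqs. (9)-(10).

    v0 : (batch, m) visible data, W : (m, n) weights,
    b  : (m,) visible biases,     c : (n,) hidden biases."""
    ph0 = sigmoid(v0 @ W + c)                         # positive phase, Eq. (10)
    h0 = (np.random.rand(*ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                       # reconstruction, Eq. (9)
    ph1 = sigmoid(pv1 @ W + c)
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch      # gradient approximation
    b += lr * np.mean(v0 - pv1, axis=0)
    c += lr * np.mean(ph0 - ph1, axis=0)
    return W, b, c

# Toy usage: 20 visible units, 10 hidden units, random binary data.
m, n = 20, 10
W = 0.01 * np.random.randn(m, n)
b, c = np.zeros(m), np.zeros(n)
batch = (np.random.rand(8, m) > 0.5).astype(float)
W, b, c = cd1_step(batch, W, b, c)
```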
Feature Extraction and DBN Optimization
A. Signal Noise Reduction
Assume a known signal with sampling interval dt (the reciprocal of the sampling frequency), defined as:\begin{align*} x\left ({i}\right)&=\sin \left ({2\pi \cdot 50\cdot i\cdot dt}\right)+0.5\sin \left ({2\pi \cdot 1500\cdot i\cdot dt}\right) \\ &\quad +\sin \left ({2\pi \cdot 3000\cdot i\cdot dt}\right)+0.1\,\textrm {randn}\left ({1,1}\right) \tag{11}\end{align*}
Based on Eq. (11), the time-domain signal comprises components at 50, 1500, and 3000 Hz plus random noise. Among these, the 50 Hz component represents the effective signal, whereas the others correspond to interference and noise at different frequencies. Wavelet decomposition was applied to obtain the wavelet coefficients, which were then used for signal reconstruction. To illustrate the effect clearly, the first 500 sampling points are shown in Fig. 6. The resulting signal exhibits a noticeable smoothing and noise reduction effect, ultimately producing a cleaner representation.
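A minimal sketch of this demonstration is given below. The sampling frequency fs is an assumption, since Eq. (11) leaves it unspecified, and the reconstruction here keeps only the level-3 approximation coefficients (one plausible reading of the reconstruction step), which removes the 1500 Hz and 3000 Hz interference together with most of the noise.

```python
import numpy as np
import pywt

fs = 10_000                      # assumed sampling frequency (not given in Eq. (11))
dt = 1.0 / fs
i = np.arange(20480)
clean = np.sin(2 * np.pi * 50 * i * dt)                      # 50 Hz effective component
x = (clean
     + 0.5 * np.sin(2 * np.pi * 1500 * i * dt)
     + np.sin(2 * np.pi * 3000 * i * dt)
     + 0.1 * np.random.randn(i.size))                        # test signal of Eq. (11)

# db4, three-level decomposition (Section II-A); keep only the approximation band.
coeffs = pywt.wavedec(x, "db4", level=3)
approx_only = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]
den = pywt.waverec(approx_only, "db4")[: x.size]

def snr(ref, est):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((est - ref) ** 2))

print(f"SNR before: {snr(clean, x):.1f} dB, after: {snr(clean, den):.1f} dB")
```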
This study employed the wavelet transform to reduce the noise present in the four operating states of the rolling bearing. This technique effectively extracted the smooth and meaningful components of the signal while eliminating the noise. After decomposing the signal with the wavelet transform, the wavelet coefficients were used to reconstruct the denoised signal. Figs. 7–10 provide a visual comparison between the original signals and the denoised signals obtained through the wavelet transform. It is apparent from these figures that the wavelet transform exhibits remarkable denoising ability, removing the unwanted noise components from the original vibration signals. The vertical axis in each figure represents the amplitude of the vibration signal.
Figs. 7–10 illustrate the normal vibration signal of the rolling bearing and the signals of the inner ring fault, rolling element fault, and outer ring fault, each before and after noise reduction. As is evident, the vibration signal waveforms became smooth and devoid of sharp or jagged points after noise reduction. The denoising process preserved the essential characteristics of the original signals while effectively eliminating unwanted noise. Consequently, the denoised signals retained the integrity of the practical components while ensuring the removal of invalid noise.
B. Feature Extraction
After noise reduction, the sample data were subjected to a combined time-domain and frequency-domain index analysis, and EEMD was then used to extract the IMF energy features. The distribution of the time-domain and frequency-domain features is shown in Fig. 11, which clearly illustrates the contribution of the mean and effective (RMS) values to fault diagnosis. Fig. 12 shows the distribution of the IMF energy features, demonstrating that informative IMF energy features are retained while modal aliasing is avoided. The characteristic parameters corresponding to these features are listed in Table 1. When obtaining the IMF component features through EEMD, white Gaussian noise with a mean square error of 0.25 was added, and the decomposition was averaged over an ensemble of 50 trials. This yielded characteristic parameters and energy curves that reflect additional bearing condition information.
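The indices plotted in Figs. 11 and 12 can be computed as in the sketch below; the exact definitions of the dimensionless factors (peak, pulse, and margin factors) are assumptions that may differ slightly from the authors’ formulas, and the frequency-domain indices would be computed analogously from the spectrum.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(x):
    """Time-domain indices of the kind shown in Fig. 11 (assumed definitions)."""
    rms = np.sqrt(np.mean(x ** 2))                  # effective value
    peak = np.max(np.abs(x))
    return {
        "mean": np.mean(x),
        "rms": rms,
        "peak": peak,
        "variance": np.var(x),
        "skewness": skew(x),
        "kurtosis": kurtosis(x, fisher=False),
        "peak_factor": peak / rms,
        "pulse_factor": peak / np.mean(np.abs(x)),
        "margin_factor": peak / np.mean(np.sqrt(np.abs(x))) ** 2,
    }

def imf_energy_features(imfs):
    """Normalized energy of each IMF returned by EEMD (rows = IMFs), as in Fig. 12."""
    energy = np.sum(np.asarray(imfs) ** 2, axis=1)
    return energy / energy.sum()
```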
In Fig. 11, the first subplot shows the distributions of the mean, effective (RMS) value, peak, variance, and skewness, and the second shows the distributions of the kurtosis, peak factor, pulse factor, and margin factor.
Figure 13 illustrates the variance contribution rates and Pearson correlation coefficients of various modal components post-EEMD decomposition for both denoised and raw signals. The denoised signal maintains a strong linear correlation with the original one and captures distinct frequency details through EEMD, resulting in a smoother curve. This combination of EEMD and db4 wavelet is instrumental in extracting diverse frequency features for bearing fault diagnosis. In contrast, the raw signal exhibits enhanced frequency periodicity after EEMD decomposition but lacks detailed frequency characteristics.
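A small sketch of the two quantities plotted in Fig. 13, computed from a signal and the IMFs obtained by EEMD, might look as follows.

```python
import numpy as np

def imf_diagnostics(signal, imfs):
    """Variance contribution rate and Pearson correlation of each IMF with the
    signal it was decomposed from, the two quantities plotted in Fig. 13."""
    imfs = np.asarray(imfs)
    var = np.var(imfs, axis=1)
    contribution = var / var.sum()                  # variance contribution rate
    pearson = np.array([np.corrcoef(signal, imf)[0, 1] for imf in imfs])
    return contribution, pearson
```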
C. DBN Optimized By SSA
In [29], the SSA was employed to optimize the structure and weight parameters of a DBN; the recognition rate of the resulting SSA-DBN model surpassed that of other classifiers, with a recognition accuracy approximately 2% higher than that of the unoptimized DBN model. In [30], SSA-DBN, VMD, and the Wigner–Ville distribution (WVD) were used for intelligent fault severity detection, and the model achieved an accuracy rate of 98%, indicating its effectiveness in fault detection. Li et al. [31] compared and verified the performance of DBN models combined with different optimization algorithms, including Simulated Annealing (SA), Particle Swarm Optimization (PSO), and SSA. The evaluation results indicated that all three improved DBN models outperformed the original DBN model; however, the SSA-DBN model achieved the highest evaluation accuracy among them.
When a relatively new algorithm is proposed, further research and verification are necessary to assess its optimization effectiveness. Since SSA was proposed only 2–3 years ago, more researchers are expected to conduct experiments and validate its performance using actual data.
Indeed, a DBN’s optimal performance relies on an optimal network structure. Researchers typically set the DBN network structure based on experience, which may not fully exploit the potential performance of the DBN. To address this, a new fault detection model, SSA-DBN, was constructed by optimizing the DBN with the SSA. The core idea of using SSA to optimize the DBN is to track the sparrow with the best position, i.e., the individual with the highest fitness. Throughout the iteration process, the parameters of this sparrow determine the optimal network structure of the DBN. Subsequently, the labelled data are input into a Softmax classifier for fault classification and diagnosis, as shown in Fig. 14. By integrating SSA optimization with the DBN, the SSA-DBN model determines the optimal DBN network structure and thereby enhances the performance and effectiveness of fault detection.
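A hedged sketch of the wiring between the optimizer and the network is given below: a sparrow position is decoded into a candidate network structure and scored on validation data. The two-hidden-layer encoding is an illustrative assumption, and a scikit-learn MLP is used purely as a cheap stand-in for the DBN.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier   # cheap stand-in for the DBN

def decode(position, lb=(32, 32), ub=(512, 512)):
    """Map a sparrow position in [0, 1]^2 onto two hidden-layer sizes
    (the two-layer encoding is an illustrative assumption)."""
    pos = np.clip(np.asarray(position, dtype=float), 0.0, 1.0)
    return tuple(int(l + p * (u - l)) for p, l, u in zip(pos, lb, ub))

def fitness(position, X_train, y_train, X_val, y_val):
    """Fitness = validation error of a network built from the decoded structure."""
    clf = MLPClassifier(hidden_layer_sizes=decode(position), max_iter=300)
    clf.fit(X_train, y_train)
    return 1.0 - clf.score(X_val, y_val)
```

The ssa_minimize sketch from Section II-C (or any SSA implementation) could then minimize this fitness over positions in [0, 1]^2 and decode the best position into the DBN structure.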
D. Fault Diagnosis Procedure
The detailed steps of rolling bearing fault diagnosis based on the SSA-optimized DBN are as follows:
Denoising: Three-level db4 wavelet decomposition and reconstruction were applied to denoise the original vibration signals of the rolling bearing, with a remarkable denoising effect. The denoised signal dataset was then divided according to the different operating states.
Feature Extraction: Time-domain and frequency-domain features were extracted from the practical signal dataset. In addition, IMF energy features were extracted using the EEMD.
Labeling: The sample labels in the entire dataset were marked manually to distinguish different types of sample data. The dataset contained 480 samples with four classifications. The dataset was split into a ratio of 3:1, with 120 samples as the test set and 360 as the training set.
Data Preprocessing: The data were normalized so that the values ranged between [0, 1], which reduced the computational load of the model, and the data matrix was transposed to match the model’s input format (a minimal sketch of this preprocessing step is given after this list). Before training, the maximum number of iterations and the population size of the SSA were set to 50 and 100, respectively; the momentum parameter was set to 0.5 and the learning rate to 0.1. Through continuous updating and iteration of the sparrow positions, the optimal sparrow with the highest fitness value was determined.
DBN Training: The number of nodes in the input layer corresponded to the dimensionality of the input features, and the number of output nodes was 4, representing the four running states of the rolling bearings. Two RBM layers were stacked; each RBM was trained for 65 iterations with a learning rate of 0.01, followed by a fine-tuning process of 10 iterations. The integrated feature set was divided in a 3:1 ratio and input into the optimized SSA-DBN network model.
Fault Diagnosis: The labels were set, and the rolling bearing faults were diagnosed using the trained SSA-DBN network model.
Result Analysis: The combined model applied in this study was compared with other mainstream methods to verify the effectiveness of the proposed approach.
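A minimal sketch of the preprocessing step referred to above (min-max normalization to [0, 1] followed by the 3:1 split of the 480 samples) might look as follows; the random shuffling is an assumption.

```python
import numpy as np

def preprocess(features, labels, train_ratio=0.75, seed=0):
    """Min-max normalize each feature column to [0, 1] and split the samples
    3:1 into training and test sets (480 samples -> 360 train / 120 test)."""
    f_min, f_max = features.min(axis=0), features.max(axis=0)
    normed = (features - f_min) / (f_max - f_min + 1e-12)
    idx = np.random.default_rng(seed).permutation(len(normed))
    cut = int(train_ratio * len(normed))
    tr, te = idx[:cut], idx[cut:]
    return normed[tr], labels[tr], normed[te], labels[te]
```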
This paper’s diagnostic process and technical roadmap are shown in Fig. 15.
Fault Diagnosis Process Based on SSA-DBN
In this study, the dataset used was the run-to-failure (life cycle) data of rolling bearings obtained from the NSF I/UCRC Center for Intelligent Maintenance Systems (IMS); the dataset can be accessed at https://www.nasa.gov. The experiments were conducted using an AC motor rotating at a constant speed of 2000 RPM. Four Rexnord ZA-2115 double-row roller bearings were installed on the rotating shaft, and acceleration sensors were placed in the horizontal and vertical directions of each bearing to measure and collect the corresponding vibration signals. Each dataset recorded the complete life cycle of a bearing, from regular operation to the point of damage. The sampling frequency was 20 kHz, and one sample was collected every 10 min. The collection time for each sample was approximately 1.024 s, yielding 20,480 data points per sample.
A. Feature Set
After preliminary analysis, the 10-dimensional time-domain feature set, the frequency-domain feature set, and the IMF energy feature set obtained by EEMD were combined to form the multi-domain feature set L used in the subsequent experiments.
B. Fault Diagnosis
Using the vibration signal data described above, 120 samples were obtained for each of the four working conditions (normal, inner race fault, rolling element fault, and outer race fault). These samples were divided into training and test sets according to the specified proportion, and the training set was used to train the SSA-DBN model with the parameters mentioned earlier. The experimental results of the SSA-DBN model are shown in Fig. 16; the DBN network model optimized by the SSA achieved high recognition accuracy and effective classification. Fig. 17 shows the fitness function curve of the SSA-DBN model, indicating the decrease in the objective function value with increasing iterations; by the second iteration, the objective function value had reached its optimum.
To evaluate the experimental results, a visualization tool, the confusion matrix, was added. The confusion matrix is particularly suitable for supervised learning tasks, as it facilitates the assessment of classification accuracy. Fig. 18 shows the confusion matrix used in this study: each row represents the actual class label corresponding to one of the four bearing states examined in this paper, and each column represents the class label predicted by the SSA-DBN network model. Based on the definitions of the confusion matrix, each prediction can be categorized into four types: True Positive (TP), False Negative (FN), False Positive (FP), and True Negative (TN). Notably, the confusion matrix contains one misclassified sample: its actual state is an inner race fault, but the SSA-DBN network model identified it as the normal state, i.e., a missed fault (a false negative for the inner race fault class and a false positive for the normal class). This observation suggests further refinement of the data processing stage to enhance the overall diagnostic performance.
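As a toy illustration of how the matrix in Fig. 18 is read (not the actual test results), the snippet below builds a confusion matrix for a balanced test set of 30 samples per state in which one inner race fault sample is predicted as normal.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Class order: 0 = normal, 1 = inner race, 2 = rolling element, 3 = outer race.
y_true = np.repeat(np.arange(4), 30)     # 30 test samples per state (assumed balanced)
y_pred = y_true.copy()
y_pred[35] = 0                           # one inner race sample predicted as normal
print(confusion_matrix(y_true, y_pred))  # rows = actual states, columns = predictions
```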
In contrast, when the SSA was not used to optimize the DBN network model, the diagnostic accuracy rate was only 75% for the same data division and parameters. However, after applying the SSA optimization, the fault diagnosis rate of the DBN network model improved significantly. This demonstrates SSA’s effectiveness in enhancing the DBN model’s performance for fault diagnosis. Furthermore, to validate the effectiveness of the selected feature set L in this study, all the characteristic parameters were input into the SSA-DBN model for fault diagnosis, and the diagnosis results were compared with mainstream methods. The parameters of different methods were set as follows:
PSO-DBN: The learning rate was set as 0.02, the momentum parameter was set as 0.1, the activation function was the sigmoid function, the number of particle swarms was 15, and the number of particle swarm training iterations was 100. Other parameters were consistent with SSA-DBN.
SSA-SVM: The number of iterations, sparrows, the momentum parameter, and the learning rate were consistent with SSA-DBN. The optimized penalty parameter was 60.309, and the kernel parameter was 0.5694.
The results of PSO-DBN and SSA-DBN are shown in Fig. 19. The compared results and accuracy of this analysis are summarized in Table 4.
Table 4 shows the diagnostic accuracy of the SSA-DBN diagnostic model adopted in this study across the different feature sets. Specifically, the performance of the energy features derived from EEMD decomposition alone (IMFEF) is relatively poor. However, when the time-domain, frequency-domain, and IMF energy feature sets are combined (i.e., the combined feature set), the overall diagnostic accuracy improves significantly, reaching 99.17%. The percentage differences reveal that the diagnostic precision of a single-domain feature set lags behind that of the multi-domain integrated feature set by approximately 8%. This underlines the complementary nature of features from different dimensions and domains and highlights the superiority of multi-domain integrated feature sets in fault diagnosis, thereby elevating diagnostic accuracy and robustness. Table 4 also offers a lateral comparison against various diagnostic models. The results demonstrate that the DBN model without structural and parameter optimization commits significant errors in its diagnosis, whereas the diagnostic accuracy of the DBN model optimized with the SSA algorithm increases by about 24%, attesting to the necessity of algorithmic optimization. Furthermore, incorporating the mainstream optimization algorithm (PSO-DBN) and the mainstream classification model (SSA-SVM) for comparison, the analysis in conjunction with Fig. 17 suggests that the same multi-domain integrated feature set, under different optimization algorithms or different classification models, produces results that differ by around 5%. Both comparison methods exhibit diagnostic errors across the four labels, which is unacceptable in practical scenarios. Hence, deep learning models whose structure and parameters are optimized with optimization algorithms outperform shallow machine learning models, and the degree of improvement varies among optimization algorithms. The empirical evidence confirms that the SSA-DBN diagnostic model employed in this study possesses genuine diagnostic capability and a high precision standard.
Conclusion and Future Work
The multi-domain feature set, consisting of time-domain features, frequency-domain features, and IMF energy features, achieved a high accuracy of 99.17% in diagnosing the three fault states (inner ring fault, rolling element fault, and outer ring fault) as well as the normal state of rolling bearings. Utilizing this feature set made the diagnosis process in the network model efficient. Furthermore, a comparison of the impact of different features on the diagnosis results showed that the feature dataset selected in this study was highly effective. An average diagnosis accuracy of 99.08% was obtained over ten experiments, highlighting the robustness of the feature dataset. Compared with the non-optimized diagnosis model and with mainstream models such as PSO-DBN and SSA-SVM, the proposed SSA-DBN model delivered superior performance. This emphasizes the significance of both the feature dataset and the diagnosis model, as the feature parameters of each domain contribute complementary diagnostic characteristics. The SSA-DBN model demonstrated excellent fault identification and diagnosis stability, ultimately enhancing the overall diagnostic accuracy.
In terms of future work, there are several areas for further improvement and research based on the method proposed in this paper. These aspects can contribute to deploying the proposed method in real-world engineering scenarios and fundamentally contribute to the field. In addition, these suggestions can serve as references for other scholars conducting further research:
Multi-Dimensional Fault Diagnosis: The integration of multiple types of signals from rolling bearings, such as vibration, current, temperature, and sound, can be explored. Various types of sensors can be used for signal acquisition, the layout positions and quantities of the sensors can be studied, and suitable algorithms can be developed to extract features from the different signal types. This research can contribute to establishing a multi-angle, multi-functional system for intelligent fault diagnosis of rolling bearings under different working conditions.
Addressing Sample Scarcity: In practical applications, collecting an adequate number of fault samples can be challenging, resulting in imbalanced datasets. Although deep learning fault diagnosis models can handle multiple types of faults and perform identification and classification tasks, addressing the classification error issue in the presence of imbalanced samples becomes crucial. Further research can focus on developing strategies to mitigate the effects of imbalanced datasets and improve classification performance.
These research directions can extend the current work and contribute to the advancement of fault diagnosis in rolling bearings. Thus, by addressing issues related to multi-dimensional signal analysis and handling imbalanced datasets, further improvements can be made to enhance the practical applicability and effectiveness of fault diagnosis methods.
ACKNOWLEDGMENT
The authors would like to thank the editors and the reviewers for their helpful suggestions, which have greatly improved this article.