A Novel Intelligent Fault Diagnosis Method Based on Variational Mode Decomposition and Ensemble Deep Belief Network

The deep belief network is widely used in fault diagnosis and health management of rotating machinery. However, deep belief networks tend to focus only on the global information of bearing vibration and ignore local information. In addition, a single deep belief network has limited learning ability and cannot diagnose the health of rotating machinery accurately and stably. As a non-recursive variational signal decomposition method, variational mode decomposition can readily extract local information from signals, and an ensemble deep belief network composed of multiple deep belief networks improves the accuracy and stability of health status diagnosis for rotating machinery. This paper combines the advantages of the ensemble deep belief network and variational mode decomposition to propose a novel diagnostic method for rolling bearings. First, variational mode decomposition is used to decompose the vibration data of the rolling bearing into intrinsic mode functions carrying local information. Then, deep belief networks based on cross-entropy learn the intrinsic mode functions of the rolling bearing data and the reconstructed vibration data. Finally, in the decision-making layer, an improved combination strategy processes the health status information of the bearings obtained by the multiple deep belief networks to produce a more accurate and stable diagnosis result. The method is applied to experimental bearing vibration data. The results show that it can simultaneously learn the global and local information of bearing vibration data, overcome the limitations of individual deep learning models, and diagnose faults more effectively than existing intelligent diagnosis methods.


I. INTRODUCTION
With the rapid development of science and technology, modern rotating machinery has become more efficient, large-scale and integrated, playing an increasingly important role in different industries [1]. Rolling bearings are the most important part of a rotating machine and directly affect its performance and operation [2]. Therefore, automatic, accurate and robust identification of rolling bearing operating conditions, which reduces unplanned downtime and economic losses, is becoming increasingly important.
The associate editor coordinating the review of this manuscript and approving it for publication was Jie Li.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Traditional fault diagnosis methods mainly extract fault characteristics through signal processing and then identify the fault type of the bearing from those characteristics on the basis of experience [3]. Various signal processing methods are widely used for fault feature extraction. For example, Zhu et al. used sequential statistical filtering and the empirical wavelet transform to analyze the time-frequency domain features of rolling bearings for fault diagnosis [4]. Yang et al. used variational mode decomposition and phase space parallel factor analysis to detect weak fault signals of rolling bearings [5]. Xu et al. used variational mode decomposition to decompose the gear vibration signal and the spectral kurtosis method to highlight the fault information for gear fault diagnosis [6]. However, these fault diagnosis methods relying on signal processing are too complicated to be applied to the analysis of massive data, and they cannot accurately identify the severity of the fault.
Traditional intelligent diagnosis methods based on artificial intelligence techniques, such as the artificial neural network (ANN) and the support vector machine (SVM), are designed to analyze massive vibration data efficiently and provide diagnostic results automatically, and they have become a new trend in the field of equipment condition monitoring [7], [8]. For example, Li et al. calculated 1634 characteristics reflecting bearing conditions and selected 12 sensitive features as input to the ANN for fault diagnosis [9]. Lei et al. used the wavelet packet transform (WPT) and empirical mode decomposition (EMD) for feature extraction and then selected sensitive features as input to an artificial neural network for fault diagnosis [10]. Zhang et al. designed a feature vector based on 19 parameters and then used an SVM for bearing fault diagnosis [11]. Liu et al. used EMD to extract 71 features and then used the selected sensitive features as the input of an SVM for bearing fault diagnosis [12]. Van and Kang proposed wavelet kernel local Fisher discriminant analysis, tuned by particle swarm optimization, for dimension reduction of composite features; the selected features are input to an SVM classifier for fault diagnosis of the bearings [13]. Jing et al. used the least squares support vector machine (LSSVM) and D-S evidence theory to realize bearing fault diagnosis through multi-sensor information fusion [14].
Although traditional intelligent diagnosis methods solve the problem that conventional fault diagnosis based on signal processing is difficult to apply to big data, they remain inseparable from feature extraction, feature selection, and pattern recognition. This leads to three obvious limitations: (1) Feature extraction for rolling bearings requires experts who have mastered various signal processing techniques, which limits the adoption of fault diagnosis technology. (2) The selection of sensitive features depends on the expert's prior knowledge, which makes fault diagnosis time-consuming. At the same time, the extracted sensitive features generalize poorly and are difficult to adapt to different bearing signals. (3) Artificial neural networks and support vector machines are shallow machine learning models, which share a common problem: their nonlinear approximation ability is limited, which results in poor performance on complex classification problems [7]. Therefore, there is an urgent need for a new method that eliminates the dependence on manual feature extraction and feature selection.
Deep learning, whose architecture was proposed by Hinton in 2006 [15], offers a way to overcome the limited nonlinear approximation ability of shallow learning architectures and the dependence on manual feature extraction and feature selection in traditional intelligent diagnosis methods. As a cutting-edge research area of machine learning, it provides stronger generalization ability and deeper nonlinear mapping than shallow networks, as well as the ability to extract features from higher-dimensional data sets. Three deep learning models, namely the deep belief network (DBN), the stacked autoencoder (SAE), and the convolutional neural network (CNN), have been successfully applied to mechanical fault diagnosis in the past few years [16]. For example, Shao et al. used an ensemble stacked autoencoder (ESAE), constituted by autoencoders with different activation functions, for the fault diagnosis of rolling bearings [17]. T. Ince et al. proposed a 1-D convolutional neural network method to diagnose motor faults in real time [18]. Wang et al. designed an adaptive convolutional neural network for fault identification of rolling bearings [19]. Shao et al. used particle swarm optimization to design a deep belief network for fault diagnosis of rolling bearings [20]. Chen et al. combined a sparse autoencoder (SAE) and a deep belief network for bearing fault diagnosis: 15 time-domain features and 3 frequency-domain features are extracted from the sensor vibration signal and input into the sparse autoencoder for feature fusion, and the resulting fused feature vector is used to train the deep belief network [21]. Tao et al. proposed a fault diagnosis method for rolling bearings based on the Teager energy operator (TEO) and DBN.
The instantaneous energy of the rolling bearing vibration signal was extracted by the TEO and input into a DBN model, whose parameters had been tuned by a hierarchical optimization algorithm, to identify the fault [22]. Although deep learning is widely used in the field of mechanical fault diagnosis, three shortcomings remain. (1) Most researchers use the deep learning model only as a classifier and obtain its input through manual feature extraction and feature selection, so the feature learning ability of deep learning is not fully utilized. (2) When fault information is learned with a deep learning model, only the global signal is considered and the fault information contained in local signals is ignored, which lowers diagnostic accuracy and degrades system performance. (3) A single deep learning model has limited learning ability and cannot completely learn the fault information, which limits the fault recognition rate of the system.
This paper presents a novel fault diagnosis method for rolling bearings based on variational mode decomposition and an ensemble deep belief network. The method can be divided into three parts. First, the original vibration signal of the bearing is processed by variational mode decomposition (VMD) to obtain intrinsic mode functions (IMFs) containing local information of the rolling bearing and a reconstructed vibration signal containing global information of the rolling bearing. This part uses the original vibration signal of the rolling bearing directly, without manual feature extraction and feature selection. Then, a plurality of deep belief networks take the IMF components and the reconstructed vibration signal as their respective input signals and perform feature learning for the rolling bearing.
This part makes full use of the powerful nonlinear mapping ability of the deep belief network to learn in depth the local and global feature information of the bearing. Finally, the improved combination strategy fuses the diagnosis results of the individual deep belief networks to obtain the final fault diagnosis result. This part combines the learning results of multiple deep belief networks following the idea of ensemble learning, which avoids the limitations of a single deep learning model and ensures the accuracy and stability of the diagnostic system. The experimental results show that the method removes the dependence on manual feature extraction, attends to both the local and global feature information of the bearing, overcomes the limitations of an individual deep learning model, and is more effective than existing intelligent methods.
The rest of this paper is organized as follows. Section 2 briefly introduces the basic theory of VMD and DBN. Section 3 describes the proposed method in detail. Section 4 presents experiments that verify the effectiveness of the proposed method. Section 5 concludes the paper.

II. BASIC THEORY OF VARIATIONAL MODE DECOMPOSITION AND DEEP BELIEF NETWORK

A. VARIATIONAL MODE DECOMPOSITION
VMD is a non-recursive variational signal decomposition method proposed by Konstantin Dragomiretskiy et al. in 2014 [23]. The method is well suited to non-stationary signals: it can accurately separate components with close frequencies into modes with different center frequencies and bandwidths, which makes it suitable for separating multi-component, non-stationary, nonlinear signals. Compared with EMD and local mean decomposition (LMD), the VMD algorithm effectively avoids mode aliasing and spurious components, and it requires fewer decomposition levels and runs more efficiently. The core of the VMD algorithm is to construct and solve a variational problem.
The purpose of variational mode decomposition is to find a set of IMFs whose total estimated bandwidth is minimal. The bandwidth of each mode is estimated in the following steps: 1) a Hilbert transform is applied to each decomposed mode u_k to obtain its one-sided spectrum; 2) each mode is mixed with the exponential e^{−jω_k t}, which shifts the spectrum of the mode to baseband; 3) the bandwidth is estimated from the squared L2 norm of the gradient of the demodulated signal. Equation 1 shows the constrained variational model of VMD, where u_k represents the K IMFs and ω_k represents the K center frequencies.
In order to obtain the optimal solution of the above variational model, a quadratic penalty factor α and a Lagrange multiplier λ(t) are introduced to transform the constrained variational problem into an unconstrained one.
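For reference, the constrained model (Equation 1) and its unconstrained counterpart (Equation 2) can be sketched in the standard form given in [23]. The symbols f(t) for the input signal, δ(t) for the Dirac distribution and * for convolution are assumptions introduced here for illustration; u_k, ω_k, α and λ(t) are as defined in the text.

```latex
% Constrained variational model (Equation 1), standard form from [23]
\min_{\{u_k\},\{\omega_k\}}
  \left\{ \sum_{k=1}^{K}
    \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
    e^{-j\omega_k t} \right\|_2^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% Augmented Lagrangian (Equation 2) with penalty factor \alpha and multiplier \lambda(t)
L\left(\{u_k\},\{\omega_k\},\lambda\right) =
  \alpha \sum_{k=1}^{K}
    \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
    e^{-j\omega_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle
```

The first term penalizes the bandwidth of each mode, the second enforces reconstruction fidelity, and the third enforces the constraint exactly through the multiplier.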
The alternating direction method of multipliers (ADMM) is used to solve Equation 2. The procedure is as follows: first, the original signal is decomposed into components, each with its own center frequency and bandwidth; then, Equation 3 is used to update the center frequencies and bandwidths iteratively.

B. DEEP BELIEF NETWORK
The restricted Boltzmann machine (RBM) is the smallest unit with which a DBN implements feature extraction and classification. As shown in Fig. 1, the RBM is an undirected probabilistic graphical model consisting of a visible layer v and a hidden layer h, which are connected to each other by weights. The visible layer is used to input data. Each node of the hidden layer takes a binary state, 0 or 1, at random. Units within the same layer are independent of each other, and the joint probability distribution of the visible layer and the hidden layer follows the Boltzmann distribution.
FIGURE 1. Restricted Boltzmann machine network structure.
The energy function of the RBM model is defined by the parameters W, a and b, where W represents the weights between the visible layer and the hidden layer, a represents the bias of the visible layer, and b represents the bias of the hidden layer. The joint probability distribution of the RBM is expressed through this energy function. Since units within the same layer of the RBM have no connections, the activation probabilities of the visible layer nodes and the hidden layer nodes can be computed independently for each unit. In order to make the error between the input signal and the reconstructed signal as small as possible, the deep belief network introduces the contrastive divergence (CD) algorithm and two hyperparameters (learning rate η and momentum m). The weights of the RBM and the biases of the layers are updated through multiple Gibbs sampling steps.
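As an illustration, and assuming the standard binary RBM, the energy function can be written E(v, h) = −aᵀv − bᵀh − vᵀWh, the joint distribution P(v, h) = exp(−E(v, h)) / Z, and the activation probabilities P(h_j = 1 | v) = σ(b_j + Σ_i v_i W_ij) and P(v_i = 1 | h) = σ(a_i + Σ_j W_ij h_j). The following minimal sketch trains such an RBM with a single Gibbs step (CD-1). The class name `RBM`, the method `cd1_step` and the toy data are hypothetical and serve only to illustrate the update; they are not the configuration used in this paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)  # visible-layer bias
        self.b = np.zeros(n_hidden)   # hidden-layer bias
        # Momentum buffers for W, a and b
        self.vel = [np.zeros_like(p) for p in (self.W, self.a, self.b)]

    def hidden_prob(self, v):
        return sigmoid(v @ self.W + self.b)

    def visible_prob(self, h):
        return sigmoid(h @ self.W.T + self.a)

    def cd1_step(self, v0, eta=0.1, m=0.5):
        """Apply one CD-1 update on batch v0 and return the reconstruction error."""
        h0_prob = self.hidden_prob(v0)
        # Sample binary hidden states from their activation probabilities
        h0 = (self.rng.random(h0_prob.shape) < h0_prob).astype(float)
        v1 = self.visible_prob(h0)        # reconstruction of the input
        h1_prob = self.hidden_prob(v1)
        n = v0.shape[0]
        grads = (
            (v0.T @ h0_prob - v1.T @ h1_prob) / n,  # data term minus model term
            (v0 - v1).mean(axis=0),
            (h0_prob - h1_prob).mean(axis=0),
        )
        for i, g in enumerate(grads):
            self.vel[i] = m * self.vel[i] + eta * g
        self.W += self.vel[0]
        self.a += self.vel[1]
        self.b += self.vel[2]
        return float(np.mean((v0 - v1) ** 2))


# Toy data: two complementary binary patterns, repeated to form a batch
patterns = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
batch = np.tile(patterns, (10, 1))
rbm = RBM(n_visible=6, n_hidden=4)
errors = [rbm.cd1_step(batch) for _ in range(500)]
```

The reconstruction error returned by each step gives a simple way to monitor whether the input and reconstructed signals are converging.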
After the unsupervised training of each RBM is complete, the DBN begins supervised training. In supervised learning, the RBMs of the deep belief network are treated as a whole, that is, as a back-propagation (BP) neural network, which acts as the supervised classifier of the deep belief network. First, the training samples are input into the RBMs that have been trained through unsupervised learning, and the feature information of the samples is learned by the RBMs from bottom to top. The RBM on the top layer obtains the predicted classification result through the classifier. Then, according to the difference between the classification result predicted by the model and the sample label, the error is propagated layer by layer down to the lowest RBM, and the weights of each RBM and the biases of each layer are further optimized by the gradient descent algorithm.

III. PROPOSED METHOD
In this paper, a novel intelligent fault diagnosis method based on variational mode decomposition and ensemble deep belief network (VMD-EDBN) is proposed. The method mainly consists of three parts. The first part is to obtain the local feature information and the global feature information of the original vibration signal of the rolling bearing through VMD. The second part is to learn the local feature information and the global feature information of the rolling bearing through the improved DBN. The improved DBNs constitute an ensemble deep belief network (EDBN). In the third part, the final fault diagnosis result of the rolling bearing is obtained from information fusion of the diagnosis results of each DBN in the ensemble deep belief network through the improved combination strategy.

A. DESIGN BEARING DATA SET
The working conditions of rolling bearings are usually poor, and the bearing vibration data obtained by the sensors are inevitably mixed with noise. Traditional intelligent fault diagnosis methods for rolling bearings pay attention only to the global feature information of the vibration signal. Our proposed method mines the local feature information of the vibration signal while also attending to its global feature information. It is worth noting that we do not perform manual feature extraction or feature selection; the bearing vibration signal obtained from the sensor is used directly.
The variational mode decomposition technique can decompose non-stationary signals into IMF components with different center frequencies. It has strong adaptability and is now used in the field of mechanical fault diagnosis. The VMD technology is a prerequisite for our proposed method. We use the VMD technique to decompose the vibration signal of the rolling bearing to obtain a series of IMF components including local feature information of the bearing vibration signal. Then, the reconstructed signal containing the global feature information of the vibration signal is obtained by reconstructing all of the IMF components. Finally, the IMF component obtained by the VMD technique and the reconstructed signal together constitute the experimental data set.
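The data set construction described above can be sketched as follows. This is a simplified VMD written for illustration, assuming the standard frequency-domain ADMM updates from [23] but omitting the boundary mirroring used in reference implementations; the function names `vmd` and `build_inputs`, the default tolerance and the two-tone test signal are all hypothetical.

```python
import numpy as np


def vmd(signal, K, alpha, tau=0.0, tol=1e-7, max_iter=500):
    """Simplified VMD on the one-sided spectrum; returns (modes, center_freqs)."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    freqs = np.fft.rfftfreq(n)          # normalized frequency in cycles/sample
    f_hat = np.fft.rfft(signal)
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = 0.5 * np.arange(K) / K      # uniformly spaced initial center frequencies
    lam = np.zeros(freqs.size, dtype=complex)
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter update of mode k around its current center frequency
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # New center frequency: power-weighted mean frequency of the mode
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-12)
        # Dual ascent on the multiplier (inactive when tau is 0)
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < tol:
            break
    order = np.argsort(omega)
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]


def build_inputs(signal, K=4, alpha=2000):
    """Stack the K IMFs (local information) and their sum (global information)."""
    modes, centers = vmd(signal, K, alpha)
    reconstructed = modes.sum(axis=0)
    return np.vstack([modes, reconstructed]), centers


# Hypothetical two-tone test signal: 400 points sampled at 12 kHz
fs = 12_000
t = np.arange(400) / fs
x = np.sin(2 * np.pi * 600 * t) + 0.5 * np.sin(2 * np.pi * 2400 * t)
inputs, centers = build_inputs(x, K=2, alpha=2000)
centers_hz = centers * fs
```

Each row of `inputs` would feed one deep belief network: the first K rows are the IMF components and the last row is the reconstructed signal.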

B. THE FAULT DIAGNOSIS OF THE IMPROVED ENSEMBLE DEEP BELIEF NETWORK
As one of the most classic models in deep learning, DBN is a probabilistic generative model that has been successfully applied in many fields, especially in fault diagnosis [24]. In fault diagnosis, the deep belief network first performs unsupervised, bottom-up feature learning on the input signal with its individual RBMs to produce a diagnosis. However, there is inevitably an error between this fault diagnosis result and the actual fault information. To improve network performance, top-down supervised fine-tuning guided by a loss function is needed to reduce this error. At the same time, provided that training convergence is guaranteed, it is desirable that the larger the error, the stronger the parameter correction.
Traditional deep belief networks typically use a quadratic cost function as the loss function for backward fine-tuning. In this cost function, a represents the fault diagnosis result and y represents the actual fault information. Parameters are mostly adjusted by gradient descent. In the resulting gradient formulas for the weights and biases, z represents the neuron input and σ represents the activation function. It can be seen from (10) that the gradient of the activation function determines the adjustment speed of the parameters, and the faster the parameters are adjusted, the faster the training converges. The activation function of the deep belief network is generally a sigmoid function, whose gradient becomes small when the neuron saturates, so the parameter update can be slow even when the error is large. Therefore, the improved deep belief network in our method replaces the loss function with a cross-entropy function.
With the cross-entropy loss, the gradient of the parameters is expressed directly as the difference between the output value and the actual value, so the larger the error, the larger the gradient and the stronger the parameter correction. Compared with the traditional deep belief network, whose loss function is a quadratic cost, our improved deep belief network uses the cross entropy as the loss function. This eliminates the influence of the activation function gradient on the parameter update and achieves the goal that the greater the error, the stronger the parameter adjustment.
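The difference can be checked numerically. Assuming a single sigmoid output neuron with a = σ(z), the quadratic cost C = (y − a)²/2 gives ∂C/∂b = (a − y)σ′(z), whereas the cross-entropy cost C = −[y ln a + (1 − y) ln(1 − a)] gives ∂C/∂b = a − y. The sketch below evaluates both bias gradients for a saturated neuron with a large error; the function names and the values of z and y are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def quadratic_bias_grad(z, y):
    """dC/db for C = 0.5 * (y - a)^2 with a = sigmoid(z)."""
    a = sigmoid(z)
    return (a - y) * a * (1.0 - a)   # includes the sigmoid derivative a * (1 - a)


def cross_entropy_bias_grad(z, y):
    """dC/db for C = -[y ln a + (1 - y) ln(1 - a)] with a = sigmoid(z)."""
    return sigmoid(z) - y            # the sigmoid derivative cancels out


# Saturated neuron with a large error: output close to 1, target 0
z, y = 5.0, 0.0
g_quad = quadratic_bias_grad(z, y)
g_ce = cross_entropy_bias_grad(z, y)
```

Here the quadratic-cost gradient is below 0.01 although the output is almost entirely wrong, while the cross-entropy gradient stays close to 1, which is the behavior the improved network relies on.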
The improved deep belief network based on the cross-entropy function, which diagnoses bearing faults by learning the feature information of the rolling bearing, is the basis of the proposed method. Multiple improved deep belief networks form the ensemble deep belief network. We use the IMF components obtained by VMD, which contain local feature information, and the reconstructed signal, which contains global feature information, as input signals for the improved ensemble deep belief network. Each improved deep belief network in the ensemble learns its corresponding input signal. After repeated iterations of unsupervised feature learning and supervised fine-tuning, the fault diagnosis results of the rolling bearing are obtained.

C. INFORMATION FUSION OF FAULT DIAGNOSIS RESULTS
Information fusion is a feature of our proposed method. The biggest advantage of our proposed method is to use the ensemble deep belief network model to simultaneously learn the local feature information and the global feature information of the rolling bearing vibration signal to achieve a more accurate and stable fault diagnosis conclusion. We have obtained corresponding fault diagnosis results by learning the IMF component containing local feature information and the reconstructed signal containing the global feature information. Now, we need to use the appropriate combination strategy to fuse the fault diagnosis results.
In fault diagnosis, the traditional weighted voting method in ensemble learning assigns the weight of each individual learning model in the ensemble according to its learning ability, that is, its overall fault diagnosis accuracy. However, for the rolling bearing fault diagnosis method proposed in this paper, the traditional weighted voting method has an obvious shortcoming. The IMF components obtained by decomposing the bearing vibration signal with VMD are a series of vibration signals with different center frequencies, so different IMF components contain different fault feature information. Because the traditional method considers only the overall fault diagnosis ability of each individual learning model, two kinds of errors arise: a learning model may have high overall diagnostic accuracy but low accuracy for a particular fault type, or low overall diagnostic accuracy but high accuracy for a particular fault type. Therefore, building on the traditional weighted voting method, we have designed an improved weighted voting method based on the accuracy of each learner for each single fault type.
The improved weighted voting method proposed by us is divided into the following three steps: First, we obtain the fault diagnosis results obtained by each improved deep belief network in the ensemble deep belief network according to different vibration signals. Then, we calculate the accuracy of each improved deep belief network fault diagnosis under the same fault type of rolling bearing. According to the accuracy of the improved deep belief network, the weights of each improved deep belief network under the fault type are designed. Finally, we reorganize the weights of each improved deep belief network under various fault types into a weight set of the entire integrated network.
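The three steps above can be sketched as follows. The exact weighting formula is not reproduced here, so this sketch makes two assumptions: the weight of a learner for a fault type is its accuracy on samples of that fault type, normalized across learners, and a learner's vote for a class is counted with its weight for that class. The function names and the toy predictions are hypothetical.

```python
import numpy as np


def class_weights(preds, labels, n_classes):
    """Per-fault-type weights. preds: (n_learners, n_samples), labels: (n_samples,)."""
    n_learners = preds.shape[0]
    acc = np.zeros((n_learners, n_classes))
    for c in range(n_classes):
        mask = labels == c
        if mask.any():
            # Accuracy of every learner on the samples of fault type c
            acc[:, c] = (preds[:, mask] == c).mean(axis=1)
    total = acc.sum(axis=0, keepdims=True)
    # Normalize across learners; fall back to equal weights if no learner is correct
    uniform = np.full_like(acc, 1.0 / n_learners)
    return np.divide(acc, total, out=uniform, where=total > 0)


def weighted_vote(preds, weights):
    """Fuse learner predictions using the weight of each learner for the class it votes for."""
    n_learners, n_samples = preds.shape
    scores = np.zeros((n_samples, weights.shape[1]))
    rows = np.arange(n_samples)
    for i in range(n_learners):
        scores[rows, preds[i]] += weights[i, preds[i]]
    return scores.argmax(axis=1)


# Toy example: three learners, two fault types; learner 0 is reliable on both
labels_val = np.array([0, 0, 0, 0, 1, 1, 1, 1])
preds_val = np.array([
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 0],
])
weights = class_weights(preds_val, labels_val, n_classes=2)

# Two new samples on which the reliable learner disagrees with the other two
preds_test = np.array([[0, 1], [1, 0], [1, 0]])
fused = weighted_vote(preds_test, weights)
```

In this toy case a plain majority vote would follow the two weaker learners, whereas the per-fault-type weights let the reliable learner decide both samples.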

D. PROPOSED DIAGNOSTIC STEPS
As shown in Fig. 2, we designed a fault diagnosis flowchart for rolling bearings based on the proposed method, which combines variational mode decomposition and the ensemble deep belief network. The rolling bearing diagnostic steps are summarized as follows:
Step 1: Fault tests are carried out on the rolling bearing fault simulation test bench, and an acceleration sensor is used to acquire the original vibration signal of the bearing.
Step 2: The original vibration signals of the rolling bearing are directly and randomly divided into training samples and test samples. It is worth noting that manual feature extraction and feature selection are not performed.
Step 3: Directly decompose the bearing vibration signal using the VMD technique to obtain a series of IMF components including the local feature information and a reconstructed signal including the global feature information.
Step 4: The IMF components and the reconstructed signal are used as input signals for the improved ensemble deep belief network. Each improved deep belief network performs feature learning on its corresponding input signal to obtain a series of fault diagnosis results.
Step 5: Use the improved weighted voting method in the decision-making layer to fuse the fault diagnosis results of the ensemble deep belief network and obtain the final diagnosis result.
Step 6: Use the test samples to systematically evaluate the fault diagnosis method based on VMD and the ensemble deep belief network proposed in this paper.

IV. EXPERIMENT AND ANALYSIS

A. BEARING EXPERIMENTAL DATA DESCRIPTION
We used the rolling bearing vibration fault data from the Case Western Reserve University laboratory to evaluate the capabilities of the proposed method [25]. The rolling bearing test bench consists of a load motor (left), a torque sensor/encoder (center) and a dynamometer (right). The original vibration signals for the different health conditions were measured by an accelerometer at 1797 rpm with a sampling rate of 12 kHz.
In this paper, the vibration data of the drive-end 6205-2RS rolling bearing are selected for the subsequent simulation research. The parameters are shown in Table.  (1 inch = 25.4 mm) respectively. Each type has 300 samples and each sample contains 400 sampling points; 200 randomly chosen samples are used as the training set and the remaining 100 samples are used as the test set.
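The sample construction can be sketched as follows. The sketch assumes that the 300 samples of each type are cut as consecutive, non-overlapping windows of 400 points from one recording, which is not stated explicitly above; the function names and the synthetic recording are hypothetical.

```python
import numpy as np


def segment_signal(signal, n_samples=300, sample_len=400):
    """Cut a 1-D recording into non-overlapping samples of fixed length."""
    signal = np.asarray(signal, dtype=float)
    needed = n_samples * sample_len
    if signal.size < needed:
        raise ValueError(f"recording has {signal.size} points, {needed} required")
    return signal[:needed].reshape(n_samples, sample_len)


def split_train_test(samples, n_train=200, n_test=100, seed=0):
    """Randomly assign samples to disjoint training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(samples))
    return samples[order[:n_train]], samples[order[n_train:n_train + n_test]]


# Synthetic stand-in for one recorded health condition
recording = np.random.default_rng(1).standard_normal(121_000)
samples = segment_signal(recording)
train, test = split_train_test(samples)
```

Fixing the seed keeps the random split reproducible across the compared methods.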
In this paper, the vibration data of the rolling bearing under no load and under a load of 1 horsepower are selected. As shown in Table 2, the data set A series contains the vibration data of the various health states of the rolling bearing under no load. The data set B series contains the vibration data of the various health states of the rolling bearing under a load of 1 horsepower. The data set D series consists of the data set A series and the data set B series; its purpose is to test the applicability of the proposed fault diagnosis method under multi-load conditions. It is worth noting that each data set series contains two bearing data sets, that is, the original vibration data set and the feature data set. Each original vibration data set contains bearing vibration signals for the normal condition and nine different fault conditions. Each feature data set is composed of 10 sensitive feature values extracted from each of the 8 reconstructed frequency bands obtained by the wavelet transform of the original vibration data set. The 10 sensitive features are the mean, variance, root mean square, maximum, peak-to-peak value, median, crest factor, skewness, kurtosis, and wavelet packet energy. Therefore, each health state in the original vibration data set contains 300 samples of 400 vibration data points each, and each health state in the feature data set contains 300 samples of 80 (8 * 10) feature values each.
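The layout of one 80-dimensional feature sample can be sketched as follows. The sketch assumes that the 8 band signals have already been obtained by the wavelet decomposition, that the band energy is the sum of squared values, and that the eighth and ninth statistics are skewness and kurtosis; the function names and the random band signals are hypothetical.

```python
import numpy as np


def band_features(x):
    """Ten statistics of one band signal, in the order listed in the text."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centered = x - mean
    m2 = np.mean(centered ** 2)                     # population variance
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    skewness = np.mean(centered ** 3) / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = np.mean(centered ** 4) / m2 ** 2 if m2 > 0 else 0.0
    return np.array([
        mean, m2, rms, x.max(), x.max() - x.min(), np.median(x),
        peak / rms if rms > 0 else 0.0,             # crest factor
        skewness, kurtosis,
        np.sum(x ** 2),                             # band energy
    ])


def feature_vector(bands):
    """Concatenate the statistics of all bands into one feature sample."""
    return np.concatenate([band_features(b) for b in bands])


# Random stand-ins for the 8 band signals of one 400-point sample
bands = np.random.default_rng(0).standard_normal((8, 400))
features = feature_vector(bands)

# Small hand-checkable case
check = band_features([1.0, -1.0, 1.0, -1.0])
```

Such a vector is what the shallow models (BP neural network and SVM) would receive in place of the raw 400-point sample.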

B. VARIATIONAL MODE DECOMPOSITION OF BEARING VIBRATION SIGNALS
In order to obtain local feature information of the rolling bearing vibration signal, the VMD technique is used to adaptively decompose the original bearing vibration signal into IMF components containing local bearing characteristic information. According to the VMD theory in Section 2.1, the signal decomposition scale K and the penalty factor α are the main factors affecting the decomposition results of the bearing signal. If K is too small, the signal is decomposed into too few IMF components; since VMD is equivalent to a self-adaptive Wiener filter, some important information in the original signal is then filtered out. Conversely, if K is too large, the signal is decomposed into too many IMF components, so that signals of the same frequency band are split across different IMF components and the center frequency bands of the decomposed IMF components overlap. The penalty factor α affects the bandwidth and convergence speed of each IMF component. In order to obtain suitable IMF components containing local feature information of the bearing, we determine the parameters of the VMD algorithm experimentally.
This article takes only the vibration signal of the bearing inner ring fault as an example. Fig. 3 is the final decomposition result of this signal. We determine the signal decomposition scale K and the penalty factor α of the VMD technique by the center-frequency observation method. Table 3 lists the center frequency values of the IMF components at different decomposition scales K, and Fig. 4 is a visualization of Table 3. It can be seen that when K is 2, the information in the 1000∼2000 Hz and 3000∼4000 Hz bands is filtered out. When K is 3, the information in the 1000∼2000 Hz band is still filtered out. When K is 4, the information of each frequency band is obtained. When K is 5, the 3000∼4000 Hz band is divided into two segments, and the center bands of the fourth IMF and the fifth IMF overlap. Therefore, the decomposition scale K is set to 4.
The penalty factor α is then examined with the decomposition scale K fixed at 4. When α is 100, the IMF1 component contains two modal components with center frequencies in 0∼1000 Hz and 1000∼2000 Hz. In addition, the single modal component in 3000∼4000 Hz is decomposed into the two components IMF3 and IMF4, and modal aliasing occurs. When α is 200, the IMF1 component again contains two modal components with center frequencies in 0∼1000 Hz and 1000∼2000 Hz. In addition, the single modal component in 2000∼3000 Hz is decomposed into the two components IMF2 and IMF3, and modal aliasing occurs. When α is 400∼4000, the bearing vibration signal is successfully divided into 4 IMF components with no overlapping center frequencies and no modal aliasing. When α is 2000, the decomposition takes the least time. When α is 8000, the penalty is so large that the amplitude of the low-frequency component IMF1 becomes too small, and the fault feature information cannot be obtained from IMF1. Therefore, the penalty factor α is set to 2000.
The IMF components obtained by decomposing the original vibration signal with the VMD method contain feature information from different frequency ranges of the original signal, which makes it possible to mine this information in greater depth with the ensemble deep belief network.

C. EXPERIMENT DESIGN
In order to evaluate the practical diagnosis ability of our proposed fault diagnosis method based on variational mode decomposition and the ensemble deep belief network, we conducted three sets of fault diagnosis experiments using the three rolling bearing vibration data sets introduced in Section A. In each set of experiments, the fault diagnosis ability of the proposed method was evaluated by comparing its diagnosis results with those of existing intelligent fault diagnosis methods.
Because model parameters such as the network structure of the ensemble deep belief network will directly affect the accuracy of fault diagnosis and no mature theory currently directly determines this parameter, this paper uses experiments to determine the hyper parameters of the ensemble deep belief network. Therefore, before the comparative test of the proposed method and the traditional intelligent diagnostic method, the hyper parameters of the ensemble deep belief network need to be determined experimentally.
A total of four experiments are performed in this paper. The details are as follows.

Experiment 1:
In order to obtain the best hyperparameters of the ensemble deep belief network, the three bearing data sets of Part A are input into the model, and the best hyperparameters of the model are determined by comparing the fault diagnosis results of the model under different parameters.

Experiment 2:
To evaluate the fault diagnosis capability of the proposed method, we used the A series no-load bearing vibration data set. First, the raw vibration signal is used as the input to the proposed fault diagnosis model and to the traditional intelligent diagnosis methods (deep belief network, convolutional neural network, stacked autoencoder, BP neural network and support vector machine) for feature learning and fault diagnosis. Since the BP neural network and the support vector machine are shallow learning models, the feature data set is then also used as their input for feature learning and fault diagnosis.

Experiment 3:
To evaluate the versatility of the proposed method for fault diagnosis under different loads, we used the B series bearing data set with a load of 1 horsepower. The procedure of Experiment 3 is the same as that of Experiment 2. First, the original vibration signal is used as the input to the proposed fault diagnosis model and to the traditional intelligent diagnosis methods (deep belief network, convolutional neural network, stacked autoencoder, BP neural network and support vector machine) for feature learning. Since the BP neural network and the support vector machine are shallow learning models, the feature data set is then also used as their input for feature learning and fault diagnosis.

Experiment 4:
To evaluate the ability of the proposed method to diagnose faults under multiple loads, we used the D series multi-load bearing data set. The procedure of Experiment 4 is also the same as that of Experiment 2. First, the original vibration signal is used as the input to the proposed fault diagnosis model and to the traditional intelligent diagnosis methods (deep belief network, convolutional neural network, stacked autoencoder, BP neural network and support vector machine) for feature learning. Since the BP neural network and the support vector machine are shallow learning models, the feature data set is then also used as their input for feature learning and fault diagnosis.

Experiment 1:
The ensemble deep belief network has many hyperparameters, and the network structure of the model directly affects feature learning and fault diagnosis. On the one hand, too many hidden layers and hidden units may improve the diagnosis results but complicate the model and increase the amount of calculation. On the other hand, if the number of hidden layers and hidden units is too small, network performance may be poor. Therefore, Experiment 1 is used to discuss and determine the network structure of the models in the three sets of experiments.
In this experiment, the network structure is selected under the principle that the number of units in the i-th hidden layer is less than the number of units in the (i−1)-th hidden layer. The diagnostic effect of a model is evaluated by the accuracy of its diagnostic results. Each model is run in 5 repeated experiments, and the average of the 5 diagnostic results is taken as the accuracy of the model. Tables 5, 6, and 7 show the diagnosis results of the ensemble deep belief networks with different network structures on the different data sets. We can draw the following conclusions: 1) For complex nonlinear classification problems such as bearing fault diagnosis, deep network structures usually achieve better diagnostic accuracy, because deep architectures have powerful nonlinear approximation capabilities and powerful computing capabilities. 2) A more complex network structure does not necessarily handle nonlinear problems better. Because of its more powerful learning ability, an overly complex network may over-learn the training samples: while learning the regularity of bearing performance changes, it may also learn interference information, so the diagnosis results on the test samples become worse, that is, overfitting occurs.
In summary, this paper uses manual adjustment of parameters to determine the optimal network structure of the model under different data sets, which provides a prerequisite for the subsequent comparative experiments.
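The structural principle used in Experiment 1 (each hidden layer smaller than the previous one) can be sketched as a candidate generator for the manual search. This is a minimal illustration, not the search procedure used in this paper. The function names and the pool of hidden-layer sizes are hypothetical, and we assume that the first and last entries of a structure such as 400-50-20-10-10 are the input and output layers.

```python
from itertools import combinations


def is_decreasing_structure(layer_sizes):
    """Return True if every hidden layer is smaller than the one before it.

    layer_sizes is [input, hidden_1, ..., hidden_n, output]. Only the hidden
    layers are checked (assumption: first and last entries are input/output).
    """
    hidden = layer_sizes[1:-1]
    return all(later < earlier for earlier, later in zip(hidden, hidden[1:]))


def candidate_structures(n_inputs, n_outputs, hidden_pool, n_hidden):
    """Enumerate structures whose hidden sizes strictly decrease."""
    sizes = sorted(set(hidden_pool), reverse=True)
    # combinations() preserves the descending order of `sizes`.
    return [[n_inputs, *combo, n_outputs] for combo in combinations(sizes, n_hidden)]


# Hypothetical pool of hidden-layer sizes for a 400-input, 10-class model.
candidates = candidate_structures(400, 10, [200, 100, 50, 20, 10], n_hidden=3)
print(len(candidates), "candidate structures")
print([400, 50, 20, 10, 10] in candidates)
```

Each candidate would then be trained 5 times and ranked by its average accuracy, as described above.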

Experiment 2:
As shown in Table 8, the network structure of each deep belief network in the ensemble deep belief network is 400-50-20-10-10. The learning rate of the weights of each layer is 0.001, the momentum is 0.9, and the number of iterations is 200. The decomposition number K of the variational mode decomposition (VMD) is 4, and the penalty factor α is 2000. The parameters of the remaining intelligent diagnosis methods in Experiment 2 are as follows:
1) Deep belief network: the network structure is 400-50-20-10-10, the learning rate is 0.001, the momentum is 0.9, and the number of iterations is 250.
2) Convolutional neural network: the input sample is arranged into a 20 × 20 sample map. The first convolutional layer has 6 kernels of size 5, the step size of the pooling layer is 2, and the second convolutional layer has 12 kernels of size 5. The learning rate is 1, and the number of iterations is 100.
3) Stacked autoencoder: the network structure is 400-200-10 and the activation function is ReLU. The learning rate is 0.45, the momentum is 0.9, and the number of iterations is 100. The sparsity penalty factor is 0.3, and the sparse parameter is 0.01.
4) BP neural network with raw data set: the network structure is 400-25-10, the learning rate is 0.8, and the number of iterations is 100.
5) Support vector machine with raw data set: the RBF kernel is applied. The penalty factor is 1.2 and the kernel radius is 0.6.
6) BP neural network with feature data set: the network structure is 80-25-10, the learning rate is 0.8, and the number of iterations is 100.
7) Support vector machine with original bearing vibration data set A1: the RBF kernel is applied. The penalty factor is 3.2 and the kernel radius is 1.8.
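To make the convolutional neural network settings above concrete, the following sketch reshapes a 400-point sample into a 20 × 20 map and traces the feature-map sizes through the two convolutional layers. It is only a worked arithmetic example under stated assumptions: unpadded convolutions with stride 1, and a non-overlapping pooling layer with window 2 after each convolutional layer. These details are not specified in the text, and the helper names are ours.

```python
def to_sample_map(sample, side=20):
    """Arrange a flat sample of side*side points into a side x side map."""
    if len(sample) != side * side:
        raise ValueError("sample length must equal side * side")
    return [sample[r * side:(r + 1) * side] for r in range(side)]


def conv_output_size(size, kernel, stride=1):
    """Output side length of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def pool_output_size(size, window):
    """Output side length of non-overlapping pooling."""
    return size // window


sample_map = to_sample_map(list(range(400)))  # placeholder data, not a real signal

c1 = conv_output_size(20, 5)   # first convolutional layer, 6 kernels of size 5
p1 = pool_output_size(c1, 2)   # pooling with step size 2
c2 = conv_output_size(p1, 5)   # second convolutional layer, 12 kernels of size 5
p2 = pool_output_size(c2, 2)   # assumed second pooling layer
flattened = 12 * p2 * p2       # features passed to the classifier

print(c1, p1, c2, p2, flattened)
```

Under these assumptions the map shrinks from 20 to 16, 8, 4 and 2 per side, leaving 48 features for the output layer.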
To ensure the accuracy and stability of the results of the proposed bearing fault diagnosis method, we carried out repeated experiments. The average of the repeated fault diagnosis results is taken as the accuracy of the method, and their standard deviation is taken as the stability of the method. The improved deep belief network (IDBN) obtains a relatively stable fault diagnosis result after fully learning the rolling bearing feature information of each input signal. As shown in Table 9, the diagnostic accuracy of the IDBN is 61.88% when the input signal is the IMF1 component, 75.94% for the IMF2 component, 87.64% for the IMF3 component, 84.38% for the IMF4 component, and 94.64% for the reconstructed signal. The final diagnostic accuracy obtained through the improved combination strategy is 98.96%.
The deep belief network that attends only to the global feature information thus reaches an accuracy of 94.64%. The proposed fault diagnosis method based on VMD and EDBN learns the local feature information and the global feature information of the rolling bearing at the same time, which greatly improves the accuracy of rolling bearing fault diagnosis: the accuracy reaches 98.96%. It is worth noting that when only the feature information of a single IMF component is learned, the fault diagnosis result may not be ideal. This is because VMD adaptively decomposes the vibration signal into IMF components with different center frequencies, so some IMF components contain little fault characteristic information and cannot support a correct diagnosis.
Fig. 7 is a confusion matrix of the fault diagnosis results when the IMF1 component is used as the input signal in one experiment. The improved deep belief network cannot recognize the 0.007 inch outer ring fault, that is, the eighth health state, because the IMF1 component contains almost no fault information specific to that state. However, it accurately identifies the fifth, sixth, and seventh health states, because IMF1 contains the inner ring fault information of the rolling bearing. Fig. 8 is a confusion matrix of the final diagnostic results of the fault diagnosis method based on VMD and EDBN in one experiment. After the proposed method fully learns the local and global feature information of the rolling bearing, the diagnostic accuracy improves markedly, to 99.5%.
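The decision-level fusion of the five IDBN outputs (IMF1 to IMF4 and the reconstructed signal) can be illustrated with a generic accuracy-weighted soft vote. We stress that this is not the improved combination strategy of this paper, only a common baseline combination rule swapped in for illustration. The function name and the class-probability vectors are hypothetical, and the weights simply reuse the Table 9 accuracies as example numbers; in practice weights would be estimated on held-out data.

```python
def weighted_soft_vote(prob_vectors, weights):
    """Combine per-model class probabilities with normalized weights.

    prob_vectors: one probability vector per model (all the same length).
    weights: one non-negative weight per model.
    Returns (predicted_label, combined_probabilities).
    """
    total = sum(weights)
    n_classes = len(prob_vectors[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(prob_vectors, weights):
        for c, p in enumerate(probs):
            combined[c] += (weight / total) * p
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Hypothetical 3-class outputs of the five networks for one test sample.
outputs = [
    [0.60, 0.30, 0.10],  # IMF1 network
    [0.20, 0.50, 0.30],  # IMF2 network
    [0.10, 0.80, 0.10],  # IMF3 network
    [0.20, 0.70, 0.10],  # IMF4 network
    [0.10, 0.85, 0.05],  # reconstructed-signal network
]
example_weights = [0.6188, 0.7594, 0.8764, 0.8438, 0.9464]  # illustrative only

label, combined = weighted_soft_vote(outputs, example_weights)
print(label, [round(p, 3) for p in combined])
```

In this toy case the weakest network votes for class 0, but the weighted vote still selects class 1, which is the kind of error correction an ensemble is meant to provide.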
To better evaluate the effectiveness and superiority of the proposed fault diagnosis method, we carried out a series of comparative experiments against traditional intelligent diagnostic methods. Table 10 compares the diagnostic results of the different intelligent diagnostic methods when the no-load bearing vibration signal is used as the input signal. The proposed method based on VMD and EDBN has the highest test accuracy, 98.96%, and the best stability, with a standard deviation of only 0.7861%. The accuracy and stability of the remaining traditional deep learning and machine learning diagnostic methods are as follows: convolutional neural network, 89.82% and 2.9953%; stacked autoencoder, 91.43% and 1.4700%; deep belief network, 93.52% and 2.5600%; BP neural network, 72.70% and 2.2310%; support vector machine, 83.70% and 2.1600%. When the feature data set is used as the input signal, the BP neural network reaches an accuracy of 87.95% with a stability of 1.5946%, and the support vector machine reaches an accuracy of 87.70% with a stability of 1.8900%. As shown in Fig. 9, we have visualized the accuracy and stability of the diagnostic results of the proposed method and the traditional intelligent fault diagnosis methods. The figure has two ordinates: the left ordinate represents the accuracy of the diagnosis results, and the right ordinate represents their stability. The histogram represents the test accuracy of each diagnostic method, and the line graph represents its test stability.
Through the data in Table 10 and the visualization in Fig. 9, we can draw the following conclusions:
(1) The proposed fault diagnosis method based on VMD and EDBN is far superior to the traditional intelligent fault diagnosis methods in both the accuracy and the stability of the diagnosis results. This is because the traditional intelligent fault diagnosis methods attend only to the global feature information of the vibration signal, whereas our method learns the global feature information of the rolling bearing vibration signal and also mines its local feature information. In addition, the ensemble learning of multiple improved deep belief networks overcomes the insufficient learning ability of a single learning framework, improving the accuracy and stability of bearing fault diagnosis.
(2) The diagnostic accuracy of the deep learning methods is clearly superior to that of the traditional machine learning methods. This is because the traditional machine learning methods are shallow learning models with limited nonlinear approximation ability. When dealing with the recognition of non-stationary, complex, noise-polluted signals such as rolling bearing vibration signals, their performance is relatively poor.
(3) When machine learning methods are used for rolling bearing fault diagnosis, the feature data set is a better input signal than the raw data set. This is because the data in the feature data set has undergone manual feature extraction and feature selection, whereas the raw data set has not. However, manual feature extraction and feature selection require a lot of manpower and time, and the resulting feature set does not generalize. Our proposed method based on VMD and EDBN not only eliminates manual feature extraction and feature selection, greatly saving manpower and time, but also achieves highly accurate diagnostic results. Compared with the traditional intelligent fault diagnosis methods, our proposed diagnosis method is more accurate and more stable.
Figure 10 shows the detailed diagnosis result for each health condition of the bearing in one test. The diagnostic accuracy of the proposed fault diagnosis method based on VMD-EDBN is in most health states much higher than that of the traditional intelligent fault diagnosis methods. In the tenth health state, the accuracy of our method is relatively low, but it still reaches 97%, whereas the SVM achieves 100% when the feature data set is used as the input signal. In the other health states, however, the diagnostic accuracy of the support vector machine is much lower than that of our proposed fault diagnosis method. Therefore, compared with the traditional intelligent fault diagnosis methods, our proposed method can diagnose bearing faults and their severity more accurately and stably.
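The accuracy and stability figures reported in this comparison are the mean and the standard deviation of repeated runs. A minimal sketch of that summary is given below. The run values are hypothetical, and the use of the sample standard deviation (divisor n−1) is our assumption, since the text does not state which estimator is used; `statistics.pstdev` would give the population version.

```python
from statistics import mean, stdev


def summarize_runs(accuracies_percent):
    """Return (accuracy, stability) for a list of per-run test accuracies.

    accuracy  = mean of the runs
    stability = sample standard deviation of the runs (assumed estimator)
    """
    return mean(accuracies_percent), stdev(accuracies_percent)


# Hypothetical accuracies (in percent) from five repeated experiments.
runs = [98.0, 99.0, 100.0, 99.0, 99.0]
accuracy, stability = summarize_runs(runs)
print(f"accuracy = {accuracy:.2f}%, stability = {stability:.4f}%")
```

A smaller stability value means the method's result varies less from run to run, which is why it is read as "more stable" in Table 10.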

Experiment 3:
To show that the proposed fault diagnosis method based on VMD and EDBN can be applied to rolling bearing fault diagnosis under different single loads, the B series vibration data of the rolling bearing under a load of 1 horsepower was used as the control experiment for Experiment 2. The parameters of Experiment 3 were also obtained experimentally and are as follows:
(1) Our proposed method based on VMD and EDBN: As shown in Table 11, the network structure of each deep belief network in the ensemble deep belief network is 400-100-50-10-10. The learning rate is 0.01, the momentum is 0.9, and the number of iterations is 200. The decomposition number K of the variational mode decomposition (VMD) is 4, and the penalty factor α is 2000.
(2) Deep belief network: the network structure is 400-50-20-10-10, the learning rate is 0.01, the momentum is 0.9, and the number of iterations is 200.
(3) Convolutional neural network: the input sample is arranged into a 20 × 20 sample map. The first convolutional layer has 6 kernels of size 5, the step size of the pooling layer is 2, and the second convolutional layer has 12 kernels of size 5. The learning rate is 1, and the number of iterations is 200.
(4) Stacked autoencoder: the network structure is 400-200-10 and the activation function is ReLU. The learning rate is 0.45, the momentum is 0.9, and the number of iterations is 100. The sparsity penalty factor is 0.3, and the sparse parameter is 0.01.
(5) BP neural network with Raw data set: BP neural network structure is 400-200-50-10, the learning rate is 0.8, the number of iterations is 500.
(6) Support vector machine with Raw data set: RBF kernel is applied. The penalty factor is 1.5 and the kernel radius is 0.9.
(7) BP neural network with Feature data set: BP neural network structure is 80-25-10, the learning rate is 0.8, the number of iterations is 400.
(8) Support vector machine with Feature data set: RBF kernel is applied. The penalty factor is 1.2 and the kernel radius is 2.2.
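For the support vector machine settings listed above, the role of the kernel radius can be sketched as follows. We assume here that the "kernel radius" denotes the width σ of the Gaussian RBF kernel exp(−‖x−y‖²/(2σ²)); the text does not define it, so this mapping is an assumption, as are the function names. Under that assumption, libraries that parameterize the kernel as exp(−γ‖x−y‖²) would use γ = 1/(2σ²), while the penalty factor plays the role of the usual soft-margin constant C.

```python
import math


def rbf_kernel(x, y, radius):
    """Gaussian RBF kernel with width `radius` (assumed meaning of kernel radius)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * radius ** 2))


def radius_to_gamma(radius):
    """Convert a kernel radius sigma to the gamma of exp(-gamma * ||x - y||^2)."""
    return 1.0 / (2.0 * radius ** 2)


# Two hypothetical 2-D feature vectors; a larger radius makes them look more similar.
u, v = [1.0, 0.0], [0.0, 0.0]
for radius in (0.9, 2.2):
    print(radius, round(rbf_kernel(u, v, radius), 4), round(radius_to_gamma(radius), 4))
```

A small radius makes the decision boundary follow individual samples closely, while a large radius smooths it, which is one reason the value has to be tuned separately for the raw and feature data sets.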
To ensure the accuracy and stability of the results of the proposed bearing fault diagnosis method, we also carried out repeated experiments in Experiment 3. The average of the repeated fault diagnosis results is taken as the accuracy of the method, and their standard deviation is taken as the stability of the method. Fig. 11 shows the accuracy of the fault diagnosis results of the different diagnostic signals and of the overall model in the proposed method under the load of 1 horsepower. The improved deep belief network (IDBN) obtains relatively stable fault diagnosis results after fully learning the rolling bearing feature information of each input signal. As shown in Table 12, the diagnostic accuracy of the IDBN is 66.52% when the input signal is the IMF1 component, 66.06% for the IMF2 component, 80.92% for the IMF3 component, 84.26% for the IMF4 component, and 95.08% for the reconstructed signal. The final diagnostic accuracy obtained through the improved combination strategy is 97.54%. Fig. 12 shows the fault diagnosis results for the different diagnostic signals and the average accuracy of the model diagnosis results under the load of 1 horsepower. The improved deep belief network that uses only the global feature information reaches an accuracy of 95.08%. The proposed fault diagnosis method based on VMD and EDBN learns the local and global feature information of the rolling bearing simultaneously, which greatly improves the accuracy of rolling bearing fault diagnosis: the accuracy is 97.54%. Fig. 13 is the confusion matrix of the final diagnostic results of the proposed method in one experiment under the load of 1 horsepower. The fault diagnosis method proposed in this paper is therefore also applicable to rolling bearings under different single loads.
To better evaluate the effectiveness and adaptability of the proposed fault diagnosis method, we conducted a series of comparative experiments with traditional intelligent diagnostic methods using the rolling bearing data under the load of 1 horsepower. Table 13 compares the diagnostic results of the different intelligent diagnostic methods when the vibration signal under a load of 1 horsepower is used as the input signal. The proposed method based on VMD and EDBN has the highest test accuracy, 97.54%, and the best stability, with a standard deviation of only 0.9813%. The accuracy and stability of the remaining traditional deep learning and machine learning diagnostic methods are as follows: convolutional neural network, 93.65% and 1.9953%; stacked autoencoder, 88.40% and 1.7394%; deep belief network, 94.33% and 2.1733%; BP neural network, 67.60% and 1.2365%; support vector machine, 80.50% and 2.2076%. When the feature data set is used as the input signal, the BP neural network has an accuracy of 91.80% and a stability of 1.6325%, and the support vector machine has an accuracy of 88.40% and a stability of 1.6542%.
Through the data in Table 13 and the diagnostic results of Fig. 14, we can draw the following conclusion: for bearings under different loads, our proposed fault diagnosis method based on VMD and EDBN is far superior to the traditional intelligent fault diagnosis methods in terms of accuracy and stability. The proposed method attends not only to the global feature information of the rolling bearing but also mines its local feature information. Therefore, it can better learn the feature information of rolling bearings and diagnose rolling bearing faults, and it generalizes across load conditions.
The diagnostic accuracy of the proposed fault diagnosis method is in most health states of the rolling bearing higher than that of the traditional intelligent fault diagnosis methods. In the second health state, the diagnostic accuracy of our proposed method is relatively low, but it still reaches 94%. Although stacked autoencoders, deep belief networks, and support vector machines have higher diagnostic accuracy in this health state, the three methods give poor results in other health states. For example, in the tenth health state, the diagnostic accuracy of the support vector machine is 25% and that of the stacked autoencoder is 67%, whereas the accuracy of our proposed method reaches 100%. In the fourth health state, the diagnostic accuracy of the deep belief network is only 72%, whereas the accuracy of our proposed method reaches 98%. Therefore, compared with the traditional intelligent fault diagnosis methods, the proposed method can diagnose bearing faults and their severity more accurately and stably.
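The per-health-state accuracies discussed above can be read directly from a confusion matrix. The sketch below shows the computation, assuming that rows are true health states and columns are predicted states. The 3 × 3 matrix is hypothetical and much smaller than the 10-state matrices in the figures.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class that was predicted correctly (row-wise)."""
    result = []
    for i, row in enumerate(confusion):
        total = sum(row)
        result.append(row[i] / total if total else float("nan"))
    return result


def overall_accuracy(confusion):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical confusion matrix: rows = true state, columns = predicted state.
example_confusion = [
    [50, 0, 0],
    [2, 47, 1],
    [0, 5, 45],
]
class_acc = per_class_accuracy(example_confusion)
total_acc = overall_accuracy(example_confusion)
print([round(a, 3) for a in class_acc], round(total_acc, 4))
```

Comparing the row-wise values across methods is what reveals cases where one method wins on a single health state but loses badly on others.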

Experiment 4:
Rolling bearings, which are indispensable in rotating machinery, often need to work under different loads. In practical rolling bearing fault diagnosis, the bearing vibration signal obtained by the sensor is therefore often a variable-load vibration signal. We thus verify the fault diagnosis ability of our proposed method based on VMD and EDBN under multiple load conditions, using the D series multi-load rolling bearing vibration data. The parameters of Experiment 4 were also obtained experimentally and are as follows:
(1) Our proposed method based on VMD and EDBN: As shown in Table 14, the network structure of each deep belief network in the ensemble deep belief network is 400-100-50-10-10. The learning rate is 0.01, the momentum is 0.9, and the number of iterations is 300. The decomposition number K of the variational mode decomposition (VMD) is 4, and the penalty factor α is 2000.
(2) Deep belief network: the network structure is 400-50-20-10-10, the learning rate is 0.1, the momentum is 0.9, and the number of iterations is 200.
(3) Convolutional neural network: the input sample is arranged into a 20 × 20 sample map. The first convolutional layer has 6 kernels of size 5, the step size of the pooling layer is 2, and the second convolutional layer has 12 kernels of size 5. The learning rate is 0.8, and the number of iterations is 150.
(4) Stacked autoencoder: the network structure is 400-200-10 and the activation function is ReLU. The learning rate is 0.45, the momentum is 0.9, and the number of iterations is 100. The sparsity penalty factor is 0.3, and the sparse parameter is 0.01.
(5) BP neural network with Raw data set: BP neural network structure is 400-100-50-10, the learning rate is 0.8, the number of iterations is 500.
(6) Support vector machine with Raw data set: RBF kernel is applied. The penalty factor is 1.2 and the kernel radius is 1.8.
(7) BP neural network with Feature data set: BP neural network structure is 80-25-10, the learning rate is 0.8, the number of iterations is 300.
(8) Support vector machine with Feature data set: RBF kernel is applied. The penalty factor is 1.2 and the kernel radius is 1.4.
To ensure the accuracy and stability of the results of the proposed bearing fault diagnosis method, we also performed repeated experiments in Experiment 4. The average of the repeated fault diagnosis results is taken as the accuracy of the method, and their standard deviation is taken as the stability of the method. Fig. 16 shows the accuracy of the fault diagnosis results of the different diagnostic signals in repeated experiments under multiple loads. The improved deep belief network (IDBN) obtains relatively stable fault diagnosis results after fully learning the rolling bearing feature information of each input signal. As shown in Table 15, the diagnostic accuracy of the IDBN is 66.52% when the input signal is the IMF1 component, 66.06% for the IMF2 component, 80.92% for the IMF3 component, 84.26% for the IMF4 component, and 95.08% for the reconstructed signal. The final diagnostic accuracy obtained through the improved combination strategy is 97.54%. Fig. 17 shows the fault diagnosis results for the different diagnostic signals under multiple loads and the average accuracy of the model diagnostic results. The improved deep belief network that uses only the global feature information reaches an accuracy of 95.08%. The proposed fault diagnosis method based on VMD and EDBN learns the local and global feature information of the rolling bearing simultaneously, which greatly improves the accuracy of rolling bearing fault diagnosis: the accuracy is 97.54%. Fig. 18 is a confusion matrix of the final diagnostic results of the method in one experiment under multiple loads.
The fault diagnosis method based on VMD and EDBN proposed in this paper can accurately and stably diagnose the health status of rolling bearings under multiple loads, and is therefore suited to fault diagnosis under actual operating conditions. To better evaluate the effectiveness and adaptability of the proposed method on the multi-load rolling bearing fault diagnosis problem, we conducted a series of comparative experiments with traditional intelligent diagnostic methods on the multi-load rolling bearing data set. Table 16 compares the diagnostic results of the different intelligent diagnostic methods when the multi-load vibration signal is used as the input signal. The proposed method based on VMD and EDBN has the highest test accuracy and the best stability: the accuracy is 98.452%, and the standard deviation is only 0.5303%. The accuracy and stability of the remaining traditional deep learning and machine learning diagnostic methods are as follows: convolutional neural network, 94.133% and 0.9815%; stacked autoencoder, 88.270% and 0.5346%; deep belief network, 93.330% and 1.9635%; BP neural network, 75.167% and 1.4511%; support vector machine, 64.800% and 2.9463%. When the feature data set is used as the input signal, the BP neural network has an accuracy of 90.067% and a stability of 0.8607%, and the support vector machine has an accuracy of 91.167% and a stability of 2.4111%. Fig. 19 visualizes the diagnosis effect of each intelligent diagnosis method in Table 16 under multiple loads. From Fig. 19, we can draw the following conclusions:
(1) The fault diagnosis method proposed in this paper is suitable not only for bearing fault diagnosis under a single load, but also for bearing fault diagnosis under multiple loads.
(2) The fault diagnosis method proposed in this paper is far superior to the traditional intelligent fault diagnosis methods relying on machine learning and deep learning in terms of the accuracy and stability of the diagnosis results.
(3) The fault diagnosis method proposed in this paper makes full use of the deep structure of the DBN and performs autonomous and effective feature learning on bearing vibration signals, which removes the need for manual feature extraction and feature selection and saves labor and time costs.
To further demonstrate the diagnosis effect of the proposed fault diagnosis method under multiple load conditions, Fig. 20 compares the detailed fault diagnosis accuracy in each health state of the multi-load rolling bearing. On the multi-load rolling bearing vibration data, the accuracy of the proposed method is in most health states higher than that of the traditional intelligent diagnosis methods relying on deep learning and machine learning. In the tenth health state, the diagnostic accuracy of the proposed method is relatively low, but it still reaches 96%. Although the convolutional neural network and the support vector machine using the feature set are more accurate in this health state, these two methods give poor results in other health states. For example, in the seventh health state, the diagnostic accuracy of the support vector machine is 70%, whereas the accuracy of the proposed method reaches 100%. In the fourth health state, the diagnostic accuracy of the convolutional neural network is only 77%, whereas the accuracy of the proposed method reaches 96%.

V. CONCLUSION
This paper proposes a novel method based on variational mode decomposition and ensemble deep belief network for the fault diagnosis of rolling bearings. We directly use the bearing vibration data obtained from the experiment, without manual feature extraction and feature selection, attend to both the local and the global information of the bearing vibration data, and diagnose the 10 health states in the bearing data set. To verify the effectiveness of the proposed method, we conducted three sets of experiments using the rolling bearing vibration data simulated at the Case Western Reserve University laboratory. The experimental results show that, compared with traditional machine learning and deep learning methods, the method diagnoses the fault type and fault severity of the rolling bearing more effectively and stably. The advantages of this method are as follows: (1) It directly uses the original bearing vibration data obtained by the experiment, without manual feature extraction and selection, which eliminates human subjectivity and greatly reduces information processing time. (2) The ensemble deep belief network uses multiple deep belief networks simultaneously to learn the local and global information of the bearing, fully utilizing the learning ability of the deep belief network and improving the accuracy and stability of the bearing health status diagnosis. (3) The method is suitable not only for single-load bearing conditions, but also for multiple-load conditions, which are close to actual engineering practice.