Short Term Wind Speed Prediction Based on VMD and DBN Combined Model Optimized by Improved Sparrow Intelligent Algorithm

Accurate wind speed prediction helps the power sector anticipate how wind power will change, reduces the impact of wind power grid connection, and thereby improves the wind power consumption rate. To this end, an optimized variational mode decomposition (OVMD) method combined with an optimized deep belief network (ODBN) is proposed to predict wind speed. First, the original wind speed data are decomposed by the OVMD method; the decomposed components are then predicted by the ODBN method, and the predicted component values are superimposed to obtain the wind speed prediction. Taking actual wind speed data from an area in Northwest China as an example, the proposed combined model is compared with common prediction methods such as DBN, long short-term memory (LSTM), extreme learning machine (ELM), and BP neural network. The experimental results show that its RMSE decreases by 0.4494, 0.4778, 0.6217 and 0.6587, and its MAPE decreases by 10.3554%, 11.5484%, 14.6226% and 15.9493%, respectively. The results verify the effectiveness of the prediction model.

Because it adopts multi-layer nonlinear transformation, it can more effectively represent the complex relationships in wind speed and wind power data. At present, it has become a research hotspot in new energy output prediction [9], [10]. Literature [11] used complete ensemble empirical mode decomposition to preprocess the data and combined a long short-term memory neural network with a BP network to build a wind speed prediction model. Literature [12] proposed a combined model of a convolutional neural network and a bidirectional long short-term memory neural network, in which the convolutional neural network extracts the internal features of the time series and a genetic algorithm optimizes the hyperparameters of the model. Literature [13] proposed a model combining wavelet transform with a deep belief network and compared it with conventional prediction methods. Literature [14] proposed an adaptive deep learning model that learns from the data automatically, generates an appropriate structure, and captures the dynamic characteristics of wind speed data, thus achieving good wind speed prediction. However, the prediction methods in literature [11], [12], [13], and [14] suffer from poor stability.

Due to the strongly nonlinear characteristics of wind speed data, a single prediction method is coarse: it struggles to capture the intrinsic laws of the data, and its prediction error is large. Methods that decompose the data signal, such as wavelet decomposition and empirical mode decomposition, and then build a prediction model for each component separately have gradually replaced the single prediction method.
In literature [15] and [16], empirical mode decomposition (EMD) was used to decompose the data series before predicting wind speed, but the mode aliasing problem inherent in EMD could not be avoided. Literature [11] and [17] introduced improved EMD methods, including ensemble empirical mode decomposition and complete ensemble empirical mode decomposition, but the mode aliasing problem of EMD was not fundamentally solved. In literature [18] and [19], variational mode decomposition (VMD) is used to decompose the data sequence, which effectively avoids mode aliasing. However, this method is not adaptive, and parameters such as the decomposition number and penalty factor need to be determined in advance.

In addition, some studies use swarm intelligence methods to optimize the parameters of prediction models, such as VMD parameters and DBN parameters. These are essentially constrained mathematical programming problems, and the accuracy of the solution depends mainly on the optimization performance of the intelligent algorithm, so the selection and improvement of the solution method is very important. The sparrow search algorithm (SSA) [20] is a new intelligent optimization algorithm proposed in 2020. Compared with traditional intelligent optimization algorithms such as particle swarm optimization and the gravitational search algorithm, it has advantages in search accuracy, convergence speed and stability. Li Yali [21] made a detailed comparative study of the swarm intelligence optimization algorithms that have emerged in recent years and concluded that the sparrow search algorithm far outperforms five other optimization algorithms, including the bat algorithm, grey wolf optimization and the whale optimization algorithm, in convergence accuracy and stability.
However, as an algorithm

1) An improved sparrow optimization algorithm based on reverse learning and cloud model theory is proposed to enhance the optimization ability of the algorithm.

2) The energy difference tracking method is introduced, and an improved SSA algorithm is proposed to optimize the decomposition number and penalty factor of VMD.

The remainder of this article is organized as follows. Section 2 analyzes the improved sparrow intelligence algorithm and variational mode decomposition theory.

Section 3 introduces deep belief networks and the imple-

In the formula, $\{u_k\}$ are the modal components and $\{\omega_k\}$ are the center frequencies of the components.
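For orientation, the constrained variational problem of standard VMD is commonly written in the form below. This is the textbook formulation from the VMD literature, not a transcription of this paper's equation; the symbols $f(t)$ (input signal), $\delta(t)$ (Dirac distribution), $*$ (convolution) and $K$ (number of modes) are notation assumed here.

```latex
\min_{\{u_k\},\{\omega_k\}}
\left\{ \sum_{k=1}^{K}
\left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
e^{-j\omega_k t} \right\|_2^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)
```

Each term measures the bandwidth of one mode after it has been shifted to baseband around its center frequency $\omega_k$, and the constraint requires the modes to sum to the original signal.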

Solve the above equation with the augmented Lagrange function to obtain Equation (2):

Applying the alternating direction method of multipliers (ADMM) to the formula above yields:

In the formula, $\hat{u}_k^{n+1}(\omega)$ and $\omega_k^{n+1}$ are the Wiener filtering result and the center frequency of each component, respectively.

According to the foraging rules, the finder guides the population's search and foraging through location updating. Some participants choose to follow the finders to get food, while others constantly monitor the finders and compete for food to increase their own predation rate. When the sparrow population becomes aware of danger, sparrows in different positions choose the corresponding escape strategy. The above is a brief introduction to the SSA algorithm; the details can be found in literature [20] and [22].

The location of the finder is updated as follows:

where $MaxCycle$ is the maximum number of iterations of the algorithm; $\alpha$ is a uniform random number in the interval $(0, 1]$; $Q$ is a standard normal random number; $L$ is a $1 \times d$ matrix whose elements are all 1; and $R_2$ and $ST$ are the set warning value and safety value, respectively.
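The finder update can be sketched in Python as follows. The sketch assumes the rule usually given for the sparrow search algorithm in [20]: a finder shrinks its position by $\exp(-i/(\alpha \cdot MaxCycle))$ while $R_2 < ST$ and otherwise takes a normally distributed step $Q \cdot L$. The function name `update_finders`, the list-of-lists layout and drawing $R_2$ once per call are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def update_finders(positions, max_cycle, safety_threshold, rng):
    """Return new finder positions; `positions` is a list of d-dimensional lists.

    Assumed rule (the usual SSA formulation, see [20]):
      R2 <  ST: x_ij <- x_ij * exp(-i / (alpha * MaxCycle))
      R2 >= ST: x_ij <- x_ij + Q * L_j, where every L_j equals 1
    """
    r2 = rng.random()  # warning value R2; drawing it once per call is an assumption
    updated = []
    for i, position in enumerate(positions, start=1):
        if r2 < safety_threshold:
            alpha = 1.0 - rng.random()  # uniform in (0, 1], never zero
            factor = math.exp(-i / (alpha * max_cycle))
            updated.append([x * factor for x in position])
        else:
            q = rng.gauss(0.0, 1.0)  # Q ~ N(0, 1), shared by all dimensions
            updated.append([x + q for x in position])
    return updated

rng = random.Random(0)
finders = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(4)]
moved = update_finders(finders, max_cycle=100, safety_threshold=0.8, rng=rng)
```

In the safe branch every coordinate is scaled by a factor in $(0, 1)$, so finders contract toward the origin; in the alarm branch all coordinates of a finder move by the same random amount.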

The location of the subscriber is updated as follows:

where $x_{pj}^{t}$ is the optimal position of the discoverer at iteration $t$; $x_{wj}^{t}$ is the global worst position at iteration $t$; $NP$ is the population size; $A$ is a $1 \times d$ matrix whose elements are randomly assigned 1 or −1, and

The location of the scouter is updated as follows:

where $x_{bj}^{t}$ is the global optimal position at iteration $t$;

initialization, and its mathematical model is as follows:

where, when $\phi \in (0, 1)$ and $x \in [0, 1]$, the system (8) is in a chaotic state.
corresponding to $x$ is defined as:

In the formula, $a_j$ and $b_j$ are the upper and lower bounds

entropy $E_n$ and hyper entropy $H_e$. The cloud model is characterized by stability within uncertainty and variation within stability. The optimal solution of the SSA algorithm can be taken as the center of the cloud model, around which neighboring points are searched and compared to find a better solution. The normal cloud model is an important model in cloud theory; it reflects the random probability distributions found in nature and is widely applicable. Let $C$ be a qualitative concept on the quantitative domain $U$. If the quantitative value $x$ is a random realization of the qualitative concept in $U$ and satisfies $x \sim N(Ex, En'^2)$ with $En' \sim N(En, He^2)$, then the certainty of $x$ with respect to $C$ can be expressed as:

In the formula, $\mu(x)$ is a random number in $(0, 1)$.
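The two ingredients described here, the reverse (opposition-based) point and the normal cloud generator, can be sketched as follows. The sketch assumes the basic opposition rule $x'_j = a_j + b_j - x_j$ and the standard forward normal cloud generator with certainty $\mu = \exp\left(-(x - Ex)^2 / (2 En'^2)\right)$. The paper's own equations may differ in detail, and the function names are hypothetical.

```python
import math
import random

def opposite_point(x, lower, upper):
    """Opposition-based ("reverse") point of x: lower_j + upper_j - x_j."""
    return [lo + up - xj for xj, lo, up in zip(x, lower, upper)]

def normal_cloud_drops(ex, en, he, n_drops, rng):
    """Forward normal cloud generator returning (x, mu) pairs.

    Each drop uses En' ~ N(En, He^2), x ~ N(Ex, En'^2) and
    mu = exp(-(x - Ex)^2 / (2 * En'^2)).
    """
    drops = []
    for _ in range(n_drops):
        en_prime = abs(rng.gauss(en, he)) or 1e-12  # keep the spread strictly positive
        x = rng.gauss(ex, en_prime)
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops

rng = random.Random(42)
best = [1.5, -0.2]  # stand-in for the current best sparrow position
reverse = opposite_point(best, lower=[-10.0, -10.0], upper=[10.0, 10.0])
cloud = normal_cloud_drops(ex=best[0], en=0.3, he=0.03, n_drops=50, rng=rng)
```

Here `cloud` holds candidate values scattered around the first coordinate of `best`; a larger `en` widens the neighborhood that is searched, and a larger `he` makes the width itself more variable.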
Combined with the previous sections, the steps of the ISSA algorithm proposed in this paper can be summarized as follows:

Step 1: Initialize the algorithm parameters N, Maxiter, ST and the proportions of discoverers, joiners and scouts in the sparrow population.

Step 2: The initial population is generated by using Equation (8).

Step 3: The population is updated by the sparrow algorithm, the reverse population is generated by using Equation (9), and the optimal individual is calculated.

Step 4: According to Section C, the position of the optimal solution is improved by using the normal cloud generator, and the optimal solution at this time is compared and determined.

Step 5: If t < MaxCycle, then t = t + 1 and return to Step 3; otherwise the algorithm ends.

When VMD is used for signal decomposition, parameters such as the modal decomposition number, penalty factor, fidelity coefficient and convergence condition need to be preset. Studies show that the decomposition accuracy of VMD mainly depends on the decomposition number K and the penalty factor α. If K is set too small, information will be lost; if K is set too large, over-decomposition will result. The penalty parameter α affects the bandwidth of each modal component, and different bandwidth scales affect the signal extraction results. Due to the complexity and variability of the actual signals to be decomposed, it is difficult to set K and α manually, and doing so easily leads to randomness in the decomposition results [24]. Therefore, this paper proposes to optimize the VMD parameters using the ISSA algorithm.

The fitness function used by ISSA to optimize the VMD parameters is based on the energy difference tracking method proposed in literature [25]. The basic idea is to decompose the signal $f(t)$ into $K$ finite Bandwidth Intrinsic Mode Functions (BIMF) $u_i$ according to the VMD method, as shown in

If the BIMFs satisfy orthogonality, then the energy of the

If the actual decomposition components of the signal are not all orthogonal, there is an energy error $E_{err}$ between $E_{f1}$ and $E_{BIMF}$.

The smaller $E_{err}$ is, the better the orthogonality of the decomposed BIMF components, and the better the decomposition result characterizes the signal $f(t)$.
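The energy argument can be illustrated with a small sketch. It assumes discrete energies (sums of squared samples) and takes the error as the absolute difference between the signal energy and the summed mode energies; the exact definition in [25] and in the paper's formula (15) may differ. The mode signals are synthetic sinusoids, not VMD output.

```python
import math

def energy(signal):
    """Discrete signal energy: sum of squared samples."""
    return sum(s * s for s in signal)

def energy_error(signal, modes):
    """Assumed fitness: |E_f - sum_i E(u_i)|; near zero for orthogonal modes that sum to the signal."""
    return abs(energy(signal) - sum(energy(u) for u in modes))

n = 256
t = [k / n for k in range(n)]
u1 = [math.sin(2 * math.pi * 3 * tk) for tk in t]
u2 = [0.5 * math.sin(2 * math.pi * 7 * tk) for tk in t]
f = [a + b for a, b in zip(u1, u2)]

# Two sinusoids at different integer frequencies are orthogonal over a full period.
orthogonal_error = energy_error(f, [u1, u2])

# Splitting u1 into two identical halves mimics over-decomposition: the parts still
# sum to f but are no longer orthogonal, so the energies no longer add up.
half = [0.5 * a for a in u1]
split_error = energy_error(f, [half, half, u2])
```

In this toy case `orthogonal_error` is at the level of floating-point rounding, while `split_error` is large, which is the behavior a parameter search over [K, α] would exploit.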

The solving steps of the optimal parameter combination [K, α] of the VMD algorithm are as follows:

Step 1: Set the parameters of the ISSA algorithm and the initial population, and take the energy error $E_{err}$ as the fitness function.

Step 2: VMD decomposition is performed on the signal, and the fitness value of each sparrow is obtained by formula (15).

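Step 3: According to the optimization mechanism of the sparrow algorithm, the positions of individual sparrows are updated, the fitness value corresponding to each position is compared, and the minimum fitness value is constantly updated.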

Step 4: Repeat Step 2 and Step 3 until the global minimum fitness value is determined or the maximum number of iterations is reached, and output the optimal sparrow individual [K, α].

Step 5: VMD decomposition of the signal is carried out by using the optimal parameters [K, α].

In order to verify the performance of the SSA algorithm, it is compared with the common grey wolf optimization algorithm (GWO), particle swarm optimization algorithm (PSO), and moth flame optimization algorithm (MFO). Different unimodal and multimodal benchmark test functions are selected, as shown in Table 1. The parameter settings of each test algorithm are shown in Table 2. The population size is set to 30, the number of iterations is set to 500, and the experimental results are obtained from 30 independent runs of each method.

As can be seen from Table 3, ISSA

In the formula, $v_i$ and $h_j$ represent the states of the visible layer node and the hidden layer node, respectively; $a_i$ and $b_j$ represent the biases of the visible layer node and the hidden layer node, respectively; and $w_{ij}$ represents the connection weight between the visible and hidden layers.

According to the above formula, the joint probability density of the visible layer and the hidden layer can be obtained:

In the formula, $Z(\theta) = \sum_{v,h} e^{-E(v,h|\theta)}$ is the normalization factor. In unsupervised learning, the purpose of training is to obtain the parameters $\theta$. For a training set containing $N$ samples, the maximum likelihood function can be used:

The DBN algorithm greedily pre-trains the RBMs layer by layer and then uses the supervised back propagation algorithm to fine-tune the initial weights obtained by pre-training, so that the model can reach the optimal solution and thus characterize the complex nonlinear relationships in the wind speed data.
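The energy-based joint probability described above can be checked on a toy RBM small enough to enumerate. The sketch assumes the standard binary RBM energy $E(v,h|\theta) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j$, which matches the symbol definitions in the text. The parameter values and function names are made up for illustration.

```python
import itertools
import math

def rbm_energy(v, h, a, b, w):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i w_ij h_j."""
    visible = sum(ai * vi for ai, vi in zip(a, v))
    hidden = sum(bj * hj for bj, hj in zip(b, h))
    coupling = sum(v[i] * w[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    return -(visible + hidden + coupling)

def joint_distribution(a, b, w):
    """Enumerate all binary states and return {(v, h): P(v, h)} with P = exp(-E) / Z."""
    states = [
        (v, h)
        for v in itertools.product((0, 1), repeat=len(a))
        for h in itertools.product((0, 1), repeat=len(b))
    ]
    weights = {s: math.exp(-rbm_energy(s[0], s[1], a, b, w)) for s in states}
    z = sum(weights.values())  # normalization factor Z(theta)
    return {s: wgt / z for s, wgt in weights.items()}

a = [0.1, -0.2]          # visible biases (illustrative values)
b = [0.3, 0.0, -0.1]     # hidden biases (illustrative values)
w = [[0.5, -0.3, 0.2],   # w[i][j]: weight between visible i and hidden j
     [0.1, 0.4, -0.6]]
p = joint_distribution(a, b, w)
```

Exact enumeration costs $2^{n_v + n_h}$ evaluations, which is why practical RBM training approximates the likelihood gradient instead of computing $Z(\theta)$ directly.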

In the formula, $N$ is the number of samples, and $y_i$ and $Y_i$ are the predicted value and true value of the $i$-th sample, respectively. See Figure 6 for the flow chart of ISSA optimizing the DBN; the specific steps are as follows:

Step 1: Set the parameters of the ISSA algorithm and the initial population, and encode the individuals in the population. Each sparrow is a three-dimensional vector $X(m_1, m_2, \eta)$; the population size is 20, the maximum number of iterations is 100, and the threshold parameter $\varepsilon$ is 0.001.

Step 2: The original wind speed data is decomposed by OVMD, the generated component data is used as a test set, and the fitness value of each sparrow is obtained by formula (23).

Step 3: According to the sparrow algorithm optimization mechanism, the positions of individual sparrows are updated, the fitness function values corresponding to each position are compared, and the minimum fitness value is constantly updated.

Step 4: When the fitness function value is less than the threshold value $\varepsilon$ or the maximum number of iterations is reached, the loop ends, and the global minimum fitness value is determined to complete the optimization of the DBN parameters.
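The encoding and stopping rule in Steps 1 to 4 can be sketched as a loop. Two stand-ins are used and should not be mistaken for the paper's method: a simple random perturbation around the current best replaces the ISSA position updates, and a toy quadratic `surrogate_error` replaces the DBN error of formula (23). The search bounds are also assumed.

```python
import random

def surrogate_error(m1, m2, eta):
    """Toy stand-in for the DBN prediction error; NOT a trained network."""
    return ((m1 - 32) / 64) ** 2 + ((m2 - 16) / 64) ** 2 + (eta - 0.05) ** 2

def clip(individual, bounds):
    """Keep (m1, m2, eta) inside its bounds and the two layer sizes integer."""
    m1, m2, eta = (min(max(v, lo), hi) for v, (lo, hi) in zip(individual, bounds))
    return (int(round(m1)), int(round(m2)), eta)

def search(pop_size=20, max_iter=100, eps=1e-3, seed=0):
    rng = random.Random(seed)
    bounds = [(1, 64), (1, 64), (0.001, 0.5)]  # assumed ranges for m1, m2, eta
    population = [clip([rng.uniform(lo, hi) for lo, hi in bounds], bounds)
                  for _ in range(pop_size)]
    best = min(population, key=lambda ind: surrogate_error(*ind))
    for iteration in range(1, max_iter + 1):
        if surrogate_error(*best) < eps:  # Step 4: threshold reached, stop early
            break
        # Stand-in for the ISSA update: sample new individuals around the best one
        population = [
            clip([b + rng.gauss(0.0, 0.1 * (hi - lo)) for b, (lo, hi) in zip(best, bounds)],
                 bounds)
            for _ in range(pop_size)
        ]
        candidate = min(population, key=lambda ind: surrogate_error(*ind))
        if surrogate_error(*candidate) < surrogate_error(*best):
            best = candidate  # Step 3: keep the minimum fitness found so far
    return best, surrogate_error(*best), iteration

best, err, iterations = search()
```

In a real run each fitness evaluation would train a DBN with `m1` and `m2` hidden units and learning rate `eta`, so the cost is dominated by the fitness function rather than by the search loop.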

The OVMD-ODBN model proposed in this paper is shown in Figure 7, and the specific steps are as follows:

Step 1: Preprocess the wind speed data: identify outliers and missing values and fill them using cubic spline interpolation.

Step 2: Decompose the original wind speed sequence with OVMD, and construct the training and test data sets of the DBN network from the resulting IMFs.

Step 3: Initialize parameters such as the number of hidden layers and training epochs of the DBN network and the population size and number of iterations of ISSA. The ISSA algorithm is used to determine the number of neurons in each hidden layer and the learning rate of the DBN network.

Step 4: Perform pre-training and back-propagation fine-tuning on the determined DBN network structure, and build a DBN model for each IMF component.

Step 5: Starting from the first prediction moment, make multi-step rolling predictions and superimpose the component predictions to obtain the final wind speed value.

Step 6: Root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE) and coefficient of determination ($R^2$) are selected to evaluate the performance of the prediction model.

data is 15 min. Taking data points 2588-2976 in January as the sample population, the input and output data sets are built by using the data of the previous five moments to predict the next moment. Among them, data points 2588-2880 are used as training data, and data points 2881-2976 are used as test samples; that is, the 96 test points are the wind speed data of January 31. See Figure 8 for wind power data. When the sample sequence is decomposed by the VMD method, the penalty parameter $\alpha$ and the decomposition number of VMD are optimized by the method described in Section II, and default values are adopted for the other parameters. The decomposition results are shown in Figure 9.
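The five-lag input construction and the four evaluation metrics named in Step 6 can be sketched as follows. The metrics use their usual textbook definitions, which are assumed to match the paper's. The series is synthetic, and the persistence forecast is only a placeholder for a trained model.

```python
import math

def make_windows(series, n_lags=5):
    """Pair each run of `n_lags` consecutive values with the value that follows it."""
    inputs = [series[i:i + n_lags] for i in range(len(series) - n_lags)]
    targets = [series[i + n_lags] for i in range(len(series) - n_lags)]
    return inputs, targets

def evaluate(y_true, y_pred):
    """Return RMSE, MAPE (in percent), MAE and R^2; MAPE needs non-zero true values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100.0 / n * sum(abs(e / t) for e, t in zip(errors, y_true)),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Synthetic stand-in for a wind speed series in m/s; not the paper's data
speeds = [6.0 + 2.0 * math.sin(0.1 * k) for k in range(120)]
x, y = make_windows(speeds, n_lags=5)
persistence = [window[-1] for window in x]  # naive baseline: repeat the last value
scores = evaluate(y, persistence)
```

With the paper's split, the windows whose targets fall on points 2881-2976 would form the test set, and the same `evaluate` call would be applied to the superimposed component predictions.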

As can be seen from Figure 9, the values of IMF1 are the largest, but its frequency is low. The frequency of the other three components increases gradually, while their values decrease gradually. In the prediction of wind speed data, the IMF1

Figure 10 and 11, which obtained by OVMD four components, See Figure 9. Then, the corresponding ODBN model is established to predict the four components, and the prediction results are shown in Figure 10. The final prediction results are obtained by superimposing the predicted values of each component, as shown in Figure 11. As can be seen from Table 4, compared with the LSTM, ELM and BP methods, the RMSE of DBN decreased by 0.0284 m/s, 0.1723 m/s and 0.2093 m/s, and its MAPE decreased by 1.193%, 4.2672% and 5.5939%, respectively. The results show that DBN predicts better than LSTM, ELM and BP. Among the latter three, the BP model performs worst and the LSTM model performs best, although LSTM has the slowest prediction speed. ELM, LSTM and BP neural networks are also not as stable as DBN. Compared with the ODBN method, the RMSE and MAPE of the proposed OVMD-ODBN method decreased by 0.3731 m/s and 8.7223%, respectively. Compared with the ODBN method, the RMSE and MAPE of EMD-ODBN decreased by 0.2016 m/s and 7.4064%, respectively. Compared with the LSTM method, the RMSE and MAPE of OVMD-LSTM decreased by 0.3229 m/s and 5.9793%, respectively. This indicates that the combined prediction model, by preprocessing and refining the signal before prediction, can accurately characterize the internal characteristics of each part of the signal and then predict each part separately. Therefore, its prediction effect is better than that of the single coarse prediction method. It can be seen from Table 4

is also relatively small, and the $R^2$ index is close to 1. The overall prediction effect is good. Due to the large wind speed mutation on May 6, the prediction effect is not as good as that of the previous two days, but the error is within 1 m/s.

Therefore, the prediction models proposed in this paper can meet the requirements of accurate prediction.

In order to improve the prediction accuracy of wind speed, the OVMD-ODBN prediction model is proposed. Through experimental analysis, the following conclusions are drawn:

(1) The prediction accuracy and stability of the DBN method are better than those of the LSTM, ELM and BP methods.

(2) Optimizing the decomposition number K and penalty factor α of the VMD method with the ISSA algorithm improves the signal adaptability of VMD, and optimizing the number of hidden layer units and the learning rate of the DBN prediction model with the ISSA algorithm improves the performance of the DBN prediction model.

(3) The combined prediction models OVMD-ODBN, OVMD-DBN and EMD-ODBN are better than the single DBN and ODBN methods.

Over the whole prediction process, the VMD component IMF1 accounts for the largest proportion, but its prediction error is somewhat large, so the prediction accuracy of the IMF1 component needs to be further improved. In addition, empirical and default values are used for the VMD parameters in the experiment. In the next step, we will continue to study the comprehensive prediction effect of different methods according to the characteristics of different decomposed data.