Short Term Load Forecasting Based on SBiGRU and CEEMDAN-SBiGRU Combined Model

With the continuous development of the global science and technology industry, the demand for electric power is increasing, making short-term power load forecasting particularly important. A large number of load forecasting models have been applied to short-term load forecasting, but most of them ignore the error accumulated during iterative training. To solve this problem, this paper proposes a combined forecasting model that integrates a stacked bidirectional gated recurrent unit (SBiGRU), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and error correction. In the first stage, an SBiGRU model is established to learn the time series characteristics of the load series under the influence of temperature and holiday type; the error series generated during the prediction of the SBiGRU model reflects the error characteristics of the load series. In the second stage, the error series is decomposed by the CEEMDAN algorithm into several intrinsic mode function (IMF) components and a trend component, an SBiGRU model is established again to learn and predict each component, and the predicted values of all components are reconstructed to obtain the error prediction. Finally, the two-stage prediction results are summed to correct the error. The accuracy of the SBiGRU-CEEMDAN-SBiGRU combined model is evaluated on two public power load data sets. The experimental results show that the SBiGRU-CEEMDAN-SBiGRU combined model has better accuracy and stability than traditional models.


I. INTRODUCTION
Short-term load forecasting estimates the electricity demand over the next 24 hours, days, or weeks, taking into account the influence of meteorological factors and working-day types. Accurate forecasting is of great significance for power system dispatching and the stable operation of the power grid [1].
Common load forecasting algorithms include the extreme learning machine [2], support vector machine [3], Kalman filter [4] and random forest [5]. Thanks to their simple internal structure, these methods converge rapidly, but they cannot fit the time series relationship between input and output data, so their prediction accuracy is often not ideal [6]. In recent years, with the development of deep learning, algorithms such as the convolutional neural network [7]-[9], recurrent neural network (RNN) [10] and deep belief network [11] have gradually been applied to power load forecasting. Among them, RNN is widely applied in speech recognition and natural language processing. It does not process all the data at once, but processes the data recurrently. The advantage of the recurrent connection is that it can learn and record the historical information in the sequence, so that it can fit the time series characteristics of the data well in prediction [12]. Based on the time series characteristics of load data, RNN has been introduced into load forecasting by many scholars [13]-[17]. In the literature [18], K-means was first used to group the data set into clusters, the EEMD algorithm was then used to decompose the load series into relatively stable components, and a BiRNN model was established for each component, with the network parameters initialized by a deep belief network. The experimental results proved that this prediction strategy can effectively improve prediction accuracy. However, RNN involves cumulative multiplication of Jacobian matrices over long time distances, which is prone to gradient explosion and gradient vanishing [19].
Long short-term memory (LSTM) improves on the RNN network structure: the information output by the network is jointly controlled by the input gate, forget gate and output gate, thus avoiding the gradient explosion and gradient vanishing of the RNN network. In the literature [20], LSTM and other similar algorithms were tested on a publicly available real residential smart-meter data set, and LSTM outperformed the other algorithms in short-term household load forecasting.
The gated recurrent unit (GRU) is a modified LSTM that merges the forget gate and the input gate into a single update gate. Compared with LSTM, it has fewer network parameters and is less prone to overfitting [21]-[22]. In recent years, many scholars have applied GRU to load forecasting and achieved good prediction results on their respective experimental data sets [23]-[24]. In the literature [25], a combined model of a convolutional neural network and GRU was used to predict wind speed and residential load, and the experimental results show that its prediction effect is better than that of the comparison models. However, the standard GRU often ignores the context information in the load series and cannot effectively capture its temporal patterns. To solve this problem, the literature [26] used bi-directional RNN models based on LSTM and GRU to predict short-term power load and verified them on two data sets. The experimental results show that the method has advantages in prediction accuracy and confirm that an RNN with a bi-directional structure can better capture the temporal characteristics of the load using both historical and future load information.
In short-term power load forecasting, accurate feature selection is particularly important for forecasting accuracy. Empirical mode decomposition (EMD) can decompose complex signals into relatively stable modal components, which reduces the mutual interference between different components. However, EMD is prone to mode aliasing, which negatively affects prediction accuracy [27]-[29]. Ensemble empirical mode decomposition (EEMD) [30]-[32] introduces normally distributed white noise into the original signal, which significantly alleviates the mode mixing problem of EMD. In the literature [33], a hybrid algorithm integrating ensemble empirical mode decomposition (EEMD) and least squares support vector regression (LSSVR) was proposed, and on the monthly data set of China's nuclear energy consumption from March 1993 to January 2010 it achieved better prediction accuracy than other commonly used models. However, the above algorithms ignore the error information produced during iterative training; especially when the data set has few features, their prediction accuracy decreases significantly. Therefore, combining the advantages of the above algorithms, this paper proposes a two-stage forecasting model with error correction. Firstly, SBiGRU is used to learn the main characteristics of the load data, and the error information is reflected in the error sequence predicted by the SBiGRU model; secondly, a CEEMDAN-SBiGRU model is used to fit the error series; finally, the SBiGRU model and the CEEMDAN-SBiGRU model are combined to obtain the two-stage prediction model of this paper. The main contributions of this paper are as follows: (1) A 2+1 layer SBiGRU is proposed.
This multi-layer bidirectional GRU not only provides past and future load information for the forecast at each time point, but also improves the generalization ability of the model through its multi-layer network design.
(2) To solve the problem that traditional algorithms cannot correct the errors in their prediction results, CEEMDAN-SBiGRU is introduced as the error correction model, and a new two-stage prediction algorithm combining it with the SBiGRU model is proposed.
(3) CEEMDAN is introduced as the decomposition algorithm for the error sequence. During the EEMD decomposition process, CEEMDAN adds adaptive white noise and computes a unique residual signal to obtain each IMF component. It can reconstruct the error sequence with a small number of trials, thereby improving the efficiency of error-sequence decomposition.

A. GATED RECURRENT UNIT (GRU)
GRU borrows from the network structure of LSTM in its design. It merges the forget gate and input gate of LSTM into a single update gate, and merges the cell state with the hidden state. Compared with LSTM, GRU has fewer model parameters and significantly higher computational efficiency due to having one fewer gating unit. The internal details are shown in Fig.1. At time t, GRU computes the reset gate r_t and the update gate z_t from the input x_t and the hidden state h_{t-1} of time t-1, where σ is the Sigmoid activation function. After the reset gate is applied, h_{t-1} is combined with x_t and compressed into the range [-1,1] by the tanh activation; * represents element-wise multiplication of matrices. The specific calculation process is shown in formulas (1)~(4):

z_t = σ(W_z x_t + U_z h_{t-1})  (1)
r_t = σ(W_r x_t + U_r h_{t-1})  (2)
h̃_t = tanh(W_h x_t + U_h (r_t * h_{t-1}))  (3)
h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t  (4)
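As a minimal sketch of the GRU computation described above, a single time step can be written directly in NumPy; the weight matrices here are illustrative random stand-ins, not trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step; weight matrices are illustrative stand-ins."""
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)             # update gate
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)             # reset gate
    h_cand = np.tanh(Wh @ x_t + Uh @ (r_t * h_prev))  # candidate state in [-1, 1]
    return (1.0 - z_t) * h_prev + z_t * h_cand        # new hidden state

rng = np.random.default_rng(0)
d_in, d_h = 3, 8                                      # 3 input features, hidden size 8
W = [rng.standard_normal((d_h, d_in)) * 0.1 for _ in range(3)]
U = [rng.standard_normal((d_h, d_h)) * 0.1 for _ in range(3)]
h = np.zeros(d_h)
for x_t in rng.standard_normal((48, d_in)):           # one day of 48 half-hour inputs
    h = gru_step(x_t, h, W[0], U[0], W[1], U[1], W[2], U[2])
```

Because h_t is a convex combination (weighted by z_t) of the previous state and the tanh-bounded candidate, the hidden state stays within [-1, 1].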

B. STACKED BI-DIRECTIONAL GATED RECURRENT UNIT
The bi-directional gated recurrent unit (BiGRU) is composed of a forward GRU network and a backward GRU network; its structure is shown in Fig.2. The BiGRU neural network forecasts based on the whole time series: it divides the hidden layer into two opposite parts, forward and backward, which read information from past and future times respectively. The forward GRU processes the time series in its original order, the backward GRU processes the same series in reverse order, and the hidden state of BiGRU at time t is obtained by weighting the forward and backward hidden states:

h⃗_t = GRU(x_t, h⃗_{t-1})  (5)
h⃖_t = GRU(x_t, h⃖_{t+1})  (6)
h_t = W_t h⃗_t + V_t h⃖_t + b_t  (7)

where h⃗_t represents the forward hidden state at time t, h⃖_t represents the backward hidden state at time t, W_t and V_t represent the weights corresponding to h⃗_t and h⃖_t, and b_t represents the bias of the hidden layer at time t. The SBiGRU is composed of multiple stacked BiGRU layers; the specific stacking method is shown in Fig.3. The input of the n-th BiGRU layer at time step t is x_t^n, and its output x_t^{n+1} is the input of the (n+1)-th layer. This multi-layer stacked structure gives the model better generalization ability.
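Assuming a PyTorch implementation (the paper's experiments use PyTorch 1.10), a 2+1 layer SBiGRU, two stacked bidirectional GRU layers followed by one fully connected layer, might be sketched as follows; the hidden size and input dimensions are illustrative, not the paper's exact settings:

```python
import torch
import torch.nn as nn

class SBiGRU(nn.Module):
    """Sketch of a 2+1 layer SBiGRU: 2 stacked bidirectional GRU layers + 1 FC layer."""
    def __init__(self, n_features=3, hidden=64, num_layers=2):
        super().__init__()
        # num_layers=2 stacks two BiGRU layers; layer n's output feeds layer n+1
        self.gru = nn.GRU(n_features, hidden, num_layers=num_layers,
                          batch_first=True, bidirectional=True)
        # forward and backward hidden states are concatenated, hence 2 * hidden
        self.fc = nn.Linear(2 * hidden, 1)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.gru(x)              # out: (batch, seq_len, 2 * hidden)
        return self.fc(out).squeeze(-1)   # one load value per time step

model = SBiGRU()
y = model(torch.randn(8, 48, 3))          # 8 windows of 48 half-hour points
```

The fully connected layer maps each time step's concatenated forward/backward state to a single load value, so the output has one prediction per input time point.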

C. COMPLETE ENSEMBLE EMPIRICAL MODE DECOMPOSITION WITH ADAPTIVE NOISE
During the EEMD decomposition process, CEEMDAN adds adaptive white noise and calculates a unique residual signal to obtain each IMF component. It can reconstruct the signal sequence in a small number of trials and eliminates, from the residual signal, the error that the added white noise leaves behind in EEMD decomposition.
Define the operator E_i(·) as the i-th IMF component generated by EMD decomposition. The calculation steps of the CEEMDAN algorithm are as follows:

Step 1: In each of the K trials, decompose the noised signal x(t) + δ_0 ω_j(t), where δ_0 is the standard deviation of the Gaussian white noise and ω_j(t) is the Gaussian white noise of the j-th trial. The first IMF component is obtained through EMD decomposition, together with the unique residual signal r_1(t):

IMF_1(t) = (1/K) Σ_{j=1}^{K} E_1(x(t) + δ_0 ω_j(t))  (8)
r_1(t) = x(t) - IMF_1(t)  (9)

Step 2: Continue to obtain the second IMF component:

IMF_2(t) = (1/K) Σ_{j=1}^{K} E_1(r_1(t) + δ_1 E_1(ω_j(t)))  (10)

Step 3: Repeat the above steps to calculate the n-th residual signal:

r_n(t) = r_{n-1}(t) - IMF_n(t)  (11)

Then the (n+1)-th IMF component is:

IMF_{n+1}(t) = (1/K) Σ_{j=1}^{K} E_1(r_n(t) + δ_n E_n(ω_j(t)))  (12)

Step 4: Repeat Step 3 until the residual signal is monotonic and the decomposition stops. The original signal x(t) is decomposed into:

x(t) = Σ_{i=1}^{N} IMF_i(t) + r(t)  (13)

In formula (13), N is the number of final modal components, and r(t) is the final monotonic residual signal.
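The looping structure of Steps 1-4 can be sketched as follows. Note that `first_imf` below is a crude moving-average stand-in for the EMD operator E_1(·): a real implementation would use EMD sifting (for example via the PyEMD package), so this sketch only illustrates the noise-averaging and residual bookkeeping, not a faithful decomposition:

```python
import numpy as np

def first_imf(x):
    """Stand-in for E_1(.): a simple high-pass via moving-average detrending.
    A real CEEMDAN uses EMD sifting here."""
    k = 5
    trend = np.convolve(x, np.ones(k) / k, mode="same")
    return x - trend

def ceemdan_sketch(x, n_imfs=3, n_trials=20, delta0=0.05, seed=0):
    rng = np.random.default_rng(seed)
    residual = x.copy()
    imfs = []
    for _ in range(n_imfs):
        # Average the first IMF of (residual + scaled white noise) over all trials
        trials = [first_imf(residual + delta0 * rng.standard_normal(len(x)))
                  for _ in range(n_trials)]
        imf = np.mean(trials, axis=0)
        imfs.append(imf)
        residual = residual - imf        # unique residual signal r_n(t)
    return np.array(imfs), residual

x = np.sin(np.linspace(0, 8 * np.pi, 200)) \
    + 0.3 * np.random.default_rng(1).standard_normal(200)
imfs, r = ceemdan_sketch(x)
```

Because each residual is obtained by subtracting the averaged IMF (the noise only perturbs the trials, not the bookkeeping), the reconstruction x(t) = Σ IMF_i(t) + r(t) of formula (13) holds exactly by construction.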

D. CALCULATION PROCESS OF SBiGRU-CEEMDAN-SBiGRU MODEL
Step 1: First-stage load forecasting. The training set is used for supervised training of the first-stage SBiGRU model. After the model converges, predictions are made on the training set and the test set to obtain the training-set prediction P_btrain and the test-set prediction P_b, and the error sequence P_e produced during the training of SBiGRU is obtained according to the following formula:

P_e = P_btrain - T_train  (14)

where T_train is the real value of the training set.
Step 2: Error prediction in the second stage. The CEEMDAN algorithm is used to decompose P_e into several IMF components and one residual component. An SBiGRU model is established to learn and predict each component, and the prediction results are accumulated to obtain the error prediction P_e'.
Step 3: Calculate the predicted value of the combined model. The prediction result P of the SBiGRU-CEEMDAN-SBiGRU model is the sum of the test-set prediction and the predicted error:

P = P_b + P_e'  (15)

The prediction process of the SBiGRU-CEEMDAN-SBiGRU combined model is shown in Fig.4.
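The three steps above condense into a few lines of NumPy; the arrays here are tiny hypothetical values standing in for real model outputs, just to show the bookkeeping of formulas (14) and (15):

```python
import numpy as np

# Hypothetical stand-ins for real model outputs
T_train  = np.array([100.0, 102.0, 98.0])   # true training-set load
P_btrain = np.array([101.0, 100.5, 99.0])   # stage-1 SBiGRU fit on the training set

P_e = P_btrain - T_train                    # formula (14): training error sequence

P_b     = np.array([97.0, 103.0])           # stage-1 prediction on the test set
P_e_hat = np.array([0.8, -1.2])             # stage-2 CEEMDAN-SBiGRU error forecast

P = P_b + P_e_hat                           # formula (15): corrected final prediction
```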

The proposed algorithm model is verified on two public data sets. The first is the power load forecasting competition data organized by EUNITE, which contains historical load, daily average temperature and holiday data; the load sampling period is 30 minutes, giving 17,520 samples. The second is the power load data collected in Alberta, Canada in 2016, with a sampling period of 1 h and 8,760 samples.
A. DATA SET PREPROCESSING
(1) Normalization. The max-min method is used to normalize all data to the range [0,1]; the normalized data are then fed into the model, and the model output is finally de-normalized. The calculation formula is as follows:

x_i' = (x_i - x_min) / (x_max - x_min)  (16)

where x_i' and x_i represent the normalized value and the original data value respectively, and x_max and x_min represent the maximum and minimum values.
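The normalization and its inverse can be sketched as follows; the small load array is an illustrative placeholder, not data from either data set:

```python
import numpy as np

def minmax_normalize(x):
    """x_i' = (x_i - x_min) / (x_max - x_min), mapping all values into [0, 1]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def minmax_denormalize(x_norm, x_min, x_max):
    """Inverse transform applied to the model output."""
    return x_norm * (x_max - x_min) + x_min

load = [4200.0, 4800.0, 5100.0, 4500.0]     # hypothetical load values
norm, lo_v, hi_v = minmax_normalize(load)
```

Keeping x_min and x_max from the training data allows the model output to be mapped back to physical load units.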

B. MODEL TRAINING METHOD
The training process of the model adopts the Adam optimization algorithm. Adam combines the advantages of RMSprop, which is good at dealing with non-stationary objectives, and Adagrad, which is good at dealing with sparse gradients. The calculation formulas are as follows:

m_t = μ m_{t-1} + (1 - μ) g_t  (17)
n_t = ν n_{t-1} + (1 - ν) g_t²  (18)
m̂_t = m_t / (1 - μ^t)  (19)
n̂_t = n_t / (1 - ν^t)  (20)
Δθ_t = -η m̂_t / (√n̂_t + ε)  (21)

where g_t is the gradient; ε is a smoothing term, mainly used to prevent the denominator from being zero; μ and ν are the momentum factors; m_t and n_t are the first- and second-order moment estimates of the gradient, which can be regarded as estimates of the expectations E[g_t] and E[g_t²]; m̂_t and n̂_t are the bias-corrected versions of m_t and n_t, so that the expectations can be estimated approximately without bias.
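A minimal sketch of one Adam update in NumPy (symbols match the description above; the toy objective f(θ) = θ² with gradient 2θ is only for illustration):

```python
import numpy as np

def adam_step(theta, g, m, n, t, eta=0.001, mu=0.9, v=0.999, eps=1e-8):
    """One Adam parameter update."""
    m = mu * m + (1 - mu) * g            # first-moment estimate m_t
    n = v * n + (1 - v) * g ** 2         # second-moment estimate n_t
    m_hat = m / (1 - mu ** t)            # bias-corrected first moment
    n_hat = n / (1 - v ** t)             # bias-corrected second moment
    theta = theta - eta * m_hat / (np.sqrt(n_hat) + eps)
    return theta, m, n

# Minimise f(theta) = theta^2 (gradient 2 * theta) for a few hundred steps
theta, m, n = 5.0, 0.0, 0.0
for t in range(1, 501):
    theta, m, n = adam_step(theta, 2 * theta, m, n, t)
```

Because m̂_t/√n̂_t is roughly ±1 when gradients keep the same sign, each step moves θ by about the learning rate η, so θ decreases steadily toward 0.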

C. MODEL EVALUATION INDEX
The mean absolute percentage error E_MAPE, root mean square error E_RMSE and maximum absolute percentage error E_Max are used as evaluation indexes of the model, calculated as follows:

E_MAPE = (1/n) Σ_{i=1}^{n} |(x_i - x̂_i) / x_i| × 100%  (22)
E_RMSE = √((1/n) Σ_{i=1}^{n} (x_i - x̂_i)²)  (23)
E_Max = max_i |(x_i - x̂_i) / x_i| × 100%  (24)

where x̂_i is the predicted value of the neural network at time i, and x_i is the actual value at time i.
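The three evaluation indexes are straightforward to implement; the toy arrays below are illustrative values, not results from the experiments:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return np.mean(np.abs((y_pred - y_true) / y_true)) * 100

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the load data."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def max_ape(y_true, y_pred):
    """Maximum absolute percentage error, in percent."""
    return np.max(np.abs((y_pred - y_true) / y_true)) * 100

y_true = np.array([100.0, 200.0, 400.0])   # hypothetical actual loads
y_pred = np.array([110.0, 190.0, 400.0])   # hypothetical predictions
```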

D. EXPERIMENT 1
Firstly, the data of the EUNITE electric load forecasting competition is used as the test sample. In the experiment, 92 days of data from August to October are used as the training set, 11 days of data from November 1 to November 11 as the test set, and 48 historical time points are used to predict the load data of the next 48 time points.
Load data is the result of periodicity, trend and randomness. The Seasonal-Trend decomposition procedure based on Loess (STL) can decompose the load data into these three components, so that the time series characteristics of the load data can be observed more intuitively; the decomposition results are shown in Fig.5. In Fig.6, the shaded area represents the 95% confidence interval, correlation coefficients outside the shaded area indicate significant correlation, the horizontal axis represents the lag order, and the vertical axis represents the correlation coefficient. It can be seen that the lag orders beyond the confidence area are 1, 2 and 3, which shows that the load values at historical times t-1, t-2 and t-3 have the highest correlation with the current time t. The 92-day training set is transformed into a three-dimensional matrix with the following dimensions:

{D1⋯D92}×{L1⋯L48}×{X1⋯X3}  (25)

where {D1⋯D92} indicates that the data set is 92 days long; {L1⋯L48} indicates that each day contains 48 time points; {X1⋯X3} represents the 3-dimensional features of the data set: the load data, average temperature and date type at the current time.
Similarly, the dimension information of the test set is: {D1⋯D11}×{L1⋯L48}×{X1⋯X3}  (26)
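The windowing into a three-dimensional matrix and the day-to-next-day supervised pairing can be sketched as follows; the random array is an illustrative stand-in for the pre-processed EUNITE data:

```python
import numpy as np

# Stand-in for the pre-processed training data:
# 92 days x 48 half-hour points x 3 features (load, average temperature, date type)
rng = np.random.default_rng(0)
train = rng.random((92, 48, 3))   # {D1..D92} x {L1..L48} x {X1..X3}

# Supervised pairs: day d's 48 points as input, day d+1's 48 load values as target
X = train[:-1]                    # inputs, shape (91, 48, 3)
Y = train[1:, :, 0]               # next-day load only, shape (91, 48)
```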

1) FIRST STAGE SBIGRU TRAINING AND PREDICTION
The experiment was completed with PyTorch 1.10 on an Ubuntu 18.04 system. The experimental PC had an Intel Core i5-8300H processor, 16 GB of memory and an NVIDIA GeForce GTX 1050 Ti graphics card. In order to design the structure of the SBiGRU model scientifically, the test set is used to analyze SBiGRU models with different numbers of layers. Taking single-layer SBiGRU, 2+1 layer SBiGRU, 3+1 layer SBiGRU and 4+1 layer SBiGRU as experimental samples, and following reference [6], the number of iterations is set to a large empirical value of 2000; the evaluation indexes are the mean absolute percentage error and the training time. The experimental results are shown in Tab.1. It can be seen from Tab.1 that when the network has a single-layer structure, the model has weak generalization ability and low prediction accuracy. When the number of layers is 2+1, the multi-layer network structure improves the generalization ability, and the model achieves the best prediction accuracy. However, when the model structure grows to 3+1 and 4+1 layers, the accuracy decreases as the number of network layers increases; at this point the overly complex network structure is considered to overfit.
According to Tab.1, the structure of the first-stage SBiGRU model is determined as 2+1 layers: two BiGRU layers and one fully connected layer. The model structure is shown in Fig.7. The training mode of the model is to forecast the load data of 48 future time points from the load data, date type and temperature information of 48 historical time points. Therefore, the data from August 1 to October 30 is taken as the network input, the load data from August 2 to October 31 as the network output, and the Adam algorithm is used to optimize the network parameters. In order to obtain the error samples generated during training, after the SBiGRU model converges, the training set is first input into the trained model to obtain the training-set prediction P_btrain, as shown in Fig.8. Then, according to equation (14), the error sequence P_e generated by the first-stage SBiGRU model during prediction is obtained, as shown in Fig.9.

2) TRAINING PROCESS OF CEEMDAN-SBIGRU MODEL IN THE SECOND STAGE
The STL decomposition algorithm is used to analyze the error series; the results are shown in Fig.10. It can be seen that the periodicity of the residual component of the error series is weak, which increases the difficulty of fitting the error series. Fig.11 is the autocorrelation diagram of the error series. Although the periodicity of the error sequence is weak, there are still historical errors significantly correlated with the current time point, so historical errors can be used to predict future errors. In order to improve the prediction accuracy of the error series, the CEEMDAN algorithm is used to decompose it into nine IMF components and one residual component; the results are shown in Fig.12. The IMF1~IMF5 components have high frequency and no obvious periodicity and can be regarded as the high-frequency components of the error sequence. The IMF6~IMF8 components have obvious periodicity and can be regarded as the periodic components of the error sequence. The IMF9 component has low frequency and no obvious periodicity and can be regarded as the low-frequency component of the error sequence. IMF10 is the residual component of the error sequence.
The second-stage SBiGRU models are established for the nine IMF components and one residual component decomposed by the CEEMDAN algorithm. The structure design process is similar to that of the first-stage SBiGRU, and the network structure is determined as 2+1 layers by experiment. However, since the input of the network is an error component whose feature dimension is 1, the dimension information of the second-stage SBiGRU input layer is {91, 48, 1}. The input of each SBiGRU model is set as the error component from August 1 to October 29, the output as the error component from August 2 to October 30, and the network parameters are iteratively optimized by the Adam algorithm.

3) PREDICTION COMPONENT RECONSTRUCTION
The predicted values of each IMF component c_i(t) and the residual component r_n(t) are accumulated to obtain the predicted error:

p(t) = Σ_i c_i(t) + r_n(t)  (27)

where p(t) is the reconstructed error data.

4) EXPERIMENTAL PROCESS AND RESULT ANALYSIS
In order to display the prediction results intuitively, the load data of the 96 time points of November 3-4 are predicted from the data of the 96 time points of November 2-3. After the first-stage SBiGRU model and the second-stage CEEMDAN-SBiGRU model converged, the first-stage SBiGRU model is used to predict the main characteristics P_b of the load sequence for November 3-4, the CEEMDAN-SBiGRU model is then used to predict the error characteristics P_e' for November 3-4, and the final prediction results are obtained according to equation (15). At the same time, LSTM, SBiGRU, SBiGRU-SBiGRU and the traditional load forecasting method SVR are selected as comparison tests, as shown in Fig.13; SBiGRU-SBiGRU is a two-stage forecasting model with SBiGRU as the error correction model. It can be seen from Fig.13 that, compared with SVR, LSTM, SBiGRU and SBiGRU-SBiGRU, the combined model proposed in this paper performs better and captures the complex nonlinear changes of the load series more effectively, so it forecasts the load series more accurately and stably.
The data of the remaining dates in the test set are predicted by the same method. Fig.14 shows the correlation distribution between the actual and predicted values of the models on the forecast set, where R is the Pearson correlation coefficient used to evaluate the correlation between the real and predicted values; the abscissa is the predicted value and the ordinate is the real value. It can be seen that, compared with the other models, the correlation coefficient between the predicted and real values of SBiGRU-CEEMDAN-SBiGRU is closest to 1, so it performs best on the prediction set. Fig.15 shows the error histograms of the five models; the more concentrated a histogram is near 0, the better the model performs. The error histogram of the proposed SBiGRU-CEEMDAN-SBiGRU model is the most concentrated near 0, which shows that its performance is the best among the five models. It can be seen from Tab.2 that the SVR model has the worst performance on the prediction set, with an average daily accuracy of 94.94%. LSTM records the historical information of the load series during forecasting, and its prediction accuracy is 95.85%. The SBiGRU model takes into account the data of both past and future times and uses a 2+1 layer network structure to improve generalization, so its prediction accuracy is higher, with a test-set average of 96.06%. The SBiGRU-SBiGRU model uses SBiGRU to forecast the main features of the load in the first stage and again as the error correction model in the second stage, so the forecasting accuracy is further improved, with a daily average of 96.86%. The SBiGRU-CEEMDAN-SBiGRU model uses the CEEMDAN algorithm to decompose the error sequence in the second stage, which allows the error to be predicted on different time scales; therefore its average daily accuracy is the highest among the five models, at 97.25%.
Tab.3 shows the training and prediction time consumption of the five models. The combined forecasting models divide the forecasting process into two stages, load forecasting and error forecasting, so their training and prediction times are greater than those of the single-stage models. In particular, the SBiGRU-CEEMDAN-SBiGRU model must decompose the error sequence and train a model for each error component, so its training and prediction times far exceed those of the other models. The combined forecasting model thus trades computation time for improved prediction accuracy.

E. EXPERIMENT 2
In order to further verify the validity of the model, this paper continues to use the power load data of Alberta in 2016 to evaluate the accuracy of the model.
Firstly, a time series analysis of the load data set is performed, and the load series is decomposed by STL; the results are shown in Fig.17. Fig.17-B shows that the overall load value first increases and then decreases over the 92-day period, i.e., the load data presents a rising-then-falling trend. Fig.17-C shows the periodicity of the load series, with a period of about 24 time points. Fig.17-D shows the distribution of the influence of external random factors on the load time series. Then, the autocorrelation of the data set is analyzed experimentally; the results are shown in Fig.18. It can be seen from Fig.18 that the load value at the current time point is mainly related to the load values at three historical time points, so the load forecasting model should use at least the load data of these three historical time points to forecast future load. 92 days of data from July to September are selected as the training set, and 11 days of data from October 1 to October 11 as the test set.
The dimension information of the training set is: {D1⋯D92}×{L1⋯L24}×{X1,X2}, where {D1⋯D92} indicates that the data set is 92 days long; {L1⋯L24} indicates that each day contains 24 time points; {X1,X2} represents the two-dimensional features of the data set: the load data and date type at the current time.
Similarly, the dimension information of the test set is: {D1⋯D11}×{L1⋯L24}×{X1,X2}.

1) FIRST STAGE SBIGRU TRAINING AND PREDICTION
In order to determine the network structure, single-layer BiGRU, 2+1 layer SBiGRU, 3+1 layer SBiGRU and 4+1 layer SBiGRU are taken as experimental samples, with the mean absolute percentage error and training time as evaluation indexes. The experimental results are shown in Tab.4. The training mode of the model is as follows: the load data and date type of 24 historical time points are used to predict the load data of the next 24 time points. Therefore, the data from July 1 to September 29 are used as the network input, the load data from July 2 to September 30 as the network output, and the Adam algorithm is used to optimize the network parameters.
In order to obtain the error samples generated in the training process, after the SBiGRU model converges, first input the training set into the trained model to obtain the prediction result P btrain of the training set, as shown in Fig.20. Then, according to equation (14), the error sequence P e generated in the prediction process of the first stage SBiGRU model can be obtained, as shown in Fig.21.

2) TRAINING PROCESS OF THE SECOND STAGE CEEMDAN-SBIGRU MODEL
The STL decomposition algorithm is used to decompose the error sequence; the results are shown in Fig.22. It can be seen that the residual component still shows irregular changes. In Fig.24, IMF1~IMF4 are high-frequency components, IMF5~IMF7 are periodic components, IMF8 is a low-frequency component, and IMF9 is the residual component.
An SBiGRU model is established for each component and trained with supervision. The input of each SBiGRU model is set as the error component from July 1 to September 29, the output as the error component from July 2 to September 30, and the network parameters are iteratively optimized by the Adam algorithm.

3) EXPERIMENTAL PROCESS AND RESULT ANALYSIS
Through the load data of the 216 time points from October 2 to 10, the load data of the 216 time points from October 3 to 11 are predicted. The experimental results are shown in Fig.25, and the correlation distribution between the actual and predicted values in Fig.26. It can be seen from Fig.25 and Fig.26 that the SBiGRU-CEEMDAN-SBiGRU model fits the load data more accurately than the other models in Experiment 2, and the correlation coefficient between its predicted and real values is closest to 1, so it performs best in Experiment 2. Fig.27 shows the error histograms; the SBiGRU-CEEMDAN-SBiGRU model is more concentrated near 0 than the other models, so it has the best prediction performance. It can be seen from Tab.5 that the SVR model performs worst, with an average daily accuracy of 97.34%. LSTM records the historical information of the load series during forecasting, and its prediction accuracy is 97.82%. The SBiGRU model takes into account the data of both past and future times and uses a 2+1 layer network structure to improve generalization, with an accuracy of 98.26%. SBiGRU-SBiGRU uses SBiGRU as the error correction model, improving the daily average accuracy to 98.58%. The SBiGRU-CEEMDAN-SBiGRU model uses the CEEMDAN algorithm to decompose the error series and can predict it on different time scales, so its daily average accuracy is the highest among the five models, at 98.95%. Tab.6 shows the time cost of the different models: the training and prediction times of the SBiGRU-CEEMDAN-SBiGRU model are much longer than those of the other models, indicating that the model improves prediction accuracy at the expense of computation cost.

V. CONCLUSION
This paper introduces a combined model of SBiGRU and CEEMDAN-SBiGRU. In the first stage of the combined model, a 2+1 layer SBiGRU model is designed; in the second stage, CEEMDAN-SBiGRU is proposed as the error correction model; finally, the prediction results of the two stages are summed to correct the error. According to the results of Experiment 1 and Experiment 2, the following conclusions can be drawn: (1) All three evaluation indexes of SBiGRU-CEEMDAN-SBiGRU and SBiGRU-SBiGRU are better than those of the other two single prediction models, indicating that the two-stage prediction model achieves better prediction accuracy than the single-stage prediction model.