A Novel Data-Driven Method with Decomposition Mechanism Suitable for Different Periods of Electrical Load Forecasting

To improve the precision of load forecasting over different time spans, a new load forecasting model is proposed that combines the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) algorithm, the least squares support vector machine (LS-SVM) and the long short-term memory (LSTM) network. The training set of the forecasting model was acquired from the dispatch center of a large city in the north of China. The ICEEMDAN algorithm is used to decompose the original historical load data, so that the fluctuation trends of different periods can be obtained. A network structure combining LS-SVM and LSTM is then used to obtain the load forecasting result: the LS-SVM processes the non-stationary signals and the LSTM processes the stationary signals. The simulation results show that, whether tested on the dataset acquired from the city in the north of China or on the Elia dataset, the proposed method outperforms load forecasting models based on PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM in the short, medium and long term.


I. INTRODUCTION
Recently, load forecasting has played an important role in the future dispatch planning of the power grid, as well as in the operation of power market transactions and the economic and stable operation of the power grid system [1,2]. The accuracy of load forecasting directly affects the safety and reliability of the grid [3]. However, with continuous economic development, the load structure of power users has become increasingly complex and the peak-to-valley differences in power grids have continued to increase, causing great difficulties for accurate load forecasting [4]. Therefore, given the multiple characteristics, strong nonlinearity and strong volatility of load data, it is important to explore how to mine the information in these characteristics, improve the accuracy and efficiency of forecasting, and identify suitable load forecasting models.
According to the time span, load forecasts are usually classified into three categories. The first is short-term load forecasting (STLF), which forecasts load data over a span of a few days. The second is medium-term load forecasting (MTLF), which forecasts load data over a span of weeks. The third is long-term load forecasting (LTLF), which forecasts load data for a month or more [5,6].
The most common load forecasting methods are based on time series and mainly include statistical models, artificial intelligence (AI) models and hybrid models [7]. Many efforts have addressed load forecasting for only one type of time span. Takeda et al. proposed an ensemble Kalman filter (EnKF) method for short-term load forecasting, which improved the load forecasting accuracy [8]. Hassan et al. developed a system for electricity load demand forecasting based on the theory of the extreme learning machine (ELM), and the results showed that the proposed method was effective [9]. Eskandari et al. used convolutional neural networks for multidimensional feature extraction, and the extracted features were fed into a bidirectional long short-term memory network with gated recurrent units, which made it possible to obtain hour-by-hour electricity load forecasts that account for multiple external factors [10]. Samuel et al. proposed an improved entropic mutual information-based feature selection method for data preprocessing, which can handle both linear and nonlinear electricity load data and was used in [11] to remove irrelevant and redundant features. They also combined a conditional restricted Boltzmann machine (CRBM) with a metaheuristic optimization algorithm, Jaya, to construct a medium-term load forecasting model, which improved the algorithm's convergence speed and reduced execution time. Mohammed and Al-Bazi [12] developed an improved neural network model for long-term load prediction by using the adaptive back propagation algorithm (ABPA), which provides high prediction accuracy. The load forecasting methods introduced above are summarized in Table I. However, medium-term and long-term load forecasting remain challenging because of the large volume of historical load data and the many uncertain factors involved.
Therefore, medium-term and long-term load forecasting are the most difficult parts of load forecasting work. With the improvement in the generalization capability of load forecasting algorithms, some load forecasting models can be applied to two types of time span. For example, in [13], the weighted least squares state estimation was combined with an adaptive neuro-fuzzy inference system to construct a hybrid model for short-term load forecasting that was also applicable to medium-term load forecasting. In [14], an artificial neural network model and a stochastic/continuous autoregressive model were developed, which fit the needs of medium-short-term and very short-term load forecasting. In [15], time-based convolutional neural networks and period-based long short-term memory networks were combined to build load forecasting models for medium- and short-term forecasting, which reduced the computational complexity. In [16], a hierarchical optimization method was proposed that combines a nested strategy, a state transfer algorithm and hybrid support vector regression, thereby fitting the needs of medium- and long-term load forecasting.
Table I. Summary of the load forecasting methods introduced above.
Time span | Model type | Method | Ref.
Short-term | Hybrid models | CNN, LSTM, GRU | [10]
Medium-term | Hybrid models | Adaptive K-means, CRBM, Jaya | [11]
Long-term | Artificial intelligence (AI) models | ABPA | [12]
In summary, many researchers have developed helpful load forecasting algorithms that are applicable to one or two types of time span. However, according to our recent literature search, it is still a challenge to build load forecasting models that are applicable to the short term, medium term and long term at the same time. To solve this problem, the authors of this paper have developed a forecasting model that combines the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) algorithm, the least squares support vector machine (LS-SVM) algorithm and a long short-term memory (LSTM) neural network. To the best of our knowledge, this is the first time that a forecasting algorithm combining ICEEMDAN, LS-SVM and LSTM has been applied to load forecasting research. By using actual power load data and holiday data, the generalization ability of the model in short-term, medium-term and long-term load forecasting can be verified. The contributions of this paper are as follows:
1) This paper proposes a novel hybrid model based on ICEEMDAN, LS-SVM and LSTM. By using ICEEMDAN, the load data can be decomposed into stationary and non-stationary signals, which completes the processing of the neural network input data. In the network structure combining LS-SVM and LSTM, the LS-SVM is used to process the input non-stationary signal and the LSTM network is used to process the input stationary signal.
2) The hybrid model is applied to short-term, medium-term and long-term load forecasting. In this paper, the advantages of ICEEMDAN, LS-SVM and LSTM are comprehensively utilized. Based on the ICEEMDAN algorithm, the fluctuation trends of different time spans in the load data series can be accurately extracted. LS-SVM and LSTM are then used according to their respective strengths in coping with non-stationary and stationary signals, so that the fluctuation characteristics of different time spans in each sub-series can be learned accurately and short-term, medium-term and long-term load forecasting can be realized.
3) Historical load data measured at 1-minute intervals in a city in northern China, together with holiday data, is chosen as the training set, which improves the accuracy of the load forecasting model. The model is trained and tested on this actual data set, and the accuracy of the proposed model is verified to be improved compared with existing models for short-term, medium-term and long-term load forecasting.

II. FORECASTING ARCHITECTURE
The original power load data is an effective basis for power load forecasting. However, fluctuation and uncertainty in the data can have a great impact on forecasting results, even causing the predicted values to deviate greatly from the real results. Therefore, in this paper, the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) is adopted to preprocess the power load data, decomposing it into stationary and non-stationary signals. Each type of signal is then handled by a corresponding network, which gives the proposed forecasting scheme its strong generalization ability.

A. COMPLETE ENSEMBLE EMPIRICAL MODE DECOMPOSITION WITH ADAPTIVE NOISE (CEEMDAN)
CEEMDAN is an advanced theoretical achievement based on empirical mode decomposition (EMD) [17]. The traditional EMD method can be used to analyze a nonlinear signal sequence: the intrinsic mode functions (IMFs) are generated adaptively from the analyzed signal, so that the signal is decomposed into IMFs and a residual [18]. Building on the EMD algorithm, mode aliasing can be effectively restrained by adding white noise of finite amplitude [19]. However, as white noise is added more often, the efficiency of the algorithm gradually worsens, and the error after reconstruction can increase. To deal with this problem, CEEMDAN transforms the white noise into adaptive white noise, which can make the signal reconstruction error approach zero. This method can adaptively decompose nonlinear and non-stationary signals into multiple IMFs with different scales. Let $I(t)$ be the original signal sequence, $\varepsilon_k$, $k = 0, \ldots, K$, the adaptive coefficients, and $w^i(t)$, $i = 1, \ldots, N$, the white noise sequences added each time, which are chosen as realizations of zero-mean, unit-variance white noise. Let $E_k(\cdot)$ denote the operator that extracts the $k$-th mode obtained by EMD, and define $R_1(t)$ as the first remainder sequence. The CEEMDAN algorithm is shown in Algorithm I.
Algorithm I: Complete ensemble empirical mode decomposition with adaptive noise
1: Use empirical mode decomposition (EMD) to decompose the $N$ noisy realizations $I(t) + \varepsilon_0 w^i(t)$ and average their first modes to obtain the first IMF:
$$\mathrm{IMF}_1(t) = \frac{1}{N}\sum_{i=1}^{N} E_1\big(I(t) + \varepsilon_0 w^i(t)\big)$$
2: Calculate the first remainder sequence:
$$R_1(t) = I(t) - \mathrm{IMF}_1(t)$$
3: Calculate the second IMF:
$$\mathrm{IMF}_2(t) = \frac{1}{N}\sum_{i=1}^{N} E_1\big(R_1(t) + \varepsilon_1 E_1(w^i(t))\big)$$
4: For $k = 2, \ldots, K$, calculate the $k$-th remainder sequence and the $(k+1)$-th IMF:
$$R_k(t) = R_{k-1}(t) - \mathrm{IMF}_k(t), \qquad \mathrm{IMF}_{k+1}(t) = \frac{1}{N}\sum_{i=1}^{N} E_1\big(R_k(t) + \varepsilon_k E_k(w^i(t))\big)$$
5: Repeat step 4 until the remainder sequence can no longer be decomposed.
However, the CEEMDAN algorithm produces pseudo modes in the early stage of decomposition, and residual noise, which degrades the performance of the modal decomposition, still exists [20].

B. IMPROVED COMPLETE ENSEMBLE EMPIRICAL MODE DECOMPOSITION WITH ADAPTIVE NOISE (ICEEMDAN)
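In order to eliminate the pseudo modes generated by CEEMDAN and improve the performance of the modal decomposition, the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) method was proposed by Colominas et al. on the basis of the CEEMDAN algorithm, and it has been applied in many studies [21,22]. Unlike CEEMDAN, ICEEMDAN estimates the local mean of the noise-added signal instead of directly estimating the modes of the noise-added signal. Define $M(\cdot)$ as the operator that produces the local (envelope) mean of a signal; let $E_1(I) = I - M(I)$, let $E_k(\cdot)$ denote the $k$-th mode obtained by EMD, and define $w^i(t)$, $i = 1, \ldots, N$, as realizations of Gaussian white noise with zero mean and unit variance. Then, the detailed steps of ICEEMDAN are as follows.
Step 1: With $I^i(t) = I(t) + \beta_0 E_1(w^i(t))$, the first residual sequence can be obtained as:
$$R_1(t) = \frac{1}{N}\sum_{i=1}^{N} M\big(I^i(t)\big) \qquad (1)$$
Step 2: Define the first mode as:
$$\mathrm{IMF}_1(t) = I(t) - R_1(t)$$
Step 3: Calculate the second residual sequence $R_2$, from which the second mode $\mathrm{IMF}_2$ can be obtained as:
$$R_2(t) = \frac{1}{N}\sum_{i=1}^{N} M\big(R_1(t) + \beta_1 E_2(w^i(t))\big), \qquad \mathrm{IMF}_2(t) = R_1(t) - R_2(t)$$
Step 4: Calculate the $k$-th residual sequence $R_k$, from which the $k$-th mode $\mathrm{IMF}_k$ can be obtained as:
$$R_k(t) = \frac{1}{N}\sum_{i=1}^{N} M\big(R_{k-1}(t) + \beta_{k-1} E_k(w^i(t))\big), \qquad \mathrm{IMF}_k(t) = R_{k-1}(t) - R_k(t)$$
Step 5: Repeat step 4 until all the intrinsic modal components are obtained.
Here $\beta_0 = \varepsilon_0\,\mathrm{std}(I)/\mathrm{std}(E_1(w^i))$ and $\beta_k = \varepsilon_0\,\mathrm{std}(R_k)$ for $k \ge 1$, with $\varepsilon_0$ being the reciprocal of the expected signal-to-noise ratio of the input signal $I(t)$ to the first noise, and $\mathrm{std}(\cdot)$ being the standard deviation operator. The structure block diagram of ICEEMDAN is shown in Fig. 1.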
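The five ICEEMDAN steps above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the local mean operator is simplified to a single pass with linearly interpolated envelopes (a full EMD uses iterative sifting with spline envelopes), and the function names `local_mean`, `emd_mode` and `iceemdan`, as well as the defaults `n_modes`, `n_real` and `eps0`, are assumptions made for illustration.

```python
import numpy as np


def local_mean(x):
    """Simplified M(.): mean of linearly interpolated upper and lower envelopes."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1  # interior local maxima
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1  # interior local minima
    if len(maxima) < 2 or len(minima) < 2:
        return x.copy()  # too few extrema: treat x as its own trend
    t = np.arange(len(x))
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return 0.5 * (upper + lower)


def emd_mode(x, k):
    """Simplified E_k(.): k-th mode, using one local-mean pass per level."""
    residue = x
    for _ in range(k):
        trend = local_mean(residue)
        mode, residue = residue - trend, trend
    return mode


def iceemdan(signal, n_modes=4, n_real=20, eps0=0.2, seed=0):
    """Return (imfs, residue) following Steps 1 to 5 with the simplified operators."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_real, len(signal)))
    prev = np.asarray(signal, dtype=float)  # R_0 = I
    imfs = []
    for k in range(1, n_modes + 1):
        noise_modes = np.array([emd_mode(w, k) for w in noise])  # E_k(w^i)
        if k == 1:
            # beta_0 is scaled per realization by std(E_1(w^i))
            beta = eps0 * np.std(prev) / np.std(noise_modes, axis=1, keepdims=True)
        else:
            beta = eps0 * np.std(prev)  # beta_{k-1} = eps0 * std(R_{k-1})
        noisy = prev + beta * noise_modes
        residue = np.mean([local_mean(row) for row in noisy], axis=0)  # R_k
        imfs.append(prev - residue)  # IMF_k = R_{k-1} - R_k
        prev = residue
    return np.array(imfs), prev


# Toy signal: two oscillations plus a slow trend
t = np.linspace(0.0, 1.0, 400)
toy = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t) + 2.0 * t
imfs, residue = iceemdan(toy, n_modes=3)
```

Because each mode is defined as the difference of consecutive residuals, the modes and the final residual always sum back to the input signal, whatever local mean operator is used.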
C. LONG SHORT-TERM MEMORY NETWORK (LSTM)
The LSTM is an extension of the recurrent neural network (RNN). The main difference between RNNs and LSTMs is that LSTMs can store long-term time-dependent information and can appropriately map input and output data, providing a strong capability in predicting time-series data [23].
Compared with the traditional perceptron architecture, the LSTM network adds cells that control the flow of information through three gates: the input, forget, and output gates [24]. The three gates play different roles and cooperate to control which information is retained, ensuring that the retained information is what the algorithm needs. The structure diagram of the LSTM is shown in Fig. 2.
Here, $i_t$, $f_t$, $c_t$ and $o_t$ are the outputs of the input gate, the forget gate, the internal state and the output gate, respectively. The LSTM operates according to the following steps [25]:
(1) The data entering the LSTM first passes through the forget gate, which determines what information needs to be removed from the data, that is, what information can be dropped from the cell. This strategy is implemented through a sigmoid layer called the forget gate:
$$f_t = \sigma\big(W_f [h_{t-1}, x_t] + b_f\big)$$
(2) The second step of the LSTM is to determine which new information will be stored in the cell. First, the sigmoid layer of the input gate decides which values in the cell are updated. Second, the tanh layer creates the new candidate values to be added to the cell:
$$i_t = \sigma\big(W_i [h_{t-1}, x_t] + b_i\big), \qquad \tilde{c}_t = \tanh\big(W_c [h_{t-1}, x_t] + b_c\big)$$
The results of the forget gate and the input gate are then combined to update the old cell state:
$$c_t = f_t \times c_{t-1} + i_t \times \tilde{c}_t$$
(3) The final step of the LSTM is to determine the output from the cell state. The strategy is implemented in two parts. First, a sigmoid layer determines which parts of the cell state are output. Then, the output of the cell is set to the tanh of the cell state multiplied by the output of the sigmoid layer, so that only the part needed is output:
$$o_t = \sigma\big(W_o [h_{t-1}, x_t] + b_o\big), \qquad h_t = o_t \times \tanh(c_t)$$
where $x_t$ is the input at time $t$, $h_{t-1}$ is the previous output, $W$ and $b$ are the weights and biases of the corresponding layers, $\sigma$ is the sigmoid function, tanh is the hyperbolic tangent activation function and $\times$ denotes point-by-point multiplication.
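The three LSTM steps described above can be written as a single time step in NumPy. This is a minimal sketch of the standard LSTM cell rather than the network used in the paper; the function name `lstm_step`, the stacked weight layout (forget, input, candidate, output) and the toy dimensions are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W maps the concatenated [h_prev, x_t] to the four pre-activations,
    stacked in the order forget, input, candidate, output (assumed layout).
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0:n])              # step (1): forget gate
    i_t = sigmoid(z[n:2 * n])          # step (2): input gate
    c_tilde = np.tanh(z[2 * n:3 * n])  # step (2): candidate values
    c_t = f_t * c_prev + i_t * c_tilde # state update
    o_t = sigmoid(z[3 * n:4 * n])      # step (3): output gate
    h_t = o_t * np.tanh(c_t)           # step (3): gated output
    return h_t, c_t


# Run a short random sequence through the cell (hidden size 3, input size 2)
rng = np.random.default_rng(0)
W = rng.standard_normal((12, 5)) * 0.1
b = np.zeros(12)
h, c = np.zeros(3), np.zeros(3)
for x_t in rng.standard_normal((6, 2)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Since the output is a sigmoid gate multiplied by a tanh, every element of the output stays strictly between -1 and 1.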

D. LEAST SQUARES SUPPORT VECTOR MACHINE (LS-SVM)
Suykens et al. proposed the least squares support vector machine (LS-SVM) [26][27][28], which is an important achievement in statistical learning theory. The LS-SVM training process follows the principle of structural risk minimization and has the advantages of low computational complexity and high operational speed compared with the standard support vector machine (SVM).
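The SVM employs a linear model to predict sample data after a nonlinear mapping $\varphi(\cdot)$ from the input vector to a high-dimensional feature space. Given a training set $\{(x_i, y_i)\}_{i=1}^{N}$ with input data $x_i \in \mathbb{R}^n$ and output data $y_i \in \mathbb{R}$, the LS-SVM aims to find a hyperplane $w^T \varphi(x) + b = 0$ that separates the two classes of samples, that is, the solution of the following quadratic programming problem [29]:
$$\min_{w, b, \xi} J(w, \xi) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} \xi_i^2 \qquad (10)$$
$$\text{s.t.} \quad y_i \big( w^T \varphi(x_i) + b \big) = 1 - \xi_i, \quad i = 1, \ldots, N \qquad (11)$$
where $\gamma$ denotes the penalty parameter and $\xi_i$ denotes the slack variable.
For (10), Lagrange multipliers can be introduced to obtain the Lagrangian objective function, which is expressed as:
$$L(w, b, \xi, \alpha) = J(w, \xi) - \sum_{i=1}^{N} \alpha_i \big[ y_i \big( w^T \varphi(x_i) + b \big) - 1 + \xi_i \big] \qquad (12)$$
where $\alpha_i$ is the Lagrange multiplier. Setting the partial derivatives of (12) with respect to each parameter to zero, the Karush-Kuhn-Tucker (KKT) optimality conditions for the solution are obtained as:
$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i y_i \varphi(x_i), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \frac{\partial L}{\partial \xi_i} = 0 \Rightarrow \alpha_i = \gamma \xi_i, \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow y_i \big( w^T \varphi(x_i) + b \big) - 1 + \xi_i = 0 \qquad (13)$$
After eliminating $w$ and $\xi$, (13) can be expressed as a system of equations:
$$\begin{bmatrix} 0 & y^T \\ y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 1_N \end{bmatrix} \qquad (14)$$
where $\Omega_{ij} = y_i y_j K(x_i, x_j)$, $y = [y_1, \ldots, y_N]^T$, $1_N$ is a vector of ones and $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$ is the kernel function. Solving for $\alpha$ and $b$ according to (14), the classification discriminant function is obtained as:
$$y(x) = \mathrm{sign}\Big( \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b \Big) \qquad (15)$$
There are several different types of Mercer kernel functions $K(x, x_i)$, such as sigmoid, polynomial, and radial basis functions (RBFs). The RBF is a common choice of kernel function because few parameters need to be set and its overall performance is excellent [30]. Therefore, in this paper, the RBF is chosen as the kernel function, which is given as:
$$K(x, x_i) = \exp\Big( -\frac{\lVert x - x_i \rVert^2}{2 \sigma^2} \Big) \qquad (16)$$
where $\sigma$ is the kernel width.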
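The LS-SVM training described above reduces to solving one linear system, which the following NumPy sketch illustrates for the classifier form with an RBF kernel. It is a toy illustration under stated assumptions, not the authors' code: the function names, the values of `gamma` and `sigma`, the $2\sigma^2$ scaling of the kernel and the four-point dataset are all chosen here for demonstration.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the KKT linear system for the bias b and the multipliers alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_predict(X_train, y_train, alpha, b, X_new, sigma=1.0):
    """Discriminant function: sign of the kernel expansion plus bias."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)


# Two well-separated toy clusters with labels -1 and +1
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
b, alpha = lssvm_fit(X, y)
labels = lssvm_predict(X, y, alpha, b, np.array([[0.2, 0.5], [3.1, 3.5]]))
```

Unlike the standard SVM, no quadratic programming solver is needed: the equality constraints turn training into a single call to `np.linalg.solve`, which is the source of the low computational cost mentioned above.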

E. THE PROPOSED METHOD
In power grid planning, load forecasting serves different purposes depending on the forecasting horizon. Short-term load forecasting (STLF) is mostly applied to monitor the power quality of the grid. Medium-term load forecasting (MTLF) is mainly used for making generation plans and guiding electric energy production, and long-term load forecasting (LTLF) is used in power grid construction planning. In this article, we propose a hybrid algorithm that includes ICEEMDAN, LS-SVM, and LSTM for solving the load forecasting problem in the short, medium, and long terms.
Here, ICEEMDAN is used to decompose the original load data into stationary signals and non-stationary signals. These signals are then fed into the forecasting algorithms to obtain the result. Remark 1: In this paper, the dataset was acquired from a city in the north of China. The results of the load forecasting can be applied to guide electric energy production and to provide auxiliary decision-making for the actual power system. The flow chart of the proposed hybrid algorithm is shown in Fig. 3.
The steps of the proposed hybrid algorithm are as follows:
Step 1: Data Analysis. The purpose of this step is to ensure the quality of the dataset by cleaning the historical load data. First, the data is checked for missing values and outliers so that they do not affect the results. Missing values and outliers are repaired according to the trend of the historical load and the median value. In this paper, linear interpolation is used to deal with this issue, which can be expressed as
$$L_1(x) = y_0 + \frac{y_1 - y_0}{x_1 - x_0} (x - x_0) \qquad (17)$$
where the interpolant $L_1(x)$ is the line that goes through $(x_0, y_0)$ and $(x_1, y_1)$. If the dataset contains multi-dimensional features, the relations between the features should be mined and analyzed. Then, in order to remove the differences in units and scale between different data and to expedite the following steps, the data must be normalized.
Step 2: Data Decomposition. In Step 2, the load data is decomposed by ICEEMDAN. According to the ICEEMDAN principle, the load data is decomposed by k-step iteration into the components $\mathrm{IMF}_k$ and a margin RES, where $\mathrm{IMF}_k$ and RES are the stationary signal and the non-stationary signal obtained by decomposition, respectively. Then, to train and evaluate the forecasting model established in Step 3, the load data is divided into a training set and a test set.
Step 3: Load Forecasting. In this step, the two parts of the load data decomposed by ICEEMDAN, the non-stationary signal and the stationary signal, are fed separately into the forecasting algorithms. In order to establish the load prediction model and adjust the model parameters, the non-stationary signal is fed into the LS-SVM algorithm and the stationary signal is fed into the LSTM algorithm. Then, the outputs of the two algorithms are summed to obtain the forecasting result. Finally, the load prediction results are obtained by inverse normalization, which restores the original scale of the data.
Step 4: Model Validation. In Step 4, the accuracy of the model is verified. The degree of fitting and the accuracy of the established model are evaluated by the evaluation indices, which are introduced in the next section.
VOLUME XX, 2017
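Steps 1 and 3 above can be sketched in a few lines of NumPy. The sketch makes several assumptions that the text does not specify: min-max scaling is used as the normalization, the helper names are hypothetical, and a trivial `persistence` forecaster stands in for both the LSTM and the LS-SVM so that the routing and summation of component forecasts can be shown without a trained model.

```python
import numpy as np


def fill_missing_linear(load):
    """Step 1: fill NaN entries by linear interpolation between valid samples."""
    load = np.asarray(load, dtype=float)
    t = np.arange(len(load))
    valid = ~np.isnan(load)
    return np.interp(t, t[valid], load[valid])


def minmax_normalize(x):
    """Scale to [0, 1] and return the scale needed to undo it (assumed scheme)."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), (lo, hi)


def minmax_denormalize(x_norm, scale):
    lo, hi = scale
    return x_norm * (hi - lo) + lo


def hybrid_forecast(imfs, residue, stationary_model, nonstationary_model, horizon):
    """Step 3: forecast each component with its model and sum the forecasts."""
    total = nonstationary_model(residue, horizon)
    for imf in imfs:
        total = total + stationary_model(imf, horizon)
    return total


def persistence(series, horizon):
    """Hypothetical stand-in forecaster: repeat the last observed value."""
    return np.full(horizon, series[-1])


raw = np.array([10.0, np.nan, 14.0, 18.0, np.nan, np.nan, 12.0])
clean = fill_missing_linear(raw)
norm, scale = minmax_normalize(clean)

# Toy stand-in for Step 2: a straight-line trend and the remaining detail
residue = np.linspace(norm[0], norm[-1], len(norm))
imfs = [norm - residue]

pred_norm = hybrid_forecast(imfs, residue, persistence, persistence, horizon=3)
pred = minmax_denormalize(pred_norm, scale)
```

The two forecaster arguments are kept separate so that the stationary and non-stationary components can be handled by different models, as in the proposed scheme.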

III. RESULTS AND DISCUSSION
To realize the proposed algorithm, historical load data measured at 1-minute intervals was used and resampled at 15-minute intervals. Together with holiday data from 2018 to 2020 for a city in northern China, this data was used to build a load forecasting model written in Python. In this model, the data set is divided into a training set and a test set in a ratio of 3:1. The PSO-SVR-based, LS-SVM-based, LSTM-based and ICEEMDAN-LSTM-based forecasting methods are then compared and the simulation results are obtained. It should be noted that electricity consumption in summer and winter is higher than in spring and autumn, because air conditioners are often used for cooling in summer and for heating in winter. Usually, the last IMF component contains the seasonal trend of the load data. This seasonal trend can be captured and learned effectively by the LSTM algorithm, which provides important information for long-term load forecasting. The results of the decomposition over four months and over a year are shown in Fig. 4 and Fig. 5. Remark 2: The load forecasting model was constructed and tested using real load data obtained from a large industrial city in the north of China, which is therefore the real-world scenario considered in this research. The model parameters selected for the load forecasting simulations in the three time spans are shown in Table Ⅱ.
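The resampling and the 3:1 split described above can be sketched as follows. The text does not say how the 1-minute data was resampled or how the split was drawn, so this sketch assumes block averaging over 15 samples and a chronological split; the function names are hypothetical.

```python
import numpy as np


def resample_mean(load_1min, factor=15):
    """Average consecutive blocks of `factor` samples (assumed resampling rule)."""
    load_1min = np.asarray(load_1min, dtype=float)
    usable = (len(load_1min) // factor) * factor  # drop an incomplete trailing block
    return load_1min[:usable].reshape(-1, factor).mean(axis=1)


def chronological_split(series, train_parts=3, test_parts=1):
    """Split a series in time order into training and test segments."""
    cut = len(series) * train_parts // (train_parts + test_parts)
    return series[:cut], series[cut:]


one_hour = np.arange(60.0)             # toy 1-minute load values for one hour
load_15min = resample_mean(one_hour)   # four 15-minute averages
train, test = chronological_split(load_15min)
```

A chronological split keeps the test period strictly after the training period, which avoids leaking future load values into training.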
Finally, to evaluate the accuracy of the proposed algorithm, three metrics were selected: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). The expressions of these three evaluation criteria are:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \lvert C_i - R_i \rvert$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (C_i - R_i)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left\lvert \frac{C_i - R_i}{R_i} \right\rvert$$
where $N$ is the number of data points, $C_i$ is the forecast value and $R_i$ is the real value ($1 \le i \le N$). Using the load forecasting model and these three metrics, the forecasting results for the short term, the medium term and the long term are discussed in the next three subsections, respectively.
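The three evaluation metrics can be computed directly with NumPy, as in the short sketch below. The function names and the three-point example are illustrative only and are not taken from the paper's experiments.

```python
import numpy as np


def mae(real, pred):
    """Mean absolute error."""
    return np.mean(np.abs(pred - real))


def rmse(real, pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((pred - real) ** 2))


def mape(real, pred):
    """Mean absolute percentage error in percent; real values must be nonzero."""
    return 100.0 * np.mean(np.abs((pred - real) / real))


real = np.array([100.0, 200.0, 400.0])  # toy real load values
pred = np.array([110.0, 190.0, 400.0])  # toy forecast values
scores = {"MAE": mae(real, pred), "RMSE": rmse(real, pred), "MAPE": mape(real, pred)}
```

MAE and RMSE are expressed in the units of the load, while MAPE is relative, which makes it the more convenient metric for comparing results across datasets with different load levels.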

A. SHORT-TERM LOAD FORECASTING (STLF)
The effectiveness of the proposed hybrid algorithm in STLF was validated by forecasting the electricity load of the last three days of 2020, with a training set three times as long as the test set. The STLF results of the proposed method and their comparison with PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM are shown in Fig. 6.
The correlation curve between the actual value and the predicted value is shown in Fig. 7, and the MAE, MAPE, and RMSE of STLF are listed in Table Ⅲ. Table Ⅲ shows that the prediction performance of LSTM or LS-SVM alone is poor. The prediction results are significantly improved by combining ICEEMDAN, LSTM and LS-SVM.

B. MEDIUM-TERM LOAD FORECASTING (MTLF)
The electricity load data compiled during the last 4 weeks of 2020 was used for MTLF, with the training and test datasets divided in a ratio of 3:1. The forecasting result of MTLF and the correlation curve between the actual value and the predicted value are shown in Fig. 8 and Fig. 9, respectively. The MAE, MAPE, and RMSE of MTLF are listed in Table Ⅳ. As shown in Table Ⅳ, the prediction performance of the proposed algorithm weakens slightly as the prediction sequence becomes longer. However, the proposed hybrid algorithm remains clearly superior to the other algorithms. The MAPE of the proposed method was reduced by 1.5307%, 0.6967%, 0.16% and 0.0441% compared with PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM, respectively.

C. LONG-TERM LOAD FORECASTING (LTLF)
Following the same protocol as in the above two cases, the power load data from September to November was used to train the data-driven model, and the load of December was forecast to verify the validity of the model for LTLF. The simulation results are shown in Fig. 10 and Fig. 11, where Fig. 10 is the result of the LTLF and Fig. 11 is the correlation curve between the actual value and the predicted value.
The MAE, MAPE, and RMSE of LTLF are listed in Table Ⅴ, and the comparison of MAPE using the experimental sets in STLF, MTLF and LTLF is shown in Fig. 12. From Fig. 7, Fig. 9 and Fig. 11, it can be seen that the fitting effect of LTLF is the best, because the correlation curve between the actual value and the predicted value in LTLF has the largest $R^2$. Remark 3: As shown in Table Ⅲ, Table Ⅳ and Table Ⅴ, comparing the ICEEMDAN-LSTM model with the LSTM model makes it clear that the accuracy of the load forecasting is improved by using the ICEEMDAN algorithm. Comparing the ICEEMDAN-LSTM model with our proposed model, we can conclude that handling the non-stationary and stationary signals separately, with LS-SVM and LSTM respectively, within one model enhances the performance of the load forecasting.
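The $R^2$ of a correlation curve between actual and predicted values can be computed as sketched below. The paper does not state which definition of $R^2$ was used; this sketch assumes the squared Pearson correlation, which equals the $R^2$ of a straight-line fit between the two series. The function name and the toy data are illustrative.

```python
import numpy as np


def r_squared(real, pred):
    """Squared Pearson correlation between real and predicted values (assumed definition)."""
    r = np.corrcoef(real, pred)[0, 1]
    return r ** 2


real = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
perfect_line = 2.0 * real + 1.0                 # exactly linear in the real values
noisy = real + np.array([0.5, -0.4, 0.3, -0.6, 0.2])

r2_line = r_squared(real, perfect_line)
r2_noisy = r_squared(real, noisy)
```

Note that under this definition a forecast that is biased but perfectly linear in the real values still scores 1, so $R^2$ is best read together with MAE, RMSE and MAPE.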

D. MODEL VALIDATION OF AN OPEN SOURCE POWER LOAD DATASET
To verify the generalization ability of the proposed hybrid algorithm, electricity load data collected from the European Elia grid [31] was used to perform load forecasting. This load includes the measured net generation of the (local) power stations that inject power into the Elia grid, the net inflows from distribution to the Elia grid and the net imports at the borders. Exports at the borders and energy used for energy storage are deducted.
For comparison, we again selected PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM. It should be noted that the load forecasting in the three time spans was conducted using a common experimental set with the same model parameters as in Table Ⅱ.
Author Name: Preparation of Papers for IEEE Access (February 2017)
By using this common experimental set, we obtained the results of STLF, MTLF and LTLF shown in Fig. 13, Fig. 14, and Fig. 15, respectively. The correlation curves between the actual value and the predicted value in the three types of time spans are shown in Fig. 16, Fig. 17, and Fig. 18, respectively.
The MAE, MAPE, and RMSE of STLF, MTLF and LTLF obtained with the common experimental set are listed in Table Ⅵ, and a comparison of MAPE in STLF, MTLF and LTLF is shown in Fig. 19. Table Ⅵ indicates that the proposed method performed well in all time spans on the common experimental set. In all cases, compared with PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM, our method had the smallest MAE, MAPE and RMSE. This further supports the robustness and generalization ability of the proposed algorithm.

IV. CONCLUSION
This paper presents a forecasting method combining ICEEMDAN, LS-SVM and LSTM, which the authors developed to forecast the electric load for three types of time spans: short, medium and long term. The proposed method exploits the hidden features learned by the least squares support vector machine (LS-SVM) and LSTM networks to obtain the advantages of both algorithms. The performance of the proposed algorithm was evaluated on real power load data from the power system of a large city in China. To ensure stability and validity, validation was carried out on the dataset of each time span. In addition, the robustness and generality of the designed scheme were tested. In all validation experiments, the proposed algorithm gives the lowest MAE, RMSE and MAPE values compared with the PSO-SVR, LS-SVM, LSTM and ICEEMDAN-LSTM algorithms. Finally, the results show that the proposed scheme can handle long time series of electric load data and predict future loads for a considerable period of time, and that it can handle load forecasting for all three types of time horizons: short, medium and long term.