Reactive Load Prediction Based on a Long Short-Term Memory Neural Network

Accurate reactive load prediction can improve both the accuracy and the process of reactive power optimization for power grids, and thereby the control effect. The changes in the bus reactive load and active load are not synchronous, the base of the reactive load is small, nonlinear changes are abundant, and it is difficult to mine the inherent data trends. To address these problems, this paper proposes a method for predicting bus reactive loads based on deep learning. A bus reactive load prediction model is constructed based on a dual-input long short-term memory neural network to mine the detailed characteristics of active and reactive load data. Active and reactive loads are used as both input and output data for the dynamic modeling of load time series data to form integrated forecasts of bus active and reactive loads. The experimental results show that this method can accurately predict the reactive power load of buses, and the prediction accuracy is better than that of time series and general long short-term memory neural network prediction models.


I. INTRODUCTION
With the development of power-side distributed power generation and microgrids, the diversity and uncertainty of the power supply and load side have increased, and operating scenarios are becoming more complicated. The need for power grid operations to coordinate among various types of reactive power control equipment in the grid and achieve fine-scale reactive voltage control in the grid has become increasingly urgent. Accurate reactive load prediction is of great significance for fine-scale reactive voltage control.
Reference [1] proposed a coordinated control strategy based on the combined control of reactive power and voltage. This strategy involves the coordinated control of the reactive power and voltage of static synchronous compensators in the parallel operation of wind farms. The voltage at the coupling point in a power system is rapidly adjusted, and the dynamic reactive power reserve of the wind farm is optimized. In [2], a two-level reactive voltage control strategy based on a multiagent system (MAS) and a reactive voltage sensitivity algorithm was proposed. By using reactive voltage sensitivity information, a wind power cluster was connected to the regional power grid, and the regional power grid was equipped with intelligent, decentralized and coordinated features to maintain the connection between the voltage and power grid at a reasonable level. In [3], a control strategy for reactive power and voltage in a wind farm based on an adaptive discrete binary particle swarm optimization algorithm was proposed. Taking the voltage fluctuations of a wind farm bus and the minimum reactive power input as the control objectives, a power flow equation for wind farms was established. By treating the voltage safety and control variables as constraints, the comprehensive control instructions for the prediction period were obtained according to wind power prediction data. Most of the reactive power optimization methods in the above references are postcorrection control methods based on real-time online data, and the control strategies are temporally lagging. Reference [4] proposed an active method of reactive power control based on predictive control theory. According to the predictions of wind speed, the active power was adjusted before the wind speed changed to explore the reactive power control capability of the model for a wind turbine and achieve the active control of the reactive power. However, the method focuses on the reactive power control capability of wind turbines and active power prediction on the power supply side, thus failing to consider the complex characteristics of reactive power on the load side.
Some studies have begun to analyze the reactive power characteristics of grid loads. Because the lack of substation data analysis in the reactive power control of the power grid can lead to malfunctions in reactive power devices, [5] proposed an improved BP network reactive load prediction method based on the similarity and smoothness of the reactive load curve. This method can be used for the online identification of bad reactive power data via comparisons with the real-time reactive load data, but it is mainly used for real-time data denoising. Its reactive load forecasting accuracy is not high, so it is difficult to meet the online prediction requirements of reactive power optimization. In [6], [7], the characteristics of the reactive power load of the power grid in time and space were analyzed by establishing reactive power evaluation indicators, and on this basis, predictions based on the stabilization characteristics of reactive load indicators were obtained in the same operating mode. However, the data used tend to be statistical network data, and the method is unable to achieve the dynamic online prediction of reactive loads. In summary, predicting the reactive power of a bus-level power grid with a small base number and nonlinearity is an urgent issue.
The associate editor coordinating the review of this manuscript and approving it for publication was Jason Gu.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This paper analyzes the actual data collected in a certain power grid, mines the bus-level load reactive power characteristics, and proposes a bus reactive load prediction method based on deep learning. This method builds a bus reactive load prediction model based on a dual-input long short-term memory neural network. The load forecasting model fully exploits the deep time series characteristics of the active and reactive load data, dynamically models the load time series, and provides integrated predictions of the active and reactive bus loads, improving on the accuracy of traditional bus reactive load prediction.

II. ANALYSIS OF REACTIVE LOAD CHARACTERISTICS AND INTRODUCTION OF LONG SHORT-TERM MEMORY NEURAL NETWORKS
A. ANALYSIS OF REACTIVE LOAD CHARACTERISTICS
Achieving a reactive power balance is necessary for maintaining voltage stability and high-quality voltage in power systems. However, the reactive load is affected by the consumer-side characteristics, substation reactive compensation and line flow. The inherent regularities of the data are difficult to mine, and the reactive power and voltage control requirements in different areas of large power systems often vary. Additionally, the diversity of control equipment and the strong nonlinearity of the reactive voltage must be considered.
The actual reactive load data from a regional power grid in Hainan Province are taken as an example to analyze the reactive load characteristics of the power grid. Figure 1 shows the power factor curves of 10 buses in the power grid within a day (96 sampling points). Notably, the load power factor of the power grid is not constant; it often displays volatility throughout the day, and the fluctuation characteristics of different bus loads are different.
Figure 2a and Figure 2b are the time series diagrams of the changes in the actual values of the active and reactive loads on two lines in the area; the abscissa is the sampling time, and the ordinate is the bus load. Figure 2a and Figure 2b show that the base number of the reactive load is significantly smaller than that of the active load; additionally, the changes in the reactive and active loads are not synchronous, and the power factor is not constant. The actual values of the bus loads of the above two lines can be standardized; that is, the per-unit value of the active load = the actual value of the active load/the maximum value of the active load, and the per-unit value of the reactive load = the actual value of the reactive load/the maximum value of the reactive load. The corresponding time series diagrams of the per-unit value of the daily load are shown in Figure 3a and Figure 3b.
Figure 3a and Figure 3b illustrate that the reactive load displays highly nonlinear and complex variations.
The change in the reactive load is not synchronized with the change in the active load; additionally, the base is small, nonlinearity is strong, and the active load is susceptible to noise. Thus, using a constant power factor for reactive power analysis cannot reflect the actual situation in the power grid or meet the requirements of the increasingly refined control of reactive power in the current grid.
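The per-unit scaling and the time-varying power factor discussed in this subsection can be illustrated with a minimal Python sketch. The load values below are hypothetical and are not taken from the Hainan data set; the function names are likewise illustrative. The power factor is computed with the usual definition cos(phi) = P / sqrt(P^2 + Q^2), and the per-unit scaling assumes a positive peak value.

```python
import math

def per_unit(series):
    """Scale a load series by its maximum value (per-unit definition in the text)."""
    peak = max(series)  # assumes the peak is positive
    return [value / peak for value in series]

def power_factor(p, q):
    """Power factor cos(phi) = P / sqrt(P^2 + Q^2) for one sampling point."""
    apparent = math.hypot(p, q)
    return p / apparent if apparent > 0 else float("nan")

# Hypothetical samples: active load in MW, reactive load in Mvar.
active = [30.0, 40.0, 35.0, 20.0]
reactive = [4.0, 3.0, 5.0, 1.0]

p_pu = per_unit(active)
q_pu = per_unit(reactive)
pf = [power_factor(p, q) for p, q in zip(active, reactive)]

# The spread of the power factor shows why a constant value is a poor model.
pf_spread = max(pf) - min(pf)
```

Because the two series peak at different sampling points, their per-unit curves do not overlap, which is the asynchrony the figures are meant to show.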

B. LSTM NETWORK STRUCTURE AND TRAINING CHARACTERISTICS
Recurrent neural networks (RNNs) [8] are mainly employed for processing and predicting sequence data. In an RNN, the output of a neuron at a given time is fed back to the same neuron as an input at the next time step. This network structure can preserve the dependencies in the data. The RNN structure is shown in Figure 4.
The RNN takes the output of the hidden layer at the previous moment as an input of the hidden layer at the current moment, so it can use past information and has memory. However, when the backpropagation through time (BPTT) algorithm is used to train the RNN [9], problems such as vanishing gradients or exploding gradients easily occur. The LSTM [10] neural network controls the flow of information by introducing gate structures and memory units, which can effectively overcome the vanishing gradient problem in RNNs and accurately model data with short-term or long-term dependence. Figure 5 [11] shows the unit structure of LSTM.
Compared with the traditional RNN, the LSTM unit introduces three types of gates [12], as shown in Figure 6: the input gate, the forget gate and the output gate. The LSTM uses these gates to store and update information. Gating is implemented by a sigmoid function and element-wise multiplication. The sigmoid function maps a real value to the interval 0∼1, which describes how much information passes. When the output value of the gate is 0, no information is passed; when the value is 1, all information can pass. With this design, the network can more easily learn long-term dependencies in sequences and mitigate the vanishing gradient problem that easily occurs in a traditional RNN.
The working process of the LSTM unit is expressed as follows:
1) The sigmoid layer of the forget gate determines the information to be forgotten in the neuron.
2) Decide what information needs to be stored in the neuron and update the state of the memory unit. The sigmoid layer of the input gate determines the information to be updated, the candidate information $\tilde{c}_t$ is created via the tanh layer, and both are combined to obtain the current state $c_t$ of the neuron.
3) Determine the output of information. The output of the unit state is determined by the sigmoid layer of the output gate, the unit state is processed by the tanh layer, and the final output $h_t$ is determined.
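The three steps above can be sketched as a single forward step of an LSTM unit in plain Python. This is a minimal illustration of the standard LSTM cell, not the authors' TensorFlow implementation; the helper names (`affine`, `lstm_step`), the parameter dictionary layout and the zero-valued demonstration weights are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM unit; returns (h_t, c_t)."""
    z = x_t + h_prev  # list concatenation, i.e. [x_t, h_{t-1}]
    # Step 1: the forget gate decides how much of the old state to keep.
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]
    # Step 2: the input gate and the tanh candidate update the unit state.
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]
    c_cand = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    # Step 3: the output gate scales tanh(c_t) to give the output.
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Demonstration with hypothetical zero weights: two inputs, one hidden unit.
# Every gate then outputs sigmoid(0) = 0.5 and the candidate is tanh(0) = 0.
zero_params = {name: [[0.0, 0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
zero_params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.3, 0.1], [0.0], [1.0], zero_params)
```

With all gates at 0.5, half of the previous unit state is retained and nothing new is written, which makes the role of each gate easy to check by hand.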
The forward propagation process of the LSTM unit is expressed by (1) to (6). $f_t$, $i_t$, $c_t$ and $o_t$ are the forget gate, input gate, unit state, and output gate, respectively, at time $t$; $h_t$ is the final output of the LSTM; $W_f$ and $b_f$ are the weight matrix and bias vector, respectively, of the forget gate; $W_i$ and $b_i$ are the weight matrix and bias vector, respectively, of the input gate; $W_c$ and $b_c$ are the weight matrix and bias vector, respectively, of the memory unit; $W_o$ and $b_o$ are the weight matrix and bias vector, respectively, of the output gate. The matrix $[x_t, h_{t-1}]$ is composed of the output $h_{t-1}$ at time $t-1$ and the input $x_t$ at time $t$. $c_{t-1}$ is the unit state at time $t-1$; $\sigma$ and $\tanh$ are the sigmoid activation function and tanh activation function, respectively; and the symbol $\odot$ indicates the element-wise multiplication of two vectors.
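For reference, the standard LSTM forward pass written with the symbols defined above takes the following form. This is the textbook formulation and is assumed, not confirmed, to correspond to (1) to (6); $\tilde{c}_t$ denotes the candidate state and $\odot$ denotes element-wise multiplication.

```latex
\begin{align}
f_t &= \sigma\left(W_f\,[x_t, h_{t-1}] + b_f\right) \tag{1}\\
i_t &= \sigma\left(W_i\,[x_t, h_{t-1}] + b_i\right) \tag{2}\\
\tilde{c}_t &= \tanh\left(W_c\,[x_t, h_{t-1}] + b_c\right) \tag{3}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{4}\\
o_t &= \sigma\left(W_o\,[x_t, h_{t-1}] + b_o\right) \tag{5}\\
h_t &= o_t \odot \tanh\left(c_t\right) \tag{6}
\end{align}
```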
The BPTT algorithm is the application of the back propagation (BP) training algorithm in the RNN. As shown in Figure 7, by unfolding the LSTM network into a deep network in chronological order and then applying the BP algorithm to the unfolded network, the accumulated residuals at the last moment are continuously passed back to the starting point.

III. INTEGRATED FORECASTING MODEL OF ACTIVE AND REACTIVE LOADS BASED ON THE DI-LSTM NETWORK
There are three hidden layers in the DI-LSTM model, namely two LSTM layers and one DNN layer, and the model predicts the active and reactive loads of the bus 15 min in the future. The base number of the power system bus load is small, and the load characteristics differ and are easily affected by users in the power supply area. Additionally, the reactive load is affected by the voltage level, line power flow and nonlinear compensation; thus, the load has considerable randomness. A reactive load does not have the same clear flow direction as an active load, and it is difficult to mine irregular data, but the time series characteristics of the active and reactive load data are significant. In this paper, both types of data are used as the inputs and outputs of the forecasting model to form an integrated forecasting model of bus active and reactive loads; this model can not only accurately predict active loads but can also predict reactive loads with strong nonlinearity and irregular variability.
The power factor of the bus load also varies, and the difference between the reactive load and active load is large. The data are therefore normalized to accelerate the solution, as shown in (7).
In (7), $x$ is the initial active and reactive load data, $\max(x)$ and $\min(x)$ are the maximum and minimum values, respectively, and $x'$ is the normalized input data of the network.
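Min-max scaling with the quantities named above is commonly written $x' = (x - \min(x)) / (\max(x) - \min(x))$; this form is assumed here to match (7). The following Python sketch applies it to one load series and also provides the inverse mapping, which is needed to convert network outputs back to physical units. The function names and sample values are illustrative.

```python
def min_max_normalize(series):
    """Map a series to [0, 1]; also return (min, max) for later inversion."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in series], lo, hi
    return [(value - lo) / span for value in series], lo, hi

def denormalize(values, lo, hi):
    """Invert min_max_normalize using the stored minimum and maximum."""
    return [value * (hi - lo) + lo for value in values]

# Hypothetical reactive load samples in Mvar.
raw = [2.0, 4.0, 6.0]
scaled, lo, hi = min_max_normalize(raw)
restored = denormalize(scaled, lo, hi)
```

Normalizing the active and reactive series separately keeps the small-base reactive load from being swamped by the much larger active load.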
The bus load differs between holidays and working days. Therefore, a working day/holiday indicator is added to the input variable set to improve the accuracy of network load forecasting. The output of the LSTM network for the day-type variable can be used as a correction for the workday load on holidays. Because nonnumeric day-type input data are directly represented by a Boolean variable, no normalization is required; specifically, 1 represents a working day, and 0 represents a holiday.

B. STRUCTURAL DESIGN OF THE DI-LSTM NETWORK MODEL
The input and output layers of the DI-LSTM use two-dimensional variables that consist of active and reactive loads. The model structure and hyperparameters are determined experimentally.
To obtain a suitable network structure, the number of LSTM layers is varied from 1 to 5; the training time and error are shown in Table 1. Based on the results of Table 1, a stacked structure that involves a 2-layer LSTM and 1-layer DNN is employed as the hidden layer structure of the DI-LSTM.
Considering the effective control period of reactive power optimization and the characteristics of the LSTM network solution, the sampling interval of network input data is set to 15 min, and the number of samples collected in a day is 96. Therefore, the input of the model is 96 points.
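The 96-point input and the one-step (15 min) forecast horizon can be illustrated by the way training samples might be assembled. The sketch below is an assumption about the data layout, not the authors' code: each sample holds 96 consecutive rows of [active load, reactive load, day type], and the target is the [active load, reactive load] pair at the next sampling point. Attaching the day-type flag to every time step is one possible choice that the text does not specify.

```python
STEPS_PER_DAY = 96  # 24 h at a 15 min sampling interval

def build_samples(active, reactive, is_workday, window=STEPS_PER_DAY):
    """Return (inputs, targets) for one-step-ahead forecasting.

    inputs[k]  : `window` consecutive rows of [P, Q, day_type]
    targets[k] : [P, Q] at the sampling point right after the window
    """
    rows = [[p, q, float(d)] for p, q, d in zip(active, reactive, is_workday)]
    inputs, targets = [], []
    for start in range(len(rows) - window):
        inputs.append(rows[start:start + window])
        targets.append(rows[start + window][:2])
    return inputs, targets

# Two days of synthetic, already normalized data: a working day, then a holiday.
n = 2 * STEPS_PER_DAY
active = [i / n for i in range(n)]
reactive = [0.5 * i / n for i in range(n)]
is_workday = [1] * STEPS_PER_DAY + [0] * STEPS_PER_DAY

inputs, targets = build_samples(active, reactive, is_workday)
```

Each input sample has shape 96 x 3 and each target has length 2, which matches a network whose output layer carries the active and reactive loads together.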
The initial learning rate is set to 0.025, and the learning rate is dynamically optimized by the Adam optimization algorithm. The Adam optimization algorithm can fully utilize the first and second moment estimations of the gradient, with high calculation efficiency and low memory requirements.
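How Adam uses the first and second moment estimates of the gradient can be seen in a scalar sketch of one update step. The learning rate 0.025 is the value given in the text; the decay rates 0.9 and 0.999 and the constant 1e-8 are the commonly used defaults and are assumptions here, as is the toy quadratic objective.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.025, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad         # first moment (mean of gradients)
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second moment (mean of squares)
    m_hat = m / (1.0 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# On the first step the update size is close to lr regardless of gradient scale.
first_theta, first_m, first_v = adam_step(1.0, 2.0, 0.0, 0.0, 1)

# Toy objective (theta - 3)^2, used only to exercise the update rule.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Only two running averages per parameter are stored, which is the low memory requirement mentioned above.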
The structure of the DI-LSTM prediction model is shown in Figure 8.

C. BUS ACTIVE AND REACTIVE LOAD PREDICTION PROCESS
The process of DI-LSTM bus reactive load prediction is shown in Figure 9 and involves the following steps:
1) Data preprocessing: Clean and normalize the original data to be input into the network, and divide the data into training samples and test samples.
6) Error calculation and performance evaluation: Calculate the error between the predicted value and the true value in the test sample, analyze the prediction result, and evaluate the model performance.

IV. EXAMPLE ANALYSIS
A. EXPERIMENTAL DATA AND ERROR EVALUATION INDICATORS
The development environment is Anaconda Spyder, and the model is built on the Google TensorFlow deep learning framework. The prediction program is written in Python. The data used in the experiment are the actual SCADA data from June 1 to August 22, 2011, from a certain region of the power grid in Hainan Province. A total of 357 bus loads were measured during the acquisition period. The obtained data include active and reactive loads. The training set uses data from June 1 to August 5, the test set uses data from August 6 to August 22, and the data sampling interval is 15 min. The input and output data of the bus reactive loads are shown in Table 2.
To better evaluate the effectiveness of the prediction, the mean absolute percentage error (MAPE) and root mean square error (RMSE) are employed as error evaluation indicators [13], which can represent the degree of difference between the predicted value and the true value and the average degree of dispersion to measure the prediction effect, respectively.
In (8) and (9), $n$ is the number of samples, $y_r(t)$ is the true load value at time $t$, and $y_p(t)$ is the predicted load value at time $t$.
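With the symbols above, the two indicators are usually defined as $\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_r(t)-y_p(t)}{y_r(t)}\right|$ and $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_r(t)-y_p(t)\right)^2}$; these standard forms are assumed to correspond to (8) and (9). A minimal Python sketch follows, with hypothetical sample values.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must not contain zeros."""
    n = len(y_true)
    return 100.0 / n * sum(abs((r - p) / r) for r, p in zip(y_true, y_pred))

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the load values."""
    n = len(y_true)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_true, y_pred)) / n)

# Hypothetical true and predicted loads.
y_true = [10.0, 20.0]
y_pred = [9.0, 22.0]
example_mape = mape(y_true, y_pred)
example_rmse = rmse(y_true, y_pred)

# A small true value inflates MAPE even when the absolute error is small.
small_base_mape = mape([0.1], [0.3])
```

The last line shows why MAPE can exceed 100% for reactive loads with a small base: an absolute error of 0.2 on a true value of 0.1 already gives roughly 200%.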

B. ANALYSIS OF THE PREDICTION RESULTS
In the comparative experiment, the autoregressive moving average (ARMA) model, with its order determined automatically based on the Bayesian information criterion (BIC), was applied; the single-input LSTM network, in which the number of hidden layers is 2 and the number of neurons is 97, was employed; the Adam optimization algorithm was selected for training; and the proposed DI-LSTM network was utilized to predict the bus load. The three models use the same data set to predict the active load and reactive load of a typical type 2 bus. The prediction results are shown in Figures 10a to 10d. The error results are shown in Tables 3 and 4.
According to Table 3, for the active load of the bus, ARMA, LSTM and DI-LSTM can all achieve accurate predictions because the active load change is relatively stable and the models can capture the characteristics of the data. Table 4 indicates that the MAPE of the ARMA prediction model for the reactive load prediction of the 110-kV Jinjiang Station with 2 buses reaches 240.049%, which is because the change in the reactive load of the bus is significant on the prediction day. Because ARMA is based on linear fitting, it cannot predict such changes in the data, and the prediction result has no practical value. Compared with ARMA, the LSTM forecast of the reactive load for the 110-kV Power Village Station with 1 bus yields an MAPE of 15.594%, and that for the 110-kV Jinjiang Station with 2 buses is 32.026%. The LSTM model can improve the prediction accuracy of the reactive load compared to that of ARMA, but the prediction error for reactive loads with complex changes remains large; thus, this approach cannot meet the requirements of actual projects. The DI-LSTM prediction model reduces the MAPE of the 1-bus reactive load prediction for the 110-kV Power Village Station to 8.131% and the MAPE of the 2-bus reactive load prediction for the 110-kV Jinjiang Station to 10.148%.
In addition, the average MAPE of active load prediction is reduced by 1.109%, and the average reduction in the RMSE is 0.197. Moreover, the average MAPE of bus reactive load prediction is reduced by 62.379%, and the average reduction in the RMSE is 0.386. The DI-LSTM model can significantly improve the prediction accuracy of the reactive load, and this approach provides better prediction accuracy than other models for reactive loads with complex variations. Therefore, compared with ARMA and LSTM, DI-LSTM has obvious advantages in forecasting bus reactive loads. Figure 11a to Figure 11d show the distributions of the MAPE and RMSE for bus active and reactive loads.
It can be seen from Table 5 and Figure 12 that the prediction effects of the three models are affected by the degree of nonlinearity of the reactive load. The higher the degree of nonlinearity of the data is, the lower the accuracy of the prediction. Among the three models, ARMA is the most affected by the data. When the bus reactive load changes significantly, the ARMA model cannot meet the prediction requirements.
Based on the above analysis, the LSTM neural network has strong nonlinear fitting performance, which can meet the general requirements of bus reactive load prediction, but the accuracy of reactive load prediction under complex conditions is not sufficient. The prediction model based on the DI-LSTM neural network proposed in this paper can accurately predict the reactive loads of various types of buses and achieve integrated predictions of the active and reactive loads of buses. The prediction results for the measured data show that the method can accurately predict bus loads with complex reactive power variations and has good application prospects.

V. CONCLUSION
This paper proposes a method for predicting bus reactive power based on deep learning. The deep time series characteristics of active and reactive load data are mined, using active and reactive loads as input and output data at the same time to dynamically model the load time series. Network parameter training is completed by the backpropagation through time (BPTT) algorithm to form an integrated prediction of bus active and reactive loads. By constructing a bus reactive load prediction model based on a dual-input long short-term memory neural network, the accuracy of the bus reactive load prediction is effectively improved. This result is of great significance for the fine control of reactive voltage and for improving the safety and economy of power systems.