Determining Optimal Operating Reserves toward Wind Power Penetration in Indonesia based on Hybrid Artificial Intelligence

The stability and economics of power system operation under Wind Power Plant (WPP) penetration are largely determined by the variability and uncertainty of the wind power output. The seasonal characteristics of wind power output can be used to define the optimal operating reserves for stable and cost-effective power system operation. This paper proposes a comprehensive hybrid Artificial Intelligence (AI) algorithm that combines the Seasonal Autoregressive Integrated Moving Average (SARIMA) with selected Neural Network Variants (NNVs) in a Seasonal Daily Variability and Uncertainty (SDVU) scheme. Among the tested NNVs, Long Short-Term Memory (LSTM) shows the most consistent and accurate results. With the hybrid AI approach, the algorithm calculates the Dynamic Confidence Level (DCL) to determine hourly operating reserves on a daily basis. The proposed algorithm has been tested using historical data from real-world WPPs operating in Indonesia. Furthermore, a comparison with a non-seasonal Static Confidence Level (SCL) in several percentile scenarios is made to demonstrate the cost-effectiveness of the new algorithm, which may save up to 4.2% of total daily energy consumption. An interface application is added so that the results of this research can be used directly, both on the observed power system and more generally in Indonesia.


I. INTRODUCTION
Wind Power Plants (WPPs) are a renewable energy source that provides green energy and is expanding rapidly throughout the world. However, integrating WPPs into a power system means taking into account all possible fluctuations caused by the variability and uncertainty of wind power output. Additional operating reserves are needed to maintain the stability of the power system under WPP penetration. In [1], the authors explained the reserve procurement approach and transmission capacity allocation in a power system with high penetration of WPPs. Other literature [2] describes a multi-purpose dynamic economic delivery model with renewable liability requirements. The proposed model incorporates various renewable energy resources, including WPPs, into the grid.
Various methodologies have been presented to calculate the amount of operating reserves needed due to the penetration of WPPs. Based on their underlying principle, they are classified into statistical and probabilistic, Artificial Intelligence (AI), and hybrid AI methods. The traditional ones typically use only statistical and probabilistic approaches to analyze the observed data. In [3], the authors used a statistical approach to determine reserve requirements in Nordic countries based on hourly wind power variations. Other studies use a combination of statistical and probabilistic approaches for that purpose [4]-[6]. AI methods include machine learning techniques such as neural networks and fuzzy approaches. In [7], the authors used a fuzzy power flow tool based on a steady-state security assessment to determine reserve settings due to wind power forecast uncertainty. The hybrid approach combines various models to gain their respective benefits. In [8], the authors offer an improved Grey BP Neural Network with an Autoregressive Integrated Moving Average (ARIMA) for modeling the uncertainties of wind-wave and swell forecasts.
In terms of the flexibility of the reserves, [9] uses a probabilistic approach to obtain dynamic reserve scheduling. Dynamic scheduling of operating reserves can also be used in co-optimized electricity markets with high WPP penetration [10]. In recent years, operating reserves have been studied not only at the system level but also at the plant level. Plant-level studies focus only on balancing the intermittent output of the WPPs and reducing the size of reserve requirements. Most of them address the optimal size of the Energy Storage System (ESS) [11]-[13].
This paper proposes a Seasonal Daily Variability and Uncertainty (SDVU) scheme based on a hybrid AI approach that incorporates Seasonal Autoregressive Integrated Moving Average (SARIMA) and Neural Network Variants (NNVs) to obtain optimal additional operating reserves due to the penetration of WPPs in the southern Sulawesi power system, Indonesia. Previously, other literature used a similar hybrid AI approach based on a combination of SARIMA and a neural network (multi-layer perceptron) for wind speed forecasting [14]; however, it is not directly applicable to our study. Furthermore, it used only one type of neural network for comparison, while the proposed algorithm uses several.
In the proposed model, the optimal operating reserves are determined by calculating the Dynamic Confidence Level (DCL) with the hybrid AI results as the input data. Previously, the authors of [15] used probabilistic-dynamic approaches to calculate DCL, treating expected energy not supplied as the key parameter. In this paper, we use SDVU for that purpose. Our analysis indicates that the variability and uncertainty of the observed data have a specific seasonal nature, so they can be approached linearly by SARIMA; an NNV is then used to capture the non-linear nature of the SARIMA output and increase the accuracy of the final result. This study uses NNV methods commonly applied to time series analysis and forecasting: Nonlinear Autoregressive Neural Network (NARNET), Multi-Layer Perceptron Back Propagation (MLPBP), Wavelet Neural Network (WNN), and Long Short-Term Memory (LSTM). The NNV methods are combined one by one with SARIMA to generate forecast models of the variability and uncertainty of the observed data. The best combination is chosen as the proposed method in this study, based on an accuracy analysis using the Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE) parameters. To demonstrate the cost-effectiveness of this algorithm, the resulting additional operating reserves are compared with those from the Static Confidence Level (SCL) for several percentile scenarios. Table I compares the research works of some of the reference papers cited previously. It indicates the relation to the previous works and justifies the novelty of this research.

TABLE I. Comparison of research works
Author(s) | Ref. | Method | Application
(not recoverable) | [8] | Grey BP NN + ARIMA | Wind-wave forecast
Wang et al. | [11] | Probabilistic distribution | Energy storage systems
Alencar et al. | [14] | SARIMA + MLPBP | Wind speed forecast
Rahman et al. | [15] | Probabilistic-dynamic | WPPs dynamic reserves
Barus & Dalimi | this paper | SARIMA + LSTM | WPPs DCL reserves
The research question to be answered is as follows: How can the level of system reliability be maintained at a more economical cost during WPP penetration?
The main contributions of this manuscript can be summarized as follows: 1) Presenting a hybrid forecasting method based on SARIMA and NNV for specific variability and uncertainty characteristics; 2) Developing an effective forecasting approach to calculate DCL; 3) Applying the SDVU scheme for more economical operating reserves while maintaining the same level of system reliability; 4) Using a real-world Indonesian database for forecasting operating reserves due to WPP penetration, which has great potential for future development. The rest of this paper is organized as follows: Section II describes the observed data used in this study. The proposed methodology is presented in Section III. In Section IV, the results of SARIMA, NNV, and DCL are explored. Finally, the conclusion is provided in Section V.

II. DATASET OF OBSERVED POWER SYSTEM
This section presents the dataset of the observed power system used in this research. It consists of a system overview, historical seasonal data, and historical intermittent data derived from the statistical data of the southern Sulawesi power system.

A. SYSTEM OVERVIEW
Indonesia lies in the tropical climate area between latitudes 11°S and 6°N, and longitudes 95°E and 141°E. Fig. 1(a) shows the location of Indonesia in the southeast Asia region based on a wind potential map [16]. Sulawesi is one of the major islands in Indonesia. It has 2 separate major interconnected power systems, which are southern Sulawesi and northern Sulawesi power systems. As shown in Fig. 1(b), there are 2 WPPs connected to the southern Sulawesi power system, which are Sidrap WPP (70 MW) and Jeneponto WPP (60 MW). Sidrap WPP is located at a higher altitude with hilly terrain, while the Jeneponto WPP is at a lower one with flat coastline terrain. In [17], the authors have described the detailed technical specification of both WPPs. Table II shows the composition of power plants (PPs) in the southern Sulawesi power system. The highest peak load is 1,411 MW (Nov 2019). It means there are 379 MW (27%) reserve margins of the system, mostly in gas PP and diesel PP.

B. HISTORICAL SEASONAL DATA
The historical database [18] used in this study was obtained from the southern Sulawesi load dispatch center. This study uses the aggregated power output of the two WPPs (southern Sulawesi WPPs) on an hour-ahead forecast basis. The data have been normalized for aggregation purposes. The time series runs from April 01, 2019, to March 31, 2020, with a time resolution of 30 minutes. This period was chosen to match the prevailing seasons in Indonesia. Annually, BMKG (Badan Meteorologi Klimatologi dan Geofisika), the National Meteorological, Climatological, and Geophysical Agency of Indonesia, publishes online documents regarding two seasonal climate forecasts, namely the dry season forecast in March [19] and the wet season forecast in September [20]. Generally, in Indonesia, the dry season occurs between April and September and the wet season between October and March. Fig. 2 shows the annual characteristic of the southern Sulawesi WPPs during the seasonal period from April 01, 2019, to March 31, 2020. It uses the following parameters: Capacity Factor (CF), Daily Capacity Load Factor (DCLF), and Standard Deviation Error (SDE). The formulation of these parameters is as follows [21]. Within a certain period, CF is defined as the ratio of the total actual energy produced ($E_{act}$) to the maximum energy that might be generated if the power plant operated at full load based on its installed capacity ($E_{max}$). By using (1), CF may also be calculated as the average power generated ($P_{avg}$) divided by the installed capacity of the power plant ($P_{inst}$).

$$CF = \frac{E_{act}}{E_{max}} = \frac{P_{avg}}{P_{inst}} \tag{1}$$

For the daily operating pattern of a WPP in a certain period, the DCLF is another useful indicator. As defined in (2), DCLF is the average CF ($CF_{avg}$) divided by the maximum CF ($CF_{max}$) in a day (24 hours) for a specified period.

$$DCLF = \frac{CF_{avg}}{CF_{max}} \tag{2}$$

FIGURE 2. Characteristic of Southern Sulawesi WPP: (a) Monthly (b) Daily
Following [22], the forecast deviation $e(t)$ is the difference between the actual value $y_a(t)$ and the forecasted value $y_f(t)$ at time $t$, defined as:

$$e(t) = y_a(t) - y_f(t) \tag{3}$$

The Standard Deviation Error (SDE), which refers to the standard deviation of the error distribution, is the square root of the average squared deviation of $e(t)$ from its mean $\bar{e}$ over the entire observed duration ($n$), as given in (4).

$$SDE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(e(t)-\bar{e}\right)^{2}} \tag{4}$$

Fig. 2(a) shows that CF in the dry season is higher than in the rainy season, with the highest value occurring in July 2019 and the lowest in December 2019. This shows that the southern Sulawesi WPPs are more productive during the dry season. Accordingly, DCLF in the dry season is higher than in the rainy season, which shows that WPP energy production during the dry season is more stable than during the rainy season.
In terms of forecast accuracy, SDE in the rainy season is almost the same as in the dry season. This shows that forecast accuracy is almost evenly distributed throughout the year and does not depend on the season. Fig. 2(b) shows that, on a daily basis, the highest CF occurs at around 3 pm, whether over the annual period, the dry season, or the rainy season. It also shows that CF in the dry season is much higher than in the rainy season, which confirms the previously displayed data. From these two conditions, it can be seen that the WPP output data have seasonal characteristics, both in terms of production patterns and prediction accuracy. These results are the basis for using the SARIMA method in a more comprehensive analysis of the linear properties of the observed data.
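The three indicators used in this subsection can be illustrated with a minimal Python sketch. The function names are hypothetical, the sample values are invented for illustration, and the SDE is assumed here to be the population standard deviation (division by n) of the deviation "actual minus forecast". The 130 MW installed capacity is the sum of the two WPPs stated above.

```python
import math

def capacity_factor(power_mw, installed_mw):
    """CF: average generated power divided by installed capacity."""
    return (sum(power_mw) / len(power_mw)) / installed_mw

def daily_capacity_load_factor(hourly_cf):
    """Average CF divided by the maximum CF over one day of hourly values."""
    return (sum(hourly_cf) / len(hourly_cf)) / max(hourly_cf)

def standard_deviation_error(actual, forecast):
    """Standard deviation of the deviations e(t) = actual - forecast.

    Assumption: population form (divide by n), not the sample form.
    """
    errors = [a - f for a, f in zip(actual, forecast)]
    mean_error = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_error) ** 2 for e in errors) / len(errors))

# Illustrative values only (not taken from the observed database).
cf = capacity_factor([26.0, 39.0, 52.0, 65.0], installed_mw=130.0)
dclf = daily_capacity_load_factor([0.2, 0.4, 0.6, 0.4])
sde = standard_deviation_error([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

A DCLF close to 1 means the daily output profile is flat, which is the sense in which a higher value is read above as more stable production.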

C. HISTORICAL INTERMITTENT DATA
WPP is part of Variable Renewable Energy (VRE) with specific intermittent characteristics. Its availability changes naturally over time (variability) and cannot be perfectly forecasted (uncertainty) [23]. The variability of wind power is due to the natural fluctuation of wind power output, which follows the availability of the natural resource. It exists even if the forecast is accurate. WPP uncertainty can be defined as the deviation between forecasted and actual wind power output. The accuracy of the WPP forecast depends on elements such as forecast horizon and geographic distribution. As the WPP contribution increases, additional operating reserves are required due to the increased variability and uncertainty in the power system [24]-[25]. In this study, wind variability means the difference between consecutive samples of the 30-minute WPP output series, while wind uncertainty means the difference between the 30-minute actual WPP output and the value estimated 1 hour ahead. Fig. 3 shows the historical wind variability and uncertainty data of the southern Sulawesi WPPs in the observed period. The data show a white noise characteristic, reflecting their high complexity and stochastic level. This characteristic is examined using the proposed method discussed in the following sections. SARIMA handles the linear nature of the data, while several NNVs are tested to explore the non-linearity and to weigh their accuracy and consistency based on commonly used statistical test parameters such as MAE, RMSE, and MAPE. The database covers 366 sampling days at 30-minute intervals, giving 17,568 points for each variable and 35,136 points in total.
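The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sign convention (actual minus forecast) is assumed, and the handling of the first sample of the difference series is not specified in the text.

```python
SAMPLES_PER_DAY = 48  # 30-minute resolution

def wind_variability(actual):
    """Difference between consecutive 30-minute output samples.

    Note: a plain first difference is one sample shorter than its input;
    how the first point is treated in the database is not stated here.
    """
    return [actual[i] - actual[i - 1] for i in range(1, len(actual))]

def wind_uncertainty(actual, forecast_hour_ahead):
    """Actual output minus the 1-hour-ahead forecast for the same time stamp
    (sign convention assumed)."""
    return [a - f for a, f in zip(actual, forecast_hour_ahead)]

# Size of the database described above: 366 days of 30-minute samples.
points_per_variable = 366 * SAMPLES_PER_DAY   # 17,568
total_points = 2 * points_per_variable        # 35,136
```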

III. PROPOSED METHODOLOGY
The Seasonal Daily Variability and Uncertainty (SDVU) scheme is the proposed methodology in this study. The role of SDVU in the daily production simulation procedure can be seen in Fig. 4(a). SDVU takes input data from the daily wind power forecast phase, and its results are then used as input for the economic dispatch simulation phase. In this study, fluctuations due to load forecast and uncertainties due to the Forced Outage Rate (FOR) of generation and transmission systems are assumed to have been calculated in the hydrothermal coordination phase. The system level is chosen in this study to calculate the amount of additional operating reserves due to penetration of WPPs by considering system stiffness.
Details of the SDVU process scheme can be seen in Fig. 4(b). First, a large database is created from seasonal wind power behavior as initial input [18]. Variability and uncertainty variables are extracted from the large database that represents the WPP aggregation. Then, the seasonal characteristic of the data is approached linearly using the SARIMA method. Within the SARIMA process, clustering and Principal Component Analysis (PCA) steps are included to minimize the white noise effect of the existing data. After that, the NNV captures the non-linear relations in the SARIMA results to produce a more accurate forecast. The best combination of SARIMA and NNV is used as the proposed hybrid AI method for this study. It is determined based on the results of the accuracy analysis with the MAE, RMSE, and MAPE parameters. Furthermore, with a statistical approach, the DCL quantities are determined based on the results of the proposed method. They are then used to calculate the hourly additional operating reserve for several percentile scenarios on a daily basis. To demonstrate the cost-effectiveness of the proposed method, comparisons are made with the non-seasonal method based on the Static Confidence Level (SCL), regarding the need for additional operating reserves and the associated energy saving for several percentile scenarios.

A. OVERVIEW OF SARIMA METHOD
The SARIMA method uses a statistical approach to predict value as a linear function of the observed data. The method is a derivative of the ARIMA model which is usually utilized to examine data with a time-series pattern that has seasonal characteristics [26]. From the previous analysis [18], it was known that the wind variability and uncertainty of the observed data had a specific seasonal characteristic, so SARIMA is suitable to be used in this study.
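The SARIMA model is usually written as SARIMA$(p, d, q)(P, D, Q)_s$. The notations $p$ and $P$ denote the non-seasonal and seasonal autoregressive orders, $d$ and $D$ denote the differencing orders, $q$ and $Q$ denote the moving average orders, and $s$ is the seasonal period.
The mathematical formulation of the SARIMA model is as follows:

$$\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d\,(1-B^s)^D\,y_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t \tag{5}$$
$$\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p \tag{6}$$
$$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q \tag{7}$$
$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \dots - \Phi_P B^{Ps} \tag{8}$$
$$\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \dots - \Theta_Q B^{Qs} \tag{9}$$

where: $\phi$ is an AR parameter of order $p$, $\theta$ is an MA parameter of order $q$, $\Phi$ is a seasonal AR parameter of order $P$, $\Theta$ is a seasonal MA parameter of order $Q$, $(1-B)^d$ is the differencing operator, $(1-B^s)^D$ is the seasonal differencing operator, $B$ is the backshift operator defined by $B\,y_t = y_{t-1}$, $\varepsilon_t$ is the residual deviation at time $t$, and $y_t$ is the variability or uncertainty at time $t$.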
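To make the role of the ordinary and seasonal differencing operators in the SARIMA formulation concrete, the following Python sketch applies them to a synthetic series. The function names and the toy series are hypothetical. It covers only the differencing part of the model, not the estimation of the AR and MA coefficients.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): y_t - y_(t-lag). The result is `lag` samples shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def sarima_difference(series, d=0, D=0, s=7):
    """Apply D seasonal differences of period s, then d ordinary differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out

# Toy series: linear trend plus a repeating 7-step pattern (illustrative only).
pattern = [3, 1, 4, 1, 5, 9, 2]
y = [2 * t + pattern[t % 7] for t in range(28)]

seasonal_only = sarima_difference(y, d=0, D=1, s=7)  # pattern removed, trend step left
stationary = sarima_difference(y, d=1, D=1, s=7)     # trend step removed as well
```

In this toy case the seasonal difference removes the repeating pattern and leaves a constant, and the ordinary difference then removes that constant, which is the effect the two operators are meant to have before the AR and MA terms are fitted.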
Equations (5)-(9) are used to analyze the linear portion of the seasonal observation data. This is the first stage in the SDVU method. The main stages in obtaining the SARIMA model consist of four steps: identify, estimate, diagnose, and forecast [26]. In this study, clustering, PCA, and regrouping are included as additional steps. The complete steps are as follows.
1) Cluster the time-series data into 24 groups based on data clusters per 1 hour.
2) Apply the principal component analysis (PCA) method [27] to the clustering results to obtain the input data for the SARIMA method.
3) Identify the existence of seasonality in the data with the Auto Correlation Function (ACF) and Partial Auto Correlation Function (PACF).
4) Estimate the coefficients of the models created.
5) Diagnose model validity with standard accuracy parameters (R-square, RMSE, MAPE, MAE, and normalized Bayesian information criterion).
6) Perform a forecast based on the predetermined model.
7) Regroup the time-series result, which is then used as input for the NNV methods.
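Steps 1) and 7) above can be sketched as a pair of inverse operations. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the series is assumed to start at 00:00, and two samples per hour are assumed from the 30-minute resolution. The PCA and SARIMA steps in between are omitted.

```python
def cluster_by_hour(series, samples_per_hour=2):
    """Split a time series into 24 groups keyed by hour of day (0..23)."""
    clusters = [[] for _ in range(24)]
    for i, value in enumerate(series):
        hour = (i // samples_per_hour) % 24
        clusters[hour].append(value)
    return clusters

def regroup(clusters, samples_per_hour=2):
    """Inverse of cluster_by_hour: rebuild the chronological series."""
    positions = [0] * 24
    total = sum(len(c) for c in clusters)
    series = []
    for i in range(total):
        hour = (i // samples_per_hour) % 24
        series.append(clusters[hour][positions[hour]])
        positions[hour] += 1
    return series

# Two days of 30-minute samples, labelled by their index for clarity.
two_days = list(range(96))
clusters = cluster_by_hour(two_days)
restored = regroup(clusters)
```

Each cluster then holds the samples of one hour of the day across all days, so the per-hour models see a day-to-day sequence rather than the raw intraday series.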

B. OVERVIEW OF NNV METHOD
The Neural Network (NN) method is designed with performance characteristics similar to the human nerve impulse delivery process [28]. The method continues to develop along with its ability to solve problems in various fields, such as voice recognition, face identification, and weather prediction. This study uses the NNV methods commonly applied to time series analysis and forecasting, namely: the Non-linear Autoregressive Neural Network (NARNET) [28], Multi-Layer Perceptron Back Propagation (MLPBP) [14], Wavelet Neural Network (WNN) [29], and Long Short-Term Memory (LSTM) [30]. In general, the network architecture used in an NNV consists of an input layer, a hidden layer, and an output layer. Each layer is made up of units called neurons, which are connected to the neurons in adjacent layers. These relationships and their weights and biases are formulated as shown in (10) and (11) [31].
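$$net_j = \sum_{i} w_{ij}\,x_i + b_j \tag{10}$$
$$y_j = f(net_j) \tag{11}$$

where $w_{ij}$ is the weight from input $i$ to hidden neuron $j$, $y_j$ is the neuron output, $x_i$ is the neuron input, $b_j$ is the bias, and $f$ is the activation function.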
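The weighted-sum-plus-bias relationship described above can be sketched for one layer as follows. The function name, the `weights[j][i]` layout, and the choice of tanh as default activation are assumptions made for this illustration; the activation functions actually used by the tested NNVs are not stated here.

```python
import math

def layer_forward(inputs, weights, biases, activation=math.tanh):
    """Forward pass of one layer.

    weights[j][i] is the weight from input i to neuron j;
    biases[j] is the bias of neuron j.
    """
    outputs = []
    for w_j, b_j in zip(weights, biases):
        net_j = sum(w * x for w, x in zip(w_j, inputs)) + b_j  # weighted sum
        outputs.append(activation(net_j))                      # activation
    return outputs

# Two inputs feeding three hidden neurons (illustrative numbers only).
hidden = layer_forward(
    inputs=[0.5, -1.0],
    weights=[[0.2, 0.4], [-0.3, 0.1], [0.0, 0.0]],
    biases=[0.1, 0.0, 0.0],
)
```

Stacking such layers, with the output of one used as the input of the next, gives the input, hidden, and output structure described in the text.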
In the training phase, an iterative learning process takes place. The network processes the existing input and output data, continually adjusting weights and biases until the mismatch falls below the allowable tolerance or the number of iterations reaches the maximum epoch. Model validation is carried out after training is complete to ensure that the process produces accurate output. The complete steps in this phase are as follows: 1) Preparation of SARIMA output data as input for the selected NNV methods (NARNET, MLPBP, WNN, LSTM).

C. OVERVIEW OF ACCURACY EVALUATION METHOD
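Several accuracy evaluation methods were adopted in this study [22], namely: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). These accuracy indices are calculated as follows:

$$MAE = \frac{1}{n}\sum_{t=1}^{n}\left|e(t)\right| \tag{12}$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e(t)^{2}} \tag{13}$$
$$MAPE = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{e(t)}{y_a(t)}\right| \tag{14}$$

where $e(t)$ is the difference between the actual value $y_a(t)$ and the forecasted value at time $t$, and $n$ is the number of samples. MAE, RMSE, and MAPE are statistical parameters commonly used to quantify forecast error in time series analysis in absolute, average, and relative terms.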
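As a minimal sketch, the three accuracy indices can be computed with the standard definitions below. The function names and sample values are hypothetical, and MAPE is expressed in percent and is undefined when an actual value is zero.

```python
import math

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Raises ZeroDivisionError if any actual value is zero.
    """
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Illustrative values only.
actual = [2.0, 4.0]
forecast = [1.0, 6.0]
scores = (mae(actual, forecast), rmse(actual, forecast), mape(actual, forecast))
```

Because variability and uncertainty series fluctuate around zero, MAPE can become very large or undefined for individual samples, which is one reason to read it together with MAE and RMSE.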

D. OVERVIEW OF DCL PARAMETER
Using a statistical approach, historical data of wind variability and uncertainty are used to determine the size of the additional operating reserves needed in the penetrated system. This commonly relies on the n-sigma criterion [15], [32]. If wind variability and uncertainty are assumed to be uncorrelated, the wind intermittency sigma ($\sigma_w$) of the time series data is:

$$\sigma_w = \sqrt{\sigma_v^{2} + \sigma_u^{2}} \tag{15}$$

where $\sigma_v$ is the sigma of wind variability and $\sigma_u$ is the sigma of wind uncertainty. The greater the $\sigma_w$ value, the greater the additional reserves that need to be provided. In this study, the amount of this reserve associated with the $\sigma_w$ magnitude is defined as the Confidence Level (CL), since it shows the level of confidence of the penetrated system. In [2], [3], [29], the Static Confidence Level (SCL) is used, which assumes a fixed sigma magnitude over the period. In this study, the Dynamic Confidence Level (DCL) is introduced, which comes from the hourly $\sigma_w$ magnitudes that vary over the observed period. These dynamic quantities are obtained from the variability and uncertainty variables forecasted using the hybrid AI method. The Probability Density Function (PDF) of the observed or forecasted data can be used to determine the sigma magnitude of its variability and uncertainty. Fig. 5 describes the PDF of a normal distribution [33]. It shows the relation between the size of the sigma multiplier ($k$) and the probability of occurrence (percentile). The greater the value of the sigma multiplier ($k$), the greater the percentile. In the context of WPP penetration, the additional operating reserve ($R_{add}$), as shown in (16), is the $\sigma_w$ magnitude of the observed data times the multiplier ($k$) of the desired percentile level.

$$R_{add} = k \cdot \sigma_w \tag{16}$$

The larger the percentile chosen, the more of the WPP fluctuation is covered (the penetrated system is more stable and reliable), but consequently the more additional operating reserves need to be prepared.
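The calculation in (15) and (16) can be sketched with the Python standard library. Two points are assumptions of this sketch rather than statements from the paper: the function names are hypothetical, and the percentile is mapped to the sigma multiplier with a two-sided normal coverage (a fraction p of outcomes within plus or minus k sigma). A one-sided convention would give smaller multipliers.

```python
import math
from statistics import NormalDist

def intermittency_sigma(sigma_variability, sigma_uncertainty):
    """Combined sigma of two uncorrelated components, as in (15)."""
    return math.hypot(sigma_variability, sigma_uncertainty)

def sigma_multiplier(percentile):
    """Sigma multiplier for a coverage fraction in (0, 1).

    Assumption: two-sided coverage of a normal distribution.
    """
    return NormalDist().inv_cdf((1.0 + percentile) / 2.0)

def additional_reserve(sigma_variability, sigma_uncertainty, percentile):
    """Additional operating reserve as multiplier times sigma, as in (16)."""
    sigma = intermittency_sigma(sigma_variability, sigma_uncertainty)
    return sigma_multiplier(percentile) * sigma

# Illustrative per-unit sigmas (not taken from the observed data).
reserve_p90 = additional_reserve(0.03, 0.04, 0.90)
reserve_p99 = additional_reserve(0.03, 0.04, 0.99)
```

With these illustrative inputs the combined sigma is 0.05 p.u., and moving from the 90th to the 99th percentile raises the multiplier and therefore the reserve, which is the trade-off described above.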

IV. RESULTS OF THE SIMULATION
This section presents the results obtained from the proposed methodology using SDVU based on hybrid AI. Observed data were taken from the aggregation of the southern Sulawesi WPPs over a one-year period covering the dry and wet seasons of Indonesia's tropical climate. Forecasting models were developed using SPSS modeler software for the SARIMA method [34] and MATLAB statistics and machine learning software for the NNV method [35]. The results are presented as follows.

A. RESULT OF SARIMA METHOD
This section presents the forecasting results obtained from the SARIMA method. The first stage is clustering the variability and uncertainty time-series data into 24 clusters according to the existing time scaling (1-hour interval). Then, the PCA process is carried out on the data to reduce the white noise characteristic of the stochastic data. SARIMA model parameters are then obtained automatically from the expert modeler feature in the SPSS software. The selection of the best-fitting model refers to the ACF and PACF as well as statistical accuracy parameters such as R-square, MAE, RMSE, MAPE, and normalized Bayesian information criterion. The best SARIMA configuration is displayed in Table IV. It shows that the weekly cycle on a daily basis ($s$ = 7) is the best seasonal period for the observed data.

B. RESULT OF NNV METHOD
NNV methods use the SARIMA output as input data. Then, the initial configurations and options are determined. For all NNV models, the database is separated into two groups: the first 80% for training and the last 20% for testing. Fig. 6 illustrates hourly variability and uncertainty on a daily basis for the observed and forecasted values of each tested hybrid AI method. In the next section, SARIMA+LSTM is used as the proposed hybrid AI method to calculate DCL and additional operating reserves using the SDVU scheme.
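The 80%/20% separation described above is chronological, so the test set is the most recent part of the series. A minimal sketch follows, with a hypothetical function name and an assumed truncation rule (`int`) for the cut point, which the paper does not specify.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series without shuffling: first part trains, last part tests."""
    cut = int(len(series) * train_fraction)  # truncation rule is an assumption
    return series[:cut], series[cut:]

# With the 17,568 points per variable stated for the database:
train, test = chronological_split(list(range(17568)))
```

Keeping the order intact avoids training on samples that come after the ones being tested, which matters for forecasting models.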

C. RESULT OF DCL CALCULATION
By using (15), (16), Table II, and the forecasted results of the proposed hybrid AI method (SARIMA+LSTM), we calculate DCL over a range of percentile magnitudes. Fig. 7 shows SCL versus DCL in determining the average additional daily operating reserve in per unit (p.u.) of the observed data for a range of percentile values. SCL uses the maximum σ magnitude from the previous day, while DCL uses the forecasted ones from the proposed hybrid AI method. Furthermore, this paper evaluates the cost-effectiveness of DCL by comparing its results with those of SCL. The comparison is made by using Table I. Table VII shows the associated energy saving for each percentile scenario.
It can be inferred that using SDVU-based DCL can produce significant savings at the same level of reliability compared to SCL. Even at P-99, DCL requires an additional operating reserve of only about one-fourth of that required by SCL at P-90.
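The reason a dynamic level needs less reserve than a static one can be seen in a small sketch. It is a simplification under stated assumptions: the function names are hypothetical, the hourly sigma values are invented, and the static level is sized from the largest sigma of the supplied profile, whereas the paper takes the maximum from the previous day.

```python
def static_reserve(hourly_sigma, k):
    """SCL: one fixed reserve level, sized by the largest sigma, for every hour."""
    return [k * max(hourly_sigma)] * len(hourly_sigma)

def dynamic_reserve(hourly_sigma, k):
    """DCL: the reserve follows the sigma of each hour."""
    return [k * s for s in hourly_sigma]

def reserve_energy_saving(hourly_sigma, k):
    """Reserve energy avoided by DCL relative to SCL over the profile.

    With hourly steps and per-unit sigmas this is in p.u.-hours.
    """
    return sum(static_reserve(hourly_sigma, k)) - sum(dynamic_reserve(hourly_sigma, k))

# Illustrative four-hour profile (integers chosen to keep the arithmetic exact).
profile = [1, 2, 3, 2]
saving = reserve_energy_saving(profile, k=2)  # SCL holds 24, DCL holds 16
```

The saving is zero only when the sigma is the same in every hour, so the more the forecasted intermittency varies over the day, the larger the advantage of the dynamic level.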

D. USER INTERFACE APPLICATION
Since the results of this research were used directly by users at the Southern Sulawesi Load Dispatch Center, we created a simple user interface application for daily operational purposes. With this application, the user can choose the desired day and change the peak load, load factor, and composition of peakers, load-followers, baseloaders, and WPPs as needed. Based on these choices, the stiffness of the system is calculated using a statistical approach. Based on the system-level approach, the optimal percentile, total additional energy requirements, and additional hourly operating reserve composition are calculated for the selected day. The application is designed so that it can be used generally in Indonesia, which has a typical tropical climate similar to conditions in South Sulawesi. Fig. 9 shows an example of the application display for calculating the need for additional operating reserves on a Thursday, the day with the highest peak load in each week.

V. CONCLUSION
This paper has demonstrated the implementation of the Seasonal Daily Variability and Uncertainty (SDVU) scheme to determine the optimal additional operating reserve in a power system due to the penetration of WPPs. The proposed hybrid AI method uses the strengths of SARIMA and LSTM, which together capture linear and nonlinear system characteristics. It is the best hybrid AI combination among all tested NNV methods in terms of accuracy and consistency.
The research question posed in the Introduction has been answered: the proposed method is shown to give more economical additional operating reserves at the same level of system reliability during WPP penetration. With a Dynamic Confidence Level (DCL), it may save up to 4.2% of the total energy compared to a Static Confidence Level (SCL). To complement the results of this study, an interface application is added so that the method can be used in the southern Sulawesi power system and, more generally, in other areas of Indonesia or around the world with similar tropical climate characteristics. Further work is to determine system-level operating reserves in the observed system based on hybrid AI that considers load fluctuation as well as uncertainties of generation and transmission systems.