RECLAIM: Renewable Energy Based Demand-Side Management Using Machine Learning Models

Diesel generator sets (DGs) and battery storage systems (BSS) are essential energy sources in modern high-rise buildings. In this paper, DGs, BSS and a photovoltaic (PV) system are considered in order to minimize grid power injection using a centralized Energy Management System (EMS). The performance of various Machine Learning (ML) regression models is evaluated by comparing grid power and load curves. The models include Artificial Neural Network (ANN), Wide Neural Network (WNN), Linear Regression (LR), Linear Regression Interaction (LR-I), Linear Regression Stepwise (LR-S), Regression Fine Tree (RF-T), Regression Coarse Tree (RC-T) and Gaussian Process Regression (GPR) based techniques. Demand Side Management (DSM) techniques such as peak shaving and valley filling are integrated with the ML techniques in a hybrid energy source (HS) system. The comparative analysis of the results shows effective reshaping of the grid profile without scheduling or disconnecting loads. MATLAB simulation software is used to validate the results.


I. INTRODUCTION
Environmental constraints and limited reserves of fossil fuels have shifted the world's response in favor of renewable energy sources (RES). Solar and wind are the two major energy sources in this regard [1]. An energy management system (EMS) is required for the optimal operation of RES in a power network because of the unpredictable nature of solar and wind energy [2]. A stand-alone microgrid (MG) can be integrated with RES for stable operation in a power distribution system. It includes diesel generator sets (DGs) and battery storage systems (BSS) [3]. The MG must balance power generation and demand and also handle uncertainties in RES [4]. Therefore, implementation of an EMS is important for regulating hybrid energy sources (HS) efficiently [2]. These HS can be used to meet peak demand in high-rise buildings and can deliver power at a point of common coupling (PCC) in an AC bus system. In commercial and industrial buildings, DGs and BSS act as a backup supply during outages of the main supply [5], [6]. Artificial intelligence (AI) and ML based algorithms are now used to build prediction models for energy consumption [7], uncertainties in renewable energies [8] and demand-side management (DSM) based heating systems in buildings [9]. An EMS based MG using an ANN is developed in [10]; however, it does not consider any DSM technique. A comparative analysis of machine learning (ML) techniques for short-term demand prediction in MGs is reported in [11], whereas a deep learning ANN is compared with the support vector machine (SVM) algorithm to improve energy dispatch in MGs based on solar PV forecasting in [12]. A method based on multi-period automated reinforcement learning is proposed in [13] to forecast renewable energy generation and load. ML based regression models are developed and compared with a statistical method for forecasting electricity prices in [14].
The associate editor coordinating the review of this manuscript and approving it for publication was Szidonia Lefkovits.
EMS control methods using HS can be classified as a) centralized and b) decentralized. In a centralized control system, the EMS adjusts the power balance among the various HS in a distribution network [15]. In a decentralized control approach, it is difficult to control the various HS independently [16]. DSM is a technique that allows users to reduce energy consumption from the grid and maximize on-site energy generation to ensure the sustainability of power supply in a grid-connected system. It can be applied with different objectives, including load shifting, valley filling, time shifting and peak shaving [17]. Furthermore, it helps to reshape the load profile and reduce peak load demand in high-rise buildings [18]. Therefore, this paper proposes an EMS based on centralized control that uses Machine Learning (ML) techniques to predict reference power, ensuring grid power reduction and a continuous supply of electricity in high-rise buildings. The block diagram of the proposed system is given in Fig. 1. It consists of renewable energy, DGs, and BSS. The predictive performance of the models is evaluated by comparing the power curves generated using Artificial Neural Network (ANN), Wide Neural Network (WNN), Linear Regression (LR), Linear Regression Interaction (LR-I), Linear Regression Stepwise (LR-S), Regression Fine Tree (RFT), Regression Coarse Tree (RCT) and Gaussian Process Regression (GPR) based techniques. In Fig. 2, the load profiles show the real demand and the shaped demand after applying the peak shaving and valley filling DSM techniques.
The key contributions of this paper can be summarized as follows: 1) The proposed system allows better utilization of sources through accurate predictions that minimize the power drawn from the grid in the EMS. 2) A comparative analysis of Artificial Neural Network (ANN), Regression Tree (RT), and Linear Regression (LR) techniques indicates the prediction performance for which the parameters are tuned. 3) Approximately 120,000 model-generated samples are used to tune all the ML techniques, which improves the accuracy of the results. 4) The proposed regression models are based on the DSM techniques of peak shaving and valley filling.

II. PROPOSED METHODOLOGIES
A. MULTISOURCE ENERGY SYSTEM (MES)
MES gained attention following the rapid development of, and research in, renewable energy technology [19]. The interfacing of a three-phase grid-connected system with other sources at a point of common coupling is discussed in later sections.
To obtain high-quality electric power, these sources must be interconnected with the grid under particular conditions. Interconnection of DGs, BSS, and BIPV/solar PV systems at a PCC can deliver steady generation, balancing demand and production during peak and off-peak hours. Maximum power is captured by modeling the BIPV system with MPPT, an inverter, and a DC-DC boost converter.

B. MACHINE LEARNING TECHNIQUES
In this paper, we provide only an overview of the regression-based ML techniques to explain how they operate. Four distinct energy sources, along with the load profile, are simulated with a 0.2 ms time step for a total period of 24 seconds, which is scaled to represent one day (24 hours). The model-generated data set, based on time and power magnitudes, is used as a reference to control the diesel generator and battery power output. In addition, this paper investigates various ML approaches for reducing grid power injection with an already available data set. The rest of this section gives a brief introduction to the ML algorithms, data sets, and performance analyses of the techniques discussed earlier.
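The time scaling described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes a simple linear mapping in which one simulated second stands for one hour of the day. It shows that a 0.2 ms step over 24 s yields the roughly 120,000 samples mentioned in the contributions.

```python
# Illustrative sketch; names are hypothetical, values come from the text.
SIM_DURATION_S = 24.0   # total simulated time in seconds
STEP_S = 0.2e-3         # 0.2 ms simulation step
HOURS_PER_DAY = 24.0


def sample_count(duration_s: float = SIM_DURATION_S, step_s: float = STEP_S) -> int:
    """Number of samples produced by a fixed-step simulation."""
    return round(duration_s / step_s)


def sim_time_to_hour(t_sim_s: float) -> float:
    """Map simulated seconds onto hours of the represented day.

    Assumption: a linear mapping, so 1 simulated second equals 1 hour.
    """
    return t_sim_s * HOURS_PER_DAY / SIM_DURATION_S


n_samples = sample_count()
hour_at_17s = sim_time_to_hour(17.0)
print(n_samples, hour_at_17s)
```

Under this mapping, a simulated time of 17 s corresponds to 5:00 PM of the represented day.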

1) ARTIFICIAL NEURAL NETWORKS (ANN)
Artificial Neural Networks (ANNs) are computational networks inspired by biology and are widely adopted for a broad range of complex problems. An ANN artificially mimics the functioning of the human brain by forming connections between different network nodes, where each node is equivalent to a biological neuron in the human brain. ANNs are supervised neural networks that consist of three types of layers: an input layer, one or more hidden layers, and one or more output layers. The input data processing in ANN-based models is structured as a series of layers that mimic the way the brain operates, and the regression is structured similarly. To communicate within a network, nodes are linked together, and each node takes input data, performs computations and passes the results to other nodes. The output of each node is passed through an activation function; the result is known as the activation, or simply the node output. The interconnecting links represent the weights of the artificial network. There are two types of ANNs: feedforward and feedback neural networks. The information flow in a feedforward neural network is unidirectional from input to output, while a feedback neural network also involves an additional feedback loop. Fig. 3 shows the architecture of a feedforward neural network. During training, the weights are adjusted at each iteration using a backpropagation mechanism. The neural network models in the Statistics and Machine Learning Toolbox are fully connected feedforward networks in which the activation functions of the layers can be varied and the dimensions of the connecting layers adjusted. Typical backpropagation involves minimizing a loss function using the gradient descent method.
VOLUME 11, 2023
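The squared error function is given as

$$E = \frac{1}{2}(l - y)^2,$$

where $E$ denotes the squared error, $l$ denotes the label and $y$ denotes the output of the network. For a single neuron $j$ with output $O_j$,

$$O_j = a(z_j), \qquad z_j = \sum_i W_{ij} X_i,$$

where $W_{ij}$ denotes the weight on the connection from input $i$ to neuron $j$ and $X_i$ are the inputs to neuron $j$. $a(z)$ denotes the activation function for input $z$, and its derivative is written $a'(z) = \mathrm{d}a/\mathrm{d}z$. By the chain rule,

$$\frac{\partial E}{\partial W_{ij}} = \frac{\partial E}{\partial O_j}\, a'(z_j)\, X_i = \delta_j X_i, \qquad \delta_j = \frac{\partial E}{\partial O_j}\, a'(z_j).$$

To update the weights using the gradient descent method during backpropagation, the term $-\eta\, \partial E / \partial W_{ij}$ is added to the old weight to calculate the new weight:

$$W_{ij} \leftarrow W_{ij} - \eta \frac{\partial E}{\partial W_{ij}},$$

where $\eta > 0$ denotes the learning rate. A positive $\partial E / \partial W_{ij}$ indicates that an increase in $W_{ij}$ raises the error, and a negative $\partial E / \partial W_{ij}$ indicates that an increase in $W_{ij}$ decreases the error. $\delta_j$ denotes the error term of neuron $j$.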
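The weight update rule above can be illustrated with a single neuron trained by gradient descent on the squared error. This is a minimal sketch, not the Levenberg-Marquardt training used in the paper: the logistic activation, the bias term, the learning rate and the toy OR data set are all assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation a(z); an assumed choice, the text leaves a(z) generic."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Neuron output O_j = a(sum_i W_ij * X_i + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def total_error(samples, w, b):
    """Sum of E = 0.5 * (l - y)^2 over all samples."""
    return sum(0.5 * (label - predict(w, b, x)) ** 2 for x, label in samples)


def train_neuron(samples, eta=0.5, epochs=2000):
    """Gradient descent: W_ij <- W_ij - eta * dE/dW_ij for one neuron."""
    w = [0.0] * len(samples[0][0])
    b = 0.0  # bias term, an addition not present in the text
    for _ in range(epochs):
        for x, label in samples:
            y = predict(w, b, x)
            # delta_j = dE/dO_j * a'(z_j) = (y - l) * y * (1 - y) for the sigmoid
            delta = (y - label) * y * (1.0 - y)
            # dE/dW_ij = delta_j * X_i
            w = [wi - eta * delta * xi for wi, xi in zip(w, x)]
            b -= eta * delta
    return w, b


# Toy data: logical OR of two inputs.
or_samples = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
error_before = total_error(or_samples, [0.0, 0.0], 0.0)
weights, bias = train_neuron(or_samples)
error_after = total_error(or_samples, weights, bias)
print(error_before, error_after)
```

The error after training is lower than before, which is the behavior the sign discussion above describes: each step moves every weight against its error gradient.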
The construction and operation of ANNs are presented in more detail, together with an overview of ANNs, in [20]. Once an ANN model has been trained, it can be used to predict responses for new predictor data. In this paper, the base model is trained using the Levenberg-Marquardt algorithm, also known as the damped least-squares (DLS) method. Moreover, a Wide Neural Network is used to tune the ANN (Levenberg-Marquardt) based data to obtain the best results. Wide neural networks (WNN) are networks with fewer hidden layers (usually 1-2) but more neurons per layer. These networks can be useful when little data is available and the problem is not too complex. A single hidden layer with many neurons can detect simple patterns (simple classification and regression problems) in the data set but will fail when expected to detect complex relations (image detection, speech recognition, etc.).

2) REGRESSION TREE
Decision trees are supervised machine learning algorithms that offer a prediction framework for regression and classification. These algorithms establish a relationship between inputs, decisions, predictions, and outcomes. Modeling the prediction framework for decision trees requires repeatedly splitting the data set into two subsets. The splitting begins at the head node, known as the root node. This separation continues until every sample is assigned to a leaf node based on a termination criterion. The node at which splitting occurs is the parent node, and the two resulting nodes are child nodes.
The binary split procedure is applied repeatedly to each child node until a termination criterion is met. Terminal leaves are nodes that are not partitioned further. After a large tree has been grown, a pruning method is used to eliminate the leaves that contribute little to the purity enhancement [21]. A linear regression model can be developed for each leaf to improve model fit [22]. Fig. 4 shows the conventional decision tree model.
Decision trees employ several measures to select the best feature split from the top of the tree to the leaf nodes. The successive divisions systematically decide which sub-tree has a better value than the previous tree. The algorithm selects the split that reduces the residual sum of squares. The process repeats itself for the subsequent splits. For regression, the target values of the decision tree framework are continuous.
Purity is frequently defined through the variance of the output variable around its mean value. For regression, a commonly used criterion is the residual sum of squares between the targets and the average response of the samples in that tree. It is defined as

$$\rho = \sum_{i=1}^{N} (t_i - y_i)^2,$$

where $\rho$ is the residual sum of squares, $t_i$ is the target value for input sample $x_i$, and $y_i = f(x_i)$ denotes the value predicted by the tree function $f$ for input sample $x_i$. $N$ is the total number of samples. To find a better optimized tree structure, evolutionary learning of globally optimal classification and regression trees [23] uses evolutionary algorithms to build trees while considering not just the next split but also potential splits further down the tree. Other work in the literature [24] uses genetic programming approaches to induce tree models. The major purpose of executing a regression analysis once the regression model has been trained is to produce a more exact forecast of the level of the output variables for new samples or data.
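The split selection described above can be sketched for a single numeric feature. The code below is a simplified illustration, not the toolbox implementation used in the paper: it tries every threshold between consecutive sorted feature values and keeps the one with the lowest combined residual sum of squares of the two child nodes, each predicting its own mean. The function names and the toy hour/power data are hypothetical.

```python
def rss(values):
    """Residual sum of squares of the values around their mean (the leaf prediction)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_rss) for the binary split that minimizes RSS.

    Returns (None, rss(ys)) when no split improves on the unsplit node.
    """
    pairs = sorted(zip(xs, ys))
    best_threshold, best_total = None, rss(list(ys))
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # equal feature values cannot be separated by a threshold
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        total = rss(left) + rss(right)
        if total < best_total:
            best_threshold = 0.5 * (pairs[k - 1][0] + pairs[k][0])
            best_total = total
    return best_threshold, best_total


# Toy data: hour of day versus power, with a clear low and high regime.
hours = [1, 2, 3, 18, 19, 20]
power = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]
threshold, split_rss = best_split(hours, power)
print(threshold, split_rss)
```

Applying the same search recursively to each child node, until a termination criterion is met, grows the full tree.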

3) LINEAR REGRESSION (LR)
LR is used to predict a continuous response by fitting a linear function to the data. It may use several predictor variables that can be either categorical or numerical. A basic assumption of linear regression is that the relationship between the predictors and the response variable is linear. Linear regression with interactions (LR-I) applies when an interaction effect is present; in that case, we add the assumption that the relationship between predictor and response is linear regardless of the level of the moderator. The regression method can be improved through the choice of attribute selection method. Thus, for power demand prediction purposes, it is sufficient to apply the LR method.
Linear regression Stepwise (LR-S) is the step-by-step iterative construction of a regression model that involves the selection of independent variables to be used in a final model. It involves adding or removing potential explanatory variables in succession and testing for statistical significance after each iteration.
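The difference between a plain linear model and one with an interaction term can be made concrete with a small least-squares fit. The sketch below is an illustration under stated assumptions: it uses two predictors, solves the normal equations with a hand-written elimination routine, and fits synthetic data generated from known coefficients. None of the names or values come from the paper.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, ys, interaction=False):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2, plus b3*x1*x2 if requested."""
    design = [[1.0, x1, x2] + ([x1 * x2] if interaction else []) for x1, x2 in rows]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)


def sum_squared_residuals(rows, ys, coef):
    """Sum of squared residuals for a fitted coefficient vector."""
    total = 0.0
    for (x1, x2), y in zip(rows, ys):
        terms = [1.0, x1, x2] + ([x1 * x2] if len(coef) == 4 else [])
        total += (y - sum(c * t for c, t in zip(coef, terms))) ** 2
    return total


# Synthetic data with a genuine interaction effect.
rows = [(float(a), float(b)) for a in range(3) for b in range(3)]
ys = [1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2 for x1, x2 in rows]
coef_lr = fit_linear(rows, ys)
coef_lri = fit_linear(rows, ys, interaction=True)
print(coef_lri)
```

On this data the interaction model recovers the generating coefficients, while the plain model leaves a residual because it cannot represent the product term.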

4) GAUSSIAN PROCESS REGRESSION (GPR)
Gaussian process regression (GPR) models are nonparametric kernel-based probabilistic models [25]. When the observations are noise free, the predicted responses of the GPR fit pass through the observations, the standard deviation of the predicted response is almost zero, and the prediction intervals are therefore very narrow. When the observations include noise, the predicted responses do not pass through the observations, and the prediction intervals become wide.
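The noise-free versus noisy behavior described above follows directly from the GPR predictive equations. The sketch below is a minimal illustration, assuming a squared-exponential kernel with unit length scale and three made-up training points; the function names are hypothetical and this is not the toolbox model used in the paper.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel (an assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def gpr_predict(xs, ys, x_new, noise_var=0.0):
    """Posterior mean and variance of the latent function at x_new."""
    k = [[rbf(xi, xj) + (noise_var if i == j else 0.0)
          for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
    k_star = [rbf(x_new, xi) for xi in xs]
    alpha = solve(k, list(ys))   # (K + noise_var * I)^-1 y
    v = solve(k, k_star)         # (K + noise_var * I)^-1 k_star
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, max(var, 0.0)


xs = [0.0, 1.5, 3.0]
ys = [1.0, 2.0, 0.5]
mean_clean, var_clean = gpr_predict(xs, ys, 1.5)                  # noise-free
mean_noisy, var_noisy = gpr_predict(xs, ys, 1.5, noise_var=0.5)   # noisy observations
print(mean_clean, var_clean, mean_noisy, var_noisy)
```

With no noise the prediction at a training input reproduces the observation with essentially zero variance; with a nonzero noise variance the mean is pulled away from the observation and the variance grows.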

C. PERFORMANCE METRICS
Performance metrics play a key role in validating the algorithms applied to a data set. The ML techniques used in this research were chosen so as to minimize the possibility of bias in the metrics under evaluation [26]. Each metric may also provide distinct information about an algorithm's performance. To present a more comprehensive picture of algorithm performance, the ML models are compared using a range of important performance metrics.

1) ROOT MEAN SQUARE ERROR
The root mean square error (RMSE) of a model on a data set is measured to check its accuracy. It is the square root of the average of the squared errors, calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (A_i - \hat{A}_i)^2},$$

where $A_i$ is the target value, $\hat{A}_i$ is the predicted value and $n$ is the number of samples. Because the RMSE squares the difference between the predicted and target values, a few substantial discrepancies will increase the RMSE significantly more than the MAE. As a result, the RMSE is sensitive to extreme values, making it suitable for studying models with outlying tendencies.

2) MEAN ABSOLUTE ERROR (MAE)
The mean absolute error (MAE) is a measure of the error between two variables that represent the same event. It is calculated as

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| A_i - \hat{A}_i \right|,$$

with $A_i$ the target value, $\hat{A}_i$ the predicted value and $n$ the number of samples. From this definition, an error-free model generates an MAE of zero, since $A_i = \hat{A}_i$; the MAE thus ranges from 0 to infinity, with 0 indicating an ideal model. For this reason, the MAE is an unbounded metric and is therefore data specific. Nevertheless, it remains a valuable metric for comparing models that are based on the same input data.

3) R-SQUARED PARAMETER
The R-squared value measures how closely the data points scatter around the fitted regression line. It is also known as the coefficient of determination. The R-squared parameter is expressed as a percentage and lies between 0% and 100%.
For the same data set, higher R-squared values represent smaller differences between the observed data and the fitted values.
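The three metrics discussed in this section can be computed side by side on a small example. The sketch below uses the standard definitions, with R-squared taken as one minus the ratio of the residual sum of squares to the total sum of squares; the function names and the four data points are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions; the last prediction is a large miss.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]
print(rmse(actual, predicted), mae(actual, predicted), r_squared(actual, predicted))
```

Here the single large error of 10 pushes the RMSE (about 5.2) well above the MAE (3.5), which illustrates why the RMSE is the more outlier-sensitive of the two.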

D. PREDICTION MODELS
The performance of the prediction models is a key component of ML techniques. It can be assessed through parameters such as RMSE, R-squared, MSE, MAE, prediction speed and training time obtained from the tuning process. The trained model results are then compared and used to find fitted values for new data. The ML techniques used here are limited to a single response at a time, based on the observations. Thus, in this paper the prediction models for the battery system and the diesel generator are built separately, with the same ML techniques and the same data set. The prediction model results, which indicate the expected model performance, are presented in the following sections.

1) GRAPHICAL AND TABULAR REPRESENTATION OF PREDICTION MODELS FOR BATTERY SYSTEM
The prediction model of the battery system aims to predict the optimized reference values needed to control the battery power accurately. The results are given in Table 1. The R-squared parameter shown in Table 1 indicates how far the observations deviate from, and scatter around, the perfect prediction line. The predictive model is used to generate the fitted values, which are then used to control the power drawn from the battery and DGSET. GPR performs best as far as these parameters are concerned, but it takes much longer to train and processes fewer observations per second. The regression plots shown in Figure 5 display the observations with reference to their perfect prediction line. The LR, LR-S and LR-I techniques all perform poorly: the plots show that the observations vary widely and deviate from the perfect prediction line. The R-squared parameter clearly illustrates how well each regression model fits the observations. The remaining techniques, such as WNN, RFT, RCT and GPR, fit the perfect prediction line closely. It can be expected that the results obtained by applying the trained models to new data will resemble their prediction model plots.
The regression plots for the battery system illustrate which technique may perform better for new data. Each implemented technique is shown with its corresponding observations and perfect prediction line; together these are referred to as regression plots.

2) GRAPHICAL AND TABULAR REPRESENTATION OF PREDICTION MODELS FOR DIESEL GENERATOR
In the proposed approach for the prediction model of the generator system, we use the same real data set as for the battery system to predict the optimized reference values needed to control the DG power. Table 2 presents the parameter values of the implemented models and their respective performance in terms of RMSE, R-squared, MSE, MAE, prediction speed and training time.
As presented in Table 2, GPR outperforms the other techniques. However, its training time and prediction speed (observations per second) compare poorly with those of the other prediction models. Its error-based performance criteria are also lower than those of the other ML techniques. Figure 6 shows the regression plots of the applied models. The LR, LR-I, LR-S and RCT techniques again perform below par, and the observations are widely dispersed from the perfect prediction line. This is also confirmed by the R-squared values of LR, LR-I and LR-S for the generator system, which reflect the dispersal of the observations.
The regression plots and the parameter tables are helpful for judging the expected performance of the overall model. The performance of the system may be better in practice, as the values are time based and variance may arise during model operation. As with the prediction models of the battery system, the linear regression models perform poorly for the given data set, while the neural network and GPR models perform best in both cases, for the battery and the diesel generator.

E. RUNTIME PERFORMANCE OF ALGORITHMS
In this paper, a runtime evaluation is performed on the ML algorithms for the complex data sets of the battery and DGSET. To ensure the validity of the results, the following conditions were met: 1) The same data sets were used to evaluate all three Machine Learning techniques (i.e. Regression Tree, Linear Regression and Neural Network). 2) The training parameters of the above-mentioned techniques were measured. 3) The operating system task manager on the PC was used to end all other foreground processes, to avoid any hindrance during the simulation. Moreover, all other processes running on the system were forcibly stopped to guarantee that no additional processing time was incurred.
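The two runtime quantities reported for each model, training time and prediction speed in observations per second, can be measured with a simple timing harness. The sketch below is a generic illustration rather than the procedure used in the paper: the helper names are hypothetical, and the trivial mean predictor stands in for a real model.

```python
import time


def measure_training_time(train, data):
    """Return (model, seconds) for one call to a user-supplied train function."""
    start = time.perf_counter()
    model = train(data)
    return model, time.perf_counter() - start


def measure_prediction_speed(predict, inputs, repeats=5):
    """Return observations per second, using the fastest of several passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            predict(x)
        best = min(best, time.perf_counter() - start)
    return len(inputs) / best if best > 0 else float("inf")


def train_mean(targets):
    """Hypothetical stand-in model: always predicts the mean training target."""
    mean = sum(targets) / len(targets)
    return lambda x: mean


model, train_seconds = measure_training_time(train_mean, [1.0, 2.0, 3.0])
obs_per_second = measure_prediction_speed(model, list(range(10000)))
print(train_seconds, obs_per_second)
```

Taking the fastest of several passes reduces the influence of other processes on the measurement, which is the same concern that motivates closing foreground processes in condition 3.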

F. PREDICTION MODEL GOALS
The aim of this research is to predict and minimize the power injection from the grid and to control the power of the sources efficiently during peak and off-peak hours. For this purpose, a data set is extracted from the ANN based model and then used with feature engineering and classification. Performance metrics are the most important means of assessing the system. The techniques applied for the above-mentioned prediction purpose are expected to work efficiently, with better accuracy and faster response.
FIGURE 5. Observations and prediction plot for battery system using different regression models.

G. DATASET
This section discusses the generation of the base data, which is further tuned with machine learning algorithms such as ANN, linear regression, regression tree and wide neural network. We consider the hourly system demand over a 24-hour period. For the grid and load profiles, we collected data using an energy analyzer at the BF Tower building in Islamabad. The solar PV profile is taken from the 300 kW system installed near BF Tower; hourly data is collected from the AiSWEI APP, which allows data to be monitored on an hourly, daily and yearly basis. This application also transmits the data to the AiSWEI Cloud platform via the Internet so that users can remotely monitor their photovoltaic power stations and inverters through cellular phones. In addition, the battery and DGSET reference power is generated randomly using a MATLAB function in which, by logic, power can only be drawn from the battery and DGSET during peak hours. This paper presents ANN based DSM using a feed-forward neural network and the Levenberg-Marquardt algorithm for training the ANN. The output of the ANN is further tuned using the seven distinct techniques to obtain more optimized fitted values.

III. SIMULATION RESULTS
The MES, consisting of HES and loads, is designed to meet increasing electricity demand in a building. A simulation model based on Fig. 1, consisting of solar PV, DG, grid, BSS and loads, is given in Figure 7. The EMS primary control block and the energy storage (PQ) model are shown with their controls.
The main grid has an output voltage of 33 kV/11 kV. The proposed system is designed in grid-connected mode, and the BSS can be charged or discharged based on the reference power values given by the ML techniques. Interconnecting these HES at a PCC on the AC bus can deliver steady generation and balance demand and supply during peak and off-peak hours. Maximum power is captured by modeling the solar PV with MPPT, an inverter, and a DC-DC boost converter. A bidirectional DC/DC converter is also used to interface the BSS. The phase-locked loop (PLL) synchronization technique is used to detect grid voltages, phases, amplitudes, and frequency under typical grid conditions. A PQ control approach is used to regulate the active and reactive power outputs of the HES. During off-peak hours, excess solar PV power can be used to serve the load and charge the batteries. The battery is initially charged and is discharged during peak hours, from around 5:00 PM to 10:00 PM. DGs are used during peak hours to reduce the peak load demand. The power curves of the DG, solar PV and BSS extracted using the Artificial Neural Network (ANN) are shown in Figure 8. Table 3 contains the load profile taken from the BF Tower high-rise building project. As described earlier, a realistic load power curve is required to check system performance in a real-time scenario. The load power curve is shown in Figure 9.
The DG power and battery power are further tuned with the ML techniques, while all other profiles remain the same for all techniques. As described earlier, the main aim of this paper is to reduce the injection of power from the grid. Figures 10(a) and 10(b) present the battery and DG power results with the other techniques.
Initially, the system is trained using the ANN based Levenberg-Marquardt algorithm on 120,000 samples of source and load profiles recorded over time. The model-generated samples are then further tuned with the proposed machine learning techniques. The research focuses on energy source and demand-side data that are reflective of a typical smart grid. The cumulative curves of ANN, WNN, LR, LR-I, LR-S, RFT, RCT and GPR are shown in Figure 8(a-h). For performance evaluation, two more ML techniques, GPR and RCT, are implemented to check the response of the system. Seven distinct ML techniques are applied to obtain the optimized power curves and to minimize the grid profile over the period.

A. PERFORMANCE EVALUATION
ML techniques are used in the simulation to reduce the peak load demand. These techniques are widely applied as some of the most effective load management techniques. They take advantage of the time independence of sources and shift peak-period loads to off-peak hours. Peak shaving and valley filling based on grid power and load are shown in Figure 9.
The figure clearly shows that, owing to the successful implementation of the ML techniques, the load peaks are reduced and the valleys are filled using the available on-site sources. Table 4 shows the power output results of all techniques for time values from 17 to 17.009.
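The effect of peak shaving and valley filling on the grid profile can be illustrated with a simple rule-based sketch. It is not the ML-based reference prediction of this paper: the thresholds, limits, function name and five-step profile below are hypothetical values chosen only to show how on-site sources reshape grid power while the load itself stays unchanged.

```python
def reshape_grid_profile(load_kw, pv_kw, peak_limit_kw, valley_floor_kw,
                         onsite_max_kw, charge_max_kw):
    """Return (grid_kw, onsite_kw) per time step.

    onsite_kw > 0: battery/DG supply power (peak shaving).
    onsite_kw < 0: the battery charges from the grid (valley filling).
    """
    grid, onsite = [], []
    for load, pv in zip(load_kw, pv_kw):
        net = load - pv  # demand left after on-site PV generation
        if net > peak_limit_kw:
            supply = min(net - peak_limit_kw, onsite_max_kw)
        elif net < valley_floor_kw:
            supply = -min(valley_floor_kw - net, charge_max_kw)
        else:
            supply = 0.0
        onsite.append(supply)
        grid.append(net - supply)
    return grid, onsite


# Hypothetical five-step profile in kW.
load = [200.0, 300.0, 900.0, 1000.0, 400.0]
pv = [0.0, 100.0, 100.0, 0.0, 0.0]
grid, onsite = reshape_grid_profile(load, pv, peak_limit_kw=600.0,
                                    valley_floor_kw=300.0,
                                    onsite_max_kw=350.0, charge_max_kw=100.0)
print(grid)
print(onsite)
```

In this example the grid peak drops from 1000 kW to 650 kW and the valley rises from 200 kW to 300 kW, without any load being scheduled or disconnected.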
To investigate the robustness and accuracy of the proposed model, a comparison is conducted as shown in Figure 11. For performance evaluation, the seven grid power curves obtained with the ML techniques are compared to check the reshaped grid profile. Table 5 presents the battery energy in kWh and shows that the already implemented ANN system utilizes 433.256 kWh, which means that less power is drawn from the available on-site sources. The Linear Regression technique draws more from the battery, namely 453.021 kWh. Table 5 also presents the DG energy in kWh and shows that the already implemented ANN system utilizes 4855.31 kWh, which means that less power is drawn from the on-site DG source. The Coarse Regression Tree reshapes the DG profile and extracts more from the DG, approximately 5423.381 kWh.
As described in earlier sections, the core purpose of implementing the ML techniques is to minimize the power injection from the grid and to maximize the utilization of the available on-site energy sources. The performance and accuracy of the system can also be evaluated in terms of energy in kWh. It can be seen in Table 5 that Linear Regression (LR) reduces the overall grid energy to 7346.103 kWh and the Coarse Tree to 7414.583 kWh, lower than the rest of the techniques. These techniques clearly show that power injection from the grid during peak hours can be reduced significantly without disconnecting or scheduling loads. This means that grid power injection can be minimized effectively during peak hours if the battery and DG are controlled by ML techniques.
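Energy figures in kWh such as those in Table 5 are obtained by integrating a power curve over time. The sketch below shows one common way to do this, the trapezoidal rule on uniformly spaced samples; the function name and the hourly sample values are hypothetical and are not taken from the paper's tables.

```python
def energy_kwh(power_kw, step_h):
    """Trapezoidal integral of a uniformly sampled power curve.

    power_kw: power samples in kW; step_h: spacing between samples in hours.
    """
    if len(power_kw) < 2:
        return 0.0
    interior = sum(power_kw[1:-1])
    return step_h * (0.5 * (power_kw[0] + power_kw[-1]) + interior)


# Hypothetical hourly grid power samples in kW.
grid_kw = [300.0, 300.0, 600.0, 650.0, 400.0]
grid_energy = energy_kwh(grid_kw, step_h=1.0)
print(grid_energy)
```

Comparing this integral across techniques, with the load held fixed, gives a single number per technique for how much energy is still drawn from the grid.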

IV. CONCLUSION
This paper proposed centralized demand-side management techniques, i.e. peak shaving and valley filling, aimed at reducing grid power injection and maximizing the utilization of hybrid energy sources in the building. ANN and machine learning techniques are implemented to reduce power injection from the grid. It is evident that Linear Regression (LR) and Regression Coarse Tree (RCT) outperform the other methods in terms of accurately predicting the reference grid power in a distribution system. The performance of the proposed microgrid is evaluated, with a view to its stable operation, by comparing the grid power and the predicted trained models using MATLAB software. The linear regression and regression tree methods reduce the grid power compared with the other regression models.