Real Time Demand Response Modeling for Residential Consumers in Smart Grid Considering Renewable Energy With Deep Learning Approach

Demand response (DR) modelling plays an important role in the smart grid. DR analysis concerns the scheduling of appliances to find an optimal strategy on the user's side under an effective pricing scheme. In the proposed work, the model is built in three steps. The first step develops strategy patterns for the users, considering the integration of renewable energy, and performs the demand response analysis. The second step models the learning process of the consumers using Robust Adversarial Reinforcement Learning (RARL) to preserve privacy among the users. The third step develops an optimal strategy plan that maintains privacy among the users. To account for the uncertainty in users' behavioral patterns, typical pricing schemes are combined with the integration of renewable energy on the user's side so that an optimal strategy is obtained. The optimal strategy for scheduling the appliances, which addresses privacy issues and considers renewable energy on the user's side, is obtained using RARL and the Gradient Based Nikaido-Isoda Function, which yields optimal accuracy. The results of the proposed work provide an optimal strategy plan for the users and a suitable learning paradigm. The effectiveness of the proposed work and its mathematical modelling is validated using real-time data, and the results show the demand response strategy plan together with the learning model. Among the set of strategies obtained, 80% of the patterns created with the learning paradigm follow optimal DR scheduling patterns. This work establishes the best learned DR pattern for future consumers to follow, so that privacy among the users can be maintained effectively.

INDEX TERMS Demand response, best strategy, robust adversarial reinforcement learning, renewable energy.

VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION
In recent years, residential electricity usage has increased considerably due to various factors, including home appliances and abrupt changes in climate conditions. To meet the increasing electricity requirement, renewable energy sources (e.g., solar, hydro, wind) are considered as supplemental energy sources. However, renewable energy sources are inconsistent and depend heavily on unpredictable weather conditions compared to conventional energy sources. Hence, the traditional electrical grid encounters many challenges in meeting the growing electricity demand. These challenges can be managed effectively by the smart grid through efficient utilization of renewable energy according to the demand of residential users, by controlling the production and distribution of electricity [1]. The primary objective of demand response (DR) is to minimize the electricity expenditure of the residential user by implementing an optimal strategy that decreases electricity usage for a limited period [2]. This can be realized with several DR approaches that shift residential appliance usage to low-demand hours on the user side and reduce peak load demand on the grid side [3], [4].
Researchers have proposed many DR strategies to reduce electricity usage during peak time and to balance the demand and supply of electricity in the smart grid [5]. In [6], different types of electrical assets, including electric vehicles, are considered, and mixed integer linear programming is applied to perform day-ahead DR scheduling of appliances that minimizes the electricity usage cost. However, the uncertainty of appliance usage and of electricity cost is not considered in that scheduling. An intelligent DR scheme with a mathematical model is proposed in [7] to schedule the electricity consumption of controllable appliances in smart households under a dynamic electricity pricing scheme. In [8], an adaptive consumption-level pricing-based demand response scheme is proposed for residential users, which lowers the electricity bill of 73% of customers. To handle uncertainty in the DR scheduling scheme, an intergeneration projection evolutionary algorithm is combined with a robust optimization method in [9] to decrease electricity usage under unpredictable user behavior. The smart residential community DR scheduling model presented in [10] efficiently schedules the electricity load demanded by the user under different price-based DR plans through interruptible load programs. A residential DR model based on the Demand Price Elasticity Matrix (DPEM) is proposed in [11] to determine cross-elasticities based on weighted load shifting and reduction targets, using real-world data collected from an electrical distribution utility in south Brazil. A DR model for a renewable-energy-based residential smart grid is introduced in [12], in which real-time electricity price prediction is performed by reformulating a non-convex problem as a convex one to ensure an optimal solution. In addition, a heuristic approach is adopted to achieve the best prediction of uncertain electricity prices.
The aforesaid approaches are model-based DR strategies, which require an appropriate optimization model comprising a predictor and a solver. These strategies also involve constructing the model and identifying its parameters. This procedure demands domain knowledge, and model inaccuracy may degrade the performance of the optimal scheduling strategy. In contrast to model-based DR strategies, learning-based DR strategies have gained attention among researchers because they eliminate the need to identify a model [13]. An optimal DR scheduling scheme for residential and small commercial buildings using the popular Q-learning reinforcement learning (RL) algorithm is recommended in [14]. Similarly, model-free batch RL based DR scheduling is proposed in [15] to predict the day-ahead schedule of a thermostat by formulating the decision-making process as a Markov decision process. An optimal scheduling scheme based on a binary backtracking search algorithm is presented in [16] to forecast the electricity demanded by various appliances. Like the earlier model-based schemes, these learning-based schemes fail to consider the inconsistent behavior of the user and varying electricity prices. A similar approach is followed in [42]; however, a mixed integer non-linear program (MINLP) is introduced to arrive at the customer's objective function instead of a traditional integrated method. It also gives the customer the flexibility to choose among different Stackelberg game-based solutions to maximize the satisfaction level. Owing to recent advances in artificial intelligence, reinforcement learning (RL) has gained attention as a solution to decision-making problems in the smart grid [43]. In the RL-based DR scheme proposed in [17], the fluctuating electricity cost is modelled as a discrete finite Markov decision process (MDP), and Q-learning is adopted to make the prediction.
Simulation results prove the effectiveness of the proposed method by providing a win-win strategy to both customer and utility company.
To balance energy supply and demand in the smart grid, a dynamic pricing DR strategy using RL is proposed in [17], in which the dynamic pricing problem is formulated as a discrete finite Markov decision process (MDP) and Q-learning is adopted to solve the decision-making problem. In [18], an optimal scheduling scheme for users is obtained by approximating the Markov perfect equilibrium of a fully observable stochastic game, using an actor-critic RL approach for online optimal scheduling of DR appliances. An approximate dynamic programming (ADP) approach with temporal difference learning is used in [19] for effective scheduling of household appliances under uncertainty. A real-time incentive-based DR algorithm using RL and a deep neural network (DNN) is applied in [20] to balance energy demand and supply in the smart grid. The DNN is used to forecast electricity price and load demand under unpredictable circumstances, and RL is then used to compute the optimal electricity price for different users while ensuring benefits for both the user and the service provider. A multi-agent RL based DR approach is suggested in [21] to make optimal decisions for scheduling several home appliances in a distributed manner, with an artificial neural network (ANN) used to handle uncertainties in future electricity pricing. Although the aforementioned approaches do not need an appliance model for optimal DR scheduling, they still require information about the varying electricity distribution of the appliances and its key features for the learning process. Adopting a deep RL (DRL) approach can address these problems through the end-to-end learning capability of deep neural networks [22]. This feature has motivated many researchers to develop optimal DR scheduling strategies based on DRL algorithms.
A deep Q-learning (DQN) based DR scheduling strategy is implemented in [23] to obtain an optimal charging strategy for an EV, considering the many constraints involved in the EV charging process. An online energy optimization technique is proposed in [24] for scheduling time-scaling and time-shifting loads using DQN and deterministic policy gradient (DPG) algorithms. A deep DPG (DDPG) based optimal DR strategy for controlling various household appliances is suggested in [25]. One of the main limitations of DRL approaches to DR scheduling is that they handle appliances with either discrete or continuous control actions, but not both. This can be overcome by combining trust region policy optimization (TRPO) with DRL to control both discrete and continuous appliances in a residence and schedule them optimally [26].
RL algorithms are generally based on creating new actions that help reach an optimal condition [24]-[28]. RARL algorithms are advantageous because the agent constantly creates new situations and adapts to the changing environment. This feature is useful in the smart grid environment of residential users. Table 1 summarizes the characteristics of related research in recent years. The work in [35] presents a predictive control algorithm with DR modelling, in which the analysis uses a rule-based predictive method. In [36]-[38], different machine learning algorithms are used, but only for a limited set of consumers. The work in [20] shows the importance of incentive-based DR modelling using reinforcement learning and an artificial neural network; the customers in its dataset are residential consumers only. In [39], the utility's profit is maximized with online pricing while also considering the customer's profit. The authors establish an online pricing scheme that depends on appliance usage under different load conditions, namely peak load and non-peak load. In [40], [21], a set of residential consumers is used to study customer behavioral patterns. In [20], [32]-[38], demand response modelling is established for different sets of residential consumer units. These works address the customer behavioral pattern and the two main objectives of maximizing the utility's profit and minimizing the cost on the consumer side. Appliances are scheduled according to the different load patterns of peak load and off-peak load, and the users create a scheduling pattern with a DR strategy.
Different machine learning algorithms [20], [21], [36]-[41] have been applied in the literature to obtain effective DR strategies with proper scheduling patterns. Using machine learning algorithms helps the consumer arrive at the best strategy plan with a proper scheduling pattern.
In recent times, most modern homes are designed to harvest energy from home renewable energy systems, in addition to drawing electricity from the grid, in order to minimize the electricity bill. With the advancement of smart grid technology, there is a high chance that the privacy of the customer is compromised, since the smart grid utility center knows the precise amount of renewable energy generated by each individual in a specific time period. Consequently, the utility center may misuse this data and raise the electricity price in situations where the home renewable energy sources cannot supply electricity [29]. Despite the measurable benefits of DR, electricity usage data collected from individual residential customers may disclose vital information about their lifestyle, including sleeping routine, economic status, and household appliance usage patterns, which leads to serious privacy concerns [28]. Although advanced RL approaches [17], [42]-[44] have been proposed for optimal DR scheduling for residential customers, existing studies show that home renewable energy systems and the preservation of the customer's privacy are not effectively incorporated in most of their learning models. Very few studies deal with residential DR schemes that consider the preservation of the customer's privacy [29]-[31]. However, these approaches fail to achieve an optimal DR strategy that benefits both the residential user and the utility center. Therefore, a Robust Adversarial Reinforcement Learning (RARL) based approach with the Gradient Based Nikaido-Isoda Function (GNI) [32] is proposed to develop an optimal mixed DR strategy model that considers renewable energy from a photovoltaic (PV) cell setup and preserves the customer's privacy [33].
In this model, the GNI function helps to develop an energy schedule for the appliances based on a cooperative mixed strategy, and the RARL learning model helps to provide the best incentive policy by analyzing the energy schedules of different users. The proposed learning model hides the customer's vital information while still providing the utility center with the information it needs to compute DR schedules and electricity forecasts.
In our proposed work, the first step targets an effective DR strategy pattern for a mixed set of consumers. The consumers in the dataset are a mix of residential consumers: a group of 100 users in a residential unit, for which scheduling patterns are derived. For these 100 consumers, the best strategy plan is created using the RARL machine learning algorithm. This DR strategy plan is based on the previous scheduling patterns created by the user, which implies that privacy can be maintained among the consumers. The consumers are given a policy-based DR strategy from which they learn the best scheduling pattern. The uniqueness of the model compared to other works is that RARL, an advanced machine learning algorithm, provides the best strategy plan to the customers through an agent. This helps the consumers follow the best scheduling pattern and thereby develop a DR strategy. The learning process helps the consumers develop proper privacy measures; the privacy concern is addressed by having the user develop a proper behavioral pattern. The scheduling of appliances with the DR strategy is done by the GNI function. The model uses the GNI function to reach an equilibrium among multiple users, since the residential unit contains a mixed set of consumers; for this multi-user process, the function is useful for obtaining the optimal strategy. This work builds on the strategies created by consumers with mixed profiles, with renewable energy, and performs the DR analysis. A learning paradigm is created using RARL to address the privacy concerns of the residential consumers, and privacy is analyzed through this process. Congestion analysis is performed by using the algorithm to create an optimal strategy plan for the mixed set of consumers.
Privacy is maintained among the users because incentives are given only to users with a proper DR strategy; this is analyzed mathematically in our model.
The main contributions of the proposed model are as follows:
• DR analysis is done using the GNI function to obtain the optimal strategy pattern for the residential units.
• The residential users are a mixed set of consumers, comprising units with and without integrated renewable energy.
• A learning paradigm is created using RARL, which helps to develop the best DR strategy pattern, avoids DR congestion, and creates the best pattern for forthcoming users. Adopting this algorithm helps the consumers frame a proper strategy, which greatly reduces privacy concerns among the residential units. The users try to follow the best pattern for their appliances, which is pre-modelled using this learning paradigm. A learned optimal DR pattern is thus created for forthcoming users.
The rest of the paper is organized as follows. Section II presents the system model and explains the steps taken to reach the goal of best scheduling and policy development in the environment. Section III gives the mathematical formulation of the models and the analysis of the learning algorithm. Section IV presents the results and discussion of the DR model with the policy developed. Section V concludes the paper and outlines future work.

II. SYSTEM METHODOLOGY
In the proposed model, demand response (DR) analysis is performed in the smart grid on real-time data from residential consumers, considering renewable energy in the household setup, with optimal scheduling of appliances and minimization of the user's bill, while addressing privacy concerns using deep learning. The modelling is done in two steps. The first step obtains an optimal strategy plan for the residential users, considering renewable energy, by developing a scheduling procedure that reduces the energy bill using the Gradient Based Nikaido-Isoda Function (GNI), arriving at a cost-effective optimal strategy. In the second step, the strategy plans obtained from the different users interact with the utility to obtain incentives and to predict future prices using Robust Adversarial Reinforcement Learning (RARL). A household with a solar panel can earn added income by selling excess generation and, at times of shortage, can receive energy from another consumer who holds an energy storage battery system. The DR strategy is formulated as an approximate mixed strategy that minimizes the cost and schedules effectively with fewer privacy concerns by obtaining the optimal equilibrium.
RARL is used in the DR strategy because the set of users in a residential community is trained through a common agent that operates in the presence of a destabilizing adversary in the system. The adversary is itself reinforced so that it learns an optimal policy. Fig. 1 depicts the DR system model with the learning paradigm. In the proposed model, RARL uses multiple agents in the group of residential consumers, which are trained jointly as two sets of agents, named the protagonist and the adversary. To obtain the optimal policy for the users, the protagonist learns to reduce the cost by obtaining the best scheduling plan.
Consider a set of users in a residential setup equipped with smart metering, which acts as the home energy management unit for scheduling the appliances. The appliance schedule depends on the deferrable and non-deferrable loads and on the real-time pricing scheme. Throughout the model, the dynamic pricing scheme is set by the utility based on the incentive strategy in demand response, considering privacy and the model of uncertainty. Accordingly, each consumer in this learning environment formulates an energy minimization problem based on their daily energy consumption. Each consumer develops an energy schedule based on a cooperative mixed strategy, using the Gradient Based Nikaido-Isoda Function (GNI) to reach the equilibrium.
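To make the distinction between deferrable and non-deferrable loads concrete, the following is a minimal sketch of how a single household could place its deferrable appliances under an hourly price signal. It is a greedy per-appliance heuristic, not the GNI-based equilibrium computation used in this work, and the price vector, appliance names, power ratings, and time windows are hypothetical values chosen only for illustration.

```python
from typing import Dict, List, Tuple


def cheapest_start(prices: List[float], duration: int,
                   earliest: int, latest_end: int) -> int:
    """Return the start slot with the lowest summed price for a fixed-power run."""
    best_start, best_cost = None, float("inf")
    for start in range(earliest, latest_end - duration + 1):
        window_cost = sum(prices[start:start + duration])
        if window_cost < best_cost:  # strict: ties keep the earliest start
            best_start, best_cost = start, window_cost
    if best_start is None:
        raise ValueError("time window is shorter than the run duration")
    return best_start


def schedule_day(prices: List[float], base_load: List[float],
                 deferrable: Dict[str, Tuple[float, int, int, int]]) -> List[float]:
    """Add each deferrable appliance on top of the non-deferrable base load."""
    load = list(base_load)
    for _name, (power_kw, duration, earliest, latest_end) in deferrable.items():
        start = cheapest_start(prices, duration, earliest, latest_end)
        for t in range(start, start + duration):
            load[t] += power_kw
    return load


# Hypothetical hourly prices (currency units per kWh) for one day.
prices = [0.10] * 6 + [0.20] * 10 + [0.35] * 5 + [0.15] * 3
base_load = [0.5] * 24  # non-deferrable load in kW, assumed flat
# appliance: (power in kW, run length in hours, earliest start, latest end)
deferrable = {"washer": (1.0, 2, 8, 24), "ev_charger": (3.0, 4, 0, 24)}

load = schedule_day(prices, base_load, deferrable)
```

In a multi-user setting the price itself depends on what all users schedule, which is why the paper treats the problem as a game rather than as independent per-household optimizations.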
This function is needed to develop the strategy for a group of consumers because it creates a mixed strategy through cooperative analysis. The joint strategy is obtained by scheduling the appliances according to the loads, also considering the solar energy (PV) at every unit. Fig. 1 shows the model analysis of the proposed system. Fig. 2 explains the two steps in the RARL model. The first is the adversary mode, in which the adversary creates an optimal strategy; the best strategies are selected from the set of profiles formed by scheduling the appliances. The next part of Fig. 2 shows the best learning policy created from the best strategy pattern, which helps the consumer follow the DR strategy to reduce the bill in the smart grid environment.
This joint analysis minimizes the cost on a per-day basis. The privacy concerns in the system can be analyzed with this function, so that the data is secured according to the strategy created by the consumer. In the next step, the daily strategy plan passes to a learning model in order to obtain incentives for the best plan while addressing privacy issues. This is done using the RARL deep learning model, which develops the best policy with two sets of agents, a protagonist and an adversary, that together produce the best strategy and a proper policy from the utility by developing test and training scenarios in a smart grid environment.

III. MATHEMATICAL ANALYSIS AND LEARNING PARADIGM
In this model we consider a discrete-time system in which scheduling is performed by a load aggregator that seeks an effective strategy profile for incentivizing residential consumers to adapt their consumption pattern, which the learning model then reinforces. For each time slot t of the scheduling pattern obtained by the consumers, including the renewable energy users, the GNI function is used to obtain the mixed strategy. A mixed strategy is considered because the residential setup contains a mix of consumers, some of whom use renewable energy alongside their fixed daily appliance schedule; S denotes the strategy by which the users reach the best strategy path. The incentives I_n for the consumers are released based on the best strategy, thereby achieving the required amount of load reduction with the best schedule. In the first modelling step, the energy consumption scheduling of household appliances is formulated for the set of consumers, with the mixed strategy including renewable energy integration, using the Gradient Based Nikaido-Isoda Function. The first model is thus formulated with an objective function to obtain the best mixed strategy in the residential sector for both controllable and non-controllable loads. Loads whose operating time can be shifted are called shiftable loads. This model depends on the mixed strategy of users reacting dynamically, with each load interfaced between the utility and the user. As a further constraint, the appliances are scheduled according to the strategy by shifting loads to non-peak hours. In the second model, learning is incorporated for privacy-preserving users. In this incentive strategy, learning is based on the estimated value from the utility, which is obtained from the test scenarios.

A. MODELLING FOR THE FIRST SCENARIO FOR SMOOTH BEST STRATEGY
The interaction in the first step of the algorithm is formulated using the Gradient Based Nikaido-Isoda Function (GNI), in which the residential users are classified into users with and without renewable energy interaction, forming a mixed strategy. The user's house is assumed to be equipped with PV panels and a battery for electrical storage. The strategy is distributed over discrete and continuous actions. The incentives for the users, which depend on the best strategy obtained, are announced based on how the user's privacy is handled by the function obtained. Let N = {1, 2, 3, ..., n} be the set of users, each with a strategy over m appliances; n denotes the residential units, each with m appliances. The strategy for the mixed set of users in the residential environment is obtained over the set of appliances m = {1, 2, 3, ..., N}. To obtain the best mixed strategy for using renewable energy in DR modelling while minimizing the cost function β(n), the problem is treated as a multiplayer game between two groups of residential users in the set N, for which the best Nash equilibrium is sought. This formulation helps the users locally improve their strategy to minimize the cost while preserving comfort and maximizing profit. The GNI function is adopted because it makes training the best strategy for the users easier and less complex at the prediction stage in RARL. In the game, this function indicates how much a consumer gains at a particular stage of scheduling the appliances when that user changes the strategy to a new vector while all other users continue to play their old strategies. For the aggregated energy consumption of the users (n ∈ N), the strategy reaches an equilibrium using the GNI function for the mixed profiles; this is computed with the GAMS solver, and the appliances are scheduled in MATLAB.
Appliances are scheduled between their starting and ending times. This activity is divided into two slots in a day, since an energy balance is created from PV [25] and storage for the set of consumers. The GNI function [26] is used such that the objective function attains its global minimum at the Nash equilibrium of the N users with m appliances, which is satisfied for the mixed strategy of the users.
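For reference, the Nikaido-Isoda function and its gradient-based variant are commonly written in the following general form. This is the standard textbook formulation, stated here as an assumption and not necessarily the exact expression used in this work: f_n stands for the cost of user n (playing the role of β(n)), x_n is that user's strategy, x_{-n} collects the strategies of all other users, and η > 0 is a step size.

```latex
% Nikaido-Isoda function: total gain available from unilateral deviations y_n
\psi(x, y) = \sum_{n=1}^{N} \left[ f_n(x_n, x_{-n}) - f_n(y_n, x_{-n}) \right],
\qquad
V(x) = \sup_{y} \psi(x, y)

% Gradient-based variant: each user's deviation is one gradient step
V(x; \eta) = \sum_{n=1}^{N}
  \left[ f_n(x_n, x_{-n})
       - f_n\!\left(x_n - \eta \nabla_{x_n} f_n(x),\, x_{-n}\right) \right]
```

At a Nash equilibrium no user can gain by deviating alone, so V(x) = 0; the gradient-based variant likewise vanishes wherever every user's own-gradient is zero, which is what allows the equilibrium search to be posed as the minimization of a single scalar function.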
Subject to the constraints considered, the GNI function is minimized by steepest gradient descent when scheduling the appliances for all loads in the required time period. Each value of the GNI function, used to obtain the best strategy from the mixed profile, captures the change in a user's gain under the strategy obtained. The process is repeated, summing the changes in user gain, to obtain the best scheduling strategy at every stage of the game; the result feeds the learning process in the second step of the mathematical analysis of the demand response model.
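The descent on the GNI function described above can be sketched on a toy two-user game. The cost function, the constants PRICE_SLOPE and PREFERRED, the step sizes, and the use of a finite-difference gradient are all assumptions made for illustration; the paper itself solves the full scheduling problem with GAMS and MATLAB, not with this code.

```python
import numpy as np

PRICE_SLOPE = 0.5                 # hypothetical: price grows with total load
PREFERRED = np.array([2.0, 3.0])  # hypothetical preferred loads per user (kWh)


def cost(n, x):
    """Cost of user n: energy payment plus a quadratic discomfort penalty."""
    return PRICE_SLOPE * x.sum() * x[n] + (x[n] - PREFERRED[n]) ** 2


def own_gradient(n, x):
    """Derivative of user n's cost with respect to its own load x[n]."""
    return PRICE_SLOPE * (x.sum() + x[n]) + 2.0 * (x[n] - PREFERRED[n])


def gni(x, eta=0.1):
    """Gradient-based Nikaido-Isoda value: summed gain from one-step deviations."""
    total = 0.0
    for n in range(len(x)):
        y = x.copy()
        y[n] = x[n] - eta * own_gradient(n, x)  # user n deviates alone
        total += cost(n, x) - cost(n, y)
    return total


def numerical_gradient(fun, x, h=1e-6):
    """Central finite-difference gradient of a scalar function."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (fun(x + step) - fun(x - step)) / (2.0 * h)
    return grad


def minimise_gni(x0, step=0.5, iters=300):
    """Steepest descent on the GNI value, starting from the loads x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = x - step * numerical_gradient(gni, x)
    return x


x_star = minimise_gni([0.0, 0.0])
```

For this quadratic game the Nash equilibrium can also be found by solving the two first-order conditions as a linear system, which gives a direct check that the descent has driven the GNI value to zero at the equilibrium loads.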

B. LEARNING MODEL
To improve privacy among the consumers when the demand response strategy is applied, this learning model tests the best strategy obtained against an adversarial agent and receives incentives from the utility. If one user plays a best strategy and the other users also play their best strategies while the learning model creates an adversary, a reward, namely the incentive, is received from the utility. This helps maximize the reward obtained, and the model continues for every strategy created by an agent in this learning model. The final goal is to turn the best strategy obtained from this process into a policy for the protagonist; in this DR model, the protagonists are the residential users playing the game by scheduling their appliances on a daily basis. Fig. 3 illustrates the model within the DR structure. The model is thus developed to obtain the best policy from the strategies developed between the two elements, protagonist and adversary, for privacy concerns. The transition function here uses the obtained cost function β_x, and the best policy is rendered using the reinforced learning parameter θ_φ for the policy, such that the expected reward is obtained from the utility side under the policy φ, starting from the best strategy created by the GNI function in the first step. At every step, both users are at the strategy states S_i and take actions h, which are reinforced by the control center, the utility, and the user environment. At each step of learning, the policy is set up. The protagonists on the user side develop strategies in order to receive the incentive and maximize the reward under the reward function. The policies are the fine-tuned learned models created from the best strategy developed in the model, with incentives given and the policy standardized.
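The alternating protagonist and adversary updates at the core of RARL can be illustrated with a deliberately small sketch. This is not the learning model used in this work: the neural-network policies and rollouts are replaced by two scalar parameters, and the payoff function, learning rate, and step counts are hypothetical. It only shows the pattern of freezing one agent while the other improves.

```python
def payoff(protagonist, adversary):
    """Toy zero-sum reward: the protagonist maximises it, the adversary minimises it."""
    return -(protagonist - 1.0) ** 2 - protagonist * adversary + 0.5 * adversary ** 2


def d_protagonist(protagonist, adversary):
    """Partial derivative of the payoff with respect to the protagonist parameter."""
    return -2.0 * (protagonist - 1.0) - adversary


def d_adversary(protagonist, adversary):
    """Partial derivative of the payoff with respect to the adversary parameter."""
    return -protagonist + adversary


def train_alternating(rounds=200, inner_steps=5, lr=0.1):
    """Alternate gradient ascent (protagonist) and descent (adversary)."""
    protagonist, adversary = 0.0, 0.0
    for _ in range(rounds):
        # Phase 1: adversary frozen, protagonist improves its reward.
        for _ in range(inner_steps):
            protagonist += lr * d_protagonist(protagonist, adversary)
        # Phase 2: protagonist frozen, adversary tries to reduce that reward.
        for _ in range(inner_steps):
            adversary -= lr * d_adversary(protagonist, adversary)
    return protagonist, adversary


protagonist_star, adversary_star = train_alternating()
```

Because this toy payoff is concave in the protagonist parameter and convex in the adversary parameter, the alternation settles at a saddle point where neither agent can improve alone. With real policies and stochastic rollouts no such guarantee holds, so convergence has to be monitored empirically.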
After the utility's announcement, the main parameters of the two best strategy players are sampled continuously from a random distribution. To create a standard policy for further learning, the protagonist strategy from one customer is held constant, and the adversary model for further strategy winners is learned from the reward analysis. Each learning procedure runs for many iterations, depending on the learning task. The complete sequence is repeated in the DR environment until the best policy is created and the process reaches convergence. Table 2 gives the algorithm of the learning model developed for policy makers in the DR model. This learning model lets the consumers learn the agent's policy, which is generally created from the best two strategies produced by the consumers over a continuous period. Practical scenarios involve a large amount of data, so the model is trained on it. The resulting policy model thus accounts for the different uncertainties arising from user scheduling and the perturbations involved. In this model the parameters

IV. RESULTS AND DISCUSSIONS
In the proposed model with the learning paradigm, the mathematical analysis is carried out with the GAMS solver and Gurobi Optimization [34] using independent load profiles. The GAMS solver is used to generate the load profile patterns with the GNI function; different consumer patterns are considered, and the solver is used to obtain the dataset. The results are obtained from the dataset collected from the data port [44], [45]. The raw real-time data are preprocessed with the GAMS solver and Python for the mixed consumers, that is, users with and without PV integrated at their units. The dataset consists of preprocessed data for both sets of consumers. Certain load profiles are selected from it in order to develop a learning process for them, and the mathematical model produces related load profiles. A dataset of 20 appliance profiles is considered as the first set. The appliances are separated into deferrable and non-deferrable loads. Among the 20 appliances considered, the first set provided the best scheduling pattern. The scheduling patterns are obtained from the GNI function, which helps deliver the best profiles. The best profiles are configured by developing a function with GNI that sets a threshold separating peak load from off-peak load. From the dataset considered, the best load profiles are formed; they serve as the optimal pattern for the other datasets considered. The DR algorithm, for users with and without renewable energy, performs the scheduling analysis on an hourly basis, from which the best strategy evolves.
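The idea of a threshold separating peak from off-peak load can be sketched as follows. The paper derives its threshold from the GNI-based function; here a simple quantile of the daily profile is used as a stand-in, and both the quantile value and the example profile are hypothetical.

```python
import numpy as np


def split_peak_hours(load_kw, quantile=0.75):
    """Label hours as peak when the load exceeds a quantile-based threshold."""
    load = np.asarray(load_kw, dtype=float)
    threshold = np.quantile(load, quantile)   # assumed rule, not the GNI one
    peak_hours = np.flatnonzero(load > threshold)
    off_peak_hours = np.flatnonzero(load <= threshold)
    return threshold, peak_hours, off_peak_hours


# Hypothetical hourly profile (kW): low overnight, evening peak.
example_load = [0.4] * 6 + [0.8] * 11 + [2.5] * 4 + [0.9] * 3
threshold, peak_hours, off_peak_hours = split_peak_hours(example_load)
```

Deferrable appliances found in the peak hours are the natural candidates for shifting, while non-deferrable loads stay where they are.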
The best strategies obtained are taken as the learning objective for the users, and RARL is used to set the policy at the utility side so as to obtain the best policy with optimized strategies for future use in the DR structure. The price structure and the energy information are collected from the real-time dataset, and the model is trained using the MATLAB toolbox with Python programming to obtain the results for the learning structure. Fig. 4 shows an example of single-user load profile data with the set of appliances used for the DR strategy scheme.
The GAMS solver is used to obtain the strategy from the load profiles collected from users with and without PV, in a set of 2000 load profiles. All GAMS results and the associated optimization are run with MATLAB 2016 on a Windows machine. From the 2000 load profiles created, the GAMS solver, which is used to develop the GNI function, establishes an equilibrium that yields the standard load profile pattern depicted in Fig. 4. The strategy is priced with the Time of Use (TOU) pricing scheme and is also compared with the hourly pricing scheme for the DR analysis. A comparison is made for a set of 100 users and their appliances, with an average daily usage of 1122.6 units, an estimated average weekday usage of 1353.0 units, and a calculated weekend usage of 892.2 units; the maximum load usage is calculated under the TOU pricing scheme for the dataset. On average, the unit usage among the residential users yields the load profile pattern for 892 sets. In Fig. 5, which covers the mixed consumers forming the second set of units, the best scheduling patterns are created using the GNI function. Fig. 5 shows the load profile considered for scheduling the load per user, with mixed profiles that include renewable energy, based on the optimization. The best profile function, developed using the best DR strategy pattern, is shown here for the second dataset of residential users with solar PV integrated at their units.
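The comparison between TOU and hourly pricing described above amounts to evaluating the same hourly load profile against two price vectors. A minimal sketch follows; the tier boundaries and the prices (in Rs per unit) are illustrative assumptions and are not the tariffs used in this work.

```python
def tou_prices(off_peak=3.0, mid_peak=5.0, peak=8.0):
    """Build a 24-hour TOU price vector (tier hours and prices are assumed)."""
    prices = []
    for hour in range(24):
        if 18 <= hour < 22:
            prices.append(peak)       # evening peak window (assumed)
        elif 6 <= hour < 18:
            prices.append(mid_peak)   # daytime window (assumed)
        else:
            prices.append(off_peak)   # night hours
    return prices


def daily_cost(load_units, prices):
    """Cost of one day: sum over hours of consumption times price."""
    if len(load_units) != len(prices):
        raise ValueError("load and price vectors must have the same length")
    return sum(units * price for units, price in zip(load_units, prices))


if __name__ == "__main__":
    flat_load = [1.0] * 24
    shifted_load = list(flat_load)
    shifted_load[19] -= 1.0   # remove one unit from a peak hour
    shifted_load[2] += 1.0    # and consume it in an off-peak hour

    hourly_prices = [5.0] * 24  # placeholder for a real hourly price series
    print("TOU, unscheduled:", daily_cost(flat_load, tou_prices()))
    print("TOU, shifted:    ", daily_cost(shifted_load, tou_prices()))
    print("Hourly, shifted: ", daily_cost(shifted_load, hourly_prices))
```

Under a tiered tariff, moving a deferrable unit out of the peak window lowers the bill even though total consumption is unchanged, which is the effect the DR scheduling exploits.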
The GNI function develops two sets of profiles. The first is formed from a group of users with solar PV and gives the best pattern for that group. The second is the best profile for residential units without solar PV. The GAMS solver is used to develop the best equilibrium-state profile, which is then passed to the next stage, the learning paradigm using RARL.
The calculations are based on the cost function developed for every single user, with and without load scheduling in the DR schemes. Table 3 presents the performance gains of the model developed with the GNI function, where two sets of profiles are considered for the learning process. The first set, with and without PV, yields better savings in the model created. Table 4 shows the performance analysis for the set of users, in Rs per unit, calculated with the best DR strategy schemes under TOU pricing, together with the considerable change in cost observed under the hourly pricing scheme. To further understand the different strategy choices and to develop the two best strategies, the RARL learning model is applied. The results show the system cost of the best strategies developed and the analysis made. The best strategy profiles are obtained at the utility center from the GAMS solver results for DR and for the DR strategy with renewable energy profiles. The best policies are created over different iterations of the learning model. Fig. 6 shows the DR strategy created among the best 200 profiles formed from the GNI function, that is, among the set of best results produced by the GNI function. The best DR patterns are then developed by the RARL model, and the learning pattern develops the DR process for the best profile. The best profile without DR is ignored so that the RARL model can create a proper profile pattern, through the agent, for future consumers following this learning paradigm. Fig. 7 shows the best policy pattern created for the strategy based on the cost analysis. This pattern helps to develop the best DR strategy among the two sets of profile patterns developed. With this learning paradigm, at most 80 % of the generated DR profiles accurately match the predicted pattern. The RARL model, implemented in Python, shows that among the set of 200 profiles, 80 % of the generated profiles match the best strategy pattern.
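The share of generated profiles that match the best strategy pattern can be computed as a simple match rate. The matching criterion is not specified in the text, so the sketch below assumes one: a profile counts as a match when its mean absolute deviation from the reference pattern is within a tolerance. The function names and the tolerance value are hypothetical.

```python
def mean_abs_dev(profile, reference):
    """Mean absolute hourly deviation between a profile and the reference."""
    if len(profile) != len(reference):
        raise ValueError("profile and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(profile, reference)) / len(reference)


def match_rate(profiles, reference, tol=0.1):
    """Fraction of profiles within 'tol' of the reference pattern (assumed criterion)."""
    if not profiles:
        raise ValueError("at least one profile is required")
    matches = sum(1 for p in profiles if mean_abs_dev(p, reference) <= tol)
    return matches / len(profiles)


if __name__ == "__main__":
    reference = [1.0] * 24
    # Synthetic example: 8 profiles equal to the reference, 2 far from it.
    profiles = [list(reference) for _ in range(8)] + [[2.0] * 24 for _ in range(2)]
    print(f"match rate: {match_rate(profiles, reference):.0%}")
```

With 200 generated profiles, a match rate of 0.8 under such a criterion would correspond to 160 profiles following the best strategy pattern; the actual figure depends on the criterion and tolerance chosen.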
This uniqueness thus provides the best learning paradigm for future consumers following the DR strategy created using the GNI function. Table 5 lists the different computational functions used. The GAMS solver is used to develop the GNI function, which creates two sets of profiles, one integrated with renewable energy and one without. In the next step, a cost function is obtained using the RARL machine learning algorithm. Of the two sets of profiles created with the DR strategy, the scheduling pattern with the best cost function is taken as the learning paradigm, for which the computational time is very low. Because little computational time is involved, privacy among the consumers can be managed easily. With data privacy, the consumers follow a standard learning pattern that avoids unnecessary actions which would create problems for other users. An effective pattern for the DR strategy is thereby created for future consumers, so that users can follow this general profile pattern without DR congestion while preserving data privacy. The TOU pricing scheme helps the user develop a pattern based on the time at which the appliances are scheduled. The analysis is carried out by two sets of agents created under the TOU pricing scheme. The policy developed from the best strategies is shown in Fig. 8, where different iterations are followed and the best curve among the set of profiles from the GNI optimization is analyzed. Fig. 8, obtained from the RARL algorithm, shows that over the 100 iterations considered for the first set of profile selection, 150 profiles follow the best strategy pattern created by the RARL algorithm. With the parameters considered here, the model develops the best strategy profiles based on the cost function and forms a standard pattern for future consumers.
Fig. 8 shows the result for the models considered for the set of 100 users from the DR strategy model. The policy is created from the best strategy, and prices are determined as incentives to the users. The curve fits where a range of 3 to 4 consumers follow the best strategy, and the policy model is developed on this basis so that a set of consumers can follow it without privacy concerns arising in the future. This learning model gives an understanding of how the learning parameters can be obtained using the DR strategy when the consumers are in a mixed setup. The results show that, by training more models for a set of consumers, the policy developed by the agent can be reused and more residential consumers are able to learn the analysis with a better response. The iterations in Fig. 8 are repeated for the standard set of mixed profiles created by the GNI function, forming a standard template for future consumers following this scheduling pattern. The unique aspect of this model is that RARL develops a standard learning pattern for the consumers in the mixed set of profiles; for the dataset considered, 80 percent of the DR profiles were able to train the learning process successfully.

V. CONCLUSION
In this work, demand response modelling for residential consumers is carried out with a learning model built on the strategies developed from the set of users. Starting from regular DR modelling, this work focuses on the strategies created by consumers in a mixed-profile setup with renewable energy, and the DR analysis is performed on that basis. The GNI function is used to play the game among the users, which helps to develop strategies when mixed profiles are created. Furthermore, the best strategies created are given incentives, and because privacy has to be maintained, a learning model is built using a deep learning paradigm. This learning model, created in the smart grid environment, helps to form the best strategy and presents a model to residential users without privacy concerns. Different uncertainties among the users can be handled using this deep learning analysis. The simulation results show the best strategy created, and a training model is developed so that the best pattern can be obtained. As an extension of this work, DR congestion in the network can be formulated and compared with the strategy model so that the approach can be applied more widely in real-world situations.
S. SOFANA REKA is currently working as an Assistant Professor (Senior) with the School of Electronics Engineering, Vellore Institute of Technology at Chennai. She has published in various international journals with high impact factors. Her research interests include smart grid, embedded systems, machine learning, the Internet of Things, and cyber physical systems. She is a reviewer for many SCI journals.

PRAKASH VENUGOPAL is currently a Faculty Member with the School of Electronics Engineering, Vellore Institute of Technology at Chennai. He has published many research articles in high-quality peer-reviewed journals and international conferences. His major research interests include embedded system design, real-time operating systems, battery management systems in electric vehicles, smart grid, the Internet of Things, and machine learning.
HASSAN HAES ALHELOU (Senior Member, IEEE) is currently a Faculty Member with Tishreen University, Lattakia, Syria. He is also with IUT, Iran. He has participated in more than 15 industrial projects. He has published more than 100 research articles in high-quality peer-reviewed journals and international conferences. His major research interests include power systems, power system dynamics, power system operation and control, dynamic state estimation, frequency control, smart grids, micro-grids, demand response, load shedding, and power system protection.
PIERLUIGI SIANO (Senior Member, IEEE) received the M.Sc. degree in electronic engineering and the Ph.D. degree in information and electrical engineering from the University of Salerno, Salerno, Italy, in 2001 and 2006, respectively. He is currently a Professor and the Scientific Director of the Smart Grids and Smart Cities Laboratory, Department of Management and Innovation Systems, University of Salerno. His research interests include demand response, energy management, integration of distributed energy resources in smart grids, electricity markets, and planning and management of power systems. In these research fields he has coauthored more than 500 articles, including more than 300 international journal articles that received in Scopus more than 10 100 citations with an H-index equal to 49. In 2019 and 2020, he received the award as a Highly Cited Researcher from the ISI Web of Science Group. He has been the Chair of the IES TC on Smart Grids. He is an Editor of the Power and Energy Society Section of IEEE Access, IEEE Transactions on Industrial Informatics, IEEE Transactions on Industrial Electronics, IEEE Open Journal of the Industrial Electronics Society, IET Smart Grid, and IET Renewable Power Generation.
MOHAMAD ESMAIL HAMEDANI GOLSHAN (Senior Member, IEEE) received the B.Sc. degree in electrical engineering from the Isfahan University of Technology, Isfahan, Iran, in 1987, the M.Sc. degree in electrical engineering from the Sharif University of Technology, Tehran, in 1990, and the Ph.D. degree in electrical engineering from the Isfahan University of Technology, in 1998. He is currently a Full Professor with the Department of Electrical and Computer Engineering, Isfahan University of Technology. His major research interests include power system analysis, power system dynamics, power quality, dispersed generation, flexible ac transmission systems and custom power, and load modeling, especially arc furnace modeling.