A Master-Slave Game Optimization Model for Electric Power Companies Considering Virtual Power Plant

As emerging and active entities in China's electricity market, electricity selling companies need a more reliable operating mechanism and new consumption modes to broaden their profit margins. Focusing on sales companies with distributed power generation and considering the participation of virtual power plants (VPPs), this paper first presents the relevant operating systems in the Chinese power market. It then proposes a new platform for power transactions and optimal dispatch based on a master-slave game optimization model. The model is built so that the main player, the power sales company, achieves maximum profit while the subordinate player, the VPP, attains the lowest internal dispatching cost. The decisions of the two parties are coupled through the energy they exchange, and the two parties repeatedly exchange strategies and iteratively optimize their operational decisions and scheduling plans. The results of the case study show that retail electricity companies benefit from adopting the proposed model to aggregate and manage decentralized resources and to optimize decision-making. The platform enables controllable loads and distributed energy sources to participate in market transactions on a large scale and improves the operational strategies of electric utilities.


I. INTRODUCTION
At present, China is undergoing a new round of power system reform. With the full opening of the electricity retail market and the participation of many companies, the power market has become more dynamic, which brings both opportunities and challenges to the planning and operation of power selling companies [1]-[3]. Recognizing their own situation and functional positioning, and continuously improving their market mechanisms, operating models and supporting infrastructure, have become the main issues that electricity retail companies should focus on in their future development [4]-[6].
Among the multiple types of power sellers participating in the market competition, the main body studied in this article is a generation-type power sales company with distributed power generation capabilities. This type of company can draw on the electricity market and abundant distributed energy resources (DER) and participate according to its own optimized decisions in order to achieve large profit margins. However, the fluctuation of electricity prices and the uncertainty caused by intermittent renewable energy sources adversely affect the real-time market balance of electricity sales companies and the safe operation of their jurisdictions [7], [8]. Restrictive transaction access rules also lead to the waste of some resources. Virtual power plant (VPP) technology can integrate a variety of distributed energy sources to participate in the operation of the power market, and provides new ideas for the large-scale utilization of DER and controllable load resources by electricity selling companies.
The associate editor coordinating the review of this manuscript and approving it for publication was Amedeo Andreotti.
Scholars in China and abroad have carried out considerable research on the business models and marketing strategies of electricity retail companies, but little attention has been given to the optimal dispatch of such companies. Regarding the business strategy of purchasing and selling electricity, references [9], [10] propose a dynamic optimization method of demand response based on reinforcement learning, which is meant to improve the long-term income of electricity sellers. Reference [11] assesses the price deviation among electricity sales companies and formulates a reasonable business plan for electricity transactions. Regarding the optimal dispatch of electricity retail companies, reference [12] incorporates DER and distributed loads (DL) into the optimal dispatch, but it does not consider the intermittency of renewable energy sources or the uncertainty of DL. Reference [13] presents the needs of China's distribution network development and the specific situation of power system reform, and constructs a stochastic optimization decision-making model for energy trading by power distribution companies. The VPP model adopted by electricity sales companies can improve communication, coordinate and control resources, and respond to uncertainty risks. Reference [14] recommends that retail companies adopt VPPs and a two-tier optimal dispatching model to aggregate internal units. Reference [15] formulates a dynamic VPP combination plan for retail electricity companies based on a VPP combination strategy. It should be noted that in the above literature, the dispatching scale of the electricity retail companies is small, and the utilization efficiency of schedulable resources is not high enough. Moreover, the two-layer modeling of the electricity selling companies and VPPs is complicated and difficult to solve.
The cross-domain development of game theory has greatly helped in solving problems of power system dispatch and power market planning. Master-slave game theory handles, level by level, the decision variables of two parties that share the same ultimate interest but stand in a master-slave relationship, so that the main conflict at each level is addressed in turn. The clear physical meaning of the resulting model greatly reduces the complexity of the problem.
In view of the above literature review, this article proposes a new operating system for electricity sales companies that considers the participation of VPPs. The proposed model is developed based on master-slave game theory. It enables companies to make optimal decisions on energy transactions and to formulate preliminary dispatching plans that maximize their revenue. The VPP takes the lowest scheduling cost as its goal and considers multiple constraints to optimize the scheduling of its internal units. The uncertainty of distributed renewable energy sources and controllable loads such as electric vehicles is handled through chance constraints to ensure the efficiency and reliability of the model's solution. The game model is solved with an iterative search method. As elaborated below, simulation results show the robustness of the proposed game model in maximizing the economic benefits of power selling companies while ensuring safe and stable operation of their regions.

A. OPERATION MODEL OF ELECTRICITY SALES COMPANIES
Combining the actual situation of China's electricity sales market with its future development direction, the operation mode of an electricity sales company with VPP participation is established as shown in Figure 1. First, the electricity sales company purchases electricity in the day-ahead market according to the load demand and the day-ahead power purchase rate. The company then sells the purchased electricity to users in the area under its jurisdiction. Any surplus or shortfall relative to the day-ahead purchase can be traded in the real-time market. The day-ahead market adopts a unified clearing price model, while the electricity price in the real-time market fluctuates over time. In addition, electricity sales companies can use VPP technology to aggregate DER and DL into a number of units. The VPP scheduling mode is adopted to optimize the scheduling of DER and DL during peak or trough periods of the real-time market, adjust the DER and DL output according to the plan, and sell the electricity to the users [16], [17]. Compared with the current industry practice of only purchasing and selling electricity at the electricity trading center, electricity sales companies have more flexible strategies for purchasing and selling electricity under the new operating model proposed in this paper. The flexible scheduling of DER and DL also provides greater profit margins for electricity sales companies. Figure 2 shows the optimized scheduling framework of the VPP. The VPP control coordination center manages the internal distributed energy output and cluster controllers to meet the power demand of end users and achieve optimal scheduling of the various DERs, while stabilizing external output and ensuring safe and stable dispatching. The units that can be dispatched by the VPP include energy storage systems, distributed photovoltaics (PVs), micro gas turbines (MTs) and controllable loads.
VOLUME 10, 2022
It is worth noting that the operating cost of PV generation is not counted, and that DL can be divided into interruptible loads and shiftable loads, which participate in VPP scheduling by signing contracts with power sales companies [18], [19].

B. VIRTUAL POWER PLANT SCHEDULING MODE
A VPP comprises several internal adjustable units along with complex constraint conditions. Scheduling all units directly is likely to cause the "curse of dimensionality", which makes the optimal decision-making plan hard to solve for. Therefore, in this paper, the optimal decision of retail companies with VPPs is developed based on a master-slave game model. Then, according to the scheduling plan, each VPP optimizes the scheduling of its internal units in compliance with their operating constraints to ensure efficient use of energy and stable operation of the system.

III. PROPOSED OPTIMIZED MODEL
The application of master-slave game theory in constructing optimal decision-making models covers two situations: in the first, multiple decision-making subjects already exist, while in the second there are no multiple decision-making subjects. For a complex modeling background, master-slave game theory allows the decision variables to be handled hierarchically, so that the main conflicts are addressed step by step. This helps to clarify the physical meaning of the model, make its logical structure understandable, and reduce the complexity of the problem [20]-[22]. In the process of optimizing operation and coordinating dispatch of electricity sales companies, the purchase and sale strategy in the electric energy trading center and the VPP power generation dispatching plan represent the core of the entire system operational plan. The optimal scheduling of VPP internal units is mainly carried out on the basis of the operation strategy formulated by the electricity sales company. The two parties differ in the information available to them and in the order in which they decide, and the decision-making authority also has a clear master-slave difference. Therefore, master-slave game theory is suitable for modeling this problem.

A. OPTIMAL DECISION-MAKING MODEL FOR RETAIL ELECTRICITY COMPANIES
The main player in the proposed model is the electricity sales company. Its strategy is to formulate electricity trading and VPP dispatch plans for different periods of time. The revenue of a power selling company is calculated from the returns from the sale of electricity to end users, the cost of purchasing electricity in the day-ahead market, the real-time market transaction fee, and the cost of VPP. The objective function $F_{er}$ to be maximized is formulated in (1), where $\Delta T$ is the time interval between electricity purchase and sale, and $T$ is the total number of time periods in a day.
In the $t$th period, the income $E_t^{load}$ from the sale of electricity to end users is expressed as:
$$E_t^{load} = c_t^{s}\, p_t^{l} \tag{2}$$
where $c_t^{s}$ is the unit selling price in the $t$th period and $p_t^{l}$ is the active load in the $t$th period.
The electricity purchase cost $E_t^{day}$ of the electricity sales company in the day-ahead market is:
$$E_t^{day} = c_t^{d}\, \eta\, p_t^{l} \tag{3}$$
where $c_t^{d}$ is the unified clearing price in the day-ahead market, $\eta$ is the purchase rate of the electricity sales company in the day-ahead market, and $p_t^{l}$ is the active load in the $t$th period.
The real-time market transaction fee $E_t^{real}$ of the electricity sales company is:
$$E_t^{real} = c_t^{r}\, p_t^{r} \tag{4}$$
where $c_t^{r}$ is the real-time market price in the $t$th period and $p_t^{r}$ is the trading power of the electricity sales company in the $t$th period of the real-time market. When $p_t^{r} > 0$, the company purchases electricity from the real-time market; when $p_t^{r} < 0$, it sells electricity to the real-time market.
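The objective function in (1) is maximized subject to the following constraints. The balance of purchased and sold electricity requires
$$\eta\, p_t^{l} + p_t^{r} + \sum_{n=1}^{N} p_{n,t}^{VPP} = p_t^{l} \tag{5}$$
where $p_t^{l}$ is the active load, $p_t^{r}$ is the real-time market trading power, $p_{n,t}^{VPP}$ is the planned call power of the $n$th VPP during period $t$, and $N$ is the number of VPPs. The VPP transmission power is bounded by
$$P_{min}^{VPP} \le p_{n,t}^{VPP} \le P_{max}^{VPP} \tag{6}$$
where $P_{min}^{VPP}$ and $P_{max}^{VPP}$ are the lower and upper limits of VPP transmission power, respectively.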
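To make the revenue terms concrete, the following minimal sketch evaluates the retailer's daily profit. It assumes that (1) sums the per-period sales income minus the day-ahead cost and the real-time fee, each multiplied by the interval length, and then subtracts a lump-sum VPP cost; that the real-time trade is whatever closes the purchase and sale balance; and that the VPP power is given as a total over all VPPs. The function name `retailer_profit`, its arguments and the numbers are hypothetical and are not taken from the paper.

```python
def retailer_profit(load, sell_price, da_price, rt_price, vpp_power,
                    eta, vpp_cost=0.0, dt_h=1.0):
    """Daily profit of the retailer under the assumed form of the objective.

    load, sell_price, da_price, rt_price, vpp_power are per-period lists
    (kW or Yuan/kWh); eta is the day-ahead purchase rate; vpp_cost is the
    total VPP cost (assumed to enter as a lump sum); dt_h is the interval
    length in hours.
    """
    profit = 0.0
    for p_l, c_s, c_d, c_r, p_vpp in zip(load, sell_price, da_price,
                                         rt_price, vpp_power):
        p_da = eta * p_l               # day-ahead purchase
        p_rt = p_l - p_da - p_vpp      # real-time trade closing the balance
        income = c_s * p_l             # sales income from end users
        cost_da = c_d * p_da           # day-ahead purchase cost
        fee_rt = c_r * p_rt            # > 0 when buying, < 0 when selling
        profit += (income - cost_da - fee_rt) * dt_h
    return profit - vpp_cost


# Two illustrative periods: the VPP covers the gap in the first one only.
example = retailer_profit(
    load=[100.0, 100.0], sell_price=[0.6, 0.6], da_price=[0.3, 0.3],
    rt_price=[0.5, 0.2], vpp_power=[20.0, 0.0], eta=0.8)
```

In the first period the VPP supplies the 20 kW not bought day-ahead, so nothing is traded in real time; in the second period the same 20 kW is bought at the real-time price.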

B. VPP OPTIMIZATION SCHEDULING MODEL
The subordinate player in this article is the VPP, whose strategy is to optimize the scheduling of its internal units on the basis of the plans made by the main player. In order to facilitate the role of the VPP in coordinating generation-side and load-side resources, and to improve energy utilization efficiency, the VPP call cost is used as the objective function. The call cost of a VPP includes the penalty for output deviation and the call costs of its own MT, energy storage system (ESS), interruptible load (IL) and shiftable load (SL). The objective function $f_{VPP,n}$ to be minimized is given in (7), where $f_{VPP,n}$ is the total call cost of the $n$th VPP; $\lambda^{+}$ and $\lambda^{-}$ are the upward and downward penalty coefficients, respectively; $\lambda_1$ is the cost coefficient of the micro gas turbine; $p_{n,t}^{MT}$ is the output power of the MT of the $n$th VPP in the $t$th period; $\lambda_2$ is the cost coefficient of charging and discharging the ESS; $P_{n,t}^{ESS}$ is the charging and discharging power of the ESS of the $n$th VPP in the $t$th period; $\lambda_3$ is the cost coefficient of IL calls; $\theta_n^t$ is the call state variable of the IL of the $n$th VPP in the $t$th period, with $\theta_n^t = 0$ meaning not called and $\theta_n^t = 1$ meaning called; $p_{n,t}^{IL}$ is the load reduction of the IL of the $n$th VPP in the $t$th period; $\lambda_4$ is the load shift cost coefficient of the SL; and $p_{n,t}^{SL}$ is the shifted power of the SL of the $n$th VPP in the $t$th period, where $p_{n,t}^{SL} > 0$ means the load is moved out and $p_{n,t}^{SL} < 0$ means the load is moved in. Equation (7) is solved subject to the constraints (8) through (21) for the MT, ESS, IL, SL and PV, together with the power balance constraint.
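As an illustration of how the cost terms listed above could combine, the sketch below adds a deviation penalty and the four unit call costs per period. The exact form of (7) is not reproduced here, so several choices are assumptions: the deviation is taken as delivered minus planned VPP power, an excess is charged at the upward coefficient and a shortfall at the downward coefficient, and the signed ESS and SL powers are charged on their absolute values. The name `vpp_call_cost` and the cost coefficients in the example are hypothetical.

```python
def vpp_call_cost(plan, actual, p_mt, p_ess, theta, p_il, p_sl,
                  lam_up, lam_dn, lam1, lam2, lam3, lam4, dt_h=1.0):
    """Call cost of one VPP over all periods (assumed form, see lead-in)."""
    cost = 0.0
    for t in range(len(plan)):
        dev = actual[t] - plan[t]
        # Assumed penalty: excess output uses lam_up, shortfall uses lam_dn.
        penalty = lam_up * max(dev, 0.0) + lam_dn * max(-dev, 0.0)
        units = (lam1 * p_mt[t]                # micro gas turbine output
                 + lam2 * abs(p_ess[t])        # ESS charging or discharging
                 + lam3 * theta[t] * p_il[t]   # interruptible load, if called
                 + lam4 * abs(p_sl[t]))        # shiftable load moved in or out
        cost += (penalty + units) * dt_h
    return cost


# Two illustrative periods with the paper's penalty coefficients (1.0, 0.5).
example_cost = vpp_call_cost(
    plan=[100.0, 100.0], actual=[110.0, 90.0], p_mt=[50.0, 50.0],
    p_ess=[10.0, -10.0], theta=[1, 0], p_il=[5.0, 5.0], p_sl=[0.0, -4.0],
    lam_up=1.0, lam_dn=0.5, lam1=0.4, lam2=0.1, lam3=0.2, lam4=0.05)
```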
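The MT constraints bound the output power and the ramping between consecutive periods, where $P_{n,max}^{MT}$ and $P_{n,min}^{MT}$ are the upper and lower limits of the output power of the MT of the $n$th VPP, respectively, and $r_u$ and $r_d$ are the maximum upward and downward ramp rates of the MT, respectively.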
The ESS constraints bound the state of charge and the charging and discharging power, the latter as
$$-P_{ch,n}^{ESS} \le P_{n,t}^{ESS} \le P_{dis,n}^{ESS}$$
where $SOC_{max}$ and $SOC_{min}$ are the upper and lower limits of the battery state of charge (SOC) of the VPP ESS; $E_{N,n}$ is the rated capacity of the battery of the $n$th VPP; $E_{n,t}$ is the remaining energy of the battery of the $n$th VPP in period $t$; and $P_{dis,n}^{ESS}$ and $P_{ch,n}^{ESS}$ are the maximum discharge and charge power of the battery of the $n$th VPP, respectively.
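A quick way to see how the power and SOC limits interact is to check a candidate ESS schedule period by period. The sketch below is only illustrative: it assumes a lossless energy update (remaining energy decreases by the discharged energy each period), takes positive power as discharging in line with the power limit above, and uses the hypothetical name `ess_schedule_feasible`.

```python
def ess_schedule_feasible(p_ess, e0, e_rated, soc_min, soc_max,
                          p_ch_max, p_dis_max, dt_h=1.0):
    """Return True if a signed ESS schedule respects power and SOC limits.

    p_ess[t] > 0 is discharging and p_ess[t] < 0 is charging.
    Charging and discharging losses are ignored (assumption).
    """
    energy = e0
    for p in p_ess:
        # Power limit: -P_ch <= P <= P_dis
        if p < -p_ch_max or p > p_dis_max:
            return False
        energy -= p * dt_h             # assumed lossless energy update
        soc = energy / e_rated
        # SOC limit: SOC_min <= E / E_N <= SOC_max
        if soc < soc_min or soc > soc_max:
            return False
    return True


# A 100 kWh battery starting half full, limited to 20 kW either way.
ok = ess_schedule_feasible([20.0, -20.0, 10.0], 50.0, 100.0, 0.2, 0.9, 20.0, 20.0)
too_deep = ess_schedule_feasible([20.0, 20.0, -10.0], 50.0, 100.0, 0.2, 0.9, 20.0, 20.0)
```

The second schedule fails because two consecutive discharges would take the battery below the lower SOC limit.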
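The IL constraints limit the number of calls in a scheduling cycle, the number of consecutive calls (written as a sum over the window from $t$ to $t+T_{max,n}$), the interval between calls, and the periods in which the load may be called, where $O_{max,n}$ is the upper limit of the number of calls of the IL of the $n$th VPP in a scheduling cycle; $T_{max,n}$ and $T_{min,n}$ are the maximum number of consecutive calls and the minimum number of consecutive non-calls of the IL of the $n$th VPP, respectively; and $\Omega_{IL,n}$ is the set of non-callable periods of the IL of the $n$th VPP.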
The SL constraints are formulated with $P_{n,d}^{SL}$ and $P_{n,u}^{SL}$, the maximum load shifted out and in by the SL of the $n$th VPP, respectively, and with $P_{n,max}^{SL}$ and $\Omega_{SL,n}$, the maximum load shift amount and the set of non-shiftable periods of the SL of the $n$th VPP, respectively. Equations (18)-(20) comprise the power balance constraint of the SL (that is, the total load remains unchanged before and after the load is shifted), the load shifting power constraint, the maximum load shifting amount constraint, and the non-shiftable period constraint. The PV output constraint is
$$\left| P_{n,t}^{PV} - P_{n,t,pre}^{PV} \right| \le \delta_{PV}\, P_{n,t,pre}^{PV}$$
where $P_{n,t}^{PV}$ and $P_{n,t,pre}^{PV}$ are the actual active power output and the predicted power output of the distributed PV of the $n$th VPP, respectively, and $\delta_{PV}$ is the maximum allowable prediction error of PV power generation.
The power balance constraint is given in (22), where $P_{n,t}^{VPP}$ is the power transmitted by the $n$th VPP to the distribution network.
Because the VPP internal units involve multiple uncertain variables, a chance constraint on the spinning reserve is also considered, as given by (23).

$$\Pr\left\{ P_{n,t,sp}^{MT} + P_{n,t,pre}^{PV} \ge P_{n,t}^{PV} + P_{n,t,def}^{IL}\, \theta_n^t + K_{n,t}^{SL} \cdot P_{n,t}^{SL} \right\} \ge \alpha \tag{23}$$
where $P_{n,t,sp}^{MT}$ is the spare capacity provided by the gas turbine; $P_{n,t,def}^{IL}$ is the default power of the IL user; $K_{n,t}^{SL}$ is the ratio of the SL user's default power to the SL load shift; and $\alpha$ is the probability with which the chance constraint must hold.
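One common way to check a chance constraint of this kind is to estimate the probability by sampling and compare it with $\alpha$. The sketch below does this for a single VPP and period, following the inequality in (23) term by term. It is a simplified illustration, not the solution method of the paper: only the actual PV output is treated as random (normally distributed around the forecast and clipped at zero), while the IL default power and the SL default ratio are held fixed. The name `reserve_chance` and all numbers are hypothetical.

```python
import random


def reserve_chance(p_mt_sp, p_pv_pre, pv_sigma, il_def, theta, k_sl, p_sl,
                   n_samples=10000, seed=0):
    """Monte Carlo estimate of the probability inside the chance constraint."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    hits = 0
    for _ in range(n_samples):
        # Assumed uncertainty model: Gaussian PV output, never negative.
        p_pv = max(0.0, rng.gauss(p_pv_pre, pv_sigma))
        lhs = p_mt_sp + p_pv_pre
        rhs = p_pv + il_def * theta + k_sl * p_sl
        if lhs >= rhs:
            hits += 1
    return hits / n_samples


alpha = 0.95                            # illustrative confidence level
prob = reserve_chance(p_mt_sp=20.0, p_pv_pre=50.0, pv_sigma=5.0,
                      il_def=5.0, theta=1, k_sl=0.1, p_sl=20.0)
reserve_is_sufficient = prob >= alpha
```

Because the seed is fixed, a larger reserve can only increase the estimated probability for the same set of samples, which makes the estimate convenient for a simple search over the reserve level.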

C. THE MASTER-SLAVE GAME MODEL OF ELECTRICITY SALES COMPANY AND VPP
Based on the optimal decision-making model of the retail electricity company and the VPP optimal dispatching model established above, the comprehensive master-slave game model is described by (24). Equation (24) contains the three basic elements of a general game model: the participants $N$, the strategies $S$, and the income/payment $u$. The game participants $N$ in this article are the main player, the power selling company, and the subordinate player, the VPP. The strategy sets and revenues of the electricity sales company and of the VPP are defined separately. When the main player implements a strategy $x \in S_1$, the response of the slave player to this strategy is recorded as $y(x)$, and the main player generates its own response strategy $x(y(x))$. The strategies of both sides of the game are thus continuously coupled and iterated.
Finally, when the main player chooses the strategy $x^* \in S_1$ and the slave player chooses the strategy $y^* \in K(x^*)$, the pair $(x^*, y^*)$ is called the Nash equilibrium point of the master-slave game if and only if the following conditions hold: for all $(x, y) \in (S_1, S_2)$, $u_1(x^*, y^*) \le u_1(x, y)$; for all $(x^*, y) \in (S_1, S_2)$, $u_2(x^*, y^*) \le u_2(x^*, y)$; and for all $(x, y^*) \in (S_1, S_2)$, $u_2(x^*, y^*) \le u_2(x, y^*)$. At the Nash equilibrium point, the strategies of the two sides of the master-slave game form a fixed point, and neither player can further improve its payoff by unilaterally changing its strategy away from the fixed point.

IV. MODEL SOLUTION
Current methods for solving the game equilibrium point include the iterative search method, the backward induction method, the maximum-minimum value method, sequential linearization, and the elimination of dominated strategies [23], [24]. For the game planning problem investigated in this paper, an iterative search method based on the CPLEX solver and the YALMIP toolbox is used to identify the optimal decision-making plan. The specific solution steps are shown in the flow chart of Figure 3 and briefly explained below.
Step (1): Obtain the system parameters, including the electricity market prices, the electricity selling price, the user load, and the parameters of the VPP internal units.
Step (2): Carry out the user's demand-side response according to the demand-side load of the electricity sales company and the electricity price elasticity matrix.
Step (3): Randomly select an initial value $(x_0, y_0)$ in the strategy space $S \in R$ of the game model.
Step (4): Make the decision of the electricity sales company, taking the highest total revenue of the company as the optimization goal.
Step (5): Pass the obtained VPP scheduling plan $P_{n,t}^{VPP}$ to the VPP layer.
Step (6): Make the decision of each VPP, taking the lowest VPP call cost as the optimization goal.
Step (7): Judge whether the results of two adjacent rounds of iterative optimization, $c$ and $c-1$, are consistent. The specific calculation is given by (29). If $w_c \le w$ ($w \approx 0$) is satisfied, the results of the two rounds are considered the same. If this condition is not satisfied, the VPP layer plan is passed back to the sales company layer, and steps (4) to (7) are repeated.
Step (8): Export the electricity purchase and sale strategy of the electricity sales company and the internal optimized scheduling plan of each VPP.
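The loop formed by steps (3) to (8) can be sketched as follows. This is a toy illustration under stated assumptions rather than the paper's implementation: the two optimization subproblems, which the paper solves with YALMIP and CPLEX, are replaced by hypothetical closed-form responses on scalar strategies, and the consistency measure $w_c$ is assumed to be the largest absolute change of either strategy between two rounds, since (29) is not reproduced here.

```python
def iterative_search(leader_response, follower_response, x0, y0,
                     tol=1e-6, max_iter=200):
    """Alternate leader and follower decisions until two rounds agree."""
    x, y = x0, y0                             # step (3): initial strategies
    for c in range(1, max_iter + 1):
        x_new = leader_response(y)            # step (4): sales company decides
        y_new = follower_response(x_new)      # steps (5)-(6): VPP responds
        w_c = max(abs(x_new - x), abs(y_new - y))   # step (7), assumed measure
        x, y = x_new, y_new
        if w_c <= tol:
            return x, y, c                    # step (8): export the result
    raise RuntimeError("iterative search did not converge")


def toy_leader_response(y):
    """Hypothetical stand-in for the sales company's optimization."""
    return 1.0 + 0.5 * y


def toy_follower_response(x):
    """Hypothetical stand-in for the VPP's cost minimization."""
    return 2.0 + 0.5 * x


x_star, y_star, rounds = iterative_search(
    toy_leader_response, toy_follower_response, x0=0.0, y0=0.0)
```

For these toy responses the fixed point is $x = 8/3$, $y = 10/3$, and the iteration contracts toward it because the product of the two response slopes is below one. In a real model neither convergence nor uniqueness is guaranteed, which is why the sketch caps the number of rounds.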

A. MODEL PARAMETERS
In order to verify the feasibility and robustness of the proposed model and to facilitate the study of a specific VPP dispatching situation, a simulation analysis was carried out on a case study in which the power sales company aggregates DER into two VPPs for optimal decision-making. The active load data on the end-user side are day-ahead load forecast data of a city in southern China, with a conventional 24-hour load profile as shown in Figure 4. The penalty coefficients for upward and downward adjustment of the VPP are $\lambda^{+} = 1.0$ Yuan/kWh and $\lambda^{-} = 0.5$ Yuan/kWh, respectively. The specific parameters of the internal units of the two VPPs are listed in Table 1, and the PV output forecast is shown in Figure 5.
Taking one day as a cycle, the operation of the electricity sales company is studied in 24 time periods, i.e. $\Delta T = 1$ h. Electricity sales companies participate in day-ahead and real-time market electricity transactions as price takers, and sell electricity to users in the distribution network as price setters. The electricity sales company has a unified electricity purchase rate $\eta = 0.8$ in the day-ahead market. The electricity prices in the electricity market and the price at which the electricity sales company sells electricity in each time period are shown in Figure 6.

B. ANALYSIS OF OPTIMIZATION RESULTS
Using the YALMIP toolbox and the CPLEX solver, the model developed in this paper is solved, and the final operation decisions of the electricity sales company and the VPPs are shown in Figures 7 and 8, respectively. It can be seen from Figures 7 and 8 that the electricity purchased by the electricity sales company in the day-ahead market mainly depends on changes in load demand, which avoids the waste of electricity resources caused by excessive deviation. At the same time, the company purchases electricity in the real-time market mainly to meet the needs of the end-user load. The VPP output changes are mainly related to real-time market electricity prices and the game with the electricity sales company. During the peak hours of real-time market electricity prices, the VPPs deliver more power than during the trough periods. Because each VPP is limited by its own scale, the injected power is generally less than 1 MW. With the improvement of VPP technology and the increase in the number of VPPs participating in the cooperative game, the influence on the purchase and sale strategy will become larger, and the decision-making of the entire operating system will become more diverse.
Turning to the optimized VPP scheduling, the calls made by the two VPPs on their internal units are shown in Figures 9 and 10, respectively.
It can be seen from Figure 9 that the SL of the two VPPs chooses to inject active power when the real-time market electricity price is high and to draw active power when the electricity price is low. The output of the MT mainly serves to meet the power balance condition. During the period 13:00-14:00, the generated power of the distributed PV is relatively high, and the MT does not need to deliver much power. During the period 21:00-5:00, the distributed PV output power is almost zero, but because load demand and the real-time market electricity price decrease, the active power injected by the VPP into the distribution network can be mainly borne by the MT. Figure 10 reveals that, because of the constraints on the number of IL calls, the IL can only provide active power in 6-7 time periods and is not called at other times. Also, the amount of active power called from the IL is only related to the amount of load that can be reduced by the VPP. The ESS charges and discharges while satisfying its own power balance. During the 21:00-5:00 period it only maintains its own power balance and is not called by the VPP.
Table 2 shows a revenue comparison of the optimized operation of electricity selling companies under different scenarios. It can be seen from Table 2 that when power sales companies use VPPs to dispatch controllable loads and distributed photovoltaics in the operation of the power market, their operation strategies become more flexible and better benefits are achieved. With the increase in the number of VPPs participating in the cooperative game, more profit margins can be achieved for electricity sales companies.
A comparison between the virtual power plant's direct scheduling model for internal units and the cooperative game optimization model proposed in this paper is shown in Table 3.
It can be seen from Table 3 that when VPP and power selling companies are optimized through a master-slave game model, they can expand their operation methods, avoid uncertain risks, and reduce their own operating costs.

VI. CONCLUSION
In order to improve the operating mechanism of electricity sales companies and expand their profit margins, a model of decision-making and optimal dispatching for electricity sales companies based on master-slave game theory is proposed and verified on a case study. The results of the presented case show that the use of the virtual power plant mode to aggregate and manage distributed resources can enable distributed energy and controllable loads to better participate in market transactions, so that the electricity sales company can have greater benefits. In addition, by optimizing the operation strategy of the electricity sales company through the master-slave game, it can further avoid the uncertainty risk caused by the fluctuation of electricity prices, and make the operation of the entire operating system more stable.

RAN HUO received the bachelor's degree from Handan College, Handan, China, in 2019. Currently, she is pursuing the master's degree with China Three Gorges University. Her research interests include the optimal operation of power systems and the optimization of power quality of distribution networks.