Capacity Planning of Micro Energy Grid Using Double-Level Game Model of Environment-Economic Considering Dynamic Energy Pricing Strategy

Multi-energy unified planning is difficult because of the complex, conflicting relationships that arise from the coupling and complementary interaction of multiple forms of energy in micro energy grids (MEGs). Conflicts between the economy and the environment, as well as the impact of uncertain energy prices, must also be considered during MEG planning. To address these problems, this paper proposes a two-level environment–economic game planning model that considers dynamic energy pricing strategies. The model consists of an upper environment–economic planning level, based on a multi-strategy evolutionary game that considers the players' bounded rationality, and a lower dynamic energy pricing level, formulated as a leader-follower Stackelberg game between the MEG operator and the users. Based on energy hub theory, a multi-energy coupling matrix is established for an MEG that includes electricity, gas, heat, and cooling. The evolutionary stability strategy (ESS) of the planning results is analyzed using the replicator dynamic equation of the evolutionary game, and the existence of the Nash equilibrium is proven for the dynamic energy pricing Stackelberg game. Finally, the effectiveness of the proposed model is verified using simulations. Because dynamic energy pricing and environment–economic planning are considered, the amount of energy equipment required during peak hours is reasonably reduced, thereby reducing the total planning cost and improving energy utilization efficiency. At the same time, greenhouse gas (CO2) and air pollutant (NOx) emissions are reduced, decreasing the environmental impact.


I. INTRODUCTION
Multi-energy systems benefit from the synergy between different forms of energy. They have been widely regarded as an effective way of promoting the integration of renewable energy and improving energy utilization [1]-[3]. The [...] electricity-heat-cold storage to achieve electric-gas-heat-cold multi-energy conversion, which is the main way of consuming renewable energy. The unified planning of multi-energy systems compensates for the deficiency of planning each form of energy separately and fully considers the coupling and complementary relationships among them, thereby improving the efficiency of asset utilization and reducing the cost to society as a whole [4], [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Jenny Mahoney.
To further develop energy systems, demand side management must be strengthened and the actual participation of users must be realized. Moreover, peaks should be reduced, valleys should be filled, and low-carbon, high-efficiency systems must be promoted. It is expected that wholesale energy prices will fluctuate over time based on energy demand [6]. However, in most cases, energy operators charge consumers a fixed price, with price fluctuations being borne by energy operators. As consumers are not affected by changes in wholesale prices, their demand shows dramatic fluctuations. For example, the demand is low at night and peaks during the day. These fluctuations reduce power supply reliability and system efficiency and result in reduced profits for energy operators. In [7], the relationship between a traditional power demand response and dynamic electricity price was studied, and a game-based dynamic electricity price scheme was developed. Focusing on the level of the power trading market, a demand-response strategy based on real-time electricity prices was proposed in [8]. To investigate the influence of the demand response on the operation of multi-energy systems, the authors in [9] coordinated the incentivized demand response of peak electricity consumption and the gas load of a multi energy system. The instantaneous load charging scheme developed in [10] enabled consumers to minimize their personal energy costs by planning their future energy consumption. In [11], to address the intermediary role between the wholesale energy market and the final consumers (energy retailers), an energy pricing scheme was developed based on the Stackelberg game. For economic and security purposes, in [12], safety-constrained unit commitments were incorporated into the hourly demand response, including load responses related to hourly market prices. 
In [13], considering the extended modeling framework of an energy hub, a two-layer optimal scheduling model based on an integrated demand response was established to realize the optimal scheduling of power, natural gas, and thermal systems. For integrating energy resources in an energy hub [14], based on the noncooperative game, an integrated demand response program for power and natural gas networks was developed to balance the profits of public utility companies and customers.
However, most of the above studies [7]-[14] focus on dynamic electricity prices. First, when a multi-energy MEG is planned in a unified way, there is a coupling interaction between electricity, heat, and cooling, which differs from the setting of traditional research on dynamic electricity pricing. Second, these studies only optimize energy prices from an operational perspective and do not consider the impact of fluctuations in energy prices and user demand at the planning level. Third, although the use of various demand-response procedures to improve the operational economy of different multi-energy systems has been investigated, the impact on the environment has not been discussed.
The production of various forms of energy is one of the main emission sources of greenhouse gases and air pollutants. With an increased focus on climate change, there is an urgent need to plan the development of MEGs reasonably while ensuring both energy supply and environmental benefits. In response to this problem, various measures have been taken to reduce the impact of energy production on the environment [15]. To reduce NOx and SO2 emissions, different emission control technologies, such as selective catalytic reduction (SCR) and flue gas desulfurization (FGD), have been developed. Traditional environmental and economic planning is carried out under the operational constraints of power systems, whereas an MEG contains multi-energy coupled gas units, such as combined heat and power (CHP) and gas-fired boiler (GB) units, whose emissions need to be recalculated in the planning. In [16], the multi-period optimal energy flow of a carbon emission embedded neutralization energy system is studied. In [17], multi-objective optimization of a multi-energy network is proposed to jointly optimize the total operating cost and the total emissions generated by the network. To make full use of the economic and environmental advantages of the integrated energy system (IES), a large-scale IES optimal energy flow model that considers the carbon trading market is proposed in [18], and three decentralized algorithms are used to deal with limited information exchange. However, in [15]-[18], most of the environmental benefits are calculated by using a penalty function or an environmental cost. Because the environmental cost is smaller than the planning cost, it becomes negligible in multi-objective planning, which causes planning errors. In addition, the economy and the environment often conflict during planning, so the best way of balancing the two has become the focus of research.
Game theory is an important tool for capturing complex strategic interactions among market participants and for strategically analyzing situations that involve multiple independent participants. Game theory is mainly classified as traditional game theory, which assumes that the participants are completely rational, and evolutionary game theory, which considers the limited rationality of participants. Based on the characteristics of biological evolution, the evolutionary game considers that the interaction between individuals in a group is a dynamic process related to the game environment and individual state. When the evolutionary game is used to analyze a planning problem, the effects of decision uncertainty, information incompleteness, and the uncertainty of participants' decision-making skills can be considered more comprehensively; thus, it establishes a dynamic equilibrium. Therefore, the evolutionary game can more reasonably reflect the planning results in an actual scenario. Many related problems require attention during the detailed planning of MEGs. Existing studies have focused on power quality [19], DC converters [20], micro grid voltage deviation [21], and so on. This study focuses on the capacity optimization configuration of the main energy supply equipment and the energy conversion equipment of the MEG.
To address these challenges, this study focuses on the optimal allocation of key equipment in MEGs. It aims to reduce air pollutant and greenhouse gas emissions while considering the dynamic characteristics of energy prices on the energy side, so that both the economy and air pollutant emissions are taken into account. Based on the multi-strategy set evolutionary game theory of bounded rational decision-making, the method promotes the environmentally advantageous planning of MEGs by establishing a two-level game programming model that takes into account dynamic energy pricing and the environment-economic trade-off. Users are encouraged to participate in reducing the peak load, which realizes peak shaving and valley filling, improves energy efficiency, reduces greenhouse gas and air pollutant emissions, and realizes environmental value. Through multi-energy collaborative and complementary planning, operation, control, and scheduling, the MEG directly meets the energy needs of industrial users of different grades. It also significantly improves the level of renewable energy utilization and reduces carbon and nitrogen emissions.
The optimal configuration of the key equipment in MEGs at the planning level and the economic operation in MEGs at the pricing level are solved iteratively.
The contribution of this paper is twofold:
1) It combines changes in energy prices with planning scenarios and considers the impact of energy prices on MEG planning by employing a dynamic energy pricing method (covering electricity, heat, and cooling prices) for MEG operators based on the Stackelberg game. The MEG operator's energy sales revenue is played against the users' energy consumption cost and energy satisfaction, which yields a dynamic energy price that is satisfactory to both the MEG operator and the users and is included in the MEG planning scenario.
2) A bi-level game programming model is proposed. Based on the dynamic energy pricing game and considering the bounded rational decision-making of the players, the model balances the game relationship between the environment (CO2 and NOx emissions) and the economics of MEG planning. An optimal allocation method for key equipment in the MEG based on the multi-strategy set evolutionary game is proposed, and its evolutionary stability strategy is analyzed using the replicator dynamic equation.
The remainder of this paper is organized as follows. Section II provides the problem statement and the model structure. Section III presents the formulations of the bi-level operation model for MEGs. Then, Section IV introduces the evolution stable strategy method of the bounded rational decision-making multi-strategy set evolutionary game based on the replicator dynamic equation and the proof of the existence of the Nash equilibrium solution of the Stackelberg game. Section V provides the results of the case study. Finally, conclusions are presented in Section VI.

II. PROBLEM DESCRIPTION

A. PROBLEM STATEMENT
The framework of the problem is shown in Fig. 1.
The MEG is responsible for supplying the electricity, heating, and cooling loads in the region, and it is constructed by MEG operators. The profits of the MEG can be maximized by feeding surplus electricity into the grid and selling energy to users in the region. When the MEG cannot meet the load demand on its own, it can purchase electricity from the grid.
In this paper, the environment-economic conflict of planning and dynamic energy pricing are introduced into MEG planning, and a bi-level planning method based on game theory is proposed. In the upper planning level, a multi-strategy evolutionary game that considers bounded rational decision-making is proposed. In the lower pricing level, considering the demand-response characteristics of the integrated energy demand, a game is played between the profits of the MEG and the satisfaction of users, from which a new dynamic energy pricing strategy is formulated.
The advantages of the proposed method are as follows. First, it can reasonably balance the peak-valley energy difference, so the allocation of high-capacity equipment can be reduced and the capacity of key equipment in the MEG can be configured more reasonably. Second, it takes carbon and nitrogen oxide emissions into account, which means that environmental protection can be achieved by rationally allocating equipment. Third, the proposed multi-strategy evolutionary game is applied to MEG planning. The method considers the bounded rational decision-making of participants and the influence of the game environment and situation fluctuations on players, which avoids idealized game conclusions and allows the dynamic planning game to be explained. The proposed method is suitable for energy systems with various operating mechanisms, and it is easy to implement in practice.

B. MODEL STRUCTURE
Mathematically, the proposed game programming is a two-level optimization problem, as shown in Fig. 2. Differences in energy prices affect the economic benefits of MEGs, which in turn affects the planned proportions of different energy equipment. At the same time, the role of reasonable energy prices in load shedding also has an impact on the quantity of equipment planned for the MEG. Therefore, the upper planning level and the lower energy pricing level in MEG planning are inextricably linked.
The upper level is the planning level. In the optimal allocation of the MEG's key equipment, in addition to the economic problem of cost and benefit, the greenhouse gases and air pollutants emitted by CHP units, GBs, and other equipment are also worthy of attention. As the terminal of comprehensive energy utilization, the main advantage of the MEG is the combined use of various energy production elements and energy conversion coupling elements to maximize the comprehensive utilization of energy. On this basis, under the premise of ensuring regional energy supply and economy, the environmental value is incorporated at the planning level to realize an energy supply that is environmentally friendly. The upper level takes the economy and the environment of MEG planning as the game players, considers bounded rational decision-making, and applies the evolutionary game method of the multi-strategy set in order to achieve the maximum economy of the MEG over the whole year and the minimum emissions of CO2 and NOx.
The lower level focuses on dynamic energy pricing. Most of the energy produced by the MEG, such as electricity, heat, and cooling, can meet the internal demand, and internal energy conversion and utilization are relatively independent. Therefore, the electricity, heating, and cooling prices of the system can be determined by the MEG operator according to the real-time operation and demand. When determining the optimal energy price, MEG operators face uncertainties related to external energy supply prices, internal renewable energy generation output, and the load demand. Different energy prices also affect the operation and benefits of MEGs. Typically, demand-side studies consider only the benefits of the MEG operator when adjusting the demand-response strategy, while ignoring the responses of users to the price. On the one hand, the level of the energy price directly determines the load-response degree of end users. On the other hand, it also affects the economy of the system operation. Therefore, there is a natural contradiction between the two. In the lower level, with MEG operators and users as game players, operators maximize their net benefits by optimizing the energy purchase strategy, managing the operation status of the equipment, and setting the dynamic price of energy to be sold or purchased, while users minimize their total cost by developing their own energy consumption strategies. Because the operator decides the approach taken on the management side and the users respond to it, the game between the operator and the user agent can be described as a Stackelberg game.
The upper planning level obtains the initial planning results, and by combining them with the results of the lower level, the energy price uncertainty of the MEG can be considered. A reasonable energy price and the optimal planning results are obtained based on the game between multi-energy suppliers and users. This process is repeated until an equilibrium is reached.
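The leader-follower structure of the lower level can be illustrated with a minimal Python sketch. It is not the pricing model of this paper: the quadratic user utility (`alpha`, `beta`), the constant `unit_cost`, and the grid search are all assumptions introduced only to show how a leader chooses a price while anticipating the follower's best response.

```python
def follower_demand(price, alpha, beta):
    """Follower best response for the ASSUMED utility
    alpha*d - 0.5*beta*d**2 - price*d, maximized over d >= 0."""
    return max(0.0, (alpha - price) / beta)


def leader_price(alpha, beta, unit_cost, price_min, price_max, grid=800):
    """Leader (operator) searches the allowed price range and keeps the
    price with the highest profit, given the follower's response."""
    best_price, best_profit = price_min, float("-inf")
    for i in range(grid + 1):
        price = price_min + i * (price_max - price_min) / grid
        demand = follower_demand(price, alpha, beta)   # follower reacts
        profit = (price - unit_cost) * demand          # leader payoff
        if profit > best_profit:
            best_price, best_profit = price, profit
    return best_price, best_profit


# Hypothetical numbers, chosen only for illustration.
price, profit = leader_price(alpha=10.0, beta=1.0, unit_cost=2.0,
                             price_min=2.0, price_max=10.0)
```

For this assumed utility the follower's response is linear in the price, so the leader's profit is a concave quadratic and the search settles at the midpoint between `unit_cost` and `alpha`.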

III. MODEL FORMULATIONS
This section presents the detailed formulations of the proposed bi-level mathematical model, and the mathematical model of the energy coupling matrix based on the MEG energy hub is shown.

A. ENVIRONMENTAL ECONOMY GAME MODEL
In the upper planning level, using the energy price obtained by the lower level, a multi-strategy set evolutionary game model considering bounded rational decision-making is established to balance the planning conflict between the economy and the environment. The game players of the upper planning level are the economic and environmental implications of MEG planning, and the game strategy space is determined by the number of energy components to be planned. The payoff function of the players is as follows:

1) ECONOMY PLAYER PAYOFF FUNCTION
In the MEG economy player payoff function, the equivalent year is used as the time scale. The cost part considers the component installation costs, operation and maintenance costs, and equipment replacement costs associated with the MEG construction. The revenue part considers the sale of electricity, heat, and cooling energy to users after the MEG construction. The MEG economy payoff function is expressed as Equation (1):

$$U_1^{cost} = C_{invs} + C_{om} + C_{re} + C_{pur} \quad (1)$$

where $C_{invs}$ is the total annual installation cost of the MEG components to be planned; $C_{om}$ is the annual operation and maintenance cost of the MEG components to be planned; $C_{re}$ is the annual equipment replacement cost; and $C_{pur}$ is the MEG energy sales revenue. In the calculation of each part, $i$ indexes the collection of components to be planned; $N_i$ is the number of components to be planned; $p_i$ is the unit capacity of the components to be planned; $U_i$ is the unit investment cost of the components to be planned; $L_i$ is the service life of the components to be planned; $\delta$ is the maintenance cost coefficient; $R_c$ is the number of equipment replacements; $I_e$ is the cost of the electricity purchased by the MEG from the grid company; $I_{gas}$ is the cost of the gas purchased by the MEG from the natural gas company; $c_{grid}$ is the electricity selling price of the grid company; and $c_{gas}$ is the gas selling price of the natural gas company. $I^m_e$, $I^m_h$, and $I^m_c$ respectively represent the proceeds from the sale of electricity, heating, and cooling by the MEG to users; $c^m_e$, $c^m_h$, and $c^m_c$ respectively represent the electricity, heating, and cooling prices charged by MEG operators to energy users, and their values are the output of the lower-level game.
VOLUME 8, 2020

2) ENVIRONMENTAL PLAYER PAYOFF FUNCTION
Considering the environmental impact of the MEG, gas emissions from fossil fuel combustion should not be ignored. In the MEG, the CHP unit and GB are the main emission sources of CO2, NOx, and SO2. CO2 is a major greenhouse gas, and its emissions should be controlled to alleviate global warming. Currently, various policies, such as carbon taxes and carbon trading schemes, have been developed to reduce CO2 emissions. Compared with coal-fired units, the SO2 emission intensity of gas-fired units is negligible, and FGD technology can absorb more than 90% of the SO2 contained in the flue gas of coal-fired units. It is assumed that the gas-fired boilers in the CHP system are equipped with low-NOx burners (LNBs) and selective catalytic reduction (SCR) systems to reduce nitrogen emissions [15]. Therefore, this study focuses only on the CO2 and NOx emissions of the MEG. The carbon emissions related to grid losses are traced back and allocated to generators, while the carbon emissions caused by gas losses are relatively small; therefore, they are not considered in the energy transmission grid. In addition, the carbon emissions of consumers do not need to be optimized because they are constant. In conclusion, generators, thermal power plants, and gas furnaces contribute to all types of carbon emissions. The CO2 emissions of CHP units and GB units are proportional to their power output [18].
Net CO2 emissions of the MEG:
Net NOx emissions of the MEG:
where $\mu^{CO_2}_{G}$ is the CO2 emission coefficient for the grid company; $P^{pur}_{grid}$ is the electricity purchased from the grid by the MEG; $\mu^{CO_2}_{CHP}$ is the CO2 emission coefficient for the CHP unit; $\mu^{CO_2}_{GB}$ is the CO2 emission coefficient for the gas-fired boiler; $\lambda^{CO_2}_{e}$ is the carbon emission quota per unit of electric power; $\lambda^{CO_2}_{h}$ is the carbon emission quota per unit of thermal power; $\mu^{NO_x}_{CHP}$ is the NOx emission coefficient for the CHP unit; and $\mu^{NO_x}_{GB}$ is the NOx emission coefficient for the gas-fired boiler.
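As a rough illustration of how the coefficients defined above could be combined, the following Python sketch assumes that gross emissions are linear in the purchased and generated power and that the quota is subtracted in proportion to the electric and thermal output. This particular combination, the function names, and the numbers are assumptions for illustration, not the expressions of this paper.

```python
def net_co2(p_grid, p_chp_e, p_chp_h, p_gb_h,
            mu_grid, mu_chp, mu_gb, quota_e, quota_h):
    """Net CO2 = gross emissions minus the free quota (ASSUMED linear form)."""
    # Gross emissions: grid purchase plus CHP and GB output, each
    # multiplied by its emission coefficient.
    gross = mu_grid * p_grid + mu_chp * p_chp_e + mu_gb * p_gb_h
    # Quota: allowance per unit of electric and thermal power produced.
    allowance = quota_e * p_chp_e + quota_h * (p_chp_h + p_gb_h)
    return gross - allowance


def net_nox(p_chp_e, p_gb_h, mu_chp, mu_gb):
    """NOx emitted by the CHP unit and the gas-fired boiler (ASSUMED linear form)."""
    return mu_chp * p_chp_e + mu_gb * p_gb_h


# Hypothetical hourly values and coefficients.
co2 = net_co2(p_grid=100.0, p_chp_e=50.0, p_chp_h=60.0, p_gb_h=40.0,
              mu_grid=0.9, mu_chp=0.5, mu_gb=0.3, quota_e=0.1, quota_h=0.05)
nox = net_nox(p_chp_e=50.0, p_gb_h=40.0, mu_chp=0.002, mu_gb=0.001)
```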

B. DYNAMIC ENERGY PRICE GAME MODEL
In the lower-level model, MEG operators interact with users through a Stackelberg game. As the leaders of the game, MEG operators have a full understanding of the revenue model of the users, who are the followers. The model built in this study relies on the following assumptions:
a. The daily electricity consumption in the MEG remains basically unchanged before and after the implementation of the dynamic energy price; that is, the implementation of the dynamic energy price does not affect the total electricity consumption of users.
b. The transferred electricity is evenly distributed along the time axis of the different periods.
c. When the total consumer demand is constant, the marginal cost of electricity for the MEG operators remains unchanged.

1) LOAD ELASTICITY
The sensitivity of the customers' energy demand to different energy prices is usually described by the elasticity [22].
Some loads cannot be transferred from one cycle to another but can only be reduced. As a result, these loads respond to price changes only in terms of load reduction. This behavior is called self-elasticity and always has a negative value.
However, some loads can be transferred from one cycle to another and are called movable loads. In addition to reducing the load, these loads can also be transferred within the dispatching range. This response to price changes (load transfer) is called cross-elasticity and is always positive.
where $E_{tt}$ is the elasticity of the load $L_t$ at time $t$ with respect to the energy price; when it relates the load to the price of another period, its value is positive and it is the cross-elasticity, and when it relates the load to the price at the same time $t$, its value is negative and it is the self-elasticity. $L_t$ and $d_t$ are respectively the initial and modified energy consumption of the load demand at time $t$, and $e^{ini}_t$ and $e_t$ are respectively the initial and modified energy price at time $t$.
The customer energy consumption under the maximum net income follows from these elasticities. Note that we assume that the demand for heating and cooling loads in the energy hub (EH) is inelastic; that is, the self-elasticity and cross-elasticity of heating and cooling loads are equal to zero.
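To make the role of self- and cross-elasticity concrete, the sketch below uses a linear price-elasticity response that is common in the demand-response literature. Whether this is exactly the expression used in this paper is an assumption; the function name and the two-period numbers are hypothetical.

```python
def responsive_load(initial_load, initial_price, new_price, elasticity):
    """ASSUMED linear model:
    d_t = L_t * (1 + sum_s E[t][s] * (e_s - e_s_ini) / e_s_ini),
    where E[t][t] < 0 is the self-elasticity and E[t][s] > 0 (s != t)
    is the cross-elasticity."""
    periods = len(initial_load)
    modified = []
    for t in range(periods):
        relative_change = sum(
            elasticity[t][s] * (new_price[s] - initial_price[s]) / initial_price[s]
            for s in range(periods)
        )
        modified.append(initial_load[t] * (1.0 + relative_change))
    return modified


# Two periods: the price rises by 20% in period 0 only.
load = responsive_load(
    initial_load=[100.0, 100.0],
    initial_price=[1.0, 1.0],
    new_price=[1.2, 1.0],
    elasticity=[[-0.2, 0.1],
                [0.1, -0.2]],
)
```

In this example the load of the more expensive period falls (self-elasticity) while part of it moves to the other period (cross-elasticity). Setting every entry of `elasticity` to zero reproduces the inelastic heating and cooling loads assumed in the text.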
For the MEG, in order to guide consumers' energy consumption, the prices of different energy sources can be set within an acceptable range (15). The range of each energy price is obtained by multiplying the initial energy price by acceptable price adjustment factors. Similarly, in response to the changes in energy prices, only some consumers will actively change their energy consumption patterns (16); this is characterized by multiplying the initial energy consumption by load adjustment factors. In addition, in order to maintain customer comfort and simplify the calculation, the total consumption of each energy source is assumed to remain the same as in its original consumption pattern (17).
where $\varphi_1$ and $\varphi_u$ are the acceptable price adjustment factors for the load, and $\phi_1$ and $\phi_u$ are the adjustment factors of the load.
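The three conditions described above (price band, load band, and unchanged total consumption) can be checked with a short Python sketch. The function and argument names are hypothetical, and treating the adjustment factors as lower and upper multipliers of the initial values is an assumption based on the description in the text.

```python
def is_feasible(initial_price, price, initial_load, load,
                price_lo, price_hi, load_lo, load_hi, tol=1e-9):
    """Check a candidate price/load profile against three conditions:
    price band, load band, and conservation of total consumption."""
    for e0, e, l0, d in zip(initial_price, price, initial_load, load):
        # Price must stay within the band around the initial price.
        if not (price_lo * e0 - tol <= e <= price_hi * e0 + tol):
            return False
        # Load must stay within the band around the initial load.
        if not (load_lo * l0 - tol <= d <= load_hi * l0 + tol):
            return False
    # Total consumption is assumed to be unchanged by the pricing scheme.
    return abs(sum(load) - sum(initial_load)) <= tol


ok = is_feasible([1.0, 1.0], [1.2, 0.8], [100.0, 100.0], [90.0, 110.0],
                 price_lo=0.5, price_hi=1.5, load_lo=0.8, load_hi=1.2)
```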

2) MEG OPERATOR PAYOFF FUNCTION
MEG operators desire to maximize profits, minimize load deviations, and fulfill their obligations to serve the public and satisfy power users. Therefore, the payoff function of the MEG operator is its profit, which includes the revenue from electricity, heat, and cooling sales, as well as the revenue from interacting with the power grid company. The load-deviation term is a cost borne by the MEG operators, which should ideally be as low as possible so that the MEG operators can always meet a predictable demand pattern; $d_e$, $d_h$, and $d_c$ are the load responses of users to dynamic prices, which are calculated using the load-response model and used in the optimization process. Here, $d_{t,i}$ is the load at time $t$, where $i = 1, 2, 3$ represents the electric, heating, and cooling loads, respectively, and $d_{avg,i}$ is the daily average load.

3) USERS PAYOFF FUNCTION
Users aim to maximize their satisfaction while minimizing the cost incurred on energy such as electricity, heating, and cooling. Their payoff function, expressed as $U_{User}$, is the negative of the cost charged by the company. $S_k$ is the users' satisfaction function [7], in which $\alpha_k$ and $\beta_k$ are the satisfaction coefficients; their values are related to the load elasticity and are calculated by (22) and (23).

4) PRICE CONSTRAINT
MEG operators determine whether the energy price meets the following constraints, where $c^m_{e,\min}$ is the minimum electricity price for MEG operators and $c^m_{e,\max}$ is the maximum electricity price for MEG operators. To guarantee the interests of power users, the average price of energy in the park is not higher than that of energy purchased from the external distribution system.

C. STRUCTURE AND MODELING OF MEG
The MEG includes energy production, coupling, and consumption equipment and incorporates the transformation between different energy sources. To facilitate the study of the coupling relationship between different energy sources in the MEG, this study introduces the concept of the energy hub [2], which is used to represent the relationship between the devices in the MEG. The energy hub of the MEG shown in Figure 3 is taken as an example in this study. The power production element is a photovoltaic (PV) supply, and the coupling elements for electricity, heat, and cold energy conversion comprise a CHP unit, a GB, an electric chiller (EC), and a lithium bromide absorption chiller (AC). The energy storage elements include an energy storage system (ESS) and a thermal storage system (TSS).

1) PHOTOVOLTAIC OUTPUT MODEL
The PV output power considered is related only to the radiation intensity and the ambient temperature [26], where $N_{pv}$ is the number of PV cells installed; $p_{STC}$ is the maximum test power under standard test conditions (irradiation intensity of 1 kW/m², ambient temperature of 25 °C); $G_C$ is the irradiation intensity on the PV cells; $T_C$ is the surface temperature of the PV cells, which by default is taken to be equal to the surrounding air temperature; $T_{STC}$ is the reference temperature (25 °C in this case); $k$ is the power temperature coefficient (−0.35%/K in this case); and $G_{STC}$ is the irradiation intensity under standard test conditions, taken as 1 kW/m².
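The quantities defined above fit the widely used linear PV model, in which the rated power is scaled by the irradiance ratio and corrected for temperature. The following Python sketch assumes that form; it is an illustration with hypothetical input values, not a reproduction of the expression in [26].

```python
def pv_output(n_pv, p_stc, g_c, t_c, g_stc=1.0, t_stc=25.0, k=-0.0035):
    """ASSUMED linear PV model:
    P = N_pv * p_STC * (G_C / G_STC) * (1 + k * (T_C - T_STC)).
    Irradiance in kW/m^2, temperature in deg C, k per kelvin."""
    return n_pv * p_stc * (g_c / g_stc) * (1.0 + k * (t_c - t_stc))


# At standard test conditions the array delivers its rated power.
rated = pv_output(n_pv=10, p_stc=0.3, g_c=1.0, t_c=25.0)

# Half the irradiance and a 20 K warmer cell reduce the output.
hot_and_dim = pv_output(n_pv=10, p_stc=0.3, g_c=0.5, t_c=45.0)
```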

2) BATTERY STORAGE MODEL
The working state of the ESS at time t is described by the remaining charge after t−1 and the charge and discharge power in period t; a lead-acid battery is used as the energy storage element. When the energy storage battery is charged, the stored energy of the system at time t can be expressed as in [26], where $E^t_{ESS}$ and $E^{t-1}_{ESS}$ respectively represent the remaining energy of the ESS at the end of t and t−1, $\varepsilon$ is the self-leakage rate of the battery per hour, $P^{t,ch}_{ess}$ is the charging power of the battery at time t, and $\theta_{ch}$ is the charging efficiency of the battery. During the discharge of the energy storage battery, the stored energy of the system at time t is expressed analogously, where $P^{t,dch}_{ess}$ is the discharge power of the storage battery at time t, and $\theta_{dch}$ is the discharge efficiency of the storage battery. The charged state of the ESS is expressed by formula (29), where $SOC^t_{ESS}$ is the charge state of the ESS after period t, and $E_{bat,rated}$ is the rated capacity of each battery.
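A minimal Python sketch of a storage update that is consistent with the quantities defined above is given below. It assumes the common form in which the previous energy decays by the self-leakage rate, charging is multiplied by the charging efficiency, and discharging is divided by the discharge efficiency; the one-hour step `dt` and the use of a single total rated capacity `e_rated` are simplifying assumptions.

```python
def ess_step(e_prev, p_ch, p_dch, leak, eff_ch, eff_dch, dt=1.0):
    """One-period energy balance of the battery (ASSUMED standard form)."""
    if p_ch > 0 and p_dch > 0:
        raise ValueError("battery cannot charge and discharge in the same period")
    retained = e_prev * (1.0 - leak)      # self-leakage over the period
    charged = p_ch * eff_ch * dt          # energy actually stored
    discharged = p_dch / eff_dch * dt     # energy drawn from the cells
    return retained + charged - discharged


def soc(e_now, e_rated):
    """State of charge as the ratio of stored energy to rated capacity."""
    return e_now / e_rated


after_charge = ess_step(e_prev=50.0, p_ch=10.0, p_dch=0.0,
                        leak=0.01, eff_ch=0.9, eff_dch=0.9)
after_discharge = ess_step(e_prev=50.0, p_ch=0.0, p_dch=9.0,
                           leak=0.0, eff_ch=0.9, eff_dch=0.9)
```

The same structure, with the self-loss coefficient and the heat storage and release efficiencies in place of the battery parameters, would apply to a thermal storage unit.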

3) HEAT STORAGE SYSTEM MODEL
The state of the TSS at time t is related to the state at the previous time and the coefficients of heat storage and heat release, where $Q^t_{ts}$ and $Q^{t-1}_{ts}$ respectively represent the heat energy stored at times t and t−1; $\mu$ is the self-loss coefficient of the TSS; $Q_{in}$ and $Q_{out}$ represent the heat stored in and released from the TSS, respectively; and $\eta_{in}$ and $\eta_{out}$ represent the heat storage and release efficiencies of the TSS, respectively.

4) ELECTRIC CHILLER OUTPUT MODEL
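The cold energy output of the electric chiller is:

$$P_{EC} = E_{EC} R^{COP}_{EC}$$

where $P_{EC}$ is the electric chiller output; $E_{EC}$ is the electricity consumed by the electric chiller; and $R^{COP}_{EC}$ is the energy efficiency ratio of the electric chiller.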

5) ABSORPTION CHILLER OUTPUT MODEL
The cold energy output is:

$$P_{AC} = Q_{AC} R^{COP}_{AC}$$

where $P_{AC}$ is the absorption chiller output; $Q_{AC}$ is the heat consumed by the absorption chiller; and $R^{COP}_{AC}$ is the energy efficiency ratio of the absorption chiller.

D. ENERGY HUB MODEL
To study the coupling relationship between different energy sources in the MEG, we introduce the concept of the energy hub [2]. The essence of the multi-energy coupling hub is to describe the functional relationship between the multi-energy input and the multi-energy output of the MEG. Because the system contains only energy transmission, conversion, and storage devices, a coupling matrix, denoted C, can be used to represent the ideal steady-state model of the multi-energy coupling hub without considering the transients of the energy conversion process.
The input and output parts of the energy are represented by $P = [P_1, P_2, \ldots, P_m]^T$ and $L = [L_1, L_2, \ldots, L_n]^T$, respectively. The coupling matrix C represents the energy conversion relationship from the input to the output. The element of the coupling matrix is the coupling factor $c_{ij}$, which represents the ratio of the type $j$ energy output to the type $i$ energy input. The MEG studied in this paper includes energy storage (electricity storage and heat storage) devices, which are connected at the output of the MEG. To account for the influence of the energy storage devices on the energy hub, a correction term S is added, where S is a column vector added at the output, and Equation (33) is amended accordingly. Taking the EH of the MEG system shown in Figure 3 as an example, the input energy includes electric power and natural gas, and electric, heating, and cooling energy are output from the EH through energy conversion elements such as the CHP, GB, AC, and EC. In the relationship between the input matrix P and the output matrix L, $L_e$, $L_h$, and $L_c$ represent the electricity, heating, and cooling loads, respectively; $P_{grid}$ is the tie-line power of the distribution network; $P_{ES}$ and $P_{HS}$ are the total outputs of the electric storage and heat storage equipment, respectively (positive for charging, negative for discharging); $P^e_{CHP}$ is the electricity output of the CHP; $P^H_{CHP}$ is the heat output of the CHP; $P_{gas}$ is the natural gas power used in the MEG; $\lambda$ is the natural gas distribution coefficient; $w$ is the thermal energy distribution coefficient; and $\eta^e_{CHP}$, $\eta^h_{CHP}$, and $\eta^h_{GB}$ represent the power generation efficiency of the cogeneration unit, the heat production efficiency of the cogeneration unit, and the heat generation efficiency of the GB, respectively.
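The idea of mapping inputs to outputs through a coupling matrix and then correcting for storage can be illustrated with a simplified Python sketch. The matrix below is not the coupling matrix of this paper: it assumes that $\lambda$ is the share of gas sent to the CHP (the rest goes to the GB), that $w$ is the share of produced heat sent to the absorption chiller, and that the electric chiller and PV are omitted for brevity. Storage power is subtracted at the output, following the stated convention that charging is positive.

```python
def mat_vec(matrix, vector):
    """Plain matrix-vector product, L = C * P."""
    return [sum(c * p for c, p in zip(row, vector)) for row in matrix]


def hub_outputs(p_grid, p_gas, lam, w, eta_e_chp, eta_h_chp, eta_h_gb,
                cop_ac, p_es=0.0, p_hs=0.0):
    """Simplified energy hub (ASSUMED structure): inputs are grid power and
    gas, outputs are electricity, heat, and cooling."""
    # Heat produced per unit of gas by the CHP and the GB together.
    heat_per_gas = lam * eta_h_chp + (1.0 - lam) * eta_h_gb
    coupling = [
        [1.0, lam * eta_e_chp],               # electricity row
        [0.0, (1.0 - w) * heat_per_gas],      # heat row
        [0.0, w * heat_per_gas * cop_ac],     # cooling row (via the AC)
    ]
    storage = [p_es, p_hs, 0.0]               # positive when charging
    raw = mat_vec(coupling, [p_grid, p_gas])
    return [x - s for x, s in zip(raw, storage)]


# Hypothetical operating point: battery charging, heat storage discharging.
l_e, l_h, l_c = hub_outputs(p_grid=100.0, p_gas=200.0, lam=0.5, w=0.25,
                            eta_e_chp=0.3, eta_h_chp=0.4, eta_h_gb=0.8,
                            cop_ac=1.2, p_es=10.0, p_hs=-5.0)
```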
The form of the matrix is as follows:
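To make the mapping from inputs to outputs concrete, the following minimal sketch builds a coupling matrix for a simplified hub and evaluates the output with the storage correction. It is an illustration under stated assumptions, not necessarily the matrix used in this paper: the electric chiller is omitted, the absorption chiller is given a hypothetical coefficient `cop_ac`, the gas input is split between CHP and GB by λ, the produced heat is split between the heat load and the absorption chiller by w, and the storage vector is subtracted so that charging (positive) reduces the energy delivered to the loads. All numerical values are invented for the example.

```python
def coupling_matrix(lam, w, eta_e_chp, eta_h_chp, eta_h_gb, cop_ac):
    """Rows: electricity, heat, cooling outputs. Columns: grid power, gas power.

    Assumed topology (hypothetical): a share lam of the gas feeds the CHP and
    the rest feeds the GB; a share w of the produced heat drives the AC.
    """
    heat_per_gas = lam * eta_h_chp + (1.0 - lam) * eta_h_gb
    return [
        [1.0, lam * eta_e_chp],             # electricity: grid + CHP generation
        [0.0, (1.0 - w) * heat_per_gas],    # heat delivered to the heat load
        [0.0, w * cop_ac * heat_per_gas],   # cooling produced by the AC
    ]


def hub_output(C, P, S):
    """Evaluate L = C P - S, with S positive when storage is charging."""
    return [sum(c * p for c, p in zip(row, P)) - s for row, s in zip(C, S)]


# Example values (all hypothetical)
C = coupling_matrix(lam=0.5, w=0.2, eta_e_chp=0.3, eta_h_chp=0.4,
                    eta_h_gb=0.9, cop_ac=1.0)
P = [100.0, 200.0]        # [P_grid, P_gas] in kW
S = [10.0, -4.0, 0.0]     # [P_ES, P_HS, no cold storage] in kW
L = hub_output(C, P, S)   # [L_e, L_h, L_c]
print(L)
```

In this sketch the electric storage is charging and therefore lowers the electricity available to the load, while the heat storage is discharging and adds to the heat supply.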

5) ENERGY STORAGE EQUIPMENT CONSTRAINT
These include the output constraints, state-of-charge constraints, and initial and final state constraints.
where $P_{C,max}$ and $P_{D,max}$ represent the charging and discharging power limits, respectively; $S_{min}$ and $S_{max}$ represent the minimum and maximum charge coefficients, respectively; and $W^{t}_{i}$ is the energy stored in equipment i at time t, where i denotes the electricity storage or the heat storage.
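The three groups of storage constraints can be checked on a candidate schedule as sketched below. This is a minimal illustration under assumptions that are not taken from the paper: storage is lossless, power is positive when charging, the state-of-charge bounds are `s_min` and `s_max` times a hypothetical rated `capacity`, and the final stored energy must equal the initial one.

```python
def storage_schedule_ok(power, w0, capacity, p_c_max, p_d_max,
                        s_min, s_max, dt=1.0, tol=1e-9):
    """Return True if a storage schedule respects all three constraint groups.

    power: per-period storage power (positive = charging, negative = discharging)
    w0:    stored energy at the start of the horizon
    """
    w = w0
    for p in power:
        # Output constraints: charging and discharging power limits
        if p > p_c_max + tol or -p > p_d_max + tol:
            return False
        w += p * dt  # lossless energy balance (assumption)
        # State-of-charge constraints
        if w < s_min * capacity - tol or w > s_max * capacity + tol:
            return False
    # Initial and final state constraint: end where the horizon started
    return abs(w - w0) <= tol


# Hypothetical 4-period schedule for a 400 kWh store
feasible = storage_schedule_ok([50, 50, -40, -60], w0=100, capacity=400,
                               p_c_max=60, p_d_max=60, s_min=0.1, s_max=0.9)
print(feasible)
```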

6) CLIMBING RATE CONSTRAINT
When increasing the output: When reducing the output: where $R_{up,i}$ is the upward climbing (ramp) rate limit of equipment i; $R_{down,i}$ is the downward climbing (ramp) rate limit of equipment i; and equipment i represents the grid, CHP, and GB.
where the value $N_{max}$ is determined according to the actual needs of the system, with the goal of avoiding redundant equipment capacity under the maximum load.
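The climbing rate limits described above bound the change in output between consecutive periods. A minimal check, assuming per-period outputs and constant limits `r_up` and `r_down` (hypothetical names and values), might look like this:

```python
def ramp_ok(output, r_up, r_down, tol=1e-9):
    """Return True if every period-to-period change respects the ramp limits."""
    for prev, cur in zip(output, output[1:]):
        step = cur - prev
        # Increase bounded by r_up, decrease bounded by r_down
        if step > r_up + tol or -step > r_down + tol:
            return False
    return True


# Hypothetical CHP output profile in kW with 60 kW per period limits
print(ramp_ok([100, 150, 180, 120], r_up=60, r_down=60))
```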

8) POWER TIE LINE CONSTRAINT
where $P_{grid,min}$ and $P_{grid,max}$ are the minimum and maximum values of the allowable transmission power of the tie line, respectively.

IV. GAME EQUILIBRIUM
Game theory is a mathematical theory that studies the cooperation and conflict of interests among different intelligent subjects. An evolutionary game is a game analysis method based on the bounded rationality of the subjects who participate in the game. It borrows from evolutionary theory in biology and holds that the interaction among individuals in a group is a dynamic process related to the game environment and the individual state. As opposed to the equilibrium state of a traditional game, the evolutionary game uses this dynamic process to study how the players adjust their strategies during the game. Bounded rationality is considered a hybrid between complete rationality and incomplete rationality; it comprehensively considers the influence of decision uncertainty, information incompleteness, and uncertainty in the participants' decision-making skills, and it emphasizes dynamic equilibrium.
In addition, an evolutionary multi-strategy game planning method is proposed that comprehensively reflects the potential optimal solution set under long-term planning. The proposed approach analyzes the evolution state of multiple strategies and finally determines the evolutionarily stable strategy. By combining the ideas of multiple strategies and evolutionary games, it resolves the contradiction between the continuous nature of the planning problem and the limited strategy set of evolutionary games. Therefore, evolutionary games can more reasonably reflect the behavior of game participants in the actual state and are more suitable for planning scenarios.
At the lower level, users respond to the MEG pricing results, so the two sides do not make decisions simultaneously, and the master-slave (leader-follower Stackelberg) game better matches the needs of the model. In the proposed model, the two levels are iteratively optimized to achieve a balance. The balance is considered to be reached when the planned quantities of the MEG for each period in two successive iterations are sufficiently close.
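The alternation between the two levels and its stopping rule can be sketched as a simple fixed-point loop. The sketch below is only an illustration of the convergence criterion: `plan_upper` and `price_lower` are hypothetical callables standing in for the upper planning level and the lower pricing level, the plan is treated as a list of real numbers, and the toy functions are invented so that the loop has a known fixed point.

```python
def iterate_levels(plan_upper, price_lower, plan0, tol=1e-3, max_iter=100):
    """Alternate the two levels until successive plans differ by at most tol."""
    plan = list(plan0)
    for k in range(1, max_iter + 1):
        prices = price_lower(plan)       # lower level reacts to the current plan
        new_plan = plan_upper(prices)    # upper level re-plans under those prices
        gap = max(abs(a - b) for a, b in zip(new_plan, plan))
        plan = list(new_plan)
        if gap <= tol:
            return plan, prices, k
    raise RuntimeError("no balance reached within max_iter iterations")


# Toy stand-ins (hypothetical): price rises with planned capacity,
# and planned capacity falls as the price rises.
def toy_price(plan):
    return [0.5 + 0.01 * plan[0]]


def toy_plan(prices):
    return [20.0 - 10.0 * prices[0]]


plan, prices, iterations = iterate_levels(toy_plan, toy_price, [0.0])
print(plan, prices, iterations)
```

Because the toy mapping is a contraction, the loop settles near the fixed point 15/1.1 in a handful of iterations; a real planning problem offers no such guarantee, which is why the sketch raises an error when the iteration limit is hit.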

A. MULTI-STRATEGY SET EVOLUTION GAME MODEL
Unlike the traditional game, which focuses on the Nash equilibrium, the evolutionary game uses dynamic processes to study how participants adjust their strategies during the game. Accordingly, the central solution concept in the evolutionary game is the "evolutionarily stable strategy" rather than the "Nash equilibrium". Therefore, in the analysis of the upper-level evolutionary game, we analyze its replicator dynamic equation and its evolutionarily stable strategy following the evolutionary game concept.
In the evolutionary game analysis of the bounded rational decision-making of the economy and the environment in MEG planning, the two game participants, economy and environment, are mapped into two populations. No game structure or game scenario is imposed, so there is no artificial intervention in the decision-making of the economy and the environment; that is, there is no restriction on the decision-making approach or on the information on which decisions are based. The analysis fully considers the uncertainty of decision-making methods, the incompleteness of information, and other uncertain factors in seeking the decision-making results and the evolutionarily stable strategy [23].
Based on the above settings, the new method is based on three elements of the game (that is, participants, strategy set, and payoff function) and the basic concept of the evolutionary game. An evolutionary game model is established between an economic player and an environmental player in MEG planning.

1) PLAYERS
In the evolution game analysis, the game players are biological groups, and the economic and environmental players of MEG planning are mapped into two populations, which are recorded as P 1 and P 2 . There are many individuals in the population, each of which produces its own strategy and proceeds with the random repetitive game.

2) MULTI-POLICY SET
Under the constraints, each population randomly generates n strategies, and each strategy is a set of installation numbers of the micro sources. The strategy set of the economic player population $P_1$ is recorded as $S_1$, and that of the environmental player population $P_2$ is recorded as $S_2$. The strategy set is characterized as:
Based on the principle of the evolutionary game, the adaptability of the different strategies in the population is analyzed within the maximum evolution time, and the evolutionarily stable strategy is finally determined.
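Random generation of a strategy set can be sketched as follows. This is a minimal illustration, assuming each strategy is a tuple of integer installation numbers for the seven device types planned in the case study and that each number is drawn uniformly between 0 and a per-device upper bound; the bounds in `n_max` and the uniform sampling are assumptions, not values from the paper.

```python
import random

DEVICES = ("PV", "ESS", "CHP", "GB", "TSS", "EC", "AC")


def generate_strategy_set(n, n_max, seed=None):
    """Draw n random strategies; each is a tuple of installation numbers."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [tuple(rng.randint(0, n_max[d]) for d in DEVICES) for _ in range(n)]


# Hypothetical upper bounds on the number of each device
n_max = {"PV": 50, "ESS": 20, "CHP": 5, "GB": 5, "TSS": 20, "EC": 10, "AC": 10}
strategies = generate_strategy_set(500, n_max, seed=1)
print(len(strategies), strategies[0])
```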

3) PAYOFF FUNCTION
The payoff function represents the goals pursued by the economic and environmental players of MEG planning under their respective strategies. The payoff of population $P_1$ is recorded as $U^{cost}_{1}$, and the payoff of population $P_2$ is recorded as $U^{env}_{2}$ ($U^{cost}_{1}$ and $U^{env}_{2}$ have been given in Section III).

4) REPLICATOR DYNAMIC EQUATION
The replicator dynamic equation emphasizes the selection mechanism in the evolution game, and the dynamic evolution state of the decision can be abstracted into the replicator dynamic equation for analysis [24].
If $p_i(t)$ represents the number of individuals who adopt strategy $S_i$ at time t, the total number of individuals in the population is:
If the proportion of individuals who select strategy $S_i$ in the total population is $x_i$, then:
If the proportion $x_i$ is used as the state variable, the replicator dynamic equations of population $P_1$ and population $P_2$ are respectively expressed as follows.
Equations (55) and (56) are the replicator dynamic equations of the two populations, in which the evolution state is differentiated with respect to the evolution time. If the payoff of an individual who selects strategy $S_i$ is less than the average payoff of the population, the growth rate of the number of individuals who choose this strategy is negative; if it is greater, the growth rate is positive. If the payoff of the strategy is exactly equal to the average payoff of the population, the number of individuals who choose the strategy remains unchanged.
The replicator dynamic equation shows the evolution process of each individual's continuous selection strategy in the changing game structure and situation, that is, the dynamic process of different individuals in the continuous random repeated game.
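The selection mechanism described above can be illustrated with a short numerical integration. The sketch assumes the standard single-population replicator form, in which the growth rate of each share equals its payoff minus the population-average payoff, and it uses fixed, hypothetical fitness values where higher is better; in the planning model the payoffs would instead come from the cost and emission objectives. Forward Euler integration is a choice made for this example only.

```python
def replicator_step(x, payoffs, dt):
    """One forward-Euler step of dx_i/dt = x_i * (u_i - average payoff)."""
    avg = sum(xi * ui for xi, ui in zip(x, payoffs))
    return [xi + dt * xi * (ui - avg) for xi, ui in zip(x, payoffs)]


def evolve(x0, payoffs, t_max, dt):
    """Integrate the replicator dynamics from shares x0 up to time t_max."""
    x = list(x0)
    for _ in range(int(round(t_max / dt))):
        x = replicator_step(x, payoffs, dt)
    return x


# Three hypothetical strategies with fixed fitness values
payoffs = [1.0, 2.0, 3.0]
x_final = evolve([1 / 3, 1 / 3, 1 / 3], payoffs, t_max=30, dt=0.001)
print(x_final)
```

With fixed payoffs, the share of the strategy with the highest fitness approaches 1 while the other shares decay toward 0, which mirrors the behavior described for the evolution state curves.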

5) EVOLUTIONARY STABLE STRATEGY
Compared with the Nash equilibrium of traditional game theory, the central solution concept of the evolutionary game is the evolutionarily stable strategy (ESS). Under strict evolutionary selection, an ESS cannot be invaded by a small mutant population. The ESS and the replicator dynamics together constitute the core of evolutionary game theory: they characterize, respectively, the stable state of the evolutionary game and the process of dynamic convergence to this stable state. The following theorem holds for the ESS [24]: if, for every $y \in S$ with $y \neq S_i$, there is a positive number $\varepsilon_y \in (0, 1)$ such that the fitness function $f$ of the population with strategy $S_i$ satisfies:
then $S_i \in S$ is called an evolutionarily stable strategy. If almost all individuals in the population adopt the strategy $S_i$, the fitness of these individuals must be higher than that of any possible mutant. In this case, $S_i$ is a stable strategy; otherwise, the mutant individuals will invade the whole population and $S_i$ cannot be stable. This indicates that strategy $S_i$ is better than strategy $y$.
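For reference, the evolutionary stability condition is commonly written in the following standard form. This is a sketch in generic notation, which may differ from that of [24]; here $\varepsilon$ is the population share of the mutant strategy $y$, and $f(a, b)$ denotes the fitness of strategy $a$ in a population with state $b$.

```latex
f\bigl(S_i,\ \varepsilon y + (1-\varepsilon) S_i\bigr)
  > f\bigl(y,\ \varepsilon y + (1-\varepsilon) S_i\bigr),
  \qquad \forall\, \varepsilon \in (0, \varepsilon_y)
```

The inequality states that, as long as the mutant share stays below the invasion barrier $\varepsilon_y$, incumbents using $S_i$ earn strictly more than the mutants in the mixed population.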

B. STACKELBERG GAME
The lower-level game is a Stackelberg game, and this subsection proves the existence of its Nash equilibrium.
According to the definition of the Nash equilibrium, assume that $(e^{*}_{i}, d^{*}_{i})$ is the Nash equilibrium strategy of the game model, where i = 1, 2, ..., n. This means that when the electricity price of the MEG is $e^{*}_{i}$ and the energy consumption strategy of the user is $d^{*}_{i}$, neither side can improve its own payoff by unilaterally changing its strategy.
In this study, the strategies adopted by the MEG and the users are considered to be pure strategies. The existence theorem of the pure strategy Nash equilibrium is stated as follows: in a multi-player game, if the pure strategy set S of each player is a non-empty, closed, bounded convex set in Euclidean space, and the payoff function U is continuous with respect to the strategy combination and quasi-concave in the player's own strategy, then there is a pure strategy Nash equilibrium in the game [25].
Because the strategy space of the game between the MEG and the users in this study is a non-empty compact convex set in Euclidean space, it is only necessary to prove that the payoff function of each player is a continuous quasi-concave function of that player's strategy.

1) QUASI-CONCAVITY OF MEG PAYOFF FUNCTION
From the definition of the payoff function of the MEG, it can be seen that the five cost terms that make up the MEG payoff can be divided into a part that is nonlinear in the energy price and a part that is linear in the energy price.

2) QUASI-CONCAVITY OF THE USERS' PAYOFF FUNCTION
When $e^{*}_{i}$ is fixed, the users' payoff function changes linearly with the user strategy $d^{*}_{i}$, and a linear function is concave. Therefore, the users' payoff function is a continuous quasi-concave function of the user strategy.
From the above proof, it can be concluded that there is a pure strategy Nash equilibrium in the operation strategy optimization model of the MEG based on the Stackelberg game.
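The leader-follower structure of this game can be illustrated with a toy single-period example. The payoff functions below are hypothetical and much simpler than those of the model: the user is assumed to have a quadratic utility `a*d - 0.5*b*d**2` minus the payment `e*d`, which gives the best response `d = (a - e) / b`, and the operator is assumed to earn `(e - cost) * d`. The operator anticipates the user's response and searches a price grid for its best price.

```python
def follower_best_response(price, a, b, d_min, d_max):
    """User's demand maximizing a*d - 0.5*b*d**2 - price*d within [d_min, d_max]."""
    d = (a - price) / b
    return min(max(d, d_min), d_max)


def leader_best_price(prices, cost, a, b, d_min, d_max):
    """Operator's best price on a grid, anticipating the user's response."""
    best = None
    for e in prices:
        d = follower_best_response(e, a, b, d_min, d_max)
        profit = (e - cost) * d
        if best is None or profit > best[2]:
            best = (e, d, profit)
    return best  # (price, demand, operator profit)


# Hypothetical parameters: prices in Yuan/kWh, demand in kWh
price_grid = [0.5 + 0.01 * i for i in range(101)]
e_star, d_star, profit = leader_best_price(price_grid, cost=0.5, a=1.5,
                                           b=0.001, d_min=0.0, d_max=1000.0)
print(e_star, d_star, profit)
```

For these parameters the interior optimum of the continuous problem is `(a + cost) / 2`, so the grid search should select a price of about 1.0 Yuan/kWh with a demand of about 500 kWh.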

V. CASE STUDY
The proposed method is applied to the MEG structure shown in Fig. 3. The numbers of PV, ESS, CHP, GB, TSS, EC, AC, and other equipment to be installed are planned over a planning period of 20 years. The economic parameters of the equipment are shown in Table 1. In our case study, this EH is used to simulate an industrial park. The electrical loads of the park come from industrial production and lighting. The heat load mainly consists of the hot water required for daily life and space heating in winter, and the cooling load mainly consists of indoor cooling in summer. The historical annual electricity, heating, and cooling loads used in the planning are shown in Fig. 4. Other experimental data parameters are shown in Table 2 [2], [15], [18], [22]. The EH is equipped with both LNB and SCR, and the corresponding parameters can be found in [15]. The parameters of carbon trading are the same as in [18]. The presented model is coded in MATLAB with YALMIP and solved with CPLEX 12.6.
The example is planned with the equivalent year as the time scale, and typical days of three seasons, namely, the heating season, the cooling season, and the transition season, are analyzed. The prices at which the MEG purchases energy from the grid company and the natural gas company are shown in Fig. 4 f). The CHP on-grid electricity price is 0.745 Yuan/kWh, and the PV on-grid electricity price is 0.75 Yuan/kWh.
To demonstrate the effectiveness of the proposed planning method, the following three cases were considered.
Case 1: Cost-benefit single-level planning under fixed market electricity, heating, and cooling prices.
This case is solved directly using the CPLEX solver, and the planned numbers of PV, ESS, CHP, GB, TSS, EC, and AC devices in the MEG are obtained. The electricity selling price of the MEG is the time-of-use electricity price shown in Fig. 4 f), and the cooling and heating selling prices are 0.587 Yuan/kWh and 0.563 Yuan/kWh, respectively.
Case 2: Under fixed market electricity, heating, and cooling prices, the economy player and the environment player in the MEG planning play a game.
Considering the environmental factors, a single-level game scenario is established, and MATLAB solves its Stackelberg
The case study results were compared and analyzed based on the investment cost, energy efficiency, emission, and dynamic energy pricing strategy.

A. EVOLUTION GAME PLANNING RESULTS
In the planning scenario of Case 3, the evolutionary game model is used to solve for the stable strategy state of the upper planning level. To improve the accuracy of the results, 500 strategy sets were generated. That is, after the grid connection point of the MEG is determined and the planned numbers of PV, ESS, CHP, GB, TSS, EC, and AC are bounded, 500 combinations of planned numbers are randomly generated. The maximum evolution time is set to 30 and the evolution time interval is 0.001. The evolution states of the 500 strategy sets are shown in Fig. 5 a) and Fig. 5 b). Here, the evolution time is the time variable of the replicator dynamic equation, which is integrated with a time step of 0.001.
The game decision process changes constantly with the game environment. From Fig. 5 a) and Fig. 5 b), it can be concluded that the decision in the initial state does not affect the decision in the final steady state. The final stable state is determined by the game environment. Figure 5 a) shows the evolution state curves of the strategies of the population that represents the economy in MEG operator planning. As can be seen from the graph, as the game evolves, the evolution state of only one strategy in the strategy set gradually approaches 1 and finally reaches stability (as shown by the red line in Fig. 5 a)). This strategy is the 278th strategy generated, while the other strategies gradually approach 0. An evolution state approaching 1 indicates that, under a constantly changing game situation and environment, this strategy finally becomes the stable strategy of the population, that is, the evolutionarily stable strategy. Figure 5 b) shows the evolution state curves of the strategies of the population that represents environmental gas emissions in MEG operator planning. Figure 5 shows the changing trend of the 500 strategies in the evolutionary game; the strategy whose evolution state finally approaches 1 is the stable strategy, which is listed as Case 3 in Table 3. The planned number of devices in the MEG under each case is shown in Table 3.

B. DYNAMIC ENERGY PRICING STRATEGY
In the lower-level game, MEG operators and users play the Stackelberg game to obtain the dynamic energy price in hourly units. Figure 6 shows the iterative game between the MEG operator and the user until the equilibrium is reached. Figure 7 shows the dynamic electricity price on a typical day of the transition season, the dynamic heat price on a typical day of the heating season, and the dynamic cooling price on a typical day of the cooling season. As shown in Fig. 7 b), because the load elasticity of the thermal load and the cold load is not assumed in this study, only the load response of the electrical load under dynamic electricity prices is studied.
In the game mode, users are highly sensitive to the electricity price and can respond independently and flexibly to the different quotations of the MEG by adjusting their load. The period of higher electricity prices (periods 7 to 17) corresponds to the peak load period, during which the maximum electricity load reaches 1105 kWh. In this period, users give priority to their payments and reduce non-essential loads by about 2% according to their own demand, thereby reducing their energy bills. During the period of lower electricity prices (periods 1 to 6 and 23 to 24), users give priority to their energy consumption experience, customer satisfaction increases, and up to 4% of the peak load is shifted to this period. During periods 9 to 12, electricity prices are relatively high, and the highest electricity price reaches 1.18 Yuan/kWh. Weighing their energy consumption experience against their payments, users make only small load adjustments in these periods.
The initial energy price strategy of the MEG is based on the peak and valley periods of the users' historical power load. It can be seen that the game equilibrium electricity price strategy is closely related to the user load strategy. During the period of the low user load level (periods 1 to 6, 23 to 24), to prevent users from reducing the load, the MEG reasonably adjusts the electricity price to improve user satisfaction and guarantee its own income. During the period of high user load (periods 7 to 17), the MEG increases the electricity price substantially within the acceptable range of users in order to maximize its own revenue. During periods 9 to 12, the MEG operator increases its price at the expense of a small change in the load in order to increase the revenue.
Under the dynamic electricity price, the user peak load is reduced by 9%, which contributes to peak shaving and valley filling.
Table 4 shows the economic results of the MEG plan for each case. It should be noted that in this study, the environmental objective is to minimize the emissions of greenhouse gases and other harmful gases, and the penalty cost of gas emissions is not included in the planning cost. This is because the emission penalty cost accounts for a relatively small proportion of the planning cost, so it would be negligible if it were merged into the total economic cost. Therefore, if only the construction cost is taken into account, Case 1, which is planned purely for economy, incurs the lowest cost. When the environmental benefits are considered, Case 2 and Case 3 increase the installation of renewable energy, resulting in higher economic costs. However, Case 1 has poor environmental benefits and does not satisfy the current planning requirements. Therefore, in the economic analysis, only Case 2 and Case 3, which take the environment into account, are compared with each other, and the economic results of the purely economic planning in Case 1 are taken as the reference.

1) ECONOMICAL ANALYSIS
In Case 3, which applies the environment-economic double-level game planning with dynamic energy pricing, the installation cost of energy equipment on the MEG side is $6.20 \times 10^{6}$ Yuan. The cost of purchasing energy from the grid company and the natural gas company is $4.96 \times 10^{5}$ Yuan, and the total cost is $6.73 \times 10^{6}$ Yuan. Compared with Case 2, which performs environment-economic planning under fixed market energy prices, the total planning cost is reduced by $1.4 \times 10^{5}$ Yuan, and the cost of purchasing energy from the grid company and the natural gas company is reduced by $4.2 \times 10^{4}$ Yuan. It can be seen that planning that considers the dynamic energy pricing strategy can reduce the cost of energy purchase and increase energy sales.
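The relative size of these savings follows from simple arithmetic on the figures quoted above. The short calculation below derives the implied Case 2 costs by adding the reported reductions back to the Case 3 values; it assumes the quoted numbers are exact, so small differences from Table 4 due to rounding are possible.

```python
# Figures quoted in the text (Yuan)
case3_total = 6.73e6
case3_purchase = 4.96e5
total_saving = 1.4e5       # reduction in total planning cost versus Case 2
purchase_saving = 4.2e4    # reduction in energy purchase cost versus Case 2

# Implied Case 2 values (derived here, not read from Table 4)
case2_total = case3_total + total_saving
case2_purchase = case3_purchase + purchase_saving

# Relative reductions with respect to Case 2, in percent
total_reduction_pct = 100 * total_saving / case2_total
purchase_reduction_pct = 100 * purchase_saving / case2_purchase

print(f"total cost reduction: {total_reduction_pct:.2f} %")
print(f"energy purchase cost reduction: {purchase_reduction_pct:.2f} %")
```

Under these assumptions the total planning cost falls by roughly 2% and the energy purchase cost by roughly 8% relative to Case 2.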

2) GAS EMISSION
Case 2 and Case 3 take environmental emissions into account. The planning results show that, once the environmental emission game is considered, the number of renewable energy PV installations is significantly higher than in Case 1. This result indicates that, to reduce gas emissions, MEG operators reduce power purchases from the power grid and the power generation of CHP units.
Because Case 3 takes the dynamic pricing of energy into account, the peak load of users is reduced, and the load demand can be met by planning a lower power supply capacity at the peak load. In terms of planning results, compared with Case 2, which also takes environmental emissions into account, Case 3 plans fewer PV and ESS units. At the same time, Case 3 plans more GB units than Case 1 and Case 2 and has reduced planning costs.
The annual emission data in Table 5 also reflect this. Based on the planning results of Case 3, NOx and CO2 emissions are lower. Compared with Case 1, NOx emissions are reduced by 4.87 kg, and the difference between the CO2 emissions and the CO2 emission quota is reduced from $2.35 \times 10^{4}$ kg for Case 1 to $1.6 \times 10^{3}$ kg.

3) ENERGY OUTPUT DISTRIBUTION
By comparing the distribution of electricity, heating, and cooling energy on typical days, we can obtain the typical daily energy output composition. Selecting typical days of the transition season to compare the power distribution of each case makes the interaction between heating, cooling, and electric energy clearer. From Fig. 8(a), it can be seen that the main source of power for Case 1 is electricity purchased from the grid company. This is due to the higher cost of PV installations and the MEG's reduction of renewable energy construction in order to increase revenue. In addition, owing to the high on-grid electricity prices of PV and CHP during periods 11 to 18, the MEG feeds the power of the PV and CHP units into the grid and then buys electricity from the grid company for sale. In Case 2 and Case 3, in order to improve the environmental efficiency, most of the power supply is provided by PV, a small part is supplemented by CHP, and the power generation of renewable energy is about 1500 kW. There is a surplus of renewable generation, and the surplus electricity is fed into the grid, which increases the revenue of the MEG. In Case 3, owing to the dynamic energy pricing, the power load has changed and is more stable than in Case 1 and Case 2. Renewable energy generates about 1200 kW, and surplus electricity is fed into the grid in order to increase the MEG revenue. The overall energy distribution is the same as in Case 2.
To show the composition of the thermal energy distribution more clearly, typical days of the heating season, which have a higher heat load, are selected to analyze and compare the thermal energy output distribution, as shown in Fig. 8(b). Heat is mainly generated by CHP and GB, stored or released by TSS, and absorbed by AC to produce cooling energy. Because Case 1 and Case 2 have no planned GB, the heat is mainly generated by CHP. The output of CHP accounts for 90%, and the remaining heat demand is supplied by heat released from TSS. Because Case 2 considers the environmental benefits, the installation of CHP units is reduced, and TSS releases more heat to meet the heat load demand. Case 3 increases the heat generated by the GB to meet the heat load while minimizing the cost.

VI. CONCLUSION
In this paper, a MEG planning model that considers the dynamic pricing of energy is proposed, based on a bi-level game between environmental and economic factors. The planning level is solved by the multi-strategy evolutionary game method, which considers bounded rationality. Meanwhile, the dynamic price of MEG energy is determined by the master-slave (Stackelberg) game, which considers the benefits of operators and the payoff of users. The simulation results show that the proposed evolutionary game between the environment and the economy, which is the planning model based on the bounded rational decision-making of the upper level, can be applied to MEG planning. It overcomes the unrealistic assumption in traditional game theory that participants are completely rational. It considers the influence of fluctuations in the game environment and situation on the participants, avoids the idealization of the game conclusion, and more effectively explains the dynamic game phenomenon in real planning. While reducing NOx emissions and avoiding the purchase of excess CO2 quotas, the new model can reasonably reduce the planning cost, consider both the economy and the environment in the planning, and achieve a balance between the two. At the lower level, dynamic energy pricing can optimize the power load curve, which helps to improve the utilization rate of renewable energy and reduce the economic operation cost of the MEG. By balancing the economy and the environment and performing dynamic energy pricing, load peak reduction and valley filling can be realized, the planning cost of the MEG is reduced, and the profit is maximized.