General Three-Population Multi-Strategy Evolutionary Games for Long-Term On-Grid Bidding of Generation-Side Electricity Market

Founded on bounded rationality and limited information, evolutionary game theory has been preliminarily applied in many fields, such as the electricity market (EM). To address the complex behavioral decision-making issues in the more common three-population multi-strategy evolutionary game (3PmSEG) scenarios in EM, this paper explores the long-term evolutionarily stable equilibrium (ESE) characteristics of general 3PmSEG systems, with the aim of systematically investigating the evolution process of long-term on-grid bidding of a generation-side EM based on these features. First, the long-term ESE characteristics of general three-population two-strategy and three-strategy evolutionary games are thoroughly investigated. Complete relative net payoff (RNP) parameters are defined for these games. Then, the modeling idea of general 3PmSEGs is elaborated. The analysis shows that the game can be guided to evolve toward an expected long-term ESE point by properly adjusting its RNP parameters. To verify this, the long-term on-grid bidding of power generation is finally investigated for a tripartite generation-side EM. The case study reveals that effective government supervision can promote new energy accommodation in the market. Overall, the models developed in this paper are relatively universal and practical, and can provide theoretical and methodological references for complex evolutionary game issues in related fields.


I. INTRODUCTION
When addressing complex multi-agent behavioral decision-making issues, game theory is gradually becoming a useful and powerful mathematical tool [1], [2]. As an emerging branch of game theory, evolutionary game theory (EGT) [3] is founded on assumptions of bounded rationality and limited information, and it can describe the evolution trends of population behavior through processes of dynamic interactive decision-making among individuals, such as imitation and learning. Moreover, it can be used to accurately predict the population behavior of individuals. Thus, EGT is more suitable for real game situations than classical game theory. It has been rapidly applied in the fields of economy [4], [5] and management science [6], and has also been initially developed in engineering fields [7]-[10].
The associate editor coordinating the review of this manuscript and approving it for publication was Zhiyi Li.
Currently, applied EGT research in many fields is biased toward two-population two-strategy behavioral decision-making problems. For example, Sun et al. [11] use EGT to investigate green investment in a two-echelon supply chain involving a population of manufacturers and a population of suppliers. This is a typical two-population two-strategy asymmetric evolutionary game (2P2S-AEG) system. Wang et al. [12] use an evolutionary game approach to manage the manufacturing service allocation for the user population and cloud manufacturing operator population in cloud manufacturing. In addition, Sun and Zhang [13] apply EGT to investigate government regulation in the prevention of greenwashing involving two heterogeneous enterprise populations, i.e., dominant and inferior enterprises.
In terms of theoretical research, EGT has made great progress, especially in cooperative evolutionary games, the stochastic evolutionary game (StEG), and evolutionary game updating rules and mechanisms. For cooperative evolutionary games, Gámez et al. [14] propose an evolutionary game-theoretical model for the market cooperative in fisheries, where evolutionary dynamics are proposed for the continuous change of the applied strategies that can lead to a particular Nash equilibrium (NE) in the long term. For the StEG, Tadj and Touzene [15] adopt a quasi-birth-and-death approach to investigate the stochastic 2 × 2 non-symmetric evolutionary game and the 3 × 3 symmetric evolutionary game and provide some illustrative examples; Zhou and Qian [16] conduct an in-depth theoretical analysis of the fixation principle, transient landscape and diffusion dilemma in StEG dynamics; Zhou et al. [17] thoroughly investigate the evolutionary stability and quasi-stationary strategies in StEG dynamics; and Ohtsuki [18] analyzes the stochastic evolutionary dynamics of bimatrix games. For evolutionary game updating rules and mechanisms, researchers in [19] and [20] systematically investigate the impact of several evolutionary mechanisms on the evolution of cooperation based on EGT, including randomness and diversity, breaking and establishing links, indirect reciprocity, proportional best response, and migration, while researchers in [21]-[23] thoroughly investigate cooperative evolution issues based on strategy updating rules, such as fixation of strategies, fixation probabilities and fixation times. In general, EGT has yielded considerable results in theoretical research.
Building on this theoretical research, EGT has also achieved good results in applications. Apart from applications of the 2P2SEG, research on evolutionary game problems based on the three-population multi-strategy evolutionary game (3PmSEG) has also made preliminary progress, especially on three-population two-strategy evolutionary game (3P2SEG) issues. To this end, this paper focuses on this type of 3P2SEG. Application research based on the 3P2SEG has been carried out in several fields, especially the electricity market (EM), as summarized below.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the fields of industry and management science, Wu et al. [24] construct a tripartite evolutionary game to investigate the collaborative innovation and management of three parties, including the institutes of government, industry and university. Shan and Yang [25] investigate the sustainability of photovoltaic poverty alleviation in China based on an evolutionary game between three stakeholder populations, including the PV enterprises, poor households and the government. Jiang et al. [26] use an EGT approach to implement the multi-agent environmental regulation under Chinese fiscal decentralization, where the research subjects include the polluting enterprises, local government regulators, and central government planners. Long et al. [27] conduct a coevolutionary simulation study of multiple stakeholders based on a tripartite game model involving the government, consumers and enterprises, where the evolutionary equilibrium and the main driving factors are explored in the take-out waste recycling industry chain. Xu et al. [28] investigate a tripartite equilibrium for the carbon emission allowance allocation in the power-supply industry.
In the field of electricity market (EM), including demand-side EM and supply-side power generation EM, the multi-population evolutionary game theory and methodology have been preliminarily used in analysis of generators' bidding strategies and in the development of EM models.
Taking the demand-side response management (DRM) in EM as an example, Cheng and Yu [29] develop a multi-group asymmetric evolutionary game model to study the NE-based asymptotic stability of a typical game scenario in an EM, which involves the populations of power consumers, new power supply entities, and grid companies. Chai et al. [30] study DRM issues in a scenario involving multiple distribution utilities, where the competition between power companies is constructed using a noncooperative game, while the interaction between home users is constructed using an evolutionary game. The proposed strategic approach in [30] shows that the two types of agents, i.e., power companies and home users, can converge to an NE point and an evolutionary game equilibrium point, respectively. Zhu et al. [31] study the demand-side management and control issues for a class of networked smart grids using EGT. Miorandi and Pellegrini [32] explore DRM techniques from an EGT perspective and focus on a distributed control scheme that is enforceable by operators through a pricing scheme. Srikantha and Kundur [33] propose a distributed demand-side response strategy for real-time demand response problems in the context of smart grids, and use EGT to study the important convergence characteristics in the determination problem on a parsimonious and empirical basis. The results in [33] show that the proposed strategy is real-time and highly scalable, which can provide good application prospects for practical DRM problems.
EGT has also been initially applied in the long-term bidding of supply-side power generation markets. Fang et al. [34] investigate the government regulation of renewable energy generation and transmission in China's supply-side EM, where the strategic interaction involves three populations, including fossil-energy power plants, provincial power grids and provincial governments. Liu et al. [35] establish a tripartite asymmetric evolutionary game model for the supply-side EM in order to investigate the impact of the integration of new energy resources on three populations, including wind power enterprises, thermal power enterprises and power grid enterprises.
Menniti et al. [36] simulate the behavior of power generators in EM based on an evolutionary game. The research work in [37] shows that due to the use of EGT, the generation corporations in the supply-side EM have the ability to learn adaptively, thus the obtained long-term competitive evolution characteristics are closer to the actual power generation market, which is different from the competitive evolution laws derived through traditional game theory. In addition, Ladjici et al. [38] model the equilibrium computation in a deregulated EM as solving a two-stage stochastic game problem using a competitive coevolutionary game algorithm.
Obviously, EGT is an important and powerful mathematical tool to investigate the characteristics of long-term gaming behavior of multiple groups. This methodology system adopts the natural selection mechanism and does not require strict assumptions of rationality (i.e., it is founded based on bounded rationality and limited information communication), which is closer to reality and better reflects the spontaneous evolution of strategies of different interest groups during the dynamic process. The advantage of adopting EGT in this paper is that it does not require complete rationality of all game groups, nor does it require the ability to know complete information as common knowledge. Therefore, unlike classical game theory, the evolutionary game used in this paper is concerned with the evolution gaming process of the strategy selection frequency of different interest groups. This process involves two important mechanisms, i.e., the market selection mechanism (which can be seen as a natural selection mechanism) and the mutation mechanism. In terms of relaxing the assumption of rationality, this paper considers role-neutral gaming of individuals in a group, where the game payoffs are related to the decision and not to the participants, which can be called strategic games.
Overall, the previous work greatly enriches the application fields of the 3PmSEG. However, most of these investigations only provide a relatively simple analysis of the system's stability. They do not comprehensively summarize the various factors affecting the dynamic stability of the system, nor do they provide in-depth theoretical analysis and dynamic simulation verification of the impact of these factors. Moreover, more general 3PmSEG-based models and methods have not been put forward for actual complex behavioral decision-making issues. Due to the complexity and diversity of the EM in the context of the Energy Internet, the market competition involving multiple interest groups (including interest groups of different parties and different interest groups of the same party) gradually transforms into a complex process of dynamic evolution with more complex characteristics of the market economy and human behavior [29]. Therefore, it is essential to combine the theoretical analysis of multi-group gaming behavior with the complex dynamic evolution process.
To address the complex behavioral decision-making issues in the more-common 3PmSEG scenarios in EM, this paper focuses on a class of symmetric and asymmetric 3PmSEG models with the aim of systematically investigating the evolution process of long-term on-grid bidding of a generation-side EM based on the models' long-term evolution characteristics. The main work of this paper is summarized as follows.
i) This paper first summarizes and verifies the long-term ESE characteristics of general 3PmSEG systems based on theoretical analysis and dynamic simulation, including the three-population two-strategy symmetric evolutionary game (3P2S-SEG) system, the three-population two-strategy asymmetric evolutionary game (3P2S-AEG) system, and the more complex three-population three-strategy asymmetric evolutionary game (3P3S-AEG) system.
ii) During the investigation, this paper systematically defines relative net payoff (RNP) parameters for all general 3PmSEG systems investigated. Moreover, based on these RNP parameters, all the game scenarios including complete behavioral decision-making characteristics (i.e., all the evolution states of the system during evolution) are analyzed, summarized and simulated for various evolutionary game models.
iii) Then, this paper elaborates the modeling idea and convergence iteration method of the general three-population n-strategy (where $n \geq 2$) asymmetric evolutionary game (3PnS-AEG) system.
iv) Lastly, to verify the long-term ESE characteristics of the evolutionary games elaborated in this paper, an actual tripartite evolutionary game example involving a population of new energy power generation enterprises, a population of traditional power generation enterprises and a population of power grid enterprises is taken to investigate the long-term on-grid power generation amount competition in a supply-side power market.
The remaining part of this paper is organized as follows. In Section II, several core concepts in EGT are introduced as preliminaries. Section III investigates the long-term ESE characteristics of general 3PmSEG models based on theoretical analysis and dynamic simulation, and also expounds the modeling idea of the general 3PnS-AEG. In Section IV, a tripartite asymmetric evolutionary game example of long-term on-grid price bidding for a generation-side EM is taken to verify the effectiveness and universality of the general 3PmSEGs (especially the general 3P2SEGs). Lastly, Section V concludes this paper.

A. BASIC FRAMEWORK OF A TYPICAL EVOLUTIONARY GAME
The basic framework of a typical evolutionary game, denoted by $G$, usually includes the participant set (i.e., the population set), the population strategy set, and the population payoff set, as follows:
$$G = \{N, \mathcal{S}, U\} \tag{1}$$
where $N$ is the participant set, i.e., the populations. Assume that $G$ contains $n$ populations; then $N = \{1, 2, \ldots, i, \ldots, n\}$, where $i \in N$. $\mathcal{S}$ is the population strategy set, $\mathcal{S} = \{S_1, S_2, \ldots, S_i, \ldots, S_n\}$, where $S_i$ is the strategy set of population $i$. $U$ is the population payoff set, $U = \{U_1, U_2, \ldots, U_i, \ldots, U_n\}$, where $U_i$ is the payoff set of population $i$.

B. SYMMETRIC AND ASYMMETRIC EVOLUTIONARY GAMES
Based on Eq. (1), a general three-population n-strategy (where $n \geq 2$) evolutionary game (3PnSEG) is a symmetric evolutionary game when its payoff parameters are symmetric; in this case, all participants in the game know each other's preferences [39]. Otherwise, if the payoff parameters are asymmetric, it is an asymmetric evolutionary game, in which the degree of information mastered by each population is asymmetric.

C. EVOLUTIONARILY STABLE STRATEGY AND ESE
Consider two pure strategies $s_1, s_2 \in \mathcal{S}$ with $s_1 \neq s_2$. If there always exists a number $\bar{\kappa} \in (0, 1)$ such that the following inequality holds, then the pure strategy $s_1$ is an evolutionarily stable strategy (ESS) of the system [40]:
$$f\big(s_1, \kappa s_2 + (1 - \kappa) s_1\big) > f\big(s_2, \kappa s_2 + (1 - \kappa) s_1\big) \tag{2}$$
where $\forall \kappa \in (0, \bar{\kappa})$, and $f(\cdot)$ is the fitness function of the system. Further, when the system achieves an ESS at one of its pure strategies, this ESS is called an evolutionarily stable equilibrium (ESE) state of the system. An asymmetric evolutionary game system achieves a long-term ESE state only at the pure-strategy internal equilibrium points of its RD model.

D. REPLICATOR DYNAMICS MODEL
The replicator dynamics (RD) model is another core concept in EGT [39]; it is an important dynamics mechanism that reveals the evolution trend of the group behavior of boundedly rational individuals in a population [40]. Assume that the strategy $s \in S_i$ is selected by individuals in population $i$ with probability (or individual proportion) $x_i(t)$ at time $t$ in each round of the repeated evolutionary game, and that the corresponding expected payoff of the individual is $f_i(s; x; t)$. Then the RD model of choosing strategy $s$ at any time $t$ is described as follows:
$$\frac{dx_i(t)}{dt} = x_i(t)\left[f_i(s; x; t) - f_{ave}(s; t)\right] \tag{3}$$
where $f_{ave}(s; t)$ is the average expected payoff of population $i$ at time $t$. Eq. (3) shows that the time derivative of the probability (or ratio) of individuals selecting a strategy in a population is proportional to this probability value and to the difference between the expected payoff of this strategy and the average expected payoff of the population at this time [41].
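As a minimal sketch of the RD mechanism in Eq. (3), the following Python fragment evaluates the replicator right-hand side for one population and advances it with explicit Euler steps. The function names, the step size, and the constant payoffs are illustrative assumptions and are not taken from the paper; in a real game the payoffs depend on the state of the other populations.

```python
def replicator_rhs(x, payoffs):
    """Right-hand side of the replicator dynamics for one population.

    x       -- list of strategy proportions (non-negative, summing to 1)
    payoffs -- list of expected payoffs, one per strategy
    """
    # Average expected payoff of the population, weighted by proportions.
    f_ave = sum(xi * fi for xi, fi in zip(x, payoffs))
    # Each proportion changes at a rate proportional to itself and to its
    # payoff advantage over the population average.
    return [xi * (fi - f_ave) for xi, fi in zip(x, payoffs)]


def euler_step(x, payoffs, dt=0.01):
    """Advance the proportions by one explicit Euler step of size dt."""
    dx = replicator_rhs(x, payoffs)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


# Hypothetical constant payoffs, for illustration only.
x = [0.5, 0.5]
payoffs = [2.0, 1.0]
for _ in range(2000):  # integrate up to t = 20
    x = euler_step(x, payoffs)
```

Under these assumed payoffs the first strategy always earns more than the average, so its share grows toward 1 while the total share stays equal to 1.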

E. LYAPUNOV METHOD-BASED EVOLUTIONARY STABILITY CRITERION
The asymptotic stability (i.e., evolutionary stability) of the evolutionary game system at a certain strategy can be judged by Lyapunov stability theory [42]-[44]; the resulting test is called the Lyapunov method-based evolutionary stability criterion (LyESC). Concretely, assume that the Jacobian matrix of the system's RD model in Eq. (3) is an $M$-order square matrix, where $M$ is a positive integer with $M \geq 2$, and that this Jacobian matrix has $M$ eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_M\}$. If the real parts of all the eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_M\}$ are negative at a certain internal equilibrium point solved from the RD equation(s) in Eq. (3), then this point is asymptotically stable (evolutionarily stable), and the strategy corresponding to this equilibrium point is an ESS of the system. In this case, the evolutionary game system can achieve a long-term ESE state under this ESS. Otherwise, if at least one of the eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_M\}$ has a positive or zero real part, then the internal equilibrium point is evolutionarily unstable, and the corresponding strategy is not an ESS of the system.
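The criterion above can be sketched for the smallest case, $M = 2$, where the eigenvalues follow in closed form from the trace and determinant of the Jacobian. The finite-difference Jacobian, the tolerance, and the demo system `demo_rhs` are assumptions made for this illustration only; they are not the paper's model.

```python
import cmath


def jacobian_2d(rhs, point, h=1e-6):
    """Central-difference Jacobian of a two-variable RD right-hand side."""
    jac = [[0.0, 0.0], [0.0, 0.0]]
    for col in range(2):
        plus, minus = list(point), list(point)
        plus[col] += h
        minus[col] -= h
        f_plus, f_minus = rhs(plus), rhs(minus)
        for row in range(2):
            jac[row][col] = (f_plus[row] - f_minus[row]) / (2.0 * h)
    return jac


def eigenvalues_2x2(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = jac[0][0] + jac[1][1]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def is_evolutionarily_stable(rhs, point, tol=1e-6):
    """LyESC: stable only if every eigenvalue has a negative real part."""
    eigs = eigenvalues_2x2(jacobian_2d(rhs, point))
    return all(ev.real < -tol for ev in eigs)


def demo_rhs(v):
    """Hypothetical two-population, two-strategy RD system (demo only)."""
    x, y = v
    return [x * (1 - x) * (2 * y - 1), y * (1 - y) * (2 * x - 1)]
```

For the assumed demo system, the vertices (0, 0) and (1, 1) pass the test, whereas (1, 0) and the interior point (0.5, 0.5) do not. A general $M$-order system would need a numerical eigenvalue solver instead of the closed-form 2 × 2 formula.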

F. FLOWCHART OF CALCULATING ALL POSSIBLE ESS FOR A GIVEN MATRIX BASED ON RD AND LYAPUNOV STABILITY THEORY
Based on the elaborations in the previous parts of this section, a flowchart demonstrating how to calculate all possible ESSs for a given matrix based on RD and Lyapunov stability theory is presented in Figure 1.

III. LONG-TERM ESE CHARACTERISTICS OF THE GENERAL 3PMSEGS
A. GENERAL 3P2S-SEG MODEL
1) MODEL CONSTRUCTION
Assume that the general 3P2SEG system contains three populations denoted by A, B and C, and that each strategy set consists of a pair of opposite pure strategies, $S_A = \{S_{A1}, S_{A2}\}$, $S_B = \{S_{B1}, S_{B2}\}$ and $S_C = \{S_{C1}, S_{C2}\}$, where $S_{A1}$ and $S_{A2}$, $S_{B1}$ and $S_{B2}$, and $S_{C1}$ and $S_{C2}$ are mutually opposite strategies. For example, if $S_{A1}$ indicates that individuals in population A make a certain decision, then $S_{A2}$ indicates that they make the contrary decision. Further, assume that in each round of the repeated evolutionary game, the proportions of individuals who choose $S_{A1}$ and $S_{A2}$ in population A are $x$ and $1 - x$, respectively, those who choose $S_{B1}$ and $S_{B2}$ in population B are $y$ and $1 - y$, respectively, and those who choose $S_{C1}$ and $S_{C2}$ in population C are $z$ and $1 - z$, respectively, where $x, y, z \in [0, 1]$. Based on this, the decision space of this 3P2SEG system is the unit cube $[0, 1] \times [0, 1] \times [0, 1]$. Here, $a_i$, $b_i$ and $c_i$ ($i = 1, 2, \ldots, 8$) denote the general payoff distribution parameters of the evolutionary game models used throughout this paper.
According to the assumptions above, the payoff matrix of this general 3P2SEG is described by Eq. (4). For the general 3P2S-SEG model, the payoff distribution parameters in Eq. (4) simultaneously satisfy $a_1 = a_4$, $a_2 = a_3$, $a_5 = a_8$ and $a_6 = a_7$. The symbols $g$, $h$, $k$, $l$, $p$ and $q$ are defined as general payoff parameters of the evolutionary game model that are commonly used throughout this paper. Therefore, the payoff matrix of this general 3P2S-SEG model is transformed into Eq. (5).
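To make the model construction concrete, the sketch below builds the replicator right-hand side of a general three-population two-strategy game from three payoff tables and integrates it with Euler steps. The table layout `table[i][j][k]` (index 0 for the first strategy, 1 for the second), the function names, and the demo payoffs are assumptions for illustration. Mapping `table[i][j][k]` to the paper's parameter number $1 + 4i + 2j + k$ is an inference from the RNP parameters quoted in the text (e.g., $a_4 - a_8$ at the point (0, 0, 0)), not a layout stated by the source.

```python
from itertools import product


def rd_rhs_3p2s(state, a, b, c):
    """Replicator right-hand side (dx/dt, dy/dt, dz/dt) of a 3P2S game.

    state   -- (x, y, z): shares choosing S_A1, S_B1 and S_C1
    a, b, c -- payoff tables of A, B and C; table[i][j][k] is the payoff
               when A, B, C play strategy index i, j, k (0 = first)
    """
    x, y, z = state
    px, py, pz = (x, 1 - x), (y, 1 - y), (z, 1 - z)
    idx = list(product(range(2), repeat=2))
    # Payoff advantage of the first strategy over the second, averaged
    # over the strategy mix of the two other populations.
    gap_a = sum(py[j] * pz[k] * (a[0][j][k] - a[1][j][k]) for j, k in idx)
    gap_b = sum(px[i] * pz[k] * (b[i][0][k] - b[i][1][k]) for i, k in idx)
    gap_c = sum(px[i] * py[j] * (c[i][j][0] - c[i][j][1]) for i, j in idx)
    # With two strategies, x * (f_1 - f_ave) equals x * (1 - x) * (f_1 - f_2).
    return (x * (1 - x) * gap_a, y * (1 - y) * gap_b, z * (1 - z) * gap_c)


def simulate(state, a, b, c, t_end=10.0, dt=0.001):
    """Integrate the RD equations with explicit Euler steps."""
    for _ in range(int(round(t_end / dt))):
        rates = rd_rhs_3p2s(state, a, b, c)
        # Clamp to [0, 1] to guard against numerical overshoot.
        state = tuple(min(1.0, max(0.0, s + dt * d))
                      for s, d in zip(state, rates))
    return state


# Hypothetical payoffs: each population's first strategy always pays 2,
# its second strategy always pays 1.
demo_a = [[[2, 2], [2, 2]], [[1, 1], [1, 1]]]
demo_b = [[[2, 2], [1, 1]], [[2, 2], [1, 1]]]
demo_c = [[[2, 1], [2, 1]], [[2, 1], [2, 1]]]
final_state = simulate((0.2, 0.5, 0.8), demo_a, demo_b, demo_c)
```

With these assumed payoffs the first strategy dominates in every population, so the trajectory approaches the vertex (1, 1, 1) regardless of the interior starting point.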

2) RNP PARAMETERS DEFINITION
According to Eq. (5), we define six RNP parameters for this general 3P2S-SEG model, as presented in Table 1. Taking RNP Parameter 1 in Table 1, i.e., $(a - c)$, as an example, its physical or economic meaning is defined as follows: $(a - c)$ is the relative net payoff of individuals in population A who choose strategy $S_{A1}$ while the individuals in population B choose strategy $S_{B1}$ and those in population C always choose strategy $S_{C1}$, or while the individuals in population B choose strategy $S_{B2}$ and those in population C always choose strategy $S_{C2}$. The meanings of the remaining RNP parameters in Table 1 can be defined similarly and are not repeated here. If the signs of the six RNP parameters in Table 1 are all reversed, they become another six RNP parameters, which indicate the relative net payoffs of the individuals in populations A, B and C who choose the second strategy in their strategy sets.
In particular, it should be noted that the long-term ESE laws of the system at different pure-strategic equilibrium points can be well observed by setting the initial conditions of the system. During evolution, the initial conditions of the system are determined by the defined RNP parameters of the system. Therefore, in all numerical simulation studies in this paper, the selection of the initial conditions of the system is strictly based on the evolutionary stable equilibrium conditions of the system at each equilibrium point.
In Figure 2, the simulation time is $t \in [0, 10]$. Cases 1 to 8 respectively demonstrate that each internal equilibrium point in $\Upsilon_{3P2S-SEG}$ becomes the unique long-term ESE of the system in sequence, Cases 9 to 11 respectively show that the 3P2S-SEG system only achieves 1, 2 and 4 long-term ESE states, and Case 12 indicates that no long-term ESE can be spontaneously formed in the system. In each figure, the red, green and blue solid dots respectively represent the long-term ESE state, the evolutionarily unstable equilibrium state, and the evolutionarily critical equilibrium state (which is also an unstable equilibrium state), namely the ESE point, unstable equilibrium point, and saddle point that are spontaneously formed in the system. Further, substituting each pure-strategy internal equilibrium point in $\Upsilon_{3P2S-SEG}$ into the Jacobian matrix $J_{3P2S-SEG}$ in Eq. (7) shows that the real parts of the Jacobian matrix's eigenvalues at these equilibrium points are determined only by the six RNP parameters in Table 1. Therefore, by arranging and combining the positive and negative signs of these six RNP parameters, the long-term ESE characteristics of the 3P2S-SEG system contain a total of $64\ (= 2^6)$ game scenarios. Each game scenario is determined by the signs of the six RNP parameters in Table 1, such as $a - c$.
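The scenario count can be reproduced by enumerating every sign combination of the six RNP parameters. In the sketch below only the label `a - c` comes from the text; the remaining labels are placeholders standing in for the other entries of Table 1, which are not reproduced here.

```python
from itertools import product

# Only "a - c" is named in the text; the other labels are placeholders.
rnp_names = ["a - c", "RNP_2", "RNP_3", "RNP_4", "RNP_5", "RNP_6"]

# Each scenario assigns a sign (+1 or -1) to every RNP parameter.
scenarios = [dict(zip(rnp_names, signs))
             for signs in product((+1, -1), repeat=len(rnp_names))]

print(len(scenarios))  # 64
```

Each dictionary in `scenarios` is one game scenario; a stability analysis would then evaluate the eight pure-strategy equilibrium points under that sign pattern.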

4) A BRIEF SUMMARY
Overall, through the detailed theoretical analysis and dynamic simulation verification on the long-term ESE characteristics of the general 3P2S-SEG model, we can draw some conclusions, which are summarized as follows.
i) The model has only 8 internal equilibrium points, which are all pure strategies, and at most 4 of them can be spontaneously formed as ESSs at the same time; that is, the system can achieve four long-term ESE states simultaneously in a certain game scenario.
ii) The final evolution state that is spontaneously formed in the system is determined only by the six RNP parameters defined in Table 1, so the system can be guided to evolve toward an expected long-term ESE state by appropriately adjusting these RNP parameters based on some external factors.
iii) The system's complete long-term equilibrium characteristics contain a total of 64 game scenarios, which are determined by the 6 RNP parameters. In these scenarios, the system can obtain a total of 64 long-term ESEs, which are all strictly refined NEs, 64 evolutionarily unstable equilibria, and 384 evolutionarily critical equilibria.
iv) During the long-term dynamic interactions of populations in this evolutionary game system, the total number of ESEs spontaneously formed in populations is the same as that of evolutionarily unstable equilibria. This is because this evolutionary game system is symmetric, with strictly symmetrical payoff parameters.
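As a quick consistency check on the counts in item iii), and assuming that each of the 64 sign scenarios is examined at all 8 pure-strategy equilibrium points, the three categories add up to the total number of equilibrium states:

```latex
\underbrace{2^{6}}_{\text{scenarios}} \times \underbrace{8}_{\text{equilibrium points}}
  = 512
  = \underbrace{64}_{\text{ESEs}}
  + \underbrace{64}_{\text{unstable equilibria}}
  + \underbrace{384}_{\text{critical equilibria}}
```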

2) RNP PARAMETERS DEFINITION
According to the payoff matrix in Eq. (4), we define a total of 12 RNP parameters for this general 3P2S-AEG system, as presented in Table 2. Taking the first two RNP parameters in Table 2, i.e., $(a_1 - a_5)$ and $(a_3 - a_7)$, as an example, their physical or economic meanings are defined as the relative net payoffs of the individuals in population A who choose strategy $S_{A1}$ while the individuals in population B respectively choose strategies $S_{B1}$ and $S_{B2}$ from their strategy set and the individuals in population C always choose strategy $S_{C1}$. The meanings of the remaining 10 RNP parameters in Table 2 can be obtained similarly and are not repeated here. If the signs of these 12 RNP parameters are reversed, they become another 12 RNP parameters, which represent the relative net payoffs of the individuals in populations A, B and C who choose the second strategy from their strategy sets.
The system's pure-strategy internal equilibrium points are the 8 vertices of the system's decision space. Based on this, the 8 internal equilibrium points in $\Upsilon_{3P2S-AEG}$ are denoted by $E_1 \sim E_8$ in sequence and are respectively substituted into the Jacobian matrix $J_{3P2S-AEG}$ in Eq. (9). We then obtain its determinant, denoted by $\det(J_{3P2S-AEG})$, its trace, denoted by $\mathrm{tr}(J_{3P2S-AEG})$, and its eigenvalues, denoted by $(\lambda_1, \lambda_2, \lambda_3)$, as presented in Table 3. Table 3 shows that the eigenvalues $(\lambda_1, \lambda_2, \lambda_3)$ of $J_{3P2S-AEG}$ at each pure-strategy internal equilibrium point are just three of the RNP parameters defined in the previous section. This means that the system's long-term ESE characteristics at each of $E_1 \sim E_8$ are determined only by the signs of three RNP parameters.
Therefore, for each internal equilibrium point $E_i$ ($i = 1, 2, \ldots, 8$) in Table 3, assume that its corresponding three RNP parameters are denoted by $RNP_{i,1}$, $RNP_{i,2}$ and $RNP_{i,3}$. For example, the three RNP parameters of $E_1(0, 0, 0)$ are $RNP_{1,1} = a_4 - a_8$, $RNP_{1,2} = b_6 - b_8$, and $RNP_{1,3} = c_7 - c_8$. Then, according to the LyESC elaborated in Section II, when $RNP_{i,1}$, $RNP_{i,2}$ and $RNP_{i,3}$ are all nonzero, the long-term equilibrium characteristics of the general 3P2S-AEG system at each pure-strategy internal equilibrium point $E_i$ ($i = 1, 2, \ldots, 8$) can be described as follows:
$$E_i \text{ is } \begin{cases} \text{an ESE point}, & RNP_{i,1} < 0,\ RNP_{i,2} < 0,\ RNP_{i,3} < 0 \\ \text{an unstable equilibrium point}, & RNP_{i,1} > 0,\ RNP_{i,2} > 0,\ RNP_{i,3} > 0 \\ \text{a saddle point}, & \text{otherwise} \end{cases} \tag{10}$$
Therefore, according to Eq. (10) and Table 3, the long-term ESE state that is spontaneously formed in the general 3P2S-AEG system is determined only by the 12 RNP parameters defined in Table 2, namely $a_4 - a_8$, $\ldots$, and $c_1 - c_2$, which determine the final evolution state of the system in each game situation. By arranging and combining the signs of these RNP parameters, the system's complete long-term equilibrium characteristics contain a total of $4096\ (= 2^{12})$ game situations. Under these game situations, the evolutionary stability conditions of $E_i$ ($i = 1, 2, \ldots, 8$) and its corresponding mutually exclusive equilibrium points are presented in Table 4. Table 4 reveals that the general 3P2S-AEG system can simultaneously achieve at most 4 long-term ESE states at these pure-strategy internal equilibrium points, and they are all strictly refined NE states. In addition, when $E_i$ ($i = 1, 2, \ldots, 8$) becomes an ESS, it corresponds to three mutually exclusive internal equilibrium points from $E_1 \sim E_8$. In order to observe the long-term ESE characteristics of the general 3P2S-AEG system at $E_i$ ($i = 1, 2, \ldots, 8$) shown in Table 4 more intuitively, 12 sets of dynamic simulations are implemented, denoted by Cases 1 to 12, respectively.
TABLE 4. Evolutionary stability conditions and corresponding mutually exclusive equilibrium points of the general 3P2S-AEG system at each of its pure-strategy internal equilibrium points.
The simulation results of these 12 cases in Table 4 are demonstrated in Figure 3, where Case $i$ ($i = 1, 2, \ldots, 8$) shows that the internal equilibrium point $E_i$ becomes the unique long-term ESE state that is spontaneously formed in the system, Cases 9 to 11 respectively show that the system finally achieves only 1, 2 and 4 long-term ESE states, and Case 12 indicates that no long-term ESE state exists in the system after a long-term evolution.
The simulation time is $t \in [0, 20]$, and the simulation results of each case show the phase trajectories of $(x, y)$, $(x, z)$, $(y, z)$ and $(x, y, z)$. Figure 3 shows that the simulation results of the long-term ESE characteristics of the system are completely consistent with the theoretical analysis results in Table 3, thus verifying the effectiveness and practicability of the theoretical results.
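The vertex analysis of the 3P2S-AEG system can be sketched in a few lines: at a pure-strategy vertex, each eigenvalue is the payoff one population would gain by unilaterally switching strategy, and the signs of the three eigenvalues decide the classification. The table layout `table[i][j][k]` (0 = first strategy), its correspondence to parameter number $1 + 4i + 2j + k$, and the parity-based demo payoffs are assumptions for illustration; the correspondence is inferred from the example $E_1(0, 0, 0)$ with $a_4 - a_8$, $b_6 - b_8$ and $c_7 - c_8$.

```python
from itertools import product


def vertex_eigenvalues(vertex, a, b, c):
    """Jacobian eigenvalues of the 3P2S RD model at a pure-strategy vertex.

    vertex  -- (x, y, z) with entries 0 or 1 (1 = first strategy chosen)
    a, b, c -- payoff tables, table[i][j][k] with index 0 = first strategy
    """
    # Strategy indices currently played at this vertex.
    i, j, k = (1 - int(v) for v in vertex)
    # Each eigenvalue is the payoff change from a unilateral deviation.
    return (a[1 - i][j][k] - a[i][j][k],
            b[i][1 - j][k] - b[i][j][k],
            c[i][j][1 - k] - c[i][j][k])


def classify(vertex, a, b, c):
    """Label a vertex by the signs of its three eigenvalues."""
    eig = vertex_eigenvalues(vertex, a, b, c)
    if any(e == 0 for e in eig):
        return "undetermined"  # the sign rule needs nonzero values
    if all(e < 0 for e in eig):
        return "ESE"
    if all(e > 0 for e in eig):
        return "unstable"
    return "saddle"


# Hypothetical payoffs: every population earns 1 when the number of
# "second strategy" choices is even, and 0 otherwise.
parity = [[[1 if (i + j + k) % 2 == 0 else 0 for k in range(2)]
           for j in range(2)] for i in range(2)]
labels = [classify(v, parity, parity, parity)
          for v in product((0, 1), repeat=3)]
```

With the assumed parity payoffs, four mutually non-adjacent vertices are classified as ESE and the other four as unstable, which illustrates how up to four long-term ESE states can coexist.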

4) A BRIEF SUMMARY
Overall, based on a detailed theoretical analysis and dynamic simulation for the long-term equilibrium characteristics of the general 3P2S-AEG system, we can obtain some conclusions as follows.
i) The system's RD equations have only eight internal equilibrium points, as shown in $\Upsilon_{3P2S-AEG}$, and they are all pure strategies. At these equilibrium points, the system can finally achieve a long-term ESE state, which is a strictly refined NE state.
ii) The system has no mixed strategies and cannot achieve a long-term ESE state at a mixed strategy.
iii) Each equilibrium point corresponds to three mutually exclusive equilibrium points, and the evolutionary stability of each equilibrium point is determined only by three RNP parameters.
iv) The system contains 12 RNP parameters, and thus a total of $4096\ (= 2^{12})$ game scenarios. Under these game scenarios, the system contains a total of $32768\ (= 4096 \times 8)$ evolution states during the evolution.
v) The system can be guided to evolve toward an expected long-term ESE state by appropriately adjusting its RNP parameters, i.e., its initial game situations, according to the payoff parameters $a_i$, $b_i$, and $c_i$, $i = 1, 2, \ldots, 8$.
vi) The system can achieve 1, 2 or 4 long-term ESE states simultaneously at its pure strategies, and no long-term ESE exists in the system under some game situations.
vii) When the system achieves a long-term ESE state, the right-hand sides of its RD equations are always equal to 0. At this point, any population of A, B and C can achieve a long-term ESE state in a total of 16 game situations, and no small-sized population with a mutation strategy can invade the evolutionarily stable population.

C. GENERAL 3P3S-AEG MODEL
1) MODEL CONSTRUCTION
As previously stated, we have investigated the long-term evolutionary equilibrium characteristics of the general 3P2SEGs. On this basis, when the three populations each have three pure strategies to choose from in each round of the repeated evolutionary game, the 3P2SEG becomes a much more complex 3P3SEG.
To this end, this section focuses on the 3P3SEG and investigates the long-term equilibrium characteristics of the asymmetric type, namely the 3P3S-AEG. Similar to Eq. (4), the payoff matrix of the general 3P3S-AEG is given in Eq. (11). According to Eq. (11), the decision space of this general 3P3S-AEG system is a six-dimensional space, $[0, 1] \times [0, 1] \times [0, 1] \times [0, 1] \times [0, 1] \times [0, 1]$, where each $[0, 1]$ represents a coordinate dimension. Assume that the expected payoffs of the individuals in population A choosing strategies $S_{A1}$, $S_{A2}$ and $S_{A3}$ are $l_1$, $l_2$ and $l_3$, respectively, those in population B choosing strategies $S_{B1}$, $S_{B2}$ and $S_{B3}$ are $g_1$, $g_2$ and $g_3$, respectively, and those in population C choosing strategies $S_{C1}$, $S_{C2}$ and $S_{C3}$ are $h_1$, $h_2$ and $h_3$, respectively. Besides, assume that the average expected payoffs of populations A, B and C are $l_a$, $g_a$, and $h_a$, respectively. Then, these payoffs can be obtained according to Eq. (11).
Taking population A as an example, the expected payoffs $l_1$, $l_2$ and $l_3$ and the average expected payoff $l_a$ follow from Eq. (11). Based on this, the RD model of the general 3P3S-AEG system is constructed as Eq. (12).
The Jacobian matrix of the RD equations in Eq. (12) is a $6 \times 6$ square matrix, which is denoted by $J_{\text{3P3S-AEG}}$ and given in Eq. (13).
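For orientation, the form such a model takes under the standard replicator dynamics can be sketched as follows. The sketch assumes that $x$ and $y$ are the proportions of population A choosing $S_{A1}$ and $S_{A2}$ (so that $1 - x - y$ choose $S_{A3}$), and likewise $p$, $q$ for population B and $u$, $v$ for population C. This variable assignment is inferred from the six-dimensional decision space and is an assumption; the explicit expressions of $l_k$, $g_k$ and $h_k$ depend on Eq. (11).

```latex
\begin{aligned}
l_a &= x\,l_1 + y\,l_2 + (1 - x - y)\,l_3, \\
g_a &= p\,g_1 + q\,g_2 + (1 - p - q)\,g_3, \\
h_a &= u\,h_1 + v\,h_2 + (1 - u - v)\,h_3, \\[4pt]
\frac{dx}{dt} &= x\,(l_1 - l_a), \qquad \frac{dy}{dt} = y\,(l_2 - l_a), \\
\frac{dp}{dt} &= p\,(g_1 - g_a), \qquad \frac{dq}{dt} = q\,(g_2 - g_a), \\
\frac{du}{dt} &= u\,(h_1 - h_a), \qquad \frac{dv}{dt} = v\,(h_2 - h_a).
\end{aligned}
```

Six state variables give six equations, which is consistent with a $6 \times 6$ Jacobian matrix.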

2) RNP PARAMETERS DEFINITION
Similarly, we can define complete RNP parameters for this general 3P3S-AEG system. First, we calculate the system's pure-strategy equilibrium point set, denoted by 3P3S−AEG .
Because $x$ and $y$ (or $p$ and $q$, or $u$ and $v$) cannot both equal 1, this set contains a total of 27 pure-strategy internal equilibrium points, denoted by $E_1 \sim E_{27}$, as presented in Table 5. Substituting $E_1 \sim E_{27}$ sequentially into the Jacobian matrix $J_{\text{3P3S-AEG}}$ in Eq. (13) yields the corresponding eigenvalues at each equilibrium point, which are also presented in Table 5. In this table, the six eigenvalues of the Jacobian matrix at each pure-strategy internal equilibrium point are defined as the six RNP parameters of that point, which gives a total of 81 RNP parameters with different absolute values, as shown in the third column of Table 5.
Table 5 shows that the long-term ESE characteristics of the general 3P3S-AEG system at each pure-strategy internal equilibrium point are determined by only six RNP parameters, and that its complete long-term equilibrium characteristics comprise a total of $2^{81}$ ($\approx 2.42 \times 10^{24}$) game scenarios. The game scenarios of the general 3P3S-AEG system are therefore extremely complex.
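The counts quoted above can be checked with a short Python sketch. It is illustrative only: the tuple encoding of a pure-strategy profile is a hypothetical choice made here for counting and does not reproduce the ordering of $E_1 \sim E_{27}$ in Table 5, and the reading of a game scenario as one sign assignment to the RNP parameters is an assumption suggested by the power-of-two counts.

```python
from itertools import product


def pure_profiles(n_strategies, n_populations=3):
    """All pure-strategy profiles: one strategy index (1..n) per population.

    Hypothetical encoding used only for counting.
    """
    return list(product(range(1, n_strategies + 1), repeat=n_populations))


def scenario_count(n_rnp_parameters):
    """Number of game scenarios, assuming each RNP parameter is either
    positive or negative (two cases per parameter)."""
    return 2 ** n_rnp_parameters


if __name__ == "__main__":
    print(len(pure_profiles(2)))        # pure-strategy points of the 3P2S-AEG
    print(len(pure_profiles(3)))        # pure-strategy points of the 3P3S-AEG
    print(scenario_count(12))           # scenarios with 12 RNP parameters
    print(f"{scenario_count(81):.2e}")  # scenarios with 81 RNP parameters
```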

3) LONG-TERM ESE CHARACTERISTICS ANALYSIS
As shown above, the general 3P3S-AEG system contains a total of $2^{81}$ game scenarios, so it is impossible to perform dynamic simulation verification for each of them. Instead, we simulate a typical game scenario in which the constructed evolutionary game system achieves the largest number of long-term ESE states simultaneously. According to Table 5, at most seven of $E_1 \sim E_{27}$ can simultaneously become long-term ESE states of the system, each of which is a strictly refined NE state. Accordingly, by appropriately adjusting the system's 81 RNP parameters, we guide the system to evolve toward long-term ESE states at $E_1$, $E_5$, $E_9$, $E_{11}$, $E_{13}$, $E_{21}$ and $E_{25}$ simultaneously. This means that these seven pure-strategy internal equilibrium points are spontaneously formed as ESE states in the system at the same time after a long-term evolution. The simulation is shown in Figure 4, where each component of the initial point (x, y, p, q, u, v) is taken from 0 to 1 at an interval of 1/2 within the system's six-dimensional decision space, i.e., a total of 729 ($= 3^6$) rounds of repeated evolutionary game dynamic simulations are conducted to observe the phase trajectories of (x, y, p), (x, y, q), (x, y, u), (x, y, v), (x, p, q), (x, p, u), (x, p, v), (x, q, u), (x, q, v), (x, u, v), (y, p, q), (y, p, u), (y, p, v), (y, q, u), (y, q, v), (y, u, v), (p, q, u), (p, q, v), (p, u, v) and (q, u, v) during the evolution of the system. These phase trajectories are denoted by Phase Trajectory 1 to Phase Trajectory 20, respectively, as shown in Figure 4. In each figure, the red solid dot represents the long-term ESE state spontaneously formed in the system after a long-term evolution.
Figure 4 shows that the pure-strategy internal equilibrium points $E_1$, $E_5$, $E_9$, $E_{11}$, $E_{13}$, $E_{21}$ and $E_{25}$ in Table 5 simultaneously become the system's long-term ESE states, which verifies the theoretical analysis above.
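The bookkeeping of this simulation design, namely the grid of initial points and the list of three-variable phase projections, can be sketched in Python as below. The function names are hypothetical, and the sketch only enumerates the initial points and projections; it does not run the dynamics.

```python
from itertools import combinations, product

VARIABLES = ("x", "y", "p", "q", "u", "v")


def initial_grid(intervals, dims=6):
    """All initial points whose coordinates lie in {0, 1/k, 2/k, ..., 1},
    where k = intervals is the number of equal steps between 0 and 1."""
    levels = [i / intervals for i in range(intervals + 1)]
    return list(product(levels, repeat=dims))


def phase_projections(variables=VARIABLES, size=3):
    """All three-variable projections, in lexicographic order of the variables."""
    return list(combinations(variables, size))


if __name__ == "__main__":
    print(len(initial_grid(2)))      # interval 1/2 in six dimensions
    print(len(phase_projections()))  # number of phase trajectories
    print(phase_projections()[:4])   # first few projections
```

The same helper covers the three-dimensional grids of the application example, e.g. `initial_grid(4, dims=3)` for an interval of 1/4.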

D. GENERAL 3PNS-AEG MODEL
1) MODELING IDEA
Based on the theoretical analysis and dynamic simulation verification of the specific 3PmSEG models in the previous sections, this section elaborates the modeling idea of the general three-population n-strategy ($n \geq 2$) asymmetric evolutionary game (3PnS-AEG). In the general 3PnS-AEG system, the three populations A, B and C each have $n$ strategies in their strategy sets. Concretely, the strategy set of population A is $A_n = \{S_{A,1}, S_{A,2}, \ldots, S_{A,n}\}$, and the probabilities (or individual proportions) with which the individuals in population A choose strategies $S_{A,1}, S_{A,2}, \ldots, S_{A,n}$ are $x_{A,1}, x_{A,2}, \ldots, x_{A,n}$, respectively, where $x_{A,1} + x_{A,2} + \cdots + x_{A,n} = 1$. Similarly, the strategy set of population B is $B_n = \{S_{B,1}, S_{B,2}, \ldots, S_{B,n}\}$, with probabilities $y_{B,1}, y_{B,2}, \ldots, y_{B,n}$, where $y_{B,1} + y_{B,2} + \cdots + y_{B,n} = 1$, and the strategy set of population C is $C_n = \{S_{C,1}, S_{C,2}, \ldots, S_{C,n}\}$, with probabilities $z_{C,1}, z_{C,2}, \ldots, z_{C,n}$, where $z_{C,1} + z_{C,2} + \cdots + z_{C,n} = 1$. In addition, assume that the expected payoffs of the individuals in population A choosing strategies $S_{A,1}, S_{A,2}, \ldots, S_{A,n}$ are $U_{A,1}, U_{A,2}, \ldots, U_{A,n}$, respectively. Similarly, the expected payoffs of populations B and C are $U_{B,1}, U_{B,2}, \ldots, U_{B,n}$ and $U_{C,1}, U_{C,2}, \ldots, U_{C,n}$, respectively. The payoffs $U_{A,k}$, $U_{B,k}$ and $U_{C,k}$ ($k = 1, 2, \ldots, n$) are given by Eq. (14),
where $u_{A,k,i,j}$ is the payoff of the individuals in population A when they choose the $k$th strategy from $A_n$ while the individuals in population B choose the $i$th strategy from $B_n$ and those in population C choose the $j$th strategy from $C_n$; $u_{B,k,i,j}$ is the payoff of the individuals in population B when they choose the $k$th strategy from $B_n$ while the individuals in population A choose the $i$th strategy from $A_n$ and those in population C choose the $j$th strategy from $C_n$; and $u_{C,k,i,j}$ is the payoff of the individuals in population C when they choose the $k$th strategy from $C_n$ while the individuals in population A choose the $i$th strategy from $A_n$ and those in population B choose the $j$th strategy from $B_n$. Based on Eq. (14), the average expected payoffs of populations A, B and C, denoted by $U_{A\_ave}$, $U_{B\_ave}$ and $U_{C\_ave}$, respectively, are given by Eq. (15). Based on Eq. (14) and Eq. (15), the RD model of the general 3PnS-AEG system is described by Eq. (16), which holds for all $k = 1, 2, \ldots, n$ and all $t$. Eq. (16) shows that, in the general 3PnS-AEG model, the growth rate of the individual proportion (or probability) with which a population chooses a pure strategy is proportional to that proportion and to the difference between the expected payoff (or profit) under this pure strategy and the average expected payoff (or profit) of the population. It therefore reveals the evolution trend of the population behavior of the boundedly rational individuals in a population.
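The relations described in words above can be written out as follows. This is a reconstruction from the verbal definitions, assuming the standard replicator form, and should be read as a sketch of what Eqs. (14) to (16) express rather than as a verbatim copy of them.

```latex
\begin{aligned}
U_{A,k} &= \sum_{i=1}^{n}\sum_{j=1}^{n} y_{B,i}\, z_{C,j}\, u_{A,k,i,j}, &
U_{A\_ave} &= \sum_{k=1}^{n} x_{A,k}\, U_{A,k}, &
\frac{dx_{A,k}}{dt} &= x_{A,k}\left(U_{A,k} - U_{A\_ave}\right), \\
U_{B,k} &= \sum_{i=1}^{n}\sum_{j=1}^{n} x_{A,i}\, z_{C,j}\, u_{B,k,i,j}, &
U_{B\_ave} &= \sum_{k=1}^{n} y_{B,k}\, U_{B,k}, &
\frac{dy_{B,k}}{dt} &= y_{B,k}\left(U_{B,k} - U_{B\_ave}\right), \\
U_{C,k} &= \sum_{i=1}^{n}\sum_{j=1}^{n} x_{A,i}\, y_{B,j}\, u_{C,k,i,j}, &
U_{C\_ave} &= \sum_{k=1}^{n} z_{C,k}\, U_{C,k}, &
\frac{dz_{C,k}}{dt} &= z_{C,k}\left(U_{C,k} - U_{C\_ave}\right).
\end{aligned}
```

Summing each population's equations over $k$ gives zero, so the proportions of each population continue to sum to 1 along a trajectory.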

2) CONVERGENCE ITERATION CALCULATION METHOD
After the RD model of the general 3PnS-AEG system is established as shown in Eq. (16), it needs to be discretized to facilitate the iterative calculation of the system in the process of the repeated evolutionary game. To this end, when the simulation iteration proceeds to the $m$th step, the convergence iteration calculation is designed as Eq. (17), where $\sigma_{m,k}$, $\rho_{m,k}$ and $\tau_{m,k}$ are the step sizes of the selection probability (or individual proportion) of the $k$th strategy of populations A, B and C in the $m$th iteration, respectively, and are usually taken as very small positive numbers.
The structure of Eq. (17) is based on the RD equation structure shown in Eq. (3). The convergence properties and iterative mechanism of evolutionary game theory embodied in Eq. (3) guarantee that Eq. (17) is also convergent. Specifically, as the iterations continue (each round of iteration represents one evolutionary game process, i.e., one population strategy selection process) and a strategy becomes evolutionarily stable, an individual's expected payoff (or return) approaches the average expected payoff (or return) of the entire population. Taking population A as an example, when the system reaches a long-term ESE state, $U_{A,k}(m)$ gradually approaches $U_{A\_ave}(m)$. As a result, $x_{A,k}(m+1)$ gradually approaches $x_{A,k}(m)$. This means that the proportion of individuals in population A that choose this evolutionarily stable strategy tends to 100% and remains stable.
In addition, the design of the iteration step size in Eq. (17) ensures that the selection probability (or individual proportion) of each strategy stays within [0, 1] in every iteration. Further, to make the evolutionary game system converge to the expected accuracy in the iterative process, a very small positive number is usually set to determine whether the iterative calculation of populations A, B and C has reached the convergence condition. Once the expected accuracy is reached, the iterative calculation for the corresponding population can be terminated, as follows,
where $o_{1,k}$, $o_{2,k}$ and $o_{3,k}$ are very small positive numbers set for populations A, B and C in their iterative calculation processes, respectively. These numbers are used to judge whether each population has reached the expected ESE state with the expected convergence accuracy after a long-term evolution.
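A minimal Python sketch of such a convergence iteration is given below. It is not the paper's Eq. (17) or its termination rule: it assumes an explicit Euler discretization of the replicator dynamics, a single constant step size `step` in place of $\sigma_{m,k}$, $\rho_{m,k}$ and $\tau_{m,k}$, and one joint tolerance `tol` in place of $o_{1,k}$, $o_{2,k}$ and $o_{3,k}$. All function names and the example payoffs are hypothetical.

```python
from itertools import product


def expected_payoffs(u, p_first, p_second):
    """Expected payoff of each own strategy k against two mixed opponents.

    u[k][i][j] is the payoff of own strategy k when the first opponent plays
    strategy i and the second opponent plays strategy j.
    """
    pairs = list(product(range(len(p_first)), range(len(p_second))))
    return [sum(p_first[i] * p_second[j] * u_k[i][j] for i, j in pairs) for u_k in u]


def replicator_step(shares, payoffs, step):
    """One explicit Euler step: a share grows in proportion to itself and to
    (its expected payoff - the population's average expected payoff)."""
    average = sum(s * f for s, f in zip(shares, payoffs))
    return [s + step * s * (f - average) for s, f in zip(shares, payoffs)]


def iterate_to_convergence(u_a, u_b, u_c, x, y, z, step=0.05, tol=1e-10, max_iter=100_000):
    """Update all three populations until no share moves by more than tol.

    Returns the final shares and the number of iterations performed.
    """
    for m in range(1, max_iter + 1):
        x_new = replicator_step(x, expected_payoffs(u_a, y, z), step)  # A faces (B, C)
        y_new = replicator_step(y, expected_payoffs(u_b, x, z), step)  # B faces (A, C)
        z_new = replicator_step(z, expected_payoffs(u_c, x, y), step)  # C faces (A, B)
        change = max(abs(new - old) for new, old in zip(x_new + y_new + z_new, x + y + z))
        x, y, z = x_new, y_new, z_new
        if change < tol:
            return x, y, z, m
    return x, y, z, max_iter


if __name__ == "__main__":
    # Hypothetical payoffs: strategy 0 always pays 2, strategy 1 always pays 1.
    dominant = [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]
    print(iterate_to_convergence(dominant, dominant, dominant,
                                 [0.5, 0.5], [0.3, 0.7], [0.8, 0.2]))
```

In this toy example each population's share of the better-paying strategy approaches 1, and the per-step change shrinks with the gap between the expected payoff and the population average, which is the mechanism described above.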

E. A SUMMARY
Following the research ideas in this section, the long-term equilibrium characteristics of the general two-population multi-strategy evolutionary games can be investigated further. To this end, we first compare several general multi-population multi-strategy evolutionary games from several aspects, as presented in Table 6. The games compared include the two-population two-strategy symmetric and asymmetric evolutionary games, denoted by 2P2S-SEG and 2P2S-AEG, respectively, the two-population three-strategy symmetric evolutionary game (2P3S-SEG), 3P2S-SEG, 3P2S-AEG, and 3P3S-AEG. Table 6 reveals that the total number of game scenarios of a given evolutionary game system equals 2 raised to the power of the total number of the system's RNP parameters. Therefore, as the number of populations in the evolutionary game system increases, or as the number of strategies in each population's strategy set increases, the total number of game scenarios and evolution states (including stable, unstable and critical evolution states) of the whole system increases dramatically.

IV. APPLICATION EXAMPLE IN LONG-TERM ON-GRID BIDDING OF A GENERATION-SIDE ELECTRICITY MARKET
A. SUPPLY-SIDE MARKET POWER GENERATION AMOUNT COMPETITION EVOLUTIONARY GAME MODEL
This section explores the application of 3PmSEGs. For ease of explanation, the 3P2S-AEG is taken as an example to describe the application of this more common evolutionary game type in the engineering field. Based on [35], the competition for on-grid power generation amount in the supply-side power generation market is taken as the application example. It involves three populations of enterprises: the new energy generation enterprises (population A), the traditional energy generation enterprises (population B), and the power grid enterprises (population C). In fact, investigations of long-term bidding issues in the power generation market based on game-theoretic approaches [2] and the latest artificial intelligence techniques [45]-[50] have been research highlights in the field of electricity markets in recent years.
In actual market bidding scenarios, the competition for on-grid power generation amount among these three boundedly rational enterprise populations is a long-term market equilibrium evolution process. Moreover, this process takes place in an information system with limited information and bounded rationality. Therefore, EGT is well suited to addressing such long-term equilibrium issues.
Based on the assumptions above, the strategy sets of the new energy generation enterprises (population A), the traditional energy generation enterprises (population B), and the power grid enterprises (population C) each contain two pure strategies for on-grid power generation amount competition, namely $\{S_{A1}, S_{A2}\}$, $\{S_{B1}, S_{B2}\}$ and $\{S_{C1}, S_{C2}\}$, respectively. The corresponding selection probabilities $(\alpha, \beta, \gamma)$ lie in $[0, 1] \times [0, 1] \times [0, 1]$, which is the decision space of this tripartite long-term bidding evolutionary game system.
Since the aim of this section is to verify the conclusions drawn in the previous sections about the long-term equilibrium properties and laws of the 3PmSEG system, the application example focuses on qualitative research and simulation validation. How to design the specific utility functions of the parties involved in the long-term bidding in the generation-side EM belongs to the scope of qualitative research and is not discussed here; such utility functions can be found in the literature. It is well known that the design of the specific utility function is critical to the strategy that each party ultimately adopts.
Because long-term bidding in the power generation-side market involving new energy enterprises is an emerging field, the utility functions of the parties involved are complex and diverse. This is also the focus of our next step: through qualitative research on the utility functions of different enterprises participating in long-term market bidding in different environments, the specific benefits or payoffs will be determined, so that a specific quantitative study of the market's long-term ESE characteristics can be conducted, more accurate conclusions can be drawn, and more comprehensive market supervision measures can be formulated.
Based on the elaborations above, the payoff matrix of this power generation amount competition evolutionary game system is constructed as Eq. (19), where $l_i$, $m_i$ and $n_i$ ($i = 1, 2, \ldots, 8$) are the general payoff parameters set in this example to represent the payoffs under different strategy combinations.
Based on Eq. (19), as shown at the bottom of the next page, the strategies chosen in each round of the evolutionary game are as follows. Pure strategies $S_{A1}$ and $S_{A2}$ are selected by the individuals in population A with probabilities (or individual proportions) of $\alpha$ and $(1 - \alpha)$, respectively. They indicate that population A chooses to cooperate with population B and completes an on-grid power generation amount of $W_1$ via new energy resources, and that population A chooses not to cooperate with population B and completes a new energy on-grid power generation amount of $W_2$, respectively. Pure strategies $S_{B1}$ and $S_{B2}$ are chosen by the individuals in population B with probabilities of $\beta$ and $(1 - \beta)$, respectively. They indicate that population B chooses to cooperate with population A and completes an on-grid power generation amount of $T_1$ via traditional energy resources, and that population B chooses not to cooperate with population A and completes an on-grid power generation amount of $T_2$ via traditional energy resources, respectively. Pure strategies $S_{C1}$ and $S_{C2}$ are selected by the individuals in population C with probabilities of $\gamma$ and $(1 - \gamma)$, respectively. They indicate that population C chooses to actively participate in new energy accommodation and accommodates a new energy amount of $G_1$, and that population C chooses to passively participate in new energy accommodation and accommodates a new energy amount of $G_2$, respectively.

B. TRIPARTITE EVOLUTIONARY GAME SIMULATION UNDER NO GOVERNMENT SUPERVISION
Substituting the eight pure-strategy internal equilibrium points of the 3P2S-AEG system into Eq. (21) in sequence yields the eigenvalues, determinant and trace of $J_{ABC}$ at each equilibrium point, as presented in Table 7. This table shows that the power generation market can achieve 1, 2 or 4 long-term ESE states simultaneously, i.e., at most four power generation amount competition ESSs at the same time. These equilibria are obtained when no government supervision is imposed on the market. Under no government supervision, the market finally and spontaneously forms the following long-term ESE after a long-term evolution.
First, regardless of whether power grid enterprise population C participates actively or passively in new energy accommodation, and regardless of whether traditional energy generation enterprise population B chooses to cooperate with new energy power generation enterprise population A, the individuals in population A tend to choose the second competition strategy in their strategy set to obtain a larger on-grid power generation amount and thus more profit. In this case, when population C chooses to actively participate in new energy accommodation, we have $l_5 > l_1$ and $l_7 > l_3$. According to Table 7, the pure-strategy internal equilibrium points $E_6(1, 0, 1)$ and $E_8(1, 1, 1)$ then become evolutionarily unstable, i.e., they cannot be spontaneously formed as long-term ESE states in the market. Similarly, the individuals in population B obtain a larger on-grid power generation amount and more profit by choosing not to cooperate with population A than by choosing to cooperate with it. This gives $m_3 > m_1$ and $m_7 > m_5$.
Second, regardless of whether population A chooses to cooperate with population B, the individuals in power grid enterprise population C obtain more profit by participating passively in new energy accommodation than by participating actively. This is because power grid enterprises that do not actively participate in new energy accommodation need no additional investment in building a grid to accommodate new energy resources, which reduces operating costs and yields higher profits. Accordingly, when population A chooses to cooperate with population B, we have $n_2 > n_1$ and $n_4 > n_3$, and when population A chooses not to cooperate with population B, we have $n_6 > n_5$ and $n_8 > n_7$. According to Table 7, the pure-strategy internal equilibrium points $E_6(1, 0, 1)$, $E_8(1, 1, 1)$, $E_2(0, 0, 1)$ and $E_4(0, 1, 1)$ therefore become unstable, i.e., they cannot be spontaneously formed as long-term ESE states in the market during the evolution. Overall, when the government conducts no supervision of the market, $E_2$, $E_3$, $E_4$, $E_5$, $E_6$, $E_7$ and $E_8$ all become evolutionarily unstable competition strategies, i.e., they cannot be spontaneously formed as long-term ESE states in the market. The market then finally achieves a unique long-term ESE state at the pure-strategy internal equilibrium point $E_1(0, 0, 0)$, which indicates that new energy power generation enterprise population A and traditional energy generation enterprise population B choose not to cooperate with each other, while power grid enterprise population C chooses to passively participate in new energy accommodation. Obviously, this causes a large amount of new energy power generation in the market to be abandoned.
As a result, the curtailment of wind and solar energy resources gradually becomes very serious, which is not conducive to the sustainable development of renewables and can easily cause market turmoil and long-term unhealthy operation.
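The stability argument above can be illustrated with a short Python sketch. It rests on two assumptions that are not taken from the paper's equations: the payoff index $i$ is assumed to run over strategy profiles in the order $(S_{A1}, S_{B1}, S_{C1})$ for $i = 1$ up to $(S_{A2}, S_{B2}, S_{C2})$ for $i = 8$, an ordering inferred from the inequalities in the text; and each population is assumed to follow the two-strategy replicator form $\dot{\alpha} = \alpha(1 - \alpha)(U_{A1} - U_{A2})$, under which the Jacobian at a vertex is diagonal and each eigenvalue is the payoff gain from a unilateral switch of strategy. The numerical payoffs are hypothetical.

```python
from itertools import product


def payoff_index(a, b, c):
    """1-based payoff index of a profile; a, b, c are 1 (first) or 2 (second strategy).

    Assumed ordering: index 1 is (S_A1, S_B1, S_C1), index 8 is (S_A2, S_B2, S_C2).
    """
    return 1 + 4 * (a - 1) + 2 * (b - 1) + (c - 1)


def vertex_eigenvalues(l, m, n, vertex):
    """Eigenvalues of the (diagonal) Jacobian at a vertex (alpha, beta, gamma).

    A coordinate of 1 means the population plays its first strategy.
    l, m, n are dicts mapping the payoff index 1..8 to a payoff.
    """
    played = [1 if share == 1 else 2 for share in vertex]
    here = payoff_index(*played)
    eigenvalues = []
    for position, payoff in enumerate((l, m, n)):
        deviated = list(played)
        deviated[position] = 3 - played[position]  # switch 1 <-> 2
        # Gain from a unilateral switch; negative means the switch does not pay.
        eigenvalues.append(payoff[payoff_index(*deviated)] - payoff[here])
    return eigenvalues


def stable_vertices(l, m, n):
    """Vertices at which all three eigenvalues are negative."""
    return [v for v in product((0, 1), repeat=3)
            if all(e < 0 for e in vertex_eigenvalues(l, m, n, v))]


if __name__ == "__main__":
    # Hypothetical payoffs in which the second strategy always pays more.
    l = {i: 2.0 if i >= 5 else 1.0 for i in range(1, 9)}
    m = {i: 2.0 if i in (3, 4, 7, 8) else 1.0 for i in range(1, 9)}
    n = {i: 2.0 if i % 2 == 0 else 1.0 for i in range(1, 9)}
    print(stable_vertices(l, m, n))
```

With these hypothetical payoffs, every vertex other than (0, 0, 0) has at least one positive eigenvalue, which mirrors the conclusion that $E_1(0, 0, 0)$ is the unique long-term ESE under no supervision.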
To verify the findings above, we conduct dynamic simulations. We take the initial values of $\alpha$, $\beta$ and $\gamma$ from 0 to 1 within the system's decision space $[0, 1] \times [0, 1] \times [0, 1]$ at intervals of 1/4, 1/5, 1/6, 1/7 and 1/8, respectively. This means that we conduct 125, 216, 343, 512 and 729 rounds of repeated power generation amount competition evolutionary game dynamic simulations, respectively, to observe the phase trajectory of $(\alpha, \beta, \gamma)$ during the long-term evolution of the market. These five sets of dynamic simulations are denoted by Cases 1 to 5 and are shown in Figure 5 (a) to (e), respectively, where the red, green and blue solid dots indicate that the market finally reaches the unique power generation amount competition ESE, an unstable evolution equilibrium, and a critical evolution equilibrium, respectively. Figure 5 reveals that, when the government conducts no supervision of the power generation market, the market achieves the unique long-term ESE state at the pure-strategy internal equilibrium point $E_1(0, 0, 0)$ and cannot obtain a power generation amount competition ESS at $E_2$, $E_3$, $E_4$, $E_5$, $E_6$ and $E_7$. Therefore, the simulation results verify the theoretical analysis results presented in Table 7.
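A simulation of this kind can be sketched in Python as follows. The sketch is not the paper's simulation code: it assumes the two-strategy replicator form $\dot{\alpha} = \alpha(1 - \alpha)(U_{A1} - U_{A2})$ for each population, an explicit Euler step, the payoff index ordering $(S_{A1}, S_{B1}, S_{C1}) \to 1, \ldots, (S_{A2}, S_{B2}, S_{C2}) \to 8$ inferred from the inequalities in the text, and hypothetical payoff values. It starts only from interior grid points, because points on the boundary of the decision space with a coordinate equal to 0 or 1 keep that coordinate under the replicator dynamics.

```python
from itertools import product


def payoff_gaps(l, m, n, alpha, beta, gamma):
    """Expected payoff of the first strategy minus that of the second,
    for populations A, B and C (payoff dicts keyed 1..8, assumed ordering)."""
    def mix(gaps, p, q):
        # gaps ordered by opponents' strategies: (1,1), (1,2), (2,1), (2,2)
        g11, g12, g21, g22 = gaps
        return (p * q * g11 + p * (1 - q) * g12
                + (1 - p) * q * g21 + (1 - p) * (1 - q) * g22)

    gap_a = mix((l[1] - l[5], l[2] - l[6], l[3] - l[7], l[4] - l[8]), beta, gamma)
    gap_b = mix((m[1] - m[3], m[2] - m[4], m[5] - m[7], m[6] - m[8]), alpha, gamma)
    gap_c = mix((n[1] - n[2], n[3] - n[4], n[5] - n[6], n[7] - n[8]), alpha, beta)
    return gap_a, gap_b, gap_c


def simulate(l, m, n, start, step=0.1, rounds=3000):
    """Explicit Euler iteration of the assumed replicator dynamics."""
    alpha, beta, gamma = start
    for _ in range(rounds):
        gap_a, gap_b, gap_c = payoff_gaps(l, m, n, alpha, beta, gamma)
        alpha += step * alpha * (1 - alpha) * gap_a
        beta += step * beta * (1 - beta) * gap_b
        gamma += step * gamma * (1 - gamma) * gap_c
    return alpha, beta, gamma


def interior_grid(intervals):
    """Grid points with every coordinate strictly between 0 and 1."""
    levels = [i / intervals for i in range(1, intervals)]
    return list(product(levels, repeat=3))


if __name__ == "__main__":
    # Hypothetical no-supervision payoffs: the second strategy always pays more.
    l = {i: 2.0 if i >= 5 else 1.0 for i in range(1, 9)}
    m = {i: 2.0 if i in (3, 4, 7, 8) else 1.0 for i in range(1, 9)}
    n = {i: 2.0 if i % 2 == 0 else 1.0 for i in range(1, 9)}
    finals = [simulate(l, m, n, start) for start in interior_grid(5)]
    print(max(max(point) for point in finals))  # largest final coordinate
```

With these hypothetical payoffs, every interior starting point is driven toward (0, 0, 0), in line with the behavior reported for the unsupervised market.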

C. TRIPARTITE EVOLUTIONARY GAME SIMULATION UNDER GOVERNMENT SUPERVISION
Obviously, the market cannot develop healthily in the above-mentioned unique ESE state. This is extremely disadvantageous for promoting the participation of new energy power generation enterprises in the EM and for promoting new energy accommodation. Therefore, it is essential to guide the market to evolve toward an expected long-term ESE state. For this purpose, as stated in Section III, we can appropriately adjust the market's RNP parameters. Concretely, according to Table 7, this can be achieved by the government developing an effective on-grid trading rule for generation-side EM transactions. The government needs to effectively supervise and guide new energy and traditional energy power generation enterprises to cooperate with each other, and simultaneously to promote the power grid enterprises to actively participate in new energy accommodation. Under such a market situation, the government still needs to let other unreasonable on-grid power generation amount competition strategies gradually disappear in the long-term evolution of the market. This means that the expected market situation will gradually become the unique long-term ESE state that is spontaneously formed in the market.
FIGURE 5. Dynamic simulation results of the generation-side on-grid power generation amount competition evolutionary game involving participation of new energy enterprise population when the government conducts no supervision to the market: (a)∼(e) show the phase trajectory of (α, β, γ) after 125, 216, 343, 512 and 729 rounds of repeated on-grid power generation amount competition evolutionary game dynamic simulations, respectively.
Therefore, according to Ref. [29], by formulating effective trading rules to appropriately adjust the market's RNP parameters, the market can be guided to evolve toward the expected long-term ESE state at $E_8(1, 1, 1)$. This pure-strategy internal equilibrium point becomes the unique ESS when the following five conditions are met simultaneously.
i) $l_5 < l_1$, $m_3 < m_1$ and $n_2 < n_1$ all hold, which makes $E_8(1, 1, 1)$ an ESS and, accordingly, makes $E_4(0, 1, 1)$, $E_6(1, 0, 1)$ and $E_7(1, 1, 0)$ unstable evolution equilibrium points.
ii) At least one of $l_4 > l_8$, $m_6 > m_8$ and $n_7 > n_8$ holds, which makes $E_1(0, 0, 0)$ an unstable evolution equilibrium point.
iii) At least one of $l_3 > l_7$, $m_5 > m_7$ and $n_8 > n_7$ holds, which makes $E_2(0, 0, 1)$ an unstable evolution equilibrium point.
iv) At least one of $l_2 > l_6$, $m_8 > m_6$ and $n_5 > n_6$ holds, which makes $E_3(0, 1, 0)$ an unstable evolution equilibrium point.
v) At least one of $l_8 > l_4$, $m_2 > m_4$ and $n_3 > n_4$ holds, which makes $E_5(1, 0, 0)$ an unstable evolution equilibrium point.
When these five conditions are met at the same time, the internal equilibrium point $E_8(1, 1, 1)$ becomes the unique long-term ESE of the on-grid power generation amount competition evolutionary game in the supply-side market involving three types of enterprise populations. In this unique equilibrium, new energy and traditional energy power generation enterprises choose to cooperate with each other so that the former participate actively in power generation amount competition, while the power grid enterprises choose to actively participate in new energy accommodation based on load forecasting of a certain accuracy. This further increases the on-grid power generation amount and minimizes the waste of new energy, such as wind energy curtailment and solar energy curtailment.
This is of great significance for the power grid to achieve peak shaving and load leveling and long-term safe and stable operation.
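The five conditions can be transcribed directly into a small Python check, sketched below. The function name and the example payoffs are hypothetical; `l`, `m` and `n` are dictionaries mapping the payoff index $i = 1, \ldots, 8$ to the payoffs $l_i$, $m_i$ and $n_i$.

```python
def e8_is_unique_ess(l, m, n):
    """Return True when the five stated conditions for E8(1, 1, 1) all hold."""
    cond_1 = l[5] < l[1] and m[3] < m[1] and n[2] < n[1]  # E8 stable; E4, E6, E7 unstable
    cond_2 = l[4] > l[8] or m[6] > m[8] or n[7] > n[8]    # E1(0, 0, 0) unstable
    cond_3 = l[3] > l[7] or m[5] > m[7] or n[8] > n[7]    # E2(0, 0, 1) unstable
    cond_4 = l[2] > l[6] or m[8] > m[6] or n[5] > n[6]    # E3(0, 1, 0) unstable
    cond_5 = l[8] > l[4] or m[2] > m[4] or n[3] > n[4]    # E5(1, 0, 0) unstable
    return all((cond_1, cond_2, cond_3, cond_4, cond_5))


if __name__ == "__main__":
    # Hypothetical payoffs under supervision: the first strategy always pays more.
    l_sup = {i: 2.0 if i <= 4 else 1.0 for i in range(1, 9)}
    m_sup = {i: 2.0 if i in (1, 2, 5, 6) else 1.0 for i in range(1, 9)}
    n_sup = {i: 2.0 if i % 2 == 1 else 1.0 for i in range(1, 9)}
    print(e8_is_unique_ess(l_sup, m_sup, n_sup))
```

Such a check is one way a regulator-side analysis might screen candidate trading rules: a rule is translated into payoff parameters, and the conditions indicate whether $E_8(1, 1, 1)$ would be the unique ESS under those payoffs.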
To verify these findings, we appropriately adjust the above RNP parameters so that the five conditions hold, and perform dynamic simulations to demonstrate the case where a unique evolutionarily stable equilibrium point exists in the power generation market, i.e., the internal equilibrium point $E_8(1, 1, 1)$ becomes the unique long-term ESE state of the market. Concretely, we take the initial values of $\alpha$, $\beta$ and $\gamma$ from 0 to 1 within the market's decision space $[0, 1] \times [0, 1] \times [0, 1]$ at intervals of 1/6, 1/7, 1/8 and 1/9, respectively. This means that we conduct 343, 512, 729 and 1000 rounds of repeated on-grid power generation amount competition evolutionary game dynamic simulations, respectively, to observe the phase trajectory of the market strategy $(\alpha, \beta, \gamma)$ during the long-term evolution of the market. The four sets of dynamic simulations are denoted by Cases 1 to 4 and are shown in Figure 6 (a)-(d), respectively, where the red, green and blue solid dots have the same meaning as in Figure 5. Figure 6 reveals that the market achieves the unique ESE state at $E_8(1, 1, 1)$ when the above five conditions are met during a long-term evolution. In this case, the remaining seven pure-strategy internal equilibrium points $E_1 \sim E_7$ become evolutionarily unstable or critical equilibrium points, as illustrated by the green and blue solid dots in each figure, and they gradually disappear from the market because they cannot invade a market that has reached a long-term ESE state.
Overall, the application example in this section fully verifies the effectiveness and universality of research and analysis results on the long-term ESE characteristics of 3P2SEGs. It also shows that, by determining the complete RNP parameters of the evolutionary game model of a specific application example, the evolution state of the system at all internal equilibrium points can be fully explored, thus realizing the complete theoretical analysis and dynamic simulation verification of the long-term equilibrium characteristics of the system. In addition, research shows that, based on appropriate adjustment of the market's RNP parameters through some external factors such as government supervision and making effective trading rules, the whole competitive market can be guided to evolve toward an expected long-term ESE state during the evolution. This has important theoretical guidance and reference significance for studying the more complex multi-population multi-strategic on-grid power generation amount competition games in the supply-side power generation market, especially for the complex asymmetric market bidding issues.

D. POLICY IMPLICATIONS
Based on the case study above, we suggest that the government should vigorously guide new energy power generation enterprises to participate in long-term bidding in the power generation market while improving overall social welfare, so as to promote new energy consumption. By actively guiding new energy generation enterprises, the government can also take an active part in adjusting the energy structure and in shaping the future development direction of the new energy industry.
In addition, while monitoring the power generation market, the government can reasonably use fiscal instruments such as tax subsidies to promote the development of new energy industries. At the same time, the government can impose measures such as a carbon tax or an environmental tax on traditional energy enterprises to restrict their participation in on-grid bidding in the power generation market.
Overall, through the active intervention and adequate guidance of the government, a close cooperative development relationship between new energy generation enterprises and traditional energy generation enterprises needs to be promoted in the future in order to achieve win-win cooperation and ultimately accelerate the consumption of new energy and maximize the total social welfare.

V. CONCLUSION
This paper explores the long-term ESE of the general 3PmSEGs. Based on this, the long-term on-grid price bidding of a generation-side EM with three parties is thoroughly investigated. Overall, the main contributions are summarized as follows.
i) The long-term ESE characteristics of general 3P2S-SEG, 3P2S-AEG, and 3P3S-AEG systems are systematically investigated and summarized. Complete RNP parameters are defined for them. Besides, the modeling idea and convergence iteration method of general 3PnS-AEG systems are elaborated.
ii) Research reveals that proper regulation of the evolutionary game system's RNP parameters is essential, because it can gradually drive the system to evolve spontaneously toward an expected long-term ESE state. Therefore, the key to investigating the long-term ESE characteristics of the general 3PmSEGs is first to determine and define their RNP parameters according to their payoff matrices.
iii) To verify the effectiveness and practicability of the general 3PmSEG models in this paper, the long-term on-grid bidding of a generation-side EM involving three enterprise populations is investigated.
iv) The application case study reveals that, under no government supervision, the two power generation enterprise populations will choose not to cooperate with each other, while the power grid enterprise population will choose to passively participate in new energy accommodation. In contrast, under government supervision, the market's RNP parameters can be appropriately adjusted by the government, so that the two power generation enterprise populations can be guided to actively cooperate with each other to promote more new energy accommodation, while the power grid enterprise population can also be guided to actively participate in new energy accommodation.
v) The case study also indicates that the government should appropriately regulate the market's RNP parameters according to actual market conditions. This is of great significance to the long-term sustainable and healthy development of new energy resources and the supply-side power market. This can also avoid new energy curtailment, including wind energy curtailment and solar energy curtailment.
Overall, the methodology and the conclusions obtained have a certain universality and validity, and can be applied to investigate complex behavioral decision-making issues in many practical scenarios, especially the more common 3PmSEG scenarios. They are expected to provide some ideas and references for the investigation of complex multi-population multi-strategy behavioral decision-making issues involving boundedly rational stakeholders in related fields.

DRM  demand-side response management
ESE  evolutionarily stable equilibrium
EGT  evolutionary game theory
ESS  evolutionarily stable strategy
ESSs  evolutionarily stable strategies
EM  electricity market
LyESC  Lyapunov method-based evolutionary stability criterion
NE/NEs  Nash equilibrium/Nash equilibria
RD  replicator dynamics
RNP  relative net payoff
StEG  stochastic evolutionary game
2P2S-SEG  two-population two-strategy symmetric evolutionary game
2P2S-AEG  two-population two-strategy asymmetric evolutionary game
3P2S-SEG  three-population two-strategy symmetric evolutionary game
3P2S-AEG  three-population two-strategy asymmetric evolutionary game
2P3S-SEG  two-population three-strategy symmetric evolutionary game
3P3S-AEG  three-population three-strategy asymmetric evolutionary game
3PmSEG  three-population multi-strategy evolutionary game
3P2SEG  three-population two-strategy evolutionary game
3PnS-AEG  three-population n-strategy asymmetric evolutionary game