A System Dynamics Model for Safety Supervision of Online Car-Hailing From an Evolutionary Game Theory Perspective

Safety supervision of car-hailing is important for easing the pressure on urban public transportation and helping people travel safely and conveniently. In this article, a novel tripartite evolutionary game model is proposed to describe the interaction among the government supervision department, the security monitoring department of the online car-hailing platform, and car owners in the operation of China’s Internet ride-hailing services. Replicator dynamics equations are used to characterize the evolutionarily stable strategies of the stakeholders, and system dynamics is used to simulate the evolutionary game model, analyze the stability of stakeholder interactions, and determine an equilibrium solution. The main simulation results are as follows: the original three-party game has no evolutionarily stable strategy; the optimized dynamic penalty-incentive control scenario not only suppresses fluctuations effectively but also yields an ideal evolutionarily stable strategy. The results show that the costs of government supervision, platform monitoring, and the online hailed car owners influence the strategy choices of the stakeholders; that the government should impose appropriate fines and penalties on the platform and reward car owners, which helps all parties to the game reach a stable state; and that appropriate penalty-reward factors help the system reach a steady state more easily. These results can provide theoretical guidance for the government in promoting the development of online car-hailing services and establishing the supervision and management system.


I. INTRODUCTION
As a product of the sharing economy and the Internet Plus era, online car-hailing service platforms have used Internet technology to make effective use of idle resources [1], have greatly satisfied people's demand for convenient travel [2], and have attracted many consumers with their convenient and high-quality service [3]. Demand for online ride-hailing is gradually increasing and its growth rate is accelerating. As of June 2019, the number of online taxi-hailing users in China had reached 337 million, and the number of online ride-hailing or express-car users had reached 339 million. The scale of online car-hailing transactions reached 200 billion yuan in 2017, and it is expected that the scale of travel transactions in 2020 will exceed 500 billion yuan. Online hailed cars provide higher-quality services for "tidal travel", alleviate the sharp contradiction between supply and demand during peak travel [4], enhance the service ability of urban taxis [5], reach further into the peripheries of cities, fill the gaps in public transport [6], [7], and also alleviate the difficulty of traveling in bad weather [8]. In 2020, the online car-hailing industry entered a period of integrated development, business model innovation, and more refined operations. Affected by the COVID-19 epidemic, the market may shrink in the short term, but the epidemic has also made residents wary of public transport, which may work in favor of car-hailing.
The associate editor coordinating the review of this manuscript and approving it for publication was Feiqi Deng.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Although the emergence and development of online car-hailing have brought considerable economic and social benefits, the safety issues in its development cannot be ignored. Recently, there have been negative reports on the Internet about online car-hailing, such as drivers harassing passengers, refusing short-distance orders, speeding in violation of traffic rules to grab orders, robbing passengers of their property, and even injuring passengers. It remains uncertain whether online car-hailing drivers operate in compliance with policy requirements and whether the platforms supervise the large number of connected drivers and vehicles. In practice, the current safety supervision and management system for online car-hailing services still faces many problems, and market regulation alone cannot ensure the healthy development of the online car-hailing industry.
Under the general mode of inter-platform competition, operators and platforms of online car-hailing services adopt competitive incentive measures that allow or even encourage online hailed car owners to compete for a limited pool of target customers and to ask customers for "five-star high praise". The imbalance between regulatory intensity and regulatory policy directly affects the effectiveness of regulation [9]. The regulatory rules adopted by the various platforms also range from loose to strict [10]. Various means (such as price reductions, subsidies, and cashback) seriously affect market competition. Illegal use and abuse of users' personal information by car owners, which infringes on users' lives and property, occurs from time to time [11]. Given the current transition between the old and new traffic management systems, traffic management departments attach great importance to the operation and supervision mode of online car-hailing services [12], [13]. In fact, the development and operation of online car-hailing involves many heterogeneous entities, whose differing goals may lead to the failure of cooperation.
The purpose of this study is to address the complex multi-party dynamic game in the supervision and management of the safe operation of online car-hailing services. By analyzing the evolutionary game relationships within the safety supervision and management system, a game model is established that comprises the safety supervision and management departments of government, the safety monitoring departments of online car-hailing service platforms, and online hailed car owners. To further analyze the relationship between the equilibria and the dynamic selection processes of the three parties, the evolutionary game model is simulated dynamically using system dynamics (SD), the stability of stakeholder interaction is analyzed, and the equilibrium solution is determined. Under uncertain information, an evolutionary equilibrium stability analysis is performed on the dynamic game among the three parties, which reveals its dynamic characteristics. By establishing the corresponding SD model and performing simulation, the changes in the interests of the three parties are analyzed. Finally, an ideal evolutionarily stable strategy of the tripartite game is proposed, which provides a theoretical basis for achieving a tripartite win-win outcome and sustainable development. The conclusions of the study can inform solutions to the problem of safety supervision and monitoring of online car-hailing services and are also significant for the construction of the corresponding supervision system.
The rest of this study is organized as follows. Section 2 reviews relevant research; Section 3 establishes the evolutionary game model of the multiple stakeholders and performs the stability analysis of the equilibrium points; Section 4 adopts an optimized dynamic penalty-incentive control scenario to achieve stability; Section 5 analyzes, through simulation, the effects of different conditions on the evolution of the stakeholders; Section 6 summarizes and draws conclusions.

A. SAFETY SUPERVISION OF ONLINE CAR-HAILING SERVICE
In this article, the safety supervision and management of online car-hailing services involves three heterogeneous entities, i.e., the safety supervision and management departments of government, the safety monitoring departments of online car-hailing service platforms, and online hailed car owners. First, the online hailed car owners obtain economic returns from legal operation on the online car-hailing service platforms, but also face penalties and losses imposed by the platforms and by the government safety supervision and management departments for illegal operations. Next, the online car-hailing service platforms occupy the middle position: they obtain economic benefits and returns from cooperation with the online hailed car owners, but also bear the cost of supervision and the losses and penalties caused by management negligence. Finally, the safety supervision and management departments of the government, as the regulator, set the supervision and management regulations, including incentive and penalty policies, that the platforms and car owners must implement [14]. Of course, the government supervision departments incur a cost $a$ for this supervision [15], [16]. Figure 1 shows the relationship among the three parties in the safety supervision and management of online car-hailing services.
Most studies of the safety supervision and management of online car-hailing services are organized around different stakeholders and examine the safety of the service from different perspectives. Regarding the effectiveness of government safety supervision and management, it has been found that the incidence of accidents involving online hailed cars is clearly related to the completeness of the safety supervision and management system. Strengthening supervision and management through policies or rules is an important guarantee for the safe operation of online car-hailing services [17]-[20]. Based on analyses of the technical means of achieving safe operation [21]-[24], establishing a better sharing system is regarded as the key to solving the problem of safe operation of online car-hailing services and as an effective strategy for ensuring safety supervision [25]-[28].
When rights are delegated to market participants, the government gives limited decision-making rights and economic autonomy to the online car-hailing service platforms. Whether government safety supervision is lacking, and how intense law enforcement and management are, has an important impact on the platforms and on online hailed car owners. The management strategy of the platforms and their application of technical means play a direct role in long-term, sustainable safe operation. Whether online hailed car owners participate actively and conscientiously in the operation, and whether they operate their cars safely according to laws and regulations, depends on the policies of the government supervision departments and on the incentive and penalty measures of the platforms. Supervision and management therefore need to consider the strategies of all participants systematically, coordinate the respective interests of the government safety supervision and management departments, the platforms, and the car owners, and optimize the stability of the equilibrium of interests among all parties. Whether the approach is to strengthen supervision and monitoring or to deploy security technology, practice shows that coordinating and managing the interests of multiple stakeholders is an important prerequisite for the safe operation of online car-hailing services [29].
However, little progress has been made in revealing potential drivers of the effectiveness or efficiency of supervision and management.

B. GAME AMONG STAKEHOLDERS IN SAFETY SUPERVISION
Evolutionary game theory is widely applied to examine the dynamic interaction mechanisms of multiple stakeholders; it analyzes conflict and cooperation among stakeholders and relaxes the assumption of perfect rationality in traditional game theory [30], [31]. In an evolutionary game, the players respond to the initial strategies of the other participants and to the relative sensitivity of their action strategies, and the evolutionary situation itself is interdependent with the participants' behaviors [32]. Through a process of dynamic strategic learning, the participants eventually arrive at an equilibrium solution. In the actual operation of online car-hailing services, each participant chooses its strategy randomly at the initial stage of the game because of its bounded rationality and incomplete knowledge. Then, as time goes on, it continually adjusts its strategic choice according to the situations it observes, which gives the game the characteristics of a complex dynamic game, especially in dynamic safety supervision [33], [34].
At present, most of the literature on the stakeholders of online car-hailing supervision is confined to two participants, i.e., the safety supervision and operation departments and the online hailed car owners. No systematic evolutionary game analysis of the behavior of three or more stakeholders has been applied to the supervision and monitoring problem, and the safety supervision and management departments of government are not distinguished from the service platforms. In fact, although both "government for supervision" and "platforms for monitoring" act as regulators of online hailed car owners, they have their own internal division of functions, rights, and responsibilities, and they are not participants with identical interests. Besides, most previous studies of this kind of problem focused on whether an evolutionarily stable strategy exists in the game model; they did not propose control measures for the case in which no evolutionarily stable strategy equilibrium exists, and they analyzed the problems without proposing a way to solve them when three or more stakeholders are involved. Therefore, based on evolutionary game theory and system dynamics simulation, this article constructs a tripartite evolutionary game model aimed at the problem of platform monitoring and government supervision in the safe operation of online car-hailing services. The novel optimized dynamic penalty-incentive control model not only suppresses fluctuations effectively but also yields an ideal evolutionarily stable strategy. Finally, the system dynamics method is applied to simulate and analyze the strategy combinations of the three stakeholders, providing a reference for a long-term and stable implementation strategy for the safe operation of online car-hailing services.

III. MODEL AND ASSUMPTIONS
A. ASSUMPTIONS
Under the tripartite interactive system of government supervision departments, online car-hailing service platforms, and online hailed car owners, the following is assumed: there is no bribery between government supervision departments and online hailed car owners; bribery exists between online car-hailing service platforms and online hailed car owners, and the information between them is incomplete; the law enforcement ability of the government supervision departments is strong enough that no online hailed car owner who violates the regulations escapes punishment; and once a violating car owner is detected, the degree of the violation can also be determined.
It is assumed that the government supervision departments supervise the online car-hailing service platforms at a rate $p_1$ ($0 \le p_1 \le 1$), which represents the strength of supervision; $p_1 = 0$ means that the government supervision departments do not supervise the platforms at all, and $p_1 = 1$ means that they supervise them fully. Because the cost of real-time comprehensive supervision is very high, supervision is normally carried out only a limited number of times. The online car-hailing service platforms often have more information than the government supervision departments, and the platforms can benefit from the operation of the online hailed car owners. The government supervision departments pay a cost $a$ for supervising the platforms. If the government supervision departments are negligent in supervision, they bear an expected loss cost $b$ in the later period because of the increased accident probability. If the government supervision departments find the online car-hailing service platforms negligent in their responsibilities, the platforms receive a penalty $d$; conversely, if the supervision effect is good, the platforms receive a reward $f$. At the same time, if the online hailed car owners engage in irregular or illegal operations, the government supervision departments impose a penalty $c$; otherwise, they give the online hailed car owners a reward or subsidy $e$.
The online hailed car owners operate legally at a rate $p_2$ ($0 \le p_2 \le 1$), following the requirements of the government supervision departments and the operating standards of the service platforms. Correspondingly, the value $1 - p_2$ represents the severity of irregular behavior by the online hailed car owners. The benefit gained by the online hailed car owners from normal operation is $g$, the benefit gained from illegal operation is $h$, the total expected loss cost when rent-seeking succeeds is $i$, and the total expected loss cost when rent-seeking fails is $j$.
The online car-hailing service platforms carry out daily safety monitoring of the online hailed car owners in their field at a rate $p_3$ ($0 \le p_3 \le 1$). The probability $p_3$ represents the strength of monitoring: $p_3 = 1$ means that the platforms strictly perform their monitoring duties, and $p_3 = 0$ means that the duties are neglected, no monitoring is performed, and the platforms may even use their rights for rent-seeking. It is assumed that the normal benefit of the online car-hailing service platforms is $k$, the cost paid by the platforms for real-time monitoring is $l$, the expected loss cost borne in the later period in case of duty negligence is $m$, and the benefit of the platforms in case of successful rent-seeking is $r$. Even under strict monitoring, monitoring errors may occur because of the quality of monitoring and other reasons, and the monitoring error rate is $n$. These variables are listed in Table 1. From the above assumptions and analysis, we obtain the payoffs of the game among the three stakeholders, namely the government supervision departments, the online car-hailing service platforms, and the online hailed car owners, as shown in Fig. 2.

B. MODEL SOLVING
According to evolutionary game theory [35], in the safety supervision and monitoring of the online car-hailing service, an individual in a given population learns through imitation dynamics: it observes and compares its own benefits with those of other individuals in the same population and imitates the more successful ones. Let $U_1$ denote the fitness of the government supervision departments when they choose to supervise and $U_2$ their fitness when they choose not to supervise, both determined by the payoffs in Fig. 2. Accordingly, the average fitness of the government supervision departments is $\bar{U} = p_1 U_1 + (1 - p_1) U_2$. It is assumed that time is continuous and that government supervision departments tend to learn and imitate game strategies with relatively high returns; more specifically, the higher the return of a strategy under the current behavior distribution, the more it is learned and imitated. Then, the replicator dynamics equation of the proportion $p_1$ for the government supervision departments is $\mathrm{d}p_1/\mathrm{d}t = p_1 (U_1 - \bar{U})$. Similarly, we can get the replicator dynamics equations of the online car-hailing service platforms and the online hailed car owners. To sum up, Formulas (3), (4) and (5) describe the population dynamics of the whole evolutionary game system of safety supervision of online car-hailing service, which can be expressed jointly by the three replicator dynamic equations in (6).
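As a reading aid, the generic two-strategy replicator equation behind this construction can be written out and factorized as sketched below. Here $\bar{U}$ denotes the average fitness of the government population, and $V_1, V_2$ and $W_1, W_2$ are placeholder symbols (assumptions of this sketch, not the article's notation) for the fitness of the two strategies of the car owners and of the platforms. The explicit payoff expressions come from Fig. 2 and are not reproduced here.

```latex
\begin{align}
\bar{U} &= p_1 U_1 + (1 - p_1) U_2, \\
\frac{\mathrm{d}p_1}{\mathrm{d}t} &= p_1 \left( U_1 - \bar{U} \right)
  = p_1 (1 - p_1)(U_1 - U_2), \\
\frac{\mathrm{d}p_2}{\mathrm{d}t} &= p_2 (1 - p_2)(V_1 - V_2), \\
\frac{\mathrm{d}p_3}{\mathrm{d}t} &= p_3 (1 - p_3)(W_1 - W_2).
\end{align}
```

The second line follows from $U_1 - \bar{U} = (1 - p_1)(U_1 - U_2)$. The factor $p_i (1 - p_i)$ vanishes at $p_i = 0$ and $p_i = 1$, which is why every pure-strategy profile is a rest point of the dynamics.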
The replicator dynamic equations reflect the speed and direction of strategic adjustment among the government supervision departments, the online car-hailing service platforms, and the online hailed car owners. By analyzing the signs of the determinant and trace of the Jacobian matrix of the game system at an equilibrium point, we can judge the stability of that equilibrium point of the replicator dynamic equations [35], [36]. However, analyzing the stability of all equilibrium points through the Jacobian matrix of the system involves a large amount of calculation and is difficult, and it is also difficult to adjust the strategic choices of the players in a reasonable way. Therefore, computer simulation can be used to obtain better decision-making support. By dynamically modeling and analyzing the evolutionary game process of safety supervision, the stability of all equilibrium points of the replicator dynamic equations can be analyzed [37], [38].
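For the pure-strategy equilibrium points the Jacobian test is less laborious than for interior points. The following minimal sketch assumes each population follows a replicator equation of the form dp_i/dt = p_i (1 - p_i) g_i(p), in which case the Jacobian at a pure-strategy vertex is diagonal and its eigenvalues can be read off directly. The functions `g_gov`, `g_owner`, and `g_platform` and all of their coefficients are hypothetical stand-ins, not the article's payoffs from Fig. 2.

```python
from itertools import product

# Hypothetical payoff advantages (NOT the article's payoffs from Fig. 2).
# Each returns U_strategy - U_alternative for one population at (p1, p2, p3).
def g_gov(p1, p2, p3):       # supervise minus not supervise
    return 2.0 * (1 - p2) + 1.0 * (1 - p3) - 1.5

def g_owner(p1, p2, p3):     # legal operation minus illegal operation
    return 3.0 * p1 + 2.0 * p3 - 2.5

def g_platform(p1, p2, p3):  # monitor minus neglect
    return 2.0 * p1 + 1.0 * (1 - p2) - 0.5

ADVANTAGES = (g_gov, g_owner, g_platform)

def vertex_eigenvalues(vertex):
    """Eigenvalues of the Jacobian of dp_i/dt = p_i (1 - p_i) g_i(p) at a vertex.

    At a pure-strategy vertex p_i (1 - p_i) = 0, so all off-diagonal entries
    vanish and the i-th eigenvalue is (1 - 2 p_i) * g_i(vertex).
    """
    return tuple((1 - 2 * p) * g(*vertex) for p, g in zip(vertex, ADVANTAGES))

def classify(eigenvalues):
    """Linear stability verdict from the signs of the eigenvalues."""
    if all(ev < 0 for ev in eigenvalues):
        return "asymptotically stable (ESS candidate)"
    if any(ev > 0 for ev in eigenvalues):
        return "unstable"
    return "inconclusive (zero eigenvalue)"

# Check all eight pure-strategy profiles (p1, p2, p3) with entries 0 or 1.
results = {v: classify(vertex_eigenvalues(v)) for v in product((0, 1), repeat=3)}
for vertex, verdict in results.items():
    print(vertex, vertex_eigenvalues(vertex), verdict)
```

With these made-up coefficients every vertex has at least one positive eigenvalue, so none of them is stable. Interior (mixed-strategy) points still require the full Jacobian, which is where simulation becomes attractive.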

IV. STABILITY ANALYSIS AND EFFECTIVE STABILITY CONTROL MEASURES OF THE EVOLUTIONARY GAME
A. STABILITY ANALYSIS BASED ON SD
In a population evolutionary game, an individual in a given population learns and evolves through replicator dynamics and adjusts its strategic choice accordingly. Therefore, we can use system dynamics to study the feedback structure of the evolutionary game system of safety supervision and monitoring of the online car-hailing service and to analyze the stability of the equilibrium points of the game. Based on the above game model, the system dynamics (SD) model of the evolutionary game of the safety monitoring of the online car-hailing service is built in Vensim. The model consists of three sub-models, i.e., a government supervision department SD sub-model, an online car-hailing service platform SD sub-model, and an online hailed car owner SD sub-model, as shown in Fig. 3. The functional relationships among the state variables, flow rate variables, and intermediate variables in the model are determined according to the replicator dynamic equations of the evolutionary game of safety supervision.
The data are based on investigations involving experts from government supervision departments, the safety monitoring departments of online car-hailing service platforms, and sharing car owners, together with the related literature in the field of online car-hailing services and the government's public documents [16], [39]. Specifically, we interviewed directors of government supervision departments and of the safety monitoring departments of online car-hailing service platforms, as well as sharing car owners, to obtain data and information. Because many policies and measures are hard to quantify, not to mention some abstract variables, we used the Delphi method to quantify the data. Afterwards, we made the data dimensionless and finalized the values through open-ended discussion. On this basis, the model is set as follows: INITIAL TIME = 0, FINAL TIME = 200, TIME STEP = 1. The preprocessed initial values of the external variables in the SD model are shown in Table 2.
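To make the simulation settings concrete, the following minimal sketch reproduces the stock-and-flow update with the quoted time bounds. It assumes Euler integration (the article does not state which integration method was used) and replicator-type flow rates. The payoff-advantage functions and their coefficients are hypothetical stand-ins for the article's expressions and the values in Table 2, which are not reproduced here.

```python
INITIAL_TIME, FINAL_TIME, TIME_STEP = 0, 200, 1  # settings quoted in the text

# Hypothetical payoff advantages, scaled so that |g| <= 1. With TIME_STEP = 1
# this keeps every Euler update of p + p (1 - p) g inside [0, 1].
def g_gov(p1, p2, p3):       # supervise minus not supervise
    return 0.4 * (1 - p2) + 0.3 * (1 - p3) - 0.35

def g_owner(p1, p2, p3):     # legal operation minus illegal operation
    return 0.5 * p1 + 0.4 * p3 - 0.45

def g_platform(p1, p2, p3):  # monitor minus neglect
    return 0.4 * p1 + 0.3 * (1 - p2) - 0.2

ADVANTAGES = (g_gov, g_owner, g_platform)

def flows(state):
    """Flow rates of the three stocks in replicator form p (1 - p) g(p)."""
    return tuple(p * (1 - p) * g(*state) for p, g in zip(state, ADVANTAGES))

def simulate(initial_state):
    """Euler integration: stock(t + dt) = stock(t) + dt * flow(t)."""
    n_steps = int((FINAL_TIME - INITIAL_TIME) / TIME_STEP)
    state = tuple(initial_state)
    history = [(INITIAL_TIME, state)]
    for k in range(1, n_steps + 1):
        state = tuple(p + TIME_STEP * f for p, f in zip(state, flows(state)))
        history.append((INITIAL_TIME + k * TIME_STEP, state))
    return history

history = simulate((0.5, 0.5, 0.5))
print(history[0], history[-1])
```

Each entry of `history` pairs a time stamp with the three stock levels (p1, p2, p3), which is the information plotted in a strategy-evolution figure.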
The value of each external variable is substituted into the replicator dynamic equations of the evolutionary game system, which are then solved. The system has eight pure-strategy equilibrium points $X_1$-$X_8$ and two mixed-strategy equilibrium points $X_9$ and $X_{10}$. Taking $X_9$ as an example, $X_9$ is entered into the evolutionary game SD model as the initial state for simulation. As shown in Fig. 4, at the initial mixed-strategy equilibrium point $X_9$ the three players of the game do not actively change their initial strategies and no one in any population adopts a new strategy, so the game is in a relative equilibrium state. Similarly, simulation results show that at the mixed-strategy equilibrium point $X_{10}$ and at the pure-strategy equilibrium points $X_1$-$X_8$ the three players do not actively change their initial strategies. However, the states of these equilibrium points are unstable and path-dependent. At the equilibrium point $X_9$, suppose that a small number of individuals in the population of online car-hailing service platforms mutate, so that the monitoring rate changes from $p_3 = 0$ to $p_3 = 0.01$. The simulation results are shown in Fig. 5.
The simulation results show that the equilibrium state at $X_9$ is unstable and that the online car-hailing service platform population evolves towards $p_3 = 1$. Although only a small proportion of individuals in the platform population mutate from the initial strategy, the mutant strategy earns a high return, so it is quickly imitated and learned by other individuals, which eventually drives the platform population towards $p_3 = 1$. Similarly, the equilibrium states at $X_{10}$ and at $X_1$-$X_8$ are also unstable. In summary, modeling the evolutionary game process with SD and analyzing the stability of the equilibrium points shows that the game process fluctuates and oscillates repeatedly, which indicates that it has no evolutionarily stable strategy.
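The perturbation test described above can be illustrated with a single population. The sketch below assumes that the mutant strategy has a constant positive payoff advantage (`advantage=0.1`, a hypothetical value; in the article the advantage depends on the other two populations) and uses Euler integration with the same time settings.

```python
def simulate_share(p0, advantage=0.1, final_time=200, dt=1.0):
    """Euler path of dp/dt = p (1 - p) * advantage for one population."""
    p, path = p0, [p0]
    for _ in range(int(final_time / dt)):
        p += dt * p * (1 - p) * advantage
        path.append(p)
    return path

unperturbed = simulate_share(0.0)   # starts exactly at the rest point p = 0
perturbed = simulate_share(0.01)    # 1% of the population mutates

print(unperturbed[-1], perturbed[-1])
```

The unperturbed run never leaves p = 0 because the flow is exactly zero there, whereas the perturbed run grows towards p = 1. This is the sense in which a rest point can hold in simulation and still be unstable.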

B. EFFECTIVE STABILITY CONTROL MEASURES OF EVOLUTIONARY GAME
The fluctuation of the evolutionary game process of the safety supervision of the online car-hailing service brings great difficulty to the formulation of the safety supervision strategy, which leads to the frequent occurrence of accidents. Therefore, it is necessary to study the stability control measures for the evolutionary game of the safety supervision of the online car-hailing service.
In the safety supervision of the online car-hailing service, it is therefore necessary to implement control measures. Dynamic penalty-incentive control measures are proposed [40]-[42]; that is, the government supervision departments apply dynamic penalties and incentives according to the information they obtain about the online hailed car owners and the online car-hailing service platforms, as shown in formula (7).
where $s_{21}$, $s_{22}$, $t_{21}$ and $t_{22}$ are the coefficients of the penalties imposed by the government supervision departments on the online hailed car owners and the platforms, and $x_{21}$, $x_{22}$, $y_{21}$ and $y_{22}$ are the coefficients of the incentives given by the government supervision departments to the online hailed car owners and the platforms. Assuming that all of these coefficients equal 1, the SD model of the evolutionary game of safety supervision under the dynamic penalty-incentive control measures is shown in Fig. 6. The evolutionary game of the government supervision departments, the platforms, and the car owners under the dynamic penalty-incentive control measures is simulated for two sets of initial strategies, $(p_1 = 0.5, p_2 = 0.5, p_3 = 0.5)$ and $(p_1 = 0.5, p_2 = 0.1, p_3 = 0.2)$. The simulation results are shown in Fig. 7 and Fig. 8.
The simulation results show that, under the dynamic penalty-incentive control measures, the game evolution process converges approximately to the vicinity of $X^* = (0, 1, 1)$. This equilibrium state is ideal: the government supervision departments supervise the online hailed car owners with a very small probability, the optimal strategy of the online hailed car owners is to operate legally, and the optimal strategy of the online car-hailing service platforms is to strictly perform their monitoring duties. Furthermore, the evolutionarily stable strategy is $X^* = (\alpha, 1, 1)$, in which $\alpha$ approaches 0. To verify these simulation results, it is still necessary to solve the evolutionary game model and analyze its equilibrium solutions.
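The damping effect of a state-dependent penalty can be illustrated on a much smaller system than the tripartite model. The sketch below is a two-population inspection game, not the article's formula (7): `x` is the share of regulators that inspect, `y` is the share of operators that comply, and `COST`, `GAIN`, and both fine functions are hypothetical. Both variants share the interior rest point (0.5, 0.5). With a fixed fine the paths keep cycling around it, whereas a fine proportional to the violation rate 1 - y makes the paths spiral into it.

```python
import math

COST, GAIN = 1.0, 1.0  # hypothetical inspection cost and gain from violating

def static_fine(y):
    return 2.0               # fixed fine, independent of behavior

def dynamic_fine(y):
    return 4.0 * (1 - y)     # fine grows with the violation rate 1 - y

def simulate(fine, x0=0.4, y0=0.4, final_time=200.0, dt=0.01):
    """Euler path of a two-population replicator inspection game."""
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(int(round(final_time / dt))):
        f = fine(y)
        dx = x * (1 - x) * ((1 - y) * f - COST)  # inspect vs. not inspect
        dy = y * (1 - y) * (x * f - GAIN)        # comply vs. violate
        x, y = x + dt * dx, y + dt * dy
        path.append((x, y))
    return path

def distance_to_rest_point(point):
    """Distance from the shared interior rest point (0.5, 0.5)."""
    return math.hypot(point[0] - 0.5, point[1] - 0.5)

static_path = simulate(static_fine)
dynamic_path = simulate(dynamic_fine)
print(distance_to_rest_point(static_path[-1]))   # stays away from the rest point
print(distance_to_rest_point(dynamic_path[-1]))  # shrinks towards zero
```

The toy model only demonstrates the mechanism: making the penalty depend on the current violation rate adds a damping term to the Jacobian at the rest point. The rest point it converges to is interior, unlike the boundary state reported for the tripartite model.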

C. STABILITY ANALYSIS AND CHECK UNDER THE OPTIMAL DYNAMIC PENALTY-INCENTIVE SCENARIO
Whether the evolutionarily stable strategy obtained under this measure is a truly stable equilibrium point of the system remains to be proved. Therefore, to verify the simulation results of the evolutionary game, it is necessary to solve the evolutionary game model of the system under this measure and prove the stability of its equilibrium points.
After the penalty-incentive measures are added, the parameters of the system are assigned the initial values shown in Table 3 and the game model is solved. The replicator dynamic equations of the three-party evolutionary game are given in Equation set (8).
From the initial values of the parameters and the replicator dynamic equations of the three-party evolutionary game model, the eight pure-strategy equilibrium points $X_1$-$X_8$ are obtained, as given in Equation (9).
Because the dynamic equations under the dynamic penalty-incentive measures contain the terms $4/p_1$ and $1.7/p_1$, $p_1 = 0$ is not admissible. The equilibrium points $X_1$, $X_2$, $X_3$, and $X_4$ are therefore expressed with $p_1$ in place of 0. The stability of each equilibrium solution can then be obtained by analyzing the determinant and trace of the Jacobian matrix. The Jacobian matrix of the game system is given in Equation (10).
The stability of an equilibrium point of the replicator dynamic equations, that is, whether the game has an evolutionarily stable strategy, can be judged from the Jacobian matrix of the game system [42]-[45]. According to Lyapunov stability theory, an equilibrium point is asymptotically stable if all characteristic values of the Jacobian matrix at that point have negative real parts, and it is unstable if any characteristic value has a positive real part. The equilibrium solutions ($X_1$-$X_8$) were substituted into the Jacobian matrix to obtain their characteristic values. The characteristic values and the resulting stability judgment for each equilibrium point are shown in Table 4.
As the value of $p_1$ approaches 0, the characteristic values have nonpositive real parts, so the equilibrium point $X^* = (0, 1, 1)^T$ is an evolutionarily stable strategy. As can be seen from Table 4, $X_8$ is an ESS and the remaining points are saddle points. Therefore, the analysis is consistent with the above SD simulation result $X^* = (0, 1, 1)^T$.
Accordingly, applying system dynamics to simulate the evolutionary game process is an effective method for analyzing the stability of an equilibrium solution. Besides, the optimized dynamic penalty-incentive control scheme can effectively stabilize the fluctuation of the game and provide an ideal evolutionarily stable strategy.
To sum up, simulating the evolutionary game process of safety supervision with SD is an effective method for analyzing the stability of equilibrium points in the evolutionary game. Under dynamic penalty-incentive measures, the fluctuation of the evolutionary game process is effectively controlled and an evolutionarily stable strategy exists. In this state, the government supervision departments supervise the online hailed car owners with a very small probability, the optimal strategy of the online hailed car owners is to operate legally following the relevant laws and regulations, and the optimal strategy of the online car-hailing service platforms is to strictly perform their monitoring duties.

V. SIMULATION
A. ANALYSIS OF INITIAL STATE INFLUENCE
The evolution paths of the strategy selection of the three parties under different initial values are analyzed. As shown in Fig. 9(a), when the initial probabilities of platform monitoring and of the owner not violating the rules are fixed (p2 = 0.5, p3 = 0.5), the initial probability of government supervision is set to 0.1, 0.3, and 0.7 in turn. As this probability increases, the steady state is reached faster. Similarly, when the initial probabilities of government supervision and of the owner not violating the rules are fixed (p1 = 0.5, p3 = 0.5), the initial probability of platform monitoring is set to 0.2, 0.5, and 0.8. Compared with Fig. 5, after the penalty-incentive measure is implemented, the final steady state does not change as the probability increases. The only change is the speed at which the steady state is reached. That is to say, as shown in Fig. 9(b), the larger the initial value, the faster the steady state is reached. Similarly, as shown in Fig. 9(c), when the initial probabilities of government supervision and of platform monitoring are fixed (p1 = 0.5, p2 = 0.5), the initial probability that the online hailed car owner operates legally is set to 0.3, 0.6, and 0.9. As this probability increases, the owner more quickly settles into the steady state of not violating the rules at all.
It can be seen that when the penalty-incentive measure is implemented, changing the initial values of the three stakeholders does not change the final stable state of the three-party strategy selection, but only affects the speed of reaching the steady state. The larger the initial value, the faster the steady state is reached. This further shows that punishment-incentive measures can effectively control fluctuations.
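The initial-value experiment can be imitated with a short numerical sketch. It integrates a three-population replicator system with the forward Euler method from three different initial supervision probabilities and records the end state and the first time the state comes within a tolerance of (0, 1, 1). The payoff gaps and parameter values are hypothetical and are not the SD model behind Fig. 9, so the sketch illustrates only that the end state is insensitive to the starting point; the settling times it prints should not be read as a reproduction of the figure.

```python
# Hypothetical payoff parameters (illustrative only, not taken from the article).
F, C1, E = 3.0, 2.0, 1.0   # fine on the platform, supervision cost, reward to the owner
B2, C2 = 1.5, 1.0          # platform's benefit and cost of monitoring
K, G = 2.0, 1.0            # owner's loss when monitored, owner's gain from violating


def derivative(p1, p2, p3):
    """Replicator dynamics dp_i/dt = p_i * (1 - p_i) * g_i(p) with assumed gaps g_i."""
    g1 = F * (1 - p2) - C1 - E * p3
    g2 = F * p1 + B2 - C2
    g3 = E * p1 + K * p2 - G
    return (p1 * (1 - p1) * g1, p2 * (1 - p2) * g2, p3 * (1 - p3) * g3)


def simulate(p0, dt=0.01, t_end=60.0, tol=1e-3, target=(0.0, 1.0, 1.0)):
    """Forward Euler integration.

    Returns the final state and the first time at which every component
    is within `tol` of `target` (None if that never happens).
    """
    p, t, settle_time = tuple(p0), 0.0, None
    for _ in range(int(round(t_end / dt))):
        d = derivative(*p)
        p = tuple(x + dt * dx for x, dx in zip(p, d))
        t += dt
        if settle_time is None and all(abs(x - s) < tol for x, s in zip(p, target)):
            settle_time = t
    return p, settle_time


# Vary the initial supervision probability p1 with p2 = p3 = 0.5 held fixed.
runs = {p1: simulate((p1, 0.5, 0.5)) for p1 in (0.1, 0.3, 0.7)}
for p1, (final, settle_time) in runs.items():
    print(p1, [round(x, 4) for x in final], settle_time)
```

The same loop can be repeated for the initial values of p2 or p3 by changing which component of the starting point is varied.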

B. BEHAVIOURS OF PLAYERS UNDER DIFFERENT PARAMETERS
The impact of parameter changes on the government's strategy choice is described in Figs. 10(a) and (b). The greater the government's supervision cost, the more willing the government is to adopt the non-supervision strategy; the more severe the penalty for a platform that does not monitor, the greater the probability that the government initially chooses to supervise. Comparing Figs. 10(a) and (b), we find that the government's penalty for the platform is generally greater than the supervision cost. In reality, the government will choose the supervision strategy to protect the interests of the public. During implementation, the probability that the platform is subject to government supervision gradually becomes larger, and the probability that the owner will not legally operate after being supervised gradually becomes larger. As the evolution progresses, the government adjusts its strategy according to the choices of the platform and the owner. Finally, the three parties form a stable situation in which the government adopts the non-supervision strategy, the platform adopts the monitoring strategy, and the owner adopts the legal operation strategy.
As shown in Fig. 10(c), when the penalty factor is less than 1, the government's strategy ultimately fails to reach a steady state. If the government's dynamic penalty for the platform is lower than the static penalty, strict and non-strict supervision make no difference to the platform. To save costs, the platform will tend to choose not to supervise strictly. Besides, the higher the reward, the faster the government's strategy reaches a stable state, while a low reward does not affect the stability of the evolution. The greater the penalty-reward factor, the faster the system reaches a steady state.
The impact of parameter changes on the strategy choice of the service platform is depicted in Fig. 11(a). The higher the government's penalty for the platform and the smaller the monitoring cost, the shorter the time the strategy of the service platform takes to reach a steady state. Therefore, the government should choose an appropriate punishment strategy to promote effective supervision by the online car-hailing service platform.
As we can see from Fig. 11(b), when the penalty coefficient is less than 1, the speed at which the online car-hailing service platform reaches a steady state is significantly reduced. If the reward coefficient and the penalty coefficient are both less than 1, the platform reaches a steady state most slowly. Consequently, additional small rewards have little impact on the platform's strategy choice. However, as the punishment increases, the platform's rate of supervision gradually increases, and so does the speed of reaching the steady state.
The impact of parameter changes on the strategy choice of online hailed car owners can be seen in Fig. 12(a). The greater the government's reward e to the owner, the shorter the time it takes for the online hailed car owner to reach a steady state. When the owner chooses to comply with relevant laws and regulations, passenger safety is guaranteed, which helps to attract passengers and increase the owner's income. In this case, the government's rewards to the car owner, in addition to wages, serve as recognition of the owner's working attitude and behavior.
As can be seen from Fig. 12(b), when the penalty coefficient is less than 1, the car owner ultimately cannot reach the steady state. If the government's dynamic penalty for the platform is lower than the static penalty, legal and illegal operation make no difference to the online hailed car owner. To save costs, the online hailed car owner will tend to choose not to operate legally. Moreover, the higher the reward, the faster the car owner reaches the steady state, while a low reward does not affect the stability of the system. Fig. 12(b) also shows that the greater the penalty-reward coefficient, the faster the system reaches the steady state.
In summary, we can conclude the following.
(1) Using evolutionary game theory and replicator dynamic equations is an effective way to analyze the safety supervision of the online car-hailing service. Moreover, because of the feedback structure of the system, SD is well suited to establishing the corresponding evolutionary game model and to simulating and analyzing it.
(2) When no effective stability control measures are implemented in the multi-player evolutionary game of the safety supervision of the online car-hailing service, the simulation results show that there is no evolutionarily stable strategy among the government supervision departments, the service platforms and the online hailed car owners, and the evolutionary game process fluctuates and oscillates repeatedly. In this case, as long as any one of the three players undergoes a small mutation, the equilibrium strategy point fluctuates and the relative equilibrium state is broken. This fluctuation makes it very difficult to formulate a reasonable safety supervision strategy.
(3) To control fluctuations and stabilize the system evolution, dynamic penalty-incentive measures are adopted, which make the multi-player game of the safety supervision of the online car-hailing service reach a stable state and yield an ideal strategic choice. The simulation results show that the measures effectively suppress the fluctuation of the game process. Under the measures, there is an evolutionarily stable strategy, and in this equilibrium state the illegal operation of the online hailed car owners is effectively controlled. As shown by the study, the proposed dynamic penalty-incentive measures are very effective in controlling the fluctuation of the three players of the game, and the three-player game strategies obtained under such control conditions are feasible.
(4) The impact of each factor on the evolutionary results of the stakeholders can be analyzed. The government's supervision cost and its punishment of the online car-hailing service platform influence the strategy choice of the government; the penalty for a platform that does not monitor and the platform's monitoring cost affect the strategy of the platform; the government's incentives for owners who do not violate the rules affect the strategy of the online hailed car owners. Finally, the dynamic incentive and punishment strategy can effectively promote monitoring by the platform and non-violating operation by the online hailed car owners.
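The contrast between points (2) and (3) can be illustrated with a deliberately reduced two-party inspection game between a regulator and an operator, integrated with a classical fourth-order Runge-Kutta scheme. This is a sketch under stated assumptions, not the three-party SD model of this article: the parameters `F`, `C` and `G` are hypothetical, and the dynamic rule, a fine proportional to the current share of violators, is only one plausible form of a dynamic penalty.

```python
# Hypothetical parameters: static fine, inspection cost, gain from violating.
F, C, G = 4.0, 1.0, 1.0


def field(x, y, dynamic):
    """x: share of regulators inspecting; y: share of operators violating."""
    # Assumed dynamic rule: the fine scales with the current violation share.
    fine = F * y if dynamic else F
    dx = x * (1 - x) * (y * fine - C)
    dy = y * (1 - y) * (G - x * fine)
    return dx, dy


def rk4_path(x, y, dynamic, dt=0.05, steps=4000):
    """Integrate the replicator equations with the classical RK4 scheme."""
    path = [(x, y)]
    for _ in range(steps):
        k1 = field(x, y, dynamic)
        k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], dynamic)
        k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], dynamic)
        k4 = field(x + dt * k3[0], y + dt * k3[1], dynamic)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        path.append((x, y))
    return path


def late_spread(path):
    """Peak-to-peak range of x over the second half of the run."""
    tail = [x for x, _ in path[len(path) // 2:]]
    return max(tail) - min(tail)


static_path = rk4_path(0.3, 0.3, dynamic=False)
dynamic_path = rk4_path(0.3, 0.3, dynamic=True)
print("static fine, late spread of x: ", late_spread(static_path))
print("dynamic fine, late spread of x:", late_spread(dynamic_path))
print("dynamic fine, final state:     ", dynamic_path[-1])
```

With the static fine the interior equilibrium of this toy game is a center, so the strategies keep cycling; with the assumed dynamic fine the interior equilibrium has a Jacobian with negative trace and positive determinant, so the oscillation dies out. Note that this toy game settles at an interior mixed state, whereas the article's controlled model settles at the vertex (0, 1, 1).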

VI. CONCLUSION AND POLICY IMPLICATIONS
Aiming at the dynamic complexity of the multi-player game of the safety supervision of the online car-hailing service, an evolutionary game model of the safety supervision of the online car-hailing service is established by combining the idea of the dynamic evolutionary game with computer simulation based on system dynamics (SD). By solving, simulating and analyzing the model, control measures are put forward to control the fluctuation of the game process among the government supervision departments, the online car-hailing service platforms and the online hailed car owners. The optimized control measures effectively suppress the fluctuation of the strategies of the three players of the game, and an ideal evolutionarily stable strategy is obtained. Finally, we discuss the factors influencing the evolutionarily stable strategy (ESS) from the perspective of the control measures. Based on the simulation results, the conclusions and recommendations are provided below.
(1) This article proposes a dynamic penalty-incentive control method to suppress the fluctuation, and an ideal evolutionarily stable strategy (0, 1, 1) is obtained. That is to say, there exist optimal dynamic penalty-incentive measures that make the multi-player game of the safety supervision of the online car-hailing service reach a stable state. In this case, even if little is done by the government supervision departments, the online car-hailing service platforms are willing to strictly perform their monitoring duties and the online hailed car owners are willing to operate legally. Therefore, keeping the government's supervision cost, penalty and reward within a reasonable range, rather than too high or too low, will benefit the safe and sustainable operation of the online car-hailing service.
(2) When the penalty-incentive measure is implemented, the larger the initial value, the faster the steady state is reached. The higher the supervision cost, the more the government tends to choose not to supervise; when the government implements appropriate punishment for the platform and rewards for the owner, the platform tends to choose to monitor, and the owner tends to choose not to operate illegally. Besides, under the dynamic penalty-incentive control method in our research, the greater the penalty-reward factor, the faster the system reaches a steady state. Hence, the government can establish a multi-subject coordinated regulatory mechanism, which can not only stimulate the vitality of the safety supervision of the online car-hailing service, but also achieve the ideal and stable state.
(3) The greater the government's supervision cost, the more willing the government is to adopt the non-supervision strategy. Thus, the government's supervision cost deserves particular attention, and various technical means, such as big data, should be used to raise the level of supervision, continuously reduce supervision costs and improve supervision efficiency.
(4) To save costs, the platform will tend to choose not to supervise strictly. The higher the government's penalty for the platform and the smaller the monitoring cost, the shorter the time the strategy of the service platform takes to reach a steady state. Therefore, the government should choose an appropriate punishment strategy to promote effective supervision by the online car-hailing service platform. The government supervision departments should increase penalties for platform violations, improve the credit evaluation system of the online car-hailing industry, raise the cost of illegal platform operation, and reward platforms that operate in a standardized manner.
(5) The greater the government's reward to the owner, the shorter the time it takes for the online hailed car owner to reach a steady state. When the owner chooses to comply with relevant laws and regulations, passenger safety is guaranteed, which helps to attract passengers and increase the owner's income. In this case, the government's rewards to the car owner, in addition to wages, serve as recognition of the owner's working attitude and behavior. Thus, the government should speed up the construction of industry supervision platforms and give full play to its role in regulating driver training.
However, this study has some unavoidable limitations. When the evolutionary model converges to the ideal state, this study only reveals that sufficient penalty-incentive control mechanisms can effectively control the fluctuation of the evolutionary game process; the current work still cannot summarize the stable equilibrium conditions of the general three-dimensional evolutionary game model. In the long run, the realistic choice of the safety supervision system of the online car-hailing service may prove to be a more dynamic process over time, characterized by continuous adjustment and mutual optimization according to changes in internal and external factors, including the level of incentives and penalties, local competition strategy, official decision-making style, and monitoring ability. These problems pose considerable challenges for future studies. In the future, it will be necessary to develop a more complex and comprehensive model to study the safety supervision system of the online car-hailing service.
TSEPING DONG received the Ph.D. degree from National Taiwan University. He is currently a Distinguished Professor with National Taiwan Normal University, Taiwan. He was a Fulbright Visiting Scholar with UCLA, UCSD, and UC Berkeley, from 2011 to 2012. He specializes in innovation and entrepreneurship, organizational culture, strategy management, intellectual property, and Asia-Pacific business strategies. His research interests include cross-disciplinary research integrating business management, arts, technology, and law.