A Differential Game-Based Approach for School-Enterprise Collaborative R&D Strategy on Digital Twin Technology

Digital twin (DT) technology is an effective way to realize intelligent manufacturing and has received increasing attention in both academia and industry. It is therefore necessary and significant to accomplish the research and development (R&D) of DT technology (RDDT) collaboratively. To explore a school-enterprise collaborative R&D strategy on DT technology, this paper proposes a differential game-based approach to compute the optimal R&D effort levels and optimal incomes of both parties in the school-enterprise collaborative innovation (SECI) system. First, using Bellman's continuous dynamic programming theory, the optimal R&D effort levels, the optimal incomes of both parties, and the total optimal income in the SECI system are calculated in three cases: the Nash non-cooperative game, the Stackelberg master-slave game, and the cooperative game. Second, the equilibria of the three game cases are analyzed and compared. Finally, a numerical example is used to verify the validity of the conclusions. We find that the optimal benefits of the two parties in the cooperative game are higher than those in the Nash non-cooperative game and the Stackelberg master-slave game, which demonstrates the superiority of school-enterprise collaborative R&D on DT technology.


I. INTRODUCTION
Advanced manufacturing development strategies such as the "American Industrial Internet," "German Industry 4.0," and "Made in China 2025" share a common goal: to achieve the interconnection and intelligent operation of the physical and information worlds [1]-[4]. Intelligent manufacturing, as the future development trend of the manufacturing industry, has received extensive attention [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Xiwang Dong.
Furthermore, digital twin (DT) technology has been widely studied in academia in recent years, since it is an effective way for manufacturing enterprises to realize intelligent manufacturing [7], [8]. The manufacturing industry itself is facing the rapid development of technology and tools. However, manufacturing teaching and training have kept up neither with the advancement of manufacturing technology nor with the demands of the labour market [9]. Collaboration between universities and industry is therefore essential. As an organizational form of collaborative innovation, school-enterprise collaborative R&D (also known as learning factories) can make up for some shortcomings of universities and industry [10]; its purpose is to align manufacturing training and teaching with the needs of modern industrial practice [11]. Cooperation between universities and enterprises can improve the innovation performance of enterprises and solve the problem of an insufficient R&D foundation in universities. Therefore, it is of great significance to coordinate the use of scientific and technological resources and to discuss in depth the R&D strategy of DT technology based on school-enterprise collaboration.
Nowadays, collaborative innovation between universities and enterprises has become an increasingly common form of basic and applied research and has attracted widespread attention from scholars. For example, Van and Luong [12] proposed a model of skilled-worker training based on an analysis of school-enterprise collaboration factors in the training process and of labor characteristics in the Mekong Delta. Yang [13] evaluated and analyzed the collaborative innovation ability of school-enterprise cooperation by constructing a key evaluation index system and an application model. Xiao-Mei [14] studied the management mechanism of an on-campus training base under school-enterprise cooperation in order to identify and remedy its deficiencies. Huang et al. [15] explored a mode of school-enterprise cooperation for training application-oriented talents, introduced some achievements of the school, and pointed out some problems in school-enterprise cooperation. Ralph et al. [16] proposed a method for the implementation and operation of an academic learning factory specifically tailored to the requirements of the metal forming industry. Brenner and Hummel [17] introduced prototypes of DT in the ESB Logistics Learning Factory and pointed out the economic function of DT technology.
In the school-enterprise collaborative innovation (SECI) system, universities and enterprises collaborate on the basis of heterogeneous resources. Universities hope to obtain more scientific research funds and to promote the transformation of their DT research results through cooperation with manufacturing enterprises [18]. Manufacturing enterprises hope to spread innovation risks and to make up for weaknesses in their own technology R&D through cooperation with universities, so as to achieve intelligent manufacturing [19]. This heterogeneity causes conflicts between the motivations and behavioral objectives of cooperation in the SECI system [20], [21].
Because the research and development of DT technology (RDDT) is long-term and dynamic, the collaborative R&D strategy needs to be adjusted continually between universities and enterprises. The rate and frequency of DT technology R&D increase with the development of scientific and technological information, which means that RDDT within the same space-time range should be considered on the basis of the dynamic behavior of the decision-making subjects. The differential game is an important dynamic model for dealing with conflicts of competition and cooperation between two parties in continuous time, and some scholars have introduced it into research on school-enterprise collaboration. For example, Yu and Shi [22] used differential game theory to study the knowledge-sharing strategies of universities and enterprises under industry-university-research collaborative innovation. Yin and Li [23] presented a stochastic differential game of green building technology transfer from academic research institutes to building enterprises in the corresponding collaborative innovation system. Ma et al. [24] addressed the industry-university collaborative R&D problem, constructed a differential game model with a single research institution and a single enterprise as the research objects, and analyzed the equilibrium results of three game models.
From the analysis of the above studies, using game theory to study school-enterprise collaboration is a mainstream trend. However, as a new method for manufacturing enterprises to realize intelligent manufacturing, DT technology is still in the initial stage of R&D, and current DT research mainly remains at the level of conceptual studies or of applications in manufacturing enterprises or in a particular production field [25]-[28], lacking the endogenous impetus to promote DT technology in the transformation to intelligent manufacturing. Therefore, this paper uses the differential game method to study RDDT by universities and manufacturing enterprises in the SECI system under a dynamic framework. First, using Bellman's continuous dynamic programming theory, the optimal R&D effort levels, the optimal incomes of both parties, and the total optimal income in the SECI system are calculated in three cases: the Nash non-cooperative game, the Stackelberg master-slave game, and the cooperative game. Second, the equilibria of the three game cases are analyzed and compared. Finally, a numerical example is used to verify the validity of the conclusions. We hope to provide some suggestions for universities and enterprises on the R&D of DT. Compared with the existing literature, the main contributions of this work are as follows:
(1) To promote the development of manufacturing and the R&D of DT technology, a school-enterprise collaboration model is built in which universities are responsible for basic research and enterprises are responsible for applied research. The effects of both parties on RDDT are discussed based on the model.
(2) An R&D subsidy provided by manufacturing enterprises to universities is proposed to coordinate RDDT between universities and enterprises, making universities more willing to research and develop DT.
(3) Through the comparison of the three games, the numerical results show that universities and enterprises achieve individual Pareto optimality under the cooperative game, which supports the effectiveness of our model.
The rest of this paper is organized as follows. Section II provides the differential game formulations and the analysis of the equilibria of the three game cases. A comparative analysis of the equilibrium results is presented in Section III. In Section IV, we simulate the games and analyze the results. Finally, Section V summarizes the paper.

II. MODEL FORMULATIONS AND ANALYSIS
For ease of description, the participants in the school-enterprise collaborative R&D strategy are divided into two groups: universities and manufacturing enterprises. Universities are mainly responsible for the basic research of DT technology, and manufacturing enterprises are mainly responsible for its application and development. In this paper, the two subjects are represented as a university and a manufacturing enterprise in the SECI system. The main notations used in our model are listed in Table 1, and Figure 1 depicts the decision process.
Assumption 1: The R&D effort costs of universities and manufacturing enterprises at time t are defined as $C_U(t)$ and $C_M(t)$, respectively. Furthermore, the R&D effort cost coefficients of both parties are defined as $k_U$ and $k_M$, respectively. $E_U(t), E_M(t) \ge 0$ represent the levels of R&D effort of both parties at time t. Considering the convexity of the R&D effort cost [29], the R&D effort costs of both parties at time t can be expressed as
$$C_U(t) = \frac{k_U}{2} E_U^2(t), \qquad C_M(t) = \frac{k_M}{2} E_M^2(t).$$
Assumption 2: Let $K(t)$ denote the level of RDDT at time t, which is affected by the R&D efforts of both parties and by the updating of the DT technology level. Because RDDT is a dynamic process, the dynamic equation of the Nerlove-Arrow goodwill model is employed, and the dynamics of the RDDT level can be expressed as
$$\dot{K}(t) = \alpha E_U(t) + \beta E_M(t) - \delta K(t), \qquad K(0) = K_0,$$
where $K_0$ is the initial state of the SECI system, and $\alpha, \beta > 0$ are effort coefficients that indicate the impact of the R&D effort levels of universities and manufacturing enterprises, respectively, on the total technology level. Furthermore, the technology level declines if it is not developed; $\delta > 0$ is the technical attenuation coefficient, which represents the rate of decline of the overall technology level.
Assumption 3: Let $\pi(t)$ denote the total income in the SECI system at time t. The total income function can be expressed as
$$\pi(t) = \mu_1 E_U(t) + \mu_2 E_M(t) + \nu K(t),$$
where $\mu_1$ and $\mu_2$ are marginal income coefficients that represent the impact of the R&D effort levels of universities and manufacturing enterprises, respectively, on the total income in the SECI system, and $\nu$ is the R&D impact coefficient of technology, which indicates the impact of the DT technology level in the SECI system on the total income.
Assumption 4: We further assume that the total income in the SECI system is allocated only between the two participants. The income distribution coefficient of universities is $\tau$, and that of manufacturing enterprises is $1-\tau$, where $\tau$ is a constant in (0,1) determined in advance by both parties. To stimulate the R&D enthusiasm of universities, manufacturing enterprises provide universities with an R&D investment subsidy covering a proportion $\theta(t)$ of their R&D cost, $0 \le \theta(t) \le 1$. We assume that both sides have the same positive discount rate $\rho$. Both sides seek the RDDT strategy that maximizes their respective incomes over an infinite time horizon.
The objective functions of universities and manufacturing enterprises can be expressed as the following discounted integrals:
$$J_U = \int_0^{\infty} e^{-\rho t}\left[\tau\,\pi(t) - \bigl(1-\theta(t)\bigr)\,C_U(t)\right] dt,$$
$$J_M = \int_0^{\infty} e^{-\rho t}\left[(1-\tau)\,\pi(t) - C_M(t) - \theta(t)\,C_U(t)\right] dt.$$
There are three control variables, $E_U(t)$, $E_M(t)$ and $\theta(t)$, and one state variable, $K(t)$, in the SECI model. With time-varying parameters, the model becomes very difficult to solve. To simplify the model, we therefore assume that its parameters are constants independent of time [29]. In addition, for ease of notation, the time argument t is omitted in the following text.
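The ingredients of the model can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class and function names are hypothetical, every parameter value is a placeholder, and the cost is assumed to be quadratic with the conventional factor 1/2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SECIParams:
    # All defaults are illustrative placeholders, not the paper's settings.
    k_U: float = 2.0    # effort cost coefficient of universities
    k_M: float = 1.0    # effort cost coefficient of manufacturing enterprises
    alpha: float = 0.5  # effort coefficient of universities
    beta: float = 0.4   # effort coefficient of manufacturing enterprises
    delta: float = 0.2  # technical attenuation coefficient
    mu1: float = 1.0    # marginal income coefficient of university effort
    mu2: float = 1.5    # marginal income coefficient of enterprise effort
    nu: float = 0.8     # R&D impact coefficient of technology
    tau: float = 0.4    # income share of universities


def cost(k, effort):
    """Assumed quadratic (convex) effort cost."""
    return 0.5 * k * effort ** 2


def income(p, E_U, E_M, K):
    """Total income of the SECI system, linear in efforts and technology level."""
    return p.mu1 * E_U + p.mu2 * E_M + p.nu * K


def k_dot(p, E_U, E_M, K):
    """Rate of change of the technology level K."""
    return p.alpha * E_U + p.beta * E_M - p.delta * K


def instantaneous_payoffs(p, E_U, E_M, K, theta=0.0):
    """Undiscounted payoff rates (university, enterprise) for subsidy rate theta."""
    pi = income(p, E_U, E_M, K)
    c_U, c_M = cost(p.k_U, E_U), cost(p.k_M, E_M)
    university = p.tau * pi - (1.0 - theta) * c_U
    enterprise = (1.0 - p.tau) * pi - c_M - theta * c_U
    return university, enterprise
```

Because the subsidy is a transfer between the two parties, the sum of the two payoff rates does not depend on `theta`; only the split changes.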

B. RESOLVING MODEL OF NASH NON-COOPERATIVE GAME
In the Nash non-cooperative game, universities and manufacturing enterprises simultaneously and independently select their optimal effort levels of RDDT to maximize their own profits. In this case, manufacturing enterprises do not provide an R&D investment subsidy to universities, that is, $\theta = 0$, and the objective functions of universities and manufacturing enterprises are those of the general model with $\theta = 0$. To obtain the Nash equilibrium in this situation, it is first assumed that both universities and manufacturing enterprises have optimal R&D income functions that are continuously and marginally differentiable. For all $K \ge 0$, the Hamilton-Jacobi-Bellman (HJB) equation must be satisfied.
Proposition 1: The optimal R&D incomes of universities and manufacturing enterprises in the Nash non-cooperative game are, respectively, as follows:
Proof: See the Appendix.
Hence, the optimal total income in the SECI system can be expressed as follows:

C. RESOLVING MODEL OF STACKELBERG MASTER-SLAVE GAME
In the Stackelberg master-slave game, manufacturing enterprises play the leading role in the SECI system. To promote RDDT, manufacturing enterprises (the leader) determine an optimal R&D effort level $E_M$ and an optimal subsidy $\theta$, and universities (the followers) then choose their optimal R&D effort level $E_U$ according to the effort level and subsidy chosen by manufacturing enterprises. The income functions of the two participants are $V_U^D(K)$ and $V_M^D(K)$, respectively, which are continuously and marginally differentiable. Furthermore, for all $K \ge 0$, $V_U^D(K)$ and $V_M^D(K)$ must satisfy the HJB equation.
According to the backward induction method, the optimal control problem of universities is solved first. The optimal R&D effort level of DT technology can be computed by setting the first partial derivative equal to zero, which gives the optimal effort level of universities. Manufacturing enterprises rationally predict that universities will determine their optimal R&D effort level $E_U$ according to formula (14). Therefore, manufacturing enterprises determine their own optimal R&D effort level $E_M$ and R&D investment subsidy $\theta$ based on the rational response of universities, so as to maximize their own benefits. The optimal control problem of manufacturing enterprises is then solved in this situation.
Proposition 2: The optimal R&D incomes of universities and manufacturing enterprises in the Stackelberg master-slave game are, respectively, as follows:
Proof: See the Appendix.
Hence, the optimal total income of RDDT in the SECI system can be expressed as follows:
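The closed forms behind Propositions 1 and 2 can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' typeset equations: it assumes quadratic costs $C_i = (k_i/2)E_i^2$, linear income $\pi = \mu_1 E_U + \mu_2 E_M + \nu K$, dynamics $\dot{K} = \alpha E_U + \beta E_M - \delta K$, a subsidy that shifts a share $\theta$ of $C_U$ to the enterprise, and value functions linear in $K$. The shorthand $a$ and $b$ is introduced here; $a^2$ and $b^2$ correspond to $B$ and $A$ in Corollary 4.

```latex
% Shorthand (introduced here): a = \mu_1(\rho+\delta)+\alpha\nu, \quad b = \mu_2(\rho+\delta)+\beta\nu
\begin{align*}
\intertext{Nash non-cooperative game ($\theta = 0$):}
E_U^{N*} &= \frac{\tau a}{k_U(\rho+\delta)}, \qquad
E_M^{N*} = \frac{(1-\tau)\, b}{k_M(\rho+\delta)}, \\
V_U^{N*}(K) &= \frac{\tau\nu}{\rho+\delta}\,K
  + \frac{\tau^2 a^2}{2\rho k_U(\rho+\delta)^2}
  + \frac{\tau(1-\tau)\, b^2}{\rho k_M(\rho+\delta)^2}, \\
V_M^{N*}(K) &= \frac{(1-\tau)\nu}{\rho+\delta}\,K
  + \frac{\tau(1-\tau)\, a^2}{\rho k_U(\rho+\delta)^2}
  + \frac{(1-\tau)^2 b^2}{2\rho k_M(\rho+\delta)^2}.
\intertext{Stackelberg master-slave game:}
\theta^{*} &= \frac{2-3\tau}{2-\tau}, \qquad
E_U^{D*} = \frac{(2-\tau)\, a}{2k_U(\rho+\delta)}, \qquad
E_M^{D*} = E_M^{N*}, \\
V_U^{D*}(K) &= \frac{\tau\nu}{\rho+\delta}\,K
  + \frac{\tau(2-\tau)\, a^2}{4\rho k_U(\rho+\delta)^2}
  + \frac{\tau(1-\tau)\, b^2}{\rho k_M(\rho+\delta)^2}, \\
V_M^{D*}(K) &= \frac{(1-\tau)\nu}{\rho+\delta}\,K
  + \frac{(2-\tau)^2 a^2}{8\rho k_U(\rho+\delta)^2}
  + \frac{(1-\tau)^2 b^2}{2\rho k_M(\rho+\delta)^2}.
\end{align*}
```

Under these assumptions $\theta^{*} > 0$ requires $\tau < 2/3$, which matches the condition stated in the Appendix, and $(E_U^{D*} - E_U^{N*})/E_U^{D*} = \theta^{*}$. In each case the total income is the sum of the two value functions.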

D. RESOLVING MODEL OF COOPERATIVE GAME
In the cooperative game, universities and manufacturing enterprises jointly select their optimal effort levels of RDDT to maximize their total income. DT technology can then be further improved through cooperative innovation between universities and manufacturing enterprises. In this case, the R&D subsidy that manufacturing enterprises provide to universities is an internal fund transfer. As an internal matter of the SECI system, the R&D cost subsidy $\theta$ can take any value in [0,1]. To obtain the cooperative equilibrium in this situation, it is first assumed that there is an optimal total R&D income function $V^C(K)$ in the SECI system, which is continuously and marginally differentiable. For all $K \ge 0$, the HJB equation must be satisfied.
Proposition 3: The optimal R&D incomes of universities and manufacturing enterprises in the cooperative game are, respectively, as follows:
Proof: See the Appendix.
Under this circumstance, universities and manufacturing enterprises distribute the total income in the SECI system in proportions of $\tau$ and $1-\tau$, respectively.
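For the cooperative case, the HJB equation and its solution can be sketched as below. As before, this is a hedged reconstruction rather than the authors' equations: it assumes quadratic costs with coefficient $k_i/2$, linear income, the linear state equation, and a value function linear in $K$. The shorthand $a$ and $b$ is introduced here.

```latex
% Shorthand (introduced here): a = \mu_1(\rho+\delta)+\alpha\nu, \quad b = \mu_2(\rho+\delta)+\beta\nu
\begin{align*}
\rho V^{C}(K) &= \max_{E_U,\,E_M \ge 0}\Bigl\{ \mu_1 E_U + \mu_2 E_M + \nu K
  - \tfrac{k_U}{2}E_U^2 - \tfrac{k_M}{2}E_M^2
  + V^{C\prime}(K)\,(\alpha E_U + \beta E_M - \delta K) \Bigr\}, \\
E_U^{C*} &= \frac{\mu_1 + \alpha V^{C\prime}}{k_U} = \frac{a}{k_U(\rho+\delta)}, \qquad
E_M^{C*} = \frac{\mu_2 + \beta V^{C\prime}}{k_M} = \frac{b}{k_M(\rho+\delta)}, \\
V^{C*}(K) &= \frac{\nu}{\rho+\delta}\,K
  + \frac{a^2}{2\rho k_U(\rho+\delta)^2}
  + \frac{b^2}{2\rho k_M(\rho+\delta)^2}, \\
V_U^{C*}(K) &= \tau\, V^{C*}(K), \qquad V_M^{C*}(K) = (1-\tau)\, V^{C*}(K).
\end{align*}
```

The subsidy $\theta$ does not appear in the joint problem because it cancels when the two objective functions are added.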

III. COMPARATIVE ANALYSIS OF EQUILIBRIA
The optimal R&D effort levels, the optimal R&D incomes of both parties, and the optimal total R&D income in the SECI system are compared across the three game cases, and the following conclusions are obtained.
Corollary 1: Compared with the Nash non-cooperative game, the R&D effort level of universities in the Stackelberg master-slave game is improved, and the degree of improvement is equal to the R&D investment subsidy coefficient $\theta$. This shows that the R&D investment subsidy acts as an incentive mechanism that encourages universities to put more effort into RDDT than when there is no subsidy. In both cases, the R&D effort level of manufacturing enterprises remains unchanged. When universities and manufacturing enterprises engage in a cooperative game, the optimal R&D effort levels in the SECI system reach their maximum and are superior to those in the non-cooperative cases.
Proof: See the Appendix.
Corollary 2: In the Stackelberg master-slave game, the optimal R&D incomes of universities and manufacturing enterprises are higher than in the Nash non-cooperative game; that is, when manufacturing enterprises provide an R&D subsidy to universities, the R&D income in the SECI system is improved.
Proof: See the Appendix.
Corollary 3: Under the cooperative game, the total income is the highest among the three situations.
Proof: See the Appendix.
It can be seen from Corollary 3 that the total income in the SECI system is highest under the cooperative game. An income distribution plan is reasonable and feasible if the optimal R&D income of each party under the cooperative game is higher than in the non-cooperative cases. In that case, collaborative cooperation is Pareto optimal for both parties. Therefore, it is necessary to coordinate the R&D strategies of both parties on DT technology. From Corollary 2, we have $V_U^{D*} - V_U^{N*} > 0$ and $V_M^{D*} - V_M^{N*} > 0$, so we only need to satisfy formula (24).
Corollary 4: To coordinate the cooperation between universities and enterprises and obtain individual Pareto optimality, the range of the income distribution coefficient $\tau$ of universities can be expressed as follows:
where $[\mu_2(\rho+\delta)+\beta\nu]^2 = A$ and $[\mu_1(\rho+\delta)+\alpha\nu]^2 = B$.
Proof: See the Appendix.
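The individual-rationality conditions behind Corollary 4 can be worked through as below. This is a sketch under assumptions, not the authors' derivation: it takes the cooperative shares to be $\tau V^{C*}$ and $(1-\tau)V^{C*}$, compares them with Stackelberg value functions obtained from a model with quadratic costs $(k_i/2)E_i^2$, linear income and linear dynamics, and uses $A$ and $B$ as defined in Corollary 4.

```latex
\begin{align*}
\tau V^{C*} - V_U^{D*}
  &= \frac{\tau}{\rho(\rho+\delta)^2}
     \left[ \frac{\tau B}{4k_U} - \frac{(1-2\tau)A}{2k_M} \right] \ge 0
  &&\Longleftrightarrow\quad \tau \ge \frac{2k_U A}{4k_U A + k_M B}, \\
(1-\tau) V^{C*} - V_M^{D*}
  &= \frac{\tau}{\rho(\rho+\delta)^2}
     \left[ \frac{(1-\tau)A}{2k_M} - \frac{\tau B}{8k_U} \right] \ge 0
  &&\Longleftrightarrow\quad \tau \le \frac{4k_U A}{4k_U A + k_M B}.
\end{align*}
```

The terms in $K$ cancel in both differences, so the conditions do not depend on the technology level. Combined with the requirement $0 < \tau < 2/3$ from the Stackelberg subsidy, the admissible range is bounded above by the smaller of $4k_UA/(4k_UA + k_MB)$ and $2/3$.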

IV. NUMERICAL RESULTS
From the above analysis, the optimal R&D effort levels of both parties, their respective optimal incomes, and the total R&D income in the SECI system all depend on the values of the model parameters. The cooperative game not only realizes Pareto optimality for the SECI system as a whole but also achieves Pareto optimality for each individual party.
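How the equilibrium effort levels depend on the parameters can be illustrated with a short Python sketch. The closed forms used here are a reconstruction that assumes quadratic costs with coefficient $k_i/2$, linear income and linear dynamics; the function name and all parameter values are hypothetical and are not the paper's settings.

```python
def equilibrium_efforts(mu1, mu2, nu, alpha, beta, rho, delta, k_U, k_M, tau):
    """Closed-form efforts (E_U, E_M) in each game, under the assumed model."""
    if not 0 < tau < 2 / 3:
        raise ValueError("the Stackelberg subsidy is positive only for 0 < tau < 2/3")
    r = rho + delta
    a = mu1 * r + alpha * nu  # a**2 corresponds to B in Corollary 4
    b = mu2 * r + beta * nu   # b**2 corresponds to A in Corollary 4
    theta = (2 - 3 * tau) / (2 - tau)  # optimal subsidy rate in the Stackelberg game
    return {
        "theta": theta,
        "nash": (tau * a / (k_U * r), (1 - tau) * b / (k_M * r)),
        "stackelberg": ((2 - tau) * a / (2 * k_U * r), (1 - tau) * b / (k_M * r)),
        "cooperative": (a / (k_U * r), b / (k_M * r)),
    }


# Hypothetical parameter values, chosen only for illustration.
result = equilibrium_efforts(mu1=1.0, mu2=1.5, nu=0.8, alpha=0.5, beta=0.4,
                             rho=0.9, delta=0.2, k_U=2.0, k_M=1.0, tau=0.4)
for game in ("nash", "stackelberg", "cooperative"):
    print(game, result[game])
```

With these closed forms, the relative increase in university effort from the Nash case to the Stackelberg case equals the subsidy rate, and the enterprise effort is the same in both, which is the content of Corollary 1.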
According to the relevant literature [30] and combined with reality, values are assigned to the parameters in the model. The value range of the income distribution coefficient of universities $\tau$ is then obtained as [401/1611, 822/1611], and we take $\tau = 0.4$ to satisfy this constraint. The resulting optimal R&D effort levels are consistent with the conclusion of Corollary 1. Let $\Omega = \alpha E_U + \beta E_M$; then $\dot{K} = \Omega - \delta K$. Solving this first-order differential equation gives the particular solution $K(t) = \Omega/\delta + (K_0 - \Omega/\delta)e^{-\delta t}$, from which the optimal incomes of both parties and the total income level in the SECI system under the three game cases can be obtained.
MATLAB is used to plot the trends over time of the optimal R&D incomes of universities and manufacturing enterprises and of the total system income level under the different game cases, as shown in Figures 2-4. From Figures 2-4, it can be seen that the optimal R&D incomes of both parties and the total R&D income in the SECI system increase with time t; the change is large in the early stage and tends to be stable in the later period. The ordering of income among the three game cases, from high to low, is always: cooperative game, Stackelberg master-slave game, Nash non-cooperative game. This is consistent with the conclusions of Corollary 2, Corollary 3 and Corollary 4. Collaborative cooperation is therefore Pareto optimal.
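The time path of the technology level under constant efforts can be checked with a short Python sketch that compares the closed-form solution with a simple forward-Euler integration. The symbol `omega` stands for the constant combined effort $\alpha E_U + \beta E_M$; the function names and the numerical values are hypothetical.

```python
import math


def k_closed_form(t, K0, omega, delta):
    """Solution of dK/dt = omega - delta*K with K(0) = K0."""
    steady = omega / delta
    return steady + (K0 - steady) * math.exp(-delta * t)


def k_euler(t_end, K0, omega, delta, steps=100_000):
    """Forward-Euler integration of the same equation, for comparison only."""
    dt = t_end / steps
    K = K0
    for _ in range(steps):
        K += dt * (omega - delta * K)
    return K


# Hypothetical values, chosen only for illustration.
K0, omega, delta = 1.0, 0.8, 0.2
for t in (0.0, 5.0, 10.0, 50.0):
    print(t, k_closed_form(t, K0, omega, delta))
```

The path rises quickly at first and then flattens towards the steady state `omega / delta`, which is the qualitative shape described for the income curves.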

V. CONCLUSION
This paper explores a school-enterprise collaborative R&D strategy on DT technology using a differential game model. The income functions of universities and manufacturing enterprises are established in terms of their respective R&D efforts. Subsequently, we discuss the total R&D income in the SECI system and the R&D investment subsidy from manufacturing enterprises to universities. Furthermore, their benefits are calculated and compared in three different situations: the Nash non-cooperative game, the Stackelberg master-slave game, and the cooperative game. Some conclusions drawn from the equilibrium results are as follows.
As an incentive mechanism, the R&D investment subsidy can effectively improve the efforts of universities in the research and development of digital twin technology. Moreover, the degree of improvement is equal to the subsidy coefficient. In addition, universities and manufacturing enterprises can obtain more benefits in the cooperative game than in the other two game situations.
In this work, universities and manufacturing enterprises are regarded as the two players in our model. However, their cooperative relationships in the real world always involve more complex elements. Future research should build a more comprehensive model, for example one that considers the impact of government policies and more players in the game. Furthermore, owing to technical limitations, the parameters in our model are assumed to be independent of time. It is therefore necessary to find software with greater computing power or a more suitable method to solve the time-dependent problem. In addition, a more realistic case needs to be studied in the future.

APPENDIX
Proof of Proposition 1:
The optimal R&D effort levels of both sides can be computed by setting the first partial derivatives equal to zero, which yields the respective optimal effort levels of both parties, (A.1). Substituting the result of (A.1) into (8) and (9), we obtain (A.2) and (A.3). The solution of the HJB equation is a unary function with $K$ as the independent variable, of the linear form $V_U^N(K) = f_1 K + f_2$ and $V_M^N(K) = g_1 K + g_2$ (A.4), where $f_1$, $f_2$, $g_1$ and $g_2$ are the constants to be solved. Taking the first derivative of formula (A.4) gives (A.5). Substituting the results of (A.4) and (A.5) into (A.2) and (A.3), and using the previous assumption that, for all $K \ge 0$, $V_U^N(K)$ and $V_M^N(K)$ are continuously and marginally differentiable, we can solve for the constants $f_1$, $f_2$, $g_1$ and $g_2$. Substituting the results of $f_1$ and $g_1$ into (A.1) then gives the optimal effort levels, and substituting the results of $f_1$, $f_2$, $g_1$ and $g_2$ into (A.4) gives the optimal incomes stated in Proposition 1.
Proof of Proposition 2: Substituting the result of (13) into (14) and maximizing with respect to $E_M$ and $\theta$ by setting the first partial derivatives equal to zero, we obtain (A.10) and (A.11). Substituting the results of (13), (A.10) and (A.11) into (12) and (14), we further obtain (A.12) and (A.13). The solution of the HJB equation is a unary function with $K$ as the independent variable, of the linear form $V_U^D(K) = f_1 K + f_2$ and $V_M^D(K) = g_1 K + g_2$ (A.14), where $f_1$, $f_2$, $g_1$ and $g_2$ are the constants to be solved. Taking the first derivative of formula (A.14) gives (A.15). Substituting the results of (A.14) and (A.15) into (A.12) and (A.13), and using the previous assumption that, for all $K \ge 0$, $V_U^D(K)$ and $V_M^D(K)$ are continuously and marginally differentiable, we can solve for the constants. Substituting the results of $f_1$ and $g_1$ into (13), (A.10) and (A.11), we further obtain the optimal effort levels and the optimal subsidy. Since $0 < \theta \le 1$ and $0 < \tau < 1$, it follows that $0 < \tau < 2/3$.
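The undetermined-coefficient step in the proof of Proposition 1 can be checked numerically: a candidate $V_U^N(K) = f_1 K + f_2$ should make the HJB residual vanish for every $K$. The sketch below does this for the university in the Nash case. It is an illustration under assumptions, not the authors' code: it assumes quadratic costs with coefficient $k_i/2$, linear income and linear dynamics, and the function names and parameter values are hypothetical.

```python
def nash_university_solution(p):
    """Candidate coefficients f1, f2 and Nash efforts under the assumed model."""
    r = p["rho"] + p["delta"]
    a = p["mu1"] * r + p["alpha"] * p["nu"]
    b = p["mu2"] * r + p["beta"] * p["nu"]
    tau = p["tau"]
    f1 = tau * p["nu"] / r
    f2 = (tau ** 2 * a ** 2 / (2 * p["k_U"])
          + tau * (1 - tau) * b ** 2 / p["k_M"]) / (p["rho"] * r ** 2)
    E_U = tau * a / (p["k_U"] * r)
    E_M = (1 - tau) * b / (p["k_M"] * r)
    return f1, f2, E_U, E_M


def hjb_rhs_university(K, E_U, E_M, f1, p):
    """Right-hand side of the university's HJB equation for a given E_U."""
    pi = p["mu1"] * E_U + p["mu2"] * E_M + p["nu"] * K
    drift = p["alpha"] * E_U + p["beta"] * E_M - p["delta"] * K
    return p["tau"] * pi - 0.5 * p["k_U"] * E_U ** 2 + f1 * drift


# Hypothetical parameter values, chosen only for illustration.
params = {"mu1": 1.0, "mu2": 1.5, "nu": 0.8, "alpha": 0.5, "beta": 0.4,
          "rho": 0.9, "delta": 0.2, "k_U": 2.0, "k_M": 1.0, "tau": 0.4}
f1, f2, E_U, E_M = nash_university_solution(params)
residuals = [params["rho"] * (f1 * K + f2) - hjb_rhs_university(K, E_U, E_M, f1, params)
             for K in (0.0, 1.0, 10.0)]
print(residuals)
```

The right-hand side is a concave quadratic in $E_U$, so any deviation from the candidate effort lowers it, which is what the first-order condition in the proof relies on.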
Substituting the results of $f_1$, $f_2$, $g_1$ and $g_2$ into (A.14), we obtain the optimal incomes stated in Proposition 2.
Proof of Proposition 3: The optimal R&D effort level of DT technology can be computed by setting the first partial derivatives equal to zero, which yields the respective optimal effort levels of both parties, (A.21). Substituting the result of (A.21) into (19), we obtain (A.22). The solution of the HJB equation is a unary function with $K$ as the independent variable, of the linear form $V^C(K) = f_1 K + f_2$ (A.23), where $f_1$ and $f_2$ are the constants to be solved. Taking the first derivative of formula (A.23) gives (A.24). Substituting the results of (A.23) and (A.24) into (A.22), and using the previous assumption that, for all $K \ge 0$, $V^C(K)$ is continuously and marginally differentiable, we can solve for $f_1$ and $f_2$. Substituting the result of $f_1$ into (A.21), we obtain the optimal effort levels. Substituting the results of $f_1$ and $f_2$ into (A.23), we obtain the optimal total income stated in Proposition 3.
Proof of Corollary 1: Comparing the optimal R&D effort levels obtained in the three game cases, Corollary 1 is proved.
Proof of Corollary 2: From (10), (11), (16) and (17), the required inequalities hold. Corollary 2 is proved.
Proof of Corollary 3: From (12), (18) and (21), the required inequalities hold. Corollary 3 is proved.
Proof of Corollary 4: From formula (24), we can obtain the bounds on $\tau$. According to the previous description, $0 < \tau < 2/3$. Therefore, it is only necessary to compare the values of $4k_UA/(4k_UA + k_MB)$ and $2/3$, after which the value range of the R&D income distribution coefficient $\tau$ can be determined.
He was also a Visiting Scholar in environmental and ecological engineering (EEE) with Purdue University, USA, from September 2017 to August 2019. He is currently an Associate Professor with the School of Intelligent Systems Science and Engineering, Jinan University at Zhuhai Campus. His research interests include industrial engineering, disassembly planning, transportation planning, decision making, and optimization methods.
XIN LIAN is currently pursuing the B.S. degree in Internet of Things with the School of Intelligent Systems Science and Engineering, Jinan University at Zhuhai Campus. Her research interests include digital twins and intelligent manufacturing.
RUI ZHANG received the Ph.D. degree from Tianjin University. He was also a Visiting Scholar with Concordia University. He is currently an Associate Professor and a Master Supervisor with the Tianjin University of Science and Technology. He has published more than 40 academic articles, of which six are indexed by SCI and 11 by EI. He has also presided over or completed many projects, such as NSFC and NSSP projects, as the Main Executor. He has applied for six invention patents, one of which has been granted, and for more than 20 utility model patents, 16 of which have been granted, and has obtained eight software copyrights. His research interests include wireless signal perception, the Internet of Things technology and its applications, and spectrum detection technology applications. He has received the honor of "Young and Middle-Aged Backbone Innovative Talents" from universities in Tianjin. He won the Third Prize of the Tianjin Science and Technology Progress Award in 2017.