Noncooperative and Cooperative Multi-Player Minmax H∞ Mean-Field Target Tracking Game Strategy of Nonlinear Mean Field Stochastic Systems with Application to Cyber-Financial Systems

In this study, we investigate the multi-player noncooperative minmax H∞ target tracking game strategy with conflicting target strategies and the cooperative H∞ target tracking game strategy with a common target strategy for nonlinear mean-field stochastic jump diffusion (MFSJD) systems with external disturbance. Due to the nonlinear terms and mean-field (average) behaviors in the nonlinear MFSJD system and the minmax H∞ payoff function, the multi-player noncooperative and cooperative minmax H∞ mean-field game strategies of nonlinear MFSJD systems are more difficult to design than those of linear MFSJD systems and conventional nonlinear stochastic systems. To avoid solving the complex Hamilton-Jacobi-Isaacs inequalities (HJIIs) of the multi-player noncooperative minmax H∞ mean-field target tracking game strategy of the nonlinear MFSJD system, the nonlinear MFSJD system is interpolated by a set of local linearized MFSJD systems through smoothing functions by the global linearization method. Then the multi-player noncooperative minmax H∞ nonlinear mean-field target tracking game strategy design can be transformed to a linear matrix inequalities (LMIs)-constrained multi-objective optimization problem (MOP). The LMIs-constrained MOP can be efficiently solved with the help of the proposed LMIs-constrained multi-objective evolution algorithm (MOEA). We prove that the Pareto optimal solution of the LMIs-constrained MOP is the Nash equilibrium solution of the noncooperative minmax H∞ mean-field target tracking strategy of the nonlinear MFSJD system. Further, the cooperative minmax H∞ mean-field common target tracking strategy of the nonlinear mean-field stochastic system is reduced to an LMIs-constrained single-objective optimization problem (SOP). Finally, two simulation examples of cyber-financial mean-field systems are given to illustrate the design procedure and compare the efficacies of the proposed noncooperative and cooperative minmax H∞ mean-field target tracking strategies of nonlinear MFSJD systems.


I. INTRODUCTION
The mean-field theory was proposed to model collective behaviors that result from all interactions of individuals in various physical and sociological stochastic systems. Recently, mean-field stochastic systems have gained worldwide attention and become an active research field [1]-[6]. One main feature of a mean-field stochastic system is that the mean terms of the system state appear in its stochastic dynamics to represent the influence of the present average (mean) behavior on the stochastic system. However, the mean terms, which appear in both the stochastic mean-field system and the cost functional, always make the optimal control design problem of a mean-field stochastic system more difficult than that of a conventional stochastic system. A large number of researchers have studied mean-field stochastic systems [7], [8] and their applications to diverse areas such as biological systems [9], [10], social systems [11], smart grids [12], etc. In [9], [10], the scale-free property of biological molecular networks is analyzed by mean-field algorithms. In [11], the financial resource allocation problem in the share market is described as a multi-player stochastic mean-field game problem of a linear stochastic mean-field system. It is found that the share prices of competitive firms with a stochastic cooperative game strategy are higher than those with a stochastic noncooperative game strategy. The competitiveness and strategy analysis of electric vehicles in the smart grid can be described as a mean-field game strategy design problem of a linear mean-field stochastic system [12]. The Pareto H∞ game strategy in [13] was introduced based on the weighted sum minimization method for linear mean-field stochastic systems in finite horizon. The linear quadratic mean-field stochastic differential game is discussed based on an open-loop Stackelberg strategy in [14].
The Pareto game strategies of linear mean-field stochastic systems in finite horizon are proposed in [15], and the Pareto H∞ game strategy was introduced for linear mean-field stochastic systems under external disturbance in [16]. The above application examples show that individuals may take the effect of collective behaviors into account as mean-field (average) terms arising from all individuals' mutual interactions in stochastic social systems.
Stochastic game theory is another common method to discuss the behaviors of all players in large population systems [17]-[20]. In [17], the Nash minmax stochastic game strategy was employed to treat the mixed H2/H∞ control design of linear and nonlinear stochastic systems. In [18], the cooperative and noncooperative H2 and H∞ game strategies were introduced for linear and nonlinear stochastic systems, together with their applications to control, communication and social systems. In [19], some theoretical results on the Nash equilibrium solution were discussed for linear and nonlinear stochastic systems. In [20], the Takagi-Sugeno (T-S) fuzzy interpolation method was employed to efficiently solve the noncooperative and cooperative game strategies of nonlinear stochastic systems with Wiener process and Poisson process. The stochastic game strategy is an efficient method to analyze how multiple players make their decisions based on their own interests and goals. Recently, stochastic game theory has been widely applied to very diverse areas, for example, stochastic evolutionary game strategies of a population of biological networks [21], noncooperative game strategy of bandwidth allocation in 4G heterogeneous wireless access networks [22], a game theoretic approach to the optimal scheduling of parking-lot electric vehicle charging [23], noncooperative game strategy in cyber-financial systems [24], etc. According to the strategies of the players, the multi-player game can be classified into the multi-player noncooperative game and the cooperative game. The game strategy of M players who compete with each other to pursue their maximum benefits is called the multi-player noncooperative game strategy [18]-[20]. The game strategy of M players who reach a compromise to pursue their common benefits is called the cooperative game strategy.
In general, the multi-player noncooperative game strategy is harder to design than the multi-player cooperative game strategy because it is not easy to achieve the multiple desired targets of all players simultaneously, as seen in the game strategies of a population of biological networks in [21] and the game strategies of networked multi-agent systems in [25].
Nash equilibrium is a crucial concept to validate the design performance of a noncooperative game strategy. A noncooperative game strategy has a Nash equilibrium solution if, once each player has chosen a strategy, no player can profit by changing his own strategy while the other players keep their strategies unchanged [18], [19]. In general, it is very difficult to find the Nash equilibrium solution for the noncooperative game of a stochastic system, especially for nonlinear stochastic systems, because of the existence of many Nash equilibrium solutions. A lot of iterative searching algorithms, such as the extremum seeking method, have been developed to search for a Nash equilibrium point of noncooperative game strategies [26]-[28]. Some shortcomings of these iterative searching algorithms are as follows: (i) The convergence time to a Nash equilibrium point may be quite long if the initial approximation point is far away from the Nash equilibrium point. (ii) These iterative searching algorithms cannot search for all Nash equilibrium points, especially for the nonlinear multi-player noncooperative game strategy. Based on the above discussion, it is more appealing to find a direct method to solve for all Nash equilibrium points of the multi-player noncooperative game strategy of a nonlinear stochastic system, especially for the nonlinear MFSJD system.
Even though the game strategy design for the MFSJD system has gained a lot of attention in recent studies, some problems still remain to be dealt with. First, the conventional game design focuses on the stabilization problem of the MFSJD system, and the target tracking game strategy design is not well considered. Also, to obtain the Nash equilibrium solution of the noncooperative game strategy design, the iterative method may incur a high computational cost during the design procedure. Further, it is not easy to obtain all Nash equilibria.
In this study, to remedy the above shortcomings, we focus on the M-player noncooperative and cooperative minmax H∞ mean-field target tracking game strategy design problems of the nonlinear MFSJD system with unknown external disturbance and continuous and discontinuous random fluctuations. First, an individual fractional payoff function based on the H∞ mean-field target tracking performance is proposed for each player to achieve his own desired target in the multi-player noncooperative H∞ minmax mean-field target tracking game strategy of the nonlinear MFSJD system. Also, a common fractional payoff function is proposed for all players to achieve their desired common target in the multi-player cooperative H∞ minmax mean-field target tracking game strategy of the nonlinear MFSJD system under external disturbance. In order to avoid solving M complicated nonlinear partial differential Hamilton-Jacobi-Isaacs inequalities (HJIIs) in the nonlinear M-player noncooperative H∞ mean-field target tracking game strategy, the global linearization technique is employed to approximate the nonlinear MFSJD system by the interpolation of local linearized MFSJD systems at the multiple vertices of the polytope of all global linearization MFSJD systems. By the proposed indirect method, the design problem of the nonlinear M-player noncooperative H∞ mean-field target tracking game strategy of the nonlinear MFSJD system is transformed to an equivalent LMIs-constrained multi-objective optimization problem (MOP), which can be efficiently solved with the help of the proposed LMIs-constrained reverse-order MOEA method. We show that the Pareto optimal solution of the LMIs-constrained MOP is the solution of the noncooperative H∞ mean-field target tracking game strategy of the nonlinear MFSJD system. In general, the solution of the M-player noncooperative minmax H∞ mean-field target tracking game strategy of the nonlinear MFSJD system is not unique.
There exist a lot of Pareto optimal solutions for the M-player noncooperative H∞ mean-field target tracking game strategy of the nonlinear MFSJD system, all of which can be proven to be Nash equilibrium solutions. Further, a similar indirect method is employed to treat the M-player cooperative minmax H∞ mean-field common target tracking strategy design problem of the nonlinear MFSJD system. Based on the global linearization technique [29], the M-player cooperative minmax H∞ mean-field common target tracking strategy design problem of the nonlinear MFSJD system can be transformed to a simple LMIs-constrained single-objective optimization problem (SOP), which can be easily solved with the help of the LMI toolbox in MATLAB. Finally, two simulation examples of cyber-financial systems are given to illustrate the design procedure and to compare the performance of the M-player noncooperative and cooperative minmax H∞ target tracking game strategies in nonlinear MFSJD financial systems.
The contributions of this study are summarized as follows: (I) In this study, the nonlinear MFSJD system with mean-field terms, continuous Wiener process, jumping Poisson process and external disturbance is considered to more realistically describe phenomena in cyber-physical mean-field systems, especially economic and financial systems. Based on the nonlinear MFSJD system, the multi-player nonlinear noncooperative and cooperative minmax H∞ mean-field target tracking strategy design problems are proposed for more practical applications. The M-player noncooperative and cooperative H∞ mean-field target tracking strategies are transformed to an equivalent M-player H∞ stabilization game of augmented mean-field systems to simplify the design procedure.
(II) At present, there exists no efficient method to solve for all Nash equilibrium solutions of the multi-player noncooperative game strategy of nonlinear MFSJD systems. In this study, the global linearization and the proposed indirect method are employed so that the nonlinear multi-player noncooperative and cooperative minmax H∞ mean-field target tracking game strategy design problems can be transformed to an equivalent LMIs-constrained MOP and an equivalent LMIs-constrained SOP for the nonlinear MFSJD system, respectively. Also, the Pareto optimal solutions of the MOP are all shown to be Nash equilibrium solutions of the multi-player noncooperative minmax H∞ mean-field target tracking game strategy.
(III) Unlike the conventional iterative searching algorithms that search out one Nash equilibrium solution, the proposed reverse-order LMIs-constrained MOEA-based algorithm for the LMIs-constrained MOP can search in parallel for the Pareto front of the MOP in a single run to efficiently obtain all Nash equilibrium solutions of the multi-player noncooperative minmax H∞ mean-field target tracking strategy of the nonlinear MFSJD system, from which the designer can select his own preferable solution. Therefore, the proposed multi-player noncooperative and cooperative game strategies can be applied to solving the financial investment strategy in cyber-financial systems and the financial contagion problem due to the global impact of a financial crisis.
This study is organized as follows: The problem formulation of the M-player noncooperative minmax H∞ mean-field target tracking game strategy of the nonlinear MFSJD system is given in Section II. In Section III, based on the global linearization method, the M-player noncooperative minmax H∞ mean-field game strategy is transformed into an equivalent LMIs-constrained MOP, which can be solved with the help of the LMIs-constrained MOEA. A design procedure of the multi-player noncooperative nonlinear H∞ mean-field target tracking game strategy via the LMIs-constrained MOEA is proposed in Section IV. In Section V, the design of the multi-player cooperative H∞ mean-field target tracking game strategy is given for the nonlinear MFSJD system. In Section VI, two design examples of cyber-financial mean-field systems with noncooperative and cooperative target tracking game strategies are given to illustrate the design procedure and to validate and compare their target tracking performance. Some concluding remarks are made in Section VII.
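Notation: $A^T$: the transpose of matrix $A$; $A \ge 0$ ($A > 0$): $A$ is a positive semi-definite (positive definite) matrix; $E\{\cdot\}$: the expectation operator; $I_n$: the $n \times n$ identity matrix; $L^2_{\mathcal{F}}(\mathbb{R}_+, \mathbb{R}^n)$: the space of nonanticipative stochastic processes $y(t) \in \mathbb{R}^n$ with respect to an increasing σ-algebra $\mathcal{F}_t$ ($t \ge 0$) satisfying $\|y(t)\|_{L^2_{\mathcal{F}}(\mathbb{R}_+,\mathbb{R}^n)} \triangleq \left(E\left\{\int_0^\infty y^T(t)\, y(t)\,dt\right\}\right)^{1/2} < \infty$.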

II. PROBLEM DESCRIPTION OF M-PLAYER NONCOOPERATIVE MINMAX H∞ MEAN-FIELD TARGET TRACKING GAME STRATEGY OF NONLINEAR MFSJD SYSTEM
Consider the following more general nonlinear MFSJD system with M players (M decision makers):

where $x(t) \in \mathbb{R}^n$ denotes the state vector of the nonlinear MFSJD system, $E\{x(t)\}$ is the expectation of $x(t)$, $u_m(t) \in \mathbb{R}^{M_m}$ denotes the control strategy of the mth player with its expectation $E\{u_m(t)\}$, $v(t) \in L^2_{\mathcal{F}}(\mathbb{R}_+, \mathbb{R}^n)$ and $E\{v(t)\}$ are the external disturbance and its expectation, respectively, $w(t)$ is the 1-D standard Wiener process, $l(x(t), E\{x(t)\})dw(t)$ denotes the continuous random fluctuation of the system function, $p(t)$ denotes the Poisson counting process with mean $\lambda > 0$ in a unit time, and $n(x(t), E\{x(t)\})dp(t)$ is regarded as the intrinsic discontinuous random fluctuation (jumping process) of the system function. The nonlinear functions $f(x(t), E\{x(t)\})$, $\{g_m(x(t), E\{x(t)\})\}_{m=1}^{M}$, $h(x(t), E\{x(t)\})$, $l(x(t), E\{x(t)\})$ and $n(x(t), E\{x(t)\})$ are Borel measurable functions with Lipschitz continuity. The two stochastic processes $w(t)$ and $p(t)$ in (1) are assumed to be mutually independent.

Remark 1. [18], [39] Some properties of the stochastic processes $w(t)$ and $p(t)$ are given as follows: (I) $E\{dw(t)\} = 0$, (II) $E\{dw(t)dw(t)\} = dt$, (III) $E\{dp(t)\} = \lambda dt$, (IV) $E\{dw(t)dp(t)\} = 0$.
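The four moment properties in Remark 1 can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the original design: the values `LAM = 2.0`, `DT = 0.01` and the sample size are illustrative assumptions, and the helper names (`poisson`, `sample_increments`) are hypothetical. Poisson increments are drawn with Knuth's multiplication method, which is adequate for the small mean λ·dt used here.

```python
import math
import random

def poisson(rng, mean):
    """Knuth's multiplication method for a Poisson variate (small mean)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_increments(lam, dt, n_samples, seed=0):
    """Draw independent Wiener and Poisson increments over one step dt."""
    rng = random.Random(seed)
    dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_samples)]
    dp = [poisson(rng, lam * dt) for _ in range(n_samples)]
    return dw, dp

def mean(values):
    return sum(values) / len(values)

# Illustrative (assumed) parameters: jump rate, step size, sample count.
LAM, DT, N = 2.0, 0.01, 200_000
dw, dp = sample_increments(LAM, DT, N)

m_dw = mean(dw)                                  # (I)   should be near 0
m_dw2 = mean([a * a for a in dw])                # (II)  should be near dt
m_dp = mean(dp)                                  # (III) should be near lam*dt
m_dwdp = mean([a * b for a, b in zip(dw, dp)])   # (IV)  should be near 0
```

The sample averages only approximate the expectations, so any comparison against 0, dt and λ·dt has to allow for Monte Carlo error.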
Recently, the nonlinear MFSJD system in (1) has been widely employed to model nonlinear stochastic systems with collective behavior resulting from all interactions of individuals in various physical and sociological stochastic systems, especially for the financial investment strategies in financial index systems and the control strategies of governments and international consortiums in the financial contagion problem due to the global impact of a financial crisis. However, at present, there exists no efficient method to solve for all Nash equilibria of the multi-player noncooperative target tracking game strategy of the nonlinear MFSJD system for practical applications. In this study, the global linearization and the proposed reverse-order LMIs-constrained MOEA algorithm will be employed to treat the multi-player noncooperative and cooperative target tracking game strategies in the nonlinear MFSJD system.
Before further discussion, the global linearization method is utilized to transform the nonlinear MFSJD system into an interpolation-type nonlinear MFSJD system. By choosing J suitable vertices, we have [29], [30]

where the matrices with index i are the local linearized matrices at the ith vertex, for $i = 1, \cdots, J$, $m = 1, \cdots, M$.
Therefore, the trajectory of the nonlinear MFSJD system in (1) can be represented by the convex combination of the trajectories of the following J local linearized MFSJD systems of the polytope if the convex hull consists of all local linearized systems at all $x(t)$ and $E\{x(t)\}$ [29], [30]:

According to the global linearization theory [29], [30], the trajectory of the nonlinear MFSJD system in (1) can be represented by the convex combination of the trajectories of the J local MFSJD systems in (2) as follows:

There exist several interpolation methods for the approximation of the nonlinear MFSJD system in (1), e.g., the T-S fuzzy interpolation method. In this study, the global linearization method is adopted. Theoretically, if all the local linearized systems are inside a compact set C, the global linearization method interpolates the local linearized systems at the J vertices of the compact set C in (3) to approximate the nonlinear MFSJD system in (1) with the interpolation functions. In general, compared with other interpolation methods, the global linearization method can approximate the nonlinear MFSJD system with fewer local linearized systems and simpler interpolation functions.
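To make the interpolation idea concrete, the following minimal Python sketch treats a scalar stand-in rather than the MFSJD system itself. It assumes the hypothetical nonlinearity f(x) = sin(x) on |x| ≤ π, for which the slope sin(x)/x lies between the two assumed vertex values 0 and 1, and it uses a simplified sector-type representation instead of the Jacobian-based vertices of the global linearization theory. The nonlinear function is then written exactly as a convex combination of the two vertex systems, with interpolation functions that are nonnegative and sum to one.

```python
import math

# Assumed vertex slopes: sin(x)/x lies in [A_LO, A_HI] = [0, 1] for |x| <= pi.
A_LO, A_HI = 0.0, 1.0

def weights(x):
    """Interpolation functions (alpha_1, alpha_2) with alpha_1 + alpha_2 = 1."""
    slope = math.sin(x) / x if x != 0.0 else 1.0   # limit of sin(x)/x at x = 0
    alpha_hi = (slope - A_LO) / (A_HI - A_LO)
    return (1.0 - alpha_hi, alpha_hi)

def interpolated(x):
    """Convex combination of the two local linear systems a_i * x."""
    alpha_1, alpha_2 = weights(x)
    return (alpha_1 * A_LO + alpha_2 * A_HI) * x

# Evaluate on a grid covering the assumed operating range [-pi, pi].
grid = [-math.pi + k * (2 * math.pi / 200) for k in range(201)]
max_error = max(abs(interpolated(x) - math.sin(x)) for x in grid)
```

In the matrix case the same construction is applied to every entry of the local linearized matrices, which is why the number of vertices J grows with the number of independently varying entries.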
Taking the expectation of the nonlinear MFSJD system in (3) with the facts $E\{dp(t)\} = \lambda dt$ and $E\{dw(t)\} = 0$ in Remark 1, we get the mean subsystem of the nonlinear MFSJD system as:

The variation subsystem of the mean-field system can be obtained by subtracting the mean subsystem in (4) from the nonlinear MFSJD system in (3) as:

Let us denote the competitors of the mth player as $u_{-m}(t)$; the external disturbance is also considered as one kind of competitor to each player. Then the mean subsystem w.r.t. the mth player in (4) can be rewritten as:

Moreover, the variation subsystem in (5) w.r.t. the mth player can be formulated as follows:

Suppose the desired target of the mth player can be generated by the following reference model:

$$dx_{r,m}(t) = \left(A_{r,m}\, x_{r,m}(t) + B_{r,m}\, r_m(t)\right)dt \qquad (8)$$

where $x_{r,m}(t)$ denotes the desired trajectory to be tracked by the mth player, $r_m(t)$ represents the reference input, $A_{r,m}$ is an asymptotically stable matrix specified by the mth player, and $B_{r,m}$ is the reference input matrix.
Remark 3. If we choose $B_{r,m} = -A_{r,m}$ for the reference model in (8), it is clear that $x_{r,m}(t) = -A_{r,m}^{-1} B_{r,m}\, r_m(t) = r_m(t)$ at the steady state. Therefore, if the mth player selects $B_{r,m} = -A_{r,m}$ and $r_m(t)$ as the desired target, then the target tracking problem of each player becomes a reference model tracking problem of (8).
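As a numerical illustration of Remark 3, the following minimal Python sketch integrates a scalar reference model of the same form as (32), i.e. dx_r/dt = a_r x_r + b_r r, with the choice b_r = -a_r. The values `A_R = -2.0` and `r = 1.5`, the forward-Euler scheme and the function name are assumptions made only for this example.

```python
def simulate_reference(a_r, b_r, r, x0=0.0, dt=1e-3, t_final=10.0):
    """Forward-Euler integration of dx_r/dt = a_r*x_r + b_r*r (scalar case)."""
    x = x0
    for _ in range(round(t_final / dt)):
        x += (a_r * x + b_r * r) * dt
    return x

A_R = -2.0       # hypothetical asymptotically stable reference dynamics
TARGET = 1.5     # hypothetical constant reference input r

# With b_r = -a_r the steady state is -a_r^{-1} * b_r * r = r.
x_final = simulate_reference(A_R, -A_R, TARGET)
```

After ten time units the state has settled at the reference input, so the constant input `TARGET` directly plays the role of the desired target.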
Based on the variation subsystem in (7), the mean subsystem of the MFSJD system in (6), and the desired target system in (8), the M-player noncooperative mean-field target tracking game strategy is formulated as the following M simultaneous minmax H∞ reference target tracking design problems:

for $m = 1, \cdots, M$, where $T_f > 0$ denotes the final time, $\rho_m^*$ denotes the performance of the mth player, and $Q_{1,m} \ge 0$, $Q_{2,m} \ge 0$, $R_{1,m} > 0$ and $R_{2,m} > 0$ are the corresponding symmetric weighting matrices. $\tilde{x}_m^T(0) P_1 \tilde{x}_m(0)$ and $(E\{x(0)\} - x_{r,m}(0))^T P_2 (E\{x(0)\} - x_{r,m}(0))$ are the effects of the initial conditions of the variation system and the tracking error, with positive definite matrices $P_1$ and $P_2$, respectively.
and mean control power $E\{u_m^T(t)\} R_{2,m} E\{u_m(t)\}$ of the mth player in the integral of the numerator in (9) are to be minimized by the control variation $\tilde{u}_m(t)$ and the mean control $E\{u_m(t)\}$ of the mth player, while the control variance $\tilde{u}_{-m}^T(t)\tilde{u}_{-m}(t)$ and the mean control power $E\{u_{-m}^T(t)\}E\{u_{-m}(t)\}$ of the competitors of the mth player, the variance of the external disturbance $\tilde{v}^T(t)\tilde{v}(t)$, the power of the average external disturbance $E\{v^T(t)\}E\{v(t)\}$ and the power $r_m^T(t) r_m(t)$ of the arbitrary reference input in the integral of the denominator in (9) are to be specified by the competitors of the mth player to maximize their effect on the mean-field target tracking performance in the numerator of (9) from the minmax H∞ game perspective.
The physical meaning of the M-player noncooperative minmax H∞ mean-field target tracking game strategy in (9) is that the competitors $\tilde{u}_{-m}(t)$ and $E\{u_{-m}(t)\}$ of the mth player, as well as the external disturbance $v(t)$ and any desired reference input $r_m(t)$ of the nonlinear MFSJD system, want to deteriorate the mean-field target tracking performance and enlarge the deviation around the mean target of the mth player as much as possible, while the mth player tries to optimally track his desired target. The initial-condition terms in the numerator of (9) are included to deduct the effect of the initial conditions on the multi-player noncooperative minmax H∞ mean-field target tracking game.

III. DESIGN OF M-PLAYER NONCOOPERATIVE H∞ MEAN-FIELD TARGET TRACKING GAME STRATEGY OF NONLINEAR MFSJD SYSTEM
From the above analysis, the M-player noncooperative H∞ mean-field target tracking game strategy design problem is to solve the M simultaneous minmax H∞ reference target tracking design problems in (9). In order to simplify the design procedure, let us augment the two mean-field subsystems in (6) and (7) with the desired reference model in (8) into the augmented state $\bar{X}_m(t)$; the corresponding augmented mean-field stochastic system can be given as follows:

The detailed system matrices in (10) are given as

Based on the augmented mean-field stochastic system in (10), the M minmax H∞ target tracking design problems in (9) for the M-player mean-field target tracking game strategy of the nonlinear mean-field stochastic system in (1) can be rewritten as the following M simultaneous minmax H∞ stabilization design problems:

where the term with the positive definite matrix $\bar{P}$ is the effect of the initial condition to be deducted. The augmented weighting matrices are defined as

Based on the above analysis, the M-player noncooperative minmax H∞ mean-field reference target tracking game strategy design problem in (9) of the nonlinear MFSJD system in (1) is transformed to the problem of solving the M simultaneous minmax H∞ stabilization design problems in (11) for the M augmented systems in (10). However, due to the fractional payoff function in (11), it is not easy to directly solve the M simultaneous H∞ robust stabilization problems in (11). In this situation, the following indirect suboptimal method is proposed to solve the M simultaneous minmax H∞ robust stabilization problems in (11) for the M augmented mean-field systems in (10):

where $\rho_m$ denotes the upper bound of the minmax H∞ mean-field game strategy of the mth player with control strategy $\bar{U}_m(t)$.
Instead of directly solving the M minmax H∞ robust stabilization problems in (11) of the M augmented stochastic systems in (10) simultaneously, we solve the M minmax H∞ mean-field game design problems in (12) by an indirect suboptimal method, i.e., by decreasing each $\rho_m$ as much as possible so that it approaches $\rho_m^*$ from the suboptimal game perspective. Then the M minmax H∞ stabilization problems in (11) can be solved indirectly by minimizing their upper bounds in (12) simultaneously as the following multi-objective optimization problem (MOP):

The solution of the MOP in (13) will also be shown to be the Nash equilibrium solution of the M-player noncooperative minmax H∞ mean-field game strategy in (12) in the sequel.
The suboptimal M-player minmax H∞ mean-field game strategy design problem in (12) of the nonlinear MFSJD system becomes the problem of solving the MOP in (13) based on the Pareto domination of all feasible solutions. The solutions of the MOP in (13) are called Pareto optimal solutions and are not unique [6]. Some properties of the MOP in (13) based on the Pareto domination are given by the following definitions:

with at least one of the inequalities being a strict inequality.
That is, the Pareto front collects all the objective vectors of the Pareto optimal solutions.
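The notions of Pareto domination and Pareto front used above can be sketched in a few lines of Python for a minimization problem. The candidate objective vectors below are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(vectors):
    """Objective vectors that are not dominated by any other vector."""
    return [v for v in vectors if not any(dominates(w, v) for w in vectors)]

# Hypothetical objective vectors (rho_1, rho_2) of five feasible solutions.
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (2.0, 5.0)]
front = pareto_front(candidates)
```

Here (3.0, 3.0) is dominated by (2.0, 2.0) and (2.0, 5.0) is dominated by (1.0, 4.0), so three mutually non-dominated vectors remain on the front.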
Before the discussion of solving the M-player noncooperative minmax H∞ mean-field game strategy via the MOP in (13), the Nash equilibrium solution (point) of the M-player noncooperative minmax H∞ mean-field game strategy in (11) is defined as follows:

Definition 5. (Nash Equilibrium Point [3]): For the M-player noncooperative minmax H∞ mean-field target tracking game strategy in (11) of the nonlinear MFSJD system in (10), the M-player noncooperative minmax game strategy $(\bar{U}_1^*(t), \cdots, \bar{U}_m^*(t), \cdots, \bar{U}_M^*(t))$ with objective vector $(\rho_1^*, \cdots, \rho_m^*, \cdots, \rho_M^*)$ constitutes a Nash equilibrium solution if and only if the following M inequalities hold:

The meaning of (14) is that, once each player has chosen a strategy, no player can profit by changing his own strategy while the other players keep their strategies unchanged. In general, there exist a large number of Nash equilibrium solutions satisfying the M inequalities in (14). Before solving the MOP in (13) for the suboptimal solution of the M-player noncooperative H∞ mean-field game strategy in (12), we need to prove that the solution of the MOP in (13) approaches the solution of the M-player minmax noncooperative H∞ mean-field game strategy of the nonlinear MFSJD system in (10).

Theorem 1. The Pareto optimal solution of the MOP in (13) with the corresponding Pareto optimal objective vector $(\rho_1^*, \cdots, \rho_m^*, \cdots, \rho_M^*)$ by the indirect suboptimal method is also the solution of the M-player noncooperative minmax H∞ mean-field game strategy in (11) for the nonlinear mean-field stochastic system.

Proof. See Appendix A.
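The unilateral-deviation condition in Definition 5 can be illustrated on a much simpler object than the MFSJD game. The following minimal Python sketch checks it for a hypothetical two-player game with two discrete strategies each and made-up cost tables (each player minimizes his own cost); it is not the continuous-time game of this study, only an instance of the same equilibrium test.

```python
def is_nash(costs, profile):
    """costs[m][i][j] is the cost of player m when player 0 plays i and
    player 1 plays j. A profile is a Nash equilibrium if neither player can
    lower his own cost by deviating while the other keeps his strategy."""
    i, j = profile
    n_i, n_j = len(costs[0]), len(costs[0][0])
    player0_ok = all(costs[0][i][j] <= costs[0][k][j] for k in range(n_i))
    player1_ok = all(costs[1][i][j] <= costs[1][i][k] for k in range(n_j))
    return player0_ok and player1_ok

# Hypothetical prisoner's-dilemma-like cost tables (lower is better).
COSTS = [
    [[1, 3], [0, 2]],   # costs of player 0
    [[1, 0], [3, 2]],   # costs of player 1
]

equilibria = [(i, j) for i in range(2) for j in range(2)
              if is_nash(COSTS, (i, j))]
```

In this toy game the profile (0, 0) gives both players a lower cost than (1, 1), yet only (1, 1) survives the unilateral-deviation test, which shows that the equilibrium condition is about deviations of one player at a time rather than joint improvement.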
According to Definitions 1-5 and Theorem 1, we need to solve for the Pareto optimal solution $(\bar{U}_1^*(t), \cdots, \bar{U}_m^*(t), \cdots, \bar{U}_M^*(t))$ of the MOP in (13) for the M-player noncooperative minmax H∞ mean-field target tracking game strategy of the nonlinear mean-field stochastic system. In the minmax fractional strategy in (13), since the minimization of the numerator by $\bar{U}_m(t)$ is independent of $\bar{U}_{-m}(t)$, the MOP in (13) is equivalent to the following MOP with the following M Nash H2 quadratic game inequality constraints [11], [18], [20]:

Let us denote = E{

Then we need two steps to solve the M minmax H2 quadratic game inequality constraints in (16) on the MOP in (15). The first step is to solve the M stochastic minmax Nash H2 quadratic game problems

VOLUME 4, 2016
The second step is to solve the following M constrained problems

Before we solve (18) and (19) for the M minmax H2 quadratic game inequality constraints in (16) on the MOP in (15), the following lemma is necessary.

Lemma 1. (Itô-Lévy Lemma [41]): Let $V(\bar{X}_m(t))$ denote the Lyapunov function of the augmented mean-field stochastic system in (10) such that $V(\bar{X}_m(t)) \in C^2$, $V(0) = 0$, $V(\bar{X}_m(t)) > 0$. For the M-player augmented MFSJD system in (10), the Itô-Lévy formula of $V(\bar{X}_m(t))$ is given as follows:

The following lemma is also necessary for solving the MOP in (15)-(16) for the augmented mean-field stochastic system in (10):

Lemma 2. For any matrices of appropriate dimensions, a positive-definite matrix $P$ and interpolation functions of $(x(t), E\{x(t)\})$ that sum to 1, the following inequality holds:

Based on Lemma 1 with the selection of the Lyapunov function $V(\bar{X}_m(t)) = \bar{X}_m^T(t) \bar{P} \bar{X}_m(t)$ for some positive-definite matrix $\bar{P}$, and on Lemma 2, the MOP in (15)-(16) for the M-player noncooperative minmax H∞ mean-field game strategy of the nonlinear MFSJD system can be solved by the following theorem.
Theorem 2. The M-player noncooperative minmax H2 quadratic game inequality constraints in (16) or (17)

where $\bar{P}$ and $\rho_m$ are the common solution of the following Riccati-like inequalities:

where

Proof. Please refer to Appendix B.
Proposition 1. The MOP in (15), (16) for the M-player noncooperative minmax H∞ mean-field game strategy can be designed by

where $\bar{P}^*$ and $\rho_m^*$ are the solution of the following MOP

Proof. Please refer to Appendix C.
In general, it is still not easy to solve the Riccati-like inequality constraints in (24) due to the coupling of the design variables. For simplicity of design, the Riccati-like inequalities in (24) can be transformed to the following equivalent LMIs by applying the Schur complement [29] several times after pre- and post-multiplying both sides of (24) by $\bar{W} = \bar{P}^{-1}$:

where

where $\bar{P}^* = (\bar{W}^*)^{-1}$ and $\rho_m^*$ are the solution of the following MOP

Proof. The result can be immediately obtained from the above discussion.
Proof. Please refer to Appendix D.
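The Schur complement step mentioned above relies on the standard fact that, for a symmetric block matrix with R > 0, the matrix [[Q, S], [S^T, R]] is positive definite if and only if Q - S R^{-1} S^T is positive definite. The following minimal pure-Python sketch checks this equivalence on two hypothetical numerical instances with a 2x2 block Q, a 2x1 block S and a scalar block R; the numbers and the function names are assumptions for illustration, and positive definiteness is tested by attempting a Cholesky factorization.

```python
def is_positive_definite(mat):
    """Attempt a Cholesky factorization of a symmetric matrix;
    the factorization succeeds exactly when the matrix is positive definite."""
    n = len(mat)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    return False
                low[i][i] = acc ** 0.5
            else:
                low[i][j] = acc / low[j][j]
    return True

def schur_complement(q, s, r):
    """Q - S R^{-1} S^T for a 2x2 block Q, a 2x1 block S and a scalar R > 0."""
    return [[q[i][j] - s[i] * s[j] / r for j in range(2)] for i in range(2)]

def block_matrix(q, s, r):
    """Assemble the symmetric 3x3 matrix [[Q, S], [S^T, R]]."""
    return [q[0] + [s[0]], q[1] + [s[1]], [s[0], s[1], r]]

Q = [[2.0, 0.5], [0.5, 1.0]]     # hypothetical data
R = 2.0
S_OK, S_BAD = [1.0, 0.2], [2.1, 0.2]

results = [(is_positive_definite(block_matrix(Q, s, R)),
            is_positive_definite(schur_complement(Q, s, R)))
           for s in (S_OK, S_BAD)]
```

For both instances the two tests agree, which is what allows a quadratic (Riccati-like) term to be traded for a larger but linear matrix inequality.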

IV. DESIGN PROCEDURE OF MULTI-PLAYER NONCOOPERATIVE NONLINEAR H∞ MEAN-FIELD TARGET TRACKING GAME STRATEGY VIA MOEA
The M-player noncooperative minmax H∞ mean-field target tracking game strategy design problem of the nonlinear MFSJD system is reduced to the problem of solving the LMIs-constrained MOP in (31). At present, the MOEA is a popular searching algorithm for solving MOPs. In detail, the MOEA is a stochastic algorithm inspired by biological evolution, i.e., reproduction, mutation, combination and selection, that searches for the global optimal solutions at the same time without dividing the original problem into several subproblems for parallel searching [8]. With the conventional MOEAs in [31], [32], we need to search for a symmetric $\bar{W} = \bar{P}^{-1} > 0$ to solve the LMIs-constrained MOP in (31) for the M-player noncooperative minmax H∞ mean-field target tracking game strategy $\bar{U}_m(t)$, $m = 1, \cdots, M$, in (29) of the nonlinear MFSJD system. However, it is very difficult to search over all components of $\bar{W} \in \mathbb{R}^{3n \times 3n}$ to approach $\bar{W}^*$ of the MOP in (31) by the conventional MOEAs when n is large. To simplify the solution of the MOP in (31), a reverse-order LMI-constrained MOEA is employed, which searches indirectly over $(\rho_1, \cdots, \rho_m, \cdots, \rho_M)$ instead.
After searching for $(\rho_1^*, \cdots, \rho_m^*, \cdots, \rho_M^*)$ with the MOEA, we can obtain $\bar{W}^*$ from the LMIs in (28) indirectly via the MATLAB LMI Toolbox. This indirect method can significantly simplify the design procedure of the LMIs-constrained MOP in (31) for the M-player noncooperative H∞ mean-field game strategy of the nonlinear MFSJD system when the dimension n of the system state vector becomes large. Consequently, the design procedure of the reverse-order LMI-constrained MOEA for the M-player noncooperative H∞ mean-field game strategy of the nonlinear MFSJD system is proposed as follows:
STEP I: Select the searching region R for $(\rho_1, \cdots, \rho_m, \cdots, \rho_M)$, whose bounds, which represent the lower and upper bounds of the Pareto optimal solution $(\rho_1^*, \cdots, \rho_m^*, \cdots, \rho_M^*)$, are positive numbers. Give the population number $N_p$, iteration number $N_I$, crossover rate $R_c$ and mutation rate $R_m$. Set the initial iteration number $I = 1$.
STEP II: Choose N p feasible individuals as the initial population P I from the searching region R.
STEP III: Use the mutation operator and the crossover operator to generate another $N_p$ individuals, check with the LMIs in (28) whether their corresponding $(\rho_1, \cdots, \rho_m, \cdots, \rho_M)$ are feasible, and add the feasible individuals to the population $P_I$.
STEP IV: Choose $N_p$ elite individuals from the $2N_p$ feasible individuals in $P_I$ generated in STEP III via the nondominated sorting scheme and the crowded comparison method [31], [32]. Set the iteration number $I = I + 1$ and set the chosen population as $P_I$.
STEP V: Repeat STEP III and STEP IV until $I > N_I$, and then set the final population $P_I = T_F$ as the Pareto front and stop the iteration.
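The steps above can be sketched in Python under strong simplifying assumptions. In this sketch, `lmi_feasible` is a HYPOTHETICAL stand-in (the toy constraint ρ1·ρ2 ≥ 1) for the actual LMI feasibility test in (28), which would require a semidefinite programming solver; the elite selection ranks individuals by the number of individuals dominating them instead of the full nondominated sorting and crowded comparison of [31], [32]; infeasible children are simply discarded, so the pool may hold fewer than 2N_p individuals; and all parameter values are illustrative.

```python
import random

def dominates(a, b):
    """Pareto domination for minimization of (rho_1, rho_2)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def lmi_feasible(rho):
    """HYPOTHETICAL stand-in for the LMI feasibility test in (28)."""
    return rho[0] * rho[1] >= 1.0

def sample_feasible(rng, lo, hi):
    """Rejection-sample one feasible individual from the search region."""
    while True:
        rho = tuple(rng.uniform(lo, hi) for _ in range(2))
        if lmi_feasible(rho):
            return rho

def make_child(rng, p, q, lo, hi, r_c, r_m):
    """Arithmetic crossover followed by Gaussian mutation, clipped to R."""
    child = []
    for a, b in zip(p, q):
        gene = 0.5 * (a + b) if rng.random() < r_c else a
        if rng.random() < r_m:
            gene += rng.gauss(0.0, 0.1 * (hi - lo))
        child.append(min(hi, max(lo, gene)))
    return tuple(child)

def select_elite(pool, n_p):
    """Simplified elite selection: fewest dominating individuals first."""
    ranked = sorted(pool, key=lambda v: sum(dominates(w, v) for w in pool))
    return ranked[:n_p]

def reverse_order_moea(n_p=30, n_i=40, r_c=0.8, r_m=0.2,
                       lo=0.1, hi=5.0, seed=1):
    rng = random.Random(seed)                                         # STEP I
    population = [sample_feasible(rng, lo, hi) for _ in range(n_p)]   # STEP II
    for _ in range(n_i):                                              # STEP V
        children = [make_child(rng, rng.choice(population),
                               rng.choice(population), lo, hi, r_c, r_m)
                    for _ in range(n_p)]                              # STEP III
        pool = population + [c for c in children if lmi_feasible(c)]
        population = select_elite(pool, n_p)                          # STEP IV
    return population

final_population = reverse_order_moea()
```

Because only the M-dimensional vector (ρ1, ..., ρM) is evolved, the dimension of the search does not grow with the state dimension n; the matrix variable is recovered afterwards by the feasibility solver for each retained vector.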

Remark 5.
With the weighted sum method in [16], only a single Pareto optimal solution is obtained for each specific set of weighting coefficients. Consequently, to obtain all Pareto optimal solutions, a large number of design conditions, one for each possible combination of weighting coefficients, have to be solved. Instead of using the weighted sum method in [16], the proposed multi-player noncooperative $H_\infty$ game is transformed to an equivalent LMI-constrained MOP in (29)-(31), whose Pareto optimal solutions can all be obtained by the proposed reverse-order LMI-constrained MOEA in a single run, which saves a large amount of computation time.
VOLUME 4, 2016

V. MULTI-PLAYER NONLINEAR COOPERATIVE H∞ MEAN-FIELD TARGET TRACKING GAME STRATEGY DESIGN OF NONLINEAR MFSJD SYSTEM
For the nonlinear MFSJD system with M players in (1), suppose that the players have reached a consensus with one another on a desired common trajectory $x_r(t)$, and that the desired common target is generated by the following reference model:
$$dx_r(t) = (A_r x_r(t) + B_r r_r(t))\,dt \qquad (32)$$
where $r_r(t)$ is the reference input, $A_r$ is a specified asymptotically stable matrix and $B_r$ is the input matrix.
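To illustrate how the reference model in (32) shapes the common target, the following sketch integrates it with a forward-Euler scheme. It uses the values $A_r = -0.8I_3$, $B_r = 0.8I_3$ and $r_r(t) = [5\ 3\ 1]^T$ that appear later in the first design example; the zero initial state, the step size and the function name `simulate_reference` are assumptions made only for this illustration. For a constant reference input, the steady state is $x_r = -A_r^{-1} B_r r_r$, which equals $r_r$ whenever $B_r = -A_r$.

```python
def simulate_reference(a_r, b_r, r_r, x0, dt=0.01, n_steps=1000):
    """Forward-Euler integration of dx_r = (A_r x_r + B_r r_r) dt
    for the diagonal case A_r = a_r * I and B_r = b_r * I."""
    x = list(x0)
    for _ in range(n_steps):
        x = [xi + dt * (a_r * xi + b_r * ri) for xi, ri in zip(x, r_r)]
    return x

r_r = [5.0, 3.0, 1.0]                 # constant reference input
x_final = simulate_reference(-0.8, 0.8, r_r, [0.0, 0.0, 0.0])  # assumed zero initial state
print(x_final)                        # close to r_r after 10 time units
```

With $A_r = -0.8I_3$ the tracking error of the reference model decays roughly like $e^{-0.8t}$, so the choice of $A_r$ sets how quickly the common target trajectory settles.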
In the M-player mean-field cooperative game, the M players cooperate with one another, i.e., $u(t) = [u_1^T(t), \cdots, u_m^T(t), \cdots, u_M^T(t)]^T = \tilde{u}(t) + E\{u(t)\}$, to minimize the mean tracking error $E\{x(t)\} - x_r(t)$ and the deviation $\tilde{x}(t) = x(t) - E\{x(t)\}$ despite the external disturbance $v(t)$, the continuous and discontinuous intrinsic random fluctuations and any reference input $r_r(t)$. Therefore, the M-player cooperative minmax $H_\infty$ mean-field target tracking game of the nonlinear MFSJD system in (1) is formulated as in (33), where the corresponding weighting matrices are positive semidefinite ($\geq 0$), $\rho^*$ denotes the performance of the cooperative minmax $H_\infty$ mean-field target tracking game, and $\tilde{x}^T(0) P_{1,c} \tilde{x}(0)$ and $(E\{x(0)\} - x_r(0))^T P_{2,c} (E\{x(0)\} - x_r(0))$ are the effects of the initial conditions of the variation and the tracking error, respectively, which are to be eliminated, with positive definite matrices $P_{1,c}$ and $P_{2,c}$. The physical meaning of the cooperative $H_\infty$ mean-field common target tracking game in (33) is that the M players collaborate with one another to track $x_r(t)$ optimally, with less control effort $\tilde{u}(t)$ and $E\{u(t)\}$ and with minimum random variation, under the worst-case effect of the external disturbance $v(t)$ and any reference input $r_r(t)$.
To simplify the design, an augmented state vector is introduced and the corresponding augmented dynamic system is given in (34), with the detailed system matrices given as in (10). Then, by using (34), the M-player cooperative minmax $H_\infty$ mean-field target tracking game strategy in (33) for the mean-field nonlinear stochastic system in (1) is transformed to the minmax $H_\infty$ mean-field stabilization strategy in (35) of the augmented mean-field stochastic system in (34).
However, it is not easy to directly solve the M-player cooperative minmax $H_\infty$ mean-field stabilization game in (35) for the augmented nonlinear mean-field system in (34). Using a similar indirect approach, the M-player cooperative minmax $H_\infty$ mean-field stabilization game strategy in (35) can be solved by minimizing its upper bound in (36). That is, instead of solving the cooperative minmax $H_\infty$ mean-field stabilization game strategy in (35) directly, we solve the suboptimal minmax $H_\infty$ mean-field stabilization problem by minimizing the upper bound in (36), which leads to the SOP in (37) and (38).
Lemma 3. The M-player cooperative minmax $H_\infty$ mean-field stabilization strategy in (33) is equivalent to the SOP in (37) and (38).
Proof. It is a special case of MOP of Theorem 1. Hence the proof is omitted.
The cooperative minmax $H_\infty$ mean-field stabilization game constraint in (38) is equivalent to the constrained minmax $H_\infty$ quadratic mean-field stabilization game constraint in (39) and (40). Following the two-step method in (17)-(19), the constrained minmax Nash cooperative $H_2$ quadratic stabilization game constraint problem in (39) is solved in two steps, the second of which is a constraint problem. Consequently, we get the following main theorem for the M-player cooperative $H_\infty$ mean-field common target tracking game strategy of the nonlinear MFSJD system in (1).
Theorem 3. The M-player cooperative $H_\infty$ mean-field target tracking game strategy in (33) of the nonlinear MFSJD system in (3) can be solved by (43) and (44), where $P_c^*$ and $\rho^*$ are the solution of the SOP in (45) and (46).
The derivation is similar to that of Theorem 2.
From Theorem 3, $\bar{U}^*(t)$ and $\bar{V}^*(t)$ in (43)-(44) are the solution of the M-player cooperative $H_\infty$ mean-field target tracking game strategy in (37) for the nonlinear MFSJD system. By a technique similar to the one used for (28), the Riccati-like constrained SOP in (45), (46) can be transformed to the LMIs-constrained SOP in (47) and (48). By solving (47) and (48), the M-player cooperative minmax $H_\infty$ mean-field common target tracking game strategy in (33) of the nonlinear MFSJD system in (1) can be obtained as in (43).
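Since the cooperative design minimizes a single scalar $\rho$ subject to LMIs, one generic way to approximate $\rho^*$ is bisection on $\rho$ with an LMI feasibility test at each trial value. The sketch below shows only that outer loop. It is an assumption-laden illustration, not the procedure of the paper or of any LMI toolbox: it presumes that feasibility is monotone in $\rho$ (feasible for every larger value once feasible), and `toy_feasible` is a hypothetical stand-in for the solver call, with its threshold set to the value $\rho^* = 2.92$ reported in the first design example purely so that the loop has something to find.

```python
def min_feasible_rho(feasible, lo, hi, tol=1e-3):
    """Bisection for the smallest feasible rho in [lo, hi],
    assuming feasibility is monotone in rho."""
    if not feasible(hi):
        raise ValueError("upper bound is infeasible; enlarge the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid      # mid works, so the optimum is at or below mid
        else:
            lo = mid      # mid fails, so the optimum is above mid
    return hi             # hi is always a feasible value

def toy_feasible(rho):
    """HYPOTHETICAL stand-in for an LMI feasibility check at a fixed rho."""
    return rho >= 2.92

rho_star = min_feasible_rho(toy_feasible, 0.0, 10.0)
print(round(rho_star, 3))
```

When $\rho$ enters the LMIs linearly, a semidefinite programming solver can minimize it directly and the bisection loop is unnecessary; the loop is shown only because it makes the single-objective structure explicit.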

VI. DESIGN EXAMPLE
Recently, nonlinear stochastic mean-field system theory has become a prominent topic in financial investment strategies and financial contagion problems due to the global impact of financial crises [33]-[40]. Since individuals may consider the effect of collective behaviors as mean-field (average) terms arising from the mutual interactions of all individuals in a stochastic financial system, stochastic financial systems become nonlinear stochastic MFSJD systems. In this section, two simulation examples of the multi-player noncooperative and cooperative $H_\infty$ mean-field target tracking game strategies in nonlinear MFSJD financial systems are given to illustrate the design procedure and to compare their target tracking performance.

A. EXAMPLE 1: FINANCIAL INDICES SYSTEM

1) Model Construction
In this simulation, a financial system describing the interaction of three indices, (i) the interest rate, (ii) the investment demand and (iii) the price index, is used as a design example to validate the proposed multi-player cooperative and noncooperative $H_\infty$ mean-field game strategies. To account for the continuous and discontinuous fluctuations caused by global events, the nonlinear financial system is revised as in (49) [24], where $x_1(t)$, $x_2(t)$ and $x_3(t)$ denote the interest rate, the investment demand and the price index, respectively; the parameters $a = 1.2$, $b = 0.5$ and $c = 0.3$ are the saving amount, the per-investment cost and the elasticity of demand of commercial markets, respectively; and $u_1(t) = [u_{11}(t)\ u_{12}(t)\ u_{13}(t)]^T$ denotes the investment strategy of the government.
Fig. 1 shows the state responses of the strategy-free nonlinear mean-field stochastic jump diffusion system and the corresponding mean system. In Fig. 1(a), the three state variables suffer from continuous fluctuations influenced by floating financial factors (e.g., an uncertain saving rate) and discontinuous fluctuations caused by global emergent events (e.g., an oil crisis). Besides, the mean trajectories in Fig. 1(b) show that the three financial indices interact with one another. For example, a low investment demand leads to a low price index.

2) Noncooperative Game Strategy Design for Financial System
For the noncooperative game strategy design, three investors want to regulate the financial system according to their desired mean targets in (50), with their weighting matrices specified in (52).
Remark 6. The desired target $r_1(t) = [4\ 5\ 2]^T$ of the government reveals that the government aims to increase all three financial indices to improve financial activity. Besides, for the desired target $r_2(t) = [1\ 3\ -2]^T$ of the bank consortium, the desired interest rate and the desired investment demand are positive while the desired price index is negative. If the price index can be regulated to a negative value, people (i.e., the public) are more inclined to buy goods and to apply for loans from banks, which benefits the bank consortium. The desired target $r_3(t) = [2\ 2\ 0]^T$ of the public reflects the view that a positive interest rate and a positive investment demand can stimulate the market economy. Also, the desired price index of the public shows that the public aims to maintain the current price index with zero fluctuation.
To utilize the global linearization technique in (3), 16 vertices are chosen and the nonlinear mean-field jump diffusion financial system in (49) can be written as an interpolation of the local linearized MFSJD systems at these vertices. By solving the LMI-constrained MOP in (31) with the proposed reverse-order LMI-constrained MOEA for the noncooperative mean-field game strategy design, the corresponding Pareto front is obtained, as illustrated in Fig. 2. Each red point in the figure denotes a Pareto optimal solution. In this simulation example, we select the labeled point (3, 3, 3) in Fig. 2 as the desired solution, with the corresponding positive matrix $W^*$, since it has the minimum Euclidean norm among all solutions. In this case, each investor achieves a balanced game performance in (15) during the tracking process.
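The minimum-Euclidean-norm rule used to pick one design from the Pareto front can be sketched in a few lines. Only the point (3, 3, 3) comes from the text; the other two candidates below are hypothetical, mutually nondominated points added so that the selection has something to compare against, and the helper name `min_norm_point` is likewise an assumption.

```python
import math

def min_norm_point(points):
    """Return the point with the smallest Euclidean norm."""
    return min(points, key=lambda p: math.sqrt(sum(c * c for c in p)))

# (3, 3, 3) is the labeled point in Fig. 2; the other two are hypothetical.
pareto_points = [(3.0, 3.0, 3.0), (2.5, 3.5, 4.0), (4.0, 2.8, 3.1)]
chosen = min_norm_point(pareto_points)
print(chosen)
```

This rule favors a balanced compromise among the players: a point that is very good for one player but poor for the others has a large norm and is not selected.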
The state and mean trajectories of the financial system controlled by the three noncooperative investment strategies are shown in Fig. 3. The steady mean state of the financial system is regulated to $[3.3\ 2.7\ -0.6]^T$. Because of the conflicting desired targets in (50), the three noncooperative game strategies applied to the financial system interfere with one another, and thus the system mean states reach a compromise rather than the desired target of any single investor. However, according to the investors' weighting matrices in (52), the first investor pays more attention to the tracking of the mean interest rate (i.e., the first mean state) than the other two investors. For example, the mean state weighting of the first investor on the first mean state is larger than those of the others, and the control weighting of the first investor on the first control variable is lower than those of the others. In this situation, the first investor uses more control effort to track the mean interest rate. Similarly, the second investor pays more attention, with more control effort, to the tracking of the mean investment demand (i.e., the second mean state) than the other two investors, and the third investor pays more attention, with more control effort, to the tracking of the mean price index (i.e., the third mean state) than the other two investors. As a result, at the steady state, the first mean state of the financial system is closer to the desired target of the first investor, the second mean state is closer to the desired target of the second investor and the third mean state is closer to the desired target of the third investor.

3) Cooperative Game Strategy Design for Financial System
Unlike in the noncooperative $H_\infty$ mean-field target tracking game strategy design, the three investors here have compromised with one another on a common mean target $r_r(t) = [5\ 3\ 1]^T$, and the common reference tracking model for the three players is given as
$$dx_r(t) = (A_r x_r(t) + B_r r_r(t))\,dt \qquad (55)$$
with $A_r = -0.8I_3$ and $B_r = 0.8I_3$. In this case, the three investors aim to regulate the three financial indices according to their common target, and the variation, mean state and control weighting matrices are chosen for this purpose. By solving the LMI-constrained SOP in (47) and (48) via the MATLAB LMI Toolbox, the positive matrix $W_c^*$ is obtained with $\rho^* = 2.92$. The simulation result of the cooperative $H_\infty$ mean-field target tracking game strategy design for the financial system is shown in Fig. 4. Since the desired tracking target is common to the three investors, the three mean states of the financial system are successfully regulated to $r_r(t) = [5\ 3\ 1]^T$, i.e., $E\{x(t)\} = r_r(t)$ at the steady state. Moreover, Fig. 4 shows that the effect of intrinsic random fluctuations and external disturbances is efficiently attenuated during the investment process by the cooperative $H_\infty$ mean-field target tracking game strategy. Although the three investors can easily achieve their common mean target with the proposed cooperative $H_\infty$ mean-field game strategy, reaching a compromise on that common target is a long, complicated and time-consuming process for them.

B. EXAMPLE 2: CROSS-BORDER CAPITAL FLOW SYSTEM

1) Model Construction
Consider the financial contagion problem caused by the global impact of a financial crisis. When the impact of a financial shock is detected, the net capital flow in the source country of financial turbulence is found to decline intensely in response to the driving force, whereas in the volatility-affected country the net capital flow diverges from its normal equilibrium value in response to the infectious effect from the source country of financial turbulence [40]. A nonlinear MFSJD system of the international capital flow volatility between the source country of financial turbulence and the volatility-affected country, under the control strategies of the source country of financial turbulence $u_1(t)$, the volatility-affected country $u_2(t)$ and the international consortium $u_3(t)$, is given as follows [40]:
$$
\begin{aligned}
dx_1(t) &= \cdots 1c_{sv}x_1(t)\,dw(t) + 0.1c_{sv}x_1(t)\,dp(t) \\
dx_2(t) &= [\,b + c_{vs}x_1(t)x_2^2(t) + 0.4E\{x_1(t)\}E\{x_2^2(t)\} + 0.7u_{12}(t) + u_{22}(t) + 0.4u_{32}(t) + v_2(t)\,]\,dt \\
&\quad + 0.1c_{vs}x_2(t)\,dw(t) + 0.1c_{vs}x_2(t)\,dp(t) \qquad (57) \\
x(0) &= [3.\ \cdots
\end{aligned}
$$
where $x(t) = [x_1(t)\ x_2(t)]^T$ consists of the historical trend of the net capital flow for the source country of financial turbulence $x_1(t)$ and the historical trend of the net capital flow for the volatility-affected country $x_2(t)$, $u_1(t) = [u_{11}(t)\ u_{12}(t)]^T$ denotes the control strategy of the source country of financial turbulence, $u_2(t) = [u_{21}(t)\ u_{22}(t)]^T$ denotes the control strategy of the volatility-affected country, $u_3(t) = [u_{31}(t)\ u_{32}(t)]^T$ denotes the control strategy of the international consortium, $a = 1.5$ is the inertial coefficient of the volatility-affected country, $b = 0.5$ denotes the inertial coefficient of the source country of financial turbulence, $c_{sv} = 1$ is the coupling coefficient for the impact from the source country of financial turbulence on the volatility-affected country, $c_{vs} = 1$ is the coupling coefficient for the impact from the volatility-affected country on the source country of financial turbulence, $v(t) = [v_1(t)\ v_2(t)]^T$ is the external disturbance with $v_1(t) = 0.1\cos 0.5t$ and $v_2(t) = -0.1\sin 0.5t$, $w(t)$ denotes a 1-D standard Wiener process and $p(t)$ is a Poisson counting process with jump intensity $\lambda = 0.5$. The strategy-free response of the nonlinear mean-field jump diffusion capital flow system in (57) is shown in Fig. 5. Without any control strategy, the net capital flow of the source country of financial turbulence quickly approaches the steady state 0.1 with a small periodic oscillation, while the net capital flow of the volatility-affected country oscillates around 5.5 with a large amplitude of 0.5. This shows that the net capital flow in the source country of financial turbulence decreases quickly to avoid a large financial disaster. Due to the financial contagion from the source country of financial turbulence to the volatility-affected country, the volatility-affected country still maintains a large net capital flow, which may cause severe financial disasters.
However, the low net capital flow in the source country of financial turbulence also reduces its commercial activities and investment demand.
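To show how the ingredients of such a model (a mean-field term, a Wiener term and a Poisson jump term) can be simulated, the following sketch applies an Euler-Maruyama scheme to a scalar toy system and estimates $E\{x(t)\}$ by averaging over many sample paths. It is not the capital flow system in (57): the drift $-1.5x + 0.4E\{x\} + v(t)$, the initial state, the number of paths and the step size are all hypothetical choices, and only the disturbance $0.1\cos 0.5t$, the noise coefficients $0.1x$ and the jump intensity $\lambda = 0.5$ are borrowed from the text.

```python
import math
import random

def poisson_increment(rng, mean):
    """Poisson sample by Knuth's method (adequate for the small means used here)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_mean_field(n_paths=200, dt=0.01, n_steps=1000, lam=0.5, seed=1):
    """Euler-Maruyama for the HYPOTHETICAL scalar mean-field jump diffusion
    dx = (-1.5 x + 0.4 E{x} + v(t)) dt + 0.1 x dw + 0.1 x dp."""
    rng = random.Random(seed)
    x = [1.0] * n_paths                      # assumed initial state
    mean_traj = []
    for step in range(n_steps):
        t = step * dt
        mean_x = sum(x) / n_paths            # sample-average estimate of E{x(t)}
        mean_traj.append(mean_x)
        v = 0.1 * math.cos(0.5 * t)          # external disturbance
        nxt = []
        for xi in x:
            dw = rng.gauss(0.0, math.sqrt(dt))       # Wiener increment
            dp = poisson_increment(rng, lam * dt)    # Poisson counting increment
            drift = -1.5 * xi + 0.4 * mean_x + v
            nxt.append(xi + drift * dt + 0.1 * xi * dw + 0.1 * xi * dp)
        x = nxt
    return mean_traj, x

mean_traj, final_states = simulate_mean_field()
print(mean_traj[0], mean_traj[-1])
```

Each path is coupled to the others only through the sample average, which is the usual particle approximation of the mean-field term; more paths give a smoother estimate of the mean trajectory at a proportionally higher cost.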

2) Noncooperative Game Strategy Design for Cross-Border Capital Flow System
For the noncooperative game strategy design, three players (two country governments and an international consortium) want to regulate the capital flow system according to their own desired mean targets, each with its own weighting matrices. To utilize the global linearization technique in (2), 16 vertices are chosen and the nonlinear MFSJD capital flow system in (57) can be written as the interpolated system in (61), with interpolation functions defined over the set of vertices. Before solving the LMI-constrained MOP in (31), the detailed parameters for the MOEA are given as $R = [1, 3] \times [1, 4] \times [1, 5]$, $N_p = 300$, $N_I = 100$, $R_c = 0.8$ and $R_m = 0.2$. Then, by solving the LMI-constrained MOP in (31) with the proposed LMI-constrained MOEA design procedure for the noncooperative game strategy, the corresponding Pareto front is obtained, as illustrated in Fig. 6. In this simulation example, we select the labeled point (2.1, 1.5, 4.1) in Fig. 6 as the solution, with the corresponding positive matrix $W^*$, since it has the minimum Euclidean norm among all solutions.
The state and mean state responses of the nonlinear MFSJD capital flow system are shown in Fig. 7. According to the weighting matrices of the three players, the first player pays more attention to the tracking of the net capital flow for the source country of financial turbulence, the second player pays more attention to the tracking of the net capital flow for the volatility-affected country and the third player gives equal consideration to the tracking of the two countries' net capital inflows. As a result, the net capital inflow for the source country of financial turbulence is closer to the desired target of the first player and the net capital inflow for the volatility-affected country is closer to the desired target of the second player.

3) Cooperative Game Strategy Design for Cross-Border Capital Flow System
For the cooperative game strategy design, the three players have compromised with one another on a common mean target, and the reference tracking model for the common target of the three players is given as
$$dx_r(t) = (A_r x_r(t) + B_r r_r(t))\,dt$$
with $A_r = -0.8I_2$ and $B_r = 0.8I_2$. In general, the mean-field state $E\{x_1(t)\}$ of the source country of financial turbulence is larger than the mean-field state $E\{x_2(t)\}$ of the volatility-affected country. Therefore, the three players have compromised on the common steady-state mean-field target $r_r(t) = [4\ 1]^T$ to achieve a relatively large net mean capital flow in the source country of financial turbulence with an acceptable net mean capital inflow in the volatility-affected country. Furthermore, the state weighting matrices and control weighting matrices are chosen for this purpose. By solving the LMI-constrained SOP in (47) via the MATLAB LMI Toolbox, the positive matrix $W_c^*$ is obtained with $\rho^* = 3.2$. The simulation result of the cooperative game strategy design for the nonlinear MFSJD capital flow system is shown in Fig. 8. Since the desired tracking target is common to the three players, the two capital flows of the source country of financial turbulence and the volatility-affected country are successfully regulated to $r_r(t) = [4\ 1]^T$, i.e., $E\{x(t)\} = r_r(t)$ at the steady state. Moreover, the effect of internal continuous and discontinuous random fluctuations as well as external disturbances on the common target tracking is effectively reduced by the proposed cooperative $H_\infty$ mean-field game tracking strategy. However, negotiating a common target for the three-player cooperative mean-field game among two governments and one international consortium is generally a complicated and time-consuming process.

VII. CONCLUSION
In this study, the multi-player noncooperative $H_\infty$ mean-field game tracking strategy design with conflicting desired targets and the cooperative $H_\infty$ mean-field game tracking strategy design with a common desired target are investigated for the nonlinear MFSJD system under external disturbance. Unlike in conventional game strategy designs for nonlinear stochastic systems, the players not only track their desired targets but also attenuate the random variation between the state and the mean state. As a result, a novel $H_\infty$ noncooperative mean-field game design performance and a novel $H_\infty$ cooperative mean-field game design performance are introduced. To avoid solving the corresponding nonlinear partial differential HJII during the design of the two game strategies, the nonlinear MFSJD system is interpolated by a set of local linearized MFSJD systems using the global linearization method. In the noncooperative case, the $H_\infty$ mean-field target tracking game strategy design of the nonlinear MFSJD system is transformed to an equivalent LMIs-constrained MOP, which can be solved efficiently via the proposed reverse-order LMI-constrained MOEA. Further, we also prove that the Pareto optimal solutions of the LMIs-constrained MOP are Nash equilibrium solutions of the nonlinear noncooperative $H_\infty$ mean-field target tracking game. On the other hand, the cooperative $H_\infty$ mean-field game tracking strategy design of the nonlinear MFSJD system reduces to solving an LMIs-constrained SOP. Two stochastic financial mean-field systems are provided as design examples to illustrate the design procedure and to compare the target tracking performance of the noncooperative and cooperative $H_\infty$ mean-field target tracking game strategies. In the future, as the number of plants grows, this kind of large-scale system can be further reformulated as a noncooperative mean-field system.
As a result, the developed noncooperative mean-field game strategy can be applied to various practical mean-field systems. On the other hand, to improve the efficiency of searching for Nash equilibrium solutions with a large number of noncooperative players, the proposed reverse-order MOEA should be further improved.