Pareto-Improving Incentive Mechanism for Noncooperative Dynamical Systems Under Sustainable Budget Constraint

A Pareto-improving incentive mechanism is developed that improves the weighted social welfare and achieves continual Pareto improvement for a pseudogradient-based noncooperative dynamical system. In the proposed approach, the system manager remodels the agents' dynamical decision-making by collecting taxes from some agents and giving some of the collected taxes to other agents as subsidies under a sustainable budget constraint. Sufficient conditions are derived under which the agents' state converges toward the socially maximum state associated with a weighted social welfare function that depends on the priority ratio of the agents and the initial state. We present several numerical examples to illustrate the efficacy of our results and to reveal that potentialization of the payoff structure is closely related to generating Pareto-improving system trajectories.


Yuyue Yan, Member, IEEE, and Tomohisa Hayakawa, Member, IEEE

I. INTRODUCTION
For the coming smart society, the coordination issues between individual interests and social interests have become extremely important. To study these coordination issues, game theory provides one of the mathematical disciplines concerning strategic interaction among selfish decision-makers [1], [2]. Reflecting the fact that each agent in a noncooperative system aims to increase (or even maximize) its own payoff by selecting rational strategies [3], [4], various dynamics are considered in the literature for capturing the fact that the agents mutually affect the behavior of the other agents through the interconnected relations of their payoff structure [5], [6], [7], [8], [9]. For example, the selfish agents' dynamic behaviors are modeled by the pseudogradient dynamics (or so-called better response dynamics) in [6], [9], [10], and [11].

The authors are with the Department of Systems and Control Engineering, Tokyo Institute of Technology, Tokyo 152-8552, Japan (e-mail: yan.y.ac@m.titech.ac.jp; hayakawa@sc.e.titech.ac.jp).
Digital Object Identifier 10.1109/TAC.2023.3325412

It is well known that the agents' selfish behavior in noncooperative systems may result in degradation of social utilities [12], [13], e.g., the tragedy of the commons [14]. Therefore, explicit incentive mechanisms have been constructed in the literature in order to improve the performance of the entire system [11], [15], [16], [17], [18], [19]. For example, Alpcan et al. [15] considered a pricing mechanism to achieve the highest social welfare with selfish agents under pseudogradient dynamics. Sandholm [20] proposed a variable pricing scheme to modify the congestion game of a roadway network depending on network utilization. Yan and Hayakawa [11] studied how to design a zero-sum compensation approach to stabilize possibly unstable Nash equilibria under uncertain sensitivity parameters. In those works, by assuming the existence of a system manager who knows complete information about the agents and is able to drive the state to a desired state, the constructed incentive mechanisms are often designed as coercion policies from which the agents cannot escape once the mechanisms are in place. However, since the agents may have the freedom to break away from the mechanism when they come across undesired situations (e.g., when their payoffs decrease after the mechanism is executed), it is important to develop incentive mechanisms that enhance the payoff values of all the agents while at the same time guaranteeing Pareto improvements [21] under the imposed incentives.
During the design of an incentive mechanism for noncooperative systems, it is essential to ensure that the desired state is Pareto efficient [22], [23], [24]. Pareto-efficient states capture the strategy profiles at which no individual agent can be better off without making the others worse off by deviating from the characterized state [25], [26], [27], so that there is no room for further Pareto improvement. If the Nash equilibrium of the noncooperative system is not Pareto efficient, then there is still some room to increase the payoffs of some of the agents without decreasing any other agent's payoff [28], [29]. Therefore, to establish Pareto efficiency of the overall system, making the Nash equilibrium Pareto efficient may be required from the perspective of the system manager [30], [31].
To understand to what extent the limits on the magnitude of positive and negative incentives affect the performance of a noncooperative system, Ferguson et al. [32] compared the effectiveness of taxes and subsidies in influencing the agents' behavior in a congestion game and concluded that subsidies may provide better performance than taxes under similar budgetary constraints. Even though subsidy policy is one of the useful financial methods for encouraging economic individuals to follow the government's social instruction, it is shown in [33] and [34] that the government may face financial problems from a long-term perspective, especially when the system manager does not have a sustainable budget to provide the subsidies. In order to reduce the financial pressure on the government, the system manager may have to serve as a mediator (or a tax collector) in the noncooperative system to ensure that every subsidy is financed by taxes taken from the others. In such a case, the incentive mechanism works in a sustainable manner since the system manager must comply with the budget. For example, Fang et al. [34] constructed balanced subsidy and taxation policies to improve the economic efficiency of electric vehicle charging infrastructures. Riehl et al. [35] investigated efficient incentive-based control algorithms for heterogeneous decision-makers in repeated matrix games with a budget constraint and revealed the transitional behavior of the network games under the provided payoff incentives. Fotakis and Spirakis [36] considered cost-balancing tolls (also known as taxes or prices) for atomic network congestion games, where the tolls paid inside the transportation network are feasibly refunded to the agents and hence do not influence the social costs, whereas Li et al. [37] designed a budget-balancing incentive framework promoting cooperation across devices in broadcast device-to-device systems. The balanced budget constraints in those works are called strong budget balance in mechanism design problems, which requires lossless monetary transfer and may be hard to achieve simultaneously with perfect efficiency [38]. In the literature, sustainable budget constraints often adopt a weaker notion than strong budget balance, which only requires the system manager not to inject additional money into the system [39]. How to simultaneously achieve continual Pareto improvements and Pareto efficiency with a sustainable budget is still unclear.
Depending on the specific goal of the government, the government in real society usually gives more preferential treatment to some of the companies/individuals when the performance of those companies/individuals is crucial in achieving the government's goal. For example, tackling extreme poverty has been set as an essential policy goal by developing countries, and hence their governments are likely to provide more resources (e.g., job opportunities or common resources) to the poorer people than to the others for enhancing the poor people's lives. Another example is that industry-oriented countries have given more preferential treatment to new energy vehicle (NEV) companies to improve international competitiveness under the challenge of the global climate issue [34]. Therefore, while designing the incentive mechanism, the system manager may evaluate the priority among the agents for constructing a social welfare function [40].
In this article, we develop an explicit incentive mechanism for noncooperative systems to remodel the agents' dynamical decision-making so as to guarantee that all the agents are Pareto improving and their state converges to a Pareto-efficient Nash equilibrium. Specifically, we suppose that the system manager collects taxes from some agents and gives some of the collected taxes to other agents as subsidies with a sustainable budget constraint.
Considering the priorities among the agents, we construct a weighted social welfare function for the incentive mechanism and hence derive the socially maximum state as the target Nash equilibrium. With the well-designed incentive functions associated with the weighted social welfare function, the socially maximum state is ensured to be a Pareto-efficient Nash equilibrium in the incentivized noncooperative system. Several sufficient stability conditions are presented to guarantee that the agents are Pareto improving under the pseudogradient dynamics and that their state converges to the socially maximum state with known or unknown sensitivity parameters. As a result, it turns out that the initial state plays an important role in constructing the Pareto-improving incentive mechanism under the sustainable budget constraint. In the case of equal priority between the agents, a balanced budget constraint is guaranteed.
Different from the preliminary version [41], this article additionally provides more illustrative and important numerical examples, detailed proofs, and an elaborated literature review. Furthermore, more discussions are added in Section IV, where we discuss the Pareto-improving incentive mechanism and the connection between Pareto improvement and potentialization with equal priority. Our numerical examples exhibit direct evidence that Pareto improvement and potentialization do not have an inclusive relation with each other.
The rest of this article is organized as follows. We explain the incentivized noncooperative system and introduce the problem of this article in Section II. In Section III, we design the incentive mechanisms to achieve Pareto-improving trajectories and a Pareto-efficient state with arbitrary priorities for the agents under the sustainable budget constraint for a given initial state. In Section IV, we specialize the results to the case where the priorities of the agents are all the same. Several numerical examples are shown in those two sections. Finally, Section V concludes this article.
Notations: We use fairly standard notation in this article. Specifically, we write R for the set of real numbers, R^n for the set of n × 1 real column vectors, and R^{n×m} for the set of n × m real matrices. We denote the set of positive real numbers by R_+. Moreover, (·)^T denotes transpose, null(·) and spec(·) denote the null space and the spectrum of a matrix, respectively, and diag[α] denotes a diagonal matrix with (i, i)-entry given by α_i for the vector α ∈ R^n. Finally, ∇f(·) denotes the gradient of the function f(·), and 1_N and I_N denote the ones vector and the identity matrix of dimension N, respectively.

A. System Description
Consider a noncooperative system with n agents adjusting their states (strategies) in the unbounded state space R^n. Let N ≜ {1, . . ., n} denote the set of all agents. The payoff function of agent i is denoted by J_i : R^n → R : x ↦ J_i(x) and the profile of all agents' states is denoted by x = [x_1, . . ., x_n]^T ∈ R^n, where x_i ∈ R is agent i's individual state. We assume that there is a system manager who imposes an incentive mechanism on the agents to reconstruct the agents' payoff functions, and hence alters the agents' decisions for improving the welfare of the entire system. (The precise definition of the welfare of the entire system is given as the weighted social welfare function in Section III considering the priority of the agents.) Specifically, let the agents' incentivized payoff functions be given by

J̃_i(x) = J_i(x) + p_i(x), i ∈ N (1)

where p_i : R^n → R is the incentive function for agent i ∈ N. We denote the incentivized noncooperative system by G(J̃) and the original (un-incentivized) noncooperative system by G(J) with J̃ ≜ {J̃_i}_{i∈N} and J ≜ {J_i}_{i∈N}. In order to establish the pseudogradient dynamics for the agents, we assume that the payoff functions J_i(x), i ∈ N, and the incentive functions p_i(x), i ∈ N, are continuously differentiable. The key question in mechanism design problems is how to properly design the incentive functions satisfying certain requirements to alter the decisions of the agents. It is worth noting that at a Nash equilibrium (defined in Definition 1 below), no agent has any intention to deviate unilaterally from the equilibrium state. Therefore, the Nash equilibrium often serves as an operating point in noncooperative systems. Furthermore, we note that Pareto efficiency is an important notion in economics for indicating the efficiency of a society. For the convenience of readers, the notions of a Nash equilibrium and a Pareto-efficient state are given as follows.
Definition 1: For the incentivized noncooperative system G(J̃), the state profile x^* ∈ R^n is called a Nash equilibrium if

J̃_i(x_i^*, x_{−i}^*) ≥ J̃_i(x_i, x_{−i}^*), x_i ∈ R, i ∈ N (2)

where x_{−i} denotes the state profile of all the agents other than agent i.

Definition 2: For the incentivized noncooperative system G(J̃), the state profile x^* ∈ R^n is Pareto efficient (optimal) if there is no other state x ∈ R^n such that J̃_i(x) ≥ J̃_i(x^*) for all i ∈ N with strict inequality for some i ∈ N.
Note that the state profile x^* ∈ R^n that maximizes the function Σ_{i∈N} J̃_i(x) is always Pareto efficient in G(J̃) because no agent can further increase J̃_i(x) from x^* without decreasing the others' payoffs. Furthermore, since J_i(x), i ∈ N, and p_i(x), i ∈ N, are continuously differentiable on the unbounded state space R^n, the Nash equilibrium x^* satisfies

∂J̃_i(x^*)/∂x_i = 0, i ∈ N. (3)

In general, the Nash equilibrium x^* of the original noncooperative system G(J) is not Pareto efficient, and hence there may still be some room to improve the payoffs of some of the agents without decreasing any of the other agents' payoffs (i.e., there is room to further increase the performance of the entire system). Therefore, from the perspective of the system manager, it is natural to consider the situation where the system manager wishes to establish Pareto efficiency at the target (desired) Nash equilibrium x^* of the incentivized noncooperative system G(J̃).
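The observation that a maximizer of the payoff sum is Pareto efficient can be checked numerically. The sketch below uses hypothetical two-agent concave payoffs (not any example from this article): it computes the exact maximizer of J_1 + J_2 from the stationarity conditions and confirms by random sampling that no state Pareto-dominates it.

```python
import numpy as np

# Hypothetical concave payoffs (illustration only, not the paper's example).
def J(x):
    return np.array([-(x[0] - 1.0)**2 - 0.5 * x[1]**2 + x[0] * x[1],
                     -(x[1] - 2.0)**2 - 0.5 * x[0]**2 + x[0] * x[1]])

# Exact maximizer of J_1 + J_2: the sum has Hessian [[-3, 2], [2, -3]] < 0,
# and stationarity gives -3*x0 + 2*x1 + 2 = 0 and 2*x0 - 3*x1 + 4 = 0.
x_star = np.linalg.solve(np.array([[-3.0, 2.0], [2.0, -3.0]]),
                         np.array([-2.0, -4.0]))
base = J(x_star)

# A state Pareto-dominating x_star would have a strictly larger payoff sum,
# contradicting optimality, so no sampled state can dominate it.
rng = np.random.default_rng(0)
samples = rng.uniform(-5.0, 5.0, size=(20000, 2))
dominating = sum(bool(np.all(J(x) >= base + 1e-9)) for x in samples)
print(x_star, dominating)
```

Since x^* exactly maximizes the sum, the count of dominating samples is zero by construction, which mirrors the argument in the text.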

B. Myopic Pseudogradient Dynamics
In the literature, each agent is often associated with some decision dynamics in order to dynamically change its state in accordance with the other agents' decisions. Recalling that a Nash equilibrium is an equilibrium state at which no agent has any intention to deviate unilaterally, we note that each of the Nash equilibria of the noncooperative system G(J̃) is often characterized as an equilibrium of the agents' underlying decision dynamics. Among the dynamics that are considered in the context of noncooperative systems, the pseudogradient dynamics are the typical continuous-time dynamics capturing the fact that the agents selfishly concern themselves with their own payoffs and myopically change their states according to the current information without any foresight on the future states of the other agents [8], [42], [43], [44], [45]. In this article, we suppose that all the selfish agents continually change their states under the pseudogradient dynamics given by

ẋ_i(t) = α_i ∂J̃_i(x(t))/∂x_i, x_i(0) = x_{i0}, i ∈ N (4)

where α_1, . . ., α_n > 0 denote the agent-dependent sensitivity parameters [11], [43], [46], and hence the Nash equilibrium x^* of G(J̃) is an equilibrium of (4) because of (3). For the statement of the following results in this article, we let α ≜ (α_1, . . ., α_n).
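As a minimal illustration of the dynamics (4), the following sketch (the payoff functions, sensitivity parameters, and step size are hypothetical, not those of the examples below) integrates the pseudogradient flow with forward Euler and checks that the state settles where both partial derivatives vanish, i.e., at a Nash equilibrium.

```python
import numpy as np

# Hypothetical two-agent payoffs (illustrative only):
# J_i(x) = -(x_i - c_i)^2 + k_i * x_i * x_j, so that
# dJ_i/dx_i = -2*(x_i - c_i) + k_i * x_j.
c = np.array([1.0, 2.0])
k = np.array([0.3, 0.2])

def grad_Ji(i, x):
    """Partial derivative of agent i's payoff with respect to x_i."""
    j = 1 - i
    return -2.0 * (x[i] - c[i]) + k[i] * x[j]

def pseudogradient_step(x, alpha, dt=1e-2):
    """One forward-Euler step of the pseudogradient dynamics (4)."""
    return x + dt * alpha * np.array([grad_Ji(i, x) for i in range(2)])

alpha = np.array([1.0, 1.5])   # agent-dependent sensitivity parameters
x = np.array([0.0, 0.0])
for _ in range(20000):
    x = pseudogradient_step(x, alpha)

# At the limit both partial derivatives vanish (the Nash condition (3)).
print(x, [grad_Ji(i, x) for i in range(2)])
```

Note that the sensitivity parameters α_i rescale the speed of each agent's adjustment but not the equilibrium itself, consistent with x^* being an equilibrium of (4) for any positive α.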

C. Motivation and Problem
Before we present the main problem of this article, we give some motivation for this work. Considering the case where the agents (companies) may leave the market when their payoffs decrease after the incentive mechanism is executed, it is important to discuss how to design a special incentive mechanism under which every agent's payoff is monotonically increasing over time in the incentivized noncooperative dynamical system G(J̃). In other words, not only may the system manager wish to guarantee Pareto efficiency at the Nash equilibrium x^* of G(J̃), but also (d/dt)J̃_i(x(t)) ≥ 0, t ≥ 0, for all i ∈ N along the system trajectories of the pseudogradient dynamics (4). The conditions (d/dt)J̃_i(x(t)) ≥ 0, t ≥ 0, i ∈ N, indicate that whether the agents are paying taxes or receiving subsidies, their payoff values are still increasing over time, and hence those agents are still willing to follow the incentive mechanism constructed by the system manager.

Definition 3: Given the system trajectory x(t), t ≥ 0, with x(0) = x_0, the agents in the incentivized noncooperative system G(J̃) are Pareto improving if

J̃_i(x(t)) ≥ J_i(x_0), t ≥ 0, i ∈ N (5)

where J_i(x_0) denotes the payoff value of agent i at the initial time.
Note that condition (5) is equivalent to

(d/dt)J̃_i(x(t)) ≥ 0, t ≥ 0, i ∈ N (6)

under the condition

p_i(x_0) = 0, i ∈ N (7)

representing the assumption that there is no change in the payoff levels when we start to impose the incentive mechanism. On the other hand, the system manager in many economic applications serves merely as a mediator (or a tax collector) and does not have the productivity to pay additional profits to the agents. In such a case, it is worth asking whether it is possible to achieve the Pareto-improving conditions (5) and (6) by using some well-designed incentive functions p_i(x), i ∈ N, satisfying

Σ_{i∈N} p_i(x(t)) ≤ 0, t ≥ 0. (8)

Note that condition (8) imposes a sustainable budget constraint representing the fact that the system manager collects taxes from some agents and gives some of the collected taxes to other agents as subsidies. When the equality holds, the system manager is understood as a mediator who collects taxes from some agents and gives the same amount of subsidy in total to other agents. Now, we present the problem of this article as follows.
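The two budget-related conditions can be checked mechanically along any candidate trajectory. The sketch below uses placeholder incentive functions (hypothetical, chosen only for illustration, not the design introduced in Section III) to show what conditions (7) and (8) require: zero incentives at the initial state, and a nonpositive incentive sum thereafter.

```python
import numpy as np

# Toy check of the budget-related conditions: p_i(x0) = 0 for each agent
# (condition (7)) and sum_i p_i(x(t)) <= 0 along the trajectory
# (condition (8)). The incentive functions below are hypothetical.
x0 = np.array([1.0, 2.0])

def p(x):
    # Agent 1 receives a subsidy financed by a weakly larger tax on agent 2.
    transfer = (x[0] - x0[0])**2
    leakage = 0.1 * (x[1] - x0[1])**2   # tax collected but not redistributed
    return np.array([transfer, -transfer - leakage])

# A straight-line placeholder trajectory starting at x0.
trajectory = [x0 + t * np.array([0.1, -0.05]) for t in np.linspace(0.0, 10.0, 101)]

assert np.allclose(p(x0), 0.0)                       # condition (7)
assert all(p(x).sum() <= 1e-12 for x in trajectory)  # condition (8)
print("budget constraint satisfied along the trajectory")
```

When the "leakage" term is zero, the incentive sum is identically zero and the mechanism is budget balanced, i.e., the equality case of (8).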
Problem: Consider the incentivized noncooperative system G(J̃) with the pseudogradient dynamics (4). Suppose that the system manager knows all the agents' payoff functions J_i(x), i ∈ N. Our objective is to design the incentive functions p_i(x), i ∈ N, satisfying the sustainable budget constraint (8) for the incentive mechanism guaranteeing that the agents are Pareto improving and their state converges to a Pareto-efficient Nash equilibrium in G(J̃).

III. ACHIEVING PARETO IMPROVEMENTS WITH SUSTAINABLE BUDGET CONSTRAINT
In this section, we characterize the incentive mechanisms for the noncooperative system. It is important to emphasize that the system manager may evaluate the priority among the agents. In real society, the policies given by a government are often constructed according to the specific goal of the government considering the priority. For example, the government may give more preferential treatment to semiconductor companies when it wishes to raise the international competitiveness of the semiconductor industry in its country. Another example is that the government may provide more resources (e.g., job opportunities or common resources) to the poorer people than to the others in its country for enhancing the poor people's income, and hence for tackling extreme poverty. Obviously, depending on how the priority is determined by the system manager, the incentive mechanism should be properly designed so that every agent's welfare is certainly increasing and, at the same time, the entire state heads to a more efficient state for the society.
In light of this observation, we suppose that the priority ratio of the agents evaluated by the system manager is given by

η_1 : η_2 : · · · : η_n (9)

for some η_i ∈ R_+, i ∈ N. Without loss of generality, η_1 is taken as 1. Then, we consider the weighted social welfare function

U(x) ≜ σ Σ_{i∈N} η_i J_i(x) (10)

with σ > 0 being a scaling factor characterized later. Furthermore, we define the target state as the socially maximum state with respect to U(x) given by

x^* ≜ arg max_{x∈R^n} U(x). (11)

Now, we consider the situation where the incentive functions are designed so that

Σ_{i∈N} J̃_i(x) = U(x), x ∈ R^n (12)

and

∂J̃_i(x^*)/∂x_i = 0, i ∈ N. (13)

Obviously, the variable σ does not affect the maximum state of Σ_{i∈N} J̃_i(x), but we keep the notation for further characterization of some requirements below. As a result, the constraint (12) guarantees that the target state x^* is Pareto efficient in G(J̃), whereas the constraints (13) make x^* a Nash equilibrium. In other words, the target state x^* maximizing the social welfare function U(x) is a Pareto-efficient Nash equilibrium in the incentivized noncooperative system G(J̃) under (12) and (13). Note that condition (12) along with (1) and (10) is equivalent to

Σ_{i∈N} p_i(x) = σ Σ_{i∈N} η_i J_i(x) − Σ_{i∈N} J_i(x). (14)

Hence, the incentive functions should be designed in such a way that the system trajectories of the pseudogradient dynamics (4) remain in the domain

D_bud(σ) ≜ {x ∈ R^n : σ Σ_{i∈N} η_i J_i(x) − Σ_{i∈N} J_i(x) ≤ 0} (15)

in order to maintain the sustainable budget constraint (8). For the given priority ratio (9), it turns out that the initial state x_0 plays an important role in designing the incentive functions.
In the following statements, we explore two requirements on the initial state x_0 for constructing our incentive mechanism to allow the system trajectories of (4) to remain in the domain D_bud.

1) Requirement 1: Since (7) holds at the initial state x_0, and hence Σ_{i∈N} p_i(x_0) = 0, (14) implies that the scaling factor σ in (10) should be determined to satisfy

σ Σ_{i∈N} η_i J_i(x_0) − Σ_{i∈N} J_i(x_0) = 0. (16)

Note that the solution σ of (16) is unique as given by

σ(x_0) = Σ_{i∈N} J_i(x_0) / Σ_{i∈N} η_i J_i(x_0). (17)

In order for σ(x_0) to be positive, our framework requires the initial state x_0 to satisfy

x_0 ∈ D_scale ≜ {x ∈ R^n : Σ_{i∈N} J_i(x) / Σ_{i∈N} η_i J_i(x) > 0}. (18)

We emphasize that condition (18) may not hold for some initial states x_0 (and hence we cannot find a positive scaling factor σ(x_0)). Fig. 1 shows an example of an infeasible initial state (e.g., point A) outside the domain D_scale indicated by the striated region for the given priority ratio η_1 : η_2 in a two-agent noncooperative system G(J). Note that D_scale is invariant with respect to σ(x_0). Specifically, it follows from (18) that D_scale is characterized as the union of the domains {x ∈ R^n : Σ_{i∈N} J_i(x) > 0, Σ_{i∈N} η_i J_i(x) > 0} and {x ∈ R^n : Σ_{i∈N} J_i(x) < 0, Σ_{i∈N} η_i J_i(x) < 0}, so that the boundary of D_scale is given by Σ_{i∈N} J_i(x) = 0 and Σ_{i∈N} η_i J_i(x) = 0, irrespective of the agents' individual payoff functions. When the priority ratio (9) changes, the domain D_scale alters along with the changes of the level set of the weighted social welfare function. But when all the agents have equal priority (i.e., η_1 = · · · = η_n), those two boundaries coincide with each other and the domain D_scale is understood as the entire space R^n because the ratio in (18) is constant (equal to 1) and positive for all x ∈ R^n. This special case is elaborated in Section IV below.
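Requirement 1 reduces to a sign check at the initial state: assuming (17) takes the ratio form of the unweighted welfare over the weighted welfare at x_0, the scaling factor is positive exactly when the two sums share the same sign. A minimal sketch (the payoff values and weights below are hypothetical numbers):

```python
import numpy as np

def scaling_factor(J0, eta):
    """sigma(x0) as the ratio of unweighted to weighted welfare at x0.

    J0  : payoff values J_i(x0) at the initial state (hypothetical numbers)
    eta : priority weights eta_i with eta_1 = 1
    Returns sigma(x0) > 0, or None when x0 lies outside D_scale, i.e., when
    the two welfare sums do not share the same sign.
    """
    J0, eta = np.asarray(J0, float), np.asarray(eta, float)
    num, den = J0.sum(), (eta * J0).sum()
    if num * den <= 0.0:
        return None  # infeasible initial state (e.g., point A in Fig. 1)
    return num / den

print(scaling_factor([3.0, 1.0], [1.0, 0.5]))   # feasible: 4/3.5
print(scaling_factor([3.0, -4.0], [1.0, 0.5]))  # infeasible: None
```

With equal weights (eta all ones), the function always returns 1, mirroring the observation that D_scale becomes the entire space under equal priority.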
2) Requirement 2: It is important to notice from (17) that since the value of the scaling factor σ(x_0) depends on the initial state x_0, the domain D_bud(σ(x_0)) given by (15) also depends on the initial state x_0. Recalling that the system trajectories of (4) should remain in the domain D_bud(σ(x_0)) for maintaining the sustainable budget constraint (8), some initial states may not allow the existence of incentive functions that meet this requirement. For instance, when the target state x^*, which does not depend on the initial state as given by (11), does not belong to the domain D_bud(σ(x_0)), there is no possibility of establishing incentive functions satisfying (8) around the target state x^*. An example of an initial state for which x^* ∉ D_bud(σ(x_0)) holds is shown as the point B in Fig. 1, where the domain D_bud(σ(x_0)) is indicated by the blue region. Therefore, in order to make the socially maximum state x^* the target state for the incentive mechanisms, we further suppose that the initial state x_0 yields the domain D_bud(σ(x_0)) satisfying

x^* ∈ int D_bud(σ(x_0)). (19)

In the case where there is an incentive supply from outside the system and its supply rate is given by c ∈ R_+, the right-hand side of (8) should be replaced by c. In this case, the characterization of D_bud(σ(x_0)) can be similarly established. Now, we design the incentive functions p_i(x), i ∈ N, to satisfy the Pareto-improving conditions (6), (7), and the constraints (12) and (13). Specifically, we consider the incentive functions used in (1) given by (20) for each agent i ∈ N, where the constants w_i(x_0), i ∈ N, are chosen so that (7) holds. Then, the agents' incentivized payoff functions (1) are given by (21).

Proposition 1: If the incentive functions are constructed by (20), then the socially maximum state x^* associated with the weighted social welfare function U(x) is a Pareto-efficient Nash equilibrium in G(J̃).
Proof: Note that (12) holds by the construction of (20). Consequently, it follows from (4) and (21) that the pseudogradient dynamics are given by (22). For the statement of the following results, we let D_i ≜ {x ∈ R^n : ∇J̃_i(x)^T f(x) ≥ 0}, i ∈ N, where f(x) denotes the vector field of (22).

Theorem 1: Consider the n-agent noncooperative system G(J) with the incentive mechanism (1) and the pseudogradient dynamics (4) with (21). If the parameters ζ_i ∈ (0, 1), i ∈ N, and b_ij, i, j ∈ N, in (21) are chosen in such a way that there exists a function V : R^n → R satisfying the Lyapunov conditions (23)–(25) with D ⊆ D_1 ∩ · · · ∩ D_n ∩ D_bud(σ(x_0)) for the given initial state x_0, then the incentive functions p_i(x), i ∈ N, given by (20) guarantee that the socially maximum state x^* is an asymptotically stable equilibrium point and all the agents are Pareto improving with the sustainable budget constraint (8).
Proof: It follows from (23)–(25) that x^* is an asymptotically stable equilibrium point. Furthermore, since the trajectory remains in the domain D (and hence in each D_i), it follows that (d/dt)J̃_i(x(t)) = ∇J̃_i(x(t))^T f(x(t)) ≥ 0 for all i ∈ N and t ≥ 0. Moreover, since the trajectory remains in the domain D_bud(σ(x_0)), it follows from (14) and (15) that (8) holds. The proof is complete.

Remark 1: Even though the decision dynamics of the agents are fixed as the pseudogradient dynamics in this article, it is possible to extend the results of Theorem 1 to the case with other (non-pseudogradient) types of decision dynamics in the form of (22) by appropriately defining the domains D_i ≜ {x ∈ R^n : ∇J̃_i(x)^T f(x) ≥ 0}, i ∈ N. For example, a similar Pareto-improving incentive mechanism can be constructed for agents following prediction-incorporated pseudogradient dynamics with some cognitive prediction behaviors as characterized in [47].
Example 1: Consider the two-agent noncooperative system with the payoff functions J_1(x) and J_2(x) containing the constant terms −5.8 and 5.8, respectively. Even though these constant terms do not affect the behavior of the agents (they are absorbed into w_1(x_0), . . ., w_n(x_0) in (20) so that their contribution vanishes in the calculation of the pseudogradient), we keep them to effectively illustrate the domains in the figures. Let the priority evaluated by the system manager be given by η_1 = 1 and η_2 = 0.5. Note that the domain D_scale is the striated domain indicated in Fig. 1 and the socially maximum state is given by x^* = [2.9714, 1.8286]^T. Supposing that the initial state is given by x_0 = [4, 0.4]^T, which is exactly the point C in Fig. 1 satisfying x_0 ∈ D_scale, the scaling factor is obtained by (17) as σ(x_0) = 0.8074. In this case, the domain D_bud(σ(x_0)) satisfying x^* ∈ int D_bud(σ(x_0)) is illustrated as the red region in Fig. 2.
Let the sensitivity parameters be given by α = (1, 1), so that the vector field f(x) of the incentivized pseudogradient dynamics is linear in x − x^*. It follows from Theorem 1 that the incentive mechanism (1) along with the incentive functions (20) guarantees that the agents' state x(t) converges to the socially maximum state x^* and both of the agents are Pareto improving with the sustainable budget constraint (8).
Fig. 3 shows the trajectories of the agents' payoffs and incentives versus time. It can be seen from Figs. 2 and 3 that the agents' state indeed converges to the socially maximum state x^* with monotonically increasing J̃_1(x(t)) and J̃_2(x(t)), even though the sum of the incentive functions p_1(x(t)) and p_2(x(t)) is nonpositive for all t ≥ 0 [see the red solid curve in Fig. 3(b)]. Moreover, compared to the dashed curves shown in Fig. 3(a), which represent the trajectories of the agents' payoffs without the incentive mechanism, the payoff values of both agents, shown as the solid curves in Fig. 3(a), are improved with the incentive mechanism even though one of the agents (i.e., agent 2) is required to pay some taxes in the mechanism. In addition, even though the trajectories of the incentive values converge to some constant taxes/subsidies in Fig. 3(b), the transient part of the incentive values is essential in constructing Pareto-improving trajectories.
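The connection to potentialization can be illustrated with a simplified sketch (not the incentive design (20); the payoff functions, weights, and parameters below are hypothetical). If every incentivized payoff were the common potential σU(x) plus a constant, the pseudogradient flow would ascend U, and every incentivized payoff would then be nondecreasing along the trajectory by construction:

```python
import numpy as np

# Simplified potentialization sketch: J~_i(x) = sigma*U(x) + w_i, so the
# pseudogradient flow is gradient ascent on U and each J~_i(x(t)) is
# nondecreasing. Hypothetical payoffs and weights, not the paper's design.
eta = np.array([1.0, 0.5])
sigma = 0.8

def J(x):  # hypothetical concave payoffs
    return np.array([-(x[0] - 3.0)**2 + 0.2 * x[0] * x[1],
                     -(x[1] - 2.0)**2 + 0.1 * x[0] * x[1]])

def U(x):
    return float(eta @ J(x))

def grad_U(x, h=1e-6):
    e = np.eye(2)
    return np.array([(U(x + h * e[i]) - U(x - h * e[i])) / (2 * h)
                     for i in range(2)])

alpha, dt = np.array([1.0, 1.5]), 1e-2
x = np.array([4.0, 0.4])
welfare = [U(x)]
for _ in range(5000):
    x = x + dt * alpha * grad_U(x)   # each agent ascends the common potential
    welfare.append(U(x))

# U(x(t)) -- and hence every J~_i(x(t)) = sigma*U(x(t)) + w_i -- is
# nondecreasing along the trajectory.
assert all(b >= a - 1e-9 for a, b in zip(welfare, welfare[1:]))
print(welfare[0], welfare[-1])
```

Note that this full potentialization forces identical incentive shapes for all agents; the design in this section is more flexible, which is why, as discussed in Section IV, Pareto improvement and potentialization do not coincide in general.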
Example 2: Consider the two-agent noncooperative system with the payoff functions J_1(x) and J_2(x). Let the priority evaluated by the system manager be given by η_1 = 1 and η_2 = 2. Note that the domain D_scale is given by R^2 because J_1(x) < 0 and J_2(x) < 0 for all x ∈ R^2. Furthermore, the socially maximum state is given by x^* = [3.3779, 1.4480]^T. Supposing that the initial state is given by x_0 = [3.6720, 1.5360]^T, the scaling factor is obtained by (17) as σ(x_0) = 0.8311. In this case, the domain D_bud(σ(x_0)) satisfying x^* ∈ int D_bud(σ(x_0)) is illustrated as the red region in Fig. 4. Let the sensitivity parameters be given by α = (1, 1.5). It follows from Theorem 1 that the incentive mechanism (1) along with the incentive functions (20) and D ⊆ D_bud(σ(x_0)) with V(x) = −U(x) + U(x^*) guarantees that the agents' state x(t) converges to the socially maximum state x^* and both of the agents are Pareto improving with the sustainable budget constraint (8). The trajectories of the agents' payoff values with and without the incentive functions are shown in Fig. 5(a), whereas the trajectories of the amounts of incentives satisfying the sustainable budget constraint (8) are shown in Fig. 5(b). Even though the payoff values of agent 1 with the incentive mechanism are much lower than the payoff values without the incentive mechanism in Fig. 5(a), the weighted sum of the payoff values pointwise in time is certainly improved since the payoff values of agent 2 are significantly improved.
In general, it may be hard to examine the existence of the domain D_1 ∩ · · · ∩ D_n when the number of agents is large. However, the next result deals with the case where the sensitivity parameters of the agents are uncertain and suggests that the domain D_1 ∩ · · · ∩ D_n exists as long as b_ij is taken to be sufficiently close to 0.
Corollary 1: Consider the n-agent noncooperative system G(J) with the incentive mechanism (1) and the pseudogradient dynamics (4) with (21). If the domain condition (27) holds for the given initial state x_0, then the incentive functions p_i(x), i ∈ N, given by (20) with b_ij = 0, i, j ∈ N, guarantee that the socially maximum state x^* is an asymptotically stable equilibrium point and all the agents are Pareto improving with the sustainable budget constraint (8) for any positive constants α_i, i ∈ N.

Proof: The result is a direct consequence of Theorem 1.

Now, we specialize the result of Theorem 1 with the quadratic payoff functions (28), where A ∈ R^{n×n}, b ∈ R^n, and c_i ∈ R, i ∈ N. The social welfare function (10) is hence given by (29). Supposing that the social welfare function U(x) is concave (i.e., A < 0), it follows that the unique socially maximum state x^* is given by (30), and the social welfare function can be rewritten as (31) with x̄ ≜ x − x^*. For the statement of the following results, let the matrices P_i, i ∈ N, be given by (32).

Corollary 2: Consider the n-agent noncooperative system G(J) with the quadratic payoff functions (28), the incentive mechanism (1), and the pseudogradient dynamics (4) with (21). If the parameters ζ_i ∈ (0, 1), i ∈ N, and b_ij, i, j ∈ N, in (21) are chosen in such a way that

0 < A^T P_i + P_i A, i ∈ N (33)

then the incentive functions p_i(x), i ∈ N, given by (20) guarantee that the socially maximum state x^* is globally asymptotically stable. Furthermore, all the agents are Pareto improving with the sustainable budget constraint (8) for the given initial state x_0 satisfying D ⊆ D_bud(σ(x_0)), where D ≜ {x ∈ R^n : V(x) ≤ V(x_0)} \ {x^*} with V(x) satisfying (23)–(25).
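For the quadratic case, the socially maximum state can be computed in closed form. The sketch below assumes the weighted welfare takes the generic concave quadratic form U(x) = (1/2)x^T A x + b^T x + c (the matrix A and vector b are hypothetical numbers, not those of (28)–(31)); its maximizer solves the stationarity condition ∇U(x^*) = Ax^* + b = 0, i.e., x^* = −A^{−1}b.

```python
import numpy as np

# Hypothetical concave quadratic welfare U(x) = 0.5*x'Ax + b'x + c with
# A symmetric negative definite; the unique maximizer is x* = -A^{-1} b.
A = np.array([[-2.0, 0.5],
              [0.5, -1.0]])
b = np.array([4.0, 1.0])

assert np.all(np.linalg.eigvalsh(A) < 0)   # concavity check (A < 0)
x_star = -np.linalg.solve(A, b)

# Stationarity check: the gradient of U vanishes at x_star.
assert np.allclose(A @ x_star + b, 0.0)
print(x_star)
```

Checking negative definiteness first matters: if A were indefinite, the stationary point would be a saddle rather than the socially maximum state.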
Proof: First, note that the vector field f(x) of the pseudogradient dynamics (22) becomes f(x) = diag[α](σ(x_0)ZA + B)x̃. Furthermore, note from (21) and (31) that the agents' incentivized payoff functions are quadratic in x̃. Therefore, it follows from (33) that the incentivized payoff values are strictly increasing along the closed-loop trajectories for all x ∈ R^n \ {x*}, and hence all the agents are Pareto improving. Then, the result is a direct consequence of Theorem 1 using the Lyapunov function candidate V(x) = −U(x) + U(x*) satisfying (23)-(25), since V̇(x) < 0 holds for all x ∈ R^n \ {x*}. Remark 2: There is no systematic way of finding the parameters b_ij, i, j ∈ N, for the general result of Theorem 1. However, since condition (33) of Corollary 2 is certainly satisfied with b_ij = 0, i, j ∈ N, both of the results in Corollaries 1 and 2 show that the parameters b_ij, i, j ∈ N, can simply be set to zero as long as the initial state x_0 satisfies condition (27) and Requirements 1 and 2 [i.e., (18) and (19)]. These restrictions on the initial state x_0 are automatically satisfied in Section IV with equal priorities of the agents, and hence the parameters b_ij, i, j ∈ N, can simply be set to zero for any initial state (see Corollary 3 below).
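For the quadratic specialization, the socially maximum state admits a closed form; a minimal sketch, assuming U(x) = 0.5 x^T A x + c^T x with A symmetric negative definite (the precise form of (31) is an assumption here):

```python
import numpy as np

# Quadratic welfare U(x) = 0.5 x^T A x + c^T x with A symmetric negative
# definite; then x* = -A^{-1} c is the unique maximizer, and
# U(x) - U(x*) = 0.5 (x - x*)^T A (x - x*).
A = np.array([[-2.0, 0.5],
              [ 0.5, -3.0]])
c = np.array([1.0, 2.0])

assert np.all(np.linalg.eigvalsh(A) < 0)   # concavity check (A < 0)

x_star = -np.linalg.solve(A, c)            # socially maximum state

print(A @ x_star + c)                      # first-order optimality: ~ [0, 0]
```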
Note that it may be hard to determine the parameters ζ_i, i ∈ N, and b_ij, i, j ∈ N, to guarantee D ⊆ D_bud(σ(x_0)) when the number of agents is large because we cannot easily find the function V(x). The following result provides different conditions, without looking for a function V(x) guaranteeing D ⊆ D_bud(σ(x_0)), for the noncooperative system G(J) with quadratic payoff functions when the Jacobian matrix Ā possesses a real eigenvalue in its spectrum.
Proposition 2: Consider the n-agent noncooperative system G(J) with the quadratic payoff functions (28), the incentive mechanism (1), and the pseudogradient dynamics (4) with (21). If the parameters ζ_i ∈ (0, 1), i ∈ N, and b_ij = −b_ji ∈ R, i, j ∈ N, in (21) are chosen in such a way that (33) holds along with condition (37), where λ ∈ R is a real eigenvalue of the matrix Ā, then all the agents are Pareto improving with the sustainable budget
constraint (8) for the given initial state x_0 such that the straight segment from x_0 to x* is contained in the domain D_bud(σ(x_0)). Proof: The proof is immediate, since (37) indicates that the vector x_0 − x* is an eigenvector of the matrix Ā associated with the eigenvalue λ, and hence the system trajectory x(t) is a straight line starting at the initial state x_0 and ending at the target state x*.
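Condition (37) of Proposition 2 can be checked numerically; a sketch with a hypothetical closed-loop matrix Ā and states (all values below are illustrative, not those of the paper):

```python
import numpy as np

# If x0 - x* is an eigenvector of the closed-loop matrix A_bar for a real
# eigenvalue lam < 0, then x(t) = x* + exp(lam * t) * (x0 - x*) traverses
# exactly the straight segment from x0 to x*.
A_bar = np.array([[-1.0, 0.5],
                  [ 0.0, -2.0]])
x_star = np.array([1.0, 1.0])
v = np.array([1.0, 0.0])               # eigenvector of A_bar, eigenvalue -1
x0 = x_star + 0.5 * v

def is_eigvec(M, v, tol=1e-9):
    """Return (flag, lam): is v an eigenvector of M, and for which lam?"""
    w = M @ v
    lam = float(v @ w) / float(v @ v)  # Rayleigh quotient
    return np.linalg.norm(w - lam * v) < tol, lam

ok, lam = is_eigvec(A_bar, x0 - x_star)
print(ok, lam)
```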
For a given vector x̃ ≜ [x̃_1, ..., x̃_n]^T = x_0 − x* ∈ R^n, even though it appears to be hard to find the parameters ζ_i, i ∈ N, and b_ij, i, j ∈ N, such that condition (37) is satisfied, it is possible to solve some linear equations to derive such parameters by constructing a special form for the matrix Ā when n ≥ 5 and x̃_i ≠ 0, i ∈ N. For example, let b_1j = −ζ_1 σ(x_0)A_1^j, j ∈ N, where A_i^j ≜ A(i, j), so that Ā is given by the special matrix shown in (38) at the bottom of this page. It then follows from (38) that condition (37) is equivalent to a system of n − 1 linear equations shown in (39) at the bottom of this page [see the derivation of (41)]. Recalling that ζ_1 + ζ_2 = 1, it follows that there exist parameters ζ_1 ∈ (0, 1), ζ_2 ∈ (0, 1), and b_12 such that condition (37) is satisfied for the given initial state x_0.
Example 3: Consider a four-agent noncooperative system with the quadratic payoff functions (28). Let the sensitivity parameters be given by α = (1, 1, 1, 1) so that the matrix Ā is given by Ā = σ(x_0)ZA + B, and let the parameters ζ_i, b_ij, i, j ∈ N, be chosen so that condition (33) holds and condition (37) holds for one of the real eigenvalues of Ā, which is given by λ = −0.5712. It follows from Proposition 2 that the incentive mechanism (1) along with the incentive functions (20) guarantees that all the agents are Pareto improving with the sustainable budget constraint (8). Fig. 6 shows the trajectories of the agents' state, incentives, and payoff values versus time. It can be seen from those figures that the agents' state indeed converges to the socially maximum state x* under the sustainable budget constraint (8) [see the black solid curve in Fig. 6(b)]. Moreover, compared to the trajectories of the agents' payoff values without the incentive mechanism, shown as the dashed curves in Fig. 6(a), the payoff values of all 4 agents are substantially improved with the incentive mechanism even though only small amounts of incentives are implemented [see Fig. 6(b)].
Example 4: Consider a differentiated oligopoly market composed of n = 4 firms selling different products with the market price function [48] given by λ_i(x) = λ_0 − βx_i − βδ Σ_{j≠i} x_j, i ∈ N, where x_i ∈ R_+ denotes the quantity of the products of firm i, λ_0 ∈ R_+ denotes the cap price, β ∈ R_+ denotes the market power, and δ ∈ [0, 1) denotes the degree of product differentiation. In this market, the firms compete in quantities rather than prices according to their payoff (income) functions. Suppose that firm 1 is a domestic firm whereas the other 3 firms are foreign-owned firms in the market. Supposing that the initial state is given by x_0 = [2.6466, 1.5984, 2.1807, 1.4231]^T, the trajectories of the firms' payoff values under the pseudogradient dynamics (4) without imposing any incentives are shown as the dashed lines in Fig. 7(a) and (b). It can be seen from the figure that the payoff values of the foreign-owned firms 2 and 4 are substantially improved at the sacrifice of the domestic firm 1.
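The market model of Example 4 is explicit enough to simulate. The price function is as stated; the income form J_i(x) = λ_i(x)x_i and the parameter values λ_0, β, δ are assumptions for illustration:

```python
import numpy as np

n = 4
lam0, beta, delta = 10.0, 1.0, 0.5     # hypothetical market parameters
alpha = np.ones(n)                     # sensitivity parameters

def price(x):                          # lambda_i(x) = lam0 - beta*x_i - beta*delta*sum_{j!=i} x_j
    s = x.sum()
    return lam0 - beta * x - beta * delta * (s - x)

def payoff(x):                         # assumed income: J_i(x) = lambda_i(x) * x_i
    return price(x) * x

def pseudograd(x):                     # dJ_i/dx_i for each firm i
    s = x.sum()
    return lam0 - 2.0 * beta * x - beta * delta * (s - x)

# Uninventivized pseudogradient play (4) by forward Euler
x = np.array([2.6466, 1.5984, 2.1807, 1.4231])   # initial quantities x_0
dt = 1e-3
for _ in range(50000):
    x = x + dt * alpha * pseudograd(x)

# Quantities settle at the (here symmetric) Cournot equilibrium
# lam0 / (beta * (2 + delta * (n - 1))).
print(x)
```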
To maintain the continued competitiveness of the domestic firm 1 in the market, we consider the case where the priority evaluated by the system manager is given by η_1 = 1 > η_2 = η_3 = η_4 = 0.4. Note that the socially maximum state is hence given by x* = [2.6134, 1.6559, 2.2115, 1.6559]^T and the scaling factor is obtained by (17) as σ(x_0) = 1.6838. In this case, it can be numerically verified that the straight segment from x_0 to x* is contained in the domain D_bud(σ(x_0)) characterized by (15) since Σ_{i∈N} (ση_i − 1)J_i(x) ≤ 0 holds with x = γ(x_0 − x*) + x* for all γ ∈ (0, 1). Let the sensitivity parameters be given by α = (1, 1, 1, 1) so that the Jacobian matrix Ā is given by Ā = σ(x_0)ZA + B. Let ζ_i = 0.25, i = 1, 3, 4, b_12 = 0.16, b_13 = 0, b_14 = −0.0284, b_23 = 0.0268, b_24 = 0, and b_34 = 0.0384, so that condition (33) holds and condition (37) holds for one of the real eigenvalues of Ā, which is given by λ = −0.3503. (Note that spec(Ā) = {−0.3503, −0.8161, −0.3429 ± 0.0468i} for this choice of the parameters ζ_i, b_ij, i, j ∈ N.) Now, it follows from Proposition 2 that the incentive mechanism (1) along with the incentive functions (20) guarantees that all the agents are Pareto improving with the sustainable budget constraint (8), which can be verified by the trajectories (solid lines) of the firms' payoff values and incentives versus time in Figs. 7 and 8, respectively. This example indicates that the proposed incentive mechanism is able to maintain the continued competitiveness of the local firm by avoiding payoff loss in the market.
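The segment check used in Example 4 can likewise be sketched; η, σ(x_0), x_0, and x* are taken from the example, while the payoff functions are hypothetical concave stand-ins:

```python
import numpy as np

# Membership test for D_bud(sigma(x0)) along the segment (cf. (15)):
# x in D_bud when sum_i (sigma * eta_i - 1) * J_i(x) <= 0.
eta = np.array([1.0, 0.4, 0.4, 0.4])   # priorities of Example 4
sigma = 1.6838                         # scaling factor sigma(x_0) of (17)
x0     = np.array([2.6466, 1.5984, 2.1807, 1.4231])
x_star = np.array([2.6134, 1.6559, 2.2115, 1.6559])

def J(x):                              # HYPOTHETICAL stand-in payoffs
    return -(x - 2.0)**2 - np.array([5.0, 1.0, 1.0, 1.0])

def in_D_bud(x):
    return float(np.sum((sigma * eta - 1.0) * J(x))) <= 0.0

# Sample the open segment gamma * (x0 - x*) + x*, gamma in (0, 1)
gammas = np.linspace(1e-3, 1.0 - 1e-3, 200)
ok = all(in_D_bud(g * (x0 - x_star) + x_star) for g in gammas)
print(ok)
```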
On the other hand, note that when the system manager adopts slightly different weights for the agents, the above choice of the parameters ζ_i, b_ij, i, j ∈ N, in general no longer guarantees condition (37) of Proposition 2 for the same initial state x_0 because the socially maximum state x* may change substantially. However, in the case when the priorities of the agents are all equal, the above choice of the parameters ζ_i, b_ij, i, j ∈ N, still exhibits Pareto-improving trajectories with the sustainable budget constraint for any initial state x_0 ∈ R^n. This is because all the restrictions (18), (19), and (27) on the initial state x_0 are automatically satisfied with equal priorities of the agents, as will be discussed in Corollary 4 in Section IV below.

IV. CONNECTION BETWEEN PARETO IMPROVEMENT AND POTENTIALIZATION UNDER EQUAL PRIORITY
In general, the domains D_bud(σ(x_0)) and D_scale characterized in Section III are not the entire state space, and hence we may not be able to construct Pareto-improving incentive functions for some initial states x_0 with unequal priorities. But for the special situation where the agents have equal priority in (9), i.e., η_i = 1 for all i ∈ N, recall from (18) that D_scale = R^n holds. In this case, since the scaling factor is simply obtained from (17) as σ(x_0) = 1 irrespective of the initial state, it is worth noting from (14) and (15) that the incentive functions p_i(x), i ∈ N, satisfy (45), i.e., the system manager works exactly as a mediator transferring the payoff values among the n agents, and hence the domain D_bud(σ(x_0)) becomes R^n for all x_0 ∈ R^n. Furthermore, the social welfare function (10) simply becomes (46). Therefore, in this section, we specialize the incentive mechanism characterized in Section III to this special situation and show that the Pareto-improving incentive mechanism can be constructed for any initial state x_0 ∈ R^n. Theorem 2: Consider the n-agent noncooperative system G(J) with the incentive mechanism (1) and the pseudogradient dynamics (4) with (21). Suppose that the agents have equal priority in (9) with η_i = 1 for all i ∈ N. If the parameters ζ_i ∈ (0, 1), i ∈ N, and b_ij = −b_ji ∈ R, i, j ∈ N, are chosen in such a way that the socially maximum state x* belongs to the interior of D_1 ∩ ··· ∩ D_n, then the incentive functions p_i(x), i ∈ N, given by (20) guarantee that the socially maximum state x* is asymptotically stable. Furthermore, all the agents are Pareto improving with the sustainable (balanced) budget constraint (8) holding with equality for any initial state x_0 ∈ D, where D ≜ {x ∈ R^n : V(x) ≤ δ} with the maximum attainable δ ∈ R_+ and V(x) satisfying (23)-(25).
Proof: Consider the Lyapunov function candidate V(x) = −U(x) + U(x*) defined around the socially maximum state x*. Since V′(x)f(x) = −Σ_{i∈N} J′_i(x)f(x) < 0 holds around x*, it follows that x* is asymptotically stable. Now, recalling that D_bud(σ(x_0)) = R^n for any initial state x_0 ∈ R^n, the result is immediate.
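The mediator interpretation of (45) and (46) can be verified pointwise; a minimal sketch with hypothetical payoffs and shares ζ_i summing to one:

```python
import numpy as np

# With equal priorities and b_ij = 0, the incentivized payoffs take the form
# J~_i(x) = zeta_i * U(x), so the implied transfers p_i = J~_i - J_i sum to
# zero pointwise (balanced budget).  Payoffs here are hypothetical stand-ins.
def J(x):
    return np.array([-(x[0] - 1.0)**2, -(x[1] - 2.0)**2])

def U(x):
    return float(J(x).sum())           # equal-priority welfare (46)

zeta = np.array([0.4, 0.6])            # zeta_1 + zeta_2 = 1

x = np.array([0.5, 1.5])
p = zeta * U(x) - J(x)                 # incentives as payoff transfers
print(p, p.sum())                      # transfers sum to ~0
```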
Example 5: Consider the two-agent noncooperative system with the payoff functions J_1(x) and J_2(x). Let the priority evaluated by the system manager be given by η_1 : η_2 = 1 : 1. Note that the socially maximum state is given by x* = [3.2321, 1.3303]^T. Let the sensitivity parameters be given by α = (3, 1). In this case, it follows from Theorem 2 that the incentive mechanism (1) along with the incentive functions (20) guarantees that the socially maximum state x* is asymptotically stable. Furthermore, both of the agents are Pareto improving with the sustainable budget constraint (8) holding with equality for any initial state x_0 ∈ D ≜ {x ∈ R^n : V(x) ≤ δ} with V(x) = −U(x) + U(x*), where the maximum attainable δ is given by δ = 1.0354.
The following result provides one way to achieve Pareto improvement without the information of the agents' personal sensitivity parameters α_1, ..., α_n. We let D be given by (47) with the maximum attainable δ ∈ R_+ such that ∇U(x) = 0 holds only at x* in D. Corollary 3: Consider the n-agent noncooperative system G(J) with the incentive mechanism (1) and the pseudogradient dynamics (4) with (21). Suppose that the agents have equal priority in (9) with η_i = 1 for all i ∈ N. Then the incentive functions p_i(x), i ∈ N, given by (20) with b_ij = 0, i, j ∈ N, guarantee that the socially maximum state x* is asymptotically stable and all the agents are Pareto improving with the sustainable (balanced) budget constraint (8) holding with equality for all x_0 in D given by (47) for any positive constants α_i, i ∈ N.
Proof: First, let g(x) ≜ ∇U(x). Note that the vector field f(x) of the pseudogradient dynamics (22) becomes f(x) = diag[α]Zg(x), and hence V′(x)f(x) = −g(x)^T diag[α]Zg(x) < 0 for all x ∈ R^n except for the states x satisfying g(x) = 0. Furthermore, since g(x) = 0 holds only at x* in D, the result follows with V(x) satisfying (23)-(25). Remark 3: Since the domain D is understood as an invariant set for arbitrary sensitivity parameters α_i, i ∈ N, they do not have to be known.
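The α-independence asserted in Corollary 3 and Remark 3 can be illustrated as follows; the concave welfare is a hypothetical stand-in with maximizer [3, 1]:

```python
import numpy as np

# With b_ij = 0 and equal priorities, each agent ascends zeta_i * U, so the
# flow x' = diag(alpha) diag(zeta) grad U(x) reaches the maximizer of U for
# ANY positive sensitivities alpha_i.
def gradU(x):
    return np.array([-2.0 * (x[0] - 3.0), -4.0 * (x[1] - 1.0)])

zeta = np.array([0.4, 0.6])

def run(alpha, x0, dt=1e-2, steps=5000):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * np.asarray(alpha) * zeta * gradU(x)
    return x

x_a = run([1.0, 1.0], [0.0, 0.0])      # one choice of sensitivities
x_b = run([5.0, 0.3], [0.0, 0.0])      # a very different choice
print(x_a, x_b)                        # same limit in both cases
```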
Example 6: Consider the two-agent noncooperative system with the sinusoidal (nonquadratic) payoff functions J_1(x) and J_2(x). Let the priority evaluated by the system manager be given by η_1 : η_2 = 1 : 1. Note that ∇U(x) = 0 holds at the socially maximum state x* = [3.3524, 1.3187]^T, the state x̄_1 = [4.6971, 2.0236]^T, and the locally maximum state x̄_2 = [4.496, 1.715]^T. Fig. 10 shows the domain D of (47) indicated by the grey region with δ = 1.24. It follows from Corollary 3 that the incentive mechanism (1) along with the incentive functions (20) with ζ_1 = 1 − ζ_2 = 0.4 and b_12 = −b_21 = 0 guarantees that the socially maximum state x* is asymptotically stable and both of the agents are Pareto improving with the sustainable budget constraint (8) holding with equality for all x_0 ∈ D for any sensitivity parameters α_1 and α_2. With the sensitivity parameters given by α = (2, 1), the vector field of the pseudogradient dynamics (22) is shown in Fig. 10. It can be seen from the figure that the socially maximum state x*, the state x̄_1, and the locally maximum state x̄_2 are asymptotically stable, unstable, and asymptotically stable, respectively. Note that the state x̄_1 is a saddle point of the pseudogradient dynamics and the domain D is an invariant set for arbitrary sensitivity parameters α_1 and α_2. Now, we specialize the result of Theorem 2 with the quadratic payoff functions given by (28).
Corollary 4: Consider the n-agent noncooperative system G(J) with the quadratic payoff functions (28), the incentive mechanism (1), and the pseudogradient dynamics (4) with (21). Suppose that the agents have equal priority in (9) with η_i = 1 for all i ∈ N. If the parameters ζ_i ∈ (0, 1), i ∈ N, and b_ij = −b_ji ∈ R, i, j ∈ N, in (21) are chosen in such a way that (33) holds, then the incentive functions p_i(x), i ∈ N, given by (20) guarantee that the socially maximum state x* is globally asymptotically stable and all the agents are Pareto improving for any initial state x_0 ∈ R^n.
Proof: Recalling the expressions (34)-(36) for the quadratic payoff functions, the result is a direct consequence of Theorem 2.
Implementing incentive mechanisms to make the incentivized (modified) noncooperative system a potential game is referred to as potentialization of the noncooperative system, whereas the modified system is referred to as a potentialized game [49], [50]. Due to the fact that (exact) potential games guarantee self-improving agents and convergence to a Nash equilibrium under gradient play, it may be intuitive to conclude that potentializing the noncooperative system as an (exact) potential game with the social welfare function being a potential function can achieve Pareto improvement for the agents. However, the proposed incentive function (20) may potentialize the noncooperative system to a larger class of potential games than exact potential games. (See the definitions of various types of potential games in Appendix A.) For example, the incentive mechanism in Corollary 3 (i.e., b_ij = 0, i, j ∈ N) potentializes the agents' payoff functions in G(J), i.e., G(J̃) reduces to a special class of potential games (so-called weighted potential games), by noting that each agent's incentivized payoff function is characterized as J̃_i(x) = ζ_i U(x) + w_i by the common function U(x) in (46). Therefore, in addition to the above results, it is interesting to discuss the connection between Pareto improvement and potentialization. Basically, whether the agents' payoff functions are potentialized for a class of potential games in the incentivized noncooperative system depends on how we select the parameters ζ_i, i ∈ N, and b_ij, i, j ∈ N. For example, it is straightforward to see that if b_ij = 0 for all i, j ∈ N, then the agents are all Pareto improving. Example 7: Consider the two-agent noncooperative system with quadratic payoff functions J_1(x) and J_2(x) such that the social welfare function is given by (31) and condition (33) holds. It is interesting to see that even when b ≠ 0 (where the agents' incentivized payoff functions are not simple proportions of the social welfare
function U(x)), the agents' state still converges to the socially maximum state with monotonically increasing payoffs (in other words, the agents are driven in a noncooperative way but obtain a cooperative benefit). In fact, in this example, it can be shown that the incentivized noncooperative system G(J̃) is never an ordinal potential (nor a weighted potential) game when b is nonzero. Hence, our example numerically shows that Pareto improvement does not imply potentialization.
Next, we show an example to reveal that the agents in the incentivized noncooperative system G(J̃) possessing an ordinal potential may not be Pareto improving.
Example 8: Consider the two-agent noncooperative system with quadratic payoff functions J_1(x) and J_2(x) such that the social welfare function is given by (31). It can be seen from Fig. 13 that the feasible ζ-b region is not contained in the grey region (the strip bounded by the dashed lines), and vice versa. Hence, our example numerically shows that the agents in the incentivized noncooperative system G(J̃) possessing an ordinal potential may not be Pareto improving.

V. CONCLUSION
We investigated the social welfare improvement problem for noncooperative dynamical systems through a Pareto-improving incentive mechanism under a sustainable budget constraint, where a system manager collects taxes from some agents and gives some of the collected taxes to other agents as subsidies in order to remodel the agents' dynamical decision-making. Sufficient stability conditions for our incentive functions were proposed to guarantee that the agents are Pareto improving under the pseudogradient dynamics and that their state converges to a Pareto-efficient Nash equilibrium associated with a weighted social welfare function depending on the priority ratio of the agents. It was found that the initial state plays an important role in constructing our incentive mechanism to satisfy the sustainable budget constraint. Furthermore, we revealed the connection between Pareto improvement and potentialization under equal priority among the agents. Our numerical examples give direct evidence that Pareto improvement is not the same as potentialization. Future directions may include the incentive design for a hierarchical noncooperative system [51] with a large number of agents and the characterization of bargaining and formation behaviors in noncooperative systems. In addition, the case where the timewise budget constraint (8) is relaxed to an integral constraint should be investigated so that a timewise deficit/debt can be compensated by the end of the operation.

APPENDIX A DEFINITIONS OF POTENTIAL GAMES
Several classes of potential games are found in the literature [52]. Specifically, a noncooperative game with the payoff functions J_i(x), i ∈ N, is called an (exact) potential game if there exists a function f : R^n → R such that J_i(x_i, x_-i) − J_i(x̂_i, x_-i) = f(x_i, x_-i) − f(x̂_i, x_-i) for any i ∈ N, x_i ∈ R, x̂_i ∈ R, and x_-i ∈ R^(n−1). This notion can be generalized to the notion of a weighted potential game when there exists a positive weight vector (w_i)_{i∈N} such that J_i(x_i, x_-i) − J_i(x̂_i, x_-i) = w_i(f(x_i, x_-i) − f(x̂_i, x_-i)) for any i ∈ N, x_i ∈ R, x̂_i ∈ R, and x_-i ∈ R^(n−1). Furthermore, the notion of a weighted potential game can be generalized to the notion of an ordinal potential game when J_i(x_i, x_-i) − J_i(x̂_i, x_-i) > 0 if and only if f(x_i, x_-i) − f(x̂_i, x_-i) > 0 for any i ∈ N, x_i ∈ R, x̂_i ∈ R, and x_-i ∈ R^(n−1).
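For smooth games, the existence of an exact potential is equivalent to symmetry of the cross-derivatives ∂²J_i/∂x_j∂x_i = ∂²J_j/∂x_i∂x_j (a standard characterization not stated in the text). For two-agent quadratic payoffs J_i(x) = 0.5 x^T Q_i x with symmetric Q_i (an assumed parameterization), this reduces to a single coefficient check:

```python
import numpy as np

# Exact-potential test for a two-agent quadratic game: with symmetric Q_i,
# d^2 J_1/(dx_2 dx_1) = Q1[0, 1] and d^2 J_2/(dx_1 dx_2) = Q2[1, 0], so an
# exact potential exists iff these coefficients coincide.
Q1 = np.array([[-2.0, 1.0],
               [ 1.0, 0.0]])
Q2 = np.array([[0.0,  1.0],
               [1.0, -3.0]])

def exact_potential_quadratic(Q1, Q2):
    return bool(np.isclose(Q1[0, 1], Q2[1, 0]))

print(exact_potential_quadratic(Q1, Q2))   # cross terms agree here

# Perturbing the cross term of agent 2 destroys the exact potential.
Q2b = Q2.copy()
Q2b[0, 1] = Q2b[1, 0] = 2.0
print(exact_potential_quadratic(Q1, Q2b))
```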
The weighted potential game (and hence the exact potential game) is a special class of ordinal potential games.
Lemma 1: Consider the two-agent noncooperative game G with quadratic payoff functions (28) satisfying a^1_11 < 0 and a^2_22 < 0. Then, the game G admits an ordinal potential if and only if the coefficient condition in (50) holds, in which case the associated function f(x) is an ordinal potential for G because f(x) satisfies arg max_{x_i ∈ R} J_i(x_i, x_-i) = arg max_{x_i ∈ R} f(x_i, x_-i), i = 1, 2, and hence (50).

Manuscript received 8 February 2023; accepted 18 September 2023. Date of publication 17 October 2023; date of current version 28 June 2024. The work of Yuyue Yan was supported by the Chinese Scholarship Council (CSC). This work was supported in part by the JST Moonshot R&D Program under Grant JPMJMS2021. Recommended by Associate Editor S. Grammatico. (Corresponding author: Tomohisa Hayakawa.)

Fig. 1. Example of (a) the domain D_scale and (b) the domain D_bud(σ(x_0)). The boundaries of D_scale are the two dashed curves characterized by Σ_{i=1}^n J_i(x) = 0 and Σ_{i=1}^n η_i J_i(x) = 0 in (a). The domain D_bud(σ(x_0)) is characterized by the initial state indicated by point B in (b). In this example, the socially maximum state x* is not contained in D_bud(σ(x_0)). Another domain D_bud(σ(x_0)), characterized by the initial state at point C in (a), is depicted in Fig. 2, where x* ∈ D_bud(σ(x_0)) holds. The initial state is likely to be on the boundary of D_bud(σ(x_0)).

Fig. 4. Level sets of U(x) with the domain D_bud(σ(x_0)) and the trajectory of x(t) under the incentive functions (20) with ζ_1 = 1 − ζ_2 = 0.5 and b_12 = −b_21 = 2 in Example 2. The state converges to the socially maximum state x* and its trajectory is contained in the domains D_bud(σ(x_0)), D_1, and D_2.

Fig. 6. Trajectories of the amounts of payoff values and incentives and trajectories of the agents' state under the incentive functions (20) in Example 3. The dash-dot lines in (c) indicate the socially maximum state, whereas the dashed lines in (a) indicate the trajectories of the agents' payoff values without the incentive mechanism.

Fig. 7. Trajectories of the payoff values and the sum of the payoff values with and without the incentive functions (20) in Example 4.

Fig. 8. Trajectories of the agents' state and the amount of incentives under the incentive functions (20) in Example 4. The dash-dot lines in figure (a) indicate the socially maximum state.

Fig. 10. Level sets of U(x) with the vector field of the pseudogradient dynamics (22) under the incentive functions (20) with ζ_1 = 1 − ζ_2 = 0.4 and b_12 = −b_21 = 0 in Example 6. The state x̄_1 is a saddle point of the dynamics. The guaranteed region of attraction D (grey region) is understood as an invariant set for arbitrary α_1 and α_2.

Fig. 11. Feasible solutions in the ζ-b region for achieving Pareto improvement in Example 7. The overlapped (brown) region of the red and green regions denotes the region in which the agents are Pareto improving. In this example, the incentivized noncooperative system G(J̃) possesses an ordinal potential (or a weighted potential) only when b = 0.
and the incentivized noncooperative system G(J̃) is exactly a weighted potential game. But the connection between Pareto improvement and potentialization is obscure when b_ij is nonzero for some i, j ∈ N. Does Pareto improvement always imply potentialization, or does potentialization always imply Pareto improvement? To clarify the connections between Pareto improvement and potentialization, we present two numerical examples below. It turns out from those numerical examples that Pareto improvement and potentialization do not have an inclusive relation with each other.

Fig. 13. Feasible solutions in the ζ-b region for achieving Pareto improvement in Example 8. The overlapped (brown) region of the red and green regions denotes the region in which the agents are Pareto improving. The grey region denotes the potentialization region in which the incentivized noncooperative system G(J̃) possesses an ordinal potential. Obviously, the brown region is not contained in the grey region, and vice versa.