Observer-Based Adaptive Inverse Optimal Output Regulation for a Class of Uncertain Nonlinear Systems

This paper addresses the adaptive inverse optimal output regulation problem for a class of uncertain nonlinear systems driven by an exosystem. The nonlinear system contains unknown parameters, internal disturbances, and unmeasured states. First, the output regulation problem is decomposed into a feedforward control design problem, which is solved by an internal model based on output regulation theory, and an adaptive inverse optimal stabilization problem. Then an auxiliary system is designed, and a new state observer related to the auxiliary system is given. By combining adaptive control with the inverse optimal control method, a novel adaptive output feedback inverse optimal controller is developed so that the system output tracks the reference signal quickly. With this control strategy, all signals of the closed-loop system are uniformly ultimately bounded (UUB), and a newly defined cost functional, which is connected with the auxiliary system and the controller, is minimized. Finally, a simulation example is presented to verify the feasibility of the proposed controller and state observer.


I. INTRODUCTION
The nonlinear output regulation problem (ORP) has received great attention with the development of control theory and its applications. The ORP aims to design a controller that not only guarantees the stability of the system but also ensures that the system output tracks the reference signals or rejects the disturbances, where the reference signals and disturbances are both generated by an exosystem. The nonlinear ORP is encountered in many practical problems, such as attitude tracking and disturbance rejection of aircraft [1], [2], [3]. Recently, many fruitful results on the ORP have emerged [4], [5], [6]. At the same time, some researchers have noted that uncertainties, such as unmodeled dynamics and unknown parameter vectors, always exist in practical engineering systems [7], [8]. These uncertainties have a significant influence on performance, so it is meaningful to study the output regulation problem of uncertain nonlinear systems. At present, adaptive control is an effective way to handle uncertainty in the system [9], [10], [11], [12], [13], [14]. Based on adaptive control, the ORP for a class of uncertain nonlinear systems driven by an exosystem was studied in [12], [13], and [14]. When the exosystem signal is zero, the output regulation problem reduces to a stabilization problem [9], [10], [11]. That is to say, stabilization is a special case of the output regulation problem. For a class of uncertain nonlinear systems with unknown parameter vectors and disturbances, a control strategy based on the adaptive internal model and dynamic surface control was put forward in [15] to solve the ORP. However, the studies mentioned above do not address optimal control, which is a critical problem in modern control.
The associate editor coordinating the review of this manuscript and approving it for publication was Haibin Sun.
In many control systems, optimal controllers are needed because resources are scarce. Taking the attitude tracking control of spacecraft as an example, the controller may need to minimize fuel consumption, or to achieve an optimal combination of fuel and time consumption.
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Recently, a lot of effort has been made regarding optimal control [16], [17]. For nonlinear systems, the difficulty in coping with the optimal control problem is to seek the solutions of the Hamilton-Jacobi-Bellman (HJB) equation.
To avoid this problem, approximate solutions of the HJB equation were obtained in [18] and [19] with the help of neural networks, and the optimal control problem of nonlinear systems was then addressed. However, the method of [18] and [19] produces approximation errors. If these errors are not small enough, the optimal performance can be degraded. Inverse optimal control was proposed in [20] and [21]. It designs a controller based on a control Lyapunov function (CLF), rather than by minimizing a pre-determined cost functional. In the framework of [21], the stabilization problem of nonlinear systems with unknown parameter vectors was investigated in [22]. Nevertheless, there remains the restriction that the nonlinear functions must be known, which limits the scope of application. To remove this restriction, the unknown nonlinear functions were approximated by fuzzy logic systems (FLSs), and an adaptive fuzzy state feedback inverse optimal controller was designed in [23]. In practical problems, most of the states cannot be measured directly, so a fuzzy state observer and an adaptive fuzzy output feedback inverse optimal controller were designed in [24]. Some applications of the inverse optimal control method can be found in [25] and [26]. However, it is worth mentioning that the inverse optimal method has mainly been employed to solve the stabilization problem of nonlinear systems. It has rarely been used to solve the ORP, which is widespread in practical problems. In [27], the inverse optimal control method was extended to the ORP of minimum-phase nonlinear systems. However, [27] has two main limitations. One is that the control system does not contain any uncertain terms, which usually appear in system modeling. The other is that all the states are assumed to be measured directly.
Motivated by the above investigations, this paper studies the adaptive inverse optimal ORP for a class of uncertain nonlinear systems with unknown time-varying bounded disturbances, unknown parameter vectors, and unmeasured states. The proposed controller not only ensures that all signals of the closed-loop system are UUB, but also minimizes the cost functional. Although [27] also studies this kind of problem, it does not take unknown time-varying bounded disturbances, unknown parameter vectors, or unmeasured states into account, so the controller of [27] is not applicable here. The main difficulty of this paper is how to construct a reasonable cost functional. Although the form of the cost functional is similar to those in [20], [21], [22], [23], [24], and [27], there are still some differences. Because of the exosystem, an internal model must be employed such that the exosystem can be immersed in it. The internal model must therefore be considered in the cost functional, which differs from [20], [21], [22], [23], and [24]. Because of the unmeasured states, the proposed cost functional must be related to the state observer, so the cost functional in [27] is unsuitable.
To overcome the above difficulties, and compared with the existing results, the innovations of this paper are as follows:
(i) A novel adaptive output feedback inverse optimal controller is developed by utilizing adaptive control and the inverse optimal control method. Different from previous studies, the newly designed controller is associated with the auxiliary system. By using the inverse optimal control method, it is proved that all signals of the closed-loop system are UUB. Compared with [15], the output of the controlled system tracks the reference signals faster.
(ii) A well-defined cost functional, which is connected with the internal model and the state observer, is given. The functions l(x) and γ in the given cost functional are designed differently from [20], [21], [22], [23], [24], and [27], and the proposed controller minimizes the proposed functional.
(iii) An auxiliary system is constructed. A new internal model and a state observer related to the auxiliary system are given, and the state observer is designed to estimate the unmeasured states.
The rest of this paper is organized as follows. Section II formulates the output regulation problem and gives some preliminary knowledge. Section III presents the design of the state observer. An internal model is designed in Section IV. An adaptive inverse optimal controller is designed in Section V. Section VI presents the stability analysis of the closed-loop system and the minimization of the cost functional. Finally, Section VII reports numerical simulation results, and the conclusion is drawn in Section VIII.

II. PROBLEM FORMULATION AND PRELIMINARY
Consider the class of nonlinear systems (1), where $u \in \mathbb{R}$ is the control input, $y \in \mathbb{R}$ denotes the output, and $\theta \in \mathbb{R}^r$ is an unknown parameter vector. The functions $f_i \in \mathbb{R}^r$ and $g_i \in \mathbb{R}$ $(i = 1, 2, \cdots, n)$ are known, smooth, and bounded. $d(t)$ is an unknown time-varying bounded disturbance, $D_i(w)$ $(i = 1, 2, \cdots, n)$ and $R(w)$ are the undesired external disturbance and the reference input, respectively, and $e$ is the tracking error. The signal $w$ is produced by the exosystem (2), where $w \in W$ and $W$ is a compact set. It is worth mentioning that systems of the form (1) and (2) arise in practical problems. For example, a dynamic model of this kind is established to solve the attitude tracking and disturbance rejection problem of a rigid spacecraft, where the model is disturbed by an external disturbance torque. This is a special case of the ORP [2].
The control objective of this paper is to design a state observer, an adaptive law, and an adaptive inverse optimal output feedback controller such that the output of (1) tracks the reference signals, all signals of the closed-loop system are UUB, and the cost functional is minimized.
To study the ORP, the following assumptions are introduced.
Assumption 1 [17]: where m i and l i are known constants, r and are compact set.
Assumption 1 is frequently used in adaptive backstepping control, as seen in [10] and [17], which solve the tracking control problem of nonlinear systems.
Assumption 2 [6]: The matrix S has distinct eigenvalues on the imaginary axis.
Assumption 2 means that w is bounded and persistent, and that the exosystem is neutrally stable. This is a common assumption for the ORP.
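The meaning of Assumption 2 can be illustrated numerically. The sketch below assumes a linear exosystem of the form dw/dt = S w and a hypothetical harmonic-oscillator matrix S (neither the value of S nor the frequency `omega` is taken from this paper). It checks that the eigenvalues of S are distinct and purely imaginary, and that the resulting signal w(t) neither decays nor grows.

```python
import numpy as np

# Hypothetical exosystem matrix (an assumption for illustration only):
# a harmonic oscillator with angular frequency omega.
omega = 1.0
S = np.array([[0.0, omega],
              [-omega, 0.0]])

def exosystem_state(S, w0, t):
    """Return w(t) = exp(S t) w0 via the eigendecomposition of S."""
    eigvals, eigvecs = np.linalg.eig(S)
    # exp(S t) = V diag(exp(lambda_i t)) V^{-1}; valid because S is diagonalizable.
    exp_St = eigvecs @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(eigvecs)
    return np.real(exp_St @ w0)

eigvals = np.linalg.eigvals(S)
w0 = np.array([1.0, 0.0])
times = np.linspace(0.0, 20.0, 201)
norms = np.array([np.linalg.norm(exosystem_state(S, w0, t)) for t in times])

print("eigenvalues of S:", eigvals)
print("min and max of |w(t)|:", norms.min(), norms.max())
```

For this particular S the norm of w(t) stays constant, which is one simple instance of a bounded and persistent exogenous signal.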
Assumption 3 [6]: For the nonlinear system under consideration, there exist a continuously differentiable mapping $\xi = \pi(w)$, with $\pi(0) = 0$, $\forall w \in W^*$, and a continuous mapping $\alpha(w)$ that solve the regulator equations, where $W^*$ is a compact set.
Assumption 3 is a necessary and sufficient condition for the solvability of the output regulation problem, as discussed in [5] and [6].
According to Assumption 3, let $\pi_1(w) = R(w)$. The solutions of the regulator equations then yield a new closed-loop system (6), in which $M_i$ and $L_i$, $i = 1, 2, \cdots, n$, are given constants. Based on the above operation, it is easy to see that the ORP of the controlled system (1) and exosystem (2) can be converted into a stabilization problem of (6).
To achieve the target of this paper, the following important definition and lemmas are needed.
Definition 1 [21]: The inverse optimal gain assignment problem for the system (6) is solvable if there exist a class $\mathcal{K}_\infty$ function $\gamma$ whose derivative $\gamma'$ is also a class $\mathcal{K}_\infty$ function, a matrix-valued function $R(x)$ satisfying $R(x) = R^T(x) > 0$ for all $x$, positive definite unbounded functions $l(x)$ and $E(x)$, and a feedback law $u = u^*$ that is continuous away from the origin with $u^*(0) = 0$, such that the cost functional is minimized, where $D$ is the set of locally bounded functions of $x$.
Lemma 1 [21]: If $\gamma$ and its derivative $\gamma'$ are class $\mathcal{K}_\infty$ functions, the Legendre-Fenchel transform will satisfy the following properties:
Lemma 2 [21]: For any two vectors $a$ and $b$, the following inequality holds and the equality is achieved if and
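As an illustration of how the Legendre-Fenchel transform and the Young-type inequality are used, the following sketch works with the quadratic function γ(v) = v²/µ. It assumes the standard definitions from the inverse optimal control literature, namely ℓγ(r) = ∫₀ʳ (γ′)⁻¹(s) ds and aᵀb ≤ γ(|a|) + ℓγ(|b|); these forms are stated here as assumptions. The value of `MU` and the random test vectors are hypothetical.

```python
import numpy as np

MU = 0.5  # hypothetical value of the design constant mu > 0

def gamma(v):
    """Quadratic class-K-infinity function gamma(v) = v^2 / mu."""
    return v**2 / MU

def gamma_prime(v):
    """Derivative gamma'(v) = 2 v / mu."""
    return 2.0 * v / MU

def gamma_prime_inv(s):
    """Inverse of gamma', i.e. (gamma')^{-1}(s) = mu s / 2."""
    return MU * s / 2.0

def legendre_fenchel(r, samples=2001):
    """l_gamma(r) = integral_0^r (gamma')^{-1}(s) ds, by the trapezoidal rule."""
    s = np.linspace(0.0, r, samples)
    vals = gamma_prime_inv(s)
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(s)) / 2.0)

# Closed form for this gamma: l_gamma(r) = mu r^2 / 4, hence l_gamma(2 v) = mu v^2.
r = 3.0
closed_form_error = abs(legendre_fenchel(r) - MU * r**2 / 4.0)

# Young-type inequality a^T b <= gamma(|a|) + l_gamma(|b|) on random vectors.
rng = np.random.default_rng(0)
worst_gap = np.inf
for _ in range(1000):
    a = rng.normal(size=3)
    b = rng.normal(size=3)
    gap = gamma(np.linalg.norm(a)) + legendre_fenchel(np.linalg.norm(b)) - a @ b
    worst_gap = min(worst_gap, gap)

# Equality case: b = gamma'(|a|) a / |a|.
a = np.array([1.0, -2.0, 0.5])
b_eq = gamma_prime(np.linalg.norm(a)) * a / np.linalg.norm(a)
equality_gap = gamma(np.linalg.norm(a)) + legendre_fenchel(np.linalg.norm(b_eq)) - a @ b_eq

print("closed-form error:", closed_form_error)
print("smallest gap over random pairs:", worst_gap)
print("gap in the equality case:", equality_gap)
```

The gap is nonnegative for every sampled pair and vanishes, up to rounding, when b is aligned with a and scaled by γ′(|a|).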

III. STATE OBSERVER DESIGN
In this part, a state observer is designed to estimate the unmeasured state.
Consider a nonlinear system of the form (9), where $x \in \mathbb{R}^n$ and $u \in \mathbb{R}$ are the state vector and the control input, $a(x)$ and $b(x)$ are smooth functions, and $q(x)$ is an unknown bounded disturbance vector. An auxiliary system of (9) can be constructed as (10), where $\gamma$ is a function that can be selected according to Lemma 1, $V(x)$ is the same as that in (20), and $LV$ denotes the Lie derivative. Now define the function $\gamma(v) = \frac{1}{\mu}v^2$, so that $\ell\gamma(2v) = \mu v^2$, where $\mu \neq 0$ is an arbitrary constant. Using (9) and (10), the auxiliary system of (6) can be built as (11). For convenience, (11) is rewritten in a compact form that involves unit vectors of the form $[0, \cdots, 0, 1, 0, \cdots, 0]^T$, and the parameters $k_i$ are chosen such that the matrix $A$ is Hurwitz. The state observer is then constructed, where $\hat{\theta}$ is an estimate of $\theta$ and $\hat{x} = [\hat{x}_1, \hat{x}_2, \cdots, \hat{x}_n]^T$. Letting $\tilde{e} = [\tilde{e}_1, \tilde{e}_2, \cdots, \tilde{e}_n]^T = x - \hat{x}$, we obtain the observation error dynamics (14).
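The requirement that the gains k_i make A Hurwitz can be checked numerically. The sketch below assumes the common observer structure in which A has −k_i in its first column and ones on the superdiagonal; this structure, the helper names, and the pole locations are assumptions for illustration. Under that assumption the characteristic polynomial of A is sⁿ + k₁sⁿ⁻¹ + ⋯ + k_n, so the gains can be read off from any desired set of stable poles.

```python
import numpy as np

def observer_matrix(gains):
    """Assumed structure: A = [[-k1, 1, 0, ...], [-k2, 0, 1, ...], ..., [-kn, 0, ..., 0]]."""
    gains = np.asarray(gains, dtype=float)
    n = gains.size
    A = np.zeros((n, n))
    A[:, 0] = -gains
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    return A

def gains_from_poles(poles):
    """Gains k_1..k_n as the non-leading coefficients of prod(s - p_i)."""
    return np.real(np.poly(poles))[1:]

def is_hurwitz(A):
    """True if every eigenvalue of A has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

# Hypothetical desired observer poles.
poles = [-2.0, -3.0, -4.0]
k = gains_from_poles(poles)   # (s+2)(s+3)(s+4) = s^3 + 9 s^2 + 26 s + 24
A = observer_matrix(k)

print("gains k:", k)
print("A is Hurwitz:", is_hurwitz(A))
```

Placing the poles further to the left gives faster decay of the estimation error at the price of larger gains, which is the usual trade-off when choosing k_i.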

IV. INTERNAL MODEL
In this section, an internal model is constructed.
Assumption 4 [6]: There exists a set of real numbers $a_1, \ldots, a_{n-1}$ such that $\alpha(w)$ satisfies a linear differential equation with these coefficients.
This is a common assumption in solving the ORP, and it helps in the design of an internal model.
According to Assumption 4 and the above analysis, the exosystem with the output $\alpha(w)$ can be immersed in the normalized form of the internal model. From this normalized form, and based on the certainty equivalence principle, the error form of the internal model is obtained, where $\chi(\cdot)$ is a function to be designed. The internal model is a useful tool for dealing with the ORP, and it helps in designing a feedforward controller, which is one part of the overall controller.

V. ADAPTIVE INVERSE OPTIMAL CONTROLLER DESIGN
In this section a controller is designed, and the controller can make all the signals of the closed-loop system UUB. In addition, the proposed cost functional can be minimized.
Define a series of coordinate transformations as in (19), where $\alpha_i$ is the virtual control law. The task is then to design the control input $u$. The CLF is designed as in (20), in which $V_e$ is one of the terms. By virtue of (14), the derivative of $V_e$ can be calculated as (21). Based on Young's inequality, we obtain (22), where $a_i = \frac{1}{2}\sum_{k=1}^{n} \mu L_i^2 L_k^2$.

Substituting (22) into (21) gives (23).
The function $\chi(\cdot)$ can be designed as (25), and the derivative of $V_\eta$ is then given by (26). With the help of Young's inequality, we obtain (27). Substituting (27) into (26) gives (28). By using Young's inequality, it is easy to bound the term $z_n k_n \tilde{e}_1$. Based on (29) and (30), the virtual controller can be designed as (31), where $c_k > 0$ and $\sigma > 0$ are design constants, so that (29) is converted into (32), where $\sigma > 0$ is a design constant. From (19), we conclude that each $\alpha_i$ vanishes when $z = 0$. Then, based on [21] and the Mean Value Theorem, there exists a smooth function $\varphi_k$ with the required property. Let $u_0 = u - \varphi\eta + \varphi N e$; then $u_0$ can be designed as (36). By applying Young's inequality once more, the derivative of $V$ is obtained.

VI. STABILITY AND PERFORMANCE ANALYSIS
In this part, the feasibility of the controller is proved, including the stability of the closed-loop system and the minimization of the cost functional.
Theorem 1: For system (6) and auxiliary system (10), if Assumptions 1-4 hold, then the state observer (12), internal model (17), adaptive law (33), and control input (36) guarantee that all signals in the closed-loop system are UUB and that the observer estimates the unmeasured states well. Moreover, the inverse optimal control input $u_0^* = -\rho u_0$ minimizes the cost functional (40), where $\rho \geq 2$, $\vartheta \leq 2$, and $M(x)$ is designed in (36).
Proof: Starting from inequality (39) and introducing the appropriate constants, we obtain the bound (43) on $\dot{V}$. From (43), it follows that all signals in the closed-loop system are UUB. Next, we prove that the controller $u_0^*$ minimizes the cost functional (40). Rewrite system (4) in a compact form. Based on (8) and (9), the auxiliary system of (40) can then be expressed accordingly. It is easy to see that $l(x)$ satisfies the required inequality, since $\rho \geq 2$, $\vartheta \leq 2$, $W(x) > 0$, and $\gamma$ is a class $\mathcal{K}_\infty$ function. Hence $l(x) > 0$, and the cost functional $J(u)$ is meaningful. The function $d(t)$ is written as $d$ in the following derivation.
Substituting (41) into (40), $J(u)$ can be rewritten in a form to which Lemma 1 and Lemma 2 apply, and employing these two lemmas yields an inequality on $J(u)$. Based on the above analysis, we conclude that the inverse optimal controller $u_0^*$ minimizes $J(u)$, where $\rho \geq 2$. This completes the proof of the theorem. To minimize the cost functional, the parameter can be chosen as $\rho = 2$, in accordance with $\rho \geq 2$. Then the cost functional reaches the minimum value $J(u) = 4V(0)$.
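The step from a Lyapunov inequality to the UUB conclusion in the proof of Theorem 1 is the standard comparison argument. A sketch is given here under the assumption that the bound on the derivative of V takes the usual form with hypothetical constants $c > 0$ and $\delta \geq 0$; these symbols are introduced only for illustration and are not taken from the paper.

```latex
% Assumed form of the Lyapunov inequality (c > 0 and \delta \ge 0 are hypothetical constants)
\dot{V}(t) \le -c\,V(t) + \delta
% Multiply by e^{ct} and integrate over [0, t]
\frac{d}{dt}\bigl(e^{ct} V(t)\bigr) \le \delta\, e^{ct}
\;\Longrightarrow\;
V(t) \le e^{-ct} V(0) + \frac{\delta}{c}\bigl(1 - e^{-ct}\bigr)
\le e^{-ct} V(0) + \frac{\delta}{c}
% Ultimate bound
\limsup_{t \to \infty} V(t) \le \frac{\delta}{c}
```

Under this assumption, every signal that enters $V$ quadratically converges exponentially to a neighborhood of the origin whose size is governed by $\delta / c$, which is what the UUB property states.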

VII. SIMULATION
In this section, a simulation example is presented to verify the above method.
Example: Consider the following nonlinear system, where $\xi = [\xi_1, \xi_2]^T$ represents the state vector, $\theta$ is an unknown parameter vector, $u$ and $y$ are the control input and the output, respectively, the disturbance is $d(t) = \cos(2t)$, and $w = [w_1, w_2]^T$ is generated by the exosystem. The solutions of the regulator equations are $\pi_1(w) = w_1$ and $\pi_2(w) = -w_1 - \sin w_1 \theta$. Let $x_i = \xi_i - \pi_i(w)$; a new closed-loop system is then generated, where $\alpha(w) = -w_2 \cos w_1 \theta - d(t)$.
According to the simulation results in FIGURE 1-4, the controller and adaptive law (58) designed in this paper guarantee that all signals of the closed-loop system (55) are UUB and that the output of the system (53) tracks the reference signal. The state observer estimates the states of the system (55), as shown in FIGURE 3 and FIGURE 4. Compared with [15], the tracking error in this paper is smaller and the convergence is faster, as can be seen in FIGURE 1 and FIGURE 2.

VIII. CONCLUSION
In this paper, the optimal ORP is addressed for a class of nonlinear systems driven by an exosystem. The considered nonlinear systems contain unknown parameter vectors and internal bounded disturbances. By a state transformation, a closed-loop system is obtained, and an auxiliary system is designed. A state observer related to the auxiliary system is proposed to estimate the unmeasured states. A novel adaptive output feedback inverse optimal controller and an adaptive law are designed by employing adaptive control and the inverse optimal control method. It has been proved that the new controller makes all signals of the closed-loop system UUB and that the well-defined cost functional is minimized. Finally, a simulation example is given to demonstrate the feasibility of the proposed controller and state observer. Compared with the result in [15], the convergence in this paper is faster. In the future, we will study finite-time optimal output regulation for a class of uncertain nonlinear systems with time delay and unknown nonlinear functions, and we will consider applying the proposed control method to mobile robots and quadrotors.