Adaptive Controller and Observer Design Using Open and Closed-Loop Reference Models for Linear Time-Invariant Systems With Unknown Dynamics

Abstract—This article presents an output feedback controller and observer design approach for linear time-invariant systems with unknown dynamics. The presented method uses an open-loop reference model to generate the desired trajectory and a closed-loop reference model as an observer. The controller only uses the observer states. Lyapunov-based stability proofs show that the error states converge asymptotically to zero and that all other signals are uniformly stable. Furthermore, bounds are proven on the transient behavior.


I. INTRODUCTION
In this article, we present a novel control method for linear time-invariant (LTI) systems with unknown dynamics. While all real systems are to a greater or lesser extent nonlinear, many can be approximated at least locally by LTI systems. At the same time, dynamics (or at least system parameters) are often unknown in practice. Many traditional control methods for LTI systems assume that the system matrix A is known exactly so that the separation principle can be applied [1], [2]. The method presented here reduces the necessity of detailed knowledge of the system parameters. In the future, the method may be extended to some classes of nonlinear systems.
A widely used approach for control of unknown LTI systems is adaptive control, more specifically model reference adaptive control (MRAC) [3, Ch. 6] (other methods are also found in the literature, e.g., neural-network-based [4]–[6]; this tradition is distinct from the one considered here). The goal of MRAC is to ensure that the plant output tracks the output of a reference model specified by the designer. Since the system parameters are in most cases unknown or uncertain, the MRAC contains an adaptive law that updates the controller parameters to ensure that the error between the reference system output and the measured output converges to zero. Stability proofs for a standard MRAC can be found in, e.g., [3, Ch. 6.8].
Adaptive techniques can also be used to design observers [3, Ch. 5.3], [7]. The approach is similar to that of MRAC, but the unknown plant parameters, the system state, and the outputs are estimated in lieu of estimating the controller parameters. Without a persistently exciting (PE) signal, only the system output will converge to the true value; states and system parameters do not necessarily converge to their true values. Ensuring a PE signal may be difficult in many systems. Recent work on adaptive observers has, however, shown that it is possible to achieve parameter convergence by using an initial excitation, rather than a continuous persistent excitation [8].
In recent years, several modifications to classic MRAC have been introduced [9]–[12]. Those works particularly address the transient behavior, which can be poor in MRAC, as oscillations often occur in the input and output.
One modification to MRAC known as closed-loop reference model adaptive control (CRM) (see, e.g., [10], [13], [14]) introduces a feedback structure in the reference model. This feedback introduces a new degree of freedom for tuning, and allows the reference model dynamics to change if the system is incapable of tracking the original dynamics [14, Ch. 3.2.2]. This reduces the oscillations in the state and input at the cost of having the output of the reference system deviate from the desired output specified by the reference model [13]. The introduction of the feedback term also makes it possible to use the reference system as an observer, and hence the reference system state can be used in the controller instead of the actual plant state.
A comparison between standard MRAC and adaptive control with CRM is provided in [13]. It is clear that the transient is improved significantly when using CRM adaptive control compared to classic MRAC. Adaptive control with CRM, however, is prone to peaking unless the feedback gain in the observer and the adaptation gains are chosen with care [13].
Another recent modification to MRAC is presented in [11] and [12]. In those works, the authors introduce a modification scheme, through filtering, for the reference model and the control action in order to achieve improved convergence of the estimation error. A nonlinear compensator is introduced to reshape the closed-loop system transient. This compensator captures the unknown system dynamics and modifies the given nominal reference model, but the modified reference model can approach the ideal reference model. Furthermore, a leakage term that ensures parameter estimation is introduced. Simulation results show that the adaptive controller with modified reference model and the novel adaptive law of [11], [12] achieves very good tracking of the reference signal and that the transients are suppressed.
In this article, we introduce a new method for adaptive control of LTI systems with unknown dynamics. The proposed method requires relatively little system knowledge, while also forcing the closed-loop system to conform to a known and desired reference model, as well as giving estimates of the states without requiring a PE condition. While each of these features is achieved individually by other, existing methods, we are not aware of other methods that have all these features simultaneously without further drawbacks. The existing CRM method for LTI systems is a special case of our more general method. The CRM method has also been shown to be applicable to a class of nonlinear and nonsquare systems. Such generalizations of the novel method are at this stage left as future work.
In the novel method, we combine the error signal from a closed-loop reference model with that of a classical MRAC (open-loop reference model) in the adaptation law. The system dynamics trend towards those of the open-loop reference model; hence the closed-loop reference model trends towards an observer for the system dynamics and can be used as an observer for the unmeasured states, allowing output feedback (without a PE requirement). With our proposed solution, simulations show that the controller ensures improved tracking of the original open-loop reference model output when compared to a CRM controller (which tracks the modified reference model). This ensures a more predictable behavior (in the sense of being closer to that of the desired reference model) of the closed-loop system. This article is a continuation of the work found in [15], where a solution for first-order systems is presented. Our method is referred to as a model reference adaptive controller and observer (MRACO), and the controller structure is depicted in Fig. 1.
We prove through Lyapunov analysis that the presented controller and observer ensure that the tracking error and observer error converge to zero and that all signals are bounded. Furthermore, we also prove bounds on the L_2 and L_∞ norms of the signals. An analysis of these bounds provides insight into the transient behavior of the closed-loop system.
A simulation comparison with the CRM method is performed, where our proposed method achieves a lower integrated absolute error between the system output and the reference signal.

II. SYSTEM DESCRIPTION

Consider the LTI system
where x is the state, u is the control input, and y is the measured output. These satisfy x ∈ R^n and u, y ∈ R^m, i.e., the system is square, with m ≤ n. It is assumed that A ∈ R^{n×n} and Λ ∈ R^{m×m} are unknown, but that B ∈ R^{n×m} and C ∈ R^{n×m} are known. The matrices are constant. The open-loop (i.e., no feedback) reference model is given as (3), where A_m ∈ R^{n×n} is chosen by the designer and r ∈ R^m is a piecewise continuous bounded reference signal. This is similar to a traditional MRAC reference model [3, Ch. 6]. The closed-loop (i.e., with feedback) reference model is (5), where L = ρB ∈ R^{n×m} and ρ > 0 will be determined later. This is similar to a traditional Luenberger observer [1, Ch. 8.4], with the difference that we use the reference A_m instead of the true but unknown A matrix.
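The structure of the two reference models can be sketched numerically. Since the display equations (3) and (5) are not reproduced above, the exact form of the injection term is an assumption here: the sketch uses a plausible Luenberger-style form x̂̇ = A_m x̂ + B r + L(y − C^T x̂) with L = ρB, and the numerical values of A_m, B, C, and ρ are placeholders, not the article's.

```python
import numpy as np

# Hypothetical second-order example; A_m, B, C, rho are placeholders chosen so
# that A_m is Hurwitz. The output convention y = C^T x follows the article.
A_m = np.array([[0.0, 1.0], [-2.0, -3.0]])   # designer-chosen reference dynamics
B = np.array([[0.0], [1.0]])
C = np.array([[1.0], [0.0]])
rho = 5.0
L = rho * B                                   # observer feedback gain L = rho * B

def step_open_loop(x_m, r, dt):
    """Forward-Euler step of the open-loop reference model: x_m' = A_m x_m + B r."""
    return x_m + dt * (A_m @ x_m + B @ r)

def step_closed_loop(x_hat, y, r, dt):
    """Forward-Euler step of an assumed closed-loop reference model (observer):
    x_hat' = A_m x_hat + B r + L (y - C^T x_hat)."""
    y_hat = C.T @ x_hat
    return x_hat + dt * (A_m @ x_hat + B @ r + L @ (y - y_hat))
```

With a constant reference, the open-loop model settles at its own steady state, while the closed-loop model additionally pulls its output estimate toward the measured y.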
Assumption 2: There exists a K* ∈ R^{n×m} such that A − BΛK*^T = A_m. Furthermore, ΛK*^T ∈ D, where D is known.
Assumption 3: Λ is diagonal with strictly positive elements.
Assumption 3 simplifies the notation without loss of generality. Assumption 1 is necessary in order to use the KYP lemma [16, Lemma 6.3], on which the proof of the novel method rests (use of the KYP lemma also precludes linear time-varying systems from being considered).
In practice, Assumption 2 places a constraint on how different A and A m can be; D in practice represents the user's certainty about the nominal values of A.
The user can be expected to know a range of possible values for A, and choose an A m that is not too different from A (the difference can be large but not arbitrarily large).
Since A − A_m = BΛK*^T, setting bounds on A − A_m is equivalent to setting bounds on ΛK*^T. This assumed known bound (Assumption 2) is the set D.
If the user has already chosen an A_m, Theorem 1 (Section IV) can be used to find the set of all values of ΛK*^T for which the origin of the closed-loop system is provably asymptotically stable; call this set D̄. As long as D ⊆ D̄, stability can be guaranteed without the user having to know the true value of ΛK*^T.
Verifying whether D ⊆ D̄ can be done numerically by iteration, due to the convex nature of the problem (as discussed in Section IV).

III. CONTROL OBJECTIVE
The primary control objective is to ensure that y → y_m (similar to MRAC [3, Ch. 6]). The secondary control objective is to ensure that ŷ → y (similar to an observer [1, Ch. 8.4]). Finally, uniform stability must be guaranteed. To achieve the objectives, we apply the control input (7), where K and L are estimates of K* and L* = Λ^{-1}, respectively. K* and L* are the "ideal" values that would ensure the best tracking.
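Since the display equation (7) is not reproduced above, a minimal sketch of the control law is given here under the assumption that it has the standard MRAC output-feedback form u = −K^T x̂ + L^T r, with the dimensions of Section II; the structure and variable names are ours, not taken from the article's displays.

```python
import numpy as np

def control(K, L, x_hat, r):
    """Assumed control law u = -K^T x_hat + L^T r, where K (n x m) and L (m x m)
    are the adaptive estimates of K* and L* = Lambda^{-1}, x_hat (n x 1) is the
    observer state, and r (m x 1) is the reference."""
    return -K.T @ x_hat + L.T @ r
```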

A. Error Dynamics
We define the error states e_1 and e_2 as follows. Adding and subtracting BΛK*^T x and BΛL*^T r in (1), using Assumption 2, and inserting the input u = −K^T x̂ + L^T r gives the closed-loop dynamics. From these, the dynamics of the error state e_1 follow directly, while the dynamics of the error state e_2 follow after we add and subtract the term BΛK*^T x.

B. Observer Feedback Gain
As in [10], the choice of the observer feedback gain L is important. In this article, we use an approach similar to that of [10] to find a suitable observer gain. By Assumption 1 and [16, Lemma 6.3], there exist matrices P = P^T > 0 and Q_1 = Q_1^T > 0 satisfying (13) and (14). Furthermore, we define a matrix M as in (15). We assume that ρ > 0 can be chosen such that M > 0.
Lemma 1 (from [10]): Choosing L = ρB ensures that the closed-loop system (A_m − LC^T, B, C) is SPR.
Proof: If we add the term −ρ(CC^T + CC^T) = −2ρCC^T, where ρ > 0, to both sides of (13), the KYP conditions for the closed-loop system (A_m − LC^T, B, C) follow.
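Lemma 1 can be sanity-checked numerically on a toy example. The matrices below are hypothetical (chosen only so that A_m is Hurwitz): P is obtained from a Lyapunov equation and C = PB is then chosen so that (A_m, B, C) is SPR by construction, mirroring the KYP conditions used in the proof.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky

A_m = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical Hurwitz reference matrix
B = np.array([[0.0], [1.0]])
Q1 = np.eye(2)

# Solve A_m^T P + P A_m = -Q1 for P = P^T > 0 (continuous Lyapunov equation).
P = solve_continuous_lyapunov(A_m.T, -Q1)
C = P @ B            # choosing C = P B makes (A_m, B, C) SPR by the KYP lemma

def is_pos_def(M):
    """Positive-definiteness test: Cholesky succeeds iff M > 0."""
    try:
        cholesky(M)
        return True
    except np.linalg.LinAlgError:
        return False

# Lemma 1 (sketch): with L = rho*B the closed loop stays SPR, since
# P (A_m - rho B C^T) + (A_m - rho B C^T)^T P = -Q1 - 2 rho C C^T < 0.
rho = 10.0
A_cl = A_m - rho * B @ C.T
Q_cl = -(P @ A_cl + A_cl.T @ P)   # should equal Q1 + 2*rho*C@C.T, positive definite
```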

IV. STABILITY AND ASYMPTOTIC BEHAVIOR
We now state the main result of this article.
Theorem 1 (Main result): For the systems (A, B, C) and (A_m, B, C) satisfying Assumptions 1–3, assume also that ρ, P, and Q_1 are chosen such that (13)–(15) hold.
Furthermore, let x_m be given by (3), x̂ by (5) with L = ρB, and let Γ_k, Γ_l > 0 ∈ R^{m×m} be arbitrary matrices. Let the controller be given by (7) with the update laws (17), (18). Then, the origin of the system given by e_1, e_2, K, L is uniformly stable. Furthermore, e_1 and e_2 converge asymptotically to zero for all initial values of e_1, e_2, K, L.
Proof: Consider the function where P is as in (13). Along the trajectories of the system, using (13) and (16), we get (21). Being scalar, the terms in (21) containing ε_1 and ε_2 are equal to their own trace. Hence, we obtain a quadratic form in e^T = [e_1^T, e_2^T], where M is as in (15). Now define a Lyapunov function candidate with time derivative taken along the trajectories of the system. Utilizing that Tr(X + Y) = Tr(X) + Tr(Y) and that the trace is invariant under cyclic permutations [17], the last two trace terms are zero if we choose the update laws (17), (18). We are then left with V̇ = −e^T M e. M is positive definite by assumption, so V̇ ≤ 0. By [16, Th. 4.8], the origin of the system e, K, L is uniformly stable. Furthermore, by [16, Th. 8.4], e converges asymptotically to zero for all initial values of e, K, L.
Remark 1: Solving (13)–(15) for P and ρ is a constrained linear matrix inequality (LMI) problem and can only be solved (excluding trivial cases) by a numerical LMI solver [18].
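The trace manipulations used in the proof are elementary and can be sanity-checked numerically; the snippet below is an illustration only, not part of the proof, and the matrices are arbitrary random examples.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))
G = rng.standard_normal((3, 2))
H = rng.standard_normal((2, 3))

# Linearity of the trace: Tr(X + Y) = Tr(X) + Tr(Y).
trace_sum_holds = np.isclose(np.trace(X + Y), np.trace(X) + np.trace(Y))

# Invariance under cyclic permutations: Tr(GH) = Tr(HG).
cyclic_holds = np.isclose(np.trace(G @ H), np.trace(H @ G))

# A 1x1 matrix (a scalar) equals its own trace, as used for the epsilon terms.
scalar = np.array([[2.5]])
scalar_equals_trace = (np.trace(scalar) == scalar[0, 0])
```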
Remark 2: The LMI problem is convex [18, Ch. 2], i.e., all feasible solutions to the LMI lie in a convex set.
Remark 3: If the term ε_1 is removed from the update laws (17), (18), which removes the influence from the system (3), (4), the resulting update laws are identical to those used in the CRM method, i.e., our method reduces to the CRM method by excluding the feedback from the open-loop reference model.
Theorem 1 might appear to require knowledge of ΛK*^T, which is unknown. This is not the case. It is sufficient to know bounds on BΛK*^T (see Section II).
In Section VI, we illustrate through an example that exact knowledge of ΛK*^T is not necessary (and illustrate other aspects of the novel method as well).

V. BOUNDS ON ERROR SIGNALS
The choice of control parameters (as long as they satisfy the criteria of Theorem 1) does not affect the steady-state behavior of the closed-loop system. However, it does affect the transient behavior.
Key aspects are the rate of convergence, the amount of oscillations in the error signal e, and the oscillations in the adaptive gains K and L.
We can use Theorem 1 to find an upper bound on the function norms of the errors, which depends on the transient behavior.However, these bounds are likely to be highly conservative.
We will use the L_2 and L_∞ norms; a similar procedure was used in [13]. We use the L_2 norm defined as [16, Ch. 5.1] ‖z‖_{L_2} = (∫_0^∞ ‖z(t)‖_2^2 dt)^{1/2} for some square-integrable function z : [0, ∞) → R^n; this is roughly analogous to the energy in the signal [16, Ch. 5.3]. We use the L_∞ norm, which we take as [16, Ch. 5.1] ‖z‖_{L_∞} = sup_{t≥0} ‖z(t)‖_2, using the Euclidean 2-norm, for some piecewise continuous bounded function z. To improve readability, we slightly abuse notation and write V(t) = V(e(t), K(t), L(t)) and V(0) = V(e(0), K(0), L(0)). Furthermore, λ_min(·) and λ_max(·) are the smallest and largest eigenvalues of a matrix, respectively.
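Discrete-time approximations of these two norms, as one would compute them from simulation data, can be sketched as follows; z is a sampled signal with one row per time step, and the Riemann-sum approximation is ours.

```python
import numpy as np

def l2_norm(z, dt):
    """Discrete approximation of the L2 norm:
    sqrt( integral ||z(t)||_2^2 dt ), with z of shape (T, n)."""
    return np.sqrt(np.sum(np.linalg.norm(z, axis=1) ** 2) * dt)

def linf_norm(z):
    """Discrete approximation of the L-infinity norm: sup_t ||z(t)||_2."""
    return np.max(np.linalg.norm(z, axis=1))
```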
From the definition of V, we have that λ_min(P)‖e‖_2^2 ≤ e_1^T P e_1 + e_2^T P e_2 ≤ V(t) ≤ V(0), since V is a nonincreasing function (Theorem 1). Thus ‖e(t)‖_2 ≤ (V(0)/λ_min(P))^{1/2} for all t, which is (34).
The L_∞ norm of x̂ can be found by first noting the bound below. From the proof of [16, Th. 5.1], we have (42), where c_1, c_2 are as in (38) and P, Q_1 are as in (13). Using (42), ‖e_2(t)‖_2 ≤ ‖e(t)‖_2 ≤ ‖e‖_{L_∞}, ‖r(t)‖ ≤ ‖r‖_{L_∞}, L = ρB, and the fact that c_2 and ρ are positive, we obtain a bound which implies (35).

TABLE I PARAMETERS FOR THE SYSTEM (44). FROM [19]
The L_2 norms of K and L can now be found. From the above and (17), we obtain a bound which implies (36). Starting from (18) and using the exact same procedure, we obtain (37).
It is desirable to reduce the L_2 and L_∞ norms listed in Corollary 1. While we cannot choose e(0), K(0), or L(0) (which enter into V(0)), we can choose P, Γ_k, and Γ_l, which also influence V(0).
Increasing ‖Γ_k‖_2 and ‖Γ_l‖_2 reduces the L_2 and L_∞ norms of e, but adversely affects the L_2 norms of K and L.
Adjusting the eigenvalues of P, Q_1, and M appropriately will reduce all bounds on the norms discussed in Corollary 1. However, doing so is difficult in light of also needing to satisfy the matching criterion (13)–(15). Even so, optimizing the choice of P, Q_1, and M w.r.t. the results of Corollary 1 can be incorporated into the LMI solver that is needed to solve (13)–(15) (Remark 1).

VI. SIMULATION EXAMPLE
We illustrate the results of the article with a simulation example. The example (slugging) is drawn from multiphase flow.
Slugging is a phenomenon that often occurs in pipeline-riser systems carrying multiple phases, e.g., gas and liquid. It is characterized by the liquid blocking the pipe until the gas pressure has built up enough to dislodge the mass of liquid (the "slug") at great speed. Slugging is quite common in offshore oil and gas production. It is highly undesired, as it causes large variations in pressure and flow rate, which may damage equipment and reduce production [19]. One way to suppress the slugs, and hence enable operation in a desirable but unstable operating region, is to actively control the pressure at the riser bottom with the topside choke valve [19]. In [19], a second-order linear approximation of the dynamics between the topside choke valve (input) and the riser-bottom pressure (output), linearized around a given valve opening, is identified as (44). The parameters of this transfer function are not truly constant, but change based on the operating point and the valve opening. Hence, designing a static controller with guaranteed stability properties for this system is challenging.
Experimentally obtained system parameters from a laboratory-scale system (from [19]) are given in Table I, taken at an operating point of approximately 20% valve opening and a pressure of 26 kPa. The input is the deviation from 20% and the output is the deviation from 26 kPa; time is in seconds. Note that this system is open-loop unstable.
The transfer function (44) can be realized as a second-order state-space model, e.g., as in (45), where we assume that β_0, β_1 are known and constant, and that α_1, α_0, and Λ = λ ∈ R are constant unknowns satisfying (46), where the (lower and upper) bounds α_1, ᾱ_1, a, b, ā, and b̄ are known. Here, the values chosen are
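A state-space realization of a second-order transfer function with this structure can be obtained with a standard routine. The coefficients below are placeholders for illustration, not the identified values of Table I; the negative first-order denominator coefficient mirrors the open-loop instability of the article's system.

```python
import numpy as np
from scipy.signal import tf2ss

# Hypothetical coefficients of G(s) = (b1 s + b0) / (s^2 + a1 s + a0).
# a1 < 0 makes the open loop unstable, as in the article's slugging model.
b1, b0 = 1.0, 2.0
a1, a0 = -0.1, 0.5

# Controllable-canonical-form realization (A, B, C, D).
A, B, C, D = tf2ss([b1, b0], [1.0, a1, a0])
```

The eigenvalues of A coincide with the poles of the transfer function, so the instability of the identified model carries over directly to the realization.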

A. Reference System
For this system, we choose the reference model parameters as in (47). While we can choose α_{m,1} and α_{m,0}, the true values of ΛK*^T are unknown since α_1 and α_0 are unknown. Regardless, (46) can be converted to an equivalent bound on ΛK*^T, i.e., the set D of (48). Note that this reference system does not have a steady-state gain of 1; hence, we premultiply the reference signal r with a constant k_g = α_{m,0}/β_0 ≈ 12.20 to ensure correct steady-state values at the output of the reference model. With (49), the true value of ΛK*^T is given by ΛK*^T = [1.0019, 0.0412]. Note that while A and A_m have very different parameters, ΛK*^T is not particularly large.
It is now possible to numerically find all P, ρ that satisfy (13)–(15) for any ΛK*^T in some set, as the solutions to the LMI problem form a convex set (Remark 2).

B. SPR Condition
For the A_m of (47), all solutions to (13) and (14) can be parameterized by a scalar p. Choosing, e.g., p = 0.0047 makes P and Q_1 positive definite, but this is not the only allowable value.

C. Solutions to the LMI
To verify whether (15) has a solution, we use a two-step procedure. The first step is to find candidate ρ and p that satisfy (13)–(15) for values of ΛK*^T on some domain D̄. The second step is to verify that the candidates are suitable for the desired range of values of ΛK*^T [the set D of (48)]. Since the LMI is convex, performing these steps numerically on a discrete grid is sufficient.
Step 1 was done using CVX in MATLAB. For some grid points, the LMI does not, as expected, have a solution.
The largest value of ρ found over all the grid points where the LMI had a solution is ρ ≈ 9131, and the corresponding p was p ≈ 0.0046. These are chosen as the initial candidate values.
Step 2: Using the same uniformly spaced grid as before (but now with 100 points) and the fixed values ρ = 9131 and p = 0.0046, we verified for which values of ΛK*^T the matrix M > 0. The set where the LMI has a solution is significantly larger than the set D of (48); however, this indicates an unnecessarily aggressive choice of ρ. The choice of, e.g., ρ ≈ 25 and p ≈ 0.0055 does not guarantee stability for parameters in D. The choice ρ ≈ 203, p = 0.0039, however, does, although the range of parameters for which the controller works is smaller than with the first ρ, p pair.
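The grid-based verification of Step 2 amounts to testing positive definiteness of M over a parameter grid. Since the article's M depends on P, Q_1, ρ, and ΛK*^T, which are not reproduced here, the sketch below uses a hypothetical stand-in M(θ) purely to illustrate the procedure.

```python
import numpy as np

def is_pos_def(M):
    """Positive-definiteness test via Cholesky factorization."""
    try:
        np.linalg.cholesky(M)
        return True
    except np.linalg.LinAlgError:
        return False

# Hypothetical stand-in for the parameter-dependent matrix M of (15);
# this M(theta) is positive definite exactly for |theta| < 1.
def M_of(theta):
    return np.array([[1.0, theta], [theta, 1.0]])

# Step-2-style check: test M(theta) > 0 on a uniform grid of parameter values,
# keeping the feasible subset (cf. the blue region of Fig. 2).
grid = np.linspace(-2.0, 2.0, 100)
feasible = [theta for theta in grid if is_pos_def(M_of(theta))]
```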
The results of the tuning are illustrated in Fig. 2.
The area where M > 0 with the chosen values of ρ and p is shown in blue. This is the domain D̄, i.e., the range of parameter values for which the system is guaranteed to be stabilized. As long as the unknown system parameters are in this set, the controller works.
Shown in red in Fig. 2 is the domain D, i.e., the range of parameter values for which we want the system to be stable. As we can see from the figure, D ⊆ D̄ for appropriate choices of ρ, p. Therefore, for the system (45) with (46), the MRACO method will work with parameters anywhere in the allowable range, using ρ = 203, p = 0.0039.

D. Simulation Parameters
We implemented the system in Simulink with the parameters given in Table I, i.e., the open-loop system is unstable. The control objective is to have the output of the system track a rectangular pulse reference signal varying between −0.5 and 1 with a period of 800 s. The system is linearized around a choke-valve opening of 20%; hence, we saturated the calculated input between −20% and 80% to respect the limits of the actual choke valve. We used ρ = 9131, and the adaptation gains were chosen accordingly. While the proof of Theorem 1 only holds for LTI systems, we tested the robustness of the method by performing two changes in the system parameters. At t = 600 s, the parameters α_1 and α_0 were increased by 0.6 and 0.1, respectively. At t = 1000 s, they returned to their original values.

E. Simulation Using MRACO
The proposed controller and observer were implemented in Simulink on the system presented above. The initial states of the system are set to x_1(0) = −20.83, x_2(0) = −60.98, i.e., the initial output of the system is y(0) = 0.5, while the initial states of the observer and reference model are all set to zero. The results of the simulation are shown in Fig. 3. Note that the plot shows actual pressure and valve position; y = 0 corresponds to 26 kPa, and u = 0 corresponds to 20%.

F. Simulation Using Closed-Loop Reference Model Adaptive Control
For comparison, we also simulated an output feedback CRM controller for the same system. Other traditional methods such as MRAC or PID are either not applicable to this scenario or outperformed by CRM [13]. We followed the CRM procedure presented in [10], summarized here.
In this example, parameters with the same names as in the MRACO simulation take the same numerical values.
CRM assumes an open-loop system of the same form as in (1) and (2), but the reference model and observer are combined in the form of an observer. The controller and the update laws (56), (57) follow [10], where e_y = y − y_{m,c}. A feedback gain L_s is chosen such that the resulting transfer function is SPR, where a = C^T B and ρ_c > 0 can be chosen freely; L_s and ρ_c will be used to compute L_c per [10]. Since the pair (A_m, C) is observable and the system is minimum phase, we can place the poles of A_m − L_s C^T freely. We can now find a P_c = P_c^T > 0 and a Q_s = Q_s^T > 0. Choosing L_c = L_s + ρ_c BB^T C and ensuring (59)–(61), where λ̄ ≥ sup‖Λ‖ and ǩ ≥ sup‖K*‖ are known, ensures that the error signal e_c = x − x_{m,c} is globally bounded and that lim_{t→∞} e_c = 0 [10].
Finding ρ_c is an iterative process. First, a candidate value must be chosen, and (59)–(61) must be solved and satisfied. We choose ρ_c = 85. The reference system dynamics are as in (49); hence, L_s ≈ [−7000, −83.3]^T ensures a pole-zero cancellation in the left half-plane and places the pole of the resulting closed-loop transfer function at −ρ_c. The bounds for Λ and K* are set to λ̄ = 1 and ǩ = 2, which implies a similar uncertainty on the true system parameters as in the MRACO example.
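The observer-gain computation via pole placement can be sketched as follows. The matrices and desired pole locations below are hypothetical, not the article's identified values; the placement uses duality, i.e., placing the poles of A_m^T − C L_s^T and transposing the resulting gain.

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical second-order example; A_m, C, and the desired poles are placeholders.
A_m = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0], [0.0]])          # output map y = C^T x
desired = np.array([-4.0, -5.0])      # desired observer poles

# Place the poles of A_m^T - C K; then L_s = K^T gives eig(A_m - L_s C^T) = desired.
res = place_poles(A_m.T, C, desired)
L_s = res.gain_matrix.T
```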
The corresponding solution for P_c and Q_s is

G. Discussion
From Fig. 3, we see that the proposed method is able to track the output of the reference model y_m and that the errors e_1 and e_2 both converge to zero. The initial deviation, the setpoint changes, and both parameter changes (at 600 and 1000 s) are handled very well, and we see only very minor and short-lived deviations from the desired reference model trajectory.
The simulation results using the CRM method are also shown in Fig. 3. The CRM method is also able to track the output of the combined observer and reference model, but we note that the output of this closed-loop reference model deviates significantly from the original reference model trajectory at t ≈ 810 s (highlighted in Fig. 4), i.e., the reference model is acting more as an observer than as a reference trajectory. This behavior, which is not present with the proposed MRACO method, is described in [14] as having a potential negative impact when the open-loop plant is unstable (as is the case here), since the reference model is then tracking a divergent plant. The behavior is caused by the error signal driving the update laws (56), (57) being very small, i.e., larger adaptation gains are necessary [14].
There is a clear difference between the observer feedback gains used in the two methods. Increasing the observer feedback ρ for MRACO increases the convergence rate of the observer and does not cause any deviation of the reference model from its expected trajectory, as this is decoupled from the observer. It is, however, not recommended to choose ρ too large, especially if the measured signal is contaminated with noise.

In order to provide a fair comparison of the two methods, we can compare the deviation between the reference signal r and the system output y. Another interesting comparison is the deviation between the output of the system, y, and the output of the reference model without the injection term, i.e., y_m. This signal is the actual desired trajectory, and any deviation from it can be considered an error. The CRM method allows deviations from y_m in order to decrease oscillations. Whether deviation from the expected trajectory or oscillations in the input and output signals is least desirable largely depends on the use case.
Table II shows the integrated absolute error (IAE) between the signals of interest in the two simulations. The proposed method (MRACO) has a significantly lower IAE between the system output (y) and the original reference model output (y_m). MRACO also has a slightly lower IAE between the system output (y) and the reference (r). We see that the error signals that are specific to each method (y − ŷ for MRACO and y − y_{m,c} for CRM) are very low. These results are as expected.
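The IAE metric of Table II can be computed from sampled simulation data with a simple Riemann sum; the helper below is a sketch, with dt denoting the sampling interval.

```python
import numpy as np

def iae(y, ref, dt):
    """Integrated absolute error between two sampled scalar signals:
    IAE ~ sum |y(t_k) - ref(t_k)| * dt."""
    return np.sum(np.abs(np.asarray(y) - np.asarray(ref))) * dt
```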
Table III shows the theoretical bounds based on (33)–(37) and the actual values of the signals from the simulation (note that the bounds are on e, not on ε_1 = C^T e_1 and ε_2 = C^T e_2). The actual values are clearly substantially lower than the theoretical bounds.

VII. CONCLUSION
In this article, we presented a novel method (MRACO) for designing an output feedback controller and an observer for linear time-invariant systems with unknown dynamics. The controller is a standard model reference adaptive controller, but the adaptation laws include the observer error as well as the tracking error, and the observer states are used in the controller so that the method does not require full-state feedback. The observer dynamics are the same as those in the reference model. A procedure for finding the observer feedback gain and the reference model, based on solving a linear matrix inequality, is also presented.
The presented method has some similarities with what is known as adaptive control with closed-loop reference model (CRM), but the key difference is that in our method the reference model and observer are separated; in CRM, they are combined. This means that with MRACO, unlike with CRM, the output of the reference model is at all times as specified. Furthermore, the adaptation laws are different in the two methods.
Through Lyapunov analysis, we proved that the differences between the system state, reference model state, and observer state all converge to zero. A transient analysis was performed, and upper bounds on error signal and adaptation gain oscillations were derived.
Our method was compared to a CRM controller in simulation. Our method has lower tracking error and more closely tracks the output of the reference model, but at the cost of slightly higher oscillations. A method that combines the benefits of both methods with none of the drawbacks has not yet been developed.
Simulations using our method suggest that it is capable of stabilizing systems not encompassed by our mathematical proof, which implies that our proof may be somewhat conservative. Extending the method to classes of nonlinear systems is considered future work.

Fig. 1. Structure of the novel controller and observer method.

Fig. 3. Simulation results using the proposed MRACO and the CRM method.
where P_c B = CC^T B. The lower bound ρ̄*_c = λ̄²ǩ²/λ_min(Q_s) = 81.7 < ρ_c, and hence the condition for stability is satisfied. The adaptive controller was implemented in Simulink, with adaptation gains Γ_{k,c} = diag(200, 200) and Γ_{l,c} = 50. The system and the reference model are initialized as in the simulation with the proposed method. The results of the simulation are also shown in Fig. 3.
where p is chosen so that P and Q_1 are positive definite. Inserting the values from Table I and (49), we obtain the solutions above. A very large ρ would lead to bad noise-filtering properties of the observer, especially if the measured signal is contaminated with noise.