Adaptive Estimation and Control With Online Data Memory: A Historical Perspective

Online data memory is essential for adaptive estimation and control as it can enhance the performance and robustness of adaptive systems compared to adaptive systems without data memory. We provide an overview of four data memory-driven parameter estimation schemes for adaptive systems from a historical perspective, including forgetting-data memory regression extension (MRE), full-data MRE, discrete-data MRE, and interval-data MRE. For clear presentation and better understanding, a general class of nonlinear systems with linear-in-the-parameter uncertainties is applied as a unifying framework to demonstrate the motivation, synthesis, and characteristics of each MRE scheme for parameter estimation in adaptive control. Intensive comparisons of the four MRE schemes are provided to reveal their technical natures, and real-world applications are discussed to show their practicability. It is concluded that all the MRE schemes can achieve exponential parameter convergence under relaxed excitation conditions rather than the classical condition of persistent excitation which is too stringent to satisfy in practice. The distinctive features of interval-data MRE termed composite learning are highlighted with respect to computational simplicity, estimation accuracy, robustness against perturbations, and widespread real-world applications to robot learning and control. Possible directions for future research in this area are suggested to conclude this survey.


I. INTRODUCTION
As one of the major methodologies for handling uncertainties in both linear and nonlinear systems, adaptive estimation and control has evolved for more than 60 years and remains among the most active research fields in the control community, with typical survey papers from the last decade available in the literature.

Yongping Pan is with the School of Advanced Manufacturing, Sun Yat-sen University, Shenzhen 518100, China (e-mail: panyongp@mail.sysu.edu.cn).
Parameter convergence (i.e., accurate parameter estimation) in adaptive systems provides accurate online modeling for high-performance control and other high-level tasks, such as monitoring, planning, prediction, and diagnosis. Specifically, exponential parameter convergence usually implies exponential stability of adaptive control systems, which yields superior exponential tracking and robustness against various perturbations, such as measurement noise, unmodeled dynamics, and external disturbances [14]. Classic adaptive systems guarantee parameter convergence under a condition of persistent excitation (PE) [15]. However, PE requires that system states contain sufficiently rich spectral information at all times, which is too hard to satisfy in practice and severely affects the performance and robustness of adaptive systems [16]. Even when PE exists, parameter convergence may still be slow, as the convergence rate depends strongly on the excitation strength [17].
The requirement of PE for parameter convergence in adaptive systems derives from a rank-1 condition: the rank of an adaptive law that utilizes only instantaneous data is always at most one, so system states must continually span the spectrum of the parameter space to guarantee parameter convergence [18]. Interestingly, there is a long history of exploiting online data memory to overcome the rank-1 condition and improve parameter estimation in adaptive systems, where the first effort can be found in [19]. The key idea of this approach is that regressor extension and integral-type operations are utilized to exploit online data memory (termed memory regressor extension (MRE) in [10]) for parameter estimation, such that parameter convergence can be achieved under weaker excitation conditions, such as parameter-dependent PE, interval excitation (IE), and sufficient excitation (SE). MRE improves the smoothness and robustness of parameter estimation and makes it possible to set an arbitrarily high convergence rate, subject to noise levels and sampling periods. Thus, better adaptive control performance is expected from faster parameter adaptation without increasing control gains.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Notations: R, R^+, R^n, and R^{m×n} represent the spaces of real numbers, positive real numbers, real n-vectors, and real m × n matrices, respectively; L_2 and L_∞ denote the spaces of square-integrable and bounded signals, respectively; λ_min(A) denotes the minimal eigenvalue of A; σ_min(A) := √(λ_min(A^T A)) is the minimum singular value (MSV) of A; det{B} is the determinant of B; adj{B} is the adjugate matrix of B; ‖x‖ is the Euclidean norm of x; and Ω_c := {x : ‖x‖ ≤ c} denotes the ball of radius c, where m and n are positive integers. In the subsequent sections, the arguments of a function may be omitted when the context is sufficiently clear, and the time variable t of a signal may be omitted for simplicity, except where the dependence of a signal on t is crucial for clear presentation.

A. Preliminaries
For the sake of clear presentation and better understanding, consider a general class of nth-order nonlinear systems with LIP uncertainties as follows [41], [42], [43]:

ẋ = f(x) + g(x)u + Φ^T(x, u)θ, ‖θ‖ ≤ c_θ, (1)

where x ∈ R^n is the system state, u ∈ R^m is the control input, f and g are known vector fields, Φ : R^n × R^m → R^{N×n} is a nonlinear regression matrix, θ ∈ R^N is an unknown parameter vector, and c_θ ∈ R^+ is a certain constant. Assume that x is measurable,^1 f, g, and Φ are known a priori, θ is unknown but constant (or slowly time-varying), and there exists an admissible u such that the closed-loop system has uniform stability in the sense that all signals involved are of L_∞ [41]. Stable adaptive control designs for (1) with particular structures are available in many textbooks [15], [128], [129]. The following definitions are provided to facilitate the subsequent analysis.^2
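For reference, the excitation conditions invoked by Definitions 1-3 can be written in the following standard forms, where τ_d, σ, t_e ∈ R^+ denote an excitation duration, an excitation strength, and an excitation time, respectively, and Φ_f is the filtered regressor defined in Section II-B (a sketch of the usual statements in this literature; the exact definitions in the original survey may differ in minor details):

```latex
% Persistent excitation (PE): excitation persists for all time
\int_{t-\tau_d}^{t} \Phi_f(\tau)\,\Phi_f^{T}(\tau)\,\mathrm{d}\tau \;\ge\; \sigma I,
  \quad \forall t \ge \tau_d \qquad \text{(PE)}

% Interval excitation (IE): excitation over one interval ending at t_e
\int_{t_e-\tau_d}^{t_e} \Phi_f(\tau)\,\Phi_f^{T}(\tau)\,\mathrm{d}\tau \;\ge\; \sigma I
  \qquad \text{(IE)}

% Sufficient excitation (SE): the integral starts from t = 0
\int_{0}^{t_e} \Phi_f(\tau)\,\Phi_f^{T}(\tau)\,\mathrm{d}\tau \;\ge\; \sigma I
  \qquad \text{(SE)}
```

The nesting PE ⟹ IE ⟹ SE-type relaxation discussed in Remark 1 follows directly from these forms.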
Remark 1: From Definitions 1-3, one knows that: 1) the PE condition is the strongest, as it requires excitation to be persistent for all time t ≥ 0; 2) the IE condition is strictly weaker than PE, as it only requires excitation to occur at a certain time t = t_e, which may be fulfilled during transient processes; 3) the SE condition is even easier to satisfy than IE, as its integral starts from t = 0 instead of t = t_e − τ_d so as to utilize online data fully. However, exploiting all online data in SE has several drawbacks: 1) it loses the flexibility of handling parameter variations; 2) parameter adaptation may become unbounded as the time t evolves, since Φ_f Φ_f^T is positive-semidefinite; 3) modeling inaccuracy can be accumulated, resulting in deteriorated estimation accuracy.

B. Least-Squares Estimation
In practice, the time derivative ẋ is usually immeasurable for physical plants. If the upper equation in (1) with an estimated ẋ is applied directly to estimate the unknown parameter θ, the estimation accuracy of θ is subject to measurement noise. One way to avoid this problem is to implement low-pass filtering on (1) such that ẋ is not required in parameter estimation [13]. For the purpose of simplicity, a stable linear time-invariant (LTI) filter L(s) := α/(s + α) is applied to each side of the first "=" in (1), resulting in a filtered state dynamics

χ = Φ_f^T(x, u)θ, (2)

where α ∈ R^+ is a regressor filtering constant, s is the complex Laplace operator, χ(t) := sL(s)[x] − L(s)[f(x) + g(x)u] and Φ_f(t) := L(s)[Φ(x, u)] denote filtered elements in hybrid time-frequency domain notations, and sL(s) = αs/(s + α) is termed a dirty derivative filter (DDF).
Let χ̂(t) ∈ R^n, x̂(t) ∈ R^n, and θ̂(t) ∈ R^N be estimates of χ(t), x(t), and θ, respectively. Define a state prediction error ε(t) := χ(t) − χ̂(t) and a parameter estimation error θ̃(t) := θ − θ̂(t). A prediction model of χ is given by

χ̂ = Φ_f^T(x, u)θ̂. (3)

^1 When measuring the entire vector x is impossible in practice, parameter estimation-based observers can be applied to estimate x and θ jointly, e.g., see [130], which is out of the scope of this survey.
^2 The value of σ is regarded as an excitation strength, which can be determined by the MSV σ_min.
Subtracting (3) from (2), one obtains

ε = Φ_f^T θ̃, (4)

which implies that the state prediction error ε, available from the measurable signals x and u, indeed represents a modeling error and provides an indirect measure of the estimation error θ̃. The standard least-squares update law of θ̂ is given by

dθ̂/dt = ΓΦ_f ε, (5a)
dΓ/dt = −ΓΦ_f Φ_f^T Γ, Γ(0) = Γ_0, (5b)

where Γ_0 ∈ R^{N×N} denotes a positive-definite matrix of initial learning rates, and (5b) is equivalent to d(Γ^{-1})/dt = Φ_f Φ_f^T. The following theorem shows the convergence results of the least-squares parameter update law (5).
Theorem 1 [91]: Let [0, t f ) with t f ∈ R + be the maximal interval of existence of solutions for the system (1).Then, the estimation law of θ in (5) guarantees that:
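To illustrate the least-squares update law (5), the following minimal sketch simulates (5a)-(5b) on a two-parameter filtered regression χ = Φ_f^T θ with a persistently exciting regressor; the regressor, gains, and Euler integration scheme are illustrative choices, not from the survey.

```python
import numpy as np

def simulate_least_squares(theta=np.array([2.0, -1.0]), T=20.0, dt=1e-3):
    """Euler simulation of the least-squares law (5) on chi = Phi_f^T theta."""
    theta_hat = np.zeros_like(theta)      # parameter estimate
    Gamma = 10.0 * np.eye(theta.size)     # initial learning-rate matrix Gamma_0
    for k in range(int(T / dt)):
        t = k * dt
        Phi_f = np.array([np.sin(t), np.cos(t)])      # PE regressor (n = 1)
        eps = Phi_f @ (theta - theta_hat)             # eps = Phi_f^T tilde_theta, cf. (4)
        theta_hat = theta_hat + dt * Gamma @ Phi_f * eps               # (5a)
        Gamma = Gamma - dt * Gamma @ np.outer(Phi_f, Phi_f) @ Gamma    # (5b)
    return theta_hat

print(simulate_least_squares())  # approaches the true parameters [2.0, -1.0]
```

Note how Γ shrinks monotonically as excitation accumulates, which is why least squares is smooth but slows down over time.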

C. Composite Adaptive Control
To connect the adaptive estimation results in Theorem 1 with control problems, we need to introduce a closed-loop tracking error model. Since the design of the control law u depends on the structure of the system (1) and there is no unified formula of u for different system structures, we directly assume that there exists an admissible u for (1) that renders the following uniformly stable tracking error model (as in [41]):

de/dt = Ae + Φ_c^T(x)θ̃, (6)

where e ∈ R^n is a tracking error, Φ_c : R^n → R^{N×n} denotes a known nonlinear regression matrix for control, and A ∈ R^{n×n} is a strictly Hurwitz matrix. Incorporating least-squares parameter estimation with direct adaptive control results in a hybrid direct and indirect adaptive scheme termed composite adaptive control (CAC) [13]. More specifically, combining (5a) with the feedback of e, one gets a composite adaptive law

dθ̂/dt = Γ(k_a Φ_c e + Φ_f ε), (7)

where k_a ∈ R^+ is a weighting factor. The following theorem shows the stability and convergence results of CAC.
Theorem 2 [13]: The closed-loop system (6) with (7) has global uniform asymptotic stability in the sense that: 1) all closed-loop signals are of L_∞, ∀t ≥ 0; 2) e, ε → 0 asymptotically; 3) e, θ̃ → 0 exponentially when PE in Definition 1 exists for certain constants τ_d, σ ∈ R^+. Theorem 2 extends the estimation results in Theorem 1 on the convergence of the prediction error ε and the estimation error θ̃ to the control problem, which additionally considers the convergence of the tracking error e. The PE requirement of the CAC scheme in Theorem 2 to achieve exponential stability with the convergence of θ̃ results from the fact that only instantaneous data at the current time t are utilized to update the parameter estimate θ̂ in (7). In the following sections, we will review four MRE schemes^3 for parameter estimation in adaptive systems, all of which can achieve the exponential convergence of θ̃ without the classical PE condition in Definition 1.
Remark 2: To explain the smooth behavior of parameter estimation in composite adaptation, it is convenient to interpret the composite adaptive law (7) as passing the error signals through a time-varying low-pass filter, in which the parameter search goes along an averaging direction specified by low-frequency components in the tracking error e. The superiority of CAC comes from two aspects [13]: 1) the prediction error ε, which contains implicit information on the estimation error θ̃, provides extra effort to enhance the convergence of θ̃ as well as e; 2) due to the averaging effect, the learning gain Γ_0 can be set higher for faster convergence of θ̃ and e without exciting high-frequency unmodeled dynamics.
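As a concrete illustration of the composite mechanism in (7), the sketch below simulates a scalar plant ẋ = θx + u under a certainty-equivalence control law, with the parameter update driven by both the tracking error e and a filtered prediction error ε; the plant, reference trajectory, filter constant, and gains are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

def simulate_cac(theta=1.5, T=30.0, dt=1e-3, alpha=10.0, k=5.0, ka=1.0, gamma=5.0):
    """Composite adaptation on the scalar plant xdot = theta*x + u, x_d = sin t."""
    x, x_f, u_f, theta_hat = 0.0, 0.0, 0.0, 0.0
    for i in range(int(T / dt)):
        t = i * dt
        xd, xd_dot = np.sin(t), np.cos(t)
        e = x - xd
        u = xd_dot - k * e - theta_hat * x       # certainty-equivalence control
        # chi = L(s)[xdot - u] = Phi_f * theta, with Phi_f = L(s)[x] = x_f
        chi = alpha * (x - x_f) - u_f
        eps = chi - x_f * theta_hat              # filtered prediction error
        theta_hat += dt * gamma * (x_f * eps + ka * x * e)   # composite law, cf. (7)
        x_f += dt * alpha * (x - x_f)            # filter states of L(s)
        u_f += dt * alpha * (u - u_f)
        x += dt * (theta * x + u)                # plant integration
    return theta_hat

print(simulate_cac())  # approaches the true parameter 1.5
```

Both error channels shrink the same Lyapunov function here, which is the source of the smoother, faster adaptation described in Remark 2.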

III. FORGETTING DATA-DRIVEN MRE
Forgetting-data MRE was initially proposed to design adaptive state observers in [19] and was extended to MRAC for a class of uncertain LTI systems in [20], where the key idea is to exploit forgetting data through linear filtering for parameter estimation. In [20], a passivity-based adaptive control method with forgetting-data MRE was proposed for robot manipulators, where the parameter update law contains standard gradient, composite adaptation, and forgetting-data MRE, and passivity-based analysis is employed to make the choice of parameter update flexible. Compared with standard gradient estimation, the method in [20] has several major merits: 1) an averaging effect is induced by exploiting data memory, which provides smoother parameter estimation and stronger robustness against measurement noise; 2) the exponential convergence of the estimation error θ̃ is achieved under a weakened PE condition; 3) the convergence rate of θ̃ can be specified by a learning rate independent of the excitation strength, which allows parameter estimation without waste of information even if excitation occurs only in a short burst. To simplify the illustration, the method in [21] is presented based on the system (1) with (2)-(4).^4 Multiplying each side of (2) by Φ_f, one gets an extended regression equation

Φ_f χ = Φ_f Φ_f^T θ. (8)

Applying a stable filter H(s) := λ/(s + λ) to (8) yields a generalized regression equation with forgetting data memory

δ = Ωθ, (9)

where δ := H(s)[Φ_f χ], Ω := H(s)[Φ_f Φ_f^T] is an excitation matrix, and λ ∈ R^+ is a forgetting factor. Define a generalized prediction error

ξ := δ − Ωθ̂. (10)

From (9) and (10), one gets ξ = Ωθ̃.
Using dΩ/dt = −λΩ + λΦ_f Φ_f^T, ξ = Ωθ̃, and ε in (4), one gets dξ/dt = −λξ + λΦ_f ε + Ω dθ̃/dt. Thus, a forgetting data-driven update law of θ̂ can be given by [21]

dθ̂/dt = γξ, (11a)
dξ/dt = −λξ + λΦ_f ε − γΩξ, (11b)
dΩ/dt = −λΩ + λΦ_f Φ_f^T (11c)

with ξ(0) = 0 and Ω(0) = 0, where (11b) is equivalent to (10), and γ ∈ R^+ is a learning rate. Following [21, Lemma 1], one gets dθ̃/dt = −γΩθ̃, so the exponential convergence condition of θ̃ is

Ω(t) ≥ σI, ∀t ≥ τ_d, (12)

where the convergence rate of θ̃ can be specified to be arbitrarily high in theory by setting γ independently of σ. The method in [21] can achieve even smoother and faster trajectory tracking and parameter estimation than the CAC scheme in Section II-C, as the averaging effect is valid under the parameter-dependent PE condition (12), which is weaker than the classical PE condition in Definition 1. The following theorem shows the convergence results of the MRE-driven parameter update law (11).
Theorem 3 [21]: Consider the system (1) with an admissible control law u that renders a stable closed-loop system. Then, the estimation law of θ̂ in (11) guarantees that: 1) θ̂ ∈ L_∞ and ‖θ̃‖ is nonincreasing, ∀t ≥ 0, independent of any excitation condition; 2) θ̃ → 0 exponentially, ∀t ≥ 0, if the new PE condition in (12) holds for certain constants τ_d, σ, λ ∈ R^+. A robustness-enhanced version of the method in [21] was proposed in [22] to improve transient tracking while ensuring the arbitrarily fast convergence of θ̃ in theory. In [23], the method in [21] was compared with adaptive control methods under different passivity-based parameter update laws, including standard gradient, modified least-squares, and proportional-integral, to validate its superiority in parameter estimation and control. In [24], a sliding mode control (SMC) term was applied to the parameter update law in [21] to achieve finite-time parameter convergence. In [25], a terminal SMC law was incorporated into the control design in [24] to obtain the finite-time convergence of both tracking and estimation errors. In [26], a nonlinear compensator was proposed to adjust the reference model and reshape the closed-loop system to improve the transient performance of the method in [25]. The forgetting-data MRE was combined with a high-order tuner (HOT) to improve the transient response of adaptive backstepping control in [27], [28], [29], [30], [31], [32], where the high-order time derivatives of the parameter estimates needed are obtained directly by the MRE mechanism [33]. The forgetting-data MRE with a HOT was also applied to improve the transient response of MRAC for a class of discrete LTI systems in [34]. A passivity and immersion-based modified gradient estimator extended by forgetting-data MRE was proposed to improve the transient response and convergence of parameter estimation in [35].
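A minimal estimation-only sketch of forgetting-data MRE follows, implementing the algebraic form ξ = δ − Ωθ̂ of the generalized prediction error (10), with first-order filters realizing Ω = H(s)[Φ_f Φ_f^T] and δ = H(s)[Φ_f χ]; the regressor and gains are illustrative, and the plant/control loop is omitted.

```python
import numpy as np

def forgetting_mre(theta=np.array([2.0, -1.0]), lam=1.0, gamma=10.0,
                   T=10.0, dt=1e-3):
    """Forgetting-data MRE: update theta_hat along gamma*(delta - Omega@theta_hat)."""
    N = theta.size
    theta_hat = np.zeros(N)
    Omega = np.zeros((N, N))   # Omega = H(s)[Phi_f Phi_f^T], forgetting memory
    delta = np.zeros(N)        # delta = H(s)[Phi_f chi]
    for k in range(int(T / dt)):
        t = k * dt
        Phi_f = np.array([np.sin(t), np.cos(t)])
        chi = Phi_f @ theta                      # filtered regression (2)
        Omega += dt * lam * (np.outer(Phi_f, Phi_f) - Omega)
        delta += dt * lam * (Phi_f * chi - delta)
        xi = delta - Omega @ theta_hat           # generalized prediction error (10)
        theta_hat += dt * gamma * xi             # update law, cf. (11)
    return theta_hat

print(forgetting_mre())  # approaches [2.0, -1.0]
```

Because Ω is a forgetting-weighted average, it stays bounded, which is the property Remark 4 exploits for time-varying parameters.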
A forgetting-data MRE and mixing (MREM) method was proposed in [10], where mixing means that the extended regression equation (9) with N dimensions is multiplied by adj{Q} to generate a set of N decoupled scalar equations regarding the N unknown parameters. Asymptotic parameter convergence is established under a condition of square nonintegrability, which is weaker than PE, but exponential parameter convergence still depends on the PE condition in [10]. In [36], the method in [10] was improved to weaken the convergence conditions, where the storing and forgetting of online data are driven by the excitation of the regressor Φ_f in addition to the forgetting factor λ.^6 The MREM-driven update law of θ̂ in [36] is presented based on the system (1) with (2)-(4) as follows:

dθ̂/dt = γΔ(ϕ − Δθ̂), (13a)
dQ/dt = −λΦ_f^T Φ_f Q + Φ_f Φ_f^T, Q(0) = 0, (13b)
dδ/dt = −λΦ_f^T Φ_f δ + Φ_f χ, δ(0) = 0 (13c)

with ε = Φ_f^T θ̃ in (4), Δ := det{Q}, and ϕ := adj{Q}δ, where the notations γ, λ, and δ have the same meanings as those in (11), and (ϕ − Δθ̂) denotes a generalized prediction error. From (13b)-(13c), one gets δ = Qθ, so that ϕ = adj{Q}Qθ = Δθ. Following the proof of [36, Proposition 3.1] yields dθ̃/dt = −γΔ²θ̃, so the exponential convergence condition of θ̃ is

∫_{t−τ_d}^{t} Δ²(τ) dτ ≥ σ, ∀t ≥ τ_d, (14)

which is a PE condition regarding Δ. As the IE of Φ_f implies the PE of Δ [36, Lemma A.1], the exponential convergence of θ̃ only requires the IE of Φ_f. Note that the forgetting factor in the parameter estimation law (13) is λΦ_f^T Φ_f, which implies that the forgetting rate can be driven by the excitation of Φ_f. The following theorem shows the convergence results of the MREM-driven parameter update law (13).
Theorem 4 [36]: Consider the system (1) with an admissible control law u that renders a stable closed-loop system. Then, the estimation law of θ̂ in (13) guarantees that: 1) θ̂ ∈ L_∞ and ‖θ̃‖ is nonincreasing, ∀t ≥ 0, independent of any excitation condition; 2) θ̃ → 0 exponentially on t ∈ (t_e, ∞) if IE in Definition 2 exists for certain constants τ_d, σ, t_e ∈ R^+.

Remark 3: The reason for terming (12) the parameter-dependent PE condition is that it can be fulfilled by choosing the forgetting factor λ in (11b) and (11c) based on a desired trajectory y_d(t) ∈ R^p [21]. In this sense, the new PE condition (12) is weaker than the classical PE condition in Definition 1, where an extreme case is that the former can be fulfilled by transient responses. Note that the condition (12) tends to the SE condition in Definition 3 as λ → 0.
Remark 4: Due to the introduction of the forgetting factor λ, the parameter update law (11) with forgetting-data MRE results in a bounded excitation matrix Ω and is directly applicable to a slowly time-varying or suddenly changing (but piecewise-constant) parameter θ. The convergence rate of the estimation error θ̃ can be specified by the learning rate γ to be sufficiently high in theory, but the increase of γ is limited in practice by perturbations resulting from, e.g., measurement noise, unmodeled dynamics, and external disturbances.
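The mixing step of MREM in (13) can be checked numerically in isolation: multiplying the coupled regression equation δ = Qθ by adj{Q} yields N decoupled scalar equations Δθ_i = ϕ_i with Δ = det{Q}. The matrix Q below is an arbitrary positive-definite example, not one generated by a particular regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([2.0, -1.0, 0.5])
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3.0 * np.eye(3)        # illustrative nonsingular excitation matrix

delta = Q @ theta                     # coupled regression equation, cf. (13)
Delta = np.linalg.det(Q)              # scalar excitation signal Delta = det{Q}
# adj{Q} exists even for singular Q; the inverse is used only as a shortcut here
adjQ = Delta * np.linalg.inv(Q)       # adj{Q} = det{Q} * Q^{-1}
phi = adjQ @ delta                    # phi = adj{Q} Q theta = Delta * theta

print(phi / Delta)                    # recovers theta component-wise
```

Each row of ϕ = Δθ involves only one unknown parameter, which is exactly what makes the scalar convergence analysis via Δ in (14) possible.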

IV. FULL DATA-DRIVEN MRE
Based on the simple notion that parameter information is a function of all input-output data pairs (x, u) from the system (1), a full-data MRE-driven parameter estimation method was proposed for the MRAC of a class of uncertain LTI systems in [38], where the exponential convergence of the estimation error θ̃ is achieved under the SE condition in Definition 3, which is strictly weaker than PE in Definition 1. In [39], the method in [38] was explained as least-squares parameter estimation equipped with a stable LTI filter. In [40], the robustness of the method in [38] was demonstrated in the presence of nonparametric uncertainties with frequency-domain bounds.
The full data-driven update law of θ̂ in [38] is presented based on the system (1) with (2)-(4) as follows:

dθ̂/dt = ξ, (15a)
dξ/dt = −λΩξ + λΦ_f ε, (15b)
dΩ/dt = Φ_f Φ_f^T, Ω(0) = Ω_0, (15c)

with ε = Φ_f^T θ̃ in (4), in which the notations λ, Ω, and ξ have the same meanings as those in (11). Note that Γ_0 = Ω_0^{-1} is the same as that in the least-squares update law (5) but has a major difference in its initial setting: Ω_0 must be set small in (15c), so the inverse Ω^{-1} may exhibit a high gain compared with (5). Noting (15a), one gets dθ̃/dt = −ξ. Applying dθ̃/dt = −ξ, ε = Φ_f^T θ̃, and (15c) to (15b), one gets

dξ/dt = −λΩξ + λΦ_f Φ_f^T θ̃. (16)

Using the Leibniz integral rule with ξ(0) = λΩ(0)θ̃(0), one has that (16) implies ξ = λΩθ̃, resulting in dθ̃/dt = −λΩθ̃ from (15a). Thus, the condition for exponential convergence of θ̃ is

∫_0^{t_e} Φ_f(τ)Φ_f^T(τ) dτ ≥ σI, (17)

which is an SE condition in Definition 3 with respect to Φ_f. The parameter update law (15) is questionable, as the setting ξ(0) = λΩ(0)θ̃(0) is impossible because Ω_0 ≠ 0 and θ̃(0) is unknown.^7 In [41], the method in [38] was extended to the system (1) without the above drawback, in which finite-time parameter convergence is achieved under the SE condition, and robust parameter estimation is demonstrated under bounded perturbations. The method in [41] inherits all merits of the method in [38] with an extra feature: parameter convergence is achieved at any time instant at which SE exists. However, it also has some drawbacks: 1) the invertibility of the excitation matrix Ω has to be checked online, and the inverse Ω^{-1} has to be computed when necessary, which greatly increases the computational cost; 2) the direct calculation of the true parameter θ by inversion is numerically noise-sensitive and results in discontinuous estimation. In [42], the robustness of the method in [41] was analyzed in a hybrid system framework, where the hybrid system contains a globally asymptotically stable compact set with robustness against small perturbations. In [43], a continuous full-data MRE method was developed to guarantee exponential parameter convergence under SE while eliminating the drawbacks of the method in [41]. In [44], the method in [41] was employed to control servomechanisms to improve transient responses, with only simulation verification.
The method in [43] resorts to a state prediction model

dx̂/dt = f(x) + g(x)u + Φ^T(x, u)θ̂_0 + α(x − x̂), (18)

in which θ̂_0 ∈ R^N is an initial estimate of θ, and α ∈ R^+ is a filtering constant. Subtracting (18) from the first row of (1), one obtains a prediction error dynamics

dx̃/dt = −αx̃ + Φ^T(x, u)θ̃_0 (19)

with x̃ := x − x̂ and θ̃_0 := θ − θ̂_0. Let η(t) := αx̃(t) − L(s)[Φ^T(x, u)θ̃_0]. Thus, one gets dη/dt = −αη with η(0) = αx̃(0), such that η ≡ 0 if we set x̂(0) = x(0). Now, another full data-driven update law of θ̂ is given by [43]

dθ̂/dt = γ(δ − Ωθ̂), (20a)
dδ/dt = Φ_f(αx̃ + Φ_f^T θ̂_0), δ(0) = 0, (20b)
dΩ/dt = Φ_f Φ_f^T, Ω(0) = 0, (20c)

where the notations γ and Ω have the same meanings as those in (11), δ ∈ R^N is an auxiliary variable, and (δ − Ωθ̂) is regarded as a generalized prediction error. Since η ≡ 0 implies αx̃ + Φ_f^T θ̂_0 = Φ_f^T θ, one gets δ = Ωθ and, consequently, dθ̃/dt = −γΩθ̃. Therefore, the condition for exponential convergence of θ̃ is the SE condition (17). The following theorem provides the convergence results of the MRE-driven parameter update law (20) with (18).
Theorem 5 [43]: Consider the system (1) with an admissible control law u that renders a stable closed-loop system. Then, the estimation law of θ̂ in (20) with (18) guarantees that: 1) θ̂ ∈ L_∞ and ‖θ̃‖ is nonincreasing, ∀t ≥ 0, independent of any excitation condition; 2) θ̃ → 0 exponentially on t ∈ (t_e, ∞) if SE in Definition 3 exists for certain constants σ, t_e ∈ R^+.

Remark 5: The parameter update law with full-data MRE in (20) makes use of all data memory to ensure exponential parameter convergence under SE in Definition 3, which is strictly weaker than PE in Definition 1. However, as all data memory is utilized via (20c) to update the parameter estimate θ̂, this scheme is not suitable for handling any variation of the parameter θ. Besides, as Φ_f Φ_f^T is positive-semidefinite, the gain of the excitation matrix Ω in (20c) is monotonically increasing, resulting in possibly unbounded adaptation. Moreover, the full-time integral in (20c) can accumulate modeling inaccuracy and deteriorate estimation accuracy.
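The full-data MRE law in (20) can be sketched as below in an estimation-only setting: the integrals Ω = ∫Φ_f Φ_f^T and δ = ∫Φ_f χ accumulate all data with no forgetting, so the estimate keeps converging even after excitation vanishes, which is the SE property. The regressor, exciting only on [0, 2] s, and all gains are illustrative; the prediction model (18) is bypassed by assuming χ is available directly from (2).

```python
import numpy as np

def full_data_mre(theta=np.array([2.0, -1.0]), gamma=3.0, T=10.0, dt=1e-3):
    """Full-data MRE: Omega and delta integrate all data, and the estimate is
    driven by gamma*(delta - Omega@theta_hat). Excitation exists only on [0, 2] s."""
    N = theta.size
    theta_hat = np.zeros(N)
    Omega = np.zeros((N, N))
    delta = np.zeros(N)
    for k in range(int(T / dt)):
        t = k * dt
        # interval-excited regressor: active on [0, 2] s, zero afterward
        Phi_f = np.array([np.sin(5 * t), np.cos(5 * t)]) if t < 2.0 else np.zeros(N)
        chi = Phi_f @ theta
        Omega += dt * np.outer(Phi_f, Phi_f)     # full-time integral, cf. (20c)
        delta += dt * Phi_f * chi
        theta_hat += dt * gamma * (delta - Omega @ theta_hat)   # cf. (20a)
    return theta_hat

print(full_data_mre())  # converges although excitation stops at t = 2 s
```

Note that Ω is monotonically nondecreasing here, which illustrates the unbounded-gain caveat discussed in Remark 5.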

V. DISCRETE DATA-DRIVEN MRE
Concurrent learning resorts to discrete-data MRE to achieve parameter convergence in adaptive control in the absence of the PE condition [45], [46], [47]. In this scheme, a dynamic data stack Z is built to record data discretely, and exponential convergence of the estimation error θ̃ is achieved if sufficiently rich data are recorded in Z to satisfy a discrete-time version of the IE condition in Definition 2, which is strictly weaker than PE in Definition 1. The theoretical frameworks of concurrent learning MRAC were established for uncertain LTI systems and LIP uncertain nonlinear systems in [48] and [49], respectively. Most existing concurrent learning methods require the undesirable signal ẋ for parameter update. A fixed-point smoothing algorithm in [46] was applied to estimate ẋ in most existing concurrent learning methods. In [50], a state-derivative estimator was used to estimate ẋ in concurrent learning. However, both fixed-point smoothing and state-derivative estimation require extra computational cost and result in a loss of parameter estimation accuracy. Inspired by composite learning in [105], an integral concurrent learning method that employs interval integrals was proposed to avoid using ẋ in [51] and was applied to control real-world systems in [52], [53]. In [54], a directional forgetting mechanism was designed to improve concurrent learning control such that closed-loop stability and parameter convergence can be achieved under parameter variations.
To avoid using ẋ, the method in [49] is presented based on the system (1) with (2)-(4). Let Z = {Φ_f1, Φ_f2, ..., Φ_fM} denote the data stack and k ∈ {1, 2, ..., M} represent the index of a stored data point Φ_fk := L(s)[Φ(x_k, u_k)], where M ≥ N is the maximum allowable number of data points in Z. At the initial time t_0, the stack Z is initialized to be empty. For every moment t, a data point Φ_f(t) can be considered to be sufficiently different if it is linearly independent of all stored points in Z; then, Φ_f(t) is stored in Z. When the number of stored points equals M, a new data point Φ_f(t) is incorporated into Z to replace a stored point only if the MSV of Σ_{k=1}^M Φ_fk Φ_fk^T is increased after the replacement, where the stored point to be replaced is found by an exhaustive search over all stored points. Based on the above discussion, a discrete data memory-driven update law of θ̂ is given by [49]

dθ̂/dt = γ Σ_{k=1}^M Φ_fk (χ_k − Φ_fk^T θ̂), (21)

where γ ∈ R^+ is a learning rate, and (χ_k − Φ_fk^T θ̂) is a discrete-time version of the generalized prediction error. Noting (2), one gets χ_k − Φ_fk^T θ̂ = Φ_fk^T θ̃, resulting in dθ̃/dt = −γ Σ_{k=1}^M Φ_fk Φ_fk^T θ̃. Thus, the condition for exponential convergence of θ̃ is

Σ_{k=1}^M Φ_fk Φ_fk^T ≥ σI, (22)

which is a discrete-time version of the IE condition in Definition 2 with respect to Φ_f, where Σ_{k=1}^M Φ_fk Φ_fk^T is a discrete-time version of the excitation matrix Ω. The following theorem shows the convergence results of the MRE-driven parameter update law (21).
Theorem 6 [49]: Consider the system (1) with an admissible control law u that renders a stable closed-loop system. Then, the estimation law of θ̂ in (21) guarantees that: 1) θ̂ ∈ L_∞ and ‖θ̃‖ is nonincreasing, ∀t ≥ 0, independent of any excitation condition; 2) θ̃ → 0 exponentially on t ∈ (t_e, ∞) if the IE condition in (22) holds for certain constants σ, t_e ∈ R^+. Concurrent learning has been applied to solve various estimation and control problems without the PE condition [55], [56], [57], [58], [59], [60], [61]. In [55], concurrent learning was applied to handle unknown control allocation matrices in the adaptive control of LTI systems. In [56], concurrent learning was applied to formulate a problem of control transfer from a source system to a transfer system. In [57], a concurrent learning estimation method was designed to address the recursive subsystem estimation of piecewise affine systems. In [58], a concurrent learning reactionless control method was proposed to post-capture unknown targets by space manipulators. In [59], a concurrent learning control method was developed for spacecraft to achieve the finite-time convergence of inertia parameters. In [60], concurrent learning was applied to establish the global asymptotic stability of a HOT. In [61], a control Lyapunov-barrier function based on adaptive quadratic programming and filtered concurrent learning was proposed for LIP uncertain nonlinear systems with unknown control allocation matrices to balance safety and stability. Note that only matched uncertainties were considered in the above discrete-data MRE methods for parameter estimation.
Remark 6: Concurrent learning resorts to discrete-data MRE to achieve exponential parameter convergence in adaptive estimation and control under the discrete IE condition (22), which is equivalent to rank(Z) = N, where its key feature is that data memory recorded at any time during control can be incorporated into the parameter update law (21). However, the theory of concurrent learning is yet to be fully developed for several reasons: 1) it does not clarify the procedure of MRE; 2) it does not reveal the connection to composite adaptation with the concept of prediction errors; 3) it does not link to the IE condition in continuous time; 4) the high computational complexity of maximizing MSVs and estimating state derivatives prevents concurrent learning from being used in real-world complex systems with high orders or high degrees of freedom (DoFs).
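The data-stack logic of concurrent learning can be sketched as follows: a candidate point enters a full stack only if replacing some stored point increases the minimum eigenvalue (the MSV-type richness measure) of Σ_k Φ_fk Φ_fk^T, and the update law (21) then adapts on the recorded pairs. Sampling times, stack size, and gains below are illustrative assumptions.

```python
import numpy as np

def richness(stack):
    """Smallest eigenvalue of sum_k Phi_k Phi_k^T, the stack's richness measure."""
    S = sum(np.outer(p, p) for p in stack)
    return np.linalg.eigvalsh(S)[0]

def try_insert(stack, p, M):
    """Store p if the stack is not full, or if replacing a point raises richness."""
    if len(stack) < M:
        stack.append(p)
        return True
    best_i, best_r = None, richness(stack)
    for i in range(M):                        # exhaustive search over stored points
        r = richness(stack[:i] + stack[i + 1:] + [p])
        if r > best_r:
            best_i, best_r = i, r
    if best_i is None:
        return False
    stack[best_i] = p
    return True

def concurrent_learning(theta=np.array([2.0, -1.0]), M=5, gamma=2.0,
                        T=5.0, dt=1e-3):
    stack = []
    for t in np.arange(0.0, 1.0, 0.25):       # record data only during [0, 1) s
        try_insert(stack, np.array([np.sin(5 * t), np.cos(5 * t)]), M)
    chis = [p @ theta for p in stack]         # recorded chi_k = Phi_fk^T theta
    theta_hat = np.zeros_like(theta)
    for _ in range(int(T / dt)):              # update law (21) on the frozen stack
        s = sum(p * (c - p @ theta_hat) for p, c in zip(stack, chis))
        theta_hat += dt * gamma * s
    return theta_hat

print(concurrent_learning())  # converges using only recorded data
```

Once the recorded stack satisfies the rank condition, adaptation proceeds even with zero current excitation, which is the key feature noted in Remark 6.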

VI. INTERVAL DATA-DRIVEN MRE
CAC in Section II-C integrates direct and indirect adaptive control to achieve the asymptotic convergence of both the tracking error e and the prediction error ε while maintaining global closed-loop stability [13]. The smoothness and rapidness of parameter estimation and control in CAC result from the averaging effect of the composite update law (7). However, without the PE condition, the averaging effect is invalid, and parameter convergence is still not guaranteed in composite adaptation. Motivated by composite adaptation, an innovative methodology termed composite learning has been proposed as a systematic framework to achieve parameter convergence in adaptive systems without the PE condition [76], [77], [78]. Composite learning implements regressor extension and interval integrals (i.e., interval-data MRE) on composite adaptation to utilize data memory continuously and constructs a generalized prediction error as extra feedback to update parameter estimates, such that exponential parameter convergence is achieved under the IE condition in Definition 2, which is much weaker than PE in Definition 1. In [79], the undesirable signal ẋ in composite learning was avoided by an integral transformation, and its faster and better parameter estimation than concurrent learning was validated by numerical results. A least-squares version of [79] without regressor filtering and its comprehensive extension can be found in [80], [81]. Composite learning represents a natural evolution of composite adaptation to the learning framework while avoiding the drawbacks of concurrent learning [82].
Composite learning is presented based on the system (1) with (2) via the constructive procedure as follows:
1) Regressor Filtering: A low-pass filter L(s) is applied to (1) to generate the filtered dynamics (2) with the filtered regressor Φ_f = L(s)[Φ], increasing robustness against high-frequency elements for parameter estimation.
2) State Prediction: A state prediction model is given by

dx̂/dt = f(x) + g(x)u + Φ^T(x, u)θ̂ + α(x − x̂) (23)

with x̂(0) = x(0), where α ∈ R^+ is a filtering constant. Using (23) and (1) yields a prediction error dynamics

dx̃/dt = −αx̃ + Φ^T(x, u)θ̃ (24)

with x̃ := x − x̂, implying αx̃ = L(s)[Φ^T(x, u)θ̃], such that αx̃ is indeed a filtered prediction error. Multiplying each side of αx̃ = L(s)[Φ^T(x, u)θ̃] by Φ_f and denoting ψ := L(s)[Φ^T(x, u)θ̂], one gets an extended regression equation with enriched data as follows:

Φ_f(αx̃ + ψ) = Φ_f Φ_f^T θ. (25)

3) Integral-Like Operations: Applying integral-like operations to (25) yields a generalized regression equation

δ(t) = Ω(t)θ, (26)

where Ω(t) := ∫_{t−τ_d}^{t} Φ_f(τ)Φ_f^T(τ) dτ is an interval excitation matrix, and δ(t) := ∫_{t−τ_d}^{t} Φ_f(τ)(αx̃(τ) + ψ(τ)) dτ ∈ R^N is an auxiliary variable.
4) Memory Exploitation: Define a generalized prediction error that can exploit online data memory as follows:

ξ(t) := δ(t_e) − Ω(t_e)θ̂(t), (27)

where the excitation time t_e can be updated based on the excitation strength of Ω.

5) Composite Feedback:
A composite learning law of θ̂ that combines multi-source information is given by

dθ̂/dt = γ(k_a Φ_c e + Φ_f ε + k_b ξ), (28)

where ε(t) := αx̃(t) + ψ(t) − Φ_f^T(t)θ̂(t) is another filtered prediction error, γ ∈ R^+ is a learning rate, k_a, k_b ∈ R^+ are weighting factors to balance the convergence of e, ε, and θ̃, and Φ_c is the regressor for control in (6). Noting (25), one obtains Φ_f^T θ = αx̃ + ψ. Using this result, one obtains ε = Φ_f^T θ̃. Applying δ = Ωθ in (26) to (27), one obtains ξ = Ω(t_e)θ̃. Applying ε = Φ_f^T θ̃ and ξ = Ω(t_e)θ̃ to (28), one obtains dθ̃/dt = −γ(Φ_f Φ_f^T + k_b Ω(t_e))θ̃ − γk_a Φ_c e, where the tracking-error term vanishes as e → 0. Consequently, the condition for exponential convergence of θ̃ is

Ω(t_e) ≥ σI,

which is an IE condition in Definition 2 regarding Φ_f. From the same proof as that of [43, Th. 4.1], one obtains the following theorem, which shows the convergence results of the MRE-driven parameter update law (28) with (23).
Theorem 7 [78]: Consider the system (1) with an admissible control law u that renders a stable closed-loop system (6). Then, the estimation law of θ in (28) with (23) guarantees that ε, e → 0 asymptotically for all t ≥ 0, and that θ̃, e → 0 exponentially on t ∈ (t_e, ∞) whenever IE in Definition 2 holds for certain constants τ_d, σ, t_e ∈ R+. Composite learning has been applied to solve several estimation and control problems without the PE condition [83], [84], [85], [86], [87], [88], [89], [90], [91], [92]. A constructive procedure of composite learning MRAC with an optimal data-selecting measure was proposed in [83]. In [84], a composite learning MRAC method based on [41] was proposed to achieve finite-time parameter estimation under unknown control allocation matrices. A switched parameter estimation version of [83] was proposed in [85]. A switched robust composite estimation method was presented to design adaptive observers in [86]. Note that only LTI systems with matched uncertainties were considered in the aforementioned methods. Composite learning for nonlinear systems with LIP uncertainties was considered in [87], [88], [89], [90]. To handle parameter variations, a variable data forgetting rate was introduced into composite learning in [87], and only partial data memory was exploited in [88]. A high-order optimizer was applied to composite learning to enhance the stability and convergence of adaptation in [89]. A memory-augmented system identifier was presented to achieve finite-time parameter convergence in [90]. Recently, a least-squares composite learning method was compared comprehensively with several existing least-squares estimation schemes in [91]. For a class of nonlinear dynamical systems within the Hamiltonian-driven framework, an adaptive dynamic programming method based on composite learning was proposed to enhance the convergence of the optimal control solution in [92].

A. Qualitative Comparisons
Qualitative comparisons of the four MRE schemes for parameter estimation with respect to some key features are given in Table I, where "Store data" denotes the storage of historical data, "Complex" denotes computational complexity, which involves computing MSVs, summations, and differential equations, "Excite" denotes the excitation condition for exponential parameter convergence, and "Varying θ" denotes whether θ̂ can handle the parameter variations of θ. Only typical MRE algorithms are selected for fair comparisons, including the forgetting-data MRE in (11), the full-data MRE in (20), the discrete-data MRE in (21), and the interval-data MRE in (28).
The excitation matrices of the forgetting-data MRE in (11) and the full-data MRE in (20) are calculated by full-time integrals directly, so storing historical data is unnecessary. The excitation matrix of the discrete-data MRE in (21) is updated by the data stack Z applied to store historical data, and that of the interval-data MRE in (28) is computed over the moving interval [t − τ_d, t) so that only recent data are retained. Discrete-data MRE has the highest computational complexity because an exhaustive search over all stored points in the data stack Z is necessary to maximize the MSV. Note that forgetting-data MRE, full-data MRE, and interval-data MRE have the same total order (N + 2)(N + n), whereas discrete-data MRE has the total order N(n + 1) + 2n but involves the summations of data points. Full-data MRE and interval-data MRE do not require the DDF sL(s) owing to the state prediction models (18) and (23), respectively. Full-data MRE cannot forget old data, so it is impossible for it to handle the variations of θ.
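The exhaustive MSV-maximization step that dominates the cost of discrete-data MRE can be sketched as follows. The stack size, the replacement test, and the random regressor samples are illustrative assumptions, not the exact record-selection algorithm of the cited concurrent learning literature.

```python
import numpy as np

# Hedged sketch of record selection in discrete-data MRE (concurrent
# learning): a new regressor sample enters the data stack Z only if
# replacing some stored sample increases the minimum singular value
# (MSV) of the stacked regressor matrix.

def msv(Z):
    # Smallest singular value of the stack (rows are stored regressors).
    return float(np.linalg.svd(np.array(Z), compute_uv=False)[-1])

def try_insert(Z, phi_new, M=5):
    """Return an updated stack that never decreases the MSV."""
    if len(Z) < M:                     # stack not yet full: admit the sample
        return Z + [phi_new]
    best_Z, best_s = Z, msv(Z)
    for i in range(M):                 # exhaustive search over stored points
        cand = Z[:i] + [phi_new] + Z[i + 1:]
        s = msv(cand)
        if s > best_s:
            best_Z, best_s = cand, s
    return best_Z

rng = np.random.default_rng(0)
Z = []
for _ in range(200):                   # stream of regressor samples
    Z = try_insert(Z, rng.normal(size=3))
print(f"final MSV of stack: {msv(Z):.3f}")
```

Each new sample triggers M singular value decompositions of the stack, which is the source of the computational burden noted above.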
Full-data MRE can ensure parameter convergence under the weaker SE condition in (17) compared to PE but suffers from some fundamental drawbacks regarding unbounded adaptation, restricted parameter variations, and accumulated modeling inaccuracy, as discussed in Remark 5. Forgetting-data MRE overcomes the above drawbacks of full-data MRE, but its parameter convergence relies on the PE condition in (12). Discrete-data MRE and interval-data MRE can ensure parameter convergence under the IE condition in (29), which is strictly weaker than PE. Compared with the other MRE schemes, the distinctive features of interval-data MRE include: 1) It ensures exponential parameter convergence under the IE condition without unbounded adaptation and accumulated modeling inaccuracy; 2) it naturally handles a slowly time-varying or suddenly changing (but piecewise-constant) parameter θ; 3) it still works when only partial IE exists; 4) it does not resort to the DDF sL(s), which is sensitive to measurement noise; 5) it does not require extra computational costs such as MSV maximization and state derivative estimation.
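For reference, the two excitation conditions compared above can be written in their standard forms; the notation follows the filtered regressor Φ_f of this survey, and the exact statements of Definitions 1 and 2 in the original text may differ slightly.

```latex
% Persistent excitation (cf. Definition 1): excitation over EVERY window.
\exists\, \sigma, \tau_d \in \mathbb{R}^{+}:\quad
\int_{t-\tau_d}^{t} \Phi_f(\tau)\,\Phi_f^{\top}(\tau)\,\mathrm{d}\tau
\;\succeq\; \sigma I, \quad \forall t \geq \tau_d .

% Interval excitation (cf. Definition 2): excitation over ONE window only.
\exists\, \sigma, \tau_d, t_e \in \mathbb{R}^{+}:\quad
\int_{t_e-\tau_d}^{t_e} \Phi_f(\tau)\,\Phi_f^{\top}(\tau)\,\mathrm{d}\tau
\;\succeq\; \sigma I .
```

Any PE signal is IE, but a regressor that is exciting only during a finite learning phase and then vanishes satisfies IE and not PE, which is why IE is strictly weaker.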
Simulations are carried out in MATLAB, using the fixed-step ode3 solver with a step size of 0.5 ms.
The estimation error norms ||θ̃|| of the four MRE schemes are shown in Fig. 2. It is observed that the forgetting-data MRE in (11) does not provide exact convergence of θ̂ to the true value θ due to the lack of PE, but the full-data MRE in (20) achieves exact parameter estimation due to the existence of SE and the averaging effect. As IE holds at t_e ≈ 2.3 s, the interval-data MRE in (28) shows a clear convergence trend of θ̂ during t ∈ [0, 2.3) s, and θ̂ converges to θ exponentially after t = 2.3 s. However, the discrete-data MRE in (21) is polluted by noise and performs worse than the full-data MRE in (20) and the interval-data MRE in (28) due to the lack of the averaging effect for discrete data stored in the data stack Z.

B. Visual Servoing Applications
Concurrent learning has been applied to robot visual servoing in [64], where a concurrent learning homography-based visual servoing (HBVS) method was developed for a wheeled mobile robot to achieve trajectory tracking and estimate scene depth information. Composite learning has been applied to enhance visual tracking accuracy while achieving fast online calibration of camera parameters in robot visual servoing [121], [122], [123], [124], [125]. In [121], a composite learning HBVS control method was designed to achieve exact 3-D pose control of the 7-DoF Panda robot under a monocular eye-in-hand camera with unknown feature positions, where the depths of the feature points can be exactly estimated online. A significant extension of [121] can be found in [122]. In [123], a composite learning HBVS controller was developed for the Panda robot with a monocular eye-to-hand camera under unknown extrinsic parameters. In [124] and [125], two composite learning image-based visual servoing controllers were proposed for the Panda robot under a monocular eye-to-hand camera with unknown intrinsic and extrinsic parameters. Compared to the controller in [124], which is based on the robot dynamics in the joint space, the controller in [125] considers the robot dynamics in the Cartesian space so that robot redundancy is adequately utilized to achieve compliant behaviors.

IX. CONCLUSION
In this letter, four data memory-driven parameter estimation schemes in adaptive control have been reviewed and compared based on a unified class of nonlinear systems with LIP uncertainties. The comprehensive analysis reveals that: 1) Classical forgetting-data MRE needs the PE condition to ensure exponential parameter convergence, but some of the latest forgetting-data MRE methods can relax it to IE, which is strictly weaker than PE; 2) full-data MRE guarantees exponential parameter convergence under the SE condition, which is strictly weaker than PE, but its full-time integral brings about some critical drawbacks of unbounded adaptation, restricted parameter variations, and accumulated modeling inaccuracy; 3) discrete-data MRE guarantees exponential parameter convergence under the IE condition but has the highest computational complexity and is more sensitive to measurement noise due to the lack of the averaging effect; 4) interval-data MRE also ensures exponential parameter convergence under the IE condition and is less sensitive to modeling inaccuracy and measurement noise. In further studies, we will focus on the following challenges in MRE for adaptive estimation and control: 1) Reconstructing prediction errors under modeling inaccuracy caused by, e.g., unmodeled dynamics and external disturbances, for model learning-based control; 2) achieving parameter convergence under time-varying uncertainties using efficient directional forgetting; 3) achieving time- and data-efficient deep NN learning with nonlinear parameterization; 4) achieving learning with exponential stability under partial excitation by efficient data storage and forgetting; 5) comprehensive experimental studies on high-DoF complex systems considering the intensive computing burden.

Manuscript received 21 September 2023; revised 26 December 2023; accepted 22 January 2024. Date of publication 9 February 2024; date of current version 20 March 2024. This work was supported in part by the Fundamental Research Funds for the Central Universities, Sun Yat-sen University, China, under Grant 23lgzy004. Recommended by Senior Editor E. Valcher. (Corresponding author: Yongping Pan.)

Fig. 1. An illustration of data storage modes for four MRE schemes using the ijth element of the matrix Φ_f Φ_fᵀ with i, j ∈ {1, 2, . . . , N}. Note that for forgetting-data MRE, full-data MRE, and interval-data MRE, the solid line denotes the value of the ijth element of Φ_f Φ_fᵀ, and the area of the shaded part denotes the value of the ijth element of Θ, whereas for discrete-data MRE, the total length of all segments denotes the value of the ijth element of Θ.

TABLE I: COMPARISONS OF FOUR MRE SCHEMES FOR PARAMETER ESTIMATION. For discrete-data MRE, data points Φ_f(t_1), Φ_f(t_2), . . . , Φ_f(t_M) are stored in the data stack Z. Interval-data MRE forgets historical data on t ∈ [0, t − τ_d) directly and reserves historical data on t ∈ [t − τ_d, t). It is worth noting that for interval-data MRE, if IE exists at t = t_e, historical data on t ∈ [t_e − τ_d, t_e) are recorded and incorporated into the parameter update law (28).