A Robust Iterative Learning Control for Continuous-Time Nonlinear Systems with Disturbances

In this paper, we study the trajectory tracking problem using iterative learning control for continuous-time nonlinear systems with a generic fixed relative degree in the presence of disturbances. This class of controllers iteratively refines the control input, relying on the tracking error of the previous trials and on properly tuned learning gains. Sufficient conditions on these gains guarantee the monotonic convergence of the iterative process. However, the choice of the gains is usually hand-tuned heuristically, given an approximated system model and no information on the disturbances. Thus, in the case of inaccurate knowledge of the model or of iteration-varying measurement errors, external disturbances, and delays, the convergence condition is unlikely to be verified at every iteration. To overcome this issue, we propose a robust convergence condition, which ensures the applicability of the purely feedforward control even if other classical conditions are not fulfilled for some trials due to the presence of disturbances. Furthermore, we quantify the upper bound of the nonrepetitive disturbances that the iterative algorithm is able to handle. Finally, we validate the convergence condition by simulating the dynamics of a two-degrees-of-freedom underactuated arm with elastic joints, where one joint is active and the other is passive, and of a Franka Emika Panda manipulator.


I. INTRODUCTION
Starting from the 80s, a new control framework, namely iterative learning control (ILC), was introduced [1], [2]. The basic idea is to polish, iteratively, the current control input until the system is able to effectively perform the desired task. The iterative algorithm does not require any accurate description of the model, leading to good tracking performance without any substantial modification of the system dynamics, while incorporating persistent disturbances, e.g., gravity acceleration. Not surprisingly, ILC proved to be an excellent tool for repetitive tasks. Indeed, its fields of application are numerous, e.g., robotic manipulation [3], wafer stages [4], manufacturing processes [5], quadrotors [6], and soft robotics [7]-[10].
The iterative algorithm can be robust to disturbances [11] and can follow a switching policy between learning gains [12]. Additionally, the control law can involve a rectifying action for the initial state [13], can be combined with feedback control, e.g., proportional [14] or model predictive control [15], and can learn the desired trajectory even in the case of variations in the learning process [16].
The main problem when dealing with iterative processes is guaranteeing convergence. For continuous-time linear systems and discrete-time systems, it is possible to draw necessary and sufficient convergence conditions [17]-[19], while this is still an open problem for continuous-time nonlinear systems [19]. Disturbances such as model uncertainties [11], [20], measurement errors, dynamic/external interactions [7], and actuation delays or faults [21] may cause a failure of the convergence condition.
Feedback controllers can mitigate the undesired effect of disturbances through the application of suitably high gains [22]. This leads to a profound alteration of the system dynamics, which is not acceptable in some applications like, for example, soft robotics [7]. In this case, the use of feedback control actions is strongly limited [23], and a pure feedforward control action, e.g., ILC, is preferable. However, feedforward methods lack robustness in the case of disturbances. Thus, what happens in the case of disturbances? Which kind of disturbances can an iterative learning controller manage? What can we guarantee in terms of convergence?
The robustness problem for pure feedforward iterative control laws has already been widely investigated in the case of discrete-time systems [18], [20], [24], [25]. However, it is still under-studied for continuous-time systems. In [26], a sampled-data ILC algorithm for continuous-time systems manages time-nonrepetitive disturbances, while in [27], the Authors tackle the same problem in the case of systems with a fixed relative degree equal to one, constant linear input and output fields, and saturated inputs.
In this paper, we design an iterative pure feedforward controller for multiple-input multiple-output (MIMO) continuous-time nonlinear systems with a generic fixed relative degree. We prove its convergence in the case of a great variety of disturbances. We distinguish disturbances based on their dependency on the state or on time. Additionally, we classify them as repetitive or nonrepetitive depending on their occurrences w.r.t. the iteration domain. In [11], [18], [24], [25], the Authors already guarantee a bounded error in the presence of time-dependent nonrepetitive disturbances. We propose and prove a convergence condition (D-condition), which guarantees a robust convergence also in the presence of state-dependent nonrepetitive disturbances. Theoretically, we propose a necessary and sufficient convergence condition for a restricted class of nonlinear systems. Then, we quantify the iteration-frequency and magnitude of the nonrepetitive disturbances that the iterative algorithm can handle. Additionally, we prove that the D-condition does not modify the already known convergence results in [11], [18], [24], [25] dealing with time-dependent nonrepetitive disturbances.
Finally, we validate the D-condition on two simulated robotic systems varying disturbances types and output functions. The first robot is an underactuated compliant arm with two degrees of freedom (DoFs), in which the first elastic joint is active, while the other is passive. The second system is a Franka Emika Panda manipulator.

Notation
Let I_m ∈ R^{m×m} be the identity matrix and 0_{n×m} ∈ R^{n×m} be a zero matrix. Let f(·), g(·) : R^n → R^n be two vector fields; L_f g(x) stands for the Lie derivative of g(x) along f(x), i.e., L_f g(x) = (∂g/∂x) f(x). For any vector v ∈ R^n and any matrix A ∈ R^{n×m}, we denote with ||v|| and ||A|| their infinity norms. Let λ be a positive constant; for any vector function v(t) ∈ R^n, we denote with ||v||_λ its λ-norm, i.e., ||v||_λ ≜ sup_t ||v(t)|| e^{−λt}. Let y : t ∈ R → R^n be a vector function; we denote with y^{(i)}(t) its i-th time derivative. Let U be a set; we use the notation #U to indicate its cardinality. Finally, all physical units are expressed in SI (MKS), and angles in radians.
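As a small illustration of the λ-norm on sampled data, the following sketch computes sup_t ||v(t)||_∞ e^{−λt} numerically (the helper name `lam_norm` is hypothetical, introduced only for this example):

```python
import numpy as np

def lam_norm(y, t, lam):
    """lambda-norm of a sampled signal: sup_t ||y(t)||_inf * exp(-lam*t).

    y: (N, n) array of samples, t: (N,) array of time instants.
    (Hypothetical helper, for illustration only.)
    """
    inf_norms = np.max(np.abs(y), axis=1)        # ||y(t)||_inf at each sample
    return float(np.max(inf_norms * np.exp(-lam * t)))

# A signal growing like exp(0.5 t) still has lambda-norm 1 for lam = 2,
# since the weight exp(-lam*t) dominates the growth.
t = np.linspace(0.0, 5.0, 501)
y = np.stack([np.exp(0.5 * t), np.sin(t)], axis=1)
```

This is why λ-norms are convenient in ILC proofs: a sufficiently large λ discounts transient growth along the trial.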

II. PROBLEM DEFINITION
Let us consider an iterative process, where j ∈ U is the iteration index, and U is the iteration set. The class of continuous-time nonlinear systems under analysis can be written as

ẋ_j(t) = f_n(x_j(t)) + g(x_j(t)) u_j(t) + d_px(x_j(t)) + d_pt(t) + d_rx_j(x_j(t)) + d_rt_j(t),  (1)
y_j(t) = h(x_j(t)) + d_rty_j(t),  (2)

where x_j(t) ∈ R^n is the state, u_j(t) ∈ R^m is the control input, y_j(t) ∈ R^{n_y} is the output, and f_n(·) : R^n × [0,t_f] → R^n, g(·) : R^n × [0,t_f] → R^{n×m} are the drift and control vector fields, respectively. Additionally, the system is affected by the disturbances d_px(x_j(t)), d_pt(t), d_rx_j(x_j(t)), d_rt_j(t) ∈ R^n, and d_rty_j(t) ∈ R^{n_y}, which we classify in relation to their dependency on the iteration, state, and time domains. In particular, considering the iteration domain j ∈ U, we distinguish between repetitive and nonrepetitive disturbances. Furthermore, we divide them into state disturbances and time disturbances, respectively. It is instrumental for the development of the method to introduce the following definitions.
Definition 1. A disturbance d_px(x_j(t)) ∈ R^n acting at every iteration j ∈ U, Lipschitz, and bounded is said to be state-repetitive (or state-persistent).
Definition 2. A disturbance d_pt(t) ∈ R^n acting at every iteration j ∈ U and bounded is said to be time-repetitive.
Definition 3. A disturbance d_rx_j(x_j(t)) ∈ R^n acting only at some iterations j ∈ U, Lipschitz and bounded, i.e., max_t ||d_rx_j(x_j(t))|| = d̄_rx_j, is said to be state-nonrepetitive.
Definition 4. A disturbance d_rt_j(t) ∈ R^n (or d_rty_j(t) ∈ R^{n_y}) acting only at some iterations j ∈ U, or changing at every iteration, and bounded is said to be time-nonrepetitive.
It is worth highlighting that nonrepetitive disturbances, i.e., Def. 4, have already been widely studied in the literature, e.g., [11], [20], [26], [28]. However, the other types of disturbances have not been properly analyzed yet. In the following remark, we present a few practical examples of these definitions. Remark 1. State-repetitive disturbances (Def. 1) can represent an external force field, e.g., an unmodeled gravity vector in the dynamics of a robot. Time-repetitive disturbances (Def. 2) can model additive uncertainties in the system nominal model. It is worth remarking that repetitive disturbances are present at each iteration of the whole iterative process.
State-nonrepetitive disturbances (Def. 3) can derive from the interaction between a robot and the environment or from actuator failures/delays. Time-nonrepetitive disturbances (Def. 4) are disturbances with no relation with the state, e.g., measurement noise. It is worth remarking that nonrepetitive disturbances occur only for a few iterations (or change at each iteration) during the whole learning process.

Assumptions
We assume for the system (1)-(2) what follows:
A1) The system (1)-(2) is square, i.e., n_y = m.
A2) The system (1)-(2) has a fixed relative degree (vector) r_v = [r_1, . . . , r_m] (see, e.g., [29]).
A3) There is no shift in the initial condition, i.e., x_j(0) = x_d(0), ∀j ∈ U.
A4) The vector fields in (1) are Lipschitz and bounded; in particular, f̄ ∈ R denotes the Lipschitz constant of the drift and ḡ ∈ R the bound on g(·).
A5) The desired output trajectory is feasible, continuous, and differentiable r times, ∀t ∈ [0,t_f].
It is worth noting that, thanks to assumption A5), there exist bounded u_d, x_d, and y_d, which are the desired control input, state, and output, respectively¹.

Goals
Considering the disturbed system (1)-(2), given the desired trajectory y_d(t) : [0,t_f] → R^m and assumptions A1)-A5), the main purpose of this paper is to investigate the robustness of the iterative feedforward control law u_j(t) in the presence of disturbances. In particular, we summarize the goals of this work as follows.
G1) Design an iterative feedforward control law u_j(t) : [0,t_f] → R^m such that the output y_j(t) converges to y_d(t) as j → +∞.
G2) Propose a robust convergence condition, namely the D-condition, which guarantees G1) even in the presence of state-nonrepetitive disturbances d_rx_j(x_j(t)).
G3) Find an upper bound of the state-nonrepetitive disturbances d_rx_j(x_j(t)) that can be dealt with by the convergence condition proposed in G2).
G4) Prove that the D-condition in G2) handles the presence of time-repetitive d_pt(t) and time-nonrepetitive d_rt_j(t) and d_rty_j(t) disturbances.
¹Note that u_d is unique, and that both x_d and u_d are unknown and required only to theoretically prove the convergence of the method.

III. SOLUTION
This section is divided into four parts. Firstly, we present the employed control law. Secondly, we report well-known results for this iterative control. Thirdly, we propose the main result of this paper, i.e., a robust convergence condition for the control law (3). This convergence condition is able to cope with state repetitive and nonrepetitive disturbances, i.e., Def. 1 and 3. Finally, the fourth part extends the main result considering also the presence of time repetitive and nonrepetitive disturbances, i.e., Def. 2 and 4.

A. ITERATIVE CONTROL LAW
In this paper, we employ an ILC control law which is purely feedforward. This has already been widely used in the literature, for example in [9] and [30], achieving G1). Recalling the system (1)-(2) and the assumptions A1)-A5), the employed control law is

u_{j+1}(t) = u_j(t) + Γ_j(t) e_r_j(t),  (3)

where Γ_j(t) ∈ R^{m×m} is the time- and iteration-varying learning gain, and the error signal e_r_j(t) ∈ R^m is defined as

e_r_j(t) = Σ_{i=0}^{r} ϒ_i (y_d^{(i)}(t) − y_j^{(i)}(t)),  (4)

where ϒ_i ∈ R^{m×m}, ϒ_i ⪰ 0, ∀i = 0, . . . , r are tunable control gain matrices, which affect the convergence velocity [9]. The initial guess u_0(t) ∈ R^m of the iterative control law (3) can be arbitrarily chosen.
Assuming that the measurements of y_j^{(i)}(t) for i = 0, . . . , r can be easily obtained through sensors, for each iteration j and time instant t ∈ [0,t_f], the control law (3) requires (r + 1)(m² + m) operations. In the case that the derivative measurements are not available, the method complexity increases depending on the adopted algorithm.
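The update (3)-(4) can be sketched on sampled signals as follows. This is a minimal illustration, not the paper's implementation: the helper name and the array layout are assumptions, and the learning gain is held constant over the trial for simplicity.

```python
import numpy as np

def ilc_update(u_j, Gamma, y_err_derivs, Upsilon):
    """One trial of the pure feedforward update (3)-(4), sampled in time.

    u_j:           (N, m) control input of the current trial
    Gamma:         (m, m) learning gain (held constant over time here)
    y_err_derivs:  list of r+1 arrays (N, m): y_d^(i) - y_j^(i), i = 0..r
    Upsilon:       list of r+1 gain matrices (m, m)
    (Hypothetical helper; a sketch, not the Authors' implementation.)
    """
    # e_r(t) = sum_i Upsilon_i (y_d^(i)(t) - y_j^(i)(t))
    e_r = sum(err @ Ui.T for Ui, err in zip(Upsilon, y_err_derivs))
    # u_{j+1}(t) = u_j(t) + Gamma e_r(t)
    return u_j + e_r @ Gamma.T
```

Note that each time sample costs one m×m product per gain matrix plus the learning-gain product, matching the (r + 1)(m² + m) operation count discussed above.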
It is instrumental for the derivation of the method to introduce the following definition.
Definition 5. The control law (3) is said to be convergent if ||u_d − u_j||_λ → 0 when j → +∞.
Lemma 1. If the control law is convergent (Def. 5), then the error (4) is such that ||e_r_j(t)||_λ → 0 when j → +∞. Proof. Recalling assumptions A3) and A5), i.e., no shift in the initial condition and the feasibility of the desired trajectory, the proof is trivial.
VOLUME 4, 2016

B. STATE OF THE ART
A sufficient convergence condition [30] for the controller (3), which we call the not-disturbed (ND) convergence condition, is

||I_m − Γ_j(t) D(x_j(t))|| ≤ ρ < 1, ∀t ∈ [0,t_f],  (5)

where D(x_j(t)) is the input-output decoupling matrix of the system. If (5) is verified, then the iterative process is guaranteed to converge. This occurs also in the presence of state-repetitive disturbances (Def. 1), see, e.g., [7], [31]. Indeed, considering the system (1), the state-persistent disturbances d_px(x_j(t)) can be included in the vector f_n(x_j(t)), which is still Lipschitz. For this reason, in the following, without loss of generality, we can directly consider the disturbed drift vector

f(x_j(t)) ≜ f_n(x_j(t)) + d_px(x_j(t)).  (6)

It is worth noting that the Lipschitz constant of f(x_j(t)) is still f̄ ∈ R, i.e., assumption A4). Additionally, (5) can also deal with both time-repetitive and time-nonrepetitive disturbances (Def. 2 and 4). However, in this case, the iterative process will not achieve a perfect convergence as in Def. 5, but the error will remain bounded [11]. On the other hand, (5) does not guarantee the so-called control contraction when state-nonrepetitive disturbances (Def. 3) occur. Therefore, the main contribution of this work is to propose a robust convergence condition (D-condition), which extends (5). This is presented in the following section.
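A contraction condition of this kind can be checked numerically along a trial. The sketch below is an assumption-laden illustration (the helper name is hypothetical, and the exact expression of (5) follows [30]): it evaluates the infinity-norm contraction factor for a constant gain and samples of the decoupling matrix.

```python
import numpy as np

def nd_condition_holds(Gamma, D_samples):
    """Check the contraction sup_t ||I_m - Gamma D(x_j(t))||_inf < 1.

    Gamma:     (m, m) constant learning gain
    D_samples: iterable of (m, m) decoupling-matrix samples along the trial
    (Hypothetical helper, for illustration only.)
    """
    m = Gamma.shape[0]
    rho = max(np.linalg.norm(np.eye(m) - Gamma @ D, ord=np.inf)
              for D in D_samples)
    return rho < 1.0, rho

# With Gamma = eps*M and D close to inv(M), Gamma @ D is close to eps*I,
# so the contraction factor is about 1 - eps (here exactly 0.1).
M = np.array([[2.0, 0.3], [0.3, 1.0]])
ok, rho = nd_condition_holds(0.9 * M, [np.linalg.inv(M)])
```

This mirrors the gain choices used later in the validation, where the learning gain is built from a (possibly inaccurate) model of the system.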

C. MAIN RESULT: STATE-NONREPETITIVE DISTURBANCES
For the sake of clarity, let us define what follows. Definition 6. Let U ≡ N be the iteration set. U is such that U = T ∪ V, where V contains all the iterations j such that (5) is not fulfilled, while T = U − V.
The following Theorem represents the main result of this paper. It enables the controller (3) to cope with state-nonrepetitive disturbances such as in Def. 3, achieving G2).
Proof. Given the control law (3) and (4), and defining δu_j ≜ u_d − u_j and δx_j ≜ x_d − x_j, we can write (9). Given (4) and A4), and using again assumption A4), we can write inequality (12) for the system (1). Applying Gronwall's Lemma to (12) leads to (13), where c_1 ≜ sup_t ||g(x_j(t))|| and c_2 ≜ sup_t (f̄ + ḡ ||u_d||).
On the other hand, the presence of state-nonrepetitive disturbances d_rx_j(x) (Def. 3) affects the constant c_2 in (13), leading to c_2 being replaced by c_2 + d̄_rx_j. This may lead to a failure of the convergence condition (5). Indeed, ∀c_2 and λ (already selected), ∃ d̄_rx_j such that (17) is verified.
Without loss of generality, we can group (16) by windows of N trials, which contain iterations belonging to both V and T. This leads to (18), which is a control contraction by hypothesis, i.e., P_j < 1. Substituting over all the iterations of the iterative process and computing the limit for j → +∞ yields (19). The right-hand side of (19) is an infinite product of factors P_j such that 0 ≤ P_j < 1. This implies that ∏_{j=0}^{+∞} P_j = 0. Recalling Lemma 1, we state that ||e_r_j(t)||_λ → 0 as j → +∞. Thus, the proof is completed.
A necessary and sufficient convergence condition for the controller (3) and the nonlinear system (1)-(2) is still an open problem. However, restricting the class of nonlinear systems under study, it is possible to obtain a necessary and sufficient convergence condition for the controller (3), as proven in the following Theorem. Proof. Sufficiency. We refer to Theorem 1.
Since the window N is not known a priori, (7) is not straightforward to interpret in practice. To allow a direct comparison with the classic convergence condition (ND), i.e., (5), we state what follows. Corollary 1. Under the same assumptions of Theorem 1, let U = V ∪ T be the iteration set such that #T = ∞ while #V < ∞. A sufficient condition for the convergence of (3) is χ_j + ν_j ≤ ρ < 1, ∀j ∈ T. Proof. We report here only a sketch. Recalling (17), we substitute all the previous trials, split the products, and compute the limit, in which ∏_{j∈T}(χ_j + ν_j) = 0 and ∏_{j∈V}(χ_j + ν_j) = ν ∈ R^+ \ {+∞}. The proof is completed.
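The windowed-contraction argument behind Theorem 1 and Corollary 1 can be illustrated numerically: even if single disturbed trials expand the error, convergence is preserved as long as the product over each window stays below one. The factors below are toy numbers, not derived from any real system.

```python
import numpy as np

# Per-trial contraction factors of the error bound: trials in T contract
# (factor 0.6), while disturbed trials in V expand (factor 1.5), so the
# ND condition fails on them. Over each window of N = 4 trials, however,
# the windowed product P_j is still below one.
factors = np.array([0.6, 0.6, 1.5, 0.6] * 10)        # 10 windows of N = 4
window_products = factors.reshape(-1, 4).prod(axis=1)

# P_j = 0.6 * 0.6 * 1.5 * 0.6 = 0.324 < 1 on every window, hence the
# infinite product of the P_j tends to zero and the error converges.
total = float(np.prod(factors))
```

The same computation with #V finite (Corollary 1) gives a finite extra factor ν that does not prevent the product over T from vanishing.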
We tackle goal G3) with the following Proposition. Proposition 1. Under the same assumptions of Theorem 1, and given a window of N iterations, let N_V and N_T be two sets such that N = #N_V + #N_T. The sets N_V and N_T include the iteration indexes j where a state-nonrepetitive disturbance occurs or does not occur, respectively.
In practice, (25) is difficult to apply, but it guarantees an upper bound on the iteration frequency of any state-nonrepetitive disturbance. Remark 2. The control law (3) depends on the control gains ϒ_i ∈ R^{m×m}, for i = 0, . . . , r. These directly multiply the derivatives of the output. Large values can speed up the convergence of the method. However, the magnitude of the gains should be proportional to the reliability of the measurements, i.e., inaccurate measurements should be multiplied by low gains. Moreover, in practical applications, the control action could exceed the actuators' physical limits and, eventually, damage the system.

D. OTHER RESULTS: ALL DISTURBANCES
In this section, we analyze also the presence of time-repetitive and time-nonrepetitive disturbances (Def. 2 and 4), achieving G4). As discussed in Sec. III-B, these disturbances do not affect (7), although they lead to a bounded error [18], [28]. The following Theorem extends Theorem 1 w.r.t. all the disturbances under analysis, relaxing also A3). Consider the system (1)-(2), and let y_d(t) ∈ R^m be the desired output trajectory.
Proof. Recalling (7), and defining P ≜ sup_j max_t P_j, one obtains the claimed bound on the error. The proof is completed.

IV. VALIDATION
We validate the effectiveness of the D-condition through simulations in MATLAB. Firstly, we simulate a 2-DoF underactuated compliant robot, namely RR, composed of two elastic joints, where only the first one is actuated. Secondly, we test the method on a Franka Emika Panda robot equipped with elastic joints. The dynamic model is used both to simulate the system and to tune the gain Γ_j(t) of the controller (3). The gains ϒ_i are chosen depending on the system, while ε = 0.9. The initial guess u_0 is chosen equal to the constant torque able to maintain the robot in the starting position of the trajectory y_d(0), i.e., solving f_n(x_d(0)) + g(x_d(0)) u_0 = 0.
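The static equilibrium above can be solved numerically once the drift and input matrix are evaluated at the start pose. A minimal sketch, assuming these are available as numeric arrays (the helper name is hypothetical):

```python
import numpy as np

def initial_guess(f0, g0):
    """Constant input holding the start pose: solve f_n(x_d(0)) + g0 u0 = 0.

    f0: (n,) drift evaluated at x_d(0); g0: (n, m) input matrix at x_d(0).
    For underactuated systems (m < n) the equation is solved in the
    least-squares sense. (Hypothetical helper, for illustration only.)
    """
    u0, *_ = np.linalg.lstsq(g0, -f0, rcond=None)
    return u0

# Square toy check: g0 u0 must cancel f0 exactly.
f0 = np.array([1.0, -2.0])
g0 = np.array([[1.0, 0.0], [0.0, 2.0]])
u0 = initial_guess(f0, g0)
```

The least-squares formulation covers both the square Panda case and the underactuated RR case, where an exact solution may not exist.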
To quantify the tracking performance, we use as a metric the root mean square (RMS) of the norm of the output error, showing that the D-condition (7) extends the ND-condition (5). The learning is executed until the RMS error reaches a value of 0.001 rad.
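The metric and the stopping rule can be sketched as follows (helper names and the error-array layout are assumptions for illustration):

```python
import numpy as np

def rms_error(e):
    """RMS of the Euclidean norm of the output error over one trial.

    e: (N, m) array of output-error samples. (Hypothetical helper.)
    """
    return float(np.sqrt(np.mean(np.sum(e**2, axis=1))))

def learning_done(e, tol=1e-3):
    """Stopping rule: RMS error below tol (0.001 rad in the paper)."""
    return rms_error(e) < tol
```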

A. TWO DOFS UNDERACTUATED ROBOT: RR
We simulate the dynamics of a two-DoF underactuated arm with elastic joints. We refer to [9] for a more exhaustive treatment of the system dynamics. Let m = 0.55 kg, J = 0.001 kg m², l = 0.085 m, a = 0.089 m, and d_ν = 0.3 N m s/rad be the mass, inertia, length, center-of-mass distance, and damping of each link, respectively. The stiffness of each joint is tested in two configurations: Soft, i.e., k = 1 N m/rad, and Stiff, i.e., k = 3 N m/rad. For the sake of clarity, let us recall that the state x ∈ R^4 of the robot is x = [x_1, x_2, x_3, x_4]^T, where x_1 and x_2 are the joint positions, while x_3 and x_4 are the joint angular velocities.
To test the robustness of the method, we design the learning gain Γ_j using a model whose parameters differ from the nominal ones. In particular, the second-link parameters m̃_2, J̃_2, l̃_2, and ã_2 are decreased by 50%. This is a state-repetitive disturbance d_px in (1). Additionally, we test the control algorithm simulating measurement noise d_rty_j(t), external disturbances, and delays in the controller u_j(t); the latter two can be modeled as state-nonrepetitive disturbances d_rx_j(x_j(t)) in (1). The chosen output function h(x) ∈ R is the absolute angle of the robot tip, i.e., y = x_1 + x_2, which leads to a relative degree r = 2 iff the decoupling term (33) is nonzero [9], where b_1 = m_2 a_1² + m_1 l_1² + J_1, b_2 = m_2 l_2² + J_2, b_3 = a_1 l_2 m_2, and det M(x_j) ≠ 0. To fulfil (7), we choose the constant learning gain (35). In each trial, the starting configuration is x_j(0) = 0_{4×1}, ∀j ∈ U, and the control gains are [ϒ_0, ϒ_1, ϒ_2] = [80, 5, 1].

1) Soft Configuration
In the presence of particularly low stiffness, during the robot motion, the second-link position x_2 reaches x_2 = π/2. Thus, (33) vanishes, leading to a variation of the relative degree. This variation causes a failure of the convergence condition (5), and no conclusions on the convergence of the iterative method can be drawn. However, this simulation shows that, using the gain (35), we can guarantee the convergence thanks to (7). We test the same task in two different conditions:
• D-Model: we design the learning gain using (35). The disturbances are due to model uncertainties and a change in the relative degree, namely d_pt(t).
• D-Noise: we employ (35) and, in addition to the issues of the D-Model case, we inject Gaussian noise into the system, simulating the presence of time-nonrepetitive disturbances d_rty_j(t). The mean value of the Gaussian noise is 0, with standard deviation equal to 10^{−3} on the position measurements and 10^{−5} on the velocity measurements.
Thus, in the D-Noise scenario, recalling (1)-(2) and (6), the simulated system can be written as (36)-(37). Finally, in the D-Model scenario, we have d_rty_j(t) ≡ 0_{m×1} and d_pt(t) ≡ d_rt_j(t) ≡ 0_{2n×1} in (36)-(37). Fig. 1 reports the simulation results, where at trials j = 6, 8, x_2 reaches π/2. Fig. 1(a) shows the tracking performance at the last iteration, while Fig. 1(b)-1(c) depict the error evolution over iterations.

2) Stiff Configuration
We test the same task in four different conditions:
• ND: we use the nominal model in (24), where D(x_j) is computed as in (33).
• D-Model: we design the learning gain using (35). The disturbances are due to model uncertainties, i.e., d_px(x_j(t)).
• D-Force: we employ (35) and, in addition to the issue of the D-Model case, we simulate the presence of an external force due to an interaction between the robot and the environment, which occurs at trials j = 8, 12. This is a state-nonrepetitive disturbance d_rx_j(x_j(t)), which is mapped at the joint level with d̄_rx_j = 0.5 N m at t = 5 s.
• D-Delay: we employ (35) and, in addition to the issue of the D-Model case, we simulate the presence of a 1 s delay in the control action. This occurs at trials j = 8, 12, and it can be modeled as a state-nonrepetitive disturbance d_rx_j(x_j(t)).
Thus, in the D-Force and D-Delay scenarios, recalling (1) and (6), the simulated system can be written as (38). Finally, in the D-Model scenario, we have d_pt(t) ≡ d_rx_j(x_j(t)) ≡ 0_{2n×1} in (38). Fig. 2 reports the simulation results. Fig. 2(a) depicts the tracking performance at the last iteration, while Fig. 2(b) depicts the error evolution over iterations.
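The qualitative behavior expected in the disturbed scenarios can be reproduced on a toy scalar example (not the robot model): a disturbance is injected at trials 8 and 12, mirroring the D-Force timing, and the error spikes at those trials before the contraction restores convergence. All numbers here are illustrative assumptions.

```python
import numpy as np

# Toy scalar sketch (NOT the robot model): plant y = b*u + d_j with the
# feedforward ILC update u <- u + gamma*(y_d - y). A state-nonrepetitive
# disturbance of magnitude 0.5 hits trials 8 and 12 only.
b, gamma, y_d = 2.0, 0.4, 1.0
u, errs = 0.0, []
for j in range(30):
    d = 0.5 if j in (8, 12) else 0.0   # nonrepetitive disturbance
    e = y_d - (b * u + d)              # tracking error of trial j
    errs.append(abs(e))
    u += gamma * e                     # contraction factor |1 - gamma*b| = 0.2

# The error jumps at the disturbed trials (non-monotonic convergence),
# but the undisturbed trials contract it back toward zero.
```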
B. FRANKA EMIKA PANDA
We test the method on the Panda manipulator in two conditions:
• D-Data Loss: we design the learning gain such that Γ_j(t) = εM(q_j), which is a model uncertainty, namely d_pt(t). Additionally, at trials j = 4, 6, we simulate a complete loss of joint-position data, i.e., Γ_{j+1} = εM(q_0), leading to a failure of (5). The loss of data can be modeled as a state-nonrepetitive disturbance d_rx_j(x_j(t)).
• D-Delay: in addition to designing the learning gain as in the D-Data Loss case, we simulate the presence of a 0.8 s delay in the control action.

V. DISCUSSION
In the absence of state-nonrepetitive disturbances, the convergence is smooth, fast, and exponential (Fig. 2(b), Fig. 3). On the other hand, as expected, the presence of state-nonrepetitive disturbances leads to an increment of the error for some iterations (Fig. 1(b), Fig. 2(b), Fig. 3), leading to a non-monotonic convergence. However, thanks to the fulfillment of the proposed D-condition, the controller is able to achieve the same tracking performance (goal G2)), see Fig. 2(a) and Fig. 4(a). This proves that the D-condition is more robust than the original ND one. Indeed, the D-condition obtains the minimization of the error while dealing with the incorrect contribution added to the control input, achieving the same tracking performance as the ND-condition.

VI. CONCLUSION AND FUTURE WORK
In this paper, we tackled the problem of trajectory tracking for continuous-time nonlinear systems affected by disturbances. We defined different classes of disturbances. The goal was to obtain a controller able to achieve good tracking performance even in the presence of state-nonrepetitive disturbances. We proposed and proved a convergence condition for a class of iterative learning controllers. The algorithm is purely feedforward, and it copes with nonlinear systems with a generic and fixed relative degree. The proposed method is robust to both repetitive and nonrepetitive disturbances. Additionally, we presented an upper bound of the disturbance amplitude that can be dealt with. Finally, we validated the proposed method through simulations using an underactuated compliant arm and a Franka Emika Panda robot, both subjected to different types of disturbances. Future work will investigate the robustness of the iterative framework from both a theoretical and an experimental point of view. We will combine feedforward and feedback terms and design switching policies depending on the system relative degree. Additionally, the employed control law (3) is based only on output measurements. Future work will rely on state observers [35] to design a control law exploiting the knowledge of the whole state. Finally, from a more experimental point of view, we will implement the algorithm on a real soft continuum prototype and on medical image encryption applications [36], both in the presence of disturbances.