Robust offset-free constrained Model Predictive Control with Long Short-Term Memory Networks - Extended version

—This paper develops a control scheme, based on the use of Long Short-Term Memory neural network models and Nonlinear Model Predictive Control, which guarantees recursive feasibility with slow time variant set-points and disturbances, input and output constraints and unmeasurable state. Moreover, if the set-point and the disturbance are asymptotically constant, offset-free tracking is guaranteed. Offset-free tracking is obtained by augmenting the model with a disturbance, to be estimated together with the states of the Long Short-Term Memory network model by a properly designed observer. Satisfaction of the output constraints in presence of observer estimation error, time variant set-points and disturbances


I. INTRODUCTION Model Predictive Control (MPC)
is an optimizationbased control method, that consists in solving at each sample time instant a Finite Horizon Optimal Control Problem (FHOCP) and then applying only the first element of the optimal control sequence.In the last 30 years many theoretical results have been proposed in order to derive MPC control algorithm with guaranteed properties both for linear and nonlinear systems.Stabilization of a given steady state equilibrium is a largely solved problem, with approaches based on terminal ingredients (terminal set/cost) [2], [3] or sufficiently long prediction horizon [4], [5], [6], [7] in the nominal case, i.e. when the model equation and state are exactly known.Starting from this simple case, several results were developed in order to consider more realistic situations that involves disturbances, model uncertainties, time variant references a-priori unknown and unmeasurable states.The development of MPC control algorithms able to tackle these aspects is made particularly difficult when state/output constraints, that are one of the most important characteristic of MPC, are considered.In fact, recursive feasibility is heavily affected by all these aspects.In particular, to take into account model uncertainties and disturbances several algorithms were proposed both in a bounded deterministic context [8] and in a stochastic one [9], [10], [11].For linear systems robustness is typically achieved using tube based robust MPC schemes [12].For nonlinear systems it is possible to rely on constraint tightening approaches, for example based on the Lipschitz constant of the system under control [13], [14], or on min-max approaches [15].Concerning the tracking problem most of the attention was devoted to I. Schimperna and L. Magni are with Department of Civil and Architecture Engineering, University of Pavia, via Ferrata 3, Pavia, 27100, Italy (e-mails: irene.schimperna01@universitadipavia.it,lalo.magni@unipv.it)compensate exogenous signals (reference and/or disturbance signals) generated by a neutrally stable ecosystem [16], [17] and in particular constant signals [18], [19], and to guarantee recursive feasibility with set-point changes by simultaneously optimizing an artificial reference [20], [21], [22].In [23] an offline procedure to compute a parameterized terminal cost has been developed for both set-point and trajectory stabilization.Moreover, to guarantee offset-free tracking at steady state despite the presence of persistent disturbances and model-plant mismatch, a well-known method consists in augmenting the model with a properly designed constant disturbance, to be estimated by the observer together with the states of the model.Sufficient conditions that the disturbance model, the observer and the MPC have to satisfy in order to guarantee zero error at steady state are analysed for linear models in [24], [25] and for nonlinear models in [26].The case of unmeasurable states has been typically solved by the introduction of an observer.The presence of constraints however calls for the design of a robust state-feedback MPC and for a properly designed observer.In most of the literature about output MPC the observer estimation error is regarded as a generic disturbance [27].Few exceptions are [28], that proposes a constraint tightening technique based on the evolution of the maximum observer estimation error, and [29], that uses a min-max optimization problem to simultaneously compute a Moving Horizon Estimation and a robust MPC.The properties of any MPC algorithm are also affected by the quality of the model that can range from linear to nonlinear, physical based to black-box identification, fine dimension to infinite dimension.
In this paper, a Neural Network (NN) model has been considered for its flexibility and modularity and possibility to be learnt directly from the data without the need of analysing the physics of the system.The idea of developing models inspired by human brain neurons dates back to 1957 with the Rosenblatt perceptron [30], and also some interesting theoretical results concerning the modelling capabilities of NN, like the universal approximation theorem [31], have been known for years.Despite this, the use of NN had a great increase in recent years, thanks to the availability of huge amount of data and of cheap hardware specialized for parallel computations (GPUs), that allows to perform computationally intensive training procedures that were infeasible until some years ago.NN are now known to be effective in a large variety of contexts and applications [32], like image [33], speech [34] and handwriting recognition [35], prediction [36] and forecasting [37], [38].
For MPC, a class of NN that can be used to provide input-output models of the system under control are Recurrent Neural Networks (RNN) [39].In fact RNN have the same mathematical structure of a discrete-time dynamical system, and for this reason they can be very effective in describing the dynamic nature of the system under control.In the family of RNN, two architectures that have shown remarkable performances in several tasks are Long Short-Term Memory (LSTM) networks [40] and Gated Recurrent Units (GRU) [41].Recently, in the control field, a new architecture with guaranteed stability properties called Recurrent Equilibrium Networks (REN) has been proposed [42].In [43], LSTM neural networks have been analysed from a stability point of view and applied as models for MPC.In particular, conditions on LSTM model's parameters guaranteeing the Input-to-State Stability (ISS) [44] and the Incremental Inputto-State Stability (δISS) [45] are designed.Then, assuming that the trained model exactly represents the dynamic of the system and relying on δISS, an observer guaranteeing that the estimated state asymptotically converges to the true value is proposed.Based on the LSTM model and on the state observer, a stabilizing MPC control algorithm solving the tracking problem for constant state and input reference values and in presence of input constraints is proposed and analysed.
In this paper, the algorithm proposed in [43] has been modified in order to cope with time variant set-points and disturbances, output constraints and to guarantee zero-error regulation for asymptotically constant set-points and disturbances.In this respect, the δISS LSTM model is employed in the design of an MPC scheme that guarantees recursive feasibility with a-priori unknown slow time variant bounded reference signals, unknown disturbances and input and output constraints by means of a properly designed robust MPC inspired by [28] and a state observer properly designed for the LSTM model completed with a disturbance model that satisfies the sufficient conditions for offset-free stated in [26].Further to the state-feedback MPC and the observer, the control scheme includes a Reference calculator, that computes state and input set-points for the MPC at every sample time instant on the base of the current value of the output reference signal and of the disturbance estimation.A preliminary version of this zero-error control scheme for unconstrained systems was presented in [46].Remarkably, the results derived in this paper are useful not only in combination with LSTM models but for any δISS model like [42] and [47].
The paper is organized as follows.After the notation and some preliminary definitions, in Section II the control problem is described, and all the components of the proposed control algorithm are introduced and analysed.In Section III the main results of the paper are reported, concerning the feasibility, stability and offset-free analysis of the control scheme.In Section IV the proposed control algorithm is tested on a benchmark pH neutralization process [48].Finally, Section V draws the conclusions of the work.All the proofs are reported in Appendix I.

A. Notation
Considering a vector v, v (j) is its j-th component, v ⊤ is its transpose, ∥v∥ is its 2-norm, ∥v∥ ∞ is its infinity-norm and ∥v∥ 2 A = v ⊤ Av is the squared norm weighted with matrix A. |v| is the vector containing the absolute values of the elements of v. Inequalities between vectors are considered element by element.Given two vectors v and w, v • w is their element-wise product.Considering a matrix M , M (ij) is its element in position ij, M (j * ) is its j-th row, ρ(M ) is its spectral radius, i.e. the maximum absolute value of its eigenvalues, and ∥M ∥ and ∥M ∥ ∞ are its induced 2-norm and ∞-norm.λ max (M ) and λ min (M ) are respectively the maximum and minimum eigenvalues of the symmetric matrix M .Given a positive definite matrix A, A 1/2 is the unique positive definite matrix B such that BB = A. 0 m,n is the m × n null matrix, I n is the n × n identity matrix and 1 n is the vector of ones of n elements.sat(v, v max ) denotes the saturation operator between −v max and +v max , that is applied element by element when v is a vector.

B. Definitions
Definition (K-function): A continuous function γ : R ≥0 → R ≥0 is a class K function if γ(s) > 0 for all s > 0, it is strictly increasing and γ(0 ) is a class K function with respect to s for all k, it is strictly decreasing in k for all s > 0, and β(s, k) → 0 as k → ∞ for all s > 0.
□ The notions of Input-to-State practical Stability (ISpS) and δISS are now introduced for the generic discrete-time dynamical system x k+1 = f (x k , u k ), with x ∈ R nx and u ∈ R nu .The stability notion are stated in the sets X ⊆ R nx and U ⊆ R nu , with the set X assumed to be positive invariant for the dynamical system, i.e. for any u ∈ U, it holds that x ∈ X =⇒ f (x, u) ∈ X .
Definition (ISpS, [15]): The dynamical system x k+1 = f (x k , u k ) is ISpS in the sets X and U if there exist functions β ∈ KL and γ ∈ K ∞ and a constant c > 0 such that for any k ∈ Z ≥0 , any initial condition x 0 ∈ X , any input sequence u 0 , u 1 , ..., u k−1 with u h ∈ U for all h = 1, .., k − 1, it holds that: Definition (δISS): The dynamical system x k+1 = f (x k , u k ) is δISS in the sets X and U if there exist functions β ∈ KL and γ ∈ K ∞ such that for any k ∈ Z ≥0 , any pair of initial conditions x a,0 ∈ X and x b,0 ∈ X , any pair of input sequences u a,0 , u a,1 , ..., u a,k−1 and u b,0 , u b,1 , ..., u b,k−1 , with u a,h , u b,h ∈ U for all h = 1, .., k − 1, it holds that: where x a,k and x b,k satisfy the dynamics equation x k+1 = f (x k , u k ) respectively with the inputs u a and u b and initial states x a,0 and x b,0 .Definition (exponential δISS): The dynamical system x k+1 = f (x k , u k ) is exponentially δISS in the sets X and U if it it δISS in the sets X and U and the function β is exponential with respect to the second argument, i.e. there exist µ > 0 and λ ∈ (0, 1) such that

II. PROBLEM FORMULATION AND CONTROL ALGORITHM
In this work a nonlinear plant with unknown dynamics, input u ϕ ∈ R m , output y ϕ ∈ R p and a possible unknown bounded asymptotically constant disturbance d ϕ is considered as system under control.The plant input is saturated: with u ϕ,min , u ϕ,max ∈ R m , and the output has to be limited: with y ϕ,min , y ϕ,max ∈ R p .Assumption 1: The plant has the same number of inputs and outputs, i.e. m = p.
Remark 1: Assumption 1 is typical of nonlinear offsetfree MPC schemes [26], and is required to have a unique input combination associated to each output at equilibrium.However, if m > p the proposed algorithm can still be applied by selecting a subset of p inputs to be used for control and leaving the other m − p inputs constant.□ The objective of the control is to achieve null error at steady state for an asymptotically constant reference y 0 ϕ also in presence of model-plant mismatch and of bounded asymptotically constant plant disturbances d ϕ , and to respect input and output constraints.The proposed control scheme, whose block diagram is reported in Fig. 1, is based on an LSTM nominal model of the plant.A key element to obtain a well tuned LSTM model is the normalization of input and output signals that is represented in the scheme with the blocks "N" and that consists in an affine transformation that scales the variables in a predefined range (typically [-1,1]).The normalized input is denoted with u and the normalized output with y.For the normalized variables constraints (3)-( 4) become where u max ∈ R and is typically equal to 1, and where y min , y max ∈ R p .
To achieve null error at steady state the LSTM model is augmented with an asymptotically constant disturbance term d.Then an observer based on the equations of the LSTM model provides an estimation x of the state x of the LSTM and an estimation d of the disturbance d.At each sampling instant the current values of the normalized set-point y 0 and of the disturbance estimation d are used by a Reference calculator to compute the current values of input and state set-points ū and x for the MPC.The last block is the MPC, that solves a FHOCP and gives in output the first element of the achieved optimal control sequence.Since in the control problem formulation output constraints are considered, to ensure recursive feasibility in presence of the observer estimation error and time variant set-points, it is necessary to rely on a robust MPC.
As usually done for the analysis of offset-free control schemes, closed-loop stability and constraint satisfaction will be proven in Section III-A under the assumption that the plant behaves according to its LSTM model with an asymptotically constant additive disturbance d, as reported in Fig. 2, while offset-free will be shown in Section III-B for the real closed-loop system (Fig. 1) under the assumption that closed-loop convergence is not lost.
In the following subsections the different components of the control scheme are presented in detail.

A. Long Short-Term Memory neural network model
Long-Short Term Memory (LSTM) neural networks are a particular kind of RNN, capable of learning long-term dependencies between data.The LSTM module is composed by two states: the cell state c ∈ R n and the hidden state h ∈ R n , where n is also called number of neurons.At each time-step k, the LSTM network receives an input u ∈ R m and produces an output prediction ξ ∈ R p .Cell state and hidden state are modified through structures called gates.The equations that describe the LSTM network are the following where σ(z) = contain the trainable weights of the network.
The LSTM network ( 7) is a discrete-time time invariant dynamical system, that can be written in a more compact way as where 1) LSTM stability properties: In the following the δISS stability condition and theorem for the LSTM network proposed in [43] are recalled.
Assumption 2: The weights of the LSTM network (7) respect the condition ρ(A δ ) < 1, where and [43]): Let Assumption 2 hold.Then it is possible to upper bound the difference between any couple of system trajectories where and the LSTM system ( 7) is exponentially δISS in the sets X and U, where X = C × H, with Using the Jury criterion in [43] it is shown that it is possible to turn the condition of Assumption 2 on eigenvalues of A δ (i.e.ρ(A δ ) < 1) into the following inequality on the trainable weights of the LSTM model The δISS condition can be checked a posteriori or imposed as constraint during the training procedure.
2) Incremental Lyapunov function for the LSTM: Under Assumption 2 it is possible to compute an incremental Lyapunov function for the δISS LSTM model.Such Lyapunov function will be used to define the terminal cost, the terminal constraint and the constraint tightening for the MPC.
Lemma 1: Let Assumption 2 hold, and denote ⊤ with x a , x b ∈ X , and P s the solution of the Lyapunov equation A ⊤ δ P s A δ − P s = −Q s for some symmetric positive definite matrix Q s .Then the function is an incremental Lyapunov function for the system (7), such that where λmax(Ps) and c s ∈ R p with c s(j) = 0 W y(j * ) P −1/2 s .Proof: The proof is reported in Appendix I-A.The focus on δISS LSTM networks limits the applicability of this work to δISS plants.This is standard when black-box identification is used.In fact if small changes of the inputs can modify significantly the outputs (non δISS systems) model identification is very difficult.In these cases an inner closed-loop can be introduced in order to achieve a δISS system to be modelled with the LSTM and controlled with the MPC.

B. State and disturbance observer
In this subsection first a disturbance model is introduced in order to take into account the inaccuracy of the model and the effect of the possible disturbance d ϕ .Then an observer is proposed to provide estimations of the states of the LSTM model and of the disturbance model.The disturbance is modelled as the integral of a bounded unknown input w, that represents the variation of the disturbance.This representation is motivated by the fact that one of the goals of the paper is to guarantee zero error at steady state when y 0 and d ϕ are constant, and therefore w is null.
1) Disturbance model: The state of the disturbance model is denoted by d ∈ R p , while its variation is indicated with w ∈ R p .The equations of the LSTM model augmented with the disturbance are then the following and will also be denoted by where χ = [x ⊤ d ⊤ ] ⊤ and E is matrix that can be obtained from (12).Observer convergence, robust constraint satisfaction and offset-free can be achieved only for bounded disturbances.For this reason in the following d and w are assumed to be bounded, i.e. d ∈ D = {d ∈ R p : ∥d∥ ∞ ≤ d max } and w ∈ W = {w ∈ R p : ∥w∥ ≤ w max }.Moreover, to achieve offset-free it is necessary that for the model augmented with the constant disturbance there exists an equilibrium associated to any possible pair (u, y), and that this equilibrium is unique (see [26]).In the following lemma and corollary it is shown that in view of the δISS property of the LSTM model it is sufficient to add the disturbance only on the output transformation to satisfy this property.Hence the disturbance was not included also in the state equations.
Lemma 2: If the dynamical system x k+1 = f (x k , u k ) is exponentially δISS in the sets X and U, then given a constant input u ∈ U there exists a unique corresponding equilibrium state x * ∈ X , such that ) Corollary 1: For the augmented system (12), given (u, y) with u ∈ U, there exist unique values (x * , d * ) with x * ∈ X such that The proofs of Lemma 2 and of Corollary 1 are reported in Appendix I-B.
2) Observer design: An observer is designed to obtain an estimation χ = [x ⊤ d⊤ ] ⊤ , with x = [ĉ ⊤ ĥ⊤ ] ⊤ , of the state χ of the augmented LSTM model (12).The structure of the observer is based on the LSTM observer proposed in [43], but it has an additional equation for the estimation of the disturbance d.The equations of the observer are the following where L f , L i , L o ∈ R n×p and L d ∈ R p×p are the gains of the observer, that are selected according to the following assumption.
Assumption 3: the observer gains Remark 3: In Theorem 2 it will be shown that A d is the matrix dynamic for an upper bound of the estimation error dynamic.
Remark 4: A possible suboptimal choice for the gains of the observer that satisfies Assumption 3 is In fact with this choice the matrix A d becomes and therefore the eigenvalues of A d are the eigenvalues of A δ and an eigenvalue in 3) Observer convergence: The following lemma defines a positive invariant set for the state χ of the observer, in which it is possible to prove the convergence of the observer estimation.
Proof: The proof is reported in Appendix I-C.
The following theorem reports the main results related to the observer convergence.
Theorem 2: If the plant behaves according to ( 12) with x ∈ X , d ∈ D and w ∈ W, Assumption 2 holds, the observer parameters are selected according to Assumption 3 and χ ∈ Î, then the function where P o is the solution of the Lyapunov equation where χ + = f aug (χ, u) + Ew, χ+ is the next state computed by the observer (16), Proof: The proof is reported in Appendix I-D.Remark 5: In view of the properties of the Lyapunov function V o ( χ, χ), the observer estimation error is also ISS with respect to the disturbance w.

C. Reference calculator
One of the most important characteristics of the proposed algorithm is the possibility to be applied with time varinat set-point and disturbances unknown in advance.In order to do this the MPC assumes a constant set-point along the prediction horizon but it is designed in order to preserve, under suitable assumptions, recursive feasibility even if the exogenous signals change at any time instant.To manage possible variations of the set-point y 0 and/or of the disturbance estimation d, a Reference calculator is introduced in the control loop.The goal of the Reference calculator is to provide the state and input references x = [c ⊤ h⊤ ] ⊤ and ū for the MPC, that are computed by solving the following equations: In order to guarantee offset-free of the closed-loop system, the following assumption on the set-point y 0 and on the LSTM model is introduced.An additional assumption on the set-point y 0 will be introduced in the next section.
Assumption 4: The set-point y 0 and the LSTM model ( 7) respect the following conditions: 1) there exists a bounded set solving (20), and the Jacobian matrix is invertible.

Remark 6:
The assumption that the Jacobian matrix ( 21) is invertible implies the uniqueness of the couple (x k , ūk ).□ Moreover, since a large variation of the set-points x, ū is critical for the MPC recursive feasibility, in the following lemma an upper bound for the variation of the state setpoint x between consecutive time-steps is derived.This upper bound depends on the variation of the set-point y 0 and on the observer parameters.
Lemma 4: Under Assumption 4, given the maximum output estimation error ēy = max k∈Z ≥0 ∥y k − ŷk ∥, there exists a finite constant K such that The proof is reported in Appendix I-E.Remark 7: In general K cannot be computed explicitly, but it can be estimated numerically by gridding.

D. Robust MPC formulation
The robust MPC solves at every time-step a Finite Horizon Optimal Control Problem (FHOCP), where the evolution of the states of the system is predicted with the LSTM model and is initialized with the observer state estimation.The cost function penalizes the deviation from the state and the input set-points computed at the current time instant by the Reference calculator, that are assumed constant along the prediction horizon.The terminal cost and the terminal set are designed to stabilize the closed-loop system.To ensure satisfaction of the output constraints despite the observer estimation error, the disturbance and the set-point variation, a constraint tightening approach similar to the one proposed in [28] and a time variant terminal set are employed.For the constraint tightening, the MPC uses also a time variant term êo ∈ R related to the uncertainty of the observer.This term has the same time evolution of the incremental Lyapunov function for the observer estimation error (19b), that is described by the following equation êo,k+1 = ρ o êo,k + w (23) Note that depending on the values of êo,0 , of ρ o and of w, êo can increase or decrease, but its behavior is always monotonic with Definition (FHOCP): Given the prediction horizon N , the FHOCP for the robust MPC is the following where Q and R are positive definite matrices and are design choices, while P f is a positive definite matrix satisfying the Lyapunov condition A ⊤ δ P f A δ − P f < −qI 2 , where q = λ max (Q).Coefficients a i ∈ R p and b i ∈ R p for the constraint tightening are defined as X f (k) is a time variant terminal set, chosen as a sublevel set of the terminal cost: with where □ At each time-step k the solution of the FHOCP is denoted by u * 0|k , ..., u * N −1|k .According to the Receding Horizon principle, the MPC control law is obtained applying only the first element of the optimal input sequence

III. STABILITY AND OFFSET-FREE RESULTS
In this section the properties of the proposed control schema are analysed.In order to guarantee offset-free results a mild assumption on the convergence of the closed-loop of Fig. 1 will be introduced in Section III-B.However, first it is necessary to prove recursive feasibility and stability for the nominal closed-loop system reported in Fig. 2, where the plant has been substituted by the LSTM with an additive disturbance on the output (12).

A. Recursive feasibility and stability analysis
In this subsection recursive feasibility and stability of the nominal closed-loop system reported in Fig. 2 are analysed.First note that in order to have a solution of the FHOCP, it is necessary that α k > 0 for all k ∈ Z ≥0 .To guarantee this condition, the following assumption on the set-point is introduced.
Assumption 5: The set-point y 0 is such that for all j = 1, ..., p, for all k ∈ Z ≥0 .□ Consider the closed-loop system composed by the augmented LSTM (13), the observer ( 16), the Reference calculator (20) and the MPC (29).This system has state and inputs d and y 0 .Let's now define the feasible set of states and inputs and ∃ a solution of the FHOCP} Remark 8: In the feasible set X M P C it is required that êo is an upper bound for V o ( χ, χ).This condition is similar to require that êo is an upper bound of the observer estimation error.In fact, using Equation (19a) and V o ( χ, χ) ≤ êo , the following inequality relating êo and ∥χ − χ∥ can be derived: □ In the following theorem the main result for the nominal closed-loop schema described in Fig. 2 is derived.
Theorem 3: Let Assumptions 1, 2, 3, 4, 5 hold.Then there exist Lmax > 0, Ld > 0 and ∆y 0 max > 0 such that for L max ≤ Lmax , ∥L d ∥ ≤ Ld and y 0 such that y 0 k+1 − y 0 k ≤ ∆y 0 max for all k ∈ Z ≥0 , for the closedloop system composed by the augmented LSTM (13), the observer (16), the Reference calculator (20) and the MPC (29) the following properties hold: • constraints ( 5) and ( 6) are satisfied ∀(ψ, d, y 0 ) ∈ X M P C ; 13)-( 20)-( 29) is ISpS with respect to the observer estimation error χ− χ in ⊤ and ū∞ are the solution of (20) when y 0 = y 0 ∞ and d = d∞ .Proof: The proof is reported in Appendix I-F.Remark 9: In order to satisfy the condition on the rate of change of y 0 it is sufficient to pass the set-point through a rate limiter.

B. Offset-free result
It is now possible to follow the results in [26] to show that the proposed scheme based on the model augmented with the disturbance and on the Reference calculator guarantees offset-free at steady state also when applied to the real plant (see Fig. 1), provided that the uncertainty on the model is sufficiently small to preserve convergence and constraints satisfaction, as assumed in the following assumption.
Assumption 6: The plant disturbance d ϕ is bounded and asymptotically constant, and the closed-loop system composed by the plant, the observer ( 16), the Reference calculator (20) and the MPC (29) respects the constraints and converges to constant values strictly in the interior of the feasible set.

IV. NUMERICAL EXAMPLE
As benchmark example to test the proposed control algorithm a pH neutralization process described in [48] and used also in [43] was considered.The schematic layout of the plant is reported in Fig. 3.The process consists in a tank where three flows of substances are mixed: an acid flow q 1 , a buffer flow q 2 and an alkaline flow q 3 .The measured variable is the pH on the output flow.Describing states, disturbance, input and output as follow where W a4 is a charge related quantity in the output flow, W b4 is the concentration of CO 2− 3 ions in the output flow, h is the tank level and the acid flow q 1 is assumed to be a constant parameter, the process model can be written in the form In Table I the nominal values of the model parameters are reported.The objective is to control the pH by acting on the alkaline flow q 3 , while the buffer flow q 2 is considered as a disturbance.The alkaline flow is considered saturated, so that u ϕ ∈ U ϕ = [12.5,17] mL/s

A. LSTM model identification
To extract the dataset for the training of the LSTM model, a simulator of the plant was forced with multilevel pseudorandom signals, and the input-output response was sampled with a sampling period of T s = 10s, as done in [43].In particular the input signals for the plant simulator are piecewise constant signals with random values in U ϕ , where each constant value is applied for a random time between 10T s and 100T s .15 sequences of N t = 1500 time-steps each have been extracted, and have been split in 10 sequences for training, 3 for validation and 2 for testing.All the sequences have been generated considering the disturbance constant at its nominal value q 2 = 0.55mL/s.Before the training, inputs and outputs have been normalized using their maximum and minimum values present in the training dataset, so that The LSTM neural network was developed in Python 3.9, using Tensorflow 2.8.To obtain a final network respecting the δISS condition of Remark 2, the training loss and the stopping rule for the training have been modified as suggested in [43] and [39].In particular, the training loss L was selected as the sum of the Mean Squared Error (MSE) and of some terms penalizing weights that do not respect the δISS condition.The loss for the single sequence is where y real is the output in the dataset and y is the output predicted by the LSTM model.
σf σx ∥U o ∥ − 1 are terms that need to be negative to satisfy the condition of Remark 2. λ 1 and λ 2 are two positive hyperparameters of the training, that weight the different terms in the loss.In particular λ 1 determines how much to penalize sets of weights that do not respect the δISS condition, while λ 2 can be tuned to obtain a model that satisfies the δISS condition by a larger margin, providing a model with smaller values of ρ s .In fact a small value of ρ s , together with a small value of ρ o , implies that coefficients a i defined in (26) are smaller, leading to a less conservative constraint tightening for the MPC.Even if the δISS condition was included in the training optimization as a soft constraint, it was possible to ensure its satisfaction by introducing it also as a condition for the termination of the training.In particular the training was terminated after a predefined number of epochs only if the δISS condition was satisfied, otherwise it was left running for a larger number of epochs.
The performances of the final model were assessed with the FIT index, defined as where y and y real are the vectors containing the predicted and the real evolution of the system, and y avg is the average of y real .A trained network with n = 5 neurons and with

B. Control implementation
In order to tune the controller it was assumed d max = 0.1, that corresponds to the 5% of the range of variation of the output in the training dataset, since the output was normalized so that −1 ≤ y ≤ 1, and w = 0.01.Under this assumption, the following parameters have been used.The cost matrices were set to Q = I 2n and R = 1, and the prediction horizon to N = 5.The output constraint set was selected as Y ϕ = [6.0,9.0].The observer gains L f , L i , L o were chosen by solving the optimization proposed in [43], while L d was set equal to 0.01.The matrix P o was obtained by solving the Lyapunov equation with Q o = 1000I 3 , leading to ρ o = 0.99, L max = 8.4 × 10 −4 , w max = 4.3 × 10 −5 and ē∞ = 1.04.Concerning the parameters related to the recursive feasibility condition, K = 2.67 has been estimated by gridding, while ēy was set to 0.03.This value is clearly affected by the quality of the initialization of the observer.In the considered simulations this bound is always largely respected.The initial value for the observer state was set to the state equilibrium of the LSTM model associated with the initial output of the system under control and d0 = 0.With the considered parameters, the time variant interval for y 0 needed to satisfy Assumption 5 converges to [6.49, 8.51] after an initial transient.

C. Simulation results
First, the conservatism of the feasibility/stability conditions has been tested by performing some simulations where the controller was applied on a perturbed version of the LSTM model.In the first simulation, reported in Fig. 4, slow variations of the set-point y 0 and of the disturbance d are applied, in order to respect all the sufficient conditions required by Theorem 3. As predicted by the theory, recursive feasibility, stability and zero error are fulfilled.Then, considering that all the bounds have been derived using a sequence of possibly conservative inequalities, an analysis has been carried out to verify the applicability of the control in presence of larger variations of the set-point y 0 and of the disturbance d, that do not satisfy the sufficient conditions required by Theorem 3. In particular in Fig. 5 and 6 simulations with increasingly larger set-point and disturbance variations are reported.In the simulation reported in Fig. 5 feasibility is never lost, while in Fig. 6 in the time instants highlighted in grey the FHOCP is not feasible because of the large variations of y 0 .Notably, feasibility is maintained also with large step variations of the disturbance d.
Then, the controller has been applied to the real plant (represented by the physical equations describing the system).Since model uncertainty has not been estimated the bounds d max and w are considered as tuning parameters of the control.Also the parameter êo has been considered as an additional tuning parameter of the control, since the states of the plant do not coincide with the states of the LSTM.The first simulation has been carried out using the same parameters of the simulations performed using the LSTM model, and is reported in Fig. 7.It can be noticed that the  response of the plant to the disturbances is very slow, and null error is reached only after a very long transient.Hence, in a second simulation, L d has been increased to 0.1 to make the disturbance estimation in the observer faster.With this tuning of the observer ρ o = 0.97, L max = 8.5 × 10 −3 and ē∞ = 0.38.In this case, the asymptotic interval for y 0 needed to satisfy Assumption 5 is [6.57, 8.43].The resulting trajectory is reported in Fig. 8, and it can be seen that the control is able to correctly manage the plant, maintain the recursive feasibility and achieve null tracking error also in presence of disturbances.

V. CONCLUSION
In this paper an MPC control scheme based on an LSTM model of the plant under control has been proposed.The MPC algorithm is able to take into account input and output constraints and to guarantee null error at steady state also in presence of modelling errors and of asymptotically constant bounded disturbances.Offset-free is obtained by introducing the estimation of a disturbance term in the observer, that is used to update at every time-step the reference values for the MPC cost.The satisfaction of output constraints in presence of the observer estimation error and time variant set-point is obtained using a constraint tightening technique, based on an upper bound of the norm of the observer estimation error.The key element for the design of the constraint tightening is the derivation of incremental Lyapunov functions for the LSTM model and for the observer, whose parameters are employed in the definition of the coefficients for the constraint tightening.In the FHOCP formulation for the MPC a time variant terminal constraint is also introduced, that guarantees recursive feasibility in presence of variations of the reference values.Then, ISpS and convergence are proven by means of a Lyapunov function, different from the optimal cost because considers the deviation from the asymptotic values of state and input set-points that are apriori unknown.
The main limitation of the proposed approach is the conservatism of the robust constrained MPC algorithm that must cope with both the disturbances and the set-point variations.In order to partially reduce this limitation an artificial reference approach could be adopted, avoiding the restriction on the tightened constraints introduced by the set-point variation.Further improvement could be achieved by deriving less conservative bounds along all the proofs.However most of the conservatism is inherent in the worst case deterministic approach.As shown in the simulation example it is possible to find a compromise between a-priori feasibility guarantee and performance by adapting some of the algorithm parameters.

APPENDIX I PROOFS
The following lemmas will be used for the proofs.Condition (11a) is easily verified for the definition of V s .Condition (11b) can be verified as follows: Finally, to verify condition (11c), consider the j-th row of |W y (h a − h b )| for j = 1, ..., p:

B. Proofs of Lemma 2 and of Corollary 1
Proof of Lemma 2: Existence of x * : It is first proven that if a constant input u is applied to a δISS system, then the resulting state trajectory is asymptotically constant.Considering any possible initial state x 0 , and applying the definition of δISS (2) with x b,0 = x 0 , x a,0 = x 1 = f (x 0 , u) and u a,h = u b,h = u for all h = 0, ..., k − 1, it follows that: Hence the difference between x k+1 and x k becomes small when k increases.By summing up the previous inequality one has that lim that is finite in view of the exponential nature of the function β.Then the resulting trajectory is asymptotically constant.Moreover, given any u there exists x * solving (15a), that corresponds to the asymptotic value of the trajectories associated to the constant input u.Finally, x * ∈ X in view of the positive invariance of X .
Uniqueness of x * : Consider the definition of δISS (2), and assume that there exist two different equilibrium states x * a and x * b corresponding to the same input u.Under this assumption it is possible to study the evolution of the system with x a,0 = x * a , x b,0 = x * b and u a,h = u b,h = u for all h = 0, ..., k − 1.Then, for the δISS assumption, it follows that ∥x a,k − x b,k ∥ → 0 for k → ∞.This is a contradiction with the fact that x * a and x * b are two different equilibrium states.Then the equilibrium state x * is unique.
Proof of Corollary 1: The existence and uniqueness of x * follows from Lemma 2. Then there exists a unique value d * satisfying (15b), that is d * = y − g(x * ).

C. Proof of Lemma 3
First of all, recall from [43] that σf , σi , σo and σc are bounds on the values of the gates of the LSTM.In particular, if h ∈ H and u ∈ U one has that for j = 1, ..., n Consider Equation (16b) for ĥ.H is an invariant set for ĥ because each component of ĥk+1 is computed as the product between a sigmoid function, whose output lies in (0, 1), and an hyperbolic tangent, whose output lies in (−1, 1).
Moreover, D is an invariant set for d in view of the saturation in Equation (16c).
It is now proven that Ĉ is a positive invariant set for ĉ.First note that if u ∈ U, h, ĥ ∈ H, d, d ∈ D, in view of (12d), one has that and that Consider now the j-th component of ĉ, and assume ĉk ∈ Ĉ, h k , ĥk ∈ H and d k , dk ∈ D. By taking the absolute value of the j-th component of (16a) one gets Ĉ is a positive invariant set for state ĉ.

D. Proof of Theorem 2
The proof is divided in three parts: 1) Derivation of an upper bound for the evolution of the observer estimation error; 2) Proof of the properties (19) of the observer Lyapunov function V o ; 3) Proof of convergence, under the assumption that w k → 0 for k → ∞.Part 1: First note that in a similar way to (34), it is possible to show that Let's now define the error variables e c = c− ĉ, e h = h− ĥ and e d = d − d, and let's study their evolution in time.
Consider the time evolution of e c : Taking the norm of both sides, exploiting c ∈ C, (7c), (33a), (33b), (33c), (33d), ( 34), (36), the definition of σx , and the fact that σ(•) and tanh(•) are Lipschitz continuous functions with Lipschitz constants respectively of 1 4 and 1 (see (7)), one has that Ordering all terms one has Consider the time evolution of e h : Taking the norm of both sides, exploiting (33a), (33b), (33c), (33d), ( 34), (36), the definition of σx , lipschitzianity of σ(•) and tanh(•), one has that By substituting ∥e + c ∥ with its upper bound (37) and ordering all terms one has Consider the evolution in time of e d .First define d+ ns = d + L d (y − ŷ), that is (16c) without the saturation operator.Then By taking the norm of both sides one has that Moreover, noting that ∥e + d ∥ ≤ ∥e + d,ns ∥, it is possible to write in matrix form (37), (38) and (39), obtaining where A d is defined in (17).Part 2: Condition (19a) is easily verified for the definition of V o .
Proof of (19b) is shown in the following:

Po
The two terms of this sum are now considered separately.
For the first term one has that For the second term, in view of boundedness of set W, one has that there exists a finite constant w such that for all Finally, to study condition (19d), first note that where f c and f h represent Equations (12a) and (12b) respectively, and f d represents Equation (12c) without the term w k .
In particular Taking the norm of both sides of ( 44), one has then observer convergence immediately follows from (40).

E. Proof of Lemma 4
Let's define ŷ0 = d − y 0 and ζ = [x ⊤ ū⊤ ] ⊤ .Then the equilibrium problem (20) can be rewritten as the problem of finding the solution of F (ŷ 0 , ζ) = 0, with In this proof the function ζ = γ(ŷ 0 ) such that F (ŷ 0 , γ(ŷ 0 )) = 0 is studied.Let's define the following Jacobian matrices of F in the point (ŷ 0 , γ(ŷ 0 )): , where K x(ŷ 0 ) are the first 2n rows of K(ŷ 0 ) and K ū(ŷ 0 ) are the remaining m rows.Then, denoting it is possible to obtain a relationship between the maximum variation of x and the variation of ŷ0 : Note that the maximum that defines K is always finite in view of the boundedness of the sets Y 0 and D. The variation of ŷ0 = d−y 0 can be given by a variation of the disturbance estimation d and/or by a variation of the setpoint y 0 .This two contributions can be separated, obtaining The maximum variation of the disturbance estimation d depends on the maximum estimation error ēy and on the observer gain L d .In particular it holds that Then, combining ( 45) and ( 46), ( 22) is proven.

F. Proof of Theorem 3
The proof is divided in 3 parts: 1) Proof of satisfaction of constraints ( 5) and ( 6), given that and (25e); c) Show that the candidate solution satisfies the terminal constraint (25h).3) Proof of ISpS and convergence, that is divided in: a) Proof of ISpS; b) Proof of convergence.Part 1: If ψ ∈ X M P C , the satisfaction of the input saturation (5) follows from constraint (25f) of the FHOCP formulation.Also the satisfaction of the output constraint (6) is guaranteed by the control design.In particular y k ≤ y max follows from The fact that y k ≥ y min can be proven in a similar way: ≥ y min + a 0 êo,k − a 0 êo,k = y min Part 2a: To verify that if x ∈ X f (k) then the tightened output constraints are satisfied, first note that if x ∈ X f (k) one has that Considering now the j-th row of W y h + b y + d max 1 p one has W y(j * ) h + b y(j) + d max = W y(j * ) (h − hk ) + W y(j * ) hk + b y(j) + dk(j) − dk(j) ≤ [0 W y(j * ) ]P In a similar way it is possible to prove that ≤ [0 W y(j * ) ]P Preliminarily, note that thanks to Lemma 1 and Theorem 2 one has that and that using Lemma 1 and inequality (11b) recursively it follows for i = 0, ..., N .Moreover, as shown in Part 2a of the proof, since x Hence, xi|k+1 satisfies (25d) for i = 0, ..., N −1 as shown in the following In a similar way it follows that xi|k+1 also satisfies (25e) for i = 0, ..., N − 1: starting from the fact that x * N |k ∈ X f (k), i.e.
Comparing ( 49) and (50) it is apparent that both the sides of the inequalities change.Hence, an upper bound for the left hand side of (49) and a lower bound for the variation of the right hand side are computed in the following.Before deriving the upper bound for the left hand side, the following preliminary inequalities are derived: and with ρ f = 1 − q λmax(P f ) .The proof of this inequality is similar to the proof of Equation (11b) in Lemma 1, using Then, the upper bound for the left hand side of (49) is Let's now compute a lower bound for the possible varia-tion of α (right hand side of ( 49) and ( 50)): = min j=1,...,p min α max j,k+1 , α min j,k+1 − min j=1,...,p min α max j,k , α min j,k ≥ min j=1,...,p min α max j,k+1 − α max j,k , α min j,k+1 − α min j,k Noting that min α max j,k+1 − α max j,k , α min j,k+1 − α min j,k it is possible to derive that Moreover, from (53) and (54) it is possible to obtain Finally, note that ρ f < 1 by definition and α k+1 > 0 for all k in view of Assumption 5. Then there exist Lmax > 0, Ld > 0 and ∆y 0 max > 0 such that for L max ≤ Lmax , Part 3: To prove the ISpS and convergence properties of the closed-loop system a candidate Lyapunov function is introduced.The candidate Lyapunov function has a structure similar to the optimal cost of the MPC, but it considers the asymptotic values of the state and input set-points instead of their values at time-step k.Note that these values cannot be used in the MPC cost function because are unknown.The expression of the candidate Lyapunov function is the following where u * 0|k , ..., u * N −1|k and x * 0|k , ..., x * N |k are defined as in Part 2b of this proof, and and ūk , because all these values are needed to compute the optimal input sequence u * 0|k , ..., u * N −1|k .In Part 3a, a lower bound and an upper bound for V k and a bound on the variation of V k between subsequent time-steps are derived to prove ISpS.In Part 3b, convergence is shown by studying the asymptotic values of the bounds, under the assumption that y 0 k → y 0 ∞ and d k → d∞ for k → ∞.Part 3a: The lower bound for V k follows from 55) The upper bound for V k is now proven.Consider the possibly suboptimal control input ũi|k = ūk for all i = 0, ..., N − 1, and denote with x0|k , ..., xN|k the correspondent state trajectory with initial condition x0|k = xk .Note that there exists µ > 0 such that this control input is feasible at time-step k for all xk such that ∥x k − xk ∥ ≤ µ.
Three different cases are considered: In this case the sequence ũi|k is feasible.In fact In particular, under Assumption 4, there exist finite constants Kx and Ku such that This expression contains the optimal cost of the MPC optimization at step k, that is Consider now the cost associate to the suboptimal control input ũi|k , and note that it is greater or equal than the optimal cost, i.e.
and that, in view of Assumption 2 and Theorem 1, there exist μ ≥ 0 and λ ∈ (0, 1) such that ∀i ≥ 0 to obtain that there exists a constant b ≥ 0 such that where In view of (56) and boundedness of D and Y 0 , terms δ xk , δ ūk , δc k and δ hk are bounded for all k ∈ Z ≥0 .In addition, inputs are limited in U and states are limited in Ĉ × H. Then there exists a constant c1 ≥ 0 such that β(k) ≤ c1 for all k ∈ Z ≥0 .
Consider the Lyapunov function at time-step k + 1: where u * 0|k+1 , ..., u * N |k+1 is the optimal sequence given by the MPC at step k+1 and 1|k .It is possible to derive the following upper bound for V k+1 : This expression contains the optimal cost of the MPC optimization at step k + 1, that is Consider now the feasible trajectory ũi|k+1 for i = 0, ..., N − 1 used in Part 2b of this proof.Noting that the cost associated to the feasible trajectory is greater or equal than J * k+1 , one has Therefore the variation of the Lyapunov function is bounded by It is possible to introduce the error terms related to the fact that x0|k+1 = xk+1 ̸ = x * 1|k because of the presence of the observer in the control loop: The different terms of the upper bound of V k+1 − V k are now considered separately.
Consider the state terms at time-steps between k + 1 and k + N − 1: Consider the input terms at time-steps between k + 1 and k + N − 1: Finally, consider the state terms at time-steps k + N and k + N + 1.In view of Lemma 5 one has that, for any φ ̸ = 0, Moreover, for any φ ̸ = 0, ≤ Then the following inequality holds: By construction P f is selected such that A ⊤ δ P f A δ − P f < −qI 2 .Therefore there exists a value of φ small enough to have −P f + (1 + φ 2 )qI 2 + (1 + φ 2 )A ⊤ δ P f A δ < 0. Combining all the computations, one obtains that there exists constants c = λ min (Q) and c 2 ≥ 0 and a Kfunction γ such that In view of (55)-( 60)-(62), V k is an ISpS-Lyapunov function [15].Then the closed-loop system ( 13)-( 20)-( 29) is ISpS with respect to the observer estimation error χ − χ.
Part 3b: To prove convergence, first note that if d k → d∞ for k → ∞, the observer estimation error χ − χ converges to 0 in view of Theorem 2.Then, in view of (56), terms δ xk , δ ūk , δc k and δ hk converge to 0 for k → ∞.
The bounds on V k and on V k+1 − V k are now studied for k → ∞.
Consider the upper bound of V k .If d k → d∞ and y 0 k → y 0 ∞ for k → ∞, in view of observer convergence, ∥x k − x∞ ∥ → 0 for k → ∞.Then there exist k ∈ Z ≥0 such that ∥x k − x∞ ∥ ≤ µ 2 for all k ≥ k.Hence, for k ≥ k, only cases 1 and 2 of the upper bound can appear.Then where β(k) → 0 for k → ∞ for the convergence of δ xk , δ ūk , δc k and δ hk .Consider the bound on V k+1 − V k of Equation (61).γ(k) → 0 for k → ∞ in view of the convergence of the observer estimation error and of δ xk , δ ūk , δc k and δ hk .
Then for k ≥ k the Lyapunov function V k is such that for some a, b, c > 0, with β(k) → 0 and γ(k) → 0 for k → ∞.

G. Proof of Theorem 4
Following [26], the proof can be derived as follows.In view of Assumption 6 the output of the plant converges to a constant value y ϕ,∞ .Then it is sufficient to prove that y ϕ,∞ = y 0 ϕ,∞ .In view of Assumption 6, the observer states x and d converge to asymptotic values, denoted by x∞ and d∞ .By Corollary 1 and Theorem 2, the observer is nominally error free (i.e. it reaches steady state only if y = ŷ), and at steady state satisfies where u ∞ is the input generated by the controller (Reference calculator + MPC), and is a function of x∞ , d∞ , y 0 ϕ,∞ .From (65a), Assumption 4 and Theorem 3 it follows that y 0 ϕ,∞ = g(x ∞ ) + d∞ (66) Hence, from a comparison of (65b) and (66), it follows that y ϕ,∞ = y 0 ϕ,∞ .

Fig. 2 .
Fig. 2. Block diagram of the nominal closed-loop control scheme, where the real plant is replaced by its LSTM model augmented with a disturbance term.

Fig. 4 .
Fig. 4. Evolution of the output of the LSTM model (blue line) compared with reference y 0 (black dashed line), and evolution of the disturbance d (orange dash-dotted line).y 0 and d are selected to respect the sufficient conditions for recursive feasibility of Theorem 3.

Fig. 5 .Fig. 6 .
Fig. 5. Evolution of the output of the LSTM model (blue line) compared with reference y 0 (black dashed line), and evolution of the disturbance d (orange dash-dotted line).The sufficient conditions for recursive feasibility of Theorem 3 are not respected, but recursive feasibility is maintained in practice.

Fig. 7 .
Fig. 7. Evolution of the output of the real plant (blue line) compared with reference y 0 ϕ (black dashed line), and evolution of the plant disturbance d ϕ (orange dash-dotted line), when L d = 0.01.

Fig. 8 .
Fig. 8. Evolution of the output of the real plant (blue line) compared with reference y 0 ϕ (black dashed line), and evolution of the plant disturbance d ϕ (orange dash-dotted line), when L d = 0.1.

Let's introduce
new variables for the difference between the set-point at step k and the set-point for k → ∞ δ xk = xk − x∞ , δ ūk = ūk − ū∞ δc k = ck − c∞ , δ hk = hk − h∞ Terms δ xk , δ ūk , δc k , δ hk can be related to dk − d∞ and y 0 k − y 0 ∞ with a reasoning similar to the proof of Lemma 4.

TABLE I NOMINAL
OPERATING CONDITIONS OF THE PH SYSTEM.