Supervised Learning of Lyapunov Functions Using Laplace Averages of Approximate Koopman Eigenfunctions

Modern data-driven techniques have rapidly progressed beyond modelling and system identification, with a growing interest in learning high-level dynamical properties of a system, such as safe-set invariance, reachability, and input-to-state stability. In this letter, we propose a novel supervised Deep Learning technique for constructing Lyapunov certificates, by leveraging Koopman Operator theory-based numerical tools (Extended Dynamic Mode Decomposition and Generalized Laplace Analysis) to robustly and efficiently generate explicit ground truth data for training. This is in stark contrast to existing Deep Learning methods, where the loss functions plainly penalize Lyapunov condition violations in the absence of labelled data for direct regression. Furthermore, our approach leads to a linear parameterization of Lyapunov candidate functions in terms of stable eigenfunctions of the Koopman operator, making them more interpretable than standard DNN-based architectures. We demonstrate and validate our approach numerically using 2-dimensional and 10-dimensional examples.

Fig. 1. Flowchart summarizing the use of local and global trajectory data in conjunction with EDMD and GLA approaches to learn eigenfunction-parameterized Lyapunov functions. The bottom-most block is optional, for the case when the system dynamics f are unknown.
In this letter, we are interested in leveraging tools from Koopman Operator theory towards learning Lyapunov functions by efficient utilization of data, in an interpretable manner. These infinite dimensional linear operators provide powerful and practical techniques that allow well-developed control and analysis principles from linear systems to be applied to nonlinear dynamical systems. Koopman theory has been successfully utilized for a wide range of applications, including modelling and identification in robotics, nonlinear optimal control and MPC via global bi-/linearization, as well as dynamical systems analysis (please see [6] and the references therein).
The eigenfunctions of Koopman operators encode useful information about the underlying dynamics, such as invariant sets and stable/unstable manifolds, and are central to our work. In this letter, we are primarily interested in finding Lyapunov functions, for which we utilize Koopman eigenfunctions. We propose a data-driven technique to learn Lyapunov functions parameterized via Deep Neural Networks (DNN), that are interpretable through the lens of Koopman operator theory: We construct Lyapunov basis functions using Koopman eigenfunctions, which then provide a linear space of Lyapunov candidates. The linearity of this space is instrumental in further refining it and formally verifying the learnt Lyapunov functions [7].
Towards that end, we present a novel learning-based framework that integrates and complements the strengths of several techniques, namely Extended Dynamic Mode Decomposition (EDMD) [8], [9], Generalized Laplace Analysis (GLA) [10], [11], and Deep Learning [12]. The main contributions of our paper are as follows: (i) We leverage dense "local trajectories" (initialized inside a local neighborhood N_ε of the equilibrium) and sparse "global trajectories" (initialized anywhere inside the domain of attraction A ⊃ N_ε) by combining EDMD with GLA, to create explicit ground truth data for Koopman eigenfunctions in the absence of a dynamics model. Ground truth data means that, corresponding to any point x ∈ A, we can obtain the actual value of an unknown eigenfunction ψ evaluated at x. Figure 1 outlines our proposed approach. (ii) Exploiting the ground truth labels (x, ψ(x)) reduces our learning problem to a simple regression problem, allowing the use of vanilla neural network architectures. This distinguishes our work from related literature [7], [13], [14], [15], which designs training loss criteria primarily around the eigenfunction PDE and/or prediction error, and commonly requires additional regularization, such as autoencoders, within the architecture. Similarly, [3], [4], [5] train indirectly using a so-called "Lyapunov risk" which penalizes violation of the Lyapunov conditions. (iii) Finally, we provide technical results to complement our proposed computation technique, leading to a numerically well-behaved implementation, as Laplace averages often suffer from convergence issues due to exponential terms within the integral [11].

II. BACKGROUND
Notations: Re(·) denotes the real part of its argument, which can be a scalar or a complex-valued function. L_f g represents the Lie derivative of a function g(x) with respect to the vector field f(x). The space of continuously differentiable functions on X is denoted by C¹(X). The notation spec(M) denotes the spectrum of a matrix M ∈ C^{n×n}. Positive definiteness of a matrix P is denoted by P ≻ 0. λ_min(·) and λ_max(·) are the eigenvalues with smallest and largest magnitudes, respectively. R denotes the set of reals and C the set of complex numbers. The notation ‖·‖ is used for the 2-norm of a complex vector.
We now consider the nonlinear dynamical system
ẋ = f(x),   (1)
where the function f : X → R^n is continuously differentiable and the states x(t) evolve in the compact set X ⊂ R^n for all t ≥ 0. Let φ_t : X → X denote the flow map of this system, that is, φ_t(x) = x + ∫_0^t f(φ_s(x)) ds, for all t ≥ 0 and x ∈ X.
Definition 1: The Koopman semigroup of operators is defined as the linear operators K_t acting on a (Banach) space F of functions g : X → C such that [K_t g](x) = g(φ_t(x)), for every g ∈ F. Elements of F are referred to as 'observables'.
In the rest of this letter, we shall take our observable space F to be the Banach space C¹(X) for mathematical convenience. Corresponding to this infinite dimensional linear operator, one can define Koopman eigenfunctions as functions ψ(x) ∈ F that satisfy [K_t ψ](x) = e^{λt} ψ(x) for some constant λ ∈ C (in other words, dψ(x(t))/dt = λψ(x(t))). The scalar λ is the eigenvalue associated with eigenfunction ψ.
Assume system (1) has an asymptotically stable equilibrium at x_e. We call a Koopman eigenfunction ψ a principal eigenfunction if ψ(x_e) = 0 and ∇_x ψ(x_e) ≠ 0, and its corresponding eigenvalue belongs to spec(∇_x f(x_e)). Given the semigroup of eigenpairs E (the set comprising all (λ, ψ) pairs), the set of principal eigenpairs is the minimal generator G of the set E: products of principal eigenfunctions are again eigenfunctions, with eigenvalues given by the corresponding sums of principal eigenvalues, and every element of E arises in this way. For systems with a hyperbolic equilibrium x_e, the existence and uniqueness of principal eigenpairs can be characterized by spec(∇_x f(x_e)) under nonresonance conditions [16].
Definition 2: Let λ ∈ C be a scalar and g : X → C be an observable. Then the Laplace average of g is defined as
g*_λ(x) = lim_{T→∞} (1/T) ∫_0^T e^{−λt} g(φ_t(x)) dt.   (2)
If the above limit converges for the observable g, this Laplace average g*_λ can be verified to be a Koopman eigenfunction corresponding to eigenvalue λ, since [K_s g*_λ](x) = lim_{T→∞} (1/T) ∫_0^T e^{−λt} g(φ_{t+s}(x)) dt = e^{λs} g*_λ(x). Note that stable Koopman eigenfunctions can be used to describe invariant sets [17]. Given Koopman eigenfunctions ψ_i(x) : A ⊆ X → C for i = 1, …, N (with corresponding eigenvalues λ_i ∈ C with negative real parts), we can define functions V_i(x) = ½|ψ_i(x)|², whose zero level sets M_i^0 = {x ∈ A : ψ_i(x) = 0} are invariant.
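The Laplace average in Definition 2 can be evaluated numerically by quadrature along a simulated trajectory. Below is a minimal sketch (function and variable names are ours, not from the letter) using the scalar linear system ẋ = −x, whose flow is known in closed form; there g(x) = x is itself an eigenfunction at λ = −1, so the average must reproduce g.

```python
import numpy as np

def laplace_average(g, flow, lam, x0, T=50.0, n=20001):
    """(1/T) * integral_0^T exp(-lam*t) g(phi_t(x0)) dt via the trapezoid rule."""
    t = np.linspace(0.0, T, n)
    y = np.exp(-lam * t) * g(flow(t, x0))
    dt = t[1] - t[0]
    return (y.sum() - 0.5 * (y[0] + y[-1])) * dt / T

# Sanity check on xdot = -x, whose flow is x0*exp(-t): g(x) = x is already
# an eigenfunction at lam = -1, so the average must return g(x0) itself.
flow = lambda t, x0: x0 * np.exp(-t)
g = lambda x: x
print(laplace_average(g, flow, lam=-1.0, x0=0.3))  # ≈ 0.3
```

Note that the integrand e^{−λt} g(φ_t(x)) is constant along the trajectory precisely because g is an exact eigenfunction; for approximate eigenfunctions this is where the convergence issues addressed in Section III-A arise.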
In particular, since V̇_i(x) = Re(λ_i)|ψ_i(x)|² ≤ 0, it follows from LaSalle's Invariance Principle that all trajectories starting inside A converge to the intersection of the sets M_i^0, i.e., φ_t(x) → ∩_i M_i^0, ∀x ∈ A. We use these V_i's in our final step for constructing Lyapunov functions from data.

III. MAIN RESULTS
In this section and the rest of this letter, we shall focus our attention on nonlinear systems that have an asymptotically stable equilibrium point (taken to be origin without loss of generality). We first present convergence results for Laplace averages based on easy to verify conditions. Next, we show how one may use trajectory data to exactly compute Koopman eigenfunctions using the combination of EDMD and GLA. In the final part of this section, we present how the learned eigenfunctions are used for linearly parameterizing a space of candidate Lyapunov functions.

A. Convergence Results for Laplace Averages
GLA has been discussed widely in [10], [11] as an analysis tool for obtaining Koopman mode decomposition of observables onto the space spanned by eigenfunctions. It is important to note that the computation of Laplace averages in general is not numerically amenable, due to convergence issues posed by the exponentially growing term within the integral. Thus, we first present some convergence results that ultimately aid us in their numerical computations.
Lemma 1: Consider the system (1) with asymptotically stable equilibrium x_e = 0. If an observable g with an isolated root at x_e locally satisfies L_f g = λg (for a stable λ with Re(λ) < 0) in a neighborhood N_ε of x_e, then the Laplace average g*_λ(x) is well-defined for every x in the region of attraction A.
Proof: Without loss of generality, we can consider N_ε to be forward-invariant, due to the following. Since x_e is an isolated root of g, by definition there is a positive r such that |g(x)| > 0 for all 0 < ‖x − x_e‖ ≤ r. Thus, for any α < min_{‖x−x_e‖=r} |g(x)|, the α-sublevel set of |g| has a connected component C_α strictly contained inside the ball ‖x − x_e‖ ≤ r, and C_α is forward-invariant since |g| decays along trajectories at rate Re(λ) < 0. This ball can itself be contained inside N_ε by choosing a small enough r. Next, for any point x ∈ A, let us define T*(x) as
T*(x) = inf{t ≥ 0 : φ_t(x) ∈ N_ε},   (3)
which is the time it takes for a trajectory starting at point x to reach the set N_ε, and is always finite for all x in the region of attraction A. Thus, the Laplace average is
g*_λ(x) = lim_{T→∞} (1/T) [∫_0^{T*(x)} e^{−λt} g(φ_t(x)) dt + ∫_{T*(x)}^T e^{−λt} g(φ_t(x)) dt].
The first term on the right-hand side of the above equation equals zero in the limit, since the integral is finite. In the second term, L_f g = λg holds along the trajectory (which remains in N_ε), so e^{−λt} g(φ_t(x)) stays constant for t ≥ T*(x), and the time average converges to
g*_λ(x) = e^{−λT*(x)} g(φ_{T*(x)}(x)),   (4)
which is finite for all x ∈ A since g is continuous. ∎
The Hartman-Grobman theorem applies to the asymptotically stable system of Lemma 1 (and, more generally, to any system with a hyperbolic equilibrium), guaranteeing the existence of a local neighborhood of the equilibrium where the flow of the nonlinear system is homeomorphic to that of its linearization. One may note that (3)-(4) provide a more numerically robust way of computing the Laplace average than equation (2), since approximation errors in the function g(x) may accumulate and grow exponentially under the integral in (2). In the following theorem, we show that the Laplace average computation is robust to approximation errors in g up to a certain order.
Theorem 1: Let us again consider the function g(x) as described in Lemma 1, and an approximation ĝ(x) = g(x) + ξ(x) with error ξ(x) = o(‖x‖²). Let A = ∇_x f(0), and let P ≻ 0 solve the Lyapunov equation PA + AᵀP = −Q for some Q ≻ 0. If the matrix
A + (λ_min(Q)/λ_max(P)) I   (5)
is Hurwitz, then ĝ*_λ(x) = g*_λ(x) ∀λ ∈ spec(A) and ∀x ∈ A.
Proof: Let us pick any λ from the set spec(A). Since g satisfies the conditions of Lemma 1, its Laplace average g*_λ is well-defined on the entire set A. Now, using the triangle inequality for integrals in equation (2), we get
|ĝ*_λ(x) − g*_λ(x)| ≤ lim_{T→∞} (1/T) ∫_0^T |e^{−λt} ξ(φ_t(x))| dt.
We then need to show that the integrand in the above inequality converges to zero under the stated conditions. Let us rewrite the right-hand side of dynamics (1) as f(x) = Ax + h(x) (assuming f is analytic; hyperbolicity of the equilibrium allows us to do this). One can show that, since h(0) = 0 and ∇_x h(0) = 0, we have ‖h(x)‖ ≤ c_ε‖x‖² for a positive constant c_ε on the compact set ‖x‖ ≤ ε, using the Mean Value Theorem for functions of several variables [18]. We can then estimate the convergence rate of ‖x‖² in this compact set. Letting the matrix P ≻ 0 be the solution to the Lyapunov equation above, along trajectories of (1) we obtain
(d/dt) xᵀPx ≤ −λ_min(Q)‖x‖² + 2c_ε‖P‖‖x‖³ ≤ −λ_min(Q) κ_ε ‖x‖²,
where κ_ε := 1 − 2εc_ε‖P‖/λ_min(Q) increases monotonically as ε gets smaller (and ε is henceforth taken to be small enough so that κ_ε > 0). Therefore, it follows that ‖x‖² exponentially decays with rate (λ_min(Q)/λ_max(P)) κ_ε. Since the matrix expression in equation (5) is Hurwitz, every eigenvalue λ of A satisfies λ_min(Q)/λ_max(P) + Re(λ) > 0. We can pick ε small enough such that κ_ε is arbitrarily close to 1, thus leading to
(λ_min(Q)/λ_max(P)) κ_ε + Re(λ) > 0.
Since ξ(x) = o(‖x‖²), the integrand satisfies |e^{−λt} ξ(φ_t(x))| ≤ e^{−(Re(λ) + (λ_min(Q)/λ_max(P))κ_ε)t} · o(1); taking the limit t → ∞ on both sides, we obtain lim_{t→∞} e^{−λt} ξ(φ_t(x)) = 0. All trajectories starting anywhere in A will eventually enter the region ‖x‖ ≤ ε, and so this limit holds true for all x in A. This completes the proof. ∎
Remark: Equation (5) imposes a condition on the distribution of the eigenvalues of A, requiring that they not be spread too far apart. Intuitively, if Re(λ_a) ≪ Re(λ_b) < 0, then the Laplace average of an observable g corresponding to λ_a (i.e., g*_{λ_a}) may not converge, since the exponential term e^{−λ_a t} inside integral (2) grows very fast, whereas g(φ_t(x)) decays relatively slowly due to the "slow" eigenvalues in spec(A).
Numerical Example 1: For this system, we obtain eigenvalues λ₁ = −1 and λ₂ = −0.8. Taking Q to be the identity matrix and solving PA + AᵀP = −Q, we get
P = [0.52 −0.12; −0.12 0.53].
Thus, the matrix in equation (5) is
[−0.52 0.11; 0.34 −0.58],
which satisfies the Hurwitz condition of Theorem 1. Additionally, corresponding to the eigenvalue −0.8, this system has a principal eigenfunction ψ(x) = x₁³ − x₁ − 2x₂³ + 2x₂, which means that the function g(x) = ψ(x) satisfies the conditions of Lemma 1. Thus, any approximation of g(x) within an error of o(‖x‖²) must result in the same Laplace average g*_{−0.8}(x), according to Theorem 1. We revisit this later in Example 2.
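The Hurwitz condition of Theorem 1 is straightforward to check numerically: solve the Lyapunov equation for P, form the ratio λ_min(Q)/λ_max(P), and test whether every eigenvalue of A shifted by this ratio still has negative real part. A minimal sketch (the matrix A below is a hypothetical stand-in with the same eigenvalues, since Example 1's dynamics are not reproduced here):

```python
import numpy as np

def check_condition_5(A, Q=None):
    """Check the spectral-spread condition of Theorem 1: the matrix
    A + (lambda_min(Q)/lambda_max(P)) I must be Hurwitz, where P >> 0
    solves the Lyapunov equation P A + A^T P = -Q."""
    n = A.shape[0]
    Q = np.eye(n) if Q is None else Q
    # vectorized Lyapunov solve: (A^T (x) I + I (x) A^T) vec(P) = -vec(Q)
    M = np.kron(A.T, np.eye(n)) + np.kron(np.eye(n), A.T)
    P = np.linalg.solve(M, -Q.flatten(order="F")).reshape(n, n, order="F")
    ratio = np.linalg.eigvalsh(Q).min() / np.linalg.eigvalsh(P).max()
    ok = all(ev.real + ratio > 0 for ev in np.linalg.eigvals(A))
    return bool(ok), P, ratio

# hypothetical stable Jacobian with the eigenvalues of Example 1
A = np.array([[-1.0, 0.0], [0.0, -0.8]])
ok, P, ratio = check_condition_5(A)
print(ok, ratio)
```

For this diagonal stand-in, P = diag(0.5, 0.625), so the ratio is 1.6 and the condition holds since −0.8 + 1.6 > 0.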

B. Numerical Computation From Trajectory Data
We have shown in the previous subsection how any function g defined on X that locally satisfies L f g = λg in some neighborhood of the asymptotically stable equilibrium results in a well-defined Laplace average g * λ (x) everywhere in the region of attraction A. This function g * λ can then be used further as a Koopman eigenfunction. However, such a function g itself needs to be obtained first. In this section, we see how EDMD can be effectively combined with GLA to compute Koopman eigenfunctions through dense local trajectory data near x e and sparse global trajectory data throughout the set A. We propose to first utilize just local trajectory data with EDMD towards obtaining accurate local estimates of suitable functions g. With these local estimates, we compute their Laplace averages g * λ defined over the larger domain A. The local estimates via EDMD are expected to have approximation errors, but thankfully, through Theorem 1, the GLA computations are shown to be robust to small (o( x 2 )) approximation errors in g. One may use the global trajectory data to directly perform EDMD and estimate Koopman eigenfunctions defined over the entire set A. However, this may lead to poor estimates if global trajectory data is insufficient; thus utilizing dense, local data for local estimates is a more reasonable alternative.
We now outline the EDMD process, a well-studied data-driven algorithm that approximates the Koopman operator in finite dimensions by lifting the state space into a higher dimensional embedding wherein the dynamics are (approximately) linear [8], [9]. This approximation is obtained as the solution to a least-squares optimization problem. Consider the vector-valued function θ(x) = [θ₁(x), θ₂(x), …, θ_N(x)]ᵀ comprised of basis functions θ_i(x) ∈ F, i = 1, 2, …, N. Given trajectory snapshots at some uniform sampling time τ, in the form of M pairs (x_i, y_i) ∈ N_ε × N_ε where y_i = φ_τ(x_i) for i = 1, 2, …, M, the EDMD procedure estimates a finite dimensional approximation of the Koopman operator K_τ restricted to the space spanned by the basis functions. This is expressed as the Koopman matrix K, obtained by solving the least-squares problem
min_K ‖θ(Y) − K θ(X)‖_F²,   (6)
where ‖·‖_F denotes the Frobenius norm, and the matrices θ(X) = [θ(x₁), θ(x₂), …, θ(x_M)] and θ(Y) = [θ(y₁), θ(y₂), …, θ(y_M)]. The K that minimizes (6) is obtained in closed form as K = Θ_XY Θ_XX†, where † denotes the pseudo-inverse, Θ_XY = θ(Y)θ(X)ᵀ and Θ_XX = θ(X)θ(X)ᵀ. Suppose that (v_i, μ_i) are the left eigenvectors and corresponding eigenvalues of K for i = 1, 2, …, N. Then, in the set N_ε, letting g_i(x) = v_iᵀθ(x), we obtain functions that approximately satisfy
L_f g_i = λ_i g_i, with λ_i = ln(μ_i)/τ.   (7)
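The EDMD step above amounts to a few lines of linear algebra. Below is a minimal sketch (function names are ours), sanity-checked on a linear system with identity features, where the recovered continuous-time eigenvalues must match the system's eigenvalues:

```python
import numpy as np

def edmd(X, Y, theta, tau):
    """EDMD from snapshot pairs y_i = phi_tau(x_i). Returns the Koopman
    matrix K = Theta_XY Theta_XX^dagger, the continuous-time eigenvalues
    lam_i = log(mu_i)/tau, and left eigenvectors w_i, so that
    g_i(x) = w_i^T theta(x) approximates an eigenfunction on N_eps."""
    TX = np.column_stack([theta(x) for x in X])   # N x M feature matrix
    TY = np.column_stack([theta(y) for y in Y])
    K = (TY @ TX.T) @ np.linalg.pinv(TX @ TX.T)
    mu, W = np.linalg.eig(K.T)                    # left eigenvectors of K
    return K, np.log(mu) / tau, W

# sanity check: xdot = diag(-1, -0.8) x with identity features theta(x) = x;
# the exact flow scales each coordinate by exp(lambda_i * tau)
rng = np.random.default_rng(0)
tau = 0.1
X = rng.uniform(-0.05, 0.05, (200, 2))
Y = X @ np.diag(np.exp(np.array([-1.0, -0.8]) * tau))
K, lam, W = edmd(X, Y, lambda x: x, tau)
print(np.sort(lam.real))  # ≈ [-1.0, -0.8]
```

In practice θ would be a monomial (or other) dictionary, as in the examples of this letter, and the principal eigenpairs would be selected by matching λ_i against spec(∇_x f(0)).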
Next, with these functions g_i obtained by EDMD, we compute the Laplace averages g*_{λ_i}(x) at any point x ∈ A, by 'unrolling' the trajectory φ_t(x) starting at x forward for a sufficiently large time,³ followed by numerically integrating (2) along the unrolled trajectory. Note that one can combine the trajectory unrolling and GLA integration steps into one, by augmenting the system dynamics (1) with a new scalar state variable z as
ż = (e^{−λt} g(x) − z)/(t + δ), t ≥ 0,   (8)
for some small constant δ > 0. If z(0) = 0, this state z can be shown to evolve as z(t) = (1/(t+δ)) ∫_0^t e^{−λs} g(φ_s(x)) ds for all t ≥ 0. Thus, when z(0) = 0 and x(0) = x, we get z(t) → g*_λ(x) as t → ∞. In other words, unrolling the augmented dynamics (8) from the initial point (x, 0) gives us the Laplace average through the augmented state variable z.
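The augmented formulation (8) can be integrated with any ODE solver. A minimal sketch (names ours) using a hand-rolled RK4 step, checked on ẋ = −x where g(x) = x₁ is an exact eigenfunction at λ = −1, so the average should return the initial condition itself:

```python
import numpy as np

def laplace_average_aug(f, g, lam, x0, T=30.0, dt=1e-3, delta=1e-2):
    """Laplace average via the augmented dynamics (8): the extra state z
    tracks (1/(t+delta)) * int_0^t exp(-lam*s) g(x(s)) ds and converges
    to g*_lam(x0). Plain RK4 on the augmented system (x, z)."""
    def rhs(t, s):
        x, z = s[:-1], s[-1]
        dz = (np.exp(-lam * t) * g(x) - z) / (t + delta)
        return np.concatenate([f(x), [dz]])
    s = np.concatenate([np.atleast_1d(x0), [0.0]])  # z(0) = 0
    t = 0.0
    for _ in range(int(T / dt)):
        k1 = rhs(t, s); k2 = rhs(t + dt/2, s + dt/2*k1)
        k3 = rhs(t + dt/2, s + dt/2*k2); k4 = rhs(t + dt, s + dt*k3)
        s = s + dt/6*(k1 + 2*k2 + 2*k3 + k4)
        t += dt
    return s[-1]  # z(T) ~ g*_lam(x0)

val = laplace_average_aug(lambda x: -x, lambda x: x[0], lam=-1.0, x0=0.3)
print(val)  # ≈ 0.3
```

The constant δ regularizes the 1/(t+δ) factor at t = 0; larger δ slows the initial transient but eases the step-size restriction.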
Once we obtain the ground truth values of the eigenfunctions at randomly sampled points in A, we create a training dataset to learn DNN-based approximations of the eigenfunctions. Due to space limitations, we refer readers to our code repository for details.
Numerical Example 2: Consider the system described in Example 1, with an asymptotically stable equilibrium at the origin. We consider the neighbourhood N_ε = [−0.05, 0.05] × [−0.05, 0.05], and sample 100 trajectories of length 100 seconds each, with a sampling period τ = 0.01 s. Basis functions θ_i are taken to be monomials of degree up to 3. From EDMD, we obtain g(x) = 1.014x₁³ − x₁ − 2.027x₂³ + 2.00x₂ corresponding to the principal eigenvalue λ = −0.8, which is a very close estimate of the actual principal eigenfunction x₁³ − x₁ − 2x₂³ + 2x₂, with an approximation error ξ(x) of order o(‖x‖²). We compute the Laplace average g*_{−0.8}(x) at randomly sampled points in [−0.4, 0.4] × [−0.4, 0.4] using this g(x), even though it was obtained from EDMD using trajectory data within the much smaller domain N_ε. Figure 2 compares this Laplace average with the actual eigenfunction x₁³ − x₁ − 2x₂³ + 2x₂.
³In practice, this simply means unrolling until (1/T) ∫_0^T e^{−λt} g(φ_t(x)) dt numerically converges to a steady value.

C. Lyapunov Certificates From Koopman Eigenfunctions
Our main goal in this letter is to develop a data-driven technique that efficiently leverages trajectory samples of a dynamical system to construct Lyapunov certificates. We particularly focus on developing an approach suitable for high-dimensional systems, wherein several learning techniques may need to be combined to make the best use of the available trajectory data.
One can use the functions V_i = ½|ψ_i(x)|² in a number of different ways to construct Lyapunov functions, but we are primarily interested in the linearly parameterized space of candidate Lyapunov functions given by
V_Lyap = { Σ_{i=1}^N c_i V_i(x) : c_i ≥ 0 },   (9)
each element of which satisfies V̇(x) = Σ_i c_i Re(λ_i)|ψ_i(x)|² ≤ 0 by construction. V_Lyap is particularly interesting for the case when the Koopman eigenfunctions ψ_i are represented as polynomials. In that case, each element of V_Lyap is a sum-of-squares (SOS) polynomial, and one can leverage SOS semi-definite programming to verify the Lyapunov conditions (when the system dynamics are also polynomial) [7].
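The linear parameterization (9) is simple to realize in code: fix nonnegative coefficients and sum the basis terms ½|ψ_i|². A sketch (ψ₂ below is a hypothetical second eigenfunction added purely for illustration; ψ₁ is the principal eigenfunction from Example 1):

```python
import numpy as np

def make_lyapunov(psis, lams, coeffs):
    """An element of the linear space (9): V = sum_i c_i * 0.5*|psi_i|^2
    with c_i >= 0. Along trajectories, Vdot = sum_i c_i Re(lam_i)|psi_i|^2,
    which is <= 0 whenever every Re(lam_i) < 0."""
    assert all(c >= 0 for c in coeffs) and any(c > 0 for c in coeffs)
    assert all(complex(l).real < 0 for l in lams)
    def V(x):
        return sum(c * 0.5 * abs(p(x)) ** 2 for c, p in zip(coeffs, psis))
    return V

psi1 = lambda x: x[0]**3 - x[0] - 2*x[1]**3 + 2*x[1]   # from Example 1
psi2 = lambda x: x[0]                                  # hypothetical
V = make_lyapunov([psi1, psi2], [-0.8, -1.0], [1.0, 1.0])
print(V(np.array([0.1, 0.2])))
```

Refining the coefficients c_i (e.g., via the SOS program mentioned above, or the learning step of Section IV) amounts to a search over this convex cone.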
In our prior work [7], we discuss how inaccuracies in the computation of Koopman eigenfunctions via data-driven techniques may lead to violations of the Lyapunov conditions for the functions V_i; consequently, not every element of the space V_Lyap would be a valid Lyapunov candidate for establishing asymptotic stability guarantees for the equilibrium point/manifold. Nevertheless, thanks to the linearity of the space V_Lyap, it was shown that under certain approximation error bounds on the basis V_i, one can still pick functions V(x) ∈ V_Lyap that satisfy V̇(x) ≤ β(γ − V(x)) for some constants β, γ > 0, thus providing certificates for forward-invariance. In this letter, however, we develop efficient computation techniques for Koopman eigenfunctions which are more accurate in comparison to [7], enabling us to directly obtain Lyapunov functions from the estimated eigenfunctions through (9), as evidenced by our numerical experiments.

IV. NUMERICAL RESULTS
In this section, we demonstrate our method by constructing Lyapunov certificates for nonlinear consensus in a 10-dimensional multi-agent system. We show how trajectory data from the agents are used, in the absence of the agent model and communication graph, to represent the dynamics via Koopman eigenfunctions, which then serve as building blocks for obtaining Lyapunov functions.
Consider a graph G, with vertices denoted by the set V = {0, …, n − 1} and edge set denoted by E = {1, …, m}. The set of neighboring nodes of agent i is denoted by N_i ⊂ V. We are interested in the consensus problem for multi-agent systems over this network G, wherein the scalar state x_i ∈ R of each agent i ∈ V is required to converge to a common value. Let us next consider the following nonlinear distributed consensus protocol for single-integrator agent dynamics:
ẋ_i = σ_i(x_i) Σ_{j∈N_i} a_ij(x_j − x_i), i ∈ V,   (10)
where σ_i(·) is continuous and positive for all i ∈ V, and a_ij(·) is Lipschitz continuous for all (i, j) ∈ E, with x a_ij(x) > 0 when x ≠ 0 and a_ij(0) = 0. It was shown in [19] that if G is a connected, undirected graph and the a_ij(·) are odd functions for all (i, j) ∈ E, then the states x converge to a point x* uniquely determined by Σ_{i∈V} ∫_{x_i(0)}^{x*} dy/σ_i(y) = 0. The converse holds as well. The proof utilizes a Lyapunov function of the form V(x) = Σ_{i∈V} ∫_{x*}^{x_i} (y − x*)/σ_i(y) dy. Note that this Lyapunov function needs knowledge of both the underlying dynamics of the system and the graph topology. Our goal is to learn a Lyapunov function directly from the trajectory data of the agents.
As a specific example, we consider an n = 10 agent system, with a_ij(y) = r_ij[1.2 tanh(y) + sin(y)] for some random constants r_ij ∈ (0, 1), σ_i(y) = 1/(1 + exp(−y)), and graph G as shown in Figure 3. We modify this system slightly by adding a linear term −x_i to the right-hand side of equation (10), to ensure that the origin is an asymptotically stable isolated equilibrium and that condition (5) holds. We begin our approach with the EDMD step, sampling N = 100 trajectories of length 10 seconds from within the local neighborhood N_ε = [−0.05, 0.05]^n, with a sampling interval of 0.01 s. We pick monomials of degree up to 3 as our choice of basis functions θ(x) for EDMD.
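Protocol (10), with the extra −x_i damping term, can be simulated directly to generate the local and global trajectory data used in the steps above. A minimal sketch (the ring topology and random seed below are our stand-ins; the letter's actual graph differs):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
# ring topology as a stand-in for the letter's communication graph
edges = [(i, (i + 1) % n) for i in range(n)]
r = {e: rng.uniform(0.2, 1.0) for e in edges}           # random gains r_ij

a = lambda e, y: r[e] * (1.2 * np.tanh(y) + np.sin(y))  # odd coupling a_ij
sigma = lambda y: 1.0 / (1.0 + np.exp(-y))              # positive sigma_i

def f(x):
    # protocol (10) plus the -x_i damping term added in the experiments
    dx = -x
    for (i, j) in edges:
        dx[i] += sigma(x[i]) * a((i, j), x[j] - x[i])
        dx[j] += sigma(x[j]) * a((i, j), x[i] - x[j])   # a is odd, so symmetric
    return dx

# forward-Euler rollout from a random point in [-0.4, 0.4]^n; with the
# damping term, all agents should contract to the origin
x = rng.uniform(-0.4, 0.4, n)
for _ in range(20000):   # 20 seconds at dt = 1e-3
    x = x + 1e-3 * f(x)
print(np.abs(x).max())   # close to zero
```

Snapshot pairs for EDMD are then read off these rollouts at the chosen sampling interval.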
By allowing the sampling neighbourhood N_ε to be small, our approach precisely extracts the principal eigenvalues of this system, of which we pick the four with the smallest magnitudes: λ₁ = −0.55, λ₂ = −0.59, λ₃ = −0.69, and λ₄ = −0.76. The local eigenfunction estimates g_i (valid within N_ε) corresponding to these four eigenvalues are used to compute Laplace averages over the much larger domain [−0.45, 0.45]^n. The Laplace average computations are numerically stable and the integral (2) converges, in agreement with our technical results.
In the final step, we train the DNNs parameterizing the principal eigenfunctions ψ_i(x) corresponding to these four principal eigenvalues, over the larger domain [−0.45, 0.45]^n, using supervised learning as outlined in Figure 1. We use a multilayer perceptron (MLP) architecture with sine activation functions. For every i = 1, …, 4, the MLP for ψ_i consists of 3 hidden layers with 128 neurons each. The loss is comprised of a supervised learning term L₁(x) and a physics-informed term L₂(x):
L₁(x) = |ψ_i(x) − g*_{λ_i}(x)|², L₂(x) = |L_f ψ_i(x) − λ_i ψ_i(x)|²,
where g*_{λ_i} is the Laplace average of the function g_i obtained by EDMD that locally satisfies (7). For this example, the loss L₁ is computed at 3000 randomly sampled points in [−0.45, 0.45]^n, and L₂ at 2000 randomly sampled points in [−0.45, 0.45]^n, during training. For further details, please see the associated PyTorch code implementation. In Figure 4, we show the Lyapunov basis functions V_i(x) = ½|ψ_i(x)|² for i = 1, …, 4 along random trajectories, as well as an example Lyapunov function formed as a linear combination of the basis V_i's.
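The two-term loss above can be sketched compactly. The snippet below (names and sizes are our stand-ins; the letter uses 3x128 PyTorch MLPs with autograd) evaluates L₁ + L₂ for a small numpy MLP with sine activations, forming the Lie derivative in L₂ by finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)

def init_mlp(sizes):
    # small sine-activation MLP; a reduced stand-in for the 3x128 network
    return [(rng.normal(0.0, 1.0/np.sqrt(m), (m, k)), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def psi(params, x):
    h = x
    for W, b in params[:-1]:
        h = np.sin(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze(-1)

def total_loss(params, x_sup, y_sup, x_pde, f, lam, eps=1e-5):
    # L1: supervised regression onto the Laplace-average labels y = g*_lam(x)
    L1 = np.mean((psi(params, x_sup) - y_sup) ** 2)
    # L2: physics-informed residual |L_f psi - lam*psi|^2; grad(psi) is
    # formed by central differences here (PyTorch autograd in practice)
    grads = np.stack(
        [(psi(params, x_pde + eps*e) - psi(params, x_pde - eps*e)) / (2*eps)
         for e in np.eye(x_pde.shape[1])], axis=-1)
    Lf = np.sum(grads * np.apply_along_axis(f, 1, x_pde), axis=-1)
    L2 = np.mean((Lf - lam * psi(params, x_pde)) ** 2)
    return L1 + L2

# one loss evaluation with placeholder dynamics f(x) = -x and lam = -0.55
params = init_mlp([2, 16, 16, 1])
xs = rng.uniform(-0.45, 0.45, (8, 2))
loss = total_loss(params, xs, np.zeros(8), xs, lambda x: -x, -0.55)
print(loss)
```

Note that L₂ requires the vector field f; consistent with Figure 1, it would be dropped when the dynamics are unknown, leaving the purely supervised term L₁.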

V. CONCLUSION AND DISCUSSION
This letter presents a Deep Learning approach for the construction of Lyapunov certificates, parameterized linearly using Koopman eigenfunctions of the underlying dynamical system. One novelty of our approach lies in the creation of a labelled training dataset for supervised learning of Lyapunov basis functions via Koopman eigenfunctions, by combining EDMD and GLA. Trajectory data near the equilibrium are used to locally estimate eigenfunctions. Laplace averages of these local estimates, computed along sparse but global trajectories, are then used to evaluate eigenfunctions at sample points throughout the domain of attraction, generating ground truth function values for the final supervised deep learning step. This letter additionally contributes towards addressing the numerical challenges involved in GLA computations, both in terms of analysis and efficient implementation.
The assumption of second-order estimation error bounds on the eigenfunctions obtained in the EDMD step requires further investigation, and is a matter of our current focus. To the best of our knowledge, only constant error bounds for EDMD are available in the present literature. Additionally, once the ground truth values for the eigenfunctions are obtained, a key question is how well a particular choice of function approximator (DNNs in our specific case) can fit the synthesized training data. Studying the sample complexity of our approach, as well as formal verification of the learnt Lyapunov functions, are therefore important future directions. Despite both these open questions, our numerical examples successfully illustrate the practicality of our approach. For the low-dimensional example, where the eigenfunctions are known in closed analytical form, the second-order EDMD error bound is found to hold and, consequently, our computed eigenfunction estimates exactly match the actual eigenfunctions. The high-dimensional example illustrates the scalability of our proposed method, alleviating concerns surrounding the sample complexity of the deep learning step, since ground truth training labels can be generated efficiently for our 10-dimensional system.