A Finite-Time Protocol for Distributed Time-Varying Optimization Over a Graph

In this article, we address a time-varying quadratic optimization problem over a graph under the assumption that the problem shares the same sparsity pattern as the static graph encoding the undirected network topology over which the multiagent system interacts. Notably, this framework allows us to effectively model scenarios in which the optimization problem is inherently embedded within the network topology, e.g., flow balancing, electrical power system management, or packet routing problems. In this regard, we propose a finite-time distributed algorithm, which allows the multiagent system to track the time-varying optimal solution over time. Specifically, we first solve the frozen-time optimization problem, providing a necessary and sufficient condition for a solution to be globally optimal. Then, based on such conditions, a continuous-time distributed nonsmooth algorithm is developed. Numerical simulations are provided to corroborate the theoretical findings.

Matteo Santilli, Antonio Furchì, Gabriele Oliva, Senior Member, IEEE, and Andrea Gasparri, Senior Member, IEEE
Digital Object Identifier 10.1109/TCNS.2023.3272220

packet routing [17] problems. In this view, it is reasonable to assume that the matrices involved within the optimization problem are, indeed, sparse and that their sparsity shares the underlying sparsity pattern of the network topology, while the time-varying terms in both the constraints and the objective function account for the variability due to external factors. Notably, this setting is fundamentally different from classical distributed optimization approaches (e.g., see [18] and references therein). Our strategy to achieve finite-time tracking consists in deriving a nonsmooth algebraic optimality condition for the "frozen-time" problem, i.e., for the problem obtained by considering the actual values of the formulation at a specific time instant, and then developing a distributed algorithm able to track the time-varying solution of the nonsmooth algebraic optimality condition in finite time.
We resort to nonsmooth stability theory to prove convergence and finite-time tracking. To the best of our knowledge, this is the first work in which finite-time tracking is obtained for a quadratic optimization problem with a time-varying, coupling objective function and time-varying inequality constraints.
Let us now discuss the algorithms with finite-time tracking capabilities, which are compared in Table I. In [19], an unconstrained optimization problem with time-varying objective functions is considered, where the agents are able to reach and maintain a consensus on their decision variables in finite time, but convergence to the time-varying optimal solution is asymptotic. Notice that Bai et al. [8] considered a particular resource allocation problem with quadratic, invariant, and decoupled objective functions, while time variance only occurs in an equality constraint that couples the variables. Moreover, Santilli et al. [13] focused on a particular quadratic problem where the aim is to minimize the square norm of the agents' variables while the agents are coupled by a linear inequality constraint with a time-varying known term. However, Santilli et al. [13] require 2-hop information, whereas Bai et al. [8] rely on 1-hop information; such 2-hop information can be retrieved by resorting to a state-of-the-art finite-time k-hop distributed observer, which can be implemented using only 1-hop information [24]. To summarize, most of the previous literature is not able to provide finite-time convergence guarantees, while the few works that have this property are tailored to a particular class of problems [8] or consider simple and invariant objective functions [13]. In this article, we aim to fill this gap. Specifically, this article represents an extension of the work [13]; in fact, we introduce a number of improvements: 1) we extend the setting to quadratic programming problems having an objective function that is also time-varying; 2) the objective function considered in this article couples the agents while, in [13], the agents only aim to minimize the square norm of the decision variables; 3) we allow the time-varying constraint vector to possess nonderivable points; 4) we provide a finite upper bound on the convergence time; and 5) we provide a computationally efficient bound on the gain required to guarantee finite-time tracking.

II. NOTATION AND PRELIMINARIES
We denote vectors by boldface lowercase letters and matrices by uppercase letters. We refer to the (i, j)th entry of a matrix A as A_ij. We represent by 0_n and 1_n vectors with n entries, all equal to zero and to one, respectively. We use 2^{R^n} to denote the power set of R^n, i.e., the set of all subsets of R^n. We denote with I_n the identity matrix of size n and with O_{n×m} the zero matrix of dimension n × m. Given two vectors x, y ∈ R^n, we use max{x, y} ∈ R^n and min{x, y} ∈ R^n to denote the componentwise maximum and minimum, respectively. We denote with λ_i(A) (σ_i(A)) the ith largest eigenvalue (singular value) of the matrix A ∈ R^{n×n}. Moreover, we use λ_max(A) (σ_max(A)) and λ_min(A) (σ_min(A)) to denote the maximum and minimum eigenvalue (singular value) of A, respectively. We use ‖A‖ and ‖A‖_F to denote the 2-norm and the Frobenius norm of a matrix A, respectively, while we use ‖x‖ and ‖x‖_∞ to denote the Euclidean and the infinity norm of a vector x, respectively. In addition, we introduce the discontinuous sign function sign(y), equal to 1 if y > 0, −1 if y < 0, and 0 if y = 0, and we define its componentwise vector form sign(x) = [sign(x_1), . . ., sign(x_n)]^T. Let G = {V, E} be an undirected graph with node set V = {1, . . ., n}, with |V| = n, and edge set E ⊆ V × V, where (i, j) ∈ E captures the existence of a link from node i to node j. Note that since the graph is undirected, the existence of an edge (i, j) ∈ E implies the existence of the edge (j, i) ∈ E.
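These componentwise conventions can be illustrated with a short NumPy sketch (an illustration only; NumPy's `np.sign` happens to follow the same convention of returning 0 at 0):

```python
import numpy as np

x = np.array([2.0, -1.0, 0.0])
y = np.array([1.0, 3.0, -2.0])

# Componentwise sign: 1 for positive entries, -1 for negative, 0 at zero.
print(np.sign(x))          # [ 1. -1.  0.]
# Componentwise maximum and minimum, max{x, y} and min{x, y}.
print(np.maximum(x, y))    # [2. 3. 0.]
print(np.minimum(x, y))    # [ 1. -1. -2.]
```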
Let us define a path between agents i and j as the set of edges through which an agent j can be reached by an agent i; in the following, we will denote a path which involves k edges from agent i to reach agent j as a k-hop path between agents i and j. Let N_i^k denote the k-hop neighborhood of an agent i, that is, the set of agents j for which there exists a p-hop path from agent j to i with p ≤ k. In addition, given a graph G, let A_G be the set of matrices compatible with it, defined as A_G = {Γ ∈ R^{n×n} : Γ_ij = 0 for all (i, j) ∉ E ∪ C}
with C = {(i, i)}, i = 1, . . ., n. Note that, by definition, a matrix Γ ∈ A_G is not required to be symmetric and can have nonzero diagonal entries. The degree d_i of a node v_i is the number of its incident edges, i.e., d_i = |N_i^1|. Given an undirected graph G = {V, E} with n nodes, we define the Laplacian matrix L as the n × n matrix such that L_ii = d_i, L_ij = −1 if (i, j) ∈ E with i ≠ j, and L_ij = 0 otherwise. It is well known [25] that when G is connected, L has a unique eigenvalue equal to zero and that the corresponding left eigenvector is 1_n^T. Let us now review the Filippov solution concept for differential equations with a discontinuous right-hand side, the nonsmooth analysis of Clarke's generalized gradient, and the chain rule for differentiating regular functions along Filippov solution trajectories. The reader is referred to the works [26], [27], and [28], and references therein, for a comprehensive overview of the topic. Let us consider the differential equation ż(t) = f(z(t), t), (1) with f : R^n × R → R^n being a measurable and essentially locally bounded function. In the following, where understood, we omit the time dependency. First, we need to clarify what it means to be a solution of this equation.
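The Laplacian construction above can be sketched in a few lines (a minimal illustration assuming 0-indexed nodes; the path graph used here is a hypothetical example):

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of an undirected graph: L[i][i] = deg(i), L[i][j] = -1 if (i, j) is an edge."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, j] -= 1
        L[j, i] -= 1
        L[i, i] += 1
        L[j, j] += 1
    return L

# Path graph on 4 nodes: connected, so L has exactly one zero eigenvalue
L = laplacian(4, [(0, 1), (1, 2), (2, 3)])
print(np.ones(4) @ L)                        # 1^T L = 0: 1^T is a left eigenvector for eigenvalue 0
print(np.sum(np.linalg.eigvalsh(L) < 1e-9))  # exactly one eigenvalue equal to zero
```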
Definition 1 (Filippov Solution): A vector function z(·) is a Filippov solution of (1) on an interval [t_0, t_1] if it is absolutely continuous and, for almost all t ∈ [t_0, t_1], it holds ż ∈ K[f](z, t), with K[f](z, t) = ∩_{δ>0} ∩_{μ{H}=0} co{f(B(z, δ) \ H, t)}, where ∩_{μ{H}=0} denotes the intersection over all sets H of Lebesgue measure zero, B(z, δ) denotes the ball of radius δ centered at z, and co{·} denotes the convex closure.
The ability to disregard sets of measure zero represents an interesting feature of the above definition that makes it possible to identify solutions even at locations where the vector field is not defined.
We now recall from [28] the conditions for the existence and uniqueness of Filippov solutions.
Proposition 1 (Existence and Uniqueness [28]): Let f(z, t) : R^n × R → R^n be measurable and locally essentially bounded. Assume that, for all z ∈ R^n, there exists ε > 0 such that f(·) is essentially one-sided Lipschitz on B(z, ε). Then, for all z_0 ∈ R^n, there exists a unique Filippov solution of (1) with the initial condition z(0) = z_0.
We now review the concept of Clarke's generalized gradient, an essential tool in the machinery of nonsmooth analysis.
Definition 2 (Clarke's Generalized Gradient): Consider a locally Lipschitz function V : R^n × R → R. Then, the generalized gradient at (z, t) is defined as ∂V(z, t) = co{lim ∇V(z_i, t_i) : (z_i, t_i) → (z, t), (z_i, t_i) ∉ Ω_V}, where Ω_V is the set of measure zero where the gradient of V is not defined. Note that the gradient ∇ includes the derivative with respect to time (∂/∂t).
We now review the chain rule, which allows us to differentiate Lipschitz regular functions along Filippov's solution trajectories.
Theorem 1 (Chain Rule [27]): Let z(·) be a Filippov solution to (1) on an interval containing t and let V : R^n × R → R be a Lipschitz and, in addition, regular function. Then, V(z(t), t) is absolutely continuous, (d/dt)V(z(t), t) exists almost everywhere (i.e., save for a set of measure zero), and (d/dt)V(z(t), t) ∈^{a.e.} V̇(z, t), where a.e. is a shorthand for "almost everywhere" (the reader is referred to [29] for a more comprehensive overview of nonsmooth analysis) and V̇(z, t) is defined as V̇(z, t) = ∩_{ξ∈∂V(z,t)} ξ^T [K[f](z, t)^T, 1]^T. Let us now recall a revised version of the generalized Lyapunov theorem given in [26], based on the results given in [27]. This will prove useful to establish finite-time stability results for dynamical systems described by differential equations with a discontinuous right-hand side.
Theorem 2 (Finite-Time Stability Theorem): Consider a Filippov solution z(t) : R → R^n to (1) and let V(z, t) : R^n × R → R be a time-dependent regular function such that V(z, t) = 0 for all z ∈ C(t) and V(z, t) > 0 for all z ∉ C(t), with C(t) ⊂ R^n being a compact set. Furthermore, let z(t) and V(z, t) be absolutely continuous on [t_0, ∞), with V̇(z, t) ≤ −ε < 0 almost everywhere on {t : z(t) ∉ C(t)}. Then, V(z(t), t) converges to 0 in finite time and z(t) reaches the compact set C(t) in finite time as well.
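As a toy scalar illustration of Theorem 2 (a sketch only, not the article's protocol): for ż = −α sign(z), the function V(z) = |z| satisfies V̇ = −α almost everywhere away from the origin, so the origin is reached within |z(0)|/α time units. A forward-Euler simulation confirms this:

```python
import numpy as np

def simulate(z0, alpha, dt=1e-4, t_max=5.0):
    """Forward-Euler integration of the nonsmooth flow z' = -alpha*sign(z);
    returns the first time |z| falls within one integration step of the origin."""
    z, t = z0, 0.0
    while t < t_max:
        if abs(z) < alpha * dt:   # within one step of the origin: converged
            return t
        z -= alpha * np.sign(z) * dt
        t += dt
    return None

t_conv = simulate(z0=2.0, alpha=4.0)
print(t_conv)  # close to the theoretical bound |z(0)|/alpha = 0.5
```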
Let us now also introduce the concept of generalized Jacobian. Consider a Lipschitz vector-valued function F : R^n × R → R^m. It follows from the work in [29] that the generalized Jacobian ∂F(z, t) is ∂F(z, t) = co{lim JF(z_i, t_i) : (z_i, t_i) → (z, t), (z_i, t_i) ∉ Ω_F}, with JF(z, t) ∈ R^{m×n} being the classical Jacobian whenever it exists and Ω_F the set of measure zero where JF(z, t) is not defined.

III. PROBLEM STATEMENT
Let us consider the following quadratic optimization problem with a time-varying linear objective term and time-varying linear constraints with the same sparsity pattern as the static graph encoding the undirected network topology over which the multiagent system interacts.
Problem 1: Consider the following optimization problem over a graph G = {V, E} with |V| = n:
with E = [e_{i(1)}, . . ., e_{i(m)}]^T, where e_j is the jth vector in the canonical basis in R^n. In other words, the matrix A essentially amounts to a subset of the rows of a matrix P that is compatible with the graph underlying the agents' interaction.
In this article, we are interested in solving Problem 1 in a distributed manner as detailed in the following.
Problem 2: Let us consider a multiagent system composed of n agents interconnected by a communication network G. Our problem consists in designing a distributed control protocol that drives each agent i to a component of the optimal solution x*(t) and of the optimal Lagrange multiplier ζ*(t) of Problem 1 in finite time T. Before moving forward with the technical derivations of the article, we now discuss an interesting subclass of problems that represents a motivational example for the proposed framework.

Example 1 (Motivational Example):
Let us consider a problem where each agent has two clashing objectives: on one side, the agents want their variables x_i(t) to track a time-varying reference signal φ_i(t) while, on the other side, the agents want to have values as similar as possible to each other (e.g., [30], where multiobjective approaches are used to drive the exploration task of mobile robots). This kind of setting has a number of applications, such as exploration problems in the context of mobile robotics, where agents may want to explore different zones but also stick with each other. Another interesting case is in the context of networks of distributed electrical prosumers, able to consume, provide, or exchange energy with their neighbors; in this context, an interesting feature is the ability to mediate between the local utility of the agent, which can be considered to be time-varying based, for instance, on the energy prices (in this context, the possibility to handle nonsmooth variations could be useful to model abrupt price changes), and the requirement that the energy provided by the agents is similar, in order to reduce the risk of instability. An example in this direction is given in [31], where prosumers interact by exchanging energy with their neighbors, and the amount of energy produced, exchanged, or consumed is decided by solving a multiobjective optimization problem. From a practical standpoint, the objective function is in a form where the parameter γ is used to mediate between the two objectives. By some algebra, the objective function can be equivalently expressed as a quadratic form in x(t); notice that the constant term γψ^T(t)ψ(t)/2, being independent of x(t), can be neglected, in that the optimal solution does not change when a constant is added to the objective function. As a consequence, it can be noted that, within the resulting objective function f(x(t), t), time variability only occurs in the linear term. As for the constraints, for simplicity, one may consider
time-varying lower limits on the agents' variables. We point out that, while we require ϕ(t) and b(t) to be bounded, we do not need to know their actual bounds; instead, as discussed later, the agents will need to know a bound on their derivatives. In addition, such an assumption is intrinsically satisfied for well-posed problems (i.e., when b(t) is not bounded, the problem can easily be shown to be either unfeasible or unconstrained, whereas when ϕ(t) is not bounded, the problem reduces to finding a feasible solution).
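One plausible instantiation of the objective in Example 1 (a sketch assuming a quadratic tracking term plus a Laplacian disagreement penalty; the article's exact weighting and the signal ψ(t) may differ) is:

```latex
f(\mathbf{x}(t),t)
  = \frac{1}{2}\,\|\mathbf{x}(t)-\boldsymbol{\varphi}(t)\|^{2}
    + \frac{\gamma}{2}\,\mathbf{x}(t)^{T} L\, \mathbf{x}(t)
  = \frac{1}{2}\,\mathbf{x}(t)^{T}\,(I_{n}+\gamma L)\,\mathbf{x}(t)
    - \boldsymbol{\varphi}(t)^{T}\mathbf{x}(t)
    + \frac{1}{2}\,\boldsymbol{\varphi}(t)^{T}\boldsymbol{\varphi}(t)
```

Here a matrix of the form Q = I_n + γL inherits the sparsity pattern of the graph, the constant term can be dropped without affecting the minimizer, and time variability indeed enters only through the linear term −φ(t)^T x(t), consistent with the discussion above.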
Let us now define the feasible set X(t) = {x ∈ R^n : Ax ≤ b(t)}. The next assumption is required to guarantee that the problem at hand is feasible at all time instants.

Assumption 5: For all time instants t ∈ [0, ∞), the set X(t) is nonempty.
As discussed later in the article, the next technical assumption is required in order to set up a proper gain α in our algorithm.
Assumption 6: Matrix Q − A^T A is positive definite.

Assumption 6 is given without loss of generality. In particular, as discussed later in Remark 4, if the assumption is not satisfied, it is sufficient to scale the objective function by a constant β > 0, which can be computed in a distributed fashion, obtaining an equivalent problem that satisfies the assumption.
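The rescaling argument can be sanity-checked numerically (a sketch with randomly generated data; the margin factor 1.01 and the choice β = 1.01·‖A‖_F²/λ_min(Q) are illustrative, justified by λ_min(βQ − AᵀA) ≥ βλ_min(Q) − ‖A‖² ≥ βλ_min(Q) − ‖A‖_F²):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 7

# Randomly generated positive-definite Q and full-row-rank A (illustrative data only)
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)
A = rng.standard_normal((m, n))

# Q - A^T A may fail to be positive definite for the original data...
print(np.linalg.eigvalsh(Q - A.T @ A).min())

# ...but scaling the objective by beta > ||A||_F^2 / lambda_min(Q) restores it
beta = 1.01 * np.linalg.norm(A, 'fro')**2 / np.linalg.eigvalsh(Q).min()
print(np.linalg.eigvalsh(beta * Q - A.T @ A).min() > 0)   # True
```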
The following assumption characterizes the information available to each agent.
Assumption 7: Each agent i knows: 1) the total number n of agents; 2) the total number m of constraints; 3) the entries Q_ij for all j ∈ N_i; 4) the entries A_ji for all j ∈ N_i; 5) the entries A_lj for all j ∈ N_i, if the agent is responsible for the lth constraint; 6) φ_i(t) and b_i(t); 7) the constants κ_ϕ and κ_b; 8) the minimum eigenvalue λ_min(Q) of Q; 9) the minimum singular value σ_min(C) of the matrix C defined in (2).

Remark 1: Knowledge of λ_min(Q) and σ_min(C) is required in order to adequately choose the gain α within the proposed algorithm. Notice that, as it will be shown later in the article, under Assumption 6, C is guaranteed to be nonsingular, and thus, σ_min(C) > 0. Notice further that, as discussed later in the article, the requirement to know σ_min(C) can be relaxed, as it is possible to derive a positive lower bound on σ_min(C) that only depends on λ_min(Q), ‖Q‖_F, σ_min(A), ‖A‖_F, and n. This requirement could also be lifted by resorting to adaptive gains, which represent a valuable future work direction.
Remark 2: The objective function of Problem 1 is convex by construction, and the constraints are linear. Therefore, assuming a feasible solution exists at all time instants, Slater's constraint qualification holds true at all times [32].
Remark 3: In this article, for the sake of simplicity, we assume each agent is associated with a scalar choice variable. However, the approach can easily be extended to the vectorial case where each agent is associated with a vectorial variable x_i(t) ∈ R^h and time-varying signals b_i(t), φ_i(t), and the aim is to solve a problem where Q ∈ R^{nh×nh} is positive definite and A ∈ R^{ℓ×nh}, with ℓ ≤ nh, is full row rank. In particular, in order to generalize the approach, matrices Q and A should exhibit the same sparsity pattern as the graph. For instance, it is possible to consider matrices Q with the structure Q = I_n ⊗ Q_local + Q_interaction ⊗ Q_coupling, where Q_local ∈ R^{h×h} models a local term that only depends on the choice variables available at each agent, whereas the second term is the combination of Q_interaction ∈ R^{n×n}, which accounts for the agents' interaction and has the same structure as the communication graph, and Q_coupling ∈ R^{h×h}, which models the influence among pairs of agents.

IV. FROZEN-TIME GLOBAL OPTIMAL SOLUTION
Let us now characterize the structure of the global optimal solution at a given time instant t, which we refer to as the frozen-time solution at time t. The result will be a system of nonsmooth algebraic equations in the Lagrange multipliers, which will be the basis for the proposed algorithm.
Proof: The proof follows by classical KKT theory (e.g., see [32], [33]). In particular, we have that the Lagrangian function is L(x(t), ζ(t), t) = f(x(t), t) + ζ(t)^T (Ax(t) − b(t)), with ζ being the vector of Lagrange multipliers associated with the constraints. Moreover, x*(t), ζ*(t) are globally optimal and, in particular, x*(t) is unique, if and only if the following conditions hold true: 1) ∇_x L(x*(t), ζ*(t), t) = 0_n; 2) Ax*(t) ≤ b(t); 3) ζ*(t) ≥ 0_m; and 4) ζ*(t) ∘ (Ax*(t) − b(t)) = 0_m, where ∘ denotes the Hadamard product. The proof follows noting that points 2)-4) and 1) correspond, respectively, to the first and second blocks of h(·) in (3).
We now establish two results on the Lagrange multiplier vector corresponding to the optimal solution of the frozen-time problem. We will utilize these results later to prove the finite-time convergence and tracking properties of our protocol.
Proposition 2: Under Assumptions 1-5, the Lagrange multiplier vector ζ*(t) corresponding to the global optimal solution x*(t) is unique for all t ≥ 0.
Proof: To prove the result, as described in [34] and references therein, it is sufficient to show that the linear-independence constraint qualification (LICQ) holds true, i.e., that the gradients of the constraints evaluated at x*(t) are linearly independent. Notably, in our case, the matrix having such gradients as columns corresponds to A^T. Therefore, by Assumption 3, LICQ is verified.

V. DISTRIBUTED OPTIMIZATION ALGORITHM DESIGN
In this section, we develop a distributed algorithm to solve the time-varying optimization problem illustrated in Problem 2. Note that, from Theorem 3, at each time instant t, the optimal solution x*(t) and the optimal Lagrange multipliers ζ*(t) satisfy (3). Therefore, our goal is to enforce this condition for any t ≥ T, with T > 0. To achieve this goal, let us introduce the stacked vector z(t) = [ζ(t)^T, x(t)^T]^T ∈ R^{n+m} collecting the Lagrange multipliers ζ(t) and the state x(t). Notably, based on the definition of A in Problem 1, only the subset H ⊆ V of agents is in charge of handling a constraint and is thus associated with a variable ζ_i(t), which models the corresponding Lagrange multiplier. Conversely, each agent i ∈ V is associated with a variable x_i(t).
For the sake of readability, let us introduce the functions w_i and g_i and the function y : R^{n+m} × R → R^m. Furthermore, let S(z, t) be the diagonal m × m matrix such that S_ii(z, t) = 1 if y_i(z, t) ≤ 0 and S_ii(z, t) = 0 otherwise, and let S̄(z, t) be the set of diagonal matrices S̃(z, t) ∈ R^{m×m} with structure S̃_ii(z, t) = 1 if y_i(z, t) < 0, S̃_ii(z, t) ∈ [0, 1] if y_i(z, t) = 0, and S̃_ii(z, t) = 0 if y_i(z, t) > 0. The matrices S(z, t) and S̃(z, t) will be used later to compute the derivative of the minimum function in (4). We now outline our distributed protocol. Specifically, from the perspective of the ith agent, the proposed algorithm reads as in (7) (we omit dependencies on the state z and on time t for the sake of readability). Note that, in order to implement (7), each agent i is required to collect the following information.
i) The state variables z_l(t) = [ζ_l(t), x_l(t)]^T of the agents l ∈ N_i^2 belonging to its 2-hop neighborhood (i.e., in order to compute the functions w_j and g_j for each 1-hop neighbor j ∈ N_i^1). ii) The elements A_ji for the agents j ∈ N_i^1 belonging to its 1-hop neighborhood.
iii) The elements A_jl, A_lj, and Q_jl for the 2-hop neighbors l ∈ N_j^1 such that j ∈ N_i^1. iv) The time-varying values ϕ_j(t), b_j(t) for the agents j ∈ N_i^1 belonging to its 1-hop neighborhood. For point i), we notice that the states z_l(t) of the 2-hop neighborhood can be locally estimated by agent i through 1-hop local interactions by resorting to the state-of-the-art finite-time k-hop distributed observer proposed in [24]; for points ii) and iii), we observe that, since the required elements are constant, they can be exchanged once before the execution of the proposed algorithm; finally, for point iv), we assume that the values ϕ_j(t), b_j(t) of the 1-hop neighborhood can be exchanged through 1-hop communication. Stacking (7) for all i ∈ V yields the matrix form (8), with h(z, t) defined in (3) and M(z, t) ∈ R^{(n+m)×(n+m)} a suitably defined matrix. In the sequel, we will analyze the convergence properties of our proposed algorithm given in (7) by considering its equivalent matrix version given in (8). Moreover, we will show how to choose the gain α in order to guarantee convergence, based on the information available to the agents as per Assumption 7.
Finally, we will also demonstrate how to relax the assumption that the agents need to know σ min (C).

A. Convergence Analysis
In order to establish the convergence of the proposed algorithm, let us first introduce a preliminary result. In this view, let M(z, t) = {M_S(z, t) : S(z, t) ∈ S̄(z, t)} be the set collecting all matrices M_S(z, t) ∈ R^{(n+m)×(n+m)} defined as in (10).

Lemma 1: Let Assumptions 2 and 3 hold. Then, every matrix M_S(z, t) ∈ M(z, t) defined as in (10) is nonsingular.
Proof: In order to prove the result, we observe that, by Assumption 2, the lower diagonal block of M_S is invertible, and its Schur complement with respect to such a block is M_S/Q. It is well known, e.g., [35], that it holds det(M_S) = det(Q) det(M_S/Q). Since by construction det(Q) ≠ 0, we have that det(M_S) ≠ 0 if and only if det(M_S/Q) ≠ 0. In view of a contradiction, suppose det(M_S/Q) = 0. This means there is v ≠ 0_m such that (M_S/Q) v = 0_m, i.e., (11) holds. Notice that if the diagonal entries of S are all zero, (11) is satisfied only for v = 0_m, i.e., we reach a contradiction. Hence, let us assume that S has 0 < ℓ ≤ m nonzero diagonal entries, and let P be the m × m permutation matrix such that S̃ = P S P^T has its first ℓ diagonal entries positive, i.e., S̃_11 ∈ (0, 1], . . ., S̃_ℓℓ ∈ (0, 1]. Since v ≠ 0_m, we can write v = P^T ṽ for some ṽ ≠ 0_m; therefore, noting that P P^T = I_m, by premultiplying (11) by P, we obtain (13), where Ã and Q̃ are obtained by permutation of rows and columns of A and Q, respectively. Let us now partition ṽ as ṽ = [ṽ_1^T, ṽ_2^T]^T, with ṽ_1 ∈ R^ℓ and ṽ_2 ∈ R^{m−ℓ}.

Such a decomposition induces a corresponding block decomposition for the matrix S̃. With this decomposition, (13) corresponds to (15) and ṽ_2 = 0_{m−ℓ}. Therefore, (15) corresponds to (16). Since S̃_1 is nonsingular by construction, (16) can be rearranged as S̃_1 H ṽ_1 = 0_ℓ, with H = (Ã Q̃^{-1} Ã^T)_{11} + (S̃_1^{-1} − I_ℓ). At this point, we observe that (Ã Q̃^{-1} Ã^T)_{11} is the leading principal minor of Ã Q̃^{-1} Ã^T of size ℓ. Notice that, by Assumptions 2 and 3, A Q^{-1} A^T is symmetric and positive definite; therefore, by construction, Ã Q̃^{-1} Ã^T is symmetric and positive definite. As a consequence, by Sylvester's criterion [36], also (Ã Q̃^{-1} Ã^T)_{11} is symmetric and positive definite. Furthermore, we observe that S̃_1^{-1} − I_ℓ is diagonal and positive semidefinite, since its diagonal entries are (S̃_1)_ii^{-1} − 1 ≥ 0, being 0 < (S̃_1)_ii ≤ 1. Therefore, H is the sum of a positive-semidefinite and a positive-definite matrix and is, thus, positive definite. Moreover, since S̃_1 is positive definite, S̃_1 H is nonsingular. Hence, (16) is satisfied only for ṽ_1 = 0_ℓ. This implies that (11) is satisfied only for v = 0_m. We reached a contradiction; therefore, the Schur complement M_S/Q is nonsingular, and this implies that M_S is nonsingular.
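The pivotal facts in the proof above — A Q^{-1} A^T is symmetric positive definite whenever Q ≻ 0 and A has full row rank, and it stays positive definite after adding the nonnegative diagonal S̃_1^{-1} − I — can be checked numerically (randomly generated data, not the article's example):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 5

M = rng.standard_normal((n, n))
Q = M @ M.T + np.eye(n)                  # symmetric positive definite
A = rng.standard_normal((m, n))          # full row rank with probability 1

G = A @ np.linalg.inv(Q) @ A.T           # A Q^{-1} A^T
print(np.allclose(G, G.T))               # symmetric
print(np.linalg.eigvalsh(G).min() > 0)   # positive definite

# Adding the nonnegative diagonal S^{-1} - I (entries of S in (0, 1]) preserves definiteness
S = np.diag(rng.uniform(0.1, 1.0, m))
H = G + (np.linalg.inv(S) - np.eye(m))
print(np.linalg.eigvalsh(H).min() > 0)   # positive definite, so S @ H is nonsingular
```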
Let us now prove that the distributed algorithm introduced in (8) allows the system to reach the compact set C(t) = {z*(t)}, i.e., the singleton corresponding to the unique optimal solution at time t, in finite time T and then remain contained therein for t ≥ T, i.e., that it is able to solve Problem 2 in finite time.
Theorem 4: Consider the setting of Problem 2 and let the agents run the proposed protocol in (8). Let Assumptions 1-7 hold and suppose that the coefficients ε > 0 and ρ are known to the agents. Assume also that the gain α satisfies the condition in (18), where ε > 0 is a design parameter used to impose an arbitrary convergence time and ρ > 0 is defined as in (19), with σ_min(M_S) the smallest singular value of the matrix M_S introduced in (10). Then, there exists T(ε) > 0 such that the stacked vector h(z, t) introduced in (3) converges to zero in finite time, that is, ‖h(z, t)‖_1 = 0 for all t ≥ T(ε), where the convergence time T(ε) is upper bounded by the positive finite value T_max = ‖h(z(0), 0)‖_1/ε.
Proof: Consider the generalized time-varying Lyapunov-like function V(z, t) = ‖h(z, t)‖_1, which, by Proposition 2, satisfies V(z, t) = 0 for z = z*(t) and V(z, t) > 0 for all z ≠ z*(t). We now prove that the Lyapunov-like function introduced in (20) reaches zero in finite time T(ε) and remains zero for all t ≥ T(ε).
In order to apply Theorem 1, let us now compute the generalized gradient ∂V(z, t) as in (21), from the application of [29, Th. 2.6.6] and [26, Th. 1]. It can be noticed that the co{·} in (21) is superfluous, since [∂h/∂z, ∂h/∂t]^T SIGN(h) is a vector of closed intervals [37] and thus convex by construction. Since the derivative of h(z, t) can be discontinuous for some i when y_i(z, t) = 0, in general, the terms ∂h/∂z and ∂h/∂t are not singletons and generate the structure of ∂V(z, t) given in (22). In order to analyze the structure of a generic element ξ ∈ ∂V, let us introduce the matrix M_S^z ∈ R^{(n+m)×(n+m)} and the vector M_S^t ∈ R^{1×(n+m)} defined in (23) and (24), with S ∈ S̄, β ∈ K[χ](t), and ψ ∈ K[ω](t). An element ξ ∈ ∂V can then be expressed as in (25). In virtue of the aforementioned equations, we can now restate the generalized time derivative of V, where we point out that ξ_t is a scalar in virtue of (23) and (24) and that the proposed control law is the nonsmooth version of the classical gradient-descent flow of a differentiable function. At this point, we can proceed by applying a similar reasoning as in [27]. In particular, since ∂_z V is convex, it follows that, for all z ≠ z*(t), there exists η ∈ SIGN(h(z, t)). Considering now ξ_z = M_S^z η and ξ_t = M_S^t η, we obtain a bound on the generalized time derivative, where ρ defined as in (19) is positive in virtue of Lemma 1, κ_b, κ_ϕ are the positive bounds on the possible values of the derivatives (whose knowledge is required to set an adequate gain α) of the signals b(t) and ϕ(t), respectively, ‖S‖ ≤ 1 from its structure detailed in (6), and we used the fact that, whenever h ≠ 0_{n+m}, i.e., z ≠ z*(t), η has at least one component with |η_i| = 1 while, in general, all other components satisfy |η_j| ≤ 1, and thus, it holds ‖η‖_∞^2 = 1. At this point, by choosing α according to (18), it holds that V̇(z, t) ≤ −ε; thus, from Theorem 2, noting that in our case C(t) = {z*(t)} is compact by definition (i.e., by Proposition 2, z*(t) is unique, and thus, {z*(t)} is compact at each t), it follows that V(z, t)
converges to 0 in finite time and ‖z(t) − z*(t)‖ reaches zero in finite time too (and remains equal to zero). A characterization of the convergence time is T(ε) ≤ V(z(0), 0)/ε, which yields the upper bound T(ε) ≤ T_max = ‖h(z(0), 0)‖_1/ε.

The result follows.
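The mechanism behind the gain condition — choosing the gain to exceed the drift of the time-varying data by a margin ε forces the error to decay at rate at least ε, giving a convergence time of at most (initial error)/ε — can be illustrated with a scalar tracking sketch (a toy example, not the article's protocol):

```python
import numpy as np

def track(alpha, dt=1e-4, t_max=3.0):
    """x' = -alpha * sign(x - r(t)) with reference r(t) = sin(2t), so |r'(t)| <= 2."""
    x, t, err = 1.5, 0.0, []
    while t < t_max:
        r = np.sin(2 * t)
        x -= alpha * np.sign(x - r) * dt
        t += dt
        err.append(abs(x - np.sin(2 * t)))
    return np.array(err)

err = track(alpha=3.0)   # gain exceeds the drift bound kappa = 2 by a margin of 1
# The error of at most 1.5 shrinks at rate >= 1, so convergence occurs within 1.5 s;
# after the finite transient, the tracking error stays near zero.
print(err[-1] < 1e-2)    # True
```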
Having proven finite-time convergence of the function h(z, t) to the origin, we can now prove that Problem 2 is solved in finite time as well.
Corollary 1: Let the conditions of Theorem 4 hold. Then, Problem 2 is solved for all t ≥ T(ε).
Proof: The result follows from the application of Theorem 4. In particular, for z(t) = z*(t), it holds h(z*(t), t) = 0_{n+m}, and thus, the optimality condition in (3) is satisfied, proving that x(t) = x*(t) for each time instant t ≥ T(ε).

B. Choice of the Gain α
According to Theorem 4, for each given T_max, there is a sufficiently large choice of α such that the system achieves convergence in a finite time that is upper bounded by T_max. However, choosing α via (18) may look very hard at first glance, since computing ρ requires evaluating infinitely many matrices M_S ∈ M.
In this section, in order to simplify this endeavor, we first provide a practical way to choose α based on the information available to the agents as per Assumption 7, without the need to consider the different matrices in M; then, to further simplify the task of choosing α, we show how to lift the assumption that the agents need to know σ_min(C).
In the following proposition, we derive a lower bound for ρ, trading ease of computation for: 1) a slight increase in the magnitude of the resulting gain α and 2) an additional assumption which, as shown later in Remark 4, can always be satisfied by considering an equivalent formulation of the problem at hand.

Proposition 3: Let Assumptions 2, 3, and 6 hold true. Then, we have that ρ ≥ ρ̄ > 0, where ρ is defined as in (19), whereas ρ̄ depends only on σ_min(C), n, m, and ‖A‖_F, with C defined in (2).
Proof: In order to prove the result, let us introduce the matrices C_S, X ∈ R^{(n+m)×(n+m)} such that, by construction, C_S = X M_S. Therefore, as shown in [38], it holds σ_min(C_S) ≤ ‖X‖ σ_min(M_S) and, thus, σ_min(M_S) ≥ σ_min(C_S)/‖X‖. At this point, by resorting to the properties of the Schur complement of block matrices, since S_ii ∈ [0, 1] for all i and due to the fact that, by Assumption 6, Q − A^T A is positive definite, we have that
where the last equation holds in virtue of the properties of the Schur complement [35] A(Q − A^T A)^{-1} A^T of C. To conclude our proof, we observe that, since A is full row rank and Q − A^T A is positive definite, also A(Q − A^T A)^{-1} A^T is positive definite: in fact, for all x ∈ R^m with x ≠ 0_m, we have y = A^T x ≠ 0_n and, thus, x^T A(Q − A^T A)^{-1} A^T x = y^T (Q − A^T A)^{-1} y > 0, where the latter inequality holds since y ≠ 0_n and since Q − A^T A is positive definite (which implies that (Q − A^T A)^{-1} is positive definite as well). Therefore, we have that σ_min(M_S) ≥ σ_min(C)/‖X‖ > 0, which completes the proof. We point out that the above lower bound is remarkably easier to compute than ρ, as it requires knowledge of just σ_min(C), n, m, and ‖A‖_F, without the need to inspect the set of all M_S ∈ M. Interestingly, as discussed later in this section, ‖A‖_F can be computed in a distributed way, based on the information available to the agents as per Assumption 7.
Remark 4: Proposition 3 requires Q − A^T A to be positive definite. Since, for symmetric matrices U, V ∈ R^{n×n}, it is well known [39] that

where we used the property that ‖A‖_F ≥ ‖A‖. Notably, an optimization problem is equivalent under positive scaling of the objective function; hence, given Q and A, we can consider an equivalent formulation which satisfies Assumption 6. Interestingly, the Frobenius norm of A can be computed in a distributed way in finite time. In particular, assuming each agent i knows the entries A_ij corresponding to its neighbors (if an agent is not in charge of a constraint, it simply assumes A_ij = 0 for all its neighbors), it is sufficient to run a finite-time distributed average consensus procedure [40] with initial condition y_i(0) = ∑_{j∈N_i} A_ij², which converges to ȳ = ‖A‖_F²/n. Then, based on knowledge of n, each agent can compute ‖A‖_F = √(n ȳ). Similarly, the agents are able to compute ‖Q‖_F in a distributed way. Therefore, ‖A‖_F and ‖Q‖_F can be computed during an initialization phase.
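As an illustration of this initialization step, the following sketch runs a plain (asymptotic) average consensus with Metropolis weights as a stand-in for the finite-time procedure of [40]; the path graph, entries of A, and iteration count are all illustrative choices, not taken from the paper.

```python
import numpy as np

# Hypothetical 5-agent path graph; A has nonzero entries only between
# neighbors, mimicking the sparsity assumption of the paper.
rng = np.random.default_rng(1)
n = 5
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
A = np.zeros((n, n))
for i, nbrs in neighbors.items():
    for j in nbrs:
        A[i, j] = rng.standard_normal()

# Each agent starts from the local sum of squared entries it knows:
# y_i(0) = sum_{j in N_i} A_ij^2.
y = np.array([sum(A[i, j] ** 2 for j in neighbors[i]) for i in range(n)])

# Plain averaging with Metropolis weights (asymptotic stand-in for the
# finite-time consensus of [40]); the weight matrix is doubly stochastic.
for _ in range(500):
    y_new = y.copy()
    for i in range(n):
        for j in neighbors[i]:
            w = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
            y_new[i] += w * (y[j] - y[i])
    y = y_new

# The consensus value is ||A||_F^2 / n, so each agent recovers ||A||_F.
print(np.sqrt(n * y[0]), np.linalg.norm(A, "fro"))
```

Since the initial conditions sum to ‖A‖_F², the average they converge to is ‖A‖_F²/n, and each agent recovers the Frobenius norm as √(n ȳ) using its knowledge of n.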
To conclude the section, with the aim to further reduce the amount of information required by the agents in order to choose the gain α, let us discuss a way to relax the assumption that the agents need to know σ_min(C). Specifically, we now show that there exists a positive lower bound on σ_min(C) based on quantities available to the agents.
Proof: In order to prove the result, we resort to the lower bound in [41], where it is shown that, for a given nonsingular square matrix U ∈ R^{n×n}, it holds that

> 0, from which we have that

Notice that, since Q − A^T A is nonsingular, by resorting to the properties of the Schur complement of block matrices [35], we have that

where we used the well-known properties that, for Y ∈ R^{n×n} and G ∈ R^{m×n} with m ≤ n, it holds that det(Y) = ∏_{i=1}^n λ_i(Y) and det(GG^T) = ∏_{i=1}^m σ_i²(G) (see, for instance, [42]). At this point, noting that λ_i(Q − A^T A) ≥ λ_min(Q − A^T A) and that [43]

Moreover, it holds

The proof follows by plugging (28) and (29) into (27).
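The two determinant identities invoked in the proof can be checked numerically; the random matrices below are illustrative only.

```python
import numpy as np

# Sanity check of the identities used in the proof:
#   det(Y) = prod_i lambda_i(Y)        for square Y
#   det(G G^T) = prod_i sigma_i(G)^2   for G in R^{m x n}, m <= n
rng = np.random.default_rng(2)
Y = rng.standard_normal((5, 5))
G = rng.standard_normal((3, 5))

# Product of (possibly complex) eigenvalues of a real matrix is real.
assert np.isclose(np.linalg.det(Y), np.prod(np.linalg.eigvals(Y)).real)

s = np.linalg.svd(G, compute_uv=False)
assert np.isclose(np.linalg.det(G @ G.T), np.prod(s ** 2))
print("identities verified")
```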

VI. SIMULATIONS
For the numerical validation of the proposed protocol, we considered a multiagent system with n = 10 agents interacting over an undirected graph with |E| = 14 edges. Moreover, we consider uniformly random Q ∈ R^{10×10} and A ∈ R^{7×10}, i.e., |H| = 7. The time-varying vectors ϕ(t) and b(t) are depicted in Fig. 1(d) and satisfy Assumption 4 with κ_ϕ = 1.755 and κ_b = 1.5.
In order to correctly tune the gain α, the results of Proposition 3 can be exploited. However, since Q and A do not satisfy the condition on the positive definiteness of Q − A^T A, the method described in Remark 4 can be applied to scale the matrix Q so that βQ − A^T A is positive definite. In particular, for this example, we chose β = 1.83.
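The scaling step can be automated numerically: since the paper's matrices are not reproduced here, the sketch below uses a hypothetical symmetric positive-definite Q and a random A, and scans for the smallest β making βQ − A^T A positive definite.

```python
import numpy as np

# Illustrative stand-ins for the paper's matrices: a symmetric
# positive-definite Q (as a quadratic-cost Hessian) and a random A.
rng = np.random.default_rng(3)
n, m = 10, 7
B = rng.standard_normal((n, n))
Q = B @ B.T + n * np.eye(n)
A = rng.standard_normal((m, n))

# beta*Q - A^T A is positive definite iff its smallest eigenvalue is
# positive; scan beta upward until the condition of Remark 4 holds.
beta = 0.1
while np.linalg.eigvalsh(beta * Q - A.T @ A).min() <= 0:
    beta += 0.01
print("smallest tested beta:", round(beta, 2))
```

The scan terminates because λ_min(βQ) grows linearly in β while A^T A is fixed; in practice one would then solve the equivalently scaled problem with objective multiplied by β.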
The proposed algorithm was implemented in discrete time using the forward Euler method with sampling time τ = 10^{-8}. Agents implement the local interaction rule given in (7) with gain α = 1706, according to the results of Theorem 4 and Proposition 3. Notably, the estimated bound for the minimum singular value of the matrix M_S obtained by exploiting the results of Proposition 3, i.e., ρ̲ = 0.0781, is not far from the best value obtained numerically via a Monte Carlo simulation campaign featuring 10^6 trials: the minimum singular value of M_S according to the Monte Carlo evaluation is ρ = 0.323. The ratio between the Monte Carlo minimum value and the bound computed via Proposition 3 is ρ/ρ̲ = 4.1379. This implies that our control gain α is about 17 times larger than the minimum value required by the conditions of Theorem 4.
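To build intuition for why a sufficiently large static gain lets the forward-Euler discretization track a moving target, consider the following one-dimensional nonsmooth toy; it is an illustrative analog of the gain-dominates-drift principle, not the protocol (7) itself.

```python
import numpy as np

# 1-D toy: dx/dt = -alpha * sign(x - x*(t)) tracks x*(t) in finite
# time whenever alpha exceeds the drift bound |d x*/dt|. Discretized
# with forward Euler, the error then chatters within an O(alpha*dt) band.
alpha, dt = 2.0, 1e-3
t = np.arange(0.0, 5.0, dt)
target = np.sin(t)          # drift bound |cos t| <= 1 < alpha
x = 3.0                     # far-away initial condition
err = []
for k in range(len(t)):
    err.append(abs(x - target[k]))
    x += -alpha * np.sign(x - target[k]) * dt   # forward Euler step
print("final error:", err[-1])
```

Here the reaching phase closes the initial gap at rate at least α − 1, after which the error stays within a chattering band proportional to ατ; this mirrors why the simulation above pairs a large gain with a very small sampling time.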
We remind the reader that the proposed algorithm requires 2-hop information, which can be estimated in finite time by implementing a 2-hop distributed observer such as the one given in [24], which requires only 1-hop information to work. For the sake of simplicity and without loss of generality, we assume that, at the initial time t_0, the local observer tracking error has already reached zero, i.e., for all t ≥ t_0 the agents possess the 2-hop state information required to implement the proposed distributed strategy, and we focus only on illustrating the properties of the proposed algorithm.
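A minimal sketch of how 2-hop information can be assembled from 1-hop exchanges is given below: in a first round each agent shares its own state, and in a second round it relays the neighbor states it received. The graph and values are illustrative, and this simple relay is not the observer of [24].

```python
# Two communication rounds over a hypothetical 4-agent path graph.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
state = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}

# Round 1: every agent learns its neighbors' states (1-hop exchange).
one_hop = {i: {j: state[j] for j in neighbors[i]} for i in state}

# Round 2: every agent relays its 1-hop table, exposing 2-hop states.
two_hop = {}
for i in state:
    known = dict(one_hop[i])
    for j in neighbors[i]:
        known.update(one_hop[j])   # j forwards what it learned
    known.pop(i, None)             # discard the agent's own state
    two_hop[i] = known
print(two_hop[0])   # agent 0 now also knows agent 2's state
```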

VII. CONCLUSION
In this article, we considered a class of quadratic optimization problems with a time-varying linear objective term and time-varying linear constraints sharing the same sparsity pattern as the static graph encoding the undirected network topology over which the multiagent system interacts. Our contribution is twofold. First, we exploited the Karush-Kuhn-Tucker conditions to derive a necessary and sufficient global optimality condition for the frozen-time problem. Since the derived optimality condition is in the form of a system of nonsmooth equations, we developed a nonsmooth distributed algorithm that achieves finite-time convergence and tracks the optimal time-varying solution. Furthermore, we derived a lower bound for the minimum singular value of the family of matrices M_S ∈ M, providing a method to practically compute the gain α required to solve the optimization problem. Future work will aim to consider more general time-varying problems, e.g., quadratic problems with a time-varying Hessian and constraint matrix; in this context, a challenge to overcome is that the time variability of the aforementioned matrices would not be dominated by a static gain, thus calling for an adaptive-gain approach. Furthermore, the introduction of adaptive gains will also allow us to lift the requirements on the information that must be available to the nodes, e.g., the number of agents, the bounds on the derivatives of the time-varying signals, and σ_min(C).

Theorem 3: Consider Problem 1 under Assumptions 1-5; the frozen-time formulation at any time instant t ≥ 0 has a unique global optimal solution x*(t) and Lagrange multipliers ζ*(t) ∈ R^m for the inequality constraint that satisfy

The considered matrices A, Q are reported in the following:

Fig. 1(a) and (b) show the evolution of x(t) and ζ(t), respectively. Fig. 1(c) depicts the evolution of the Lyapunov-like function V(z, t) introduced in (20), with an inset in the top-right side of the picture showing a detail of its convergence during the first 0.0025 s of the simulation. Fig. 1(d) shows the evolution of the time-varying vectors ϕ(t) and b(t). Fig. 1(e) shows the evolution of the time-varying constraint Ax(t) − b(t) ≥ 0. Finally, Fig. 1(f) shows the evolution of the errors between x(t), ζ(t) and the optimal solution x*(t), ζ*(t) obtained via a centralized solver, where an inset in the top-right side shows a detail of their convergence during the first 1.5 s of the simulation. According to the figures, the proposed algorithm is able to track the global optimal solution of the time-varying optimization problem, as expected from the results of Corollary 1.

TABLE I: Comparison of the Proposed Algorithm Against the State of the Art

if z(·) is absolutely continuous on [t_0, t_1] and for almost all t ∈ [t_0, t_1]