Design of a Distributed Switching Model Predictive Control for Quadrotor UAVs Aggregation

This letter proposes a novel distributed model predictive control (MPC) strategy to address the swarm aggregation of a team of quadrotor unmanned aerial vehicles (UAVs). First, a switched formulation of the quadrotor model is derived by mapping the UAVs dynamics into a set of finite motion modes. Then, relying on a suitably selected control Lyapunov function (CLF), the inter-agent collisions and the aggregation task are taken into account to design a switching MPC (SMPC) strategy. A clustering method is also introduced to define the communication network among the agents, which is essential to sequentially solve the optimal control problem. Finally, the efficacy of the proposal, also in comparison with other methodologies, is satisfactorily shown in simulation.


I. INTRODUCTION
I N RECENT years, thanks to the non-stop advances in communication, sensing and processing technology, the application domain of quadrotor unmanned aerial vehicles (UAVs) has grown in several fields from military purposes [1], to agriculture [2], to racing [3], among many others. Their main attractiveness relies on autonomously performing activities in scenarios where human integrity might be compromised. However, and despite the enormous research on single UAV flight control, the use of multiple UAVs in an organized swarm has demonstrated to improve the single performance of the agents as well as the efficiency of the entire group.
In the literature, many works were devoted to address the design of controllers for UAVs swarms. In [4], for instance, a multicopter geometrically constrained trajectory planning is investigated. Among decentralized solutions, a hybrid control algorithm to achieve swarm formation is proposed in [5], while in [6] a decentralized model predictive control (MPC), defined by inter-agent bearings to avoid collisions, is presented. Among results with distributed topology, instead, a consensus based control strategy for swarms of UAVs under a timevarying topology for flight formation, swarm tracking, and social foraging is presented in [7]. A distributed trajectory optimization algorithm for safe multi-agent trajectory planning is proposed in [8], while an observer-based optimal consensus controller for a multi-UAVs system, which integrates linear quadratic regulator technique and linear matrix inequalities, is introduced in [9]. In the stochastic framework, [10] discusses a chance-constraint formulation to account for position and velocity uncertainty, while achieving obstacles avoidance.
Having in mind multi-UAVs aggregation, localization requires continuous sensor information. However, computational and communication limitations make some simplifications mandatory to reduce the curse of dimensionality. Spurred by these motivations, this letter takes inspiration from [11], [12], where a differential wheeled robot dynamics is constrained to only two motion modes, and from [13], which proposes an order-reduction of a remote controlled car model, recasting it as a hybrid system. Indeed, all these works address the previously mentioned problems, which arise together when a swarm of agents with high order dynamics is studied, e.g., an UAVs swarm, as in this letter. The main purpose of this letter is to overcome these challenges by proposing a novel optimization-based strategy, exploiting switched systems, in order to solve a swarm aggregation problem. Indeed, aggregation processes are very common, e.g., in biological systems, and they often are a needed prerequisite for many collective systems for accomplishing other cooperative tasks.
Specifically, this letter proposes an original swarm aggregation strategy, that combines the advantages of switched systems with those of distributed MPC. Differently from [11], [12], where only two switching planar modes are considered, the proposed aggregation strategy is extended to the tridimensional case. Moreover, while [12] presents a control Lyapunov function (CLF) based approach, in this letter, starting from a suitable CLF, a novel distributed switching MPC (SMPC) is proposed, capable of dealing with an arbitrary number of modes capturing the UAVs motion, and employing a clustering method to define the data exchange among agents. This allows achieving both optimal control performance and computational simplifications for the aggregation algorithm, which has to face the issue that it needs to scale well with the swarm population, as finally assessed in simulation on a large-scale case study.

II. MODELLING AND PROBLEM FORMULATION
This section introduces the considered quadrotor UAV model and the problem formulation.

A. Quadrotor UAV Model
Here, we focus on a general aggregation problem in which the UAVs could collide among each other. Since in practice, in this case, it is convenient to provide velocity references to the internal control loops, the differential kinematic model of the UAVs is hereafter considered. Moreover, we assume that a local replanner is used to allow smoothness in velocity and acceleration (see, e.g., [14]) even in the case of switching policies, with local controllers capable of perfectly tracking the desired velocities.
Consider Fig. 1, and let [p x p y p z φ θ ψ] be the vector containing the linear and angular position of the quadrotor in the world frame O W −X W Y W Z W , and [v u v v v w w p w q w r ] be the vector containing the linear and angular velocities in the body frame O B − X B Y B Z B . In the following, the dependence of all the variables on time will be omitted when obvious.
Relying on Euler equations for the spatial motion of a rigid body, the relation between the body and the world frame is where with c(·) := cos(·), and s(·) := sin(·). Now, by posing the quadrotor pose and orientation in the state vector x := [p x p y p z φ θ ψ] ∈ R 6 , and the velocity vector in the input vector u := [v B w B ] ∈ R 6 , then the differential kinematic model (1) can be rewritten aṡ (2)

B. The Switched Quadrotor Model
Motivated by the need to reduce the curse of dimensionality when dealing with multi-quadrotor systems, we model the quadrotor dynamics (2) as a switched system, switching among a finite set of operating modes. Let m be the number of predefined motion modes such that σ ∈ M := {1, . . . , m} is the so-called switching signal. The latter defines the current motion mode so that u(σ ) ∈ U sw can be modelled as a switching input, with U sw containing the following vectors wherev andw are predefined longitudinal and rotational velocities, and e σ ∈ R 6 is the canonical unit vector. Then, the switched time-invariant system corresponding to (2) iṡ where

Remark 1 (Modes Selection):
Note that the modes selection is not trivial. Indeed, the set of vector fields Q must include at least one non-zero longitudinal velocity component in at least one of its elements, since pure angular velocities in all modes do not contribute to any displacement in the tridimensional space.
For design purposes, the set of vector fields must also include a zero vector, meaning that the quadrotor is capable of keeping still. In this letter, without loss of generalization, we assume m = 13 motion modes, with 6 velocity vectors having only one positive non-zero component and zero for the others, 6 velocity vectors defined as the previous ones but with negative values only, and one fully zero vector. This modes selection, based on a specific motion policy and according to Remark 1, allows the UAVs to move in any possible direction of the world frame.

C. Multi-UAVs Switched System and Problem Statement
Having in mind a team of quadrotors, consider now a system composed of n agents, and let N := {i ∈ [1, . . . , n]} be the set comprising all the agent indexes. In the following, the apex [i] will be used to indicate the ith subsystem. The multi-quadrotor system model is obtained by redefining the vectors x and u as x := [x [1] , . . . , x [n] ] and u := [u [1] , . . . , u [n] ] , respectively. Analogously, a switching string := {σ [1] , . . . , σ [n] } ∈ S := M n is introduced such thaṫ where u( ) ∈ V := U n sw , and f ∈ Q n := {f 1 , . . . , f m n }. Therefore, given the switched model (5) constrained to m n motion modes, the control problem to solve consists of designing a control strategy capable of aggregating the quadrotors, while avoiding collisions among them, and taking into account the switching nature of the system dynamics.

III. CONTROL LYAPUNOV FUNCTION BASELINE
To solve the problem formulated in Section II, let us introduce the design of a CLF for the switched system (5). To streamline the exposition, we refer to [15], [16] for further details on CLFs.
Having in mind an aggregation objective for UAVs with rigid body properties, it is natural to consider for each agent the pose dynamics (1a) and the reciprocal Euclidean distance between the ith and jth centroids, i.e., In order to take into account the space occupancy of UAVs, letd ∈ R be the reference between two agents, and d ∈ R a distance lower bound. Note that d is instrumental to manage the collision avoidance among agents, and such a parameter is selected relying ond, the maximum longitudinal velocityv and, considering practical implementation, also on the sampling time T, such that < d <d − 2vT, with being the length of the UAV's frame. Then, posingd ij := d ij −d, andd as the vector containing all the distance errors, consider the following function with γ ij being a weight defined as The weight (7) is a continuous function of d ij , which is aimed at avoiding that the UAVs move away fromd. As customary in the theory of switching control [17], the Lyapunov function induced by the argmin-based switching control action is continuous but only piecewise differentiable so that the notion of the Dini derivative should be used. Remark 2 (Switching Law): Note that, in the considered CLF baseline, making the zero mode possible only when the agents aggregate, the switching law can be designed as the arg min(D + V), where D + is the Dini derivative. In the case of two or more equal minima, one could select the first one in order. Typically, this kind of strategy leads to a sliding trajectory around the discontinuity points of the CLF. In our case, depending on the selected switching inputs as in (3), it is possible to show that such a derivative is negative, and zero only when the agents are aggregated.
By virtue of the decreasing property of the selected CLF, it can be employed as value function for establishing stability of a finite horizon optimal control problem in discrete timedomain.

IV. THE PROPOSED DISTRIBUTED SWITCHING MODEL PREDICTIVE CONTROL
Motivated by the reasoning of the previous section, we present now an alternative aggregation approach based on SMPC. This approach keeps the spirit of the CLF based law, and, by virtue of the guarantees mentioned above, it exploits the CLF in (6) as cost function to be minimized subject to the switched dynamics in (5).
The SMPC is designed to be solved at each time instant k ∈ N 0 , so that by introducing the sampling time T > 0, and discretizing the UAVs dynamics by using forward Euler method, one has (8) with f d k belonging to the set of vector fields Q d n := {f d 1 , . . . , f d m n }. Letting N ≥ 1, the prediction horizon is instead defined as T k := {k, . . . , k + N − 1}, while the variable t is used to span along the prediction horizon, i.e., t ∈ T k . The cost function of the SMPC, relying on the function designed in (6), is chosen as with the weight γ ij t as in (7). Therefore, letting k := wherex k expresses the measured state at each k ∈ N 0 . Moreover, in (10), the set of the state constraint is in order to avoid collisions among the quadrotors, and limit the angle variation according to positive thresholdsφ,θ ,ψ. It is worth highlighting that, due to the combinatorial nature of the strategy, the formulated SMPC problem (10) can be computationally demanding. More specifically, the computational complexity is indeed given by the number of evaluations of the nonlinear dynamics f ( ) that have to be performed for computing the optimal sequence u * The number of possible control sequences belonging to the set W is given by the cardinality |W| = m N , so that, for n UAVs, the whole complexity is n · m N . As a consequence, some simplifications are introduced as follows.
1) Switching Policies: In order to alleviate the computational burden required to solve the optimal control problem (10), the set of possible sequences W is reduced by introducing policies in order to determine the permitted switching between the considered modes. For example, one can impose that only one agent per time can modify its dynamics, or, depending on the task's constraints, some sequence can be discarded (e.g., employing only the heading, the forward and the vertical motions in the case of a non-holonomic way of navigation). This implies the selection of the optimal input sequence be within a subset of feasible sequences, i.e., 2) Clustering: Despite the introduction of switching policies, the number of agents is still a critical parameter that compromises real-time computations and could possibly affect fault-tolerance, organizational complexity and maintenance problems. Therefore, a distributed control method is a valid solution to further reduce the complexity, locally solving the optimal control problem inside subsystems that share limited resources over a constrained communication network. Making reference to the distributed solution in [18, Sec. 5.2.1], a clustering approach is hereafter introduced.
Let C [i] k be the set defined by the clustering method as where r cl is a predefined radius of the sphere cluster attached to the ith robot centroid. Consider now Fig. 2, where each quadrotor locally generates the optimal input sequence u * The ith quadrotor receives at each sampling time the state feedback set and the predicted input set in order to compute, according to the receding horizon principle, the input u [i] k in a decreasing sequential order, according to a predefined indexing criterion. This implies that the jth SMPC, j = n, . . . , 2, retrieves no predicted input set from subsystem κ = j − 1, . . . , 1. To solve this issue, the optimization problem feasibility is guaranteed by letting the jth SMPC, j = n, . . . , 2, assume specific trajectories for the predicted input subset U [κ] k , κ = j−1, . . . , 1 (for instance the quadrotors are kept still along the prediction horizon).
3) Distributed SMPC: For each quadrotor the SMPC problem is finally formulated by defining the cost as

Remark 3 (Sub-optimality):
The distributed SMPC solution depends on the predefined indexing criterion. Then, other suboptimal solutions can be found by reassigning the indexes, solving an optimization problem aimed at searching for the index criterion which generates the smallest cost.
Remark 4 (Scalability): It is worth highlighting that, by virtue of the clustering method and the collision avoidance mechanism, which imply a maximum number of agents belonging to the same subgroup, if the total number of quadrotors increased, this would not correspondingly affect the computational complexity to solve (16).

V. CASE-STUDY
In this section, the proposed distributed SMPC is assessed in simulation relying on different sizes of the UAVs swarm.

A. Settings
The simulated test benchmark consists of different scenarios to evaluate the performance of the proposed algorithm on UAVs swarms with a various population size, ranging from n = 10 to 150 agents. The quadrotors have a random initial distribution over a tridimensional grid in a 6 × 6 × 6 m cube space and are characterized by the parameters listed in Table I. Finally, some performance metrics are introduced.
Let a cluster of UAVs be the maximal connected subgraph of the graph defined by the UAVs positions, where two agents are considered to be adjacent if another one cannot fit in between them, that is if d ij < 2d [11]. Therefore, the first index is given by the ratio between the number of quadrotors within a cluster, as defined above, namely n C , and the total number of agents n, that is with full aggregation corresponding to μ * a = 1. The second index is the packing density ratio between the sum of the sphere volumes of diameterd for each agent, and the volume of the smallest sphere encapsulating the previous ones, i.e.,  where b is the barycentre of the network. Note that the sphere packing problem has been widely studied in the literature, see, e.g., [19], and optimal density values for systems up to 200 equal spheres are listed in [20]. Then, the percentage error between the density ratio μ p and its optimal value given by μ * p is computed as Also the computation time to solve the SMPC, i.e., μ t c , and the objective cost value, i.e., μ J , are adopted. The average values of the performance metrics, denoted asμ a,p,δ,t c ,J , are computed over 25 simulations of 100 s, and with T = 0.01 s.

B. Results on the 20-UAVs Scenario
First, an illustrative example with n = 20 UAVs is shown. Fig. 3 illustrates the time evolution of the average performance metricsμ a in (18) andμ p in (19) (blue lines), respectively. Moreover, the evolutions of their maximum and minimum values (red lines), the maximum aggregation ratio μ * a = 1, the optimal packing density ratio μ * p,20 = 0.477 (green lines), and the median simulation (black line), indicated by the apex (13) , are illustrated. Relying on the median simulation, Fig. 4 shows the trajectories of the quadrotors, where the initial and final positions are represented by black circles and blue stars, respectively. It is worth highlighting that the swarm aggregation occurs in the vicinity of the origin, while, as expected, the collision among the quadrotors is avoided.
Finally, note that the proposed distributed SMPC has been tested on a laptop with 12th Gen Intel Core i7-12700H processor, achieving an average computational time of 3.8 × 10 −4 s to solve (16).
Comparison and Discussion: In order to further assess the proposed distributed SMPC (briefly, dSMPC), the results are   hereafter compared with those achieved by applying a centralized SMPC (briefly, cSMPC), and a distributed version of the CLF approach (briefly, dCLF) as baseline. The outcome of the simulations is reported in Table II. Figs. 5 and 6 show the time evolution of the average performance metricsμ a andμ p for the cSMPC and the dCLF strategies, respectively, where the transparent lines are the corresponding signals in the case of dSMPC. The results highlight that the three strategies are finally (at t = 100 s) capable of fully aggregating the system while fulfilling the collision avoidance objective. As expected, the cSMPC performs better in terms of packing with respect to dSMPC and dCLF. The dSMPC performance is however better than the one achieved with the dCLF approach. These results are also coherent with the values of the objective costμ J . Nevertheless, a higher average computational time is evident (about 4 orders of magnitude) for the cSMPC approach, while the dSMPC takes only almost twice as long as the time taken by the dCLF, which is the fastest method since no predictions occur.

C. Results on Larger-Scale UAVs Scenarios
In order to assess the scalability of the proposal in more complex scenarios, different population sizes of the UAVs swarm, ranging from n = 10 to 150 UAVs, are now considered. Specifically, 25 simulations were performed for each value n, and the time taken for solving the distributed SMPC problem was recorded. Fig. 7 shows a box plot of these times, and it is worth highlighting that, despite the increasing number of UAVs, the solution to (16) is still feasible with an average computational time of 8.3633 × 10 −4 s. Clearly, as expected, the computational time, depending on the clustering mechanism, slightly rises with the higher number of UAVs (see Remark 4), but, even with the most demanding setting (n = 150), the average value for the proposal is below T = 0.01 s (i.e., 1.2 × 10 −3 s).

VI. CONCLUSION
This letter has proposed a distributed SMPC strategy to solve a rendezvous problem for a scalable UAVs swarm. The proposed approach is based on a switched formulation of the UAV model suitably quantizing the velocity inputs. Under the assumption that the UAVs location is available, a clustering method is first defined, and a sequential distributed MPC, with collision avoidance properties, is then solved. Future works will be devoted to the application on a real setup, and the extension to general trajectory optimization problems and more complex scenarios, e.g., in presence of disturbances, delays or fixed and moving obstacles. The use of alternative collision avoidance constraints (for instance reformulating them in linear terms, see, e.g., [21]) is also of interest.