Communication-Free MPC-Based Neighbors Trajectory Prediction for Distributed Multi-UAV Motion Planning

In an environment with multiple static obstacles, UAVs usually communicate with each other to avoid collisions during trajectory planning. However, such communication may become infeasible or unreliable due to interference or jam in practice. This paper introduces a neighbors trajectory prediction algorithm based on model predictive control (MPC), which enables each UAV to predict the motion behavior of its neighbors without communication. By solving the MPC model of its neighbors, an UAV can predict their trajectories and then avoid collision with them in the future. To prove the practicability, we integrate the proposed algorithm into distributed model predictive control (DMPC) framework to realize multi-UAV trajectory planning without communication and with static obstacles. The performance of our method is verified by simulation experiments in two scenes.


I. INTRODUCTION
Autonomous navigation and trajectory generation of multi-UAV is very important and widely used in many fields, such as disaster rescue [1], warehouse management [2], multi-view photography [3], smart agriculture [4], and so on. In order to predict the collision with other individuals, UAVs need to communicate their planned future trajectory. In actual scenes, communication jam or interference may occur, which will cause the interaction between UAVs to become unavailable or unreliable. Therefore, it is particularly important for multi-UAV to fly safely without communication.
For the safe and reliable trajectory planning of multi-UAV, there are some traditional distributed approaches: artificial potential field (APF) [5], [6], optimal reciprocal collision avoidance (ORCA) [7], force-based motion planning (FMP) [8], some methods based on swarm intelligence [9], The associate editor coordinating the review of this manuscript and approving it for publication was Emre Koyuncu . and so on. These methods have their own advantages, but they also have some shortcomings, such as only calculating the state of the next instant. In recent years, reinforcement learning (RL) has been applied to multi-UAV trajectory planning [10]- [12]. UAVs optimize their learning strategies according to the rewards they get from performing each action. However, a bad reward function will make these algorithms difficult to train effectively. In distributed model predictive control (DMPC) [13], each UAV solves an optimization problem in a receding horizon manner to obtain its future trajectory [14]. It has the ability to predict the states of a long period of time in the future. For an individual in multi-UAV, when solving a local optimization problem with collision avoidance constraints, it needs to know the future trajectory of its neighbors. However, when the communication between UAVs is not feasible, this kind of states interaction will face difficulties.
Predicting the motion behavior of other individuals is a good way to avoid collision without communication. Trajectory prediction has been researched through multiple VOLUME 10, 2022 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ methods. In [15], a social forces model has been proposed to describe pedestrian motions by simulating attractive and repulsive potentials. A method based on hierarchical reasoning game theory has been proposed to simulate driver and vehicle interactions in traffic under given conditions [16]. In addition, there are methods such as Bayesian formulation [17], interactive Gaussian Processes (IGP) [18] and Linear Trajectory Avoidance (LTA) [19], etc. These methods have their own characteristics, but still have some problems, such as low prediction accuracy or high calculation cost. In recent years, many methods based on deep learning have been used in trajectory prediction. Literature [20] has introduced a model based on long-short term memory networks (LSTM), which incorporated both static obstacles and surrounding pedestrians for trajectory forecasting. Inspired by [20], an interaction-aware trajectory prediction model based on recurrent neural networks (RNN) has been proposed [21]. In [22], a generative adversarial networks (GANs) encoder-decoder framework has been developed for capturing the multi-modality of the future trajectory prediction. The prediction accuracy of learning-based methods is high, but most methods need a large number of samples to train the model. This limits the application of learning-based methods in the field of trajectory prediction.
In this paper, we propose an algorithm that enables each UAV to predict the future trajectories of its neighbors without communication, and then combine it with DMPC framework to complete the trajectory planning task of multi-UAV. In order to get closer to the actual scene, we also consider the collision avoidance between UAVs and multiple static obstacles, which makes a great increase in the difficulty of neighbors trajectory prediction. Fig.1 gives an overview of the proposed method. Firstly, each UAV obtains the current states of its neighbors through sensors (such as depth cameras and optical flow sensors, etc). Then model predictive control (MPC) is used to establish a receding horizon optimization problem for each neighbor. By solving these optimization problems, an UAV can get the future trajectories of all its neighbors. Finally, the proposed algorithm is integrated into DMPC framework to obtain the future state sequence of each UAV. For an optimization problem, whether it is established by an UAV to predict its neighbor trajectory or its own trajectory, we transform it into a quadratic programming problem. Considering the collision avoidance constraints for neighbors and static obstacles in this problem, we use the predicted first collision to establish soft constraints, thereby reducing the complexity of the model and the difficulty of solving it. We implement the offline sequential form of the above method. In theory, it can also be applied in a distributed form. Through simulation experiments in two specific scenarios, the feasibility and effectiveness of our proposed method are verified. The major contributions of this paper are summarized as follows: (1) A neighbors trajectory prediction algorithm based on MPC is proposed to overcome the absence of communication; The overall framework of the proposed method. It mainly includes neighbors trajectory prediction and trajectory planning based on MPC. sen i _UAV , sen i _obs respectively represent other UAVs and obstacles in the sensing area detected by n i . S i ,s [k + 1 : k + K |k] is defined as the state sequence of a neighbor of n i at K times in the future. S i [k + 1 : k + K |k] and a i [k : k + K − 1|k] are denoted as the state and control sequence of n i at K times, respectively. S i [k + 1] is the next state of n i . After trajectory planning, n i calculates whether its position at the next time reaches the target. If not, proceed to the next round.
(2) The neighbors trajectory prediction algorithm is integrated into DMPC framework to realize the trajectory planning of multi-UAV; (3) We conducted simulation experiments of the proposed algorithm in two scenes, and the results show that the proposed method has good performance.
The rest of this paper is organized as follows: Section II describes the preliminaries. Section III details the proposed method. The effectiveness of the proposed method is verified by simulation experiments in Section IV. Section V gives the conclusion.

A. UAV MODEL
The multi-UAV set composed of N homogeneous UAVs is defined as N = {n 1 , . . . , n i , . . . , n N }, where n i represents the ith UAV. The downwash air generated by a quadrotor in flight will affect the safety of other UAVs under it. Similar to the literature [23], we model each UAV as an ellipsoid elongated along the vertical axis. An UAV at position p ∈ R 3 can be represented by a set of points as follows: where Q = diag(1, 1, r c ), r c > 1, and r is the radius of the UAV in the XY plane. Each UAV is assumed to satisfy double integral dynamics. Let p i ∈ R 3 , v i ∈ R 3 , a i ∈ R 3 denote the position, velocity, and acceleration of the UAV n i respectively. The dynamic equation of n i is where k is a discrete time point and d is a discrete time step. 6 and u i = a i ∈ R 3 represent the state quantity and control quantity of UAV n i respectively. Then (1) can be written as where , 0 ∈ R 3×3 denote the 3×3 identity matrix and 3×3 zero matrix respectively.

B. MUTIL-UAV COLLISION AVOIDANCE
For an insight on collision avoidance approaches of UAVs, the reader is referred to [24]. If UAV n i wants to avoid collision with n j at the time point k, the following collision avoidance constraint should be satisfied: where p i [k] and p j [k] represent the positions of n i and n j at time point k, respectively. 2r u is the minimum distance between two UAVs on the XY plane, where r u is slightly larger than the radius of one UAV.
If n i wants to avoid collision with O m at the time point k, it needs to satisfy the following collision avoidance constraint: Regarding the acquisition of p i,m [k], it is assumed that it can be obtained by the sensors of n i . Or according to the literature [25], if the obstacle O m has a hyperplane boundary, and the boundary has a unit normal a m ∈ R 3 and passes through the point y m ∈ R 3 , then the point on O m that is closest to n i can be determined by where P = I − a m a T m . If the obstacle O m can be simplified as a closed sphere with radius r m centered at y m ∈ R 3 , then the point on O m that is closest to n i can be determined by

C. MODEL PREDICTIVE CONTROL 1) THE PREDICTION MODEL OF UAV
According to the principle of DMPC [13], [26], the flight time of UAVs is discretized. Assuming that the current time point is k, each UAV wants to predict its states at K future time points. Let The following formula can be deduced from (3): denote the position and control sequence of UAV n i at K time points in the future, respectively.
is the initial state of instant k, then an affine function from U i to P i can be obtained through the recurrence relation in (9): Let (10) is defined as The matrix ∈ R 3K ×3K in (10) is defined as Similarly, we can get the affine function from U i to V i : where A 0 and Φ have the same matrix form as A 0 and Φ, except that the matrix C = 0 I ∈ R 3×6 instead of C is brought into them.

2) OBJECTIVE FUNCTION
The objective function for each UAV needs to balance the following three aspects: (1) The control effort. (2) The change range of control quantity between two instants. (3) The distance between the planned position point and target point. It is established as follows: where p i [k +K |k] is the last position in n i 's position sequence of future K time points and p fi is n i 's target point. According to (10) and U i , the objective function can be transformed into a quadratic form: ×3K are all positive definite and block-diagonal matrices, they are the weights of the three penalties. R 1 , R 2 , R 3 ∈ R 3×3 are all positive definite diagonal matrices. The matrix Γ ∈ R 3K ×3K is defined as follows: the matrix U i ∈ R 3K is defined as follows: where u i [k − 1] T represents the control quantity of UAV n i at the last time of current time k, and 0 T

3) CONSTRAINTS AND MODEL
Due to the physical characteristics, space limitations, and collision avoidance requirements of UAV, the state and control constraints of UAV n i are as follows: where U min , U max , P min , P max , V min , V max ∈ R 3K are the boundary values of control, position and velocity, respectively. After the above constraints are processed by using (10), (13) and linearization methods, the model of n i can be established: Generally, there are two ways for UAV n i to obtain the future trajectories of other UAVs (p j [k + t|k] in (21)). One is through communication. UAVs communicate their planned future trajectories to others at each time step, but when the number of UAVs is large, this method consumes a lot. In addition, communication is not always available and reliable in practice. Another method is n i to treat other UAVs as moving at a constant velocity and calculate their future trajectories according to their current state. However, this method can be inaccurate and may lead to unsafe trajectory planning [27]. In this paper, we propose an MPC-based algorithm for each UAV to predict its neighbors' future trajectories without communication.

III. METHOD
In this section, we introduce in detail the proposed neighbors trajectory prediction algorithm and how to integrate it into DMPC framework to achieve the trajectory planning of multi-UAV.

A. PROBLEM FORMULATION
The goal of our works is to generate collision-free trajectories of multi-UAV from their initial locations to their target locations under an environment with static obstacles and without communication. In this situation, UAVs can not obtain the future trajectories of other individuals through information interaction. We propose an algorithm in which an UAV predicts the future state sequences of neighbors and judges whether collisions will occur.
Before describing our proposed algorithm, the following reasonable assumptions are made: (1) Each UAV has a sensing area. Assuming that the sensing area is a sphere with radius d s , each UAV can obtain the states (positions and velocities) of other individuals in its sensing area by its own sensors; (2) Each UAV has an inter-UAV safe area and an UAV-obstacle safe area. They are spheres with radii of d u and d o respectively, and d u , Only when other individuals or obstacles are in its safe area, one UAV will consider collision avoidance; (3) Due to the homogeneity of UAVs, they have the same size of sensing area, inter-UAV safe area, and UAV-obstacle safe area.
Based on the above assumptions, the UAVs' sensing area and safe areas are shown in Fig. 2 (The mappings of 3D areas on 2D).

B. NEIGHBORS TRAJECTORY PREDICTION
For an UAV n i , the set of neighbors it senses in its safe area at current instant is defined as . For any neighbor ne s i ∈ Θ i (1 ≤ s ≤ g), n i uses quadratic programming to solve an MPC problem about ne s i to obtain its future state sequence.
The collisions ne s i may encounter are considered. According to the second assumption in Section III-A, the sensing radius of UAV is much larger than the safe radius, so it can be counted that ne s i 's neighbors set . . , ne l i,s } can be sensed by n i . We regard ne s i 's neighbors as individuals moving at a constant velocity. If at the time point k + t 1 (1 ≤ t 1 ≤ K − 1) that after the current instant k, there exists a set Θ k+t 1 i,s ⊆ Θ i,s , which satisfies the following constraint for any ne q i,s ∈ Θ k+t 1 i,s : we call ne s i collides with its neighbors at k + t 1 . The obstacles set sensed by n i at instant k is represented as i , which satisfies the following we call ne s i collides with obstacles at k + t 2 . p i,s,m [k + t 2 ] is the point closest to ne s i on obstacle O m i,s at instant k + t 2 . In (25) and (26) According to the description above, n i can get the instants when ne s i collides with other UAVs or static obstacles. Similar to the literature [14], in order to simplify the model, we select the instant when ne s i first collides to constrain its state, this instant is assumed as k + t 0 . In addition, we adopt the soft constraint to make the planning problem more solvable. The future collisions of ne s i can be divided into the following three categories: (1) The UAV ne s i only collides with other UAVs at instant k + t 0 . This occurs in three cases: 1) there are no obstacles in the safe area of ne s i ; 2) there are obstacles in the safe area of ne s i , but there will be no collision in the K − 1 future time points; 3) the collision time between ne s i and obstacles is after k + t 0 . Assuming that the set of UAVs that collide with ne s i at k + t 0 is Θ k+t 0 i,s ⊆ Θ i,s , then for any ne q i,s ∈ Θ k+t 0 i,s , ne s i needs to satisfy the following constraint: where ε (1) i,s,q > 0 is a new variable to relax the constraint.
In order to linearize (27), Taylor expansion is used at p i,s [k +t 0 ], where p i,s [k +t 0 ] is the position of ne s i at the future instant k + t 0 , which calculated by n i in advance. This is one of the methods for linearizing collision avoidance constraints mentioned in Section II-C 3). The result is as follows: where ω i,s , ne s i needs to satisfy the following constraint: where ω The UAV ne s i collides with both other UAVs and static obstacles at instant k + t 0 . In this case, the above two constraints (27) and (29) ne s i need to be satisfied. We unify the forms of collision avoidance constraints in the above three cases. ε i,s, * , ω i,s, * , ρ i,s, * , η i,s, * are defined as general references to the corresponding symbols in (28) and (30). According to the actual situation, choose one type of specific symbol (corresponding to the above case (1) or (2)), or the combination of the two types of symbols (corresponding to the case (3)). (28) or (30) can be transformed into the following form by (10): where σ i,s, T . It can be seen from the above that the soft constraint is equivalent to adding a relaxation variable ε i,s, * to each collision avoidance constraint of ne s i , which is a new control variable. Let num is denote the number of ε i,s, * . Then when they are other UAVs that collide with ne s i at k + t 0 , num is = dim(Θ (35) represents the combination of num is collision avoidance constraints, i,s ∈ R num is ×3K , P i,s ∈ R num is ×num is and i,s ∈ R num is . We can vertically stack inequality constraints (32)-(35): In order to solve the quadratic programming problem smoothly, instead of restricting the magnitude of each relaxation variable, we add a penalty term in the objective function (15) to punish the larger relaxation variables: where κ > 0 is the penalty term of relaxation variables, I num is ∈ R num is ×num is is denoted as the identity matrix. Finally, the optimization problem of ne s i with collision avoidance constraints can be expressed as By solving (38), n i can get the predicted control sequence of ne s i at K future time points, and then the future state sequence of ne s i can be predicted through the dynamic equation of UAV. The whole process of the proposed neighbors trajectory prediction algorithm is shown in Fig 3. Similar to the above method, n i can calculate its own collision avoidance constraints. Different from the case of ne s i , p i [k + t 0 ] can only be obtained by p i [(k − 1) + (t 0 + 1)|k − 1] in the predicted state sequence of n i at the last time. Similarly, the optimization problem of n i with collision avoidance constraints is as follows: By solving this problem for UAV n i , its control quantity at the current instant k and predicted state sequence of K future time points can be obtained.

C. DETAILED DESCRIPTION OF THE ALGORITHM
Algorithm 1 shows the sequential implementation form of multi-UAV trajectory planning based on DMPC. The input is UAVs' initial positions (p 0 ) and target positions (p f ), the output is their flight trajectories. In the first line of Algorithm 1, is used to store the trajectories of UAVs. UAVs' initial positions are assigned by p 0 , their initial velocities are set to 0. When UAVs do not all reach their target points and the flight time does not reach the specified time, for an UAV n i , we need to judge whether it has reached its target point. If it has, stop updating its trajectory and start to consider the next UAV (lines 5-6). If not, it will conduct a safe check to obtain information about other individuals and static obstacles in its sensing area and safe area. Then, by solving an optimization problem modeled by Algorithm 2, n i gets its own control quantity sequence in the future and the predicted future trajectories of neighbors. The future state sequence of n i can be predicted. The position point of n i at the next time is added to the trajectory sequence of n i in , and the first quantity in n i 's future state sequence is selected as its state quantity at the next time, the first quantity in n i 's control quantity sequence is selected as the control quantity at the current time (lines 7-11, t ∈ {1, 2, . . . , K } in line 8, the same below). The distances between UAVs and their target points are calculated. When all UAVs arrive at their target points or the flight time reaches the specified time, the algorithm ends and returns to the trajectory of each UAV.
Algorithm 2 describes the establishment and solution of an optimization problem based on the proposed neighbors trajectory prediction algorithm for a certain UAV (assumed as n i ). Supposing the current time is k, the input includes the state and control sequence of n i , as well as the information of neighbors and obstacles perceived by n i at the current instant and the neighbors' information calculated by n i at the last instant. The output is the control sequence of n i in the future K instants, the neighbors' control quantities at instant k and state sequences in the future K instants predicted by n i . If there are no other UAVs and static obstacles in the safe area of n i , the control sequence of n i in the future K instants can be obtained by solving a quadratic programming problem without collision avoidance constraints (lines 1-2). If there are other UAVs in n i 's safe area, for each neighbor ne s i , n i if TargetDistCheck(p i (k))< D max 6: continue 7: [ne i _UAV , ne i _obs, sen i _UAV , sen i _obs] ←SafeCheck( , d u , d o , d s ) 8:

Algorithm 2 The Build and Solution of Optimization Model Based on Neighbors Trajectory Prediction
for each neighbor s = 1, 2, . . . , l 5: first determines whether it is the neighbor at the last time, if so, n i calls the future state sequence of ne s i predicted at the last time, otherwise ne s i is regarded as moving at a constant speed. Then, the ne s i is checked to get the information of other individuals or obstacles in its safe area. Determine the time of the first collision of ne s i , an optimization problem with collision avoidance constraints is solved to obtain its future control sequence (if there is no collision, an optimization problem without collision avoidance constraints is solved), and then its future K instants sate sequence is obtained. The first value of ne s i 's control quantity sequence is taken as its control quantity at the current time (lines 3-10). After obtaining the states of all neighbors in K future instants, n i judges the time of the first collision with neighbors or obstacles, and then solves an optimization problem to obtain its own control sequence in the future K times (lines [11][12]. The output is returned and the algorithm ends.

IV. SIMULATIONS
In this section, two scenes are provided to verify the performance of the proposed method. One scene is that multi-UAV fly from one side of static obstacles to the other, and the other scene is that multi-UAV fly from opposite sides of static obstacles. In order to illustrate the feasibility and effectiveness of the proposed method, we compare it with two methods in terms of success probability and computation time. One is the DMPC algorithm that treats neighbors as constant velocity individuals (we call it CVN-DMPC), and the other is the DMPC algorithm with communication.
In each scene, the experiment is repeated 30 times for different numbers of UAVs. We call the planning failure if the following three situations occur: (1) The distance between an UAV and an other individual or a obstacle is less than the set threshold; (2) the optimization problem becomes infeasible; (3) Some UAVs failed to reach their target within the specified time. Otherwise, it is called planning success. For each number of UAVs, we calculate the proportion of planning success times of each method in 30 experiments as the success probability of this method. Computation time of each method is the average time of its successful experiments. These simulations are implemented as a sequence form described in Algorithm 1.
The hardware environment is a PC with an Intel Core i7-10875H CPU with 8 cores and 16GB RAM, running at 2.3GHz. In these experiments, r u = 0.175m, Q = diag (1,1,2), p max = −p min = 400m, v max = −v min = 3.8m/s, u max = −u min = 2.5m/s 2 , d u = 2r u = 0.35m, d o = 5d u = 1.75m and d s = 10m. The discrete time step d = 0.2, the number of discrete time steps predicted each time is K = 15, and the planned flight time is T max = 100s. We assume that when ε 1 i,s, * > 0.07, UAVs will collide with others; when ε 2 i,s, * > 0.035, UAVs will collide with obstacles. We keep the density of the flight space constant, and there are 12 static obstacles with a radius of 2m.

A. SCENE 1: MULTI-UAV FLYING IN THE SAME DIRECTION
In this scene, UAVs randomly take the initial points on one side of the static obstacles, and then fly through the static obstacles to the other side. Their target points are also randomly selected. It is required that UAVs do not communicate with each other during flight and reach their respective target points on the premise of avoiding collision between UAVs and static obstacles. This scene is suitable for applications such as UAVs performances and target tracking.
We test the success probability and computation time of the proposed method, and compare it with CVN-DMPC and DMPC with communication. The results are shown in Fig.4. It can be seen from Fig.4(a) that when the number of UAVs is less than 60, the success probability of the proposed method is above 90%, and it is not very different from the method with communication. With the increase of the number of UAVs, the success probability of both three methods decreased, but the gap between the proposed method  and DMPC with communication is not very large. However, the success probability of CVN-DMPC decreases rapidly. In the proposed method, UAVs need to use MPC to predict the future trajectories of neighbors, this process increases the computation time, which can be seen from Fig.4(b), especially when the number of UAVs is large. However, comparing with the other two methods, the time consumption of the proposed method increases little. Fig.5 shows 42 UAVs passing through obstacles from random initial positions to their target points, which form the word ''hello''. Because the space between the letters of ''hello'' is small, it puts forward higher requirements for UAV collision avoidance, but our method can still successfully complete the planning task.

B. SCENE 2: MULTI-UAV FLYING IN THE OPPOSITE DIRECTION
In this scene, UAVs are divided into two groups. Each group randomly takes the initial points on one side of the obstacles, and then takes the initial points of the opposite UAVs as their target points. They also avoid collision and fly to the target points without communication. This scene is suitable for multiple UAVs flying in a cluttered environment, such as warehouses, workshops, etc.
Similar to scene 1, the success probability and computation time of the three methods are shown in Fig 6. The scene of opposite flight poses a great challenge to the trajectory planning of multi-UAV. This is mainly because when UAVs  fly in the opposite direction in the environment with static obstacles, the time available for UAV response and adjustment becomes shorter. In the absence of communication, UAVs need to predict the future trajectories of their neighbors more accurately. As can be seen from Fig. 6(a), comparing with DMPC method with communication, the success probability of the proposed method is not much different. When the number of UAVs is less than 40, the proposed method can also maintain more than 80% success probability. However, with the increase of the number of UAVs, the success probability of CVN-DMPC quickly dropped below 50%. This shows that our method has better performance than CVN-DMPC in the case of more strict requirements for trajectory prediction accuracy. The computation time of the three methods is shown in Fig. 6(b). Fig. 7 shows 20 UAVs divided into two groups and flying in opposite directions. It can be seen that the UAVs can pass through obstacles and avoid inter-UAV collisions, then reach their respective target points smoothly.

V. CONCLUSION
In this paper, we proposed an MPC-based algorithm for each UAV to predict its neighbors' trajectories when it can't communicate with others. Then the proposed algorithm is integrated into DMPC framework to realize the trajectory planning of multi-UAV in an environment with static obstacles. In the simulation of two scenes, we showed that the proposed method is feasible and effective. Moreover, with only a small increase in computation time, it can achieve a performance close to DMPC with communication and much better than the DMPC algorithm that treats neighbors as constant velocity individuals. In this paper, the proposed method is implemented in a sequential form. In the future work, we will try to implement it in a distributed form and carry out real machine experiments.
ZIJIA NIU received the B.S. degree in information and computing sciences and the M.S. degree in operational research and cybernetics from Yanshan University, Qinhuangdao, China, in 2016 and 2019, respectively. He is currently pursuing the Ph.D. degree in applied mathematics with Beihang University, Beijing, China. His research interests include optimal and distributed control, multiagent trajectory planning, and swarm intelligence.
XIAOHU JIA received the B.S. degree in applied statistics from the Nanjing University of Science and Technology, in 2020. He is currently pursuing the M.S. degree in applied mathematics with Beihang University, Beijing, China. His current research interests include optimal control, reinforcement learning, and game theory.
WANG YAO received the B.S. and Ph.D. degrees in mathematics from Beihang University, Beijing, China, in 2012 and 2017, respectively. He is currently a Lecturer with Beihang University. His research interests include multi-agent systems, optimization, and game theory. VOLUME 10, 2022