Optimum UAV Trajectory Design for Data Harvesting From Distributed Nodes

This paper designs energy-efficient trajectories for unmanned aerial vehicles (UAVs) harvesting data sequentially from distributed ground nodes. We propose a novel optimization framework for path planning, based on dynamic programming. We develop an optimum backward-forward algorithm that jointly optimizes the hovering locations for each ground node, and the visiting order to those locations. Our algorithm minimizes the total energy consumption of the UAV over its trajectory. Our framework is compatible with various probabilistic wireless communication channel models, and can also be applied to different cost functions, including minimising the total flying time, and allowing for bi-directional communications. We also develop a lower complexity algorithm that approximates the optimum UAV trajectory by decomposing the original problem into two sub-problems, and iterating back and forth between the two. This alternating algorithm has polynomial time complexity, and we show that it produces a near-optimum UAV trajectory, with as little deviation as 5% to 15% from the average energy consumption of the optimum algorithm.


Optimum UAV Trajectory Design for Data
Harvesting From Distributed Nodes Dhanushka Kudathanthirige , Member, IEEE, Hazer Inaltekin , Member, IEEE, Stephen V. Hanly , Fellow, IEEE, and Iain B. Collings , Fellow, IEEE Abstract-This paper designs energy-efficient trajectories for unmanned aerial vehicles (UAVs) harvesting data sequentially from distributed ground nodes.We propose a novel optimization framework for path planning, based on dynamic programming.We develop an optimum backward-forward algorithm that jointly optimizes the hovering locations for each ground node, and the visiting order to those locations.Our algorithm minimizes the total energy consumption of the UAV over its trajectory.Our framework is compatible with various probabilistic wireless communication channel models, and can also be applied to different cost functions, including minimising the total flying time, and allowing for bi-directional communications.We also develop a lower complexity algorithm that approximates the optimum UAV trajectory by decomposing the original problem into two sub-problems, and iterating back and forth between the two.This alternating algorithm has polynomial time complexity, and we show that it produces a near-optimum UAV trajectory, with as little deviation as 5% to 15% from the average energy consumption of the optimum algorithm.
Index Terms-UAV communication, data harvesting, dynamic programming, optimum trajectory design.

I. INTRODUCTION
U NMANNED aerial vehicles (UAVs) equipped with mobile base stations hold great potential to support a broad range of applications, including data harvesting, remote equipment operation, emergency response and public safety communication [1], [2].UAVs are particularly useful in scenarios where ground users are isolated from existing communication infrastructure, either due to network failure or for security reasons.Examples include sensor data collection from nodes deployed across large agricultural farms, temporary sensor networks deployed in an urban area for environment/traffic monitoring, or ground sensors placed by soldiers in a built-up area to gather information from an urban war zone.UAVs can efficiently perform these data-gathering and dissemination tasks by following a precomputed trajectory between the ground nodes (GNs).The authors are with the School of Engineering, Macquarie University, Sydney, NSW 2109, Australia (e-mail: dhanushka.kudathanthirige@mq.edu.au; hazer.inaltekin@mq.edu.au;stephen.hanly@mq.edu.au;iain.collings@mq.edu.au).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCOMM.2023.3323513.
Digital Object Identifier 10.1109/TCOMM.2023.3323513In this paper, we develop a novel analytical framework to design optimum UAV trajectories for efficient data harvesting from GNs, while minimizing total energy consumption.We jointly optimize the hovering locations for each GN, and the visiting order to those locations.
For the distributed data harvesting applications we are considering in this paper, energy consumption is the most important cost function to ensure mission success over widearea node deployments.In [4], [12], [15], [22], [23], and [24] the total energy consumption was considered for a fly-andhover operating protocol, in which the UAV flies between sensor terminals, where it hovers and communicates for a period of time, before then flying to the next hovering location.Fly-and-hover has also been considered in other non-energy-minimising schemes [13], [14].In contrast to a communicate-while-flying operating protocol, the fly-andhover protocol does not require frequent adaptive channel estimation, with its associated training-symbol communication overhead, or require Doppler compensation.Under these flight conditions, the trajectory design has two main challenges: hovering location optimization and the visiting order.Typically, these two problems have been treated separately since the exact computation of optimum UAV trajectories is NP-hard with exponential complexity in the number of ground nodes.In [14], [15], [22], and [23] the visiting order problem was treated as a classical travelling salesman problem (TSP), and in [13], [15], [22], and [24], a successive convex approximation technique was invoked to find hovering locations for the given visiting order.
In this paper we take a dynamic programming (DP) approach to design minimum energy trajectories for the flyand-hover operating protocol.DP has previously been applied to UAV trajectory design, with the communicate-whileflying operating protocol and for non-energy-minimisation cost functions [7], [8], [31].In those scenarios, there is no defined node-visiting order.In [7] and [8] the DP was solved over discretized time segments, and in [31] the DP was solved over discretised space for an application problem quite different to the data harvesting problem we are considering here.In [23], DP was used to find the speed of a UAV that minimizes the energy consumption for fly-and-hover, while satisfying latency constraints.In that case the UAV trajectory was assumed to be known.
In this paper we derive the optimal trajectory solution for energy minimisation, when jointly considering both the hovering location problem and the visiting order problem, for UAV fly-and-hover data harvesting.This is the first algorithm to jointly optimize hovering locations and the visiting order.We utilize a backward algorithm based on dynamic programming to determine optimum UAV control policies, and a forward algorithm that outputs optimum hovering locations and a visiting order via a sequential table lookup procedure, given any starting point of the UAV trajectory.Our framework is highly adaptable and compatible with various probabilistic wireless channel models (ie.we do not need to make the average channel gain assumption discussed above) since it does not rely on a convex structure in the objective function for UAV energy consumption.
Our main contributions in this paper can be summarized as follows.
1) We develop a novel DP-based UAV trajectory optimization framework that jointly optimizes the hovering locations and visiting orders to find optimum UAV trajectories.We derive a backward-forward algorithm, that produces the optimum trajectories, and show it has complexity O (M N ) 2 K2 K , where M N is the size of the grid over which we implement the algorithm, and K is the number of GNs.The exponential complexity is unavoidable in the optimum algorithm since it solves a version of the TSP.Nevertheless, our algorithm has massive computational gain over exhaustive search, which has complexity 2) To further overcome the complexity hurdle, we develop a low-complexity alternating optimization framework that decomposes the UAV trajectory optimization problem into two sub-problems.For the first sub-problem, we optimize hovering locations in polynomial-time for any given visiting order by using a DP algorithm.For the second sub-problem, we present a 2-approximation solution for the optimum visiting order for given hovering locations, again in polynomial time.Our numerical results show that the UAV trajectories produced by our low-complexity algorithm are within a range of 5% to 15% of the optimum trajectories in terms of the average energy consumption.And, this is achieved with a tremendous gain in run time, with computational complexity of O (M N ) 2 K + K 2 log K .

A. System Model
Consider a UAV-assisted wireless communications scenario that consists of K GNs (such as Internet of Things devices) k , 0 Each GN has Q k bits to transmit to a rotary-wing UAV.The UAV operates under the fly-and-hover protocol [22], with the objective of minimizing total energy expenditure while harvesting data from all the GNs.The energy minimization problem here is to find an optimum visiting order to collect data from GNs and the optimum set of corresponding service locations.
According to this definition, ϕ (k) gives the index of the GN served by the UAV at the kth hovering location for k ∈ [1 : K].We set ϕ (0) to 0 and ϕ (K + 1) to K +1 to fix the indices for the start and final positions of the UAV.The optimum visiting order must be chosen from the K! permutations of Given a visiting order ϕ, the UAV serves GN ϕ(k) at the kth hovering location x k ∈ R 3 for k ∈ [1 : K].The points x 0 ∈ R 3 and x K+1 ∈ R 3 denote the initial and final locations for the UAV's flight.We fix them at arbitrary locations, allowing the possibility that they may coincide.A single hovering location is sufficient to communicate with a single ground node and, therefore, at most K hovering locations are required to communicate with K ground nodes.
Our energy consumption minimization framework for UAVs is capable of generating the optimal sequence of hovering locations and visiting order for a UAV flying with three degrees of freedom in the flight path.However, in this paper, we choose the UAV to fly at a constant altitude H (meters).This will both simplify the notation and exposition of the key concepts, with the obvious extension to varying flight altitude scenarios.By restricting the optimization to a single height, H, we save considerably on algorithm complexity also.

B. Channel Model
The wireless channel (between GNs and the UAV) is one of the primary dynamics that govern UAV energy consumption in our setup.We model the wireless channel using the flexible and versatile model introduced in [32].This model characterizes the channel for a wide range of practical operating scenarios, such as suburban, urban, dense urban and high-rise urban, which are described in the current industry standards [33].
Specifically, the probability of LoS connectivity between a GN located at v and the UAV located at x is modelled by a generalized logistic function of the form where ∥v − x∥ is the distance between the GN and UAV.
In (1), A (θ) and f (∥v − x∥ , θ) are functions of elevation angle θ = 180 π sin −1 (H/∥v − x∥), which depends on the GN-UAV distance.For example, a commonly used model for the LoS probability takes the form A (θ) = 0 and f (∥v − x∥ , θ) = −bθ + a, where a and b are environmentdependent modelling parameters [34].A more sophisticated model for characterizing the LoS connectivity in a Manhattantype city environment is considered by using a fourthorder polynomial A (θ) and the function f (∥v − x∥ , θ) = (∥v − x∥ cos(θ) − B (θ)) /C (θ), where B and C are also fourth-order polynomials of θ for each environment type [32].
This construction is crucial in our setting because it enables us to determine the large-scale fading gains for GN-UAV connectivity.We write the large-scale fading power gain (in dB scale) for a link between a GN located at v and the UAV hovering at x according to where α is the path-loss exponent, f is the operating frequency (in Hz), c is the speed of light (in m/s) and η is a twopoint random variable accounting for the excess path-loss experienced in LoS and NLoS conditions [22], [32], [35].η takes the value η L with probability p L (∥v − x∥) and the value η N with probability 1 − p L (∥v − x∥).We consider that the UAV has access to the large-scale fading information including η L , η N , and p L (∥v − x∥)).

C. Data Collection Model
Using the channel model described above, we write the link rate in bits/s between the GN located at v and the UAV hovering at x according to where B is the communications bandwidth (in Hz), P is the GN transmit power (in Watts), σ 2 is the receiver noise power (in Watts) and ρ = c 4πf 2 10 − η 10 .Here, R is a random variable due to the randomness in the excess path-loss parameter η.The randomness in R must be taken into account appropriately to calculate the expected energy consumption of a UAV when it hovers.
In particular, under the fly-and-hover protocol, the UAV hovers at x until it fully collects the data from the GN at v. Hence, we can express the expected duration for hovering, parametrized by the number of bits Q that will be harvested from the GN by the UAV, as The average hovering time, as presented in (4), is a complex function of both the GN and the hovering locations.This function does not possess a convenient convex structure for optimization, even for a single GN location and a UAV hovering location.A common simplification in the literature [9], [22], [29] is to approximate T hov (∥v − x∥ ; Q) by pushing the expectation operator inside the rate function, which results in However, this approximation only gives a lower bound on the actual average hovering time T hov (∥v − x∥ ; Q), which can be seen by applying Jensen's inequality.More importantly, this lower bound can be excessively loose and deviate significantly from the true value of the average hovering time, particularly in built up environments (unlike the scenarios considered in [9], [22]).
We illustrate this point through the following example.Consider an urban communications scenario characterized by the parameters α = 2.6, σ With these parameters, the average hovering time is 10.2 s, calculated using (4), whereas the approximation in (5) gives 0.68 s as a lower bound.There is clearly more than an order of magnitude difference between the two.The disparity arises from the fact that the utilization of the average channel gain inside the rate function to calculate the hovering time fails to adequately consider the possibility of a prolonged hovering period during a blockage event.

D. UAV Energy Consumption Model
We characterize the UAV energy consumption for both moving and hovering using the model in [22].In this model, the power consumption of the UAV operating at a constant altitude is given by Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
where V is the UAV speed, U tip is the speed of the tip of the rotor blade, and ξ represents the mean rotorinduced speed.C 0 , C 1 , and C 2 are constants that depend on the UAV design and operating environment.Using (6), the hovering power consumption can be obtained as P UAV (0).This calculation also gives the expected energy consumption to hover at x and harvest Q bits from a GN at v according to T hov (∥v − x∥ ; Q) P UAV (0).We ignore the RF circuit energy cost of signal reception since it will be negligible when compared to the hovering energy consumption.We consider that the UAV operates at the optimum speed that minimizes energy consumption per meter when travelling from one GN to another, and it can be found by solving V ⋆ is also known as the maximum range speed [22], [36], which is the optimum choice of UAV speed for the energy minimization problem we solve in this paper since it minimizes the total UAV energy consumption uniformly over all trajectories.Using V ⋆ , we calculate the optimum energy consumption per unit travelling distance as (in J/m).In our energy minimization framework, we will utilize E ⋆ as a key parameter to calculate the total energy consumption during UAV movement between different hovering locations.

III. UAV ENERGY MINIMIZATION PROBLEM
We focus on minimizing the total UAV energy expenditure when the UAV flies to harvest data from all GNs.The ith GN has Q i bits to send to the UAV, thus the average hovering time when the UAV is serving that GN from location x, is given by For a given visiting order ϕ, the total UAV energy expenditure can then be written as The first term relates to hovering, and depends only on the hovering locations.The second term relates to travel, which depends on the sequence in which the hovering locations are visited.
The UAV energy minimization problem is, therefore: minimize where Φ is the set of all possible visiting orders.x H start and x H end are the points on the plane P H = x ∈ R 3 : x (3) = H directly above the start and end locations.The first constraint in (9) enforces a fixed altitude during the flight, and the second constraint establishes the starting and ending for the UAV at altitude H.This means that we optimize the x k Fig. 1.
A example depicting the non-convexity of g(t) = for k ∈ [1 : K] on a 2D plane while keeping the third component fixed at H in (9).In this formulation, the UAV first moves vertically from x start to x 0 which is directly above it, and then maintains an altitude of H while flying to other hovering locations, and ultimately lands on x end via a vertical descent straight down from x K+1 .In ( 8), we exclude the constant energy consumption associated with the initial and final vertical movements.This exclusion is justified because these movements do not impact the determination of the optimal hovering locations and visiting order.We will solve ( 9) by leveraging the additive cost structure to develop a DP-based algorithm.Note that solving (9) is highly complex due to the need to determine the optimal visiting order from among K! different possibilities.In fact, with fixed hovering locations, solving (9) is equivalent to addressing the well-known NP-hard TSP [37].
Additionally, the total hovering energy consumption is not necessarily a convex function of {x k } k∈[0:K+1] .We can investigate this fact by restricting the total hovering energy term in (8) into a line, i.e., g(t) = P UAV (0) where t ∈ R, x ′ k ∈ R 3 is an arbitrary hovering location and d k ∈ R 3 is a random directional vector.The hovering energy term is convex if and only if g(t) is a convex function of t for all x ′ k and d k values [38].However, Fig. 1 depicts that g(t) is clearly non-convex for the case of two GNs with the following parameter values: α = 2.6, 10 log 10 ρP/σ 2 = 65 dB, and . Hence, the total hovering energy expenditure is not always convex, which implies that the baseline trajectory optimization problem in ( 9) is not necessarily convex, even for a fixed visiting order ϕ.
Remark 1: Our objective function in (9) can also be extended to include the cost of acceleration/deceleration of the UAV at the points the UAV needs to accelerate to V * at x k−1 and decelerate to 0 at x k .For wide-area node deployments, the UAV acceleration/deceleration duration takes a small portion of the total operation time, and hence, we assume that the UAV reaches the necessary speed within a short distance Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
[15], [22], [24].Let E a and E d be the energy required for acceleration and deceleration in Joules.Then, the objective function, including the acceleration/deceleration energy cost, is given by where Our optimum DP-based algorithm trivially extends to cover this case since the addition of acceleration and deceleration costs does not change the key underlying additive structure of the objective function we minimize over UAV trajectories.

IV. UAV TRAJECTORY DESIGN: OPTIMUM ALGORITHMS
In this section, we will develop a DP-based optimization framework that leverages the additive structure of the energy consumption function J {x k } k∈[0:K+1] , ϕ as defined in (8) to solve the UAV energy optimization problem in (9).By utilizing this framework, we are able to produce an optimum trajectory for the UAV that minimizes total energy consumption while harvesting data from K GNs.Our framework optimizes the visiting orders and hovering locations jointly to generate an optimum UAV trajectory.

A. Dynamic Programming Based Energy Optimization Framework
A critical component of the proposed framework is the system state, which will play a key role in determining the "future" UAV hovering locations for energy-optimum data collection.As we develop our framework, the time-evolution of the system and the concepts like "future" and "current" will be clear from the context.We define the system states formally as follows.
Definition 2: We call the tuple s = (x, I) a system state if The state s = (x, I) includes two components: x ∈ R 3 specifies the hovering location for a UAV operating at any altitude, and I ⊆ [1 : K] specifies the set of GNs from which the UAV has yet to harvest data.
In our framework, the UAV starts at location x 0 = x H start , with an initial set of GNs that are yet to be served of The initial state is denoted s 0 , with s 0 = (x 0 , I 0 ).This initial state serves as the starting point for our optimization framework that will gradually evolve into a final energy-optimum UAV trajectory.
For the final state in the trajectory, s K+1 = (x K+1 , I K+1 ), we have x K+1 = x H end and I K+1 = ∅.Thus s K+1 gives the endpoint of the UAV's data collection mission, at which all GNs have been successfully served and there are no remaining GNs awaiting data collection.For other values of index k ∈ [1 : K], we have a continuum set of system states.We represent the system state space for each k ∈ [0 : K + 1] by S k .
We update the states using a UAV control strategy, which determines both the subsequent hovering location and the index of the GN to be served from this hovering location.Formally, a UAV control strategy is a mapping π : S → R 3 × [1 : K + 1] from a given set of system states S to R 3 ×[1 : K + 1].In our formulation, the control strategy takes any state in S to a state in which the UAV is at an altitude H above the ground.This fact is expressed in constraint 1) in Definition 3 below.
Definition 3: We say π = (u, q) is a UAV control strategy if the following are satisfied for all s = (x, I) ∈ S: end and q (s) = K + 1 if I = ∅.The first component, u, of a UAV control strategy π = (u, q), gives the necessary displacement to relocate the UAV from the current hovering location to the next.Constraint 1) above ensures that the UAV maintains a constant flying altitude H.The second component, q, designates the index of the GN that the UAV will serve at the subsequent hovering location.Constraint 2) guarantees that this will be one of the remaining GNs.Constraint 3) guarantees that the UAV flies to the endpoint of its trajectory when there is no remaining GN to serve.
Similar to system states, we index the UAV control strategies according to π k = (u k , q k ) for k ∈ [0 : K].The control strategy π k takes as input s k ∈ S k and determines the (k + 1)st hovering location and the GN index to be served from this location.
We call a sequence of control strategies π = (π 0 , . . ., π K ) a UAV control policy.Given a UAV control policy π, the system states evolve according to for k ∈ [0 : K].A control policy π determines a unique UAV trajectory, starting from an initial state s 0 , as the system states evolve according to (11).Note that if the starting point x 0 is at height H,1 then all subsequent x k locations will remain at height H, by the constraint 1) in Definition 3. When this is the case, the third component of the displacement vector u k in the control action at state s k equals zero for k = 0, 1, . . .K.
We establish some key properties for UAV control policies in the next lemma.
2) The endpoint of the UAV trajectory is x K+1 and there is no remaining GN to serve at this endpoint, i.e., s K+1 = x H end , ∅ .
Proof: See Appendix A. Lemma 1 highlights a crucial observation: a control policy for the UAV ensures that each GN is served in a specific order, dictated by the policy, and once all GNs have been served, the UAV proceeds to the endpoint of the trajectory.Further, all possible visiting orders can be achieved (starting from the initial state s 0 = (x 0 , I 0 ) with I 0 = [1 : K]) as we scan through all possible UAV control policies.
We denote the set of all UAV control policies by Π below.Then, for any π ∈ Π, we express the total energy consumption function according to where the first term gives the total hovering energy to harvest data from GNs and the second term gives the total travel energy.Starting from a given initial state s 0 , the system evolves according to (11), and hence all the terms in ( 12) are well-defined.
Using this setup, we express the UAV control policy optimization problem formally as follows: We will solve (13) using the DP algorithm since we formulated the problem to align with the classical DP framework [39].Note that ( 13) is a more general version of ( 9) since ( 13) obtains a dynamic policy (i.e., a collection of functions) that is capable of responding to observed system states, not just a set of hovering locations and a visiting order as a solution.This is required for the policy optimization stage in our DPbased trajectory optimization framework because it is ex-ante unknown which system states the optimum policy will produce in future steps.Before proceeding to the DP algorithm, we establish the important fact that the optimum values of the problems ( 9) and ( 13) are the same if the UAV control policy is optimized for the initial state s 0 = x H start , [1 : K] .Theorem 1: Let J ⋆ be the optimum value of (9) and J ⋆ DP (s 0 ) be the optimum value of (13).Then, for

B. The Dynamic Programming Algorithm
To construct the energy-optimum UAV trajectory, we will minimize the future UAV energy consumption by characterizing the best UAV hovering locations and the visiting order, starting from any state, for each k ∈ [0 : K].This approach is a common practice in the DP literature [39] and leads to the optimum UAV control policy achieving the minimum value of (13).
To this end, we define the energy-to-go functions using a generic system state s = (x, I) as the input variable as follows: J DP,k (s) = min u∈R 3 :u+x∈P H i∈I for k ∈ [0 : K − 1].In ( 14), J DP,K (s) represents the energy consumed by the UAV in its final step towards reaching the trajectory endpoint, after serving the last GN.On the other hand, for k ∈ [0 : K − 1], J DP,k (s) denotes the minimum energy necessary to serve the remaining GNs in I and complete the trajectory at the final point x H end , starting from the state s.
We also define the following UAV control strategies: for k ∈ [0 : K − 1], where (u ⋆ k (s) , q ⋆ k (s)) are an optimum control action pair achieving the minimum in (15) at state s.Using ( 14)-( 17), the next theorem establishes a solution for the UAV control policy optimization problem in (13).
Proof: The feasibility of π ⋆ generated by the backward algorithm follows from our construction.The optimality of π ⋆ follows from Bellman's principle of optimality for DP [39].The optimality of {x ⋆ k } k∈[0:K+1] and ϕ ⋆ for the trajectory optimization problem (9) follows from Theorem 1.
In our proposed DP framework for solving the trajectory optimization problem in (9), two critical components should be noted.Firstly, the backward direction constructs an optimal UAV control policy π ⋆ that minimizes the UAV's future energy consumption while serving the remaining GNs and completing its trajectory at x H end , from any observed state s at any step k ∈ [0 : K].The backward algorithm does not produce a solution for (9) at this stage yet.In the forward direction, on the other hand, by utilizing the optimum policy π ⋆ constructed in the backward direction, we generate a set of optimum hovering locations solving (9) for a given initial state s 0 to serve all GNs and complete the trajectory at x H end .Once we have constructed π ⋆ in the backward direction, the forward algorithm is a simple linear-time (with respect to the number of GNs) sequential table lookup procedure.
Remark 2: Our solution given in Theorem 2 can be readily extended to a broader scenario, where each GN, indexed by i ∈ [1 : K], possesses Q i,up bits for uplink transmission and Q i,down bits for downlink reception.In this case, we can Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
simply redefine hovering time functions as T hov i (x) = T hov i,up (x)+T hov i,down (x) and use exactly the same DP algorithm in Theorem 2 to find the optimum UAV trajectory.Here, T hov i,up (x) and T hov i,down (x) represent the uplink and downlink hovering time functions for GN i when it is served from position x.This can be seen by a simple application of triangular inequality and monotonicity of hovering time functions to conclude that there exists an optimum UAV trajectory that employs the same position for both uplink and downlink communications to serve GN i, ∀i ∈ [1 : K].As a byproduct, this argument also shows that the optimum number of hovering locations cannot be larger than K.
Remark 3: In addition to the UAV energy consumption, another key metric to optimize in practical applications is the amount of time that is required to collect all the GN data and deliver it to a final data collection point.In this case, by choosing the speed of UAV to be maximum speed V max , the objective function is given by which is exactly in the same form of our objective function in (8).Therefore, the same solution in Theorem 2 directly extends to discover an optimum UAV trajectory that minimizes total data harvesting latency in UAV-assisted wireless communications scenarios.

C. Implementation and Complexity Analysis
In this section, we formulate a low-complexity implementation of the Backward-Forward Algorithm from the preceding section.This is important because the location component of system states in our DP framework are continuous variables.For continuous-state DP problems, two common approaches for solving the problem numerically are either to approximate the optimum cost functions (e.g., by using least-squares regression or neural-networks) or to discretize the state space [40], [41], [42].
We will use the discretization approach to numerically compute the optimum UAV control policy in Theorem 2. Our formulation, which keeps the UAV at a constant height H, is advantageous in that it means we only need to discretize the two-dimensional plane at height H.As a result, our formulation does not suffer the impact of "curse of dimensionality" in moving from two to three dimensions, although our formulation can in principle easily be extended to a search over three dimensions.
We now define the grid for our DP algorithm by placing the xy origin such that the x and y coordinates of all GNs are non-negative.Specifically, we do this by shifting all GN locations such that min k∈ Then, we construct a grid according to k .Here, ∆x and ∆y are grid resolutions along the first and second dimensions, respectively.
Using G, we can now write the energy-to-go functions in ( 14)-( 15) on this grid and obtain the optimum UAV control policy by restricting the control actions that will result in the hovering locations belonging to G.This procedure is illustrated in Algorithm 1 below.
In the next theorem, we establish a computational complexity result that characterizes the run-time efficiency of Algorithm 1 as a function of the grid size and the number of GNs.
Theorem 3: For a given grid G with size |G| = M N and K GNs to serve, the computational complexity of Algorithm 1 is O (M N ) 2 K2 K .
Proof: See Appendix C. Our proposed DP framework offers significant computational complexity benefits compared to any exhaustive search-based approach.In the absence of our DP framework, the exhaustive search would be the baseline approach to developing a solution for the UAV trajectory optimization problem in (9).This is because ( 9) is a non-convex and combinatorial optimization problem due to the need for an optimal visiting order.
The computational complexity of exhaustive search to solve (9) for K GNs over a grid G with size |G| = M N is O (M N ) K K! , while our DP algorithm reduces this complexity to O (M N ) 2 K2 K .Notably, the computational complexity of our proposed DP algorithm scales quadratically with the size of the grid, in contrast to the exhaustive search approach, which scales as the Kth power of the grid size.Moreover, for a fixed grid size, the complexity of our DP is O K2 K as a function of the number of GNs, which is much more efficient than the super-exponential growth of complexity of the exhaustive search approach, which is in the order of K!, or approximately

V. LOW-COMPLEXITY APPROACHES: APPROXIMATION ALGORITHMS
Our results in the previous section reveal an important decoupling property.We showed in Theorem 3 that our DP algorithm has a computational complexity of O (M N ) 2 K2 K .It can be seen that the term O (M N ) 2 K is a result of optimizing the hovering locations.This requires computing the energy-to-go functions for all grid points (of which there are M N ), and each function computation necessitates O (M N ) operations.This process must be repeated for K − 1 steps.The term O 2 K is a result of the necessity to determine the optimum visiting order.This is equivalent to solving a variant of the TSP.
Based on these observations, we now develop a systematic approach to design near-optimum UAV trajectories, which will involve decomposing the joint optimization of hovering locations and visiting orders into two sub-problems.In one sub-problem, we will utilize our DP algorithm to find the optimal hovering locations in polynomial time for a given visiting order.A major advantage of our DP algorithm is that it does not require any convex structure for the objective energy consumption function to produce the optimum hovering locations.In the second sub-problem, given hovering locations, we will design a polynomial time approximation algorithm to find an efficient visiting order that provides provable guarantees on the travel energy consumption.

A. Optimizing Hovering Locations for Given Visiting Orders
We modify our DP algorithm to find optimum hovering locations in polynomial time for a given visiting order.We simplify the definitions of system states and control policy since it is not necessary to store all possible future GN combinations when the visiting order is fixed.We redefine the system state so that it only includes the UAV location x ∈ R 3 , and the UAV control policy π = (π 0 , . . ., π K ) so that it only includes the displacements needed to move the UAV from one hovering location to the next at each step.The system states evolve as In this scenario, we have the following theorem characterizing the optimum hovering locations.
Theorem 4 (The Backward Algorithm): For a given vising order ϕ, let the energy-to-go functions be defined as follows: JDP,k (x; ϕ) = min x, and π ⋆ k (x) be a control action achieving the minimum in (22) for k ∈ [0 : K − 1].Then, JDP,0 x H start ; ϕ is equal to the optimum value of ( 9) when the visiting order is restricted to be ϕ, i.e., Φ = {ϕ}, and π ⋆ = (π ⋆ k , . . ., π ⋆ K ) solves ( 13) when Π is restricted to the set of policies that generate ϕ as the visiting order.
The Forward Algorithm: Let x 0 = x H start and x k be the sequence of states generated by π ⋆ according to (20), going forward from step 0. Let also x ⋆ k ≜ x k for k ∈ [0 : K + 1].Then, the locations {x ⋆ k } k∈[0:K+1] solve (9) when the visiting order is restricted to be ϕ.
Proof: The proof follows from the same arguments used to prove Theorem 2.
We present an implementation of this given-visiting-order DP algorithm, in Algorithm 2. The complexity of this implementation is O (M N ) 2 K .

B. Efficient Visiting Order Discovery Through 2-Approximation Algorithm Design
In this section, we develop an approximation algorithm to discover a visiting order with provable guarantees on the travel energy consumption, compared with the optimal exponential-complexity TSP solution, for any given set of hovering locations.In particular, the visiting order returned by our algorithm achieves a travel energy that is not more than twice the travel energy given by the TSP solution.We call this a 2-approximation solution.
We modify a well-known algorithm called the minimum spanning tree (MST) [43], to obtain our 2-approximation algorithm.The modification is necessary because the initial and final UAV locations in our data harvesting problem are not necessarily the same location, unlike the classical TSP setup.
Algorithm 2 : DP Algorithm for UAV Trajectory Optimization With Fixed Visiting Order such that u ⋆ achieves JDP,k (x; ϕ) in (23) 12: end for 13: end for Forward Algorithm: Before starting the algorithm, we introduce the new notation xk as the hovering location of the UAV when serving the kth GN for k ∈ [1 : K].Unlike x k , the output of our DP algorithm generated at step k for k ∈ [1 : K], xk is associated with kth GN and hence ordering the hovering locations {x k } K k=1 will lead us to a visiting order ϕ.
Our 2-approximation algorithm is based on the following underlying idea.We consider a given set of locations {x k } K+1 k=0 , where x0 = x H start and xK+1 = x H end .We construct a spanning tree T ⋆ for the vertex set {x k } K+1 k=0 such that the final UAV trajectory location xK+1 is a leaf node, and connected to its nearest neighbour vertex.T ⋆ has the property that the sum of its edge weights does not exceed the total travel distance of the minimum travel cost trajectory for visiting the given hovering locations, where we use the Euclidean distances between the vertices as edge weights.We select the Euclidean distances as weights because they directly correspond to the energy consumed during UAV travel, as per the physical relationship described in (8).
Utilizing T ⋆ , we form a UAV path that visits all hovering locations, starting at x0 and ending at xK+1 .We show that this path possesses the property that the total UAV Proof: See Appendix D.
Our algorithm, as presented in Algorithm 3, has a polynomial time complexity of O K 2 log K .The most computationally expensive steps of the algorithm are Step 1, which involves constructing a fully connected graph, and Step 2, which involves finding an MST of this graph.These steps contribute the bulk of the computational complexity, while the subsequent steps of the algorithm run in linear time with respect to K.

C. Alternating Optimization Algorithm
We now propose an alternating optimization algorithm for the UAV trajectory optimization problem in (9).Our approach is based on two polynomial time algorithms that we have developed in the previous parts: finding optimum hovering locations (given a visiting order) and a 2-approximation solution for the optimum visiting order (given hovering locations).Our alternating optimization algorithm will iterate between these two stages, discovering an efficient visiting order and optimum hovering locations for this visiting order in turn.
We present the pseudocode for our new algorithm in Algorithm 4, which illustrates how Algorithm 2 and Algorithm 3 are sequentially utilized as sub-routines to output an efficient set of hovering locations and a visiting order.Each Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
Algorithm 4 : Efficient Hovering Location and Visiting Order Discovery With Alternating Optimization start , x H end , P UAV (0), E ⋆ and G. 2: Output: Hovering locations {x k } k∈[1:K] and a visiting order ϕ.
end and E travel ← ∞, where v H k ∈ P H is the point directly above v k with altitude H. if ϕ temp achieves travel energy smaller than E travel for input locations then xk 10: E travel ← Travel Energy Achieved by ϕ for updated locations 11: end if 12: until Convergence or maximum iteration count 13: Return: {x k } k∈[0:K+1] and ϕ, where we set Algorithm 4 is guaranteed to convergence, even if the maximum iteration count is set to infinity.This is because it improves the combined hovering and travel energy consumption of the UAV at each iteration.When Algorithm 4 convergences, it returns a 2-approximation solution for the visiting order and a set of optimum hovering locations for this visiting order.
The maximum iteration count parameter in Algorithm 4 can be used to trade-off quality of the solution and the run-time computational complexity by terminating the algorithm before convergence.One specific value of interest for this parameter is 1.When set to 1, Algorithm 4 uses the initial GN locations {v k } k∈[1:K] to find a 2-approximation solution for the visiting order, generates a set of optimum hovering locations for this visiting order, and then terminates.In the next section, we investigate this scenario in detail numerically.Our results show that even in this very low-complexity implementation, the returned hovering locations and visiting order perform close to the optimum value of (9) (within 5% to 15% range).

VI. NUMERICAL PERFORMANCE ANALYSIS
In this section, we present our numerical results to validate the performance of the proposed algorithms.The following are the system parameters we use in our numerical analysis.
The UAV operates at H = 100 m altitude.The transmit power is set to −10 dBW.Using (6) and the model parameters in [22], we obtain the hovering power consumption as P UAV (0) = 168.12W. Solving (7) for the same model, we obtain the optimum energy consumption per unit travelling distance as E ⋆ = 8.83 J/m.The communication bandwidth is set to B = 1 MHz with carrier frequency 900 MHz.We adopt the probabilistic LoS/NLoS model in [32], along with corresponding parameter values for various environmental conditions and excess attenuation estimates for both LoS and NLoS scenarios.The path-loss exponent is set to α = 2.6, which is in the range of common path-loss values used in the literature [6], [22], [44].Unless otherwise specified, the initial and end points for the UAV trajectory are set to x H start = (0, 0, 100) T and x H end = (1000, 800, 100) T , where we measure all distances in meters.
Our DP-based UAV trajectory optimization framework has an important feature that distinguishes it from other methods: it does not rely on an average channel gain (ACG) approximation to determine the energy required for UAV hovering.In the literature, the hovering time is typically estimated using this approximation through the equation T hov approx (∥v (as given in ( 5)), where the expectation operator is moved inside the rate function.This approximation simplifies the theoretical analysis (as shown in [9], [22]) but it can result in significant deviations from the actual energy consumption of the UAV.
To illustrate this point more concretely, we replace (8) to compute the energy consumption of the UAV over its data harvesting path.We then use the successive convex approximation technique developed in [22] to minimize the approximated UAV energy consumption function.
Our results are plotted in Fig. 2. In this figure, we focus on a scenario in which UAV operates in a dense environment characterized by Rayleigh distributed building heights and 500 buildings per square kilometer [32].The red curve in Fig. 2 displays the actual energy consumption for the hovering locations found by using the approximated energy consumption function, whereas the blue one represents the predicted UAV energy consumption for the same hovering   3 locations.The significant deviations between these two curves demonstrate the critical importance of employing the actual hovering energy consumption function to identify optimum UAV trajectories.The approximation method consistently underestimates the energy needed for data harvesting, particularly when a large number of bits are transmitted by the GNs.This is primarily due to the approximation's tendency to downplay the effect of NLoS conditions on the hovering time, resulting in a totaly different form for the objective function to optimize.
Our proposed framework (DP Scheme) generates optimum UAV trajectories that are jointly optimized over hovering locations and visiting orders using Algorithm 1.These trajectories are shown in Fig. 3 for the same number of bits Q at all GNs for four different Q values.To assess the performance of our approach, we compare it against two benchmark schemes: (i) the travelling salesman, successive convex approximation, average channel gain scheme (TS-SCA-ACG scheme), as proposed in [22], and (ii) a travelling salesman hovering above GNs scheme (TS-HAG scheme).Both Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
benchmark schemes use the travelling salesman ordering based on GN locations, resulting in exponential complexity in the number of GNs, for all schemes.The energy consumption of each trajectory is tabulated in Table I.Clearly, our optimal scheme performs the best for all data sizes, and significantly outperforms the TS-SCA-ACG and TS-HAG schemes for data sizes ranging from 100 kbits to 10 Mbits.Fig. 3a shows that for the low 1 kbit case, the flight path is essentially a straight line between the start and end points, as the data collection requirement is minimal, and both our scheme and the TS-SCA-ACG scheme have similar trajectories.Fig. 3d shows that for the extreme case of 1 Gbit, the flight path for all three schemes closely visits each node, as the data collection requirement is so dominant.
Our findings in Fig. 3 offer a significant insight based on data-gravity, where we note that the UAV becomes more attracted to the GN and the optimized hovering locations become closer to the GN locations as Q increases.For instance, the trajectory for Q = 1 kbits case is almost a straight line from the initial to the final location.In contrast, when Q = 1 Gbits, the UAV hovers just above the GN locations, similar to the TS-HAG scheme.This is a consequence of the trade-off between hovering and travel energy costs.Hovering farther away from GNs becomes very onerous for large values of Q since hovering time and energy increase with increasing distance of the communication link.On the other hand, moving closer to the GN decreases the hovering energy to collect data but increases the travel energy due to the additional flying required.Our proposed DP scheme optimizes the hovering locations to balance these two conflicting costs.
From trajectories in Fig. 3, we also observe that the optimized hovering locations generated by our DP scheme deviate more towards the GN locations than those of the TS-SCA-ACG scheme, especially for low to moderate values of Q.The reason for this observation is twofold.Firstly, the TS-SCA-ACG scheme does not optimize the visiting order, whereas our DP scheme does.Secondly, as previously mentioned, the TS-SCA-ACG scheme inaccurately estimates the UAV energy consumption function, resulting in suboptimum hovering locations that do not accurately account for GN traffic demands.As a result, the actual energy cost for the TS-SCA-ACG scheme is significantly higher than that of the DP scheme for a moderate number of bits, such as Q = 10Mbits.However, the energy consumption of both schemes converges when either the number of bits is extremely low (Q = 1 kbits) or extremely high (Q = 1 Gbits), producing similar trajectories in extreme cases.
We consider the effect of visiting order selection on the UAV energy consumption in Fig. 4. In this figure, we also evaluate the performance of the low-complexity alternating optimization algorithm presented in Algorithm 4. To this end, we plot the average UAV energy consumption as a function of Q for 100 random realizations of 8 GN locations distributed in a square kilometer area in Fig. 4. In each realization, we determine the visiting order by solving the TSP for the GN locations, and we then use this visiting order in Algorithm 2 to obtain optimum hovering locations (TSP Scheme).To ensure a fair comparison, we set the max iteration count in Algorithm 4 to 1, resulting in a 2-approximation solution for the visiting order using GN locations, which we then exploit to find the optimum hovering locations.
As seen in Fig. 4, UAV energy consumption for all ordering schemes increases with Q since the hovering time, and hence hovering energy, is directly proportional to Q.Our DP-based framework performs slightly better than the TSP scheme.Notably, our low-complexity proposal, Algorithm 4, produces UAV trajectories that consume only around 5%-15% higher energy than the energy consumed by those obtained using the optimum DP scheme.It is worth mentioning that both the DP and TSP schemes have exponential computational complexity, whereas our alternating optimization algorithm generates efficient visiting orders and hovering locations with only polynomial time complexity.

VII. CONCLUSION
In this paper, we have proposed a DP-based optimization framework to design optimum UAV trajectories for data harvesting from distributed GNs.We have developed a backward-forward algorithm that jointly optimizes over hovering locations and visiting orders to generate optimum UAV trajectories.The optimum UAV trajectory design problem we solve in this paper is NP-hard and a non-convex optimization problem, resulting in exponential complexity in our optimum algorithm.To address the complexity problem, we have also developed a polynomial time alternating optimization algorithm that decomposes the UAV trajectory optimization problem into two sub-problems of optimizing the hovering locations (given a visiting order) and finding a 2-approximation solution for the visiting order (given hovering locations).
Our numerical results have revealed the importance of accurate estimation of hovering energy consumption by taking the cumulative effects of link distance and LoS/NLoS probabilities into account.Our numerical results have also shown that our low-complexity alternating optimization Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
algorithm achieves energy consumption performance within 5% to 15% of the optimum.
Our work in this paper opens up important research directions for further investigation in the field.We have developed a 2-approximation algorithm to find a visiting order, for which we can optimize hovering locations in polynomial time.The other potential candidate approaches to approximate the visiting orders include the algorithms proposed by Christofides [45] and Arora [46].It is of notable interest to investigate the trade-off between solution quality and complexity across various visiting order approximation algorithms for the UAV trajectory design problem.Our algorithms can be used to optimize the UAV trajectory offline with the knowledge of GN locations and the probability of LoS/NLoS channel conditions.Another important research direction is to incorporate the instantaneous channel state information and real-time LoS/NLoS conditions with the GNs while the UAV is in flight to update the UAV hovering locations.

APPENDIX A PROOF OF LEMMA 1
Consider a sequence of states {s k = (x k , I k )} K+1 k=0 generated by a UAV control policy π, starting with the initial state s 0 = (x 0 , I 0 ) with I 0 = [1 : K], according to the state update rule (11).For k ∈ [0 : K], the number of GNs in I k is equal to K − k since π removes a GN from I k at each step.This implies I K = ∅, and therefore ϕ (K + 1) = q K (s K ) = K +1.Additionally, ϕ (k) ̸ = ϕ (j) for any k < j in [1 : K] since ϕ (j) ∈ I j−1 but ϕ (k) / ∈ I j−1 since it is removed from I k−1 at step k − 1.These results show that ϕ is a visiting order, i.e., bijection satisfying ϕ (0) = 0 and ϕ (K + 1) = K + 1.This completes the proof of the first assertion.
For the second assertion, the above arguments show that I K = ∅ for any UAV control policy.This implies x K+1 = x H end since π K is a UAV control strategy and outputs

APPENDIX B PROOF OF THEOREM 1
Let ϕ be the visiting order induced by a UAV control policy π ∈ Π for the initial state s 0 = x H start , [1 : K] , as established in Lemma 1.Let s k = (x k , I k ) be the sequence of states generated by π, starting from s 0 , according to the state update rule in (11) for k ∈ [0 : K + 1].Then, we can express (12) according to The second equality follows by invoking ϕ(k + 1) = q k (s k ).
By Lemma 1, ϕ is a visiting order.The equations ( 8) and (24) are exactly in the same form, and the variables {x k } k∈[0:K+1] satisfy the constraints in (9) for s 0 = x H start , [1 : K] since π is a UAV control policy.This shows J ⋆ DP (s 0 ) ≥ J ⋆ if s 0 = x H start , [1 : K] because π was an arbitrary UAV control policy in Π.
For the other direction, let {x ⋆ k } k∈[0:K+1] and ϕ ⋆ be a set of optimum hovering locations and a visiting order achieving J ⋆ in (9).Such an optimizer exists due to the continuity of J {x k } k∈[0:K+1] , ϕ in x k , and the fact that the hovering energy grows unbounded as the distances between hovering and GN locations increase, which means that the optimal solution must lie in a compact subset of P H .
Consider now a sequence of system states defined as . We construct a possible π by constraining the control actions to be q k (s ⋆ k ) = ϕ ⋆ (k + 1) and u k (s ⋆ k ) = x ⋆ k+1 − x ⋆ k on system states s ⋆ k for k ∈ [0 : K].We allow π to take any control action on other system states without violating the conditions in Definition We construct a new spanning tree T ⋆ by connecting xK+1 to x⋆ .T ⋆ spans all the locations {x k } k∈[0:K+1] .In addition, it contains xK+1 as a leaf vertex, a crucial property that we will utilize in our proof.Let now ϕ ⋆ be an optimum visiting Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
We first show that COST (T ⋆ ) ≤ K k=0 xϕ ⋆ (k+1) − xϕ ⋆ (k) , where the COST function is defined to be the sum of all edge weights of the input graph.To this end, we consider a sub-path P of P ⋆ that ends at xϕ ⋆ (K) , i.e., P = x0 , xϕ ⋆ (1) , . . ., xϕ ⋆ (K) .Then, the inequality COST (T ) ≤ K−1 k=0 xϕ ⋆ (k+1) − xϕ ⋆ (k) holds because P can be mapped to a spanning tree of G that only contains the edges connecting subsequent locations in P.Then, we have where the last inequality follows from our construction that x⋆ is closest to xK+1 .We utilize T ⋆ to construct a UAV trajectory that consumes no more than twice the energy of the optimum trajectory.To this end, starting at x0 , we construct a walk W on T ⋆ , visiting all the vertices {x k } k∈[0:K+1] by using a depthfirst search, exploring the branch leading to xK+1 at the final step and terminating the walk whenever xK+1 is reached. 2 We can represent this walk as a sequence of vertices W = (x k0 , . . ., xk L ), where xk0 = x0 , xk L = xK+1 and L ≥ K + 1.In this construction, W visits each edge of T ⋆ at most twice, and hence we have We still need to form a path P W (that will visit each vertex only once) using W since W can contain some vertices {x k } k∈[0:K+1] multiple times.Thus, we prune W by removing xkj from W if k i = k j for some i < j.Let (i 0 , . . ., i K+1 ) be the remaining sequence of indices after this pruning operation.The length of the pruned index sequence is K + 2 since it must contain all indices in [0 : K + 1].We also have have i 0 = 0 and i K+1 = K + 1 by the construction of W.
Let ϕ (k) ≜ i k for k ∈ [0 : K + 1].ϕ is a visiting order since ϕ(0) = 0, ϕ (K + 1) = K + 1 and it does not map two different indices in Combining ( 25) -( 27), we conclude that the travel energy consumption of ϕ is less than twice that of the optimum visiting order.

Manuscript received 28
May 2023; revised 31 August 2023; accepted 25 September 2023.Date of publication 10 October 2023; date of current version 17 January 2024.This work was supported by the Australian Research Council under Grant DP200101627.The associate editor coordinating the review of this article and approving it for publication was Q. Wu. (Corresponding author: Dhanushka Kudathanthirige.) [a : b] for a < b, and a, b ∈ Z indicates the set of integer values from a to b including a and b.Moreover, {a r } r∈[a:b] is the set given by {a a , a a+1 , • • • , a b }.

5 :
step energy-to-go function computation JDP,K (x; ϕ) ← x H end − x and π ⋆ K (x) ← x H end − x 6: end for 7: for k = K − 1 : 0 do ▷ Computation of the kth step energy-to-go functions 8:if k ̸ = 0 then S ← G

Algorithm 3 : 2 - 1 : 6 :
Approximation Algorithm for Optimum Visiting Order Input: {x k } k∈[0:K+1] 2: Output: Visiting order ϕ with 2-approximation 3: Step 1: Set V ← {x k } k∈[0:K] and construct a fully connected weighted graph G = (V, E), with edge weights set to Euclidean distances between the vertices.▷ E is the corresponding edge set 4: Step 2: Generate an MST T of G and select x0 as the root vertex.▷ Prim's or Kruskal's algorithm can be used at this step 5: Step 3: Find the nearest vertex x⋆ of T to xK+1 .Connect xK+1 to x⋆ to form a spanning T ⋆ .▷ T ⋆ spans all input locations Step 4: Construct a walk W = (x k0 , . . ., xk L ) on T ⋆ by using depth-first search, starting at x0 , exploring the branch leading to xK+1 at the final step and terminating the walk whenever xK+1 is reached.7: Step 5: Remove duplicates from W, keeping the first occurrence of each vertex.Let i 0 to i K+1 be the resulting sequence of vertex indices in the pruned walk.8: Step 6: Set ϕ (k) ← i k for k ∈ [0 : K + 1].9: Return: ϕ energy consumed by tracing it does not exceed twice that of the minimum travel cost visiting order.We present our 2-approximation algorithm in Theorem 5. Theorem 5: For any given set of locations {x k } K+1 k=0 , where xk is the hovering location for serving kth GN, x0 = x H start , and xK+1 = x H end , the visiting order returned by Algorithm 3 consumes at most twice the minimum UAV travel energy.

Fig. 2 .
Fig. 2. A comparison between the predicted and actual energy consumption of ACG scheme for different Q values.

3 .
By construction, π generates the optimum hovering locations {x ⋆ k } k∈[0:K+1] and the visiting order ϕ ⋆ if s 0 = x H start , [1 : K] .This establishes the reverse inequality J ⋆ DP (s 0 ) ≤ J ⋆ .APPENDIX C PROOF OF THEOREM 3 We count the number of computations needed at each step k ∈ [0 : K].For k = K, we have to calculate J DP,K (s) and π ⋆ K (s) for M N different states (one for each x ∈ G).This computation requires O (M N ) constant time operations.For k ∈ [1 : K − 1], the size of the state space is |S k | = M N K K−k .For each s ∈ S k , we have to perform O (M N (K − k)) constant-time operations (i.e., minimization over each grid point and GN index combination) to construct J DP,k (s) and π ⋆ k (s).Hence, the step k requires O (M N ) 2 K K−k (K − k) computations for k ∈ [1 : K − 1].Finally, for k = 0, we have S 0 = {s 0 }.Calculation of J DP,0 (s 0 ) and π ⋆ 0 (s 0 ) requires O (M N K) constant-time operations.The total of constant-time operations required for algorithm is O (M N ) 2 K2 K in the backward direction.The forward algorithm is just a sequential table lookup, which has only linear complexity in K and therefore does not affect the overall algorithm complexity.APPENDIX D PROOF OF THEOREM 5 Let G be the fully connected graph having {x k } k∈[0:K] as its vertex set.The weight of an edge connecting xi and xj in G is set to the Euclidean distance ∥x i − xj ∥ between xi and xj .Let T be an MST of G and x⋆ be a vertex of G closest to xK+1 , i.e., x⋆ ∈ arg min x∈{x k } k∈[0:K] ∥x K+1 − x∥ .

TABLE I ENERGY
CONSUMPTION OF THE TRAJECTORY DESIGN SCHEMES IN FIG.