Joint Beamforming and Trajectory Optimization for Intelligent Reflecting Surfaces-Assisted UAV Communications

Unmanned aerial vehicles (UAVs) and intelligent reflecting surfaces (IRSs) are anticipated to be widely applied to improve the spectrum and energy efficiency of forthcoming wireless communication systems. To take full advantage of an IRS-assisted UAV system, both the beamforming and the UAV's trajectory should be optimally designed. In this paper, we consider a challenging scenario with one UAV and several IRSs. For this IRSs-assisted UAV system, we formulate the problem of maximizing the received power at the ground user by jointly optimizing the active beamforming at the UAV, the passive beamforming at the IRSs, and the UAV's trajectory over a given flying time. An efficient framework is proposed in which the joint optimization problem is decomposed into three subproblems that are optimized iteratively. In particular, a closed-form expression is derived for updating the phase shifts of the reflecting elements, which helps develop a low-complexity algorithm. Numerical results show that our scheme outperforms the benchmark schemes, which corroborates the feasibility and effectiveness of the proposed algorithm.

, the authors study the optimal 3D placement of UAVs for maximizing the number of covered users to meet different quality-of-service constraints. The authors in [9] investigate path planning and optimal deployment for UAV-assisted communications. The work in [10] optimizes the placement and distribution of UAVs to minimize the delay of the overall system. In [11], the authors jointly optimize the UAV's transmit power and trajectory to fully exploit the controllable channel variations provided by the UAV's mobility. The previous works motivate us to exploit the UAV's mobility to improve service for ground users by dynamically adjusting the UAV's real-time position. Recent measurement results in [12] have shown that the LoS model offers a good approximation for air-to-ground communications. But one prominent challenge is the high probability of blockage caused by common objects in urban areas [3], which severely degrades the channel quality. Besides, path loss [12] is also an important factor of signal degradation.
More recently, the intelligent reflecting surface (IRS) has been considered as a potential solution to address the challenges in UAV communication systems. Recent studies have found that IRSs can be utilized to enhance the capacity and energy efficiency of wireless systems [13]-[15]. Specifically, an IRS usually consists of a large number of low-cost passive elements, which are able to reflect the incident wireless signal with reconfigurable amplitudes and phase shifts [16]. By intelligently tuning the amplitudes and phase shifts of the passive elements, IRSs are able to modify the wireless channel and create a favorable wireless propagation environment.
Several recent works have studied IRSs for terrestrial communication systems. These works mainly fall into two classes, performance analysis [17], [18] and joint beamforming optimization [19], [20], with different design objectives. In [17], the authors investigate the downlink performance of an IRS-assisted MIMO system and derive closed-form expressions for the outage probability and average capacity. The authors in [18] analyze the asymptotic achievable rate in an IRS-assisted downlink system and derive the expected asymptotic symbol error rate. In [19], the authors jointly optimize the active beamforming vector at the BS and the reflection coefficients at the IRS to maximize the received signal power at the user. The joint BS-IRS optimization problem is also investigated in [20] to maximize the energy efficiency in a downlink multi-user scenario. However, most works assume infinite resolution for the phase shift adjustment of the IRS, which is infeasible in practice. Since only low-resolution elements are used in real scenarios, quantization of the phases causes a significant degradation of the system performance [21].

B. MOTIVATIONS AND CONTRIBUTIONS
To cope with the challenges in the UAV communication systems, IRS can be deployed to create multiple links and improve channel gain. Compared to the non-IRS-assisted UAV communication system, IRS-assisted UAV wireless network can effectively enhance the signal reception at the receiver, extend the network coverage, and increase the link capacity. To get benefits of both UAV communications and IRS, IRS-assisted UAV communication system must jointly optimize the UAV's trajectory and the IRS's beamforming due to their inseparable relation [22]. The authors in [6] propose an alternating method to solve the passive beamforming and trajectory optimization. A reinforcement learning approach is proposed in [5] to model the propagation environment to optimize the location and reflection coefficient of UAV-IRS. The authors in [7] analyze the signal gains at UAVs as a function of their heights as well as various IRS parameters. However, there is only one IRS employed in these studies which limits the improvement of system performance. Moreover, quantization of the phase shifts of IRS and the number of antennas at the UAV also need to be considered.
In this paper, we propose a novel IRSs-assisted UAV communication system. The UAV works as an aerial BS and communicates with a user in a finite time horizon. The IRSs are deployed outside building walls to help enhance the communication links. We investigate the joint beamforming and UAV's trajectory design. We maximize the received power at the user by jointly optimizing passive beamforming, active beamforming, and UAV's trajectory. The main contributions of this paper are summarized as follows.
• A novel IRS-assisted UAV communication system framework is proposed, where several IRSs act as passive reflectors to provide configurable reflecting paths between the UAV and the user. We build up several LoS links from the UAV to IRSs and from IRSs to the ground user to overcome blockage in the UAV-assisted communication system. We optimize active beamforming vector, passive beamforming matrices, and UAV's trajectory in a finite time horizon jointly to maximize the received power while considering practical mobility, transmit power, and IRSs' phase shift constraints.
• The original intractable problem is decomposed into three subproblems: active beamforming, passive beamforming, and UAV's trajectory design. A closed-form expression is derived for the IRSs' phase shift matrices. Moreover, the UAV's trajectory is successively updated by finding the optimal trajectory that maximizes a lower bound of the received power at the user.
• Furthermore, we develop an iterative algorithm to optimize active beamforming vector, passive beamforming matrices, and UAV's trajectory. The overall convergence is analyzed. The obtained results show that the joint optimization heavily depends on the locations of IRSs. The joint UAV's trajectory, active and passive beamforming optimization can improve system performance significantly. Moreover, our work provides a low-complexity design guideline for multi-antenna UAV and multiple IRSs.
The remainder of this paper is organized as follows. In Section II, the system model of the IRSs-assisted UAV communication system and the problem formulation are described. In Section III, we propose the joint active beamforming, passive beamforming, and UAV's trajectory design. Finally, simulation results are presented in Section IV before concluding remarks in Section V.
VOLUME 8, 2020
FIGURE 1. IRSs-assisted UAV communication.
Notations: In this paper, scalars are denoted by italic letters. Vectors and matrices are denoted by bold-face letters. $\mathbb{C}^{M \times N}$ denotes the space of $M \times N$ complex-valued matrices. For a complex-valued vector $\mathbf{v}$, $\|\mathbf{v}\|$, $\mathbf{v}^T$, $\mathbf{v}^H$ and $\mathrm{diag}(\mathbf{v})$ denote its $\ell_2$-norm, transpose, conjugate transpose, and a diagonal matrix with each diagonal element being the corresponding element in $\mathbf{v}$, respectively. For a complex-valued scalar $v$, $|v|$ and $\arg(v)$ denote its absolute value and phase, respectively.

II. SYSTEM MODEL AND PROBLEM FORMULATION
A. SYSTEM MODEL
As shown in Fig. 1, we consider a UAV-enabled system where the UAV is employed as an aerial BS to serve the user in the downlink. To compensate for the fast attenuation of signals, the UAV is equipped with multiple antennas for beamforming. When there exists a LoS path from the UAV to the user, the downlink transmission can be reliable and efficient. However, when the LoS path is blocked, the received signal at the user is substantially attenuated. To improve the received power, multiple IRSs are deployed to assist the transmission to the user, which replaces the direct non-LoS (NLoS) link with two connected LoS links by reflecting the signals from the UAV to the user. We divide time into small slots of duration $\delta_t$ so that the location of the UAV and all channels are approximately unchanged within each slot. In each slot, the scheduler optimizes the UAV's trajectory and beamforming according to the information on channel conditions.
We focus on the downlink transmission of the IRS-assisted UAV system. The signal from the UAV can reach the user directly or be reflected by the IRSs. We consider only signals that are reflected once by an IRS; signals reflected two or more times are ignored because of their severe attenuation [23]. If $K$ IRSs are employed in the system, the received signal at the user in the $n$th time slot can be expressed as where $s$ is the transmitted signal modeled as a random variable with zero mean and unit variance, and $\eta$ denotes the additive white Gaussian noise with zero mean and variance $\sigma^2$. $\mathbf{w}[n] \in \mathbb{C}^{N_t \times 1}$ is the beamforming vector of the UAV in the $n$th time slot and $N_t$ is the number of transmit antennas at the UAV. $\mathbf{G}[n]$ is the channel between the UAV and the user in the $n$th time slot. $\mathbf{H}_k[n]$ and $\mathbf{g}_k[n]$ represent the channels from the UAV to the $k$th IRS and from the $k$th IRS to the user in the $n$th time slot, respectively.
$\mathbf{\Theta}_k[n]$ is the phase shift matrix of the $k$th IRS in the $n$th time slot, which can be expressed as
$$\mathbf{\Theta}_k[n] = \mathrm{diag}\left(e^{j\theta_{k1}[n]}, e^{j\theta_{k2}[n]}, \cdots, e^{j\theta_{kM}[n]}\right),$$
where $\theta_{km}[n]$ is the phase shift associated with the $m$th element of the $k$th IRS in the $n$th time slot and $M$ is the number of reflecting elements of each IRS. In practical implementations, the phase shift of each element needs to be quantized to $B$ bits of precision, which is denoted by
$$\theta_{km}[n] \in \left\{0, \frac{2\pi}{2^B}, \cdots, \frac{2\pi\left(2^B - 1\right)}{2^B}\right\}.$$
The position of the UAV plays an important role in channel modeling. Consider a 3D Cartesian coordinate system. The UAV is assumed to fly at a fixed altitude $z_U$ above the ground for the whole system time. The trajectory of the UAV can be approximated by the sequence $\mathbf{q}[n] = [x[n], y[n]]^T$, where $x[n]$ and $y[n]$ are the horizontal coordinates of the UAV. Due to the constraints on the maximum speed and on the initial and final locations of the UAV, the trajectory of the UAV is subject to the following mobility constraints where $D_{\max} = V_{\max}\delta_t$ is the maximum horizontal distance that the UAV can travel in each time slot, $V_{\max}$ is the maximum UAV speed in meters per second (m/s), and $\mathbf{q}_0$ and $\mathbf{q}_F$ are the initial and final horizontal locations, respectively.
We assume that the user is equipped with a single antenna and that the horizontal coordinates of the user are fixed at $\mathbf{w}_G \in \mathbb{R}^{2 \times 1}$. As a result, the distance between the UAV and the user in the $n$th time slot is given by $d_{UG}[n] = \sqrt{z_U^2 + \|\mathbf{q}[n] - \mathbf{w}_G\|^2}$. In practical implementations, the LoS path between the UAV and the user may be blocked [3] due to various obstacles in the complex urban environment; however, the wireless channel is rich in scatterers [6]. Using $d_{UG}[n]$, the channel from the UAV to the user can be modeled as where $\rho$ denotes the path loss at the reference distance $d_0 = 1$ m and $\kappa$ is the path loss exponent of the UAV-user link. Each element of $\tilde{\mathbf{h}}$ is an independent and identically distributed (i.i.d.) complex Gaussian random variable with zero mean and unit variance. According to [12], the air-to-ground communication channels are mainly dominated by the LoS links.
Thus, we assume for simplicity that the channel quality depends mainly on the UAV-IRS distance. In addition, the Doppler effect caused by the UAV's mobility is assumed to be perfectly compensated at the receivers [24]. Thus, $\mathbf{H}_k[n]$ can be expressed as [6] where $d_{UI,k}[n]$ is the distance between the UAV and the $k$th IRS in the $n$th time slot. All IRSs are equipped with a uniform linear array (ULA) of $M$ reflecting elements. The first element of each IRS is regarded as a reference point, and the altitude and horizontal coordinates of the reference point of the $k$th IRS are denoted by $z_k$ and $\mathbf{w}_k = [x_k, y_k]^T \in \mathbb{R}^{2 \times 1}$, $k \in \mathcal{K} = \{1, 2, \cdots, K\}$, respectively. Each IRS is usually attached to a smart controller that controls the phase shifts of all reflecting elements in real time [25].
where $\mathbf{a}_M(\beta_k[n])$ and $\mathbf{a}_{N_t}(\alpha_k[n])$ are the array responses in the $n$th time slot, and can be given respectively as where $d$ is the antenna separation, $\lambda$ is the carrier wavelength, and $\beta_k[n]$ and $\alpha_k[n]$ are the cosines of the angle-of-arrival (AoA) and angle-of-departure (AoD), respectively. Different from $\mathbf{G}[n]$ and $\mathbf{H}_k[n]$, both the LoS component and NLoS components exist in the channels from the IRSs to the user. Therefore, $\mathbf{g}_k[n]$ is modeled with Rician fading [26] as where $d_{IG,k} = \sqrt{z_k^2 + \|\mathbf{w}_k - \mathbf{w}_G\|^2}$ is the distance between the $k$th IRS and the user, $\alpha$ is the path loss exponent of the IRS-user link and $M_1$ is the Rician factor. The LoS component is expressed by the response of the ULA, where $r_k$ is the cosine of the AoD of the signal from the ULA at the $k$th IRS to the user. $\tilde{\mathbf{g}}_k[n] \in \mathbb{C}^{M \times 1}$ is the NLoS component, each element of which is an i.i.d. complex Gaussian random variable with zero mean and unit variance. In this paper, we assume that all the channel state information can be obtained by existing channel estimation techniques, such as those in [15], [27].
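As a rough illustration of the channel models described in this subsection, the following Python sketch builds a ULA array response, a rank-one LoS UAV-to-IRS channel, and a Rician IRS-to-user channel. The function names, the sign convention of the array response, the default reference path loss `rho = 1e-2`, and the normalization of the Rician mixture are assumptions made for this example; they are not taken from the paper's equations.

```python
import numpy as np

def ula_response(n_elements, cos_angle, spacing_over_wavelength=0.5):
    """Assumed ULA response [1, e^{-j*2*pi*(d/lambda)*phi}, ...]^T, phi = cosine of the angle."""
    idx = np.arange(n_elements)
    return np.exp(-1j * 2 * np.pi * spacing_over_wavelength * idx * cos_angle)

def los_channel(dist, n_irs, n_tx, cos_aoa, cos_aod, rho=1e-2, exponent=2.0):
    """Rank-one LoS channel (n_irs x n_tx): path-loss amplitude times a_M a_Nt^H."""
    amp = np.sqrt(rho * dist ** (-exponent))
    a_rx = ula_response(n_irs, cos_aoa)
    a_tx = ula_response(n_tx, cos_aod)
    return amp * np.outer(a_rx, a_tx.conj())

def rician_channel(dist, n_irs, cos_aod, k_factor, rng, rho=1e-2, exponent=2.2):
    """Rician channel: weighted sum of a LoS response and an i.i.d. CN(0, 1) part."""
    amp = np.sqrt(rho * dist ** (-exponent))
    los = ula_response(n_irs, cos_aod)
    nlos = (rng.standard_normal(n_irs) + 1j * rng.standard_normal(n_irs)) / np.sqrt(2)
    return amp * (np.sqrt(k_factor / (1 + k_factor)) * los
                  + np.sqrt(1 / (1 + k_factor)) * nlos)

rng = np.random.default_rng(0)
H = los_channel(dist=100.0, n_irs=4, n_tx=3, cos_aoa=0.3, cos_aod=-0.2)
g = rician_channel(dist=80.0, n_irs=4, cos_aod=0.5, k_factor=10 ** 0.3, rng=rng)
```

In this sketch every entry of `H` has the same magnitude, since a pure LoS channel differs across elements only in phase, whereas the entries of `g` fluctuate because of the scattered component.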

B. PROBLEM FORMULATION
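Our goal is to maximize the received signal power at the ground user by jointly designing the active beamforming vector $\mathbf{w} = \{\mathbf{w}[n], n \in \mathcal{N}\}$, the phase shift matrices $\mathbf{\Theta} = \{\mathbf{\Theta}_k[n], k \in \mathcal{K}, n \in \mathcal{N}\}$, and the UAV's trajectory $\mathbf{Q} = \{\mathbf{q}[n], n \in \mathcal{N}\}$. Accordingly, the average signal power received at the user over $N$ time slots is given by Mathematically, the corresponding problem can be formulated as where $P_t$ represents the maximum transmit power at the UAV.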
(13a) and (13b) denote the IRSs' phase shift constraints and the maximum transmit power constraint at the UAV, respectively. (4) and (5) represent the UAV's mobility constraints. Problem (P1) is a non-convex optimization problem, which is challenging to solve because the objective function is non-concave with respect to $\mathbf{w}$, $\mathbf{Q}$, and $\mathbf{\Theta}$. Furthermore, practical IRSs implement only discrete phase shifts for each element. In general, there is no standard method for solving such a non-convex optimization problem.

III. JOINT BEAMFORMING AND TRAJECTORY DESIGN OF IRSs-ASSISTED UAV SYSTEM
To make Problem (P1) more tractable, we first relax the discrete optimization variables to continuous values. Thus, Problem (P1) can be modified as We find that Problem (P2) is still non-convex, which may lead to intractable complexity in obtaining the optimal solution. In every time slot, the system jointly performs active beamforming, passive beamforming and UAV's trajectory design. Therefore, we decompose the optimization problem into several subproblems and solve it by iteratively optimizing the variables until convergence is reached.

A. ACTIVE BEAMFORMING AND PASSIVE BEAMFORMING
For any fixed $\mathbf{Q}$ and $\mathbf{\Theta}$, it can be verified that maximum ratio transmission (MRT) is the optimal beamforming solution to Problem (P2) [28]. Thus, the optimal active beamforming vector in the $n$th time slot is given by By substituting $\mathbf{w}^{opt}$ into (III), Problem (P2) can be simplified as the following equivalent problem (17) where $\mathrm{Re}\{\cdot\}$ denotes the real part. According to (6) and (7), the first and second items of (17) can be derived as The following two lemmas, proved in the Appendixes, provide some insights on the massive MIMO channel.

Lemma 1: The product of $\mathbf{H}$ and $\mathbf{H}^H$ can be approximated as
when the number of antennas is large enough.
Using Lemma 1, $\mathbf{A}\mathbf{H}\mathbf{H}^H\mathbf{A}^H$ in (17) can be further simplified as By substituting (18), (19) and (21) into (17), we can rewrite (17) as Based on the above discussion, we have the following lemma, proved in Appendix B.
Lemma 2: For fixed Q and w, the optimal reflecting elements of passive beamforming to maximize the nth item of the objective function of Problem (P3) are given by
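The two closed-form steps of this subsection, MRT at the UAV and phase alignment at an IRS, can be illustrated with the short Python sketch below. It is not the paper's expression (23): it uses the generic co-phasing rule for a single IRS with a direct link, random stand-in channels, and the hypothetical helper names `mrt_beamformer`, `align_phases` and `crandn`.

```python
import numpy as np

def mrt_beamformer(h_eff, p_t):
    """MRT for an effective row channel h_eff of shape (N_t,), so that y = h_eff @ w."""
    return np.sqrt(p_t) * h_eff.conj() / np.linalg.norm(h_eff)

def align_phases(g, H, w, direct):
    """Phases in [0, 2*pi) that co-phase each reflected term with the direct term."""
    reflected = g.conj() * (H @ w)      # per-element term before the IRS phase shift
    return np.mod(np.angle(direct) - np.angle(reflected), 2 * np.pi)

rng = np.random.default_rng(1)

def crandn(*shape):
    """i.i.d. CN(0, 1) samples used as stand-in channels (assumption)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

n_t, m, p_t = 8, 16, 1e-2
h_d, g, H = crandn(n_t), crandn(m), crandn(m, n_t)   # direct, IRS-user, UAV-IRS

w0 = mrt_beamformer(h_d.conj(), p_t)                 # start from MRT on the direct link
theta = align_phases(g, H, w0, h_d.conj() @ w0)      # passive beamforming for fixed w0
h_eff = h_d.conj() + (g.conj() * np.exp(1j * theta)) @ H
w = mrt_beamformer(h_eff, p_t)                       # MRT on the combined channel
power = np.abs(h_eff @ w) ** 2                       # equals p_t * ||h_eff||^2
```

With the aligned phases, all reflected terms add in phase with the direct term for the beamformer `w0`, and the final MRT step uses the full transmit power on the combined channel.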

B. UAV TRAJECTORY OPTIMIZATION
Based on the active and passive beamforming in Section III-A, the $n$th item of the objective function of Problem (P3) can be further rewritten as where in (b), we define Based on the above discussions, for fixed active beamforming vector $\mathbf{w}$ and passive beamforming matrix $\mathbf{\Theta}$, Problem (P3) can be reformulated as s.t. (4), (5). There are only two constraints on the UAV's trajectory in Problem (P4); however, the optimization problem is still difficult to solve. Thus, we relax the variables $d_{UG}[n]$ and $d_{UI,k}[n]$ in (III-B) and have the following constraints After the relaxation, Problem (P4) can be converted to With the following lemma, proved in Appendix C, we solve the optimization problem.
Note that the above discussion allows us to apply the successive convex approximation (SCA) technique to solve the optimization problem. The first-order Taylor expansion is used to obtain the approximation. Then, Problem (P5) can be reformulated as where the expressions of $R_0[n]$ and $S_{0,k}[n]$ are presented at the bottom of this page. Problem (P6) has a convex objective function with convex constraints, and can be efficiently solved by standard convex optimization methods, such as the interior-point method [29]. It is worth noting that the optimal objective value obtained from the approximate Problem (P6) in general serves as a lower bound of that of Problem (P2).
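The SCA step relies on the fact that a differentiable convex function is bounded from below by its first-order Taylor expansion at any point. The Python sketch below checks this numerically for a stand-in convex function $f(x) = x^{-a}$ on $x > 0$; the function, the exponent `a` and the expansion point are assumptions chosen for illustration and are not the paper's objective.

```python
import numpy as np

def f(x, a=2.0):
    """Stand-in convex function on x > 0 (not the paper's objective)."""
    return x ** (-a)

def taylor_lower_bound(x, x0, a=2.0):
    """First-order Taylor expansion of f around the local point x0."""
    return f(x0, a) - a * x0 ** (-a - 1.0) * (x - x0)

x = np.linspace(0.2, 10.0, 500)
gap = f(x) - taylor_lower_bound(x, x0=2.0)   # non-negative for a convex f
```

The gap vanishes at the expansion point, so maximizing the affine surrogate and re-expanding at the new point cannot decrease the true value, which is the usual argument for the monotone behavior of SCA.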
So far, we have assumed that continuous phase shifts are available for each reflecting element. However, this assumption is not always feasible since manufacturing each reflecting element with more levels of phase shifts incurs a higher cost. To address the limited phase shift resolution, the phase of each element of $\mathbf{\Theta}$ is quantized to $B$ bits of precision by mapping it to its nearest neighbor in terms of the shortest Euclidean distance.
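A minimal Python sketch of this quantization step is given below. It assumes that the $2^B$ levels are uniformly spaced on $[0, 2\pi)$ and that the nearest level is found with wrap-around, which coincides with the nearest point on the unit circle; the function name `quantize_phases` is hypothetical.

```python
import numpy as np

def quantize_phases(theta, bits):
    """Map continuous phases to the nearest of 2**bits uniform levels on [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    # Round to the nearest level index; index `levels` wraps around to level 0.
    idx = np.round(np.mod(theta, 2 * np.pi) / step) % levels
    return idx * step

theta = np.array([0.0, 0.1, np.pi, 6.2])
q = quantize_phases(theta, bits=2)   # levels are 0, pi/2, pi, 3*pi/2
```

Under this rule the phase error of each element is at most half a quantization step, i.e. $\pi / 2^B$.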

Based on the results presented in the previous subsections, we propose an overall iterative algorithm for Problem (P1), which is shown in Algorithm 1.
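The overall structure of such an iterative scheme can be sketched in Python as a generic alternating maximization loop. The stopping rule used here (absolute increase of the objective below a threshold `xi`) and the toy two-variable objective are assumptions for illustration; the sketch does not reproduce the steps of Algorithm 1.

```python
def alternating_maximize(objective, updates, x0, xi=1e-4, max_iter=100):
    """Apply each block update in turn until the objective gain drops below xi."""
    x = dict(x0)
    prev = objective(x)
    history = [prev]
    for _ in range(max_iter):
        for update in updates:
            x = update(x)            # each update maximizes over one block of variables
        cur = objective(x)
        history.append(cur)
        if cur - prev < xi:          # assumed stopping rule
            break
        prev = cur
    return x, history

# Toy concave objective with exact block-wise maximizers (illustration only).
objective = lambda x: -(x["a"] - 1.0) ** 2 - (x["b"] - x["a"]) ** 2
update_a = lambda x: {**x, "a": (1.0 + x["b"]) / 2.0}
update_b = lambda x: {**x, "b": x["a"]}

x_final, history = alternating_maximize(objective, [update_a, update_b],
                                        {"a": 0.0, "b": 0.0})
```

Because every block update maximizes the objective over its own variables, the recorded objective values are non-decreasing, which is the property that convergence arguments for alternating schemes typically rely on.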

IV. NUMERICAL RESULTS
In this section, we present numerical results to validate our proposed design. The user is located at (0, 70 m). The initial and final horizontal coordinates of the UAV are set as (−500 m, 20 m) and (500 m, 20 m), respectively, and the altitude of the UAV is 80 m [6]. We assume that there are two IRSs in our simulation; the number of IRSs can be extended to a general value. The two IRSs are located at (−200 m, 0) and (200 m, 0), and their altitudes are both set as 40 m. The signal attenuation at a reference distance of 1 m is set as 20 dB. The path loss exponents of the UAV-user, UAV-IRS and IRS-user channels are set to 3.5, 2 and 2.2, respectively. Other parameters are as follows: $P_t = 10^{-2}$ W, $d = \lambda/2$, $M_1 = 3$ dB, $\xi = 10^{-4}$, $V_{\max} = 25$ m/s, $\delta_t = 1$ s and $B = 4$.
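With these parameters, the shortest feasible flying time and the mobility constraints can be checked with a few lines of Python. The helper names and the convention that the first and last trajectory points coincide with the initial and final locations are assumptions made for this sketch.

```python
import numpy as np

q0 = np.array([-500.0, 20.0])        # initial horizontal location (m)
qF = np.array([500.0, 20.0])         # final horizontal location (m)
v_max, delta_t = 25.0, 1.0           # maximum speed (m/s) and slot duration (s)
d_max = v_max * delta_t              # maximum distance per slot (m)

def min_flying_time(q_start, q_end, v_max):
    """Time needed to fly in a straight line at maximum speed."""
    return np.linalg.norm(q_end - q_start) / v_max

def satisfies_mobility(traj, q_start, q_end, d_max, tol=1e-9):
    """Check the endpoint and per-slot distance constraints of a trajectory."""
    steps = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    return bool(np.allclose(traj[0], q_start) and np.allclose(traj[-1], q_end)
                and np.all(steps <= d_max + tol))

t_min = min_flying_time(q0, qF, v_max)   # 1000 m at 25 m/s
straight = np.linspace(q0, qF, 41)       # 40 slots of 1 s along a straight line
```

A straight-line trajectory with fewer than 40 slots would require steps longer than `d_max` and therefore violates the speed constraint.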
Our proposed scheme consists of three parts, i.e., active beamforming, passive beamforming and UAV's trajectory design (referred to as ActivePassiveQ). To assess the long-term system performance of our proposed algorithm, we compare it with three benchmark policies.
• Active beamforming and passive beamforming without UAV's trajectory design (referred to as ActivePassiveWithoutQ)
• Active beamforming without passive beamforming and UAV's trajectory design (referred to as ActiveWithoutPassiveQ)
• Without active beamforming, passive beamforming and UAV's trajectory design (referred to as WithoutActivePassiveQ)
Note that, without trajectory optimization, the UAV flies in a straight line from the initial location to the user and then to the final location if time permits.
In Figure 2, the convergence behavior of the proposed algorithm is plotted. The number of reflecting elements is set as $M$ = 50, 100, 150 for different scenarios. It is observed that at most three iterations are needed for convergence when $M$ = 50, 100, and that the number of iterations needed for convergence increases to six for $M$ = 150. All results suggest a low-complexity implementation in practice. This is because closed-form expressions for active beamforming and passive beamforming are derived in the proposed approach, which reduces the convergence time.
Figure 3 shows the received power versus $M$ with $N_t$ = 16, 32, 64. Clearly, the received power increases substantially when more antennas are deployed at the UAV, which illustrates that deploying large-scale antenna arrays is an effective way to boost the system performance. Furthermore, increasing the number of IRS reflecting elements can also significantly improve the received power, especially when $M$ is large. This shows that the number of reflecting elements plays an important role in achieving a substantial gain. When $M$ is large, the performance gap between $N_t$ = 16, 32, 64 tends to be larger. Therefore, it is important to take into account the number of antennas and the number of reflecting elements jointly.
Figure 4 compares the received power of the four designs versus the number of IRS reflecting elements. From the figure, the received power of our proposed design, i.e., ActivePassiveQ, increases quadratically with the number of IRS reflecting elements. The performance gap between our design and the other schemes is quite large, which shows that our proposed design outperforms them. This further demonstrates that the performance can be effectively improved by applying the joint active beamforming, passive beamforming and UAV's trajectory design.
Figure 5 shows the received power versus the flying time $T$ for our proposed algorithm and the benchmark algorithms. It is observed that the received power of our algorithm increases significantly with $T$. This is because, with a sufficiently large $T$, the UAV can reach its most favorable location to serve the ground user. Also, the deployment of IRSs plays an important role in performance improvement. Furthermore, it is worth pointing out that trajectory adaptation is more significant: the received power gap caused by the trajectory design is much larger than that caused by passive beamforming and active beamforming. The performance gap tends to be constant when $T$ becomes large.
Figure 6 shows the trajectories of the UAV for different values of the flying time $T$ and of the number of reflecting elements $M$. It is observed that $T$ = 40 s is the minimum time required for the UAV to fly from the initial location to the final location in a straight line at the maximum speed. As $T$ increases, the trajectories under different settings become quite different. Because of the influence of the IRSs, the UAV tries to fly as close as possible to the IRSs to enhance the received signal strength. When $M$ becomes larger, the UAV tends to spend more time flying around the IRSs, because increasing the number of IRS reflecting elements contributes more to the performance.

V. CONCLUSION
This paper investigates the joint design of active beamforming, passive beamforming, and the UAV's trajectory for the IRS-assisted UAV communication system. Specifically, we enhance the received power by adjusting the UAV's trajectory and intelligently designing the beamforming matrices, which leads to a new joint optimization framework. By applying the successive convex approximation technique, an efficient algorithm is proposed to solve the joint optimization problem. Furthermore, we have found that the combination of a UAV and IRSs is quite significant for improving system performance.

APPENDIXES
APPENDIX A PROOF OF THE LEMMA 1
In (37), the main diagonal element can be derived as
The off-diagonal elements of the matrix $\mathbf{H}\mathbf{H}^H$ can be approximated as 0 when the number of antennas is large enough [30].

APPENDIX B PROOF OF THE LEMMA 2
Note that for fixed $\mathbf{Q}$ and $\mathbf{w}$, only the first item of (22) is independent of the phase shifts $\theta_{km}[n]$. Therefore, it can be observed that the optimal value of $\theta_{km}[n]$ to maximize the second item of (22) is given by Similarly, as for the third item, $\mathbf{A}_k\mathbf{X}_k\mathbf{A}_k^H$ can be written as For fixed UAV trajectory $\mathbf{Q}$ and active beamforming vector $\mathbf{w}$, we can find that the optimal value of $\theta_{km}[n]$ to maximize the third item of (22) should satisfy Note that the conditions that maximize the second and third items of (22), i.e., (41) and (43), can be satisfied at the same time. In other words, (23) is the solution to the simultaneous equations (41) and (43). As a result, the optimal reflecting elements of passive beamforming to maximize the $n$th item of the objective function of Problem (P3) are given by (23).

APPENDIX C PROOF OF THE LEMMA 3
Abstracting from $p_{\text{slack}}[n]$, $f(x, y_1, y_2, \cdots, y_K)$ can be expressed as shown at the bottom of this page.
First, when $K = 1$, the function is given by The first-order and second-order partial derivatives of $f(x, y_1)$ are expressed as Then, we can obtain the Hessian of $f(x, y_1)$, which is denoted by Based on (48)-(50), the determinant of $\nabla^2 f$ is given by Thus, when $K = 1$, $f$ is a convex function.
The first-order and second-order partial derivatives of $z(x, y)$ are given by The Hessian of $z(x, y)$ is Then, the determinant of $\nabla^2 z$ is given by Therefore, when $K = n + 1$, the function is also a convex function. In conclusion, $f(x, y_1, y_2, \cdots, y_K)$ is a convex function by induction.