Multi-UAV Assisted Multi-Tier Millimeter-Wave Cellular Networks for Hotspots With 2-Tier and 4-Tier Network Association

The evolution of the fifth-generation (5G)/ beyond 5G (B5G) mobile communication networks greatly depends on the seamless integration of terrestrial and sky networks over advanced multi-tier heterogeneous network (HetNet) infrastructure. By using the huge potential of unmanned aerial vehicles (UAVs) and mm-Wave, we develop a novel and practical multi-UAV assisted multi-tier HetNet model which consists of clustered aerial UAV base stations (U-BSs) and Poisson point process (PPP) based ground BSs (G-BSs) for emergency and crowded hotspot area. In a typical cluster, both ground user equipments (GUEs) and UAVs are modeled as Poisson cluster process (PCP) and scattered around the common cluster center. Specially, the analysis is conducted by adding two additional tiers which consist of multiple intra-cluster U-BSs and single G-BS in representative cluster so that the point sets of U-BSs and G-BSs are partitioned into two sub-sets, respectively, and a 4-tier network model is formed. To find the contribution of the inter-cluster associations on the performance of networks, we construct 2-tier and 4-tier association schemes. The 2-tier association allows a GUE associated only with intra-cluster U-BSs and G-BS, while the 4-tier association allows all inter/intra-cluster U-BSs and G-BSs can be associated. Then, according to stochastic geometry methods, statistical description of the downlink interferences experienced at a typical GUE is derived. This derivation yields the downlink coverage probabilities and total network throughput are obtained. It is achieved that association probabilities greatly depend on the distribution of daughter point process. UAVs’ height and the distributions of U-BSs and GUEs impose non-monotonous impact on coverage probabilities. Due to severe inter-cluster interference, the achievable average area throughput is not always increasing with SINR so that there is an optimal SINR threshold. In addition, our analysis finds the proposed 4-tier association scheme outperforms the simple 2-tier one in terms of the average area throughput.

relays in cooperative cellular networks, which is an effective technique for performance improvement and coverage expansion under weak channel conditions. While serving as aerial mobile relays, UAVs are able to flexibly adjust locations to gain more favorable channel condition. With the method from independent homogeneous Poisson point process (PPP) [10], in [11]- [14], the UAV-assisted mm-Wave relay network was considered.
In addition to aerial mobile relay applications, the works [15]- [23] considered UAVs as aerial mobile BSs to assist the terrestrial cellular network communications over mm-Wave, where UAV-network is overlaid over terrestrial cellular networks, referred to multi-tier heterogeneous network. In such UAV-assisted multi-tier HetNets, because of the increasing irregularity in U-BSs/G-BSs locations, the methods from stochastic geometry and spatial point process theory have been widely used to achieve accurate modeling and tractable analysis [10]. However, although assuming the locations of U-BSs/G-BSs and user equipments (UEs) as independent PPPs is tractable, it is not rich enough in capacity for the case with spatial coupling between G-BSs/U-BSs and UEs in some practical implementation, especially, for UAV-assisted emergency and hotspot communications. That is to say, it has been recently shown that Poisson cluster process (PCP) [1]- [3] based models are well-suited to capture the aforementioned spatial coupling. By using Matern hardcore point process (MHP), the UAV-assisted multi-tier mm-Wave network was investigated in [15] by YX. et al., where aerial U-BSs were modeled as MHP and general model as Poisson point process (PPP). With 3D antenna gain model, authors proposed transmission jamming strategy to improve the network performance. Unlike the work [15], in works [16], [17] both aerial U-BSs and G-BSs were modeled as independent PPPs. The locations of GUE were modeled as PCP whose parent process was projections of aerial UAVs. They exploited the coupling between U-BSs and GUEs.
In works [18], [19], WQ Yi et al. also investigated the UAV-assisted mm-Wave HetNets by specially considering the coupling between U-BSs/G-BSs and UEs. Moreover, the spatial models of the network elements' locations are completely same as the ones in [16], [17]. The main contributions of the works [18], [19] are that the authors firstly proposed a novel 3D analytical framework for UAV-assisted cellular network by invoking a tractable 3D blockage model and a 3D sectored antenna pattern, in which the altitude of all transceivers and the cluster property of GUEs were jointly considered. At the same time, both the uplink and downlink were simultaneously investigated. It is interested that the similar model was also used in [20], [21]. However, it is unfortunate that the works [20], [21] focused on conventional sub-6GHz frequency band without the consideration on mm-Wave. In practice, under sub-6GHz scenario, the contributions for the UAV-assisted cellular networks can also be found in [22]- [25], where the locations of network elements were modeled as PPP without the exploitation of coupling among network elements. Specially, the work [23] considered a VOLUME 8, 2020 multi-UAV aerial network model where each cluster consists of non-homogeneous UAVs. In each tier, the UAVs were distributed according to PPP. In [26] and [27], J.H Lee and Y.C Ko et al have explored the UAV-assisted communications.
Combing multi-tier HetNets and the method from stochastic geometry [10], an enhanced method was been developed in [28]- [30] for the cellular networks with user-centric small-cell deployment. In this method, to distinguish the difference between the distributions of user-centric and user-edge distances, the points of a given point process is portioned into two sets: 1) the single point x 0 in the representative cluster, 2) the points in other clusters \x 0 . By using Slovnyak's theorem, it can be argued that \x 0 has the same distribution as . As a result, an additional tier is formed. This further yields that a two-tier network model is developed, of which the first tier consists of the single point x 0 and the second one consists of the remaining points \x 0 . Motivated by the works [28]- [30], this work extends the method to the scenario where there are multiple points in the representative cluster.

C. MOTIVATIONS AND CONTRIBUTIONS
Summarizing above literature reviews, 5G/B5G networks are expected to accommodate a huge number of mobile users for such emergency or hotspots with high QoS in terms of high data rate, low latency, and high reliability. To meet these heightened requirements, the different types of spatial coupling across the locations of BSs and UEs are exploited in user-centric deployment. Therefore, to effectively capture the UE-BS coupling and the property of user-centric deployment, the cluster-based model was adopted widely in literature such as in [1]- [3] for terrestrial networks, which modeled the geographical centers of UE hotspots as independent PPP, around which the BSs and UEs form clusters independently with different distributions. It was found that the cluster-based method is a more accurate model than PPP for such hotspot communication scenarios.
However, for the UAV-assisted multi-tier mm-Wave cellular networks, the cluster-based aerial deployment of UAV small cells is still in its infancy. Specially, the following problems are not clear so far, but not limited to these. Firstly, although the very valuable work [15] considered UAVs as mobile aerial BSs, it only modeled the locations of UAVs as independent PPP without exploiting the coupling between UEs and U-BSs. It is difficult to meet users' QoS by using the method in [15] under the case such as emergency and crowded hotspot areas. Secondly, different from [15], in the latest works [16]- [19] authors investigated UAV-assisted mm-Wave HetNets under the hotspots where the projects of U-BSs on ground were modeled as cluster center and the ground users were scattered around the cluster center. Unfortunately, while the coupling between ground users and aerial UAVs was exploited, the works [16]- [19] did not attempt to further improve the capacity of networks for hotspots by deploying more UAVs that are also scattered around the center of hotspots. In practice, to evidently enhance the capacity of networks and decrease the load of G-BSs under emergency and hotspot scenarios, it is a preferred choice to place more UAVs by using cluster-based scheme and to model both the users and UAVs scattered around the common cluster center as used in existing works for ground communication networks [1]- [3]. As a result, the users' performance can be greatly enhanced via the utilization of the diversity of UAVs. Thirdly, unlike the aforementioned single or few UAV-assisted communications, the investigation on the deployment of UAV small cell networks is also for emergency, especially, in the case of natural disasters such as earthquakes and flooding where more UAVs are placed in sky to construct the aerial small cell to complement the destroyed terrestrial BSs. The multi-UAV small cell networks are generally characterized by the multi-UAV's clusters and both UAVs and users scattered around the common cluster center in each independent cluster. Fourthly, although the mm-Wave was used in works [12]- [14] and [16]- [19], these works did not consider the multi-UAV small cell networks over the promising mm-Wave frequency band. Therefore, motivated by the above reviews, by using the cluster-based method, this work focuses on multi-UAV assisted mm-Wave multi-tier HetNets for hotspots. The main contributions are summarized as follows.
1) Over mm-Wave frequency, we develop a novel and practical multi-UAV assisted cellular network model which consists of clustered aerial U-BSs and PPP-based G-BSs for emergency and crowded hotspots. The multi-UAV assisted HetNets consists of multiple clusters, and in each cluster both ground users and aerial UAVs are clustered and scattered around the common cluster center. By using the idea from PCP, we develop an accurate and effective multi-tier heterogeneous model where the locations of GUEs and aerial U-BSs are respectively modeled as independent PCP. The PPP distributed G-BSs are the parent point process of the natural coupling across the aerial U-BSs and the center of ground hotspots, but also the ones of GUEs. 2) By using the method from cluster process and property of mm-Wave signal propagation, we comprehensively characterize the statistical descriptions of aerial/ground intra-cluster and inter-cluster distances from a typical GUE to its serving U-BS/G-BS and to other interfering U-BSs/G-BSs in a Thomas cluster process and independent Rayleigh/Rician distribution. For the tractability of mathematical analysis, by referring the existing result, the intra-cluster distances are modeled Rayleigh distribution, but the inter-cluster ones are modeled Rician distribution. These yield the complementary cumulative distribution functions CCDFs) and probability density functions (PDFs) of path losses are derived with closed-form expressions. 3) By adding two additional tiers which consist of multiple intra-cluster U-BSs and single G-BS in representative cluster, the 4-tier network model is used for the analysis of the considered multi-UAV assisted mm-Wave HetNets. To find the advantage of the 4-tier network model, we construct two types of association schemes. We construct the GUE association strategy which is based on the strongest long-term biased received power (BRP) [31], [32] and achieve the probabilities that a typical GUE is associated with a G-BS/U-BS in each tier. With the 4-tier network model, it is very convenient to quantify the effect and contribution of inter-cluster U-BSs/G-BSs on the performance of the UAV-assisted mm-Wave HetNets. 4) The mm-Wave signals suffer from not only large-scale fading, but also small-scale fading. Considering the short distances of intra-cluster transceivers and long distances of inter-cluster transceivers as well as obstacles, the intra-cluster links are modeled as LoS and non-line-of-sight (NLoS) two states' mode, but the inter-cluster links are modeled as signal NLoS state. The NLoS approximation for the inter-cluster links is effective for the two cases. The first is the case where the density of cluster centers is small, and the second one is the hotspot case where the UEs are heavily scattered around the cluster center. Under the two cases, the probability that the UEs are located in the adjacent clusters is small. As a result, the NLoS approximation for the inter-cluster links is reasonable for these cases. For small-scale fading, the general normalize Nakagami-m model is adopted. With the modes, the Laplace transforms (LTs) of the interferences experienced by a typical GUE from the intra-cluster and inter-cluster U-BSs/G-BSs are derived. The resulting derivations lead to the coverage probabilities of downlinks signal-to-interference noise ratio (SINR) and the network throughput.
Finally, for a quick reference, a summary list of variables and notations employed this paper is provided in Table 1. At the same time, to quickly understanding the developed 4-tier model, the definitions of the k-th tier, k = {0, 1, 2, 3}, are given here. As commonly used in literature, the investigation focused on a representative cluster that is selected randomly from all clusters, and a typical GUE is selected in the representative cluster. Thus, the 0-th tier consists of the UAVs in the representative cluster, the 1-st tier consists of the UAVs in other cluster, the 2-nd tier consists of the only G-BS that is the center of the representative cluster, and the 3-rd tier consists of all other G-BSs that are the centers of other clusters.

A. SPATIAL DESCRIPTION
We consider a mm-Wave heterogeneous cellular network, where macro cells and small cells consist of G-BSs and aerial U-BSs, respectively. As commonly adopted, the locations of macro G-BSs are modeled as a homogeneous PPP G−BS with density λ G−BS . At the same time, unlike previous works, this work focuses on more realistic ground hotspots such as sporting events, concerts, emergencies, etc. In this communication scenario, the ground users are scattered around the G-BSs with distribution parameter σ u . These kinds of scenarios cause a sudden data rate surge so that the existing ground network can not meet the deluge of data transmission. Therefore, to offload the G-BSs under such hotspots, VOLUME 8, 2020 it is assumed that the aerial U-BSs are deployed clusterly and the projections of U-BSs on the Euclidean plane are scattered around the G-BSs. Therefore, on the Euclidean plane, the locations of U-BSs can be modeled by a PCP with parent point process G−BS . Since the aerial U-BSs are deployed in hotspot area, they are expected to be closer to each other. Therefore, we model the PCP as a Thomas cluster process (TCP), denoted by U −BS (λ G−BS ,m), where the projections of U-BSs (cluster members) are symmetrically independently and identically distributed around the points in G−BS according to a Gaussian distribution with zero mean and variance σ 2 . Mathematically, on the Euclidean plane, the TCP U −BS (λ G−BS ,m) is modeled as the set of locations (projections) of cluster members (U-BSs) relative to the cluster center x ∈ G−BS , x and y are the two dimensional coordinates. With TCP, the PDF of the location y ∈ N x (the projection of any arbitrary U-BS on the ground) relative to cluster center where y is the Euclidean norm of the location y. Thus, the PDF of the distance r = y from any point in the cluster to the parent point follows the Rayleigh distribution as Without loss of generality, the downlink analysis is performed at a typical GUE, which is randomly chosen in a randomly chosen cluster that is also referred to as representative cluster. Since the point processes are stationary, according to Slivnyak's theorem, we can transform the origin to the location of this chosen typical GUE. Now, as stated in [28]- [30], given that the typical GUE is at origin, let the location of the representative cluster center is x ∈ G−BS . We can exclude this x from G−BS and Slivnyak's theorem guarantees that the remaining process G−BS\{y} has the same distribution as G−BS [10]. Therefore, to distinguish the intra-cluster and inter-cluster distance and highlight the insight of the considered association schemes, we form two additional tiers. Specially, the additional 0-th tier 0 U −BS (λ B ,m) consists of the U-BSs in the representative cluster, and the additional 2-nd tier 2 G−BS (x 0 ) consists of the only cluster center G-BS of the representative cluster [34]. Thus, our model is extended to 4-tier network: the 0-th tier consisting of intra-cluster U-BSs, the 1-st tier consisting of inter-cluster U-BSs, 2-nd tier consisting of cluster-center G-BS, 3-rd tier consisting of inter-cluster G-BSs. With the above descriptions, to highlight the contributions of this work, we consider two types of association schemes, i.e., 2-tier and 4-tier association schemes. The 2-tier association scheme allows a typical GUE only associated with the intra-cluster G-BS/U-BSs located in the representative cluster, while the 4-tier association scheme permits a typical GUE associated with G-BSs/U-BSs in FIGURE 1. Layout of clustered UAV-assisted mm-Wave multi-tier networks (x 0 ∈ G−BS is the location of G-BS in the representative cluster, z 0 and z are the locations of the intra-cluster serving and interfering U-BSs in R 3 , v 0 = ||x 0 || is the distance of the typical GUE to the center of the representative cluster, w 0 and y d 0 are the distances of the horizontal projection of the aerial serving U-BS of the typical GUE to the representative cluster center and the typical GUE; w and y d are the intra-cluster distances from the horizontal projection of an aerial interfering U-BS to the typical GUE and the representative cluster center; x ∈ G−BS is the location of the inter-cluster G-BS, the distance r from the inter-cluster interfering U-BS in any arbitrary cluster centered at x ∈ G−BS , v = ||x|| is the distance of the typical GUE to the cluster center at x ∈ G−BS , u and y are the distances from the horizontal projection of an aerial inter-cluster interfering U-BS to the typical GUE and the cluster center x ∈ G−BS ). all tiers. Figure 1 gives the layout of the proposed clustered UAV-assisted mm-Wave cellular network.
Note that, in the work [33], the similar network model has been proposed. However, compared with the work [33], this work attempts to exploit the UAV-assisted wireless communications over mm-Wave frequency band, which would be one of the key techniques in 5G/B5G networks.

B. DIRECTIONAL BEAMFORMING
The directional antenna array is equipped at all G-BSs and U-BSs. Based on the location information, the transmitters and receivers can steer their antenna directions towards each other in order to achieve the maximum beamforming gain so that the high path loss caused by the short wavelength of mm-Wave is compensated. For analytical tractability, the sectored antenna model is used, where the characteristics of antennas of the i-th tier are parameterized by three values [35], [36]: 1) Main-lobe gain M i,s (dBm), 2) Side-lobe gain m i,s (dBm), 3) Main-lobe beam width θ i,s ∈ [0, 2π], where s ∈ {T t , r}, s ∈ T t and T ∈ {G, U } representing the transmitting antenna at G-BSs for T = G or U-BSs for T = U , and s = r denoting the reception antenna at GUEs. Note that, when the reception antenna at GUEs is considered, i.e., s = r, we take i = u in M i,s , m i,s , and θ i,s for the sake of consistency of expression. Thus, the achievable total transmission gain G i received by a typical GUE from a transmitter T t in the i-th tier is characterized by discrete random variables having 4 possible patterns of the gain G i,j and probability b i,j , j ∈ {1, 2, 3, 4}, see Table 2. From Table 2, it is showed that the maximum transmission gain received by a GUE from a transmitter in the i-th tier is given by As aforementioned, the large-scale network consists of 4-tiers. This work assumes all wireless signals experience simultaneously both the small-scale and large-scale fading. On the one hand, for the 0-th and 2-nd tiers, because one remarkable characteristic of mm-Wave is that mm-Wave signals are vulnerable to obstacles, it is necessary to model large-scale fading by embracing both LoS and NLoS propagation, where LoS indicates that GUEs are visible to the BS, with NLoS implies that blockage exists between the BS and GUE. On the other hand, for the 1-st and 3-rd tiers, when the transmission from the typical GUE to an inter-cluster U-BS or G-BS is considered, due to the longer distance between transceivers and more ground blockages, the effect of blockages plays a dominated role and the transmission is dominated by NLoS links. Therefore, we model the path losses of the inter-cluster U-BS-GUE and G-BS-GUE links by using a single NLoS state with one probability.
Note that, for simplicity and tractable, we use the NLoS approximation for the inter-cluster links for the 1-st and 3-rd tiers. The NLoS approximation is effective for the two cases: the first is the hotspot case and the second is the one where the density of cluster centers is small. However, by using the similar method as in the 0-th and 2-nd tiers, the exact results can be achieved for the links in the 1-st and 3-rd tiers so that we can evaluate the network exactly.
Hence, we first formulate the path loss of the 0-th and 2-nd tiers. In [29], authors employ a D-ball with piece-wise LoS probability function. In this paper, a single-ball model for describing the probability of LoS and NLoS is used. As a general model, considering a BS from the i-th tier, i ∈ {0, 2}, we assume that R i is the radius of the LoS ball. Inside the ball, r ≤ R i , links can be either LoS with probability P L i (r) or NLoS with probability P N i (r) = 1 − P L i (r), while beyond the ball, links are assumed to be single NLoS, where r is the distance between the typical GUE and BS. As a result, the general LoS/NLoS probability function for the i-th tier can be expressed as follows [29]: where I ( ) is the indicator function. Now the general path loss function can be expressed as where α i,L and α i,N are the LoS and NLoS path loss exponents of the i-th tier, i ∈ {0, 2}, respectively. Similar as in [37], the LoS probability for the link from aerial U-BSs to GUEs is formulated as (i = 0) As presented in [48], we formulate the probability of LoS link between GUE-GBS in the tier 2 as where is a constant that depends on the geometry and density of building blockage process, and 1/ is what we called the average LoS range of the network. For the 1-st and 3-rd tier, we model the path loss in the inter-cluster UBS-GUE link by using a single NLoS state with one probability. As a result, the path loss of a link in the i-th tier is written as

2) SMALL-SCALE FADING
Nakagami-m fading is a general fading model which is suitable under various conditions, and hence we assume that all links experience independent Nakagami-m fading. Let h TG , T ∈ {U , G}, denote the small-scale fading gains of the link from a T-BS to the typical GUE, (i.e., magnitude-squares where N i,L and N i,N are the Nakagami-m fading parameters for LoS and NLoS links in the i-th tier, respectively, and are assumed to be positive integers.

III. DISTRIBUTION OF DISTANCES AND PATH LOSSES
This section characterizes the distributions of communication distances from the typical GUE to various intra-cluster and inter-cluster U-BSs/G-BSs. As mentioned in Section II, it is assumed that a typical GUE is located in the representative cluster with cluster center x 0 ∈ G−BS , and the projections of the intra-cluster U-BSs are scattered around the typical cluster center x 0 ∈ G−BS .

A. DISTANCES (OR PATH LOSSES) DISTRIBUTION IN THE 0-TH TIER
Let's now start our discussion in terms of the distributions of the intra-cluster distances (path losses) from the 0-th tier composed the intra-cluster U-BSs. For clarity, we first consider the distribution under unordered case where a U-BS is randomly selected in a cluster. As shown in Figure 1, the distances of the typical GUE to any arbitrary U-BS is z in R 3 , and the projections on horizontal plane is w. In general, the distances z (w) from different intra-cluster U-BSs are correlated. The reason is that the location set of all U-BSs is modeled as TCP U −BS (λ G−BS ,m) and the typical intra-cluster U-BSs are scattered around the cluster center x 0 ∈ G−BS according to an independent Gaussian distribution of variance σ 2 . As a result, only conditional on the common distance v 0 = x 0 , the distances z (w) are i.i.d.. We have that the exact PDFs of the distance z (w 0 ) are VOLUME 8, 2020 characterized by Rician distributions [38] and [39].
where I 0 ( ) is the modified Bessel function of the first kind with order zero. With (7), the CCDFs of the distance z is I 0 (at) dt. However, the results from [38] and [39] showed that the correlation among intra-cluster distances is very weak to be ignored. Therefore, we do not condition on the common factor x 0 = v 0 in the representative cluster, which would greatly simplify the sequent analyses and leads to achieve clearer insights. With the results from [38] and [39], we have that the distribution of the distance z can be approximated by Rayleigh distributions of variance 2σ 2 .
To reduce the complexity of analyses, the approximation (9) is used throughout this work. We now turn to look at the distance r from the inter-cluster interfering U-BS in any arbitrary cluster centered at x ∈ G−BS . As shown in Figure 1, the projection of the distance r on the horizon plane is u. With the completely same reasons, it is easy to have that, conditioned on the common factor v = x , the PDF of the inter-cluster distance r is The CCDFs of the distance r is written as After characterizing the distributions of intra-cluster and inter-cluster distances under unordered case, it is convenient to characterize the distributions of the closest distances. In Figure 1, we assume that the distance from the closest intra-cluster U-BS to the typical GUE is denoted by z 0 and the corresponding projection distance is w 0 . Different from (7)-(10), here we focus on the distribution of path losses. For the intra-cluster transmissions, the path loss models with LoS and NLoS two states are used, which is defined by (4). Because there is only one closest intra-cluster U-BS in the representative cluster, from which the line can be either LoS or NLoS. Specially, when the link is in LoS state, with the path loss model (3) and (4), for the projection of any arbitrary distance w (unordered case), the corresponding LoS path loss is given by L 0,L (w) = w α 0,L . The density measure of L 0,L (w) is [40]- [43] As shown in [44], [45], using the density measure (12) leads to the CCDF of distance projection path loss L 0,L (w) Using Figure 1, it is achieved that the CCDF of the path loss of the distance z is The PDF is Then, using the independence of intra-cluster U-BSs and order statistics [46], we can characterize the CCDF of the path loss L 0,L (w 0 ) of the closest distance z 0 in Figure 1, which is given bỹ The PDF of L 0,L (z 0 ) can be achieved by taking the derivative When the link from the typical GUE to the intra-cluster U-BS located at z is in NLoS state, with the path loss model (3) and (4), the density measure of the NLoS path loss L 0,N (w) is calculated as follows Therefore, the CCDF of the path loss from the typical GUE to the projection of any arbitrary U-BSs in the representative cluster can be formulated as Similar to (14), using Figure 1 leads to write the CCDF of the path loss of the distance z as The PDF is Similarly, considering the independences among all intra-cluster U-BSs and using order statistics [46], we can characterize the CCDF of the path loss of the closest distance z 0 , which is given bỹ The PDF of the path loss L 0,N (z 0 ) is given by With the above expressions (16)-(22), Lemma 1 is achieved. Lemma 1: The CCDF of the path loss from a typical GUE in a representative cluster to the nearest intra-cluster U-BS is formulated asF whereF L 0,L (t) andF L 0,N (t) are given by (16) and (22), respectively. P L 0 and P N 0 = 1 − P L 0 are the LoS and NLoS probabilities defined by (5).

B. DISTANCES (OR PATH LOSSES) DISTRIBUTION IN THE 2-ND TIER
The 2-nd tier consists of only one G-BS located at the center of the representative cluster. When the link of the typical GUE to the only one G-BS in LoS state, with the assistance from Figure 1 and equation (2), we can characterize the LoS path loss. The CCDF of the path loss When the only link of the typical GUE to the intra-cluster G-BS is in NLoS state and the distance is less than R 2 , with the help from equation (3), the CCDF of the NLoS path loss L 2,N 1 (v 0 ) is given bỹ When the distance is greater than R 2 , with the help from equation (18), the CCDF of the NLoS path loss L 2,N 2 (v 0 ) is given bỹ Combing (26) and (27), the total CCDF of the NLoS path loss is given bỹ Therefore, we have Lemma 2. Lemma 2: The CCDF of the path loss from a typical GUE in a representative cluster to the only one G-BS located at the cluster center is whereF L 2,L (t),F L 2,N 1 (t), andF L 2,N 2 (t) are given by (25), (26), and (27), respectively.

C. DISTANCES (OR PATH LOSSES) DISTRIBUTION IN THE 1-ST AND 3-RD TIERS
The 1-st tier consists of the set of other U-BSs besides the ones in the representative cluster centered at x 0 ∈ G−BS . Due to the long distances between the typical GUE and the inter-cluster U-BSs, as mentioned in Section II, when the density λ G−BS and the variance σ 2 are small relatively, the link is modeled by the single NLoS state. Based on equation (8), by considering the independence among all inter-cluster U-BSs and the order statistics [46], we have Lemma 3. Lemma 3: The conditional CCDF of the path loss L 1 of the link from the typical GUE to the nearest inter-cluster UAV in the 1-st tier the cluster centered at x ∈ G−BS isF The PDF of the path loss L 1 is given by Proof: See Appendix A. Similarly, considering the fact that the 3-rd tier is composed of other G-BSs and the path loss of the inter-cluster G-BS-GUE links is modeled by single NLoS state, Lemma 4 is achieved.
Lemma 4: The CCDF of the path loss L 3 of the link from the typical GUE to the nearest G-BS in the 3-rd tier is The PDF of the path loss L 3 is Note that, for simplicity and tractability, we use the NLoS approximation for the inter-cluster links from the 1-st and 3-rd tiers. The NLoS approximation is effective for the two cases: the first is the hotspot case and the second is the one where the density of cluster centers is small. However, by using the similar method as in the 0-th and 2-nd tiers, ball-model can be used for the links in the 1-st and 3-rd tiers so that we can evaluate the network exactly.

IV. UE ASSOCIATION AND STATISTICAL DESCRIPTION OF ASSOCIATION DISTANCES
This section gives the used UE association criterion that allows a typical UE associated with a serving BS. Because we consider two types of association schemes, the associations for the 2-tier and 4-tier association schemes are required, respectively. However, due to the similarity, we only present the detailed analysis for the 4-tier association scheme. The one for the 2-tier association scheme is given with a simplified way. Note that, without loss of generality, the assumption is held in the sequential analysis.
The maximum BRP criterion is employed for user association, where a typical UE is associated with the BS that provides the strongest long-term average BRP [31], [32]. Therefore, let the typical GUE associated with the closest BS located at x in the i-th tier, we have where B k is the identical biased factor of the k-th tier, L k is the minimum path loss of the typical GUE to the k-th tier.
With this association criterion, the association probability is defined as the probability that the typical GUE is associated with a LoS (or NLoS) BS in the i-th tier for i ∈ {0, 1, 2, 3}.
Considering the fact that the path losses in the 0-th and 2-nd are modeled by two-state model, the association probability that the typical GUE is associated with a BS in the i-th tier, i = 0, 2, over a t link, t ∈ {L, N }, is generally formulated as where (a) follows from the independence assumption and C k,i = P k B k G Mk P i B i G Mi . With the general formulations (36), the association probabilities of the typical with the BSs in the i-th tier are calculated as follows At the same time, the PDFs of the association distances, which are the distances of the typical GUE to its serving BSs, are derived too.

A. ASSOCIATED WITH THE 0-TH AND 2-ND TIERS
We first calculate the association probability of the typical GUE with the 0-th and 2-nd tiers. Since the BSs in the 0-th and 2-nd tiers fall inside the representative cluster, there is only one closest U-BS from the 0-th tier or G-BS from the 2-nd tier. This yields that in (36) can be expressed as P L i and P N i for t = L and t = N , respectively. The formulation (36) is further written as for i = 0 and i = 2: where L k\i denotes the path loss of the typical GUE to the nearest BS belonging to the k-th tier and except the i-th tier. Then, using the statistical descriptions of the path losses L k , Theorem 1 is achieved. Theorem 1: The probability that the typical GUE is associated to the closest intra-cluster U-BS (G-BS) in the 0-th (2-nd) tier with the association distance X i,t (z 0 (v 0 )) over the t ∈ {L, N } link is The PDF of the path loss of the corresponding association distance X i,t (z 0 (v 0 )) over the t ∈ {L, N } link is In (39), the CCDFF L 1 C 1,i L i,t (z 0 (v 0 )) of the path loss L 1 isF The CCDFF L 3 (.) of the path loss L 3 can be achieved from (33). When i = 0, the CCDFF L 2 (.) of L 2 is given by (42), as shown at the bottom of the next page. The PDF of the path loss L 0,L (z 0 ) and L 0,N (z 0 ) are given by (17) and (23), respectively.
When i = 2, the CCDFF L 0 (.) of the path loss L 0 is given by (43), as shown at the bottom of the next page, The PDFs of the path losses L 2,L (v 0 ) and L 2,N (v 0 ) can be achieved from (30), respectively.
Proof: See Appendix B.

B. ASSOCIATED WITH THE 1-ST AND 3-RD TIERS
In the 1-st and 3-rd tiers, the signal transmission link of a typical GUE to the BSs is modeled by using single state with path loss exponent α 1 and α 3 . In this case, the association probability A i , i = 1, 3, is generally formulated as Theorem 2: The probability that the typical GUE is associated to the closest inter-cluster U-BS (G-BS) in the 1-st (3-rd) tier with the association distance X i (z(v)) over the single link is The PDF of the path loss of the corresponding association distance X i (z(v)) over the t ∈ {L, N } link is In (45), the CCDFF L 0 (t) of the path loss L 0 is given by (43), the CCDFF L 2 (t) of the path loss L 2 is given by (42).
When i = 1, the CCDFF L 3 (t) of L 3 is given by (33) and the PDF of the path loss L 1 (z) are given by (32); when i = 3, the CCDFF L 1 (t) of L 3 is given by (41) and the PDF of the path loss L 3 (z) are given by (34). Proof: Proof follows similar steps in the proof of Theorem 1 and the details are omitted for the reason of brevity.
Theorem 1 and Theorem 2 present the association probabilities of the typical GUE for the 4-tier association scheme that can exploit the effect of the inter-cluster interference. With a simplification, the work also givens the 2-tier association scheme where the typical GUE only is associated with the BSs in the representative cluster, i.e., the BS in the 0-th tier and the BS in the 2-nd tier. In this case, the association criterion is written as where i ∈ {0, 2}. Similar to (39), for the 2-tier association scheme, the probability that the typical GUE is associated with a BS in the i-th tier, i = 0, 2, over a t link, t ∈ {L, N }, is We have Corollary 1. Corollary 1: Under the 2-tier association, the probability that the typical GUE is associated to the closest intra-cluster U-BS (G-BS) in the 0-th (2-nd) tier with the association distance X 2 i,t (z 0 (v 0 )) over the t ∈ {L, N } link is The PDF of the path loss of the corresponding association distance X 2 i,t (z 0 (v 0 )) over the t ∈ {L, N } link is

V. STATISTICAL DESCRIPTION OF INTERFERENCE
In general, to make tractable calculation of network performance, it is needed to calculate LTs of interference. Since the typical GUE is served by a G-BS (or U-BS) that provided the strongest BRP, we have that if the typical GUE is associated with a BS from the i-th tier, there exists an exclusive region in which no interfering BS exists. As shown in Figure 1, when the typical GUE is associated the BS from the i-th tier, the received signal may be suffered from the interference of other tiers as well as the one of the co-tier BSs due to the mm-Wave co-channel deployment. Without loss of generality, we assume that the typical GUE is associated a BS in the i-th tier over t ∈ {L, N } link with the association distance X i,t .

A. LT OF THE INTERFERENCE FROM G-BS IN THE 2-ND TIER
The interference experienced at the typical GUE from the only one G-BS in the representation cluster (2-nd tier) over t ∈ {L, N } link, is formulated by I 2,t i,t = P 2 G 2 h 2,t L −1 2,t (v 0 ), where h 2,t and L 2,t (v 0 ) are the small-scale fading channel gain and the path loss from the typical GUE to the only one G-BS located x 0 with v 0 = ||x 0 || in the 2-nd tier. The small-scale fading channel gain h 2,t is modeled as a normalized Gaussian random variable with parameter N 2,t , t ∈ {L, N }, i.e., h 2,t ∼ N 2,t , 1/N 2,t . With the used strongest BRP criterion and the assumption that the typical GUE is associated with a BS in the i-th tier over the t ∈ {L, N } link, the constraint for the path loss L 2,t (v 0 ) is given by i,t is given by Lemma 5 with the assist of the proof in Appendix C. Lemma 5: When the interference link from the only one G-BS is in LoS state, the LT of the interference experienced at the typical GUE is where L

B. LT OF THE INTERFERENCE FROM U-BSs IN THE 0-TH TIER
When the typical GUE is associated with a BS in the i-th tier, i = 0, the received interference from the intra-cluster U-BSs in the 0-th tier over the t ∈ {L, N } link is formulated where the h 0,t is the small-scale fading channel gain and L 0k,t (||z k ||) is the path loss from the typical GUE to the U-BS located at z k ∈ 0 U −BS (λ G−BS ,m) in the 0-th tier. Then, we have Lemma 6 that gives the LT of I 0,t i,t .
158982 VOLUME 8, 2020 Lemma 6: When the set of LoS U-BSs is considered, the LT of the aggregated interference from the LoS U-BSs is When the set of NLoS U-BSs is considered, the LT of the aggregated interference from the NLoS U-BSs is where the PDF f L 0,L (z) (t) is given by (15) and the PDF f L 0,N (z) (t) is given by (21). Proof: See Appendix D.

C. LT OF THE INTERFERENCE FROM U-BSs IN THE 1-ST TIER
Since the signal transmission link of the typical GUE to the U-BSs is modeled by using single state with path loss exponent α 1 , the received interference from intercluster U-BSs in the 1-st tier by the typical GUE associated with a BS in the i-th tier over the t link is Then, using the proof in Appendix E, we have Lemma 7.
Lemma 7: The LT of the interference I 1 i,t from inter-cluster U-BSs in the 1-st tier is

D. LT OF THE INTERFERENCE FROM G-BSs IN THE 3-RD TIER
The received interference from G-BSs in the 3-rd tier is given by is the path loss of the link from a G-BS to the typical GUE with x k ∈ G−BS \x 0 and v = ||x k ||. The LT of I 3 i,t is given by Lemma 8.
Lemma 8: For the typical GUE that is assumed to be associated with a BS in the i-th tier, the LT of the interference I 3 i,t from G-BSs in the 3-rd tier is where the density measure 3 ([0, t)) is given by (33) and (34).

VI. DOWNLINK COVERAGE PERFORMANCE AND AVERAGE AREA THROUGHPUT A. DOWNLINK SINR
According to the employed association strategy, a typical GUE is served by the BS that provides the strongest long-term average BRP. As stated previously, it is assumed that the typical GUE is associated with the BS in the i-th tier over t ∈ {L, N } link, with the association distance X i,t or path loss L i,t X i,t . Hence, the received SINR by the typical GUE associated with the BS in the i-th tier over t link is formulated as where σ 2 0 is the power of additive thermal noise, I i,t is the aggregate interference experienced at the typical GUE, and h i,t is the small-scale fading channel gain that is modeled as a normalized Gamma random variable with parameter N i,t , i.e, h i,t ∼ N i,t , 1/N i,t . Note that, with the considered communication scenario plotted in Figure 1, we only consider the single path loss model from the typical GUE to the BSs in the 1-st and 3-rd tiers. In this case, we will drop the index t from SIN R i,t , h i,t , L −1 i,t X i,t , and I i,t . However, for the convenience of derivations, unless specified otherwise, (63) is assumed.
Hence, the aggregate interference I i,t is further written as In (63), I k,t i,t denotes the aggregated interference from the BSs in the k-th tier over t ∈ {L, N } link, k ∈ {0, 2}, and I k i,t is the one from BSs in the k = 1, 3 tiers having single path loss state. I k,t i,t and I k i,t are defined in section IV.

B. DOWNLINK SINR COVERAGE PROBABILITY
The downlink SINR coverage probability is defined as the probability the received SIN R i,t by the typical GUE is larger than a certain threshold τ > 0. Therefore, when the typical GUE is associated with a G-BS/U-BS located in the i-th tier over t ∈ {L, N } link, the corresponding SINR coverage probability P i,t (τ ) is formulated as As a result, combining the above analysis, we have Theorem 3 that gives the achievable total coverage probability by the typical GUE.
Theorem 3: Under the considered communication scenario, with a given threshold τ , the total average downlink SINR coverage probability of the typical GUE is given by where A i,t is the association probability with a LoS (t = L) or NLoS (t = N ) U-BS/G-BS from the i-th tier for i ∈ {0, 2} which is given by (39), and A i is the association probability with a G-BS/U-BS in the i-th tier for i ∈ {1, 3}, which is given by (45). The conditional coverage probabilities P i,t (τ ) and P i (τ ) are given by (68), (69) and (70), respectively. Proof: See Appendix F.

C. AVERAGE AREA THROUGHPUT
Besides the coverage probability, the average transmission rate is also another crucial performance metric which can be found by calculating average area throughput or average area spectrum efficiency. Here, we study the average area throughput (AAT). The downlink AAT is defined as the downlink average number of bits transmitted per unit time per unit area over given bandwidth. With this consideration and the definitions in [50] and [51], Theorem 4 is achieved. Theorem 4: Given that the bandwidth is W and the SINR threshold is τ , the achievable AAT R Total is given by {0, 1, 2, 3}, is the downlink AAT from G-BS/U-BS in the i-th tier, which are respectively given by

VII. SIMULATIONS AND NUMERICAL RESULTS
In this section, we present the numerical results to validate the derivations in above sections and to highlight the insights of the impact of key system parameters on performance. Specially, in the numerical evaluations and simulations, a heterogeneous network consisting of G-BSs and U-BSs is considered with two additional tiers so that a 4-tier network model is constructed, i.e., the 0-th tier consisting of intra-cluster U-BSs, the 1-st tier consisting of inter-cluster U-BSs, 2-nd tier consisting of cluster-center G-BS, 3-rd tier consisting of inter-cluster G-BSs. To find the contribution of the inter-cluster association, this work considers the 2-tier and 4-tier association schemes. The 2-tier association scheme allows a typical GUE only associated with the intra-cluster G-BS/U-BSs located in the representative cluster, while the 4-tier association scheme permits a typical GUE associated with G-BSs/U-BSs in all tiers. With this heterogeneous cellular network, we assume that the total available bandwidth is B = 20MHz and is shared by the whole network, and the reference distance for intercept is one meter. We also take σ u a constant. For clarity, the other parameters and the corresponding values are listed in Table 3, unless otherwise specified.

A. ASSOCIATION PROBABILITIES
With the above assumption, we first analyze the effect of varied network parameters on the association probability (AP). Note that, for the comparison, we consider two systems which are differentiated by powers. That is to say, in systems 1(Sys.1), we take P 0 = P 1 =46dBm, P 2 = P 3 =36dBm, while in system 2 (Sys.2), we take P 0 = P 1 =46dBm, P 2 = P 3 =46dBm. Figure 2 (a) focuses on the association probabilities versus the variance σ 2 of the distribution of the projections of the U-BS locations on ground. The figure clearly shows that with the increase of the variance σ 2 of the distribution of U-BS projections on ground, the association probabilities A 0,L(N ) and A 1 with the U-BSs are increasing, but the ones A 2,L(N ) and A 3 with the G-BSs are decreasing. This is due to the fact that, as σ 2 becomes larger, U-BSs start being more widely spread on the horizontal plane around their own cluster center. Therefore, the average distance of the typical UE to the U-BSs decreases. As a result, the path loss from the typical UE to the U-BSs is generally reduced. Since a BS is associated with a UE to which this BS provides the strongest long-term average biased received power, we notice that in Figure 2 (a) the association probabilities A 0,L , A 0,N , and A 1 with U-BSs increase while the ones A 2,L , A 2,N , and A 3 with G-BSs decrease with the variance σ 2 of the distribution of the U-BS projections. We also see that the probabilities that the typical UE is associated with LoS BSs (U-BS, G-BS) are generally greater than the ones that the typical UE is associated with NLoS BSs. Besides the above observations, it is also found that the probabilities A 0(2),L and A 0(2),N that the typical UE is associated with the BSs in the representation cluster are generally greater than the ones A 1 and A 3 that the typical UE is associated with the BSs in inter-cluster BSs, especially when σ 2 is small. All of the results are agreeing with the network deployment, which validate our analytical results. Different from Figure 2 (a), Figure 2 (b) exploits the relationship between the association probabilities and the density λ G−BS of G-BSs. We see that, with the increase of the density λ G−BS , the association probabilities A 0(2),L(N ) with the intra-cluster BSs (G-BS,U-BS) are decreasing and the ones A 1(3) with intra-cluster BSs (G-BS,U-BS) are increasing. For these observations, we have the following explanations. As the density λ G−BS becomes larger, the average distance of the typical UE to G-BS (or U-BS) decreases. As a result, the path loss from a typical UE to BS (G-BS, U-BS) tends to become smaller. However, Figure 2 (b) also shows that the effect of the density λ G−BS on association probabilities is small, especially on the ones with the intra-cluster BS.
After exploiting the effect of both λ G−BS and σ 2 , in Figure 3 we investigate the effect of the height H of U-BSs and the average numberm of cluster members. At the same time, Figure 3 (a) and (b) also present the comparison of association probabilities between Sys.1 and Sys.2, where the transmit powers of G-BSs are different, but the ones of U-BSs are the same. As expected, because the transmit power of G-BSs in Sys.2 is greater than the one in Sys.1, the association probabilities with G-BSs in Sys.2 are greater than the ones in Sys.1. Straightforwardly, for the association probabilities with U-BSs, the contrary results are achieved.
When the parametersm and H are considered, we see that the average numberm of the cluster members has small effect on all association probabilities, but the height H of U-BSs imposes greater one. For simplicity, we only investigate the association probabilities A 0 and A 2 with the intra-cluster BSs. Obviously, because the path loss from U-BSs increases with the height H , the association probability A 0 of the typical UE with U-BS decreases, but the one A 2 with G-BS increases. At the same time, we see that when the height H is greater than the radius of the aerial LoS ball, the association probability A 0 with U-BS decreases quickly and tends to zero, approximately, and the one A 2 with G-BS exposes greater jump. By using the employed network model, it is easy to explain the above observations.

B. COVERAGE PROBABILITIES
After observing the association probabilities, we then consider the SINR coverage probability. To effectively find the effect of network parameters, we present 3-D plots of coverage probabilities in Figure 4 and Figure 5. Specially, in Figure 4 we exploit the joint impact of the parameters σ 2 andm on coverage probabilities. Figure 4 (a) shows that, when the average numberm is small, there exits an optimal σ 2 o at which the SINR coverage probability from U-BS achieves its maximum. When the realization of σ 2 is smaller than the optimal σ 2 o , the increase of σ 2 results in the increase of the SINR coverage P U −BS Cov . On contrary, it yields the decrease of P U −BS Cov . The observations can be explained as follows. On the one hand, when the variance σ 2 is small, the increase of σ 2 yields the decrease of average distance from the typical UE to U-BSs so that the received signal power increases. On the other hand, with the continued increasing of σ 2 , the received interference from inter-cluster and intra-cluster U-BSs increases sharply. This yields the decrease of the coverage probabilities P U −BS Cov . For the average numberm, we see that the increase ofm yields the increase of P U −BS Cov only whenm is very small. Otherwise, the increase ofm results in the decrease of P U −BS Cov . Different from Figure 4 (a), Figure 4 (b) shows that the SINR coverage probability P G−BS Cov of the typical UE from G-BS decreases monotonously with σ 2 because of the increased interference  Figure 4 (c) gives the total average SINR coverage probability of the typical UE. We see that, due to high P G−BS Cov , the total SINR coverage probability P Tot Cov decreases with σ 2 . When σ 2 is large, for a given σ 2 , there exists an optimalm o at which the total SINR coverage probability arrives at its maximum.
In Figure 5, we exploit the joint impact of H and σ 2 on coverage probabilities. Similar to Figure 4 (a), Figure 5 (a) shows that, when H is small, for a given H , there exists an optimal σ 2 H at which the typical GUE achieves the maximum SINR coverage probability from its associated U-BS. Because of high path loss, we see that the SINR coverage probability P U −BS Cov decreases with the height H . With Figure 5 (b), we find that, only when the variance σ 2 is small, the height H of U-BSs imposes distinctive impact on the SINR coverage probability P G−BS Cov from G-BS. Specially, when H is small, the SINR coverage probability P G−BS Cov is increasing with H . Then, with the continuous increase of H , the SINR coverage probability P G−BS Cov arrives at its saturated value because of the high interference from U-BSs. As a result, Figure 5 (c) gives the total coverage probability P Tot Cov versus both H and σ 2 . Clearly, the total coverage probability P Tot Cov decreases with the variance σ 2 . However, for the effect of H , we find that, when σ 2 is small, there exists an extreme value point H at which the typical UE achieves the total minimum coverage probability P Tot Cov . When the realization of H is smaller than the extreme value, the total coverage probability P Tot Cov decreases with H , but when it is greater than the extreme value, P Tot Cov increases with H . These results can be explained by considering the joint impact of increased signal power and interference power caused by the decrease of H .
After exploiting the joint impact of network parameters on coverage probabilities, we now turn to present the coverage probabilities' comparison between the two systems. Figure 6 gives the comparison of coverage probabilities versusm under different H between the two systems. We find that, in Figure 6 (c) the exploitation of inter-cluster BSs (G-BS, U-BS) is beneficial to enhance the total average coverage probability. The completely same result is achieved from Figure 6 (b) too, where we give the comparison of P G−BS Cov between the two systems. However, Figure 6 (a) that gives the P U −BS Cov comparison between the two systems, displays the different results from Figure 6 (b) and Figure 6 (c). We see that the achievable coverage probability gain by the 4-tier association scheme over the simple 2-tier association scheme greatly depends on H andm. It is found that, when H is higher, over the entire region ofm, the 4-tier association scheme is inferior to the simple 2-tier association one. Then, with the decrease of H , it is possible to have the result that the 4-tier association scheme outperforms the simple 2-tier association one. Moreover, where H is smaller, the achievable coverage probability gain by the proposed 4-tier association scheme is outstanding.
In Figure 7, we investigate the effect of the variance σ 2 on the achievable coverage probability gain achieved by the 4-tier association scheme over the 2-tier association one. Note that, Figure 7 only presents the comparison of P U −BS Cov . Firstly, Figure 7 further shows that there exists an optimal σ 2 o at which the coverage probability P U −BS Cov arrives at its maximum so that when the realization of the variance σ 2 is greater the optimal value σ 2 o , the coverage probability P U −BS Cov decreases quickly with σ 2 because of the increases of aggregated interference. Secondly, the figure also displays that the achievable coverage probability gain by the 4-tier association scheme is dominated by the parameters H andm, but also the variance σ 2 . Figure 7 (a) shows that, for given smallm, when H is higher, the simple 2-tier association scheme outperforms the 4-tier one. Only when H is small, it is possible that the proposed 4-tier scheme is superior to the 2-tier one under the condition where σ 2 is higher. However, with largerm, we have that the proposed 4-tier scheme is superior to the 2-tier one over the entire range of the variance σ 2 and H . Nevertheless, from Figure 8 that presents the comparison of the total coverage probability P Tot Cov , we find that the proposed 4-tier association scheme complete outperforms the 2-tier association scheme.

C. ACHIEVABLE AVERAGE AREA THROUGHPUT
Finally, we give the comparison of the average area throughput between the two schemes. Specially, Figure 9    increase of SINR threshold is not always beneficial to the enhancement of average area throughput. In practice, when VOLUME 8, 2020 the SINR threshold is small relatively, the achievable average area throughput increases as the threshold increases. With the continuous increase of the SINR threshold, we can obtain an optimal SINR τ o at which the network achieves the maximum average area throughput. As a result, when the SINR threshold τ is greater than τ o , the achievable area throughput decreases with the threshold τ . Besides this observation, we see that the presented 4-tier association scheme not only improve the total average area throughput R Total (spectrum efficiency), but also the individual area throughput R G-BS and R U -BS from G-BSs and U-BSs, respectively. Moreover, this observation holds for all value ofm. The only difference is that the achievable gains of the throughput R U -BS , R G-BS , and R Total are increasing withm. As a result, we have that the employment of 4-tier association scheme is beneficial to the improvement of network spectrum efficiency.

VIII. CONCLUSION
This paper focuses on UAV-assisted heterogeneous mm-Wave cellular network, where G-BSs and U-BSs form macro cells and aerial small cells, respectively. The locations of G-BSs are modeled as a homogeneous PPP and the locations of U-BSs are modeled by a TCP with parent point process consisting of the points from G-BSs. To distinguish the intra-cluster and inter-cluster distances and highlight the insight of the considered association schemes, we construct two additional tiers, i.e., the 0-th and 2-nd tiers, so that the 2-tier and 4-tier association schemes are proposed. By using the tools from stochastic geometry, we derive the LTs of the received aggregate interference by a typical UE from U-BSs/G-BSs in each tier so that the downlink SINR coverage probabilities and average area throughput are derived. The presented simulations and numerical results first validate the mathematical derivations. On the other hand, the numerical analysis exploits the effect of network parameters on the network performance. The insights are 4-folds. Firstly, the association probabilities greatly depend on the variance of U-BSs' distribution. The association probability with U-BS is increasing with the variance so that the proposed 4-tier association scheme is beneficial to exploit network resource. Secondly, the height and variance of U-BSs have the non-monotonous effect on the SINR coverage probabilities of a typical GUE from G-BSs as well as the total SINR coverage probability. Thirdly, we find that the achievable coverage probability gain by the 4-tier association over the simple 2-tier association scheme is mainly dominated by the height of U-BSs and the average number of cluster elements (U-BSs). Finally, as for the achievable average area throughput from G-BSs, U-BSs, and total one, when the SINR threshold is small, they are increasing; when the threshold is larger relatively, they are decreasing. For all considered network realization, the total average area throughput of the proposed 4-tier association scheme is greater than the 2-tier association scheme, so do the ones from U-BSs and G-BSs.

APPENDIX A PROOF OF LEMMA 3
As shown in Figure 1, in other any arbitrary cluster centered at x ∈ G−BS \x 0 , the location set of the possible U-BSs is denoted by N x . In this randomly selected cluster, we consider an U-BS located y i ∈ N x . The corresponding distance of the typical GUE to the U-BS is r, whose CCDF is given by (11). Since the inter-cluster link is modeled by NLoS state, the corresponding path loss is written as L i = r α 1 . Hence, with (11), conditioned on v = x , the CCDF of the path loss L i is written as where t > H α 1 is required. The PDF of the path loss L i is Considering the independence among all inter-cluster U-BSs, the CDF of the path loss L 1 from the typical GUE to the nearest U-BS in the cluster centered at x ∈ G−BS is Therefore, the CCDF of the path loss L 1 is The PDF of the path loss L 1 is

APPENDIX B PROOF OF THEOREM 1
We present the concise proof for Theorem 1. With equation (37), by using the independence and the definition of CCDF, the association A i,t , i ∈ {2} and t ∈ {L, N } is written as whereF L 2 (t) for i = 0 is given by (29) andF L 0 (t) for i = 2 is given by (24),F L 3 (t) is given by (33). Since the 1-st tier consists of multiple clusters centered at x ∈ G−BS , with the result (31), the conditional CCDF of minimum path loss L 1 (C 1,0 L 0,t (z 0 )) relative to the cluster center x is given bỹ Then, by deconditioningF L 1 (r |x ) on v leads tõ Hence, we have the result (39) as well as the CCDF (41).
To derive the PDF of association distance, we define the event T = { The typical GUE is associated with a BS over t ∈ {L, N } link in the i-th tier with association probability A i,t , i ∈ {0, 2} and t ∈ {L, N } }. Given the event S, using Bayes's rule and independent assumption, we can mathematically express the CCDF of the access distance X i,t (z 0 (v 0 )) as (84) Considering the CDF of the association distance X i,t (z 0 (v 0 )) is given by

APPENDIX C PROOF OF LEMMA 5
Since the interference experienced at the typical GUE from the only one G-BS in the representation cluster (2-nd tier) over t ∈ {L, N } link is formulated by In (86), the term L I G 2,j 2,t (s) is further calculated as follows.
where (a) follows from the used strongest BRP criterion and the assumption that the typical GUE is associated with a BS in the i-th tier over the t ∈ {L, N } link. Since the small-scale fading channel gain h 2,t is modeled as a normalized Gaussian random variable with parameter N 2,t , t ∈ {L, N }, i.e., h 2,t ∼ N 2,t , 1/N 2,t , we have the inner expectation operation of (87) is written as VOLUME 8, 2020 where (b) follows from the equation (3.351.2) in [47], ∞ 0 x n e −µx dx = n!µ −n−1 . Then, with the substitution of (88) into (87), we have By separately considering the LoS and NLoS links, Lemma 5 is achieved.

APPENDIX D PROOF OF LEMMA 6
The received interference from the intra-cluster U-BSs in the 0-th tier over the t ∈ {L, N } link is for- By using the thinning property of PCP, the interference can be split 4 independent PCPs as I 0,t the aggregate interference with random gain G 0,j . Further, according to the thinning theorem, the independent component has a average member b 0,jm for 0,j U −BS . Therefore, using the definition of LT yields In (51), by taking z = ||z k ||, the LT L I G 0,j 0,t (s) is calculated as follows.
Using h 0,t ∼ N 0,t , 1/N 0,t and the method similar to (88)-(89), (91) is further given by (92), as shown at the bottom of the next page. In (92), (c) follows from the fact that the locations of intra-cluster U-BSs, conditional on the cluster center x 0 = G−BS , are independent and the expectation over number of interfering U-BSs which are Poisson distributed. Note that, in (92) When the active number of U-BSs is small, (92) is further written as With the substitution of (93) into (90), it is achieved that By separately considering the LoS and NLoS link states, Lemma 6 is achieved.

APPENDIX E PROOF OF LEMMA 7
The LT of of interference I 1 i,t from inter-cluster U-BSs in the 1-st tier is where L I G 1,j 1 (s) is given by (96), as shown at the bottom of the next page. In (96), (a) follows by applying a change of variables with g k + x − > r as shown in Figure 1, and converting the coordinate from Cartesian to polar by using the conditional PDF of distance distribution. With the substitution of (96) into (95), by applying the probability generating functional of G−BS , we have (60).

APPENDIX F PROOF OF THEOREM 3
The coverage probabilities are calculated by separately considering i ∈ {0, 2} and i ∈ {1, 3}.
A. CALCULATING P i ,t (τ ), i ∈ {0, 2} With the definitions P i,t (τ ) and SIN R i,t in (64) and (62), the coverage probability P i,t (τ ) is written as With the fact h i,t ∼ N i,t , 1/N i,t and the derivation in [36] and [48], [49], for a constant η > 0, the probability P h i,t > η is approximated by P h i,t > η = 1 − 1 − e −ζ i,t η N i,t . P i,t (τ ) is given by where (a) follows from the binomial theorem and the binomial coefficient is defined by a b = a! b!(a−b) and Considering the fact that there is only one G-BS in the 2-nd tier, with the aggregate interference given by (63), the conditional coverage probability P i,t (τ ) is calculated separately as follows.
When i = 0, we have P 0,t (τ ) given by  When i = 2, P 2,t (τ ) is given by Considering the fact that the signal propagation from inter-cluster BSs to the typical GUE is modeled as single path loss mode, similar to (99) and (100), we have where σ 2 i = ζ i τ As a result, combining the above analysis, Theorem 3 is achieved. PENGSHAN