Drone-Base-Station for Next-Generation Internet-of-Things: A Comparison of Swarm Intelligence Approaches

The emergence of next-generation Internet-of-Things (NG-IoT) applications introduces several challenges for the sixth-generation (6G) mobile networks, such as massive connectivity, increased network capacity

A novel concept that can effectively assist in satisfying the increased communication requirements and facilitate network expansion is the drone-base-station (DBS) [7]. DBSs are flexible solutions that aspire to increase user throughput, improve the quality of service (QoS), and expand the coverage of mobile networks. Furthermore, DBSs can provide enhanced network throughput in cases of temporary events, by offloading congestion, or provide connectivity in emergency situations, such as ground BS failures and natural disasters [8]. Finally, DBSs can effectively alleviate the capital and operational expenditure of mobile network operators, as they provide on-demand wireless coverage [9].
As the DBS power is limited, its efficient usage is of utmost importance. Free space pathloss, as well as signal reflection and shadowing, heavily affect the user received signal [10]. To achieve a high QoS, the DBS should be optimally positioned in order to mitigate signal quality degradation. Motivated by this, several researchers have turned their attention to formulating and solving the DBS optimal positioning problems. In particular, Alzenad et al. [11] aim to maximize the number of covered users with different QoS requirements. The DBS placement is modeled as a multiple circles placement problem and solved using exhaustive search. The optimal placement of a DBS in a high-rise building scenario is considered in [12]. Due to the intractability of the problem, the Particle Swarm Optimization algorithm was utilized in order to find an efficient DBS position that minimizes the total power that is required to provide coverage to the building users. The authors in [13] formulated the DBS placement problem as a non-linear non-convex combinatorial optimization problem. The problem was decomposed into the DBS placement problem and the joint bandwidth and power allocation problem. The authors developed two heuristic algorithms, namely Dynamic and Fixed DBS placement, based on different placement strategies.
Wang et al. [14] proposed an optimal DBS placement method that uses the minimum transmit power to serve a set of ground users. To achieve this, the optimal DBS placement problem was decoupled into the horizontal and vertical placement sub-problems. The simulation results showed reduced power consumption for the suburban, urban, and dense urban environments. The joint placement and resource allocation problem is investigated in [15]. The authors decompose the problem into two sub-problems, namely the resource allocation and the DBS placement sub-problem. To find the optimal DBS position, the authors exhaustively search all the possible locations for the one that achieves the highest throughput. The authors in [16] derived analytic expressions for the DBS hovering altitude that optimizes the energy efficiency. Additionally, the horizontal positioning problem was formulated as a circle packing problem for maximum packing density and solved by utilizing the multilevel regular polygon-based placement algorithm. The authors in [17] focused on the energy-efficient placement optimization problem. To address its complexity, they decoupled it into two separate problems, namely the horizontal placement and the vertical placement problems. A Weiszfeld-based algorithm was leveraged for optimizing the horizontal placement, while the vertical placement was analytically solved using the optimal elevation angle and the radius of the coverage area.
Swarm intelligence approaches have been widely used in solving optimization problems in various wireless applications. The design of a small-size tag antenna with large gain and reading distance is presented in [18]. Particularly, the design of the tag antenna was generated using the artificial ant bee colony algorithm. The authors in [19] used the spider monkey optimization approach to optimize a linear antenna array in terms of suppressing the sidelobes and null placement in certain directions. Goudos and Athanasiadou in [20] proposed a modeling method for the prediction of the received signal power at a drone. The proposed method combines the salp swarm algorithm with five base learners for obtaining accurate predictions. The authors in [21] presented the design of a multiband microstrip patch antenna that was generated with the assistance of the coyote optimization algorithm. A social network optimization method for beam-scanning reflectarray antenna design was presented in [22]. The method was used for optimizing the various antenna radiation patterns, as well as minimizing the difference between the desired and actual radiation direction. Moreover, Luo et al. [23] leveraged the Adaptive Opposite Fireworks Algorithm in order to suppress the interference caused by the main lobe in radar systems. In [24], the authors developed a beamforming technique that is able to achieve fast data updates using the conformal phased array radar system. A modified particle swarm optimization algorithm was used to generate the beamforming weights of the antennas to achieve an optimal radiation pattern. Chen et al. [25] used the particle swarm optimization method to optimize the parameters of a deep neural network in order to optimally map the configuration parameters of meta-atoms to the antenna reflection coefficients.
The complexity of the DBS placement problem has been established in the aforementioned works. In most of these, the authors tackled the problem by decoupling it into smaller and more tractable problems in order to obtain a near-optimal solution. Also, in some works, the authors resorted to exhaustive search methods. It is apparent that swarm intelligence approaches can be directly applied to the DBS positioning problem. Therefore, in this work, we aim to evaluate the performance of well-known swarm intelligence approaches in terms of finding the optimal DBS location under various objectives and parameters. In this direction, we consider three scenarios involving single and multiple DBSs providing wireless coverage to a number of IoT devices that are deployed in a particular area. The contributions of this work are summarized as follows:
• We formulate two optimization problems associated with the performance of the system, namely the minimization of the average pathloss experienced by the devices and the maximization of the device coverage. Due to the complexity and the large search space of the problems, swarm intelligence approaches can prove valuable assets.
• We provide an overview of six swarm intelligence approaches, namely the Cuckoo Search (CS), the Grey Wolf Optimization (GWO), the Monarch Butterfly Optimization (MBO), the Elephant Herd Optimization (EHO), the Salp Swarm Algorithm (SSA), and the Particle Swarm Optimization (PSO). Also, we present the main functionality of each approach.
• We compare the swarm intelligence approaches in terms of their performance in finding the DBS position that optimizes the two optimization problems. The approaches are compared in three evaluation scenarios. In the first two scenarios, we aim to minimize the average pathloss using single and multiple DBSs, respectively. On the other hand, in the third scenario, we aim to maximize the device coverage using multiple DBSs.
• Additionally, we perform non-parametric statistical tests, namely the Friedman and Wilcoxon tests, in order to further evaluate the performance of these approaches. Finally, we rank the swarm intelligence approaches based on the results of these tests.
The rest of the paper is organized as follows. Section II presents the system model, while Section III provides a background of swarm intelligence approaches. The evaluation of swarm approaches is presented in Section IV, while Section V presents an analysis of the pathloss distribution, along with a discussion of the evaluation results. Finally, Section VI concludes this work.

II. SYSTEM MODEL AND PROBLEM FORMULATION
The considered system model is presented in Fig. 1. A number of DBSs, denoted by K = {1, . . . , K}, provide connectivity to a number of IoT devices, denoted by S = {1, . . . , S}. The devices are randomly deployed over a geographical area, while the DBSs hover over the devices.
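Due to the hovering altitude of the DBS, the conventional channel models are not sufficient for modeling the air-to-ground (AtG) links between the DBS and the ground devices. According to the AtG model presented in [26], the Line of Sight (LoS) probability of an AtG link between the k-th DBS and the s-th device is calculated as
$$P_{LoS}^{k,s} = \frac{1}{1 + \alpha \exp\left(-\beta\left(\frac{180}{\pi}\arctan\left(\frac{h_k}{r_{k,s}}\right) - \alpha\right)\right)}$$
where $\alpha$ and $\beta$ are coefficients based on the environment (e.g., urban, suburban, etc.). $h_k$ denotes the altitude of the DBS, while the horizontal distance between the k-th DBS and the s-th device is denoted by
$$r_{k,s} = \sqrt{(x_k - x_s)^2 + (y_k - y_s)^2}$$
where $(x_k, y_k)$ and $(x_s, y_s)$ are the horizontal locations of the DBS and the device, respectively. Therefore, considering the LoS and Non-LoS (NLoS) probabilities, the pathloss can be obtained by
$$PL_{k,s} = \frac{A}{1 + \alpha \exp\left(-\beta\left(\frac{180}{\pi}\arctan\left(\frac{h_k}{r_{k,s}}\right) - \alpha\right)\right)} + 10\log_{10}\left(h_k^2 + r_{k,s}^2\right) + B$$
where $A = \eta_{LoS} - \eta_{NLoS}$ and $B = 20\log_{10}\left(\frac{4\pi f_c}{c}\right) + \eta_{NLoS}$, $f_c$ is the carrier frequency (in Hz), $c$ is the speed of light, while $\eta_{LoS}$ and $\eta_{NLoS}$ are the mean additional losses based on the propagation environment.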
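The AtG model above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the 2 GHz carrier frequency, and the urban parameter values are assumptions chosen for the example, and the elevation angle is taken in degrees.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def los_probability(h, r, alpha, beta):
    """LoS probability for altitude h and horizontal distance r (both in metres)."""
    theta = 90.0 if r == 0 else math.degrees(math.atan(h / r))  # elevation angle, degrees
    return 1.0 / (1.0 + alpha * math.exp(-beta * (theta - alpha)))

def pathloss_db(dbs, device, alpha, beta, eta_los, eta_nlos, fc):
    """Mean AtG pathloss in dB between a DBS at (x, y, h) and a device at (x, y)."""
    xk, yk, hk = dbs
    xs, ys = device
    r = math.hypot(xk - xs, yk - ys)  # horizontal distance
    a_term = eta_los - eta_nlos       # A in the text
    b_term = 20.0 * math.log10(4.0 * math.pi * fc / C_LIGHT) + eta_nlos  # B in the text
    p_los = los_probability(hk, r, alpha, beta)
    return a_term * p_los + 10.0 * math.log10(hk**2 + r**2) + b_term

# Illustrative urban parameters (assumed for this sketch, not read from Table 2).
URBAN = {"alpha": 9.61, "beta": 0.16, "eta_los": 1.0, "eta_nlos": 20.0}

if __name__ == "__main__":
    for r in (0.0, 200.0, 500.0):
        loss = pathloss_db((0.0, 0.0, 100.0), (r, 0.0), fc=2e9, **URBAN)
        print(f"r = {r:6.1f} m, pathloss = {loss:.2f} dB")
```

For a fixed altitude, the pathloss grows with the horizontal distance both because the link becomes longer and because the LoS probability drops.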

A. MINIMIZING THE AVERAGE PATHLOSS
According to Shannon's formula, the data rate of a device increases as its received signal power is increased, assuming that the system bandwidth and noise remain constant. The received power at a ground device depends on the pathloss experienced over the wireless channel and can be calculated as
$$P_r^{k,s} = P_t^k - PL_{k,s} - P_N$$
where $P_t^k$ is the transmission power of the k-th DBS and $P_N$ is the power of the Additive White Gaussian Noise (AWGN). The AWGN power will essentially be the same for all the devices located in a particular area. Therefore, the impact of the noise on determining the optimal DBS location will be nominal. Without loss of generality, we assume that $P_N = 0$.
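Given a constant transmission power $P_t^k$, the received power can be increased by minimizing the distance-dependent pathloss between the DBS and the devices. In this direction, we optimize the DBS placement so as to minimize the average pathloss experienced by the devices. Assuming that each device is connected to the closest DBS, the optimization problem is formulated as
$$\min_{\mathbf{x}, \mathbf{y}, \mathbf{h}} \ \frac{1}{S}\sum_{s=1}^{S} PL_{k_s,s}$$
$$\text{subject to: } x_{min} \le x_k \le x_{max}, \quad y_{min} \le y_k \le y_{max}, \quad z_{min} \le h_k \le z_{max}, \quad \forall k$$
where $k_s$ is the DBS closest to the s-th device, vectors $\mathbf{x}$, $\mathbf{y}$, $\mathbf{h}$ denote the position of each DBS in the three-dimensional (3D) space, while $x_{min/max}$, $y_{min/max}$, and $z_{min/max}$ are the area limits.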
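A minimal sketch of this objective is given below, assuming the pathloss follows the AtG model with illustrative urban parameters and a 2 GHz carrier; `average_pathloss` and `clamp` are hypothetical helper names. Each device is attached to its closest DBS in 3D distance, and `clamp` projects candidate positions back onto the area limits.

```python
import math

def pathloss_db(dbs, device, alpha=9.61, beta=0.16, eta_los=1.0, eta_nlos=20.0, fc=2e9):
    """Mean AtG pathloss (dB); the default parameters are assumed urban values."""
    xk, yk, hk = dbs
    r = math.hypot(xk - device[0], yk - device[1])
    theta = 90.0 if r == 0 else math.degrees(math.atan(hk / r))
    p_los = 1.0 / (1.0 + alpha * math.exp(-beta * (theta - alpha)))
    fspl = 20.0 * math.log10(4.0 * math.pi * fc * math.hypot(r, hk) / 299_792_458.0)
    return fspl + p_los * eta_los + (1.0 - p_los) * eta_nlos

def average_pathloss(positions, devices):
    """Objective value: mean pathloss when every device attaches to its closest DBS."""
    total = 0.0
    for dev in devices:
        closest = min(positions, key=lambda p: math.dist(p, (dev[0], dev[1], 0.0)))
        total += pathloss_db(closest, dev)
    return total / len(devices)

def clamp(positions, lower, upper):
    """Project every (x, y, h) onto the box defined by the area limits."""
    return [tuple(min(max(v, lo), hi) for v, lo, hi in zip(p, lower, upper))
            for p in positions]

if __name__ == "__main__":
    devices = [(0.0, 0.0), (1000.0, 0.0)]
    print(average_pathloss([(500.0, 0.0, 100.0)], devices))
    print(average_pathloss([(0.0, 0.0, 100.0), (1000.0, 0.0, 100.0)], devices))
```

A swarm algorithm would call `average_pathloss` as its fitness function on a flattened vector of DBS coordinates.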

B. MAXIMIZING DEVICE COVERAGE
A device is considered to be under coverage if its QoS can be satisfied. This can be alternatively expressed as the device experiencing pathloss that is lower than a pre-defined threshold, denoted by T. Therefore, the corresponding optimization problem is formulated as subject to: C1: where C k,s is expressed as follows: The solutions to the aforementioned optimization problems are challenging. To this end, swarm intelligence can be leveraged in order to find an efficient solution.
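The coverage criterion can be illustrated with a short sketch. It assumes the AtG pathloss model with illustrative urban parameters, and it counts a device as covered when at least one DBS offers a pathloss below the threshold; this association rule and the helper names are assumptions made for the example.

```python
import math

def pathloss_db(dbs, device, alpha=9.61, beta=0.16, eta_los=1.0, eta_nlos=20.0, fc=2e9):
    """Mean AtG pathloss (dB); the default parameters are assumed urban values."""
    xk, yk, hk = dbs
    r = math.hypot(xk - device[0], yk - device[1])
    theta = 90.0 if r == 0 else math.degrees(math.atan(hk / r))
    p_los = 1.0 / (1.0 + alpha * math.exp(-beta * (theta - alpha)))
    fspl = 20.0 * math.log10(4.0 * math.pi * fc * math.hypot(r, hk) / 299_792_458.0)
    return fspl + p_los * eta_los + (1.0 - p_los) * eta_nlos

def coverage_indicator(dbs, device, threshold_db):
    """1 if the device experiences pathloss below the threshold T, else 0."""
    return 1 if pathloss_db(dbs, device) < threshold_db else 0

def coverage_ratio(positions, devices, threshold_db):
    """Fraction of devices covered by at least one DBS."""
    covered = sum(max(coverage_indicator(p, d, threshold_db) for p in positions)
                  for d in devices)
    return covered / len(devices)

if __name__ == "__main__":
    devices = [(0.0, 0.0), (3000.0, 0.0)]
    for threshold in (90.0, 130.0):
        print(threshold, coverage_ratio([(0.0, 0.0, 100.0)], devices, threshold))
```

Because the objective is a count of indicator values, it is piecewise constant in the DBS coordinates, which is one reason gradient-free swarm methods are a natural fit.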

III. SWARM INTELLIGENCE
Swarm intelligence approaches have received great focus from multiple domains, such as science, engineering, and industry [27], [28], [29]. Swarm intelligence is a branch of computational intelligence concerned with the collective self-organizing behavior of a species' population. In this section, we present well-known swarm intelligence approaches, namely the CS, GWO, MBO, EHO, SSA, and PSO. The main operating principles of these approaches along with the relevant references are summarized in Table 1.

A. CUCKOO SEARCH
The CS algorithm is based on the breeding behavior of certain species of cuckoos [30]. There exist three types of breeding, namely intraspecific brood parasitism, cooperative breeding, and nest takeover. In the first type of breeding, a bird places its eggs in the nest of another bird. If the nest owner discovers that the eggs are not their own, they throw them away or abandon the nest. In addition, some species have evolved and become specialized in the mimicry of the colors and patterns of other birds' eggs. This reduces the probability of their eggs being abandoned, therefore increasing the survival rate.
The mathematical model of the CS breeding is based on the rules below:
• The number of nests is fixed.
• Each cuckoo lays one egg at a time in a random nest of another bird.
• A foreign egg is discovered with a probability of p ∈ [0, 1].
• The eggs that survive will carry over to the next generation.
Algorithm 1 shows the CS procedure.
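The rules above can be sketched as follows. This is a simplified illustration for a minimization problem, not the exact procedure of Algorithm 1: the Levy step uses Mantegna's method, and the step scale, the population size, and the rule of re-initializing each non-best nest with probability p are assumptions of this sketch.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """One Levy-distributed step via Mantegna's method (beta is an assumed exponent)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.gauss(0.0, sigma) / abs(rng.gauss(0.0, 1.0)) ** (1 / beta)

def cuckoo_search(cost, lower, upper, n=15, p=0.25, iters=200, step=0.01, seed=0):
    """Minimize cost over the box [lower, upper] with a fixed number of nests n."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]

    nests = [random_nest() for _ in range(n)]
    costs = [cost(x) for x in nests]
    for _ in range(iters):
        # A randomly chosen cuckoo lays an egg generated by a Levy flight.
        i = rng.randrange(n)
        egg = [min(max(x + step * (hi - lo) * levy_step(rng), lo), hi)
               for x, lo, hi in zip(nests[i], lower, upper)]
        egg_cost = cost(egg)
        # The egg is dropped in a random nest and kept only if it is better.
        j = rng.randrange(n)
        if egg_cost < costs[j]:
            nests[j], costs[j] = egg, egg_cost
        # Foreign eggs are discovered with probability p; the best nest is kept.
        best = costs.index(min(costs))
        for k in range(n):
            if k != best and rng.random() < p:
                nests[k] = random_nest()
                costs[k] = cost(nests[k])
    best = costs.index(min(costs))
    return nests[best], costs[best]

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(cuckoo_search(sphere, [-5.0, -5.0], [5.0, 5.0]))
```

Since the best nest is never abandoned and a nest is only overwritten by a better egg, the best cost cannot get worse from one generation to the next.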

B. GREY WOLF OPTIMIZATION
The GWO algorithm mimics the leadership, hierarchy, and hunting mechanism of grey wolves in nature [34]. There exist four types of grey wolves, namely alpha, beta, delta, and omega, that are employed for simulating the leadership hierarchy. Grey wolf hunting involves three main phases, namely a) tracking and approaching the prey, b) pursuing and encircling the prey until it stops moving, and c) attacking the prey.

Algorithm 1 Cuckoo Search
1: Initialize a population of n host nests x_i (i = 1, 2, ..., n)
2: while t < t_max do
3:   Select a cuckoo randomly using Levy flights [33], [54]
4:   Evaluate its fitness F_i
5:   Choose a random nest x_j (j = rand(1, n))
6:   if F_i > F_j then
7:     replace j by the new solution
8:   end if
9:   Abandon the nests whose foreign eggs are discovered (probability p) and build new ones
10: end while

The GWO algorithm is mathematically modeled based on the aforementioned hierarchy and hunting method. Therefore, the fittest solution is considered the alpha, while the second and third best solutions are the beta and delta, respectively. The rest of the candidate solutions are assumed to be of the omega type. The alpha, beta, and delta lead the optimization, while the omegas follow.
The positions of candidate solutions (wolves) vary during optimization around α, β, and δ as:
$$\vec{D} = \left|\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)\right|, \qquad \vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}$$
where $t$ indicates the current iteration, $\vec{X}_p$ is the position vector of the prey, and $\vec{X}$ indicates the position vector of the wolf.
The vectors $\vec{A}$ and $\vec{C}$ are calculated as:
$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad \vec{C} = 2\vec{r}_2$$
where the components of $\vec{a}$ are linearly decreased from 2 to 0 throughout the iterations, while $\vec{r}_1$ and $\vec{r}_2$ are randomly selected in [0, 1]. Grey wolves recognize the location of potential prey and encircle it. This is usually guided by the alpha, while the beta and delta can also participate. However, in a wide search space, the wolves have no prior knowledge of the location of the optimum (prey). To mathematically model this hunting behavior, it is assumed that the alpha, beta, and delta have better knowledge about the potential location of prey. Therefore, the three best solutions obtained so far are retained, and the other search agents (including the omegas) update their positions according to the positions of these best search agents. To this end, the following formulas are derived:
$$\vec{D}_\alpha = \left|\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}\right|, \quad \vec{D}_\beta = \left|\vec{C}_2 \cdot \vec{X}_\beta - \vec{X}\right|, \quad \vec{D}_\delta = \left|\vec{C}_3 \cdot \vec{X}_\delta - \vec{X}\right|$$
$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta$$
$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}$$

Algorithm 2 Grey Wolf Optimization
1: Initialize the grey wolf population X_i (i = 1, 2, ..., n)
2: Initialize a, A, and C
3: Calculate the fitness of each search agent:
   X_α: the best search agent
   X_β: the second best search agent
   X_δ: the third best search agent
4: while t < t_max do
5:   for i = 1 to n do
6:     Update the position of the search agent using (11)
7:   end for
8:   Update a, A, and C
9:   Calculate the fitness of all search agents
10:  Update X_α, X_β, and X_δ
11:  t = t + 1
12: end while

At the final phase of the hunt, the wolves attack their prey by moving in. Mathematically, this is translated as reducing $\vec{a}$, which in turn narrows the range of $\vec{A}$. When $|\vec{A}| < 1$, the next position of the search agent is any position between the current position and the optimal position.
A grey wolf's search space is dominated by the positions of alpha, beta, and delta. These diverge from each other in order to search for prey and converge when they attack it. This divergence is modeled by utilizing $\vec{A} \in (-\infty, -1) \cup (1, \infty)$ in order to force the search agent to diverge from the prey. Additionally, the exploration is also imposed by the $\vec{C}$ term. Specifically, this term provides random weights so that the impact of the prey during the search is emphasized ($C > 1$) or de-emphasized ($C < 1$). This forces the GWO to feature a more random behavior that favors exploration instead of converging to potential local optima.
Algorithm 2 summarizes the GWO process. The search starts with generating a random population of grey wolves (candidate solutions). Throughout the iterations, the alpha, beta, and delta wolves estimate the probable position of the prey (optimum solution). The parameter $a$ is decreased from 2 to 0 in order to shift the emphasis from exploration to exploitation. Candidate solutions tend to diverge from the prey if $|\vec{A}| > 1$ or converge towards the prey if $|\vec{A}| < 1$. The GWO process is terminated when the maximum number of iterations is reached.
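The GWO process described above can be sketched as follows for a minimization problem. The sketch assumes the commonly used update in which each wolf moves to the average of the three positions estimated from the alpha, beta, and delta; the function name, the population size, and the clipping to the search box are assumptions of this example.

```python
import random

def grey_wolf_optimization(cost, lower, upper, n=20, iters=100, seed=0):
    """Minimize cost over the box [lower, upper]; requires n >= 3 wolves."""
    rng = random.Random(seed)
    dim = len(lower)
    wolves = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(n)]
    best, best_cost = None, float("inf")
    for t in range(iters):
        ranked = sorted(wolves, key=cost)
        leaders = [list(w) for w in ranked[:3]]  # copies of alpha, beta, delta
        if cost(leaders[0]) < best_cost:
            best, best_cost = list(leaders[0]), cost(leaders[0])
        a = 2.0 * (1.0 - t / iters)  # decreases linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                estimate = 0.0
                for lead in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * lead[d] - w[d])
                    estimate += lead[d] - A * D
                # Average of the three estimates, clipped to the search box.
                w[d] = min(max(estimate / 3.0, lower[d]), upper[d])
    final = min(wolves, key=cost)
    if cost(final) < best_cost:
        best, best_cost = list(final), cost(final)
    return best, best_cost

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(grey_wolf_optimization(sphere, [-10.0] * 3, [10.0] * 3))
```

Early iterations allow |A| > 1 and therefore exploration, while late iterations have |A| close to 0, so the wolves collapse towards the leaders.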

C. MONARCH BUTTERFLY OPTIMIZATION
MBO is a meta-heuristic algorithm that mimics the migration of butterflies from the northern United States to Mexico. MBO consists of two main processes [39]. The first process mimics the way a number of butterflies move from their current to a new position based on the migration operator. In the second process, the algorithm tunes the position of the rest of the butterflies by adjusting the migration operator.
To model the behavior of the monarch butterflies, the following rules are established:
• The whole population is divided into two locations, namely land A and land B.
• A child butterfly is born by the migration operator from a parent either in land A or land B.
• A newborn butterfly will take the place of its parent if it has better fitness.
• The butterflies featuring the best fitness will carry over to the next generation, and cannot be modified by any operators. This guarantees that the quality of the butterfly population will not deteriorate over the generations.
The number of butterflies in each region is denoted by:
$$NP_1 = \lceil p \cdot NP_{total} \rceil, \qquad NP_2 = NP_{total} - NP_1$$
where $NP_1$ and $NP_2$ are the sub-populations in land A and B, respectively, while $NP_{total}$ is the total population number and $p$ is the ratio of monarch butterflies in land A. The value $r = 1.2 \cdot rand()$ is introduced, where $rand()$ denotes a uniform random number. If $r \le p$, a new butterfly is generated using (16):
$$x_{i,k}^{t+1} = x_{r_1,k}^{t} \tag{16}$$
where $x_{i,k}^{t+1}$ indicates the k-th element of $x_i$ at generation $t + 1$ that corresponds to the position of the butterfly $i$. Similarly, $x_{r_1,k}^{t}$ indicates the k-th element of $x_{r_1}$ that is the new position of butterfly $r_1$. Butterfly $r_1$ is randomly selected from sub-population A (i.e., the one located in land A).
If $r > p$, a butterfly is generated using (17):
$$x_{i,k}^{t+1} = x_{r_2,k}^{t} \tag{17}$$
where $x_{r_2,k}^{t}$ indicates the k-th element of $x_{r_2}$ at generation $t$ that is the newly generated position of butterfly $r_2$.
It is apparent that the p ratio affects the direction of the migration operator. If p is large, more butterflies in land A will be selected, and thus, more butterflies from sub-population A will carry over to the next generations. Similarly, if p is small, more butterflies in land B will be selected and sub-population B will mainly affect the population of new butterflies. In this work, p is set to 5/12. Algorithm 3 represents the migration operator process.
Apart from the migration operator, the positions of butterflies can also be updated by the adjusting operator, as follows. For all elements in monarch butterfly $j$, if $rand() \le p$, the butterfly is updated as:
$$x_{j,k}^{t+1} = x_{best,k}^{t} \tag{18}$$
where $x_{best,k}^{t}$ represents the k-th element of $x_{best}$ that is the best butterfly in land A and land B, at generation $t$. On the other hand, if $rand() > p$, the butterfly is updated as:
$$x_{j,k}^{t+1} = x_{r_3,k}^{t} \tag{19}$$
where $x_{r_3,k}^{t}$ indicates the k-th element of $x_{r_3}$ that is randomly selected in land B (i.e., $r_3 \in \{1, 2, \ldots, NP_2\}$). Under this condition, if $rand() > BAR$, where $BAR$ is the butterfly adjusting rate, it can be further updated as:
$$x_{j,k}^{t+1} = x_{j,k}^{t+1} + \alpha \times (dx_k - 0.5) \tag{20}$$
where $dx$ is the walk step of butterfly $j$ that can be calculated by the Levy flight [33], [54] and $\alpha$ is the weighting factor given by:
$$\alpha = S_{max} / t^2 \tag{21}$$
where $S_{max}$ is the maximum walk step. Therefore, a bigger $\alpha$ corresponds to a longer search step and, thus, increases the exploration process. On the other hand, a smaller $\alpha$ decreases the search step and encourages the exploitation process. The process of the adjusting operator is shown in Algorithm 4.

Algorithm 3 Migration operator
1: for i = 1 to NP_1 do
2:   for k = 1 to D do
3:     Randomly generate a random number uniformly distributed: r = rand() × period
4:     if r ≤ p then
5:       Randomly select a monarch butterfly in subpopulation A
6:       Generate the k-th element of $x_i^{t+1}$ using (16)
7:     else
8:       Randomly select a monarch butterfly in subpopulation B
9:       Generate the k-th element of $x_i^{t+1}$ using (17)
10:    end if
11:  end for
12: end for
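As an illustration of the migration operator, the sketch below builds each element of a new land A butterfly by copying the corresponding element from a randomly chosen butterfly in land A or land B. The function names are hypothetical, `period=1.2` reflects the factor in r = 1.2 rand(), and `keep_fitter` is one possible reading of the rule that a newborn replaces its parent only if it has better fitness (lower cost here).

```python
import random

def migration_operator(land_a, land_b, p=5 / 12, period=1.2, rng=None):
    """Generate one offspring per butterfly in land A, element by element."""
    rng = rng or random.Random(0)
    dim = len(land_a[0])
    offspring = []
    for _ in range(len(land_a)):
        child = []
        for k in range(dim):
            r = rng.random() * period
            # r <= p: copy element k from land A, otherwise from land B.
            donor = rng.choice(land_a) if r <= p else rng.choice(land_b)
            child.append(donor[k])
        offspring.append(child)
    return offspring

def keep_fitter(parents, offspring, cost):
    """A newborn butterfly replaces its parent only if it has a lower cost."""
    return [c if cost(c) < cost(q) else q for q, c in zip(parents, offspring)]

if __name__ == "__main__":
    land_a = [[1.0, 10.0], [2.0, 20.0]]
    land_b = [[3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
    kids = migration_operator(land_a, land_b, rng=random.Random(1))
    print(keep_fitter(land_a, kids, lambda x: sum(v * v for v in x)))
```

With p = 5/12 and a period of 1.2, the chance of copying from land A is p / 1.2, so a larger p biases the offspring towards sub-population A.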

D. ELEPHANT HERD OPTIMIZATION
EHO is a swarm-based meta-heuristic search approach, inspired by the herding behavior of an elephant group [42]. Male elephants often live in isolation, while females live in family groups. An elephant group is a complicated social structure that consists of multiple clans under the leadership of the oldest female elephant. The elephant behavior is modeled using two operators, namely the clan updating operator and the separating operator. The global optimum is achieved through the optimization of these two operators. The following assumptions are established:
• The whole population consists of several clans, where each clan has a fixed number of elephants.
• At each generation, a fixed number of male elephants will leave their group and live in solitude.
• The elephants in each clan live together under the leadership of the oldest female.

Algorithm 4 Adjusting operator
1: for j = 1 to NP_2 do
2:   Calculate the walk step $dx$ using the Levy flight method
3:   Calculate the weighting factor using (21)
4:   for k = 1 to D do
5:     Randomly generate a random number uniformly distributed: r = rand() × period
6:     if r ≤ p then
7:       Randomly select a monarch butterfly in subpopulation A
8:       Generate the k-th element of $x_j^{t+1}$ using (18)
9:     else
10:      Randomly select a monarch butterfly in subpopulation B
11:      Generate the k-th element of $x_j^{t+1}$ using (19)
12:      if rand() > BAR then
13:        Update the k-th element of $x_j^{t+1}$ using (20)
14:      end if
15:    end if
16:  end for
17: end for
18: return $x_i^{t+1}$

The first operator, namely the clan updating operator, is focused on updating the position of each elephant in the clan, depending on the position of the leader. Therefore, the position of elephant $j$ in clan $ci$ is updated by:
$$x_{new,ci,j} = x_{ci,j} + \alpha \times (x_{best,ci} - x_{ci,j}) \times r \tag{22}$$
where $x_{new,ci,j}$ and $x_{ci,j}$ are the new and old position of elephant $j$ in clan $ci$, respectively. $\alpha \in [0, 1]$ is a scale factor that determines the leader influence. Finally, $r$ is a uniform random number in [0, 1].
The fittest elephant in each clan is updated by:
$$x_{new,ci,j} = \beta \times x_{center,ci} \tag{23}$$
where $\beta \in [0, 1]$ is a factor that determines the influence of $x_{center,ci}$ on $x_{new,ci,j}$. $x_{center,ci}$ is considered the center of clan $ci$ and is calculated as:
$$x_{center,ci,d} = \frac{1}{n_{ci}} \sum_{j=1}^{n_{ci}} x_{ci,j,d} \tag{24}$$
where $d$ indicates the dimension, $n_{ci}$ is the number of elephants in clan $ci$, and $x_{ci,j,d}$ is the position of elephant $j$ in the d-dimension.
The clan updating operator is summarized in Algorithm 5.
The male elephants will leave their group and live in isolation. It is assumed that all elephants featuring the worst fitness at each generation will leave their respective groups. This behavior is modeled by the separating operator as:
$$x_{worst,ci} = x_{min} + (x_{max} - x_{min} + 1) \times r \tag{25}$$
where $x_{max}$ and $x_{min}$ are, respectively, the upper and lower bound of the elephant's position, while $x_{worst,ci}$ is the elephant with the worst fitness in clan $ci$. Finally, $r$ is a uniform random number in [0, 1]. Algorithm 6 describes the separating operator procedure, while Algorithm 7 summarizes the whole EHO process.

Algorithm 7 Elephant Herd Optimization
7:   Sort the elephants according to their fitness
8:   Perform clan updating operator (Algorithm 5)
9:   Perform separating operator (Algorithm 6)
10:  Evaluate population using the updated positions
11:  t = t + 1
12: end while
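The two EHO operators can be sketched together as below, for a minimization problem. This is an illustration under stated assumptions rather than the exact procedure of Algorithms 5 to 7: the clan sizes and the values of alpha and beta are arbitrary, the separated elephant is redrawn uniformly inside the bounds, and the best solution found so far is tracked separately.

```python
import random

def elephant_herd(cost, lower, upper, clans=4, per_clan=5, alpha=0.5, beta=0.1,
                  iters=100, seed=0):
    """Minimize cost over the box [lower, upper] with clan updating and separating."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(x):
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]

    pop = [[random_position() for _ in range(per_clan)] for _ in range(clans)]
    best = list(min((x for clan in pop for x in clan), key=cost))
    best_cost = cost(best)
    for _ in range(iters):
        for clan in pop:
            clan.sort(key=cost)
            leader = list(clan[0])
            center = [sum(x[d] for x in clan) / per_clan for d in range(dim)]
            for j, x in enumerate(clan):
                if j == 0:
                    new = [beta * c for c in center]  # fittest elephant follows the center
                else:
                    r = rng.random()
                    new = [xd + alpha * (ld - xd) * r for xd, ld in zip(x, leader)]
                clan[j] = clip(new)
            # Separating operator: the worst elephant leaves and is replaced.
            worst = max(range(per_clan), key=lambda j: cost(clan[j]))
            clan[worst] = random_position()
            for x in clan:
                fx = cost(x)
                if fx < best_cost:
                    best, best_cost = list(x), fx
    return best, best_cost

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(elephant_herd(sphere, [-5.0, -5.0], [5.0, 5.0]))
```

Tracking the best solution outside the clans matters here, because moving the fittest elephant towards the scaled clan center can make its fitness worse.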

E. SALP SWARM ALGORITHM
Salps belong to the family of Salpidae and are very similar to jellyfish. In deep oceans, salps form a swarm called a salp chain. SSA mimics the swarming behavior of salps [46].
To mathematically model a salp chain, it is assumed that the whole population consists of two parts, namely the leading salp and the following salps. The leader is the salp at the top of the chain and guides the swarm, while the rest of the salps compose the followers.
The n variables of an optimization problem correspond to the positions of salps in an n-dimensional search space. The swarm's target is to search for the food source, denoted by $F$. The position of the leader is updated as:
$$x_j^1 = \begin{cases} F_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 \ge 0.5 \\ F_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{26}$$
where $x_j^1$ denotes the position of the leader in the j-dimension, while $F_j$ denotes the position of the food source in the j-dimension. $ub_j$ and $lb_j$ indicate the upper and lower bound of the j-dimension, respectively. Finally, $c_1$ is a time-varying coefficient, while $c_2$ and $c_3$ are random numbers.
Particularly, coefficient $c_1$ is the most important parameter, since it balances exploration and exploitation, and is calculated as:
$$c_1 = 2e^{-\left(\frac{4l}{L}\right)^2} \tag{27}$$
where $l$ and $L$ correspond to the current and maximum iterations, respectively. Parameters $c_2$ and $c_3$ are random numbers uniformly distributed in [0, 1]. $c_2$ is the step size, while $c_3$ dictates if the next position is towards positives or negatives.
The position of each follower salp $i$ in the j-dimension is calculated by:
$$x_j^i = \frac{1}{2}at^2 + v_0 t \tag{28}$$
where $t$ is the time, $v_0$ is the initial speed, and $a = \frac{v_{final}}{v_0}$.
Throughout the optimization, the time corresponds to the iterations. Therefore, considering $v_0 = 0$, the above equation can be expressed as:
$$x_j^i = \frac{1}{2}\left(x_j^i + x_j^{i-1}\right) \tag{29}$$
The whole SSA is summarized in Algorithm 8.
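The leader and follower updates can be sketched as follows for a minimization problem. The sketch assumes that the leader moves towards positive or negative values depending on whether c3 is at least 0.5, which is a common implementation choice, and that positions are clipped to the bounds; the function name and default sizes are also assumptions.

```python
import math
import random

def salp_swarm(cost, lower, upper, n=20, iters=100, seed=0):
    """Minimize cost over the box [lower, upper] with one leader and n - 1 followers."""
    rng = random.Random(seed)
    dim = len(lower)
    salps = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(n)]
    food = list(min(salps, key=cost))  # best position found so far
    food_cost = cost(food)
    for l in range(1, iters + 1):
        c1 = 2.0 * math.exp(-((4.0 * l / iters) ** 2))  # shrinks over the iterations
        for i in range(n):
            for j in range(dim):
                if i == 0:
                    # Leader: move around the food source.
                    c2, c3 = rng.random(), rng.random()
                    step = c1 * ((upper[j] - lower[j]) * c2 + lower[j])
                    value = food[j] + step if c3 >= 0.5 else food[j] - step
                else:
                    # Follower: midpoint between itself and the salp ahead of it.
                    value = 0.5 * (salps[i][j] + salps[i - 1][j])
                salps[i][j] = min(max(value, lower[j]), upper[j])
        for s in salps:
            s_cost = cost(s)
            if s_cost < food_cost:
                food, food_cost = list(s), s_cost
    return food, food_cost

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(salp_swarm(sphere, [-5.0, -5.0], [5.0, 5.0]))
```

Because c1 decays with the iteration counter, the leader takes large steps early (exploration) and small steps around the food source later (exploitation).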

F. PARTICLE SWARM OPTIMIZATION
PSO is an optimization approach based on the flocking behavior of birds in search of food [50]. In each generation, the position of each particle in the swarm is updated according to two variables. The first variable corresponds to the particle's own knowledge about the location of the solution, while the second variable corresponds to the swarm's knowledge of the solution location.

Algorithm 9 Particle Swarm Optimization
6:   for each particle do
7:     update the particle position (30)
8:     evaluate the fitness function in that position
9:     update the particle's knowledge of the best solution location $x^i_{best}$

According to the PSO operation, a particle's new position can be obtained as
$$v_i^{j+1} = w\,v_i^j + c_1 r_1 \left(x^i_{best} - x_i^j\right) + c_2 r_2 \left(x_{global\,best} - x_i^j\right), \qquad x_i^{j+1} = x_i^j + v_i^{j+1} \tag{30}$$
where $i$ denotes the particle, $j$ denotes the generation, and $v_i^j$ is the velocity of the particle. Additionally, $w$ is the inertia parameter, while $c_1$ and $c_2$ are the coefficients adjusting the exploration. Moreover, $r_1$ and $r_2$ are random values in the range [0, 1], while $x^i_{best}$ and $x_{global\,best}$ denote the particle's and swarm's knowledge of the solution location, respectively. The PSO algorithm is summarized in Algorithm 9.
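The PSO update can be sketched as below for a minimization problem. The velocity form, the values of w, c1, and c2, and the clipping of positions to the search box are assumptions of this sketch and are not taken from the evaluation setup.

```python
import random

def particle_swarm(cost, lower, upper, n=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost over the box [lower, upper]."""
    rng = random.Random(seed)
    dim = len(lower)
    xs = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(n)]
    vs = [[0.0] * dim for _ in range(n)]
    pbest = [list(x) for x in xs]            # each particle's own best position
    pbest_cost = [cost(x) for x in xs]
    g = pbest_cost.index(min(pbest_cost))
    gbest, gbest_cost = list(pbest[g]), pbest_cost[g]  # swarm's best position
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] = min(max(xs[i][d] + vs[i][d], lower[d]), upper[d])
            fx = cost(xs[i])
            if fx < pbest_cost[i]:
                pbest[i], pbest_cost[i] = list(xs[i]), fx
                if fx < gbest_cost:
                    gbest, gbest_cost = list(xs[i]), fx
    return gbest, gbest_cost

if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(particle_swarm(sphere, [-5.0, -5.0], [5.0, 5.0]))
```

For the DBS placement problems, `cost` would be the average pathloss (or the negative coverage ratio) evaluated on a flattened vector of DBS coordinates.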

IV. NUMERICAL RESULTS
To evaluate the performance of each of the aforementioned algorithms, we carried out extensive Monte Carlo simulations, and the final results are generated by averaging 1000 simulations. Four propagation environments are considered, namely the suburban, urban, dense urban, and high-rise urban environments. The propagation parameters of each environment are presented in Table 2.
We consider three scenarios for the evaluation of the swarm intelligence algorithms. In the first scenario, we assume that a single DBS is hovering over the devices and we evaluate the performance of the algorithms in terms of minimizing the average pathloss under various parameters, such as the number of devices, number of search agents, number of generations, and the propagation environment. In the second and third scenarios, we assume that multiple DBSs are hovering over a number of devices deployed throughout a large area. Particularly, in the second scenario we evaluate the swarm intelligence algorithms towards minimizing the average pathloss for various numbers of DBSs, while in the third scenario, we evaluate these algorithms towards maximizing the coverage for various numbers of DBSs.

FIGURE 2. Average pathloss as a function of number of search agents.

A. SCENARIO 1: AVERAGE PATHLOSS USING A SINGLE DBS
In this scenario, a single DBS is hovering over a number of devices in a 1 km² area, while the number of uniformly distributed ground devices is set to 10, 20, 30, 40, and 50. The number of search agents (population) in each algorithm is set to 5, 25, 50, 75, and 100, while the maximum number of generations is set to 10, 50, 100, 200, and 500. The simulation parameters for the first scenario are summarized in Table 3.
Fig. 2 shows the average pathloss as a function of the number of search agents when there are 20 devices deployed in an urban environment, while the maximum number of generations is 100. CS, EHO, GWO, and SSA feature almost the same respective performance, regardless of the number of search agents. Specifically, CS, GWO, and SSA feature the best performance, while EHO features a slightly worse performance. On the other hand, the number of search agents affects the performance of the MBO and PSO approaches. In particular, MBO performs better when the number of search agents is increased.
Fig. 3 shows the average pathloss as a function of the maximum generations, when the numbers of search agents and devices are 25 and 20, respectively, while an urban environment is considered. According to the results, when the number of generations is 100, 200, and 500, the algorithms feature a similar performance. On the other hand, the algorithms perform worse when there are only 10 and 50 generations. This is expected, as the optimum is more closely approximated when more generations are used. Overall, CS, GWO, and SSA have the same performance, while PSO and MBO have the worst and second worst performance, respectively.
The average pathloss in various propagation environments is shown in Fig. 4. The numbers of maximum generations and search agents are 100 and 25, respectively. A deployment area with 20 devices is considered. It is apparent that the propagation environment has a significant impact on the pathloss experienced by the devices. This is expected, as the building density in the deployment area differs across environments. Specifically, the suburban environment features the lowest pathloss as there are very few buildings. On the other hand, the high-rise urban and dense urban environments feature the highest and second highest pathloss, respectively. In addition, CS, GWO, and SSA feature a similar overall performance in all cases.
For the evaluation with a varying number of devices, an urban area is considered as the deployment area. According to the results, as the number of devices increases, the pathloss is increased because deploying the DBS close to all the devices becomes more difficult. CS, GWO, and SSA feature a similar overall performance in all cases.

B. SCENARIO 2: AVERAGE PATHLOSS USING MULTIPLE DBSS
In this scenario, we evaluate the average pathloss experienced by a number of devices deployed in a larger area, while multiple DBSs provide connectivity. The area size is 5 km², the number of devices is 100, and the number of DBSs ranges from 1 to 10. Moreover, the number of search agents (population) and maximum generations in each algorithm are 25 and 100, respectively. The simulation parameters for the second scenario are summarized in Table 4.
The average pathloss as a function of the number of DBSs for various environments is depicted in Fig. 6. In all cases, when the number of DBSs is increased, the average pathloss is decreased. This is expected as when there are more DBSs, they can be deployed closer to the devices, thus, reducing the experienced pathloss. According to the results, GWO has the best overall performance, while MBO features the worst overall performance in all cases. EHO, CS, SSA, and PSO feature a similar performance. Furthermore, as also deduced in the first scenario, the suburban and high-rise urban environments have the lowest and highest average pathloss, respectively.

C. SCENARIO 3: COVERAGE PROBABILITY USING MULTIPLE DBSS
In this scenario, we evaluate the coverage probability over a number of devices deployed in a larger area, while multiple DBSs provide connectivity. A device is considered under coverage if its experienced pathloss is lower than a threshold. The area size is 5 km², the number of devices is 100, and the number of DBSs ranges from 1 to 10. Also, the threshold is set to 90, 100, 110, and 120 dB. Moreover, the number of search agents (population) and maximum generations in each algorithm are 25 and 100, respectively. The simulation parameters for the third scenario are summarized in Table 5.
The coverage probability as a function of the number of DBSs for various thresholds is presented in Fig. 7-Fig. 10. As expected, when the number of DBSs increases, the coverage probability also increases, as the DBSs can be spread throughout the area to cover more devices. As far as the threshold is concerned, lower threshold values correspond to lower coverage probability. This is expected, as only the devices that are closer to the DBSs experience a pathloss lower than the threshold. Overall, when the threshold is set to 90 dB, more DBSs are required to maintain a high coverage probability.
With respect to the propagation environment, we can notice that the suburban environment requires the fewest DBSs to maintain a high coverage probability. On the other hand, in the high-rise urban environment, even with a threshold value of 120 dB and 10 DBSs, the coverage probability is about 75%.
Finally, all swarm intelligence approaches feature a similar performance for higher numbers of DBSs and threshold values. Nevertheless, for lower threshold values, GWO and MBO have the best and worst overall performance, respectively, while the remaining approaches feature a relatively similar performance.

V. PATHLOSS ANALYSIS AND RESULTS DISCUSSION
After the DBSs have been deployed in the optimal locations, we present an analysis of the pathloss distribution for a single DBS providing coverage to the devices. Fig. 11 depicts the coverage area of a DBS, where h_d denotes the hovering altitude, r denotes the radius of the coverage area, and θ denotes the elevation angle. Since the devices are uniformly distributed over the whole area, their deployment inside the coverage area will also follow a uniform distribution.
The derivative of (2) with respect to the elevation angle is obtained by [26]
$$\frac{\partial PL}{\partial \theta} = \frac{\pi}{9 \ln(10)} \, \ldots$$
By setting the equation above equal to zero and solving for θ, the optimal elevation angles are found for the various environments.
As stated in the previous subsection, a device is considered to be covered if the experienced pathloss is lower than a predefined threshold. Therefore, the coverage probability can be expressed as in (33). The probability in the rightmost side of (33) is also the cumulative distribution function of the radius. By defining R to be the maximum radius of the coverage area (i.e., the radius where all devices experience pathloss equal to PL), the probability density function (PDF) of r is obtained. By integrating the PDF, the cumulative distribution function (CDF) in (36) is obtained. Therefore, by combining (33) and (36), the CDF of the pathloss is obtained.
The CDF of the pathloss is depicted in Fig. 12. As the maximum threshold is increased, the coverage radius also increases, as more devices located further from the DBS can be covered. Also, based on the results, we can notice that the convergence rate is similar for all cases of T_max. Of note, the CDF is the same for all environments, since the propagation parameters also affect the maximum coverage radius, thus resulting in the same CDF.
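The distribution of the radius can be checked numerically. The sketch below assumes the standard result for points drawn uniformly over a disc of radius R, namely a PDF of 2r/R² and a CDF of (r/R)²; the exact form of the paper's own expressions is not reproduced here, and all function names are hypothetical. Points are drawn by rejection sampling so that the check does not presuppose the CDF it verifies.

```python
import math
import random


def sample_disc_radii(max_radius, count, rng):
    """Radii of points drawn uniformly over a disc, via rejection sampling."""
    radii = []
    while len(radii) < count:
        x = rng.uniform(-max_radius, max_radius)
        y = rng.uniform(-max_radius, max_radius)
        r = math.hypot(x, y)
        if r <= max_radius:  # keep only points that fall inside the disc
            radii.append(r)
    return radii


def empirical_cdf(radii, r):
    """Fraction of sampled radii that do not exceed r."""
    return sum(1 for value in radii if value <= r) / len(radii)


def analytic_cdf(r, max_radius):
    """Assumed CDF of the radius for a uniform disc: (r / R)^2."""
    return (r / max_radius) ** 2


rng = random.Random(0)  # fixed seed so the run is repeatable
R = 1000.0              # hypothetical maximum coverage radius in metres
radii = sample_disc_radii(R, 20000, rng)

for r in (250.0, 500.0, 750.0):
    print(r, round(empirical_cdf(radii, r), 3), analytic_cdf(r, R))
```

The empirical and analytic values should agree to within sampling noise, which illustrates why half of the covered devices lie beyond roughly 71% of the maximum radius.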
In order to further evaluate the comparison results, we perform non-parametric statistical tests [55], [56]. Firstly, we apply the Friedman test using the data presented in Fig. 2-Fig. 10. Through the simulations, 99 results for each approach were generated in total. Based on each result, the approaches were ranked from 1st to 6th and the average rank of each approach was determined. Table 6 presents the average ranking of the algorithms obtained by the Friedman test. According to the results, GWO is ranked as the highest performing approach in optimizing the DBS position in terms of maximizing the network performance. On the other hand, MBO features the lowest rank. Furthermore, to test whether GWO is significantly better than all the remaining approaches, we perform the Wilcoxon signed-rank test with a significance level of 0.05. The respective results are summarized in Table 7. Based on the results, GWO is significantly better than the other approaches.
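The ranking step of the Friedman test can be sketched as follows. This is a minimal illustration, assuming that lower metric values are better (as for pathloss; coverage probabilities would be negated first) and that tied values share the average rank. The approach labels and the data are hypothetical and are not taken from Table 6, and the statistic shown is the textbook form without tie correction.

```python
def rank_row(values):
    """Ranks within one result (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def friedman_average_ranks(results):
    """results[name] is a list of metric values; all lists have equal length."""
    names = list(results)
    n_results = len(results[names[0]])
    sums = {name: 0.0 for name in names}
    for t in range(n_results):
        row = [results[name][t] for name in names]
        for name, rank in zip(names, rank_row(row)):
            sums[name] += rank
    return {name: sums[name] / n_results for name in names}


def friedman_statistic(average_ranks, n_results):
    """Chi-square form of the Friedman statistic (no tie correction)."""
    k = len(average_ranks)
    squares = sum(rank * rank for rank in average_ranks.values())
    return 12.0 * n_results / (k * (k + 1)) * (squares - k * (k + 1) ** 2 / 4.0)


# Hypothetical pathloss results in dB for three placeholder approaches.
results = {
    "approach_a": [100.0, 101.0, 99.0, 100.5],
    "approach_b": [102.0, 103.0, 101.0, 100.5],
    "approach_c": [104.0, 105.0, 103.0, 106.0],
}
average_ranks = friedman_average_ranks(results)
print(average_ranks)
print(friedman_statistic(average_ranks, 4))
```

The pairwise Wilcoxon signed-rank comparison is not covered by this sketch; in practice both tests are available in standard statistics libraries.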
However, this conclusion cannot be generalized to all optimization problems in electromagnetics. An important theorem in evolutionary optimization is the "No Free Lunch" (NFL) theorem [57], which concerns the average behavior of optimization algorithms over given spaces of optimization problems. NFL states that, when averaged over all possible optimization problems defined over some search space X, no algorithm has a performance advantage over any other. Thus, no optimization algorithm is able to solve every problem effectively and efficiently, and different optimizers are better suited to different types of optimization problems. Therefore, it is worthwhile to try to find the best suited optimizer for a specific optimization problem.
In light of the aforementioned results, the following remarks can be established:
• The number of search agents does not have a significant impact on the performance of each algorithm, except for the MBO and PSO algorithms.
• After a particular number of generations (i.e., 100 according to Fig. 3), the performance of each algorithm is no longer affected by additional generations. On the other hand, a lower number of generations leads to worse performance.
• The density, as well as the height of buildings, considerably affects the final results. Nevertheless, the differences among the performance of the algorithms remain the same in all cases.
• The number of devices has a limited impact on the average pathloss results. Apart from the case of 10 devices, where CS, EHO, and SSA perform considerably better, the remaining differences in each algorithm's performance are similar.
• The number of DBSs has a considerable impact on the average pathloss. When the number of DBSs is high, they can be placed closer to the users, thus reducing the average pathloss.
• The coverage probability is heavily dependent on the threshold value, since the users that are placed farther from the DBSs cannot be covered when the threshold value is low. Of note, in the high-rise urban environment, higher threshold values and a higher number of DBSs are required in order to maintain a high coverage probability.
• According to the Friedman ranking test, GWO is the best performing approach in finding the optimal DBS position, while MBO is the worst performing one. Also, as verified by the Wilcoxon significance test, GWO has a significantly better performance compared to the remaining approaches.
• Finally, the visual representations of the results (i.e., Fig. 2-Fig. 10) are verified by the statistical results of the Friedman and Wilcoxon tests.

VI. CONCLUSION
Motivated by the emergence of NG-IoT and the associated challenges in terms of massive connectivity, we investigated the efficiency of swarm intelligence approaches in optimizing the DBS deployment. In this direction, we designed three evaluation scenarios related to the overall network performance and evaluated the performance of six swarm intelligence approaches, namely CS, EHO, GWO, MBO, SSA, and PSO. In particular, these approaches were evaluated and compared in terms of finding the optimal DBS positions using various parameters, such as the number of search agents, number of maximum generations, propagation environment, number of devices, and area size. Additionally, we performed non-parametric statistical tests, namely the Friedman and Wilcoxon tests, in order to further evaluate the comparison results. The comparison results showed that GWO and MBO rank highest and lowest in the Friedman test, respectively. Furthermore, according to the Wilcoxon test, GWO performs significantly better than the rest of the approaches. Therefore, we deduce that GWO features the best performance in finding the DBS position in the 3D space so that the network performance is optimized. As a future extension of this work, we aim to modify the swarm intelligence approaches in order to consider multiple DBSs. By grouping the devices and assigning different DBSs, the overall performance of the network can be further increased. To further improve the network performance, non-orthogonal multiple access (NOMA) schemes can be utilized to schedule radio and power resources to the devices. In this direction, we aim to integrate this work with our previous ones in [58] and [59].