Deep-Ensemble-Learning-Based GPS Spoofing Detection for Cellular-Connected UAVs

Unmanned aerial vehicles (UAVs) are an emerging technology in 5G-and-beyond systems, with the promise of assisting cellular communications and supporting IoT deployment in remote and dense areas. Safe and secure navigation is essential for remote and autonomous UAV deployment. However, an open-source simulator can use commercial software-defined radio tools to generate fake global positioning system (GPS) signals and spoof the UAV GPS receiver into calculating wrong locations, causing the UAV to deviate from its planned trajectory. Fortunately, the existing mobile positioning system can provide additional navigation for cellular-connected UAVs and verify the UAV GPS locations for spoofing detection, but it needs at least three base stations (BSs) at the same time. In this article, we propose a novel deep-ensemble-learning-based, mobile-network-assisted UAV monitoring and tracking system for cellular-connected UAV spoofing detection. The proposed method uses the path losses of the communication between BSs and UAVs to indicate the UAV trajectory deviation caused by GPS spoofing. To increase the detection accuracy, three statistical methods are adopted to remove environmental impacts on path losses. In addition, deep ensemble learning methods are deployed on the edge cloud servers and use multilayer perceptron (MLP) neural networks to analyze the statistical features of the path losses and make a final decision, which imposes no additional requirements or energy consumption on UAVs. The experimental results show the effectiveness of our method in detecting GPS spoofing, achieving an accuracy above 97% with two BSs, while it can still achieve at least 83% accuracy with only one BS.


I. INTRODUCTION
Unmanned Aerial Vehicles (UAVs) play a crucial role in upcoming Internet of Things (IoT) platforms for remote data gathering and aerial data transit, supporting not only the delivery of IoT-based services but also the provision of communications to dense and remote areas [1]. According to the UAV market report in [2], the value of the UAV market will reach 58.4 billion dollars by 2026. As the number of UAVs keeps growing, the security of autonomous UAVs must be addressed before the envisaged growth in UAV-based applications and services [3].
As a response, the Unmanned Aircraft Systems (UAS) Traffic Management (UTM) systems have been developed by the Federal Aviation Administration to provide UAVs with mission-related security services, including UAV authentication, flight plan authorization, real-time tracking and geofencing [4]. Reliable information on UAV position is essential for its operations and for a UTM system to carry out its mission. The global navigation satellite system, particularly the Global Positioning System (GPS), is widely used to obtain UAV positioning information due to its global coverage and accuracy [4]. However, the unencrypted GPS signals are inherently vulnerable to spoofing attacks. In fact, an attacker can send counterfeit GPS signals to mislead the UAV's GPS receiver into generating false position information [5]. Furthermore, a malicious UAV may intentionally report forged GPS information to the UTM system, resulting in violation of no-fly zone regulation and/or collision risks. Therefore, an effective GPS spoofing detection approach is vital to guarantee safe and secure integration of UAVs in the airspace.
GPS spoofing detection methods include GPS navigation signal analysis methods (e.g., [6]-[12]) and GPS navigation message encryption methods (e.g., [13]-[20]). The former signal analysis methods detect spoofing either through the difference in direction of arrival between the satellite signals and the spoofer signals or through the cross-correlation property between the military and civil GPS signals, and both need a ground-truth source in their detection processes. The latter encryption methods embed an authentication signature into navigation messages to guard against GPS spoofing attacks. However, the encryption processes need secured infrastructure support as well as more computing resources on the GPS receiver. The limited battery capacity and weight load of UAVs inhibit the adoption of these GPS spoofing detection methods in UAV swarm systems.
Fortunately, the 3rd Generation Partnership Project (3GPP) has defined several standards to enhance long-term evolution support for Unmanned Aerial Systems (UAS) [21]-[24]. Those new standards allow the terrestrial cellular networks to provide identifying, locating, and tracking services for the UAS in order to enhance the security of UAS operation. In this vein, the authors in [25] proposed a UAV tracking method, namely Adaptive Trustable Residence Area (ATRA), to detect the spoofed GPS position by leveraging the up-link received signal strength indication. Regardless of the performance brought by the ATRA method, it requires at least three base stations at the same time. To overcome this deficiency of the ATRA method, the authors in [26] introduced a deep neural network on the edge server that allows live detection of GPS spoofing with three, two, or one base station, where the neural network model takes the statistical path loss features as inputs and produces the spoofing probability as its output. Despite the acceptable performance provided by the neural network, it needs to collect the path loss data from different base stations, which may lead to network congestion.
In this paper, we leverage the potential of deep ensemble methods and exploit the statistical features of path losses between a UAV and base stations to detect GPS spoofing for cellular-connected UAVs. Unlike the aforementioned GPS signal analysis and GPS information encryption contributions, the proposed approach is based on the analysis of the terrestrial mobile BS signal rather than the GPS satellite signal from medium Earth orbit. Compared with the GPS satellite signal, the BS signal is immune to ionospheric interference, which makes it more stable and reliable for detecting the UAV trajectory deviation caused by GPS spoofing. In addition, our approach needs no additional hardware or computation load at the UAV. In fact, the detection task is accomplished at the Multi-access Edge Computing (MEC) servers and the edge cloud server based on the path losses received from the mobile network. Moreover, its effectiveness is insensitive to changes in the environmental conditions, thanks to the use of statistical properties of path losses. Finally, the use of deep ensemble learning amongst edge servers helps the BSs to collaborate, which can mitigate the impact of downtilted directional sector antennas on spoofing detection performance. The main contributions of this paper are summarized as follows:
• We first formulate the spoofing detection process as a nonlinear and non-convex optimization problem, which is subject to the threshold, the statistical feature weights, and the number of BSs. The goal of the optimization function is to minimize the sum of hypothesis test errors of GPS spoofing by optimizing the threshold of the hypothesis test and the statistical feature weights of the differences between the theoretical path losses and the actual ones provided by BSs.
• Because path losses are vulnerable to environmental changes, we adopt three statistical methods to analyze the statistical properties of path losses, including moments, quartiles, and probability distributions, making our spoofing detection insensitive to changes in environmental conditions.
• We further propose a MEC architecture for deep ensemble learning-based GPS spoofing detection. In MEC servers, each MLP works independently to predict the spoofing probability of a GPS position reported by each BS. To decide whether the GPS position is spoofed or [...]
• In the end, we build a simulation platform using the 3GPP-defined path loss model and TensorFlow deep learning models. The experimental results illustrate the effectiveness of our proposed approach in detecting the spoofed GPS positions even with only one BS.
This manuscript contains a number of abbreviations. For ease of reading, Table I lists the abbreviations used in this paper. The rest of the paper is organized as follows. Section II summarizes related work in the literature. The system model, path loss model and hypothesis testing are described in Section III. Section IV introduces three statistical methods and formulates the GPS spoofing detection as an optimization problem. An MLP-based ensemble approach is proposed in Section V. Experiment results are illustrated in Section VI. Section VII concludes the paper.

II. RELATED WORK
Over the last decade, much research effort has been dedicated to studying the GPS spoofing problem. The proposed contributions can be categorized into the following five classes.

A. Navigation Signal Analysis Approaches
GPS navigation signal analysis is widely used to determine whether the GPS is spoofed or not. The authors in [9] proposed a multi-antenna spoofing detection technique which analyzes the DoA of GPS signals to discriminate between authentic and fake GPS signals. The received signals are considered spoofed if they arrive from the same source in space. Similarly, the work in [10] introduces a spatial signal processing approach for GPS spoofing detection and mitigation. The approach leverages multi-antenna reception and null-steering for, respectively, identifying and filtering out fake GPS signals. The adoption of the aforementioned approaches requires multiple antennas and incurs more computational load on the GPS receiver.
The methods in [11] and [12] rely on the cross-correlation between encrypted/military GPS signals received by a trusted receiver and the defended civil receiver in order to detect the spoofing of unencrypted GPS signals. A low cross-correlation indicates the presence of a spoofing attack. While these methods do not depend on a multi-antenna GPS receiver, they require a communication link between the defended receiver and a secure receiver to perform the cross-correlation.

B. Cryptographically Protected Navigation Message Approaches
To protect civil GPS receivers against spoofing attacks, the approaches falling in this category consider either the encryption or the authentication of the navigation messages. The navigation message encryption methods (e.g., [16]) ensure the confidentiality of the navigation message by ciphering its content. These methods are deemed impractical as their implementation requires changes to the GPS interface specifications [17]. The navigation message authentication techniques aim at guaranteeing the integrity of the navigation message and the authenticity of its source by digitally signing its content. For instance, Wesson et al. [18] combined signature-based authentication of GPS navigation messages with a statistical hypothesis test to prevent counterfeit navigation messages. Wu et al. [14] leveraged SM cryptographic algorithms to authenticate the BeiDou-II navigation messages. Similarly, the work in [19] presented a BeiDou-II navigation message authentication scheme based on digital signatures generated by an elliptic curve digital signature algorithm. In [20], a trusted execution environment was used to generate cryptographically-signed GPS messages in order to prevent their forgery. The European Galileo global navigation satellite system will provide an open navigation message authentication service (OS-NMA) [27] based on an adaptation of the Timed Efficient Stream Loss-tolerant Authentication (TESLA) scheme. OS-NMA enables source authentication and message integrity by transmitting the navigation message along with a message authentication code. The message authentication code key is derived from a one-way hash chain and is released after a pre-defined period of time. Even though navigation message authentication techniques are considered a practical and effective defense against GPS spoofing attacks, their use induces significant computational cost and latency due to signature verification. Moreover, some of them (e.g., OS-NMA) require the availability of loose time synchronization [28]. Finally, navigation message authentication based methods cannot withstand replay attacks.

C. Inertial Navigation System based Approaches
An INS uses an IMU comprising various inertial sensors (e.g., accelerometers, gyroscopes, and magnetometers) to assist navigation based on sensor measurements. The INS-based approaches detect GPS spoofing attacks by using the position estimated from the IMU readings to cross-validate the veracity of the reported GPS position. Lee et al. [29] used a fixed probability of false alarms and the root mean square error between the accelerometer outputs and the acceleration estimated from the GPS outputs to detect spoofed GPS signals.
In [30], the accelerometer readings were leveraged to identify spoofed GPS signals, where the spoofing decision is based on the probability density function. Feng et al. [31] used onboard gyroscope measurements to determine whether a UAV has been hijacked by GPS spoofing.
The INS has the advantage of operating without reliance on any external signals, making it immune to spoofing attacks. However, the main issue with INS is the accumulation of IMU measurement errors over time, which can negatively impact the detection accuracy.

D. Mobile Cellular Network based Approaches
Recently, a new class of anti-GPS spoofing approaches has been introduced, leveraging the localization ability of mobile cellular networks to relocate the target and discriminate the spoofed GPS positions in the base stations' coverage area. The work in [32] exploited the strength of signals received from base stations of a 2G cellular network to estimate the vehicle position. The estimated position is then used to cross-check the consistency of the vehicle's GPS position. The authors in [33] used the position estimates obtained from data relative to the neighboring cells in order to check the validity of the GPS position of a smartphone. Different from the aforementioned solutions, the work in [34] considered the use of information received from a 5G network to recognize spoofed UAV GPS positions reported to the UTM. The proposed approach utilizes the Received Signal Strength (RSS) collected from three base stations to infer the trust area within which the GPS position should be located in order to be considered genuine.

E. Machine Learning based Approaches
The capability of ML to learn from historical data and unveil hidden patterns has driven the recent trend of using ML techniques for GPS spoofing detection in UAV environments. In this vein, different ML methods have been proposed to detect GPS spoofing either by classifying the spoofed GPS signal directly (e.g., [35]-[40]) or by verifying the GPS information supplementally (e.g., [41]-[44]). The GPS signal classification methods take advantage of ML classifier techniques to discriminate the fake GPS signal from the actual GPS signal, while the GPS information verification methods make use of [...]
1) ML-based GPS signal classification: Manesh et al. [35] exploited the received GPS signal characteristics, such as pseudo range, Doppler shift and signal-to-noise ratio, to build a supervised neural network model for detecting GPS spoofing against unmanned aerial systems. Similarly, the spoofing detection approach in [36] used a neural network to analyze the differences between the spoofing and authentic signal patterns. The neural network takes as inputs the delta criterion, the coefficient of early and late phase criterion, and the total levels of signal. The authors in [37] proposed a GPS spoofing detection approach that leverages a supervised ML algorithm, specifically the Support Vector Machine (SVM), to analyze the cross-correlation among the GPS signals of multiple GPS receivers. The work in [38] introduced RBF SVM, AdaBoost, Decision Trees, K-Nearest Neighbors, and Random Forests methods for radio frequency interference analysis of GPS signals under the assumption that the attacker is not able to null the satellite signal. It is worth mentioning that space weather changes diminish the features of raw GPS signals [39] and that the attacker can broadcast jamming signals on the GPS frequency to blind the GPS receiver [40], which can considerably hinder the effectiveness of the aforementioned approaches.
2) ML-based GPS information verification: To overcome the weakness of spoofing detection methods based on GPS signal analysis, external positioning techniques, such as INS and MPS, are also used not only to provide location services for UAVs when the GPS is unavailable but also to cross-validate the authenticity of GPS information. Recently, ML techniques have been leveraged in both INS and MPS based GPS spoofing detection methods to improve the detection accuracy.
Motivated by the fact that a camera view, one of the INS components, endows a UAV with the ability to locate itself and recognize its surroundings, a number of studies have developed UAV camera view based ML methods to detect GPS spoofing. For instance, the authors in [41] used a hidden Markov model to model how human operators detect fake GPS locations based on the analysis of aerial photos taken by the UAV. Based on the PoseNet convolutional neural networks, the work in [42] used aerial images to provide the UAV's GPS position. In the same vein, deep convolutional neural networks were adopted in [43] to enable UAV navigation in GPS-denied environments using only visual information. Nevertheless, the acquisition, transmission and processing of camera views are time-consuming operations, which can introduce noticeable latency in detecting GPS spoofing.
As aforementioned, MPS has been used to locate and monitor mobile targets using at least three BSs or multiple antennas to acquire an accurate target position. However, some airspace areas may be covered by fewer than three BSs, or even by only one BS, which can limit the effectiveness of existing MPS based location tracking and spoofing detection solutions. Luckily, ML is a potential enabler for building MPS-based spoofing detection services that can perform well even in such worst cases. In [44], Xiao et al. investigated the potential of recurrent neural networks in recognizing the deviation in the UAV's trajectory caused by GPS spoofing. To this end, the DoA measurements of BSs' signals from the UAV equipment are analyzed by the proposed recurrent neural network models. Nevertheless, the adoption of their approach requires UAVs to be equipped with cylindrical antenna arrays and incurs more computational load on the UAVs to estimate the DoA values.
Table II summarizes the aforementioned spoofing detection methods based on the requirements of data source, computing location, hardware, energy, and UAV loads. Based on the difference in data sources, the spoofing detection methods can be divided into four categories, including GPS signal analysis methods, GPS message encryption methods, internal sensor data analysis methods, and cellular signal analysis methods. Both the signal analysis and the ML-based signal analysis methods require additional computing resources on the GPS receiver and rely on multiple GPS receivers providing either signal directions or a secure template, which increases the UAV energy consumption and brings extra loads on the UAV. The GPS message encryption methods execute encryption and decryption processes on the secure hardware of the GPS receiver, which consumes more energy but does not increase the UAV loads. The sensor data-based spoofing detection methods detect spoofing through the differences between the GPS location and the location derived from internal sensors, which needs support from the UAV on-board computer and different kinds of sensors. However, the internal sensor and ML-based GPS verification methods consume UAV energy for location computation and put more loads on the UAV for carrying sensors. Although the MPS-based spoofing detection methods use the cellular signals and are deployed on the edge servers without energy consumption and load requirements on the UAV, they require at least three BSs at the same time for triangular localization.
Despite the merits of the above-mentioned solutions, an effective approach to detect GPS spoofing while accommodating the resource, cost and environmental constraints of a UAV environment is still missing. To fill this gap, we leverage the potential of ML and exploit the statistical features of path losses between the UAV and 5G base stations (BSs) to devise an effective 5G-assisted GPS spoofing detection approach. The proposed approach needs no additional hardware, computation, or load at the UAV. Moreover, its effectiveness is less prone to changes in the environmental conditions, thanks to the stability introduced by the statistical features. Furthermore, we use a deep learning method to detect GPS spoofing with one single BS, and can also achieve cooperation between different BSs with the help of ensemble learning. By using the path losses that can be obtained from the BSs broadly and speedily, and taking advantage of the capability of ML to deliver faster decisions, the proposed approach will empower live detection of spoofed GPS positions.

III. SYSTEM MODEL

A. Network Model
As illustrated in Fig. 1, we consider a space-aerial-terrestrial integrated network consisting of a space GPS satellites network providing location service for UAVs, an aerial vehicle network including a target UAV u and an aerial GPS spoofer, and a terrestrial mobile network with N BSs connecting with aerial vehicles through wireless channels.
In the aerial network, the aerial GPS spoofer sends fake GPS signals to mislead the target $u$ into deviating from its planned trajectory. As shown in Fig. 1, if there is no GPS spoofing, $u$ will reach the waypoint $P_u$ of the planned trajectory, which starts from the waypoint $P_j$ and leads to the waypoint $P_{j+1}$. Once its GPS is spoofed, it will instead reach the waypoint $P'_u$ of the spoofed trajectory, which starts from $P_j$ and leads to the waypoint $P'_{j+1}$. The terrestrial mobile network is mainly used for GPS spoofing detection. The terrestrial BSs first collect the PL data during the communications between $u$ and these BSs. Three kinds of statistical methods, introduced in the next section, are then applied to process the PL data in order to reduce the negative impact of the environmental conditions on it. In addition, the edge servers deployed near the BSs provide computation resources in order to reduce the data transmission latency and improve spoofing detection effectiveness. Finally, the edge servers receive the processed PL data through the core network connecting these BSs, and further decide whether the GPS position is spoofed or not. The notations used in this paper are summarized in Table III for clarity.

TABLE III: NOTATIONS
Notation: Description
$\lambda / \bar{\lambda}$: The probability of a LoS/NLoS link
$L / \hat{L}$: The actual/theoretical path loss
$\Delta L$: The difference between actual and theoretical path loss
$L'$: The reference path loss in a free space
$d_{iu} / d'_{iu}$: The 2D/3D distance between the $i$-th BS and $u$
$h_u$: The altitude of $u$
$T$: The threshold of hypothesis testing
$E$: The number of data points in a time slot
$f_c$: The carrier frequency (2 GHz)
$\mathcal{F}$: The statistical features set
$F_{iu}$: The ensemble statistical features from one BS
$F_u$: The weighted sum of $F_{iu}$
$C$: The cumulative distribution function
$\mathcal{C}$: The set of MLP neural networks
$\omega$: The weight of a statistical feature
$\Omega$: The weight of a BS ensemble feature
$B$: The binary cross-entropy loss
$\gamma$: Fraction to create bootstrapped training data
$d_E$: The system GPS error tolerance
$\alpha$: The importance coefficient for miss detection
$M$: Ensemble ML methods
$D$: Data set of ML predictions
$\Theta$: Thresholds for ML ensemble

B. Path Loss Model
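The communication between an aerial UAV and a terrestrial BS contains both LoS and Non-LoS (NLoS) links. The following air-to-ground path loss model for UAV $u$ and BS $i$ has been tested by the 3GPP in [22], and is determined as
$$\hat{L}_{iu} = \lambda_{iu} L^{L}_{iu} + \bar{\lambda}_{iu} L^{N}_{iu}, \tag{1}$$
where $\hat{L}_{iu}$ is the theoretical path loss, and $L^{L}_{iu}$ and $L^{N}_{iu}$ denote the path loss of the LoS link and the NLoS link, respectively. $\lambda_{iu}$ is the probability of the LoS link and $\bar{\lambda}_{iu}$ is the probability of the NLoS link, and $\bar{\lambda}_{iu} = 1 - \lambda_{iu}$. For an urban micro UAV, the LoS probability $\lambda_{iu}$ is given by
$$\lambda_{iu} = \begin{cases} 1, & d_{iu} \le d_1, \\ \dfrac{d_1}{d_{iu}} + \exp\left(-\dfrac{d_{iu}}{q_1}\right)\left(1 - \dfrac{d_1}{d_{iu}}\right), & d_{iu} > d_1, \end{cases} \tag{2}$$
where $d_1 = \max(294.05 \log_{10}(h_u) - 432.94, 18)$ and $q_1 = 233.98 \log_{10}(h_u) - 0.95$. $d_{iu}$ is the horizontal distance between the $i$-th BS and $u$, and $h_u$ is the altitude of $u$, as shown in Fig. 2. Note that $h_u$ can be larger than, equal to, or smaller than the height of BS $i$.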
The path losses $L^{L}_{iu}$ and $L^{N}_{iu}$ are expressed as:
$$L^{L}_{iu} = \max\left\{L',\ 30.9 + \left(22.25 - 0.5 \log_{10}(h_u)\right) \log_{10}(d'_{iu}) + 20 \log_{10}(f_c)\right\} \tag{3}$$
and
$$L^{N}_{iu} = \max\left\{L^{L}_{iu},\ 32.4 + \left(43.2 - 7.6 \log_{10}(h_u)\right) \log_{10}(d'_{iu}) + 20 \log_{10}(f_c)\right\}, \tag{4}$$
where the carrier frequency $f_c$ is equal to 2 GHz. $L' = 15.3 + 37.6 \log_{10}(d'_{iu})$ is the path loss for a carrier frequency of 2 GHz in a free space, and $d'_{iu}$ is the distance between BS $i$ and $u$, as shown in Fig. 2. $P_u$ is the position of $u$ and $P_i$ is the position of the $i$-th BS. Alternatively, the path loss model above can be further extended into the Rician channel that includes both large-scale and small-scale fading, following the procedures in [22]. It is worth mentioning that the Rician fading impacts are insignificant thanks to the use of statistical methods on the path losses.
It is notable that there are three unknown parameters in (2), (3) and (4), i.e., the UAV height, the horizontal distance between the target UAV and each BS, and the distance between them. To obtain these three parameters, the edge server needs to know the locations of the UAV and the BS, which are obtained according to the Federal Aviation Administration (FAA) regulation [45]. Under such a regulation, the target UAV needs to broadcast its location for safety and security purposes [45]. The edge server can also request the location of each BS from the mobile network operator. Based on the locations of the UAV and the BS, the edge server can obtain these three parameters and also determine the theoretical path loss between the target UAV and the BS using the path loss model of (2), (3) and (4). Furthermore, the edge server can obtain the actual path loss based on the UAV transmit power and the received signal strength at the BS. This is because the edge server can control the UAV transmission power through an edge UAV flight controller [46] and can also get the received signal strength through the mobile network services defined by 3GPP [22].
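As a minimal sketch of how an edge server might evaluate the theoretical path loss from a reported UAV position and a known BS position, the Python functions below combine the LoS probability with the LoS and NLoS path losses. The function names, the (x, y, z) coordinate convention in metres, and the numerical constants of the LoS and NLoS branches are assumptions: the constants follow our reading of the 3GPP urban-micro aerial-vehicle model and should be checked against [22] before use. Only $d_1$, $q_1$ and $L'$ are taken directly from the text.

```python
import math

F_C_GHZ = 2.0  # carrier frequency stated in the text


def los_probability(d_2d: float, h_u: float) -> float:
    """LoS probability for horizontal distance d_2d (m) and UAV altitude h_u (m)."""
    d1 = max(294.05 * math.log10(h_u) - 432.94, 18.0)
    q1 = 233.98 * math.log10(h_u) - 0.95
    if d_2d <= d1:
        return 1.0
    return d1 / d_2d + math.exp(-d_2d / q1) * (1.0 - d1 / d_2d)


def theoretical_path_loss(p_u, p_i, f_c: float = F_C_GHZ) -> float:
    """Expected path loss (dB) between a UAV at p_u and a BS at p_i.

    Both positions are (x, y, z) tuples in metres; z of p_u is the altitude h_u.
    """
    h_u = p_u[2]
    d_2d = math.hypot(p_u[0] - p_i[0], p_u[1] - p_i[1])  # horizontal distance
    d_3d = math.dist(p_u, p_i)                           # UAV-to-BS distance
    l_ref = 15.3 + 37.6 * math.log10(d_3d)               # reference loss L'
    # Assumed LoS / NLoS branch constants (urban-micro aerial model).
    l_los = max(l_ref, 30.9 + (22.25 - 0.5 * math.log10(h_u)) * math.log10(d_3d)
                + 20.0 * math.log10(f_c))
    l_nlos = max(l_los, 32.4 + (43.2 - 7.6 * math.log10(h_u)) * math.log10(d_3d)
                 + 20.0 * math.log10(f_c))
    lam = los_probability(d_2d, h_u)
    return lam * l_los + (1.0 - lam) * l_nlos


if __name__ == "__main__":
    bs = (0.0, 0.0, 10.0)  # hypothetical BS position
    for x in (100.0, 500.0, 1000.0):
        print(x, round(theoretical_path_loss((x, 0.0, 100.0), bs), 2))
```

The sketch does not guard against a zero UAV-to-BS distance or altitudes outside the range for which the model was specified; a deployed implementation would need both checks.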

C. Hypothesis Testing
Based on the observation that different UAV positions generally result in different values of path loss between the BS and the UAV, UTM uses the difference between the actual path loss $L_{iu}$ and the theoretical one $\hat{L}_{iu}$ to decide whether the GPS position of UAV $u$ is spoofed or not. Here, the actual path loss and the theoretical path loss are determined according to Subsection III-B. Thus, we have
$$\Delta L_{iu} = \left| L_{iu} - \hat{L}_{iu} \right|, \tag{5}$$
where $\Delta L_{iu}$ denotes the absolute value of the difference between the actual path loss and the theoretical one. Notice that a genuine GPS position of a UAV corresponds to a theoretical path loss nearly the same as the actual one. Meanwhile, a spoofed GPS position corresponds to a theoretical path loss deviating from the actual one, which means that a bigger $\Delta L_{iu}$ indicates a higher probability that the GPS position of the UAV is spoofed. Therefore, the GPS spoofing detection problem can be formulated as a threshold-based hypothesis testing given by
$$\begin{cases} H_0: \Delta L_{iu} > T, \\ H_1: \Delta L_{iu} \le T, \end{cases} \tag{6}$$
where $T$ denotes the threshold of the hypothesis testing. The null hypothesis $H_0$ represents that the GPS position is spoofed. $H_0$ is accepted if $\Delta L_{iu}$ is above the threshold $T$. On the other hand, a true alternative hypothesis $H_1$ means that there is no GPS spoofing.
Although the distance between the UAV and the BS is the main factor affecting the path loss in the threshold-based hypothesis testing, other environmental factors (e.g., cloud, temperature and vapor) can also impact the path loss, which may lead to a wrong decision on the GPS spoofing. Accordingly, the threshold-based hypothesis testing for GPS spoofing detection faces the following significant challenges. Firstly, the path loss of a single data transmission is likely to be affected by the environment, which may result in increased spoofing detection errors. Secondly, the threshold value has a noticeable impact on the accuracy of the hypothesis testing at different time slots. Indeed, a bigger threshold value could lead to a higher probability of missed detection, while a smaller threshold value could result in a higher probability of false alarms. Thus, determining the appropriate threshold value is a crucial, yet difficult, task. Thirdly, the hypothesis testing results reported by different BSs should be given different weights to determine the final decision. The rationale behind assigning different weights is that a larger distance between a BS and a UAV could lead to a higher error in the hypothesis testing result.
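The threshold test can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the actual path loss is taken as the transmit power minus the received signal strength (antenna gains are ignored), and the function names and the numerical values in the usage lines are hypothetical rather than taken from the experiments.

```python
def actual_path_loss(tx_power_dbm: float, rss_dbm: float) -> float:
    """Actual path loss (dB), assuming no antenna gains: P_tx - RSS."""
    return tx_power_dbm - rss_dbm


def is_spoofed(actual_db: float, theoretical_db: float, threshold_db: float) -> bool:
    """Accept H0 (spoofed) when the absolute path loss difference exceeds the threshold."""
    return abs(actual_db - theoretical_db) > threshold_db


if __name__ == "__main__":
    theoretical = 95.0                      # from the path loss model (hypothetical value)
    actual = actual_path_loss(23.0, -87.0)  # 110 dB
    # A small threshold flags the position; a large one misses the same deviation.
    print(is_spoofed(actual, theoretical, threshold_db=6.0))   # True
    print(is_spoofed(actual, theoretical, threshold_db=20.0))  # False
```

The two calls in the usage lines show the trade-off described above: the same 15 dB deviation is detected with a 6 dB threshold and missed with a 20 dB one.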
To address the aforementioned challenges, we leverage the potential of both statistical methods and ML to devise an effective GPS spoofing detection approach. Three statistical methods are used to extract the statistical properties of the path losses of multiple data transmissions, which removes the effects caused by changing environmental conditions. To deal with the threshold and weight setting issues, we formulate them as a constrained optimization problem and propose an MLP-based ensemble approach to solve it.

IV. PROBLEM FORMULATION
In this section, we first introduce three typical statistical methods to analyze the PL data and eliminate the negative effect of the environment on the spoofing detection. Based on these PL data, we further formulate the spoofing detection as a constrained optimization problem.

A. Statistical Methods
The following three statistical methods are provided to calculate the statistical metrics of path losses of multiple data transmissions.
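1) Mean-Variance-Skewness-Kurtosis (MVSK): MVSK brings important numerical characteristics into the analysis of a path loss, including the Mean, the second central moment Variance, the third standardized moment Skewness, and the fourth standardized moment Kurtosis [47]. To obtain the MVSK metrics of path losses, we assume that the number of data transmissions that occur in a time interval $t$ is $E$. The corresponding actual path loss set $L_{iu}(t)$ is expressed as:
$$L_{iu}(t) = \left\{ L^{1}_{iu}(t), L^{2}_{iu}(t), \ldots, L^{E}_{iu}(t) \right\}, \tag{7}$$
where $L^{e}_{iu}(t)$ denotes the $e$-th actual path loss in the time interval $t$. Similarly, the theoretical path loss set $\hat{L}_{iu}(t)$ is defined as:
$$\hat{L}_{iu}(t) = \left\{ \hat{L}^{1}_{iu}(t), \hat{L}^{2}_{iu}(t), \ldots, \hat{L}^{E}_{iu}(t) \right\}, \tag{8}$$
where $\hat{L}^{e}_{iu}(t)$ denotes the $e$-th theoretical path loss in the time interval $t$. According to the MVSK method, we have
$$\begin{aligned} L^{M}_{iu}(t) &= \frac{1}{E} \sum_{e=1}^{E} L^{e}_{iu}(t), \\ L^{V}_{iu}(t) &= \frac{1}{E} \sum_{e=1}^{E} \left( L^{e}_{iu}(t) - L^{M}_{iu}(t) \right)^{2}, \\ L^{S}_{iu}(t) &= \frac{1}{E} \sum_{e=1}^{E} \left( \frac{L^{e}_{iu}(t) - L^{M}_{iu}(t)}{\sqrt{L^{V}_{iu}(t)}} \right)^{3}, \\ L^{K}_{iu}(t) &= \frac{1}{E} \sum_{e=1}^{E} \left( \frac{L^{e}_{iu}(t) - L^{M}_{iu}(t)}{\sqrt{L^{V}_{iu}(t)}} \right)^{4}, \end{aligned} \tag{9}$$
where $L^{M}_{iu}(t)$, $L^{V}_{iu}(t)$, $L^{S}_{iu}(t)$ and $L^{K}_{iu}(t)$ represent, respectively, the mean, variance, skewness, and kurtosis values of the actual path losses in time interval $t$. Similarly, we can obtain the theoretical ones $\hat{L}^{M}_{iu}(t)$, $\hat{L}^{V}_{iu}(t)$, $\hat{L}^{S}_{iu}(t)$ and $\hat{L}^{K}_{iu}(t)$. The difference $\Delta L^{x}_{iu}(t)$ between the actual and theoretical MVSK metrics is expressed as
$$\Delta L^{x}_{iu}(t) = \left| L^{x}_{iu}(t) - \hat{L}^{x}_{iu}(t) \right|, \tag{10}$$
where $x \in \{M, V, S, K\}$. For each difference $\Delta L^{x}_{iu}(t)$, the threshold introduced in the hypothesis testing in (6) is denoted as $T^{x}_{iu}(t)$.
2) BOX: BOX is a method of descriptive statistics including the minimum (Q0), the first quartile (Q1), the sample median (Q2), the third quartile (Q3) and the maximum (Q4) of the path loss set $L_{iu}(t)$ [48]. Similar to (10), we can obtain $\Delta L^{x}_{iu}(t)$, the difference between the actual and theoretical values of quartile $x$, where $x \in \{Q0, Q1, Q2, Q3, Q4\}$. The corresponding threshold is denoted as $T^{x}_{iu}(t)$ in the hypothesis testing for GPS spoofing detection.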
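The MVSK and BOX features of one time interval can be sketched in Python with the standard library alone. The function names are illustrative, and two choices are assumptions rather than details confirmed by the text: the moments are population moments (division by $E$), and the quartiles use the inclusive interpolation of `statistics.quantiles`.

```python
import math
import statistics


def mvsk(samples):
    """Mean, variance, skewness and kurtosis of a path loss set (population moments)."""
    e = len(samples)
    mean = sum(samples) / e
    var = sum((s - mean) ** 2 for s in samples) / e
    if var == 0.0:
        # Constant samples: standardized moments are undefined, report 0 by convention.
        return {"M": mean, "V": 0.0, "S": 0.0, "K": 0.0}
    std = math.sqrt(var)
    skew = sum(((s - mean) / std) ** 3 for s in samples) / e
    kurt = sum(((s - mean) / std) ** 4 for s in samples) / e
    return {"M": mean, "V": var, "S": skew, "K": kurt}


def box(samples):
    """Minimum, quartiles and maximum of a path loss set."""
    q1, q2, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {"Q0": min(samples), "Q1": q1, "Q2": q2, "Q3": q3, "Q4": max(samples)}


def feature_differences(actual, theoretical):
    """Absolute differences between actual and theoretical MVSK and BOX features."""
    a = {**mvsk(actual), **box(actual)}
    t = {**mvsk(theoretical), **box(theoretical)}
    return {name: abs(a[name] - t[name]) for name in a}


if __name__ == "__main__":
    actual = [100.0, 102.0, 104.0, 106.0]      # hypothetical measured path losses (dB)
    theoretical = [x + 5.0 for x in actual]    # hypothetical model outputs (dB)
    print(feature_differences(actual, theoretical))
```

In the usage lines, a constant 5 dB offset between the two sets shows up in the mean and in every quartile, while the variance, skewness and kurtosis differences stay at zero, which illustrates why the features carry complementary information.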
3) Wasserstein Distance (WD): WD is a metric to estimate the distance between two probability distributions [49]. We use $\Delta L^{W}_{iu}(t)$ to denote the WD between the actual path loss set $L_{iu}(t)$ and the theoretical one $\hat{L}_{iu}(t)$. Then, we have
$$\Delta L^{W}_{iu}(t) = \int_{0}^{1} \left| C^{-1}_{L_{iu}(t)}(z) - C^{-1}_{\hat{L}_{iu}(t)}(z) \right| dz, \tag{11}$$
where $C^{-1}_{L_{iu}(t)}$ and $C^{-1}_{\hat{L}_{iu}(t)}$ are the inverse cumulative distribution functions (CDFs) of $L_{iu}(t)$ and $\hat{L}_{iu}(t)$, respectively. The WD $\Delta L^{W}_{iu}(t)$ can also be used to detect GPS spoofing according to the hypothesis testing with the threshold $T^{W}_{iu}(t)$.
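For two empirical path loss sets with the same number of samples, the integral over the inverse CDFs reduces to the mean absolute difference between the sorted samples. The following Python sketch relies on that identity; it assumes the first-order Wasserstein distance and equal set sizes, and the function name is illustrative.

```python
def wasserstein_1d(actual, theoretical):
    """First-order Wasserstein distance between two equal-size empirical sample sets."""
    if len(actual) != len(theoretical) or not actual:
        raise ValueError("both sets must be non-empty and of equal size")
    a, b = sorted(actual), sorted(theoretical)
    # Sorted samples are the empirical inverse CDFs evaluated on a common grid.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    print(wasserstein_1d([3.0, 1.0, 2.0], [2.0, 3.0, 4.0]))  # 1.0
```

Because the samples are sorted before comparison, the result depends only on the two distributions and not on the order in which the path losses were recorded.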

B. Optimization Problem
We use F iu (t) to denote the weighted statistical feature that represents the authenticity of the PL data provided by the i th BS at time t.Since the PL differences under different statistical methods have different influences on the spoofing hypothesis testing results, we can define F iu (t) as the weighted sum of PL differences ∆L x iu (t), which is expressed as where ω x iu is the weight for the statistical feature x and F = {M, V, S, K, Q0, Q1, Q2, Q3, Q4, W } is the set of statistical features.
At time $t$, $u$ broadcasts signals to its nearby $n$ BSs located at different positions. We know that a larger distance between $u$ and a BS can lead to a larger error in the PL value [34], which in turn incurs a larger error in the ensemble statistical features. We use $F_u(t)$ to denote the ensemble feature of the $n$ BSs at time $t$. To reduce the ensemble error, we define $F_u(t)$ as the weighted sum of $F_{iu}(t)$; i.e., $F_u(t) = \sum_{i=1}^{n} \Omega_{iu} \, F_{iu}(t)$, where $\Omega_{iu}$ represents the weight of the $i$th BS ensemble feature.
The threshold-based hypothesis testing given in (6) can be rewritten by comparing the ensemble feature $F_u(t)$ against $T_u(t)$, the threshold of the hypothesis testing.
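As a small illustration of the two weighted sums and the threshold test, consider the following sketch. The weights and the threshold are placeholder values chosen for the example, and the rule "declare spoofing when the ensemble feature exceeds the threshold" is an assumption about the direction of the test; the function names are hypothetical.

```python
def bs_feature(diffs, weights):
    """Weighted sum of the statistical differences reported by one BS."""
    return sum(weights[x] * diffs[x] for x in diffs)


def ensemble_feature(bs_features, bs_weights):
    """Weighted sum of the per-BS features over the n BSs."""
    return sum(w * f for w, f in zip(bs_weights, bs_features))


def is_spoofed(f_u, threshold):
    """Assumed decision rule: spoofing if the ensemble feature exceeds the threshold."""
    return f_u > threshold


# Example with two features per BS and two BSs (all numbers are illustrative).
f_1 = bs_feature({"M": 2.0, "W": 4.0}, {"M": 0.5, "W": 0.25})
f_2 = bs_feature({"M": 8.0, "W": 8.0}, {"M": 0.5, "W": 0.25})
f_u = ensemble_feature([f_1, f_2], [0.75, 0.25])
```

In the proposed approach these weights are not set by hand; they are learned, which is what motivates the optimization problem formulated next.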
To ensure the spoofing detection performance at UTM, our objective is to minimize the total error of the hypothesis testing. This is formulated as the optimization problem in Eq. (15), subject to constraints (15a)-(15d), where $\phi(\cdot)$ is an indicator function of $\omega$, $\Omega$ and $T$ that measures the error of the hypothesis testing. Here, $\omega$ and $\Omega$ collect the weights $\omega^{x}_{iu}$ and $\Omega_{iu}$ over the BSs $i \in [1, n]$, and $T = T_u(t)$. We set $\phi(\cdot) = \alpha$ to measure a missed detection, i.e., if $H_0$ is true but UTM approves $H_1$, and $\phi(\cdot) = \beta$ to measure a false alarm, i.e., if $H_1$ is true but UTM approves $H_0$. Otherwise, $\phi(\cdot) = 0$. Here, $\alpha > \beta$, which indicates that a missed detection is much more harmful than a false alarm because the former can cause violations of no-fly zone regulations and collision risks. Meanwhile, $\alpha + \beta = 1$, which keeps the objective function of Eq. (15) within a limited range. Constraints (15a) and (15b) denote the range of the weights $\omega^{x}_{iu}$ and $\Omega_{iu}$, respectively. Constraint (15c) denotes the range of the number of BSs, where $N = 9$ according to the 3GPP trial results in [22]. Constraint (15d) denotes that $\phi(\cdot)$ is equal to 0, $\alpha$ or $\beta$.
Note that this is a nonconvex and nonlinear optimization problem, which is generally challenging to solve. ML, particularly deep learning, is a promising direction for dealing with this problem, owing to its ability to solve complex problems while achieving the desired balance between performance and complexity. Thus, we propose an MLP-based ensemble approach to solve the optimization problem in the following section.

A. MLP Model
As shown in Fig. 3, the MLP is a deep learning neural network consisting of a set of interconnected nodes arranged into multiple layers; i.e., an input layer, an arbitrary number of hidden layers and an output layer. Except for the input layer nodes, each node is called a neuron and propagates an output to the next layer through a nonlinear activation function applied to the weighted sum of the received inputs plus a bias factor. In this formulation, the input element $\Delta L^{x}_{iu}(t)$ is the path loss statistical metric $x$ reported by the $i$th BS, and $y(t)$ denotes the output prediction result at time $t$. Here, $y(t) \in [0, 1]$ denotes the probability that the position is spoofed, where a value close to 1 means that the GPS position is likely to be spoofed. $f(\cdot)$ is the nonlinear activation function, where the Rectified Linear Unit (ReLU) is set as the default activation function. $\omega^{x}_{iu}$ stands for the neuron weights that store the relationship among different statistical metrics, and $T^{x}$ corresponds to the threshold $T_u(t)$ of the hypothesis testing. The prediction result of each BS will be the input of the ML-based ensemble method described in the next section.
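The neuron computation described above can be sketched in a few lines of plain Python. This is only an illustration: the sigmoid on the output neuron is an assumption made here so that the output lies in [0, 1], the weights are arbitrary example values, and the function names are hypothetical.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: activation(weighted sum of the inputs + bias) for every neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def mlp_forward(features, layers):
    """Propagate the statistical differences through all layers; return y(t)."""
    outputs = features
    for weights, biases, activation in layers:
        outputs = dense(outputs, weights, biases, activation)
    return outputs[0]


# Tiny example: 2 inputs, one hidden layer with 2 ReLU neurons, 1 output neuron.
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 2.0]], [-3.0], sigmoid),
]
y = mlp_forward([1.0, 2.0], example_layers)
```

In this example the hidden layer outputs are 0 and 1.5, so the output neuron receives a pre-activation of 0 and returns a spoofing probability of 0.5.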

B. Structure of Multi-MLP Ensemble Approach
We propose an MEC architecture that uses deep ensemble learning to aggregate the individual prediction results of the different BSs for GPS spoofing detection. As shown in Fig. 4, this approach consists of four main processes, namely data sampling, statistical analysis, individual prediction and final prediction, in addition to two extra processes for deploying and consuming the spoofing detection service.
In the sampling phase, the actual path loss data and the corresponding theoretical path losses are grouped into different groups of $E$ samples. Note that the theoretical path losses are computed based on the reported UAV position $(Lat_u, Lon_u, Alt_u)$ and the BS position $(Lat_i, Lon_i, Alt_i)$. A statistical analysis is then performed using the MVSK, BOX and WD methods to extract the statistical characteristics of the actual and theoretical path loss data. Using the differences between the actual and theoretical statistical metrics as inputs, the trained MLP models separately predict whether the GPS position is spoofed or not. Finally, the MLPs' prediction results are integrated to provide the final decision on the presence or absence of a spoofing attack based on a preset threshold (see Section V-D1) or ML methods (see Section V-D2). Specifically, the MEC servers are in charge of path loss data collection, statistical analysis, and MLP predictions, while the ensemble learning is conducted on the edge cloud server.
In practice, the spoofing detection service deployment and consumption work in the background to support high prediction performance for the spoofing detection. The UTM service (UTMS) can deploy spoofing detection services to an edge cloud server through the core and transport networks, and also inform remote operators when the GPS is spoofed through the Operator Command and Control Service (OCCS). The Supplementary Data Provider Service (SDPS) provides meteorological data and other information to the remote operator and UTMS for UAV flight planning and management [4].
In addition to the MLP-based ensemble approach, the Long Short-Term Memory (LSTM) neural network can alternatively be used in the ensemble architecture. The LSTM-based ensemble architecture allows the extraction of temporal features from path losses and can further improve the detection performance. Compared with the MLP-based ensemble architecture, the LSTM-based ensemble approach needs more historical path loss data to train a new model with a spoofing pattern. Moreover, the inherent mobility of UAVs introduces high dynamics in the path loss data, which causes too much noise in the long-time path loss spoofing feature and could decrease the detection performance. Although the use of statistical methods can remove the environmental impacts, the LSTM-based ensemble approach needs to add an independent LSTM model after each statistical feature, which increases the size of the LSTM-based ensemble architecture and the storage usage in the MEC server. In terms of time consumption and storage usage, the MLP-based ensemble approach is more efficient than the LSTM-based approach because the MLP uses a small neural network to learn the recurrent spoofing behaviors from the path losses.

C. Statistical-MLP Approach
To eliminate PL data fluctuations caused by environmental changes, we propose a statistical-MLP approach for GPS spoofing detection at the BS level. In this approach, an MLP model is built at each BS by training the MLP neural network on a dataset comprising the differences of statistical metrics collected locally. The training is carried out using the stochastic gradient descent method to optimize the MLP neural network parameters (i.e., weights and biases) so as to minimize the loss function. In this study, we adopt the Adaptive Moment Estimation (Adam) optimizer [50]. Moreover, the binary cross-entropy loss function is used to assess the performance of the MLP model, which is expressed as $B = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]$, where $B$ is the binary cross-entropy loss, $m$ is the number of observations, $\hat{y}_i$ is the $i$th predicted observation value, and $y_i$ is the $i$th real observation value. Note that a smaller $B$ corresponds to a better MLP model.
Algorithm 1 summarizes the full process of the proposed statistical-MLP approach, including data sampling, data processing, data labeling, and MLP training and selection. We use the symbol $\|\cdot\|$ to denote the length of a set, the symbol $|\cdot|$ to represent the absolute value of a number, and the symbol $\lceil\cdot\rceil$ to denote the rounding up of a number to its nearest integer. The data sampling procedure produces the grouped data. Initially, the waypoints reported by a UAV and the position of the BS it connects to are used to calculate the theoretical PLs (line 2). Thereafter, the PL differences $\Delta L$ between the actual PLs reported by the BS and the theoretical PLs are computed (line 3). We assume that the planned waypoints $P$, the reported waypoints $R$ and the reported PLs are synchronous and have the same length. The PL differences are divided into equal-sized groups, each of which consists of $E$ values of PL differences (line 4). To ensure that the size of the last group is also $E$, the indices of its elements run from $(\|\Delta L\| - E)$ to $\|\Delta L\|$. In addition, the distances between the planned waypoints and the reported waypoints are calculated and grouped in order to decide whether the reported GPS position is spoofed or not (lines 5-6). The data processing procedure applies the statistical methods defined in Subsection IV-A to each group of PL differences to alleviate the environmental impacts and produce the data points for training and testing the MLP models (lines 10-18). The data labeling procedure assigns a label $y$ (1/0) to each data point $f$ according to the relationship between the tolerated GPS error $dE$ and the distance between the reported and planned positions (lines 21-31). A data point is labeled as GPS spoofing (i.e., $y = 1$) if the average distance of the corresponding group is greater than the tolerated GPS error. Otherwise, it is labeled as a legitimate GPS error (i.e., $y = 0$).
It is well known that the performance of a deep neural network model is influenced by the tuning of its hyperparameters (e.g., the number of layers, the number of neurons in each layer, the batch size, and the learning rate) [51]. To find the most accurate MLP model, the training and selection procedure (lines 34-46) applies a grid search method [52] to determine the hyperparameter values that lead to the smallest loss. The search space for finding the optimal hyperparameters for the MLP model is defined in IV. For each combination of hyperparameters, the corresponding MLP model is built on the training data set and its performance is evaluated on the unseen testing data set. The MLP model with the smallest loss $B$ is used by the BS for detecting GPS spoofing.
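The grouping and labeling steps described above can be sketched as follows. This is a simplified illustration rather than Algorithm 1 itself: the function names are hypothetical, and the distances are assumed to be precomputed in meters.

```python
import math


def group_indices(total, group_size):
    """Split `total` samples into ceil(total / E) groups of exactly E indices.

    The last group is shifted back so that it spans (total - E, total),
    which keeps its size equal to E as described in the text.
    """
    if total < group_size:
        raise ValueError("need at least E samples")
    groups = []
    for g in range(math.ceil(total / group_size)):
        start, end = g * group_size, (g + 1) * group_size
        if end > total:
            start, end = total - group_size, total
        groups.append((start, end))
    return groups


def label_groups(distances, group_size, tolerated_error):
    """Label a group 1 (spoofed) if its average planned-vs-reported distance
    exceeds the tolerated GPS error dE, and 0 (legitimate) otherwise."""
    labels = []
    for start, end in group_indices(len(distances), group_size):
        average = sum(distances[start:end]) / group_size
        labels.append(1 if average > tolerated_error else 0)
    return labels
```

With 10 samples and E = 4, the groups are (0, 4), (4, 8) and (6, 10), so the last group overlaps the previous one by two samples instead of being shorter.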
Algorithm 1 Statistical-MLP Algorithm. Input: P: Planned way points. Return: c ← C_k.

An important consideration for assessing the practicality of the proposed approach is its time complexity. To this end, we theoretically analyze the time complexity for building the statistical-MLP model. Theorem 1 provides the upper bound of the statistical-MLP algorithm's asymptotic time complexity.
Theorem 1. The time complexity of building the statistical-MLP algorithm is $O(5 \times \|P\| + \|L\| \times (\|\mathcal{F}\| + 1) + (\gamma \times (\nabla_{MLP} - 1) + 1) \times \|L\| \times \|C\| + 1)$, where $\|P\|$ is the number of way points in $P$, $\|L\|$ is the number of PL data groups, $\|\mathcal{F}\|$ is the number of statistical methods, $\gamma$ is the fraction used to create the bootstrapped training data, $\nabla_{MLP}$ denotes the time complexity of training each MLP and $\|C\|$ is the number of MLPs.
Proof. The time complexity of the statistical-MLP algorithm is the sum of the computational complexities of path loss data sampling, statistical analysis and labeling, and MLP model training and selection. Thus, we need to calculate the time complexity of each phase.
The time complexity of path loss data sampling is $O(5 \times \|P\|)$, which includes the time needed for computing the theoretical PLs, calculating and grouping the differences between the theoretical and actual PLs, and finally calculating and grouping the distances between the planned and reported way points. The time complexity of statistical analysis and labeling is $O(\|L\| \times (\|\mathcal{F}\| + 1))$, which equals the sum of the time complexity of computing the statistical features of each group and that of labeling each group. Let $I$, $q$, and $h$ denote, respectively, the number of training iterations, the number of hidden layers, and the number of neurons in each hidden layer. The time complexity of MLP model training and selection is $O((\gamma \times (\nabla_{MLP} - 1) + 1) \times \|L\| \times \|C\| + 1)$, where $\nabla_{MLP} \approx O(I \times \|\mathcal{F}\| \times q^{h})$ according to [53]. Here, the time complexity of the training process is $O(\|L\| \times \gamma \times \nabla_{MLP} \times \|C\|)$, and that of the selection process is $O(\|L\| \times (1 - \gamma) \times \|C\| + 1)$. Summing the three phases gives the bound stated in Theorem 1.

D. Multi-MLP Ensemble Approach
To provide the final decision on the presence or absence of GPS spoofing with high accuracy, we propose a multi-MLP ensemble approach for integrating the individual prediction results issued by the MLP models deployed at the BSs. To this end, we consider two strategies for aggregating the individual predictions, namely: (i) a threshold-based weighted aggregation, where the final prediction is produced by comparing the weighted combination of individual predictions to a preset threshold; and (ii) an ML-based aggregation, where the final decision is taken by combining the individual predictions through a meta-learner built using different ML techniques.
1) Threshold-based Multi-MLP Ensemble Approach: This approach aims at finding the best threshold for solving the optimization problem defined in Eq. (15) based on a weighted aggregation of individual predictions. As described in Algorithm 2, the threshold-based multi-MLP ensemble approach comprises three sequential procedures; i.e., the data weighting procedure, the threshold evaluation procedure, and the threshold selection procedure. The data weighting procedure (lines 4-14) assigns a weight to the prediction of each individual MLP model in the ensemble. The weight is proportional to the prediction confidence score of GPS spoofing; that is, the highest weight is assigned to the individual prediction with the highest confidence score. The weights are normalized so that their sum is equal to 1. The ensemble weighted prediction $D_i$ is defined as the sum of the weighted predictions of the individual MLP models (line 10). The threshold evaluation procedure (lines 17-40) uses a set of thresholds and the performance metrics (i.e., missed detections and false alarms) defined in (15) to evaluate the ensemble weighted predictions. For each threshold value $\theta$, the procedure first tests whether or not a GPS position is spoofed through the hypothesis testing in Eq. (14), by comparing the ensemble weighted prediction to the preset threshold. If the ensemble weighted prediction is above the threshold, the spoofing hypothesis (i.e., the null hypothesis $H_0$) is accepted and the label 1 is assigned to the test data. Otherwise, the spoofing hypothesis is rejected and the label 0 is assigned to the test data. The testing labels $\hat{Y}$ are then compared to the actual data labels $Y$ to assess the detection performance of the multi-MLP ensemble model under the preset threshold. The detection performance is evaluated with respect to the number of false negatives (FNs) and false positives (FPs). FNs refer to missed detections (i.e., $\hat{y}_i = 0$) when the actual GPS positions are spoofed (i.e., $y_i = 1$), while FPs represent false alarms (i.e., $\hat{y}_i = 1$) when the actual GPS positions are legitimate (i.e., $y_i = 0$). The threshold selection procedure (lines 43-44) uses the detection performance metrics to determine the best threshold. According to Eq. (15), the best threshold is the one that minimizes the total error, i.e., the weighted combination of FNs and FPs. In the following Theorem 2, we give the time complexity of the threshold-based multi-MLP ensemble algorithm.

Algorithm 2 Threshold-based multi-MLP ensemble algorithm.
Input:
D: Data set. Each data point in D is a vector of size (n + 1) that includes n BSs' MLP predictions and one label, which is set to 1 if spoofing and 0 if normal.
Θ: The thresholds set, Θ = {0.00, 0.01, 0.02, ..., 0.99};
α: The importance coefficient for missed detection.
Output:
T: The final ensemble threshold for the given data set D.
5: for each X_i ∈ X do
6: W: The weights for X_i, W ← {};
7: for each x_j ∈ X_i do
...
30: for each (ŷ_i in Ŷ) and (y_i in Y) do
31: if (ŷ_i ≠ y_i) and (y_i = 1) then
32: FN ← FN + 1;
33: else if (ŷ_i ≠ y_i) and (y_i = 0) then
34: FP ← FP + 1;
35: end if
36: end for
37: end for
...
40: return R
41: end procedure
42: procedure THRESHOLDSELECTION(R, Θ)
43: Get the min value index k, k = index(min(R));
44: return T ← θ_k.
45: end procedure
Theorem 2. The time complexity of building the threshold-based multi-MLP ensemble algorithm is $O((n + \|\Theta\| \times 2) \times \|D\| + 1)$. Here, $n$ denotes the number of BSs, $\|D\|$ is the number of prediction results in $D$ obtained by Algorithm 1, and $\|\Theta\|$ represents the length of the preset thresholds set.
Proof. The time complexity of the threshold-based multi-MLP ensemble algorithm corresponds to the sum of the time complexity of computing the ensemble weighted predictions and that of the threshold evaluation and selection. The time complexity of calculating the ensemble weighted predictions is $O(n \times \|X\|)$. The time complexity of the threshold evaluation includes the time required for the hypothesis testing, $\|\Theta\| \times \|D\|$, and the time needed for assessing the detection performance, $\|\Theta\| \times \|Y\|$. The threshold selection incurs $O(1)$ time complexity. As $\|X\| = \|Y\| = \|D\|$, we obtain that the time complexity of building the threshold-based multi-MLP ensemble algorithm is $O((n + \|\Theta\| \times 2) \times \|D\| + 1)$.
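The three procedures of the threshold-based approach can be sketched as follows. This is a simplified reading, not the authors' listing: each MLP output is used as its own confidence score, the total error is taken as the per-sample average of α per missed detection and β = 1 − α per false alarm, and the function names are hypothetical.

```python
def weighted_prediction(predictions):
    """Weight each BS prediction by its normalized confidence and sum."""
    total = sum(predictions)
    if total == 0:
        return 0.0
    weights = [p / total for p in predictions]  # weights sum to 1
    return sum(w * p for w, p in zip(weights, predictions))


def evaluate_thresholds(data, thresholds, alpha=0.9):
    """Return the total error R for every candidate threshold."""
    beta = 1.0 - alpha
    scored = [(weighted_prediction(row[:-1]), row[-1]) for row in data]
    errors = []
    for theta in thresholds:
        fn = sum(1 for s, y in scored if y == 1 and not s > theta)  # missed
        fp = sum(1 for s, y in scored if y == 0 and s > theta)      # false alarm
        errors.append((alpha * fn + beta * fp) / len(scored))
    return errors


def select_threshold(data, thresholds, alpha=0.9):
    """Pick the first threshold that attains the smallest total error."""
    errors = evaluate_thresholds(data, thresholds, alpha)
    k = errors.index(min(errors))
    return thresholds[k], errors[k]


# Each row holds the predictions of n = 2 BSs followed by the label.
example_data = [
    [0.9, 0.8, 1],
    [0.2, 0.1, 0],
    [0.7, 0.9, 1],
    [0.3, 0.2, 0],
]
example_thresholds = [k / 100 for k in range(100)]
best_theta, best_error = select_threshold(example_data, example_thresholds)
```

Because α is larger than β, a threshold that misses a spoofed position is penalized more heavily than one that raises a false alarm, which pushes the selected threshold toward lower values.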
2) ML-based Multi-MLP Ensemble Approach: Finding the best decision threshold to achieve the desired GPS spoofing detection performance is a challenging task, which depends on various factors, including the number of BSs, the training data set and the FN-FP trade-off. Thus, an intelligent solution that automatically learns the decision threshold is vital. To this end, we propose an ML-based multi-MLP ensemble approach for integrating the individual prediction results of the MLP models using six different ML methods, namely MLP, Decision Trees (DT), Logistic Regression (LR), Support Vector Machine Kernel (SVMK), Support Vector Machine Gamma (SVMG), and Gaussian Naive Bayes (GNB) [53]-[57]. Specifically, the individual prediction results output by the multiple MLPs are the inputs of these ML methods, and the trained model with the smallest binary cross-entropy loss $B$ is adopted as the predicting model to decide whether the GPS position provided by the UAV is spoofed or not.
The proposed ML-based multi-MLP ensemble approach is summarized in Algorithm 3. In the following Theorem 3, we give the time complexity of the ensemble algorithm.

Algorithm 3 (recoverable steps):
M: The set of ML meta-models, M ← {}
6: for each M_i ∈ M do
7: Utilize (X, Y) to train M_i;
8: end for
10: return M
11: end procedure
12: procedure MODELEVALUATION((X, Y), M)
13: B: The set of the loss values, B ← {}
14: for each M_i in M do
15: Use (X, Y) to evaluate M_i and obtain the average loss B_i;
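The train-then-select pattern of the ML-based approach can be sketched with two toy meta-models in plain Python. This is an illustration only: the paper's six meta-models are not reproduced, `LogisticMeta` and `MeanMeta` are hypothetical stand-ins, and the data are made up.

```python
import math


def bce(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy loss B."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


class MeanMeta:
    """Baseline meta-model: average of the BS predictions (nothing to train)."""

    def fit(self, X, y):
        return self

    def predict_proba(self, X):
        return [sum(row) / len(row) for row in X]


class LogisticMeta:
    """Logistic-regression meta-model trained by batch gradient descent."""

    def __init__(self, lr=0.5, epochs=500):
        self.lr, self.epochs = lr, epochs

    def _prob(self, row):
        z = self.b + sum(w * x for w, x in zip(self.w, row))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        self.w, self.b = [0.0] * len(X[0]), 0.0
        for _ in range(self.epochs):
            grad_w, grad_b = [0.0] * len(self.w), 0.0
            for row, label in zip(X, y):
                err = self._prob(row) - label
                grad_w = [g + err * x for g, x in zip(grad_w, row)]
                grad_b += err
            self.w = [w - self.lr * g / len(X) for w, g in zip(self.w, grad_w)]
            self.b -= self.lr * grad_b / len(X)
        return self

    def predict_proba(self, X):
        return [self._prob(row) for row in X]


def select_meta_model(models, train, test):
    """Train every candidate and keep the one with the smallest loss B."""
    losses = {}
    for name, model in models.items():
        model.fit(*train)
        losses[name] = bce(test[1], model.predict_proba(test[0]))
    return min(losses, key=losses.get), losses


X_train = [[0.9, 0.8], [0.8, 0.95], [0.1, 0.2], [0.2, 0.05], [0.7, 0.9], [0.3, 0.1]]
y_train = [1, 1, 0, 0, 1, 0]
X_test, y_test = [[0.85, 0.9], [0.15, 0.1]], [1, 0]
candidates = {"mean": MeanMeta(), "logistic": LogisticMeta()}
best_name, all_losses = select_meta_model(candidates, (X_train, y_train), (X_test, y_test))
```

Each row of `X_train` plays the role of one data point holding the predictions of n = 2 BSs, so the meta-model learns how to combine them instead of relying on a hand-picked threshold.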

VI. PERFORMANCE EVALUATION
In this section, we evaluate the performance of the statistical-MLP, threshold-based multi-MLP and ML-based multi-MLP ensemble algorithms. We also compare the performance of the threshold-based and ML-based multi-MLP ensemble algorithms.

A. Simulation Settings
We develop a simulator using Python 3.6 and TensorFlow 2.1 to evaluate the performance of the proposed algorithms. In this simulator, Python is used to set up the simulation platform and TensorFlow is applied to build the ML models. We consider nine BSs evenly distributed in an area of 300 × 300 m², where the distance between any two adjacent BSs is 150 m. The height of the BSs is set to 35 m. The flight height of both the target UAV $u$ and the aerial GPS spoofer is set to 150 m. Initially, $u$ is located above the center of the area, and there are a total of 12 potential trajectories (i.e., flight paths) toward different directions. A polling strategy is adopted to choose two alternative paths from the twelve preset paths; one of them is used as the planned trajectory while the other one is taken as the real-time trajectory, which can be spoofed or not. Meanwhile, we apply the path loss models defined in [22] to generate the training and evaluation data from the BSs to $u$. The channel frequency is set to 2.0 GHz.
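The geometry of this setup can be reproduced with a short sketch. Two assumptions are made here: the nine BSs are placed on a 3 × 3 grid with coordinates in {0, 150, 300} m, and the free-space path loss formula is used purely as a stand-in for the path loss models of [22], which are not reproduced. All function names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def bs_grid(spacing=150.0, per_side=3, height=35.0):
    """Nine BS positions (x, y, z) on an assumed 3 x 3 grid."""
    return [
        (ix * spacing, iy * spacing, height)
        for iy in range(per_side)
        for ix in range(per_side)
    ]


def distance_3d(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def free_space_path_loss_db(distance_m, freq_hz=2.0e9):
    """Free-space path loss in dB (stand-in model, not the model of [22])."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


uav_position = (150.0, 150.0, 150.0)  # above the center of the area
stations = bs_grid()
distances = [distance_3d(uav_position, bs) for bs in stations]
theoretical_pl = [free_space_path_loss_db(d) for d in distances]
```

At the initial position the UAV is 115 m above the central BS, which gives a free-space loss of about 79.7 dB at 2.0 GHz; the corner BSs are farther away and therefore see larger losses.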
In Algorithm 1, the number of data points $E$ is drawn from {100, 150, 200, 250, 300, 350, 400, 450, 500}. $dE$ is preset as {10, 15, 20, 25, 30, 35, 40, 45, 50} (in meters) to simulate the GPS error tolerance of different scenarios. $\gamma$ is 0.7, which means that 70% of the data in the data set is used for training the model, and the other 30% is used for its evaluation. It is well known that the performance of an MLP is sensitive to hyperparameter settings [58]. Thus, to find the best configuration of the different MLP models in our study, we carry out hyperparameter tuning by varying the learning rate, the number of hidden layers, the number of neurons per hidden layer and the training batch size. Note that the batch size is the number of data points used for each training iteration. The number of hidden layers is varied from 1 to 5, and the number of neurons of each hidden layer is taken from {10, 20}. Specifically, all MLP models have one input layer with 10 neurons and one output layer with one neuron. The learning rate is drawn from {0.001, 0.005, 0.0001, 0.0005}. In our simulation, the MLP model with three hidden layers (10, 20, 10) using the learning rate 0.005 achieved the best accuracy on both the training and validation data sets. All MLP layers use the Rectified Linear Unit (ReLU) as their activation function and Adam as the solver. ReLU is used in the hidden layers to prevent the vanishing gradient problem and accelerate the training process, while the Adam solver can work with sparse gradients and uses the average of recent gradient magnitudes to mitigate the impact of noise. The MLP models are built using the back propagation method on a data set containing approximately 1 million samples, and each element of the data set consists of a planned way point, a reported way point and a BS position. Accordingly, we set $\|P\| = \|R\| = \|L\| \approx 1 \times 10^{6}$. The training is performed for at most 200 epochs, and early stopping with a patience of 10 on the loss $B$ is applied to prevent overfitting. The model's loss $B$ defined in (17) is calculated for every epoch on a hold-out validation set.
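To give a sense of the size of this search space, the following sketch enumerates the architecture and learning rate combinations listed above. The batch size is left out because its candidate values are not stated, and `grid_search` with its `evaluate` callback is a hypothetical placeholder for training a model and returning its validation loss.

```python
from itertools import product

HIDDEN_LAYER_COUNTS = range(1, 6)          # 1 to 5 hidden layers
NEURON_CHOICES = (10, 20)                  # neurons per hidden layer
LEARNING_RATES = (0.001, 0.005, 0.0001, 0.0005)


def search_space():
    """Yield every (hidden layer sizes, learning rate) combination."""
    for depth in HIDDEN_LAYER_COUNTS:
        for hidden in product(NEURON_CHOICES, repeat=depth):
            for lr in LEARNING_RATES:
                yield {"hidden_layers": hidden, "learning_rate": lr}


def grid_search(evaluate):
    """Return the configuration with the smallest loss reported by `evaluate`."""
    return min(search_space(), key=evaluate)


configurations = list(search_space())
```

There are 2 + 4 + 8 + 16 + 32 = 62 architectures and 4 learning rates, i.e., 248 combinations before the batch size is considered, which is small enough for an exhaustive grid search.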
In Algorithm 2, $D$ is the data set of different BSs' prediction results, the size of $D$ is equal to $(\|P\| \times n / E)$, and the importance coefficient for missed detection $\alpha$ is set to 0.9. In Algorithm 3, the fraction $\gamma$ of training data is 0.7. For the MLP ensemble algorithm, one hidden layer with $n$ neurons is used to integrate the BS decisions. The depth $\rho$ of the DT is equal to the number of BSs $n$. For LR, the inverse regularization strength is set to 0.01 to reduce overfitting. The SVMK model uses the linear kernel with penalty coefficient $\sigma = 0.025$ to increase the fault tolerance of the classifier and avoid overfitting, while SVMG uses the radial basis function kernel with parameters gamma = 2 and $\sigma = 1$, where gamma controls the curvature of the decision boundary. Here, a high gamma leads to higher curvature.

B. Performance Metrics
We evaluate the effectiveness of the proposed algorithms according to commonly used performance metrics, including the total error ($TE$) of the hypothesis testing (see Eq. (15)), Accuracy, Recall, and F1. With Precision $= TP / (TP + FP)$, the latter three metrics are expressed as: Accuracy $= (TP + TN) / (TP + TN + FP + FN)$, Recall $= TP / (TP + FN)$, and F1 $= 2 \times$ Precision $\times$ Recall $/$ (Precision $+$ Recall), where $TP$ (True Positive) denotes the number of spoofed positions that are correctly detected, $FN$ (False Negative) represents the number of spoofed positions that are wrongly detected as normal positions, $FP$ (False Positive) is the number of normal positions that are wrongly detected as spoofed ones, and $TN$ (True Negative) denotes the number of normal positions that are correctly detected. The F1 metric is used to characterize the trade-off between the precision rate and the recall rate.
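These metrics can be computed from the four confusion counts as in the sketch below. The Accuracy, Recall and F1 definitions are the standard ones; the normalized form of the total error, with α per missed detection and β = 1 − α per false alarm averaged over all samples, is an assumption about how Eq. (15) is evaluated, and the function name is hypothetical.

```python
def detection_metrics(tp, fn, fp, tn, alpha=0.9):
    """Accuracy, Recall, Precision, F1 and an assumed normalized total error."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2.0 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        # alpha weights missed detections, beta = 1 - alpha weights false alarms
        "total_error": (alpha * fn + (1.0 - alpha) * fp) / total,
    }


example = detection_metrics(tp=90, fn=10, fp=5, tn=95)
```

For this example, the accuracy is 0.925, the recall is 0.9 and the F1 score is about 0.923, while the ten missed detections dominate the total error because of the larger weight α.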

C. Performance Evaluation of Statistical-MLP Approach
We first investigate the impact of the number of data points $E$ on the MLP performance under the proposed statistical-MLP approach. The results are summarized in Fig. 5(a) for a GPS error of $dE$ = 15 m. It can be observed from Fig. 5(a) that the MLP accuracy increases as $E$ increases. This is because the MLP output mainly depends on the statistical results used as its inputs, and a larger group of data points yields more reliable statistics and hence higher accuracy and precision. Fig. 5(b) further illustrates how the GPS error $dE$ affects the accuracy under the setting of $E$ = 150. We can see from Fig. 5(b) that as the GPS error increases, the accuracy decreases. This is because in an environment with a high GPS error, it is easier for an attacker to counterfeit the GPS position.
A careful observation of Fig. 5(a) and Fig. 5(b) indicates that the training and testing results follow the same increasing or decreasing trend, and the difference between these two results is small. A small difference between the training and testing results means that the model is well trained without overfitting. We can also observe from Fig. 5 that the accuracy under the proposed statistical-MLP approach exceeds 0.83 in all trials. In the next subsection, we will show that this accuracy can be further improved under the ML-based multi-MLP ensemble approach.

D. Performance Evaluation of Multi-MLP Ensemble Approach
1) Threshold-based Multi-MLP Ensemble Approach: According to the optimization problem in Eq. (15), the performance of the threshold-based multi-MLP ensemble approach depends on the number of BSs, the threshold value, and the value of $\alpha$. Fig. 6 illustrates the performance of the threshold-based multi-MLP ensemble approach in terms of FP, FN and total error. Fig. 6 also shows the influence of the number of BSs and of the importance coefficient $\alpha$ assigned to missed detections on the optimal threshold value that yields the smallest total error. The results in Fig. 6(a) and Fig. 6(b) show that an increase in the threshold value leads to a decrease in FPs and an increase in FNs, respectively. The opposite trends exhibited by FP and FN with respect to the threshold value indicate the trade-off between these two measures. With lower threshold values, the detection algorithm is likely to detect all spoofed GPS positions, but at the expense of erroneously classifying normal GPS positions as spoofed. On the other hand, higher threshold values make the detection algorithm more permissive, which reduces the number of false alarms but at the price of raising the risk of missing spoofed GPS positions. Another key observation from Fig. 6(a) and Fig. 6(b) is that increasing the number of BSs can lower both the FP and the FN. This can be explained by the fact that involving more BSs in the detection process provides more data points, which naturally improves the credibility of the ensemble predictions. As introduced in Eq. (15), obtaining the optimal threshold that gives the best detection performance (i.e., the smallest $TE$) depends on the relative importance of detecting all spoofed positions versus keeping the number of false alarms low. From the results depicted in Fig. 6(c) and Fig. 6(d), it is clear that increasing the importance coefficient of missed detections $\alpha$ leads to lower threshold values for achieving the smallest $TE$. Indeed, lower threshold values are required to reduce the FN, which minimizes their influence on the $TE$. The results also show the positive impact that the number of BSs has on lowering the $TE$. As demonstrated in Fig. 6(a) and Fig. 6(b), this positive impact is due to the decrease in FN and FP as the number of BSs increases.
2) ML-based Multi-MLP Ensemble Approach: To evaluate the effectiveness of the ML-based multi-MLP ensemble algorithm in detecting the GPS spoofing attack, we measure its performance in terms of FP, FN, TE, as well as Accuracy, Recall, and F1 using the test data set under the setting of $dE$ = 15 m and $E$ = 150.
We summarize the performance results in Fig. 7. As with the threshold-based multi-MLP ensemble approach, the results depicted in Fig. 7(a) and Fig. 7(b) show that the same observation about the impact of the number of BSs on FP and FN applies; that is, increasing the number of BSs yields reduced FP and FN. A higher number of BSs provides more data to the ML meta-model aggregating the individual predictions, which improves the ensemble decision on whether a GPS position is spoofed. It is worth noting that an exception is observed for the GNB meta-model, where the increase in the number of BSs slightly increases the generated FP. The reason behind this trend is that GNB achieves the best performance in terms of FP, with a value of less than 0.01, which makes it sensitive to potential errors in the predictions provided by some far BSs. Indeed, the farther the UAV is from the BSs, the higher the error in the path loss values will be [32].
It can be observed from Fig. 7(d) that the Accuracy can be significantly improved. For instance, the Accuracy is nearly 0.97 using the ensemble model with two BSs, while it is below 0.90 when only the statistical-MLP algorithm is used. The reason is that the ensemble algorithm takes more evidence from different BSs, which enriches the knowledge available to the ensemble algorithm for making the final decision. We can draw the same conclusion in terms of Recall in Fig. 7(e) and F1 in Fig. 7(f). Furthermore, we observe that the GNB meta-model exhibits the worst performance among the six meta-models. This can be explained by the fact that GNB makes a very strong assumption on the normal distribution of the data, which may not necessarily hold for the individual prediction results fed into the GNB meta-model, leading to incorrect detection decisions. For the data set under study, GNB generates the highest number of missed detections.
We further compare the ML-based algorithms to the threshold-based algorithm, as illustrated in Fig. 7. It is worth noting that for the threshold-based algorithm, we report the performance results of the optimal threshold value selected for a given number of BSs when the importance coefficient $\alpha$ is set to 0.9. For instance, the performance results for 3 BSs are the ones obtained with $\theta$ = 0.73, while those for 4 BSs are the ones achieved with $\theta$ = 0.70.
The results in Fig. 7 indicate that the threshold-based algorithm achieves its optimization goal defined in Eq. (15) of minimizing the $FN$, which also positively influences the Recall metric. Although the threshold-based algorithm can reduce the missed detections, it fails to achieve a trade-off among $FN$, $FP$, Accuracy and Recall. Such a balance can be measured using the F1 score, which represents the harmonic mean of precision and recall. The F1 score values in Fig. 7(f) show that the ML-based algorithms outperform the threshold-based algorithm in achieving a better precision-recall balance when the number of BSs is more than 5. From Fig. 7(b), it can be observed that the FN rate for most ML algorithms is less than 2%, which demonstrates the effectiveness of the ML-based multi-MLP ensemble approach in protecting the system from GPS spoofing attacks while achieving a good balance between precision and recall.

E. Time Complexity Comparison
The time for building the detection model (i.e., training time) and the time needed by a trained model to make a decision (i.e., testing time) are critical indicators for performance evaluation [51]. Specifically, a longer training time brings more latency for updating the threshold value or the ML model in a dynamic environment in order to adapt to changes in the path loss model, while a longer testing time introduces more latency for the spoofing detection. In what follows, we discuss and compare the training and testing time complexity of the proposed multi-MLP ensemble approaches. For the threshold-based approach, the training time represents the time required to find the optimal threshold.
From the theoretical analysis conducted in Section V, the time complexity of the threshold-based algorithm is $O((n + \|\Theta\| \times 2) \times \|D\|)$, and the complexity of training and testing an ML-based algorithm is $O(\gamma \times \|D\| \times \nabla_{M_i} + (1 - \gamma) \times \|D\|)$, where $\nabla_{M_i}$ is $O(n)$ for SVMG and SVMK, $O(2 \times n)$ for LR and GNB, $O(n \times \rho)$ for DT, and $O(I \times n \times q^{h})$ for MLP. It can be seen that the time complexity of the threshold-based algorithm is much higher than that of the ML-based algorithms, excluding the MLP-based algorithm. This is because the number of thresholds $\|\Theta\|$, equal to 100 in this study, is much larger than $2 \times n$ and still higher than $n^{2}$ for $n \in [1, 9]$. The higher complexity of the MLP-based algorithm is due to the fact that its training time depends not only on $n$, but also on the number of hidden layers $q$, the number of neurons per layer $h$, and most importantly the number of training epochs $I$ (set to 200 in our simulation).
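The comparison can be checked numerically by evaluating the per-data-point factors of these expressions, i.e., the complexities divided by $\|D\|$. The sketch below is a worked calculation under the settings stated in the paper (100 thresholds, γ = 0.7, ρ = n); the MLP case is omitted, and the function names are hypothetical.

```python
def threshold_cost(n, num_thresholds=100):
    """Per-data-point factor of the threshold-based algorithm: n + 2 * |Theta|."""
    return n + 2 * num_thresholds


def ml_cost(training_cost, gamma=0.7):
    """Per-data-point factor of an ML-based algorithm: gamma * cost + (1 - gamma)."""
    return gamma * training_cost + (1.0 - gamma)


def per_sample_costs(n, gamma=0.7):
    return {
        "threshold": threshold_cost(n),
        "SVMK/SVMG": ml_cost(n, gamma),
        "LR/GNB": ml_cost(2 * n, gamma),
        "DT": ml_cost(n * n, gamma),  # depth rho equals n
    }


costs_for_nine_bs = per_sample_costs(9)
```

For n = 9, the threshold-based factor is 209, compared with 6.6 for the SVM variants, 12.9 for LR and GNB, and 57 for DT, which agrees with the ordering stated above.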
We also measured the training time and testing time via simulations on a physical machine with a 4-core Intel Core 1.6 GHz CPU and 16 GB of RAM. Fig. 8 illustrates the obtained results. Fig. 8(a) shows that the training time of the MLP model is significantly higher than that of the other ML-based algorithms. An intuitive explanation is that the MLP model needs more time to update the neuron parameters in order to provide better performance. Fig. 8(b) shows that the testing time of SVMK and SVMG is higher than that of the other ML-based algorithms. This is because the SVMK and SVMG models are not incremental; i.e., all the test data are analyzed at once. The threshold-based algorithm spends 1400 ms each time it searches for the best threshold, which is also much higher than the training time of most ML-based algorithms, except the MLP. For testing, the threshold-based algorithm only needs to compare the test values with the thresholds, which is of lower complexity than all ML-based algorithms.
Based on the above performance evaluations, we find that the ML-based multi-MLP ensemble approach achieves a significant improvement in GPS spoofing detection performance compared with the statistical-MLP algorithm and the threshold-based multi-MLP ensemble approach. Meanwhile, a flexible combination of the statistical-MLP algorithm and other ML methods could reduce the spoofing detection latency and achieve real-time GPS spoofing detection.
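Training and testing times of the kind reported above can be measured with a small wall-clock harness. The sketch below is only an illustration of that measurement, not the benchmark code used for Fig. 8: the `timed` helper, the toy threshold search (which picks the most accurate of 100 candidate thresholds, not the objective of Eq. (15)), and the synthetic scores are all assumptions.

```python
import time


def timed(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times; return (last result, best time in ms)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return result, best


def fit_threshold(scores, labels, candidates):
    """Toy 'training': scan all candidates and keep the most accurate one."""
    def accuracy(th):
        hits = sum((s >= th) == bool(y) for s, y in zip(scores, labels))
        return hits / len(labels)
    return max(candidates, key=accuracy)


def predict(scores, threshold):
    """Toy 'testing': a single comparison per sample."""
    return [int(s >= threshold) for s in scores]


# Synthetic spoofing scores in [0, 1]; a score of 0.5 or more is labeled as spoofed.
scores = [i / 999 for i in range(1000)]
labels = [int(s >= 0.5) for s in scores]
candidates = [k / 100 for k in range(100)]

threshold, train_ms = timed(fit_threshold, scores, labels, candidates)
preds, test_ms = timed(predict, scores, threshold)
print(f"threshold={threshold}, train={train_ms:.3f} ms, test={test_ms:.3f} ms")
```

Taking the best of several repeats reduces the influence of scheduling noise; absolute values depend on the machine and are not comparable with the figures reported in the paper.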

VII. CONCLUSION AND FUTURE WORK
This paper explored the spoofing detection performance for cellular-connected UAVs based on ML methods. To this end, we formulated the detection task as a constrained optimization problem. To solve this nonlinear and nonconvex optimization problem, we first proposed a statistical-MLP algorithm to independently predict the spoofing probability of the GPS position at each BS. To improve the detection performance, we further proposed a multi-MLP ensemble algorithm that integrates these independent prediction results using each of six different ML models. Remarkably, our proposed algorithm can achieve above 97% accuracy rate under two BSs, while it can

Fig. 2. The illustration of the parameters $d_{iu}$, $d'_{iu}$, and $h_u$.

Fig. 4. The MEC architecture for deep ensemble learning-based GPS spoofing detection.

Algorithm 3 ML-based multi-MLP ensemble algorithm.
Input:
D: Data set. Each data point in D is a vector of size (n + 1) that includes n BSs' MLP predictions and one label, which is set to 1 if spoofing and 0 if normal.
M: Ensemble ML methods, M = {MLP, DT, LR, SVMK, SVMG, GNB}.
γ: Fraction to create bootstrapped training data.
Output:
C: The final ensemble ML model.
1: X: Data points, X = D[1 : n]
2: Y: Data labels, Y = D[n : n + 1]
3: Divide (X, Y) into training data set ( X, Y) and testing data set ( X, Y) according to γ
4: procedure MODELTRAINING(( X, Y), M)
5: end procedure
20: procedure MODELSELECTION(B, M)
21: Get the min loss index k, k = index(min(B));
22: return C ← M_k
23: end procedure

Theorem 3. The time complexity of building the ML-based multi-MLP ensemble algorithm is $O(\gamma \times \|D\| \times \sum_{i=1}^{\|M\|} \nabla_{M_i} + (1-\gamma) \times \|D\| \times \|M\| + 1)$. Here, γ is the fraction to create bootstrapped training data from the MLPs' prediction results D obtained by Algorithm 1, $\|D\|$ is the number of prediction results in D, and $\nabla_{M_i}$ denotes the time complexity of the i-th ML method $M_i$, where $M_i \in$ {MLP, DT, LR, SVMK, SVMG, GNB}.

Proof. The time complexity of the ML-based multi-MLP ensemble algorithm corresponds to the sum of the time complexity of ML model training and that of ML model evaluation. The time complexity of ML model training can be determined as $O(\gamma \times \|D\| \times \sum_{i=1}^{\|M\|} \nabla_{M_i})$, where $\gamma \times \|D\|$ is the number of elements of the training data set, $\nabla_{\mathrm{MLP}} \approx O(I \times n \times q^h)$, n is the number of features, $\nabla_{\mathrm{DT}} \approx O(n \times \rho)$, ρ is the depth of the tree, $\nabla_{\mathrm{LR}} \approx O(2 \times n)$ for the classification of two labels, $\nabla_{\mathrm{SVMK}} \approx O(n)$, $\nabla_{\mathrm{SVMG}} \approx O(n)$, and $\nabla_{\mathrm{GNB}} \approx O(2 \times n)$ [53]-[57]. The time complexity of ML model evaluation is $O(\|D\| \times \|M\|)$. Besides, the time complexity of preparing data and selecting the ML model in the training and evaluation processes is $O(1)$. Thus, we obtain that the time

(a) The effect of E on accuracy under dE = 15 m. (b) The effect of dE on accuracy under E = 150.

Fig. 6. The performances of the threshold-based multi-MLP ensemble algorithm under dE = 15 m and E = 150.

Fig. 8. Training and testing time consumption of MLP ensemble algorithms.

TABLE II
SUMMARY OF DETECTION METHODS FOR GPS SPOOFING ATTACK.

TABLE III
NOTATIONS USED IN THE PAPER.
$P = (\mathrm{Lat}_u, \mathrm{Lon}_u, \mathrm{Alt}_u)$. $L_{iu}^{Q0}(t)$ and $L_{iu}^{Q4}(t)$ denote, respectively, the lowest and the largest path loss excluding any outliers. $L_{iu}^{Q2}(t)$ and $L_{iu}^{Q3}(t)$ represent the median and the median of the upper half of $L_{iu}(t)$, respectively. Here, outliers are the observations that fall below $L_{iu}^{Q1}(t) - 1.5 \times \mathrm{IQR}$ or above $L_{iu}^{Q3}(t) + 1.5 \times \mathrm{IQR}$, where IQR is the Inter-Quartile Range and is equal to $L_{iu}^{Q3}(t) - L_{iu}^{Q1}(t)$.
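The quartile notation above can be illustrated with a minimal sketch that computes Q0 to Q4 and the outliers for one window of path loss samples. It assumes that Q1 is the median of the lower half (by symmetry with the stated definition of Q3) and that the middle sample is excluded from both halves when the window has an odd length; the function names and the sample values are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Q1, Q2, Q3 as medians of the lower half, the whole sample, the upper half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Assumption: for an odd-length sample, skip the middle element.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def five_number_summary(values):
    """Q0..Q4 with Q0 and Q4 taken over the samples inside the 1.5*IQR fences."""
    q1, q2, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"Q0": min(inliers), "Q1": q1, "Q2": q2, "Q3": q3,
            "Q4": max(inliers), "outliers": outliers}


# Hypothetical path loss samples in dB with one environment-induced spike.
samples = [101.0, 102.5, 103.0, 103.5, 104.0, 104.5, 105.0, 106.0, 140.0]
summary = five_number_summary(samples)
print(summary)
```

In this example the 140 dB sample falls above the upper fence, so it is reported as an outlier and does not affect Q4.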

TABLE IV
HYPERPARAMETER SETTINGS OF MLP MODELS.