Improving GNSS Spoofing Awareness in Smartphones via Statistical Processing of Raw Measurements

Due to the low received power of Global Navigation Satellite System (GNSS) signals, the performance of GNSS receivers can be disrupted by anthropogenic radio frequency interference, with intentional jamming and spoofing activities being among the most critical threats. It is reported in the literature that modern, GNSS-equipped Android smartphones are generally resistant to simplistic spoofing, and many recent contributions support such a biased belief. In this paper, we present the results of a test campaign designed to further stress the resilience of such devices to simplistic spoofing attacks and highlight their actual vulnerability. We then propose an effective spoofing detection technique that exploits the spatial and temporal correlation of the counterfeit signals by leveraging the statistical analysis of raw GNSS measurements. By not requiring access to the low-level signal processing of the GNSS receiver, the proposed solution applies to any device embedding a GNSS receiver that provides raw GNSS measurements, such as current Android smartphones. Vulnerability analysis and validation of the proposed technique were conducted in a controlled environment by transmitting realistic, counterfeit Global Positioning System L1/CA navigation signals to a variety of Android smartphones embedding different GNSS chipsets. We show that, under proper conditions, the devices were vulnerable to the attacks and that the effects were visible through their raw measurements, i.e., carrier-to-noise ratio ($C/N_{0}$), pseudo-range measurements, and position estimates. In particular, the study demonstrates that the cross-correlation between the $C/N_{0}$ time series provided by each device for different GNSS satellites increases under spoofing conditions, thus constituting an effective metric to detect the attack within a few seconds.

mass-market devices. To date, raw GNSS measurements support is mandatory for devices running Android™ 10 (API level 29) or higher. Around 82 % of available Android™ phones currently support raw measurement data [3]. These data include internal clock measurements, like the time of signal reception, clock drift, and clock discontinuities, and GNSS receiver measurements, such as the received GNSS satellite time, i.e., Time of Week (TOW), Doppler frequency, and carrier phase measurements, as well as constellation status and further navigation data. The full description of the available raw measurements can be found in [3]. More recently, the Google Service Framework™ has provided Automatic Gain Control (AGC) measurements through updated Android™ classes, with the release of Android™ API 9.0. However, not all GNSS chipsets are fully compliant with all those measurements, and the quality of such data may vary from device to device [4]. Their use can lead to improved GNSS performance by opening the door to more advanced processing techniques previously reserved for high-end GNSS receivers. These benefits have been demonstrated for code-based, aided, differential and precise point positioning [4], [5], [6]. Furthermore, collaborative navigation approaches leveraged raw measurements to offer a naive collaborative distancing technique [7] and a GNSS-only enhancement of Position, Velocity, Timing (PVT) estimation accuracy [8]. Although the standalone position computed from raw GNSS data may not be as accurate as the one obtained through onboard sensor and network integration (i.e., Google Fused Location Provider), the use of raw GNSS data and ad-hoc algorithms can improve the solution with respect to unaided, standalone GNSS solutions. However, positioning improvement is not the only possible exploitation of raw measurements.
Since they provide, to a certain extent, insight into the processing taking place inside the chipset, they can be used to analyse the effects of radio-frequency (RF) impairments affecting the received GNSS signals, as well as to compute new metrics for the quality assessment of the associated output solution [9]. As an example, it is known that, due to the weakness of GNSS signals, GNSS receiver performance can be easily disrupted by anthropogenic interference, with jamming and spoofing activities being critical threats in this context. Swept-frequency and frequency-modulated jamming are typical forms of intentional Radio Frequency Interference (RFI) that can be emitted by personal privacy devices with a carrier frequency that varies across GNSS bands. Spoofing, on the other hand, refers to the transmission of counterfeit, yet plausible GNSS signals with the intent of inducing false PVT estimates at a victim's receiver. Countermeasures for jamming and spoofing threats have been extensively discussed and proposed in the literature [10]; since spoofing is the threat targeted by our test campaign, a brief review of the related countermeasures is provided in this article. As recalled later on, many of the proposed spoofing detection techniques require the implementation of sophisticated algorithms that need access to the low-level signal processing stages of the GNSS receiver in order to be effective against simplistic to advanced spoofing attacks [11], [12], [13], [14], [15], [16]. A classical technique for spoofing detection, based on the carrier-to-noise ratio ($C/N_0$), is proposed in [13], where the measured $C/N_0$ of the received GNSS signals is compared to a known or expected value. The $C/N_0$ measures the strength of the carrier signal compared to the background noise. If the $C/N_0$ is significantly lower than the expected value, it could indicate that an unspecified RFI is ongoing, and the signal should be discarded.
However, if a spoofing signal is unspread, it can show a $C/N_0$ value within a nominal range, even though jamming or spoofing is performed. Detection of such malicious actions can be difficult in this case, as the $C/N_0$ measurement may not show any abnormalities. In such cases, other techniques operating over frequency or time domains, or the use of integrity messages from augmentation systems, can be used in conjunction with $C/N_0$ monitoring to provide a more robust spoofing detection. Some of these techniques can detect spoofing even when the spoofer uses the same codes, frequencies, and power levels as the legitimate signal [17].
However, such techniques might not be usable in systems and platforms embedding a GNSS chipset for which only a limited number of outputs is available to the user or to the higher application layers, such as in smartphones and several other mass-market devices. For this reason, it is of interest to devise interference detection and classification techniques based only on the observation and the statistical processing of common outputs of GNSS receivers, and in particular of the raw measurements provided by Android™ smartphones as well as by a growing number of low-cost receivers, as reported in [11]. In this paper, taking the smartphone platform as a reference case study, we propose and assess the performance of a technique for the detection of single-antenna spoofing attacks. The proposed solution exploits the spatial and temporal correlation of the spoofing signals, and it is validated through an experimental campaign based on the analysis of the correlation of the raw output data provided by various Android™ smartphones. Therefore, the present article aims at
• analyzing the effect of single-antenna, simplistic spoofing attacks on the raw GNSS measurements provided by different smartphones embedding various GNSS chipsets
• identifying the most suitable time series of raw GNSS measurements and a suitable figure of merit allowing for a prompt and robust detection of the attacks
• defining a methodology for the analysis of the raw data of interest and of the associated figure of merit towards an effective detection of the attack
• experimentally assessing the proposed technique through data collections retrieved from an on-field test campaign including 18 devices.
The rest of the paper is organised as follows: Section II recalls reference studies and fundamental aspects of spoofing attacks against GNSS receivers. Section III describes the setup for the vulnerability analysis and the effects on raw GNSS measurements and position estimation.
Section IV introduces the methodology for a spoofing detection strategy exclusively based on raw GNSS data. A performance assessment is then presented in Section V and, eventually, conclusions are drawn in Section VI.

II. BACKGROUND AND RELATED WORKS
Recent studies have proposed various techniques for investigating the impact of spoofing attacks on GNSS receivers. Such an effort has been fundamental to enhance security and reliability in GNSS-based applications.
A GNSS satellite simulator was utilized for the first time in a GNSS spoofing experiment in [16]. In [18] researchers investigated the practical aspects of a satellite lock takeover, where a victim receives spoofed signals after first being locked on to legitimate Global Positioning System (GPS) signals. In [19], researchers at the University of Texas (Austin) developed a portable low-cost GPS intermediate spoofer. A successful spoofing attack was carried out on a commercial super yacht using intermediate spoofing techniques, thus highlighting the potential threat against civil PNT systems. Various techniques have since been developed to evaluate the active resilience of GNSS devices and to mitigate the risk of spoofing attacks for improved GNSS security in [20].
Early countermeasures were proposed in [21] to counteract simplistic spoofing attacks. Later advances in the field fostered the development of advanced spoofing detection and mitigation techniques at various stages of signal processing in GNSS receivers [22]. While high-end GNSS receivers now implement spoofing alert systems at their application layers, Android™ smartphones do not provide proper warnings to the user yet, and effective spoofing attacks may stealthily hinder their PNT capabilities. Indeed, upcoming GPS Chips-Message Robust Authentication (Chimera) [23] and Galileo Open Service Navigation Message Authentication [24] services allow receivers to be resilient against counterfeit signals at the cost of implementing the respective authentication algorithms. At present, it is worth examining the potential effects of intentional interference on mass-market GNSS-equipped devices, as well as assessing the resilience of their embedded receivers and their capability to detect ongoing attacks in real time with reasonable latency. Some demonstrations of spoofing against Google [28], inertial navigation sensors such as magnetometer, accelerometer, and barometer were used for triggering possible spoofing event detection in smartphones. By exploiting the availability of GNSS raw measurements, in [29], [30], the impact of spoofing attacks against mobile phones was analysed and specific techniques were suggested to enhance security, such as the use of cheap accelerometers together with the monitoring of raw GNSS measurements. The possibility to compare or combine metrics to better identify spoofing and meaconing attacks was also investigated in [31]. In the study, GNSS anti-spoofing defenses were proposed based upon a cooperative positioning approach leveraging the exchange of raw GNSS measurements.
The results allowed the identification of possible metrics to be monitored to identify malicious attacks against the positioning and navigation systems in mass-market connected devices. In [32] researchers provided a mobile application for detecting GNSS jamming and spoofing. The application used four different methods to detect attacks: comparing the GNSS and network locations, checking the Android™ mock location flag, comparing the GNSS and system times, and observing the AGC and carrier-to-noise density ratio ($C/N_0$) signal metrics.
In [33] the authors looked at AGC measurements from multiple smartphone models with different GNSS chipsets, assessed their behavior under RFI, and pointed out the current limitations as well as the improvements that would assist in their usage as a GNSS RFI indicator. The Signal Quality Monitoring Technique, a spoofing-detection methodology, was presented in [34] for realistic spoofing scenarios. This solution is based on the quality of the correlation between the incoming signal and the receiver's local replica and on the cooperative use of a pair of extra correlators to find vestiges of the signal. In [35], the authors suggested how the National Marine Electronics Association messages provided by GNSS receivers can be utilized to detect instances of spoofing and identify suspicious, potentially spoofed satellite signals. Authors in [36] [44]. The main limitation of the study was its narrow focus on only a few GNSS chipsets integrated into consumer devices, which may not be representative of the broader range of devices available in the market.
Moreover, it was observed that the Android™ smartphones under test were generally resilient. A recent work proposes a combined jamming and spoofing detection technique based on AGC and $C/N_0$ observations [45]. The proposed strategy leverages two theoretical assumptions from [46], [47]:
1) if the AGC value decreases and the $C/N_0$ decreases, jamming is likely;
2) if the AGC value decreases and the $C/N_0$ is relatively constant, spoofing is more likely than jamming.
Despite offering a relatively simple detection algorithm, it has to be remarked that, for many devices, AGC measurements may be an unreliable metric due to the coarse resolution of the values and the fact that they are not even provided as an output by some GNSS chipsets. Depending on the Android™ version, AGC values are not guaranteed for old devices and, generally speaking, they are not as reliable as other types of raw GNSS measurements, as highlighted in [45]. Furthermore, standalone $C/N_0$ time series have to be compared with a pre-defined threshold that may not be straightforward to determine. Therefore, in order to provide a more robust detection, the algorithm is hybridized in an Android™ application named GNSS Alarm, which is in charge of concurrently
• comparing the GNSS estimated location with estimates from other location providers (e.g., network)
• checking for the Android™ mock location flag
• comparing the GNSS and Android™ system times.
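As an illustration only, the two AGC and $C/N_0$ rules recalled above can be sketched as a small decision function. This is not the implementation of [45]: the function name, the baseline-relative inputs, and both thresholds are hypothetical choices made for this sketch.

```python
def classify_rfi(agc_drop_db, cn0_drop_db, agc_thresh_db=2.0, cn0_thresh_db=3.0):
    """Apply the two heuristic rules to drops measured against a clean baseline.

    agc_drop_db: baseline AGC minus current AGC (positive when the gain decreases).
    cn0_drop_db: baseline mean C/N0 minus current mean C/N0 (positive when it decreases).
    Both thresholds are illustrative assumptions, not values from the cited works.
    """
    if agc_drop_db < agc_thresh_db:
        # No significant AGC decrease: neither rule is triggered.
        return "nominal"
    if cn0_drop_db >= cn0_thresh_db:
        # Rule 1: AGC decreases and C/N0 decreases.
        return "jamming likely"
    # Rule 2: AGC decreases while C/N0 stays roughly constant.
    return "spoofing more likely than jamming"
```

The sketch makes the limitation discussed above explicit: the outcome depends entirely on two device-dependent thresholds and on the availability of a trustworthy AGC reading.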
In this work, starting from the results presented in the existing literature, an extended investigation is performed, testing a wider variety of Android™ smartphones and spoofing attacks in different scenarios via an extensive test campaign that unveils the actual, unsolved vulnerability of the smartphones against these threats. Differently from the approaches recalled in this literature review, our technique aims at
• leveraging a single data source, by looking for a unique figure of merit that only relies on GNSS data with no need for external location providers or access to O.S. flags
• detecting spoofing on a signal basis and not only as an aggregated flag (constellation/band), thus providing a more detailed view of the attack (if required). It is worth remarking that spoofing attacks may affect only a subset of the available satellites and that AGC data cannot provide such a detailed view
• relying on a threshold that is independent of the magnitude of the data under analysis and does not require further normalization or runtime updates.

A. SPOOFING MODELING AND CLASSIFICATIONS
Unstructured RFI such as jamming can significantly impair the receiver by disrupting its operational capabilities at the early signal processing stages. On the other hand, spoofing disturbances act stealthily, as the receiver operation is typically not interrupted from a user standpoint. Spoofing methodologies are mostly classified on the basis of the time-coherence between the spoofing signals and their legitimate counterparts. The difficulty in performing coherent attacks also determines the practical feasibility and associated risk of such threats [37]. Furthermore, the possibility of detection from a receiver's standpoint may introduce a further level of classification [10]. In [38], researchers classified spoofing attacks using a multilayered model, distinguishing between development architectures, acquisition strategy, control strategy, and application. This allowed them to assess the risks and strategies of operational spoofers with prevention.
Depending on the features of the spoofing and the complexity of the attack, it is possible to classify these disturbances into three categories: simplistic, intermediate and sophisticated spoofing attacks. They are recalled hereafter for the sake of completeness [10], [16], [37].

1) SIMPLISTIC OR ASYNCHRONOUS SPOOFING
These attacks are characterized by the incoherent transmission of counterfeit GNSS signals over a pre-determined bandwidth, aiming at forcing victim receivers to estimate a fake PVT solution. The lack of synchronisation between the spoofer and the GNSS timescale can often be used to detect ongoing attacks [45]. This class of spoofer can also be built by using a signal simulator that re-transmits counterfeit signals by means of mass-market SDR components [39].

2) INTERMEDIATE OR SYNCHRONOUS SPOOFING
This attack foresees a spoofer architecture embedding a built-in GNSS receiver that acquires and tracks legitimate GNSS signals in order to coherently generate their counterfeit counterparts. By receiving real-time GNSS signals and estimating the main parameters of interest (i.e., code phase offset and Doppler shift), the spoofer can transmit the counterfeit signal in real time, modifying these parameters as needed. A downside of an intermediate spoofing attack is that, in order to be effective, some a priori information about the victim receiver must be known.
To successfully mislead the target PVT estimate, different factors must be known, except in the case of a self-spoofing scenario in which the spoofer and the victim receiver may be co-located. Some implementations of intermediate spoofing of GPS signals, which exploit a modified software-defined receiver integrated with the front-end, are presented in [10].

3) SOPHISTICATED OR MULTI-ANTENNA SYNCHRONOUS SPOOFING
This attack is also referred to as a nulling attack, and it aims at transmitting a disruptive interference signal along with counterfeit, spoofing signals. The use of multiple transmitters increases the effectiveness of the attack against physical detection methods based, for instance, on the angle of arrival. Sophisticated spoofing is the most insidious technique, as it takes control of the target receiver typically without being detected. As described in [40], the malicious action leverages a soft take-over through a time-synchronised transmission. It starts with a low power level, which is slowly increased until the receiver has acquired and started to track the spoofed signals. In [13], researchers investigated sophisticated spoofing scenarios in a multi-layered processing architecture. However, this type of spoofing uses multiple antennas to broadcast GNSS signals, thus overcoming state-of-the-art anti-spoofing countermeasures. In practice, this threat is rarely deployed due to its high cost and complexity, and it is typically not affordable without advanced expertise. Due to its affordable cost and practicability, simplistic spoofing is the target threat we experimentally addressed in the current study. However, in the following, single-antenna attacks are modelled that cover, in principle, both simplistic and intermediate spoofing scenarios.

B. LEGITIMATE AND COUNTERFEIT SIGNALS MODELING
In the absence of interference, the GNSS signal received at the antenna can be modeled as the sum of $N_s$ independent satellites' signals

$$y(t) = \sum_{i=1}^{N_s} \sqrt{2 P_{R,i}}\, D_i(t-\tau_i)\, C_i(t-\tau_i) \cos\big(2\pi (f_c + f_{d,i})\, t + \theta_i\big) + n(t) \qquad (1)$$

where $P_{R,i}$ is the received signal power, $D_i(t)$ is the navigation data stream, $C_i(t)$ is the pseudo-random code sequence, $f_c$ is the carrier frequency shifted by the observed Doppler shift $f_{d,i}$, $\tau_i$ is the propagation delay and $\theta_i$ is the phase offset. Finally, $n(t)$ is the thermal noise contribution. The received power of each signal, $P_{R,i}$, reflects the unique properties of the propagation path it covered between transmitting and receiving antennas.

1) MODELLING OF SPOOFING GNSS SIGNALS
In order to deceive a GNSS receiver, a spoofer must replicate all the components of the navigation signals defined in (1), such as the spreading code, the Radio Frequency (RF) carrier, and the navigation data symbols of the selected constellation. A simplistic GNSS spoofer generates and transmits GNSS-like signals. However, it cannot keep phase and time coherence w.r.t. the legitimate signals without external time and frequency sources. The generated counterfeit signals have a similar structure to the legitimate signals; however, they may differ in terms of Doppler and phase shifts of both code and carrier. Furthermore, different power levels are usually observed at the receiver location for spoofing and legitimate signals, respectively. Advanced attacks may calibrate the signal power to be similar enough to the received power of each legitimate GNSS signal. However, such a calibration would require accurate knowledge of the attacker-to-victim range, thus of the victim's location, as typically addressed by sophisticated spoofing actions. In the following, spoofing GNSS signals will be identified through the superscript $(\cdot)^{(S)}$. A simplistic spoofer will generate $N_{sp}$ counterfeit signals, each characterized by its own code delay $\tau_i^{(S)}$, Doppler shift $f_{d,i}^{(S)}$ and phase offset $\theta_i^{(S)}$. An additional time-varying contribution may be introduced by the relative kinematics of the transmitting and receiving antennas, and it is assumed equal to zero when both are static or carried on the same moving rigid body. The expression for the sum of $N_{sp}$ single-frequency, single-constellation spoofed signals is

$$y^{(S)}(t) = \sum_{i=1}^{N_{sp}} \sqrt{2 P^{(S)}_{R,i}}\, \tilde{D}_i\big(t-\tau^{(S)}_i\big)\, C_i\big(t-\tau^{(S)}_i\big) \cos\big(2\pi (f_c + f^{(S)}_{d,i})\, t + \theta^{(S)}_i\big) + n(t) \qquad (2)$$

where the received power at the antenna, $P^{(S)}_{R,i}$, reflects the different amplitude attributed to each signal to simulate different path losses, and $\tilde{D}_i$ highlights possible differences w.r.t. the legitimate navigation message data stream foreseen in (1).
The value of $P^{(S)}_{R,i}$, as for the real signals, may actually change over time but, differently from the case of (1), such variations depend on
• the misalignment of transmitting and receiving antennas as well as any changes in their relative heading during the spoofing attack
• the fading effects introduced by the terrestrial channel, mostly due to multipath, which is very relevant when the attacker is at the same altitude as the victim receiver.
When a GNSS receiver is under a spoofing attack, it receives both authentic and spoofed signals, and additive thermal noise affects their sum. Therefore, the total signal at the victim receiver's front-end is modeled as

$$y_{tot}(t) = \bar{y}(t) + \bar{y}^{(S)}(t) + n(t) \qquad (3)$$

where the notation $\bar{x}$ indicates the noiseless version of a signal, so that $\bar{y}(t)$ and $\bar{y}^{(S)}(t)$ are the legitimate and spoofing signals derived from (1) and (2) by neglecting the respective noise terms. Without loss of generality, $f_c$ will hereafter refer to the GPS L1/CA center frequency, i.e., 1575.42 MHz.

2) IDENTIFICATION OF SPOOFING GNSS SIGNALS
From a geometrical and physical standpoint, legitimate GNSS signals propagate through different channels. The free space path loss mainly contributes to the differences in the received power observed for each satellite. Multipath-related constructive and destructive interferences may be responsible for fluctuations in the received power, being conditioned by the elevation at which each satellite is observed. Differently from the authentic GNSS signals, all the spoofing signals travel through the same propagation path from the spoofer's to the receiver's antenna, thus experiencing a common, yet unique physical channel. The aforementioned power variations will then reflect in a similar way on each generated satellite's signal by introducing strong spatial and temporal correlation. In the following preliminary analysis we look for such correlations through the time series of the GNSS raw measurements to identify the spoofing attacks.
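A worked expression may help to fix ideas. One common way to quantify such a correlation, assumed here purely for illustration and not necessarily the figure of merit adopted later in the paper, is the sample Pearson coefficient between the $C/N_0$ time series of two satellites $i$ and $j$ over $K$ epochs. The symbols $s_i[k]$, $a_i$, $g[k]$, $w_i[k]$, $\sigma_g^2$ and $\sigma_w^2$ are notation introduced only for this sketch.

```latex
% Sample correlation between two C/N0 time series (in dB-Hz) over K epochs
\rho_{ij} = \frac{\sum_{k=1}^{K} \big(s_i[k]-\bar{s}_i\big)\big(s_j[k]-\bar{s}_j\big)}
                 {\sqrt{\sum_{k=1}^{K} \big(s_i[k]-\bar{s}_i\big)^2}\,
                  \sqrt{\sum_{k=1}^{K} \big(s_j[k]-\bar{s}_j\big)^2}},
\qquad \bar{s}_i = \frac{1}{K}\sum_{k=1}^{K} s_i[k]

% Toy model for spoofed signals: a per-signal level a_i, a channel term g[k]
% common to all signals, and independent estimation noise w_i[k]
s_i^{(S)}[k] = a_i + g[k] + w_i[k]
\;\;\Longrightarrow\;\;
\mathrm{E}\{\rho_{ij}\} \approx \frac{\sigma_g^2}{\sigma_g^2 + \sigma_w^2}
```

Under this toy model, the correlation approaches one whenever the common channel fluctuations dominate the per-channel noise, whereas for legitimate signals, which do not share a common term $g[k]$, it is expected to remain close to zero.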

III. PRELIMINARY ANALYSIS
Through the assessment of the smartphone vulnerability to simplistic spoofing attacks, the following analysis aims at identifying the set of observables, among the available GNSS raw measurements, that are suitable for the design of the proposed spoofing detection method.

A. EXPERIMENTAL SETUP AND TEST PROCEDURE
To perform the vulnerability analysis, we developed a low-cost portable spoofer based on a Great Scott Gadgets™ HackRF One™ platform [39] and a Raspberry™ Pi 4B.
A high-level diagram of the system is shown in Figure 1.
3) Digital-to-analogue conversion. The transmitting module of the front-end (HackRF One™) is in charge of performing the digital-to-analogue conversion by mixing the baseband signal provided at step 2 with the carrier frequency generated through the VCO (i.e., GPS L1 C/A), thus transmitting I/Q-modulated GNSS signals in the L1/CA band. A block diagram is provided on the right side of Figure 1.
application, which provides the GNSS standalone position of the smartphone in standard NMEA format [43].

2) TEST METHODOLOGY
Experiments on smartphones were carried out in a dedicated test campaign. Each test foresaw 600 s data collections for two complementary scenarios, in controlled environmental conditions. The range of the spoofer was kept at about 3 m and, in order to prevent any RFI disturbances beyond the range of the experimental setup, a 30 dB attenuator was applied at the coaxial cable to reduce the transmitted signal power and limit the spoofer coverage.
Table 2 for Scenarios 1 and 2. During the spoofing attacks, the set of counterfeit signals was generated to force the GNSS receiver to estimate a fake PVT solution. Counterfeit and real satellite signals are distinguished in Table 2. As can be seen, such a set includes i) satellite signals that would be broadcast by satellites that are not actually visible to the receivers at the time and location of the tests, and ii) satellite signals that are already being tracked by the receivers, for which the spoofing signal has to replace the real signals under tracking. The overall skyplots of the visible satellites and of the counterfeit constellation generated during the tests are shown in Figure 2a and Figure 2b, respectively. The skyplots depict azimuth and elevation angles of the satellites w.r.t. the user location. For static users, they highlight the difference between the observed scenarios in terms of relative geometry of the satellites w.r.t. the receiver location.
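To give a sense of the link budget implied by a 3 m range and a 30 dB attenuator, the following minimal sketch evaluates the free-space path loss at the GPS L1 frequency. The transmit power used in the usage note and the neglect of antenna gains and cable losses are assumptions of this sketch, not parameters reported for the test setup.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s
F_L1 = 1575.42e6         # GPS L1 C/A carrier frequency, Hz


def fspl_db(distance_m, freq_hz=F_L1):
    """Free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C_LIGHT)


def received_power_dbm(tx_power_dbm, distance_m, attenuator_db=30.0):
    """Received power, ignoring antenna gains and cable losses (assumption)."""
    return tx_power_dbm - attenuator_db - fspl_db(distance_m)
```

For instance, `fspl_db(3.0)` evaluates to roughly 45.9 dB, so a hypothetical 0 dBm transmitter followed by the 30 dB attenuator would deliver about -76 dBm at 3 m under these simplifying assumptions. Every doubling of the distance adds about 6 dB of loss, which is what confines the coverage of the attack.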

3) RAW GNSS MEASUREMENTS OF INTEREST
Among all the available raw GNSS measurements, the data fields of interest for the investigations pursued within this study are: the associated measurements. In fact, strong received signals return high $C/N_0$ values, typically leading to better signal tracking and PVT determination. Abrupt variations of this quantity can indicate the presence of interference, while an unnaturally high value could also indicate the presence of a counterfeit satellite signal. $C/N_0$ is estimated by each tracking channel independently; therefore, data are available for each received signal.
• Pseudorange and pseudorange rate (PrM): The pseudorange is a measurement of the distance between the user and the satellite, affected by the offset of the receiver clock w.r.t. the satellite timescale. The observation of the behavior of the pseudorange and of its variation over time (i.e., the pseudorange rate) allows inspecting the effect of the spoofing signals (when they are tracked by the receiver) and explaining any impact of the interference on the subsequent PVT solution.
For the sake of completeness, the output position estimates from the computed PVT solutions have also been investigated to provide evidence of the vulnerability of the devices under test to the simplistic spoofing attack. It is worth recalling that misleading PVT solutions are indeed the usual objective of such deliberate malicious actions.

B. SPOOFING EFFECTS ON SMARTPHONES RAW MEASUREMENTS
In this section, we analyze the effects of the designed spoofing test on the raw GNSS measurements of interest as well as on the position estimates provided by the devices under the test Scenarios 1 and 2 described in Section III-A. The effects of the spoofing signals are hereafter reported by means of measurement time series. The discussion follows the signal processing flow of a conventional GNSS receiver architecture, i.e., from the AGC to the PVT computation.

1) AUTOMATIC GAIN CONTROL (AGC)
Figure 3 plots the AGC value (in dB) observed under the test Scenarios 1 and 2 for the devices under test. From test Scenario 1 (Figure 3a), it can be seen that the effect of turning on the spoofer is similar to what in-band jamming or interference would cause. Due to the presence of powerful in-band signals, the receiver reduces the amplification of the incoming signals. By collaterally attenuating the legitimate GNSS signals, it creates the conditions for the acquisition of counterfeit signals. In the same figure, the AGC amplification dramatically drops from 48 to 45 dB for the Redmi 8 Pro smartphone, and down to about 40 dB for the Redmi 8, once the spoofing is turned on at time t = 150 s. In Scenario 2, since the spoofing signals are broadcast with the same power level as in Scenario 1, the initial values of the AGC amplification are similar to those observed during the spoofing period in Scenario 1. When the spoofing ends at time t = 350 s during test Scenario 2, the AGC increases back to its initial level, as can be seen in Figure 3b.
The jump in the AGC value for the Redmi 8 device could indeed be due to a loss of lock on authentic signals and subsequent reacquisition and relock on spoofing signals. The strength and persistence of the spoofing signal could also be a factor in determining whether there is a gap in the measurement output or not. If the spoofing signal is strong and persistent enough, it could cause the GNSS receiver to lose lock for a longer period, resulting in a gap in the GNSS output. On the other hand, if the spoofing signal is weaker or less persistent, the GNSS receiver may be able to maintain a lock on authentic signals and produce continuous output, even in the presence of the spoofing signal. Hardware and software differences in the GNSS receiver could also play a role in determining the response to spoofing signals. Different receivers may have different sensitivities, filtering capabilities, or other features that affect their resilience to spoofing attacks.
This variation brings evidence of the presence of the spoofing signal, and detection techniques based on the observation of the AGC level have indeed been proposed [33], [45], [48].
However, by itself, the AGC variation is not sufficient to declare the presence of a spoofing signal; it only allows for raising an alert. Unconventional AGC behaviors may indeed be caused by jamming attacks; therefore, the AGC variation has to be cross-checked together with the spectral distortion of the input signal, the $C/N_0$ value of each channel [49], or further independent metrics.
Furthermore, received signals from low-elevation satellites may be degraded by multipath due to the presence of buildings and other obstacles. When the spoofer is turned on at time t = 150 s, it can be seen in Figure 4a that it acts as generic RFI over the L1 frequency band by disturbing the reception of the legitimate signals up to their loss of lock. In parallel, the attack forces the GNSS receiver to acquire and lock on the spoofing signals. However, concerning the counterfeit signals, their received power is higher, thus leading to a rise of the $C/N_0$ value. The most relevant observation is the similarity of the behavior over time of the $C/N_0$ where, despite a variability in the range 35-45 dB-Hz, there is a remarkable common trend in time. This can be explained considering that the spoofer is emulating a satellite scenario, generating signals with different power levels in order to mimic the different distances of the satellites. However, at the same time, all the signals are generated through the same transmitting hardware and are subject to the very same propagation conditions, as discussed through the signal modeling in Section II. Similar remarks can be made by observing Figure 4b for Scenario 2. It also shows similar trends during the spoofing period and a larger variability and diversity of such trends after the spoofer is turned off. These observations suggest that the correlation between the raw measurements could be evidence of the spoofing presence, and it is the basis of the spoofing detection technique presented and discussed in Section IV.
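The common trend described above can be reproduced with synthetic data. The following sketch, which is not the authors' implementation, computes the mean pairwise correlation of the $C/N_0$ series over sliding windows and flags windows exceeding a threshold. The synthetic data model, the window length, the threshold of 0.8 and all function names are assumptions made for illustration.

```python
import numpy as np


def synthetic_cn0(n_samples, n_sats, spoofed, rng):
    """Toy C/N0 series in dB-Hz (illustrative model, not measured data)."""
    base = rng.uniform(35.0, 45.0, size=n_sats)  # per-satellite mean level
    if spoofed:
        # All counterfeit signals share one channel: common fluctuation
        # plus small independent estimation noise per tracking channel.
        common = rng.normal(0.0, 2.0, size=(n_samples, 1))
        noise = rng.normal(0.0, 0.3, size=(n_samples, n_sats))
        return base + common + noise
    # Legitimate signals: independent fluctuations per satellite.
    return base + rng.normal(0.0, 2.0, size=(n_samples, n_sats))


def mean_pairwise_correlation(cn0_window):
    """Average Pearson correlation over all satellite pairs.

    cn0_window has shape (n_samples, n_satellites).
    """
    corr = np.corrcoef(cn0_window, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)  # each pair counted once
    return float(np.mean(corr[upper]))


def detect_spoofing(cn0, window=20, threshold=0.8):
    """Return one boolean flag per sliding-window position."""
    n_windows = cn0.shape[0] - window + 1
    flags = [
        mean_pairwise_correlation(cn0[start:start + window]) > threshold
        for start in range(n_windows)
    ]
    return np.array(flags)
```

Since the correlation coefficient is normalized, the threshold in this sketch does not depend on the absolute $C/N_0$ level of each device, which is the property that makes this kind of metric attractive compared with a plain $C/N_0$ threshold.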
Figure 5 compares the pseudorange values of all PRNs during the entire test period for the Xiaomi Redmi 8 smartphone. When the simplistic spoofer is turned on at time t = 150 s, the spoofed signals are tracked and the pseudorange values are altered accordingly. Some weaker real signals suffer from the in-band jamming effect and are no longer tracked, so that their pseudoranges are not provided during the spoofing period, as can be seen in Figure 6. As for the common PRNs, the higher power of the spoofed signals forces the receiver to lose lock and relock onto the new signal, as shown by the jump in the pseudorange value. A dual effect can be noticed in Figure 7 when the spoofer is turned off. An interesting finding is that when the signal switches from real to spoofed, there is a large jump in the pseudoranges that cannot be attributed to the spoofed location. Indeed, the counterfeit location set in these tests would not justify such a large variation. The reason is the different user clock bias estimated in the presence of spoofing. The observed anomalous magnitude of the pseudorange measurements is attributed to an altered estimate of the signal's time of flight due to the outdated TOW carried by the spoofing signals. In the experiments, the estimation of the pseudorange measurements is based on such a TOW and on the local time at the receiver. When spoofing occurs, the local time is not shifted according to the TOW of the spoofed signals, and the resulting time of flight becomes higher than expected in nominal conditions.
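The effect of an outdated TOW on the pseudorange can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which the pseudorange is the difference between the local reception time and the transmit time decoded from the TOW, scaled by the speed of light. The function name and all numeric values are hypothetical and are not taken from the experiments.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def pseudorange_m(rx_time_s: float, tx_tow_s: float) -> float:
    """Simplified pseudorange: apparent time of flight times the speed of light."""
    return (rx_time_s - tx_tow_s) * SPEED_OF_LIGHT_M_S


# Hypothetical values, all expressed in seconds of the GPS week.
rx_time_s = 345_600.075    # local receiver time at signal reception
legit_tow_s = 345_600.000  # transmit time decoded from a legitimate signal
spoof_tow_s = 345_599.000  # outdated transmit time carried by a spoofed signal

legit_pr_m = pseudorange_m(rx_time_s, legit_tow_s)
spoof_pr_m = pseudorange_m(rx_time_s, spoof_tow_s)
jump_m = spoof_pr_m - legit_pr_m

print(f"legitimate pseudorange: {legit_pr_m / 1e3:.0f} km")
print(f"spoofed pseudorange:    {spoof_pr_m / 1e3:.0f} km")
print(f"jump:                   {jump_m / 1e3:.0f} km")
```

Because the local time is not realigned to the spoofed TOW, the jump in this sketch depends only on the TOW offset and not on the counterfeit location, which is consistent with the behavior described above.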

4) POSITION ESTIMATION
As a cross-check of the spoofing vulnerability, the NMEA stream showed that both the time and the location of all the smartphones under investigation were successfully spoofed. Figure 8a and Figure 8b represent the shifts in latitude, longitude and altitude reported in the NMEA log files in test Scenarios 1 and 2, respectively, for the Xiaomi Redmi 8 device. The vertical dotted lines in Figure 8 delimit the timespan corresponding to the spoofing period. It can be seen in Figure 9 that two different positions were estimated during the test. The solution shifts from the real to the fake location a few seconds after the spoofing starts (Scenario 1), and vice versa after it ends (Scenario 2). In test Scenario 2, Figure 8b shows a discontinuity of operation between t = 350 s and t = 450 s in the estimated latitude, longitude and altitude coordinates. This is due to the misalignment between the real and simulated timescales and also depends on the outdated GPS ephemeris employed for the spoofing signal generation.

IV. METHODOLOGY FOR SPOOFING DETECTION
One of the main observations of the previously described test campaign is that spoofing induces significant variations in the estimated $C/N_{0}$ and PrM data, with these time series also showing a similar trend over time for different satellite signals [50]. Based on the measurement model described in Section II-B along with these observations, we introduce a methodology for the analysis of possible correlations between the raw data time series obtained for different tracked signals and observed within a common time window, $T_n$. In the following analysis, we focus on the $C/N_{0}$ time series as the target data. In fact, the proposed methodology disregards the physical meaning of the measurements, treating the observed data as time series of noisy raw GNSS measurements that are considered realizations of non-stationary stochastic processes.

A. SPOOFING DETECTION STRATEGY THROUGH PEARSON CORRELATION COEFFICIENTS
In order to provide a quantitative analysis of the similarity between data series along the experiment time, the pairwise cross-correlation function between two time series $X_A$ and $X_B$ can be computed as

$$R_{X_A X_B}(t_1, t_2) = E\left[X_A(t_1)\, X_B^{*}(t_2)\right] \tag{4}$$

where $E(\cdot)$ represents the mean operator, $t_1$ and $t_2$ identify two generic time instants, and $(\cdot)^{*}$ indicates the complex conjugate of the argument. By subtracting the respective mean from each series in (4), we obtain the cross-covariance function

$$C_{X_A X_B}(t_1, t_2) = E\left[\left(X_A(t_1) - \mu_{X_A}(t_1)\right)\left(X_B(t_2) - \mu_{X_B}(t_2)\right)^{*}\right] \tag{5}$$

In the proposed applications, short observation windows are expected to be monitored; thus, the long-term trends that typically characterize the investigated quantities can be neglected without loss of validity. In light of this, wide-sense stationarity of the processes can be assumed and (5) is modified as

$$C_{X_A X_B}(\tau) = E\left[\left(X_A(t) - \mu_{X_A}\right)\left(X_B(t - \tau) - \mu_{X_B}\right)^{*}\right] \tag{6}$$

where $\tau = t_1 - t_2$, which expresses that the cross-covariance does not depend on the specific choice of $t_1$ and $t_2$, i.e., it is the cross-covariance of Wide-Sense Stationary (WSS) processes. In order to obtain a scale-free metric of the correlation between the time series, a normalization is introduced in (6), by defining

$$\rho_{X_A X_B}(\tau) = \frac{C_{X_A X_B}(\tau)}{\sigma_{X_A}\,\sigma_{X_B}} \tag{7}$$

which is known as the Pearson correlation function, where $\sigma_{X_A}$ and $\sigma_{X_B}$ are the standard deviations of the two processes. The maximum value assumed by (7) corresponds to the well-known Pearson correlation coefficient [51], [52], and is computed as

$$\rho_{X_A, X_B} = \max_{\tau}\, \rho_{X_A X_B}(\tau) \tag{8}$$

In case $X_A$ and $X_B$ are identical time series (i.e., they are of the same length and assume the same values), the maximum of (7) is located at $\tau = 0$. Furthermore, the Pearson correlation coefficient assumes values in the range $[-1, 1]$: $\rho_{X_A, X_B} = 1$ highlights a perfect positive linear relationship, $\rho_{X_A, X_B} = -1$ denotes a perfect negative linear relationship, and $\rho_{X_A, X_B} = 0$ indicates the absence of a linear relationship between the random variables. The Pearson correlation coefficient also relates to the slope of the linear regression between the time series. Therefore, it is exploited in the proposed solution to track the temporal and spatial correlation that characterizes the spoofing signals.
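The Pearson correlation function and the extraction of its maximum can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the series are real-valued, of equal length and not constant, time averages replace the statistical means, and each lag is normalized by the energy of the full series. The function name and the sample values are hypothetical.

```python
import math


def pearson_correlation_function(x_a, x_b, max_lag):
    """Normalized cross-covariance of two equal-length, non-constant series
    for integer lags in [-max_lag, max_lag]."""
    n = len(x_a)
    mean_a = sum(x_a) / n
    mean_b = sum(x_b) / n
    dev_a = [v - mean_a for v in x_a]
    dev_b = [v - mean_b for v in x_b]
    # Product of the (unnormalized) standard deviations of the two series.
    norm = math.sqrt(sum(d * d for d in dev_a) * sum(d * d for d in dev_b))
    rho = {}
    for lag in range(-max_lag, max_lag + 1):
        # Pair x_a[t] with x_b[t - lag] wherever both indices are valid.
        start, stop = max(0, lag), min(n, n + lag)
        acc = sum(dev_a[t] * dev_b[t - lag] for t in range(start, stop))
        rho[lag] = acc / norm
    return rho


# Hypothetical C/N0 samples in dB-Hz; the same series is used twice.
cn0 = [40.0, 41.5, 43.0, 42.0, 44.5, 43.5, 45.0]
rho = pearson_correlation_function(cn0, cn0, max_lag=3)
peak_lag = max(rho, key=rho.get)
print(peak_lag, round(rho[peak_lag], 6))
```

For two identical series the peak is found at zero lag with a value of one, in agreement with the remark above.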

B. IMPLEMENTATION OF PEARSON COEFFICIENTS ESTIMATION
In the absence of repeatability conditions, multiple realizations of the random processes of interest are not available in the target application. Therefore, the means and standard deviations appearing in (6) and (7) must be estimated from the data. By further assuming ergodicity of the observed data, time averages can be used in place of the statistical means $\mu_{X_A}$ and $\mu_{X_B}$, so that $C_{X_A X_B}(\tau)$ can be estimated through (6). Similarly, standard deviations computed over time can be used in place of their statistical counterparts. The proposed implementation acts on pairs of input time series by a) computing their time averages, b) subtracting each average from the corresponding series, c) performing the discrete cross-correlation, d) normalizing the result by the product of the standard deviations, and e) extracting its maximum, i.e., the Pearson correlation coefficient. An estimate of (8) is hence provided through the approximated Pearson correlation coefficient

$$\hat{\rho}_{ab} = \frac{\sum_{t=1}^{T_n}\left(a_t - \bar{a}\right)\left(b_t - \bar{b}\right)}{\sqrt{\sum_{t=1}^{T_n}\left(a_t - \bar{a}\right)^{2}}\;\sqrt{\sum_{t=1}^{T_n}\left(b_t - \bar{b}\right)^{2}}} \tag{9}$$

where $T_n$ is the window size, $a_t$ and $b_t$ are the time series samples observed at the $t$-th instant, and $\bar{a}$ and $\bar{b}$ are the sample means. The proposed approximated correlation coefficient, $\hat{\rho}_{ab}$, can be computed for all the available pairs to verify their pairwise correlation. It is worth remarking that the estimation accuracy of $\hat{\rho}_{ab}$ can depend on the size of the observation window, $T_n$. Short windows may lead to misleading correlation information, whereas long windows introduce a considerable latency for the collection of the data samples. A $K \times K$ symmetric correlation matrix is eventually populated with the correlation coefficients estimated through (9) for each pair of tracked GNSS signals

$$\begin{bmatrix} \hat{\rho}_{11} & \hat{\rho}_{12} & \cdots & \hat{\rho}_{1K} \\ \hat{\rho}_{21} & \hat{\rho}_{22} & \cdots & \hat{\rho}_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\rho}_{K1} & \hat{\rho}_{K2} & \cdots & \hat{\rho}_{KK} \end{bmatrix} \tag{10}$$

and aggregated metrics can be used to build a decision logic over the whole set of tracked signals, such as the average cross-correlation coefficient computed on the lower triangular matrix

$$\mu_{\rho}^{(K)} = \frac{2}{K(K-1)} \sum_{a=2}^{K} \sum_{b=1}^{a-1} \hat{\rho}_{ab} \tag{11}$$

where $a$ and $b$ define the row and column indices, respectively.
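The estimation steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and the sample window are hypothetical, the zero-lag sample coefficient is used as the estimate, undefined coefficients (constant series) are mapped to zero, and the average is assumed to run over the strictly lower triangle, i.e., without the unit diagonal.

```python
import math


def pearson(a, b):
    """Zero-lag sample Pearson coefficient; 0.0 when it is undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def correlation_matrix(series_by_prn):
    """K x K symmetric matrix of pairwise coefficients, ordered by PRN."""
    prns = sorted(series_by_prn)
    matrix = [[pearson(series_by_prn[p], series_by_prn[q]) for q in prns]
              for p in prns]
    return prns, matrix


def mean_lower_triangle(matrix):
    """Average of the entries below the main diagonal (requires K >= 2)."""
    k = len(matrix)
    values = [matrix[a][b] for a in range(1, k) for b in range(a)]
    return sum(values) / len(values)


# Hypothetical spoofed window: one common trend, different power offsets.
trend = [0.0, 1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
window = {1: [40.0 + v for v in trend],
          3: [38.0 + v for v in trend],
          8: [44.0 + v for v in trend]}

prns, matrix = correlation_matrix(window)
mu_rho = mean_lower_triangle(matrix)
print(prns, round(mu_rho, 6))
```

In this idealized window all series share exactly the same trend, so every pairwise coefficient and their average are equal to one up to rounding; real measurements would add receiver noise on top of the common trend.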

C. DECISION LOGIC
The proposed decision logic for the detection of simplistic spoofing is hence based on a binary decision rule. Two hypotheses are tested in the context of a classical Neyman-Pearson decision problem, as depicted in Figure 10:
• $H_0$: only legitimate GNSS signals are received and tracked. Under this hypothesis, $\mu_{\rho}$ is typically low due to a poor cross-correlation among the $C/N_{0}$ time series, and it is distributed according to a given PDF, $f_0(r)$.
• $H_1$: legitimate and spoofing GNSS signals are concurrently received, and spoofing signals are tracked in place of the legitimate ones. Under this hypothesis, $\mu_{\rho}$ is expected to be closer to 1 the more spoofed satellites are tracked by the receiver, and it is distributed according to a PDF $f_1(r)$.

Analytic expressions for $f_0(r)$ and $f_1(r)$ can be approximated through the series

$$f(r) = \frac{2^{n-3}}{\pi\,(n-3)!}\left(1-\rho^{2}\right)^{\frac{n-1}{2}}\left(1-r^{2}\right)^{\frac{n-4}{2}}\sum_{k=0}^{\infty}\left[\Gamma\!\left(\frac{n+k-1}{2}\right)\right]^{2}\frac{(2\rho r)^{k}}{k!} \tag{12}$$

where $r$ is a variable defined in the range $(-1, 1)$, $\Gamma(\cdot)$ is the Gamma function, $n$ is the number of Pearson's correlation samples in a given experiment, and $\rho$ is the known level of correlation [53]. The PDF described in (12) is a skewed distribution whose skewness increases with $\rho$. However, according to the Central Limit Theorem, (12) can be transformed through the Fisher transformation

$$z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right) = \operatorname{artanh}(r) \tag{13}$$

such that the distribution of $z$ approaches a normal distribution as $n$ increases, with standard deviation

$$\sigma_z = \frac{1}{\sqrt{n-3}}$$

This step is depicted by the plots of Figure 10. A single correlation coefficient defines the distribution of Figure 10a, while averaging multiple coefficients shifts the decision problem to the Gaussian-like distributions defined through (13) and centered at the average correlation coefficient, as in Figure 10b. In order to establish a decision threshold, $\gamma$, we express the probability of a false alarm as

$$P_{fa} = \int_{\gamma'}^{\infty} f_0(z)\,dz = \alpha$$

where $f_0(z)$ is the transformed PDF according to (13), $\gamma'$ is the threshold in the transformed domain, and $\alpha$ is the design parameter for the decision logic. The threshold $\gamma'$ is computed by fixing the probability of a false alarm, $\alpha$, and inverting this relation through the Gaussian Q-function, $Q(\cdot)$; $\gamma'$ must then be reverted to $\gamma$ by inverting (13).
As an example, by fixing a probability of false alarm of 1.5 % through $\alpha = 0.015$, we obtain $\gamma' \simeq 2.17$. This value corresponds to a threshold $\gamma \simeq 0.5$ for the original PDF (12) of Figure 10b. $\gamma \simeq 0.5$ is hence the value used in the experimental validation of our technique to establish the detection. This threshold is also cross-validated by means of experimental datasets in Section V.
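The numerical step of this example can be reproduced with a short sketch. It assumes that the threshold in the transformed domain is obtained by inverting the tail probability of a standard Gaussian distribution, which yields the value of about 2.17 quoted above. The mapping back to the correlation domain additionally assumes a zero-mean $H_0$ distribution with standard deviation $1/\sqrt{n-3}$; the value n = 19 is purely illustrative and is not taken from the text, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def gaussian_q_inverse(alpha: float) -> float:
    """Inverse of the standard Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - alpha)


def fisher_z(r: float) -> float:
    """Fisher transformation of a correlation value r in (-1, 1)."""
    return math.atanh(r)


def inverse_fisher_z(z: float) -> float:
    """Inverse Fisher transformation, mapping z back to (-1, 1)."""
    return math.tanh(z)


alpha = 0.015                              # design false alarm probability
gamma_prime = gaussian_q_inverse(alpha)    # normalized threshold
print(round(gamma_prime, 2))               # 2.17

# Assumption: H0 is centered at zero correlation and n = 19 (illustrative).
n = 19
sigma_z = 1.0 / math.sqrt(n - 3)
gamma = inverse_fisher_z(fisher_z(0.0) + sigma_z * gamma_prime)
print(round(gamma, 3))
```

The resulting threshold in the correlation domain depends strongly on the assumed n and on the assumed mean under $H_0$, which is one reason why an empirical cross-validation of the threshold is useful.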

1) IMPLEMENTATION OF THE DECISION LOGIC
A block diagram for the implementation of the proposed decision logic is provided in Figure 11. The algorithm aims at:
1) determining the correlation threshold, $\gamma$, under the $H_0$ and $H_1$ hypotheses; $\gamma$ can be estimated by fixing the false alarm probability;
2) comparing the current mean correlation coefficient $\mu_{\rho}^{(K)}$, estimated through real-time data over a window of $T_W$ s, with the threshold $\gamma$;
3) deciding for spoofing or non-spoofing conditions within the observed time window by accepting or rejecting $H_1$ according to the Neyman-Pearson criterion.

For a large set of training datasets, the decision threshold $\gamma$ can be heuristically selected by identifying the average correlation coefficients observed under the aforementioned hypotheses. Figure 11 highlights both theoretical and empirical threshold estimation on the right side of the diagram.
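The comparison and decision steps can be sketched as a sliding-window loop. This is a minimal illustration under stated assumptions rather than the implementation of Figure 11: the threshold is fixed to 0.5 as in the example of Section IV-C, all tracked signals provide one sample per epoch with no gaps, undefined coefficients are mapped to zero, and the function names and sample values are hypothetical.

```python
import math


def pearson(a, b):
    """Zero-lag sample Pearson coefficient; 0.0 when it is undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def sliding_decisions(cn0_by_prn, window_len, gamma):
    """Return (window end index, mean pairwise correlation, spoofing flag)
    for every full window; requires at least two tracked signals."""
    series = list(cn0_by_prn.values())
    length = min(len(s) for s in series)
    decisions = []
    for end in range(window_len, length + 1):
        window = [s[end - window_len:end] for s in series]
        coeffs = [pearson(window[a], window[b])
                  for a in range(1, len(window)) for b in range(a)]
        mu_rho = sum(coeffs) / len(coeffs)
        decisions.append((end, mu_rho, mu_rho > gamma))
    return decisions


# Hypothetical C/N0 data: six nominal epochs followed by six spoofed epochs.
trend = [0.0, 2.0, 1.0, 3.0, 2.0, 4.0]
cn0 = {
    1: [40.0, 41.0, 40.0, 39.0, 40.0, 41.0] + [45.0 + v for v in trend],
    3: [38.0, 37.0, 38.0, 39.0, 38.0, 37.0] + [43.0 + v for v in trend],
    8: [44.0, 44.0, 45.0, 44.0, 43.0, 44.0] + [47.0 + v for v in trend],
}

for end, mu_rho, spoofed in sliding_decisions(cn0, window_len=6, gamma=0.5):
    print(end, round(mu_rho, 2), spoofed)
```

In this toy dataset the first window contains only nominal samples and stays below the threshold, while the last window contains only spoofed samples sharing a common trend and exceeds it; the intermediate windows mix both conditions and illustrate the detection latency introduced by the window length.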

V. RESULTS
The performance of the proposed spoofing detection algorithm was assessed through the analysis of the pairwise cross-correlation between $C/N_{0}$ time series for each dataset. For the sake of clarity, in the following, GPS L1/CA signals are referred to by the corresponding PRN code.

A. LINEAR CORRELATION OF $C/N_{0}$ TIME SERIES
As an example, the cross-correlation results of the $C/N_{0}$ values of different PRNs are shown in Figure 12, considering an observation window of 50 s. In detail, Figure 12a shows the $C/N_{0}$ values of GPS PRN 1 and PRN 3 for the Xiaomi Redmi 8 during the spoofing (top) and non-spoofing (bottom) periods. As one can see, the $C/N_{0}$ values of PRN 1 and PRN 3 lie in the range 35-55 dB-Hz within the spoofing period. A remarkable difference is instead visible in non-spoofing conditions, in which $C/N_{0}$ assumes values in the range 20-40 dB-Hz with a slowly decreasing trend and sporadic discontinuities at low $C/N_{0}$ values. Their correlation in both spoofed and non-spoofed cases was verified by plotting the linear regression between the time series and evaluating their approximated Pearson correlation coefficients (9).
A higher linear correlation is evident in the upper plot, for which the spoofing action induced a correlation value of $\hat{\rho}_{1,3} = 0.99$. Data discontinuities and dissimilar trends of the non-spoofed $C/N_{0}$ time series yield instead a poor correlation of $\hat{\rho}_{1,3} = -0.76$. Here, zero values indicate undefined numeric results. The comparison of the two observation windows returns unambiguous classifications for the spoofing and non-spoofing periods. Similar outcomes can be observed in Figure 12b and Figure 12c, in which 50 s observation windows return a strong linear correlation of the spoofed time series, with correlation coefficients $\hat{\rho}_{1,8} = 0.98$ and $\hat{\rho}_{8,21} = 0.99$, respectively. Non-spoofing observations return a poor correlation coefficient in Figure 12b, while a larger value is visible in Figure 12c. In the latter, the linear correlation appears stronger for higher values of $C/N_{0}$, but the overall correlation coefficient is still remarkably lower than in the spoofing period. It is worth remarking that single pairwise observations cannot be considered reliable inputs for the decision logic described in Section IV, and aggregated metrics, i.e., the average cross-correlation coefficient defined in (11), must be used instead.

B. CORRELATION MATRIX AND AVERAGE PEARSON COEFFICIENTS
The patterns of the correlation matrices of 18 datasets have been evaluated to better understand the behaviour of the proposed indices. Figure 13 illustrates the Pearson correlation coefficients for the sample datasets D1, D2 and D8, showing each element of the matrix in (10) obtained from the $C/N_{0}$ measurements associated with each PRN. The numerical values labeling the rows and the columns of the matrix correspond to the PRNs. Results are presented as a heatmap based on a color code. To highlight the overall correlation increment due to spoofing attacks, we further defined an average Pearson correlation increment. A summary of the results of all the experiments is given in Table 3. It can be seen that the Pearson correlation increment varies depending on the dataset but is always experimentally verified during the spoofing period for all the devices under test. As an example, we discuss hereafter the results of three representative datasets. As can be observed for the D1 dataset in Figure 13b, one of the highest correlations corresponds to the pair PRN 1 and PRN 3 ($\hat{\rho}_{1,3} = 0.99$). Similarly, PRN 1 and PRN 3 were highly correlated in the D2 and D8 datasets, as shown in Figure 13d ($\hat{\rho}_{1,3} = 0.98$) and Figure 13f. These findings are consistent with Table 3, which summarizes the remarkable difference between the correlation coefficients under spoofed and non-spoofed periods. The complete analysis of all the datasets confirms that a significant increment in the correlation between the time series can be observed for all the pairs of PRNs. In light of this, the mean of the Pearson coefficients defined in (11) is a suitable metric for the detection of a single-antenna spoofing attack based exclusively on the observation of raw GNSS measurements.

1) OBSERVATION WINDOW LENGTH AND DETECTION LATENCY
The results presented in Section V-A have been obtained for a pre-defined length of the observation window, i.e., $T_n = 50$ s. In order to identify a minimum latency for the detection of a possible spoofing attack through the proposed technique, the average Pearson correlation coefficients, $\mu_{\rho}^{(K)}$, have been evaluated for different window lengths in the range of 5 s to 400 s. Figure 14 shows the behaviour of $\mu_{\rho}^{(K)}$ and of the estimated threshold, $\gamma$, for varying lengths of the observation window and for all the devices under test. In the interval between 5 s and 50 s, the coefficients are reported with a step of 5 s. As can be observed, the observation window can be both shortened and extended without any remarkable impact on the performance of the proposed method. Shorter windows reduce the latency of the decision logic as well as the data buffering requirement, but the output coefficients may become unreliable, as in the case of the Samsung A32 in Scenario 2 (see Figure 14b). Increasing $T_n$ leads of course to an increased latency and makes the estimate more prone to the variability of the environment within the timespan. These remarks justify the observation window of $T_n = 50$ s adopted in this work, which, according to this analysis, is a valuable and safe trade-off between latency and reliable correlation coefficients.

VI. CONCLUSION
In this paper, starting from the analysis of the effects that single-antenna, simplistic spoofing has on the GNSS receivers embedded in a variety of Android TM smartphones, a spoofing detection technique based on the processing of raw measurements was proposed. The most relevant observation is that raw GNSS measurements, i.e., carrier-to-noise ratio $C/N_{0}$ and pseudorange measurements, show a considerable correlation of the output time series. The estimation of the $C/N_{0}$ for spoofed signals is indeed sensitive to the spatial and temporal correlation introduced by the spoofer's transmission of multiple signals over a single propagation channel. Such a peculiar feature of single-antenna spoofing attacks constitutes a considerable difference with respect to the received legitimate GNSS signals. It has been shown that estimating an average Pearson correlation coefficient over all the PRN pairs provides a suitable metric for detecting the attack. Furthermore, the analysis of such coefficients for the different devices under test showed that the performance does not depend on the target device or on the observation window of the $C/N_{0}$ time series. Since the inputs needed by the proposed method are time series of typical raw GNSS measurements, such as $C/N_{0}$ values, future works will investigate the applicability of the proposed technique to other classes of GNSS devices, exploring different conditions of the attacks.