Jamming Detection and Classification in OFDM-Based UAVs via Feature- and Spectrogram-Tailored Machine Learning

In this paper, a machine learning (ML) approach is proposed to detect and classify jamming attacks against orthogonal frequency division multiplexing (OFDM) receivers with applications to unmanned aerial vehicles (UAVs). Using software-defined radio (SDR), four types of jamming attacks, namely barrage, protocol-aware, single-tone, and successive-pulse, are launched and investigated. Each type is qualitatively evaluated considering jamming range, launch complexity, and attack severity. Then, a systematic testing procedure is established by placing an SDR in the vicinity of a UAV (i.e., drone) to extract radiometric features before and after a jamming attack is launched. Numeric features that include signal-to-noise ratio (SNR), energy threshold, and key OFDM parameters are used to develop a feature-based classification model via conventional ML algorithms. Furthermore, spectrogram images collected following the same testing procedure are exploited to build a spectrogram-based classification model via state-of-the-art deep learning algorithms (i.e., convolutional neural networks). The performance of both types of algorithms is analyzed quantitatively with metrics including detection and false-alarm rates. Results show that the spectrogram-based model classifies jamming with an accuracy of 99.79% and a false-alarm rate of 0.03%, in comparison to 92.20% and 1.35%, respectively, with the feature-based counterpart.

particular interest and is tackled with two approaches that enable attack detection and classification. Jamming mitigation, on the other hand, is outside the scope of this effort. Nonetheless, several methods have been reported in the literature, where the use of artificial intelligence (i.e., reinforcement learning) and path planning were proposed [14]-[18].

II. RELATED WORK
Cyberattacks on UAVs include data interception, data manipulation, and denial-of-service (i.e., jamming). Data interception/manipulation attacks are often mitigated with broadcast authentication [19]-[23] and secure location verification [24], [25]. The former applies cryptographic and non-cryptographic schemes, whereas the latter verifies the locations of UAVs with distance bounding, group verification, Kalman filtering, multilateration, and traffic modeling. Although these methods have shown promise in improving UAV security, the hardware and/or software they add to the existing protocols, as well as the required time-stamping adjustments, are major constraints that set back their ready acceptance in the foreseeable future. Also, these methods are inefficient for detecting jamming, where the UAV-controller communication is interrupted with interference to impose security threats and cease information exchange [26]-[28]. With readily available software-defined radio (SDR), attackers can easily launch this interference to disturb a UAV trajectory, potentially leading to collisions. Hence, developing affordable jamming detection techniques that also comply with the existing standards is of utmost importance. These techniques must provide a high detection rate and a low false-alarm rate. Furthermore, they should enable jamming classification so that the most suitable countermeasure routine can be selected, ensuring operational security through informed decisions.
In our previous work, the impacts of four jamming types on UAV security were analyzed qualitatively (i.e., range, complexity, severity) and quantitatively with conventional machine learning (ML) algorithms [29]. These algorithms were exploited for jamming detection/classification based on extracted signal features. In this work, deep learning models, i.e., four configurations of convolutional neural networks (CNNs), are adopted based on spectrogram images. The spectrogram-based approach improved the classification accuracy from 92.2% (i.e., feature-based approach) to 99.79% and reduced the false-alarm rate from 1.35% to 0.03%, as will be presented in greater detail in Sections IV and V. Finally, this work contributes an additional dataset (i.e., spectrogram images) for training and testing ML classifiers. This dataset and the spectrogram-based approach proposed herein were neither provided nor explored in [29]. Also, this work differs from other existing techniques in the following aspects:
1) In contrast to imposing modifications to the existing protocols [19]-[25], [30], readily available radiometric features and spectrogram images are used to develop ML models for detecting and classifying jamming.
2) In comparison to the simulation-based attack scenarios [31]-[39], this work utilizes SDR for launching jamming attacks, which facilitates detection and classification with realistic environments and training datasets.
3) Here, jamming detection/classification via deep learning models is introduced. These models are trained and tested with spectrograms that characterize the jamming spectrum. This approach outperforms its feature-based counterpart in classification accuracy.
4) The datasets that are collected and used to develop the feature- and spectrogram-based classification models (i.e., features, images) are made publicly available.
It is worth mentioning that ML was proposed for satellite communications, vehicular ad hoc networks (VANETs), 5G networks, Internet of Things (IoT), and UAVs with applications including jamming detection [40]-[44], object detection, trajectory optimization, swarm communication, situational awareness, and malicious attack mitigation [45]-[47].
The remainder of this paper is organized as follows: Section III describes the jamming types entailed in this work, the experimental setup, and attack scenarios. Section IV presents the feature-based conventional ML models for detecting/classifying jamming. Section V elaborates on the spectrogram-based deep learning models via CNNs. Finally, conclusions and future work are given in Section VI.

III. JAMMING ATTACKS AND EXPERIMENTAL SETUP
The attack scenario and experimental setup for four jamming types are presented herein. A Holy Stone HS720E drone is used for testing. This drone has a communication range of 1000 meters and a transmission power of 16 dBm. It also uses IEEE 802.11 orthogonal frequency division multiplexing (OFDM) at 2.4 GHz [48]. A B210 SDR from National Instruments and GNU Radio are exploited to launch attacks within a 40 MHz bandwidth to accommodate all subcarriers.
A. TYPES OF JAMMING ATTACKS
1) Barrage: In this type, noise drawn from a normal distribution is launched at the communication band to increase the interference level at the receiver (i.e., UAV). Therefore, barrage is often used when the transmission frequency is unknown to the jammer. Barrage jamming is simple to launch; however, its efficiency reduces as the transmission bandwidth increases.
2) Single-Tone: Here, a high-power interference is launched to interfere with the center frequency that the target uses for data exchange. This interference signal is generally denoted as $J(t) = A_j \cos(2\pi f_0 t + \theta_j)$, where $A_j$ is the jamming amplitude, $f_0$ is the center frequency, and $\theta_j$ is a phase shift.
3) Successive-Pulse: In this type, a pulse sequence is launched to interfere with the target's operation band, and is given as:  where $N_j$ is the number of jamming tones. The period $T$ is set such that a 312.5 kHz frequency spacing is realized between generated pulses (i.e., the subcarrier spacing in IEEE 802.11 OFDM).
4) Protocol-Aware: This type transmits low interference via shot-noise pulses to corrupt the ongoing transmissions while minimizing detection probability. In other words, the jammer simulates the transmitter of the targeted protocol without affecting other standards occupying the same bandwidth [49].
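As a minimal sketch of the two waveforms defined in closed form above (barrage and single-tone), the following Python fragment generates sampled versions and derives the pulse period implied by the 312.5 kHz spacing. The sample rate, the baseband tone offset `f0_hz`, and the function names are illustrative assumptions and do not reproduce the GNU Radio flow graphs used in this work.

```python
import math
import random

SAMPLE_RATE_HZ = 40e6             # assumed: chosen to match the 40 MHz attack bandwidth
SUBCARRIER_SPACING_HZ = 312.5e3   # IEEE 802.11 OFDM subcarrier spacing (from the text)


def barrage(num_samples, sigma=1.0, seed=0):
    """Barrage jamming: complex samples drawn from a zero-mean normal distribution."""
    rng = random.Random(seed)
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(num_samples)]


def single_tone(num_samples, amplitude, f0_hz, theta=0.0, fs_hz=SAMPLE_RATE_HZ):
    """Single-tone jamming: J(t) = A_j cos(2*pi*f_0*t + theta_j), sampled at fs_hz."""
    return [amplitude * math.cos(2.0 * math.pi * f0_hz * n / fs_hz + theta)
            for n in range(num_samples)]


def mean_power(samples):
    """Average power of a real or complex sample sequence."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


# A pulse train with period T has spectral lines spaced 1/T apart,
# so a 312.5 kHz spacing corresponds to T = 3.2 microseconds.
pulse_period_s = 1.0 / SUBCARRIER_SPACING_HZ

# Hypothetical parameters: 1 MHz baseband offset, amplitude 2, 4000 samples.
tone = single_tone(4000, amplitude=2.0, f0_hz=1e6)
noise = barrage(4000, sigma=1.0)
```

With these assumed values, the tone spans a whole number of periods, so its average power is $A_j^2/2$, while the barrage power is spread over the full sampled bandwidth.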

B. EXPERIMENTAL SETUP
Two experimental environments are established to evaluate the qualitative and quantitative impacts of the jamming types. The qualitative evaluation analyzes severity, launch complexity, and effective jamming range. The quantitative evaluation entails radiometric extractions (i.e., signal features, spectrogram images) through data collection under different jamming scenarios. The collected data are used for training and validating ML algorithms for jamming detection and classification.
1) Qualitative Evaluation: The separation between the jammer (i.e., B210 SDR) and the drone is fixed at 0.5 meter. To measure the effective jamming range, the separation between the jammer-drone pair and the transmitter is increased gradually for each jamming type in an unobstructed outdoor setup, as shown in Figure 1. Here, effective jamming is defined as a complete loss of signal and is reported in Table 1 for each type. Results indicate that barrage has the longest effective jamming range among all types because it spreads interference over all OFDM subcarriers, in comparison to interfering with the center (or selected) frequencies as in single-tone and successive-pulse jamming or transmitting shot-noise as in protocol-aware jamming. Table 2 depicts the qualitative findings for launch complexity and severity on a scale of 1 to 4, where 4 is the highest score. Barrage has the lowest launch complexity as it does not require extensive knowledge about the communication bandwidth. Nonetheless, it has the highest severity. Single-tone jamming is relatively simple to launch. Nevertheless, this type is inefficient in scenarios where multiple frequencies or subcarriers are used. Successive-pulse jamming with $N_j = 64$ pulses has a moderate launch complexity as interference pulses need careful positioning with respect to the center and subcarrier frequencies. The output power of the jammer, $P_j$, is distributed over the pulses such that the power of each interference pulse is $P_j/N_j$. Therefore, it has the least severity. Protocol-aware jamming has the highest launch complexity as it requires a thorough knowledge of the communication protocol. It also has a moderate severity since limited-power interference is launched at the transmission bandwidth to maintain low detection probability.
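To make the power-splitting argument for successive-pulse jamming concrete, the per-pulse power can be worked out for the stated number of pulses. The symbol $P_{\text{pulse}}$ is introduced here only for illustration.

```latex
P_{\text{pulse}} = \frac{P_j}{N_j} = \frac{P_j}{64},
\qquad
10\log_{10}\!\left(\frac{P_j}{P_{\text{pulse}}}\right) = 10\log_{10}(64) \approx 18.06\ \text{dB}
```

That is, each pulse carries roughly 18 dB less power than a single tone launched with the same total output power, which is consistent with the low severity score of this type.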
2) Quantitative Evaluation: Radiometric data (i.e., signal features, spectrogram images) are collected for ML training/classification. The goal here is to develop models that not only detect jamming, but also identify its type. To collect such data, the transmitter-drone separation is set to 350 meters, which is the minimum separation where all jamming types are effective. Then, without jamming presence, features and images are obtained at the drone with a B210 SDR and GNU Radio modules. The same procedure is repeated in the presence of each of the jamming types, where a second SDR is utilized as a jammer at eight locations $J_i$, $i = 1, 2, \ldots, 8$, around the drone, one at a time. This procedure is performed for radii $r = 0.5$, $1$, and $1.5$ meters as shown in Figure 2.
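The measurement grid described above can be sketched as follows. Since Figure 2 is not reproduced here, the equal 45-degree angular spacing of the eight locations and the coordinate convention (drone at the origin) are assumptions made only for illustration.

```python
import math

RADII_M = (0.5, 1.0, 1.5)   # radii from the text
NUM_LOCATIONS = 8           # J_1 ... J_8 from the text


def jammer_positions(radii=RADII_M, num_locations=NUM_LOCATIONS):
    """Return (label, radius, x, y) tuples with the drone at the origin.

    Equal angular spacing is an assumption; the text only states that
    eight locations around the drone are used for each radius.
    """
    positions = []
    for r in radii:
        for i in range(1, num_locations + 1):
            angle = 2.0 * math.pi * (i - 1) / num_locations
            positions.append((f"J{i}", r, r * math.cos(angle), r * math.sin(angle)))
    return positions


positions = jammer_positions()
```

Under these assumptions, each jamming type is recorded at 8 × 3 = 24 jammer placements, one at a time.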

IV. FEATURE-BASED CLASSIFICATION
As discussed in Section III, B210 SDRs and GNU Radio are used to launch different jamming attacks and extract radiometric data. Figures 3(a) and 3(b) show simplified GNU Radio flow graphs for launching the attacks and extracting features, respectively. Nine features are extracted to train ML algorithms for detecting and classifying jamming attacks. Of these features, four are specific to OFDM (i.e., subcarrier length, cyclic prefix (CP) length, subcarrier spacing, and symbol time). The subcarrier length represents the number of subcarriers being used. The CP length is utilized to control symbol overlapping, and the subcarrier spacing is the frequency separation between subcarriers, which is the reciprocal of symbol time [50]. The OFDM Estimator block, labeled (1) in Figure 3(b), is used to extract these features [51]. The Energy Detector block, labeled (2), is used to extract the average received power and threshold [51]. The threshold is a binary indicator that returns 1 once the average received power exceeds a certain level and returns 0 otherwise. Finally, three more features, namely signal-to-noise ratio (SNR), average signal power, and average noise power, are extracted from the SNR Estimator Probe block, labeled (3). It is important to note that the average received power in (2) includes noise energy, whereas the average signal power in (3) represents the estimated signal power excluding noise power. At the end of the experiment featured in Figure 2, a total of 23,565 signal samples are collected. Of these samples, 10,071 are obtained under no jamming, whereas 3,392, 3,367, 3,378, and 3,357 are obtained in the presence of barrage, single-tone, successive-pulse, and protocol-aware jamming, respectively. The complete dataset with all the 23,565 samples is provided in [52].
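The threshold and SNR features described above can be illustrated with a short sketch. The functions below are generic textbook definitions rather than the internals of the GNU Radio blocks in Figure 3(b); the detection `level` and the function names are hypothetical.

```python
import math


def average_power(samples):
    """Average received power of a block of (possibly complex) samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def energy_threshold_flag(samples, level):
    """Binary indicator: 1 once the average received power exceeds `level`, else 0."""
    return 1 if average_power(samples) > level else 0


def snr_db(signal_power, noise_power):
    """SNR in dB from the average signal power and average noise power."""
    return 10.0 * math.log10(signal_power / noise_power)


# Per-class sample counts reported in the text.
CLASS_COUNTS = {
    "clean": 10071,
    "barrage": 3392,
    "single-tone": 3367,
    "successive-pulse": 3378,
    "protocol-aware": 3357,
}
total_samples = sum(CLASS_COUNTS.values())  # 23,565, matching the reported total
```

For instance, a signal power twice the noise power corresponds to an SNR of about 3 dB.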
To develop the ML classifiers, this dataset is divided into training and testing sets as detailed in Table 3, which suggests a balanced distribution among the jamming types, leading to high detection and classification accuracy. During the processing of features, it is found that the (symbol time, subcarrier length) and (threshold, average noise power) pairs are highly correlated. Thus, different ML models are explored by reducing the dimension of the features dataset. Case 1 uses all nine features. In Case 2, symbol time is eliminated, whereas symbol time and average noise power are eliminated in Case 3. The list of features in each case is given in Table 4. The two-class models predict whether a jamming attack is launched or not, whereas the five-class models detect the jamming attack and identify its type (i.e., barrage, single-tone, successive-pulse, and protocol-aware). During model development, 10-fold cross-validation is used in the training/validation stages. Once a model is trained, evaluation is performed on the test set, and the detection rate (DR), F-score, and false-alarm rate (FAR) are computed. Grid search is used to find the optimal hyper-parameters for each algorithm. The performance of the developed classifiers for the two- and five-class models is given in Table 5. Figures 4(a)-(c) show the confusion matrices of the five-class RF model for each case. None of the clean records are misclassified as jamming records. Rather, misclassification occurs only among the jamming types, particularly barrage and protocol-aware, which is attributed to the similarity in their spectral properties (i.e., interference in these types targets the entire transmission bandwidth, but at different intensity levels). The weighted FAR values are obtained from Figure 4 to be 1.35% for Case 1, 1.33% for Case 2, and 2.38% for Case 3. Finally, there is no false alarm in the two-class models regardless of the number of features used in training/validation.
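The following sketch shows one common way to obtain per-class and support-weighted DR, FAR, and F-score from a multi-class confusion matrix. The one-vs-rest definitions used here (DR as recall, FAR as the false-positive rate, weighting by the number of true samples per class) are assumptions, as the exact formulas are not restated in this section, and the toy matrix is hypothetical rather than taken from Figure 4.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        dr = tp / (tp + fn) if (tp + fn) else 0.0        # detection rate (recall)
        far = fp / (fp + tn) if (fp + tn) else 0.0       # false-alarm rate
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0  # F-score
        metrics.append({"support": tp + fn, "DR": dr, "FAR": far, "F": f1})
    return metrics


def weighted(metrics, key):
    """Support-weighted average of one metric over all classes."""
    total = sum(m["support"] for m in metrics)
    return sum(m["support"] * m[key] for m in metrics) / total


# Hypothetical 3-class example: the first class is never confused with the others.
toy = [
    [50, 0, 0],
    [0, 18, 2],
    [0, 3, 27],
]
toy_metrics = per_class_metrics(toy)
toy_weighted_dr = weighted(toy_metrics, "DR")
toy_weighted_far = weighted(toy_metrics, "FAR")
```

In this toy case, confusion between the last two classes raises their individual FAR values while the first class keeps a FAR of zero, mirroring the pattern described above in which clean records are never flagged as jamming.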
The validity of the feature-based models for detecting and classifying jamming attacks is further analyzed considering samples with different SNR levels. To this end, the extracted SNRs for all scenarios are plotted in Figure 5. Five sub-datasets, summarized in Table 6, are created to represent all jamming types. Sub-datasets 1, 2, 3, 4, and 5 have samples with SNR intervals of {0-1}, {1-2}, {2-3}, {3-4}, and {4-5} dB, respectively, and are established from the testing set. The samples with SNR values when there is no jamming are excluded to emphasize classification accuracy only among the four jamming types. The six classifiers developed earlier are tested with these five sub-datasets, and testing entailed the three cases with nine, eight, and seven features. The resulting DRs are illustrated in Figure 6 with the following observations in mind:
1) The overall accuracy dropped due to removing the clean samples from the sub-datasets, i.e., "Clean" samples are not within any of the SNR intervals.
2) No misclassification as "no jamming" occurred in almost all classifiers.
3) The least accuracy is obtained with
VOLUME 10, 2022
protocol-aware jamming has the highest misclassification as depicted in the confusion matrices presented in Figure 4. As a result, this SNR-based investigation shows that imbalances in the dataset (e.g., imbalance in the number of samples for each jamming type) significantly affect the classification quality and accuracy. Therefore, the datasets utilized for training and testing the feature-based ML classifiers in this work (i.e., Table 3) are balanced and have an adequate number of jamming and clean samples to facilitate high detection and classification accuracy.
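The construction of the SNR-based sub-datasets can be sketched as a simple binning step. The half-open interval convention (lower edge included, upper edge excluded), the `(snr_db, label)` record layout, and the sample values are assumptions for illustration only.

```python
SNR_EDGES_DB = (0, 1, 2, 3, 4, 5)  # edges of the {0-1}, ..., {4-5} dB intervals


def split_by_snr(samples, edges=SNR_EDGES_DB):
    """Group (snr_db, label) records into sub-datasets 1..5 by SNR interval.

    Clean (no-jamming) records are skipped, as described in the text.
    """
    subsets = {k: [] for k in range(1, len(edges))}
    for snr, label in samples:
        if label == "clean":
            continue
        for k in range(1, len(edges)):
            if edges[k - 1] <= snr < edges[k]:
                subsets[k].append((snr, label))
                break
    return subsets


# Hypothetical test-set records.
records = [
    (0.4, "barrage"),
    (1.7, "single-tone"),
    (1.2, "protocol-aware"),
    (4.9, "successive-pulse"),
    (12.0, "clean"),
]
sub_datasets = split_by_snr(records)
```

Counting the labels inside each resulting sub-dataset makes any per-interval class imbalance directly visible.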

V. SPECTROGRAM-BASED CLASSIFICATION
To improve the five-class classification accuracy, deep learning models trained with spectrogram images obtained from no-jamming/jamming scenarios are developed. These models have multiple processing layers that use backpropagation to model the parameters of complex datasets (e.g., image, speech), thereby facilitating precise classification [53]. Here, CNNs are used for their leading advantage in processing images by not only efficiently extracting image properties (e.g., size, color, pattern), but also pooling a large number of pixels to reduce calculations. The configuration of CNNs consists of an input layer, convolution layers, pooling layers, fully-connected layers, and an output layer. The input layer feeds images to the hidden layers. The convolution layer contains convolution kernels for extracting features, and their size gradually decreases, or remains constant, as more convolution layers are added. The pooling layer retains the highest-scoring features and discards others with low scores. It also reduces model parameters and thus reduces computations at later layers. The fully-connected layer is similar to a regular neural network (i.e., neurons in one layer are connected to those in the next layer). The output layer returns the probability of each class. Weights are adjusted in the network via backpropagation.
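The roles of the convolution, pooling, and output layers can be illustrated with a dependency-free sketch of a single forward pass. The toy image, the edge-detecting kernel, and the class scores are hypothetical, and the code is a teaching example rather than any of the CNN configurations used in this work.

```python
import math


def conv2d_valid(image, kernel):
    """2-D convolution layer (cross-correlation form, no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Max pooling: keep the highest-scoring value in each size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


def softmax(scores):
    """Output layer: convert class scores to probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 5 x 5 "spectrogram" with a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]              # responds to left-to-right increases
features = relu(conv2d_valid(image, kernel))     # 4 x 4 feature map
pooled = max_pool(features)                      # 2 x 2 map after pooling
probabilities = softmax([2.0, 0.5, 0.1, 0.1, 0.1])  # five hypothetical class scores
```

Pooling reduces the 4 × 4 feature map to 2 × 2 while keeping the strong edge response, which is the reduction in computation at later layers mentioned above.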
Spectrogram images are collected with the SDR and the QT GUI Waterfall Sink block. Python scripts are developed to capture screenshots during testing. Here, 762 images are collected under no jamming and 204 images are collected for each of the jamming types. The standard image size is scaled down from 1688 × 990 × 3 to 422 × 248 × 3 to reduce training time. These images are separated into 70% training and 30% testing. Figure 7 shows sample images in different scenarios. The complete image dataset is made available in [52].
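As a hedged illustration of the 70%/30% separation, the sketch below computes per-class training and testing counts. Splitting each class separately and rounding the test count down are assumptions; they are chosen here because they yield 472 test images in total, which agrees with the number of images mentioned in the detection-time discussion later in this section.

```python
# Image counts per class as reported in the text.
IMAGE_COUNTS = {
    "clean": 762,
    "barrage": 204,
    "single-tone": 204,
    "successive-pulse": 204,
    "protocol-aware": 204,
}
TEST_PERCENT = 30


def split_counts(counts, test_percent=TEST_PERCENT):
    """Return {label: (n_train, n_test)} for a per-class split.

    The test count is rounded down (assumption), using integer arithmetic
    to avoid floating-point rounding surprises.
    """
    split = {}
    for label, n in counts.items():
        n_test = (n * test_percent) // 100
        split[label] = (n - n_test, n_test)
    return split


split = split_counts(IMAGE_COUNTS)
total_train = sum(n_train for n_train, _ in split.values())
total_test = sum(n_test for _, n_test in split.values())
```

Under these assumptions, the 1,578 images divide into 1,106 training and 472 testing images.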
Spectrogram-based classification is realized with four CNN configurations: AlexNet, VGG-16, ResNet-50, and EfficientNet-B0. Figure 8 shows their structures and Table 7 details their parameters. AlexNet uses the ReLU activation function and the dropout method [54]. ReLU increases training speed, and dropout is added in the first two fully-connected layers to minimize overfitting. AlexNet starts with a convolution layer of 11 × 11 kernel size and 96 filters, followed by a layer of 5 × 5 kernel size and 256 filters. It also consists of three convolution layers with 3 × 3 kernel size and three pooling layers. These layers are followed by three fully-connected layers and an output layer. The VGG configuration adds more convolution layers to facilitate accuracy via deep neural networks [55]. However, an excessive addition of such layers potentially leads to gradient dispersion that results in training divergence. Here, VGG-16 is used for image training with five groups of two or three convolution layers of 3 × 3 kernel size together with five pooling layers, three fully-connected layers, and an output layer. The ResNet configuration addresses the vanishing gradient problem by exploiting batch normalization and skip connections across convolution layers [56]. It also comes in different structures including ResNet-18/34/50/101/152. Here, ResNet-50 is adopted, which consists of a 7 × 7 convolution layer and groups of 1 × 1, 3 × 3, and 1 × 1 convolution layers. It also has two pooling layers, one fully-connected layer, and an output layer. Lastly, EfficientNet improves accuracy through model scaling and branches into variants B0 to B7 [57]. In this work, EfficientNet-B0 is used for its compact architecture, which is characterized by a 3 × 3 convolution layer followed by mobile inverted bottleneck convolution (MBConv) layers with either 3 × 3 or 5 × 5 kernels. It also includes 1 × 1 convolution, pooling, fully-connected, and output layers.
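To give a sense of how kernel size and filter count translate into trainable parameters, the sketch below counts the weights of the first two AlexNet convolution layers quoted above. It assumes a three-channel (RGB) input, one bias per filter, and ordinary (non-grouped) convolutions; the values in Table 7 may be computed differently.

```python
def conv_params(kernel_size, in_channels, filters, bias=True):
    """Trainable parameters of a square-kernel 2-D convolution layer."""
    weights_per_filter = kernel_size * kernel_size * in_channels
    return (weights_per_filter + (1 if bias else 0)) * filters


# First AlexNet layer: 11 x 11 kernels, 96 filters, applied to a 3-channel image.
alexnet_conv1 = conv_params(kernel_size=11, in_channels=3, filters=96)

# Second layer: 5 x 5 kernels, 256 filters, applied to the 96 maps from layer one
# (assuming no channel grouping).
alexnet_conv2 = conv_params(kernel_size=5, in_channels=96, filters=256)
```

Although the second layer uses smaller kernels, it holds far more parameters because each filter spans all 96 input feature maps.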
The training/testing of the four CNN models is performed on two systems. The first uses 64-bit Windows 8 with an Intel Core i7-6900K CPU @ 3.20 GHz and 128 GB RAM. The second uses Google Colab with 16 GB RAM and a Tesla P100 GPU. All Python code uses TensorFlow with the Keras interface. Table 8 shows the DR, VA, F-score, and the training/testing times for the CNN classifiers. EfficientNet-B0 has the highest DR of 100% and 99.79% for the two- and five-class models, respectively, whereas AlexNet results in the lowest training/testing times, highest VA, and fastest convergence rate as shown in Figure 9(a)-(d). It is also found that the training and testing times for the CNN models are significantly higher than those obtained by the conventional ML algorithms, which is attributed to the deep and complex architectures of the CNNs. However, since detection times (i.e., GTE, CTE) result from classifying 472 images, the average processing times of the five-class EfficientNet-B0 model to classify an image are 0.005 s with GPU and 0.066 s with CPU, enabling real-time jamming detection and classification. Figure 10 shows the receiver operating characteristic (ROC) of the two-class models and indicates that EfficientNet-B0 outperforms other classifiers in jamming detection. Lastly, the weighted FARs are computed from the confusion matrices, shown in Figure 11, to be 0.6% for AlexNet, 1.55% for VGG-16, 1.86% for ResNet-50, and 0.03% for EfficientNet-B0. It is worth noting that the complexity and severity of a given jamming type have no contribution to its classification accuracy. For example, barrage jamming is the simplest to launch, whereas protocol-aware has the highest launch complexity. Yet, their feature- and spectrogram-based misclassifications are nearly 2.5% and 0%, respectively. Similarly, barrage has the highest severity among the four jamming types, whereas successive-pulse has the lowest severity. Nonetheless, their feature- and spectrogram-based misclassifications are < 1% and 0%, respectively, as demonstrated in the confusion matrices in Figures 4 and 11.
Table 9 shows a comparison between the proposed method and those reported in the literature in detecting and/or classifying jamming attacks with applications to satellite communications, OFDM, VANETs, and 5G/IoT networks. This work entailed four jamming attacks with the highest detection/classification accuracy. Moreover, six conventional and four deep learning models are trained and tested with realistic datasets of extracted signal features and images obtained after rigorous measurement routines.

VI. CONCLUSION
An ML method is proposed to detect/classify four types of jamming attacks on OFDM receivers with application to UAVs. Each attack is built with a B210 SDR and launched against a drone to qualitatively analyze its impacts considering severity, complexity, and jamming range. Then, an SDR is used in proximity to the drone to record key OFDM parameters, threshold, signal power, noise power, and SNR for the feature-based approach as well as spectrogram images for the spectrogram-based approach. The former approach is explored with six algorithms and the latter is realized with four CNN algorithms to achieve higher jamming detection/classification accuracy. All models are validated with metrics including detection and false-alarm rates, and the results show that jamming is classified with 92.2% and 99.79% accuracy using the feature- and spectrogram-based classifiers, respectively. This method requires the integration of a data extraction module with the UAV receiver to obtain real-time signal features and/or images to facilitate the detection and classification routines. This integration potentially imposes the need for interface circuits, along with a careful analysis of power aspects and hardware imperfections. Future work will entail exploring more jamming types (e.g., deceptive, reactive), incorporating maximum-likelihood classification with advanced SNR probing, and investigating UAV-specific anti-jamming solutions (e.g., trajectory optimization).

He secured external funding close to $5M in his research areas (sponsoring agencies include AFOSR, AFRL, CFI, NASA, NIST, NSERC, NSF, ONR, and industry partners). He has authored 250 peer-reviewed papers. His research interests include applied electromagnetics, biomedical applications of wireless sensor networks, computer-aided design, device modeling, image processing, infrastructure monitoring, neural networks, RF/microwave design, unmanned aerial vehicles, and virtual reality. In Canada and the USA, he graduated 75 thesis students at the M.S. and Ph.D. levels and won student-nominated teaching excellence awards. He served as an Associate Editor for the International Journal of RF and Microwave Computer-Aided Engineering under the Editor-in-Chief Dr. I. Bahl. He is also a Professional Engineer of the Association of Professional Engineers and Geoscientists of Alberta.