Noise-Tolerant Signal Preprocessing Technique for RF-Based UAV Signal Classification

Since the beginning of the COVID-19 pandemic, the demand for unmanned aerial vehicles (UAVs) has surged owing to the increasing need for remote, noncontact, and technologically advanced interactions. However, with the increased demand for drones across a wide range of fields, their malicious use has also increased. Therefore, an anti-UAV system is required to detect unauthorized drone use. In this study, we propose a radio frequency (RF) based solution that uses 15 drone controller signals. The proposed method addresses a key problem of RF-based detection: classification accuracy deteriorates when the distance between the controller and antenna increases or when the signal-to-noise ratio (SNR) decreases owing to a large amount of noise. For the experiment, we varied the SNR of the controller signal by adding white Gaussian noise, producing SNRs from −15 to 15 dB at 5 dB intervals. A power-based spectrogram image with an applied threshold value was used for convolutional neural network training. The proposed model achieved 98% accuracy at an SNR of −15 dB and 99.17% accuracy in the classification of 105 classes formed from 15 drone controllers across seven SNR regions. These results confirm that the proposed method is both noise-tolerant and scalable.


I. INTRODUCTION
Unmanned aerial vehicles (UAVs), including drones, are used for various purposes, such as delivery, agriculture, transportation, and communication [1]. The potential uses of such vehicles are continually increasing [2]. Additionally, social and commercial demands for remote technology have increased owing to the recent COVID-19 outbreak, and drones are being proposed as a noncontact solution for numerous applications. The approach in [3] shows how backup transportation systems based on existing drone infrastructure can play an important role during the COVID-19 pandemic and similar situations. Additionally, the authors in [4] proposed the application of drones for spraying disinfectants to combat the COVID-19 pandemic. However, behind such positive applications, the illegal uses of drones [5], such as for spying [6], drug trafficking [7], and terrorism [8], are increasing. Although restricted areas and laws for drone operation have been established, the barriers to drone purchases have decreased, which can result in numerous risks [9]. Therefore, an anti-drone system is required to block malicious use. It is designed to protect private property and personal privacy from unauthorized drones and comprises three stages: detection, identification, and decision making [9], [10]. Among these three stages, detection and classification must precede the decision step so that the anti-drone system can operate normally and defend successfully.

The associate editor coordinating the review of this manuscript and approving it for publication was Guillermo Valencia-Palomo.

VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Methods for detecting and classifying drones using various data sources have been presented. Radar and audio signals, vision data, and radio frequency (RF) signals are currently used for drone detection and classification [11]. However, even well-known radar-based detection approaches struggle to detect small drones and their low-altitude flights. Audio-based detection analyzes the sound generated when the brushless DC motors rotate at high speed, but it has an extremely short detection distance and is sensitive to noise [12]. Additionally, vision-based detection has the disadvantage of being limited by fog, weather, and various obstacles. RF-based detection offers the advantages of higher reliability and superior performance [13]. The related studies are discussed in detail in the next section.
For RF-based drone classification, various methods using feature extraction, machine learning (ML), and deep learning (DL) have been proposed. For example, drone controller signal classification using frequency-domain features (e.g., kurtosis, entropy, and variance) and various ML classifiers has been studied [14]. The approach in [15] performed drone signal classification using the frequency spectrum and a deep neural network (DNN). The authors in [16] proposed a method for channelizing the frequency spectrum and classifying the drone signals using a one-dimensional convolutional neural network (1DCNN). Recently, CNNs, which are highly effective image classifiers, have been combined with spectrograms to obtain high accuracy in RF-based drone classification tasks. However, these studies neither considered the cases wherein the signal-to-noise ratio (SNR) is lowered nor discussed the poor classification accuracy in low-SNR regions.
Hence, we propose a noise-tolerant classification method that obtains high classification accuracy even at an extremely low SNR. The proposed method performs data preprocessing based on a threshold after the spectrogram expressed by the power spectral density (PSD) is converted into a power-based spectrogram. This data preprocessing method can achieve high classification accuracy even in low-SNR regions with added noise. The dataset used was the drone remote controller RF signals [17] published on IEEE Dataport. The superiority of the proposed method was verified by classifying 15 different drone controllers and comparing the results with other studies that used the same dataset. Focusing on the goals and preprocessing techniques mentioned above, the main contributions of this study can be summarized as follows:
• Different methods of drone detection and classification were investigated. Particularly, the preprocessing method and its related classifier were reviewed in terms of RF-based drone classification. Moreover, to stimulate future research activities, we comprehensively reviewed the open-source drone RF data that have been released thus far.
• Related studies have pointed out the problem of poor classification accuracy in low-SNR regions in RF-based drone classification with noise. To create low-SNR environments, the relationship between drone classification and various channel models (e.g., Gaussian noise, Rician fading, and Rayleigh fading) was investigated.
• Based on this, we propose a power-based spectrogram and preprocessing technique that achieves noise tolerance while maintaining high classification accuracy. The proposed method achieved 98% classification accuracy even at an SNR of −15 dB.

II. RELATED WORK
In recent years, the illegal and malicious use of UAVs has raised various security, privacy, and safety concerns. Therefore, numerous methods have been studied to detect and identify drones [18]. Based on the type of data source used, there are four major methods: radar, audio, vision, and RF-based. In this section, a brief description of these methods and related studies are presented.

A. RADAR-BASED METHOD
A radar emits a strong electromagnetic wave and receives an echo reflected from the target object to determine its position and speed. It has certain advantages over vision-based approaches, such as being unaffected by various weather conditions, including fine dust, fog, clouds, and rain. However, because radars are optimized for aircraft operating at high altitudes with radar cross sections (RCSs) between 1 and 100 m², drones with low RCSs are difficult to detect using radar [19].
Owing to these difficulties, research on the RCS of drones is also ongoing. For example, [20] presented an RCS analysis of DJI Phantom 2, which is a type of quadcopter, at 10 GHz and [21] compared and analyzed the RCS and micro-Doppler signatures of drones based on the number of propellers.
In [22], a passive bistatic radar (PBR) system was developed for drone detection. It uses digital television signals at 685 and 738 MHz for the transmitter and receiver, respectively, at a distance of 7.5 km. Filtering and correction are applied to the signal after it is received; for the former, an extended Kalman filter was used to effectively track the trajectory of a DJI Phantom 4. However, birds similar in size to drones were also detected, and distinguishing between them in both radar and PBR systems remains a challenge [23].
In [24], the radar signals of planes, helicopters, quadcopters, birds, and stationary rotors corresponding to 11 classes were collected using an actual 9.5 GHz radar system. Based on these data, a micro-Doppler signature was introduced and then classified by applying ML techniques. The resulting accuracies obtained by the linear support vector machine (SVM), nonlinear SVM, and naive Bayes were 94.91%, 95.39%, and 93.6%, respectively.
Despite these studies, the radar-based method has several limitations such as RF regulation and implementation cost of radar systems.

B. AUDIO-BASED METHOD
The audio-based method uses acoustic sensors (e.g., microphones) to collect the acoustic signals generated by the rotors of a flying drone rotating at high speed. This method directly utilizes the unique characteristics of drones [25]. In [26], the authors assumed that a quadrotor vehicle has a characteristic audio power spectrum fingerprint. The plot image learning method was applied with a fast Fourier transform (FFT) of the recorded audio signal to detect the part with the highest frequency amplitude at a fixed size. This method demonstrated a detection accuracy of 83%. To apply a different algorithm, five consecutive chunks were extracted as a wave file from the same location and converted into a CSV file through the FFT. Drones could be detected with an accuracy of 61% when the k-nearest neighbor (k-NN) algorithm was applied to the processed data.
In [27], a CNN was applied to the matrix generated through a short-time Fourier transform (STFT) of the recorded data. The final accuracy of the proposed model was 98.97%, thereby indicating a high classification accuracy. However, when the SNR was reduced to 5 dB by adding white Gaussian noise to the signal under the same conditions as in the model, the accuracy decreased to 75.87%.
Furthermore, the audio-based method requires an appropriate microphone array, it has an extremely short detection range compared to other methods, and is very vulnerable to ambient noise.

C. VISION-BASED METHOD
Owing to the rapid development of computer hardware, numerous computations can be performed at high speeds; thus, it is possible to use a CNN when applying a vision method, which outperforms the traditional method.
In [40], images and various types of drone videos were extracted and collected from the internet, and the drones were distinguished using YOLOv3 [41]. The drones were classified based on the number of rotors, and the accuracy was calculated as the mean average precision (mAP). A final mAP of 0.74 was obtained.
In [42], a hemispherical camera array structure was constructed using 30 cameras. The authors collected data from this structure and detected drones, helicopters, and airplanes using YOLOv2 [43], with detection accuracies of 52.13%, 90.47%, and 96.03%, respectively; the accuracy for drones is notably low.
In [44], data points were obtained by applying Harris corner detection to a 1080p image, and a ConvNet was applied to remove the background from these points. Subsequently, the initialized points were traced using the Lucas-Kanade optical flow algorithm [45] to form a trajectory. Using the trajectory generated by the tracking module as an input, results were derived for trajectory lengths of 30 and 60 points, wherein the classification accuracies for aircraft and drones were 90.68% and 92.93%, respectively. Using these results, the authors proposed not only a real-time tracking and classification model but also a precision model. Efficient and easy-to-use vision-based methods are widely used. However, their limitations are evident in that they require a high-resolution camera and line-of-sight (LOS), and they are severely affected by weather or obstacles. Additionally, partial occlusion and illumination are problems that must be overcome [46].

D. RADIO FREQUENCY-BASED METHOD
The RF-based detection method intercepts and utilizes the RF signal between the drone and the remote controller. Compared with other methods, it is free from constraints such as weather, obstacles, and LOS. The related studies are summarized in Table 1.
The authors of [14], who constructed the dataset [17] used in this study, detected the energy transient from spectrograms and extracted RF fingerprints. After applying neighborhood component analysis (NCA), classification was conducted through various ML algorithms, such as k-NN, discriminant analysis (DA), SVM, and neural network (NN). The k-NN classification accuracy for the 14 classes was 96.3%, but all four algorithms demonstrated accuracies of less than 50% at an SNR of 0 dB.
In [39], the authors added Wi-Fi and Bluetooth signals, which lie within the ISM radio band used by the drone remote control signals, to the dataset used in [14]. The interfering signals, Wi-Fi and Bluetooth, can be separated based on their bandwidth and modulation. RF fingerprints were extracted from the drone controller RF signals, and the controllers were classified using ML algorithms. The proposed method obtained a classification accuracy of 98.13% using k-NN at an SNR of 25 dB. This model also showed a limitation in that the accuracy was less than 60% at an SNR of 0 dB. In [15], the authors who constructed the drone dataset [32] classified the data into four classes (3 drones and 1 background) with an accuracy of 85.4% using the frequency spectrum and a DNN. Moreover, various studies [33], [34], [35] applied 1DCNNs and fully connected neural networks to frequency data and obtained good accuracies of 85.8% [33], 92.02% [34], and 92.5% [35], respectively. Reference [16] achieved 95.6% accuracy using a multichannel CNN as a classifier. Reference [36] leveraged grouped convolution instead of standard convolution and obtained 98.5% accuracy. However, these studies have a limitation in that they do not consider a real noise environment because they used a dataset collected in a laboratory.
To obtain higher classification accuracy, studies using the spectrogram have recently been proposed [28], [29], [30], [37]. The authors of [37] shared the dataset with [38] and obtained 100% accuracy using the RGB values of the spectrogram and a CNN classifier. In [28], 99.9% accuracy was obtained by extracting features from the PSD, i.e., the spectrogram values, through CNN transfer learning and classifying them using logistic regression. In [29], 98.3% accuracy was obtained using ML algorithms on the spectrogram RGB matrix to which principal component analysis (PCA) was applied. These studies have shown that signal classification performed using spectrograms demonstrates very high accuracy compared to other approaches. However, these studies still did not consider noise. To contribute to the broader research community, [30] released the dataset [14] and obtained almost 100% accuracy using a spectrogram and a deep residual neural network (DRNN). However, they obtained an accuracy of 86.7% at −10 dB SNR owing to the effects of the added noise.
As evident from related studies, RF-based drone classification can achieve higher classification accuracy than other approaches by combining spectrograms and artificial intelligence (AI). However, RF-based classification is limited in that the accuracy decreases when the SNR is low. To address this, an advanced preprocessing technique is proposed in this study for denoising RF signals even when the SNR is decreased by added noise.

III. SYSTEM MODEL AND DATASET

A. SYSTEM MODEL
The system setup of the RF-based UAV classification model is shown in Fig. 1. It consists of a drone, drone controller, antenna, high-resolution oscilloscope, and computer. Because a drone uses an RF signal to communicate with the controller, the proposed method intercepts this signal to identify the drone controller. The received signal is divided into noise and signal components. To increase the classification difficulty, white Gaussian noise is added to the signal after calculating the power of the signal component. A power-based spectrogram image is generated, and the drone controller signal is classified using a CNN model.

B. DATASET DESCRIPTION
In this study, the drone controller signals [17] from the IEEE Dataport were used. The collected dataset comprised 15 drone controllers made by eight manufacturers, as listed in Table 2, and the time-voltage graph of each controller is shown in Fig. 2. The RF surveillance system continuously receives RF signals and records when the drone remote control (RC) RF signals are detected. For data collection, an oscilloscope (KEYSIGHT Technologies, Infiniium S-Series) with a maximum sampling frequency of 20 GHz, a 2.4 GHz 24 dBi grid parabolic antenna (TP-Link, TL-ANT2424B), and a low-noise amplifier (Fairview Microwave, 0.85 dB NF Input Protected Low Noise Amplifier) operating at 2.0-2.6 GHz were used. The metadata for the collected RF signal are listed in Table 3.

C. NOISING PROCEDURE
The signal was captured in an indoor laboratory environment wherein the distance between the drone RC and antenna varied from 1 to 5 m. As shown in Fig. 2, because the size of the noise section is constant, it is assumed that the environmental noise is fixed. As the distance between the drone controller and antenna increases, the SNR decreases. For successful RF-based UAV classification, it is necessary to classify the controller signal accurately even at an extremely low SNR. The authors of [30] evaluated the classification performance in Rician and Rayleigh fading environments; no significant deviation in the classification performance owing to the channel variation was observed. Therefore, in this study, the UAV controller was classified using a dataset wherein the SNR was artificially decreased by adding white Gaussian noise to the collected signal. To change the SNR, it is necessary to generate noise that matches the target SNR and add it to the drone controller signal. The noise is generated through the following process.
To add noise with respect to the SNR, the power ratio must be calculated. Hence, the noise and signal components within the signal must be separated. For this, the starting point of the transient state is determined as in [47]. Several methods can be used to find this starting point; here, the mean change point detection method [48] is applied. This method, expressed as J(k) in (1), has the advantage that there is no need to define a threshold or perform nonparametric estimation to test the hypothesis [49].
When the collected signal is x = [x_1, x_2, ..., x_N], where N is the length of the collected signal x, the reference point k divides it into x_1, x_2, ..., x_{k-1} and x_k, x_{k+1}, ..., x_N. At this time, k is the starting point of the transient state that minimizes the following J(k):

$$J(k) = \sum_{t=1}^{k-1}\left(x_t - \bar{x}_{t1}\right)^2 + \sum_{t=k}^{N}\left(x_t - \bar{x}_{t2}\right)^2 \quad (1)$$

In this case, the optimal solution k* = arg min_k J(k) is the minimizer. Here, x_{t1} is the interval before k, x_{t2} is the interval after k, and x̄_{t1} and x̄_{t2} are the averages of x_{t1} and x_{t2}, respectively. As shown in Fig. 3, based on the point k* at which J(k) is minimal, the separated x_{t1} is the noise component and x_{t2} is the signal component. Using this process, the signal x_{t2} can be separated from the collected signal, and the noise can be calculated based on the SNR.
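The mean change point search described above can be sketched in a few lines of numpy. This is an illustrative implementation only; the function and variable names are ours, and the search is applied to the signal magnitude so that the change in mean level between the noise-only and transient segments is visible.

```python
import numpy as np

def change_point(x):
    """Estimate the transient start k* by minimizing J(k): the sum of
    squared deviations of each segment from its own mean (mean change
    point detection; no threshold or hypothesis test is needed)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    best_k, best_j = 1, np.inf
    for k in range(1, N):              # split into x[0:k] and x[k:N]
        x1, x2 = x[:k], x[k:]
        j = np.sum((x1 - x1.mean()) ** 2) + np.sum((x2 - x2.mean()) ** 2)
        if j < best_j:
            best_j, best_k = j, k
    return best_k

# 200 noise-only samples followed by a stronger signal segment:
rng = np.random.default_rng(0)
x = np.concatenate([0.05 * rng.standard_normal(200),
                    np.sin(2 * np.pi * 0.1 * np.arange(300))])
k_star = change_point(np.abs(x))       # magnitude exposes the mean change
print(k_star)                          # index near the true transient start, 200
```

The exhaustive O(N²) scan over k is kept for clarity; a practical implementation would update the two segment sums incrementally.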
where n is the length of x t2 . If the desired SNR is γ req [dB] and the current power is not on the dB scale, a conversion is required. The inverse transformation from the dB scale is as follows: Because the SNR can be calculated the power level, the noise power can be obtained as a ratio using (2) and (3) as follows: Finally, the magnitude corresponds to the square root of the power, and n[i], which is white Gaussian noise with a Gaussian normal distribution, is generated.
where i = 1, 2, ..., n, and n is the total number of samples of x_{t2}. Additionally, N(0, 1) is the Gaussian normal distribution with an average of 0 and a variance of 1. A signal with the desired SNR can be generated by adding this noise to the original signal. The waveform of the original signal with added noise is shown in Fig. 4.
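The full noising procedure — separate the signal component, compute its power, convert the target SNR from dB, and scale white Gaussian noise accordingly — can be sketched as follows. This is a minimal numpy illustration; the function name and the toy signal are ours.

```python
import numpy as np

def add_noise_for_snr(signal, transient_start, snr_db, rng=None):
    """Add white Gaussian noise so that the signal component reaches the
    requested SNR. `transient_start` is the change point separating the
    leading noise-only samples from the signal component x_t2."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(signal, dtype=float)
    x_t2 = x[transient_start:]                   # signal component
    p_signal = np.mean(x_t2 ** 2)                # signal power
    gamma = 10.0 ** (snr_db / 10.0)              # inverse dB transform
    p_noise = p_signal / gamma                   # required noise power
    noise = np.sqrt(p_noise) * rng.standard_normal(len(x))
    return x + noise

# Toy controller signal: 100 quiet samples, then a tone; target SNR −10 dB.
rng = np.random.default_rng(1)
x = np.concatenate([np.zeros(100), np.sin(0.2 * np.arange(1000))])
y = add_noise_for_snr(x, 100, -10.0, rng)
```

Repeating this for each target value from −15 to 15 dB in 5 dB steps yields the seven SNR regions used in the experiments.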

IV. IMAGE GENERATION
Because the threshold corresponding to denoising is applied during the spectrogram transformation, the spectrogram is described first. The generation of images for CNN training from the signal with a decreased SNR comprises two steps: first, finding a threshold value and filtering the signal based on it; second, creating an image through the spectrogram transformation.

A. SPECTROGRAM TRANSFORMATION
Using the Fourier transform (FT), time-series data can be analyzed in the frequency domain. However, the information regarding time is lost. It is often necessary to analyze information in terms of both time and frequency, as with human voices and music. The STFT was devised for this purpose [50].
The STFT divides the time-series signal into multiple segments by applying a window function and then applies the FT to each segment. The FT assumes an infinite period when treating an aperiodic signal as periodic. To satisfy this assumption, a window function, i.e., a function whose ends converge to zero, is used. Representative examples include the Hanning and Kaiser windows [51]. The discrete-time STFT applied to the captured signal is

$$X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n - mR]\, e^{-j\omega n} \quad (6)$$

where x[n] are the data obtained by preprocessing the drone controller signal collected from the antenna, w[n] is the window function, m is the discrete time, and R is the hop size of the window. The result of this STFT forms a matrix over discrete time and frequency, and the spectrogram maps the absolute square of the STFT to a color bar:

$$S(m, \omega) = |X(m, \omega)|^2 \quad (8)$$

where S(m, ω) denotes the spectrogram, expressed in terms of time, frequency, and PSD. The transformed spectrogram is shown in Fig. 4. In the spectrogram transformation used in this study, to reduce the number of computations, the hop size is set to the length of the window function; thus, there is no overlapping part.
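The non-overlapping spectrogram described above (hop size equal to the window length) can be computed directly in numpy. This is a sketch under stated assumptions: the paper does not fix the window type or length here, so a Hann window of 256 samples is our choice.

```python
import numpy as np

def spectrogram(x, n_win):
    """|STFT|^2 spectrogram with a Hann window and hop size equal to
    the window length, so consecutive frames do not overlap."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // n_win
    w = np.hanning(n_win)
    frames = x[:n_frames * n_win].reshape(n_frames, n_win) * w
    X = np.fft.rfft(frames, axis=1)    # one FFT per window position
    return np.abs(X) ** 2              # time x frequency matrix S(m, w)

fs = 1000.0                            # assumed sampling rate for the demo
t = np.arange(4000) / fs
x = np.sin(2 * np.pi * 100 * t)        # 100 Hz test tone
S = spectrogram(x, 256)
freqs = np.fft.rfftfreq(256, d=1 / fs)
peak = freqs[np.argmax(S.mean(axis=0))]
print(peak)                            # close to the 100 Hz tone
```

With no overlap, each input sample contributes to exactly one frame, which is what keeps the computation count low in the paper's setting.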

B. FINDING THRESHOLD AND POWER-BASED SPECTROGRAM
In this study, the threshold value was calculated from the power spectrum, and filtering was performed on the power-based rather than the PSD-based spectrogram. The result of the FFT covers only the frequency domain; therefore, the proposed method cannot be applied to the FFT alone. However, in the case of the wavelet transform (WT), the proposed method is expected to be applicable if processing for the variable-sized windows of the WT is added. The CNN is a type of DL architecture and a powerful tool for image classification. The data used for CNN training can be interpreted as an image, and the image can be interpreted as a three-dimensional matrix with rows, columns, and RGB channels. In an actual CNN, the RGB color values, positions, and other image factors affect the results. However, a spectrogram maps the matrix values to a color map using (8), as shown in Fig. 4, wherein the same color becomes increasingly dominant as the noise increases. These results cause a decrease in the classification accuracy of a CNN. Hence, applying a threshold to a signal to which noise has been added has the same effect as filtering. In other words, the application of a threshold corresponds to denoising. The average power of a signal x(t) is given by

$$P = \frac{1}{T}\int_{t_0}^{t_0+T} |x(t)|^2\, dt \quad (9)$$

where T is the period and t_0 is an arbitrary time. From this, a window function is applied to impose the time constraint on the signal rather than on the integral boundary.
This window function gives x_T(t) = x(t)w(t), where w(t) is 1 within the interval of interest and 0 elsewhere. From this, the energy of the signal is obtained through Parseval's theorem:

$$E = \int_{-\infty}^{\infty} |x_T(t)|^2\, dt = \int_{-\infty}^{\infty} |X_T(f)|^2\, df \quad (11)$$

where X_T(f) denotes the FT of x_T(t). Thus, the PSD can be obtained as

$$S_{xx}(f) = \lim_{T \to \infty} \frac{1}{T}\, |X_T(f)|^2 \quad (12)$$

However, because S_xx(f) is a density function, the power in a given band is obtained by integrating over frequency within a short section.
$$P_{\mathrm{limited},i} = 2\int_{f_i}^{f_{i+1}} S_{xx}(f)\, df \quad (13)$$

where P_limited,i is the power corresponding to the i-th frequency segment between f_i and f_{i+1}, which satisfies 0 < f_i < f_{i+1}. The same amount of power is contained in the positive and negative frequencies, which explains the factor of 2 in front of the integral. Fig. 5(a) shows the power spectrum of the DJI Inspire 1 Pro, and Fig. 5(b) shows the power spectrum of a signal with noise added to an SNR of −10 dB. These two graphs show that the drone controller signal occupies the 2.4 GHz band and has the largest power. From (8), it is confirmed that the PSD-based spectrogram is a graph wherein the absolute square of the STFT, i.e., the PSD, is mapped to the color bar. Equation (13) is applied to (8) to generate a power-based spectrogram by integrating the PSD-based spectrogram over limited frequency ranges:

$$S_p(m, \omega_i) = 2\int_{f_i}^{f_{i+1}} S(m, f)\, df \quad (14)$$
where S_p(m, ω) is the power-based spectrogram obtained by mapping time-frequency-power to the color bar, instead of the time-frequency PSD of the original PSD-based spectrogram. The frequencies f_i and f_{i+1} satisfy f_i < f_{i+1}. As shown in Fig. 6(b), a power-based spectrogram can be obtained from the PSD-based spectrogram. However, because the image in Fig. 6(a) is color-biased, as shown in Fig. 4(c), a threshold value γ_t is applied. This threshold is applied to the power-based spectrogram as follows:

$$S_p'(m, \omega_i) = \begin{cases} S_p(m, \omega_i), & S_p(m, \omega_i) > \gamma_t \\ 0, & \text{otherwise} \end{cases} \quad (16)$$

If the drone controller signal power is larger than that of the other signals, the average signal power can be set as the threshold value for an expedited calculation. The power-based spectrogram to which a threshold is applied is shown in Fig. 6(b). By applying a threshold to the original and difficult-to-classify signals, the signal and noise can be separated more clearly. Finally, the complexity of the proposed method can be obtained as follows. The complexity of the FFT is O(N log₂ N) [54]. In the case of the STFT, an FFT is required for each window position; thus, the complexity is O(N_w N log₂ N), where N_w is the number of sampling points in the window function [55]. In addition, the power-based spectrogram integrates the STFT-based spectrogram over the frequency segments. Therefore, the complexity of the proposed method is O(N_seg N_w N log₂ N), where N_seg is the number of points in the frequency segments.
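The conversion from a PSD-based to a thresholded power-based spectrogram can be illustrated with a small numpy sketch. The rectangle-rule integration, the uniform segment layout, the mean-power threshold, and the synthetic PSD below are our assumptions for the illustration.

```python
import numpy as np

def power_based_spectrogram(S_psd, freqs, n_seg):
    """Integrate a PSD-based spectrogram over n_seg equal frequency
    segments to obtain band powers, then zero out every bin whose power
    falls below the mean power (the threshold gamma_t)."""
    df = freqs[1] - freqs[0]
    edges = np.linspace(0, S_psd.shape[1], n_seg + 1, dtype=int)
    S_p = np.empty((S_psd.shape[0], n_seg))
    for i in range(n_seg):
        lo, hi = edges[i], edges[i + 1]
        S_p[:, i] = S_psd[:, lo:hi].sum(axis=1) * df  # band power
    gamma_t = S_p.mean()                   # average power as threshold
    return np.where(S_p > gamma_t, S_p, 0.0)

# Synthetic PSD: weak noise floor plus one strong band (the controller).
rng = np.random.default_rng(2)
S_psd = rng.random((10, 128)) * 1e-3
S_psd[:, 40:48] += 1.0
freqs = np.linspace(0, 64e6, 128)
S_p = power_based_spectrogram(S_psd, freqs, 16)
```

After thresholding, only the segment containing the strong band keeps nonzero power, which is the denoising effect described above.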

V. CNN-BASED DRONE CONTROLLER RF CLASSIFICATION
In this section, an efficient, low-cost, and highly accurate CNN model is presented to classify the RF signals of 15 different drone controllers. The proposed CNN model consists of three two-dimensional convolution (conv2D) layers and three max pooling (maxpool) layers. Our goal is to propose a drone classification method that can potentially be implemented in real time. Therefore, we consider a basic CNN architecture to limit the time complexity. This optimized model was tested against various alternatives, such as depthwise convolution and dilated convolution. The overall architecture of the CNN model is illustrated in Fig. 7, and the details of the model configuration and the number of parameters are listed in Table 4. The 356 × 452 × 3 power-based spectrogram image that was preprocessed in the previous section is used as the input layer of the CNN model. The first conv2D layer is a 64-channel filter with a stride of (2, 2) and size of 2 × 2. Subsequently, the information passes through a batch normalization layer and a rectified linear unit (ReLU), which is an activation layer. From the CNN architecture, the number of learnable parameters is 0.485 M, and the number of floating-point operations (FLOPs) is 507.87 M.

The key function of a real-time spectrum analyzer is parallel sampling and FFT calculation [52]. The data sampling continues while the calculations are performed. Furthermore, the real-time operation of the STFT was studied in [53]. Using these methods, the threshold value can be calculated in a semi-adaptive manner.
ReLU outputs zero if the input value is less than zero and outputs the input value otherwise. The data then pass through the maxpool layer before being connected to a fully connected layer via the 128- and 256-channel conv2D and maxpool layers. The last layer of the CNN model is the classification layer, for which the model uses the softmax function:

$$\sigma_i(x) = \frac{e^{a_i(x)}}{\sum_{j=1}^{n} e^{a_j(x)}} \quad (18)$$

where a_i(x) is the i-th value of the fully connected layer and n is the total number of classes. As shown in the above equation, the softmax function expresses the values of the fully connected layer as probabilities between zero and one. From this, the resulting value is predicted using a one-hot vector based on one-hot encoding:

$$v_i = \begin{cases} 1, & i = \arg\max_j \sigma_j(x) \\ 0, & \text{otherwise} \end{cases} \quad (19)$$
In (19), v is the vector obtained through one-hot encoding of the output of the softmax function, and i denotes the i-th element of v. The length of v equals the total number of classes, and the element whose value is 1 indicates the predicted class.
To allow learning in the correct direction, the predicted values must be compared with the actual values through a cost function, and training should be directed toward minimizing it. In this study, the following cross-entropy loss function is used:

$$L(W) = -\sum_{i=1}^{N} y_i \log \hat{y}_i \quad (20)$$

where W is the weight vector of the model; y_i is the true label, i.e., the previously obtained one-hot vector; ŷ_i is the predicted label, i.e., the result of the softmax function; and N is the total number of classes. If the predicted result and actual value are the same, the loss converges to zero, and if they are completely opposite, it diverges toward infinity because of the −log. Learning proceeds such that the loss function is minimized.
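The classifier described in this section can be sketched in PyTorch. The paper specifies three conv2D + maxpool stages with 64/128/256 channels, batch normalization, ReLU, a 2 × 2 kernel with stride (2, 2) for the first conv layer, and a softmax output over 15 classes; the kernel, stride, and pooling settings for the later stages are our assumptions, so the parameter count of this sketch will not match the reported 0.485 M exactly.

```python
import torch
import torch.nn as nn

class DroneControllerCNN(nn.Module):
    """Sketch of the three conv2D + three maxpool architecture; kernel
    and stride choices beyond the first conv layer are assumptions."""
    def __init__(self, n_classes=15):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=2, stride=2),
            nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=2, stride=2),
            nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, kernel_size=2, stride=2),
            nn.BatchNorm2d(256), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(256 * 5 * 7, n_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)          # fully connected input
        return torch.softmax(self.classifier(x), dim=1)

model = DroneControllerCNN()
img = torch.randn(1, 3, 356, 452)                # 356 x 452 x 3 spectrogram image
out = model(img)                                 # class probabilities, shape (1, 15)
loss = -torch.log(out[0, 3])                     # cross-entropy for a one-hot label at class 3
```

During training, the cross-entropy loss above would be minimized with stochastic gradient descent with momentum, matching the optimizer settings given in Section VI.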

VI. SIMULATION RESULT
In this section, the performance of the proposed model is evaluated. The simulation settings for the CNN model are listed in Table 5. Particularly, the optimizer was a stochastic gradient descent with momentum, the momentum factor was 0.9, L2 regularization factor was 0.0001, maximum number of epochs for training was 100, initial learning rate was 0.01 (which decreased to 0.001 after 60 epochs for better training convergence), and the mini-batch size was 16.
For the simulations, noise was added to the drone controller signals based on the SNR. A power-based spectrogram image was generated from the signal with the added noise, with a threshold applied based on the value of the power spectrum. The signal was changed by adding noise in 5 dB steps, from an SNR of −15 to 15 dB. Two simulations were performed to confirm whether the proposed model can accurately classify the physical signals, even at a low SNR.
First, to verify the effectiveness of the proposed method in a real noise environment, white Gaussian noise was added and the accuracy of the PSD-based spectrogram was compared with that of the power-based spectrogram based on the threshold value. Additionally, the classification accuracy was compared with the results of other studies that used the same dataset.
Thereafter, the last experiment confirmed whether the proposed method is scalable as an RF-based approach for drone detection. In the first setup, data from the same drone controller in different SNR regions were added as a new class. In the second setup, 105 classes were generated by combining the 15 drone controller signals with seven different SNR regions. The result of the classification experiment indicates that the proposed method can classify the same drone controller even if it has a different SNR value.
The power-based spectrogram used in the simulation contained 300 images per class. All simulation settings were the same as those for the model and options described above. Of the dataset, 80% was used for training, and the remaining 20% was used as the test set for verification. Accuracy was used as the indicator to evaluate performance.
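The 80/20 split can be sketched as a shuffled partition applied per class (a minimal illustration with placeholder arrays; the paper does not specify its splitting code):

```python
import numpy as np

def split_80_20(images, labels, seed=0):
    """Shuffle one class's images and split them: 80% training, 20% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))
    cut = int(0.8 * len(images))
    tr, te = idx[:cut], idx[cut:]
    return images[tr], labels[tr], images[te], labels[te]

# With 300 images per class, each class contributes 240 training
# and 60 test images.
X = np.zeros((300, 64, 64))  # placeholder spectrogram images for one class
y = np.zeros(300, dtype=int)
Xtr, ytr, Xte, yte = split_80_20(X, y)
print(len(Xtr), len(Xte))  # 240 60
```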

A. PERFORMANCE EVALUATION
In this experiment, the performance of the proposed method was evaluated, using the PSD-based spectrogram as the baseline against the power-based spectrogram. The quality and size of the dataset are critical to obtaining accurate outcomes; accordingly, 300 spectrogram images per class were used for the training and test sets. The results of this experiment are shown in Fig. 8. The PSD-based spectrogram images achieved a high classification accuracy of 94.92% at an SNR of 15 dB; however, as the SNR was reduced, the classification accuracy decreased significantly. In contrast, with the power-based spectrogram to which the threshold value was applied, the classification accuracy did not fall below 98%, even at an SNR of −15 dB. Because the drone controller signal and the noise become homogenized in power level at low SNR, CNN-based algorithms that classify PSD-based spectrograms exhibit poor accuracy. These results validate that the proposed method is noise-tolerant and robust when a power-based spectrogram and threshold value are used.
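The preprocessing contrast discussed above can be sketched as follows: a power-based spectrogram is computed with a short-time FFT, and bins below a power threshold are zeroed to suppress background noise. The exact threshold rule is defined earlier in the paper; the quantile-based rule, window length, and hop size below are our own illustrative assumptions:

```python
import numpy as np

def power_spectrogram(x, nfft=256, hop=128):
    """Power-based spectrogram via a Hann-windowed short-time FFT."""
    win = np.hanning(nfft)
    frames = np.stack([x[i:i + nfft] * win
                       for i in range(0, len(x) - nfft + 1, hop)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T  # freq x time

def threshold_spectrogram(S, quantile=0.8):
    """Zero out bins below a power threshold (here a quantile of all bins)."""
    thr = np.quantile(S, quantile)
    return np.where(S >= thr, S, 0.0)

x = np.random.default_rng(1).normal(size=4096)  # noise-like test signal
S = threshold_spectrogram(power_spectrogram(x))
```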
To confirm whether the proposed method achieves high accuracy, the classification accuracies reported in [14] and [39] were compared using the same drone controller signals. Both studies applied various ML techniques by extracting a feature called an RF fingerprint from the energy transient. In [14], k-NN, DA, SVM, and NN techniques were used, and in [39], k-NN, DA, and random forest (RandF) were applied. A comparison with these two studies is shown in Fig. 9. Although their classification accuracy is high when the SNR is high, it evidently decreases significantly as the SNR decreases. This indicates that additive noise causes serious difficulties in extracting valid features.
Through these experiments, it can be confirmed that even at a low SNR, the classification accuracy of the proposed method is sufficiently reliable.

B. ANALYSIS OF THE RF-BASED DRONE CLASSIFIER AFTER ADDING A NEW CLASS
Through this experiment, the scalability potential of the proposed method as an RF-based drone classifier in a real application was confirmed. The classifier should maintain the same performance even when a new model is added, and it should classify drone controllers operating in different SNR environments. Therefore, the signals of the 15 controllers were divided into 105 classes across 7 SNR regions, and the classification results of the proposed method were confirmed after preprocessing. For training, 300 images from each class were used; therefore, 31,500 images were used in total. Among the classification results, the incorrectly predicted classes are listed in Table 6, and the controllers corresponding to the classes are arranged in Table 2.
Out of 6,300 test set images, 52 incorrect predictions were made, yielding a classification accuracy of 99.17%. In particular, the classifier often incorrectly predicted the DJI Phantom 4 Pro, which corresponds to #5 in Table 2, as the DJI Matrice 600, which corresponds to #3. This is likely because the two controllers are made by the same brand, and their waveforms and spectrograms have similar values and patterns. Nevertheless, the classification accuracy of 99.17% across 105 classes demonstrates that the proposed preprocessing technique using a power-based spectrogram is robust to noise.
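The reported figures are internally consistent, as a quick arithmetic check confirms (the counts are taken from the text; the script itself is ours):

```python
# 105 classes x 300 images = 31,500 images; 20% held out for testing
total_images = 105 * 300
test_images = int(0.2 * total_images)
wrong = 52

accuracy = 100 * (test_images - wrong) / test_images
print(test_images)         # 6300
print(f"{accuracy:.2f}%")  # 99.17%
```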
Thus, from the two experiments, it was demonstrated that the proposed method is noise-tolerant and scalable.

VII. CONCLUSION
In this study, a data preprocessing technique for classifying the RF signal of a drone controller for UAV identification was proposed. The SNR of the signal was changed by adding white Gaussian noise. The proposed method used a power-based spectrogram instead of the existing PSD-based spectrogram for learning. Additionally, to reduce the effect of noise, a threshold value was calculated from the power spectrum and then applied to the power-based spectrogram to increase the classification accuracy. The proposed method can classify the drone controller signal with an accuracy exceeding 98%, even at an extremely low SNR of −15 dB.
Through the results of our proposed method, it is expected that classification performed using the spectrogram can identify drones with higher accuracy. However, classification of multiple drone controller signals remains challenging.
In future research, such classification at the physical signal level will be investigated to achieve real-time detection similar to radar.