Microwave Radiometer RFI Detection Using Deep Learning

Abstract: Radio frequency interference (RFI) is a risk for microwave radiometers because of the very high sensitivity they require. The Soil Moisture Active Passive (SMAP) mission takes an aggressive approach to RFI detection and filtering, using dedicated spaceflight hardware and ground processing software. As more sensors push to observe at larger bandwidths in unprotected or shared spectrum, RFI detection continues to be essential. This paper presents a deep learning approach to RFI detection using SMAP spectrogram data as input images. The study uses transfer learning to evaluate the viability of this method for RFI detection in microwave radiometers. The well-known pre-trained convolutional neural networks (CNNs) AlexNet, GoogleNet and ResNet-101 were investigated. ResNet-101 provided the highest accuracy with respect to validation data (99%), while AlexNet exhibited the highest agreement with SMAP detection (92%).


I. INTRODUCTION
Radio frequency interference (RFI) detection in microwave brightness temperature data continues to be a problem of interest, and several techniques have been developed to detect the presence of RFI in radiometer measurements, e.g. [1][2][3][4][5][6][7][8][9][10]. Soil Moisture Active Passive (SMAP) is the first spaceflight mission to use onboard digital signal processing dedicated to generating information to enable RFI detection and filtering [11]. The RFI detection algorithms in the SMAP ground processing are drawn from this previous work and include energy detectors in both time and frequency, referred to as the pulse and cross-frequency detectors; the kurtosis method, which is a test for normality; and polarimetric approaches, which search for anomalies in the 3rd and 4th Stokes parameters [4,12,13]. These methods work best to filter RFI that is sparse in time and/or frequency. They improve the quality of measurements but at the cost of increased uncertainty, or Noise Equivalent Differential Temperature (NEDT), because removing contaminated pixels reduces the time-bandwidth product.
Broadband radiometry enhances sensitivity by reducing instrument noise. Thus, as upcoming sensors seek observations at much larger bandwidths than that of the SMAP radiometer in unprotected or shared spectrum, RFI detection and filtering remain a necessity, e.g. [14,15]. Indeed, on-board RFI detection is being considered for future missions, for example the second generation of polar orbiting meteorological satellites (MetOp-SG) as well as the Copernicus Imaging Microwave Radiometer (CIMR) mission [16,17].
Increasing bandwidths lead to an increase in the number of spectral channels. Future instrument concepts include hyperspectral imagers and sounders that offer benefits such as increased precision and accuracy of products as well as extended spectral coverage [18,19]. Thus, with the potential deluge of data from new sensors, conventional RFI detection and filtering may not be the only option. A new class of RFI detection algorithms includes intelligent approaches based on machine learning techniques. For example, the authors of [20] have used deep learning techniques to detect RFI in 2-D time-ordered radio astronomy data, and the authors of [21] have used deep learning to detect RFI in Global Navigation Satellite System (GNSS) signals. Here deep learning is used to demonstrate RFI detection on spectrograms produced by the Earth-orbiting SMAP microwave radiometer. The work in this paper was done as a proof of concept to determine the feasibility of deep learning for RFI detection using real-world data. With the advent of graphics processing unit (GPU) technology for space use and the growing number of channels in digital receivers, it may be advantageous to run a deep learning algorithm on a GPU rather than conventional algorithms on a microprocessor.
P. N. Mohammed is with NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA and also with Morgan State University, Baltimore, MD 21251 USA. (Priscilla.N.Mohammed@nasa.gov)
J. R. Piepmeier is with NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA. (jeffrey.r.piepmeier@nasa.gov)

II. METHODOLOGY
Deep learning [22] is a type of machine learning where the model learns to perform classification tasks directly from images, text or sound [23,24]. In this work, RFI detection is investigated using supervised learning based on SMAP spectrograms. The model is trained using labeled data sets as inputs, and the training process is repeated iteratively until the expected output is generated. Convolutional neural networks (CNNs) have been successfully exploited in image classification and are the networks most commonly used in this type of application [25,26]. Transfer learning [27] is the process where a pre-trained network is fine-tuned; the learned features are transferred to a new task using a smaller number of training images, which is usually much faster than training from scratch. The CNN could be trained using simulated RFI; however, SMAP measurements were chosen for this proof-of-concept experiment. Although millions of spectrogram images can be produced from SMAP data, transfer learning is used for this application to produce results in a timely manner for study.
To use deep learning and transfer learning for RFI detection, the problem is defined as an image classification problem in which spectrograms are the input images to be classified as having RFI or not having RFI. In this paper, the benefits of transfer learning are leveraged, and the pre-trained CNNs AlexNet [25], GoogleNet [28] and ResNet-101 [29] were used for feature learning and classification. These networks were previously trained on more than a million images from the ImageNet [30] database to classify images into 1000 categories including nature, animals and everyday objects.
The transfer learning approach applies knowledge of one type of problem to a different but related problem. To use an existing CNN such as AlexNet to detect objects that the original network was not trained on, the network can be retrained through transfer learning. The last few layers of the network were replaced and then retrained with images of spectrograms derived from SMAP data. In the method proposed here (see Figure 1), the pre-trained network was loaded using Matlab, labeled input images were loaded to a datastore that stores the file paths so that image data can be read into memory as needed, and the images were resized to the input size required by the existing network. The last layers of the CNN, which learn features specific to the input data set, were replaced, and the network was then trained using 80% of the input spectrogram images. The network was validated using the remaining 20% of the input data, and the trained network was then deployed to classify new images.

A. Data Acquisition for Input Training Images
Each SMAP footprint in the Level 1A data contains an 8 (1.2-ms samples) x 16 (1.5-MHz channels) spectrogram of antenna counts or average power [31]. The passband shape is equalized by applying independent gain and offset calibration coefficients to each of the 16 frequency channels [32]. The spectrograms were converted to images, with normalized color scales, and used as inputs to the deep learning algorithm implemented in Matlab.
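The conversion from an 8 x 16 spectrogram to a network input image can be sketched as follows. This is a minimal illustration in Python rather than the Matlab implementation used in the study; the per-image min-max scaling and the nearest-neighbour resize are assumptions, since the text states only that the color scales were normalized and the images resized. The function names are hypothetical.

```python
def normalize_spectrogram(counts):
    """Min-max scale a 2-D list (time x frequency) into the range [0, 1].

    Assumption: each spectrogram is scaled independently of the others.
    """
    flat = [value for row in counts for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A perfectly flat spectrogram maps to all zeros.
        return [[0.0 for _ in row] for row in counts]
    return [[(value - lo) / span for value in row] for row in counts]


def resize_nearest(image, out_rows, out_cols):
    """Nearest-neighbour resize of a 2-D list to out_rows x out_cols."""
    in_rows, in_cols = len(image), len(image[0])
    return [
        [image[r * in_rows // out_rows][c * in_cols // out_cols]
         for c in range(out_cols)]
        for r in range(out_rows)
    ]


# Synthetic 8 x 16 spectrogram (8 time samples, 16 frequency channels).
spectrogram = [[100.0 + t + 2.0 * f for f in range(16)] for t in range(8)]
normalized = normalize_spectrogram(spectrogram)
# 227 x 227 is the AlexNet input size quoted in the paper.
network_input = resize_nearest(normalized, 227, 227)
```

A three-channel input would repeat or color-map this single plane; that step is omitted here.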
Supervised training requires a priori labeling of the images, in this case "no RFI" and "RFI." RFI cases were taken from all parts of the globe and included low-level (5 to 10 K), moderate-level (10 to 100 K) and high-level (> 100 K) RFI as well as different types of RFI such as pulsed, continuous wave (CW) and wideband. Ground truth is difficult to establish when using real-world data, so conditions were set under which it is reasonably certain that RFI is present in the footprint. The SMAP ground processing algorithm, which detects and filters RFI, attempts to estimate the RFI strength. RFI level is determined from the difference between the non-filtered and filtered data. However, when RFI is very strong (typically > 400 K), 100% of the spectrogram is blanked, resulting in a reported null value for the filtered footprint. In these cases, RFI level cannot be directly determined. Also, if more than 50% of the spectrogram is blanked by the algorithms, it is reasonably certain that RFI exists in that footprint regardless of the difference between the non-filtered and filtered data. Therefore, spectrograms were labeled as "RFI" on the condition that either the RFI level for the footprint was over 5 K (-20 dB INR) as estimated by the SMAP ground processing algorithms or that the SMAP ground processing algorithms blanked more than 50% of the spectrogram.
To be confident that the RFI-free cases were indeed not contaminated, RFI-quiet parts of the globe were used for these samples. Footprints over Australia, Antarctica, the Arctic and the southern part of the Indian Ocean were used, with the condition that the RFI level was less than 2 K and the number of pixels flagged was less than 50% of the spectrogram. The training input data consisted of 2507 images labeled "RFI" and 2507 images labeled "no RFI," with 80% used for training and 20% for validation. Orbits different from those used for obtaining training samples were used to test the trained networks.
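The labeling conditions and the 80/20 split described above can be written compactly. The sketch below is illustrative Python, not the authors' code: the function and parameter names are hypothetical, a null RFI level (fully blanked footprint) is represented by `None`, and the shuffled random split is an assumption because the paper does not say how images were assigned to the two subsets.

```python
import random

RFI_LEVEL_THRESHOLD_K = 5.0    # "RFI" if the estimated RFI level exceeds 5 K
CLEAN_LEVEL_THRESHOLD_K = 2.0  # "no RFI" requires an RFI level below 2 K
BLANKED_THRESHOLD = 0.5        # fraction of the spectrogram blanked or flagged


def label_footprint(rfi_level_k, blanked_fraction, in_quiet_region):
    """Return "RFI", "no RFI", or None when the footprint is not used.

    rfi_level_k is None when the filtered footprint is null (fully blanked).
    """
    if blanked_fraction > BLANKED_THRESHOLD:
        return "RFI"
    if rfi_level_k is not None and rfi_level_k > RFI_LEVEL_THRESHOLD_K:
        return "RFI"
    if (in_quiet_region
            and rfi_level_k is not None
            and rfi_level_k < CLEAN_LEVEL_THRESHOLD_K
            and blanked_fraction < BLANKED_THRESHOLD):
        return "no RFI"
    return None


def split_train_validation(items, train_fraction=0.8, seed=0):
    """Shuffle and split items into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# 2507 "RFI" plus 2507 "no RFI" images, as in the paper.
image_ids = list(range(2507 * 2))
train_ids, validation_ids = split_train_validation(image_ids)
```

Footprints that satisfy neither condition (for example an RFI level between 2 K and 5 K) are left unlabeled in this sketch.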

B. Training the Network
CNNs consist of convolution and pooling layers that extract features from the image inputs, followed by a classification layer that uses the features to classify the input image. In transfer learning, most of the convolutional part is unchanged and a new classifier replaces the classification layer. The last learnable layer was replaced with a fully connected layer with the number of outputs equal to the number of classes in the input data set. In this case, the number of classes is two since the objective is to classify images as "RFI" or "no RFI". The first layer of the network is the image input layer, which specifies the image size. The training images were resized, and additional augmentation was applied to prevent the network from overfitting. Random translations and reflections of the spectrograms were performed, which represent RFI shifted in time and frequency. Once training was completed, accuracy was determined by classifying the validation images with the trained network.
In this study three well-known CNN architectures, AlexNet, GoogleNet and ResNet-101, were analyzed for classification precision and training time. These pre-trained networks, which have already learned to extract features from images of nature, were used as a starting point to learn the new task of RFI detection using SMAP data as inputs. Transfer learning was first done using AlexNet since it is one of the faster networks. Once the settings for training were established, the other networks, which are known to be more accurate but have longer prediction times, were used to check for better accuracy and to see whether RFI detection results improved.
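The augmentation step can be illustrated with a short sketch. This is plain Python rather than the Matlab tooling used in the study, and the shift range, the zero fill for pixels shifted in from outside the image, and the 50% flip probability are all assumptions chosen for illustration. Rows are taken to be time samples and columns frequency channels.

```python
import random


def translate(image, shift_rows, shift_cols, fill=0.0):
    """Shift a 2-D list by whole pixels, padding exposed pixels with fill."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + shift_rows, c + shift_cols
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = image[r][c]
    return out


def reflect(image, flip_time=False, flip_freq=False):
    """Mirror the image along the time (row) and/or frequency (column) axis."""
    out = [list(row) for row in image]
    if flip_time:
        out = out[::-1]
    if flip_freq:
        out = [row[::-1] for row in out]
    return out


def augment(image, max_shift=2, rng=None):
    """Apply one random translation followed by random reflections."""
    rng = rng or random.Random()
    shifted = translate(
        image,
        rng.randint(-max_shift, max_shift),
        rng.randint(-max_shift, max_shift),
    )
    return reflect(shifted, rng.random() < 0.5, rng.random() < 0.5)


sample = [[1, 2, 3], [4, 5, 6]]
augmented = augment(sample, max_shift=1, rng=random.Random(42))
```

Because a narrowband or pulsed interferer keeps its shape under these operations, each augmented image remains a plausible example of the same class.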

III. EXPERIMENTS
The general hypothesis is that the object-recognition capabilities of the CNN will detect RFI features in the spectrogram amidst the random variations of natural thermal emission, much like detecting a specific object in a busy image. The RFI-free images over uniform scenes are essentially random fields with normally distributed values. Spectrograms with RFI tend to have features concentrated in time or frequency and in some cases have broad time-frequency features that can mimic geophysical features such as coastlines. Non-uniform scenes, for example coastlines, impose a systematic variable background on the spectrogram over time, which, as it turns out, can be mistaken for RFI. Therefore, two groups of experiments with different training data sets were run. The first group included RFI-free coastlines in the training set. The second group excluded RFI-free coastlines from the training set and concentrated on using RFI-free data from uniform scenes. Figure 2 shows some examples of RFI-free (a-c) and RFI-contaminated (d-f) footprints. Including coastlines in the RFI-free training set (e.g., Figures 2(b) and (c)) can result in missed detections of RFI such as that in Figure 2(f) because it is similar to (c). On the other hand, training the CNN using data with RFI such as that shown in Figure 2(f) can produce false alarms along coastlines. To complete the experiment, the two trained networks were then tested with data taken from orbits over Europe and the Middle East to test RFI detection performance.

A. Accuracy of the trained CNNs
In this section, the performance of the trained CNNs in identifying RFI is presented. Table 1 shows the results of a comparative analysis with respect to training time and the accuracy using validation data.

B. Test Orbits
The orbits over Europe and the Middle East were chosen as test cases for the trained networks, based on the types of RFI seen in these locations of the world. The pass over Europe consisted mostly of pulsed RFI, which is sparse in time and frequency, while the RFI over the Middle East was mostly wideband and more persistent in time and frequency. Figure 3 shows the spectrograms of sample footprints from these orbits. The example spectrograms over Europe show RFI that is narrowband and pulsed, wideband pulsed, and narrowband continuous. The spectrograms over the Middle East contain wideband RFI concentrated over a large portion of the footprint, with one RFI example (Figure 3(g)) that looks very similar to a coastline feature. The feature in 3(g) was identified as RFI-contaminated since multiple detectors in the SMAP algorithms flagged several pixels and the footprint was anomalously high in brightness temperature. This highlights the problem the deep learning detector faces when identifying RFI-contaminated images versus RFI-free geophysical features.
Parts of both orbits with RFI concentration were tested using the trained networks, and the RFI detection results were compared to SMAP RFI detection. Since the SMAP ground processing contains numerous detectors with a combined false alarm rate of ~6% [13], the same conditions used for creating the RFI contaminated footprints for training data were used as the conditions for positive SMAP RFI detection. Thus, if a footprint had an RFI level greater than 5 K, or the SMAP algorithms blanked more than 50% of the spectrogram, then that footprint was positively identified as RFI contaminated and compared to the binary results from the trained networks.

C. RFI Detection Results
For the Europe orbit, 19405 footprints were tested and 4088 footprints fit the SMAP detection criteria. For the orbit over the Middle East, 98467 footprints were tested and 3113 footprints were detected by the SMAP detection. The agreement rates for the Europe orbit in Table 3 indicate that the deep learning algorithm has high performance for detecting pulsed and narrowband RFI. The networks showed lower performance for detecting RFI over the Middle East, where wideband signatures are more prevalent. However, the detection agreement increased when coastlines were excluded from the RFI free training data set. This increase was more prominent for the test orbit over the Middle East, where AlexNet showed the highest detection agreement with SMAP detection.

D. Europe Orbit
Figure 4(a) shows the horizontal polarization brightness temperatures for part of the orbit over Europe that was tested. RFI shows up as hot spots throughout the image. Any value greater than 330 K can automatically be considered as RFI since this is the geophysical limit for brightness temperature measurements. The results of the deep learning algorithm (with coastlines in the RFI free training data set) are shown in Figure 4(b). The RFI detected footprints were omitted and appear grey in the image. The results compared to SMAP detection are shown in Figure 4(c). The red pixels indicate deep learning and SMAP detection agreement of RFI contaminated footprints, with grey showing agreement of RFI free footprints. Blue shows deep learning detection but no SMAP detection, and yellow shows SMAP detected footprints not detected by deep learning. Of the 19405 footprints tested in this orbit, 4088 fit the SMAP detection criteria stated in Section II A. The deep learning algorithm has a 97.58% (3989 footprints) agreement with these SMAP detected footprints, depicted as red in Figure 4(c). The deep learning algorithm detected an additional 3418 footprints as RFI contaminated, which are blue in Figure 4(c).
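The four comparison categories and the agreement percentage can be reproduced with a few lines of code. The sketch below is illustrative Python with hypothetical names; only the counts come from the text (the 99 SMAP-only footprints follow from 4088 minus 3989), and the ordering of the synthetic flags is arbitrary.

```python
def compare_with_smap(smap_flags, dl_flags):
    """Count agreement categories between two equal-length boolean sequences.

    both      -> red in Figure 4(c): detected by SMAP and by deep learning
    dl_only   -> blue: detected by deep learning only
    smap_only -> yellow: detected by SMAP only
    neither   -> grey: RFI free according to both
    """
    if len(smap_flags) != len(dl_flags):
        raise ValueError("flag sequences must have the same length")
    counts = {"both": 0, "dl_only": 0, "smap_only": 0, "neither": 0}
    for smap, dl in zip(smap_flags, dl_flags):
        if smap and dl:
            counts["both"] += 1
        elif dl:
            counts["dl_only"] += 1
        elif smap:
            counts["smap_only"] += 1
        else:
            counts["neither"] += 1
    smap_total = counts["both"] + counts["smap_only"]
    counts["agreement_percent"] = (
        100.0 * counts["both"] / smap_total if smap_total else float("nan")
    )
    return counts


# Synthetic flags that reproduce the Europe orbit counts quoted in the text.
smap = [True] * 4088 + [False] * (19405 - 4088)
dl = [True] * 3989 + [False] * 99 + [True] * 3418 + [False] * 11899
europe = compare_with_smap(smap, dl)
```

Agreement is measured against the SMAP-detected footprints only, so a large number of deep-learning-only detections does not lower this percentage.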
Not all the additional footprints were false detections. Manual inspection revealed low-level pulsed sources of less than 5 K. Some were false alarms. Since SMAP has a false alarm rate (FAR) of ~6%, about eight of the 128 pixels in each spectrogram are flagged on average by the SMAP detection algorithms in the absence of RFI. Deep learning detected 256 footprints (1.3% of total tested footprints) that had 10 or fewer pixels detected by the SMAP algorithms. These footprints can be considered false alarms. When AlexNet (no coastlines) was used for testing, agreement with SMAP detection increased to 98.85%, but the number of footprints detected as possible false alarms also increased to 1053, or 5.4% of the tested footprints.
Figure 5 shows the results of a similar analysis over the Middle East. Using the trained AlexNet network, the deep learning algorithm has 88.31% (2749 footprints) agreement with SMAP detection (3113 footprints). The agreement increased to 92.23% (2871 footprints) when AlexNet (no coast) was used. The number of false alarm detections also increased from 551 (0.6%) to 4470 (4.54%) footprints. See Table 4. When Figure 5(c) is examined, there appear to be footprints detected by the deep learning algorithm concentrated along the coasts that do not match SMAP detection. Random mismatches occur throughout the orbit as well. There are 8119 additional footprints (shown in blue) detected by deep learning that do not match the SMAP detection criteria imposed. Footprints that appear as yellow in Figure 5(c) are deep learning missed detections.
These test cases indicate that the deep learning algorithm is directly affected by the input training data. The deep learning algorithm provided very good detection for RFI that is localized in time or frequency, since such RFI data has very distinct features when compared to RFI free data.
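The false alarm figures quoted above follow from simple arithmetic, shown here as a short Python check. The constant and function names are hypothetical, and treating the ~6% combined false alarm rate as a per-pixel rate over the 8 x 16 spectrogram is the interpretation implied by the text.

```python
TIME_SAMPLES = 8
FREQ_CHANNELS = 16
SMAP_FALSE_ALARM_RATE = 0.06  # combined rate of ~6% quoted from [13]

# Expected number of pixels flagged per spectrogram when no RFI is present.
pixels_per_spectrogram = TIME_SAMPLES * FREQ_CHANNELS
expected_false_flags = SMAP_FALSE_ALARM_RATE * pixels_per_spectrogram


def percent_of(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


EUROPE_FOOTPRINTS = 19405
MIDDLE_EAST_FOOTPRINTS = 98467

europe_with_coast = percent_of(256, EUROPE_FOOTPRINTS)
europe_no_coast = percent_of(1053, EUROPE_FOOTPRINTS)
middle_east_with_coast = percent_of(551, MIDDLE_EAST_FOOTPRINTS)
middle_east_no_coast = percent_of(4470, MIDDLE_EAST_FOOTPRINTS)
```

The expected count of 7.68 pixels rounds to the "about eight" stated in the text, which motivates treating footprints with 10 or fewer SMAP-flagged pixels as likely false alarms.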
This study produced a lower detection rate for broadband RFI, as indicated by the results of the Middle East test orbit. This orbit contained RFI that was mostly broadband in nature, including RFI with characteristics similar to geophysical features. This led to either more missed detections or more false alarms, depending on whether coastlines were included in the input training data.

V. APPLICATION TO A NEW INSTRUMENT
In this section the SMAP-trained networks were used to detect RFI in data produced by a second, unrelated instrument. Over the period of August 2-6, 2017, a reflectometry experiment took place at Platform Harvest, located about 10 km off the coast of central California [33]. Direct and reflected signals from DirecTV, a U.S.-based direct broadcast satellite (DBS) service provider, were measured with a K-band system [34]. The K-band system included an RF front end with two commercial reflector antennas and low-noise block downconverters, and a digital back end designed to collect 200 MHz of bandwidth from 18.6 to 18.8 GHz. Of particular interest are ocean reflections that can affect sensors in this frequency range. The Global Precipitation Measurement (GPM) Microwave Imager (GMI) has seen RFI caused by ocean reflections from DBS signals at 18.7 GHz, which is a shared allocated band [35].
Figure 6 shows the normalized power spectral density of the direct and reflected DBS signals measured by the K-band system, with interference-to-noise ratios (INRs) estimated at 10 dB and -8 dB respectively [34]. The INR is relatively high; thus the deep learning detector is expected to work well. Figure 7 shows a time slice of the normalized variance of the direct and reflected signals after spectral subbanding. The reflected signal retains a similar shape to the direct signal but the power level is much lower. The time-frequency power for the reflected signal was converted to spectrogram images as was done with SMAP data in this study.
The K-band digital back end provided data products including power for 16 spectral subbands for both direct and reflected signals with a time resolution of approximately 328 μs. Spectrograms were created using 11 (328-μs samples) x 16 (12.5-MHz channels) for an elapsed time of 3.6 ms and total bandwidth of 200 MHz, which approximately matches the total integration time and bandwidth of the GMI 18.7 GHz channel [36].
Approximately 70 seconds of data were used to create 19194 images for test. Data without RFI was obtained by placing absorber in front of the antenna of the K-band system. Spectrograms for the no RFI case were created similarly, with 10 seconds of absorber data providing 2636 images.
A spectrogram with the reflected DBS signal is shown in Figure 8(a), which shows distinct features of the RFI occurring continuously and concentrated in frequency. Figure 8(b) is an example of spectrogram data without RFI taken by the K-band system. The SMAP-trained networks in Table 3 were used to test these RFI contaminated and RFI free images. All the networks classified 100% of the images with reflected DBS signals as RFI contaminated. This is not surprising since the DBS signal is easily identifiable in the images. Half of the networks classified 100% of the non-contaminated images as RFI free. The other networks falsely classified 7 (0.27%) or fewer of the RFI free images as RFI contaminated. See Table 5 for the results of this classification. This classification experiment demonstrates that networks can be trained with data from one sensor or several data sources and used to detect RFI in other data sources. The deep learning algorithm learned features of different types of RFI seen in SMAP data and was able to correctly detect these features in the K-band system dataset.
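The spectrogram dimensions and image counts for the K-band data can be checked with a short sketch. This is illustrative Python with hypothetical names; cutting the record into consecutive, non-overlapping 11-sample windows is an assumption, since the text does not state how the windows were taken.

```python
SAMPLE_PERIOD_S = 328e-6   # approximate time resolution of the subband power
SAMPLES_PER_IMAGE = 11
SUBBANDS = 16
SUBBAND_WIDTH_MHZ = 12.5

image_duration_s = SAMPLES_PER_IMAGE * SAMPLE_PERIOD_S   # about 3.6 ms
total_bandwidth_mhz = SUBBANDS * SUBBAND_WIDTH_MHZ       # 200 MHz


def chunk_into_spectrograms(power_rows, samples_per_image=SAMPLES_PER_IMAGE):
    """Cut a time series of subband power rows into fixed-length spectrograms.

    Assumption: windows are consecutive and non-overlapping; any incomplete
    window at the end of the record is dropped.
    """
    n_images = len(power_rows) // samples_per_image
    return [
        power_rows[i * samples_per_image:(i + 1) * samples_per_image]
        for i in range(n_images)
    ]


# Record lengths implied by the image counts reported in the text.
rfi_record_s = 19194 * image_duration_s      # reported as about 70 seconds
absorber_record_s = 2636 * image_duration_s  # reported as about 10 seconds

# Small synthetic record: 25 time samples of 16 subband powers each.
record = [[float(t)] * SUBBANDS for t in range(25)]
images = chunk_into_spectrograms(record)
```

Under this assumption the reported image counts correspond to roughly 69.3 s and 9.5 s of data, consistent with the approximate durations given above.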

VI. CONCLUSION
In this paper, a transfer learning approach for RFI detection using SMAP detection was presented. In this approach the last few layers of pre-trained networks were replaced and retrained with a smaller dataset. The retrained networks were then used to classify images from two test orbits of SMAP data. Three well-known pre-trained CNNs, AlexNet, GoogleNet and ResNet-101, were investigated for accuracy and RFI detection. All networks provided very similar accuracy results for the validation data sets; however, AlexNet provided the best results when used for RFI detection on other test orbits.
The algorithm had high performance for detecting RFI localized in time or frequency and lower performance for broadband RFI. RFI free data is usually broadband and varies smoothly over long time scales, while RFI can appear as high-intensity pixels localized in time/frequency data. RFI can also have broadband features and vary over time scales longer than a footprint. The results show that classification is highly dependent on the type of input data. Training data that included geophysical features such as coastlines produced more missed detections of broadband RFI, while training data without them led the networks to sometimes mistake coastlines for RFI.
The trained networks were also used to detect DBS signals measured by a K-band system, demonstrating that the trained networks can identify RFI in data from other systems. The work in this paper presents the initial step toward applying deep learning for detection of RFI in radiometer data. To provide better detection performance for all types of RFI, the training data set requires more input training data, representative of all types of RFI and geophysical features. To provide better detection, especially for RFI with broadband features, the pre-trained network can be retrained with a data subset with improved ground truth such as simulated data. Future work includes the development of a comprehensive set of simulated and real-world RFI and RFI-free spectrograms with varying RFI resolution and INR to train a network from scratch. Performance factors to be evaluated include detection capability and energy efficiency. The use of CNNs to detect RFI is an attractive alternative to conventional techniques given the large amount of radiometer data that will exist in the coming decades.
AlexNet contains 25 distinct layers, including 5 convolutional layers and 3 fully connected (FC) layers, with a rectified linear unit (ReLU) activation function applied after every convolutional and FC layer. Layer 1 is the input layer to which the images are fed. Layers 2-22 are the convolution, ReLU and max pooling layers where feature extraction occurs. The last 3 layers consist of the FC layer that maps the extracted features to each of the 2 output classes in this experiment (RFI, no RFI), followed by the softmax layer, which assigns a probability to the input image for each output class, and lastly the classification layer, which returns the output class of the input image. The first two FC layers have 4096 neurons and the third has 1000. GoogleNet and ResNet-101 are 22 layers and 101 layers deep respectively. Both contain an FC layer of 1000 neurons. The input image size is 227x227x3 for AlexNet and 224x224x3 for the other two CNNs used. During the transfer learning process, only the last 3 layers were modified to suit the RFI classification problem.
Figure 2(a) shows a natural, uniform scene, and (b) and (c) are examples of non-uniform coastal crossings. Figure 2(d) shows RFI continuous in time and limited in frequency; Figure 2(e) shows an RFI signal chirped in frequency, and (f) shows RFI with less distinct time-frequency definition.

Figure 2: (a), (b) and (c) are examples of RFI-free images over Australia, where (a) is over land and (b) and (c) are coastline footprints. The bottom images, (d), (e) and (f), are examples of RFI contaminated footprints over Japan, the United Kingdom and Spain respectively.

Figure 3: Spectrograms from the orbit over Europe showing typical RFI in this pass (a), (b) and (c). The RFI is either sparse in time or frequency. Spectrograms from the orbit over the Middle East, (e), (f) and (g), indicate wideband RFI.

Figure 4: Horizontal polarization brightness temperature for the test footprints over Europe (a). The color scale was limited to values from 180 K to 300 K. Footprints with brightness temperature equal to or lower than 180 K appear dark blue and those that are 300 K and above are dark red. Footprints detected by the deep learning algorithm using AlexNet (b) were removed from the images. A comparison of SMAP detection and deep learning is shown in (c).

Figure 6: Power spectral density of the normalized baseband signals measured with the K-band recording system deployed at Platform Harvest.

Figure 7: Normalized variance of the baseband signals after spectral subbanding measured with the K-band recording system deployed at Platform Harvest.

Table 1: Results showing the accuracy for each network as well as the training times. Each network was trained twice: Dataset 1 contained coastlines in the RFI free images while Dataset 2 did not.

Table 2: Results showing the confusion matrix for each network for the validation data set. The true class indicates the true classification and the predicted class is that of the trained network. Dataset 1 contained coastlines in the RFI free images while Dataset 2 did not.
Each network was trained with 80% of the labeled training dataset and the remaining 20% was used to test for accuracy. The results show that AlexNet had the shortest training time of the three and ResNet-101 took the longest. Each network was trained twice, using datasets with and without coastlines in the RFI free images. The training times were slightly shorter using the second dataset and accuracies were slightly better for GoogleNet and ResNet-101. All networks in both training cases achieved over 96% accuracy. Table 2 shows the confusion matrix for each trained network. The lower left and upper right off-diagonal entries represent the probability of false alarm and the probability of missed detection respectively. In order to test these networks for RFI detection, an orbit over Europe and another over the Middle East were used. These orbits were different from the orbits used for the training data extraction.
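The quantities read from a two-class confusion matrix can be computed as in the sketch below. This is illustrative Python with hypothetical names and synthetic labels, not results from the paper. It assumes that "RFI" is the positive class, that a false alarm is an RFI free image classified as RFI, and that a missed detection is an RFI image classified as RFI free; the exact row and column layout of the published matrices is not restated here.

```python
def confusion_counts(true_labels, predicted_labels, positive="RFI"):
    """Return (tp, fp, fn, tn) for a two-class problem."""
    tp = fp = fn = tn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive and predicted == positive:
            tp += 1
        elif truth != positive and predicted == positive:
            fp += 1  # false alarm
        elif truth == positive and predicted != positive:
            fn += 1  # missed detection
        else:
            tn += 1
    return tp, fp, fn, tn


def detection_rates(tp, fp, fn, tn):
    """Accuracy, probability of false alarm and probability of missed detection."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "p_false_alarm": fp / (fp + tn),
        "p_missed_detection": fn / (fn + tp),
    }


# Synthetic validation labels for illustration only.
truth = ["RFI"] * 5 + ["no RFI"] * 5
predicted = ["RFI"] * 4 + ["no RFI"] + ["RFI"] + ["no RFI"] * 4
tp, fp, fn, tn = confusion_counts(truth, predicted)
rates = detection_rates(tp, fp, fn, tn)
```

With balanced classes, as in the 2507/2507 training set, accuracy is one minus the average of the false alarm and missed detection probabilities.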

Table 3: Results showing the detection agreement with SMAP detection for each network.
Table 3 shows how well the trained networks detected RFI over the two test orbits compared to SMAP detection. All the networks had similar detection agreement rates for the Europe orbit.

Table 4: Results showing the footprints considered false alarms for retrained AlexNet with different input training sets.

Table 5: Results showing the classification of RFI free images from the K-band system.