Flash Flood Detection From CYGNSS Data Using the RUSBoost Algorithm

Flash floods can cause massive damages because of their rapid evolution. To reduce or prevent harm caused by a flash flood, it is vital to have information about its formation and spread. Hence, providing real-time surveillance flood is essential. Considering Hurricane Harvey and Hurricane Irma as two case studies, six different data preparation approaches (DPAs) for flood detection based on the Cyclone Global Navigation Satellite System (CYGNSS) data and the Random Under-Sampling Boosted (RUSBoost) classification algorithm are investigated in this article. Taking flood and land as two classes, flash flood detection is tackled as a binary classification problem. Eleven observables are extracted from the delay-Doppler maps (DDMs) for feature selection. These observables, alongside two features from an ancillary data, are considered in feature selection. All the combinations of these observables with and without ancillary data are fed into the classifier with 5-fold cross-validation one by one. Based on the test results, five observables with the ancillary data are selected as a suitable feature vector for flood detection here. Using the selected feature vector, six different DPAs are investigated and compared to find the best one for flash flood detection. Then, the performance of the proposed method is compared with that of a support vector machine (SVM) based classifier. For Hurricane Harvey and Hurricane Irma, the selected method is able to detect 89.00% and 85.00% of flooded points, respectively, with a resolution of $500 \, \mathrm {m} \times 500 \, \mathrm {m}$ , and the detection accuracy for non-flooded land points is 97.20% and 71.00%, respectively.


I. INTRODUCTION
A flash flood is a surge of water that starts and develops in a short period. The primary cause of flash flood is heavy rain. Additionally, dam breakage, ice and snow meltdown, and events in which a large amount of water is released to dry areas can also cause flash flood. Even though a flash flood dissipates quickly after the occurrence, it has consequential damages such as death and severe injuries, water contamination, financial harms, infrastructure damages, and agricultural losses [1], [2]. Hurricanes, which are a significant cause The associate editor coordinating the review of this manuscript and approving it for publication was Jon Atli Benediktsson . of flash floods, are tropical cyclones with high wind speed (higher than 33 ms −1 [3]) and capable of pouring massive rain over coastal regions during landfall [4]. Considering the population growth in coastal areas that are exposed to hurricanes, flood detection and monitoring its extent are important to reduce these damages and increase the speed of post-disaster response [5].
Being able to monitor the surface of Earth continuously, remote sensing technologies have been used as reliable resources for flood detection. In order to observe the extent of floods, different methods based on active and passive remote sensing satellite systems operating at various frequencies have been applied [6]- [9]. Different electromagnetic waves within the visible spectrum interact differently with water bodies, depending on their wavelengths. For instance, blue bands penetrate the water, while red bands are partially absorbed and near-infrared bands are fully absorbed. Therefore, by defining certain thresholds, the water bodies, including flash floods, can be detected using optical sensors [10]. The detected water extent are then compared with water reference data sets to estimate floods extent maps [8]. The water bodies detection algorithms are able to detect the surface water including floods with an accuracy higher than 97% [11], [12]. However, in an optical image, the cloud shadows are classified as flood. Therefore, the cloud shadows are the main challenges for flood extent estimation using optical sensors [12]. Active microwave satellites such as synthetic aperture radar (SAR) systems work in day or night providing high spatial resolution data. They are able to see through obstacles such as clouds and certain biomass. Similar to any other classification problem, creating floods extent maps from SAR data can be solved by using supervised and unsupervised methods [13]. In a supervised method, since, the classifier is trained with labeled pixels from a region, the algorithm has local dependence. Segmentation [14], threshold determination [15], and change detection [16] are three main methods for unsupervised classification. Even though these methods are able to detect floods effectively, they have drawbacks. The segmentation method requires heavy computations compared with the other two methods. Moreover, since it ignores small flooded clusters surrounded by large non flooded ones and vice versa, it is less precise [13]. In the threshold determination method, instead of a single threshold value, multiple threshold values are considered for detecting floods on a large scale [15]. Therefore, its accuracy is highly dependant on how accurate the preset threshold values are. In the change detection method, prior-and post-flood SAR images are required, which is a big challenge due to the revisit time of SAR systems [16]. Therefore, in recent works, combinations of these techniques are used [17], [18]. Depending on the region, the SAR based flood detection algorithms are able to detect floods with an accuracy ranging between 80 % to 95 % [19]- [21]. The SAR data requires geometric correction and speckle reduction. Hence, compared to passive microwave and optical sensors, the retrieval algorithms based on SAR data are more complicated [13]. Moreover, since in active SAR systems such as RADARSAT-1/2, TerraSAR-X, and Sentinel-1 the transmitter and receiver are placed on the same platform, obtaining a large constellation is costly and their constellations are usually small [22]. Thus, due to the low temporal resolution (several days), satellite might not even be able to collect data over a flash flooded area in time. Some of these remote sensing data and algorithms have been used by observatories to create near real-time (NRT) flash floods information. For example, the Dartmouth Flood Observatory (DFO) and the NASA Goddard's Hydrology Laboratory employ the data collected by two MODIS sensors (aboard the satellites Terra and Aqua) for flood monitoring [8]. By computing the MODIS reflectance ratio of Band 1 (red) and Band 2 (near-infrared) as well as a threshold on Band 7 (shortwave infrared) to estimate water extent and comparing with reference data, they determine the flash flooded areas [23]. Also, they employ microwave sensors data to mitigate the cloud effect to increase flood detection accuracy [8]. The Global Flood Detection System (GFDS) uses AMSR-E passive microwave remote sensing data to detect riverine flooding globally. In this system, the value of calibrated surface brightness is compared with a threshold to detected riverine inundations [9].
The Global Navigation Satellite System Reflectometry (GNSS-R) is a well-established technique for remote sensing [24]. The GNSS-R receivers collect the Global Navigation Satellite System (GNSS) signals reflected from the surface of the Earth in a bistatic radar configuration. Since it takes advantage of existing signals of opportunity, a GNSS-R system does not require any onboard transmitter. Hence, they are cost-efficient, which makes it possible to achieve a larger constellation and, consequently, high temporal resolution (hours) [25]. Since the GNSS-R receivers operate at L-Band, they can see through clouds what exist when most flash floods happen. Therefore, with a large operational constellation, the GNSS-R data is ideal for flash flood remote sensing. The GNSS-R has shown a great capacity for various applications such as altimetry [26], sea surface wind [27]- [31], soil moisture (SM) [32]- [34], target detection [35], tsunami [36], [37], sea ice [38]- [43], inland water detection [44], and seasonal flood classification [45]. However, its application for flash floods detection is yet to be investigated.
Among GNSS-R data sets collected by different GNSS-R receivers, only the Cyclone Global Navigation Satellite System (CYGNSS) GNSS-R data, which is collected by the constellation of eight satellites [46], was shown to have the potential for capturing the variations of surface reflectivity caused by flash floods over affected regions [47], which are consistent with the changes in precipitation data and brightness temperature data [48]. Unlike seasonal floods, which are developed in specific periods, and they last over more extended time [49], the flash floods are difficult to monitor. Therefore, it is vital to develop a method that can detect and monitor flash floods. To the best of the authors' knowledge, the capacity of the GNSS-R technique for flash flood detection has not been quantitatively analyzed and this article is the first one that investigates such a topic by using the CYGNSS GNSS-R data. Therefore, the objective of this work is to quantitatively investigate the ability of the GNSS-R technique to detect flash floods. The flash flood detection problem is a binary classification problem with two classes (flood and land). Various ML algorithms can be implemented for solving a binary supervised classification problem, such as the Neural Networks (NN), Super Vector Machines (SVMs), and Decision Trees, which are among the most commonly used classifiers in remote sensing [50]. By combining decision trees as basic classifiers, a classifier that outperforms the constituent classifiers is created, which is called an ensemble classifier. Stacking, blending, bagging, and boosting are four main approaches for creating an ensemble classifier [51].
Since floods are usually localized, when a large area is considered, the number of points collected over flooded regions is smaller than those obtained from the non-flooded one, which creates an imbalanced data set. For instance, the flood labeled data points are only 4.38 % and 8.37 % of the Harvey and Irma data sets, respectively. These two imbalanced factors are calculated as the ratios of the number of flood labeled data points (m f ) to the total number of data points (m) in each data set. In an imbalanced data set, information provided by the minor class is considered less important due to the unequal ratio between major and minor classes. However, the minor class results could be more vital at higher costs, despite its smaller size. Various strategies for tackling imbalanced data sets have been developed [52]. At the data level, the leading solutions for handling imbalanced data include cost-sensitive learning and data sampling. In cost-sensitive learning, each class is assigned with a misclassification cost and the goal is to minimize the overall misclassification cost instead of maximizing the accuracy of the model [52]. In data sampling, by creating new instances in the minor class (oversampling methods) or eliminating instances from the major class (under-sampling methods), the imbalanced data becomes balanced [52]. The Synthetic Minority Oversampling Technique (SMOTE) and the Adaptive Synthetic Sampling Method (ADASYN) are two renowned oversampling methods, in which synthetic instances are generated from existing instances in the minor class [53], [54]. As a powerful tool, the Generative Adversarial Network (GAN) is another method for creating artificial instances in the minor class [55]. In such a method, two neural networks compete to optimize their objective functions that are contradictory to each other [56]. The Random Under-Sampling (RUS) is an under-sampling method that balances the data via random elimination of instances from the major class [52]. The balancing techniques are applied to different classifiers, such as ensembles methods, leading to various developed algorithms for classifying imbalanced data [52]. Among different methods developed for classifying imbalanced data, in this study, the Random Under-Sampling Boosted (RUSBoost) algorithm is selected for classification due to its efficient computational time, accuracy, and widely available resources [57]- [59]. Moreover, the support vector machines (SVM) algorithm, which is a representative ML method, is implemented for comparison purpose.
In this article, six different data preparation approaches (DPAs) for flood detection with high-resolution (500 m × 500 m) are investigated using the RUSBoost based algorithm. After comparison, the best technique is selected for flash flood detection using CYGNSS data. The performance of the proposed method is compared with that of a SVM based classifier. The contributions of this study are as follows 1) The first method for detecting flash floods using the GNSS-R technique is proposed. 2) Based on the eleven different CYGNSS observables and an ancillary data set, a suitable feature vector for flash flood detection is determined. 3) Six different DPAs for detecting flash floods using the CYGNSS data are investigated, showing Approach 3 as the best one.
This study is a well-detailed follow-up of what has been done in [60]. This work is outlined as follows: Section II introduces employed data sets. In Section III, eleven different CYGNSS observables, and the RUSBoost-based and SVM-based algorithms are described. Section IV provides the results of feature selection, flood detection, and comparison with the SVM classifier. Conclusions are provided in Section V.

II. DATA SETS A. CYGNSS
In this work, we employed level 1 V2.1 of the CYGNSS data [61] that are available for the public through [62].
In the CYGNSS constellation, each satellite is an along-track scanner which collects the GNSS reflected signal in the direction of the satellite passing over a region with an onboard GNSS-R payload. Hence, when a disaster occurs in a few days (5 to 10 days), the CYGNSS receivers are only able to cover a portion of the flooded area and for some areas, there is no data. Considering this limitation, among all the floods that have happened since 2016 (the year CYGNSS was launched) to 2019, we considered two significant events, Hurricane Harvey and Hurricane Irma. These two hurricane events are among the harshest and costliest ones that have affected the United States significantly [63].
Hurricane Harvey reached the coast of the USA on Aug. 25th, 2017, and according to media, the inundation lasted till Sep. 8th, 2017. Hurricane Irma landed the coast of the USA on Sep. 10th, 2017 and caused a 6-day flood. The affected areas of Hurricane Harvey and Hurricane Irma are located in geographic coordinates of [26.7  W], respectively. Since Hurricane Harvey affected a larger area compared to Hurricane Irma, it has more data points. In other words, the data of Irma might not be enough to fully train a classifier, which may lead to an underfitted model. Therefore, in Section IV-A, for feature selection, both data sets are combined and used for classification by a 5-fold cross-validation evaluation. Furthermore, in Section IV-B, we decided to use 50% of the Harvey data for training and then validate the trained classifier with the remaining 50%. This trained classifier is then tested with the data of Irma, which is unknown to the machine.
The CYGNSS constellation collects the reflected GPS L1band signals and converts them into DDMs. A DDM is a projection of the scattered signal from the surface around a specular point (SP). In the CYGNSS data set, each SP has a 17 delay bins in 11 Doppler bins DDM with a delay bin equals 249.4 ns and a Doppler bin equals to 500 Hz.

B. ANCILLARY DATA
In current work, the flood maps created by the DFO are used as reference data for training and validation. The DFO is a remote sensing research lab of Institute of Arctic and Alpine Research (INSTAAR), at the University of Colorado Boulder. As a part of the Global Disaster Alert and Coordination System (GDACS) project, they create and provide flood maps using data from multiple sources, including NASA MODIS, ESA Sentinel 1, ASI Cosmo SkyMed, Copernicus Sentinel 1, and Radarsat 2 [8]. In this work, the regions impacted by Hurricane Harvey and Hurricane Irma are considered as two case studies, one flood map for each event is obtained from the DFO geographic information system (GIS) data. The GIS data of Hurricane Harvey and Hurricane Irma and more details on them are available through [68], and [64], respectively.
Since the water tends to move to places at low altitudes, the elevation data can impact the accuracy of classification. Therefore, altitude data of the Shuttle Radar Topography Mission Digital Elevation Model (SRTM90m DEM) is employed as an ancillary data [65]. This data set alongside the extracted GNSS-R observables are used as the input for training and testing of the classifier.
The flood reference map is created based on the changes of the surface during the flood. However, areas with water bodies such as permanent waters and some regions of wetlands might have similar characteristics to flood, but SPs over such regions could be detected as either flood and land. One solution is to exclude the points that are located over such areas. Therefore, for excluding such data points, the Global Wetland V3 data provided by the Center for International Forestry Research (CIFOR) [66] and Global Surface Water (GSW) Occurrence data [12], [67], are used. The CIFOR Global Wetland data set indicates the distribution of wetland, peatland and peat depth that covers the tropics and subtropics. This data set is created using products from the MODIS sensors, the phased array type L-band synthetic aperture radar (PAL-SAR) data, and other ancillary data sets [69]. Even though this data set is not validated due to the unavailability of ground truth, it agrees well with other commonly used data sets [69]. The GSW data set is generated based on optical images collected by Landsat [12]. The GSW Occurrence data shows the extent of permanent water from 1984 to 2019. Hereafter, we refer to the Global Wetland CIFOR and GSW Occurrence data sets as CIFOR and GSW, respectively. The key parameters of the employed datasets are listed in Table 1. Although their accuracies were not available, the CIFOR and CYGNSS data sets are benchmark data sets that have been widely adopted for analysis in literature.
In this work, various georeferenced data sets with different spatial resolutions are considered. Therefore, a comprehensive approach for matching the flood reference and ancillary data to each GNSS-R data point must be taken. Similar to other GNSS-R systems, the coherent footprint of CYGNSS is dynamic. In [45], it has been shown that DDMs can be gridded into cells of size 500 m × 500 m. Following the literature, in this study, we assume that the DDM of each SP represents a 500 m × 500 m region around it. Therefore, for assigning a flood/land label to an SP, the number of flood pixels of the reference flood map within an area of 500 m × 500 m around the SP is counted. When the percentage of flood pixels around the SP is higher than 75%, it is labeled as flood; otherwise, it is labeled as land. We investigated all the possible values for flood threshold and 75% is the optimum value. For SRTM90m DEM, assigned value to each SP is the average of the reference data within the area of 500 m×500 m around each SP. Moreover, whether an SP is located within wetlands or permanent water bodies is determined by the value of a grid cell in CIFOR or GSW data sets that is closed to the SP.

III. METHODOLOGY
In this section, first, eleven different observables for CYGNSS data are discussed. Then, the RUSBoost algorithm and a brief description of the SVM algorithm are presented. The RUSBoost classification in this study follows the flow chart depicted in Figure 1. Lastly, the six DPAs that are investigated for flash flood detection are described.

A. CYGNSS OBSERVABLES
In GNSS-R technique, the received power is a combination of coherent and incoherent scattered components. When the surface is smooth, the reflection is mostly coherent. As the surface roughness increases, the reflected signal becomes more incoherent. Under stable and calm weather conditions, inland surface water bodies are smooth. Thus, the reflected signal from them is predominantly coherent. On the other hand, during a severe condition such as a hurricane, the high speed winds can increase the roughness of inland water bodies. However, the presence of high speed winds of a hurricane over land is shorter than the flash flood caused by its landfall. In addition, as a hurricane reaches the land its wind speed decreases gradually due to the higher roughness of land [70], [71]. Moreover, as investigated in [48], during severe typhoons (tropical storms in the Northwest Pacific Ocean) the coherent components of reflected signal are still consistent with flash floods. Therefore, following similar assumption in the literature [34], [45], [48], in this study we regard the reflected signals as coherent. The surface reflectivity (SR) DDM can be calculated by [45] = where R tx and R rx are the distances between SP and transmitter and receiver, and σ is the calibrated incoherent bistatic radar cross section (BRCS) that is reported as DDM [72]. It is worth mentioning that another approximation for coherent reflected power from heterogeneous smooth surfaces is suggested by [73], in which the reflected signal is described by VOLUME 8, 2020  surface diffraction integral. The surface diffraction integral is calculated over an area larger than the first Fresnel zone (FFZ) whose radius varies from 300 m to 800 m depending on the incidence angle [45]. However, our case studies consist of both rough and smooth surfaces. In addition, in this study, the coherent reflection comes from an area of 500 m × 500 m around each SP, which is within the range of FFZ. Therefore, here, (1) is considered. In this study, instead of working with the whole DDM, eleven different observables including corrected signal to noise ratio (SNR C ), trailing edge slope (TES), leading-edge slope (LES), delay-Doppler map average (DDMA), the width of the waveform (Wave-width), the first generalized linear observable (GLO 1 ), kurtosis, maximum, mean, skewness, and variance are extracted for each SP. All the observables except SNR C are computed using the SR DDM, which is calculated as described in [45]. The first seven observables (SNR C , LES, TES, DDMA, Wave-width, GLO 1 , and maximum) are obtained as follows [45], [74] • Provided in the CYGNSS data set, SNR Peack is the ratio between the maximum value in a DDM to its average noise per bin (10log(S max /N avg )). This value is then corrected to SNR C [45]: where λ is the GPS wavelength that is 19.05 cm and σ m is the maximum value of BRCS DDM and P rxm is the maximum value in power DDM [45]. The maximum of BRCS and maximum of DDM are computed using the BRCS DDM and power DDM of each SP and vary with different DDMs.
• LES and TES are computed as the slopes between the maximum point and the points at two delay bins before and after the maximum point in the SR delay waveform (SR DDM integrated over Doppler axis) [45].
• The DDMA is the arithmetic mean of SR DDM within a window around the maximum value. In this study the size of window is chosen as 3 delay bins×5 Doppler bins [75].
• The width of the waveform is the number of Doppler bins whose intensity is higher than 1/e of the maximum of the SR Doppler waveform (SR DDM integrated over the delay axis).
• The N th generalized linear observable (GLO) is defined as [76]: where del is SR delay waveform, a N (i) is the N th weight of SR in the delay bin i and it is computed by applying principal component analysis (PCA) to the SR delay waveform. The summation is calculated considering ±3 delay bins around the delay bin of the maximum of SR delay waveform (i max ). We only consider the first GLO (GLO 1 ), since it has been proven that it is more correspondent to the inundation over land [45].
• The maximum (Maximum) that is the maximum value of the SR delay waveform is also considered as another observable.
A DDM represents the pattern of the scattered power from a surface. When the surface roughness changes, the scattered power and its pattern changes as well. These variations impact the statistical characteristics and histogram of the DDM. Moreover, a histogram can be described by statistical moments such as mean, variance, kurtosis, and skewness. Therefore, by considering the SR delay waveform as a random variables (RV) and analyzing its statistical moments, the impact of flood on a DDM can be studied [77], [78]. The statistical moments considered are described as • Mean shows the position of the central mass of an RV; • Variance is the squared differences of an RV from its mean. In other words, it measures how far values of an RV are located from the mean in a histogram; • Skewness is an indicator of the asymmetry of the probability distribution of an RV. When the distribution is symmetrical, skewness equals to zero. However, when the distribution is skewed to the values higher or lower than mean the skewness is a negative or positive value, respectively; • Kurtosis is a value that estimates the tailedness of the shape of a histogram by taking into account the outliers values.
More explanation about statistical moments and their mathematical formulas are given in [79]. It is worth mentioning that the number of observables is not confined. Other observables can be defined and computed based on different aspects of the GNSS-R data.
Since the ranges of observables are different, as a part of the data cleaning step, they are normalized based on the normalization ranges mentioned in Table 2. The value of each parameter is projected to the interval of [0, 1] using its Min to Max. These values are obtained based on self-observations.
Depending on the labels of the SPs, their observables show different characteristics as depicted by the box plots in Figure 2, for which the SPs located over permanent water bodies and wetlands are excluded using the GSW and CIFOR data sets. Comparing Figure 2 The DDMs whose maximum is not between delay bins 4 and 14 are discarded as noise. The discarded DDMs include high altitude measurement and noisy DDMs. This range is determined by observing DDMs and comparing the delay bins of their maximum values. Moreover, when the incidence angle is between 15 • and 60 • , the reflected signal is more correlated with the water extent around an SP, as shown in [80]. Since we intend to detect flash flood that is a type of surface water body, the SPs with incidence angles out of this range are removed. In addition to these conditions, quality flags, mentioned in [81], are also considered in the preprocessing step. It is worth mentioning that the speckle noise impact is negligible since each DDM is obtained from 1 s incoherent integration of 1000 DDMs [82].

B. CLASSIFICATION ALGORITHM 1) RUSBoost
Flood detection is a classification problem with two classes (land/flood). Since the DFO reference maps include flooded areas and only some/incomplete permanent water bodies but no wetland, both permanent water and wetland are not included in the training and testing. The class of each SP is determined by using the trained RUSBoost based classifier and its GNSS-R extracted observables and ancillary features. After selecting the features, all the observations in the Harvey data set, which is allocated for training and testing the classifier, are shuffled together. Then by a random selection, two separate equal sets for training and testing are generated. This unit that contains random shuffling and random selection is added to the RUSBoost classifier. For better perception, the pseudo-code of the RUSBoost classifier recreated from [83] is depicted in Figure 3. The training data set, is the imbalanced set is a vector in the J dimensional feature space and y i ∈ {0, 1} is its respective class label. In our case, x i is a vector containing selected observables and y i can be either land (0) or flood (1). At the first step, each point in S is assigned with an initial weight of D 1 (i) = 1/m prior to the first iteration (step 1).
Using the RUSBoost method, at iteration t, balanced temporary subset S t = {(x p , y p )| p = 1, . . . , 2n} ⊂ S is created containing all the n points of minor class and n randomly selected points from major class. Knowing the indices of the selected data points from S that are members of S t , another temporary subset containing their corresponding weights D t ⊂ D t is obtained. These two temporary sets are then employed for training weak learners based on the idea of reducing the classification error iteratively (step 2a) [83].
When the data points in S t are passed to the tth decision stump, it divides them into two splits which in this work are referred to as right and left. Having a decision threshold for the feature j, (j = 1, . . . , J ), the observation p in S t is positioned into the right or left split based on whether x p,j is higher or lower than the value of decision threshold, respectively. Hence, the performance of decision stump depends on the feature and its decision threshold. Since all the employed features except Wave-width have continuous values, the decision threshold can take an infinite number of values. However, these thresholds do not necessarily result in different results. When 2n points of S are sorted regarding their values of the same feature, between every two adjacent points, infinite thresholds can be considered. However, since they all have a similar result, only one of them should be considered. Therefore, instead of trying infinite numbers of thresholds, 2n − 1 values between sorted points plus 0 and 1 are enough to be considered as the values of the decision threshold. Hence, (2n + 1)J combinations of thresholds and features can be used for examining all the possible outputs. Considering the combination of jth feature and its qth decision threshold c j t (q), (q = 1, . . . , 2n + 1), the weighted Gini impurity factor (GI t ) of the decision stump t is obtained as: where r,l t (q) is the probability of right or left split and r,l t (q) is the Gini impurity factor of right or left split. For the right split r t (q) and r t (q) are defined as:  (6) and (5) [84]. Moreover, in boosting methods, the performance of a weak learner needs to be slightly better than the random guess (Gini impurity factor of 0.5) [85]. Hence, by randomly selecting a limited number of pairs of thresholds and features, the one that minimizes the Gini impurity factor is selected for creating the weak hypothesis. It should be mentioned that since S t is balanced, minimizing the Gini impurity factor translates to maximizing the Gini gain. We assume that among all features, x i,k ∈ x i , with decision threshold c k t (q k ), meets the requirements (step 2bi). From the feature, its decision threshold, and the number of points in each split regarding S (step 2bii), the weak hypothesis is constructed as (step 2 d) where π r,l = N r,l (y)/N r,l is the label proportion, which is the ratio between number of y ∈ {0, 1} labeled points within a split N r,l (y), and its total number of points N r,l (step 2c). The pseudo loss of the weak hypothesis for all the points in S is calculated as (step 2e) where 1 − y i is the incorrect label of observation i. A weights updating factor, α t , is calculated as (step 2f) Then, a new set of weights are computed and normalized as (step 2g-2h) When the hypothesis of the weak learner is correct for the training data set, which means that the weak learner was able to classify all of the training data points correctly, t will be equal to zero, and the new weights will be equal to the previous ones. Otherwise, the weights of the misclassified points will be higher than the ones of correctly classified points. Therefore, in the next iteration, the weak learner will be biased to classify the misclassifications of the previous decision tree, which translates to increasing the variance step by step. The procedure of random undersampling, creating a weak hypothesis, and updating observations weights is repeated for T iterations. At the last iteration, when the training of all of T weak learners is finished, the output hypothesis is created as a weighted vote of weak hypotheses (step 3): where x is a feature vector of the test data. The criteria for the trained RUSBoost classifier is to find the label that maximizes the summation of the hypothesis of weak learners with respect to α t [83], [86]. Since the number of weak learners affects the structure of the trained classifier and its performance, the number of weak learners (T ) is the hyperparameter of our model. Also, the learning rate, which determines the step size at each iteration, is another important parameter of our model. In this article, the RUSBoost-based classification is implemented in MATLAB R2018 using the Statistics and Machine Learning Toolbox. A total number of 150 of weak learners are trained with a learning rate of 0.1. We investigated different combinations of the number of weak learners and learning rate values in terms of classification error and the selected combination gives the minimum error. Each weak learner is a decision stump. At each iteration, among 150 random combinations of different features and decision thresholds, one of them is chosen. The trained classifier is used for testing and evaluation.

2) SVM
In this article, the SVM classifier is considered for comparison with the proposed method. As a well-known supervised ML algorithm, SVM has been used in various remote sensing applications [38], [87]- [89]. SVMs classify data by determining the optimal hyperplane for maximizing the margin between classes [90], [91]. For nonlinear data, computing the hyperplane is achieved by using the kernel trick, which maps the data in a higher dimensional space. More details on the SVM ML algorithm can be found in [90], [91].
In this article, an SVM based classifier is implemented using the Statistics and Machine Learning Toolbox of MAT-LAB R2018. As in Section III-B1, selecting training data points from Harvey consists of random shuffling and random selection. For balancing the imbalanced data sets, RUS is applied to the training data set since it requires a much lower computational load compared to oversampling methods (e.g., SMOTE) [57], [58]. The radial basis function (RBF) kernel is selected as the kernel function. The values of hyperparameters are optimized using the sequential minimal optimization (SMO) algorithm proposed in [92].
Since the selected training data from Harvey is random, for having a better perception of the performance, the classification was repeated 20 times for both SVM and RUSBoost classifiers as shown in the block diagram of the RUSBoost classifier depicted in Figure 1.

C. DATA PREPARING APPROACHES
In this section, six different DPAs for flash flood detection are described. As mentioned in Section II-B, water bodies that are not caused by flash floods, e.g., permanent water bodies and some regions of wetlands, can be mislabeled as flood. Two main DPAs could be taken for solving this issue. One solution is to use reference data sets and exclude SPs that are located over water bodies. Another one is to use the variation between the CYGNSS data during flood and the CYGNSS data collected during a period that flood did not happen. Therefore, six different DPAs are investigated in this study that are described as • In Approach 1, all inland SPs collected during floods are used. Even though some SPs are located over wetlands or permanent waters, in order to investigate the errors caused by water bodies other than flood, the non-flooded SPs are labeled as land.
• In Approach 2, based on the GSW and CIFOR data sets, SPs located over wetlands and water bodies are excluded. This method was previously used in Section IV-A for feature selection.
• In Approach 3 GSW data set is used for excluding the SPs located over permanent waters.
• Approach 4 consists of three steps: detecting water bodies, excluding the SPs associated with water detection results, and flood detection. Using the 2018 CYGNSS data and inland water detection method described in [78], water bodies over Harvey and Irma are detected. The detected water extent is then used as a reference for corresponding excluding SPs.
• In Approach 5, the impact of flood is investigated by considering the changes caused by flood with respect to the CYGNSS data collected one month prior to flood.
• Similar to Approach 5, in Approach 6, the variations caused by flood are considered. In this DPA, the CYGNSS data of three months dry season of the year 2018 are considered as background data.
In Approach 5 and Approach 6, for calculating the changes of selected observables, each SP in the CYGNSS flood data set is matched with the closest data point from the background data set. The distance between SP in the flood data set and its match from the background data set is to be less than 1.5km. When the distance between two points is higher than 1.5km, the SP is excluded. We investigated different values for determining this distance and 1.5km was the optimum value with respect to data exclusion amount and classification error. Since in Approach 1 all data points collected during floods are used for classification, its result includes possible misclassifications. Comparing the classification results of other DPAs with Approach 1 can indicate the advantages and disadvantages of them. The coverage of the CYGNSS is low and in some DPAs, a portion of data is not even considered due to the data exclusion. Hence, the percentage of excluded data points alongside the accuracy of the classifier are two factors that are used in Section IV-B to evaluate the overall performance of each DPA. Since the amount of excluded data for each DPA is different, it is not possible to evaluate them using exactly same validation data.

IV. RESULTS AND DISCUSSION
In Section IV-A, all the combinations of eleven observables and two features from SRTM90m DEM are separately used as inputs of a RUSBoost classifier with 5-fold cross-validation to select a suitable combination of features. With the selected features, the detecting flood performances associated with the six DPAs are evaluated in Section IV-B. By comparing their results, the best method for flood detection is selected. The performance of the recommended RUSBoost classifier is then compared with that of an SVM based classifier.

A. FEATURES SELECTION
In this section, we want to select the features that are proper for flood detection. Therefore, by using GSW and CIFOR data sets, SPs located over wetlands and permanent water are excluded. This ensures that the remaining SPs used in feature selection are either flood or bare land. The eleven observables described in Section III-A and the surface elevation and terrain from the SRTM90 DEM data set are considered as thirteen features. In this section, both Harvey and Irma data sets are combined and the idea of recursive feature elimination is implemented to determine a suitable feature combination out of all the 8191 different combinations. For each combination, the accuracy of the classifier with 150 weak learners is evaluated by 5-fold cross-validation in which same subsets of data are used for all combinations. The accuracies of the best five combinations with and without the two features from SRTM90 DEM are shown in Table 3. From Table 3 it is clear that using the elevation and terrain from SRTM90 DEM (combinations 1 to 5 in Table 3) has improved the accuracy (around 10% for flood and 2.5% for land). Due to the gravity, water accumulates in lower altitudes. Hence, it is unlikely that flood would occur in an area located on the slope. Overall, knowing the elevation and terrain of an area gives a better insight into the regions with a higher possibility of flooding.
In terms of land detection, features combination 4 in Table 3 has the best accuracy. However, it has a lower flood detection accuracy compared to features combination 5, which has the best performance for flood detection. Therefore, features combination 5 in Table 3 has a better performance overall.
Here, the hyperparameter (i.e., the number of weak learners (T )) of the RUSBoost classifier is set to 150 after evaluating the classification error for various values of T by considering all the features as the input feature vector. Next, the top five combinations in terms of classification error are found based on the selected hyperparameter. Since the value of T can impact the classification results, we further investigated the variation of the overall classification error with T for the other four feature combinations listed in Table 3. As shown in Figure 4, as the number of weak learners increases, the classification errors decrease. After a certain value of T (140 here), the accuracy will not change significantly. The optimal values of T are 150 for Combinations 1, 3, and 5 and 149 for Combinations 2 and 4. Although there is a small difference between the optimal and selected values for Combinations 2 and 4, the results obtained from T = 150 are still appropriate since the difference between the accuracies with the selected and optimal hyperparameters is less than 0.1%.

B. FLOOD DETECTION
By knowing the best feature vector from Section IV-A, in this section we intend to find the best DPA for flood detection using the CYGNSS data through evaluating the six DPAs mentioned in Section III-C.
Unlike Section IV-A where a classifier with 5-fold cross-validation was used, here, the RUSBoost based classifier depicted in Figure 1 is trained and tested by using the features combination 5 in Table 3 for each DPA.
The results shown in Table 4 are the accuracies and the percentage of excluded data for both Harvey and Irma. For choosing the best method, first, the percentage of the VOLUME 8, 2020 discarded data of each DPA is compared with other DPAs. Then, suitable method for flood detection is selected based on the accuracy. Among all the DPAs, Approach 1 and Approach 6 have no data exclusion. The excluded data points in Approach 3 are located over permanent water. This is reasonable since permanent water area does not need to be determined to be flooded or not, i.e., there is no overlap between permanent water and the reference flooding regions. Due to the overestimation of detected water extent, Approach 4 has the highest data exclusion. Even though Approach 2, Approach 4, and Approach 5 have an acceptable accuracy in some occasions, they do not seem to be proper options for flash flood detection due to their high percentage of data exclusion. It should be pointed out that unlike Section IV-A, where we intended to find a suitable feature vector for detecting flood, here, we are investigating different DPAs for flash flood detection. Moreover, as shown in Table 4, the accuracies of Approach 2 and Approach 3 are comparable, but Approach 2 is not suggested since more data points are excluded. Approach 5 has the lowest flood detection accuracy for Irma. Among Approach 1, Approach 3, and Approach 6, Approach 3 has the highest land detection accuracy and Approach 1 has the highest flood detection accuracy for both Harvey and Irma. The flood and land detection accuracies of Approach 6 are less than those of Approach 1 and Approach 3. Therefore, Approach 1 and Approach 3 are the final candidates. In terms of flood detection, Approach 1 outperforms Approach 3 by 1.9% in Harvey and 2.3% in Irma. However, Approach 3 is able to detect land with a higher accuracy (3.2% in Harvey and 8% in Irma). The intention in this study is to detect flash flood. However, since the land points outnumber the flood points, the overestimation of flood is also crucial. Hence, Approach 3 seems like the proper method for flash flood detection.
The maps of employed reference data sets, classification result and error maps of Harvey and Irma are depicted in Figures 5 and 6, respectively. The GSW and CIFOR references shown in Figures 5(a) and 6(a) are used for data exclusion in Approach 2. For Approach 3, discarded SPs are selected by using the GSW reference data that is depicted in Figures 5(a) and 6(a) in blue colour. Based on the high-resolution DFO flood reference maps depicted in Figures 5(b) and 6(b), in each DPA, considered SPs are labeled as flood/land. Due to data exclusion, the flood reference map of each DPA could be different from others. As shown in Figures 5(b) and 6(b) the area flooded by Hurricane Harvey is concentrated over the coastline, and most of the inland areas were not impacted. While in Hurricane Irma, affected areas are scattered over the land. Furthermore, in order to make the flash flood detection method independent of other data sets such as the GSW and CIFOR, in Approach 4, we attempt to detect the water extent over Harvey and Irma and use the results as a water extent reference for excluding data. Considering the CYGNSS data of the year 2018, three observables, including Kurtosis, Maximum, and Variance, are extracted. Using these observables, a RUSBoost classifier trained with data from the Congo basin is used for detecting water bodies of Harvey and Irma [78]. The water detection method overestimates the presence of water bodies, which leads to excluding a large portion of data. The two investigated regions consist of various dynamic water bodies, to which the CYGNSS is sensitive, including wetlands, permanent waters, and farmlands. Therefore, water overestimation is inevitable. Comparing the classification results of Approach 3 and flood reference maps depicted in Figures 5(c) and 6(c) and Figures 5(b) and 6(b), respectively, shows that even with small coverage, Approach 3 is capable of identifying the flash flood extent.

C. COMPARISON TO SVM CLASSIFIER
For comparison, an SVM-based classifier is trained using the selected features in Section IV-A and same data that is produced by Approach 3 and employed for building the RUS-Boost classifier. As mentioned in Section III-B2, the parameters of the SVM-based classifier are optimized using the SMO algorithm.
As shown in Table 5, compared to the RUSBoost classifier with Approach 3, the SVM classifier can detect flash floods with an accuracy of 6.1% and 11.3% higher for Harvey and Irma, respectively. However, in terms of land detection, the RUSBoost-based classifier is 8.98% and 32.2% more accurate for Harvey and Irma, respectively. The SVM classifier overestimates flash floods as depicted in Figures 5(d), 5(f), 6(d) and 6(f). For the SVM based classifier, due to the disproportion of imbalanced data sets, the number of misclassified land points is much higher than correctly detected flood points. Therefore, the RUSBoost classifier with Approach 3 is better than the SVM classifier.
It is worth mentioning that the overall run-time of the RUSBoost and the SVM classifiers are 12.10 s and 0.48 s, respectively.

V. CONCLUSION
In this article, a flood detection method based on CYGNSS data has been conducted using the RUSBoost based classification. Eleven different features have been extracted from the CYGNSS data, and each point is labeled as land/flood, as discussed in Section II-B. For feature selection, by excluding wetland and permanent water, the CYGNSS flood data set only includes SPs that are either flood or land. Using this data, after investigating the accuracies of all the combinations of thirteen features via a classifier with 5-fold crossvalidation, the most effective features were selected. Even though the accuracy of various combinations is roughly in the same range, the combination of Kurtosis, DDMA, Maximum, Variance, Wave-width, and SRTM90m DEM has the best performance overall. The selected feature combination might not be the best one, but the classification results indicate that it is a suitable option for flash flood detection.
By using the selected features combination, six different DPAs for detecting flash flood were investigated, among which Approach 3 gives the best performance in terms of accuracy and data exclusion. Moreover, the comparison between the RUSBosst-based and SVM-based classifiers, which were trained using data exclusion in Approach 3, indicates that the RUSBosst-based classifier has a better overall performance. Therefore, it is recommended as the method for flood detection using CYGNSS data.
The GSW and SRTM90m DEM data sets are the only ancillary data sets employed in the recommended DPA, i.e., Approach 3. Both of them are available for the public with global coverage. In addition, unlike GLO 1 , the selected observables do not require any heavy preprocessing, and for each SP, they are computed based on provided parameters in the CYGNSS data for that SP without any dependency on a region or time period. As mentioned in Section II-A, all classifiers in Section IV-B were trained with half of the Harvey data, which was randomly selected. Then the trained classifiers were tested with the remaining half of the Harvey data and all of the data of Irma.
The CYGNSS data set involves non-geophysical uncertainties. The effective isotropic radiated power (EIRP) of GPS transmitters that is used in the CYGNSS data process is not subtle [93]. Due to the different designs of space vehicles and the transmitting antenna panel, the EIRP of GPS transmitters fluctuates that leads to inaccuracy of CYGNSS measurements and impacts the results of this study. Since August 2018, by monitoring the transmitted power of GPS satellites, the fluctuations are compensated [94], [95]. However, due to the limitation of available CYGNSS data that is associated with significant flash flood, we selected the two representative events that happened in 2017.
The main drawback of the proposed method is flood overestimation with respect to the DFO reference data. This problem was also reported in [96], where data from Soil Moisture Active Passive (SMAP) was employed for flood detection. This may be because both CYGNSS and SMAP use L-band signals which are sensitive to SM. Flash flood is a complicated matter, and it depends on various conditions. In addition to the massive surge of water, various factors such as soil moisture, soil type, vegetation, subsurface flows, elevation, etc. can impact the development of flood [49], [97], [98]. Moreover, the scattering from the surface at L-band primarily depends on two factors: roughness and soil moisture, as investigated in [99]. Due to heavy precipitation during a flash flood, the SM increases till the soil becomes saturated. This increase in SM can be an explanation for this problem since the reflected signal from an SP with high SM can be as coherent as the reflection from a flooded SP, which causes flood overestimation. The low accuracy of flood over Irma in Approach 5 shows the impact of SM on flood detection. Both Hurricane Harvey and Hurricane Irma occurred during the high season (July to November) [100], [101] and during the month prior to flood, several precipitations happened in those areas especially for Irma [102]. Since having high SM does not necessarily indicate that an SP is flooded [103], SM is not the only source of error. Another parameter that also has a major role in the reflected signal is the roughness [104]. The coherent reflection from a smooth surface can lead to flood overestimation. Therefore, the regions that are relatively flat with high SM, such as Irma, are overestimated by the proposed method. Moreover, as mentioned before, the high-speed wind of a hurricane would increase the surface roughness of water in flooded regions. As the surface becomes rougher, its root-mean-square-height increases. Consequently, surface reflectivity decreases exponentially [105]. In other words, the incoherent components become more dominant. Therefore, in the early stages of our case studies where a high-speed wind is present, there are flood points whose scattered power is predominantly incoherent, which leads to flood underestimations. Furthermore, some flooded areas are heterogeneous, meaning that there is a diversity of land and flood in them. The heterogeneity can impact the scattering pattern. which results in overestimation or underestimation of flood. Despite the overestimation and underestimation, based on the obtained results, the proposed method is able to detect a flash flood with high accuracy. It is worth mentioning that similar to other microwave systems [106], [107], the turbidity of the water does not significantly impact the scattering of the GNSS signals from the water bodies since the signals cannot penetrate into the water too much. Thus, the turbidity of water may not be a major source of error in our work.
Compared to optical satellites, similar to other spaceborne microwave systems, CYGNSS is not affected by clouds, which makes it a reliable source for monitoring flash floods. Compared to other remote sensing satellites such as SAR and optical, the GNSS-R technique has a lower quality in terms of spatial resolution and accuracy [108]. On the other hand, the revisit time of the CYGNSS satellites is shorter than SAR systems, hence it is able to detect flash floods. Moreover, due to the less expensive receiver of GNSS-R, larger constellations can be obtained that leads to better coverage.
The proposed method has two main limitations. Firstly, it cannot detect urban flash floods since the impacted regions include various human-made obstacles causing incoherent reflection. Moreover, due to the gap between CYGNSS constellation tracks, it is unlikely to have enough collected data for flash flood monitoring when a small flash flood happens. Hence, the proposed method is more suitable for observing extensive flash floods. However, this limitation can be solved by having more GNSS-R receivers.
The parameters, such as high-speed winds during hurricanes and soil type, were not included in this study. These parameters should be considered in future studies. Moreover, in this study, the proposed method was only evaluated over two case studies and it was compared with an SVM-based classifier. In addition, feature selection was based on the optimal hyperparameter found for one combination rather than the corresponding optimal value of each combination. In the future, more advanced feature selection methods could be investigated. Also, other oversampling (e.g. GAN, Variational Autoencoder (VAE), and SMOTE) and undersampling methods developed for tackling imbalanced data along with other ML algorithms such as the random forest, Extreme Gradient Boosting (XGBoost) may be investigated for flash flood detection from the CYGNSS GNSS-R data. He is currently with the School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology. His research interests include tsunami, sea ice, and land remote sensing using global navigation satellite system-reflectometry. He was a recipient of the 2019 IEEE GRSS Letters Prize Paper Award from the IEEE Geoscience and Remote Sensing Society.
DESMOND T. POWER (Member, IEEE) was born in North River, NL, Canada, in 1967. He received the Bachelor and Master of Engineering degrees from the Memorial University of Newfoundland.
He is currently the Vice President of Remote Sensing at C-CORE and manages a group of 40 individuals with expertise on geomatics, radar systems, earth observation, and geomatics. He has 29 years of professional experience. In the first seven years of his career, he primarily worked on high frequency over the horizon radar. In 1998, he started work on microwave radar systems and satellite synthetic aperture radar. He has since branched out into work on earth observation (electro-optical and SAR) and geomatics, with a strong focus on research and applications development. The research performed by Desmond's group include iceberg and vessel detection, SAR-based winds, EO water quality and river ice monitoring, ice charting and sea ice thickness measurement, satellite and ground-based interferometric SAR (for ground deformation monitoring), vehicle detection and monitoring, wetlands monitoring, and oil slick detection.
Mr. Power is a member at large of the executive of Canadian Remote Sensing Society and a P.Eng registered with Professional Engineers and Geoscientists of Newfoundland and Labrador. VOLUME 8, 2020