Assessment of Interpolation Errors of CYGNSS Soil Moisture Estimations

High spatio-temporal soil moisture (SM) is essential for many meteorological, hydrological, and agricultural applications and studies. Spaceborne Global Navigation Satellite System Reflectometry (GNSS-R) provides a promising opportunity for high-resolution SM retrievals. NASA’s Cyclone Global Navigation Satellite System (CYGNSS) is a recent GNSS-R application that offers relatively high spatial and temporal resolution observations from Earth’s surface. However, the quasi-random sampling of land surface by the CYGNSS constellation circumvents obtaining fully observed daily SM predictions at high spatial resolutions. Spatial interpolation techniques may fill this gap and provide a fully covered high-resolution daily SM estimation. However, the spatial interpolation errors need to be assessed when applied to the quasi-random 9-km CYGNSS based SM estimations. In this paper, we conduct interpolation error analysis using the SMAP Enhanced L3 Radiometer Global Daily 9-km product, sampled at the CYGNSS observation locations. The results indicate that the overall interpolation error (RMSE) was 0.013 m3m−3 over SMAP’s recommended grids. In addition, sparse CYGNSS SM observations are directly interpolated. The achieved results show that interpolated and observed CYGNSS SM values have similar performance metrics when validated with the SMAP 9-km gridded SM product as well as sparse soil moisture networks.


I. INTRODUCTION
S OIL moisture (SM) is one of the critical components of agriculture and terrestrial water management, and global climate studies [1]. Accurate, global, and high-resolution SM estimation is required for many studies, including water resource management, irrigation scheduling, prediction of agricultural yields, and flood forecasting [2], [3]. Microwave remote sensing technologies have been widely used for SM estimation on a global scale [4]. Currently, there are several ongoing satellite missions that provide SM estimation with different spatial and temporal resolutions. The European Space Agency's (ESA) Soil Moisture and Ocean Salinity (SMOS) [5] and the National Aeronautics and Space Administration's (NASA) Soil Moisture Active Passive (SMAP) [6] are both dedicated SM missions that are instrumented with L-band passive radiometers and provide 36-km spatial and 1-3 days temporal resolution. Sentinel-1 is another ESA mission equipped Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier XX/JSTARS.2020.XX with C-band synthetic aperture radar, and it can facilitate SM mapping with 1-km spatial and 6-12 days temporal resolution [7]. Global Navigation Satellite System-Reflectometry (GNSS-R) is a promising, alternative microwave technique that has been recently gaining high interest within the scientific community as it has the potential to provide higher spatiotemporal resolution observations than conventional, passive microwave remote sensing methods. This bistatic forward scattering approach collects reflected signals from the Earth's surface, covering the space-time gap of the current traditional active/passive monostatic satellite missions [8]. NASA's Cyclone Global Navigation Satellite System (CYGNSS, launched in December 2016) is primarily an ocean surface wind mission to improve extreme weather prediction, but it operates continuously over both land and ocean from 38 o North to 38 o South latitudes, providing many land observations with clear sensitivity to changing land surface conditions. Many studies have recently taken advantage of the large amount of CYGNSS land observations to develop models and algorithms to retrieve SM from CYGNSS observations at various spatial and temporal scales [9]- [14].
CYGNSS acquires GNSS-R measurements from 32 channels with 8 small satellites and 4 channels per satellite. Its mean revisit time can be as short as seven hours with 25-km spatial resolution over the ocean under a dominantly diffuse scattering regime. The use of a bistatic receiver constellation for GNSS Signals-of-Opportunity (SoOp) measurements introduces many factors which should be considered when handling this dataset. Because of the ever-changing geometries of the GNSS transmitters and CYGNSS satellites, CYGNSS observations have a quasi-random sampling characteristic which increases the measurement uncertainty when compared to the repeatable, swath-like sampling of traditional remote sensing satellites. Additionally, the reflections of GNSS signals from land are much more diverse and are a composite of coherent and/or incoherent scattering depending on surface conditions. This makes the spatial resolution of the CYGNSS observations over land variable. There are several ongoing research activities that attempt to identify relative contributions of coherent and incoherent reflections [15]- [17], yet the actual spatial resolution of GNSS-R measurements is still a subject of debate within the research community. Following previous studies that produce SM products which assume the presence of coherent signals [9], [11], [13], we will also assume coherent reflections over land. Under such assumption, the CYGNSS observations have a relatively high spatial resolution (on the order of a few kilometers), but less frequent observations To illustrate the spatial sampling capabilities, observations of SMAP, CYGNSS, and interpolated CYGNSS SM on 06 November 2018 are given in Fig. 1. Although the composite SMAP SM product provides 100% coverage on its track after excluding large, static water bodies, the daily SMAP SM product contains many unmeasured regions due to spatiotemporal resolution provided by the SMAP orbit and ground track. On the other hand, it can be seen that CYGNSS provides high-resolution observations over a larger area in a single day, but with sparse sampling. To fill the unsampled grids, spatial or temporal averaging of CYGNSS observations can be performed. However, this degrades the high-resolution potential of CYGNSS-based SM estimation both in space and time. Fig. 2 provides the spatial coverage performance of CYGNSS for different spatial gridding and temporal averaging scenarios. For instance, to achieve a daily product with full spatial coverage (at least one CYGNSS sample within each grid) between ±38°latitudes, a spatial gridding that is greater than a 36-km aggregation is necessary. Also, more than ten days of temporal averaging is required to provide full coverage SM estimations at 9-km spatial resolution. Hence, without further processing, CYGNSS does not provide both a high spatial and temporal resolution product simultaneously.
Furthermore, CYGNSS's revisit or temporal sampling rate varies spatially within its coverage (between ±38°latitudes) as illustrated in Fig. 3. For 9-km × 9-km aggregated grids, Fig. 2. Coverage performance of CYGNSS in different spatial and temporal resolutions. Each CYGNSS measurement is gridded at 3-km spatial resolution around the specular point assuming a coherent reflection over land. If a pixel contains at least one CYGNSS specular reflection point, it is assumed to be representative of the whole pixel. Bars show one standard deviation around the mean. the temporal sample rate of CYGNSS can vary between 0.5 and 0.1 visits/day. While it will take 2 days for CYGNSS to provide an observation for locations above ±30°on average, it takes 3-10 days for locations between +30°and -30°latitudes for 9-km × 9-km grids. Due to the spatially changing observation rate of CYGNSS, the averaging requirements differ for different regions of the world.
As an alternative to spatial and/or temporal averaging, which are often employed in retrieval algorithms, we investigate the use of interpolation methods to provide CYGNSS-based SM estimates in unmeasured regions of CYGNSS's ground track. This approach has the potential to generate a high resolution SM product for both the spatial and temporal domains as illustrated in Fig. 1(c). Spatial and temporal interpolation of SM data is not new, and there have been some studies focusing on spatial interpolation/extrapolation of SM measurements for continuous data [19]- [21]. However, these studies were performed using in-situ measurements for limited areas.
The idea of spatial interpolation of CYGNSS reflectivity observations was also presented using a previously-observed behavior interpolation (POBI) by Chew [22]. Our previous study performed spatial and spatiotemporal interpolation of the CYGNSS SM estimates using linear, nearest, and natural neighbor interpolation techniques [23]. The present study extends our previous work by incorporating ordinary Kriging [24], inverse distance weighting (IDW) [25], best linear unbiased estimation (BLUE) [26] and POBI [22] techniques for spatial interpolation and analyzes the interpolation performance for different land cover and geographic factors.
The purpose of this study is two-fold. First, we assess error sources in the spatial interpolation of CYGNSS -based SM data to obtain a daily SM product within the full CYGNSS coverage (between ±38°latitudes). To accomplish this, we utilize the SMAP Enhanced L3 Radiometer Global Daily 9-km dataset as a testbed since its 1000-km swath width provides fully covered regions on its track. It is imperative to note that this SMAP product is a 9-km gridded version of the original SMAP measurements, and the native resolution is still 36km [27]. In this scheme, CYGNSS spatial locations are used to sample the SMAP data and the interpolated results are compared with the actual SMAP observations to assess the errors due to the utilized interpolation approaches. Second, we interpolate sparse CYGNSS SM estimates directly and compare the resultant estimates against SMAP and sparse insitu SM networks.
The rest of the paper is organized as follows: Section II summarizes datasets used. Details on our assessment approach and the utilized interpolation algorithms are provided in Section III. Results and discussions are provided in Section IV. Finally, conclusions are drawn in Section V.

A. MSU-GRI CYGNSS-derived surface SM product
In this study, we use our CYGNSS-derived 9-km gridded SM product, MSU-GRI surface SM v1.0a, which is publicly available in [18]. This product was generated by a machine learning model that uses CYGNSS land observations (reflection power, incident angle, and trailing edge slope) and ancillary inputs (elevation, soil texture, difference vegetation index, and vegetation water content). We used the 1-km resolution Digital Elevation Model GTOPO30 product from the United States Geological Survey Earth Resources Observation and Science archive for surface elevation information (https://lpdaac.usgs.gov/products/myd13a1v006/). The presence of surface inland water body is identified by utilizing a 30-m Global Surface Water Dataset from the Joint Research Centre (GSW-JRC) [28]. Global soil texture data from the Global Gridded Soil Information [29] were used to characterize the spatial variations of soil type. Several Moderate Resolution Imaging Spectrometer (MODIS) land surface products were used to characterize the underlying vegetation conditions [30]. The model was trained using only in-situ SM data from 170 International Soil Moisture Network (ISMN) sites and is independent of SMAP observations in its training. The produced SM product was then successfully validated using both SMAP enhanced L3 SM product and ISMN's SM measurements. The details of the ancillary data and algorithm behind the MSU-GRI product can be found in [14].

B. SMAP SM
The SMAP Enhanced L3 Radiometer Global Daily 9-km EASE-Grid SM product is used here for error analysis of the interpolation approaches. This product is an enhanced version of the 36-km radiometer SM product using Backus-Gilbert optimal interpolation [31]. SMAP datasets contain SM estimation and the associated coordinates for both the descending (A.M.) and ascending (P.M.) overpasses. Both passes are combined to obtain daily results. With the help of a 1000-km swath width, a daily SMAP product can cover about 70% of all land areas within the CYGNSS coverage (±38 0 latitudes). The SMAP product also contains SM retrieval quality flags that indicate whether the SM retrieval is recommended or not. SMAP SM estimations can have an uncertain quality for several reasons such as water body fraction, coastal proximity, urban area, precipitation, slope, vegetation water content. The data is freely available through the National Snow and Ice Data Center (NSIDC) at https://nsidc.org/data/SPL3SMP E/versions/3.

III. METHODOLOGY
Interpolation can be applied to either CYGNSS reflectivities before SM predictions as in [22] or directly on the CYGNSSderived SM products as done here. In this study, we first use the locations of the CYGNSS SM product for sampling the SMAP L3 9-km gridded data set in order to assess the interpolation performance of well-known interpolation algorithms (listed in subsection III-A) under the CYGNSS quasirandom observation scheme. We then interpolate the MSU-GRI CYGNSS SM product directly to produce a full-coverage SM product at 9 km, which is independently evaluated against both SMAP's SM product and in-situ observation at various ISMN sites. Due to the SMAP's native 36-km resolution [27], it is expected that interpolation error for CYGNSS SM estimations against 9-km gridded version of the original SMAP measurements could vary depending on the heterogeneity of the surface being interpolated.
Both SMAP SM retrievals and CYGNSS SM retrievals within the time period of 18 March 2017 to 31 December 2019 are arranged as 1019×920×3200 (Days × Latitude × Longitude) matrices. Then, we followed the same quality controls described in [14]. For instance, the samples that correspond to water bodies and high elevation (> 2000 m) were removed from both CYGNSS and SMAP datasets. In order to analyze the interpolation error, daily SMAP SM estimates whose EASE grid center are closest to CYGNSS specular point locations were sampled. Then the resulting sparse SMAP daily SM data was interpolated independently using the different algorithms discussed in subsection III-A. A total of 1019 days of spatial interpolation was performed, and an interpolated SMAP SM matrix was generated. The interpolated and original SMAP SM estimations were compared to calculate the interpolation error over the EASE grids only if SMAP has measurement at that grid location. Root mean square error (RMSE) metric is used to quantify the interpolation error. The interpolation and error calculation process is illustrated in Fig. 4.
After the error analysis, we study direct interpolation of the CYGNSS SM product. This includes evaluations of interpolated CYGNSS SM values independently against overlapping SMAP SM data with grid-based performance metrics for both (1) all grids and (2) recommended SMAP grids. In addition to error analysis comparing SMAP and CYGNSS, an error analysis is performed using individual sparse insitu SM networks as an independent metric for the CYGNSS SM interpolation methods. Fig. 5 illustrates the comparison approach.

A. Interpolation algorithms
In the study, we used well-known spatial interpolation methods such as linear [32], natural neighbor [33], IDW [24], Kriging [34], and best linear unbiased estimation (BLUE) [26] and POBI [22] for daily SM estimations. These interpolation approaches use a weighted combination of the sparse obser-vations asû whereû(x 0 ) is the predicted/interpolated value for the unsampled point x 0 , u(x i ) is the SM observations at locations x i , and λ i are the weights for each u(x i ) value where N is the number of observations within the window of interpolation. 1) Linear: Classical linear interpolation fits a plane between measured samples to estimate the unknown values. For spatial interpolation, we performed the Delaunay triangulationbased [35] linear interpolation. In this approach the grid is subdivided into a series of triangles such that the measured points are connected with lines in such a way that no triangle edges are intersected by other triangles. Once the triangulation is done, a bivariate linear interpolation is applied within each triangle to find the unknown pixel value using the ones measured at the three edges of the triangle. This approach is an exact interpolator and implemented with MATLAB's griddata function.
2) Inverse Distance Weighting (IDW): The inverse distance weighting method estimates the unsampled point value using a linear combination of sampled points values and their corresponding weights. The weights are calculated based on the exponentiated Euclidean distance between sampled and unsampled points. IDW weights with low Euclidean distance values will have more influence than weights which feature high Euclidean distance. The IDW estimation of an unknown point is given byû whereû(x 0 ) is the interpolated value at any location x 0 , u(x i ) are the sampled values (i = 1, 2, ..., N ), d i is the distance between x 0 and x i , p is the a power parameter and N is the number of sample points within the interpolation window used for the estimation. IDW is implemented with window sizes of 5, 7 or 9 and p = 3 is used as the power parameter.
3) Natural Neighbor: Like IDW, the natural neighbor interpolation method [33] is a weighted-average interpolation technique. The method is based on Voronoi tessellation of a discrete set of spatial points and provides smoother approximation. The method is based on Voronoi tessellation of a discrete set of spatial observation points, where basically a partition of a plane into multiple regions that are close to each observation point is done. The partitioning of the plane is updated after inserting the interpolation point to the list of all spatial points. The weights for each measured sample are calculated by using the intersection area of the new partitioning cell with the existing areas of the cells of the closest samples. For the presented results in this paper, natural interpolation is implemented utilizing MATLAB's griddata function using the natural neighbor option. 4) Kriging: Kriging [24] or Gaussian process regression is another weighted spatial interpolation method. In Kriging, the weights are based on the overall spatial arrangement among the measured points which is quantified through the spatial correlation between the sample points. Thus, in Kriging, the weight depends on a fitted model of the field semi-variogram derived from the distance to the unsampled point, and the spatial relationships among the measured points around the unsampled point. Assuming an average mean adds the constraint of N i=1 λ i = 1 to the system. The Kriging equation is expressed as: where, γ(x i , x j ) is the semivariogram between observation point x i and x j , γ(x i , x 0 ) is the semivariogram between sampling point x i and interpolation point x 0 , and µ is the Lagrange multiplier parameter. Kriging equations can be written in the form of a matrix: In the study, a Gaussian model fitted semivariogram function was used for each segmented windows (for detail see section III-B). After solving Eq. (4) the weights λ will be obtained, and they can be used in Eq. (1) to estimate the value at the unsampled point.

5) Best Linear Unbiased Estimation (BLUE) : BLUE [26]
is an unbiased linear estimator and it has minimum variance among all other linear estimators. Since it is a linear estimator, the unknown point can be represented as a linear combination of measured points as in Eq. (1). The unbiased constraint states that E(û(x 0 )) = N i=1 λ i E(u(x i )) = u(x 0 ). The variance on the estimateû(x 0 ) is given by var(û(x 0 )) = λ T Cλ where C is the covariance of the measured points. The BLUE algorithm minimizes the estimate variance constraint to the unbiased property. In order to satisfy the unbiased constraint, E(u(x i )) should be linear inû(x 0 ) as E(u(x i )) = s iû (x 0 ) where s is the scaled mean. The solution to the variance minimization will result in the optimal weights as BLUE requires the covariance and the scaled mean to estimate the SM at the unknown point. In our implementation both C and s are estimated from all historical data over the study period of 2017-2019.
6) Previously-observed behavior interpolation (POBI): This interpolation technique interpolates an unsampled point by quantifying how past observations of the quantity at the unsampled location varies with the past observations with the nearby points [22]. This relationship is defined by a linear regression model between the unsampled location and each nearby point separately using the past observations of the unsampled point and the nearby points. The estimated SM value at the unsampled point can be expressed aŝ where N is the number of neighbor pixels, a i and b i are the slope and intercept parameters of the regression line applied on previous measurements at locations x i and x 0 . λ i are the weights for each u(x i ). Where λ i is the square of the correlation coefficient between u(x 0 ) and u(x i ) computed over the past observations. In [22], the POBI is applied on the CYGNSS reflectivity values. Instead, in this paper, interpolation over the SM observations is implemented using the same approach via (6).

B. Implementation of the algorithms
Each interpolation method can work on a different size region or apply varying weighting mechanisms for an optimal performance. Next we provide the implementation details of the presented interpolation approaches: • For IDW, BLUE, and POBI, a window size of n × n (e.g., 5 × 5, 7 × 7, 9 × 9) pixel is used to interpolate an unknown point. Here, the unknown point is in the center of the window. The sampled measurements within the n × n window are used to calculate unknown points. This procedure is repeated for each unknown point in the dataset. • For the Kriging interpolation algorithm, a moving window size of 24×24 is utilized. The window is moved horizontally and vertically with 4 pixels overlapping. First, a Gaussian fitted semivariogram model was estimated for each 24 × 24 window using sampled SMAP SM data. These semivariograms were calculated for each day independently using the sparse observations within the specified window size. The number of data for semivariogram calculation can vary between 170 -310 depending on the number of CYGNSS observation in the region. In this study, a constant window size of 24 × 24 is utilized since it provides a good tradeoff between the number of available measurements and the local information. It This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. should be noted that using a small size window makes it difficult to obtain a proper fit for the semivariogram model because of decreased number of observations. However, increasing the size of the window will degrade the semivariogram model because of spatial heterogeneity of SM. All unsampled points within the 24 × 24 window are interpolated using the detailed Kriging approach.

IV. RESULTS AND DISCUSSION
In this section, the performance of the aforementioned interpolation approaches have been analyzed. First, spatial heterogeneity within the SMAP 9-km gridded data, the MSU-GRI CYGNSS SM data, and the CYGNSS reflectivity is examined. The interpolated images from a selected region is then illustrated. Then, the overall interpolation error of the tested algorithms over the whole CYGNSS observation region and time-span is presented. We then provide more detailed analysis of the observed interpolation error with respect to CYGNSS revisit rate, geographical distribution, land cover, SMAP retrieval quality flags, and number of annual CYGNSS observations. Finally, the overall performance of interpolated CYGNSS SM against SMAP and in-situ SM products were reported.

A. Heterogeneity Check
With the error analysis using SMAP data, we implicitly assume that the interpolation error metrics calculated using SMAP as a proxy for CYGNSS will be similar to what the true interpolation errors would be when interpolating CYGNSS SM. To assess the validity of this assumption, spatial heterogeneity maps for SMAP 9-km gridded data, the MSU-GRI CYGNSS SM data and CYGNSS reflectivity for the year of 2019 were generated for each EASE grid. This was achieved by calculating the standard deviation of the SM estimates and reflectivity for various window sizes (i.e., 45 × 45 km, 63 × 63 km, 81 × 81 km, and 99 × 99 km) and averaging throughout 2019. The comparisons are provided in Table I that provides the overall SM standard deviation for different calculation sizes and different land cover types. In addition, an example map is provided in Fig. 6. The results indicate that SMAP L3 9-km SM is more heterogeneous than CYGNSS SM except for shrublands and barren areas. This results, at first, might sound counter intuitive as one expects more variations in CYGNSS than SMAP due to CYGNSS's sensitivity to higher spatial scales and SMAP's 36 km native resolution. It is clear that the CYGNSS reflectivity has very large variations (up to 8 dB standard deviation) as seen in Fig. 6 (c). However, the sensitivity of CYGNSS reflectivty stems mostly from the presence of inland water bodies. On the other hand, the MSU-GRI CYGNSS SM algorithms explicitly removes transient open water fractions larger than 2% within the first Fresnel zone while the baseline SMAP retrieval algorithm corrects for large static water bodies only. This usually makes the dynamic range of SMAP larger than that of CYGNSS SM estimates (e.g., see Figs. 10-13 of [14]). To summarize, while there are some inherent differences in the heterogeneity of both SM products, both data sets follow similar spatial patterns and SMAP is found to be a good candidate as a proxy for CYGNSS interpolation testing.  Fig. 7, RMSE varies from 0.015 m 3 m −3 to 0.020 m 3 m −3 and the performance of each technique is specific to this particular region. The MAE is found to be between 0.097 -0.18 m 3 m −3 , while AAE changes between 0.008 -0.012 m 3 m −3 depending on interpolation algorithms. In order to provide the overall performance of the tested approaches, we now provide average results for the entire CYGNSS coverage.

C. Overall Interpolation Performance
This section presents the overall interpolation error (RMSE) by comparing original and interpolated SMAP data observations using different algorithms over the whole CYGNSS coverage area. The comparison was performed for all SMAP observation grids and SMAP recommended grids separately.  For IDW, BLUE, and POBI algorithms, the different sizes of interpolation windows were tested, while linear and natural interpolations are applied over the whole region at once. Table II shows the spatial interpolation errors of different methods over the SMAP SM estimations. The results indicate the RMSE of SM samples may vary between 0.025 m 3 m −3 and 0.032 m 3 m −3 when computed over all grids. The BLUE and POBI interpolation algorithms reached a score of 0.025 m 3 m −3 and 0.026 m 3 m −3 , respectively. However, these two algorithms are the only algorithms that uses prior statistics of SM values. For the other methods, the average interpolation error is around 0.030 m 3 m −3 . If we only consider SMAP's recommended grids the interpolation errors are much lower and vary between 0.013 and 0.016 m 3 m −3 , with smaller performance difference between interpolation algorithms. While using smaller window sizes for IDW, BLUE and POBI, a decrease in the achieved RMSE values is observed, and the coverage rate (ratio between the number of sampled or interpolated grids and the total number of grids on land within ±38 0 latitudes) for smaller windows is also less. For example, while using a window size of 5×5 provides an average of 87% spatial coverage, a 9 × 9 window provides near full coverage of 97% as shown in Table II. It should be emphasized that the RMSE values presented in Table II are only errors due to the interpolation, they are not absolute SM errors. If the SMAP SM values were correct, the interpolation approaches would contribute to the presented error levels in Table II.

D. Analysis of Interpolation Errors
In this section, the results obtained only using the BLUE algorithm (without loss of generality) are considered and analyzed. The spatial distribution of the interpolation error over the entire CYGNSS coverage is provided in Fig. 8. It indicates that interpolation error is very low (less than 0.01 m 3 m −3 ) in dry regions such as North Africa, Australia, and the Western U.S. However, the error dramatically increases for regions with substantial seasonal variations, such as south India, the north part of South America, and the equatorial band. These regions are frequently flagged by SMAP due to uncertainties in SM retrievals.
1) The Effect of SMAP Retrieval Quality Flag (RQF): While we use the SMAP L3 SM estimation as a reference SM value, its retrieval performance varies with respect to several factors such as water body fraction, coastal proximity, urban area, precipitation, terrain slope, and high vegetation water content. Uncertain RQFs generally belong to Amazon, central Africa, and seasonal flooding regions (e.g., Southeast Asia). Fig. 8 shows that the interpolation error is high for the region where SMAP retrieval is not recommended. We further divided the overall interpolation errors in Table II for all SMAP grids as well as only SMAP recommended grids. For all compared techniques, the interpolation errors for SMAP recommend grids are consistently much lower. The possible reasons are: (1) The regions where the SMAP RQFs are "recommended" generally have dry and homogeneous SM characteristics. (2) Because of low VWC, SMAP SM measurements are more consistent in time and spatial domain. However, SM variation in the regions where the SMAP RQFs are "not recommended" is high, and the retrieval system has its own error due to high VWC.
2) The Effect of Land Cover: In Table III, we show the interpolation errors with respect to different land cover types. Open regions such as open/closed shrublands and barren land cover are generally associated with relatively limited SM variation. Thus, the interpolation is more successful, and errors are relatively small (< 0.008 m 3 m −3 ) than other land cover regions. For the relatively high vegetation land cover areas (woody, savanna, grass, and cropland), the interpolation error is calculated as 0.016 m 3 m −3 . Spatial and temporal SM variation is relatively high for these areas, and this increases interpolation error.
3) Effect of CYGNSS Revisit Distribution: As mentioned in the introduction section (see Fig. 3), CYGNSS has unequal sampling characteristics. The sampling rate is high near 37 o latitudes while it decreases near the equator regions. Figure  9 shows the interpolation error changes by revisit rate of CYGNSS for different land cover types. The general trend for all land cover types is that interpolation error decreases with the increase of the CYGNSS revisit rate. In Fig. 9 barren and shrublands' relatively flat characteristics indicate that SM's spatial variation is low for these land covers. However, other land cover types' revisit rates for CYGNSS become more effective for interpolation performance due to the relatively high spatial SM variation since the performance gets better as revisit rate increases for these land types.  Fig. 10, we observe that interpolation error is decreasing with the increasing daily coverage rate, which is defined as the ratio of the number of 9-km EASE grids that have at least one CYGNSS observation during the considered time period and the total number of 9-km EASE grids on land within ±38 0 latitudes. This is well aligned with the fact that as more samples are available for each day, lower interpolation errors can be achieved.

E. The Performance of Interpolated CYGNSS SM against SMAP and ISMN sites
In this section, we interpolate the CYGNSS SM product [18] directly to produce a fully-covered SM product at 9km over the entire CYGNSS land coverage. We then evaluate independently against overlapping SMAP SM data for the "all grids" and "recommended SMAP grids" evaluation datasets by calculating the grid-based performance metrics. The RMSE and ubRMSE are computed for each EASE grid using all available samples during the study period and the mean of all grids are presented in Table IV. This includes the overall performance scores for sparse grids, interpolated grids only, and all grids. The all grids case includes both the original sparse CYGNSS observation and the interpolated SM observations to create a spatially complete, CYGNSSbased SM product. All cases are compared against SMAP observations that are available at the same grids on the same days. The obtained scores for all cases are similar. This indicates that the interpolated CYGNSS SM data has similar errors compared to SMAP as the CYGNSS SM observations are computed from CYGNSS measurements. Additionally, there is a relatively lower ubRMSE for interpolated CYGNSS estimations. This is because the average number of samples per grid of the interpolated CYGNSS product is higher than the un-interpolated product as given in Table IV. We also compare interpolated CYGNSS SM estimations against the in-situ SM measurements at 170 ISMN stations.   V. SUMMARY AND CONCLUSION This paper assessed interpolated CYGNSS-based SM estimations to generate a fully covered, daily high spatial resolution SM product. The overall results demonstrated that the gaps of quasi-random sampling of the CYGNSS could be filled successfully with the tested interpolation algorithms. The overall interpolation RMSE error computed over the SMAP SM data vary between 0.013 m 3 m −3 and 0.016 m 3 m −3 for SMAP recommended grids and 0.025-0.032 m 3 m −3 for all grids depending on the selected interpolation algorithm. In addition, when sparse CYGNSS data is interpolated with the BLUE approach and compared with SMAP or ISMN SM measurements, similar SM error levels are obtained for both the original sparse CYGNSS data and the interpolated data which indicates the benefit of the interpolation.
When interpolation errors are analyzed, substantially higher interpolation errors for high vegetation content regions (e.g., > 5kg . m −2 ) are observed since the SMAP RQF is uncertain for such regions including the Amazon, central Africa, and seasonal flooded areas. Interpolation error also depends on the revisit rate of CYGNSS. There are slight increments in interpolation error due to the decrement of revisit rate for the same land cover areas. Near the equator (e.g., between ±10 o latitudes), the sample rate of CYGNSS is lower, and the interpolation error of these regions remain relatively high. We observe a minor relationship between land cover type and interpolation error. Due to the high spatio-temporal SM variation, croplands, woody, forest, and savanna areas have a relatively higher interpolation errors.
Although this study did not aim to find the best interpolation technique, algorithms that use prior statistics (e.g., covariance and mean) of unsampled points and their neighbors such as BLUE or POBI demonstrated higher performance overall. Simpler techniques such as Linear and IDW showed slightly worse performance than other algorithms with 0.003 -0.007 higher RMSE values. Natural neighbor and Kriging techniques performed similarly. However, different types of variogram models and area size for the variogram calculations may increase the performance of the Kriging method. As a future study, semivariograms of SMAP and CYGNSS sparse retrievals or Fourier-transform-based approaches could be compared to investigate the different spatial scales of the two data sets. Regional or seasonal comparisons of interpolation algorithms may also be the subject of future studies.
This study used the SMAP L3 9-km gridded SM product to calculate interpolation error. However, this product is already an enhanced version of the 36-km radiometer SM product using Backus-Gilbert optimal interpolation techniques and may not represent real SM values. Because of this, the expected interpolation error could be slightly higher than the results presented in the study. In the study, we used a 9-km gridded CYGNSS product. Readers should consider that interpolation error will vary for the CYGNSS products with different aggregated resolutions. This study used popular spatial interpolation algorithms. The effect of more sophisticated techniques on both temporal and spatial domain still needs to be evaluated and compared. However, this study shows that CYGNSS SM products and interpolation have high potentials to provide daily high-resolution SM products over its quasi-global coverage.