A New Multiresolution CYGNSS Data Product for Fully and Partially Coherent Scattering

A new Cyclone Global Navigation Satellite System (CYGNSS) data product is described, which is generated from the raw intermediate frequency (IF) data. The product includes several established signal coherence detectors, including the power ratio <inline-formula> <tex-math notation="LaTeX">${P}_{\mathrm {ratio}}$ </tex-math></inline-formula>, complex zero-Doppler delay waveform, full entropy <inline-formula> <tex-math notation="LaTeX">${E}_{\mathrm {full}}$ </tex-math></inline-formula>, and a novel fast entropy detector <inline-formula> <tex-math notation="LaTeX">${E}_{\mathrm {fast}}$ </tex-math></inline-formula>. Both entropy detectors are provided with two temporal resolutions: 2 and 50 ms. Coherence performance is characterized using the phase derivative of the reflected signal at the peak of the delay waveform <inline-formula> <tex-math notation="LaTeX">${\varphi }_{\mathrm {peak}}$ </tex-math></inline-formula>. Threshold values of the full entropy detector are determined, which classify scattering into three regimes: incoherent, partially coherent, and coherent. Several scattered signal strength products are included: signal-to-noise ratio (SNR), reflected power <inline-formula> <tex-math notation="LaTeX">${P}_{g}$ </tex-math></inline-formula>, reflectivity <inline-formula> <tex-math notation="LaTeX">${\Gamma }$ </tex-math></inline-formula>, and normalized bistatic radar cross section (NBRCS). Each of these products is derived using a coherent integration time of <inline-formula> <tex-math notation="LaTeX">${T}_{c}$ </tex-math></inline-formula> = 1 ms and incoherent integration times of <inline-formula> <tex-math notation="LaTeX">${N}_{\mathrm {inc}}$ </tex-math></inline-formula> = 1000, 500, 250, 100, 50, and 2 ms. Signal strength time series at the shorter (2 and 50 ms) times provides excellent detection of land–water transitions in heterogeneous scenes. Delay Doppler maps (DDMs) are also generated with high delay (<inline-formula> <tex-math notation="LaTeX">${\Delta \tau }$ </tex-math></inline-formula> = 1/16 chip) and Doppler (<inline-formula> <tex-math notation="LaTeX">${\Delta f}$ </tex-math></inline-formula> = 50 Hz) resolution. The behavior of each signal strength product as a coherence detector is examined using the full entropy method as a reference. Performance is characterized using receiver operating characteristic (ROC) curves. The fast entropy method, which has a much lower computational cost, is similarly characterized. This suite of coherence detection methods can be used to detect the presence of small inland water bodies.

Digital Object Identifier 10.1109/TGRS.2023.3318639Fig. 1.Geodistribution of the 2022 CYGNSS Earth's land surface raw IF tracks.Target areas cover mainly tropical rainforest regions, as well as Southern USA, Sudan, India, and Vietnam.Specific sites were selected by the CYGNSS Science Team based on scientific requirements for ongoing and future investigations.
In addition, CYGNSS captures raw intermediate frequency (IF) signal tracks, e.g., [17], [18], and [19], over specific Earth's surface areas (see Fig. 1).The main foundation of this CYGNSS raw IF calibrated product is on the generation of improved quality high-resolution delay Doppler maps (DDMs) with shorter integration times, using all the available raw IF tracks collected over land surfaces.This multiresolution product is currently available for the CYGNSS inundation working group.It offers a unique opportunity for studies dedicated to, e.g., river width [20], river slope [21], wetlands [22], floods [23], single pass SMC retrievals [24], and development and validation of new GNSS-R scattering models, e.g., [25] and [26].This scenario also offers the possibility to test and develop new GNSS-R opportunities in preparation for the future European Space Agency (ESA) HydroGNSS microsatellite constellation mission [27], which is focused on land surface studies and plans to include an onboard coherent channel to continuously collect peak-DDM complex data and raw IF tracks over areas of interest.
Earth's surface water monitoring is probably one of the most promising applications of GNSS-R missions because of their high spatiotemporal sampling capabilities and the high spatial resolution under the coherent scattering regime.This is an important research question, and it deserves further investigation.Theoretically, the resolution is limited by the size of the first Fresnel zone, with a nonnegligible influence of higher order zones, e.g., [28], [29], [30], [31], and [32].It was found that short N inc provides a reasonable tradeoff between an acceptable along-track spatial resolution and a reduced signal noise level, which enables to detect land-water transitions accurately [33].We encourage the users to investigate this, and additional research questions, to further advance our understanding of the ultimate capabilities of GNSS-R.To do so, the product is delivered with a wide variety of observables [34], such as power DDMs (P g , , and BRCS), including the corresponding information at the peak p, as well as the normalized bistatic radar cross section (NBRCS), and a wide variety of coherence detectors, i.e., complex delay waveforms, ϕ peak , P ratio , E full ,and E fast .In addition, quality flags (see the Appendix) and a comprehensive set of metadata, e.g., Moderate Resolution Imaging Spectroradiometer (MODIS) land cover type, multiresolution Pekel water masks, and topographic roughness, are included.
This article is organized as follows.First, we introduce the CYGNSS raw IF data in Section II.Then, the raw IF signal processing is presented in Section III, describing the details of the enhanced DDMs, the signal calibration approach, and the derived land surface observables.Then, the coherence detectors are presented and described in Section IV.Finally, Section V concludes this article.

II. CYGNSS RAW IF DATA A. Introduction
The raw IF samples collected by the delay Doppler mapping instrument (DDMI) onboard CYGNSS contain the highest possible resolution over delay and Doppler space.The access to this raw IF data enables high-resolution processing onground, in many ways not possible in real time by the instrument.For example, the raw IF samples can be processed to extract in-phase I and quadrature Q information of the complex DDMs for high-quality evaluation of coherent Earth's surface scattering.In addition, comprehensive studies of the received reflected signal power can be performed using variable delay and Doppler spacing and integration times, thus permitting the generation of extremely high-resolution DDMs for advanced science applications, including ocean and land.

B. Baseline L1 Mission Product
After the antenna and the low-noise amplifier (LNA), the GPS signal enters the onboard receiver, where it is downconverted and digitized.These raw IF samples are then processed using a fast Fourier transform (FFT) technique, implemented in the receiver.The real-time output is a DDM with 128 delay bins at 1/4 chip delay steps and 20 Doppler bins at 500-Hz steps.The scattered signal power is processed using a coherent integration time T c = 1 ms and an incoherent averaging time N inc = 1000 ms (December 2016-August 2019) or 500 ms (August 2019-present).This onboard full DDM is compressed to meet the satellite downlink requirements.The compressed DDM includes 17 delay bins and 11 Doppler bins, which are centered at the onboard estimated nominal specular point, and the bit depth of each bin is truncated from 32 to 8 bits.A more precise geolocation of the nominal specular point is performed on-ground.These DDMs are generated in raw uncalibrated units, and calibration is performed at the University of Michigan (UMich) [35].

C. Raw IF Data Collection
Starting from 2017, some few raw IF datasets are being collected by the DDMI and downloaded to ground for improved studies over specific target areas, with specific interest for the CYGNSS science team.More recently, a new raw IF data collection strategy was approved in winter 2022 so that ∼20 raw IF tracks are downloaded continuously every week.Two files for each raw IF data collection are generated: one raw IF metadata file and one raw IF data file.The raw IF metadata file contains the ID of the spacecraft, a single Data Recorder Track 0 (DRT0) packet, and one or more pulse per second (PPS) packets.The raw IF data file includes three sets of raw signal sample streams (∼60 s) received by the zenith antenna and the nadir antennas at port and starboard.All raw IF tracks collected by the CYGNSS mission to date include byte interleaved data from all three antenna channels of the DDMI, numbered as follows: zenith navigation antenna, nadir starboard side science antenna, and nadir port side science antenna.
All samples for all channels are saved as 2-bit values and are interleaved.The source of the binary raw IF data is byte 9-byte N of the File Transfer Packet Data (FD00) packets emitted by the DDMI.The raw IF file contains bytes 9-N of multiple contiguous FD00 packets.The FD00 packets are expected to contain consecutive sequence byte numbers.If a missing FD00 packet is detected, 2048 zero bytes are inserted in the raw IF file in place of the missing data.The first FD00 packet in the stream carries a DRT0 header block at the beginning of the data bytes.The rest of this first packet and all subsequent packets contain the binary raw IF sample data.

A. Introduction
The bandwidth of the raw signal is ∼2.5 MHz, centered at the IF of ∼3.8 MHz.In the nominal mode, the IF signals are sampled with a sampling rate of ∼16.0 MHz and a resolution of 2 bits per sample for both direct and reflected channels.The ground-based raw IF processing uses a delay-domain FFT technique to perform the correlations at all the delay samples using a frequency-domain multiplication.The delay sampling is configured based on the decimation of the sampling rate and the Doppler processing range.N inc and the delay Doppler sampling properties are also configured depending on the Earth's surface type, i.e., ocean versus land.The raw IF processing is designed and built on a Linux Ubuntu laptop using the GNU Compiler Collection (GCC) compiler and the FFTW library.The code is written in C with the FFT processing based on the open source "fastgps" signal acquisition processor [36], upgraded to accept the CYGNSS raw IF data format and perform GNSS-R specific processing tasks, including variable noncoherent integration.

B. Multiresolution Enhanced DDMs
Different types of enhanced DDMs can be generated to improve and enable new scientific applications over land and ocean by CYGNSS (see Table I).The main three DDM types are illustrated in Fig. 2. In so doing, the key parameters are the following: coherent integration time T c , incoherent integration time N inc , delay bin resolution τ , Doppler bin resolution f , delay window d w , and Doppler window D w .This article is focused on Earth's land surfaces [see Fig. 2(a)].
All enhanced DDMs are generated with f = 50-Hz Doppler bin resolution and τ = 1/16 chip delay bin resolution.The DDMI estimation of the Doppler at the nominal specular point is introduced in the raw IF processing to reproduce as much as possible the onboard scenario for calibration  The output power after raw IF processing is not exactly the same at different delay sampling rates because the processing uses 1-ms FFTs in each case on different length vectors to perform the correlations.The shorter the vector, the faster the FFT, but delay samples are thrown out.The highest achievable resolution, which has the slowest FFTs, is at the full sampling rate or 16 samples per chip (i.e., 1/16 chip delay bin resolution).As such, the use of 1/16 chip delay bin resolution without sample decimation provides the highest fidelity results.The required longer run time is not a limiting consideration because the processing is performed on-ground.The raw IF processing output DDMs are generated in raw uncalibrated units, known as counts (see Fig. 2).There is a scale difference between these DDMs and those generated by the DDMI probably due to the onboard compression algorithm Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
and the fact that the DDMI uses the "Zoom" Doppler domain correlator (and possibly some internal, unknown, bit overflow shifting/scaling) [37].This difference must be compensated.In addition, both processing chains must be referenced to the GPS absolute time.Thus, before radiometric calibration, time synchronization and scaling strategies must be defined.

C. Time Synchronization and Geolocation
Accurate GPS timing information derived from the direct signal received by the zenith navigation antenna is used to improve the geolocation of the nominal specular point over the Earth's surface.The nominal specular point position is computed over the world geodetic system (WGS) 84 ellipsoid of reference, and then, the solution is projected over the NASA Shuttle Radar Topography Mission (SRTM) global digital elevation model.This improvement compared to the nominal mission data enables more accurate land-surface studies.The original data strategy was designed for ocean wind speed retrieval for long N inc , and the onboard timing precision is adequate to that goal.
In addition, a lag-correlation technique is used for each pseudorandom noise (PRN) code number, to align the onboard generated time series and the raw IF-based time series, before calibration.Lag correlation is the correlation between two series where one of the series has a time lag with respect to the other.This is required to compensate for a variable temporal offset that is present between the two time series.The compensation of this offset is used to align the raw IF time series with the onboard operations in real-time processing.Time series at different N inc is aligned at the beginning of each sampling rate's integration period.

D. Calibration Strategy
1) Scaling: Uncalibrated DDMs are measured in counts.These counts are linearly related to the total signal power generated by the raw IF processing (C if ) or the DDMI (C ddmi ).
Raw IF-based DDMs (C if ) in counts can be related to the arriving signal power as follows: where τ is the delay, f is the Doppler, G if is the end-toend gain of the raw IF processor, P a is the thermal noise power generated by the antenna (in Watts), P r is the thermal noise power generated by the instrument (in Watts), P g is the scattered GNSS power (in Watts), and P e (in Watts) represents certain parameters of the real-time signal processing performed by the DDMI on orbit, which are unknown, and so cannot be exactly duplicated by the ground processing of raw IF data.DDMI-based DDMs (C ddmi ) in counts can be related to the arriving signal power as follows: where G ddmi is the end-to-end gain in real-time DDMI processing.The DDMI L1 (red) trace is so low that it seems to be a flat zero as its 10 5 versus 10 14 for the raw IF (see Fig. 5 for trend after scaling).
Combining ( 1) and ( 2), both types of uncalibrated DDMs can be linearly related as follows: The first step in the calibration is to estimate the coefficients to transform and After lag correlation, a linear regression between the times series (see Fig. 3) at the maximum peak (subscript p) of the raw IF-based (C if, p ) and the DDMI-based DDMs (C ddmi, p ) at N inc = 500 ms is used (see Fig. 4) to empirically determine the coefficients for each PRN code number per channel.This is necessary because certain parameters of the real-time signal processing performed by the DDMI on orbit are unknown and so cannot be exactly duplicated by the ground processing of raw IF data.These parameters include the start and stop times of the coherent correlation relative to the time-varying phase of the GPS L1 carrier signal and the phase tracking algorithm used to synchronize the timing of the correlator.Both parameters can affect the scaling of the DDM data products.
The scale factor is derived with the corresponding high delay and Doppler bin resolution raw IF time series [ f = 50 Hz and τ = 1/16 chip] because the output power of the raw IF processor is different at different delay sampling rates.In addition, the impact of different N inc 's is considered.The output reflected power changes with different N inc 's since the software processing computes an accumulation over the incoherent integration time.It does not compute an average over the incoherent integration time.Thus, there is a need to scale linearly with the length of the integration interval as Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.follows (see Fig. 5): where N inc is any incoherent averaging and N inc,0 is the incoherent averaging time used to derive the scale factor ( p1, p2), i.e., 500 ms after August 2019.It is worth highlighting that the scale factor ( p1, p2) is derived for each raw IF track.Fig. 5 shows the after-scaling time series.The output reflected power of the different time series is unbiased.In addition, the raw IF time series shows a higher dynamic range and an improved resolution compared to the DDMI because of the shorter integration times down to N inc = 2 ms, and the higher delay and Doppler bin resolution [ f = 50 Hz and τ = 1/16 chip].
After scaling, the signal-to-noise ratio (SNR) is computed as follows:  where C N ,if,scaled,N inc is the noise power.Noise information is calculated separately for each DDM, using pixels in which no signal power is present, i.e., the 180 top delay rows of the original uncompressed 512 × 200 DDMs [ f = 50 Hz and τ = 1/16 chip] in counts.
2) Radiometric Calibration: This section describes the radiometric calibration [38], [39], [40] approach used to convert each bin of the after-scaling DDMs into counts C if,scaled,N inc (τ, f ) to units of Watts P g (τ, f ).After radiometric calibration, the reflected power DDMs (in Watts) are derived as follows (see Fig. 6): Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.where C=(C if,scaled,N inc (τ, f )−C N ,if,scaled,N inc ), P B is the blackbody thermal noise power, and C B is the DDM in the blackbody state.
The DDM noise power is expressed in counts as where C N ,if,N inc is the noise of the before-scaling DDMs.P B and C B are derived onboard during the blackbody calibration when a calibration switch selects between the nadir antenna and a blackbody target as the source of the input signal to the DDMI.The blackbody calibration is performed every 60 s for each nadir antenna.The DDM in the blackbody state is given as follows: P B can be expressed as follows: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.significantly [41].In addition, the blackbody measurements are on-ground resampled and interpolated to the measurement time.
The thermal noise power generated by the DDMI can be expressed as follows: where NF is the DDMI noise figure and T r is the DDMI temperature.The NF versus temperature profile was characterized prelaunch for all the instrument LNAs.NF is continuously updated in-orbit.
The reflectivity DDM is estimated as follows [see Fig. 7(c)]: where R t and R r are the ranges from the transmitter and the receiver to the Earth's surface scattering area, G r is the receiver antenna gain, λ is the signal electromagnetic wavelength, and EIRP is the transmitter equivalent isotropically radiated power of the right-hand circular polarization (RHCP) direct GPS signal.
The BRCS DDM is estimated as follows [see Fig. 7(d)]: The surface NBRCS is estimated as follows: where A eff is the effective scattering area corresponding to four delay bins × ten Doppler bins cell around the peak over land surfaces because the scattered signal is concentrated in a small area near the specular point.For the inversion of the land observables, the calibration variables are approximated with the values corresponding to the peak.This is used for the calibration of the ocean DDMs [see Fig. 2(b)] but approximating at the nominal specular point [38].Thus, it can be applied over land surfaces because the scattering is more specular (scattered signal is concentrated in a small area near the specular point) than over the ocean.However, when the scattering is very diffuse, i.e., off-specular [see Fig. 2(c)], this approximation is no longer valid, and per-bin calibration variables must be used along the complete d w and D w windows (see Table I).
In addition, the delay Doppler ambiguity introduced by the Woodward ambiguity function (WAF) must be considered.

E. Time Series Analysis: Reflectivity Case Study
The time series of peak reflectivity p = (τ p , f p ) is selected to highlight some product capabilities, but more observables are available for the final users.Two IF tracks are considered over a Surface Water and Ocean Topography (SWOT) mission cal/val site (see Figs. 8 and 9) and the Pacaya-Samiria region in the Amazon basin (see Figs. 10  and 11).The corresponding geolocation of p using Landsat imagery is depicted in Figs. 9 and 11.N inc = 2-and 50-ms p time series shows a higher signal dynamic range and a higher spatial resolution than those corresponding to N inc = 500 ms.There is clear evidence of the sensitivity of p to the presence of surface water, showing the capability to detect small inland water bodies (see Fig. 12) and land-water transitions over complex heterogeneous scenes.
In the SWOT site, low p ∼ <0.1 levels [see Fig. 8(a)] appear over vegetated areas because of the larger attenuation of the reflected signal by the vegetation cover [42], [43], [44] [see Fig. 9(a)].On the other hand, in the inundated terrain located in one lateral of the Amazon River, p increases up to ∼0.2.The Amazon River is then divided into several arms, surrounded by vegetation.p can capture sharp transitions land-water-land.Within the river, there is a combination of sharp peaks p ∼ 0.5 with lower values ∼0.2 probably because of the different surface roughness conditions across the wide Amazon River.It is worth pointing out that the spatial resolution gradually decreases with increasing N inc .The 500ms p time series is not able to resolve this complex scene [see Fig. 9(d)].On the other hand, increasing N inc helps to reduce signal noise, e.g., N inc = 2 ms versus N inc = 50 ms.
The Pacaya-Samiria scene (see Fig. 11) is characterized by a complex-form river within the Amazon basin, surrounded by dense vegetation and several other small-size water bodies (see Fig. 12).The orientation of the selected raw IF track is interesting because the nominal specular point crosses some of these water bodies and, in addition, is laterally surrounded by a few of them.The N inc = 2-ms p time series [see Fig. 11(a)] resolves this complex scene better than longer times, showing that the spatial resolution under the coherent scattering regime is fine enough to detect small inland water bodies, while the impact of higher order Fresnel zones appears negligible in this scenario.
However, there is a nonnegligible impact on target areas characterized by a large and smooth surface water extension surrounded by heavy vegetation [31], [32] (see Fig. 12).

A. Introduction: Coherent and Incoherent Scattering Modeling
There are several correlation techniques to extract the geophysical information from the scattered GNSS signals.CYGNSS uses the classical GNSS-R or cGNSS-R.The scattered electric field is cross-correlated with a replica of the known GNSS codes, so as to generate the so-called complex DDMs (see Fig. 13) where a represents the modulating PRN code, u is the received signal, f c is the GNSS carrier frequency, and t is the time.In a spaceborne scenario, increasing T c is debatable as the motion of the receiver (∼7 km/s) will limit the signal phase coherence.Some results estimated that close to T c = 1 ms is near optimal and T c ∼ 0.8 ms for a faster moving Shuttle instrument [11], while T c = 1, 2 ms for a classical GNSS-R configuration [45].However, since the scattered signal is of even weaker amplitude than the direct one, and in addition, it suffers from speckle noise, a large number of incoherent averages have to be done to improve the SNR of the so-called power DDMs Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.Power DDMs can be modeled as the sum of two terms [13], [28], [32], [43] where ⟩ can be obtained as follows [13]: where χ is the WAF, σ 0 is the bistatic scattering coefficient, γ is the transmissivity of the vegetation [44], and A is the scattering area, which is limited to the so-called glistening zone.
Under the Huygens-Kirchhoff principle, ⟨|Y coh (τ, f )| 2 ⟩ can be expressed as in (20), shown at the bottom of the page, [29], [30], [31], [32], where υ is is the characteristic free space wave impedance ∼120π, k is the angular wavenumber, θ i is the incidence angle, R is the complex Fresnel reflection coefficient, and σ is the surface height standard deviation (related to small-scale surface roughness).
Results have demonstrated the impact of higher order Fresnel zones on the total coherent reflected power as collected by a GNSS-R sensor, showing ringing fluctuations in the reflected power near high-contrast boundaries.This theoretically determines the spatial resolution of the coherent scattering over heterogenous areas.

B. Coherence Detectors: Definitions
Several coherence indices are provided to help the final users in selecting the most appropriate observable for each targe area, depending on the dominant scattering mechanism, i.e., coherent or incoherent.
The full entropy detector E full is obtained by applying the von Neumann entropy definition to the corresponding GNSS-R density matrix D full as follows [46]: where the symbol Tr is the trace of the matrix.
The full GNSS-R density matrix D Full is calculated after normalization of the eigenvalues β using its trace where β is the diagonal eigenvalues matrix generated with the generalized eigen decomposition (GED) of the correlation matrix Q where φ is the eigenvectors matrix and the superscript tp denotes the matrix transpose.Q is generated from the N sequential snapshots of the zero-Doppler delay waveforms Z [T c = 1 ms and N inc = 1 ms] (see Figs. 14 and 15) as , where H denotes the Hermitian transposition.In this study, 48 bins centered at the maximum peak of the waveforms are used for entropy calculations.Tr(β) is equal to the sum of its eigenvalues.E full output values are finally generated with two different temporal resolutions of 2 and 50 ms.This metric uses the idea of entropy as an estimation of the information available in the eigenvalues of the correlation matrix.The eigenvalues are a reliable estimator of the energy distributed along the dimensions of the signal subspace.E full ∼ 1 represents a uniform eigenvalues' distribution, which is an indication of totally incoherent scattering.On the other hand, E full ∼ 0 indicates the presence of a dominant eigenvalue, which is an indication of totally coherent scattering.The full entropy E full detector is based on the variations in time of the scattered signal, as well as in the shape of the waveforms.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.The fast entropy detector E fast is a fast and approximated calculation of E full .It is computed using the von Neumann entropy definition as follows:  where D fast is the fast GNSS-R density matrix, which is defined as follows: where η is the diagonal eigenvalues matrix.η 1, is the largest eigenvalue of the correlation matrix after whitening correlated additive noise Q w , which is computed using the so-called power method [47].The second eigenvalue η 2 is  defined as the mean value of the remaining eigenvalues.
The power method is used to iteratively find the eigenvalue of Q w that is largest in absolute value, i.e., the dominant eigenvalue of Q w .Although this restriction may seem severe, dominant eigenvalues are of primary interest in many physical applications, which is the case.The power ratio P ratio is the ratio of the raw counts of the 13 × 51 delay Doppler grid surrounding the peak value of C if [ f = 50 Hz and τ = 1/16 chip] over the sum of the rest of the values as follows [48]: where and

C. Time Series Analysis: Detectors' Intercomparison
One single raw IF track over a complex heterogeneous scene in the Pacaya-Samiria region is selected to show an intercomparison of the time series of the three coherence detectors previously described (see Figs. [16][17][18]. The full entropy E full and fast entropy E fast suddenly decrease when a water body approaches the nominal specular point because the coherent scattering becomes dominant, and thus, there is a dominant eigenvalue (see Figs. 16 and 18).Consequently, the reflectivity p increases, which is an additional indication that the surface scattering is mainly coherent ⟩ over surface water bodies [28], [46], [48].On the other hand, in the presence of vegetated areas, both entropy indices (E full and E fast ) increase because the scattered signal becomes incoherent ⟨|Y inc (τ, f )| 2 ⟩.It is worth pointing out that E full shows more clear transitions from coherent to incoherent regimes and a higher signal dynamic range than E fast [see Fig. 16(a) and (b)].In addition, the power ratio P ratio time series is depicted [see Fig. 18(c)].P ratio increases in the presence of water.P ratio was designed primarily for use when signal phase information is not available (i.e., the standard CYGNSS L1 products).It has been adapted for its application with short integration times and high delay and Doppler resolution DDMs by tuning the noise exclusion threshold process.conditions, while the Pekel mask is based on averaged optical Landsat imagery from the global surface water explorer [49].This is the reason why the Pekel mask is not able to show the presence of water over areas covered by vegetation (see    ⟩ scatterings, depending mainly on surface roughness [50], [51], [52], [53], [54].The full entropy E full is the most powerful coherence detector presented in this study [see Fig. 16(a)].Here, it is selected as the main truth reference for further evaluation.The phase derivative at the peak of the reflected complex zero-Doppler delay waveforms ϕ peak is used to derive the full entropy levels that can be used to classify the different scattering regimes.ϕ peak is computed as the arctangent of the product of the complex peak Y n (τ p , f p ) and the conjugate of the previous complex peak Y n−1 (τ p , f p ) with T c = 1 ms.
Two raw IF tracks have been selected over target areas with differentiated geophysical properties (see Fig. 19): the first one over the SWOT cal/val site (see Figs. 19(a) and 20) and the second one over the Pantanal [55], [56], which is the world's largest tropical wetland area (see Figs.The classification is based on E full time series with 50 ms of resolution because of the noisier performance with 2 ms.The variance decreases by increasing the number of waveforms (see Fig. 15) for the estimation of each entropy value.The larger the number of waveforms under consideration, the larger the heterogeneity of the equivalent footprint, but the overall entropy is lower.This is relatively similar to speckle noise mitigation by coherent averaging, which increases the coherence of the signal compared to lower T c .In addition, it is worth to comment that the minimum entropy levels are of the same order, independently of the number of waveforms considered for calculations.The incoherent scattering regime is characterized by random phase fluctuations and full entropy levels higher than E full ∼ 0.7.This state appears in densely  vegetated areas without surface water (see Fig. 19).The partially coherent regime is characterized by a noisy linear trend of the phase derivative ϕ peak and full entropy levels between E full ∼ 0.3 and E Full ∼ 0.7.This state is found in regions such as the Pantanal wetlands (see Fig. 19).Finally, the coherent scattering regime is found for full entropy levels below E full ∼ 0.3.ϕ peak is clearer in this state, which is associated with the presence of inland water bodies without upwelling vegetation cover, i.e., low γ levels.In this scenario, when the surface is smooth, the scattering is highly coherent, and the reflected signal phase can be tracked.
This characterization has been found to provide an accurate scattering classification over the complete CYGNSS raw IF dataset.In this section, two tracks have been selected to illustrate their temporal series, and in Section IV-E, the study is performed using the full dataset.

E. Full Dataset Assessment: Fast Versus Full Entropy
Finally, all the available raw IF tracks collected over land surfaces in 2022 are selected for a global scale assessment.The performance of E fast versus E full is evaluated as a function of SNR [see Fig. 22 22(d)].E fast has a lower computational requirement [see (24)] than E full [see (21)], but there is a clear functional relationship between both entropy metrics.For the coherent scattering regime, i.e., E full ∼ <0.3, the relationship appears linear, which means that E fast is an Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.excellent coherence detector, despite the use of the power method for the estimation of the dominant eigenvalue.On the other hand, it seems that E fast saturates for E full ∼ >0.7.This is probably because of the better capability of E full to capture the incoherent scattering regime and its higher signal dynamic range (see Fig. 16).This improved performance is possible because of the higher accuracy of the GED in determining the eigenvalues than numerical methods.E fast uses the mean value of the remaining eigenvalues, as an estimation of the second eigenvalue η 2 , and thus, there is a limitation in the detection of the incoherent scattering, which is associated with a uniform eigenvalues' distribution.
Higher values of SNR [see Fig.   [8], using E full as proxy data (see Fig. 23).The scattering is incoherent if the full entropy is E full ∼ >0.7, and it is coherent when full entropy is E full ∼ <0.3 (see Figs. 20  and 21).
The optimum operating point of an ROC curve is defined at the point of inflection when the slope of each curve crosses from above to below unity (see Fig. 23).This point marks the transition from more true than false detections to more false than true.In addition, the ratio of the probability of detection (PD) and the false alarm rate (FAR) is related to the area between each curve and the diagonal line (PD = 0 & FAR = 0 to PD = 1 & FAR = 1) in Fig. 23: E fast_area (∼0.49),SNR area (∼0.47), and P ratio_area (∼0.47),NBRCS area (∼0.46), and p_area (∼0.46).Based on both the optimum points of the curves and the area under the curves, E fast shows a behavior almost similar to E full , which confirms the functional relationship found in Fig. 22.The threshold levels (one for each observable) at the optimum operating points of the ROC curves can be used for coherence detection and, thus, for inland water bodies tracking.Future versions of this product could include more coherence detectors [57].

V. CONCLUSION
The new CYGNSS multiresolution land data product based on enhanced quality DDMs is generated after processing and calibrating raw IF collections.p time series with low integration times, e.g., N inc = 2 and 50 ms, captures land-water transitions with high accuracy because of the higher signal dynamic range and the higher spatial resolution compared to the standard mission L1 product.In addition to power observables, the use of complex DDMs allows us to characterize the performance of more advanced algorithms for coherence detection.Bistatic scattering regimes are classified as incoherent (E full ∼ >0.7), partially coherent (0.3 ∼ <E full ∼ <0.7), and coherent (E full ∼ <0.3) by analyzing time series of ϕ peak .Based on this truth reference state generated with E full , ROC curves are used to characterize the performance of E fast , SNR, P ratio , NBRCS, and p .A PD higher than ∼95% for an FAR lower than ∼5% is found at the optimum points of the ROC curves using all the 2022 raw IF tracks over land surfaces.In particular, E fast appears to be an excellent water detector, showing a clear linear functional relationship with E full under the coherent scattering regime.The lower computational requirement of this detector makes it suitable for in-orbit water monitoring.
The ultimate goal of this article is to show the capabilities of our product to the users for further advancement of the CYGNSS science team investigations over land surfaces, with special interest in inland water bodies monitoring.This is currently a hot research topic with the SWOT mission aiming to resolve ∼100-m-wide rivers and ∼250 × 250 m lakes and reservoirs.

APPENDIX QUALITY FLAGS
See Tables II and III.
purposes.The DDMI Doppler information is provided every N inc = 1000 ms (December 2016-August 2019) or every N inc = 500 ms (August 2019-present).A spline method is applied to the DDMI-based Doppler vector to generate more precise Doppler inputs for lower integration times N inc = 250 ms, N inc = 100 ms, N inc = 50 ms, and N inc = 2 ms.The delay bin resolution is determined by the effective sampling rate.In the nominal mode, raw IF data are sampled at a rate of ∼16 MHz, which is approximately 16 000 samples per 1 ms.Approximately, 16 samples per GPS coarse/acquisition (C/A) code chip is a delay chip resolution of 1/16 chip in the DDMs.The raw IF processing allows the selection of different delay bin resolutions so that the ∼16-MHz samples are downsampled.1) 1/8 Chip Delay Bin Resolution: Downsampling by a factor of 2 to a ∼8-MHz sampling rate or eight samples per chip.2) 1/4 Chip Delay Bin Resolution: Downsampling by a factor of 4 to a ∼4-MHz sampling rate or four samples per chip.3) 1/2 Chip Delay Bin Resolution: Downsampling by a factor of 8 to a ∼2-MHz sampling rate or two samples per chip, which is the Nyquist limit for the GPS C/A code.

Fig. 3 .
Fig. 3. Before-scaling comparison of DDMI [ f = 500 Hz, τ = 1/4 chip, and N inc = 500 ms] and raw IF [ f = 50 Hz and τ = 1/16 chip] peak reflected-power time series [counts].These time series correspond to a single PRN number within one raw IF file.Track acquired on April 22, 2022.The DDMI L1 (red) trace is so low that it seems to be a flat zero as its 10 5 versus 10 14 for the raw IF (see Fig.5for trend after scaling).

Fig. 4 .
Fig. 4. Sample scale factor between DDMI [ f = 500 Hz, τ = 1/4 chip, and N inc = 500 ms] and the N inc = 500-ms raw IF [ f = 50 Hz and τ =1/16 chip] time series of reflected peak power in raw uncalibrated units [counts] after lag correlation.This sample scale factor corresponds to a single PRN number within one raw IF file.Track acquired on April 22, 2022.

Fig. 12 .
Fig. 12. Zooming GNSS-R raw IF reflectivity p over high-interest target areas: (a) track acquired on March 27, 2022 [see Fig 9(a)] and (b) track acquired on February 16, 2022 [see Fig. 11(a)].Land-water transitions are accurately captured by CYGNSS.Over water, the reflectivity increases because of the higher dielectric constant, smoother surface, and the absence of vegetation cover.It is also worth highlighting that the signal fluctuations over water are probably due to variable surface roughness.

Fig. 14 .
Fig. 14.Sample temporal series of delay vector over the White River Basin, AR, USA (track acquired on January 15, 2022).

Fig. 20 .
Fig. 20.Time series in the SWOT cal/val site (track acquired on March 27, 2022): (a) ϕ peak and (b) E full .Full entropy levels are identified for the classification of several scattering regimes: incoherent, partially coherent, and coherent.

Fig. 21 .
Fig. 21.Time series in the Pantanal (track acquired on July 7, 2022): (a) ϕ peak and (b) E full .Full entropy levels are identified for the classification of several scattering regimes: incoherent, partially coherent, and coherent.

Fig. 17
shows a comparison of the full entropy coherence detector E full and the 2-km resolution Pekel water mask.E full provides real-time information regardless of weather Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

Fig. 22 .
Fig. 22. Scatter plots of E Fast versus E Full (resolution 50 ms) using all the available raw IF tracks collected over land surfaces in 2022, as a function of the following parameters with N inc = 50 ms: (a) SNR, (b) P ratio , (c) NBRCS, and (d) p .

Fig. 23 .
Fig. 23.ROC curves of the peak-observables (SNR, p , and NBRCS) and the coherence detectors (E fast and P ratio ) using E full as the truth reference.Entropy detectors with 50 ms of resolution and the rest of the parameters with N inc = 50 ms.

Fig. 17 )
Fig.17), while the GED of Q enables the capability to detect small coherence changes (see Fig.17).
Fig.17), while the GED of Q enables the capability to detect small coherence changes (see Fig.17).
Fig.17), while the GED of Q enables the capability to detect small coherence changes (see Fig.17).
19(b) and 21).Both regions are characterized by the presence of surface water covered by heavy upwelling vegetation.This scene remains unresolved for optical sensors.Three different scattering regimes are shown in Figs.20 and 21, including incoherent, partially coherent, and coherent.

TABLE I SUMMARY
OF THE PROPERTIES OF EACH RAW IF DATA PRODUCT TYPE.THIS ARTICLE IS FOCUSED ON LAND.COHERENT INTEGRATION TIME T c , INCOHERENT INTEGRATION TIME N inc , DELAY BIN RESOLU-TION τ , DOPPLER BIN RESOLUTION f , DELAY WINDOW d w , AND DOPPLER WINDOW D w

TABLE II PER
-DDM QUALITY FLAGS 1

TABLE III PER
-DDM FLAGS 2