Optimizing Satellite Mission Requirements to Measure Total Suspended Solids in Rivers

Human modification of the landscape affects total suspended solid (TSS) concentrations in water. The quantitative extent of these changes remains poorly understood, partly because of the challenges associated with observing TSS dynamics in inland waters over large scales. While many current missions and sensors provide usable data to estimate inland water quality (e.g. Landsat series, VIIRS, and Sentinel-2), future missions present the opportunity to increase transferability and accuracy of TSS estimation. Here, we degrade assumed ideal spectral data to evaluate the optimal data quality for TSS retrieval using an optical sensor configuration. We also perform wavelet analysis and a river size distribution analysis to study temporal and spatial data quantity requirements, respectively. We find that while the highest resolution data always gives the best retrieval accuracy, some factors are more essential in TSS estimation than others and can simplify mission design. Specifically, fine hyperspectral resolution is key in improving retrieval accuracy and a finer spatial resolution allows exponentially more river surface area to be observed. A revisit period of approximately five days or less best captures TSS pulse events, such as floods. Understanding the optimal mission specifications for observing inland water quality, especially TSS, will assist in developing and proposing future optical satellite missions.


Molly K. Stroud, George H. Allen, Marc Simard, Daniel Jensen, Ben Gorr, Student Member, IEEE, and Daniel Selva, Senior Member, IEEE

I. INTRODUCTION
Knowledge of total suspended solids (TSS) within surface water is key to understanding geophysical processes within inland water bodies such as rivers, lakes, and coastal wetland systems. The movement of sediment through inland waters maintains ecosystems [1], structures landscapes [2], and transports nutrients [3]. TSS has also been shown to correlate with other water quality parameters, such as heavy metals [4], total phosphorus [5], and turbidity [6]. Human modification of the landscape has led to changes in TSS transport patterns, on both local and global scales [7]. Anthropogenic activities such as agriculture [8] and deforestation [9] increase sediment delivery to inland waters, impacting water quality, channel stability, and local ecosystems. Simultaneously, structures such as dams impound massive amounts of sediment in reservoirs and have reduced the flux of sediment reaching the world's coastal regions by over 1 billion metric tons per year [7]. Changes in sediment regimes can harm aquatic species that are adapted to sediment-rich water or seasonal sediment variations [1], [10]. As rivers fail to deposit the quantity of sediment necessary to maintain coastal elevations, coasts may experience major erosion [11]. Coastal wetlands are especially at risk as relative sea level rise, caused by global sea level rise and land subsidence, begins to exceed accretion rates [12].
Improving our capability to observe global sediment flux in rivers can further our understanding of how sediment movement is changing due to anthropogenic activities. Traditionally, in situ methods have been used to measure TSS concentrations. However, these methods are often time-consuming and costly [13]. Furthermore, they are impractical for large-scale analysis, and long-term TSS measurements do not exist at the global scale. Satellite remote sensing offers an efficient alternative for estimating TSS over large scales [14], [15]. Sensors and satellites such as MODIS, Sentinel-2, and the Landsat series have all been used effectively to gather whole-river-network TSS measurements [16], [17], [18]. However, an inland water-focused sensor (for example, the sensor proposed by the CEOS Feasibility Study) may provide data more suited to estimating inland water quality [19].
A variety of algorithms have been implemented in an attempt to best observe TSS using current remote sensing technology [20], [21], [22]. Many TSS algorithms apply simple empirical approaches, which can be highly accurate over small geographic regions but are limited in location-to-location transferability due to inherent optical properties such as the mineralogy, size, and color of the sediment [23]. Although much work has been dedicated to creating a spatially transferable and robust algorithm, no true "global TSS algorithm" has been derived [24]. In addition to more accurate TSS estimates, a more optimized sensor may allow for the creation of algorithms that are transferable from one location to another.
Imaging spectroscopy may provide the opportunity for innovative techniques in water quality retrieval. While frequently used sensors such as Landsat-based sensors and VIIRS provide a few multispectral bands in the VNIR, hyperspectral data contain hundreds of adjacent spectral bands. This allows for more precise identification of materials with different characteristics [25], as well as the development of a more generally applicable model that takes variations in water and sediment properties into account. For example, Jensen et al. [26] effectively use hyperspectral data to create a more spatially transferable TSS model by calculating derivative spectra to identify absorption features and emphasize important wavelength regions. Many studies have argued that a hyperspectral satellite mission is needed to effectively study inland water ecosystems and water quality [27], [28], [29]. However, relatively little work has been published using hyperspectral data to analyze TSS, and further work is required to understand the benefits of using hyperspectral data for these applications.
Spectral resolution, spatial resolution, signal-to-noise ratio (SNR), and temporal resolution (revisit period) are four key observational performance metrics of an optical remote sensing satellite. While recent literature points toward the importance of spectral resolution in water quality estimation [27], [30], [31], the observational requirements needed to study inland water quality remain unquantified. Observing inland water quality may impose unique resolution requirements, such as the combination of high spectral resolution with high spatial resolution, as many inland water bodies have small surface areas. One may hypothesize that an ideal mission would have high spatial, spectral, and temporal resolution, as well as high SNR, but a mission of this type may not be feasible due to sensor cost and design limitations. Thus, this study considers the tradeoffs between varying observational configurations (i.e., different spatial and spectral resolutions and SNRs) and quantifies the timing of hydrodynamic processes to assist scientists and engineers in space mission design.
Many sensors currently exist that provide usable data for water quality estimation (Fig. 1). However, a water quality-optimized sensor could greatly improve the science being conducted in the field of inland water quality [19], [27], [32]. Furthermore, some resolution gaps exist in current and planned missions (Fig. 1) that may be filled by an inland water quality-focused mission. While sensors such as VIIRS may provide fast revisit and good spectral resolution, they lack high spatial resolution. The Landsat series, on the other hand, has good spatial resolution but a long revisit. Some planned missions, such as Surface Biology and Geology (SBG), Landsat 10, and Sentinel-2 NG [33], which are expected to have high spectral resolutions, may begin to fill this observation gap, and this study may assist in defining the necessary requirements for these missions. Furthermore, this study may be used to understand the expected performance of these planned missions.
Understanding the optimal mission specifications for observing inland water quality, especially TSS, will assist in developing and proposing future optical satellite missions. While higher quality, more frequent data may always be the most desirable, we aim to understand the relative importance of varying observation configurations and whether some configuration requirements are more imperative for TSS retrieval accuracy than others.

II. METHODS
To understand the optimal sensor characteristics needed to observe TSS dynamics, this study combines multiple sources of data, including airborne hyperspectral imagery, a global dataset of river size, and in situ gauge data. To quantify the retrieval performance of the various configurations, we degrade assumed ideal airborne data to understand the ideal instrument requirements, which we refer to as "data quality" [34], necessary for estimating TSS retrieval performance using an optical sensor. We also study the mission requirements, or "data quantity" [34], in terms of temporal and spatial observational requirements. Finally, we create a science return function, which helps users understand the tradeoffs between different observational requirements and their corresponding scientific value. All code for the methods can be found in the associated GitHub repository.

A. Data Quality Analysis
Many TSS retrieval algorithms use simple, single-band models applied to multispectral data, which are easy to implement and can be highly effective in location-specific studies [16], [35]. However, these simple models are not able to leverage the hundreds of hyperspectral bands. Partial least-squares regression (PLSR) models have been shown to be an effective alternative when handling hyperspectral data [36] because they reduce the number of predictor variables used to perform the regression. Numerous hyperspectral studies have demonstrated the reliability and accuracy of PLSR models, which remain relatively simple to implement [26], [37], [38]. PLSR models avoid issues encountered by linear regression, such as collinearity and small sample size [39]. While PLSR has been applied to hyperspectral data for over a decade, it has only recently begun to be used in water quality studies, where it shows promising transferability and retrieval accuracy [26].
We use high-resolution hyperspectral AVIRIS-NG airborne sensor (∼5 m, ∼5 nm) surface reflectance data from the ongoing Jet Propulsion Laboratory Delta-X mission [40], which aims to understand changes occurring in the Mississippi River Delta through a combination of in situ and remotely sensed measurements. In addition to the AVIRIS-NG imagery (Fig. 2), the mission collected paired in situ TSS measurements [41] over the Atchafalaya Basin and the Terrebonne Basin, which allow for TSS algorithm calibration. To the best of our knowledge, the Delta-X mission provides the highest quality publicly available dataset of paired optical remote sensing and in situ samples.
We develop a PLSR model using these paired in situ TSS measurements (n = 36, concentrations range from 21.75 to 154.5 mg/L) and the AVIRIS-NG hyperspectral data. We limit our in situ measurements to those that were collected within 24 h of the AVIRIS-NG data at that location and use the VNIR portion of the spectrum (450-1000 nm), which has been shown to be the most effective region for estimating TSS [15]. To determine the optimal number of PLSR model components, which summarize the original predictor variables, we use the lowest value of the predicted residual sum of squares (PRESS) statistic, a form of cross-validation that assesses a model's predictive capability [42]. This also helps avoid overfitting when applying the PLSR model.
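The component selection described above can be sketched as follows. This is a minimal illustration, not the code from the associated repository: it assumes a single-response PLS fitted with the NIPALS algorithm and a leave-one-out form of PRESS, and the function names (`pls1_fit`, `press`, `best_n_components`) are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (NIPALS). Returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centered copies, deflated below
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                          # weight vector
        norm = np.linalg.norm(w)
        if norm < 1e-12:                       # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                             # scores
        tt = t @ t
        p = Xr.T @ t / tt                      # X loadings
        c = yr @ t / tt                        # y loading
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression vector in X space
    return x_mean, y_mean, coef

def press(X, y, n_components):
    """Leave-one-out predicted residual sum of squares."""
    total = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        x_mean, y_mean, coef = pls1_fit(X[keep], y[keep], n_components)
        pred = (X[i] - x_mean) @ coef + y_mean
        total += (y[i] - pred) ** 2
    return total

def best_n_components(X, y, max_components):
    """Return the component count with the lowest PRESS, plus all scores."""
    scores = [press(X, y, k) for k in range(1, max_components + 1)]
    return int(np.argmin(scores)) + 1, scores
```

Here `X` would hold one reflectance spectrum per row and `y` the paired TSS concentrations; with only 36 samples, leave-one-out is an inexpensive choice, though other cross-validation splits could equally be used.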
To analyze the observational requirements for a TSS-focused sensor mission, we first analyze data quality to understand the observation configuration requirements that return the highest TSS retrieval accuracy. We conduct a data degradation experiment, in which we resample the AVIRIS-NG data to lower spatial and spectral resolutions and SNRs, and observe how the TSS retrieval accuracy changes. For the spatial analysis, we simulate increased ground sampling distance by applying a low-pass Gaussian spatial filter to the original AVIRIS-NG data while masking out the surrounding land cover using the normalized difference water index (NDWI) [43]. We then spatially resample the data to larger pixel sizes. We also resample the spectral bands to lower resolutions using a Gaussian kernel regression smoother technique, the Nadaraya-Watson kernel regression estimate, whereby we reduce the number of bands and widen the full width at half maximum bandwidths in the VNIR [44]. To simulate varying SNRs, we add Gaussian random noise at varying intervals from 50 to 1000 SNR [45] to the AVIRIS-NG data. We perform a Monte Carlo simulation with 1000 runs of each simulated SNR level. To understand the interaction effects between the spatial resolution, spectral resolution, and SNR, we apply the PLSR model to each observation configuration combination of these three parameters. We use the PRESS statistic to determine the correct number of components in each model, as the addition of noise from the simulated SNR may lead to overfitting.
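The spectral and SNR degradation steps can be illustrated with a short sketch. It assumes a Gaussian kernel whose width is set by the target full width at half maximum, and it assumes that noise standard deviation equals signal divided by SNR; both conventions, and the function names, are illustrative assumptions rather than the study's exact implementation.

```python
import math
import random

def fwhm_to_sigma(fwhm_nm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def resample_spectrum(wavelengths, reflectance, centers, fwhm_nm):
    """Nadaraya-Watson estimate of reflectance at each new band center."""
    sigma = fwhm_to_sigma(fwhm_nm)
    out = []
    for c in centers:
        # Gaussian kernel weights of every original band relative to center c
        weights = [math.exp(-0.5 * ((wl - c) / sigma) ** 2) for wl in wavelengths]
        total = sum(weights)
        out.append(sum(w * r for w, r in zip(weights, reflectance)) / total)
    return out

def add_noise(spectrum, snr, rng):
    """Add zero-mean Gaussian noise with sigma = signal / SNR (assumed model)."""
    return [r + rng.gauss(0.0, r / snr) for r in spectrum]

# Example: degrade a 5-nm VNIR spectrum to 50-nm bands, then simulate SNR = 100.
wavelengths = list(range(450, 1001, 5))
reflectance = [0.05 + 0.0001 * (wl - 450) for wl in wavelengths]  # made-up values
coarse = resample_spectrum(wavelengths, reflectance, list(range(475, 1000, 50)), 50.0)
noisy = add_noise(coarse, 100.0, random.Random(0))
```

Repeating `add_noise` with different seeds gives the individual draws of a Monte Carlo run at a given SNR level.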

B. Data Quantity Analysis
Another key component of understanding the observational tradespace required for TSS measurement is quantifying the amount of data necessary to best capture TSS, in terms of observable spatial extent (how many rivers can we observe?) and temporal resolution (how frequent does our revisit period need to be to capture unique TSS events?). To evaluate the geographical coverage requirements, we use the Global River Widths from Landsat (GRWL) database [46]. The GRWL database contains river geometry information for rivers of 30 m width and larger along over 3.3 million km of rivers worldwide. We calculate river surface areas for rivers of width 30-1000 m to quantify how the observable river surface area changes with varying spatial resolutions. We fit a function to the statistical distribution of river area using ordinary least-squares regression and extrapolate the abundance of narrower rivers.
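A minimal sketch of this coverage calculation is given below. It assumes that a reach counts as observable when its width is at least the pixel size, and it assumes a log-linear form for the least-squares fit; the source does not state either choice explicitly here, and the reach values are invented for illustration.

```python
import math

def observable_area(reaches, threshold_m):
    """Sum width * length (m^2) over reaches at least as wide as the threshold."""
    return sum(width * length for width, length in reaches if width >= threshold_m)

def fit_log_linear(xs, ys):
    """Ordinary least squares for ln(y) = intercept + slope * x."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (v - mean_l) for x, v in zip(xs, logs))
    slope = sxy / sxx
    return mean_l - slope * mean_x, slope

# Hypothetical (width_m, length_m) reaches standing in for GRWL records.
reaches = [(30, 9000), (60, 7000), (120, 5000), (300, 3000), (900, 1000)]
thresholds = [30, 60, 120, 300, 900]
areas = [observable_area(reaches, t) for t in thresholds]
intercept, slope = fit_log_linear(thresholds, areas)

# Extrapolate the fitted curve to a resolution finer than the 30-m data limit.
area_at_5m = math.exp(intercept + slope * 5)
```

The extrapolated value is only as reliable as the assumed functional form, which is why the fitted relationship should be checked against the observed 30-1000 m range before use.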
To further characterize observational requirements in terms of data quantity, we evaluate the necessary revisit period. To carry out this temporal analysis, we use USGS gauge data with continuous 15-min turbidity measurements. We use turbidity as a proxy for TSS because TSS and turbidity significantly covary [6], [47], and there are very few gauges with TSS measurements available. We use data from 674 gauges that are distributed across the U.S. and cover an assortment of river sizes, with orders of magnitude ranging from 1 to 1000 cm. We use a smoothing cubic spline function to interpolate any gaps in gauge records that are less than one day in length. For gaps longer than one day, we split the gauge data into two separate records. After these steps, we obtain 720 records for our wavelet analysis.
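The record-splitting rule can be sketched as follows. The function name and the minute-based time stamps are assumptions for illustration; the spline interpolation of shorter gaps is a separate step and is not shown.

```python
def split_records(times_min, values, max_gap_min=1440):
    """Split a gauge series into separate records wherever a gap exceeds one day.

    times_min: sorted sample times in minutes; values: matching measurements.
    Returns a list of records, each a list of (time, value) pairs.
    """
    if not times_min:
        return []
    records = []
    current = [(times_min[0], values[0])]
    for prev_t, t, v in zip(times_min, times_min[1:], values[1:]):
        if t - prev_t > max_gap_min:   # gap longer than one day: start a new record
            records.append(current)
            current = []
        current.append((t, v))
    records.append(current)
    return records
```

Splitting in this way explains how 674 gauges can yield a larger number of records: each long gap turns one gauge series into two.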
We quantify an appropriate revisit period using wavelet analysis to identify the timescale across which TSS events most commonly occur. Wavelet analysis is a form of signal processing similar to Fourier transforms, but instead of only accounting for global signals across an entire time series, wavelet analysis considers local signals [48]. Wavelets are well-suited for analyzing hydrologic time series because these series contain local events such as floods that do not have a consistent repeat period [49]. After performing the wavelet analysis, we fit an exceedance probability function over the wavelet power spectrum in order to quantify the science value of different revisit periods.
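As an illustration of how a wavelet power spectrum identifies a dominant timescale, the sketch below computes time-averaged Morlet wavelet power at a set of periods by direct convolution. The choice of the Morlet wavelet, the nondimensional frequency of 6, and the synthetic five-day test signal are assumptions for the example, not details taken from the study.

```python
import numpy as np

def global_wavelet_power(x, dt, periods, omega0=6.0):
    """Time-averaged Morlet wavelet power of series x at the given periods.

    dt and periods share the same time unit (for example, days).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    t = (np.arange(n) - n // 2) * dt
    # Converts an equivalent Fourier period to a Morlet scale.
    fourier_factor = 4.0 * np.pi / (omega0 + np.sqrt(2.0 + omega0 ** 2))
    power = []
    for period in periods:
        s = period / fourier_factor
        psi = np.pi ** -0.25 * np.exp(1j * omega0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        # Convolving with the reversed conjugate wavelet gives the transform.
        coeffs = np.convolve(x, np.conj(psi)[::-1], mode="same") * np.sqrt(dt / s)
        power.append(np.mean(np.abs(coeffs) ** 2))
    return np.array(power)

# Synthetic series: 200 days sampled every 6 h with a 5-day oscillation.
dt = 0.25
time = np.arange(800) * dt
series = np.sin(2.0 * np.pi * time / 5.0)
periods = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 15.0, 20.0])
power = global_wavelet_power(series, dt, periods)
dominant_period = periods[np.argmax(power)]
```

For real gauge records, a dedicated wavelet library with a cone-of-influence treatment would be preferable, since the direct convolution here does not correct for edge effects.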

C. Science Return Function
To understand the tradeoffs between different observational requirements and their corresponding scientific value, we create a science return function. This is an emerging method in science mission formulation and architecture studies [50], [51], where trade studies are often guided by simple figures of merit (e.g., coverage metrics). To create the science return function, we consider the interactions between our data quality parameters and employ an additive weighted-sum approach to incorporate the data quantity analysis into a single master equation. We create a three-part function that relates the data quality (spatial, spectral, and SNR) and data quantity (revisit time and geographical coverage) to the scientific value of the user's desired input observation configurations, which is represented by a scalar in [0, 1], with 1 representing the highest scientific value and 0 representing no scientific value. We establish these mathematical functions by fitting equations to our raw data and selecting the equations with the lowest error metrics. For data quality, we use a least-squares multiple regression fit. For data quantity, we normalize the fitted curve for the spatial resolution and create an exceedance probability function for the temporal resolution.

III. RESULTS

A. Data Quality Analysis
The data quality degradation study consisted of degrading the spatial resolution, spectral resolution, and SNR of the high-resolution AVIRIS-NG data. We degrade the spatial resolution in steps from the original 5-m resolution to a coarse 100-m resolution. For the spectral resolution, the values range from the original 5 nm to 100 nm in the VNIR. For the SNR, we simulate values ranging from 50 to 1000. On the raw AVIRIS-NG data, which represents the highest quality data possible, the PLSR model obtains an accuracy of R² = 0.94, RMSE = 9.89 mg/L, relative error = 0.17, and nRMSE = 0.36 (Fig. 3). As expected, the accuracy of the TSS retrieval with the PLSR model decreases with lower resolution data. However, retrieval accuracy does not decrease equally as spatial resolution, spectral resolution, and SNR are reduced; it is more sensitive to some parameters than others (Fig. 3). Decreasing spatial resolution only very slightly worsens TSS retrieval accuracy, while the spectral degradation analysis reveals a much stronger correlation between increasing bandwidth and worsening retrieval accuracy. Narrower bandwidths show consistently high retrieval accuracy regardless of the spatial resolution and SNR, indicating that spectral resolution is a key data quality component in TSS estimation. Decreasing data SNR reduces retrieval accuracy, but the difference in performance between the high and low SNR values that we test is relatively small. We also find that above an SNR of about 500, there are diminishing returns (Fig. 4).
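For readers who wish to reproduce accuracy metrics of this kind, a minimal sketch of two of them follows. It assumes the coefficient-of-determination form of R² (one minus the ratio of residual to total sum of squares); the normalization used for nRMSE and the definition of relative error are not specified here, so they are omitted.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations (e.g., mg/L)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Both functions take paired lists of in situ and retrieved TSS concentrations.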

B. Data Quantity Analysis
Our data quantity analysis examines how much data we need to most accurately capture TSS temporal dynamics and spatial requirements in inland waters. We calculate river surface area using the GRWL database and create observational thresholds for the surface area from 5 to 1000 m. We then analyze the relationship between the observational thresholds, which represent the pixel resolution, and the amount of river area that can be observed. We find that spatial resolution is exponentially related to observable river area (Fig. 5). In other words, finer spatial resolution imagery is able to resolve an exponentially increasing quantity of river surface area. For example, a 5-m resolution captures over 30 times as much river surface area as a 500-m resolution, indicating the importance of fine spatial resolutions for observing the greatest quantity of river area.
We also conduct a temporal analysis to quantify the revisit period tradespace for a TSS-focused satellite mission using wavelets. The wavelet analysis shows a peak in wavelet power, which indicates the strongest signal from the data, at a period of approximately five days [Fig. 6(a)]. This peak indicates that for a mission that aims to examine rivers of all sizes, a revisit period of approximately five days or less best captures unique events and overall TSS variability. The corresponding exceedance probability shows that the science value return for time periods greater than five days begins to significantly lessen, while time periods shorter than five days are even more useful [Fig. 6(b)].

C. Science Return Function
The science return function incorporates the observational requirements into a simple set of equations that return a scalar in [0, 1], demonstrating the scientific value of the user's desired instrument and mission observational capabilities. A value of 1 represents the highest scientific value and 0 represents no scientific value. The main function (1) combines the data quality H (spatial resolution, spectral resolution, and SNR) with the observable river extent I and the temporal sampling J, with β, γ, and δ being the user's desired weightings (the sum of which must equal 1) based on their mission's focus or a given application. Data quality performance is represented by H. Data quantity depends on the observable river extent and is represented by I, with the constants being a_I = 1.97 × 10^−6, b_I = −7 × 10^−3, and c_I = 13.14. Data quantity also depends on temporal sampling and is represented by J, where w is the revisit period (days), with the constants a_J = −1.42 × 10^−6, b_J = 3.08 × 10^−4, c_J = −2.81 × 10^−2, and d_J = 1.03. The results of (2)-(4) are put into (1) to obtain the science value. In this study, we developed this equation within the range of specified values shown in Figs. 3 and 6. The temporal values in Fig. 6 range from 15 min to 83 days. We do not recommend using this equation for values outside of these specified ranges. We provide a use-case example of our science return function by examining how Landsat-8 OLI and the proposed SBG mission perform. The results may be seen in Table I, and further relevant missions may be seen in Supplementary Table I.
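The additive weighted-sum structure of the main function can be sketched as below. The sketch takes the three component values H, I, and J as already computed inputs, because the fitted component equations are not reproduced here; the function name and the example numbers are hypothetical.

```python
import math

def science_return(h, i, j, beta, gamma, delta):
    """Weighted sum of data quality (h), river extent (i), and temporal sampling (j).

    beta, gamma, and delta are the user's weightings and must sum to 1.
    """
    if not math.isclose(beta + gamma + delta, 1.0, abs_tol=1e-9):
        raise ValueError("weights beta, gamma, and delta must sum to 1")
    return beta * h + gamma * i + delta * j

# Hypothetical component scores for a mission that prioritizes data quality.
value = science_return(h=0.8, i=0.5, j=0.9, beta=0.5, gamma=0.25, delta=0.25)
```

Because the weights are user-defined, two missions can rank differently under different weightings, which is the intended way to express application-specific priorities.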

IV. DISCUSSION
We find that, as hypothesized, decreasing the quality of remote sensing data leads to decreased TSS retrieval accuracy. However, the relationship between the examined instrument characteristics and the retrieval accuracy is not uniform. Most notably, we find that the spectral resolution is the most important data quality component and provides the largest difference in TSS retrieval accuracy. Finer resolution spectral data (e.g., 1 nm) may provide even greater TSS estimation accuracy, but further work is required to test this hyperresolution range. This finding is consistent with the CEOS Aquatic Ecosystem Feasibility Study [19] as well as the IOCCG Earth Observations in Support of Global Water Quality Monitoring report [29], which both find that high spectral resolution is a key variable for studying inland water. Hyperspectral missions focused on inland water quality have been continuously proposed over the past decade, and the findings of this article further support a mission of this type [27], [28], [29]. The SNR analysis shows an increase in retrieval accuracy as SNR is improved, but this increase is relatively small. One notable trend we find is that above SNR ≈ 500, the rate of increase in retrieval accuracy begins to diminish. Schott et al. [52] demonstrate similar results when analyzing varying SNRs in Landsat imagery and suggest that, beyond a certain point, further increases in SNR yield diminishing returns.
While we find that improved spatial resolution leads to improved accuracy, it does not play as significant a role as spectral resolution. We note that our study masks out land pixels prior to data degradation, which avoids mixed pixel effects that would degrade the results and would further constrain the amount of observable rivers. We do find that spatial resolution is very important in terms of data quantity, as can be seen by the exponential loss of river area in (2) (Fig. 5). The importance of this finding is application dependent, and may be weighted according to the user's preferences in the science return function. While some studies may have significant interest in studying small rivers and the maximum river surface area, others may be exclusively interested in large rivers. We also find that to best observe individual TSS events such as floods and high flow events, a revisit period of approximately five days or less is key. With a five-day revisit period, most events will be captured at least once throughout the lifetime of the event. If a user needs to observe five-day events at multiple stages throughout their progression, a 2.5-day revisit or faster is advised. It is important to note that this analysis does not take into account cloud cover, which is often correlated with flood events and must be considered when selecting a revisit period.
While this study focuses on TSS, the methods used here may be applied to other water quality parameters that can be estimated using VNIR remote sensing imagery, such as chlorophyll-a and colored dissolved organic matter (CDOM). Our findings are relevant to the requirements for observing TSS in rivers, but different water quality parameters may have different observation configuration requirements. To design a general inland water quality-focused optical mission, other relevant water quality parameters must be similarly analyzed in order to understand the optimal configurations. Regardless, the findings of this study may be used in future mission designs and satellite missions.
There are a few potential sources of error and assumptions made in this study. To the best of our knowledge, the AVIRIS-NG data is the highest quality paired optical remote sensing and in situ sample dataset that is publicly available. However, the sample size of the collected data is relatively small, especially as we limit the data to optical/in situ pairings collected within 24 h. For the data quality analysis, a larger sample size may have provided a more representative relationship between the resolution and retrieval accuracy. Also, the range of TSS values found in the Mississippi River Delta is not representative of the range of global TSS concentrations, nor is it representative of the global range of sediment color and mineralogy variations. Repeating this study in a variety of locations and over a wider range of conditions would improve accuracy and help resolve uncertainty; however, TSS measurements paired with same-day, high-resolution hyperspectral datasets are exceedingly rare. Thus, one of the major sources of uncertainty in this study is the relatively small, site-specific sample size. Still, the dataset is the best publicly available paired TSS data of its type. Furthermore, the gauge records were taken only from gauges in the United States and are clustered around certain areas, which may have biased the results of the temporal analysis. Assumptions made in the GRWL dataset apply to our spatial observational analysis, specifically uncertainty regarding seasonal variations in river surface area and the extrapolation of surface area to small streams [46]. The science return function also uses a simple weighted sum approach, which does not capture the interaction effects between the data quality and quantity parameters.

V. CONCLUSION
As humans continue to contribute to climate change and alter the sediment cycle and overall water quality in our inland water bodies, understanding the extent of these changes is crucial to beginning to mitigate them. This study characterized the observational tradespace necessary to best observe TSS in rivers. We analyzed both data quality (spatial, spectral, and SNR) and data quantity (observable spatial area and temporal). Spectral resolution plays an important role in water quality estimation in terms of data quality. Spatial resolution is important for determining data quantity, as the amount of observable river area scales exponentially with spatial resolution. For temporal sampling, a revisit period of five days or less best captures individual TSS events in rivers. These observational characteristics are combined into a science return function to evaluate a mission's scientific value in terms of TSS observation. While having the highest quality resolution of all components is desirable for TSS estimation, a mission of this type is not necessarily possible because of sensor cost and design limitations. Future work will apply the methods used here to other water quality parameters (e.g., chlorophyll-a and CDOM) and in diverse environments (e.g., lakes, wetlands) to better understand the requirements for a general inland water quality-focused mission.

Fig. 1. Spatial-spectral and temporal-SNR configurations of existing and planned satellite sensors identified by the authors for possible measurement of TSS (data retrieved from https://eoportal.org and https://space.oscar.wmo.int). Spectral resolution is the average VNIR bandwidth, and SNR is also averaged over the VNIR.

Fig. 2. Locations of cotemporal in situ TSS and AVIRIS-NG composite mosaics of the Mississippi River Delta from the ongoing Delta-X mission in (a) Fall 2016 and (b) Spring 2021. Yellow dots indicate cotemporal in situ TSS and reflectance measurements.

Fig. 3. Data quality analysis results. (Top left) PLSR model accuracy on the original AVIRIS-NG data. Dashed blue line is the 1:1 line. Remaining panels demonstrate the PLSR retrieval accuracy of the interactions between spatial resolution, spectral resolution, and SNR, with the third dimension at the highest resolution.

Fig. 5. Finer spatial resolution allows for an exponential increase in the quantity of observable global river area.

TABLE I
INPUTS AND SCIENCE VALUES OF LANDSAT-8 OLI AND SBG. FURTHER SENSORS MAY BE FOUND IN THE SUPPLEMENTARY MATERIALS