Learning-Based Tracking of Crop Biophysical Variables and Key Dates Estimation From Fusion of SAR and Optical Data

Monitoring crop development is of crucial importance to ensure sustainable management practices while promoting efficient land use. The ability of satellite remote sensing data to cover large areas offers a robust tool to aid this task. In this article, we propose a filtering framework, which uses Gaussian-process-based dynamic and observation models, an unscented Kalman filter, and the fusion of multitemporal SENTINEL-1 and SENTINEL-2 data to monitor crop biophysical variables. This method complements state-of-the-art filtering frameworks given its ability to learn models and uncertainties from data and to exploit the imagery temporal dimension. This enables the method to be transferable to other crop types, biophysical variables, and locations. We test the methodology to track asparagus below-ground carbohydrates and the season crop age and to forecast crop key dates. The amount of carbohydrates stored below ground in the plant's root system is highly associated with the yield of asparagus and the ability to establish a healthy canopy. Validation with ground truth showed that the use of more than one SENTINEL-1 orbit and SENTINEL-2 data provided the best tracking performances and a reliable way for handling missing data from a sensor. Under this configuration, the method achieves a mean absolute error (MAE) of 1.802 Brix degrees (surrogate for carbohydrates). Similarly, it can retrieve crop age and forecast harvest date, with the MAE of six days. Remotely tracking below-ground carbohydrates may contribute toward reducing the destructive sampling required for its measurement in the field.


I. INTRODUCTION
I N LINE with population growth, demand for agricultural commodities is also increasing. Technological tools to ensure sustainable management practices are key to reduce the Manuscript  negative impact on people and the environment [1]. Thanks to its large area coverage and the growing availability of free data, satellite remote sensing has been increasingly used in recent decades for agricultural information extraction. These insights are then used to inform decision-making by different stakeholders in the agricultural supply chain [2], [3], [4]. Agricultural fields monitoring with satellite-based synthetic aperture radar (SAR) has gained importance in recent years (see [4]) due to the ability of these sensors to acquire images under moderately adverse weather conditions, including the presence of clouds and the independence of sunlight. Consequently, it is possible to achieve a systematic acquisition system able to avoid data gaps and create consistent time series.
Regarding monitoring crop development with SAR, several methodologies have been reported in the literature. These include methods in which the SAR response to a given crop stage is manually analyzed and insights are extracted based on expert knowledge [5], [6]. More recent methods use machine learning algorithms trained on ground truth and SAR data to create a mapping function between the SAR images and the crop stages [7], [8]. Wishart-distance-based crop stage classifiers [9] have also been proposed with the aim of exploiting the full polarimetric information for phenology classification. However, these methods are not designed to include data from other sensors (i.e., multispectral satellites) and could improve the exploitation of the data's temporal dimension. Similarly, these methods lack the ability to model the crop dynamics in order to hind-cast and forecast crop development.
To overcome these limitations, other studies proposed to retrieve crop biophysical variables by tracking the state of a dynamic system (i.e., the crop) and used well-established tracking algorithms such as Kalman filter and particle filter (PF). An example of this methodology is presented in [10], where Vicente-Guijalba et al. report to successfully perform the near-real-time tracking of cereal crops phenology using an extended Kalman filter (EKF). An improvement of this method is proposed in [11], where a comparison between an EKF and a PF is performed. The authors conclude that the PF provides more flexibility to model the crop dynamics and allows the method to achieve better accuracy. In this study, the methodology is also used to forecast crop key dates, specifically, the date when a crop will be ready for harvest. In [12], sensor fusion is performed by using TerraSAR-X backscatter intensity in combination with a This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ Landsat-derived vegetation index [normalized difference vegetation index (NDVI)] to estimate rice phenology. The authors report that using backscatter intensity to complement NDVI in periods when no optical images are available increases the accuracy achieved by the algorithm. More recently, a PF has been utilized to track canola phenology [13]. The algorithm fuses multifrequency SAR data from the RADARSAT-2 and TerraSAR-X satellites and temperature. In this case, the attention is focused on the potential of the method for predicting crop flowering, since based on the humidity at this stage, farmers can make informed decisions regarding their management practices.
These applications demonstrate the potential for crop monitoring with dynamic filtering. In addition to achieving high accuracy and enabling sensor fusion, the methodologies are also suitable for near-real-time monitoring purposes given the ability of dynamic filtering for real-time applications [14], [15]. However, the dynamic models (i.e., the function that models the crop dynamics over time) and the observation models (i.e., the function that relates crop biophysical variables to satellite observations) used within the filtering strategy so far can still benefit from improvements.
Indeed, in [12], the dynamic and observation models are parametric models based on double logistic functions, which despite offering adequate results for rice fields in a specific location are not likely to be transferable to other biophysical variables, crop types, or locations. While, in [13], the accuracy for estimating flowering is high, the approach for modeling crop dynamics is specific for their local conditions preventing the method to be transferable. In this case, to model the crop dynamics (i.e., dynamic model), it is assumed that phenology can be estimated from the accumulated temperature in the year or in the agricultural season, measured in growing degree days (GDDs) [16]. Note that, under this assumption, the transferability of the method to other crop biophysical variables is limited. This is because the function that relates temperature and a given crop biophysical variable is not always known. A different approach, which proposes to use historical data to model dynamic and observation models, is used in [10] and [11]. The models were fitted using splines [17]. While the authors report achieving adequate fitting, the uncertainties associated with model predictions are learned from a totally different modeling step. Considering the key role of these uncertainties in a dynamic filtering, an approach to model them accurately is crucial. Recently, Mascolo et al. [18] proposed a grid-based filter (GBF) to monitor rice phenology. As mentioned by the authors, this approach represents the optimal and closed-form solution to the Bayesian estimation problem when the state variables are discrete and finite, such as the case of phenological scales. However, for the purpose of the present article, since the approximate carbohydrate content (i.e., Brix) is a continuous, and not a discrete, variable, the GBF solution is not a feasible solution. In [19], a spatiotemporal phenology estimation of paddy rice is introduced using a dynamic filtering strategy. The authors propose a thresholding algorithm for transplanting date retrieval and a hierarchical tree for a first approximation of a parcel phenology. This is then used as a prior in a PF setting for the estimation of the phenology's posterior distributions. Interestingly, the authors highlight the importance of knowing when the season started (transplanting date) for the successful deployment of the dynamic filtering approach. Note that, in this case, the transplanting date retrieval requires a separate algorithm to the Bayesian filtering. Consequently, obtaining this information while avoiding this extra step also represents an opportunity for improvement.
In this article, we propose a Gaussian-process-based unscented Kalman filtering (GP-UKF) framework that uses freely available multitemporal SENTINEL-1 data and learns nonlinear and not parametric dynamic and observations models of the filter using Gaussian processes (GPs) [20]. Learning these models and their predictive uncertainties simultaneously allows the GP-UKF to be transferable to other crop variables, locations, and crop types. The framework also performs active-passive satellite sensor fusion, incorporating multitemporal SENTINEL-2 data when available. The GP-based dynamic model is utilized to fill gaps due to missing remote sensing observations and to forecast the crop development. We present an application for tracking below-ground asparagus plant carbohydrates (highly related to the crop yield) and the seasonal crop age (also known as the days after season start). In addition, the forecasting capabilities of the GP-UKF framework are tested to determine when a field will be ready for harvest.
The rest of this article is organized as follows. Section II introduces the test site and ground and remote sensing datasets used in this study. Section III presents a brief introduction to the theoretical background required by the GP-UKF. Section IV introduces the setup used for the near-real-time implementation of the algorithms. Section V presents the results obtained tracking below-ground asparagus plants carbohydrate and seasonal crop age and the results of forecasting when a field will be ready for harvest. The strengths and limitations of the current study are discussed in Section VI. Finally, Section VII concludes this article. Fig. 1 shows a SENTINEL-1 and SENTINEL-2 image of the test site acquired on January 4, 2019. The asparagus fields located in the Peruvian north coast considered a total of 432 parcels for the analysis. The plants (Asparagus officinalis L.) used in the crop were grown in a nursery and later transplanted to the fields, which consist of dry sandy soil with a subtropical climate. The temperature peaks in February, reaching averages of up to 26 • C and falls in August with average temperatures of 15 • . This synchronizes with the highest and lowest solar radiation levels [21].

A. Test Site
Given the mild winters, the crop does not naturally reach senescence and is instead able to grow all year round. Given this, farmers can grow two agricultural seasons per year. The growth stage of individual parcels is not necessarily synchronized, and as a consequence, different parcels could be in different growth stages.

1) Ground Truth:
Given the key role that below-ground carbohydrates (CHO) play in the crop yield, periodic samples are taken in the field to monitor a parcel's CHO evolution over time. However, the samples collected do not correspond directly to carbohydrate levels. Although there exists an analytical method known as the anthrone method for measuring soluble carbohydrate in milligrams per gram of dry weight [22], it is too demanding and not commercially viable [23]. This led to the development of an approximation of root carbohydrate content derived from the refractive index (Brix%) of asparagus root sap measured with a refractometer [23], [24]. According to this method, the carbohydrate content can be estimated by a linear regression model that uses Brix% as input [23], [24], [25]. This process was carried out as a tool to monitor the crop development in the 432 fields analyzed in a period covering from June 2018 to December 2020.
2) SENTINEL-1 Images: The level-1 GRD SENTINEL-1 data utilized were downloaded from the Google Earth Engine platform [26]. Interferometric wide swath data are used, from which the backscattering coefficients at the VH and the VV channels are obtained. The preprocessing steps applied by GEE include orbit file update, GRD border noise removal, thermal noise removal, radiometric calibration, and terrain correction [26]. Three acquisition geometries are available in the study site, as shown in Table I, covering the period from June 2018 to December 2020.
3) SENTINEL-2 Images: Orthorectified and radio-corrected SENTINEL-2 level 1C imagery available in the GEE platform were used. A cloud masking preprocessing step was applied exploiting the SENTINEL-2 QA60 band, masking out opaque clouds and cirrus clouds. A period covering from June 2018 to December 2020 is selected to generate vegetation indices time series at 10-m resolution. This results in a total of 71 images used for the analysis. However, these images are concentrated in consecutive images for short periods of time (cloud-free months) and prolonged gaps in cloudy months, as will be mentioned in Section V-A2. Fig. 2 shows how the above-ground processes involving spear growth and fern establishment are connected with the below-ground carbohydrates stored in the plant root system. It shows the mean values of VH backscatter intensity, green normalized difference vegetation index (GNDVI) [27], [28], and Brix degrees (ground truth) time series. Six agricultural seasons for five typical parcels, which have the same management practices, are shown. The green and red vertical dashed lines mark the season beginning and end of the fern establishment, respectively. The time between a red and green vertical line corresponds to harvest. In asparagus, below-ground carbohydrates are produced through the canopy via photosynthesis and consumed when new stems grow from the root system. This cycle of production and consumption of carbohydrates is shown in the bottom plot of Fig. 2.

III. METHODOLOGY
Crop development monitoring can be seen as a particular case of monitoring a dynamical system, in which the crop development is considered a time-varying process observed through noisy and/or missing remote sensing observations [10], [29]. A set of D biophysical variables, such as phenology, leaf area index, and biomass (among others), is assumed to describe the crop condition at a given point in time. Specifically, the crop state can  be described by knowing the state of these variables, which are in turn known as state variables. Typically, and throughout this article, the state variables are represented by the D-dimensional random vector x k , where k refers to the time step at which the system is.
This approach enables the use of widely known tools for dynamical systems state estimation by means of the well-known Bayesian filtering algorithms [30]. The Kalman filter [31] is the most used in practice, for example, for systems with linear evolution in time and process states that are Gaussian distributed. For the concrete purpose of this article, a filtering algorithm aims at tracking the crop state variables by fusing observations from freely available active-passive multitemporal satellite data sources and data-driven models of the crop state variable dynamics. As shown in Fig. 3, the filter does this by updating or modifying the predicted distribution from the dynamic models P (x k |x k−1 ), by a factor derived from the deviations between the vector of remote sensing observations received at time step k (y k ) and these same observations predicted given the observation model predictions P (ŷ k |x k ).
The filter output P (x k |y 1:k ) represents the posterior probability density function of the states at time k, given all the observations until time k. This is achieved by assuming that the Markovian property for state estimation applies [30]. Note, however, that this is different from exploiting multitemporal data, as will be shown in Section III-C. The difference betweenx k and x k is thatx k represents prior predictions from the dynamic model, and x k is the filter output (posterior), this is, the predictions after being updated by the filter. The same rationale applies for the vector of remote sensing observations y k and the predictions of remote sensing observationsŷ k from the observation model. After a filter output, the previous state is updated to the current state (x k−1 = x k ), providing the filter the ability to perform recursive state estimation. In Fig. 3, the dashed lines indicate that these paths are optional and are executed only when the vector of remote sensing observations y k is available, this is, when a new image arrives. In this sense, if no remote sensing observations are available, the filter output corresponds directly to the dynamic model predictions. This enables the estimation of the crop state variables between remote sensing images (nowcasting and/or gap filling) and, through recursive estimation, forecasting of the crop state variables.
Apart from the filter itself, two key components of the filtering algorithms are the dynamic and the observation models [30]. The state variable evolution over time is modeled with a probabilistic dynamic model, represented by In (1), the system state at a given date k can be estimated as a function of the previous state x k−1 using the nonlinear function f (.) and considering the additive GP noise q k ∼ N (0, Q k ) accounting for the uncertainty when modeling the system dynamics with the process noise covariance matrix Q k . The nonlinear function f (.) describes the laws that govern the dynamic system evolution, which is a function deriving future states following from the current states. Also note that the dynamic model of (1) results in the conditional probability distribution P (x k |x k−1 ). The observation model is in charge of modeling a sensor response to a given crop state and is represented by the Edimensional vector of sensor observations y k . The E dimensions in this case correspond to the number of features used to observe the crop state with sensors, which corresponds in this article to backscatter intensities derived from SENTINEL-1 data and/or the vegetation indices from SENTINEL-2 data. The sensor response y k can be predicted as a function of the system state x k by means of the probabilistic observation model where g ith (.) is a nonlinear function mapping from system state to remote sensing observations measured by the sensor ith, and r k ∼ N (0, R k ) is the additive Gaussian measurement noise accounting for the uncertainty around the predicted observations. Note that the probabilistic observation model of (2) results in the conditional probability distribution P (y k |x k ) describing the likelihood of getting the observations y k given the state x k . The subscript ith stresses the fact that there should be as many nonlinear functions g ith (.) as sensors observing the crop development. This is to ensure the correct fusion of sensors.

A. Learning Dynamic and Observations Models
Typically, the dynamic and observations models are represented by parametric models (i.e., equations) derived from having an in-depth knowledge of the forces driving the evolution of system states and the response of each of the satellite sensor to the system states [30]. For crop monitoring, this is associated with knowledge regarding the dynamics of each state variable, for each individual crop type and under the specific conditions of each location. It requires identifying the adequate number of factors impacting the evolution of state variables over time and the complex relationship between them. Unfortunately, even if this process can be done for a test site, it is difficult and/or expensive to scale everywhere this may be needed. Given the large variety of variables (e.g., crop type, location, and wavelength in the case of SAR measurements), models are very specific and not transferable to other crop types or locations.
Instead of using parametric models that may experience the aforementioned issues, a Gaussian process regression (GPR) has been proposed in engineering applications with the socalled Gaussian process state space (GPSS) models to learn the transition and observation models from data [32], [33]. A key advantage of using GPR to learn these models, besides its nonparametric form and flexibility, is that they provide uncertainty with each predicted estimation. This is a vital requirement of the Bayesian filtering algorithms. In fact, it provides a statedependent uncertainty, for instance with increased uncertainty in those regions where the variability increases or in those regions where insufficient data are used to train the models. Refer to Appendix A for an introduction to GPRs.

B. Training of Dynamic Models
The dynamic model must predict the conditional distribution P (x k |x k−1 ) corresponding to the prediction of the current state given the previous state. In this article, we use GPs to obtain the prediction of this probability distribution. The output of the GPR is a Gaussian distribution of the form, The training of the dynamic model is achieved in two steps: First, we obtain the typical state variables behavior over time, and second, we learn to predict a state distribution at time step k given the state distribution at step k − 1.
In order to estimate a typical state variable behavior over time, we collect a training set of ground truth samples from several agricultural seasons, either several parcels or more than one season of the same parcel (i.e., for perennial crops). Then, we fit a nonlinear nonparametric GPR using this set as output and using the crop age or days after the season started (DaS) as input. Once the typical evolution is obtained, the key role of the dynamic model is to predict a state distribution at step k given the state distribution at time step k − 1. A second GPR model learns to predict the expected change to the current state in order to advance to a subsequent state following the typical dynamics previously obtained. To train this model, a dataset of the form corresponds to the expected value of the crop state variables at time step k − 1, and the output expression (x k − x k−1 ) is the expected change to advance to a subsequent state. The conditional probability distribution P (x k |x k−1 ) can be obtained as where GP µ corresponds to the mean and GP var corresponds to the variance obtained from a GP model using the dataset D dyn and (7) and (8) (see Appendix A). Note that using this procedure results in a "one-step ahead" prediction. However, by recursively using (3), multiple-step-ahead predictions can be obtained. This is a key property since it allows us to predict values of the state variables when no remote sensing observations are available (i.e., in a daily basis) and allows us to fill gaps, for instance, for SENTINEL-2 and/or to forecast the expected evolution of the state variables given the current states.

C. Training of Observation Models
As mentioned previously in this section, the observation model oversees modeling a sensor response to a given crop state (represented by the state variables). A GPR is trained to predict the conditional probability P (y k |x k ) ∼ N (y k /μ k , Σ k ) of the remote sensing observations given the crop state at time step k. The remote sensing observations predicted by this model are the same remote sensing observations used to monitor the crop. In previous works, the authors have reported the use of several features derived from backscatter and PolSAR decompositions of a single image (16 features in [11] and 12 features in [13] or, more generally, p PolSAR features). In this article, we propose to use the temporal dimension of the observations to compensate for the lack of Quad-PolSAR data. This way, a vector of remote sensing observations contains a sequence of the last N past available images instead of the p PolSAR features of a single image. Considering a sequence of observations is in fact desired since it provides the information of not only the crop state in a single date but also in the period covered. In this sense, the observation model predicts the sequence of the past N observations available in a predefined period (e.g., the previous N -VH backscatter or N -GNDVI measurements available in the past 100 days) given the current crop state. It is important for the observation models to be able to predict the observations and their uncertainty for any arbitrary day in the past (within the predefined period). This way, we do not constrain the model to predict exclusively every 6 or 12 days for the case of SENTINEL-1 or five days for SENTINEL-2. This is beneficial for optical SENTINEL-2 data, where there may be gaps in the observations acquired in cloudy days resulting in an irregular number of days between observations. To train this model, a dataset of the form k corresponds to the expected value of the crop state variables at time step k, DaS k−j is the number of days in the past when we wish to predict an observation, and the output y k−j are the predicted remote sensing observations acquired k − j days ago. Note that the model is used N times to form the sequence of the past N observations, for j that goes from zero to the N past observations. In this case, zero corresponds to the observation at time k and N the oldest of the N observations in the sequence. The output of an observation model can be obtained as for j = 0, . . ., N : Sensor fusion is achieved by training separate models for each sensor used (e.g., one for SENTINEL-1 and another for SENTINEL-2) and selecting the model to use for predicting the sequence of past N observations to be the same as the sensor providing the new observation that arrives at time k.

D. Unscented Kalman Filter (UKF) With GP Dynamic and Observation Models
The UKF is an extension to the Kalman filter to applications where the dynamic and observation models are nonlinear. This is the case in this article since these models are learned with the nonlinear and nonparametric GPRs.
The UKF first finds a Gaussian approximation to the non-Gaussian distributions resulting from propagating the states x k−1 and x k through the nonlinear models of (1) and (2), respectively. Then, the analytical solution of the Bayesian filtering equations [30] for the case of linear and Gaussian systems, i.e., the Kalman filter equations, can be applied to compute the approximated system state. In the UKF, the approximation to Gaussian distributions is made with the aid of the unscented transform (UT). The UT selects deterministically a set of points (known as sigma points) in the input distribution [x k−1 of (1) or x k in (2)], in order to characterize it [30]. These points are then propagated through the nonlinear models, i.e., a prediction is made for each sigma point using a GP-based model, and a weighted mean and covariance of the output or target distribution is computed from them.
Algorithm 1 in Appendix B presents a detailed description of the combination between the original UKF [34] and the GPbased dynamic and observation models.

A. State Variables
Two variables are utilized to characterize the crop over time, below-ground plant carbohydrates, and the season crop age in days or also known as the number of days after the season started. This results in the state vector x k = [Brix k , DaS k ] T 1) Below-Ground Plant Carbohydrates: Asparagus crop yield is directly linked with the amount of plant carbohydrates stored in a plant root system [35]. The carbohydrates are, in turn, linked to the amount of canopy available to intercept sunlight, the solar radiation in the site, and the efficiency of the plant transforming the latter into carbohydrates [35]. In this article, we consider tracking the Brix% quantity since this is the variable taken in the field and available as ground truth. Since this quantity is constantly monitored, there are sufficient samples available for training and testing, providing spatial and temporal diversity.

2) Crop Age or Days After the Season Started:
Several stakeholders of the agricultural supply chain are interested in knowing the current season starting date. These can include insurance companies and governments as a requirement prior to paying indemnities to farmers. Knowing DaS is also required to compute the GDDs for cases when phenology is estimated with a thermal calendar [16]. Similarly, it provides rough estimations of future key dates, such as flowering or harvest. Previous studies have considered the DaS as a key feature and design an independent algorithm for its estimation [11], [19], [36]. Here, we show that this variable can be included as a variable to be tracked with a filtering framework as new remote sensing observations are available. This also provides useful information for dynamic and observation models to be able to disentangle similar remote sensing observations at different times of the agricultural season.

C. Observation Model
An observation model predicts the conditional probability distribution P (y k |x k ) corresponding to a sensor's response given the crop state (represented by the state variables). As introduced in Section III-C, in this article, we propose to use a vector of observations conformed by a sequence of N observations available in the past 100 days. The period of 100 days is chosen since it covers more than half of an agricultural season. Then, the sequence of N observations is predicted using (4), with In order to achieve the fusion of active and passive satellite remote sensing data, separate models must be trained for SENTINEL-1 and for SENTINEL-2. Then, during near-realtime operation, a model is chosen to be compatible with the observations received, i.e., using the model of SENTINEL-1 when SENTINEL-1 observations arrive or the model of SENTINEL-2 when SENTINEL-2 observations arrive.
If a single feature for each sensor is used, for instance, a sequence of N -VH backscatter or N -GNDVI measurements, a single output GPR can be trained. If multiple features are considered, such as the VH, VH/VV, dual-Pol RVI, or others for the case of SENTINEL-1, a multiple output GPR [37], [38] can be trained instead. Note that as in the case of the dynamic model, using independent models for each feature results in uncorrelated noise covariances, or in other words, the matrix Σ k−i is a diagonal matrix. In this article, both the VH backscatter and the ratio VH/VV are considered, and individual models are trained to predict each feature. For the case of SENTINEL-2, the GNDVI and the Modified Chlorophyll Absorption in Reflective Index (MCARI) [39], [40], as well as training individual models to predict each feature, are used. Other vegetation indices were not included since linear correlation between them and the GNDVI and MCARI were identified. In the case that an SAR and optical images are acquired on the same date, the system will process them sequentially, following their acquisition time. Consequently, it will handle this case without any extra considerations.

D. Filter Initialization
A prior belief of the crop state variables is required to initialize the filter. An assumption based on the SENTINEL-1 VH backscatter is made to provide an initial estimate of the DaS and the Brix degrees. The DaS zero is determined as the nearest date in the previous 100 days of the monitoring starting date, in which the VH backscatter was lower than −23 dB. If all the values are higher than −23 dB in the previous 100 days, the crop is assumed to be in maturation and a DaS of 120 is adopted. The Brix degrees are then initialized with the typical starting value for DaS zero (median of Brix ground truth samples when DaS is zero). If a DaS of 120 is adopted, the typical Brix value for DaS 120 is used. Since the filter is initialized with a Gaussian probability distribution, a standard deviation of 20 days for the DaS and a standard deviation of 2.5 Brix degrees for the below-ground carbohydrates are used. Note that the described assumption is not necessarily required if large variances to the prior belief or initial probability distribution are adopted. In this case, after seeing new observations, the filter updates the state variables so that they are in agreement with the observations, thus converging to the actual states. However, the described assumption allows the filter to converge faster.

A. Unscented Kalman Filtering With GP-Based Dynamic and Observation Models
Several combinations between SENTINEL-1 acquisition geometries and SENTINEL-2 data were evaluated, as shown in Table II. The GP-UKF algorithm described in the Appendices is then used for filtering. Fig. 4 shows the result of running the GP-UKF filter without any remote sensing observations but only using the dynamic model to predict recursively and track the below-ground plant carbohydrates and the crop age.

1) Tracking Crop State Variables Only With the Dynamic Model:
As can be seen in the red dots, the filter is able to provide daily predictions that are similar to the ground truth samples (black crosses). These predictions follow the typical trends expected for the evolution of the state variables confirming that the dynamic model effectively captured their behavior. Note that when the season end is detected, the DaS is reset to zero, as well as the state initial uncertainty. Note, however, that the predictions do not adapt to the particular conditions of every season as no remote sensing observations are being considered. These conditions could include, among others, variations in management practices as well as variations in crop development due to environmental conditions. As an example, note that after September 2019, the dynamic model predictions are not synchronized with the  ground truth. This is due to an atypical season start and end being changed according to management practices. This, in turn, reduces the overall GP-UKF performance if observations are not available.

2) Tracking Crop State Variables With SENTINEL-2:
If SENTINEL-2 observations are considered, Fig. 5 shows that the GP-UKF adapts better to the particularities of each agricultural season. In this case, the filter uses the SENTINEL-2 observations when available and, based on the deviations between them and the predictions from the GP-based observation model for the SENTINEL-2 features, adjusts the predictions of the dynamic model. If observations are not available, for instance due to cloudy days, the filter is still able to provide a prediction filling these gaps. The uncertainty in the predictions decreases significantly when several observations are consecutively received, such as between February and May of 2020. On the contrary, when a gap in the SENTINEL-2 observations occurs, the filter fills the gaps and the uncertainty associated with these predictions increases as is the case between July and September of 2020. In fact, notice that a seasonal pattern can be observed in the uncertainty, where in the coldest months of the year when is also cloudier (second semester of each year), the uncertainty is larger than in the hotter months where less clouds are normally present. It is expected then that during cloudy months, the GP-UKF relies mostly on the dynamic model, whereas in the cloud-free months, it relies on the observations. Note also that the uncertainty also increases when recursively using the dynamic model to forecast, as shown after the last observation available in February 2021.

3) Tracking Crop State Variables With One Acquisition Geometry of SENTINEL-1:
Given the SAR capabilities of acquisition at day and night and almost all-weather conditions, long time series can be obtained. Fig. 6 shows the result of the GP-UKF using one of the SENTINEL-1 acquisition geometries available for the test site (orbit 91 in ascending pass). Visually inspecting Fig. 6, it can be seen that the predictions are close to the ground truth even though the observations can be noisy. If we focus on predictive accuracy, we can notice that SENTINEL-1 seems to be less accurate than SENTINEL-2 in cloud-free months. However, in the cloudy months, the more consistent time series of observations of SENTINEL-1 provides better predictions than SENTINEL-2. This suggests that the vegetation indices from a multispectral satellite may be more easily associated with the canopy development than the backscatter from SENTINEL-1, but this benefit is lost with the data gaps. Note that  the predictive uncertainty using a single orbit of SENTINEL-1 is relatively constant during the whole monitoring period.

4) Tracking Crop State Variables With All Acquisition Geometries of SENTINEL-1 and SENTINEL-2:
When all the acquisition geometries of SENTINEL-1 and the SENTINEL-2 data are used, as shown in Fig. 7, the results obtained are more precise, and the uncertainty when tracking the crop state variables is low. The accuracy in the season from January to June of 2020, for example, improves compared to the previously shown cases, confirming the added value of combining SENTINEL-2 and the three SENTINEL-1 orbits. On the other hand, the uncertainty that was high when filling the gaps if using SENTINEL-2 data only is now reduced, thanks to the SENTINEL-1 data being present. Another synergy achieved using active-passive sensor fusion can be seen at the end of October 2020. The estimation of the accumulated Brix degrees at the end of the season is better predicted than in any other case. In this case, as in the other seasons, the Brix degree level at the end of the season prior harvest is a key informative factor about the potential asparagus yield in that season.

B. Performance Evaluation
The ground dataset is split in two parts from October 2019 to December 2020 covering more than two agricultural seasons for each of the 116 analyzed parcels. Three metrics are used as indicators of performance: the root-mean-square error (RMSE), the Pearson correlation coefficient (r2), and the mean absolute error (MAE). The results of these metrics validating the performance of the GP-UKF applied to each of the analyzed cases of Table II are shown in Table III.  TABLE III  GP-UKF PERFORMANCE EVALUATION Using all the acquisition geometries of SENTINEL-1 together with the SENTINEL-2 data when available provides the best results. In this case, the GP-UKF is able to track the asparagus crop state variables with an MAE of 1.802 Brix degrees and six days. Note that while r2 for DaS is high achieving 0.97, it is not as high for Brix degrees achieving only 0.58. This difference may be related to the fact that the Brix samples are significantly more prone to error when measuring them, resulting in noisy training and testing data. In fact, in order to reduce this error in the ground truth, more samples are required at parcel level for the sampled population to correctly represent the Brix level. However, this process is expensive and resources consuming. The RMSE achieved in this case is 2.287 and eight days, similar to the error obtained with the MAE.
In general, based on the results of Table III, adding more satellite sensors improves the performance partly due to the improvement of the temporal resolution observing the crop and partly due to the complementary information of the activepassive data fusion. The errors retrieving Brix degrees obtained with the other cases are not substantially different, as shown in Table III. The last row (line) represents the performance evaluation if all the predictions corresponded to a line equal to the median of the training ground truth samples. This is only for testing that the filtering is indeed better than predicting always the median Brix and DaS values. The row Dyn_model represents the case when no remote sensing observations are considered but only the dynamic model, as presented in Section V-A1.
For the case of DaS as well as for Brix degree, the poorest results are obtained when using orbits 18 and 142 of SENTINEL-1, both individually and in combination with other orbits or with SENTINEL-2. This can be associated with the number of images available being considerably lower than for orbit 91, as can be seen in Table I. In fact, using SENTINEL-2 data only (S2) even with the associated data gaps has better performance than orbits 18 (S1_18) and 142 (S1_142), with an MAE of 2.083 compared to 2.508 and 2.531 for these two SENTINEL-1 orbits. However, orbit 91 (S1_91) individually presents better performance than SENTINEL-2 data with an MAE of 1.931 and six days. Note that this is practically the same performance achieved using all the acquisition geometries available (S1_all), which achieves an MAE of 1.905 and six days, possibly meaning that the majority of the performance achieved corresponds to information provided by orbit 91 of SENTINEL-1. The value of the active-passive sensor fusion can be seen when considering that the best four performances are achieved when using both SENTINEL-1 and SENTINEL-2. In these four cases, orbit 91 is always present confirming the value of a consistent flow of SENTINEL-1 data (not fully achieved with orbits 18 and 142). An interesting case to highlight here is that using orbit 91 and SENTINEL-2 (S1_91+S2) performs better than using all the SENTINEL-1 acquisition geometries (S1_all), confirming the value of SENTINEL-2 even with its data gaps in cloudy months. Note that this case (S1_91+S2) practically achieves the same performance as the best case. However, using all the orbits of SENTINEL-1 (or more than one) and SENTINEL-2 provides a more robust system in the case of intermittent operation of any acquisition orbit, as was the case in this test site with orbits 18 and 142. Fig. 8 shows the results of retrieving the crop state variables with the GP-UKF when using the active-passive sensor fusion for the whole farm. These four dates show the estimation over time of Brix degrees. As can be seen, not all the parcels are in the same development stage since farmers can manage this, taking advantage of the all year-round food production potential offered by the local climatological conditions.

D. Key Date Estimation
The same procedure to fill gaps or make one-step-ahead predictions can be recursively performed for multiple-step-ahead prediction or forecast. This gives the GP-UKF the possibility to estimate the occurrence of future crop key dates. Accurately predicting the date when a parcel will be ready for harvest is essential to plan human and material resources required for harvest and to plan the capacity in the plant to process the harvested asparagus. This is particularly important considering the all-year round production of asparagus. The crop is assumed to be ready for harvest when the accumulation of below-ground plant carbohydrates in the root system is maximum. Therefore, the procedure to test the GP-UKF capabilities for harvest date prediction is as follows: Using the case that provides the best results for tracking, i.e., using the three acquisition geometries of SENTINEL-1 and SENTINEL-2, the GP-UKF tracks the crop state variables until the last season in 2020. Then, four cases are considered: the filter stops the tracking when the season crop age is 0, 30, 60, and 90 days. The GP-based dynamic model is utilized to recursively forecast from these points when the maximum Brix degree in the next 200 days will occur. This date is then extracted as the expected harvest date for the parcels of asparagus included in the analysis. A key point to highlight is that this date, in practice, can be affected by management decisions since it could be accelerated or delayed depending on contractual obligations or market intelligence. Therefore, since the date provided here is an estimation based only on the crop condition, it may in some cases disagree with the real date for days or in few cases even weeks. Table IV shows the results from forecasting the harvest date for the three cases considered: As can be seen, the lower the error predicting harvest date, the closer the parcel is to the real harvest (DaS = 90). This is because the filter has seen more observations to correctly track the crop state variables up to that point. Conversely, forecasting the harvest date from DaS = 0 corresponds to predicting without the aid of remote sensing observations, but only using the dynamic model. Note than even when predicting harvest date 30 days after the season started, an MAE of six days is achieved. This is already an acceptable error for the practical purpose of this task.
Despite not accurately knowing when the season started but assuming a window of one month where it could have started, and after receiving observations for two months (60 days), the GP-UKF is able to forecast the date when the crop will be ready to harvest with an error (MAE) of four days.

VI. DISCUSSION
This article introduced a filtering framework designed to track below-ground asparagus plant carbohydrates and the season crop age with fusion of freely available and multitemporal SENTINEL-1 and SENTINEL-2 data. The method proposed that complements what other studies have suggested are the advantages of a filtering framework, particularly in relation to accuracy, sensor fusion, and key date estimation [10], [12], [13]. It was shown that retrieving Brix degrees and DaS with SENTINEL-2 data could be more accurate than SENTINEL-1, if no data gaps existed. However, if a SENTINEL-1 acquisition geometry provides constant data (S_91), it is overall more accurate than SENTINEL-2. Note, however, that, even when SENTINEL-2 data gaps exist, the GP-based dynamic model is able to fill these gaps and provide an estimation of the tracked crop state variables and to forecast crop key dates. The combination of active-passive data provided the best performances and was found to be valuable to better predict the Brix degree of a parcel at the end of the season. This is one of the most important moments in the season for this prediction, since it is highly correlated with the crop yield [23], [41]. As can be seen in Table III, using one orbit of SENTINEL-1 that provides systematic acquisitions and in fusion with SENTINEL-2 (S1_91+S2) provides better performance than using all three available acquisition geometries of SENTINEL-1. This helps corroborate the value of the active-passive sensor fusion. In addition, including more than one SENTINEL-1 orbit with SENTINEL-2, besides of providing the best performance, contributes to having a more reliable operational system.
In all the cases, nonlinear and nonparametric dynamic and observation models are learned, allowing the GP-UKF to, if desired, be transferred to other crop state variables, crop types, locations, and wavelengths if training data are available. This is a clear improvement with respect to previous approaches that use filtering frameworks with parametric models [12], [13]. Note also that in this article, we proposed to estimate the DaS directly as part of the filtering instead of designing a separate algorithm as has been done in previous works (see [6] and [36]).
Compared to the traditional Kalman filter, the GP-UKF is more accurate as it is able to learn nonlinear dynamic and observation models. Compared to the EKF as in [10], the UKF has been shown to be at least as accurate as the EKF [34]. Note that propagating the states through the GP-based models results in a non-Gaussian distribution [33]. However, the GP-UKF approximates them to Gaussian in order to use the original Kalman filter equations. In this regard, while the GP-UKF is less computationally demanding, it may be at a slight disadvantage compared to the PFs of [11] and [13]. Nevertheless, a pragmatic approach can be taken accepting this potential loss of accuracy, as it allows for a significant improvement in reducing the destructive sampling damage of this high value crop. Future work will address this potential disadvantage. Further research will also focus on extending the present method as a crop development anomaly detector. This would be by detecting when the crop behavior is deviating from the expected crop growth learned by the dynamic models.

VII. CONCLUSION
In order to provide an alternative to current destructive and expensive field sampling of below-ground asparagus plant carbohydrate, this article presents a novel, data-driven, unscented Kalman filtering framework. It fuses multitemporal and freely available SENTINEL-1 and SENTINEL-2 data to track Brix degrees as a surrogate of carbohydrates and the season crop age and to forecast crop key dates. GPRs are trained with multiyear ground truth to learn the dynamic and observation models and their corresponding uncertainties. After testing the performance of the GP-UKF with unseen field samples, an MAE of 1.802 Brix degrees is obtained fusing the three SENTINEL-1 acquisition geometries available in the test site and the SENTINEL-2 data when available. The GP-UKF achieved an MAE of six days for crop age retrieval and an MAE of six days for forecasting the date for a parcel being fit for harvest.
The GP-UKF proposed here is also able to perform daily predictions, fill data gaps, and forecast key crop dates while remaining robust to individual sensor failures. As a data-driven approach, the method can be applied to other crop biophysical variables and crop types.

A. Gaussian Process Regression
Given the importance of the GPR to learn dynamic and observations models, this appendix presents a basic introduction to its foundations. For an in-depth study of GPRs, the reader is referred to [20]. Note that in the literature, the X and y variables are normally used to denote, respectively, the inputs and outputs of a GPR. In this article, however, since these variables are used for the system states and vector of observations, the training inputs for the GPR are represented by the matrix χ χ χ, while the outputs are represented by the vector ψ. In this context, given a set of inputs χ χ χ = [d 1 , d 2 , . . . , d n ] T , where d i is a D-dimensional input vector example and a set of outputs ψ = [ψ 1 , ψ 2 , . . . , ψ n ] T , with ψ i being the corresponding scalar output, a GPR considers the following nonparametric model for regression: where h represents the unknown mapping function between d i and ψ i and ε ∼ N (0, σ 2 f ) corresponds to the additive noise model with variance σ 2 n assumed to be corrupting the observed values. Rather than using a parametric model and estimating its parameters from the data in order to obtain a regression function, h(.) is assumed to be a GP (i.e., a Gaussian distribution over functions) [20]. A GP is fully specified by a mean function m(.) and a positive-semidefinite covariance function K(.) (also known as the kernel). The covariance function determines the key characteristics that the mean function takes, such as smoothness, periodicity, etc., and is chosen depending on the modeling problem at hand [20]. In this article, we consider the exponentiated quadratic kernel k(d i , d j ) where Λ is a diagonal matrix of the length-scale hyperparameters and σ 2 f is the variance of the function h defined in (5). At a test point(s) d * evaluated with the model h(.), the GP estimates an output ψ * with mean and variance using the predictive equations (7) and (8), respectively. These equations are derived from the Algorithm 1: GP-UKF Algorithm.
In (7) and (8), the term k * = k(χ χ χ, d * ) is a vector defined by kernel values between the inputs χ and test point(s) d * , K = k(d i ; d j ) is the n × n kernel matrix of the training input values, and k * * := k(d * ; d * ) is the kernel function evaluated at the test points d * . In (7), β β β is often used in the literature to shorten the equation, with β β β := (K + σ 2 e I) −1 ψ. The length-scale hyperparameters Λ and variance σ 2 f in the kernel function are learned by maximizing the log marginal likelihood of the training outputs given the inputs [20].

B. Unscented Kalman Filtering Combination With GP-Based Dynamic and Observation Models
Dynamic filtering algorithms approach the filtering problem in two steps: prediction and update [30]. Algorithm 1 begins the prediction in step 2 with the UT, generating a set of (2D + 1) so-called sigma points to characterize the distribution of the previous state x k−1 ∼ N (μ k−1 , Σ k−1 ). In this step, the parameter γ is associated with the spread of the sigma points around the distribution mean [30] and D is the number of state dimensions.
In Step 3, each of the sigma points is propagated through the GP-based dynamic model [see (3)], while in Step 4, the additive process noise is estimated from the predicted uncertainty of the GP-based dynamic model. In Steps 5 and 6, the predicted mean μ k and the predicted covarianceΣ k , as originally proposed in the UKF [34], are estimated. In these Steps, W i(m) and W i(c) are constant weights required to perform a weighted combination of the propagated or predicted sigma points [30]. The update stage of the GP-UKF algorithm begins in Step 7 by generating again a set of sigma points, now characterizing the predicted state distribution of Steps 5 and 6. Then, this set is propagated through the GP-based observation model in Step 8 to obtain the expected observations given the state (see Section III-C), while the observation noise is also obtained from the GP-based observation model in Step 9. Subsequently, in Steps 10 and 11, again, a weighted combination of the propagated sigma points is made to obtain the mean and covariance of the predicted observations. Step 12 computes the cross covariance of the predicted state and predicted observations, as proposed in the seminal UKF algorithm.
Step 13 computes the so-called Kalman gain as in the original Kalman filter algorithm, and Steps 14 and 15 provide the state mean μ k and the covariance Σ k , conditioned on the sensed observations y k