On the potential of Sentinel-2 for estimating Gross Primary Production

Estimating Gross Primary Production (GPP), the gross uptake of CO2 by vegetation, is a fundamental prerequisite for understanding and quantifying the terrestrial carbon cycle. Over the last decade, multiple approaches have been developed to derive spatio-temporal dynamics of GPP combining in-situ observations and remote sensing data using machine learning techniques or semi-empirical models. However, no high spatial resolution GPP product exists so far that is derived entirely from satellite-based remote sensing data. Sentinel-2 satellites are expected to open new opportunities to analyze ecosystem processes with spectral bands chosen to study vegetation between 10m to 20m spatial resolution with 5-days revisit frequency. Of particular relevance is the availability of red-edge bands that are suitable for deriving estimates of canopy chlorophyll content that are expected to be much better than any previous global mission. Here we analyzed whether red-edge-based and near-infrared-based vegetation indices (VIs) or machine learning techniques that consider VIs, all spectral bands and their non-linear interactions could predict daily GPP derived from 58 eddy covariance sites. Using linear regressions based on classic VIs, including Near Infrared Reflectance of Vegetation (NIRv), we achieved prediction powers of R210-fold = 0.51 and an RMSE10-fold = 2.95 [μmol CO2 m-2 s-1] in a 10-fold cross-validation. Chlorophyll Index Red (CIR), and the novel kernel NDVI (kNVDI) achieved significantly higher prediction powers of around R210-fold ≈ 0.61, and a RMSE10-fold ≈ 2.57 [μmol CO2 m-2 s-1]. Using all spectral bands and VIs jointly in a machine learning prediction framework allowed us to predict GPP with R210-fold = 0.71, and RMSE10-fold = 2.68 [μmol CO2 m-2 s-1]. Despite the high-power prediction when machine learning techniques are used, under water-stress scenarios or heat-waves, optical information alone is not enough to predict GPP properly. In general, our analyses show the potential of non-linear combinations of spectral bands and VIs for monitoring GPP across ecosystems at a level of accuracy comparable to previous works, which, however, required additional meteorological drivers.

G ROSS Primary Production (GPP), the amount of carbon absorbed by the ecosystem via plant photosynthesis, is the largest single flux in the global carbon cycle [1]. GPP varies in response to several abiotic [e.g., radiation, temperature, and precipitation; 2, 3], and biotic factors [e.g., metabolic pathway, vegetation type, leaf chemical traits, and species composition; 4]. However, GPP cannot be directly observed and needs to be derived from in-situ measurements of net CO 2 exchanges using the eddy covariance (EC) technique over canopies [5,6]. Using different flux partitioning methods, it is possible to estimate the amount of carbon that is taken up by the ecosystem (GPP) or released through ecosystem respiration (RECO) [7,8,9,10,11]. Nevertheless, EC can only measure the exchange of energy and matter between the ecosystem and the atmosphere at the scale of the climatology footprint, which can vary between a few hundred meters to a few kilometers [e.g . 12]. Today, EC data are available globally in multiple regional networks (Integrated Carbon Observation System: ICOS, The National Ecological Observatory Network: NEON, AmeriFlux, AsiaFlux), and the meta-network Fluxnet [13,14]. The flux database networks enable studies into local processes understanding [15,6,16,17], evaluating biotic and abiotic relationships on multiple time scales [e.g. 18,19], and evaluating terrestrial biosphere models [20,21,22,23].
In the last decades, many process-based, semi-empirical, and data-driven models have been developed to upscale GPP using remote sensing data, and climate information, in order to understand the spatio-temporal dynamics of the global carbon cycle [24,25,3,26]. For instance, the MODIS MOD17 product is based on a semi-empirical model that estimates GPP as the product between the light-use efficiency and the absorbed photosynthetically active radiation (APAR) [27]. The maximum light-use efficiency is a plant functional type dependent parameter, and it is down-regulated by stress factors that depend on temperature and vapor pressure deficit that need to be parameterized. The Breathing Earth System Simulator (BESS) [28] is a process-based approach, which relies on a radiative-transfer model coupled with several remote sensing products to predict GPP and evapotranspiration (ET) at global scale with a temporal resolution of 8 days. Jung et al. [29] showed that machine learning methods can likewise efficiently upscale fluxes from in-situ data to the globe. Building on this work, Tramontana et al. [30] used the FLUXNET dataset and MODIS remote sensing information to train multiple machine learning techniques to predict monthly GPP at global scale. Later, Bodesheim et al. [31] produced GPP global products at half-hour temporal resolution using different settings, but of low spatial resolution (0.5 • ). The state-of-the-art machine-learning based upscaling of GPP is described in Jung et al. [26].
A more direct approach to predicting GPP is to identify vegetation indices that are highly correlated with GPP dynamics. Badgley et al. [32], for instance, found that the Near-Infrared Reflectance of vegetation (NIRv) index strongly correlates with monthly estimates of Sun-Induced chlorophyll Fluorescence (SIF), rendering it a potential predictor for GPP at the global scale. Later on, Badgley et al. [33] showed that NIRv can explain 68% of the monthly GPP variability at the FLUXNET sites. Recently, Camps-Valls et al. [34] presented a non-linear version of the Normalized Difference Vegetation Index (NDVI) based on kernel methods (kNDVI) that correlates better with GPP and SIF products than NIRv and NDVI. The advantage of such approaches is that they rely purely on remote sensing data and circumvent the parameterization of light use efficiency models. However, relying on reflectance values alone means that the detection of physiological regulation of photosynthesis via meteorological conditions are not detectable, unless they last long enough to affect vegetation pigments and structure.
Today, new satellite missions have increased the information available to characterize vegetation properties and ecosystem processes [35,36]. Specifically, the satellite missions from the Copernicus program have opened new ways to monitor ecosystem processes with unprecedented spatial, temporal, and spectral resolution [37,38]. For instance, it has been shown that Copernicus missions allow deriving plant traits such as chlorophyll and nitrogen content along with other biophysical parameters [39,40,41,42]. To the best of our knowledge, only three studies have evaluated the prediction capacities of GPP using Sentinel-2: Wolanin et al. [43] used the SCOPE model and machine learning techniques to predict GPP of C3 crops. Lin et al. [44] evaluated the potential prediction of GPP as a function of the vegetation index multiplied by the incident photosynthetic active radiation (PARin). They analyzed the performance of five red-edge vegetation indices and three non-red-edge vegetation indices. They found that Chlorophyll Index Red (CIR) showed the highest correlation with GPP from the EC tower for two grassland sites. And finally, Cai et al. [45] compared GPP predictions using Sentinel-2 and MODIS for several EC-sites in Northern Europe. The authors did not find any improvement for the prediction of GPP when using Sentinel-2 compared to MODIS using the enhanced vegetation index (EVI2). Despite these advances, there is a lack of systematic comparison between novel red edge vegetation indices and vegetation indices based on the classic red and NIR bands (i.e., NDVI, kNDVI, and NIRv) in terms of their predictive power regarding GPP. Likewise, the question of whether a machine-learning approach considering all Sentinel-2 bands could improve the satellite-based predictions of GPP remains unresolved.
In this study, we aim at understanding the potential of Sentinel-2 mission for monitoring GPP across European and North American major biomes at high spatial resolution. Firstly, we want to understand, which vegetation indices or spectral bands available from Sentinel-2 are the most relevant for predicting GPP. Secondly, we investigate what is the difference in prediction-performance between different approaches based on state-of-the-art vegetation indices (e.g. NIRv, kNDVI, red-edge based, and non red-edge indices) and machine learning using all spectral bands.

A. Eddy Covariance sites
We used 58 EC sites compiled by the ICOS Drought 2018 Team (49 sites) and the Ameriflux/ONEFLUX (9 sites) initiatives from 2015 to 2018 (Table A2). We used halfhourly GPP data (GPP NT VUT USTAR50) estimated using the FLUXNET2015 workflow [14]. GPP is calculated in FLUXNET with the night-time partitioning method [8] using a variable u* threshold for each year. The annual u* threshold is derived from the 50 th percentile of u* threshold distribution obtained by bootstrapping the original night-time net ecosystem exchange data [14]. Daily GPP values are estimated as the mean of the half-hourly values where net ecosystem exchange is observed or gap-filled with good quality (e.g., NEE VUT USTAR50 QC = 0 and 1). In our analysis, days with less than 70% of good quality half-hourly data were set to "NA". Finally, we smoothed the time-series using a moving window mean with a window size of 7 days.

B. Sentinel-2 Imagery
We downloaded Sentinel-2 L1C products for the EC sites from 2015-2018 using the Scihub Copernicus portal (https: //scihub.copernicus.eu/, last accessed October 2020). We performed atmospheric corrections for all products using Sen2Cor 2.5.5 [46]. All bands were resampled to 20 meters resolution using the nearest-neighbor approach for up-sampling, and median for down-sampling. Finally, we computed several vegetation indices (See Table C4) such as NDVI, kNDVI, NIRv, and multiple red-edge vegetation indices as the Inverted Red Edge Chlorophyll Index (IRECI), and CIR. Among these indices, kNDVI requires a specific parameterization of the kernel width σ, which was here set to the median distance between the near-infrared band (NIR) and the red band per spatial pixel; for Sentinel-2 σ = median(0.5×(B8+B4)). Post-processing of the images was performed using SNAP v7.0 [47], and automatized using the Graph Processing Framework and the Graph Processing Tool. The scripts for the post-processing of the products are available at a GitHub repository (See code availability).
We defined a buffer area of 100 meters radius around the EC towers to ensure that the flux footprint climatology lies within this area (Supplement Material 1). We used the scene classification generated by Sen2Cor to filter out images with: "No data", "Saturated or defective pixels", "dark areas", "cloud shadows", "water", "cloud", "thin cirrus", and "snow". To reduce the effect of shadows or saturated pixels that are not correctly classified by Sen2Cor, we implemented an outlier detection approach that consists of three steps: Firstly, we computed z-scores (data centered and scaled to unit variance) per image and removed pixels of the buffer area with an absolute residual value higher than quantile 0.99 [48]. Secondly, to detect potential images with clouds, we used the time series of the spectral bands per site. We then estimated the average of the buffer area for each image/band and decomposed the time series of each band into a seasonal and a trend component using locally estimated scatterplot smoothing [LOESS 49]. Next, we applied an inner-quantile range technique over the residual of the time series decomposition [50]. Residuals with values higher or lower value than 3 times the quantiles 0.25 and 0.75, respectively, were also classified as outliers. This analysis was performed using the "anomalize" R package [50] Thirdly, we defined a bigger buffer area of 900 meters, where we estimated the percentage of clouds. We removed observations where the percentage of clouds was above 70%. We also identified 16 additional images with clouds by visual inspection (Supplement Material 3). We present the complete description of the time series decomposition and the outlier detection in the Supplementary Material 2 (the R scripts are available in the GitHub repository, see code availability). The minimum number of images per site detected as an outlier is 1, the maximum is 20 and the mean across sites is 6 images. Finally, we selected the daily GPP values for the days when we also have valid images from Sentinel-2.

C. Dataset balancing
The imbalanced representation of different categories in a dataset can influence the weighting of the observations during the training process and consequently in the quality of the prediction [51]. In the last decades, several methods have been developed to solve this issue, mainly for classifications problems, but recently also for regression analysis [52,53]. To address this problem for the prediction of GPP through different vegetation types that are not all equally represented ( Figure 1). We implemented three methods to balance the dataset given the differences in the number of observations per vegetation type. 1. Under-sampling balancing: all observations are grouped by vegetation type, and are resampled without replacement, to the number of observations of the vegetation type with the least observations. 2. Oversampling balancing: all observations are again grouped by vegetation type. Each category is completed until reaching the number of observations of the maximum category (sampling with replacement). The replacement technique is only applied when the total observations of the category are less than the difference between the number of observations of the category with the maximum number of observations and the total number of observations of the current vegetation type. 3. Synthetic Minority Oversampling TEchnique for Regression (SMOTER) balancing: it is a balancing technique proposed by Torgo et al. [52], where the idea behind the method is to under-sampling observations with high frequency. In contrast, values with a low frequency (rare observations) are over-sampled. In this form, rare observations will have a higher weight during the training. All following analyzes were applied considering all three balancing techniques as well as to the imbalanced case.

D. Linear regression based GPP prediction
We evaluated the performance of red-edge vegetation indices to predict GPP using linear regression, using the balanced and imbalanced datasets. We compared the performance of NDVI, NIRv and kNDVI [34], as well as red-edge vegetation indices such as IRECI and CIR (for an overview, see table C4). All evaluations were based on Leave-Location-and-Time-Out 10-fold cross-validation as proposed by Meyer et al. [54], and implemented in the "CAST" R package [55]. To increase the robustness of the analysis, the generation of 10 folds was repeated 50 times. In this approach, the partitions for the crossvalidation are semi-randomly generated to minimize spatial and temporal auto-correlation. We evaluated the performances of the different models using coefficient of determination (R 2 ) of the linear regression between observed and predicted GPP, the root-mean-square error (RMSE), and the mean absolute error (MAE). Finally, we compared the distributions of the model evaluation metrics between the vegetation indices using the Wilcoxon test [56].

E. Machine-learning based GPP prediction
We used random forests [57] as prediction approach for GPP for each balanced and imbalanced dataset. A detailed description of how to use RF for upscaling land surface fluxes can be found in Bodesheim et al. [31]. We explored what variables are the most relevant for predicting GPP. For this, we evaluated the radiometric indices presented in the Table  C4, additionally to the spectral bands B1, B2, B3, B4, B5,  B6, B7, B8, B8A, B9, B11, and B12 (Table B3) resulting in a total of 35 predictor variables. kNDVI was not included here since it is a non-linear transformation of the NDVI using kernel methods, and its inclusion would have added no information when applying machine learning techniques. We performed a forward feature selection as suggested by Meyer et al. [58], where the models are generated based on the pairs' combination of predictors, allowing us here to compare non-linear combinations of spectral bands and vegetation indices, as we may expect that they could reduce model complexity. The power prediction of each model was estimated using a 10-fold Leave-Location-and-time-out crossvalidation Meyer et al. [54], where the 10-folds were generated 50 times to increase the robustness of the analysis. The idea is that the model with highest R 2 is selected first, then new variables are iteratively added to this initial model. The process finishes when none of the remaining variables increases model performance.

III. RESULTS AND DISCUSSION
In the following, we firstly report the results of the GPP prediction using different vegetation indices in linear regressions, where we specifically discuss the performance of GPP estimates based on red-edge vegetation indices compared with the ones based on NIRv, NDVI and kNDVI. We also discuss the effect of the balancing techniques on the performance of the prediction. Secondly, we present the results of the GPP prediction using Sentinel-2 spectral bands and vegetation indices using random forests, where we present examples of the prediction for different EC sites, and an entire Sentinel-2 tile. Finally, we discuss the possibilities and limitations of predicting GPP using remote sensing information only and how such prediction can be improved in the future and provide globally continuous flux estimates.

A. GPP prediction using linear regressions
In figure 2 we compare the performance of linear GPP predictions using red-edge based vegetation indices (CIR, IRECI, Table C4), NIRv, NDVI and kNDVI. Red-edge vegetation indices perform better than NDVI, and NIRv in all considered metrics (Figure 2), while kNDVI perform as well as IRECI. According to the Wilcoxon test, the differences in the performance distribution of each index are statistically significant. In general, CIR explains on average 3% more of the GPP variance than kNDVI, 4% more than IRECI, 10% more than NIRv, and 11% more variance than NDVI. kNDVI explains in average 1% more than IRECI, 7% more than NIRv, and 8% more than NDVI. The prediction of GPP using IRECI shows that 6% more variance in GPP is explained compared to NIRv, and 7% more than NDVI. NIRv only explains 1% more of the GPP variance than NDVI. The RMSE shows smaller errors in GPP estimated with CIR, kNDVI and IRECI compared to the estimates based on NIRv and NDVI ( Figure 2). As expected, when balanced datasets are used, the explained variance increases 2% for CIR, from 2% to 4% for IRECI, from 2% to 5% for NIRv, from 1% to 3% for kNDVI, and from 2% to 3% for NDVI (Table I,  Badgley et al. [32] introduced the NIRv as an alternative to SIF for the estimation of monthly GPP. Compared to machine learning products or radiative transfer models, the advantage of this approach is that it could be used to estimate global GPP easily using global and long-term time series products like MODIS. However, our results suggest that the red-edge vegetation index CIR yield significantly higher prediction powers of GPP compared to NIRv. This finding could be interpreted as an important argument for relying on the novel Sentinel-2 data for GPP prediction. Red-edge is the region around 710 nm which marks the sharp transition between the red region (700 nm), where the absorption of chlorophyll occurs, and the near-infrared region (730 nm), where the reflectance is produced by the internal structures of the leaf [59, p. 180]. This region is highly sensitive to the leaf chlorophyll content [60,61]. At the same time, chlorophyll content is a controlling factor of the fraction of photosynthetically active radiation absorbed by plants (APAR). This is one possible explanation why CIR is strongly correlated with GPP [62], even if it cannot reflect the fast variations of the photosynthesis itself. For these reasons, VIs based on red-edge bands might generally have advantages for estimating GPP over VIs that do not rely on the red edge. Lin et al. [44] found that CIR multiplied by PAR can explain slightly more variability of GPP than NIRv multiplied by PAR for 2 grasslands sites. However, we would argue that the PAR effect could be dominant in their study, while our aim here was to focus on the spectral information only.
We also tested the predictive performance of kNDVI [34] which was reported to predict monthly GPP better than NIRv. The idea behind kNDVI is to solve the saturation problem of NDVI at high values by exploring the non-linear relations of the two bands of the NDVI. Even tough no red-edge information is used, we found that kNDVI performed at the level of IRECI in our study. One interpretation of this finding is that most of the information contained in the red-edge bands can be captured by an appropriate transformation of the distance between near-infrared and red bands. However, there is no direct mechanistic argument, and it is unclear to what extent this observation is general and further research will be necessary. However, our results may imply that kernel versions of classical vegetation indices could derive relevant information from satellite missions that do not have red-edge indices.

B. GPP prediction using Random Forest
Another question of this study was whether machine learning could outperform even the new generation of vegetation indices. In Table D5 we present the results of the variable selection analysis where a different number of predictors are selected depending on the balancing technique. From 35 predictors that included Sentinel-2 spectral bands (Table B3) and derived vegetation indices (Table C4) CIR, S2REP, and B1 are selected for all datasets, while GNDVI, PSSRA B3, and B4 are selected in at least three cases. ARVI, MTCI, MCARI, B2, and B5 are selected at least in two datasets. IRECI, NDI45, RVI, TNDVI, TSAVI, CIG, and B12 are selected at least once (Table D5). The variable selection analysis shows that even when non-linear combinations of spectral bands are possible, vegetation indices are still selected as they probably would simplify the machine learning model. Yet, not all information required for predicting GPP seems to be encoded in vegetation indices alone. Bands B1, B2, B3, B4, B5, and B12 also appear to provide information that is useful for the predictions. A surprising result is the selection of band B1. This band is typically used for aerosol detection and correction purposes. We speculate that B1 is a proxy for radiation dynamics (e.g. direct and diffuse radiation) that are important for GPP. However, we note that Penuelas et al. [63] had considered this spectral region earlier in their Structure Insensitive Pigment Index (SIPI) which has, however, not been developed further for vegetation monitoring. The additional selection of bands B2 (blue), B3 (green), B4 (red), and B5 (vegetation rededge) suggest that there is space for the development of new vegetation indices that can capture the GPP variability beyond the existing indices.
In figure 3 we present the prediction of GPP using Random Forest regression. Where GPP can be predicted with a R  (Table D5). The comparison between the distribution of the metrics shows that there are significant differences between the imbalanced and balanced datasets (Figure E.1). The results of the cross validation for each fold and balancing technique are presented in the supplement material 7. Tramontana et al. [30] reported that spectral information with machine learning techniques can explain around 78% of the GPP variability across sites. One of the advantages of our approach is that it does not require a previous vegetation type classification [64]. In comparison with the estimation of GPP using biophysical parameters as e.g. in Lin et al. [44], we show that it GPP can be estimation more directly with high accuracy.
In figure 4 we present examples of predicted and observed GPP representing different vegetation types. The prediction for each site is presented in supplementary material 5. Despite the overall high variances explained by random forests, there are indeed cases when GPP cannot be predicted correctly. For instance, the maximum GPP is underestimated in savannas and evergreen needleleaf forest ecosystems. Our study period covers the 2018 heat-wave, an extreme event where northwestern Europe vegetation was highly affected [65,66,67]. We find, however, that the reduction in CO 2 uptake during this event was not well captured for mixed forest and deciduous broadleaf forest ( Figure 4). This can be seen also when comparing the time series of 2016 and 2017 to 2018 (see Figure 4). This means that the prediction of ecosystem fluxes during extreme-events remains an open issue that needs to be addressed with high priority as discussed in Sippel et al. [68]. However, our finding that GPP dynamics during drought events cannot be well represented is in-line with earlier findings. For instance, Bodesheim et al. [31] showed that GPP was not properly predicted during dry summers for several EC sites, and attributed this to the poor representation of water availability in their dataset. Different from our study, their study also used climate information which, in theory, increase the model performance for water stress scenarios. One general problem could be the time lag between change of photosynthesis rates and the decline in the concentration of the pigments, including chlorophyll content, in the leaves. However, given that the data generated here are based on vegetation reflectance properties only, it is expected that they can only pick up changes in GPP that are primarily driven by changes in APAR and pigment concentrations, but are not apt to capture the fast response of photosynthesis mediated e.g. by stomatal closure. This limitation is inherent to all reflectancebased methods and the reason why in some sites we are not able to reproduce GPP dynamics under stress.
Nevertheless, the overall seasonal dynamics are represented very well in our GPP estimates across sites and vegetation types. Future studies should investigate whether the inclusion of thermal information from Sentinel-3 or radar information from Sentinel-1 can help to indirectly address the water deficit in the ecosystems during drought periods [69] and lead to the next generation of operational GPP products based on remote sensing data only. In addition, the unique combination of red-edge vegetation indices in Sentinel-2, radar information from Sentinel-1, or multispectral and thermal information from the bands available in Sentinel-3 may open unprecedented possibilities for vegetation monitoring in the near future [35].
Previous studies used plant functional classes as a spatial feature to upscale GPP [70,30]. To use vegetation types as predictor of GPP, a necessary step will be to improve the land cover maps to match the resolution of Sentinel-2. The ESA WorldCover consortium gave the first steps, producing the first global land cover map at 10 meters resolution for 2020 using radar information from Sentinel-1 and optical information from Sentinel-2 [71]. Future research will have to test the added value of these upcoming products for predicting carbon fluxes at high spatial resolution.
To give a taste of what the mapping of carbon fluxes might look like in the future, in figure 5 we present an example of the upscaling of GPP for a Sentinel-2 tile over the Ballons des Vosges Regional Nature Park (France, 2020-06-23; Supplement Material 6). The area contains different types of deciduous broadleaf forest, weatlands, grasslands, and croplands. Even though our model does not use vegetation type as a predictor, it does clearly differentiate GPP dynamics of crops, weatlands and forests. The high spatial resolution of Sentinel-2 could be a nice avenue to monitor forests with a high degree of fragmentation [72], or even green areas in cities [73]. A tutorial of how to use the final models produced in our study to upscale GPP using any Sentinel-2 L2A product provided by Copernicus-ESA is presented in the code repository.

IV. CONCLUSION
In this study, we explore how remote sensing information provided by Sentinel-2 can be used to predict GPP across a variety of vegetation types. We find that the Chlorophyll Index Red (CIR) explains in average 10% more of the variability of GPP at daily scale than NIRv and 11% more than NDVI using linear regressions. The high correspondence between kNDVI and IRECI is unanticipated and requires further physical exploration. The prediction power of vegetation indices can be slightly outperformed using machine learning: using random forests, the spectral information provided by Sentinel-2 alone can predict in average 68% of GPP variability (crossvalidated). However, under extreme climate conditions such as the 2018 drought/heatwave, meteorological data or thermal information might be necessary to improve the prediction of short-term reduction of GPP that are not associated to changes in APAR or the decline of chlorophyll content. From a methodological point of view, we also explored whether balancing techniques can help to represent vegetation types and rare observations. Furthermore, we found that improvements in the prediction accuracy of GPP is associated to the use of balanced datasets for the training. Overall, our study presents a first attempt to assess the capability of Sentinel-2 data alone to predict GPP. Despite the discussed limitations, Sentinel-2 generally offers a highly relevant perspective to map fluxes at high spatial resolution, opening new ways to understand ecosystem processes and responses from local to global scale.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
Difference Vegetation Index (DVI) Global Environmental Monitoring Index (GEMI) B8A+B4+0.5 [110] Green Normalized Difference Vegetation Index (GNDVI)  ICOS data is available on the web-site: https://www.icos-cp.eu/data-products/YVR0-4898. Ameriflux data is available on the web-site: https://ameriflux.lbl.gov/data/download-data-oneflux-beta/ ACKNOWLEDGEMENT This project has received funding from the European Union's Horizon 2020 research and innovation program via the TRuStEE project under the Marie Skłodowska-Curie grant agreement no. 721995. M.D.M. and M.R. thank the European Space Agency for funding the project DeepExtremes -AI for Science, Multi-Hazards, Compounds and Cascade events. This work used eddy covariance data acquired and shared by Ameriflux, and the Integrated Carbon Observation System (ICOS). In addition, funding for AmeriFlux data resources was provided by the U.S. Department of Energy's Office of Science. We thank Prof. Dr. Gustau Camps-Valls for his feedback on kNDVI.