Introduction
Normalized difference vegetation index (NDVI) time series (TS) derived from remote sensing images have been widely used in vegetation phenology detection [1], land-cover change monitoring [2], [3], [4], environmental dynamic simulation [5], [6], and vegetation classification [7]. However, satellite remote sensing TS images are frequently interrupted because of low temporal resolution or contamination by poor atmospheric conditions (such as aerosols and dust), clouds, and snow [8], [9]. Effective methods for reconstructing NDVI TS are therefore needed to support downstream applications. According to the principle used to restore missing information, three categories of NDVI TS reconstruction methods have been widely used in the past few decades: temporal-based methods, frequency-based methods, and hybrid methods.
Temporal-based methods: This category can be further divided into three types: temporal interpolation replacement methods, such as iterative interpolation for data reconstruction (IDR) [10], the best index slope extraction (BISE) [11], and modified BISE (M-BISE) [12]; temporal filters, including the Savitzky–Golay (SG) filter [13], the Whittaker filter [14], and the changing-weight (CW) filter [15]; and temporal function fitting methods, such as double logistic (DL) function fitting [16] and Fourier fitting [17].
These methods rely heavily on local temporal neighborhood information and therefore have limited reconstruction capability. Consequently, researchers have extended temporal-based reconstruction methods. For instance, Liang et al. [18] utilized MODIS, Landsat 8, and Sentinel-2 imagery at different spatio-temporal resolutions, employing gap-filling and Whittaker smoothing to recover NDVI TS. Yang et al. [19] enhanced the DCT-PLS method for reconstructing unevenly spaced data, generating high-quality, cloud-free Sentinel-2 NDVI TS.
Frequency-based methods: Frequency-based methods recover TS by transforming contaminated data from the time domain to the frequency domain. Notable methods in this category include the harmonic analysis of TS (HANTS) method [20], and the wavelet transform (WT) method [21]. However, these methods may unintentionally diminish reasonable high values and struggle to effectively preserve vegetation phenology. To address these limitations, improved frequency-based methods, such as the spatio-temporal prefill method with harmonic analysis of TS (ST-HANTS) [22], have been proposed.
Hybrid methods: The two aforementioned categories of methods have demonstrated success in specific scenarios [23]. However, their extended application is limited because they do not consider the spatial dimension. In recent years, hybrid methods that integrate both temporal and spatial information have attracted growing research interest. Examples include the search and fill algorithm with moving offset method (SFA-MOM) [24], the spatio-temporal Savitzky–Golay (STSG) method [25], and the spatio-temporal tensor completion (ST-Tensor) method [26].
Current NDVI TS reconstruction methods face three primary challenges. First, existing methods are typically designed for MODIS NDVI products, which have coarse spatial resolution and high temporal resolution [27]. This raises concerns about their applicability to medium- and high-resolution images, which are harder to reconstruct. Second, most methods perform poorly when data gaps are long and continuous. Third, many existing methods rely heavily on the pixel reliability index (RI) dataset, which leads to substantial noise in the reconstruction results when RI contains errors [26], [28].
For the first issue, Landsat and Sentinel NDVI TS data are suitable for more detailed applications. However, because of their higher spatial resolution and lower revisit frequency, reconstructing Landsat and Sentinel NDVI TS data is more challenging than reconstructing coarser resolution data. Several studies have proposed novel reconstruction algorithms tailored to the intricate nature of these data. For instance, Yu et al. [29] proposed a climate incorporated gap-filling (CGF) method to generate Landsat NDVI TS at 8-day intervals. Chen et al. [30] obtained NDVI TS data by integrating MODIS NDVI TS data with cloud-free Landsat observations. Additionally, Yang et al. [31] proposed a method to synthetically generate gap-free NDVI TS from raw contaminated observations for reconstructing Sentinel-2 NDVI TS. Landsat and Sentinel TS data have also been used for national- or local-scale mangrove species mapping [32], [33], mapping mangrove functional traits [34], sustainable mangrove management [35], and coastal salt marsh mapping [36], which greatly benefits blue carbon research and precise management.
For the second issue, there are mainly two approaches to resolve it.
The first common approach uses spatially neighboring pixels to reconstruct the NDVI value of the target pixel. Methods such as STSG [25], ST-Tensor [26], and wWHd [37] have gained prominence in this context. Typically, these methods generate a 1-year reference NDVI TS to capture the seasonal growth pattern by averaging all uncontaminated NDVI values for the corresponding day of the year (DOY) across all years in the TS. Subsequently, the correlation between the reference TS of neighboring pixels and that of the target pixel is calculated to identify similar pixels. Finally, the NDVI value from the generated spatial reference TS directly replaces each NDVI value labeled as contaminated.
Another commonly employed approach is MODIS-Landsat spatio-temporal fusion for reconstructing Landsat NDVI TS data. Examples include the highly scalable temporal adaptive reflectance fusion model (HIST-ARFM) algorithm [38], GF-SG [30], enhanced gap-filling and Whittaker smoothing (EGF-WS) [18], and CGF [29]. Such methodologies typically require obtaining cloud-free Landsat and MODIS images simultaneously on the base date, as well as MODIS images for the prediction date, thus presenting certain limitations in their applicability.
For the third issue, a limited number of scholars have introduced new methods to mitigate the dependence on quality assessment (QA). For instance, Zhu et al. [39] proposed a reconstruction method based on self-weighting function fitting from curve features (SWCF) that does not require ancillary data about quality. Additionally, Yang et al. [40] proposed an enhanced STSG method (cuSTSG) that alleviates the impact of inaccurate quality marks on the final results.
While the aforementioned improved methods have addressed most of the problems to some extent, three shortcomings persist.
Most existing methods rely on other data sources as supplementary data of the same type (with strict fusion requirements) or on prior knowledge data, resulting in the introduction of additional errors and a substantial increase in both data volume and workload.
Few methods adequately utilize the NDVI points marked as contaminated points in QA, which may include both valid and invalid points. This inadequacy results in further scarcity of usable data and exacerbates the challenges associated with reconstruction.
In terms of spatial information, current methods mainly generate a 1-year reference TS by averaging over the same DOY. Such a reference cannot represent land-use changes and may also make the correlation coefficient between a neighborhood pixel and the target pixel falsely high, which affects the identification of similar pixels.
To address the aforementioned issues, this study proposes a local peak Savitzky–Golay (LPSG) method for spatio-temporal reconstruction of Landsat NDVI TS based on two characteristics: the temporal variation of NDVI should be smooth and continuous, and contaminated NDVI values tend to be negatively biased [10]. First, we construct a local peak neighborhood weighted interpolation (LPNWI) method to fill gaps, eliminating the need for auxiliary data and maximizing the utilization of all original values. Second, we design a slope change decision tree (SC-DT) method to detect residual noise and mitigate it using LPNWI, thereby minimizing the impact of QA errors. Third, a multidimensional calibration with weighted spatial reference (MDC-WSR) method is proposed. Specifically, we design a new method to compute the weighted correlation coefficient between the target pixel and its neighborhood pixels, generating a 10-year weighted spatial reference (WSR) that effectively represents land-use changes. Subsequently, positive and negative bias anomalies are detected and calibrated. Finally, the SG filter is applied to obtain a smooth and high-quality TS. The main contributions of this article are as follows.
We construct a new LPNWI method for gap filling that fully utilizes the dynamic change law of gradual local NDVI values and the characteristic that contaminated NDVI values tend to be negatively biased noise [10].
The SC-DT method, designed to remove the noise remaining in the TS after gap filling, is robust to errors in QA.
The traditional method of calculating correlation coefficients between a target pixel and its neighborhood pixels is improved to generate WSR over multiple years, which is more robust to land-use changes and long continuous gaps.
Study Area and Data
The Qinghai–Tibet Plateau (QTP) is situated in the southwest region of China, spanning from 25
Study area of Qinghai–Tibet Plateau with GLC_FCS30-2020 [42] as the base map covered by eight typical land-use types.
Meanwhile, this study generates fused quality assessment (FQA) data based on the QA of the original Landsat series data to accurately characterize pixel quality. Pixels affected by clouds, shadows, and snow/ice in the original Landsat series data are excluded. If the count of effective NDVI points is greater than or equal to 1, the NDVI value at that position is classified as reliable (FQA = 1); if there are no effective points, the NDVI value at that position is treated as noise (FQA = 0).
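The FQA rule can be illustrated with a minimal sketch. The per-observation flag names (`cloud`, `shadow`, `snow`) and the function name are hypothetical and only stand in for whatever the Landsat QA bands provide; the sketch assumes that an "effective" point is any observation with none of these flags set.

```python
def fused_quality(observations):
    """Return 1 if at least one observation is uncontaminated, else 0.

    Each observation is a dict of boolean contamination flags.
    The keys "cloud", "shadow", and "snow" are illustrative names only.
    """
    effective = [
        obs for obs in observations
        if not (obs["cloud"] or obs["shadow"] or obs["snow"])
    ]
    # FQA = 1 when the count of effective points is >= 1, otherwise FQA = 0.
    return 1 if len(effective) >= 1 else 0


# Two observations fall on the same time step; one of them is clear.
period = [
    {"cloud": True, "shadow": False, "snow": False},
    {"cloud": False, "shadow": False, "snow": False},
]
fqa = fused_quality(period)
```

A time step with no observations at all, or with only contaminated ones, yields FQA = 0 under this rule.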
Finally, we utilize MOD09A1, MYD09A1, and MOD13Q1 products (retrieved from GEE) and Landsat series data to create a reference dataset (details are in Section IV-C). This reference dataset serves as the ground truth, enabling a quantitative evaluation of the reconstruction performance. Because the MODIS products have a coarser spatial resolution than Landsat, we resample them on the GEE platform so that all data share a 30 m spatial resolution. Specifics regarding the data used are presented in Table I.
Proposed Methodology
The flowchart of LPSG is shown in Fig. 2. First, gap-filling is performed on the original NDVI TS
Flowchart of the proposed LPSG method for spatio-temporal reconstruction of Landsat NDVI TS.
A. LPNWI for Filling Gaps
The commonly used linear interpolation with filtering combination method [30] utilizes only the information of good NDVI values in adjacent time. However, when there is a long continuous gap, this method tends to obscure the original details of the TS. In this study, we propose LPNWI to fully consider all raw NDVI values as shown in Fig. 3.
In order to account for the scenario where the first or last points of TS are contaminated, we mirror the head and tail of the TS with a line parallel to the y-axis which represents the NDVI values. Specifically, the head represents all the points within the range from the first point of the TS to the first good point, and the tail represents all the points within the range from the last point to the last good point.
First, we define a local peak (LP) point as a point that is larger than the points on its left and right sides, and perform a weighted linear interpolation between every two adjacent LP points to fill the target value. In this regard, the original weight of each NDVI value
For each NDVI value to be filled
\begin{align*}
\mathrm{T}_{e}^{m}=&\sum _{i=1}^{\ell }\sum _{j=1}^{\gamma } w_{\Delta }^{(i,j)} \Delta ^{(i,j)} \\
w_{\Delta }^{(i,j)}=&\frac{w_{o}^{m - i} w_{o}^{m+j}}{\sum _{i=1}^{\ell }\sum _{j=1}^{\gamma }w_{o}^{m - i}w_{o}^{m+j}} \tag{1}
\end{align*}
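The weighted sum in (1) can be sketched as follows, assuming the pair weights are normalized so that they sum to one. The quantities Δ^{(i,j)} are treated here as precomputed inputs supplied by the caller, since their construction is not reproduced in this sketch; the function and argument names are illustrative.

```python
def lpnwi_fill(left_weights, right_weights, delta):
    """Weighted combination of candidate values, in the spirit of Eq. (1).

    left_weights[i-1]  plays the role of w_o^{m-i}, i = 1..l
    right_weights[j-1] plays the role of w_o^{m+j}, j = 1..gamma
    delta[i-1][j-1]    plays the role of Delta^{(i,j)} (assumed precomputed)
    """
    # Unnormalized weight of each (i, j) pair is the product of the two
    # original weights.
    pair = [[wl * wr for wr in right_weights] for wl in left_weights]
    total = sum(sum(row) for row in pair)
    if total == 0:
        raise ValueError("all pair weights are zero")
    # Normalize the pair weights and accumulate the weighted sum.
    return sum(
        pair[i][j] / total * delta[i][j]
        for i in range(len(left_weights))
        for j in range(len(right_weights))
    )


# One reliable and one unreliable left neighbor, one reliable right neighbor.
fill_value = lpnwi_fill([1.0, 0.2], [1.0], [[0.6], [0.3]])
```

In this toy case the low-weight neighbor contributes little, so the result (0.55) stays close to the candidate built from the two reliable points (0.6).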
B. SC-DT for Residual Noise Removing
There could be errors in Landsat QA, potentially resulting in some contaminated NDVI points in FQA not being flagged, and there might be residual noise in
Flowchart of the designed SC-DT method for identifying residual noise. “1th–
We first compute the set of all slopes
\begin{equation*}
\Psi _{k}=\text{sort}(|k|)\lbrace 2\times \upsilon \rbrace \tag{2}
\end{equation*}
Subsequently, we continue to use LPNWI (as described in Section III-A) to fill the detected residual noise. The only difference is that we adjust the weight of each point, as expressed in the following:
\begin{equation*}
w_{o^{\prime }}^{m} = {\begin{cases}
1,& \text{FQA}=1\ \text{and} \ m \notin \mathrm{A}\\
0.5,& \text{FQA}=1\ \text{and} \ m \in \mathrm{A}\\
0.2,& \text{FQA}=0 \end{cases}} \tag{3}
\end{equation*}
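The piecewise weights in (3) translate directly into a small lookup. The set A is passed in as a plain collection of time indices; this sketch assumes it holds the points that SC-DT flagged, which is an interpretation and not a definition taken from the text above.

```python
def adjusted_weight(fqa, m, flagged):
    """Weight of point m after SC-DT, following the three cases of Eq. (3).

    fqa     : 1 for a reliable point, 0 for a contaminated one
    m       : time index of the point
    flagged : set of indices standing in for the set A (assumption)
    """
    if fqa == 0:
        return 0.2
    return 0.5 if m in flagged else 1.0


flagged = {3}
weights = [adjusted_weight(f, m, flagged) for m, f in enumerate([1, 0, 1, 1])]
```

For the four points above, the weights are 1.0, 0.2, 1.0, and 0.5, so a flagged but nominally good point still contributes more than a point that FQA marks as contaminated.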
Fig. 5 visualizes five scenarios in the noise detection process using SC-DT. It is evident that identifying an instance as noise requires stringent criteria. Consequently, the slope threshold
C. MDC-WSR for TS Calibrating
The previous steps perform gap-filling and denoising. However, relying solely on temporal information may produce outliers, particularly across continuous gaps, where too little temporal information is available to restore the details of the vegetation growth curve. Adding spatial information can further correct the TS and ensure spatial continuity.
In this step, we construct MDC-WSR, which maximizes the utilization of NDVI values to calibrate the filled and denoised
\begin{equation*}
\begin{aligned} R_{w} &=\frac{\sum \nolimits _{i=1}^{n} D_{n}^{i}D_{d}^{i}w_{n}^{i} w_{d}^{i}}{\sqrt{\sum \nolimits _{i=1}^{n} (D_{n}^{i} w_{n}^{i} w_{d}^{i})^{2}\sum \nolimits _{i=1}^{n} (D_{d}^{i} w_{n}^{i} w_{d}^{i})^{2}}}\\
D_{n}^{i} &=\mathrm{T}_{n}^{i}- M_{n}\\
D_{d}^{i} &=\mathrm{T}_{d}^{i}- M_{d} \end{aligned} \tag{4}
\end{equation*}
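A minimal sketch of (4) is given below. It assumes that M_n and M_d are the plain (unweighted) means of the neighborhood and target series, which is not stated explicitly here, and all names are illustrative. With all weights equal to 1 the expression reduces to the ordinary Pearson correlation coefficient.

```python
import math


def weighted_correlation(t_n, t_d, w_n, w_d):
    """Weighted correlation R_w between a neighbor TS and the target TS.

    t_n, t_d : NDVI series of the neighborhood pixel and the target pixel
    w_n, w_d : per-date weights of the two series
    M_n and M_d are taken as plain means (assumption).
    """
    m_n = sum(t_n) / len(t_n)
    m_d = sum(t_d) / len(t_d)
    num = 0.0
    sum_n = 0.0
    sum_d = 0.0
    for tn, td, wn, wd in zip(t_n, t_d, w_n, w_d):
        w = wn * wd
        d_n = tn - m_n
        d_d = td - m_d
        num += d_n * d_d * w          # numerator term of Eq. (4)
        sum_n += (d_n * w) ** 2       # first sum under the square root
        sum_d += (d_d * w) ** 2       # second sum under the square root
    den = math.sqrt(sum_n * sum_d)
    return num / den if den > 0 else 0.0


ones = [1.0, 1.0, 1.0, 1.0]
r_same = weighted_correlation([0.1, 0.3, 0.6, 0.2], [0.2, 0.6, 1.2, 0.4], ones, ones)
r_flip = weighted_correlation([0.1, 0.3, 0.6, 0.2], [0.6, 0.4, 0.1, 0.5], ones, ones)
```

Here `r_same` is 1 because the second series is a scaled copy of the first, and `r_flip` is -1 because the second series is its mirror image.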
On each date, the contaminated points in the TS of all similar pixels are removed. Based on the
\begin{align*}
\mathrm{T}_{s^{\prime }} = &(\mathrm{T}_{s+}-b)\times a+b\\
\mathrm{T}_{s+} = &\mathrm{T}_{s} + \text{median}_{i\in [1,\upsilon ]} \lbrace \lambda _{d}^{i}-\lambda _{s}^{i}\rbrace \\
b = &\text{median}_{i\in [1,\upsilon ]} \lbrace \lambda _{s+}^{i}\rbrace \\
a = &\text{median}_{i\in [1,\upsilon ]}\lbrace (\varepsilon _{d}^{i}- b)/(\varepsilon _{s+}^{i}-b)\rbrace \tag{5}
\end{align*}
Employing a certain percentage of data is effective in resisting the influence of outliers and enhancing data robustness [29], [43]. It is noteworthy that the 20th percentile rather than the minimum is utilized to mitigate the potential negative bias noise in the TS.
The preceding gap-filling and denoising operations have eliminated a significant portion of the noise in the TS. However, the NDVI values calculated based on LPNWI may not be entirely accurate. It may be due to the NDVI values between the adjacent LP points being too high, resulting in a high calculated fill value and a positive bias noise. In addition, SC-DT might overlook some subtle negative bias noise. Hence, we introduce spatial information to correct two categories of noise in
Single point abnormal negative bias noise: The difference in NDVI value between $\mathrm{T}_{s^{\prime }}$ and $\mathrm{T}_{d}$ at each point, $D^{m}$, is calculated. When $D^{m}>\Psi _{n}$, $\mathrm{T}_{d}^{m}$ is considered as noise, and $\mathrm{T}_{s^{\prime }}^{m}$ is used for calibration. The calculation of $\Psi _{n}$ is provided in the following:
\begin{equation*}
\Psi _{n}=M_{D}+R_{n}\times \text{std}_{D} \tag{6}
\end{equation*}
where $M_{D}$ and $\text{std}_{D}$ refer to the mean and standard deviation of all $D$ after taking the absolute value, respectively. The determination process of parameter $R_{n}$ is shown in Section IV-A.
Continuous positive bias noise: At the end of each year, a symmetrical time window with a HW length of 6 is created to check the low state of vegetation NDVI at the year transition. A total of $\upsilon -1$ time windows are obtained, and the set of all window maxima is represented by $\mathrm{B} \in \mathbb {R}^{1 \times (\upsilon -1)}$. Here, $\mathrm{B}^{f}$ denotes the $f$th maximum value. If $\mathrm{B}^{f}>\Psi _{p}$, then all NDVI values that have changed within this window are substituted with the corresponding values from $\mathrm{T}_{s^{\prime }}$. Refer to the following for the calculation of $\Psi _{p}$:
\begin{equation*}
\Psi _{p}=\text{median}_{f \in [1,\upsilon -1]}\lbrace \mathrm{B}^{f}\rbrace +R_{p}\times \text{std}_{B} \tag{7}
\end{equation*}
where $\text{std}_{B}$ is the standard deviation of $\mathrm{B}$. For details on determining the parameter $R_{p}$, please refer to Section IV-A. The TS after correction for single-point negative bias noise and continuous positive bias noise is denoted as $\mathrm{T}_{c}$.
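The single-point test built on (6) can be sketched as follows. Two choices here are assumptions and not taken from the text: the difference is computed as reference minus target (so that a negatively biased target gives a positive difference), and the spread term is the population standard deviation of the absolute differences. Function and variable names are illustrative.

```python
import statistics


def negative_bias_points(t_d, t_ref, r_n):
    """Flag points whose gap to the spatial reference exceeds Psi_n (Eq. (6)).

    t_d   : target TS after gap filling and denoising
    t_ref : calibrated spatial reference TS (same length)
    r_n   : threshold coefficient R_n
    Returns (list of flagged indices, threshold Psi_n).
    """
    # Assumed sign convention: reference minus target.
    diff = [ref - val for val, ref in zip(t_d, t_ref)]
    abs_diff = [abs(d) for d in diff]
    m_d = statistics.mean(abs_diff)
    std_d = statistics.pstdev(abs_diff)   # population std (assumption)
    psi_n = m_d + r_n * std_d
    flagged = [m for m, d in enumerate(diff) if d > psi_n]
    return flagged, psi_n


t_d = [0.5, 0.5, 0.1, 0.5]
t_ref = [0.5, 0.5, 0.5, 0.5]
flagged, psi_n = negative_bias_points(t_d, t_ref, r_n=1.0)
```

Only the third point, which sits 0.4 below the reference, exceeds the threshold of about 0.27 and would be replaced by the reference value.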
D. SG Filter for TS Smoothing
In the final step, we apply the SG filter, as defined in the following, to smooth the
\begin{equation*}
\mathrm{T}_{f}^{m}=\left(\sum \limits _{i=-s}^{s} C_{i}\mathrm{T}_{c}^{m+i}\right)/(2s+1) \tag{8}
\end{equation*}
According to (8), the SG filter requires two parameters to be selected manually. The first is the radius of the sliding window; a larger radius yields a smoother reconstructed NDVI TS. The second is the order of the polynomial (typically 2–4); lower orders yield a smoother reconstructed NDVI TS. In this study, the SG filter is configured with a window size of 5 and a polynomial order of 2.
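For illustration, a 5-point quadratic SG smoother can be written with the standard convolution coefficients (-3, 12, 17, 12, -3)/35. This sketch assumes that "window size of 5" means five points in total (s = 2) and folds the normalization into the constant 35; how the coefficients C_i are scaled relative to the divisor in (8) is not specified here. Leaving the two points at each end unchanged is also an assumption.

```python
SG_COEFFS = (-3, 12, 17, 12, -3)   # standard 5-point quadratic SG coefficients
SG_NORM = 35                        # their sum, used for normalization


def sg_smooth(series):
    """Smooth a series with a 5-point, second-order Savitzky-Golay filter.

    The first and last two points are copied unchanged (edge handling is
    an assumption of this sketch).
    """
    half = len(SG_COEFFS) // 2
    out = list(series)
    for m in range(half, len(series) - half):
        window = series[m - half:m + half + 1]
        out[m] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return out


quadratic = [float(x * x) for x in range(7)]
smoothed = sg_smooth(quadratic)
spiky = sg_smooth([0.4, 0.4, 0.9, 0.4, 0.4])
```

A second-order filter reproduces a quadratic series exactly, while an isolated spike of 0.9 among values of 0.4 is pulled down to about 0.64.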
The specific implementation process of LPSG is shown in Algorithm 1.
E. Quantitative Evaluation Indices
The root-mean-square error (RMSE) is used to evaluate the performance of the different methods. It is defined as
\begin{equation*}
\text{RMSE} = \sqrt{\frac{1}{n} \sum \nolimits _{i=1}^{n} \left(\mathrm{T}_{\text{predict}}^{i} - \mathrm{T}_{\text{true}}^{i}\right)^{2}} \tag{9}
\end{equation*}
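As a quick worked example, the RMSE in its usual form (the square root of the mean squared error) can be computed as below; the function name and the sample values are illustrative only.

```python
import math


def rmse(predicted, truth):
    """Root-mean-square error between two equally long NDVI series."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, truth))
    return math.sqrt(squared / len(truth))


# Errors of +0.1 and -0.2 give sqrt((0.01 + 0.04) / 2) = sqrt(0.025).
error = rmse([0.5, 0.7], [0.4, 0.9])
```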
Experimental Results
In the experiment, we select two regions (500 × 500 pixels) and eight points of different land-use types within the QTP in China (see Fig. 1) to make the experiment more convincing. Four classical time filtering methods are selected for comparison with LPSG, i.e., the SG filter, the HANTS method, the Whittaker filter, and the Fourier algorithm. It is worth noting that before the time filtering, the time domain linear interpolation operation is first performed on the contaminated points (FQA = 0). The main difference between LPSG and the four comparison methods is that LPSG uses the newly proposed LPNWI to fill gaps and does not rely entirely on FQA to determine contaminated points. In contrast, the comparison methods use a linear interpolation method to fill in contaminated points that are entirely determined by FQA. Additionally, LPSG considers spatial information and inter-annual variations, which allows it to preserve the spatial correlation and temporal periodicity of TS data.
A. Parameter Sensitivity Analysis
The LPSG method integrates the TS information of similar pixels in the neighborhood of the target pixel through MDC-WSR, which involves two key steps. The first step retrieves similar pixels to generate the WSR, so the neighborhood size and the correlation coefficient threshold must be determined. To establish the optimal values for these parameters, we conduct experiments using 400 random points in Region A (15 km × 15 km), varying the HW and the correlation threshold under different proportions of random gaps. As shown in Fig. 8, when the proportion of random gaps ranges from 10% to 60%, the optimal HW is 5; when it increases to 70%–80%, the optimal HW becomes 10. Additionally, RMSE gradually increases once the correlation coefficient threshold reaches or exceeds 0.76. In summary, HW is set to 5 when the random gap rate of the target pixel is less than or equal to 65% and to 10 when the rate exceeds 65%, with a correlation coefficient threshold of 0.76.
Average RMSE of LPSG is calculated by changing only the HW size and the correlation coefficient threshold in the scenarios where the random gap changes from 10% to 80% with an interval of 10%. The simulation experiment is carried out on 400 random pixels of Region A.
The second step of MDC-WSR, involving determining the position to be calibrated from multiple dimensions and replacing it with the corresponding reference value, requires defining two parameters: the positive bias correction threshold
\begin{align*}
R_{p} &= {\begin{cases}-0.8, & R_{\text{gap}}< 35\%\ \text{or}\ R_{\text{gap}}>80\% \\
0, & 35\% \leqslant R_{\text{gap}} \leqslant 45\% \\
0.4, & 45\% < R_{\text{gap}} \leqslant 80\% \end{cases}} \tag{10}\\
R_{n} &= {\begin{cases}2\times (3-10\times R_{\text{gap}}), & R_{\text{gap}}< 35\% \\
-1, & R_{\text{gap}} \geqslant 35\%. \end{cases}} \tag{11}
\end{align*}
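The parameter rules of this subsection can be collected into one lookup, sketched below. The gap rate is expressed as a fraction in [0, 1], the function and key names are illustrative, and the cases of (10) are evaluated in the order listed, so a gap rate of exactly 45% falls into the second case.

```python
def mdc_wsr_parameters(gap_rate):
    """Select HW, correlation threshold, R_p, and R_n from the gap rate.

    gap_rate is a fraction between 0 and 1 (e.g., 0.4 for 40%).
    """
    # Neighborhood half-window and correlation threshold.
    half_window = 5 if gap_rate <= 0.65 else 10
    corr_threshold = 0.76

    # Positive bias coefficient, Eq. (10).
    if gap_rate < 0.35 or gap_rate > 0.80:
        r_p = -0.8
    elif gap_rate <= 0.45:
        r_p = 0.0
    else:
        r_p = 0.4

    # Negative bias coefficient, Eq. (11).
    r_n = 2 * (3 - 10 * gap_rate) if gap_rate < 0.35 else -1.0

    return {
        "half_window": half_window,
        "corr_threshold": corr_threshold,
        "r_p": r_p,
        "r_n": r_n,
    }


low_gap = mdc_wsr_parameters(0.2)
mid_gap = mdc_wsr_parameters(0.4)
high_gap = mdc_wsr_parameters(0.7)
```

Note that the first branch of (11) evaluates to -1 at a gap rate of 35%, so R_n is continuous across the two cases.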
Average RMSE of LPSG is calculated by only changing the positive bias correction threshold
B. Visual Evaluation of Reconstruction Results
1) Temporal Analysis
According to the GLC_FCS30 dataset [42], representative points of 8 different vegetation cover types are selected from southwest to northeast of the QTP, including eastern wetland (point A), herbaceous cover (point B), deciduous broadleaved forest (point C), grassland (point D), central wetland (point E), sparse vegetation (point F), western wetland (point G), and shrubland (point H). The results of different filtering methods are processed by 3-D curve expansion to visually compare the reconstruction effect. The general characteristics of NDVI TS for eight points with different vegetation cover types are shown in Table II.
Figs. 10 and 11 show the overall and local curves of the NDVI TS from 2013 to 2022 reconstructed by LPSG and the other four methods. Generally, LPSG retains the high NDVI values and restores sufficient detail when dealing with continuous noise and FQA errors. Specifically, the advantages of LPSG over the other methods are mainly reflected in four aspects.
LPSG better handles the false noise at the head and tail part of the TS, which may be significantly lower than the valley values of the same period in other years [e.g., the head and tail part of Figs. 10(a), (d), and 11(b)]. HANTS and Fourier methods underestimate the true NDVI value, while Whittaker and SG methods retain this false valley value. In contrast, LPSG efficiently identifies false valleys in the head and tail parts, and utilizes interannual information to accurately restore the values.
LPSG effectively avoids the influence of FQA errors. For some narrow and deep pseudovalleys in wetland and deciduous broadleaved forest [e.g., the years 2014, 2015, and 2020 in Fig. 11(a), the year 2015 in Fig. 10(b), the year 2014 in Fig. 11(c)], although the other four methods can filter these valleys to a certain extent, there are still obvious local troughs. LPSG successfully identifies abnormal abrupt NDVI values by SC-DT, ensuring they return to the correct values.
Compared to other methods, LPSG can better fill the continuous gaps. For example, in the latter half of 2016, there are continuous gaps at point D, and only LPSG restores TS to a shape similar to that of other years [see Fig. 11(b)]. Similar situations also appear in the transition between 2014 and 2015 in the central wetland point E [see Fig. 11(c)], 2019 in the sparse vegetation point F [see Fig. 11(d)], the connection between 2019 and 2020, and the connection between 2021 and 2022 in the western wetland point G [see Fig. 10(c)].
LPSG preserves the NDVI peaks of the original TS as much as possible. Examples include point A in 2018 [see Fig. 11(a)], point E in 2016 [see Fig. 11(c)], and point F in 2016 [see Fig. 11(d)]. One possible reason why the other four methods underestimate the NDVI peaks is that all NDVI points are weighted equally and the fitting tries to place as many points as possible on the curve, which pulls the peaks down. Another possible reason is that the other methods treat a false valley value near a peak as a correct NDVI value; the TS then appears to rise abruptly within a short time, and the peak is judged to be noise.
Temporal reconstruction performance of different methods for some typical vegetation pixels. (a) Herbaceous cover (point B). (b) Deciduous broadleaved forest (point C). (c) Western wetland (point G). (d) Shrubland (point H).
Temporal reconstruction performance of different methods for some typical vegetation pixels (local zoom in). (a) Eastern wetland (point A). (b) Grassland (point D). (c) Central wetland (point E). (d) Sparse vegetation (point F).
2) Spatial Analysis
Fig. 12 compares the Landsat NDVI TS data reconstructed by different methods across the entire QTP on the 8th day of 2020. In early January, the QTP is in winter, and the NDVI values of most vegetation are low. However, due to factors such as cloud occlusion and satellite orbit, the original NDVI of vegetation in some areas is too low or even negative [especially in the striped dark area in Fig. 12(a)]. Although the Whittaker, Fourier, HANTS, and SG methods raise the low values of the original data to a certain extent, the low-value strip is still obvious and the results are spatially discontinuous. A large area of evergreen broadleaved forest is distributed in the southern part of the QTP; this type of vegetation remains active throughout the year, and its NDVI value in winter can generally be maintained at about 0.8. Among the other four methods, Fourier raises the original NDVI value the most, but only to about 0.4, whereas LPSG restores the true high value of evergreen broadleaved forest and shows spatial continuity and integrity. Fig. 13 illustrates the reconstruction results of different methods across the entire QTP on the 357th day of 2022. As shown in Fig. 13(b) and 13(e), the reconstruction results of the HANTS and Fourier methods exhibit noticeable striping noise and fail to preserve the spatial continuity of the image. Fig. 13(c) and 13(d) show that although the Whittaker and SG methods maintain spatial continuity relatively well, they fail to accurately recover the NDVI values for lake and river areas. The QTP is characterized by numerous lakes and extensive snow cover in its northern regions. Since water bodies and snow have lower reflectance in the near-infrared spectrum than in the red spectrum, their NDVI values should be negative.
The reconstruction results of the other four methods consistently overestimate the NDVI values in these areas, introducing high-value noise in regions where low NDVI values are expected. In contrast, the LPSG method effectively restores the NDVI values of water and snow-covered areas, presenting very clear low-value contours.
Spatial reconstruction performance of different methods (taking the 8th day of 2020 as an example). (a) Raw NDVI. (b) HANTS. (c) Whittaker. (d) SG. (e) Fourier. (f) LPSG (Ours).
Spatial reconstruction performance of different methods (taking the 357th day of 2022 as an example). (a) Raw NDVI. (b) HANTS. (c) Whittaker. (d) SG. (e) Fourier. (f) LPSG (Ours).
C. Index Evaluation of Reconstruction Results
To quantitatively evaluate the performance of LPSG, two regions located in the southwest (Region A) and middle (Region B) of the QTP are selected. Simulated data are reconstructed, and the average RMSE is calculated for each region. The simulated data are generated by adding noise to the true NDVI data based on the actual pixel contamination. Since the true NDVI value of each pixel is unavailable, the reference data serve as the ground truth.
As depicted in Fig. 14, the NDVI data from MODIS and Landsat are combined to synthesize NDVI values every 16 days. The final value of the reference data is determined by averaging when there are at least five uncontaminated good points. Otherwise, linear interpolation is employed to calculate the corresponding reference NDVI value. The reference NDVI TS data generated through this method are independent at each time point, facilitating a more accurate restoration of the actual NDVI curve changes.
Reference TS generated based on Landsat and MODIS data for evaluating reconstruction performance.
FQA is utilized in conjunction with the reference data to create simulation data for the experiment. Specifically, in locations where the NDVI point is contaminated (FQA = 0), a random negative noise of less than 40% is added to the reference NDVI value. If the NDVI value is of good quality (FQA = 1), it remains equal to the reference data. In practical scenarios, QA often encounters two situations: 1) marking good points as contaminated points and 2) marking contaminated points as good points. The former situation underestimates the number of available NDVI values with minimal impact, while the latter may mislead the TS with low details. Consequently, we evaluate the performance of LPSG in both cases of correct and incorrect FQA to demonstrate its robustness.
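The simulation rule can be sketched as follows. The sketch interprets "random negative noise of less than 40%" as lowering the reference value by a uniformly drawn fraction of its magnitude in [0, 0.4); the uniform distribution, the fixed seed, and all names are assumptions made for illustration.

```python
import random


def simulate_series(reference, fqa, max_drop=0.4, seed=0):
    """Create a simulated NDVI TS from reference values and FQA flags.

    Points with FQA = 1 keep the reference value; points with FQA = 0 are
    lowered by a random fraction (below max_drop) of their magnitude.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    simulated = []
    for value, flag in zip(reference, fqa):
        if flag == 1:
            simulated.append(value)
        else:
            drop = rng.random() * max_drop      # fraction in [0, max_drop)
            simulated.append(value - abs(value) * drop)
    return simulated


reference = [0.8, 0.6, 0.7]
fqa = [1, 0, 1]
simulated = simulate_series(reference, fqa)
```

Only the second point is perturbed, and its simulated value never exceeds the reference, which mimics the negative bias of contaminated observations.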
Reconstruction of raw data with correct FQA: This experiment assumes that the FQA is completely correct. Simulation data are generated from the reference data for Region A and Region B, following the contamination pattern of the original Landsat series data.
Reconstruction of raw data with incorrect FQA: The previous experiment simulated an ideal scenario, but reality may differ. In this experiment, we assume that 1% of the contaminated NDVI values in each pixel are labeled as good points (FQA = 1). Experiments are conducted separately in Regions A and B.
The results in Table III and Fig. 15 indicate the superior performance of LPSG, featuring the lowest RMSE and the highest proportion of green regions in the images. Conversely, the Fourier and SG methods demonstrate comparatively inferior performance, while the Whittaker and HANTS methods exhibit relatively higher levels of efficacy. In Region A, when FQA is entirely accurate, the RMSE of LPSG is 0.00018 lower than the second smallest (Fourier) and 0.00167 lower than the largest (HANTS). In the presence of FQA errors, the RMSE of LPSG is 0.00565 lower than the second smallest (Whittaker) and 0.0075 lower than the largest (SG). Shifting to Region B, under perfect FQA accuracy, the RMSE of LPSG is 0.00294–0.00487 lower than RMSE of other methods. In the presence of FQA errors, the RMSE of LPSG is 0.00445–0.00536 lower than the RMSE of other methods.
RMSE visualization maps under conditions of correct FQA and incorrect FQA (
The spatial visualization images of different methods in Regions A and B are depicted in Fig. 16. Fig. 16(a) and 16(b) present the reconstruction results of Region A on DOY 24 in 2013 and DOY 136 in 2016. This region is mainly characterized by grassland, cropland, and deciduous broadleaved forest and is heavily affected by cloud and snow contamination. While the other four methods are capable of rectifying the low values, notable negative bias noise remains in their results. Additionally, the Fourier, SG, and HANTS methods introduce banded noise. In contrast, LPSG outperforms the others in restoring high NDVI values and maintaining spatial continuity. Fig. 16(c) and 16(d) illustrate the reconstruction results of Region B, dominated by grassland and bare land, on DOY 73 in 2018 and DOY 40 in 2017. Although each method generally reconstructs NDVI values effectively, the Whittaker, Fourier, HANTS, and SG methods produce significant salt-and-pepper noise with poor spatial continuity, while LPSG better preserves spatial correlation.
Spatial reconstruction performance of different methods in Region A and Region B. (a) Region A with correct FQA on DOY 24 in 2013. (b) Region A with incorrect FQA on DOY 136 in 2016. (c) Region B with correct FQA on DOY 73 in 2018. (d) Region B with incorrect FQA on DOY 40 in 2017.
Discussions
A. Robust Analysis
Recent studies have highlighted the significance of filling missing values before fitting Landsat TS data [44]. The ability to fill missing values is a crucial criterion for assessing the performance of a reconstruction method. Therefore, it is essential to examine the stability of LPSG under different degrees of gaps. Because this study retains contaminated NDVI values when processing the original data and sets a value to 0 when it does not exist, noise and gaps can be considered equivalent. We randomly select 400 pixels in Region A and simulate two types of noise based on the reference data.
Case 1: Random noise: For each pixel in the reference data, random noise is added in the time domain to a proportion of the time points that increases from 10% to 80% in steps of 10%. The intensity of the noise is less than or equal to 40% of the reference value.
Case 2: Spatio-temporal continuous noise: A fixed 50 × 50-pixel rectangular patch, with a temporal length ranging from 4 to 24 in steps of 2, is placed randomly within Region A, and noise is added to the NDVI values covered by the patch.
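The two noise cases can be sketched as follows. This is an illustrative reading of the setup rather than the authors' implementation: the function names, the (time, rows, columns) array layout, the uniform noise distribution, and the choice to lower NDVI values (mimicking cloud contamination) are assumptions.

```python
import numpy as np

def add_random_noise(ref_ts, rate, max_rel=0.4, rng=None):
    """Case 1: perturb a fraction `rate` of the time points of one pixel."""
    rng = np.random.default_rng() if rng is None else rng
    ts = np.asarray(ref_ts, dtype=float).copy()
    n_noisy = int(round(rate * ts.size))
    idx = rng.choice(ts.size, size=n_noisy, replace=False)
    # Assumption: noise lowers NDVI by at most max_rel of the reference value.
    ts[idx] -= rng.uniform(0.0, max_rel, size=n_noisy) * np.abs(ts[idx])
    return ts, idx

def add_patch_noise(ref_cube, length, patch=50, max_rel=0.4, rng=None):
    """Case 2: perturb a patch x patch window for `length` consecutive steps."""
    rng = np.random.default_rng() if rng is None else rng
    cube = np.asarray(ref_cube, dtype=float).copy()
    n_t, n_rows, n_cols = cube.shape
    # Random origin of the spatio-temporal block (upper bounds are exclusive).
    t0 = rng.integers(0, n_t - length + 1)
    r0 = rng.integers(0, n_rows - patch + 1)
    c0 = rng.integers(0, n_cols - patch + 1)
    block = cube[t0:t0 + length, r0:r0 + patch, c0:c0 + patch]  # view into cube
    block -= rng.uniform(0.0, max_rel, size=block.shape) * np.abs(block)
    return cube, (t0, r0, c0)

def rmse(a, b):
    """Root-mean-square error between two arrays of equal shape."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Each noisy series would then be reconstructed by the method under test and scored with `rmse` against the reference, once per gap rate or gap length.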
Fig. 17(a) illustrates the average RMSE of various methods under different proportions of random gaps. By comparing the changes in RMSE slopes, it is evident that LPSG exhibits superior reconstruction capabilities. Specifically, when the gap rate is below 20%, SG and LPSG demonstrate better performance. As the gap rate increases, the RMSE values of all methods also increase; however, the other four methods experience exponential growth, while LPSG exhibits linear growth with a small slope, especially when the gap rate is below 60%. Fig. 17(b) depicts the RMSE variation curve in the case of continuous spatio-temporal gaps. The RMSE curves of all five methods rise initially, decline, and then rise again before stabilizing. The RMSE values of Fourier, Whittaker, and HANTS are higher and ultimately tend to converge. Additionally, although the SG method performs well when the gap length is less than or equal to 6, its RMSE increases sharply as the gap length grows, surpassing 0.15. LPSG consistently maintains the lowest RMSE, and its change is the most gradual, with the RMSE remaining below 0.011.
Evolution of average RMSE obtained by different methods as a function of (a) gap rate and (b) gap length.
B. Adaptability With Hybrid Land-Use
The LPSG is designed for reconstructing multiyear long TS. Therefore, a crucial consideration is whether the performance of the reconstruction method remains robust in the presence of land-use changes. In this experiment, we simulate two types of land-use changes based on the original data from Region A and Region B (see Fig. 18).
Scenario 1: Time dimension splicing. The first 5-year data (2013–2017) of Region A and the last 5-year data (2018–2022) of Region B are spliced to simulate land-use changes across all pixels.
Scenario 2: Spatial dimension splicing. The odd-columns data of Region A for the last five years are replaced with data from Region B at the corresponding positions, simulating land-use changes.
Synthesis scenarios for 10-year NDVI TS data. In scenario 1, the first 5 years of Region A and the last 5 years of Region B are synthesized. In scenario 2, the odd-columns data represent land-use changes, while even-columns retain the original land-use of Region A.
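The two splicing scenarios amount to simple array operations. The sketch below assumes that both regions are stored as arrays of identical shape (time, rows, columns) on a common time axis, and that columns are counted from 1, so that "odd columns" correspond to zero-based indices 0, 2, 4, and so on; the function names are hypothetical.

```python
import numpy as np

def splice_time(cube_a, cube_b, n_first):
    """Scenario 1: first n_first time steps from A, the remainder from B."""
    return np.concatenate([cube_a[:n_first], cube_b[n_first:]], axis=0)

def splice_space(cube_a, cube_b, n_first):
    """Scenario 2: after n_first time steps, odd columns are taken from B."""
    out = np.array(cube_a, dtype=float, copy=True)
    # Assumption: columns are counted from 1, so odd columns are 0, 2, 4, ...
    out[n_first:, :, 0::2] = cube_b[n_first:, :, 0::2]
    return out
```

Here `n_first` would be the number of time steps in the first five years (2013–2017), so that the land-use change occurs at the 2017/2018 boundary in both scenarios.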
We apply LPSG to the synthetic data and compare the reconstructed TS with the spliced reference TS. Fig. 19 primarily illustrates the TS of land-use change from grassland to herbaceous vegetation. The reference aligns well with the TS reconstructed by LPSG, demonstrating that LPSG is not sensitive to land-use changes, even when multiyear periodic data are used. The explanation is that, during the retrieval of information in the neighborhood of the target pixel, only pixels similar to the target pixel contribute to the calculation of WSR. In an extreme scenario, there might be no pixels similar to the target pixel in the surrounding area, and in such cases, LPSG does not utilize spatial information. Previous studies have indicated that situations like these can be addressed by identifying land-use changes through continuous change detection and classification methods [45].
Temporal reconstruction performance of LPSG under hybrid land-use. “Reference (A+B)” represents the concatenated reference data from Regions A and B. (a) Scenario 1. (b) Scenario 2 odd-columns. (c) Scenario 2 even-columns.
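The idea that only similar neighbors contribute to the spatial reference, and that no spatial information is used when none is similar, can be illustrated with a simplified sketch. This is not the MDC-WSR procedure of Section III-C: the plain Pearson correlation, the threshold `min_corr = 0.8`, the correlation-proportional weights, and the function name are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_spatial_reference(target, neighbors, min_corr=0.8):
    """Correlation-weighted mean of the neighbors that resemble the target.

    target: 1-D array (n_time); neighbors: 2-D array (n_neighbors, n_time).
    Returns None when no neighbor passes the (hypothetical) threshold,
    i.e., when no spatial information should be used.
    """
    target = np.asarray(target, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    corr = np.array([np.corrcoef(target, nb)[0, 1] for nb in neighbors])
    keep = corr >= min_corr  # dissimilar neighbors are excluded
    if not keep.any():
        return None
    weights = corr[keep] / corr[keep].sum()
    return weights @ neighbors[keep]
```

Under this kind of selection, pixels whose land use differs from that of the target are excluded automatically, which is consistent with the insensitivity to land-use changes observed in Fig. 19.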
C. Validity Analysis of Improved Correlation Coefficient and Spatial Reference
To demonstrate the effectiveness of the improved correlation coefficient calculation method and spatial reference generation method in MDC-WSR (Section III-C), a random pixel is selected to visualize the correlation coefficient and spatial reference calculation processes using both methods. Fig. 20(a) illustrates the relationship between a TS within the neighborhood of the target pixel and the target TS. In the traditional method, a spatial reference for each TS is generated prior to calculating the correlation coefficient by averaging the NDVI values for the same day across different years. As a result, the traditional reference spans only one year. Visually, the two traditional references appear uncorrelated. However, due to their short length, the computed correlation coefficient is an unreasonably high 0.9983. In contrast, the improved method yields a correlation coefficient of 0.3808, more accurately reflecting the relationship between the TS. Fig. 20(b) compares the spatial reference results from both methods. While the traditional spatial reference is smoother and exhibits clear periodic characteristics, its shape significantly diverges from the target TS and fails to capture interannual variations. Conversely, the improved spatial reference closely aligns with the target TS and effectively represents interannual changes. In conclusion, the improved correlation coefficient calculation method and spatial reference generation method in MDC-WSR are effective.
Comparison of traditional and improved correlation coefficient and spatial reference. (a) Visualization of a TS in the target pixel neighborhood and the target TS. (b) Spatial reference results from both methods.
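A small synthetic example shows one way in which averaging across years can inflate the correlation coefficient. The data below are invented for illustration (10 years with 23 observations per year, a shared seasonal cycle, and opposite year-to-year offsets), and the full-length Pearson correlation is only a stand-in for comparison; the improved coefficient itself is the one defined in Section III-C.

```python
import numpy as np

def yearly_mean_profile(ts, n_years):
    """Traditional reference: average the same position across years.

    Assumes the series length is an exact multiple of n_years.
    """
    ts = np.asarray(ts, dtype=float)
    return ts.reshape(n_years, -1).mean(axis=0)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic data (hypothetical values, not taken from the study).
n_years, n_per_year = 10, 23
season = 0.3 + 0.2 * np.sin(2 * np.pi * np.arange(n_per_year) / n_per_year)
offsets_target = np.linspace(-0.2, 0.2, n_years)   # interannual variation
offsets_neighbor = offsets_target[::-1]            # opposite interannual pattern
target = (season[None, :] + offsets_target[:, None]).ravel()
neighbor = (season[None, :] + offsets_neighbor[:, None]).ravel()

# Correlation of the one-year averaged profiles vs. the full-length series.
r_avg = pearson(yearly_mean_profile(target, n_years),
                yearly_mean_profile(neighbor, n_years))
r_full = pearson(target, neighbor)
print(f"averaged profiles: {r_avg:.4f}, full-length series: {r_full:.4f}")
```

Because averaging removes the interannual offsets, the two one-year profiles become nearly identical, whereas the full-length series retain the disagreement between years.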
Conclusion
Reconstructing Landsat NDVI TS data is crucial for meeting the demands of ecologically sensitive applications with medium to high spatial resolution. In this study, we develop LPSG, which does not rely on additional data and maximizes the utilization of all original data, including contaminated NDVI values. Additionally, we enhance the way in which spatial information is used in existing methods by generating a WSR of equal length to the original TS, considering both periodic and interannual variations.
The experimental results within the QTP region of China highlight the advantages of LPSG over four established methods (Whittaker, Fourier, HANTS, SG).
In both temporal and spatial dimensions, LPSG demonstrates a good ability to mitigate the impact of FQA errors, preserve peaks and local details of the TS, and maintain spatial continuity.
In terms of index evaluation, LPSG achieves a notable reduction in average RMSE compared to other methods, with decreases ranging from 0.00018 to 0.00750 in Region A and 0.00294 to 0.00536 in Region B under correct and incorrect FQA.
LPSG exhibits good robustness under different gap conditions and effectively restores the TS when the land-use changes in the study area.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers for their valuable feedback, which has contributed to improving the quality of the paper. Additionally, the authors thank the Google Earth Engine team for their outstanding work and services.