Improving the Accuracy of Fractional Evergreen Forest Cover Estimation at Subpixel Scale in Cloudy and Rainy Areas by Harmonizing Landsat-8 and Sentinel-2 Time-Series Data

Evergreen forest provides essential ecosystem services such as maintaining the balance of the carbon and oxygen cycles and purifying the air. However, under cloudy and rainy weather conditions, it is difficult to obtain optical remote sensing images with both high spatial resolution and complete time series. In addition, surfaces underlying the forest canopy can be complex and fragmented. To address these problems, we developed a new approach (NDVI_CV-LS) for mapping urban evergreen forest at the subpixel scale. To capture the growth characteristics of evergreen forest more accurately, we harmonized Landsat-8 and Sentinel-2 images with cloud cover less than 10% acquired within 1 year to densify the time-series dataset. Because the time-series fluctuation of evergreen forest is relatively stable, the coefficient of variation (CV) of the normalized difference vegetation index (NDVI) was used to distinguish evergreen forest from other vegetation. Meanwhile, the annual minimum NDVI (NDVI_ann-min) was used as the parameter in a dimidiate pixel model for estimating fractional evergreen forest cover (FVC_ever). Hefei, a cloudy and rainy subtropical city in China, was selected as a case study to evaluate the validity of the model. The verification results revealed that harmonizing Landsat-8 and Sentinel-2 time-series images to extract evergreen forest improved the overall accuracy by 8% compared with using Landsat-8 images alone, indicating that the NDVI_CV-LS model can improve the accuracy of FVC_ever estimation, especially for areas with complex underlying surfaces under cloudy and rainy conditions.


I. INTRODUCTION
Variation in forest types shapes the structure of the forest landscape [1], which largely determines forest ecosystem services and functions. A good understanding of the spatial distribution of different forest types contributes to better forest ecosystem and land space management [2], which then facilitates the protection of wildlife habitats and biodiversity [3], [4]. Evergreen forest is an important part of the subtropical urban ecosystem. Efforts to protect the regional environment and promote sustainable development [5] have to take these forests into account. Evergreen trees contribute to air purification, cooling, humidification, water and soil conservation, and other ecological services [6]-[8]. They play an irreplaceable role in maintaining the carbon balance of urban ecosystems and protecting the urban ecological environment [9]. Compared with deciduous trees, evergreen trees have a larger leaf mass per area and longer leaf life; thus, they have a higher capacity for carbon sequestration and dust retention [10], [11]. Moreover, in evergreen forest, the relative humidity is approximately 7% higher than that on nonforested land in winter [12]. Evergreen forest also effectively mitigates urban heat-island effects [13]. Evergreen trees have high landscape aesthetic value because they fill the landscape gaps left by deciduous plants in winter. Therefore, it is vital to have a comprehensive understanding of evergreen forest distribution.
Remote sensing can acquire real-time, large-scale data; thus, it is commonly employed to obtain basic maps such as maps of land-cover types at regional and global scales [14]. Time-series remote sensing images from satellites are important data sources for monitoring vegetation characteristics. Based on phenological analyses, vegetation characteristics are extracted [15]-[18].
For example, the growth characteristics of evergreen forest remain almost unchanged throughout the year; therefore, evergreen forest has been extracted over large areas using time-series indices such as the enhanced vegetation index, land surface water index, and normalized difference vegetation index (NDVI) [19]-[21]. Previous studies mostly used time-series data from the moderate resolution imaging spectroradiometer and advanced very high resolution radiometer sensors [22]. These sensors provide the temporal resolution needed for observing growth dynamics. However, their low spatial resolution introduces substantial errors by mixing multiple objects with different spectral curves and phenological characteristics in one pixel [23], especially in spatially heterogeneous areas with fragmented landscapes [24], [25]. In recent years, satellites with moderate to high spatial resolution (e.g., Landsat and Sentinel-2) have developed rapidly and can capture detailed surface features [26]. Medium-resolution satellites can reflect the growth status of vegetation throughout the growing period if the time series is complete enough. However, evergreen species are mostly distributed in tropical and subtropical regions where cloudy and rainy weather conditions occur frequently. It is hard to acquire images with high spatial resolution and complete time series under such conditions, making it difficult to conduct broad-scale, high-precision dynamic monitoring. Most satellite sensors cannot provide high temporal resolution and high spatial resolution at the same time [27]. Vegetation extraction from urban areas with complex underlying surfaces requires high spatial resolution images to reduce the influence of mixed pixels, yet the temporal resolution of medium- to high-resolution images cannot meet the requirement of constructing continuous time series under cloudy and rainy weather conditions.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Multisource image fusion is one of the most effective solutions to this problem [28]-[30]. The method not only helps to resolve the contradiction between high temporal resolution and high spatial resolution in imagery datasets [31], but also promotes the utilization of current earth observation data [32]. Fusion of datasets with similar characteristics, such as Landsat-8 operational land imager (OLI) and Sentinel-2 multispectral instrument (MSI) images, provides the potential to observe landscape dynamics at a spatial resolution of 30 m. For example, harmonized Landsat-8 OLI and Sentinel-2 MSI datasets have been used to assess winter wheat yields at regional scale [33], estimate the number and timing of mowing events in Central European grasslands [34], and detect irrigated areas in Southern Italy [35]. Despite the growing attention paid to harmonizing Landsat-8 and Sentinel-2 images in the past few years, few studies have used them for evergreen forest mapping in subtropical urban areas.
The underlying surface in subtropical urban areas is complex and fragmented. Mixed pixels constitute a serious problem. To improve the accuracy of evergreen forest extraction, it is necessary to decompose the mixed pixels. So far, a variety of pixel unmixing models have been developed, including linear, probabilistic, geometric-optical, stochastic, and fuzzy models [36]. The dimidiate pixel model, a linear mixed model, is widely used to estimate fractional vegetation cover (FVC) at the subpixel scale [37]- [39].
In this article, we developed a novel approach (NDVI_CV-LS) for improving the accuracy of evergreen forest extraction in cloudy and rainy areas at subpixel scale. Landsat-8 and Sentinel-2 data were harmonized to construct a time-series dataset that captures the growth characteristics of evergreen forest more accurately. In particular, the time-series fluctuation of evergreen forest is relatively stable compared with that of other vegetation types (i.e., deciduous forests and farmland). We used the coefficient of variation (CV) of NDVI to separate evergreen forest from continuous crop and deciduous forest, which were heavily intermixed with low-coverage evergreen forest. Then, the annual minimum NDVI (NDVI_ann-min) was used as a parameter in the dimidiate pixel model to estimate fractional evergreen forest cover (FVC_ever). Given the spatial heterogeneity, the study area was divided into areas with and without an impervious surface by the normalized difference built-up index (NDBI). These two regions were assigned different parameters to extract evergreen forest at subpixel scale. A typical subtropical city, Hefei, China, was selected as the study area to evaluate the validity of the model. The results indicated that our model enabled higher accuracy of FVC_ever estimation, making it possible to conduct evergreen forest monitoring in subtropical urban areas with heavy cloud cover and rainfall.

A. Study Area
The study area is located in Hefei City, Anhui Province, China, which has a central latitude and longitude of 31.82°N and 117.23°E (Fig. 1). Hefei covers an area of 11 445 km². The terrain in this area is dominated by hills with an altitude of approximately 15-80 m, and most of the slopes in the study area are less than 2°. This area has a typical subtropical humid monsoon climate with four distinct seasons, long frost-free periods, and abundant rain. The average annual temperature is approximately 15.7°C, the average annual relative humidity is about 77%, and the annual precipitation is approximately 1000 mm. Climatic conditions are suitable for growing evergreen trees. Common native evergreen broad-leaved trees include magnolia, camphor, and holly, whereas evergreen coniferous trees include Masson pine and Cryptomeria fortunei, and the main evergreen shrub is osmanthus. However, the frequent occurrence of clouds and rain makes it difficult to acquire remote-sensing images with high spatial resolution and complete time series. In addition, owing to the high degree of urbanization in Hefei, the underlying surface is complex and fragmented. The study area is therefore representative of cities facing similar conditions.

B. Data Sources and Preprocessing
Images used in this article were obtained from Landsat-8 OLI and Sentinel-2 MSI sensors on days with cloud cover less than 10%. The spatial resolution for Landsat-8 images is 30 m, whereas that for Sentinel-2 is 10 and 20 m (for bands 2, 3, 4, and 8A only; Table I). Sentinel-2 consists of two satellites, A and B. Each satellite has a revisit period of 10 days, and in combination the two satellites have a revisit period of 5 days, which is shorter than that of Landsat-8 (16 days). All data were downloaded from the United States Geological Survey Global Visualization Viewer website (http://glovis.usgs.gov/). All images were acquired in 2019, with no valid data obtained in February due to cloud cover (Fig. 2).
Given the biases between the two sensors in terms of spectral band ranges and view geometries, a series of preprocessing steps were applied to obtain a consistent surface reflectance record for the reconstruction of a 30-m time-series dataset [40]. The preprocessing steps for reconstructing the time-series dataset are as follows.
2) The cloud and cloud-shadow masking for Landsat-8 was performed by LaSRC internally, and for Sentinel-2 by an adapted version of Fmask [42].
3) To improve the spatial coregistration between the two sensors, geometric resampling and geographic registration were performed with the automated registration and orthorectification package [43].
4) In view of the differing solar and view angles between the two sensors, the bidirectional reflectance distribution function effects were corrected [44].
5) A bandpass adjustment (based on a linear fit between equivalent spectral bands) was applied to Sentinel-2 images to match the Landsat-8 spectral responses [35].
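The bandpass adjustment in the last step amounts to a per-band linear transform. The following is a minimal sketch of that idea, assuming the slope and offset have already been fitted; the function name `bandpass_adjust` and the coefficient values are hypothetical placeholders, not the values used in this study.

```python
def bandpass_adjust(s2_reflectance, slope, offset):
    """Map Sentinel-2 reflectance values of one band onto the Landsat-8
    spectral response using a previously fitted linear relation."""
    return [slope * value + offset for value in s2_reflectance]


# Hypothetical coefficients for a single band (illustrative only).
red_adjusted = bandpass_adjust([0.05, 0.10, 0.20], slope=0.98, offset=0.004)
```

Each spectral band would carry its own slope and offset, so the function would be called once per band.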

III. METHODOLOGY
The workflow for our method is schematized in Fig. 3. First, we reconstructed a time-series dataset through harmonized Landsat-8 and Sentinel-2 images. Next, we constructed an evergreen forest extraction model (NDVI_CV-LS). The CV of time-series NDVI was used to separate evergreen forest from other vegetation. NDVI_ann-min was used as a parameter in the dimidiate pixel model to accurately extract evergreen forest at subpixel scale. We ran a model simulation with high-resolution images and field samples to determine a reasonable threshold value. Finally, root mean square error (RMSE) and mean absolute error (MAE) were computed to quantify the errors associated with the results. Confusion matrices of different products were also established to compare the accuracy of the NDVI_CV-LS model with that of other methods.

A. Time-Series NDVI Dataset
Although cloud masking had been applied to the harmonized Landsat-8 and Sentinel-2 dataset during preprocessing, it performed poorly under surface and atmospheric states such as bright targets, thin clouds, cloud edges, and hazy conditions [45]. Additional operations were required to reduce cloud and shadow effects.
We applied an interpolation algorithm to the images of adjacent months, that is, we computed their average value, to supplement the missing images caused by cloud cover (e.g., images for February). For images with little cloud cover, interpolation results from adjacent months were used to replace the cloudy parts [25], [46]. During the noncritical growth period (i.e., from July to September), the vegetation characteristics did not change much. The maximum value composite method was used to synthesize the data from these months [47] in order to remove the influence of cloud cover, which ensured that the CV values were not overly affected. Then, the NDVI was calculated as follows [48]:
\[ \mathrm{NDVI} = \frac{\mathrm{NIR} - R}{\mathrm{NIR} + R} \]
where NIR is the near-infrared band reflectance, and R is the red band reflectance. Through the abovementioned steps, we obtained a reconstructed NDVI time-series dataset. Fig. 4 shows that, compared with the Landsat-8 time-series images, the reconstructed NDVI time series is more complete and captures more of the vegetation growth status throughout the growing period.
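The three operations described above (NDVI, gap filling by averaging adjacent months, and maximum value compositing) can be sketched for a single pixel as follows. This is an illustration only: the function names and sample values are hypothetical, and for brevity the gap filling is applied to NDVI values, whereas the text applies it to the images before NDVI is computed.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return (nir - red) / (nir + red)


def fill_gap(previous_month, next_month):
    """Estimate a missing month as the mean of the two adjacent months."""
    return (previous_month + next_month) / 2.0


def max_value_composite(values):
    """Keep the highest NDVI among several dates; cloud contamination
    tends to lower NDVI, so the maximum is the least affected value."""
    return max(values)


# Hypothetical single-pixel reflectances and NDVI values (illustrative only).
jan = ndvi(nir=0.30, red=0.10)   # approximately 0.5
mar = ndvi(nir=0.35, red=0.07)
feb = fill_gap(jan, mar)         # stands in for the cloud-covered February
summer = max_value_composite([0.71, 0.43, 0.78])  # July to September
```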

B. Coefficient of Variation
The CV is expressed as the ratio of the standard deviation to the mean. It reflects the differences among individual datapoints in a dataset. The CV of time-series NDVI is sensitive to fluctuations in the phenological cycle of vegetation. For example, deciduous trees sprout in spring and wither in autumn, and crops are sown and harvested at specific dates. CV has the ability to capture these growth fluctuations in the phenological cycle; thus, it can be used to characterize phenological attributes of evergreen forest. In this article, we took the ratio of the standard deviation to the mean of the NDVI time series as the CV for each pixel [49]. For each pixel, the mean μ of the time series is
\[ \mu = \frac{1}{n} \sum_{m=1}^{n} \mathrm{NDVI}_m \]
and σ is the standard deviation of the same n values about μ, where NDVI_m is the NDVI associated with image m and n is the number of images in the time series. The CV of the pixel in row i and column j of the NDVI time series is then calculated as follows:
\[ \mathrm{CV}_{ij} = \frac{\sigma_{ij}}{\mu_{ij}} \]
Evergreen forest remains relatively stable throughout its phenological cycle. Compared with farmland and deciduous vegetation, the CV value of a pure evergreen forest is small [Fig. 5(a)]. However, when the coverage of evergreen forest in a certain pixel is low, its NDVI_ann-min [around 0.3, Fig. 7(b) and (c)] is close to that of continuous crop and deciduous forest [Fig. 5(b)]. Compared with NDVI_ann-min, CV is sensitive to small changes [even 10%, Fig. 7(a)] in the proportion of evergreen forest; therefore, it can separate the low-coverage evergreen forest from other vegetation.
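A per-pixel CV computation can be sketched as below. The two NDVI series are hypothetical values chosen to mimic a stable evergreen pixel and a strongly seasonal deciduous pixel, and the population form of the standard deviation (division by n) is an assumption of this sketch rather than something stated in the text.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(ndvi_series):
    """CV of one pixel's NDVI time series: standard deviation over mean."""
    mu = fmean(ndvi_series)
    sigma = pstdev(ndvi_series, mu)  # population form (divide by n): an assumption
    return sigma / mu


# Hypothetical NDVI time series (illustrative only).
evergreen = [0.62, 0.60, 0.65, 0.68, 0.66, 0.61]
deciduous = [0.25, 0.30, 0.55, 0.80, 0.70, 0.28]

cv_evergreen = coefficient_of_variation(evergreen)
cv_deciduous = coefficient_of_variation(deciduous)
```

With these sample values the evergreen series yields a much smaller CV than the deciduous one, which is the contrast the method relies on.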

C. Dimidiate Pixel Model
To accurately extract evergreen forest at subpixel scale, we used the dimidiate pixel model to estimate FVC_ever. The dimidiate pixel model is a linear mixing model [36]. The model assumes that two components, background and vegetation, contribute to a pixel signal in terms of FVC estimation. The resulting signal, S, as observed by a remote sensor can be expressed as [50]
\[ S = S_{\mathrm{veg}} + S_{\mathrm{back}} \]
where S_veg is the signal contribution from the vegetation component and S_back is the contribution from the background component. Then, FVC is derived as follows:
\[ \mathrm{FVC} = \frac{S - S_{\mathrm{back}}}{S_{\mathrm{veg}} - S_{\mathrm{back}}} \]
where, in this ratio, S_veg and S_back are taken as the signals of a pure vegetation pixel and a pure background pixel, respectively. Because less information is reflected in a single band, NDVI is usually taken as the "S" to estimate FVC. NDVI is linearly related to vegetation density distribution, which can well reflect the actual situation of vegetation growth status [51]. As shown in Fig. 5(b), the NDVI_ann-min of evergreen forest was significantly higher than those of the other three types of ground objects. On this basis, we adopted NDVI_ann-min as the signal "S" in the model instead of NDVI, to extract evergreen forest at subpixel scale. Then, FVC_ever can be expressed as
\[ \mathrm{FVC}_{\mathrm{ever}} = \frac{\mathrm{NDVI}_{\mathrm{annmin}} - \mathrm{NDVI}_{\mathrm{annmin\_back}}}{\mathrm{NDVI}_{\mathrm{annmin\_veg}} - \mathrm{NDVI}_{\mathrm{annmin\_back}}} \]
where NDVI_annmin_back is the NDVI_ann-min when the pixel does not contain evergreen forest, and NDVI_annmin_veg is the NDVI_ann-min when the pixel contains only evergreen forest. With this equation, evergreen forest was extracted at subpixel scale.
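The FVC_ever expression can be written as a small function. This is a sketch under stated assumptions: the end-member values in the example are hypothetical, and clipping the result to the interval [0, 1] is a convenience added here, not a step described in the text.

```python
def fvc_ever(ndvi_annmin, back, veg):
    """Dimidiate pixel model: position of the pixel's annual minimum NDVI
    between the background and pure evergreen end-members."""
    fraction = (ndvi_annmin - back) / (veg - back)
    # Clipping to [0, 1] is an assumption of this sketch.
    return min(1.0, max(0.0, fraction))


# Hypothetical end-member values (illustrative only).
half_cover = fvc_ever(0.45, back=0.30, veg=0.60)  # approximately 0.5
```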

D. Thresholds Analysis
The key to model performance lies in the reasonable setting of parameters. Compared with deciduous vegetation and farmland, the annual NDVI fluctuation curve of evergreen forest is the most stable [52], which means that evergreen forest has the smallest CV (Fig. 5). By setting an appropriate CV threshold, all pixels containing evergreen forest can be selected. The evergreen forest classification results obtained by CV contained a large number of mixed pixels, which needed to be accurately extracted at the subpixel level. The CV associated with built-up areas does not differ much from that associated with evergreen forest (Fig. 4). Consequently, classification results may be inaccurate when evergreen forest is extracted using only CV. The NDVI_ann-min associated with evergreen forest is significantly larger than that associated with built-up areas. We therefore set NDVI_ann-min as the driving variable of the dimidiate pixel model to extract FVC_ever from mixed pixels.
To obtain more accurate FVC_ever values, the NDBI was used to divide the study area into areas with impervious and nonimpervious surfaces (Fig. 6), to which different parameters of the dimidiate pixel model were assigned. NDBI was calculated as follows [53]:
\[ \mathrm{NDBI} = \frac{\mathrm{MIR} - \mathrm{NIR}}{\mathrm{MIR} + \mathrm{NIR}} \]
where NIR represents the reflectance value of the NIR band and MIR represents the reflectance value of the mid-infrared band. The Gaofen-1 (GF-1) images used in the study were acquired on November 20, 2019 and December 27, 2019, with a spatial resolution of 8 m in the multispectral bands and 2 m in the panchromatic bands. From the GF-1 images, 200 samples were randomly and uniformly selected. Each sample covered 8 × 8 pixels with a resolution of 30 m to reduce the registration deviation between GF-1, Landsat-8, and Sentinel-2 images.
The proportion of evergreen forest in each sample was estimated, as well as the average CV and NDVI_ann-min. Simulation analysis was carried out in combination with 29 field samples and visual interpretation via Google Earth. Thresholds of three model parameters (CV, NDVI_annmin_back, and NDVI_annmin_veg) were determined from mixed pixels with different proportions of evergreen forest.
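The NDBI-based partition described in this section can be sketched per pixel as follows. The text does not state the NDBI cutoff used to separate the two zones, so the default `threshold=0.0` below is purely hypothetical, as are the function names and sample reflectances.

```python
def ndbi(mir, nir):
    """Normalized difference built-up index for one pixel."""
    return (mir - nir) / (mir + nir)


def surface_zone(mir, nir, threshold=0.0):
    """Assign a pixel to the impervious or nonimpervious zone.
    The threshold value is a hypothetical placeholder."""
    return "impervious" if ndbi(mir, nir) > threshold else "nonimpervious"


# Hypothetical reflectances (illustrative only).
built_up_pixel = surface_zone(mir=0.30, nir=0.20)
vegetated_pixel = surface_zone(mir=0.15, nir=0.40)
```

The zone label then decides which pair of dimidiate pixel model parameters is applied to the pixel.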

E. Accuracy Assessment and Model Comparison
To verify the effectiveness of the NDVI_CV-LS model, we compared its performance with that of the following three evergreen forest extraction methods: 1) NDVI_min, in which evergreen forest is extracted using only the NDVI_ann-min without considering the constraints from the CV, with harmonized Landsat-8 and Sentinel-2 images; 2) NDVI_win, in which evergreen forest is extracted from single-scene NDVI images in winter, using Landsat-8 images acquired on January 23, 2019; and 3) NDVI_CV-L, in which evergreen forest is extracted using only Landsat-8 time-series data. The preprocessing for harmonizing Landsat-8 and Sentinel-2 images differs from that of standard Landsat-8 and Sentinel-2 products; thus, standard Landsat-8, Sentinel-2, and the harmonized Landsat-8 and Sentinel-2 dataset are not directly comparable. From here on, we will use the terms "Landsat" and "Sentinel-2" to refer to their respective subsets of the harmonized Landsat-8 and Sentinel-2 dataset. The thresholds of the parameters used in the above three models were consistent with those used in the NDVI_CV-LS model. To assess the accuracy of the experimental results, RMSE and MAE were used as verification standards [54] and calculated as follows:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( X_{\mathrm{mod},i} - X_{\mathrm{obs},i} \right)^{2}} \]
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| X_{\mathrm{mod},i} - X_{\mathrm{obs},i} \right| \]
where n is the number of samples, X_mod,i is the FVC_ever estimated using the NDVI_CV-LS model, and X_obs,i is the FVC_ever estimated from visual interpretation of high-resolution images.
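The two error measures can be computed as in the sketch below. The modelled and observed FVC_ever values are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def rmse(modelled, observed):
    """Root mean square error between paired estimates."""
    n = len(modelled)
    return sqrt(sum((m - o) ** 2 for m, o in zip(modelled, observed)) / n)


def mae(modelled, observed):
    """Mean absolute error between paired estimates."""
    n = len(modelled)
    return sum(abs(m - o) for m, o in zip(modelled, observed)) / n


# Hypothetical FVC_ever values for three validation samples (illustrative only).
modelled = [0.10, 0.50, 0.90]
observed = [0.20, 0.50, 0.70]

sample_rmse = rmse(modelled, observed)
sample_mae = mae(modelled, observed)
```

Because RMSE squares each difference, it weights the single large error (0.20) more heavily than MAE does, so RMSE is never smaller than MAE.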

A. Analysis Results of Model Parameters
The relationships between CV, NDVI_ann-min, and the proportion of evergreen forest in mixed pixels are shown in Fig. 7. The simulation analysis shows that the CV decreases with increasing evergreen forest proportion, whereas the NDVI_ann-min increases with increasing evergreen forest proportion. When the evergreen forest proportion was greater than 0, the CV was less than 0.22. When the underlying surface types in the mixed pixels differed, the NDVI_ann-min values corresponding to the same evergreen forest proportion were different. With an evergreen forest proportion of 1, the NDVI_ann-min associated with impervious areas was 0.51, whereas that associated with nonimpervious areas was 0.54. With an evergreen forest proportion of 0, the NDVI_ann-min associated with impervious and nonimpervious areas was 0.20 and 0.34, respectively. After considering the different underlying surface types contained in the mixed pixel, the following final threshold values for the NDVI_CV-LS model were determined: CV throughout the study area, 0.22; NDVI_annmin_back and NDVI_annmin_veg in areas with impervious surfaces, 0.20 and 0.51, respectively; NDVI_annmin_back and NDVI_annmin_veg in other areas, 0.34 and 0.54, respectively.
Fig. 8 shows the spatial distribution of evergreen forest in the study area in 2019, estimated based on the CV and NDVI_ann-min thresholds obtained through the prior analyses. Hefei contains 430.86 km² of evergreen forest, which is mainly distributed near mountainous areas. Sporadic evergreen tree stands are also scattered throughout the urban area and were likely grown for the purposes of afforestation and aesthetics. In the flat main crop-growing areas, there are few evergreen trees. The estimated FVC_ever results closely match the actual ground situation. In particular, our method has advantages in extracting evergreen forest with low coverage in urban built-up areas [Fig. 8(a)].
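As a worked illustration of how the reported thresholds enter the dimidiate pixel model, consider a hypothetical pixel whose CV is below 0.22 and whose NDVI_ann-min is 0.40 (a value chosen here only for the example). Depending on the zone to which the pixel is assigned, the same NDVI_ann-min gives quite different cover fractions:

```latex
\[
\mathrm{FVC}_{\mathrm{ever}}^{\mathrm{impervious}}
  = \frac{0.40 - 0.20}{0.51 - 0.20} = \frac{0.20}{0.31} \approx 0.65
\]
\[
\mathrm{FVC}_{\mathrm{ever}}^{\mathrm{nonimpervious}}
  = \frac{0.40 - 0.34}{0.54 - 0.34} = \frac{0.06}{0.20} = 0.30
\]
```

The gap between the two results shows why a single pair of end-member values for the whole study area would misestimate cover in one of the two zones.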

C. Model Validation and Comparison
The extraction accuracy of the model was verified using field samples, Google Earth, and GF-1 high-resolution images. The overall classification accuracy of the NDVI_CV-LS model was higher than that of the other models (Table II). The user accuracy of our model was 91% and the producer accuracy was 86% in terms of evergreen forest extraction. For the study area, harmonizing Landsat-8 and Sentinel-2 time-series images to extract evergreen forest improved the overall accuracy by 8% compared with using Landsat-8 images alone. Meanwhile, the RMSE (0.11) and MAE (0.09) of the NDVI_CV-LS model were both lower than those of the NDVI_CV-L model (0.24 and 0.17, respectively). The overall classification accuracy of the NDVI_min model was approximately 15% lower than that of the NDVI_CV-LS model. The overall accuracy, user accuracy, and producer accuracy of the single-phase NDVI_win model were all low at approximately 60%.
Although the NDVI_CV-L model was based on time-series images and took the CV and NDVI_ann-min constraints into account, these two parameters could not reflect the real phenological characteristics of evergreen forest because some images in key phenology periods were unavailable owing to clouds and rain, which led to a lower classification accuracy. The NDVI_min model extracted evergreen forest based on the minimum NDVI value in the time series, which was derived from a single image. The NDVI_win model was also based on a single-phase image. Thus, the classification accuracy of these two models was expected to be low. In conclusion, the NDVI_CV-LS model was the most effective and stable of the models compared.
We also compared the results of the NDVI_CV-LS model with those of supervised classification methods to validate our model. We chose three classifiers: maximum likelihood classification, minimum distance classification, and support vector machine classification [55]-[57]. Ten classification operations were repeated for each classifier, and the training samples used for each classification routine were randomly generated. The same training and verification samples were used for all three classifiers to ensure that the outputs were comparable. The commission and omission errors of the different classifiers in distinguishing evergreen forest at the pixel scale were analyzed (Fig. 9). The commission and omission errors of the NDVI_CV-LS model were 0.09 and 0.14, respectively, significantly smaller than those of the three supervised classification methods, indicating that our model can effectively extract evergreen forest.
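Commission and omission errors are the complements of user accuracy and producer accuracy, which is why the reported errors of 0.09 and 0.14 correspond to the 91% and 86% accuracies given earlier. A minimal sketch follows; the pixel counts are hypothetical and were chosen only so that the rates match those reported.

```python
def commission_error(true_pos, false_pos):
    """Share of pixels mapped as evergreen that are not evergreen
    (1 - user accuracy)."""
    return false_pos / (true_pos + false_pos)


def omission_error(true_pos, false_neg):
    """Share of true evergreen pixels that the map missed
    (1 - producer accuracy)."""
    return false_neg / (true_pos + false_neg)


# Hypothetical counts (illustrative only).
commission = commission_error(true_pos=91, false_pos=9)
omission = omission_error(true_pos=86, false_neg=14)
```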

V. DISCUSSION
Mapping evergreen forest using time-series images faces some difficulties, for instance, incomplete time series and low spatial resolution. As shown in our study, the NDVI_CV-LS model can substantially mitigate these difficulties. By harmonizing Landsat-8 and Sentinel-2 images, we improved the accuracy of FVC_ever estimation in subtropical urban areas that frequently experience cloudy and rainy weather conditions.

A. Necessity and Feasibility of Data Harmonization
Satellites that capture medium-to-high spatial resolution images tend to have a long revisit period, such as 16 days for Landsat. For example, only 23 Landsat images of Hefei were taken in 2019, of which only nine had little or no cloud cover. Given the growth characteristics of evergreen forest, a continuous time series of images is needed as a data source for mapping. Given the weather conditions in the subtropics of the Northern Hemisphere, not many usable images can be captured during the plum rain season [52]. If the time series is incomplete and NDVI is available only for periods when the background vegetation is growing best, the FVC_ever estimation results will be unsatisfactory. Fig. 10 shows the key phenology periods in which evergreen forest is distinct from other vegetation. If only Landsat-8 is used as the data source, some information about the key phenology periods (such as June) will be lost, leading to an inaccurate NDVI_ann-min. Introducing multisource data can increase the possibility of obtaining the optimal NDVI_ann-min, which directly impacts the accuracy of the FVC_ever estimation in pixels. The comparison of the FVC_ever estimation results between the NDVI_CV-LS and NDVI_CV-L models also supports this point (Section IV-C). To ensure that the time series is complete enough, the dataset must be supplemented with images from multiple sources.
Images acquired from different satellites vary in spatial resolution and spectral band range due to differences in solar altitude, atmospheric conditions, and sensors. In the process of harmonizing multisource images, in addition to correction through preprocessing, selecting similar data is also critical to minimize the biases. Therefore, Landsat-8 and Sentinel-2, whose spatial resolution and spectral information are similar, were selected as the data sources for harmonization in this study. To assess the feasibility of the harmonization undertaken here, we selected three types of ground objects (forest, farmland, and urban area) to verify whether the Landsat-8 and Sentinel-2 data were consistent. Sample points representing each type of ground object were selected from Landsat-8 images acquired on January 23, 2019 and Sentinel-2 images acquired on January 22, 2019. Then, regression analysis was performed using the reflectance of the red and NIR bands and the NDVI values of corresponding pixels. The results showed that Landsat-8 and Sentinel-2 images exhibited high consistency and can be used to construct a time-series dataset (Fig. 11). Nevertheless, some inconsistencies were observed for ground objects depending on the spectral band. Within the red band, forest was the most consistent between the two sensors, followed by farmland and urban area. Within the NIR band, forest and farmland were both highly consistent between the two sensors, whereas urban area was somewhat inconsistent, because urban area has lower near-infrared reflectance values than forest and farmland and is more susceptible to changes in solar altitude. The trend in consistency in the NDVI regarding the three types of ground objects was similar to that observed in the NIR band. There are many types of impervious surfaces with different characteristics, including roads and buildings, which gives rise to a range of NDVI values. Additionally, NDVI values are calculated using NIR and red band reflectance values. The spectral values associated with forest and farmland increase significantly going from red to NIR wavelengths, whereas the spectral values associated with impervious surfaces do not differ much between the two bands. As a result, spectral differences between the two sensors were amplified by the normalization process.
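The consistency check between the two sensors is a simple linear regression over paired samples. The sketch below fits an ordinary least-squares line and reports R²; the paired NDVI values are hypothetical, and the function name `linear_fit` is introduced here only for illustration.

```python
from statistics import fmean


def linear_fit(x, y):
    """Ordinary least squares y = slope * x + intercept, plus R^2."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical paired NDVI samples from the two sensors (illustrative only).
landsat_ndvi = [0.20, 0.35, 0.50, 0.65]
sentinel_ndvi = [0.22, 0.36, 0.52, 0.66]

slope, intercept, r_squared = linear_fit(landsat_ndvi, sentinel_ndvi)
```

A slope near 1, an intercept near 0, and an R² near 1 would indicate that the two sensors give closely matching values for that ground-object type.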

B. Applicability of the NDVI CV -LS Model
The accuracy of this model is greatly affected by spatial heterogeneity, especially for areas with complex and fragmented underlying surfaces. The NDVI_ann-min values associated with pure background pixels containing different underlying surfaces varied; e.g., the NDVI_ann-min was 0.2 for pixels containing buildings and 0.3 for pixels containing farmland and deciduous vegetation. In urban areas, building shadows can lead to low NDVI_ann-min values, whereas in nonurban areas, the mixing of withered crops, grasses, and leaves into the soil can lead to higher NDVI_ann-min values. If the same threshold value is used for extracting evergreen forest in mixed pixels with different underlying surfaces, the results will not be satisfactory. In the GF-1 images [Fig. 12(a)], residential areas alternate with evergreen forest, resulting in a large number of mixed pixels. However, if evergreen forest is extracted without considering thresholds for different land-cover types, only pixels in which evergreen forest is mixed with other types of vegetation (deciduous vegetation, farmland, grassland, etc.) would be extracted. In this case, pixels in which evergreen forest co-occurs with impervious surfaces (residential land, roads, etc.) would not be considered, and any urban areas with a low coverage of evergreen forest would be ignored [Fig. 12(b)]. By using NDBI to divide the study area and setting parameters by area, good extraction results were obtained [Fig. 12(c)]. With this modification, the NDVI_CV-LS model can be used for the fine extraction of evergreen forest in urban areas.

C. Uncertainties Associated With the Model
Because images were obtained from two different remote sensors, cumulative errors due to differences in spatial resolution and spectral band range will increase the uncertainties associated with model outcomes. Although harmonization processing was performed, some errors will still inevitably affect the classification results [45]. In addition, the shorter the time-series interval, the better the CV of NDVI reflects the actual situation [58]. This suggests that a greater frequency of observations, for example from harmonizing data across all comparable sensors, is still needed.

VI. CONCLUSION
In this article, we developed a new approach, NDVI_CV-LS, for extracting evergreen forest from harmonized multisource remote-sensing images acquired in cloudy and rainy areas. We used Landsat-8 and Sentinel-2 images with cloud cover less than 10% acquired within a year to reconstruct a continuous time-series dataset. Then, the CV of time-series NDVI was used to separate evergreen forest from other vegetation, and the NDVI_ann-min was used as a parameter in the dimidiate pixel model to estimate FVC_ever. Given the spatial heterogeneity and severe mixed-pixel problem, the study area was divided into areas with impervious and nonimpervious surfaces. The dimidiate pixel model parameters were simulated and analyzed for each region. The overall classification accuracy of the NDVI_CV-LS model was 93%, and the RMSE and MAE were 0.11 and 0.09, respectively. The results showed that the addition of Sentinel-2 images provided the potential to build full time-series models. In particular, the harmonized Landsat-8 and Sentinel-2 data improved the accuracy of FVC_ever estimation in areas with complex underlying surfaces under cloudy and rainy conditions. Compared with other evergreen forest extraction models, the NDVI_CV-LS model is more stable and accurate.

He is currently a Professor with the School of Earth Sciences and Engineering, Hohai University, Nanjing, China. His research interests include hyperspectral remote sensing and water remote sensing applications.
Yuting Zhao is currently working toward the M.S. degree in photogrammetry and remote sensing at the School of Earth Sciences and Engineering, Hohai University, Nanjing, China. Her research interests include forest resources monitoring and extraction using multiple remote sensing datasets.
He is currently an Associate Research Fellow with the Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing. His research interests include optical remote sensing and its application in the ecological and environmental fields.