Short Time Cloud-Free Image Reconstruction Based on Time Series Images

A cost-effective method for cloud removal and reconstruction of remote sensing images has been a long-standing research challenge in remote sensing data processing. In this article, we address this challenge by presenting a fast and simple method for cloud detection and cloud-free image reconstruction using Landsat-8 OLI and Sentinel-2 MSI series data. The proposed method utilizes the spectral difference between cloud pixels and transparent pixels to develop a fast and simple cloud detection algorithm for multispectral remote sensing sensors. Subsequently, the cloud-free data of the complementary image are histogram-matched to the target image, and the cloud-free pixels of the complementary image are seamlessly integrated into the original image to generate a reconstructed cloud-free image. Comparative analysis between the results obtained from our proposed method and the corresponding manually produced reference results reveals an accuracy rate exceeding 90% and high consistency in the reconstructed spatial spectrum. By addressing the need for cost-effective cloud removal and image reconstruction, our method contributes to the advancement of remote sensing data processing and applications.

The presence of clouds prevents optical satellites from obtaining useful information about the Earth's surface and affects image availability to varying degrees. Furthermore, shadows cast by clouds on the ground can contaminate the image. The lack of image information due to clouds and their shadows results in space-time gaps in satellite Earth observation data [5] and may introduce deviations in subsequent image processing and applications [6]. Some classical aerosol retrieval algorithms, such as the MODIS dark target, deep blue, and multiangle imaging spectrometer algorithms, need to remove the influence of clouds before retrieval [5], [7], [8], [9], [10], [11]. Although cloud removal is a long-standing research topic in the remote sensing community [12], [13], existing techniques still face challenging problems, such as complexity and universality, that demand further research. This article aims to address these problems by proposing a method of cloud detection and cloud-free image reconstruction using Landsat-8 OLI and Sentinel-2 MSI series data, which realizes high-precision and fast cloud removal.
Cloud detection is a crucial step in the process of cloud removal. The percentage of clouds, which can be determined through cloud detection, is often used as an indicator of image quality and data availability [14]. It helps to extract useful data, improve the storage and transmission efficiency of image data, and provide an important product in the preprocessing phase, making it possible to maximize the use of remaining cloud-free areas and improve image suitability, particularly in cloudy and rainy areas [5]. However, only a few satellite products, such as Landsat, MODIS, and Sentinel-2, have associated cloud mask products [15]. Therefore, fast and simple cloud detection algorithms are essential.
Over the past decades, a variety of cloud detection methods have been developed [2], [16], [17], [18]. These methods generally rely on the comparison between cloud and background surface within a given target area. This comparison can be based on the difference of a single spectrum, spectral combinations, temporal and spatial characteristics of clouds, or a combination of spectral, spatial, and texture features of the cloud [1], [19], [20], [21], [22].
One of the effective cloud detection methods is the threshold method, in which thresholds are determined by the spectral diversity and the variation of the underlying surface [6]. Most existing algorithms are rule-based, meaning that certain spectra, such as near-infrared intensity thresholds, play a crucial role in cloud detection. Such an approach is susceptible to variations in meteorological conditions and absolute surface reflectance. Nevertheless, the threshold-based detection method is widely used for its simplicity and efficiency [23]. The ISCCP [24] assumes that visible and infrared radiances are affected by two kinds of conditions, cloudy and clear, and that the ranges of radiances associated with these two conditions do not overlap. Cloud pixels can be identified by comparing the 0.6 and 1.1 µm bands with a clear-cloud threshold value. This conservative algorithm minimizes false cloud detection while omitting some thin clouds. Unlike methods that detect clouds pixel by pixel, the NOAA CLAVR algorithm [22] takes a 2 × 2 pixel matrix as the identification unit. Only if all four pixels in the matrix pass the cloud detection tests can the matrix be regarded as cloud; otherwise, it is identified as clear sky if it meets the criteria for different underlying surfaces. The APOLLO algorithm [25] uses data from five AVHRR channels and defines a threshold in each channel to detect cloud pixels. A pixel is identified as cloud when its reflectance exceeds the threshold or its temperature falls below the designated value. A pixel is also flagged as cloud if the reflectance ratio of channels 2 and 1 falls within the range of 0.7 to 1.1, since clouds are spectrally flat in these channels, or if the temperature difference between channels 4 and 5 exceeds a certain threshold, which indicates thin cirrus; pixels passing only part of these tests are treated as a mixture of clear sky and cloud.
In addition, other commonly used cloud detection methods include those used by the Landsat satellites, such as the Landsat 7 automated cloud cover assessment (ACCA) [26], [27], the landscape fire and resource management planning tools (LANDFIRE) CCA, the Landsat ecosystem disturbance adaptive processing system (LEDAPS) cloud algorithm [28], and the function of mask (Fmask) algorithm [6], all of which utilize multiple spectral threshold-based tests. The Sentinel-2 satellite, known for its short revisit cycle and high resolution, also has dedicated cloud detection methods, such as Sen2Cor developed by the European Space Agency (ESA), which performs atmospheric correction and cloud detection; however, it still encounters problems of over- and underestimation of clouds above water. The MAJA (MACCS-ATCOR Joint Algorithm) designed for Sentinel-2 is based on time series analysis [29], but it requires a significant amount of imagery and computational resources.
However, threshold-based methods have limitations, such as being susceptible to variations in meteorological conditions and absolute surface reflectance, and their performance can vary depending on the season or sun elevation [30]. Additionally, the transfer of threshold-based algorithms from one sensor to another can be difficult, as distinct criteria are often involved [31].
In contrast, the proposed method in this article addresses these limitations by using a combination of Landsat-8 OLI and Sentinel-2 MSI series data, which allows for high-precision and fast cloud removal. Furthermore, it utilizes a cloud detection algorithm that is based on the spectral difference between cloud pixels and transparent pixels, which improves its accuracy and efficiency compared to traditional threshold-based methods. This algorithm is specifically designed to be used with multispectral remote sensing sensors, such as Landsat-8 OLI and Sentinel-2 MSI, which makes it more versatile and applicable to a wider range of data. The results of this method show that it not only achieves similar or even better accuracy than existing threshold-based methods, but also has the advantage of being faster and more versatile.
Cloud detection is an essential step in obtaining accurate and complete ground imagery by removing cloud cover. However, it is not the sole factor in achieving high-quality, cloud-free imagery. Data fusion techniques play a crucial role in effectively deducing the ground image under cloudy conditions. These techniques enable the integration of multiple remote sensing images to compensate for the information gaps caused by cloud pollution, thereby enhancing the overall quality and accuracy of the reconstructed image [32]. Despite recent advancements in satellite imaging technology, obtaining high-quality, cloud-free imagery for a specific time and area remains a challenge. To address this challenge, this article proposes a method that combines cloud detection and data fusion techniques to effectively deduce the ground image under cloudy conditions, ultimately resulting in improved quality and accuracy of the reconstructed imagery.
Numerous efforts have been made on cloud removal to reduce the impact of clouds. Reconstructing cloudless imagery is actually a process of information reconstruction, which can be classified into noncomplementary, multispectral complementary, and time-based complementary [33], [34].
In the noncomplementary method, the information of cloud-contaminated areas in remote sensing images is reconstructed from the remaining portion of the image without the aid of other complementary data. This approach relies on the assumption that cloud-free regions have contextual information similar to that of their adjacent cloudy areas [35]. Honold et al. [20] suggested that the spatial relationships between local and nonlocal regions can be incorporated into cloud removal. Practicable spatial-based reconstruction algorithms include interpolation-based methods, total variation methods [36], spreading or diffusion-based methods, and exemplar-based methods [16]. However, when the absence of information is substantial, the results of this method may be inadequate for data analysis or further application [16], [37], [38].
The multispectral data is used to reconstruct the missing information in the multispectral complementarity method, by modeling the relationship between the contaminated band and the auxiliary band with complete and clear data. The correlation between various spectral bands makes it possible to recover the lost information, by utilizing the auxiliary bands as a reference [39]. This approach is particularly useful in cases where the quantitative products of remote sensing images are affected by cloud contamination, as the multispectral complementarity method allows for the reconstruction of such data.
Compared with noncomplementary and multispectral complementarity methods, multitemporal complementarity methods are more effective in coping with thick clouds. Satellite remote sensing systems with fixed recurrence can acquire images of the same area on a regular basis [23], enabling the collection of multitemporal data by revisiting the same location at different times. The cloud coverage in these images typically does not overlap completely, providing a data source for reconstructing information via multitemporal image processing [40]. This method assumes that ground type and geometric position remain relatively unchanged in the short term, allowing for the reconstruction of missing data through a neighborhood similar pixel interpolator [23]. Image mosaics are also utilized to obtain cloud-free images by splicing cloud-free regions from multitemporal images [32], [40], [41], [42]. Chen et al. [43] developed a neighborhood similar pixel interpolation (NSPI) method to address the gap caused by the closure problem of Landsat ETM+ scan line corrector and further improved it (MNSPI) to remove thick clouds [44]. Other multitemporal complementary methods designed to recover missing pixels caused by sensor failures, such as the local linear histogram matching (LLHM) method [45] and weighted linear regression (WLR) method [46], can also be used for cloud removal. However, despite their effectiveness in thick cloud removal, temporal-based methods are also susceptible to spectral reflectance change [6]. Additionally, these methods may not maintain the spatial continuity of objects in the restored image and may be limited in their ability to reconstruct large-scale thick clouds or address issues arising from significant spectral differences among multitemporal images, such as those caused by longer time intervals or differing atmospheric conditions.
Despite these limitations, multitemporal complementarity-based methods have proven to be an effective approach for thick cloud removal and have been widely used in various applications. However, further research is needed to improve the robustness and applicability of these methods in the presence of significant spectral differences and to ensure the preservation of spatial continuity in the restored image.
Recently, with the powerful feature extraction and expression capabilities of deep learning [47], it has been applied in various fields. In the field of remote sensing data quality improvement, several solutions have been proposed through a data-driven learning framework [48], such as SAR image despeckling [49], hyperspectral image denoising [50], and pan sharpening [51], [52], [53]. These methods have shown to achieve state-of-the-art reconstruction results.
Despite the success of these feature learning methods in various remote sensing image processing tasks, they still have limitations when it comes to thick cloud removal. Specifically, these methods often fail to take into account the local characteristics of missing regions and their adjacent areas, and are typically limited to using a single time image. Additionally, they often rely on the availability of cloud-free supplementary data, which is often not the case in real-world scenarios [49]. These limitations have led to a need for more effective cloud removal methods, which is where the proposed method in this article comes in [6].
In this study, we propose a novel cloud removal method that utilizes a multitemporal complementary approach, which emphasizes achieving high-quality cloud-free images containing 90% clear pixels while also considering the trade-off between accuracy and speed. By incorporating a data-driven learning framework, our method effectively addresses the shortcomings of previous feature learning methods, such as the lack of consideration for local particularity and the reliance on cloudless images as supplementary data. Our approach allows for the efficient reconstruction of large-scale cloud-covered areas while minimizing the risk of misdetection as clouds. As a result, our method not only provides high-quality cloud-free images but also maintains a balance between accuracy and computational efficiency, making it a practical solution for remote sensing applications.

II. MATERIALS AND METHODS
The proposed method is illustrated in Fig. 1, which presents the overall flowchart of the process. The first step involves executing a cloud detection algorithm using multispectral data to generate a cloud mask. Next, a spatial and spectral matching algorithm is implemented to sort complementary data based on their proximity to the target date. Subsequently, a spatio-temporal recovery method is employed to reconstruct the information of cloud-covered areas. Finally, a spectral histogram matching algorithm is applied to all reconstructed pixels. With a sufficient amount of supplementary data, it is possible to fully recover all cloud-covered areas. The specific details of each step are further discussed in this section.

A. Study Area and Data
The proposed cloud-free reconstruction algorithm was evaluated on five typical landforms in China, as illustrated in Fig. 2. These landforms, which include coasts, plains, basins, plateaus, and mountains, are located in Shanghai, Hainan, Sichuan, Qinghai, and Tibet, respectively. These locations cover various cloud cover conditions, such as the southern foothills of the Himalayas, where the warm and humid airflow from the Indian Ocean is blocked, resulting in frequent cloud cover, and the mountains, where there is a large amount of ice and snow interference. In order to evaluate the proposed algorithm, two of the most widely used multispectral sensors, Sentinel-2 and Landsat 8, were selected for this study.
The Sentinel-2 satellites, part of the European Earth Observation Program, carry the MSI sensor, which acquires over 500 images per day with high spatial resolution and a short revisit cycle. The NASA Landsat 8 satellite carries the OLI sensor, which provides data at a 16-day revisit interval, with bands covering wavelengths from 0.435 to 2.294 µm. A total of 83 images collected from 2016 to 2020, covering an area of over 2900 square kilometers, were used in this study [12], [54].
To establish a comprehensive dataset for cloud removal research, it is important to consider a wide range of common cloud and surface types. The impact of the underlying surface on the cloud removal process increases with the thickness of the cloud layer.
To account for this, the selected study areas must include a variety of artificial and natural textures such as towns, forests, grasslands, paddy fields, bare soil, and snow-covered mountains. The spectral characteristics of these surfaces must be taken into consideration to ensure that the images are consistent across time and space. For example, the reflectance of water is typically lower than that of other surfaces such as snow or artificial surfaces, which are more prone to being misidentified as clouds. Despite this, the extensive data collected in this research will enable us to distinguish between clouds and ground surfaces. As shown in Table I, there are significant spectral differences between ground objects in different regions. These representative regions were chosen to ensure that the research methods developed are universal.

B. Cloud Detection Algorithm
The cloud detection algorithm is based on the difference in reflectance spectra between clouds and underlying surfaces in the visible to near-infrared wavelengths [4]. However, relying on specific bands alone may not reliably distinguish clouds from other features. Hence, the algorithm combines distinct band differences and spectral correlation to accurately differentiate between cloudy and clear skies [6]. According to the conclusions of CDAG, different cloud detection algorithms must be evaluated separately for each sensor's data, which is time-consuming and not conducive to future expansion of the work [16]. Therefore, performing detection with bands common to most satellites allows the algorithm to be applied to as many sensors as possible and improves the level of automation. In this article, because complementary information is available, the accuracy requirement can also be relaxed appropriately, as long as the vast majority of clouds are detected [55].
The single-band cloud detection test is based on the fact that the reflectance of clouds at the 1.6 µm band is typically higher than that of the underlying surface; both Sentinel-2 MSI and Landsat 8 OLI provide such a band, which is also sensitive to atmospheric moisture. A pixel is flagged as cloud when its 1.6 µm reflectance exceeds a threshold T

ρ(1.6 µm) > T, 0.3 ≤ T ≤ 0.5 (1)

where the threshold range of 0.3 to 0.5 has been shown to effectively detect most thick clouds in previous research [56]. After this test, the majority of thick clouds are extracted; however, thin or transparent cloud pixels may be missed at this stage. To reduce false detection of clear-sky pixels, a morphological erosion of the detection results is performed. The single-band test is then combined with a correlation test to further improve the accuracy of the cloud detection algorithm and reduce the number of false positives (Fig. 3): the correlation coefficient (CC) (2) between each remaining candidate pixel and typical cloud spectra is calculated to determine whether the pixel should be identified as cloud or clear sky, with a CC greater than 0.9 used as the criterion for a cloud pixel.
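A minimal numpy sketch of the single-band test with the erosion step; the 0.4 threshold is an illustrative choice within the stated 0.3 to 0.5 range, and the 4-neighbor erosion is an assumed structuring element.

```python
import numpy as np

def erode(mask):
    # 4-neighbor binary erosion implemented with array shifts: a pixel
    # survives only if it and its up/down/left/right neighbors are all True.
    out = mask.copy()
    out[1:, :] &= mask[:-1, :]   # require the neighbor above
    out[:-1, :] &= mask[1:, :]   # require the neighbor below
    out[:, 1:] &= mask[:, :-1]   # require the neighbor to the left
    out[:, :-1] &= mask[:, 1:]   # require the neighbor to the right
    return out

def single_band_test(rho_16, threshold=0.4):
    """Single-band test of (1): flag pixels whose 1.6 um reflectance
    exceeds the threshold, then erode the mask to suppress edge false
    detections before the correlation test."""
    return erode(rho_16 > threshold)
```

In practice the eroded mask would then be passed to the correlation test, which re-examines the removed border pixels against the cloud spectral library.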
To address the problem of detecting clouds with less obvious characteristics, spectral samples are selected from more than 100 regions of interest (ROIs) in images acquired within a year before and after the date of the original image to be reconstructed, and their spectral information is used to form a typical cloud spectral library. This cloud spectral database provides enough information for comparison with clouds that have less obvious characteristics [12], [57].
CC = Σᵢ(DNᵢ − DN̄)(ENᵢ − EN̄) / √(Σᵢ(DNᵢ − DN̄)² · Σᵢ(ENᵢ − EN̄)²), i = 1, …, n (2)

where DNᵢ represents the DN value of the pixel under test in band i, DN̄ represents its average over all bands, ENᵢ represents the cloud reflectance of the reference data in band i, EN̄ represents the corresponding average reflectance, and n represents the number of bands of the pixels being compared.
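The correlation test in (2) is a Pearson correlation between a pixel's band vector and a reference cloud spectrum; a minimal sketch follows, where the function names and the library-of-spectra interface are illustrative assumptions.

```python
import numpy as np

def spectral_cc(dn, en):
    """Pearson correlation coefficient between a pixel spectrum `dn` and a
    reference cloud spectrum `en` across n bands, as in (2)."""
    dn = np.asarray(dn, dtype=float)
    en = np.asarray(en, dtype=float)
    d, e = dn - dn.mean(), en - en.mean()
    return float((d * e).sum() / np.sqrt((d * d).sum() * (e * e).sum()))

def is_cloud(dn, library, cc_min=0.9):
    # A pixel is labeled cloud if it correlates strongly (CC > 0.9 in the
    # paper) with any spectrum in the typical-cloud library.
    return any(spectral_cc(dn, en) > cc_min for en in library)
```

The CC > 0.9 criterion and the per-spectrum comparison against the library mirror the procedure described above.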
To verify the effectiveness of this method, we selected the images with the area mentioned above as the test area. The test terrain covers snow, mountains, forests, grasslands, lakes, clouds, and bare soil. The diversity of ground surface types facilitates the universality of the proposed method.
To quantitatively evaluate this method, we compared the cloud detection results with the manually digitized results used as the reference and calculated the cloud pixel accuracy, clear-sky pixel accuracy, error rate, and missing rate (3)-(6). The following four indicators are used as the basis for judgment:

Cloud pixel correct rate = TP / (cloud pixels in reference data) (3)

Clear-sky pixel correct rate = TN / (clear-sky pixels in reference data) (4)

Error rate = FP / (clear-sky pixels in reference data) (5)

Missing rate = FN / (cloud pixels in reference data) (6)
Here, TP (true positive) denotes pixels identified as cloud in both the reference data and the detection result; TN (true negative) denotes pixels identified as clear sky in both the reference data and the detection result; FP (false positive) denotes pixels that are clear in the reference data but classified as cloud in the detection result; and FN (false negative) denotes pixels that are cloud in the reference data but classified as clear sky in the detection result.
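The four indicators (3)-(6) can be computed directly from boolean cloud masks; the sketch below assumes the detection result and reference are same-shaped boolean arrays (True = cloud).

```python
import numpy as np

def detection_rates(detected, reference):
    """Compute the four accuracy indicators (3)-(6) from boolean masks."""
    tp = np.sum(detected & reference)     # cloud in both
    tn = np.sum(~detected & ~reference)   # clear in both
    fp = np.sum(detected & ~reference)    # clear labeled as cloud
    fn = np.sum(~detected & reference)    # cloud labeled as clear
    n_cloud = np.sum(reference)           # cloud pixels in reference data
    n_clear = reference.size - n_cloud    # clear-sky pixels in reference data
    return {
        "cloud_correct": tp / n_cloud,    # (3)
        "clear_correct": tn / n_clear,    # (4)
        "error_rate": fp / n_clear,       # (5)
        "missing_rate": fn / n_cloud,     # (6)
    }
```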

C. Cloud-Free Image Reconstruction
To ensure continuity in the reconstructed image, spectral correction is performed to account for variations in imaging conditions across different images. This step ensures that the spectral characteristics of the reconstructed image are consistent and comparable across regions and time periods. Image registration can be completed simply by using the precise coordinate transformation of the remote sensing images [58]. The subsequent multispectral and multitemporal problems are then solved through spectral matching on the one hand and weighting by temporal distance on the other.
As depicted in Fig. 4, the proposed method uses the original image as the reconstruction target and the complementary image as the source of supplementary data. However, due to varying imaging conditions, these images may exhibit significant spectral differences, which can strongly affect subsequent quantitative analysis [59]. To mitigate this issue, the proposed method employs a histogram matching technique, which uses the statistical features of pixels in cloud-free areas to correct the spectra of the complementary images [60]. By doing so, the proposed method maintains high fusion accuracy even in the presence of spatiotemporal changes and produces relatively stable results, even when the input image is affected by thin clouds [31]. After spectral matching, the cloud-free region of the complementary image shows more than 30% higher spectral similarity with the original image, making the complementary image a more reliable source of supplementary data.
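Histogram matching over the jointly cloud-free pixels can be sketched as a quantile mapping; this single-band sketch (applied per band in practice) is an assumption about the implementation, using `np.interp` to send each complement value to the target value at the same rank.

```python
import numpy as np

def histogram_match(comp_band, target_band, clear_both):
    """Map the complementary band so its distribution over jointly
    cloud-free pixels (`clear_both` boolean mask) matches the target's."""
    src = np.sort(comp_band[clear_both])    # complement values, ranked
    dst = np.sort(target_band[clear_both])  # target values, same ranks
    # Quantile mapping: every complement value is interpolated onto the
    # target value occupying the same rank in the cloud-free overlap.
    return np.interp(comp_band, src, dst)
```

Matching only over pixels that are clear in both images keeps cloud-contaminated values from distorting the transfer function.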
After histogram matching, the pixels are mosaicked step-by-step according to the imaging date for reconstruction [61]. Because of the particular offset of each imaging position of the satellite, direct reconstruction is difficult [41]. However, the overall distribution of the ground, particularly that of vegetation and crops, does not change significantly over short periods, so it is feasible to mosaic pixels according to their coordinates. To minimize the difference between mosaicked pixels and original pixels, cloudy pixels are filled from cloud-free images of similar dates, ordered by the time difference between the complementary images and the original image [62]. If a candidate pixel is itself identified as cloud, the pixel is instead supplemented from the image with the next-closest date.
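The nearest-date mosaicking step can be sketched as follows; the `(image, clear_mask, days_from_target)` tuple interface is an illustrative assumption, with images given as numpy arrays already histogram-matched to the target.

```python
import numpy as np

def mosaic_fill(target, cloud_mask, complements):
    """Fill cloudy target pixels from complementary images, visiting the
    complements in order of temporal distance to the target date."""
    result = target.copy()
    unfilled = cloud_mask.copy()
    for img, clear, _ in sorted(complements, key=lambda c: abs(c[2])):
        fill = unfilled & clear        # cloudy in target, clear in complement
        result[fill] = img[fill]
        unfilled &= ~clear             # remaining gaps go to later dates
    return result, unfilled            # `unfilled` marks still-missing pixels
```

Pixels that remain unfilled after all complements are exhausted correspond to areas cloudy in every available scene.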
III. RESULTS AND DISCUSSION

In this section, we present the results and discussion of our proposed method. We conducted two experiments to evaluate the effectiveness of our method. The first experiment involved cloud detection, where we targeted clouds in different regions and underlying surfaces, and compared the results of our T-C (threshold-correlation) method with other methods. In the second experiment, we compared the reconstructed image with the target date image, analyzing the overall quality and data correlation between the two images. This comprehensive evaluation confirms the usefulness and general applicability of our method. The proposed algorithm was validated using a diverse set of Sentinel-2 MSI and Landsat 8 OLI images from various dates between 2016 and 2020, including images of thin, thick, and broken clouds, as well as various underlying surfaces such as vegetation, sand, and snow. Additionally, the validation was performed at various scales to ensure broad applicability. The selection of images acquired at different times and over different underlying surfaces makes the verification more thorough, accurate, and reliable.

A. Cloud Detection Results
In this section, we conducted an experiment to evaluate the effectiveness of the proposed cloud detection method on various regions and underlying surfaces. A total of 23 Landsat-8 OLI images and 32 Sentinel-2 MSI images were used for validation. Fig. 5 illustrates the cloud detection results of Sentinel-2, with the left image in each group showing a false-color composite MSI image and the right image showing the corresponding cloud detection result, where the red parts represent cloudy pixels and the rest represent clear-sky pixels. The first three groups show detection results for vegetation and urban surfaces, demonstrating the satisfactory performance of the method. The detection results for offshore areas, forests, cities, bare soil, and oceans in the fifth to seventh groups also indicate accurate detection of clouds without confusion with land or ocean. In the eighth to tenth groups, the detection results for snowy mountains, where the spectra of snow and clouds are easily confused, are conservative but retain complementary information. These results demonstrate the applicability of the proposed method across diverse underlying surfaces.

Table I presents the statistics for the cloud detection results obtained from the two satellites. The accuracy evaluation matrix indicates that the proposed method achieves a high level of accuracy for both satellites, with cloud detection accuracy greater than 0.9 and low error and missed detection rates, which is sufficient for the reconstruction of cloud-free images. It should be noted that the detection accuracy may be affected by the thickness of the clouds, with thicker clouds yielding a higher accuracy rate. Additionally, Table I shows that the T-C method has advantages over other simple cloud detection algorithms, such as dynamic thresholding and the use of multiple features, while remaining lightweight, simple, and effective.
The results in Tables II and III indicate that the threshold-correlation (T-C) algorithm has certain advantages over the other two algorithms, dynamic thresholding and multifeature SVM. The overall cloud coverage detected by T-C is higher than that of the others, and its detection accuracy relative to the official algorithm is also higher, suggesting that T-C is more sensitive to thin clouds and yields a larger coverage rate than the official results. Furthermore, the clouds detected by T-C generally cover the official detection results without significant differences in undetected cloud layers, indicating that the T-C algorithm produces plausible and usable results suitable for practical applications.

B. Cloud-Free Image Reconstruction
In this section, we conducted an experiment to evaluate the effectiveness of the proposed cloud-free image reconstruction method using over 80 Sentinel-2 and Landsat-8 images. Specifically, we compared the proposed method with other existing methods, namely MNSPI [44], LLHM [45], and WLR [46], on the same surfaces. The results of the experiments, as shown in Fig. 6, demonstrate the practicability of the proposed framework under various conditions.
In most cases, the proposed method is effective in reconstructing the cloud-free areas of the original image, as long as the local weather is not excessively cloudy. However, in instances where most areas are covered by clouds for an extended period of time, the number of cloud-free pixel references is limited and the credibility of the reconstruction is not high. It is important to note that this problem cannot be solved using traditional optical remote sensing methods, and alternative methods such as using microwave or other bands that are not affected by clouds must be considered [63], [64].
Furthermore, as shown in Fig. 6, much of the original image data used in the experiments was redundant, wasting a significant amount of download and storage effort. A more efficient solution is to obtain cloud masks of the target area from multiple images and then select the images with the most significant differences in cloud cover positions. In this way, it is possible to reconstruct cloud-free images using only a few images. The method involves obtaining the cloud mask of the original image area and then downloading the cloud masks of other images (such masks are generally provided and are small in size). By inverting the masks of these complementary data, the clear-pixel masks can be obtained. The original data's cloud mask and each complementary data's clear mask are then intersected and sorted according to the intersection size, and the complementary data with the largest intersection is selected. This approach allows hundreds of complementary data to be screened without downloading the complete images. An example is the following cloud-free image of southern Tibet, which was synthesized using only four images.

Fig. 7 illustrates the reconstructed cloud-free image in the southern region of the Himalayas. Despite the significant local variations in spectra and radiation, the complexity of the area, and the large changes in altitude, the proposed reconstruction method demonstrates its effectiveness. Surface features such as rivers and mountains are clearly visible and consistent, and a majority of the information obstructed by clouds has been effectively recovered. These results demonstrate the efficacy of our method in reconstructing cloud-free images in large-scale and multitemporal scenes.
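The screening step described above can be sketched as a greedy selection over cloud masks alone, without downloading full images; the list-of-masks interface is an illustrative assumption (True = cloud).

```python
import numpy as np

def select_complements(target_cloud_mask, candidate_cloud_masks):
    """Greedily choose candidate scenes whose clear areas cover the most
    still-cloudy target pixels; returns indices in selection order plus
    the mask of pixels no candidate can cover."""
    remaining = target_cloud_mask.copy()
    chosen = []
    while remaining.any():
        # Gain of a candidate = still-cloudy target pixels it sees clearly
        # (its clear mask is the inverse of its cloud mask).
        gains = [np.sum(remaining & ~m) for m in candidate_cloud_masks]
        best = int(np.argmax(gains))
        if gains[best] == 0:        # no candidate helps any further
            break
        chosen.append(best)
        remaining &= candidate_cloud_masks[best]  # keep only uncovered pixels
    return chosen, remaining
```

Sorting candidates by intersection size and stopping once the target mask is covered is what allows a scene like the southern Tibet example to be synthesized from only a handful of images.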
The results presented in Fig. 8 demonstrate that the proposed method [Fig. 8(b)] reconstructs an image closer to the original clear image [Fig. 8(a)] than the other methods do. The LLHM method [Fig. 8(e)] produces a recovery result with serious spectral distortion in the cloud-contaminated region, particularly around the river and mountains. This outcome indicates that simple histogram matching is inadequate when the two images cover complex terrain and have large spectral differences. The results of the MNSPI and WLR methods [Fig. 8(c) and (d)] are similar, and most of the ground features are well recovered. However, in the mountain region, particularly where there is snow, the spectral characteristics differ from those of the remaining area, which may also be due to the large spectral differences between the target and reference images. The proposed method produces the most plausible visual result: detailed information is well recovered while unnecessary noise is effectively suppressed. Notably, the original and reference images were acquired in different seasons, so the spectral characteristics of the ground features changed significantly. The LLHM, MNSPI, and WLR methods cannot handle this well, leading to more errors in their results. In contrast, the proposed method utilizes similar-pixel offsets and the radiometric information of the image itself to fill the missing region, enabling it to better address this issue. The quantitative assessment in Table IV also confirms the clear superiority of the proposed method.
Table IV also reports the average efficiency of these cloud removal methods. Considering complex landforms, climate, and other environmental conditions, the method proposed in this article proves to be a simple and effective cloud-free image reconstruction technique.
As the comparison shows, our proposed method has distinct advantages over alternative approaches such as the harmonic analysis of time series (HATS) algorithm and the multichannel singular spectrum analysis (m-SSA) method. The HATS algorithm is primarily used for reconstructing single-band data and is sensitive to even slight changes in imaging conditions, such as spectral variation, which can produce significant differences [18]. Similarly, the m-SSA method may fail to reconstruct data with missing values and complex correlations. Deep learning methods, such as cloud-free sea surface temperature image reconstruction based on abnormal network repair, are also limited to single bands and can be costly to implement [65]. Furthermore, sparse reconstruction based on random samples, such as sparse unmixing-based denoising for single-scene cloud removal, struggles in the presence of polluted pixel information (e.g., haze) and does not use the time-sequence information of multiple scenes [66]. Our proposed method, on the other hand, is effective in reconstructing cloud-free images in large-scale, multitemporal scenes while remaining computationally efficient.

C. Usability of the Reconstructed Data
The effectiveness of the proposed method is further illustrated by quantitative assessments using the root-mean-square error (RMSE) and correlation coefficient (CC) indices defined in (7) and (8). The RMSE index measures the average difference between the values of the recovered pixels and the corresponding true values in the original image, while the CC index measures the linear relationship between the recovered and original pixel values. To evaluate the usability of the reconstructed data for further applications, the Normalized Difference Vegetation Index (NDVI) of the recovered images is compared in the second experiment. The scatterplots presented in Fig. 9 suggest that the LLHM [45] method estimates the values in cloud-contaminated regions with larger errors, whereas the MNSPI [44] and WLR [46] methods agree better with the original image. However, the proposed STMRF method still outperforms the other methods. As illustrated in Fig. 9, the R² value of the LLHM result is an unsatisfactory −0.375. In contrast, the MNSPI and WLR results increase to 0.021 and 0.386, respectively, and the proposed method achieves the highest value of 0.843. These results demonstrate that the cloud removal results obtained from the proposed method can significantly enhance support for further applications. The method has proven effective in dealing with significant changes in multitemporal images. However, it is most suitable for areas with dense, random cloud cover rather than sites where clouds remain in a fixed range for an extended period. Additionally, because it relies on time-series imaging, the method may not be appropriate for areas whose ground spectra differ significantly between acquisitions.
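For concreteness, the RMSE, CC, and NDVI quantities used in this assessment can be computed as follows. This is a minimal numpy sketch with standard definitions; the function names and the small epsilon guarding division by zero are illustrative, not part of the original implementation:

```python
import numpy as np

def rmse(orig, rec):
    """Root-mean-square error between original and recovered pixel values."""
    orig, rec = np.asarray(orig, float), np.asarray(rec, float)
    return float(np.sqrt(np.mean((orig - rec) ** 2)))

def cc(orig, rec):
    """Pearson correlation coefficient between the two pixel sets."""
    orig, rec = np.asarray(orig, float).ravel(), np.asarray(rec, float).ravel()
    return float(np.corrcoef(orig, rec)[0, 1])

def ndvi(nir, red, eps=1e-10):
    """Normalized difference vegetation index from NIR and red reflectance."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red + eps)
```

The R² values of the NDVI scatterplots are then obtained by regressing the recovered NDVI against the NDVI of the true cloud-free image.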
It is also important to note that the revisit cycle of the satellite used for reconstruction should not be too long, so as to ensure the continuity of the reconstructed data. Furthermore, the proposed detection method relies on the sensor providing a shortwave-infrared band to realize coarse cloud screening. In the event of frequent and extensive cloud cover at the imaging site, combined with unfavorable atmospheric conditions, the complementary images obtained may provide limited effective information. In such cases, long-term multisource images from multiple satellite platforms may be required to collect sufficient cloud-free pixels to restore the surface image effectively, or it may be necessary to resort to microwave remote sensing.

D. Efficiency Evaluation of Proposed Methods
To evaluate the efficiency of the proposed cloud removal method, we compared its processing time with that of the three comparison methods. We selected remote sensing images from three different satellites, listed in Table V. We ran all the methods on a PC configured with an Intel Core i7-9700K CPU and 16 GB RAM, and recorded the time required for each method to process these images. Table VI reports the processing times in seconds.
From Table VI, it can be seen that the proposed cloud removal method has the fastest processing speed on all images, far faster than the other three methods. This indicates that the proposed method can not only effectively restore information in cloud-covered areas but also has high computational efficiency, making it suitable for large-scale and real-time remote sensing cloud removal applications. Although the other three methods can also achieve a certain degree of cloud removal, their processing times are substantially longer.
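Wall-clock comparisons like those in Table VI can be collected with a simple harness such as the one below. This is a generic sketch; `method` stands for any of the cloud removal functions, which are not reproduced here:

```python
import time

def time_method(method, *args, repeats=3):
    """Run `method(*args)` several times and return the best wall-clock
    time in seconds (the minimum is least affected by background load)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        method(*args)                       # run the method under test
        best = min(best, time.perf_counter() - start)
    return best
```

Using the best of several repeats reduces the influence of disk caching and other processes on the reported times.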

E. Data Demand for Reconstruction
To further evaluate the performance of the proposed cloud-free image reconstruction method, we discuss some issues related to its data requirements in the following sections.
First, we assess the number of images needed for the reconstruction. Since the proposed method is based on spatio-temporal recovery, sufficient supplementary data are needed to fill in the information in the cloud-covered areas. Specifically, the number of images depends on the cloud cover rate, the cloud cover locations, and the target date, which are not easy to quantify. Generally speaking, the higher the cloud cover rate, the more dispersed the cloud cover locations, and the closer the target date is to the cloudy season, the more images are needed. To quantify this, we designed an experiment that selected a cloud-covered image from the Landsat 8 OLI dataset as the target image and randomly selected different numbers of supplementary images of the same area from different dates. Using the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), two indicators of image similarity, we computed the metrics between the reconstructed image and the real cloud-free image under different numbers of supplementary images and plotted the curves shown in Fig. 10. As the number of supplementary images increases, the quality of the reconstructed image improves; however, beyond a certain number, the reconstruction quality saturates. This indicates that the proposed method can achieve good reconstruction results with a limited number of supplementary images, without requiring a large amount of data. Specifically, in this experiment, covering 90% of the cloud-covered area in the original image generally required eight images, at which point the reconstructed image achieved high PSNR and SSIM values. We therefore recommend using around eight supplementary images with the proposed method.
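For reference, PSNR and a simplified global SSIM can be computed as below. This is a sketch that evaluates SSIM over a single window covering the whole image, whereas the standard form averages the statistic over local sliding windows; the constants follow the common K1 = 0.01, K2 = 0.03 choice:

```python
import numpy as np

def psnr(ref, rec, peak=None):
    """Peak signal-to-noise ratio in dB between reference and reconstruction."""
    ref, rec = np.asarray(ref, float), np.asarray(rec, float)
    peak = ref.max() if peak is None else peak
    mse = np.mean((ref - rec) ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(peak ** 2 / mse))

def ssim_global(ref, rec, data_range=None):
    """Global (single-window) SSIM; a simplification of the usual
    locally windowed formulation."""
    ref, rec = np.asarray(ref, float), np.asarray(rec, float)
    L = (ref.max() - ref.min()) if data_range is None else data_range
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2   # stabilizing constants
    mx, my = ref.mean(), rec.mean()
    vx, vy = ref.var(), rec.var()
    cov = np.mean((ref - mx) * (rec - my))
    return float(((2 * mx * my + c1) * (2 * cov + c2)) /
                 ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

Both metrics increase as the reconstruction approaches the true cloud-free image, which is why the curves in Fig. 10 saturate once additional supplementary images stop adding new clear pixels.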
Second, we discuss the applicability of the proposed method in rapidly changing regions. Since the method is based on spatio-temporal recovery, it assumes that the target region has similar or stable spectral features at different time points. In some rapidly changing regions, such as crop fields and water bodies, this assumption may not hold. To verify this, we selected two remote sensing images containing crop fields and water bodies as experimental data and applied the proposed method and the three comparative methods for cloud-free image reconstruction. We calculated the PSNR and SSIM metrics between the reconstructed image and the real cloud-free image in the crop and water regions, and present the results in Table VII. In the crop region, the proposed method still obtains the highest PSNR and SSIM values, indicating that it can adapt to changes there and maintain good reconstruction results. In the water region, however, the PSNR and SSIM values of the proposed method are significantly lower than those of the other three methods, indicating that it performs poorly there. This may be because factors such as ripples, reflections, and turbidity cause large spectral differences in the water region at different time points. Cloud-free image reconstruction over water therefore requires further improvement of the proposed method, or methods better suited to water features. In summary, the proposed method can effectively recover the information of cloud-covered areas with a limited number (around eight) of relatively stable supplementary images, and it is applicable to cloud-free image reconstruction in most regions. For rapidly changing regions such as water bodies, however, its reconstruction performance is poor, and further research and improvement are required.

IV. CONCLUSION
When optical remotely sensed images are affected by cloud cover, much ground information cannot be acquired, which significantly limits their application. Despite the abundance of remotely sensed images available today, high-quality images that provide complete, clear ground information remain a sought-after goal. This article therefore proposes a new and effective method to remove clouds and accurately reconstruct ground information. Missing pixels are replaced with similar pixels from the remaining regions of the cloud-contaminated image, with another complementary temporal image serving as a reference to locate the similar pixels. To select the most appropriate similar pixels, we build a time-histogram select-and-match model and solve for its optimal solution, which represents the optimal combination of similar pixels to replace all missing pixels.
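The reference-and-replace step described above can be illustrated with standard histogram matching followed by masked substitution. This is a minimal single-band sketch; the full method's similar-pixel selection and time-histogram model are not reproduced here, and the function names are illustrative:

```python
import numpy as np

def histogram_match(source, template):
    """Remap `source` values so their empirical CDF matches `template`'s."""
    s_vals, s_idx, s_cnt = np.unique(source.ravel(),
                                     return_inverse=True, return_counts=True)
    t_vals, t_cnt = np.unique(template.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_cnt) / source.size
    t_cdf = np.cumsum(t_cnt) / template.size
    matched = np.interp(s_cdf, t_cdf, t_vals)  # map source quantiles to template values
    return matched[s_idx].reshape(source.shape)

def fill_clouds(target, complement, cloud_mask):
    """Radiometrically adjust the complementary image to the target's
    cloud-free pixels, then substitute it under the cloud mask."""
    adjusted = histogram_match(complement, target[~cloud_mask])
    out = target.astype(float).copy()
    out[cloud_mask] = adjusted[cloud_mask]     # only cloudy pixels are replaced
    return out
```

Matching the complementary image against only the target's cloud-free pixels reduces the radiometric seam between the filled region and its surroundings.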
The experimental results demonstrate that incorporating spatio-temporal information in image reconstruction significantly enhances the accuracy compared to conventional methods. Furthermore, the proposed method effectively utilizes the radiometric information within the image itself to reconstruct missing data, thereby demonstrating its robustness to various atmospheric conditions and seasonal changes in multitemporal images. Overall, the proposed method is capable of achieving temporally, spatially, and spectrally coherent reconstruction.
However, the proposed method has certain limitations. It performs well in handling significant changes in multitemporal images. However, when the time interval is very short and the atmospheric conditions are similar, the acquired multitemporal images may be very similar or their changes may be simple and linear. In such cases, conventional methods such as fitting, matching, and regression can obtain excellent reconstruction results and their computations are fast. Therefore, compared to these conventional methods, the proposed method may not have a distinct advantage and may be time-consuming. To address this issue, we plan to combine the proposed method with traditional methods to handle different levels of changes in multitemporal images in our future work.