Mapping Dongting Lake Wetland Utilizing Time Series Similarity, Statistical Texture, and Superpixels With Sentinel-1 SAR Data

Wetlands have significant ecological value, and mapping them at large scale is both important and challenging. Sentinel-1 can continuously record wetland changes thanks to its all-day, all-weather imaging capability. How to fully exploit the multidimensional information in synthetic aperture radar (SAR) images, such as temporal and statistical texture information, to classify wetlands more accurately has become a research focus. This article therefore constructs time series similarity parameters to describe the temporal change of wetland targets and introduces G0 statistical texture parameters to describe the texture characteristics of SAR images. Combining multiple features with superpixels, we propose a classification method based on random forest (RF) to map the Dongting Lake wetland. The overall classification accuracy was 97.57%, and the Kappa coefficient was 0.97. The classification accuracy of reed beach and grass marsh was above 95%. The results showed that the proposed SAR time series similarity features effectively exploit the dynamic information among classes and help to identify the highly dynamic mudflat, grass marsh, and reed beach classes. The introduced statistical texture features express the heterogeneity of targets and enhance the recognition and extraction of forest beach, mudflat, and reed beach. Compared with the support vector machine, decision tree, and plain RF classifiers, RF optimized with superpixels not only achieved high accuracy but also effectively reduced salt-and-pepper errors, because it takes the context information of superpixels into account.


I. INTRODUCTION
Wetland is an important component of the earth's ecology [1]. In recent years, climate change and human activities have disrupted the balanced development of wetlands. Timely and accurate dynamic information on wetlands is urgently needed for ecological environment protection and management [2]. Accurately characterizing wetland change is conducive to sustainable wetland protection [3]. Remote sensing monitoring of wetlands avoids the low efficiency and high cost of traditional manual field surveys [4]. The land cover of wetlands is complex and diverse and changes easily with time, so monitoring requires images with high temporal resolution [5], [6]. Optical remote sensing is easily disturbed by weather, which makes it difficult to obtain valid observations reliably and hinders continuous monitoring and analysis [7]. With its all-day and all-weather capability, synthetic aperture radar (SAR) can continuously and stably acquire earth observation data, captures information on vegetation structure and the layer beneath the canopy, and is highly sensitive to water and soil moisture [8]. The backscatter coefficients of wetland woodlands and grasslands are significantly related to temporal changes in water level [9]. Consequently, time series SAR data have great potential for wetland classification and monitoring [10]. Sentinel-1 SAR data offer global coverage free of charge with a 12/6-day revisit period, which benefits wetland classification and monitoring with dense time series [11].
Fully exploiting the characteristics of wetlands in SAR images is the key to classifying them accurately and monitoring them effectively. First, the backscattering intensity describes how different targets respond to the radar pulse [12], [13], [14], [15], [16], [17]. Second, texture can add a further dimension to the SAR information, but the textures used in SAR wetland classification have commonly been calculated from the gray-level co-occurrence matrix (GLCM) [18], [19], which seldom considers the statistical distribution of SAR data. Existing studies [20], [21] have shown that SAR statistical texture features reflect the uniformity of targets and perform better than GLCM features in classification applications. Finally, spatial context features have been applied to wetland classification through object-oriented analysis, with results closer to the real land cover [22]. Superpixel analysis facilitates the use of spatial and contextual features and has great potential in SAR image classification [23], [24], [25], [26].
Temporal information is valuable for wetland classification because wetlands are highly dynamic. SAR time series have proved to be an ideal data set for capturing wetland dynamics [15], [27], [28]. A common strategy for processing multitemporal SAR data is to calculate the maximum, minimum, or standard deviation as temporal characteristics [29], [30], [31]. However, such summary statistics may miss part of the time series change information. Using the long-term observation data directly retains complete and rich temporal information on wetlands, and the results are more accurate [11], [32].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Each feature has its own advantages in wetland classification, and combining multiple features exploits these advantages to improve classification accuracy. Mohammadimanesh et al. [33] showed that the classification accuracy of random forest (RF) could be improved by combining SAR intensity, interferometry, and polarimetry data. Wetland classification with multiple features is an important trend.
Therefore, the purpose of this article was to map wetlands by combining temporal features, statistical texture, and spatial context from annual Sentinel-1 SAR time series images. The specific objectives were as follows. 1) Propose time series similarity features and explore their effectiveness in wetland classification. 2) Introduce G0 statistical texture features into wetland classification and analyze their effectiveness. 3) Combining these features, propose an RF-based wetland classification method that uses the spatial context information of superpixels.

A. Study Area
Dongting Lake, the second largest freshwater lake in China, is located in the middle and lower reaches of the Yangtze River, south of the Jingjiang River. Its terrain is high in the northwest and low in the southeast. The area has a subtropical monsoon climate, with four distinct seasons and abundant rainfall. Summer is the peak rainfall period and the season most prone to floods. The beach of Dongting Lake is mostly fertile marshy soil, which is conducive to the growth of surface cover such as sedges, reeds, and trees. Except for a few natural shorelines, Dongting Lake is bounded by permanent levees. In this article, the study area is delineated with the levee as the boundary and is divided into three parts: east, south, and west (see Fig. 1).

B. Experimental Data
The experimental data are the "COPERNICUS/S1_GRD" collection on the Google Earth Engine (GEE) platform, acquired by Sentinel-1A, a C-band satellite launched by ESA, at a resolution of 10 m. Over the study area, the satellite acquires VV and VH dual-polarization data with a 12-day revisit period. To keep the sensor parameters consistent, only data from the same orbit covering the study area were selected: 29 scenes in 2018, 30 scenes in 2019, 25 scenes in 2020, and 26 scenes in 2021. A separate SAR time series was composed for each year.

C. Reference Data
Referring to the Chinese national standards for land use status classification (GB/T 21010-2007) and wetland classification (GB/T 24708-2009), and based on field investigation, the wetland in the study area was divided into five classes: forest beach, reed beach, grass marsh, water body, and mudflat. Using the site locations from the field investigation, high-resolution images in Google Earth, and Sentinel-1/2 images, reference samples were prepared for training and verification and marked on GEE. Their spatial distribution is shown in Fig. 2, and the number of samples is given in Table I. Field photos of the different classes are shown in Fig. 3.

III. METHODOLOGY
The workflow of the method is shown in Fig. 4. First, the time series similarity (TSS), G0 statistical texture (G0_ST), and backscatter intensity (BI) features are extracted from the Sentinel-1 SAR time series images and combined for training and classification. Then, RF classification is carried out, followed by optimization with simple non-iterative clustering (SNIC) superpixel segmentation to obtain the final wetland classification result.

A. Time Series Similarity Feature
Yang et al. [34] defined a similarity parameter between a target and a typical scatterer, which has been applied to classification and target detection. In this article, we transferred the calculation from the correlation matrix to time series SAR intensity data and propose time series similarity parameters to describe the similarity between long-term backscattering intensity sequences. In the resulting formula (1), t is the time series backscattering intensity vector of the target, t_i is its value at time i, t_c is the time series backscattering intensity vector of a sample class, t_ci is its value at time i, and N is the number of images in the series. A TSS value equal or close to 1 indicates that the target very likely belongs to that class. If it does not, the value deviates from 1, and the further it is from 1, the higher the probability that the target belongs to a different class.
Taking the mean values of the training samples of each class as the reference sequences, the TSS features of the different wetland classes under VV and VH polarization were calculated according to (1).
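Equation (1) is not reproduced in this text. As a minimal sketch, assuming that TSS takes the normalized squared inner product form of the similarity parameter in [34], applied to intensity sequences, that is, TSS = (Σ t_i t_ci)² / (Σ t_i² · Σ t_ci²), the feature could be computed as follows. The function names `tss` and `class_reference` are hypothetical, and the actual definition in (1) may differ.

```python
def class_reference(samples):
    """Per-date mean of the training sequences of one class.

    samples: list of equal-length intensity sequences (one per training pixel).
    """
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def tss(t, t_c):
    """Assumed similarity: squared inner product divided by both energies.

    Returns 1 when the two sequences are proportional and a value
    below 1 otherwise (Cauchy-Schwarz inequality).
    """
    if len(t) != len(t_c):
        raise ValueError("both sequences must cover the same N dates")
    dot = sum(a * b for a, b in zip(t, t_c))
    energy_t = sum(a * a for a in t)
    energy_c = sum(b * b for b in t_c)
    if energy_t == 0 or energy_c == 0:
        raise ValueError("an all-zero sequence has no defined similarity")
    return dot * dot / (energy_t * energy_c)
```

Under this assumed form, one TSS layer per reference class and polarization is obtained by evaluating `tss` for every pixel sequence against the class reference sequence.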

B. Statistical Texture Feature
Wetlands are complex and diverse, with high heterogeneity. Because the G0 distribution model can meet the modeling requirements of complex, highly heterogeneous scenes [16], [20], it was selected for statistical modeling and analysis of the SAR data covering the wetlands. In the probability density function of the G0 distribution, (2), C is the SAR covariance matrix data, L is the number of looks, λ is the texture parameter, Γ(·) is the gamma function, Σ is the mean value, tr(·) is the trace of a matrix, and d is the polarization dimension of the SAR data: d = 1 for single polarization, d = 2 for dual polarization, and d = 3 for full polarization. In this article, the Sentinel-1 SAR data were modeled and analyzed on the single-polarization VV and VH channels.
When d = 1, (2) simplifies to the G0 distribution of the multilook single-polarization SAR intensity image I, whose probability density function is (3). Texture parameters describe the heterogeneity. Based on the second-moment estimation method, Chen et al. [21] constructed a G0 texture parameter estimator applicable to single-, dual-, and full-polarization SAR data, expressed as (4). When the SAR data are dual or full polarization, x = tr(Σ^{-1} C); when the SAR data are single polarization, x = I. Var and E denote the variance and the mean, respectively.
According to (4), the statistical texture of every image in the SAR series was calculated, giving the VV and VH statistical texture features.
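Equations (2)-(4) are not reproduced in this text. The following is a minimal sketch of a second-moment texture estimate for the single-polarization case, under stated assumptions: the intensity follows the product model I = τ·s with gamma speckle s (shape L, mean σ) and unit-mean inverse-gamma texture τ (shape λ > 2). Under these assumptions Var(I)/E(I)² = (λ − 1)(L + 1)/((λ − 2)L) − 1, which gives λ = (2L·Var(I) + (L − 1)·E(I)²) / (L·Var(I) − E(I)²). This parameterization, and the names `g0_texture` and `simulate_g0`, are assumptions and may not match expression (4) of Chen et al. [21] exactly.

```python
import random
import statistics


def g0_texture(intensity, looks):
    """Moment estimate of the texture parameter lambda for one window.

    Returns infinity when the window shows no variation beyond fully
    developed speckle (homogeneous area under the assumed model).
    """
    mean = statistics.fmean(intensity)
    var = statistics.pvariance(intensity, mu=mean)
    denom = looks * var - mean * mean
    if denom <= 0:
        return float("inf")
    return (2 * looks * var + (looks - 1) * mean * mean) / denom


def simulate_g0(n, looks, lam, sigma=1.0, seed=0):
    """Draw n intensities from the assumed product model (for checking)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        speckle = rng.gammavariate(looks, sigma / looks)   # mean sigma, shape L
        texture = (lam - 1) / rng.gammavariate(lam, 1.0)   # inverse gamma, mean 1
        out.append(speckle * texture)
    return out
```

In this parameterization a smaller λ corresponds to a more heterogeneous window, and applying `g0_texture` in a sliding window to each VV and VH image would yield one texture layer per scene and polarization.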

C. RF Classification and SNIC Superpixels Optimization
RF is built on decision trees: it combines bagging with random feature selection, reconciles the class predictions of the individual trees through voting, and thereby reduces random error. RF can process high-dimensional data, is resistant to overfitting, and has strong generalization ability. It is the most widely used classifier in wetland classification applications. In this article, backscattering intensity, statistical texture, and TSS were combined as the input feature set of RF. The number of trees was set to 50, and the contribution of each feature to the classification was also output.
After RF classification, SNIC in GEE was used to obtain superpixels, and a vote was then taken within each superpixel to optimize the classification: the most frequent class in each superpixel became its final class. The SNIC algorithm controls the fineness of the segmentation through the initial number of seed points, the compactness coefficient, and the connectivity. The segmentation parameters in the experiment were as follows: the seed number was 30 in the west of the wetland, 15 in the south, and 20 in the east; the compactness was 50; and the connectivity was 8.
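The voting step within superpixels can be sketched as follows. This is an illustrative standalone version, not the GEE implementation used in the experiments; it assumes the pixel-level RF labels and the superpixel identifiers are available as two grids of equal shape, and the function name `vote_within_superpixels` is hypothetical.

```python
from collections import Counter, defaultdict


def vote_within_superpixels(labels, segments):
    """Replace each pixel label by the majority label of its superpixel.

    labels:   2D list of class labels from the pixel-level classifier.
    segments: 2D list of superpixel ids with the same shape.
    Ties are resolved in favour of the label encountered first.
    """
    votes = defaultdict(Counter)
    for label_row, segment_row in zip(labels, segments):
        for label, segment in zip(label_row, segment_row):
            votes[segment][label] += 1
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}
    return [[winner[seg] for seg in segment_row] for segment_row in segments]
```

For example, a superpixel holding three pixels of class 1 and one isolated pixel of class 3 is relabelled entirely as class 1, which is how isolated salt-and-pepper pixels are removed.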

IV. EXPERIMENTAL RESULTS AND ANALYSIS
The confusion matrix was generated from the verification data set, and the overall accuracy (OA), producer accuracy (Pro), and user accuracy (User) were calculated to quantitatively evaluate the effectiveness of the method. In addition, comparative experiments were conducted with the TSS features, the texture features, and the classifier as the respective variables (see Table II).

A. Classification Results and Accuracy Evaluation
The Sentinel-1 time series images of the Dongting Lake wetland in 2020 were used to verify the proposed SAR wetland classification method, and the results were quantitatively evaluated with the confusion matrix. Fig. 5 shows the classification results. Fig. 5(a) is the result of M-7, in which reed is widely distributed in the east, south, and west of the Dongting Lake wetland and is one of the dominant land cover types. Grass marsh was mostly distributed near the water, mainly in the middle of east and south Dongting Lake. Forest beaches were mainly distributed in the western and southern regions, and mudflats were mainly distributed near the waters of east Dongting Lake. The water body runs through the Dongting Lake wetland, with open water and large runoff in the east and south and many tributaries in the west. Table III is the confusion matrix obtained from the verification set. The OA of the classification was 97.57%, Kappa = 0.9693, and the accuracy of each class was above 95%, indicating that the proposed method can effectively identify wetland land cover classes.
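The accuracy measures reported above follow from a confusion matrix in the standard way. The sketch below assumes rows hold the reference classes and columns the predicted classes, and that every class occurs at least once in both; the function name `accuracy_report` and the 2-class matrix in the usage note are hypothetical and are not the values of Table III.

```python
def accuracy_report(cm):
    """Overall, producer, and user accuracy plus Cohen's Kappa.

    cm[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sum = [sum(row) for row in cm]
    col_sum = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    diag = [cm[i][i] for i in range(k)]

    oa = sum(diag) / total
    producer = [diag[i] / row_sum[i] for i in range(k)]   # 1 - omission error
    user = [diag[j] / col_sum[j] for j in range(k)]       # 1 - commission error
    chance = sum(r * c for r, c in zip(row_sum, col_sum)) / total ** 2
    kappa = (oa - chance) / (1 - chance)
    return oa, producer, user, kappa
```

With the hypothetical matrix `[[45, 5], [10, 40]]`, the function gives an OA of 0.85, producer accuracies of 0.9 and 0.8, and a Kappa of 0.7.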

B. Comparative Experiments and Characteristic Analysis
The classification results of the comparison methods are shown in Fig. 5(b)-(g). The class distributions obtained by the different methods were very similar. Even the lowest OA in Table IV reached 87.37%, showing high reliability. However, the different combinations of features and classifiers did lead to differences in the wetland classification.
First, after introducing TSS features (M-3 versus M-1), the OA improved by more than 2%, reaching 94.34%. The producer accuracy of mudflat improved by 16.4%, and the user accuracy of grass marsh improved by about 2%. Mudflat and grass marsh are both almost completely flooded in the wet season, so their backscatter is similar. In the dry season, however, the lake water retreats and the surface cover of mudflat and grass marsh differs, which leads to dissimilar backscatter. The TSS features made full use of this change information to improve the classification accuracy. Compared with forest beach, the rhizome height, leaf density, and color of reed vary greatly over the growth cycle. The temporal information captured by the TSS features effectively reflects the growth state of the vegetation. The producer accuracy of reed beach increased by more than 1%, reaching 97.71%.
Second, after introducing the G0 statistical texture features (M-2 versus M-1), the OA improved by 3.65% to 95.81%. The producer accuracy of mudflat increased by 9.6%, that of forest beach by 6.2%, and the user accuracy of grass marsh by 5.1%. During the wet season, the mudflat is flooded. When the water retreats, the surface of the mudflat is smooth at first and becomes rough later. This change in surface roughness is captured by the multitemporal G0 statistical texture features. The surface roughness of reed beach is high during the budding and growth periods of reed but low in the mature period. Compared with reed, the image homogeneity of the poplar or willow planted on forest beach changes only slightly.
Third, after combining all features (M-6), the OA improved by 4.95% compared with the method using only intensity (M-1), by 2.77% compared with adding only TSS features (M-3), and by 1.3% compared with adding only statistical texture features (M-2). These results showed that combining multiple features effectively improves the accuracy of wetland classification. Furthermore, the OA of the RF classifier (M-6) was 9% higher than that of the SVM classifier (M-4) and 9.74% higher than that of the DT classifier (M-5). RF combines the results of multiple features and decision trees during classification, which reduces the random error and the classification noise in the results.
Finally, compared with M-6, M-7 added superpixels to exploit spatial context information, which further improved the OA to 97.57%. The spatial integrity of the mapped targets was enhanced, and salt-and-pepper noise, such as isolated reed pixels in forest beach and water pixels in mudflat, was reduced. The voting constraint within superpixels takes the geographical correlation of targets into account and strengthens the spatial connection between pixels. Therefore, RF with superpixels achieved high accuracy and effectively reduced salt-and-pepper errors.
The proposed method, which fuses TSS features, statistical texture features, and backscattering intensity, made full use of temporal and texture information and effectively improved the recognition of forest beach, reed beach, and mudflat. Superpixel optimization substantially reduced the salt-and-pepper effect.
However, the method still produced some errors, with nearly 5% of the water body misclassified. For example, in the western and southern areas of Dongting Lake, marked A and B in Fig. 5(a), local sections of rivers were truncated and identified as reed beach, grass marsh, or mudflat. The misclassified rivers are generally narrow, so the errors may be related to the resolution of Sentinel-1 and to SAR speckle. In addition, plants such as reeds along the rivers may also interfere with the accurate extraction of narrow rivers.

C. Multiyear Classification Results and Analysis
Using the proposed method and the Sentinel-1 image series on the GEE platform from 2018 to 2021, we obtained the wetland classification results year by year and calculated the area of each class to characterize the distribution and changes of wetland land cover.

V. DISCUSSION
To analyze how well the TSS and statistical texture features separate classes under the different SAR polarizations, the Jeffries-Matusita (J-M) distance between each pair of classes was calculated from the training sample set. For the statistical texture, the evaluation also considered the acquisition time. The J-M distance is a common index of class separability, with a range of [0, 2]; the larger the value, the stronger the separability.
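For a single feature, the J-M distance can be sketched as below. The text does not state which class-conditional model was used; this version assumes each class is Gaussian in the feature, so that J-M = 2(1 − e^(−B)), where B is the Bhattacharyya distance between the two fitted normal distributions. The function name `jm_distance` is hypothetical, and each sample list must have nonzero variance.

```python
import math
import statistics


def jm_distance(a, b):
    """J-M distance between two classes for one feature (Gaussian assumption).

    a, b: feature values of the training samples of the two classes.
    """
    m1, m2 = statistics.fmean(a), statistics.fmean(b)
    v1 = statistics.variance(a, m1)
    v2 = statistics.variance(b, m2)
    # Bhattacharyya distance between two univariate normal distributions
    bhattacharyya = (0.25 * (m1 - m2) ** 2 / (v1 + v2)
                     + 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2))))
    return 2.0 * (1.0 - math.exp(-bhattacharyya))
```

Identical distributions give a value of 0, and the value approaches the upper bound of 2 as the two classes become fully separable.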

A. Effect Analysis of Time Series Similarity Features
The J-M results of the TSS features under VV and VH polarization are shown in Fig. 8(a) and (b). TSS features based on VV polarization discriminated between targets better than those based on VH. In VV polarization, water body versus mudflat, forest beach versus mudflat, forest beach versus grass marsh, and grass marsh versus water body were well distinguished by TSS, with maximum J-M values exceeding 1.8. Reed beach versus mudflat was well discriminated by the TSS of reed and grass marsh but only moderately by the TSS of the other classes. The separability of reed beach versus grass marsh was moderate, with a J-M value of about 1. Forest beach versus water body and grass marsh versus mudflat were poorly separated: forest beach and permanent water body both change little, and grass marsh and mudflat follow a similar trend with flooding. In VH polarization, forest beach versus grass marsh and grass marsh versus water body could be distinguished well. The separability of forest beach versus reed beach, forest beach versus mudflat, and reed beach versus water body varied among the TSS features but was good in general. However, forest beach versus water body was again poorly separated, because both change little. In short, the proposed TSS features helped to identify and extract the highly dynamic mudflat, grass marsh, and reed beach classes, which is consistent with the results of M-3 versus M-1 in Table IV.

B. Effect Analysis of Statistical Texture Feature
The J-M results of the statistical texture features of VH and VV at different times are shown in Fig. 9(a) and (b). The closer the J-M distance curve is to a circle of radius 2, the better the discrimination between targets. In general, statistical texture features in VH polarization performed better than those in VV. For example, for forest beach versus mudflat, all J-M values were high in VH polarization, indicating good discrimination, whereas the values fluctuated in VV polarization and the minimum was small. The discrimination of reed beach versus water body and forest beach versus water body was very good in both VV and VH. Reed beach versus mudflat, forest beach versus grass marsh, and grass marsh versus water body were separated reasonably well, with J-M values above 1.5 in some periods. It was difficult to distinguish water body versus mudflat, reed beach versus grass marsh, forest beach versus reed beach, and grass marsh versus mudflat. Heterogeneity increases gradually in the order water body, mudflat, grass marsh, reed beach, and forest beach, and the statistical texture features distinguish better between classes whose heterogeneity differs greatly. This analysis showed that the introduced statistical texture enhances the recognition and extraction of forest beach, mudflat, and reed beach, which is consistent with the results of M-2 versus M-1 in Table IV. The discrimination also varied over time. For example, for forest beach versus mudflat in VV polarization, the J-M value was close to 2 from July to October and fluctuated between 1.5 and 1.8 from February to April, while it was lower in May, November, and December.

C. Importance Evaluation of RF
The larger the RF contribution value, the more important the feature. The 2020 Sentinel-1 data set contains 110 features: 50 statistical texture features, 50 backscatter intensity features, and 10 TSS features. The 30 features with the highest contribution values are shown in Fig. 10. Among them were 22 statistical texture features (a ratio of 22/50), 4 TSS features (4/10), and 4 backscattering intensity features (4/50). The top three were 1010-VVG0, 0916-VVG0, and T-TSSVV. The contributions of the statistical texture and TSS features were thus prominent in the classification. In addition, within the top 30, the statistical texture of October 10 had the largest contribution value, while February contributed the largest number of features. Eighteen of the top 30 features were calculated from VV and 12 from VH, and the contribution values of the VV-based features were also larger.
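The feature counts and ratios quoted above can be checked with a few lines of arithmetic. The group sizes come from the text; the decomposition of the 10 TSS features into five classes times two polarizations is an assumption, and the names `group_size`, `top30_count`, and `selection_ratio` are hypothetical.

```python
# Counts for the 2020 data set as stated in the text.
scenes_2020 = 25
polarizations = ("VV", "VH")
classes = ("forest beach", "reed beach", "grass marsh", "water body", "mudflat")

group_size = {
    "G0_ST": scenes_2020 * len(polarizations),  # one texture layer per scene and polarization
    "BI": scenes_2020 * len(polarizations),     # one intensity layer per scene and polarization
    "TSS": len(classes) * len(polarizations),   # assumed: one layer per class and polarization
}
top30_count = {"G0_ST": 22, "TSS": 4, "BI": 4}


def selection_ratio(selected, available):
    """Share of each feature group that appears among the selected features."""
    return {group: selected[group] / available[group] for group in available}


ratios = selection_ratio(top30_count, group_size)
```

The ratios are 0.44 for statistical texture, 0.40 for TSS, and 0.08 for backscatter intensity, so relative to group size the texture and TSS features are selected about five times as often as the intensity features.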

VI. CONCLUSION
In this article, we proposed an RF-based wetland classification method that combines time series similarity, G0 statistical texture, backscatter intensity, and spatial context information, making full use of the various features of time series Sentinel-1 SAR data to accurately classify the Dongting Lake wetland. The proposed SAR TSS features effectively describe the differences in SAR time series change information among classes and help to identify and extract the highly dynamic mudflat, grass marsh, and reed beach classes. The introduced statistical texture features take the statistical distribution of SAR data into account, express the heterogeneity of targets, and enhance the recognition and extraction of forest beach, mudflat, and reed beach. Superpixels were used to reduce the influence of speckle through spatial context information and to reduce the salt-and-pepper effect in pixel-level classification.
Applied to the Sentinel-1 SAR data of the Dongting Lake wetland in 2020, the proposed method achieved good results: the OA was 97.57%, the Kappa coefficient was 0.97, and the classification accuracy of all classes was above 95%. Adding TSS features improved the OA by more than 2% and the producer accuracy of mudflat by more than 16%. Introducing statistical texture features improved the OA by more than 3.6%. RF outperformed the SVM and DT classifiers by at least 9% in OA. In addition, superpixel voting improved the OA by 0.83%. The J-M distance analysis and the RF importance evaluation both confirmed that the proposed SAR TSS features and the introduced statistical texture features performed well in wetland classification.
In short, the proposed wetland classification method using time series similarity, statistical texture, and superpixels with Sentinel-1 SAR data can provide high-precision wetland products and meet the needs of continuous wetland monitoring. However, limited by the resolution and wavelength of the Sentinel-1 SAR data in GEE, the extraction of narrow rivers covered by dense vegetation is unsatisfactory. Building on the time series Sentinel-1 data, we will add more penetrating L-band, full-polarization, or high-resolution SAR data to explore further wetland applications. Combining the proposed features with deep learning networks for wetland classification is also a focus of follow-up research.