Mapping Invasive Aquatic Plants in Sentinel-2 Images Using Convolutional Neural Networks Trained With Spectral Indices

Multispectral images collected by the European Space Agency's Sentinel-2 satellite offer a powerful resource for accurately and efficiently mapping areas affected by the distribution of invasive aquatic plants. In this work, we use different spectral indices to detect invasive aquatic plants in the Guadiana river, Spain. Our methodology uses a convolutional neural network (CNN) as the baseline classifier and trains it using spectral indices calculated using different Sentinel-2 band combinations. Specifically, we consider the following spectral indices: With two bands, we calculate the normalized difference vegetation index, normalized difference water index, and normalized difference infrared index. With three bands, we calculate the red–green–blue composite and the floating algae index. Finally, we also use four bands to calculate the bare soil index. In our results, we observed that CNNs can better map invasive aquatic plants in the considered case study when trained intelligently (using spectral indices) as compared to using all spectral bands provided by the Sentinel-2 instrument.


I. INTRODUCTION
REMOTE sensing has been widely used for water support management in recent years. For instance, the work developed in [1] supported water quality management and monitored the spatial-temporal distribution of water turbidity. In [2], coastal ecosystem health status was evaluated. Other studies, such as [3] and [4], mapped flooded areas. In [5], [...] water were mapped. Water body extraction was carried out in [6], and permafrost areas were mapped in [7]. Many studies have focused on mapping vegetation, e.g., estimating vegetation regions [8], monitoring vegetation growth [9] and vegetation changes [10], predicting the location of algal blooms [11], and discriminating between different macrophyte species [12], among other applications.

[...] March 2023. This work was supported in part by the European Social Fund (Resolución de 10 de mayo de 2017, de la Secretaría General de Ciencia, Tecnología e Innovación, por la que se resuelve la convocatoria de ayudas para la financiación de contratos predoctorales para formación de Doctores en los centros públicos de I+D pertenecientes al Sistema Extremeño de Ciencia, Tecnología e Innovación en el ejercicio 2017; file PD16001); in part by the Consejería de Economía, Ciencia y Agenda Digital of the Junta de Extremadura and the European Regional Development Fund (ERDF) of the European Union under Grant GR21040; in part by Grant PID2019-110315RB-I00 for the project "Desarrollo de técnicas de aprendizaje profundo para la optimización de la infraestructura de supercomputación y de aplicaciones de imagen hiperespectral" [...]

A. Using Spectral Indices to Detect Invasive Aquatic Plants
Spectral indices exhibit a great potential for effectively mapping ecosystems affected by invasive aquatic plants [13]. To detect Spartina alterniflora in Sentinel-2 images [14], several indices have been used, including the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), difference vegetation index (DVI), green difference vegetation index (gDVI), green NDVI (gNDVI), soil-adjusted vegetation index (SAVI), and a phenological vegetation index (PVI). The work in [15] combined vegetation indices (e.g., NDVI) and water indices such as the normalized difference water index (NDWI) to detect Eichhornia crassipes (water hyacinth) in Sentinel-2 images. The work in [16] also focused on the detection of water hyacinth by calculating fractional vegetation cover (FVC) in Sentinel-2 images using SAVI.
Multispectral images acquired by unmanned aerial vehicles were used in [17] to determine NDVI, enhanced normalized difference vegetation index (ENDVI), normalized difference red edge index (NDREI), normalized green-red difference index (NGRDI), and green normalized difference vegetation index (GNDVI). Landsat images were used in [18] to calculate NDVI for water hyacinth detection. Other works used spectral indices and spatial autocorrelation analysis to detect algal blooms in Sentinel-2 and Landsat images [19], and also used MODIS images to calculate the floating algae index (FAI).
In order to exploit the information provided by spectral indices and take advantage of state-of-the-art machine learning classifiers, several authors have combined both techniques in their studies. For instance, the authors in [20] used Landsat-8 and Sentinel-2 data to develop a hierarchical classifier based on three steps for water hyacinth detection. The steps can be summarized as follows: 1) water detection with a modified normalized difference water index (MNDWI); 2) vegetation detection with NDVI; and 3) detection of water hyacinth with a random forest (RF) classifier. The authors in [21] discriminated between water hyacinth and water primrose in Sentinel-2 images by using RF and nine spectral vegetation and water indices (NDVI, NDAVI, WAVI, SAVI, NDVIRe2, NDVIRe3, NDWI, NDII, and MNDWI). The authors in [22] mapped Spartina alterniflora using Sentinel-2 and Sentinel-1 data, SAR vegetation indices, and seven spectral indices: NDVI, EVI, NDWI, land surface water index, and automated water extraction index. To the best of our knowledge, no studies have combined deep learning classifiers and spectral indices for the detection and mapping of aquatic invasive plants.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

B. Mapping Invasive Aquatic Plants Using CNNs
One of the species considered to be most invasive is the water hyacinth. This plant tends to cover the river surface due to its tapestry-like distribution. Great efforts have been made to control the spread of this plant, which has negative effects on biodiversity, the environment, and the economy [23]. In order to successfully manage these species, current strategies aim at their removal when present on the water surface to prevent their dispersal [24].
Our previous studies began a new line of research in which invasive aquatic plants in the Guadiana river, the second longest river in Spain, were mapped by using remote sensing and deep learning techniques. In order to facilitate mapping, monitoring, and control of the invasive aquatic plant distribution in the Guadiana river, our previous work first focused on automatic detection of the plant using all the spectral bands provided by the Sentinel-2 satellite. In [23], a quantitative and qualitative comparison of different machine/deep learning algorithms was carried out, determining that convolutional neural networks (CNNs) were effective for mapping purposes. This conclusion is in agreement with other works, which concluded that CNNs are successful for mapping vegetation species in remotely sensed data [25], [26], [27], [28], [29], [30], [31], [32].
In a subsequent work [33], we used CNNs as a baseline to monitor the spatio-temporal distribution of water hyacinth using sparse training samples collected from only four images (out of a total of 62 images available in the analyzed two-year time series) independently of the phenological stage. To study the dynamics of the spread of invasive plants over a two-year period, a methodology for mapping the most frequent areas of water hyacinth accumulation was developed.

C. Motivation and Innovative Contributions of This Work
The goal of this article is to provide a better training mechanism for CNNs to identify invasive aquatic plants. Methods based on spectral indices use combinations of spectral bands to highlight the pixels of multispectral (MS) images that correspond to specific land covers. However, when vegetation indices (such as NDVI) are used, invasive plants and other types of vegetation may fall within the same thresholds (as actually happens in the two test areas considered in our work). Manual segmentation is then required to differentiate between invasive and noninvasive plants, which makes the process more costly and less automatic. Therefore, in this study, invasive aquatic plant detection is carried out automatically by using deep learning techniques. Moreover, this work takes advantage of band combinations that are conducive to highlighting vegetation in the water. This is accomplished by extracting suitable training samples for the model. The main contributions of this work can be summarized as follows.
1) We introduce a new methodology for the automatic detection of invasive aquatic plants that simplifies the spectral complexity of the images acquired by the Sentinel-2 satellite. Specifically, we use spectral indices to improve the training process of CNNs and determine if the obtained results (calculated using a reduced set of carefully selected Sentinel-2 bands) can improve the results obtained using all available bands. Our approach uses remote sensing, geographical information systems (GIS) techniques, and a deep learning model. An additional advantage of using fewer spectral bands is that it reduces the amount of data required for image downloading, preprocessing, and processing.
2) We test our new automatic detection strategy in two areas of the Guadiana river, Spain, which are heavily affected by the presence of invasive aquatic plants. Our methodology is tested in a quantitative and qualitative way. The first area is affected by Mexican water lily, while the second one is affected by water hyacinth.
3) We conduct a comprehensive comparison of the performance of the CNN model with another classical machine learning classifier, the RF, which was used as a standard in other studies.
II. METHODOLOGY
Fig. 1 graphically illustrates the workflow adopted in this work. First, multispectral images over the Guadiana river are collected by the Sentinel-2 satellite. Next, these images are preprocessed and spectral indices are calculated. Region of interest (ROI) extraction and management of nodata values are also carried out as described in [23]. Then, automatic detection of invasive plants is carried out using CNNs trained with different spectral indices. Finally, the outputs are evaluated by comparing the results with a ground truth image, generated according to the procedure in [23], and calculating different accuracy metrics.

A. Study Area
Two sections of the Guadiana river affected by invasive aquatic plants have been selected for experiments (see Fig. 2). The first ROI corresponds to an area affected by Eichhornia crassipes, also known as water hyacinth, close to the city of Mérida (ROI_ME). There, an invasive plant control barrier for mechanical removal of these plants has been installed. The second ROI (ROI_BA) is a section of the river affected by yellow water lily (also known as Nymphaea mexicana). This area crosses the city of Badajoz, in the SW of Extremadura region, Spain. The main characteristics of these ROIs are given in Table I.

B. Remotely Sensed Imagery
Sentinel-2 images are open-access multispectral datasets provided by ESA as part of the Copernicus programme. Level-1C (S2L1C) and Level-2A (S2L2A) products are offered. The data are collected in a discrete number of bands (13 or 12 spectral bands, respectively) with different spatial resolutions (from 10 to 60 m per pixel). Table II shows the details of the 13 spectral bands available and their main applications. Sentinel-2 has successfully contributed to many studies for monitoring aquatic invasive plants, as indicated in Section I-A. In this work, S2L2A products with atmospheric correction are used, and the image datasets are acquired from SentinelHub [34]. Six different band combinations have been used as CNN inputs (details are given in Section II-C). In addition, we also consider all Sentinel-2 bands as input, in order to evaluate whether the CNNs trained using only spectral indices (resulting from specific band combinations) can outperform the results obtained using all available bands. Table I describes the main characteristics of the considered multispectral datasets.

C. Band Compositions and Spectral Indices
In the following, we describe the spectral band combinations and the spectral indices used in this work. The Sentinel-2 band names (according to Table II) are also specified.
1) RGB: Composite of three bands: Red (B04), green (B03), and blue (B02), also called the natural color band combination. It allows land covers to be displayed in true color. The values range from 0 to 255 and are normalized in this study to a range from 0 to 1.
2) NDVI: A well-known index [35] commonly used for green vegetation quantification. Equation (1) defines this index, which is calculated from two bands: Red (B04) and near-infrared, hereinafter NIR (B08). The values are normalized [...]
\[ \mathrm{NDVI} = \frac{B08 - B04}{B08 + B04} \tag{1} \]
3) FAI: Introduced in [36] to detect vegetation on the surface of oceans. It is less sensitive to atmospheric effects than NDVI and EVI. Equation (2) shows the definition of the index for Sentinel-2 images, where three bands are considered: Red (B04), NIR (B08), and SWIR 1 (B11). Its values are normalized to a range from 0 to 1.
\[ \mathrm{FAI} = B08 - \left[ B04 + (B11 - B04)\,\frac{\lambda_{B08} - \lambda_{B04}}{\lambda_{B11} - \lambda_{B04}} \right] \tag{2} \]
where \lambda_{B} denotes the central wavelength of band B.

5) BSI: Introduced in [38] to discriminate bare soil and fallow land from vegetation and other land cover classes by combining four bands: the SWIR 1 (B11) and red (B04) bands determine the soil mineral composition, while the NIR (B08) and blue (B02) bands determine the presence of vegetation. Its values are normalized to a range from -1 to 1. This index is defined as follows:
\[ \mathrm{BSI} = \frac{(B11 + B04) - (B08 + B02)}{(B11 + B04) + (B08 + B02)} \tag{4} \]
6) NDII: Developed in [39] as an infrared index and later used in [40] as NDII. It uses two spectral bands: NIR (B08) and SWIR (B11). This index can also be seen as a normalized version of NDMI when using Sentinel-2 B08 (or B8A) and B11 bands [41]. NDII (or NDMI) gives information on the changes in vegetation water content. Leaf internal structure and leaf dry matter content affect NIR reflectance, while the SWIR band gives information on changes in the water content of the vegetation, as well as the structure of the spongy mesophyll in vegetation canopies. Its values are scaled to a range from -1 to 1. The index is defined as follows:
\[ \mathrm{NDII} = \frac{B08 - B11}{B08 + B11} \tag{5} \]
Table III summarizes the main characteristics of the set of images considered in this study, where spectral indices are used to train the CNN adopted in our work for mapping purposes. It should be noted that the raster images resulting from the spectral indices have a spatial resolution of 10 m. To deal with the different spatial resolutions of the spectral bands involved in the calculation of the indices, we resampled every band to 10-m resolution. In this way, 20-m bands (and, likewise, 60-m bands) are brought to 10-m resolution by repeating their pixel values, without modifying them. Since there is at least one 10-m band in every index, all the pixels offer [...]

D. Preprocessing
After downloading the Sentinel-2 images, spectral indices are calculated for the pixels used for training as described in Section II-C, and the NoData values are managed as explained in [23], i.e., normalizing the pixel values between 0 and 1 and changing the format from 8 bits to 32 bits. The CNN architecture and the training process are explained in the following section.
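The index calculation of Section II-C and the rescaling mentioned above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in this work: the function names are hypothetical, the central wavelengths are nominal Sentinel-2 values assumed for the FAI baseline, and min-max rescaling is an assumption about how values are brought to the range from 0 to 1. NDWI is omitted because its band combination is not given in the text.

```python
# Nominal Sentinel-2 central wavelengths in nm (assumed values for the FAI baseline).
WL_RED, WL_NIR, WL_SWIR1 = 665.0, 842.0, 1610.0


def upsample_nearest(band, factor):
    """Repeat every pixel `factor` times along rows and columns (values unchanged)."""
    out = []
    for row in band:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 where the denominator vanishes (assumption)."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(red, nir):
    return normalized_difference(nir, red)


def ndii(nir, swir1):
    return normalized_difference(nir, swir1)


def bsi(blue, red, nir, swir1):
    return normalized_difference(swir1 + red, nir + blue)


def fai(red, nir, swir1):
    """NIR minus a linear red-to-SWIR baseline evaluated at the NIR wavelength."""
    baseline = red + (swir1 - red) * (WL_NIR - WL_RED) / (WL_SWIR1 - WL_RED)
    return nir - baseline


def rescale_unit(values):
    """Min-max rescale a flat list of values to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `upsample_nearest(b11, 2)` would bring a 20-m band such as B11 onto the 10-m grid of B04 and B08 before calling `ndii` or `fai` pixel by pixel.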

E. Detection
As indicated above, the main novelty of our work is that the detection of aquatic plants is performed using a CNN trained using spectral indices. In the following, we describe the CNN architecture and the training process.
1) CNN Architecture: For aquatic weed detection in preprocessed Sentinel-2 images, a CNN model was developed and trained with all Sentinel-2 spectral bands in [23], outperforming other traditional machine learning methods. The same CNN model was also used in our previous work [33], in which the training set was composed of samples collected from different Sentinel-2 images acquired on different dates. In those works, the CNN model allowed us to detect water hyacinth at different phenological stages and also to analyze the spatio-temporal dynamics in a time series, determining the areas where the invasive plants accumulated most frequently in the period analyzed.
In this work, we adopt the same CNN architecture, but the CNN is trained using spectral indices instead of all the spectral bands from the Sentinel-2 satellite. A detailed scheme of the CNN architecture is shown in Fig. 1. Specifically, the CNN architecture has one 1-D convolutional layer with a rectified linear unit (ReLU) activation function, 20 filters, and a kernel size of 12, followed by a reshaping (flatten) layer, a fully connected (dense) layer with 128 neurons, a batch normalization layer with ReLU activation, and a fully connected (dense) layer with 4 neurons and Softmax activation. The model is retrained in this work as indicated in the following section.
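To make the size of this layer stack concrete, the following sketch counts the trainable parameters implied by the description above. Several details are assumptions made only for illustration, since they are not stated here: each pixel is fed as a 1-D vector with one value per band and a single channel, the convolution uses "same" padding (so its output keeps the input length), all layers include bias terms, and batch normalization contributes only its scale and shift as trainable parameters. The function name is hypothetical.

```python
def cnn_parameter_counts(input_length, filters=20, kernel_size=12,
                         dense_units=128, n_classes=4, in_channels=1):
    """Return per-layer trainable parameter counts for the described layer stack.

    Assumes 'same' padding, so the convolution keeps `input_length` positions.
    """
    conv = kernel_size * in_channels * filters + filters  # weights + biases
    flat = input_length * filters                         # flattened feature vector size
    dense = flat * dense_units + dense_units              # weights + biases
    batch_norm = 2 * dense_units                          # scale and shift only
    output = dense_units * n_classes + n_classes          # weights + biases
    counts = {"conv1d": conv, "dense": dense,
              "batch_norm": batch_norm, "output": output}
    counts["total"] = sum(counts.values())
    return counts
```

Under these assumptions, an input with the 12 S2L2A bands gives 31880 trainable parameters, while a single-value input such as NDVI gives 3720, because the first dense layer dominates the count and scales with the input length.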
2) Training Process: As illustrated in Fig. 1, the CNN model has been trained with different sets. Specifically, six training sets (based on different spectral indices) have been used: Set_RGB, Set_NDVI, Set_FAI, Set_NDWI, Set_BSI, and Set_NDII. These sets have been generated as follows. First, a ROI is defined to label the pixels containing aquatic invasive plants in the image (here, two case studies are considered: Badajoz and Mérida). As explained in [23], the ROI contours were carefully selected using high resolution imagery and ground knowledge, so that they encompass only pixels that contain invasive aquatic plants in the image. Then, a percentage of the pixels in the ROI are selected for training the CNN architecture (while the remaining pixels in the ROI are used for testing). In the selected pixels, the per-pixel values of the different indices are calculated, resulting in six different training sets. These training sets are used to train the CNN architecture. Table IV shows the total number of pixels used for training in each considered case study (Badajoz and Mérida), resulting in two ROIs that are called ROI_BA and ROI_ME.
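The selection of a percentage of ROI pixels for training, with the remainder kept for testing, can be sketched as follows. This is a minimal illustration under stated assumptions: the selection is taken to be a seeded random shuffle, and the function name, the `train_fraction` parameter, and the use of integer pixel identifiers are all hypothetical; the actual percentage and sampling scheme are those described in [23] and Table IV.

```python
import random


def split_roi_pixels(pixel_ids, train_fraction, seed=0):
    """Shuffle ROI pixel identifiers and split them into disjoint train/test lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(pixel_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Because the same pixel split can be reused for every index, the six training sets would differ only in the per-pixel values computed for the selected pixels.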

F. Hardware/software Environment
The hardware environment considered in our experiments comprises an Intel(R) Core(TM) i9-10900K processor with 64 GB of RAM and a 2-TB SSD. Regarding the software environment, the implementation was developed in Python 3.10 by using the TensorFlow and Keras frameworks. GIS techniques have been considered for preprocessing operations (raster clipping, image analysis, sample selection, etc.) by using Python scripts and QGIS software tools. These tools have also been used for the visualization of images and for the design of map layouts.

TABLE IV: NUMBER OF TRAINING SAMPLES (PIXELS) USED FOR TRAINING THE CNN ARCHITECTURE IN THE BADAJOZ CASE AND THE MÉRIDA CASE

A. Metrics for Accuracy Assessment
In order to evaluate the CNN models, the prediction errors are calculated. For that purpose, ground truth data generated in our previous work [23] have been used in this study. A confusion matrix has been calculated for each CNN model. These are binary matrices that indicate whether a pixel of the image contains invasive plants or not (a value of zero if invasive plants are detected and a value of one if they are not). The first column contains the true negatives (TNs) and false negatives (FNs), while the second column contains the false positives (FPs) and true positives (TPs).
Moreover, different metrics have been implemented for evaluating the performance of the considered classification approaches. The different relationships between the model predictions and the real values (ground truth values) are defined in the following equations: Overall accuracy [(6)], user's accuracy [(7)], producer's accuracy [(8)] (also known as recall, sensitivity, or TP rate), and F1 score [(9)].
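As a minimal sketch of how these four metrics follow from the confusion matrix counts, the function below uses their standard definitions: overall accuracy as (TP + TN) over all pixels, user's accuracy as TP / (TP + FP) (i.e., precision), producer's accuracy as TP / (TP + FN), and the F1 score as the harmonic mean of the last two. The function name is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience.

```python
def accuracy_metrics(tp, tn, fp, fn):
    """Overall, user's, and producer's accuracy plus F1 from binary confusion counts."""
    total = tp + tn + fp + fn
    overall = (tp + tn) / total if total else 0.0
    users = tp / (tp + fp) if (tp + fp) else 0.0       # precision
    producers = tp / (tp + fn) if (tp + fn) else 0.0   # recall / TP rate
    denom = users + producers
    f1 = 2 * users * producers / denom if denom else 0.0
    return {"overall": overall, "users": users, "producers": producers, "f1": f1}
```

For instance, a model that labels every pixel as invasive has no FNs and therefore a producer's accuracy of 1, while its user's accuracy drops with the number of FPs, which is why the F1 score is reported alongside both.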

B. Assessment of Results
This section describes the results of invasive aquatic plant detection after applying the CNN model, using the different training sets considered in this study. Fig. 3 shows all the CNN detection maps. In addition, the ground truth (GT) images and the Sentinel-2 (S2) images in true color are also shown, for better visual interpretation of the results. In the Mérida case [see Fig. 3(a)], by comparing the S2 and GT images with the CNN outputs, we can see that there is water hyacinth distributed in small masses on the river banks and in a large mass upstream of the containment barrier. In general terms, it can be seen from Table VI that better accuracies are obtained with spectral indices than with the total number of bands offered by the S2 satellite. In all cases, training with spectral indices results in high accuracies (from 0.856 to 0.933 in terms of overall accuracy, from 0.843 to 0.924 in terms of user's accuracy, from 0.754 to 0.935 in terms of producer's accuracy, and from 0.798 to 0.913 in terms of F1 score). The index with the worst results is the NDWI.
In contrast, in the Badajoz case, the invasive plant (Mexican water lily) is distributed in several irregular masses over the entire surface and along the banks. As can be seen in Fig. 3(b), very good accuracies (from 0.643 to 0.779 in terms of overall accuracy, from 0.641 to 0.813 in terms of user's accuracy, from 0.669 to 1.00 in terms of producer's accuracy, and from 0.734 to 0.839 in terms of F1 score) are obtained with spectral indices. In two of the four accuracy metrics (user's accuracy and producer's accuracy), better results are obtained with spectral indices than with all the bands offered by the S2 satellite. In the cases in which the best scores are obtained using all bands (overall accuracy and F1 score), the difference with respect to the index offering the second best accuracy is less than 2% (in the case of overall accuracy) and less than 1% (in the case of F1 score).

IV. DISCUSSION
The results in Table V reveal that, in the case of Mérida, the CNN model provides the best results when trained with spectral indices (RGB, FAI, BSI, NDVI, and NDII) instead of all spectral bands offered by S2L2A. We also generated the same results for the RF classifier and included them in Table VI to compare our CNN with a classical classifier; all results are improved by our method for both areas. However, as illustrated in Fig. 3(a), the results obtained with NDVI show that some of the accumulated water hyacinth is not detected upstream of the large mass at the containment barrier. Moreover, as can be seen in the confusion matrix in Fig. 5(c), the prediction successes (i.e., the TPs and TNs) are higher and the prediction errors (i.e., the FPs and FNs) are lower with indices than with all bands used as input. This suggests that the use of a large number of bands as input (instead of a specific index) introduces some confusion in the learning of the CNN. As can be seen in Fig. 3(a), if we consider the NDWI as input, there is water hyacinth accumulated behind the barrier and on the banks that has not been correctly detected. This fact can also be appreciated in Fig. 5(e). In the same way, the results in Table VI (in terms of user's accuracy) show that, when the CNN is trained with spectral indices, better results are obtained than when using all the bands as input. Regarding the producer's accuracy, as shown in Table VI, the spectral indices that do not exceed the accuracy provided by using all bands are NDVI, FAI, and NDWI, because the percentage of TPs is lower and the percentage of FNs is higher (as shown in Fig. 5). Finally, considering the F1 score metric, the worst accuracy is obtained with the NDWI index.
In the Badajoz case study, it can be observed in Fig. 3(b) that the worst results are obtained when training the CNN with RGB, NDWI, and FAI, due to over-detection in RGB and NDWI and under-detection in FAI. On the other hand, as shown in Table VI, training with BSI, NDII, and NDVI provides similar results to using all bands in terms of overall accuracy. It can also be seen in Fig. 6 that there are higher FP values in RGB and NDWI and lower TP values in the case of FAI. When analyzing the user's accuracy metric, the worst results are obtained with RGB and NDWI and the best ones with BSI and NDII. In the case of producer's accuracy, RGB and NDWI obtain very high results because they yield a high percentage of TPs and a very low percentage of FNs; conversely, lower results are obtained in cases where the percentage of FNs is higher (as with the BSI and NDII inputs). Finally, analyzing the F1 score, the highest FP values do have a negative effect, penalizing the RGB, NDWI, and FAI indices, while favoring other indices such as NDVI, BSI, and NDII. To visually compare the results, Fig. 4(a) and (b) were generated under the same conditions but using the RF algorithm.
Our results also reveal that the accuracies differ between the Mérida and Badajoz cases, with higher accuracies obtained in Mérida than in Badajoz. As can be seen in Fig. 3(b), the invasive plants on the surface of the river in the area of Badajoz are distributed in masses with more irregular shapes than the masses of invasive plants present in the area of Mérida. These masses are also responsible for the higher percentage of FP detections.
Another significant benefit of using spectral indices (instead of all bands) for training the CNN is computational efficiency, not only because fewer bands need to be downloaded from the S2 satellite, but also because of the higher throughput that results from processing less information. Moreover, since training with the indices resulted in better accuracies, a more efficient training of the CNN architecture can be performed. If we compare the training times needed for the RF and CNN algorithms, the mean time (after 10 runs) for training the RF algorithm is about 0.3 s, using either indices or the whole set of bands. The CNN training time is higher (about 11 s).

V. CONCLUSION
In this work, we have developed a new method for invasive aquatic plant detection in Sentinel-2 images that relies on a CNN architecture trained with different spectral indices. This represents a significant advantage over previous works, in which the CNN was trained with all the spectral bands available (which generally resulted in lower quality results despite using a higher number of spectral bands). The index with the best detection accuracy depends on the evaluation metric that is considered, but in general it can be concluded that the information provided by spectral indices is more useful for training than the raw multispectral data provided by the Sentinel-2 satellite. This opens new perspectives in terms of more efficient and effective data processing. In future studies, we will perform a more detailed assessment of computational performance and consider other deep learning models.