A New Technique for Segmentation of the Oil Spills From Synthetic-Aperture Radar Images Using Convolutional Neural Network

Oil spills have detrimental effects on the marine environment and economy. Thus, it is necessary to identify and classify oil spills at sea to reduce oil-induced pollution in seas and oceans. Synthetic-aperture radar (SAR) imaging is a good option for rapid oil detection, as it covers a wide area, collects data at short intervals, and acquires images in all weather conditions, day and night. Deep neural networks were used because training on a large number of images enhances segmentation accuracy significantly. This article aimed to segment the oil spills in SAR images using the U-NET and DeepLabV3 neural networks separately, with the lowest number of images and the highest accuracy possible. These networks carry out image segmentation independently with different architectures, and thus we could not combine them for oil spill segmentation. We limited the study to two accurate convolutional neural networks (CNNs) because we did not have access to sufficient hardware facilities, such as GPUs, to train dozens of neural networks. The two networks used in the article are among the most well-known and widely used networks. Our purpose was to determine which network performs best in SAR oil spill detection. Given the limited number of SAR oil spill images and the many images that CNNs need for training, we increased the number of input images to 9801 using the augmentation technique. Then, we identified oil spills with 300 epochs and a batch size of 5 using the Python programming language on the Google Colab server. The oil spill detection accuracy was 78.8% for the U-NET network and 54% for the DeepLabV3 network. Accordingly, we conclude that the U-NET network provides the more accurate identification of oil spills in SAR images.

deliberate discharge of tank-cleaning wastewater from ships [1], [2], [3], [4], [5]. Frequent tanker accidents are the main cause of oil leaks in oceans and seas. When oil leaks into a water body, it spreads rapidly over the water surface and forms a thin layer known as an oil spill. Owing to several environmental factors, marine oil spills are hazardous and can rapidly spread over an extensive area.
Oil spills are one of the most significant sources of marine contamination, with severe economic and environmental effects on the coastal zone and ocean [1]. Oil spills caused by unintentional or intentional releases into coastal or oceanic waters present a primary threat to marine ecosystems. Hence, the adverse effects of oil spills on these ecosystems are the subject of significant environmental, political, and scientific concern [6]. The NEREIDs program, supported by the European Commission, made the first serious attempt to utilize metocean, shipping, and geological data to characterize oil spills in one of the key oil exploration areas in the world and to prevent major oil spill accidents. Based on these data, oil spill models were generated to simulate spill trajectories and development, assess the susceptibility of the coastal zone, and find suitable measures to alleviate the environmental effects [7]. Hence, identifying and classifying oil spills are essential in preventing water contamination [8]. Nonetheless, natural phenomena (e.g., waves and ocean currents) and human factors can alter the light intensity over the sea surface, which leads to non-uniform intensity or high noise in oil spills or lookalikes, sometimes making it very difficult to segment oil spills automatically. Hence, an accurate segmentation technique contributes crucially to oil spill control [9]. Synthetic-aperture radar (SAR) is a coherent imaging technology capable of producing high-resolution, large-scale images of the earth and targets. SAR can function both night and day and in adverse conditions, overcoming the limitations of optical and infrared systems [10]. Oil spills are detected as "dark" areas in SAR images [11]. Traditionally, SAR imaging is conducted from a moving aircraft or spacecraft [12]. Oil spills emerge in SAR images in the form of dark patches because they produce low-backscatter responses in comparison to nearby clean sea regions [13].
Oil spills appear as dark gray pixels in SAR images. It is not simple to transfer image segmentation approaches to SAR images. These images contain speckle arising from the complex summation of coherent signals from the scatterers within a resolution cell. Through such noise, together with the SAR sensor's representation of geometry, it becomes challenging to consider edge information in the segmentation of the SAR image [14]. SAR images are normally corrupted by multiplicative noise, or speckle, caused by the constructive and destructive interference of coherent returns scattered by small reflectors within each resolution cell. Computer vision systems and human interpreters encounter difficulties in interpretation and processing due to the presence of speckle noise in SAR images. Therefore, removing speckle from SAR images is essential to enhance the performance of various computer vision algorithms, such as recognition, segmentation, and detection [2]. It is still difficult to extract information from SAR images, even if manual labels are based on accurate auxiliary information. Data augmentation is a technique used to enhance the generalizability of the network by increasing the number of samples in the training set. However, many ambiguities remain in separating dark spots. Indeed, there is a hybrid area in the feature space for oil spills and lookalikes. For instance, the SAR image properties of biological oil layers are very similar to those of oil spills. Without other auxiliary information, such as environmental information and remote sensing images, even experts cannot make a clear judgment about samples in a hybrid area of the feature space [15]. In recent years, convolutional neural networks (CNNs) have been extensively developed in radar imaging and semantic segmentation [16].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
A CNN is a deep learning method specifically designed for image recognition and classification. This network is structured similarly to multi-layered neural networks. CNNs are inspired by biological neural networks and are employed for speech recognition and visual and image processing. CNNs can extract spatial information effectively and share weights among nodes to reduce the number of parameters [17]. A CNN works well in recognizing two-dimensional shapes and learning structural features, and it presents a way to extract features automatically. Short training time makes it easier to use artificial neural networks and improves diagnostic accuracy [18]. A CNN also works well for automatically learning features from raw data, particularly structural features [19]. CNNs have been used successfully in image classification and recognition, with satisfactory results in handwriting recognition [20]. CNNs are also used in image segmentation. A study in 2005 reviewed oil spill classification and segmentation methods [21]. According to Solberg et al., there are three feature extraction steps for oil spill detection in SAR images, including oil pollution detection, dark spot detection, and look-alikes. The oil extraction location is affected by the accuracy of dark spot detection. In that article, different methods were reviewed for satellite-based oil spill detection in the marine environment. Manual and automatic methods were considered based on various satellite sensors and oil spills under different circumstances to differentiate between oil spills and look-alikes in terms of pattern recognition [21]. In another study, two neural networks were implemented by Duan et al. [19] for segmenting dark objects and separating oil spills from similar ones. The presented technique was very promising in detecting dark formations and distinguishing oil spills from similar cases, because dark formations were recognized with an overall accuracy of 94%.
Moreover, 89% of the examined items were correctly recognized. The framework was also used in other assessments [6], [22]. According to Krestenitis et al., semantic segmentation with deep CNNs (DCNNs) can be used successfully in oil spill detection. The DCNN segmentation models were trained and evaluated using a common database on a case-by-case basis. Generally, the best performance was reported for the DeepLabv3 model, which had the highest accuracy at the cost of greater inference time. The mentioned study is advantageous since the complexity of the problem was extensively discussed, along with relative figures. The superiority of the DeepLabv3 model was confirmed by further comparison of various DeepLab architectures with previous methods. However, the model cannot fully separate oil spills from similar classes, since some oil spill pixels are classified as lookalikes. This deficiency is expected to be caused by the small number of specimens, the small object size, and the training method [23]. A superpixel segmentation method was introduced by Zhang et al. to over-segment SAR images. A statistical dissimilarity measurement technique was proposed in the superpixel integration section to transform soft superpixels into a self-connected weight graph. Moreover, the integration phase and superpixel generation were run under a unified deep network. This method is advantageous since the superpixel shape is alternately adjusted based on the segmentation results and boundaries during training to achieve the desired segmentation results. The segmentation method is computationally efficient due to the simple network structure and offers advanced performance and good generalization [24]. A boundary clustering method was designed by Ma et al. to adapt superpixels to the specific task. A soft graph convolutional network was proposed in the segmentation section, taking the connectivity map as input and subdividing superpixels intelligently.
The advantage of this technique is that the superpixel and graph convolution parts can be trained under a unified framework until the optimal parameters of the two parts are obtained. Furthermore, the superpixel shape can be adjusted by the network gradually according to the segmentation. The results revealed that the boundary information could be preserved by the suggested superpixel generation model, which has good resistance to stains. This technique was computationally efficient and performed consistently with good generalizability [25]. A benchmark solution was presented by Hang et al. by establishing a general multimodal deep learning (MDL) framework. Generally, the MDL framework is used for pixel-level classification tasks and models spatial information with artificial neural networks. The general framework comprises two subnetworks, i.e., Ex-Net and Fu-Net, and presents a basic solution for pixel-level image classification tasks using multimodal data. Various fusion strategies have been presented in this regard. For these networks, the three questions of "what," "where," and "how" to fuse were addressed, along with two feature extraction approaches associated with FC-Nets and CNNs for pixel-wise and spatial-spectral classification, respectively [26]. A CNN can capture spatial-spectral properties to classify hyperspectral (HS) images. Recently, graph convolutional networks (GCNNs) have been used in data analysis and visualization despite their sampling restrictions. Hong et al. assessed HS image classification with GCNNs and CNNs. Their miniGCN can infer out-of-sample data without retraining the network while enhancing the classification performance. It also facilitates the training of large-scale graph networks through the mini-batch technique. Mini-batch training allows the joint use of GCNNs and CNNs to extract more distinct and diverse properties for the classification of HS images [27]. Usually, HS images are combined into a data cube using spatio-spectral data.
Generally, an HS image can be considered a sequence of data along the spectral dimension. Transformers are a powerful architecture for describing such sequential attributes. A transformer-based backbone network was proposed by Hong et al., focusing on the extraction of spectral information for classifying these images. This method can be improved by investigating self-supervised learning and creating a lightweight transformer-based network to reduce the network complexity while maintaining performance [28].
In this article, many images were needed to train the convolutional networks, whereas the database comprised only 99 oil spill images received from the Sentinel-1 and EnviSat satellites for the desired sites. We therefore increased these 99 images to 9801 images using the augmentation technique. To this end, we cut the oil spill out of one image (oil spill 2) and placed it on another image (oil spill 1), so that the main background of the new image was that of oil spill 1. This method, which is the main innovation of the article, places each oil spill on the remaining 98 oil spill images; thus, the number of images with various oil spills was increased to 9801. With this innovation, the CNNs are trained without overfitting and perform segmentation with considerable accuracy. The remaining sections of the article are arranged as follows. Section II introduces the collection procedure for the images required for training the neural networks. Section III describes the proposed method. Section IV explains the analysis results of the segmented oil spill images output by the CNNs in detail, with their accuracy tables. Section V presents conclusions and implications.

II. DATASETS
This section deals with the way SAR oil spill images were collected for the database to train CNNs. SAR satellites in Earth orbit send information to Earth in any weather conditions and at any time of day or night. Images sent by the Sentinel-1 and EnviSat radars at periodic intervals each month can be downloaded from the relevant websites. This dataset covers the major types of oil spill candidates detected under various sea conditions [29]. For oil spill images, it is necessary to consider the satellite type, frequency band, resolution, polarization, and SAR or PolSAR type. For instance, on the ESA SciHUB site, images sent from the Sentinel-1 radar can be received and converted to desired formats, like JPG, using SNAP software. High-resolution (up to MB) SAR images of oil spills are needed to train the convolutional neural network. In this article, about 99 original images of oil spills were collected from different SARs. In addition to the SAR images, annotated images of the oil spills are needed to train CNNs for oil spill segmentation. An annotated image of an oil spill made using Supervisely software is presented in Fig. 1. The mask image shows all the borders of the oil spills accurately.

A. Creating a Database of SAR Oil Spill Images in Google Drive
We placed the oil spill of each image on each of the other oil spill images to increase the number of images for training CNNs. An example of one oil spill placed on another oil spill image is shown in Fig. 2.
For instance, we extracted the oil spill from the first image. To do so, we multiplied the image by its mask; the resulting matrix retained the image values in the spill pixels and was zero in pixels without oil spills. Using matrix operations, we then placed the extracted spill on the second image, as in Fig. 2. The oil spill of the first image may overlap with the second oil spill, which does not affect the output of our work. Using this method, we can produce many new images. The block diagram of the image data augmentation technique is shown in Fig. 3. We put all the 9801 SAR oil spill images, along with the annotated images, in a folder in Google Drive to provide better access and specified the link to the folder in the program code. The Google Drive folder contained two kinds of images: original images and masks. The original image was the image received from SARs, and the annotated image was the image obtained from the software for training CNNs. The database of the images is available here and online. The original and annotated images were stored in two separate folders.
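The mask-multiplication step described above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names `paste_spill` and `augmented_count` are hypothetical, the images are assumed to be single-channel arrays of equal size, and the masks are assumed to be 1 on oil spill pixels and 0 elsewhere. The count assumes that the 9801 images consist of the 99 originals plus every ordered source-destination pair, which matches the reported total.

```python
import numpy as np


def paste_spill(src_img, src_mask, dst_img, dst_mask):
    """Paste the oil spill of a source image onto a destination image.

    Assumption: all arrays are 2-D with the same shape, and masks are
    nonzero on oil spill pixels and 0 elsewhere.
    """
    src_mask = (src_mask > 0).astype(src_img.dtype)
    dst_mask = (dst_mask > 0).astype(src_img.dtype)
    # Multiplying by the mask keeps spill pixels and zeroes the rest.
    spill_only = src_img * src_mask
    # Keep the destination outside the pasted spill, the spill inside it.
    new_img = dst_img * (1 - src_mask) + spill_only
    # The new annotation marks both spills; overlapping pixels stay 1.
    new_mask = np.maximum(src_mask, dst_mask)
    return new_img, new_mask


def augmented_count(n_originals):
    """Originals plus every ordered (source, destination) pair."""
    return n_originals + n_originals * (n_originals - 1)


# Tiny 2 x 2 example: one spill pixel in each image.
src_img = np.array([[10, 200], [10, 10]], dtype=np.uint8)
src_mask = np.array([[0, 1], [0, 0]], dtype=np.uint8)
dst_img = np.array([[50, 50], [30, 50]], dtype=np.uint8)
dst_mask = np.array([[0, 0], [1, 0]], dtype=np.uint8)

new_img, new_mask = paste_spill(src_img, src_mask, dst_img, dst_mask)
print(new_img.tolist())     # [[50, 200], [30, 50]]
print(new_mask.tolist())    # [[0, 1], [1, 0]]
print(augmented_count(99))  # 9801
```

When the pasted spill overlaps the destination spill, the source pixels overwrite the destination pixels in this sketch, and the combined mask still marks the whole region as oil.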

III. PROPOSED ALGORITHM
In this article, we first annotated all the oil spill images with Supervisely software. Deep learning models are data-hungry, particularly large architectures comprising numerous trainable parameters. To learn general classification rules and features rather than overfit the training data, the model must be exposed to numerous input-output pairs, which for oil spill detection are SAR images and segmentation masks. Although unlabeled data are inexpensive and accessible in large quantities, labels are normally scarce and expensive to obtain [30]. Augmenting the dataset during training prevented the model from overfitting the training data and enhanced its generalization ability on unseen instances. Randomized data augmentation can improve generalization performance in various computer vision tasks, including remote sensing applications [29]. Then, using the image data augmentation technique, we increased the SAR oil spill images to 9801 images, since many images are needed to train convolutional networks without overfitting. Next, we trained the U-NET [31] and DeepLabV3 networks separately on the images, and at the end, we gave some images to each network as test samples and displayed the resulting oil spill segmentation images. The block diagram of the proposed method is shown in Fig. 4.
Fig. 3. Image data augmentation technique.

IV. EXPERIMENTAL RESULTS

A. SAR Oil Spill Detection Results Using U-NET Network
We set the size of all the oil spill images to 1024 × 1024 pixels. We set the database address in the program to /content/drive/My Drive/DataSets/OilSpillGen.
Multiplying the original images by the annotated images gave us the segmented oil spill images. As seen, an oil spill was added to the image, and in the annotated images, both oil spills were covered by the mask. This algorithm was used in the U-NET network application code. As observed in Fig. 5, the final oil spills had a black background and the spots were quite clear. We considered 90% of the original images for training and 10% for testing. Then, after the image augmentation technique, we used 10% of the training data for validation. Identifying oil spills means that the areas containing oil spills are entirely isolated from their surroundings in the SAR image. The U-NET network training results are given in Table I. The network was trained for 300 epochs with a batch size of 5. As observed, at epoch 22, the parameter values were mean_iou = 0.9058 and val_mean_iou = 0.7888, which were the closest to each other and the highest. The proximity of these two values showed that the network was trained well, and the value of val_mean_iou, which was among the highest values (0.788 or 78.8%), showed the oil spill detection accuracy of the network. Furthermore, we saved the weights obtained from network training so that they could be reused without spending time on retraining. In Table I, loss is the network error on the training data, and val_loss is the error on the validation data. Additionally, mean_iou indicates the oil spill detection accuracy on the training data, and val_mean_iou indicates the network accuracy on the validation data. Diagrams of the network training parameters are shown below.
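The two-stage split described in this paragraph (90%/10% of the originals for training and testing, then 10% of the augmented training data for validation) can be sketched as follows. This is only an illustration using the Python standard library: the helper `split_indices`, the seeds, and the placeholder size `n_augmented_train` are assumptions, since the article does not state how the split was implemented or how many augmented images fell into the training set.

```python
import random


def split_indices(n_items, holdout_fraction, seed=0):
    """Shuffle range(n_items) and split off a holdout part.

    Returns (kept_indices, holdout_indices).
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_items * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]


# Stage 1: hold out 10% of the 99 original images for testing.
train_originals, test_originals = split_indices(99, 0.10, seed=0)
print(len(train_originals), len(test_originals))  # 89 10

# Stage 2: after augmentation, hold out 10% of the training data for
# validation. The size below is a placeholder, not a reported value.
n_augmented_train = 1000
train_aug, val_aug = split_indices(n_augmented_train, 0.10, seed=1)
print(len(train_aug), len(val_aug))  # 900 100
```

Splitting the originals before augmentation, as in stage 1, keeps composites built from test images out of the training data.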
We observed the loss diagram as a function of the number of epochs in the U-NET network. The loss compares the output, or prediction, with the actual value according to the loss formula to obtain the error, and the network reduces this value during training by backpropagation and weight adjustment. In Fig. 6, as the number of epochs increased, the error decreased and reached a constant value.
We also observed the error value on the validation data. As the epochs increased in Fig. 7, the error settled at about 0.05.
The mean_iou parameter is used in detection and segmentation studies. IoU (intersection over union) compares two regions: the ground-truth region and the predicted region. It measures the extent to which these two regions overlap; the closer the value is to one, the better the fit. The mean_iou parameter is the mean of the IoU values over the data being evaluated. The diagram showed that the mean_iou value in the U-NET network increased with the number of epochs and reached a constant value of about 0.9, as shown in Fig. 8. The val_mean_iou parameter is the mean_iou value on the validation data, which reached around 0.8 as the epochs increased, as shown in Fig. 9. Network testing was performed on three images on which the network had not been trained. The oil spill areas were highlighted in red.
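As an illustration of the IoU measure discussed above, the following NumPy sketch computes the intersection over union of two binary masks and averages it over a set of images. The function names are hypothetical, and averaging over images is an assumption: the article does not state whether the mean_iou values in the tables are averaged over images or over classes.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, true).sum() / union)


def mean_iou(pred_masks, true_masks):
    """Average the per-image IoU over a set of mask pairs."""
    return float(np.mean([iou(p, t) for p, t in zip(pred_masks, true_masks)]))


pred = [[1, 1], [0, 0]]   # predicted spill: two pixels
true = [[1, 0], [0, 0]]   # ground-truth spill: one pixel
print(iou(pred, true))                       # 0.5
print(mean_iou([pred, true], [true, true]))  # 0.75
```

A value of 1 means that the predicted and ground-truth regions coincide exactly, and a value of 0 means that they do not overlap at all.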
The target image, 723 × 543 pixels in size, was produced on November 20 at ESRIN, Italy. As shown in Fig. 10, the oil spill was scattered rather than concentrated in one area, and the U-NET network managed to identify the areas containing oil and even narrow oil borders.
In Fig. 11, the oil spill was concentrated in one area and was not dispersed. The U-NET network identified the oil spill well in the tested image and marked it in red in full detail. Fig. 12 shows the detection for an image produced on November 20 at ESRIN, Italy. As seen, the oil spill had a closed-loop shape, and there were clear areas inside the loop with no oil. The U-NET network recognized the thick borders of the spot well, and the borders were well displayed.

B. SAR Oil Spill Detection Results Using DeepLabV3 Network
DeepLabv3+, an encoder-decoder DCNN, extended DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results, especially along object boundaries, which vastly improved semantic image segmentation. Due to the unique radar imaging mechanism of SAR, the SAR image structure is complex, and the content is extremely rich, making the semantic segmentation of SAR imagery more difficult than that of optical imagery [44]. We set the size of all the oil spill images to 512 × 512 pixels. We provided the address of the database in the program.
The DeepLabV3 network training results are given in Table II. Here, the network was trained for 300 epochs with a batch size of 4. Multiplying the original images by the annotated images gave us the segmented oil spill images. As shown, an oil spill was added to the image, and in the annotated images, both oil spills were covered by the mask. The same algorithm was used in the DeepLabV3 network application code. As shown in Fig. 13, the final oil spills had a black background, and the oil spills were quite clear. We considered 90% of the original images for training and 10% for testing. Then, after the augmentation technique, we used 10% of the training data for validation. Identifying oil spills means that the areas containing oil spills are entirely isolated from their surroundings in the SAR image.
As shown in Fig. 14, the DeepLabV3 network error value decreased as the number of epochs increased and reached a constant value. The DeepLabV3 validation error settled between 0.5 and 1 after a series of fluctuations around 1. This network had larger errors than U-NET. The val_loss diagram of the DeepLabV3 network is shown in Fig. 15.
The mean_iou value of the DeepLabV3 network settled at about 0.5 as the epochs increased, which was lower than that of the U-NET network. Accordingly, this network was weaker in oil spill segmentation. The mean_iou diagram of the DeepLabV3 network is shown in Fig. 16.
As the diagram in Fig. 17 shows, as the epochs increased, the mean_iou value on the validation data rose to 0.55, which was lower than the U-NET value; that is, DeepLabV3 performed less satisfactorily in segmenting the validation data. Network testing was performed on three images on which the network had not been trained. The oil spill areas were highlighted in red.
The target image, 723 × 543 pixels in size, was produced on November 20 at ESRIN, Italy. As the image shows, the oil spill was diffused rather than concentrated in one area, and the DeepLabV3 network moderately identified the areas with oil spills and narrow oil borders. As shown in Fig. 18, the left part of the image had parts marked in pale red in addition to the oil spill, which was considered an error.
In Fig. 19, the oil spill was concentrated in one area and was not dispersed.
In this article, the following hyperparameters were used in the U-NET and DeepLabV3 networks: optimizer, activation function, learning rate, epochs, and batch size. In both networks, the activation function is ReLU. The optimizer is RMSprop in U-NET and Adam in DeepLabV3. The learning rate in both networks is 0.001. The batch size is 5 in U-NET and 4 in DeepLabV3. The number of epochs in both networks is 300. These hyperparameters did not change during the test.
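The hyperparameters listed above can be collected in one place, for example as a small configuration object. The sketch below is only illustrative: the class name `TrainConfig` and its field names are hypothetical, and the values are those stated in the text (the image sizes are taken from the two results subsections).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the reported training settings."""
    network: str
    optimizer: str
    batch_size: int
    image_size: int            # square input, in pixels
    activation: str = "relu"
    learning_rate: float = 0.001
    epochs: int = 300


UNET = TrainConfig(network="U-NET", optimizer="RMSprop",
                   batch_size=5, image_size=1024)
DEEPLABV3 = TrainConfig(network="DeepLabV3", optimizer="Adam",
                        batch_size=4, image_size=512)

for cfg in (UNET, DEEPLABV3):
    print(cfg.network, cfg.optimizer, cfg.learning_rate,
          cfg.epochs, cfg.batch_size, cfg.image_size)
```

Keeping the shared values (activation function, learning rate, epochs) as defaults makes the differences between the two networks explicit: only the optimizer, batch size, and image size differ.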
An important feature of oil spill segmentation with this technique and these networks is its generality. To examine it, noisy images, e.g., images with speckle noise, were tested and yielded accurate segmentation results, which are displayed here.
The oil spill segmentation results for SAR images with speckle noise using the U-NET network are presented in Fig. 21. Remarkable accuracy was obtained for the segmentation results. It should be noted that for both networks, we reviewed the output data of all 300 epochs. However, because there is not enough space in the article to include all 300 epochs, the tables list only about 20 epochs, including the epoch with the highest accuracy.
Regarding future challenges and motivations, this article investigated oil spill segmentation using CNNs. In future work, the number of input images of the networks should be increased; for instance, data from further satellites that provide more images containing oil spills should be examined to increase the segmentation accuracy. The neural network architecture can also be deepened to obtain better accuracy. In addition, the network settings should be optimized; for instance, the optimizer or learning rate could be changed.
Hyperspectral images gathered from airborne or satellite sources suffer from spectral variability, which inevitably makes it difficult to estimate the spectral unmixing accurately. An augmented linear mixing model (ALMM) was introduced by Heng et al. to cope with spectral variability by using a data-based learning strategy for hyperspectral unmixing inverse problems. In this model, other spectral variabilities were also modeled, such as temperature, local humidity, and atmospheric influence, as well as instrument effects such as noise and nonlinear effects. The ALMM considered both the main scale factor and other spectral variabilities by introducing a spectral variability dictionary to augment the endmember dictionary. More importantly, the presented technique can achieve more accurate abundance estimation than other advanced algorithms because it models the scale factor and the other spectral variabilities separately based on their distinctive features [45].
Unlike optical images, radar images are formed by the coherent interaction of transmitted microwaves with targets. Therefore, the speckle noise effect is caused by the coherent summation of randomly scattered signals in each pixel. There is more noise in radar images than in optical images. SAR images are inherently degraded by speckle, which results from the coherent nature of the scattering phenomena. Speckle reduces the utility of SAR images by reducing the capability to discover ground objects, which has adverse effects on the image quality and hampers the observation of crucial information in the image. In this article, noise-free SAR oil spill images were considered for the training and input of the network. The inclusion of noise in SAR oil spill images should be assessed as a new issue. Here, the test data for the U-NET and DeepLabV3 networks were noisy SAR oil spill images.
As seen, the U-NET network yields good segmentation results for noisy oil spill images.
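For readers who wish to reproduce a test with noisy inputs, speckle can be simulated as multiplicative noise. The article does not state which noise model or noise level was used, so the following NumPy sketch rests on assumptions: it applies the common multiplicative gamma model for intensity images, and the function name `add_speckle` and the parameter `looks` (the equivalent number of looks) are chosen here for illustration.

```python
import numpy as np


def add_speckle(image, looks=1, seed=0):
    """Multiply an intensity image by unit-mean gamma noise.

    Assumption: Gamma(shape=looks, scale=1/looks) noise, which has
    mean 1 and variance 1/looks; fewer looks give stronger speckle.
    """
    rng = np.random.default_rng(seed)
    img = np.asarray(image, dtype=np.float64)
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=img.shape)
    return img * noise


# A flat gray patch keeps its mean level but becomes grainy.
flat = np.full((200, 200), 100.0)
noisy = add_speckle(flat, looks=4, seed=0)
print(noisy.shape, round(float(noisy.mean())))
```

Because the noise is multiplicative, bright sea regions receive larger absolute fluctuations than dark oil regions, and pixels that are exactly zero stay zero.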

V. CONCLUSION
Deep network training needs a large amount of data and usually faces the overfitting issue with a small volume of data. In this article on SAR images, our data were limited, and with this number of images, the deep segmentation networks faced the overfitting issue and were not efficient. One approach to overcoming this problem is to use augmentation techniques, although conventional techniques cannot reach the desired number of several thousand images. The innovative task carried out in this article was to combine the existing images to produce a large number of images. Here, the number of original images was 99, and each oil spill can be placed on the 98 other images, giving 99 × 98 = 9702 new images, or 9801 images together with the 99 originals. With this number of images, we managed to overcome the overfitting issue. As stated in the previous sections about the training and testing results of SAR oil spill images using the U-NET and DeepLabV3 networks, U-NET and DeepLabV3 detected oil spills with 78.8% and 54% accuracy, respectively; thus, the U-NET network had the higher detection accuracy. Given the architecture and structural features of the U-NET network, training for 300 epochs and testing the images took about 8 h on the Google Colab server, and the network performed oil spill separation and detection acceptably. However, in addition to detecting oil spills, the DeepLabV3 network marked additional areas as oil spills, which were network errors. Regarding the remaining motivations and challenges, this article investigated oil spill segmentation using CNNs. In future work, it is better to increase the number of input images of the networks, for instance, by examining data from further satellites that provide more images of oil spills, to increase the segmentation accuracy.
At the same time, the neural network architecture can be deepened, or its settings can be optimized (e.g., by changing the optimizer or learning rate), to achieve better accuracy. In summary, the following approaches may improve the accuracy of oil spill segmentation. First, increasing the input data will lead to more training of the neural networks and, as a result, better segmentation accuracy. Second, the number of epochs can be increased for further training, which may again improve accuracy. Third, using a deeper or different neural network is expected to increase the accuracy of oil spill segmentation. Finally, updating and optimizing the hyperparameters (such as the optimizer and learning rate) can contribute significantly to improving the accuracy of oil spill segmentation.