Classifying Ships in SAR Images by Using Contour Bias Features and Transfer Learning

With the development of synthetic-aperture radar (SAR) image interpretation technology in recent years, classifying detected ships based on SAR images has become an important trend in ocean monitoring. However, owing to the underlying imaging mechanism, the texture of SAR images typically contains noise that cannot be easily eliminated. Therefore, in this study, we consider more stable target contour features to classify SAR images of marine vessels. Based on a deep learning algorithm, we propose a method to obtain contour bias features by employing style transfer learning to classify detected ships from SAR data. We also adopt transfer learning to mitigate the uneven distribution of SAR datasets of representative ships. The results of experiments conducted to evaluate the proposed approach show that contour bias features improve the generalization performance and classification accuracy of the model. They also show that transfer learning effectively avoids the problem of data imbalance and, thus, improves classification accuracy on the OpenSARShip 2.0 database.


I. INTRODUCTION
With the increasing density of marine shipping and the development of ship management automation technology, the demand for fine-grained management of marine ships is increasing. Ship monitoring technology has been developed and widely adopted in navigational systems for narrow waters, inland rivers, and ports. However, existing monitoring networks still involve some blind spots for remote waters or non-cooperative targets. Ship classification based on synthetic-aperture radar (SAR) imaging has been proposed and applied in this context. SAR is an active microwave imaging sensing modality. Based on synthetic-aperture technology, SAR produces two-dimensional high-resolution images by transmitting broadband signals in both the range and azimuth directions. Compared with traditional optical imaging, SAR serves as an important technique to obtain information to classify ground objects and monitor natural disaster areas owing to its capacity to penetrate physical objects and obtain images reflecting the microwave scattering features of targets. At present, SAR has been widely adopted in military and civilian fields to realize target detection and classify ground objects; therefore, methods to identify, classify, and monitor ships based on SAR have attracted considerable attention as a topic of active research.
The imaging principle of SAR is complex because the results are not only related to the wavelength, angle of incidence, and polarization mode of the radar, but also to the geometric features and material of the target. Owing to the difficulty in understanding SAR images, research on SAR image target interpretation technology has largely been based on existing optical image target interpretation algorithms combined with some particularities of SAR images. These methods process SAR images as optical images. Currently, ship classification based on SAR images is mainly divided into traditional feature-based and machine-learning-based classification methods. In the former case, features are extracted manually and input to a classifier. Commonly used features include (1) geometric features such as ship length, aspect ratio, covariance coefficient, contour features, and ship scale, and (2) scattering features such as two-dimensional comb features, local radar cross-section density, permanent symmetric scatterers, and polarization features. These methods are based on advanced and interpretable mathematical theories as an essential prerequisite, and are thus highly targeted and difficult to generalize. In contrast, classification methods based on machine learning train a classification model to recognize and classify testing samples on the basis of sufficient training samples. As an important branch of machine learning with accelerated data processing, good fault tolerance, and self-training capabilities, deep learning can process signal information produced in complex environments, with unknown background knowledge, or by unclear reasoning rules. It also allows samples to have defects or distortions, and therefore provides adaptive performance and high classification accuracy.
However, the deep learning process requires a large number of labeled training samples to train a classification model. In addition, the training and testing data should exhibit the same data distribution. At present, ship classification from SAR image data still involves many challenges. First, relatively few high-resolution data are available to describe ship SAR images, and samples in the existing datasets are unevenly distributed. In addition, given the particularities of the SAR imaging modality, expert interpretation is usually required to fully utilize SAR images to obtain improved classification accuracy. Considering these two factors, many methods have been proposed to apply transfer learning to SAR image classification. The amount of time and training data required to train a model can be reduced by reusing the knowledge of another model trained on a related task or domain. At present, a SAR database with good image quality is available, and because images produced by the same imaging mechanism share the same imaging features, in this paper we choose to learn and extract the imaging features of the source domain and apply them to the target domain.
Transfer learning can also enable deep learning models to automatically obtain edge and texture features from labeled images. Geirhos et al. [1] pointed out in 2018 that convolutional neural network (CNN) classification methods generally rely too heavily on object textures rather than global object shapes; however, SAR image textures tend to exhibit considerable noise, which cannot be perfectly removed without losing image quality. In addition, Lang et al. [2] pointed out in 2017 that in the medium-resolution SAR images used for ship detection and classification, the resolution of scattering features decreases, and the features of typical ships vary with the angle of incidence, the SAR sensor characteristics, and the imaging environment.
The contributions of this study are summarized as follows. 1) Our experimental results reveal that the obtained contour bias features are more stable than the combination of texture and contour features, which can significantly improve accuracy and precision in SAR classification tasks.
2) Moreover, the results show that contour bias features can be used effectively in transfer learning, because the model can be trained to classify samples with improved accuracy by adjusting the feature learning preferences and influencing the feature weights.
The remainder of this paper is organized as follows. In Section II, we first summarize existing machine learning methods that exhibit good performance in SAR image classification and then focus on deep learning methods. In Section III, we detail the proposed transfer learning classification method for SAR images based on contour bias features. In Section IV, we present the results of experiments conducted to verify the effectiveness of the proposed approach by analyzing the features obtained from deep learning and comparing different models. Finally, in Section V, we provide conclusions and suggest some possible avenues for future research.

II. RELATED WORK
At present, research on SAR image classification is primarily based on machine learning, which can be divided into traditional machine learning and deep learning methods. Classification based on traditional machine learning models usually requires researchers to be familiar with the imaging principle of SAR images and design features manually for input into predictive classifiers. In contrast, classification methods based on deep learning establish a relatively more complex model designed to learn the relationship between input data and their associated label and thus automatically extract features for classification. Compared with traditional machine learning methods, this approach reduces labor and time costs and improves generalization and classification accuracy.
In the application of traditional machine learning methods to recognize and classify ship SAR images, some methods have achieved better classification accuracy by integrating feature models tailored to SAR image targets into classifiers. Lang et al. [2] proposed a simple and efficient naive geometric features method for ship classification, which was designed to ascertain the essential differences between different types of ships; multiple kernel learning was adopted to learn the combination weights. Lin et al. [3] proposed an improved SAR-HOG model in 2018 and applied it to ship classification using manifold learning to achieve dimensionality reduction. They also integrated a dictionary into their proposed classifier in the task-driven dictionary learning (TDDL) framework and imposed structured incoherent constraints on TDDL to achieve better classification results. In 2011, Margarit et al. [4] developed a method to classify ships according to a fuzzy logic decision rule by combining scattering and geometric features of targets. In 2013, Xing et al. [5] combined different structural and scattering features of targets to construct an internal representation, and then used the sparse representation method to classify them. Wu et al. [6] proposed a ship classification method in 2015 that estimates feature vectors by calculating the average value of kernel density estimation, three structural features, and an average backscattering coefficient, and then classifies ships using a support-vector machine model. In 2017, Gorovyi et al. [7] proposed the use of Haralick features for texture recognition and local binary patterns as a comparison standard, and classified images by fusing azimuth and range target profiles.
Some methods have achieved improved classification performance by implementing more effective classifiers. Along these lines, Wang et al. [8] studied a hierarchical classifier based on geometric and scattering features to classify ships. In 2019, Gishkori et al. [9] proposed a method that used the invariance of pseudo-Zernike moments and introduced auxiliary atoms to construct a sparse dictionary and increase the redundancy of dictionaries to perform image classification. Clustering algorithms have also been considered for this purpose. For example, in 2020, Xu et al. [10] proposed a new distance metric learning method that improved the original Laplacian regularized metric learning by adding an inter-class distribution shift regularization. This method can effectively improve the separability between classes and the compactness within classes, as well as the performance of fine-grained ship classification from SAR images. The same authors introduced a new transfer metric learning method [11], which integrated pairwise constraints, joint distribution adaptation, and geometric structure preservation to realize discriminant information preservation, geometric structure preservation, and domain shift reduction, and combined these terms with manifold regularization into a unified optimization function to fully exploit their complementarity and improve the performance of SAR ship classification systems.
Most existing deep learning methods for SAR ship classification are based on CNN models. In 2017, Lin et al. [12] introduced a CNN-based architecture designed to be trained with smaller datasets; the network was flexibly constructed by stacking a unit architecture to extract deep feature representations for classification. In 2016, Chen et al. [13] developed the highly targeted A-ConvNets network for the MSTAR database, which achieved experimental accuracy that could not be replicated on other databases. In 2019, De Laurentiis et al. [14] proposed a CapsNets model for ship classification, which was designed to improve the use of spatial information. Dong et al. [15] developed a deep residual network for fine-grained ship classification to achieve high accuracy; however, its data are too simple for the method to be widely used on other SAR images. He et al. [16] proposed extending a dense convolutional network to medium-resolution SAR ship classification and used a multi-task learning framework to minimize the softmax log loss and triplet loss to better extract deep features and improve classification accuracy. Some researchers have also considered the features of SAR images. For example, Huang et al. [17] proposed a novel deep learning framework that considers complex SAR images, spatial texture information, and backscattering patterns of ground objects. Zhang et al. [18] proposed a method that fuses histogram of oriented gradients features with neural networks.
Most of these deep learning approaches are supervised machine learning methods that require a large, balanced dataset with high-quality annotations. Some researchers have applied transfer learning to the SAR field to better overcome the limited amount of data available and the imbalance of samples in typical datasets. For example, Wang et al. [19] proposed in 2018 to transfer a deep convolutional network pre-trained on the ImageNet dataset to small-sample SAR ship classification to extract abstract features of ships. Huang et al. [20] comprehensively studied what, where, and how to transfer in SAR automatic target recognition and improved the accuracy with which a system could classify ships by fine-tuning a deep network trained on other databases. In 2020 [21], a deep transfer learning method based on similarly annotated optical land-cover datasets was proposed, and a top-2 smooth loss function with cost-sensitive parameters was introduced to solve the problems of label noise and imbalance.

III. METHOD
Inspired by transfer learning, in this study, we propose a transfer learning method that adapts a model trained on a balanced SAR image set to an unbalanced SAR image set. We consider two SAR image datasets: MSTAR [22] and OpenSARShip 2.0 [23]. MSTAR is a SAR image dataset of military vehicles, which is used as the source domain for transfer learning in our proposed approach. It contains ten kinds of target images acquired of military vehicles at various azimuth angles. The data distribution is balanced, and the recognized targets are located in the center of the image. The OpenSARShip 2.0 dataset comprises SAR images of ships acquired by the Sentinel-1 satellite, including 16 types of ship images acquired from different backgrounds. We adopted this dataset as the target domain for the transfer learning process in the proposed method. However, the data distribution of OpenSARShip 2.0 is very uneven, the images contain a considerable variety of content other than the target, and the ship targets account for a relatively small proportion of each image, which is not conducive to training deep learning models. Therefore, the proposed approach transfers the parameters trained on the MSTAR database to the OpenSARShip 2.0 database.
Multiplicative noise is difficult to remove in SAR imaging owing to the special imaging mechanism. Therefore, the proposed method extracts contour bias features by applying the style transfer method to SAR images. One SAR image is used as the content image, and ten different style images are used for style transfer. The SAR image after style transfer retains the original contour but exhibits different textures. When the deep model learns these images with different textures that are marked with the same label, contour bias features are obtained, and SAR images of different styles are generated. Our experimental results show that this can improve classification accuracy. In addition, we also propose the implementation of transfer learning from tank SAR images to ship SAR images, and our experimental results demonstrate that this approach is an effective method to improve the accuracy of ship classification methods based on SAR images. The key contributions of the present work can be divided into two parts. 1) Because image noise in the MSTAR dataset cannot be removed without degrading the quality of the image, stylized images are used for training, and the network is forced to learn more stable contour bias features for classification. 2) Owing to the poor image quality, relatively few samples, and unbalanced data in the OpenSARShip 2.0 dataset, deep learning models typically overfit the data. To avoid this phenomenon, our proposed approach implements transfer learning from the tank dataset to the ship dataset. The model is trained on the MSTAR dataset, transfer learning is performed to the OpenSARShip 2.0 dataset, and the last several layers of the network are fine-tuned. The process flow of the proposed approach is shown in Fig. 1.

A. CLASSIFICATION MODEL
For the classification model, this study chose the CNN, which is widely used and has many derivative models. A CNN is composed of convolutional layers, activation functions, pooling layers, and fully connected layers. For any given input image, the neural network cannot know in advance which parts of the image comprise its salient features, so every pixel X in the original image is considered in the convolutional process. A convolution operation is performed with a square matrix of size N × N, called the convolution kernel. The convolution kernel K traverses a picture of size W × W × D from beginning to end, as shown in Equation 3.1, where D is the depth of the picture, consistent with the depth of the convolution kernel.
In addition, some parameters determine how to convolve: for example, padding, which determines whether zeros are filled around the input picture, and stride, which determines the distance the convolution kernel moves at each step. Finally, the size of the output matrix is M × M (see Equation 3.2).
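As a concrete check of the padding and stride parameters described above, the standard output-size relation for a square convolution (the quantity referenced by Equation 3.2) can be sketched as follows; the function name is illustrative:

```python
def conv_output_size(w, n, padding=0, stride=1):
    """Standard output size of a square convolution:
    M = (W - N + 2P) / S + 1, where W is the input width,
    N the kernel size, P the zero-padding, and S the stride.
    Integer division assumes the kernel fits evenly."""
    return (w - n + 2 * padding) // stride + 1

# A 224x224 input with a 3x3 kernel, padding 1, and stride 1
# keeps its spatial size (M = 224).
print(conv_output_size(224, 3, padding=1, stride=1))
```

With padding 0 and stride 2 on a 7-pixel-wide input and a 3 × 3 kernel, the same formula gives M = 3, illustrating how stride shrinks the output map.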
Convolution is essentially an input-output mapping that can learn a large number of relationships between the input and output without any exact mathematical expression relating them. A convolutional neural network trained using known patterns learns a mapping between the input and output. The activation function is a nonlinear function that maps the output of the convolutional layer; commonly used activation functions are Sigmoid, Tanh, and ReLU.
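The three activation functions named above can be written elementwise in a brief NumPy sketch:

```python
import numpy as np

# Common activation functions, applied elementwise to the
# output of a convolutional layer.
def sigmoid(x):
    # Squashes values into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes values into (-1, 1), zero-centered.
    return np.tanh(x)

def relu(x):
    # Passes positives through, zeroes out negatives.
    return np.maximum(0.0, x)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))  # [0. 0. 2.]
```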
The pooling layer reduces the computational complexity from the upper layer, and simultaneously exhibits translation invariance. The extracted features remain unchanged even with a small displacement, which can eliminate the influence of image distortion. According to specific needs, we can choose max pooling, that is, keeping the maximum value in the receptive field, or mean pooling, which retains the average value in the receptive field.
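A minimal NumPy sketch of the two pooling choices described above, using a 2 × 2 receptive field with stride 2 (the function name is illustrative):

```python
import numpy as np

def pool2x2(x, mode="max"):
    """2x2 pooling with stride 2 on a 2-D feature map.
    'max' keeps the maximum value in each receptive field;
    'mean' keeps the average value instead."""
    h, w = x.shape
    # Group the map into non-overlapping 2x2 blocks.
    blocks = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 1., 2., 3.],
                 [1., 0., 3., 2.]])
print(pool2x2(fmap))          # max pooling
print(pool2x2(fmap, "mean"))  # mean pooling
```

A one-pixel shift of the input changes these pooled outputs only slightly, which is the translation-invariance property the text refers to.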
The fully connected layer plays a classification role in the entire neural network; after deep layers such as convolutional, activation, or pooling, features are identified and classified through the fully connected layer. Finally, the backpropagation algorithm is used to optimize the network structure, and the weights are adjusted by calculating the difference between the actual and ideal outputs.

B. IMAGE STYLE TRANSFER
SAR images are processed by using image style transfer. First, the SAR imaging system is based on the coherence principle; that is, in the radar echo signal, the gray values of adjacent pixels exhibit random changes owing to coherence, and this random change varies around a certain mean value. Thus, speckle noise is produced in the image, and at present no method exists to remove such noise without losing image information. Image style transfer is adopted to weaken the influence of image texture on image classification, preserve image contour information, and guide the network to learn more stable contour bias information for classification. Second, compared with the general datasets used in optical image classification, the MSTAR database has less data and is prone to overfitting. Therefore, we adopt image style transfer processing: one SAR image is processed with ten different style images to generate ten stylized images with different textures and consistent contours (Fig. 4) to augment the data. Moreover, we verified that image style transfer processing can improve the accuracy of image classification.
In the experimental evaluation, images from the MSTAR dataset were used as content images, images from Kaggle's Painter by Numbers dataset were used as style images, ten different style images and content images were randomly selected, and the AdaIN style transfer algorithm [24] was used for style transfer.
First, the content image c and style image s are input into the VGG encoder, and the feature maps obtained from them are saved as c′ and s′, respectively. According to Equation 3.3, we first standardize the content feature map, then multiply it by the standard deviation of the style feature map and add the mean value of the style feature map, so that the mean and standard deviation of the content features match those of the style features as closely as possible, where µ and σ denote the mean value and standard deviation, respectively. The result t is then input to a VGG decoder to obtain the initial stylized image. To make the stylized image closer to the style image, the initial stylized image is input back into the VGG network to obtain its feature map t′, which is then compared with the original style and content features, and the weighted sum of the two losses is optimized. The total loss is shown in Equation 3.5, which consists of the content loss L_c and the style loss L_s, where λ is a weight hyperparameter. The content loss L_c is the two-norm of the elementwise difference between the feature maps t and t′, as shown in Equation 3.6. The style loss L_s is the two-norm of the difference between the mean and standard deviation of each channel of each layer feature map of t′ and s, where l indexes the convolutional layers in VGG (see Equation 3.7).
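The AdaIN normalization step described above (Equation 3.3) can be sketched in NumPy as follows; the random feature maps, the (C, H, W) shapes, and the `adain` helper name are illustrative stand-ins for actual VGG encoder outputs:

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """AdaIN: standardize the content feature map per channel,
    then rescale it with the style feature statistics, i.e.
    t = sigma(s') * (c' - mu(c')) / sigma(c') + mu(s').
    Feature maps are arrays of shape (C, H, W)."""
    mu_c = content_feat.mean(axis=(1, 2), keepdims=True)
    sigma_c = content_feat.std(axis=(1, 2), keepdims=True) + eps
    mu_s = style_feat.mean(axis=(1, 2), keepdims=True)
    sigma_s = style_feat.std(axis=(1, 2), keepdims=True) + eps
    return sigma_s * (content_feat - mu_c) / sigma_c + mu_s

rng = np.random.default_rng(0)
c = rng.normal(0.0, 1.0, (2, 8, 8))   # stand-in content features
s = rng.normal(3.0, 2.0, (2, 8, 8))   # stand-in style features
t = adain(c, s)
# After AdaIN, the per-channel mean and standard deviation of t
# match those of the style features s.
```

The decoder, the content loss L_c, and the style loss L_s of Equations 3.5-3.7 operate on top of this normalized map t during training.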
Two experiments were conducted to compare and analyze the effects of extracting more stable contour bias features from the stylized training images, as described below.
Experiment 1 (Expt-1): The original MSTAR images were used as the training and verification datasets.
Experiment 2 (Expt-2): The stylized MSTAR images were used as the training set, and the original MSTAR images were used as the verification set.
The models for the two experiments were trained separately using the same deep learning network, and the extracted feature maps were retained for visual analysis (Fig. 2). The yellow part of the heat map shows the features extracted by the network, and the blue part of the fusion image in the heat map marks the regions the network learned from. From the point of view of feature extraction, Expt-1 extracted a large amount of background noise, which interferes with the classification task, and the extracted target features were mostly located in the shadow of the target. In contrast, the features extracted in Expt-2 were very clean in the background area and were mainly concentrated on the target and its contour. This contour bias feature can effectively improve the classification accuracy of the network.

C. TRANSFER LEARNING
Effective methods to train a deep neural network with small datasets include transfer learning and fine-tuning. First, pre-training must be completed on a large dataset such as ImageNet, and then the top-level part of the network must be modified to adapt to the specific goal. In this study, transfer learning and fine-tuning from the SAR field to the SAR field were adopted. SAR images share the same low-level features; when transferring from SAR to SAR, the underlying features can be shared, that is, the parameters of the underlying convolutional kernels are held constant, and only the top layers are fine-tuned, as shown in Fig. 3. Consider the CNN VGG19 model as an example. In the figure, green represents the convolutional layers, blue represents the pooling layers, and yellow represents the fully connected layers. In the transfer learning process with VGG19 in the proposed approach, the parameters of the layers in the gray part are fixed and not updated; only the last convolutional layer, the pooling layer, and the two fully connected layers are updated.
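The freezing policy described above can be sketched as follows; the layer names and the `trainable_layers` helper are hypothetical placeholders for a framework's per-layer trainable flags, not the paper's actual implementation:

```python
# Illustrative VGG19 layer order: conv blocks 1-5, then the
# classifier head (names are placeholders, not a framework API).
VGG19_LAYERS = [
    "conv1_1", "conv1_2", "pool1",
    "conv2_1", "conv2_2", "pool2",
    "conv3_1", "conv3_2", "conv3_3", "conv3_4", "pool3",
    "conv4_1", "conv4_2", "conv4_3", "conv4_4", "pool4",
    "conv5_1", "conv5_2", "conv5_3", "conv5_4", "pool5",
    "fc1", "fc2",
]

def trainable_layers(layers, n_finetune=4):
    """Freeze every layer except the last n_finetune: here the
    last convolutional layer, the final pooling layer, and the
    two fully connected layers remain trainable."""
    return {name: i >= len(layers) - n_finetune
            for i, name in enumerate(layers)}

flags = trainable_layers(VGG19_LAYERS)
# Only conv5_4, pool5, fc1, and fc2 are updated when fine-tuning.
```

In a real framework the same effect is obtained by setting a per-layer trainable flag before compiling the model, so gradients are computed only for the unfrozen top layers.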

IV. EXPERIMENT AND ANALYSIS
The hardware environment included an Intel i9 CPU, an NVIDIA GeForce RTX 3090 GPU, and 32 GB of RAM. The software was compiled in the Python language environment using the TensorFlow open-source machine learning library. CUDA 11.4 was used to call the GPU for accelerated training.

A. SAR IMAGE STYLE TRANSFER
In this study, the Painter by Numbers dataset from Kaggle was used as the style source, and its textures were transferred to the MSTAR database to construct a new dataset, referred to as Stylized-MSTAR Data. The results of image style transfer are shown in Fig. 4. It may be observed that after style transfer, the SAR images in MSTAR retain the original target contours while the original texture is replaced with texture features and color information consistent with the style image.

B. APPLICATION ANALYSIS OF STYLE TRANSFER
To compare the MSTAR database with the Stylized-MSTAR database, we evaluated them with the same general classification network. Under the same network structure and using the same batch size, loss function, and optimization function, the learning rate and its decay speed were adjusted for each database to allow the fairest comparison. See Table 1 for the database category distributions for Expt-1 and Expt-2.
To compare Expt-1 with Expt-2, we applied the AlexNet network to the two datasets, setting the same parameters, such as batch size, epochs, loss function, and optimization function, and adjusting the learning rate.
The accuracy on the verification set converged at approximately 15 epochs without obvious fluctuation. In terms of classification accuracy, as shown in Table 2, Expt-2 exhibited better performance than Expt-1, which shows that after image style transfer, the neural network exhibits an obvious bias toward contour features. By analyzing the comparison between Expt-1 and Expt-2, we found that when style-transferred SAR images were used as the training set, the contour features extracted for classification fluctuated less and classification accuracy steadily improved. According to Expt-3, we found that style transfer is an effective data augmentation method that can improve the performance of learning models without any architectural changes. Compared with Expt-4, the results of Expt-2 show that the network only exhibits a strong generalization ability for stylized images: although the features learned from stylized MSTAR images transfer to the MSTAR database, transferring the features learned from the MSTAR database to stylized MSTAR images was difficult.
For a better comparison between Expt-1 and Expt-2, the precision, recall, and other evaluation indices at the point of highest accuracy were recorded and compared, as shown in Table 3. The precision rate of Expt-2 shown in the table is much higher than that of Expt-1, indicating that the contour bias features extracted by the model are more stable, significantly reducing the overfitting phenomenon, and that the experimental method was more effective.
In the comparison between Expt-1 and Expt-2, by analyzing the training chart (Fig. 5), we used the same network (VGG11), batch size, loss function, and optimization function for both experiments and observed the trend of the resultant curves when adjusting the parameters to achieve their respective optimal results. Under the same number of epochs, the network convergence in Expt-2 was obviously faster than that in Expt-1, the fluctuation of the verification set in Expt-2 was smaller, and the network was more stable.

C. APPLICATION ANALYSIS OF PARAMETER TRANSFER
After obtaining the weights from MSTAR training, transfer learning was adopted: the parameters trained on the MSTAR database were transferred to the OpenSARShip 2.0 database, considering the small number of samples in the OpenSARShip 2.0 database and the uneven distribution among its classes. During transfer learning, either the top-level parameters can be fine-tuned, or all transferred weights can be taken as initialization parameters before training the entire network. In this study, we only fine-tuned the parameters of the top layer, because the MSTAR and OpenSARShip 2.0 databases comprise SAR images with similar bottom-level features, whereas the abstract features of the top layer differ; hence, fine-tuning the convolutional kernels of the top layer is sufficient. If the weights are used only as initialization parameters, retraining all parameters is relatively inefficient. The OpenSARShip 2.0 source database includes many images with very small sizes. When these samples are input into the network and expanded to the input image size of 224 × 224, the interpolated pixel content affects the extraction of image features. Therefore, before the experiment, we performed data cleaning and eliminated images with width or height less than 50 pixels. The final data distribution is shown in Table 4.
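The data-cleaning rule described above can be sketched as follows; the `keep_sample` helper and the example chip sizes are illustrative:

```python
def keep_sample(width, height, min_side=50):
    """Data-cleaning rule applied before the experiment: discard
    samples whose width or height is below 50 pixels, since
    upsampling such small chips to the 224x224 network input
    introduces too much interpolated pixel content."""
    return width >= min_side and height >= min_side

# Hypothetical (width, height) pairs of SAR image chips.
samples = [(32, 40), (64, 128), (50, 50), (120, 45)]
cleaned = [s for s in samples if keep_sample(*s)]
print(cleaned)  # [(64, 128), (50, 50)]
```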
The previously trained parameters were transferred to OpenSARShip 2.0, and the results after fine-tuning are shown in Table 5. The fine-tuning areas selected for different networks differed. For example, the LeNet network includes relatively few layers, so only the parameters of the fully connected layer were fine-tuned, whereas the Inception_v4 network does not include a fully connected layer, so the parameters of the last inception module were adjusted during fine-tuning. As shown in Table 5, parameter transfer learning was carried out on the OpenSARShip 2.0 dataset, and the parameters learned on the stylized-MSTAR dataset (Expt-2) based on style transfer generally yielded improved results compared with those obtained from parameter transfer in Expt-1, which also demonstrates that training on the stylized SAR images of tanks improved the generalization of the network. As shown in Table 5, we also introduced the public dataset FGSCR [33] for optical remote sensing image ship classification for comparison and found that transferring to OpenSARShip 2.0 after classification on the FGSCR database was not as effective as classifying on the MSTAR database and then transferring. The quality of the data in the FGSCR dataset is not as good as that of the MSTAR dataset, because more background interference is included and the data are unevenly distributed. In terms of data types, the FGSCR database is consistent with the objects described in the OpenSARShip 2.0 database, but the imaging mechanism is different; hence, there was no close correlation of features. In contrast, because the MSTAR and OpenSARShip 2.0 databases were collected with the same imaging mechanism, the same material typically exhibits the same reflection coefficient, yielding similar high-level semantic characteristics.

V. CONCLUSION
In this study, we have proposed a deep learning method to improve the accuracy of SAR image classification. Considering that noise in SAR images is difficult to remove without loss of image quality, we first proposed a method to obtain stylized SAR images by style transfer to construct a SAR dataset to be used as the training set for a deep learning network, so that the network learns the contour bias features of SAR images. The results of an experiment using the original SAR images as the verification set showed that contour bias features can significantly improve the accuracy with which SAR images can be classified and improve the generalization performance and stability of the model. To address the problem of limited amounts of images and uneven data distribution in available SAR image datasets, the proposed approach transfers the parameters obtained from learning stylized MSTAR SAR image models to the task of classifying the OpenSARShip 2.0 database and fine-tunes the top-level parameters of the network for different tasks to improve the classification performance of the model. The experimental results show that the accuracy with which SAR images can be classified can be improved by using image style transfer as a pretreatment of the training set and extracting contour bias features from SAR datasets, and that parameters trained on SAR datasets with larger data volumes and more uniform distributions can be transferred to learn to classify SAR data from small and uneven datasets. Moreover, the contour bias features obtained after style transfer and parameter transfer are more generalized and can improve classification accuracy.