Ship Velocity Estimation From Ship Wakes Detected Using Convolutional Neural Networks

Accurately tracking marine traffic for security and commercial purposes remains challenging despite its increasing global importance. Space-borne synthetic aperture radar (SAR) has recently been considered for accurately monitoring maritime traffic, and techniques to detect the position of ships and estimate their velocity have become essential. Here, we investigated the potential for automatic estimation of ship velocity using the azimuth offset between ships and wakes detected using a convolutional neural network (CNN) coupled with SAR imagery. The azimuth offset is proportional to the Doppler shift effect of the back-scattered signal in SAR; thus, it relates to the radial velocity of a moving target. Consequently, we propose a method whereby a CNN is applied to automatically detect ship wakes from TanDEM-X data. In this method, ship velocity is calculated using the azimuthal distance (i.e., azimuth offset) between the stern of the detected ship and the vertex of the detected V-shape wake, which is determined as the intersection of two lines obtained through edge filtering and Radon transforms. The location and number of detected ships are then compared with automatic identification system (AIS) data, and the calculated velocity of the ship is compared with the velocity obtained via along-track interferometry and AIS. Results show that our method automatically detects ships and wakes with accuracies of 91.0% and 93.2%, respectively, and estimates the velocity of ships with an accuracy of 0.13 m/s. This method is effective when wind velocities are not substantially higher than 5.5 m/s and ship velocities are not extremely low.


I. INTRODUCTION
The surveillance of maritime traffic is becoming an increasingly important security topic in certain parts of the world. In recent years, an automatic identification system (AIS) has been installed on every active ship, enabling maritime traffic information to be monitored in real time. However, the AIS occasionally loses functionality following maritime accidents, making it impossible to track the location of drifting ships. Illegal pair trawling by ships without an AIS is another significant issue. Therefore, research on additional methods of spatially monitoring ocean-going ships is of vital importance. Moreover, even though ships have AIS installed, the number of maritime accidents due to collision between ships is steadily increasing. Maritime traffic monitoring data, such as ship location and velocity, can help prevent maritime accidents. The space-borne synthetic aperture radar (SAR) system is the most effective tool for ocean applications and ship monitoring due to the all-weather, day/night applicability of its sensor [1], [2]. Various algorithms based on SAR imagery are currently under development for ship monitoring, such as ship detection and velocity estimation. Such algorithms include moving target indicator (MTI) algorithms, which detect a moving target, and wake detection algorithms, which detect the wake created by a moving ship.
Previous studies focused exclusively on ship detection (i.e., rather than wake detection) have used RADARSAT-1 images with a statistical approach considering wind speed, incidence angle, and resolution [3]. Moreover, ships have been detected using the thresholds of different intensity values from SeaSat and ERS-1 images [4]. Wavelet transform has also been applied to detect ships in SAR images [5]. However, most ship detection studies employ the adaptive threshold of SAR backscattering coefficient.
Frequency-based ship detection results have been compared for C-band Sentinel-1 data and X-band TerraSAR-X data [6]. Additionally, various studies on wake detection based on frequency have shown that ships and wakes are more easily imaged in X-band SAR data than C-band data [7], [8]. One study that utilized polarization showed that among HH, VV, and HV, HH polarization provides the best ship-sea contrast [9]. The performance of ship detection using polarimetry SAR data has also been compared with that obtained from single-channel SAR data [10].
Typically, ship velocity estimation methods in SAR single look complex (SLC) imagery rely almost completely on ship wake components [11]. Ship wakes appear as dark or bright straight lines in SAR images, and existing algorithms for wake detection exploit linear component detection algorithms such as Radon transforms [12]. Several researchers have developed wake detection techniques for SAR images based on Radon transforms [11]-[19]. In addition, localized Radon transforms have been applied for wake detection [13]-[15] and extraction of ship velocity information from analyzed wake components [11], [16], [17]. Courmontagne [19] developed an improved method for wake detection based on Radon transforms. Kuo and Chen [20] applied the wavelet correlator to detect wakes, and Jiaqiu et al. [20] proposed a wake constant false alarm rate detection algorithm based on the Hough transform. However, all of these studies only focused on a few cases using a manually selected subset or localized data from SAR imagery, and not the entire SAR imagery.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Several studies for wake detection using Radon transforms have measured ship velocity by utilizing the azimuth offset between the ships and wakes observed in SAR images [11], [16], [17], [22], [23]. However, wake-detection algorithms are relatively computation-intensive and time-consuming. Another disadvantage of this approach is that wakes are not clearly visible in images acquired at high incidence angles. Thus, it is difficult to estimate ship velocity with high accuracy based on a limited range of information such as wake patterns, directions, and sizes.
Recent studies have detected ships using deep learning techniques such as convolutional neural networks (CNNs) for image-based feature extraction [24]-[27]. Deep learning techniques calculate weighting factors by utilizing training samples in various environments and maritime weather conditions, thereby increasing the probability of ship detection. However, there have been no reported cases of wake detection employing deep learning. Most studies using deep learning techniques focus on methods for the detection of artificial targets such as ships, instead of wake components. This is because a moving ship creates strong backscattering that is independent of sea clutter, whereas motionless or slow-moving ships do not create wakes. Therefore, MTI algorithms that employ deep learning to detect ships and wakes are scarce. Under various maritime weather conditions, it is possible to detect wakes using deep learning based on CNNs with manually trained datasets. By utilizing deep learning for wake detection, ship velocity can be estimated from a single SAR image based on the azimuth offset between ships and wakes. Two bright or dark linear components are the signatures of a turbulent wake that is aligned with a ship's longitudinal axis.
Furthermore, one method of determining the radial velocity of a moving target employs the phases of a multichannel along-track interferometry (ATI) SAR system [28], [29]. Recent dual SAR systems, such as TerraSAR-X and TanDEM-X, are suitable for estimating the velocity of relatively slow-moving targets, such as sea-surface currents or ships, owing to long along-track baselines. However, the ATI phase is only available wrapped within 2π. The velocity obtained from the wrapped phase is therefore relative, and its ambiguity must be resolved to determine the absolute velocity. The most effective method for calculating absolute velocity from a single SAR image involves using the azimuth offset between a ship and its wake. When an SAR SLC image is processed from raw data, azimuth compression interprets the phase history based on the assumption of stationary targets. The relationship between Doppler frequency and the azimuth offset is linear, and thus, the phase record of a moving target is identical to that of a similar (albeit stationary) target located at an azimuthal distance. Thus, this method can resolve the ambiguous velocity of a ship using the accurate azimuthal distance between the ship and its wake.
In this article, we propose an algorithm for automatic ship velocity estimation based on a combination of CNN for wake detection as well as ship detection, Radon transforms, and the azimuth offset. In addition, we compare the proposed method with ATI and AIS data.

II. STUDY AREA AND DATA ACQUISITION
We selected the Korea Strait as the study area for estimating ship velocity using 3-multilook TanDEM-X SLC images in descending orbit (see Fig. 1). The Korea Strait is bounded by the southern coast of the Korean Peninsula and the southwestern coast of Japan. A branch of the Kuroshio Current also passes through the strait. The surveillance of maritime traffic is an extremely important security issue in this region.
Twelve scenes from bistatic TanDEM-X data with dates ranging from 2012/02/02 to 2013/03/15 were used for estimation (see Table I). All data were recorded in descending mode, VV polarization, and at an incidence angle of ∼21°. The average coherence in most images was below 0.4. We analyzed along-track and across-track baselines for all pairs. We also noted the wind directions and speeds at the time of TanDEM-X acquisition. The latter were relatively low and steady, ranging from 1.4-5.5 m/s, except on December 17, 2012 (7.9 m/s).
The baselines of TanDEM-X vary depending on latitude owing to helix orbit formation. Generally, a decrease in the along-track baseline improves the conditions necessary to obtain the ATI phase required to determine moving target velocity. To evaluate the accuracy of ship velocity determination, the ship direction and velocity from the AIS data (in situ data), acquired over the same period as the TanDEM-X data, were converted to the LOS direction.
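The conversion of an AIS speed and course to the LOS direction can be sketched as follows. This is a minimal illustration under stated assumptions rather than the processing used in this article: the function name, the angle conventions (degrees clockwise from north), and the sign convention (positive away from the sensor) are all hypothetical choices.

```python
import math

def ais_to_los_velocity(speed_ms, course_deg, look_azimuth_deg, incidence_deg):
    """Project an AIS speed over ground onto the radar line of sight (LOS).

    Assumptions (not taken from the article): course and look azimuth are
    given in degrees clockwise from north, the look azimuth points from the
    sensor toward the target along the ground, and a positive result means
    motion away from the sensor.
    """
    # Component of the ship velocity along the ground-range direction.
    ground_range = speed_ms * math.cos(math.radians(course_deg - look_azimuth_deg))
    # Projection from ground range onto the slant (LOS) direction.
    return ground_range * math.sin(math.radians(incidence_deg))

# Hypothetical ship sailing along the look direction at 10 m/s,
# observed at a 30 degree incidence angle.
v_los = ais_to_los_velocity(10.0, 90.0, 90.0, 30.0)
```

A ship moving parallel to the azimuth direction has no LOS component under this projection, which is why such ships show no azimuth offset.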

III. METHODS
A CNN algorithm that learned from manually selected ship and wake training samples was applied to the TanDEM-X SLC images. We automatically detected ships as well as wakes using the learned CNN algorithm. Using detected wake subset data, the precise direction and location of the intersection of linear wake components were calculated from the Radon transforms and edge filtering to calculate the azimuth offset at subpixel resolution. Then, the subpixel distance of the azimuth offset was converted to radial velocity. Furthermore, the ship velocity extracted from the ATI phase was generated using the TerraSAR-X and TanDEM-X pairs. Land masking was conducted by utilizing a Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) with a 30 m spatial resolution. The generated ATI phase was expressed as the wrapped velocity within the velocity range corresponding to the along-track baseline. Then, ship velocity was compared to in situ data, namely the projected radial velocity obtained from the AIS (see Fig. 2). The following are described in the ensuing sections: 1) the CNN algorithm; 2) azimuth offset from Radon transforms; 3) ATI algorithm.
TABLE I. TanDEM-X DATA ACQUISITION

A. Detection of Ships and Wakes Using Convolutional Neural Network
A CNN is a deep learning technique that is frequently used for image-based feature extraction or classification [30]. A CNN consists of a convolution layer, a pooling layer, and a fully connected layer (FCL). The kernel learned from the input data from the previous layer is moved to the next layer, and the convolution layer conducts convolution operations by considering image features and extracted features to produce a feature map. In the process, multiple convolution layers are produced to construct a deep network, and the dataset undergoes a pooling process to downsize image dimensions. Pooling is a subsampling process of pixel values that uses various dimension reduction methods because the values of adjacent pixels can be extremely similar. At this stage, max-pooling or average-pooling is applied.
With respect to scaled-up architectures based on CNNs, Inception v3 by Google is a representative model of image classification and object detection [31]. This model can detect, classify, and learn targets from images and is currently available to users as an open source platform. Although its architecture is highly complicated, it runs in TensorFlow via the Python programming language, and its results can be adapted to various objectives. This architecture emphasizes the importance of memory management and restrictions on computational capacity. This implies that Inception v3 could be more suitable for object detection (such as ships and wakes) in SAR imagery than other more computationally expensive architectures [32].
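The pooling step described above can be illustrated with a short NumPy sketch. It is a generic example of non-overlapping max-pooling and average-pooling, not the Inception v3 implementation; the function name `pool2d` and the default window size are assumptions made for illustration.

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Downsample a 2-D array with non-overlapping size x size windows."""
    h, w = image.shape
    # Trim rows and columns that do not fill a complete window.
    h_trim, w_trim = h - h % size, w - w % size
    blocks = image[:h_trim, :w_trim].reshape(
        h_trim // size, size, w_trim // size, size
    )
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

# A 4 x 4 test image is reduced to 2 x 2 by either pooling method.
image = np.arange(16).reshape(4, 4)
pooled_max = pool2d(image, mode="max")
pooled_avg = pool2d(image, mode="average")
```

Max-pooling keeps the strongest response in each window, which tends to preserve bright point-like targets, whereas average-pooling smooths them.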
In this article, radiometric and geometric calibrations were performed as SAR preprocessing to normalize all SAR images. Preprocessed TanDEM-X amplitude images were used as input data, and training samples of both ship and wake data were generated for learning purposes. Approximately 200 ship samples and 200 wake samples were used, although this number of training samples was insufficient. Because the total number of ships in the 12 acquired TanDEM-X SLC images was limited, augmentation (i.e., resizing, shifting, flipping, and rotating) was employed to compensate for the insufficient training samples. Since the size and shape of ships and wakes vary, the training samples were accordingly selected in various sizes and shapes. A data subset in GeoTIFF file format contained location and size data, and a text file containing x and y coordinates (i.e., pixel location), w (width), and h (height) was used as sample training data. These data were resized to 299 × 299 and used as input to the convolutional layer.
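The augmentation operations mentioned above (shifting, flipping, and rotating) can be sketched with NumPy as follows. This is an illustrative example only: the function name `augment`, the use of a circular shift, and the restriction to 90 degree rotations are assumptions, and resizing is omitted because it would require an interpolation library.

```python
import numpy as np

def augment(chip, shift=(0, 0)):
    """Return simple augmented variants of a 2-D amplitude image chip."""
    variants = [chip, np.fliplr(chip), np.flipud(chip)]
    # Rotations by 90, 180, and 270 degrees (counterclockwise).
    variants += [np.rot90(chip, k) for k in (1, 2, 3)]
    # Circular shift by (rows, columns); a hypothetical stand-in for
    # re-cropping the chip at a slightly different position.
    variants.append(np.roll(chip, shift, axis=(0, 1)))
    return variants

# Hypothetical 2 x 3 chip shifted by one column.
chip = np.arange(6).reshape(2, 3)
samples = augment(chip, shift=(0, 1))
```

In a detection setting the bounding box (x, y, w, h) of each sample would have to be transformed together with the pixels, which this sketch does not do.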
The input data were applied to the Google Inception v3 model (see Table II), which is based on the Inception module, to calculate the output data. The general Inception module consists of four operations (1 × 1 convolution, 3 × 3 convolution after 1 × 1 convolution, 5 × 5 convolution after 1 × 1 convolution, and 1 × 1 convolution after 3 × 3 max-pooling) to decrease computational complexity and improve calculation precision. The 1 × 1 convolutional layers were used immediately before a larger kernel convolutional filter to decrease the number of parameters to be determined during the pooling feature process. Fully connected layers utilized the nodes in the previous average pooling layer [31], [33]. In the Google Inception v3 model, the Inception module is modified into Inception-A, where each 5 × 5 convolution is replaced by two 3 × 3 convolutions, Inception-B, which applies the factorization of n × n convolutions, and Inception-C, with expanded high dimensional representations [31]. The objective of factorizing convolutions is to reduce the number of connections or parameters without decreasing the network efficiency. An auxiliary classifier and a grid size reduction process are also employed in the Inception v3 architecture (see Fig. 3 and Table III). We attempted to apply the Google Inception v3 model directly to object detection in SAR SLC images without any modification of the architecture. Finally, ships and wakes were detected from each of the 12 SAR scenes based on the data learned from the training samples. The various sizes of detected ships and wakes were obtained as output data. The output data were stored as a text file containing the position and size of each ship or wake.
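The saving obtained by replacing one 5 × 5 convolution with two stacked 3 × 3 convolutions can be checked with simple arithmetic. The sketch below counts weights only (biases ignored) and assumes, for illustration, that the number of channels is the same at the input and output of every layer; the function names and the channel count of 64 are hypothetical.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels

def compare_5x5_vs_two_3x3(channels):
    """Compare one 5 x 5 layer with two stacked 3 x 3 layers."""
    single = conv_weights(5, 5, channels, channels)
    stacked = 2 * conv_weights(3, 3, channels, channels)
    saving = 1 - stacked / single
    return single, stacked, saving

single, stacked, saving = compare_5x5_vs_two_3x3(64)
```

Two 3 × 3 layers cover the same 5 × 5 receptive field with 18 instead of 25 weights per channel pair, a reduction of 28% regardless of the channel count.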

B. Estimation of Ship Velocity Using Azimuth Offset From Radon Transforms
Using the subset data of wakes from the CNN, edge filtering and Radon transforms were applied to determine the accurate wake location. This technique conducts line integrals on two-dimensional (2-D) images and accumulates points integrated on the Radon transform space. Linear components were detected using the parameters (θ, x') derived from the peaks in the Radon domain [4], [5], [34]. This is an effective method for highlighting and detecting linear features such as ship wakes. It is also extremely effective for feature extraction from images in which relevant features cannot be easily distinguished from clutter owing to the unique integral function. The Radon transform conversion equation for f(x, y) of a 2-D image is given as follows:
$$R(\theta, x') = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,\delta(x\cos\theta + y\sin\theta - x')\,dx\,dy$$
where δ denotes the Dirac delta function. Linear components were defined by locating two peaks in the Radon transform domain. It was possible to define the linear component generated by the wake in the SAR amplitude image using the minimum distance and rotation angle. The intersection point of the V-wake pattern was estimated at subpixel resolution using the distance x' from the starting point of the image and the direction θ. Then, the minimum distance was defined as the azimuth offset. The azimuth offset was determined from the intersection of the two linear components generated by the wake and the subpixel location corresponding to the latitude and longitude of the ship. Then, it was possible to estimate the subpixel distance along the azimuth direction between the intersection point of the wake and the end point of the ship. Fig. 4 shows the processing workflow of edge filtering and the Radon transform.
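Once two peaks (θ, x') have been located in the Radon domain, the wake vertex follows from the intersection of the two corresponding lines. The sketch below assumes each line is written in normal form, x cos θ + y sin θ = x', in a common image coordinate system; the function names, the coordinate convention, and the example values are illustrative assumptions, and a particular Radon implementation may define its origin and angles differently.

```python
import numpy as np

def intersect_lines(theta1_deg, rho1, theta2_deg, rho2):
    """Intersect two lines given in normal form x*cos(t) + y*sin(t) = rho."""
    t1, t2 = np.radians([theta1_deg, theta2_deg])
    a = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    if abs(np.linalg.det(a)) < 1e-12:
        raise ValueError("lines are parallel; no unique intersection")
    # Solve the 2 x 2 linear system for the (x, y) intersection point.
    return np.linalg.solve(a, np.array([rho1, rho2]))

def azimuth_offset_m(vertex_azimuth_px, stern_azimuth_px, azimuth_spacing_m):
    """Azimuth distance in meters between the wake vertex and the ship stern."""
    return (vertex_azimuth_px - stern_azimuth_px) * azimuth_spacing_m

# Hypothetical peaks: a vertical line x = 3 and a horizontal line y = 4.
vertex = intersect_lines(0.0, 3.0, 90.0, 4.0)
# Hypothetical subpixel azimuth positions and a 2 m azimuth pixel spacing.
offset = azimuth_offset_m(120.25, 100.0, 2.0)
```

Because the intersection is solved analytically, the vertex position is not restricted to integer pixel coordinates, which is what makes a subpixel azimuth offset possible.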
According to the equations given in [35] and [36], ship velocity can be calculated from the satellite velocity V_sat, the azimuth offset A_offset, the local incidence angle θ_in, and the slant range distance R_slant. The velocity of a ship can thus be estimated by measuring the distance of the azimuth offset.
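A commonly used form of the relation between azimuth offset and target velocity can be sketched as follows. This is an illustration under assumptions, not a reproduction of the equations in [35] and [36]: it uses the standard first-order approximation in which the azimuth displacement equals the slant range multiplied by the ratio of the radial velocity to the platform velocity, and the sign convention, the function names, and the example values are hypothetical.

```python
import math

def radial_velocity_from_offset(azimuth_offset_m, sat_velocity_ms, slant_range_m):
    """Radial (LOS) velocity from the azimuth offset.

    Assumed relation: offset = -R_slant * v_radial / V_sat, so that a
    positive offset corresponds to a negative radial velocity.
    """
    return -azimuth_offset_m * sat_velocity_ms / slant_range_m

def ground_range_velocity(radial_velocity_ms, incidence_deg):
    """Project a radial velocity onto the ground-range direction."""
    return radial_velocity_ms / math.sin(math.radians(incidence_deg))

# Hypothetical values: 100 m offset, 7600 m/s platform velocity,
# 760 km slant range, 30 degree incidence angle.
v_radial = radial_velocity_from_offset(100.0, 7600.0, 760000.0)
v_ground = ground_range_velocity(v_radial, 30.0)
```

With these placeholder numbers, a 100 m azimuth offset corresponds to a radial velocity of 1 m/s in magnitude, which shows why subpixel accuracy in the offset matters for slow targets.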

C. Estimation of Ship Velocity Using Along-Track Interferometry (ATI) SAR
ATI techniques are based on the simultaneous acquisition of two SAR images from separate antennas. The ATI phase is proportional to the Doppler shift effect of a backscattered signal, and thus, it is related to the radial velocity of a moving target. Following the initial publication of this technique, ATI-SAR was demonstrated with SRTM [37]. The TanDEM-X mission was launched in 2010, after which several studies applied interferometry to the data obtained from the mission. The primary goal of the mission was acquisition of a high-precision DEM from across-track interferometry (XTI). The two satellites of the formation, i.e., TerraSAR-X and TanDEM-X, were flown in a helix orbit, which allowed for flexible XTI/ATI baselines. Thus, the sensitivities to elevation changes and moving targets were selected over a wide range [38], [39]. The TanDEM-X mission provided extremely sensitive measurements of moving targets, such as ships and tidal currents, owing to the relatively longer along-track baseline between TerraSAR-X and TanDEM-X. This is advantageous over other methods owing to its high sensitivity to moving targets and flexible along-track baseline.
In this article, the ship velocity on the sea surface was estimated using this technique by converting the interferometry phase of ATI-SAR to velocity after coregistration, resampling, removing the flat earth phase, phase unwrapping, and absolute phase calibration. However, the baseline of TanDEM-X data has both along-track and across-track components, and the information in the across-track direction must be eliminated to obtain an appropriate interpretation. This should be considered for the acquisition of moving target velocity from TanDEM-X data [40]. The contributions of the along-track and across-track components to the interferometry phase are described by the following quantities: V_ATI and H_XTI denote the moving target velocity and elevation change, respectively, V_sat denotes the satellite velocity, φ_ATI denotes the phase of ATI-SAR, φ_XTI denotes the phase of XTI-SAR, B_ATI denotes the along-track baseline, B_XTI denotes the across-track baseline, and θ_I denotes the angle of incidence. The extracted velocity of ATI-SAR should be projected in the LOS direction. The wrapped velocity of ships acquired using ATI is described by the following quantities: V_ATI represents the moving target velocity obtained using ATI, φ_ATI denotes the ship velocity acquired from the ATI phase, N_amb denotes the ambiguity velocity, V_ran denotes the velocity range based on the ATI baseline of TanDEM-X, and error denotes the error according to the XTI phase. Additionally, the fixed land phase was eliminated from the image to obtain the relative velocity, as compensation for the topography.
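The way an independent velocity estimate can resolve the ambiguity of the wrapped ATI velocity may be sketched as follows. The sketch assumes that the absolute velocity equals the wrapped velocity plus an integer multiple of the velocity range, and it ignores the XTI-related error term; the function name and all numerical values are hypothetical and are not taken from the results of this article.

```python
def resolve_ati_ambiguity(wrapped_velocity_ms, velocity_range_ms, reference_velocity_ms):
    """Pick the integer ambiguity that brings the wrapped ATI velocity
    closest to an independent reference (e.g., an azimuth-offset estimate).

    Assumed model: absolute = wrapped + n_amb * velocity_range.
    """
    n_amb = round((reference_velocity_ms - wrapped_velocity_ms) / velocity_range_ms)
    return wrapped_velocity_ms + n_amb * velocity_range_ms, n_amb

# Hypothetical case: wrapped velocity of 1.3 m/s, velocity range of 5 m/s,
# and a coarse reference estimate of -8.5 m/s.
absolute_velocity, n_amb = resolve_ati_ambiguity(1.3, 5.0, -8.5)
```

The reference estimate only needs to be accurate to within half of the velocity range for the integer ambiguity to be selected correctly.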

IV. RESULTS
In this article, we automatically detected ships and wakes from SAR images using deep learning based on the CNN technique. By automatically selecting a wake's subset data, the linear component of the wake was used to identify an accurate azimuth offset between ships and wakes using Radon transforms and edge filtering. Finally, we compared ship and wake detection rates with AIS data and validated the estimated ship velocity using ATI and AIS data.
The CNN-processed TanDEM-X scenes of the Korea Strait acquired on March 28, 2012 and their respective ship and wake detection results are shown in Fig. 5. The detection performance was sufficient to clearly distinguish between ships and wakes. In a quantitative accuracy test, the results of the 12 TanDEM-X scenes were compared with the AIS data acquired in the same period as the SAR images (see Table IV). Compared to the AIS data, ship and wake detection accuracies were 91.0% and 93.2%, respectively.
Even though the automatic detection rate was high in general, the data collected on December 17, 2012 produced considerably lower ship and wake detection rates of 68.4% and 60.0%, respectively. This was because of weather conditions. Wind speed was higher than usual (specifically, 7.9 m/s), and thus, the backscattering coefficient increased owing to the rough ocean surface.
The result of calculating the azimuth offset between a detected ship and its wake in an SAR image is shown in Fig. 6. The velocity of ship #3 obtained from the azimuth offset was -7.79 m/s. Additionally, Fig. 7 shows the TerraSAR-X amplitude image, the coherence of the TanDEM-X pair, the wrapped phase, and the ship velocity obtained using the ATI phase from the TanDEM-X pair. The velocity of ship #3 acquired from the ATI phase was -8.69 m/s. Table V compares the ship velocity obtained from the azimuth offset, ATI, and AIS for five ships in a SAR SLC image (see Fig. 5). The mean difference between the azimuth offset and AIS was less than 0.13 m/s. Even though the number of comparison points was limited, an R² of 0.99 and a root mean square error (RMSE) of 0.16 m/s demonstrate that the two were strongly correlated. The comparison of the ship velocity calculated from the ATI phase and AIS data produced an R² of 0.98 and an RMSE of 0.55 m/s.
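The two agreement measures used above can be computed as in the following sketch. It assumes that R² is taken as the squared Pearson correlation coefficient, which is one common definition and may differ from the one used to produce Table V; the function names and the example velocities are placeholders, not values from this article.

```python
import numpy as np

def rmse(estimated, reference):
    """Root mean square error between two velocity series."""
    e = np.asarray(estimated, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((e - r) ** 2)))

def r_squared(estimated, reference):
    """Squared Pearson correlation coefficient between two series."""
    return float(np.corrcoef(estimated, reference)[0, 1] ** 2)

# Placeholder velocities in m/s for illustration only.
estimated = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 2.0, 3.0, 5.0]
error = rmse(estimated, reference)
correlation = r_squared(estimated, reference)
```

Note that a high R² only indicates a linear relationship, so the RMSE is needed alongside it to show that the two velocity estimates also agree in magnitude.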

V. DISCUSSION
Deep learning has produced state-of-the-art results in numerous computer vision and speech recognition tasks. It is a useful tool for obtaining results with high accuracy when a large number of training samples is available. Unfortunately, only a limited number of SAR images is available, which limits the training samples for ships and wakes. In fact, the AIS data contain only 189 ships in the 12 SAR images. Thus, it is impossible to select training samples on the massive scale that is possible in the field of computer vision. A limited number of training samples is not sufficient to accurately detect a vague target. However, if a feature is distinct owing to strong backscatter, such as a ship, it can be accurately detected with only a limited number of training samples. In fact, in a recent study, Wang et al. applied the Google Inception v3 model to COSMO-SkyMed data using a small number of training samples [41].
We evaluated the accuracy obtained with the CNN algorithm by comparing the results with those obtained from a conventional method of ship detection using a K-distribution for the image probability density function (PDF) [1]. Both methods were applied to the same SAR data. Preprocessing of the SAR data was carried out, and ships were detected using the K-distribution technique and target grouping. The ship detection rate using the K-distribution technique was found to be 89%, which is similar to the ship detection rate of 91% obtained using the CNN algorithm in this article (see Table VI). However, a few cases of false alarm and misdetection occurred, and wake detection was almost impossible using only the conventional method. Thus, deep learning techniques such as CNN are essential to detect wakes based on wake features.
Even though the ship and wake detection performance using the CNN was excellent, the number of detected wakes was relatively small (see Table IV) because the wakes of slow-moving ships do not always appear clearly on SAR images. Furthermore, very small ships (below 15 m in this article) or ships with weak SAR backscattering were difficult to detect with the CNN. In addition, strong SAR backscattering from an object that was not a ship produced a false alarm (see Fig. 8).
The performance of wake detection mainly depends on ship velocity. For example, the bottommost ship in Fig. 8 was detected via AIS and CNN, but no wake occurred because the velocity of the ship was too low. The velocity of a ship moving at less than approximately 1 m/s cannot be estimated because such a ship does not generate a detectable wake. In other words, the proposed method cannot be applied if the ship velocity is too low to generate a wake, if the wake is insufficiently visible, or if there is interaction among wakes. However, most ships in offshore areas are less prone to wake interaction as they are operated at a sufficient distance to avoid collision. Therefore, this approach can be applied to offshore areas because it can detect ships moving at relatively high velocity and their wakes.
Wake visibility in SAR images depends on several parameters related to radar backscatter characteristics, ship motion, and weather conditions [17]. The strong backscatter of ships can cause ghosts in SAR images and Doppler ambiguity problems beyond the pulse repetition frequency. This may introduce errors in the estimation of ship velocity because strong backscatter may appear between ships and wakes. A turbulent wake is observed as a dark line surrounded by two bright lines, which represent the V-shape wakes due to radar backscatter [17]. Marine environment conditions may prevent turbulent wakes from appearing clearly, even for ships moving with high velocity. In addition, when a ship is turning, its V-shape wake appears as a curve rather than a line, which results in an error when automatically estimating ship velocity.
We applied the proposed method to SAR imagery of an offshore area without complicating features such as crowded harbors, rainy conditions, or oil spills. However, the 12 SAR images contained variable combinations of ship and wake interactions. The proposed method cannot be applied in cases with interference caused by linear components of different wakes. Hence, it is limited to detecting a single ship's wake without interaction with another wake. However, as most ships have AIS and/or radar systems, their movements are planned to prevent close interaction with other ships. Therefore, we expect the possibility of wake interaction to be quite low in most offshore settings (unlike crowded ports, where there are numerous ships). Given this focus on offshore areas, the proposed automatic velocity extraction method requires that ship velocities be sufficiently high in such areas for wakes to be visible and detectable. In addition, we found that ships and wakes could be efficiently detected in areas where an image appears dark. Thus, the method may be applicable to areas with look-alike oil slicks. This could be tested by applying further training samples in future research.
In this article, an error also occurred in the velocity estimation from the ATI phase because the TerraSAR-X and TanDEM-X pairs included both along-track and across-track baselines. While calculating sea surface current, across-track effects are negligible because the ocean surface is flat and wave height-induced phase variations are low [40]. However, tall ships may induce errors in the phase because the XTI baseline is still present and residual XTI components therefore remain. If the XTI phase of a ship can be estimated by a model that calculates ship heights, then it is possible to estimate ship velocity with increased accuracy by utilizing TanDEM-X ATI data.

VI. CONCLUSION
To date, no research has specifically focused on the development of wake detection technologies via deep learning. In this article, we automatically detected wakes as well as ships from SAR images using deep learning based on CNN. By automatically detecting a wake's subset data, the linear component of the wake was used to identify an accurate reference point between the ship and its wake using Radon transforms and edge filtering. Despite the limited sample set, the ship velocity determined using the subpixel azimuth offset was strongly correlated with the ship velocity calculated from AIS data (R² of 0.99 and RMSE of 0.16 m/s). Furthermore, the velocity calculated from the ATI phase using TanDEM-X was compared to the velocity obtained from AIS (R² of 0.98 and RMSE of 0.55 m/s). Despite the small sample set, the velocity estimated from the azimuth offset was slightly more accurate than that obtained from the ATI phase. Finally, wake detection using deep learning combined with ship velocity calculation using azimuth offsets between ships and wakes can lead to effective estimation of ship velocity. The proposed method is effective in low wind conditions in the open sea (i.e., without ports).

ACKNOWLEDGMENT
The TanDEM-X data used in this article were kindly provided by DLR as a part of ATI_OCEA0391.