Ship Navigation Behavior Prediction based on AIS Data

Real-time, accurate prediction of ship navigation dynamics can improve the intelligence and precision of maritime supervision and help ensure navigation safety. To further improve prediction accuracy, this paper uses ship AIS data as the data source and proposes an improved LSTM navigation dynamics prediction model based on the attention mechanism. First, a set of preprocessing steps, including navigation data extraction, abnormal data processing, and missing data interpolation, is proposed to address information loss and inaccuracy in AIS data and the incomplete retention of dynamic navigation features. Second, the dynamic characteristics of ship navigation in AIS data are combined with the time series, and longitude, latitude, course over ground, speed over ground, heading, and time increment are used as inputs to establish an LSTM-based navigation dynamics prediction model. Third, to address the sequence coding distortion and spatiotemporal data incoherence of the existing model, an optimized Attention-LSTM neural network prediction method is proposed, and the accuracy and robustness of the model are verified by simulation analysis. The results show that this method achieves high-precision prediction of the ship's longitude, latitude, heading, and speed.


I. INTRODUCTION
With the development of marine traffic monitoring technology and equipment, most ships are equipped with an Automatic Identification System (AIS) to record their static, dynamic, and voyage information. This information can be applied to traffic flow analysis, detection of abnormal ship behaviour, and recognition of ship motion patterns. When a ship is sailing at sea, especially in a sea area with high traffic density, dense obstacles, and a complex and changeable navigation environment, the risk of marine traffic accidents increases. Therefore, predicting and analyzing a ship's navigation behaviour from its AIS data not only provides necessary technical support for the early warning of maritime traffic accidents but is also of great significance for improving the efficiency of ship monitoring and preventing navigation accidents.
Due to the limitations of marine equipment and the operating environment, research on ship navigation dynamics prediction is still at an exploratory stage. AIS data are subject to discontinuity and inaccuracy after multiple processing steps. To solve this problem, researchers [1]-[6] designed a method to recover vessel trajectories in inland waterways based on AIS data and established three rules to identify inaccurate data and clean AIS data. Using these three rules, incorrect data could be effectively identified and removed. With the rapid development of data mining and AI technologies, deep learning-based prediction methods are attracting increasing attention. The long short-term memory (LSTM) and BP prediction models are two typical prediction methods that characterize the navigational behavior of a ship [7]-[13]. In addition, to address the lack of coordination among multidimensional spatiotemporal data, Nguyen et al. [14] proposed an adaptive density-based AIS trajectory clustering method and achieved the classification prediction of ship trajectories by using an improved multiclassification logistic regression algorithm. Considering the time-series nature of the navigation prediction problem, Yang Bochen et al. [15]-[18] established a ship track prediction model based on an LSTM neural network combined with a discrete wavelet transform, which was optimized in parallel using a Spark distributed architecture.
To improve the quality of vessel trajectory records from AIS networks, Ryan Wen Liu [19], [20] proposes an AIS data-driven trajectory prediction framework. A model of the vessel traffic conflict situation, generated using the dynamic AIS data and the social force concept, is embedded into the LSTM network to guarantee high-accuracy vessel trajectory prediction; both quantitative and qualitative experiments on realistic vessel trajectories have demonstrated that this method achieves satisfactory prediction performance in terms of accuracy and robustness. Notably, the existing research on navigation dynamics prediction is mainly based on ship kinematic equations and traditional artificial neural network models [21], which are often ineffective because of the uncertainty of ship navigation, the variation in operational parameters, and the complexity of the sea environment [22], [23].
To further improve the accuracy of ship navigation prediction, and considering the spatiotemporal characteristics of the navigation prediction problem, this paper combines the dynamic ship characteristics in the AIS data with the time series and proposes a navigation dynamics prediction method based on an LSTM network. At the same time, to address the sequence coding distortion and spatiotemporal data incoordination of the model, an attention mechanism is introduced into the LSTM: the attention function highlights the features in the AIS sequence that critically influence the navigation prediction results. On this basis, a navigation dynamics prediction model based on the Attention-LSTM neural network is established, further improving the prediction accuracy. In this paper, the prediction of the ship's longitude, latitude, heading, and speed is realized, and the accuracy and robustness of the model are verified by simulation analysis.

II. AIS DATA ACQUISITION AND PREPROCESSING
With the rapid development of maritime communication technology, ship AIS data have undergone explosive growth [24], which provides a valuable data basis for ship navigation dynamics prediction research. However, due to factors such as maritime equipment and the ship operating environment, AIS data often contain many errors and missing values after acquisition, encapsulation, transmission, reception, and decoding. Thus, these data are difficult to analyze and utilize directly [25]. In this context, AIS data must be checked and repaired, and the problematic data must be cleaned while retaining the original navigation characteristic attributes to satisfy the data mining needs of subsequent studies.

A. DATA ACQUISITION
The data in this paper are obtained from the NOAA public dataset, which contains shore-based AIS messages for a sea area in 2017: 122,801,856 AIS message reports collected from 31,164 vessels at a collection frequency of approximately once per minute [26]. To ensure data quality, AIS messages from commercial vessels with nonconfrontational behavior in the range of 03v-06v of the General Mercator projection from January to March are selected as experimental data, and the specific field information is summarized in Table 1.

Table 1. AIS field information.

Item            Category  Remarks
MMSI            static    string
Base-Date-Time  dynamic   long
LAT             dynamic   double
LON             dynamic   double
SOG             dynamic   float
COG             dynamic   float
Heading         dynamic   float
Vessel-Name     static    string
IMO             static    string
Call-Sign       static    string
Vessel-Type     static    string
Status          dynamic   integer
Length          static    string
Width           static    string
Draft           dynamic   double
Cargo           dynamic   double

B. DATA PREPROCESSING
The AIS information fields involved in ship navigation dynamics prediction mainly include the Maritime Mobile Service Identity (MMSI), Coordinated Universal Time, longitude, latitude, course over ground (COG), speed over ground (SOG), and heading. Factors such as voyage spacing, operation time, and system error may lead to too few AIS voyage information points being recorded for a vessel, in which case the data cannot provide a substantial reference for voyage prediction. To solve this problem, the dataset is traversed by MMSI number to identify ships with excessively few voyage points. Additionally, to avoid long time differences between consecutive records caused by AIS start/stop, a time threshold of 24 h is set.

Due to the limitations of equipment, technology, and other factors, a certain amount of voyage data may be missing. As an example, the missing data for one ship on one sailing date are shown in Table 2. To address the problem of missing data in the sequence, this paper uses improved Lagrange interpolation to fill the gaps in the navigational sequences of the AIS data.
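The extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are taken to be `(mmsi, timestamp_seconds, payload)` tuples, a gap longer than 24 h is assumed to start a new voyage segment, and segments below a minimum length are assumed to be dropped. The function name `split_voyages` and the value of `MIN_POINTS` are hypothetical, since the paper does not state its threshold for "excessively few" points.

```python
from collections import defaultdict

MAX_GAP_SECONDS = 24 * 3600  # 24 h time threshold stated in the text
MIN_POINTS = 3               # hypothetical value; not given in the paper


def split_voyages(records, max_gap=MAX_GAP_SECONDS, min_points=MIN_POINTS):
    """Group (mmsi, timestamp_seconds, payload) records into voyage segments.

    A new segment starts whenever two consecutive reports of the same vessel
    are more than `max_gap` seconds apart. Segments with fewer than
    `min_points` reports are discarded.
    """
    by_vessel = defaultdict(list)
    for mmsi, ts, payload in records:
        by_vessel[mmsi].append((ts, payload))

    segments = []
    for mmsi, points in by_vessel.items():
        points.sort(key=lambda p: p[0])  # order each vessel's reports by time
        current = [points[0]]
        for prev, cur in zip(points, points[1:]):
            if cur[0] - prev[0] > max_gap:
                # Gap exceeds the threshold: close the current segment.
                if len(current) >= min_points:
                    segments.append((mmsi, current))
                current = []
            current.append(cur)
        if len(current) >= min_points:
            segments.append((mmsi, current))
    return segments
```

Each returned segment is a `(mmsi, [(timestamp, payload), ...])` pair that can then be passed to the cleaning and interpolation steps.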
The essence of Lagrange interpolation is the use of a polynomial to fit the given points. Suppose that $(n + 1)$ points are given: $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$. Here, $x_i$ corresponds to the Base-Date-Time in the ship navigation data, and $y_i$ represents the corresponding value of LAT, LON, SOG, COG, or Heading. The Lagrange interpolation polynomial is

$$L(x) = \sum_{i=0}^{n} y_i \, \ell_i(x) \tag{1}$$

where $\ell_i(x)$ is the Lagrange fundamental polynomial, defined as

$$\ell_i(x) = \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j} \tag{2}$$

In this case, with the polynomial $\ell(x) = \prod_{j=0}^{n} (x - x_j)$, Eq. (2) can be rewritten as

$$\ell_i(x) = \frac{\ell(x)}{x - x_i} \cdot \frac{1}{\prod_{j=0,\, j \neq i}^{n} (x_i - x_j)} \tag{3}$$

Defining $\phi_i = 1 / \prod_{j=0,\, j \neq i}^{n} (x_i - x_j)$ as the barycentric (center of gravity) weight, the Lagrange interpolation polynomial can be modified as

$$L(x) = \ell(x) \sum_{i=0}^{n} \frac{\phi_i}{x - x_i} \, y_i \tag{4}$$

After traversing the voyage data sequence of a single ship, the missing value for any vacancy in the AIS data can be found by substituting the corresponding Base-Date-Time.
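As a minimal sketch of the barycentric (improved) Lagrange interpolation described above, the following Python code evaluates the interpolant at a query time from known samples. The function names are illustrative, and the choice of which neighbouring points to pass in is an assumption left to the caller; the paper does not say how many points it uses per gap.

```python
def barycentric_weights(xs):
    """Return phi_i = 1 / prod_{j != i} (x_i - x_j) for distinct nodes xs."""
    weights = []
    for i, xi in enumerate(xs):
        prod = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                prod *= (xi - xj)
        weights.append(1.0 / prod)
    return weights


def lagrange_interpolate(xs, ys, x):
    """Evaluate L(x) = l(x) * sum_i phi_i / (x - x_i) * y_i."""
    # At a node the formula would divide by zero, so return the sample itself.
    for xi, yi in zip(xs, ys):
        if x == xi:
            return yi
    phis = barycentric_weights(xs)
    ell = 1.0
    for xi in xs:
        ell *= (x - xi)
    return ell * sum(phi / (x - xi) * yi for phi, xi, yi in zip(phis, xs, ys))
```

For example, with timestamps expressed in minutes, `lagrange_interpolate([0, 1, 2, 4], sog_values, 3)` would estimate the speed at the missing minute 3. Using only a few neighbouring points per gap is a common precaution, because high-degree polynomials can oscillate between nodes.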

C. VALIDATION EXPERIMENTS
After the abovementioned process, 12,454 valid ship voyage subsequence segments are output from the AIS dataset, with each subsequence segment containing 500 voyage information reports. Specifically, a total of 20,000 ship voyage subsequences were obtained during voyage data extraction; during abnormal data processing, 58.16% of erroneous values were removed, 2.10% of duplicate values were removed, and 6.43% of outliers were corrected; during missing data interpolation, 6.15% of the missing values were filled. To analyze the preprocessing effect on the AIS data more accurately, this paper visualizes and compares two aspects: ship voyage track information and ship voyage behavior information.

1) Vessel trajectory information
Figs. 1-4 show the visualization and comparison of the preprocessing effect of the ship navigation trajectory in certain sea areas in the AIS dataset. The quality of the processed AIS data is enhanced. As shown in Fig. 1, the density of ship trajectory data within the sea area is considerably decreased, the abnormal aggregated trajectories and repeated trajectories are effectively cleaned, and the orderliness of the trajectories is considerably enhanced. As shown in Fig. 2, a large number of intermittent and fragmented trajectories are corrected and eliminated. As shown in Fig. 3, unconventional and erroneous trajectories are cleaned, and instances in which ship trajectories cross the land or are beyond the study area are eliminated. As shown in Fig. 4, the empty sections in the trajectory are filled, and the continuity of the trajectory is enhanced.

2) Vessel behavior information
For a more intuitive demonstration, the sailing data of a ship (MMSI=367135370) from 0:00 to 8:00 on January 1, 2017, are taken as an example, and the preprocessing effects on the sailing behavior information are illustrated in Figs. 5-7. After cleaning, the accuracy and continuity of the three kinds of sailing behavior information are enhanced, and the data are closer to the real sailing dynamics of the ship.
The visualization and comparison results show that the AIS data preprocessing method effectively solves problems such as vacant values within the information while retaining the original ship navigation dynamic characteristics.

III. LSTM-BASED SHIP NAVIGATION DYNAMICS PREDICTION

A. LSTM NEURAL NETWORK
The LSTM network was proposed by Schmidhuber et al. [25] as an optimized variant of the recurrent neural network (RNN); it effectively alleviates the gradient explosion and vanishing problems caused by the structural constraints of the RNN [27]. In the LSTM, a memory unit and a gate mechanism are added to the original RNN, and the original hidden nodes are designed as a self-loop so that the memory unit maintains an error flow. In this manner, gradients can propagate effectively over long time spans, and the self-loop weights can be updated iteratively. The LSTM neural network is arranged in a typical chain pattern, and the unit structure is shown in Fig. 8.
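For reference, the memory unit and gate mechanism are commonly written in the following standard form. The symbols here ($W$, $b$, $\sigma$, $\odot$, and the gate names) follow the usual textbook convention and are assumptions; they are not taken from the paper's own notation or from Fig. 8.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive update of $c_t$ is the self-loop that lets the error signal flow across many time steps without repeatedly passing through a squashing nonlinearity.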
The solid and dashed line connections represent information transfer at the current and previous moments, respectively. The input gate controls the amount of information updated by the memory cell; the forget gate controls the amount of information retained by the memory cell from the previous moment; the output gate controls the amount of information output to the next hidden state; and the cell implements the storage and deletion of information through the different gates [28].

[Figure panels: (a) Before pretreatment; (b) After pretreatment.]

The ship navigation dynamic characterization data are derived from AIS information, including the longitude, latitude, course over ground, speed over ground, heading, and time increment. Therefore, for a single ship, the navigational dynamic characteristics $Y(t)$ at time $t$ can be expressed as

$$Y(t) = \{\mathrm{LON}_t, \mathrm{LAT}_t, \mathrm{COG}_t, \mathrm{SOG}_t, \mathrm{HED}_t, \Delta t\} \tag{5}$$

The ship navigation dynamics information $Y(t-n+1), \cdots, Y(t-2), Y(t-1), Y(t)$ of $n$ consecutive moments is input to the LSTM model, and the ship navigation dynamics characterization data $Y'(t+1)$ of the future instant are used as the model output. In this case, the ship navigation dynamics prediction model can be defined as

$$Y'(t+1) = f\big(Y(t-n+1), \cdots, Y(t-2), Y(t-1), Y(t)\big) \tag{6}$$

The LSTM prediction model involves many critical parameters that need to be tuned to better fit the specific problem. In this paper, the numbers of neuron nodes in the input and output layers are directly related to the navigation dynamics prediction problem and are determined by the sequence of navigation dynamics representations, $Y(t)$.
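The mapping from n consecutive observations to the next one can be illustrated with a short sketch that turns one voyage sequence into supervised samples. The helper name `make_windows` is hypothetical, and each element of `sequence` is assumed to be one feature vector Y(t), for example a list of six numbers.

```python
def make_windows(sequence, n_steps):
    """Build (inputs, targets) pairs from a time-ordered list of feature vectors.

    Each input holds n_steps consecutive vectors Y(t-n+1), ..., Y(t),
    and the matching target is the next vector Y(t+1).
    """
    inputs, targets = [], []
    for end in range(n_steps, len(sequence)):
        inputs.append(sequence[end - n_steps:end])
        targets.append(sequence[end])
    return inputs, targets
```

A sequence of length N therefore yields N - n_steps training samples, which is why very short voyage segments contribute little to training.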

B. EXPERIMENTAL SIMULATION ANALYSIS

1) Experimental environment and data
The experiment is conducted on a Windows 10 (64-bit) system with an Intel(R) Core(TM) i7-8700 CPU at 3.20 GHz and 32.0 GB of RAM. The programming language is Python 3.0, and the IDE is Anaconda Jupyter Notebook 5.6.0, with Keras 2.3.1, a high-level API running on TensorFlow 1.13.1, as the deep learning framework. The data are transformed using min-max normalization:

$$x' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$

where $x_i$ is the original AIS data, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the corresponding feature, and $x'$ is the corresponding normalized data.
The normalized AIS data samples are shown in Table 3, where the multidimensional data in the navigation dynamic sequences are jointly scaled to the range [0, 1]. In the subsequent experiments, 1,000 sets of these AIS subsequence segments are randomly selected as the experimental data, with the first 80% of each subsequence considered as the training set and the last 20% considered as the test set with timestamps.
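The normalization and the 80/20 split can be sketched as follows. This is an illustration, not the authors' code: the function names are hypothetical, each row is assumed to be one feature vector, and a constant column is mapped to 0.0 to avoid division by zero, which is a choice made here rather than something the paper specifies.

```python
def fit_min_max(rows):
    """Return per-feature minima and maxima for a list of feature vectors."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform_min_max(rows, mins, maxs):
    """Scale every feature to [0, 1] with x' = (x - min) / (max - min)."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


def chronological_split(rows, train_fraction=0.8):
    """Split a time-ordered sequence into leading train and trailing test parts."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Keeping the split chronological, rather than shuffling, means the test set always lies in the future of the training set, which matches the forecasting setting.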

2) Experimental setup and evaluation index
To comprehensively evaluate each algorithm, 2 sets of 28 simulation experiments are conducted. The test sequences from the AIS dataset are used as the input of the model to be evaluated, and the ship navigational characteristic data for the past m timestamps are used to obtain the predictions for n timestamps in the future. The values of m and n are specified with each experiment. The root mean square error (RMSE) is used as the evaluation index:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2} \tag{8}$$

where $N$ is the number of predictions, $y_i$ is the expected value, and $\hat{y}_i$ is the predicted value. The value range of the RMSE is [0, +∞). An RMSE value approaching zero corresponds to a superior prediction, and a larger RMSE value corresponds to an inferior prediction. The RMSE value is zero when every predicted value matches the expected value, corresponding to the best prediction effect.
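As a small worked illustration of the evaluation index, the RMSE can be computed in plain Python as shown here. The function name `rmse` and its argument names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

For instance, errors of 3 and 4 on two predictions give a mean squared error of 12.5, so the RMSE is the square root of 12.5. Because the errors are squared before averaging, a few large deviations raise the score more than many small ones.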

3) Experimental results and analysis
The BP prediction model is used as a comparative model, and the performance of the LSTM prediction model is evaluated. The number of hidden layers of the BP and LSTM sailing dynamic prediction models is uniformly set as 2; the numbers of neuron nodes in the input, hidden, and output layers are 5(6), 16, and 1, respectively; the activation functions are tanh and sigmoid; the input timing is 50; the initial learning rate is 0.001; and the number of iterations is 300.
After the BP and LSTM models have been trained, the data in the format shown in Eq. (5) are used as the input to predict the voyage dynamic data for n future timestamps. The input time series length is set as 50, and the prediction timestamp length is set as {1, 5, 10, 20, 30, 40, 50} to test the predictive effect of the model over long and short periods. Fig. 9 shows the root mean square error of the predictions of the BP and LSTM models for four types of features in the navigation dynamics: longitude, latitude, speed, and heading. The root mean square error of the LSTM model is smaller than that of the BP model in all four categories, and the error of the BP model is already larger than that of the LSTM model at the first regression step of the test set. As the prediction timestamp progresses, the prediction accuracy of both models decreases, but the root mean square error curve of the BP model increases steeply, whereas that of the LSTM model increases at a considerably smaller rate, so the accuracy gap between the two models widens.

IV. ATTENTION-LSTM-BASED PREDICTION OF SHIP NAVIGATION DYNAMICS

A. ATTENTION-LSTM NEURAL NETWORK
The attention mechanism is widely used in regression problems [29]. An improved LSTM neural network is implemented based on the optimal encoding-decoding framework with reference to the multilayer adaptive modular neural network structure [30]. The attention layer is used as the interface between the LSTM layer and fully connected layer, and the unfolding structure is shown in Fig. 10.
In the LSTM network with the attention mechanism, the hidden layer state information is no longer directly decoded and integrated after the output, but the similarity comparison of the state vectors is performed by the attention mechanism, and several features of the process vectors are learned based on the attention function [31]. In the subsequent output, the attention model combines the bias mechanism to assign different weights to different hidden layer state features and assign a higher attention to the key sequences among these features to obtain detailed information. The minor parts are appropriately alleviated or even ignored, allowing the LSTM network model to obtain more accurate judgments. The detailed calculation unit is shown in Fig. 11.
As shown in Fig. 11, the new process vector in the Attention-LSTM network model depends not only on the state input at the current moment but also on the state sequences at multiple preceding and following instants. The processed vector is denoted $B(t)$, and the input sequence vector is transformed into the hidden layer state vector using Eq. (10). Here, $h(t)$ is the hidden layer state corresponding to the current input sequence $x(t)$, and $\lambda_T(t)$ is the attentional bias weight of the hidden layer state $h(t)$ from the previous stage. The weight $\lambda_T(t)$ is calculated from $\eta_T(t)$, the similarity scalar between the hidden layer state value $h(t)$ and the target state; $R(t)$ and $E(t)$ are the weight coefficient matrices for the corresponding time period, and $\varphi_\eta$ is the corresponding bias value. The corrected hidden layer state value is then obtained from these attention weights.

The essence of the Attention-LSTM neural network is to create a one-to-one mapping relationship between the state sequence of the encoded input and the feature vector of the decoded output, thereby decreasing the loss due to data compression and allowing the model to learn a reasonable vector representation of the input sequence. Moreover, the addition of the attention mechanism does not considerably increase the computational and storage overhead of the LSTM network model.

The Attention-LSTM-based navigation dynamics prediction model includes the input sequence, LSTM hidden layer, attention layer, fully connected layer, and output sequence. After the preprocessed ship AIS navigational dynamic characterization data enter the model, the LSTM layer extracts their dynamic features. Then, the attention layer assigns differentiated weights to strengthen the key features in the sequence, and the final prediction is realized in the fully connected layer. The ship navigation dynamic prediction model is shown in Fig. 12.
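The weighting step of the attention layer can be sketched generically: each hidden state receives a similarity score, the scores are normalized with a softmax so that the weights sum to one, and the weighted sum of hidden states forms the processed vector. This is a common formulation and an assumption here; the paper's exact scoring function (involving R(t), E(t), and the bias) is not reproduced, so the score is passed in as a hypothetical `score_fn` callable.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(hidden_states, score_fn):
    """Return (weights, context) for a list of hidden state vectors.

    score_fn maps one hidden state to a scalar similarity score. The
    context vector is the weighted sum of the hidden states.
    """
    weights = softmax([score_fn(h) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[k] for w, h in zip(weights, hidden_states))
        for k in range(dim)
    ]
    return weights, context
```

One simple choice for illustration is a dot product with the last hidden state, for example `score_fn=lambda h: sum(a * b for a, b in zip(h, hidden_states[-1]))`; a learned scoring network would replace it in a trained model.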

B. EXPERIMENTAL SIMULATION ANALYSIS

1) Experimental environment and data
The experimental programming language is Python 3.0, the integrated development environment is Anaconda Jupyter Notebook 5.6.0, and Keras 2.3.1, a high-level API running on TensorFlow 1.13.1, is used as the deep learning framework. Additionally, in the subsequent experiments, 1,000 sets of AIS subsequence segments are randomly selected as the experimental data, with the first 80% of each subsequence segment considered as the training set and the last 20% considered as the test set with timestamps.

2) Experimental setup and evaluation index
To comprehensively evaluate each algorithm, three sets of 26 simulation experiments are conducted. The test sequences from the AIS dataset are used as the input of the model to be evaluated, and the ship navigational characteristic data for the past m timestamps are used to obtain predictions for n future timestamps. The values of n are set as {1, 5, 10, 20, 30, 40, 50} to test the predictive effect of the model over long and short periods, and the model dropout rate is uniformly set to 0.5, that is, neural units are randomly deactivated with this probability during training to prevent the model from overfitting.

3) Experimental results and analysis
The AIS data are normalized. Specifically, 50 consecutive sequences of dynamic representations of ship navigation, such as t−50, t−49, . . . , t−1, and t, based on the set hidden layer neuron nodes are used as model inputs. Moreover, the representations at moment t + 1 are used as network outputs to train the internal parameters of the Attention-LSTM neural network. The model performance is evaluated by setting different hidden layer neuron nodes {4, 8, 16, 32, 64, 128, 256}.
The variation in the root mean square error of the model under different hidden layer neuron nodes in the experiments is shown in Fig. 13. As shown in Fig. 13, the Attention-LSTM prediction model incurs the minimum error when the number of hidden layer neuron nodes is 18. In the experiment, the overall error of the model in the interval (4,18) tends to decrease with the increase in the number of units, which indicates that the model's ability to fit the ship navigation sequence is enhanced. However, the model error curve increases in the interval (18,26), which indicates that the model is overfitted in the interval range.
The input timing is an important factor affecting the prediction accuracy of the Attention-LSTM model. In the experiments, the model is run with the input timings {10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30} to predict the voyage dynamics for the next timestamp, and the effects of the different input timings on the model performance are evaluated. The variation in the root mean square error of the model under different input timings during the experiment is shown in Fig. 14.
As shown in Fig. 14, the Attention-LSTM prediction model achieves the minimum root mean square error when the input timing is 24. The root mean square error of the prediction model decreases continuously in the step interval (6,24), which indicates that the increase in the input timing can help enhance the fitting ability of the model at this time. Notably, the model error starts to increase in the step interval (24,30), which indicates that the learning ability of the model is insufficient, and an input sequence that is excessively long interferes with the mapping relationship for the instances before and after the ship navigation dynamic data.
The number of iterations affects the overall prediction error of the model. To investigate the influence of the number of iterations on the prediction accuracy of the model, the model is trained according to the set iteration range [1,200] to predict the navigation dynamics for one timestamp in the future. The variation in the root mean square error of the model under different iterations is shown in Fig. 15. In the training process of the Attention-LSTM model, the root mean square error of the model decreases continuously as the number of iterations increases, and the model converges when the epochs reach the interval (100,110), corresponding to the minimum root mean square error. Specifically, the model reaches the optimal state under the current test set and topology. In the subsequent interval (110,200), although the epochs increase and the model training time is sequentially extended, the overall error curve does not exhibit a significant change.
To intuitively express the actual prediction effect of the model, the determined Attention-LSTM neural network topology is used in sliding window mode to recursively predict the longitude, latitude, speed, and heading of one of the ships (MMSI=303031000). Specifically, the length of the prediction timestamp is set to 1, the length of the sliding window is 24, the number of hidden layers of the model is 2, the numbers of neurons in the input, hidden, and output layers are 6, 18, and 1, respectively, the activation functions are tanh and sigmoid, the learning rate is 0.001, and the maximum number of iterations is 110.

Fig. 16 shows the prediction results of the LSTM and Attention-LSTM models for the longitude and latitude in the navigation dynamics. It can be observed that the predicted track of the Attention-LSTM model is closer to the real track. In the straight sailing part, both models achieve a good fit for the ship track, but the results gradually deviate as the prediction sequence lengthens; in the curved navigation section, the prediction results of the LSTM model deviate significantly from the real track, while the Attention-LSTM model still achieves high-precision fitting. Specifically, the minimum error, maximum error, and average error of the Attention-LSTM model in longitude prediction are 0.0001°, 0.0246°, and 0.0029°, respectively, while the corresponding errors of the LSTM model are 0.0004°, 0.0201°, and 0.006°; the corresponding errors of the Attention-LSTM model in latitude prediction are 0°, 0.0043°, and 0.0013°, while those of the LSTM model are 0°, 0.0149°, and 0.0058°.

It can be seen that, in the recursive sequence, the heading regression value of the Attention-LSTM model is closer to the true heading. In the stable stage of the ship, both models show a good fitting effect; in the later stage of ship steering, the heading prediction of the LSTM model shows irregular oscillation, whereas the Attention-LSTM model still achieves a good fit. Specifically, in the heading prediction, the minimum, maximum, and average errors of the Attention-LSTM model are 0.0073°, 6.59°, and 1.2629°, respectively, whereas those of the LSTM model are 0.0306°, 17.4683°, and 2.8079°.
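The recursive sliding-window prediction described above can be sketched independently of any particular network. In this illustration, `predict_next` is a hypothetical stand-in for the trained model: any callable that maps a window of past values to the next value. The window length and the number of steps are parameters.

```python
def recursive_forecast(history, predict_next, window, steps):
    """Roll a fixed-length window forward, feeding each prediction back in.

    history      -- time-ordered past observations (at least `window` long)
    predict_next -- callable mapping a window (list) to the next value
    window       -- sliding window length (24 in the example application)
    steps        -- number of future timestamps to predict
    """
    if len(history) < window:
        raise ValueError("history is shorter than the window length")
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        nxt = predict_next(buffer)
        forecasts.append(nxt)
        buffer = buffer[1:] + [nxt]  # drop the oldest value, append the newest
    return forecasts
```

Because every prediction becomes part of the next input, errors can accumulate along the sequence, which is consistent with the gradual deviation reported for longer prediction sequences.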
In summary, in the example application, the navigation dynamic prediction effect of the Attention-LSTM model is better than that of the LSTM model, indicating that the addition of the attention mechanism effectively improves the ability of the LSTM model to fit the navigation dynamic representation sequence, which is consistent with the previous reasoning. The prediction error of the Attention-LSTM model proposed in this study was found to be within an acceptable range, which can better meet the accuracy requirements of vessel traffic services (VTS) in practical operations.

V. CONCLUSION
Based on ship AIS data, this paper explores two technical aspects, namely, dynamic feature data extraction and ship navigation prediction modeling, and proposes a set of ship navigation dynamics prediction methods that consider the spatial and temporal characteristics of navigation and engineering application requirements. The proposed strategies can effectively enhance the intelligence and accuracy of maritime supervision. The results of the simulation experiments demonstrate that the proposed model is highly robust. Therefore, the proposed method can support decision making for maritime supervision and help ensure the safety of ship navigation.
Although the proposed prediction model exhibits a high performance in the simulation experiments, certain limitations remain. Future research can focus on the following aspects.
(1) During the actual navigation of the ship, the ship navigation dynamics are affected by wind, waves, currents, and other external disturbances in the sea environment. Therefore, in future work, the sea environment information can be obtained and combined with the ship AIS data to extend the dimensionality of the input sequence.
(2) In this paper, the AIS data are preprocessed. However, we do not attempt to enhance the quality of AIS data. Future research can focus on extensively cleaning this part of the data.
(3) In terms of applied research, although the proposed ship navigation dynamics prediction method achieves the ideal prediction effect in simulations, its validity has not been verified in real navigation practice. Future research can be aimed at verifying the prediction effect based on actual marine traffic control platforms.
TIAN LIU was born in Weinan, Shaanxi, China, in 1995. She has been a master's candidate in transportation at Shandong Jiaotong University, Shandong, China, since 2021.
From 2021 to now, she has been studying at Shandong Traffic College, mainly engaged in research in the fields of the application of big data of governance and the navigation safety of commercial and fishing vessels. From 2012 to now, he has been working at Shandong Traffic College and is now an associate professor in the College of Navigation of Shandong Traffic College, mainly engaged in research in the fields of the application of big data of governance and the navigation safety of commercial and fishing vessels. VOLUME 4, 2016