A Review of Artificial Neural Networks Applications in Maritime Industry

Artificial neural networks (ANNs) are a data-driven tool that has been used for modeling, prediction, optimization, classification, diagnostics, decision-making, etc., in various systems where measurements are available to produce a significant amount of data. Ship processes are constantly monitored in order to control the operation of the ship and to ensure an efficient and safe environment, generating large amounts of data. Those data are increasingly being exploited by ANNs, and the number of applications is growing. The aim of this paper is to analyze the applications of ANNs in the maritime industry, and especially on ships. Based on a review of the sixty-nine papers on this topic published over the last 10 years in relevant databases, applications have been classified into eight categories. ANN types, training algorithms, activation functions, as well as measures used to evaluate the performance of the ANN models, have been analyzed for each application category. ANNs rely on data; therefore, data acquisition, data processing, organization of the data for training ANN models, and their validation and testing have also been addressed. The conclusions from the review should be useful for future work in the area of ANN applications on ships and in the maritime industry.


I. INTRODUCTION
Large ships are very complex systems that are constantly being upgraded to a higher level of automation, moving towards fully automated, autonomous and remotely controlled concepts in order to reduce costs and increase efficiency. Measurements are essential to achieving those goals, and the number of monitored variables is constantly increasing, making large amounts of data available for analysis, modeling, prediction, control, decision-making, etc.
Data-driven tools, such as artificial neural networks (ANNs), have therefore easily found their way into a number of different applications in the maritime industry and on ships. ANNs have also been widely used in the underwater environment, e.g., for localization and control of underwater vehicles, as in [1] and [2], and for underwater communications, as in [3] and [4]; however, such applications are not within the scope of this paper.
The associate editor coordinating the review of this manuscript and approving it for publication was Tao Wang.
ANNs themselves have grown as a tool over the years, becoming more advanced and more specialized in solving various problems, which has resulted in a large number of ANN types and training algorithms developed for different purposes. In addition, advances in communication systems in terms of availability, capacity, and speed of data transfer have opened up new possibilities for real-time exploitation of these data, making data-driven tools such as ANNs even more attractive.
Sixty-nine papers reporting on applications of ANNs on ships and in the maritime industry over the last decade have been found in relevant databases, including Google Scholar, Web of Science and Scopus. In order to get a clear perspective on current standards in the field, the papers were analyzed in terms of applications, ANN types, activation functions, evaluation measures, methods for data collection, data processing and the organization of the data.
The results are summarized in a table, and some graphical representations of the results are provided. Finally, conclusions were drawn about current trends in the field, the most common applications, ANN types, activation functions, training algorithms and the evaluation measures used for each application and in general, as well as about data collection and organization methods.
FIGURE 1. Biological neuron [9].
The continuous efforts of maritime stakeholders to utilize the data available onboard a ship highlight the importance of data processing using technologies such as ANNs, as analyzed in [5].
The paper aims to reveal some uncovered areas in the field, as well as potential limitations in developing an ANN model for a particular application, as discussed in the conclusion, thus suggesting further work. This paper does not intend to provide a deeper insight into ANNs; however, a brief description of ANN types, algorithms and activation functions is provided for better understanding.
The rest of the paper is organized as follows: Section II introduces the artificial neural network types and algorithms found in the reviewed literature. Section III presents the applications of ANNs to ships and ship's systems. Section IV presents the analysis of the applications introduced in the previous section. Finally, Section V concludes the paper.

II. ARTIFICIAL NEURAL NETWORKS
The foundations of artificial neural networks (ANNs) were laid in 1943 by McCulloch and Pitts [6], who introduced a neuron model based on a real, biological neuron. Both the input data and the output data, which result from the neuron's processing of the inputs, were Boolean. In 1958, Rosenblatt introduced a slightly different model of a neuron, called the Perceptron [7], introducing weights on the input connections and the possibility to work with non-Boolean data, thus processing any real input. To understand the functioning of an ANN, it is necessary to descend to the level of the individual neuron. The biological neuron, as shown in Fig. 1, consists of the cell body containing the nucleus, dendrites, axon, and synapses.
Artificial neurons are modelled after biological neurons, with the functions performed by the nucleus, dendrites, synapses, and axon described mathematically. An ANN is a set of artificial neurons arranged in layers, forming a network capable of processing data. Based on the type of neurons and the way they are connected, different ANNs have been developed over time and applied to different fields of science [8]. Furthermore, different neurons use different activation functions, and to make an ANN useful it must be trained, where the options are also diverse. All these aspects of ANNs are continuously evolving, making them an even more flexible tool than they originally were by their nature.

A. ARTIFICIAL NEURAL NETWORK TYPES
Although there are many neural network types, only those used in the reviewed literature are briefly described in this section. The network types used include the Feed-forward ANN (FFANN), Recurrent neural network (RNN), Convolutional neural network (CNN), Radial Basis Function network (RBF-ANN), Self-organizing map (SOM), Generalized regression neural network (GRNN), Support Vector Machine (SVM), Long short-term memory network (LSTM) and Bayesian-Transformer neural network (BTNN).

1) FEEDFORWARD NEURAL NETWORK (FFANN)
The most common type of ANN is the Feedforward Artificial Neural Network (FFANN), used for prediction and regression problems. An FFANN is sometimes referred to as a multi-layer perceptron (MLP), and its architecture is presented in Fig. 2. It consists of an input layer of neurons accepting input data x_i, one or multiple layers of hidden neurons, and an output layer of neurons giving the output data y_j. Each neuron in one layer is connected to the neurons in the following layer.
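The layered structure described above can be illustrated with a minimal forward-pass sketch. The 2-3-1 layout, the hand-picked weights, the sigmoid hidden activation and the linear output neuron are assumptions chosen for illustration; they are not taken from any of the reviewed papers.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every input feeds every neuron of the layer."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Propagate x through a list of (weights, biases, activation) layers."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical 2-3-1 network with hand-picked weights (illustrative only).
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], sigmoid),
    ([[0.7, -0.5, 0.2]], [0.05], lambda z: z),  # linear output neuron
]
y = forward([1.0, 2.0], layers)  # list holding the single output y_1
```

In practice the weights would be obtained by a training algorithm rather than set by hand.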

2) RECURRENT NEURAL NETWORK (RNN)
A Recurrent Neural Network differs from an FFANN by having at least one feedback connection, and thus at least one feedback loop. It is used for prediction problems and pattern recognition. The neuron, as shown in Fig. 3, provides feedback from its output to the inputs of other neurons. An RNN can have a hidden layer of neurons, or the output can be fed back to the same neuron, creating a self-feedback loop. As stated in [10], a feedback loop has a great impact on the learning capabilities and performance of the network. Such a network uses unit-time delay operators, denoted as z^-1, to achieve nonlinear dynamic behavior.

3) CONVOLUTIONAL NEURAL NETWORK (CNN)
A Convolutional Neural Network is a class of ANN used for image analysis tasks, such as pattern recognition and object identification, based on the convolution operation between matrices. Such a network has four types of layers, starting from the first: convolutional layer, optional non-linearity layer, pooling layer, and fully connected layer, as shown in Fig. 4. The number of layers of each type defines the depth of the network, i.e., a greater number of layers of these types makes the network deeper. The purpose and architecture of CNNs, as well as the functions of each layer and the selection of activation functions, are thoroughly described in [11] and [12].

4) RADIAL BASIS FUNCTION NETWORK (RBF-ANN)
Radial Basis Function networks are universal approximators and, due to their simpler structure and faster training process, an alternative to MLP networks, as noted in [13]. The architecture is shown in Fig. 5, where i represents the number of neurons in the input layer, h represents the number of neurons in the hidden layer, and j represents the number of neurons in the output layer. 1h represents a nonlinear activation function, such as a Gaussian, thin-plate spline or logistic function, for each neuron in the hidden layer. The output layer sums the results from the previous layer and provides the output denoted as y_j.
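A minimal sketch of the forward pass of such a network follows. It assumes a Gaussian basis function, one shared width for all hidden neurons and a weighted sum in the output layer; the names `centers`, `width` and `out_weights` are hypothetical.

```python
import math

def gaussian(x, center, width):
    """Gaussian radial basis: depends only on the distance between x and center."""
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))

def rbf_forward(x, centers, width, out_weights):
    """Hidden layer evaluates one basis function per center;
    each output neuron forms a weighted sum of the hidden activations."""
    hidden = [gaussian(x, c, width) for c in centers]
    return [sum(w * h for w, h in zip(row, hidden)) for row in out_weights]

# Two hidden neurons, one output neuron (illustrative values only).
centers = [[0.0, 0.0], [1.0, 1.0]]
out_weights = [[2.0, -1.0]]
y = rbf_forward([0.0, 0.0], centers, 1.0, out_weights)
```

An input that coincides with a center drives that hidden neuron to its maximum activation of 1, and the activation decays as the input moves away.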

5) SELF-ORGANIZING MAP (SOM)
A self-organizing map is an m by n grid of neurons, as shown in Fig. 6, which compete with each other following the rule of competitiveness, by which the neuron whose pattern is most similar to the input pattern wins. A SOM is trained using competitive learning algorithms such as Kohonen's, as described in [14], rather than using gradient descent or the backpropagation algorithm. Weight coefficients w are stored in a weight matrix. A SOM maps similar patterns close together and can therefore be found in a broad range of applications, as described in [15]. One of the main contributions of the SOM is dimensionality reduction when representing data, since humans cannot visualize high-dimensional data, as stated in [16].
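One competitive-learning step can be sketched as follows: the winning neuron is found by Euclidean distance, and the winner and its grid neighbours are pulled towards the input. The Gaussian neighbourhood function and the `lr` and `radius` values are assumptions, not parameters reported in the reviewed papers.

```python
import math

def best_matching_unit(weights, x):
    """Return (row, col) of the neuron whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = math.dist(w, x)
            if d < best_d:
                best, best_d = (r, c), d
    return best

def som_update(weights, x, lr=0.5, radius=1.0):
    """Move the winner and its grid neighbours towards x (in place)."""
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # neighbourhood strength
            row[c] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return br, bc

# A 1-by-2 map with two-dimensional weight vectors (illustrative values only).
weights = [[[0.0, 0.0], [1.0, 1.0]]]
winner = som_update(weights, [0.9, 0.9])
```

Repeating this step over many inputs, usually while shrinking `lr` and `radius`, is what arranges similar patterns close together on the grid.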

6) GENERALIZED REGRESSION NEURAL NETWORK (GRNN)
GRNN networks are a variation of RBF networks used for regression, classification or prediction, first proposed in [17], and thus share the architecture presented in Fig. 5. A GRNN is characterized by a parallel structure, which can make the network computationally expensive. It is based on nonparametric regression, i.e., the network does not predetermine the form of the solution but rather constructs the solution from the input data. It uses Gaussian functions as a basis of operation, thus achieving high estimation accuracy, and it is applicable to noisy data.

7) SUPPORT VECTOR MACHINE (SVM)
SVM, shown in Fig. 7, is a supervised learning network that utilizes algorithms for data classification and regression analysis. According to [18], SVM plays a significant role in pattern recognition when the data sets are not extremely large. Feature vectors of the input data x_i are calculated using kernel functions K, which are essentially similarity functions provided to the network to calculate the similarity of two inputs. This makes calculations computationally inexpensive for reasonably sized data sets, measured in thousands of data samples, as stated in [19].

8) LONG SHORT-TERM MEMORY (LSTM) NETWORK
An LSTM network is a type of RNN with a cell which holds an input gate, an output gate and a forget gate. The architecture of the network is shown in Fig. 8. An LSTM network is created by replacing the neurons in the hidden layer with LSTM memory cells. Such a network has the ability to memorize the long-term dependencies of the variables, hence the term memory in the name. Due to this ability, the network can process time series data in sequences, rather than processing a single data point at a time. Furthermore, it solves the vanishing gradient problem, as stated in [20], a very common problem in multi-layer ANNs that use partial derivatives of the error function to update weights. When this problem occurs, the partial derivative values tend to zero, thus preventing further training. LSTM networks are thoroughly described in [21].
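The interplay of the three gates can be sketched for a single scalar memory cell. The parameter layout (one `(w_x, w_h, b)` triple per gate) and the extra candidate value are assumptions following the commonly used LSTM formulation, not a description of any particular reviewed model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.
    params maps a gate name to (w_x, w_h, b); all values here are hypothetical."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to store
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # proposed new content
    c = f * c_prev + i * g             # updated cell state (the "memory")
    h = o * math.tanh(c)               # new hidden state / output
    return h, c

# Process a short sequence, carrying h and c from one step to the next.
params = {name: (0.5, 0.1, 0.0) for name in ("forget", "input", "output", "candidate")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:
    h, c = lstm_step(x, h, c, params)
```

Because the cell state is updated additively, gated by `f` and `i`, information can be carried across many time steps.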

9) BAYESIAN-TRANSFORMER NEURAL NETWORK (BTNN)
BTNN, shown in Fig. 9, is a network with an encoder-decoder architecture, as described in [22], augmented with the Bayesian principle by means of Bayesian transformer units (BTU) to provide more reliable probability calculation and capture the relationship between the input data. It is based on

B. ACTIVATION FUNCTIONS
Most commonly used activation functions found in reviewed literature are presented in Table 1 along with mathematical expressions and the applications in neural networks.
Where: K is the number of classes in a multi-class classifier, x is a data point, x_1 and x_2 are two data points in space, and α is the coefficient determining the slope of the function. It is worth mentioning that some artificial neural networks, such as self-organizing maps (SOM), do not use activation functions and only transfer the data between neurons.
Furthermore, some layers of convolutional neural networks (CNN) do not use activation functions and only process the data using pooling functions such as maxpool, avgpool or minpool.
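For orientation, some widely used activation functions can be written as short Python functions. The selection below is an assumption based on the symbols K, α, x_1 and x_2 explained above and may not match the contents of Table 1 exactly; `sigma` is a hypothetical width parameter.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """alpha sets the slope of the function for negative inputs."""
    return x if x > 0 else alpha * x

def softmax(xs):
    """Turns K scores into K class probabilities that sum to 1."""
    m = max(xs)                              # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_rbf(x1, x2, sigma=1.0):
    """Similarity of two data points, based on their squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-d2 / (2.0 * sigma ** 2))

probabilities = softmax([1.0, 2.0, 3.0])     # K = 3 classes
```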

C. TRAINING ALGORITHMS
ANNs resemble biological neural networks not only in their structure but in their way of operation as well. One of the fundamental properties of an ANN is its ability to learn from examples. These examples are presented to the network in the form of data, making an ANN a data-driven tool. A training algorithm is then used to train the network to perform a certain task; to make an ANN useful, it must be trained. If adequate data are presented to the network, the error it makes in the desired calculations can be determined. This error is usually expressed in the form of a function called the loss function. The process of training therefore usually implies calculating the loss function while updating and refining the weight coefficients, i.e., the adjustable network parameters, with the goal of minimizing the value of the loss function. Consequently, the knowledge of the network is contained in the weight coefficients.
The training process can be supervised, unsupervised or reinforced. Supervised learning implies presenting data to the network as input-output data samples, thereby giving the ANN information about what the resulting output should be. In the case of unsupervised learning, however, the output is a priori unknown. There are no input-output examples, and the rule of competitiveness is used to train the network. Thus, there is a competitive layer of neurons, competing to activate in reaction to the input data.
Furthermore, reinforcement learning is performed through continuous interaction with the environment while iteratively determining the actions which lead to the minimization of the cost function. Since the cost function is not known, the iterative procedure is based on a trial-and-error method in which a ''teacher'' or ''critic'' evaluates each step of the procedure and, in case of a good decision, rewards the network. This method of learning is based on Thorndike's law of effect, as explained in [23], according to which the probability that the system will retake the same actions increases if they led to a positive effect.
During the iterative process of training, the network processes the input data and adjusts the weight coefficients using one or multiple algorithms. In the following subsections, the most commonly used algorithms are briefly described.

1) BACKPROPAGATION (BP)
The backpropagation algorithm is a two-stage algorithm. In the first, feed-forward stage, the network is fed with the input data, based on which the output is calculated. Since this is a supervised learning algorithm, the desired output value for a given input is known and is used in the second, back-propagation stage to calculate the error. The BP algorithm propagates the error backwards to adjust the weights of the ANN, hence the name backpropagation. The weights are adjusted by minimizing the sum of squared errors using the gradient descent method, based on partial derivatives of the error with respect to the weights, which in turn depend on the derivative of the activation function, in order to converge to the minimum value of the error.
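The two stages can be sketched for a small network with one hidden layer. The 2-2-1 layout, sigmoid activations, initial weights, learning rate and the single training sample are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, target, p, lr=0.5):
    """One backpropagation step; p holds w1, b1, w2, b2 and is updated in place."""
    # Stage 1: feed-forward
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["w1"], p["b1"])]
    y = sigmoid(sum(w * hj for w, hj in zip(p["w2"], h)) + p["b2"])
    error = 0.5 * (y - target) ** 2

    # Stage 2: propagate the error backwards (chain rule through the sigmoids)
    delta_out = (y - target) * y * (1.0 - y)
    delta_hid = [delta_out * p["w2"][j] * h[j] * (1.0 - h[j]) for j in range(len(h))]

    # Gradient descent update of all weights and biases
    for j in range(len(h)):
        p["w2"][j] -= lr * delta_out * h[j]
        p["b1"][j] -= lr * delta_hid[j]
        for i in range(len(x)):
            p["w1"][j][i] -= lr * delta_hid[j] * x[i]
    p["b2"] -= lr * delta_out
    return error

params = {"w1": [[0.1, -0.2], [0.3, 0.4]], "b1": [0.0, 0.0],
          "w2": [0.5, -0.5], "b2": 0.0}
errors = [train_step([1.0, 0.5], 1.0, params) for _ in range(200)]
```

The recorded `errors` list shows the squared error shrinking as the weights are repeatedly adjusted for this sample.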

2) LEVENBERG-MARQUARDT (LM)
Levenberg-Marquardt (LM), sometimes referred to as the damped least-squares (DLS) method, proposed in [24] and optimized in [25], is used for curve fitting problems. This algorithm converges to the minimum error by combining the Gauss-Newton and gradient descent methods. With a small value of the damping parameter, the Levenberg-Marquardt algorithm approaches the Gauss-Newton method, while with a large value it approaches the gradient descent method. In general, on function approximation problems, LM was shown to be the fastest and most appropriate algorithm for training FFANN networks containing up to a few hundred adjustable parameters [26].

3) DENSITY BASED SCAN
The DBScan algorithm, a density-based clustering algorithm, is an unsupervised learning method for spatial clustering based on the number of neighbors within a defined radius. If a data point has at least the defined minimum number of neighbors within the defined radius, it is considered to be a core point of a cluster. Data points within the defined radius around a core point are considered to belong to that cluster. Unlike the K-means algorithm, the DBScan algorithm, as described in [27], does not require the number of clusters to be defined a priori.
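A compact sketch of this procedure is given below. It is a plain, unoptimized implementation; the parameter names `eps` (radius) and `min_pts` (minimum number of neighbours, counting the point itself) and the sample points are assumptions.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)          # None means "not visited yet"

    def region(i):
        """Indices of all points within eps of point i (including i itself)."""
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(reachable)
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The number of clusters is never passed in; it follows from the density of the data, and the isolated point is reported as noise.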

4) ALMEIDA-PINEDA
The Almeida-Pineda algorithm, proposed in [28], is a supervised learning backpropagation algorithm used in RNNs. It is therefore usually referred to as the recurrent backpropagation (RBP) algorithm, and it is based on calculating the error gradient and propagating it backwards through the recurrent neural network.

5) EUCLIDEAN DISTANCE
Euclidean distance, which is used in self-organizing maps (SOM), is a measure of the distance between two points. It is useful in a SOM for determining similar clusters, i.e., Euclidean distance is used to cluster the data around a specific cluster center. Clusters separated by a small Euclidean distance are likely to contain similar data, as explained in [10].

6) STOCHASTIC GRADIENT DESCENT (SGD)
Stochastic gradient descent, as described in [29], is an iterative procedure for minimizing the error function, just like regular gradient descent (GD). The difference between regular and stochastic gradient descent lies in the way the data are selected for each step of the iterative process. The data used for calculating the gradient are selected randomly from the whole data set, hence the name stochastic. The advantages of the SGD algorithm and its properties in the process of training have been analyzed in [30]. A cross between SGD and GD is mini-batch gradient descent, where the data set used for training is divided into small batches (groups) and the algorithm calculates the gradient of each batch. The performance of mini-batch gradient descent has been evaluated in [31].
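The relation between GD, SGD and mini-batch gradient descent can be sketched on a simple line-fitting problem. The model y = w·x + b, the mean squared error loss, the synthetic data and all hyperparameter values are assumptions chosen for illustration.

```python
import random

def minibatch_sgd(xs, ys, lr=0.1, batch_size=4, epochs=400, seed=0):
    """Fit y = w*x + b by minimizing the mean squared error.
    batch_size=1 gives plain SGD; batch_size=len(xs) gives regular GD."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                       # random selection of the data
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over this batch only
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Noise-free synthetic data generated from y = 2x + 1
xs = [i / 19 for i in range(20)]
ys = [2 * x + 1 for x in xs]
w, b = minibatch_sgd(xs, ys)
```

Smaller batches give cheaper but noisier updates, while larger batches give smoother updates at a higher cost per step.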

7) PARTICLE SWARM OPTIMIZATION (PSO)
PSO, as proposed in [32], is an iterative procedure for finding the optimal solution, i.e., the minimum of the error function. Several random data points, called particles, are selected on the error surface. The role of each particle is to search for the minimum value in random directions. In every iteration, a particle searches around the previously found minimum value, which, after a certain number of iterations, results in the swarm of particles finding the overall minimum value.

8) GENETIC ALGORITHM
A genetic algorithm (GA) is an algorithm inspired by Charles Darwin's theory of natural evolution. A GA starts with a set of data called the population, where each data point is a candidate solution to the problem. Data points are defined by parameters which represent genes. When deploying a GA, the genes are represented in a string called a chromosome. Each data sample is evaluated to determine its ability to compete with other data samples. The best performing data samples are allowed to pass their genes (parameters) to the next generation by crossover. The algorithm comes to an end when the population converges to the best solution, i.e., when future generations do not provide significantly better results. Past, recent, and future advances of GAs are examined in [33].

9) A* ALGORITHM
The A* algorithm, commonly called A-star, as explained in [34] and used in [35], is a graph or path searching algorithm. Its goal is to find the shortest path from an initial to a final point by using a grid composed of cells and combining heuristic searching with searching based on the shortest path. The advantage of this algorithm lies in its simplicity and fast calculation, while its inability to search at every angle is a disadvantage.
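A minimal grid-based sketch is shown below. It assumes four-directional movement with unit step cost and a Manhattan-distance heuristic, which also illustrates why the path cannot follow arbitrary angles; the grid encoding ('#' for an obstacle) is hypothetical.

```python
import heapq

def a_star(grid, start, goal):
    """Return the length of the shortest path in steps, or None if unreachable.
    grid is a list of equal-length strings; '#' marks a blocked cell."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates the cost with 4-directional moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]   # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                             # outdated heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    heapq.heappush(open_heap, (new_g + heuristic((nr, nc)), new_g, (nr, nc)))
    return None

grid = ["....",
        ".##.",
        "...."]
steps = a_star(grid, (0, 0), (2, 3))
```

The heuristic steers the search towards the goal, while the accumulated cost `g` keeps the returned path the shortest one.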

10) K-MEANS ALGORITHM
The K-means algorithm, described in [36], is a clustering algorithm. It is an unsupervised learning technique which requires the number of clusters K to be defined a priori. Each cluster is initially represented by a randomly selected centroid. The squared distance between each data point and each centroid is calculated, and each data point is assigned to the cluster represented by the closest centroid. Each centroid is then recalculated as the average of all data points assigned to its cluster. The algorithm repeats this process iteratively until the assignment of data points to clusters no longer changes.
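The assignment and update steps can be sketched as follows. The initialization (picking K of the data points at random), the iteration cap and the sample points are assumptions made for this illustration.

```python
import math
import random

def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, assignment), where assignment[i] is the cluster of points[i]."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # randomly selected initial centroids
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its closest centroid
        new_assignment = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                          for p in points]
        if new_assignment == assignment:
            break                                # assignments are stable: stop
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its assigned points
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignment

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = k_means(points, k=2)
```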

11) K-NEAREST NEIGHBOUR ALGORITHM
K-nearest neighbor, or simply k-NN, is an algorithm used for classification and regression, as described in [36]. It is a supervised learning technique based on proximity evaluation using measures such as Euclidean distance, Minkowski distance, Hamming distance, etc. When applied for classification purposes, the data example being classified is assigned to a class based on a predefined number k of nearest neighbors. The class membership of the k nearest neighbors is analyzed, and the data example is assigned to the class that prevails among them. This algorithm is sometimes called lazy learning, since the network utilizing it does not learn during training but rather responds when a query is performed, which makes it suitable for data mining.
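A minimal classification sketch using Euclidean distance and majority voting is given below. The feature vectors and the class labels are made up for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """train is a list of (feature_vector, label) pairs.
    No model is fitted in advance; all work happens at query time (lazy learning)."""
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]            # label that prevails among the k neighbours

# Hypothetical two-feature samples with made-up ship-type labels
train = [((0, 0), "tanker"), ((0, 1), "tanker"), ((1, 0), "tanker"),
         ((5, 5), "ferry"), ((5, 6), "ferry"), ((6, 5), "ferry")]
prediction = knn_classify(train, (0.5, 0.5))
```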

12) RADIAL BASIS FUNCTION (RBF)
A radial basis function (RBF) is a real-valued function of the distance between the input and a fixed point. The fixed point can be the origin or a center, as explained in [37] and [38]. Euclidean distance is usually used to calculate the distance between the points, but other methods of calculating distance can also be used. RBFs are usually applied in RBF neural networks for non-linear data classification or for approximating multivariate functions. Furthermore, the RBF is used as a kernel in SVM networks, as stated in [10].

13) ADAM ALGORITHM
The Adam algorithm, or Adam optimizer, is an extension of the SGD algorithm, as described in [39]. It is used for iterative updating of the weights of a neural network. Its main contribution is to compute the learning rate based on estimates of the first and second moments of the gradient, thus iteratively adjusting the learning rate, as opposed to SGD, which uses the same learning rate throughout the learning process. Since the learning rate is adjusted iteratively, convergence to the minimum of the error function is more efficient, thus decreasing the number of iterations in the training process, as explained in [40].
139828 VOLUME 11, 2023
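The moment-based update used by Adam can be sketched as follows. The hyperparameter values and the quadratic test function are assumptions; `grad` is a hypothetical user-supplied function returning the gradient at the current parameters.

```python
import math

def adam_minimize(grad, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a function given its gradient; theta is a list updated in place."""
    m = [0.0] * len(theta)   # running estimate of the first moment (mean of gradients)
    v = [0.0] * len(theta)   # running estimate of the second moment (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)      # bias correction for the zero initialization
            v_hat = v[i] / (1 - beta2 ** t)
            # Effective step size is scaled separately for every parameter
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

def quadratic_grad(p):
    """Gradient of f(x, y) = (x - 3)^2 + (y + 1)^2, whose minimum is at (3, -1)."""
    return [2 * (p[0] - 3), 2 * (p[1] + 1)]

theta = adam_minimize(quadratic_grad, [0.0, 0.0])
```

Dividing by the square root of the second-moment estimate is what gives each parameter its own, continuously adjusted step size.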

14) EXTREME LEARNING MACHINE (ELM)
The ELM algorithm, as proposed and explained in [41], is used with the ELM, a type of feedforward neural network with a single hidden layer which, as opposed to the usual FFANN, does not use the gradient descent method to update the weights but rather uses the Moore-Penrose generalized inverse. It is used for classification, clustering, regression, and feature learning purposes and offers extremely fast learning, hence the name extreme learning machine.

15) DYNAMIC TIME WARPING (DTW)
DTW is an algorithm used in the similarity analysis of time series. It is used to compare two differently sized arrays of time-dependent data. An optimal alignment between two time series is found based on a distance or cost measure, as explained in [42]. Since the two time series arrays are not the same length, the DTW algorithm ''warps'' the time series so that they match up. The algorithm is used in sound or speech recognition, signature recognition and other sample matching applications.
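The alignment cost can be computed with a short dynamic-programming routine. The absolute difference used as the local cost and the sample series are assumptions made for this sketch.

```python
def dtw_distance(a, b):
    """Cost of the optimal alignment between two numeric series of any lengths."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] matched again
                                     cost[i][j - 1],      # b[j-1] matched again
                                     cost[i - 1][j - 1])  # both series advance
    return cost[n][m]

# The second series repeats one sample; warping absorbs the repetition.
distance = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Allowing a sample to be matched more than once is what lets series of different lengths, or series recorded at different speeds, be compared.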

16) RANDOM FOREST ALGORITHM
The RF algorithm, as explained in [43], is a supervised learning technique that combines multiple learners, i.e., ensemble learning, for classification and regression purposes. Ensemble learning here is based on bagging. Bagging, or bootstrap aggregation, is a technique that draws random samples from the data set and generates a tree from each sample (bootstrapping). The final result is determined by majority voting after the results of all trees have been combined (aggregation).

17) QUICKBUNDLE CLUSTERING ALGORITHM
The QuickBundle clustering algorithm is used for tractography clustering in neuroscience, as proposed in [44]. The algorithm is also applicable to trajectory clustering, as stated in [45]. It is based on the symmetric distance function proposed in [44], which calculates the distance between two trajectories based on the mean Euclidean distance. Based on the results of the distance evaluation, similar trajectories are grouped into the same cluster.

18) VARIATIONAL INFERENCE ALGORITHM (VI)
Variational inference is a machine learning method used to estimate complex probability densities. It is based on Bayesian statistics and is used to estimate the posterior probability of a model when it cannot be calculated as an integral, as explained in [46]. Other methods such as Monte Carlo or Markov Chain can also be used to estimate the posterior probability of a model. However, when the data set has many features, the VI method is used to determine a set of possible posterior probabilities, among which the best one is selected using the Kullback-Leibler (KL) divergence.

19) SALIENCY DETECTION ALGORITHM
A saliency detection algorithm is used to pre-process images in computer vision applications with the purpose of finding salient (noticeable) features or objects. It is based on gradients of the output with respect to the input, thus highlighting the area of the input image that noticeably stands out and pointing out the object of interest in the image. State-of-the-art saliency algorithms have been reviewed in [47].

20) BAYESIAN REGULARIZATION ALGORITHM
Bayesian regularization (BR) can be interpreted as an expanded version of the Levenberg-Marquardt (LM) algorithm. It is based on a modification of the objective function used in the LM algorithm, whose aim is to reduce the sum of squared errors (SSE). BR adds an additional term SSW, i.e., the sum of the squared adjustable network parameters (w). In this way, the algorithm minimizes both the error and the number of adjustable network parameters needed to achieve a minimum error, where the parameters α and β are used to define the priority of the learning process. If α ≪ β, the training algorithm will drive the errors smaller, and if α ≫ β, training will emphasize weight size reduction.
Sometimes an ANN performs well during the training phase but fails to perform when introduced to the test set. This phenomenon is called overfitting and is usually solved using regularization techniques such as Bayesian regularization.
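The modified objective function can be written out explicitly. The form below is consistent with the roles of α and β described above; the symbols F, t_k, y_k, N and M are notation assumed for this sketch.

```latex
F = \beta \,\mathrm{SSE} + \alpha \,\mathrm{SSW},
\qquad
\mathrm{SSE} = \sum_{k=1}^{N} \left( t_k - y_k \right)^2,
\qquad
\mathrm{SSW} = \sum_{i=1}^{M} w_i^2
```

Here t_k and y_k are the target and the network output for the k-th training sample, N is the number of training samples, and M is the number of adjustable network parameters w_i. When β dominates, the error term governs the minimization; when α dominates, the penalty on the weight magnitudes does.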

21) DECISION TREE ALGORITHM
The decision tree algorithm is a supervised learning method used for classification and regression tasks, as explained in [48]. It is based on a sequence of decisions made on the given input data, which ultimately leads to a final decision or prediction. Since it is based on Boolean logic, it can be represented visually and is therefore easy to understand and interpret. Networks that utilize the decision tree algorithm are usually referred to as decision trees or decision tree networks. According to [49], any neural network using any training algorithm can be categorized as a decision tree.
Similarly, a network can be used as a pre-trained network which is known as transfer learning.In some papers, algorithms were not named since authors did not specify the algorithm used.

III. ANN APPLIED TO SHIP AND SHIP'S SYSTEMS
The applications of ANNs on ships and ship systems are diverse. This paper classifies the applications into eight groups, which are described in the following subsections. The review strategy included searching the relevant literature using all the available databases, including Google Scholar, Web of Science and Scopus. The search was conducted using keywords and search strings, with papers older than 10 years excluded. This resulted in 69 papers, of which 28 are Q1 journal articles, 18 are Q2 journal articles, 8 are Q3 journal articles, and 3 are Q4 journal articles according to the Clarivate Journal Citation Reports, as shown in Fig. 10. The remaining items include 9 conference papers, 1 technical report, and 2 book chapters.
Number of papers published per year is presented in Fig. 11, showing a positive trend of research in this field.

A. SHIP IMAGE CLASSIFICATION AND TYPE IDENTIFICATIONS
An AlexNet convolutional neural network (CNN) was used in [50] to classify 35 types of ships from images. Precision was used as the evaluation measure. Similarly, [51] used a YOLOv4 CNN and a Kalman filter to detect ships in real-time surveillance.
Furthermore, a Grid CNN (GCNN) was proposed and used in [52]. Images were batch normalized and divided into three data sets: a training data set containing 70% of all data, a validation data set containing 20% of the data, and a test data set containing 10% of the data. Precision, accuracy and mean average precision (mAP) were used as evaluation measures. The network was developed using Keras and Python and trained using the Adam optimizer algorithm.
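A 70/20/10 split of this kind can be sketched as follows. The shuffling, the fixed random seed and the function name are assumptions; the reviewed papers do not state how samples were assigned to each set.

```python
import random

def split_dataset(samples, train=0.7, val=0.2, seed=0):
    """Shuffle the samples and split them into training, validation and test sets.
    The test set receives whatever remains after the first two sets are filled."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(round(train * len(items)))
    n_val = int(round(val * len(items)))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# 100 placeholder sample identifiers standing in for images
train_set, val_set, test_set = split_dataset(range(100))
```

The validation set is typically used for tuning during training, and the test set is held back for the final evaluation.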
Similarly, SCLANet, based on Faster R-CNN, was proposed in [53] for the purpose of ship detection in SAR images. The training data set includes 70% of all images, while the remaining 30% comprise the test set. The network was developed using Python 3.6 and trained using the stochastic gradient descent algorithm. The performance of the network was evaluated using average precision (AP) and compared to other similar solutions such as Faster R-CNN, YOLOv4 (You Only Look Once, Version 4), STAC (Self-Training and the Augmentation driven Consistency regularization), CSD-SSD (Consistency-based Semi-supervised learning method for object Detection) and UFO2 (Unified Object detection Framework) networks.
Similarly, a depthwise separable CNN (DS-CNN) was used in [54] to detect ships in SAR images. Images were batch normalized (BN) and divided into three data sets: a training data set consisting of 70% of all images, a validation data set consisting of 20% of all images, and a test set consisting of the remaining 10% of all images. The network was developed on the PyCharm software platform using the Python programming language, Keras and TensorFlow. The Adam optimizer was used as the training algorithm. The proposed solution was compared to similar solutions, including YOLOv2, YOLOv3, RetinaNet, Faster R-CNN, YOLOv2Tiny and YOLOv3Tiny, in terms of precision, recall, mAP and frames per second (FPS). Furthermore, the influence of pre-training on large data sets was investigated, as well as the anchor box mechanism, concatenation mechanism and multiscale detection mechanism.
In [55], a CNN based on YOLOv2 was used to detect ships and classify them into six classes from real-time video surveillance. AP, mAP, recall, precision and FPS were used as evaluation measures. The proposed solution was compared to other solutions for the same purpose, such as YOLOv2, SSD, Fast VGG, Fast ZF and Faster VGG.
A Coarse-to-Fine CNN was used in [56] to detect ships and classify them into one of seven classes. Images were divided into a training data set and a test data set in an 80% to 20% ratio. Accuracy was used as the evaluation measure. The proposed solution, created using MATLAB, was compared to CNN, RF and K-NN. It should be noted that all the solutions had problems classifying LNG ships.
In [57], a Bayesian transformer neural network (BTNN) was used to identify ship type based on ship motion information. Six months of AIS data were normalized to the 0-1 interval, with 80% of the data forming the training data set and the remaining 20% forming the test data set. PyTorch was used for training the network using variational inference (VI) and Kullback-Leibler (KL) methods. Performance of the network was evaluated using accuracy, weighted precision, weighted recall and weighted F1 score measures. Furthermore, the proposed solution was compared with SVM, FFANN, LSTM and RNN.
In [58], a YOLOv4LITE CNN was used to detect ships on SAR images. Images have been clustered using the K-means algorithm and divided into training, validation, and test data sets in 70%, 20% and 10% ratios, respectively. PyTorch, Python and CUDA10 have been used to train the network, which was evaluated using mAP, AP, precision, FOM (figure of merit), recall and FPS evaluation measures.
Furthermore, a VGG16 CNN has been used in [59] for ship classification from images. Gradient descent has been used for error minimization while the data was divided into training and test data sets in 80% to 20% ratio, respectively. As an evaluation measure, accuracy has been used.
Similarly, [60] used a four-layer CNN to classify ships in six classes using infrared images. The Extreme Learning Machine (ELM) algorithm has been used along with the sigmoid activation function. The network was compared with the same network trained using the BP algorithm, using accuracy as an evaluation measure.
In [61], a CNN is used to identify ship type in video surveillance. Wavelet transform was used to filter out noise in grayscale images, resulting in three high frequency components and one low frequency component. Features were extracted using the Hu invariant moments method and then fused with CNN features to achieve better classification performance. The network was developed using PyTorch, MATLAB, and Python. Besides the wavelet transform, the data have been normalized using the z-score method, and horizontal flip has been performed to reduce overfitting. AlexNET CNN was modified by removing one convolutional layer and adding one fully connected layer, while reducing the number of neurons in the first and second convolutional layers, thereby reducing the complexity of the network. As evaluation measures, classification accuracy, F1 score and average time consumption for feature extraction per image have been used.
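The two preprocessing steps named for [61], z-score normalization and horizontal flip, can be sketched as follows. This is a minimal illustration on a tiny hypothetical grayscale image stored as nested lists; the function names are assumptions and the code is not taken from [61].

```python
import statistics

def zscore(image):
    """Rescale pixel values to zero mean and unit (population) standard deviation."""
    pixels = [p for row in image for p in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:  # constant image: every pixel equals the mean
        return [[0.0 for _ in row] for row in image]
    return [[(p - mean) / std for p in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right, giving one extra training example."""
    return [row[::-1] for row in image]

image = [[0, 50, 100],
         [150, 200, 250]]  # hypothetical 2x3 grayscale image
normalized = zscore(image)
flipped = horizontal_flip(image)
```

The flipped copy is added to the training set alongside the original, which is why the technique helps against overfitting when few images are available.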
In [62], various CNN networks have been compared for the purpose of ship classification from images. The networks classified ships into seven classes, using the average intersection over union (AIOU), mean average precision (mAP) and recall as performance evaluation measures. As activation functions, exponential linear unit (ELU) and Leaky rectified linear unit (ReLU) were used along with the iterative convergence training algorithm. YOLOv2, YOLOv3 and the proposed regressive deep CNN (RDCNN) were mutually compared.
In [63], a cascade coupled CNN (3C2N) has been proposed for the purpose of ship detection on satellite images, utilizing the NLM (non-local-mean) algorithm to reduce speckle noise in SAR (synthetic aperture radar) images. As evaluation measures, precision, recall and F1 score were used, which are commonly used quality criteria in object detection applications.
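Precision, recall and F1 score recur throughout this category, so a short sketch of their standard definitions may be useful. The detection counts below are hypothetical and do not come from any reviewed paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical example: 8 ships detected correctly, 2 false alarms, 4 ships missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.667 0.727
```

Because F1 is a harmonic mean, a detector cannot score well by trading many false alarms for a high recall, or the reverse.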
Similarly, [64] uses a CNN to detect ships on synthetic aperture radar (SAR) images. An attention model with sigmoid activation function was proposed to improve results in comparison to regular attention models. Attention models are techniques used to increase focus on a specific object, component, or input data features. As evaluation measures, GIoU (generalized intersection over union), recall, precision, average precision and F1 score were used. Furthermore, the proposed network was compared with similar solutions including YOLOv3.
A CNN was used in [65] to detect ships on SAR images. The network consists of three convolutional layers with ReLU activation function, three pooling layers utilizing the maxpool function, one after each convolutional layer, and one convolutional and one softmax layer. As evaluation measures, precision, recall and intersection over union (IoU) were used.
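Intersection over union (IoU), used in [65] and in its averaged and generalized forms in [62] and [64], measures the overlap between a predicted and a reference bounding box. The sketch below assumes axis-aligned boxes written as `(x_min, y_min, x_max, y_max)`; this box convention is an assumption for illustration, not a detail taken from the reviewed papers.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

predicted = (1, 1, 3, 3)  # hypothetical detection
reference = (0, 0, 2, 2)  # hypothetical ground-truth box
overlap = iou(predicted, reference)  # 1 / 7, since the overlap is 1 and the union is 7
```

A detection is typically counted as a true positive when its IoU with a reference box exceeds a chosen threshold, which links this measure to precision and recall.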
Similarly, an MSARN (Multiscale Adaptive Recalibration Network), a type of CNN, was used in [66] to detect ships on SAR images using the TensorFlow library. As a training algorithm, SGD was used. As evaluation measures, average precision (AP) and precision-recall curve (PRC) were used. Furthermore, the proposed network was compared to other solutions such as RBF Net, YOLOv3, Attention-ResNet and Faster R-CNN.
The most commonly used neural network type in ship image classification and type identification applications is CNN, found in 88.24% of the reviewed literature, followed by FFANN and BTNN, each found in 5.88% of the reviewed literature. As for the training algorithms, SGD and transfer learning stand out among the rest, each found in 25% of the reviewed literature. Recall, precision and accuracy are the most used evaluation measures, found in 20.83%, 18.75% and 14.58% of the literature, respectively.

B. SHIP ROUTE APPLICATIONS
In [67], a three-layer FFANN (7-9-2) has been used to predict course change locations during voyage. Sigmoid has been used as an activation function in the hidden layer and MSE (mean squared error) was used to evaluate performance of the network.
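The notation 7-9-2 denotes seven inputs, nine hidden neurons and two outputs. A forward pass through such a network, followed by an MSE evaluation, can be sketched as below. Only the layer sizes, the hidden-layer sigmoid and the MSE come from the description of [67]; the random weights, the linear output layer, the input vector and the target values are assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, weights, biases, activation=None):
    """Weighted sum per neuron, optionally passed through an activation function."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

def forward(x, hidden, output):
    h = dense(x, *hidden, activation=sigmoid)  # sigmoid hidden layer, as in [67]
    return dense(h, *output)                   # linear output layer (assumption)

def mse(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

rng = random.Random(0)
hidden = init_layer(7, 9, rng)   # 7 inputs -> 9 hidden neurons
output = init_layer(9, 2, rng)   # 9 hidden neurons -> 2 outputs
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]  # hypothetical normalized input vector
y_hat = forward(x, hidden, output)
error = mse(y_hat, [0.0, 1.0])           # hypothetical target values
```

Training would adjust the weights to reduce `error` over many examples; the sketch covers only the architecture and the evaluation measure.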
An FFANN (5-x-x-1) has been used in [68] to predict wave height for the purpose of optimal route selection. The network is realized using the Keras tool in Python along with the A* algorithm, and the performance of the network has been evaluated using MSE.
In [69], an RBF-ANN has been used for predicting the ship position in order to improve path following. A mathematical model has been established to perform simulations and evaluate the network using MAE and mean absolute control effort (MAC).
In [70], a short-term prediction of ship trajectory was performed using an RNN developed using PyTorch and Python. The Adam algorithm was used to update the network parameters, with the normalized AIS parameters as input data. As an evaluation measure, RMSE was used, and the performance of the proposed network was compared to an FFANN with one hidden layer, trained with the BP algorithm.
A CNN was used in [71] to classify ship trajectory based on AIS data. The network was built in TensorFlow and Keras, where the trajectories based on AIS data were developed. All trajectories were presented to the network as images, and flipping and rotation have been applied to create more examples to solve the problem of local minima convergence. The Adam optimizer has been used for training the network, which was later evaluated using accuracy, precision, recall, F1 score and Area Under Curve (AUC). Besides the aforementioned algorithm, a few other algorithms such as K-NN, DT and SVM were used and their impact on network performance has been analyzed.
Similarly, a CNN was used to classify ship trajectory based on AIS data in [72]. The network was used to classify ship trajectories in three classes: voyage, maneuvering and static. Classes were color coded just as in [71]. For network development purposes Keras, TensorFlow and Python were used. The network was evaluated using accuracy, precision and F1 score.
Another application of CNN for the purpose of ship trajectory classification is presented in [73], where a Deep CNN was used. AIS data was, as done in other papers that used CNN, transformed into trajectory image data and clustered. An improved QuickBundle clustering algorithm was used in this paper and compared to the more conventional clustering algorithm, the DBSCAN algorithm. The Adam optimizer algorithm was used, and the network was developed using TensorFlow. As an evaluation measure, accuracy was used.
In [74], an RNN was used to predict ship trajectory based on AIS data. Data have been clustered using the DBSCAN algorithm. As activation functions, bipolar sigmoid and sigmoid were used, which are commonly used activation functions in RNN. The network's performance has been compared to an LSTM solution and evaluated using MSE.
In [75], an LSTM-CNN was used to predict time series of the electric propulsion plant energy consumption while underway, in order to optimize the voyage route and reduce the impact on the environment. Minibatch gradient descent was used as the training algorithm and network performance has been evaluated using RMSE. Bipolar sigmoid limited to the [−2, 2] range was used as an activation function in the hidden layer.
Similarly, [76] used two LSTM RNNs for marine transport surveillance and anomaly detection in ship routes. Evaluation by quantitative methods was not feasible, as no quantitative reference data sets were available, so the results were instead compared graphically to other methods.
The most commonly used neural network type in ship route applications is CNN, found in 40% of the reviewed literature, followed by RNN, FFANN and RBF-ANN, found in 30%, 20% and 10% of the reviewed literature, respectively. As for the training algorithms, Adam optimizer and DBSCAN stand out among the rest, found in 33.33% and 22.22% of the reviewed literature, respectively. The most common activation functions in the hidden layer are sigmoid and ReLU, found in 40% and 30% of the reviewed literature, respectively. Accuracy and MSE are the most commonly used evaluation measures, both found in 21.43% of the reviewed literature, followed by precision, F1 score and RMSE, each found in 14.29% of the reviewed literature.

C. CONDITION MONITORING & CONDITION BASED MAINTENANCE
In [77], an RNN has been used to monitor the condition of a steam plant regeneration exchanger in various working conditions to detect malfunction. The network's three-layer structure has been determined by the trial-and-error method. The first two layers used the sigmoid activation function, while the output layer used a linear activation function. The network was trained using the Pineda and Almeida algorithm and evaluated using MSE.
In [78], self-organizing maps (SOM) are used to monitor piston cooling oil pressure, exhaust gas temperature and piston cooling oil temperature of a panamax container ship's main engine. Data acquired with sensors has been clustered and used to train the network. The network consists of 16 neurons, forming a 4×4 map. The network was validated using the classification accuracy criterion.
A three-layer FFANN with wavelet activation function was used in [79] for real-time monitoring and fault analysis of a marine engine. The Morlet function has been used as a basis of the wavelet transform. A linear function has been used as the activation function in the output layer, and the network was trained with a genetic algorithm. As evaluation measures, fitness and mean squared error have been used.
In [80], a three-layer FFANN combined with fuzzy theory has been used for the purpose of condition-based maintenance (CBM) of an electric propulsion system on-board a ship.The proposed solution has been evaluated using fuzzy comprehensive evaluation.
Furthermore, a CNN and an RNN have been used in [81] for corrosion detection on a ship's hull using an autonomous robot. GoogleNet Inception v3 CNN was used, while activation functions included ReLU and softmax. Accuracy was used for evaluation purposes.
An FFANN (100-50-10-3) has been used in [82] to classify acoustic emission (AE) signals. Acoustic emission is a method based on acoustic signal wave propagation in solids. When a material suffers irreversible damage, an acoustic signal wave propagates differently along the material, so it can be used to detect malfunctions in the material structure. Training was performed using unsupervised methods and the BP algorithm has been used to fine tune the trained network. As an evaluation measure, accuracy has been used.
A three-layer FFANN has been used in [83] for condition monitoring and predictive maintenance. In this paper, the exhaust gas temperature of a two-stroke marine diesel engine was monitored on one of the cylinders. The resulting network predicted exhaust gas temperatures for the following 5-hour period. Bipolar sigmoid was used as the activation function in the hidden layer, while the output layer used a linear activation function. The BP algorithm with Bayesian regularization was used to train the network. As evaluation measures, MSE and correlation coefficient R were used, as well as the autocorrelation of error.
In [84], a Random Forest algorithm-based decision tree model was proposed, built in Simulink, and compared to an FFANN model for the purpose of ship power station fault diagnosis. The Random Forest algorithm was used for the decision tree model, while the FFANN used the BP algorithm. Evaluation of the performance was conducted using the R² measure and speed of prediction.
A fault diagnosis model of a ship electrical power plant was developed in [85] using CNN. The power plant, including diesel generator and turbo generator, was built using MATLAB/Simulink. For the network development, the TensorFlow library and Python were used. The network was trained using the BP algorithm and it was evaluated using the accuracy measure.
An FFANN has been used in [86] to predict exhaust gas temperature of the panamax container ship main engine. The network has been trained using the Levenberg-Marquardt algorithm and validated using mean square error (MSE) and correlation coefficient (R). Bipolar sigmoid has been used as an activation function in hidden layers, while the output layer used a linear activation function.
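Sigmoid, bipolar sigmoid and linear activation functions appear repeatedly in the reviewed FFANN models. The sketch below uses the commonly cited definitions, with the bipolar sigmoid taken as the sigmoid rescaled to the range (−1, 1); this definition is an assumption here, since individual papers may scale the function differently (for example, [75] limits it to the [−2, 2] range).

```python
import math

def sigmoid(x):
    """Unipolar (logistic) sigmoid with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    """Sigmoid rescaled to (-1, 1); algebraically equal to tanh(x / 2)."""
    return 2.0 * sigmoid(x) - 1.0

def linear(x):
    """Identity activation, typical for regression output layers."""
    return x

for value in (-2.0, 0.0, 2.0):
    print(value, round(sigmoid(value), 4), round(bipolar_sigmoid(value), 4), linear(value))
```

The bipolar form is centered on zero, which is one reason it is often paired with a linear output layer in regression tasks such as temperature prediction.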
A three-layer FFANN was used to predict fuel consumption and voyage speed in [87] for the purpose of hull condition monitoring. The Levenberg-Marquardt algorithm was used to train the network and the BP algorithm for minimizing the error function. Sigmoid was used as the activation function in the hidden layer. The network was validated using Pearson's coefficient of correlation.
A CNN built in TensorFlow was used in [88] to predict sea state and its effect on a container ship hull for the purpose of ship hull condition monitoring. Data was acquired on board a ship by measuring motions and structural response of the hull. Using the data acquired, a CNN was used to estimate the directional wave spectra for the encountered sea state. The estimated sea state was compared to the wave models by evaluating the solution using MSE, 95th percentile error and Fischer's correlation factor.
The most commonly used neural network types in condition monitoring and condition-based maintenance applications are FFANN and CNN, found in 46.15% and 23.08% of the reviewed literature, respectively, followed by RNN, found in 15.38% of the reviewed literature. SOM and Decision tree were each found in 7.69% of the reviewed literature. The most common training algorithms are backpropagation, found in 35.71% of the reviewed literature, followed by the Levenberg-Marquardt algorithm, found in 14.29% of the literature. The rest of the algorithms were each found in 7.14% of the reviewed literature. The most common activation functions are sigmoid and ReLU, found in 42.86% and 28.57%, respectively. Furthermore, softmax and bipolar sigmoid are both found in 14.29% of the reviewed literature. The most used evaluation measures are accuracy and MSE, both found in 33.33% of the reviewed literature, followed by R score, found in 16.67% of the literature, and error autocorrelation and R², both found in 8.33% of the literature.

D. SHIP'S AUTOPILOT APPLICATIONS
In [89], an FFANN is used to establish a neural controller for ship track-keeping. A SIMO (single input multi output) approach has been used with on-line training. The BP algorithm has been used with the network architecture 10-10-1. Performance evaluation was performed by measuring the error between the reference value and the regulated value while presenting three different route examples to the network.
In [90], two FFANNs, (5-10-2) and (6-15-1), have been used for an autopilot application. This paper used an inverse modelling approach, thus determining behavioristic characteristics of a system based on acquired data. The Levenberg-Marquardt algorithm was used to train the network. The proposed approach has been tested on three sea conditions and calm sea, showing benefits in terms of lower amplitude and rate of change of the output signal for the rudder, thus putting less stress on the rudder. Hyperbolic tangent has been used as the activation function in the hidden layer, while a linear function has been used in the output layer.
In [91], an LSTM RNN has been used to predict wave height for autopilot purposes. Two networks have been established for two sea areas. The aim was to route the ship through sea areas with the lowest wave height. Sigmoid and bipolar sigmoid were used as activation functions in the hidden layer. Furthermore, the LSTM RNN has been compared with FFANN, RF and SVM using MAE, RMSE and correlation coefficient R.
In [92], a six-layer FFANN (3-12-12-12-12-2) was used for autonomous ship docking under various wind conditions. It is based on a mathematical model of a ship and the ship's behavior, which was used to train the network using a genetic algorithm. As a disturbance variable, a wind force model was used. Bipolar sigmoid was used as the activation function in the hidden layer, while the output layer used a linear activation function. As an evaluation measure, MSE was used.
In [93], an RBF-ANN was used to optimize a PID controller as a function of the ship autopilot. A sliding mode PID controller was proposed and tested with a ship model built in MATLAB. Gradient descent was used as the training algorithm. Performance has been evaluated by graphically comparing the response.
In [94], an SSENET (Sea state estimation network) based on CNN was proposed for estimation of the sea wave height and direction for a ship autopilot application. Python 3.6 along with TensorFlow, Keras and Anaconda was used to develop the proposed network. Data have been acquired through two zig-zag maneuvers of a ship, thus acquiring data of ship motion in 9 DOF. The network was trained using the Adam optimizer algorithm and its performance has been evaluated and compared with other state-of-the-art solutions such as FFANN, LSTM, ResNet and SeaStateNet using accuracy as an evaluation measure.
The most commonly used neural network type in ship autopilot applications is FFANN, found in 50% of the reviewed literature, followed by RBF-ANN, RNN and CNN, each found in 16.67% of the reviewed literature. As for the training algorithms, backpropagation stands out among the rest, found in 37.50% of the reviewed literature, followed by Levenberg-Marquardt, found in 25% of the literature, and genetic algorithm, gradient descent, and Adam optimizer, each found in 12.50% of the literature. The most common activation functions are bipolar sigmoid and sigmoid, found in 37.50% and 25% of the reviewed literature, respectively. Furthermore, ReLU, RBF Kernel and softmax are each found in 12.50% of the reviewed literature. Relative error is the most commonly used evaluation metric, found in 28.57% of the reviewed literature, followed by MAE, RMSE, MSE, R score and accuracy, each found in 14.29% of the literature.

E. CONTROLLER TUNING APPLICATIONS
Two FFANNs have been used in [95] for self-tuning of a PID controller for stabilization fins of a ship. The first network (6-7-1) was used to identify and predict the behavior of a non-linear input-output relationship between the six degrees of freedom (DOF) motions at the input and the roll angle at the output, using bipolar sigmoid as an activation function in the hidden layer. The second network (6-7-3) used the roll angle given by the first network and the roll angular velocities in the past five time steps as inputs to calculate the PID controller parameters given at the three output neurons, using sigmoid as the activation function in the hidden layer. As an evaluation measure, a roll reduction rate (RRR) has been proposed to evaluate the roll reduction percentage compared to the mean value of the roll without the proposed self-tuning approach.
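The roll reduction rate can be illustrated with a small calculation. The sketch below assumes that RRR is the percentage drop of the mean absolute roll angle relative to the case without self-tuning; the exact formula used in [95] is not given in this review, and the roll values are hypothetical.

```python
def roll_reduction_rate(roll_without, roll_with):
    """Percentage reduction of the mean absolute roll angle (assumed definition)."""
    mean_without = sum(abs(r) for r in roll_without) / len(roll_without)
    mean_with = sum(abs(r) for r in roll_with) / len(roll_with)
    return 100.0 * (mean_without - mean_with) / mean_without

# Hypothetical roll angles in degrees, sampled at the same time instants.
baseline = [4.0, -6.0, 5.0, -5.0]   # controller without self-tuning
tuned = [2.0, -3.0, 2.5, -2.5]      # controller with ANN self-tuning
rrr = roll_reduction_rate(baseline, tuned)
print(rrr)  # 50.0
```

A relative measure of this kind makes results comparable across sea states with different roll amplitudes.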
In [96], an FFANN (1-5-1) has been used as the excitation voltage controller of a synchronous generator to protect the generator from excitation overcurrent in case of a drop in the generator's engine speed and to regulate the generated output voltage. A three-layer network has been developed in MATLAB/Simulink using bipolar sigmoid as an activation function in the hidden layer and a linear function in the output layer. The network was evaluated by analyzing graphs of output voltage as a function of generator speed.
In [97], an RBF-ANN has been used to predict the ship's response to wind, wave, and sea current in order to develop an adaptive controller for the ship's crane, with the aim of minimizing cargo swing and achieving efficient disturbance rejection. Results are confirmed experimentally on a purpose-built test bed by comparing swing response graphs and calculating the cargo positioning error of the proposed method against an advanced nonlinear tracking controller without predictive terms and the LQR (linear quadratic regulator) method. MATLAB has been used for numerical simulations before the test bed experiments.
A four-layer FFANN has been used in [98] as a neural network controller of a ship crane to compensate load swing caused by waves and ship movement while ignoring wind. As a performance evaluation measure, the relative error of load position has been used.
The most commonly used neural network type in controller tuning applications is FFANN, found in 75% of the cases reviewed, followed by RBF-ANN, found in 25% of the reviewed literature. The most common training algorithm is backpropagation, found in 50% of the reviewed literature, followed by Levenberg-Marquardt and RBF algorithm, both found in 25% of the literature. The most common activation functions are sigmoid and bipolar sigmoid, found in 66.67% and 33.33% of the reviewed literature, respectively.
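The percentages reported at the end of each category are frequency shares. For the four controller tuning papers [95]-[98], the network type shares can be reproduced with a short tally; the sketch assumes one entry per paper, even where a paper uses more than one network of the same type.

```python
from collections import Counter

# Network type per reviewed controller tuning paper, as described above.
network_types = {
    "[95]": "FFANN",
    "[96]": "FFANN",
    "[97]": "RBF-ANN",
    "[98]": "FFANN",
}

counts = Counter(network_types.values())
shares = {name: round(100.0 * n / len(network_types), 2) for name, n in counts.items()}
print(shares)  # {'FFANN': 75.0, 'RBF-ANN': 25.0}
```

Shares for training algorithms or activation functions are computed over the number of reported occurrences rather than the number of papers, which is why their denominators can differ within the same category.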

F. GAS EMISSION APPLICATIONS (ECOLOGY)
In [99], a VGG16 CNN has been used to detect fuel sulfur content by imaging exhaust gases using a UV camera. The same network was used in [59], as well as the transfer learning method of training. The network has been used to solve a classification problem. ReLU and sigmoid have been used as activation functions in the hidden layer. Cross entropy has been used as the loss function, and the network was evaluated using accuracy with the weighted average method.
Similarly, [100] proposes LeNet-5, based on CNN, to detect sulfur content in exhaust gases. The initial CNN was modified by adding a dropout layer, which randomly shuts off 50% of the neurons, and by using the ReLU activation function. In the output layer, the sigmoid activation function was used. For the network development, the TensorFlow library and Python were used, and the network was trained using the Adam optimizer algorithm. Network performance was evaluated using the accuracy measure.
In [101], a three-layer FFANN (7-12-1) has been used to predict fuel consumption of a diesel-mechanic propulsion tanker in various exploitation conditions and to develop a decision support system for fuel consumption reduction with the aim of reducing greenhouse gas emissions. The Levenberg-Marquardt algorithm was used for network training. As evaluation measures, MSE, RMSE and R² have been used, and the model is furthermore validated using multivariate regression (MR) in IBM SPSS software.
In [102], an FFANN, developed using TensorFlow, was used to predict fuel consumption to reduce greenhouse gas emissions of a cruiser using AIS data by embedding the network into four particle swarm optimization (PSO) algorithms. As training algorithms, stochastic gradient descent (SGD) and batch gradient descent have been used, while the evaluation measures included accuracy and variance. Activation functions in hidden layers included bipolar sigmoid, softmax, sigmoid and ReLU.
FFANN and CNN networks are each used in 50% of the reviewed literature for ship gas emission applications. The most common training algorithms are backpropagation, Levenberg-Marquardt, transfer learning, Adam optimizer, SGD and PSO, each found in 16.67% of the reviewed literature. The most common activation functions are sigmoid and ReLU, both found in 30% of the reviewed literature, followed by softmax and bipolar sigmoid, both found in 20% of the reviewed literature. Accuracy was used in 33.33% of the reviewed literature, followed by MSE, RMSE, variance and R², each used in 16.67% of the reviewed literature.

G. SAFETY OF NAVIGATION (ENCOUNTER CLASSIFICATION)
An RNN has been used in [103] to detect marine traffic anomalies in real time for the purpose of increasing safety of navigation and situational awareness augmentation. AIS data used for network inputs has been clustered using the DBSCAN algorithm. The network has been validated experimentally, showing the ability to detect abnormalities in ship trajectory, speed and course.
A CNN was used to classify ship collision risk in [104] based on AIS data. MATLAB was used to render images illustrating the ship encounter situations, which served as one of the inputs for the CNN. MAE and MAPE (mean absolute percentage error) were used to evaluate the performance of the network.
Similarly, an LSTM RNN network was used in [105] to develop a ship collision probability model using Monte Carlo simulation based on AIS data. The PyCharm software platform with the Python programming language was used as a tool.
In [106], an RNN was used to model the ship maneuverability to make predictions ensuring safe avoidance of collision. The network architecture was 21-21-21-3. Data used to train the network have been acquired using a mathematical model and free run tests of a scaled ship in a pool. The network has been trained as an FFANN using the BP algorithm and sigmoid activation function in the hidden layer. After the training process, feedback connections have been added, thus making the network an RNN. As evaluation measures, MSE and absolute error have been used.
The most commonly used neural network type in safety of navigation applications is RNN, used in 60% of the reviewed literature, followed by CNN and LSTM, both used in 20% of the literature. As training algorithms, the DBSCAN algorithm and backpropagation have each been used in 50% of the reviewed literature. The most common activation functions are sigmoid and ReLU, used in 66.67% and 33.33% of the reviewed literature, respectively. As evaluation measures, deviation, MAE and MAPE were used, each in 33.33% of the reviewed literature.

H. OTHER APPLICATIONS
In [107], a generalized regression neural network (GRNN), FFANN, fuzzy neural network (FNN), and Elman neural network (ENN) have been used and mutually compared for the purpose of port traffic predictions. All data have been pre-processed using the Hilbert-Huang transformation to decompose them into high and low frequency components, and the DTW (dynamic time warping) method was used to cluster the data. Performance evaluation measures included relative error (RE), standard deviation and residual analysis.
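Dynamic time warping measures the dissimilarity of two time series that may be locally stretched or compressed in time, which makes it a suitable distance for grouping similar traffic patterns. The sketch below shows the textbook dynamic programming formulation with an absolute-difference local cost; how [107] builds clusters from these distances is not described in this review, and the example series are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

series_a = [1, 2, 3]
series_b = [1, 2, 2, 3]   # same shape as series_a, stretched in time
series_c = [2, 3, 4]      # shifted in value
print(dtw_distance(series_a, series_b))  # 0.0
print(dtw_distance(series_a, series_c))  # 2.0
```

Series with the same shape but different timing obtain a small distance, so a clustering step based on these distances groups them together.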
In [108], a three-layer FFANN (6-6-1) was used to estimate the added wave resistance coefficient. It has been compared to GRNN, RBF-ANN and linear networks. Furthermore, various training algorithms were compared to determine the best performing solution. Lazarus IDE and the Free Pascal compiler have been used to develop the final added wave resistance coefficient software. Sigmoid and bipolar sigmoid have been used as activation functions in the hidden layer, while training algorithms included backpropagation, gradient descent, and Levenberg-Marquardt. STATISTICA software was used to evaluate the performance of the networks trained with these algorithms and activation functions, with Pearson's correlation coefficient and MSE as evaluation measures.
In [109], an FFANN with the BP algorithm has been used to reduce sea clutter of a medium range radar to improve object detection at sea. Networks were evaluated using MSE and SCR (signal-to-clutter ratio). Bipolar sigmoid was used as the activation function in the hidden layer, while the output layer used sigmoid.
In [110], a three-layer FFANN (6-25-1) has been used to determine the main engine optimal rated output when designing the main propulsion system, based on draught, length overall, width, tonnage, speed and main engine power. The BP algorithm has been used to minimize the error function and unipolar sigmoid was used as an activation function in the hidden layer. Network performance was evaluated using RMSE.
In [111], FFANN, RBF-ANN and GRNN with various activation functions have been used and mutually compared for the purpose of spatial modelling based on sea depth measurement using MAE and maximum error evaluation measures.
An FFANN was used in [112] to establish the regression model for the purpose of ship fuel consumption prediction using MATLAB. The Levenberg-Marquardt algorithm has been used for network training, with sigmoid, ReLU and bipolar sigmoid as activation functions in the hidden layer. As evaluation measures, MSE and correlation coefficient R have been used.
An LSTM RNN has been used in [113] to identify a MIMO model of a ship for the purpose of ship maneuvering motion prediction. Sigmoid and bipolar sigmoid activation functions have been used for the input layer and leaky ReLU for the output layer. The Adam optimizer algorithm has been used to minimize the error function, while RMSE and MSE were used as evaluation criteria.
In [114], an FFANN has been used to predict measurement results to reduce the number of real measurements for the purpose of steam turbine efficiency calculation. As evaluation measures, mean absolute error (MAE) and R² have been used.
In [115], a three-layer FFANN developed in MATLAB Neural network toolbox has been used to predict torque, power, specific fuel consumption and exhaust gas temperature of a four-stroke marine diesel engine. The Levenberg-Marquardt algorithm has been used to train the network and its performance has been evaluated using IBM SPSS by calculating MSE and R² values. The bipolar sigmoid function was used as an activation function in the hidden layer.
In [116], an RBF-ANN has been used to establish a diesel generator model for the purpose of building a real time marine power system simulator. The network architecture 5-99-3 was trained using the BP algorithm. In the hidden layer consisting of 99 neurons, a Gaussian activation function has been used, while the output layer used a linear activation function. Network performance has been evaluated using error threshold and calculation time, which confirmed applicability of the model in real time.
In [117], a significant wave height at one point in the Adriatic Sea was modelled using FFANN as well as linear and nonlinear regression. The optimal number of input variables and number of neurons was determined through three experiments by evaluating the solutions using RMSE, MAE, Nash-Sutcliffe coefficient of efficiency (NSC or CE), percent bias (PB), persistency index (PI) and RMSE to standard deviation ratio (RSR). The established neural network models outperform the regression models, with results indicating the applicability of the FFANN for significant wave height modelling. The advantage of ANN over the regression model was more prominent for the case of multiple input variables.
Similarly, an FFANN was used to predict wave induced ship motion in [118]. A three-layer network with 25 hidden neurons was trained using the Bayesian regularization algorithm in MATLAB. Data acquisition during full-scale measurements was performed using both the ship and a wave buoy. For training, 70% of the acquired data were used, while the remaining 30% of the data served as a validation data set. Validation was performed using RMSE, MAE, R, and CE validation criteria.
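Several of the measures named for [117] and [118] can be computed from a pair of observed and predicted series. The sketch below covers RMSE, MAE, CE, PB and RSR using their commonly cited definitions; the sign convention for PB (predicted minus observed) is an assumption, PI and R are omitted for brevity, and the sample values are hypothetical.

```python
import math

def evaluation_measures(observed, predicted):
    """Return RMSE, MAE, Nash-Sutcliffe CE, percent bias and RSR for two series."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(e * e for e in errors)                    # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)    # spread of the observations
    return {
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "CE": 1.0 - sse / sst,                          # 1.0 means a perfect model
        "PB": 100.0 * sum(errors) / sum(observed),      # assumed sign: predicted - observed
        "RSR": math.sqrt(sse) / math.sqrt(sst),         # RMSE over observed std deviation
    }

observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical significant wave heights in metres
predicted = [1.5, 2.0, 2.5, 4.0]
scores = evaluation_measures(observed, predicted)
```

CE and RSR both scale the error by the variability of the observations, so they remain comparable between calm and rough measurement periods, unlike RMSE or MAE alone.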
The most commonly used neural network type in other applications is FFANN, found in 58.82%, followed by GRNN and RBF-ANN, each found in 11.76% of the reviewed literature. FNN, ENN and RNN are each found in 5.88% of the reviewed papers. As for the training algorithms, backpropagation and Levenberg-Marquardt stand out among the rest, each found in 23.53% of the papers, followed by Bayesian regularization, found in 11.76% of the papers, and by Dynamic Time Warping (DTW), K-means, K-nearest neighbor, Kohonen, BR, batch gradient descent, RBF algorithm and grid search algorithm, each found in 5.88% of the papers. The most common activation functions are sigmoid, found in 41.18%, bipolar sigmoid, found in 35.29%, ReLU, found in 17.65%, and Gaussian, found in 5.88% of the reviewed literature. MSE is the most commonly used evaluation metric, found in 23.81% of the papers, followed by RMSE, found in 19.05% of the papers. MAE was used in 14.29% of the papers, while R and R² were each found in 9.52% of the papers. Furthermore, relative error, standard deviation, SCR, PI and absolute error were each found in 4.76% of the papers.

IV. ANALYSIS OF THE ANN APPLICATIONS
Based on the reviewed literature, an overall analysis of the application of ANNs on ships was conducted. The analysis included the frequency of application purposes, the network types, the algorithms, and the evaluation measures used to build the models. Furthermore, an analysis per application purpose was conducted, together with an analysis of the data acquisition methods and data processing tools.

A. OVERALL ANALYSIS
The most common application of ANNs is ship image classification and type identification, as shown in Fig. 12, followed by condition monitoring, ''other'', and ship route applications. ANNs are least used for ship autopilot, gas emission, controller applications, and safety of navigation.
FFANN and CNN stand out among the network types, as presented in Fig. 13, followed by RNN. RBF ANN, GRNN, SOM and FNN were used less often, followed by BTNN and decision tree as the least used types.
Backpropagation is the most commonly used algorithm, followed by Levenberg-Marquardt, the Adam optimizer, transfer learning and SGD. Algorithms such as K-means, DBScan, the RBF algorithm and the saliency detection algorithm were used less, followed by batch gradient descent, the genetic algorithm, the variational Bayesian method and the ELM-based algorithm. The remaining algorithms were rarely used, as shown in Fig. 14.
Accuracy and MSE have been the most commonly used evaluation measures, followed by precision, recall, average precision and RMSE. Less used evaluation measures include R score, mAP, F1 score, MAE and relative error. The remaining evaluation measures are rarely used, as shown in Fig. 15.
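The classification measures listed above (accuracy, precision, recall and F1 score) all derive from the counts of true and false positives and negatives. The sketch below illustrates this for a two-class problem. The function name and the handling of empty denominators (returning 0.0) are assumptions made here, and multi-class tasks such as ship type identification would additionally need a per-class averaging scheme that is not shown.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a two-class problem."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = len(pairs) - tp - fp - fn                                    # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Accuracy alone can be misleading when one class dominates the data set, which is one reason precision and recall are often reported alongside it.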
The neural network type-per-application analysis is shown in Fig. 16. CNN is the most frequently used network type, mainly for ship image classification and type identification. FFANN is the second most frequently used network type, mainly for ship fuel consumption prediction and other applications. Furthermore, RNN is used mostly for ship route applications, while the RBF network has been used equally in ship machinery model, ship controller tuning, ship autopilot and other applications.
The training algorithm type-per-application analysis is shown in Fig. 17. The backpropagation algorithm stands out as the most commonly used algorithm, mainly for condition monitoring and condition-based maintenance, ship autopilot and other applications, followed by Levenberg-Marquardt, mainly used in ship autopilot applications and fuel consumption prediction. Furthermore, the Adam algorithm is mainly used for ship image classification and route applications, followed by transfer learning, mostly used in ship image classification. The remaining algorithms are shown in Fig. 17.
MSE has been the most frequently used evaluation measure across most of the presented applications, followed by accuracy, precision and recall, which are mainly used in ship image classification applications. The rest of the evaluation measures are shown in Fig. 18.
Table 2 shows the summary of the ANN applications in the maritime industry, with the application purpose, network type, training algorithm, and evaluation measure reported in the related references.
Sixty-nine papers have been analyzed in terms of applications, ANN types, activation functions, evaluation measures, methods for data collection, data processing and the organization of the data. Nine ANN types, eight activation functions, and twenty training algorithms have been identified and related to the different applications, which have been classified in this paper into eight categories with short descriptions of each application.
Based on the analysis presented in this paper, a large number of ANN applications in different clusters of the maritime industry can be noted, with the absolute number of published papers growing over the last decade, as shown in Fig. 11. However, taking into account the characteristics of ANNs and some of their known applications in other fields, as well as the needs of the maritime industry, with a focus on the ship, some further ANN applications could be suggested.
As shown in [119], electrical motor faults can be detected from the supply current measurements and from some other measurements, such as magnetic stray flux, motor output torque, vibration monitoring and input electric power, as stated in [120]. For example, temperature estimations and vibration measurements are used in [121], and acoustic signal extraction methods have been used in [122] to detect electric motor faults. Usually, to detect a faulty motor, a healthy reference is needed in order to compare the measurements and to discover distortions that would suggest a fault in the motor. However, if an ANN is trained to classify a motor as healthy or faulty, this could be done without comparison to a healthy motor, as described in [123]. ANNs have been applied as classifiers in different fields, e.g., [121] providing some references for the use of machine for fault diagnosis based on vibration measurements, and [122] describing the application of a backpropagation neural network as a classifier for fault diagnostics of an induction motor based on acoustic signals.
Electric motor fault detection could be extremely beneficial to the maritime industry, given that the electric motor is the primary actuator driving different machines on board the ship, and ANN applications to this problem have been reported in recent years. Nevertheless, no such studies of ANN application in the maritime industry have been found.
ANNs have been used to predict wave-induced ship motion, as in [118], and vice versa, to estimate wave parameters based on measured ship motion, as in [94]. However, no investigations have been directed towards the influence of waves on the dynamic stability of the ship using ANNs. Furthermore, the movements of liquids on board the ship can influence the ship's dynamic stability [124]. Since those movements are measurable, it would be interesting to investigate whether ANNs can be applied to estimate the influence of those movements on the ship's stability. However, no such investigation has been reported so far.

B. DATA ACQUISITION METHODS
Data used for the aforementioned purposes in the reviewed literature were collected by on-board sensors, such as temperature and pressure sensors presented in [78], vibration, rpm, temperature and pressure sensors as presented in [79], acoustic emission sensors as in [82], temperature and pressure sensors [83], rudder angle sensors as in [90], motion sensors, wind speed and rudder angle sensors as in [92], oxygen sensor, knock sensor, air pressure sensors, rpm sensor, throttle position sensor and temperature sensors as used in [115], ship motion sensors embedded in an inertial measurement unit as used in [94], and motion sensors and mechanical stress sensors as in [88]. Some papers used buoys to acquire the desired movement data, as in [91] and [118], while others used publicly available data such as models, sea wave models, etc., as in [117], or mathematical model output, as in [106]. However, in the remainder of the literature, the authors did not define the data acquisition methods. Data acquisition methods, as analyzed in the literature that defined them, are presented in Fig. 19. Furthermore, on-board sensor types were analyzed and are presented in Fig. 20.

C. DATA PROCESSING TOOLS
Tools used to process the data include software and online solutions such as frameworks and platforms. TensorFlow is a free, open-source library developed by Google for machine learning and was used in [94] and [102].
Similarly, Keras, an open-source library providing an interface for neural network development in Python, was used in [64] and [94]. Furthermore, Python was used in [61] and [70] along with PyTorch, a free open-source machine learning framework similar to TensorFlow. In some papers, data were processed using MATLAB, as in [61], [104], and [118]. Furthermore, its neural network toolbox was used to develop a neural network model in [115] and compare it to the mathematical model using IBM SPSS Statistics software for statistical analysis. Similar statistical software, Statistica, was used in [108] for statistical analysis. Simulink, as a part of the MATLAB software package, was used to develop a simulation model in [84]. Analysis of the literature that defined the data processing tools shows that 75% of the literature used software as a tool, while the remaining 25% used frameworks and platforms. MATLAB/Simulink and Python are the most commonly used software, followed by Keras, IBM SPSS and Statistica, as presented in Fig. 21. TensorFlow and PyTorch, as frameworks and platforms, were each found in 50% of the literature.

FIGURE 21. On-board sensor types found in available literature.

D. DATA ORGANIZATION IN TRAINING, VALIDATION, AND TEST DATA SETS
The data organization analysis shows that a training data set containing 70-80% of the overall data is most commonly used, found in 71.43% of the reviewed papers. A larger training data set, containing 81-90% of the overall data, is found in 22.86% of the literature. Furthermore, a training data set containing less than 70% of the overall data is not common practice, as indicated in Fig. 22. The validation data set usually contains 10-20% of the overall data, while a non-negligible number of the reviewed papers did not use a validation data set or used 21-40% of the overall data. Validation data sets containing <10% of the overall data are not common practice, as presented in Fig. 23.
Test data sets usually contain 10-20% of the overall data. However, a significant number of papers did not use test data sets, while some of the papers used test data sets containing <10% of the overall data. Furthermore, it is not common to use 21-30% or >30% of the overall data in test data sets, as shown in Fig. 24.
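The typical proportions reported above can be illustrated with a simple random split. This is a minimal sketch under stated assumptions: the function name `split_dataset`, the default 70/15/15 proportions and the fixed random seed are chosen here for illustration and are not taken from any reviewed paper.

```python
import random


def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle samples and split them into training, validation and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac > 1 + 1e-9:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    items = list(samples)
    random.Random(seed).shuffle(items)         # reproducible shuffle
    n_train = round(train_frac * len(items))
    n_val = round(val_frac * len(items))

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]             # whatever remains is the test set
    return train, val, test
```

For time-ordered measurements, such as ship motion records, a chronological split without shuffling may be preferable, since shuffling can place neighbouring and strongly correlated samples in both the training and the test set.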

E. NETWORK ARCHITECTURE ANALYSIS
The network architecture analysis was conducted based on the data available in the reviewed papers. Due to the limited availability of network architecture data, only an analysis of FFANNs was feasible. The analysis shows that the most common FFANN consists of one hidden layer, thus being a three-layer network, followed by the FFANN with two hidden layers, thus being a four-layer network. It is not common to use four hidden layers, as shown in Fig. 25. The usual number of neurons in the hidden layer is ten, as shown in Fig. 26. Based on the analysis of the available papers, it should be noted that there is no rule or pattern from which a conclusion on the optimal number of neurons in hidden layers could be drawn.
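The most common architecture found, a three-layer FFANN with ten hidden neurons, can be sketched as a forward pass in a few lines. The sigmoid hidden activation, the linear output neuron, the weight initialization range and the names `init_network` and `forward` are assumptions made for this illustration. No training is performed here.

```python
import math
import random


def init_network(n_inputs, n_hidden=10, n_outputs=1, seed=0):
    """Create small random weights for one hidden layer and one output layer."""
    rng = random.Random(seed)

    def layer(n_rows, n_cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)]
                for _ in range(n_rows)]

    return {
        "W1": layer(n_hidden, n_inputs), "b1": [0.0] * n_hidden,
        "W2": layer(n_outputs, n_hidden), "b2": [0.0] * n_outputs,
    }


def forward(net, x):
    """Propagate one input vector through the network and return the outputs."""
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))  # sigmoid
        for row, b in zip(net["W1"], net["b1"])
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b   # linear output neuron
        for row, b in zip(net["W2"], net["b2"])
    ]
```

With three inputs, ten hidden neurons and one output, such a network has 3·10 + 10 + 10·1 + 1 = 51 trainable parameters, which gives a sense of how small the typical FFANN in the reviewed papers is.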

V. CONCLUSION
ANNs are a flexible data-driven tool that can be used for different applications in different fields of science if adequate data are available. The number of variables measured on board a ship is growing by the day, and communication between the ship and ''the land'' is improving as well, opening doors to concepts like digital twins that use artificial intelligence tools, such as ANNs, to make the most out of those data.
The aim of this paper was to identify the applications of artificial neural networks on ships and in the maritime industry in order to obtain information about the current standards in the field and possibly discover some blind spots.
Available papers on the topic were analyzed in terms of the ANN types, the activation functions of the neurons, and the training algorithms used to build the models, as well as the numerical measures used to evaluate the performance of the ANN models for each application. In addition, appropriate methods for data collection and data organization were identified, as well as potential limitations in developing an ANN model for a particular application.
ANN applications are classified into eight groups in this paper, with image classification and vessel type identification being the most common applications. Consequently, CNNs are the most commonly used network type, as CNNs are mainly used for image or video data recognition and identification. FFANNs are used in seven out of eight application categories, making them a broadly applicable solution.
The activation functions described in the literature are selected according to the type of network layer and are thus specific to the network architecture. CNNs use softmax, ELU, ReLU, and leaky-ReLU activation functions depending on the type of layers used, FFANNs use sigmoid or bipolar sigmoid activation functions, especially in the hidden layers, and a linear activation function in the output layer, and RBF-ANNs use RBF kernel functions.
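For reference, the activation functions named above can be written out as follows. These are the commonly used textbook forms and may differ in detail from the expressions in Table 1. In particular, the bipolar sigmoid is assumed here to be 2·sigmoid(x) − 1, and the default slope, `alpha` and `width` parameters are illustrative choices.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))           # output in (0, 1)


def bipolar_sigmoid(x):
    return 2.0 * sigmoid(x) - 1.0               # output in (-1, 1), equals tanh(x / 2)


def relu(x):
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x            # small non-zero slope for x <= 0


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(values):
    peak = max(values)                          # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]            # non-negative, sums to 1


def gaussian_rbf(x, center, width=1.0):
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))   # equals 1 at the center
```

Softmax operates on a whole layer of values and is therefore typically found in the output layer of a classifier, while the other functions are applied to each neuron separately.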
The same holds for the training algorithms: FFANNs typically use backpropagation, gradient descent, and Levenberg-Marquardt algorithms, while CNNs use algorithms such as the Adam optimizer, QuickBundle, DBScan, and clustering algorithms such as K-NN and K-means. It can also be noted that identification, prediction, and regression problems are solved with FFANNs, RBF ANNs, and RNNs.
Accuracy, precision, recognition score, and average precision are used in classification applications to evaluate network performance, while MSE is used in six out of eight application categories, making it a widely applicable evaluation measure, followed by accuracy, RMSE, R-score, MAE, and R².
When preparing data for ANN training, typically 70-80% of the data set is used for training the network and the remaining data are used for testing and/or validating the model. In 6% of the papers, the data were normalized to the range from 0 to 1 or from -1 to 1. In most cases where the data origin was specified, on-board sensors were used to collect the data, while only a few papers used available models as the data source for the network.
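Normalization to either of the two ranges mentioned above is usually done by min-max scaling. The sketch below assumes that the scaling limits are taken from the training data only and then reused for validation and test data. The function names are hypothetical.

```python
def fit_min_max(train_values):
    """Return the minimum and maximum of the training data."""
    lo, hi = min(train_values), max(train_values)
    if hi == lo:
        raise ValueError("cannot scale a constant variable")
    return lo, hi


def min_max_scale(values, lo, hi, target=(0.0, 1.0)):
    """Map values linearly so that lo -> target[0] and hi -> target[1]."""
    a, b = target
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]
```

For example, with training values 2, 4 and 6, scaling to (0, 1) yields 0, 0.5 and 1, and scaling to (-1, 1) yields -1, 0 and 1. Values outside the training range map outside the target range, which is worth checking before feeding new measurements to a trained network.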
One of the problems in developing an FFANN is selecting the optimal number of layers and neurons. According to [125], a neural network with a single hidden layer is able to approximate a continuous function of multiple variables when a sufficient number of neurons is used. This is consistent with the analysis presented in this paper, as most of the papers used a three-layer FFANN with a single hidden layer, regardless of the application purpose. Although some methods for determining the optimal number of hidden layers and the number of neurons of an FFANN have been described in the literature, the usual approach in the reviewed papers was trial and error.
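The trial-and-error approach mentioned above amounts to training one network per candidate hidden-layer size and keeping the size with the lowest validation error. The following sketch illustrates the procedure only. The synthetic sine data, the plain stochastic gradient descent training, the candidate sizes and all function names are assumptions made here and are not taken from any reviewed paper.

```python
import math
import random


def train_ffann(train, n_hidden, epochs=200, lr=0.05, seed=0):
    """Fit a 1-input, 1-output network with one sigmoid hidden layer by plain SGD."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]   # input -> hidden weights
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]   # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]   # hidden -> output weights
    b2 = 0.0                                             # output bias

    def hidden(x):
        return [1.0 / (1.0 + math.exp(-(w1[j] * x + b1[j]))) for j in range(n_hidden)]

    def predict(x):
        return sum(w * h for w, h in zip(w2, hidden(x))) + b2

    for _ in range(epochs):
        for x, target in train:
            h = hidden(x)
            err = sum(w * hj for w, hj in zip(w2, h)) + b2 - target
            for j in range(n_hidden):
                # Gradient at the hidden pre-activation, using w2 before its update
                grad_pre = err * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2 -= lr * err
    return predict


def rmse(predict, data):
    return math.sqrt(sum((predict(x) - t) ** 2 for x, t in data) / len(data))


def select_hidden_size(train, val, candidates=(2, 5, 10)):
    """Train one network per candidate size and keep the best one on validation data."""
    scores = {n: rmse(train_ffann(train, n), val) for n in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data, assumed here only for illustration: y = sin(2*pi*x) on [0, 1]
points = [(i / 39, math.sin(2 * math.pi * i / 39)) for i in range(40)]
best_size, val_scores = select_hidden_size(points[0::2], points[1::2])
```

The selected size depends on the data, the random initialization and the training budget, which is consistent with the observation that no general rule for the optimal number of hidden neurons emerges from the reviewed papers.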
Through the review presented in this paper, it was found that ANNs are least used for gas emission prediction, controller design, and navigation safety, while some of the known ANN applications remain unutilized, such as electric motor fault diagnostics and the influence of sea waves or fluid cargo on the dynamic stability of the ship. Therefore, it can be concluded that further research on the application of ANNs on ships should be focused on these applications, especially in light of current trends in the maritime industry. Furthermore, as ships move towards higher levels of automation and autonomy, incorporating technologies such as digital twins and the Internet of Things (IoT), it can be concluded that ANNs can be used more extensively to exploit all the available data.

FIGURE 10. Indexing of the reviewed literature.

FIGURE 11. Number of published papers over the past years.

FIGURE 12. Application purposes as found in reviewed literature.

FIGURE 13. Network types used in reviewed literature.

FIGURE 14. Training algorithms used in reviewed literature.

FIGURE 15. Evaluation measures used in reviewed literature.

FIGURE 19. Data acquisition methods in literature that defined the used methods.

FIGURE 20. On-board sensor types found in available literature.

FIGURE 22. Data organization in training data set.

FIGURE 23. Data organization in validation data set.

FIGURE 24. Data organization in test data set.

FIGURE 25. Number of hidden layers in FFANN used in reviewed papers.

FIGURE 26. Number of neurons in hidden layers.

TABLE 1. Activation functions with mathematical expressions and their applications.

TABLE 2. Per-application analysis of the reviewed literature.