Fitting Analysis of Inland Ship Fuel Consumption Considering Navigation Status and Environmental Factors

The strategy of ecological priority and green development in China has drawn unprecedented attention to the fuel consumption of inland ships. Reliable fuel consumption prediction is a vital basis for navigation planning, energy supervision, and efficiency optimization. In this article, a cargo ship sailing on the Yangtze River trunk line was taken as the research object. A comprehensive fitting analysis of inland ship fuel consumption was conducted, and a prediction method was proposed. First, multi-source data including ship navigation status and environmental information were collected by multi-source sensors. Second, to analyze the collected data in detail, the authors proposed data pre-processing and trajectory segmentation methods and analyzed the correlation between the multi-source variables and fuel consumption. Third, a Back Propagation Neural Network with double hidden layers (DBPNN) was tailored to build a fuel consumption prediction model. Fourth, the developed model was validated using real ship measurement data. Different input variables were selected for fuel consumption prediction, and the results showed that after adding the environmental feature variables, including water level, water speed, wind speed, wind angle, and route segment, the prediction errors RMSE (root mean square error) and MAE (mean absolute error) were reduced by 35.31% and 30.30%, respectively, while the $R^{2}$ (R-squared) increased to 0.9843. Moreover, compared with other models, including ANNs (artificial neural networks) such as Elman and RBF (radial basis function), three support vector regression (SVR) models, a random forest regression (RFR) model, GRNN (generalized regression neural network), RNN (recurrent neural network), GRU (gated recurrent unit) and LSTM (long short-term memory), the proposed DBPNN model showed better performance in fuel consumption prediction.


I. INTRODUCTION
Waterway transportation along the Yangtze River trunk line has effectively relieved the pressure on land transportation, railway transportation, and air transportation in China. However, as people pay more attention to green shipping and the ecological environment [1], energy management and resource optimization of inland waterway transportation have become urgent problems to be solved [2]-[6]. Accurate estimation and prediction of inland ship fuel consumption can provide a solid basis for solving these problems.
The associate editor coordinating the review of this manuscript and approving it for publication was Fan Zhang.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the past few years, some researchers have focused on ship fuel consumption prediction, and some achievements have been attained. Beşikçi et al. [7] predicted ship fuel consumption for various operational conditions using an ANN (artificial neural network). Coraddu et al. [8] compared three approaches, WBM (White Box Model), BBM (Black Box Model) and GBM (Gray Box Model), for predicting ship fuel consumption based on data measured by on-board automation systems. Wang et al. [9] proposed a ship fuel consumption prediction model based on the LASSO (Least Absolute Shrinkage and Selection Operator) regression algorithm. Kee et al. [10] used MLR (multiple linear regression) methods to construct the fuel efficiency profile of working tugboats. Based on AIS (Automatic Identification System) data and technical information, Simonsen et al. [11] presented a fuel consumption model to estimate the energy use and fuel consumption of cruise ships sailing in Norwegian waters. Yuan and Nian [12] developed a Gaussian Process (GP) metamodel to predict ship fuel consumption for different scenarios, taking into account the effects of operational conditions. Yang et al. [13] proposed a novel Genetic Algorithm-based GBM (GA-based GBM) for ship fuel consumption prediction. Gkerekos et al. [14] compared multiple data-driven regression algorithms for predicting the ship main engine FOC (Fuel Oil Consumption), including SVMs (Support Vector Machines), RFRs (Random Forest Regressors), ETRs (Extra Trees Regressors), and ANNs (Artificial Neural Networks). Hu et al. [15] collected two sets of data showing the fuel consumption of a ship on voyage with and without the influence of marine environmental factors, and used BPNN (Back-Propagation Neural Network) and GPR (Gaussian Process Regression) to train models and predict ship fuel consumption. Capezza et al. [16] demonstrated a statistical framework and automatic reporting system for fuel consumption monitoring that addresses the monitoring, reporting and verification (MRV) requirements needed to comply with the regulations. Accetta and Pucci [17] proposed an EMS (energy management system) for the electrical system of luxury ships (yachts) to reduce the related polluting emissions. Zhao et al. [18] created an SSOM-SS (sailing speed optimization model for slow steaming) to balance the expected utility-based objectives (EUO) of fuel consumption, SOx emissions and delivery delay. They applied BDA (big data analytics) techniques such as data fusion and feature selection to provide the SSOM-SS with accurate and suitable fuel consumption data, and built a solver based on the GA (genetic algorithm) to solve the SSOM-SS. However, the existing research has some limitations. (1) The research objects of existing works were mostly seagoing ships, while few studies focused on inland ships.
(2) Limited input variables were used for fuel consumption prediction. There was a lack of consideration of some non-negligible variables, such as wind angle, engine temperature, and route characteristics. (3) There was a lack of analysis about the influence of different input feature variables when predicting ship fuel consumption.
To address the above issues, this article aims to develop a predictive model of inland ship fuel consumption for a complete future voyage that comprehensively considers ship navigation status and environmental factors, and to implement a novel application of inland ship fuel consumption fitting analysis based on the developed model. Firstly, real-time status monitoring data of the inland ship and relevant environmental data are collected by multi-source sensors. Secondly, the original multi-source data are processed and analyzed in detail, including data pre-processing, ship trajectory segmentation according to the flow condition, water level and actual topography of the Yangtze River trunk line, and correlation analysis between the fuel consumption and the multi-source variables. Thirdly, a fuel consumption fitting model based on a BPNN with double hidden layers (DBPNN) is constructed. Finally, different feature variables are selected and discussed with the proposed model, and the performance of the proposed models is verified with field data. In addition, comparisons with SVR models, other ANNs and RNNs (Recurrent Neural Networks) are presented. The research framework is shown in Fig. 1.
The contribution of this article is mainly reflected in three aspects. (1) Multiple factors that affect the fuel consumption ...
The remainder of the paper is organized as follows. Section II introduces the multi-source data collection of the inland ship. The proposed methods of multi-source data analysis are described in detail in Section III. The prediction model of the fuel consumption rate is constructed in Section IV. Detailed experiments and analysis are carried out in Section V. Finally, conclusions and future work are presented in Section VI.

II. MULTI-SOURCE DATA COLLECTION
Multi-source data in this study are composed of ship static information, real-time ship status data and navigation environment data, the first two of which are collected by an on-board online monitoring system, as shown in Fig. 2.
This article considers a cargo ship sailing on the Yangtze River trunk line as the research object, which is equipped with ... Table 1. In addition, the navigation environment data, including water level, water speed, wind speed, and wind direction, are collected from the hydrographic stations along the Yangtze River trunk line.

III. MULTI-SOURCE DATA ANALYSIS
This section will introduce the methods of multi-source data processing and analysis in detail, including raw data pre-processing, trajectory segmentation, and correlation analysis.

A. DATA PRE-PROCESSING
The status monitoring data are obtained through continuous-time sampling by multi-source sensors on the ship terminal. The original data usually contain errors and anomalies due to data transmission delays, abnormal data reception, ship berthing, cargo handling and/or other reasons. In Fig. 3(a), the normal range of longitude is between 105 and 113, and the zero values are abnormal data, which may be caused by transmission errors. In Fig. 3(b), the zero values of SOG (speed over ground) are noise data, which may be caused by abnormal reception, or by the ship berthing and working cargo. The engine speed data also contain noise and abnormal values, as shown in Fig. 3(c) and (d). Therefore, pre-processing of the multi-source monitoring data is necessary. It should be noted that the multi-source data were collected by different sensors. For example, ship locations are obtained by GPS receivers on the bridge, the engine speed and temperature come from sensors mounted on the engines, while the fuel data are collected from sensors in the bunker and tanks. That is to say, the SOG or engine speed in a data record with an abnormal longitude may still be normal.
Therefore, the pre-processing flowchart is designed as shown in Fig. 4. The proposed data pre-processing method for inland ship multi-source monitoring data is divided into the following six steps:
Step 1. Original data sorting. The whole data set is sorted by sampling date and time, and duplicate records are removed.
Step 2. Abnormal data locating. Abnormal values, for example zero-valued longitudes, are located in the original data. Such abnormal data may be caused by transmission and reception errors.
Step 3. Replacing the abnormal data with the mid-value. A record with an abnormal value may contain other useful information and should not be deleted entirely. Therefore, the abnormal value is replaced with the mid-value, which is the average of the values at the preceding and following sampling times.
Step 4. Noisy data locating. Noisy data, such as zero-valued SOG and engine speed, are located in the multi-source data.
Step 5. Noisy data cleaning. Noisy data carry no usable information and degrade the quality of the data set, so they are cleaned from it.
Step 6. Real-time fuel consumption rate calculating. The real-time fuel consumption rate refers to the ship's fuel consumption per minute.
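Steps 1 to 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation (the paper reports using MATLAB); the record layout with keys `t`, `lon`, `sog` and `es`, and the rule of using the nearest valid neighbours for the mid-value, are assumptions made for illustration only.

```python
def preprocess(records):
    """Apply Steps 1-5 to a list of dicts with keys 't', 'lon', 'sog', 'es'.

    The key names are hypothetical; 't' is the sampling time in minutes.
    """
    # Step 1: sort by sampling time and drop duplicate timestamps.
    rows, seen = [], set()
    for rec in sorted(records, key=lambda r: r["t"]):
        if rec["t"] not in seen:
            seen.add(rec["t"])
            rows.append(dict(rec))

    # Step 2: locate abnormal longitudes (zero values).
    bad = [i for i, r in enumerate(rows) if r["lon"] == 0]

    # Step 3: replace each abnormal value with the mean of the nearest
    # valid values before and after it (one-sided if only one exists).
    original = [r["lon"] for r in rows]
    for i in bad:
        before = next((v for v in reversed(original[:i]) if v != 0), None)
        after = next((v for v in original[i + 1:] if v != 0), None)
        neighbours = [v for v in (before, after) if v is not None]
        if neighbours:
            rows[i]["lon"] = sum(neighbours) / len(neighbours)

    # Steps 4-5: locate and drop noisy records (zero SOG or engine speed).
    return [r for r in rows if r["sog"] != 0 and r["es"] != 0]
```

The longitude is repaired before noisy records are dropped, so a record with a bad position but valid engine data is kept, in line with the remark that different sensors fail independently.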
It should be noted that, during navigation, the collected multi-source monitoring data only include the bunker fuel, left reserve fuel and right reserve fuel of the ship. The real-time fuel consumption rate is not recorded and needs to be calculated from the recorded fuel information. The bunker fuel and reserve fuel are measured by fuel gauges, which record the amount of oil in the bunker and tanks every minute (the measurement error ratio is 1.17%). In order to calculate the real-time fuel consumption rate accurately, the equations are designed as follows.
where FuelBun denotes the bunker fuel, FuleResL and FuleResR denote the left reserve fuel and right reserve fuel, respectively, all in L; Fuel represents the fuel consumption during the sampling time; i denotes the current time and i−1 the previous time, both in minutes; and FuelCR represents the real-time fuel consumption rate, in L/min. The corresponding navigation environmental data, including water level, water speed, wind speed and wind direction, are shown in Fig. 5. In this article, wind speed is expressed on the Beaufort Scale (BS), and the wind direction ranges from 0° to 360°. In addition, the wind angle, which is the angle between the wind direction and the COG (course over ground), is also calculated. The wind angle ranges from −180° to 180°.
When the wind angle is 0°, the wind direction is consistent with the heading of the ship, and the ship is sailing downwind. When it is −180° or 180°, the ship is sailing upwind.
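The fuel-rate and wind-angle definitions can be sketched in Python as follows. This is an illustrative reading of the text, not the paper's equations: it assumes the consumption over an interval is the drop in total fuel (bunker plus both reserve tanks) between consecutive samples, and that the wind angle is the wind direction minus the COG, wrapped into [−180°, 180°). The function names and tuple layout are hypothetical.

```python
def fuel_rates(samples):
    """Return (minute, rate in L/min) for each consecutive pair of samples.

    samples: time-ordered tuples (minute, bunker_L, reserve_left_L,
    reserve_right_L). Assumed layout, for illustration only.
    """
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        if dt <= 0:
            continue  # skip repeated or out-of-order timestamps
        consumed = sum(prev[1:]) - sum(cur[1:])  # drop in total fuel on board
        rates.append((cur[0], consumed / dt))
    return rates


def wind_angle(wind_dir_deg, cog_deg):
    """Signed angle between wind direction and COG, wrapped to [-180, 180)."""
    return (wind_dir_deg - cog_deg + 180.0) % 360.0 - 180.0
```

With this wrapping, an exact headwind is reported as −180°, which the text treats as equivalent to 180°.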

B. TRAJECTORY SEGMENTATION
From Fig. 5, it can be seen that the wind speed along the route is relatively low and the wind direction varies little. For example, the wind speed is below 3 on the Beaufort scale, and the wind direction does not change much within a given period (such as one day). These variables may therefore not help much in the analysis and prediction of real-time fuel consumption. In contrast, changes in water level and water speed are more obvious, and the water level changes together with the water speed. To analyze and predict the real-time fuel consumption accurately, we further divided the ship trajectory according to the water level of the waterways and the actual topography of the route along the Yangtze River trunk line.
In this work, the route of the ship runs from Gongan to Zigui on the Yangtze River mainline, as shown in Fig. 6. According to the water level and water speed of the waterway, 7 ports along the route, i.e. Yuanshi, Yaogang, Zhicheng, Yidu, Yichang, Sandouping and Maoping, were chosen to divide the trajectory into 8 segments, with Segments No. 2-5 shown in Fig. 6(b). Importantly, the result of the trajectory division was approved by several captains with many years of experience in navigating the Yangtze River trunk line.
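As a rough illustration of how 7 boundary ports yield 8 segment IDs, the sketch below assigns a segment number from a position along the route. The along-route distances are placeholder values chosen only so the example runs; they are not the real port positions, and the paper does not state how positions were mapped to segments.

```python
from bisect import bisect_right

# Boundary ports in the order listed in the text.
PORTS = ["Yuanshi", "Yaogang", "Zhicheng", "Yidu", "Yichang", "Sandouping", "Maoping"]

# PLACEHOLDER along-route distances (km) of each boundary port from the
# start of the route. These are NOT real values; replace with surveyed data.
BOUNDARY_KM = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]


def segment_id(distance_km):
    """Return the segment number (1..8) for a position along the route."""
    return bisect_right(BOUNDARY_KM, distance_km) + 1
```

A position exactly at a boundary port is assigned to the following segment here; that tie-breaking rule is also an assumption.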

C. MULTI-SOURCE VARIABLES CORRELATION ANALYSIS
Through the above data processing and trajectory segmentation, a high-quality multi-source data set is obtained, which includes status monitoring data, real-time fuel consumption rate, environmental data and segment number. It is well known that the input feature variables have a great influence on the output of a prediction model. In addition to navigation state variables such as ES (engine speed), ET (engine temperature), SOG and COG, environmental factors such as water level, water speed, wind speed and wind angle also affect the fuel consumption of the ship. In order to select appropriate input variables for the ship fuel consumption model, the Pearson correlation coefficient is used to analyze the correlation between each variable and the real-time fuel consumption rate, as shown in Equation (5).

$$r = \frac{\sum_{t=1}^{n} (V_t - \bar{V})(F_t - \bar{F})}{\sqrt{\sum_{t=1}^{n} (V_t - \bar{V})^{2}}\,\sqrt{\sum_{t=1}^{n} (F_t - \bar{F})^{2}}} \tag{5}$$

where $r$ denotes the Pearson correlation coefficient, $V$ indicates a feature variable, $F$ indicates the real-time fuel consumption, $\bar{V}$ and $\bar{F}$ represent their mean values, respectively, $t$ is the index of the variable, and $n$ represents the length of each variable. Using this equation, the correlation coefficients between the multi-source variables and the real-time fuel consumption are obtained, as shown in Table 2.
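The Pearson coefficient of Equation (5) can be computed with a few lines of standard-library Python. This is a generic sketch of the textbook formula; the function and argument names are illustrative.

```python
import math


def pearson_r(v, f):
    """Pearson correlation between a feature series v and fuel rate series f."""
    n = len(v)
    if n != len(f) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_v = sum(v) / n
    mean_f = sum(f) / n
    cov = sum((a - mean_v) * (b - mean_f) for a, b in zip(v, f))
    dev_v = math.sqrt(sum((a - mean_v) ** 2 for a in v))
    dev_f = math.sqrt(sum((b - mean_f) ** 2 for b in f))
    if dev_v == 0 or dev_f == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_v * dev_f)
```

A constant series (for instance a sensor stuck at one value) has no defined correlation, so the sketch raises an error rather than returning a misleading number.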
From Table 2, it is not difficult to find that the strongest correlation with the fuel consumption rate exists for the left ES and right ES, and that the left ET and right ET also have a relatively high correlation with the fuel consumption rate, so they will play an important role in the subsequent experiments on real-time fuel consumption prediction. It is obvious that the higher the ES, the greater the fuel consumption. A higher ET normally reflects a higher ES with a certain time delay, while it is also affected by the ambient temperature. Meanwhile, a higher ET could lead to a reduction in engine output power, so the ETs are also important for constructing the relation between engine speed, fuel consumption, and ship speed. Economically speaking, water level and water speed, which are determined by river flows and tides, vary greatly in inland waters and provide the majority of the resistance for upriver ships [19]. Conversely, wind speed and wind direction are relatively stable and less influential, as shown in Fig. 5(b). In addition, the influence of river flows and tides on ships is mainly reflected in the ship speed along the direction of the ship's course, while the water flow direction changes frequently when a ship sails in inland waters, as shown in Fig. 5(a). As a result, the correlation of the water speed is larger than that of the wind speed. Moreover, the water level and segment ID have an obvious correlation with the real-time fuel consumption rate, which also demonstrates the effectiveness of the trajectory segmentation. Since the ship sails from the lower reach to the upper reach, the water level and segment number increase. In inland rivers, a higher water level could lead to relatively lower resistance and result in lower fuel consumption. Therefore, they have a negative correlation with the fuel consumption rate.
Moreover, according to the absolute value of the correlation coefficient, the correlations are divided into 5 grades: very strong (|r| > 0.90), strong (0.60 < |r| < 0.90), moderate (0.25 < |r| < 0.60), weak (0.10 < |r| < 0.25) and very weak (|r| < 0.10).
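The five grades can be expressed as a small lookup on the magnitude of r. The sketch below is illustrative; because the text uses strict inequalities on both sides, a value falling exactly on a threshold is assigned to the lower grade here, which is an assumption.

```python
def correlation_grade(r):
    """Map a correlation coefficient to one of the five grades in the text."""
    magnitude = abs(r)  # negative correlations are graded by their strength
    if magnitude > 0.90:
        return "very strong"
    if magnitude > 0.60:
        return "strong"
    if magnitude > 0.25:
        return "moderate"
    if magnitude > 0.10:
        return "weak"
    return "very weak"
```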

IV. MODEL BUILDING
An Artificial Neural Network (ANN) is an information processing system that imitates the structure and function of the brain's neural network [19]. With its strong capabilities of self-learning, self-organization, adaptation and nonlinear function approximation, it has been widely used in data processing, modelling and forecasting [20]-[24].
Generally, an ANN has three kinds of network layers: the input layer, hidden layer and output layer. The hidden layer can be a single layer or multiple layers. An ANN with multiple hidden layers commonly has better performance, but its structure is more complex, its time complexity is higher, and it is prone to overfitting. Back Propagation (BP) is a simple and efficient algorithm for ANN learning, which consists of two processes: forward propagation of signals and backward propagation of errors. BP networks have strong learning ability and have also been used for fuel consumption prediction [25], [26].
The Elman neural network [27] is a typical dynamic recurrent neural network, which is generally composed of four layers: input layer, hidden layer, context layer and output layer. The input layer units only transmit signals, the output layer units perform weighting, and the hidden layer units can use either linear or nonlinear excitation functions. The context layer memorizes the output of the hidden layer units at the previous moment and can be regarded as a one-step delay operator. The Elman neural network is based on the basic structure of the BP network, with a context layer added as a one-step delay operator to provide memory, so that the system can adapt to time-varying characteristics and the global stability of the network is enhanced.
The Radial Basis Function (RBF) network is a three-layer neural network with only one hidden layer [28]. The transformation from the input layer space to the hidden layer space is nonlinear, and the transformation from the hidden layer space to the output layer space is linear.
The weights of the RBF network can be solved directly from a system of linear equations, which accelerates network learning and avoids local minimum problems. The output of the RBF network depends on its parameters and has the characteristic of ''local mapping''. The Generalized Regression Neural Network (GRNN) [29] is an improvement on the RBF network, which adds a summation layer and removes the weight connection between the hidden layer and the output layer. GRNN has strong nonlinear mapping ability and learning efficiency, and has advantages over the RBF network. With a large sample size, the GRNN converges to an optimized regression. With a small number of samples, GRNN also predicts well, and it can be used to deal with unstable data. These three ANNs (Elman, RBF and GRNN) have been widely used in data modelling and prediction [30], [31]. Jenkins et al. [32] used the Elman network to build an ensemble forecast framework (ENFF) for demand prediction of anomalous days. Raza et al. [33] proposed a fuzzy RBF neural network structure by combining the fuzzy logic system with the RBF neural network for short-term road speed forecasting. Ai et al. [34] used GRNN to test the direct and indirect effects of various factors on energy consumption, and combined GRNN with urban development scenarios to predict the cities' future CO$_{2}$ emissions. Ye et al. [35] proposed an adaptive mutation particle swarm optimization and GRNN prediction model to predict equivalent salt deposit density (ESDD).
In recent years, great advances have been made in Recurrent Neural Networks (RNNs) [36], such as the Gated Recurrent Unit (GRU) [37] and Long Short-Term Memory (LSTM) [38]. RNNs have been successfully applied to various tasks, such as data-driven modelling [39], image captioning [40], sentiment analysis [41] and speech recognition [42]. For example, Ding and He [43] studied online training of the LSTM architecture in a distributed network of nodes for regression and introduced online distributed training algorithms for variable-length data sequences. Ergen and Kozat [44] proposed a new approach, the CAM-RNN, to extract the most correlated visual feature and text feature for the task of video captioning, which was composed of three parts, i.e., a visual attention module, a text attention module and a balancing gate. Zhao et al. [45] designed three different sememe incorporation methods and employed them in RNNs, including LSTM, GRU and their bidirectional variants, to improve their sequence modelling ability. Unfortunately, these advanced neural networks have not yet been used to predict the fuel consumption of inland ships.
In this article, a BP neural network with double hidden layers (DBPNN) is tailored to build the fuel consumption prediction model of the inland ship; its topology is shown in Fig. 7. The network model consists of four layers. First, the multi-source feature variables are selected according to the correlation analysis and presented to the input layer. Then, two hidden layers are designed to improve the performance of the model. Finally, the output layer consists of the fuel consumption only. The number of neurons, the activation function, the loss function and the optimizer are determined through network training.
The specific modelling process with DBPNN can be described as follows.
Step 1. Selecting the feature variables. The status data and environmental data are selected from the fuel consumption data set as the input feature variables according to the correlation coefficients in Table 2, and the fuel consumption is taken as the output variable.
Step 2. Normalizing the data. The input and output data are normalized to the range of 0 to 1, and the data set is randomly divided into a training set and a testing set. Data normalization can eliminate the adverse effects caused by singular sample data.
Step 3. Creating and training the DBPNN. The weights of the DBPNN are randomly initialized, and the network is then trained on the training set until the termination criterion is satisfied. Finally, the optimal parameters of the fuel consumption prediction model are set.
Step 4. Testing and evaluating. The network is tested with a separate data set, and the model is evaluated using evaluation functions such as Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and R-squared ($R^{2}$).
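To make Step 3 concrete, the following is a minimal pure-Python sketch of a feed-forward network with two hidden layers trained by error back-propagation. It is not the authors' MATLAB model: it uses plain stochastic gradient descent on a squared-error loss rather than the training function reported later in the paper, tanh hidden units with a linear output are assumed, and the layer sizes in any call are hypothetical.

```python
import math
import random


def build_dbpnn(n_in, n_h1, n_h2, n_out, seed=0):
    """Return three layers (hidden 1, hidden 2, output) as (weights, biases)."""
    rng = random.Random(seed)
    sizes = [n_in, n_h1, n_h2, n_out]
    net = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        net.append((weights, [0.0] * fan_out))
    return net


def forward(net, x):
    """Forward propagation; returns activations of all four layers."""
    acts = [list(x)]
    for k, (W, b) in enumerate(net):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bj
             for row, bj in zip(W, b)]
        is_output = k == len(net) - 1
        acts.append(z if is_output else [math.tanh(v) for v in z])
    return acts


def train_step(net, x, y, lr):
    """One back-propagation update for a single sample (squared-error loss)."""
    acts = forward(net, x)
    delta = [o - t for o, t in zip(acts[-1], y)]  # error at the linear output
    for k in range(len(net) - 1, -1, -1):
        W, b = net[k]
        a_in = acts[k]
        if k > 0:
            # Propagate the error backwards through the tanh of layer k-1,
            # using the weights before they are updated.
            back = [sum(W[j][i] * delta[j] for j in range(len(W)))
                    * (1.0 - a_in[i] ** 2) for i in range(len(a_in))]
        for j in range(len(W)):
            for i in range(len(a_in)):
                W[j][i] -= lr * delta[j] * a_in[i]
            b[j] -= lr * delta[j]
        if k > 0:
            delta = back


def mse(net, X, Y):
    """Mean squared error of the network over a data set."""
    total = 0.0
    for x, y in zip(X, Y):
        out = forward(net, x)[-1]
        total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / len(X)
```

A typical use would call `train_step` over the normalized training records for a fixed number of epochs, or until `mse` falls below a chosen goal, and then evaluate on the held-out testing set.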
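In Step 2, min-max normalization can be used to normalize the input and output data of the DBPNN, as shown in Equation (6). In Step 4, RMSE, MAE and $R^{2}$ can be adopted to evaluate the performance of the developed models, as shown in Equations (7)-(9).

$$x^{*}(t) = \frac{x(t) - \min(x)}{\max(x) - \min(x)} \tag{6}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T} \left(y_t - \hat{y}_t\right)^{2}} \tag{7}$$

$$\mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T} \left|y_t - \hat{y}_t\right| \tag{8}$$

$$R^{2} = 1 - \frac{\sum_{t=1}^{T} \left(y_t - \hat{y}_t\right)^{2}}{\sum_{t=1}^{T} \left(y_t - \bar{y}\right)^{2}} \tag{9}$$

where $x(t)$ and $x^{*}(t)$ represent the initial data and the normalized data, respectively; $t$ represents the index of a datum and $T$ represents the number of data; $y_t$ and $\hat{y}_t$ are the real value and the predicted value of the $t$-th datum, respectively; and $\bar{y}$ is the mean of $y_t$, where $t = 1, 2, 3, \ldots, T$.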
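The normalization of Equation (6) and the evaluation measures of Equations (7)-(9) can be sketched in Python using their standard definitions. Function names are illustrative, and returning zeros for a constant series in `min_max` is an assumption made to avoid division by zero.

```python
import math


def min_max(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

Note that $R^{2}$ can be negative when a model predicts worse than the mean of the measured values, so values close to 1 indicate a good fit.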

V. RESULTS AND ANALYSIS
In this study, the host platform is a notebook computer with an Intel(R) Core(TM) i7-6500 CPU (Central Processing Unit), 8 GB of RAM (Random Access Memory) and a 64-bit Windows 10 operating system. The programs are written in MATLAB 2019a. The data used in this study came from the inland vessel monitoring system of Changjiang National Shipping Group Co. Ltd, and the multi-source data set includes 9,637 data records. After data pre-processing, a clean data set with 2,158 records is obtained. In the following case study, the fitting models for inland ship fuel consumption are established, and the fuel consumption is predicted and analyzed in detail.
First of all, LES (left engine speed) and RES (right engine speed), which have the strongest correlation (as shown in Table 2), are selected as the input feature variables of the prediction model, and the fuel consumption rate is selected as the output. In order to avoid overfitting in the neural network, 80% of the original data are randomly extracted as the training set and the remaining 20% are used as the testing set. The training data and the testing data are normalized before being presented to the neural network. The number of epochs is set to 1000; the learning rate is set to 0.05; the ''mae'' (mean absolute error) function is selected as the performance function; the ''trainlm'' function is selected as the training function; and ''learngdm'' is selected as the learning function. In addition, the training goal of the network is set to 0.00001, and if the training error reaches or falls below the goal, the neural network stops training early. In order to determine the structure of the network, experiments are conducted on several groups of settings for the number of neurons and the transfer functions of the hidden layers and output layer. The RMSE and MAE on the testing data are shown in Table 3. After a number of preliminary experiments, the parameter settings shown in Table 4 are found to be appropriate and are used.
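The random 80%/20% division can be sketched as below. The paper does not describe how the random extraction was implemented, so the fixed seed and the rounding of the training-set size are assumptions made for reproducibility of the example.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Randomly split records into training and testing lists."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)  # seeded for a repeatable split
    cut = int(round(train_fraction * len(records)))
    train = [records[i] for i in indices[:cut]]
    test = [records[i] for i in indices[cut:]]
    return train, test
```

Every record lands in exactly one of the two lists, so no testing sample is seen during training.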
In addition, some experiments are conducted to compare the training time and prediction performance (including RMSE, MAE and $R^{2}$) of models with different numbers of hidden layers. The results are shown in Table 5, where the models use the functions listed in Table 4, the unit of training time is seconds (s), and ''structure'' represents the number of neurons in each network layer. From Table 5, we can find that: (1) the training and testing errors with multiple hidden layers are smaller than those with a single hidden layer, but the training times are longer. (2) The performance of the model with three hidden layers is similar to that of the model with two hidden layers, but the training time is greatly increased. (3) The training time of the model with four hidden layers decreased, but its errors increased instead, because the network stopped training early to prevent overfitting. Therefore, the model with two hidden layers is selected for the case study.
Secondly, several DBPNN models are developed with different combinations of feature variables as inputs, according to the correlations in Table 2. The feature variables are combined into five groups, A, B, C, D and E, as shown in Table 6.
The input feature variables of group A are LES and RES, which have the strongest correlation; group B adds LET and RET, whose correlation grade is strong; and group C adds SOG and COG. Group D contains all navigation status data and also adds the environmental data and segment data, namely WaL, WaS, WiS, WiA and SID. Group E also includes the environmental data and segment data, but removes SOG. Here, LES denotes left engine speed, RES denotes right engine speed, LET denotes left engine temperature, RET denotes right engine temperature, WaL represents water level, WaS represents water speed, WiS represents wind speed, WiA represents wind angle, and SID is the segment number. Fig. 8 demonstrates the prediction performance of the developed models under different input feature variables. In Fig. 8, the blue line represents the measured fuel consumption rate, the green line represents the predictions on the training set, and the red line represents the predictions on the testing set. The above prediction results show that: (1) the processed multi-source data can be well used for fuel consumption modelling; (2) the navigation state and environment of inland ships have a great influence on fuel consumption, and considering the environmental factors in modelling can effectively enhance the accuracy of the predictive model, as shown in Fig. 8(d).
FIGURE 8. Fuel consumption prediction results of different input variables: (a) A group with two input feature variables; (b) B group with four input feature variables; (c) C group with six input feature variables; (d) D group with ten input feature variables, which contains status monitoring data, environmental information and segment ID.
Thirdly, the prediction accuracy on the training data and testing data under different input variables is calculated, including RMSE, MAE and $R^{2}$. Table 7 records the mean results of 20 experiments. It is easy to see from Table 7 that as the number of input variables increases, the values of RMSE and MAE gradually decrease, and the value of $R^{2}$ gradually increases. Moreover, adding the engine temperatures to the input variables greatly improves the accuracy of the fuel consumption model. In particular, considering the environmental factors and segment information, including the water level, water speed, wind speed, wind angle and segment ID, further improves the prediction performance: the testing RMSE and MAE decreased to 0.0980 and 0.0731, respectively, and the $R^{2}$ value increased to 0.9843. Comparing group C with group D in Table 7, it can be found that the RMSE and MAE are reduced by 35.31% and 30.30% after adding the environmental factors. Therefore, it can be said that the prediction results are best when monitoring attributes and hydrological factors are combined as model inputs.
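As a worked illustration of how such percentage reductions relate to the reported errors, assume they are relative reductions of the group D testing errors with respect to group C (the text suggests this, but Table 7 is needed to confirm it). Writing $e_C$ and $e_D$ for the error of groups C and D and $p$ for the relative reduction:

$$p = \frac{e_C - e_D}{e_C} \quad\Longrightarrow\quad e_C = \frac{e_D}{1 - p}$$

Under that assumption, the reported values imply group C testing errors of roughly

$$\mathrm{RMSE}_C \approx \frac{0.0980}{1 - 0.3531} \approx 0.1515, \qquad \mathrm{MAE}_C \approx \frac{0.0731}{1 - 0.3030} \approx 0.1049.$$

These are back-calculated estimates, not values taken from the paper's tables.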
Finally, in order to verify the advantages of the constructed prediction model, it is compared with other models, including the Elman network, the RBF network, three support vector regression (SVR) algorithms, namely linear regression (LR), support vector machine regression (SVMR) and Gaussian kernel regression (GKR), random forest regression (RFR), GRNN, RNN, the GRU network and the LSTM network. In the comparison experiments, ten feature variables (group D in Table 6) are selected as the model inputs, the same training and testing data are used, and the network parameters are kept consistent with those of the proposed DBPNN. The Elman network is also set up with two hidden layers, with 81 and 8 neurons. The RBF network has only one hidden layer, and the ''spread'' parameter is set to 80 (the best value found over many experiments). The RFR parameter is set to 50, the best value found between 5 and 200. For the GRNN, 4 cross-validations are conducted to select the optimal ''spread'' in the range from 0.01 to 0.2, and the best value of 0.02 is found and used. The parameter settings of RNN, GRU and LSTM are as follows: the number of epochs is 1000, the number of neurons is 150, the number of time steps is 1, and the batch size is 100. The prediction performance (mean and standard deviation of 20 experiments) of the different methods is shown in Table 8.
From Table 8, it can be seen that, compared with the proposed DBPNN, the other neural networks have relatively high prediction errors. For example, the testing RMSE values of the other networks are more than 0.1750, the testing MAE values are more than 0.1050, and the testing $R^{2}$ values are less than 0.9500. It is worth noting that although the training RMSE, MAE and $R^{2}$ of the GRNN are relatively close to those of the DBPNN, the testing RMSE, MAE and $R^{2}$ of the GRNN are 0.2186, 0.1178 and 0.9226, while those of the DBPNN are 0.0980, 0.0731 and 0.9843, respectively. Fig. 9, Fig. 10 and Fig. 11 demonstrate the prediction performance of the three models GRNN, GRU and DBPNN. The results shown in Table 8, Fig. 9, Fig. 10 and

VI. CONCLUSION AND FUTURE WORK
In this article, considering multi-source variables of ship navigation status and environmental factors, a novel application of inland ship fuel consumption fitting analysis was implemented based on the developed DBPNN. With the DBPNN method, the fuel consumption of the inland ship has been analyzed and predicted in detail. The multi-source monitoring data have been pre-processed to provide high-quality data sets for the fitting analysis and prediction of ship fuel consumption. The ship trajectory has been divided into 8 segments according to the hydrological data. Correlation analysis of multiple variables has been carried out, which facilitates the selection of input feature variables for predictive models of the fuel consumption rate. Different feature variables have been selected and presented to the proposed DBPNN. The verification results on the measured data showed that (1) the multi-source data set composed of navigation status, environmental factors and segment information can be well used for the fuel consumption fitting analysis of inland ships; (2) after adding environmental data and segment ID as input feature variables, the performance of the prediction model is significantly improved: for example, the testing RMSE is reduced by 35.31%, the MAE is reduced by 30.30%, and the $R^{2}$ is improved to 0.9843; (3) compared with methods such as LR, SVMR, GKR, Elman, RBF, RFR, GRNN, RNN, GRU and LSTM, the constructed DBPNN model consistently outperforms the other methods in inland ship fuel consumption prediction.
The data used in this article are still limited, since the current research data come from only one part of the Yangtze River trunk line. In the future, more multi-source data from the whole Yangtze River trunk line will be collected and considered in the analysis of ship fuel consumption. In addition, the structure of the neural network may be improved to enhance its adaptability, so that the developed model can work with intermittent input values.