Deep Learning Data-Intelligence Model Based on Adjusted Forecasting Window Scale: Application in Daily Streamflow Simulation

Streamflow forecasting is essential for hydrological engineering. In accordance with the advancement of computer aids in this field, various machine learning (ML) models have been explored to solve this highly non-stationary, stochastic, and nonlinear problem. In the current research, a newly explored version of an ML model called the long short-term memory (LSTM) was investigated for streamflow prediction using historical data for forecasting for a particular period. For a case study located in a tropical environment, the Kelantan river in the northeast region of the Malaysia Peninsula was selected. The modelling was performed according to several perspectives: (i) The feasibility of applying the developed LSTM model to streamflow prediction was verified, and the performance of the developed LSTM model was compared with the classic backpropagation neural network model; (ii) In the experimental process of applying the LSTM model to the prediction of streamflow, the influence of the training set size on the performance of the developed LSTM model was tested; (iii) The effect of the time interval between the training set and the testing set on the performance of the developed LSTM model was tested; (iv) The effect of the time span of the prediction data on the performance of the developed LSTM model was tested. The experimental data show that not only does the developed LSTM model have obvious advantages in processing steady streamflow data in the dry season but it also shows good ability to capture data features in the rapidly fluctuant streamflow data in the rainy season.


The associate editor coordinating the review of this manuscript and approving it for publication was Kok Lim Alvin Yau.

I. INTRODUCTION
Studies have consistently shown that streamflow is difficult to predict because of the natural variability of river systems, including their non-linearity, randomness, and non-stationarity [1], [2]. Several research efforts have been reported in the field of hydrology pertaining to the improvement of the reliability and accuracy of hydrological variable prediction [3]-[7]. Many hydro-meteorological studies have been reported, but no single approach has proved efficient in modelling hydrological events under varying conditions, particularly as various catchment features play a part. This is due to certain physical processes that characterize river flows, such as the periodicity of the methods and the current patterns in the model data [8]. In other words, no existing model outperforms the others across the full range of hydrological conditions. It may not be feasible to generate consistent predictions using several models due to the dynamic nature and non-stationarity of historical data; hence, studies are needed to develop more efficient models based on the available historical data [9].
The recent advancements in computational models can also be exploited to improve modelling accuracy of such models [7], [10], [11]. The feasibility of using novel data-intelligence techniques in developing efficient forecasting models should also be explored.
For the purposes of water resource management, it is important to understand the hydrological processes that control streamflow patterns. Several studies have been conducted in recent years on streamflow phenomena due to the interest in both global and regional changes in hydrologic patterns that result in drought and flooding [1], [12]- [14]. Evidence in the literature suggests that streamflow patterns can be modelled using either physically based models or artificial intelligence (AI) based models [15]. Meanwhile, there is a need for more studies on the hydrological parameters to develop the required initial and boundary criteria for simulating the elemental processes of specific watersheds with the aid of physically based models [16], [17].
From the reported studies on streamflow simulation, it is evident that classical regression tools are commonly used despite being associated with low accuracy levels [18]; this has motivated the development of AI methods, which offer more accuracy. Several reviews in the field of hydrology suggest that various AI models have been investigated for streamflow simulation; such models include support vector machine (SVM), artificial neural network (ANN), adaptive neuro-fuzzy inference system, complementary wavelet-AI, as well as hybrid evolutionary computing models [1], [12]- [14], [19]- [21]. However, several challenges are still associated with these models, especially regarding their implementation as expert systems for sustainable river engineering. Other challenges are their time-inefficiency and manual modelling processes. In various studies, various model optimisation methods such as input vector optimisation, prediction interval optimisation, integration, hybridisation, and data decomposition have been proposed [22]- [24]. Thus, it is necessary to come up with a universally applicable and automated AI model that is applicable across different local scales.
The non-linear relationships between the estimators and the simulated parameters can be understood using conceptually-based methods as they rely on historical input data and do not require previous flow information [25]. Such models could be beneficial as they demand few hydrological inputs [26]. The basic steps in the AI-based models are as follows: problem analysis, data collection & preprocessing, data-driven model selection, identification of the optimal model from the host of trained models, and evaluation of the selected model. The most important of these steps is the data-driven model identification as this is where the learning process is performed, and the features are extracted. The optimal model approximation is performed by reducing the training error between the actual and predicted matrix; the selection of the optimal model is made from the host of trained models as the model with the lowest mean squared error after a series of independent validation processes.
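The selection step described above, in which the optimal model is the trained candidate with the lowest mean squared error on validation data, can be sketched as follows. This is a minimal illustration rather than the authors' code: `select_best_model`, the candidate dictionary, and the toy predictors in the usage lines are hypothetical names introduced here.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def select_best_model(candidates, x_val, y_val):
    """Return (name, mse) of the candidate with the lowest validation MSE.

    candidates maps a model name to a callable predict(x) -> array.
    """
    scores = {
        name: mean_squared_error(y_val, predict(x_val))
        for name, predict in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy usage: two hypothetical "trained models" scored on validation data.
x_val = np.arange(5.0)
y_val = 2.0 * x_val
candidates = {"model_a": lambda x: 2.0 * x, "model_b": lambda x: x}
best_name, best_mse = select_best_model(candidates, x_val, y_val)
```

In practice the validation data should be independent of the data used for training, as the text requires.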
The approximation performance of AI models can also be influenced by factors that affect actual hydrological conditions, such as the determination of the model input, the configuration of the model, and the time-scale horizon.
As compared to the traditional ANN model, the long short-term memory (LSTM) network model adds a forget gate. The added thresholds allow the LSTM model to reset itself internally at appropriate times, which avoids network breakdown. LSTM has better time series capture capabilities and better long-term memory, and can solve complex artificial long-time-lag problems [27], [28]. During the past two years, the application of LSTM to time series forecasting has received increasing attention [29]. LSTM is widely used for forecasting various time series, including flood debris, groundwater levels, lake water levels, watershed runoff, and meteorological problems [30]-[38].
The advantages and feasibility of LSTM in capturing the long-term dependence of time series were highlighted by comparing LSTM with the autoregressive integrated moving average and the generalised regression neural network models [39]. In 2018, Zhang et al. applied LSTM to the analysis of countercurrent problems in wastewater treatment plants and compared it with the Elman and NARX (nonlinear autoregressive network with exogenous inputs) networks [40]. Zhang et al. added a loss layer based on LSTM to predict the depth of the water table [41]. In the group's study, the new forecasting model was compared with the feed-forward neural network and the double LSTM model. The results showed that the new model is more accurate for groundwater level forecasting. Later, LSTM was combined with the ant lion optimiser (ALO) to form the LSTM-ALO model, which was applied to flow forecasting for the Astor River basin [42]. Widiasari et al. applied an LSTM model to forecast river water levels in the lower reaches of the Semarang region. The results demonstrated the role of LSTM in flood control and disaster mitigation [43].
In an additional study, LSTM was applied to studying the water level fluctuations and reservoir operations in Dongting Lake [44]. The LSTM model was compared with the SVM model, and a comprehensive examination of the influence of the Three Gorges Dam on the water level of Dongting Lake was conducted. The accuracy of the LSTM model used in the study was much better than that of the SVM model, especially in forecasting high water-level values, where it showed an excellent performance [44]. In the research related to reservoir operation, multiple studies were conducted for different time scales and different flow regimes, and the forecasting results of three different models, LSTM, backpropagation (BP), and SVM, were compared. Compared to the BP and SVM models, the LSTM model predicted the operating modes under different conditions more quickly and accurately [45]. Tian et al. explored the potential of the LSTM model for runoff simulations of the Xiangjiang River and the Qujiang River basins. In this study, LSTM demonstrated excellent time series capture and better long-term memory compared to SVM [46]. In 2019, Huang et al. published a study in which LSTM was combined with the El Niño-Southern Oscillation index. In comparison with the linear regression model, LSTM used the nonlinear evolution of the data better and demonstrated its statistical advantage at long lead times [47].
As highlighted in the previous paragraphs, the LSTM model has been increasingly used for various aspects of hydrology research. In the literature survey, the application of LSTM in groundwater level forecasting, reservoir operation forecasting, runoff problems in the Xiangjiang River and Qujiang River basins, and the El Niño-Southern Oscillation demonstrated the feasibility of using LSTM for water conservancy and climate-related prediction problems [48]. The streamflow problem is an important issue that combines water conservancy and climate. The application of LSTM in streamflow forecasting research has important practical significance for flood control and disaster reduction.
As compared with previous studies, this innovative study was conducted from the following perspectives.
i. The first objective was to examine the feasibility of applying the developed LSTM to streamflow forecasting in a highly stochastic environment. The periodic time series characteristics of streamflow and the reliable accuracy of the developed LSTM model's forecasting results illustrate the feasibility of using the LSTM in streamflow forecasting.
ii. Based on the first objective, the effects of the size of the training set on the performance of the model were explored.
iii. Based on the second objective, the effects of the time interval between the training set data and testing set data on the performance of the model were explored.
iv. Based on the third objective, the effects of the time span of predicted data on the performance of the model were explored.

II. STUDY AREA DESCRIPTION AND DATA AVAILABILITY
In this study, the daily scale streamflow information used was collected from the Kelantan River located in the northeastern region of the Malaysia Peninsula. The location of the river is presented in Fig. 1. Permission to collect the historical data was obtained from the Malaysian Department of Irrigation and Drainage. The measurements of the streamflow were collected using telemetry methods, where a sensor placed at the targeted area of the river measures the water level pressure. The case study site is predisposed to flooding because of heavy monsoon rainfall events. Hence, the adoption of such an intelligent model for streamflow prediction can be an important step in flood hydrology assessments. It can also be of significant economic, agricultural, and infrastructural importance. The total length of the Kelantan River is 248 km, and its drainage area is approximately 13,170 km². The daily flow data of the river were obtained for the observed period (see Table 1).
The original data of the Kelantan River used in this study include two datasets: streamflow and rainfall.

III. METHODOLOGY
In this study, the LSTM model was implemented in TensorFlow to predict the streamflow. The complete forecasting process is shown in Fig. 3. The original streamflow data are standardised by z-score normalisation, and then a recurrent neural network is created. The LSTM model is trained by means of parameter adjustment, and the pre-processed data are input into the trained LSTM. Finally, the predicted data are output to complete the LSTM-based streamflow forecast.
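Before a recurrent network can be trained on the pre-processed series, the daily records usually have to be arranged into input sequences and targets. The sketch below shows one common way to do this with a sliding window. The window length, the function name `make_windows`, and the choice of next-day streamflow as the target are assumptions made for illustration; the text does not state how the authors framed their samples.

```python
import numpy as np


def make_windows(features, target, window):
    """Turn a daily series into supervised samples.

    features: array of shape (T, F), e.g. F = 2 for streamflow and rainfall.
    target:   array of shape (T,), e.g. streamflow.
    Returns X of shape (T - window, window, F) and y of shape (T - window,),
    where y[k] is the target on the day right after window k.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - window
    if n_samples <= 0:
        raise ValueError("series is shorter than the window")
    X = np.stack([features[i:i + window] for i in range(n_samples)])
    y = target[window:]
    return X, y


# Toy usage with 10 days of two hypothetical features and a 3-day window.
toy_features = np.arange(20.0).reshape(10, 2)
toy_target = np.arange(10.0)
X, y = make_windows(toy_features, toy_target, window=3)
```

The resulting `X` has the (samples, time steps, features) layout that recurrent layers in most deep learning libraries expect.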

A. LONG SHORT-TERM MEMORY MODEL
The LSTM model was proposed by Hochreiter et al. in 1997 [27], and Gers et al. conducted a detailed study and elaboration of LSTM's forget gate in 1999 [28]. As shown in Fig. 4, the input gate, output gate, and forget gate are adopted by the LSTM model cell, which makes the model choose the states that have more influence on the current state, compared with the recurrent neural network model.

B. MODEL STRUCTURE
1) FORECASTING PROCESS
The LSTM model adopted in this study is a three-layer neural network, which consists of an input layer, a hidden layer, and an output layer. The neuron number of the hidden layer is set according to Formula (1).
where a is the neuron number of the input layer, b is the neuron number of the output layer, and c is a constant within the interval [0, 10].
In the work described in this article, the input layer size is 2, and the output layer size is 1. Combining Formula (1) and Table 2, we consider the number of hidden layer units as 30. Three thresholds in the LSTM cell follow Formulas (2) to (4).
VOLUME 8, 2020
where i(t) is the input threshold at time t, f(t) is the forget threshold at time t, o(t) is the output threshold at time t, x(t) is the input of the training sample at time t, h(t) is the output of the current unit at time t, W is the connection weight of the model, and b is the offset value of the model. Thus, the state function and output function of the LSTM unit follow Formulas (5) to (7).
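For reference, the textbook LSTM cell with a forget gate [27], [28] is usually written as below. This is the standard formulation, given here as a sketch; the subscripts on W and b, the candidate state \tilde{C}, and the concatenation [h, x] are notation assumed for this illustration, and the authors' own formulas may differ in detail.

```latex
\begin{align*}
i^{(t)} &= \sigma\!\left(W_i\,[h^{(t-1)}, x^{(t)}] + b_i\right) \\
f^{(t)} &= \sigma\!\left(W_f\,[h^{(t-1)}, x^{(t)}] + b_f\right) \\
o^{(t)} &= \sigma\!\left(W_o\,[h^{(t-1)}, x^{(t)}] + b_o\right) \\
\tilde{C}^{(t)} &= \tanh\!\left(W_C\,[h^{(t-1)}, x^{(t)}] + b_C\right) \\
C^{(t)} &= f^{(t)} \odot C^{(t-1)} + i^{(t)} \odot \tilde{C}^{(t)} \\
h^{(t)} &= o^{(t)} \odot \tanh\!\left(C^{(t)}\right)
\end{align*}
```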
where h(t−1) is the output of the unit at the previous time, h(t) is the output of the current unit, C(t−1) is the unit state at the previous time, and C(t) is the state of the unit at the current moment. The loss function of the LSTM model follows Formula (8).
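A loss consistent with the symbol definitions given for it, and with the mean squared error criterion mentioned in the Introduction, is the mean squared error below. Treating the loss as a plain MSE is an assumption; the authors' formula may include a different scaling.

```latex
\begin{equation*}
L = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
\end{equation*}
```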
where n is the number of the testing samples, y_i is the observed value at time i, and ŷ_i is the predicted value at time i. During the training process of the model, the weights of the LSTM units are adjusted according to the error. The adjustment method follows Formulas (9) and (10).
where h is the weight of the hidden layer, and x (t) C is the weight between the hidden layer and the output layer.
During the training process, the back-propagation of the error through the input gate, output gate, and forget gate follows Formulas (11) to (13).
where l is the activation function of the control gate, g is the input activation function of the unit, and h is the output activation function of the unit. Training stops when the error falls below the minimum expected error, at which point the model is considered to have converged, or when the maximum number of iterations is reached. Then, the testing data are input into the neural network to forecast the streamflow.

2) LONG SHORT-TERM MEMORY MODEL PARAMETERS USED IN THE EXPERIMENTS
The operating system used in this study was Microsoft Windows 10. TensorFlow 1.7.0, CUDA 9.0, and Python 3.6 were selected as the development environment.
The learning rate is one of the key parameters of the developed LSTM model. The optimised value of the learning rate is set to 0.00001, as shown in Table 2.
The other optimised values of the parameters of the developed LSTM model are as follows: the number of hidden layer units is 30; the input layer size is 2; the output layer size is 1; the learning rate is 0.00001; and the cycle index is 2000.
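The parameter values listed above can be collected in a small configuration object, which makes it easier to rerun the experiments with identical settings. The sketch below is illustrative only: the class name `LSTMConfig`, the helper `training_finished`, and the `target_loss` threshold are hypothetical, and reading the "cycle index" as the maximum number of training iterations is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LSTMConfig:
    """Hyperparameters reported in the text."""
    input_size: int = 2           # streamflow and rainfall
    hidden_units: int = 30
    output_size: int = 1
    learning_rate: float = 1e-5   # 0.00001
    max_iterations: int = 2000    # "cycle index" (assumed meaning)


def training_finished(loss, iteration, cfg, target_loss):
    """Stop when the loss is below the target or the iteration budget is spent.

    target_loss is a hypothetical threshold; the text does not give its value.
    """
    return loss < target_loss or iteration >= cfg.max_iterations


cfg = LSTMConfig()
```

Freezing the dataclass prevents the settings from being changed accidentally between experimental runs.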
The main functions involved in the LSTM model are shown in Table 3.

C. APPLICATION OF DATA AND EVALUATION PARAMETERS
1) DATA APPLICATION
In the experiments, 14976 sets of daily streamflow and rainfall data covering the period from 01/01/1964 to 31/12/2004 were selected as the original data set for streamflow prediction. Because the original data have a large variation range, z-score normalisation was applied as a preprocessing step, as shown by Formulas (14) to (16).
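Z-score normalisation is conventionally defined as below. This is the standard definition, given as a sketch: the prime on the normalised value is added here only to distinguish it from the raw value x_i, and the divisor n in the standard deviation is an assumption (n − 1 is equally common).

```latex
\begin{align*}
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i \\
s &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2} \\
x_i' &= \frac{x_i - \bar{x}}{s}
\end{align*}
```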
where x̄ is the average of the original time series data, s is the standard deviation of the original time series data, and x_i is the new sequence data formed by z-score normalisation.
MAPE is the average of absolute percentage errors. The closer the MAPE is to 0, the better the prediction result.
where n is the number of the testing samples, Sf_o and Sf_f are the actual and forecasted streamflow values, respectively, and Sf_o with an overbar is the mean value of the actual streamflow. RMSE is the square root of the mean squared deviation of the predicted values from the true values. The RMSE reflects the precision of the measurement and is used to measure the deviation between the predicted value and the true value. The value range of RMSE is [0, ∞). When RMSE is 0, the prediction result is the best.
RMSRE is the root of the mean squared relative error. The closer the RMSRE approaches 0, the better the prediction.
MAE is the average of the absolute values of the deviations. It uses the absolute value of the deviation, which prevents positive and negative deviations from cancelling out and reflects the actual size of the prediction error. The MAE value range is [0, ∞). When MAE is 0, the prediction is the best.
MRE is the average of the ratio of the deviation to the true value. The closer the MRE approaches 0, the better the prediction. BIAS reflects how much the predicted value deviates from the true value. The closer BIAS approaches 0, the better the prediction.
NSE reflects how well the model predicts. The value range of NSE is (−∞, 1]. The closer the NSE is to 1, the higher the reliability of the model. When the NSE is close to 0, the predictions are about as accurate as the average of the actual values: the overall model result is credible, but the process prediction error is large. If the NSE is much less than 0, the model is unreliable.
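The indices described above can be computed with a few lines of NumPy. The sketch below follows the common textbook definitions of MAPE, RMSE, RMSRE, MAE, and NSE; these are assumed to match the authors' formulas but have not been checked against them. MAPE is returned as a fraction rather than a percentage, and MRE and BIAS are left out because the verbal descriptions do not fix their sign convention. All function names are introduced here for illustration.

```python
import numpy as np


def _as_arrays(obs, pred):
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    if obs.shape != pred.shape:
        raise ValueError("obs and pred must have the same shape")
    return obs, pred


def mape(obs, pred):
    """Mean absolute percentage error as a fraction; obs must be non-zero."""
    obs, pred = _as_arrays(obs, pred)
    return float(np.mean(np.abs((obs - pred) / obs)))


def rmse(obs, pred):
    """Root mean squared error."""
    obs, pred = _as_arrays(obs, pred)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def rmsre(obs, pred):
    """Root mean squared relative error; obs must be non-zero."""
    obs, pred = _as_arrays(obs, pred)
    return float(np.sqrt(np.mean(((obs - pred) / obs) ** 2)))


def mae(obs, pred):
    """Mean absolute error."""
    obs, pred = _as_arrays(obs, pred)
    return float(np.mean(np.abs(obs - pred)))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    obs, pred = _as_arrays(obs, pred)
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))


# Toy usage with made-up numbers (not data from the study).
obs = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([2.0, 2.0, 3.0, 4.0])
scores = {f.__name__: f(obs, pred) for f in (mape, rmse, rmsre, mae, nse)}
```

Predicting the observed mean everywhere gives an NSE of exactly 0, which is the reference point mentioned in the text.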

IV. APPLICATION RESULTS AND ANALYSIS
The four aspects of research involved in the forecasting of streamflow are analysed in this section. This section is divided into four parts: A, B, C, and D. These four parts discuss the performance stability of the LSTM model from four different perspectives: feasibility, differences in training set size, differences in the period of the dataset used in the training set, and predictions of different durations. As shown in Table 4, the four parts correspond to the relevant information of the training set and the test set in the experiment.

A. FEASIBILITY VERIFICATION OF THE STREAMFLOW PREDICTION MODEL BASED ON LSTM
In this section, the feasibility of applying the LSTM model for streamflow prediction is verified. The classic BP model is adopted as a reference model to be compared with the LSTM model. The streamflow data and rainfall data during the period from 20/12/1995 to 06/03/2004 were used as the training set. The streamflow data and rainfall data during the period from 07/03/2004 to 31/12/2004 were used as the testing set.
As shown in Fig. 5, the curve of LSTM prediction results fits the curve of actual observation data well. In contrast, the prediction results of the BP model deviated significantly from the actual observation data, both for the smooth streamflow in the dry season and for the rapidly fluctuant streamflow in the rainy season.
As shown in Fig. 6, the residual distribution of prediction results by LSTM was better than the residual distribution of prediction results by BP. The residual of prediction results by LSTM was less than 175 m³/s in the worst case scenario where the streamflow changed rapidly at the points marked by the vertical dashed line in the rainy season. The LSTM model showed stronger robustness to residual distribution than the BP model. The corresponding residuals of prediction results by BP increased significantly in the worst case when the streamflow changed rapidly in the rainy season. Therefore, compared with BP, the residual fluctuation of LSTM has a clear advantage.
Fig. 7 shows the distribution of predicted and observed data by LSTM and BP. As shown in Fig. 7 (a), the predicted data of LSTM were close to the observed data except for a few numerical points with large errors (still in the acceptable range). As shown in Fig. 7 (b), the BP model was not sensitive to data changes in low streamflow segments, resulting in lower prediction accuracy. Although the sensitivity of the BP model to the data in the high streamflow segments is better, its prediction accuracy was still lower than that of LSTM.
[...] corresponding to the two models are similar, but BP's BIAS is slightly better than LSTM's. The NSE of LSTM is 0.98039, which is an improvement of nearly 0.1 compared to BP.
The LSTM model is well suited to the task of streamflow prediction for the Kelantan River. Moreover, compared with the classic BP model, the LSTM model has obvious prediction accuracy advantages, irrespective of whether it was a smooth streamflow in the dry season or a rapidly fluctuant streamflow in the rainy season.

B. EVALUATION OF TRAINING SET SIZE ON THE PERFORMANCE OF THE STREAMFLOW PREDICTION MODEL BASED ON LSTM
In this section, the effect of training set size on the performance of the streamflow prediction model based on LSTM is investigated. In the experiment, original data in the period from 07/03/2004 to 31/12/2004 (300 days) were used as the testing set. For the training set, the end date was fixed at 06/03/2004. By adjusting the start date of the training set, five training sets with different sizes were created. The sizes of the five training sets were 2000, 3000, 4000, 5000, and 6000 days.
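The five training periods can be derived from the fixed end date with simple date arithmetic. The sketch below is a minimal illustration; the function name `training_period` is hypothetical, and the periods are assumed to be counted inclusively, which reproduces the 3000-day period of 20/12/1995 to 06/03/2004 used in Part A.

```python
from datetime import date, timedelta

TRAIN_END = date(2004, 3, 6)    # fixed end date of every training set
TEST_START = date(2004, 3, 7)
TEST_END = date(2004, 12, 31)


def training_period(end_date, size_days):
    """Return the inclusive (start, end) dates of a period of size_days days."""
    start = end_date - timedelta(days=size_days - 1)
    return start, end_date


# Start date for each of the five training set sizes used in the experiment.
periods = {
    size: training_period(TRAIN_END, size)
    for size in (2000, 3000, 4000, 5000, 6000)
}
test_days = (TEST_END - TEST_START).days + 1
```

With these dates, `test_days` evaluates to 300 and the 3000-day period starts on 20/12/1995.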
The experimental results of our investigation of the impact of training set size on the prediction performance of the LSTM model are shown in Fig. 8 and Fig. 9. Fig. 8 is a data curve diagram of the prediction results of the LSTM model under different training set conditions. On the whole, the prediction result curves of the LSTM model have a good fit with the actual observations. For the streamflow prediction in the dry season, such as during the period from 5/2004 to 6/2004, as shown in the enlarged part (a), the curves have the best fit when 3000 and 4000 days of data are chosen as the training sets. However, for the streamflow prediction in the rainy season, such as during the period from 7/2004 to 8/2004 and from 8/2004 to 9/2004, as shown in the enlarged part (b) and (c), larger training sets make the prediction curve better fit the observed values.
According to Fig. 9, as the training set size increases, the distribution of the predicted values of the LSTM model has a tendency to approach the observed values gradually. Therefore, with an increase of the training set size, the prediction ability of the LSTM model improved. Table 6 shows the evaluation indices of the prediction results by the LSTM model with the five training datasets. As the size of the training set increases, the MAPE, RMSRE, MRE, and BIAS values tend to decrease first and then increase. RMSE, MAE, and NSE are steady and are not sensitive to the training dataset size changes. From the point of view of all seven evaluation indices, the training sets with 3000 days data and 4000 days data have the optimal performance for streamflow prediction.
In Fig. 10, the distribution ranges of MAPE, RMSE, and NSE corresponding to the experimental results of each training set are shown. In general, the prediction results of each training set were good. As the training sets increase in size, the three indices showed certain regularity. Fig. 10 (a) shows the change trend of the MAPE. The MAPE of each training set was relatively stable. However, the MAPE floating ranges using 3000-day and 5000-day training sets were large, and the MAPE floating ranges using 2000-day and 6000-day training sets were small. The MAPE floating ranges had the optimal value when the training set size was 4000 days. Fig. 10 (b) shows the change trend of the RMSE and Fig. 10 (c) shows the change trend of the NSE. Both RMSE and NSE had similar variation regularity as MAPE. Hence, the training set with 4000 days of data was the best choice for optimal prediction performance of the developed model. Fig. 11 shows the change curve of the loss value of the LSTM model under different training set conditions. The developed model converged faster as the size of the training set increased.
In summary, for the LSTM model trained with the historical streamflow data of the Kelantan River, as the size of training set data increases, the prediction performance of the model is optimised to a certain extent, and the model convergence speed is accelerated. However, there are certain limitations in optimising the prediction performance of the LSTM model simply by increasing the training set. With an oversized training set, the weights of the developed model tend to overfit the data.

C. EVALUATION OF THE TIME INTERVAL BETWEEN THE TRAINING SET AND TESTING SET ON THE PERFORMANCE OF THE STREAMFLOW PREDICTION MODEL BASED ON LSTM
In this section, the effect of the time interval between the training set and testing set on the prediction performance of the LSTM model is investigated. The purpose of the experiments was to test whether the selection of historical data from different time intervals as the training set will affect the prediction performance of the LSTM model. In the experiment, the size of the training set is set to 3000 days, and the size of the testing set is set to 300 days. Historical data in the period from 07/03/2004 to 31/12/2004 was selected as the testing set. Six groups of training sets with different time intervals to the testing set were selected, namely, no interval (that is, the data selected by the training set and the testing set are continuous), 1000 days, 2000 days, 3000 days, 6000 days, and 9000 days. The duration of the training set data was from 20/12/1995 to 06/03/2004. Hence, with a fixed size for the training set and different time intervals between the training set and testing set, experiments were performed to evaluate the performance of the streamflow prediction model based on LSTM.
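The interval experiment amounts to sliding a fixed-size training block back in time while the testing block stays at the end of the record. The sketch below expresses this with index slices over the 14976 daily records. The function name `split_with_gap` is hypothetical, and the assumption that day 0 corresponds to 01/01/1964 is made only for illustration.

```python
N_TOTAL = 14976     # daily records from 01/01/1964 to 31/12/2004
TEST_SIZE = 300
TRAIN_SIZE = 3000


def split_with_gap(n_total, test_size, train_size, gap):
    """Return (train_slice, test_slice) as index slices over the daily series.

    The test block is the last test_size records; the training block ends
    gap days before the test block starts.
    """
    test_start = n_total - test_size
    train_end = test_start - gap            # exclusive end index
    train_start = train_end - train_size
    if train_start < 0:
        raise ValueError("not enough history for this gap and training size")
    return slice(train_start, train_end), slice(test_start, n_total)


# One split per interval listed in the text.
splits = {
    gap: split_with_gap(N_TOTAL, TEST_SIZE, TRAIN_SIZE, gap)
    for gap in (0, 1000, 2000, 3000, 6000, 9000)
}
```

Even the largest interval of 9000 days leaves enough history in the record for a 3000-day training block.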
To further investigate the effect of the time interval between the training set and testing set on the prediction performance of the LSTM model, more analysis has been done, as shown in Fig. 12 and Fig. 13. Fig. 12 is a data curve diagram of the prediction results by the LSTM model with different time intervals. The corresponding prediction result curves of each experimental group are close to the actual observation data, and the results of each group are satisfactory. According to the detailed local information of curves, the experimental data of each group accorded with the multi-peak fluctuations. Fig. 13 shows the distribution of predicted and observed values for the LSTM model at different time intervals. As shown in the figures, there was no significant change in the overall distribution of the predicted values of each group, and only a few numerical points changed significantly.
Table 7 [...] of fluctuation. The interesting finding is that multiple predictive evaluation indices obtained the optimal value when the time interval between the training set and testing set was set to 9000 days.
Fig. 14 shows the MAPE, RMSE, and NSE of the predictions of the developed model by changing the time intervals between the training set and testing set. As the time interval increased, the value distribution of the three indices showed multi-peak fluctuations. When the interval was set to 1000 days, the three indices achieved poor results. When the interval was set to 6000 days, the three indices achieved better results. When the interval was set to 3000 days, the three indices were at their most stable stage, but the probability of the occurrence of extreme values increased.
The results shown in Table 7 are consistent with the trend of performance changes in Fig. 14. When the time intervals between the training set and testing set data change, the prediction performance of the LSTM model shows a small multi-peak fluctuation. Extreme points appeared in experiments when the time intervals were set to 1000 days, 2000 days, and 3000 days.
As shown in Fig. 15, the time intervals between the training set and testing set had little impact on the convergence speeds of the model. When the time interval was set to 4000 days, the model convergence speed decreased slightly, and the other groups all have similar convergence speeds.
In summary, according to the experimental results shown in Table 7 and Fig. 12 to Fig. 15, the time interval between the training set and testing set has little impact on the prediction performance of the developed LSTM model trained with the historical streamflow data of the Kelantan River. This is a useful property for the model training process when a small amount of the historical data is missing: such gaps are unlikely to affect the prediction ability of the developed model.

D. EVALUATION OF THE TIME SPAN OF PREDICTION DATA ON THE PERFORMANCE OF THE STREAMFLOW PREDICTION MODEL BASED ON LSTM
In this section, the effect of the time span of prediction data on the prediction performance of the developed LSTM model is investigated. The experimental settings of this section are as follows: the training set size is set to 3000 days, and the training set and the testing set have no time interval. The training set is selected from the data set in the period from 23/06/1993 to 08/09/2001. For the testing set, each group took 09/09/2001 as the initial date of the testing set. Then, testing sets with different time spans were selected. In the experiment, the time spans of the following five groups of testing data were set as 100 days, 300 days, 500 days, 700 days, and 1000 days.
Fig. 16 shows the curves of predicted data and the corresponding observed data using the five groups of testing data. As shown in Fig. 16 (a), the predicted data with a 100-day span can fit the observed data well. The results for the longer spans are shown in Fig. 16 (b) to Fig. 16 (e).
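Because every testing set in this experiment starts on the same date, one way to compare spans is to score growing prefixes of a single prediction series. The sketch below does this with RMSE. It is an assumption that the spans can be evaluated as prefixes of one run; the authors may have run each span separately. The function names and the synthetic arrays are hypothetical and are not data from the study.

```python
import numpy as np

SPANS = (100, 300, 500, 700, 1000)   # testing spans in days


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def score_by_span(obs, pred, spans):
    """RMSE over the first `span` days of the testing period, for each span."""
    if max(spans) > len(obs):
        raise ValueError("longest span exceeds the available testing data")
    return {span: rmse(obs[:span], pred[:span]) for span in spans}


# Synthetic example: perfect for the first 100 days, off by one afterwards.
obs = np.zeros(1000)
pred = np.concatenate([np.zeros(100), np.ones(900)])
scores = score_by_span(obs, pred, SPANS)
```

The same loop can be repeated with any of the other evaluation indices in place of RMSE.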
Fig. 17 (a) to Fig. 17 (e) show the distribution of the predicted data and the observed data for the five time spans; as the time span increased, the agreement between the two distributions deteriorated. Table 8 lists the evaluation index values of the LSTM model prediction results for the different prediction data time spans. The prediction results with a 100-day span achieved the best performance in terms of the indices. The other four testing groups have similar MAPE and NSE values. Compared with the prediction results for 500 days, 700 days, and 900 days, the prediction results for 300 days have better RMSE and MAE, but worse RMSRE, MRE, and BIAS. Fig. 18 illustrates the stability of the prediction results of the LSTM model for the different prediction data time spans. As shown in Fig. 18 (a), the MAPE for the 100-day data shows a clear advantage, and the remaining four groups of prediction results are similar. When the time span exceeds 500 days, extreme values become more likely. As shown in Fig. 18 (b) and Fig. 18 (c), the RMSE and NSE for the 100-day data also achieved the best performance. Although the RMSE and NSE for the 300-day data have good values, the predicted data have a large fluctuation range.
The results in Fig. 18 and Table 8 show that the stability and accuracy of the prediction performance of the LSTM model decreased slightly as the time span became larger. From the point of view of the performance indices, each experimental group nevertheless obtained good prediction results that still meet the requirements for high-accuracy streamflow prediction.
In summary, for the LSTM model trained with the historical streamflow data of the Kelantan River, the results of this experiment show that the model can complete prediction tasks with different prediction data time spans, and can ensure the accuracy and stability of prediction results. The LSTM model can be used for streamflow prediction in a variety of time spans.
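For readers who want to reproduce this kind of comparison, a minimal sketch of four of the indices named above (RMSE, MAE, MAPE, and NSE) is given below. It uses the standard textbook definitions, which are assumed here to match the definitions used for Table 8; RMSRE, MRE, and BIAS are omitted because their exact definitions are not restated in this section. The function names and the sample values are hypothetical and are not data from the Kelantan River.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


# Made-up observed and predicted values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

for name, metric in (("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("NSE", nse)):
    print(f"{name}: {metric(observed, predicted):.4f}")
```

Applying the same functions to each testing window separately would give one row of indices per time span, which is the layout the text describes for Table 8.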

V. CONCLUSION
In this study, a deep learning data-intelligence model based on LSTM was developed to forecast the streamflow of the Kelantan River. The following four aspects were studied: (i) the feasibility of applying the developed LSTM to streamflow forecasting, and the performance comparison between the developed LSTM model and the classic BP model; (ii) the impact of the amount of data in the training set on the prediction accuracy of the developed LSTM model; (iii) the impact of the time interval between the training set and testing set on the performance of the developed LSTM model; (iv) the impact of the time span of predicted data on the developed LSTM performance. The main conclusions from this study are as follows.
1. The developed LSTM model shows excellent accuracy in capturing the time series of streamflow. Compared with the classic BP model, the developed LSTM model has obvious advantages in prediction accuracy for both the smooth streamflow in the dry season and the rapidly fluctuating streamflow in the rainy season.
2. As the size of the training set data increases, the prediction performance of the developed LSTM model is optimised to a certain extent, and the model convergence speed is accelerated. However, according to the performance indices, there are certain limitations in optimising the prediction performance of the developed LSTM model simply by increasing the training set. The reason is probably that the developed LSTM model is overfitting the data.
3. The time interval between the training set and testing set has little impact on the prediction performance of the developed LSTM model trained with historical streamflow data of the Kelantan River. It is a useful feature of the model training process that small amounts of missing historical data will not affect the prediction ability of the developed model.
4. In the task of predicting different time spans, although the stability and accuracy of the prediction performance of the developed LSTM model decrease slightly as the time spans become larger, from the point of view of the performance indices, each experimental group can still obtain good prediction results that meet the requirement for high-accuracy streamflow prediction. Based on this strength, the developed LSTM model can be used for streamflow prediction in a variety of time spans.
This method still has room for improvement. In subsequent research, a bionic intelligent algorithm can be introduced into the LSTM model in the pre-processing phase. In the forecasting model phase, combining the LSTM model with other models is an important consideration for future research.

VI. CONFLICT OF INTEREST
The authors have no conflict of interest to any party.
He is currently a Professor with the Department of Civil, Environmental and Natural Resources Engineering, Lulea Technical University, Sweden. He has served in several academic administrative posts (the Dean, the Head of Department). His publications include more than 424 articles in international/national journals, chapters in books, and 13 books. He has executed more than 60 major research projects in Iraq, Jordan, and the U.K. He has one patent on physical methods for the separation of iron oxides. He has supervised more than 66 postgraduate students at universities in Iraq, Jordan, the U.K., and Australia. His research interests are mainly in geology, water resources, and environment. He is also a member of several scientific societies, e.g., the International Association of Hydrological Sciences, the Chartered Institution of Water and Environment Management, and the Network of Iraqi Scientists Abroad, and is the Founder and the President of the Iraqi Scientific Society for Water Resources. He is also a member of the editorial board of ten international journals. He has received several scientific and educational awards; among them, the British Council, on its 70th anniversary, named him one of the top five scientists in Cultural Relations.
ZAHER MUNDHER YASEEN received the master's and Ph.D. degrees from the National University of Malaysia (UKM), Malaysia, in 2017. He is currently a Senior Lecturer and a Senior Researcher in the field of civil engineering at Ton Duc Thang University. His major fields are surface hydrology, water resources engineering, hydrological processes modeling, environmental engineering, and climate. In addition, he has expertise in machine learning and advanced data analytics. He has published over 100 articles in international journals, with a Google Scholar H-Index of 23 and a total of 1588 citations.
VOLUME 8, 2020