Highly Accurate Prediction Model for Daily Runoff in Semi-Arid Basin Exploiting Metaheuristic Learning Algorithms

Developing trustworthy rainfall-runoff (R-R) models can provide useful information for planning and managing water resources. The use of artificial neural networks (ANNs) to build such models and predict changes in runoff has long been popular among hydrologists. However, since optimization is the most significant phase in ANN training, researchers' attention has been drawn to the ANN's biggest problem, i.e., its susceptibility to becoming trapped in local minima. Consequently, among modern heuristic optimization approaches, the use of genetic algorithms (GA), particle swarm optimization (PSO), the firefly algorithm (FFA) and improved particle swarm optimization (IPSO) to increase the performance of ANNs has gained remarkable interest. In this paper, the capability of four improved ANN methods, namely hybrid GA-based ANN, PSO-based ANN, FFA-based ANN and IPSO-based ANN, in modeling rainfall-runoff (R-R) is investigated. IPSO is used to increase the ability of PSO: the new positions of particles are dynamically adjusted using two procedures, one given by the velocity obtained from PSO and the other by the velocity proposed in IPSO. A normally distributed random number with a dynamic scale factor is used to compute the new position of the best particles in the proposed velocity. Daily R-R data from six stations distributed across the Seybouse watershed, located in a semi-arid region of Algeria, were used in model development. The input data sets were selected using the autocorrelation, partial autocorrelation and cross-correlation functions. The results of the four hybrid models were compared via performance metrics, viz., Root Mean Square Error (RMSE), Pearson's correlation coefficient (R) and Nash-Sutcliffe Efficiency coefficient (NSE), and via graphical analysis (scatter plots, time series and Taylor diagram).
The analysis at all study stations showed that the ANN models enhanced with IPSO outperformed the GA-based ANN, PSO-based ANN and FFA-based ANN models in estimating runoff for both the training and testing periods. The study indicates that, of the four techniques compared, the IPSO hybrid metaheuristic algorithm is the best for improving ANN capability in modeling daily R-R.


I. INTRODUCTION
Since the dawn of time, water has been a predominant factor in the socio-economic development of human beings.
The associate editor coordinating the review of this manuscript and approving it for publication was Bo Pu.
It plays a part in the whole functioning of the natural environment and is a main life resource for many plants and animals. However, with the demographic explosion, industrial growth and the various ways of life in many areas worldwide, water requirements have increased considerably. This has caused a mismatch between water demand and water supply; problems of water availability have therefore been amplified, and water resources have come under high pressure. As a result, appropriate management of these resources has become a major concern in order to minimize this pressure or to bridge the gap between water availability and demand [1]-[3]. To design water management strategies that make water available at any time, a prediction of the interrelationship between the two main components of the hydrological cycle, rainfall and runoff, is necessary [4], [5]. Developing accurate models to simulate the rainfall-runoff process can also help to manage water scarcity problems. However, the conversion of rainfall to runoff is a highly nonlinear, stochastic and strongly complex process [6], as several meteorological parameters and various other subprocesses influence this complicated system [7], [8]. Therefore, hydrologists and researchers have developed various rainfall-runoff (R-R) models in order to capture and represent this intricate phenomenon, where the model has to be selected according to its ability and level of complexity [9]. Generally, these models are categorized into (i) physics-based techniques, which offer better understandability but poorer accuracy, and (ii) empirical or data-driven techniques, which are based on measured data and provide highly accurate results [10].
The artificial neural network (ANN) is one of these data-driven approaches and has been used in diverse fields such as hydrology and water resources. It has been advocated for its capability of tackling, modeling and forecasting the nonlinear or stochastic problems within the R-R system [11]-[17]. Since ANNs cannot represent the internal structure of the catchment or handle spatially distributed environmental data related to the physical characteristics of the basin, they do not replace conceptual watershed modeling. However, they have been recognized as a viable alternative to conceptual models for input-output forecasting because of several advantages, among which are their computational speed in simulation and forecasting [18] and their capacity to yield easy-to-use and more accurate models of complex natural systems with many inputs [19]. The ANN has proven to be a useful model for problem-solving and machine learning, and it has shown its power and capacity to simulate hydrological phenomena. Therefore, ANN models are recommended for R-R modeling owing to their simple structure and precision, which help to solve problems related to water resources management.
As many learning algorithms can be applied to enhance ANNs, a wide range of possibilities remains open. Although ANNs are extremely well known in flood prediction, no clear conclusion has been reached as to which model performs better in a given application [13]. Most studies have applied the feed-forward backpropagation (FFBP) network in ANN model development.
In the last few years, several optimization tools have been used to enhance the potential of the backpropagation algorithm, including gradient descent (GD), which is commonly applied in the backpropagation stage of the neural network training process [20] and is formulated as reducing the error between measured and predicted output at every iteration. Nevertheless, GD may suffer from convergence issues, slow training, entrapment in local minima and overfitting; if the model structure is intricate and the parameter set is large, this results in poor performance of ANN models [7], [10], [21]. Recently, several heuristic tools have been created to overcome the deficiencies of gradient-based techniques, to facilitate the solution of difficult optimization problems and to obtain the optimal ANN parameters in training, thereby enhancing its efficiency. These tools include the artificial bee colony algorithm (ABC), biogeography-based optimization (BBO), differential evolution (DE), the grey wolf optimizer (GWO), the genetic algorithm (GA), particle swarm optimization (PSO) and the firefly algorithm (FFA) [22]. Even though the standard ANN method is old, its hybrid versions with these metaheuristic algorithms are now commonly used to solve complex problems in various fields, such as modeling solar energy systems, the injection molding process, rock engineering and rock fragmentation [23]-[26].
Genetic algorithms are among the most popular evolutionary algorithms. They are suitable for search, adaptation and learning in a variety of application areas, particularly for problems where nonlinear data and model intricacy lead to poor results. This algorithm has been widely shown to offer precise optimization solutions to search problems through simulated evolution. In conjunction with intelligence techniques, the GA has become a powerful method for modeling and optimizing complex processes [27]-[29]; it is used to tune ANN parameters and improve the model's efficiency [30], [31]. In addition, GAs are population-based, and many modern evolutionary algorithms are directly based on genetic algorithms or have strong similarities to them. There are several studies on the applicability of GAs in the hydrological sciences. [32] used real-coded GAs for training ANN R-R models to predict daily flow, and found them more precise than backpropagation-based ANN models. [33] suggested an intelligent hybrid model that combines data preprocessing methods, genetic algorithms and the Levenberg-Marquardt (LM) algorithm to train a feed-forward NN for runoff prediction.
On the other hand, PSO has become one of the most widely used swarm intelligence-based algorithms owing to its simplicity and flexibility. Rather than using mutation and crossover, it uses real-number randomness and global communication between the particles in the swarm. It is therefore easier to implement than GAs. It can also be used to optimize irregular, nonlinear systems and solve complex problems, and it converges quickly towards the ideal solution within a certain number of iterations. Moreover, PSO can be used as a training algorithm for ANN models [21], [34]. Satisfactory results have also been obtained in various studies on hydrological problems. [35] suggested a PSO-based perceptron approach to forecast water stage in the Shing Mun River in Hong Kong. [7] proposed a new hybrid metaheuristic algorithm combining biogeography-based optimization (BBO), particle swarm optimization (PSO) and the grey wolf optimizer (GWO), integrated with ANN and ANFIS (adaptive neuro-fuzzy inference system), for modeling the R-R process in the Fal watershed at Tregony. [36] proposed three optimization algorithms integrated with ANFIS, i.e., particle swarm optimization (ANFIS-PSO), genetic algorithm (ANFIS-GA) and differential evolution (ANFIS-DE), for forecasting the monthly streamflow of the Pahang River, located in a tropical climatic region of Peninsular Malaysia. [10] applied PSO to train an ANN R-R model in the Jardin river basin.
The swarm-based firefly algorithm (FFA) is receiving considerable research attention, with a number of studies reporting favorable improvements in modeling accuracy [37], [38]. It is a relatively new optimization approach that is straightforward and has a strong potential to converge more quickly to optimum solutions than other intelligent techniques [39], because the global and local optima of the predictor data can be found simultaneously and efficiently [40]-[43]. In such optimization problems, it was experimentally seen to surpass PSO. Recently, [44] examined its validity for ANN training in classification problems and compared its reliability with GA and ABC. [45] developed an integrated adaptive neuro-fuzzy inference system with the firefly algorithm (ANFIS-FFA) to forecast monthly rainfall in the Pahang River catchment, Malaysia. [46] adopted a novel simulation approach called the multilayer perceptron-firefly algorithm (MLP-FFA) for monthly streamflow forecasting in the Ajichay watershed, East Azerbaijan. [47] adopted a novel approach based on the integration of support vector regression (SVR) and FFA for rainfall prediction at two stations situated in a semi-arid area of Iran.
Finally, a new version of the PSO algorithm, improved PSO (IPSO), has been shown in many studies to solve multi-objective combinatorial optimization problems; it can restrict the position change between original and new particles during the iteration process and accelerate the convergence of the algorithm [48]. [49] applied an improved PSO to train an artificial neural network (ANN) for water level prediction in the Heshui watershed in China. [50] introduced a new prediction model for solar radiation based on an improved support vector regression (SVR) integrated with IPSO; its application showed its superiority over the other models considered (multivariate adaptive regression splines 'MARS', genetic programming 'GP', SVR-PSO, SVR-GA, SVR-FFA and the M5 tree model).
These algorithms have different advantages and specific procedures for modeling complex phenomena. Their study in hydrology, and in R-R modeling in particular, is still at an early stage, and little research has applied them to hydrological problems and real-time flow forecasting; investigating and comparing these models in hydrology is therefore highly recommended. Moreover, the capacity of IPSO and FFA to improve ANN efficiency in modeling R-R has not been investigated previously, nor has any study brought together all the algorithms mentioned above to model this phenomenon. This motivated the present research.
The principal aim of the present article is to investigate the capability of ANN-IPSO in modeling R-R so as to provide an efficient method for solving this complex hydrological problem. To assess the viability of IPSO in improving ANN efficiency, the method was compared with three other commonly used nature-inspired metaheuristic optimizers, GA, PSO and FFA, each integrated into the ANN as a training algorithm for R-R modeling in the Seybouse Basin, situated in a semi-arid region. Autocorrelation and cross-correlation functions were applied to define the optimal model input scenario. Observed daily R-R datasets were used to train and test the hybrid models (ANN-GA, ANN-PSO, ANN-FFA and ANN-IPSO). To identify the most powerful ANN training algorithm, the accuracy of the hybrid models was evaluated and compared using performance metrics (Root Mean Square Error (RMSE), Pearson's correlation coefficient (R) and Nash-Sutcliffe Efficiency (NSE)) and graphical analysis (scatter plots, time series and Taylor diagram). The novelty of this research is the investigation of the feasibility of a new structure for an ANN model integrated with improved PSO and FFA as optimizers for R-R modeling.
The computer codes for all model combinations, as well as the selection process for their architectures, were programmed in MATLAB ('MATLAB R2018b', purchased with its complete platform and licenses).
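The three performance metrics named above can be illustrated with a short sketch. It assumes the usual textbook definitions of RMSE, Pearson's R and NSE, which may differ in detail from the authors' MATLAB implementation; the function names are illustrative only, and Python is used here purely for exposition.

```python
import math

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def pearson_r(obs, sim):
    """Pearson's correlation coefficient between the two series."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim))
    return cov / (so * ss)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mo = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den
```

A perfect simulation gives RMSE = 0 and NSE = 1, while a simulation that always returns the observed mean gives NSE = 0.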

II. STUDY BASIN AND DATA ACQUISITION
A. STUDY ZONE
The Seybouse watershed is situated in north-eastern Algeria and is one of the constituents of the large hydrographic basin named CONSTANTINOIS-SEYBOUSE-MELLEGUE [47]; it has a significant latitudinal extension and occupies an area of 6,471 km². The main river draining this watershed, Oued Seybouse, has a total length of 240 km; it originates in the high plains of Heractas and Sellaoua and ends in the coastal plain of Annaba, where it flows into the Mediterranean. It is formed by the confluence of the wadis Cherf and Bouhamdane at Madjez Amar and receives two other tributaries of unequal importance: the Oued Mellah and the Oued Ressoul. A set of dams has been built on the wadis of the Seybouse watershed, intended mainly for irrigation and water supply. These dams include Hammam-Debagh on Wadi Bouhamdane, Foum El Khanga on the upstream Wadi Cherf, Koudiat Harricha on the downstream Cherf, Koudiat Mahcha in the lower (basse) Seybouse, other small dams built on the upstream Wadi Cherf (Tiffech, Sedrata), Ben Badis on the El Heria wadi, a small tributary of the Bouhamdane wadi, and M'djez El Bgar on the downstream Cherf. Overall, the water resources of this basin are vital to most economic activities in the region. Figure 1 displays the location map of the Seybouse watershed as well as the distribution of the various hydrometric and rainfall stations in the six sub-basins used in this study.

B. DATA ACQUISITION
Precipitation and river flow are the most commonly measured components of the hydrological cycle, and they are crucial for any hydrological modeling [52], [53]. Thus, the database compiled from these two parameters at the six stations spread throughout the Seybouse catchment, covering a different period for each station (because data were not available at these stations for the same duration), was used to simulate the R-R relationship. Further information about these stations is given in Tables 1 and 2, while their locations are shown in Figure 1.
In developing the model, the first stage is to divide the data into subsets for training and testing. The principal goal of this stage is to ensure that the model performs with a consistent degree of accuracy when it encounters unseen data rather than the training data. According to [54], the best results are attained if 20-30% of the original data points are allocated for testing and the remaining 70-80% are used for training. For this division, we get accuracy estimates which are:
• valid, in the sense that they do not overestimate the accuracy (i.e., do not underestimate the approximation error), and
• the most accurate among the valid estimates, i.e., their overestimation of the approximation error is the smallest possible.
In this context, the data used in this research were divided into two parts: the first, 80% of the collected data, was used to train the models, and the second, the remaining 20%, was used to test the calibrated models and examine their performance. The input dataset x* was normalized to the range [0.1, 0.9] using Eq. (1), where x is the historical data and x_min and x_max are the minimum and maximum values, respectively. All these data were acquired from the National Water Resources Agency (ANRH) of Algiers. Here, Max, Min and Mean denote the maximum, minimum and average values of the observation series in the training and testing phases for each runoff station.
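The data preparation described above can be sketched as follows. Eq. (1) is not reproduced in this text, so the sketch assumes the usual linear min-max mapping onto [0.1, 0.9]; it also assumes a chronological 80/20 split, which the text does not state explicitly. Function names are illustrative.

```python
def normalize(series, lo=0.1, hi=0.9):
    """Assumed form of Eq. (1): linear min-max scaling onto [lo, hi]."""
    x_min, x_max = min(series), max(series)
    span = x_max - x_min
    return [lo + (hi - lo) * (x - x_min) / span for x in series]

def chrono_split(series, train_fraction=0.8):
    """Assumed chronological split: first 80% for training, last 20% for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

runoff = [0.0, 2.5, 5.0, 7.5, 10.0, 4.0, 1.0, 0.5, 3.0, 6.0]  # made-up values
train, test = chrono_split(normalize(runoff))
```

Whether x_min and x_max are taken from the whole record or from the training part only is a design choice the text leaves open; the sketch uses the whole series for simplicity.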
As shown in Table 2, enough data points are available for the six study stations; the more data points the models use, the more precise their estimates.
In fact, the accuracy also depends on the model's data requirements, the quality of the data (which must be good) and the model setup, i.e., whether the model is lumped or distributed; every model has its own set of equations and input requirements. The best practice is to adjust the model parameters until the calibration and validation results improve. In recent years, the ANN method has drawn considerable interest from scientists for predicting nonlinear systems, owing to its high learning potential without any physical knowledge of the process to be modeled [55].
The basic principle of data handling in ANNs is inspired by the biological nervous system [56]. ANNs comprise a large number of nodes that are linked together to address a wide range of problems.
In the present study, the ANN is based on the multilayer perceptron (MLP) structure. An MLP is typically composed of three layers (input layer, hidden or intermediate layer, and output layer), as shown in Figure 2 [57]. Each layer may contain a different number of neurons, which are connected to each other by links called weights (w). The input layer nodes transmit the input signal values to the intermediate layer nodes. Similarly, the intermediate layer nodes transmit the signal values to the output layer nodes. Finally, the output layer delivers the simulated results. Eq. (2) determines the output of each layer, where y is the layer's output, y_i is the input of a layer, w_i are the weights and b is the bias. The logistic sigmoid and hyperbolic tangent functions are the most commonly used transfer functions. [58] pointed out that training with the tangent function is not only quicker than training with the logistic sigmoid transfer function, but that the forecasts obtained with tangent networks are also marginally better than those obtained with logistic sigmoid transfer functions. [59] indicated that it is more difficult to train ANNs with the logistic sigmoid function than with the tangent function. [60] found that in outflow estimation, the tangent sigmoid function worked much better than the logistic sigmoid function.
As a result, the activation function adopted in this study was the hyperbolic tangent sigmoid function, Eq. (3). In this paper, the ANN was adopted to predict the rainfall-runoff process over several time horizons. Four algorithms, namely the genetic algorithm, particle swarm optimization, the firefly algorithm and improved particle swarm optimization, were applied to determine the improved set of ANN variables. Further explanation of these techniques is given in the following sections.
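A minimal sketch of the layer computation and activation function may help here. Eqs. (2) and (3) are not reproduced in this text, so the sketch assumes the common forms y = f(sum_i w_i y_i + b) and tansig(x) = 2 / (1 + exp(-2x)) - 1, the latter being mathematically identical to tanh(x). The linear output neuron is also an assumption.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid, assumed form of Eq. (3); equals tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def neuron_output(inputs, weights, bias, activation):
    """Assumed form of Eq. (2): weighted sum of inputs plus bias, then activation."""
    return activation(sum(w * y for w, y in zip(weights, inputs)) + bias)

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Single-hidden-layer MLP: tansig hidden units, linear output (assumed)."""
    hidden = [neuron_output(x, w, b, tansig) for w, b in zip(hidden_w, hidden_b)]
    return neuron_output(hidden, out_w, out_b, lambda s: s)
```

For example, `mlp_forward([0.2, 0.1, 0.3], [[0.5, -0.2, 0.1]], [0.0], [1.0], 0.0)` evaluates a network with one hidden neuron.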

2) GENETIC ALGORITHM (GA)
The genetic algorithm (GA) was first created by John Henry Holland. This metaheuristic algorithm is a machine learning model that derives its behavior from a description of evolutionary systems in nature. It has been used to tune the parameters of control processes that are complicated and hard to optimize with traditional techniques. This is achieved by creating, on a computer, a population of individuals represented by chromosomes (similar to the chromosomes that carry human DNA) [61]. In nature, the coding of genetic information generally results in offspring that are genetically similar to the parent. Sexual reproduction makes it possible to create genetically very different offspring of the same species. Put simply, a pair of chromosomes meet at the molecular scale, exchange sets of genetic information and separate from each other. This is the recombination operation, which in GA is called crossover due to the manner in which the genetic information passes from one chromosome to the other. Other bio-inspired operators include mutation and regeneration. In the regeneration operator, two arbitrary candidates are chosen; the weaker one is removed and the other is duplicated. In mutation, a candidate is randomly altered; an extremely low mutation rate can lead to genetic drift [62].
Because ANNs and GAs are popular methods that have been used by various researchers to solve nonlinear problems such as modeling and forecasting rainfall-runoff systems, a hybrid model combining these two methods (ANN-GA) is introduced. This procedure adjusts artificial neural network variables such as the momentum term and the number of neurons in the intermediate layers. Although this method may lengthen the training procedure, the genetic algorithm reduces error and inaccuracy considerably, which makes the technique worthwhile. In various ANN structures, the GA technique improves components of the ANN such as the number of neurons in the hidden layers, with the network trained by the Levenberg-Marquardt method.
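The three operators described above (regeneration, crossover and mutation) can be sketched as a small real-coded GA. This is an illustrative sketch only, not the authors' implementation: the single-point crossover, the Gaussian mutation, the elitism step, the initial range and all default parameter values are assumptions.

```python
import random

def regenerate(pop, fitness):
    """Pick two random candidates; drop the weaker and copy the other."""
    a, b = random.sample(pop, 2)
    return list(a if fitness(a) <= fitness(b) else b)

def crossover(p1, p2):
    """Single-point crossover (assumed): the parents swap their tails."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind, rate=0.1, sigma=0.1):
    """Gaussian mutation (assumed): perturb each gene with probability rate."""
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g for g in ind]

def ga_minimize(fitness, dim, pop_size=20, generations=50, seed=0):
    """Minimize fitness over vectors of length dim (dim >= 2)."""
    random.seed(seed)
    pop = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = [list(best)]  # elitism (assumed): keep the best candidate
        while len(children) < pop_size:
            c1, c2 = crossover(regenerate(pop, fitness), regenerate(pop, fitness))
            children += [mutate(c1), mutate(c2)]
        pop = children[:pop_size]
        best = min(pop, key=fitness)
    return best
```

In an ANN-GA setting, `fitness` would be the training error of the network whose parameters are encoded in the candidate vector; here any function of a list of floats can be passed, e.g. `ga_minimize(lambda w: sum(v * v for v in w), dim=4)`.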

3) PARTICLE SWARM OPTIMIZATION (PSO)
PSO was first defined by Eberhart and Kennedy (1995). Its idea stemmed from the social behavior of creatures in a flock or swarm. Though it was originally created as a mechanism for simulating social behavior, the PSO technique has been identified as a computational intelligence algorithm closely linked to meta-heuristic evolutionary search optimization algorithms [63]. The evolutionary update given by Eqs. (4) and (5) is the main feature of PSO that differentiates it from other optimization algorithms.
where v_new, v, p_new and p signify the new velocity, current velocity, new position and current position, respectively, of a specific particle; c_1 and c_2 denote the cognitive and social coefficients, respectively; p_best is the best position found by this particle; g_best is the best position known to the swarm; and r_1 and r_2 are random numbers between 0 and 1 [64]. The first phase in solving an optimization problem with PSO is to introduce a set of arbitrarily chosen particles (i.e., to initialize the ANN weights). Every particle (i.e., a set of ANN weights) is assigned an arbitrary position and velocity. In the next phase, an iterative process is applied to find the best possible solution; the p_best and g_best values are recorded at each iteration. Then, using Eqs. (4) and (5), the positions of the particles change based on their own experience and that of the other particles. Particle positions are updated until the best solution is obtained [64].
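The update rule defined by these symbols can be sketched in code. Eqs. (4) and (5) are not reproduced in this text, so the sketch assumes the basic PSO form without an inertia weight, v_new = v + c_1 r_1 (p_best - p) + c_2 r_2 (g_best - p) and p_new = p + v_new; the default values of c_1 and c_2 are placeholders, not the values of Table 4.

```python
import random

def pso_step(p, v, p_best, g_best, c1=2.0, c2=2.0, rng=random):
    """One particle update under the assumed forms of Eqs. (4) and (5)."""
    r1, r2 = rng.random(), rng.random()  # random numbers in [0, 1)
    v_new = [vi + c1 * r1 * (pb - pi) + c2 * r2 * (gb - pi)
             for vi, pi, pb, gb in zip(v, p, p_best, g_best)]
    p_new = [pi + vn for pi, vn in zip(p, v_new)]
    return p_new, v_new
```

A particle that already sits on both its own best and the swarm's best, with zero velocity, stays where it is; otherwise it is pulled towards the two best positions by random amounts.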
As ANNs can become stuck in local minima, hybrid methods such as PSO-based ANNs are promising. The PSO component of this kind of hybrid model is capable of finding a global minimum and continuing the search. Therefore, a hybrid ANN model based on PSO has the benefits of both techniques: PSO explores the search space for minima, and the ANN uses them to arrive at the optimal solution. In ANN-PSO, every particle (a set of ANN weights) is a candidate solution for reducing the error. After optimization, the improved weights are used for network training. In fact, the aim of introducing PSO into the ANN is to strengthen the ANN's training process.
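To make the idea of a particle as a set of ANN weights concrete, the sketch below decodes a flat parameter vector into a network with 3 inputs and 10 hidden neurons (the architecture reported later in the paper) and scores it by RMSE. The parameter layout, the linear output neuron and the use of RMSE as the fitness are assumptions made for illustration.

```python
import math

N_IN, N_HID = 3, 10                          # 3 inputs, 10 hidden neurons
N_PARAMS = N_HID * (N_IN + 1) + N_HID + 1    # 51 weights and biases in total

def predict(particle, x):
    """Decode a flat particle into MLP parameters and evaluate the network."""
    k, hidden = 0, []
    for _ in range(N_HID):
        w, b = particle[k:k + N_IN], particle[k + N_IN]
        k += N_IN + 1
        hidden.append(math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b))
    out_w, out_b = particle[k:k + N_HID], particle[k + N_HID]
    return sum(wo * h for wo, h in zip(out_w, hidden)) + out_b  # linear output (assumed)

def fitness(particle, samples):
    """RMSE of the decoded network over (inputs, target) pairs; lower is better."""
    sq = [(predict(particle, x) - q) ** 2 for x, q in samples]
    return math.sqrt(sum(sq) / len(sq))
```

Any of the optimizers discussed in this section can then search the 51-dimensional space for the particle with the lowest `fitness`.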

4) FIREFLY ALGORITHM (FFA)
The firefly algorithm (FFA) is a nature-inspired metaheuristic established by Xin-She Yang in 2008 for solving various optimization problems. The concept behind FFA is that fireflies produce light through chemical processes and that, for mating purposes, this flashing draws fireflies to each other [65]. It is important to note that brighter fireflies readily attract less bright ones. This mechanism can be turned into an optimization algorithm because the flashing light can be formulated so that it is associated with the fitness function to be optimized. The firefly algorithm follows three rules:
• All the fireflies are unisex.
• The less bright ones move towards the brighter ones; when no brighter one is visible, fireflies move arbitrarily.
• A firefly's brightness or light intensity is defined by the landscape of the optimized objective function.
Given these rules, the firefly's brightness and light intensity form the fundamental basis of the FFA. Eqs. (6) and (7) give the firefly's light intensity and attractiveness, since each firefly has its own attractiveness, reflecting its power of attraction in the swarm [66].
At distance r, I denotes the light intensity and β(r) the attractiveness. At distance r = 0, the light intensity is I_0 and the attractiveness is β_0. γ is the light absorption coefficient, with 0.1 < γ < 10. The Cartesian distance between any two fireflies is defined by Eq. (8), where d denotes the problem dimensionality, x_i and x_j are the fireflies' positions, and x_{i,k} and x_{j,k} are the kth components of their spatial coordinates. As already noted, fireflies are attracted to each other; the next movement of firefly i can therefore be expressed by Eq. (9),
where x_i^t is the solution vector or current position of firefly i, the second term is due to attraction to a brighter firefly j, and α(rand − 1/2) represents the firefly's random walk with the randomization parameter α ∈ [0, 1] [67].
As with PSO, the firefly algorithm can readily be used to train an ANN, that is, to optimize the ANN model weights, achieve the optimum parameter settings for ANN training and reduce the error. Each firefly represents a candidate solution to the ANN training problem (i.e., a vector of all the weights and biases of an ANN). First, a population of candidate solutions is created for the problem in question (the ANN's weights). Next, the fireflies' light intensity is calculated and the most attractive firefly (the best candidate) is found within the population. Then, the attractiveness and distance are calculated for each firefly in order to move all fireflies towards the most attractive one in the search space. Finally, the most attractive firefly moves arbitrarily in the search space. This procedure is repeated until a termination criterion is reached (the maximum number of generations) [68].
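The distance, attractiveness and movement steps can be sketched as below. Eqs. (6) to (9) are not reproduced in this text, so the sketch assumes the forms commonly given for the firefly algorithm: β(r) = β_0 exp(-γ r²), the Euclidean distance between positions, and x_i ← x_i + β(r_ij)(x_j - x_i) + α(rand - 1/2). The default values of α, β_0 and γ are placeholders.

```python
import math
import random

def distance(xi, xj):
    """Cartesian distance between two fireflies (assumed form of the distance equation)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def attractiveness(r, beta0=1.0, gamma=1.0):
    """Attractiveness at distance r (assumed form): beta0 * exp(-gamma * r^2)."""
    return beta0 * math.exp(-gamma * r * r)

def move_firefly(xi, xj, alpha=0.2, beta0=1.0, gamma=1.0, rng=random):
    """Move firefly i towards a brighter firefly j, plus a small random walk."""
    beta = attractiveness(distance(xi, xj), beta0, gamma)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]
```

In ANN training, each position vector would hold all weights and biases of one candidate network, and "brighter" would mean a lower training error.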

5) IMPROVED PSO
In improved PSO, the particles are randomly adjusted using a number generated from the standard normal distribution (Normrand), as given in Eq. (10), where v^I_new is the improved velocity used to adjust the new positions. v^I_new is determined from the global best position (g_best) and a random number generated by a normal distribution with a mean of 0 and a standard deviation of 1 (Normrand(0, 1)). The normal random part is scaled by the factor γ_k, which is determined by Eq. (11) [69], [70]. The factor γ_k ∈ [0, 1] tends to 1 in the first iterations and to 0 in the final iterations. As seen from the improved velocity, the new positions of particles are adjusted using a normal random process. Thus, the new and the best particles do not have the same positions. Consequently, this optimization approach reduces the chance of being trapped in a local optimum compared with the original PSO. The formulation of IPSO is presented in Eq. (12), which uses two random adjusting procedures, one of them given by the improved velocity of Eq. (10), and where r ∈ [0, 1] is a random number generated from a uniform distribution between 0 and 1. In IPSO, the initial velocity and its parameters are determined randomly, as in PSO. P_k is called the adjusting particle rate and is computed by Eq. (13). P_k randomly provides a pattern for adjusting the new particle using the two formulations of PSO and IPSO. Increasing P_k increases the chance of applying the improved velocity to determine the new positions of particles. The formula of Eq. (10) is used mostly in the final iterations. Thus, it acts as a local search around the best position for computing the global optimum in the final iterations.
As seen in Eq. (12), IPSO applies two velocity terms to adjust the positions of the new particles, whereas PSO applies only the particle velocity of Eq. (4), which is determined using p_best and g_best; this is the main difference between PSO and IPSO.
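The switching logic between the two velocity formulations can be illustrated with a heavily hedged sketch. The IPSO equations are not reproduced in this text, so everything specific below is hypothetical: the linear schedules for γ_k and P_k, the reading of the improved step as a Gaussian perturbation around g_best, and the inertia-free PSO branch. Only the qualitative behavior follows the description (γ_k goes from 1 to 0, and the improved branch is used more often in late iterations).

```python
import random

def ipso_step(p, v, p_best, g_best, k, k_max, c1=2.0, c2=2.0, rng=random):
    """One particle update that switches between PSO and an improved step."""
    gamma_k = 1.0 - k / k_max  # hypothetical schedule: 1 at the start, 0 at the end
    p_k = k / k_max            # hypothetical adjusting rate: grows with iteration k
    if rng.random() < p_k:
        # Improved branch (assumed reading): scaled normal perturbation of g_best.
        return [gb + gamma_k * rng.gauss(0.0, 1.0) for gb in g_best], v
    # Standard PSO branch (assumed basic form without inertia weight).
    r1, r2 = rng.random(), rng.random()
    v_new = [vi + c1 * r1 * (pb - pi) + c2 * r2 * (gb - pi)
             for vi, pi, pb, gb in zip(v, p, p_best, g_best)]
    return [pi + vn for pi, vn in zip(p, v_new)], v_new
```

Under these assumptions the early iterations behave like ordinary PSO, while late iterations sample ever more tightly around the best known position, which matches the local-search role described above.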

B. INPUT COMBINATION AND MODEL DEVELOPMENT
1) INPUT COMBINATION
The selection of the input combination is considered one of the major factors affecting the model's effectiveness. Therefore, proper input selection is essential before applying the ANN models. For this study, an input scenario was created and studied for the four hybrid ANN models based on the autocorrelation function (ACF), the partial autocorrelation function (PACF) and the simple cross-correlation function (CCF), which were used to identify the number of effective lags of antecedent rainfall and runoff. This method has been proposed by several researchers [36], [71]-[74] for determining the optimal inputs for data-driven methods.
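As an illustration of this lag analysis, the sketch below computes the sample autocorrelation and cross-correlation at a given lag, together with the approximate 95% confidence limit ±1.96/√N that is commonly used for such plots. The estimator and the limit are standard textbook forms assumed here; the PACF is omitted for brevity, and the series are made-up values.

```python
import math

def cross_corr(x, y, lag):
    """Sample correlation between x[t - lag] and y[t] for lag >= 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((x[t - lag] - mx) * (y[t] - my) for t in range(lag, n))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    return cross_corr(x, x, lag)

def confidence_limit(n):
    """Approximate 95% confidence limit for a series of length n."""
    return 1.96 / math.sqrt(n)

rain = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]  # made-up rainfall
flow = [0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]  # runoff delayed by one step
lag1 = cross_corr(rain, flow, 1)
```

A lag is treated as effective when the magnitude of its correlation exceeds `confidence_limit(n)`; in the made-up series the lag-1 cross-correlation dominates because the flow simply repeats the rainfall one step later.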
Table 3 lists the CCF and PACF values of the six stations studied; a representative example of these functions (ACF, PACF and CCF) for one of the stations ('Medjez-Amar II' station in sub-basin 3) is shown in Figure 3.
The CCF indicates that the precipitation at time t and at one lag has a considerable effect on runoff compared with the precipitation at two and three lags, while the other lags fall within the confidence limit. Therefore, R_{t−1} was considered as one of the parameters included in the input scenario used in the developed models. In addition, the PACF at all the stations indicates that the first lag of runoff has a considerable effect, and that the second and third lags are very near to the confidence limit (but at the 'Bordj-Sabath' and 'Ain Berda' stations, the lags Q_{t−2}, Q_{t−3} and Q_{t−3} respectively are within the confidence limit*). Therefore, these two entries were ignored.
[Figure 3 caption fragment: "... between runoff (Q_t) and rainfall at various lags with the 95% confidence limit for 'Medjez-Amar II' Station."]
Since precipitation beyond the 2nd lag does not really affect the runoff at time t, and runoff likewise has no effect beyond the same lag, the input combination considered for the scenario used in this research to model the rainfall-runoff process is made up of R_t, R_{t−1} and Q_{t−1} to simulate the output Q_t.
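The selected input scenario can be written as a small helper that pairs each input vector [R_t, R_{t−1}, Q_{t−1}] with its target Q_t. The function name and the list-based data layout are illustrative assumptions.

```python
def build_samples(rain, flow):
    """Pair inputs [R_t, R_(t-1), Q_(t-1)] with the target Q_t for t >= 1."""
    return [([rain[t], rain[t - 1], flow[t - 1]], flow[t])
            for t in range(1, len(flow))]

samples = build_samples([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

The first time step is dropped because it has no antecedent rainfall or runoff, so a record of N days yields N − 1 samples.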

2) MODEL DEVELOPMENT
In the present study, four ANN models were created and compared with each other for the rainfall-runoff modeling task. The first model, which uses the GA for its training, was termed ANN-GA; the second, which uses the PSO algorithm, was named ANN-PSO; the third, which uses the FFA, was called ANN-FFA; and the last, which uses IPSO, is referred to as ANN-IPSO. Since the efficiency of every model generally depends on the appropriate determination of its variables, these four optimization techniques were combined with the ANN model to improve the calibration of its variables (i.e., to optimize its weights and biases).
The number of hidden layers and the number of neurons were chosen after testing different combinations. Through several tests, we observed that increasing the number of intermediate layers or neurons did not improve the results; on the contrary, it made the network harder to train and lengthened its training time. It can also raise the probability of converging to a local minimum, so there is no theoretical justification for using more than two intermediate layers [75], [76]. On the other hand, using too few nodes in the intermediate layers relative to the intricacy of the problem data leads to underfitting, whereas using too many neurons may result in overfitting, which occurs when unnecessary neurons are introduced into the network [77], [78]. In this work, a single hidden layer was found to be enough to obtain simulation results with good convergence and performance [79], [80]. The ideal number of nodes in the intermediate layer was determined by a trial-and-error method (forward approach) in which the number of intermediate-layer neurons is varied [46], [81]-[84]. In this case, we start from an architecture with 2 nodes in the intermediate layer, train and test the ANN, and then progressively increase the number of hidden neurons. We repeated this procedure, tracking the training and testing performance, and retained the architecture that gives the minimum error on the test set [76]. As a result, the best ANN-GA, ANN-PSO, ANN-FFA and ANN-IPSO architectures obtained each had one hidden layer with 10 neurons.
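The forward trial-and-error search described above can be sketched as a simple loop. Here `evaluate` is a hypothetical callable standing in for "train the hybrid ANN with this many hidden neurons and return its test error"; the upper bound of 20 neurons and the toy error curve are assumptions for the example, not values reported in the study.

```python
def select_hidden_neurons(evaluate, start=2, stop=20):
    """Try hidden-layer sizes start..stop and keep the one with the lowest test error.

    `evaluate(n_hidden)` is a hypothetical callable that trains a network with
    n_hidden neurons and returns its error on the test set.
    """
    best_size, best_error = None, float("inf")
    for n_hidden in range(start, stop + 1):
        error = evaluate(n_hidden)
        if error < best_error:
            best_size, best_error = n_hidden, error
    return best_size, best_error

# Stand-in error curve (illustrative only), chosen to have its minimum at 10 neurons.
def toy_error(n_hidden):
    return 0.2 + 0.01 * (n_hidden - 10) ** 2

best_size, best_error = select_hidden_neurons(toy_error)
```

In practice each call to `evaluate` would involve a full training run, so the search cost grows with the range of sizes tested.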
The four computer programs that implement the development process of the hybrid models were written in MATLAB. Figure 4 depicts how the GA, PSO, FFA and IPSO algorithms optimize the ANN parameters.
When applying GA, PSO, FFA and IPSO, several parameters must be specified, and a suitable selection of these parameters influences the convergence rate of the algorithm. Table 4 displays the parameter values used for the four optimization algorithms. IPSO and PSO are given the same parameters in the optimization process. However, in IPSO the new position of the particles is determined with a novel velocity relation, which is computed using the global best particle and a normally distributed random number.
In IPSO, the improved velocity of the particles is dynamically combined with the velocity given by the original PSO through a self-adaptive random process, as presented in Eq. (12).
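To make the idea of training the ANN weights and biases with a swarm optimizer more concrete, the sketch below uses the standard global-best PSO update on a 3-10-1 network. It does not implement the IPSO velocity of Eq. (12). The inertia weight, acceleration coefficients, swarm size, iteration count, tanh activation and synthetic data are all assumptions for illustration and are not the Table 4 settings or the authors' MATLAB code.

```python
import math
import random

N_IN, N_HID = 3, 10                        # inputs (R_t, R_{t-1}, Q_{t-1}); 10 hidden neurons
N_PARAMS = N_HID * (N_IN + 1) + N_HID + 1  # all weights and biases as one flat vector

def predict(params, x):
    """Forward pass of a 3-10-1 network: tanh hidden layer, linear output."""
    out = params[-1]                       # output bias
    for h in range(N_HID):
        base = h * (N_IN + 1)
        z = params[base + N_IN]            # hidden bias
        for i in range(N_IN):
            z += params[base + i] * x[i]
        out += params[N_HID * (N_IN + 1) + h] * math.tanh(z)
    return out

def mse(params, X, y):
    """Mean squared error, used here as the fitness to minimise."""
    return sum((predict(params, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)

def pso_train(X, y, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO over the flat weight vector (assumed settings)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(N_PARAMS)] for _ in range(n_particles)]
    vel = [[0.0] * N_PARAMS for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [mse(p, X, y) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_cost[k])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]                 # best fitness after each iteration
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(N_PARAMS):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            cost = mse(pos[k], X, y)
            if cost < pbest_cost[k]:       # update personal best
                pbest[k], pbest_cost[k] = pos[k][:], cost
                if cost < gbest_cost:      # update global best immediately
                    gbest, gbest_cost = pos[k][:], cost
        history.append(gbest_cost)
    return gbest, history

# Synthetic data (illustrative only): a smooth function of three inputs.
rng = random.Random(1)
X = [[rng.random() for _ in range(N_IN)] for _ in range(30)]
y = [0.5 * a + 0.3 * b + 0.2 * c for a, b, c in X]
best, history = pso_train(X, y)
```

Each particle is one candidate set of network weights and biases, so the swarm searches the weight space directly instead of following error gradients. An IPSO variant would replace or blend the velocity line with the improved velocity relation described in the text.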

3) STATISTICAL PERFORMANCE INDICATORS
The effectiveness of the different predictive models is evaluated through graphical interpretation (line plots, scatter plots and the Taylor diagram) and through performance indicators: Pearson's correlation coefficient (R), the Nash-Sutcliffe Efficiency coefficient (NSE) and the Root Mean Square Error (RMSE). R, which varies from −1 to 1, evaluates the linear correlation between predicted and observed values; values of 1 and 0 signify, respectively, an ideal fit and no statistical correlation between the data and the line drawn across them. The NSE, which varies from −∞ to 1, is used to analyze the predictive accuracy of hydrological models. The RMSE is used to estimate prediction precision; because the errors are squared, it is always non-negative. It rises from zero for perfect predictions to large positive values as the discrepancies between predictions and observations grow. The best model forecasts are usually obtained when R, NSE and RMSE are close to 1, 1 and 0, respectively. These three performance metrics can be defined as:

$$R = \frac{\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_{o}\right)\left(Q_{p,i}-\bar{Q}_{p}\right)}{\sqrt{\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_{o}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(Q_{p,i}-\bar{Q}_{p}\right)^{2}}}$$

$$NSE = 1-\frac{\sum_{i=1}^{n}\left(Q_{o,i}-Q_{p,i}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{o,i}-\bar{Q}_{o}\right)^{2}}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{o,i}-Q_{p,i}\right)^{2}}$$

where $n$ is the number of data points, $Q_{o,i}$ is the observed runoff, $Q_{p,i}$ is the predicted runoff, $\bar{Q}_{o}$ is the average value of the observed runoff and $\bar{Q}_{p}$ is the average value of the predicted runoff [7].
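The three indicators can be computed directly from paired observed and predicted series. The following is a minimal sketch using their standard definitions; the function names and the four-value toy series are assumptions for illustration.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (dev_o * dev_p)

obs = [1.0, 2.0, 3.0, 4.0]     # made-up observed runoff
biased = [2.0, 3.0, 4.0, 5.0]  # made-up prediction with a constant offset of +1
# Here R is 1 (perfect linear relation), RMSE is 1.0 and NSE is 0.2.
```

The toy case shows why the indicators are reported together: a constant bias leaves R at 1 but is penalized by both NSE and RMSE.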

IV. RESULTS, ANALYSIS AND DISCUSSION
In this section, a thorough assessment of the ANN-GA, ANN-PSO, ANN-FFA and ANN-IPSO hybrid models in simulating the output Q t using the defined scenario (R t , R t−1 and Q t−1 ) is presented. Their efficiency in terms of the statistical indicators during training and testing for the various study stations is given in Table 5.
In the training stage (Table 5), it can be observed that, for all the hydrometric stations distributed in the study basin, the ANN-IPSO model gives simulated runoff values that approximate the observations more closely than those of the ANN-GA, ANN-PSO and ANN-FFA models. By way of illustration, for the 'Medjez-Amar II' station in sub-basin B3, the ANN model trained by the genetic algorithm (GA) gave a strong correlation coefficient R (0.823), a good efficiency coefficient NSE (0.666) and a low root mean square error RMSE (0.744). The ANN trained with particle swarm optimization (PSO) further improved these performance statistics: R became very strong (0.941), NSE very good (0.868) and RMSE low (0.466). For the hybrid ANN-FFA model, the performance indicators improved again (R = 0.961, NSE = 0.916 and RMSE = 0.425). However, the ANN trained by the improved PSO performed as the best model for R-R modeling, with R = 0.993, NSE = 0.985 and RMSE = 0.177. Also, the application of ANN-GA ... The same remarks about the superiority of the ANN model that uses IPSO as a training algorithm over the ANN models that employ GA, PSO and FFA are valid for the remaining stations (sub-basin B2 'Moulin Rochefort' station, sub-basin B5 'Bouchegouf' station, and sub-basin B6 'Mirebek' and 'Ain Berda' stations).
On the one hand, the difference in performance between the neural networks trained by GA and by PSO can be due to several reasons. The implementation of GA is usually an intricate process that involves evolutionary operations such as selection, crossover and mutation, and its convergence speed can decrease considerably if the sample size is large. The PSO algorithm, by contrast, is easier to implement: unlike GA, it reaches its final variable values in fewer generations (see Table 5), it converges faster, it has fewer parameters, and it does not have complex evolutionary operators such as crossover and mutation. Another important point noted during the simulation is that the calculation time for GA is relatively small compared with that of the PSO optimizer, although it rises proportionally with the number of generations for both PSO and GA. The larger calculation time for PSO results from the interaction among the particles. On the other hand, FFA has shown a greater ability than PSO and GA to model this intricate phenomenon, and has thus proven to be a promising optimization algorithm, owing to the influence of the attractiveness function that is specific to firefly behavior. However, PSO is preferable to the firefly algorithm with regard to the time needed to produce the optimum value.
Similar results were obtained in [65], which found that particle swarm optimization frequently outperforms conventional algorithms such as genetic algorithms, whereas, in terms of effectiveness and success ratio, the more recent firefly algorithm is superior to both PSO and GA.
IPSO, in turn, has shown the potential to be a more powerful optimization tool than FFA for solving complex, non-linear optimization problems.
After analyzing and discussing the results obtained from the two phases of the applied models, an important remark can be made: the reliability of the models in the training phase is significantly lower than in the testing phase (see the R and NSE values in Table 5).
Perhaps the principal cause is the more complicated structure of the training data, which has a wider distribution and includes runoff and precipitation peaks much higher than those in the testing dataset.
The scatter plots of predicted versus observed daily flow values for the six study stations during the testing period, as produced by the four hybrid models, are shown in Figure 5. As can be clearly observed, the linear trend of ANN-IPSO is the nearest to the line y = x compared with those of ANN-GA, ANN-PSO and ANN-FFA. Similarly, the flow time series predicted by ANN-IPSO are compared with the observed ones over the testing period (see Figure 6). A good fit and close agreement are observed between the observed flow and the flow predicted by the ANN-IPSO model.

The Taylor diagram
The Taylor diagram [85] was used to provide a visual summary of the performance measures by plotting the modeling results as a set of points on a polar plot. Here, the diagram was used to show how the flow predicted by ANN-GA, ANN-PSO, ANN-FFA and ANN-IPSO varies relative to the observed values during the testing period for the six study stations. In the Taylor diagram, the standard deviation (SD) of the predicted and observed values is represented by the radial distance, and the R values are represented by the angular direction. The observed values are displayed as an independent reference point, and the nearer the performance indicators of the predictions are to that point, the stronger the model performance. As illustrated in Figure 7, the results found by ANN-IPSO are nearer to the observed ones than those of ANN-GA, ANN-PSO and ANN-FFA, which indicates the higher accuracy of this model, as mentioned previously with regard to Table 5.

The standard ANN method has been used successfully in water resources management problems, for example to estimate and/or forecast runoff and to provide data for early warning systems against possible floods. As also mentioned in the introduction, the standard ANN nevertheless has some drawbacks, such as slow training, becoming stuck in local minima, and overfitting. New metaheuristic algorithms are therefore needed to address these issues and improve the efficiency of the standard ANN. This is confirmed by the results found in this study after applying the hybrid ANN models trained separately by GA, PSO, FFA and IPSO, which revealed the superiority of ANN-IPSO over ANN-FFA, ANN-PSO and ANN-GA in both the training and testing phases for the six stations distributed in the study basin. The IPSO algorithm was thus able to improve the capability to solve this complex problem, with a high convergence speed compared with FFA, PSO and GA. It could also be applied to other irregular, time-varying problems. Such models can be used as a module in general hydrological analysis models [10], [86], [87].
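Returning to the Taylor diagram used earlier in this section, the quantities that place a model on the diagram can be sketched as follows. The function name, the dictionary keys and the five-value toy series are assumptions for illustration. The radius is the standard deviation of the predictions, the angle is the arccosine of R, and the distance to the observed reference point is the centered RMS difference, which follows from the law of cosines.

```python
import math

def taylor_statistics(obs, pred):
    """Return the statistics that locate one model on a Taylor diagram."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred) / n)
    r = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred)) / (n * sd_o * sd_p)
    r = max(-1.0, min(1.0, r))  # guard against rounding slightly outside [-1, 1]
    # Centered RMS difference: distance from the model point to the reference point
    crmsd = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for o, p in zip(obs, pred)) / n
    )
    return {
        "sd_obs": sd_o,              # radius of the reference (observed) point
        "sd_pred": sd_p,             # radius of the model point
        "r": r,
        "angle_rad": math.acos(r),   # angular position of the model point
        "crmsd": crmsd,
    }

obs = [1.0, 3.0, 2.0, 5.0, 4.0]    # made-up observed flows
pred = [1.5, 2.5, 2.0, 4.0, 4.5]   # made-up predicted flows
stats = taylor_statistics(obs, pred)
```

A model whose point lies close to the reference point has a similar standard deviation to the observations, a high R and a small centered RMS difference at the same time.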
[88] used MLP-ANN and a hybrid multilayer perceptron (MLP-FFA) to forecast monthly river flow for a set of time intervals using observed data. Their results show that the MLP-FFA model is satisfactory for monthly river flow simulation in the Ajichay watershed, in the province of East Azerbaijan.
[86] compared particle swarm optimization and the genetic algorithm for daily rainfall-runoff modeling in Southeast Queensland, Australia. The results indicated that the ANN-PSO model significantly outperformed the ANN-GA model in terms of convergence speed, accuracy, and fitness function evaluation.

V. CONCLUSION
This paper developed a highly accurate prediction model based on a combined artificial neural network and improved particle swarm optimization algorithm (ANN-IPSO) for a common problem in hydrology, rainfall-runoff modeling in a semi-arid basin. In this study, the developed approach outperformed ANN models trained with other existing algorithms from the literature, namely the genetic algorithm (GA), particle swarm optimization (PSO) and the firefly algorithm (FFA).
The daily rainfall and runoff data collected from the Seybouse watershed, Algeria, were used to establish all the developed models. The models' effectiveness was assessed using different statistical measures.
Overall, the study indicated that the GA, PSO, FFA and IPSO algorithms can be employed in modeling the rainfall-runoff process. However, the optimal results from an evolutionary standpoint showed the superiority and the capacity of ANN-IPSO over ANN-GA, ANN-PSO and ANN-FFA in terms of all statistical criteria and graphical interpretation, where the input predictors are R t , R t−1 and Q t−1 .
These findings confirm the effectiveness of the IPSO algorithm in tuning the parameters of the ANN model and in appreciably strengthening its forecasting performance. The IPSO-based hybrid ANN model can thus be employed in different applications, especially in hydrology and its related disciplines. The findings of this study indicate that the ANN model optimized by IPSO is more powerful for R-R modeling and a better alternative to the other three metaheuristic-based models (ANN-GA, ANN-PSO and ANN-FFA). It is hoped that future research on R-R modeling will test the general potential of the ANN-IPSO method using datasets from different catchments.
YAMINA AOULMI received the degree in engineering and the master's degree in hydraulics from the National Polytechnic School of Algiers, Algeria, in 2018. She is currently pursuing the Ph.D. degree with L'arbi Ben M'hidi University, Algeria. She is currently a Field Engineer with the multinational company Schlumberger, Algeria. Her current research interests include water resources, hydrological modeling using artificial intelligence, optimization algorithms, and deep learning.

Dr. Shubair is a fellow of the MIT Electromagnetics Academy and a Founding Member of MIT Scholars of the Emirates. He is also a Standing Member of the editorial boards of several international journals and serves regularly on the steering, organizing, and technical committees of IEEE flagship conferences in antennas, communications, and signal processing, including several editions of IEEE AP-S/URSI, EuCAP, IEEE GlobalSIP, IEEE WCNC, and IEEE ICASSP. He is also the Regional Director of the IEEE Signal Processing Society, Middle East, and the Chair of the IEEE Innovation and Research. He was a recipient of the IEEE UAE Award of the Year, in 2020. His Ph.D. thesis received the University of Waterloo's ''Distinguished Doctoral Dissertation Award,'' in 1993. He received the ''Teaching Excellence Award'' and ''Distinguished Service Award.'' He was nominated for the IEEE Distinguished Educator Award of the IEEE Antennas and Propagation Society and received the ''Distinguished Service Award'' from both ACES Society, USA, and from MIT Electromagnetics Academy, USA. He is also a nominee for the Regional Director-at-Large of the IEEE Signal Processing Society in IEEE Region 8 Europe, Africa, and the Middle East. He is also a nominee for the IEEE Distinguished Educator Award of the IEEE Antennas and Propagation Society. He has been honored to serve as an Invited Speaker with the prestigious U.S. National Academies of Sciences, Engineering, and Medicine.
He organized and chaired numerous technical special sessions and tutorials in IEEE flagship conferences. In addition to his paper presentation at technical conferences, he delivered over 65 invited speaker seminars and technical talks in world-class universities and flagship conferences. He has served as a TPC Chair for the IEEE MMS2016 and IEEE GlobalSIP 2018 Symposium on 5G Satellite Networks. He holds several leading roles in the international professional engineering community. He is an Editor of IEEE JOURNAL OF ELECTROMAGNETICS, RF AND MICROWAVES IN MEDICINE AND BIOLOGY and IEEE OPEN JOURNAL OF ANTENNAS AND PROPAGATION.
BEHROOZ KESHTEGAR is currently an Associate Professor with the Department of Civil Engineering, Zabol University, Zabol, Iran. His research interests are structural reliability analysis, reliability-based design optimization, data-driven modeling approaches, and artificial intelligence-based optimization. He has been working on different topics of structural reliability analysis, including the chaos control approach, conjugate sensitivity methods, and optimization methods such as harmony search and hybrid AI methods. He has established several predictive models for estimating complex engineering problems in the mechanical and civil fields under multiple uncertainties. He codes numerical methods in computer programs to solve and analyze engineering problems in the optimization, modeling, and reliability fields. His fields of interest include novel numerical approaches, predictive models, computational frameworks, and optimization algorithms.
VOLUME 9, 2021