Parameter Estimation in Neutral Delay Differential Equations Using Genetic Algorithm With Multi-Parent Crossover

Neutral delay differential equations (NDDEs) are differential equations containing time lags not only in the states but also in the state derivatives. NDDEs have applications in modeling physical and biological systems. An NDDE model may consist of several parameters, some of which can be determined using available data. Various numerical techniques have been studied to estimate parameters of mathematical models. In this study, parameter estimation is posed as an optimization problem. The use of heuristic algorithms in parameter estimation has gained popularity because of its ease of implementation, requiring only function evaluations. But to our knowledge, heuristic algorithms have never been employed in estimating parameters in NDDE models. In this work, we apply Genetic Algorithm with Multi-Parent Crossover (GA-MPC) to obtain parameter estimates of three NDDE models with a discrete delay. We compare the estimates to those obtained using standard heuristic algorithms. Results show that GA-MPC is capable of consistently identifying model parameters that provide a good fit of the model to the data.


I. INTRODUCTION
Differential equations, where the derivatives at time t depend on the solution and its derivatives at a previous time, are called delay differential equations (DDEs). A general form of a DDE is given by y (k) (t) = f (t, y(t), . . . , y (k−1) (t), y(d 0 ), . . . , y (k) (d k ); p), (1) where d j = d j (t, y(t)), j = 0, . . . , k are the delays which satisfy d j ≤ t ∀t ∈ [t 0 , t 1 ] for some t 0 , t 1 ∈ R + . The function f is parametrized by p ∈ R L , where L is the number of parameters. The solution to (1) is dependent on some initial function φ(t) defined at a previous time. This function φ(t) is referred to as the history or preshape function. A DDE is of retarded type (RDDE) when there is no delay in the derivative terms of the DDE. On the other hand, it is of neutral type (NDDE) if there are delays in the derivatives of the state variables.
The associate editor coordinating the review of this manuscript and approving it for publication was Sun-Yuan Hsieh .
Here, we consider a single constant delay τ . That is, Note that the solution y(t) to (1) depends on the parameter p and the delay τ . Because τ can also be treated as an additional parameter, we let θ := [p; τ ], be the vector of parameters. We also denote by y θ (t) the solution to (1) to make the dependence of y(t) on the parameters and delay more obvious.
The values of the parameters in a model are often unknown. While some parameters are found in the literature, others need to be estimated. Ideally, we want the experimental data recorded at time t i , denoted by D i , to coincide with the solution to the model at t i . That is, where N is the number of data points. Hence, we look for the value of θ that will make the approximation in (2) as close as possible. This can be done by minimizing a least-squares formulation expressed as where M is a normalizing constant. We denote the cost function to be minimized by There are many numerical optimization algorithms in the literature that can be used to solve (3). Classical techniques use the gradient of the cost function J in (4) to estimate the minimizer. Gradient-based methods are fast but can be stuck to a local minimum or a saddle point [21]. Using derivative discontinuity tracking theory for DDEs, it was shown in [22] that jumps can arise in the gradient of J . Hence, to solve (3), we rely on a derivative-free method.
Numerical algorithms that only use function values of J are either deterministic or heuristic. Deterministic algorithms are computationally inexpensive and fast but tend to converge to local minimizers [21], [23]. Heuristic algorithms are often nature-inspired [24] and are computationally more expensive than deterministic algorithms but can converge to a global minimizer given enough number of function evaluations [24]- [26].
In [27], a modified particle swarm optimization (PSO) was used to estimate the parameters of dynamical systems involving ordinary differential equations. It was shown in [28] how a hybrid adaptive cuckoo search with simulated annealing could be effective in identifying parameters in chaotic systems. A hybrid Taguchi-differential evolution algorithm was proposed in [29] to estimate the parameters of an HIV model. Heuristic algorithms have also been shown to be effective in identifying parameters of S-system models [30]- [32]. Other recent works on the application of various heuristic optimization methods on estimating parameters of engineering systems are found in [33]- [41].
Heuristic algorithms such as genetic algorithm (GA) [42], which is a hybrid of an evolutionary search strategy with a local multiple-shooting approach [43], and differential evolution [44] have been applied to parameter estimation in RDDE models. In [45], Ant Colony Optimization was used to determine parameter values in RDDE models on human health. To determine the unknown parameters in chaotic systems with time-delays, a hybrid bio-geography optimization was used in [46], PSO was used in [47], and a hybrid cuckoo search algorithm was used in [48].
To the best of our knowledge, heuristic algorithms have never been employed in estimating parameters of models involving NDDEs. In this paper, we solve the least-squares formulation in (3) using Genetic Algorithm with Multi-Parent Crossover (GA-MPC), and compare the results to those obtained using standard global optimization algorithms: GA, pattern search algorithm (PS), simulated annealing (SA), and PSO. GA-MPC is a modified GA that proposes a new crossover method [49]. It has been used to effectively solve an economic dispatch problem [50] and an optimal power flow problem [51]. It has also been applied to solve an inverse problem in RDDEs [52].
In the next section, we discuss GA-MPC and a parameter bootstrapping method. In Section III, parameter estimation is done on three NDDE models arising from applications using the heuristic algorithms mentioned above. The first model describes the population growth of E. coli using a first-order NDDE. The second one is an age-structured model involving a system of first-order NDDEs. The third model is a second-order NDDE describing a pendulum-mass-springdamper (PMSD) system. Parameter bootstrapping is done to assess uncertainty in the estimates. Furthermore, since the only stopping criterion is the maximum number of function evaluations, we investigate the effect of varying this quantity to the cost function value. In the last section, we state our conclusions and recommendations.

A. GENETIC ALGORITHM WITH MULTI-PARENT CROSSOVER
Genetic Algorithm was formulated by John Holland in 1975 based on Darwin's theory of evolution [53]. It has gained popularity for its ease of use and applicability in various fields of study. Some applications of GA can be found in [25], [26], [42], [54]- [59].
The general process of GA starts with a pool of randomly generated individuals that serve as members of the first generation. The members of the population act as estimates of the minimizer. The best members of the population are selected based on their survival fitness. In the case of minimization, the lower the function value of an individual, the higher is its chance of survival. The members of the population who did not survive will be replaced by the offspring of the surviving population. The offspring are created using a crossover method. To make sure that the solution does not converge to a local minimum, a mutation step is done at the end of the generation. In this step, a portion of the population is modified randomly. The process is repeated until a stopping criterion is satisfied. A detailed discussion on GA, as well as the MATLAB codes, are found in [26].
GA-MPC proposes a new crossover method that uses three parents from the selection pool to generate three offspring (refer to Algorithm 1). Two of the offspring (o 1 and o 3 in Algorithm 1) are formulated in such a way that they are generated in the direction with less function value. These VOLUME 9, 2021 Algorithm 1 Genetic Algorithm With Multi-Parent Crossover (GA-MPC) [49] Require: Cost function f (x), x ∈ R r , PopSize, x LB , x UB , M , β, cr, p, MaxEval. Ensure: f is optimal at x * 1: Randomly generate the initial population size PopSize. The initial population must stay within the bounds x LB and x UB . 2: while FuncEvals < MaxEval do 3: Sort the population by their cost function value. Form the archive poolÃ using the best M members. 4: Generate tournament selection size TC randomly between 2 and 3. Do a tournament selection and fill the selection pool. 5: Randomly generate a numberū ∈ [0, 1]. 6: for each three consecutive individuals in the selection pool do 7: if one of the selected individual is the same as another individual then 8: Replace one by a randomly-selected individual in the selection pool. 9: end if 10: ifū < cr then 11: Sort the three individuals

12:
Compute β = N (μ,σ ). 13: The three offspring are generated from the three parents using: 14: end if 15: for i = 1 : 3 do 16: Randomly generateũ ∈ [0, 1]. 17: ifũ < p then 18: The offspring undergoes mutation: 19: end if 20: end for 21: end for 22: if a duplicate individual x j i exists then 23: Change the duplicate with 24: end if 25: end while two offspring exploit the location of the best parents to attain better estimates. The third offspring (o 2 in Algorithm 1) will move away from the best parent. Although this is counterintuitive, this gives the algorithm the capability to escape local minima. Thus, GA-MPC also adopts a randomized operator to diversify the offspring. A detailed discussion on GA-MPC is found in [49]. This algorithm ranked first among the participating algorithms in the 2011 IEEE Congress of Evolutionary Computation Competition, which tests algorithms by solving real-world optimization problems [60].
In the following simulations, the parameter values of GA-MPC are set to their default values as stated in [49]. That is, the population size PopSize = 90, selection rate M = 0.50 × PopSize, crossover rate cr = 1, crossover factor β ∼ N (0.5, 0.3), and mutation rate p = 0.1.
The results obtained using GA-MPC are compared to those obtained using standard heuristic algorithms: GA, PS, SA, and PSO, which are implemented using the built-in functions ga, patternsearch, simulannealbnd, and particleswarm, respectively, in the global optimization toolbox of MATLAB. For consistency of comparison, all the parameters in these built-in functions are set to their default values. The only stopping criterion in all the algorithms is the maximum number of function evaluations. In the parameter estimation results in Section III, the maximum number of function evaluations is set to 2000 times the number of parameters to be estimated. Because heuristic algorithms are probabilistic, we run each of the algorithms 100 times and compute for the mean of the estimates. Box plots are used to illustrate the distribution of the parameter estimates. Furthermore, we explore the effect of the maximum number of function evaluations to the cost function value by setting this maximum number to the number of parameters to be estimated times 500, 1000, 1500, or 2000.

B. PARAMETER UNCERTAINTY ANALYSIS
In [61], a parameter bootstrapping approach is presented to quantify the standard error in the parameter estimates in models involving RDDEs. We adopt this strategy to analyze the uncertainty in parameter estimates in the NDDE models. The method is summarized in the following steps: 1) Solve the parameter estimation problem in (3) and denote the estimated parameter byθ. 2) Let yθ (t) be the solution of the NDDE model. Set the maximum number L of realizations to 100. Initialize the number of realizations r to 1. 3) Using (yθ (t i ), σ 2 ), i = 1, . . . , N , generate a noisy data based on an assumed probability distribution. Here, the noisy data are generated using the normal distribution Normal(yθ (t i ), σ 2 ), i = 1, . . . , N , where σ is the noise set to 10% of the standard deviation of [61]). 4) Get a new set of parameter estimates using the simulated noisy data (4) and minimizing the modified cost function. Store this parameter asθ r . 5) If r < L, set r ← r + 1 and go back to step 3.
Otherwise, go to the next step. 6) The standard error of the parameterθ, denoted by Err(θ), is equal to the standard deviation of {θ r } 100 r=1 . The 95% confidence interval for the parameterθ is [θ − 1.96 * Err(θ),θ + 1.96 * Err(θ)]. The standard error and 95% confidence interval of the solution to the NDDE are computed similarly.

A. E. COLI POPULATION GROWTH MODEL
In [16], an NDDE model of E. coli population growth is proposed. The model assumes that all cells have the same division time and that they divide simultaneously. Furthermore, it is assumed that there is a prolonged initial step-like growth. The model is given by where y(t) denotes the number of colonies at time t, and ρ 0 , ρ 1 , ρ 2 , and τ are the model parameters. The parameter ρ 0 is the rate of cell death, ρ 1 is the rate of commitment to cell division, ρ 2 represents the gradual dispersal of synchronization, and τ is the average cell-division time [62]. The history function for this problem is given by and The function φ(t) follows a bell-shaped distribution that corresponds to non-monotone cell growth [16]. The value of γ is set to 2.25 and the initial population y 0 to 99 [16].
We wish to estimate the parameters ρ 0 , ρ 1 , ρ 2 , τ , and β by comparing the numerical solution to (5) with the number of colonies {y i } 28 i=1 based on experimental data [16]. For brevity, we define  Thus, the minimization problem to estimate the parameters of the NDDE model of E. coli population growth is given by Here, y θ (t) is the solution to (5) given an arbitrary parameter vector θ. To compute for y θ (t) numerically, we use the MATLAB built-in function ddensd with step size 0.01. The maximum and minimum values of the parameters used in the simulations are shown in Table 1. Figure 1 shows the box plots of the parameter estimates of the E. coli population growth model obtained using GA-MPC, GA, PS, SA, and PSO. Compared to the other algorithms, most of the estimates obtained using GA-MPC fall within the first and third quartiles of the 100 estimates. Furthermore, the narrow interquartile ranges suggest small variations among estimates. The cyan curves in Figure 2   are the plots of the solutions corresponding to the 100 sets of estimates. The data points are represented by the asterisks. Among the five algorithms, plots of the solution using GA-MPC produce the best fit to the experimental data.
Next, we apply the bootstrapping method presented in Subsection II-B to assess the uncertainty in the parameter estimates obtained using GA-MPC. Table 2 shows the standard errors and mean of the estimated parameters, which all fall within the 95% confidence interval. In Figure 3, we see that the plot of the solution using the mean of the estimated parameters (blue curve) fits the experimental data (asterisks) well. The bounds for the 95% confidence interval of the solution yθ (t) are represented by the red curves.

B. AGE-STRUCTURED MODEL
Next, we consider a population divided into two age groupsjuveniles and adults. This can be described by the following age-structured population growth model from [14]: where U (t) is the juvenile population, V (t) is the adult population, b j , and µ j , j = 0, 1, 2, are different birth and death rates, respectively. We note from [14] that the coefficients a i in (6) are non-negative and that We set the initial values for U (t) and V (t) as (compare with [15]) where the initial age distribution is given by n 0 (a) = 1, 0 ≤ a ≤ 50, 0, otherwise.
In this example, we use generated population data for U (t) and V (t) at n different time points. To generate noisy data, we follow the method used in DDE models presented in [61]. First, clean data are generated by solving the model given the true parameters (see Table 3). The solution of the NDDE model is computed numerically using the MATLAB function ddensd. Then noisy data are generated following a normal distribution, with standard deviation set to 10% of the standard deviation of the NDDE solution. We generate data for n = 20, 50, and 100 time points over a fixed interval. The goal is to estimate the parameters of the age-structured NDDE model from the generated noisy data {U * (t i ), V * (t i )}, i = 1, 2, . . . , n.
The minimization problem for the parameter estimation of this model is given by  where ζ is a penalty parameter, n is the number of data points, is a penalty function to ensure that the parameter values satisfy the condition a 1 ≥ a 2 a 3 . The parameter ζ is set to 1000. We solve the minimization problem using GA-MPC, GA, PS, SA, and PSO. The bounds for the parameters are given in Table 3. We see in Figures    estimates than PSO for the parameters b 2 , µ 0 , µ 1 , and µ 2 . Using PSO, some estimates for the death rates µ 0 and µ 1 are zero, which fall outside the meaningful range for the death rate. Some parameter estimates using GA, PS, and SA vary on a larger range and a significant number of estimates fall outside the first and third quartiles. There is no notable TABLE 4. Standard error Err (θ ), 95% confidence interval, and mean of the 100 realizations using GA-MPC for the age-structured NDDE model. difference in the parameter estimation results when n = 20, 50, or 100. Figure 6 shows the plots of the solutions U (t) and V (t) (cyan and magenta curves, respectively) using GA-MPC. We see that the plots have a good fit to the n data points (asterisks) and that the value of n does not affect the performance of GA-MPC. Figures 12 to 14 in the Appendix show the plots of the solutions to the age-structured NDDE model using the estimates from GA, PS, SA, and PSO.
To analyze the uncertainty in the estimates, we apply the bootstrapping method to the data with n = 50. Results in Table 4 show small standard errors of all the parameters. Furthermore, the mean values of the estimates all fall within  the 95% confidence intervals. Figure 7 shows the good fit of the solutions to the data set using the re-estimates.

C. COUPLED PENDULUM-MASS-SPRING-DAMPER SYSTEM
Consider a mechanical system consisting of a mass M mounted on a linear spring, where a pendulum of mass m is attached to it via a hinged rod of length l [9]. We assume that the angular deflection of the pendulum from the downward position is very small, and is thus negligible. Moreover, we assume that there is a linear, viscous damping C on the mass M and no external forces are acting on the system. The equation of motion is given by the second-order NDDE where C and K are the damping and stiffness coefficients, respectively. This equation describes the vertical motion of a PMSD system, and the terms x(t),ẋ(t) andẍ(t) are the position, velocity, and acceleration of the PMSD at time t.
In the following discussion, we consider the nondimensionalized version of (7) given bÿ with the preshape function φ(t) = cos(t/2) [9]. Note that if |p| < 1 and ζ > 1/ √ 2, then the steady state of (8) is locally The black asterisks are the 50 generated data points. The 95% confidence interval for the NDDE solutions is illustrated in gray and is bounded by the red curves.
asymptotically stable [9]. In particular, we are interested in the case when p > 2ζ 1 − ζ 2 , so that there is a succession of stability switches as the value of τ increases. We estimate the parameters of (8) from noisy data generated in the same way as in the age-structured NDDE model. First, clean data is obtained by numerically solving (8) using the MATLAB function ddensd. We use the following parameter values from [9]: τ = 1, ζ = 0.05, and p = 0. Then noisy data y * (t i ), i = 1, 2, . . . , n, are generated following a normal distribution, with the standard deviation set to 10% of the standard deviation of the NDDE solution. We generate data sets containing n = 20, 50, and 100 time points divided over a fixed interval. We minimize the objective function given by is the solution to (8) at time t i using an arbitrary parameter vector θ. The bounds for the parameters are shown in Table 5. Figure 8 shows the zoomed-in box plots of the parameter estimates of the PMSD NDDE model. Aside from PS, the heuristic algorithms performed well in the parameter estimation. This might be attributed to having only 3 parameters being estimated. We note that there are many outliers in the   Figure 9. The number of data points used in the estimation does not appear to significantly affect the cost function value and the fit of the solutions to the data points.
Results of the parameter bootstrapping method show small standard errors of the parameter estimates in the PMSD NDDE model with mean estimates lying within the 95% confidence intervals (Table 6). Figure 10 shows a good fit to the data of most of the solutions using the re-estimated parameters.

D. VARYING MAXIMUM NUMBER OF FUNCTION EVALUATIONS
The only stopping criterion used in the previous simulations is the maximum number of function evaluations and this is set to 2000 times the number of parameters to be estimated. Results show that GA-MPC consistently obtained good parameter estimates in the three models. Now, we look at how varying the maximum number of function evaluations affect the  We consider setting the stopping criterion to 500, 1000, 1500, or 2000 times the number of parameters to be estimated, which we denote by d. Figure 11 shows the mean cost function value of 20 independent runs performed using  Plots of the solutions U (t ) and V (t ) (cyan and magenta curves, respectively) to the age-structured NDDE model using the estimates from 100 runs of GA, PS, SA, and PSO. The generated data set with 100 data points are represented by asterisks. 500d or 1000d. In all three models, GA-MPC consistently provides parameter estimates with low cost function values even for smaller numbers of maximum function evaluations. This is advantageous in instances when modeling using NDDEs requires parameter tuning at a faster time.

IV. CONCLUSION
In this work, we examine the use of a modified genetic algorithm called GA-MPC in solving parameter estimation problems involving NDDEs. We test our method on three different NDDE models -a first-order NDDE, a system of first-order NDDEs, and a second-order NDDE. We look at VOLUME 9, 2021  particularly GA-MPC, can provide good estimates for the parameter values and a good fit to experimental data. The high accuracy and good consistency of the parameter estimates using GA-MPC are observed in all three different types of NDDE models we presented.
An advantage of using GA-MPC in solving the least-squares formulation arising from the parameter estimation problem is that it does not depend on the explicit formulation or the derivatives of the solution to the NDDE. Although heuristic algorithms are computationally expensive, they are robust and are easy to implement.
Many factors contribute to the performance of heuristic algorithms in solving optimization problems. For future research, one can explore modifying the default parameter values or the stopping criteria in the algorithms. One can also consider other recent heuristic algorithms and hybrids of heuristic algorithms in solving inverse problems in NDDEs. In this study, we have varied the number of data points and maximum number of function evaluations. For future work, one can use other metrics for comparison such as varying the noise level in the simulated data and adding different types of noise. Figures 12 to 14 show the plots of the solutions to the age-structured NDDE model using parameter estimates obtained from the data set containing 20, 50, and 100 points using GA, PS, SA, and PSO. The plots of U (t) and V (t) are the cyan and magenta curves, respectively, while the data points are represented by the asterisks. Figures 15 to 17 show the plots of the solution to the PMSD NDDE model obtained using the estimates from 20, 50, and 100 data points, using GA, PS, SA, and PSO. The plots of the solutions are in cyan and the data points are represented by the asterisks. From 2007 to 2015, he was an Instructor with the Institute of Mathematics, University of the Philippines Diliman, where he has been an Assistant Professor, since 2015. He works on numerical optimization, mathematical modeling, and partial differential equations. His current research interests include image reconstruction in electrical impedance tomography, applications of shallow water equations in tsunami mitigation strategies, solutions of neutral delay differential equations, data graduation in Hilbert spaces, and modeling the transmission of neglected tropical diseases. He is the First Prize Winner of the 2021 National Academy of Science and Technology Talent Search for Young Scientists. From 2007 to 2018, she was an Instructor with the Institute of Mathematics, University of the Philippines Diliman, where she has been an Assistant Professor, since 2018. Her research interests include mathematical models of the cardiovascular-respiratory systems, infectious diseases, and population models involving differential equations, utilizing tools, such as sensitivity analysis, parameter estimation, and optimal control theory.