Real-Time Implementation Comparison of Urban Eco-Driving Controls

Connected autonomous vehicle (CAV) technology has the potential to enable significant gains in energy economy (EE). Much research attention has been focused on autonomous eco-driving control enabled by various methods. In this study, the state of the literature on autonomous eco-driving control is reviewed, an overall systems’ description of eco-driving control for a CAV is provided, and representative methods are evaluated comparatively against each other in simulation. Simulations are conducted using real-world traffic signal data and a validated future automotive systems technology simulator (FASTSim) model. Results indicate that an EE improvement in the range of 5%–15% is attainable depending on the method and cost function used. In this article it is shown that dynamic programming (DP) methods are most effective in improving EE but are significantly more computationally expensive than other methods. The genetic algorithm (GA) methods are shown to present the most potential in terms of EE improvement and run-time. Results also indicate that velocity-sensitive cost functions allow all the methods to perform better than pure acceleration minimization.

development effort has gone into the reduction in energy use and greenhouse gas (GHG) emissions from road vehicles. Over time, vehicles have become significantly more efficient in terms of both energy economy (EE) and GHG emissions/mi [1], [2] under pressure from environmental regulations from the U.S. Environmental Protection Agency (EPA) and its global equivalents which exert ongoing pressure on original equipment manufacturers (OEMs) to continue this effort [3]. To improve vehicular energy efficiency, traditional internal combustion engine (ICE) powertrains have incorporated electric motors and evolved into hybrid electric vehicles and battery electric vehicles (BEVs) [4] which promise further GHG reductions per vehicle [5]. Regardless of powertrain technology and regardless of methods of power generation, the pressure to reduce vehicular energy consumption will continue to be present.
Vehicle energy efficiency is also subject to modes of operation. Eco-driving is a strategy designed to reduce fuel consumption by minimizing accelerations and unnecessary braking events. Eco-driving is well-known and has been shown to be effective when used by human drivers [6]. As an example, eco-driving is taught as a part of drivers' education in Singapore and has resulted in an EE improvement of 11%-15% there [7]. Differences in culture, infrastructure, and available technology will play a major role in determining the effectiveness of efforts to promote manual eco-driving. Vehicular autonomy and connected autonomous vehicle (CAV) technology provide a more general opportunity for the application of eco-driving strategies because they circumvent driver acceptance/training issues. When compared with a human driver (i.e., manual eco-driving), a CAV has the ability to follow optimal trajectories precisely and can take into account information which is beyond line-of-sight.
Compared with manual eco-driving, autonomous eco-driving yields the following potential benefits: 1) ability to precisely follow optimal energy traces; 2) ability to account for traffic information which is beyond line-of-sight; 3) ease and scalability of implementation; 4) improved driver/passenger acceptance.
A great variety of solutions for autonomous eco-driving control have been put forward in the literature. This diversity is due to the complicated nature of the problem and the many-dimensional design space which results from it. To the authors' knowledge, no comprehensive, comparative study exists. This study attempts to address this research gap by summarizing and subdividing eco-driving control strategies, defining a framework for comparative implementation of solver methods, implementing a selection of common methods, and evaluating these methods in terms of performance and practicality using real-world data [8]. The current state of the literature is discussed in Section II, a system and subsystems' overview for an assumed eco-driving CAV is provided in Sections III-VI, the results are presented in Section VII, and conclusions are presented in Section VIII.

II. LITERATURE REVIEW
Much research exists in the area of autonomous eco-driving controls. In conducting the literature review, the authors were particularly interested in publications which proposed methods which might be implemented in real-time. A real-time control was defined as a control which was explicitly implemented, or could be implemented, in a receding-horizon setting. Such a control should be able to execute multiple times per second.
The authors propose that the methods reviewed may be categorized by purpose and structure as follows.First, a division can be made into the categories of rule-based and optimal methods.The rule-based methods serve the purpose of providing simple and computationally light algorithms for computing target speed on an instantaneous basis.The rule-based methods often mimic the heuristics that human drivers follow when attempting to minimize energy consumption such as lighter accelerations and longer following distances.In contrast, optimal methods attempt to find a minimum energy consumption trace for a given time or distance horizon.Optimal methods, thus, require information about future conditions even if this is done purely with assumptions.Within the set of optimal methods, one can further subdivide into globally optimal methods and locally optimal methods.The globally optimal methods serve the purpose of finding the control which results in the global minimum energy consumption.For the globally optimal methods, function dictates form and all the methods proposed are variants of dynamic programming (DP).The locally optimal methods serve the purpose of finding a control trace which is more efficient than one which could be attained by a rule-based method but require less computational time than the globally optimal methods.The locally optimal methods often involve transcribing the problem into the time domain and performing trajectory optimization.As will be seen in Section VII, local optima will often resemble the global optimum far more closely than they do a rule-based method's solution.The authors propose a taxonomy based on groupings in form and function which divides methods into the following categories: rule-based eco-driving (RBED), discretized control optimization (DCO), and polynomial trajectory optimization (PTO).

A. Rule-Based Eco-Driving
RBED is a subset of autonomous eco-driving control wherein a vehicle reduces its energy consumption through a set of predefined rules which are functions of vehicle states.Due to their feedforward nature, the RBED methods are relatively simple to implement.Compared with normal human driving behavior, the RBED methods are capable of yielding considerable fuel economy improvement [9], [10].A common RBED algorithm is intelligent driver model (IDM) [11] with several works presenting modified versions of the method in eco-driving simulations [12], [13], [14], [15].Although non-IDM RBED methods appear in [16], [17], and [18], IDM and its derivatives dominate RBED literature and are often used as a comparison point in optimal eco-driving literature.When implemented on a sufficient percentage of vehicles, the RBED methods have shown promise in traffic calming [12], [19].RBED control has also been extended to cooperative and centralized fleet control schemes [13], [14].

B. Discretized Control Optimization
The purpose of a DCO method is to compute optimal controls for a vehicle at a set of discrete points in time or distance.The DCO methods require a state transition model and information about future exogenous inputs.The DCO category consists, primarily, of the DP and reinforcement learning (RL) methods.
The DP [20], [21], is a well-known mathematical optimization method which will produce globally optimal solutions to control problems subject to a chosen discretization.A realization of the DP-derived optimal solution depends on whether the chosen discretization and the model appropriately match the real-world application.To account for constraints in position and speed inherent to autonomous eco-driving control, both must be problem states.The control in the autonomous eco-driving problem is acceleration or a related control such as throttle.Such a two-state one-control DP algorithm is presented in [22] and [23] which minimizes fuel consumption while navigating around traffic signals.Hellström et al. [24] present a two-state one-control DP algorithm for heavy-duty trucks in highway conditions.Both the methods must execute at a low rate and serve to set targets for a lower level controller.The primary issue with the DP methods for real-time implementation is that run-times scale exponentially with the number of states and controls.This scaling issue is often referred to as the "curse of dimensionality."In the autonomous eco-driving literature, DP solutions proposed as real-time controls use suboptimal implementations of DP to avoid the issue.The DP methods are also often proposed as a high-level control algorithm, executing at low frequency, which serves to set targets for a low-level controller.It is most common to see DP implemented as a comparison point for the performance of another proposed solution with the caveat that the DP solution is not a candidate for real-time implementation.
Suboptimal implementations of DP are found in [25], [26], [27], and [28]. Maamria et al. [25], [26] overcome the run-time scaling issues by removing position as a problem state. This is accomplished by adding a tunable constant cost to the running cost to ensure that the correct final distance is reached at the correct time. This tunable parameter must be found via numerical root-finding. Overall, this method can be thought of as a pseudo two-state DP (2SDP) method. The pseudo two-state method was found to execute in less time than an equivalent 2SDP method, which emphasizes the importance of the run-time scaling effects inherent to DP. A major limitation of [25] and [26] is that, with the position state removed, it is not possible for the optimization to account for traffic signals in fixed positions, making the method less applicable for urban eco-driving. Deshpande et al. [27] propose an approximate DP (ADP) solver for traffic-signal constrained driving which uses a nonoptimal rollout method to approximate the cost-to-go. Deshpande et al. [27] account for traffic signals by determining whether it is feasible to pass in a "go" phase or, if not, implementing eco-approach and eco-departure. Gupta et al. [28] propose a method by which precomputed DP solutions may be adjusted to account for perturbations in external inputs without having to recompute the DP solution, thus reducing the required frequency of DP method evaluations for real-time control. In all cases, global optimality is traded for reductions in run-time.
Vahidi and Sciarretta [29], Stanger and del Re [30], Xu and Peng [31], Groelke et al. [32], Bae et al. [33], and Sun et al. [34] propose a DP-based method where the DP solution is computed at a low frequency and is used as a target by a lower level controller.Exemplary of the type is [35] which uses vehicle to infrastructure (V2I) information and DP to set velocity targets for a cruise control system for urban driving.This method was tested both in hardware in loop (HIL) simulation and on-road and was shown to produce a 30% EE improvement at a cost of an 8% increase in travel time.
The RL-based methods are proposed in [36], [37], and [38].Lee et al. [36] use RL for optimizing motor power control for an electric vehicle subject to road grade but not traffic.The RL control was found to perform nearly as well as DP for the same problem.The algorithms seen in [37] and [38] are focused on comfort (reduction in jerk) and collision avoidance rather than eco-driving and also found similar performance to equivalent DP solutions with lower run-time.Ultimately, RL suffers from the same disadvantages that DP does for the application, namely, the long run-time required to compute the strategy, but not to the same extent.

C. Polynomial Trajectory Optimization
The eco-driving optimal control problem can also be solved as a trajectory optimization problem by transcribing into the time domain. Direct transcription transforms the problem into an n-dimensional optimization with the number of dimensions set by the level of discretization. Run-time for the trajectory optimization will scale with dimensionality depending on the solver used. At very high levels of discretization, linear interpolation can be used between trajectory points. To reduce run-time, a coarser discretization may be used, but this will necessitate polynomial interpolation between the optimization points. Because every segment of an interpolation polynomial is a function of multiple knot points, using an interpolation polynomial comes at the cost of introducing nonlinearity into the problem. The PTO methods may use bounded nonlinear solvers such as interior-point optimization (IPOPT) or sequential least-squares programming (SLSQP), or metaheuristics.
PTO is commonly used for motion planning in robotics [39].Nonlinear bounded solvers are used to perform PTO for autonomous eco-driving in [40], [41], and [42].A comparison to DP is provided in [43] for the related optimal energy management problem where the PTO method, using a nonlinear bounded solver, was shown to be able to approximate the globally optimal solution and to produce a solution in orders of magnitude less time than DP.The particularities of the optimal eco-driving problem are difficult for bounded nonlinear solvers to deal with.The issue is that vehicle motion is subject to time-varying constraints in position caused by other vehicles and by traffic signals as well as in speed by other vehicles and speed limits.These constraints will be discussed in Section IV.The combination of nonlinearity caused by interpolation polynomials and the complexity of the constraints makes the computation of meaningful gradients difficult, and thus, gradient-based solvers could struggle.Hamednia et al. [40], Khalik et al. [41], and Padilla et al. [42] do not consider distance and speed constraints simultaneously.The issues that gradient-based solvers experience are somewhat mitigated in [44] in which best interpolation splines [45], [46] are used rather than interpolation polynomials.Best interpolation splines consider each segment separately and can be used to guarantee that constraints will not be violated but at the cost of additional run-time.
Metaheuristics are also commonly used as solvers for PTO methods with genetic algorithm (GA) and particle swarm optimization (PSO) methods being the most often proposed.GA and PSO take inspiration from nature.PSO [47] takes inspiration from animals which exhibit schooling or swarming behaviors.PSO works by generating a field of candidate solutions (particles) and then computing gradients for each particle based on individual and global best discovered solutions.GA [48], [49], [50] mimics natural selection by encoding decision variables for a discretized problem into phenotypes and then mating the highest fitness phenotypes over many successive iterations.GA can be modified in many ways including by introducing random mutation, elitist selection, and others to change the breadth of the search.Where PSO is still at its core a gradient descent method, GA is not and is, thus, not subject to the difficulties that gradient descent methods face with the optimal eco-driving problem.
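The particle update described above can be illustrated with a minimal sketch. This is not the implementation used in any of the cited works: the function name `pso_minimize`, the inertia and attraction weights, and the toy acceleration-only cost are assumptions chosen purely for illustration, and the bounds are simple box limits rather than the eco-driving corridor.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic particle swarm search over a box-bounded decision vector."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]   # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost


def accel_cost(accels):
    """Toy cost (assumption): sum of squared accelerations."""
    return sum(a * a for a in accels)


best, best_val = pso_minimize(accel_cost, dim=5, bounds=(-3.0, 3.0))
```

Because the search is seeded, repeated runs give the same result; without a fixed seed, the randomness noted in the literature would appear as run-to-run variation.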
GA was used in [51] to generate optimal driving operations from real-world data with the final results yielding an improved fuel economy of 22% compared with the initial population.Similarly, Li et al. [52] used GA to group vehicles in compatible streams to provide a smoother traffic flow with the algorithm scaling favorably in comparison to DP.In recent studies [53], [54], [55], GA PTO methods were applied to both conventional and electric vehicles with results showing favorable fuel economy improvement for both types of vehicles.PSO was used in various studies to optimize energy consumption for individual vehicles [52], [56], [57], [58] and to streamline vehicle platoon behavior at intersections [59].A comparison of PSO-based PTO method with DP [58] found that PSO significantly underperformed DP in terms of efficiency but executed in significantly less time.PSO and GA have also been used in conjunction, and the combined method was shown to be more effective than either individually [52], [57].
The general consensus in the literature would be that PTO methods provide the opportunity to compute locally optimal solutions in substantially less time than a globally optimal solution could be computed using DP.The constraints used in much of the PTO literature were simplifications of what the authors would consider the minimum constraints for optimal eco-driving in urban conditions.The complex boundaries inherent to the optimal eco-driving problem are difficult for gradient-based solvers to account for and are, perhaps, easier for metaheuristics to account for.However, the use of GA or PSO comes at the cost of introducing randomness to the problem.

D. Summary
The publications reviewed are listed by category and method type in Table I.The literature contains a variety of approaches to the optimal eco-driving problem.There is significant variation in the constraints used in the studies surveyed.The distinction in constraints more or less reflects a division in focus between urban driving and other types of driving.Urban driving is constrained by the positions and velocities of surrounding vehicles as well as traffic signal locations and states.The constraints present in urban driving are time-varying.Inevitably, the constraints used will need to be approximate as precise knowledge of future values is not possible.Because all the optimal eco-driving methods proposed are intended to be used in a receding-horizon manner, some simplification is acceptable.However, to make direct comparisons between methods, a standard and sufficiently representative set of constraints must be applied to all.

III. SYSTEM DEFINITION
The eco-driving system can be broken down into three subsystems as shown in Fig. 1. This system-level diagram is consistent with advanced vehicle control applications such as autonomous vehicles [60] and with energy efficiency improvement strategies such as optimal energy management [61]. The eco-driving subsystems are, respectively, the perception subsystem, the planning subsystem, and the plant subsystem. The perception subsystem uses the sensors and connectivity capabilities of the ego vehicle and computes motion boundaries based on a detected lead vehicle (with on-board sensors and V2V) and upcoming traffic signal information (V2I). The share of new vehicles equipped with a forward object detection system will soon reach 100% per an agreement between the National Highway Traffic Safety Administration (NHTSA) and automakers which mandates the inclusion of said systems to enable automatic emergency braking as a standard feature [62]. These systems often comprise a radar and a visual object detection system which work in concert to determine the location, motion, and type of objects in the ego vehicle's forward vision cone. In addition to enabling safety-oriented features such as collision avoidance systems, the forward object detection system also enables convenience-oriented features such as adaptive cruise control. In the future, most vehicles may also be equipped with V2I technology in the form of a transponder which communicates with infrastructure transponders according to the Society of Automotive Engineers (SAE) specification J2735 [63]. Among the messages contained in the SAE J2735 specification are the SPaT and MAP messages, which provide the signal phase and timing and the locations of the upcoming traffic signals.
The vehicle is assumed to contain a two-level controller with a high-level controller computing an optimal eco-driving trace and the low-level controller being responsible for carrying out the optimal eco-driving trace in a safe manner.The planning subsystem, which is composed of the high-level controller, takes the information about lead vehicle motion, speed limit, and future traffic signal information and uses it to compute the optimal eco-driving trace.
The final subsystem, the plant, is the ego vehicle (physical or simulated) which executes the optimal eco-driving trace and outputs the resultant energy consumption.The subsystems and the manner in which they are treated in this study are explained in Sections IV-VI.

IV. SUBSYSTEM 1: PERCEPTION
For the eco-driving control to be evaluated in a real-world context, the algorithms which generate the optimal eco-driving trace must be able to function using only information which is currently or will soon be available to CAVs. The information which is available to CAVs comes from the advanced driver assistance system (ADAS) of the CAV and from V2I communication where available. The data which are available to CAVs via their ADAS and V2I communication are listed in Table II and are further elaborated in [64].

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

TABLE II DATA AVAILABLE TO CAVS
With this information, a CAV can generate path constraints for the optimal eco-driving problem.For this study, path constraints consist of allowable locations (distances along the vehicle path) for the vehicle at specific times and allowable speeds for the vehicle at specific locations.

A. Path Constraints
The ego vehicle should not be deliberately programmed to violate traffic laws even if this provides efficiency and/or travel-time benefits [65]. This means that the vehicle should not exceed the speed limit, disobey traffic signals, or collide with any other vehicle. If the ego vehicle is the first vehicle in a queue, then an upper boundary can be generated from SPaT knowledge as an inequality constraint
$$x(t) \le B_U(t), \quad \forall t \in [0, T]$$
where $x$ is the vehicle distance along its route, $B_U$ is the upper boundary which is a function of time, and $T$ is the final time of the drive cycle. There are many ways to generate this upper boundary based on lead vehicle and traffic signal state. A general approach would be to formulate the upper boundary based on a piecewise function wherein the boundary is generated by the closer of the nearest stop phase and the immediate lead vehicle. For the purposes of this study, only vehicles with no immediate lead vehicle are considered (i.e., there are vehicles in front of the ego vehicle but always at least one signal away) as in such a case, a long-term optimal trajectory can be generated.
An assumption made here is that waiting out a go phase while not moving, although potentially optimal for a single vehicle, will cause congestion and will not be fleet optimal. Thus, a lower bound on distance as a function of time is also defined as an inequality constraint
$$x(t) \ge B_L(t), \quad \forall t \in [0, T]$$
where $B_L$ is a piecewise function of time based on the positions and phases of leading traffic signals. The upper and lower bounds combine to form a "corridor" on a phase map as shown in Fig. 2. In limiting possible paths to those entirely within the corridor, the ego vehicle is limited to largely following traffic norms and is far less likely to radically affect normal traffic patterns. The selection of stop phases to define the boundaries is done using the IDM further described in Section V. As the IDM represents a baseline driver, the corridor created this way is one which must reflect normal driving and, thus, is useful for this purpose. It should be noted that the boundaries created in this manner are nonconvex.
Traffic signals are generally adaptive and were in this specific case.In this study, full knowledge of traffic signal timing in the future is assumed.The effects of adaptive traffic signal timing may be dealt with through the implementation of stochastic constraints as in [34].Uncertainty on the timing of traffic signals will have the effect of extending the stop phases as used in optimization and thus tightening the corridor.
An element of reality is added to this study through the use of real-world SPaT data in the generation of path boundaries. These data were collected in 2019 and consist of traffic light phase and timing data from 19 traffic signals along a 4-mi route in downtown Fort Collins, CO, USA. These data were collected by the authors and their collection is described in [66]. Several hours of SPaT data for each of the traffic signals were collected in collaboration with the Fort Collins Traffic Operations Center. From these data and the distances of the traffic signals along the route, a phase map was constructed.
To conform to traffic norms and regulations, the ego vehicle velocity is required to satisfy the inequality
$$v(t) \le S_L(t)$$
where $S_L(t)$ is the road speed limit at time $t$. For the Fort Collins drive cycle used in this study, the speed limit for all roads at all times was 35 mi/h (15.65 m/s).
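As an illustration of how the path constraints of this section might be checked for a candidate trace, the following sketch tests sampled positions and speeds against the corridor and the speed limit. The function `within_corridor` and the single-signal example (a stop phase 200 m ahead that ends at 20 s) are hypothetical and are not derived from the Fort Collins data.

```python
def within_corridor(times, positions, speeds, lower_bound, upper_bound, speed_limit):
    """Return True if B_L(t) <= x(t) <= B_U(t) and v(t) <= S_L(t) at every sample."""
    for t, x, v in zip(times, positions, speeds):
        if not (lower_bound(t) <= x <= upper_bound(t)):
            return False
        if v > speed_limit(t):
            return False
    return True


# Hypothetical scenario: one signal 200 m ahead, in a stop phase until t = 20 s.
def upper_bound(t):
    return 200.0 if t < 20.0 else float("inf")


def lower_bound(t):
    return 0.0


def speed_limit(t):
    return 15.65  # 35 mi/h expressed in m/s


times = [0.0, 10.0, 20.0, 30.0]
# Arrives at the stop line just as the stop phase ends.
feasible = within_corridor(times, [0.0, 100.0, 200.0, 350.0],
                           [10.0, 10.0, 10.0, 15.0],
                           lower_bound, upper_bound, speed_limit)
# Passes the stop line at t = 10 s, during the stop phase.
runs_red = within_corridor(times, [0.0, 250.0, 400.0, 550.0],
                           [15.0, 15.0, 15.0, 15.0],
                           lower_bound, upper_bound, speed_limit)
```

A sample-wise check like this only detects violations at the sampled instants; a finer time grid would be needed to catch violations between samples.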

V. SUBSYSTEM 2: PLANNING
The planning subsystem is responsible for calculating an optimal eco-driving trace based on the constraints computed by the perception subsystem. As described in Section III, the planning system is assumed to contain a high-level controller which computes optimal velocities and a low-level controller which implements them. This study is only concerned with the high-level controller. It is also assumed that the high-level controller will operate with a shrinking horizon and that the solution will be recomputed at discrete time instants. The optimal methods selected for implementation were DP, DCO, and PTO using IPOPT, GA, and PSO as solvers. IDM was used as the baseline control to compare against. These methods are defined in the following subsections.

TABLE III VARIABLES AND PARAMETERS FOR IDM

A. Baseline Control
1) Intelligent Driver Model: IDM, developed by Treiber et al. [11], is an RBED method intended to enable agent-based traffic modeling. This model represented a step improvement on previous car-following models as it was meta-stable, prevented collisions, and all the parameters had physical interpretations. The IDM formulation uses the parameters described in Table III. In this study, as mentioned previously, only the optimal trace for the lead vehicle is considered. Thus, the upper bound of the traffic light constraints is used in place of a lead vehicle with varying distances but always zero speed.
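For reference, a minimal sketch of the IDM acceleration law in its commonly published form is given here. It is assumed, not confirmed by the text above, that this study uses the same form. The argument names `v0` (desired speed), `T` (time headway), and `s0` (minimum gap) and their default values are illustrative assumptions, since Table III is not reproduced here; the defaults for `a`, `b`, and `delta` follow the values adopted later in this section.

```python
import math


def idm_acceleration(v, gap, dv, v0=15.65, T=1.5, s0=2.0, a=5.0, b=5.0, delta=4.0):
    """Commonly published IDM acceleration (m/s^2).

    v   : ego speed (m/s)
    gap : distance to the lead vehicle or boundary (m), must be > 0
    dv  : approach rate, ego speed minus lead speed (m/s)
    """
    # Desired dynamic gap: standstill gap + headway term + braking term.
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a * b))
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


# A stopped "virtual lead vehicle" at the upper boundary has zero speed,
# so the approach rate equals the ego speed.
braking = idm_acceleration(v=10.0, gap=20.0, dv=10.0)
```

With a very large gap the interaction term vanishes, so the model accelerates at close to `a` from rest and settles at `v0`.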
IDM can represent a spectrum of drivers in terms of aggression in acceleration and follow distance. Parameter selection for IDM is important as it affects the efficiency of the generated trace. Those parameters which have the greatest effect on EE are a, b, and δ. An experiment was run on said parameters using 100 different constraint sets per case and a future automotive systems technology simulator (FASTSim) [67] 2015 Kia Soul electric vehicle (EV) model. This experiment was a full-factorial design with the levels for a and b being 1, 5, and 9 m/s² (this range encompassing virtually all the passenger vehicle accelerations [68], [69]), and the levels for δ being 2, 4, and 6. The EE results of this experiment were regressed onto the values for a, b, and δ and interaction terms, and the results are presented in Table IV.
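The size of the full-factorial design described above can be made concrete with a short enumeration. The variable names are illustrative only; the levels and the 100 constraint sets per case are taken from the text.

```python
from itertools import product

a_levels = [1.0, 5.0, 9.0]       # maximum acceleration levels, m/s^2
b_levels = [1.0, 5.0, 9.0]       # comfortable deceleration levels, m/s^2
delta_levels = [2.0, 4.0, 6.0]   # acceleration exponent levels

# Every combination of the three factors at three levels each.
cases = [{"a": a, "b": b, "delta": d}
         for a, b, d in product(a_levels, b_levels, delta_levels)]

constraint_sets_per_case = 100
n_simulations = len(cases) * constraint_sets_per_case
```

Three factors at three levels give 27 cases, so the experiment amounts to 2700 simulated traces.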
The results of the regression analysis indicated that a, b, and δ were significant terms which negatively affected EE while none of the interaction terms was significant. Thus, values for a, b, and δ can be set independently. Several papers propose methods for setting these values or the values themselves. In literature, the default value for δ is given as 4 [11], [70], [71], [72]. NREL produced a report in 2021 [73] which extracted 39 000 individual driving features (acceleration-from-stop, deceleration-to-stop, and cruise events) from collected driving data and fit IDM parameters to the data. Although the IDM used by NREL is formulated slightly differently than in this article, the results are, nevertheless, informative. NREL found clusters for δ at 0.88, 1.40, 1.75, 2.13, and 4.78; ultimately, the article recommends a value of 4 for δ. Setting values for a and b was also based on literature where default values are generally given as 5 m/s² for both. The authors adopted these established values.

B. Optimal Control
All the optimal control solver methods address the same problem: find the control sequence which minimizes the cost $J$, composed of the running cost accumulated over the horizon and the final state cost, subject to (s.t.) the initial state and the path constraints. Here, the running cost is a function of $(S, U)$, the final state cost is a function of $S$, $S = [x, v]^{\top}$ is the state vector containing the problem states position and velocity, $S_0 = [x_0, v_0]^{\top}$ is the initial value of the state vector, $U = [a]$ is the control vector containing the control acceleration, $J$ is the cost for $\bar{S}$ and $\bar{U}$, and $B_L$ and $B_U$ are the vectors containing the constraints as described in Section IV. The overline indicates a sequence of values at multiple discrete time intervals. The goal of the optimization is to find the optimal eco-driving trace ($\bar{U}^{*}$) such that the corresponding cost is equal to $J^{*}$.
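One plausible way to write the problem described by these definitions is sketched below. The symbols $\ell$ (running cost), $\phi$ (final state cost), and $N$ (number of time steps) are placeholders introduced here for illustration and should not be read as the authors' notation; the speed-limit constraint is taken from Section IV.

```latex
\begin{aligned}
\bar{U}^{*} &= \arg\min_{\bar{U}} \; J(\bar{S}, \bar{U}), \qquad
J(\bar{S}, \bar{U}) = \sum_{k=0}^{N-1} \ell(S_k, U_k) + \phi(S_N) \\
\text{s.t.} \quad & S_0 = [x_0, v_0]^{\top}, \\
& B_{L,k} \le x_k \le B_{U,k}, \qquad v_k \le S_{L,k}, \qquad k = 0, \dots, N
\end{aligned}
```

The state sequence $\bar{S}$ is tied to the control sequence $\bar{U}$ through the vehicle dynamics, which each solver represents in its own way.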
The cost $J$ is evaluated in one of three ways. It is common in the literature for the cost function for an eco-driving optimization to be entirely based on acceleration. The case for using an acceleration-based cost function is made in several papers [74], [75], [76]. Nevertheless, the authors chose to consider other, progressively less abstracted cost functions as well. The cost functions were also designed to enable fast computations for both long traces and single time steps. The cost functions used are specified below.

1) Cost Functions:
a) Acceleration $\ell_2$ norm (Al2N) cost function: The Al2N cost function is simply the square of the $\ell_2$ norm of the acceleration sequence. It is given by
$$J = \lVert \bar{U} \rVert_2^2 = \sum_{k} a_k^2$$
Note that minimizing the $\ell_2$ norm squared is equivalent to (gives the same acceleration sequence as) minimizing the $\ell_2$ norm but is computationally advantageous as it does not require computing square roots or dealing with nonsmoothness at the origin.

b) Road power cost (RPC) cost function:
The RPC cost function is based on the road loads' ABC formula [77] multiplied by velocity to return power. This cost function takes into account the impacts of viscous and aerodynamic drag in addition to acceleration, and it is given by
$$J = \sum_{k} \left(A + B v_k + C v_k^2 + m a_k\right) v_k$$
where $A$, $B$, and $C$ are the coefficients of the road loads' equation, and $m$ is the vehicle mass. For FASTSim vehicles, the road loads' coefficients are not provided, and hence were chosen as $A = 0$, $B = C_{RR}$, and $C = \rho F C_D$ with $C_{RR}$ being the coefficient of rolling resistance, $\rho$ being the density of air, $F$ being the vehicle frontal area, and $C_D$ being the vehicle coefficient of aerodynamic drag. One of the important aspects of the Al2N and RPC cost functions is their independence of the powertrain model. An approach related to RPC has been studied in [78] under the name wheel power minimization.
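The difference between the two powertrain-independent cost functions can be sketched as follows. This is an illustrative reading of the descriptions above, not the authors' code: the function names, the time step `dt`, and the numeric vehicle values in the example are assumptions.

```python
def al2n_cost(accels):
    """Squared l2 norm of the acceleration sequence."""
    return sum(a * a for a in accels)


def road_power(v, a, mass, A, B, C):
    """Road-load force (ABC formula) plus inertial force, times speed (W)."""
    return (A + B * v + C * v * v + mass * a) * v


def rpc_cost(speeds, accels, dt, mass, A, B, C):
    """Road power accumulated over the trace (J), assuming a fixed step dt."""
    return sum(road_power(v, a, mass, A, B, C) * dt
               for v, a in zip(speeds, accels))


# Hypothetical values: two cruising traces with identical (zero) acceleration.
slow = rpc_cost([10.0, 10.0], [0.0, 0.0], dt=1.0, mass=1500.0, A=0.0, B=0.0, C=0.4)
fast = rpc_cost([15.0, 15.0], [0.0, 0.0], dt=1.0, mass=1500.0, A=0.0, B=0.0, C=0.4)
same_al2n = al2n_cost([0.0, 0.0])
```

Both traces have the same Al2N cost, while the RPC cost distinguishes them by speed, which is the sense in which RPC is velocity-sensitive.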

c) Battery power cost (BPC) cost function:
The BPC cost function is an extension of the RPC cost function which accounts for the efficiency of the motor/inverter based on power requirements. This calculation is a simplified facsimile of the FASTSim model and requires powertrain modeling details. The BPC cost is calculated from the road power using two efficiency terms: the transmission efficiency term $\eta_T$, which is a constant, and the motor/inverter efficiency term $\eta_{M/I}$, which is calculated by interpolating using the FASTSim motor efficiency curve. It should be noted that BPC requires more component-specific information and interpolation and, thus, may be more difficult to implement and will require more computational time.
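A rough sketch of how such an efficiency correction might be applied is shown below. Everything specific in it is an assumption made for illustration: the efficiency curve is invented rather than taken from FASTSim, and the choice to divide by the efficiencies when driving and multiply when regenerating is one common convention, not necessarily the one used in this study.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation, clamped at both ends of the table."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])


# Hypothetical motor/inverter efficiency curve (NOT the FASTSim curve).
MOTOR_POWER_KW = [0.0, 10.0, 40.0, 80.0]
MOTOR_EFF = [0.80, 0.90, 0.93, 0.90]


def battery_power(road_power_w, eta_t=0.95):
    """Battery-side power (W) for a given road power (W)."""
    eta_mi = interpolate(abs(road_power_w) / 1000.0, MOTOR_POWER_KW, MOTOR_EFF)
    eta = eta_t * eta_mi
    if road_power_w >= 0.0:
        return road_power_w / eta   # traction: losses increase the battery draw
    return road_power_w * eta       # regeneration: losses reduce recovered power


draw = battery_power(10000.0)
recovered = battery_power(-10000.0)
```

The table lookup at every evaluation is the source of the additional computational time noted above, relative to the closed-form RPC expression.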
d) Summary: The three different cost functions reflect three different approaches to optimizing EE.A comparison of the cost function values for a 2015 Kia Soul EV is shown in Fig. 3.
Note that the velocity-sensitive cost functions RPC and BPC have similar contour plots but both differ significantly from the Al2N cost function.
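As a minimal sketch of how the three cost functions described above relate to one another, the Python below evaluates each of them for a short velocity and acceleration trace. The function names, the sample inputs, and in particular the algebraic forms of the RPC and BPC expressions are assumptions made for illustration: they follow the verbal descriptions (road load multiplied by velocity, then a correction by drivetrain efficiency) and are not taken from the study's equations or from FASTSim.

```python
def al2n_cost(accels):
    """Squared l2 norm of the acceleration sequence (no square root needed)."""
    return sum(a * a for a in accels)


def road_power(v, a, mass, coef_a, coef_b, coef_c):
    """Assumed road power: (inertial force + A + B*v + C*v**2) times velocity."""
    return (mass * a + coef_a + coef_b * v + coef_c * v * v) * v


def rpc_cost(vels, accels, mass, coef_a, coef_b, coef_c):
    """Sum of the assumed road power over the trace; negative terms model regeneration."""
    return sum(road_power(v, a, mass, coef_a, coef_b, coef_c)
               for v, a in zip(vels, accels))


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the end values outside the table."""
    if x <= xs[0]:
        return ys[0]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return ys[-1]


def bpc_cost(vels, accels, mass, coef_a, coef_b, coef_c, eta_t, power_pts, eff_pts):
    """Assumed battery power: road power corrected by transmission and motor efficiency."""
    total = 0.0
    for v, a in zip(vels, accels):
        p = road_power(v, a, mass, coef_a, coef_b, coef_c)
        # Hypothetical efficiency table indexed by absolute power.
        eta = eta_t * interp(abs(p), power_pts, eff_pts)
        # Assumption: losses raise battery draw when driving and cut recovered power when braking.
        total += p / eta if p >= 0 else p * eta
    return total


# Hypothetical three-step trace: accelerate, cruise, decelerate at 10 m/s.
vels = [10.0, 10.0, 10.0]
accels = [1.0, 0.0, -1.0]
print(al2n_cost(accels))
print(rpc_cost(vels, accels, 1000.0, 0.0, 0.0, 0.5))
print(bpc_cost(vels, accels, 1000.0, 0.0, 0.0, 0.5, 1.0, [0.0, 20000.0], [0.9, 0.9]))
```

In this sketch the Al2N value depends only on the accelerations, whereas the RPC and BPC values change with speed, which is the distinction drawn between the acceleration-only and velocity-sensitive cost functions.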
2) Optimizers:
a) Two-state DP: DP is a well-known and commonly used optimal control method. The principal advantage of DP is that it guarantees a globally optimal solution subject to the chosen discretization. The primary disadvantage of DP is that it generally requires significantly greater computational time and effort than other methods and heuristics. In the case of eco-driving control, which is a two-state, one-control nonlinear optimization problem with time-varying constraints, 2SDP is a natural choice, and it appears in the literature in multiple forms as described in Section II. The dynamics of the problem are represented in discrete time. The boundary violation cost function $J_{PC}$ is shown in (20), where the path constraints in $x$ were enforced by a squared error penalty function. The boundary violation cost is added to the running cost for $(S_k, U_k)$ at each time step, where $B_{L,k} = B_L(t_k)$, $B_{U,k} = B_U(t_k)$, and $S_{L,k} = S_L(t_k)$. In the final state cost function, $x_{\text{target}}$ is the desired ending position, and $\beta_{FS}$ is a tuned parameter.

b) Spline nonlinear programming (SNLP): A second common method to solve time-varying control problems is direct transcription (DT) [79], wherein a problem in continuous time is transcribed to the time domain and solved at discrete times. DT significantly increases the dimensionality of a control problem but allows the use of efficient methods such as IPOPT and SLSQP for linear and nonlinear problems [80], [81], [82]. The dynamics of the problem are again represented in discrete time. The running cost and final state cost functions are the same as those for 2SDP and are shown in (20) and (21), respectively.
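The two-state DP idea described above (position and velocity as states, acceleration as the single control, a squared-error penalty for leaving the corridor, and a final state cost) can be illustrated with a deliberately small sketch. Everything specific here is an assumption for illustration: the unit time step, the integer position and velocity grid, the three-valued acceleration set, the squared-acceleration running cost, the penalty weights, and the name `solve_two_state_dp`. It is not the study's implementation.

```python
from functools import lru_cache


def solve_two_state_dp(n_steps, x0, v0, x_target, lower, upper,
                       accels=(-1, 0, 1), v_max=5, beta_pc=100.0, beta_fs=100.0):
    """Return (optimal cost, acceleration sequence) on an integer grid with unit time step.

    lower[k] and upper[k] are the corridor bounds on position at step k.
    """

    @lru_cache(maxsize=None)
    def cost_to_go(k, x, v):
        # Final state cost: squared distance from the desired ending position.
        if k == n_steps:
            return beta_fs * (x - x_target) ** 2, ()
        best_cost, best_seq = float("inf"), ()
        for a in accels:
            v_next = v + a
            if v_next < 0 or v_next > v_max:
                continue  # speed limits are treated as hard constraints here
            x_next = x + v_next  # semi-implicit Euler keeps positions on the grid
            # Boundary violation cost: squared error outside the corridor.
            below = max(0, lower[k + 1] - x_next)
            above = max(0, x_next - upper[k + 1])
            penalty = beta_pc * (below ** 2 + above ** 2)
            tail_cost, tail_seq = cost_to_go(k + 1, x_next, v_next)
            total = a * a + penalty + tail_cost
            if total < best_cost:
                best_cost, best_seq = total, (a,) + tail_seq
        return best_cost, best_seq

    return cost_to_go(0, x0, v0)


# Open corridor: holding the initial speed reaches the target at zero cost.
print(solve_two_state_dp(6, 0, 2, 12, [0] * 7, [100] * 7))
# Corridor closed beyond x = 4 at step 3 (a red-light-like bound): the vehicle must slow early.
print(solve_two_state_dp(6, 0, 2, 12, [0] * 7, [100, 100, 100, 4, 100, 100, 100]))
```

The memoized recursion visits every reachable (step, position, velocity) triple once, so the work grows with the product of the grid sizes. This is the source of the computational expense attributed to DP relative to the other methods.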
An issue with using IPOPT to solve discrete-time optimal control problems is that the run-time required scales exponentially with the length of the state vector [83]. To avoid using extremely high levels of discretization, it is common to use polynomial interpolation between more distant optimization points. The authors chose to define trajectories using piecewise cubic Hermite interpolating polynomial (PCHIP) splines with knots placed at those points in time where the upper or lower boundaries change. In the trajectory definition, $\epsilon$ are the locations of the vehicle at the knot times ($t_{\text{knots}}$) relative to the boundaries at the knot times ($B_{L,\text{knots}}$ and $B_{U,\text{knots}}$), and $t$ is the discrete time vector for the problem.

c) Spline GA (SGA): The first metaheuristic method discussed is the SGA. For this study, the phenotypes optimized are $\epsilon$ vectors. The initial population is generated randomly, with an initial guess inserted in place of one randomly generated phenotype. The GA uses sorted selection, wherein the best phenotypes are selected for crossover, and random mutation, wherein a certain percentage of the total chromosomes from all the phenotypes are changed to a random number each step. The method also uses elitist carry-over, wherein the best phenotype is kept unchanged for the next step. The dynamics and cost function for SGA are identical to those for SNLP. GA is inherently parallelizable and scalable, meaning that it is well suited to modern parallel computing and may benefit significantly in terms of run-time from such an implementation.
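The GA steps named above (random initial population with one seeded guess, sorted selection, crossover, random mutation, and elitist carry-over) can be sketched as follows. This is an illustrative sketch and not the study's solver: the function name, the population sizes, the single-point crossover, the per-gene mutation probability, and the choice of genes in [0, 1] (standing in for vehicle locations relative to the corridor boundaries) are all assumptions. The cost function is a placeholder supplied by the caller.

```python
import random


def genetic_minimize(cost, n_genes, guess, pop_size=30, n_parents=10,
                     mutation_rate=0.05, n_generations=50, seed=0):
    """Minimize cost over vectors of n_genes values in [0, 1] (requires n_genes >= 2)."""
    rng = random.Random(seed)
    # Random initial population with the initial guess replacing one random member.
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    pop[rng.randrange(pop_size)] = list(guess)
    for _ in range(n_generations):
        pop.sort(key=cost)           # sorted selection: best phenotypes first
        elite = list(pop[0])         # elitist carry-over: best phenotype kept unchanged
        parents = pop[:n_parents]
        children = []
        while len(children) < pop_size - 1:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)      # single-point crossover (assumed)
            children.append(p1[:cut] + p2[cut:])
        for child in children:
            for i in range(n_genes):
                if rng.random() < mutation_rate:  # random mutation of a share of genes
                    child[i] = rng.random()
        pop = [elite] + children
    return min(pop, key=cost)


def toy_cost(genes):
    """Placeholder cost, used here only to exercise the solver."""
    return sum((g - 0.5) ** 2 for g in genes)


best = genetic_minimize(toy_cost, 5, [0.9] * 5)
print(toy_cost(best) <= toy_cost([0.9] * 5))
```

Because the elite phenotype is never mutated, the best cost cannot get worse from one generation to the next. Each child is evaluated independently of the others, which is what makes the method straightforward to parallelize.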

d) Spline PSO (SPSO):
The second heuristic method used is SPSO, which uses the PSO heuristic to optimize a positional spline trajectory. In this study, the particles used are $\epsilon$ vectors for a given set of boundaries, and the trace in distance and velocity is computed as in (24). PSO is a quasi-Newton method as it applies a modified gradient search but does so with many particles simultaneously. The particle velocity and position update equations for PSO are given as

$$V \leftarrow w V + c_1 r_1 (\epsilon_{\text{best},p} - \epsilon) + c_2 r_2 (\epsilon_{\text{best},g} - \epsilon), \qquad \epsilon \leftarrow \epsilon + V$$

where $V$ is the vector of particle n-dimensional velocities, $w$ is the momentum term which sets the weight of the current velocity, $c_1$ and $c_2$ are the local and global position weights, $r_1$ and $r_2$ are random weights assigned to the local and global terms, $\epsilon_{\text{best},p}$ is a vector of the best solutions found by each particle, and $\epsilon_{\text{best},g}$ is the global best solution found by any of the particles. In this study, a mutation step was added to the PSO solver to enable faster convergence [56], with the mutation step functioning similarly to how it functions in the SGA method. Like GA, PSO is inherently parallelizable and scalable, making it well suited for a parallel implementation.
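A minimal sketch of the PSO update described above is given below, using the standard velocity and position update with a momentum weight, local and global weights, and random factors. The function name, the parameter values, the clamping of particle positions to [0, 1], and the placeholder cost are assumptions for illustration. The mutation step mentioned above is omitted for brevity.

```python
import random


def pso_minimize(cost, n_dims, n_particles=20, n_iters=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost over vectors in [0, 1]^n_dims; returns (best position, best cost)."""
    rng = random.Random(seed)
    eps = [[rng.random() for _ in range(n_dims)] for _ in range(n_particles)]
    vel = [[0.0] * n_dims for _ in range(n_particles)]
    best_p = [list(p) for p in eps]               # best position found by each particle
    best_p_cost = [cost(p) for p in eps]
    g = min(range(n_particles), key=best_p_cost.__getitem__)
    best_g, best_g_cost = list(best_p[g]), best_p_cost[g]  # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_dims):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: momentum + pull to particle best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_p[i][d] - eps[i][d])
                             + c2 * r2 * (best_g[d] - eps[i][d]))
                # Position update, clamped to the unit interval (an assumption).
                eps[i][d] = min(1.0, max(0.0, eps[i][d] + vel[i][d]))
            c = cost(eps[i])
            if c < best_p_cost[i]:
                best_p[i], best_p_cost[i] = list(eps[i]), c
                if c < best_g_cost:
                    best_g, best_g_cost = list(eps[i]), c
    return best_g, best_g_cost


def toy_cost(x):
    """Placeholder cost, used here only to exercise the solver."""
    return sum((xi - 0.3) ** 2 for xi in x)


position, value = pso_minimize(toy_cost, 4)
print(value <= pso_minimize(toy_cost, 4, n_iters=0)[1])
```

Particles that start in poor regions move toward better ones only gradually through the two attraction terms, whereas the GA sketch discards poor candidates outright at each generation.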

VI. SUBSYSTEM 3: PLANT
For this study, a 2015 Kia Soul EV was selected as the vehicle of interest. This particular EV was selected because dynamometer data for it are available from ANL's downloadable dynamometer database (D3) [84] and because the research group owns a drive-by-wire capable physical vehicle for future studies. For vehicle simulation, NREL's FASTSim [67] was selected. FASTSim is an efficient, accurate, and robust longitudinal vehicle simulation which is commonly used in research. Construction of the FASTSim Kia Soul EV model was done using a combination of publicly available data, common FASTSim validated model parameters [85], and tuned model parameters. The model parameters are shown in Table V.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The two tuned parameters, $C_{RR}$ and maximum battery storage, were tuned from assumed values to best match the battery state of charge (SoC) and battery power traces from the D3 data. After tuning, the 2015 Kia Soul EV FASTSim model was able to match the D3 data to within 0.2% in terms of energy consumption while closely matching the SoC and battery power traces, with mean absolute percentage error values of 0.763% and 1.552%, respectively.¹ The vehicle plant will add uncertainty to the optimization in the forms of sensor noise and actuator error. Neither of these is modeled in FASTSim. However, Motallebiaraghi et al. [87] performed a dynamometer validation study which confirmed that similar results could be obtained with a physical vehicle plant, implying that the control is robust to the noise originating from physical sensors and actuators.
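For reference, the mean absolute percentage error used above to compare simulated and measured traces can be computed as in the short sketch below. The function name and the sample values are hypothetical, and the sketch assumes that no reference sample is zero.

```python
def mape(reference, simulated):
    """Mean absolute percentage error of simulated values against reference values."""
    # Assumes every reference value is nonzero.
    terms = [abs(r - s) / abs(r) for r, s in zip(reference, simulated)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical measured and simulated samples.
print(mape([100.0, 200.0], [110.0, 190.0]))
```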

VII. RESULTS
Each optimal eco-driving trace generation method was evaluated in terms of the following two criteria.
1) Ability to produce energy-efficient solution traces.
2) Ability to produce solutions within acceptable levels of run-time.
The authors evaluated the performance of the methods in generating optimal eco-driving traces for 5-min driving trajectories. Longer time horizons allow the solvers to improve over baseline to a greater degree but also increase the optimization space for the solvers, leading to rapid growth in run-times. The 5-min time horizon was picked as a sufficient compromise. Although, ultimately, any on-board implementation must be receding-horizon-based to account for changing information in real-time, this article is only concerned with the efficacy of solver methods for single evaluations.
The purpose of this study was, specifically, to compare the relative merits of several optimal eco-driving trace methods found in the literature. Consequently, the scope was limited. The authors assumed that optimal eco-driving trace generation is one step in an optimal eco-driving control algorithm which operates in a receding-horizon manner and that this algorithm comprises the upper level of a two-level controller, with the lower level being responsible for the instantaneous control of the vehicle. This conception of an optimal eco-driving control framework is consistent with the literature as described in Section II.

A. Optimal Solver Results
1) EE Improvement: A standard experiment was run for evaluation of the methods with respect to the solver and cost function. This experiment was a full-factorial design in which each solver was evaluated for 100 predefined boundary cases and with each cost function. These predefined cases were defined by a selection of random starting times and locations on the phase map as shown in Section IV. The decision to run 100 cases per combination was made to allow for the use of large sample statistics.
The results of the experiment in terms of EE improvement over baseline and in terms of cost function reduction over baseline are shown in Figs. 4 and 5. From Figs. 4 and 5, a definitive order is visible in the relative performances of the methods in relation to EE improvement and cost function reduction. It can be observed that the cost function reduction exceeded 100% on a recurring basis for the RPC and BPC cost functions. The ability of the RPC and BPC cost functions to be reduced by greater than 100% is reflective of the regeneration potential over a given drive cycle for those cost functions and is an artifact of the particular boundary conditions used in the experiments. All the optimal eco-driving traces in the experiment start at 15.65 m/s (35 mi/h), which is the speed limit of the four streets used for data collection, but optimal eco-driving traces were not required to match this speed at the end of the drive cycle. Thus, it was possible for the energy regenerated over the course of the drive cycle to exceed the energy spent. Generally, the ranges seen for EE improvement as a percentage of the mean were quite large in comparison to the same for cost function improvement, and this is the result of the low correlations between cost function improvement and EE improvement for all the methods and cost functions. Correlations between cost function improvement and EE improvement are shown for all the cost functions in Fig. 6.
Correlation between cost function improvement and EE improvement was shown to be best for BPC and then RPC; in both cases, the correlation was significantly better than for Al2N. This is attributed to the models in BPC and RPC providing a closer match to the model in FASTSim. Due to the large uncertainties regarding the EE improvement results, the significance of the observed differences in effectiveness could not be assumed, and thus, t-tests were conducted between all the combinations of method and cost function, and the results are presented in Fig. 7.
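A pairwise significance check of the kind described above can be sketched as follows. The exact test variant used in the study is not stated, so this sketch makes its own assumptions: it uses an unequal-variance (Welch-style) statistic and, because each combination has 100 cases, approximates the two-sided p-value with the standard normal distribution. The function name and the sample data are hypothetical.

```python
from statistics import NormalDist, mean, variance


def large_sample_p_value(sample_a, sample_b):
    """Return (test statistic, two-sided p-value) for a difference in means.

    Assumes large samples, so the normal distribution stands in for Student's t,
    and assumes that the two samples are not both constant.
    """
    std_err = (variance(sample_a) / len(sample_a)
               + variance(sample_b) / len(sample_b)) ** 0.5
    stat = (mean(sample_a) - mean(sample_b)) / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Hypothetical EE improvements (%) for two method and cost function combinations.
method_a = [5.0 + 0.1 * (i % 10) for i in range(100)]
method_b = [7.0 + 0.1 * (i % 10) for i in range(100)]
stat, p_value = large_sample_p_value(method_a, method_b)
print(p_value < 0.05)  # significant at 95% confidence if True
```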
The results shown in Figs. 4 and 7 indicate that the best performing method in terms of improving EE was 2SDP, followed by the heuristic methods and finally SNLP. The same results indicate that the velocity-sensitive cost functions enable better solutions to be found than Al2N. Neither result is surprising: only DP should be able to find globally optimal solutions, and more information should lead to a better solution.
Finally, it should be added that by reducing the acceleration and braking limits by a factor of 10 to 0.5 m/s², the IDM method was able to produce a mean improvement of 5.51% with a standard deviation of 2.79%, comparable to the results from the SPSO method. The IDM method was omitted from the comparison because it cannot be made to meet the end position constraint and, thus, achieved better EE by reducing average speed, rendering the results not directly comparable.
2) Computational Load: All the methods for this study were implemented in Python 3 with the NumPy and SciPy libraries. All the solvers were run-time optimized in Python, and all are vectorized to the highest degree possible to minimize run-time [88]. Nevertheless, a consequence of the Python implementation is that Python has very limited parallel processing capability [89], which means that the authors were not able to experiment with the impact of parallel processing on run-time for the SGA and SPSO methods. The computer used for simulation contained an AMD Ryzen 7 3700X 8-core multithreading-capable CPU with 16 GB of RAM running the 64-bit Ubuntu 18.04 LTS operating system with Python 3.8. All the simulations were conducted on the same computer to ensure the integrity of relative run-times. Even though Python is unlikely to be used for onboard implementation, the computational time results are of interest for the relative comparison of different methods in terms of computational time required. Fig. 8 shows the relative run-times for each method and cost function.
An immediate conclusion is that the 2SDP method is not competitive with the other methods as a real-time control due to its large run-time requirement. Implementation specifics play a huge part in determining run-time, and it is possible to significantly reduce the run-time requirements for the 2SDP method by changing hardware and language, but these changes would also benefit the other methods and the relative gap should remain on the same order of magnitude. The authors did not find a single paper in the literature which implemented a 2SDP method in real-time. The closest examples would be [33] and [35], in which a DP solver is used as the higher level in a two-level receding-horizon controller but the DP algorithm takes multiple seconds to produce a novel solution, and [27], which implements a real-time ADP solver.
It is also evident that the SGA method is the quickest to execute and could be made to execute in even less time with the use of parallel processing, with the same being true for SPSO. Current vehicular computing systems differ in architecture from desktop computers, although this may soon change [90]. Implementation on automotive controllers may result in changes to the relative run-times. However, the order is unlikely to change given that the differences are in orders of magnitude.

B. Eco-Driving Traces
The differences in EE improvement reflect visual differences in the optimal eco-driving traces. A representative example is shown in Figs. 9 and 10 for all the methods and cost functions with one set of constraints.
In general, the optimal eco-driving traces improve over the baseline traces primarily by minimizing the speed reduction due to traffic signals. There are many local optima in the results' space and many are very similar to the global optimum, and thus, the non-DP methods are most likely to settle on a local optimum. However, these local optima clearly approximate the global optimum. The optimal traces for the Al2N cost function are visually distinct from those generated using the speed-sensitive cost functions. While the Al2N cost function is only sensitive to absolute acceleration, the speed-sensitive cost functions are sensitive to directional acceleration and proportional to speed, speed squared, and speed cubed. The result is that the speed-sensitive cost functions will tend to reduce maximum speed and encourage deceleration to a greater degree than Al2N. When compared with the literature, the traces seen in this study are more jerky. There are two reasons for this. First, the constraints used in this study are more complex than those used in most of the literature, being time-varying and applying to both distance and speed. Second, no explicit proxy for passenger comfort was added to the cost function. The velocity-sensitive cost functions resulted in traces with lower speeds and higher decelerations which would, no doubt, be less comfortable for passengers.
Nonoptimal methods may also be used to generate eco-driving traces. Four optimal methods for generating optimal eco-driving traces were compared with IDM, where the IDM parameters used were chosen to be representative of normal driving behavior. IDM parameters can also be chosen to result in improved EE. The IDM acceleration parameters and aggression parameter were shown to have large and significant effects on EE in Section V. By reducing the allowed accelerations by a factor of 10 to 0.5 m/s², a mean EE improvement of around 5% was attained. An EE improvement of 5% is on the low end of what was attained with the optimal methods. IDM is a low-cost algorithm which requires no look-ahead information, making it much easier to implement than the optimal methods. There are, however, advantages of the optimal methods over nonoptimal methods regardless of how the nonoptimal methods are applied. Optimal control allows for a degree of performance and flexibility that nonoptimal control does not. The boundary conditions used for all the optimal methods in this study required that the vehicles arrive at a given distance at a given time, thus maintaining a precise average speed. The solvers were still able to produce significantly more efficient traces than baseline. Low-acceleration IDM, in contrast, cannot meet the same final condition and was able to improve over baseline principally by traveling at a lower mean speed. In generating optimal eco-driving traces, there is a balance between maximizing EE and limiting travel time. In this study, the travel-time aspect was removed from consideration by applying a strict final condition, but all the optimal control methods presented could be modified to allow for a precise tradeoff between EE and travel time. Thus, while requiring more in terms of computational load and look-ahead information, optimal control does enable more precision and flexibility than nonoptimal control.
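To illustrate why lowering the IDM acceleration limits produces gentler, lower-energy driving without any look-ahead, the sketch below assumes that IDM refers to the standard Intelligent Driver Model formulation. The function name, the default headway, minimum gap, and exponent, and the sample state are assumptions for illustration. The two acceleration limits compared (5.0 and 0.5 m/s²) follow the factor-of-10 reduction to 0.5 m/s² mentioned above.

```python
import math


def idm_acceleration(v, v_desired, gap, approach_rate, a_max, b_comf,
                     time_headway=1.5, min_gap=2.0, delta=4.0):
    """Standard IDM acceleration command (m/s^2) for the following vehicle.

    v: own speed, gap: distance to the leader (math.inf on a free road),
    approach_rate: own speed minus leader speed.
    """
    # Desired dynamic gap: standstill gap + headway term + braking interaction term.
    interaction = v * approach_rate / (2.0 * math.sqrt(a_max * b_comf))
    desired_gap = min_gap + max(0.0, v * time_headway + interaction)
    return a_max * (1.0 - (v / v_desired) ** delta - (desired_gap / gap) ** 2)


# Free-road acceleration at 5 m/s toward a 15.65 m/s (35 mi/h) desired speed.
normal = idm_acceleration(5.0, 15.65, math.inf, 0.0, a_max=5.0, b_comf=5.0)
gentle = idm_acceleration(5.0, 15.65, math.inf, 0.0, a_max=0.5, b_comf=0.5)
print(gentle < normal)
```

The command depends only on the current speed, gap, and approach rate, so no final position or arrival time can be imposed. This is consistent with the observation above that low-acceleration IDM improves EE mainly by traveling at a lower mean speed.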

C. Summary
The mean results for the methods and cost functions are presented for run-time and EE improvement in Fig. 11. Several observations can be made. The first is that the globally optimal solution produced by 2SDP is usually significantly more efficient than the locally optimal solutions but requires much more run-time. DP can be made to run quicker with partial parallelization [91], but this would not be enough to reduce the run-times to the level of the PTO methods. The literature contains several papers describing the practical implementation of multistate DP-based eco-driving control algorithms, but these either evaluate the problem on a less than 1-Hz basis or use suboptimal approximations of the cost-to-go function. The DP-based optimal eco-driving trace solvers seem unlikely to become the basis of widely available commercial eco-driving systems due to the computational cost unless they rely on cloud computing.
Of the PTO solvers implemented, it is clear that the GA-based method was most effective in both the criteria of evaluation. Visually, the SGA method with the RPC and BPC cost functions occupies a position up and to the right of the general trend line seen in Fig. 11, indicating a favorable performance in both the criteria. One reason for this is that the specifics of the optimal eco-driving trace generation problem, as defined in this study, play to the strengths of the GA, which can explore complex optimization spaces quickly and efficiently by exploring many directions simultaneously and removing poor solutions from the selection pool. PSO also explores many solutions simultaneously, but those particles which are seeded in low-reward regions have to gradually approach better solutions.
The specifics of the optimal eco-driving trace problem as posed in this study did not favor the SNLP or SPSO methods. At their cores, IPOPT and PSO are gradient search algorithms, extensions of Newton's method, and require the computation of a gradient at each optimization step. With nonlinearity caused by the interpolation polynomials and the nonconvexity of the constraints, such gradient search methods were less effective. It is not surprising that the SGA method was found to be the best of the type.
Another observation from Fig. 11 is the benefit of additional information to solvers in generating the optimal eco-driving trace. In the literature, it is common to see minimization of acceleration used as a proxy for maximization of EE. This trend of acceleration minimization is also very common in the robotics control literature, and as many concepts in CAV control arise from robotics, it is easy to see the origins of the assumption. The assumption that acceleration minimization is a valid proxy for EE optimization is stated explicitly in several well-cited papers [74], [75]. A main reason for the use of the $\ell_2$ norm of acceleration for a cost function is that it is independent of vehicle and powertrain parameters and the optimization can be reduced to a conventional quadratic programming problem. The outcomes of this study indicate that cost functions which incorporated more information about the vehicle, such as velocity, aerodynamic characteristics, rolling resistance, and powertrain efficiencies, enabled optimizers to achieve higher EE for electric vehicles assuming perfect preview of the constraints. Note that the velocity-sensitive cost functions in this study require minimal additional time to compute compared with the $\ell_2$ norm of acceleration.

VIII. CONCLUSION
A great breadth of knowledge on the subject of autonomous eco-driving control has been generated by the research community in recent years. As the vehicular and infrastructural technology which enables vehicular autonomous control becomes ever more widespread, the opportunity to apply this knowledge in production vehicles becomes more realizable. A comprehensive, implementation-oriented analysis was performed to compare the relative merits of several optimization methods found in the literature. A survey of the literature was conducted, and four representative optimization methods (2SDP and trajectory optimization with IPOPT, GA, and PSO) were implemented and refined for application in simulation with real-world infrastructure data. Numerical simulations were then conducted on these methods using three progressively less abstracted cost functions ($\ell_2$ norm of acceleration, road power, and calculated battery power), and each was evaluated relative to the others in terms of performance and run-time. From these simulations, the following conclusions were reached.
1) Minimizing the $\ell_2$ norm of acceleration is confirmed to provide EE improvements.
2) Speed-sensitive cost functions that reflect vehicle and powertrain characteristics can yield improved EE results over the $\ell_2$ norm of acceleration for electric vehicles.
3) DP methods offer the highest potential for EE improvement (in the range of 7%-15%) but are extremely computationally expensive compared with other methods, requiring on the order of 100×-1000× as long to execute.
4) GA showed the most potential as a real-time method based on its relatively high performance (5%-10% EE improvement) in EE optimization and its low computational cost.
Near-future ADAS systems are anticipated to include high-performance in-vehicle computers and/or embedded hardware capable of computing and executing eco-driving control, meaning that this technology can be implemented as a software update. If a significant proportion of vehicles use this technology, the national energy savings could be significant.
From the selected optimization approaches considered, the results suggest the use of a GA method with an RPC cost function as providing the best tradeoff between achievable EE and computational overhead for optimal eco-driving trace generation for urban eco-driving BEVs.

Fig. 2. Example of a "corridor" with upper and lower boundaries.

The IDM simulation is carried out for a given amount of time; the upper bound is defined by the preceding phases of the traffic signals passed by the model vehicle, and the lower bound is defined by the succeeding phases of the same signals. As the IDM model represents a baseline driver, the corridor created this way is one which must reflect normal driving and, thus, is useful for this purpose. It should be noted that the boundaries created in this manner are nonconvex. Traffic signals are generally adaptive and were in this specific case. In this study, full knowledge of traffic signal timing in the future is assumed. The effects of adaptive traffic signal timing may be dealt with through the implementation of stochastic constraints as in [34]. Uncertainty on the timing of traffic signals will have the effect of extending the stop phases as used in optimization and thus tightening the corridor. An element of reality is added to this study through the use of real-world SPaT data in the generation of path boundaries. These data were collected in 2019 and consist of traffic light phase and timing data from 19 traffic signals along a 4-mi route in downtown Fort Collins, CO, USA. These data were collected by the authors and their collection is described in [66]. Several hours of SPaT data for each of the traffic signals were collected in collaboration with the Fort Collins Traffic Operations Center. From these data and the distances of the traffic signals along the route, a phase map was constructed. To conform to traffic norms and regulations, the ego vehicle velocity is required to satisfy the inequality

Fig. 3. Comparison of cost function values for 2015 Kia Soul EV. Red polygon outlines the operational envelope of the UDDS, US06, and HWFET EPA dynamometer drive cycles and is shown as reference for common driving conditions.

Fig. 4. Mean and standard deviation of EE improvement over baseline results for all the methods and cost functions.

Fig. 5. Mean and standard deviation of cost function reduction over baseline results for all the methods and cost functions.

Fig. 6. Correlation between cost function reduction and EE improvement for all the cost functions.

Fig. 7. Significance of comparative results (P-values). Purple indicates that the column significantly outperformed the row, blue indicates that the row significantly outperformed the column, and green indicates the difference between the row and column was insignificant at 95% confidence.

Fig. 8. Mean and standard deviations of run-times for all methods and cost functions.

Fig. 9. Example position versus time traces for all the methods and cost functions.

Fig. 10. Example velocity versus time traces for all the methods and cost functions.

Fig. 11. Comparison of run-time and EE improvement means for all the methods and cost functions.

TABLE I
PUBLICATIONS REVIEWED BY METHOD TYPE

TABLE IV
EE REGRESSION RESULTS FOR IDM PARAMETERS