A Global Sensitivity Analysis of Traffic Microsimulation Input Parameters on Performance Metrics

Traffic microsimulation is regularly used to develop traffic signal control (TSC) algorithms, yet moving from simulation to the field requires confidence in the results and a mature understanding of uncertainty propagation. This work aims to understand the under-reported relationship between simulation inputs and the variance in output metrics commonly used for signal control optimization. Specifically, the impact that fleet composition, the intelligent driver model, the lane-changing model, and the corresponding inter-driver distributions of those parameters have on delay, fuel consumption, and travel time was ranked using Sobol global sensitivity analysis. The results are presented from 98,304 total simulation evaluations of a volume-calibrated three-intersection SUMO model. Acceleration, deceleration, and headway parameters in the intelligent driver model were shown to be important for fuel consumption, delay, and travel time, in addition to drivers' adherence to the speed limit, the variance in preferred speed between drivers, and impatience. The results also, unsurprisingly, show the overwhelming importance of fleet composition for fuel consumption. Additionally, the work presents car-following and lane-change model parameter bounds that result in realistic simulation. The results from this study point to the critical factors that must be determined for any individual region of study and can be used to guide real-world data collection to support additional calibration efforts. In the future, after calibration of key parameters, the overall uncertainty in network-wide performance metrics can be reduced, providing increased confidence in conclusions drawn from microsimulation studies.

can relatively easily provide large system-level efficiency increases. New developments in Intelligent Traffic Systems (ITS), connected and autonomous vehicles (CAVs), and other key enabling technologies are just becoming available. The term efficiency can be interpreted in several ways here, including: minimal delay, travel time, vehicle stops, energy consumption, and others. Though the definition may vary, studies have shown that optimization carried out on the aforementioned metrics will generally improve energy consumption over baseline [3].
Development of network optimization tools based upon new sources of data requires a methodology to simulate the dynamics of the traffic network and understand the vehicle and energy-consumption dynamics. Traffic microsimulation provides an effective tool; however, the large number of inputs may increase uncertainty and complicate uncertainty propagation in the results [4], [5]. It is infeasible for modellers to calibrate all of the parameters, so how do they choose which to focus on?
This work offers a comprehensive methodology to ascertain the significance of key parameters in traffic microsimulation using Global Sensitivity Analysis (SA), with a particular emphasis on fuel consumption, travel time, and delay. The research underscores the utility of sensitivity analysis in traffic simulation, particularly in the face of data uncertainty, and highlights the importance of calibrating and analyzing the variance of key parameters. The study reveals that the significance of input parameters varies across output metrics, with fleet composition, relative driving speed to the speed limit, the variance of preferred driving speeds, the mean value of all car-following (CF) model parameters, and the impatience parameter emerging as the most influential. Furthermore, the study provides new insights into the non-influential nature of both the variance and the distribution type of inter-vehicle CF parameter distributions when considering the modelled output metrics. The following section presents a more detailed background and review of existing literature. The methodology employed in the study is then described, followed by the presentation of the results from the uncertainty analysis. The paper concludes with the key takeaways and recommendations for future research.

II. BACKGROUND

A. What Is Traffic Microsimulation & Why Use It
Traffic simulation takes three forms: macroscopic, mesoscopic, and microscopic. All three have been used in emissions analyses and optimization, with macro- and meso-simulation the most common in the literature [6]. Macrosimulation has traditionally been used heavily by both academia and industry for its simulation speed and ease of setup. To extract emissions from a macroscopic traffic simulation, the outputs must be passed to a macroscopic emissions model. Macroscopic traffic and emissions models do not consider individual vehicles, instead using link averages. Works like [7] explain that the amount of information lost when using only average vehicle velocities is substantial, particularly during congestion. Two vehicles can undergo very different velocity traces, with one undergoing acceleration while the other cruises at a constant speed.
For urban network optimization, the missing intersection fidelity in macroscopic traffic simulation makes it unappealing. To achieve a higher-resolution view of traffic dynamics, modellers turn to microscopic simulation. In recent years, several concurrent developments have made traffic signal control (TSC) optimization using microsimulation feasible. For one, road networks have slowly become further instrumented, and thus the availability of calibration datasets for simulation has increased. Second, computing power has advanced to where traffic microsimulation with sub-second resolution is a realistic avenue. That being said, traffic simulation is not a trivial problem. The parameters available for calibration quickly become overwhelming, as they increase for every additional vehicle and intersection in the network being studied. In a review of the state of traffic simulation coupled with emissions models, [8] describes a trade-off between model complexity and emissions accuracy once passing a critical threshold.
While the trade-off is not disputed, the flexible test bed that traffic microsimulation affords for future research drove the choice of simulation method for this paper. Specifically, the traffic microsimulation software SUMO is used. SUMO ("Simulation of Urban MObility") is a microscopic traffic simulation suite that is space-continuous, time-discrete, and comes "out-of-the-box" with several emissions models [9]. The PHEMlight model was chosen as the baseline starting point for this work, as the CO2 estimation of PHEMlight closely matches that of the more sophisticated PHEM model [10], [11]. The authors recognize that PHEM approximates vehicles according to EU standards, whereas the simulated network to be discussed in Section III-B lies in the US. As the focus of this work is on energy-consumption variance and not absolute quantity, this was deemed appropriate.

B. Simulation Calibration
The complexity of traffic microsimulation, in addition to its localized, socio-physical nature, makes calibration a necessity if meaningful results are to be obtained [12]. Outside of the modelling of traffic control strategies (i.e., traffic signals) and the network geometry itself, the simulation of microscopic driving captures the interplay between several models, including: the car-following model (CF-model), the intersection model, and the lane-changing model (LC-model) [13]. The complexity of this model interplay and its ability to model individual actor dynamics comes at the cost of prediction accuracy and increased output uncertainty [14]. To combat this, there has been much work in the space of making calibration computationally efficient, of which [15] presents a good review.
An oft under-discussed point in the calibration literature is that the methods are ultimately only as good as the data sources. The research community relies heavily on the NGSIM dataset for car-following model calibration, which was collected in 2005 [16]. Recently, the accuracy of the collected vehicle traces has been questioned [17]. In practice, localized trajectory datasets are rare. Instead, modellers must rely on aggregate measures like loop detector data, but there is a trade-off: using aggregate measures for calibration can result in vehicle trajectories that are completely different from reality, and in general, traffic calibration solutions are not unique [18], [19].

C. Sources of Uncertainty in Traffic Simulation
While calibration is an indispensable tool, it is also important to understand which uncertainty (or variance) can be reduced through calibration and which cannot. Punzo and Montanino [20] describe the traffic microsimulation uncertainty in great detail, while focusing on two modes: aleatory and epistemic. The reader is pointed to their work for a more in-depth discussion of the meaning behind each term, but essentially epistemic uncertainty is reducible, whereas the aleatory is not.
In terms of CF-model parameters, an example of epistemic uncertainty is the distribution of headway times that "drivers" in the modelled network adhere to, which can be measured using roadside radar or cameras. The aleatory uncertainty is the headway time of a specific driver. Punzo and Montanino's [20] prescription to deal with the stochastic and uncertain nature of traffic simulation is a five-part procedure, which will be followed in this study. One, modellers should establish which parameters are to be left uncertain, and which are to be fixed. Two, the input uncertainty should be modelled. This involves quantifying the reasonable bounds for each parameter, as well as the shape of the distributions if inter-driver heterogeneity inside of the simulation is assumed. The third, fourth, and fifth steps deal with quantification of the uncertainty. Low-discrepancy sequences of input parameter combinations should be simulated and the outputs quantified. Finally, the output variance (a proxy for output uncertainty) can be attributed to input parameters through the use of a sensitivity analysis.
1) Uncertainty in Car Following Parameters: The CF-model serves as the cornerstone of traffic microsimulation. As discussed in Section II-B, a wealth of literature exists on the calibration of CF-models against vehicle trajectories. Recent research underscores a notable disparity in driver aggressiveness (measured by the count of positive and negative jerk) between Poland and the rest of Europe [21]. Similarly, UK drivers were found to exhibit maximum accelerations 4-11% higher than their Korean counterparts, indicating a generally higher level of aggressiveness [22]. Further, a survey of CF-model calibration to datasets from China, Italy, and the U.S. reveals a distinct difference in the following headways [23]. This variability, contingent not only on the specific trajectory dataset and traffic conditions, but also on geographical location and cultural differences [24], emphasizes that parameter values vary with both dataset and location.
The dispersion of CF-model parameters among the simulated vehicle population, a concept known as inter-driver heterogeneity, carries significant weight. Research underscores that the strategy of maintaining uniform parameters throughout the simulation fleet falls short in accurately replicating Edie's space mean speed diagrams [20]. However, this does not discount the effectiveness of average or optimal CF-model parameters when applied to other Measures of Performance (MoPs). Indeed, other studies have demonstrated that when predicting emissions for a platoon of vehicles, the use of either optimal or average CF-model parameters can yield accurate results in aggregate [25]. The impact of driver heterogeneity on aggregate MoPs remains an under-explored area, with the majority of existing literature focusing on its effect on platoon stability [26]. The importance of aggregate accuracy, or traffic-level accuracy, cannot be overstated. In scenarios where trace data for every simulated vehicle is unavailable, the modeller is compelled to rely on CF parameters that closely mirror the true distribution.
2) Uncertainty in Fleet Composition: Prior global SAs in the literature make no mention of fleet composition, though, in fairness, quantification of network-wide fuel consumption is not their end goal [12], [20], [27]. Yet, the authors feel this is at odds with the discussion of uncertainty in [20]. Fleet composition is highly localized, with the percentage of trucks varying from 2% on urban residential roads to 40% on intercity interstates [28], and trucks have been shown to cause an out-sized percentage of total emissions in highly-calibrated simulations [29], [30]. When trucks have been considered in signal control strategies, energy consumption at the full system level has been shown to drop substantially [31].
Even in the case where energy consumption is not considered, it is known that heavy-duty trucks have different driving characteristics than passenger cars and can cause instability in traffic flows [32]. The argument for omitting fleet composition from a study (that does not focus on energy) is that calibrated CF-model parameter distributions will capture the driving behavior of various vehicle types. However, with such an assumption, the modellers must use caution in choosing distributions, as NGSIM only contains 3% trucks [33].
3) Additional Uncertainty: Traffic microsimulations typically have lane-change models (LC-models) with their own sets of parameters. These can include the desire of the driver to keep to the right-most lane, the willingness of a driver to impede another during a lane-change, etc. [13]. There is a multitude of literature on car-following models, but lane-change models receive less attention and are infrequently calibrated to empirical data [34]. This is unfortunate, because lane changing is a primary cause of oscillations and disturbances in traffic flow [35], and aggregated MoPs have been shown to be highly sensitive to several lane-change parameters [12].
Parallel to these modeling challenges is the issue of uncertainty in actual traffic demand. In this research, we operate under the assumption that the modeller is equipped with aggregate information from loop detectors and aims to simulate a particular target day. Here, the uncertainty does not stem from the aggregate counts per se. Rather, it revolves around the processes and factors leading to those aggregate counts. This aligns with our exploration of lane-change models: both topics reveal the complexities and uncertainties underlying our understanding and modeling of traffic dynamics.

D. Why Do a Sensitivity Analysis?
The above sections served to highlight the following difficulties with traffic simulation. One, there is evident uncertainty in traffic simulation. Two, the dimensionality of simulation input parameters makes calibration impractical if the input space is not reduced. And three, even with calibration, there is still some amount of inherent stochasticity that must be quantified.
Sensitivity Analysis (SA) tackles these three topics head on, by mapping changes in simulation inputs to the corresponding changes in outputs [36]. It is a powerful tool for understanding causalities, for performing dimensional reduction, and for providing decision support in terms of input ranking and input space reduction [37]. There are different forms of SA used in the traffic modelling community, but they generally fall into two categories: one-at-a-time (OAT) or analysis of variance (ANOVA). The generally agreed-upon best practice in traffic simulation research is to use a modification of the ANOVA approach, as it is able to capture the interplay between parameters, whereas the OAT approach is not [38]. Particularly, this work will utilize the Sobol SA framework, which is a Monte-Carlo-based ANOVA approach that was generalized to quantify the variance contribution of each model input [39].

E. Exploration of Variance
At this point, the reader would do well to ask the question "variance of what?". In commonly cited literature, the SA is typically done on calibration-specific MoPs [12], [38], [40], [41]. In line with these prior papers, each model evaluation's ability to pass calibration (described in Section III-C) is assessed in this work, and the sensitivity of calibration error is presented. However, the reader will come to learn that the presented calibration methodology simply quantifies whether simulated traffic volume matches reality, as the only localized data available was loop detector counts. Thus, there is a relatively large CF-model and LC-model parameter space that can fulfill the calibration requirements.
With the baseline volume and calibration met, the range and variance of MoPs frequently used in traffic signal control optimization are explored, which builds on the aforementioned calibration-specific literature. Further, knowing which of the calibratable simulation parameters are influential in the variance of TSC MoPs can help modellers to focus their calibration and research efforts on a subset of parameters. Even in the case where volume-flow calibration is not possible, knowing the influential parameters can save computation time when performing parameter sweeps or sampling.
1) Traffic Signal Control Measures of Performance: For years, existing TSC optimization has been done by tools such as TRANSYT and Synchro [42], [43]. Both tools seek to minimize a cost function based on stops and delays, though these are calculated through macroscopic equations [44]. In recent years, literature has increasingly turned to microsimulation-based control optimization. A subset of reinforcement-learning based traffic signal control (RL-TSC) is growing rapidly in interest, where the most common performance metrics by order of occurrence in literature are forms (average, network total, etc.) of: delay, travel time, waiting time, queue size, speed, throughput, number of stops, and fuel consumption [45].
While reinforcement learning methods show promise in literature, there are few if any reinforcement-learning based control strategies used in practice due to challenges around: communication reliability, compliance, the heterogeneity of road users, and uncertainty in traffic measures and sensing technology [46]. Owing partly to the last two points, reinforcement learning can over-fit to the simulator [47]. Noaeen et al. [45] performed an extensive review of the existing RL-TSC literature and found that only 3 of 160 papers utilized simulated vehicle types outside of personal cars. Further, the authors were unable to find discussion of car-following parameters explicitly in reinforcement-learning TSC literature.

A. Sobol Sensitivity Analysis
Sobol's Sensitivity Analysis is a global, ANOVA approach capable of quantifying higher-order model effects and sampling across the entire input space [48]. Although the global nature of this method renders it more computationally demanding compared to popular local SA techniques, it is better suited for handling non-linear models, such as traffic simulations.
The focus of SA is on a black-box function Y = f(X), where Y is a univariate scalar, X_i represents an uncertain model input, and there are k total inputs. In the case of this work, f is the traffic simulation, X is a vector representing the uncertain model inputs (i.e., fleet composition, headway time distributions, etc.), and Y is a scalar measurement from the simulation (i.e., fuel consumption). The basis of Sobol's method is functional decomposition, which states that the function f can be written as a summation of partial functions, where the total number of summands is 2^k:

    f(X) = f_0 + Σ_i f_i(X_i) + Σ_{i<j} f_{ij}(X_i, X_j) + … + f_{1,2,…,k}(X_1, …, X_k)    (1)

The initial term, f_0, represents the expected value of f, E(f), or more simply, the mean of Y. Similarly to Eq. 1, the variance of f, V(f), can also be decomposed through summation, where there are 2^k − 1 summands. The variance of f resulting from a single parameter X_i is written as

    V_i = V_{X_i}( E_{X_∼i}(Y | X_i) )    (2)

where X_∼i is the set of all variables except X_i and E_{X_∼i}(Y | X_i) represents the expected value of f when X_i is held constant. The expected value is computed by integrating over the parameter space of X_∼i. In the same way, the variance attributed to two parameters,

    V_{i,j} = V_{X_i,X_j}( E_{X_∼i,j}(Y | X_i, X_j) ) − V_i − V_j    (3)

is the basis for the second-order sensitivity of the two parameters.

The first-order variance given in Eq. 2 is the expected reduction in output variance if X_i is fixed. Eq. 3 (V_{i,j}) represents the amount of variance in the output that is due to interaction between two parameters [49]. The computation of higher-order effects becomes computationally intractable, so instead, the total variance to which X_i contributes is commonly calculated. This metric, V_{Ti}, is calculated by subtracting from the total variance all variance not involving X_i,

    V_{Ti} = V(Y) − V_{X_∼i}( E_{X_i}(Y | X_∼i) )    (4)

and represents the expected variance that would be left if all input parameters except for X_i were fixed [49]. While Eqs. 2, 3, & 4 all represent variances, Sobol SA results are frequently normalized by the total variance of the model, V(Y), allowing for easier interpretation. As an example, the first-order sensitivity of X_i is

    S_i = V_i / V(Y)    (5)

and is scaled between 0 and 1. There is likewise a normalized version of V_{Ti}, denoted as

    S_{Ti} = V_{Ti} / V(Y)    (6)

When a model is purely additive, S_{Ti} = S_i and Σ_{i=1}^{k} S_{Ti} = 1. Eqs. 2 & 3 can be solved by analytically integrating over the parameter space. However, because the function f is not typically analytically solvable, the integral is approximated through a (quasi) Monte Carlo integral laid out by Saltelli et al. [49]. The input parameter matrices are generated using Saltelli's modification to the Sobol sequence, a quasi-random sequence that is highly effective at sampling the input space with low discrepancy [50]. The sequence is able to converge at a rate of 1/N, which is faster than the Monte Carlo rate of 1/√N [51]. Only first and total sensitivity indexes are considered in this work, making the number of model evaluations

    N_total = N(k + 2)    (7)

where k is the dimension of the problem, as in Eq. 1, and N is the number of samples to generate, commonly chosen as a power of two [52]. For an in-depth mathematical understanding of Sobol's method, readers are encouraged to consult [48] and [49]. Confidence in the SA results is determined via bootstrapping, where samples are drawn from the total number of model runs, N, with replacement, and the sensitivity indexes are calculated for each sample [53]. Distributions of the sensitivity indexes are built, and the resulting 95% confidence intervals (CIs) are calculated [54]. Zhang et al. [55] instruct that the SA results should be considered stable if the CI of the parameter with the highest sensitivity index is less than 10% of that sensitivity index's value.
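The pick-and-freeze estimators behind Eqs. 2-7 can be sketched in pure Python for a toy model. This is a minimal illustration, not the paper's pipeline: it uses plain pseudo-random sampling and the Saltelli (first-order) and Jansen (total) estimators, whereas an actual study would draw the input matrices from a quasi-random Sobol sequence (e.g., via the SALib library). Note the cost of N(k + 2) evaluations of f, matching Eq. 7.

```python
import random

def sobol_indices(f, k, n, sample, seed=0):
    """Monte Carlo estimate of first-order (S_i) and total (S_Ti)
    Sobol indices for f, costing n*(k + 2) evaluations."""
    rng = random.Random(seed)
    # Two independent n-by-k input matrices, A and B.
    A = [[sample(rng) for _ in range(k)] for _ in range(n)]
    B = [[sample(rng) for _ in range(k)] for _ in range(n)]
    fA = [f(x) for x in A]
    fB = [f(x) for x in B]
    mean = sum(fA + fB) / (2 * n)
    var = sum((y - mean) ** 2 for y in fA + fB) / (2 * n)  # V(Y)
    S, ST = [], []
    for i in range(k):
        # A_B^(i): matrix A with column i replaced by column i of B.
        fABi = [f(A[j][:i] + [B[j][i]] + A[j][i + 1:]) for j in range(n)]
        Vi = sum(fB[j] * (fABi[j] - fA[j]) for j in range(n)) / n       # Saltelli
        VTi = sum((fA[j] - fABi[j]) ** 2 for j in range(n)) / (2 * n)   # Jansen
        S.append(Vi / var)    # Eq. 5
        ST.append(VTi / var)  # Eq. 6
    return S, ST
```

For the purely additive toy model Y = X_1 + 2 X_2 with X_i ~ U(0, 1), the analytic indices are S_1 = S_T1 = 0.2 and S_2 = S_T2 = 0.8, which the estimator recovers to within Monte Carlo error.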

B. Simulation Network
The simulated SUMO network represents a three-intersection corridor of Tuscaloosa, Alabama, primarily consisting of US-82/McFarland Blvd between Airport and Harper Roads. This was advantageous, as the authors had access to the physical controllers in the network as well as historical loop detector data. Figure 1 shows the SUMO model of the target network overlaid on geo-located satellite images. Each of the three intersections in the network has a different layout; the left-most is a three-way intersection, whereas the other two are four-way intersections. All three intersections are signalized (marked as TL1-3 in Fig. 1), and the field controllers were emulated using NEMA controllers in SUMO with matching configurations [56]. They all operated in coordinated mode during the simulated period of time, with actuation on the non-coordinated phases (east & west approaches).
The network sees heavy traffic volume traversing from west to east in the mornings and vice versa in the afternoon, as residential neighborhoods lie to the west and the city of Tuscaloosa to the east. There is an industrial area to the south of TL1, and therefore the network experiences heavy truck traffic.
1) Simulated Traffic Volume: After modelling the physical network, a method to capture, identify, and replicate the actual traffic situation on a "standard" day was necessary. Fortunately, the intersections are equipped to capture and log parameters identified in the Federal Highway Administration's (FHWA) Automated Traffic Signal Performance Measures (ATSPM) initiative. They record anonymized traffic signal events and upload them to a central SQL server managed by the Alabama Department of Transportation and located on the University of Alabama campus. The data has a resolution of 0.1 seconds and is only recorded when triggered, storing the data in an enumerated format [57].
The volume-location pairs (i.e., known volumes at each detector location in the network) were processed via an out-of-the-box SUMO tool called routeSampler [58]. Developed by the SUMO team, the tool uses integer linear programming (ILP) to generate vehicle trips (a trip describes the route and the departure probability inside a time window) that satisfy the traffic counts at each location. Instead of using a single output of routeSampler across all simulations, the simulations regenerate traffic demand using different random seeds and tuning parameters. The authors felt this better represented the aleatory uncertainty in the network as compared to using a single traffic definition.

C. Simulation Calibration
To guide the calibration of the traffic simulation, the authors relied on the methodology presented in the 2019 Update to the 2004 USDOT Traffic Analysis Toolbox Volume III: Guidelines for Applying Traffic Microsimulation Modeling Software [28]. Critical to that methodology is the selection of a target day to simulate and calibrate against. By choosing a single day, the modelers can look closely at the unique traffic characteristics of that day.
Following the USDOT calibration methodology, July 24th, 2023 was selected as the "representative" day in the detector dataset. The calibration criteria were only applied to the individual lane detectors, as they are the most reliable. Figure 2 shows the time-variant volume during a simulation of the representative day. The diamonds represent simulation performance, while the shaded areas represent 1 and 1.96 standard deviations from the representative day. The simulated time for this paper spanned only from 5:30 to 8:10 AM, with 30 minutes of warm-up time, resulting in data collection from 6 to 8 AM. Simulating the morning rush captures both high (6:45-8:00 AM) and low volume (6:00-6:45 AM) periods. The simulation model satisfied the volume-based calibration criteria by wide margins.
It should be highlighted that the USDOT calibration manual calls for modellers to use two different metrics: one localized performance measure (e.g., capturing queuing or bottleneck characteristics) and one system performance measure (such as travel time). Volume counts were utilized as the metric for calibration due to additional performance measures being unavailable for the time period and corridor under study. The lack of localized data outside of detector counts is one of the motivations of this work, as there is a wide range of potential outcomes when calibrating only to detector data.
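The volume criterion described above can be sketched as a simple per-detector band check. This is a simplified stand-in, assuming per-detector means and standard deviations computed from the representative day; the actual USDOT criteria are more detailed than this illustration.

```python
def passes_volume_calibration(sim_counts, field_means, field_stds, z=1.96):
    """Return True if every detector's simulated volume lies within
    z standard deviations of the representative day's mean volume
    (a simplified stand-in for the USDOT volume criteria)."""
    return all(abs(sim - mean) <= z * std
               for sim, mean, std in zip(sim_counts, field_means, field_stds))
```

For example, a simulation producing [410, 120] vehicles at two detectors against field means [400, 118] with standard deviations [15, 6] passes, while a 60-vehicle overshoot at the first detector would not.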

D. Car Following Model Specifics
The calibration discussed in Section III-C focuses on accurate traffic demand and results in a constrained set of vehicle trips. The modeller is, however, left with an array of car-following model parameters to tune. The authors decided to focus on the IDM model [59] owing to its simplicity and effectiveness in describing real-world acceleration profiles [60].
The IDM model calculates the acceleration of the follower vehicle, v̇_f, as a function of its current velocity, v_f, the gap to the lead vehicle, s, and the difference between the leader and follower velocities, Δv. The desired acceleration at any time t is written as

    v̇_f = a [ 1 − (v_f / v_0)^β − (s* / s)^2 ]    (8)

where s* is the desired minimum following gap, v_0 is the free-flow speed on the road, β is a tuning acceleration exponent, and a is the maximum follower acceleration. The following gap is formulated as a function of the current velocity and the difference between the leader's and follower's velocities,

    s* = s_0 + v_f τ + (v_f Δv) / (2 √(a b))    (9)

where a is again the follower's maximum acceleration, b is the follower's maximum deceleration, τ is the minimum time headway, and s_0 is the minimum space between the follower and lead vehicle. The target simulation network described in Section III-B has varied speed limits, thus a static v_0 is not applicable. SUMO instead models the desired velocity as a speed factor, SF_v, which is a multiplier on the speed limit,

    v_0 = SF_v · v_limit    (10)

In total, the IDM model has 6 tuning parameters {a, b, τ, s_0, SF_v, β}, with Kesting and Treiber [61] fixing β = 4, reducing the dimensionality to 5. Other works have left β included in CF-model calibration [62], [63]. Even so, it was considered fixed at 4 for this work to represent the trend in literature. Table I below summarizes the bounds of the 5 considered IDM CF-model parameters used in the following SAs. The mean column in Table I represents the upper and lower bounds of the mean parameter for the inter-driver distribution, and is later referenced by the symbol in the parameter column (a = acceleration distribution mean). Similarly, the variance column is the range of the variance parameter, represented later, for example, as a_σ. The ranges of the means and variances come from literature and trial-and-error simulations. As will be explained in Section IV-A.2, using calibrated parameter ranges directly from literature led to unrealistic SUMO simulations. More information on the selection of individual parameter bounds is provided in Table I's footnotes.
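Eqs. 8 & 9 can be implemented directly. The sketch below is for illustration only and is not SUMO's internal implementation; the default parameter values (a = 1.5 m/s², b = 2.0 m/s², τ = 1.2 s, s_0 = 2 m, v_0 = 15 m/s) are assumed placeholders, not the calibrated bounds of Table I.

```python
import math

def idm_acceleration(v_f, v_lead, gap, a=1.5, b=2.0, tau=1.2,
                     s0=2.0, v0=15.0, beta=4):
    """IDM follower acceleration (Eqs. 8 & 9), in m/s^2.

    v_f, v_lead : follower and leader speeds (m/s)
    gap         : bumper-to-bumper distance s (m)
    a, b        : max acceleration, comfortable deceleration
    tau, s0     : minimum time headway, minimum space gap
    v0, beta    : desired (free-flow) speed, acceleration exponent
    """
    dv = v_f - v_lead  # approach rate, Δv
    # Desired minimum following gap, s* (Eq. 9).
    s_star = s0 + v_f * tau + v_f * dv / (2 * math.sqrt(a * b))
    # Desired acceleration (Eq. 8).
    return a * (1 - (v_f / v0) ** beta - (s_star / gap) ** 2)
```

Sanity checks follow from the equations: at the desired speed with a huge gap the acceleration vanishes, at standstill behind a stopped leader at exactly s_0 the vehicle holds position, and closing fast on a stopped leader yields braking.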
1) Intra & Inter-Driver Variability: Previous SAs of car-following model parameters have typically focused on the error (e.g., RMSE of vehicle spacing) between the simulated and actual vehicle traces, or the error of emissions estimates after calibration [25], [69]. Both works were influential in the formation of this paper, but traffic modellers do not always have the luxury of localized vehicle traces to calibrate traffic simulations. As the focus was on aggregated simulation outputs, three variables relating to each car-following parameter were considered uncertain: the mean, the variance, and the shape (normal, log-normal, uniform) of the inter-driver distribution. The distribution type is later represented with the superscript dist (e.g., a^dist). Including more than just the mean of the distribution is in keeping with recent literature on the importance of inter-driver variability and acknowledges the uncertainty in distribution shape [20]. Additionally, the standard deviation of driver heterogeneity parameters was an important parameter in a prior sensitivity analysis of traffic simulation [12]. Including the mean, variance, and distribution type tripled the dimensionality of the SA. Normal, uniform, and log-normal distributions were considered for all parameters in Table I outside of SF_v, for which only normal and uniform distributions were considered.
Following the notation from Section III-A, an individual car-following parameter (e.g., maximum acceleration, a) is represented in simulation as

    a = X_{a^dist, j}( X_{a, j}, X_{a_σ, j} )    (11)

where X_{a, j} represents an SA sample of the acceleration mean, X_{a_σ, j} is the corresponding sample of the variance, and X_{a^dist, j} is the sampled distribution type.
For the uniform distribution, the mean and standard deviation were used to calculate the upper and lower bounds of the distribution using the equation for the variance of a uniform distribution. The inter-driver distribution (whether uniform, log-normal, or normal) was bounded in simulation according to the maximum and minimum SUMO bounds in Table I. The SUMO tool, createVehTypeDistribution, is employed to generate values in the simulation. This tool operates by sampling from the underlying distribution until it produces a value that aligns with the specified bounds.
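The two mechanisms above can be sketched as follows. For a uniform distribution with mean μ and standard deviation σ, the variance identity (b − a)²/12 = σ² gives a half-width of σ√3 about the mean; the rejection loop mirrors what createVehTypeDistribution does internally. The helper names are hypothetical, not SUMO API.

```python
import math
import random

def uniform_bounds(mean, std):
    """Lower/upper bounds of a uniform distribution with the given
    mean and standard deviation: (b - a)^2 / 12 = std^2, so the
    half-width about the mean is std * sqrt(3)."""
    half = std * math.sqrt(3)
    return mean - half, mean + half

def bounded_sample(draw, lo, hi, rng, max_tries=1000):
    """Rejection-sample as createVehTypeDistribution does: redraw
    from the underlying distribution until the value falls inside
    the [lo, hi] bounds of Table I."""
    for _ in range(max_tries):
        x = draw(rng)
        if lo <= x <= hi:
            return x
    raise RuntimeError("bounds too tight for the given distribution")
```

For example, a normal acceleration distribution with mean 1.5 m/s² and standard deviation 0.5 m/s², truncated to [1.0, 2.0], would be drawn as `bounded_sample(lambda r: r.gauss(1.5, 0.5), 1.0, 2.0, rng)`.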
In recent studies, the significance of intra-driver variability in car-following parameters has been emphasized. This refers to the variations in a single vehicle's behavior under different driving conditions or over time, an often overlooked aspect in addition to inter-driver variability [70], [71]. As neither the IDM model nor SUMO allows for intra-driver variability, it was consequently left out of this simulation. Furthermore, the correlation of CF-model parameters in the NGSIM dataset has been demonstrated [65], with Ge and Menendez [41] identifying potential issues arising from the assumption of independence. They proposed a sensitivity analysis method that supports correlated input parameters. However, including covariance increases the dimensionality of the sensitivity analysis further if it is also assumed to be uncertain, and the authors could find little literature on ranges of covariance for IDM CF-model parameters. Moreover, independent samples from uniform distributions have been shown to yield the most robust simulation outcomes across various congestion regimes [20]. Consequently, this work assumes the independence of input parameters.

E. Other Considered Parameters
The car-following parameters described above are not the only tuneable parameters available to traffic modellers; one example is the fleet composition of the simulated network (also known as vehicle mix). Instead of calibrating, it is common for modellers to use either the simulation package's default fleet composition or national projections. However, as stated in Section II-C.2, there is much variation between locations. The role of this variation and uncertainty was explored by modelling fleet composition as an input to the sensitivity analysis. Specifically, the fleet was modelled as a bimodal distribution of heavy-duty diesel tractor-trailers and EU4 personal cars (the HDV_TT_D_EU6 and PKW_G_EU4 PHEMLight classes, respectively). Each class samples its length from a distribution, though that was left out of the SA. The simulated trucks also sample from different distributions of a_truck, b_truck, and τ_truck, which are provided in Table I and come partly from [68] and SUMO's documentation. The modelled range of fleet composition, represented as p_truck, is also shown. As fleet composition is highly localized, the upper bound was set according to regional data from the Alabama Department of Transportation. The parameter range contains the simulated network's truck percentage (11%).
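Drawing a fleet for a given p_truck reduces to a Bernoulli draw per departing vehicle. The sketch below uses the PHEMLight class names from the text, but the sample size and seed are illustrative:

```python
import random

def sample_vehicle_class(p_truck, rng):
    """Bernoulli draw over the bimodal fleet: a truck with
    probability p_truck, otherwise a EU4 passenger car."""
    return "HDV_TT_D_EU6" if rng.random() < p_truck else "PKW_G_EU4"

# Draw a fleet at the network's observed truck percentage (11%).
rng = random.Random(42)
fleet = [sample_vehicle_class(0.11, rng) for _ in range(10_000)]
truck_share = fleet.count("HDV_TT_D_EU6") / len(fleet)
```

For large fleets the realized share concentrates near p_truck; for the low-volume side streets discussed later, the same draw produces much noisier shares, which is one source of the seed's influence.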
The random seed of the simulation was also considered, following the lead of Ge et al. in their sensitivity analysis of an Aimsun traffic model [40]. The random seed controls how vehicles sample from the distributions, the departure times of vehicles, and how routeSampler solves the traffic demand problem. Essentially, it is a measure of the stochasticity inherent in SUMO. Prior works have controlled for stochasticity by evaluating each parameter combination ten times [12]. Instead, by directly including the random seed in this work's sensitivity analysis, the inherent uncertainty was investigated.
Several lane-change model parameters were also included to analyze their effect on simulation outputs. The specific lane-change model utilized in the simulations is described in detail by [13]. To save simulation evaluations, not every parameter was considered; instead, they were chosen heuristically based on the authors' modelling experience. The parameter lcStrategic controls how early a vehicle makes strategic lane changes, and lcAssertive controls the acceptable front and rear gaps when merging. Impatience is a junction model parameter describing the willingness of a driver to impede a vehicle with higher priority. Finding literature on reasonable bounds for the lane-change and junction model parameters proved difficult. The bounds presented in Table I were instead determined via one-at-a-time parameter sweeps and iterations of the SA. Traffic flow became unrealistic when the allowable range was larger, resulting in both calibration failures and unreasonable queue lengths. In contrast to the CF-model parameters, the distribution type was not considered; the inter-driver distribution was instead assumed to be uniform in the absence of better information.
Other parameters were considered in the iterations of SAs and traffic simulations that led to this paper, including: simulation step length, car-following model, reaction time, junction stop-line gap, and desire to drive in the right lane. These parameters were determined to be non-influential and were thus omitted from the final version.

F. Optimization Metric Calculation
1) Average Per-Vehicle Energy Consumption: The energy consumption in SUMO is modelled using the aforementioned PHEMLight emissions model [11]. The output of simulation is a log of fuel mass-flow rate in mg/s, which is converted into a fuel mass at every time t using the simulation's step length, t_step. To deal with the difference in fuel type (diesel vs. gasoline), the lower heating value (LHV) of each fuel is used to convert the fuel mass into megajoules of energy. The final output value is a simple average, written as

Ē = (1 / N_veh) Σ_{n=1}^{N_veh} Σ_t ṁ_{n,t} · t_step · LHV_{fuel,n}

where N_veh represents the total number of vehicles in simulation, ṁ_{n,t} is the mass-flow rate of fuel at time t for vehicle n, and LHV_{fuel,n} is the LHV of vehicle n's fuel type, converted into units of MJ/mg. The results presented in Section IV have been converted back into liters of gasoline equivalent using the LHV of gasoline for ease of comparison.
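The conversion chain (mg/s log → fuel mass → MJ via LHV → liters of gasoline equivalent) can be sketched as follows. The logs, step length, and physical constants here are illustrative approximations, not values from the paper:

```python
# Approximate textbook constants (assumptions, not from the paper).
T_STEP = 1.0                            # s, simulation step length
LHV_MJ_PER_MG = {"gasoline": 4.34e-8,   # ~43.4 MJ/kg
                 "diesel":   4.28e-8}   # ~42.8 MJ/kg
MJ_PER_L_GASOLINE = 32.0                # approx. energy per liter

# Hypothetical per-vehicle logs: fuel type and mg/s per step.
fuel_logs = {
    "car_1":   ("gasoline", [800.0, 1200.0, 600.0]),
    "truck_1": ("diesel",   [4000.0, 5000.0, 4500.0]),
}

def average_energy_mj(logs, t_step=T_STEP):
    """Per-vehicle energy: sum mdot * t_step * LHV over time,
    then average over all vehicles."""
    totals = [sum(rate * t_step * LHV_MJ_PER_MG[fuel] for rate in rates)
              for fuel, rates in logs.values()]
    return sum(totals) / len(totals)

energy_mj = average_energy_mj(fuel_logs)
liters_ge = energy_mj / MJ_PER_L_GASOLINE  # liters gasoline equivalent
```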
The simulated network described in Section III-B has its "network bounds" on roads that continue outside of the considered network. Due to this fact, a traffic simulation artifact appears at the edges: when a vehicle's leader has a slower desired speed than its own (i.e., SF_{v,leader} < SF_v), the vehicle abruptly accelerates when its leader departs the network. The acceleration is unrealistic, as the road continues on in reality and the leader does not actually disappear. By spatially plotting fuel consumption, the boundary of realistic fuel consumption was identified, and all fuel consumption outside the boundary was ignored.
2) Delay: The calculation used by SUMO for delay depends entirely on the desired (v_desired) versus actual (v_actual) speed. Thus, the total delay for a vehicle n's trip through the network is expressed as

D_n = Σ_t (1 − v_{actual,n,t} / v_{desired,n}) · t_step

where the summation runs over each simulation step until the vehicle leaves the network.
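A minimal sketch of this per-trip delay accumulation, using hypothetical speed samples:

```python
def trip_delay(v_actual, v_desired, t_step=1.0):
    """Accumulate the fraction of each step spent below the desired
    speed, times the step length (SUMO-style time loss)."""
    return sum((1.0 - v / v_desired) * t_step for v in v_actual)

# A vehicle that wants 15 m/s, idles for two steps, then cruises:
delay = trip_delay([0.0, 0.0, 15.0, 15.0], 15.0)  # 2.0 s of delay
```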

3) Other Traffic Measures:
The aggregation of all vehicles in a network, while convenient, can potentially mask spatial differences as vehicles traverse distinct routes through the network. This consideration is particularly crucial when assessing the impact of various traffic measures on different network segments.
In the simulation post-processing, several additional metrics were calculated beyond delay and energy consumption. These include average speed, number of stops, travel time, and the calibration values discussed in Section III-C. These metrics were evaluated at both a network and an intersection level. This is noteworthy because recent TSC literature, particularly that based on reinforcement learning, frequently relies on phase- or approach-based state definitions [45], [72].
To analyze how the parameters in Table I could change not only the absolute quantities but also the quantities relative to each other, the ratio of measurements in the east-west (mainline) approaches to those in the side streets was derived. A new metric, the ratio of delay, is proposed to capture this effect. It compares the driver delay on the mainline intersection approaches to the delay on the side streets. TL2 and TL3 have 4 approaches on the mainline and 4 on the side streets; TL1 has 3 mainline phases and 1 side-street phase. The ratio is then calculated as

Ratio of Delay = Σ_{j ∈ mainline} E(D_j) / Σ_{j ∈ side street} E(D_j)

where E(D_j) represents a vehicle's expected delay in phase j during the simulation measurement time. The goal of this metric is to observe how variation in the input parameters can change where simulation metrics are accumulated and the potential for that difference to impact TSC.
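The ratio can be computed directly from per-phase expected delays; the phase labels and delay values below are hypothetical:

```python
# Hypothetical expected per-phase delays (seconds) at one intersection,
# split into mainline and side-street phases as in the text.
expected_delay = {"EB": 20.0, "WB": 25.0, "EB-L": 30.0, "WB-L": 28.0,
                  "NB": 45.0, "SB": 50.0, "NB-L": 40.0, "SB-L": 42.0}
mainline = ["EB", "WB", "EB-L", "WB-L"]
side_street = ["NB", "SB", "NB-L", "SB-L"]

def ratio_of_delay(delays, mainline, side):
    """Sum of expected mainline-phase delays over the side-street sum."""
    return (sum(delays[j] for j in mainline) /
            sum(delays[j] for j in side))

ratio = ratio_of_delay(expected_delay, mainline, side_street)
```

A ratio below 1 indicates that more delay accrues on the side streets; input parameters that shift arrivals or discharge rates move this ratio even when total delay stays constant.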

IV. RESULTS
In the effort to investigate simulation variance and uncertainty, two large SAs were performed, denoted hereafter as SA1 and SA2. First, the sensitivity of simulation outputs to the CF-model parameters discussed in Sections III-D & III-E was analyzed. Then, fleet composition and truck-specific CF-model parameters were added in SA2. Both were run on a 128-core Ubuntu 22.04 machine with 500 GB of RAM. The SAs were conducted using the open-source SALib Python package [52], [73], and SUMO version v1_19_0+0020-ea410df4cf2 was used for the simulations. The code for running the simulation, sensitivity analysis, and corresponding variance analysis is available on GitHub.

A. SA 1
The initial analysis incorporated 22 parameters and an N value of 2048, culminating in a total of 49,152 simulation evaluations. Each simulation was assessed in accordance with the calibration delineated in Section III-C, with 1.58% of simulations failing calibration. Examination of the calibration failures revealed that they did not contribute to outliers in any of the considered MoPs; hence, they were classified as non-influential. In these instances, the calibration failure was marginal, with the volume counts at all but the WB detector at TL2 within the acceptable range of the USDOT calibration methodology. Upon probing the source of failure, it was attributed solely to the inherent randomness of the simulation, specifically the route generation.

Fig. 3. Sensitivity index convergence with increased number of simulations in the SA and the resulting confidence interval for select top parameters. The target CI is 10% of the top S_T, which is marked by the red lines. The number of simulations is controlled by N in Eq. 6, and was increased from 16 to 2048.
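The evaluation counts reported here are consistent with SALib's Saltelli-style Sobol sampling without second-order indices, which requires N(D+2) model runs (this scheme is an assumption inferred from the counts, not stated explicitly in the text):

```python
def sobol_evaluations(N, D, second_order=False):
    """Model evaluations required by Saltelli-style Sobol sampling:
    N*(D+2) for first/total-order indices only, N*(2D+2) when
    second-order indices are also estimated."""
    return N * (2 * D + 2) if second_order else N * (D + 2)

# SA1: 22 uncertain parameters with N = 2048.
total = sobol_evaluations(2048, 22)       # 49,152 evaluations
stable = sobol_evaluations(256, 22)       # 6,144, where indices stabilized
```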
Figure 3 explores how the number of samples affects the convergence of the top parameters. The sensitivity analysis results stabilized after an N value of 256, or a total simulation count of 6,144, giving confidence in the parameter importance ranking for SA1, even though the fuel consumption CI target took until N = 2048 to converge.
Figure 4 displays the total and first order sensitivity indexes for the listed parameters and simulation outputs. Unsurprisingly, the per-vehicle fuel consumption is most sensitive to the mean value of the SF_v distribution, whose S_T = 0.45 ± 0.04. A basic force balance can explain this fact, as a vehicle's aerodynamic resistance is proportional to the square of its velocity and makes up the majority of a vehicle's resistive forces at free-flow speed. Fortunately for modellers, SF_v is one of the more straightforward parameters to calibrate.
The hierarchy of importance in parameters, after SF_v, is followed by b, which is in line with the findings of da Rocha et al. [25]. Their study demonstrated that the estimation error for emissions and fuel consumption is most sensitive to the deceleration parameter in Gipps' CF-model. Though not shown here for brevity, an aggregate analysis of the SA results further substantiates this by revealing a positive correlation between fuel consumption and the b parameter, r(49150) = 0.42, p < 0.01 (where r is the Pearson correlation coefficient and p is the probability value of the null hypothesis). As the mean of the inter-driver b distribution increases, the fuel consumption correspondingly increases.
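The reported r statistics are plain sample Pearson correlations; a stdlib re-implementation for reference:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two
    equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# A perfectly linear relationship gives r = 1.0:
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```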
Following b, the next parameter of significance is SF_v^σ. While not shown here (it can be reproduced with the provided code), this parameter exhibits a noteworthy interaction with SF_v. When the mean of the speed factor distribution is high and the variance is low, the simulation's average speed reaches its peak, as expected. This is primarily due to the diminished likelihood of traffic flow disruption by slower drivers. Moreover, when the variance is high, the distribution of speeds, which is bounded by Table I, clips the right tail of very fast drivers, thereby eliminating would-be high fuel consumers.
The next critical parameter is impatience. This parameter, akin to SF_v, exhibits a negative correlation with fuel consumption: as the impatience parameter increases, the per-vehicle fuel consumption decreases. This can be attributed to shorter wait times at yielding turns, leading to shorter queues, less time loss, and reduced idling. Interestingly, an increase in driver aggression (represented by a higher impatience value) does not appear to disrupt the priority traffic flow enough to increase fuel consumption through slowdowns caused by drivers merging into oncoming traffic.
The parameter that follows the impatience distribution mean in importance is τ, with S_T = 0.13 ± 0.01. The results generally indicate that the higher the mean of the τ distribution, the higher the fuel consumption. The ranking of τ as the second most impactful CF-model parameter again aligns with the findings of da Rocha et al. [25].
The parameter that succeeds τ in importance is a, which exhibits two dominant trends. Firstly, a low a in combination with a high b results in higher-than-average fuel consumption. This is because when the simulated vehicles have a low a on average, the queues in simulation take longer to discharge, leading to congestion at traffic signals. Upstream vehicles are more likely to have to slow down for the slow-to-discharge queue and then accelerate, destroying kinetic energy in the process. Conversely, when both a and b are near their upper bounds, the average fuel consumption also tends to be higher than average, due to extra fuel consumption from aggressive driving. Both the a and b findings are in agreement with [26], who show that low a and high b cause instability in heterogeneous traffic flow. The only remaining important parameter in Figure 4 is s_0. In the context of fuel consumption, all other parameters under consideration contribute less than Zhang et al.'s prescribed cutoff of 0.05 [55].
The input parameter exerting the most significant influence on average travel time is a, accounting for 53 ± 3% of the observed variation. The subsequent two parameters of importance, b and τ, can be rationalized by their impact on delay. The fourth most influential parameter, SF_v, intuitively plays a role, as travel time is inversely proportional to the vehicle's average speed. However, caution is advised when extrapolating these findings to other networks, as the ranking of these delay-impactful parameters is likely specific to the relatively close proximity of traffic lights in the simulated network.
Delay, being a measure of a vehicle's actual speed versus its desired speed (refer to Eq. 11), is not directly impacted by SF_v. Instead, delay is introduced solely through congestion dynamics and delay directly caused by the traffic signals. This is confirmed by the sensitivity indices for average vehicle delay in the network, with the mean of the a distribution emerging as the most crucial simulation input. Rapid acceleration to the desired speed significantly reduces delay, while slow acceleration extends the time spent below the desired speed. The relationship between delay and τ is underscored in the parallel coordinates plot depicted in Figure 5. Furthermore, both τ and b exert a strong influence on average delay, with τ being positively correlated and b negatively correlated. Figure 5 also highlights impatience, which is the last parameter before the cutoff of 0.05. The interaction observed between impatience and the other parameters in Figure 5 presents two scenarios: one where high impatience in combination with a low b, high τ, and low a leads to high delay, and another where low impatience directly results in high delay, as vehicles wait longer before merging.
The sensitivity indexes of the delay ratio largely follow those of average delay, with the exception of the increased importance of the seed, which has S_T = 0.06. The random seed's high ranking points to the fact that the timing of side-street departures impacts the ratio of delay between the mainline and side-street phases. The simulation traffic demand is generated using real-world detector counts that are aggregated into 10-minute intervals. For low-volume roads (especially side streets), there is high uncertainty as to when departures occur relative to the mainline phases, which effectively see consistent flow during the 10-minute period. The traffic signals in the simulated network operate in coordinated mode with a cycle length of 150 seconds. If a vehicle arrives at the intersection during the beginning of the coordinated mainline phase service, it can expect to wait longer than a vehicle that arrives near the end of the service. The impact of side-street arrivals on the localized average delay is realized in the random seed.
The car-following and lane-change parameter distribution shapes (uniform, log-normal, or normal) and variances are non-influential, with the exception of SF_v^σ. While this contradicts [35] and the finding that lane changes cause instability in freeway flows, only a subset of model parameters was considered. Because of a lack of literature on the proper bounds for these parameters, there is a possibility that they were constrained too conservatively. The finding of relatively low sensitivity to inter-driver distribution parameters is, on its face, contrary to [20]. However, that work investigated Edie's space mean speed error between NGSIM-based ground truths and simulations with various inter-driver distributions. It examined three different congestion regimes and ultimately found that different distributions perform best in different congestion scenarios, with others introducing high amounts of error. The simulation outputs measured in this sensitivity analysis are not as high-resolution as those in [20]. Sensitivity to relative error is not analyzed here, meaning that the distribution-specific parameters may still contribute to trajectory-based error, yet that error does not influence the aggregate simulation outputs in this work. Prior literature has shown that trace errors for individual vehicles are less important when looking at aggregated emissions/fuel consumption [25]. For the aggregate measures presented in this work, the mean of a parameter distribution is more important than either the shape or the width of the distribution.
1) Discussion of Intra-SA Variation: The empirical cumulative distribution functions (eCDFs) of the simulation outputs are investigated in Figure 6. The total network fuel consumption has a skew of 0.53 and a kurtosis of 0.56 (Fisher's definition), indicating a right skew towards higher fuel consumption with more outliers than a normal distribution; the same holds for average travel time, average delay, and the ratio of delay.
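Moment-based skewness and excess kurtosis estimates of this kind can be computed without external libraries; a generic sketch (the sample data are illustrative):

```python
def skew_and_kurtosis(xs):
    """Sample skewness and excess kurtosis from central moments.
    Fisher's definition: a normal distribution gives kurtosis 0."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# A symmetric sample has zero skew and (here) negative excess kurtosis:
skew, kurt = skew_and_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
```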
Without further knowledge of the inter-driver distributions of car-following and lane-change parameters, the resulting eCDFs represent the range of values that the modeller should consider, as they are all plausible. However, the 50th percentile in the eCDFs does not represent the most likely outcome for the network, only the most likely outcome when a low-discrepancy Sobol sequence is built using the bounds presented in Table I.

Fig. 6. Empirical distribution functions (eCDFs) for average network travel time, the ratio of delay, the aggregate network fuel consumption, and average delay. Plotted on top is the range of results (5th to 95th percentile) from simulating with SUMO default car-following parameters and those coming from two commonly cited papers. Paper 1 refers to [64] and Paper 2 to [65].
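An eCDF of the kind plotted in Figure 6 is straightforward to construct (a generic sketch, not the paper's plotting code):

```python
def ecdf(samples):
    """Empirical CDF: each sorted value paired with the cumulative
    probability of observing a value at or below it."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]

points = ecdf([3.0, 1.0, 2.0, 4.0])
```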
Included in Figure 6 are the ranges of outputs that can be expected when simulating with default car-following parameters from SUMO and from two commonly cited IDM calibration papers ([64], [65]). Each of the aforementioned distributions was run 100 times, with SUMO's default SF_v of 1.0. The range of simulated outcomes between the 5th and 95th percentiles is represented by the shaded areas. Both papers define the inter-driver distributions, meaning that there is heterogeneous traffic in simulation, which explains why the paper output distributions have more variance than the SUMO defaults, which yield homogeneous traffic.
The clustering of results from the two literature-based car-following parameter distributions, apart from those of the SUMO defaults, underscores a significant finding of this study. It suggests that simulations using SUMO defaults yield traffic flow dynamics that are markedly different from those obtained with parameters derived from literature. While the impact of this difference on traffic signal control is not further investigated in this study, the delay ratio indicates that simulations with literature-based parameters accrue significantly more delay in the mainline phases than those with SUMO defaults. The implications of this observation remain to be explored. However, it is plausible that this effect could influence the phase timings that model-based TSC identifies as optimal.
2) Investigating the Importance of SA Bounds: Definition of the input parameter distribution is one of the most delicate steps in SA, especially for variance-based methods like Sobol's [36]. In fact, literature has shown that different definitions can lead to dramatic differences in SA conclusions [74]. The same observation was made in the formation of this study. In prior works performing SAs on the IDM car-following model, it is common to see the lower bound for a below 0.5 m/s², with the upper bound of τ reaching 5 s [25], [41], [63]. These works did not use traffic microsimulation software, but instead investigated a microscopic traffic flow model presented only in mathematical equations.
Utilizing the bounds derived from the referenced prior literature often leads to numerous simulations that do not pass calibration. This outcome resonates with the findings of a previous study in which real-world car-following behavior was juxtaposed with simulations using SUMO [75]. In that study, a significant discrepancy was observed between the real-world and simulated behaviors, further emphasizing the challenges in accurately calibrating traffic simulations. While the troublesome parameter combinations (low a, high τ) do not occur in the IDM calibration literature (to the authors' knowledge), the input Sobol sequence fills the entire hypercube with equal likelihood. The eCDFs of the resulting simulations show long right tails and high skew, which can lead to incorrect SA conclusions when using Sobol's variance-based method [36], [76].
As the focus of this work was on simulation outputs and not calibration, using SA indices derived from simulations that fail calibration was undesirable. The goal was to analyze a range of outputs that could all be considered feasible given the calibration procedure described in Section III-C. In the face of unrealistic outcomes in SA, there are several strategies for redefining the input space, filtering the model output, and/or using a sensitivity analysis method that better handles skew [36]. Ultimately, the best solution for this work was narrowing the range of model parameters, with the final acceptable ranges presented in Table I. This is a form of calibration in itself. While the initial SAs with wide bounds resulted in faulty indices, finding combinations of input parameters that lead to outliers helped to narrow the inputs into realistic regions.

B. SA 2
After running SA1, p_truck and the truck-specific SF_v, a, τ, and b were added as parameters. All lane-change parameters other than impatience were removed due to their low sensitivity indices, resulting in 22 input parameters. A total of 49,152 simulations were run using N = 2048, with the primary goal of exploring the role of fleet composition in the output metrics. With the addition of trucks, 2.29% of simulations failed calibration, roughly 1% more than in SA1. Still, analysis showed that the main contributing factor to the calibration failures was randomness. For example, a common failure point is a high frequency of truck departures on side streets. The simulated trucks require larger time gaps to make unsignalized right turns, and thus cause congestion on side streets, leading to volumes below the observed vehicle counts.
Figure 7 shows the sensitivity of simulation output metrics with respect to car-following model parameters, lane-change parameters, and fleet composition, with the bounds presented in Table I. The importance of fleet composition in the simulation output metrics is clear. Class 8 trucks dominate fuel consumption in the network, so much so that per-vehicle consumption is strongly correlated with fleet composition, r(49150) = 0.98, p < 0.01. Without trucks, the 95th percentile fuel consumption is 380.96 L, or 0.076 L/vehicle. With trucks, that value becomes 949.76 L, or 0.19 L/vehicle. Fixing the percentage of trucks would reduce the variance in fuel consumption by 97.0 ± 5.5% (S_1). The importance of all other parameters, even those in SA1, is reduced, with none eclipsing the 0.05 mark.
The remaining three metrics demonstrate sensitivity to fleet composition, albeit to a lesser degree than fuel consumption. In contrast to SA1, the confidence intervals for the top parameters in these three metrics did not converge to the target of 10% of the S_T value. Despite this, the results remained stable for N > 256, and thus the relative ranking of these parameters can be trusted. The trend for average delay is consistent with that in Section IV-A, with the notable exception that p_truck emerges as the second most sensitive parameter. The significance of p_truck can be attributed to the lower average a of trucks, coupled with their additional length. In SA2, the two vehicle classes were modelled with independent a distributions. As the number of trucks increases, the lower average a of trucks influences the average delay and travel time of the entire vehicle population. However, the impact is not merely a result of weighting the average by the number of slow-accelerating vehicles. Long trucks that are slow to accelerate also affect traffic flow, causing delay in vehicles that might otherwise have experienced low delay due to a high a parameter. This analysis is also applicable to average travel time, with the addition of an independent SF_v distribution for trucks, which is on average slower than that of passenger cars. This further underscores the importance of fleet composition in average travel times.
The inclusion of a heterogeneous fleet in the simulation also influences the delay ratio, as indicated by Figure 7. It is the second most influential parameter after a, and as the truck percentage increases, so does the delay ratio. This implies that the more trucks there are in the simulation, the more the total delay is comprised of delay accrued on the mainline phases. Similar to SA1, the seed also surpasses the 0.05 cutoff, the cause of which is explored in Section IV-A. However, the inclusion of trucks as a parameter introduces additional randomness in both the departure time and the type of vehicle that enters the network at a given time.
1) Discussion of Intra-SA Variation: The range of possible simulation outputs is explored in Figure 8. Comparing the eCDFs from SA2 to SA1 in Figure 8 reveals the impact of fleet composition on both the magnitude and the shape of the distributions. The total fuel consumption distribution is nearly uniform, while average delay and average travel time both have positive skew and long right tails. These tails are made up of a disproportionate number of calibration-failing simulations. For example, 2.45% of simulations fail calibration when considering all simulation evaluations in SA2; however, 17.9% of the simulations above the 95th percentile of average delay fail calibration, and 52.5% fail above the 99th percentile. The calibration failures are caused by both randomness and fleet composition. Of the simulations that fail calibration, 64% have a higher percentage of trucks than the SA average. Trucks require a larger time headway to pull from side streets onto the mainline, and thus a sequence of trucks on the side streets can cause unrealistic congestion, leading to calibration failure. The routing algorithm does not differentiate between trucks and cars; thus, the higher the percentage of trucks, the more likely a queue of trucks on side streets.
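The conditional tail failure rates quoted above can be computed with a simple percentile filter; the run data below are hypothetical:

```python
def tail_failure_rate(metric, failed, percentile):
    """Share of runs above the given percentile of `metric` that
    failed calibration (conditional failure rate in the right tail)."""
    cutoff = sorted(metric)[int(len(metric) * percentile / 100) - 1]
    tail = [f for m, f in zip(metric, failed) if m > cutoff]
    return sum(tail) / len(tail) if tail else 0.0

# Hypothetical: 10 runs where the two highest-delay runs failed.
delays = [10, 11, 12, 13, 14, 15, 16, 17, 40, 50]
failed = [0,  0,  0,  0,  0,  0,  0,  0,  1,  1]
rate = tail_failure_rate(delays, failed, 80)
```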

V. CONCLUSION & DISCUSSION
In the face of uncertainty in traffic simulation, sensitivity analysis is a powerful tool for finding the meaningful parameters to calibrate and for analyzing variance. Popular literature in the traffic simulation space typically uses SA as part of a calibration workflow by analysing error metrics. However, as is common outside of the literature, the authors of this work did not have high-resolution data with which to calibrate microscopic car-following models. Only detector counts were available, which gave volume and location pairs, but little else. So, instead of running the sensitivity analysis against an error metric as prior traffic-simulation-based sensitivity analyses have done, the variance of common traffic simulation output metrics was analyzed using calibrated traffic volumes.
After running two extensive sensitivity analyses (and many more unpublished iterations), the following key takeaways were realized:
1) The importance of input parameters varies across output metrics, but fleet composition, the speed at which vehicles drive relative to the speed limit, the mean values of all of the IDM CF-model parameter distributions, and the impatience parameter are consistently important.
2) Neither the type of inter-driver distribution, the variance of the CF-model parameters, nor the lane-change model parameters are influential in the considered output metric variance, with the exception of SF_v^σ's role in fuel consumption, where a high SF_v mean combined with low inter-driver variance can lead to high aggregate fuel consumption.
3) SUMO default parameters for the IDM CF-model and those from the literature lead to significantly different simulation output distributions (compare the median values in Figure 6 against the SUMO defaults). However, without clear supporting data, the authors still recommend the use of the simulation software's default parameters if localized datasets are not available for calibration.
4) The bounds for the sensitivity analysis are of critical importance. Too-wide ranges of a and τ led to many simulations failing calibration and generated significant skew in the simulation outputs. The bounds must instead be set to realistic ranges for the modeller's simulation software.
The observed importance of the distribution means of the IDM CF-model parameters matches prior literature [25], [63], while also contributing new information about the importance of SF_v, p_truck, and impatience. Further, this work shows that both the variance and the distribution type are non-influential when considering aggregate simulation output metrics, as are the lane-change model parameters. Still, the results emphasize the importance of considering inter-driver and fleet heterogeneity, with the majority of output variance being described by the aforementioned parameters.
Care should be taken if generalizing these results to other emissions models, car-following models, lane-change models, or simulation software. The authors' best judgement was used at several points, not least of which being the narrowing down of the input parameters chosen (i.e., Table I). Initial sensitivity analyses were performed with CF-model parameter ranges found in prior literature. However, too-wide ranges of a and τ led to many simulations failing calibration, which generated significant skew in the simulation outputs. They must instead be set to realistic ranges for the modeller's simulation software.
There is clear potential for future research extending this approach in a couple of areas. Ge and Menendez [41] have laid out a methodology for handling correlated inputs, and it would be interesting to investigate how the results differ when considering correlated input parameters. In the authors' own work, the takeaways from this paper will be used to guide stochastic simulations for traffic signal optimization. The prescription is to consider the means of the p_truck, SF_v, a, τ, and b distributions as uncertain in the absence of localized information on the simulated network.

Fig. 1. SUMO model of the simulated network, including three intersections.

Fig. 2. USDOT calibration plot for a simulation of the representative day in the EB direction at TL2. The calibrated model stays within the bounds defined by the USDOT calibration manual, even during transient periods of high volume. The diamonds show the range of the day that was simulated.

Fig. 4. Total (S_T) and first-order (S_1) sensitivity indices and corresponding confidence intervals for each considered parameter and simulation metric in SA1. SF_v, a, b, and τ stand out as the most influential parameters for all four metrics.

Fig. 5. Parallel coordinates plot displaying the relationship between input parameters with high sensitivity indices and average vehicle delay. The lines are colored by average per-vehicle delay (as shown on the rightmost axis).

Fig. 7. Total (S_T) and first-order (S_1) sensitivity indices and corresponding confidence intervals for each considered parameter and simulation metric in SA2. The importance of p_truck in fuel consumption is immediately clear.

Fuel consumption is strongly correlated with fleet composition, r(49150) = 0.98, p < 0.01. Without trucks, the 95th-percentile fuel consumption is 380.96 L, or 0.076 L/vehicle; with trucks, it becomes 949.76 L, or 0.19 L/vehicle. Fixing the value of p_truck would reduce the variance in fuel consumption by 97.0 ± 5.5% (S_1). The importance of all other parameters, even those influential in SA1, is reduced, with none eclipsing the 0.05 mark. The remaining three metrics demonstrate sensitivity to fleet composition, albeit to a lesser degree than in SA1. In contrast to SA1, the confidence intervals for the top parameters in these three metrics did not converge to the target of 10% of the S_T value. Despite this, the results remained stable for N > 256, and thus the relative ranking of these parameters can be trusted. The trend for average delay is consistent with that in Section IV-A, with the notable exception that p_truck emerges as the second most sensitive parameter. The significance of p_truck can be attributed to the lower average a of trucks, coupled with their additional length. In SA2, the two vehicle classes were modeled with independent a distributions. As the number of trucks increases, the lower average a of trucks influences the average delay and travel time of the entire vehicle population. However, the impact is not merely a result of weighting the average by the number of slow-accelerating vehicles: long trucks that are slow to accelerate also disrupt traffic flow, causing delay for vehicles that might otherwise have experienced low delay due to a high a parameter.
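The interpretation of S_1 used above, the expected fractional variance reduction achieved by fixing a factor, can be illustrated with a toy fuel model. The coefficients below are invented for the sketch (they are not fitted to the paper's data); the point is only that when one input dominates, conditioning on it collapses the output variance and yields a correlation near 1, mirroring the reported r = 0.98 and S_1 ≈ 0.97 for p_truck.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Toy fuel model: per-run fuel dominated by truck share, with a small residual
# contribution lumped into "other" (illustrative coefficients, not fitted data).
p_truck = rng.uniform(0.0, 0.2, N)
other = rng.uniform(0.8, 1.2, N)
fuel = 400 + 2800 * p_truck + 250 * other

var_total = np.var(fuel)
r = np.corrcoef(p_truck, fuel)[0, 1]          # Pearson correlation with truck share

# Approximate E[Var(fuel | p_truck fixed)] by binning p_truck into narrow slices.
edges = np.linspace(0.0, 0.2, 51)
bins = np.digitize(p_truck, edges)
resid = np.mean([np.var(fuel[bins == b]) for b in np.unique(bins)])
S1_est = 1 - resid / var_total                 # variance fraction removed by fixing p_truck
print(f"r = {r:.2f}, S1(p_truck) ~ {S1_est:.2f}")
```

With the dominant factor fixed, only the small residual variance remains, which is exactly what the reported 97.0% reduction expresses for the real simulations.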

Fig. 8. Empirical distribution functions for average network travel time, the ratio of delay, network aggregate fuel consumption, and average delay. The eCDFs from SA1 are plotted for comparison.

TABLE I: RANGE OF PARAMETERS CONSIDERED FOR THE SENSITIVITY ANALYSIS