Driver-Pedestrian Interactions at Unsignalized Crossings Are Not in Line With the Nash Equilibrium

Recent developments in vehicle automation require simulations of human-robot interactions in the road traffic context, which can be achieved by computational models of human behavior such as game theory. Game theory provides good insight into road user behavior by considering agents’ interdependencies. However, it is still unclear whether conventional game theory is suitable for modeling vehicle-pedestrian interactions at unsignalized locations or whether more complex models such as behavioral game theory are needed. To answer this question, we compared four game-theoretic models based on two different payoff formulations and two solving algorithms. Unlike most previous studies, which employed naturalistic datasets to test and validate such models, this study used a distributed simulation dataset to test and compare the models. The study was conducted by connecting a CAVE-based pedestrian simulator to a motion-based driving simulator to replicate the traffic scenarios for 32 pedestrian-driver pairs. The findings demonstrated high variability between participant pairs’ behaviors. Our proposed behavioral game-theoretic model outperformed the other models in predicting the interaction outcome, which translates to a decrease of 70% and 67% in the root mean squared error (RMSE) relative to the baseline model for marked and unmarked crossings, respectively. The model can also predict which interactions will take the longest to resolve. According to our results, road users cannot be expected to behave in line with the Nash equilibrium of conventional game theory, which underscores the complexity of human behavior and has implications for the testing and development of automated vehicles.


I. INTRODUCTION
Road user interaction has been a topic of interest for years from a safety perspective for human-human interaction [1] and has become popular in recent years due to successive improvements in vehicle automation bringing the challenges of human-robot interaction into the topic [2], [3], [4]. Among different types of interactions, the interaction of pedestrians as vulnerable road users (VRUs) with drivers and automated vehicles (AVs) has a great impact on traffic safety and efficiency, as pedestrians constitute a great proportion of the traffic ecosystem [5]. They are also known to exhibit unpredictable behaviors [6]. To this end, previous research has strived to understand [7], [8], [9], [10] and quantitatively model [11] how VRUs and vehicles/AVs interact with each other, with the latter becoming an essential part of the test and development procedure for the future deployment of AVs [12].
The associate editor coordinating the review of this manuscript and approving it for publication was Jjun Cheng.
The following sections review related work on the computational modeling of road user behavior. The methodology of this study is then explained, covering the empirical investigation, the computational models, and model fitting. The results section compares the models developed in this study. The paper concludes by discussing the findings and presenting the conclusions.

A. RELATED WORK
Existing modeling approaches to road user behavior are often separated into two types of architecture: glass-box and black-box models [13]. Black-box models such as deep learning models offer a generalizable approach in which the behavior of several agents can be simulated with high accuracy [14], but the underlying mechanisms of the model components are unknown: there is a lack of interpretability in the connection between the inputs and outputs of the model [15] and with human psychological theories, which makes model interpretation difficult. On the other hand, glass-box models offer the advantage of interpretability and transparency by providing explanations for the mechanisms in relatively great detail. These models rely on different modeling paradigms including agent-based modeling [16], [17], optimal control theory [18], [19], Markovian processes [20], [21], evidence accumulation [22], [23], proxemics [24], discrete choice modeling [25], [26], and game theory [27].
Of the above modeling approaches, agent-based and discrete choice models have a long and rich history in predicting road user behavior. Agent-based models have been used for modeling different traffic scenarios, such as two-dimensional trajectory modeling of vehicular movements at intersections, where one-dimensional simplification is not enough to capture road user behavior and distance-based factors play a more important role than time-based variables [28]. The downside of these models is that road users are generally assumed to act mostly like moving objects without considering each other's intentions before making every decision. Logit models are among the most commonly used models for modeling pedestrians' gap acceptance behavior [26], [29], [30], [31] due to the binary nature of pedestrian crossing decisions, the convenience of utilizing them, and the flexibility of their application together with other models [32]. They have been compared to a number of statistical methods, namely the maximum likelihood method, Raff's method, the root mean square method, and the probability equilibrium method, and have been found to be the most appropriate model for estimating the critical gaps of pedestrians [33]. Moreover, their ability to be incorporated into other modeling approaches such as microscopic traffic flow models [34] and artificial neural networks [35] makes them an attractive choice. Having said that, the discrete nature of these models provides no concept of time, such as time-varying utility functions, and no ability to fully capture traffic agents' interdependencies. To this end, other modeling approaches such as evidence accumulation and game theory have become popular for road user behavior modeling studies in recent years.
Evidence accumulation offers a well-established depiction of human behavior for some specific decisions [36], [37] and suggests that evidence for a particular response is integrated by single or multiple accumulators over time and at a rate known as the drift rate, which is the rate at which sensory information reaches a bound (a decision boundary) [23]. This model has been used for simulating and predicting driver gap acceptance in left turns [38], pedestrian crossing decisions [22], [39], and AV-human interactions in take-over and crossing scenarios [40]. That said, while evidence accumulation models provide ample detail about the decision-making process, they do so for a very constrained set of tasks and are typically considered single-decision models, suggesting they may not be able to account for all types of interaction scenarios. Also, as opposed to game-theoretic models, these models are mostly incapable of capturing road users' interdependencies.
Game theory extends optimal control theory to a decentralized multi-agent decision problem [41] and explains the interaction of multiple agents whose interests do not coincide and whose decisions generally depend on the actions of all [42]. In this model, agents keep revising their decisions and beliefs until they become mutually consistent, that is, until the (Nash) equilibrium is reached. This is the core idea in conventional (also known as orthodox or traditional) game theory, which relies on the perfect rationality of players, who are always assumed to be self-interested and to make optimal choices. Overall, conventional game theory has the advantage of accounting for interdependencies, unlike agent-based, logit, and evidence accumulation models [43]. Thus, it has been used in several vehicle-pedestrian interaction studies [44], [45], [46], [47], [48].
However, behavioral economics suggests that agents' preferences, along with concern for fairness, are highly context-dependent [49]: individuals make decisions based on a heuristic estimate of the potential value of losses and gains [50], and they do not usually play the Nash equilibrium in strategic situations such as unrepeated normal-form games [51]. This is due to different reasons, including bounded rationality [52], [53] and positive theory [54], which are the backbones of behavioral game theory. Behavioral game theory utilizes experimental evidence to create computational models of human cognitive limitations, social utility and preferences, and learning rules aware of "how people actually behave in strategic situations" [55].
To date, several behavioral game-theoretic models have been introduced and tested using economic games. For instance, the dual accumulator (DA) model, which combines the knowledge of the evidence accumulation paradigm with game theory, is a promising approach to simulating human decision-making [56]. The authors compared their model to several existing behavioral game theory models, i.e., noisy introspection [57], logit quantal response equilibrium [58], level-k reasoning [53], and cognitive hierarchy theory [59], employing a hold-one-out analysis. They showed that the model makes the most accurate out-of-sample predictions [56]. However, this model has not previously been tested in the context of road user modeling, highlighting a gap in the literature. Some studies have employed other behavioral game theory models for the road traffic context, such as the logit quantal response equilibrium in vehicle-pedestrian interactions [60], and level-k reasoning [61], [62], [63] and cognitive hierarchy reasoning [64] in vehicle-vehicle (including AV) interactions, showing that the models can capture road user behavior well. Using two different multiagent Markov games, i.e., one based on the Nash equilibrium and one based on logit quantal response equilibrium, Alsaleh and Sayed estimated cyclist-pedestrian strategies using a multiagent deep reinforcement learning approach and found that the latter predicted road user trajectories with higher accuracy [63].
110708 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
All things considered, to the best of our knowledge, no study has directly compared conventional game theory to behavioral game theory in the vehicle-pedestrian interaction domain. Hence, it is currently unclear whether conventional game theory models are sufficient for road user interaction, and especially vehicle-pedestrian interactions, or whether the higher modeling complexity provided by behavioral game theory is needed. There is also a lack of comparison between game-theoretic models and logit models. The main contribution of this study is a comparison of these two types of game theory models with each other and with logit models (representing the popular modeling approach in this area). This was done using a dataset from a controlled distributed simulator study. Unlike naturalistic studies, which are the common validation tools for models of road user behavior [66], controlled studies provide a safe environment where one can directly control the interactions between agents, varying the conditions of interest to study their causal (rather than correlational) impact on behaviors and outcomes. Also, this technique enables multiple observations for each participant, allowing a better understanding of interindividual differences.
Our main research question is as follows: Are traditional models such as logit and conventional game theory (the Nash equilibrium) enough to predict vehicle-pedestrian interaction outcomes at unsignalized locations, or are more complex models such as behavioral game theory needed?

II. METHODOLOGY
This section describes all the methods used in the study, beginning with a description of the controlled distributed simulation study, followed by a definition of each computational model, and details of the model fitting.

A. EMPIRICAL STUDY
A distributed simulator study was conducted to investigate road user interactions in a safe and controlled environment, providing a large dataset of vehicle-pedestrian interactive behaviors to test and validate the computational models of this study. The full details of the study can be found in [67]. Here, we provide a summary of the study.
The study was conducted by connecting the University of Leeds Driving Simulator (UoLDS) to the HIKER (Highly Immersive Kinematic Experimental Research) pedestrian lab. UoLDS is a high-fidelity motion-based driving simulator with an eight degree-of-freedom motion platform carrying a Jaguar car housed in a 4 m-diameter spherical projection dome, with a 300° field-of-view projection system. HIKER is a 9 × 4 m CAVE simulator consisting of eight 4K projectors that are used to project virtual scenes at 120 Hz onto the floor and walls. Fourteen body markers were attached to different parts of the pedestrian's body, represented as pink spheres to the driver (Figure 1-a). The pedestrian could also see the vehicle, as shown in Figure 1.
Upon arrival, both participants were asked to sit in their respective briefing areas in two separate rooms and to read and sign the consent form. The pedestrian was instructed to stand at a marker on the HIKER's floor (the first blue cross in Figure 1-c), where they could see that cars were going both ways but could not tell when the subject vehicle was approaching due to a visual obstruction (a bus stop; Figure 1-c). After hearing an auditory tone, they were asked to step to a second marker, which was the curb of the virtual road, where the driver could see them. This marked the beginning of the interaction. The participants (driver and pedestrian) could decide whether to wait for the other to pass first or to pass first themselves. Both participants were told: "Please assume that you are late for an important meeting, such that you want to avoid any unnecessary delays, but of course, you also want to stay safe." Drivers were told to maintain the speed limit (30 mph) as they would in their normal driving and were also reminded that pedestrians have priority at zebra crossings. Upon completion of the experiment, participants were asked to fill out post-experiment questionnaires for demographic information and personality traits (not reported here, see [65]).

B. COMPUTATIONAL MODELS OF VEHICLE-PEDESTRIAN INTERACTION

1) LOGIT MODEL (LOGIT)
A logistic model was tested, assuming the utilities to be a linear function of time to arrival (TTA) and pedestrians' total waiting time, which is in line with the literature [26]. A different intercept was considered for each crossing type. As the probabilities of the pedestrian passing first and waiting can be denoted by $P(U)$ and $1 - P(U)$, respectively, the probability of the pedestrian passing first can be defined using the logit function [26], where $U$ is the utility of waiting/passing for the pedestrian, $\beta_{0z}$ and $\beta_{nz}$ are intercepts for the unmarked and marked crossings, respectively, and $\beta_1$ and $\beta_2$ are coefficients for TTA and the waiting time of pedestrians, respectively.
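The logit model described above can be sketched in a few lines. This is a minimal illustration, assuming the standard logistic link $P = 1/(1 + e^{-U})$ and a linear utility made of a crossing-specific intercept plus terms for TTA and waiting time. The function names, argument names, and coefficient values are hypothetical and are not the fitted values of the study.

```python
import math


def pedestrian_utility(tta_s, wait_s, zebra, beta_zebra, beta_nonzebra,
                       beta_tta, beta_wait):
    """Linear utility: crossing-specific intercept + TTA term + waiting-time term."""
    intercept = beta_zebra if zebra else beta_nonzebra
    return intercept + beta_tta * tta_s + beta_wait * wait_s


def prob_pedestrian_first(utility):
    """Logistic link mapping a utility to the probability of passing first."""
    return 1.0 / (1.0 + math.exp(-utility))


# Hypothetical coefficients, chosen only to illustrate the mechanics.
PARAMS = dict(beta_zebra=0.5, beta_nonzebra=-1.5, beta_tta=0.6, beta_wait=-0.01)

p_zebra = prob_pedestrian_first(pedestrian_utility(4.0, 30.0, True, **PARAMS))
p_nonzebra = prob_pedestrian_first(pedestrian_utility(4.0, 30.0, False, **PARAMS))
print(f"zebra: {p_zebra:.3f}, non-zebra: {p_nonzebra:.3f}")
```

With these illustrative coefficients, the same TTA and waiting time give a higher probability of the pedestrian passing first at the zebra crossing, purely because of the different intercepts.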

2) ORIGINAL CONVENTIONAL GAME-THEORETIC (OCGT) MODEL
A conventional game theory model by Wu et al., which considers the two-agent vehicle-pedestrian game, was chosen and slightly modified [48]. This model was chosen due to its well-balanced integration of road user safety and efficiency metrics, the ease of working with its payoff formulation, and the fact that it is one of the few game-theoretic models in the literature with an explicitly stated payoff formulation. The model was established as the baseline for comparison against other models utilizing more complex payoff formulations and solving algorithms. Table 1 shows the parameters of the study, including the parameters of the Wu et al. model [48]. Table 2 shows the payoff formulation of the Wu et al. model. The model's payoffs are defined as a summation of utilities relating to (i) the perceived risk of being involved in a conflict with another road user, modeled as $k = 1/\mathrm{TTA}$, and (ii) the time spent as a result of one road user waiting for another, which is equal to the time that the passer takes to pass the crossing ($t_i$). The presence of these utility values in all outcomes, with a negative sign when they have a negative influence on a road user or a positive sign otherwise, is the main assumption of the formulation. Additionally, a weight coefficient was considered for the total waiting time of the pedestrians with the following assumption: pedestrians who have waited for a longer time are more inclined to be cautious and less likely to engage in risk-taking by accepting smaller gaps [31]. This was assumed in the opposite direction in the original paper [48], as the authors' definition of waiting time was different from that of our study.
Table 2 suggests that there is no unique Nash equilibrium, and the game has two dominant strategies {(pedestrian pass, vehicle wait), (pedestrian wait, vehicle pass)}, which can be obtained using the mixed strategy algorithm by equating the expected utilities of each player [68].
In (3), $P_{pp}$ and $P_{vw}$ are the probabilities of the pedestrian passing first and the vehicle waiting, respectively [48]. Another dominant strategy ($P_{pw}$, $P_{vp}$) can be obtained as one minus the probabilities in (3). In this study, we present all the results based on the pedestrian's probability of passing first.
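To illustrate the idea of equating expected utilities, the sketch below solves a generic two-player, two-action game for its mixed-strategy equilibrium. It is not the Wu et al. payoff formulation: the payoff matrices `U_PED` and `U_VEH` are hypothetical placeholder values, and the function name is an assumption made for this example. Each player's mixing probability is the one that makes the other player indifferent between passing and waiting.

```python
def mixed_strategy_equilibrium(u_ped, u_veh):
    """Mixed-strategy equilibrium of a 2x2 game.

    u_ped[i][j] and u_veh[i][j] are the payoffs when the pedestrian plays
    action i and the vehicle plays action j (0 = pass, 1 = wait).
    Returns (P(pedestrian passes), P(vehicle passes)).
    """
    # The vehicle's mixing probability makes the pedestrian indifferent.
    den_veh = u_ped[0][0] - u_ped[0][1] - u_ped[1][0] + u_ped[1][1]
    # The pedestrian's mixing probability makes the vehicle indifferent.
    den_ped = u_veh[0][0] - u_veh[1][0] - u_veh[0][1] + u_veh[1][1]
    if den_veh == 0 or den_ped == 0:
        raise ValueError("no interior mixed-strategy equilibrium")
    q_veh_pass = (u_ped[1][1] - u_ped[0][1]) / den_veh
    p_ped_pass = (u_veh[1][1] - u_veh[1][0]) / den_ped
    if not (0.0 <= p_ped_pass <= 1.0 and 0.0 <= q_veh_pass <= 1.0):
        raise ValueError("indifference solution is not a valid probability")
    return p_ped_pass, q_veh_pass


# Hypothetical payoffs (rows: pedestrian pass/wait, columns: vehicle pass/wait).
U_PED = [[-10.0, 2.0], [-1.0, -2.0]]
U_VEH = [[-10.0, -1.0], [2.0, -2.0]]

p_pass, q_pass = mixed_strategy_equilibrium(U_PED, U_VEH)
print(f"P(pedestrian passes) = {p_pass:.3f}, P(vehicle waits) = {1 - q_pass:.3f}")
```

The explicit range check mirrors the concern, raised later in the model-fitting section, that some parameterizations of a conventional model yield probabilities outside the range of 0 to 1.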

3) ALTERNATIVE CONVENTIONAL GAME-THEORETIC (ACGT) MODEL
An alternative payoff formulation was proposed, based on Wu et al.'s original payoff. The formulation was provided to correct some of the assumptions of the original payoff, which we suspected did not correctly capture road users' perceived utilities of the different outcomes. For instance, road users' utility functions were modified to help the model distinguish between marked and unmarked crossings, as shown in Table 3. According to traffic regulations in the U.K., similar to many western European countries, drivers should give way to pedestrians waiting to pass as well as those at a zebra crossing (see Rule H2 in The Official Highway Code, 2023 [69]). Thus, while pedestrians have priority at a zebra crossing based on the regulations, the driver (vehicle) was also assumed to have priority at non-zebra locations, as there was no refuge for this crossing type and the crossing behavior could be considered an instance of jaywalking [70] in the experiment.
The following modifications were made to the original payoff formulations:
I) The utility of risk perception is not considered when a road user is waiting for the other to pass first, thus removing $k$ from their utilities in these instances.
II) When road users with no right of way want to pass first, they get a higher negative score for risk perception ($knR_p$, $knR_v$, where $R_p = 1$ and $R_v = 0$ if pedestrians have right of way (i.e., at zebra crossings), and vice versa).
III) When a road user waits for the other to pass first, they lose not only the approaching vehicle's TTA but also the pedestrian's estimated crossing duration [$-a(t_v + t_p)$].
IV) When a road user waits for the other to pass but neither of them passes immediately, they lose their own passing time with a multiplier ($m$), which can make this outcome worse than waiting for the other to pass first.
V) When a vehicle waits for a pedestrian, the pedestrian gains the vehicle's TTA ($t_v$) instead of their crossing duration ($t_p$).
Similar to the original model, the above formulation was solved using the mixed-strategy Nash equilibrium, and (4) and (5) show pedestrians' and vehicles' probabilities of passing first and waiting for zebra and non-zebra crossings, respectively.

4) BEHAVIORAL GAME-THEORETIC MODELS
Both the original and alternative payoff formulations were solved using a model from the behavioral game theory category, creating the OBGT [original (solved by) behavioral game theory] and ABGT [alternative (solved by) behavioral game theory] models, respectively. The DA model from the behavioral game theory category was chosen [56] and utilized as an alternative game solution to the mixed-strategy Nash equilibrium. According to the model, agents generate preferences by considering the conveniently available strategies with assumptions about opponents' preferred strategies using evidence and stochastic sampling, i.e., a process with a finite number of accumulation steps in payoffs, inspired by existing cognitive models of preferential choice [56].
The following equations show the model formulation:

$$P_{P,w_P}(t) = \frac{e^{\lambda V_{P,w_P}(t)}}{\sum_{w'_P} e^{\lambda V_{P,w'_P}(t)}} \tag{9}$$

where $V_{D,c_D}(t)$ and $V_{P,w_P}(t)$ are the values of action $c$ = cross for the driver and action $w$ = wait for the pedestrian, respectively; $P_{D,c_D}(t)$ and $P_{P,w_P}(t)$ are the estimated action probabilities for $c$ and $w$, respectively; and $v_{D,c_D,c_P}$ (the value for the driver of action $c$ if the pedestrian plays $c$) and $v_{P,w_P,w_D}$ (the value for the pedestrian of action $w$ if the driver plays $w$) are the payoffs as defined in the two-agent game under study. With increasing $\lambda$, agents are more likely to choose the option with the highest value, whereas lower values of this parameter represent agents with a greater degree of "randomness" in their decisions.
The model was slightly modified and named the generalized DA model. To this end, a distinction mechanism was added to the model which explains how rapidly activations and beliefs are updated by an agent, and how long it takes to perform such an update. This was done by setting the parameters (ω and γ in (6) and (7)) that define the rate of change during an update of the agents' activations (preferences) and beliefs as follows: ω = 1 − γ, whereas in the original DA model it was assumed that ω = γ = 1. Both the ω and λ parameters are called "DA parameters" in this paper. Also, while in the original model the first agent (driver) samples one of the second agent's (pedestrian's) actions $w$ with probability $P_w$ at each time step and updates their own value based on that sample, a weighted average across all possible actions $w$ is taken in the generalized model. The same applies to the other agent's possible actions.
The model has a concept of decision-making over time. This time is known as the model convergence time. The model was considered to have converged when the change between two consecutive action probabilities fell below a threshold of 0.001 for both agents.
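The deliberation loop described above can be sketched as follows. This is not the authors' implementation: the update equations (6) and (7) are not reproduced in this text, so the sketch assumes a simple rule in which each action value is a blend of its previous value (weight γ = 1 − ω) and the expected payoff against the opponent's current action probabilities (weight ω). That assumed rule, the function names, the default parameter values, and the payoff matrices are all hypothetical. Only the softmax choice rule with λ, the weighted average over the opponent's actions, and the 0.001 convergence threshold come from the description above.

```python
import math


def softmax(values, lam):
    """Action probabilities from action values; larger lam means less randomness."""
    top = max(values)
    exps = [math.exp(lam * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def run_da(payoff_ped, payoff_drv, lam=1.0, omega=0.5, tol=1e-3, max_steps=10_000):
    """Iterate values and probabilities until both agents' probabilities settle.

    payoff_ped[i][j] / payoff_drv[i][j]: payoffs when the pedestrian plays i
    and the driver plays j (0 = pass, 1 = wait).
    Returns (pedestrian probabilities, driver probabilities, steps used).
    """
    gamma = 1.0 - omega  # assumed retention weight on the previous value
    v_ped, v_drv = [0.0, 0.0], [0.0, 0.0]
    p_ped, p_drv = softmax(v_ped, lam), softmax(v_drv, lam)
    for step in range(1, max_steps + 1):
        # Weighted average over all of the opponent's possible actions.
        exp_ped = [sum(p_drv[j] * payoff_ped[i][j] for j in range(2)) for i in range(2)]
        exp_drv = [sum(p_ped[i] * payoff_drv[i][j] for i in range(2)) for j in range(2)]
        v_ped = [gamma * v + omega * e for v, e in zip(v_ped, exp_ped)]
        v_drv = [gamma * v + omega * e for v, e in zip(v_drv, exp_drv)]
        new_ped, new_drv = softmax(v_ped, lam), softmax(v_drv, lam)
        change = max(abs(a - b) for a, b in zip(new_ped + new_drv, p_ped + p_drv))
        p_ped, p_drv = new_ped, new_drv
        if change < tol:  # convergence criterion: change below the threshold
            return p_ped, p_drv, step
    return p_ped, p_drv, max_steps


# Hypothetical payoffs in which the pedestrian is favoured.
PAYOFF_PED = [[-4.0, 2.0], [-1.0, -2.0]]
PAYOFF_DRV = [[-6.0, -1.0], [1.0, -2.0]]

ped_probs, drv_probs, n_steps = run_da(PAYOFF_PED, PAYOFF_DRV)
print(f"P(ped passes) = {ped_probs[0]:.3f}, P(driver passes) = {drv_probs[0]:.3f}, "
      f"steps = {n_steps}")
```

The number of steps returned plays the role of the model convergence time: games whose probabilities take longer to settle need more iterations before the threshold is met.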
Figure 2 illustrates how road users decide whether to pass first or wait for each other using the DA model under the following conditions: a) $t_v = 6$ s, $t = 30$ s, at a zebra crossing, and b) $t_v = 5$ s, $t = 45$ s, at a non-zebra crossing. As can be seen from the figure, the model assumes that both the driver's and the pedestrian's values of actions are the same at the first time step (Figure 2-a). As time goes by, the value of passing first becomes lower for the driver while it increases for the pedestrian. This happens because both agents' information about the priority rules and the available safety margin is being updated over time. As a result of this deliberation process, the probability of passing first for the pedestrian increases and converges to a constant value. The opposite happens for the driver. Figure 2-b shows the alternative case, although with a slight difference: just after the first time step, the values of both actions for the driver (pass, yield) tend to decrease and, as a result, the probability of yielding to the pedestrian becomes higher. However, quite soon the probabilities of the actions swap places and the driver decides to pass first, probably upon observing that the pedestrian is less assertive in crossing the road. This happens because the pedestrian feels less safe at an unmarked crossing, although the safety margin seems to be enough for them to pass first.

5) MODEL FIT
All the models were fitted to the experimental dataset using the maximum likelihood estimation method, by computing the likelihood and log-likelihood functions as follows:

$$L_{ij}(\theta) = \begin{cases} p(X_{ij}, \theta) & \text{if pedestrian } i \text{ crossed in trial } j \\ 1 - p(X_{ij}, \theta) & \text{otherwise} \end{cases} \tag{10}$$

$$LL(\theta) = \sum_{i=1}^{n} \sum_{j} \ln L_{ij}(\theta) \tag{11}$$

where $n = 32$ is the number of participant pairs (PPs) and $p$ is the model-predicted probability of the pedestrian crossing first in trial $j$ of participant pair $i$, where $X_{ij}$ specifies the experimental condition on that trial, given model parameters $\theta$.
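As a minimal sketch of this likelihood, the function below computes the negative log-likelihood of a set of observed crossing outcomes given model-predicted probabilities. The function name and the clipping constant `eps` (which guards against taking the logarithm of zero) are assumptions made for this example.

```python
import math


def negative_log_likelihood(predicted_probs, crossed_first, eps=1e-12):
    """Bernoulli negative log-likelihood over trials.

    predicted_probs: model-predicted probabilities that the pedestrian crosses first.
    crossed_first:   observed outcomes (True if the pedestrian crossed first).
    """
    nll = 0.0
    for p, crossed in zip(predicted_probs, crossed_first):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        likelihood = p if crossed else 1.0 - p
        nll -= math.log(likelihood)
    return nll


# Hypothetical predictions and outcomes for four trials.
print(negative_log_likelihood([0.9, 0.2, 0.7, 0.5], [True, False, True, False]))
```

Minimizing this quantity over the model parameters is equivalent to maximizing the likelihood.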
Both DA models (i.e., ABGT and OBGT) were fitted with three different assumptions about the parameters:
a) Using both DA parameters (i.e., ω, λ) and the game-theoretic model's payoff parameters as free parameters, separate per participant pair.
b) Fixing the DA parameters, i.e., choosing two constant values for λ, representing high randomness (λ = 1) and low randomness (λ = 18), and a predefined value (0.9) for ω [56], and using the payoff parameters as free parameters, separate per participant pair.
c) Having the DA parameters shared across all participants and letting the payoff parameters be free per participant pair. In this method, alternating minimization [71] was used to account for varying payoff (free) parameters with shared DA model parameters across the PPs, where $LL(\theta_{PO}, \theta_{DA})$ is the total negative log-likelihood function, $\theta_{PO}$ is the vector of payoff parameters, and $\theta_{DA}$ is the vector of DA model parameters. This method solves the problem by fixing $\theta_{PO}$ and minimizing over $\theta_{DA}$, and then fixing $\theta_{DA}$ and minimizing over $\theta_{PO}$. This method helps the optimization converge toward a global minimizer of the objective, which in our case is the total (sum of) negative log-likelihood across all PPs.
All models were fitted to both crossing locations at the same time, and thus the parameters were shared between the two crossing types. The above procedure was used for all models, using Powell's method as implemented in SciPy [72].
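The alternating scheme with a shared parameter and per-pair parameters can be sketched with `scipy.optimize.minimize(method="Powell")`. The toy model below is a stand-in chosen only to keep the example short, not any of the models in this paper: each pair has its own TTA threshold (playing the role of a per-pair payoff parameter) and all pairs share one sharpness parameter (playing the role of a shared DA parameter). The data, bounds, starting values, and function names are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def pair_nll(threshold, sharpness, tta, crossed):
    """Negative log-likelihood of one pair under a toy logistic model."""
    p = 1.0 / (1.0 + np.exp(-sharpness * (tta - threshold)))
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    return float(-np.sum(np.where(crossed, np.log(p), np.log(1.0 - p))))


def total_nll(thresholds, sharpness, data):
    """Sum of the per-pair negative log-likelihoods."""
    return sum(pair_nll(th, sharpness, tta, crossed)
               for th, (tta, crossed) in zip(thresholds, data))


def alternating_fit(data, n_rounds=5):
    thresholds = np.full(len(data), 3.0)  # per-pair parameters (initial guess)
    sharpness = 1.0                       # shared parameter (initial guess)
    history = [total_nll(thresholds, sharpness, data)]
    for _ in range(n_rounds):
        # Step 1: hold the shared parameter fixed, fit each pair separately.
        for i, (tta, crossed) in enumerate(data):
            res = minimize(lambda x: pair_nll(x[0], sharpness, tta, crossed),
                           x0=[thresholds[i]], method="Powell",
                           bounds=[(0.0, 10.0)])
            thresholds[i] = float(np.atleast_1d(res.x)[0])
        # Step 2: hold the per-pair parameters fixed, fit the shared parameter.
        res = minimize(lambda x: total_nll(thresholds, x[0], data),
                       x0=[sharpness], method="Powell", bounds=[(0.1, 10.0)])
        sharpness = float(np.atleast_1d(res.x)[0])
        history.append(total_nll(thresholds, sharpness, data))
    return thresholds, sharpness, history


# Hypothetical data: (TTA in seconds, whether the pedestrian crossed first).
TTA = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 2.0, 3.0, 4.0, 5.0, 6.0])
DATA = [
    (TTA, np.array([0, 1, 0, 1, 1, 0, 0, 1, 1, 1], dtype=bool)),
    (TTA, np.array([0, 0, 1, 0, 1, 0, 0, 0, 1, 1], dtype=bool)),
    (TTA, np.array([1, 0, 1, 1, 1, 0, 1, 1, 1, 1], dtype=bool)),
]

fitted_thresholds, fitted_sharpness, nll_history = alternating_fit(DATA)
print(fitted_thresholds, fitted_sharpness, nll_history[0], nll_history[-1])
```

Each round alternates between the two blocks of parameters, and tracking the total negative log-likelihood across rounds gives a simple check that the procedure is making progress.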
Table 6 shows the parameter ranges of all game-theoretic models used in the study. The parameter space was chosen in a way that guaranteed the best fit for each model after several rounds of manual testing of the optimization algorithm. The parameter bound criterion for all models was to conform with the theoretical reasoning, for example, by limiting the lower bounds of the multipliers ($c$, $m$, and $n$) to 1 or keeping $a$ and $\delta$ between 0 and 1. The main criterion for choosing the bounds for the conventional game-theoretic models was to discard any parameterization that yields probabilities outside the range of 0 to 1. Also, for all models, the bounds were set in a way that expanding them makes the algorithm choose values resulting in a worse fit.
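All the models were compared using information loss criteria, i.e., the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), as well as error indicators, including the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE). The following equations show the formulations for these metrics:

$$\mathrm{AIC} = 2k - 2\ln(\hat{L}) \tag{13}$$

where $k$ is the number of estimated parameters in the model and $\hat{L}$ is the maximized value of the likelihood.

$$\mathrm{BIC} = k\ln(n) - 2\ln(\hat{L}) \tag{14}$$

where $n$ is the sample size.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|\mathrm{actual}_i - \mathrm{predicted}_i\right| \tag{15}$$

where $|\mathrm{actual} - \mathrm{predicted}|$ is the absolute difference between the actual and predicted probabilities and $n$ is the number of data points.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\mathrm{actual}_i - \mathrm{predicted}_i\right)^2} \tag{16}$$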
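For reference, the four comparison metrics can be computed as in the short sketch below, which uses the standard textbook definitions. The function names and the example numbers are illustrative assumptions and are not results from the study.

```python
import math


def aic(k, log_likelihood):
    """Akaike Information Criterion for k parameters and a maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood


def bic(k, n, log_likelihood):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood


def mae(actual, predicted):
    """Mean absolute error between observed and predicted probabilities."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted probabilities."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical observed and predicted crossing probabilities per time gap.
observed = [0.1, 0.4, 0.8, 1.0, 1.0]
predicted = [0.2, 0.5, 0.7, 0.9, 1.0]
print(mae(observed, predicted), rmse(observed, predicted))
# A model that reports a negative log-likelihood NLL has log-likelihood -NLL.
print(aic(k=5, log_likelihood=-12.3), bic(k=5, n=28, log_likelihood=-12.3))
```

Because AIC and BIC penalize the number of parameters while MAE and RMSE do not, the two families of metrics can rank models differently.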

III. RESULTS
In this section, first, the observed behaviors of participant pairs at both crossings are presented followed by the modeling results for the individual and aggregated data.

A. OBSERVED BEHAVIOR AT BOTH CROSSINGS
Figures 3 and 4 show the crossing behavior as well as the probability of the pedestrian crossing first as a function of time gap, for all 32 PPs and all models. Looking at the panels in Figure 3, it can be seen that while different PPs behaved differently for TTAs of two and three seconds, all of them passed the crossing first at higher time gaps. Also, 28% (nine out of 32) of the pedestrians crossed first in all trials, irrespective of the available safety margin (TTA). Looking at the observed data in Figure 4, one can see that the crossing behavior at the non-zebra crossing was quite different from that at the zebra crossing. First, very few pedestrians passed before the driver at the 2-second TTA (i.e., 11, 12, and 29). Second, the crossing probability increased for the 3-second time gaps but was still low. This was due to the crossing behavior of PPs 3, 11, 12, 20, 23, 24, and 29. Third, the data of some pedestrians, i.e., 17, 20, and 22, showed fluctuations (rises and dips) as TTA increased. Finally, three out of 32 pedestrians (i.e., PPs 4, 25, and 28) did not pass at all, suggesting they were risk-averse.

B. MODEL PERFORMANCE FOR BOTH CROSSINGS
Figure 3 shows that the two conventional game-theoretic models, i.e., ACGT and OCGT, performed comparatively weakly in almost all cases. This can be confirmed by looking at Table 4, which shows the model comparison for both crossing types, including information loss criteria (AIC, BIC) and error indices (MAE, RMSE). However, when Wu et al.'s payoff formulation was solved with the DA model (OBGT), a clear improvement can be seen in all cases, according to the plots in Figure 3 and Table 4. Also, the logit model did a better job of capturing pedestrians' crossing behavior than the ACGT, OBGT, and OCGT models but was outperformed by our proposed model (ABGT) in almost all cases. The exceptional performance of the ABGT model is evident in the plots of PPs 2, 13, 14, 15, 20, 24, 25, 27, and 31.
Figure 4 and Table 4 show that, similar to the zebra crossing, the Wu et al. model combined with the DA model, i.e., OBGT, performed better overall than the baseline model (OCGT), but the differences were subtle for the non-zebra crossing. Also, the logit model performed second-best, with a weaker performance compared to the zebra crossing. Unlike at the zebra crossing, the differences between ACGT and ABGT are much more obvious. Although the two models utilize the exact same payoff formulations, the ABGT model outperformed all other models in almost all cases, while it is clear that the ACGT model was not capable of exhibiting the observed pattern of probabilities. For ABGT, the model's capability to capture the more complex crossing behaviors of pedestrians Nos. 3, 6, 8, 13, 14, 18, 19, 27, and 30 is specifically noticeable compared to other models. In addition, this model achieved a 70% reduction in RMSE for marked crossings and a 67% reduction for unmarked crossings when compared to the baseline (OCGT) model. However, it is worth noting that comparing models with varying numbers of free parameters should encompass both measures of model parsimony, such as AIC and BIC, and error indices. One cannot assert that a model is superior solely based on its predictive accuracy. Overall, Table 4 shows that when moving from conventional to behavioral game-theoretic models, improvements are observable in all criteria, including negative log-likelihood, which confirms the observations of Figures 3 and 4.

C. OVERALL RESULTS FOR BOTH CROSSING TYPES
Figure 5 shows the average of all 32 pedestrians' crossing probabilities over time gaps for both crossing types. In line with the individual data, the overall fine performance of the ABGT model and the better performance of the Wu et al. model combined with the DA model (OBGT) compared to the original model (OCGT) are evident for both crossing types.

D. ABGT MODEL DECISION TIME
As explained in Figure 2, the dual accumulation process in the DA model corresponds to a deliberation process that unfolds over time, so a model convergence (decision) time can be defined. We therefore conducted a correlation test to examine whether the ABGT model's decision time (DT) is related to pedestrians' crossing initiation time (CIT) in the experiment (i.e., the time from the auditory tone being triggered to the pedestrian starting to cross the road, minus one second). Table 5 shows a weak yet significant correlation (r(821) = .213, p < .001) between the ABGT model's DT and CIT, which is also confirmed by Figure 6. The figure shows that most of the initiation times are concentrated in the 1−2 second range and that the model had a hard time predicting DT within this range. As CIT increases, more instances of successful estimation are observable. Three points were chosen to show how the model predicted the interaction outcome over time, as illustrated in the respective insets. Figure 6 shows that the model performed well at points C: [1.5, 290] and B: [5, 490]. Point C belongs to PP 29 with the following experimental conditions: non-zebra crossing, TTA of 6 s, female driver and female pedestrian, and a waiting time of 80 s. Point B belongs to PP 8 at the zebra crossing, with a TTA of 5 s, a male driver and female pedestrian, and a waiting time of 78 s. Hence, the most obvious difference between these two points is probably the crossing type, which led to different CITs and DTs. Finally, at point A: [1.2, 895], the pedestrian's probabilities of passing and waiting swapped places after a few time steps, which made the model predict the interaction outcome incorrectly and over a longer time. This point refers to PP 1 at the non-zebra crossing, with a TTA of 4 s, a female driver and male pedestrian, and a waiting time of 64 s. Although both the TTA and the crossing type made the model predict lower values, and subsequently lower probabilities of the pedestrian passing first over time, the pedestrian did pass first. This suggests that other influencing factors, such as gender and personality traits, which were not considered when calculating the probabilities, might be at play.
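For readers who want to reproduce this kind of check, the following is a minimal sketch of a Pearson correlation test between model decision times and observed initiation times. The function names `pearson_r` and `t_statistic` are illustrative and not part of the study's code, and the sketch assumes that the reported r(821) follows the usual convention of n − 2 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Return (r, df) for two equal-length sequences, with df = n - 2."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy), n - 2


def t_statistic(r, df):
    """t value used to test H0: rho = 0 for a Pearson r with df degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))
```

Under that convention, `t_statistic(0.213, 821)` evaluates to roughly 6.2, which is why a weak correlation can still be highly significant with this many observations.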

E. ABGT MODEL PARAMETERIZATION RESULTS
Figure 7 shows the pairwise distributions of the parameters of the best-fitted model, i.e., the ABGT model, as a function of the average crossing probability in each PP. The figure shows a positive correlation between the values of a and the average crossing probability, suggesting that higher values of this parameter resulted in higher average probabilities. A moderate negative correlation can also be seen between a and c, suggesting that fixing one parameter (e.g., a) and leaving the other to vary freely could improve the model fit. However, the exact value of the fixed parameter is debatable, as there is no theoretical reason for choosing a specific value for either of these parameters. One can try different values and see which yields the best results. Finally, several instances of parameters hitting their bounds can be seen, but as explained above, broadening the bounds did not result in a better model fit.

IV. DISCUSSION
In this study, we compared a number of computational models of road user interaction, namely a logit model and four game-theoretic models, using a controlled study. The findings showed that our proposed model, which was based on behavioral game theory, outperformed all others for almost all PPs' data, for both crossing types. The second-best performing model was the logit model, confirming the findings of many studies that relied on this type of modeling to predict vehicle-pedestrian interaction outcomes [26], [33], [73]. Moreover, a large improvement was observed by switching from mixed-strategy Nash equilibrium to dual accumulation to solve the same payoff matrices. This was especially noticeable for our ACGT versus ABGT models at the non-zebra crossing, where they constituted the worst and the best models, respectively. This helps us answer our main research question: In line with behavioral economics, people do not play the Nash equilibrium in their daily life [51], which may also be true of road users. As stated by [74], people are usually not aware that they are playing a game. They have some beliefs about their surroundings, other potential players and their available strategies, and the possible outcomes of each chosen strategy. Hence, they use heuristics and rules of thumb to take action. Road users' divergence from the Nash equilibrium has been reported in cyclist-pedestrian interactions [65] and is observed here for vehicle-pedestrian interactions. [64] suggested that this might be due to the possible inability of the Nash equilibrium to consider suboptimal road user behavior. While one could argue that the performance difference between the ACGT and ABGT models can be attenuated by formulating a different payoff matrix, we believe it is unlikely that a model based on a different payoff formulation and solved by mixed-strategy algorithms would work better than the ABGT model. This is due to the inherent limitation of the mixed-strategy algorithm, in which a player's mixing probabilities are determined only by the opposing player's utilities. Overall, the results of this study can be beneficial for the testing and development of AVs when there is a need for studying a large number of vehicle-pedestrian interactions in a safe and controlled manner with subsequent computer simulations and mathematical modeling.
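The limitation of the mixed-strategy solution noted above can be illustrated with a minimal sketch for a generic 2x2 game. The function name `mixed_nash_2x2` and the payoff values are hypothetical and are not the payoff matrices used in this study. Following the convention of Table 2, the vehicle is the row player and the pedestrian is the column player, and strategy 0 is assumed to mean "pass" and strategy 1 "wait".

```python
def mixed_nash_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game.

    A[i][j] is the row player's payoff and B[i][j] the column player's payoff
    when row plays i and column plays j. Returns (p, q), where p is the
    probability that row plays strategy 0 and q the probability that column
    plays strategy 0, or None if no interior mixed equilibrium exists.
    """
    denom_p = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    # Row's probability p makes the COLUMN player indifferent: it uses only B.
    p = (B[1][1] - B[1][0]) / denom_p
    # Column's probability q makes the ROW player indifferent: it uses only A.
    q = (A[1][1] - A[0][1]) / denom_q
    if not (0 < p < 1 and 0 < q < 1):
        return None
    return p, q


# Hypothetical payoffs for illustration only (strategy 0 = pass, 1 = wait).
VEHICLE = [[-10, 2], [0, 1]]
PEDESTRIAN = [[-10, 0], [2, 1]]
# Same game, but passing first is worth more to the vehicle.
VEHICLE_EAGER = [[-10, 5], [0, 1]]
```

Comparing `mixed_nash_2x2(VEHICLE, PEDESTRIAN)` with `mixed_nash_2x2(VEHICLE_EAGER, PEDESTRIAN)` shows that raising the vehicle's own payoff for passing first leaves the vehicle's passing probability `p` unchanged and only shifts the pedestrian's probability `q`. That is the sense in which each player's mixing probability depends only on the opponent's utilities.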
Unlike the other models used in the study, the behavioral game-theoretic models provide a concept of time and suggest that the initial conditions (i.e., kinematics and crossing type) are processed over time. The time it takes for the model to process those initial conditions correlates with the actual time it takes for the PPs to reach a point where one of the agents can go ahead and pass first (Figure 6). That said, the agents may be adjusting their behavior at multiple points in time during the interaction. Hence, the key simplification in the model is that the interaction is modeled as a single decision, which makes it a simple model capable of predicting interaction outcomes and could be quite useful in some types of applications. The DA model also relaxes this single-decision simplification somewhat, as there can be many steps of deliberation in the DA process, even though those steps are not connected to how the external world is developing over time.
Moreover, checking the participant features and traffic conditions of the three selected points in Figure 6 revealed a difference in traffic conditions between the points where the model performed well (i.e., B and C in Figure 6): longer CIT and DT were observed at a zebra crossing (B) than at a non-zebra crossing (C). We have previously shown that pedestrians had considerably longer CITs at zebra crossings in the experiment [67]. Investigating the third point, A, where the model performed poorly, suggested that other factors such as personality traits could play a role in the pedestrian's CIT (see [67]). Therefore, the observed discrepancy between the model's DT prediction and the pedestrians' CIT could be due to the lack of consideration of such variables. A more complete account of this negotiation process could include these variables (e.g., personality traits) in future studies.
We used a novel approach to model fitting, employing a distributed simulation dataset to test and validate the models. The controlled nature of the study allowed us to understand and pinpoint the behavior of each PP individually, as well as to evaluate each model's performance with respect to the individual data, something that is not possible in naturalistic studies. It also helped us formulate the alternative payoff matrix with the confidence that there are no unknown correlations between the studied variables. Previously, we showed that distributed simulation can generate pedestrians' gap acceptance behaviors, using a desktop driving simulator connected to the HIKER lab [75]. This paper tried to take a step forward in this direction by replicating game-theoretic interactions using two high-fidelity simulators to maximize the validity of the experiment. That said, naturalistic data still provide some advantages over simulator data that should not be overlooked. Studying road user behavior over a longer period to understand, for example, driving patterns [76], giving a truthful representation of road users' 2D movement on the road for vehicle-vehicle [28] and vehicle-pedestrian interactions [77], and the capability of tracking a large number of road user parameters, especially those related to driving performance [78], are some of the aspects that still make naturalistic data a necessary tool for successful traffic microsimulation.
Another strength of this modeling approach is that the inputs of the proposed model (the agents' kinematics) are usually easy to record and extract. Unlike many models, it does not demand metrics such as the vehicle's deceleration, dimensions, etc., which are usually more difficult to obtain when using naturalistic video data. Moreover, while the modeling framework of this study is computationally less expensive and less intensive than most machine-learned models, we do not consider it a substitute for these black-box models. Rather, we think that combining the two would generate an even more powerful computational framework that balances interpretability and generalizability.
Several improvements can be made to this study. First, regarding the empirical study, we did not account for the interaction approach phase, although research has shown that the interaction commences as soon as road users see each other, even before pedestrians reach the curb [79]. Second, making the utility functions time-varying would yield a more complete picture of the whole interaction, from the approach phase to the time that both agents have passed the crossing. Third, from both the experimental and the modeling perspective, there is a need to further develop a methodology for situations where multiple pedestrians are interacting with multiple vehicles. This could be done by using head-mounted displays, with several pedestrians wearing these devices connected over a network. Fourth, our ABGT model currently uses some of the features of behavioral game theory, while there are more aspects of this theory that distinguish it from its conventional counterpart concerning collective behavior and that have not been investigated in a road traffic setting. These include theories of strategic complementarity [52], theories of team reasoning [80], and theories of social projection [81], [82]. To this end, extending the framework to a multi-agent problem is one of the most important future research directions. Finally, due to the nature of the experimental work, the behavior of a limited number of people was studied, and the performance of the models, including the proposed ABGT model, was judged accordingly. Future studies should test and validate the framework using large naturalistic datasets to both confirm and improve its performance and generalizability.

V. CONCLUSION
In this study, we compared several computational models of road user interaction using data from an experimental setting. The results showed that drivers and pedestrians do not play the Nash equilibrium when interacting at unsignalized crossings, and that more complex behavioral modeling paradigms like behavioral game theory are needed to fully capture pedestrians' crossing decisions at these locations. The ABGT model was successful in replicating these interactions by taking into account how agents negotiate their available strategies and their gains and losses over time, which sometimes results in a suboptimal decision, as opposed to the assumed rationality of players in conventional game theory. For instance, this model reduced the RMSE by 70% and 67% compared to the OCGT model, for marked and unmarked crossings, respectively. These findings are especially pivotal for the virtual testing and development of AVs, which need to take over human driver tasks to a great extent while taking the unpredictability of VRUs into account. That said, achieving this goal still requires significant progress, as outlined in the future research directions in the discussion.
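As a reminder of how such a relative improvement is computed, the following is a minimal sketch of the RMSE and of the percentage reduction relative to a baseline. The function names and any input values are illustrative, and what is compared (for example, predicted versus observed crossing probabilities) is an assumption here rather than a restatement of the study's pipeline.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse
```

With these definitions, a 70% reduction means that the model's RMSE is 0.3 times the baseline's RMSE.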
-b. In this experiment, 64 participants, forming 32 participant pairs (PPs; 32 drivers, age: M = 31.53, R = 21−50, SD = 1.72; paired with 32 pedestrians, age: M = 25.09, R = 19−34, SD = 0.87), interacted with each other under different traffic scenarios. The study was approved by the University of Leeds Ethics Committee (Reference No AREA 21-022). The scenarios were defined based on two crossing types (i.e., zebra and non-zebra crossings; see Figure 1-c) and five vehicle time-to-arrival conditions (TTAs, i.e., the temporal distance of the vehicle to the center of the crossing, 3−7 s), resulting in 10 conditions that were repeated two times in each experimental block. There were two blocks, resulting in 40 randomized trials per PP.
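The trial count follows from 2 crossing types × 5 TTAs × 2 repetitions × 2 blocks = 40. A minimal sketch of how such a schedule could be generated is given below. The function name `build_trial_schedule` is hypothetical, and two details are assumptions rather than facts from the study: that the TTAs are spaced in one-second steps from 3 to 7 s, and that trials are shuffled within each block.

```python
import itertools
import random

CROSSING_TYPES = ("zebra", "non-zebra")
TTAS_S = (3, 4, 5, 6, 7)        # assumed one-second steps across the 3-7 s range
REPETITIONS_PER_BLOCK = 2
N_BLOCKS = 2


def build_trial_schedule(seed=None):
    """Return a list of (block, crossing_type, tta_s) tuples for one PP."""
    rng = random.Random(seed)
    # 2 crossing types x 5 TTAs = 10 conditions.
    conditions = list(itertools.product(CROSSING_TYPES, TTAS_S))
    schedule = []
    for block in range(1, N_BLOCKS + 1):
        # Each condition appears twice per block, i.e., 20 trials per block.
        block_trials = conditions * REPETITIONS_PER_BLOCK
        rng.shuffle(block_trials)  # assumption: randomization within block
        schedule.extend((block, crossing, tta) for crossing, tta in block_trials)
    return schedule
```

Calling `build_trial_schedule(seed=1)` yields 40 trials in which every crossing type and TTA combination appears four times, twice in each block.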

FIGURE 1. (a) The driver's view of the pedestrian: the driver is stopping, and the pedestrian is shown by pink spheres, (b) the pedestrian's view of the vehicle in the pedestrian lab: the pedestrian is crossing the zebra and the subject vehicle is to their right, and (c) top view of the zebra (left) and non-zebra crossing (right) in Unity, including the designated standpoints (blue markers).

FIGURE 2. An illustration of how the DA model works, providing the estimated value and probability of pass/wait of both agents when the pedestrian passed first (a) and when they wait for the driver to pass first (b). The vertical dashed lines show the time at which the model converged according to the defined threshold.

FIGURE 3. Pedestrian's probability of crossing first over time gap at zebra crossings for all models.

FIGURE 4. Pedestrian's probability of crossing first over time gap at non-zebra crossings for all models.

FIGURE 5. Average probability of pedestrian crossing first over time gap for all models.

FIGURE 6. The relationship between the ABGT model's DT and CIT in the experiment.

FIGURE 7. Pairwise distributions of parameters for ABGT model as a function of average crossing probability in each PP.

TABLE 1. Parameters of the study.

TABLE 2. Wu et al. [46] payoff matrix. (The vehicle is the row player and the pedestrian is the column player.)

TABLE 6. Parameter ranges for all game-theoretic models.