Estimation of Drivers’ Gaze Behavior by Potential Attention When Using Human–Machine Interface

Recently, various visual information presentation systems known as human–machine interfaces (HMIs), such as road projection lamp systems, have been developed for safe driving. However, it is unclear how these HMIs change drivers’ gaze behavior and improve their cognitive awareness of the environment. Therefore, in this study, we introduce the concept of potential attention and propose a probabilistic method to estimate drivers’ gaze behavior when using HMIs. The potential attention hypothesis offers an explanation of gaze behavior. The method assigns potential attention to objects the driver is likely to gaze at, such as vehicles and pedestrians, and estimates the driver’s gaze point from these potential attentions. The study is divided into two steps. In the first step, we analyze drivers’ gaze behavior in a simulator experiment in which a road projection lamp is displayed to alert the driver to pedestrians. In the second step, we propose a method for estimating the driver’s gaze through potential attention, based on the results of the simulator experiment. Modeling the gaze behavior measured in the first step shows that the proposed method estimates gaze behavior with high accuracy. The proposed method is expected to be applicable to determining where an HMI display should be placed.


I. INTRODUCTION
Recently, various human-machine interface (HMI) systems have been developed for driving support. Some systems use augmented reality (AR), head-up displays (HUDs), and road projection lamps to display various visual information directly in the driving environment. These systems keep the driver's gaze on the road by displaying information, such as the speedometer, on the windshield or road surface, whereas the conventional head-down display (HDD) presents information on an in-vehicle monitor or instrument panel display. In addition, these systems can directly highlight objects such as signs and pedestrians, thereby making them easier for drivers to detect.
The associate editor coordinating the review of this manuscript and approving it for publication was Emanuele Crisostomi .
These systems are thus expected to induce safer driving behavior and improve drivers' ability to recognize the environment.
Various methods for displaying information using AR and HUDs have been proposed [1]. Previous studies have compared the effects of HUD and HDD systems on driver behavior. In terms of gaze behavior, the results show that a HUD keeps the driver's gaze on the road in a simulator environment [2], [3] and helps drivers recognize information more quickly in a real vehicle environment [4]. Because the driver's gaze stays on the road, the driver can attend to the surrounding environment while driving. Regarding driving performance beyond gaze behavior, using a HUD was found to allow more stable steering and faster reaction times [3], [5], [6].
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To induce safe driving behaviors correctly through HMI displays, we need to understand how the HMI display changes drivers' attention and gaze behavior. For example, HMIs can visually highlight objects that are difficult for drivers to notice, such as pedestrians. This visual support allows drivers to detect potential dangers early. Previous studies have validated the effect of HMI risk warnings: drivers brake earlier and are more aware of dangers [7], [8]. It has also been shown that highlighting objects in the driving environment, such as pedestrians and signs, makes it easier for drivers to recognize them [9]. Even in complex environments, AR and HUDs can control the allocation of drivers' attention by appropriately displaying the points to which they should attend [10].
However, previous studies have not clarified how HMIs improve drivers' awareness of the environment. In general, it is known that the driver's gaze behavior is significantly influenced by their attention to the environment when using HMIs [11], [12]. Therefore, to elucidate drivers' attention when using HMIs, it is important to understand the mechanism that induces gaze behavior. Understanding drivers' gaze behavior when using HMIs will make it possible to examine the optimal HMI display method for drivers to recognize the driving environment efficiently.
Many studies have been conducted on estimating drivers' gaze points using machine learning methods. Most of these studies estimate the gaze point by detecting physical features of the driver, such as facial orientation, from in-vehicle cameras [13]-[15]. The advantage of this approach is that the gaze point can be calculated geometrically once the physical features are measured accurately. This approach can also estimate where the driver gazes both inside and outside the car. However, it is difficult to estimate physical features accurately because of the influence of sunlight, glasses, and bumpy roads [16], [17]. End-to-end learning is another estimation approach; it uses images from an exterior camera and gaze data from an eye tracker as training data, and does not rely on estimated physical features of the driver to detect gaze points [18]-[20]. In these studies, highly accurate gaze estimation was achieved by using semantic segmentation as an input, which provides classification information about the road environment [18], [19]. Although these methods cannot estimate gaze points inside the vehicle, they require only a few sensors, such as an exterior camera.
To the best of our knowledge, no studies have focused on gaze estimation when an HMI is in use. Moreover, due to the characteristics of machine learning, generalizing drivers' gaze behavior when using HMIs is difficult. In addition, another machine learning model is needed to forecast where drivers might gaze. Thus, when gaze behavior is studied through machine learning methods, the lack of a generalized understanding of driver gaze behavior makes it difficult to determine the optimal display method for HMIs. Therefore, we propose a probabilistic mathematical method to estimate the gaze behavior of drivers when using HMIs. Because the proposed method is probabilistic and can be applied to various environments, it is expected to support an understanding of generalized driving behavior, which is difficult to obtain with machine learning.
In this study, we modeled drivers' gaze behavior when a road projection lamp was displayed. A road projection lamp is a system that presents information on the road surface, as shown in Fig. 1 [21]. To construct the proposed method for estimating gaze behavior, we needed to collect and analyze data on drivers' gaze behavior when using HMIs. Therefore, this study is divided into two steps. We first conducted a simulator experiment to measure drivers' gaze points. The purpose of this experiment was to clarify the impact of road projection lamps on drivers' gaze behavior. Next, we proposed a probabilistic mathematical method based on the experimental results to reproduce gaze behavior when using an HMI. To verify the usefulness of the proposed method, we modeled the data obtained from the simulator experiment and validated the estimation accuracy. If the gaze point can be correctly estimated when using the HMI, it will lead to an understanding of drivers' gaze behavior, which will greatly contribute to the development of HMIs.
This study makes two contributions to understanding drivers' gaze behavior when using HMIs. First, we conducted a simulator experiment to analyze gaze behavior when a road projection lamp is displayed. The experimental setup for understanding gaze behavior can be applied not only to road projection lamps but also to various HMIs for more efficient development. Second, we propose a mathematical method for estimating the drivers' gaze behavior when using an HMI. This proposed method can be applied to any type of HMI or in the absence of an HMI. In addition, it can contribute to the understanding of drivers' gaze behavior, as the method is based on a hypothesis of drivers' attention.
The remainder of this paper is organized as follows. Section II describes the simulator experiment, which uses a road projection lamp to collect the gaze behavior to be modeled. Section III presents the gaze behavior results obtained in the simulator experiment. Section IV describes the characteristics of the driver's gaze behavior when using the HMI that are relevant to modeling. Section V proposes a gaze behavior modeling method, and Section VI describes the results of the proposed method. Sections VII and VIII provide the discussion and conclusions of this study, respectively.

II. EXPERIMENT
In a simulator experiment, we measured the drivers' gaze points when using road projection lamps. The aim of this experiment is to examine the influence of the road projection lamp on the driver's gaze behavior. The results in this experiment are applied to design the theory of the proposed method for estimating gaze behavior, as described in Sections IV and V, and to model the gaze behavior, as described in Section VI. The participants drove the course assuming a night-time simulation test on a driving simulator. The night-time situation means that objects appearing in the simulation that the participants can recognize are limited. On the road, some pedestrians appear at regular intervals and are alerted by road projection lamps. We measured and analyzed the drivers' gaze behavior while the road projection lamp was displayed. This study was approved by the Research Ethics Committee of Ritsumeikan University (reference number: BKC-HitoI-2019-069). The study complied with all the guidelines of the Declaration of Helsinki.

A. PARTICIPANTS
Ten university students participated in this study (10 males, 20-25 years; mean = 21.8 years). All participants had held a driving license for more than one year (mean = 2.8 years) and had no vision problems. The participants signed informed consent forms, which allowed for the use of the collected data for scientific purposes and publication. The participants received 1,000 JPY per hour for their participation.

B. APPARATUS
The simulator environments were generated using Vizard 5.0 (WorldViz). The distance between the participants and the screen was 1.6 m, and the screen size was 1.36 m high × 2.43 m wide (field of view: 46.1° × 74.5°). The field of view in the simulator was adjusted to suit this environment. We used a Logitech G29 wheel (Logitech) for vehicle control, and an eye movement measurement device (Tobii Pro Glasses 2, Tobii Technology) to measure the gaze point. The driving simulator used in this study is shown in Fig. 2.
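As a quick consistency check, the reported field of view can be recomputed from the screen size and viewing distance. The sketch below assumes a flat screen centered on the participant's line of sight, which the text does not state explicitly; the function name is illustrative. Under that assumption the result agrees with the reported values to within about 0.1°.

```python
import math

def field_of_view_deg(extent_m: float, distance_m: float) -> float:
    """Full visual angle (degrees) subtended by a flat extent that is
    centered on the line of sight (assumption, not stated in the paper)."""
    return math.degrees(2.0 * math.atan((extent_m / 2.0) / distance_m))

VIEW_DISTANCE_M = 1.6   # participant-to-screen distance from the text
SCREEN_HEIGHT_M = 1.36  # screen height from the text
SCREEN_WIDTH_M = 2.43   # screen width from the text

vertical_fov = field_of_view_deg(SCREEN_HEIGHT_M, VIEW_DISTANCE_M)
horizontal_fov = field_of_view_deg(SCREEN_WIDTH_M, VIEW_DISTANCE_M)
print(f"vertical: {vertical_fov:.2f} deg, horizontal: {horizontal_fov:.2f} deg")
```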

C. DRIVING ENVIRONMENTS
The driving courses are extracted from real road maps with low curvature, and the course width is 3 m. We created 11 courses, each with a five-minute completion time. The courses are assigned as follows: one course for practice, two courses for the control condition (without road projection lamp), and eight courses for the test conditions, one for each type of road projection lamp. The lamp types are described in Section II-D. The course is delineated only by white road edges, which makes it difficult for drivers to perceive the speed at which they are moving. Random visual markers in the passing environment are known to generate optic flow [22]; therefore, we displayed random dots on the course with an average density of six dots/m².
To verify the drivers' attention allocation, pedestrians and pigeons are projected randomly on the left and right sides of the road edges. Pedestrians appear four times during each course, at approximately 60 s intervals (e.g., Fig. 3). On one of these four occasions, pigeons appear on the opposite side of the road (e.g., Fig. 3(b)). We measured the driver's reaction time to the pedestrians and pigeons to verify their attention to the surrounding environment. In addition, to prevent the driver's gaze point from becoming fixed, other objects such as a ball and an ornamental plant appear. These objects appear approximately 15 times at approximately 20 s intervals. As the test is set in a night-time environment in which vision is limited, all objects appear only when they are within 10 m of the vehicle. In addition, we assumed that the vehicle would be able to recognize pedestrians through various sensors before they became visible to the human eye. Therefore, the road projection lamp displayed an alert to indicate the presence of pedestrians before the drivers could observe them. The alert from the road projection lamp is displayed when the distance between the vehicle and the obscured pedestrian falls to 20 m.

D. CONDITIONS
In this experiment, we used eight types of road projection lamps with different icon types (arrow, exclamation mark), blinking frequencies (0 Hz, 10 Hz), and display duration (1.0 s, 3.6 s). The experiments were conducted under nine conditions, which included eight road lamp display conditions (two icons × two frequencies × two durations) and one control condition without road lamp display. A detailed explanation of each condition follows below.
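The factorial structure of the conditions can be enumerated directly. The sketch below is only an illustration of the 2 × 2 × 2 design plus the control condition; the dictionary keys and variable names are hypothetical and do not come from the experiment software.

```python
from itertools import product

# Factor levels as listed in the text.
ICONS = ("arrow", "exclamation")
FREQUENCIES_HZ = (0, 10)
DURATIONS_S = (1.0, 3.6)

# Eight road projection lamp conditions: two icons x two frequencies x two durations.
lamp_conditions = [
    {"icon": icon, "frequency_hz": freq, "duration_s": dur}
    for icon, freq, dur in product(ICONS, FREQUENCIES_HZ, DURATIONS_S)
]

# The control condition has no lamp, so all factors are left undefined.
control_condition = {"icon": None, "frequency_hz": None, "duration_s": None}
all_conditions = [control_condition] + lamp_conditions

for condition in all_conditions:
    print(condition)
```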

1) ICON TYPE (ARROW, EXCLAMATION)
To investigate the effect of the road projection lamp, we prepared the arrow and exclamation road projection lamps. The conditions for displaying each icon were defined as the arrow condition and the exclamation condition. The condition in which the road projection lamp is not displayed is defined as the control condition. In the arrow and exclamation conditions, the road projection lamp is displayed in the center of the road to alert the driver to the presence of a pedestrian (for example, Fig. 3). In the arrow condition, arrows are displayed pointing in the direction of the pedestrian so that the driver is alerted of where the pedestrian appears (for example, Fig. 3(a)).

2) FREQUENCY
A previous study showed that flashing headlights increase the recognition rate of pedestrians at night [23]. To verify the blinking effects of road projection lamps on drivers, we conducted tests under two conditions: one with a blinking road projection lamp and one without blinking. In the blinking condition, the blinking frequency was set to 10 Hz.

3) DISPLAY DURATION
We considered that the display duration might influence the drivers' gaze behavior. To verify this effect, we tested two display durations. In the 3.6 s condition, the road projection lamp is displayed from its appearance until the driver passes the pedestrian, which takes approximately 3.6 s. In the 1.0 s condition, the road projection lamp is displayed only for 1 s after its appearance.

E. PROCEDURE
The participants were instructed to drive safely and only had to steer the vehicle. The vehicle speed was constant at 20 km/h. Before the experiment, the participants drove on a practice course to become familiar with the simulator environment. After the participants were sufficiently familiarized with the driving operation on the simulator, the experiment was initiated by randomizing the course order for each participant. A five-minute rest period was provided after each of the five courses to reduce the burden on the participants.
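Because the speed is constant, the distance thresholds of the driving environment translate directly into times. The following sketch uses the 20 km/h speed, the 20 m alert distance, and the 10 m visibility distance given in the text, and assumes that distances are measured along the direction of travel and that the pedestrian is stationary (both are assumptions). The resulting values correspond to the approximately 3.6 s display duration and to a pedestrian appearing about 1.8 s after the lamp.

```python
SPEED_KMH = 20.0              # constant vehicle speed
ALERT_DISTANCE_M = 20.0       # lamp appears at this distance to the pedestrian
VISIBILITY_DISTANCE_M = 10.0  # objects become visible within this distance

speed_ms = SPEED_KMH * 1000.0 / 3600.0  # convert km/h to m/s

# Time from lamp onset until the vehicle passes the pedestrian.
time_to_pass_s = ALERT_DISTANCE_M / speed_ms

# Time from lamp onset until the pedestrian becomes visible.
time_to_pedestrian_s = (ALERT_DISTANCE_M - VISIBILITY_DISTANCE_M) / speed_ms

print(f"lamp onset to passing the pedestrian: {time_to_pass_s:.2f} s")
print(f"lamp onset to pedestrian appearance: {time_to_pedestrian_s:.2f} s")
```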

F. ANALYSIS OF GAZE BEHAVIOR
Eight AR markers displayed on the simulation screen are used to calculate the gaze point in screen coordinates, as shown in Fig. 2. To evaluate the influence of road projection on drivers' gaze behavior, we calculated the visual angle between the road projection lamp and fixation (VARF). We also calculated the angle that indicates the position of the gaze point around the road projection lamp or pedestrian, defined as the polar angle (PA). The reference point of the PA differs between the analysis of road projection lamps and the analysis of pedestrians. Since pedestrians appear randomly on the left and right sides of the road, the PA was calculated as if the pedestrian had appeared on the right side and was mirrored when the pedestrian appeared on the left side. Using VARF and PA, the drivers' gaze points were plotted in polar coordinates to visualize the gaze position centered on the road projection. The visual angle between the pedestrian and fixation (VAPF) was calculated in the same way. The variables are summarized in Fig. 4. The VARF, VAPF, and PA were calculated in two sections, namely the Symbol and Avatar sections. We define the Symbol section as the duration from the appearance of the road projection lamp to the appearance of the pedestrian, and the Avatar section as the duration from the appearance of the pedestrian to the disappearance of the pedestrian. We compared the drivers' gazes in these two sections and examined changes in gaze behavior.
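The text does not give explicit formulas for these quantities, so the following is only one plausible reading. It assumes that screen coordinates are expressed in meters relative to the screen center, that the eye lies on the screen normal at the 1.6 m viewing distance, and that mirroring for left-side pedestrians means flipping the horizontal offset. All function names are hypothetical.

```python
import math

VIEW_DISTANCE_M = 1.6  # participant-to-screen distance from the text

def visual_angle_deg(point_a, point_b, distance_m=VIEW_DISTANCE_M):
    """Angle (degrees) between the lines of sight to two screen points.

    Points are (x, y) in meters relative to the screen center (assumption).
    """
    va = (point_a[0], point_a[1], distance_m)
    vb = (point_b[0], point_b[1], distance_m)
    dot = sum(a * b for a, b in zip(va, vb))
    norm_a = math.sqrt(sum(a * a for a in va))
    norm_b = math.sqrt(sum(b * b for b in vb))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))

def polar_angle_deg(gaze, reference, pedestrian_on_left=False):
    """Direction (degrees, 0 to 360) of the gaze point around a reference point.

    The horizontal offset is mirrored for left-side pedestrians so that all
    trials are expressed as if the pedestrian appeared on the right.
    """
    dx = gaze[0] - reference[0]
    dy = gaze[1] - reference[1]
    if pedestrian_on_left:
        dx = -dx
    return math.degrees(math.atan2(dy, dx)) % 360.0

# Example: gaze 0.2 m to the right of and 0.1 m above a lamp at the screen center.
lamp = (0.0, 0.0)
gaze = (0.2, 0.1)
print(visual_angle_deg(lamp, gaze), polar_angle_deg(gaze, lamp))
```

Here `visual_angle_deg` would play the role of VARF when the reference is the lamp and of VAPF when it is the pedestrian.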

III. RESULTS OF EXPERIMENT
The experiment results revealed no significant differences in gaze behavior in relation to blinking frequency and display duration. Because the main aim of this study is to propose an estimation method for driver gaze, we report only the effects of the different icons on gaze behavior.
Fig. 5 shows that the gaze dispersion around the road projection in the Symbol section differs between conditions. In the control condition, the gaze points were concentrated at the front of the road projection lamp. In contrast, the gaze points were distributed around the road projection lamp in the arrow and exclamation conditions. In the arrow condition in particular, the gaze points were distributed in the direction in which the pedestrian appears. Fig. 6 shows no significant difference between the conditions. The gaze points in the Avatar section were distributed in front of the road projection lamp under all conditions.
Fig. 7 shows the result of VARF in the Symbol section. A Jarque-Bera test and Bartlett's test did not indicate violations of normality (p = .500) or equal variances (p = .364), respectively. Therefore, the result of VARF was analyzed using a one-way repeated measures analysis of variance (ANOVA), which revealed significant differences between conditions (F(2, 27) = 12.0, p < .001, η² = .35). The Bonferroni post-hoc test showed a significant difference between the control and exclamation conditions (t(9) = 3.67, p = .005, r = .77) and between the arrow and exclamation conditions (t(9) = 2.69, p = .025, r = .67). No significant difference was revealed between the control and arrow conditions (t(9) = 1.22, p = .255, r = .38).
Fig. 8 shows the result of VAPF in the Symbol section. A Jarque-Bera test and Bartlett's test did not indicate violations of normality (p = .061) or of the equality of error variances (p = .151), respectively. Thus, the results of VAPF were also analyzed using ANOVA, which revealed significant differences between conditions (F(2, 27) = 12.0, p < .001, η² = .47). The Bonferroni post-hoc test showed a significant difference between the control and arrow conditions (t(9) = 4.49, p = .002, r = .83) and between the control and exclamation conditions (t(9) = 3.25, p = .010, r = .73). A trend toward significance was also observed between the arrow and exclamation conditions (t(9) = 2.14, p = .061, r = .58).
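The effect sizes r reported with the post-hoc tests are consistent with the common conversion r = sqrt(t² / (t² + df)). The text does not state which formula was used, so the sketch below is an observation about the reported numbers rather than a description of the authors' analysis.

```python
import math

def effect_size_r(t: float, df: int) -> float:
    """Effect size r from a t statistic via r = sqrt(t^2 / (t^2 + df))."""
    return math.sqrt(t * t / (t * t + df))

# (t, df, reported r) triples taken from the post-hoc results in the text.
reported = [
    (3.67, 9, 0.77),  # VARF: control vs. exclamation
    (2.69, 9, 0.67),  # VARF: arrow vs. exclamation
    (1.22, 9, 0.38),  # VARF: control vs. arrow
    (4.49, 9, 0.83),  # VAPF: control vs. arrow
    (3.25, 9, 0.73),  # VAPF: control vs. exclamation
    (2.14, 9, 0.58),  # VAPF: arrow vs. exclamation
]

for t, df, r_reported in reported:
    print(f"t({df}) = {t:.2f}: computed r = {effect_size_r(t, df):.2f}, reported r = {r_reported:.2f}")
```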

B. RESULTS OF VISUAL ANGLES
A Jarque-Bera test indicated that the VARF and VAPF data in the Avatar section deviate from a normal distribution (p = .009 and p = .003, respectively), whereas Bartlett's test did not indicate a violation of the equality of error variances (p = .578 and p = .534, respectively). Because normality could not be assumed, the results of VARF and VAPF in the Avatar section were analyzed using Friedman's test, which showed no significant difference between conditions in VARF or VAPF (χ²(2) = 5.4, p = .067, W = 0.11; χ²(2) = 0, p = 1.000, W = 0.00, respectively).

IV. GAZE BEHAVIOR WHEN USING HMI
The experiment results showed that the road projection lamp changed the driver's gaze behavior before the pedestrian appeared (in the Symbol section). Moreover, the changes in gaze behavior differed depending on the icon type. In the arrow condition, the gaze was guided in the direction in which the pedestrian appeared, whereas in the exclamation condition, the gaze was concentrated around the road projection lamp. In contrast, there was no difference between conditions after the pedestrian appeared. Fig. 9 provides an example of the time series data of VARF in the exclamation condition. The road projection lamp is displayed at 0.0 s, and a pedestrian appears at 1.8 s. Fig. 9 shows that the VARF gradually decreases and remains small until the 1.8 s mark. After the 1.8 s mark, however, VARF becomes larger because the driver's gaze shifts to the pedestrian. The same gaze behavior was also observed for pedestrians and other objects that appeared on the road. Therefore, the road projection lamp is considered an object that assists with driving safety and that drivers should gaze at while driving. In summary, the gaze behavior for these objects can be described by three processes (e.g., Fig. 10): a condensing process, in which the driver gazes at the object immediately after its appearance; a gaze process, in which the driver continues gazing at it for a certain period of time; and a diffusion process, in which the gaze returns to its previous position.
These results indicate that the road projection lamp can temporarily guide the driver's gaze. However, this effect differed depending on the individual and the type of road projection lamp (and other HMIs). For future applications of road projection lamps (and other HMIs), the display method needs to address these individual differences and situations.

V. MODELING OF GAZE POINT
The purpose of this section is to express the potential attention as a probabilistic mathematical method. If gaze behavior can be estimated, it will be possible to design the ideal display position of road projection lamps according to the gaze behavior influenced by HMIs.

A. POTENTIAL ATTENTION
Potential attention is a hypothesis, proposed in the field of machine learning, that explains gaze behavior [19]. In that field, many methods estimate drivers' gaze points by estimating the items of potential attention, that is, the items to which the driver is likely to be paying attention. For example, traffic lights, other vehicles, and pedestrians in the road environment have potential attention because we assume that drivers should pay attention to them for safe driving. This hypothesis assumes that the actual gaze point is determined from the potential attention depending on the driver and their prior gaze behavior. By applying this hypothesis, we can estimate drivers' gaze behavior when using HMIs. The experiment results suggest that the road projection lamp (and other HMIs) should be gazed at in the same way as other objects, such as pedestrians. In other words, when using an HMI, potential attention should be paid to the following three areas: 1) areas for driving, such as future path points [24]; 2) areas for safe driving, such as other vehicles and pedestrians; and 3) areas for obtaining supplementary information through HMIs. If this hypothesis can be expressed by a probabilistic mathematical method, it will be possible to estimate gaze behavior corresponding to various HMIs and individual drivers. In addition, the estimation results can be applied to determine the optimal HMI display method.

B. MATHEMATICAL METHOD FOR ESTIMATING GAZE
The results of the simulation experiment showed that the effect of the road projection lamp varied among individuals. In the estimation of driver gaze behavior using machine learning methods, it has been shown that the estimation accuracy can be improved by incorporating probabilistic methods to deal with individual differences and situation dependence [20]. For this reason, we also estimate the gaze point probabilistically. In this study, we focused on gaze estimation in a two-dimensional screen system, as shown in Fig. 3, but the proposed method can be applied to various coordinate systems such as a three-dimensional space.
As shown in the previous section, this study hypothesizes that potential attention is generated for three factors (for driving, for safe driving, and for information by HMIs). All of these potential attentions are represented by a Gaussian probability density function $p(x)$, as shown in (1).
$$p(x) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right) \tag{1}$$
where, for estimation in a two-dimensional screen coordinate system $(X, Y)$, $x = [X, Y]^{T}$, $\mu$ is the mean value of $x$, the covariance matrix is $\Sigma = \mathrm{diag}(\sigma_X^2, \sigma_Y^2)$, and $\sigma^2$ is the variance value of $x$.
When the number of objects that generate potential attention is $M$, the gaze point distribution $p_{\mathrm{fix}}(x)$ is estimated as in (2), where $\eta$ is a normalization constant. From (2), it follows that the mean of the estimated gaze distribution $p_{\mathrm{fix}}(x)$ is drawn toward the potential attention distribution with the smallest variance. Fig. 11 shows an example of generating a gaze estimation distribution from the one-dimensional probability density functions $N_1$ ($\mu_1 = 5$, $\sigma_1^2 = 3$) and $N_2$ ($\mu_2 = -5$, $\sigma_2^2 = 2$). The mean of $N_3$ calculated by (2) is closer to that of $N_2$ because the variance of $N_2$ is smaller than that of $N_1$. By setting a small variance for objects that are considered particularly important, it is possible to express gaze points that are concentrated around the object.
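The behavior described for Fig. 11 can be illustrated numerically. The sketch below assumes that (2) takes the form of a normalized product of the M Gaussian densities, which is consistent with the role of η and with the mean being drawn toward the smallest-variance distribution; this form is an assumption, and the function name is hypothetical. For one-dimensional Gaussians, the normalized product is again Gaussian, with a precision equal to the sum of the individual precisions and a precision-weighted mean.

```python
def fuse_gaussians(means, variances):
    """Mean and variance of the normalized product of 1-D Gaussian densities.

    Assumes the gaze distribution is proportional to the product of the
    potential attention densities (an assumed reading of the method).
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    fused_variance = 1.0 / total_precision
    return fused_mean, fused_variance

# Values of N1 and N2 from the Fig. 11 example in the text.
fused_mean, fused_variance = fuse_gaussians(means=[5.0, -5.0], variances=[3.0, 2.0])
print(f"fused mean = {fused_mean:.3f}, fused variance = {fused_variance:.3f}")
```

Under this assumption the fused mean is negative, that is, on the side of the smaller-variance distribution, and the fused variance is smaller than either input variance.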

C. DISPERSION FUNCTIONS DESIGN
The previous section shows a method for estimating drivers' gaze behavior by designing the variance of the distribution for each potential attention. In the simulator experiment, the driver gazed at the object immediately after its appearance (condensing process), continued gazing at it for a certain period of time (gaze process), and then returned to the gaze position before its disappearance (diffusion process). For this reason, in this study, each potential attention for road projections and pedestrians was generated using a probability density function whose variance changes over time. The variance function $\sigma^2(t)$ for road projections and pedestrians is designed as in (3), where $t$ is the time elapsed since the appearance of the target object, and $\alpha$, $\beta$, and $\gamma$ are parameters to be optimized for each individual. Fig. 12 shows an example of the variance function. It represents the sequence of the condensing, gaze, and diffusion processes and is expected to represent the gaze behavior when using HMIs.

D. PARAMETER ESTIMATION OF VARIANCE FUNCTIONS AND ITS APPLICATIONS
The simulator experiments in this study generated road projection lamps, pedestrians, pigeons, and other objects while driving. However, for simplicity, we modeled a situation in which only road projection lamps and pedestrians appear. As potential attentions, we need to estimate the distribution of gaze points $p_{\mathrm{fix}}(x)$ using three distributions: the normal gaze distribution $p_n(x)$ (when there are only road edges), $p_r(x)$ for road projection lamps, and $p_p(x)$ for pedestrians. There are two applications of this estimation.

1) STEP1: ESTIMATING PARAMETERS OF THE VARIANCE FUNCTION OF ALL OBJECTS
Using the data measured in the simulator experiment, we can estimate the parameters of the variance functions in (3) for the road projection lamp and the pedestrian. Once the parameter estimation is completed, it is possible to estimate how the driver's gaze point is influenced by the HMI without measurement. The normal gaze distribution $p_n(x)$ can be estimated based on the road shape and vehicle state [25]. However, in this study, its mean value and variance were calculated from the gaze data measured for 10 s before the road projection lamp was displayed. The mean value of $p_{\mathrm{fix}}(x)$ is the gaze data measured during the appearance of road projection lamps and pedestrians, and the variance of $p_{\mathrm{fix}}(x)$ is set to a constant value of 0.01 as observation noise. The mean value of $p_r(x)$ is the center coordinate of the road projection lamp display. Similarly, the mean value of $p_p(x)$ is the center coordinate at which the pedestrian appears. Therefore, we need to find the elements $\sigma_{rX}^2$, $\sigma_{rY}^2$, $\sigma_{pX}^2$, and $\sigma_{pY}^2$ in the covariance matrices $\Sigma_r$ and $\Sigma_p$. To simplify the estimation, only the variance in the X direction was estimated using (3), and the variance in the Y direction was determined by the display ratio of the target object. Thus, we estimate the parameters $\theta = [\alpha_r, \beta_r, \gamma_r, \alpha_p, \beta_p, \gamma_p]^{T}$ in $\sigma_{rX}^2$ and $\sigma_{pX}^2$. The maximum likelihood estimation method is used to estimate the parameters. The estimation accuracy $S$ of the fitted model is evaluated using (4); the value of $S$ is 1.0 when the driver's gaze is estimated correctly without error. In (4), $N$ is the number of data, $x_i$ is the estimated value of the model, $f(x_i; \theta)$ is the likelihood, and $F_{\max} = f(x = \mu; \theta)$ is the maximum likelihood.
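One reading of the accuracy measure that is consistent with these definitions is the mean of the likelihood normalized by its maximum, S = (1/N) Σ f(x_i; θ) / F_max. This form is an assumption. For a Gaussian with diagonal covariance, the ratio f / F_max reduces to exp(-d²/2), where d is the Mahalanobis distance from the mean. The sketch below illustrates this reading; the function names and sample values are hypothetical.

```python
import math

def normalized_likelihood(point, mean, variances):
    """f(x) / F_max for a Gaussian with diagonal covariance.

    The normalization constants cancel, leaving exp(-0.5 * d^2), where d is
    the Mahalanobis distance between the point and the mean.
    """
    d_squared = sum((p - m) ** 2 / v for p, m, v in zip(point, mean, variances))
    return math.exp(-0.5 * d_squared)

def accuracy_score(points, means, variances):
    """Assumed form of S: mean normalized likelihood over all samples."""
    ratios = [normalized_likelihood(p, m, v) for p, m, v in zip(points, means, variances)]
    return sum(ratios) / len(ratios)

# Hypothetical example: two gaze samples and the corresponding estimated distributions.
gaze_points = [(0.10, 0.20), (0.30, 0.20)]
estimated_means = [(0.10, 0.20), (0.20, 0.20)]
estimated_variances = [(0.01, 0.01), (0.01, 0.01)]
print(accuracy_score(gaze_points, estimated_means, estimated_variances))
```

With this form, S equals 1.0 only when every sample coincides with the mean of its estimated distribution, which matches the property stated for S.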
In this study, we modeled the gaze behavior in the exclamation condition (display duration of 3.6 s) and the control condition (no projection lamp), as measured in the simulator experiments. The gaze point in the arrow condition was directed toward the direction in which the pedestrian appeared, whereas in the exclamation condition, the gaze was fixed on the road projection lamp. Therefore, only the exclamation condition was chosen because its results are simpler. Because there was no difference in gaze behavior in relation to frequency, the data obtained in the 0 Hz and 10 Hz conditions were modeled together. In addition, pedestrians appear four times under one condition, but one of these appearances is excluded because the pigeon also appears. Therefore, six trials were modeled for each participant. Five trials were used for training and one trial for validation. Subsequently, we verified the generality of the estimation.

2) STEP2: DESIGN OF HMI DISPLAY POSITION TO INDUCE IDEAL GAZE BEHAVIOR
We estimated the variance function for road projections and pedestrians in STEP 1. From the estimation results, it is possible to design an HMI display position that induces ideal gaze behavior. We design the gaze line $p_{\mathrm{fix}}(x, t)$ in advance. This indicates where the driver should gaze, such as the ideal gaze for novice drivers and complex environments. It is thereby possible to calculate the mean value $\mu(t)$ of $p_r(x, t)$, that is, the display position of the road projection lamp that induces the driver's gaze toward the ideal behavior. However, the investigation of STEP 2 by numerical calculations was not carried out in this study.
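Although STEP 2 was not investigated numerically in this study, the idea can be illustrated in one dimension. The sketch below assumes that the gaze distribution is the normalized product of Gaussian potential attentions, so that its mean is the precision-weighted average of the individual means; solving that relation for the lamp mean gives the display position for a desired gaze mean at a single time instant. The product form, the function names, and the numerical values are all assumptions made for illustration.

```python
def fused_mean(means, variances):
    """Mean of the normalized product of 1-D Gaussians (assumed model)."""
    precisions = [1.0 / v for v in variances]
    return sum(m * p for m, p in zip(means, precisions)) / sum(precisions)

def lamp_position_for_target(target_mean, lamp_variance, other_means, other_variances):
    """Lamp mean that yields the desired gaze mean under the assumed model.

    Solves target = (mu_r / var_r + sum_j mu_j / var_j) / (1 / var_r + sum_j 1 / var_j)
    for mu_r, where j runs over the other potential attentions.
    """
    other_precisions = [1.0 / v for v in other_variances]
    total_precision = 1.0 / lamp_variance + sum(other_precisions)
    weighted_others = sum(m * p for m, p in zip(other_means, other_precisions))
    return lamp_variance * (target_mean * total_precision - weighted_others)

# Hypothetical values: normal gaze centered at 0.0 with variance 2.0,
# lamp variance 0.5, and a desired gaze mean of 1.0.
lamp_mean = lamp_position_for_target(
    target_mean=1.0, lamp_variance=0.5, other_means=[0.0], other_variances=[2.0]
)

# Round trip: fusing the lamp with the normal gaze should recover the target.
recovered = fused_mean([lamp_mean, 0.0], [0.5, 2.0])
print(lamp_mean, recovered)
```

In this simplified setting the lamp must be placed slightly beyond the desired gaze position, because the normal gaze distribution pulls the fused mean back toward its own center.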

VI. RESULTS OF MODELING
A. ACCURACY OF MODELING
Table 1 shows the estimation accuracy S of the training and validation data for each participant. First, we evaluated the extent to which the proposed method reproduced the gaze points based on the estimation accuracy of the training data. The results show that the drivers' gaze behavior can be estimated with high accuracy. In particular, the estimated accuracy values for Participants 4 and 10, in the exclamation condition, and Participants 1 and 7, in the control condition, exceeded 0.9. The participants with such high estimation accuracy showed consistent gaze behavior throughout the entire training course. In contrast, the gaze behavior of participants with a relatively low estimation accuracy was not consistent throughout the entire course. For example, half of the training data reflects the road projection lamp being gazed at, whereas it was ignored in the other half. Therefore, the estimation accuracy varied among individuals. However, the average estimation accuracy of the training data exceeded 0.8, in both the exclamation and control conditions. The proposed method can, thus, estimate a driver's gaze behavior with high accuracy.
Next, we verified the generality of the proposed method using the estimation accuracy of the validation data. The results show that the estimation accuracy for the validation data was as high as that for the training data. Participants with high estimation accuracy on the training data also tended to have high estimation accuracy on the validation data, and participants with low accuracy on the training data tended to have low accuracy on the validation data. One possible reason is that consistent gaze behavior throughout the training data improves the accuracy of parameter estimation. Therefore, once the parameters are estimated, the proposed method can estimate the driver's gaze in general.
We also created an average model that was trained on the gaze behavior of all drivers instead of individual drivers. The estimation accuracy S was 0.846 and 0.839 (training and validation) in the exclamation condition, and 0.825 and 0.836 (training and validation) in the control condition. These results are close to the average of the training results for the individual drivers shown in Table 1, which suggests that the proposed model simulates the gaze behavior of the average driver.
Fig. 13 shows examples of the results for the validation data. Fig. 13(a) shows the results of Participant 1 in the exclamation condition. The red, blue, and green ellipses indicate the normal gaze distribution $p_n(x)$, the distribution for the road projection $p_r(x)$, and the distribution for the pedestrian $p_p(x)$, respectively. In addition, the red crosses are the measured gaze points, and the orange ellipses show the estimated gaze distribution $p_{\mathrm{fix}}(x)$. If a red cross lies inside the orange ellipse, the gaze estimation is accurate. The estimated parameters of Participant 1 are $\theta = [0.0024, 0.35, 0.88, 0.0004, 0.06, 0.39]^{T}$. Fig. 13(a) shows that the estimated gaze point is approximately correct throughout the entire process, although the estimation error is somewhat large when estimating the gaze for pedestrians (t = 1.88 s).
Fig. 13(b) shows the results of Participant 5, whose estimation accuracy was lower in the exclamation condition. The estimated parameters of Participant 5 are $\theta = [0.0032, 0.64, 1.50, 2.97, 0.96, 0.89]^{T}$. The gaze point is not included in the orange gaze distribution $p_{\mathrm{fix}}(x)$, which means that the estimation error is large.
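The visual check used for Fig. 13, whether a measured gaze point falls inside the 2σ ellipse of the estimated distribution, can be sketched with a Mahalanobis distance test. The sketch assumes a two-dimensional Gaussian with a known covariance matrix; the function name and the example numbers are hypothetical, and this check is not necessarily how the estimation accuracy S is computed.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def inside_sigma_ellipse(point: Point, mean: Point,
                         cov: Sequence[Sequence[float]],
                         n_sigma: float = 2.0) -> bool:
    """Return True if `point` lies within the n_sigma ellipse of N(mean, cov)."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    m2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return m2 <= n_sigma ** 2


# Hypothetical estimated gaze distribution and measured gaze points.
mean = (0.0, 0.0)
cov = [[1.0, 0.0], [0.0, 4.0]]  # standard deviations of 1 and 2 along the axes

hit = inside_sigma_ellipse((1.5, 0.0), mean, cov)   # inside the 2-sigma ellipse
miss = inside_sigma_ellipse((0.0, 4.5), mean, cov)  # outside the 2-sigma ellipse
```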

B. MODELING RESULTS IN EXCLAMATION CONDITION
As described in the previous section, the gaze behavior of Participant 5 was probably not consistent across the data. Among the five training trials, there were some scenes in which the driver gazed at the road projection and pedestrians, but there were also scenes in which the driver gazed at neither. The variance functions of $p_r(x)$ and $p_p(x)$ did not become sufficiently small when the driver did not gaze at either the road projections or the pedestrians. Therefore, the estimation error is large, and the estimated gaze distribution $p_{\mathrm{fix}}(x)$ does not overlap the measured gaze point. In addition, because the normal gaze point was unstable, the normal gaze distribution $p_n(x)$ of Participant 5 is larger than that of Participant 1.
In the control condition, we generate potential attention only for the pedestrian because the road projection lamp is not displayed. Fig. 13(c) shows that the estimated gaze point is approximately correct throughout the entire process, although there is an estimation error when estimating the gaze for pedestrians (t = 1.81 s).

VII. DISCUSSION
In this study, the drivers' gaze point when using HMIs is represented by a probabilistic mathematical method that applies potential attention. As a result of modeling the data measured in the simulator experiment, the average estimation accuracy of the training data exceeded 0.8. This result suggests that the proposed method can estimate drivers' gaze behavior with high accuracy. We also confirmed that the estimation accuracy of the validation data was as high as that of the training data and that driver gaze estimation was possible, in general.
The proposed method can deal with any number of objects as long as the objects in the road environment can be recognized by various sensors. In addition, we consider that our proposed method can be applied to any type of HMI and to situations without HMIs. These are considered advantages of our proposed method over those of other studies. For example, a machine learning method for estimating drivers' gaze behavior needs to be trained in a variety of environments [18]-[20]. In addition, such methods must be retrained when an HMI is used because they are usually trained in situations without HMIs, and gaze behavior is influenced by HMIs. The method proposed in this study allows us to estimate how the driver is influenced by the road projection without measuring the driver's gaze point, once the parameters of the variance function for the specified objects have been estimated. In addition, since our proposed method estimated the gaze correctly, potential attention can be considered one model of human attention allocation. Thus, it can also be assumed that it expresses the mechanism by which gaze behavior is induced. Therefore, the two main contributions of this study are the proposal of a simpler method for estimating gaze points and an understanding of the mechanism of driver gaze behavior using potential attention.
FIGURE 13. An estimation example of drivers' gaze behavior. The red cross, red circle, blue circle, green circle, and orange circle indicate the measured gaze point, normal gaze distribution $p_n(x)$, road projection lamp distribution $p_r(x)$, pedestrian distribution $p_p(x)$, and estimated gaze distribution $p_{\mathrm{fix}}(x)$, respectively. Each distribution ellipse is drawn at 2σ. The time at which the road projection lamp appears was set to t = 0.0 s.
However, this study has several limitations.
(i) The estimation accuracy is worse for the participants whose gaze behavior was not consistent. We assume that the parameters of the variance function of potential attention for objects in our proposed method are constant. Thus, gaze behaviors that differ from those during training cannot be estimated. In addition, it is possible that drivers will become accustomed to the HMI display and will gradually pay less attention to the road projection. In this case, the gap between the estimated parameters and the driver's actual behavior gradually increases. To solve these problems, it is necessary to optimize the system for each driver by using online parameter estimation or by changing the estimation algorithm to take the time series of the driver's gaze behavior into account.
(ii) In this study, we estimated the gaze behavior of 10 drivers. The number of participants was determined based on similar experimental research on driving behavior (e.g., [26]-[28]). However, as mentioned in limitation (i), the estimation accuracy depends on each driver's behavior. The proposed method cannot be a general method to estimate the gaze behavior of all drivers due to individual differences. In addition, this study could not examine the effect of the participants' driving experience. The average driving experience of the participants was 2.8 years; thus, the results depend on the driving behavior of relatively young drivers. It is generally known that novice drivers are less dependent on peripheral vision for vehicle control [29]. Therefore, we assume that the gaze behavior (condensing-gaze-diffusion process) can also be observed in experienced drivers but that their gaze movement is smaller than that of novice drivers. Thus, the modeling accuracy for experienced drivers may differ. To validate the proposed method sufficiently, it is necessary to increase the number of participants in various situations.
(iii) Gaze behavior in real environments is influenced by far more information, whereas we simply assumed that potential attention is generated by the usual gaze point, surrounding objects, and HMIs. Therefore, it is necessary to consider which objects should generate potential attention. If too few areas are used to generate potential attention, the estimation accuracy of the proposed method will be poor. In that case, conventional machine learning methods are better suited to estimating the driver's gaze. It is necessary to examine which object information influences gaze behavior in a more complicated environment.

VIII. CONCLUSION
The purpose of this study was to estimate the drivers' gaze points when using an HMI. First, we measured and analyzed the drivers' gaze behaviors when using the HMI in the experiment. The experimental results show that the road projection lamp guided the driver's gaze point. In addition, a series of gaze behaviors were observed when drivers gazed at pedestrians or road projections, such as gazing at an object immediately after its appearance (condensing process), keeping their gaze on the item for a certain period of time (gaze process), and returning to the gaze position before its disappearance (diffusion process). Next, based on the experimental results, we proposed a probabilistic method for estimating the drivers' gaze when using HMIs by assigning potential attention to the objects to which the driver is likely to pay attention. The gaze point data obtained in the experiment were used to verify the estimation accuracy and generality of the proposed method. We confirmed that the proposed method could estimate the gaze points with high accuracy, averaging 0.850 and 0.838 in the control and exclamation conditions, respectively. The proposed method is expected to be applied to system design, so that an ideal display position for the road projection lamp (as well as for various other HMIs) can be established to induce ideal gaze behavior.
In this experiment, we used a simulator to measure gaze behavior when using HMIs. To apply HMI systems, such as road projection lamps, to vehicles, it is necessary to verify gaze behavior in more complex environments with various HMIs. In addition, we need to verify the accuracy of the estimation of gaze points in real environments.
We believe that the merit of this study is that, when applying its results to actual HMIs, the display position and timing of HMIs can be automatically designed from drivers' gaze behavior, as described in Section V-D, STEP 2. However, since the proposed HMI design method is only a theoretical attempt, evaluating the applicability of this study also remains future work.