Data-Driven Risk-Sensitive Control for Personalized Lane Change Maneuvers

Most current research in the field of autonomous vehicle control assumes that all vehicles will follow the same patterns of automated driving behavior, resulting in systems with “conservative” or “average” driving styles. These systems may not be acceptable to drivers who prefer a more aggressive style of driving, however, while extremely cautious drivers may consider the standard outputs to be too aggressive. To address this problem, in this paper, we introduce Risk Sensitive Control (RSC), an inverse optimal control algorithm that estimates risk-sensitive driving features and incorporates them into a receding-horizon controller. RSC uses a meta-learning algorithm to update the parameters of the cost function, continuously improving the controller online as more and more driving data is gathered from the user and subjective risk feedback. An estimator takes into account individual differences in subjective risk analysis, in terms of driving features and surrounding vehicle locations, by adjusting the cost function and its constraints. We test this approach using five lane change scenarios, some safe and some risky, with thirty real drivers in a CARLA simulation environment. Our quantitative and qualitative evaluations demonstrate that the proposed framework is able to generate a user’s preferred driving maneuvers during lane changes, i.e., control commands the user associates with lower subjective risk, outperforming conventional, model-based predictive control methods in terms of replicating the user’s own driving behavior.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Mehul S. Raval.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Thanks to rapid advances in automobile technology, autonomous vehicles (AVs) are gradually being introduced into our lives. Experts predict that personalization of the in-car AV experience will become a key differentiator between competing OEMs and mobility service providers [1]. Among premium car owners in the US and Europe, 15% say personalization is one of their top three criteria when purchasing a car, which rises to 17% in China [1]. Part of this personalization is adequately reproducing the driving habits of the vehicle's owner, which also generates more trust in the automated system [2]. It is important to develop these personalized driving technologies so that autonomous driving feels safe and natural to a wide variety of users with different driving styles and preferences. Personalized AVs will provide their users with a more comfortable experience, building trust in AV technology and leading to wider acceptance of these vehicles [3], [4]. Our goal is to not only improve the driving experience but to also improve driving safety by preventing accidents caused by human error. In order to model driving behavior in both longitudinal and lateral directions, lane change maneuvers were selected for analysis. Among many ongoing research efforts and applications currently under development in the field of autonomous driving which deal with perception, decision-making, and the control process, in this study, we focus on building a framework for personalizing control when generating lane change maneuvers for autonomous vehicles, directly incorporating user preferences and risk perception. An illustration of our problem setting is shown in Fig. 1.
The red vehicle is our ego vehicle, which is performing a lane change to pass the preceding blue vehicle. The red area near the red vehicle represents the area considered to be most important during each driver's assessment of risk, which is perceived differently by two drivers. In Fig. 1, Driver 1 feels safer than Driver 2 when performing the same passing maneuver. Our proposed RSC approach is able to capture that individual sensitivity and incorporate it into the control output, increasing the driver's comfort, while still ensuring the safety of the maneuver. Data-driven driver modeling [5]- [8] has recently been proven to be capable of capturing more diverse and complex information about driving styles than traditional methods using Model Predictive Control (MPC) [9], [10], which model vehicle motion using predefined physical rules. These traditional methods are better at generating less risky, collision-free trajectories, compared with data-driven methods. Predicting the riskiness of driving situations is a critical functionality in advanced AV applications. While driving risk is a very broad topic that includes both objective and subjective risk, theoretical studies such as [11]- [14] have shown that drivers adjust their behavior to match their targeted level of subjective risk, rather than responding to the objective level of risk which is actually present. Therefore, in this study, we focus on modeling subjective risk as it is perceived by different individuals. In order to generate the lane change maneuvers preferred by users (i.e., those they perceived as being less risky), in this study we propose a framework that combines subjective risk models and data-driven control methods. An overview of the proposed method is shown in Fig. 2. In addition, it should be noted that an individual's driving style is influenced by a variety of factors, such as the driver's intentions, personality, and skill level, as well as their physical and mental condition. 
In our work, personalization is examined on the basis of subjective, individual differences in risk perception, including which factors influence the perception of risk, such as particular driving behaviors and the presence and locations of surrounding vehicles. We then construct a meta-algorithm that uses the feature importance of these driving factors as hyperparameters, in order to learn a user's preferred driving style. Parameters obtained from this meta-learning are then used to produce lane change control commands, and these parameters are continuously updated online in response to additional data and changes in the driving environment. Our research is based on the following hypotheses:
1) (hyp1) Driving behavior varies among drivers, even when encountering the same lane change scenarios.
2) (hyp2) The factors behind these differences, namely driving behavior and the locations of the surrounding vehicles, act by influencing the subjective risk perception of the driver, which can differ among drivers even during the same lane change scenario.
3) (hyp3) These individual differences in the factors which influence subjective risk perception can be applied to generate safer, personalized driving commands.
In this study, we expand on our previous works [15], [16], and propose a data-driven control framework that can learn to generalize personalized lane change behavior from an individual's own driving behavior. Based on the three hypotheses listed above, we first model subjective risk perception by extracting the influential features from driving behavior and surrounding vehicle location data. Secondly, we demonstrate how these influential features can be represented as a hyperparameter of a cost function. Finally, we propose our Risk Sensitive Control (RSC) method, which utilizes this information to generate personalized driving commands.
To the best of our knowledge, this is the first work that has proposed personalizing control of an autonomous vehicle using subjective risk perception as the parameter selection criterion. The main contributions of this paper are listed as follows:
1) Risk Sensitive Control is proposed to generate personalized driving sequences for lane change scenarios. This includes a meta-learning method to learn control parameters from real-world driving data. Subjective risk models are applied to analyze individual differences in the influence of features (ego vehicle and surrounding vehicles). These individual differences are applied as the learning momentum of the cost function of the predictive controller.
2) A dataset named 'Subjective-risk-lane-change-dataset' is generated, which includes ego vehicle driving behavior data, surrounding vehicle location information, and subjective risk scores of the drivers. Collected during both safe and risky lane change scenarios in the CARLA simulator, it also includes the demographic information for 30 participants.
The rest of this paper is organized as follows. We discuss related work in Section II. In Section III, we introduce our proposed method, the Risk Sensitive Control framework. In Section IV, we describe in detail our experimental setup and conditions. We then present and discuss our results in Section V. In Section VI, we examine the theoretical implications of our experimental results in relation to our hypotheses regarding driver behavior and personalized control. Finally, we conclude this work in Section VII.

II. RELATED WORK
Studies related to the research presented in this paper can be divided into four categories. The first is research on subjective risk perception and driving behavior. The second is research on objective function (i.e., cost function) estimation. The third is the personalization of automated control, including attempts to adjust control parameters to model an individual's personal driving behavior. The fourth category of research includes studies on data-driven control frameworks, which combine machine learning methods with predictive controllers to model driving behavior.

A. SUBJECTIVE RISK AND DRIVING BEHAVIORS
During our review of studies on driving risk estimation, we found that the terms "objective risk" and "subjective risk" were frequently used to describe driving risk [11], [12], [17]. Objective or "actual" risk can be defined as the objective probability of being involved in an accident, which is directly measurable, while subjective or "perceived" risk can be defined as a driver's own estimate of the (objective) probability of an accident. Several studies [18]-[23] have shown that driving behavior is influenced by subjective risk perception. Earlier studies [12], [18] focused on one key element of the driving task: the avoidance of potential objective risks carrying possibly severe consequences. However, other studies [24] have rejected the idea that objective risk is a primary determinant of driver behavior, suggesting instead that drivers generally seek to avoid behavior that elicits the perception of danger, and that they adjust their driving behavior to lower these perceptions until they match a target level of perceived risk that is acceptable to them [13]. Therefore, in our previous studies [15], [16], [25], we began modeling the relationship between subjective risk and driving behavior. In this study, we focus on developing control parameters based on these subjective risk models.

B. OBJECTIVE FUNCTION ESTIMATION
Estimating objective functions from historical data is a well-investigated problem known as Inverse Optimal Control (IOC) [26] or Inverse Reinforcement Learning (IRL) [27], [28]. The goal is to identify a parameter vector that weights various features so they will exactly reproduce the optimal outcome being sought. Instead, we can leverage our strong intuition regarding how we'd like autonomous vehicles to behave and create a cost function that demonstrates that behavior. Rather than computing an optimal policy given a cost function, IOC reveals the cost function that best explains a demonstrated optimal behavior. Another approach for objective function estimation is online parameter estimation, which has been employed in the Intelligent Driver Model (IDM). Several studies have used online filtering techniques to estimate IDM parameters [29]- [32]. Inspired by these approaches, in this study we use a meta-learning algorithm to continuously update the parameters of a cost function, improving the performance of the controller as more data is gathered from the user's subjective risk feedback.

C. PERSONALIZED CONTROL
Investigation of personalized control for AVs has a long history, which began with the tuning of the parameters of the intelligent driver model (IDM) [9]. Later, in [10], [33], the driving styles of autonomous vehicles were modified by changing the parameters of their Model Predictive Control (MPC) systems. However, in these studies, the MPC parameters were manually tuned to suit specific drivers. As a result of the rapid development of sensor technology and machine learning techniques, driver models can now be trained with real-world driving data [34]- [36], using a learning-based framework to create personalized automated driving styles. A driver's own behavior models are combined to generate control commands which feel natural to the user, and MPC is used to follow the driver's reference models. In the studies cited above however, the driver models and MPC parameters were developed separately, thus the driver models could not interact directly with the controller.

D. DATA-DRIVEN CONTROL
Alternatively, finding rigorous and efficient ways to integrate data into control theory has been a problem of great interest for many decades. Since most of the classical contributions in control theory have relied on model knowledge [37]-[39], the problem of finding such models from measured data, i.e., system identification, has become a mature research field. More recently, methods which train controllers directly with data have received increasing interest, and theoretical premises have been put forward [40]-[42]. These data-driven, model-based control methods have led to the development of "combined control", which generates control commands directly from observation, as proposed in [43]-[46]. This field is still under-explored but offers significant opportunities to exploit an abundance of data in a more reliable and safer manner than "black-box" approaches. An approach to learning-based control, known as Learning-based Model Predictive Control (LMPC), was proposed in [47]. The system is trained using multiple iterations of repetitive tasks, such as laps driven on a racetrack. A safe set and a terminal cost function can also be used in order to guarantee recursive feasibility and non-decreasing performance at each iteration [48], [49]. This approach can be used to teach the system how to follow a driving route based on its previous experience driving the route. However, there is no established way for such a system to learn personalized driving styles. Recent successes in the field of machine learning, as well as the availability of improved sensing and computational capabilities in modern control systems, have led to growing interest in data-driven learning methodologies. Model-free methods, such as end-to-end behavior cloning (imitation learning) [5] and reinforcement learning (RL) [6]-[8], have achieved state-of-the-art results in many challenging domains. However, these methods learn "black-box" control policies and typically suffer from low sample efficiency and poor generalization.
The control method proposed in this study is a data-driven, model-based method that uses vehicle models to predict the driving behavior of the ego vehicle, based on historical, real-world driving behavior and surrounding vehicle information. Moreover, the parameters of the cost function are learned from a particular person's subjective risk feedback and real-world driving data, in order to model individual differences in the relationship between driving behavior and risk perception. This relationship is then used to generate personalized control commands that reduce the level of subjective risk as perceived by a specific driver.

III. PERSONALIZED RISK-SENSITIVE CONTROL
Discrete-time, finite-horizon MPC is a well-studied control method which optimizes a convex quadratic objective function [47], [48], [50] with respect to state-transition dynamics of a specific state (z). In this section, we first describe our method of modeling subjective risk analysis for learning control parameters. We then propose a controller for executing control parameters derived from subjective risk models. Finally, we introduce our learning and execution scheme for generating control sequences.

A. PROBLEM FORMULATION
We use the following vector $z_t$ to represent the state of the ego vehicle at time $t$:

$$z_t = [x_t, y_t, \varphi_t, v_t, \omega_t, l_t]^\top$$

where $x$ and $y$ represent the Cartesian coordinates of the vehicle along its trajectory, $\varphi$ is yaw angle, $v$ is velocity, $\omega$ is lateral acceleration, and $l$ is longitudinal acceleration. The control signals for this system at time $t$ are:

$$u_t = [\delta_t, a_t]^\top$$

where $\delta$ represents steering angle, and $a$ is acceleration force (or brake force). As shown in Fig. 1, our lane change problem can be defined as the movement of the ego vehicle (red) into the next lane in order to pass a preceding vehicle. We use $z^k_t$ to represent the state of the surrounding vehicle (the blue and yellow cars) traveling in direction $k \in \{1, \dots, K\}$, where $K = 6$, relative to the ego vehicle at time step $t$. If the $k$-th direction contains several vehicles, the nearest one is selected as the surrounding vehicle for that direction. Here, $u^k_t$ represents the control input of the surrounding vehicles. State $z_t$ and control inputs $u_t$ at time step $t$ (where $t \in \{1, \dots, T\}$ and $T$ is the length of the time horizon) are subject to the following constraints, respectively:

$$z_t \in \mathcal{Z}, \qquad u_t \in \mathcal{U}$$

$\mathcal{Z}$ is the collision-free region in a given map, and $\mathcal{U}$ is the configured space for steering angle and acceleration force (or brake force) in our simulation system. Consider that the function $f : \mathcal{Z} \times \mathcal{U} \to \mathcal{Z}$ is uniformly continuous, bounded, and Lipschitz continuous in $z_t$ for fixed control $u_t$. Therefore, the state dynamics of a particular vehicle can be represented as:

$$\dot{z}_t = f(z_t, u_t)$$

The ego vehicle's strategy is represented by the control input, where $u_t$ is applied in each time step. We focus on the discretized, dynamic, state-space form setting within a time horizon of length $T$. The goal is to find the optimal control command at each time $t$ for each individual driver $i$, as follows:

$$u^{i*}_t = \arg\min_{u_t} J^i$$

where $J^i$ is the cost function for each individual driver $i$ at each time step $t$:

$$J^i = \sum_{n=t}^{t+T-1} h(\hat{z}_n, u_n) + Q(\hat{z}_{t+T})$$

subject to the system dynamics and to the state and input constraints above. Here, $h(\cdot, \cdot)$ represents the state cost (deviation from target trajectory), $Q$ represents the terminal cost towards the lane finishing goal, and $\hat{z}_n$ is the approximated state in each time step [48], [51], [52].
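The rule for choosing one surrounding vehicle per direction can be illustrated with a short Python sketch. It assumes that each detected vehicle has already been assigned a direction index k in 1..K by some zone classifier, which is not specified here; the function name `nearest_per_direction` and the sample coordinates are hypothetical and only serve the illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

K = 6  # number of directions considered around the ego vehicle (from the text)

# (k, x, y): direction index in 1..K and Cartesian position of one surrounding vehicle.
# The direction labels are assumed to be provided by a separate zone classifier.
Observation = Tuple[int, float, float]


def nearest_per_direction(
    ego_xy: Tuple[float, float],
    observations: List[Observation],
) -> Dict[int, Optional[Tuple[float, float]]]:
    """Keep only the closest vehicle in each direction; None marks an empty direction."""
    nearest: Dict[int, Optional[Tuple[float, float]]] = {k: None for k in range(1, K + 1)}
    best_dist = {k: math.inf for k in range(1, K + 1)}
    for k, x, y in observations:
        if k not in nearest:
            raise ValueError(f"direction index {k} is outside 1..{K}")
        dist = math.hypot(x - ego_xy[0], y - ego_xy[1])
        if dist < best_dist[k]:
            best_dist[k] = dist
            nearest[k] = (x, y)
    return nearest


# Two vehicles share direction 1; only the closer one is kept.
surrounding = nearest_per_direction(
    (0.0, 0.0),
    [(1, 10.0, 0.0), (1, 5.0, 0.0), (3, -8.0, 3.5)],
)
```

Directions without any vehicle stay `None`, so a downstream cost term can skip them instead of penalizing a distance that does not exist.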

B. SUBJECTIVE RISK MODELS
Based on our previous work [15], [16], Random Forests (RFs) are used to extract a particular driver's influential driving features for adjustment of the control parameters. The RFs model the relationship between the driving signals of the ego vehicle, surrounding vehicle information, and subjective risk $R^i$, where $i$ represents a participant number. There are two reasons for using RFs. First, we assume our subjective risk perception process can be modeled using bagged, tree-based methods rather than deep learning or kernel-based methods. Second, calculating Gini importance from RFs, a measure of variable importance calculated by observing the effect of each variable on model accuracy when each predictor variable is randomly shuffled, does not take much extra time and is a useful method for answering indirect questions [53]. We then apply this feature importance value to measure the relationships between driver behavior (or surrounding vehicle locations) and subjective risk perception. To be specific, since $z_t \in \mathbb{R}^N$, $u_t \in \mathbb{R}^M$ and $k \in \{1, \dots, K\}$, we generate feature importance values $\rho^i_f \in \mathbb{R}^{N+M}$, where $f$ represents an index of the features of ego vehicle driving signals, or an index of surrounding vehicle locations, in order to compare the importance of various features to different individuals. After normalizing all of the parameters, feature importance rankings are integrated into an index for each feature category, that is, $\eta = \sum_{f=1}^{N+M} \rho^i_f = 1$ for ego vehicle driving signals, and $\zeta = \sum_{f=N+M+1}^{N+M+K} \rho^i_f = 1$ for surrounding vehicle locations. Therefore, we obtain $\rho^i = \{\eta, \zeta\}$, and these feature importance values are applied as hyperparameters for generating risk-sensitive control commands.
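As a minimal sketch of the normalization step described above, the following Python snippet rescales raw importance scores so that the ego-vehicle group and the surrounding-vehicle group each sum to one. The raw scores are made-up numbers; in practice they would come from a fitted random forest (for example, the `feature_importances_` attribute of a scikit-learn forest), and the helper name `normalize_groups` is hypothetical.

```python
from typing import List, Tuple

N, M, K = 6, 2, 6  # state, control and direction counts used in the text


def normalize_groups(raw: List[float], n_ego: int) -> Tuple[List[float], List[float]]:
    """Split raw importances into ego and surrounding groups and scale each to sum to 1."""

    def scale(values: List[float]) -> List[float]:
        total = sum(values)
        if total <= 0.0:
            raise ValueError("importance scores must have a positive sum")
        return [v / total for v in values]

    return scale(raw[:n_ego]), scale(raw[n_ego:])


# Illustrative raw importances for N + M ego features followed by K location features.
raw_importance = [0.02, 0.03, 0.05, 0.20, 0.15, 0.10, 0.08, 0.07,
                  0.06, 0.04, 0.09, 0.05, 0.03, 0.03]
eta, zeta = normalize_groups(raw_importance, N + M)
```

Normalizing each group separately keeps the ego-vehicle weights comparable across drivers even when one driver's forest assigns most of its importance to surrounding-vehicle locations.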

C. RISK SENSITIVE CONTROL
Our goal in this study is to achieve personalized, data-driven control for following a reference trajectory by generating safe, personalized control sequences. RSC can be thought of as a two-level framework with training and execution stages, as shown in Fig. 3. During the training phase, we combine subjective risk models (SRM) with non-linear Model Predictive Controllers (MPC). SRM is used to extract individual differences in the influence of driving behaviors and surrounding vehicle locations on risk perception. Non-linear MPC is used to generate lane change maneuvers that track personalized, real-world driving data. In the execution phase, the learned hyperparameters are applied to generate online driving behavior. These hyperparameters are updated continuously based on observation of the driver's reactions to surrounding vehicles. In general, it is challenging to compute a reference trajectory that maximizes performance over an infinite horizon for a system with complex, non-linear dynamics or parameter uncertainty. Non-linear MPC is an appealing technique for tackling this problem due to its ability to handle state and input constraints while minimizing a finite-time, predicted cost [50], [54], [55]. It became clear during observation of our subjective risk models that both longitudinal and lateral acceleration influence participants' perception of risk. Therefore, in order to model personalized, risk-sensitive control, we developed a risk-sensitive vehicle model which is an expansion of the kinematic bicycle model [56], [57] and which can generate control that is more sensitive to both longitudinal and lateral acceleration. In our model, the two front wheels (resp. the two rear wheels) of the vehicle are lumped into a unique wheel located at the center of the front axle (resp. of the rear axle), as shown in Fig. 4. The control inputs correspond to acceleration $a$ and steering angle $\delta$. Our vehicle model is defined as shown in Eq. (9), with the center of gravity located at the center of the rear axle, where $L$ is the distance between the front and rear wheels of the ego vehicle and $\varphi$ is the yaw angle at the center of gravity. Equation (9) was inspired by methods used for executing control strategies within joint dynamic systems. This system can be locally linearized, giving $g(\cdot)$ from Eq. (5), using the Forward Euler method.
In this manner, the values of state vector $\hat{z}_{t+1}$ can be approximated as functions of time $t$. Using this model, the system being analyzed can be represented with ordinary, first-order differential equations (ODEs), which can be used to describe how the state variables change over time. Using control vector $u$, state vector $z$, and the ODEs above, we can compute (or predict) the outputs of the system for the next time step. The output is, again, a vector containing the predicted state variables. The process (an autonomous vehicle in this case) uses the control inputs to transform the input state into the output state. We can obtain a discrete-time model by using Forward Euler discretization with sampling time $\Delta t$ as follows:

$$\hat{z}_{t+1} = z_t + f(z_t, u_t)\,\Delta t$$

We then use a first-degree Taylor expansion around the current state $\bar{z}$ and control inputs $\bar{u}$. Here, the reference values $z_{r,t} \in \mathcal{Z}^i$ and $u_r \in \mathcal{U}^i$, which are the real-world driving data for individual driver $i$, are used as $\bar{z}$ and $\bar{u}$:

$$f(z, u) \approx f(\bar{z}, \bar{u}) + A\,(z - \bar{z}) + B\,(u - \bar{u})$$

$A$ and $B$ are Jacobian matrices of Eq. (9) with respect to $z$ and $u$, respectively. Then, using the equations provided in the Appendix of this paper, we can obtain $A$, $B$, and $C$. We can now predict future motion using Eq. (14) and Eq. (10). The goal of the controller is to find the optimal control input $u^i_t$ that results in an output state that is as close as possible to the reference state at each time step for individual driver $i$. In other words, the controller tries to minimize the error between the process output and reference variables via the control input variables for $T$ time steps. In order to incorporate individual differences in driving behavior, we developed subjective risk analysis models to learn hyperparameters representing the feature importance rankings of RFs, as described in Section III-B. In this study, we applied the do-mpc toolbox [58] to solve the optimization problem of the non-linear model predictive control component of our modular design. By minimizing the cost function $J^i$ created for each participant, our goal is to generate personalized control commands at each time $t$ for individual $i$ as follows:

$$u^{i*}_t = \arg\min_{u_t} J^i$$
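The discretization and linearization steps in this subsection can be sketched in plain Python. Note the assumptions: the snippet uses the standard rear-axle kinematic bicycle model with state [x, y, φ, v], not the expanded risk-sensitive model of Eq. (9); the wheelbase value is illustrative; the sampling time is set to the 0.05 s simulation step reported later in the paper; and the Jacobians are approximated by finite differences rather than by the analytic expressions in the Appendix.

```python
import math
from typing import Callable, List

WHEELBASE = 2.7  # L in metres; illustrative value, not taken from the paper
DT = 0.05        # sampling time in seconds; assumed equal to the simulation step


def f(z: List[float], u: List[float]) -> List[float]:
    """Continuous-time kinematic bicycle model referenced at the rear axle."""
    _x, _y, phi, v = z
    delta, a = u
    return [v * math.cos(phi), v * math.sin(phi), v * math.tan(delta) / WHEELBASE, a]


def euler_step(z: List[float], u: List[float], dt: float = DT) -> List[float]:
    """Forward Euler discretization: z_{t+1} = z_t + f(z_t, u_t) * dt."""
    return [zi + dt * dzi for zi, dzi in zip(z, f(z, u))]


def jacobian(fun: Callable[[List[float]], List[float]], point: List[float],
             eps: float = 1e-6) -> List[List[float]]:
    """Forward-difference Jacobian of fun at point, returned as rows of partials."""
    base = fun(point)
    columns = []
    for j in range(len(point)):
        bumped = list(point)
        bumped[j] += eps
        columns.append([(o - b) / eps for o, b in zip(fun(bumped), base)])
    return [[columns[j][i] for j in range(len(point))] for i in range(len(base))]


# Linearize around a reference point (straight driving at 30 m/s).
z_ref = [0.0, 0.0, 0.0, 30.0]
u_ref = [0.0, 0.0]
A = jacobian(lambda z: f(z, u_ref), z_ref)  # partials of f with respect to the state
B = jacobian(lambda u: f(z_ref, u), u_ref)  # partials of f with respect to the input
z_next = euler_step(z_ref, u_ref)
```

Finite differences are used here only to keep the sketch short and self-contained; a solver-backed implementation would normally rely on analytic or automatically differentiated Jacobians.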

D. LEARNING CONTROL PARAMETERS
For simplicity, the formulation of a cost function based on Eq. (5) can be summarized as Eq. (16), where $Q_T \in \mathbb{R}^N$ represents the terminal state coefficients, $Q_n \in \mathbb{R}^N$ represents the state coefficients, $Q_m \in \mathbb{R}^M$ represents the control input coefficients, and $Q_k \in \mathbb{R}^K$ are coefficients representing penalties of the distances to surrounding vehicles traveling in direction $k$. Here, $z_r$ is the reference state at time step $t$, which is obtained from real driving data. The expression $(\cdot)^k_t$, $k \in \{1, \dots, K\}$, is used to represent the state of the surrounding vehicle traveling in the $k$-th direction, while $\mathcal{Z}^k$ represents the zones surrounding the ego vehicle, as shown in Fig. 5; thus $(x^1_t, y^1_t)$ represents a situation where the surrounding vehicle is located in Area #1. Finally, our problem of interest is to tune the parameters $\theta^i = \{Q_T, Q_n, Q_m, Q_k\}$, which represent the terminal cost, stage cost, control input deviation cost, and surrounding vehicle distance penalty, respectively, so that $J^i_{t \to t+T}$ has the minimum loss. Inspired by [43]-[46] and as reflected in Eq. (16), the idea behind differentiable MPC is to optimize cost $J^i$ directly with respect to the tunable parameters by applying gradient descent, as shown below and in Fig. 6:

$$\theta_{j+1} = \theta_j - \rho^i \left.\frac{\partial J^i_j}{\partial \theta_j}\right|_{\theta_j}$$

Here, $j = 0, 1, \dots$ is the iteration index, and $\xi = \{z_{t:t+T}, u_{t:t+T-1}\}\,|_{\theta_j}$ represents the driving states and control input with respect to $\theta_j$; $\left.\frac{\partial J^i_j}{\partial \theta_j}\right|_{\theta_j}$ is thus the gradient of the cost with respect to the parameters, evaluated at $\theta_j$ for participant $i$. Hyperparameter learning rates $\rho^i$ are extracted from our subjective risk models, as explained in Section III-B. Each update of $\theta_j$ consists of two components: a forward pass, where at $\theta_j$ the corresponding trajectory $\xi$ is solved using Eq. (14) and Eq. (15) and the personalized loss $J^i$ is computed, and a backward pass, where the gradients are evaluated.

1) FORWARD PASS
• $J^i_{t \to t+T}$: The specific scalar-valued control cost function based on Eq. (16) for driver $i$ in the $j$-th iteration with a current guess of $\theta_j$.

2) BACKWARD PASS
• $\frac{\partial \xi}{\partial \theta_j}$, $\frac{\partial J^i_j}{\partial \theta_j}$: The partial gradients of the system trajectory $\xi$ and of the loss, respectively, with respect to the current guess of $\theta_j$.
The total gradient is computed using the chain rule with respect to the current $\theta_j$.
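The forward and backward passes can be illustrated on a toy problem. The sketch below is not the controller used in this paper: it assumes a scalar system z_{t+1} = z_t + u_t whose one-step optimal input has a closed form, an imitation loss against a demonstrated trajectory, finite differences in place of the analytic chain rule, and made-up per-parameter step sizes standing in for the learning rates ρ. All function names are hypothetical.

```python
from typing import Callable, List


def rollout(theta: List[float], z0: float = 0.0, z_ref: float = 1.0,
            horizon: int = 10) -> List[float]:
    """Forward pass: closed-loop trajectory of z_{t+1} = z_t + u_t under weights (q, r)."""
    q, r = theta
    gain = q / (q + r)  # minimizer of q*(z + u - z_ref)**2 + r*u**2 is u = gain*(z_ref - z)
    z, trajectory = z0, []
    for _ in range(horizon):
        z = z + gain * (z_ref - z)
        trajectory.append(z)
    return trajectory


def imitation_loss(theta: List[float], demo: List[float]) -> float:
    """Squared deviation between the generated and the demonstrated trajectory."""
    return sum((z - d) ** 2 for z, d in zip(rollout(theta), demo))


def finite_diff_grad(loss: Callable[[List[float]], float], theta: List[float],
                     eps: float = 1e-6) -> List[float]:
    """Backward pass stand-in: forward-difference gradient of the loss."""
    base = loss(theta)
    grad = []
    for j in range(len(theta)):
        bumped = list(theta)
        bumped[j] += eps
        grad.append((loss(bumped) - base) / eps)
    return grad


def update(theta: List[float], rho: List[float], demo: List[float],
           floor: float = 1e-3) -> List[float]:
    """One gradient step with per-parameter step sizes; weights are kept positive."""
    grad = finite_diff_grad(lambda th: imitation_loss(th, demo), theta)
    return [max(floor, t - lr * g) for t, lr, g in zip(theta, rho, grad)]


demo = rollout([1.0, 1.0])  # demonstration generated with "true" weights
theta = [1.0, 3.0]          # initial guess of the weights
rho = [0.3, 0.1]            # illustrative per-parameter learning rates
initial_loss = imitation_loss(theta, demo)
for _ in range(200):
    theta = update(theta, rho, demo)
final_loss = imitation_loss(theta, demo)
```

Using a separate step size per weight mirrors the idea of letting feature importance decide how quickly each cost term adapts, although the specific values here carry no meaning.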

IV. EXPERIMENT SETUP
In this section, we will first explain how we collected driving data from the participants. We then describe how our experiment was conducted.

A. EXPERIMENTAL DESIGN
In order to model the relationship between driving behavior, surrounding vehicle locations, and subjective risk perception for lane change maneuvers, we set up five experimental lane change scenarios, as illustrated in Fig. 7. These five scenarios were predefined to be either safe or risky: Scenarios #1 and #2 were considered safe based on [59], Scenario #3 was considered to be relatively risky, and Scenarios #4 and #5 were deemed risky, based on the emergency level from [60]. While maneuvers other than lane changes are possible in some of these experimental scenarios, we instructed the participants in our experiment to perform a lane change to the right or left, as they saw fit, whenever they felt comfortable with the timing. Our participants assigned a level of subjective risk to each of these maneuvers after watching them being autonomously executed on a driving simulator using Personalized MPC (PMPC), as defined in our previous work [16], whose parameters are tuned to generate driving behavior close to that of a specific driver. Then they repeated this process using the proposed RSC method.
FIGURE 7. The ego vehicle is shown in red, and the surrounding vehicles are shown in blue or yellow. The colored bars in front of the surrounding vehicles show their predefined velocity patterns, e.g., in Scenario #2 the blue vehicle is set to drive slower than the yellow vehicle, therefore the ego vehicle is expected to pass the blue vehicle on the left.
The five driving scenarios were as follows:
4) Scenario #4: Participants were asked to change lanes to the right to avoid an unsafe following distance. The unsafe distance from the surrounding vehicle is predefined based on [60]. Since these scenarios occur on a highway, the participants were instructed not to stop but to change lanes to avoid a collision.
5) Scenario #5 (Cut Off From the Right): One surrounding vehicle, traveling at 110-130 km/h, cuts in front of the ego vehicle from the right. Participants were asked to change lanes to the left to avoid an unsafe following distance. Scenario #5 was created to observe differences in driving behavior and subjective risk perception due to the direction of the lane change, with respect to Scenario #4. Therefore, the unsafe distance from the surrounding vehicle is again predefined according to [60]. The experimental conditions are identical to those in Scenario #4, except that the direction of the lane change is to the left.

B. DATA COLLECTION
We collected four types of data for this study: ego and surrounding vehicle driving signals, demographic information of study participants, subjective risk perception feedback using questionnaires, and the driving intervention behavior of the participants during autonomously generated lane change maneuvers when different vehicle control strategies were being used.

1) SIMULATION AND DATA COLLECTION SETUP
The data used in this study was collected from our study participants during a simulated driving experiment using the CARLA driving simulator [61]. The experiment took place in our lab, and the hardware used included three monitor screens (to simulate a realistic, wide-view driving environment), a Logitech G27 driving force pedal set, and a Logitech steering wheel. The data collection environment is shown in Fig. 9. Data collection was conducted in three steps. First, experimental participants were briefed on data collection ethics and asked to practice using the driving simulator to familiarize themselves with the system. The Town04 CARLA map was used. Participants were allowed to drive around freely to familiarize themselves with driving in the simulator through habituation driving. Second, participants were asked to drive in each of the five predefined lane change scenarios, which they repeated four times per scenario, so we could record their personal driving data during each of the 20 lane changes. Finally, all participants were asked to view the same lane changes as automatically generated by personalized driving models, first using PMPC, and then as generated using the proposed RSC system, as if they were using autonomous driving systems. They reported their perceived level of risk for each scenario on a paper questionnaire prepared in advance. The risk score values were described as shown below: . Data collection environment used in this study. All participants were asked to drive in five, predefined lane change scenarios in the CARLA driving simulator, from the driver's point of view. In the photo above, driving data is being collected from an expert driver while driving in CARLA. • 1 = Very safe • 2 = Safe • 3 = Neither risky nor safe • 4 = Risky • 5 = Very risky We collected both ego vehicle and surrounding vehicle driving data using the PythonAPI in CARLA, while running the simulation system at a fixed time-step of 0.05 seconds per frame (20 fps). 
The generated dataset ('Subjective-risk-lanechange-dataset') is available on the Internet.

2) EXPERIMENTAL PARTICIPANTS
A total of 30 subjects took part in this experiment. The participants included men and women, and their level of driving experience ranged from expert to beginner. Our study participants had an average of 25.7 years of driving experience (SD = 17.3). All of the participants had a valid Japanese driver's license, and they all used the driving simulator for habituation before we began collecting data. The demographic information is summarized in Table 1. Each participant drove in each of the five lane change scenarios four times (i.e., each subject executed 20 simulated lane changes). After performing the lane changes in the simulator, they were asked to view the same lane changes as executed by two different automated driving systems, a Personalized MPC (PMPC) system, and our proposed RSC system. The personalized control commands for both systems were generated using each participant's own driving data. They were asked to assign a subjective risk score to each of the system-generated lane changes, which were the same lane change scenarios they had previously driven in themselves.
The collection of this data was approved by the ethics committee of the Institutes of Innovation for Future Society of Nagoya University (ID: 2020-29). In order to determine how many participants were needed to evaluate our proposed system, we conducted sample size estimation [62] using a two-tailed t test. The calculated minimum sample size for our comparison of velocity sequences during simulated driving was 19. We recruited 30 participants in order to collect driving data from a variety of drivers, and to extract a wide range of personalized driving patterns.
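As a rough illustration of how a minimum sample size can be estimated for a two-tailed t test, the sketch below uses the common normal approximation for a one-sample (or paired) comparison. The function name `min_sample_size`, the effect size, and the power value are all assumptions made for illustration; the source does not state the settings used in [62], so this sketch is not expected to reproduce the reported minimum of 19.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(effect_size, alpha=0.05, power=0.8):
    """Approximate n for a two-tailed one-sample (or paired) t test.

    Uses the normal approximation n = ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (an assumed input here).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical effect sizes, chosen only to show how n grows as d shrinks.
n_large_effect = min_sample_size(0.8)
n_medium_effect = min_sample_size(0.5)
```

Because the approximation replaces the t distribution with a normal one, exact power calculations typically yield slightly larger values.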

3) INTERVENTION AS SUBJECTIVE RISK IDENTIFIER
In order to investigate the relationship between driving behavior and subjective risk perception, participants were instructed to intervene when observing the automated lane changes generated by the baseline PMPC and proposed RSC methods if they felt the situation was risky. The lane change scenarios in which the participants intervened were classified as "very risky" (score = 5). The locations of the participant interventions in each scenario are shown in Fig. 11, where plus marks (+) represent the intervention points of all of the drivers who intervened. The number of interventions during PMPC-generated and RSC-generated driving, the locations where our 30 participants intervened in each scenario, and the features that participants identified as evoking the perception of risk were all recorded, as summarized in Fig. 10. We can see that RSC-generated driving triggered far fewer interventions than PMPC-generated driving on average.

4) SUBJECTIVE RISK AND DRIVING FEATURES
After the driving data was collected, all of the participants were asked to assign subjective risk scores to each of the same five lane change maneuvers as generated automatically using the PMPC and proposed RSC methods, both of which were personalized using each participant's own driving data. The participants were asked to assign a subjective riskiness level $R_s^i$ (where $i$ indexes the participant and $s$ is the scenario ID) to each lane change maneuver using the Likert scale. Furthermore, they were asked to intervene if they did not feel comfortable with the automated vehicle's lane change behavior, in which case the lane change was assigned a subjective risk score of 5 (very risky) by default. After intervening, participants were asked to explain why they felt it was necessary to intervene, i.e., what feature of the lane change evoked their perception of risk. Driving features for each participant were recorded at the time when the participant intervened, including ego driving features (Table 2) and surrounding vehicle locations (Fig. 5). Each participant noted the factors that most influenced their subjective risk in the questionnaires. The summary of these responses is shown in Fig. 10, and details for each participant are shown in Table 5 (Appendix-B).

C. FEATURE EXTRACTION AND PREPROCESSING
The collected data, which included ego vehicle driving states and control commands at time $t$ ($z_t$ and $u_t$, respectively) and surrounding vehicle driving states $z_t^k$ and control commands $u_t^k$, $k \in \{1, \ldots, K\}$, were synchronized at 20 Hz. The steering angle was normalized to $\delta_t, \delta_t^k \in [-1, 1]$, and acceleration $a_t$ and brake force $b_t$ were represented as $a_t, a_t^k \in [-1, 1]$, where $[-1, 0]$ corresponds to a brake pedal input. The dynamic features of the ego vehicle's driving signals and the surrounding vehicle location data offer a wealth of information about driving behavior during lane changes. Figure 5 illustrates how we defined our surrounding vehicle locations, giving $K = 6$. To balance differences in each individual's subjective risk assessments, the Likert scale subjective risk scores they reported were normalized by removing the mean and scaling to unit variance. The features listed in Table 2 were deemed relevant and were recorded.
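The per-participant normalization of Likert scores described here can be sketched as a z-score transform. This is a minimal sketch rather than the authors' code: the function name `standardize` and the example scores are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Remove the mean and scale to unit variance (z-score).

    Assumption: the population standard deviation is used for scaling.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores are identical, so there is no spread to scale.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Hypothetical Likert scores from one participant (1 = very safe, 5 = very risky).
normalized = standardize([1, 2, 3, 4, 5])
```

After this transform, a participant who rates everything as risky and one who rates everything as safe are placed on a comparable scale, since only deviations from each person's own average remain.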

V. EXPERIMENTAL RESULTS
The effectiveness of our proposed Risk Sensitive Control method was evaluated through a leave-one-lane-change-out validation experiment, and the results of this experiment are described in this section. After subjective risk classification and extraction of hyperparameter $\rho_i$ for each participant, lane change maneuvers for the same scenarios were automatically generated using PMPC and RSC. We observed participant intervention behavior when using each vehicle control system to determine which system prompted the least driver intervention. Then, following subjective risk modeling, we statistically tested the lane change subjective risk assessment results reported by our study participants when using each of the automated lane change control methods. Thus, the two methods were compared using both quantitative and qualitative methods.

A. SUBJECTIVE RISK MODELS
We will first discuss our observations on the relationship between driver types and subjective risk, and then the causative factors of intervention based on feedback collected from the participants. Lastly, we explain how we explore the relationship between driving behaviors and subjective risk using classification models, in terms of feature importance.

1) DRIVERS AND SUBJECTIVE RISK PERCEPTION
To obtain an intuitive understanding of individual differences in the perception of risk and the related influential factors, we first compared the average subjective risk scores assigned by different types of drivers, based on their driving experience, expertise, age, and gender. We then compared participant responses to our five lane change scenarios.

a: DRIVING EXPERIENCE, EXPERTISE, AGE
Drivers were classified according to their years of driving experience, driving expertise (expert or non-expert), and age, as summarized in Table 1. The results of our comparison of the subjective risk scores for each category of drivers are shown in Fig. 13. Overall, the highest subjective risk scores were reported by expert drivers. This implies that expert drivers, such as driving school instructors, perceive greater danger (i.e., risk) in complex driving situations. In contrast, we observed that the lowest subjective risk scores were reported by our elderly drivers, the group which also had the most driving experience, as shown in Table 1. This suggests that as age and driving experience increase, drivers' subjective perception of risk, in relation to their own driving behavior, decreases when encountering complex lane change situations.
FIGURE 12. Results of ego vehicle driving signal feature importance analysis for subjective risk models of our 30 participants. Red bars represent the most influential ego vehicle driving signal for that participant. The different background shades of the bar charts represent the corresponding driver types. These driving signal features are defined in Table 2.

b: DRIVING SCENARIOS
When setting up the various lane change scenarios for our experiment, we determined a priori which lane change scenarios were "safe" and which were "risky". However, as shown in Fig. 13, when we compared the subjective risk scores reported by our participants according to their level of driving experience, age, and expertise, we found no consensus among the study participants as to which lane change scenarios were "safe" and which were "risky". The expert drivers recognized Scenario #5 as risky (a vehicle traveling at high speed passes on the right and then aggressively merges into the ego driver's lane), but reported Scenario #3 (a car stopped in the ego vehicle's lane) as the most dangerous scenario. Normal adult drivers reported Scenario #4 as the riskiest (a vehicle traveling at high speed passes on the left and aggressively merges into the ego driver's lane), while also recognizing that Scenario #5 was risky. Surprisingly, these drivers considered Scenario #2 (passing a slow-moving vehicle on the left, after being passed by a faster-moving vehicle traveling in the left lane) to be about as risky as Scenario #5. Elderly and young drivers did not consider Scenario #5 to be especially risky, and both groups considered Scenario #2 to be the riskiest. These results confirm that different types of drivers have different subjective perceptions of the level of danger present when encountering the same predefined lane change scenarios. Our first hypothesis (hyp1) is therefore confirmed, because driving behavior varied among our drivers even when encountering the same lane change scenarios. Our second hypothesis (hyp2) was also confirmed, since perception of risk varied among our participants when encountering identical driving scenarios.

2) INTERVENTION BEHAVIOR AND SUBJECTIVE RISK
We also investigated which driving factors triggered participants to intervene when viewing the lane change maneuvers autonomously generated by the PMPC and RSC models, based on their questionnaire responses. The results of the questionnaires revealed that the following three features were the most influential among our participants when perceiving risk: insufficient distance to a surrounding vehicle, abrupt changes in steering angle, and sudden changes in velocity. When observing lane changes made using the RSC model, the factor that most often evoked the perception of risk was an insufficient distance from surrounding vehicles, with 56.7% of the participants reporting that this motivated them to intervene, while 30% reported that abrupt changes in the steering angle seemed risky, as shown in Fig. 10. The third phenomenon that increased the level of perceived risk, leading to intervention, was sudden changes in velocity due to acceleration or deceleration. These results also confirm our second hypothesis (hyp2), that the source of differences in driving behavior is individual variation in perception of risk. Data on participant interventions during lane change scenarios when using PMPC and RSC are shown in Table 5, revealing that study participants found automated lane changes made with RSC to be less risky than those made with PMPC.

3) DRIVING FEATURES AND SUBJECTIVE RISK PERCEPTION
To highlight the effects of surrounding vehicles on subjective risk assessment, we compared the results of subjective risk classification when only ego vehicle data was considered (Ego-only), when only surrounding vehicle information was considered (Sur-only), and when both ego and surrounding vehicle information were considered (Ego-Sur), the last of which was used in the proposed method. In this paper, Area Under the Curve (AUC) was used to represent subjective risk classification performance. The Likert scale scores were binarized for classification: a score of 5 was labeled risky, and scores of 1 to 4 were labeled non-risky. As shown in Table 3, the averaged AUCs when using both ego and surrounding vehicle information (Ego-Sur) were higher than when using the other methods. Feature importance results when using only ego vehicle driving signals are shown in Fig. 12, in which the red bars represent the most influential driving signal for a particular driver. For most of the participants, the velocity, steering angle, and throttle acceleration signals most strongly evoked the perception of risk. Fig. 18 (Appendix-C) shows the feature importance of surrounding vehicle location in relation to subjective risk perception. These results show that surrounding vehicles located in Areas #3 and #2 were the most likely to evoke the perception of subjective risk. However, the ego vehicle driving signals and surrounding vehicle locations that were most influential differed among our 30 participants. This also supports our second hypothesis that differences in driving behavior are based on individual variation in risk perception. Note that these personalized feature risk importance values were applied as hyperparameters when training the parameters of the automated driving control methods used in this study.
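The binarization of Likert scores and the AUC metric mentioned above can be illustrated with a small, dependency-free sketch. The function names `binarize_risk` and `auc`, and the example scores and classifier outputs, are hypothetical; the classifier that produced the AUCs in Table 3 is not specified here, so only the metric itself is shown.

```python
def binarize_risk(likert_scores):
    """Label a lane change risky (1) only when its Likert score is 5."""
    return [1 if s == 5 else 0 for s in likert_scores]


def auc(labels, predicted):
    """AUC as the probability that a risky sample outranks a non-risky one.

    A tie between a risky and a non-risky prediction counts as one half.
    """
    pos = [p for y, p in zip(labels, predicted) if y == 1]
    neg = [p for y, p in zip(labels, predicted) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical Likert scores and classifier risk probabilities for five lane changes.
labels = binarize_risk([5, 2, 4, 5, 1])
score = auc(labels, [0.9, 0.2, 0.7, 0.6, 0.1])
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a dataset of this size (20 lane changes per participant); a rank-based implementation would be preferable for larger data.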

B. ESTIMATION OF COST FUNCTION
The execution costs for each of the 30 drivers when RSC generated the lane change in Scenario #1 are shown in Fig. 14. At first, the cost function is weighted by the terminal cost towards the goal. However, due to our proposed learning process, the cost function then begins to learn from the real-world driving behavior of the driver. As the ego vehicle approaches the preceding vehicle, the cost increases, so a collision is avoided. Since the feature importance of surrounding vehicle locations and ego vehicle driving behavior differs among individuals, the cost functions for each study participant also varied. These results also demonstrate that our proposed RSC framework is able to learn from the real-world driving of the user while simultaneously avoiding collisions with surrounding vehicles.

C. BASELINE METHODS
In order to validate our framework, conventional MPC and Personalized MPC [16] were selected as baseline methods.
1) Conventional MPC (MPC): As proposed in [33], we selected the parameters shown in Table 4 empirically, where diag(·) represents the minimum and maximum constraints of velocity and acceleration, and $d_{\min}$ represents the minimum distance to surrounding vehicles.
2) Personalized MPC (PMPC): As proposed in [16], the same parameters are used as with conventional MPC; however, driver preferences and constraints for velocity ($v$), acceleration or brake ($a$), steering angle ($\delta$), and minimum distance ($d_{\min}$) are extracted from each participant's data.
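To make the idea of extracting per-driver constraints from recorded data concrete, the sketch below derives bounds from a list of samples. It is only one plausible reading: taking the observed minimum and maximum of each signal is an assumption, not necessarily the procedure of [16], and the function name `extract_constraints` and the dictionary keys are hypothetical.

```python
def extract_constraints(samples):
    """Derive per-driver bounds from recorded driving samples.

    Each sample is a dict with hypothetical keys:
      'v'     velocity,
      'a'     normalized acceleration/brake in [-1, 1],
      'delta' normalized steering angle in [-1, 1],
      'dist'  distance to the nearest surrounding vehicle.
    Assumption: bounds are the observed extremes of each signal.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    bounds = {}
    for key in ("v", "a", "delta"):
        values = [s[key] for s in samples]
        bounds[key] = (min(values), max(values))
    # Smallest gap the driver ever accepted to a surrounding vehicle.
    bounds["d_min"] = min(s["dist"] for s in samples)
    return bounds


# Hypothetical samples from one participant's lane change.
recorded = [
    {"v": 30.0, "a": 0.2, "delta": -0.1, "dist": 12.0},
    {"v": 42.0, "a": -0.5, "delta": 0.3, "dist": 8.5},
    {"v": 36.0, "a": 0.6, "delta": 0.0, "dist": 15.0},
]
driver_bounds = extract_constraints(recorded)
```

In practice, percentile-based bounds might be preferred over raw extremes so that a single outlier sample does not widen the constraints.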

D. QUANTITATIVE EVALUATION
Quantitative evaluations were conducted as follows. First, we compared the driving behavior profiles generated using the baseline and proposed methods, then we compared the similarity of these driving profiles with the actual driving data of participants belonging to each of the various driver types to confirm personalization performance.

1) GENERATION OF DRIVING PROFILES
A comparison of the velocity, longitudinal acceleration, and steering wheel angle generated using MPC, PMPC, and RSC with the real driving data of Participant 28 (P#28), an elderly driver, is shown in Fig. 15. We can observe that after learning the control parameters, the RSC method most accurately reproduced the participant's actual driving data. To intuitively understand individual differences in the driving patterns of the participants, we generated histogram charts for the velocity $v$ [km/h] and steering angle $\delta$ [degree] data, shown in Fig. 19 (Appendix-D), for representatives of each type of driver during each of the five lane change scenarios. From the results of this visualization, we can make the following observations.
1) Expert drivers had a small range of variance in their driving data when navigating Scenarios #3, #4 and #5.
2) Elderly drivers exhibited large variances in driving data when navigating "safe" Scenario #1, "neither" Scenario #3, and "risky" Scenarios #4 and #5, i.e., their personal driving styles during each scenario were very different compared to the other scenarios.
3) Expert and young drivers exhibited greater similarity (i.e., consistency) in their driving styles, compared to the driving styles of normal or elderly drivers.
Since the driving data shown in Fig. 15 is from Scenario #3, which involves passing a preceding vehicle stopped in the ego vehicle's lane, we can observe the driver's actual acceleration and steering angle changing rapidly to avoid the stopped vehicle. MPC and PMPC, which are unable to learn from real driving data, did not perform as well as RSC in this scenario, since RSC can learn and thus could better imitate the driving style of a specific driver, as clearly shown in Fig. 15.

2) COMPARISON OF DRIVER STYLE AND DRIVER TYPES
a: SIMILARITY SCORE
For our comparison of human and system-generated driving styles, we used Kolmogorov-Smirnov distance (KS-distance) [36], [63] as our evaluation metric. A static error criterion (such as the MSE) is inadequate for evaluating similarity between a human driver and a driver model; therefore, KS-distance was used to quantify similarity in the shape and location of the probability distributions of velocity in real and simulated driving sequences. The average KS distance for an indicator was computed from the two-sample KS statistic:
$$D_{p,q} = \sup_{x} \left| F_p(x) - F_q(x) \right|$$
where $p$ and $q$ are the numbers of time steps in the real and simulated driving sequence, respectively (here $p = q$), and $F_p(x)$ and $F_q(x)$ are the empirical distribution functions of the indicator for the real and generated velocity sequences.
FIGURE 16. Averaged similarity scores of velocity sequences among different driver types for five lane change scenarios using proposed RSC and baseline PMPC modeling methods. In each scenario, RSC shows higher similarity to actual driving data, compared to PMPC.
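As an illustration of the metric, the following sketch computes the two-sample KS distance between a real and a generated velocity sequence as the largest gap between their empirical distribution functions. The function name `ks_distance` and the example sequences are hypothetical, and how the distances are subsequently averaged or converted into the similarity scores of Fig. 16 is not reproduced here.

```python
from bisect import bisect_right


def ks_distance(real, generated):
    """Two-sample KS distance: largest gap between the two empirical CDFs."""
    if not real or not generated:
        raise ValueError("both sequences must be non-empty")
    a, b = sorted(real), sorted(generated)
    gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the supremum.
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)  # fraction of real samples <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of generated samples <= x
        gap = max(gap, abs(f_a - f_b))
    return gap


# Hypothetical velocity sequences in km/h.
real_velocity = [30.0, 32.0, 35.0, 38.0]
generated_velocity = [31.0, 33.0, 35.0, 39.0]
distance = ks_distance(real_velocity, generated_velocity)
```

A distance of 0 means the two sequences have identical value distributions, while a distance of 1 means their value ranges do not overlap at all. Note that the metric ignores the time ordering of samples, which is why it complements rather than replaces a pointwise error such as the MSE.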

b: IDENTIFICATION OF DRIVER TYPE
In this experiment, we calculated the similarity scores between the velocity sequences generated for each driver when using the baseline PMPC method, and the proposed RSC method, and averaged it with the results for the driver category to which the test participant belongs. This comparison was conducted for each lane change scenario. From the experimental results shown in Fig. 16, we can conclude that the velocity generated by our proposed RSC model is closer to the average distance for the corresponding driver type, compared to PMPC. It was also observed that the proposed RSC model reproduced the driving styles of each type of driver.

E. QUALITATIVE EVALUATION
Our third hypothesis (hyp3) is that individual differences in influential driving factors can be applied to generate safe and personalized driving commands. The results of our subjective risk score comparison for the lane change scenes generated using the PMPC and RSC models, based on study participant risk level feedback, are shown in Fig. 17. For 28 of 30 drivers, the RSC-generated lane changes received lower subjective risk scores than the PMPC-generated lane changes. To validate our proposed framework statistically, we tested the decrease in subjective risk level for all drivers against a null mean of zero using a one-sample t-test (α = .05). The resulting p-value was 2.74e-05, which indicates that the RSC-generated lane changes were judged to have significantly lower subjective risk levels than those generated using PMPC.
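The one-sample t-test on per-driver risk decreases can be sketched as follows. This is a minimal illustration that computes only the t statistic and its degrees of freedom; the function name `one_sample_t` and the example differences are hypothetical and are not the study's data. Obtaining the p-value additionally requires the Student t distribution, which is left out to keep the sketch dependency-free.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(diffs, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean(diffs) == mu0."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two observations")
    s = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("zero variance: t statistic is undefined")
    standard_error = s / sqrt(n)
    return (mean(diffs) - mu0) / standard_error, n - 1


# Hypothetical per-driver decreases in risk score (PMPC score minus RSC score).
t_stat, dof = one_sample_t([1, 0, 2, 1, 1])
```

A large positive t statistic indicates that the mean decrease is far from zero relative to its standard error, which is the pattern behind the small p-value reported above.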

VI. DISCUSSION AND LIMITATION
To evaluate our first hypothesis (hyp1), that driving behavior varies among drivers even when driving in the same environments, we set up a null hypothesis $H_0$, where driving behavior includes velocity $v$, acceleration $a$, and steering angle $\delta$. In order to validate individual differences in driving behavior, we conducted an unpaired t-test comparing velocity, acceleration, and steering angle for all participants. Although we did not observe individual differences (p-value > 0.05) in driving signals $v$ and $\delta$ during Scenario #2 lane changes, the results of this unpaired t-test showed that in most of the other lane change scenarios, differences among the 30 drivers were observed at a significant level (α = 0.05). Therefore, we can validate our hypothesis that driving behavior varies among different drivers when driving in the same environments (in this case, our predefined lane change scenarios). Regarding our second hypothesis (hyp2), based on the various reasons given by our study participants for intervening in machine-generated lane changes (Fig. 11), as well as differences in their rankings of the risk levels of the lane change scenarios (Fig. 13), it seems likely that the source of these differences in driving behavior is individual variation in the perception of risk. This is validated in Table 5, which compares driver intervention behavior. Our third hypothesis (hyp3), that individual differences in the factors which evoke the perception of risk can be applied to generate personalized driving commands, was statistically tested by applying subjective risk modeling in our proposed RSC method, resulting in significantly lower subjective risk level assessments compared to PMPC.
There are two main limitations to this study. First, participants could not observe surrounding vehicles located directly behind the ego vehicle (i.e., in Area #5 in Fig. 18) during driving data collection, due to the front-view camera setting in the CARLA simulation system. This meant our analysis of which surrounding vehicle locations were most influential when drivers were assessing risk levels could not reflect realistic driving situations which involve following vehicles. Second, driver interventions still occurred during RSC-generated lane changes, because some risky driving situations, such as those in lane change Scenarios #4 and #5, still evoked an uncomfortable perception of danger in some of our participants. This suggests that generating comfortable lane change maneuvers requires predicting the behavior of surrounding vehicles, especially in risky scenarios. These two limitations strongly suggest that the ability to predict the behavior of surrounding vehicles is crucial for the accurate prediction of risk perception by drivers.

VII. CONCLUSION
In this study, we proposed a data-driven method of autonomous vehicle control which combined subjective risk models and predictive controllers to generate personalized lane change maneuvers. Our experimental results demonstrate that the proposed Risk-Sensitive Control method is able to generate personalized lane change sequences that elicit lower levels of subjective risk in drivers, and achieve higher similarity to the driving style of a specific driver, in comparison to existing AV control methods. Our subjective risk model assigns personalized weights to its hyperparameters, which include ego-vehicle driving signals and surrounding vehicle locations, in order to fuse sensitivity to risk with preferred motion patterns during lane changes. Our Risk Sensitive Control approach, which is an inverse optimal control framework, uses a meta-learning algorithm to update the parameters of the cost function, based on an individual's real-world, real-time driving data. By updating personalized cost functions, the control system is able to learn and reproduce a driving style that reduces the level of subjective risk perceived by individual drivers. To model the personalized driving sequences of the control frameworks, we compiled an original dataset ('Subjective-Risk-Lane-Change-Dataset') which includes driving behavior data, surrounding vehicle location data, subjective risk scores, and demographic information for 30 drivers. Qualitative results show that our study participants reported significantly lower subjective risk scores when lane change maneuvers were generated using our proposed method. Quantitative results demonstrate that our model outperforms other control frameworks in terms of similarity scores, demonstrating that our model is able to learn personalized driving styles from collected data. Therefore, our proposed model achieves data-driven control based on both personalization and subjective risk sensitivity.
FIGURE 19. Histogram graphs of velocity and steering angle from our data collected during the five lane change scenarios. The starting position is fixed and the surrounding vehicle information is predefined (i.e., surrounding vehicles always drive in the same manner). Obvious differences in driving behavior among drivers who are Young, Normal, Elderly and Expert, including velocity and steering angle/lane change trajectories, can be observed.
Scenarios with interventions (during PMPC and RSC) and the most influential subjective risk factors for each of our 30 participants (as given in the questionnaires) are listed in Table 5.

C. SURROUNDING VEHICLE LOCATION IMPORTANCE
A ranking of the importance of surrounding vehicle locations, with regard to risk perception, for each of our participants is shown in Fig. 18.

D. VELOCITY AND STEERING ANGLE DATA FOR EACH DRIVER TYPE DURING EACH LANE CHANGE SCENARIO
Histogram graphs of velocity and steering angle for different driver types during the five lane change scenarios are shown in Fig. 19. The graphs show representative data for each driver type.