Predictive Lane Change Decision Making Using Bidirectional Long Short-Term Memory for Autonomous Driving on Highways

This paper presents a lane change decision algorithm for predictive decision-making for an autonomous vehicle using a Recurrent Neural Network (RNN) with a Bidirectional Long Short-Term Memory (Bi-LSTM) cell. The proposed decision-making algorithm was trained and validated by driving data collected by vision, laser scanners, and chassis sensors of autonomous vehicles. The input features for the Bi-LSTM based RNN consist of the clearance and relative velocity with the surrounding target vehicles, lane measurements, and the velocity of the autonomous vehicle. The output features are configured to generate the probability of three maneuvers, left lane change, right lane change, and lane-keeping. The Bi-LSTM based RNN is configured to decide in advance two seconds before lane changes by using two seconds of observation. The collected 20,108 datasets were accumulated in global coordinates. After processing and resampling the collected datasets, 1,120, 320, and 160 datasets were generated to train, validate, and test the Bi-LSTM based RNN. The proposed algorithm was evaluated by a case study and a driving data-based prediction accuracy analysis. The results of the predictive lane change decision by the proposed algorithm have been shown to be more accurate and similar to a driver than previous approaches.


I. INTRODUCTION
The development and mass production of driver assistance systems have played an important role in improving driver comfort, convenience, and safety. The Advanced Driver Assistant System (ADAS) has been applied to various vehicle models from many manufacturers, showing a high sales rate. The success of the ADAS, which is a level 2 autonomous system, led to extensive research on autonomous driving to develop a level 3 or higher-level Autonomous Vehicle (AV). Generally, the field of autonomous driving research is classified into perception, localization, motion planning, and control [1]. These functions are implemented sequentially in an AV to achieve autonomous driving in the real world. Perception and localization focus on the processing of measurements from sensors such as radar, LiDAR, or a camera. The control function generates actuator inputs to achieve the desired behavior determined by the motion planning algorithm. Therefore, the performance of the sensor and actuator is the most important consideration for designing the perception and control algorithm, respectively. However, the motion planning algorithm of an AV requires human-like behavior based on the intention inference of surrounding traffic participants and the situation awareness of the driving environment [2], [3]. Therefore, motion planning covers not only generating the desired path and velocity, but also the decision-making process for selecting the target maneuver. This is why a part of motion planning is classified as task planning for the AV. For example, in the case of a lane change, determining the possibility and necessity of a lane change can be considered as decision making [3], and generating a desired path and velocity to change the lane can be regarded as motion planning in a narrow sense [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Kathiravan Srinivasan.
Among the various functions of ADAS and AVs, a lane change function is the next step after lane following and distance control with the preceding vehicle. However, lane change requires decision making for more complex driving scenarios with many considerations, such as surrounding
VOLUME 9, 2021
The predictive lane change decision can enhance the driving safety of autonomous driving when performing a lane change. In addition, the proposed algorithm can improve the acceptance of the autonomous vehicle by passengers and traffic participants, because it can secure the preparation time before initiating a lane change. Therefore, the lane change can be performed more smoothly, and sufficient guidance can be provided to passengers. This is particularly effective for level 3 or higher systems, where the responsibility of the system increases.
Because a level 3 or higher-level autonomous driving system bears more responsibility for the driving task than a driver assistance system such as lane change assist, its lane change function requires a more accurate and predictive lane change decision to secure the preparation time needed to match the ego vehicle's position and speed to the traffic flow of the target lane before performing the lane change. The proposed algorithm in this study is applicable to autonomous driving systems regardless of the level. However, it is more effective for a higher-level autonomous system, which should perform skillful lane changes to improve the acceptance of autonomous driving by drivers and traffic participants. Therefore, the proposed algorithm is trained using driving data collected by the environment sensors on the AV and designed to predict the lane change timing based on the observation of driving conditions. The performance of the predictive lane change decision algorithm is validated by the driving data-based analysis.

II. RELATED WORKS
Decision-making by AVs should increase the acceptance of autonomous driving by drivers and traffic participants. Researchers have introduced various approaches to design the decision-making algorithm for lane change based on measurements from the environment sensors and communication, such as Vehicle to Vehicle (V2V) or Vehicle to Infrastructure (V2I) [5]. Previous studies on lane change decision-making can be classified into three categories: (1) rule-based, (2) model-based, and (3) learning-based approaches.
Various rules have been proposed to design lane change decision algorithms. Generally, the clearance and relative velocity with respect to a preceding vehicle and side lane targets have been used frequently to configure the rules for lane change decisions, and various methods have been used to improve these rules. Mixed logical dynamical system modeling was introduced to combine logical rules and physical laws [6]. In that study, safety constraints were expressed as linear inequalities. A utility function to evaluate the lane change possibility has been defined based on the average travel time, average time gap density, and remaining travel time of the surrounding vehicles [7]. In addition, game theory has been utilized in lane change decisions with a pay-off matrix, which divides the longitudinal motion into acceleration and deceleration [8]. To quantify the lane change desire, incentives for speed, route, and keeping right were evaluated by integrating with an Intelligent Driver Model (IDM), and the lane change decision was made based on the incentives [9]. Another approach subdivided lane changes and set rules for each type. Lane change behaviors were categorized into five cases, and the lateral displacement of an AV for each case was approximated as a fifth-order polynomial. A lane change decision was made by comparing the polynomial and the physical threshold of the lane change maneuver [10], [11]. Furthermore, to consider side lane vehicles inside the blind spots, V2V communication was utilized in the lane change decision algorithm [12].
For model-based approaches, the driving characteristics of human drivers have been modeled to define the reference model for lane change decisions. Initially, a lane change model based on gap acceptance was proposed, and a forced merging model was used for traffic jam situations where acceptable gaps are hard to find [13]. Similarly, a gap acceptance model with the likelihood function of lane changing actions was used to model the execution of lane changes [14]. To respond to driving situations that are difficult to distinguish with Boolean logic, a fuzzy rule-based lane changing decision model was introduced. A simple binary decision was designed based on a fuzzy inference system [15]. More complex fuzzy rules considering the space gap with front and side lane targets and the average speed in the current lane were proposed to cover lane changes to both slower and faster lanes [16]. Furthermore, an adaptive model was proposed to reflect the different driving characteristics of drivers. A Gaussian mixture model was used to adjust the parameters of the sinusoidal lane change model, such as the time needed to complete the lane change [17]. Stochastic model predictive control was used to incorporate the uncertainty of the predicted time-to-collision and safety distance with surrounding targets into the lane change decision [18].
With the development of processors, learning-based approaches have been utilized for lane change decision problems. Because the lane change decision can be considered as a binary classification problem, a Support Vector Machine (SVM) was trained to learn whether to maintain lane-keeping or start a lane change [19]. To enhance the performance of the SVM, Bayesian parameter optimization was applied to the SVM [20]. A Bayes classifier and decision trees were combined to increase the decision accuracy using a majority voting principle [21]. To consider the temporal dependency of the vehicle behavior and traffic situation, a dynamic Bayesian network was introduced for the situation assessment of the lane change. This network estimated the beliefs about the driving situations [22]. A random decision forest has also been used to estimate the probabilities of lane following, left lane change, and right lane change [23], [24]. However, driving data collected by sensors contain noise, and some information is not measurable. To compensate for this problem, a partially observable Markov decision process was used for lane change decision-making in urban traffic [25]. An end-to-end learning approach was utilized to model the relationships between the rear-side view images and the driving situations, which were classified as blocked, free, and undefined [26].
With the advance of simulation environments for autonomous driving, the Reinforcement Learning (RL) method has been introduced to design the decision-making algorithm without labeled data. Multi-kernel least-squares policy iteration was used to reduce the parameter complexity of the lane change problem [27]. A Deep Q-learning algorithm was also used to design the lane change decision algorithm. The Deep Q Network (DQN) with real-time validation was designed based on the assumption that the ego vehicle states are fully observable [28]. This approach was adopted to solve the lane change decision for a truck-trailer combination [29]. In addition, a DQN combined with rule-based constraints was applied to integrate high-level decision-making and low-level trajectory generation [30]. Moreover, the Q-masking technique was applied to the DQN to integrate high-level policy and low-level control by forcing the agent to explore a subspace of Q-values [31]. An integrated approach between an RL-based lateral controller and an IDM-based longitudinal controller was proposed. However, this approach was initiated based on the results of a gap-selection module, which was frequently used to define the lane change decision rule [32].
According to a careful review of the literature, various approaches have been proposed to develop a decision-making algorithm for lane-change maneuvers. Many studies have formulated lane change decision-making as a binary classification problem [7], [9], [10], [12], [15], [16], [18], [21], [24], [25]. This means that the target direction of the lane change is already determined by the global path planning or road structure. To cover various driving situations, the number of maneuver candidates has been increased [6], [8], [11], [13], [14], [17], [22], [24], [26]-[32], and probabilistic concepts have been introduced [14], [17], [18], [21]-[23], [25].
Some studies used the Next-Generation Simulation (NGSIM) datasets to train and validate the learning-based approaches [21], [24]. Because the NGSIM datasets were generated from images from cameras installed in the infrastructure, the characteristics of the target states in these datasets differ from those of target states measured by on-board sensors. Therefore, an algorithm based on external datasets might be challenging to implement on AVs. When learning is carried out using a traffic simulator, the performance of the developed algorithm depends on how faithfully the simulator reproduces the real world. In addition, previous studies have focused on lane change decisions at the time of the sensor measurement. The differences between the proposed algorithm and previous learning-based approaches can be summarized in three points: (1) there is a time difference between the input and output data; (2) the proposed algorithm uses both surrounding vehicle and road shape information; and (3) the inputs can be obtained from current autonomous vehicles.

III. OVERALL ARCHITECTURE OF THE BI-LSTM BASED LANE CHANGE DECISION MAKING
The overall architecture of the proposed algorithm with the input and output is described in Fig. 1.

IV. DATA COLLECTION
The first step in applying the learning-based methodology is to collect data appropriate for the target driving scenario. In this study, the learning and validation of the lane change decision-making algorithm were conducted using directly collected data instead of open-source data, such as NGSIM or Argoverse. The highway driving data of the ego vehicle and surrounding targets were collected with a data collection vehicle, which was the AV. Training was therefore performed using only information measurable by the sensors of the AV, so that the proposed algorithm can be directly applied to the autonomous driving system. The data were collected by manual driving of the AV to reflect the human factor in lane change decisions. In addition, the information of the data collection vehicle can be acquired synchronously with the sensor measurements, which means that the interaction between the ego vehicle and the surrounding vehicles can be considered in the training process of the network. The details of the data collection process are described in the following sub-sections.

A. TARGET ROADS
Lane changes to the left and right lanes should be considered as different driving tasks of autonomous driving because the possibility of each lane change maneuver differs considerably depending on the driving conditions. For example, the average speed of the left lane is generally higher than that of the right lane in countries with right-hand traffic. This means that overtaking a preceding vehicle is frequently performed by changing to the left lane. Meanwhile, a lane change to the right lane could be safer than one to the left lane because vehicles driving in the right lane are generally slower than the ego vehicle, making it easier to match their velocity by deceleration.
Based on these considerations, the driving data were collected on expressways in Seoul and Gyeonggi-do in South Korea. Fig. 2 depicts the route of the data collection road with a satellite map and the global trajectories acquired by DGPS. The data collection road is a multi-lane road with three or more lanes connecting Seoul and a satellite city. Therefore, this section of the expressway has more traffic and more frequent lane changes than expressways in rural areas. The data collection road in the capital region consists of the 1st Ring Expressway, the 2nd Gyeongin Expressway, and the Seohaean Expressway, which are connected by three junctions: Anhyeon, Iljik, and Jonam.

B. DATA COLLECTION VEHICLE
Data collection was performed with a data collection vehicle, which was configured to develop the autonomous driving system. The sensors, processors, and actuators of the data collection vehicle are summarized in Fig. 3 with the sensor Field of View (FOV). The major perception sensor to detect the surrounding vehicles is a laser scanner system, which consists of six ibeo.LUX LiDARs from Ibeo Automotive Systems GmbH and a LiDAR processor to cover the 360-degree FOV around the data collection vehicle. This LiDAR-based object detection system provides the position, heading, and velocity in the local coordinate system with class information such as passenger car or truck/bus. The sampling rate and detection range of the LiDAR are 25 Hz and 100 m, respectively. As a front camera, Mobileye Q3 is used to detect the lane markers in the form of a lateral offset, heading angle, curvature, and quality level. The lane detection quality is evaluated from 0 to 3 for each lane marker.
To accumulate the driving trajectories in the global coordinates, OxTS RT3002 is used as a Real-Time Kinematic Differential Global Positioning System (RTK-DGPS) to accurately measure the latitude, longitude, and altitude with a 0.02 m accuracy. In GPS-shaded sections such as tunnels, the global position was estimated using a localization algorithm with a high-definition digital map, images from the Around View Monitoring (AVM) camera, and static obstacles detected by LiDAR. The measurements from the chassis sensors, such as the wheel speed, steering wheel angle, and yaw-rate sensors, are collected using a gateway Electronic Control Unit (ECU) and a Controller Area Network (CAN) to USB interface device. The collected data from the LiDAR, camera, AVM, RTK-DGPS, and chassis sensors are stored in an industrial PC with a timestamp for each measurement step. A Micro-Autobox II is used to implement the lower-level control algorithm to actuate the Motor Driven Power Steering (MDPS) and Smart Cruise Control (SCC) system, which are deactivated during data collection by manual driving.

C. DATASET GENERATION
The driving data were collected by manually driving the data collection vehicle on the target road discussed in the previous sections. Before using the driving data to train the Bi-LSTM based lane change decision algorithm, it is necessary to convert the driving trajectories to datasets suitable for lane change decision problems. First, the expressions related to the data are defined. In this study, the accumulated time series array of the sensor measurements and lane change flags, which are labeled by offline data processing, is defined as a sequence. The accumulated sensor measurements and lane change flags of a specific length become the input and output sequences, respectively. Therefore, in learning and evaluation, both input and output sequences are used. A group of one input sequence and the corresponding output sequence is defined as a dataset. When the proposed algorithm is applied to vehicles, lane change decisions are made using the input sequence received from the sensors.
As mentioned in Section IV.A, the target road is a multi-lane expressway with three or more driving lanes. Because the proposed lane change decision algorithm treats the left and right lane changes as different maneuvers, we selected the driving data with both left and right driving lanes available. In addition, the driving data with a lane quality level of 3 were selected to clearly define the criteria for a lane change. However, the detection quality can fall below 3 in some driving situations, for example when the lane paint is poor or backlighting occurs. In this case, a High-Definition (HD) map can provide redundancy for the lane marking detection. If an HD map is not available, the virtual lane marking can be estimated based on the lateral motion of the ego vehicle using the steering wheel angle, yaw rate, and lateral acceleration. These two methods allow the proposed algorithm to maintain performance when used in the vehicular system. Therefore, if the proposed algorithm is trained with high-quality lane measurements, it can be applied to various driving conditions.
The selected positions of the ego vehicle and surrounding vehicles and the lane markers in the local coordinate system were transformed into the global coordinate system using the global position from the RTK-DGPS. Because the time stamp of each data sample was assigned during the data collection process, the entire trajectories accumulated in the global coordinate system can be divided into segments of the desired length. In this study, the observation horizon is defined as the length of the data to be accumulated for the lane change decisions, and the time until the future time point at which the lane change is performed is defined as the prediction horizon. In other words, the proposed algorithm determines whether to change lanes after the prediction horizon based on the data during the observation horizon. The dataset for training and validating the Bi-LSTM based RNN is generated by segmenting the vehicle trajectories, lane markers, and corresponding vehicle states.
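As an illustration of the local-to-global transformation described above, the following minimal sketch rotates a point from the ego frame by the ego heading and translates it by the ego position. It assumes a planar global frame (for example, a metric projection of the RTK-DGPS latitude and longitude) and a known ego yaw angle; the function name `local_to_global` and the numerical values are hypothetical and not taken from the data processing used in this study.

```python
import math

def local_to_global(x_local, y_local, ego_x, ego_y, ego_yaw):
    """Map a point from the ego (local) frame to a planar global frame.

    Assumes a right-handed local frame with x pointing forward and y to the
    left, and ego_yaw measured counter-clockwise from the global x-axis.
    """
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    x_global = ego_x + c * x_local - s * y_local
    y_global = ego_y + s * x_local + c * y_local
    return x_global, y_global

# Hypothetical example: a target 10 m ahead and 2 m to the left of an ego
# vehicle located at (100, 50) and heading along the global +y axis.
gx, gy = local_to_global(10.0, 2.0, ego_x=100.0, ego_y=50.0, ego_yaw=math.pi / 2)
print(round(gx, 3), round(gy, 3))
```

With the ego vehicle heading along the global y-axis, "ahead" maps to increasing y and "left" maps to decreasing x, which is a quick sanity check for the sign convention.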
From the collected driving data, a total of 20,108 datasets were generated when applying an observation horizon of 20 steps and a prediction horizon of 20 steps. To determine the observation and prediction horizons for the Bi-LSTM based RNN, six candidates of 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0 s with a sampling time of 100 ms were considered for each horizon, and a total of 36 combinations of observation and prediction horizons were compared. Based on the accuracy comparison of the candidates, 20 steps (2.0 s) were selected for both the observation and prediction horizons, comprehensively considering prediction accuracy, computational load, and prediction time.
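The segmentation into observation and prediction windows can be sketched as follows. This is a simplified illustration, not the processing pipeline of this study: it assumes that windows are extracted with a stride of one sample and that the output sequence holds the lane change flags over the prediction window. The names `to_steps` and `segment` and the toy trajectory are hypothetical.

```python
import numpy as np

SAMPLE_TIME_S = 0.1  # 100 ms sampling time stated in the text
HORIZON_CANDIDATES_S = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

def to_steps(horizon_s, dt=SAMPLE_TIME_S):
    """Convert a horizon in seconds to a number of sampling steps."""
    return int(round(horizon_s / dt))

def segment(features, flags, obs_steps, pred_steps):
    """Pair each observation window with the flags of the window that follows it.

    features: array (T, num_features); flags: array (T,).
    A stride of one sample is an assumption made for this sketch.
    """
    inputs, outputs = [], []
    last_start = len(features) - obs_steps - pred_steps
    for start in range(last_start + 1):
        split = start + obs_steps
        inputs.append(features[start:split])
        outputs.append(flags[split:split + pred_steps])
    return np.array(inputs), np.array(outputs)

# All observation/prediction horizon combinations, expressed in steps.
combos = [(to_steps(o), to_steps(p))
          for o in HORIZON_CANDIDATES_S for p in HORIZON_CANDIDATES_S]

# Toy trajectory: 100 samples with 17 features and an all-zero flag.
features = np.zeros((100, 17))
flags = np.zeros(100, dtype=int)
X, Y = segment(features, flags, to_steps(2.0), to_steps(2.0))
print(len(combos), X.shape, Y.shape)
```

Each row of `X` is one input sequence of 20 steps, and the matching row of `Y` is the output sequence for the following 20 steps.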
The generated datasets are resampled to improve the training performance because biased training data cause overfitting to a specific maneuver class. After the resampling, the numbers of lane-keeping and lane change datasets are balanced. Eight hundred datasets were prepared for the lane-keeping maneuver, while 400 datasets were prepared for each of the left and right lane change maneuvers. The resampled datasets were divided into 70% for training, 20% for validation, and 10% for testing. In other words, 1,120, 320, and 160 datasets were used for training, validating, and testing the proposed Bi-LSTM based RNN, respectively.
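The resampling and split arithmetic can be checked with a short sketch. It assumes random undersampling without replacement and a per-class (stratified) 70/20/10 split; neither detail is specified in the text, and the pool sizes below are toy values rather than the actual class counts of the collected data.

```python
import random

def resample_and_split(indices_by_class, targets, fractions=(0.7, 0.2, 0.1), seed=0):
    """Undersample each class to its target size, then split it into train/val/test."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, n_target in targets.items():
        # Random undersampling without replacement (an assumption of this sketch).
        chosen = rng.sample(indices_by_class[label], n_target)
        n_train = round(fractions[0] * n_target)
        n_val = round(fractions[1] * n_target)
        splits["train"] += [(label, i) for i in chosen[:n_train]]
        splits["val"] += [(label, i) for i in chosen[n_train:n_train + n_val]]
        splits["test"] += [(label, i) for i in chosen[n_train + n_val:]]
    return splits

# Toy index pools; the real class counts are not given in the text.
pools = {"LK": list(range(1000)), "LCL": list(range(500)), "LCR": list(range(500))}
targets = {"LK": 800, "LCL": 400, "LCR": 400}  # class sizes stated in the text
splits = resample_and_split(pools, targets)
print({name: len(items) for name, items in splits.items()})
# {'train': 1120, 'val': 320, 'test': 160}
```

The totals reproduce the 1,120, 320, and 160 datasets reported above for the 1,600 resampled datasets.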

V. BI-LSTM BASED LANE CHANGE DECISION
The lane change decision algorithm was designed based on the RNN with Bi-LSTM cells. Conventional approaches have focused on the lane change decision at the current moment based on the states of the surrounding vehicles. In some studies, a prediction algorithm for the behavior of the surrounding vehicles was introduced to improve the performance of the situation awareness and lane change decision. However, previous studies configured the prediction for the surrounding vehicles and the lane change decision function separately, so a process for integrating the algorithms is required. Even if each individual algorithm performs well, it is difficult to guarantee the performance of the integrated algorithm. This study introduces the Bi-LSTM based RNN to consider the prediction of the surrounding targets and the lane change decision simultaneously. The details of the input features, output features, structure, and training process of the network are presented in the following sections.

A. NETWORK INPUT AND OUTPUT WITH THE DATA ENCODER
As mentioned in Section IV.B, the input data are acquired from the LiDAR, front camera, and chassis sensors of the data collection vehicle. Because the objective of the proposed algorithm is a maneuvering decision between lane keeping and lane changing, vehicle states from the LiDAR system are processed to derive suitable parameters for the lane change decision. The states of the target vehicle consist of the relative x position, y position, heading angle, and velocity with respect to the local coordinate of the data collection vehicle. The local coordinate system is the right-handed coordinate system, which has an origin in the center of the front bumper, and the x-axis is aligned toward the driving direction. Therefore, the clearance between the subject vehicle and the surrounding targets is defined as shown in Fig. 4. When calculating the clearance, the lane information from the front camera is considered to compensate for the curvature of the road.
As a result, the input features of the network at each time step are $x_k$, and the input sequence during an observation horizon is $X_k$, given by the following equations. In summary, information from the classified targets, ego vehicle, and lane markers is used as the input features shown in Fig. 5, which describes the conceptual diagram of the proposed network. The scales of the input features are different, which degrades the performance of neural network training based on the back-propagation method. The data encoder therefore standardizes each input feature so that its mean $\mu$ and standard deviation $\sigma$ become 0 and 1, respectively. The parameters $\mu_i$ and $\sigma_i$ for standardization were derived from the training datasets and stored for use when testing and implementing the proposed algorithm. In this study, 17 pairs of $\mu_i$ and $\sigma_i$ were determined based on the training dataset. The standardization is performed as follows:

$$z_{i,k} = \frac{x_{i,k} - \mu_i}{\sigma_i}$$

where $x_{i,k}$ and $z_{i,k}$ are the input and standardized input of the i-th input feature at the k-th sampling step, and $\mu_i$ and $\sigma_i$ are the mean and standard deviation from the training datasets. The outputs of the Bi-LSTM based RNN are the probabilities of Lane-Keeping (LK), Lane Change to Left (LCL), and Lane Change to Right (LCR), which are denoted as $P_{LK}$, $P_{LCL}$, and $P_{LCR}$. The output sequences are given by the following equation.
Here, the index n denotes the maneuver candidates LK, LCL, and LCR. The index k is the prediction step from 1 to 20, and $a$ is the exponential decay rate used to give more weight to future outputs. Because the objective of the proposed method is to decide the lane change as early as possible, higher weights were assigned to outputs further in the future. In addition, the probabilities of the three maneuvers in the near future are similar because the near-future motion of the vehicle is almost the same as the current state. Therefore, the weights of the future outputs were increased to improve the performance of the predictive lane change decision. The maneuver with the largest calculated probability is output as the final flag of the maneuver decision module.
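The data encoder and the weighted maneuver decision described in this section can be sketched together as follows. The standardization follows the definition given above. The weighting, however, is only an assumed form: the weight $w_k = e^{-a(p-k)}$, normalized over the $p$ prediction steps, is one simple way to give more weight to later outputs, and the value $a = 0.1$, the function names, and the synthetic data are hypothetical rather than taken from this study.

```python
import numpy as np

MANEUVERS = ("LK", "LCL", "LCR")

def fit_encoder(train_inputs):
    """Per-feature mean and standard deviation over all training steps.

    train_inputs: array (num_datasets, steps, num_features).
    """
    flat = train_inputs.reshape(-1, train_inputs.shape[-1])
    return flat.mean(axis=0), flat.std(axis=0)

def encode(inputs, mu, sigma):
    """Standardize feature by feature: z = (x - mu) / sigma."""
    return (inputs - mu) / sigma

def decide(prob_seq, a=0.1):
    """Weighted maneuver decision over a predicted probability sequence.

    prob_seq: array (p, 3) of per-step probabilities for LK, LCL, LCR.
    The exponential weight below is an assumed form, largest at k = p.
    """
    p = prob_seq.shape[0]
    k = np.arange(1, p + 1)
    w = np.exp(-a * (p - k))
    w /= w.sum()
    weighted = w @ prob_seq
    return MANEUVERS[int(np.argmax(weighted))], weighted

# Synthetic stand-in for 1,120 training input sequences with 17 features.
rng = np.random.default_rng(0)
train_inputs = rng.normal(loc=5.0, scale=3.0, size=(1120, 20, 17))
mu, sigma = fit_encoder(train_inputs)
z = encode(train_inputs, mu, sigma)

# Toy network output: LK dominates early, LCL dominates in the later steps.
prob_seq = np.array([[0.8, 0.1, 0.1]] * 10 + [[0.3, 0.6, 0.1]] * 10)
maneuver, weighted = decide(prob_seq, a=0.1)
print(mu.shape, maneuver)
```

In this toy sequence, uniform weighting (`a=0.0`) would select LK, whereas emphasizing the later steps selects LCL, which illustrates why weighting future outputs favors an earlier lane change decision.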

B. NETWORK STRUCTURE
The conceptual and unrolled diagrams of the Bi-LSTM based RNN structure are described in Figs. 5 and 6 with the observation horizon h and prediction horizon p. Because vehicle behavior is a temporal dynamic behavior, the RNN is a suitable neural network to learn the time-dependent characteristics and to consider the historical information. The RNN enables previous outputs to be used as inputs, as shown in Fig. 6. The recurrent structure of the RNN reduces the number of weights and biases of the neural network. In addition, because the RNN can process an input sequence of any length, the output can be determined even before all observations have been made. It is an important characteristic that the length of the input sequence is not limited because, unlike images, the number of detected surrounding vehicles is continuously changing [33].
However, a vanishing or exploding gradient problem often occurs when trying to capture long-term dependencies. This is because a multiplicative gradient can decrease or increase exponentially with respect to the number of layers. The LSTM was introduced to avoid the problem of vanishing gradients [34]. Furthermore, not only the past states but also the future states within the input sequence carry information on the vehicle behaviors. To utilize both and increase the amount of input information, a bidirectional structure is introduced to connect the hidden layers of opposite directions, as shown in the unrolled diagram of Fig. 6. The updating equations of the Bi-LSTM can be summarized as follows.
Here, LSTM(·) denotes the LSTM update process, which is described in the Bi-LSTM layer block of Fig. 5. The outputs of the forward and backward LSTM layers are denoted as $h_{f,k}$ and $h_{b,k}$, respectively. $W_{f,y}$ and $W_{b,y}$ represent the weights of the forward and backward LSTM layers, and $b_y$ is the bias at the output layer. An important issue when applying a neural network to a specific problem is selecting hyperparameters, such as the number of layers, hidden units, and layer types. The structure of the proposed RNN is described in Fig. 7, which was determined by comparing the training accuracies of the candidates. The proposed Bi-LSTM based RNN consists of a combination of the Bi-LSTM layer, a fully connected layer, and the SoftMax layer. Each layer of the RNN is expressed in a different color, and the number of hidden units in each layer is shown together.
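For reference, a common way to write a bidirectional LSTM update with the symbols defined above is sketched below. This is the generic textbook form and not necessarily the exact formulation used in this study: the arguments of LSTM(·) (the input $x_k$ and the neighboring hidden state) and the output symbol $y_k$ are assumptions.

```latex
\begin{aligned}
h_{f,k} &= \mathrm{LSTM}\left(x_k,\; h_{f,k-1}\right), \\
h_{b,k} &= \mathrm{LSTM}\left(x_k,\; h_{b,k+1}\right), \\
y_k     &= W_{f,y}\, h_{f,k} + W_{b,y}\, h_{b,k} + b_y .
\end{aligned}
```

The forward layer propagates information from earlier to later steps, the backward layer propagates it in the opposite direction, and the output layer combines both hidden states at each step.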

C. NETWORK TRAINING
The proposed Bi-LSTM based RNN was trained by the Adam algorithm, which stands for adaptive moment estimation [35]. Element-wise moving averages of the parameter gradients, $m_i$, and of their squared values, $v_i$, are updated as follows:

$$m_i = \beta_1 m_{i-1} + (1 - \beta_1)\,\nabla E(\theta_i)$$
$$v_i = \beta_2 v_{i-1} + (1 - \beta_2)\,\left[\nabla E(\theta_i)\right]^2$$

Here, i is the iteration number of the training, and $\theta_i$ is the parameter vector. $E(\theta_i)$ is the loss function, and $\nabla E(\theta_i)$ is the gradient of the loss function. $\beta_1$ is the gradient decay factor, and $\beta_2$ is the squared gradient decay factor. In this study, the values of $\beta_1$ and $\beta_2$ are 0.90 and 0.999, respectively. The network parameter is updated as follows:

$$\theta_{i+1} = \theta_i - \frac{\alpha\, m_i}{\sqrt{v_i} + \varepsilon}$$

where $\varepsilon = 10^{-8}$ is used to avoid the singular case when $v_i$ becomes too small, and $\alpha$ is the learning rate. The initial value of $\alpha$ is 0.005 with a learning rate drop factor of 0.2 after 10 iterations. A maximum of 100 epochs and a mini-batch size of 20 are used to train the proposed neural network described in Fig. 7. The history of the accuracy and loss of the training process of the proposed Bi-LSTM based RNN using the training datasets is summarized in Fig. 8. As can be seen in Fig. 8, the accuracy and loss reach final values of 99.3% and 0.048, respectively.
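The Adam update with the hyperparameters stated above can be sketched in a few lines. The sketch assumes the variant without bias correction and interprets the learning rate drop as a piecewise-constant schedule that multiplies the rate by 0.2 every 10 steps; both are assumptions, and the quadratic toy loss, `adam_step`, and `learning_rate` are hypothetical names introduced only for illustration.

```python
import numpy as np

def adam_step(theta, grad, m, v, alpha=0.005, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (no bias correction, assumed for this sketch)."""
    m = beta1 * m + (1.0 - beta1) * grad        # moving average of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # moving average of squared gradients
    theta = theta - alpha * m / (np.sqrt(v) + eps)
    return theta, m, v

def learning_rate(step, initial=0.005, drop_factor=0.2, drop_period=10):
    """Assumed schedule: multiply the rate by drop_factor every drop_period steps."""
    return initial * drop_factor ** (step // drop_period)

# Toy loss E(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
initial_loss = 0.5 * float(theta @ theta)
for step in range(30):
    grad = theta
    theta, m, v = adam_step(theta, grad, m, v, alpha=learning_rate(step))
final_loss = 0.5 * float(theta @ theta)
print(initial_loss, final_loss)
```

Because the assumed schedule shrinks the step size quickly, the toy loss decreases only modestly over 30 steps; the sketch is meant to show the update rule, not a tuned training run.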

VI. RESULTS
The proposed lane change decision-making algorithm was evaluated by a driving data-based simulation to investigate the timing and accuracy of its lane change decisions compared with base algorithms. The simulator used in this study was developed in the MATLAB environment. A case study is presented in Section VI.A, and driving data for testing are used to analyze the lane change decision accuracy in Section VI.B. In this section, three base algorithms, which include both rule-based and learning-based methods, are introduced to compare their performance with that of the proposed algorithm. The base algorithms are named "Base #1," "Base #2," and "Base #3" in this study. Base #1 and Base #2 are rule-based approaches for lane change decisions: Base #1 is a clearance-based approach, and Base #2 uses an additional parameter called the safe distance. Base #3 is a learning-based approach based on the Hidden Markov Model (HMM). In this way, the predictive lane change decision performance of the proposed algorithm is compared with both rule-based and learning-based approaches. Details of each base algorithm are described in the following paragraphs.
Base #1 determines the necessity and direction of a lane change using only the current information of the surrounding vehicles and lane measurement. When the AV follows the FC target closer than the desired clearance and drives at a lower velocity than the desired velocity set by the motion planner of the AV or the driver, Base #1 determines that a lane change is necessary to drive at the desired velocity. To prevent chattering between driving modes, a dead zone between lane change activation and deactivation is introduced. The lane change decision condition of Base #1 is summarized as follows: where $c_{des}$, $t_{gap}$, and $c_{min}$ are the desired clearance, the desired time gap, and the minimum clearance, respectively.
Base #2 uses the concept of safe distance for the lane change decision [36].
The same lane change activation condition is used for Base #2. However, Base #2 uses different conditions for the front and rear target in contrast to Base #1, which only considers the time gap with the front and rear target to check the possibility of a lane change. The safe distance SD LC for the front and rear vehicle is defined as follows: where v x,f , and v x,r are the velocity of the side lane vehicle, such as the FL, FR, RL, and RR vehicle; t LC,1 is the time gap for the relative velocity between the ego vehicle and the side lane vehicle; t LC,2 is the time gap for the minimum clearance with the side lane vehicle, and c LC is the minimum clearance. Based on the human driving data analysis, a t LC,1 , t LC,2 and c LC of 1.0 s 0.5 s and 12 m, respectively, were used in this study. Base #3 is based on the HMM, which is frequently used to model the decision-making problem with discrete hidden variables. The same input and output features are used to train the HMM. Therefore, the driving states, LK, LCL, and LCR, are defined as X t at time t. The five surrounding vehicle states, lane measurements, and velocity of ego vehicle are used as the vector of observations z t at time t. The joint distribution between the hidden modes m 0:t , and the observation o 1:t is written as follows: where u k is the driver's control inputs at time t. In this study, u k is the steering when angle applied by the driver when collecting the driving data. P(z k , u k |m k ) is assumed as a multivariate Gaussian distribution. The parameters of HMM are learned from the training datasets using the Baum-Welch algorithm.
Therefore, the most likely driving states given the observation of driving environments: where M i is the number of hidden nodes for each driving states i. The details of the HMM are provided in [37].
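To make the HMM-based decision of Base #3 more concrete, the following Python sketch runs a forward filter over the three driving states and selects the most likely state at each step. It is a simplified illustration and not the model of [37]: it assumes one hidden node per driving state, a scalar observation with a univariate Gaussian emission instead of the multivariate Gaussian used in this study, and hand-picked (hypothetical) prior, transition, and emission parameters rather than parameters learned with the Baum-Welch algorithm.

```python
import math

STATES = ("LK", "LCL", "LCR")


def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density (simplification of the multivariate case)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(observations, prior, transition, emission):
    """Return P(state_t | observations up to t) for every time step."""
    belief = dict(prior)
    history = []
    for obs in observations:
        # Prediction step: propagate the belief through the transition model.
        predicted = {s: sum(belief[p] * transition[p][s] for p in STATES)
                     for s in STATES}
        # Update step: weight by the emission likelihood and normalize.
        unnorm = {s: predicted[s] * gaussian_pdf(obs, *emission[s])
                  for s in STATES}
        total = sum(unnorm.values())
        belief = {s: unnorm[s] / total for s in STATES}
        history.append(belief)
    return history


def most_likely_state(belief):
    return max(belief, key=belief.get)


# Hypothetical parameters, chosen only for illustration.
prior = {"LK": 0.8, "LCL": 0.1, "LCR": 0.1}
transition = {
    "LK": {"LK": 0.90, "LCL": 0.05, "LCR": 0.05},
    "LCL": {"LK": 0.05, "LCL": 0.90, "LCR": 0.05},
    "LCR": {"LK": 0.05, "LCL": 0.05, "LCR": 0.90},
}
# (mean, std) of a hypothetical scalar feature for each driving state.
emission = {"LK": (0.0, 1.0), "LCL": (3.0, 1.0), "LCR": (-3.0, 1.0)}

observations = [0.1, -0.2, 2.8, 3.1]
history = forward_filter(observations, prior, transition, emission)
decisions = [most_likely_state(b) for b in history]
print(decisions)  # ['LK', 'LK', 'LCL', 'LCL']
```

With these illustrative parameters, the decision switches from LK to LCL as soon as the observation moves toward the LCL emission mean, which mirrors how an HMM reacts to the current evidence rather than anticipating it.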

A. DRIVING DATA-BASED CASE STUDY
The case study of the left lane change scenario to overtake the preceding vehicle is described in Figs. 9 to 11. Fig. 9 shows the driving situations with the lane measurement and the surrounding vehicles at 2.3, 5.1, and 7.1 s. The classified vehicles are marked in different colors in Fig. 9. First, the data collection vehicle is represented as a black [...] and yaw rate of the data collection vehicle are shown in the upper right corner of each figure. The front scene recorded by the driving record camera of the data collection vehicle at the same moments as in Fig. 9 is shown in Fig. 10. The results of the driving data-based simulation are summarized in Fig. 11, which shows the histories of the LC flag, ego vehicle velocity, SWA, clearance, relative velocity, and lateral offset. The colors of the clearance and relative velocity plots match the vehicle colors in Fig. 9. The case considered is a situation in which a lane change must be performed to the left or right lane to overtake the FC vehicle. In this simulation, the desired velocity of the proposed algorithm, Base #1, and Base #2 was set to 100 km/h, i.e., 27.8 m/s, which is the speed limit of the data collection road. As can be seen in Figs. 9 and 11, the velocity of the preceding vehicle is about 20 m/s, which is significantly lower than the target speed. This low speed of the FC vehicle is maintained until the ego vehicle crosses the lane, as shown in Fig. 9 (c). Based on this situation analysis, the results of the lane change decision-making of the proposed algorithm, Base #1, and Base #2 are described in Fig. 11 (a). As can be seen in the SWA and lateral offset histories, the driver started the lane change at about 5 s. However, after 2 s, the driver accelerates the ego vehicle, which decreases the clearance and relative velocity with respect to the FC vehicle. This indicates that the driver had already begun the lane change maneuver at this point.
The proposed algorithm generated the lane change flag from 2.3 s to 3.1 s and determined that the lane change to the left lane was possible much earlier than 7.1 s, the time when the ego vehicle crosses the lane. However, as can be seen in Fig. 9 (a) and Fig. 10 (a), a safe distance was not secured in the left lane at that time, so Base #1 and Base #2 judged that the lane change was not possible. Base #2 generated the lane change flag at 5.1 s, after the RL vehicle overtook the ego vehicle and became the FL vehicle. Base #1, however, could not decide on the lane change even when the driver performed it. This means that a lane change decision algorithm based on a simple time gap is only applicable to limited conditions. Base #2, which used a safe distance, did decide on the lane change, but it could not advance the decision time to the moment when the driver's intention occurred. Meanwhile, Base #3 showed a predictive decision similar to that of the proposed algorithm. However, its lane change flag was generated from 2.7 s to 4.0 s, which was later than that of the proposed algorithm.
The evaluation results showed an improved lane change decision performance compared to the base algorithms. In addition, the proposed Bi-LSTM based RNN approach can make a lane change decision similar to that of a driver, who makes a predictive decision based on the driving situation. In particular, the proposed algorithm decided the lane change before the human driver's lane change behavior occurred. In the case study, the proposed algorithm predicted the lane change 2.7 s earlier than the initiation of lane departure by the human driver. Meanwhile, Base #1, which uses the concept of gap acceptance, failed to decide on a lane change. Base #2, which is based on the safety guarantee distance, determined a lane change at the same moment as the human driver. Base #3, which is based on the HMM, made a predictive lane change decision, but its decision was later than that of the proposed algorithm. Therefore, this case study indicates that the proposed algorithm is more effective in predictive decision-making than the rule-based methods and can decide earlier than the HMM, which has been widely used for lane change decisions. This decision-making is possible because the Bi-LSTM, where learning takes place in both forward and reverse directions, is more effective in predictive decision-making than the HMM, which models probability transitions between hidden states. In particular, since an RNN has a recurrent structure, better performance than the HMM is secured in predicting a future lane change.

B. DECISION PERFORMANCE ANALYSIS
The lane change decision performance was analyzed by using the testing data, which were not used for network training and validation. The evaluation results of the proposed algorithm, Base #2, and Base #3 are summarized in Fig. 12. The accuracy of the lane change decision made 2 s in advance was compared. For Base #3, the analysis was performed both for the case of predicting the future lane change and for the case of deciding the current lane change. In this analysis, 'True' means that the decisions of the driver and the algorithm match, and 'False' means that they are inconsistent.
As can be seen in Fig. 12 (a), the proposed algorithm made an accurate decision for the lane change cases except for 2 cases of the right lane change. Meanwhile, 12 cases of lane-keeping were determined as a lane change. These false [...] Fig. 12 (b). Therefore, it is difficult to decide 2 s earlier by using the rule-based method. As shown in Fig. 12 (c), Base #3 performed well in determining the lane change of the current moment: it accurately judged all lane changes except for one right lane change case. However, as shown in Fig. 12 (d), Base #3 showed lower accuracy than the proposed algorithm when predicting 2 s in advance. As in the case study, the accuracy of Base #3 is lowered because it determines the lane change later than the proposed algorithm. In other words, if the prediction time were set to 1 s, the proposed algorithm and Base #3 would likely produce similar performance. This can be seen as a result of the RNN's recurrent structure and the Bi-LSTM's forward and backward learning.
To quantify the analysis results, the confusion matrix was utilized. Because 'True' and 'False' are defined for the decision making, the lane change decision making can be considered as a classification problem for a future maneuver between lane-keeping and lane change. However, the lane-keeping 'True' and lane change 'True' cases lead to different vehicle behavior. Therefore, it is difficult to apply the True/False definition of the previous analysis directly to a confusion matrix, and the confusion matrix should be modified to reflect the importance of the maneuver decision. In this study, lane change was considered as condition positive, and lane-keeping was considered as condition negative. In this step, the left and right lane changes were considered as one condition. The modified confusion matrix is defined as shown in Table 1. The values of True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) of the proposed algorithm and Base #3 are summarized in Table 1 based on the same input datasets, which were prepared in terms of the predictive lane change decision.
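The mapping from three-class maneuver decisions to the modified confusion matrix can be sketched as follows. This Python example is only an illustration of the definition above, not the evaluation code of this study: the function name and the label sequences are hypothetical, and, because the left and right lane changes are merged into one positive condition, the sketch counts any predicted lane change as a true positive whenever the driver changed lanes, regardless of direction.

```python
POSITIVE = {"LCL", "LCR"}  # lane change (left or right) = condition positive


def binary_confusion(driver_labels, algorithm_labels):
    """Count TP, TN, FP, FN with lane change as positive and LK as negative."""
    if len(driver_labels) != len(algorithm_labels):
        raise ValueError("label sequences must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(driver_labels, algorithm_labels):
        truth_pos = truth in POSITIVE
        pred_pos = pred in POSITIVE
        if truth_pos and pred_pos:
            counts["TP"] += 1
        elif not truth_pos and not pred_pos:
            counts["TN"] += 1
        elif pred_pos:
            counts["FP"] += 1  # lane-keeping decided as a lane change
        else:
            counts["FN"] += 1  # lane change decided as lane-keeping
    return counts


# Hypothetical decisions, for illustration only.
driver = ["LK", "LK", "LCL", "LCR", "LK", "LCR"]
algorithm = ["LK", "LCL", "LCL", "LCR", "LK", "LK"]
counts = binary_confusion(driver, algorithm)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```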
In the case of the proposed algorithm, an FP of 12 and an FN of 2 occurred in the 160 testing datasets, which indicates accurate decisions even though the decision was made 2 s in advance. Meanwhile, Base #3 had a TP of 63, which is lower than that of the proposed algorithm. To evaluate the results of the lane change decision making, the recall/True Positive Rate (TPR), False Positive Rate (FPR), precision, and $F_1$ score were used and defined as follows.

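$$\text{Recall/TPR} = \frac{TP}{TP + FN}, \qquad \text{FPR} = \frac{FP}{FP + TN}$$
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$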
The recall, FPR, precision, and $F_1$ score of the proposed algorithm and Base #3 are summarized in Table 2. As can be seen in Table 2, Base #3 for the current lane change decision showed the most accurate results in all performance metrics. However, in the predictive lane change decision, the proposed algorithm shows better performance than Base #3. Therefore, the proposed algorithm shows more accurate decision performance in predictive lane change situations. This means that the proposed predictive lane change decision algorithm can improve the safety and the acceptance of autonomous driving by securing extra time to plan the desired motion of AVs.
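For reference, the four metrics used in this section can be computed from the confusion-matrix counts as in the following Python sketch. The function name and the counts in the example are hypothetical and are not the values of Table 1 or Table 2. Returning 0.0 for an empty denominator is a convention assumed here for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Recall/TPR, FPR, precision, and F1 from confusion-matrix counts."""

    def ratio(num, den):
        # Convention of this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    fpr = ratio(fp, fp + tn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2.0 * precision * recall, precision + recall)
    return {"recall": recall, "fpr": fpr, "precision": precision, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A low FPR indicates that lane-keeping situations are rarely mistaken for lane changes, while a high recall indicates that few lane changes are missed, so the two should be read together with the $F_1$ score.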

VII. CONCLUSION
A predictive lane change decision algorithm using a Bidirectional Long Short-Term Memory (Bi-LSTM) based Recurrent Neural Network (RNN) was proposed and evaluated by simulation with driving data. The driving data collected by the AV consisted of surrounding target states, lane measurements, and the velocity of the ego vehicle. Before the datasets are used, a data encoder accumulates and standardizes the input data to make input sequences for the RNN. Then, 1,120, 320, and 160 driving datasets were used to train, validate, and test the Bi-LSTM based RNN, respectively, to learn the lane change decisions of human drivers. The proposed algorithm predicts the probabilities of lane-keeping, left lane change, and right lane change. A case study and a testing data-based accuracy analysis were conducted to evaluate the performance of the proposed decision-making algorithm, and it was compared with three base algorithms. The evaluation results showed that the proposed algorithm determined the lane change moment in advance based on the states of the surrounding vehicles and the shape of the road. In addition, this study only used the surrounding vehicle and lane information that is measurable by the sensors of the autonomous vehicle.
Future work on the predictive lane change decision can be summarized in four aspects: (1) The first aspect is to analyze the effect on the motion planning of autonomous driving and derive an improved lane change algorithm. In other words, it is necessary to perform a quantitative analysis of how much the time secured through the predictive lane change decision will improve the safety and driver acceptance of the motion planning. (2) The second aspect is research on situations where it is difficult to recognize the shape of the road with a camera, or where more vehicles should be considered in addition to the five surrounding vehicles used in this study, to improve the robustness of the proposed algorithm. (3) The third aspect is to extend the coverage of the proposed algorithm to more complex driving situations. The target scenario of the predictive decision algorithm can be expanded to situations such as uncontrolled intersections, roundabouts, and merge/split roads, where complex interactions with surrounding vehicles occur, similar to lane changes. In particular, an improvement in safety is expected when the proposed approach is applied to intersection scenarios where the driving directions of the vehicles differ and it is difficult to detect objects due to blind spots. (4) Finally, it is expected that performance can be further improved through a combination with an attention-aware neural network or a convolutional neural network.