Learning Car-Following Behaviors for a Connected Automated Vehicle System: An Improved Sequence-to-Sequence Deep Learning Model

Data-driven car-following modeling is of great significance to traffic behavior analysis and the development of connected automated vehicle (CAV) technology. Existing research focuses on reproducing the car-following process by capturing the behavior of the host vehicle using the information of its nearest preceding vehicle, although the other preceding vehicles may affect the host vehicle as well. To fill this gap, this paper presents an improved sequence-to-sequence deep learning-based (ISDL) car-following model for a CAV system. Firstly, the kinematic information of multiple preceding vehicles is organized as the input features. Secondly, an improved sequence-to-sequence deep learning framework is proposed by integrating an encoder based on the bidirectional gated recurrent unit (GRU) neural network and a decoder based on an attention-based GRU neural network in an end-to-end fashion. Finally, car-following data with multiple preceding vehicles extracted from the NGSIM dataset are employed to train and calibrate the proposed model. Experimental results indicate that the performance of deep learning-based models in learning heterogeneous driving behavior can be enhanced by adding information about multiple preceding vehicles. In addition, the proposed ISDL model outperforms the benchmark car-following models in terms of the accuracy of the simulated speeds and simulated positions. In platoon simulation tests, the ISDL model is also capable of reproducing the traffic oscillation phenomenon.


I. INTRODUCTION
Car-following and behavior modeling describe the interaction relationship and the trend of motion between the host vehicle and its preceding vehicle in a lane [1]. Since the accuracy of car-following models plays a critical role in analyzing traffic state and making simulations, it is necessary to study the microscopic car-following models to further improve traffic safety and efficiency.
The associate editor coordinating the review of this manuscript and approving it for publication was Shajulin Benedict.
With the rapid development of perception and communication technology, the connected and automated vehicle highway (CAVH) system [2] has become a development trend of intelligent transportation to relieve congestion and improve capacity. In a CAVH system, connected automated vehicles (CAVs) are proposed to improve operational efficiency and reliability. Based on the Vehicle to Everything (V2X) communication system, the CAVs can not only capture motion information from their surrounding vehicles but also exchange real-time traffic state data with roadside units [3]. Hence, an effective car-following model is also a significant technology for the CAVs to estimate the traffic state and make decisions, since it can help the CAVs to efficiently process information from more preceding vehicles and better fit human driving behaviors under different scenarios.
Normally, microscopic car-following models can be divided into two categories, including the conventional model based on mathematical formulas [4], [5] and data-driven models [6]. Supported by high-fidelity traffic data and artificial intelligence technologies, deep learning-based models have become a main branch of the data-driven car-following models [7], [8], [9]. Though researchers have proposed various car-following models during the past few decades, some limitations still exist as follows.
(1) Most existing deep learning-based car-following models concentrate on using a recurrent neural network, such as the long short-term memory (LSTM) neural network [9] or the gated recurrent unit (GRU) neural network [8] to learn the memory effect and reaction delay of human driving characteristics. It is necessary to treat the car-following driving behavior as a sequence-to-sequence (Seq2seq) issue and extend the basic Seq2seq framework to improve the performance in generating the vehicle trajectory.
(2) The development of communication technology enriches the information flow topology of CAVs, which is beneficial for improving the stability of platoon control of CAVs [10]. Though some deep learning-based car-following models have been proposed to fully consider the reaction delay or memory effect in human driving behavior, few of them employ the complex information flow topology to improve the capability of deep learning structures in capturing heterogeneous driving behavior.
Since multi-vehicle information interaction is fundamental for the CAVs and the present data-driven car-following models do not consider the influence of multiple preceding vehicles, such as their acceleration, speed, and space headways, this paper proposes a novel data-driven car-following model based on an improved Seq2seq deep learning (ISDL) framework. The proposed ISDL model employs the bidirectional GRU (Bi-GRU) structure and the GRU structure as its encoder and decoder respectively. In addition, the attention mechanism is introduced to extend the context vector in the decoder by extracting important information at each time interval. To fully capture the characteristics of the car-following behaviors, the kinematic parameters reflecting the status of the host vehicle and multiple preceding vehicles are utilized to form the input vectors of the proposed model. Based on the trajectory data of the Next Generation Simulation (NGSIM) dataset [11], the platoon trajectories containing the car-following behavior of multiple vehicles are extracted to train and calibrate the models. Experiments on empirical trajectories reveal that the new model yields significantly higher simulation accuracy and stability than existing car-following models in terms of speed prediction and position prediction. Furthermore, the proposed model is capable of capturing heterogeneous driving behaviors and reproducing the traffic oscillation in platoon simulation.
Based on the aforementioned description, the contributions of our work focus on the following aspects:
(1) The information flow topology containing motion parameters of preceding vehicles is integrated with a deep learning-based car-following model for the first time to capture heterogeneous driving behavior.
(2) A novel deep learning-based car-following (ISDL) model is constructed based on an improved Seq2seq structure by integrating the Bi-GRU encoder and attention-based decoder into an end-to-end fashion.
(3) Experiments on real-world trajectory data indicate that the proposed ISDL model yields high simulation accuracy in reproducing the car-following behavior and outperforms the IDM model, GRU model, LSTM model and Seq2seq-based models in terms of speed prediction and position prediction.
(4) The proposed model is capable of conducting platoon simulations and reproducing the traffic oscillation phenomenon.
The rest of the paper is organized as follows. Section II summarizes the related work of microscopic car-following modeling from categories of model-driven models and data-driven models. Section III describes the methodology of the Seq2seq structure, GRU cell, and the framework of the proposed ISDL car-following model in detail. Section IV presents the dataset description, model implementation with baselines, and evaluation indexes. Section V provides the result discussion and analysis. Finally, the conclusion and outlook for future work are given in Section VI.

II. LITERATURE REVIEW
In this section, we review the work of microscopic car-following modeling by dividing the methods into model-driven models and data-driven models.

A. MODEL-DRIVEN CAR-FOLLOWING MODELS
Conventional model-driven car-following models usually use the dynamics method to study the influence of the motion state change of the preceding vehicle on the motion state of the host vehicle. By quantitatively analyzing the dynamic characteristics of ''preceding vehicle−host vehicle'' pairs in a lane, the formation and evolution mechanism of traffic phenomena such as traffic congestion and traffic oscillation can be studied [12].
Based on the prior knowledge of driving behavior, model-driven car-following models usually make specific assumptions about driving behavior, and they can be classified into many types including safe distance models (e.g. Gipps model [13] and FRESIM model [14]), psycho-physical models (e.g. Winsum model [15]), optimal velocity models (e.g. OVM model [16], GF model [17], and FVD model [18]), stimulus-response models (e.g. GHR model [19] and Newell model [11]) and intelligent driving models (e.g. IDM model [20], [21]). In general, model-driven car-following modeling depends on physical or algebraic methods such as vehicle dynamics and mathematical statistics, aiming at constructing models with practical physical and mathematical meanings. The advantage of model-driven car-following models lies in focusing on the few key elements that describe the physical properties of car-following behavior. However, these models rely heavily on mathematical formulas, and fine calibration of the formulas is required before application. In addition, the randomness of human driving behavior and the complexity of road conditions often subject the parameter calibration to large errors, which makes it difficult to meet the needs of the future mixed operation of different types of vehicles in a CAVH environment.

B. DATA-DRIVEN CAR-FOLLOWING MODELS

1) ARTIFICIAL-BASED INTELLIGENT MODELS
The advent of the big data era has enabled researchers to obtain a large amount of high-precision real-time vehicle data, which in turn has driven the development of data-driven following models. Based on real-world vehicle driving data and machine learning methods, the data-driven following models are capable of exploring the inherent laws of car-following behavior by the training, learning, iteration, and evolution of sample data.
Wei et al. [22] proposed a self-learning support vector regression (SVR) method to study the asymmetric features in the following behavior and their impact on the traffic flow evolution, analyzing the time-lag phenomenon of stop-and-go waves at the microscopic level and reproducing different congestion propagation patterns at the macroscopic level. He et al. [23] proposed a K-nearest neighbor-based algorithm, which finds the K most similar driving scenarios in a historical vehicle trajectory database and takes the most likely driving behavior as the output of the model. Kehtarnavaz et al. [12] introduced artificial neural networks (ANNs) to learn the features of car-following behaviors for the first time, and many ANN models [24], [25], [26] have since been built for this problem. Panwai and Dia [27] revealed that a car-following model based on back-propagation (BP) artificial neural networks using the velocity and the headway of the preceding vehicle performs better than the Gipps model and the psycho-physical models.

2) DEEP LEARNING-BASED MODELS
With the development of artificial intelligence, 5G communication technology, and data storage technology, the data-driven car-following model has gradually developed from machine learning models to deep learning models. As a frontier of machine learning technology, deep learning methods have been applied to build data-driven car-following models [28].
For example, Wang et al. [8] verified early that the recurrent neural network (RNN) method can significantly improve the fitting accuracy of the longitudinal trajectory compared with traditional car-following models. Zhou et al. [29] found that deep neural networks can not only fit the car-following trajectory well but also predict the traffic oscillation accurately. Huang et al. [9] considered asymmetric driving behaviors and proposed an LSTM-based car-following model. Experimental results indicate that the proposed model is able to reproduce a variety of traffic flow characteristics.
Besides, it has also been shown that deep learning-based car-following models can not only fit the real driving trajectory well but also fully reflect the driving memory effect and the response delay phenomenon. Lin et al. [30] proposed a formation following model based on the LSTM and the scheduled sampling technique, which can effectively reduce the propagation of spatio-temporal errors. Hao et al. [31] proposed an encoder-decoder model based on GRU to recognize the driver's intention and forecast the trajectory of the vehicle. Though existing studies have shown that RNN-based deep learning models can effectively fit the following behavior of vehicles and reconstruct the following trajectory, the existing data-driven models do not fully consider the potential influence of multiple preceding vehicles on driving behavior.

A. PRELIMINARIES

1) Seq2seq MODEL
Previous studies [7], [8], [9] have demonstrated that an RNN-based car-following model can fit the car-following trajectory well and reflect the driver's response delay. The structures of RNN models can be divided into the one-to-one structure, one-to-many structure, many-to-one structure, and many-to-many structure. In a many-to-many structure, both the input and the output are sequences.
The Seq2seq [32] model is a classical RNN framework for handling many-to-many patterns, which has been widely used in fields such as machine translation [33], [34] and time series prediction [35], [36]. In the case of the car-following problem, the feature vectors formed by the variables of multiple preceding vehicles and the host vehicle can be organized into input sequences in time order, and the future states of the host vehicle, such as speed and acceleration, can be used as output sequences.
The core architecture of the Seq2seq model is an encoder-decoder framework composed of two RNN models. In the encoding process, one RNN acts as the encoder and compresses the input sequence into a fixed-length context vector. In the decoding process, another RNN acts as the decoder and outputs the corresponding sequence according to the context vector produced by the encoder and its own input at the previous moment. As shown in Fig.1, the Seq2seq framework consists of an encoder, a context vector, and a decoder. In the encoding stage, the input time series is $X_t = (x_1, x_2, \ldots, x_t)$. According to the data at different input moments, the hidden state of the encoder is described as:
$$h_t = f(x_t, h_{t-1})$$
$$c = q(h_1, h_2, \ldots, h_t)$$
where $h_t$ represents the output of the hidden layer and $c$ represents the context vector of the encoder. $f(\cdot)$ and $q(\cdot)$ are nonlinear functions. Based on related studies [31], $f(\cdot)$ can be an RNN-based function and $c = h_t$. During the decoding process, the context vector is the initial hidden layer state of the decoder and the output prediction sequence is $Y = (y_1, y_2, \ldots, y_n)$. For the output at time interval $t$:
$$p(y_t \mid y_1, \ldots, y_{t-1}, c) = g(y_{t-1}, s_t, c)$$
where $g(\cdot)$ represents the RNN-based function and $s_t$ is the hidden state at time interval $t$, $s_t = f(y_{t-1}, s_{t-1}, c)$. $n$ represents the length of the output sequence.
Note that the Seq2seq model aims to maximize the conditional probability $p(Y \mid X_t)$ through training.

2) GRU STRUCTURE
An RNN can deal with time series of variable length, and the input information of each layer depends on the output information of the previous layer and on earlier information. To tackle the RNN's problem of exploding or vanishing gradients when processing long sequences, the gated recurrent unit (GRU) [37] was proposed to solve the gradient problem in long-term memory and in the backpropagation of the error.
Different from the LSTM unit, which contains three gates (the input gate, forget gate, and output gate) [38], the GRU unit only has a reset gate $r_i$ and an update gate $z_i$. The states of the two gates are determined by the hidden state $h_{i-1}$ and the input $x_i$. Specifically, the reset gate selects how much information to remember from the input $x_i$. The update gate determines how much information to forget in $h_{i-1}$ and how much information to remember in the candidate state $h_i'$. Compared to the LSTM block, the structure of the GRU block is simpler with similar performance. Meanwhile, the GRU model is easier to train, so the training efficiency can be greatly improved. Hence, the GRU structure is widely used in natural language processing [39] and feature classification [40]. It has also been used in traffic condition estimation [41] and car-following behavior modeling [8].
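The network structure of the GRU model is shown in Fig.2, and the calculation formulas of the model are as follows:
$$z_i = \sigma(W_z x_i + U_z h_{i-1})$$
$$r_i = \sigma(W_r x_i + U_r h_{i-1})$$
$$h_i' = \tanh(W_h x_i + U_h (r_i \circ h_{i-1}))$$
$$h_i = (1 - z_i) \circ h_{i-1} + z_i \circ h_i'$$
where $\sigma$ and $\circ$ denote the sigmoid function and the element-wise product of two vectors. $W_z$, $U_z$, $W_r$, $U_r$, $W_h$, $U_h$ and $W_y$ are the weights (bias terms are omitted for brevity), with $W_y$ belonging to the output layer applied to $h_i$. The sigmoid activation function $\sigma(\cdot)$ and the function $\tanh(\cdot)$ are as follows:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
where the sigmoid function maps values into the range (0, 1) and the tanh function maps values into the range (-1, 1).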
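As an illustration of the gate computations, the following is a minimal sketch of one GRU update in plain Python. It assumes the commonly used GRU formulation without bias terms; the function names `gru_step` and `matvec`, the dictionary keys, and the toy weights are illustrative assumptions and are not taken from the paper's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x, h_prev, p):
    """One GRU update without bias terms (assumed standard formulation).

    p maps the names W_z, U_z, W_r, U_r, W_h, U_h to weight matrices.
    """
    # Update gate z_i and reset gate r_i, both in (0, 1).
    z = [sigmoid(a + b) for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h_prev))]
    # Candidate state h'_i is built from the input and the reset-gated previous state.
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a + b) for a, b in zip(matvec(p["W_h"], x), matvec(p["U_h"], gated))]
    # The update gate blends the previous state with the candidate state.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

# Toy example: 2 input features, 2 hidden units, all weights zero (hypothetical values).
zeros = [[0.0, 0.0], [0.0, 0.0]]
params = {name: zeros for name in ("W_z", "U_z", "W_r", "U_r", "W_h", "U_h")}
h_next = gru_step([1.0, -1.0], [0.4, -0.2], params)
```

With all-zero weights both gates equal 0.5 and the candidate state is zero, so the new hidden state is simply half of the previous one, which makes the blending role of the update gate easy to check by hand.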

B. CAR-FOLLOWING MODEL BASED ON IMPROVED Seq2seq DEEP LEARNING MODEL
To give full consideration to the potential impact of various information on driving behavior and to develop an intelligent model for describing connected driving in the future, this paper proposes a car-following model based on a deep Seq2seq learning framework. Firstly, we consider the potential influence of multiple preceding vehicles' information on the car-following behavior of the host vehicle [42]. Secondly, an improved Seq2seq framework, which employs a bidirectional GRU and a one-way GRU as the encoder and decoder respectively, is proposed for extracting and learning the car-following behavior. To better extract the information and assign appropriate weights to important hidden states, the model introduces an attention mechanism to generate a different context vector at each time interval. Fig.3 provides the framework of the ISDL model. To estimate the speed of the host vehicle at time interval $t+1$, the input vector $x_{t'}$ at time interval $t'$ collects the kinematic states of the host vehicle and its preceding vehicles. The input matrix of the Seq2seq model is given as:
$$X_t = (x_{t-T+1}, x_{t-T+2}, \ldots, x_t)$$
where $T$ indicates the length of the input timestep, which can be regarded as the length of the memory when the host vehicle follows the preceding vehicles.
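To make the role of the input timestep $T$ concrete, the following sketch slices a trajectory into input windows of length $T$, each paired with the index of the next time interval to be predicted. The function name `build_input_windows` and the three-feature rows are hypothetical; the actual feature layout of the ISDL input vector is not reproduced here.

```python
def build_input_windows(features, T):
    """Return (X, target_index) pairs, where X holds the last T feature rows.

    features: list of per-timestep feature vectors (one row per sample).
    """
    windows = []
    for t in range(T - 1, len(features) - 1):
        X = features[t - T + 1 : t + 1]   # rows t-T+1 .. t
        windows.append((X, t + 1))        # the step right after the window is the target
    return windows

# Hypothetical rows: host speed, leader speed, gap (illustrative values only).
rows = [[10.0 + k, 11.0 + k, 20.0] for k in range(6)]
windows = build_input_windows(rows, T=3)
```

A larger $T$ gives the model a longer memory of past states but yields fewer windows per trajectory, which is one reason the timestep length is treated as a parameter to be tuned.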

2) BiGRU-BASED ENCODER
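To better employ past and future driving information to capture the characteristics of the following behavior in the input matrix, the proposed ISDL model adopts a bidirectional GRU structure to process driving information.
To be specific, the forward and backward states of the bidirectional GRU neural network (BiRNN) are computed and concatenated as:
$$\overrightarrow{h}_t = \mathrm{GRU}(x_t, \overrightarrow{h}_{t-1})$$
$$\overleftarrow{h}_t = \mathrm{GRU}(x_t, \overleftarrow{h}_{t+1})$$
$$h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$$
where $x_t$ is taken from the input matrix $X_t$ of the driving behavior.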
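The idea of bidirectional encoding can be sketched as follows, assuming the usual construction in which one recurrent pass runs forward in time, a second pass runs backward, and the two states are concatenated per timestep. The helper `bidirectional_encode` and the stand-in cell `toy_step` are illustrative assumptions; `toy_step` is a running sum, not a GRU, and is used only to keep the example checkable by hand.

```python
def bidirectional_encode(xs, step, h0):
    """Run `step` over xs in both directions and pair the states per timestep."""
    forward, h = [], h0
    for x in xs:                      # left-to-right pass
        h = step(x, h)
        forward.append(h)
    backward, h = [], h0
    for x in reversed(xs):            # right-to-left pass
        h = step(x, h)
        backward.append(h)
    backward.reverse()                # realign with the original time order
    # Annotation for each timestep: forward state followed by backward state.
    return [f + b for f, b in zip(forward, backward)]

def toy_step(x, h):
    """Stand-in recurrent cell (NOT a GRU): running sum in a 1-element list."""
    return [h[0] + x]

annotations = bidirectional_encode([1.0, 2.0, 3.0], toy_step, [0.0])
```

Each annotation therefore summarizes both what came before and what comes after its timestep within the input window, which is the property the encoder relies on.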

3) ATTENTION-BASED DECODER
In the conventional encoder-decoder framework used in the car-following model proposed by Ma et al. [43], the encoder converts the input sequence into a semantic vector of fixed length, and the context vectors of the decoder at different time intervals are the same. Since the feature distribution of sequence data varies, the importance of each input to the output also varies. To prevent the information input first from being diluted by the information input later when processing long sequences, this paper introduces an attention mechanism in the hidden layer so that the decoder receives a different intermediate semantic vector $c$ at each step. Each $c$ is calculated from the weights $\alpha$ and the hidden-layer outputs $h$ of the Bi-GRU encoder. Since different values are given corresponding weights, important features can be captured according to the degree of impact of the input sequence.
The hidden states obtained by the encoder under the Bi-GRU structure are $(h_1, h_2, \ldots, h_T)$. If the current hidden layer state of the decoder is $s_{t-1}$, the correlation between each input position $j$, $j = 1, 2, \ldots, T$, and the current output position is calculated as:
$$e_{tj} = \theta(s_{t-1}, h_j)$$
which can also be written as:
$$e_{tj} = v_a^{\top} \tanh(W_a s_{t-1} + U_a h_j)$$
where $v_a$, $W_a$ and $U_a$ represent the weights of the correlation function $\theta(\cdot)$. The context vector $c_t$ of the $t$-th prediction time interval, $t = 1, 2, \ldots, T_a$, is computed as:
$$\alpha_{tj} = \frac{\exp(e_{tj})}{\sum_{k=1}^{T} \exp(e_{tk})}$$
$$c_t = \sum_{j=1}^{T} \alpha_{tj} h_j$$
where $\alpha_{tj}$ represents the weight of the hidden state $h_j$ on the context vector $c_t$. The hidden state $s_t$ of the decoder given the annotations from the encoder is computed by:
$$z_t = \sigma(W_z' y_{t-1} + U_z' s_{t-1} + C_z c_t)$$
$$r_t = \sigma(W_r' y_{t-1} + U_r' s_{t-1} + C_r c_t)$$
$$\tilde{s}_t = \tanh(W_s y_{t-1} + U_s (r_t \circ s_{t-1}) + W_c c_t)$$
$$s_t = (1 - z_t) \circ s_{t-1} + z_t \circ \tilde{s}_t$$
where $W_s$, $W_z'$, $W_r'$, $U_s$, $U_z'$, $U_r'$, $W_c$, $C_z$ and $C_r$ are the weights of the decoder.
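The attention step can be illustrated with a short sketch, assuming an additive scoring function of the form $v_a^{\top} \tanh(W_a s_{t-1} + U_a h_j)$ followed by a softmax over the input positions. The function name `attention_context` and the toy weights are assumptions made for illustration; they are not the trained parameters of the ISDL model.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, v):
    return [dot(row, v) for row in W]

def attention_context(s_prev, annotations, v_a, W_a, U_a):
    """Return (attention weights, context vector) for one decoding step."""
    Ws = matvec(W_a, s_prev)
    scores = []
    for h_j in annotations:
        Uh = matvec(U_a, h_j)
        # Additive score: v_a^T tanh(W_a s_prev + U_a h_j)
        scores.append(dot(v_a, [math.tanh(a + b) for a, b in zip(Ws, Uh)]))
    # Softmax over input positions (shifted by the max for numerical stability).
    m = max(scores)
    exps = [math.exp(e - m) for e in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder annotations.
    dim = len(annotations[0])
    context = [sum(a * h[k] for a, h in zip(alphas, annotations)) for k in range(dim)]
    return alphas, context

# Toy example with zero scoring weights: every position receives equal weight.
H = [[1.0, 0.0], [3.0, 2.0]]
Z = [[0.0, 0.0], [0.0, 0.0]]
alphas, context = attention_context([0.5, -0.5], H, [0.0, 0.0], Z, Z)
```

With non-zero scoring weights the softmax concentrates on the annotations most related to the current decoder state, so the context vector differs from one prediction interval to the next.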

4) OUTPUT AND LOSS FUNCTION
The output of the improved Seq2seq framework is the sequence of speeds of the host vehicle in future periods. For the output at a future time interval, the predicted speed is denoted $\hat{v}^f_{t+\Delta t}$, $\Delta t = 1, 2, \ldots, T_a$. The loss function, namely the objective function, is used to calculate the error between the predicted value and the real value. Considering that the optimization of only one variable may lead to abnormalities [20], this paper adopts the dual objective of speed and displacement as the loss function. The mean square error on normalized data is used to calculate the errors between the simulated trajectory and the observed trajectory.
In the objective function, $L_{MSE}$ is the mean square error function and $N_k$ represents the number of training data. The pseudo-code of training an ISDL model is presented in Table 1.
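A dual objective of this kind can be sketched as a weighted sum of two mean square errors, one on speed and one on position. The equal default weights, the function names, and the sample values below are assumptions for illustration; the text only states that both speed and displacement enter the objective.

```python
def mse(pred, true):
    """Mean square error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

def dual_objective(v_pred, v_true, x_pred, x_true, w_speed=1.0, w_pos=1.0):
    """Weighted sum of speed MSE and position MSE on normalized data.

    The weights are an assumption; they are not specified in the text.
    """
    return w_speed * mse(v_pred, v_true) + w_pos * mse(x_pred, x_true)

# Illustrative normalized values (not taken from the dataset).
loss = dual_objective([0.5, 0.5], [0.5, 1.0], [0.0, 1.0], [0.0, 0.0])
```

Penalizing position as well as speed discourages solutions whose speed errors are small at every step but accumulate into a large drift of the simulated trajectory.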

IV. EXPERIMENT

A. DATA PREPARATION
To validate the performance of the proposed model, the data of the NGSIM [44] project are adopted to prepare the experimental dataset. The original NGSIM data were collected in the form of images by camera equipment at a 0.1 s time interval. Trajectory data of vehicles on the US-101 highway shown in Fig.4 are extracted, including the exact location information of each vehicle. The dataset is large and highly accurate, which satisfies the requirements for training and testing deep learning-based models under car-following scenarios, especially multi-vehicle car-following behaviors containing complex information flow topology. Note that the extracted car-following data only consider the impact of multiple preceding vehicles in the same lane on the host vehicle, without considering the surrounding vehicles in other lanes.
According to relevant research, high traffic volume usually occurs during the morning rush hour on US-101 from 8:05 am to 8:20 am, leading to multi-vehicle car-following behaviors and traffic oscillation. Hence, the data collected during this period are taken as the basic dataset. Through data preprocessing, 621 groups of car-following data with two preceding vehicles are selected, including 461 groups for training and validation and the other 160 groups for testing. During model testing, the trained deep learning-based car-following models predict the speeds of the host vehicle according to the speeds and accelerations of the host vehicle and the preceding vehicles, and the relative position between the host vehicle and the nearest preceding vehicle. In addition, the predicted position of the vehicle can be calculated as:
$$\hat{x}^f_{t+1} = \hat{x}^f_t + \frac{\hat{v}^f_t + \hat{v}^f_{t+1}}{2} \Delta t$$
where $\hat{v}^f_t$ and $\hat{v}^f_{t+1}$ represent the predicted speeds of the host vehicle at time intervals $t$ and $t+1$. $\Delta t$ is the time interval and $\Delta t = 0.1$ s.
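The conversion from predicted speeds to predicted positions can be sketched as below, assuming that the position is advanced with the average of two consecutive predicted speeds (a trapezoidal update) at the 0.1 s sampling interval. The function name `integrate_positions` and the sample speeds are illustrative assumptions.

```python
def integrate_positions(x0, speeds, dt=0.1):
    """Trapezoidal update: x[t+1] = x[t] + (v[t] + v[t+1]) / 2 * dt."""
    positions = [x0]
    for v_now, v_next in zip(speeds, speeds[1:]):
        positions.append(positions[-1] + 0.5 * (v_now + v_next) * dt)
    return positions

# Illustrative speeds in m/s (not taken from the dataset).
positions = integrate_positions(0.0, [10.0, 12.0, 14.0])
```

Because each position depends on all earlier speeds, small speed errors accumulate over the simulation horizon, which is why position errors are reported alongside speed errors.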
The experimental environment is a DELL computer (Intel Core i7-10750H CPU, 32 GB RAM). The Keras high-level neural network API in TensorFlow is used as the framework, and the model is built and trained in Python 3.7 to evaluate the prediction performance of the model.

B. BENCHMARK MODELS
To evaluate the capability of the proposed model in reproducing car-following behavior, we compare it with the IDM model [45], the LSTM model [9], the GRU model [8], the Seq2seq deep learning (SDL) model [43] and the GRU-based SDL (GSDL) model [31].
IDM: The IDM model is a classic accident-free theoretical following model. All of its parameters have clear physical meanings, so it can intuitively display the changes in driving behavior. In addition, this model can describe the following behavior of single-lane vehicles under both free flow and congested flow. The standard IDM expressions are as follows:
$$a = a_{\max}\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^*(v, \Delta v)}{s}\right)^{2}\right]$$
$$s^*(v, \Delta v) = s_0 + v T_g + \frac{v \Delta v}{2\sqrt{a_{\max} b}}$$
where $v$ is the speed of the host vehicle, $\Delta v$ is its approach rate to the preceding vehicle, $s$ is the gap, $v_0$ is the desired speed, $a_{\max}$ is the maximum acceleration, $b$ is the comfortable deceleration, $s_0$ is the minimum gap, $T_g$ is the desired time gap, and $\delta$ is the acceleration exponent. According to the data of the host vehicle and the preceding vehicle in the training set, the relevant parameters of the IDM model are calibrated, and the calibrated values are shown in Table 2.
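For reference, the IDM acceleration can be sketched in a few lines, assuming the standard formulation with a desired speed, a desired time gap, a minimum gap, a maximum acceleration, and a comfortable deceleration. The parameter names and the numerical values below are placeholders chosen for illustration; they are not the calibrated values of Table 2.

```python
import math

def idm_acceleration(v, gap, dv, v0, a_max, b, s0, t_gap, delta=4.0):
    """Standard IDM acceleration (assumed formulation).

    v: host speed (m/s); gap: distance to the leader (m);
    dv: approach rate v - v_leader (m/s).
    """
    # Desired dynamic gap: minimum gap + time-gap term + braking interaction term.
    s_star = s0 + v * t_gap + v * dv / (2.0 * math.sqrt(a_max * b))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Placeholder parameters, NOT the calibrated values of Table 2.
acc = idm_acceleration(v=0.0, gap=4.0, dv=0.0, v0=30.0, a_max=1.0, b=2.0, s0=2.0, t_gap=1.5)
free = idm_acceleration(v=30.0, gap=1e9, dv=0.0, v0=30.0, a_max=1.0, b=2.0, s0=2.0, t_gap=1.5)
```

The two calls illustrate the two regimes: a stopped vehicle with a gap of twice the minimum gap accelerates at three quarters of the maximum acceleration, while a vehicle at its desired speed on an empty road has essentially zero acceleration.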

LSTM: The LSTM-based car-following model was proposed by Huang et al. [9] based on the LSTM structure. The kinematic states of the host vehicle and the preceding vehicles form the input of the LSTM model. Besides, the number of hidden layers of the LSTM car-following model is set to 1.
GRU: The GRU-based car-following model uses the basic GRU structure described in Section III-A, and there is one hidden layer in the model [8].
SDL: The first Seq2seq deep learning (SDL) model for modeling the car-following behavior is given by Ma et al. [46]. The SDL model employs the LSTM as the encoder and decoder with the numbers of hidden layers both set as 1.
GSDL: A GRU-based Seq2seq deep learning (GSDL) model is also employed as a baseline model. It is built according to the study of Hao et al. [31]. The GSDL uses a one-way GRU layer as the encoder and another one-way GRU layer with an attention mechanism as the decoder.
To fairly compare the performance of different models, the number of training epochs and the batch size are set to 50 and 64 for all deep learning-based models, including the LSTM model, GRU model, SDL model, GSDL model, and ISDL model. Adam is chosen as the optimizer with the learning rate set to 0.0001, and the number of hidden units in the hidden layer of all deep learning-based car-following models is set to 128. In addition, the training is stopped automatically if the loss on the training dataset does not improve for 10 consecutive epochs. Since the number of preceding vehicles determines the amount of information that is used to capture the car-following behavior, all deep learning-based car-following models consider the same number of preceding vehicles when extracting kinematic states as their input.
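The stopping rule can be sketched independently of any deep learning framework, assuming that "not improved" means the training loss did not fall below its best value so far. The function `train_with_early_stopping` and the synthetic loss sequence are illustrative stand-ins for a real training loop.

```python
def train_with_early_stopping(epoch_losses, max_epochs=50, patience=10):
    """Return the number of epochs run before training stops.

    epoch_losses: iterable yielding the training loss of each epoch
    (a stand-in for a real training loop).
    """
    best = float("inf")
    stale = 0           # consecutive epochs without improvement
    epochs_run = 0
    for loss in epoch_losses:
        if epochs_run >= max_epochs:
            break
        epochs_run += 1
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epochs_run

# Synthetic losses: five improving epochs followed by a plateau.
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.4] * 100
epochs = train_with_early_stopping(losses)
```

In this synthetic run the loss stops improving after the fifth epoch, so training halts ten epochs later, well before the 50-epoch budget is exhausted.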

C. EVALUATION CRITERIA
In this paper, the mean absolute errors of the predicted velocity ($MAE_v$) and the predicted position ($MAE_x$) are selected as evaluation indexes. To better evaluate the error of the trajectory simulation, the mean squared error of the predicted position ($MSE_x$) is introduced as an evaluation index as well. In the calculation of these indexes, $N_s$ is the number of vehicles in the testing dataset.
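The three indexes can be computed as in the following sketch, which assumes that each error is first averaged over the timesteps of a trajectory and then over the vehicles in the test set. This averaging order, the function names, and the sample trajectories are assumptions made for illustration.

```python
def mae(pred, true):
    """Mean absolute error over one trajectory."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)

def mse(pred, true):
    """Mean squared error over one trajectory."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

def evaluate(trajectories):
    """Average the per-vehicle errors over all test vehicles.

    trajectories: list of dicts with keys v_pred, v_true, x_pred, x_true.
    """
    n = len(trajectories)
    return {
        "MAE_v": sum(mae(tr["v_pred"], tr["v_true"]) for tr in trajectories) / n,
        "MAE_x": sum(mae(tr["x_pred"], tr["x_true"]) for tr in trajectories) / n,
        "MSE_x": sum(mse(tr["x_pred"], tr["x_true"]) for tr in trajectories) / n,
    }

# Two illustrative test vehicles (values are not from the dataset).
test_set = [
    {"v_pred": [10.0, 11.0], "v_true": [10.0, 12.0], "x_pred": [0.0, 2.0], "x_true": [0.0, 0.0]},
    {"v_pred": [5.0, 5.0], "v_true": [5.0, 5.0], "x_pred": [1.0, 1.0], "x_true": [1.0, 1.0]},
]
scores = evaluate(test_set)
```

Since the squared error weights large deviations more heavily than the absolute error, $MSE_x$ is more sensitive than $MAE_x$ to occasional large position drifts.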

A. INFLUENCE OF THE CRITICAL PARAMETER ON THE ISDL MODEL
The driving decision in the car-following process is closely related to the historical driving behaviors and the reaction delay of vehicles. As a common feature of human-driven vehicles, the reaction delay consists of human psychological processing time, device response time, and vehicle movement time [47]. Previous studies indicate that the length of the input timestep, which accounts for the reaction delay, affects the performance of deep learning models [8], [9]. Papathanasopoulou et al. [48] indicate that the reaction delay lies in a wide range from 0.4 s to 3.0 s, and Zheng et al. [24] demonstrate that the minimum value of the reaction delay is around 0.5 s. With the time interval of collecting the kinematic information set as 0.1 s, it is necessary to study the influence of the timestep length on the performance of the ISDL model by selecting the length of the timestep from {5, 10, 15, 20, 25, 30}, which corresponds to reaction delays from 0.5 s to 3.0 s with a step of 0.5 s. Fig.5 indicates the performance of the ISDL model with the input timestep ranging from 5 to 30 under the scenario $N_p = 1$, where the host vehicle generates a simulated trajectory using the information of the one nearest preceding vehicle, and the scenario $N_p = 2$, where the kinematic parameters of the two nearest preceding vehicles are employed to generate the simulated trajectory. As indicated in Fig.5, the simulated speed error and simulated position error of the ISDL model decrease gradually as the input timestep increases from 5 to 15. Once the input timestep is larger than 15, the $MAE_v$, $MAE_x$, and $MSE_x$ are stable with little fluctuation under the two scenarios. Hence, the optimal input timestep of the ISDL model is set as 15, which is equal to 1.5 s.
Note that with the same input timestep, the ISDL model considering the state information of two preceding vehicles works better than the one with the state information of only one preceding vehicle, indicating that multi-vehicle information is beneficial for improving the predictive quality of the simulated trajectories.
Considering that the development of communication technology promotes the application of the CAVs, it is necessary to study the influence of multi-vehicle information on the performance of deep learning-based models. Fig.6 compares the $MAE_v$, $MAE_x$, and $MSE_x$ of the LSTM model, GRU model, SDL model, and ISDL model with different input features. Fig.6 shows that the models with the information of two preceding vehicles are superior to those with the information of only one preceding vehicle in terms of speed prediction and position prediction, indicating that the extra kinematic parameters provided by the second preceding vehicle are beneficial for fitting the behavior of the host vehicle. For each deep learning-based car-following model, the improvements in the simulated position are more significant than those in the simulated speed. Note that in the following sections all deep learning-based car-following models organize their input matrix with the information of two preceding vehicles.

B. COMPARISON OF THE PERFORMANCE OF DIFFERENT MODELS
In this section, we compare the overall performance of the ISDL model with those of the IDM, the LSTM, the GRU, the GSDL, and the SDL model. To achieve a fair comparison, the input timesteps of the deep learning-based models are set as 15. Table 3 presents the overall error indexes of the different car-following methods. The proposed ISDL model works best among these models and outperforms the second-best simulator, the GSDL model, with improvements of 0.042 and 5.55 on $MAE_v$ and $MSE_x$. This may be because the ISDL model uses an improved sequence-to-sequence framework whose Bi-GRU encoder better extracts features from the information of multiple preceding vehicles. It can also be found that the GSDL outperforms the SDL model with an improvement of 7.55 on $MSE_x$, since it extends the Seq2seq framework with an attention mechanism to generate a context vector at each simulated time interval. Compared to the SDL and the GSDL model, the ISDL model is capable of learning the car-following behavior more effectively with the improved Seq2seq framework.
Besides, the deep learning-based models with Seq2seq frameworks have smaller errors on both simulated speeds and positions since the Seq2seq framework is able to take the memory effect and reaction delay into account [43]. The SDL model outperforms the LSTM model with improvements of 0.63 and 18.16 on $MAE_x$ and $MSE_x$ respectively. Meanwhile, Table 3 demonstrates that the deep learning-based models provide higher-quality simulated results than the IDM model in fitting the following behavior, since the data-driven models have more parameters to capture the heterogeneous following behavior from a large dataset.
To study the model performance of models under different states, we divide the car-following dataset into three sub-datasets including the low-speed car-following scenario, medium-speed car-following scenario, and high-speed car-following scenario, which correspond to the speeds below 7 m/s, speeds from 7 m/s to 12 m/s, and speeds more than 12 m/s. Table 4, Table 5, and Table 6 indicate the performance comparison of different models under the above scenarios. As shown in Table 4 when the vehicle follows at low speeds, the deep learning-based models have superior performance than the IDM model, and the ISDL model outperforms the SDL model with improvements of 14.44% and 21.83% on MAE x and MSE x respectively. It reveals that the ISDL model is capable of fitting the short headway of the car-following under the low-speed scenario. In addition, it is indicated in Table 5 and Table 6, the Seq2seq models present lower speed prediction and position prediction than other models, which demonstrates that the Seq2seq structure can well memory the driving behavior in medium-speed and high-speed car-following behavior. Fig.7 and Fig.8 present the simulated error distribution of speeds and position respectively for four representative models including the IDM model, GRU model, SDL model, and ISDL model. It can be learned from Fig.7 that the four models all can provide accurate simulated speed while the simulated speed values present different distribution characteristics. Since the IDM is a classical following model driven by traffic safety, it generates the simulated speeds less than the empirical data with high frequency to guarantee safe space headway. While the GRU, the SDL, and the ISDL model prefer to generate simulated speeds a little larger than the empirical data to make the trajectory of the host vehicle closer to that of the preceding vehicle. 
In addition, the error distribution of the simulated speeds in Fig.7 reveals that the errors of the GRU, the SDL, and the ISDL model are more tightly concentrated around their mean values, illustrating that the deep learning-based models reproduce the driving behavior better than the IDM.
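The speed-based partition and the signed-error comparison discussed above can be sketched as follows. This is a minimal illustration under stated assumptions: each time step is assigned to a regime by its observed speed, and the boundary values 7 m/s and 12 m/s are placed in the medium-speed group. The text does not specify how boundary values or whole trajectories are assigned, so both choices, as well as all names in the code, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence

LOW_SPEED_MAX = 7.0      # m/s, upper bound of the low-speed scenario
MEDIUM_SPEED_MAX = 12.0  # m/s, upper bound of the medium-speed scenario


def speed_regime(speed_mps: float) -> str:
    """Map an observed speed to 'low', 'medium', or 'high' (boundaries assumed)."""
    if speed_mps < LOW_SPEED_MAX:
        return "low"
    if speed_mps <= MEDIUM_SPEED_MAX:
        return "medium"
    return "high"


def regime_report(
    observed: Sequence[float], simulated: Sequence[float]
) -> Dict[str, Dict[str, float]]:
    """Per-regime sample count, MAE, and mean signed error (simulated - observed)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have equal length")
    errors = defaultdict(list)
    for obs, sim in zip(observed, simulated):
        errors[speed_regime(obs)].append(sim - obs)
    return {
        regime: {
            "n": float(len(errs)),
            "mae": sum(abs(e) for e in errs) / len(errs),
            "bias": sum(errs) / len(errs),  # negative: speed is underestimated
        }
        for regime, errs in errors.items()
    }


# Toy speeds in m/s (made-up numbers, not data from the paper).
report = regime_report(
    observed=[5.0, 6.0, 7.0, 12.0, 15.0],
    simulated=[5.5, 5.5, 7.5, 12.5, 14.0],
)
for regime, stats in report.items():
    print(regime, stats)
```

A negative `bias` corresponds to a model that tends to generate speeds below the empirical data, and a positive `bias` to one that tends to generate speeds above it.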
Similar to Fig.7, Fig.8 shows the error distribution of the simulated positions of the four models. It can be clearly seen that the Seq2seq-based models track the leading vehicles more accurately than the IDM and the GRU model. Meanwhile, large errors of the SDL model, namely those smaller than −10 or larger than 10, occur more frequently than those of the ISDL model, indicating that the ISDL model not only inherits the sequence-learning features of the SDL model but also improves on its performance by introducing an attention mechanism and a bidirectional encoder to extract important information.
Furthermore, to investigate the performance of the models under heterogeneous driving behaviors, the simulated trajectories and the corresponding simulated speeds of representative vehicles are presented. Note that we divide driver behavior into three types, namely regular, aggressive, and cautious behavior, according to related studies [49]. The aggressive driver tends to anticipate traffic conditions in advance and prefers a smaller gap when following the leader, which may cause traffic oscillation or stop-and-go traffic [50]. In contrast, the cautious driver always intends to keep a large distance from the preceding vehicle. Fig.9, Fig.10, and Fig.11 present the performance of the models under regular, aggressive, and cautious driving behavior, respectively, which are distinguished according to the average gap distances of the trajectories. As shown in Fig.9, Fig.10, and Fig.11, where vehicle 915, vehicle 796, and vehicle 965 exhibit regular, aggressive, and cautious driving behavior, respectively, the IDM, the LSTM, and the ISDL model perform differently in simulating the following trajectories. Among the three models, the simulated trajectories of the ISDL model are closest to the empirical trajectories under all three driving behaviors.
Fig.9 indicates that the IDM, the LSTM, and the ISDL model simulate speeds with comparable accuracy for regular driving behavior. Meanwhile, it can be observed in Fig.10 and Fig.11 that the LSTM and the ISDL model perform much better than the IDM for aggressive and cautious driving behaviors. Note that the IDM tends to keep a larger gap than the LSTM and the ISDL model, which indicates that the deep learning models are better able to reproduce the empirical trajectories.
Fig.12 investigates the detailed following behavior of the different models. Because human drivers can neither accurately judge the speed of the leading vehicle nor precisely maintain their own speed, the relative spacing and relative speed between any two consecutive vehicles usually oscillate. To further compare the accuracy of the different kinds of models, it is necessary to examine the oscillating behavior they produce by analyzing the driving behavior of specific vehicles. Fig.12 shows the case in which vehicle 865 follows vehicle 860, and provides the empirical data together with the simulation results of the IDM, the LSTM, and the ISDL model. The results indicate that the ISDL model yields a more accurate approximation of the oscillating behavior than the IDM and the LSTM model.

C. DETAILED BEHAVIOR COMPARISON AND PLATOON SIMULATION
Platoon simulation can reflect traffic phenomena such as oscillation and hysteresis, and it is a crucial application and indicator of car-following models. In the platoon simulation, the leading vehicle runs on a preset route, influenced by instructions or by the traffic states outside the platoon. The other vehicles run according to the car-following model. The movement of each following vehicle is affected by its two preceding vehicles and their previous states, except for the second vehicle, which has only one preceding vehicle. To further explore the performance of the proposed model in reproducing traffic oscillation, we extract a platoon that traverses stop-and-go waves for platoon simulation, in which the two leading vehicles are vehicle 739 and vehicle 745. Fig.13 presents the time-space diagrams of the real and simulated trajectories, indicating that the proposed model can estimate the traffic oscillations through platoon simulation. Although the errors of the simulated trajectories accumulate from upstream to downstream, the stop-and-go waves, with accurately predicted speeds, are well captured by the ISDL model.
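The platoon simulation procedure described above can be sketched as a simple time-stepping loop. The following Python code is a minimal illustration under stated assumptions, not the implementation used in the experiments: the leader follows a preset trajectory, every follower receives the states of up to two preceding vehicles, and the trained ISDL model is replaced by the IDM acceleration rule as a stand-in that uses only the nearest preceding vehicle. The vehicle length, time step, IDM parameter values, and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

VEHICLE_LENGTH = 5.0  # m (assumed)
State = Tuple[float, float]  # (position in m, speed in m/s)


def idm_stand_in(x: float, v: float, preceding: Sequence[State],
                 v_des: float = 30.0, headway: float = 1.5,
                 a_max: float = 1.0, b: float = 1.5, s0: float = 2.0) -> float:
    """IDM acceleration; a stand-in that ignores all but the nearest leader."""
    lead_x, lead_v = preceding[0]
    gap = max(lead_x - x - VEHICLE_LENGTH, 0.1)  # guard against division by zero
    s_star = s0 + max(0.0, v * headway + v * (v - lead_v) / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v_des) ** 4 - (s_star / gap) ** 2)


def simulate_platoon(lead_x: Sequence[float], lead_v: Sequence[float],
                     init_x: Sequence[float], init_v: Sequence[float],
                     accel_fn: Callable[[float, float, Sequence[State]], float] = idm_stand_in,
                     dt: float = 0.1, n_leaders: int = 2
                     ) -> Tuple[List[List[float]], List[List[float]]]:
    """Roll the platoon forward; index 0 is the preset leader, followers come after."""
    xs = [list(lead_x)] + [[x] for x in init_x]
    vs = [list(lead_v)] + [[v] for v in init_v]
    for t in range(len(lead_x) - 1):
        for i in range(1, len(xs)):
            # States of up to n_leaders preceding vehicles at time t, nearest first.
            first = max(i - 1 - n_leaders, -1)
            preceding = [(xs[j][t], vs[j][t]) for j in range(i - 1, first, -1)]
            acc = accel_fn(xs[i][t], vs[i][t], preceding)
            v_next = max(0.0, vs[i][t] + acc * dt)  # vehicles do not reverse
            xs[i].append(xs[i][t] + 0.5 * (vs[i][t] + v_next) * dt)
            vs[i].append(v_next)
    return xs, vs
```

With this interface, the second vehicle receives one preceding state and every later vehicle receives two, which mirrors the information topology described in the text. A learned model would replace `idm_stand_in` with a function that maps the same inputs, plus any required history, to an acceleration.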

VI. CONCLUSION
Data-driven car-following modeling is important for traffic simulation and the development of connected automated vehicle technology. This paper proposes an improved Seq2seq deep learning (ISDL) model for the CAVH environment. Firstly, the kinematics information of the multiple preceding vehicles and the host vehicle is extracted and organized into an input matrix according to the information topology of CAVs. Secondly, we propose an improved Seq2seq framework that combines a Bi-GRU encoder, an attention mechanism, and a GRU decoder in an end-to-end fashion to learn the characteristics of car-following behavior. Thirdly, high-fidelity NGSIM data of car-following behavior with several preceding vehicles are employed to train, validate, and test the ISDL model. Finally, the proposed model is compared with the IDM, the GRU, the LSTM, the SDL, and the GSDL model. The main findings of the experiments are as follows.
(1) The simulated speeds and positions of the ISDL model are more accurate than those of the baseline models, indicating that the ISDL model captures heterogeneous driving behaviors by mining the underlying information from the field data. (2) Introducing the information of multiple preceding vehicles improves the simulation performance of the deep learning-based car-following models in terms of MAE_v, MAE_x, and MSE_x. (3) The proposed model reproduces the oscillation of relative spacing and speed better than the benchmark models. In addition, the ISDL model can reproduce the macroscopic stop-and-go waves in platoon simulation.
These findings shed light on the connected automated vehicle research area. One application is to generate the speeds of neighboring human-driven vehicles for a CAV according to a specific information flow topology. The model can also be employed to estimate traffic conditions through the simulation of traffic oscillation in the CAVH environment. Moreover, it can be expected that more vehicle dynamics parameters, such as braking and steering, which can probably be acquired through V2V communication in real-world CAVH environments, can be introduced as inputs of the deep learning framework to predict the precise trajectory and motion of the vehicle.