Data-Driven Robust Predictive Control for Mixed Vehicle Platoons Using Noisy Measurement

This paper investigates cooperative adaptive cruise control (CACC) for mixed platoons consisting of both human-driven vehicles (HVs) and automated vehicles (AVs). This research is critical because the penetration rate of AVs in the transportation system will remain unsaturated for a long time. Uncertainties and randomness are prevalent in human driving behaviours and strongly affect platoon safety and stability, so they need to be considered in the CACC design. A further challenge is the difficulty of knowing the exact models of the HVs and the exact powertrain parameters of both AVs and HVs. To address these challenges, this paper proposes a data-driven model predictive control (MPC) that does not need the exact models of HVs or powertrain parameters. The MPC design adopts the technique of data-driven reachability to predict the future trajectory of the mixed platoon within a given horizon based on noisy vehicle measurements. Compared to the classic adaptive cruise control (ACC) and existing data-driven adaptive dynamic programming (ADP), the proposed MPC ensures satisfaction of constraints such as the acceleration limit and the safe inter-vehicular gap. With this salient feature, the proposed MPC provably guarantees a safe and robustly stable mixed platoon despite velocity changes of the leading vehicle. The efficacy and advantage of the proposed MPC are verified through comparison with the classic ACC and data-driven ADP methods on both small and large mixed platoons.


I. INTRODUCTION
COOPERATIVE adaptive cruise control (CACC), which leverages vehicle-to-vehicle (V2V) wireless communications, ensures that a convoy of vehicles travels at the same longitudinal velocity with safe inter-vehicular gaps. Both theoretical and experimental studies have revealed the great potential of CACC in reducing traffic congestion, accidents and fuel consumption [1]-[7]. This has attracted much research interest and many CACC strategies have been developed for effective platooning of pure automated vehicles (AVs) [8], [9]. However, the penetration rate of AVs in the transportation system will remain unsaturated for a long time, resulting in the coexistence of AVs and human-driven vehicles (HVs) on roads [10]. Hence, there is a strong need to develop CACC for mixed vehicle platoons consisting of both AVs and HVs. The key difference between a platoon of pure AVs and a mixed platoon is that the latter involves HVs, whose behaviours are not programmable like those of AVs. Moreover, human driving behaviours have inherent uncertainties and randomness that would cause traffic congestion [11] and oscillation [12]. Therefore, the behaviours of HVs need to be considered in designing the CACC of a mixed platoon to ensure platoon safety (i.e., collision-free) and robustness (i.e., formation-maintainable), as demonstrated in the experiments [13]-[15]. However, the state-of-the-art CACC strategies for platooning pure AVs are normally based on simple and identical point-mass vehicle models and thus cannot be applied to mixed platoons. This raises the demand for developing new CACC strategies for mixed platoons. Many models have been developed to capture human driving behaviours in the car-following setting [16], e.g., the intelligent vehicle model, the optimal velocity (OV) model, etc. Compared to other models, the OV model is simple in representation but can characterize qualitatively almost all kinds of traffic behaviours and the transitions between different behaviours [16], [17].
Hence, the OV model has been adopted for analyzing the stability of mixed platoons [18], [19] and developing model-based CACC for AVs within mixed platoons [17], [20]-[24]. Linear quadratic regulators are designed in [20], [21] for controlling an AV to smooth the mixed traffic flow on a ring road. Optimal control is developed in [17] for an AV to lead a number of HVs at a signalized intersection. Tube model predictive control (MPC) is used in [22] to control the AV behind a group of HVs. Robust control is designed in [24] for the AVs in large-scale mixed platoons. However, all the above control designs need to know the parameters of the OV model, which is too restrictive because HV behaviours are difficult to model exactly [10]. Moreover, neither the OV model of HVs nor the point-mass model of AVs adopted in the above works includes the effect of time delays in propulsion, which could affect the platoon stability. Therefore, it is more appealing to develop a CACC for mixed platoons that considers the propulsion time delays and does not need to know the HV model (i.e., OV model) parameters.
Adaptive dynamic programming (ADP) [25] has been adopted by [26]-[28] to design data-driven optimal CACC for AVs in the mixed platoon, where the HV model parameters are not required. ADP has proved powerful for learning optimal stabilizing controllers from collected input-state data of the system. However, these works lack a systematic way to guarantee a safe inter-vehicular gap and the platoon's robustness against leader velocity changes. Reinforcement learning based CACC for mixed traffic has been developed in [29], where the AVs' acceleration commands are generated by a centralized learning model managed on the cloud. The learning model is trained offline using experimental data of mixed traffic to mimic behaviours of the OV model under safety and physical constraints. However, the centralized setting relies on vehicle-to-cloud communications and may cause time delays in applying the acceleration commands. Also, the parameters of the learning model are fixed once trained, which poses a challenge in generating optimal CACC for general mixed platoons. None of the above data-driven designs considers noise in the vehicle state measurements or unknown propulsion time delays, which may lead to degraded CACC performance.
To address the above challenges, this paper aims to develop a data-driven MPC for mixed platoons with unknown HV model parameters, unknown propulsion time delays and measurement noise. Due to its capability of real-time optimization and explicit constraint handling, MPC has been widely used for platoons of pure AVs [5]- [7], [9], [30] and also for mixed platoon with known HV models [22]. In principle, the MPC design relies on a known platoon model to predict the future platoon trajectory under the candidate control sequence. The existing model-based MPC designs are inapplicable to this work, because the investigated mixed platoons have unknown models. The proposed design will adopt the technique of system reachability analysis to predict the trajectory of the mixed platoon under specified input and safety constraints. Reachability analysis has been used for autonomous vehicle path planning in [31], but the exact vehicle model is needed. Recently, data-driven reachability analysis has been developed in [32] and used for MPC design for generic discrete-time linear systems in [33]. However, their MPC design requires data sets collected through offline experiments, making it inapplicable for mixed platoon application. In practice a mixed platoon is more likely to be formed on-the-fly and a priori knowledge of it is unavailable. Hence, setting offline simulation experiments for data collection is unrealistic for mixed platoon applications.
The above background motivates this work and the main contributions are summarized as follows:
1) A data-driven robust MPC is proposed to control the ego AVs in the mixed vehicle platoon with unknown HV models. Each vehicle (either AV or HV) is characterized by a third-order dynamic model with an unknown propulsion time delay. The third-order model captures more realistic vehicle dynamics than the second-order point-mass model used in the existing literature on mixed platoons. Moreover, the proposed MPC is applicable to a wide range of mixed platoon formations that contain the one in [26]-[28] as a special case.
2) The proposed MPC explicitly considers input and safety constraints and provably guarantees a safe and robustly stable mixed platoon, which is lacking in the data-driven ADP-based designs [26]-[28]. The MPC determines a real-time safe, robust and optimal acceleration command for each ego AV, which has not been realized by either the ADP-based methods [26]-[28] or the reinforcement learning based method [29].
3) The proposed MPC adopts the idea of data-driven reachability and uses vehicle state measurements with unknown noise, which has not been investigated in the literature. The data is collected on the fly while the mixed platoon is operating, rather than through offline simulation experiments as in [33]. During data collection, the lead AV generates a small velocity change to excite the platoon dynamics and the ego AVs are equipped with the classic adaptive cruise control (ACC) to avoid collisions.
The rest of this paper is organized as follows. Section II describes the mixed platoon and CACC problem, Section III presents the data-driven MPC design, Section IV provides the simulation results, and Section V draws the conclusions.
Notation: The symbol $\mathbb{R}^n$ is the $n$-dimensional Euclidean space. $\otimes$ is the Kronecker product. The superscripts $\top$ and $\dagger$ are transpose and pseudo-inverse, respectively. $|\cdot|$ is the absolute value and $\|x\|_P = x^\top P x$. $I_n$ is an $n \times n$ identity matrix. $1_n$ is an $n$-dimensional column of ones. $\mathbb{I}_{[a,b]}$ denotes the set of integers from $a$ to $b$. $0$ is a zero matrix whose dimensions are known from the context unless it is necessary to give them. s.t. is short for subject to.

II. MIXED PLATOON MODEL AND CACC PROBLEM
This paper considers the mixed platoon in Fig. 1, where all the vehicles are equipped with V2V wireless communication devices. The mixed platoon has N + 1 vehicles, including the lead AV 0, the end AV N, and N − 1 HVs between them. The role of AV 0 is to ensure controllability of the entire platoon and assist the data collection for designing the proposed data-driven MPC for AV N. AV 0 is assumed to already have a well-tuned controller, e.g., the MPC in [30], to ensure the tracking of reference longitudinal velocity. The focus of this paper is to design the longitudinal acceleration commands of AV N to follow AV 0 by using the motion information from vehicles 0 to N − 1. The CACC design in this paper is based on Fig. 1, but applicable to more general mixed platoons. This is because a general mixed platoon can be split into multiple sub-platoons with the formation of Fig. 1, which will be demonstrated in the simulation studies.
The longitudinal dynamics of the AVs are characterized by the third-order linear system

$$\dot{p}_i = v_i, \quad \dot{v}_i = a_i, \quad \dot{a}_i = -\frac{1}{\tau_i} a_i + \frac{1}{\tau_i} u_i, \tag{1}$$

where $i \in \{0, N\}$. The variables $p_i$, $v_i$, $a_i$, and $u_i$ are the vehicle position, longitudinal velocity, acceleration, and acceleration command, respectively, and $\tau_i$ is the unknown propulsion time delay. The acceleration command $u_N$ is to be determined for controlling AV $N$ to track $v_0$ whilst keeping a safe inter-vehicular gap $h^*$ between itself and HV $N-1$. From (1), the platooning error system (2) of AV $N$ is derived in terms of its platooning error vector $x_N$ and the platooning error vector $x_{N-1}$ of HV $N-1$.

The car-following behaviour of HV $i$, $i \in \mathbb{I}_{[1,N-1]}$, is captured by the third-order nonlinear system (3), where $h_i = p_{i-1} - p_i$ is the gap between vehicles $i-1$ and $i$, $\alpha_i$ is the headway gain, and $\beta_i$ is the relative velocity gain. $V(h_i)$ is the spacing-dependent desired velocity given in [20], where $h_s$ is the smallest gap before the HV intends to stop and $h_g$ is the largest gap after which the HV intends to maintain the maximum velocity $v_{\max}$. This paper establishes a stable platoon and thus ensures $h_s < h_i < h_g$. To build a more realistic mixed platoon model, the HV model (3) includes the acceleration dynamics with unknown propulsion time delay $\tau_i$, which has not been considered in the existing literature on mixed platoons [16]-[24], [26]-[29]. When AV 0 travels at the velocity $v_0$ ($\le v_{\max}$), the HVs will ultimately reach an equilibrium point, and the linearized models (5) of the HVs are derived around it.

Define the overall platooning error vector as $x = [x_1^\top \cdots x_N^\top]^\top$ and the control input as $u = u_N$. By using (2) and (5), the overall platooning error system is derived as

$$\dot{x} = A_c x + B_c u + E_c a_0, \tag{6}$$

with the system matrices $A_c$, $B_c$ and $E_c$. The acceleration $a_0$ of AV 0 is regarded as a disturbance, because it is an external input that tends to drive the overall platooning error system (6) away from the equilibrium position.
Hence, $u$ will be designed to ensure that the platooning error system is robustly stable against $a_0$. Discretizing (6) using the forward Euler method with sampling time $t_s$ gives the control-oriented mixed platoon model

$$x(t+1) = A x(t) + B u(t) + E a_0(t), \quad y(t) = x(t) + w(t), \tag{7}$$

where $t$ is the sampling step, $A = I_n + t_s A_c$, $B = t_s B_c$ and $E = t_s E_c$. $y(t)$ is the measured state and $w(t)$ is the measurement noise, which is unknown but bounded. The dimensions of the vectors $x(t)$, $u(t)$, $a_0(t)$, $y(t)$ and $w(t)$ are $n = 3N$, $m = 1$, $q = 1$, $n = 3N$ and $n = 3N$, respectively. Although the car-following behaviours of HVs can be captured by the OV model (5), the uncertainty and randomness in human driving make it impossible to identify the exact model parameters $\alpha_i$ and $\beta_i$, $i \in \mathbb{I}_{[1,N-1]}$. The propulsion time delays $\tau_i$, $i \in \mathbb{I}_{[1,N]}$, are also unknown. Hence, the system matrices $A$ and $B$ in (7) are unknown and the model-based CACC designs in [17], [18], [20]-[24] are inapplicable. This paper will develop a data-driven MPC to get an optimal $u(t)$ to realize two objectives: 1) ensure stability of the mixed platoon, as stated in (8); 2) satisfy the input and safety constraints $u(t) \in \mathcal{U}$ and $y(t) \in \mathcal{Y}$ in (9). In (9), $u_{\max}$ is the maximum acceleration, $\tilde{h}_{\max}$ is the maximum allowable inter-vehicular gap error (i.e., deviation from the safe inter-vehicular gap $h^*$), and $\tilde{v}_{\max}$ is the maximum allowable velocity error (i.e., deviation from the equilibrium velocity $v^*$).

Although over-approximation leads to design conservativeness, the over-approximated model is essential for designing MPC to generate a robustly optimal control law $u(t)$ to realize the objectives in (8) and (9). 3) Compute the reachable set of platoon state and design the data-driven robust MPC (see Section III-D). Before proceeding, the necessary preliminaries of reachability and zonotope are provided in Section III-A.
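The forward Euler step that maps the continuous-time matrices to the discrete-time ones can be illustrated with a minimal sketch. It assumes a single vehicle with state [p, v, a] and a first-order propulsion lag; the lag value `tau = 0.5` and the helper name `euler_discretize` are hypothetical and chosen only for illustration, while the sampling time 0.02 s matches the sampling period reported in the simulations.

```python
import numpy as np

def euler_discretize(A_c, B_c, E_c, t_s):
    """Forward Euler discretization: A = I + t_s*A_c, B = t_s*B_c, E = t_s*E_c."""
    n = A_c.shape[0]
    return np.eye(n) + t_s * A_c, t_s * B_c, t_s * E_c

# Assumed single-vehicle model with state [p, v, a] and propulsion lag tau.
tau = 0.5  # hypothetical value; the paper treats tau as unknown
A_c = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, -1.0 / tau]])
B_c = np.array([[0.0], [0.0], [1.0 / tau]])
E_c = np.zeros((3, 1))  # no leader disturbance in this toy example

A, B, E = euler_discretize(A_c, B_c, E_c, t_s=0.02)
```

Forward Euler keeps the discrete matrices affine in the unknown continuous-time parameters, which is convenient when the model is later bounded from data, but it is only accurate when the sampling time is small relative to the lag.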

A. Basics of Reachability and Zonotope
The proposed MPC will be based on the reachable set of the mixed platoon model (7). The reachable set is the union of all possible y(t) within a finite time when starting from the initial state y(0) ∈ Y and implementing a set of possible u(t) ∈ U, in the presence of disturbance a 0 (t) and measurement noise w(t). The reachable set is to be computed using the matrix zonotope based set-propagation technique in [31], [32], which can represent the high-dimensional sets compactly and is computationally efficient. This technique is essential for this work, because the state x(t) in the platoon model (7) has the dimension of n = 3N, which will be very high as the number of vehicles N increases. The basic knowledge of set representation is recalled from [31], [32] and given below.
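Definition 1 (Zonotope): Given the center $c_z \in \mathbb{R}^n$ and generator matrix $G_z = [g_z^{(1)} \cdots g_z^{(\gamma_z)}] \in \mathbb{R}^{n \times \gamma_z}$, a zonotope is the set

$$\mathcal{Z} = \langle c_z, G_z \rangle = \Big\{ x \in \mathbb{R}^n : x = c_z + \sum_{i=1}^{\gamma_z} \beta^{(i)} g_z^{(i)}, \; -1 \le \beta^{(i)} \le 1 \Big\}.$$

Definition 2 (Matrix Zonotope): Given the center matrix $C_m \in \mathbb{R}^{n \times p}$ and generator matrices $G_m^{(i)} \in \mathbb{R}^{n \times p}$, $i \in \mathbb{I}_{[1,\gamma_m]}$, a matrix zonotope is the set

$$\mathcal{M} = \langle C_m, G_m \rangle = \Big\{ X \in \mathbb{R}^{n \times p} : X = C_m + \sum_{i=1}^{\gamma_m} \beta^{(i)} G_m^{(i)}, \; -1 \le \beta^{(i)} \le 1 \Big\}.$$

The following operations of zonotope are to be used:
• Interval over-approximation: $\mathcal{Z} = \langle c_z, G_z \rangle$ can be over-approximated by the interval $[c_z - \Delta_z, \; c_z + \Delta_z]$ with $\Delta_z = \sum_{i=1}^{\gamma_z} |g_z^{(i)}|$, where the absolute value is taken element-wise.

In this paper, the reachable set will be computed directly from noisy data in the presence of the disturbance $a_0(t)$. The noise $w(t)$ and disturbance $a_0(t)$ are assumed to be unknown but satisfy Assumptions 1 and 2, respectively.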
Assumption 1: The measurement noise $w(t)$ is bounded by a zonotope, i.e., $w(t) \in \mathcal{Z}_w = \langle c_w, G_w \rangle$ for all $t \ge 0$. The one-step noise propagation $Aw(t)$ is also bounded by a zonotope, i.e., $Aw(t) \in \mathcal{Z}_{Aw} = \langle c_{Aw}, G_{Aw} \rangle$ for all $t \ge 0$.
According to [34], the bound of Aw(t) in Assumption 1 can be determined by using the largest singular value of A and the upper bound of w(t). The largest singular value of A could be obtained through experiments by considering model parameter uncertainties of HVs [24]. Assumption 1 may be restrictive, but it still remains as an open problem in the field of data-driven control to deal with measurement noise. Since the acceleration a 0 (t) of AV 0 is known to satisfy |a 0 (t)| ≤ u max and the matrix E is known, Assumption 2 is always true.
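The zonotope operations used in set propagation can be illustrated with a minimal sketch. The class `Zonotope` and its method names are hypothetical and are not the CORA toolbox API; the sketch only shows the standard center/generator formulas for a linear map, a Minkowski sum and the interval over-approximation, on arbitrary example numbers.

```python
import numpy as np

class Zonotope:
    """Set {c + G @ beta : -1 <= beta_i <= 1} with center c (n,) and generators G (n, gamma)."""

    def __init__(self, center, generators):
        self.c = np.asarray(center, dtype=float)
        self.G = np.asarray(generators, dtype=float)

    def linear_map(self, L):
        """Image of the zonotope under x -> L @ x."""
        L = np.asarray(L, dtype=float)
        return Zonotope(L @ self.c, L @ self.G)

    def minkowski_sum(self, other):
        """Minkowski sum: add the centers and concatenate the generators."""
        return Zonotope(self.c + other.c, np.hstack([self.G, other.G]))

    def interval(self):
        """Tightest axis-aligned box containing the zonotope."""
        radius = np.abs(self.G).sum(axis=1)
        return self.c - radius, self.c + radius

# Illustrative numbers only (not taken from the paper).
Z = Zonotope([1.0, 0.0], [[1.0, 0.5], [0.0, 1.0]])
W = Zonotope([0.0, 0.0], [[0.1, 0.0], [0.0, 0.1]])
lo, hi = Z.interval()
lo_sum, hi_sum = Z.minkowski_sum(W).interval()
lo_map, hi_map = Z.linear_map([[2.0, 0.0], [0.0, 1.0]]).interval()
```

Because every operation acts only on the center and the generator matrix, the cost grows with the number of generators rather than with the number of vertices, which is why this representation suits the state dimension n = 3N.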

B. Collection of Noisy Mixed Platoon Data
It is seen from (7) that the mixed platoon is influenced by the acceleration $a_0$ of AV 0. Hence, this paper applies a small time-varying acceleration command $u_0(t)$ to AV 0 to excite the platoon dynamics for data collection. To maintain safety of AV $N$ during data collection, its acceleration command is set as $u(t) = u_{\mathrm{ACC}}(t)$, i.e., the classic ACC controller (10) of [35], in which the function "sat()" is defined as $\mathrm{sat}(z, z_{\mathrm{limit}}) := \max(\min(z, z_{\mathrm{limit}}), -z_{\mathrm{limit}})$ so that its output never exceeds $z_{\mathrm{limit}}$ in magnitude. The gap controller $u_{\mathrm{gap}}(t)$ in (11) is used to maintain a safe inter-vehicular gap between HV $N-1$ and AV $N$, where $k_h$ and $k_v$ are constant gains, $d_{\mathrm{safe}}(t) = d_{\mathrm{still}} + t_g v_N(t)$ is the safe inter-vehicular gap, $d_{\mathrm{still}}$ is the standstill distance and $t_g$ is the time headway. The speed controller, with constant gain $k_s$, controls AV $N$ at the specified velocity $v_{\mathrm{set}}$ whenever $h_N(t) \ge d_{\mathrm{safe}}(t)$. In this paper, the MATLAB example "Adaptive Cruise Control with Sensor Fusion" is used as the reference to set the values: $k_h = 0.2$, $k_v = 0.4$, $k_s = 0.5$, $d_{\mathrm{still}} = 5$ m, $t_g = 1.5$ s and $v_{\mathrm{set}} = 24.5$ m/s.

AV $N$ has access to the real-time input $u(t)$ and noisy state measurements $y(t)$ of the system (7). Collecting $T$ steps of input-state data gives the sequences $\{u(t)\}_{t=0}^{T-1}$ and $\{y(t)\}_{t=0}^{T}$, which are used to construct the data set

$$\mathcal{S}_{\mathrm{data}} = \{U_-, Y_+, Y_-\}, \quad U_- = [u(0) \cdots u(T-1)], \quad Y_- = [y(0) \cdots y(T-1)], \quad Y_+ = [y(1) \cdots y(T)]. \tag{13}$$

The unknown disturbance and noise sequences corresponding to the collected data set $\mathcal{S}_{\mathrm{data}}$ are denoted as in (14). According to Assumptions 1 and 2, the sequences in (14) satisfy the relations (15), where the matrix zonotopes $\mathcal{M}_{a_0}$, $\mathcal{M}_w$ and $\mathcal{M}_{Aw}$ are computed from the zonotopes $\mathcal{Z}_{a_0}$, $\mathcal{Z}_w$ and $\mathcal{Z}_{Aw}$, respectively.
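The saturation function, the safe-gap rule and the arrangement of the collected sequences into data matrices can be sketched as follows. The function names are hypothetical, and the column layout of the data matrices (one time step per column, with the "plus" matrix shifted one step ahead of the "minus" matrix) is an assumption based on the usual convention in data-driven reachability; the default values of `d_still` and `t_g` are the ones quoted in the text.

```python
import numpy as np

def sat(z, z_limit):
    """Clip z to [-z_limit, z_limit], as in the definition of sat() in the text."""
    return max(min(z, z_limit), -z_limit)

def safe_gap(v_N, d_still=5.0, t_g=1.5):
    """Safe inter-vehicular gap d_safe = d_still + t_g * v_N."""
    return d_still + t_g * v_N

def build_data_matrices(u_seq, y_seq):
    """Arrange T inputs and T+1 measurements column-wise into (U_minus, Y_plus, Y_minus)."""
    U = np.asarray(u_seq, dtype=float)
    Y = np.asarray(y_seq, dtype=float)
    if U.ndim == 1:  # scalar input per step
        U = U[:, None]
    if Y.shape[0] != U.shape[0] + 1:
        raise ValueError("need exactly one more measurement than inputs")
    return U.T, Y[1:].T, Y[:-1].T

# Tiny illustrative sequences (not platoon data).
u_seq = [0.1, -0.2, 0.3]
y_seq = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
U_minus, Y_plus, Y_minus = build_data_matrices(u_seq, y_seq)
```

With a time headway of 1.5 s and a standstill distance of 5 m, `safe_gap(20.0)` evaluates to 35 m, which gives a feel for the gaps the ACC maintains while data is being collected.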

C. Over-Approximation and Reachable Set of Platoon Model
Due to the existence of measurement noise, there generally exist multiple models [ A B] that are consistent with the collected data set S data = {U − , Y + , Y − }. Under Assumptions 1 and 2, the zonotopes Z a 0 , Z w and Z Aw that bound Ea 0 (t), w(t) and Aw(t) are known. Hence, by collecting enough data such that [Y − U − ] has full rank n + m, a matrix zonotope M AB can be constructed to over-approximate all possible system models (i.e., mixed platoon dynamics) that are consistent with the noisy data, as shown in Lemma 1.
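Lemma 1: Consider the data set $\mathcal{S}_{\mathrm{data}} = \{U_-, Y_+, Y_-\}$ of the mixed platoon model (7), where $[Y_-^\top \; U_-^\top]^\top$ has full rank $n + m$. Then under Assumptions 1 and 2, the matrix zonotope

$$\mathcal{M}_{AB} = (Y_+ - \mathcal{M}_{a_0} - \mathcal{M}_w + \mathcal{M}_{Aw}) \begin{bmatrix} Y_- \\ U_- \end{bmatrix}^{\dagger} \tag{16}$$

contains all possible system models $[A \; B]$ that are consistent with the data and the bounds of disturbance and noise.

Proof: Substituting $x(t) = y(t) - w(t)$ into (7), the dynamics of $y(t)$ is derived as

$$y(t+1) = A y(t) + B u(t) + E a_0(t) + w(t+1) - A w(t). \tag{17}$$

Based on (13) and (14), (17) is equivalently written in the stacked matrix form (18). Since the matrix $[Y_-^\top \; U_-^\top]^\top$ has full rank $n + m$, it has a right inverse given by its pseudo-inverse, and (18) can be solved for $[A \; B]$, which gives (19). By using (15) and (19), the matrix zonotope $\mathcal{M}_{AB}$ in (16) over-approximates all possible system models $[A \; B]$ that are consistent with the noisy data.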
Define $\mathcal{R}_t$ as the model-based reachable set of $y(t)$ at time $t$. It is computed based on (17) and given in (20), where $\mathcal{R}_0 = \langle y(0), 0 \rangle$ and $\mathcal{Z}_{u,t} = \langle u(t), 0 \rangle$. The data-driven over-approximation of $\mathcal{R}_t$ is provided in Lemma 2.

Lemma 2: Consider the data set $\mathcal{S}_{\mathrm{data}} = \{U_-, Y_+, Y_-\}$ of the mixed platoon model (7), where the matrix $[Y_-^\top \; U_-^\top]^\top$ has full rank $n + m$. Then under Assumptions 1 and 2, the model-based reachable set $\mathcal{R}_t$ is a subset of the data-driven reachable set $\hat{\mathcal{R}}_t$ characterized by (21), where $\hat{\mathcal{R}}_0 = \langle y(0), 0 \rangle$ and $\mathcal{Z}_{u,t} = \langle u(t), 0 \rangle$.

Proof: Based on (16) and (20), $\hat{\mathcal{R}}_t$ is governed by (21). Since $[A \; B] \in \mathcal{M}_{AB}$ as shown in Lemma 1, it holds that $\mathcal{R}_t \subset \hat{\mathcal{R}}_t$. Therefore, the data-driven representation in (21) provides an over-approximation of the real mixed platoon dynamics in (17).
One of the sufficient conditions to the statements in Lemma 1 and Lemma 2 is collecting "enough data" such that the matrix [Y − U − ] has full rank n + m. Physically, this ensures that the dynamic behaviours (acceleration/deceleration) of the mixed platoon are captured by the collected data, so that the unknown parameters α i and β i , i ∈ I [1,N−1] , and τ i , i ∈ I [1,N] , in model (7) can be identified from the data [36]. The full rank condition is satisfied if the input u(t) of AV N is persistently exciting of order n+1 during data collection [33], [36]. The persistent excitation is realized via designing the input u(t) in [33]. However, this is inapplicable for mixed platoons because u(t) does not influence the behaviours of the HVs ahead of AV N. In this paper, the dynamics of the mixed platoon is excited by applying a small time-varying acceleration command u 0 (t) to AV 0, as described in Section III-B. Note that it is also possible to ensure the full rank condition under natural driving (without manually adding extra u 0 (t) to AV 0), given that during data collection AV 0 has time-varying velocities that are able to excite fully the dynamics of the mixed platoon. This will be demonstrated through simulations in Section IV-C. Under the natural driving setting, however, it is expected that more time could be taken to collect enough data.
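The full rank condition and the role of the pseudo-inverse can be checked numerically on a toy system. The sketch below is not the paper's matrix zonotope computation: it assumes noise-free, disturbance-free data, so the pseudo-inverse recovers a single nominal model exactly, whereas Lemma 1 wraps a set of models around the data to account for noise and the leader disturbance. The two-state system, the helper names and the random input are hypothetical.

```python
import numpy as np

def has_full_row_rank(D):
    """True if the stacked data matrix [Y_minus; U_minus] has rank n + m."""
    return bool(np.linalg.matrix_rank(D) == D.shape[0])

def nominal_model(Y_plus, D, n):
    """Least-squares estimate of [A B] from Y_plus = [A B] @ D (noise-free case)."""
    AB = Y_plus @ np.linalg.pinv(D)
    return AB[:, :n], AB[:, n:]

# Hypothetical two-state system used only to generate data.
A_true = np.array([[0.9, 0.2], [-0.1, 0.8]])
B_true = np.array([[0.0], [0.5]])

rng = np.random.default_rng(0)
T = 20
u = rng.uniform(-1.0, 1.0, size=(1, T))  # random input excites the dynamics
y = np.zeros((2, T + 1))
y[:, 0] = [1.0, -1.0]
for t in range(T):
    y[:, t + 1] = A_true @ y[:, t] + B_true @ u[:, t]

D = np.vstack([y[:, :-1], u])            # stacked [Y_minus; U_minus]
full_rank = has_full_row_rank(D)
A_hat, B_hat = nominal_model(y[:, 1:], D, n=2)
```

If the input were held constant and the state had settled, the columns of the stacked matrix would become linearly dependent and the rank test would fail, which mirrors the need for a time-varying leader velocity during data collection.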

D. Data-Driven Robust MPC Design and Implementation
This section describes the data-driven MPC design based on (21) to realize the platooning objectives in (8) and (9) with robustness against the measurement noise $w(t)$ and leader disturbance $a_0(t)$. As the starting point, the MPC design based on the mixed platoon model (7) is provided to illustrate the basic principle of MPC. At each time step $t$, the input $u(t)$ is obtained by solving the constrained optimization problem (22), whose cost function is built from the predicted tracking errors $e_y(t+k+1|t) = F y(t+k+1|t) - \bar{y}_r(t)$ and $e_u(t+k|t) = u(t+k|t) - u_r(t)$ over the prediction horizon $N_c$. The input reference $u_r(t) = u_{\mathrm{gap}}(t)$ is the gap controller in (11). $\{u(t+k|t)\}_{k=0}^{N_c-1}$ is the control sequence to be determined, and $y(t)$ is the measured state at time $t$. The weights $Q \in \mathbb{R}^{3 \times 3}$ and $R \in \mathbb{R}^{m \times m}$ are user-specified symmetric positive definite matrices. The first element $u^*(t|t)$ of the obtained optimal control sequence is applied as $u(t)$ to AV $N$.
The MPC problem in (22) needs the unknown matrices $A$ and $B$ and thus is not implementable. This paper reformulates it as a data-driven MPC problem using (21). The key idea is to determine $\{u(t+k|t)\}_{k=0}^{N_c-1}$ at each time step $t$ such that the predicted state sequence $\{y(t+k+1|t)\}_{k=0}^{N_c-1}$ always stays within the computed reachable set and the cost is minimized. The resulting data-driven MPC problem is given in (23), where $\mathcal{Z}_{u,t} = \langle u(t+k|t), 0 \rangle$. The constraint in (23b) is the intersection of the reachable set (21) and the state constraint set $\mathcal{Y}$. The combination of (23b) and (23c) ensures that the predicted state sequence $\{y(t+k+1|t)\}_{k=0}^{N_c-1}$ always stays within the computed reachable set sequence $\{\hat{\mathcal{R}}(t+k+1|t)\}_{k=0}^{N_c-1}$. The data-driven MPC problem in (23) is solved recursively at each time step to obtain the optimal control input $u(t) = u^*(t|t)$ for AV $N$. Properties of this data-driven MPC design are described in Theorem 1.
Theorem 1: Under Assumptions 1 and 2, if the MPC problem in (23) is feasible at the first time instance t 0 , then it is feasible at any time t ≥ t 0 and the obtained control inputs realize the platooning objectives in (8) and (9), i.e., guaranteeing that the mixed platoon is robustly stable and satisfies the input and safety constraints.
Proof: According to Lemma 2, the computed reachable set $\hat{\mathcal{R}}_t$ is an over-approximation of the model-based reachable set $\mathcal{R}_t$, i.e., $\mathcal{R}_t \subset \hat{\mathcal{R}}_t$. The control input sequence is designed to satisfy the constraints in (23b) and (23c). This guarantees that the predicted state is always within the intersection of the over-approximated reachable set and the state constraints regardless of the measurement noise and disturbance. Therefore, feasibility of (23) guarantees robust constraint satisfaction for the over-approximation $\hat{\mathcal{R}}_t$ and hence for the true reachable set $\mathcal{R}_t$. Feasibility of the problem in (23) also ensures that the predicted state is always enforced within the same constraint set $\mathcal{Y}$. This means that $\mathcal{Y}$ is a constant terminal set of the proposed MPC [30]. Therefore, the MPC problem in (23) is feasible at any time $t \ge t_0$ if it is feasible at the first time instance $t_0$.
According to Theorem 1, the velocity deviation $\tilde{v}_N = v_N - v_0$ satisfies $|\tilde{v}_N| \le \tilde{v}_{\max}$ for any leader acceleration $a_0$. Hence, it is straightforward to show that the mixed platoon is head-to-tail string stable [37] in the sense of $L_2$ string stability [26]. For the considered mixed vehicle platoon in Fig. 1, head-to-tail string stability ensures that velocity fluctuations are suppressed from AV 0 to AV $N$, but allows amplification of velocity fluctuations among the HVs between them. The head-to-tail string stability for a general mixed platoon can be investigated in a similar way by splitting the platoon into multiple sub-platoons [38].
Solving (23) involves computing the matrix zonotopes with intersections and constraints in (23b) and (23c), which is computationally expensive and undesirable for real-time implementation. To overcome this, (23) is reformulated as the more computationally efficient optimization problem (24), where $s_u(t+k+1)$ and $s_l(t+k+1)$ are extra variables introduced to ease the computation. $\mathcal{Y}_u$ and $\mathcal{Y}_l$ are the upper and lower bounds on the individual dimensions of $\mathcal{Y}$, respectively. $\hat{\mathcal{R}}_u(t+k+1|t)$ and $\hat{\mathcal{R}}_l(t+k+1|t)$ are the upper and lower bounds on the individual dimensions of $\hat{\mathcal{R}}(t+k+1|t)$, and they are computed by over-approximating $\hat{\mathcal{R}}(t+k+1|t)$ with an interval, as described in Section III-A. The only difference between the MPC problems in (23) and (24) is that the constraints in (23b) and (23c) are replaced by (24b)-(24g) with the extra variables $s_u(t+k+1)$ and $s_l(t+k+1)$. Such a replacement improves the computational efficiency whilst retaining the properties proved in Theorem 1 and the head-to-tail string stability. The real-time implementation of the proposed CACC is summarized in Algorithm 1.
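The idea of replacing set intersections by per-dimension bounds can be illustrated with a short sketch. It is not the paper's formulation of (24b)-(24g), whose slack-variable structure is not reproduced here; it only shows, under that caveat, how an interval over-approximation of a zonotope can be intersected with box constraints using element-wise comparisons. The function names and numbers are hypothetical.

```python
import numpy as np

def zonotope_box(center, generators):
    """Lower and upper bounds of the interval over-approximation of a zonotope."""
    c = np.asarray(center, dtype=float)
    G = np.asarray(generators, dtype=float)
    radius = np.abs(G).sum(axis=1)
    return c - radius, c + radius

def intersect_boxes(lo_a, hi_a, lo_b, hi_b):
    """Element-wise intersection of two boxes, plus a flag telling whether it is non-empty."""
    lo = np.maximum(lo_a, lo_b)
    hi = np.minimum(hi_a, hi_b)
    return lo, hi, bool(np.all(lo <= hi))

# Illustrative reachable set and state constraints (not from the paper).
R_lo, R_hi = zonotope_box([0.5, 0.0], [[1.0, 0.0], [0.0, 2.0]])
Y_lo, Y_hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
lo, hi, nonempty = intersect_boxes(R_lo, R_hi, Y_lo, Y_hi)
```

Working with bounds turns each set-membership requirement into a handful of linear inequalities, at the price of the extra conservativeness introduced by the interval over-approximation.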
Remark 1: The data collection process in Algorithm 1 is necessary whenever a new mixed platoon forms, e.g., due to lane changes and cut-ins/outs of vehicles. There may not be time to collect enough data if the platoon formation changes quickly, e.g., at on/off-ramp areas. In such cases, a switch from the data-driven MPC to classic ACC can be made to avoid collisions and ensure safety of the mixed platoon.

Algorithm 1 Proposed Data-Driven Robust MPC
Require: Input and state constraints $(\mathcal{U}, \mathcal{Y})$, weights $(Q, R)$, and prediction horizon $N_c$.
1) Data collection (Section III-B): collect the data set $\mathcal{S}_{\mathrm{data}}$ over $T$ steps.
2) Model over-approximation (Section III-C): compute $\mathcal{M}_{AB}$ using (16).
3) Control (Section III-D): at each time step $t$, solve (24) for $\{u^*(t+k|t)\}_{k=0}^{N_c-1}$ and apply $u(t) = u^*(t|t)$ to AV $N$.

IV. SIMULATION RESULTS
The proposed MPC has been verified in two different mixed platoons: (i) Sub-platoon 1 and (ii) the entire platoon (consisting of Sub-platoons 1&2) in Fig. 3. The Sub-platoons 1&2 have the same formation as Fig. 1 with N = 5 and N = 3, respectively. The case (ii) is used to demonstrate the applicability of the proposed MPC design for a more general mixed platoon. This has not been investigated in the existing data-driven CACC designs [26]- [28]. Each AV in the platoon is set to not "look beyond" another AV, e.g., AV 8 would not include the V2V signals from those vehicles further ahead than AV 5. This setting is to avoid using unreliable V2V communications between vehicles that are too far apart [39].
To further demonstrate advantages of the proposed design over the existing ones, three different platooning designs are simulated and compared: Classic ACC [35], ADP [28], and Data-driven MPC proposed in this paper. In the ADP design, input-state data are collected for computing the constant gain K ADP to implement the control law u(t) = −K ADP x(t).
The simulations are conducted in MATLAB by using the toolbox YALMIP [40] with the solver MOSEK [41] for computing the MPC, and the CORA toolbox [42] for computing the zonotopes. To simulate more realistic traffic conditions, the vehicles have the following different model parameters:   10), (40,10), (20,10) and (0, 10), respectively.

A. Results of Sub-Platoon 1
This section presents the comparative simulation results of Sub-platoon 1 in Fig. 3 by applying Classic ACC, ADP, and Data-driven MPC to AV 5. Classic ACC is given in (10) with k h = 0.2, k v = 0.4, k s = 0.5, d still = 5 m, t g = 1.5 s and v set = 24.5 m/s, which are the default values in the MATLAB example "Adaptive Cruise Control with Sensor Fusion". ADP is designed based on the platoon model (7) and follows Algorithm 1 in [28] with Q = 10 −3 × I 3 and R = 1 but neglecting the driver reaction time. Data-driven MPC is designed by running Algorithm 1 in Section III-D with T = 3400, Q = I 3 , R = 10 −2 and N c = 2. During data collection of ADP and Data-driven MPC, the excitation signal u 0 (t) = 0.2 sin(πt/600) m/s 2 is applied to AV 0, and u acc (t) in (10) is applied to AV 5 with k h = 0.2, k v = 0.4, k s = 0.5, d still = 5 m and t g = 1.1 s. The corresponding gap controller u gap (t) in (11) is used as the input reference u r (t) for Data-driven MPC.
The vehicle acceleration commands under the three designs are shown in Fig. 4. For ADP and Data-driven MPC, the collected data sequences are required to have the full rank of n + m (which is 16 in this example), where n and m are the dimensions of the platooning model state and the input of AV 5, respectively. However, as seen in Fig. 4, Data-driven MPC collects a larger amount of data than ADP. This is for constructing a more accurate over-approximation of the platoon model to improve the MPC performance. After data collection is finished, all three designs have similar acceleration commands that are within the limit [−u max , u max ], despite the rapid acceleration and deceleration of AV 0.
As shown in Fig. 5, both ADP and Data-driven MPC ensure that the entire platoon travels at the same velocity. However, for Classic ACC, AV 5 is set to drive at the constant speed v set = 24.5 m/s whenever h N (t) ≥ d safe (t). This makes the velocity of AV 5 different from that of its preceding vehicles in the high speed region during t ∈ [150, 240] s. Hence, compared to ADP and Data-driven MPC, Classic ACC has larger inter-vehicular gaps between HV 4 and AV 5, as seen in Fig. 6. The middle sub-plot of Fig. 6 also shows that, under ADP, AV 5 crashes into HV 4 during a rapid deceleration of AV 0, because ADP does not consider the safety constraint y(t) ∈ Y. The above results demonstrate that Data-driven MPC is advantageous in establishing a safe and stable mixed platoon with more compact vehicular gaps, which is beneficial for reducing traffic congestion and fuel consumption.
The computation time to obtain the MPC control input, including computing the data-driven reachable set and the control input sequence, is 0.015 s (on a PC having an Intel(R) i9-10850K CPU 3.60GHz with 32 GB of RAM). This computation time is shorter than the sampling period 0.02 s and does not introduce control input delays.

B. Results of Sub-Platoons 1&2
This section presents the comparative simulation results of the entire platoon in Fig. 3 by applying Classic ACC, ADP, and Data-driven MPC to AVs 5&8. Classic ACC and ADP are designed in the same way as in Section IV-A. Data-driven MPC for AV 5 is also the same as in Section IV-A. Data-driven MPC for AV 8 is designed by running Algorithm 1 with T = 2000, Q = I 3 , R = 10 −2 and N c = 3. It should be emphasized that the data sequences for AVs 5&8 are collected simultaneously, by applying the small time-varying acceleration command u 0 (t) to AV 0. During data collection of ADP and Data-driven MPC, the signals of u 0 (t) applied to AV 0, u acc (t) applied to AVs 5&8, and input reference u r (t) of AVs 5&8 are identical to those used in Section IV-A.
It is observed from Fig. 7 that the acceleration commands of all vehicles are within the limit $[-u_{\max}, u_{\max}]$ under Classic ACC and Data-driven MPC, despite the rapid acceleration and deceleration of AV 0. However, the acceleration command of AV 8 under ADP falls below the lower limit of $-3$ m/s$^2$. This is because the input constraint is not considered in ADP. As shown in Fig. 8, both ADP and Data-driven MPC ensure that the entire platoon travels at the same velocity. Under Classic ACC, both AV 5 and AV 8 are set to travel at the constant speed $v_{\mathrm{set}} = 24.5$ m/s when $h_N(t) \geq d_{\mathrm{safe}}(t)$. Since HVs 6&7 follow their immediate predecessors directly, they also travel at the same velocity as AVs 5&8 at steady state. As a result, in the high-speed region $t \in [150, 240]$ s, the entire platoon splits into two sub-platoons that travel at different velocities: one consists of AV 0, HV 1, HV 2, HV 3 and HV 4, and the other consists of the remaining vehicles. Hence, compared to ADP and Data-driven MPC, Classic ACC results in larger inter-vehicular gaps between HV 4 & AV 5 and between HV 7 & AV 8 in the high-speed region, as seen in Fig. 9. It can also be observed from Fig. 9 that, under ADP, both AV 5 and AV 8 crash into the vehicles directly ahead of them (HV 4 and HV 7, respectively) during a rapid deceleration of AV 0, because the safety constraint is not considered in the ADP design.
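The two failure modes discussed above (an acceleration command outside $[-u_{\max}, u_{\max}]$ and a collapsed inter-vehicular gap) can be detected by a simple post-processing pass over logged simulation traces. The sketch below is only illustrative: the function names, the sample traces and the collision criterion (gap at or below zero) are assumptions, and $u_{\max} = 3$ m/s$^2$ is inferred from the lower limit quoted in the text.

```python
def first_index_outside(values, lower, upper):
    """Return the first sample index with a value outside [lower, upper], or None."""
    for k, value in enumerate(values):
        if value < lower or value > upper:
            return k
    return None


def audit_trace(accel_cmds, gaps, u_max, min_gap=0.0):
    """Report the first input-limit violation and the first collision step.

    accel_cmds: acceleration commands of one AV (m/s^2), one per sample.
    gaps: gap between that AV and its predecessor (m), one per sample.
    A gap at or below min_gap is treated as a collision (assumed criterion).
    """
    collision = next((k for k, h in enumerate(gaps) if h <= min_gap), None)
    return {
        "input_violation_step": first_index_outside(accel_cmds, -u_max, u_max),
        "collision_step": collision,
    }


# Made-up traces for illustration only; they are not read from Figs. 7-9.
safe_run = audit_trace([0.5, -2.0, -3.0, 1.0], [20.0, 15.0, 9.0, 12.0], u_max=3.0)
unsafe_run = audit_trace([0.5, -2.0, -3.6, -3.2], [20.0, 8.0, 1.5, -0.4], u_max=3.0)
```

A check of this kind only flags violations after the fact. The constrained MPC formulation differs in that the limits enter the optimization itself, so violations are ruled out in advance rather than detected afterwards.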

C. Results of Sub-Platoon 1 Under Aggressive Leader and Stochastic HV Parameters
This section reports a test of the proposed Data-driven MPC on Sub-platoon 1 with an aggressive leader and stochastic HV parameters $\alpha_i$ and $\beta_i$, $i \in I_{[1,4]}$. In the simulation, AV 0 follows the SFTP-US06 Drive Cycle in Fig. 10, which is representative of aggressive, high-speed and/or high-acceleration driving behaviour with rapid speed fluctuations. To simulate the uncertainty and randomness in human driving behaviours, the HV model parameters are assumed to change stochastically, i.e., $\alpha_1 = 0.2 + \delta(t)$, $\beta_1 = 0.4 + \delta(t)$, $\alpha_2 = 0.2 + \delta(t)$, $\beta_2 = 0.45 + \delta(t)$, $\alpha_3 = 0.3 + \delta(t)$, $\beta_3 = 0.4 + \delta(t)$, $\alpha_4 = 0.2 + \delta(t)$, $\beta_4 = 0.45 + \delta(t)$, where $\delta(t)$ is a white noise satisfying $|\delta(t)| \leq 0.1$. All the other parameters remain the same as in Section IV-A, except that $h_g = 50$ m, $v_{\max} = 36$ m/s and $u_{\max} = 4$ m/s$^2$. These values of $h_g$, $v_{\max}$ and $u_{\max}$ are chosen so that the following vehicles are capable of tracking the velocity of AV 0 to form a mixed platoon. The initial vehicle positions are the same as in Section IV-A, while the initial velocities are zero, the same as that of AV 0 (see Fig. 10). No excitation signal is applied to AV 0 for data collection. Simulation results show that the Data-driven MPC collects enough data within 70 s. As shown in Fig. 11, the acceleration commands of all vehicles are within the limits, and the inter-vehicular distances between

V. CONCLUSION
This work designs CACC for a mixed platoon consisting of both AVs and HVs. To capture more realistic traffic conditions, the model parameters of HVs are assumed to be unknown and the acceleration dynamics of each vehicle are considered with an unknown propulsion time delay. A data-driven MPC is proposed to obtain the control law of the ego AV for establishing a safe and robustly stable mixed platoon. The MPC design leverages the data-driven reachable sets of the mixed platoon, which are determined from an over-approximation of the unknown platoon model built using noisy vehicle state measurements. The simulation results for both small and large mixed platoons have verified the efficacy of the proposed MPC and its advantages over the classic ACC and ADP methods in guaranteeing platoon safety and robust stability with smaller vehicular gaps. Although the proposed MPC optimization problem needs to be solved at each sampling step, the simulation results have shown that the computation time is less than the sampling period, so implementing the MPC control law introduces no time delay. One direction for future work is to extend the proposed design to ecological mixed vehicle platooning. Another is to develop a data-driven MPC for mixed vehicle platoons that accounts for control input delays caused by time delays in actuators, communication and sensors.