Paving the Way for Evaluation of Connected and Autonomous Vehicles in Buses-Preliminary Analysis

This study aims to identify potential safety considerations that will pave the way for the evaluation of connected and autonomous vehicles (CAV) in public buses. Geo-localized crash data for the Las Vegas metropolitan area from 2014 to 2017 were collected, covering 27 arterials and 466 bus crash samples. A Chi-square Automatic Interaction Detection (CHAID) decision tree model was used to examine the potential effect of CAV technologies on bus crash severity, so that the driver-related factors can be identified and then controlled if CAV technologies are employed. The results suggest that the contributory predictors of crash severity are the action of the driver of the vehicle with main responsibility (going straight, making a U-turn, and passing other vehicles/racing) and the crash type (angle and rear-end). If these factors were controlled by CAV technologies, severe crashes involving buses could be reduced significantly. The findings provide useful insights for CAV companies and policy makers seeking to improve driving conditions and traffic safety in public buses.


I. INTRODUCTION
In recent years, Connected & Autonomous Vehicle (CAV) technologies have become a focus of both the artificial intelligence field and the whole automotive industry, driven by billions in business opportunities and markets. A great number of countries, regions, international automakers and new internet companies are now exploring and investing in CAV technologies [1]. Connected vehicle (CV) technologies focus on vehicle-to-vehicle (V2V) communications by synchronizing the movements of adjacent vehicles, while autonomous vehicle (AV) technologies can replace human drivers with a series of decision-making processes for a varying array of driving tasks. As stated by the Society of Automotive Engineers (SAE), there are six levels of automation: level 0-no automation, level 1-driver assistance, level 2-partial automation, level 3-conditional automation, level 4-high automation, and level 5-full automation [2]. The realization of full automation would bring a series of advantages, e.g. alleviating traffic congestion, reducing traffic crashes and fatalities, and mitigating the environmental burden [3], while reducing driver fatigue and improving drivers' health and wellness [4]-[6]. The benefits of such technologies are clear, and applying them to the operation of large vehicles is critical so that the injuries caused by large vehicles decline as drivers shift to autonomous driving. The extent to which CAVs diffuse will therefore affect the crash frequency of large vehicles, and transportation policymakers are responsible for facilitating the adoption of such advanced technologies, which is the motivation of this study.
The associate editor coordinating the review of this manuscript and approving it for publication was Tai-Hoon Kim.
Large vehicles, e.g. trucks and buses, are expected to be among the first adopters of CAVs because of synchronized platooning and fixed routes. Trucks and buses are suited to the first trials for three reasons: the injuries they cause are much more severe, platooning at a specified following distance facilitates the application of CAV technology, and the technological sophistication and maturity required are sufficient to deal with complicated situations. For trucks, platooning means at least two trucks driving smoothly at a certain distance from each other [5]-[7], while for buses it denotes at least two buses operating along a fixed route. In this way, the trucks or buses can communicate with each other and operate in coordination, responding once some vehicles run into unexpected situations or crashes.
Many cities have already set up exclusive or dedicated lanes for buses, which are much safer. Ideally, an autonomous system will first be capable of operating safely within a dedicated lane along fixed routes before it is allowed to mix fully with traditional cars. Even within dedicated lanes, however, human drivers sometimes intrude, turning the lane into a mixed-traffic lane and causing conflicts. Consequently, at the outset, it is necessary to set aside exclusive or dedicated lanes for CAVs to improve safety, especially dedicated bus lanes with CAVs. Given the operating characteristics of buses, if CAVs are employed in the future, the injuries attributable to human drivers will be reduced greatly, and the implementation of CAV technologies will bring a substantial payback in the form of large safety improvements for urban traffic. Hence, the purpose of this study is to investigate bus driver factors from the injury severity perspective, so as to pave the way for the future implementation of CAV technologies that can avoid these factors.

II. DATA DESCRIPTION
To examine the potential benefits of employing CAV technologies, this study collected bus crash data from 2014 to 2017 from the GIS open data site maintained by the Nevada Department of Transportation (NDOT). The target area was the Las Vegas metropolitan area formed by the major and minor arterials, including the City of Las Vegas, the City of North Las Vegas, the City of Henderson, and Clark County. Las Vegas was selected because most of its bus routes are straight lines, which makes them suitable for implementing CAVs. The study area was composed of 27 major and minor arterials, as shown in Figure 1, on which 35,130 crashes occurred. After the data were extracted, buses were found to have been involved in 466 of these crashes. Four main groups of factors were extracted from the Traffic Safety Crash Data: the road users' features (including drivers, pedestrians and cyclists), the vehicle profiles, the roadway characteristics and the environment.
In Nevada, crash injury severity is typically categorized as property damage only (PDO), injury and fatality. To summarize the injury severity in ArcGIS, a 100 ft buffer was applied around the arterials and the crashes falling within it were selected, so that the severity levels of these crashes can be considered as the observed injury severity. Consequently, the dependent variable in the proposed model was ordinal, with the responses PDO, injury and fatality coded as 1, 2 and 3, respectively.
Since the objective was to investigate bus driver factors from the injury severity perspective so as to pave the way for the future implementation of CAV technologies, the bus drivers' factors were emphasized. The crash data collected involved two or more vehicles. Following the NDOT classification, the vehicle with main responsibility was named vehicle 1, and the vehicle with minor responsibility was named vehicle 2. Because crashes involving two vehicles account for about 89% of the sample, this classification can reasonably support the results. Correspondingly, the drivers' features were divided between vehicle 1 and vehicle 2. In the following sections, the drivers' features are therefore categorized according to this division, giving variables such as vehicle 1 driver age, vehicle 1 driver action, vehicle 2 driver age, vehicle 2 driver action, etc.
Additionally, for the remaining road users, e.g. pedestrians, pedal cyclists and motorcyclists, the crash data provided only binary information, i.e. yes or no, so the road users considered in this study were mainly the drivers.
For the vehicles involved in each crash, the explanatory variables reflecting the vehicle profiles included the total number of vehicles, vehicle types, vehicle direction, and vehicle conditions (e.g. hit-and-run, mechanical defects, driving too fast, etc.).
The roadway characteristics comprised the number of vehicle lanes, roadway conditions (e.g. dry, wet, ice, snow, etc.), and the highway factor (work zone or not), while the crash environment covered the weather, lighting conditions, and first harm (e.g. median, fence, pedestrian, etc.).
To evaluate the proposed models in SPSS software, the categorical variables were coded numerically. All the variables collected are summarized in Table 1, which lists the dependent and categorical variables first, followed by the descriptive statistics of the continuous/indicator variables.
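The buffering and coding step can be sketched as follows. This is a minimal illustration in Python, not the ArcGIS workflow used in the study. The planar coordinates in feet, the function names, the single straight arterial segment, and the two sample crashes are all assumptions made for the example.

```python
import math

# Ordinal coding of the dependent variable, as described in the text.
SEVERITY_CODE = {"PDO": 1, "Injury": 2, "Fatality": 3}

BUFFER_FT = 100.0  # buffer distance around each arterial, from the text


def distance_to_segment(p, a, b):
    """Shortest distance from point p to segment a-b (planar coordinates, feet)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(crash_xy, arterial, buffer_ft=BUFFER_FT):
    """True if the crash lies within buffer_ft of the arterial segment."""
    return distance_to_segment(crash_xy, arterial[0], arterial[1]) <= buffer_ft


# Hypothetical arterial segment and crash records (not study data).
arterial = ((0.0, 0.0), (1000.0, 0.0))
crashes = [((50.0, 80.0), "Injury"), ((50.0, 150.0), "PDO")]

# Keep only crashes inside the buffer and convert severity to its ordinal code.
kept = [SEVERITY_CODE[sev] for xy, sev in crashes if within_buffer(xy, arterial)]
print(kept)
```

In this toy case only the first crash, 80 ft from the arterial, falls inside the buffer, so only its ordinal code is kept.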

III. MODELING
Decision trees were employed as the method for examining the potential effect of CAV technologies on bus crash severity. Decision trees were chosen because they can handle both highly correlated variables and linearity issues, the results can be derived from a number of variables depending on the partitioning method, and the output is simple to understand and easy to interpret [7]. Several decision tree algorithms are currently in wide use: Chi-square Automatic Interaction Detection (CHAID), exhaustive CHAID, Classification and Regression Tree (CART), and Quick Unbiased Efficient Statistical Trees (QUEST). Among these, CART, QUEST and exhaustive CHAID are usually used to partition binary categorical variables, while CHAID is considered if the categorical variables include more than two types [8].
CHAID is a recursive partitioning algorithm that searches for an optimal decision tree structure based on the correspondence between the response variable and a set of splitting variables. CHAID is partly a prediction method and partly a clustering method: it seeks to reduce uncertainty about (i.e. to predict) the values of a response variable while simultaneously partitioning the dataset into clusters of observations based on the set of splitting variables. The tree is built by applying the splitting rules sequentially until the final split. First, CHAID splits a larger heterogeneous sample into smaller, more homogeneous subsets on the basis of the most predictive explanatory factor. After the first partition, homogeneous categories are clustered together, and each resulting subset is searched for the optimal split at the next level. The splitting continues for each node until no more splits can be made, and the final results display the data values and proportions at each node. Finally, the decision tree can be automatically pruned to avoid overfitting and to achieve the optimal partition results [9].
The CHAID method used in this study was developed by Kass (1980) [10] and was later extended to include the exhaustive CHAID method. The former is not bound to binary splits and allows multi-way splits at each node, while the latter always generates a binary tree, as CART and QUEST do. In the splitting step, the Chi-square test for independence was employed to assess whether splitting a node improved purity by a significant amount, and the significance level for both splitting and merging was set to 0.05. The maximum tree depth was set to 4, and the minimum numbers of cases in a parent node and a child node were set to 100 and 50, respectively. The final results from the CHAID decision tree can be used to estimate the reduction in severe bus crashes with main responsibility that could be expected if CAV technologies were adopted.
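The core of the splitting step described above, choosing the predictor most strongly associated with the response by a Chi-square test of independence, can be sketched as follows. This is a simplified illustration under stated assumptions, not the SPSS procedure used in the study: it omits the category-merging step and the Bonferroni adjustment of Kass's method, the helper names (`chi2_sf`, `chi2_test`, `best_split`) are hypothetical, and the records are toy data invented for the example.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df (closed-form series)."""
    if x <= 0.0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total


def chi2_test(rows, predictor, response):
    """Pearson chi-square test of independence between a predictor and the response."""
    counts = Counter((row[predictor], row[response]) for row in rows)
    levels = sorted({key[0] for key in counts})
    classes = sorted({key[1] for key in counts})
    n = len(rows)
    row_tot = {a: sum(counts[(a, c)] for c in classes) for a in levels}
    col_tot = {c: sum(counts[(a, c)] for a in levels) for c in classes}
    stat = 0.0
    for a in levels:
        for c in classes:
            expected = row_tot[a] * col_tot[c] / n
            stat += (counts[(a, c)] - expected) ** 2 / expected
    df = (len(levels) - 1) * (len(classes) - 1)
    p_value = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p_value


def best_split(rows, predictors, response, alpha=0.05):
    """Return (predictor, statistic, p-value) for the most significant predictor, or None."""
    results = {name: chi2_test(rows, name, response) for name in predictors}
    name = min(results, key=lambda k: results[k][2])
    stat, _, p_value = results[name]
    return (name, stat, p_value) if p_value < alpha else None


def make(action, crash_type, severity, n):
    """Build n identical toy records (hypothetical data, not from the study)."""
    return [{"action": action, "crash_type": crash_type, "severity": severity}] * n


rows = (make("straight", "angle", 2, 4) + make("straight", "other", 2, 4)
        + make("straight", "angle", 1, 1) + make("straight", "other", 1, 1)
        + make("turn", "angle", 1, 4) + make("turn", "other", 1, 4)
        + make("turn", "angle", 2, 1) + make("turn", "other", 2, 1))

best = best_split(rows, ["action", "crash_type"], "severity")
print(best)
```

In the toy data, severity depends on the driver's action but is evenly distributed across crash types, so the sketch selects `action` as the first split. A full CHAID implementation would repeat this search inside each child node.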
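The growth limits stated above can be expressed as a single check applied before a node is split. The sketch below is illustrative only: the function name and the exact way SPSS combines these criteria (for example, whether depth is counted from 0 at the root) are assumptions, while the four threshold values come from the text.

```python
ALPHA = 0.05       # significance level for splitting, from the text
MAX_DEPTH = 4      # maximum tree depth, from the text
MIN_PARENT = 100   # minimum number of cases in a parent node, from the text
MIN_CHILD = 50     # minimum number of cases in a child node, from the text


def split_allowed(n_parent, child_sizes, depth, p_value):
    """Return True if a candidate split satisfies all growth limits.

    depth is the depth of the node being split, with the root at depth 0
    (an assumption about how the depth limit is counted).
    """
    if depth >= MAX_DEPTH:
        return False
    if n_parent < MIN_PARENT:
        return False
    if any(n < MIN_CHILD for n in child_sizes):
        return False
    return p_value < ALPHA


# Hypothetical candidate splits of a 466-case root node.
print(split_allowed(466, [230, 236], depth=0, p_value=0.01))  # balanced, significant
print(split_allowed(466, [426, 40], depth=0, p_value=0.01))   # one child too small
```

With 466 cases and a minimum child size of 50, at most nine terminal nodes can exist, so these settings keep the tree small regardless of how many predictors are supplied.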

IV. RESULTS AND DISCUSSION
From 2014 to 2017, 35,130 crashes occurred on the selected arterials in the Las Vegas metropolitan area, and buses were involved in 466 of these crashes, leading to 246 PDO, 215 injury and 5 fatality outcomes. The CHAID decision tree results showed that buses with main responsibility in the crashes accounted for a large share of the injuries, as presented in Figure 2.
The values 1, 2 and 3 in the figure represent PDO, injury and fatality. Only two levels of nodes and two critical variables are shown. The reason is that the data collected are very limited, and the classification causes fragmentation, so that variables with small proportions are removed. Only the most significant variables, which can explain the injury severity, are kept.
The first node splits on the action of the driver with main responsibility, i.e. the bus driver's action plays a critical role in injury severity. As shown in Figure 2, the critical classification lies in (2,5), i.e. between going straight and passing other vehicles/racing. The results indicate that when the vehicle 1 driver's action is going straight, making a U-turn or passing other vehicles/racing, the likelihood of a bus crash resulting in PDO, injury and fatality is 47.4%, 50.4% and 2.2%, respectively.
The second level of the CHAID tree uses crash type as the critical variable. Crash type (<= 2) represents angle and rear-end crashes, and crash type (> 2) denotes the remaining types. When the vehicle 1 driver's action is combined with crash type, the likelihood of a crash leading to PDO falls to 42.7%, the likelihood of injury rises to 55.1%, and the likelihood of fatality stays the same. Importantly, the final node suggests that angle and rear-end crash types are the most dangerous factors examined, accounting for 4 of the 5 fatalities. Table 2 gives the prediction results for the three severity levels. Injury is predicted best, with an accuracy close to 85%, while the accuracy for PDO only reaches 46.3%. This implies that bus crash severity is more likely to concentrate on injury. The overall prediction accuracy is 63.5%, which is acceptable given that the dataset is not large.
Generally speaking, when bus drivers go straight, make a U-turn or pass other vehicles, they should pay attention to the surrounding vehicles, since buses can easily cause severe injuries. Where angle or rear-end crashes are likely, bus drivers should stay alert so that appropriate actions can be taken, for example, watching out for sharp bends and adjusting the speed accordingly.
Consequently, if CAV technologies become available for buses, safety benefits can be obtained by addressing the significant factors: going straight, making a U-turn and passing other vehicles/racing as the vehicle 1 driver's action, and angle and rear-end crash types. In terms of the SAE-defined levels, this requires autonomous control of steering, acceleration and deceleration, monitoring of the driving environment, and fallback performance of the dynamic driving task. More importantly, the safe operation of CAV buses that can perform these tasks autonomously may require dedicated bus lanes, readable lane markings, integrated traffic signals and signs, and/or specified time periods.
Given the factors found to influence bus drivers, the corresponding techniques can be realized once CAV technologies are mature enough to be used. So far, no CAV buses have actually been deployed in Las Vegas, and all the analysis above is hypothetical, so a before-and-after study is not yet feasible. Nevertheless, the method adopted provides some valuable insights for the implementation of CAV in buses.
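As a reference point for reading the tree, the severity distribution at the root node follows directly from the crash counts reported above. The short calculation below is only an arithmetic check on those counts; the variable names are chosen for the example.

```python
# Bus crash outcomes reported in the text for 2014 to 2017.
counts = {"PDO": 246, "Injury": 215, "Fatality": 5}

total = sum(counts.values())
shares = {level: round(100.0 * n / total, 1) for level, n in counts.items()}

print(total)   # 466
print(shares)  # {'PDO': 52.8, 'Injury': 46.1, 'Fatality': 1.1}
```

Each child node in the tree can be compared against these root shares to see how far a split shifts the distribution toward injury or fatality.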
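The relationship between the per-class and overall accuracies quoted above can be checked with a short calculation. The class totals come from the text, but the numbers of correctly classified crashes are not taken from Table 2: they are back-calculated here from the reported percentages and are therefore assumptions, as are the variable names.

```python
# Class totals reported in the text.
class_totals = {"PDO": 246, "Injury": 215, "Fatality": 5}

# ASSUMPTION: correctly classified counts inferred from the reported accuracies
# (46.3% for PDO, close to 85% for injury, 63.5% overall); not read from Table 2.
correct = {"PDO": 114, "Injury": 182, "Fatality": 0}

# Per-class accuracy: correctly predicted cases divided by observed cases in that class.
per_class = {k: round(100.0 * correct[k] / class_totals[k], 1) for k in class_totals}

# Overall accuracy: all correctly predicted cases divided by all cases.
overall = round(100.0 * sum(correct.values()) / sum(class_totals.values()), 1)

print(per_class)  # {'PDO': 46.3, 'Injury': 84.7, 'Fatality': 0.0}
print(overall)    # 63.5
```

Under these assumed counts, the reported figures are mutually consistent only if none of the five fatal crashes is classified correctly, which would be expected for so small a class.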

V. CONCLUSION
In this study, potential safety considerations that will pave the way for the implementation of CAVs in public buses were investigated from the injury severity perspective. To identify the driver factors that CAVs could replace, specifically for buses in Las Vegas, a CHAID decision tree model was used so that these factors can be determined and then controlled if CAV technologies are employed. The results showed that the action of the driver with main responsibility (going straight, making a U-turn and passing other vehicles/racing) and the crash type (angle and rear-end) were significant for crash severity. For CAV technologies to control these factors, autonomous control of steering, acceleration and deceleration, monitoring of the driving environment, and fallback performance of the dynamic driving task should be applied.
Some weaknesses remain in this study. First, since current studies mainly concentrate on "automated" vehicles, it is difficult to predict when CAV technologies will completely replace automated vehicles, and how to integrate the stages still under development with already developed technology so that the study serves a practical purpose. Second, the infrastructure challenges arising from the deployment of CAV technologies require further exploration and investigation before such vehicles can operate safely and autonomously. Third, as the bus data collected accounted for only a small proportion of all crashes, future efforts could consider heavy trucks/carriers, especially those that operate in platoons. Finally, since the results are a preliminary analysis from Las Vegas, it is worthwhile to explore the evaluation in more depth in future studies to confirm the findings.
DAIQUAN XIAO received the B.S., M.E., and Ph.D. degrees from Chang'an University, Xi'an, China. He is currently an Assistant Professor with the School of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan, China. He has published more than 20 articles and most of them were indexed by SCI/SSCI/EI. His research interests include transportation safety and integrated urban traffic management.
XUECAI XU received the B.S. degree from the Xi'an University of Technology, Xi'an, China, the M.E. degree from Southwest Jiaotong University, Chengdu, China, and the Ph.D. degree from the University of Nevada at Las Vegas, Las Vegas, NV, USA. He worked as a Senior Research Fellow with the School of Civil and Environmental Engineering, Nanyang Technological University, Singapore, and a Senior Research Assistant with the Department of Civil Engineering, The University of Hong Kong, Hong Kong. He is currently an Assistant Professor with the School of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan, China. He has published more than 30 articles indexed by SCI/SSCI/EI. His research interests include transportation safety, intelligent transportation systems, and system engineering.
SHENGYANG KANG received the B.S. degree from the Huazhong University of Science and Technology, Wuhan, China, in 2018, where he is currently pursuing the master's degree in road and traffic engineering with the School of Civil Engineering and Mechanics. His research interests include transportation safety, traffic simulation, and integrated urban traffic management.
VOLUME 8, 2020