Analyzing Charging Behavior of Electric City Buses in Typical Chinese Cities

Electric city buses can reduce greenhouse gas emissions, provided that most of the electric power they use originates from renewable sources or nuclear power plants. Analyzing their charging behaviors is critical to their development and mass adoption. To analyze the charging behavior characteristics of electric city buses at different locations, datasets collected from 17,576 electric buses operating in 14 cities are examined using the probability statistics method. Characteristic parameters, including the charging power and charging duration, are then used to group the cities into five clusters with the K-means algorithm. The results extend earlier research conducted only on a limited number of test routes and provide a comparison of key characteristic parameters among the clusters. The analysis results are useful for studying the connection between operational efficiency and charging behaviors, optimizing charging schedules, evaluating charging load, and planning the construction of charging infrastructure.


I. INTRODUCTION
With growing concerns over environmental pollution and energy security, electric vehicles (EVs) are being intensively studied and increasingly adopted worldwide [1]-[3]. In particular, the large-scale application of electric city buses in megacities is considered an effective and viable means of mitigating traffic congestion and improving air quality [4]. Owing to continuous advancements in vehicle manufacturing and battery technologies, electric buses with increased driving range per charge and better fast-charging capability have been developed [5]. The popularity of electric buses worldwide may be ascribed to government incentives, such as the Low or No Emission Competitive program in the United States, the Green Bus Fund Program in the UK, and the Ten Cities and Thousand Vehicles Program in China. By the end of 2018, approximately 255,000 electric city buses were running across China, and it is anticipated that the transit bus fleets in major cities will be fully electrified by the end of 2020.
To support electric bus penetration, substantial studies have been conducted, including infrastructure planning, cost-benefit analysis, carbon dioxide emission evaluation, and charging scheduling optimization [6]-[18]. The research on infrastructure planning includes estimation of the energy demand of electric buses [6], investigation of the impacts on the electrical distribution system [7], [8], and charging station location planning [9], [10]. These studies focused on reducing the impact on the power grid by optimizing charging infrastructure and charging strategies. Cost-benefit analysis and carbon dioxide emission evaluation are highlighted in many studies. Considering that the fuel economy, energy pathway and lifecycle cost calculation parameters are related to the operating conditions, the carbon dioxide emission and lifecycle cost were calculated under different operating conditions, such as the routes in Finland, California (USA) [11] and China [12]. The results show that electric buses have the potential to significantly reduce carbon dioxide emissions and that their lifecycle cost-benefit is heavily influenced by purchasing costs and charging strategies [13], [14].
The associate editor coordinating the review of this manuscript and approving it for publication was Xin Luo.
In contrast to charging infrastructure planning and cost-benefit analysis, studies of real-world charging and charging scheduling optimization of electric buses are rather limited [15]. This is mainly due to the lack of real operation data; most existing studies were conducted in simulation environments or on relatively simple test routes. Santos et al. [16] devised mixed fleet arrangements to optimize the performance of mixed bus fleets based on real data from a bus network in Porto, Portugal. Paul and Yamada [17] and Wang et al. [18] developed modeling frameworks to optimize electric bus recharging schedules, implemented in real-world cases in California and Japan, respectively. However, these results cannot fully reveal the actual charging and operation of electric buses or how they differ between regions. Since understanding actual charging behaviors is the basis for the operational assessment of electric buses [19], [20], it is necessary to analyze the charging behaviors of electric buses.
This study aims to deepen the understanding of the charging patterns and charging behaviors of electric buses by analyzing more comprehensive characteristic parameters, such as charging frequency, initial state of charge (SOC), and travel distance between two successive charging events. The results may be of benefit for predicting the demands of electric buses and optimizing the construction of charging infrastructure.
The remainder of this paper is organized as follows. Section II describes the data used in this study. Section III analyzes the charging behaviors of electric buses in detail. Section IV introduces the characteristic parameters of charging behaviors and the analysis method used, including the K-means algorithm and the cluster validity measurement. Section V presents the clustering results and a detailed analysis of the different charging patterns, followed by the key conclusions summarized in Section VI.

II. DATA PREPARATION
To study the charging patterns of electric buses in different cities, data were collected from fourteen cities, including Beijing, Shenzhen, Shanghai, Nanjing and Jinan. The selected cities span a wide range of latitudes across China and cover various levels of economic development, as shown in Figure 1. The data can therefore reflect the charging patterns of electric city buses in disparate regions. The collected data cover the period from June 2018 to August 2018 and include a total of 21,586 new energy buses (17,576 electric buses and 4,010 plug-in hybrid electric buses). The data items collected are shown in Table 1. It is worth noting that 231 different models of electric buses, with battery capacity varying from 33 to 327 kWh, are studied.
Due to complex driving conditions, the original datasets may suffer from distorted and missing data, so they are processed through data cleaning and data integration to improve data quality. More than 2.9 million charging events are finally obtained, and each charging event contains information such as the charging start time, charging duration, charging start SOC, charging end SOC and charge consumption.
Based on the collected data, the numbers of buses with charging events, buses online at least once, and buses monitored on each day in June, July and August are shown in Figure 2. The average online rate of electric buses is about 70%, which means that the buses operating at least once in a day account for about 70% of all monitored vehicles. Figure 2 shows that there are more online and charged buses on workdays than on weekends. This indicates that the number of electric buses that need recharging is higher on workdays than on weekends, which is understandable since more people commute on workdays. Therefore, the collected data are divided into two groups, i.e., data collected on workdays and data collected on weekends. Considering that bus transportation is busier on workdays and the operator needs to meet the charging needs on workdays, this paper only analyzes the charging behaviors of electric buses on workdays.

III. CHARGING BEHAVIORS ANALYSIS OF ALL ELECTRIC CITY BUSES
Based on the running data of electric buses, the charging behaviors of electric buses are preliminarily analyzed using the probability statistics method. A certain number of charging events start between 14:00 and 22:00, and the number of charging events decreases after midnight. Comparing the proportions of charging events during the day (5:00-20:00) and at night shows that the latter accounts for less than 18%.
The average charging power and charging duration per charging event by time of day are presented in Figure 4. The charging power during the day is significantly higher than that at night. The charging duration is about 0.5 hour during the day and about 2 hours at night. This indicates that electric buses are generally charged via fast charging during the day and via slow charging at night. Figure 5 shows the probability distribution function (PDF) of the charging power, which has two clear peaks at around 30 kW and 110 kW. Although charging at higher power during the day can shorten the charging duration, the construction of fast-charging stations is difficult and expensive, and it may have a great impact on the power grid. Thus, the actual conditions of each city need to be considered when choosing the charging modes of electric buses. Figure 6 shows the distribution of charging start SOC during the day (5:00-20:00) and at night. The charging start SOC usually ranges from 50% to 70% in the daytime and from 30% to 50% at night, which means that electric buses are recharged at a high initial SOC in the daytime. Figure 7 shows the distribution of charging end SOC during the daytime and nighttime. More than 80% of charging processes finish at 90% to 100% SOC regardless of day or night. This indicates that the batteries are generally full when the electric buses begin a new trip after recharging.
The distribution of charging frequency is depicted in Figure 8. Buses charged twice a day account for the highest proportion. The majority of buses (about 86%) are charged at least once a day. Figure 9 shows the distribution of the travel distance between two charging events. The peak occurs in the range of 20-40 km, and about 80% of buses travel less than 100 km between two charging events.
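The day/night split described above can be illustrated with a minimal sketch. The timestamps below are hypothetical placeholders, not values from the dataset, and the treatment of the boundaries (5:00 counted as day, 20:00 counted as night) is an assumption, since the text only gives the window 5:00-20:00.

```python
from datetime import datetime

# Hypothetical charging start times; the real dataset is not reproduced here.
start_times = [
    "2018-06-04 06:15", "2018-06-04 12:40", "2018-06-04 15:05",
    "2018-06-04 19:59", "2018-06-04 23:30", "2018-06-05 02:10",
    "2018-06-05 09:20", "2018-06-05 17:45",
]

DAY_START_HOUR = 5   # daytime window from the text: 5:00-20:00
DAY_END_HOUR = 20    # assumption: 20:00 itself is counted as night


def is_daytime(timestamp: str) -> bool:
    """Return True if the charging event starts within the daytime window."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").hour
    return DAY_START_HOUR <= hour < DAY_END_HOUR


def night_share(timestamps) -> float:
    """Fraction of charging events that start outside the daytime window."""
    night = sum(1 for t in timestamps if not is_daytime(t))
    return night / len(timestamps)


share = night_share(start_times)
print(f"night share of charging events: {share:.1%}")
```

On the real event table, the same function would be applied to the charging start time of every workday charging event.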
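As a rough illustration of how the two distributions above could be derived from charging events, the sketch below counts charges per bus per day and takes odometer differences between successive charges. The record layout, the bus identifiers and the use of an odometer reading are all assumptions made for illustration; the actual data items are those listed in Table 1.

```python
from collections import Counter

# Hypothetical records: (bus_id, date, odometer in km at the start of charging).
events = [
    ("bus_a", "2018-06-04", 1000.0),
    ("bus_a", "2018-06-04", 1035.0),
    ("bus_a", "2018-06-05", 1120.0),
    ("bus_b", "2018-06-04", 500.0),
    ("bus_b", "2018-06-05", 560.0),
    ("bus_b", "2018-06-05", 590.0),
]


def frequency_distribution(records):
    """Share of bus-days with 1, 2, ... charging events.

    Bus-days without any charging event do not appear in the records,
    so they are not counted here.
    """
    per_bus_day = Counter((bus, day) for bus, day, _ in records)
    counts = Counter(per_bus_day.values())
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}


def distances_between_charges(records):
    """Odometer difference between successive charging events of each bus."""
    last_odometer = {}
    gaps = []
    for bus, _, odometer in sorted(records, key=lambda r: (r[0], r[2])):
        if bus in last_odometer:
            gaps.append(odometer - last_odometer[bus])
        last_odometer[bus] = odometer
    return gaps


freq = frequency_distribution(events)
gaps = distances_between_charges(events)
share_under_100 = sum(1 for g in gaps if g < 100) / len(gaps)
print(freq, gaps, share_under_100)
```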

IV. CHARGING PATTERNS ANALYSIS METHOD
Based on the charging data, the K-means algorithm is used to explore the inherent regular patterns of charging behaviors, which provides an effective basis for comparing the charging behaviors.

A. CHARACTERISTIC PARAMETERS FOR DESCRIPTION OF CHARGING BEHAVIORS
Since electric buses need to operate during the day, their charging behaviors differ greatly between the daytime and the nighttime, as partially shown in Section III. The characteristic parameters of charging behaviors therefore consist of two parts, and one day is divided into two periods: the operation period in the daytime and the non-operation period at night. Considering that most electric buses in cities start to operate after 5:00, 5:00 is set as the start time of the operation period. The end time of the operation period is determined by analyzing the distribution of charging start time and charging duration of electric buses for each city, as shown in Figure 10. Figure 10 shows that electric buses in Shanghai have a peak in charging start time at around 20:00, after which the corresponding charging duration becomes significantly longer. This indicates that most of the buses in Shanghai stop operating at that time and start to recharge. Therefore, the operation period of Shanghai is defined as 5:00-20:00. The same method is used to determine the operation period of the other cities. The operation periods of Shanghai, Qingdao, Zhengzhou and Chongqing are set to 5:00-20:00, and those of the other cities are set to 5:00-22:00.
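The period assignment described above can be sketched as a small lookup. The city groups and hours come from the text; the function names and the half-open interval (the end hour itself belongs to the non-operation period) are assumptions made for illustration.

```python
OPERATION_START_HOUR = 5
# Cities whose operation period ends at 20:00 according to the text;
# all other cities in the study end at 22:00.
EARLY_END_CITIES = {"Shanghai", "Qingdao", "Zhengzhou", "Chongqing"}


def operation_end_hour(city: str) -> int:
    """End hour of the operation period for a given city."""
    return 20 if city in EARLY_END_CITIES else 22


def period_of(city: str, start_hour: float) -> str:
    """Classify a charging event by its start hour (0 <= start_hour < 24).

    Assumption: the operation period is the half-open interval
    [5:00, end hour), so an event starting exactly at the end hour
    is treated as non-operation.
    """
    if OPERATION_START_HOUR <= start_hour < operation_end_hour(city):
        return "operation"
    return "non-operation"


print(period_of("Shanghai", 20.5))  # 20:30 in Shanghai
print(period_of("Beijing", 20.5))   # 20:30 in Beijing
```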
In this study, seven characteristics of charging behaviors are defined in Table 2.

B. K-MEANS ALGORITHM AND SILHOUETTE COEFFICIENT
The K-means algorithm is widely used in user classification and behavior research. Easy implementation and high efficiency are the main reasons for its popularity [21], [22]. The basic process of the K-means algorithm can be summarized as follows.
Step 1: Let the samples be $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$ $(i = 1, 2, \ldots, n)$, where $n$ is the sample set size and $m$ is the number of indicators of sample $x_i$. Since different indicators have different dimensions and the dimension of the indicators directly affects the clustering results, the z-score standardization method is used for normalization, which gives the sample set $R = [x_1, x_2, \ldots, x_n]$ after normalization.
Step 2: Randomly select $k$ samples from the sample set $R$ as cluster centers: $C^{(0)} = [C_1^{(0)}, C_2^{(0)}, \ldots, C_k^{(0)}]$.
Step 3: Use the Euclidean distance to measure the distance between the normalized samples and the cluster centers, and allocate each sample to the nearest cluster center. The Euclidean distance between sample $x_i$ and cluster center $C_j^{(t)}$ is given as follows.
$$d\left(x_i, C_j^{(t)}\right) = \sqrt{\sum_{l=1}^{m} \left(x_{il} - C_{jl}^{(t)}\right)^2}$$
Step 4: Update the cluster centers as follows:
$$C_j^{(t+1)} = \frac{1}{q} \sum_{x_i \in C_j^{(t)}} x_i$$
where $q$ denotes the number of elements currently assigned to cluster $C_j^{(t)}$.
Step 5: Return to Step 3 until $C^{(t+1)} = C^{(t)}$, i.e., until the cluster centers no longer change.
The K-means algorithm is an unsupervised clustering algorithm, and the value of $k$ has a direct effect on the clustering results. The silhouette coefficient indicates the cohesiveness of the instances in one cluster and their separation from those in the other clusters [23]. In this study, the silhouette coefficient, for which a higher value indicates better clustering, is used to judge the effectiveness of the K-means clustering and to determine the value of $k$ [24].
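The five steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in the study: the sample values (charging power in kW, charging duration in hours) are hypothetical, the population standard deviation is assumed for the z-score, and the `seed` and `max_iter` parameters are added only for reproducibility and safety.

```python
import math
import random


def zscore(samples):
    """Step 1: standardize each indicator to zero mean and unit variance."""
    n, m = len(samples), len(samples[0])
    means = [sum(s[l] for s in samples) / n for l in range(m)]
    # Assumption: population standard deviation (divide by n).
    stds = [math.sqrt(sum((s[l] - means[l]) ** 2 for s in samples) / n)
            for l in range(m)]
    # A constant indicator has zero spread; map it to 0.0 instead of dividing.
    return [[(s[l] - means[l]) / stds[l] if stds[l] > 0 else 0.0
             for l in range(m)] for s in samples]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(samples, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 2: pick k samples at random as the initial cluster centers.
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every sample to its nearest center.
        labels = [min(range(k), key=lambda j: euclidean(x, centers[j]))
                  for x in samples]
        # Step 4: move each center to the mean of its assigned samples.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        # Step 5: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical samples: (charging power in kW, charging duration in hours).
data = [[30, 2.0], [35, 2.2], [28, 1.9], [110, 0.5], [120, 0.4], [105, 0.6]]
normalized = zscore(data)
labels, centers = kmeans(normalized, k=2)
print(labels)
```

Because the initial centers are chosen at random, the numeric cluster labels depend on the seed, while the grouping of well-separated samples does not.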

A. CHARGING CHARACTERISTICS OF ELECTRIC BUSES IN CITIES
The characteristic parameters of charging behaviors in the operation and non-operation periods are calculated separately based on the charging data. The results are shown in Table 3. The differences in charging characteristics among cities and the relationships among these parameters are shown in Figure 11 and Table 3. Figure 11(a) shows the average charging power in the operation period and the non-operation period of the 14 cities. The former is significantly higher than the latter. In terms of the standard deviation, the variation of charging power is also greater in the operation period. Figure 11(b) shows the ratio of the buses charged in the operation and non-operation periods to all the buses charged throughout the day. Except for Shanghai, the ratio of buses charged in the operation period is larger than that in the non-operation period. The ratio of buses charged in the non-operation period is small in most cities, with an average value of 0.36. Comparing the charging power and the ratio of charged buses across cities shows that cities with high charging power have a smaller ratio of buses charged in the non-operation period and a larger ratio of buses charged in the operation period.
The charging times in the operation and non-operation periods in different cities are presented in Figure 11(c). The number of charging times in the operation period is between 1 and 3, while that in the non-operation period is about 1. Taking the ratio of charged buses in the different periods into account, buses in cities with a small ratio of buses charged in the non-operation period generally charge multiple times in the operation period. Fig. 11(d) provides a scatter plot of the charging duration per charging event. Considering the charging times of electric buses, it is not surprising that most electric buses that charge frequently in the operation period have a short charging duration per charging event. However, there are some exceptions, such as Wuhan and Yichang.
In addition, Table 3 shows that the change of SOC in the operation period is generally around 30%, while that in the non-operation period is around 50%. Table 3 also shows that the travel distance between two charging events varies between 54.8 km and 134.6 km across cities.
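One plausible way to compute the ratio of charged buses per period is sketched below, under the assumption that a bus is counted once per period in which it has at least one charging event. The records are hypothetical. Under this reading, the two ratios need not sum to one, because a bus can charge in both periods on the same day.

```python
# Hypothetical charging events for one city on one day: (bus_id, period).
events = [
    ("bus_a", "operation"), ("bus_a", "operation"), ("bus_a", "non-operation"),
    ("bus_b", "operation"),
    ("bus_c", "non-operation"),
    ("bus_d", "operation"),
]


def charged_bus_ratios(records):
    """Ratio of buses charged in each period to all charged buses."""
    all_buses = {bus for bus, _ in records}
    ratios = {}
    for period in ("operation", "non-operation"):
        buses_in_period = {bus for bus, p in records if p == period}
        ratios[period] = len(buses_in_period) / len(all_buses)
    return ratios


ratios = charged_bus_ratios(events)
print(ratios)
```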

B. CLUSTERING RESULTS
The average Silhouette coefficient is used to determine the number of clusters for the K-means algorithm. Figure 12 shows the value of the Silhouette coefficient with respect to the number of clusters.
The Silhouette coefficient suggests that two clusters give the best clustering results. This corresponds to two typical charging patterns under two charging modes: fast charging and slow charging. However, this study also aims to explore charging patterns beyond these two. According to the Silhouette coefficient, five clusters is the second-best alternative, which provides relatively stable clustering and also satisfies the purpose of this study.
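For reference, the average Silhouette coefficient can be computed as in the sketch below, using the standard definition $s(i) = (b_i - a_i) / \max(a_i, b_i)$, where $a_i$ is the mean distance from sample $i$ to the other members of its own cluster and $b_i$ is the smallest mean distance to the members of any other cluster. The points and labels are hypothetical, and assigning a score of 0 to a sample that is alone in its cluster is a common convention assumed here.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(samples, labels):
    """Average Silhouette coefficient over all samples."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(samples[i], samples[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(samples[i], samples[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical points forming two compact, well-separated groups.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = mean_silhouette(points, [0, 0, 1, 1])   # labels follow the groups
bad = mean_silhouette(points, [0, 1, 0, 1])    # labels cut across the groups
print(round(good, 3), round(bad, 3))
```

To choose the number of clusters, this score would be evaluated for the labels produced by K-means at each candidate k and the values compared, as in Figure 12.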
The selected cities are divided into five clusters based on the characteristic parameters of charging behaviors, and the clustering result is shown in Table 4. The order of the clusters (in the row) is presented in the first column. Fig. 13 (a) displays the distribution of charging power for each cluster. Fig. 13 (b) shows the proportion of charging start time over 24 h. Fig. 13 (c) plots the distribution of charging duration. Fig. 13 (d) presents the distribution of the SOC change per charge event. Fig. 13 (e) is the distribution of charging frequency. Fig. 13 (f) shows the distribution of travel distance between two charging events. In the following paragraphs, we discuss specifically the charging behaviors of electric buses in each of the 5 clusters.
Cluster #1 is the group of buses whose charging power is between 20 and 50 kW. Charging starts either at around 12:00 or after 20:00, and the proportion of charging at night is about 43.8%. The charging duration is about 1 hour in the daytime and 2 hours at night. Buses charged 1-2 times per day account for as much as 50%, and the mean number of charging times is 1.6. The change of SOC during a single charging event is distributed between 10% and 70%. The travel distance between two charging events is mostly within the range of 20-220 km.
Buses in Cluster #2 are also mainly under the slow charging mode. However, the buses in this cluster usually charge in the daytime between 8:00 and 20:00, with two peaks at 9:00 and 18:00. The buses are often charged 1-3 times per day, and the average number of charging times is 1.4. The change of SOC during a charging event is about 10%-40%. This is significantly lower than that in Cluster #1, which indicates that the battery utilization is lower than in Cluster #1. The travel distance between two charging events usually ranges from 20 km to 80 km.
Buses in Cluster #3 are also mainly under the slow charging mode. However, 71% of charging events have an SOC change within 20%, and only 7.2% of charging events have an SOC change between 70% and 80%, so the utilization ratio of the battery is quite low. Buses in this cluster are mainly charged during the day from 10:00 to 18:00 and at around midnight (0:00), and the total proportion of charging at night is as low as 19.7%. The charging duration is about 0.8 hour in the daytime and 2.5 hours during the nighttime. The average number of charging times per day is two.
The charging power in Cluster #4 is concentrated in the ranges of 20-50 kW and 100-140 kW, and fast charging accounts for a high proportion. The buses in this cluster are charged less at night, and their charging start time is evenly distributed between 7:00 and 20:00. The charging duration is about 24 minutes during the day. The distribution of charging frequency shows that the buses in this cluster are charged frequently, with an average of 3.3 charging times per day. The SOC change per charging event is usually between 20% and 60%. The travel distance between two charging events is 20-60 km.
The distribution of charging power in Cluster #5 shows that the proportion of slow charging is higher than that of fast charging, but fast charging also accounts for a considerable proportion. The charging start time is mainly distributed between 8:00 and 22:00. The charging duration is about 0.5 hour during the day and 2 hours during the night. The buses in this cluster usually charge 1-2 times per day, and the average number of charging times per day is 1.5. The SOC change usually ranges from 10% to 70%, with a peak at 10%-20%.

VI. CONCLUSION
In this paper, three months of running data from electric city buses in 14 cities are explored. We first analyze the charging behaviors of all electric buses, and then study the charging patterns based on the proposed charging characteristics and the K-means algorithm.
Based on the analysis, we found that electric city buses are generally charged every day, and 82% of charging events occur in the daytime. The charging power during the day is significantly higher than that at night. Furthermore, the starting SOC usually ranges from 50% to 70% in the daytime, while it ranges from 30% to 50% at night. The travel distance between two charging events is usually between 20 km and 60 km, and over 80% of buses travel less than 100 km.
We cluster the cities into groups that have relatively homogeneous charging patterns within each group and heterogeneous charging characteristics across groups. Buses in Clusters #1, #2 and #3 are all mainly under slow charging; the main differences among them are the distribution of charging start time and the SOC change. Buses in Cluster #4 are mainly under fast charging, and their number of charging times per day is significantly higher than that of the others. Buses in Cluster #5 adopt both fast charging and slow charging, and their statistical charging characteristics have some similarities with those of Cluster #2.
In the future, we would like to propose an evaluation system for the operational efficiency of electric buses and analyze the quantitative relationship between operational efficiency and charging characteristics. On this basis, the charging strategy can be optimized in a targeted manner, taking into account the constraints of charging infrastructure, the battery capacity and the power supply capacity. Furthermore, the impact on the power grid will be studied to prepare for charging infrastructure planning.