Simulation-Based Sensitivity Analysis for Evaluating Factors Affecting Bus Service Reliability: A Big and Smart Data Implementation

Service quality is a significant concern for both providers and users of public transportation. It is crucial for transit agencies to clearly recognize the causes of unreliability before adopting any improvement strategy. However, the main causes of bus service unreliability have not been investigated well. Existing studies have three main limitations in recognizing the causes of service unreliability. First, public transport networks and traffic conditions are highly complex systems, and most existing models cannot accurately determine the relationship between service irregularity and the factors that affect it. Second, the definition of “Big data” has been neglected, and most studies have relied on a single large-scale data set to determine the causes of unreliability. Third, bus service unreliability can significantly affect users’ perception of public transport. A number of studies have recommended that bus service reliability be evaluated from both the service providers’ and the users’ perspectives; however, the impact of service unreliability from the passengers’ perspective has not yet been well investigated. Consequently, we propose a novel simulation-based sensitivity analysis to evaluate the main causes of bus service unreliability using a combination of three different sources of big data. Moreover, for the first time, we developed the simulation model in R studio, which is an open-source and powerful coding environment. According to the results, the level of reliability on Route U32 showed the highest sensitivity to headway variations. Waiting time can be decreased by 61% if bus operators reduce the headway variation by 25% relative to the actual observed data. Big gaps and bus bunching almost disappear when headway variations are decreased. Moreover, the terminal departure policy could significantly improve passenger waiting time: waiting time can be decreased by 36% when almost all buses depart the terminal on time.


I. INTRODUCTION
The author of [1], in one of the first and most fundamental studies conducted in this field, clearly explained that adopting any corrective strategy without understanding the sources of unreliability is a waste of time and resources for bus service providers. It has also been explained that collective bus motion in transit systems is unstable: even if buses start with perfectly even headways, the headways invariably become irregular, and if enough time passes, buses bunch up. To address this problem, transit agencies insert slack into their schedules and require buses (or transit vehicles) to depart on time at predefined control points along the route [2]. Van Oort (2011) estimated that, in terms of passenger travel time, about €12 million per year was lost (The Hague, 2004) due to the unreliability of buses and trams [3]. He also stated that passenger waiting time may be extended up to 600% due to service variability. One reason for this instability is that if a disruption causes a bus to slow down relative to the bus it follows, the bus encounters more passengers along the way and is delayed further by these extra passengers. Conversely, the next bus tends to catch up. The authors in [4], [5] proposed an offline framework that is capable of diagnosing the sources of irregularity along the whole route using Automatic Vehicle Location (AVL) data.
The associate editor coordinating the review of this manuscript and approving it for publication was Yu Wang.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The authors in [6] conducted a fundamental study into the causes of unreliability and argued that any cause of unreliability falls into one of two categories: endogenous (internal factors such as operator behaviour, route specifications, scheduling issues, unpredictable passenger demand, etc.) or exogenous (external factors such as traffic signals, traffic incidents, traffic jams, double-parked vehicles on the route, etc.).
The following issues are usually considered for every transit operation:
i) General traffic conditions have an impact on transit service because transit vehicles usually operate in mixed traffic. Variation in journey time results from interactions among vehicles, including accidents, turning movements, illegal parking, and speed fluctuations.
ii) Signalised intersections along the route disrupt the smooth flow of traffic and increase the possibility of delays. Running times rise because of stop-and-go movements and stopped time at red lights.
iii) Demand varies by day of the week and season of the year. Agencies account for systematic or known changes in demand by adjusting their timetables accordingly. However, unsystematic variations in demand introduce variability in dwell times and/or running times.
iv) Bus and operator availability is linked to the operating plans and strategies of the transit agency; however, it affects reliability in terms of being able to cover all scheduled trips and having spares in case of breakdowns, accidents, or operator absences.
Many researchers believe that preventative strategies should be considered at the initial stages of network design [7], [8]. In addition, headway irregularity and service unreliability can be avoided by adopting preventative strategies based on real-time data. However, the source(s) of unreliability need to be clearly recognized before any action is taken. Holding, the most commonly employed intervention, refers to intentionally delaying a vehicle, possibly at the expense of extending trip times for passengers on board, in order to reduce the waiting time of passengers who will board [9], [10]. Many studies have argued that holding is the most effective intervention. Accordingly, several methods have been proposed to evaluate holding strategies, ranging from simple heuristics to sophisticated model-based optimization [11]-[14]. In addition, most of these studies recommend that the best outcome can be achieved only when operators recognize the source of irregularity. However, to the best of our knowledge, research on the causes of unreliability is scarce. Therefore, we developed and propose a novel simulation model in R studio to analyse the sources of unreliability.
Based on the above discussion, bus service irregularity and the main sources of unreliability have been studied widely. However, to the best of our knowledge, previous studies still have significant limitations. Based on the currently available literature, there are three main methods to analyze the factors affecting bus service reliability (causes of unreliability): (1) descriptive models, (2) inferential models, and (3) simulation models. Inferential models are not as popular as the other two methods. Most studies that have used inferential methods have focused only on the impact of a specific cause of irregularity (such as headway deviation) on bus service quality. In addition, the analysis and output of these studies cannot be extended to other routes with different variables and scenarios. Descriptive analysis also has significant disadvantages. Generally speaking, descriptive models analyze the effect of various factors (independent variables) on service regularity (target variable). Inferential models investigate the variables that can probably explain the service unreliability. Descriptive statistics are fundamentally different from inferential models: they only summarize the data that have actually been measured. Simulation models, in contrast, are a reflection of a real-world situation, and the output of a simulation model must match real-world observations as closely as possible.
In descriptive and inferential models (such as regression-based models), the factors selected for analysis must be independent of each other. However, variables in traffic and transportation systems are deeply inter-correlated. To the best of our knowledge from the literature review, simulation models are the most accurate and reliable method to analyze and determine the main causes of unreliability, because: (1) simulation models are capable of capturing and reflecting much more detail, such as traffic conditions on the street during a specific period of time; (2) simulation models are the best method to implement and combine various sources of big data such as AVL, APC and AFC; and (3) unlike most descriptive and inferential models, simulation models do not have constraints on the nature and volume of variables.

2) CONTRIBUTION
Accordingly, we developed a multi-agent microscopic simulation model to evaluate the sources of bus service unreliability by conducting sensitivity analysis. To the best of our knowledge, our methodology is a novel approach to developing a simulation model that is capable of determining the sources of unreliability and implementing improvement strategies. In addition, most previous simulation models are not available for further investigation. Therefore, for the first time, we developed the simulation model in R studio, which is an open-source and powerful coding environment, and all the code and R packages are available along with this publication.
B. ALL THE LARGE DATA SETS ARE NOT ''BIG DATA''
1) LIMITATION
Although many definitions have been proposed for ''Big Data'', the term still has no agreed definition. Boyd and Crawford (2012) argued that the size of the data set is not the only factor to consider in defining big data: big data is more about the capacity to search, aggregate and cross-reference than about the size of the data set. Kwon, Lee, and Shin (2014) proposed a three-dimensional definition of big data:
• Volume refers to the size of the data set, which in this context should be large. All Automatic Data Collection Systems (ADCS) used in this study can be considered large in volume.
• Velocity indicates the speed of data collection or the rate of data generation. The Automatic Vehicle Location (AVL) system used in this study records data every 5 seconds. In addition, the Automatic Fare Collection (AFC) and Automatic Passenger Counting (APC) systems record data for every transaction or passenger activity.
• Variety refers to different sources of data sets. It is not unusual in Big Data implementations to collect and combine data from different sources of information. This study is designed to collect, aggregate and analyze data from three different data sources: AVL, APC and AFC.
Subsequently, Gartner (2019) proposed the following definition of big data, which is in line with the three V's: ''Big data is high-volume, high-velocity, and/or high-variety information assets that demand innovative forms of information processing that enable enhanced insight, decision making and process automation''.

2) CONTRIBUTION
This study is one of the first and most comprehensive attempts on the sources of bus service unreliability that exactly matches the three-dimensional definition of big data described above.
C. MEASURING THE UNRELIABILITY FROM PASSENGER'S AND/OR OPERATOR'S PERSPECTIVE
1) LIMITATION
The impact of various sources of unreliability (such as late departure from the terminal and passenger demand variations) on bus service quality can be evaluated from two different and important dimensions: the passengers' perspective and the service providers' perspective. Bus service unreliability can significantly affect users' perception of public transport. A number of studies have recommended that bus service reliability be evaluated from both the service providers' and the users' perspectives. However, the impact of service unreliability from the passengers' perspective has not yet been well investigated.

2) CONTRIBUTION
We designed and propose a framework within the simulation model that is capable of measuring the level of unreliability from both the passengers' and the operators' perspectives. In this study, the main objective is not the selection and adoption of strategies; rather, it is to improve the reliability of high-frequency bus services and the analysis tools currently used in the bus transit industry. Thus, a simulation model of a high-frequency bus service was developed to study the causes of service unreliability (through sensitivity tests). However, corrective strategies based on the simulation results are suggested and implemented at the end of this study.
A number of factors can, directly and indirectly, cause variation and irregularity in bus service. In Section II, a comprehensive discussion of the causes of bus service unreliability is presented, based on a review of the literature. Section III presents the data collection procedures, the development of the simulation model, and the bus service reliability indicators. The results of the sensitivity tests, the implementation of the strategy selected based on the sensitivity results, and the output of the simulation model are presented in Section IV.

II. RELATED WORKS
A. SCHEDULE DEVIATION AT TERMINALS
The departure time from the terminal depends on the terminal departure policy applied to each bus. Departures can be assigned to keep either a constant schedule or a constant headway. Cham's categorization of schedule deviations at terminals is used here to describe these two types of terminal departure policy. Under both policies, schedule deviations at terminals can arise from vehicle or driver absences, late completion of a previous trip, poorly scheduled recovery time, and insufficient operator control (Figure 1) [1]. It is expected that schedule deviations at a terminal are the main reasons for unreliability because these deviations complicate the flow along the whole route. For instance, when the headway becomes a little longer at the terminal, it may be compounded by more passengers boarding; thus, dwell times get longer along the whole route. By comparing the actual and scheduled departure times, schedule deviations can be calculated at any point along a segment. While a few buses can compensate for a schedule deviation and return to their planned times along the route, others might not be able to do so. The schedule deviation might increase if the bus falls even further behind the timetable as it travels along the route, and that could result in a further decline in service.
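The schedule deviation described above is simply the difference between the actual and the scheduled departure time at a given point. The following minimal Python sketch illustrates the calculation; the function name, the time format, and the three example departures are hypothetical and are not taken from the RapidKL data or from the authors' R code.

```python
from datetime import datetime

def schedule_deviation_seconds(scheduled, actual, fmt="%H:%M:%S"):
    """Return actual minus scheduled departure time in seconds (positive = late)."""
    t_sched = datetime.strptime(scheduled, fmt)
    t_actual = datetime.strptime(actual, fmt)
    return (t_actual - t_sched).total_seconds()

# Hypothetical departures of one bus at three points along a route:
# (location, scheduled departure, actual departure)
stops = [
    ("Terminal", "07:00:00", "07:01:30"),
    ("Stop 5", "07:12:00", "07:14:45"),
    ("Stop 10", "07:25:00", "07:29:10"),
]

deviations = [(name, schedule_deviation_seconds(s, a)) for name, s, a in stops]
for name, dev in deviations:
    print(f"{name}: {dev:+.0f} s")
```

In this made-up example the deviation grows from the terminal to the later stops, which is the pattern the text describes for a bus that cannot recover its schedule.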

B. RUNNING TIME VARIATIONS
One important factor that can worsen transit service reliability is an increase in the variation in run time for a given mean run time [15], [16]. Variation in run time results in unpredictable service from the perspective of passengers because it increases waiting time and in-vehicle time [17]. Passenger behaviour variables such as boarding and alighting percentages affect the run-time variation [17]- [19].
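To make the idea of "variation in run time for a given mean run time" concrete, the sketch below compares two hypothetical segments with the same mean running time but different spread, using the coefficient of variation. The numbers are invented for illustration, and the choice of the coefficient of variation as the summary statistic is an assumption of this sketch rather than the measure used in [15], [16].

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Hypothetical running times (seconds) on two segments with equal means
segment_a = [300, 310, 290, 305, 295]
segment_b = [240, 360, 280, 330, 290]

mean_a, mean_b = mean(segment_a), mean(segment_b)
cv_a = coefficient_of_variation(segment_a)
cv_b = coefficient_of_variation(segment_b)

print(f"Segment A: mean={mean_a:.0f} s, CV={cv_a:.3f}")
print(f"Segment B: mean={mean_b:.0f} s, CV={cv_b:.3f}")
```

Both segments have the same average, yet passengers on segment B face a much less predictable service.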
Different studies have identified various factors that can impact bus running time, including the distance involved, passenger behaviour (boarding and alighting), geometric conditions (such as the number of signalised intersections), lateness at the start of the trip, time of day, number of actual stops made, environmental factors (such as rain and snow), and traffic conditions [16]. Reference [20] showed that running time variation can be decreased by using new buses.
However, transit agencies are faced with the problem of optimising running times since alterations in running time can greatly influence service reliability and the entire journey cost [21]. Agencies attempt to minimise these delays by various means, such as smart card ride collection programs, reserved bus lanes, back-door-only policies for alighting, front-door-only policies for boarding, low-floor buses, consolidation of bus stops, and transit signal priority (TSP) technology [22], [23]. In addition, researchers have found that improving the reliability of service in terms of running time and running time variation is strongly dependent on increasing passengers' satisfaction levels and responding to demands [24], [25].

C. LEVEL OF CROWDING
From an operational perspective, crowding is an important challenge because it slows boarding and alighting. From a planning perspective, crowding is a measure of efficiency. Each of these perspectives requires separate measures of crowding based on passenger count data [26]. The on-board level of crowding reduces the quality of service for boarding passengers. Bus crowding prolongs waiting times beyond the headway because waiting passengers may be unable to board the first bus to arrive.
The costs, externalities, and penalties of crowding are expected to grow as the occupancy levels of buses or bus stops rise. As a result, users are willing to pay more to reduce their travel time if, for instance, they ride on a bus with an occupancy of four travellers per square meter than if they ride on a bus that has only a handful of passengers, all safely seated [27], [28]. Thus, a connection is expected to exist between the crowding level and the value of travel time savings (VTTS), and in some cases it has been identified empirically [29], [30].
Along with cost, trip time, reliability, and service frequency, crowding has a significant impact on modal choice because of the value placed on decreasing crowding in all definitional variants. People react in a few different ways when faced with crowding. The reason that passengers dislike crowding is not limited to the physical discomfort associated with being forced to stand close together and share space with other passengers [27], [31]. Research has shown that both wait times and travel times, which are significant factors of service reliability, can be decreased depending on the number of people in vehicles, carriages, and stations, which results in a decrease in crowding costs and crowding externalities [28], [32].
The authors in [33] developed a model for running time, which shows that the overall running time is affected by the number of stops. Having more passengers on board requires that more stops be served. Therefore, greater variability in passenger load increases the variability in running times.
When buses are not full, travellers are aware that they are going to get on the first bus that arrives at their stop, but when buses are overfilled on average, users cannot easily anticipate whether or not the next bus will have extra space. This suggests that passengers may be obliged to wait for at least one other bus, which may result in an increase in waiting time. This situation is one of the sources of uncertainty in travel times, which increases the overall price of travel more than an increase in the standard waiting time. It is because greater variability in travel times is negatively valued by travellers, as shown by a growing body of research on the valuation of travel time variability and reliability [34]- [37].

D. DWELL TIME VARIATION
The bus dwell time is defined as the time spent by a bus at a bus stop for passenger boarding and alighting, including the time needed to open and close the doors. The bus dwell time is crucial to estimate bus station capacity [38], [39], and it is, moreover, a key component of bus travel time [40]- [42]. Furthermore, the dwell time functions play a critical role in the transit assignment models and analysis for transit reliability [43]- [45]. Consequently, the estimation of bus dwell time is vital for public transport designers and bus operators.
Reference [46] conducted an interesting fundamental study on the estimation of bus dwell time. Since then, several case studies have been done to explore secondary factors contributing to the dwell time estimation. For instance, the authors in [47], [48] examined the link between dwell time and bus fare payment system. In [49], [50], the effect of bus floor types on the bus dwell time was investigated. Reference [51] investigated how platform walking on the bus rapid transit (BRT) stations affects the bus dwell time. Tirachini and Hensher (2011) examined the influence of fare collection technology on city bus services.
Reference [52] analysed the impact of fare payment methods and the level of crowding on dwell time. Their model proved that dwell time could be decreased significantly when the fare was collected automatically. Moreover, they argued that dwell time can be increased significantly when 60% of bus capacity was surpassed. Furthermore, Sun et al. (2014) considered the smart card data to study passenger boarding/ alighting behaviour and its impact on bus dwell time. They found that the critical occupancy plays a significant role in determining the regime of boarding and alighting processes and the overall activity time.
The review on dwell time variation and factors influencing dwell time proved that dwell time can be a significant cause of irregularity in bus services. According to [40], dwell time can be increased by various factors such as passenger activity, fare payment method, on-board and in station crowding, etc. An increase in dwell time value can directly increase the running time on a specific segment. Therefore, dwell time variation can lead to irregularity (especially in running times), hence decreasing the level of reliability of a bus service. Figure 2 illustrates how crowding can result in service irregularity through impacting the dwell time.
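The relationship between passenger activity, crowding, and dwell time summarized above can be illustrated with a simple dwell time function. The sketch below is not the model of [46] or [52]: the additive form, the per-passenger service times, the door time, and the linear crowding penalty are all hypothetical assumptions. Only the 60% occupancy threshold is taken from the discussion of [52].

```python
def dwell_time(boardings, alightings, load, capacity,
               door_time=4.0, board_s=2.5, alight_s=1.5,
               crowding_threshold=0.6, crowding_penalty=0.5):
    """Illustrative dwell time in seconds at one stop.

    All coefficients are hypothetical placeholders, not calibrated values.
    """
    if boardings == 0 and alightings == 0:
        return 0.0  # no passenger activity: the bus does not stop
    # Passenger service time: boarding and alighting handled sequentially
    service = boardings * board_s + alightings * alight_s
    occupancy = load / capacity
    if occupancy > crowding_threshold:
        # Assumed penalty: service time grows linearly with excess occupancy,
        # reaching +50% when the bus is completely full
        excess = (occupancy - crowding_threshold) / (1.0 - crowding_threshold)
        service *= 1.0 + crowding_penalty * excess
    return door_time + service

# Same passenger activity on a lightly loaded and on a full 60-place bus
print(dwell_time(4, 2, load=20, capacity=60))
print(dwell_time(4, 2, load=60, capacity=60))
```

With identical boardings and alightings, the crowded bus dwells longer, which lengthens its running time on the segment and feeds the irregularity described in the text.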

E. EXOGENOUS FACTORS
Exogenous factors such as traffic conditions, weather, etc. affect bus service. Environmental factors strongly influence the reliability of bus service in terms of running time. Running time can be affected by mixed-use right-of-way conditions such as traffic accidents, turning traffic, and traffic jams. These can have different effects on service reliability. For example, traffic accidents and double-parked vehicles usually occur randomly and have comparatively short effects on bus service reliability, perhaps affecting only a small number of trips. Weather effects are random as well but might affect a whole day's operation.
Several studies have examined the negative effects of bad weather conditions (such as rainy or snowy weather) on traffic flow speed [54], [55]. Such conditions can also affect bus service and slow down buses [56]- [58]. Reference [59] showed that bus travel time increases on rainy days compared to non-rainy days, but the authors did not mention whether the increases were significant. Guo et al. (2007) used ordinary least-squares regression to develop a model to find the relationship between weather variables (temperature, rain, snow, fog, and wind) and Chicago Transit Authority bus ridership. All of the weather variables considered were found to have considerable impacts on passengers' demand, even though they impacted bus and rail modes in different ways. Cohen et al. (2009) employed ordinary least-squares regression to evaluate the effects of temperature, rain, and snow on transit ridership and revenue in New York City. Their results showed that most of the variables considered had significant effects on transit company revenue. In the spring and fall, when temperatures are usually a little cooler, both bus and subway revenue increased. On the other hand, higher temperatures in the summer were correlated with lower subway revenue. In addition, snow and rain were correlated with lower revenue for both bus and subway operations.

III. METHODOLOGY
A. DATA
Route U32 in Kuala Lumpur City Centre (KLCC), Malaysia, was selected for the present study. This is a high-frequency route passing through the most populated and congested streets of the city. There are 30 bus stops in each direction (60 stops in total), and nine buses currently serve this route during an operating day. Table 1 presents the specifications of Route U32 and the list of key stops in each direction. Evaluation of public transportation reliability is not possible without the implementation of ''Big and Smart Data''. The RapidKL bus company, established in 2004, is one of the biggest public transportation companies in Asia. The main vision and mission of this company are to provide sustainable public transportation services in Malaysia. RapidKL uses advanced and up-to-date technologies, such as Automatic Data Collection Systems and Smart Fare Collection Systems, to monitor and evaluate its bus services. Raw automatically collected data, like manually collected data, must be processed to identify and eliminate errors. For example, AVL data should have a record at every polling interval or stop; thus, there should not be large gaps in the data when a bus is in revenue service. Bad data are removed during the processing stage, which does not require much time or effort once the process has been automated. Basically, there are two types of bad data in both the APC and AVL data sets: double-recorded data and outliers, which lie at an abnormal distance from other data (such as very long running times or very late arrival times on high-frequency routes). Noisy data can influence the output of all models to some extent. Therefore, many studies highly recommend applying data cleansing methods before the analysis step [62]- [66].
Double-recorded data can be easily recognized and eliminated by the simulation model. We implemented an outlier detection method [64] to identify and delete outliers. Based on the validation results (Wilcoxon test), the output of the simulation model is highly reliable. Therefore, it can be concluded that, in the presence of ''Big Data'', simple procedures such as outlier and double-record detection are enough to obtain a clean data set. Simulation models and machine learning techniques are more robust in the presence of noisy data because, when dealing with large data sets, these models are capable of estimating missing data and outliers. The AVL and APC data sets contain inconsistencies and anomalies, such as missing data (mostly in the AVL data sets) and an imbalanced number of boardings and alightings over a complete trip (in the APC data sets). Imbalanced boarding and alighting was the most frequently observed anomaly in the collected APC data. Negative volumes were also observed in the APC data sets. Many of the trips with imbalanced passenger activity data are corrected after the APC data set is aggregated with the AFC data in the simulation model environment. The simulation model is able to detect negative values and correct them before modelling the dwell time. In addition, we applied missing data treatment techniques [63] to correct missing data in the AVL data set.
In this study, data sets were extracted from the archive of the RapidKL Automatic Data Collection Systems: Automatic Vehicle Location (AVL), Automatic Passenger Counting (APC), and Automatic Fare Collection (AFC). All data used in this study were collected during May, June, and July 2019 (weekdays only).
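The two cleaning steps mentioned above, removing double-recorded entries and removing outliers, can be sketched as follows in Python. The record layout (`bus_id`, `stop_id`, `timestamp`, `running_time`) is hypothetical, and the interquartile-range rule is used here only as a stand-in: the text does not specify which outlier detection method from [64] was applied.

```python
from statistics import quantiles

def drop_duplicates(records, key_fields=("bus_id", "stop_id", "timestamp")):
    """Keep the first record for each (bus, stop, timestamp) key."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def drop_outliers_iqr(records, field, k=1.5):
    """Remove records outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    values = [rec[field] for rec in records]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [rec for rec in records if low <= rec[field] <= high]

# Hypothetical AVL-derived running times (seconds) on one segment
raw = [
    {"bus_id": 1, "stop_id": 5, "timestamp": 1000 + 600 * i, "running_time": rt}
    for i, rt in enumerate([300, 310, 295, 305, 290, 315, 1800])
]
raw.append(dict(raw[0]))  # simulate a double-recorded entry

clean = drop_outliers_iqr(drop_duplicates(raw), "running_time")
print(len(raw), "raw records ->", len(clean), "clean records")
```

Duplicates are removed first so that repeated records do not distort the quartiles used by the outlier rule.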

B. SIMULATION MODEL CONCEPT AND ARCHITECTURE DESIGN
The framework we used to develop the simulation model is based on our previous study [67], to which we added the sensitivity analysis and the adoption of a suitable strategy based on the sensitivity results. Figure 3 presents the activity diagram of the simulation algorithm. As described earlier, we used three different ADCS sources to capture as much detail as possible: AVL, APC and AFC. After bad data (such as double-recorded data and outliers) were cleaned, four main parameters were extracted from the raw ADCS data: running times and headways from the AVL data set, passenger demand from the AFC and APC data sets, and dwell times by aggregating AVL, AFC and APC. The simulation model has to generate running times for each segment (which include arrival times and headways), dwell times, i.e., the time interval a bus waits at a bus stop for boarding and alighting passengers (which also determine the departure times), and passenger demand models, which describe all passenger activities such as boarding, alighting and arrival rates. The running time distributions, dwell times and location controllers, terminal departure times, vehicle profiles and demand representation must be defined to run the simulation model. Basically, each run of the simulation model includes: initialization (setting up the route specifications and vehicle profiles), terminal departure behaviour, travelling on a segment (distance, running time, and passenger activity along the segment) and serving a stop (dwell/holding time and initiating passenger activity).
Every simulated route contains different locations such as segments, stops, and terminals. Each location has a particular controller and distribution. Before running the simulation, all the controller-based parameters, such as the demand representation, the distribution of segment running times, the vehicle profiles, and the route specifications (e.g., the number of stops, length, and the number of buses), must be specified exactly. The next steps are the replications. Each run has an initialization stage and a monitoring (data collection) stage. In the end, observations from all these replications are compared with each other to obtain the performance measures. In this simulation environment, every vehicle has a specific profile containing the location and time of its insertion, movements, and removal.
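The structure of a run described above (initialization, terminal departure, travelling on a segment, serving a stop, followed by replications) can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the authors' R implementation: the normal distributions, the parameter values, the no-overtaking rule, and the boarding-only dwell time are all assumptions made for this sketch.

```python
import random
from statistics import mean, stdev

def simulate_run(rng, n_buses=9, n_stops=10, sched_headway=300.0,
                 departure_sd=60.0, run_mean=120.0, run_sd=20.0,
                 arrival_rate=0.01, board_s=2.5):
    """One replication; returns departures[bus][stop] in seconds."""
    # Initialization and terminal departure behaviour:
    # scheduled departure plus a random deviation (assumed normal noise).
    terminal = sorted(i * sched_headway + rng.gauss(0.0, departure_sd)
                      for i in range(n_buses))
    departures = [[t] for t in terminal]
    for stop in range(1, n_stops):
        prev_departure = None  # departure of the preceding bus at this stop
        for bus in range(n_buses):
            # Travelling on a segment: draw a running time (floored at 30 s).
            run = max(30.0, rng.gauss(run_mean, run_sd))
            arrival = departures[bus][stop - 1] + run
            if prev_departure is None:
                gap = sched_headway  # first bus: assume a scheduled gap
            else:
                arrival = max(arrival, prev_departure)  # no overtaking
                gap = arrival - prev_departure
            # Serving a stop: passengers accumulated during the gap board.
            dwell = board_s * arrival_rate * gap
            prev_departure = arrival + dwell
            departures[bus].append(prev_departure)
    return departures

def headway_cv(departures, stop):
    """Coefficient of variation of headways between consecutive buses."""
    times = [bus[stop] for bus in departures]
    headways = [b - a for a, b in zip(times, times[1:])]
    return stdev(headways) / mean(headways)

def replicate(n_runs=200, seed=1, **kwargs):
    """Average headway CV at the terminal and at the last stop."""
    rng = random.Random(seed)
    first, last = [], []
    for _ in range(n_runs):
        d = simulate_run(rng, **kwargs)
        first.append(headway_cv(d, 0))
        last.append(headway_cv(d, len(d[0]) - 1))
    return mean(first), mean(last)

first_cv, last_cv = replicate()
print(f"Mean headway CV: terminal={first_cv:.2f}, last stop={last_cv:.2f}")
```

Even in this toy version, headways become more irregular further down the route. A sensitivity test in the spirit of the paper would rerun `replicate` with, for example, a smaller `departure_sd` or `run_sd` and compare the resulting indicators.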

C. PASSENGER DEMAND MODEL
Passenger activities (arrival rate, boarding, and alighting) are completely considered in this simulation model. To run the simulation model, passenger activity must be calculated and inserted according to the route's specifications. Tables 2 and 3 present the passenger arrival rate and passenger activity, respectively, for route U32 during the morning peak hours.

E. RUNNING TIME MODEL
Evaluating running times at a microscopic level helps the researchers to gain a clearer insight into bus service performance and reliability. In addition, developed models can be used in simulation environment to generate running times for each segment. Moosavi and Yuen (2020) analyzed factors affecting running times and proposed a running time model accordingly. The equation below shows the model proposed by them. This method is used in the current study in order to generate the segment running times.
All the above parameters need to be inserted into the simulation model separately for each recorded running time.
The simulation model will determine the running time by conducting a non-linear regression analysis.

F. VERIFICATION AND VALIDATION
Many researchers have argued that a simulation model is a useless tool before verification and validation. North (2007) clearly stated: ''Before verification and validation, models are toys; after verification and validation, models are tools'' [68].
The main purpose of verification is to ensure that the code works as intended (Law, 2008). In this study, verification of the simulation model involved three separate stages: 1) going through the code and running it line by line to discover any invalid value or bug; 2) recording all details of decisions and actions in the simulation log file, and double-checking these recorded details to ensure that everything is recorded properly; and 3) watching the route layout and the movement of vehicles at a highly accelerated pace, so that any error or bug in the simulation model can be observed. Different verification analyses were conducted on the code, individual components, and algorithms, as explained above. A number of minor errors and bugs were detected during the verification process. All the errors were rectified before proceeding to the validation stage. Figure 4 shows a snapshot of the simulation model animation playback during the last stage of the verification.
Basically, a simulation model is validated to ensure that the output of the simulation is close enough to the real-world situation and that the results are reliable. In this study, the simulation model was validated through a comprehensive comparative analysis between the simulation output and the actual real-world data using the Wilcoxon signed-rank test. The generated dwell times, running times, and headways (which were calculated using the automatic data collection systems and smart cards) were validated, and the results are reported in Table 4. According to Table 4, the calculated dwell times, headways, and running times had relatively small errors after 1000 iterations for each segment and key stop, and the results are therefore considered valid. The estimated headways showed a higher percentage of rejection than the dwell and running times. Although the verification tests were successful and the code was extensively revised where needed, it is possible that programming or design errors in the algorithm caused these differences. However, the difference is not significant.
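The validation step compares paired observed and simulated values with a signed-rank test. The sketch below implements a simplified two-sided Wilcoxon signed-rank test in plain Python, using the normal approximation without tie or continuity corrections; in practice a library routine (for example `wilcox.test` in R) would normally be used. The dwell time values are invented for illustration and are not the values behind Table 4.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W, p) where W is the smaller of the positive and negative
    rank sums. Simplification: no tie or continuity correction.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    p = 2 * NormalDist().cdf((w - mean_w) / sd_w)
    return w, min(p, 1.0)

# Hypothetical observed dwell times (s) and two sets of simulated values
observed = [30, 32, 28, 35, 31, 29, 33, 34, 27, 36]
offsets = [1, -2, 3, -4, 5, -6, 7, -8, 9, -10]
simulated_ok = [o - d for o, d in zip(observed, offsets)]  # unbiased errors
simulated_biased = [o - 5 for o in observed]               # systematic shift

w_ok, p_ok = wilcoxon_signed_rank(observed, simulated_ok)
w_bad, p_bad = wilcoxon_signed_rank(observed, simulated_biased)
print(f"unbiased: W={w_ok}, p={p_ok:.3f}; biased: W={w_bad}, p={p_bad:.3f}")
```

A large p-value means the test does not reject the hypothesis that simulated and observed values come from the same distribution, which is the sense in which a simulated quantity is treated as valid here.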

G. MEASURING BUS SERVICE RELIABILITY
In this study, four indicators of bus service reliability were selected based on the results reported in [67], [69]. The indicators cover both the passengers' and the agencies' perspectives of reliability: waiting time and on-board crowding from the passengers' perspective, and the Headway Regularity Index at Stops (HRIS) and the bunching/big gap percentage from the operators' perspective. A brief explanation of the methods used to calculate these four reliability indicators is presented below.

1) WAITING AND EXCESS WAITING TIME (EWT)
The simulation model calculates the actual waiting times using Equation (2):
$$E(w) = \frac{E(h)}{2}\left(1 + \mathrm{cov}(h)^{2}\right) \tag{2}$$
where E(w) is the average waiting time, E(h) is the average headway, and cov(h) denotes the coefficient of variation of headways. TCQSM (2010) suggested calculating the ''excess waiting time'' (EWT) for a better understanding of waiting time reliability. EWT can be calculated simply by Equation (3):
$$EWT = AWT - SWT \tag{3}$$
where AWT is the Actual Waiting Time and SWT is the Scheduled Waiting Time, both of which are calculated using Equation (2) for the average waiting time. The only difference is in the headways: AWT requires the actual headways, which were recorded and extracted from AVL data, while SWT requires the scheduled headways (according to the service providers' timetable).
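As a worked illustration of the waiting time calculation, the sketch below assumes that Equation (2) takes the standard random-arrival form E(w) = E(h)/2 · (1 + cov(h)²) and that EWT is the difference between AWT and SWT. The function names and the sample headways are hypothetical, and the use of the population standard deviation for cov(h) is an assumption.

```python
from statistics import mean, pstdev


def expected_waiting_time(headways):
    """Average wait of randomly arriving passengers: E(h)/2 * (1 + cov(h)^2)."""
    avg = mean(headways)
    # Assumption: cov(h) uses the population standard deviation.
    cov = pstdev(headways) / avg
    return avg / 2 * (1 + cov ** 2)


def excess_waiting_time(actual_headways, scheduled_headways):
    """EWT = AWT - SWT, both computed with the same waiting time formula."""
    awt = expected_waiting_time(actual_headways)
    swt = expected_waiting_time(scheduled_headways)
    return awt - swt


# Hypothetical headways in minutes (illustrative values, not study data).
scheduled = [10, 10, 10, 10]
actual = [5, 15, 5, 15]

swt = expected_waiting_time(scheduled)      # perfectly regular service
awt = expected_waiting_time(actual)         # same mean headway, irregular
ewt = excess_waiting_time(actual, scheduled)
```

In this example the mean headway is identical in both cases, so the entire excess waiting time comes from headway variation, which is the mechanism the headway sensitivity test later exploits.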

2) ON-BOARD CROWDING
Some authors have suggested the use of the nominal seating and standing capacity of a vehicle in measuring the load factor [70], [71]. If this definition is employed, a load factor of more than 80% indicates that a vehicle is crowded. The authors broadened the Transit Capacity and Quality of Service Manual (TCQSM, 2010) definition of the impact of on-board crowding on comfort for various passenger loads on a 42-seat bus, as outlined in Table 5. Passengers consider on-board crowding important because it adversely affects their comfort. A set of social, sensory, and psychological issues is associated with high levels of human density, including perceptions of risk to personal security and safety [72], [73], heightened anxiety [74], feelings of exhaustion and stress [75], [76], a feeling of invasion of privacy [71], potential ill-health [72], the likelihood of arriving late to work [77], and the probable decrease in productivity associated with working on a bus [78].
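The load factor definition mentioned above can be sketched as follows. This is only an illustration of the 80% rule: the function names and the standing capacity of 20 are hypothetical, and the LOS thresholds of Table 5 are not reproduced here.

```python
def load_factor(passengers_on_board, seats, standing_capacity):
    """Passengers on board divided by nominal seating plus standing capacity."""
    capacity = seats + standing_capacity
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return passengers_on_board / capacity


def is_crowded(passengers_on_board, seats, standing_capacity, threshold=0.80):
    """A vehicle counts as crowded when its load factor exceeds the threshold."""
    return load_factor(passengers_on_board, seats, standing_capacity) > threshold


# Hypothetical vehicle: 30 seats and an assumed standing capacity of 20.
crowded_trip = is_crowded(45, seats=30, standing_capacity=20)   # load factor 0.9
relaxed_trip = is_crowded(35, seats=30, standing_capacity=20)   # load factor 0.7
```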

3) HEADWAY REGULARITY INDEX AT KEY STOPS
Short headway and high passenger demand are the two most significant characteristics of high-frequency services. In addition, passengers tend to arrive at stops more randomly instead of rigidly following the timetable. In this situation, reliability can be measured by the service providers' ability to minimize headway variations and average waiting time for the passengers. Accordingly, the headway regularity index at key stops (HRIS) is designed to measure headway reliability at one specific point. Equation (5) can be used to calculate HRIS. When HRIS is equal to zero, there is no variation in the headways, and when the value is equal to 1.0, it indicates that bus bunching or big gaps happen frequently.
where n is the number of buses that serve stop j, $H^{s}_{i,j}$ denotes the scheduled headway for bus i at stop j, and $H^{a}_{i,j}$ represents the actual headway for bus i at stop j.
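Equation (5) is not reproduced in this text, so the following sketch assumes a commonly used form of a headway regularity index: the mean absolute deviation of the actual headway from the scheduled headway, relative to the scheduled headway, over the n buses serving a stop. Under this assumption the index is 0 for perfectly regular service and reaches 1.0 when buses alternate between bunching and gaps of twice the scheduled headway. The function name and sample values are hypothetical.

```python
def headway_regularity_index(actual_headways, scheduled_headways):
    """Assumed form: (1/n) * sum(|H_actual - H_scheduled| / H_scheduled)."""
    if not actual_headways or len(actual_headways) != len(scheduled_headways):
        raise ValueError("need equally sized, non-empty headway lists")
    deviations = [
        abs(actual - scheduled) / scheduled
        for actual, scheduled in zip(actual_headways, scheduled_headways)
    ]
    return sum(deviations) / len(deviations)


# Hypothetical headways in minutes at one key stop.
scheduled = [10, 10, 10, 10]
hris_regular = headway_regularity_index([10, 10, 10, 10], scheduled)
hris_paired = headway_regularity_index([0, 20, 0, 20], scheduled)
```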

4) BUS BUNCHING AND BIG GAP
According to TCQSM (2010), on high-frequency bus routes, headways shorter than one minute can be considered bunching, while headways longer than twice the scheduled headway can be considered big gaps.
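The two thresholds above translate directly into a classification rule. The sketch below applies them to a list of headways and reports the share of bunched trips and big gaps; the function names and the sample headways are hypothetical.

```python
def classify_headway(actual_headway, scheduled_headway, bunching_threshold=1.0):
    """Label a headway (in minutes) as 'bunching', 'big gap', or 'regular'."""
    if actual_headway < bunching_threshold:
        return "bunching"
    if actual_headway > 2 * scheduled_headway:
        return "big gap"
    return "regular"


def bunching_and_gap_percentages(actual_headways, scheduled_headway):
    """Percentage of headways classified as bunching and as big gaps."""
    labels = [classify_headway(h, scheduled_headway) for h in actual_headways]
    total = len(labels)
    return {
        "bunching": 100 * labels.count("bunching") / total,
        "big gap": 100 * labels.count("big gap") / total,
    }


# Hypothetical headways in minutes on a route with a 5-minute schedule.
headways = [0.5, 5, 11, 4, 0.8, 6, 5, 12, 5, 5]
shares = bunching_and_gap_percentages(headways, scheduled_headway=5)
```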

IV. SENSITIVITY ANALYSIS RESULTS AND DISCUSSION
The aim of the sensitivity analysis is to determine the effect of various factors on bus service reliability. One parameter at a time is selected and swept in the simulation model across a range of reasonable values, while the other parameters are held constant. One thousand simulated iterations are run for every parameter. Four parameters are taken into account as the main sources of irregularity: passenger demand variation, terminal departure deviation, dwell time variation, and headway variation. According to the literature review, these parameters are recognized as the main reasons for irregularity on most transit routes. For every step in a parameter sweep, the passenger excess waiting time, on-board level of service, HRIS, and bus bunching/big gap metrics are calculated and compared. Figure 6 presents the sensitivity analysis procedure and the measurement of bus service reliability. The 100% value in the following tables represents the actual value of each parameter as observed in the real world. To present the on-board level of service, levels of service (LOS) A to C correspond to seated passengers, while LOS D to F correspond to standing passengers. The headway regularity index was captured and presented for each key stop separately.
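The one-parameter-at-a-time procedure described above can be sketched as follows. This is a minimal illustration rather than the study's R implementation: `toy_waiting_time` is a hypothetical stand-in for the route simulation, and the baseline values, parameter names, and sweep levels are assumptions.

```python
import random


def one_at_a_time_sweep(simulate, baseline, parameter, levels,
                        iterations=1000, seed=0):
    """Sweep one parameter while all others stay at their baseline values.

    `levels` are fractions of the baseline value (1.0 = observed value).
    Returns {level: mean metric over `iterations` simulation runs}.
    """
    rng = random.Random(seed)
    results = {}
    for level in levels:
        params = dict(baseline)
        params[parameter] = baseline[parameter] * level
        total = 0.0
        for _ in range(iterations):
            total += simulate(params, rng)
        results[level] = total / iterations
    return results


def toy_waiting_time(params, rng):
    """Hypothetical stand-in model: draw 20 headways, return the average wait."""
    headways = [
        max(0.1, rng.gauss(params["mean_headway"], params["headway_sd"]))
        for _ in range(20)
    ]
    avg = sum(headways) / len(headways)
    variance = sum((h - avg) ** 2 for h in headways) / len(headways)
    return avg / 2 * (1 + variance / avg ** 2)


baseline = {"mean_headway": 10.0, "headway_sd": 4.0}  # assumed values
levels = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
sweep = one_at_a_time_sweep(toy_waiting_time, baseline, "headway_sd", levels)

# Percentage change relative to the 100% (observed) row, as in the tables.
relative_change = {
    level: 100 * (value - sweep[1.0]) / sweep[1.0]
    for level, value in sweep.items()
}
```

Fixing the random seed keeps the sweep reproducible, so differences between rows reflect the parameter level rather than run-to-run noise.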

A. PASSENGER DEMAND
Passenger demand is one of the factors that could significantly affect bus service reliability. Average waiting time and on-board crowding are two important reliability indicators from the passengers' perspective that can be directly affected by the level of demand. The sensitivity of bus service reliability to variations in passenger demand is presented in this section. As presented in Tables 6 to 9, passenger demand is increased from 25% of the value observed in the real world up to 200%. The passenger demand level is indicated in the first column of each table: the minimum demand is 25%, the actual real-world situation is presented in the 100% row, and the maximum demand, twice the actual case, is presented in the 200% row.
Passenger demand can indirectly affect the passengers' waiting time by increasing the dwell time. In other words, when demand increases, more passengers are waiting to board at bus stops, so the dwell time increases. A longer dwell time can in turn directly increase the on-board and off-board waiting times along the route. According to Table 6, both the average waiting time (W.T) and the excess waiting time (E.W.T) increase with an increase in passenger demand. Conversely, the average waiting time falls when passenger demand is reduced, because both the dwell time and the probability of a full bus arriving (a bus too crowded to take on more passengers) decrease. However, the effect is small: as can be seen in Table 6, the waiting time decreased by only 2.9% when passenger demand was as low as 25% of the actual value.
The excess waiting time showed more sensitivity to passenger demand variations (see Table 6, last column). When demand is increased above the actual value (100% row), the excess waiting time increases at a faster rate than the average waiting time. This may be because of the increased number of passengers who are unable to board the first bus that arrives. Table 7 shows the impact of passenger demand on the on-board level of crowding according to the TCQSM guideline. As expected, passenger demand variations have a direct and significant effect on the on-board level of service. According to Table 7, if passenger demand increases to twice the actual value, 100% of passengers will experience an overcrowded bus throughout the route. In this situation, increasing the number of vehicles on the route can be the most efficient method to decrease crowding. On-board crowding can also be controlled by improving on-time performance.
Tables 8 and 9 present the impact of passenger demand variation on the headway regularity index and bus bunching/big gaps, respectively. Headway regularity indexes at key stops (HRIS) are presented separately in both directions. According to Table 8, headway irregularity is very high even when passenger demand is at the minimum possible value. Therefore, it can be concluded that the current level of irregularity in the route is more due to other operational irregularities (such as running time variation), rather than the passenger demand impact.
The effect of passenger demand variation on bus bunching and big gaps follows a trend similar to that of the other metrics. However, the relative change in big gaps is higher than the relative change in bus bunching. An increase in passenger demand increases the bus running time. Therefore, the average headway increases and more big gaps are observed.

B. TERMINAL DEPARTURE DEVIATION
According to reports released by the RapidKL Bus Company, there is a notable deviation in departure times from the terminal. It is therefore useful to determine how these deviations affect bus service reliability. A comprehensive sensitivity analysis of the effect of terminal departure behaviour on the reliability metrics is conducted and presented in this section.
The terminal departure behaviour model was first adjusted to simulate departures exactly according to the timetable, with 0% deviation. The departure deviation was then increased in increments of 25% up to twice the average deviation observed in the real world. Therefore, a departure deviation of 100% indicates the actual situation observed in the field, 0% indicates a perfect departure with no deviation, and 200% indicates a deviation twice the observed value. Tables 10 to 13 present the results of the sensitivity analyses of terminal departure on waiting time, on-board level of service, headway regularity, and big gap/bunching, respectively.
Excess waiting time is even more sensitive to departure time deviations. Excess waiting time decreased by 51.17% when the departure time deviation was 0%, while it increased by 10.3% when the deviation in departure times was twice the observed value. This range is wider than the one shown in the passenger demand sweep. Therefore, waiting time is more sensitive to terminal departure behaviour than to passenger demand variations. As presented in Table 11, the percentage of overloaded buses (LOS F) increased by 4.5% when the deviation in departure was twice the observed value (200% row). With 0% terminal departure deviation, more even loading was observed, with an increase in the LOS A and B percentages and a decrease in LOS C to F.
HRIS also showed sensitivity to changes in terminal departure deviation. However, as can be seen from Table 12, the level of sensitivity is not considerable. The headway regularity index is calculated by considering all headways recorded during the morning peak hours from 6:00 a.m. to 9:00 a.m. During these 3 hours, many factors other than the first departure from the terminal can directly and indirectly affect the headway regularity. Therefore, the impact of the terminal departure alone on the total amount of headway irregularity can be low, as presented in Table 12. For example, HRIS 1 improved by 6% when the departure deviation was 0%.
According to Table 13, bus bunching is much more sensitive to terminal departure time deviations. No bus bunching was recorded when the terminal departure deviation was decreased to 0%. Moreover, bus bunching increased by 85% when the departure deviation was increased to 200%. Big gaps also showed a high sensitivity to terminal departure deviation, especially when departures are perfectly controlled at terminals with 0% deviation from the scheduled departure times (about 83% improvement). Therefore, it can be concluded that terminal departure control policies can be a very effective method for the elimination of big gaps and bunching.

C. DWELL TIME VARIATION
Variation in dwell time can result in irregularity along a transit route. Longer dwell times lead to longer running times. Moreover, an increase in dwell time causes longer waiting times for both on-board and off-board (at bus stops) passengers. In addition, dwell time may directly affect the on-board level of crowding. Therefore, many bus companies adopt operational strategies to control dwell time variations along the transit route. In this section, the sensitivity of the bus service reliability indicators to dwell time variation is examined.
Dwell time was modelled explicitly based on the suggestions provided in [40], and it was implemented in the simulation model for further analysis. The model was run 1000 times for each key stop, and all dwell times were recorded according to their location and time. To conduct the sensitivity test, the variation in the modelled dwell times was first reduced to 25% of the actual value and then increased up to 200% of the actual value observed in the real world. Since the dwell time model was fully validated, the results of the sensitivity analysis are also expected to be reliable. Tables 14 to 17 present the passenger waiting time, on-board crowding, headway regularity index, and big gap/bunching results of the dwell time sensitivity analysis, respectively.
According to Table 14, the passenger waiting time responded to changes in dwell time variation. However, the level of sensitivity is not considerable. The average waiting time and excess waiting time decreased by only 1% and 1.44%, respectively, when the dwell time variation was decreased to 25% of the actual value. Moreover, the waiting time and excess waiting time increased by 2.22% and 3.42%, respectively, when the dwell time variation was twice the real-world value. Therefore, it can be concluded that unreliable bus service in terms of waiting time cannot be attributed to dwell time variations.
Although on-board crowding can be improved by decreasing the dwell time variations, the level of sensitivity is not considerable, as presented in Table 15. Levels of service A to C, the on-board crowding levels at which all passengers are seated, improved by 3.8%, 5.7%, and 0.8%, respectively, when the dwell time variation was as low as 25%. Although level of service F also improved to 16.5%, this is still a very high percentage of overcrowded buses.
According to Table 16, the headway regularity does not consistently decrease as the dwell time variation increases. The inconsistent variation in the headway regularity index is due to the random selection of boarding rates for calculating the dwell time.
Moreover, according to Table 17, bus bunching showed more sensitivity to the dwell time variations than big gaps did. An increase in the dwell time variation can lead to an increase in bus bunching of up to 25%. As the dwell time variation increases, the probability of unfilled trips along a transit route also increases. If the first trip is unfilled, the second trip will face more passengers to serve and will have longer dwell and travel times on the route. Moreover, the headway between the second and the third trip will decrease, since the second trip falls behind the timetable and the third trip will face fewer passengers to serve. Therefore, the third trip could catch up with the second trip. Based on the dwell time variation sensitivity results, dwell time does not significantly affect the reliability level of the bus service.

D. HEADWAY VARIATION
The literature contains several studies on the effects of headway variation on the quality and reliability of bus service. However, to the best of our knowledge, it lacks studies that use simulation models to evaluate the impact of headway variations on service performance. Many of the bus service reliability indicators are calculated directly or indirectly from headways (such as waiting time and big gap/bunching). Therefore, most of the reliability indicators are expected to show high sensitivity to headway variations. Headway variations affect the reliability of a bus service both in the course of a trip and in subsequent trips. Accordingly, decreasing headway variation at the first stages of a route can have a positive impact on the following trips and improve the reliability of the whole route. The results of the headway variation analysis are presented in Tables 18-21.
Waiting time and excess waiting time both decreased significantly when headway variations decreased compared to the current level. Waiting time and excess waiting time improved by 61% and 94%, respectively, when headway variation decreased to 25% of the observed value. Therefore, it can be concluded that passengers would experience almost no excess waiting if the service provider succeeded in controlling the headway variations through corrective strategies. One of the most practical strategies to control headway variations is headway-based terminal dispatching [69]. Moreover, waiting time increases by up to 98.6% if headway variation increases to twice the actual value. These results show that waiting time is highly sensitive to headway variation. Table 19 presents the impact of changes in headway variation on the on-board level of crowding. The results revealed that although headway variation could have an impact on on-board LOS, this impact is not significant.
The headway regularity indexes for all key stops showed a substantial improvement when headway variation was decreased, as shown in Table 20. On the other hand, the results showed that if headway variation increases during an operation day, headway irregularity can increase sharply. For instance, the HRI at key stop 1 improved by 44% when the headway variation was as low as 25%, and it worsened by 155% when headway variation was 200% of the actual value. In addition, according to Table 21, big gaps and bus bunching could almost disappear when headway variations are decreased. Big gaps and bunching decreased by 100% and 75%, respectively, when headway variation was decreased to 25% of the real-world condition. The headway variation sensitivity test results showed that bus service reliability can be strongly affected by headway variations. Therefore, bus operators and transit planners should note that controlling headway regularity can be an effective method to increase reliability. In addition, decreasing the headway variation can improve service reliability from both the passengers' and the operators' points of view.

E. DISCUSSION ON SENSITIVITY ANALYSIS RESULTS
This section presents the sensitivity analysis results by considering four different parameters: passenger demand, terminal departure deviation, dwell time variation, and headway variation. According to the literature, these four parameters are the main factors that can significantly decrease the level of bus service reliability. However, it should be noted that each route has its own specific condition and without conducting sensitivity analysis, it would be impossible to figure out which parameter can significantly impact service reliability.
It is very important to identify the main reason for unreliability before taking any action or adopting any improvement strategy. In other words, implementing a strategy before fully understanding the causes of unreliability can be a waste of time and budget for bus operating companies. Comparing the sensitivity analysis results can help researchers and operators gain a clearer insight into the causes of unreliability. Figures 6-10 illustrate the comparison of the sensitivity analysis results. Figure 6 compares the effects of the four above-mentioned causes of unreliability on passenger waiting times. The vertical axis shows the changes in average waiting time induced by the sensitivity tests of the four parameters. According to Figure 6, the waiting time showed the highest sensitivity to headway variations. Waiting time can be decreased by 61% if bus operators control the headway variation as much as possible. Moreover, the terminal departure policy could significantly improve the passenger waiting time. Waiting time can be decreased by 36% when almost all the buses depart the terminal on time. Accordingly, strategies that control the level of headway variation (such as holding at key stops) and an appropriate terminal departure policy can significantly increase service reliability in terms of passenger waiting time. In addition, a combination of these two types of strategies can be a suitable method for optimizing the waiting time.
The on-board crowding level (on-board LOS) is the other indicator of reliability from the passengers' perspective. Many factors can affect the on-board level of service, including the level of passenger demand, service irregularities, the number of vehicles serving the route, and the number of seats in each vehicle (bus type). The vehicles currently used on Route U32 have 30 seats. The number of seats per vehicle is constant; therefore, this factor cannot be included in the sensitivity analysis. In addition, the RapidKL Bus Company is not interested in increasing the number of buses serving the route. Therefore, we decided not to consider these factors in our analysis. However, according to the literature, increasing the number of buses on a route and increasing the number of seats in vehicles can both significantly improve the on-board level of service.
It is very obvious that on-board crowding level can be improved significantly when passenger demand decreases. Although decreasing the level of demand is not in operators' control, it would be useful to figure out to what extent the variations in demand can impact the on-board level of service. In addition, terminal departure also showed a considerable effect on on-board crowding, according to Figure 7.
The on-board level of service ''A'' can be increased by 50% if all the buses depart the terminal on time (terminal departure deviation = 0%). Late departure from the terminal leads to late arrival at the first key stop. Thus, more passengers will be waiting to board at stops when the bus arrives late. This is the simplest explanation of how late departure can lead to a more crowded bus.
Moreover, the dwell time variation and headway variation did not show any significant effect on the on-board crowding level. Therefore, from the on-board crowding sensitivity results, it can be concluded that a terminal departure strategy that controls the deviation is the most effective strategy to improve on-board service reliability.
Figure 8 compares the results of the four sensitivity analyses conducted on the Headway Regularity Index at Stops (HRIS). The dwell time variation and terminal departure deviation tests did not show any considerable effect on HRIS. In addition, the obtained results showed that variation in passenger demand has a considerable impact on headway regularity. On the other hand, according to Figure 8, headway variations can significantly affect HRIS. Headway irregularity can be decreased by 50% when headway variation is 25% of the current situation. In addition, if headway variation increases up to twice the observed value, the regularity index increases to almost 4, which indicates that all the buses are bunched and the service is totally irregular. Therefore, the regularity index, similar to waiting time, is extremely sensitive to headway variation.
Figures 9 and 10 illustrate the effect of the four sensitivity analyses on bus bunching and big gaps, respectively. According to the results, the best method to reduce bus bunching is controlling the deviation at the terminal. Bus bunching can be reduced by 100% when all the buses depart exactly on time (departure deviation = 0%).
On the other hand, bus bunching can significantly increase, when headway variations increase. As a result, the most effective method to control bus bunching is controlling the headway variations. Moreover, the terminal departure policies can have a significant effect on the improvement and even total elimination of bus bunching.
According to Figure 10, the big gap percentage showed the highest sensitivity to the headway variations. Reducing the headway variations can improve the current level of big gaps by 100%, whereas increasing them can worsen the current big gap situation to almost 180%. Therefore, reducing the current level of the headway variations can significantly reduce the big gaps. Moreover, decreasing the terminal departure deviations can significantly reduce the percentage of big gaps.

V. IMPLEMENTATION OF CORRECTIVE STRATEGY BASED ON SENSITIVITY RESULTS
The results of the sensitivity tests were evaluated based on the passenger waiting time, passenger crowding reliability metrics, headway reliability metric (HRIS), and bus bunching/big gap, which are identified as the bus service reliability indicators. The morning peak hours of an operation day are simulated, from the start (typically in the early hours of the morning) to the end (typically 9:00 AM); therefore, there are no boundary conditions to specify (or mis-specify). It is possible to simulate operations in a specific time period in the middle of an operation day with given boundary conditions, but a convenient way of obtaining these conditions is to warm up the model by simulating from the beginning of the day to the beginning of the selected period.
The average and excess passenger waiting times are presented along with the percentage of changes in these observations to actual values. The headway regularity index was captured and presented for each key stop separately (HRIS 1 to 5). Moreover, the percentage of big gaps and bunching are also shown and compared with the baseline results.
According to the sensitivity test results, bus service reliability showed the highest sensitivity to headway variation. One of the most effective strategies to control the variation in bus service is the dispatching policy at the terminal [79]. In addition, one of the benefits of implementing strategies at the terminal is that no passenger would be affected by the extra time of holding strategy.
In the management strategy based on the previous headway, the terminal recovery time (time until departure) for each bus is calculated as the greater of zero and the scheduled headway minus the previous headway. The previous headway is the time since the last bus departed (or will depart) the terminal. In other words, the bus is instructed to depart the terminal as close to the scheduled headway as possible. Tables 22-24 present the impact of a headway-based dispatching strategy on the bus service reliability indicators. According to Tables 22 to 24, bus service reliability is improved by implementing terminal dispatching strategies. However, service reliability was improved more significantly by implementing a headway-based departure strategy.
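The dispatching rule described above can be sketched in a few lines. The recovery time follows the stated rule directly; the `dispatch_times` helper, the notion of a "ready time" at the terminal, and the sample values are hypothetical additions for illustration.

```python
def terminal_recovery_time(scheduled_headway, previous_headway):
    """Hold time at the terminal: max(scheduled headway - previous headway, 0)."""
    return max(scheduled_headway - previous_headway, 0.0)


def dispatch_times(ready_times, scheduled_headway):
    """Apply headway-based dispatching to buses in order of readiness.

    `ready_times` are the times (in minutes) at which each bus could leave
    the terminal. The previous headway is the time elapsed since the last
    departure; a bus that is ready too early is held until the scheduled
    headway has passed, while a late bus departs immediately.
    """
    departures = []
    for ready in ready_times:
        if not departures:
            departures.append(ready)  # first bus has no preceding departure
            continue
        previous_headway = ready - departures[-1]
        hold = terminal_recovery_time(scheduled_headway, previous_headway)
        departures.append(ready + hold)
    return departures


# Hypothetical ready times in minutes with a 5-minute scheduled headway.
departures = dispatch_times([0, 3, 12, 14, 30], scheduled_headway=5)
headways = [later - earlier for earlier, later in zip(departures, departures[1:])]
```

Because held buses wait at the terminal before any passenger has boarded, the holding time in this rule does not add to on-board travel time.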
According to Table 22, waiting time is reduced (improved) by 51% after implementing the headway-based departure strategy at the terminal. However, the actual waiting time is still higher than the scheduled waiting time even after adopting the strategy (139% of the planned waiting time, which is equal to 1 minute and 57 seconds). The schedule-based dispatching strategy did not show any significant improvement in the headway regularity index, according to Table 23. On the other hand, dispatching from the terminal based on headway significantly improved the headway regularity. Since Route U32 is a high-frequency bus service route, headway-based strategies are expected to have a better effect on service quality (Figure 11).

VI. CONCLUSION
Adopting any corrective strategy before understanding the main reason(s) for irregularity and unreliability in public transportation can be a waste of time, capital, and resources. Due to the size and complexity of the bus service network, it is not possible to try changes or implement strategies directly on the bus route in the real world. Therefore, simulation models are needed to determine the sources of unreliability and select the best strategy. Many studies have been conducted on public transport reliability and strategies to improve the level of service. However, to the best of our knowledge, there is no simulation-based sensitivity research that analyses the sources of unreliability in depth. The simulation model proposed in this study was developed in RStudio (an open-source environment for R), which is gradually becoming a leading software environment not only for simulation but also for different types of analysis, modelling, and many other tasks.
Comprehensive analyses and discussion were presented on sensitivity test results obtained in this research. Four different causes of unreliability were tested through the sensitivity analysis: 1) Passenger demand variations, 2) Terminal departure deviations, 3) Dwell time variations, and 4) Headway variations. Variations were changed from 0% to 200% of the real-world observations to clearly understand the level of sensitivity of the model to each factor. Based on the obtained results, the level of reliability in Route U32 in Kuala Lumpur, Malaysia, showed the highest sensitivity to headway variations. For instance, waiting time and excess waiting time both decreased significantly when headway variation was decreased compared to the current level of the observed variation in the real world. Waiting time and excess waiting time were improved by 61% and 94%, respectively, when headway variation decreased to 25% of the observed value.
In addition, the terminal departure deviations had a considerable impact on bus service reliability. Excess waiting time showed high sensitivity to the terminal departure time deviations. Excess waiting time decreased by 51.17% when the departure time deviation was 0% (perfectly on-time departure), while it increased by 10.3% when the deviation in the departure times was twice the observed value. No bus bunching was recorded when the terminal departure deviation was decreased to 0%. Moreover, bus bunching increased by 85% when the departure deviation was increased to 200%. According to the results of the dwell time sensitivity analysis, bus service reliability is not significantly affected by dwell time variations. In addition, terminal departure showed a considerable effect on the on-board crowding level. The on-board level of service ''A'' can be increased by 50% if all the buses depart the terminal on time.

FUTURE WORKS
The primary objective of this study was to develop a simulation model to analyse the sources of bus service unreliability on a single bus route. The simulation code and RStudio packages developed in this study are provided as supplementary files. Therefore, expanding this simulation model to analyse more than one route, or even a whole transport network, based on our method and findings can be an interesting subject for future studies. In addition, the present paper focused on sensitivity analysis and the causes of bus service unreliability, rather than on the implementation of corrective strategies. Since developing and adopting new strategies was outside the scope of this study, only two strategies were designed and adopted on the route (based on the sensitivity results). However, there are other control strategies, such as previous headway holding and Prefol headway holding, which can be designed and applied for the same purpose.