A Schedule-Based Model for Passenger-Oriented Train Planning With Operating Cost and Capacity Constraints

In the planning stage, train operators design timetables to serve passenger trips and a train circulation plan to support these timetables. These designs consider not only operating costs but also passenger convenience. In this study, we developed an optimization model for a new problem that focuses on timetabling and train-unit scheduling while also considering passenger itinerary choices in a schedule-based train system. This optimization model minimizes passenger travel costs within the constraints of a limited budget available for operating costs. The model is solved by an iterative heuristic that simulates the interaction between train operations and passenger itinerary choices. The heuristic solves the timetabling and train-unit scheduling problem using a decomposition approach to increase computational efficiency, while passenger loading is solved by a user-equilibrium passenger assignment model. An example based on the high-speed railway network in southern China was used to demonstrate the effectiveness of the proposed model and method.

determine the number of scheduled train services and list these in the timetable by specifying stopping patterns and departure/arrival times. A train unit (TU) circulation plan helps operators to assign TUs with sufficient seats for onboard passengers to support the implementation of the timetable while considering the capability of depots to manage various types of TU and the availability of adequate train resources in depots for the next day's operations [4].
Additionally, a train plan should consider passenger dynamics. When a plan is used, passengers may change their itineraries accordingly. For example, fewer passengers may use train service T1 if a new train service T2 provides better service. These changes in passenger behaviors may warrant adjustments to the revised timetable and train circulation plan. In the above example, a smaller TU than the TU used in the new plan may be sufficient for T1. A planning process that could simulate this interaction would reduce the cost incurred from the various adjustments required to reach the point where passengers do not change their itinerary choices and operators do not need to further adjust their plans. Therefore, the passenger flows should not be fixed but should vary during the timetabling. However, most current train scheduling methods only consider fixed passenger flows.
Overall, train planning should (a) manage different types of TU; (b) allocate adequate TU resources in depots; and (c) consider passenger dynamics. Existing models incorporate one or two of these three aspects, but no existing schedule-based model incorporates all three. Our study proposes a schedule-based model that combines them. The model minimizes the generalized passenger cost subject to a constraint on the operating cost, for two reasons: the focus of operators has gradually shifted to market-oriented designs that aim to provide services that maintain the feasibility of passengers' itineraries and reduce their journey times [5], and operators are willing to devote a certain amount of effort to satisfying existing customers and attracting potential customers.
The model is solved by a new heuristic which addresses both timetabling and TU scheduling issues based on the decomposition approach, and adopts a user-equilibrium passenger assignment model to predict passengers' reactions to the timetable. The model and method are examined by applying them to the South China high-speed railway (HSR) network to demonstrate their efficiency and applicability.
The remainder of this paper is organized as follows. The literature is reviewed in Section II. The details of our model are described in Section III. The solution method is presented in Section IV. We apply the proposed methodology to one example network in Section V. Finally, a summary of the study is provided in Section VI.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
II. LITERATURE REVIEW
Our problem includes the line planning problem (LPP), transit timetabling problem (TTP), vehicle scheduling problem (VSP), and passenger assignment problem. The LPP, TTP, and VSP are the three important planning problems faced by operators and have been the subject of many studies. Guihaire and Hao [6] provided a review of the LPP and TTP and proposed integrated models for these problems. Parbo et al. [7] and Bunte and Kliewer [8] reviewed passenger-oriented TTPs and VSPs, respectively. We refer interested readers to these reviews for more details. Here, only the studies that are most relevant to our integrated planning problem are reviewed.
Some researchers have explored the integration of two or more of the three planning problems because the solutions obtained from such integration could potentially reduce costs and improve service quality. For example, Wang et al. [2] integrated the TTP and VSP to improve energy efficiency. The model of Zhao et al. [9] combined the TTP and VSP to balance the service level and operating cost. Michaelis and Schöbel [10] suggested a model for the LPP, TTP, and VSP to maximize the number of passengers using buses. Laporte et al. [11] proposed a model that solved the TTP, VSP, and user routing problem simultaneously to reduce the line running cost, fleet size cost, and passenger inconvenience.
However, previous integration studies have generally not considered the following aspects that are important to practical application.
(1) Vehicle type: The vehicle-type problem involves the assignment of an appropriate vehicle type to a transit service according to operating cost [12], passenger demand [4], [13], depot capabilities [14], and energy consumption [15]. If the assigned vehicle type supplies insufficient capacity, passenger demand cannot be met. Conversely, if the assigned vehicle type supplies excess capacity, operational expenses are wasted.
(2) End-of-day balance: Some depots may not be able to manage all vehicle types because of equipment restrictions. The end-of-day balance problem requires each depot to house the desired number of vehicles of each type at the end of an operational period, to ensure that sufficient vehicles are available for the next operational period [4]. Accordingly, to maintain an end-of-day balance, extra vehicles may need to be dispatched from some depots to other depots before the next operational period.
(4) Station-track assignment: Station-track assignment considers station capacity and assigns tracks to train services to maintain safe operation [18] and smooth passenger interchanges by ensuring that various train services share the same platform [17].
The above studies [2], [9]-[11] have not considered vehicle type or end-of-day balance. Schöbel [19] addressed the end-of-day balance problem in a frequency-based integrated model but ignored the stopping pattern design. Moreover, few integrated models assign tracks with consideration of passenger interchanges. Therefore, we developed a novel schedule-based model that considers vehicle type, end-of-day balance, station-skipping, and station-track assignment. In our study, the timetable design and TU scheduling require an analysis of passenger flows for each transit service, so a schedule-based model is more suitable than a frequency-based model [20]-[22].
Furthermore, previous studies have used the system-optimum (SO) condition in passenger loading to establish timetabling models when considering vehicle capacity [23]. However, the SO condition cannot be applied realistically to a congested scenario, because it requires some passengers to sacrifice their ideal choices if the total passenger cost is to be minimized. In contrast, the user-equilibrium (UE) condition describes a state of equilibrium in which passengers (who are self-serving and aim to lower their own travel cost) cannot decrease their costs by choosing another itinerary. Hence, we used the UE formulation, as it is more appropriate for transit assignment than the SO condition.
Generally, a real-life problem considers multiple issues, involves large numbers of transit services and passengers, and consequently is too complex to solve directly. Thus, heuristics, such as the hybrid artificial bee colony algorithm [24] and column-generation-based heuristic [25], have been suggested to solve such problems. Alternatively, the iterative method [19], [26] can be used to solve the LPP, TTP, and VSP by executing an iterative process. Compared with other methods, the iterative method has the obvious advantage of capturing the interaction between passengers' choices and transit planning via a simulation process [27]. Thus, we adopt an iterative process.
The adoption of an iterative process alone may not be sufficient because the TTP is a non-deterministic polynomial-time (NP)-hard problem for which an exact global optimum is hard to find [18]. Many methods have been proposed to solve this problem, including the genetic algorithm [28]-[31], Lagrangian duality theory [32], the alternating direction method of multipliers algorithm [33], and the decomposition approach (DA) [34]-[36]. The DA has been widely adopted to solve TTPs because it can handle complex problems. Generally, the DA involves decomposing the whole network into several zones and subsequently generating timetables within these zones with consideration of coordinated operations across the zones [34], or performing the decomposition at the transit service level [35], [36], wherein a timetable is created for each transit service. Our proposed DA provides a new approach to decomposition at the TU level that combines timetabling and TU scheduling. In each decomposed subproblem, a TU selects unscheduled train services to form its circulation path, and the timetables of train services are generated simultaneously such that the operating cost can be easily controlled to fulfill the operating cost constraint.
Train type, end-of-day balance, station-skipping, station-track assignment, and passenger dynamics are all important aspects to consider for efficient train planning. However, no models have been developed that consider all of these aspects; in particular, station-track assignment has rarely been used to smooth passenger interchanges. This paper thus contributes to the train planning literature by developing a schedule-based model that considers all of the above aspects and thereby improves passenger service under the constraint of an operating budget. An iterative heuristic based on a new decomposition approach was devised to solve this model. This heuristic simulates the interaction between passengers' choices and train planning, thereby ensuring that solutions are always in line with the main constraint, i.e., the operating budget constraint.

III. MODEL FORMULATION
A. Problem Description
In this study, a train service is one run of a TU from the original terminal to the destination terminal, and a line assembles the train services that pass through the same stations in the same operating time window. Train services on the same line are allowed to have different stopping patterns. A schedule plan includes (a) a line plan that records which train services should be scheduled and which stopping patterns of scheduled train services can be used, (b) a timetable that details the departure/arrival times and station-track assignments of the scheduled train services, and (c) a train circulation plan that consists of TU assignments and reallocations.
The operating budget ($\varepsilon$) is generally controlled when a schedule plan is generated in real-life situations. Operators want to provide the best possible service to passengers within budget constraints. Hence, we treat the generalized passenger cost ($C^{P}$) as the objective and include the operating cost ($C^{O}$) as a constraint (Inequality (1)):
$$C^{O} \le \varepsilon \tag{1}$$
Before the introduction of the problem details, the elements and sets used in this study are listed in Table I.

1) Supply Side:
Operators are assumed to provide the operating cost boundary, the train network information, and the general planning framework. To clarify the train network information, a small-scale example is given. The simple network has three stations. A station can contain more than one station track to accommodate several train services simultaneously. One platform serves one station track; a station track that is not equipped with a platform can only serve train services that skip the station. Additionally, a real station can serve more than one train corridor and is divided into several yards, each of which serves train services in one train corridor and is modeled as an abstracted station. Travelers can change between these abstracted stations by using walking links. Double tracks are assumed to connect stations, as shown in Fig. 1. Each track serves one direction. A depot serves one adjacent terminal. For example, a TU is dispatched from depot A to the adjacent terminal, station I in Fig. 1, and then it provides the train service from station I to station III. When the TU arrives at station III and completes the train service, it may enter depot B or stop at station III for a certain duration, which is the layover time. During the layover time, the crew can prepare for the next trip from station III to station I. A depot may not be able to serve all train types because of equipment limitations.
A depot houses the TUs that it can handle at the start of operations for initial dispatching and the same number of the same types of TUs must be housed in the depot at the end of the day to ensure the availability of adequate TU resources for the next day's operations. If this requirement cannot be met, the end-of-day balance cannot be achieved and TUs must be reallocated.
Hence, the given train network information includes (a) the tracks without/with a platform at a station, (b) the train types that can be served by a depot, (c) the walking time between two platforms serving tracks $p$ and $p'$ ($T^{Walk}_{pp'}$), (d) the safety train headway ($T^{H}$), and (e) the minimum layover time ($T^{LT}$). Moreover, we assume that: (a) the number of available TUs is sufficient, and depots have sufficient space to accommodate parking TUs during the operating period; and (b) there are sufficient staff to form a feasible crew schedule.
The considerations in designing schedule plans can be explained using the network in Fig. 1. Two lines are planned, line 1 in the direction from station I to station III and line 2 in the opposite direction. Each line can contain a maximum of two train services in a one-hour period. A schedule plan for this example is presented in Fig. 2. If a train service stops at a station, its path in Fig. 2 extends along the time axis but not along the station axis. If a train service skips a station, its path moves along both the station and time axes.
If a train service is scheduled, the TU assignment problem is solved to allocate a suitable train type to the service depending on the onboard passenger flow and the facilities available at the terminals. Three TUs of the same type are assumed to be sufficient to support this schedule plan. As shown in Fig. 2, TU 1 serves T2 and T4 marked in black, while TU 2 serves T1 marked in red and TU 3 serves T3 marked in blue. T1 and T3 cannot be served by the same TU because the time difference between the arrival of T1 and the departure of T3 is less than the minimum layover time ($T^{LT}$). In this case, the number of departures equals the number of arrivals at each terminal, so the end-of-day balance is maintained. However, if these three TUs are not of the same type or some train services are canceled because of a limited operating budget, TU reallocation is necessary to maintain the end-of-day balance. For example, if T4 is canceled, an extra dispatch from station III to station I is scheduled outside of the operation period.
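The end-of-day balance check in this example can be illustrated with a minimal Python sketch. It is not part of the proposed model: the function names, the tuple layout, and the greedy pairing of surplus and deficit terminals are assumptions made for illustration, a single TU type is assumed, and the directions of T1 and T3 (which the text does not state) are chosen arbitrarily.

```python
from collections import Counter

def terminal_imbalance(services):
    """Net TU gain per terminal (arrivals minus departures).

    `services` is a list of (origin_terminal, destination_terminal) pairs.
    Terminals that end the day balanced are omitted from the result.
    """
    net = Counter()
    for origin, destination in services:
        net[origin] -= 1
        net[destination] += 1
    return {t: d for t, d in net.items() if d != 0}

def extra_dispatches(services):
    """Pair surplus terminals with deficit terminals (greedy, illustrative).

    Returns a list of (from_terminal, to_terminal, number_of_TUs).
    """
    net = terminal_imbalance(services)
    surplus = {t: d for t, d in net.items() if d > 0}
    deficit = {t: -d for t, d in net.items() if d < 0}
    moves = []
    for src in sorted(surplus):
        for dst in sorted(deficit):
            k = min(surplus[src], deficit[dst])
            if k > 0:
                moves.append((src, dst, k))
                surplus[src] -= k
                deficit[dst] -= k
    return moves

# Assumed directions: T1 and T2 run I -> III, T3 and T4 run III -> I.
full_plan = [("I", "III"), ("I", "III"), ("III", "I"), ("III", "I")]
without_t4 = full_plan[:3]

print(extra_dispatches(full_plan))   # balanced plan: no reallocation needed
print(extra_dispatches(without_t4))  # one TU must return from III to I
```

With all four services scheduled, every terminal sees as many arrivals as departures; once T4 is removed, the sketch reports one extra dispatch from station III to station I, matching the example in the text.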
As Eq. (2) shows, the operating cost includes the total variable cost ($C^{Var}$), total fixed cost ($C^{Fixed}$), and total cost of TU reallocation ($C^{TR}$):
$$C^{O} = C^{Var} + C^{Fixed} + C^{TR} \tag{2}$$
$C^{Var}$ includes the financial expenditures for energy, attrition, and human resources [8], and is assumed to be a linear function of the total train running time (Eq. (3)). $C^{Fixed}$ is paid for investment in the planning period [8], and is assumed to be a linear function of the number of TUs used (Eq. (4)). $C^{TR}$ is incurred by the TU reallocation needed to maintain the end-of-day balance and is calculated based on Eq. (5).
where $c^{Var}$ is the average variable cost per minute; $t^{Dep}_{ns}$ is the time at which train service $n$ departs $s$; $t^{Arr}_{ns}$ is the time at which $n$ arrives at $s$; $c^{Fixed}_{f}$ is the fixed cost for a TU of type $f$; and $b^{TU}_{vnn'}$ equals 1 when $n \neq n'$ and TU $v$ serves $n$ before serving $n'$, or when $n = n'$ and $n$ is the last train service served by $v$, and 0 otherwise. $c^{TR}_{fss'}$ is the minimum variable cost between $s$ and $s'$ because the reallocated TU runs at maximum speed without stopping. $TR_{fss'}$ is the number of TUs of type $f$ that are dispatched from $s$ to $s'$ to achieve the end-of-day balance.
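The three cost components described above can be illustrated with a small Python sketch. It follows only the verbal definitions (variable cost linear in total running minutes, fixed cost linear in the number of TUs used, reallocation cost per extra dispatch); the function name, argument layout, and all numbers are hypothetical and are not taken from the case study.

```python
def operating_cost(run_minutes, tu_types, dispatches, c_var, c_fixed, c_tr):
    """Illustrative operating cost: variable + fixed + reallocation.

    run_minutes : running time (min) of each scheduled train service
    tu_types    : train type of each TU in use, e.g. ["A", "A", "B"]
    dispatches  : {(train_type, from_station, to_station): number_of_TUs}
    c_var       : average variable cost per minute
    c_fixed     : {train_type: fixed cost per TU}
    c_tr        : {(train_type, from_station, to_station): cost per dispatch}
    """
    total_var = c_var * sum(run_minutes)
    total_fixed = sum(c_fixed[f] for f in tu_types)
    total_tr = sum(c_tr[key] * count for key, count in dispatches.items())
    return total_var + total_fixed + total_tr

# Hypothetical inputs for a three-TU plan with one extra dispatch.
cost = operating_cost(
    run_minutes=[60, 60, 45],
    tu_types=["A", "A", "B"],
    dispatches={("A", "III", "I"): 1},
    c_var=10,
    c_fixed={"A": 1000, "B": 1500},
    c_tr={("A", "III", "I"): 400},
)
budget = 6000  # hypothetical operating budget
print(cost, cost <= budget)
```

Comparing the returned value with the budget mirrors the role of the operating cost constraint: a plan is acceptable only if the total stays within the limit set by the operator.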
$c^{Var}$, $c^{Fixed}_{f}$, and $c^{TR}_{fss'}$ are given by the operators in advance.

2) Demand Side:
Based on previous operations, train operators can predict the demand data and use them for planning. The given demand data for each OD pair include (a) the origin and destination zones, (b) the passenger flow, and (c) the departure time leaving the origin zone. In addition, access and egress times (i.e., the travel times between stations and their neighborhood zones) are given.
The generalized passenger cost is formulated as follows:
$$C^{P} = \sum_{r \in R} \sum_{i \in I_r} A_i \, c^{Iti}_{i} \tag{6}$$
where $A_i$ is the number of passengers using itinerary $i$, and the generalized cost of $i$ ($c^{Iti}_{i}$) is the sum of the uncongested and congested costs (i.e., $c^{Unc}_{i}$ and $c^{Con}_{i}$), as shown in Eq. (7):
$$c^{Iti}_{i} = c^{Unc}_{i} + c^{Con}_{i} \tag{7}$$
$c^{Con}_{i}$ is related to the train capacity and calculated under the UE condition (the details are given in formulas (13)-(19) in the next section). $c^{Unc}_{i}$ is the weighted sum of the in-vehicle time (IVT, $c_{i1}$), waiting time ($c_{i2}$), walking time ($c_{i3}$), number of interchanges ($c_{i4}$), access/egress times ($c_{i5}$), and fare ($c_{i6}$):
$$c^{Unc}_{i} = \sum_{j=1}^{6} w_{rj} \, c_{ij} \tag{8}$$
The weights of these costs ($w_{rj}$) can be calibrated in advance using passenger data. The last three cost components are fixed, whereas the first three vary as the timetable changes. The IVT of $n$ in $i$ from the boarding station $s$ to the alighting station $s'$ is the difference between the arrival time of $n$ at $s'$ and its departure time from $s$, i.e., $t^{Arr}_{ns'} - t^{Dep}_{ns}$ (Eq. (9)). Passengers of itinerary $i$ wait for the first train service $n$ at boarding station $s$, and the waiting time is given as follows:
$$t^{Wait}_{in} = t^{Dep}_{ns} - T^{Ori}_{i} \tag{10}$$
where $T^{Ori}_{i}$ is the arrival time of $i$ at the origin station and is calculated by summing the departure time leaving the origin zone and the access time needed for $i$. If passengers of $i$ alight from $n$ at $s$ and interchange to $n'$ at $s$, the waiting and walking times are computed by Eqs. (11) and (12), where $b^{Track}_{np}$ equals 1 if $p$ is used by $n$, and 0 otherwise. Eq. (12) shows that only when $n$ stops at $p$ and $n'$ stops at $p'$ ($b^{Track}_{np} = b^{Track}_{n'p'} = 1$) does $T^{Walk}_{pp'}$ contribute to the walking time $t^{Walk}_{in}$, which is therefore not determined in advance but is calculated during the station-track assignment. For example, consider a station with three tracks: track 1, track 2, and track 3. If $n$ and $n'$ are assigned to tracks 1 and 2, respectively, and the walking time from platform 1 to platform 2 ($T^{Walk}_{2,1}$) is 5 min, then $t^{Walk}_{in}$ is 5 min. However, if $n$ and $n'$ are assigned to tracks 1 and 3, respectively, and $T^{Walk}_{3,1}$ is 10 min, then $t^{Walk}_{in}$ is 10 min. This demonstrates how station-track assignment affects passenger walking time.
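The three-track example and the weighted-sum structure of the uncongested cost can be reproduced with a short Python sketch. The function names, the symmetric walking-time table, the zero walking time for a same-track interchange, and the weights and cost components are all assumptions made for illustration; only the 5 min and 10 min walking times come from the example above.

```python
def interchange_walking_time(track_from, track_to, walk_time):
    """Walking time implied by a station-track assignment.

    `walk_time` maps an unordered pair of tracks to minutes (assumed
    symmetric here). A same-track interchange is assumed to need no walk.
    """
    if track_from == track_to:
        return 0
    return walk_time[frozenset((track_from, track_to))]

def uncongested_cost(components, weights):
    """Weighted sum of the cost components of one itinerary.

    Components are ordered as IVT, waiting time, walking time,
    number of interchanges, access/egress time, and fare.
    """
    if len(components) != len(weights):
        raise ValueError("one weight is needed per cost component")
    return sum(w * c for w, c in zip(weights, components))

# Walking times from the three-track example in the text.
walk_time = {frozenset((1, 2)): 5, frozenset((1, 3)): 10}

walk_a = interchange_walking_time(1, 2, walk_time)  # tracks 1 and 2
walk_b = interchange_walking_time(1, 3, walk_time)  # tracks 1 and 3

# Hypothetical weights and components; only the walking time differs.
weights = [1, 2, 2, 5, 1, 1]
cost_a = uncongested_cost([30, 10, walk_a, 1, 15, 20], weights)
cost_b = uncongested_cost([30, 10, walk_b, 1, 15, 20], weights)
print(walk_a, walk_b, cost_a, cost_b)
```

Under these hypothetical weights, moving the connecting service from track 2 to track 3 raises the itinerary cost solely through the walking component, which is the effect the station-track assignment is meant to capture.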

B. Model Formulation
To build a schedule-based integrated model (SIM) for the proposed problem, the following constraints are needed: (a) Constraint (1), which ensures that the operating cost is within the budget; (b) the constraint set for passenger loading $CS^{P}$, which assigns passengers according to the train capacity limit and UE condition; (c) the constraint set for line planning $CS^{L}$, which sets rules to design the stopping pattern and determine which train services should be scheduled; (d) the constraint set for timetabling $CS^{T}$, which ensures that the train timetable is safe and meets the operating requirements (such as the running and dwell time limits); and (e) the constraint set for TU scheduling $CS^{S}$, which allocates the scheduled train services to TUs with consideration of train types and the end-of-day balance. Thus, the SIM is as follows:
$$\min C^{P}$$
s.t. Constraint (1) and constraint sets $CS^{P}$, $CS^{L}$, $CS^{T}$, and $CS^{S}$.
Details of constraint sets $CS^{P}$, $CS^{L}$, $CS^{T}$, and $CS^{S}$ are introduced in the following sections.

1) Constraint Set for Passenger Loading $CS^{P}$:
• Capacity constraints
In the proposed problem, we strictly control the passenger onboard flow and calculate the surcharge for using $n$ from $s$ to $s'$ ($c^{Sur}_{nss'}$) based on the passenger flow. The relevant constraints, Constraints (13) and (14), hold for $\forall n \in N$, $\forall s, s' \in S_n : s' = \xi^{DS}_{ns}$. In these constraints, $A^{Flow}_{nss'}$ is the number of passengers on $n$ from $s$ to $s'$ (summed over the itineraries through an indicator that equals 1 if itinerary $i$ uses $n$ from $s$ to $s'$, and 0 otherwise), and the capacity of $n$ is the capacity of a type $f$ TU. Constraint (13) ensures that the passenger onboard flow from $s$ to $s'$ cannot be larger than the train capacity. Constraint (14) shows that if the train capacity exceeds the passenger flow on $n$ from $s$ to $s'$, $c^{Sur}_{nss'}$ equals 0, whereas if the train capacity equals the passenger flow on $n$ from $s$ to $s'$, $c^{Sur}_{nss'}$ is greater than or equal to 0. Then, the congested cost of itinerary $i$ in constraint (6) can be calculated based on $c^{Sur}_{nss'}$ ($\forall i \in I_r$, $\forall r \in R$) using Eq. (15). $c^{Con}_{i}$ is incurred by passengers when they must exert a certain effort to board a train in a train system that does not allow seat reservations [37] or must buy tickets in a train system that does allow seat reservations [38]. Previous studies [37], [38] have shown that $c^{Con}_{i}$ in both types of train systems can be calculated using the same equation, i.e., Eq. (15), which states that $c^{Con}_{i}$ is the sum of the surcharges for all sections along the itinerary.
• UE condition
The proposed problem considers assigning passengers based on the UE condition ($\forall i \in I_r$, $\forall r \in R$), which is expressed in terms of $c^{Iti}_{i}$, i.e., the generalized cost of $i$:
$$c^{Iti}_{i} - \bar{c}_{r} \begin{cases} = 0, & \text{if } A_i > 0 \\ \ge 0, & \text{if } A_i = 0 \end{cases} \tag{16}$$
where $\bar{c}_{r}$ is the equilibrium cost over all of the itineraries of OD pair $r$. The UE condition states that the generalized costs of the chosen itineraries are equal to one another and not greater than those of any unchosen itinerary. Constraint (16) has been widely used to describe the equilibrium route flow pattern that satisfies the UE condition in transit networks (Szeto et (17) is set as follows ($\forall i \in I_r$, $\forall n \in N^{Iti}_{i}$, $\forall r \in R$), where $M$ is a sufficiently large number and $b^{Stop}_{sn}$ equals 1 if $n$ stops at $s$, and 0 otherwise. In addition, Constraints (18) and (19) are set to seek a workable passenger assignment ($\forall r \in R$), where $Pas_{r}$ is the number of passengers in OD pair $r$. As Constraint (18) states, if $i$ is used ($A_i > 0$), the waiting time for the associated $n$ must be non-negative to ensure that passengers do not miss $n$. Constraint (19) requires that the total passenger flows on the itineraries for OD pair $r$ equal the total number of passengers. However, if no feasible itineraries can be found for an OD pair or if some passengers between an OD pair cannot use any feasible itineraries because of inadequate train capacity, then these passengers are accommodated by a residual itinerary ($\tilde{i}$) with an unlimited capacity. The uncongested cost of $\tilde{i}$ equals the loss caused by a passenger leaving the train system and opting to take an alternative mode of transport.
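The passenger-loading conditions above can be checked on a candidate assignment with a short Python sketch. It only verifies the conditions as they are described in words (flow within capacity, a surcharge only on full sections, congested cost as the sum of section surcharges, and equal minimum cost on used itineraries); it does not solve the assignment, and the function names, tolerance, and the two-itinerary data are hypothetical.

```python
def surcharge_is_consistent(flow, capacity, surcharge):
    """Capacity and surcharge rule for one section of one train service.

    The flow may not exceed capacity, and a positive surcharge is
    allowed only when the section is full.
    """
    if flow > capacity or surcharge < 0:
        return False
    return surcharge == 0 or flow == capacity

def congested_cost(sections, surcharge):
    """Sum of the surcharges of all sections along an itinerary."""
    return sum(surcharge.get(section, 0) for section in sections)

def satisfies_ue(flows, costs, tol=1e-6):
    """UE check for one OD pair.

    Used itineraries must share the minimum cost; unused itineraries
    must not be cheaper than that cost.
    """
    used = [costs[i] for i, a in flows.items() if a > 0]
    if not used:
        return True
    c_eq = min(used)
    for i, a in flows.items():
        if a > 0 and abs(costs[i] - c_eq) > tol:
            return False
        if a == 0 and costs[i] < c_eq - tol:
            return False
    return True

# Hypothetical OD pair with two itineraries; sections are (service, s, s').
uncongested = {"i1": 50, "i2": 60}
sections = {"i1": [("T1", "I", "II")], "i2": [("T2", "I", "II")]}
surcharge = {("T1", "I", "II"): 10}  # T1 is full on this section
flows = {"i1": 80, "i2": 20}

costs = {i: uncongested[i] + congested_cost(sections[i], surcharge)
         for i in uncongested}
print(costs, satisfies_ue(flows, costs))
```

In this toy case the surcharge on the full train raises the cost of the otherwise cheaper itinerary until both used itineraries cost the same, which is the mechanism by which the capacity limit and the UE condition interact.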
2) Constraint Set for Line Planning $CS^{L}$:
In this study, we did not consider terminal changes, and thus we set Constraint (20) ($\forall n \in N$, ...), where $b^{Sch}_{n}$ equals 1 if $n$ is scheduled, and 0 otherwise. If $n$ is scheduled, it should stop at the origin and destination terminals, and the stopping pattern of the intermediate stations is considered.
Moreover, we determined the stopping pattern and scheduled train services based on onboard passenger flows. For determining the stopping pattern, we have Constraint (21) ($\forall n \in N$, ...), where $A^{BA}_{ns}$ is the number of passengers on $n$ who board or alight at $s$ (obtained through an indicator that equals 1 if passengers on $i$ board or alight from $n$ at $s$, and 0 otherwise), and $Min$ is the minimum number of passengers. Constraint (21) requires that if the number of passengers who board or alight from $n$ at $s$ is less than $Min$, this station must be skipped by $n$ to reduce the variable cost. For scheduling train services, we have Constraint (22) ($\forall n \in N$, $\forall s, s' \in S_n$: ...), which uses an indicator that equals 1 if $n$ is the first or the last train service served by its assigned TU, and 0 otherwise. Constraint (22) requires that if $n$ is the first train service or the last train service served by its assigned TU, the number of passengers on $n$ must not be fewer than $Min$. The model may continue to schedule a connecting train service, regardless of the number of passengers onboard. For example, consider a TU serving train services T1, T2, and T3 in the order T1-T2-T3, where T2 is a connecting train service. If T1 and T3 serve no fewer than $Min$ passengers, all three train services may be scheduled. This is done when canceling T2 would leave the TU unable to serve both T1 and T3: an additional TU would then be required, and the operating cost may increase. Hence, such connecting train services would be scheduled.
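The station-skipping rule can be illustrated with a few lines of Python. This is a simplified reading rather than the model's constraint: the sketch stops at every intermediate station that meets the threshold (the text only requires skipping stations below it), always keeps the two terminals, and uses hypothetical station names and passenger counts.

```python
def stopping_pattern(stations, board_alight, min_passengers):
    """Stations at which a scheduled train service stops.

    stations       : ordered list of stations passed by the service
    board_alight   : {station: passengers boarding or alighting there}
    min_passengers : threshold below which a station must be skipped
    The origin and destination terminals are always served.
    """
    last = len(stations) - 1
    stops = []
    for idx, station in enumerate(stations):
        is_terminal = idx in (0, last)
        if is_terminal or board_alight.get(station, 0) >= min_passengers:
            stops.append(station)
    return stops

# Hypothetical loading for one service on a four-station line.
pattern = stopping_pattern(
    ["I", "II", "III", "IV"],
    {"I": 100, "II": 5, "III": 40, "IV": 120},
    min_passengers=20,
)
print(pattern)
```

Here station II falls below the threshold and is skipped, while both terminals remain in the pattern regardless of their loading.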
3) Constraint Set for Timetabling $CS^{T}$:
Timetabling should consider safety and infrastructure restrictions, and thus $CS^{T}$ includes the following constraints ($\forall l, l' \in L$; $\forall n \in N_l$; $\forall n' \in N_{l'}$):
• Operating time-window constraint
The proposed problem restricts the operating time window according to Constraint (23).
• Running time constraint
The proposed problem considers the train speed limitation in the form of Constraint (24), which sets the range of running times between two stations ($\forall s, s' \in S_n : s' = \xi^{DS}_{ns}$).
• Dwell time constraint
If $n$ stops at $s$ ($\forall s \in S_n$) to allow passengers to alight and board, the proposed problem limits the range of the dwell time at $s$ according to Constraint (25).
• Constraints for headway
To ensure safety, we restrict the headway using Constraints (26)-(30) ($\forall s \in S_n \cap S_{n'}$, $\forall p \in P_{ns}$), where $Dep_{nn's}$ equals 1 if $n$ departs from $s$ before $n'$, and 0 otherwise. Constraints (26) and (27) impose the necessary headway ($T^{H}$) between the arrival and departure times of $n$ and $n'$ at $s$; Constraints (28) and (29) apply a logical set of departure and arrival orders for $n$ and $n'$ at $s$; and Constraint (30) maintains safety by ensuring that a sufficient time difference exists between $n$ and $n'$ if they are operating on the same station track.
• Constraint for station-track assignment
The proposed problem designs the station-track assignment, and thus Constraint (31) is set ($\forall s \in S_n$). Constraint (31) requires that if $n$ is not scheduled, no track is assigned to $n$, whereas if $n$ is scheduled, one track is assigned to $n$. In addition, if $n$ stops at $s$, a track equipped with a platform must be assigned to $n$.
• Overtaking constraint
Overtaking can only happen at a station that has more than one station track. Hence, Constraint (32) is set ($\forall s, s' \in S_n \cap S_{n'}$: ...). Constraint (32) ensures that $n$ and $n'$ cannot overtake each other in a section $(s, s')$. If $n$ stops at one station track, a later-arriving train service can use another station track to overtake $n$, as this is allowed by Constraints (26)-(31). In addition, $n$ cannot be overtaken by any other train service at a station that it skips, as this is disallowed by Constraints (25)-(31).
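Two of the timetabling rules above, the safety headway and the ban on overtaking within a section, can be checked on a given timetable with a minimal Python sketch. The checks are simplified readings of the verbal descriptions, not the model's constraints: the function names are hypothetical, times are minutes from an arbitrary origin, and the headway check treats a list of same-kind events (all arrivals or all departures) at one station.

```python
def headway_ok(event_times, min_headway):
    """True if consecutive events at a station are far enough apart.

    `event_times` holds the arrival (or departure) times, in minutes,
    of different train services at the same station.
    """
    ordered = sorted(event_times)
    return all(later - earlier >= min_headway
               for earlier, later in zip(ordered, ordered[1:]))

def overtakes_in_section(dep_a, arr_a, dep_b, arr_b):
    """True if two services swap order between stations s and s'.

    dep_* are departure times from s and arr_* are arrival times at s'.
    A swap means one service overtook the other on the open track.
    """
    return (dep_a < dep_b) != (arr_a < arr_b)

# Hypothetical departures from one station with a 4 min safety headway.
print(headway_ok([0, 5, 12], min_headway=4))
print(headway_ok([0, 3], min_headway=4))

# Service a leaves first but arrives last: overtaking inside the section.
print(overtakes_in_section(dep_a=0, arr_a=30, dep_b=5, arr_b=25))
```

A timetable that passes both checks keeps the order of services unchanged between adjacent stations, so any change of order has to take place at a station with more than one track.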

4) Constraint Set for TU Scheduling $CS^{S}$:
• Scheduled train service constraint
In the proposed problem, TUs are assigned to serve scheduled train services one by one. Hence, Constraints (33) and (34) are set ($\forall n \in N$). Constraint (33) ensures that if $n$ is scheduled ($b^{Sch}_{n} = 1$), one TU is assigned to it. When the right side of Constraint (34) equals 0, $v$ does not serve $n$; therefore, the left side of Constraint (34) should also equal 0. When the right side of Constraint (34) equals 1 (i.e., $v$ serves $n$), the left side equals 1 or 0. When the left side of Constraint (34) equals 0, $n$ is the first train service served by $v$. When the left side of Constraint (34) equals 1, $v$ serves one train service directly before serving $n$. When Constraint (33) applies, Constraint (34) requires a TU to serve train services one by one.
• TU connection constraint
An improper connection provided by TU $v$ for scheduled $n$ and $n'$ must be prevented, and thus Constraint (35) is set.
Constraint (36) requires that the extra dispatches ($TR_{fss'}$) should be non-negative, and Constraint (37) determines the values of the extra dispatches. Note that mathematical transformations (such as introducing a sufficiently large value of $M$) can convert Constraints (26)-(31) to linear constraints. However, not all constraints can be converted to linear ones (e.g., the UE condition). Thus, the SIM is a non-linear model.

IV. METHODOLOGY
The proposed SIM, a non-linear model with many variables and complex non-linear constraints, is difficult to solve. Thus, we decompose the SIM into sub-models. In a sub-model, some variables of the SIM are fixed as parameters and set to the values offered by other sub-models, so that the non-linear constraints can be linearized. For example, the sub-model for schedule plan generation provides timetables from which waiting times are calculated (i.e., $t^{Wait}_{in} = \bar{t}^{Wait}_{in}$, where a barred symbol denotes a parameter that equals a fixed value set for the corresponding variable), and non-linear Constraint (18) then becomes a linear constraint in the sub-model for passenger assignment. Thus, the sub-models can be handled by CPLEX, a solver commonly used for linear programming (LP) problems [26], [39]. Based on the above discussion, an iterative heuristic is suggested, as shown in Fig. 3.
First, the initial schedule plan generation (ISPG) process generates a feasible schedule plan. Next, an iterative process is executed to improve the schedule plan. Each iteration comprises four steps: (a) the user-equilibrium passenger assignment (UEPA) process, which is used to identify the feasible itineraries and compute the passenger flows according to the given schedule plan; (b) the line plan design (LPD), which determines the number of scheduled train services and designs a stopping plan based on the passenger loading outputted by the UEPA process; (c) the schedule plan adjustment (SPA) process, which uses the results of the UEPA process and the LPD to improve the schedule plan; and (d) the train service insertion (TSI) process, which inserts unscheduled train services into the schedule plan adjusted by the SPA because the adjustment may lower the operating cost and the budget may allow for additional train services to improve the passenger service.
The ISPG process and the LPD determine $N^{Sch}$, $N^{Uns}$, and $S^{Stop}_{n}$. The constraint set $CS^{LH}$ specifies that if train service $n$ belongs to the set of scheduled train services (i.e., $n \in N^{Sch}$), $b^{Sch}_{n}$ should be 1, and otherwise 0; and if station $s$ belongs to the set of stations at which $n$ stops (i.e., $s \in S^{Stop}_{n}$), the value of $b^{Stop}_{sn}$ is determined by $b^{Sch}_{n}$, and otherwise it is 0. The ISPG, UEPA, and SPA processes can be considered as sub-models that are developed from the SIM by fixing some variables and linearizing some constraints. The specific modifications are detailed in the following sections.
The iterative process ends when the relative change in C P is smaller than the pre-set gap allowance, and the schedule plan with the minimum C P is selected as the outputted schedule plan. Otherwise, the next iteration begins. It is possible that a transit paradox [40] may occur. In a transit paradox, the operators implement an adjustment for service improvement that has the unintended effect of worsening the system performance in terms of C P . When a transit paradox occurs, the proposed method opts for the schedule plan with the best performance rather than the schedule plan in the final iteration.
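The control flow of the iterative process, including the stopping rule and the safeguard against a transit paradox, can be sketched in Python as follows. The sketch is only a skeleton under stated assumptions: the five processes are passed in as caller-supplied functions with hypothetical signatures, the iteration cap `max_iter` is an added safeguard not mentioned in the text, and the toy data at the end exist only to exercise the loop.

```python
def iterate_schedule(initial_plan, assign, design_lines, adjust, insert,
                     gap=0.01, max_iter=50):
    """Skeleton of the iterative heuristic (hypothetical signatures).

    assign(plan)                    -> (flows, passenger_cost)   # UEPA
    design_lines(plan, flows)       -> line_plan                 # LPD
    adjust(plan, line_plan, flows)  -> plan                      # SPA
    insert(plan)                    -> plan                      # TSI
    Returns the plan with the lowest passenger cost seen so far, which
    need not be the plan of the final iteration.
    """
    plan = initial_plan  # output of the ISPG process
    best_plan, best_cost = None, float("inf")
    prev_cost = None
    for _ in range(max_iter):
        flows, cost = assign(plan)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
        # Stop when the relative change in passenger cost is within the gap.
        if prev_cost is not None and abs(cost - prev_cost) <= gap * abs(prev_cost):
            break
        prev_cost = cost
        line_plan = design_lines(plan, flows)
        plan = adjust(plan, line_plan, flows)
        plan = insert(plan)
    return best_plan, best_cost

# Toy run: plans are integers and costs follow a fixed, made-up sequence
# in which the third plan is worse than the second (a transit paradox).
toy_costs = {0: 100, 1: 90, 2: 95, 3: 95}
result = iterate_schedule(
    initial_plan=0,
    assign=lambda plan: (None, toy_costs[plan]),
    design_lines=lambda plan, flows: None,
    adjust=lambda plan, line_plan, flows: plan + 1,
    insert=lambda plan: plan,
)
print(result)
```

In the toy run the loop stops once two successive costs coincide, yet it returns the second plan rather than the last one, which reflects the choice of the best-performing schedule plan when a later adjustment worsens the passenger cost.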
Moreover, the line plan for train services is designed based on passenger-flow information. However, this information may be missing in some cases; for example, the passenger flow on a newly added train service is unknown. In such a case, the passenger flow is assumed to be zero, and the newly added train service is assumed to stop at all passing stations. This line plan is likely to provide more feasible itineraries than one in which newly added train services skip some stations. Furthermore, these newly added train services use train type f * , which has the highest fixed cost and the highest capacity. Thus, we require that these train services do not use other train types (∀v ∈ V f ; ∀n ∈ N New ; ∀n ∈ N): This assumption ensures that the operating cost remains within ε. When the schedule plan is input into the UEPA process, the passenger flows can be determined, and TUs can be reassigned according to the onboard passenger flow to minimize the operating cost. Some TUs may be replaced with TUs of a different type whose fixed cost is lower than that of f * . Therefore, the operating cost will not exceed ε, and the modified solution given by the SPA will remain feasible.
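The argument that the budget still holds after TUs are reassigned can be illustrated with a small Python sketch. It is a simplification under stated assumptions: only the fixed-cost component of the operating cost is considered, the train types are the three assumed in the case study later in the paper, the per-TU loads are invented, and `cheapest_type_for_tu` is a hypothetical helper rather than the TCPG model used in the paper.

```python
from typing import Dict, List, Tuple

# (capacity in passengers, fixed cost in O-units), as assumed in the case study.
TRAIN_TYPES: Dict[str, Tuple[int, int]] = {
    "Type 1": (1000, 600),
    "Type 2": (1300, 800),
    "Type 3": (1600, 1000),
}
# f*: the type with the highest fixed cost (and here also the highest capacity).
F_STAR = max(TRAIN_TYPES, key=lambda name: TRAIN_TYPES[name][1])


def cheapest_type_for_tu(loads: List[int]) -> str:
    """Return the cheapest type whose capacity covers the peak onboard load
    over all train services served by one TU (hypothetical helper)."""
    peak = max(loads)
    fitting = [name for name, (capacity, _) in TRAIN_TYPES.items() if capacity >= peak]
    if not fitting:
        raise ValueError("peak load exceeds the capacity of every train type")
    return min(fitting, key=lambda name: TRAIN_TYPES[name][1])


# Invented peak loads per train service for three TUs.
tu_loads = {"TU 1": [400, 650], "TU 2": [900, 1250], "TU 3": [1500]}

# Initial plan: every TU uses f*, which gives an upper bound on the fixed cost.
initial_cost = len(tu_loads) * TRAIN_TYPES[F_STAR][1]

# After passenger loading is known, each TU is swapped to the cheapest fitting type.
reassigned = {tu: cheapest_type_for_tu(loads) for tu, loads in tu_loads.items()}
reassigned_cost = sum(TRAIN_TYPES[name][1] for name in reassigned.values())
```

Because every replacement type costs no more than f * , the reassigned fixed cost can only stay the same or fall, so a plan that met the budget with f * still meets it.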

A. ISPG
The ISPG process can be considered a modified version of the SIM that excludes C S P and C S L but adds further constraints, including C S LH and Constraints (41) and (42).
As indicated by Constraint (42), the passenger flow on itinerary i , A i , is set to a fixed value ( Ā i ). Because all train services used in the ISPG process are newly added, Ā i is zero according to the aforementioned assumption. The decision variables are Dep nn s , t Arr ns , t Dep ns , b Track np , b TU vnn , and TR f ss . Because the passenger flows are unknown and the ISPG process only needs to provide a feasible starting point that the later processes will improve, this study does not set a specific objective for the ISPG process; the process can stop as soon as it finds the first feasible solution. Thus, the ISPG process can be formulated as follows: min 0, s.t. Constraint III-A.1, (41), and (42), and constraint sets C S LH , C S T , and C S S .
Although the ISPG process is simpler than the SIM, it remains difficult to solve directly. Hence, a decomposition approach, DA-ISPG, is proposed. The DA-ISPG approach divides the original problem into several subproblems, each of which involves a single TU and simultaneously determines the train circulation plan of that TU and the timetables of the train services it serves.
1) DA-ISPG: The DA-ISPG algorithm is shown in Table II. If N Uns is empty or the insertion of a new TU violates the operating cost constraint, the algorithm stops; otherwise, a new TU is added, and train services are assigned to it. If v serves more than one train service, these train services can be considered as predecessors (n Pre ) or posteriors (n Post ). For example, TU 1 in Fig. 2 serves T2 before serving T4, and we designate T2 as n Pre and T4 as n Post .
The train-service assignment in Table II can follow either an ascending or a descending order. In an ascending order, a train service is first selected as n Pre , and the posterior services are then selected one by one. In a descending order, a train service is first selected as n Post , and the predecessors are then selected one by one. In this study, the train-service assignment follows a descending order, and a train service can be selected for the assignment if (a) it is unscheduled; (b) when a predecessor of n Post is sought, its destination terminal is the origin terminal of n Post ; and (c) its ideal arrival time at the destination is the latest among the train services that satisfy the two preceding rules. The ideal departure and arrival times can be given by the operators based on their experience or calculated by a timetabling model (whose solution is denoted the M-MinPC solution) that minimizes C P without considering TU scheduling [27]. The DA-ISPG algorithm uses information from the M-MinPC solution when selecting the train services for N Uns because the M-MinPC solution has already minimized the total generalized passenger cost. Alternatively, the train-service assignment and the selection of train services for N Uns can follow a random order or other specific rules; this can be easily altered in the algorithm. The train-service assignment for a TU is summarized in Table III. The timetable generation process for train service n used in the train-service assignment (TTG-TSA) process can be defined as follows:
2) A Simple Example for the DA-ISPG: To demonstrate the DA-ISPG algorithm, we used the simple example given in Fig. 1 and assumed the schedule plan in Fig. 2 to be the M-MinPC solution. First, TU 1 was selected and the train-service assignment was run to determine which train services were served by TU 1.
Four train services were in the set N Uns , and T4 had the latest arrival time in the M-MinPC solution. Thus, T4 was selected as n Post , and its timetable was generated (as shown in the upper left corner of Fig. 4).
Second, the operating cost of the resulting schedule plan was less than the budget, so we added a predecessor of T4. The destinations of T1 and T2 were the same as the origin station of T4. In the M-MinPC solution, the arrival time of T2 was later than that of T1, and T2 was thus selected. T LT was assumed to be 20 min, and the timetable of T2 was then generated (as shown in the bottom left corner of Fig. 4). Because of the limitations of T LT and the operation window, no predecessor of T2 could be found, and we thus turned to TU 2 to generate its circulation plan. If adding another TU results in an operating cost that exceeds the budget, the DA-ISPG stops. Otherwise, train services are selected for TU 2. Accordingly, T3 was selected first, and T1 was selected next. As this example shows, the operating cost is checked whenever a new TU or train service is added, so the operating cost constraint can easily be enforced on the output schedule plan.
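The descending-order assignment in this example can be sketched in Python as follows. The sketch makes several assumptions that are not in the paper: the terminals and the ideal times of T1 to T4 are invented so that the selection order matches the example, the ideal times are used directly instead of being regenerated by the TTG-TSA process, the operating cost is reduced to a fixed cost per TU that is checked only when a TU is added, and the names `Service`, `pick_latest`, and `da_ispg` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Service:
    name: str
    origin: str
    destination: str
    ideal_dep: int  # minutes after midnight (invented values)
    ideal_arr: int


def pick_latest(candidates: List[Service]) -> Optional[Service]:
    """Rule (c): choose the candidate with the latest ideal arrival time."""
    return max(candidates, key=lambda s: s.ideal_arr, default=None)


def da_ispg(
    services: List[Service], tu_cost: int, budget: int, turnaround: int
) -> Tuple[List[List[str]], int]:
    unscheduled: Dict[str, Service] = {s.name: s for s in services}
    plan: List[List[str]] = []
    cost = 0
    # Stop when N_Uns is empty or another TU would exceed the budget.
    while unscheduled and cost + tu_cost <= budget:
        cost += tu_cost
        current = pick_latest(list(unscheduled.values()))  # n_Post
        del unscheduled[current.name]
        chain = [current]
        while True:
            # Rules (a) and (b), plus a turnaround check standing in for T_LT.
            predecessors = [
                s for s in unscheduled.values()
                if s.destination == current.origin
                and s.ideal_arr + turnaround <= current.ideal_dep
            ]
            pred = pick_latest(predecessors)
            if pred is None:
                break
            del unscheduled[pred.name]
            chain.insert(0, pred)
            current = pred
        plan.append([s.name for s in chain])
    return plan, cost


services = [
    Service("T1", "A", "B", 360, 480),
    Service("T2", "A", "B", 420, 540),
    Service("T3", "B", "A", 540, 660),
    Service("T4", "B", "A", 600, 720),
]
plan, cost = da_ispg(services, tu_cost=1000, budget=2000, turnaround=20)
```

With these invented inputs, the first TU serves T2 and then T4, and the second TU serves T1 and then T3, which is the same selection order as in the example above.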

B. UEPA
The UEPA process focuses on the passenger assignment, so the timetable is fixed based on the output of the ISPG or SPA process and used to calculate the fixed values of the waiting time and the uncongested cost. Furthermore, the respective constraints in the SIM can be deleted to form the model used in the UEPA process. That is, the UEPA process can be expressed as a problem that identifies A i , c r , and c Sur nss subject to C S P so as to minimize C P . This problem can be formulated as an LP model after mathematical transformations and solved by the column-generation method, as described in previous studies [27], [37], [38]. The column-generation method starts by solving the LP model with only the set of optimal itineraries. Then, itineraries that have the potential to reduce C P are added iteratively. When adding new itineraries would no longer reduce C P , the procedure terminates.

C. LPD
The LPD determines the scheduled train services and their stopping plans based on the passenger assignment output by the UEPA process, constraint set C S LH , and Constraints (21) and (22). The LPD may cause some itineraries to become infeasible because the train services they use are no longer scheduled or the stations they use are skipped; passengers who chose these itineraries in the current UEPA process will be assigned to other feasible itineraries in the next UEPA process.

D. SPA
Based on the SIM, the SPA process fixes A i (∀r ∈ R, ∀i ∈ I r ), b Sch n , and b Stop sn (∀s ∈ S n ; ∀n ∈ N) according to the output of the UEPA process and the design of the LPD, respectively. Furthermore, the SPA process excludes C S P except for Constraint (18), which ensures that the itineraries used in the UEPA process are feasible in the SPA solution. Thus, the SPA process can be formulated as follows. The SPA adjusts the timetable and the train circulation plan simultaneously, which is an NP-hard problem. To increase the computation speed, the SPA process first runs the timetable adjustment (TTA) model, which adjusts the timetable while keeping the train circulation plan fixed, and then uses the TCPG model to check whether the train circulation plan can be further improved to reduce C O after the adjustment. The TTA model is formulated as follows. The DA-TTA approach decomposes the original problem into several subproblems, one per line, and a timetable is generated for each line. To avoid spatiotemporal conflicts between line l and the other lines, the subproblem for l fixes the timetables of the other lines and generates a timetable for l only. That is, the subproblem for l can be formulated as a new model (M-DA-TTA) that modifies the TTA model by adding Constraint (45), in which N Fix = N − N l and the fixed values of t Arr n s and t Dep n s are obtained from the previous calculation. After each subproblem, the TCPG model is used to check whether the train circulation plan could be further improved to reduce C O . If so, the updated train circulation plan is used in the subsequent adjustment. In contrast, if the DA-TTA decomposed the problem by TU, the train circulation plan would have to be fixed so that TUs could be selected one by one; that is, the train circulation plan could not be replaced with a better plan during the adjustment. Thus, the DA-TTA was set to decompose the problem by line. The DA-TTA algorithm is given in Table IV.
Although the DA-TTA approach cannot always arrive at the global optimum, it reduces computation time because the M-DA-TTA model has a smaller solution set and can thus be solved in a considerably shorter time. Moreover, the DA-TTA approach improves the solution iteratively until a local optimum is found.
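The line-by-line decomposition behaves like a coordinate-wise local search, which the following Python sketch illustrates on a toy problem. Everything here is an assumption made for illustration: each line is reduced to a single departure offset, the cost function (deviation from an ideal offset plus a penalty for headway conflicts) is invented, the TCPG check after each subproblem is omitted, and `da_tta` and `toy_cost` are hypothetical names rather than the algorithm in Table IV.

```python
from typing import Callable, Dict, Iterable, Tuple

Timetable = Dict[str, int]  # line name -> departure offset in minutes (toy stand-in)

IDEAL: Timetable = {"L1": 10, "L2": 10, "L3": 25}  # invented ideal offsets
HEADWAY = 5  # invented minimum separation between any two lines


def toy_cost(timetable: Timetable) -> int:
    """Deviation from the ideal offsets plus a large penalty per conflict."""
    deviation = sum(abs(timetable[line] - IDEAL[line]) for line in timetable)
    lines = sorted(timetable)
    conflicts = sum(
        1
        for i, a in enumerate(lines)
        for b in lines[i + 1:]
        if abs(timetable[a] - timetable[b]) < HEADWAY
    )
    return deviation + 100 * conflicts


def da_tta(
    timetable: Timetable,
    candidates: Iterable[int],
    cost: Callable[[Timetable], int],
) -> Tuple[Timetable, int]:
    """Re-optimize one line at a time with all other lines fixed, and repeat
    until a full pass over the lines yields no improvement (local optimum)."""
    current = dict(timetable)
    best = cost(current)
    lines = list(current)
    options = list(candidates)
    improved = True
    while improved:
        improved = False
        for line in lines:            # subproblem for line l
            for value in options:     # other lines keep their fixed values
                trial = dict(current)
                trial[line] = value
                trial_cost = cost(trial)
                if trial_cost < best:
                    current, best = trial, trial_cost
                    improved = True
    return current, best


start = {"L1": 0, "L2": 15, "L3": 30}
adjusted, adjusted_cost = da_tta(start, range(0, 31, 5), toy_cost)
```

Each subproblem searches a much smaller solution set than the joint problem, which is the source of the time saving described above, at the price of stopping at a local optimum.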

E. TSI
The TSI process aims to add some new train services into the output of the SPA process under the operating cost

Thirty-four main stations were selected from the HSR network in South China (Fig. 5). Nanningdong and Zhaoqingdong stations each have two yards, one for each of the two HSR corridors they serve; accordingly, one yard at each of these stations was modeled as a substation, as stated in Section III. In this manner, a 36-station abstract network comprising four main HSR corridors was built, in which all depots could serve all train types. The network was used to test the heuristic in three respects (efficiency, optimality, and applicability to a large problem) in the following sections.

A. Optimality and Efficiency Test
1) Problem Setting of the Optimality and Efficiency Test:
The model with the UE condition is not a convex problem, and the TTP alone is already an NP-hard problem whose globally optimal solution is difficult to find. Thus, we transformed this problem into a simplified problem that relaxes the train capacity constraint (i.e., the train capacity is assumed to be sufficiently large that passengers always select the itinerary with the lowest uncongested cost), as expressed by Constraint (47). Constraint (47) allows the non-linear constraints to be linearized even if the objective is quadratic. The simplified problem was thus solved using the branch-and-bound algorithm (BBA) via CPLEX, and C P in its optimal solution is a lower bound for the studied problem. Thus, the BBA was selected as the benchmark for the heuristic. Different cases were set to test the heuristic, as shown in Table V.
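The effect of relaxing the capacity constraint can be illustrated with a short Python sketch. It evaluates the passenger cost for one fixed timetable only, whereas the simplified problem in the paper also optimizes the timetable through the BBA. The OD pairs, demands, itinerary costs, and the heuristic value used for the gap are all invented, and `uncapacitated_cost` and `relative_gap` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

OD = Tuple[str, str]


def uncapacitated_cost(demand: Dict[OD, int], itinerary_costs: Dict[OD, List[float]]) -> float:
    """With unlimited seats, every passenger of an OD pair takes the itinerary
    with the lowest uncongested cost (all-or-nothing loading)."""
    return sum(volume * min(itinerary_costs[od]) for od, volume in demand.items())


def relative_gap(heuristic_value: float, lower_bound: float) -> float:
    """Gap of a heuristic objective value against a lower bound."""
    return (heuristic_value - lower_bound) / lower_bound


# Invented data: passengers per OD pair and uncongested itinerary costs in minutes.
demand = {("A", "B"): 100, ("A", "C"): 50}
itinerary_costs = {
    ("A", "B"): [60.0, 75.0],
    ("A", "C"): [120.0, 110.0, 3000.0],
}

lower_bound = uncapacitated_cost(demand, itinerary_costs)
gap = relative_gap(11960.0, lower_bound)  # 11960.0 is an invented heuristic C_P
```

When capacity binds, some passengers are pushed to costlier itineraries, so the capacitated cost can only be equal to or higher than this uncapacitated value, which is why the relaxed optimum serves as a lower bound.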
2) Optimality Test: Table V shows the results. We explored the factors that influenced the gaps in C P as follows.
First, the effect of problem size was tested. To increase the problem size, the number of train services running on a line was increased in cases 1-5, and the number of OD pairs (Π OD ) was increased in cases 6-10. The gaps in C P and the total uncongested cost (C Unc ) in these cases showed that the heuristic maintained good solution quality as the problem size increased. Second, the influence of train capacity was analyzed in cases 10-12, in which the number of passengers in an OD pair (Π Pas ) was increased. As indicated by the results, the gaps in C P and C Unc were affected by the increase in Π Pas because a higher value of Π Pas implies a higher probability that more train services do not have sufficient seats. When some passengers could not use the itinerary with the lowest uncongested cost because of insufficient train capacity, the studied problem assigned these passengers to an alternative or artificial itinerary, whereas the simplified problem continued to assign all passengers to the itinerary with the lowest uncongested cost. Both the alternative and artificial itineraries usually have higher uncongested costs, and therefore, the gap in C Unc is not zero. In cases 11 and 12, the congested cost was also not zero, and the gaps in C P were thus larger than the gaps in C Unc . Overall, the gaps in C Unc and C P in Table V are below 2% and 6%, respectively. Because C P of the simplified problem is a lower bound for the studied problem, the gap between the heuristic solution and the global optimum of the studied problem cannot exceed the gap reported against the simplified problem. Hence, the test results indicate that the heuristic generates solutions close to the global optima of the studied problems.

3) Computation Time Test:
The computation time is generally affected by the number of train services to be scheduled on the network and the number of OD pairs. Therefore, when comparing the computation times of the BBA and the heuristic, we selected cases 1-5 and cases 6-10 to test the influences of these two factors. In these cases, the BBA found the global optimum, and the heuristic obtained solutions with the same objective values. For clarity, Fig. 6 shows the log10 values of the computation times. The computation times of the BBA were shorter than those of the heuristic when the problem size was small. The BBA tackled only one ILP problem, whereas the heuristic tackled several ILP problems, one of which was for a possible train service assignment to a TU. However, Fig. 6 shows that the computation time of the BBA increased more quickly than that of the heuristic as the problem size increased, indicating that the heuristic scales better to large problems than the BBA.

network that was collected from the official website of China Railway (http://www.12306.cn/mormhweb/). Because of differences in zone attractiveness and trip time restrictions, 9076 OD pairs were generated. As passenger volume data were unavailable, passenger volumes were set based on daily observation. The uncongested cost of a residual itinerary was 3000 min. Additionally, because real TU information was unavailable, we assumed that there were three train types (Type 1, Type 2, and Type 3) with different capacities (1000, 1300, and 1600 passengers, respectively) and fixed costs (600, 800, and 1000 O-units, respectively). The operating budget can be set based on previous operational experience and the allowable resources. Alternatively, a range of budget levels can be set and analyzed to find a proper solution that is in keeping with the allowable resources. We demonstrated this method with the following example. First, we ran the timetabling model suggested by Xie et al. [27] to obtain a timetable minimizing the passenger cost, which we denoted the M-MinPC timetable. Next, we ran the TCPG model to compute the minimum operating cost.

As the operating budget decreased, the generalized passenger cost initially increased slowly and then rapidly. This trend is attributable to the fact that the decrease in the operating budget led to a decrease in the numbers of scheduled train services and scheduled TUs, which resulted in fewer passengers being served. A higher operating boundary generally yielded a solution with a lower generalized passenger cost than a lower operating boundary did. However, operators should not necessarily adopt a higher operating boundary, as extra investment may fail to generate a proportionally higher return. For example, the solution for M-OC(1) required a 33.33% increase in the operating cost relative to the solution for M-OC(2), but the corresponding reduction in the generalized passenger cost (for M-OC(1) relative to M-OC(2)) was only 6.08%. The latter result accounts for the nearly horizontal line linking these two solutions in Fig. 7. Thus, the solution for M-OC(2) appeared to be superior to that for M-OC(1). Therefore, we suggest that if a practical operation does not have a hard budget, a range of budget levels should be set and their respective solutions determined and compared, as this will enable a proper solution to be found that results in a relatively low passenger cost and operating cost.
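The 33.33% figure can be reproduced from the budget levels stated later in this section (80% of C O−MinPC for M-OC(1) and 60% for M-OC(2)), under the assumption that both solutions use their full budgets. The short Python calculation below also relates the cost increase to the reported 6.08% reduction in passenger cost; the variable names are illustrative only.

```python
# Budget levels as shares of C_O-MinPC, taken from the comparison section.
budget_share_oc1 = 0.80  # M-OC(1)
budget_share_oc2 = 0.60  # M-OC(2)

# Relative increase in operating cost when moving from M-OC(2) to M-OC(1),
# assuming each solution spends its whole budget.
cost_increase = (budget_share_oc1 - budget_share_oc2) / budget_share_oc2

# Reported reduction in generalized passenger cost for the same move.
passenger_cost_reduction = 0.0608

# Passenger-cost reduction obtained per unit of relative cost increase.
return_per_unit_cost = passenger_cost_reduction / cost_increase
```

The ratio is roughly 0.18, that is, each 1% of extra operating cost buys less than 0.2% of passenger-cost reduction between these two budget levels, which is consistent with the nearly horizontal segment described above.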

3) Analysis of M-OC(2):
The proposed heuristic is an iterative method. Feedback from the passenger assignment is generated in each outer iteration and used in the SPA, which has an inner iteration to improve the schedule plan to reduce the passenger costs. For example, in M-OC(2), there was a decreasing trend in both the total generalized passenger cost and the average non-IVT time (i.e., the sum of the waiting and walking time). The average non-IVT time in the first 20 iterations of the M-OC(2) without a pre-set criterion is shown in Fig. 8. The average non-IVT time decreased by 30.43%. After eight outer iterations, the gap of the total generalized passenger cost decreased to 0.1% (i.e., the pre-set criterion for stopping was met), and the heuristic stopped. However, the average non-IVT time only converged at the 10th iteration; to reach this convergence point, we could use a lower pre-set criterion or allow more outer iterations to run. On average, one outer iteration took 1.52 h, and 29.23% of the computation time was used to optimize the schedule plan. Generally, the ISPG and SPA processes were completed within 3.42 min and 26.66 min, respectively. These long computation times resulted from the complexity of the passenger assignment. In a future study, we will consider how to improve the efficiency of passenger assignment.
To demonstrate that our model and method can manage different types of trains, we changed some settings. We added Type 4 trains, which had the same capacity and fixed cost as Type 1 trains, but we assumed that Type 1 and Type 4 trains needed different equipment in depots. All depots were assumed to serve all train types, except for the depot serving Shenzhenbei Station, which was assumed to serve only Type 2, Type 3, and Type 4 trains. We reran M-OC(2) to identify a new solution, which we denoted Solution 2; thus, we denoted the original solution of M-OC(2) as Solution 1. Both solutions had a similar number of used TUs (gap ≈ 2.08%), average running times per TU (gap ≈ 2.20%), and average non-IVT time (gap ≈ 1.10%). Solution 1 used 39 Type 1 TUs and Solution 2 used 10 Type 1 TUs and 27 Type 4 TUs. The depot serving Shenzhenbei Station had a unique structure: Type 4 vehicles were used in Solution 2 to take over the work of the Type 1 vehicles that were related to this depot and used in Solution 1.

4) Comparison of M-OC(1), M-OC(2), and M-MinPC:
The M-MinPC solution is obtained via the traditional hierarchical approach, which first designs a timetable and then finds a train circulation plan to support the timetable. In this approach, the timetable can fully use the network capacity to provide good service to passengers, but the corresponding operating cost may be too high. For example, if the operators want to keep the operating cost within 80% of C O−MinPC , the operating cost of the M-MinPC solution exceeds the limit by 25%. Thus, the traditional hierarchical approach usually includes a further step that deletes train services from the M-MinPC solution to meet the operating cost requirement.
Our approach integrates the consideration of operating cost into planning, so that an output solution meets the operating cost requirement while attempting to maintain service quality. For example, when the operating budget was 80% of C O−MinPC , our approach generated a solution that met the requirement (i.e., the solution for M-OC(1)). Compared with the solution for M-MinPC, in the solution for M-OC(1), the number of served passengers decreased by 6.5%, and the average waiting time increased by 8.9 min. Similarly, when the operating budget was 60% of C O−MinPC , our approach generated the solution for M-OC(2). Compared with the solution for M-MinPC, in the solution for M-OC(2), the number of served passengers decreased by 8.1%, and the average waiting time increased by 19.9 min. This decrease in service quality was due to the reduction in the operating cost. However, given that the operating cost was reduced by 40% (1 − 60%), this decrease in service quality may be acceptable in real-life operations.
This comparison indicates the advantage of our approach in generating a schedule plan that balances the needs of operators and passengers. In summary, our large-scale example demonstrates that our approach can be applied to a practical-sized train planning problem.

VI. CONCLUSION
A schedule-based model is proposed in this study to solve a new integrated problem that comprises the LPP, TTP, VSP, and the UE passenger assignment problem. To fulfill the requirements of operators and passengers, the model minimizes the generalized passenger costs under the operating cost constraint. To obtain high-quality solutions within a reasonable time, a new heuristic was developed. The model and algorithm were applied to numerical examples. The results show that the heuristic can obtain results similar to those of the BBA, with computation times that grow more slowly as the problem size increases. Furthermore, the algorithm was able to handle an integrated problem that involved 9076 OD pairs, 240 high-speed train services, and 36 stations forming a four-corridor HSR network. Therefore, the model and the algorithm can be used for a practical-sized train planning problem.
However, in this study, we made some assumptions that may not hold in real applications. For example, we assumed that double tracks were used between stations, that TUs and crews were sufficient to support schedule plans, and that passengers could always find a feasible access/egress mode in time. In future studies, we will relax these assumptions to construct a planning tool that can be applied in various practical scenarios.