Generating Travel Plan Sets in a High-Speed Railway Network With Complex Timetables and Transfers

Developments in high-speed railways have seen them become the mainstay of passenger transport systems in China, satisfying the increasing demand for rapid access to numerous locations. In practice, the operation of the network is a vital feature of the Chinese High-speed Railway (CHR), enabling the provision of through services while allowing indirect services via major hubs. The generation of travel plan sets in a large-scale, complex high-speed railway network is an important issue, as it is the foundation of travel service systems in which people can inquire about their travel plans. This paper describes a method of generating all alternative travel plans that are potentially attractive to passengers. The inherent problem is first modeled for any journey between a given origin and destination pair, and then a spatiotemporal network is constructed. For each journey, a direct set of travel plans without transfers is extracted from the train timetable data. The construction of indirect travel plans is more complicated, and a two-stage generation method is proposed: (i) compute the $K$-shortest paths using the improved Yen* algorithm to identify transfer node(s), and (ii) connect train services according to the train timetables. Furthermore, the reliability of a specific travel plan with transfers is determined by mapping the buffer time to a reliability measure. Finally, the proposed method is tested on the CHR network using the 2019 train timetable.


I. INTRODUCTION
Railways are large-capacity transport systems that play an important role in integrated transport around the world. In recent years, to satisfy traveler demands, the Chinese High-speed Railway (CHR) has seen remarkable developments. By the end of 2019, China had over 35,000 km of dedicated high-speed railway (HSR) lines, which had contributed to carrying over 7 billion passengers in the past 10 years. Increased access to previously train-free areas and a rise in the number of available train types have produced an explosion in the choice of origin-destination (O-D) travel plans for passengers. At present, network operation is one of the important features of CHR, ensuring the provision of direct and indirect services via major hubs. However, with the complex topological structure of this network, diversified train schedules, and the availability of transfer modes different from urban rail transit (URT) and the roadway network, generating feasible HSR travel plans is a complex task. For instance, a case study of a specific O-D pair (from Chengdu City to Changsha City) is illustrated in Table 1. The roadway network has three alternative routes, whereas the HSR network offers more than 150 plans, roughly 50 times as many.
The associate editor coordinating the review of this manuscript and approving it for publication was Jesus Felez.
A travel plan can be defined as a sequence of trains that allows a passenger to travel between the origin and the destination. The travel plan set (TPS) for a given O-D pair consists of all the travel plans that could potentially be chosen by passengers. Generating the TPS is the foundation of travel service systems in which people can inquire about their travel plans, such as CTRIP in China and DB in Germany. A number of studies address choice modeling [1]-[3]. However, few consider separate methods of generating a TPS for railway networks, including HSR networks. A separate TPS generation step prior to the choice model has the advantage of specifying the set of available alternatives and reducing the computation time, as it is only necessary to generate the TPS once per network configuration.
The objective of this paper is to develop a method for efficiently generating passenger TPSs in HSR networks while considering the complex timetables and transfers. The main contributions of this study are as follows:
1) A spatiotemporal network considering the network topology, train timetables, and transfers is modeled to describe an HSR network. Note that the network with transfers is a large-scale, complex system with various types of trains.
2) The spatiotemporal network is so large and complex that the number of transfer points, and consequently of travel plans, grows drastically. A two-stage method is proposed for generating travel plans with transfers. An improved Yen* algorithm, incorporating the A* algorithm, is developed to calculate the $K$-shortest paths.
3) The reliability of transfers between trains is defined and incorporated into the travel plan, so that the generated TPS allows passengers to evaluate the risk of missing a certain connection.
The remainder of this paper is structured as follows. Section II gives a brief review of the relevant literature. Section III analyzes three factors related to the problem of travel plan generation. In Section IV, a spatiotemporal network is modeled and the relevant notation is explained. A two-stage generation method based on the improved Yen* algorithm is then described, and the transfer reliability is defined for TPSs including transfers. Section V presents a case study used to test the proposed method based on the HSR network of China in 2019. Finally, the conclusions of this study are summarized and some ideas for future research are given in Section VI.

II. LITERATURE REVIEW
Research on the problem of feasible TPS generation initially considered road networks, focusing on the "shortest path" model. Work on shortest-path algorithms began in the late 1950s [4], and since then an enormous number of extended and improved methods [5]-[9] based on Dijkstra's algorithm have been proposed. Zhan and Noon [10] applied and tested their algorithm on real road networks, while Namkoong et al. [11] used a tree-based link labeling algorithm in developing route guidance systems and traffic control systems. Klunder and Post [12] proposed a new label correcting algorithm based on the use of buckets of queues. Sanders and Schultes [13] exploited the hierarchy inherent in real-world road networks.
Note that TPS generation in railway networks is unlike that in road networks, as trains run on fixed routes and according to timetables. As time coordination and fixed routes are taken into account in railway networks, two main approaches have been proposed for modeling timetable information, namely the time-expanded approach [14]-[16] and the time-dependent approach [17]-[20]. The essential difference between these two approaches is what each node and edge represents in the constructed network graph. In a time-expanded graph, every node denotes a departure or arrival event at a station and every edge between nodes represents an elementary connection between the two events, which results in memory problems for large networks. In a time-dependent graph, each node represents a static station, and edges are only used if the corresponding stations are connected by an elementary connection. The two most frequently encountered timetable issues are optimal path searching and query speed-up. Tong and Richardson [21] proposed an optimal path searching algorithm by combining Dijkstra's algorithm with the branch-and-bound method. Florian [22] applied a label setting algorithm to find optimal paths for more than one destination. Other studies have attempted to limit the path search by considering other variables [23]-[25]. Jin et al. [26], [27] proposed an approach for optimizing localized integration between public bus services and metro systems that takes commuter travel demand into consideration. Zhu and Xu [28] introduced a route filtering method to generate a route choice set for a URT network, which shares common characteristics with our subject. However, the differences between them are significant. For example, URT networks typically have high train frequencies, so the passenger can simply take the first train to his/her destination. In this case, plans for passengers consider all alternative paths from an origin to a destination.
Schedule coordination is taken into account to remove unreasonable routes that passengers are unable to complete given the constraints of their entry and exit times. However, in an HSR network, the train capacity is deterministic and most services operate at intervals of at least 30 min. Passengers tend to be more sensitive to the schedule, as reflected in the pre-ordering of tickets. Therefore, the TPSs in such networks are based on both the physical network topology and the trains' operational plans. Furthermore, the network in China continues to expand (expected to exceed 38,000 km by the end of 2020), and is supplemented with numerous regional links and intercity railways. VOLUME 8, 2020 Thus, the TPS generation problem is different and more difficult in HSR cases because the spatiotemporal complexity increases tremendously.
To the best of the authors' knowledge, previous research on enumerating feasible plans is limited, especially regarding large-scale and complex HSR networks. The objective of this paper is to provide an approach for generating reasonable TPSs in an HSR network under consideration of the complex timetables and likelihood of transfers.

III. PROBLEM DEFINITION
Our task is to identify algorithmic rules for generating the travel plans that a passenger will consider. Generally, travel plans can be divided into two kinds, namely plans with no transfers (so-called direct travel) and plans with one or more transfers. A direct travel plan involves a specific train traveling from the origin station to the destination station in the train schedule with defined departure and arrival times. The generation process is thus a straightforward task of searching for direct train plans in the train timetable, with no consideration of path-finding. The generation of plans including one or more transfers, in contrast, is far more complicated. This is due to three factors: the number of transfers, transfer node(s), and the connecting time threshold.

A. NUMBER OF TRANSFERS
From a psychological perspective, a greater number of transfers is associated with passenger perceptions of tiredness. From a computational perspective, the number of transfers determines the search size and computational load. Passengers usually select from among a set of proposed plans, and the number of transfers is an important criterion with an upper limit: the more transfers a path requires, the more likely passengers are to eliminate it. Therefore, the maximum number of transfers is set to two, as passengers are relatively unlikely to choose a journey that requires more than two transfers between different railway lines.

B. TRANSFER NODE(S)
The itinerary including transfers can be divided into several segments, with each segment treated as a direct service. The transfer node is the core of this division, and can be determined based on the physical network. The inherent algorithm is a path-finding method that computes the $K$-shortest paths following a certain criterion. In this study, an improved Yen* algorithm is applied so that the transfer nodes can be identified.

C. CONNECTING TIME THRESHOLD
The connecting time is defined as the difference between the arrival and departure of two trains involved in a trip with transfers. It includes the transfer time and waiting time. The transfer time is the time required for passenger interchange, and varies with station layout, passenger characteristics, and passenger behavior. For example, when a passenger transfers between two stations within a city by subway or taxi, the transfer time is usually longer than when simply transferring within the same station. Therefore, the connecting time threshold has a significant impact on the reliability of the transfer travel plan. A threshold that is too high wastes passengers' time, whereas one that is too low makes the transfer infeasible.
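The threshold check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bounds are the values used later in the case study (30 to 120 min within a station, 60 to 180 min within a city), while the function names, the dictionary layout, and the sample times are hypothetical.

```python
from datetime import datetime

# Connecting-time bounds in minutes, taken from the case study in Section V.
# The dictionary layout and key names are illustrative assumptions.
THRESHOLDS = {
    "within_station": (30, 120),
    "within_city": (60, 180),
}

def connecting_minutes(arrival: datetime, departure: datetime) -> float:
    """Connecting time: departure of the second train minus arrival of the first."""
    return (departure - arrival).total_seconds() / 60.0

def is_feasible(arrival: datetime, departure: datetime, kind: str) -> bool:
    """True when the connecting time lies within the bounds for this transfer kind."""
    low, high = THRESHOLDS[kind]
    return low <= connecting_minutes(arrival, departure) <= high

# Hypothetical arrival and departure times, 45 minutes apart.
arr = datetime(2019, 7, 1, 10, 0)
dep = datetime(2019, 7, 1, 10, 45)
print(is_feasible(arr, dep, "within_station"))  # 45 min lies inside [30, 120]
print(is_feasible(arr, dep, "within_city"))     # 45 min is below the 60 min minimum
```

The same 45-minute gap is accepted for a within-station transfer but rejected for a within-city transfer, which is why the two cases carry separate bounds.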

IV. METHODOLOGY
The overall framework of the proposed method is shown in Fig. 1. There are three steps in generating TPSs for a specific O-D pair on the HSR network. First, the direct set consisting of travel plans without transfers is extracted from the timetable data. Second, the indirect set consisting of travel plans with transfers is obtained by a two-stage method. This two-stage method applies the improved Yen* algorithm to find the $K$-shortest paths, where the transfer node(s) are identified, followed by the identification of connecting train services with constraints. Third, the indirect set with transfers is generated and the reliability of the travel plan with transfers is defined.

A. NOTATION AND RAILWAY SPATIOTEMPORAL NETWORK
A static railway network can be represented by the graph $G = (N, L)$, where $N$ is the set of nodes and $L$ is the set of railway lines. This static network supports path finding, but is not very efficient as an active network. To capture the structural and dynamic characteristics of an HSR network, a spatiotemporal graph $G = (V, L, E)$ is constructed based on the timetable, consisting of a set of nodes $V = C \cup S$, a set of lines $L$, and a set of spatiotemporal arcs $E = E_1 \cup E_2$. There are two kinds of nodes with attributes of location and associated lines: the first type $C = \{c_1, c_2, \cdots\}$ represents city nodes, while the other type $S = \{s^1_{c_1}, s^2_{c_1}, \cdots, s^1_{c_2}, s^2_{c_2}, \cdots\}$ represents station nodes. The city nodes are connected by a travel-time-dependent link $l_{ij} = \{c_i, c_j\} \in L$. The characteristics of all links in the set are attributed to this single link $l_{ij}$. A previous study [29] defined five kinds of arcs, namely boarding arcs, running arcs, stopping arcs, transferring arcs, and alighting arcs. Here, we simplify these into two kinds of arcs:
• Running arc $(t^{de}_{tr_p}(s^m_{c_i}), t^{ar}_{tr_p}(s^n_{c_j})) \in E_1$: represents a passenger who departs from station node $s^m_{c_i}$ at time $t^{de}(s^m_{c_i})$ and arrives at another station node $s^n_{c_j}$ at time $t^{ar}(s^n_{c_j})$ by train $tr_p$.
• Transferring arc $(t^{ar}(s^n_{c_j}), t^{de}(s^n_{c_j})) \in E_1$ or $(t^{ar}(s^n_{c_j}), t^{de}(s^l_{c_j})) \in E_2$: represents a passenger who arrives at station node $s^n_{c_j}$ at time $t^{ar}(s^n_{c_j})$ and then departs from station node $s^n_{c_j}$ or $s^l_{c_j}$ ($l \neq n$) at time $t^{de}(s^n_{c_j})$ or $t^{de}(s^l_{c_j})$, which can be considered as a transfer process within a station or within a city.
Assuming that passengers need to travel from the origin node of city $o$ to the destination node of city $d$ in the network, the solution procedure consists of three parts.
First, list the station sets $\{s^1_o, s^2_o, \cdots\}$ and $\{s^1_d, s^2_d, \cdots\}$ in cities $o$ and $d$, respectively, and search for direct trains from station $s^m_o$ ($m = 1, 2, \cdots$) to station $s^n_d$ ($n = 1, 2, \cdots$) in a pairwise manner. Second, apply the improved Yen* algorithm to find the transfer city node(s). Trips with transfers can thereby be divided into several segments according to the transfer node(s), and each of the segments is treated as a direct plan. A travel plan with transfers can then be obtained by integrating these segments with connecting time threshold constraints. Third, the reliability of a travel plan with transfers is determined as a function of the buffer time.
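The first part of this procedure, the pairwise search for direct trains between origin-city and destination-city stations, can be sketched as follows. This is only an illustration under assumed data: the timetable layout, the train identifiers, the station names, and the function `direct_plans` are all hypothetical.

```python
from itertools import product

# Hypothetical timetable: train id -> ordered stops (station, arrival, departure),
# with times in minutes after midnight. All names and times are made up.
TIMETABLE = {
    "G101": [("A_East", None, 480), ("T_Main", 560, 565), ("B_South", 650, None)],
    "G205": [("A_West", None, 500), ("B_South", 640, None)],
    "G309": [("A_East", None, 510), ("T_Main", 590, None)],
}

def direct_plans(timetable, origin_stations, dest_stations):
    """Return (train, from_station, departure, to_station, arrival) for every
    train that calls at an origin station before a destination station."""
    plans = []
    for train, stops in timetable.items():
        # Position of each station along this train's route.
        index = {station: i for i, (station, _, _) in enumerate(stops)}
        for s_o, s_d in product(origin_stations, dest_stations):
            if s_o in index and s_d in index and index[s_o] < index[s_d]:
                departure = stops[index[s_o]][2]
                arrival = stops[index[s_d]][1]
                plans.append((train, s_o, departure, s_d, arrival))
    return plans

# Origin city with two stations, destination city with one station.
plans = direct_plans(TIMETABLE, ["A_East", "A_West"], ["B_South"])
for plan in plans:
    print(plan)
```

Because each segment of a transfer trip is itself treated as a direct plan, the same search can be reused for origin-to-transfer and transfer-to-destination segments.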

B. TWO-STAGE METHOD

1) PATH-SEARCHING PROCESS
In this section, the $K$-shortest loopless paths algorithm, which enumerates only paths without repeated nodes, is introduced. Yen's algorithm [30] is a classical deviation algorithm that is appropriate for this task. Consider the $k$-th shortest path in the form $p_k = o \to c^k_1 \to c^k_2 \cdots c^k_j \cdots \to d$. For each node $c^k_i$ to be analyzed, $p_k$ is a deviation from $p_{k-1}$, $k = 2, 3, \cdots, K$, at $c^k_i$. To obtain $p_k$, it is only necessary to look for all the shortest deviations from $p_{k-1}$, then scan these deviations to identify the one with the shortest length (minimum cost).
To improve the algorithm's performance and make it suitable for the network model, two optimizations are added to accelerate the search process. First, we set constraints on the spur node. As the shortest loopless path $p_k$ can only deviate from a transfer city node (i.e., a node intersected by two lines), there is no need to visit all the nodes in the path. Second, the heuristic strategy of the A* algorithm (a graph searching algorithm using an evaluation function to sort the nodes) is adopted to reduce the search scope. The evaluation function is described as follows:
$$f(n) = g(n) + h(n)$$
where $g(n)$ is the actual cost from the initial node $s$ to spur node $c^k_i$, and $h(n)$ is the estimated cost of the optimal path from $c^k_i$ to the target node $d$, which depends on the heuristic information of the problem domain.
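The two ideas above, restricting spur nodes and ordering the spur search by $g(n) + h(n)$, can be combined in a compact sketch. This is not the authors' improved Yen* implementation; it is a textbook Yen deviation loop with an A* spur search, and the graph, edge costs, heuristic values, and the choice to treat the origin as an allowed spur node are all assumptions made for illustration.

```python
import heapq

def a_star(graph, source, target, h, banned_nodes=frozenset(), banned_edges=frozenset()):
    """A* search ordered by g(n) + h(n); returns (cost, path) or None."""
    frontier = [(h(source), 0.0, source, [source])]
    best_g = {source: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry
        if node == target:
            return g, path
        for nxt, w in graph[node].items():
            if nxt in banned_nodes or (node, nxt) in banned_edges:
                continue
            g_new = g + w
            if g_new < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_new
                heapq.heappush(frontier, (g_new + h(nxt), g_new, nxt, path + [nxt]))
    return None

def yen_k_shortest(graph, source, target, k, h=lambda n: 0.0, spur_nodes=None):
    """Up to k loopless shortest paths; spur_nodes optionally limits deviations."""
    first = a_star(graph, source, target, h)
    if first is None:
        return []
    accepted, candidates = [first], []
    seen = {tuple(first[1])}
    while len(accepted) < k:
        prev_path = accepted[-1][1]
        for i in range(len(prev_path) - 1):
            spur = prev_path[i]
            if spur_nodes is not None and spur not in spur_nodes:
                continue  # only deviate at allowed (transfer) nodes
            root = prev_path[: i + 1]
            # Ban edges that would reproduce an accepted path sharing this root.
            banned_edges = {(p[i], p[i + 1]) for _, p in accepted if p[: i + 1] == root}
            found = a_star(graph, spur, target, h, frozenset(root[:-1]), banned_edges)
            if found is None:
                continue
            spur_cost, spur_path = found
            root_cost = sum(graph[u][v] for u, v in zip(root, root[1:]))
            path = root[:-1] + spur_path
            if tuple(path) not in seen:
                seen.add(tuple(path))
                heapq.heappush(candidates, (root_cost + spur_cost, path))
        if not candidates:
            break
        accepted.append(heapq.heappop(candidates))
    return accepted

# Hypothetical undirected city graph with made-up travel costs.
GRAPH = {
    "O": {"A": 2, "C": 6},
    "A": {"O": 2, "B": 2, "C": 3},
    "B": {"A": 2, "D": 2},
    "C": {"O": 6, "A": 3, "D": 2},
    "D": {"B": 2, "C": 2},
}
H = {"O": 4, "A": 3, "B": 2, "C": 2, "D": 0}  # assumed lower bounds on cost to D
TRANSFER_NODES = {"O", "A", "C"}  # origin plus nodes with more than two neighbours

paths = yen_k_shortest(GRAPH, "O", "D", 5, h=H.get, spur_nodes=TRANSFER_NODES)
for cost, path in paths:
    print(cost, " -> ".join(path))
```

In this toy graph only four loopless paths exist, so requesting five returns four. Skipping node B as a spur node loses nothing here, because B has only two neighbours and any deviation at B would have to revisit a node already on the root path.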

2) CONNECTING TRAIN SERVICES
Based on the improved Yen* algorithm, the $K$-shortest paths can be calculated. As each node in the path (except the origin and destination nodes) denotes a transfer city, a sequence of all feasible transfer cities can be obtained. As mentioned before, a travel plan can only be considered feasible if the number of transfers is less than or equal to the defined upper boundary. Thus, recognizing each planned travel segment (between two stations) as a running arc and transfer segments as transferring arcs, the feasible travel plans including one or two transfers are elaborated. (Note that the method is based on the assumption that each train runs on time according to the schedule and that transferring to trains on the same lines will not occur.)
• One transfer. There are two forms of transferring: transferring within the station and transferring within the city. Fig. 3 illustrates the situation, where a transfer arc within a station represents the former case and a transfer arc between two stations represents the latter. The difference in transfer time is significant, as the time consumed in the latter process is expected to be somewhat longer than that in the former process. By defining the connecting time threshold, feasible train connections are obtained. Note that if a train that can directly travel to the destination station is contained in a plan with transfers, it should be excluded from the plan set.
• Two transfers. There are three issues to be considered. First, it is assumed that transferring within a city is not reasonable. Changing stations within the city usually requires passengers to take public transportation (bus, taxi, subway, and so on), which has a strong deterrent effect because it increases the travel time and decreases travel comfort. Second, considering the psychology of passengers and the accessibility of the given network, no more than two within-station transfers are acceptable. Third, as mentioned before, travel plans from the origin to the transfer station and from the transfer station to the destination (for example, from City O to City T, and from City T' to City D) have already been identified in the generation process of a single-transfer TPS, which simplifies the problem. The following step aims to find all train services between the two transfer cities, and the entire plan is obtained by connecting the three segments according to the connecting time threshold. Finally, a plan with two transfers should be ruled out if it contains a single-transfer plan as a subset.
By employing the abovementioned search process, all travel plans (including plans with and without transfers) between the origin city node and the destination city node are enumerated.
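The single-transfer case can be sketched as a join of two lists of direct segments under the connecting time threshold. This is a simplified illustration under stated assumptions: the segment tuples, train identifiers, and station names are hypothetical, the bounds are those of the case study, and the function `one_transfer_plans` is not taken from the paper.

```python
# Connecting-time bounds (minutes) from the case study; other names are hypothetical.
BOUNDS = {"within_station": (30, 120), "within_city": (60, 180)}

def one_transfer_plans(first_legs, second_legs, direct_trains):
    """Join O->T and T->D legs into single-transfer plans.

    Each leg is (train, from_station, departure, to_station, arrival), with
    times in minutes after midnight. Trains that already serve O->D directly
    are skipped, following the exclusion rule stated in the text.
    """
    plans = []
    for leg1 in first_legs:
        for leg2 in second_legs:
            train1, _, _, arr_station, arr_time = leg1
            train2, dep_station, dep_time, _, _ = leg2
            if train1 == train2 or {train1, train2} & direct_trains:
                continue
            # Same station: within-station transfer; otherwise within-city.
            kind = "within_station" if arr_station == dep_station else "within_city"
            low, high = BOUNDS[kind]
            if low <= dep_time - arr_time <= high:
                plans.append((leg1, leg2, kind))
    return plans

# Hypothetical legs through transfer city T, which has stations T_Main and T_North.
first_legs = [
    ("G101", "A_East", 480, "T_Main", 560),
    ("G309", "A_East", 510, "T_Main", 590),
]
second_legs = [
    ("G101", "T_Main", 565, "B_South", 650),
    ("G411", "T_Main", 640, "B_South", 720),
    ("G513", "T_North", 660, "B_South", 735),
]
transfer_plans = one_transfer_plans(first_legs, second_legs, direct_trains={"G101"})
for leg1, leg2, kind in transfer_plans:
    print(leg1[0], "->", leg2[0], kind, leg2[2] - leg1[4], "min")
```

A two-transfer plan could be built the same way by joining three segment lists, keeping only within-station transfers and discarding any result that contains a single-transfer plan.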

C. RELIABILITY OF TRAVEL PLANS WITH TRANSFERS
To visually indicate the reliability level of plans with transfers and make it convenient for passengers to choose a plan, the reliability index R is proposed in this section. Traditionally, the delay probability [31] is an important factor in estimating the reliability level of a travel service. Here, we introduce the buffer time, which is the available time exceeding the minimum connecting time, as the measure. For h>l, a buffer time of h minutes implies a lower risk of missing a train than a buffer time of l minutes.
We first consider how to form a dataset of buffer times. Using GPS technology, passengers' travel time and location information can be recorded, thereby providing frequent sampling along transfer routes. By gathering these samples, the connecting time threshold for a specific transfer route can be determined and a dataset of buffer times can be thus obtained. The reliability of a single travel plan with transfers is defined using the cumulative distribution function of these buffer times, while the reliability of connections with several transfers can be regarded as the product of the reliabilities of each single transfer. Obviously, more transfers will result in an increased risk of missing a connection.
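The product rule for several transfers, together with the buffer time definition, can be illustrated with a short sketch. The single-transfer reliabilities used below (0.95 and 0.90) are made-up inputs, not values computed from the reliability model, and the function names are hypothetical.

```python
import math

def buffer_time(connecting_minutes, minimum_connecting_minutes):
    """Buffer time: the connecting time in excess of the minimum connecting time."""
    return connecting_minutes - minimum_connecting_minutes

def plan_reliability(transfer_reliabilities):
    """Reliability of a plan as the product of its single-transfer reliabilities."""
    return math.prod(transfer_reliabilities)

# A 45 min connection with a 30 min minimum leaves a 15 min buffer.
print(buffer_time(45, 30))

# Illustrative single-transfer reliabilities (assumed, not from the paper).
one_transfer = plan_reliability([0.95])
two_transfers = plan_reliability([0.95, 0.90])
print(one_transfer, two_transfers)
```

Because every factor is at most 1, adding a transfer can never raise the product, which matches the observation that more transfers increase the risk of missing a connection.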
The model developed by Disser [32] is introduced to describe the reliability of a travel plan with a single transfer. The main purpose of the reliability index $R$ is to make it more convenient for passengers to choose a travel plan, be prepared for the second journey, and reduce the risk of missing the train. The reliability $R$ is defined as a function of the buffer time with parameters $a = 0.6$, $b = 8$, and $s = 0.99$, so that the maximal reliability of a single transfer is 99% and a buffer time of 0 min leads to 60% reliability. These values are appropriate from the passengers' perspective in practice. However, the parameter values can be changed to set different standards for the maximal reliability and for the reliability at a buffer time of 0 min.

V. CASE STUDY

A. NETWORK DESCRIPTION
To demonstrate and test the proposed method, the current CHR network (Fig. 6) with four vertical lines and four horizontal lines is used. The network consists of eight main lines and 37 cities (Table 2) with nearly 100 stations in operation. More than 20 million daily trips are carried by over 5000 pairs of HSR trains in the network. All the trains are operated as electric multiple units consisting of eight or 16 carriages, with the train capacity ranging from 494 to 1299 seats [33].

B. NUMERICAL EXAMPLE
A specific O-D pair (from Jinan City to Wuhan City) is taken as an example. Suppose that K = 5. The minimal connecting times within the station and within the city are 30 min and 60 min, respectively, and the maximum connecting times are 120 min and 180 min.
As can be seen from the results in Table 3, there are four alternative paths (as shown in Fig. 7): the path via Xuzhou, Bengbu, and Hefei City is the shortest one; the second-shortest path goes through Xuzhou, Bengbu, Nanjing, and Hefei City; the third-shortest path passes through Xuzhou and Zhengzhou City; and the fourth path goes by way of Shijiazhuang and Zhengzhou City. Other more circuitous routes have been excluded by the constraints.    Table 5. The overall results are summarized in Table 6.
From the results, the following conclusions can be stated:
1) The number of single-transfer travel plans is far higher than the number of direct plans. This is because the HSR train services are relatively infrequent, which results in the transfer behavior.
2) In contrast, the number of travel plans including two transfers is very low, and may even be zero. There are several factors contributing to this result. As mentioned before, an unreasonable travel plan with two transfers would be ruled out if it contained a single-transfer plan as a subset, and the number of single-transfer plans implies that most trips can be completed by transferring at most once. Another scenario is that the distance between the two cities is not far enough to require two transfers, which further reflects the accessibility of our network. The second example, which combines HSR trains with conventional trains, demonstrates the validity of our method. Plan 4 in Table 4 includes two transfers. Due to the use of lower-speed trains, the travel time is much longer than plans using HSR trains.
3) There are more transfer plans within a station than within a city, which implies that the timetable scheduling is good. As CHR network operation makes it possible to transfer at each intermediate station, the proposed method can generate all available travel plans that satisfy passengers' travel demands under constraints such as cost, travel time, and transfer reliability.
The reliability results for the single-transfer TPS are shown in Fig. 8. This chart shows the reliability of single-transfer journeys from Jinan to Wuhan. It is divided into three levels of reliability, with over 90% in the high-reliability class. Based on the proposed measure, the reliability of transfer plans is extremely sensitive to the buffer time. For cases where the buffer time $h < 5$ min, the reliability is less than 75%, whereas for $h > 9$ min, the reliability is higher than 87%. The buffer time parameter is affected by the minimum connecting time. Therefore, it is important to study the connecting time domain under different cases and optimize internal spatial streamlining of stations based on passenger requirements to transfer between trains.

C. IMPLEMENTATION DETAILS
Our method was tested on a PC with an Intel Core i7 CPU @ 2.80 GHz and 16 GB of RAM. The calculation time varies according to the network. Fig. 6 shows the tested network with 37 city nodes, 97 station nodes, and 8 main lines, and the train timetable dataset has over 108,000 operation records. In total, it takes 9 ms to determine the $K$-shortest paths using the improved Yen* algorithm, and the second stage takes an estimated 30 s to search for connecting train services.

VI. CONCLUSION
This paper has investigated the necessity and challenge of finding all feasible and reasonable TPSs. A two-stage generation method based on a spatiotemporal network has been proposed. This method can handle the structural complexity and topological dynamics found in typical HSR networks.
The main findings from this study can be summarized as follows:
• Unlike previous studies, we generated all feasible travel plans, rather than solving the earliest arrival problem or applying other constraints. To achieve this goal, a railway network was first modeled using a spatiotemporal graph.
• To modify Yen's algorithm for our realistic network, two optimization methods were proposed, namely, setting constraints on the spur nodes and integrating the heuristic strategy of the A* algorithm.
• Mapping the buffer time to a reliability measure enabled us to calculate the reliability of each travel plan with transfers, allowing passengers to be informed about and prepared for the second leg of their trip.
In the future, several extensions of this study are possible, for example, modeling passengers' choice behaviors in detail and understanding the impact of transfers, especially those that require changing stations within a city. These topics are interesting and merit further research.