Neural Network-Based Prediction Model for Passenger Flow in a Large Passenger Station: An Exploratory Study

As the hub through which passengers are transferred, the railway station strongly affects rail passenger transportation, because a station can operate normally, without being overloaded, only when passenger flow stays moderate. Reasonable and accurate prediction of the passengers entering and leaving the station therefore provides a basis for station security, resource allocation, and personnel deployment. A neural network learns regular patterns in data by training the network and adjusting its weights on a large number of training samples; in this paper it is also applied to short-term irregular data in order to predict the passenger flow at a railway station, which is susceptible to constantly changing external factors. First, the key factors affecting the change in passenger flow are selected and analyzed as the inputs of the neural network. Second, learning with a variable-step-size rate update is adopted to estimate the number of people entering the station during a given time interval, and this estimate is then weighted with the historical data to derive the prediction of the passenger flow during the next time interval. The simulation results show that the proposed method can better track and predict sudden changes in passenger flow caused by emergencies. They also indicate that, in forecasting abnormal passenger flow, the most critical step is to summarize the characteristics of railway station passenger flow, clarify the type and time distribution of passenger flow at each station, and analyze the factors that cause abnormal changes in passenger flow.


I. INTRODUCTION
In Asia, and especially in China, rail transit is favored by a growing number of people because its technical advantages (large capacity, high speed, low pollution, high safety, and good punctuality) help resolve the growing mismatch between transportation supply and demand in large and medium-sized cities. In recent years, with the development of the urban rail transit system, high-speed rail, bullet trains, inter-city trains, and ordinary trains have become increasingly popular and multi-functional, bringing great convenience to travelers [1]. Trains are now becoming the first choice for more and more travelers because of their steadily increasing speed, high safety, and excellent comfort.
The associate editor coordinating the review of this manuscript and approving it for publication was Sabah Mohammed.
As the hub through which passengers are transferred, the railway station is an important factor in rail passenger transportation, because a station can operate normally, without being overloaded, only when passenger flow stays moderate [2], [3]. Reasonable and accurate prediction of the passengers entering and leaving the station therefore provides not only a basis for personnel deployment and resource allocation but also strong support for security work.
Passenger flow volume is a highly nonlinear function of time, because it is affected by multiple external factors. Compared with similar application scenarios (shopping malls, highways, scenic spots), the flow of passengers entering a railway station has more pronounced characteristics, such as a clear daily periodicity. First, the passenger flow at a railway station is susceptible to the weather, but the overall trend remains nearly the same despite the local fluctuations that the weather causes. Second, the number of passengers entering the station in a given time interval is closely connected to the number of trains in the next time interval, and the fluctuation in passenger numbers is particularly evident during the winter and summer vacations and the statutory holidays.
This paper estimates the number of passengers entering the railway station in different time intervals by combining a neural network, which has great advantages in data prediction [4], with historical data [5]. A neural network learns regular patterns in data by training the network and adjusting its weights on a large number of training samples; here it is also applied to short-term irregular data, or data that change abruptly, in order to predict the passenger flow at a railway station, which is susceptible to constantly changing external factors.
The rest of this paper is organized as follows. Section II reviews the literature that forms the theoretical foundation of this study. Section III describes the passenger flow prediction model. Section IV presents the improved prediction model. Section V presents the experiment and results. Section VI concludes.

II. LITERATURE REVIEW
In 1984, Okutani and Stephanedes [6] applied Kalman filtering theory to the prediction of dynamic short-term passenger flow. Friedman et al. [7] first proposed combining the neighborhood prediction method with the time series method in 1977. In 1991, Davis and Nihan [8] first introduced the nearest neighbor prediction method and the time series method to short-term passenger flow prediction. Because the k-nearest neighbor algorithm requires neither parametric assumptions nor a parameter adjustment process, its prediction performance is better. The research of Smith et al. [9] mainly focused on reducing the running time of the algorithm. Some scholars have predicted traffic passenger flow using passenger flow data from observation points at neighboring times [10], [11].
Setyawati and Creese [12] applied an autoregressive integrated moving average (ARIMA) prediction model to short-term passenger flow prediction at railway stations. The model could not account for the influence of emergency events, but it otherwise achieved good results. In 1997, Arem et al. [13] summarized the short-term traffic flow prediction theory existing at that time, pointed out the necessity of short-term traffic flow prediction, and noted that the accuracy of short-term traffic flow prediction techniques needed to be improved.
In 1943, biologists and mathematical physiologists jointly presented a mathematical model of the neuron, which laid a solid foundation for later research [9]. The main achievements of scholars abroad in using neural networks for prediction are as follows. In 2001, Dia [14] and other researchers proposed a goal-oriented neural network prediction model to predict short-term passenger flows and obtained good results. Chen and Muller [15] studied a short-term traffic flow prediction method based on a neural network model with dynamic ranking learning, which showed great advantages. Vlahogianni et al. [16] optimized the neural network prediction model and applied structural optimization theory to road traffic flow prediction. Through simulation experiments, multi-section prediction was achieved, and the results showed that the optimized model effectively reflects the random variability of road traffic flow. In 2012, Zheng et al. [17] combined Bayes' rule and conditional probability theory to optimize different neural networks and applied the optimized model to the prediction of short-term traffic flow on expressways. The verification example showed that the optimized predictions were more accurate than those of a single neural network model.
In 2010, Sugiyama et al. [18] proposed a real-time forecasting method for passenger flow in railway stations. Ozerova [19] used the bilinear correlation method to study the factors influencing intercity commuter passenger flow and then used linear regression to predict commuter passenger flow. Hrushevska [20] discussed the patterns of passenger flow in terms of the train schedule and other aspects and, on this basis, forecast the daily passenger flow of suburban railway stations.
In recent years, the combined forecasting model has become the mainstream of passenger flow forecasting. A combined forecasting model merges several well-performing algorithms in order to overcome the limitations of a single algorithm and improve forecast accuracy. Hao et al. [21] analyzed in detail the short-term passenger flow changes, including the short-term historical changes, in railway stations. Guo et al. [22] studied the short-term prediction of interval passenger flow and selected the BP neural network on the basis of the characteristics of interval passenger flow; the forecasting was implemented in Matlab and a comparative analysis was performed. Their empirical research shows that the neural network, matched to the characteristics of interval passenger flow data, has good prediction accuracy. George et al. [23] argued that passenger flow forecasting has important strategic significance in the management of transportation systems. They adopted a new data source, social media, to address this challenge, and developed a systematic approach to examining social media activity and perceived events. Their preliminary analysis shows a moderately positive correlation between passenger flow and the ratio of social media posts, a finding that prompted researchers to develop a new method for improving traffic prediction. Asmer et al. [24] conducted a feasibility study on the introduction of security checks at train stations. A simulation model for the Brunswick, Germany train station has been ...
A close correlation is found between the data in adjacent time intervals, so this paper predicts the data in the next time interval from the data in the previous time interval. In this way, the short-term fluctuation of the data can be better tracked and predicted. Meanwhile, the prediction of the model and the historical data are weighted to obtain the final estimate of the passenger flow volume, which the simulation results show to be more accurate [25], [26].

III. THE PASSENGER FLOW PREDICTION MODEL
The passenger flow prediction model is built mainly on the BP neural network: the inputs are the major factors influencing the change in passenger flow, and the output is the predicted passenger flow. A learning rate updated with a variable step size is adopted to avoid oscillation and to avoid falling into a local optimum. Because the station schedule follows a daily cycle and, absent interference from external factors, the passenger flow is distinctly periodic, weighting the prediction with the historical data yields higher accuracy.

A. CONSTRUCTION OF THE NEURAL NETWORK FOR THE PASSENGER FLOW PREDICTION IN THE PASSENGER STATION
Here, a three-layer BP neural network is used as the model for predicting the number of people entering the station. The network consists of the input layer, the hidden layer, and the output layer. There are no connections within a layer, but adjacent layers are fully connected. The network structure is shown in Figure 1.
In Figure 1, $x_i$, $i = 1, 2, \ldots, m$, represent the key factors affecting the number of people entering the station.
The hyperbolic tangent sigmoid function is used as the transfer function $f(\cdot)$ of the hidden layer and the output layer, that is:
$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
The input vector is $X = [x_1, x_2, \ldots, x_m]^T$. Each element represents a quantified factor affecting the passenger flow at the station. These factors usually include weather, holidays, etc. $W$ is the connection weight matrix from the input layer to the hidden layer, which can be written explicitly as
$$W = \begin{bmatrix} w_{11} & w_{21} & \cdots & w_{m1} \\ w_{12} & w_{22} & \cdots & w_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ w_{1n} & w_{2n} & \cdots & w_{mn} \end{bmatrix}$$
where $w_{ij}$ is the weight from input node $i$ to hidden node $j$.
In calculating the output of the hidden layer, the threshold vector $B = [b_1, b_2, \cdots, b_n]^T$ is treated as the weights of an additional input-layer node whose value is the constant 1. The input vector can therefore be extended to $X = [x_1, x_2, \ldots, x_m, 1]^T$, and correspondingly the weight matrix from the input layer to the hidden layer can be extended to
$$W = \begin{bmatrix} w_{11} & w_{21} & \cdots & w_{m1} & b_1 \\ w_{12} & w_{22} & \cdots & w_{m2} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ w_{1n} & w_{2n} & \cdots & w_{mn} & b_n \end{bmatrix}$$
In this case, the hidden-layer output $F = [f_1, f_2, \ldots, f_n]^T$ can be expressed as
$$F = f(WX), \qquad f_j = f\left(\sum_{i=1}^{m} w_{ij} x_i + b_j\right)$$
Similarly, the weight vector from the hidden layer to the output layer can be extended to $T = [t_{10}, t_{20}, \ldots, t_{n0}, b_0]$, so the output of the output layer of the BP neural network is
$$\hat{x} = f\left(\sum_{j=1}^{n} t_{j0} f_j + b_0\right)$$
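The forward computation described above, with the thresholds folded into the weights through a constant input of 1, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `forward`, the hidden-layer size `n = 6`, the random initial weights, and the sample input values are assumptions, not values taken from the paper.

```python
import numpy as np

def forward(x, W_ext, T_ext):
    """Forward pass of a three-layer network with tanh units.

    x     : (m,) quantified input factors
    W_ext : (n, m + 1) input-to-hidden weights; the last column holds b_1..b_n
    T_ext : (n + 1,) hidden-to-output weights; the last entry holds b_0
    """
    x_ext = np.append(x, 1.0)            # extended input [x_1, ..., x_m, 1]
    hidden = np.tanh(W_ext @ x_ext)      # hidden-layer output F
    hidden_ext = np.append(hidden, 1.0)  # constant node for the output threshold
    return np.tanh(T_ext @ hidden_ext), hidden

# Hypothetical sizes and weights, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
m, n = 4, 6
W_ext = rng.normal(scale=0.1, size=(n, m + 1))
T_ext = rng.normal(scale=0.1, size=n + 1)
y, F = forward(np.array([0.3, 1.0, 2.0, 4.0]), W_ext, T_ext)
```

Because the output unit is also a hyperbolic tangent, `y` lies in (−1, 1); a real passenger count would have to be scaled into that range before training and scaled back afterwards, a step this sketch leaves out.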

B. THE PREDICTION MODEL OF THE NUMBER OF PEOPLE ENTERING THE STATION
For time $k$, the input vector is $X(k) = [x_1(k), x_2(k), x_3(k), x_4(k), 1]$. Here $x_1(k)$ is the actual flow of people entering the station in the time interval $((k-1)T_0, kT_0]$, where $T_0$ is the period between two adjacent time points, such as one hour or one day. $x_2(k)$ is the weather condition at time $kT_0$; its values and the corresponding neural network input values form the lookup table {sunny day = 1, rainy day = 2, snowy day = 3, gale = 4}. $x_3(k)$ is the outdoor temperature $P_v$ at time $kT_0$, whose values are likewise quantized into neural network input values through a lookup table. $x_4(k)$ indicates holidays during the time interval; its values and the corresponding neural network input values form the lookup table {statutory holiday = 1, winter and summer vacation = 2, weekend = 3, otherwise = 4}.
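The quantization of the input factors can be illustrated with ordinary dictionaries. The weather and holiday codes below are the ones listed in the text; the function name `build_input` and the dictionary names are hypothetical, and the temperature is passed in as an already quantized code because its lookup table is not reproduced here.

```python
# Lookup tables for the categorical factors (codes as listed in the text).
WEATHER_CODE = {"sunny day": 1, "rainy day": 2, "snowy day": 3, "gale": 4}
DAY_TYPE_CODE = {
    "statutory holiday": 1,
    "winter and summer vacation": 2,
    "weekend": 3,
    "otherwise": 4,
}

def build_input(entering_flow, weather, temperature_code, day_type):
    """Assemble X(k) = [x1(k), x2(k), x3(k), x4(k), 1] as a list of floats.

    temperature_code is assumed to be quantized by the caller.
    """
    return [
        float(entering_flow),             # x1: flow in ((k-1)T0, kT0]
        float(WEATHER_CODE[weather]),     # x2: weather code
        float(temperature_code),          # x3: quantized temperature
        float(DAY_TYPE_CODE[day_type]),   # x4: holiday code
        1.0,                              # constant node for the thresholds
    ]

x_k = build_input(1200, "rainy day", 3, "weekend")
# x_k == [1200.0, 2.0, 3.0, 3.0, 1.0]
```

An unknown weather or day-type label raises a `KeyError`, which makes a missing category visible instead of silently mapping it to a default code.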
Based on the above definition of the input-layer nodes and the transformation of the model calculation in Section III-A, the structure of the prediction model for the number of people entering the station in different time intervals is shown in Figure 2. The input-layer node labeled 1 is used to transform the threshold vector into part of the weight vector, and its value is updated iteratively by the gradient descent method.
36878 VOLUME 8, 2020
The output of the neural-network-based station passenger flow prediction model is as follows:
$$\hat{x}(k+1) = f\left(\sum_{j=1}^{n} t_{j0}(k-1)\, f\left(\sum_{i=1}^{4} w_{ij}(k-1)\, x_i(k) + b_j(k-1)\right) + b_0(k-1)\right)$$
That is, the weights $\{t_{j0}(k-1), w_{ij}(k-1)\}$ of the neural network trained during time interval $k-1$ and the input $X(k) = [x_1(k), x_2(k), x_3(k), x_4(k), 1]$, which describes the conditions during time interval $k$, are used to predict $\hat{x}(k+1)$, the station passenger flow volume during time interval $k+1$. The initial weights $\{t_{j0}(0), w_{ij}(0)\}$ are obtained by training on historical data.

IV. THE IMPROVED PASSENGER FLOW PREDICTION MODEL
Because of its advantages, the BP algorithm has been applied in many research areas; however, many cases have demonstrated that a network trained with the standard algorithm cannot achieve the desired results. The main reasons are defects of the algorithm itself: (1) it easily converges to a local minimum rather than the global optimum; (2) its learning efficiency is low and its convergence is slow, because a large number of training iterations is needed; (3) false saturation occurs during learning. The following improvements address these defects.

A. WEIGHT VECTOR UPDATE
An error function is obtained from the difference between the passenger flow obtained from the actual statistics and the passenger flow predicted by the neural network during time interval $k$:
$$E(k) = \frac{1}{2}\left[x(k) - \hat{x}(k)\right]^2$$
where $x(k)$ is the actual passenger flow recorded during time interval $k$ and $\hat{x}(k)$ is the passenger flow volume during time interval $k$ predicted by the neural network model.
Here, the gradient descent method is used to adjust the weight vector from the input layer to the middle layer, $W(k)$, and the weight vector from the middle layer to the output layer, $T(k)$, so that $E(k) \to 0$. The adjustment mechanism is as follows:
$$T(k) = T(k-1) - \delta\, \frac{\partial E(k)}{\partial T(k-1)}, \qquad W(k) = W(k-1) - \eta(k-1)\, \frac{\partial E(k)}{\partial W(k-1)}$$
That is, the weight vector $T(k)$ is updated with the step size $\delta$, while the weight vector $W(k)$ is updated with the step size $\eta(k-1)$. The step size $\eta$ is adapted using the inner product of the error gradients of the current sample and the previous sample. When the inner product is positive, the two adjustments have the same direction on the error surface, and the learning rate can be increased to search quickly for the global optimum; when the inner product is negative, a turning point has been encountered on the error surface, so the learning rate should be reduced to avoid oscillation.
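The step-size rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact update: the multiplicative factors `up` and `down`, the cap `eta_max`, the function names, and the toy quadratic error surface are all hypothetical choices made so that the example is self-contained.

```python
import numpy as np

def adapt_step(eta, grad, prev_grad, up=1.05, down=0.7, eta_max=0.4):
    """Adapt the step size from the inner product of consecutive gradients."""
    inner = float(np.dot(grad.ravel(), prev_grad.ravel()))
    if inner > 0:    # same direction on the error surface: speed up (capped)
        return min(eta * up, eta_max)
    if inner < 0:    # direction reversed: slow down to avoid oscillation
        return eta * down
    return eta       # no information (e.g. first step): keep the step size

def minimize(grad_fn, w0, eta0=0.05, steps=200):
    """Plain gradient descent whose step size follows adapt_step."""
    w = np.asarray(w0, dtype=float)
    eta = eta0
    prev_grad = np.zeros_like(w)
    for _ in range(steps):
        grad = grad_fn(w)
        eta = adapt_step(eta, grad, prev_grad)
        w = w - eta * grad
        prev_grad = grad
    return w, eta

# Toy error surface E(w) = 0.5 * (w1^2 + 4 * w2^2); its gradient is [w1, 4 * w2].
grad_fn = lambda w: np.array([w[0], 4.0 * w[1]])
w_final, eta_final = minimize(grad_fn, [3.0, -2.0])
```

On this toy surface the step size grows while successive gradients agree and is held at the cap afterwards; in the network, the same rule would be applied to the gradient of $E(k)$ with respect to $W$.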

B. THE WEIGHTED FUNCTION OF THE PASSENGER FLOW PREDICTION
Since the number of trains at each passenger station follows a daily cycle, the fluctuation of the passenger flow across the time intervals of each day should follow the same trend. In addition, there is a certain connection between the passenger flows of two adjacent days. A weighted prediction model of the passenger flow is therefore considered, in which the neural network prediction is combined with the real passenger flow data through a weighting coefficient. In this model, $\tilde{x}(k)$ is the final predicted value of the passenger flow during time interval $k$; $x(k)$ is the real value of the passenger flow during time interval $k$; $\hat{x}(k)$ is the value predicted by the neural network for the passenger flow during time interval $k$; and $\alpha$ is an adjustable weighting coefficient.
Based on the calculation and transmission of the above data, the prediction model of the station passenger flow shown in Figure 3 is obtained. In it, $X(k)$ is the input of the neural network and $\tilde{x}(k+1)$ is the final output value of the prediction model.
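One way to read the weighted prediction is as a convex combination of the neural network output and a historical value. The sketch below assumes exactly that form, assumes that $\alpha$ multiplies the neural network term, and assumes that the historical value is the real flow of the same time interval on the previous day; none of these details is confirmed by the text, and the function name is hypothetical. The default `alpha=0.4` is the value used in the simulation section.

```python
def weighted_prediction(nn_prediction, historical_flow, alpha=0.4):
    """Combine the network output with a historical value.

    Assumed form: alpha * nn_prediction + (1 - alpha) * historical_flow.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * nn_prediction + (1.0 - alpha) * historical_flow

# Network predicts 1000 passengers; the same hour yesterday saw 1200.
final = weighted_prediction(1000.0, 1200.0)
```

With `alpha` close to 1 the output follows the network and reacts quickly to abrupt changes; with `alpha` close to 0 it follows the daily pattern in the historical data.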

C. THE ERROR FUNCTION TO ACCELERATE THE LEARNING SPEED
The weight changes for the input layer and for the hidden layer are first written in terms of the error $E$. The error signal of output node $k$ is $\delta^{o}_{k} = -\eta\, \partial E / \partial \mathrm{net}_k$, and an analogous error signal $\delta$ is defined for the hidden layer. Differentiating $E$ for the output layer and for the hidden layer gives the updates of the connection weights and, from them, the expression for the error gradient. A minimal error gradient means that $\delta^{o}_{k}$ is close to zero. A new error function is therefore introduced. The activation function of the neuron is the hyperbolic tangent function:
$$f(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}$$
where $a$ is the input of the activation function. The output range of the hyperbolic tangent function is $(-1, 1)$, which reduces the training time.
Because $f'(a) = 1 - f(a)^2$, taking the output layer as an example, the new error function accelerates the learning.
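For reference, the derivative of the hyperbolic tangent activation follows directly from the quotient rule. The short derivation below is a standard identity, added here as a reading aid rather than as a reconstruction of the paper's own equations.

```latex
f(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}, \qquad
f'(a) = \frac{(e^{a} + e^{-a})^{2} - (e^{a} - e^{-a})^{2}}{(e^{a} + e^{-a})^{2}}
      = 1 - f(a)^{2}
```

As $|a|$ grows, $f(a)$ approaches $\pm 1$ and $f'(a)$ approaches 0, so any error signal that is multiplied by $f'(a)$ shrinks even when the output error is still large. This is one common reading of the false saturation listed among the defects of the standard algorithm.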

V. EXPERIMENT AND RESULTS
The annual passenger flow data of Beijing from June 1, 2017 to May 31, 2018 are shown in Table 1. Table 1 shows that the passenger flow of Beijing has a certain periodicity. The passenger flow in February 2018 fluctuates strongly, mainly because of the Spring Festival, the most important traditional festival in China. Beijing is a first-tier city with a large migrant population, and at this time of the year a large number of migrants leave Beijing to return to their hometowns, causing a sharp increase in passenger traffic. Another peak in passenger traffic occurs in July of each year, during the summer vacation of schools in China. As a cultural, political, and economic center of China, Beijing is also an important travel destination for many students, so the passenger flow of the Beijing railway station increases.
In the simulation, time is divided into hourly intervals, $T_0 = 1$ h, so there are 24 time intervals in each day. The passenger flow data of 32 days at the simulated station are taken as the sample set: the first 30 days are the training samples, and the 31st and 32nd days are each used as test samples. The coefficient $\alpha$ in the weighted prediction function is set to 0.4. A MATLAB simulation is used to compare the performance of the weighted neural network prediction model proposed in this paper with that of the general basic neural network prediction model. The basic neural network takes the data of the different time intervals in the first 30 days of the sample set as the training set for batch training, obtains the weight vector that satisfies the training accuracy, and then produces the output corresponding to the input test vector. The weighted neural network proposed in this paper is trained repeatedly on the first 30 days of samples, adjusting the weight vector sample by sample. After training, the test vector is input to obtain the prediction result, and the weight vector is further adjusted according to the error function for the prediction at the next moment.
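The sample layout used in the simulation (32 days of hourly data, 30 days for training and two test days) can be sketched as below. The array `flow` is synthetic placeholder data and the function name is hypothetical; the paper's experiments were run in MATLAB, so this Python sketch only illustrates the split.

```python
import numpy as np

HOURS_PER_DAY = 24  # T0 = 1 h gives 24 time intervals per day

def split_by_day(flow, train_days=30):
    """Reshape an hourly series into (days, 24) and split it by day."""
    flow = np.asarray(flow, dtype=float)
    if flow.size % HOURS_PER_DAY != 0:
        raise ValueError("series must cover whole days")
    days = flow.reshape(-1, HOURS_PER_DAY)
    return days[:train_days], days[train_days:]

# Placeholder series standing in for 32 days of hourly passenger counts.
flow = np.arange(32 * HOURS_PER_DAY)
train, test = split_by_day(flow)
day31, day32 = test  # the two test days, evaluated separately
```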
The Percentage Relative Error (PRE) is used to compare the prediction accuracy:
$$\mathrm{PRE}(k) = \frac{\left|x_p(k) - x(k)\right|}{x(k)} \times 100\%$$
where $x_p(k)$ is the value predicted by the model under evaluation and $x(k)$ is the actual passenger flow during time interval $k$.
Figure 4 shows the prediction results and the percentage error curves (interpolation fitting) of the two neural networks on the 31st day. The passenger flow on Day 31 shows no distinctive features compared with the previous 30 days, and the overall trend of the passenger flow remains consistent. The simulation results show that both the basic neural network and the weighted neural network perform well in predicting the passenger flow on Day 31, with the percentage error mostly below 10%. Overall, the weighted neural network proposed in this paper is more accurate than the general basic neural network.
Figure 5 shows the prediction results and the percentage error curves (interpolation fitting) of the two neural networks on the 32nd day. The simulated data of the 32nd day represent the passenger flow affected by short-term continuous rainfall at 3 p.m. The curve of the actual statistics shows that the passenger flow increased rapidly after the rain began at 3 p.m. but declined rapidly after a sustained period. Normally, data in such abrupt situations cannot be predicted well by the basic neural network, because it relies too heavily on the historical data and its weights are not updated after training. The weighted neural network proposed in this paper, however, takes the environmental factors into consideration and weights the historical data according to these factors, so the weight vector can be updated continually according to the error between the prediction results and the actual statistics in each time interval for the prediction during the next time interval.
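The percentage relative error used in the comparison above can be computed as in the following sketch. It assumes the absolute form of the error (the magnitude of the deviation relative to the actual value); the function name is hypothetical.

```python
def percentage_relative_error(predicted, actual):
    """Return |predicted - actual| / |actual| as a percentage."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(predicted - actual) / abs(actual) * 100.0

# A prediction of 1100 passengers against 1000 observed is a 10 % error.
pre = percentage_relative_error(1100.0, 1000.0)
```

Intervals with very small actual flow, such as late-night hours, inflate this measure, so they may need separate treatment when error curves are compared.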
The prediction results show that the weighted neural network can quickly track and predict the data as soon as they change abruptly, and the comparison of the percentage errors shows that the prediction performance of the weighted neural network is better than that of the basic neural network.
Beijing Station, the prediction target, is treated as a dynamic system of unknown structure (a black box). The goal is to build a mathematical model that can predict the drop-off passenger flow at Beijing Station. The railway system is considered a multiple-input, single-output system: the drop-off passenger flow at Beijing Station (circled in red) is the system output, and the boarding passenger flows of the 18 transfer stations (circled in other colors) in Figure 6 are the system inputs. Smart card data are recorded from 4:45 to 23:15 at a sampling interval of 15 minutes. This gives 76 data points per day, resulting in a total of 32 days of observations and 15 × 76 = 1140 data points, as shown in Figure 7. The first 1064 data points are used for model identification, and the remaining 76 data points are used for model prediction. The goal is to use the proposed improved neural network modeling method for iterative multi-step prediction.
To describe the drop-off passenger flow at the target Beijing Station, the boarding passenger flows of the 18 origin transfer stations are introduced as inputs. The next step is to determine the maximum input time lag. Suppose that passengers choose the routes with the fewest transfers, and estimate the commute time between any two stations as the sum of the in-train travel time and the transfer time. For example, if it takes 30 minutes to travel between Beijing Station and another station, this is equivalent to two time lags at the 15-minute sampling interval. In other words, the boarding passenger flow at that station less than 30 minutes earlier cannot contribute to the passengers getting off at Beijing Station and should not be included in the prediction model. However, passengers boarding at other stations may take longer than 30 minutes to reach Beijing Station, because the actual travel time of each passenger may exceed the minimum commute time (i.e., 30 minutes). To allow for delayed arrivals, a 30-minute buffer, equivalent to two time intervals, is added to the shortest commute time. The maximum time lag for the boarding passenger flow from such a station is therefore the shortest commute time (30 minutes) plus the buffer time (30 minutes), divided by the sampling interval (15 minutes), that is, four time lags. The maximum time lags for the remaining 17 stations are calculated with similar logic. The results show that each station provides three time-delayed boarding passenger flows as inputs for the passenger flow at Beijing Station. Therefore, a total of 55 candidate variables were selected.
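The lag arithmetic described above can be checked with a short sketch. The function name and the rounding-up of commute times that are not whole multiples of the sampling interval are assumptions; the 30-minute commute, 30-minute buffer, and 15-minute sampling interval are the values from the example in the text.

```python
import math

def candidate_lags(min_commute_min, buffer_min, sampling_min):
    """Return the time lags (in sampling intervals) kept for one station.

    The smallest lag corresponds to the shortest commute time, the largest
    to the shortest commute time plus the buffer.
    """
    min_lag = math.ceil(min_commute_min / sampling_min)
    max_lag = math.ceil((min_commute_min + buffer_min) / sampling_min)
    return list(range(min_lag, max_lag + 1))

lags = candidate_lags(30, 30, 15)
# lags == [2, 3, 4]: three time-delayed inputs for this station
```

For the example station this reproduces the three time-delayed boarding flows mentioned in the text, with lags of two, three, and four sampling intervals.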

VI. CONCLUSION
In this paper, a neural network is used to predict passenger flow, a prediction that can greatly benefit station security, resource allocation, and personnel deployment. The key factors affecting the passenger flow constitute the input of the neural network, and the prediction results constitute the output. The method predicts the passenger flow during the next time interval from that during the previous time interval by weighting the prediction results with the historical passenger flow data for the next time interval. The weight vector is then modified according to the error between the weighted prediction value and the actual statistical value. An improved step-size updating method is used to prevent the error function from oscillating or falling into a local minimum while the weights are updated. The simulation experiment compares the performance of the weighted neural network prediction method proposed in this paper with that of the traditional basic neural network prediction method. The experimental results show that the proposed method can better track and predict sudden changes in passenger flow caused by emergencies. In forecasting abnormal passenger flow, the most critical steps are to summarize the characteristics of railway station passenger flow, clarify the type and time distribution of passenger flow at each station, and analyze and classify the factors that cause abnormal changes in passenger flow. Finally, mathematical statistical methods are used to establish a functional relationship between the various parameters, so as to obtain an effective plan for abnormal passenger flow prediction.
ZHUCUI JING received the Ph.D. degree in computational mathematics from the Chinese Academy of Sciences, China. She is currently an Associate Professor with the School of Economics and Management, Beijing Jiaotong University. Her current research interests include innovation, leadership, and health industry.
XIAOLI YIN received the master's degree in probability theory and mathematical statistics from Shanxi University, China. She is currently an Associate Professor with the Business College of Shanxi University. Her current research interests include intelligent transportation systems, wind power prediction, and natural language processing.