Period Division-Based Markov Models for Short-Term Traffic Flow Prediction

Short-term traffic flow prediction is very important and provides the basic data for traffic management and route guidance. The rules of traffic flow data during different periods in a day are different. Thus, this article proposes a membership degree-based Markov (MM) model and two period division-based Markov (PM and PW) models. The MM model introduces the membership degree to determine the state of traffic flow. The PM and PW models introduce the Fisher optimal division method to divide one day into several periods based on traffic flow data. Then, the period division-based Markov models integrate the Markov (CM) or weighted Markov (WM) model with the MM model to predict traffic volumes during different periods. The impacts of vehicle type on traffic flow prediction are also discussed. The proposed models are verified using the field data. The results show that: (1) the PM and PW models both perform better than the CM, WM, state membership degree-based Markov and weighted state membership degree-based Markov models; (2) the PW model sometimes performs better than the backward propagation (BP) neural network; (3) when traffic flow data are distinguished by vehicle type, the performance of the PM and PW models can be improved. It is suggested to adopt the proposed period division-based Markov models to predict traffic flow with the concern of vehicle type, so that more accurate traffic flow information can be provided for traffic management and route guidance.


I. INTRODUCTION
Accurate short-term traffic flow prediction is an important prerequisite for an Intelligent Transportation System (ITS) to provide reliable real-time road-traffic information to travelers and managers. Hence, researchers are paying increasing attention to short-term traffic flow prediction [1]-[3].
For precise traffic flow prediction, the rules of the traffic flow data obtained from ITS should be analyzed in detail. The variation of traffic flow on weekdays usually differs from that on weekends [4]. Furthermore, Ma et al. [5] indicated that the change rules of traffic flow data differ from one weekday to another and from one weekend day to another. For urban traffic flow, the fluctuations of data also differ between periods in a day and between vehicle types [6]. Thus, period division and vehicle type are both considered when predicting traffic flow in this research.
The associate editor coordinating the review of this manuscript and approving it for publication was Rashid Mehmood .
In general, the time interval adopted for short-term traffic flow prediction is less than half an hour. The shorter the time interval is, the more strongly traffic flow fluctuates. Whatever the time interval is, the change rules of traffic flow must be described accurately [7]. Because traffic flow fluctuates strongly over short periods, its state in the next time interval depends mainly on its current state and on the states in the several preceding time intervals. This characteristic coincides with that of a Markov chain, in which the state in the next time interval relies largely on the states in the current and several preceding time intervals [7], [8]. Since the Markov model, especially the high-order Markov model, can capture the change rules of short-term traffic flow in the process of data conversion [8], this model is used to predict short-term traffic flow.
To improve the accuracy and operability of short-term traffic flow prediction, three new Markov models are proposed in this article. To calculate the degree to which traffic flow belongs to a certain state, the membership degree-based Markov (MM) model is formulated. To consider the influence of period division on traffic flow, two period division-based Markov models are then proposed using the ordered clustering method. First, traffic flow data for different vehicle types are collected by detectors or sensors in road networks. Then, the traffic flow data in a day are divided into several parts corresponding to different periods. Next, the Markov (CM) model, the weighted Markov (WM) model, and the MM model are used to predict traffic flow during different periods. When the CM and MM models are adopted to predict the traffic volumes in one day, the resulting period division-based Markov model is called the PM model. If the WM and MM models are used, the corresponding model is named the PW model. In addition, the impacts of vehicle type on short-term traffic flow prediction are examined.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The rest of this article is structured as follows. In Section 2, the existing models and methods for short-term traffic flow prediction are summarized and reviewed. In Section 3, two previous models (i.e., the CM and WM models) and three new models (i.e., the MM, PM and PW models) are described. Specifically, the mathematical formulas are presented to explain how traffic volumes are predicted by these models. In Section 4, field data from Hefei, China are used to verify the selected and formulated models. Finally, the discussions and conclusions are outlined in Section 5.

II. LITERATURE REVIEW
Short-term traffic flow prediction, as an essential component of ITS, has been investigated by many researchers, and a large number of prediction methods have been proposed in recent years. Table 1 summarizes the related works that use single methods to predict short-term traffic flow.
Among the methods listed in Table 1, since the process of data conversion of short-term traffic flow is similar to a Markov chain, some researchers discussed the Markov model [7], [8]. The regression method, the autoregressive integrated moving average (ARIMA) model, and the Kalman filtering method are usually adopted to capture the linear rules of traffic flow [9]- [11]. However, these models cannot accurately capture the nonlinear rules of traffic flow [5]. Short-term traffic flow has strong volatility and nonlinearity, so that it is hard to get the suitable distribution and function for traffic flow [2], [5]. Thus, the support vector machine [12]- [14], neural network [15] and deep learning [16]- [22] have been widely used because of strong nonlinear fitting ability. In general, these methods need to train a large number of parameters.
Each single model has its own advantages and disadvantages when predicting short-term traffic flow. To integrate the advantages of two or more models, hybrid prediction methods have been developed. The hybrid methods can be divided into modified hybrid models and weighted hybrid models. Chen et al. [23] and Du et al. [24] both proposed modified hybrid methods in which traffic flow data are divided into two parts. The neural network and the Markov model were used to predict these two parts, and the final predicted value was obtained by summing the predicted values of the two parts. Similarly, other researchers divided traffic flow data into two parts, but adopted two of the ARIMA model, the support vector machine, the generalized autoregressive conditional heteroscedasticity (GARCH) model and the Markov model to predict them [25]-[30]. To accurately capture the change rules of short-term traffic flow, Zhang et al. [31] and Yang et al. [32] divided traffic flow data into three parts, namely the periodic trend, the deterministic part and the volatility part, and pointed out that the volatility part is extremely important for short-term traffic flow prediction. On the other hand, Guo et al. [33] proposed a weighted hybrid method by combining the neural network, the support vector machine and the random forests method. These three models were used individually to predict short-term traffic flow, and the final predicted value was the weighted sum of their predicted values. Other scholars also put forward weighted hybrid methods based on three single models, in which a neural network was used to determine the weighted factor of each single model [34], [35].
The aforementioned studies show that the Markov model is often used to predict one part of traffic flow data. However, some researchers use only the Markov model to predict traffic flow. Evans et al. [36] proposed a method for predicting breakdown based on the Markov model. The results revealed that the Markov model can accurately predict the state of the vehicle arrival distribution. To predict travel time, Yeon et al. [37] first estimated the probability of breakdown for road traffic flow using the Markov model. Then, a function relating the travel time to the probability of breakdown was constructed to calculate the travel time. Chen et al. [38] showed that the change of headway and spacing can be simulated by a Markov process, and proposed a headway and spacing prediction model based on the Markov model. Some researchers pointed out that the high-order Markov chain can accurately describe the variation of short-term traffic flow, and that the Markov model performs well for short-term traffic flow prediction regardless of whether the required data are missing [7], [8].
In practice, traffic flow prediction models and methods need to be not only accurate but also easy to operate. Li et al. [39] indicated that the accuracy of the neural network, deep learning and hybrid methods is usually higher, but their algorithms are often very complicated and a large number of parameters need to be calibrated based on big data. Generally speaking, the training time for these models and methods is quite long, so they are hard for traffic managers to operate. By comparison, the calculation for the Markov model is relatively simple since only a few parameters are involved. The Markov model can therefore provide traffic managers with stronger operability and satisfactory accuracy. Thus, the Markov model is adopted in this article.
Additionally, the characteristics of traffic flow have been discussed by some researchers when predicting short-term traffic flow. Zhang et al. [31] and Yang et al. [40] indicated that traffic flow data show a periodic trend. Hosseini et al. [4] pointed out that the rules of traffic flow differ between weekdays and weekends. Furthermore, some researchers showed that the patterns of traffic flow also differ from one weekday to another and from one weekend day to another [5], [41]. In reality, the variations of urban traffic flow during different periods in a day may differ greatly. When predicting traffic flow, He et al. [42] and Duan et al. [43] considered the impacts of time periods in a day. However, they divided one day into several time periods using qualitative analysis. Thus, the influence of period division on traffic flow prediction should be further discussed. Moreover, traffic flow often consists of several types of vehicles, and the patterns of traffic flow differ between vehicle types. The existing literature usually neglects the impacts of vehicle type on short-term traffic flow prediction [16]-[22].
To overcome the above problems, a membership degree-based Markov model and two period division-based Markov models are formulated to predict short-term traffic flow. Additionally, the impacts of vehicle type on short-term traffic flow prediction are discussed.

III. METHODOLOGY
In the first two sections, the Markov model and weighted Markov model are introduced. In the last two sections, the three new models, including the membership degree-based Markov model and two period division-based Markov models, are formulated.

A. MARKOV MODEL
The raw traffic flow dataset collected from a detector or sensor is denoted by $v = \{v(\tilde{z} - \tilde{a}, \tilde{z}) \mid 0 < \tilde{a} \le \tilde{z}\}$, where $\tilde{z}$ is a time point and $\tilde{a}$ is the sample interval. Based on $v$, the series of traffic volumes is obtained and denoted by $X = (x_1, x_2, x_3, \ldots, x_t, \ldots, x_T)$, where $x_t$ is the observed value of traffic volume at time interval $t$ and $T$ is the number of time intervals. To predict the traffic volume at time interval $t$, the traffic volumes before time interval $t$ are selected as the historical data, denoted by $\tilde{X} = (\tilde{x}_1, \tilde{x}_2, \tilde{x}_3, \ldots, \tilde{x}_{\tilde{t}}, \ldots, \tilde{x}_{t-1})$. Such a series can be divided into $I$ states, denoted by $E$. If $x_t$ lies between $a_{1i}$ and $a_{2i}$, the traffic volume at time interval $t$ is in state $i$, where $a_{1i}$ and $a_{2i}$ are the lower and upper bounds of state $i$, respectively.
To obtain the relationship between traffic volumes during two different time intervals, the transfer probability with which one state changes into another state needs to be calculated. Based on the memoryless property of the Markov chain, the probability that state $i$ transfers to state $i'$ in $w$ steps, denoted by $p^{w}_{ii'}$, can be calculated as

$$p^{w}_{ii'} = \frac{m^{w}_{ii'}}{M_i} \tag{1}$$

where $m^{w}_{ii'}$ is the number of times that state $i$ transfers to state $i'$ in $w$ steps in series $\tilde{X}$; $M_i$ is the number of occurrences of state $i$ in series $\tilde{X}$; $i, i' \in \{1, 2, 3, \ldots, I\}$; $w \in \{1, 2, 3, \ldots, W\}$; and $W$ is the largest number of transfer steps, i.e., the order of the Markov model.
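Based on transfer probability $p^{w}_{ii'}$, the $w$-step transfer probability matrix, denoted by $P^{w}$, is

$$P^{w} = \begin{pmatrix} p^{w}_{1} \\ p^{w}_{2} \\ \vdots \\ p^{w}_{I} \end{pmatrix} = \begin{pmatrix} p^{w}_{11} & p^{w}_{12} & \cdots & p^{w}_{1I} \\ p^{w}_{21} & p^{w}_{22} & \cdots & p^{w}_{2I} \\ \vdots & \vdots & \ddots & \vdots \\ p^{w}_{I1} & p^{w}_{I2} & \cdots & p^{w}_{II} \end{pmatrix} \tag{2}$$

where $p^{w}_{i}$ is a row vector. If the traffic volumes at the $W$ time intervals before time interval $t$ are selected, the numbers of transfer steps from these $W$ time intervals to time interval $t$ are $1, 2, 3, \ldots, w, \ldots, W$ from the nearest to the farthest. The corresponding states at these $W$ time intervals are the initial states transferring to the state at time interval $t$. The row vector $p^{w}_{i} = (p^{w}_{i1}, p^{w}_{i2}, \ldots, p^{w}_{iI})$, which contains the probabilities of transfer from state $i$ to every state in $w$ steps, can be extracted from transfer probability matrix $P^{w}$. Then, the modified probability matrix, denoted by $R$, can be expressed as

$$R = \begin{pmatrix} p^{1}_{i_{t-1}} \\ p^{2}_{i_{t-2}} \\ \vdots \\ p^{W}_{i_{t-W}} \end{pmatrix} = \begin{pmatrix} p^{1}_{i_{t-1}1} & p^{1}_{i_{t-1}2} & \cdots & p^{1}_{i_{t-1}I} \\ p^{2}_{i_{t-2}1} & p^{2}_{i_{t-2}2} & \cdots & p^{2}_{i_{t-2}I} \\ \vdots & \vdots & \ddots & \vdots \\ p^{W}_{i_{t-W}1} & p^{W}_{i_{t-W}2} & \cdots & p^{W}_{i_{t-W}I} \end{pmatrix} \tag{3}$$

where $p^{w}_{i_{t-w}}$ is a row vector, $i_{t-w}$ is the state at time interval $t - w$, and $p^{w}_{i_{t-w}i'}$ is the probability that state $i_{t-w}$ transfers to state $i'$ in $w$ steps from time interval $t - w$ to time interval $t$.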
State $i'$ corresponding to $p_{i'}(t) = \max \sum_{w=1}^{W} p^{w}_{i_{t-w}i'}$ is the state in which the predicted value of traffic volume lies at time interval $t$, where the maximum is taken over all states $i'$. Thus, the predicted value of traffic volume at time interval $t$ calculated by the Markov (CM) model, denoted by $\hat{x}^{CM}_{t}$, can be written as

$$\hat{x}^{CM}_{t} = \frac{a_{1i'} + a_{2i'}}{2} \tag{4}$$

where $a_{1i'}$ and $a_{2i'}$ are the lower and upper bounds of state $i'$, respectively.
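The procedure in this subsection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the `edges` list of state bounds are hypothetical, states are treated as half-open intervals, and $M_i$ is counted as the number of occurrences of state $i$ that have a successor $w$ steps later. The prediction is taken as the midpoint of the most probable state.

```python
from bisect import bisect_right


def assign_states(series, edges):
    """Map each volume to a 0-based state index.

    edges = [a_0, a_1, ..., a_I] (ascending); state i covers [a_i, a_{i+1}).
    Values outside the range are clamped to the first or last state.
    """
    last = len(edges) - 2
    return [min(max(bisect_right(edges, x) - 1, 0), last) for x in series]


def transition_matrix(states, n_states, w):
    """Estimate w-step transfer probabilities as counts divided by row totals."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states[:-w], states[w:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def predict_cm(series, edges, order):
    """Predict the next volume from the `order` most recent states."""
    states = assign_states(series, edges)
    n_states = len(edges) - 1
    scores = [0.0] * n_states
    for w in range(1, order + 1):
        p_w = transition_matrix(states, n_states, w)
        start = states[-w]  # state observed w intervals before the target
        for j in range(n_states):
            scores[j] += p_w[start][j]  # column sums of the matrix R
    best = max(range(n_states), key=scores.__getitem__)
    # Assumption: the predicted value is the midpoint of the selected state.
    return (edges[best] + edges[best + 1]) / 2


if __name__ == "__main__":
    history = [5, 15, 5, 15, 5, 15, 5, 15]
    print(predict_cm(history, edges=[0, 10, 20], order=2))
```

In this toy series the volume alternates between two states, so the predicted value falls in the lower state after a high observation.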

B. WEIGHTED MARKOV MODEL
To consider the unequal effects of different states on the predicted value, the weighted factors of initial states are introduced.
On the basis of (1)-(3), the weighted probability that the predicted value of traffic volume is in state $i'$, denoted by $p_{i'}(t) = \max \sum_{w=1}^{W} l_w p^{w}_{i_{t-w}i'}$, can be calculated from the weighted factors and the transfer probabilities [44]. In this case, the autocorrelation and weighted factors are computed by (5) and (6), where $r_w$ is the autocorrelation factor with $w$ transfer steps and $l_w$ is the weighted factor with $w$ transfer steps. Thus, the predicted value of traffic volume at time interval $t$ calculated by the weighted Markov (WM) model, denoted by $\hat{x}^{WM}_{t}$, is given by (7) [44].

C. MEMBERSHIP DEGREE-BASED MARKOV MODEL
The states divided by the Markov model are usually imprecise. For a specified state, the membership degree can be used to indicate the degree to which traffic flow is in that state. Thus, the inaccuracy of state division can be remedied [45]. In this article, the Markov model is improved by introducing the membership degree. This new Markov model is called the membership degree-based Markov (MM) model. Let $\lambda_i$ be the center point of state $i$, namely

$$\lambda_i = \frac{a_{1i} + a_{2i}}{2} \tag{8}$$

According to (8), a series of center points $(\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_i, \ldots, \lambda_I)$ can be obtained. To cover all the data with center points, this series is extended towards the left and right bounds. Thus, two new center points $\lambda_0$ and $\lambda_{I+1}$ are added, and the new series of center points $(\lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_i, \ldots, \lambda_{I+1})$ is constructed. In this case, the membership degree of $x_t$ in state $i$, denoted by $f_i(x_t)$, is calculated by (9), and the membership degree matrix, denoted by $M$, is given by (10).
As stated earlier, the transfer probability matrix with $w$ steps is given by (2). Similarly, the traffic volumes at the $W$ time intervals before time interval $t$ are selected, the corresponding states at these $W$ time intervals are the initial states transferring to the state at time interval $t$, and the row vector $p^{w}_{i} = (p^{w}_{i1}, p^{w}_{i2}, \ldots, p^{w}_{iI})$, which contains the probabilities of transfer from state $i$ to every state in $w$ steps, can be extracted from the transfer probability matrix with $w$ steps. In this case, the row vector of the membership degree-based transfer probability matrix, denoted by $r^{w}_{t}$, is given by (11), where $w_w$ is the weighted factor with which one state transfers to another state in $w$ steps and $\sum_{w=1}^{W} w_w = 1$. The membership degree-based probability matrix, denoted by $R$, is given by (12). Then, the membership degree-based probability that the predicted value of traffic volume at time interval $t$ is in state $i'$, denoted by $p_{i'}(t)$, is expressed by (13). Finally, the predicted value of traffic volume at time interval $t$ obtained by the MM model, denoted by $\hat{x}^{MM}_{t}$, is estimated by (14), where $\lambda_{i'}$ is the center point of state $i'$.
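To make the membership idea concrete, the following Python sketch computes membership degrees from the extended series of center points. Several choices here are assumptions rather than the article's formulas: the membership function is taken to be triangular between neighboring center points, the two added center points mirror the first and last centers about the outer bounds, and the membership-weighted row is formed as a weighted mixture of the rows of the $w$-step matrix. All function names are hypothetical.

```python
def center_points(edges):
    """Return (lambda_0, lambda_1, ..., lambda_I, lambda_{I+1}).

    edges = [a_0, ..., a_I] are the ascending state bounds. The two outer
    points are an assumed extension that mirrors the first and last centers.
    """
    centers = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    left = 2 * edges[0] - centers[0]
    right = 2 * edges[-1] - centers[-1]
    return [left] + centers + [right]


def membership(x, centers):
    """Assumed triangular membership degree of x in each of the I states."""
    degrees = []
    for i in range(1, len(centers) - 1):
        left, mid, right = centers[i - 1], centers[i], centers[i + 1]
        if left < x <= mid:
            degrees.append((x - left) / (mid - left))
        elif mid < x < right:
            degrees.append((right - x) / (right - mid))
        else:
            degrees.append(0.0)
    return degrees


def weighted_row(degrees, p_w, weight):
    """Assumed form of the membership-weighted row for one step count w.

    Each row of the w-step matrix p_w is scaled by the membership degree of
    its initial state, the rows are summed, and the result is multiplied by
    the step weight.
    """
    n = len(degrees)
    return [weight * sum(degrees[i] * p_w[i][j] for i in range(n))
            for j in range(n)]


if __name__ == "__main__":
    lam = center_points([0, 10, 20, 30])
    print(lam)                  # center points including the two extensions
    print(membership(10, lam))  # a volume on the boundary of two states
```

Under this assumed membership function, a volume lying exactly on the boundary between two states belongs equally to both, which is the imprecision that a hard state assignment ignores.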

D. PERIOD DIVISION-BASED MARKOV MODELS
Usually, traffic flow data at a signalized intersection show different change rules during different periods in one day. As shown in Figure 1, the fluctuations of traffic volumes differ between periods in a day. Therefore, to accurately predict traffic volumes in one day, the day needs to be divided into several suitable periods, and the traffic volumes during each period should be predicted using an effective and suitable method. Furthermore, the time sequence of a series of traffic volumes should not be disrupted when dividing one day into several periods. In this article, the Fisher optimal division method is adopted for period division.

1) PERIOD DIVISION
The Fisher optimal division method is a classification technique for an ordered series. It both minimizes the difference between data from the same period and maximizes the difference between data from two different periods. The specific steps are as below [46]. To predict the traffic volume at time interval $t$, series $\tilde{X} = (\tilde{x}_1, \tilde{x}_2, \tilde{x}_3, \ldots, \tilde{x}_{\tilde{t}}, \ldots, \tilde{x}_{\tilde{T}})$, $\tilde{T} \in \{1, 2, 3, \ldots, t - 1\}$, which includes the traffic volumes at the former $t - 1$ time intervals, needs to be selected. If a certain period in series $\tilde{X}$ contains a series of traffic volumes $(\tilde{x}_{m_k}, \tilde{x}_{m_k+1}, \tilde{x}_{m_k+2}, \ldots, \tilde{x}_{n_k})$, such a period is denoted by $g_k = (m_k, m_k + 1, m_k + 2, \ldots, n_k)$, $k \in \{1, 2, 3, \ldots, K\}$, where $m_k$ and $n_k$ are the starting and ending time intervals of period $k$, respectively, and $K \in \{1, 2, 3, \ldots, \tilde{T}\}$ is the number of periods in series $\tilde{X}$. The mean $\bar{x}_{g_k}$ and diameter $D_k(m_k, n_k)$ of $g_k$ are computed as

$$\bar{x}_{g_k} = \frac{1}{n_k - m_k + 1} \sum_{\tilde{t}=m_k}^{n_k} \tilde{x}_{\tilde{t}} \tag{15}$$

$$D_k(m_k, n_k) = \sum_{\tilde{t}=m_k}^{n_k} \left(\tilde{x}_{\tilde{t}} - \bar{x}_{g_k}\right)^2 \tag{16}$$

where $m_k \in \{1, 2, 3, \ldots, \tilde{T}\}$; $n_k \in \{m_k, m_k + 1, m_k + 2, \ldots, \tilde{T}\}$; $m_1 = 1$; $n_k = m_{k+1} - 1$; $n_K = \tilde{T}$. Series $\tilde{X}$ can be divided into $K$ periods, and the classification function, denoted by $f(\tilde{T}, K)$, can be expressed as

$$f(\tilde{T}, K) = \sum_{k=1}^{K} D_k(m_k, n_k) \tag{17}$$

A classification that minimizes the classification function is optimal. Thus, the objective function of the optimal classification, denoted by $B(\tilde{T}, K)$, is written as

$$B(\tilde{T}, K) = \min f(\tilde{T}, K) \tag{18}$$

The algorithm of period division is as below.
Step 1: The starting time interval of the last period, i.e., $m_K$, is found using the following recursion, namely

$$B(\tilde{T}, K) = B(m_K - 1, K - 1) + D_K(m_K, \tilde{T}) \tag{19}$$

Then, the $K$th period, denoted by $g_K = (m_K, m_K + 1, m_K + 2, \ldots, \tilde{T})$, is obtained.
Step 2: The starting time interval of the second last period, $m_{K-1}$, is found using the following recursion, i.e.,

$$B(m_K - 1, K - 1) = B(m_{K-1} - 1, K - 2) + D_{K-1}(m_{K-1}, m_K - 1) \tag{20}$$

Step 3: The procedure moves on to each preceding period in turn, and (20) is applied repeatedly. The obtained periods $g_k$, $k \in \{1, 2, 3, \ldots, K\}$, are the optimal periods.
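The division steps above can be sketched as a small dynamic program in Python. This is an illustrative implementation of the general Fisher optimal division idea under stated assumptions, not the authors' code: indices are 0-based, the function names are hypothetical, and ties between equally good divisions are broken in favor of the earliest starting interval.

```python
def diameter(x, m, n):
    """Sum of squared deviations of x[m..n] (inclusive) from its mean."""
    seg = x[m:n + 1]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def fisher_division(x, k):
    """Split the ordered series x into k contiguous periods.

    Returns a list of (start, end) index pairs, 0-based and inclusive, that
    minimizes the sum of the period diameters.
    """
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d[i][j] = diameter(x, i, j)
    inf = float("inf")
    # best[c][j]: minimal loss when x[0..j] is split into c periods
    # cut[c][j]: start index of the last period in that optimal split
    best = [[inf] * n for _ in range(k + 1)]
    cut = [[0] * n for _ in range(k + 1)]
    for j in range(n):
        best[1][j] = d[0][j]
    for c in range(2, k + 1):
        for j in range(c - 1, n):
            for s in range(c - 1, j + 1):
                loss = best[c - 1][s - 1] + d[s][j]
                if loss < best[c][j]:
                    best[c][j] = loss
                    cut[c][j] = s
    # Backtrack from the last period to the first one.
    periods = []
    end = n - 1
    for c in range(k, 0, -1):
        start = cut[c][end] if c > 1 else 0
        periods.append((start, end))
        end = start - 1
    return periods[::-1]


if __name__ == "__main__":
    volumes = [1, 1, 1, 10, 10, 10, 5, 5]
    print(fisher_division(volumes, 3))
```

Because only contiguous index ranges are considered, the time order of the series is never disturbed, which is the property required of the period division.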

2) MODEL FORMULATION
The time intervals of traffic volumes in the prediction day need to be selected from series $(1, 2, 3, \ldots, t, \ldots, T)$. The selected time intervals are denoted by $(z, z + 1, z + 2, \ldots, t, \ldots, Z)$, where $z \in \{1, 2, 3, \ldots, t\}$ and $Z \in \{t, t + 1, t + 2, \ldots, T\}$ are the starting and ending time intervals in the prediction day, respectively. Series $\tilde{X}$ can be divided into $K$ parts corresponding to $K$ periods, and these $K$ periods are regarded as the periods divided for the prediction day. Because the change rules of traffic flow differ between periods, different Markov models are used to predict the traffic volumes during different periods.
When the CM and MM models are both selected, the integrated Markov model is named the PM model and is expressed by (21). When the WM and MM models are both adopted, the integrated Markov model is called the PW model and is expressed by (22), where $\hat{x}^{PW}_{t}$ is the predicted value of traffic volume at time interval $t$ obtained from the PW model, and $\varphi^{WM}_{k}$ is an indicator of whether the WM model is selected for the $k$th period, with $\varphi^{WM}_{k} = 1$ if yes and $\varphi^{WM}_{k} = 0$ otherwise. The predicted values obtained from the period division-based Markov models can be calculated using the following steps.
Step 1: The series of traffic volumes $X$ and $\tilde{X}$ are obtained based on $v$.
Step 2: The traffic flow data in the prediction day are divided into $K$ parts corresponding to $K$ periods using the Fisher optimal division method.
Step 3: The MM, CM and WM models are used to predict the traffic volumes during different periods. Then, the predicted values of traffic volumes in the prediction day are obtained.
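The three steps can be sketched as a selection rule in Python. The sketch assumes that exactly one model is active in each period, so the indicator of the MM model is one minus the indicator of the CM or WM model; this complement rule and the function names are assumptions. The period bounds and the choice of model per period are the Monday passenger-car values for Detector 1 reported in Section IV-B.

```python
def period_of(t, periods):
    """Return the 0-based index of the period that contains time interval t."""
    for k, (start, end) in enumerate(periods):
        if start <= t <= end:
            return k
    raise ValueError("time interval %r is outside all periods" % (t,))


def combine_by_period(t, periods, use_mm, pred_base, pred_mm):
    """Pick the prediction of the model assigned to the period containing t.

    pred_base is the CM prediction (PM model) or the WM prediction (PW model).
    Assumption: the two indicators are complementary within each period.
    """
    k = period_of(t, periods)
    phi_mm = 1 if use_mm[k] else 0
    phi_base = 1 - phi_mm
    return phi_base * pred_base + phi_mm * pred_mm


# Six periods over 96 fifteen-minute intervals (1-based, inclusive).
PERIODS = [(1, 27), (28, 38), (39, 71), (72, 86), (87, 88), (89, 96)]
# MM model for Periods 2 to 5; CM or WM model for Periods 1 and 6.
USE_MM = [False, True, True, True, True, False]

if __name__ == "__main__":
    for t in (10, 40, 96):
        print(t, combine_by_period(t, PERIODS, USE_MM, pred_base=12.0, pred_mm=15.0))
```

The prediction values 12.0 and 15.0 in the usage lines are placeholders; in practice they would come from the CM or WM model and from the MM model for the same time interval.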

IV. CASE STUDIES
To verify the accuracy and reliability of the formulated models, four approaches located at four signalized intersections in Hefei, China are selected. The raw traffic flow data were downloaded from http://www.openits.cn/. The data, including the collection time, vehicle type, traffic volume, speed and occupancy, were obtained from microwave detectors. The time interval adopted for short-term traffic flow prediction is usually less than half an hour. Also, the peak 15-minute flow rate is usually adopted for signal timings [6]. Thus, a 15-minute time interval is used to aggregate the selected data. Then, the days in which no more than 20% of the data are missing are selected, and the missing data in these days are filled in using the simple average trend method [47]. Finally, the traffic volumes from July 11 to July 24, from August 8 to August 14, and from August 29 to September 4 in 2016 are obtained. Two vehicle types, passenger car and truck, are considered in this article, and the corresponding data obtained from the four selected microwave detectors are used as the research data. The analysis shows that atypical data do not exist in the selected data. Therefore, the impacts of atypical situations on traffic flow are not considered.
As shown in Figure 2, Detector 1 is located at the westbound approach of the intersection of Xiangbin Street and Huangshan Road; Detector 4 is located at the westbound approach of the intersection of Tianzhu Road and Huangshan Road; Detector 5 is located at the southbound approach of the intersection of Kexue Street and Tianhu Road; and Detector 6 is located at the eastbound approach of the intersection of Tianzhi Road and Huangshan Road.

A. ANALYSIS OF FIELD DATA
Traffic volume not only fluctuates with time but also depends on the selected location. Figure 3 illustrates the variations of traffic volumes for all kinds of vehicles detected by the four detectors during the four weeks. It can be seen that: (1) the change rules of traffic volumes during different weeks are very similar for the same detector, and the traffic volume during a weekday is clearly greater than that during a weekend for each detector; (2) the traffic volume in each day shows clear morning and evening peaks, and traffic volumes during different periods in one day differ greatly; (3) traffic volumes from different detectors usually differ during the same period. Figure 4 shows the variations of traffic volumes for passenger car, truck and both of them during the week from July 11 to July 17 in 2016. It can be seen that: (1) the change rules of traffic volumes during weekdays or weekends are similar, but there are some differences between different weekdays or weekends during the same period; (2) the variation of traffic volume during a weekday differs from that during a weekend, the former is stronger than the latter, and the traffic volume during a weekend is lower than that during a weekday; (3) the variations of traffic volumes differ between vehicle types. Thus, the traffic volumes during weekdays and weekends should be predicted separately.
The above analyses reveal that weekday/weekend, vehicle type, and period all impact on the change rules of traffic flow. Therefore, these three influencing factors should all be considered so as to improve the accuracy of traffic flow prediction.
To predict the traffic volumes from August 29 to September 4 in 2016, the traffic flow data collected by each detector from July 11 to July 24 and from August 8 to August 14 in 2016 are used as the historical data. Specifically, the traffic volumes in each corresponding day in the first three weeks   are regarded as the inputs in the formulated models to predict the traffic volumes in the prediction day in the fourth week. For example, the traffic volumes on the three Mondays (i.e., July 11, July 18 and August 8 in 2016) are utilized to predict the traffic volumes on the last Monday (i.e., August 29 in 2016).

B. CALIBRATION OF MODEL PARAMETERS
When using the Markov models, the data of traffic volume are divided into 9 states [32], [48], and the largest number of transfer steps is set to 9. Taking Detector 1 as an example, Table 2 lists the results of state division in different cases. Mon, Tue, Wed, Thu, Fri, Sat and Sun are the abbreviations of Monday through Sunday, respectively. Taking the traffic volume data for passenger car on Monday as an example, the traffic volumes are divided into 9 states, with [1,10), [10,20), [20,50) and [50,80) among them.
Considering the sudden change of traffic volume, it is more suitable to divide one day into 6 periods based on the ordered clustering method [49]. Thus, the number of periods is set to 6 for the PM and PW models. Taking Detector 1 as an example, Table 3 lists the results of period division in different cases. Taking the traffic volume data for passenger car on Monday as an example, there are 96 time intervals in a day, and the day is divided into 6 periods, namely 1-27, 28-38, 39-71, 72-86, 87-88 and 89-96. Since period division also relies on the selected data, the intervals of each period are not identical across days or vehicle types.
According to Figure 3, Figure 4 and Table 3, the CM and WM models are suitable for the case in which the traffic volume is relatively low and its variation is relatively stable, whereas the MM model is appropriate for the case in which the traffic volume is relatively high and its variation is relatively strong. Therefore, the CM or WM model is selected to predict the traffic volumes during Periods 1 and 6, and the MM model is used to predict the traffic volumes during Periods 2, 3, 4 and 5.
Based on the above-mentioned parameters, the Markov models can be carried out. The selected sample data are from July 11 to July 24, from August 8 to August 14, and from August 29 to September 4 in 2016. Because the sample weeks are not contiguous and traffic volumes differ between two separate days, the traffic volumes in the corresponding days of the first three discontinuous weeks are selected as the input data when the Markov models are used to predict the traffic volumes in the prediction day. When the traffic flow data are available, the traffic volumes in the adjacent and corresponding days should be treated as the input data. Otherwise, the traffic volumes in previous and similar days can be used as the input data. When the data of traffic volume are distinguished by vehicle type, the traffic volume for each vehicle type is predicted first, and the predicted traffic volume for all vehicle types is then obtained by summation.
To validate the accuracy of the three proposed models, the CM, WM, state membership degree-based Markov (SM) [45] and weighted state membership degree-based Markov (WS) models are selected as the comparative models. The mean absolute percentage error (MAPE), the mean absolute error (MAE), and the root mean square error (RMSE) are adopted to evaluate the performance of all the models. MAPE is used to evaluate the prediction accuracy. The smaller the value of MAPE is, the higher the prediction accuracy is. MAE and RMSE are used to measure the deviation between the observed and predicted values. The smaller the values of MAE and RMSE are, the smaller the deviation is. The formulas of MAPE, MAE and RMSE are as below:

$$\mathrm{MAPE} = \frac{1}{m} \sum_{t=1}^{m} \frac{\left|x_t - \hat{x}_t\right|}{x_t} \times 100\%$$

$$\mathrm{MAE} = \frac{1}{m} \sum_{t=1}^{m} \left|x_t - \hat{x}_t\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{t=1}^{m} \left(x_t - \hat{x}_t\right)^2}$$

where $\hat{x}_t$ is the predicted value of traffic volume at time interval $t$; $x_t$ is the observed value of traffic volume at time interval $t$; and $m$ is the number of time intervals selected for traffic flow prediction.
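The three performance indices can be computed with a few lines of Python. This is a minimal sketch with hypothetical function names; it uses the standard definitions of MAPE, MAE and RMSE and assumes that every observed volume is nonzero, since MAPE is undefined otherwise.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be nonzero."""
    m = len(observed)
    return 100 * sum(abs(o - p) / o for o, p in zip(observed, predicted)) / m


def mae(observed, predicted):
    """Mean absolute error."""
    m = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / m


def rmse(observed, predicted):
    """Root mean square error."""
    m = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / m)


if __name__ == "__main__":
    obs = [100, 200]
    pred = [110, 180]
    print(mape(obs, pred), mae(obs, pred), rmse(obs, pred))
```

Time intervals with zero observed volume, which can occur at night for trucks, would need to be excluded or handled separately before MAPE is computed.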

1) DIFFERENT MARKOV MODELS
Taking Detector 1 as an example, the predicted and observed values of the traffic volumes from August 29 to September 4 in 2016 are illustrated in Figure 5.  without distinguishing the data by vehicle type, whereas they are indicated in Figure 7 when distinguishing the data by vehicle type.
From Figures 6 and 7, it can be seen that: (1) the performance of the CM model is close to that of the WM model, the performance of the SM model is close to that of the WS model, and the performance of the PM model is close to that of the PW model; (2) all the performance indices obtained from the PM and PW models are the smallest, those obtained from the SM and WS models are the largest, and the MAE and RMSE values obtained from the MM model are smaller; (3) when the data are distinguished by vehicle type, the performance indices obtained from the PM and PW models will decrease in most cases. The above discussions represent that the period division-based Markov models perform the best, and the prediction accuracy can be further improved with the concern of period division and vehicle type together.
To verify the effectiveness of the formulated models, the Student's t test [50] is utilized to compare the performance indices of all the models. Figure 8 displays the results of Student's t tests at 5% significance level when weekday/weekend and vehicle type are both concerned. Any two of the seven Markov models are paired, a total of twenty-one paired models are gotten. For example, CM-WM means the paired model composed by the CM and WM models. In For all the performance indices, if the calculated t-value for a pair of models is less than the left critical t-value, the former model is significantly better than the latter one at 5% significance level; if such a calculated t-value is greater than the right critical t-value, the latter model is significantly better than the former one at 5% significance level.
Based on Figure 8, the outcomes reveal that: (1) based on MAPEs, MAEs and RMSEs, the PM and PW models both perform better than the existing Markov models and the latter outstrips the former at 5% significance level; (2) based on MAEs and RMSEs, the MM model outperforms the existing Markov models at 5% significance level; (3) the WM model does not always perform better than the CM model at 5% significance level; (4) the WS model sometimes performs worse than the SM model at 5% significance level.
To investigate whether traffic flow data should be distinguished by vehicle type, the Student's t test is also carried out for the paired scenarios without and with distinguishing the data by vehicle type. Figure 9 illustrates the results of the Student's t tests for all the models at the 5% significance level. The left and right critical t-values are −1.70 and 1.70, respectively. For each model, if the calculated t-value is larger than the right critical t-value, traffic flow data should be distinguished by vehicle type; if the calculated t-value is less than the left critical t-value, traffic flow data should not be distinguished by vehicle type. The findings indicate that, when the data are distinguished by vehicle type, all the performance indices obtained from the PM and PW models decrease significantly, the MAPE values obtained from the CM and WM models decrease significantly, and the RMSE value obtained from the WS model decreases significantly, all at the 5% significance level.
The above analyses show that: (1) the PM and PW models both perform better than the existing Markov models and the performance of the PW model is the best, which means that the prediction accuracy can be enhanced by dividing one day into several suitable periods; (2) the WM model does not always perform better than the CM model, nor does the WS model always perform better than the SM model, which suggests that the weighting factors calculated from the autocorrelation factors do not significantly improve the performance of the CM and SM models; (3) when traffic flow data are distinguished by vehicle type, the performance of the PM and PW models is clearly enhanced, which indicates that both the division of periods and the consideration of vehicle type can effectively improve the accuracy of the prediction models.

2) PW MARKOV MODEL AND BP NEURAL NETWORK
To compare the period division-based Markov models with other kinds of models, the PW model is compared with the backward propagation (BP) neural network. Table 4 lists the mean, minimum, maximum and standard deviation of the three performance indices obtained by using the PW model and the BP neural network for the four selected detectors.
According to Table 4, the outcomes reveal that: (1) the mean, minimum and maximum of the MAPEs obtained by the PW model are less than those obtained by the BP neural network in most cases, whereas the standard deviation of the MAPEs obtained by the PW model is greater than that obtained by the BP neural network in most cases; (2) the mean, minimum and maximum of the MAEs and RMSEs obtained by the PW model are larger than those obtained by the BP neural network, whereas the standard deviation of the MAEs and RMSEs obtained by the PW model is less than that obtained by the BP neural network in most cases.
The comparison indicates that the PW model often obtains the smallest MAPE, whereas the BP neural network usually obtains the smallest MAE and RMSE. The reason is that the prediction accuracy of the BP neural network is higher than that of the PW model during Periods 2, 3, 4 and 5, and the traffic volume during these periods is higher than that during the other two periods. However, the prediction accuracy of the PW model is higher than that of the BP neural network over the whole day. On the whole, the PW model does not perform worse than the BP neural network.
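The way a relative index (MAPE) and the absolute indices (MAE and RMSE) can rank two models differently may be illustrated with a small numerical sketch. The volumes and predictions below are hypothetical and are not taken from the field data; `model_x` and `model_y` are illustrative names only, and the three functions use the standard definitions of the indices.

```python
from math import sqrt


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the unit of the traffic volume."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the unit of the traffic volume."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical volumes: two low-volume intervals and two high-volume intervals.
actual = [20, 20, 400, 400]
# Hypothetical model that is accurate at low volume but 10% off at high volume.
model_x = [19, 21, 360, 440]
# Hypothetical model that is 30% off at low volume but accurate at high volume.
model_y = [14, 26, 390, 410]
```

With these numbers, `model_x` has the lower MAPE, whereas `model_y` has the lower MAE and RMSE, because the absolute indices are dominated by the errors made in the high-volume intervals.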

V. DISCUSSIONS AND CONCLUSIONS
In this article, a membership degree-based Markov model and two period division-based Markov models are proposed. On the basis of the CM model, the MM model is formulated by introducing the membership degree. Considering period division, the PM and PW models are also developed. Moreover, the influences of vehicle type on short-term traffic flow prediction are considered. The proposed MM, PM and PW models are verified, together with the CM, WM, SM and WS models, using the field data collected from the four microwave detectors over the four weeks, and the PW model is also compared with the BP neural network.
The experimental results show that the PM and PW models outperform the existing Markov models, and the PW model sometimes performs better than the BP neural network. It can be concluded that the performance of a prediction model can be improved by considering period division and vehicle type, and that period division is more important than vehicle type distinction for improving the accuracy of traffic flow prediction. To get more accurate traffic flow information, it is suggested that the period division-based Markov models be applied in practice with the data distinguished by vehicle type. When predicting traffic flow, it is necessary to divide one day into several time periods using the ordered clustering method, so that the rules of traffic flow can be captured more accurately. Meanwhile, the procedure of the proposed period division-based Markov models is also suitable for other kinds of methods.
In future research, the calibration method for ascertaining the lower and upper bounds of each traffic flow state and the largest number of transfer steps (i.e., the order of each Markov model) will be discussed. To get more detailed information, traffic flow for each direction will be further predicted. Furthermore, recent multi-source or big data (e.g., smart data, mobile data, live data) and other kinds of methods (e.g., deep learning) will be utilized to predict traffic volumes with the consideration of period division and abnormal conditions, and more existing models will be used to verify the performance of the proposed models.