A Graph Deep Learning-Based Fast Traffic Flow Prediction Method in Urban Road Networks

In modern smart cities, road networks are becoming increasingly complicated, resulting in more complex graph structures. This poses many challenges for forecasting traffic flow over road graphs. Most traditional traffic flow forecasting methods ignore the many implicit relationships inside road graphs and are therefore not well suited to modern road networks in smart cities. Besides, the operation of smart cities is accompanied by real-time big data streams, so the running efficiency of forecasting methods is another important concern. To handle these issues, this paper proposes a graph deep learning-based fast traffic flow forecasting method for urban road networks. First, the theory of graph convolution operations is derived and used as the basis of a graph convolutional network (GCN). Then, the whole road network is viewed as a complex road graph, and the GCN is introduced to establish a novel forecasting method for graph-level traffic flow. With roads regarded as nodes and their relations regarded as edges, graph-level forecasting can be realized with the proposed method. Experiments are carried out on a standard real-world dataset to evaluate the proposal, and the experimental results show that it performs well.


I. INTRODUCTION
In contemporary society, the continuous progress of industry has brought huge resources and wealth to society. With cars gradually becoming commonplace in ordinary families, urban roads all over the world are facing huge traffic flow pressure [1], [2]. Urban road networks are the lifeblood of urban economic development, and their carrying capacity determines the efficiency of economic development [3], [4]. Many countries have taken policy interventions to cope with the increasing traffic management pressure, such as traffic limitation and the promotion of green travel [5]. However, this still cannot solve the problem well, because the base amount of car ownership is always very large [6], [7]. In recent years, various countries have begun to explore the use of technical means to build intelligent traffic management systems [8], [9]. Among the requirements, the ability to accurately predict future traffic flow values is an important prerequisite for building an intelligent traffic management system [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Hassan Omar.
As stated in related works such as [11] and [12], traffic flow forecasting is of great significance to the construction of intelligent transportation systems. The results delivered by such studies can be used in numerous detailed follow-up analyses, as well as to make planning decisions and to design transport infrastructure and traffic control systems [13]. Besides, in the area of road maintenance engineering, the traffic flow values on different roads directly influence the wear status of the roads. If road management departments can predict the traffic flow of roads in advance, it will be beneficial to their future planning. Therefore, traffic flow forecasting is a significant task in many areas, and this study concentrates on smart methods that can be utilized to predict future traffic flow values [14].
Although machine learning has made great progress in traffic flow prediction in recent years, there are still some unsolved limitations [30], [31], [32]. This is because existing machine learning prediction models focus on pattern rules at the data level, ignoring the internal structural characteristics of the road network. In fact, as shown in Fig. 1, an urban road network can be viewed as a kind of complex network structure that contains different nodes and complex relationships. The structural characteristics of road networks have a great impact on the formation of traffic flow, and ignoring this point makes it difficult for a prediction model to achieve good generalization performance.
In order to deal with this problem, this paper uses a graph deep learning method to represent road networks. Graph deep learning is the fusion of graph theory and deep learning: because the graph is an important data representation format, deep learning can be applied to graph-structured data to obtain more in-depth feature representations of objects. In road network scenes, graph deep learning can be used to capture the structural characteristics of the road network, so as to establish a powerful prediction model. Therefore, this paper proposes a graph deep learning-based traffic flow forecasting model for urban road networks (named DTFUN for short). Specifically, each road is regarded as a node, and the associations between roads are regarded as the edges between nodes, so as to construct a network with a graph structure. Then, a graph convolution network is used to model the structural characteristics of this graph, so as to obtain forecasts of the subsequent traffic flow values. The main contributions of this paper can be summarized in three aspects:
• This paper argues that traffic flow forecasting needs to employ the structural characteristics of road networks.
• Graph deep learning is employed to construct a strong forecasting model for traffic flow forecasting.
• The experiments are carried out on a real dataset, so as to evaluate the practicality of the proposal.
The rest of this paper is divided into several sections. In Section II, the problem scenario is stated and some related works are surveyed. In Section III, the main technical methodology is described. In Section IV, experiments are conducted and the obtained results are analyzed, so that the performance of the proposal can be evaluated. In Section V, this paper is summarized and concluded.

A. RELATED WORK
The task of traffic flow prediction has always been an important issue in traffic management and planning, and significant achievements have been made in the past few decades. In recent years, with the development and application of deep learning methods, research on traffic flow prediction based on deep learning has gradually become a hot topic. Previous traffic flow prediction methods mostly processed traffic flow data based on specific scenarios, ignoring the spatial correlation of road networks. At present, existing methods mainly focus on the complex spatial structure of traffic road networks and the temporal dependence of flow data, and extend research on the basis of spatiotemporal characteristics.
Zhao et al. [33] proposed the temporal graph convolutional network (T-GCN) model, which explores traffic prediction performance from both the temporal and spatial dimensions. T-GCN uses a GCN to learn the spatial dependencies between roads, and a GRU model to learn the time-series dependencies of traffic data.
Wang et al. [34] proposed a spatiotemporal graph neural network for dynamic prediction of traffic flow. The spatial layer of this model is used to extract spatial relationships between traffic networks, the GRU layer is used to extract local temporal correlations of traffic data, and the Transformer layer is used to directly learn global temporal features in the data sequence.
Wang et al. [35] proposed an ST-GCN method that can predict traffic flow without historical data. On the basis of extracting spatiotemporal features using GCN and GRU models, this method incorporates the Adjacent Similar algorithm to predict traffic flow at intersections without historical data.
Lai et al. [36] introduced the NodeRank algorithm to calculate the importance of road nodes based on extracting the spatiotemporal features of traffic flow prediction tasks.
Qi et al. [37] proposed asynchronous graph convolutional networks based on federated learning (FedAGCN), starting from the accuracy and time cost of traffic flow prediction. This method designs a cloud model to aggregate the global parameters of each submodel. The whole learning task is divided into several subgraphs for learning spatial features, and federated learning is used to retain the local correlation of parameter updates.
Zheng et al. [38] proposed an STA-ED framework for the scenario of predicting the flows of different vehicle types on a traffic network. This method sequentially feeds traffic data into a Spatial Attention Layer, an LSTM Encoder, a Temporal Attention Layer, and an LSTM Decoder, and finally obtains traffic prediction values.
Duan et al. [39] focused on the spatial, temporal, and prediction cycle dimensions of traffic flow prediction. By improving the graph attention network, dynamic prediction of traffic flow was achieved.
The typical existing research works are summarized in TABLE 1.

B. PROBLEM STATEMENT
The main workflow of the proposed DTFUN is shown in Fig. 2. The whole road network is viewed as a graph-structured network, in which each road is a node and the relations among roads are edges. Let i denote the index of roads, ranging from 1 to I, and t denote the index of timestamps, ranging from 1 to T. Thus, x_i denotes the i-th road node, and X_i^(t) denotes the traffic flow value of x_i at the t-th timestamp. It is assumed that there are T timestamps in the training data. Given all the traffic flow values of the T timestamps, the goal is to train a forecasting model according to the historical data, which can be represented as follows:

X_i^(t) = f(x_i, t)

This represents that X_i^(t) is related to the features of x_i at the t-th timestamp. When the timestamp index t is given, the forecast traffic flow value at that time can be calculated. From a macroscopic view, this work manages to learn a sequential model that can generate forecasting results for traffic flow values after the T-th timestamp. The process can be represented as follows:

X^(T+1) = F( X^(1), X^(2), · · · , X^(T) )

For a road node x_i, there are two main levels of features: traffic flow values and adjacency relations. The former represents the inherent traffic flow value of the node itself, and the latter represents the implicit relations between the node and other road nodes. The adjacency can be specifically defined before modeling, and the adjacency relations between x_i and the other road nodes are represented using an adjacency matrix. Having integrated the two parts of features together, a deep representation of the road node x_i can be obtained. Then, the GCN model can be utilized for modeling and to output forecasting results. The next section presents a more detailed mathematical description of the GCN modeling process, as well as its optimization and training process.
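To make the graph construction concrete, the following minimal sketch (with a hypothetical four-road toy network and made-up flow values, not the paper's dataset) builds the adjacency matrix and the per-timestamp flow matrix described above:

```python
import numpy as np

# Hypothetical example: I = 4 road nodes, T = 3 timestamps.
I, T = 4, 3

# Illustrative edge list: which roads are considered adjacent.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Adjacency matrix A: A[i, j] = 1 if roads i and j are connected.
A = np.zeros((I, I))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# X[t, i] holds the traffic flow value of road i at timestamp t (made-up numbers).
X = np.array([
    [120.0, 80.0, 95.0, 60.0],   # t = 1
    [130.0, 85.0, 90.0, 65.0],   # t = 2
    [125.0, 90.0, 88.0, 70.0],   # t = 3
])
```

The adjacency matrix carries the structural features, and X carries the flow-value features; together they form the two-level node representation used by the GCN.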

III. METHODOLOGY
This section gives mathematical descriptions of the GCN model through two subsections. The first subsection gives some basic mathematical preliminaries, and the second gives the main derivation of the GCN model, as well as its objective function. The variables involved in this section are listed and explained in TABLE 2.

A. MATHEMATICAL FOUNDATION
The classical Fourier transform over a continuous interval can be defined as the following formula:

F(f)(ω) = ∫ f(t) e^(−jωt) dt

where F(·) denotes the Fourier transform operator, and j denotes the imaginary unit with j² = −1. The Fourier transform maps signals from the time domain into the spectral domain, while the inverse Fourier transform maps signals from the spectral domain back into the time domain. The inverse Fourier transform over a continuous interval can be represented as the following formula:

f(t) = F⁻¹(F(f))(t) = (1/2π) ∫ F(f)(ω) e^(jωt) dω

where F⁻¹(·) denotes the inverse Fourier transform operator. The Fourier transform has a very important characteristic with respect to convolution: convolution operations in the time domain become multiplication operations when mapped into the spectral domain. The following formula can be deduced:

F(f ∗ g) = F(f) · F(g)

Similarly, an expression using the inverse Fourier transform can also be deduced as follows:

f ∗ g = F⁻¹( F(f) · F(g) )

It is known that general convolution operations have high computational complexity. The essence of the Fourier transform is to express a function as a linear combination of several orthogonal basis functions. For graph-level signals, the graph Fourier transform selects the eigenvectors of the Laplacian matrix L = D − A as the basis functions. A signal on a graph with n nodes can be denoted as:

f = ( f(1), f(2), · · · , f(n) )ᵀ

Because U = (u_1, u_2, · · · , u_n), the eigenvector matrix of the Laplacian, consists of n linearly independent vectors in an n-dimensional space, the Laplacian matrix is selected for use. Hence, the graph-level Fourier transform and inverse Fourier transform can be represented as the following formulas, respectively:

F(f) = Uᵀ f,    f = U F(f)
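The graph-level transform pair above can be verified numerically. A minimal sketch (toy four-node graph, chosen for illustration only, not from the paper):

```python
import numpy as np

# Toy undirected 4-node cycle graph.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # combinatorial Laplacian L = D - A

# Eigenvectors of the symmetric Laplacian form an orthonormal basis U.
eigvals, U = np.linalg.eigh(L)

f = np.array([1.0, 2.0, 3.0, 4.0])   # a signal on the graph
f_hat = U.T @ f                      # graph Fourier transform
f_rec = U @ f_hat                    # inverse transform recovers f
```

Because U is orthonormal, applying the transform and its inverse returns the original signal, which is exactly the property the GCN derivation relies on.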

B. GRAPH CONVOLUTION NETWORK
The main idea of GCN is demonstrated in Fig. 3. Let {x_1, x_2, · · · , x_I} denote the set of I road nodes, where i ranges from 1 to I. In GCN, the basic operation is the graph convolution. The graph convolution between a graph signal f and a graph convolution filter g is represented as:

f ⊗ g = U ( (Uᵀ f) ⊙ (Uᵀ g) )

where ⊗ denotes the graph convolution operator, and ⊙ denotes the Hadamard (element-wise) multiplication. Assuming that Uᵀ g takes the form of a diagonal matrix F(g) = diag(λ_1, λ_2, · · · , λ_n), where λ_1, λ_2, · · · , λ_n are the core parameters of the graph convolution filter, the above formula can be rewritten as:

f ⊗ g = U F(g) Uᵀ f

The GCN model can be viewed as a specific instance of the above formula. Expanding F(g) with the first-order Chebyshev polynomial of the eigenvalue matrix Λ of the normalized Laplacian L = E − D^(−1/2) A D^(−1/2), i.e., F(g) ≈ β_0 E + β_1 (Λ − E), the following formula can be deduced:

f ⊗ g ≈ β_0 f − β_1 D^(−1/2) A D^(−1/2) f

where β_0 and β_1 are parameters, D is the degree matrix of the graph, and A is the adjacency matrix of the graph. The parameters in the above formula can be further reduced by setting β = β_0 = −β_1, so that the formula can be rewritten as:

f ⊗ g ≈ β ( E + D^(−1/2) A D^(−1/2) ) f

where β is the parameter to be learned, and E denotes the identity matrix. To ensure the numerical stability of GCN, the above formula is finally approximated as follows:

f ⊗ g ≈ β D̃^(−1/2) Ã D̃^(−1/2) f

where Ã and D̃ are calculated as follows:

Ã = A + E,    D̃_ii = Σ_j Ã_ij

For the road nodes at the t-th timestamp, the feature representation can be obtained as follows:

H^(t) = D̃^(−1/2) Ã D̃^(−1/2) V^(t) β

where V^(t) = ( v_1^(t), v_2^(t), · · · , v_I^(t) )ᵀ and v_i^(t) denotes the traffic flow value of x_i at the t-th timestamp. Given all the training data, the learning goal is to search for the optimal solution of the following formula:

min_Θ Σ_t Σ_i ( X_i^(t) − R_i^(t) )² + α ‖Θ‖²

where R_i^(t) denotes the real traffic flow value at the t-th timestamp, α denotes a penalty parameter that needs to be set manually, and Θ denotes the set of parameters. Then, adaptive moment estimation (Adam) is selected as the optimizer to search for solutions. After training, the parameters are learned and the forecasting model can be formulated.
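The final propagation rule can be implemented in a few lines. The sketch below (toy graph, random features, and a weight matrix generalizing the scalar β) illustrates the normalized propagation step only, not the paper's full training code:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step: D~^{-1/2} (A + I) D~^{-1/2} X W."""
    A_tilde = A + np.eye(A.shape[0])        # add self-loops: A~ = A + E
    d = A_tilde.sum(axis=1)                 # diagonal of D~
    D_inv_sqrt = np.diag(d ** -0.5)         # D~^{-1/2}
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt @ X @ W

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path graph
X = rng.normal(size=(3, 2))   # 3 nodes, 2 features each
W = rng.normal(size=(2, 4))   # learnable weights (beta generalized to a matrix)
H = gcn_layer(A, X, W)        # new node representations, shape (3, 4)
```

Each output row mixes a node's own features with those of its neighbors, weighted by the symmetric normalization, which is what lets the model exploit road-network structure.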

IV. EXPERIMENTS AND ASSESSMENT

A. BASIC SCENARIOS
The dataset is a widely used benchmark for the evaluation of traffic flow forecasting. It was released by the Caltrans Performance Measurement System, and is thus named PeMS for short. It has four typical versions, named PeMS03, PeMS04, PeMS07, and PeMS08, respectively, which mainly record traffic flow values over continuous time intervals. This work uses the PeMS03 version for evaluation.
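The aggregation of raw continuous readings into fixed-interval flow values, as used in this work, can be sketched as follows (synthetic stand-in data, not the actual PeMS03 files):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one sensor's raw per-minute readings.
idx = pd.date_range("2018-09-01", periods=60, freq="1min")
raw = pd.Series(np.random.default_rng(1).poisson(20, size=60), index=idx)

# Aggregate into 5-minute flow counts, one of the sampling intervals used later.
flow_5min = raw.resample("5min").sum()
```

Changing the resampling rule (e.g. "10min", "15min", "20min") reproduces the different sampling scenarios compared in the experiments.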
For the PeMS03 dataset, the raw data are continuous; in other words, the data can be visualized as a continuous curve over a continuous interval. The PeMS03 dataset has 358 monitoring nodes and covers the interval from Sep 1, 2018 to Nov 30, 2018. In order to transform the initial continuous values into discrete forms that can be fed into models, sampling operations are conducted at specific frequencies. To evaluate the contributions of the proposed DTFUN, five forecasting models that can be used for sequential modeling are selected as baseline methods, and the DTFUN is compared with them with respect to performance indexes. The selected approaches are named LSTM, GRU, CNN, TGCN, and STGCN, respectively. They are briefly described as follows:
• LSTM: It refers to the long short-term memory (LSTM) model, which is a typical sequential modeling method using the structure of neural computing.
• GRU: It refers to the gated recurrent unit (GRU) model, which is also a typical sequential modeling method using neural computing. It is actually a simplified variant of LSTM.
• CNN: It refers to convolution neural network (CNN) model which is a typical neural network structure. It integrates the convolution operations into the neural computing process.
• TGCN: It refers to the temporal graph convolution network (TGCN) model which is an improved GCN structure. It integrates the temporal modeling into the graph convolution operations [33].
• STGCN: It refers to the spatial-temporal graph convolution network (STGCN) model, which is also an improved GCN structure. It integrates spatial-temporal information into the graph convolution operations [34].
For performance measurement, two major metrics are employed: MAE and RMSE. They are briefly described as follows:
• MAE: It refers to mean absolute error, which calculates the average error between real values and predicted values, with each unit error measured as an absolute value. The MAE can be defined as the following formula:

MAE = (1/N) · Σ_i |R_i − P_i|

• RMSE: It refers to root mean squared error, which uses a formula similar to the Euclidean distance to measure the error between real values and predicted values. The RMSE can be defined as the following formula:

RMSE = √[ (1/N) · Σ_i (R_i − P_i)² ]

where N denotes the number of evaluated samples, R_i denotes a real value, and P_i denotes the corresponding predicted value. Besides these two metrics, another evaluation metric that combines them is introduced for comparison. This specifically developed metric is defined as the sum of the MAE value and the RMSE value, and is named SMR here. The SMR is calculated as follows:

SMR = MAE + RMSE

The dataset is divided into a training part and a testing part. The former is used to train the forecasting models, and the latter is used to evaluate the trained models. The proportion of training data is uniformly set to 80%. The model itself does not directly have hyperparameters, while the training process does. The learning rate is set to 0.001 and 0.002 to construct different scenarios. During training, the batch size is set to 32, and the epoch number is set to 10. The adaptive moment estimation (Adam) algorithm is selected as the optimizer, with its internal parameters set to their default values. As the initial dataset is in the format of continuous values, sampling operations are applied to it. Four sampling intervals are selected here: 5 minutes, 10 minutes, 15 minutes, and 20 minutes.
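The three metrics can be computed directly. A minimal sketch (toy values, for illustration only):

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error.
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    # Root mean squared error.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def smr(y_true, y_pred):
    # The paper's combined metric: SMR = MAE + RMSE.
    return mae(y_true, y_pred) + rmse(y_true, y_pred)

y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 18.0, 33.0])
print(mae(y_true, y_pred))   # 7/3 ≈ 2.333
print(rmse(y_true, y_pred))
print(smr(y_true, y_pred))
```

For all three metrics, lower values indicate better forecasting performance.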

B. RESULTS
Main experimental results (MAE and RMSE) are demonstrated in Table 3 and Table 4. The two tables each have seven lines and nine columns. The first line lists the main experimental settings, and the other lines list experimental results. The first column lists the six experimental methods, which comprise the proposal and the baselines, and the other eight columns list experimental results. Among them, the second, fourth, sixth, and eighth result columns correspond to MAE results, and the third, fifth, seventh, and ninth correspond to RMSE results. There are four groups of MAE and RMSE values in each table, corresponding to the four scenarios in which the sampling interval is set to 5 minutes, 10 minutes, 15 minutes, and 20 minutes. It can be seen from the two tables that better experimental results are achieved when the learning rate is set to 0.001, and that the five baseline methods show relatively fluctuating performance under different parameter settings. Compared with the three basic methods (CNN, LSTM, and GRU), the DTFUN always attains better experimental results. When it comes to the two GCN-based methods, TGCN and STGCN, the DTFUN may not have a significant performance advantage, because they consider more fine-grained features when modeling traffic networks. Thus, time complexity is discussed in the following contents.
In order to achieve a better visualization effect, the main results in Table 3 and Table 4 are also demonstrated in the form of curve diagrams in Fig. 4 and Fig. 5. Combining MAE and RMSE together, the metric SMR is utilized for assessment, and the related results are illustrated in Table 5 and Fig. 6. Table 5 has seven lines and nine columns. The first line lists the experimental setting conditions, which contain the two learning rate values, and the other six lines list the experimental results of the six methods. The first column lists the six experimental methods, and the other eight columns list experimental results. The main results in Table 5 are also visualized in Fig. 6, which is composed of two subfigures corresponding to the SMR results under learning rates of 0.001 and 0.002, respectively. The X-axis denotes the sampling interval, ranging from 5 minutes to 20 minutes, and the Y-axis denotes SMR values. It can be seen from both Table 5 and Fig. 6 that the proposed DTFUN performs properly in the experiments. Although DTFUN does not outperform the two graph-based temporal forecasting methods, TGCN and STGCN, it performs better than the general deep learning-based forecasting methods.
In addition to the forecasting performance, it is expected to examine the time complexity of the experimental methods. It has been mentioned that I denotes the number of road nodes, T denotes the number of timestamps, and n denotes the number of feature factors in the graph convolution. Besides, the hidden size of both LSTM and GRU is denoted as S_h, the size of the adjacency matrix in GCN is denoted as S_A, and the epoch number of all experimental methods is denoted as S_e. In the experiments, three evaluation metrics are used to assess the forecasting performance of the proposed DTFUN. It can be seen from the results that DTFUN performs better than general deep learning-based forecasting methods, but when it comes to recent graph convolution-based methods, the DTFUN may not show obvious performance superiority. Thus, a further discussion of time complexity is made. It can be seen from TABLE 6 that DTFUN has lower time complexity than the other two graph convolution-based methods. The DTFUN has a complexity level of O[I · T · S_A · S_e], the TGCN has a complexity level of O[(I + S_h) · S_h · T · S_A · S_e], and the STGCN has a complexity level of O[(I + S_h) · S_h · T · S_atten · S_A · S_e]. The latter two methods have larger time complexity than the DTFUN. Considering that this paper focuses on a fast traffic flow forecasting method, technical methods are required to have lower time complexity and better practicality. From this point of view, the DTFUN still has an overall advantage compared with the baseline methods.

V. CONCLUSION
Graph-level forecasting is a promising means of traffic flow prediction in modern urban road networks, and it can be expected to improve the forecasting effect of traditional methods. This paper discusses the feasibility of graph convolution theory, and introduces a graph deep learning method named GCN to construct a graph-level forecasting method for traffic flow, named DTFUN for short. In the experiments, it is compared with five baseline methods on a real standard traffic flow dataset, and the obtained results verify the proposal's performance. Several conclusions can be drawn from the experiments.
• The GCN can perform better than other general deep learning-based forecasting methods, because its design fits the structure of road networks well.
• The GCN can reduce forecasting error by about 5%-10% compared with several typical deep learning-based forecasting methods.
• The GCN can serve as a baseline for traffic flow forecasting tasks, and it can be extended for realistic engineering applications.
In future works, the authors are going to pay attention to a new kind of network entity known as social vehicular networks. In such networks, more personalized services, such as traffic path recommendation [40], can be provided for users according to traffic flow forecasting results.
DONGFANG YANG received the B.Sc. degree in computer science and technology and the M.Sc. degree in computer application technology from Zhengzhou University, in 2005 and 2012, respectively. She is currently an Associate Professor with the School of Information Engineering, Zhengzhou Shengda University, China. Her research interests include edge intelligence, intelligent transportation systems, and big data.
LIPING LV received the B.Sc. degree in industrial automation from the Qingdao Institute of Architecture and Engineering, in 2001, and the M.Sc. degree in control theory and control engineering from the Qingdao University of Science and Technology, in 2005. She is currently a Professor with the School of Information Engineering, Zhengzhou Shengda University, China. Her research interests include information processing, edge intelligence, wireless sensor networks, and artificial intelligence.