A New Multipredictor Ensemble Decision Framework Based on Deep Reinforcement Learning for Regional GDP Prediction

Gross domestic product (GDP) effectively reflects the state of economic development and resource allocation in different regions. High-precision GDP prediction lays a foundation for the sustainable development of regional resources and for the formulation of economic management policies. To build an accurate GDP prediction model, this paper proposes a new multi-predictor ensemble decision framework based on deep reinforcement learning. The modeling consists of the following steps. Firstly, GRU, TCN, and DBN are used as the main predictors to train three GDP forecasting models, each with its own characteristics. Then, the DQN algorithm analyzes the adaptability of these three neural networks to different GDP datasets to obtain an ensemble model. Finally, by adaptively optimizing the ensemble weight coefficients of the three neural networks, the DQN algorithm produces the final GDP prediction results. From three groups of experimental cases from China, the following conclusions can be drawn: (1) The DQN algorithm obtains excellent results in ensemble learning, improving the prediction performance of the single predictors by more than 10%. (2) The ensemble multi-predictor regional GDP prediction framework based on deep reinforcement learning achieves better prediction results than 18 benchmark models. In addition, the MAPE value of the proposed model is lower than 4.2% in all cases.


I. INTRODUCTION
Gross domestic product (GDP) refers to the market value of all final products produced by a country or region using production factors during a certain period [1]. It is a paramount indicator of economic status and development level, as well as the core indicator of national economic accounting [2]. However, as the overall economic level improves, the rapid accumulation of economic aggregates cannot solve all social problems [3]. Research on how to properly adjust the overall planning and distribution of social resources while maintaining sustainable and stable economic development will be the key to the next stage of social development [4]. To accurately assess the economic cycle, longer periods of GDP data and other socio-economic drivers of GDP changes are used to estimate their correlations, leading to the conclusion that the economy is in growth or recession [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Alberto Cano.
In the national economy, the provincial economy has a very crucial position, and its regional GDP reflects the local economic capacity [7]. In addition to the economic growth represented by GDP, economic development or social development not only includes the content of economic growth but also means changes in an economic and social structure that accompany growth [8]. Many scholars realize that the traditional GDP only reflects the growth of the total economic volume, but does not reflect the contribution of natural resources and the social environment to the economy [9], [10]. The development of society can be examined and measured from different aspects. For GDP, the traditional economic growth model can also be adjusted from the aspects of environmental protection and quality of life to take green GDP as a measuring point. The GDP direction based on pure economic growth can also be changed to a development direction that links GDP with the indicators of longevity, health, and education [11].
In this way, all classes of society can share in social welfare while the economy develops. Therefore, GDP, as a tool for measuring economic development, needs to be further improved. It is also crucial for sustainable development and has positive value in promoting economic development, optimizing the economic structure, and improving people's living standards [12]. In a competitive world, governments can anticipate the development of the market economy and formulate development plans based on forecasting results [13], [14]. Accurate economic prediction is the crucial basis for local governments to make scientific decisions. GDP prediction systems apply scientific methods to forecast the prospects of economic phenomena and provide a scientific basis for realizing the sustainable development of the regional economy, resources, and environment [15], [16].
The paper is organized as follows: Section II introduces the applied data, the research methods, and the hybrid framework. In Section III, detailed experiments are used to test the performance improvement of the proposed model. Section IV summarizes the main contributions of the research and gives an outlook on potential research directions.

A. RELATED WORKS
Economic forecasting uses actual data and scientific data-modeling approaches to predict the future [17]. The time series forecasting method (TSFM) arranges the historical data of the forecast target into a time series and then applies quantitative forecasting methods to analyze trends over time and build mathematical models for extrapolation [18]-[20].
Current mainstream forecasting approaches can be divided into two types: parametric models and non-parametric models [21]. Parametric models mainly consist of linear regression and auto-regressive integrated moving average (ARIMA) approaches. These models are specified in advance based on statistical theory, from which the modeling parameters are calculated. They take stable time series data as input, but they fit complex nonlinear systems poorly, so their forecasts of GDP growth are not accurate enough [22]. Non-parametric models mainly include the support vector machine (SVM) and artificial neural networks, which can be regarded as artificial intelligence (AI) models [23], [24]. With the development of AI algorithms, machine learning and deep learning models are widely used because of their powerful ability to learn and fit complex data; they can use various optimization methods to update parameters and minimize the training error quickly [25], [26]. Therefore, researchers have proposed many models to conduct GDP forecasting. Abonazel and Abd-Elftah applied the ARIMA model to forecast Egyptian GDP based on historical data [27]. Ghanem et al. designed a functional link artificial neural network (FLANN) to predict electricity prices affected by COVID-19 and obtained an excellent accuracy improvement [28]. Bildirici and Sonustun used the multilayer perceptron (MLP) in a hybrid model to predict the chaotic behaviour of gold, silver, copper, and bitcoin prices [29]. This model effectively predicts the long-term behaviour of these investment vehicles.
Compared with the above models, deep learning models have broader application prospects. Chen et al. used long short-term memory (LSTM) to predict stock returns with data from the Chinese stock market [30]; the accuracy increased from 14.3% to 27.2%. Wang et al. proposed an echo state network (ESN) for electrical energy consumption forecasting [31]. Zhao et al. used the Elman neural network (ENN) in a boosting structure to predict the direct economic losses of marine disasters in China [32]. Ameyaw and Yao employed bidirectional long short-term memory (BILSTM) to predict African CO2 emissions affected by GDP [33]. Bharati et al. utilized a convolutional neural network (CNN), a vanilla neural network, and a visual geometry group-based neural network (VGG) for lung disease prediction [34]. Liu et al. proposed a gated recurrent unit (GRU) model to predict energy consumption in China [35]. Two benchmark models, namely multiple linear regression (MLR) and support vector regression (SVR), were also applied in the test. The results showed that the GRU outperformed the others with higher accuracy and lower errors. To establish forecasting models that use spatiotemporal characteristic information from multiple economic impacts, spatial-temporal predictors are employed. Wang et al. proposed a temporal convolutional network (TCN) for the short-term prediction of power system load, which displayed the best performance among the compared models [36]. In the research of Lum et al., a TCN with dilated causal convolutional layers was also used as the predictor in place of LSTM or recurrent neural networks (RNN) [37].
Each forecasting method has its own characteristics. In economic forecasting, a suitable method should be selected according to the characteristics of the data and the application scenario. In research on different GDP impacts, cross-validation can accurately analyze the inner connections of the GDP datasets and reduce forecasting errors [38], [39]. Besides, a single predictive model may reflect only part of the information on the influencing factors in GDP analysis [40], [41]. To improve the performance of single predictors and the comprehensive analysis of GDP data, optimization algorithms can be added to a hybrid framework to optimize the input features [42]. Mladenović et al. chose the firefly algorithm (FFA), a biologically inspired metaheuristic optimization algorithm, to optimize the SVM, which provided more accurate predictions of CO2 emissions than ANN [43]. Long used the genetic algorithm (GA) to improve the SVM parameters and to analyze the total GDP of Anhui province based on data from 1989 to 2007 [44]. Guleryuz used particle swarm optimization (PSO) with an adaptive neuro-fuzzy inference system (ANFIS) for industrial energy demand forecasting, which is affected by many economic and social parameters [45]. These works showed that heuristic-algorithm-based models outperform single predictors. Li et al. combined a deep belief network (DBN), modified by extracted periodicity knowledge, with the contrastive divergence (CD) algorithm and the least-squares method to optimize the hidden parameters and the output weights [46]. The hybrid model demonstrated better performance than the comparative single models. Besides, scholars have used heuristic algorithms to build integrated models to predict GDP. Although much research has been done in this field, breakthrough progress is still lacking, and model performance needs to be further improved.
VOLUME 10, 2022
Reinforcement learning (RL) has recently attracted the attention of scholars. It can increase decision-making ability and integrate multiple predictors through an agent-based optimization algorithm to obtain excellent prediction results [47], [48]. Tang et al. proposed a hybrid method combining LSTM and the deep deterministic policy gradient (DDPG) algorithm to realize epidemic prediction based on epidemic and socio-economic data [49]. The model showed excellent accuracy in tracking the real-time epidemic trend. Hu et al. applied a deep reinforcement learning-based energy management strategy (EMS) for hybrid electric vehicles [50]. The experiments verified its effectiveness against benchmarks in terms of fuel economy. Ee et al. used the deep Q-network (DQN) to enhance LSTM-based stock price prediction for decision-making to achieve maximum profit [51]. Fu et al. also proposed a DQN method for building energy consumption forecasting [52]. The deep-forest-based DQN (DF-DQN) obtained better accuracy than DDPG, decreasing MAE, MAPE, and RMSE by 5.5%, 7.3%, and 8.9%, respectively.
The above literature survey shows that it is meaningful to research ensemble optimization methods with a validation process to optimize the predictors in a hybrid model. This paper proposes a novel ensemble multi-predictor regional GDP forecasting framework based on deep reinforcement learning.

B. THE NOVELTY OF THE PAPER
To obtain an accurate prediction of GDP, a new multi-predictor ensemble decision framework based on deep reinforcement learning is proposed. The innovations and contributions of the paper are as follows: (1) Three neural networks, namely TCN, GRU, and DBN, are applied as the main predictors, each producing its own forecast. These three deep networks can learn both temporal and spatial characteristics and provide more accurate GDP forecasting results. (2) The proposed DQN algorithm adaptively controls the model parameters according to the characteristics of the GDP data and integrates the prediction results from the different predictors. It has a strong decision-making capacity and can dynamically optimize the parameters in various situations.

A. FRAMEWORK OF THE PROPOSED MODEL
Unlike ordinary time series forecasting, regional GDP forecasting is characterized by complex factors and strong human influence. Traditional models such as SVR and ELM are limited in nonlinear time series prediction. Deep neural networks have a stronger learning ability and are suitable for complex forecasting problems such as GDP forecasting. However, the deep prediction features extracted by a single deep network model are usually not comprehensive enough. This paper proposes a deep network ensemble model based on DQN. GRU, DBN, and TCN are used as base models to produce preliminary predictions of regional GDP, and DQN is adopted to optimize the model ensemble process. The model framework is displayed in Figure 1. Firstly, the economic data are analyzed and 20 GDP-relevant features are extracted. Then, GRU, DBN, and TCN are adopted to learn the deep relationship between the features and GDP and to produce preliminary prediction results. Finally, DQN is used to combine the results given by the base models and provide a more accurate GDP forecasting result.

B. ECONOMIC CHARACTERISTICS
Regional GDP forecasting is not a single-factor time series forecasting problem. It involves education, industry, employment, population, and other factors. As listed in Table 1, 20 GDP-related indicators are used as prediction features, covering four aspects: economy, population and employment, industry, and education. Historical economic indexes are the most important features. Employment and population determine purchasing power and are thus related to GDP development. Education can improve individual capacity and employment in society. Finally, industrial information can affect GDP directly. For example, industrial output, energy consumption, and transportation efficiency are all crucial parts of the economy. Therefore, the 20 features in Table 1 are reasonably selected to provide comprehensive information for GDP forecasting.

C. BASE MODELS

1) BASE MODEL I: GRU
The GRU neural network is an improvement on the LSTM neural network. Like LSTM, it effectively avoids the vanishing gradient and long-term dependence problems of the traditional RNN [53]. The structure of GRU is shown in Figure 2. The memory unit of the GRU network has only two gates, namely the update gate Z_t and the reset gate r_t. In addition, the unit state and the output are combined into one state, so training efficiency is improved while model accuracy is maintained.
The update gate Z_t determines the proportion of information transferred between hidden states, while the reset gate r_t determines how much information from the previous hidden state is forgotten. Let ch_t denote the candidate state of the current hidden layer, h_{t−1} the hidden state at the previous moment, w the weight matrix, ε the bias, and [h_{t−1}, X_t] the concatenation of two vectors; the learning process of the GRU unit then follows the formulation in [54].
The 20 features are all time series data. To simplify the training process of GRU, its time window parameter is set to 2. The features are fed into GRU to obtain the preliminary GDP forecasting results.
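For reference, the GRU update described in this subsection can be written in its widely used standard form with the symbols defined above. The gate-specific subscripts on w and ε, the sigmoid σ, and the element-wise product ⊙ are notational assumptions introduced here, and the convention for mixing h_{t−1} and ch_t differs between references, so this is a sketch rather than necessarily the exact form given in [54]:

```latex
\begin{aligned}
Z_t  &= \sigma\left(w_Z \cdot [h_{t-1}, X_t] + \varepsilon_Z\right) \\
r_t  &= \sigma\left(w_r \cdot [h_{t-1}, X_t] + \varepsilon_r\right) \\
ch_t &= \tanh\left(w_c \cdot [r_t \odot h_{t-1}, X_t] + \varepsilon_c\right) \\
h_t  &= (1 - Z_t) \odot h_{t-1} + Z_t \odot ch_t
\end{aligned}
```

In this form, Z_t close to 1 lets the candidate state replace most of the previous hidden state, while r_t close to 0 makes the candidate state ignore the previous hidden state.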

2) BASE MODEL II: DBN
DBN is a probabilistic deep learning network composed of a stack of restricted Boltzmann machines (RBMs) [55]. As shown in Figure 3, an RBM consists of a visible layer v and a hidden layer h, which serve as the data input and the feature detector, respectively. A two-layer RBM can reduce the dimension of high-dimensional features and the complexity of the data. In addition, DBN is mainly based on Bayesian ideas and can capture high-level information hidden in data that is difficult to observe directly. Its output provides a compact representation of the data. In the DBN model, given the state (v, h) of an RBM, an energy function is defined with parameters θ = {w, a, b}, where a and b are the biases of the visible layer and the hidden layer, respectively, and w is the connection weight between the two layers. Once θ is determined, the joint probability distribution of (v, h) follows from the energy function. When the state of the visible layer v is given, the activation probability of each hidden unit can be computed; likewise, when the state of the hidden layer h is given, the activation probability of each visible unit can be computed. When the number of training samples is K, the parameter θ is determined by maximizing the log-likelihood function.
When DBN is used to forecast regional GDP, it extracts the deep features of the multivariate GDP data, which enables a more comprehensive forecast of regional GDP.
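The quantities named in this subsection have well-known standard forms for an RBM with binary units, written below with θ = {w, a, b}. The unit indices i and j, the sample index k, the sigmoid σ, and the binary-unit assumption are introduced here for illustration and are not taken from the source:

```latex
\begin{aligned}
E(v, h \mid \theta) &= -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i}\sum_{j} v_i w_{ij} h_j \\
P(v, h \mid \theta) &= \frac{\exp\left(-E(v, h \mid \theta)\right)}{\sum_{v', h'} \exp\left(-E(v', h' \mid \theta)\right)} \\
P(h_j = 1 \mid v, \theta) &= \sigma\Big(b_j + \sum_{i} v_i w_{ij}\Big) \\
P(v_i = 1 \mid h, \theta) &= \sigma\Big(a_i + \sum_{j} w_{ij} h_j\Big) \\
\theta^{*} &= \arg\max_{\theta} \sum_{k=1}^{K} \ln P\big(v^{(k)} \mid \theta\big)
\end{aligned}
```

The first line is the energy function, the second the joint distribution, the third and fourth the conditional activation probabilities of hidden and visible units, and the last the log-likelihood objective over K training samples.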

3) BASE MODEL III: TCN
TCN is a neural network model combining dilated causal convolution (DCC) and residual connection (RC) [56], and it is mainly used for time series modeling or data analysis. Its basic unit is the TCN residual block, which is composed of two DCC layers connected in RC mode. The structure of the TCN residual block is shown in Figure 4. Each DCC layer is followed by weight normalization and the ReLU activation function. DCC expands the receptive field of the network, which enables the analysis of long time series. Therefore, using TCN to forecast regional GDP has an advantage in discovering potentially useful information in long-term historical GDP data. In this paper, the convolution kernel size is 2, the dilation coefficient is 1, and the receptive field is 3. The receptive field of DCC in the same layer can be expanded to 4. The dilated convolution operation is defined in [57], where X is the GDP data, f is the filter function, and q is the data length.
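To make the dilated causal convolution concrete, the following minimal NumPy sketch computes y[s] = Σ_i f[i] · X[s − d·i], which is the usual textbook form of the operation. The function names, the zero padding before the start of the series, and the sample values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def dilated_causal_conv(x, f, d=1):
    """Compute y[s] = sum_i f[i] * x[s - d*i].

    Values before the start of the series are treated as zero (an assumption),
    so the output at time s never depends on inputs later than s.
    """
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    y = np.zeros_like(x)
    for s in range(len(x)):
        for i in range(len(f)):
            idx = s - d * i
            if idx >= 0:
                y[s] += f[i] * x[idx]
    return y


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated causal convolution layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy series and an averaging filter with kernel size 2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y1 = dilated_causal_conv(x, f=[0.5, 0.5], d=1)
y2 = dilated_causal_conv(x, f=[0.5, 0.5], d=2)
```

Under this formula, two stacked layers with kernel size 2 and dilation 1 give `receptive_field(2, [1, 1]) == 3`, which agrees with the receptive field of 3 stated above for the residual block.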

D. ENSEMBLE STRATEGY BASED ON DEEP REINFORCEMENT LEARNING
A single deep network model has limitations in regional GDP forecasting. Ensemble learning can combine the advantages of different deep models to improve the stability and accuracy of regional GDP forecasts. This paper uses reinforcement learning to ensemble the above three base models. DQN is a value-based reinforcement learning method that is widely used in path planning, scheduling optimization, and other problems [58]. It uses a neural network to approximate the Q function, which improves learning efficiency and avoids the heavy memory usage of a Q table [59]. In this paper, the weights of the DBN, GRU, and TCN model outputs are optimized by DQN to achieve an integrated prediction of regional GDP. The reinforcement learning ensemble strategy is composed of five parts: agent, environment, state, action, and reward, which are described as follows.

1) STATE
The state is the current weight vector of the base models, where w_l represents the weight vector at the l-th step.

2) ACTION
To dynamically adjust the weight vector, part of the weights is adjusted at each step. The l-th action is denoted as A_l, where A_l represents the weight adjustment vector and its adjustment value x ∈ [−0.1, 0.1] is a random number.

3) REWARD
The reward is designed to improve the accuracy of regional GDP forecasts. First, the current weight vector is calculated from the previous weight vector and the action. Then, the MAPE of the final regional GDP prediction is used as the basis of the reward: the MAPE values before and after the adjustment are compared, and the reward is assigned according to the result of the comparison. Under the guidance of the reward, the agent can choose the weight values with the highest regional GDP prediction accuracy.
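The state, action, and reward described above can be sketched as a small environment step. This is a minimal illustration under stated assumptions: the weight update is taken to be additive (w_{l+1} = w_l + A_l), the reward is taken to be +1 when the MAPE decreases and −1 otherwise, and each random action changes a single weight. All function names and sample values are hypothetical.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def ensemble(weights, base_preds):
    """Linear weighted combination; base_preds has shape (n_models, n_samples)."""
    return np.asarray(weights, dtype=float) @ np.asarray(base_preds, dtype=float)


def random_action(n_models, rng, step_size=0.1):
    """Adjust one randomly chosen weight by x drawn from [-step_size, step_size]."""
    action = np.zeros(n_models)
    action[rng.integers(n_models)] = rng.uniform(-step_size, step_size)
    return action


def step(weights, action, base_preds, y_true):
    """Apply one weight adjustment and return (new_weights, reward)."""
    old_err = mape(y_true, ensemble(weights, base_preds))
    new_weights = np.asarray(weights, dtype=float) + np.asarray(action, dtype=float)
    new_err = mape(y_true, ensemble(new_weights, base_preds))
    reward = 1.0 if new_err < old_err else -1.0  # assumed +1 / -1 reward scheme
    return new_weights, reward


# Hypothetical validation targets and base-model predictions (one row per model).
y_val = np.array([100.0, 200.0])
preds = np.array([[110.0, 220.0],   # model A over-predicts by 10%
                  [90.0, 180.0],    # model B under-predicts by 10%
                  [100.0, 200.0]])  # model C is exact on this toy data
w_new, r = step([1.0, 0.0, 0.0], [-0.1, 0.1, 0.0], preds, y_val)
```

In this toy case, shifting weight from the first model to the second lowers the MAPE from 10% to 8%, so the step returns a positive reward.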

4) INTELLIGENT AGENT
DQN is used as the agent in this study. Q denotes the quality of an action. The DQN agent selects actions according to the ε-greedy strategy. The core of DQN is to build a deep network that calculates the critic value (Q-value) of actions. After a series of experiments, the deep network adopted in this study is shown in Figure 5; its state path uses three fully connected layers (FCLs). More layers would substantially increase training time and are therefore not considered. Because the weight adjustment action is performed in a single step, the action path uses only a single layer. It is worth noting that the activation functions of all fully connected layers are set to rectified linear units (ReLU) because of their high computational speed.
DQN learns from the validation set data to obtain the output weight of each base model, and the final regional GDP prediction result is obtained through linear weighted integration.
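Two generic building blocks of a DQN agent, ε-greedy action selection and the one-step temporal-difference target used to train the Q-network, can be sketched as follows. The discrete action set, the discount factor `gamma=0.9`, and the function names are assumptions made for illustration; the source does not state these settings.

```python
import numpy as np


def epsilon_greedy(q_values, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise take the argmax action."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def td_target(reward, next_q_values, gamma=0.9, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return float(reward)
    return float(reward) + gamma * float(np.max(next_q_values))


rng = np.random.default_rng(0)
q = np.array([0.1, 0.7, 0.3])                   # hypothetical Q-values for 3 actions
greedy_action = epsilon_greedy(q, 0.0, rng)     # epsilon = 0: always exploit
explore_action = epsilon_greedy(q, 1.0, rng)    # epsilon = 1: always explore
target = td_target(1.0, [0.2, 0.5])
```

During training, the Q-network output for the chosen action is regressed toward `td_target`, and ε is typically decreased over time so that the agent gradually shifts from exploring weight adjustments to exploiting the best ones found.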

III. CASE STUDY

A. GDP DATASET
Conducting multiple case studies is the key to analyzing the modeling effect of the proposed ensemble GDP forecasting model. To further evaluate its comprehensive modeling effect and generalization ability, it is essential to select representative datasets. Based on the modeling of the GDP datasets in [60] and [61], three sets of data from three provinces of China are used for the experimental analysis. The source of the dataset is the China Statistical Yearbook [62] (Note: all of the data was downloaded from www.stats.gov.cn and www.caac.gov.cn.). The dataset mainly contains quarterly GDP data of the three provinces from 2005 to 2021. The basic statistical information of the three datasets is shown in Table 2. Figures 6 to 8 show the time series characteristics of these three GDP datasets. Evaluating the stability and adaptability of the model is a very important step. In this paper, model performance is analyzed using ten-fold cross-validation, and the average of the calculated results is used as the indicator of model performance. Python 3.8.5 and TensorFlow 2.3 are the core platform and toolkit for the experimental analysis and the construction of the neural network framework.

B. PERFORMANCE EVALUATION INDEXES
Time-series regression indexes are commonly used to evaluate forecasting models such as those in this paper. Four classic indexes, namely the MAE (Mean Absolute Error), the MAPE (Mean Absolute Percentage Error), the RMSE (Root Mean Square Error), and the SDE (Standard Deviation of Error), are used to analyze the overall predictive stability of all benchmark models and the proposed ensemble model in all case studies. The indexes are calculated as in [63], where Y(T) represents the actual GDP data, Ŷ(T) represents the GDP data produced by the prediction model, and n represents the number of samples. In addition, an intuitive evaluation of the performance differences between algorithms is critical. To compare the performance of two different algorithms, the promoting percentages of the MAE (P_MAE), the MAPE (P_MAPE), the RMSE (P_RMSE), and the SDE (P_SDE) are used. These indexes are calculated following [64].
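The four indexes and the promoting percentage can be computed as in the sketch below, which uses their commonly used definitions. Taking SDE as the population standard deviation of the error and defining the promoting percentage as the relative error reduction over a baseline are assumptions made here, and the function names and sample values are hypothetical.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return MAE, MAPE (in percent), RMSE, and SDE for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "SDE": float(np.std(err)),  # population standard deviation of the error
    }


def promoting_percentage(baseline, proposed):
    """Relative reduction of an error index achieved over a baseline, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Hypothetical actual and predicted GDP values.
scores = evaluate([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
gain = promoting_percentage(baseline=5.0, proposed=4.0)
```

A positive promoting percentage means the proposed model has the lower error; for example, reducing a MAPE of 5.0 to 4.0 corresponds to a promoting percentage of about 20%.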

C. EXPERIMENTAL RESULTS AND COMPARATIVE ANALYSIS WITH BENCHMARK ALGORITHMS

1) COMPARATIVE EXPERIMENTAL RESULTS OF DIFFERENT PREDICTORS
To verify the adaptability and stability of the TCN, GRU, and DBN networks in GDP prediction modeling, three classical deep learning networks (ESN, ENN, and RNN) and three traditional shallow neural networks (BPNN, ELM, and RBF) were used to construct comparative experiments. Table 3 shows the regression statistical indexes of the prediction results of these neural networks. Figure 9 shows the scatter plots of the prediction results of three neural networks. From Table 3 and Figure 9, the following conclusions can be drawn:
(1) Compared with MLP, RBF, and ELM, the deep learning models with multiple hidden layers achieve better GDP series modeling results. This indicates that the traditional neural network is limited in extracting the deep feature information of nonlinear GDP data, which limits the prediction effect of the model to some extent.
(2) The prediction accuracy of TCN, GRU, and DBN is better than that of the other deep learning models. This demonstrates the ability of these three neural networks to analyze the deep fluctuation characteristics of GDP. The TCN neural network can effectively optimize the training ability of the model and improve the modeling effect by combining CNN and RNN structures. The GRU algorithm alleviates the gradient problem in model training and improves the stability of the model through its gating structure. The DBN algorithm effectively extracts the core feature information of the original data and optimizes the analytical capability of the model through unsupervised learning and the RBM framework. Therefore, these three neural networks have excellent GDP modeling effects.
(3) DBN, TCN, and GRU each achieve the best prediction results on different GDP datasets. However, for three GDP datasets with different fluctuation characteristics, it is difficult for a single neural network to adapt to all cases. Therefore, it is indispensable to use an ensemble learning algorithm to improve the comprehensive generalization and recognition ability of the GDP prediction model.

2) COMPARATIVE EXPERIMENTAL RESULTS OF DIFFERENT ENSEMBLE MODELS
To prove that the DQN-TCN-GRU-DBN algorithm is an excellent GDP forecasting framework, the following three comparative experiments are conducted in this section to fully evaluate the performance of DQN-TCN-GRU-DBN: Part I: DQN-TCN-GRU-DBN is compared with TCN, GRU, and DBN algorithms respectively to prove that the ensemble learning model can effectively improve the adaptability and robustness of all single predictors.
Part II: To evaluate the ensemble performance of reinforcement learning algorithm and traditional meta-heuristic algorithm in the field of ensemble learning, the DQN algorithm was compared with PSO and GA respectively.
Part III: To verify that the DQN algorithm overcomes the limitations of traditional reinforcement learning algorithms and improves the ability of weight decision, it is compared with Q-learning and Sarsa.
Table 4 shows the index evaluation results of the prediction models. Tables 5 to 7 show the promoting percentages of DQN-TCN-GRU-DBN over the other models. Figure 10 shows the loss of the different algorithms during iteration. From Tables 4 to 7 and Figure 10, the following conclusions can be drawn:
(1) Table 5 shows the comparison results between DQN-TCN-GRU-DBN and the single neural networks. Compared with TCN, GRU, and DBN, the proposed DQN-TCN-GRU-DBN model obtains more satisfactory prediction results in all cases. The comparison results verify the ability of ensemble learning to optimize single predictor results. The possible reason is that ensemble learning optimizes the weights by analyzing the modeling effects of the predictors on different datasets, which improves the adaptability of the model. The DQN algorithm improves the performance of all single predictors by more than 10 percent.
(2) Table 4 and Table 6 show the comparison of DQN's performance with the heuristic algorithms. Compared with traditional population-based heuristic algorithms, all agent-based reinforcement learning algorithms achieve better prediction results in all experiments. The experimental results show that the reinforcement learning model can effectively analyze the modeling effects of different data in the three prediction periods and obtain high-quality weight decision results. The possible reason is that reinforcement learning algorithms can improve the global comprehensive optimization ability and make weight decisions with more data characteristics by constantly training agents.
(3) Table 4, Table 7, and Figure 10 show the comparison between DQN and traditional reinforcement learning algorithms. Compared with Sarsa and Q-learning, the DQN algorithm based on deep reinforcement learning obtains more satisfactory results. This indicates that the deep reinforcement learning model can overcome the shortcomings of traditional reinforcement learning and achieve better weight analysis results. The possible reason is that the DQN algorithm uses a neural network to overcome the limited ability of the Q table to store state-action pairs.

3) SENSITIVE ANALYSIS OF THE PARAMETERS AND THE INPUT FEATURES OF THE MODEL
This section focuses on evaluating the influence of model parameters on prediction accuracy. For each parameter, this article evaluates the impact of five alternative values on the results. Figure 11 shows the prediction errors corresponding to various parameters. MAE is used as an indicator to evaluate the influence of parameters on results. Table 8 shows the parameter selection results of the proposed model, which can make the model obtain the optimal prediction accuracy. Table 9 gives the selection results of input features of different neural networks. Based on the results in Table 9, each neural network can achieve the best prediction accuracy. Table 10 shows the calculation time of different models. From Tables 8 to 10 and Figure 11, the following conclusions can be drawn:
(1) Based on Figure 11 and Table 8, it can be found that the model proposed in this paper is relatively stable. Appropriate changes in model parameters do not have a huge impact on model performance. In addition, when the model is built according to the parameters in Table 8, it can obtain the most stable and superior GDP prediction results.
(2) Based on Table 9, it can be found that industrial structure features, historical GDP data, and education have a great impact on GDP prediction results. The experimental results provide technical support for the subsequent formulation of regional policies and the optimization of industrial structures.
(3) Based on Table 10, it can be found that the calculation time of the integrated model is more than that of the

(1) Compared with the XGBoost and ARIMA algorithms, other state-of-the-art hybrid ensemble frameworks can achieve more satisfactory GDP modeling results and improve the stability of the model. This fully proves that the hybrid model framework can effectively combine the advantages of each component and establish an effective GDP forecasting framework. Therefore, the framework combining ensemble learning and deep learning has a positive application prospect in GDP prediction.
(2) The presented DQN-TCN-GRU-DBN model can achieve the best prediction accuracy in all cases. This fully proves that DQN-TCN-GRU-DBN is a modeling framework with excellent value in the field of GDP prediction. On the one hand, three deep network frameworks (TCN, GRU, and DBN) with their characteristics can respectively establish excellent GDP forecasting models. On the other hand, the DQN algorithm based on deep reinforcement learning effectively analyses the advantages of these three neural networks and

E. DISCUSSION
Based on all the comparative experimental results, the following discussion and analysis can be obtained: (1) Based on Table 3 and Figure 9, DBN, TCN, and GRU each have certain advantages in GDP prediction. However, it is difficult for a single neural network to adapt to different data sets. Therefore, the ensemble learning method can effectively improve the overall adaptability of the model. [...] (3) Table 8 and Figure 11 show the sensitivity of the model and the results of parameter selection: the stability of the model is demonstrated and the optimal parameters are selected. (4) Table 9 shows the input features selected for the different neural networks. The experimental results show that industrial structure features, historical GDP data, and education have a great impact on GDP prediction results. In the future, the government can formulate corresponding policies to improve the local economy by analyzing these results. (5) The experimental results show that the proposed model achieves better prediction results than state-of-the-art models. In general, this model has excellent research prospects in the field of GDP prediction.
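The parameter sensitivity analysis summarised with Table 8 and Figure 11 (five alternative values per parameter, scored by MAE) can be illustrated with a minimal one-at-a-time sweep. This is only a sketch under stated assumptions: the helper names `sweep_parameter` and `toy_fit_predict`, the `window` parameter, and the synthetic linear series are hypothetical stand-ins, not the paper's predictors, parameters, or data.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def sweep_parameter(fit_predict, base_params, name, candidates, y_true):
    """Vary one parameter while holding all others at their base values.

    Returns a dict mapping each candidate value to the resulting MAE.
    """
    errors = {}
    for value in candidates:
        params = dict(base_params, **{name: value})
        errors[value] = mae(y_true, fit_predict(params))
    return errors


# Synthetic series with a linear trend (illustrative only, not GDP data).
SERIES = 10.0 + 1.5 * np.arange(20)
START = 5  # first index that every candidate window can forecast


def toy_fit_predict(params):
    """Hypothetical stand-in for training a predictor and forecasting:
    predicts each point as the mean of the previous `window` values."""
    w = params["window"]
    return [SERIES[t - w:t].mean() for t in range(START, len(SERIES))]


y_true = SERIES[START:]
errors = sweep_parameter(
    toy_fit_predict, {"window": 3}, "window", [1, 2, 3, 4, 5], y_true
)
best = min(errors, key=errors.get)  # candidate with the lowest MAE
```

In this sketch a flat error curve across the candidates would correspond to the kind of stability the text reports, while the minimum of `errors` plays the role of the selected parameter value.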

IV. CONCLUSION AND FUTURE WORK
GDP is an important indicator of national and regional economic construction and of the sustainable development of society, and GDP forecasting technology provides technical support for regional governments to analyze and formulate economic policies. This paper proposes a new ensemble GDP prediction framework based on deep reinforcement learning. The core contributions and conclusions of this paper are as follows: (1) Three kinds of neural networks (TCN, GRU, and DBN), each with its own characteristics, are selected as the main predictors. Unlike traditional RNNs and shallow neural network frameworks, these three networks improve the ability to analyze the original characteristics of GDP data and achieve excellent prediction results through their special structures. (2) As the main ensemble learning method, the DQN algorithm effectively combines these three neural networks and yields a satisfactory ensemble GDP prediction framework. Compared with traditional meta-heuristic algorithms and classical reinforcement learning algorithms, DQN, which is based on deep reinforcement learning, achieves better ensemble results. Overall, the DQN algorithm improves the prediction performance of the single predictors by more than 10%. (3) To verify the overall modeling effect and stability of the DQN-TCN-GRU-DBN algorithm, 14 alternative models and 4 existing models were reproduced and compared with it. DQN-TCN-GRU-DBN achieved the best experimental results in all cases, with MAPE values of less than 5%. (4) Industrial structure features, historical GDP data, and education have a great impact on GDP prediction results. In the future, the government can formulate corresponding policies to improve the local economy by analyzing these results.
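The weighted combination of the three predictors and the improvement over single predictors described above can be made concrete with a small numerical sketch. Several points here are assumptions rather than details from the paper: the weights are fixed by hand instead of being learned by DQN, they are taken to be non-negative and normalised to sum to one, "improvement" is read as the relative reduction in MAPE against the best single predictor, and all numbers are synthetic.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def weighted_ensemble(predictions, weights):
    """Combine predictor outputs (one row per predictor) with
    non-negative weights that are normalised to sum to one."""
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()
    return weights @ predictions


def relative_improvement(baseline_error, new_error):
    """Percentage reduction of new_error relative to baseline_error."""
    return (baseline_error - new_error) / baseline_error * 100.0


# Synthetic targets and predictor outputs (illustrative only).
y_true = np.array([100.0, 110.0, 120.0, 130.0])
single = {
    "TCN": y_true * 1.04,  # overestimates by 4%
    "GRU": y_true * 0.96,  # underestimates by 4%
    "DBN": y_true * 1.02,  # overestimates by 2%
}

# Hand-picked weights standing in for the coefficients an agent would learn.
weights = [0.25, 0.50, 0.25]
ensemble = weighted_ensemble(list(single.values()), weights)

best_single = min(mape(y_true, p) for p in single.values())
gain = relative_improvement(best_single, mape(y_true, ensemble))
```

The example works because the individual errors partly cancel: predictors that over- and underestimate in opposite directions can yield an ensemble error below that of the best single model, which is the effect an adaptive weighting scheme tries to exploit.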
The GDP prediction technology proposed in this paper provides technical support for the sustainable development of society and the formulation of economic policies. In the future, this work will be extended from the following perspectives: (1) It is important to formulate regional economic development strategies based on GDP prediction results and regional policies, so that the government can realize macro-control of a regional economy according to the GDP forecast results. (2) Economic exchanges between different regions and other social behaviors also affect changes in GDP data. GDP data of other regions can therefore also be used as input features of the GDP prediction model of the target region. (3) Feature engineering algorithms such as feature extraction and feature selection can effectively optimize model inputs and improve feature quality, which further improves the modeling ability of the predictor. Such algorithms will be used to optimize the model inputs.