Air Quality Prediction Using Improved PSO-BP Neural Network

Predicting urban air quality is a significant aspect of preventing urban air pollution and improving the living environment of urban residents. The air quality index (AQI) is a dimensionless tool for quantitatively describing air quality. In this paper, a method for optimizing the back propagation (BP) neural network based on an improved particle swarm optimization (PSO) algorithm is proposed to predict the AQI. The improved PSO algorithm optimizes the variation strategy of the inertia weight as well as the learning factors, guaranteeing its global search ability during the early stage and enabling its fast convergence to the optimal solution during the later stage. We introduce an adaptive mutation algorithm during the search process to prevent the particles from falling into a local optimum. Analysis and comparison of the experimental results show that the BP neural network optimized using the improved PSO algorithm achieves a more accurate prediction of the AQI.


I. INTRODUCTION
Population concentration, climate change, industrial production and other factors have lowered the air quality in many parts of China. At present, numerous problems remain; for example, the awareness of environmental protection among the citizenry needs to be strengthened, and the level of air quality monitoring and control is low, limiting progress in the regulation of air pollution [1]-[3]. However, with improvements in living standards and economic development, air quality in China has become an increasing concern among the public, and greater accuracy in air quality prediction has become an urgent need.
Some statistical methods, such as the autoregressive integrated moving average (ARIMA), have been widely used in air quality prediction. However, these linear models may not obtain a reliable prediction if the sequence is nonlinear or irregular. In recent years, support vector regression (SVR) [4] has been applied in nonlinear regression forecasting. However, SVR with implicit kernel mapping such as the RBF kernel [5] may not achieve an air quality forecasting model with a good performance, because the data used by air quality predictors are massive and complex, which may result in overfitting. Compared with the above methods, the neural network is characterized by large-scale parallel processing, a high learning ability and a high non-linearity, and is more suitable for air quality prediction.
Although the back propagation (BP) neural network has numerous advantages, its disadvantages are also obvious: it is apt to fall into a local minimum, requires long-term learning, and achieves a low convergence speed [6]-[10]. Experts from around the world have noticed these problems and have put forward numerous suggestions for an improved performance. Tang et al. proposed introducing an adaptive learning rate into the BP neural network, reducing the learning time [11]. In addition, Iba et al. introduced the genetic algorithm into the feedforward neural network, which improves the adaptability of network training [12], and Yao et al. applied an adaptive increase and decrease algorithm to select the structure of the network, stabilizing the network training more effectively [13]. The application of standard particle swarm optimization (PSO) in the BP neural network can reduce the learning time and increase the calculation accuracy [14]-[18]. However, the convergence speed of standard PSO is slow, and there are problems of local optimization and premature convergence, resulting in an inaccurate weight selection [19]-[22].
We improved the PSO algorithm accordingly, optimized the overall prediction performance of BP neural network, adjusted the change strategy of the inertia weight as well as the learning factor, and ensured the diversity of particles during the early stage and the fast convergence to the global optimal solution. An adaptive mutation algorithm is also introduced during the search process to avoid particles from being trapped in the local optimum.

II. PARTICLE SWARM OPTIMIZATION AND ITS IMPROVEMENT
A. STANDARD PARTICLE SWARM OPTIMIZATION ALGORITHM
PSO simulates the social behavior of a biological population such as a bird flock. Each bird is regarded as a particle that, through iterations, shares information, combines its own experience, and continuously improves its behavior using both individual and group information. PSO first initializes the particles, which are then continually updated during the iterations based on the individual extremum pbest and the global extremum gbest [23]. Here, pbest is the best solution found by the particle itself, and gbest is the best solution found by the entire swarm. Suppose a population of m particles in a D-dimensional target search space [24]; the vector X_i = (x_{i1}, x_{i2}, ..., x_{id}, ..., x_{iD}) is the position of particle i, and V_i = (v_{i1}, v_{i2}, ..., v_{id}, ..., v_{iD}) is its velocity. Each particle updates its velocity and position according to the following formulas [25]-[28]:

v_{id}^{k+1} = w v_{id}^{k} + c_1 r_1 (p_{id} - x_{id}^{k}) + c_2 r_2 (p_{gd} - x_{id}^{k}),   (1)
x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1},   (2)

where k is the number of the current iteration; v_{id}^{k} and v_{id}^{k+1} are the velocities of the d-dimensional component of particle i at iterations k and k+1, respectively; x_{id}^{k} and x_{id}^{k+1} are the corresponding positions; p_{id} is the individual extremum of particle i in dimension d, and p_{gd} is the global extremum of all particles in dimension d; c_1 and c_2 are non-negative learning factors; and r_1 and r_2 are random numbers uniformly distributed in [0, 1]. Each velocity component is limited to [-v_max, v_max], where v_max is a constant that prevents particles from escaping the solution space.
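As an illustration, one iteration of the standard velocity and position update can be sketched in Python as follows; the parameter values here are illustrative defaults, not values from this paper:

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.9, c1=2.0, c2=2.0, v_max=1.0, rng=None):
    """One velocity/position update for a swarm.

    x, v, pbest: arrays of shape (m, D); gbest: array of shape (D,).
    w, c1, c2, v_max are illustrative values, not taken from the paper.
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)                      # per-component random r1
    r2 = rng.random(x.shape)                      # per-component random r2
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v_new = np.clip(v_new, -v_max, v_max)         # keep velocity in [-v_max, v_max]
    x_new = x + v_new
    return x_new, v_new
```

Clipping the velocity rather than the position is the convention described in the text: v_max bounds how far a particle can move per iteration, keeping it inside the solution space.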

B. IMPROVEMENT OF PARTICLE SWARM OPTIMIZATION ALGORITHM
1) IMPROVEMENT IN THE INERTIA WEIGHT
The inertia weight adjusts both the global and local search abilities of the algorithm. In the standard PSO algorithm, the inertia weight decreases linearly, which gives the algorithm a strong global exploration ability in the initial stage of iteration and a strong local search ability during the later stage, but it tends toward ''premature'' convergence. We therefore adopted a two-phase method to reduce the inertia weight coefficient. A variation diagram of the inertia weight is shown in Fig. 1.
During the initial search phase, the inertia weight coefficient decreases nonlinearly, which gives the algorithm a stronger global search capability during this stage and lets it enter the local search as soon as possible. After k iterations, the inertia weight coefficient decreases linearly, which allows the algorithm to converge stably to the optimal solution. The adjusted inertia weight is

w(t) = l_1(t), t <= k;    w(t) = l_2(t), t > k,

where t is the number of iterations; w_max and w_min are the maximum and minimum values of the inertia weight coefficient, respectively; l_1(t) is a nonlinear function; l_2(t) is a linear function; d is the inertia weight at the end of the initial search, i.e., l_1(k) = l_2(k) = d; and t_max is the maximum number of iterations.
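The piecewise schedule can be sketched in Python. The text does not reproduce the exact nonlinear form of l_1(t), so a quadratic decay is assumed here purely for illustration; the endpoint values w_max = 1 and w_min = 0.5 match the parameters reported in the experiments:

```python
def inertia_weight(t, t_max, k, w_max=1.0, w_min=0.5, d=0.7):
    """Piecewise inertia-weight schedule: nonlinear decay for t <= k,
    then linear decay from d down to w_min for t > k.

    The nonlinear segment l1(t) is an assumed quadratic form; the paper
    states only that it is nonlinear. d (= 0.7 here) is the assumed
    weight at the handover point t = k, so the schedule is continuous.
    """
    if t <= k:
        # assumed nonlinear segment: decays from w_max at t = 0 to d at t = k
        return d + (w_max - d) * (1 - t / k) ** 2
    # linear segment: from d at t = k down to w_min at t = t_max
    return d - (d - w_min) * (t - k) / (t_max - k)
```

By construction l_1(0) = w_max, l_1(k) = l_2(k) = d, and l_2(t_max) = w_min, matching the description above.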

2) IMPROVEMENT IN THE LEARNING FACTORS
To preserve the diversity of the particles during the initial search phase and to converge to the global optimal solution as soon as possible during the later stage, the parameters c_1 and c_2 are dynamically adjusted using the tangent function, Eq. (7), based on an analysis of the influence of the change in the learning factors; this better balances the global and local searches. The curves of parameters c_1 and c_2 are shown in Fig. 2.
As we can see from the figure, during the initial stage of the search, c_1 is larger than c_2, and each particle pays attention to its own historical information to ensure diversity. During the later stage, c_1 decreases whereas c_2 increases, making the particles pay more attention to the social information of the group to maintain a fast convergence.
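A sketch of the tangent-based schedules follows. Eq. (7) itself is not reproduced in the text, so a tan() interpolation over [0, pi/4] is assumed here purely for illustration; the endpoint values are the parameters reported in the experiments (c_1: 2.5 to 0.5, c_2: 1 to 3):

```python
import math

def learning_factors(t, t_max, c1_start=2.5, c1_end=0.5, c2_start=1.0, c2_end=3.0):
    """Tangent-based schedules: c1 decreases, c2 increases over the run.

    The exact form of Eq. (7) is an assumption; tan rises from 0 to 1
    as its argument goes from 0 to pi/4, giving a slow-then-fast shift
    from self-experience (c1) toward social experience (c2).
    """
    s = math.tan((t / t_max) * (math.pi / 4))  # rises from 0 to 1 over the run
    c1 = c1_start - (c1_start - c1_end) * s
    c2 = c2_start + (c2_end - c2_start) * s
    return c1, c2
```

With this shape, c_1 stays near c1_start for longer than a linear schedule would, preserving diversity early before convergence dominates.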

3) ADAPTIVE MUTATION PARTICLE SWARM OPTIMIZATION
During the iterative process, the standard PSO algorithm easily falls into a local extremum, and the population then loses its global search ability. Borrowing the ''mutation'' operation of the genetic algorithm, we mutate one dimension of a particle with a certain probability, adjusting its position so that it enters other regions to continue the search. Doing so effectively expands the search range and helps the algorithm reach the global optimal solution. This is the basic idea behind adaptive mutation PSO, the formula of which is as follows: where p(i, k) represents the mutation operation on dimension k of particle i of the population. A mutation occurs when a random number within [0, 1] is greater than 0.95, and rand is a random value within [0, 1].
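The mutation operation can be sketched directly from this description; particle positions are assumed here to be scaled to [0, 1] so that re-randomizing a dimension stays in the search space:

```python
import numpy as np

def mutate(swarm, threshold=0.95, rng=None):
    """Adaptive mutation: with probability 1 - threshold, re-randomize one
    dimension of a particle so the swarm can escape a local optimum.

    swarm: array of shape (m, D) of positions assumed scaled to [0, 1].
    The 0.95 threshold follows the text.
    """
    rng = rng or np.random.default_rng()
    swarm = swarm.copy()
    for i in range(swarm.shape[0]):
        if rng.random() > threshold:          # mutation triggered for particle i
            k = rng.integers(swarm.shape[1])  # pick one dimension k to mutate
            swarm[i, k] = rng.random()        # p(i, k) <- rand in [0, 1)
    return swarm
```

Because only one dimension of one particle changes per mutation, the swarm's accumulated pbest/gbest information is largely preserved while the search range is expanded.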

III. IMPROVED PARTICLE SWARM ALGORITHM TO OPTIMIZE BP NEURAL NETWORK
A. DETERMINING STRUCTURE OF NEURAL NETWORK
First, the BP neural network was given a three-layer structure, and the numbers of input-layer neurons n_1 and output-layer neurons n_3 were determined according to the numbers of inputs and outputs. Second, the number of neurons in the hidden layer, n_2, was determined based on the empirical formula, choosing the value that minimizes the error of Formula (9). Here, η_k is the threshold of the output layer, and θ_j is the threshold of the hidden layer. The connection weight between the input layer and the hidden layer is defined as W_ij, and the connection weight between the hidden layer and the output layer is defined as V_jk; f_0 is the sigmoid activation function of the hidden layer, and f_1 is the linear activation function of the output layer. A schematic diagram of the BP network applied is shown in Fig. 3 [29].
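The text invokes "the empirical formula" for the hidden-layer size without reproducing it; a widely used variant is n_2 = sqrt(n_1 + n_3) + a with a in [1, 10], and that variant is assumed in this sketch. Assuming six pollutant inputs and one AQI output, it enumerates the candidate sizes that would each be trained, keeping the one that minimizes the error (9 nodes in the experiments of Section IV):

```python
import math

def hidden_candidates(n1, n3, a_range=range(1, 11)):
    """Candidate hidden-layer sizes from the common empirical formula
    n2 = sqrt(n1 + n3) + a, a in [1, 10].

    This formula is an assumption: the paper cites "the empirical
    formula" without stating it. Each candidate n2 would then be
    trained, and the one with the minimum training error kept.
    """
    return sorted({round(math.sqrt(n1 + n3)) + a for a in a_range})
```

For n_1 = 6 and n_3 = 1 this yields candidates 4 through 13, a range that contains the 9 hidden nodes the experiments select.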

B. OPTIMIZATION OF BP NEURAL NETWORK USING IMPROVED PARTICLE SWARM OPTIMIZATION ALGORITHM
The mean square error (MSE) over the network training set is taken as the fitness function used to calculate each particle's fitness value, i.e., Eq. (11), and the minimum error value E_min is determined from the fitness function f(x) = E(x).
In Eq. (11), y_i and ŷ_i are the target and predicted values, respectively. The smaller the MSE, the more accurate the model. We update the velocity V_i of the particles in each dimension until the training error is less than E_min or the number of iterations reaches t_max. If the error after training does not meet E_min, the weights and thresholds can be further adjusted until this condition is met. Fig. 4 shows the flow of the improved PSO algorithm for the BP neural network.
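Eq. (11) is the standard mean square error between the target values y_i and the predicted values ŷ_i; a direct sketch:

```python
import numpy as np

def fitness(y_true, y_pred):
    """MSE fitness E(x) as in Eq. (11): mean squared error between
    target values y_i and predicted values y_hat_i over the training set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))
```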
The main steps are described as follows:
Step 1: The BP neural network topology is determined from the training sample data.
Step 2: The particle velocities, positions, individual extrema and global extremum are initialized.
Step 3: An appropriate fitness function is selected to evaluate the fitness value of each particle.
Step 4: The fitness value of each particle is evaluated. If this value is better than the particle's individual optimal solution, the individual extremum pbest is updated with the current value; if it is also better than the global best, the global extremum gbest is replaced in the same way.
Step 5: The velocity and position of each particle are updated.
Step 6: If the number of iterations has not reached the set maximum and the training error is still above the set target, return to Step 3.
Step 7: Using the improved PSO algorithm, the optimal weights and thresholds obtained are assigned to the BP neural network for training and learning.
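The steps above can be sketched as a single loop in Python. Here `fitness` stands in for the BP training MSE of Eq. (11), and the linear parameter schedules are simplified placeholders for the improved schedules of Section II; the endpoint values for c_1, c_2, w follow the experimental settings:

```python
import numpy as np

def improved_pso(fitness, dim, m=20, t_max=100, v_max=0.5, e_min=1e-6, seed=0):
    """Sketch of Steps 2-7: optimize a weight/threshold vector by PSO.

    `fitness` maps a candidate vector to its training MSE. The w, c1, c2
    schedules are simplified linear stand-ins for the improved variants.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (m, dim))               # Step 2: init positions
    v = rng.uniform(-v_max, v_max, (m, dim))       # ...and velocities
    f = np.array([fitness(p) for p in x])          # Step 3: evaluate fitness
    pbest, pf = x.copy(), f.copy()
    gbest = x[f.argmin()].copy()
    for t in range(t_max):
        w = 1.0 - 0.5 * t / t_max                  # inertia weight: 1.0 -> 0.5
        c1 = 2.5 - 2.0 * t / t_max                 # c1 decreases: 2.5 -> 0.5
        c2 = 1.0 + 2.0 * t / t_max                 # c2 increases: 1.0 -> 3.0
        r1, r2 = rng.random((m, dim)), rng.random((m, dim))
        v = np.clip(w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x),
                    -v_max, v_max)                 # Step 5: update velocity
        x = x + v                                  # ...and position
        f = np.array([fitness(p) for p in x])      # Steps 3-4: re-evaluate
        better = f < pf
        pbest[better], pf[better] = x[better], f[better]
        gbest = pbest[pf.argmin()].copy()
        if pf.min() < e_min:                       # Step 6: termination checks
            break
    return gbest, float(pf.min())                  # Step 7: best weights found
```

In Step 7 the returned vector would be unpacked into the network's weights (W_ij, V_jk) and thresholds (θ_j, η_k) before the final BP training.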

IV. PREDICTION OF AIR QUALITY INDEX MODEL OF IMPROVED PARTICLE SWARM ALGORITHM TO OPTIMIZE THE BP NEURAL NETWORK
A. DATA SELECTION
The air quality data used in this paper are from the China air quality monitoring and analysis platform and include the daily averages of fine particulate matter (PM2.5), inhalable particulate matter (PM10), ozone (O3), NO2, CO and SO2 [30]-[34], as well as the Chongqing air quality index (AQI), for all of 2018-2019. After deleting invalid and missing data, a total of 10,272 records were collected. The 9,844 records from January 1, 2018 to October 31, 2019 were used as training samples, and the 428 records from November 1 to 30, 2019 were used as testing samples. A portion of the data is shown in Table 1.

B. AQI PREDICTIVE SIMULATION AND RESULTS ANALYSIS
1) PREDICTION OF AQI VALUES BASED ON BP NEURAL NETWORK
The MATLAB 2018a platform was used for the simulation experiments, together with Python programming and a MySQL database. The trainlm function was selected as the training function of the BP neural network, the sigmoid function as the transfer function of the hidden layer, and the purelin function as the transfer function of the output layer. The maximum number of training iterations was 4,000, the learning rate was 0.01, and the target error was 10^-7. The training error was minimized when the hidden layer had 9 nodes. The mean absolute percentage error (MAPE) was used to evaluate the performance of the prediction model [35].
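MAPE is the mean of |y_i - ŷ_i| / y_i expressed as a percentage; the accuracies reported in this section are consistent with a reading of accuracy = 100 - MAPE, though that reading is an inference. A direct sketch:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes y_true has
    no zero entries (AQI values are strictly positive)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```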
Using the standardized training and testing samples and the network parameters set above, the BP neural network was trained and tested, achieving an accuracy of 92.84%.
As we can see from Fig. 5 and Fig. 6, the BP neural network prediction model has a good overall prediction ability for the AQI, although the error of the single-point predictions is large; hence, the model still needs further improvement.

2) PREDICTION OF THE AQI INDEX BASED ON PSO NEURAL NETWORK AND IMPROVED PSO NEURAL NETWORK
In this paper, multiple linear regression, the PSO-BP neural network and the improved PSO-BP neural network are compared in terms of prediction accuracy. The results of the multiple regression experiments are shown in Fig. 7 and Fig. 8; the prediction accuracy is 97.14%, much better than that of the BP network, although the accuracy still needs to be improved. Next, the BP neural network was optimized using standard PSO; as shown in Fig. 9, the prediction accuracy is 98.04%. Finally, the improved PSO-BP neural network was trained with the parameters set as follows: c_1_start = 2.5, c_1_end = 0.5, c_2_start = 1, c_2_end = 3, w_min = 0.5 and w_max = 1. The experimental results of the improved PSO-BP neural network are shown in Fig. 11 and Fig. 12; the prediction accuracy is 99.03%.
From Fig. 13, we can see that the improved PSO algorithm in this paper is better than the standard PSO algorithm in terms of the convergence accuracy, speed and optimization results. After optimization, the average optimal fitness value of the improved PSO algorithm is lower than that of the standard PSO algorithm.

3) RESULTS ANALYSIS
The prediction error statistics of the BP neural network, multiple linear regression, PSO-BP neural network and improved PSO-BP neural network are shown in Table 2. The total error value of the improved PSO-BP neural network is only 25.91, which is far lower than those of the other models. At the same time, the prediction accuracy of the improved PSO-BP neural network is 99.03%, which is 6.19, 1.89 and 0.99 percentage points higher than that of the BP neural network, multiple linear regression and PSO-BP neural network, respectively, indicating that the model achieves a better prediction performance. In addition, the simulation results in Fig. 5, Fig. 7, Fig. 9 and Fig. 11 show that the predicted values of the improved PSO-BP neural network are closest to the real values.
The simulation results show that the BP neural network, which is optimized by the improved PSO-BP algorithm, has an excellent learning ability.

V. CONCLUSION
In this study, an improved PSO algorithm was used to optimize the BP neural network and predict the AQI. The PSO algorithm was improved and combined with the BP neural network to establish the optimized prediction model. Compared with a traditional BP neural network, the model does not easily fall into a local minimum and achieves a better search ability. The simulation results show that the network model achieves a high accuracy in predicting the AQI and is promising for application.
RUIXIAO ZHAO was born in Henan, China, in 1996. She received the bachelor's degree from the Xinxiang University of Science and Technology, in 2018. She is currently pursuing the master's degree with the School of Information and Electrical Engineering, Hebei University of Engineering. Her research interests include data mining and machine learning.
ZHE CHENG was born in Hebei, China, in 1995. He received the bachelor's degree from the Hebei Normal University of Science and Technology, in 2017. He is currently pursuing the master's degree with the Hebei University of Engineering. His research interests include machine learning and natural language processing.