Study on Prediction Model of HIV Incidence Based on GRU Neural Network Optimized by MHPSO

Acquired Immune Deficiency Syndrome (AIDS) is still one of the most life-threatening diseases in the world. Moreover, new infections are still potentially increasing. This difficult problem must be solved. Early warning is the most effective way to solve this problem. Here, we aim to determine the best performing model to track the epidemic of AIDS, which will provide a methodological basis for testing the time characteristics of the disease. From January 2004 to January 2018, we built four computing methods based on AIDS dataset: BPNN model, RNN model, LSTM model and MHPSO-GRU model. Compare the final estimated performance to determine the preferred method. Result. Considering the root mean square error (RMSE), mean absolute error (MAE), mean error rate (MER) and mean absolute percentage error (MAPE) in the simulation and prediction subsets, the MHPSO-GRU model is determined as the best performance technology. Estimates for the period from May 2018 to December 2020 suggest that the event appears to continue to increase and remain high.


I. INTRODUCTION
Acquired immune deficiency syndrome (AIDS) is the result of human immunodeficiency virus (HIV) infection, and its main feature is T cell immune function defect [1]. HIV mainly exists in the blood, semen, vaginal secretions and milk of infected people. Therefore, it is usually transmitted between people through direct sexual contact, iatrogenic and perinatal transmission [2], [3]. Since 1997, the annual incidence of AIDS in the world has declined slightly, but it is still one of the most important causes of death in the world's major disease burden [4]- [7]. It is estimated that there were 1.8 million new infections in 2016, with an average life expectancy of 40.32 years. About 70% of these new infections occur in low and middle-income countries, with investment spending reaching 8.13 billion US dollars [8].
In recent years, the number of AIDS cases reported annually The associate editor coordinating the review of this manuscript and approving it for publication was Wei Wei . has increased [9], [10]. In addition, the mortality rate of AIDS in class B statutory infectious diseases is the highest from 2008 to 2017, and the vaccine has not been determined [11], [12]. Therefore, facing the major public health challenges and threats brought by AIDS, and providing a clear and quantitative direction for effective formulation of preventive plans and rational allocation of limited available resources, this is a high-precision and accurate understanding of the prediction model of AIDS epidemic behavior is necessary. Jia J established an HIV / AIDS epidemic model with general nonlinear incidence and treatment methods, and studied the stability of disease-free balance and unique local balance [13]. Wang Yawen established the combination of autoregressive moving average model, generalized regression neural network and ARIMA model. ARIMA (1,1,1) (0,1,1) 12 and ARIMA-GRNN model can better predict the monthly incidence of AIDS in China, but the effect of the combined model is better [14]. Yang K used nonparametric local smoothing method to develop a flexible and effective data modeling method for St disease incidence, monitoring the incidence of cancer, AIDS, cardiovascular disease and other chronic or infectious diseases, with good results [15]. Mar Masi áused the multi state method to predict the development of PLWH into non AIDS events, and estimated three transfer probabilities: from alive and NAE free to alive and NAE experienced (''NAE development''); from alive and NAE experienced to death (''death after NAE'') [16]. Rezaei s elucidates the overall prevalence of depression in HIV / AIDS patients through systematic review and metaanalysis [17]. Adamu incidence rate in various states affected by insurgency activities in Boko Haram, Nigeria, collected data from 2004 to 2017, reporting 16 [18]. The information of 102 patients, including age, gender, diagnosis year and patient status, found that most of the HIV / AIDS infected people were women in these Boko Haram areas. Although these laws were discovered, they did not give a trend of future incidence rate. Deribew A studied the development trend of HIV / AIDS mortality in Ethiopia from 1990 to 2016, found that the decline of HIV / AIDS mortality was relatively slow, and provided the corresponding HIV testing and treatment plan, but did not give a future development trend, and the plan had a certain lag [19].
Deep learning is a kind of machine learning method to represent data washing. It combines the bottom features of data into the top features through deep neural network, and is used for classification, prediction and other tasks. Compared with the traditional shallow learning methods such as neural network, logic regression, support vector machine, decision tree, etc., its network depth is deeper, its nonlinear representation ability is stronger, and its network learning ability is stronger. Shallow learning is often ineffective in dealing with complex problems, even when the number of samples is sufficient. This is because the network complexity is low and the learning ability is not strong. Even if a large number of samples are input, some important details cannot be learned. However, in the past, due to the limitation of computing power, blindly increasing the number of layers and parameters of the network is easy to lead to slow convergence and over fitting of the network. Today's deep learning technology can effectively solve these problems. While increasing the network depth, greedy layer by layer training method is used to prevent over fitting and improve the network performance.

II. RNN AND ITS DERIVED NEURAL NETWORK
From the first Turing test to the present deep learning, in recent decades, imitating human brain has always been the direction of artificial intelligence. However, in the past few years, due to the limitation of computing power, computers have been unable to process the same amount of data as human brain. With the rapid development of computer computing power, deep learning model solves this problem through various methods [20]. ANN (artificial neural network) has also developed rapidly and been widely used, and derived convolutional neural network (CNN) with spatial distribution data and recurrent neural network (RNN, recurrent neural network) with temporal distribution data [21]. At the same time, the cyclic neural network model is widely used in time series data processing because it can process the time-series data.

A. CYCLIC NEURAL NETWORK MODEL (RNN)
In terms of time series data processing, taking the prediction of AIDS incidence to be discussed in this paper as an example, it is different from the traditional artificial neural network (ANN) that takes a certain time cross-section factor as the input value of data. The RNN takes a certain long-term factor data as the time series and takes the data in the past period as the input value [22]. Figure 1 is to expand unified neuron A according to time [0, t]. the characteristics of chain structure reveal that RNN is essentially related to sequence and list. In this process, we can see the operation of the whole neuron at different time points. The input variable x 0 is processed in the neuron at the time of 0 and input h 0 . This process is not over, the whole process has entered into 1 moment.
However, due to the high complexity of RNN model, gradient explosion and gradient disappearance are easy to occur when the input variables enter the hidden layer for data processing. In 1994, yoshua bengio explained the gradient disappearance by mathematical formula (1) [23].
Because the hidden layer gradient will multiply the coefficient w repeatedly in the process of back propagation along the time dimension, if the data is long series data, the more times the coefficient W is multiplied repeatedly, the more likely it is to cause the gradient explosion and gradient disappearance.

B. LONG AND SHORT TERM MEMORY NETWORK MODEL (LSTM)
The solution to RNN problem comes from the introduction of long-term memory network (LSTM) in 1997. Hochrater and schmiduber, the proponents of long-term and short-term memory networks, found that the essential reason for RNN to produce gradient explosion and gradient disappearance is that the calculation method of hidden layer state causes the gradient to be expressed in the form of continuous product, and put forward an improved way to solve this problem [24], as shown in Figure 2.
If we omit o t , y t and L t for each layer, then the model can be simplified as shown in the figure above. In RNN model, VOLUME 8, 2020  we can see clearly that the hidden state h t is determined by h t−1 and x t together. On the way through the activation function σ , which is tanh in general, the LSTM model on the right side of the figure above is much more complex.
RNN neural network in the process of transforming the growing neural network, because of repeated cycles in the hidden layer, the error will become more significant along with the cycle, resulting in the disappearance of gradient, the change of training results is not obvious, and finally the whole training process cannot escape from the local optimal solution [25], [26].

III. GRU NEURAL NETWORK
GRU network was proposed in 2014, which is essentially a cyclic neural network and an improved version based on LSTM network [27]. Based on the LSTM unit, the input gate and forgetting gate in LSTM are combined into a single update gate, and the specific structure is shown in Figure 3.
According to Figure 3, the forward propagation formula can be obtained as shown in formula (2)- (6).Where u is the index of the updated gate vector, r is the index of the reset gate vector, h is the index of the hidden cell vector at all times, h is the index of the new memory cell at t, and f and ϕ are activation functions. f is the sigmoid function, ϕ is the tanh function, and ∼ t s h represents the information of the new memory unit at time t [28]. From Figure 4, it can be seen that the new memory information is obtained from the synthesis of the hidden unit states at t time input: x t i and t − 1 time, indicating that the new memory can have the new input information and historical information at the same time. The reset gate determines the weight of the hidden unit information in time t − 1 to the new memory information ∼t s h in time [29]. If the current time state is independent of the previous time, the reset door can prevent the transmission of the previous time information. However, there is an update gate as the final output control gate, and the information of the previous moment can also be transmitted through the update gate. The weights of the new memory information and the hidden unit information at time: t − 1 are respectively controlled by the update gate. The above is the basic working principle of GRU, which can not only solve the problem of RNN gradient disappearing, but also reduce the network parameters compared with LSTM, and has faster convergence speed. Moreover, experiments show that the network performance is even better than LSTM.

A. GRU NEURAL NETWORK BASED ON MHPSO OPTIMIZATION
Particle swarm optimization (PSO) is a kind of swarm intelligence optimization algorithm, which is proposed by Kennedy and Eberhart in 1995 to simulate the foraging of birds. According to formula (7) and (8), the particles of the population constantly adjust their speed and position to optimize until the convergence termination condition is met.
In the formula, V t i,j represents the velocity of the particle i in the j-th dimension at the t-th iteration; x t i,j is the position of the particle i; ω is the inertia weight; c 1 and c 2 are acceleration coefficients or learning factors; y t i,j is the t Individual extreme points during the iteration; y t i is the global extreme point of the particle swarm; r 1 , r 2 are random numbers uniformly Aiming at the problem that particle swarm algorithm is easy to fall into the local optimal position, this paper proposes a hierarchical heterogeneous dynamic particle swarm optimization algorithm. The main purpose of this algorithm is to establish a direct connection between the particle swarm structure layer and the layer. The number of particles is the same.
Each time the algorithm iterates, it arranges all particles according to the current fitness value and assigns them to different levels. The adaptive value of the particles is inversely proportional to the level they are in. When the algorithm runs, the particles will Attracted by the particles on the upper layer, these particles are called attracted particles, and the particles themselves will also act as attracted particles of a certain particle in the lower layer. Particles, and it is the attracting particles of the fourth layer of particles; for the particles at the highest and lowest layers, its attracting particles are other particles in the same layer as it. In the process of optimization, the particles will not only The local best and global best positions move, while also moving in the direction determined by its attracting particles.
The MHPSO algorithm incorporates a parameter related to attracting particles in the iterative update formula. The specific display can be seen from the following equation: Among them, x (i) t a,j represents the position of the attracting particle a of the particle i, and B represents the attracting particle a of the particle i, c 3 represents the constant of the acceleration coefficient, and r 3 represents the attraction coefficient of particle i to attract particle a.
The position update formula is shown in equation (10): The gradient descent method adopted by GRU neural network is easy to make the weight converge to a certain value, but it can not guarantee that it is the global minimum value of the error plane. The optimization of GRU neural network by MHPSO is to replace the gradient descent method in GRU neural network by GRU. The main idea of this algorithm is to optimize the connection weight and threshold of GRU neural network by MHPSO algorithm, which can improve the GRU neural network Prediction performance.
The specific steps of short-term traffic flow prediction based on optimized GRU neural network parameters of MHPSO are as follows: (1) build training data set and test data set, and preprocess traffic data to obtain time series of historical traffic flow: (2) using MHPSO to optimize the parameters of GRU neural network to get the optimal short-term traffic flow prediction model. The specific modeling steps are as follows: Step 1: initialize the parameters of particle swarm and GRU neural network. The parameters of particle swarm include population scale, population layers, iterations, learning factors, and the limited range of particle position and velocity. The initial values of particle position and velocity are random. The parameters of GRU neural network mainly include the number of neurons in each layer of neural network and the number of layers of hidden layer.
Step 2: calculate the fitness value of each particle in the population and build the population hierarchy. The fitness function of particles in the population is defined as: where n is the population size, Y i is the sample output value, and y i is the actual output value Step3: calculate the fitness value of each particle Step 4: update the speed and position of particles according to formula (3) and (4) Step5: if the end condition of iteration (good enough position or maximum number of iterations) is reached, it will end. Otherwise, turn to step 2 to continue iteration MHPSO-GRU network is to optimize the initial weight of deep GRU network through MHPSO, and then restore the optimal solution as the initial weight of deep GRU network. Because the optimal solution obtained by MHPSO is close to the final required weight and width of deep GRU network in the global scope, the deep GRU network trained by learning can have less learning time and improve its disadvantage of easily falling into local minimum, thus improving the performance of prediction using this model.

B. PREDICTION MODEL OF AIDS INCIDENCE BASED ON GRU NETWORK
This paper proposes a prediction model of HIV incidence based on GRU network. The network structure of the model is shown in Figure 5. Among them, the one month incidence of AIDS as a sample, recorded as P. P is the incidence data of six months before the forecast month. The reason for setting VOLUME 8, 2020 the forecast interval to 6 months is based on the influence of date factors. As the input of RNN, P needs to be divided into multiple subsequences, and the time step of segmentation takes the middle value.

IV. MHPSO-GRU NETWORK AND PREDICTION OF AIDS INCIDENCE A. GRU NEURAL NETWORK BASED ON MHPSO OPTIMIZATION
With the development of machine learning algorithm and data mining technology, the accuracy of incidence prediction is also increasing. In the face of complex and random systems, we often use one-dimensional time series data analysis, but this result is caused by many factors. It is necessary to study the influence of other factors in time series, which makes the pattern recognition of time series more important. In order to explore how these influencing factors affect the system and its internal relations, we can decompose the time series into some interpretable contents, such as trend item, cycle item, cycle component and irregular component, which can help us better simulate the time series. It is necessary to construct a topological space equivalent to the original system, that is, to project the influence factors hidden in the one-dimensional system that have not been observed into the geometric space, from the current state of the system, we can get the state of the next moment, that is, from the value at this time to the prediction value of the next moment of the time series. In this paper, GRU, a derivative network of RNN neural network, is used to predict the incidence of AIDS. The MHPSO-GRU network prediction model was constructed for the sequence of AIDS incidence.

B. PREDICTION MODEL OF AIDS BASED ON MHPSO-GRU
Cyclic neural network can not only predict the image of time series, but also process the length of any input or output. The parameters of MHPSO-GRU model are less than those of traditional RNN and LSTM model, so it is easier to program and put into use. The incidence of AIDS over the years is a time series, which is more suitable for RNN model training.
In this paper, we build a prediction model of AIDS incidence based on MHPSO-GRU model, and the specific framework is shown in Figure 6.
In the prediction framework of AIDS incidence shown in Figure 6, this paper first collects the time series data of AIDS incidence in 2004-2011, draws the time series image, and analyzes the trend impact of AIDS incidence. In this paper, a model based on MHPSO-GRU neural network is established to predict the incidence of AIDS in the next few years. At the same time, we also use BPNN model, traditional RNN model and LSTM model to model and predict the future incidence of AIDS based on historical data, and analyze the differences and advantages and disadvantages of the four prediction methods.

V. MODEL PERFORMANCE EVALUATION EXPERIMENT AND ENALYSIS A. EXPERIMENTAL DATA AND NETWORK PARAMETER SETTING
The monthly cases of longitudinal AIDS collected from January 1, 2004 to April 30, 2018 in mainland China were collected from the National Notifiable Infectious Diseases Reporting System (2018). A total of 172 observations over 15 years were collated and summarized (table S1). Subsequently, in order to evaluate the maintainability and flexibility of each model developed, we used the data from January 1, 2004 to December 31, 2016 to train the basic parameters of the model, and used the remaining 16 data points to verify the model extrapolation ability. Our current research is not necessarily ethically approved or agreed, because the data do  In addition, when HP filter is used to remove the short-term monthly impact of AIDS incidence to analyze long-term trend and periodicity, a trough is usually observed in January and February of each year, while the case is frequently reported in December of each year. In the remaining months, the number of reported cases remained stable and the volatility has become more pronounced since 2010 (Figure 7). In addition, in the past 14 years, the number of reported AIDS cases has increased significantly, with an annual increase of about 14.384%.    (only 3054 cases). In addition, when HP filtering is used to remove the short-term monthly effects of AIDS incidence to analyze long-term trends and periodicity, a trough is usually observed in January and February each year, and the case is frequently reported in December each year. The number of notified cases remained stable for the remaining months, and this fluctuation has become more pronounced since 2010. In addition, the number of reported AIDS cases has shown a significant upward trend over the past 14 years, and has increased by approximately 14.384% per year.
The above figure is a time series image of the incidence of AIDS from January 2004 to January 2018. As shown in the figure, the incidence of AIDS shows a gradual upward trend, which seems to be more obvious since 2010, with a downward trend in 2018. In addition, periodic fluctuations also occur in the same period.
For the prediction experiment of AIDS incidence based on MHPSO-GRU network, the model parameter settings are shown in Table 2.
For the actual change curve and prediction curve of AIDS incidence from January 2017 to April 2018, see Figure 9. MAPE is predicted to be 8.15%. The actual and  predicted curves of AIDS incidence from January 2009 to January 2018 are shown in Figure 9, of which the predicted MAPE is 8.21%. From the comparison of the results in Figure 9 and Figure 11, it can be seen that the difference between the whole prediction curve and the actual curve is not big, and the whole prediction error is quite stable within one or ten years, without large deviation, which proves that the method in this paper is effective. Figure 10 shows the image of the training loss function of the AIDS incidence prediction model changing with the number of iterations. From the above figure, it can be seen that the MHPSO-GRU based AIDS incidence prediction model tends to be stable when the number of training reaches 400, and then the loss function value is close to 0.28, indicating that the prediction effect is better. Table 3 shows the prediction ability and efficiency of the four best fit models selected. We found that compared with MAE, MAPE, mer and RMSE, the fitting results of basic BPNN model were the lowest, so as to analyze the future epidemic mode of known cases of AIDS. In the out of sample prediction, AIDS had the maximum value of the above four indicators. The MHPSO-GRU model is considered to be the most suitable model for prediction in the derivative model based on the full consideration of the performance simulated and predicted by these models (Figure 11 and Figure 12). Subsequently, this hybrid approach was re established and data from the whole report were used to simulate the time behavior of AIDS in the near future. As can be seen from table 7, it shows that the disease seems to be on the rise from May 2018 to December 2020. Figure 11 shows the predicted incidence of AIDS by MHPSO-GRU model, LSTM model, RNN model and BPNN model. It can be seen that BPNN model has the worst prediction effect and the largest deviation from the actual data. The prediction error of RNN model is large, that of LSTM model is small, and that of MHPSO-GRU model is minimum. Figure 12 shows the global prediction of AIDS incidence by four algorithms: BPNN model, RNN model, LSTM model and MHPSO-GRU model. The blue line is the time sequence of actual incidence of AIDS, the red line is the prediction sequence of MHPSO-GRU model, the yellow line is the prediction sequence image of LSTM model, the purple line is the prediction sequence of RNN model, and the green line is the prediction sequence of BPNN model In addition to BPNN model, other models have good prediction effect, but MHPSO-GRU model has better prediction effect. As described above, the curve (solid red line) fitted by the MHPSO-GRU method can better approximate the reality (solid blue line) compared to other curves.   Table 4 shows the prediction value and 95% prediction interval of 2018-2020 based on MHPSO-GRU model, and figure 13 shows the prediction value and prediction interval of 2018-2020. From the comprehensive table and figure, we can see that the incidence of AIDS in the future is still rising, and the situation of AIDS prevention and control is still very serious.

C. PERFORMANCE COMPARISON
To sum up, according to the longitudinal data of known cases of AIDS reported, we use four useful methods, BPNN, RNN, LSTM, MHPSO-GRU, to model and solve. Figure 12 is a comparison chart of the simulated and estimated incidence rate between the selected four models and observations. The red line is the prediction sequence of the MHPSO-GRU model, and the blue line is the actual incidence rate curve. It is seen that the error of the red line and the blue line is smaller. Table 3 shows the prediction performance comparison of sample fitting of the four models. Mae of MHPSO-GRU network model is 278.875, MAPE is. 0.167, Mer is 0.076, RMSE is 771.710, which is the best of the four effective methods. This may be a useful technology for estimating the future behavior of AIDS.

VI. CONCLUSION
In the past decades, the incidence rate and mortality rate of AIDS has been increasing. The situation of countries and regions in the world has worsened. The premise of effective containment and intervention strategies for AIDS by relevant organizations is accurate identification and early warning of the future long-term track of this disease behavior.
The advantages of our research include the amazing findings gathered from the AIDS incidence rate data collected from January 2004 to April 2018. The validation of these data has been supported by China's mandatory infectious disease surveillance system. It can be seen that figure 8 shows that the incidence rate of AIDS in China is increasing. This work can study the future trend of AIDS incidence rate in China, and provide corresponding treatment strategies and intervention measures. In the future work, we plan to study the trend of AIDS rate in other areas to prove that our model is effective and universal.
Even so, there are some limitations in this work. First, the data cover the period including policy interventions (2010). Although the results of the first choice model are satisfactory, the actual impact of the model's accuracy before and after the sudden rise in incidence rate of AIDS is still unknown. Secondly, in the construction of RNN model, due to the lack of mature theoretical guidance on the optimal number of hidden cells, feedback delay and other key parameters. Although mhpso algorithm is used to improve the number of singular numbers in this paper, its specific simulation and prediction process is largely unknown. Third, it is estimated that 32.4% of AIDS patients in China do not know their infection status. Therefore, the actual situation may be more serious than the estimated situation. Fourthly, many complex factors, including unpredictable factors, may lead to the occurrence and spread of AIDS. The prediction model we built only relies on the known AIDS cases in the past, without considering other factors to predict the unknown cases. Therefore, if conditions permit, factors related to the occurrence and development of AIDS are expected to be used together to assess the epidemic. Finally, the prediction model is based on the cases in China from January 2004 to April 2018. Therefore, the analysis of the epidemic only represents the situation and trend of AIDS in the country. After being recorded, this method has great benefits for other regions and infectious diseases.
To sum up, we use BPNN, RNN, LSTM respectively according to the longitudinal data of known AIDS cases reported, Four useful methods of MHPSO-GRU are used to model and solve the problem. It is found that the performance of MHPSO-GRU network model is the best. This may be a useful technique to estimate the temporary behavior of AIDS, so it can help health policy makers to reasonably allocate health resources The prevention and control plan of the disease shall be properly formulated. In addition, the number of new HIV infections will continue to increase and is already high, and these issues deserve urgent strategic attention in the context of properly adopted policies.