Livestock Product Price Forecasting Method Based on Heterogeneous GRU Neural Network and Energy Decomposition

Characterizing the fluctuation of livestock product prices and forecasting its trend in a timely manner are critical to the development of the animal husbandry market. Existing studies report that the trend of price fluctuation is difficult to forecast accurately because the factors that drive livestock product prices are multi-modal. To address this problem, a novel price forecasting method is proposed based on the GRU neural network and the principle of energy decomposition. First, to acquire the price fluctuation information at different frequencies, this study proposes a variational mode decomposition method based on actual signal energy (AE-VMD) and a multi-scale adaptive Lempel-Ziv complexity calculation method (MA-LZ). Second, to preserve the information of time series and multi-modal data, this study develops an attention-based heterogeneous GRU neural network (AH-GRU). Lastly, the effect of static information (non-time-series data, including growth period, origin, longitude and latitude) on price fluctuations is introduced into the forecasting for the first time, and the final forecasting result is output via the dense layer. The experimental results indicate that the proposed method outperforms mainstream livestock product price forecasting methods in forecasting accuracy, trend forecasting and convergence.


I. INTRODUCTION
Animal husbandry is a vital part of China's agricultural economy. Over the past few years, the prices of livestock products have fluctuated significantly due to extreme meteorological events, the spread of epidemic diseases, and policies and regulations. Large fluctuations significantly affect the consumer price index [1] and negatively affect people's livelihood and related industries [2]. A timely and accurate grasp of the laws governing livestock product price changes is critical to the stable and healthy development of the livestock product market. After years of research and development, considerable results have been achieved in price forecasting methods for livestock products. At present, the most widely used price forecasting methods fall into three types: conventional forecasting methods, intelligent forecasting methods and hybrid forecasting methods.
Among the conventional forecasting methods, Zhang [3] used multiple linear regression to build a pork price forecasting method. Ge and Wu [4] built a unary non-linear regression method with time as the independent variable and a multiple linear regression method with yield, import volume and export volume as independent variables. The results show that regression-based forecasting is highly dependent on the factors of price fluctuation. Because the factors that drive livestock product prices are multi-modal, the application scope of the regression method is limited [5]. Nevertheless, regression analysis is still widely used in short-term price forecasting because of its simple principle and strong interpretability. Dong [6] summarized the types and characteristics of time series methods and built a short-term forecasting method for fresh milk prices based on them. Molina [7] employed the autoregressive integrated moving average method to forecast the monthly pork price in the Philippines. Wu [8] proposed an exponential smoothing method to forecast pork prices in China. Because time series analysis ignores the effect of non-time factors on the results, a method built in this way can rely only on the regularities of the historical data when forecasting the price.
Conventional forecasting methods are only suitable for linear problems, whereas the price series of livestock products are mostly non-linear. Thus, intelligent forecasting methods, which use artificial intelligence to address non-linear problems in price series, have received increasing attention. Neural network methods applied in livestock product price forecasting mainly include the back propagation neural network [9], the radial basis function neural network [10] and the extreme learning machine [11]. Using a recurrent neural network, Kurumatani [12] significantly shortened the convergence time of the forecasting method. Jha and Sinha [13] used time-delay neural networks to forecast the monthly wholesale prices of oil crops in different markets in India. Compared with linear methods, this method has obvious advantages in forecasting the trend of price changes. Given the non-linear characteristics of livestock product price fluctuations, Li [14] built a short-term forecasting method for livestock product prices by using neural network technology. Compared with the linear method, this method has better non-linear fitting ability and higher forecasting accuracy.
Considerable theory and practice have shown that a single method cannot handle the linear and non-linear laws of livestock product prices at the same time [15]. Therefore, hybrid methods were introduced to forecast the prices of livestock products. Wu [16] combined the ARIMA method, the GM(1,1) method and the radial basis function to create the ARIMA-GM-RBF hybrid method, which solved the problem of low forecasting accuracy caused by the insufficient use of information in a single method. Chuluunsaikhan [17] forecasted the daily retail price of pork in the South Korean domestic market from news articles by incorporating deep learning and topic modeling techniques; the experimental results indicated a strong correlation between the meaning of news articles and the price of pork. Based on the idea of "forecasting combination", Li [18] proposed forecasting combination frameworks with different time scales to enrich the diversity of the modeling process by introducing information on dynamic price changes. Niu [19] built an egg price forecasting method based on LM-BP; the results showed that this method could provide a relatively ideal short-term forecast of egg prices and that the forecasting effect was significantly better than that of the single forecasting method.
Though livestock price forecasting has been extensively studied, the following problems remain. First, the current mainstream forecasting methods have limited ability to handle the factors of price fluctuation and cannot deal with multi-source, multi-modal and heterogeneous input data. Moreover, they ignore the effect of non-time-series static data (e.g., variety, growth cycle, latitude and longitude, cultivated land area, policies and regulations) on the price fluctuation of livestock products. Second, price fluctuations at different frequencies represent different data characteristics. The high-frequency fluctuation has a high frequency and low amplitude and represents the price change caused by short-term imbalance between market supply and demand. The low-frequency fluctuation has a low frequency and high amplitude and represents the price change caused by external factors. Over the past few years, signal decomposition strategies have been widely used in the financial field with effective results, whereas they have not been applied to livestock product price forecasting. If fluctuations at different frequencies are treated separately, the accuracy of livestock product price forecasts can be improved. Third, most studies are assessed by the mean absolute percentage error (MAPE) and the root mean square error (RMSE). Though these criteria measure how close the forecasted value is to the actual value, they cannot assess the trend of price fluctuation. Accordingly, if the trend of price fluctuation is misjudged, even a forecast with a very small error cannot serve as a sufficient reference for industry and government departments.
To address these problems, this study proposed a novel price forecasting method based on the GRU neural network and energy decomposition. Taking the prices of pork, mutton and beef in Hebei Province of China as the research object, this study compared the proposed method with mainstream animal product price forecasting methods to verify its validity. The main contributions of this study can be summarized as follows: (1) The AE-VMD method and the MA-LZ method were proposed to increase the accuracy of signal decomposition and frequency recognition. (2) The AH-GRU neural network was proposed, which preserves the information of the global time series and the multi-modal data. (3) Static information was combined with dynamic information (amount of precipitation, temperature, other product prices and other time series) for the first time, and the effect of static information on price fluctuation was introduced into the forecasting process to provide targeted forecasts of livestock product prices.

II. EXPERIMENT FRAMEWORK AND METHODS
To improve the performance of signal decomposition, accurately separate signals at different frequencies, and introduce the effect of static information on price fluctuation into the forecasting process, a novel price forecasting method was proposed based on the GRU neural network and signal decomposition. According to Fig. 1, the method mainly consists of three parts, i.e., the decomposition stage, the forecasting stage and the integration stage, in which the green parts are the methods and algorithms optimized here. The raw data are decomposed into energy signals at multiple frequencies by the AE-VMD algorithm, and the high-frequency and low-frequency signals are identified by the MA-LZ algorithm. Each frequency signal is processed by the attention mechanism, and the time series of the price factors are input into the heterogeneous GRU neural network. In the dense layer, the hidden states output by the heterogeneous GRU neural network are integrated with the static data to yield the output for the corresponding frequency. Lastly, the data at different frequencies are fused by the frequency fusion method to obtain the final forecasting result. Sections II-A to II-C introduce the methods and algorithms optimized in this study.

A. VARIATIONAL MODE DECOMPOSITION BASED ON ACTUAL SIGNAL ENERGY
VMD searches for the optimal solution of the variational problem by iteration. VMD is adopted to decompose the input signal into a mode set $\{u_i \mid i = 1, 2, 3, \dots, k\}$ with corresponding center frequencies $\{\omega_i \mid i = 1, 2, 3, \dots, k\}$, where $k$ denotes a predefined parameter that determines the amount of information in the modes. If the value of $k$ is too large, the decomposition not only consumes considerable computation but also loses accuracy. Conversely, if $k$ is too small, the information in the mode set cannot support the training of the forecasting method. Thus, the choice of $k$ is of high significance.
At present, no unified criterion has been formulated for VMD parameter selection. Li [20] employed the maximum envelope kurtosis as an indicator to determine the number of VMD modes. Zhang [21] used the grasshopper optimization algorithm to build a parameter-adaptive VMD method. Ding [22] selected the optimal number of VMD modes based on the instantaneous frequency mean. More recently, a parameter selection rule based on signal energy was proposed in non-ferrous metal price forecasting. The literature [23] adopted the mean absolute percentage error between the reconstructed signal $\sum_{i=1}^{k} u_i(t)$ and the original signal $f(t)$ to formulate a selection rule for the number of modes, as expressed below:
$$r = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{f(t) - \sum_{i=1}^{k} u_i(t)}{f(t)}\right|$$
The value of $k$ can be determined when $r$ is sufficiently small and no obvious downward trend is identified. The literature [23] proved the feasibility of selecting the number of modes according to signal energy, whereas two problems remain in this method. First, although the value of $k$ is selected on the basis of signal energy, $r$ does not indicate the actual energy of the signal. Second, the literature used the mean absolute percentage error (MAPE) to determine the number of modes. Since MAPE is not symmetric, it is more sensitive to negative errors (where the forecasted value exceeds the actual value). In addition, MAPE is not differentiable everywhere, so it is not suitable as an optimization criterion.
On that basis, a variational mode decomposition method based on the actual energy of the signal (AE-VMD) is proposed. Its selection criterion $R$ is defined in terms of the following quantities: $n$ denotes the sample number; $f(t)$ represents the original price sequence; $u_i(t)$ is the $i$-th individual component; and $\varepsilon$ is a minimal value set to 0.01 by default.
First, in accordance with the Parseval theorem, the new calculation method uses the actual energy of the signal in place of the original signal fluctuation, which indicates the trend and characteristics of the signal fluctuation more accurately. Second, the symmetric mean absolute percentage error (SMAPE) replaces the mean absolute percentage error, which addresses the defects of the MAPE calculation. Lastly, the minimal value is added to the denominator to remove the instability that arises when the real value and the forecasted value are both close to zero. As the experimental findings reveal, the value of $k$ can be determined at the point where $R$ no longer decreases obviously. Specific experimental details are given in Section IV-A.
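The two selection criteria can be contrasted with a small sketch. It is illustrative only: the exact AE-VMD formula is not reproduced here, so `energy_smape_residual` assumes that the energy at each sample is the squared amplitude and that the reconstruction is the pointwise sum of the modes. The helper `pick_k` and its tolerance `tol` are hypothetical stand-ins for the rule "stop when the criterion no longer decreases obviously".

```python
def mape_residual(original, modes):
    """MAPE between the original series and the sum of the modes (rule of [23]).

    Assumes every original value is non-zero, which holds for prices.
    """
    n = len(original)
    total = 0.0
    for t in range(n):
        recon = sum(mode[t] for mode in modes)
        total += abs((original[t] - recon) / original[t])
    return total / n


def energy_smape_residual(original, modes, eps=0.01):
    """Symmetric MAPE between pointwise signal energies.

    Assumption: energy at sample t is the squared amplitude, and eps is the
    small constant added to the denominator (0.01 by default in the text).
    """
    n = len(original)
    total = 0.0
    for t in range(n):
        e_true = original[t] ** 2
        e_recon = sum(mode[t] for mode in modes) ** 2
        total += abs(e_true - e_recon) / ((e_true + e_recon) / 2 + eps)
    return total / n


def pick_k(scores, tol=0.05):
    """Return the first k after which the score improves by at most tol (relative).

    `scores` maps a candidate number of modes k to its criterion value.
    """
    ks = sorted(scores)
    for prev, cur in zip(ks, ks[1:]):
        if scores[prev] - scores[cur] <= tol * scores[prev]:
            return prev
    return ks[-1]


prices = [10.0, 12.0, 11.0]
modes = [[6.0, 7.0, 6.0], [4.0, 5.0, 4.0]]   # toy decomposition, not real VMD output
print(mape_residual(prices, modes), energy_smape_residual(prices, modes))
```

Because the symmetric form divides by the mean of the two energies plus `eps`, it stays finite even when both energies approach zero, which is the instability that the text attributes to MAPE.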

B. FREQUENCY COMPONENT IDENTIFICATION
AE-VMD was adopted to decompose the livestock product price into multiple modes according to frequency. These modes follow strong regularities and are independent of each other. The high-frequency signal represents the short-term price change caused by market factors. Though the high-frequency signal fluctuates frequently in the short term, it cannot reflect the rule of long-term price change. The low-frequency signal represents the long-term price change caused by external factors. Though low-frequency signals vary slowly, they reflect the factors that drive price changes.
Since targeted forecasting of signals at different frequencies can significantly increase the accuracy of price forecasting, high-frequency and low-frequency signals should be distinguished efficiently and effectively. Over the past few years, the Lempel-Ziv algorithm [24] [25] [26] has been extensively applied in signal complexity calculation. However, the Lempel-Ziv algorithm binarizes the original signal against its average value, which loses considerable information in the binarized sequence, so the trend of signal change is difficult to judge. Given this, a multi-scale adaptive Lempel-Ziv complexity calculation method was developed.
The MA-LZ method divides the original signal into multiple scales by the signal mean at the initial stage of signal decomposition. Thus, this method retains as much of the original signal information as possible. In each signal segment, the threshold is determined from the difference between two adjacent points to examine the correlation between adjacent signals. After the complexity of the binarized sequence is obtained, signal frequencies are distinguished by referencing the method in literature [27]. In the algorithm of the MA-LZ method, the number of signals and the average value are defined for each segment; $P$ and $Q$ are temporary sequences; $PQ$ represents the concatenation of $P$ and $Q$; and $PQ\omega$ represents the sequence obtained by removing the last node from $PQ$. The steps are expressed below.
(4) Initialize $c(n) = 1$, $k = 1$.
(5) Build $P$, $Q$ and $PQ\omega$, where $k = k + 1$.
(6) Judge whether $Q$ is a substring of $PQ\omega$. If the judgment is false, execute step (7); otherwise, execute step (8).
(7) $Q$ denotes a new sequence and $c(n) = c(n) + 1$.
(8) Judge whether $k$ is greater than $n$. If the judgment is false, execute step (5); otherwise, execute step (9).
(9) Use the normalization formula $C(n) = c(n) \cdot \frac{\log_2 n}{n}$ to normalize $c(n)$.
(10) Return to step (2) until the complexity of the remaining signals is calculated.
(11) Set the threshold $\varphi = 0.8$ and use the equation $\sum_{j=1}^{m} C_j \big/ \sum_{j=1}^{i} C_j \geq \varphi$, $m \in [1, i]$ [27] to find the minimal value of $m$.
(12) Modes $1$ to $m$ are recognized as high-frequency components; modes $m + 1$ to $i$ are identified as low-frequency components.
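The counting loop in steps (4) to (9) and the threshold rule in steps (11) and (12) can be sketched as follows. This is a minimal illustration of the classic Lempel-Ziv (1976) phrase count, not the full MA-LZ procedure: the multi-scale segmentation is omitted, and `binarize_by_difference` is a hypothetical simplification that marks a sample 1 when it rises above its predecessor. `split_modes` assumes the complexities are listed in the same order as the modes, with the most complex modes first.

```python
import math


def binarize_by_difference(series):
    """Hypothetical binarization: 1 if the value rose relative to the previous point."""
    return "".join("1" if b > a else "0" for a, b in zip(series, series[1:]))


def lz_complexity(bits):
    """Count the phrases c(n) in the Lempel-Ziv parsing of a 0/1 string."""
    n = len(bits)
    i, c = 0, 0
    while i < n:
        length = 1
        # Q = bits[i:i+length]; PQw = bits[:i+length-1] (PQ without its last symbol).
        # Keep extending Q while it can be copied from PQw.
        while i + length <= n and bits[i:i + length] in bits[:i + length - 1]:
            length += 1
        c += 1          # Q is a new phrase
        i += length
    return c


def normalized_lz(bits):
    """Normalize c(n) by n / log2(n), the usual Lempel-Ziv normalization."""
    n = len(bits)
    return lz_complexity(bits) * math.log2(n) / n


def split_modes(complexities, phi=0.8):
    """Smallest m whose cumulative complexity share reaches the threshold phi."""
    total = sum(complexities)
    running = 0.0
    for m, value in enumerate(complexities, start=1):
        running += value
        if running / total >= phi:
            return m
    return len(complexities)


print(lz_complexity("0001101001000101"))   # parsed as 0|001|10|100|1000|101
```

With `split_modes` returning `m`, modes 1 to `m` would be treated as high-frequency components and the remaining modes as low-frequency components.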

C. HETEROGENEOUS GRU NEURAL NETWORK BASED ON ATTENTION MECHANISM
Cho et al. proposed the GRU neural network in 2014; it is one of the most successful variants of the LSTM neural network [28]. Compared with conventional machine learning methods, the GRU neural network exhibits a more efficient non-linear fitting ability. To address the problem that most studies rely only on multi-dimensional time series to complete price forecasting, this study integrated the attention mechanism into the GRU neural network based on the research in literature [29] and proposed an attention-based heterogeneous GRU neural network (AH-GRU). The static information was combined with the dynamic information to forecast the price of livestock products. The AH-GRU method mainly consists of three parts, i.e., an attention mechanism, a heterogeneous GRU neural network and a dense layer.
First, this study introduced the attention mechanism on the original input sequence. For an input window of length $L$, the original input at time $t$ is $x_t = (x_t^1, x_t^2, \dots, x_t^n) \in \mathbb{R}^n$, where $t = 1, 2, 3, \dots, L$. At time $t$, the original input sequence is mapped to the hidden state $h_t$ of the GRU neural network. The attention weight $\alpha_t^k$ is determined from the hidden state $h_{t-1}$ of the GRU neural network at time $t-1$ and the original data $x^k = (x_1^k, x_2^k, \dots, x_L^k) \in \mathbb{R}^L$ of the $k$-th input. It represents the degree of influence of the $k$-th feature of the original input data on the target, as expressed below:
$$\alpha_t^k = \frac{\exp\left(v^\top \tanh\left(W h_{t-1} + U x^k\right)\right)}{\sum_{j=1}^{n} \exp\left(v^\top \tanh\left(W h_{t-1} + U x^j\right)\right)}$$
where $v$, $W$ and $U$ denote the initialized attention weights. The larger the value of $\alpha_t^k$, the more significantly the $k$-th feature affects the target. After the attention weights are obtained by training, the new sequence $\tilde{x}_t = (\alpha_t^1 x_t^1, \alpha_t^2 x_t^2, \dots, \alpha_t^n x_t^n)$ is obtained by weighting the original input sequence and serves as the input sequence of the method.
Second, as the core of the method, the structure of the heterogeneous GRU neural network is presented in Fig. 2, where $x$ denotes the price series and the second input represents the time series of the factors that affect the price of the product.
The heterogeneous GRU neural network maintains two hidden states: the first represents the historical price summary from $x_1$ to $x_t$, and the second denotes the historical summary of all factor time series. The output of the heterogeneous GRU neural network is formed by calculating these two hidden states. In the update equations, $W$ denotes the weights of the different gates, $b$ represents the bias, $*$ is matrix multiplication, $\cdot$ expresses the multiplication of the elements at corresponding positions, and $f$ denotes the combination of all the gate functions and cell state functions used to update the hidden state. As the calculation process reveals, the heterogeneous GRU neural network connects all input sequences. Accordingly, the price series and the factor series do not have independent hidden states: the two output hidden states at each step are generated by connecting the two previous hidden states. To balance the effect of heterogeneous data on the method, the two hidden states were set to contribute equally to the output. Lastly, to further exploit the static data in price forecasting, the static data were correlated with the trend of price fluctuation, transformed into a vector, and merged with the two hidden states in the dense layer.
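The attention weighting and the final fusion step can be illustrated with a small, dependency-free sketch. Everything below is an assumption made for illustration: the score has the additive form `v . tanh(W h + U x_k)` normalized by a softmax, "equal contribution" is read as a plain average of the two hidden states, and the static vector is simply concatenated before a single linear unit. The parameter names `v`, `w`, `u`, `weights` and `bias` are hypothetical, and the GRU recurrence itself is not shown.

```python
import math


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(hidden, feature_windows, v, w, u):
    """One weight per input feature from score_k = v . tanh(W h + U x_k)."""
    wh = matvec(w, hidden)
    scores = []
    for window in feature_windows:      # one length-L window per feature
        ux = matvec(u, window)
        scores.append(sum(vi * math.tanh(a + b) for vi, a, b in zip(v, wh, ux)))
    return softmax(scores)


def reweight(inputs_at_t, weights):
    """Scale each feature of the current input by its attention weight."""
    return [a * x for a, x in zip(weights, inputs_at_t)]


def fuse_hidden(h_price, h_factor, static_vec):
    """Average the two hidden states (assumed reading of 'equal contribution')
    and append the static feature vector."""
    merged = [0.5 * (p + q) for p, q in zip(h_price, h_factor)]
    return merged + list(static_vec)


def dense(features, weights, bias):
    """Single linear output unit standing in for the dense layer."""
    return sum(wi * xi for wi, xi in zip(weights, features)) + bias


alpha = attention_weights([0.0, 0.0], [[0.0, 0.0], [10.0, 0.0]],
                          v=[1.0, 0.0], w=[[0.0, 0.0], [0.0, 0.0]],
                          u=[[1.0, 0.0], [0.0, 1.0]])
print(alpha, reweight([2.0, 4.0], alpha))
```

A learned weighting of the two hidden states could replace the fixed average in `fuse_hidden` without changing the rest of the sketch.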

A. DATA
This study selected the price data of pork, beef and mutton in Hebei Province since 2000 as the research objects. To be specific, more than 30 types of static data were selected, including variety, growth period, producing area, latitude and longitude, cultivated land area, and policies and regulations. In addition, 57 types of time series data (e.g., corn price, soybean meal price, feed price, temperature and rainfall) acted as the dynamic data. All data originated from the Wind Economic Database (https://www.wind.com.cn/default.html), the China Statistical Yearbook (http://www.stats.gov.cn/tjsj/ndsj/), and the National Meteorological Science Data Center (http://data.cma.cn/).

B. ASSESSMENT INDICATORS
In existing studies, error measures (e.g., RMSE and MAPE) generally act as the assessment index for the performance of price forecasting methods. The error reflects the proximity between the forecasted value and the actual value, whereas it cannot reveal whether the trend of the price fluctuation is forecasted correctly. Thus, the accuracy of the forecasted trend of price fluctuations should also be assessed [30].
To assess the performance of the forecasting method, this study measures the forecasting effect from three dimensions. The first dimension uses the root mean square error (RMSE), the mean absolute error (MAE) and the coefficient of determination ($R^2$) to quantify the method performance. The calculation formulas are as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}, \quad \mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|, \quad R^2 = 1 - \frac{\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{m}\left(y_i - \bar{y}\right)^2}$$
Forecasting the trend of price fluctuation is a vital part of this study. To verify the effectiveness of the proposed method in forecasting the trend of price fluctuations, the second dimension uses the trend forecasting statistics ($D_{stat}$), the correct upward trend (CP) and the correct downward trend (CD) to measure the performance of price trend forecasting [31]. $D_{stat}$ represents the accuracy in forecasting trends, while CP and CD indicate the performance in forecasting uptrends and downtrends. The larger the $D_{stat}$, CP and CD values are, the better the method performs in forecasting the price fluctuation trend. The calculation formulas are as follows:
$$D_{stat} = \frac{100}{m}\sum_{i=1}^{m} a_i, \quad a_i = \begin{cases} 1, & (y_i - y_{i-1})(\hat{y}_i - y_{i-1}) \geq 0 \\ 0, & \text{otherwise} \end{cases}$$
$$\mathrm{CP} = \frac{100}{m_1}\sum_{i=1}^{m} a_i, \quad a_i = \begin{cases} 1, & y_i - y_{i-1} > 0 \text{ and } (y_i - y_{i-1})(\hat{y}_i - y_{i-1}) \geq 0 \\ 0, & \text{otherwise} \end{cases}$$
$$\mathrm{CD} = \frac{100}{m_2}\sum_{i=1}^{m} a_i, \quad a_i = \begin{cases} 1, & y_i - y_{i-1} < 0 \text{ and } (y_i - y_{i-1})(\hat{y}_i - y_{i-1}) \geq 0 \\ 0, & \text{otherwise} \end{cases}$$
where $y_i$ and $\hat{y}_i$ represent the $i$-th actual value and the $i$-th forecasted value, $\bar{y}$ is the mean of the actual values, $m$ represents the length of the original sequence, $m_1$ is the number of data points in the upward trend, and $m_2$ expresses the number of data points in the downward trend.
In addition, to further determine whether the proposed method improves the forecasting accuracy, the Diebold-Mariano test (DM test) was used in the third assessment dimension to examine whether the forecasting error of the proposed method differs from that of other methods. The DM test is essentially a t-test of whether the mean values of the loss sequences of two alternative forecasts are equal. When the loss differential is autocorrelated, the test uses an autocorrelation-consistent estimate of the standard deviation of the loss differential series.
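The first two assessment dimensions can be computed with a short sketch. It follows the standard definitions of these statistics, which are assumed here rather than taken from the original formulas. One deliberate deviation is labeled in the code: the trend statistics are normalized by the number of transitions (one fewer than the series length), since the first point has no predecessor.

```python
import math


def error_metrics(actual, forecast):
    """Return (RMSE, MAE, R^2) for two equally long sequences."""
    m = len(actual)
    residuals = [a - f for a, f in zip(actual, forecast)]
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / m)
    mae = sum(abs(r) for r in residuals) / m
    mean = sum(actual) / m
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return rmse, mae, 1.0 - ss_res / ss_tot


def trend_metrics(actual, forecast):
    """Return (D_stat, CP, CD) in percent.

    A step counts as correct when the forecasted move from the previous
    actual value has the same sign as the realized move. Normalizing by the
    number of transitions is an assumption of this sketch.
    """
    hits = up = down = up_hits = down_hits = 0
    for i in range(1, len(actual)):
        real_move = actual[i] - actual[i - 1]
        pred_move = forecast[i] - actual[i - 1]
        correct = real_move * pred_move >= 0
        hits += correct
        if real_move > 0:
            up += 1
            up_hits += correct
        elif real_move < 0:
            down += 1
            down_hits += correct
    steps = len(actual) - 1
    d_stat = 100.0 * hits / steps
    cp = 100.0 * up_hits / up if up else float("nan")
    cd = 100.0 * down_hits / down if down else float("nan")
    return d_stat, cp, cd


actual = [10.0, 11.0, 12.0, 11.0, 10.0]
forecast = [10.0, 10.5, 11.5, 12.5, 10.5]
print(error_metrics(actual, forecast))
print(trend_metrics(actual, forecast))
```

In this toy series the forecast misses one of the two downward moves, so a small error can coexist with a 50% CD, which is the situation the trend statistics are meant to expose.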

C. PARAMETER SETTINGS
The PC used in the experiment was configured as follows. The CPU was an Intel Core i5-6500, the memory was 16 GB, and the operating system was Ubuntu 16.04. The PyTorch 1.4.0 framework was used to build the forecasting method and complete the calculation process. For the method parameter settings, the batch size was finally set to 128, the number of iterations was 400, the loss function was the cross-entropy loss function, the optimizer was adaptive moment estimation (Adam), and the time step was set to 5.
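As a rough illustration of what these settings mean for data preparation, the sketch below gathers them in one place and shows how a time step of 5 and a batch size of 128 would shape the training samples. The names `TrainConfig`, `make_windows` and `batches` are hypothetical, and pairing each window with the next value as its target is an assumption about the forecasting setup, not a detail stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Settings reported in the text; field names are illustrative."""
    batch_size: int = 128
    iterations: int = 400
    time_step: int = 5
    optimizer: str = "adam"
    loss: str = "cross_entropy"


def make_windows(series, time_step):
    """Pair each run of `time_step` consecutive values with the value that follows."""
    return [(series[i:i + time_step], series[i + time_step])
            for i in range(len(series) - time_step)]


def batches(samples, batch_size):
    """Yield consecutive slices of at most `batch_size` samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


cfg = TrainConfig()
windows = make_windows(list(range(8)), cfg.time_step)
print(windows[0])                       # first input window and its target
print([len(b) for b in batches(list(range(300)), cfg.batch_size)])
```

A series of length N therefore yields N minus 5 training samples under this windowing, and the last batch is smaller whenever the sample count is not a multiple of 128.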

A. PARAMETER SELECTION OF AE-VMD METHOD
To verify the effectiveness of the AE-VMD method, Figure 3 lists the signal energies obtained by the method proposed in literature [23] and by this research under different k values. Given the calculation rule in literature [23], $r$ does not decrease obviously when k = 8 for the decomposition of the beef price sequence. For the mutton and pork price series, $r$ does not decrease obviously when k = 9. Thus, k = 8 is defined when forecasting the price of mutton, and k = 9 is defined when forecasting the price series of beef and pork. Likewise, the method proposed here was used to decompose the price series. k = 10 was defined when forecasting the price of beef and pork, and k = 7 was defined when forecasting the price of mutton. Decomposition methods with k = 8, 9 and 10 were built, and the mean absolute percentage error (MAPE) of the final forecasting results was taken as the standard to verify the effectiveness of the AE-VMD method. According to Table 1, when k = 10, the MAPE of the forecasted prices of beef and pork reached its minimum, 1.1672% and 2.6412% respectively. Compared with k = 8, the MAPE of beef and pork decreased by 59.48% and 50.71%. Compared with k = 9, the MAPE of beef and pork decreased by 38.25% and 37.22%. Likewise, when k = 8, the MAPE of the forecasted price of mutton reached 0.6762%. Compared with k = 9 and 10, it decreased by 42.29% and 61.25%. As the experimental results reveal, the parameters selected by the AE-VMD method outperformed those selected in literature [23].
Table 1 Forecasting accuracy of different k values

B. RESULTS OF DIFFERENT MODELS
In this section, the effectiveness of the proposed method was confirmed by comparing it with a time series analysis method (ARIMA [32]), a neural network method (GRU) and hybrid methods (VMD-GRU, DBN-SVM-BPN [33] and ANN-SVR-ELM [34]). To test the proposed method quantitatively, the forecasting results of the six methods are presented in Table 2.
The root mean square errors of the proposed forecasting method for beef, mutton and pork were 0.726, 0.535 and 0.738 respectively, and the mean absolute errors were 0.607, 0.412 and 0.621 respectively. The forecasting performance was higher than that of the current mainstream forecasting algorithms. The $R^2$ fitting degree reached 0.954, 0.995 and 0.991 respectively, indicating that the proposed forecasting method has a better effect and a faster convergence speed and can better reflect the change law of livestock product prices.
In terms of forecasting the trend of price fluctuation, the trend forecasting statistics ($D_{stat}$) of the proposed method for beef, mutton and pork were 87.097, 90.909 and 90.566 respectively, the CP values were 68.805, 41.818 and 42.875 respectively, and the CD values were 63.326, 59.091 and 47.619 respectively. Accordingly, the proposed forecasting method is effective in forecasting the price trends of beef, mutton and pork.
Table 3 lists the DM test results of the proposed forecasting method and the other forecasting methods. In the comparison between the hybrid methods and the single methods, the null hypothesis is rejected at the 5% significance level; that is, there is a significant difference, and the forecasting accuracy of the hybrid methods is better than that of the single methods. Compared with the other five methods, the proposed method performs better because it adopts the AE-VMD decomposition method to reduce the difficulty of forecasting the original price series.
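For readers who want to reproduce this kind of comparison, a DM statistic can be computed as in the sketch below. It is a simplified version under stated assumptions: squared error is used as the loss, the long-run variance sums autocovariances up to lag `horizon - 1`, and the p-value comes from the normal approximation. The function name `dm_statistic` and its `horizon` parameter are illustrative, and the statistic is undefined when the two forecasts have identical losses at every point.

```python
import math


def dm_statistic(actual, forecast_a, forecast_b, horizon=1):
    """Return (DM statistic, two-sided p-value) under squared-error loss.

    A negative statistic means forecast_a has the smaller average loss.
    """
    d = [(y - a) ** 2 - (y - b) ** 2
         for y, a, b in zip(actual, forecast_a, forecast_b)]
    t = len(d)
    mean_d = sum(d) / t

    def autocov(lag):
        # Sample autocovariance of the loss differential at the given lag.
        return sum((d[i] - mean_d) * (d[i - lag] - mean_d)
                   for i in range(lag, t)) / t

    # Autocorrelation-consistent variance: lag 0 plus twice the lags below horizon.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    stat = mean_d / math.sqrt(long_run_var / t)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))   # two-sided, normal approximation
    return stat, p_value


actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
close = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]     # small alternating errors
loose = [2.0, 1.5, 4.0, 3.5, 6.0, 5.5]     # larger alternating errors
stat, p = dm_statistic(actual, close, loose)
print(stat, p)
```

Rejecting the null hypothesis of equal accuracy at the 5% level corresponds to `p` falling below 0.05 in this sketch.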

V. CONCLUSIONS AND FUTURE WORK
In this study, a livestock product price forecasting method was proposed, which can help relevant enterprises improve their ability to deal with price risks to a certain extent and facilitate the sustainable and healthy development of the livestock product industry. The method comprises the variational mode decomposition method based on actual signal energy, the multi-scale adaptive Lempel-Ziv complexity calculation and the heterogeneous GRU neural network based on the attention mechanism. This study assessed the performance of the method from three aspects: statistical analysis, fluctuation trend forecasting and the Diebold-Mariano test. The experimental results indicate that the proposed forecasting method outperforms the current mainstream animal product price forecasting methods in forecasting accuracy, trend forecasting and convergence speed.
Subsequent research will focus primarily on two aspects. First, in this study, the two hidden states in the heterogeneous GRU neural network were set to contribute equally to the output. In complex real situations, however, the contributions of the hidden states to the output are unlikely to be equal. For this reason, the contribution of each hidden state to the output should be determined dynamically in future research. Second, owing to time constraints, this study collected only about 30 types of static data, which is relatively small in quantity and variety. Accordingly, empirical analysis of the price fluctuations of relevant products will be conducted as data collection continues.