Introduction
Stock price forecasting is one of the most challenging tasks faced by financial organizations and private investors. To mitigate risk and achieve high investment returns, a large number of prediction models have been proposed. With the development of information technology, financial research has shifted from the macro level to the micro level, and from low-frequency data to high-frequency data. Charles Sutcliffe [1] pointed out that high-frequency data offer many opportunities for more detailed analysis of market activity. Jacquier et al. [2] showed that financial high-frequency data are non-normal and proposed a stochastic volatility model to address this. Jobson and Korkie [3] showed that high-frequency data are highly nonlinear. Subsequently, [4] demonstrated that the variability of intra-minute returns across stocks follows a crude U-shaped pattern.
The above literature shows that high-frequency data are discrete, nonlinear, and non-normally distributed. In response, many scholars have adapted existing models for low-frequency financial data to these characteristics. Statistical models, such as the autoregressive integrated moving average [5], have been replaced by artificial intelligence (AI) methods, including artificial neural networks (ANNs) [6], genetic algorithms [7], and hidden Markov models [8]. Among these, the back-propagation neural network (BPNN) is the most widely used. However, BPNN suffers from the risk of over-fitting, a large number of parameters, and difficulty in obtaining a stable solution [9]. To address these problems, improved ANNs have been proposed, such as BPNN with a genetic algorithm [13], [14] and ANNs with metaheuristics [15]. Recently, support vector machine regression (SVR) has also attracted attention for nonlinear regression estimation. SVR generalizes better owing to the structural risk minimization principle, and it has been applied successfully to time series prediction tasks such as traffic flow prediction [10] and financial time series forecasting [11], [12]. Nevertheless, the parameters of SVR are difficult to set, and a model whose parameters are fitted once fails to track changing data. Improved SVR variants have since been proposed, such as SVR with a fuzzy model [16] and GA-SVR [17]. In this paper, we use the particle swarm optimization (PSO) algorithm to adaptively optimize the parameters of SVR. The major contributions are as follows:
We show that stock data have different distribution characteristics in different periods and across different stocks.
The parameters of the traditional SVM are fixed in advance and cannot adapt to the changing characteristics of financial high-frequency data.
We propose a novel algorithm to cope with changing financial high-frequency data, which applies an adaptive mechanism and the particle swarm optimization algorithm to adjust parameters dynamically.
We conduct comparative experiments at three different time scales against traditional SVR and BPNN. The experimental results show that the proposed method is more effective than both.
The rest of the paper is organized as follows. Section II reports on related works about forecasting methods for high-frequency data and time series. Section III describes an analysis of the problem, followed by our experiments in section IV. Section V concludes the paper.
Related Work
This section presents the principle of SVR as described in prior research. The formalization of SVR is described in detail because it helps in understanding the method proposed in this paper. SVM/R, originally proposed by Cortes and Vapnik [18], uses a linear model to implement nonlinear class boundaries by mapping the input vectors x into a high-dimensional feature space. In the new space, the nonlinear problem becomes a linear one. SVM/R therefore seeks the maximum-margin hyperplane in the new space. The classification/regression problem is thus transformed into a quadratic programming problem, which can be solved with a standard optimization routine.
Given a data \begin{equation} f(x)=w^{T}\varphi (x)+b,\quad \varphi :R^{n}\to F,~ w\in F \end{equation}
\begin{equation} R_{reg}=\frac {1}{2}\parallel w\parallel ^{2}+\,C\times \frac {1}{n}\sum \nolimits _{i=1}^{n} {\vert y_{i}-f(x_{i})\vert _{\varepsilon }} \end{equation}
\begin{equation} \vert y_{i}-f(x_{i})\vert _{\varepsilon }= {\begin{cases} 0, &\quad \vert y_{i}-f(x_{i})\vert \le \varepsilon \\ \vert y_{i}-f(x_{i})\vert -\varepsilon , &\quad \mathrm {otherwise} \\ \end{cases}} \end{equation}
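The regularized risk and the epsilon-insensitive loss defined above can be sketched in a few lines of Python. This is only an illustration of the two formulas: the function names are hypothetical, and `w` is treated as a plain list of weights rather than a vector in the feature space F.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return |y - f(x)|_eps: zero inside the eps-tube, linear outside it."""
    residual = abs(y_true - y_pred)
    return 0.0 if residual <= eps else residual - eps


def regularized_risk(w, targets, predictions, C, eps):
    """R_reg = 0.5 * ||w||^2 + C * (1/n) * sum of eps-insensitive losses."""
    n = len(targets)
    norm_sq = sum(wi * wi for wi in w)
    empirical = sum(
        eps_insensitive_loss(y, f, eps) for y, f in zip(targets, predictions)
    ) / n
    return 0.5 * norm_sq + C * empirical
```

A larger C puts more weight on the empirical error term relative to the flatness term, which is why the choice of C matters for forecasting accuracy.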
By introducing nonnegative slack variable \begin{equation} f(x)=\sum \nolimits _{i=1}^{n} {(a_{i}-a_{i}^{\ast })k(x_{i},x)+b} \end{equation}
\begin{equation} f(x)=\sum \nolimits _{i=1}^{n} {(a_{i}-a_{i}^{\ast })\mathrm {exp}(-\parallel x_{i}-x\parallel ^{2}/2\sigma ^{2})+b} \end{equation}
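As a minimal sketch of the kernel expansion above, the following Python evaluates f(x) with the Gaussian kernel. It assumes the dual coefficients (a_i - a_i^*), the support vectors, and the bias b have already been obtained by some solver; the function and argument names are illustrative and do not come from the paper.

```python
import math


def rbf_kernel(x_i, x, sigma):
    """Gaussian kernel exp(-||x_i - x||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Evaluate f(x) = sum_i (a_i - a_i^*) * k(x_i, x) + b.

    dual_coefs[i] holds the difference (a_i - a_i^*) for support vector i.
    """
    return sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + b
```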
However, the parameters of SVM are determined in advance, usually by cross-validation. An SVM with fixed parameters is therefore ill-suited to constantly changing financial high-frequency data.
The Proposed Adaptive SVR
In this section, we first introduce the overall architecture of the proposed adaptive SVR. We then describe the details of the proposed method.
A. Overall Architecture
Stock prices change considerably over time, so an SVR with fixed parameters has difficulty adapting to these changes. Online learning consumes a large amount of computing resources, and its execution speed cannot meet the demand. We set up a threshold
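The general idea of a threshold-triggered update can be sketched as follows. The exact threshold definition used in this paper is not recoverable from the text here, so the sketch assumes the trigger is the absolute error of the latest prediction. The `fit` interface, the window length, and the toy mean-level model are all hypothetical stand-ins; in the proposed method the refit step would re-optimize the SVR parameters with PSO.

```python
def adaptive_forecast(observations, fit, threshold, window):
    """Predict one step ahead and refit only when the error exceeds threshold.

    fit(history) must return a callable that maps a history list to the next
    predicted value (a hypothetical interface, not the paper's).
    """
    history = list(observations[:window])
    model = fit(history)
    predictions, refits = [], 0
    for actual in observations[window:]:
        predicted = model(history)
        predictions.append(predicted)
        history.append(actual)
        # Assumed trigger: refit on the most recent window when the error is large.
        if abs(actual - predicted) > threshold:
            model = fit(history[-window:])
            refits += 1
    return predictions, refits


def fit_mean(history):
    """Toy stand-in model that always predicts the mean of its training window."""
    level = sum(history) / len(history)
    return lambda _history: level
```

Compared with refitting at every step, this scheme spends computation only when the data appear to have drifted away from the current model.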
B. The Adaptive SVR
1) The Optimization Algorithm Used in SVR
Particle swarm optimization (PSO), originally proposed by Eberhart and Kennedy [19], is an evolutionary computation technique inspired by the foraging behavior of bird flocks. The basic idea of PSO is to find the optimal solution through cooperation and information sharing among individuals. It is simple to implement and has few parameters. PSO has been widely used in function optimization, neural network training, fuzzy system control, and other areas.
In PSO, each potential solution is called a particle. The position of the i-th particle in the n-dimensional space, its flight velocity, and its individual best position are expressed as follows:\begin{align} x_{i}&=(x_{i,1},x_{i,2},\cdots ,x_{i,n})\in R^{n} \\ v_{i}&=(v_{i,1},v_{i,2},\cdots ,v_{i,n})\in R^{n} \\ p_{i}&=(p_{i,1},p_{i,2},\cdots ,p_{i,n})\in R^{n} \end{align}
After obtaining the individual best positions and the global best position, the particles update their velocities and positions according to the following formulas:\begin{align} v_{i}(t+1)&=wv_{i}(t)+c_{1}r_{1}(p_{i}-x_{i}(t))+c_{2}r_{2}(p_{g}-x_{i}(t)) \\ x_{i}(t+1)&=x_{i}(t)+v_{i}(t+1) \end{align} Here w is the inertia weight, c_1 and c_2 are acceleration coefficients, r_1 and r_2 are random numbers drawn uniformly from [0, 1], and p_g is the best position found by the whole swarm.
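The update rules above can be turned into a compact minimizer. The following is a sketch under stated assumptions rather than the authors' implementation: the default values of `w`, `c1`, `c2`, the swarm size, the zero initial velocities, and the clamping of positions to the search bounds are choices made here for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) over a box given as [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p = [xi[:] for xi in x]                      # individual best positions p_i
    p_val = [objective(xi) for xi in x]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    p_g, g_val = p[g_idx][:], p_val[g_idx]       # global best position p_g

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            for d in range(dim):
                # v_i(t+1) = w v_i(t) + c1 r1 (p_i - x_i(t)) + c2 r2 (p_g - x_i(t))
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p[i][d] - x[i][d])
                           + c2 * r2 * (p_g[d] - x[i][d]))
                # x_i(t+1) = x_i(t) + v_i(t+1), clamped to the bounds (an addition).
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = objective(x[i])
            if val < p_val[i]:
                p[i], p_val[i] = x[i][:], val
                if val < g_val:
                    p_g, g_val = x[i][:], val
    return p_g, g_val
```

For SVR tuning, `objective` would map a candidate parameter pair to a validation error, and it is common to search over the logarithms of the parameters so that several orders of magnitude are covered evenly.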
2) The Adaptive SVR
In our proposed method, the PSO is used for optimizing the parameters (\begin{equation} f(C,g)=\arg \min \sum \nolimits _{m}^{M} \sum \nolimits _{n}^{N} {Loss(y_{i}-y_{i}^{\prime }(c_{m},g_{n}))} \end{equation}
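One way to read the objective above is as a search for the candidate pair (c_m, g_n) with the smallest prediction loss. The sketch below uses an exhaustive search over a logarithmic grid purely for illustration, whereas the proposed method uses PSO for this search. The helper names and the toy loss surface are hypothetical; a real `total_loss` would train an SVR with the given pair and return its validation error.

```python
import math


def log_grid(low_exp, high_exp):
    """Candidate values 10**low_exp, ..., 10**high_exp, one per decade."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def argmin_c_g(total_loss, c_grid, g_grid):
    """Return (C, g, loss) for the grid pair with the smallest total loss."""
    best = None
    for c in c_grid:
        for g in g_grid:
            loss = total_loss(c, g)
            if best is None or loss < best[2]:
                best = (c, g, loss)
    return best


def toy_loss(c, g):
    """Hypothetical loss surface with its minimum at C = 10 and g = 0.01."""
    return (math.log10(c) - 1.0) ** 2 + (math.log10(g) + 2.0) ** 2
```

The grid search costs one model fit per pair, which is the expense a swarm-based search tries to avoid when the parameters must be updated repeatedly.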
Problem Analysis
Our goal is to improve the accuracy of financial high-frequency data prediction. In this section, we focus on how to use SVR for forecasting. In order to accurately forecast stock prices with SVR, we studied the impacts of penalty coefficient
A. The Effect of C in Different Stocks or Different Periods
We randomly selected five stocks on the Shanghai Stock Exchange. In the analysis, we employ RMSE, MAPE, MAD to evaluate
Firstly, we study the relationship between the different stocks and the
Subsequently, we validated our assumption that the same stock has different data distributions at different times and that different
B. The Effect of G in Different Stocks or Different Periods
The parameter g also strongly affects the forecasting ability of SVR. We study the effects of g with the same analysis methods as above. First, we study the effects of g at three different time scales over the same period: daily data, 30-minute data, and 5-minute data. Then, we analyze how g changes for the same stock in different periods. To identify how g behaves in SVR, we evaluated the error of the SVR model for g ranging from 10^-5 to 10^4.
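To see why such a wide range of g changes the model so much, the sketch below evaluates the kernel for a fixed pair of points across the sweep. It assumes g is the coefficient in exp(-g * ||x_i - x||^2), which corresponds to g = 1/(2 sigma^2) for the Gaussian kernel given in the Related Work section; this is a common convention and the paper does not state it explicitly in this text. The function names are illustrative.

```python
import math


def rbf_similarity(sq_dist, g):
    """Kernel value exp(-g * squared_distance) under the assumed convention."""
    return math.exp(-g * sq_dist)


def sweep_g(sq_dist, low_exp=-5, high_exp=4):
    """Return (g, kernel value) pairs for g = 10**low_exp, ..., 10**high_exp."""
    return [
        (10.0 ** e, rbf_similarity(sq_dist, 10.0 ** e))
        for e in range(low_exp, high_exp + 1)
    ]
```

With a very small g almost every pair of points looks identical to the kernel, so the model is nearly flat. With a very large g only near-duplicate points have any similarity, so the model tends to memorize the training data.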
Figure 3 shows how the SVR prediction results for daily data change with g. It is clear from Fig. 2 that the prediction results are better when the C value is between 10^-2 and 10^2. However, the best C value differs across stocks. Experiments on 30-minute data and 5-minute data also support our conjecture: the distribution of stock data is inconsistent over time.
From the above analysis, it is concluded that the impacts of
Experiments and Results
A. Dataset and Data Modeling
1) Dataset
The data set is from the Shanghai Stock Exchange and includes SH600006, SH600016, SH600026, SH600036, and SH600056; it can be downloaded from http://www.wstock.net. We choose three time scales (frequencies) to validate our approach: daily data, 30-minute high-frequency data, and 5-minute high-frequency data. The daily data run from the listing date to March 31, 2017. The 30-minute data run from January 1, 2017 to March 31, 2017, and the 5-minute data from February 1, 2017 to February 28, 2017. We use 80% of the data as the training set and 20% as the test set.
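The 80/20 split can be sketched as below. The text does not say how the split is drawn, so this sketch assumes a chronological split, which is the usual choice for time series because it keeps future observations out of the training set. The function name is hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]
```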
2) Data Modeling
Let us denote
B. Experiment Setup and Comparison
1) Experiment Setup
Good parameter settings lead to better performance. Three models in this article have parameters that need to be set: traditional SVR, BPNN, and adaptive SVR. We describe the parameter settings used in this experiment so that the results can be reproduced. In the traditional SVR model, the parameters are
We use the same parameter settings as the traditional SVR above as the initial values of the adaptive SVR. And the threshold of the adaptive SVR is
2) Evaluation Metrics
To evaluate the proposed method, we apply three widely used quality indexes: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Mean Absolute Deviation (MAD). The formulas are as follows:\begin{align} RMSE&=\sqrt {\frac {1}{N}\sum \nolimits _{i=1}^{N} ({observed}_{i}-{predicted}_{i})^{2} } \\ MAPE&=\frac {100}{N}\sum \nolimits _{i=1}^{N} \left |{ \frac {{observed}_{i}-{predicted}_{i}}{{observed}_{i}} }\right | \\ MAD&=\frac {1}{N}\sum \nolimits _{i=1}^{N} \left |{ {observed}_{i}-{predicted}_{i} }\right | \end{align}
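The three metrics translate directly into code. The sketch below follows the formulas above term by term; the function names are illustrative, and the inputs are assumed to be equal-length sequences with nonzero observed values (MAPE is undefined otherwise).

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) * 100.0 / n


def mad(observed, predicted):
    """Mean absolute deviation."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n
```

RMSE penalizes large errors more heavily than MAD, while MAPE is scale-free and so allows comparison across stocks with different price levels.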
3) Performance Comparison
To verify the effectiveness and robustness of the proposed method, we performed experiments on different stocks and different time scales. The BPNN model adopted by [11] and the traditional SVR model, both commonly used for prediction, serve as comparison methods. In the following, we discuss the experimental results in detail.
The first experiment is performed on the 5-minute data set. We compared the performance of BPNN, traditional SVR, and adaptive SVR (AD-SVR) using RMSE, MAPE, and MAD. The detailed results are shown in Table 2. AD-SVR gives the best results, followed by BPNN, and SVR is the worst. Table 2 shows that the forecasts for all five stocks improve substantially. The smallest improvement is for SH26, whose RMSE, MAPE, and MAD are 11% lower than those of traditional SVR; AD-SVR is also better than BPNN on this stock. The largest improvement is for SH56, whose error is reduced by about 60% compared with traditional SVR and by about 45% compared with BPNN.
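The percentage improvements quoted here are assumed to be relative reductions with respect to the baseline error; the text does not spell out the formula. Under that assumption, the computation is:

```python
def error_reduction_pct(baseline_error, new_error):
    """Relative error reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_error == 0:
        raise ValueError("baseline_error must be nonzero")
    return (baseline_error - new_error) / baseline_error * 100.0
```

For instance, a hypothetical baseline error of 0.50 that falls to 0.20 corresponds to a reduction of about 60%; these two numbers are illustrative and are not taken from Table 2.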
To compare the results intuitively, we show one of our experimental results in Figure 2. The figure illustrates the predictions of SVR, BPNN, and AD-SVR together with the true values, which range from 21.66 to 22.47. The AD-SVR predictions are closer to the true values than those of SVR and BPNN, especially where peaks appear. As time goes on, the SVR predictions become progressively worse. As discussed in Section 3, the reason is that changes in the data distribution degrade the prediction performance of SVR.
The second experiment is conducted on the 30-minute data set. The errors of SVR, BPNN, and adaptive SVR are shown in Table 3. Compared with traditional SVR, the improvements are more pronounced than on the 5-minute data set. The smallest improvement is for SH06, whose error is reduced by about 15%. The largest is for SH56, whose RMSE, MAPE, and MAD are 79% lower than those of traditional SVR. For BPNN, it is difficult to draw conclusions from this experiment. Its results on SH16, SH26, and SH36 are better than those of traditional SVR, but its result on SH56 is worse. This could be because BPNN is affected by its initialization parameters. AD-SVR outperforms BPNN on every stock except SH36, and it is more robust than BPNN because its parameters are not random.
Similarly, we show one of our results in Figure 5. Compared with the 5-minute data, the 30-minute data fluctuate significantly, ranging from 21.63 to 23.59. As Figure 5 shows, the first 25 predictions of SVR, BPNN, and AD-SVR are almost the same as the true values. As time goes by, the predictions of SVR and BPNN become progressively worse, whereas the AD-SVR predictions continue to move up and down around the true values. In Figure 5, the red line and the black line almost coincide.
The third experiment is performed on the daily data set, and the results are shown in Table 4. The results are similar to those of the previous experiments. Compared with SVR, the errors of AD-SVR are reduced by 8% to 41%, depending on the stock. Note that the training set of adaptive SVR is much smaller than that of traditional SVR in order to improve the speed of the program. BPNN is better than SVR on SH06 and SH26 and worse than SVR on SH36 and SH56. AD-SVR, however, outperforms both BPNN and SVR.
Figure 6 shows the results for SH56 on daily data, where the price ranges from 9.91 to 25.10. The behavior is consistent with the experiments above: the results of SVR and BPNN get progressively worse, and several peaks of the green and blue lines deviate greatly from the true values.
These experiments show that BPNN is easily affected by its random initialization parameters, so it is difficult for it to perform well on all stocks, and that traditional SVR is ill-suited to changing stock data. AD-SVR, in contrast, adapts robustly to changing conditions (different stocks, different periods, different time scales, and so on). The experimental results indicate the effectiveness and reliability of the AD-SVR model.
Conclusion
In this paper, an adaptive SVR based on PSO is proposed to enhance the versatility of the model and to avoid manual tuning of SVR parameters. We tested the proposed algorithm at three time scales on stocks from the Shanghai Stock Exchange. The results show that the adaptive SVR adapts better and predicts more accurately than traditional SVR and BPNN. We did not take the impact of historical models into account in this experiment; a weighted adaptive SVR will be introduced in our future work.
ACKNOWLEDGMENT
Yanhui Guo and Siming Han contributed equally to this work.