Introduction
In the last few decades, there has been a surge of interest in tourism demand research among academics and practitioners. Although tourism flows grow steadily over the medium and long term, rapid short-term fluctuations are usually caused by various uncertain obstacles, such as financial crises, regional conflicts and epidemic diseases [1], [2]. These obstacles pose a great challenge to short-term forecasting. Moreover, the perishable nature of tourism products, including vacant hotel rooms, unsold tourism services and surplus employees, can lead to inevitable financial losses without rational market demand guidance [3], [4]. From a macro perspective, scientific tourism demand forecasting can provide a reliable reference for tourism-related departments to formulate tourism policies and sustainable development strategies; from a micro perspective, it can guide relevant tourism enterprises to make development plans and allocate tourism resources rationally [5]. Therefore, achieving reasonable analysis and credible forecasting of short-term tourism demand has always been a goal to strive for [6].
As we all know, the change of tourism demand is affected by many factors. As stated by Song et al. [7], numerous influence factors related to both origins and destinations, such as tourist income level, tourism prices in a destination or in competing destinations, and the exchange rate, are central to tourism demand analysis; this view is also supported by Jiao et al. [8]. Furthermore, many researchers believe that crucial influence factors can be used to establish potential mapping relationships that make up for the insufficiency of predictors, thereby improving prediction accuracy [9], [10]. Notably, the prices of competing European city destinations and tourist income [1], the exchange rate and tourism prices in competing destinations [11], and other similar influencing factors have been employed to forecast tourism demand. Typically, Ulrich et al. [1] introduced the prices of competing European city destinations and tourist income as influencing factors and combined them with a Bayesian VAR model. Jiao et al. [11] used the exchange rate and tourism prices in competing destinations to forecast inbound visitor arrivals to 26 European cities. However, due to the low availability and time lag of influence factor data, their extraction and utilization efficiency is poor. Many existing studies that incorporate influence factors mainly analyze annual changes in tourism demand, and most predictors used are relatively simple and less precise. Therefore, this study constructs a set of composite influence factors based on the exchange rate, the consumer price index (CPI) and other indicators to analyze monthly fluctuations of tourist arrivals. The aim is to mine and capture the correlation between the inherent components of historical data and various influence factors. On this basis, the next problem is how to control the difficulty of the prediction task. Undoubtedly, introducing exogenous variables is necessary for improving prediction accuracy, but it inevitably increases the systematic error of the prediction approach. How to solve this technical problem of multi-factor fusion and balance the relationship between these factors is the primary goal of this paper.
In general, for univariate data (i.e. historical tourism demand data), most current studies attempt to denoise the tourist flow time series and transform its irregular fluctuations into a smoother form that is easier to capture and understand, with the aim of improving forecasting performance [12], [13]. In the field of short-term time series prediction, one popular data preprocessing technique is to extract empirical mode features at different time scales by empirical mode decomposition (EMD) [14]. Related derivative techniques then came into being, such as wavelet decomposition (WD) [15], ensemble EMD (EEMD) [16], [17] and complete EEMD with adaptive noise (CEEMDAN) [2]. Their main principle is to extract intrinsic mode functions (IMFs) according to the frequencies and time scales of the data, which can approximately represent the law and trend of the data at different stages. For a single historical tourist time series, the goal of such processing is to explore the potential periodic rules, trends and complex changes within the data. However, some emergencies and external factors often have a significant impact on the tourism market that cannot be ignored. These uncertain events (white noise) adversely affect the internal development of the tourism industry and make it difficult to grasp the future trend of tourist flow [6]. Although introducing influence factors into the decomposition strategy to explain these outliers (i.e. emergencies) can effectively capture the dynamics of tourist flow and improve the accuracy of prediction results, a new issue subsequently arises: how to fuse the influence factors into the tourism data through the preprocessing methods.
Generally speaking, tourism demand data contain regular (trend, cycle, and season) and irregular components. The irregular components, driven by influence factors such as changes in tourist income, the exchange rate, and travel intention, are the main cause of fluctuations in tourist arrivals. To solve the issue of multi-source data fusion, this study innovatively proposes a decomposition ensemble learning framework that integrates the influence factors into the irregular components of tourism demand data. Usually, forecasting the most complex feature components is regarded as the most difficult challenge and as the key to improving forecasting accuracy. Existing studies show that the irregularity contained in these IMFs is generally caused by economic development, policy orientation, the international situation, and other factors [18]. Given the importance and availability of such factors in prediction, this paper investigates the impact of certain economic factors on tourist flow. First of all, fluctuations in the exchange rate have the most direct impact on travel costs, especially for overseas tourists [19]. For example, devaluation of the domestic currency increases the cost of studying and shopping abroad, but for foreign tourists it reduces the cost of international travel and enhances their willingness to journey. Secondly, the consumer price index (CPI) is a macroeconomic indicator reflecting changes in household consumption levels over a certain period. It reflects not only the economic growth capacity of a country or region, but also the willingness of residents to consume and travel [20]. The third factor to be considered is personal or household consumption expenditure or income, which plays a similar role to CPI and includes a large proportion of spending on transport and travel [21]. It reflects short-term intentions of consumers without delay, whereas CPI has a time lag. Considering these three factors that cause short-term tourism demand to fluctuate, this study combines them with the IMFs containing the complex irregularity, thereby mining their internal correlation by machine learning (ML).
At present, the most popular ML approach is deep learning, which mainly adopts large-scale distributed parallel processing and extends the shallow neural network to more than two nonlinear mapping layers by imitating the functioning of the human brain [22]. Its effectiveness in various situations has been verified by many researchers [23], [24]. Among these models, the bidirectional gated recurrent unit (BiGRU) obtains stronger self-learning ability by alleviating the vanishing-gradient problem while retaining long-term information, and it is considered an excellent predictor for mining potential dependence in historical data [25]. Hence, the BiGRU model is employed to explore the relationship between potential factors and serves as the bridge for multi-factor fusion in this study.
Usually, for each IMF, various ML models can be adopted as predictors, such as the auto-regressive integrated moving average (ARIMA) [16], the nonlinear autoregressive neural network with exogenous input (NARX) [15], and the back-propagation neural network (BPNN) [14]. However, each predictor of tourism demand has its own advantages and drawbacks. Although some of them have been widely used in empirical analyses according to their respective strengths, no single model can deal with all types of IMF components. Therefore, it is necessary to design a model matching strategy that selects an appropriate predictor based on the characteristics of each component [3], [8]. Some studies have found that the low-complexity components show two characteristics, namely a trend pattern and a periodic pattern. The former represents the long-term trend of tourism demand with low complexity and low volatility, and the future trend of such components is easy to grasp [13], [26]; therefore, we use the classical ARIMA model to simulate the linear features of these data. The latter, although also of low complexity, exhibits nonlinear characteristics, so a nonlinear mapping model is appropriate for these components. Support vector regression (SVR) is a prediction model developed on the basis of statistical learning theory; some researchers have found that it has stronger competitiveness and generalization ability on small-sample data, especially in tourism demand, and that its prediction results are closer to the actual trend [27], [28]. Hence, SVR is adopted to deal with the decomposed components with low complexity and obvious nonlinear characteristics. Notably, a well-designed prediction model matching strategy is greatly conducive to reducing systematic error, and determining an excellent predictor is vital to effectively mine the association information contained in the various historical data [29].
In the face of complex datasets related to tourism demand, comprehensively considering various influence factors is a meaningful attempt. Some short-term factors affecting travel are closely related to the short-term fluctuations of the tourism market, and introducing them makes it possible to capture future fluctuations on the basis of more comprehensive information. However, the difficulty and systematic error of prediction are then bound to increase. To solve this issue, the complex task is divided into several modules; it has been confirmed that this divide-and-conquer strategy can deal with intractable problems and achieve better results [16]. On the basis of an in-depth understanding of the historical data and its related influence factors, an information architecture that fuses the related influence factors, which cause strong fluctuations, with the complex characteristic components of tourism demand constitutes the core idea of our research.
Inspired by the above research, this study proposes a novel multi-factor hybrid ensemble learning approach (ICEEMDAN-LXBSA) based on the improved CEEMDAN (ICEEMDAN), intrinsic mode functions (IMFs) classification and multi-factor fusion. First, aiming to reduce the complexity of the original data, ICEEMDAN is used to decompose the monthly tourist flow datasets into several IMFs. Second, based on range entropy (RE) and the maximal information coefficient (MIC), the obtained IMFs are classified into three categories: high-complexity (HC), low-complexity low-correlation (LCLC) and low-complexity high-correlation (LCHC) components. Third, several influence factors (the exchange rate, the consumer price index (CPI) and personal or household consumption expenditure or income) are fused with the high-complexity components, and the least absolute shrinkage and selection operator (LASSO) is used to filter out redundant variables, thereby obtaining an effective feature combination with strong predictive power; BiGRU is adopted as the predictor to fit the resulting dynamic relationship. Furthermore, the SVR and ARIMA models are employed to forecast the LCHC and LCLC components, respectively. Finally, the final result is obtained by summing up all the individual predictions. To verify the effectiveness of the proposed approach, an empirical experiment on the monthly tourist arrivals from three main origins (USA, Korea and Japan) to Shanghai is conducted to analyze the related issues.
Methodology
In this section, we first introduce the decomposition algorithm, the classification method, and the prediction models. Then, the procedure of the proposed ensemble learning approach is presented in detail. Finally, the tourism demand forecasting model is described.
A. Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN)
As a derivative of empirical mode decomposition (EMD), the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) was developed by Colominas et al. [30]. EMD is an empirical signal decomposition method introduced to analyze nonlinear and nonstationary signals, which decomposes a signal $x$ into a set of intrinsic mode functions. In ICEEMDAN, let $E_{k}(\cdot)$ denote the operator that extracts the $k$-th mode obtained by EMD, $M(\cdot)$ the operator that computes the local mean of a signal, $\langle \cdot \rangle$ the action of averaging over the $N$ noise realizations, and $w^{(n)}$ $(n=1,\cdots,N)$ realizations of Gaussian white noise. The procedure is as follows.

Step 1: Get $N$ new signals $x^{(n)}=x+\beta _{0}E_{1}(w^{(n)})$ by adding scaled Gaussian white noise, where $\beta _{0}=\epsilon\, std(x)/std(E_{1}(w^{(n)}))$. The first residue $r_{1}$ is obtained by applying the operator $M(\cdot)$ and averaging over the $N$ new signals:
\begin{equation*} r_{1}=\langle M(x^{(n)})\rangle\tag{1}\end{equation*}

Step 2: When $k=1$, the first decomposed mode $imf_{1}$ is calculated:
\begin{equation*} imf_{1}=x-r_{1}\tag{2}\end{equation*}

Step 3: When $k=2$, the second residue is evaluated by averaging the local mean of $r_{1}+\beta _{1}E_{2}(w^{(n)})$, where the adaptive adjustment parameter of the Gaussian white noise is $\beta _{k}=\epsilon _{k}\, std(r_{k})$, $k=1,\cdots,K$. The second decomposed mode is defined as follows:
\begin{equation*} imf_{2}=r_{1}-r_{2}=r_{1}-\langle M(r_{1}+\beta _{1}E_{2}(w^{(n)}))\rangle\tag{3}\end{equation*}

Step 4: The remaining residues $r_{k}$ $(k=3,\cdots,K)$ are calculated in the same manner:
\begin{equation*} r_{k}=\langle M(r_{k-1}+\beta _{k-1}E_{k}(w^{(n)}))\rangle\tag{4}\end{equation*}
and the $k$-th decomposed mode $imf_{k}$ is obtained by:
\begin{align*} imf_{k}=r_{k-1}-r_{k}=r_{k-1}-\langle M(r_{k-1}+\beta _{k-1}E_{k}(w^{(n)}))\rangle \tag{5}\end{align*}

Step 5: Return to Step 4 until the stopping conditions for the intrinsic mode functions are met. The final residue satisfies:
\begin{equation*} r_{n} = x-\sum _{k=1}^{K}imf_{k}\tag{6}\end{equation*}

Therefore, the signal $x$ can be expressed as $x = \sum _{k=1}^{K} imf_{k} + r_{n}$. In the field of time series analysis, ICEEMDAN achieves a better spectral separation of the IMFs and, at the same time, effectively extracts intrinsic information with different features, which reduces the difficulty of the prediction tasks. Therefore, ICEEMDAN is used as the decomposition tool for the tourism demand time series in this study.
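For readers who wish to reproduce the decomposition step, the following Python sketch outlines the ICEEMDAN recursion of Eqs. (1)-(6). It is a minimal reference implementation, not the exact code used in this study; it assumes the PyEMD ("EMD-signal") package provides the inner EMD operator, and the noise amplitude `eps`, number of realizations and maximum number of modes are illustrative values.

```python
import numpy as np
from PyEMD import EMD  # assumption: the PyEMD ("EMD-signal") package is installed


def iceemdan(x, n_real=50, eps=0.2, max_imfs=8, seed=0):
    """Minimal sketch of the ICEEMDAN recursion in Eqs. (1)-(6)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    emd = EMD()

    def emd_mode(signal, k):
        # E_k(.): the k-th EMD mode of `signal` (zeros if fewer modes exist)
        modes = emd(signal)
        return modes[k - 1] if len(modes) >= k else np.zeros_like(signal)

    def local_mean(signal):
        # M(.): local mean, approximated as the signal minus its first EMD mode
        return signal - emd_mode(signal, 1)

    noises = [rng.standard_normal(len(x)) for _ in range(n_real)]

    # Eq. (1): first residue averaged over the noise-perturbed realizations,
    # with beta_0 = eps * std(x) / std(E_1(w))
    realizations = []
    for w in noises:
        e1 = emd_mode(w, 1)
        realizations.append(local_mean(x + eps * np.std(x) / np.std(e1) * e1))
    r = np.mean(realizations, axis=0)
    imfs = [x - r]                                   # Eq. (2)

    for k in range(2, max_imfs + 1):                 # Eqs. (3)-(5)
        beta = eps * np.std(r)                       # beta_k = eps_k * std(r_k)
        r_next = np.mean([local_mean(r + beta * emd_mode(w, k)) for w in noises],
                         axis=0)
        imfs.append(r - r_next)
        r = r_next

    return np.array(imfs), r                         # final residue as in Eq. (6)
```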
B. Intrinsic Mode Functions (IMFs) Classification Method
For the different intrinsic mode functions (IMFs), we first employ range entropy (RE) to evaluate the complexity of each time series. Then, the least absolute shrinkage and selection operator (LASSO) is used to filter both the high-complexity components and the preselected influence factors, and the correlation coefficients between the low-complexity components and the original time series are calculated by the maximal information coefficient (MIC) method for further partitioning. The specific methods are introduced as follows.
1) Range Entropy (RE)
Approximate entropy (AE) and sample entropy (SE) are the most commonly used methods for measuring complexity in modern signal processing (especially time series analysis). In fact, SE is derived from AE and overcomes some of its limitations, including its inconsistency with respect to the embedding dimension
2) Least Absolute Shrinkage and Selection Operator (Lasso)
The LASSO is a compression estimation technique for dimensionality reduction of high-dimensional data [34]. Its core principle is to construct a penalty function that forces the absolute values of the estimated parameters to stay within a preset threshold, thus compressing the regression coefficients of the model. A smaller threshold compresses some regression coefficients to 0 and eliminates the variables with zero coefficients, thereby realizing variable filtering. The LASSO method is equivalent to adding the penalty term $\lambda \sum _{j=1}^{d}|\beta _{j}|$ to the least squares objective:\begin{equation*} \hat {\beta }_{LASSO}=\arg \min _{\beta \in R^{d}}\left ({||Y-X\beta ||^{2}+\lambda \sum _{j=1}^{d}|\beta _{j}|}\right)\tag{7}\end{equation*}
Equivalently, \begin{align*}&\hat {\beta }_{LASSO}=\arg \min _{\beta \in R^{d}}\left ({||Y-X\beta ||^{2}}\right) \\&~\qquad \qquad s.t.\sum _{j=1}^{d}|\beta _{j}|\leq t\tag{8}\end{align*}
Let $t\geq 0$ denote the threshold in (8); it corresponds one-to-one with the penalty parameter $\lambda$ in (7), and a smaller $t$ (i.e., a larger $\lambda$) shrinks more regression coefficients to zero.
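To illustrate how the LASSO filter of Eq. (7) can be applied to a lagged design matrix, the sketch below uses scikit-learn; the lag structure, the candidate grid for the adjustment coefficient $\lambda$ (called `alpha` in scikit-learn) and the variable names are illustrative assumptions rather than the settings used in this paper.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# X: lagged predictors (historical lags plus influence-factor lags), y: target
rng = np.random.default_rng(1)
X = rng.standard_normal((150, 12))          # placeholder design matrix
y = X[:, 0] * 0.8 - X[:, 3] * 0.5 + rng.standard_normal(150) * 0.1

# Grid search for the adjustment coefficient (lambda in Eq. (7))
search = GridSearchCV(Lasso(max_iter=10000),
                      param_grid={"alpha": np.logspace(-4, 0, 20)},
                      cv=TimeSeriesSplit(n_splits=5),
                      scoring="neg_mean_squared_error")
search.fit(X, y)

lasso = search.best_estimator_
selected = np.flatnonzero(lasso.coef_)       # variables with non-zero coefficients
print("chosen alpha:", search.best_params_["alpha"], "selected columns:", selected)
```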
3) Maximal Information Coefficient (MIC)
In regard to the calculation of correlation, MIC is used in this paper to analyze or measure the linear or nonlinear association strength between each IMF and the original series. The MIC can find not only linear functional relationships between variables but also nonlinear ones (such as exponential or periodic relationships). The larger the MIC value, the higher the correlation. For the paired data $(X_{m}, Y)$, the MIC is defined as:\begin{equation*} MIC(X_{m};Y)=\max _{|X_{m}||Y|< B}\frac {I(X_{m};Y)}{\log _{2}(\min (|X_{m}|,|Y|))}\tag{9}\end{equation*}
where $I(X_{m};Y)$ is the mutual information computed on a grid partition of the data, $|X_{m}|$ and $|Y|$ are the numbers of bins in the two directions of the grid, and $B$ is the upper bound on the grid size.
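One possible way to compute the MIC of Eq. (9) in Python is through the minepy package, as sketched below; the parameters `alpha` and `c` are the library's documented defaults and the toy series are purely illustrative.

```python
import numpy as np
from minepy import MINE  # assumption: the minepy package is installed


def mic(x, y, alpha=0.6, c=15):
    """Maximal information coefficient between two series (Eq. (9))."""
    mine = MINE(alpha=alpha, c=c)
    mine.compute_score(np.asarray(x, dtype=float), np.asarray(y, dtype=float))
    return mine.mic()


# Example: correlation between one IMF and the original series
rng = np.random.default_rng(0)
original = rng.standard_normal(192).cumsum()
imf = original + rng.standard_normal(192) * 0.3   # a noisy, highly correlated component
print("MIC(imf, original) =", round(mic(imf, original), 3))
```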
C. Forecasting Models
Each prediction model has its advantages and disadvantages. To improve the forecasting accuracy of tourism demand, it is necessary to select a suitable predictor for each component based on the strength of different models and the data feature. In this paper, bidirectional gated recurrent unit (BiGRU) is used to forecast the high-complexity components, while ARIMA and SVR models are applied to fit the future trend of low-complexity low-correlation (LCLC) and low-complexity high-correlation (LCHC) components, respectively.
1) Autoregressive Integrated Moving Average (ARIMA)
The ARIMA model is one of the most common regression models in time series forecasting. ARIMA($p$, $d$, $q$) combines an autoregressive process of order $p$, a $d$-order differencing operation, and a moving average process of order $q$; after differencing, the series can be expressed as:\begin{align*} x_{t}=\alpha _{1}x_{t-1}+\cdots +\alpha _{p}x_{t-p}+\delta +\beta _{1}\epsilon _{t-1}+\cdots +\beta _{q}\epsilon _{t-q} \tag{10}\end{align*}
where $\alpha _{i}$ and $\beta _{j}$ are the autoregressive and moving average coefficients, $\delta$ is a constant term, and $\epsilon _{t}$ is a white noise error term.
There are three iterative stages in building an ARIMA model. At the first stage, the time series needs to be stabilized; an unstable time series usually requires $d$-order differencing to become stationary. At the second stage, the orders $p$ and $q$ are identified, typically from the autocorrelation and partial autocorrelation functions or an information criterion. At the third stage, the parameters are estimated and the residuals are checked for model adequacy.
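The three-stage procedure can be carried out, for instance, with statsmodels as sketched below; the order $(p, d, q)$ and the toy series are placeholders, since in practice the order would be chosen from the ACF/PACF plots or an information criterion on the actual component.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
series = pd.Series(np.cumsum(rng.standard_normal(150)))   # placeholder monthly component

# Stage 1: check stationarity; difference once if the ADF test does not reject a unit root
d = 0 if adfuller(series)[1] < 0.05 else 1

# Stages 2-3: identify and estimate the model, then forecast six steps ahead
model = ARIMA(series, order=(2, d, 1))        # (p, d, q) chosen for illustration only
result = model.fit()
print(result.forecast(steps=6))
```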
2) Support Vector Regression (SVR)
The SVR model, as a supervised learning technique, was developed by Sain et al. [36]. Based on the principle of structural risk minimization, SVR uses a kernel function to find, in a high-dimensional feature space, the regression hyperplane that best balances learning ability and generalization ability. Compared with other models, SVR has the advantages of fast convergence and multidimensional function estimation. For a set of learning samples $(x_{i}, y_{i})$, the regression function is:\begin{equation*} f(x)=\omega ^{T}\phi (x_{i})+b\tag{11}\end{equation*}
where $\omega$ is the weight vector, $\phi (\cdot)$ the feature mapping, and $b$ the bias.
Typically, the kernel function of SVR is employed to handle the mapping relationship in the high-dimensional feature space; in this study, the linear kernel function is adopted. The model parameters are obtained by solving:\begin{align*} \min _{\omega,b} \frac {1}{2}\omega ^{T}\omega +C\sum _{i=1}\max \left ({0,|y_{i}-(\omega ^{T}\phi (x_{i})+b)|-\epsilon }\right) \tag{12}\end{align*}
where $C$ is the regularization parameter and $\epsilon$ defines the width of the insensitive band.
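A hedged scikit-learn sketch of Eqs. (11)-(12) is given below; the linear kernel follows the text, while the values of `C` and `epsilon`, the scaling step and the toy data are illustrative choices rather than the settings used in the experiments.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 6))                 # lagged inputs of a low-complexity IMF
y = X @ np.array([0.6, 0.2, 0.0, -0.3, 0.0, 0.1]) + rng.standard_normal(120) * 0.05

# Linear-kernel SVR with the epsilon-insensitive loss of Eq. (12)
svr = make_pipeline(StandardScaler(), SVR(kernel="linear", C=10.0, epsilon=0.01))
svr.fit(X[:100], y[:100])                         # train on the first 100 samples
print(svr.predict(X[100:105]))                    # predictions on held-out samples
```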
3) Bidirectional Gated Recurrent Unit (BiGRU)
As a variant of the long short-term memory (LSTM), the gated recurrent unit (GRU) has fewer parameters and faster convergence while maintaining the good learning performance of LSTM. Differing from LSTM, GRU replaces the input gate and forget gate of LSTM with an update gate, and its internal structure consists of an update gate and a reset gate (Figure 1). Among them, the update gate ($z_{t}$) controls how much past information is carried into the current state, and the reset gate ($r_{t}$) determines how much past information is forgotten. The gates and hidden state are computed as:\begin{align*} r_{t}=&\sigma (W_{r}x_{t}+U_{r}h_{t-1}) \tag{13}\\ z_{t}=&\sigma (W_{z}x_{t}+U_{z}h_{t-1}) \tag{14}\\ \widetilde {h}_{t}=&\tanh (Wx_{t}+r_{t}Uh_{t-1}) \tag{15}\\ h_{t}=&(1-z_{t})h_{t-1}+z_{t}\widetilde {h}_{t}\tag{16}\end{align*}
where $x_{t}$ is the input at time $t$, $h_{t}$ the hidden state, $\sigma$ the sigmoid function, and $W$, $U$ the weight matrices.
BiGRU extends the GRU concept with forward and backward transmission of the hidden layers, so that the output layer includes information from both previous and following states. Its basic structure is shown in Figure 2. The idea of BiGRU is that the current hidden state ($h_{t}$) is determined jointly by the forward hidden state $\overrightarrow {h}_{t}$ and the backward hidden state $\overleftarrow {h}_{t}$:\begin{align*} \overrightarrow {h}_{t}=&GRU(x_{t},\overrightarrow {h}_{t-1}) \tag{17}\\ \overleftarrow {h}_{t}=&GRU(x_{t},\overleftarrow {h}_{t-1}) \tag{18}\\ h_{t}=&\alpha _{t}\overrightarrow {h}_{t}+\beta _{t}\overleftarrow {h}_{t}+b\tag{19}\end{align*}
where $\alpha _{t}$ and $\beta _{t}$ are the weights of the forward and backward hidden states, respectively, and $b$ is the bias.
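A BiGRU predictor in the sense of Eqs. (13)-(19) can be expressed, for example, with Keras as sketched below; the layer sizes, window length, feature count and training settings are assumptions for illustration and are not the hyperparameters reported in this paper.

```python
import numpy as np
from tensorflow.keras import layers, models

WINDOW, N_FEATURES = 12, 4   # e.g. 12 lags of a fused series: one HC component plus three factors

model = models.Sequential([
    layers.Bidirectional(layers.GRU(32), input_shape=(WINDOW, N_FEATURES)),  # forward + backward GRU
    layers.Dense(16, activation="relu"),
    layers.Dense(1),                       # next-month value of the HC component
])
model.compile(optimizer="adam", loss="mse")

# Toy data shaped (samples, window, features); replace with the fused HC series
X = np.random.rand(200, WINDOW, N_FEATURES).astype("float32")
y = np.random.rand(200, 1).astype("float32")
model.fit(X, y, epochs=5, batch_size=16, verbose=0)
print(model.predict(X[:3], verbose=0))
```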
D. Procedure of the Proposed Model
Generally, the exchange rate, tourist income, and the consumer price index (CPI) can cause short-term changes in tourists' traveling intention. To study the changes of overseas tourist arrivals in Shanghai, a novel multi-factor hybrid ensemble learning framework is established to forecast the monthly overnight tourist flow from three major origins (the USA, Japan, and Korea). This hybrid approach integrates a decomposition ensemble strategy and multi-factor fusion to capture short-term fluctuations. The overall framework is shown in Figure 3. It consists of five fundamental steps: data decomposition, intrinsic mode functions classification, multi-factor fusion, individual prediction and ensemble of results.
Step 1: Historical data decomposition. By using ICEEMDAN, the tourism demand time series are decomposed into several intrinsic mode functions with different frequencies and characteristics (denoted as $IMF_{i}$, $i=1,2,\cdots,n$).

Step 2: Intrinsic mode functions (IMFs) classification. The IMFs obtained by ICEEMDAN are first classified into high-complexity (HC) and low-complexity (LC) components according to the range entropy (RE) testing results. Then, by calculating the correlation between each low-complexity component and the original time series with the maximal information coefficient (MIC), the LC components are divided into two categories, namely low-complexity and high-correlation (LCHC) components and low-complexity and low-correlation (LCLC) components.

Step 3: Multi-factor fusion. The HC components Y are usually caused by various short-term changes of the influence factors. Here, the three preselected influence factors X, including the exchange rate, the consumer price index (CPI) and personal or household consumption expenditure or income, are combined with each HC component to generate an information matrix with 4 rows and $n$ columns, where 4 represents the number of series (the HC component and X) and $n$ represents the length of each training series. Meanwhile, the same lag length is set for each column of the information matrix. To eliminate the negative influence of redundant components, LASSO, a feature extraction method, is used to screen out the efficient components from each row of the matrix, thereby also determining a reasonable input structure for each predictor; a combination of the grid search method and LASSO with SVR is developed to determine the optimal adjustment coefficient. In detail, for each row vector of the matrix, the extracted components are retained and the others are set to 0. Then, all row vectors processed in this way are arranged into a single time series from the first row to the last row (a hedged code sketch of this step is given after this list).

Step 4: Individual prediction. Based on the previous results, three individual predictors (BiGRU, SVR, and ARIMA) are selected to forecast each time series. In detail, BiGRU is adopted to forecast the time series fusing the influence factors X and the HC components Y, while SVR and ARIMA are applied to fit the future trends of the LCHC and LCLC components, respectively.

Step 5: Ensemble of results. The final result is obtained by summing the prediction results (PR) of the individual predictions for each time series in Step 4.
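As a concrete illustration of Step 3, the sketch below builds the 4-row information matrix (one HC component plus three influence factors), applies LASSO to the lagged values to judge which rows carry predictive power for the HC component, zeros out the redundant rows, and concatenates the rows into a single fused series. This is only one possible reading of the step: the lag length, the penalty value and all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso


def fuse_hc_component(hc, factors, n_lags=6, alpha=0.01):
    """Possible reading of Step 3: LASSO-based filtering of a 4 x n information matrix."""
    hc = np.asarray(hc, dtype=float)
    matrix = np.vstack([hc] + [np.asarray(f, dtype=float) for f in factors])  # 4 x n matrix
    rows, n = matrix.shape

    # Lagged design: at each time t, the features are the last n_lags values of every row
    X = np.array([matrix[:, t - n_lags:t].reshape(-1) for t in range(n_lags, n)])
    y = hc[n_lags:]

    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    coef = lasso.coef_.reshape(rows, n_lags)          # lag coefficients, one block per row

    filtered = matrix.copy()
    for i in range(rows):
        if not np.any(coef[i]):                       # row judged redundant by LASSO
            filtered[i, :] = 0.0
    return filtered.reshape(-1)                       # rows concatenated from first to last
```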
Empirical Study
In this section, the inbound overnight tourist volume from three major origins to Shanghai is selected to verify the effectiveness of our approach. First, the time series of three major origins and their influence factors are described and analyzed. Then, the detailed experiment design is elaborated. Finally, the experimental results are provided based on different evaluation criteria.
A. Data Description
Shanghai attracts a large number of international tourists annually. According to the Wind Database (https://www.wind.com.cn/) and the Shanghai Bureau of Statistics (https://tjj.sh.gov.cn/), Japan, the United States, and Korea topped the list of inbound foreign travelers to Shanghai, accounting for 40.75% of the total foreign tourists in 2019. As the focus of this study, the monthly overnight tourist flows from the three major origins (USA, Japan, and Korea) to Shanghai and their influence factors are selected (denoted as USA-SH, Japan-SH and Korea-SH, respectively) because of data availability and completeness (see Figure 4). The influence factors of the tourist flow from the USA to Shanghai include personal consumption expenditure (PCE(USA): monthly) and the consumer price index (CPI(USA): monthly) in the USA, and the monthly average exchange rate (USD/CNY). The influence factors of the tourist flow from Japan to Shanghai involve household (two persons and more) consumption expenditure (HCE(Japan): monthly) and the consumer price index (CPI(Japan): monthly) in Japan, and the monthly average exchange rate (100JPY/CNY). The influence factors of the tourist flow from Korea to Shanghai cover household income (HI(Korea): monthly) and the consumer price index (CPI(Korea): monthly) in Korea, and the monthly average exchange rate (CNY/KRW). The sample data for the tourist flows from the USA and Korea to Shanghai and their influence factors cover the period from January 2004 to December 2019 (192 observations), and the data for the tourist flow from Japan to Shanghai and its influence factors consist of 180 observations from January 2005 to December 2019. All sample data are collected from the Wind Database (https://www.wind.com.cn/), and their descriptions are given in Table 1.
B. Experiment Design
In this paper, all sample data are divided into a training set and a testing set. Specifically, for the USA-SH and Korea-SH data, the training set covers the period from January 2004 to October 2016 (154 observations) and the testing set consists of 38 observations covering the period from November 2016 to December 2019; for the Japan-SH data, the training set spans January 2005 to October 2016 (142 observations) and the testing set spans November 2016 to December 2019 (38 observations). In the $h$-step-ahead forecasting scheme, the prediction of a component that does not involve influence factors is formulated as \begin{equation*} y_{t+h-1} = f(y_{t-1},y_{t-2},\cdots,y_{t-d})\tag{20}\end{equation*}
where $d$ is the lag order, while the prediction of a high-complexity component fused with the influence factors $p^{i}$ is formulated as \begin{align*}&\hspace {-.5pc} y_{t+h-1} = f(LASSO(y_{t-1}, y_{t-2},\cdots,y_{t-d}, \\&p^{i}_{t-1}, p^{i}_{t-2}, \cdots, p^{i}_{t-d}))\tag{21}\end{align*}
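The direct multi-step scheme of Eqs. (20)-(21) can be implemented by shifting the target $h-1$ steps beyond the most recent lag; the helper below is a generic sketch with an illustrative lag order $d$ and is not taken from the paper's code.

```python
import numpy as np


def make_direct_h_step_samples(y, d=12, h=3):
    """Build (X, target) pairs for Eq. (20): predict y[t+h-1] from y[t-1..t-d]."""
    y = np.asarray(y, dtype=float)
    X, target = [], []
    for t in range(d, len(y) - h + 1):
        X.append(y[t - d:t][::-1])      # most recent lag first: y[t-1], ..., y[t-d]
        target.append(y[t + h - 1])
    return np.array(X), np.array(target)


# Example: three-step-ahead samples from a toy monthly series
series = np.arange(60, dtype=float)
X, target = make_direct_h_step_samples(series, d=12, h=3)
print(X.shape, target.shape)            # (46, 12) (46,)
```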
In addition to the developed approach, six single models, namely ARIMA, SARIMA, MLP, SVR, GRU and BiGRU, are applied as benchmarks. Also, several hybrid ensemble models are employed for comparison. (1) Based on the different decomposition methods (ICEEMDAN, CEEMDAN, EEMD, and EMD) and individual predictors (ARIMA, MLP, SVR, GRU, and BiGRU), 20 combinations of technology modules are constructed as models, such as ICEEMDAN-ARIMA, CEEMDAN-SVR, and EEMD-BiGRU. (2) Some of the hybrid models include ICEEMDAN-BSA, CEEMDAN-BSA, EEMD-BSA, and EMD-BSA, where BSA denotes the combination of the BiGRU, SVR and ARIMA predictors; these models involve the decomposition methods and the decomposed component classification method, but do not use the LASSO method or the influence factors. (3) Other hybrid models consist of CEEMDAN-LXBSA, EEMD-LXBSA, and EMD-LXBSA, where L and X represent the LASSO method and the influence factors, respectively. (4) The proposed approach is denoted as ICEEMDAN-LXBSA.
In this study, five evaluation metrics, namely normalized root-mean-square error (NRMSE), mean absolute percentage error (MAPE), Theil coefficient (TC), direction accuracy (DA) and the coefficient of determination ($R^{2}$), are adopted to evaluate the forecasting performance of all models.
The details of the above evaluation criteria are summarized in Table 2, where $y_{t}$ and $\hat {y}_{t}$ denote the actual and predicted values at time $t$, respectively.
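The error criteria can be computed directly; the sketch below gives common formulations of MAPE, NRMSE and DA that are consistent with their usual definitions, though the exact normalization used in Table 2 may differ.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error (in %)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))


def nrmse(y_true, y_pred):
    """Root-mean-square error normalized by the mean of the observations."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(y_true)


def da(y_true, y_pred):
    """Direction accuracy: share of steps whose predicted change has the correct sign."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    true_dir = np.sign(np.diff(y_true))
    pred_dir = np.sign(y_pred[1:] - y_true[:-1])
    return np.mean(true_dir == pred_dir)
```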
Obviously, a straightforward approach is to select the model with smaller errors according to one of the aforementioned evaluation criteria. However, a further step is needed to determine whether the observed difference is statistically significant for predictive purposes. For this reason, the Diebold-Mariano (DM) and the Kolmogorov-Smirnov predictive accuracy (KSPA) tests are introduced from a statistical perspective.
Here, the DM test detects the difference in predictive accuracy between the target model ($mod0$) and a competing model ($mod1$) based on the loss differential \begin{equation*} l_{t} = (y_{mod0,t}-y_{t})^{2}-(y_{mod1,t}-y_{t})^{2}\tag{22}\end{equation*}
and the test statistic \begin{equation*} H=\frac {\bar {l}}{\sqrt {\hat {V}_{\bar {l}}/M}}\tag{23}\end{equation*}
where $\bar {l}$ is the mean loss differential, $\hat {V}_{\bar {l}}$ its estimated variance and $M$ the number of out-of-sample forecasts; the null hypothesis is that the two models have equal predictive accuracy.
Another point is that the KSPA test is designed to ascertain whether there is a statistical difference between the distributions of the prediction errors of two models, and whether the model with the lower error also produces stochastically smaller errors than its competitor. The one-sided KSPA test is applied in this study; its null hypothesis is that there is no difference between the two error distributions, against the alternative that the errors of the tested model are stochastically smaller than those of the competing model. When calculating the KSPA statistic, the absolute value of the errors is used as the loss function. Readers are referred to Hassani et al. [38] for the rigorous mathematical derivation of the KSPA test.
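A minimal sketch of the two tests is given below; it follows the loss differential of Eq. (22) and the statistic of Eq. (23) for the DM test, and uses scipy's two-sample Kolmogorov-Smirnov test on the absolute errors as a simplified stand-in for the one-sided KSPA test. The small-sample correction and the long-run variance estimator of the published tests are omitted.

```python
import numpy as np
from scipy import stats


def dm_test(y_true, pred_target, pred_competitor):
    """Diebold-Mariano statistic from Eqs. (22)-(23), with squared-error loss."""
    y, f0, f1 = map(np.asarray, (y_true, pred_target, pred_competitor))
    l = (f0 - y) ** 2 - (f1 - y) ** 2          # Eq. (22): loss differential
    M = len(l)
    h = l.mean() / np.sqrt(l.var(ddof=1) / M)  # Eq. (23), ignoring autocorrelation terms
    p_value = 2 * (1 - stats.norm.cdf(abs(h)))
    return h, p_value


def kspa_test(err_target, err_competitor):
    """One-sided KS test on absolute errors, as a simplified stand-in for KSPA."""
    return stats.ks_2samp(np.abs(err_target), np.abs(err_competitor),
                          alternative="less")
```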
C. Decomposition and Classification Results
The ICEEMDAN decomposition results of the tourist flows from the three major origins to Shanghai and their RE and MIC results are displayed in Figure 5 and Table 3, respectively. All IMFs are arranged from the highest frequency to the lowest frequency to represent the changing laws of the tourist flow from the short term to the long term. Meanwhile, based on the RE results, these IMFs present features ranging from high complexity to low complexity. The first few IMFs of the three time series demonstrate minimal regularity and strong randomness, while the last few possess, to a certain extent, the characteristics of periodicity and long-term trend. In addition, according to the MIC results, the more regular components have a higher correlation with the original time series, which means these IMFs contain more effective information than the others. For example, in the decomposition result of USA-SH, the IMF2 and IMF7 components represent the short-term periodicity and the long-term trend, respectively, and both have lower complexity and higher correlation.
The decomposed components of the different source markets represent the changes caused by different influence factors in the tourist flow. The fluctuation of the first few components means that the short-term tourism volume is affected by certain influence factors, such as weather conditions, holiday length, consumer disposable income and exchange rate movements. Therefore, for the short-term fluctuation components with high complexity, we select the CPI, the exchange rate and personal or household expenditure or income as auxiliary variables to assist the forecasting of these components. The changes in the middle components indicate that emergencies can have a certain impact on tourism demand within a period of time. For example, due to the global financial crisis in 2008, the international tourist flows from the USA, Korea and Japan to Shanghai decreased simultaneously and did not rebound until 2010; specifically, the IMF4 of all three time series shows a significantly downward trend during the financial crisis. In September 2016, the Terminal High Altitude Area Defense (THAAD) incident caused tensions in the international relations between China and South Korea and also resulted in a decline of tourist volumes; the IMF4 of the Korea-SH time series displays a downward trend from September 2016 to October 2017. Since the Diaoyu Islands incident in 2012, the tourist volume from Japan to Shanghai has dropped rapidly, which can also be seen from the IMF4 of the Japan-SH time series. The long-term fluctuations correspond to the last few decomposition components, which are generally affected by local infrastructure, tourism policies, international conditions, and so on. From the last component of the three time series, it can be seen that the tourist flows from the USA and Korea to Shanghai have grown steadily year by year, while the tourist arrivals from Japan have shown a downward trend since 2008 and have gradually stabilized after 2018.
To obtain better prediction accuracy, we select an appropriate predictor for each IMF based on its complexity and correlation. The components whose RE values are greater than that of the original time series are regarded as high-complexity ones, and the rest are low-complexity. According to the MIC results of the low-complexity components, those with values lower than 0.1 are classified as low-correlation, and the rest as high-correlation components. The classification results in Table 4 indicate that the first few components exhibit high-complexity characteristics, while the last few possess, in most cases, a high correlation with the raw time series.
For high-complexity components, we first combine them with preselected influence factors time series to form a matrix, and then use LASSO to extract effective features from all the columns of matrix and fuse them into a time series, meanwhile the input structure is determined. As shown in Table 5, the adjustment coefficients of LASSO are chosen according to the grid search method. Then, BiGRU is applied to forecast each high-complexity component based on the same network hyperparameters, which are set as follows: a number of recurrent layers
D. Experimental Results Analysis
To verify the effectiveness and robustness of the designed approach, a systematic analysis including various benchmark models and metric methods is implemented from the perspectives of error criteria and statistical tests.
1) Error Criterion Analysis
The forecasting-error-based evaluation can directly reflect the differences between predicted and observed values. Table 8, Table 9 and Table 10 show the comparison results of one-step-ahead, three-step-ahead and six-step-ahead forecasting for all considered models, respectively, where bold values represent the best results. Meanwhile, Figure 6, Figure 7 and Figure 8 visually display the forecasting results of all models on the three testing datasets by scatter plots and line plots of the forecasting errors and forecasting values, respectively. Some meaningful conclusions and analyses can be summarized as follows.
First, from the perspective of single prediction models, deep learning has obvious advantages on complex time series. In the experiment on USA-SH, BiGRU shows a significant advantage over the other single models (i.e. ARIMA, SARIMA, MLP, SVR and GRU). For one-step-ahead forecasting of USA-SH, MAPE, NRMSE, TC, DA, and
Second, the decomposition-ensemble approaches combining different models are superior to the corresponding single models. For example, in USA-SH, Korea-SH and Japan-SH, the decomposition-ensemble models (i.e. EMD-MLP, EEMD-MLP, CEEMDAN-MLP and ICEEMDAN-MLP) obtain more accurate forecasts than the single MLP predictor, improving the MAPE values by about 10% on average. Similar experimental results can be obtained by comparing other single models. Notably, our experiments show that some decomposition-ensemble approaches with ARIMA (such as EMD-ARIMA, EEMD-ARIMA, CEEMDAN-ARIMA and ICEEMDAN-ARIMA) have poorer prediction results than some single models (such as SARIMA, SVR and GRU) in USA-SH and Japan-SH.
Third, the forecasting accuracy of the simplex decomposition-ensemble approaches (such as ICEEMDAN-ARIMA, ICEEMDAN-SVR and ICEEMDAN-BiGRU) can be further improved by matching suitable models to the different IMF features. For one-, three- and six-step-ahead forecasting of USA-SH, the MAPE of ICEEMDAN-BSA is 2.6247%, 5.1357% and 7.3573%, respectively, and all DA values are above 95%. According to one-step-ahead forecasting of Korea-SH, the MAPE, NRMSE, DA and
Fourth, in the empirical analysis, we find that it is difficult to further improve the prediction accuracy of the low-complexity components, but there is still much room for improvement on the high-complexity components. Here, the influence factors (such as the exchange rate, the consumer price index (CPI) and personal or household consumption expenditure or income) are added to the high-complexity components to compensate for the negative effects caused by short-term fluctuations. In fact, the forecasting accuracy (MAPE) and fitting effect (
2) Multi-Factor Fusion Analysis
To better display the improvement effect of multi-factor fusion on the prediction accuracy of the high-frequency components, Table 11 shows a detailed comparison of the prediction results of the high-frequency components with and without influence factors. For IMF1 of USA-SH and Japan-SH, the average NRMSE of LX-BiGRU improves by about 25% and 32% compared to BiGRU. In one-step-ahead forecasting of IMF1, IMF2 and IMF3 of Korea-SH, the NRMSE of LX-BiGRU under the ICEEMDAN method improves by 50.57%, 22.49% and 54.50%, respectively. For Korea-SH and Japan-SH, only IMF4 and IMF2 obtained by the CEEMDAN method are high-complexity components. Their fitting effect is the best when the influence factors are added into IMF4 and IMF2,
3) Statistical Test Analysis
In order to prove the feasibility of the developed hybrid approach, the DM and KSPA tests are implemented from a statistical perspective. The test results are listed in Table 12, Table 13 and Table 14 for one-step-ahead and multi-step-ahead forecasting. The following is a detailed analysis.
In the DM statistical test, except for CEEMDAN-BSA on the Korea-SH time series, it can be confirmed that the results of ICEEMDAN-LXBSA reject the null hypothesis at the
Similar to the DM test, in one-step-ahead and multi-step-ahead forecasting, the KSPA test results also indicate that the error distribution of ICEEMDAN-LXBSA is significantly better than that of the other models at the
Generally, whether judged by the DM or the KSPA results, ICEEMDAN-LXBSA shows high precision and sufficient information extraction (i.e. its prediction errors are closer to white noise). From a statistical point of view, the proposed approach has remedied, to a certain extent, some problems of the commonly used decomposition ensemble framework.
E. Discussion and Comparison With Benchmark Models
To better understand the comprehensive advantages of the hybrid ensemble learning approach constructed in this study, this section discusses in detail the underlying reasons for the impact of each module of our approach and of the benchmark models on forecasting performance.
First, regarding the comparison of single models, the SARIMA model achieves higher accuracy than the ARIMA model in all three time series, and the improvement is most obvious in the USA time series. The reason is that the tourist flow from the USA is more cyclical than that of the other two sources. Meanwhile, the performance of the SARIMA model is better than that of the AI-based models (SVR, GRU, MLP) in some cases. This observation is similar to that of Claveria et al. [39], whose research indicated that an AI-based model is more likely to cause information loss than linear models in the filtering process, even though it is excellently equipped to handle nonlinear patterns. In addition, the SVR and GRU models outperform the MLP model in most cases in terms of stability and forecasting performance. The BiGRU model has higher prediction accuracy than the GRU model in one-step-ahead and multi-step-ahead forecasting because of the bidirectional transmission mechanism of the hidden layers; this mechanism gives the model stronger learning ability, but it also increases the time cost. In general, the BiGRU model performs better than the other single models in all cases, because deep learning technology can not only capture the hidden regular components in complex time series but also better adapt to their nonlinear components.
Second, using a decomposition technique reduces the data complexity and the prediction difficulty. According to the results on the three time series, the performance of a single model is worse than that of the decomposition ensemble model that uses the corresponding single model as the individual predictor. However, the deep learning method, BiGRU, shows stronger processing capacity than some decomposition ensemble models (such as EMD-ARIMA, EEMD-ARIMA, and EMD-MLP) over the different prediction horizons of the three time series. Previous studies show that the defects of the EMD method lead to confusion in the decomposed information, and that the development of the EEMD, CEEMDAN and ICEEMDAN methods suppresses this problem to varying degrees by adding white noise to the original data to change the distribution of the extremum points. In the process of multiple integration and averaging, the added white noise cancels out, so that the real data components are revealed gradually. Among all decomposition ensemble models, the ICEEMDAN-based models outperform the models based on the other decomposition methods (EMD, EEMD, CEEMDAN) in most cases. For the models using the same predictor to forecast every decomposed component, in most cases the hybrid approach based on the different decomposition methods and BiGRU displays the best performance in prediction level and direction accuracy.
Third, in order to adaptively select a reasonable predictor for each IMF, we propose an IMFs classification method that considers both complexity and correlation. In addition, to further improve the forecasting effect, the influence factors are introduced and combined with the high-complexity components. However, in some three-step-ahead and six-step-ahead prediction cases, the forecasting results of BiGRU without influence factors are closer to the actual results. The reason may be that the influence factors do not supplement useful information for these high-complexity components and thus play an opposite role, or that the feature selection method does not effectively filter out the redundant variables. On the whole, however, the proposed IMFs classification method is reasonable and can logically select a suitable predictor to improve the overall prediction performance.
Finally, compared with all benchmark models, the proposed approach (ICEEMDAN-LXBSA) is the most competitive over the different prediction horizons. The main reasons are as follows: (1) The ICEEMDAN method, as a data decomposition technique, can effectively extract information with different frequencies and amplitudes from the nonlinear and complex time series of tourist flow. (2) The IMFs classification method divides all components into three categories based on the complexity and correlation of each component. (3) The deep learning model (BiGRU), with its powerful learning capability, is suited to the most complex IMFs; the regular and linear components are predicted by the ARIMA model, while the remaining components are fitted by the SVR model with its strong generalization ability. (4) Considering that the high-complexity components contain random factors and are affected by the short-term market, some reasonable influence factors are selected as supplementary information, and on this basis the feature selection method (LASSO) is used to eliminate redundant information, thereby improving the forecasting accuracy. Therefore, the above reasons and the experimental results convincingly prove the superiority and rationality of our approach.
Conclusion
Providing accurate forecasts of tourism demand helps tourism market managers provide effective tourism services and helps investors formulate reasonable investment plans and programs. However, due to the limitations of forecasting technology and the randomness and irregularity of the time series caused by multiple factors, accurate prediction of tourism demand is a challenging task. To ameliorate these shortcomings, a multi-factor hybrid ensemble learning approach is proposed to forecast the monthly overnight tourist flow from three major origins (USA, Korea and Japan) to Shanghai. The main work includes the following aspects: data decomposition, intrinsic mode functions classification, multi-factor fusion, individual prediction and ensemble of results. The empirical results demonstrate that the proposed approach outperforms all benchmark models over different prediction horizons.
The contributions of the study can be summarized as follows. First, unlike previous studies that only use historical data, this study develops a decomposition-ensemble approach with a multi-factor fusion strategy, thereby improving both the prediction ability and the explainability. To be specific, the IMFs with high complexity usually present high frequency and high volatility. These IMF components are often affected by short-term economic factors and are difficult to extract effectively, which reduces the forecasting accuracy to a certain extent. Therefore, we fuse several influence factors (i.e. the exchange rate, the consumer price index (CPI) and personal or household consumption expenditure or income) into the high-complexity components in an attempt to capture the dynamics of the time series well. Second, the advantages of the decomposition-ensemble approach have been demonstrated in many studies, including in the field of tourism demand forecasting, but only a few conduct further feature analysis on the IMFs. This study innovatively proposes an IMFs classification method based on a comprehensive strategy combining complexity (range entropy) and correlation (maximal information coefficient) to investigate the internal characteristics of all components effectively. Furthermore, we select an appropriate prediction model for the IMFs in each category according to their features rather than using the same model for all, thus improving the adaptability of the prediction model. Third, according to the characteristics of the IMFs and predictors, we assign the deep learning technique (BiGRU), the machine learning method (SVR) and the time series model (ARIMA) to predict the corresponding IMFs. BiGRU is used to forecast the highly complex IMFs combined with influence factors to obtain a better training effect, SVR is applied to the nonlinear and non-smooth IMFs, and ARIMA is employed for the low-complexity IMFs with obvious trend characteristics. Based on this decomposition and ensemble idea, the advantages of each technical module can be fully utilized, thus improving the overall prediction performance. Thus, a novel hybrid approach (ICEEMDAN-LXBSA) integrating ICEEMDAN, IMFs classification and multi-factor fusion is proposed to forecast monthly tourism demand. Compared with the existing models, ICEEMDAN-LXBSA has higher precision and stability over different horizons, so it can serve as an effective reference tool for decision-makers to make corresponding plans.
This study reveals some practical implications worthy of further attention in the field of tourism demand forecasting. First, within the theoretical framework, reasonable influence factors can effectively improve the forecasting accuracy of the high-complexity components. According to the empirical results, the factors that directly or indirectly affect the tourism industry also deserve further attention in tourism demand forecasting. Second, from a technical point of view, the proposed classification method based on the features of each decomposed component can provide a basis for selecting a suitable predictor; the reliability of this strategy in terms of accuracy and stability is verified in this study. Third, for practitioners and managers in the tourism industry, precise forecast results can provide references for the effective allocation and utilization of resources and for policy formulation. In addition, grasping the development of tourist arrivals can effectively guide the business plans of tourism-related enterprises, such as hotels, travel agencies, and catering services.
Although the performance of the presented approach is excellent, there are still some limitations. First, the theoretical framework of this paper is verified only with the tourism demand data from three major origins (USA, Japan, and Korea) to Shanghai, and the amount of data is limited. The proposed model may have certain limitations when the data are missing or very large; further research should test the approach in other cities and with larger datasets. Second, the selected influence factors are mainly macroeconomic indicators, and other active data sources should be found and introduced into the proposed approach. In addition, since the data in this study are available and complete, how to integrate other influence factor data into the model needs further consideration. Third, predicting each IMF incurs a large computational cost, and reconstruction strategies and clustering methods are effective means to alleviate this problem. Therefore, achieving superior accuracy with lower computational consumption is a possible direction for future research.