Deep multi-stage reference evapotranspiration forecasting model: Multivariate empirical mode decomposition integrated with Boruta-random forest algorithm

Evapotranspiration, as a combination of evaporation and transpiration of water vapour, is a primary component of the global hydrological cycle. It accounts for significant losses of soil moisture from the earth to the atmosphere. Thus, reliable methods to monitor and forecast evapotranspiration are required for decision-making in many sectors. Reference evapotranspiration, denoted as ET, is a major parameter that is useful in quantifying soil moisture in a cropping system. This article aims to design a multi-stage deep learning hybrid Long Short-Term Memory (LSTM) predictive model that is coupled with Multivariate Empirical Mode Decomposition (MEMD) and Boruta-Random Forest (Boruta) algorithms to forecast ET in the drought-prone regions (i.e., Gatton, Fordsdale, Cairns) of Queensland, Australia. Daily data extracted from NASA’s Goddard Online Interactive Visualization and Analysis Infrastructure (GIOVANNI) and Scientific Information for Land Owners (SILO) repositories over 2003–2011 are used to build the proposed multi-stage deep learning hybrid model, i.e., MEMD-Boruta-LSTM, and the model’s performance is compared against competitive benchmark models such as hybrid MEMD-Boruta-DNN, MEMD-Boruta-DT, and a standalone LSTM, DNN and DT model. The test MEMD-Boruta-LSTM hybrid model attained the lowest Relative Root Mean Square Error (≤ 17%), Absolute Percentage Bias (≤ 12.5%) and the highest Kling-Gupta Efficiency (≥ 0.89) relative to benchmark models for all study sites. The proposed multi-stage deep hybrid MEMD-Boruta-LSTM model also outperformed all other benchmark models in terms of predictive efficacy, demonstrating its usefulness in the forecasting of the daily ET dataset. This MEMD-Boruta-LSTM hybrid model could therefore be employed in practical environments such as irrigation management systems to estimate evapotranspiration or to forecast ET.

developed to estimate using climatic data. Among them, the PMF-56 equation is widely used due to its accuracy and stability [2]. Other than empirical methods, many researchers have developed data-driven Artificial Intelligence (AI) models to forecast and these models have shown superior performances despite non-linear behaviour of [3]. For instance, Nourani, et al. [4] employed ensemble Multiple Linear Regression (MLR), Support Vector Regression (SVR), Adaptive Neuro-Fuzzy Inference System, Artificial Neural Network (ANN), and Multiple Linear Regression (MLR) models for forecasting and the ensemble MLR model has shown the best performance. Tikhamarine, et al. [5] examined the comparative potential of ANN-Embedded Grey Wolf Optimizer, Multi-Verse Optimizer, Particle Swarm Optimizer, Whale Optimization Algorithm and Ant Lion Optimizer to predict monthly in India and Algeria. Deep Learning (DL) techniques such as the Temporal Convolution Network [6] and the ensemble of Convolutional Neural Networks (CNN) [7] which are comparatively more advanced and precise than the above traditional machine learning methods have also been recently employed to predict . The Long Short-Term Memory (LSTM) network is a DL neural technique that has been used to predict hydrological variables like water quality [8], solar radiation [9], and streamflow [10], and rainfall-runoff [11]. Several recent studies have shown the exceptional performance of LSTM model in predicting hydrological time series [72][73][74][75].
The key advantage of the LSTM model is its ability in using sequential data as inputs instead of independent training samples and this feature ensures the model's capability in dealing with more extended historic hydrologic observations with temporal dependence [76], which is a common characteristic related with many types of hydrological time series [71]. However, less research has been carried out to predict using LSTM based models. Yin, et al. [12] proposed a new hybrid bi-directional LSTM model to forecast short term daily in data scarce regions. In recent years, use of AI models have become more popular in resolving problems related to many various hydrological aspects [77]. For instance, DT model has been used to map the flood susceptible areas in Kelantan, Malaysia which performed with greater accuracy in comparison with frequency ratio (FR) and logistic regression (LR) methods [79]. The DNN model has been employed in water resource management e.g. development of spatial-temporally continuous evapotranspiration model [80], development of model for mapping suitable groundwater extraction location [81] and shown better performances compared to benchmark models. LSTM is also extensively used for flood forecasting [82], and predicting water table [83], etc.
To further enhance the forecasting model capabilities, hybrid models have been developed in the recent past by many researchers. Ferreira and da Cunha [13] developed a DL multi-step forecasting model with hybrid CNN-LSTM and assessed in comparison with standalone LSTM, CNN and traditional machine learning models (ANN and RF). According to the performance analysis, the hybrid CNN-LSTM model outperformed all the comparison models.
In addition, two-phase hybrid models which are capable of yielding high performances with relatively low errors are explored [14]. For example, Prasad, et al. [15] developed a two-phase hybrid Extreme Learning Machine (ELM) model to forecast soil moisture coupled with the Ensemble Empirical Mode Decomposition (EEMD) data preprocessing method and the Boruta-random forest optimizer (Boruta) feature selection method. This model was superior to the other comparative models and yielded a relatively accurate performance with a small number of errors.
The Multivariate Empirical Mode Decomposition (MEMD) is a data pre-processing method that is an improved extension of standard Empirical Mode Decomposition (EMD) for multichannel data [16] and works efficiently in time series nonlinear and nonstationary signal data preprocessing [17]. For instance, Prasad, et al. [15] and Ali, et al. [18] proposed new multi-stage models coupled with MEMD to forecast solar radiation and drought thereby showing superior performance when compared with other models.
Boruta-random forest (Boruta) is a feature selection technique [19] that can identify significant input parameters using a comparison with real features to those of random probes [20]. Boruta has been utilized successfully as a feature selection technique in hybrid models to forecast soil moisture [20,21], streamflow [22,23], and air quality [24].
However, ET forecasting based on multi-stage deep neural networks is yet to be explored. To address this research gap, this study is focused on developing a novel multi-stage MEMD-Boruta-LSTM deep neural network to forecast daily ET based on satellite and ground data. DT and DNN models which have been widely employed in prediction of various hydrological parameters are selected for model performance comparison with target model in this study.

II. THEORETICAL OVERVIEWS
In this section, the MEMD, Boruta, and LSTM are described in detail. The models used for comparison purposes in this study: Deep Neural Network (DNN) [9] and Decision Tree (DT) [25] are not explained in detail as they are well-known algorithms.

A. MULTIVARIATE EMPIRICAL MODE DECOMPOSITION METHOD
The MEMD is an advanced version of EMD proposed by Rehman and Mandic [16] which is capable of dealing with multivariate signals and resolved the mode mixing issue by VOLUME XX, 2017 using white Gaussian noises [26]. The MEMD method can be described as follows [16]: I. Generate a suitable number of direction vectors.
II. Calculate projections of the multiple inputs along with different directions in an n-dimensional space.
III. Identify local maxima projections and obtain multivariate envelope curves through them and subsequently calculate the mean.
IV. Extract the detail using the difference of the mean envelope curve and original signal until the stopping criteria is satisfied for a multivariate Intrinsic Mode Function (IMF) [27].
The mathematical formulae of the MEMD can be found elsewhere [16,28].

B. FEATURE SELECTION: BORUTA-RANDOM FOREST OPTIMIZER ALGORITHM
The algorithm can be briefly explained as follows [19,29]: Let ∈ be the group of predictors for the set of and ∈ be the target for number of inputs, where = , , … , .
I. Create a randomly ordered duplicated (shadow) variable, ′ for and then predict the target .
III. Compute the Z-score as: where, is the standard deviation of accuracy loss and, then maximum Z-score ( ) is determined among duplicated attributes.
IV. Following that, predictors are identified as "Unimportant" when − < and "Confirmed" as important when − > during the process.

C. TIME SEQUENTIAL PREDICTIVE METHOD: LONG SHORT-TERM MEMORY NETWORK
The LSTM is a special Recurrent Neural Network (RNN) [32] related to conventional artificial neural networks that are mainly used to identify patterns in sequences of data [33]. The LSTMs operates with special units, denoted as memory blocks that consist of input, output, and forget gates and these memory blocks continuously update and control the information flow [34]. The calculations are described in 4 steps as follows [35]: I. The LSTM layer decides which information should be forgotten or remembered, based on the last hidden layer output − and the new input by using "forget gate" : where is the weight matrix; is the bias vector and (… ) is the logistic sigmoid function.
II. The LSTM layer decides what information needs to be stored in the new cell state that is represented by the new candidate cell state ̅ after updating information by using "input gate" : , where (… ) is the hyperbolic tangent function.
III. The old cell state − updates to by the "forget gate" to remove unnecessary information and the "input gate" to get a new candidate cell state ̅ : IV. Finally, the output is derived using "output gate" and the cell state :

A. STUDY REGION AND DATASET
This study is centred in Queensland (QLD) Australia, where 84 % of the total land resources are used for agricultural operations [36]. The Queensland government declared 67.4 % of the land area of Queensland drought-affected in the year 2020 [37]. Therefore, developing a precise model to forecast water losses due to is useful for strategic planning in water resources management in the state.
The three examined sites located in arid and semi-arid areas in QLD, Australia selected for this study are Gatton −152.34°, 27.54°, Fordsdale -152.12°, 27.72° and Cairns -145.75°, 16.87° (see Figure 1). The land resources of these selected sites are mainly used for agricultural purposes.
To construct a target hybrid model, data for eight daily predictive climatic variables for the period 01 February 2003 to 19 April 2011 were extracted from the databases of NASA's Goddard Online Interactive Visualization and Analysis Infrastructure (GIOVANNI) -Atmospheric Infrared Sounder (AIRS) and GLDAS model satellite and Scientific Information for Land Owners (SILO). The GIOVANNI provides easy and user-friendly access to visualize and analyse the vast amount of Earth Science-related remote sensing data [38] that can be extracted easily without the requirement for advanced prior knowledge of complex remote sensing datasets. In addition, SILO data source provides ground-based data for predictor variables and it assists to further improve the model's performance. This database is operationally managed by the Queensland Government [39]. Missing data due to instrumental and equipment failures were filled with daily mean data of previous years [40]. Table 1 shows a summary of predictive variables and sources of data. For the target variable that is daily ET, point-based data is extracted from the SILO database.

B. DEVELOPMENT OF THE PROPOSED MULTI-STAGE DEEP HYBRID MEMD-BORUTA-LSTM MODEL
The proposed multi-stage MEMD-Boruta-LSTM model was developed using an Intel Core i7 @ 3.3 GHz and 16 GB memory computer; built using freely available DL libraries: Keras [46] and TensorFlow [47] in Python [48]. The MEMD data pre-processing method and Boruta feature selection method were implemented using MATLAB R2019b and R respectively, while "matplotlib" and "seaborn" tools in Python were used for visualizations. The MEMD-Boruta-LSTM hybrid model was developed using historical time series inputs as follows: Stage 1: In this study, before performing MEMD, firstly, all nine variables (eight predictors + target) (see Table 1) were partitioned into 50% for training (i.e., 1500 data points) and other 50% for testing (i.e., 1500 data points) for all study sites [49] to avoid having a different number of Intrinsic Mode Functions (IMFs). Deo, et al. [50] pointed out that, if the complete dataset (training, cross-validation, and testing) is decomposed together without partitioning as explained above, future data (that is testing and yet unseen data by the forecasting model at a particular time step) would unintentionally add bias into the forecast. Thus, it is an important requirement during the decomposition stage to avoid incorporating future datasets that are to be used in the testing phase with the calibration dataset i.e., training and cross-validation in this study.
The MEMD was performed in the decomposition process independently for each training, and testing data partitions for both predictor and target variables for all three sites. In this process, the recommended predefined parameters: ensemble number ( = 500) and amplitude of the added white noise ( = 0.2) were applied [51][52][53][54]. All the first IMFs of predictor and target variables were pooled into one set. All the second IMFs of predictor and target variables were pooled into one set. This pooling was carried out until the i th IMFs including residuals.

Stage 2:
Boruta-random forest is a feature selection technique available in R. Random Forest tree-based algorithm is embedded in this feature selection technique [62]. This feature selection algorithm is used to identify the significantly corelated predictor variables to the target variable in each IMFs and residuals using historical lagged data at (t-1).

Stage 3:
After identifying the significantly corelated predictor variables for the model development, respective data of those variables were normalized to remove the variance of features [55] by converting them into (0 − 1) range using equation (9): where , , and represent input data for actual, maximum, and minimum values respectively.

Stage 4:
The LSTM model was employed to forecast daily ET in each IMF and residual using significantly corelated predictor variable data at (t-1) lag. To prepare the best model design, hyperparameters for the target model (MEMD-Boruta-LSTM) were identified using the Hyperopt library in Python [56,57]. Hyperopt is one of the hyperparameter optimization algorithms that performed better than the Grid search and Random search algorithms as it ensures comparatively less time in the model training process while increasing the accuracy of the model [58]. Thereby optimal architecture of the hybrid MEMD-Boruta-LSTM model was used to predict daily ET. Finally forecasted ET in each IMF and a residual were cumulated to calculate forecasted daily ET for each study site. Figure 2 presents the workflow of the proposed multi- stage MEMD-Boruta-LSTM model. The same procedure is followed to develop hybridized DNN and DT with MEMD-Boruta (i.e., MEMD-Boruta-DNN and MEMD-Boruta-DT models). Developed standalone LSTM, DNN, and DT models and hybrid MEMD-Boruta-DNN and MEMD-Boruta-DT were used as benchmark models for the model performance comparison.

C. MODEL PERFORMANCE EVALUATION
Model performances are evaluated using the statistical metrics [41][42][43][44][45] given below to confirm whether the target predictive model is superior to other benchmark models and is sufficiently qualified for ET prediction in QLD, ) and observed value ( ) [64]. The RMSE value can range from 0 to ∞ and it becomes zero for the best predictive models.
III. Mean Absolute Error ( ; −1 ): This error value provides an assessment of the actual forecasting errors in terms of the total number of observations [62]. MAE can range from 0 to ∞ and it becomes zero for best predictive models. The MAE gives a more precise measure of average model error than the RMSE since it is not influenced by extreme outliers [65].
IV. Relative Root Mean Squared Error ( ): The RRMSE is used to measure overall forecasting accuracy of the models and always gives positive values [62]. If the value for RRMSE is less than 10%, model performance is considered to be outstanding, while model performance is considered to be good if it is lying between 10% to 20%. If the value for RRMSE error lies between 20% to 30%, model performance is considered as fair. If the value for RRMSE error is higher than 30% model performance is considered to be poor [66].
VI. Nash-Sutcliffe Index ( ):The NS [67] measures how well the plotted line between observed data and simulated data fits into 1:1. The NS is equal to 1, if the model forecasted data is perfectly matched to the observed data. NSE = 0 indicates that the model predictions are as accurate as the mean of the observed data while, Inf < NSE < 0 indicates that observed mean is a better predictor than the model [63].
VII. Willmott's Index ( ):Willmott index is a standardized measure of the degree of model prediction error and the value for WI ranges from 0 to 1, whereas this value equals 1 for best predictive models. The LM is an advanced assessment index based on WI and NS values. This index can be used to assess the goodness-of-fit of a hydrologic or hydro climatic model and is more effective than correlation and correlation-based measures (e.g., the Coefficient of Determination (r 2 ), WI and NS) [68]. The value for LM ranges from −∞ to 1, whereas this value equals one for best predictive models.
IX. Absolute Percentage Bias ( %): The APB gives the error of forecasted values as a percentage concerning the observed values. The optimal value for APB is zero and lower-magnitude values closer to zero reflect good accuracy of the model [69].
X. Kling-Gupta Efficiency ( ):The KGE measures the goodness-of-fit of the model. This metric can be decomposed into the contribution of mean, variance, and correlation on the model performance [70]. Perfect models will give value one for the KGE index [69].
where CV= Coefficient of Variation, where , and , are observed and forecasted th value of the evapotranspiration ET, , ̅̅̅̅̅̅̅̅̅ and , ̅̅̅̅̅̅̅̅̅ are the observed and forecasted average of ET and is the total number of data points of the test dataset.

IV. RESULTS AND DISCUSSIONS
In the decomposition process, training and testing datasets for Gatton and Fordsdale sites were decomposed into 12 IMFs and a residual component (i.e. 104 (= × ) predictors) whereas 11 IMFs and a residual component (i.e. 96 (= × ) predictors) were generated by decomposing training and testing datasets for the Cairns site (see Table 2). In the Boruta feature selection process, 99 predictor variables were identified as significantly corelated to the target variable ET in all IMFs and the residual of Gatton, while 98 and 88 predictor variables were identified for Fordsdale, and Cairns sites respectively (see Table 3). Table 4 shows the selected final predictor variables in each IMF and the residual used to develop target hybrid MEMD-Boruta-LSTM model and benchmark models in Cairns site. Identified hyperparameters for the LSTM in target model through the hyperparameter optimization process are listed in Table 5.   The performance of the multi-stage deep MEMD-Boruta-LSTM model and other comparative models: MEMD-Boruta-DNN, MEMD-Boruta-DT, LSTM, DNN and, DT in the testing phase were assessed using statistical metrics calculated using equations (10) to (19), visual graphs, and error distributions between forecasted and observed ET. these are the lowest recorded values. These results indicate that the proposed multi-stage deep hybrid MEMD-Boruta-LSTM model can be confidently employed for forecasting daily ET and for achieving higher forecasting accuracy compared to counterpart models (MEMD-Boruta-DNN and, MEMD-Boruta-DT) and standalone models (LSTM, DNN, and DT).
In terms of the Absolute Percentage Bias ( %) error and Kling-Gupta Efficiency (KGE) calculated in the testing phase, Figure 3(a) and 3(b) show that the proposed deep multistage MEMD-Boruta-LSTM model generates better performance in terms of APB % error percentage and KGE respectively. Figure 3(a) illustrations that the proposed MEMD-Boruta-LSTM model has scored the lowest APB error percentage (9.2-12.3%) while other all comparative models' APB error percentages are within (11.6-19.7%) range for all sites. According to Figure 3
To further validate the proposed MEMD-Boruta-LSTM model the absolute Forecasting Errors (|FE|=|Observed ET -Forecasted ET|; mm) of this proposed model and all other benchmark models are compared. The box plots in Figure 5 depict the distribution of |FE| in the testing phase with their upper, median and, lower quartiles for all models and weather stations. The results of box plots shown in Figure 5 indicate that the proposed multi-stage MEMD-Boruta-LSTM model presented the smallest quartiles for |FE| for all sites followed by MEMD-Boruta-DNN, MEMD-Boruta-DT, LSTM, DNN, and DT. These results also clearly indicate that the proposed deep multi-stage MEMD-Boruta-LSTM model is superior to the other benchmark models.
The Empirical Cumulative Distribution Function (ECDF, Figure 6) is also used to illustrate the forecasting skills in terms of the absolute Forecasting Error, |FE| (mm) at each site.
Forecasting errors of good models should be closer to zero. The all-hybrid MEMD-Boruta-LSTM, MEMD-Boruta-DNN, and MEMD-Boruta-DT models performed better than standalone LSTM, DNN, and DT models. Based on the forecasting error (0 to ± 4 ), Figure 6 visibly depicts that the proposed MEMD-Boruta-LSTM model is the most accurate compared to all other benchmark models.
In summary, the proposed multi-stage hybrid DL model (i.e., MEMD-Boruta-LSTM) provided significant high performance with the lowest values of the absolute and relative errors i.e., APB, RMSE, MAE, RRMSE, and RMAE, including the highest r, WI, NS, LM and KGE in respect to the other benchmark models. Consequently, it is promising that the results confirm the deep multi-stage MEMD-Boruta-LSTM model has the potential to forecast daily ET and its performance exceeds that of all other comparative hybrid DL and standalone models for all the study sites in Queensland.

V. CONCLUSION
This study aims to design a novel deep learning multi-stage hybrid MEMD-Boruta-LSTM model as a practical tool to forecast daily ET using satellite and ground-based variables. The Multivariate Empirical Mode Decomposition (MEMD) is incorporated with LSTM to decompose predictor variable data into IMFs and residuals and the Boruta-Random Forest (Boruta) feature selection method has been employed to screen the most correlated predictor variables to target variable ET in each IMFs and residuals. The daily predictor and target variable data (01 February 2003 to 19 April 2011) were extracted from GIOVANNI-AIRS, GLDAS model satellites, and the SILO ground database of the Queensland government. The test sites included Gatton, Fordsdale, and Cairns, which are located in drought-prone regions in Queensland, Australia. The integration of LSTM with MEMD and Boruta resulted in a novel multi-stage deep learning MEMD-Boruta-LSTM hybrid model whose performance was evaluated using The MEMD-Boruta-LSTM hybrid model yielded the highest values for normalized performance metrics: r, NS, WI, LM (see Table 6) and the lowest values for RMSE, MAE, RRMSE,  and APB for all sites. Meanwhile, the results also revealed that all hybrid models (MEMD-Boruta-LSTM, MEMD-Boruta-DNN, MEMD-Boruta-DT) remarkably outperformed in comparison with the standalone models (LSTM, DNN, DT) in forecasting ET at all study sites (see Table 6). This comparison provides strong evidence to verify that MEMD decomposition and Boruta feature selection methods can be used effectively to improve the forecasting accuracy of any model. All findings of this study confirm that the proposed multi-stage deep hybrid MEMD-Boruta-LSTM model outperformed the comparative hybrid and standalone models in forecasting ET at a daily forecasting horizon.

(c) Cairns (b) Fordsdale (a) Gatton
The novel proposed deep hybrid MEMD-Boruta-LSTM model can be practically employed for precise forecasting of ET. Evapotranspiration is the main causative natural phenomenon that contributes to the water losses from croplands. By multiplying forecasted ET with the relevant crop factor, which is a unique value for individual crops, the water loss due to evapotranspiration can be estimated in advance, which will be helpful in planning precise irrigation schedules for the future while avoiding the wastage of water resources in drought-prone areas. In addition, this multi-stage deep learning hybrid model used for forecasting is likely to lead to significant financial benefits to the farmers, in arid and semi-arid regions where agricultural practices are adversely affected by the scarcity of water resources.

VI. LIMITATION AND FUTURE RESEARCH
For this model development, data were extracted only for three sites within Queensland (as a case study) as it is impracticable to select more sites representing the whole drought-affected region in Australia or elsewhere. However, this pioneering study has produced a new modelling framework for ET forecasting and paves the way for future studies with a wider scope. For example, the geographic consistency of the MEMD-Boruta-LSTM hybrid model, together with its accuracy can be considered in future research. Moreover, the Gatton Cairns Fordsdale FIGURE 5. The box plots of the absolute value of the Forecasting Errors (|FE|) in the testing phase, generated by the hybrid MEMD-Boruta-LSTM model compared to that of the other predictive models implemented at all study sites. The best model is boldfaced (in blue).

Gatton
Fordsdale Cairns potential use of multi-stage MEMD-Boruta-LSTM for multistep ahead daily ET (7 days, 15 days, 30 days) forecasting can be researched. Further, instead of the MEMD technique for data pre-Decomposition (VMD) technique [61] can be used with Boruta-LSTM to build up a new two-stage deep forecasting model to forecast ET.