A Newly Developed Integrative Bio-Inspired Artificial Intelligence Model for Wind Speed Prediction

Accurate wind speed (WS) modelling is crucial for the optimal utilization of wind energy. Numerical Weather Prediction (NWP) techniques, which are generally used for WS modelling, are not only less cost-effective but also perform poorly over shorter time horizons. Novel WS prediction models based on multivariate empirical mode decomposition (MEMD), random forest (RF) and Kernel Ridge Regression (KRR) were constructed in this paper to achieve better accuracy in WS prediction. The particle swarm optimization (PSO) algorithm was employed to optimize the parameters of the hybridized MEMD models with RF (MEMD-PSO-RF) and KRR (MEMD-PSO-KRR). The results were compared with those of the standalone RF and KRR models. The proposed methodology was applied to monthly WS prediction at two meteorological stations in Iraq, Baghdad (Station1) and Mosul (Station2), for the period 1977-2013. The MEMD-PSO-RF model showed the highest accuracy in predicting WS at both stations, with a correlation coefficient (r) of 0.972 and 0.971 during the testing phase at Station1 and Station2, respectively. The MEMD-PSO-KRR was the second most accurate model, followed by the standalone RF and KRR models, and all showed performance competitive with the MEMD-PSO-RF model. The outcomes of this work indicate that the MEMD-PSO-RF model performs remarkably well in predicting WS and can be considered for practical applications.


I. INTRODUCTION
The importance of wind speed prediction in wind energy farm operation and maintenance has increased over the years [1], [2]. The sustained increase in the rate of wind turbine installation demands an optimal dispatching strategy that guarantees stable power generation by the wind turbines without having much influence on the power grid. However, the cost of wind farm operation may be affected by imperfect predictions due to the underlying uncertainties in WS [3], [4]. Similarly, the availability of wind resources must be considered when scheduling maintenance, so that the turbines' production loss is minimized [5], [6]. Therefore, accurate WS prediction has attracted research attention in recent years due to its great practical and academic value.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaowei Zhao.
Different prediction models are currently used to predict WS over different time horizons. Most recent studies on WS prediction reportedly used intelligent models [7], [8]. Such models are generally helpful in short-range (30 min to 6 h) WS estimation [7]. There are several versions of these intelligent models, including the adaptive neural fuzzy inference system, support vector machine and artificial neural network. In practice, the WS data measured at the turbine locations are used to train these models, while the model parameters are tuned using the most recently observed WS. Despite the effectiveness of these intelligent models for short-term predictions, their prediction accuracy declines rapidly as the prediction horizon increases. NWP is the most widely used approach for medium-term (daily, weekly and monthly) WS prediction with satisfactory accuracy. However, NWP needs intensive computational capacity: supercomputers are used to run NWP, which provides predictions only once or twice a day. Moreover, at short horizons the accuracy of NWP predictions is lower than that of statistical models. Thus, NWP models are usually reserved for medium- to long-term WS predictions. Owing to the need for a WS prediction model with good accuracy at multiple horizons, a novel WS prediction technique with enhanced short- and medium-term accuracy should be explored.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Industrialization has exposed the world to several energy-related problems. Renewable resources are becoming important due to declining fossil fuel reserves and the adverse impacts of fossil fuels on the environment [9]. Wind energy is a major practical renewable energy source; hence, its related technologies ought to be thoroughly investigated and developed. Wind is intermittent, and its energy generation is therefore unstable. Energy conservation and management can be disturbed by this instability in wind energy generation [10], [11]; however, the problem can be addressed by efficient WS prediction. Several factors influence wind speed, making it difficult to capture its complicated features accurately using simple prediction models. Therefore, much attention has recently been paid to high-precision WS prediction techniques.
The past decades have witnessed the development of several numerical WS prediction models. Such models are classified into physical, deterministic and probabilistic models [12], [13]. The physical models use physical features such as atmospheric pressure, ambient temperature, and local terrain to approximate WS [14]. Owing to their strong theoretical foundation and strong performance in WS prediction, physical models are deployed in the field [15]. However, they require many equations, which makes them unsuitable for mid- to long-term WS prediction when the computational cost is considered. They are also less capable of short-range WS prediction at the local or station level. Several physical models have recently been developed; for instance, Allen et al. (2017) presented a boundary layer scaling model to predict long-term average near-surface WS [16], and a physics-based model for WS prediction in complex terrain was developed in [17].
WS prediction using statistical models is based on historical data. Statistical models are becoming more popular due to recent advancements in data science [18], [19]. They can be categorized into multiple and single data models based on the number of different types of data used. WS prediction with multiple data models requires a combination of several kinds of physical information with some statistical framework [20]. Multiple data models often present excellent prediction performance and have therefore attracted the attention of researchers. For example, observed multiple meteorological data have been modelled using Gaussian process regression [21], while a probabilistic approach based on non-parametric statistical algorithms and numerical weather prediction data has been used to design a WS prediction model [22]. A new copula-based approach was used to develop a WS prediction model in [23]. Despite the chances of achieving better results using multiple data models, the multiple data can sometimes increase the complexity and vagueness of the model. The multiple data models are therefore less stable than single WS data models. Furthermore, single WS data models are often associated with low computational complexity and are therefore often suggested for short-term WS prediction [24]. Several algorithms have been deployed in designing single WS data models to ensure maximum utilization of antecedent WS data for prediction of WS; such frameworks include signal processing and time series algorithms [25]-[28]. The time series algorithms, which include persistence algorithms, are regarded as the classical algorithms for WS prediction. In contrast, the signal processing algorithms are primarily deployed for feature extraction during WS prediction; such algorithms include wavelet decomposition, empirical mode decomposition (EMD), wavelet packet decomposition, and complete ensemble empirical mode decomposition [29].
WS data have a high degree of non-linearity and non-stationarity, which makes accurate prediction using a single WS data model challenging. The accuracy of predictive models can be improved using efficient learning and appropriately estimated parameters. Various data decomposition methods have been used to overcome this challenge, including EMD, wavelet decomposition, variational mode decomposition, seasonal adjustment and intrinsic time-scale decomposition [30]. EMD belongs to the data-adaptive decomposition techniques and has advantages over wavelet transformation methods. EMD methods adaptively and efficiently decompose a time series, based on its local properties, into a collection of stationary Intrinsic Mode Functions (IMFs) with different frequency bands and a residue. An EMD-based model starts by decomposing a given input predictor; each resulting sub-series is then predicted using its appropriately lagged values as inputs. The predicted sub-series are then summed across time scales to obtain the target variable at the required scale (the monthly scale in this study). Therefore, EMD can efficiently capture the non-linearity and non-stationarity of a WS time series by decomposing it into several series with independent time resolutions [24].
The EMD algorithm also overcomes the difficulties of the wavelet transformation method in fixing the most suitable decomposition level and specifying the basis function [31]. EMD has been found highly effective in improving model accuracy in a broad range of applications involving nonlinear and nonstationary processes. It has been successfully applied to different engineering forecasting problems, such as the prediction of evapotranspiration [32], soil water [33], crude oil price [34] and iceberg drift [35].
As WS behaves differently at different time scales, WS model performance can be improved by engaging the appropriate predictors. WS prediction models can be classified into three categories: data-driven, model-driven and hybrid approaches. Model-driven and data-driven approaches employ meteorological information and statistical techniques, respectively, to handle the physical properties that influence WS. Each therefore has its inherent advantages and disadvantages. A hybrid approach can be used to overcome the limitations of model-driven and data-driven approaches in handling the stochastic and intermittent nature of WS [36].
In addition, the parameters of hybrid models need to be optimized to improve their prediction accuracy. In recent decades, many optimization methods, such as particle swarm optimization (PSO) and the genetic algorithm (GA), have been adopted for wind forecasting models. Among them, PSO has been widely used in recent years to optimize the parameters of different models in many fields.
Novel models built through the hybridization of multivariate empirical mode decomposition (MEMD) with random forest (RF) and Kernel Ridge Regression (KRR), together with the particle swarm optimization (PSO) algorithm for parameter optimization, are proposed in this study for accurate prediction of WS. The proposed models are designed to exploit the advantages of other soft computing techniques for improved tuning of the MEMD model. The RF [37] and KRR [38] algorithms were used to avoid overfitting of the MEMD model. The regression method adopted is the nonlinear KRR method, which has proved particularly attractive for its simple implementation, fast processing and accuracy [39]. Nonetheless, accurate estimation of the model parameters is also a requirement for optimal model performance, and this was done using PSO. The key objectives of this study are to evaluate the performance of the MEMD-PSO-RF and MEMD-PSO-KRR models in WS prediction and to benchmark the results by comparing the newly developed models with the standalone RF and KRR models.

II. CASE STUDY AND DATASET DESCRIPTION
Despite the arid to semi-arid climate in most parts of Iraq, the climate of the Tigris basin ranges from semi-humid in the headwaters to the north to semi-arid in the south [40]. Hence, any future drought event is expected to adversely affect the already limited water resources of the nation, and this will affect the socio-ecological system of the Tigris Basin, which is home to over 18 million people [41], [42]. The persistent increase in temperature is continuously worsening surface water scarcity and lowering aquifers' water tables, indicating an ongoing drought condition which can worsen with time. Climate forecasting models predict that drought and temperature are on the increase in the region and that the condition may soon become unsustainable [43]. Drought-related issues are closely related to the balance between temperature and precipitation [44]. In the Tigris basin, the annual rainfall ranges from 400 to 600 mm; however, from the downstream to the upstream reaches, it ranges from about 150 to 800 mm, respectively. According to the Iraqi meteorological monitoring stations, there have been high rates of evaporation in Mosul and Baghdad. From 1960 to 2009, Baghdad recorded a mean July temperature range of 23.5 to 44 °C, an annual rainfall of 244 mm and an annual evaporation of 3200 mm. In Mosul, the observed July temperature ranged from 24.8 to 43 °C, with an annual precipitation of 729 mm and an annual evaporation of 3900 mm.
In this research, multiple hydrometeorological variables, including sunshine radiation (SS), rainfall (R_nf), minimum air temperature (T_min), maximum air temperature (T_max), evaporation (ET) and relative humidity (RH), were used to build the proposed model for predicting the monthly wind speed (WS). Monthly historical data for 1977-2013, without any missing values, from the Baghdad and Mosul meteorological stations were used in this investigation (Figure 1). The statistical characteristics of the historical data are reported in Table 1. This region was selected because of the lack of advanced technologies supporting hydrologists and climatologists; the proposed advanced soft computing model can therefore provide remarkable assistance in simulating WS, which is essential for the management, monitoring and assessment of energy sources. Iraq is also a developing country with high potential wind energy resources for exploitation. Meanwhile, the region is one of the primary sources of dust generation and emissions, and the resulting dust affects the neighboring countries [45]. Hence, such an investigation can be highly beneficial for environmental engineering projects, e.g., the risk analysis of dust storms and the prediction of the future wind energy resources available for exploitation.

III. METHODOLOGY
A. THE RANDOM FOREST MODEL
Bootstrapping is principally an ensemble modelling approach which provides predictions using decision trees [32], [33]. The RF algorithm is a decision tree-based machine learning approach which develops ensembles using a random bagging technique [47]. Each node is split using a randomly selected subset of the predictors to increase predictive performance and avoid overfitting [48]. The RF model can be constructed using the following steps:
i. Draw bootstrap samples of the inputs to generate n trees (i.e., n_trees).
ii. For each unpruned regression tree, choose the split at every node from a random sample of inputs, where mtry is the maximum number of split predictors.
iii. Aggregate the estimates of the n_trees trees to predict WS.
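The three steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names (`build_tree`, `fit_forest`, `predict_forest`), the squared-error split criterion, and the `min_leaf` and `max_depth` stopping rules are all hypothetical choices made for the example.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def build_tree(X, y, mtry, rng, min_leaf=3, depth=0, max_depth=8):
    """Grow one regression tree; every split considers `mtry` random predictors."""
    if len(y) <= min_leaf or depth >= max_depth:
        return _mean(y)  # leaf node: average of the targets that reached it
    best = None
    for j in rng.sample(range(len(X[0])), mtry):  # step ii: random predictors
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= thr]
            right = [i for i, row in enumerate(X) if row[j] > thr]
            sse = 0.0  # squared error after the split (assumed criterion)
            for part in (left, right):
                m = _mean([y[i] for i in part])
                sse += sum((y[i] - m) ** 2 for i in part)
            if best is None or sse < best[0]:
                best = (sse, j, thr, left, right)
    if best is None:  # all sampled predictors were constant in this node
        return _mean(y)
    _, j, thr, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          mtry, rng, min_leaf, depth + 1, max_depth)

    return (j, thr, grow(left), grow(right))


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node


def fit_forest(X, y, n_trees=50, mtry=2, seed=0):
    """Step i: each tree is trained on a bootstrap sample of the inputs."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], mtry, rng))
    return forest


def predict_forest(forest, row):
    """Step iii: aggregate (average) the estimates of all trees."""
    return _mean([predict_tree(tree, row) for tree in forest])
```

In this sketch `n_trees` and `mtry` play the roles of the two RF parameters named in the steps; in practice a tested library implementation would normally be preferred over hand-written trees.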

B. THE MEMD APPROACH
The multivariate empirical mode decomposition (MEMD) is a self-adaptive method capable of handling the hurdles of mode alignment. The mathematical structure of MEMD is an improved variant of EMD, which is defined in Eq. (1). In Eq. (1), the input variable is a function of α, and C_k(α) and R_l(α) are the k-th Intrinsic Mode Function (IMF) and the residue, respectively. The MEMD designed by [37] demarcates the multiple inputs into IMFs using white Gaussian noise [50]. The mean ℵ(α) can be computed as in Eq. (2), in which the terms e^{θ_s}(α) are called the envelope curves. In Eq. (3), R(α) is a multivariate IMF. Applications of MEMD can be found in signal processing [46], [47] and solar radiation [53].
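For reference, the standard EMD and MEMD relations that match the symbols defined above can be sketched as follows. The signal symbol x(α), the number of IMFs l and the number of projection directions S are assumed notation. The third relation is the usual sifting step (signal minus mean envelope, which becomes an IMF once the stopping criterion is met) rather than a formula confirmed by this text.

```latex
\begin{align}
x(\alpha) &= \sum_{k=1}^{l} C_k(\alpha) + R_l(\alpha) \\
\aleph(\alpha) &= \frac{1}{S} \sum_{s=1}^{S} e^{\theta_s}(\alpha) \\
R(\alpha) &= x(\alpha) - \aleph(\alpha)
\end{align}
```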

C. PARTICLE SWARM OPTIMIZATION (PSO)
The PSO meta-heuristic algorithm introduced by [54] is a soft computing optimization method that optimizes a problem by searching for the best candidate solution. The PSO algorithm was devised by replicating the social behavior of animals in a group, e.g., birds and fishes [55]. In this algorithm, a group of creatures, called particles, spreads over the search area [56]. Every particle evaluates its situation relative to the target position. The particles fine-tune their location and velocity using their current situation, the best position they have already visited and the position of the best particle in the swarm [57]:
$$V_{id}^{t+1} = w V_{id}^{t} + c_1 r_1 \left(P_{id}^{t} - X_{id}^{t}\right) + c_2 r_2 \left(P_{gd}^{t} - X_{id}^{t}\right) \tag{4}$$
where $X_{id}^{t}$ indicates the location of particle i in iteration t, $V_{id}^{t}$ is the velocity of particle i in iteration t, $P_{id}^{t}$ is the best location of particle i, $P_{gd}^{t}$ is the global best position found by the swarm, w expresses the inertia weight, $c_1$ expresses the cognitive learning factor, $c_2$ expresses the social learning factor, and $r_1$ and $r_2$ denote random values in [0,1].
The basic steps for implementing the algorithm are as follows:
Step 1: Generate and assess the initial swarm.
Step 2: Evaluate the fitness of every particle in the swarm.
Step 3: Update the velocity of every particle according to Eq. (4).
Step 4: Update the position of each particle by the following equation:
$$X_{id}^{t+1} = X_{id}^{t} + V_{id}^{t+1} \tag{5}$$
Step 5: Stop the algorithm if the termination criterion is satisfied; otherwise, return to Step 2.
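The five steps can be illustrated with a short standard-library Python sketch. It assumes the standard PSO update rules: the new velocity is the inertia term w·V plus the cognitive pull c1·r1·(P_id − X) and the social pull c2·r2·(P_gd − X), and the new position is X plus the new velocity. The values of `w`, `c1` and `c2`, the clamping of positions to `bounds`, and the function name `pso` are assumptions made for the example, not settings taken from the study.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=10,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Step 1: generate the initial swarm (positions X, velocities V) and assess it.
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                  # personal best positions
    P_fit = [objective(x) for x in X]      # personal best fitness values
    g = min(range(n_particles), key=P_fit.__getitem__)
    G, G_fit = P[g][:], P_fit[g]           # global best position and fitness

    for _ in range(n_iter):                # Step 5: fixed iteration budget
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 3: velocity update (inertia + cognitive + social terms).
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (G[d] - X[i][d]))
                # Step 4: position update, clamped to the search box (assumption).
                lo, hi = bounds[d]
                X[i][d] = min(max(X[i][d] + V[i][d], lo), hi)
            # Step 2: evaluate fitness and refresh personal and global bests.
            fit = objective(X[i])
            if fit < P_fit[i]:
                P[i], P_fit[i] = X[i][:], fit
                if fit < G_fit:
                    G, G_fit = X[i][:], fit
    return G, G_fit
```

For example, `pso(lambda x: sum(v * v for v in x), [(-5.0, 5.0), (-5.0, 5.0)], n_iter=100)` searches for the minimum of a two-dimensional sphere function. A fixed iteration count is used here as the termination criterion; other criteria, such as a tolerance on the improvement of the global best, fit the same loop.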

D. THE KRR MODEL
The Kernel Ridge Regression (KRR) is a kernel-based regression technique [58] that handles over-fitting problems by adopting regularization and the kernel procedure for nonlinear input variables. Mathematically, the KRR is described by Eq. (6), in which ‖·‖_H represents the norm of the Hilbert space [58]. Eq. (7) can be rewritten in the form of Eq. (8), which is built from the r×r kernel matrix with entries K_{l,s} = k(z_l, z_s), where y is the r×1 regressand vector and β is the r×1 unknown solution vector. In the training phase, KRR estimates β using Eq. (9), which is later used in validation to calculate the regression of an unidentified sample z. Different kinds of kernels (for example linear, polynomial and Gaussian) can be used to achieve better performance [59], [60].
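A minimal sketch of KRR training and prediction is given below, assuming the common textbook dual form in which β solves (K + λI)β = y and a new sample z is predicted as the sum of β_l·k(z_l, z). The regularization parameter `lam`, the kernel parameters `gamma`, `degree` and `c`, and all function names are assumptions for illustration; the study's own equations may be parameterized differently.

```python
import math


def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b))


def polynomial_kernel(a, b, degree=2, c=1.0):
    return (linear_kernel(a, b) + c) ** degree


def gaussian_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def krr_fit(Z, y, kernel, lam=1e-3):
    """Training: build the r x r kernel matrix and solve (K + lam*I) beta = y."""
    r = len(Z)
    K = [[kernel(Z[l], Z[s]) for s in range(r)] for l in range(r)]
    for i in range(r):
        K[i][i] += lam  # ridge regularization on the diagonal
    return solve(K, y)


def krr_predict(Z, beta, kernel, z_new):
    """Validation: prediction for an unseen sample as a weighted sum of kernels."""
    return sum(b * kernel(z, z_new) for b, z in zip(beta, Z))
```

Swapping `linear_kernel` for `polynomial_kernel` or `gaussian_kernel` changes only the similarity measure; the training and prediction code stays the same, which is the practical appeal of the kernel procedure.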

1) THE MEMD PHASE
The MEMD model is applied to demarcate the predictor data into their respective IMFs and residuals. The predefined parameters include the ensemble number (N = 500) and the amplitude of the added white noise (ε), which lies between -0.2 and 0.2. A total of fifty-four IMFs (Table 1) were demarcated for WS at Stations 1 and 2, with nine IMFs for every predictor.
It is worth highlighting that the MEMD has several tuning parameters, including the tolerance and threshold values, stop vector, stopping criteria and total projection, as reported in Table 2, which are set so that the training and testing subsets yield an equal number of IMFs.

2) THE PSO PHASE
The PSO algorithm is used to select the most suitable IMFs using the training set only. The predefined parameters include the maximum number of iterations (= 10) and the population size (= 20). The number of selected IMFs is fixed at 16, a value defined before executing the PSO algorithm. The same IMFs selected for the training set are then adopted for the testing set.

3) THE NORMALIZATION PHASE
The training and testing sets are normalized between 0 and 1 using Eq. (20) to overcome large differences in the data [70]:
$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{20}$$
In Eq. (20), $x$ denotes the input/output, $x_{min}$ is the smallest and $x_{max}$ is the largest magnitude of the data, and $x_{norm}$ is the desired normalized point.
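The normalization between 0 and 1 can be sketched as standard min-max scaling. In this illustration the minimum and maximum are taken from the training set and reused for the testing set, which is an assumption about the workflow rather than a detail stated in the text; the function names are hypothetical.

```python
def fit_min_max(values):
    """Return the smallest and largest magnitude of the reference data."""
    return min(values), max(values)


def normalize(values, lo, hi):
    """Map each value to (value - lo) / (hi - lo), i.e. onto [0, 1] for the reference data."""
    if hi == lo:
        raise ValueError("constant series cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Invert the scaling to recover values in the original units."""
    return [v * (hi - lo) + lo for v in values]


# Example: scale a training series, then apply the same bounds to a test series.
train = [2.0, 4.0, 6.0]
test = [3.0, 5.0]
lo, hi = fit_min_max(train)
train_scaled = normalize(train, lo, hi)   # [0.0, 0.5, 1.0]
test_scaled = normalize(test, lo, hi)     # [0.25, 0.75]
```

Test values outside the training range would fall outside [0, 1] under this choice; scaling each subset with its own bounds avoids that but makes the two subsets incomparable.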

4) THE RF PHASE
The last modelling phase employs the RF algorithm to predict WS and investigates its ability to predict WS at the monthly timescale. The selected IMFs (for the training period) were fed into the RF model. A pre-defined parameter set (i.e., 1000 trees and 5 predictors) was established using a trial-and-error approach in both the training and testing stages. The equivalent IMFs were used for the testing set to validate the MEMD-PSO-RF model. Further, the MEMD-PSO-RF model was benchmarked against the standalone RF and standalone KRR models. Figure 2 presents a diagram of the MEMD-PSO-RF model.

IV. APPLICATION RESULTS AND ANALYSIS
In this study, a new hybrid intelligence framework for wind forecasting is proposed. The relative performance of different models was evaluated to validate the performance of the proposed model. Tables 4 and 5 summarize the overall performance of the models in predicting WS at the two locations in terms of eight statistical measures during the training and testing stages. It is important to note that the practicality of the MEMD method for increasing the forecasting capacity of the RF model in WS prediction is a key development of this paper.
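Of the statistical measures used, the correlation coefficient (r) is the one reported throughout this paper. A minimal sketch of its computation from observed and predicted WS series is shown below, assuming the usual Pearson definition; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equally long series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical monthly WS values (m/s), for illustration only.
obs = [3.1, 2.8, 3.6, 4.0, 3.3]
pred = [3.0, 2.9, 3.5, 3.8, 3.4]
r = pearson_r(obs, pred)
```

A value of r close to 1 indicates that the predicted series tracks the observed variations closely, although r alone does not reveal systematic bias, which is why it is normally read together with error-based measures.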
The predictive accuracy confirmed that the MEMD-PSO-RF model can deliver better predictions of WS than the other models in the study regions. The results also revealed that MEMD-PSO-RF is effective in extracting features from climatological variables in a tangible way. The performance of MEMD-PSO-RF further showed that the PSO algorithm is advantageous in identifying the pertinent features that assist the RF in predicting the WS time series. In addition, the overall performance of MEMD-PSO-RF confirmed the appropriateness of PSO for sorting out relevant IMFs under the assessment criteria of the MEMD-PSO-RF method (i.e., Tables 1-2). The PSO therefore remarkably improved the performance of MEMD-PSO-RF compared to MEMD-PSO-KRR and the standalone counterpart models.
Since artificial intelligence models depend exclusively on past data, which may significantly affect the 'learning' and forecasting process, the outcomes of the present study establish that appropriate feature selection should be performed carefully before data-driven models are implemented. The MEMD successfully classified and segregated the relevant features inside the climatological inputs to establish a more consistent physical foundation for a particular artificial intelligence method. The usefulness of MEMD in the present study lies in the concurrent pre-processing of numerous climatological predictors. The MEMD can concurrently identify the signals' main frequencies to capture the respective features. This finding corroborates the findings in [72]-[76].
Another key insight is that a model using fewer predictors while retaining competitive accuracy is parsimonious and computationally efficient, which is achievable with MEMD-PSO-RF.
The feasibility of the MEMD method for predicting WS is a major advance of this study. It improved the predictive ability of the RF and KRR models. It is apparent that the MEMD method gave the hybrid model a better understanding of the physical process, allowing the artificial intelligence model to effectively capture the information in the meteorological variables when modelling WS.
The primary reason for implementing MEMD in this study is its self-adaptive nature, which requires little human effort in extracting IMFs. The MEMD conducts a data-driven time-frequency investigation of multiple inputs by considering nonlinear behaviours via a dynamical process [49]. Another major advantage of MEMD is its ability to handle mode alignment concerns very efficiently [77]. Therefore, the MEMD-PSO-RF has potential for WS prediction and management systems. The proposed MEMD-PSO-RF model can be used as a WS modelling system for improving the efficiency of wind energy farms.

V. CONCLUSION
This study provided new insights into environmental modelling by introducing innovative, integrative, intelligence-based data-driven models, MEMD-PSO-RF and MEMD-PSO-KRR, for WS modelling. The proposed models, with enhanced short- and medium-term prediction accuracy, were applied to model monthly WS at Baghdad and Mosul in Iraq. Several meteorological variables were used to build four models: MEMD-PSO-RF, MEMD-PSO-KRR, standalone RF and standalone KRR. The predictive accuracy of the proposed models was evaluated using several performance measures.
Results indicated the superiority of the MEMD-PSO-RF model in reproducing the WS time series at both the Baghdad and Mosul stations. The MEMD-PSO-KRR and standalone RF models also showed good performance. However, the standalone KRR was found to perform unsatisfactorily at both stations. Since most of the meteorological variables used as input in this study are readily available in most regions, WS prediction using the MEMD-PSO-RF model is feasible for practical applications. The results of these models can help practitioners to identify windy areas for the deployment of wind energy systems. Iraq often suffers from dust storms; the use of such models can also help in dust storm risk management. The models can help the authorities to determine dust storm-prone areas and to adopt appropriate strategies for dust storm mitigation. In future, the models developed in this study can be employed with a geographical information system for spatial prediction of WS over the whole country.
His current research interests include optimization algorithms, nature-inspired metaheuristics, machine learning, and feature selection problem for real world problems.
MANDEEP KAUR SAGGI received the bachelor's degree in computer science and engineering from Punjab Technical University and the M.Tech. degree in computer science engineering from D. A. V. University, Jalandhar. She is currently pursuing the Ph.D. degree in computer science engineering with the Thapar Institute of Engineering and Technology, Patiala. Her areas of interest include machine learning, big data analytics, quantum computing, network security, and cloud computing. Her research topics are crop modeling and irrigation water management.
ESMAEEL DODANGEH received the Ph.D. degree in watershed science and engineering from Sari Agricultural Science and Natural Resources University (SANRU), Sari, Iran. His research interests include hydrological modeling, extreme hydrological (drought and flood) frequency analysis, machine learning, and uncertainty analysis of bivariate hydrological frequency analysis. He is very interested in non-linear and complex hydrological systems. His research interests are mainly in geology, water resources, and environment. He served as the dean and the head of department for several academic administrative posts. His publications include more than 424 articles in international/national journals, chapters in books, and 13 books. He executed more than 60 major research projects in Iraq, Jordan, and U.K. He has received several scientific and educational awards; among them, the British Council, on its 70th anniversary, named him one of the top 5 scientists in Cultural Relations. He holds one patent on physical methods for the separation of iron oxides. He has supervised more than 66 postgraduate students at universities in Iraq, Jordan, the U.K., and Australia. He is a member of several scientific societies, e.g., the International Association of Hydrological Sciences, the Chartered Institution of Water and Environment Management, and the Network of Iraqi Scientists Abroad, and is the Founder and President of the Iraqi Scientific Society for Water Resources. He is also a member of the editorial board of ten international journals.
ZAHER MUNDHER YASEEN received the master's and Ph.D. degrees from the National University of Malaysia (UKM), Malaysia, in 2012 and 2017, respectively. He is currently a Senior Lecturer and a Senior Researcher in the field of civil engineering at Ton Duc Thang University. He specializes in hydrology, water resources engineering, hydrological processes modeling, environmental engineering, and climate. In addition, he has a major interest in machine learning and advanced data analytics. He has published over 100 research articles in international journals with a Google Scholar H-Index of 25, and a total of 1900 citations.
SHAMSUDDIN SHAHID is currently an Associate Professor at the Department of Hydraulic and Hydrology, Faculty of Civil Engineering, Universiti Teknologi Malaysia (UTM). He is also the Head of the Integrated Water Resources Management (IWRM) research group, UTM. His major research interests include water resources management, climate change impacts and adaptation to water resources, statistical hydrology, hydrological disasters, and groundwater hydrology. He uses statistical and mathematical tools for innovative solutions of hydrological problems for adaptation to global environmental changes. He published about 100 research articles in internationally reputed indexed journals and five academic books. His major research achievements in recent years include forecasting water demand in a holistic way, downscaling climate using an alternative approach for better projection of climate change, modeling economical impacts of global warming induced changes in water resources, modeling droughts during crop-growing seasons, projection of drought severity-area-frequency curves for adaptation to climate change impacts on droughts and water stress, multi-criteria decision making for selection of adaptation and mitigation strategies, and determination of unidirectional trends to distinguish global climate change from natural variability. He has successfully completed about ten national and international research projects, as a Principal Investigator.