Critical Review of Data, Models and Performance Metrics for Wind and Solar Power Forecast

Global climate change and rising carbon footprints have provided the main impetus for reducing the use of fossil fuels in electricity generation and transportation. Mature manufacturing technologies for solar PV panels and on-shore and off-shore wind turbines have brought the cost of generating electricity from solar energy on par with that of conventional fossil fuels. Initially, solar and wind power generation was envisioned for microgrids serving small local communities. However, advancements in power electronics now allow large solar and wind farms to be integrated with main power grids. In this context, hosting capacity, which is the amount of distributed energy resources a grid can accommodate without significant infrastructure upgrades, has gained importance. The uncertainties of wind and solar power generation play a role in determining the hosting capacity at a particular location. Effective forecasting models using time-series weather data can be built to predict wind and solar power generation. Such forecasts are essential to ensure proper grid operation and control where renewable energy sources are already installed. They are also useful in the planning stages for investment decisions and distribution system planning. While long-term forecasts are rarely needed for the operation of integrated grids, accurate short-term predictive models are necessary for scheduling. This paper presents an extensive review of the forecast models available in the literature. The study mainly focuses on short-term forecasts, providing a critical review of the duration of data used in each model and a synoptic comparison of their performance indices.


I. INTRODUCTION
The energy demand of the earth's population is enormous and growing rapidly, with the current annual energy consumption being approximately 25 kTWh. Current trends indicate that this will double by 2050 and triple by the end of the century. Two approaches may be considered to ease this threat: use energy more carefully or find alternative energy sources [1]. Continued dependence on fossil fuels would lead the human population to an acute energy crisis. The dwindling reserves of fossil fuels and their associated harmful effects, such as pollution and greenhouse gas emissions with resultant climate change, render them less acceptable and less reliable as energy sources for the future. These factors challenge humanity to identify and employ cleaner and greener energy sources that could satisfy and sustain future demand [2], [3]. The high cost relative to the quantum of production limits the employment of the currently known sources of green energy as a viable alternative to conventional sources, though the gap is slowly diminishing. Among the renewable energy sources, solar and wind energy are immensely promising for electric power generation and have the potential to make a sizeable contribution to the electrical energy demand of the planet. Both these sources are practically inexhaustible, freely available, and involve no polluting residues or greenhouse gas emissions [4]. However, solar and wind energy depend on weather conditions, which introduces an element of uncertainty. Though solar power production has a fairly predictable pattern, wind power is intermittent, unreliable [5], [6], and highly erratic [7]. This poses a major threat to the reliability of power dispatch and calls for high-accuracy forecasts on multiple time horizons [8].
Forecasting tools help reduce the deviations between the schedule and the actual dispatch at the load dispatch centers [9]. The power produced by solar PV panels depends primarily on solar irradiance, which exhibits a specific pattern. The magnitude of irradiance is zero through the night and starts increasing at sunrise. It reaches its peak in the afternoon and then declines, reaching near zero at sunset, thus following a periodic daily pattern. The magnitude of irradiance may vary with the weather conditions while retaining the pattern [10]. There can also be variations due to passing clouds. De Giorgi et al. [11] developed multi-regression and ANN models for solar irradiance prediction. A sensitivity analysis using regression models was performed to estimate the impact of different input parameters, such as module temperature, ambient temperature, and irradiance on the modules for different tilt angles. The results showed a high impact of irradiance on PV power. Benghanem and Joraid [12] estimated the relation between different parameters such as global irradiance, diffuse irradiance, and temperature. This helped in evaluating the potential of solar energy in Medina, Saudi Arabia. The measured and estimated values were in good agreement. Ruiz-Arias et al. [13] proposed a new regression model for the prediction of hourly solar irradiance. The model considers the clearness index and relative optical mass. Dong et al. [14] proposed a prediction of solar irradiance using a state-space exponential smoothing model. Before applying this model, the data is made stationary using a Fourier trend model. The performance is compared with other time series forecasting models, such as ARIMA, single exponential smoothing, and the random walk model [15].
Power generation from wind is highly susceptible to variables such as geographical location, wind speed and direction, seasonal changes, and time of day. A uniform efficiency cannot be guaranteed for a particular forecasting method across different geographies [16]. Hence, it is essential to examine the seasonality and other influencing parameters to determine the best-fit model for the location [17], [18]. The literature provides a good number of options for forecasting, such as statistical models and intelligent models. References [19]-[21] give a detailed review of diverse methods to forecast wind power. The authors have also set out directions for developing forecast models. It is specified that a combination of physical and statistical models can be more effective in prediction. In [22], a model is proposed which integrates wind direction with other input parameters to form a mixed ARMA model. The inter-dependencies between wind speed and wind direction are investigated by k-means clustering. A hybrid model using ARMA-GARCH and wavelets is proposed in [23]. The outliers in wind speed data are filtered out using wavelet decomposition, followed by modeling with the ARMA-GARCH model. In [24], ANN and Bayesian algorithms are employed to propose a 2-step prediction. This methodology achieved a MAPE ranging from 14% to 18% in analyses at two different sites.
Artificial Neural Networks (ANN) have gained significant acceptance in recent times for renewable energy forecasts as they overcome certain shortfalls of time series models [25]-[27]. An ANN is a self-learning network, which learns from the given data during the process of training. A well-trained neural network can be made to behave as required. A network is formed with a set of neurons organized in different layers. The bottom layer holds the predictors (inputs) and the top layer contains the forecasts (outputs). A series of known inputs and outputs is fed to the system. At each step, the output of the system is compared with the actual output and the error is propagated backward through the network. When training is completed with a sufficient amount of data, the network is capable of predicting future values. Thus, an ANN is a very good tool for prediction as it can be trained using historical data to predict future values [28], [29]. While training recurrent networks, the input data set is split into three subsets for training, validation, and testing. The training algorithm corrects the network weights and bias values using the training set to minimize the error. The corrections are finalized using the validation set. The trained network is finally evaluated for performance on the test data set.
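The training procedure described above can be sketched in a few lines. The following is a minimal illustration and not a model from the reviewed literature: a single-input network with one tanh hidden layer, trained by stochastic gradient descent on synthetic data. The names `train_tiny_network` and `mse`, the layer size, the learning rate, and the 40/10/10 split are all assumptions chosen for illustration, and the validation set is used here only to report an error figure.

```python
import math
import random


def train_tiny_network(samples, hidden=4, epochs=200, lr=0.05, seed=0):
    """Train a 1-input, 1-output network with one tanh hidden layer.

    `samples` is a list of (x, y) pairs. Returns (predict, history), where
    history holds the mean squared training error of each epoch.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                                   # hidden biases
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                              # output bias

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            h, out = forward(x)
            err = out - y  # compare the network output with the actual value
            total += err * err
            # Propagate the error backward and correct weights and biases.
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(total / len(samples))

    return (lambda x: forward(x)[1]), history


def mse(predict, samples):
    """Mean squared error of `predict` over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)


# Synthetic data (an assumption, not from the paper): y = sin(x) on [0, 3).
data = [(i * 0.05, math.sin(i * 0.05)) for i in range(60)]
random.Random(1).shuffle(data)
train, validation, test = data[:40], data[40:50], data[50:]

predict, history = train_tiny_network(train)
print("training MSE, first and last epoch:", history[0], history[-1])
print("validation MSE:", mse(predict, validation))
print("test MSE:", mse(predict, test))
```

In practice the validation error would typically also guide choices such as when to stop training or how many hidden neurons to use, while the test set is kept aside for the final evaluation.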
In the past few decades, hundreds of models have been proposed in the literature to predict solar and wind power generation. Exhaustive comparisons of these models are scarce in the literature. In this paper, the authors analyze the existing models for renewable generation forecasts and present a comprehensive comparison of the data, models, and their performance. The paper is organized as follows. The installed capacity of solar and wind power is briefly described in Section II. Various forecasting techniques are illustrated in Section III. Statistical models for solar and wind power forecasts are reviewed in Sections IV and V, respectively. Section VI summarizes machine learning techniques used for short-term forecasts. Machine learning models for wind and solar power forecasts are reviewed in Sections VII and VIII, respectively. Section IX presents concluding remarks and the future of short-term prediction for solar and wind power.

II. SOLAR AND WIND ENERGY-INSTALLED CAPACITY
Solar energy is the largest non-carbon energy source. The energy from solar radiation incident on Earth in one hour ($4.3 \times 10^{20}$ J) is higher than the planet's consumption in one year ($4.1 \times 10^{20}$ J). Solar electricity production and supply is a 7.5-billion-dollar industry with an annual growth rate of 35-40% globally. The potential of solar power is clear from the fact that covering 0.16% of the land on earth with solar conversion systems of 10% efficiency could supply energy equivalent to 20,000 GW of nuclear fission plants, which is twice the world's consumption from fossil fuels. The Sun could thus prove to be a virtually unlimited source of energy, dwarfing what human technology can achieve.
Wind energy is another clean, eco-friendly, and abundant source. An advantage of wind energy is that power generation from it is easily scalable. Wind energy has enormous potential for renewable power generation. Studies show that 2.5 MW turbines operating at 20% of their rated capacity, restricted to warm, rural, and non-forested areas, could exceed the cumulative global consumption of energy 5-fold and that of electricity alone 40-fold. Off-shore wind capacity is also colossal. Studies indicate it could serve the energy demand of Europe seven times over and that of the United States four times over.

A. GLOBAL SCENARIO
The global installed capacity of solar PV has reached approximately 627 GW, up from 67 GW at the end of 2011 and only 1.5 GW in 2000. The annual average growth rate of solar PV over the past five years is around 73%. Only a few countries are the major contributors to this growth [30]. The top five contributors, each with a cumulative installed PV capacity of 15 GW or above, are China (207.4 GW), the European Union (131.7 GW), the USA (75.9 GW), Japan (63 GW), and Germany (49.2 GW). These account for 80% of the total global capacity, as shown in Fig. 1. Countries including Greece, India, Australia, France, Korea, and Portugal are gaining momentum due to economic support schemes and new policies. There has been tremendous growth in wind power production in the past 15 years [31]. According to statistics provided by the Global Wind Energy Council, the worldwide wind energy capacity was 17.4 GW in the year 2000. By the end of 2019, it had reached 650.5 GW. The top five contributors to wind production at the end of 2019 were China (236.4 GW), the European Union (192 GW), the USA (105.45 GW), Germany (61.36 GW), and India (37.5 GW). Other countries emerging in wind generation are Brazil, France, the United Kingdom, Spain, and Italy. Fig. 2 shows the wind generation capacity of different countries in the last 3 years.

B. INDIAN SCENARIO
Being a tropical country, India has great potential to tap solar energy. The PV industry in India is large and diversified. There are ten major manufacturers of solar cells, panels, and complete PV systems, and around 50 different assemblers. These companies supply around 200 MW per year of 30 different types of PV systems in the rural, remote-area, and industrial categories. The Ministry of New and Renewable Energy (MNRE) has started several new schemes and incentives which have provided new momentum to the growth of PV installations in India. The total installed PV capacity has quadrupled in the last four years. From May 2014 to April 2017, the total installed capacity increased from 2.26 GW to 12.28 GW. The installed capacity of wind energy in India has also increased considerably in the last few years. The potential of wind energy in India is about 400 GW as per the assessment in 2011 [32]. Major developments in wind energy production started with the formation of the National Program for Wind Energy Development in 1990. The total wind capacity of 1.9 GW in 2002-03 had risen to 37 GW in 2019. At the end of 2019, Tamil Nadu had the highest capacity at 9.2 GW, followed by Gujarat (7.2 GW), Maharashtra (4.79 GW), Karnataka (4.73 GW), and Rajasthan (3.3 GW) [33].

III. FORECASTING TECHNIQUES
Scheduling of electric power requires load forecasting and generation forecasting. Until recently, most power was generated from conventional sources such as thermal, hydel, and nuclear power plants, where the generation is easily controlled and limited by the ratings of the plant. Here, load forecasting is given more importance, and the generation is scheduled to meet the load demand. Hence, generation forecasting receives less attention. Generation forecasting is a vital factor in renewable power generation, where the output is not easily controlled and depends on nature. In such a scenario, it is necessary to forecast the generation before scheduling the load. Therefore, the research focus is shifting towards generation forecasting to enable greater renewable energy penetration. In generation forecasting [34], the period of the forecast is one of the main criteria to be considered. Forecast periods are primarily divided into the following [35]:

A. VERY SHORT TERM
The forecast horizon ranges from a few minutes to 1 hour. This forecast is useful for applications like electricity clearing, real-time grid operation, and regulatory actions [35].

B. SHORT TERM
The forecast horizon ranges from 1 hour to one or two days ahead [35]. This is useful in economic load dispatch planning, operational security in the electricity market, and decisions on load sharing.

C. MEDIUM TERM
The forecast horizon ranges from 5 to 7 days [35]. This is useful for situations like decisions on unit commitment or reserve requirements.

D. LONG TERM
Long-term forecasts, with a horizon of more than one week, are useful for planning power plant maintenance, operation management, etc. [35].
As the forecast horizon gets longer, the forecast becomes more uncertain because of the erratic behavior of the predictors and the forecast variables. In this work, the emphasis is primarily on short-term forecasting. The predictions are done for 1 day ahead. Each forecasting process involves a set of basic steps enumerated as follows:
1) Identification of the problem
2) Data acquisition
3) Exploratory analysis
4) Determination of a suitable model to fit the data
5) Evaluation of the forecasting model
The problem of predicting solar or wind power generation is closely associated with the prediction of weather parameters. This problem can be divided into two parts. First, solar irradiance, wind speed, or any other meteorological variable is predicted. Then, the amount of energy is estimated from the mathematical relations between the predicted parameter and power. Fig. 3 gives a classification of forecasting techniques.
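The second part of the two-part approach above, converting a predicted meteorological variable into power, can be illustrated for wind with an idealized turbine power curve. This is only a sketch: the function name `wind_power_kw` and the cut-in, rated, and cut-out speeds and the rated power are hypothetical values, and the cubic rise between cut-in and rated speed is a common simplification. A real study would use the curve supplied by the turbine manufacturer.

```python
def wind_power_kw(speed_ms, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_kw=2000.0):
    """Map a forecast wind speed (m/s) to turbine output (kW).

    All default parameters are hypothetical and chosen for illustration only.
    """
    if speed_ms < cut_in or speed_ms >= cut_out:
        return 0.0  # below cut-in or shut down for safety above cut-out
    if speed_ms >= rated_speed:
        return rated_kw  # output is capped at the rated power
    # Idealized cubic rise between the cut-in speed and the rated speed.
    fraction = (speed_ms ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_kw * fraction


# Step 1 would produce these speeds from a forecasting model; here they are made up.
forecast_speeds = [2.0, 6.0, 12.0, 30.0]
# Step 2: estimate power from the predicted speeds.
forecast_power = [wind_power_kw(v) for v in forecast_speeds]
print(forecast_power)
```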

E. PHYSICAL MODEL
The physical model starts with meteorological information such as NWP data and then adapts it to the local physical influences [36]. The adaptation is performed through a solar PV model driven by the solar prediction, or through the power curve of a wind turbine, provided by the turbine manufacturer, driven by the wind forecast. This model may lead to systematic errors in the predicted power. These can be minimized by applying Model Output Statistics (MOS) using offline data from the site [37]. As this model requires a large amount of computation, it is normally run on supercomputers and hence is not very commonly used.

F. STATISTICAL MODELS
Statistical models are most appropriate for short-term forecasting [38]. This method uses historical data to predict solar irradiance and wind speed. The accuracy of prediction for these models is lower than that of a physical model, but the computational requirements are also lower. A very high level of accuracy is not demanded in generation scheduling. Thus, statistical models are preferred, where time series models are developed from past data and used to predict the future. Time-series data represents a series of periodic measurements of a variable. The measurements can be hourly, daily, weekly, monthly, yearly, or at any other periodic interval. The primary information required to analyze the past behavior of time series data is the data pattern. The past pattern can be projected to predict the future using a suitable forecast model if the data is expected to behave similarly. Time series models are broadly classified into parametric and non-parametric models. A parametric model assumes that the underlying stationary data series is structured and can be characterized using a minimum number of parameters. In a non-parametric model, the model structure cannot be decided a priori and depends entirely on the amount of data [39]. Various types of statistical models are discussed below:

1) REGRESSION MODELS
In regression analysis, a trend line is fitted to the data points, and the line is projected into the future. Linear or non-linear trend equations can be developed [40]. By evaluating and analyzing a trend, a historical pattern can be determined, which can then be projected into the future. The regression model also helps to separate the trend pattern from the data series. This method is normally used with decomposition techniques. In regression models, the mathematical relationship between two or more variables is evaluated from a plot of the historical data [41]. The multivariate linear regression model equation is given in (1). For single-variable regression, only the coefficients $\beta_0$ and $\beta_1$ exist. All other coefficients $\beta_n$ are zero.
There are n dependent-variable observations and p independent-variable observations. The multivariate non-linear regression model equation is given in (2).
In the equation, the dependent variable is $y$ and the independent variables are $x_1$, $x_2$, etc. Linear and non-linear regression models have been developed for the prediction of solar irradiance and wind speed.
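As an illustration of the single-variable case, where only $\beta_0$ and $\beta_1$ exist, the sketch below fits $y = \beta_0 + \beta_1 x$ by ordinary least squares. It assumes the usual linear form of the regression equation. The function name `fit_simple_regression` and the irradiance and power values are made up for the example and do not come from any of the reviewed studies.

```python
def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = beta0 + beta1 * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx                 # slope
    beta0 = mean_y - beta1 * mean_x   # intercept
    return beta0, beta1


# Hypothetical data: irradiance (W/m^2) as predictor, PV output (kW) as response.
irradiance = [100.0, 200.0, 300.0, 400.0]
power = [12.0, 22.0, 32.0, 42.0]

beta0, beta1 = fit_simple_regression(irradiance, power)
predicted = beta0 + beta1 * 500.0  # project the fitted line to a new irradiance value
print(beta0, beta1, predicted)
```

With several predictors the same least-squares idea applies, but the coefficients are obtained by solving a linear system rather than from these closed-form expressions.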

2) PERSISTENCE MODEL
This is the simplest forecasting technique. In most analyses, this method is taken as the baseline when evaluating the performance of other forecast models [42]. This model assumes tomorrow's conditions to be the same as today's. The complete historical data is not considered for this model. Only the data pattern of the previous day is used.
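A day-ahead persistence baseline of this kind takes only a few lines. The sketch below assumes hourly data (24 samples per day) by default. The function name `persistence_forecast` and the sample values are illustrative.

```python
def persistence_forecast(series, samples_per_day=24):
    """Forecast the next day by repeating the most recent day of observations."""
    if len(series) < samples_per_day:
        raise ValueError("need at least one full day of data")
    return list(series[-samples_per_day:])


# Toy example with 3 samples per day: the forecast repeats the last day.
history_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
tomorrow = persistence_forecast(history_values, samples_per_day=3)
print(tomorrow)
```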

3) MOVING AVERAGE MODEL
These are elementary models used to smooth historical data. Smoothing of data is required to evaluate the trend component. There are several types of moving average models, such as the simple moving average and the weighted moving average [43]. In the simple moving average model of order k, the smoothed value at each observation is calculated by averaging that observation and the m preceding and m succeeding observations, where m = (k − 1)/2. The expression is given in (3):
$$F_t = \frac{1}{k} \sum_{j=-m}^{m} Y_{t+j} \qquad (3)$$
where $F_t$ is the predicted value of the current observation and $Y_t$ is the current observed value. The order of the moving average model has to be chosen carefully. It is a general observation that the trend is captured better with a large interval. The error in the forecasting model increases with the increase in order. The major drawback of the moving average is that equal weight is given to all the past values considered for averaging, whereas in practice the most recent data should have more relevance. The available past data is not fully used in this model. Only the data considered for averaging is made use of. The forecast obtained through a moving average may also be misleading if the data contains seasonal variations.
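The simple moving average of order k described above can be sketched as follows, assuming an odd k so that m = (k − 1)/2 is an integer. The function name `centered_moving_average` is illustrative, and positions without a full window are left as `None` rather than padded, which is one of several possible conventions.

```python
def centered_moving_average(y, k):
    """Average each observation with its m neighbours on both sides, m = (k - 1) / 2."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    m = (k - 1) // 2
    smoothed = [None] * len(y)  # the first and last m points have no full window
    for t in range(m, len(y) - m):
        window = y[t - m : t + m + 1]
        smoothed[t] = sum(window) / k
    return smoothed


print(centered_moving_average([1.0, 2.0, 3.0, 4.0, 5.0], k=3))
```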

4) EXPONENTIAL SMOOTHING
Forecasting by weighted moving average extends the moving average method. The simple moving average model gives equal weight to all k points. In forecasting, however, recent observations are usually the best guide to the future. Hence, it is prudent to use a weighting scheme that gives less weight to older observations than to recent ones. In exponential smoothing models, the weights of older observations decay exponentially [14]. There can be more than one smoothing parameter in exponential smoothing, which gives the classification into single, double, and triple exponential smoothing. These parameters are determined with an iterative method, and their values decide the weights assigned to the observations.

a: SINGLE EXPONENTIAL SMOOTHING
$$F_{t+1} = \alpha Y_t + (1 - \alpha) F_t$$
where $F_{t+1}$ is the predicted value of the subsequent sample, $F_t$ is the predicted value of the current observation, $Y_t$ is the current observed value, and $\alpha$ is the smoothing parameter (varies between 0 and 1).
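The single exponential smoothing update, $F_{t+1} = \alpha Y_t + (1 - \alpha) F_t$ in the symbols defined above, can be sketched as below. Initializing the first forecast with the first observation is an assumption made here; other initializations are also used in practice. The function name is illustrative.

```python
def single_exponential_smoothing(y, alpha):
    """Return forecasts F where F[t] is the forecast for y[t].

    The final element is the one-step-ahead forecast beyond the data.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    forecasts = [y[0]]  # assumption: the first forecast equals the first observation
    for observed in y:
        forecasts.append(alpha * observed + (1.0 - alpha) * forecasts[-1])
    return forecasts


print(single_exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5))
```

A larger α makes the forecast follow recent observations more closely, while a smaller α gives a smoother, slower response.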

b: DOUBLE EXPONENTIAL SMOOTHING
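In double exponential smoothing, a smoothed slope is maintained alongside the smoothed level. Here $\alpha$ and $\beta$ are the level and trend smoothing constants, respectively, and both lie between 0 and 1. $L_t$ and $Y_t$ are the estimated and actual values, respectively, of the time series at time t, and $b_t$ is the slope.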
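A minimal sketch of double exponential smoothing follows, assuming the standard Holt linear-trend form in which the level is updated as $L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + b_{t-1})$, the slope as $b_t = \beta (L_t - L_{t-1}) + (1 - \beta) b_{t-1}$, and the h-step forecast is $L_t + h\, b_t$. The initialization (level from the first observation, slope from the first difference) and the function name are assumptions made for illustration.

```python
def double_exponential_smoothing(y, alpha, beta, horizon=1):
    """Holt's linear-trend smoothing; returns forecasts for 1..horizon steps ahead."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level = y[0]          # assumed initial level
    slope = y[1] - y[0]   # assumed initial slope
    for observed in y[1:]:
        previous_level = level
        level = alpha * observed + (1.0 - alpha) * (previous_level + slope)
        slope = beta * (level - previous_level) + (1.0 - beta) * slope
    return [level + h * slope for h in range(1, horizon + 1)]


# A perfectly linear series is extrapolated along the same line.
print(double_exponential_smoothing([2.0, 4.0, 6.0, 8.0], alpha=0.5, beta=0.5, horizon=2))
```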
c: TRIPLE EXPONENTIAL SMOOTHING
Triple exponential smoothing additionally smooths a seasonal component, where $\gamma$ is the seasonal smoothing constant (lies between 0 and 1) and $S_t$ is the seasonal component.

5) AUTO REGRESSIVE (AR) MODELS
In AR models, the present value of a variable is represented as a regressive function of its past values [44]. Thus, the model is said to be a regression on the past values. The combination of AR and MA models leads to a powerful model named the ARMA(p,q) model. Here p and q are the orders of the AR and MA models, respectively. The order of the ARMA model must be selected appropriately for an effective forecast. These models can only be used if the data is stationary. The model can be expressed as given in (10).
Three steps are required to model a practical time series using the above equation. The first is the transformation of the original series $Y_t$ so that it is stationary about its mean and variance. The second is the selection of suitable orders p and q. The third is the estimation of the parameters $\phi_1, \phi_2, \ldots, \phi_p$ and $\theta_1, \theta_2, \ldots, \theta_q$. Generally, this is achieved using a non-linear optimization procedure that minimizes the sum of squared errors. If the data has seasonal variations, it can be modeled as seasonal ARMA, which is expressed as ARMA(p,q)(P,Q), where p is the order of the autoregressive (AR) part, q is the order of the moving average (MA) part, P is the order of the seasonal autoregressive (SAR) part, and Q is the order of the seasonal moving average (SMA) part. The order of the ARMA model is obtained by analyzing the auto-correlations and partial auto-correlations of the stationary series. Both auto-correlation and partial auto-correlation functions should not have any abrupt cut-offs. Box and Jenkins (1976) presented a set of rules to determine suitable values for p, q, P and Q. According to the Box-Jenkins methodology, any model is pertinent if it results in random residuals. If more than one such model is found to fit well, the model with fewer parameters should be selected. irregular component, which can cause a better model for the data [45]. In AR, the variables usually have a built-in dependence relationship, which violates the assumption that the errors are independent. The choice of the number of past values to be considered also alters the effectiveness, or goodness of fit, of the model [46]. In the ARIMA model, the disadvantage of the seasonal error component experienced in the ARMA model is eliminated by regressing the dependent variable against the past errors. The Integrated (I) part of the ARIMA model eliminates non-stationarity in a time series by differencing the time series as in (11).
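The differencing step and the simplest autoregressive fit can be illustrated together. The sketch below assumes first-order differencing, $Y'_t = Y_t - Y_{t-1}$, and an AR(1) model without a constant term fitted by least squares. It is not a full ARMA(p,q) or ARIMA estimation, which would require the non-linear optimization mentioned above. The function names and the wind speed values are made up for the example.

```python
def difference(series):
    """First-order differencing: d_t = y_t - y_(t-1)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in y_t = phi * y_(t-1) + e_t (no constant term)."""
    numerator = sum(prev * curr for prev, curr in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def forecast_next(series):
    """Difference the series, fit AR(1) on the differences, then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


wind_speed = [5.0, 5.4, 5.6, 5.7, 5.75]  # hypothetical values in m/s
next_speed = forecast_next(wind_speed)
print(next_speed)
```

Here the differences shrink by half at each step, so the fitted coefficient is close to 0.5 and the forecast continues that decay.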
Intelligent techniques for forecasting are appropriate when the time series data are non-stationary and erratic. The advantage of these methods is the reduced computational complexity compared to statistical models [25], [27]. These methods can be categorized into two groups:
1) Based on Neural Networks (NN)
2) Based on Evolutionary Algorithms like the Genetic Algorithm (GA)
The methods based on NN solve the problem by mimicking the human brain. The designed network is trained with historical data. The trained network is then validated and tested with known data. In evolutionary algorithms, the problem is solved by imitating the process of biological evolution. Some hybrid models combine the advantages of two or more models [26].

H. ERROR MEASURES
It is essential to perform a comparative analysis to determine the accuracy of individual models when different models are employed for the test data. The right choice of the error measures is pivotal in determining the accuracy of the predictive model. Three error measures are used primarily to compare different models: Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Symmetric Mean Absolute Percentage Error (SMAPE).

1) MEAN ABSOLUTE ERROR (MAE)
MAE is defined as
$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| X_t - F_t \right|$$
where $X_t$ is the input data (actual value) at time t, $F_t$ is the predicted value at time t, and N is the number of samples used in computing the error. MAE is an absolute measure and can range from zero to infinity. MAE cannot be considered a final guideline to judge the accuracy of the model, as the significance of the error depends on the actual value. For example, an error of 10 is negligible when the actual value is 2000 but significant for an actual value of 50.

2) MEAN ABSOLUTE PERCENTAGE ERROR (MAPE)
MAPE is defined as
$$\mathrm{MAPE} = \frac{100}{N} \sum_{t=1}^{N} \left| \frac{X_t - F_t}{X_t} \right|$$
MAPE represents the error relative to the actual data, expressed as a percentage, thereby directly indicating the accuracy of the model. For an error of 10, MAPE is 1% for an actual value of 1000 and 20% for an actual value of 50. This makes MAPE an effective pointer to the accuracy of forecasting techniques. However, because MAPE uses the absolute value of the error, it cannot indicate whether the predicted values deviate in the positive or negative direction. Outliers may also affect MAPE.

3) SYMMETRIC MEAN ABSOLUTE PERCENTAGE ERROR (SMAPE)
The disadvantages of MAPE can be overcome by taking the ratio of the error to the mean of the actual and predicted values. This is defined as the Symmetric MAPE:
$$\mathrm{SMAPE} = \frac{100}{N} \sum_{t=1}^{N} \frac{\left| X_t - F_t \right|}{\left( X_t + F_t \right)/2}$$
Taking the same example as in MAPE, SMAPE gives an error of 28.57% in either case.
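The three error measures can be computed directly from paired actual and predicted values. The sketch below follows the verbal definitions above: MAE averages the absolute errors, MAPE averages the absolute errors relative to the actual values, and SMAPE averages them relative to the mean of the actual and predicted values. The function names are illustrative, the sample numbers are made up, and the code assumes non-zero actual values.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = (abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE, in percent: error relative to the mean of actual and predicted."""
    ratios = (abs(a - p) / ((abs(a) + abs(p)) / 2.0) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


actual = [1000.0, 50.0]
predicted = [1010.0, 60.0]  # an error of 10 in both cases
print(mae(actual, predicted))
print(mape(actual, predicted))
print(smape(actual, predicted))

# Swapping the roles of actual and predicted leaves SMAPE unchanged, unlike MAPE.
print(smape([30.0], [40.0]), smape([40.0], [30.0]))
```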

IV. REVIEW OF STATISTICAL MODELS FOR SOLAR FORECAST
In [48], S. Kaplanis et al. propose a novel method to predict solar irradiance for any hour of a day based on only one measurement in the morning. Five years of measured data is used for model development. Hourly global solar radiation is determined based on the day of the year and the site location. A correction factor is introduced which accounts for the air mass difference during the penetration of solar irradiance. The results obtained are within the limits of the standard deviation. In [39], the solar irradiance data is classified by season. Each seasonal series is decomposed into trend and random components. The trend component is modeled with the least squares method and the random series with the ARMA model. The prediction of solar irradiance is the superposition of the individual predictions. Solar irradiance data from 1998 to 2005 is used for model building and testing. The average error obtained is between 15% and 21%. Finally, the predicted solar irradiance is used to predict the power output from the solar panel.
In [49], a data mining approach is used to implement solar irradiance forecasts. The historical data for 5 years is classified into different clusters, such as cloudy sky and clear sky, using a k-means clustering algorithm. This is done to obtain rainfall probability patterns, which are later used to predict solar irradiance. In [50], the CART (Classification and Regression Trees) data mining algorithm is chosen to analyze the data. The results indicated that the seasonal parameters are of great importance to the prediction model. In [51], the multiplicative ARMA model is used to predict solar irradiance. Four years of data are considered for model development. Annual periodicity and seasonal variations were removed. The order of the ARMA model is determined by observing the ACF and PACF plots. In [13], a new regressive model based on a sigmoid function, with the clearness index and relative optical mass as predictors, is proposed. Data from 21 weather stations in different locations are considered for model building. The model is compared with two other similar models in the literature and is found to perform better. Other models are concisely tabulated in Table 1.

V. REVIEW OF STATISTICAL MODELS FOR WIND FORECAST
Forecast models are highly data dependent when the data is erratic, as wind speed is. Existing time series models such as exponential smoothing and ARIMA may not give the best results with the data available from the region of study. Hence, it is necessary to apply various models to the data to understand the behavior pattern of the wind, to zero in on a model, and to develop a custom-made solution for the region of study. The first step towards developing the best forecast model is to test the basic existing statistical time series models, such as persistence, regression [62], moving average, exponential smoothing, and ARIMA, as these are the simplest models available.
References [19], [20] give a detailed review of diverse methods to forecast wind power. The authors have also set out directions for developing forecast models. It is specified that a combination of physical and statistical models is more effective for prediction. Reference [22] proposes a model which integrates wind direction with other input parameters to form a mixed ARMA model. The inter-dependencies between wind speed and wind direction are investigated by k-means clustering. In [23], the outliers in wind speed data are filtered out using wavelet decomposition, followed by modeling with a hybrid ARMA-GARCH model. In [63], ANN and Bayesian algorithms are employed to propose a 2-step prediction. This methodology achieved a MAPE ranging from 14% to 18% in analyses at two different sites.
Decomposition is one of the oldest methods to analyze and characterize time series data. The idea of decomposing time series data was introduced in the early 1950s. Since then, extensive studies have been carried out by various researchers in this area. Initial works focused on seasonal decomposition and the effect of using forecasting to reduce the complexity of seasonal adjustments [64]. Some researchers worked on the forecast effects of trend variance [65]. A decomposition method based on seasonally adjusted data was proposed by Miller and Williams [66]. Later, the concept of empirically combining distinct seasonal models was proposed [67]-[69]. In [70], Chen analyzed the strength of various models such as regression, ARIMA, and Holt-Winters' and found that Holt-Winters' gave better results. In [71], the forecast performance of various seasonal models with different types of real-time data was analyzed. The performance of each model varies mainly depending on the data. Thus, there is no consensus yet as to the situations under which each model is preferred. Recent works include [72], [73]. In [72], the authors proposed a generic method to identify and characterize different types of changes occurring in time series data. The algorithm integrates the decomposed components with methods to detect multiple changes. In [73], the authors focus on separating trends and growth cycles. The complexity of separating trends arises from the interaction of trends and business cycles. Thus, a piece-wise linear approach termed the Phase Average Trend (PAT) was developed. In [74], a day-ahead prediction of wind speed is performed by developing a multiplicative decomposition model. The analysis indicates that the decomposition model is promising, giving better results and lower errors than the existing basic forecasting methods built on time series models.
A concise framework of other statistical models for wind forecast is presented in Table 2.
It can be observed from the results of the decomposition models presented that the percentage errors are between 10 and 15% in the solar forecast and between 10 and 25% in the wind forecast. Even though the models gave reasonable accuracy in the solar forecast, they did not effectively capture the outliers in the wind data. As wind data is too erratic to be modeled by a time series model, non-linear models such as machine learning models would be more effective.

VI. OVERVIEW OF MACHINE LEARNING BASED MODELS
Machine learning models are extensively used in forecasting applications because they can learn the behavior of a data set and predict the output from historical information. Some of the machine learning-based models extensively used in solar and wind power prediction are SVM, ELM, BA-BP, EANN, PSO-SVM, combined ARMA and SVM, combined ARMA and PSO-SVM, clustered hybrid, deep learning, swarm intelligence, ensemble, and hybrid ensemble learning methods. The classification of machine learning techniques is depicted in Fig. 4.
A brief description of a few of these models follows.

A. SVM
SVM is a popular statistical learning technique based on structural risk minimization [86]. Given a data set $\{(a_i, b_i)\}_{i=1}^{N}$, where $N$ is the length of the data set, $a_i \in X^n$ is the $i$th input vector and $b_i$ is its label, the aim is to set up a classifier function $f(a) = W^T a + Y$ that maps the input values $a$ to the labels $b$. Here $W = [w_1, w_2, w_3, \cdots, w_R]^T$ denotes the weight vector and $Y$ is a scalar quantity; $f(a)$ reflects the mapping of the input variables to a higher-dimensional space. Since SVM minimizes the structural risk, the optimization problem can be restated in terms of a penalty number $b$ and the marginal distances $\delta_i$.
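For reference, the textbook soft-margin formulation that matches this description can be written as follows. This is a sketch under assumptions rather than the equation of [86]: the penalty number is written $C$ here to avoid a clash with the labels $b_i$, the labels are assumed to take the values $\pm 1$, and $\delta_i$ plays the role of the marginal distance (slack) of the $i$th sample.

```latex
\min_{W,\,Y,\,\delta}\quad \frac{1}{2} W^{T} W + C \sum_{i=1}^{N} \delta_i
\qquad \text{s.t.}\quad b_i \left( W^{T} a_i + Y \right) \ge 1 - \delta_i,
\quad \delta_i \ge 0,\quad i = 1, \dots, N
```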

B. EXTREME LEARNING MACHINE (ELM) MODEL
ELM was developed primarily to overcome the training difficulties of single-hidden-layer neural networks [87]. Its main advantages are accuracy and speed: a single-hidden-layer NN trained this way is computationally faster than conventional learning frameworks [88], [89]. Consider a single-hidden-layer NN. The input weights $w_i$ and the hidden-layer offsets $Y$ are initialized randomly, which fixes a unique hidden-layer output matrix $H$. The network is thereby transformed into a linear system $H\gamma = T$, and training reduces to identifying the output weights [90]:
$\gamma = \hat{H} T$
Here $\hat{H}$ represents the (generalized) inverse of the matrix $H$.
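The ELM training step described above can be sketched as follows. The sketch assumes a tanh activation, normally distributed random weights, and the Moore-Penrose pseudo-inverse for the inverse of $H$; the function names `elm_fit` and `elm_predict` are illustrative and do not come from [87]-[90].

```python
import numpy as np


def elm_fit(X, T, n_hidden, seed=0):
    """Fit a single-hidden-layer ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (never trained)
    bias = rng.normal(size=n_hidden)             # random hidden-layer offsets
    H = np.tanh(X @ W + bias)                    # hidden-layer output matrix
    gamma = np.linalg.pinv(H) @ T                # output weights solving H gamma = T
    return W, bias, gamma


def elm_predict(X, W, bias, gamma):
    """Apply the fixed hidden layer and the fitted output weights."""
    return np.tanh(X @ W + bias) @ gamma
```

Because only the output weights are solved for, training is a single linear least-squares problem, which is the source of the speed advantage noted in the text.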

C. BAT ALGORITHM-BACK PROPAGATION MODEL
This technique is an amalgam of two methods. Back Propagation (BP) is a multi-layer feed-forward NN in which the signal is propagated forward and the error is propagated backward. It is among the most popular machine learning techniques for function approximation, pattern recognition, and classification. The network framework of this technique is highly complex [91]. The Bat Algorithm (BA) is modeled on the echolocation behavior of bats [92] and has been applied to classification and design problems in many engineering applications [93], [94]. In the combined BA-BP framework, the initial weights are optimized with BA and passed to the BPNN, which improves prediction accuracy.

D. ELMAN ANN
This type of NN is a simple recurrent NN [95] that consists of four structural layers: an input layer, a middle (hidden) layer, a receiving layer, and an output layer. The main function of the receiving layer is to store the output information of the hidden layer [96], [97]. The output of each hidden-layer neuron is a function of the weighted receiving-layer outputs $F_k(t)$ and input neurons $I_j(t)$, where $V_{ik}$ and $W_{ij}$ are the corresponding weights. This model is preferred for dynamic modelling, as it has the capability of self-correlation and can process dynamic information.
BP-based NNs are robust in terms of learning and adaptability. However, these models are prone to minor optimization problems and converge slowly [98]. SVM can lessen the complexity of solving problems in high-dimensional spaces and, compared to NN, has greater generalization and extension capability [99]. In the case of ELM, only the number of hidden-layer neurons needs to be set to obtain good prediction accuracy. The learning and generalization capabilities of ELM are superior to those of other learning machine models; however, ELM is prone to regularization problems. The EANN model has enhanced model-fitting capability: the hidden-layer output of the last iteration is memorized in an additional layer, the receiving layer. This model is more complex than other network frameworks because of the number of hidden layers in its framework. Owing to these features, combination models are helpful and noteworthy in enhancing prediction accuracy, as they overcome the shortfalls of the individual models.
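The Elman hidden-layer update described above can be made concrete with a short sketch. It assumes the common form in which neuron $i$ applies a tanh activation to $\sum_j W_{ij} I_j(t) + \sum_k V_{ik} F_k(t)$ and the receiving layer simply copies the previous hidden output; the activation and the function names are assumptions, not details given in [95]-[97].

```python
import math


def elman_step(inputs, context, W, V):
    """One hidden-layer update: h_i = tanh(sum_j W[i][j]*I_j + sum_k V[i][k]*F_k)."""
    hidden = []
    for w_row, v_row in zip(W, V):
        s = sum(w * x for w, x in zip(w_row, inputs))
        s += sum(v * f for v, f in zip(v_row, context))
        hidden.append(math.tanh(s))
    return hidden


def elman_run(sequence, W, V):
    """Run a sequence through the hidden layer, starting from a zero receiving layer."""
    context = [0.0] * len(W)
    states = []
    for inputs in sequence:
        # The receiving layer stores the hidden output for use at the next step.
        context = elman_step(inputs, context, W, V)
        states.append(context)
    return states
```

The stored context is what gives the network its memory of earlier samples, which is why the text describes it as suited to dynamic information.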

1) LINEAR COMBINATION
This methodology is usually preferred for time series prediction. To find the weights of the linear combination, a VMD-IPN and RBF framework is used. This forecasting model can reduce the prediction accuracy problems of the single models [100]. In a combination model, $Z = [Z_1, Z_2, \cdots, Z_N]^T$ is the array of target values and $\hat{Z}^{(j)} = [\hat{Z}_1^{(j)}, \hat{Z}_2^{(j)}, \cdots, \hat{Z}_N^{(j)}]^T$, $j = 1, 2, \cdots, n$, is the array of values predicted by the $j$th single model. The forecasted values can therefore be found by
$\hat{Z} = \sum_{j=1}^{n} W_j \hat{Z}^{(j)}$
where $n$ denotes the total number of single models in the linear combination, $N$ is the number of predicted values, and $W_j$ is the weight of the $j$th model. To enhance the efficacy of the linear combination model, the final predictive value $\hat{Z}$ and the single predictive values $\hat{Z}^{(j)}$ should comply with the inequalities [101]
$\gamma(Z, \hat{Z}) \le \gamma(Z, \hat{Z}^{(j)}), \quad \forall j \; (j = 1, 2, 3, \cdots, n)$
where $\gamma$ is the forecasting error. The linear combination model finds the weight of each component according to its predictive capability, and the combination weights are then predicted by the NN. The model first develops the in-sample training-validation pairs and then finds the combination weights by Variational Mode Decomposition (VMD), in-sample training-validation pair based neural network weighting (IPN), and the radial basis function (RBF).
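A linear combination and the accompanying error inequality can be checked with a few lines of code. The sketch below assumes the weights are already known and uses the mean absolute error as the forecasting error $\gamma$; both choices are illustrative, since the text obtains the weights from a neural network and does not fix the error measure.

```python
def combine(predictions, weights):
    """Weighted sum of n single-model forecasts, each a list of N values."""
    n_points = len(predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, predictions)) for t in range(n_points)]


def mae(actual, forecast):
    """Mean absolute error, used here as the forecasting error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def satisfies_inequality(actual, predictions, weights, error=mae):
    """True if the combined forecast is no worse than every single model."""
    combined_error = error(actual, combine(predictions, weights))
    return all(combined_error <= error(actual, p) for p in predictions)
```

For example, two models that err by the same amount in opposite directions cancel exactly under equal weights, so the inequality holds with a combined error of zero.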

2) ARMA AND SVM
This is a hybrid model developed for wind power prediction. The ARMA model extracts the time series information [82], after which the model predicts future values with the help of historical data. Consider a time series $f(t)$; the ARMA $(a, b)$ model is given by
$f(t) = \sum_{i=1}^{x} a_i f(t-i) + \sum_{j=1}^{y} b_j \gamma_{t-j} + \gamma_t$
Here $a_i$ are the autoregressive coefficients and $b_j$ are the moving average coefficients, which are determined using LSE; $\gamma_t$ is Gaussian noise, $x$ is the order of AR, and $y$ is the order of MA.
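As a small illustration of the ARMA recursion, the sketch below computes a one-step forecast from given coefficients and shows a least-squares estimate for the simplest AR(1) case. The function names and the restriction of the fit to AR(1) are assumptions made to keep the example short; they are not the procedure of [82].

```python
def fit_ar1_lse(series):
    """Least-squares estimate of a in f(t) = a * f(t-1) + noise."""
    num = sum(cur * prev for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def arma_forecast(history, residuals, ar, ma):
    """One-step forecast: sum_i ar[i]*f(t-1-i) + sum_j ma[j]*e(t-1-j).

    history[-1] and residuals[-1] are the most recent value and residual.
    """
    ar_part = sum(c * history[-1 - i] for i, c in enumerate(ar))
    ma_part = sum(c * residuals[-1 - j] for j, c in enumerate(ma))
    return ar_part + ma_part
```

The expected value of the unobserved noise term at the forecast time is zero, which is why it does not appear in the forecast.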

3) PSO-SVM
This framework is an amalgam of an optimization technique and a learning mechanism. This combination model can solve problems related to regression and classification [82]. SVM has greater generalization capability than other ML techniques. The RBF kernel parameter and the penalty factor are important in determining the performance of the SVM technique, and PSO is used to find the best values of these two parameters.
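The parameter search can be sketched with a basic particle swarm. To keep the example self-contained, the SVM validation error is replaced by a hypothetical stand-in, `toy_validation_error`, whose minimum sits at an arbitrary point in (log10 penalty, log10 RBF parameter) space; in practice this function would train an SVM and return its validation error. The inertia and acceleration constants are common textbook values, not values reported in [82].

```python
import random


def toy_validation_error(p):
    """Hypothetical stand-in for SVM validation error over (log10 C, log10 gamma)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


def pso(objective, bounds, n_particles=20, n_iter=100, seed=0,
        inertia=0.7, c1=1.5, c2=1.5):
    """Minimize objective over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep every particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

Searching in logarithmic space is a design choice made here because both SVM parameters typically span several orders of magnitude.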

4) ARMA AND PSO-SVM MODEL
To enhance the prediction capability, it is necessary to combine two or more models with rational weights. Such combination models outperform single prediction models. However, it is very difficult to rate the prediction accuracy of the individual models arbitrarily. In this model, a time series technique and an intelligent technique are used together. The main advantage is that the combination can store more information and enhance prediction accuracy [82]. The combined ARMA and PSO-SVM model can be written as
$G_{COM} = \delta_1 G_{ARMA} + \delta_2 G_{PSO\text{-}SVM}$
where $\delta_1$ and $\delta_2$ denote the weights of the ARMA and PSO-SVM models and should satisfy the condition $\delta_1 + \delta_2 = 1$. $G_{COM}$ denotes the prediction of the combination model, and $G_{ARMA}$ and $G_{PSO\text{-}SVM}$ denote the predictions of the two component models.
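One simple way to choose the two weights is least squares on a validation set, subject to the weights summing to one. The sketch below is an assumption for illustration, since the text does not state how the weights are obtained in [82]; the closed form follows from minimizing the squared error of $\delta_1 g_1 + (1 - \delta_1) g_2$ with respect to $\delta_1$.

```python
def optimal_weights(actual, g_arma, g_svm):
    """Least-squares weights (d1, d2) with d1 + d2 = 1, clipped to [0, 1]."""
    num = sum((y - s) * (a - s) for y, a, s in zip(actual, g_arma, g_svm))
    den = sum((a - s) ** 2 for a, s in zip(g_arma, g_svm))
    # If both models give identical forecasts, any split is equivalent.
    d1 = 0.5 if den == 0 else min(max(num / den, 0.0), 1.0)
    return d1, 1.0 - d1
```

With one model biased by +2 and the other by -1, the weights come out as 1/3 and 2/3, which cancels the two biases exactly.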

5) CLUSTERED COMBINATIONAL MODEL
To improve the prediction accuracy of wind power, the historical data must be properly investigated. The main advantage of clustering is that it encapsulates the characteristics of wind speed, power, and temperature and can also verify the effective data set. Other combination models, namely EEMD-LSTM (Ensemble Empirical Mode Decomposition-Long Short-Term Memory network), EEMD-ARIMA, and EEMD-MLP (Multi-Layer Perceptron), have been tested on different data sets and the results validated under different conditions [82].

VII. REVIEW OF VARIOUS MACHINE LEARNING METHODS FOR WIND FORECAST
Wind speed forecasting helps stabilize regulatory markets through proper power dispatch, thereby enabling structured planning of reserve power capacity. BESS is extensively used as a reserve in modern wind power plants, and the shelf life of BESS units deteriorates due to frequent charging and discharging. The capacity of the BESS is predicted from the charging and discharging schedules. An accurate wind power forecasting technique enables a wind farm operator to determine the capacity of battery units in reserve power mode [102]. Deep learning techniques are widely used in applications like regression, clustering, classification, and pattern recognition. Also, ANN-based methodologies [35] are in use to predict solar radiation and wind speed. For wind farms, the data sets can be obtained from failure history, SCADA, maintenance, and CMS data.
Deep Mind is an NN-based forecasting tool developed by Google. Some of the common causes of wind turbine failure are fluctuations in the rotor blades and the tower. Machine learning based methodologies can be used to detect faults and monitor the performance of the wind turbine [103]. Wind turbines are subject to wear and tear; ML-based techniques can help prevent such events by analyzing parameters like yaw angle, pitch angle, and rotor speed. Some of the most commonly used AI techniques are feature-based algorithms, which fetch information like acoustics, temperature, torque, and rotor speed from the sensors housed on the turbine equipment [104]. The flow of wind primarily depends on atmospheric parameters like pressure, humidity, height, and surface irregularities. These parameters are of utmost importance for short-term predictions. In solar power prediction, the levels of irradiance, humidity, and temperature have to be given more priority. ML methods are the prime choice for handling non-linearity. Accurate wind speed forecasting requires a large amount of data, which is used to train a supervised machine learning regression technique. Extreme learning machines, SVM, and ANN are the most frequently used machine learning regression techniques [105]. For horizons ranging from three minutes to six hours, multilayer perceptron based NN techniques are used to forecast wind power. A hybrid model consisting of a wavelet transform and an NN is used for short-term wind speed forecasting in Portugal [106]. In this framework, the wind speed data is divided into sub-series through a discrete wavelet transform. These sub-series are then fed to the neural network for training, during which the error is minimized. The learning algorithm generally used is backpropagation [107].
Backpropagation lacks speed and, as a result, is replaced by the Levenberg-Marquardt (LM) algorithm. The NN-based wavelet technique is compared with ARIMA and NN using the data sets, and the value of MAPE is around 6.97%. Physical algorithms are usually preferred to map the data of the wind turbine into numerical weather prediction (NWP) systems [108]. This technique is preferred to enhance the prediction of wind speed; however, it is not preferred for short-term wind forecasting. An enhanced auto-regressive integrated moving average (ARIMA) model has been developed to forecast the wind speed; the obtained results are marginally acceptable in terms of accuracy and efficacy. Single statistical models are not effective in extracting the wind speed information. Generally, SVM and ANN techniques are used for wind speed forecasting in such cases, but these techniques are highly expensive in terms of training the model. Conventional single models, namely physical, statistical, and NN models, are not effective in terms of the accuracy of predicting wind speed. These models are affected by noise, which reduces forecasting accuracy. It is, therefore, necessary to combine forecasting models to enhance the robustness of the multiple single models [109]. These combined models for wind forecast are classified as model variables and structure optimization based, weighted, and error correction based [106]. In model variables and structure optimization based methods, a modified optimization technique [114], [115] is applied to a model during the training phase to obtain greater accuracy. In [116], a PSO-based optimization technique is applied to LSSVM [117] to forecast short-term wind speed. To optimize the requisite parameters in an RBF technique, the ordinary least squares method is used instead of the conventional gradient search method. Many other optimization techniques, such as Grey Wolf, Whale, Bat, and GA, have also been used for parameter optimization.
To increase the accuracy of wind power forecasting, error correction-based models are usually preferred. In this methodology, the information present in the system error is extracted to improve forecasting accuracy. For example, a combined grey and Markov model is implemented to ensure greater precision [118], [119]. In this combined methodology, the errors forecasted by the grey model are categorized into different conditions using the Markov model, and the probability of each condition is estimated. The most frequently used model amongst the weighted combinations is the linear combination model, in which the approach used is the average of the components. In [120]-[122], results suggest that a simple naïve average can achieve acceptable correctness, although with lower accuracy. These methods cannot recognize the fact that the prediction values depend on past information. The weights of the backpropagation neural network and statistical models are identified by the error propagation of the wind speed forecasting [123]. To overcome the shortfalls of single-model prediction methods, a multi-objective Bat algorithm is applied to determine the optimal weight coefficients [105]. The aforementioned prediction methods have certain shortfalls. Model variables and structure optimization combination models are computationally complex.
Also, in AI-based optimization techniques, training time is too long, and they are subject to over-fitting problems. Error correction-based models can reduce the prediction error but are extremely expensive in terms of designing a model. Weight-based models are not preferred over the first type, but they have greater adaptability to information and can ensure stabilized predictive performance. Even though these models are not capable of optimal prediction, the results are within the prediction sequence. Therefore, it is necessary to combine an additional model for the identification of the weight of every single model. Studies suggest many methods to find the correct weights but, in most cases, predicting the optimal weights remains very difficult. Because of the aforesaid shortfalls, it is necessary to use a combination model for wind speed forecasting, which can combine the strengths of the individual components to achieve greater accuracy. Support Vector Machine (SVM) is a popular regression model that maps the predicted values to actual values of wind speed. A multivariate least square SVM is proposed in [124]. The method is tested on data collected over a year from wind farms in different locations. The best MAPE obtained is 10.06%.
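Since MAPE is the index most often quoted in this review, a minimal sketch of its computation is given here together with the root mean square error. Skipping samples whose actual value is zero is an assumption made to avoid division by zero; the reviewed papers do not state how they treat such samples.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent; samples with zero actual are skipped."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def rmse(actual, forecast):
    """Root mean square error in the units of the data."""
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)) ** 0.5
```

Because MAPE divides by the actual value, it becomes unstable when wind speed or irradiance is close to zero, which is one reason results from different sites are hard to compare directly.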
Hybrid models using fuzzy logic and AI techniques are proposed for forecasting in [125]. NWP is used to do a preliminary forecast and the quality of NWP is estimated with fuzzy rules. A one-hour-ahead forecast is done using a neural network with wind speed measurements from a wind farm and NWP data as inputs. The worst-case error is in the range of 40 to 50%. With a second model, the quality of NWP is estimated by using the output of the first model as input to the second model. The rule sets of the fuzzy model are decided by comparing the wind power obtained from the power curve of a wind turbine with NWP data. The fuzzy model output yields an NWP quality index. With a third model, a separate forecast is done for each class of wind speed. The error obtained is in the range of 5 to 14%. A combined hybrid approach using ANFIS, PSO, and wavelet is proposed in [75]. The wavelet transform is used to decompose the measured values of wind power, and the prediction of this data is performed using ANFIS. The accuracy of the model is increased by training the parameters of the membership function of ANFIS using PSO. The best MAPE reported using the proposed model is 4.98%. In [126], wind speed prediction is performed with a novel ANN model, ICA-NN. The inputs to the prediction model are measured data from a SCADA system and NWP data. This data set contained temperature, humidity, and wind speed. The Imperialist Competitive Algorithm is used to adjust the weights. The Mean Square Error is found to be less than 20%. In [127], a Wavelet Neural Network (WNN) is used to predict wind power. WNN is an ANN in which the activation function is a wavelet function. A new training strategy is proposed based on the Clonal Search Algorithm (CSA). The proposed model is compared with existing models such as Simulated Annealing (SA), Particle Swarm Optimization (PSO), CSA, and Differential Evolution (DE). The proposed method has an error of 9.7%, which is the lowest.
In [128], a hybrid model is developed to predict wind speed, in which the wind speed is decomposed into several sub-layers using empirical mode decomposition. Each decomposed series is predicted with a neural network optimized by a genetic algorithm and a mind evolutionary algorithm. The lowest MAPE obtained was 2.5%. In [129], the lifting wavelet transform and Support Vector Machine (SVM) are used for model building. The wavelet transform characterizes the original wind speed and SVM improves the prediction accuracy. Each decomposed series is predicted separately, and the predictions are superimposed to obtain the final prediction. The prediction error is found to be 14.9%. In [130], an SVM-enhanced Markov chain model is chosen for wind power forecast. The Markov chain is used to capture the normal variations in wind speed, whereas SVM identifies the wind ramp dynamics. Several methods for wind power forecast have been proposed by Jing Shi et al. In [24], [25], ANN techniques are used. One method is a two-step technique based on a neural network and a Bayesian algorithm. The algorithm was tested on data collected from two sites and the MAPE is between 14 and 18%. Prediction models with ANN are compared with auto-regressive models and seasonal ARIMA models in [26], [27], [80], [131]. A new training algorithm is proposed by applying the Lyapunov stability approach to ANFIS. The results are compared with common training algorithms like GD and RLS. In [132], an entropy-based neural network learning algorithm is proposed.
Wind speed depends on many inputs such as temperature, humidity, wind direction, and atmospheric pressure. Algorithms using the wavelet transform are extensively reported in the literature [133]-[137]. The technique is used to smooth the data prior to prediction, as time series data is effectively decomposed in the time-frequency domain. Interest in wavelet transform applications for forecasting began in the 1990s. A percentage error of 14.9% is obtained using a lifting wavelet for characterization of the signal and the SVM technique to obtain better accuracy [138]-[144]. In [144], [145], hybrid wavelet-neuro algorithms with novel training strategies are proposed to reduce prediction error. Recent works include the application of empirical mode decomposition and the principal component reconstruction technique with multilayer ANN. The authors of the reported literature have used data specific to a particular geographic location, and hence dependent on local atmospheric and weather conditions. This makes it difficult to compare the accuracy of the various models. However, from the results presented, a general conclusion can be drawn that hybrid models using machine learning algorithms have better accuracy. To ensure better performance and forecasting accuracy, hybrid or combination models are the prime choice. In combination models, the diversity of the combination is given more importance than the optimality of each model [146]. In the study of these combination models, different types of NN combinations are considered. A comparison of various single and combination models used for wind power forecasting is presented in Tables 3 and 4.
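The wavelet pre-processing step that recurs in this section can be illustrated with one level of the Haar transform, the simplest discrete wavelet. The choice of the Haar wavelet and the function names are assumptions for illustration; the reviewed papers use various wavelets, including lifting schemes.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT on an even-length signal: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Reconstruct the signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

In a hybrid scheme, the approximation and detail series would each be forecast by their own model and the forecasts recombined through the inverse transform, which corresponds to the superposition step described for [129].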

VIII. REVIEW OF MACHINE LEARNING TECHNIQUES FOR SOLAR FORECAST
In [147], a forecasting technique is proposed that uses a neural network and fuzzy theory. Fuzzy rules are determined to predict solar insolation from humidity and the amount of cloud. Fuzzy rules are set using min-max theory, based on the measured values of cloud amount and humidity. A correction method is also proposed to reduce forecast errors. The output of solar power obtained using fuzzy logic and correction method is used to train the neural network.
In [154], a one-day-ahead solar forecast using ANN is proposed. The novelty of the proposed method is that the operator of the PV panel can appropriately select the model parameters of the ANN, such as the number of hidden layers and the number of delay elements. A NARX network with the back-propagation LM algorithm is chosen for the model development. The absolute percentage error is between 1 and 5%. In [155], ANN and GNN (Generalized Neural Network) models are compared. The first stage in modeling with GNN is an offline process called the model structuring phase, where the aggregation function is fixed. The second stage is testing, which is an online process. The GNN model performed better than the normal ANN. In [135], a novel ANN-based model for the prediction of solar irradiance is proposed. Instead of considering solar irradiance as the only parameter in training the ANN, an input vector that consists of statistical feature parameters is constructed. To obtain this, the relationship between surface solar irradiance and extra-terrestrial irradiance is determined. The LM algorithm is chosen for training the ANN. The MAPE obtained for sunny and cloudy days is 9 and 26%, respectively. In [11], a short-term solar forecasting system is implemented using an Elman ANN. An initial investigation of the relationship between different input parameters is carried out using multivariate regression. The ANN is trained with different sets of input vectors. The first vector contained only measured PV power, the second vector was formed by combining PV power and irradiance on the plane of the PV module, and the third vector was formed by combining PV power and ambient temperature. The forecast error for the day-ahead prediction was found to be between 19.5 and 26.2%. In [49], a data mining approach is used to implement solar irradiance forecasts. The historical data for 5 years is classified into different clusters, such as cloudy sky and clear sky, using a k-means clustering algorithm.
This is done to obtain rainfall probability patterns. These patterns are later used to predict solar irradiance. In [50], the CART (Classification and Regression Trees) algorithm of data mining is chosen to analyze the data. The results indicated that the prediction model depends strongly on the seasonal parameters. In [156], solar prediction models are developed with two multiple regression models, the least-squares algorithm and the Support Vector Machine (SVM) technique. In the least-squares method, parameters like dew point, temperature, wind speed, sky cover, precipitation, and humidity are considered. In the SVM technique, three different kernel functions, namely a linear kernel, a polynomial kernel, and a Radial Basis Function (RBF) kernel, are used for model building. The RBF kernel is found to be the best. A comparison of various models for solar forecast and a brief analysis are presented in Table 5.
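The three kernel functions compared in [156] have standard textbook forms, sketched below. The default values of `degree`, `coef0`, and `gamma` are illustrative assumptions and are not the settings used in [156].

```python
import math


def linear_kernel(x, z):
    """Inner product of the two input vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """Polynomial kernel (x . z + coef0) ** degree."""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

The RBF kernel depends only on the distance between samples, which lets it model local non-linear relationships between weather variables and solar output; this is consistent with, though not proof of, its better performance in [156].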

IX. CONCLUSION
An exhaustive review of the short-term predictive models for solar and wind power is carried out in this paper. The models can broadly be classified into two categories: statistical models and machine learning models. The most commonly proposed statistical models are auto-regressive models, decomposition models, classification and regression tree models, spatio-temporal models, Gaussian models, and combination probability models. It is observed from the results of basic time series models like ARIMA and ARMA that more sophisticated predictive models are required to improve the accuracy of forecasting. The models should be capable of capturing abrupt changes in the solar irradiance and wind speed data. Thus, decomposition models, spatio-temporal models, etc. give better results. It is also observed that even though these models gave reasonable accuracy in the solar forecast, they did not effectively capture the outliers in the wind data. As wind data is too erratic to be modeled by a time series model, non-linear models such as machine learning algorithms would be more effective. Despite the intermittent nature of wind and solar energy resources, they have been used for energy production on a larger scale. The study of various machine learning models is presented in the paper. The comparative analysis of the models highlights some of their noteworthy characteristics. Wind power forecast was performed daily with certain data sets, and the requisite samples were also considered for proper forecasting. These models were applied to different locations. The combination models, deep learning models, principal component analysis (PCA) based models, and the Random Forest (RF) algorithm have performed better in terms of prediction accuracy and stability. Also, the LASSO-based model used for wind power forecasting exhibited poor performance due to its linear behavior.
Support Vector Regression (SVR) algorithm could provide better prediction accuracy if its standard deviation is not considered in the data set. Also, VMD based technique can decompose the sequence of raw wind speed data and can generate a new sequence. The clustered PSO-SVM-ARMA model provided better results and it can ensure safe operation of power system even during large scale integration of wind power. With solar power forecasting models, EMS, LASSO, HDC-blended, and MLP techniques outperformed the other models. These models have exhibited better prediction accuracy. Also, the LASSO model in solar power forecast requires less training and can handle irregularities in data points.
One improvement that can be suggested to enhance the prediction accuracy of wind speed forecasting, and also to predict the in-sample weights, is to investigate other data pre-processing methods. Most of the ensemble-based models have not considered spatio-temporal information; this information can be considered in designing the learning models to enhance prediction accuracy. Optimal methods can also be explored individually for each of the components of the time series data, with the predicted values of the components then combined to obtain the forecast.