Advanced Ensemble Model for Solar Radiation Forecasting Using Sine Cosine Algorithm and Newton’s Laws

As research into alternative energy sources grows, solar radiation is increasingly catching the eye of the research community. Since solar energy generation depends on uncontrollable natural variables, this energy source cannot be relied upon without proper forecasting. Machine learning algorithms are among the best choices for this forecasting task. This paper proposes an optimized solar radiation forecasting ensemble model consisting of a pre-processing phase and a training ensemble phase. The training ensemble phase relies on an advanced sine cosine algorithm (ASCA) that uses Newton's laws of gravity and motion for objects (agents). ASCA uses sine and cosine functions to update each agent's position/velocity components by considering its mass. The training ensemble model is then developed using k-nearest neighbors (KNN) regression. The performance of the proposed ensemble model is measured using a dataset from Kaggle (Solar Radiation Prediction, Task from NASA Hackathon). The proposed ASCA algorithm is evaluated in comparison with the Particle Swarm Optimizer (PSO), Whale Optimization Algorithm (WOA), Genetic Algorithm (GA), Grey Wolf Optimizer (GWO), Squirrel Search Algorithm (SSA), Harris Hawks Optimization (HHO), Hybrid Greedy Sine Cosine Algorithm with Differential Evolution (HGSCADE), Hybrid Modified Sine Cosine Algorithm with Cuckoo Search Algorithm (HMSCACSA), Marine Predators Algorithm (MPA), Chimp Optimization Algorithm (ChOA), and Slime Mould Algorithm (SMA). The results of the proposed ensemble model are compared with those of state-of-the-art models, and the significant superiority of the proposed ensemble model is confirmed using statistical analyses such as ANOVA and Wilcoxon's rank-sum tests.


I. INTRODUCTION
During the last few decades, the increase in demand for energy resources has led to the search for new means of generating energy. Solar energy produced through solar radiation is one of the natural methods currently in use at both domestic and commercial levels [1]. Since solar energy production depends on uncontrollable environmental variables, the production amount cannot be accurately planned. Inconsistent and unpredictable solar energy generation can cause catastrophic results and hence reduces the degree to which solar energy can be depended upon.
(The associate editor coordinating the review of this manuscript and approving it for publication was Dwarkadas Pralhaddas Kothari.)
The use of machine learning (ML) algorithms applied to historical solar radiation datasets can forecast solar radiation from five to ten minutes ahead [2], [3] and, in some cases, up to twenty-four hours ahead. Belmahdi et al. used ARMA and ARIMA models for global solar radiation forecasting; according to Belmahdi, this methodology can forecast solar radiation up to one, two, or three months ahead [4]. Nearly accurate forecasting is needed for a stable and consistent supply of solar energy [5]. Daily meteorological data can be collected through radiometric stations at various locations, and this dataset can then be employed in forecasting solar radiation [6]. To obtain reliable forecasting results, researchers have developed various algorithms and their extensions [7]. Many authors recommend probabilistic forecasting for better risk management. ProbCast, a framework introduced by Browell and Gilbert, produces probabilistic forecasts by combining predictive models with visualization and evaluation of the forecasting results [8]. ML algorithms use atmospheric variables such as wind, temperature, latitude, and atmospheric pressure for nearly accurate solar radiation forecasting. Data on all these atmospheric variables should be regularly collected, stored, and analyzed to obtain reliable forecasts.
Genetic Algorithm (GA) and Neural Network (NN) modeling approaches have been used by various researchers for solar radiation forecasting [9]. In that research, the average atmospheric pressure and other weather-related data for the previous day, predicted by one NN, are provided as input parameters to another NN. The literature indicates that the NN model provides more accurate forecasting of solar radiation, while GA is more suitable for survival-of-the-fittest scenarios [10]. A physical model describes the physical state and dynamic motion of the atmosphere through mathematical equations [11]. GA-based algorithms are not suitable for these types of physical models [12], and the NN methodology requires large amounts of input data, which sometimes include non-relevant parameters [13].
Comparative studies of various solar radiation forecasting models based on NNs and other ML techniques have been performed [14]. Narvaez et al. proposed a methodology that works in two steps: the first step selects the best data source to obtain better spatio-temporal resolution, and the second step applies deep learning to forecast solar radiation [15]. Geographical and meteorological variables of a specific location are the key parameters considered for solar radiation forecasting [16]-[18]. Al-Hajj et al. proposed a predictive model based on Dynamic Recurrent Neural Networks (DRNN) with short-term delay units to forecast the daily intensity of solar radiation [19]. The model provided better results in terms of Root Mean Square Error (RMSE) and Mean Bias Error (MBE). Another challenge in collecting global solar radiation data is dealing with typical weather conditions, including rainfall, wind, fog, snow, thunder, humidity, and sunshine. Proper installation of solar radiation measuring sensors (pyranometers) is required for such data collection; these sensors can be costly, and many countries do not have sufficient network resources to obtain this data [20]-[22]. In these situations, it is preferable to develop empirical models that can utilize the meteorological data measured by nearby stations [23], [24].
The Sine Cosine Algorithm (SCA) [25] has high exploitation compared to other meta-heuristics since it uses a single best solution to guide the other candidate solutions. This makes SCA an efficient algorithm in terms of memory usage and convergence speed and has led to its application in various recent studies [26], [27]. Based on Newton's law of gravity and Newton's law of motion, the Gravitational Search Algorithm (GSA) was proposed [28]. Position, inertial mass, and active and passive gravitational masses are the properties of each agent (object mass) [29]. The solution of a problem is represented through these properties and is determined by a fitness function. According to the No Free Lunch (NFL) theorem [30], two algorithms can be considered equivalent when their performance is averaged across all possible problems [31]-[33]. No single ML algorithm can perform forecasting in all possible situations and scenarios; therefore, different ML algorithms are needed to cover all forecasting scenarios [34]. Many ML algorithms have already been developed and employed [35]-[41].
Our contribution in this paper is to develop, analyze, and compare an optimized ensemble model based on a pre-processing phase and a training ensemble model. This Advanced Sine Cosine Algorithm (ASCA) based model is inspired by Newton's laws of gravity and motion. Sine and cosine functions are used to update each agent's position/velocity component based on its mass. The training ensemble model depends on k-nearest neighbors (KNN) regression. A dataset from Kaggle (Solar Radiation Prediction, Task from NASA Hackathon) is used for the experiments. The proposed ASCA algorithm is evaluated in comparison with the Particle Swarm Optimizer (PSO) [42], [43], Whale Optimization Algorithm (WOA) [44], [45], Genetic Algorithm (GA) [46], Grey Wolf Optimizer (GWO) [47], Squirrel Search Algorithm (SSA) [48], Harris Hawks Optimization (HHO) [32], [49], Hybrid Greedy Sine Cosine Algorithm with Differential Evolution (HGSCADE) [50], Hybrid Modified Sine Cosine Algorithm with Cuckoo Search Algorithm (HMSCACSA) [51], Marine Predators Algorithm (MPA) [52], Chimp Optimization Algorithm (ChOA) [53], and Slime Mould Algorithm (SMA) [54]. The major contributions of our work are as follows:
• The ASCA algorithm, based on sine cosine optimization, is developed to optimize the ensemble weights.
• The ASCA based model is inspired by Newton's laws of gravity and motion.
• The ensemble model depends on the k-nearest neighbors (KNN) regression.
• The model is optimized for nearly accurate forecasting of solar radiations.
• The model is applied to an authentic dataset related to solar radiations.
• Results are obtained by conducting experiments.
• Results are compared with those of other available models.
• The performance of the proposed ensemble model is confirmed using one-way analysis of variance (ANOVA) and Wilcoxon's rank-sum tests.
The paper is organized as follows: Section II explains the background and basic methods, Section III details the proposed model, and the experiments and results are presented in Section IV. Section V discusses the findings of the paper, and finally, the conclusion and future work are presented in Section VI.

II. BACKGROUND
Most of the ML algorithms work on the principles of predicting upcoming results based on historical data. This section will discuss the dataset that we used, the sine cosine algorithm, regression methods, and ensemble learning techniques.

A. DATASET
This work uses meteorological data from HI-SEAS (Hawai'i Space Exploration Analog and Simulation), a dataset from Kaggle (Solar Radiation Prediction, Task from NASA Hackathon). It comprises weather-station measurements for the four months (September through December 2016) between Mission IV and Mission V [55], [56]. A statistical analysis of the different weather parameters of the dataset is shown in Table 1. The dataset contains different meteorological parameters such as radiation, temperature, pressure, etc.; Table 1 also presents a statistical summary of these attributes.

B. SINE COSINE ALGORITHM
Mirjalili proposed the Sine Cosine Algorithm (SCA) for optimization problems in 2016 [25]. Oscillating sine and cosine functions are used to update the position of a candidate solution. The SCA algorithm uses a set of random variables to indicate the direction and distance of the movement, to emphasize or de-emphasize the destination's effect, and to switch between the sine and cosine components [57]. SCA's mechanism for updating the position of a solution is represented by Eq. 1.

$$X_i^{t+1} = \begin{cases} X_i^t + r_1 \sin(r_2) \left| r_3 P_i^t - X_i^t \right|, & r_4 < 0.5 \\ X_i^t + r_1 \cos(r_2) \left| r_3 P_i^t - X_i^t \right|, & r_4 \geq 0.5 \end{cases} \quad (1)$$

where $X_i^t$ is the current solution position in the $i$th dimension and $P_i^t$ is the best solution position in the $i$th dimension. The parameters $r_2$, $r_3$, and $r_4$ are random values in [0, 1]. It can be concluded from Eq. 1 that an agent's position changes based on the best solution position. In the SCA algorithm, to balance the exploration and exploitation processes, the parameter $r_1$ is updated as follows.

$$r_1 = a - a \frac{t}{t_{max}} \quad (2)$$

where $t$ indicates the current iteration number, $a$ is a constant, and $t_{max}$ represents the total number of iterations.

Algorithm 1 Pseudo-Code of SCA
1: Initialize the SCA population X_i, i = 1, . . . , n, population size n, total iterations t_max, and objective function F_n
2: Initialize r_2, r_3, r_4
3: Set t = 1
4: while (t ≤ t_max) do
5: Calculate the objective function F_n for each agent
6: Set P = best agent position
7: Update r_1 by r_1 = a − (a × t)/t_max
8: for (i = 1 : i < n + 1) do
9: if (r_4 < 0.5) then
10: Update agent position by the sine component of Eq. 1
11: else
12: Update agent position by the cosine component of Eq. 1
13: end if
14: end for
15: Set t = t + 1
16: end while
17: Return best agent P

In Algorithm (1), the position of the SCA population, with n agents, is randomly initialized.
Step (5) of Algorithm (1) calculates the objective function for every agent to find the best solution position. The best solution is stored as P in step (6). In step (7), r_1 is updated using Eq. 2. Steps (8-13) update the positions of the agents using Eq. 1. The algorithm continues until the predefined criteria are met, and the best solution P is refined by exploring and exploiting the surrounding space.
Since SCA uses a single best solution to guide the other candidate solutions, it has high exploitation compared to other meta-heuristics, which makes it efficient in terms of memory usage and convergence speed. However, the performance of SCA degrades in scenarios where a large number of locally optimal solutions exist. These scenarios motivated us to work on this algorithm to alleviate this drawback.
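To make the update rule concrete, the following is a minimal, illustrative Python sketch of SCA as described above. The function name, parameter defaults, random ranges, and bound-clamping are our own assumptions, not the paper's implementation:

```python
import math
import random

def sca_minimize(f, dim, n_agents=20, t_max=200, lo=-10.0, hi=10.0, a=2.0, seed=1):
    """Minimal SCA sketch: a single best position P guides every agent (Eqs. 1-2)."""
    rng = random.Random(seed)
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    P = min(X, key=f)[:]                     # best position found so far
    for t in range(1, t_max + 1):
        r1 = a - a * t / t_max               # Eq. 2: shrinks linearly from a to 0
        for i in range(n_agents):
            for d in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)  # Eq. 1 switch
                X[i][d] += r1 * trig * abs(r3 * P[d] - X[i][d])
                X[i][d] = min(max(X[i][d], lo), hi)
            if f(X[i]) < f(P):               # keep the best position seen so far
                P = X[i][:]
    return P
```

On a simple convex test function such as the sphere function, this sketch converges toward the origin as `r1` decays.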

C. GRAVITATIONAL SEARCH ALGORITHM
Based on Newton's law of gravity and Newton's law of motion, the Gravitational Search Algorithm (GSA) was proposed [28]. Position, inertial, active, and passive gravitational masses are the properties of each agent (object mass). The solution of a problem is represented through these properties and is determined by a fitness function.
Mathematically, the position of the i-th agent can be defined as follows.

$$X_i = (x_i^1, \ldots, x_i^d, \ldots, x_i^n), \quad i = 1, 2, \ldots, N \quad (3)$$

where $x_i^d$ indicates the position of the $i$th agent in the $d$th dimension and $N$ is the number of agents (masses).
The agent position $x_i^d$ is updated by the following equation

$$x_i^d(t+1) = x_i^d(t) + v_i^d(t+1) \quad (4)$$

where the agent velocity $v_i^d$ is updated by

$$v_i^d(t+1) = rand_i \times v_i^d(t) + a_i^d(t) \quad (5)$$

where $rand_i$ indicates a random number in [0, 1]. The acceleration of agent $i$ at time $t$ in the $d$th direction, $a_i^d(t)$, is calculated by

$$a_i^d(t) = \frac{F_i^d(t)}{M_{ii}(t)} \quad (6)$$

where $M_{ii}(t)$ represents the inertial mass of the $i$th agent. $F_i^d(t)$ is the total gravitational force that acts on agent $i$ in dimension $d$ and can be represented by the following equation

$$F_i^d(t) = \sum_{j=1, j \neq i}^{N} rand_j \, F_{ij}^d(t) \quad (7)$$

where $rand_j$ indicates a random number in [0, 1]. The force $F_{ij}^d(t)$ acting on mass $i$ from mass $j$ is calculated by the following equation.

$$F_{ij}^d(t) = G(t) \, \frac{M_{pi}(t) \times M_{aj}(t)}{\left\| X_i(t), X_j(t) \right\|_2 + \varepsilon} \left( x_j^d(t) - x_i^d(t) \right) \quad (8)$$

where $M_{aj}$ indicates the active gravitational mass of agent $j$, $M_{pi}$ indicates the passive gravitational mass of agent $i$, $G(t)$ is the gravitational constant at time $t$, $\varepsilon$ is a small constant, and $\left\| X_i(t), X_j(t) \right\|_2$ represents the Euclidean distance between agents $i$ and $j$.
Since all candidate solutions are used to update the position of each solution, the GSA algorithm has the advantage of very high exploratory behavior. In every iteration, all solutions can influence each other based on their distances and quality. However, the accuracy rate of this algorithm is often not very high.
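A small Python sketch of the gravitational step may help. The mass normalization below (better fitness yields a larger normalized mass, for minimization) is one common GSA formulation, and the helper name is illustrative rather than the paper's implementation:

```python
import math
import random

def gsa_accelerations(X, fit, G, rng):
    """GSA acceleration sketch: every agent attracts every other agent,
    with force scaled by normalized masses and divided by distance.
    Note the inertial mass of agent i cancels when computing a = F / M_i."""
    N, dim = len(X), len(X[0])
    best, worst = min(fit), max(fit)
    m = [(worst - f) / (worst - best + 1e-12) for f in fit]   # better fit -> larger mass
    M = [mi / (sum(m) + 1e-12) for mi in m]                   # normalize masses
    acc = [[0.0] * dim for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            dist = math.dist(X[i], X[j])                      # Euclidean distance
            for d in range(dim):
                # rand_j * G * M_j * (x_j - x_i) / (R + eps)
                acc[i][d] += rng.random() * G * M[j] * (X[j][d] - X[i][d]) / (dist + 1e-9)
    return acc
```

With two agents, the worse agent is pulled toward the better one, while the worst agent's mass (and hence its pull on others) is zero.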

D. REGRESSION METHODS
1) MULTILAYER PERCEPTRON (MLP)
Artificial Neural Networks (ANNs) follow the principles of the biological nervous system for information processing and communication among distributed nodes. The synapse (the connection between neurons) is used to transmit signals from one neuron to other neurons. Speech recognition, regression, and other machine learning tasks are the most common areas of application of ANNs [44], [58]. The learning process and the optimization of parameters have a major impact on the performance of an ANN. One of the most commonly applied ANNs is the MLP. The MLP structure is shown in Figure 1.
The weighted sum entering hidden-layer neuron $j$ is

$$S_j = \sum_{i} w_{ij} I_i + \beta_j \quad (9)$$

where the input variable $i$ is represented as $I_i$, the connection weight between the $I_i$ variable and hidden-layer neuron $j$ is indicated as $w_{ij}$, and $\beta_j$ represents the hidden-layer bias value. By applying the most commonly recommended activation function, the sigmoid, the output of node $j$ can be calculated as follows.

$$f_j(S_j) = \frac{1}{1 + e^{-S_j}} \quad (10)$$

The following equation defines the output of the network based on the $f_j(S_j)$ values, from Eq. 10, for all neurons in the hidden layer.

$$y_k = \sum_{j} w_{jk} f_j(S_j) + \beta_k \quad (11)$$

where the weight between hidden-layer neuron $j$ and output node $k$ is indicated as $w_{jk}$, and the $\beta_k$ parameter represents the output-layer bias value.
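The forward pass of Eqs. 9-11 can be sketched in a few lines of Python; the function and argument names are illustrative:

```python
import math

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer MLP forward pass sketch following Eqs. 9-11."""
    hidden = []
    for j, bias in enumerate(b_hidden):
        s_j = sum(w * x for w, x in zip(w_hidden[j], inputs)) + bias   # Eq. 9
        hidden.append(1.0 / (1.0 + math.exp(-s_j)))                    # Eq. 10: sigmoid
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out           # Eq. 11
```

For zero inputs and zero biases, each hidden neuron outputs sigmoid(0) = 0.5, so the network output is simply the sum of half the output weights.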

2) LONG SHORT TERM MEMORY (LSTM)
LSTM is an improvement over the standard ANN; it is a kind of Recurrent Neural Network (RNN) that can be applied to many problems [59]. The main feature of LSTM is its ability to remember information over long periods, which makes it suitable for problems where long-term dependencies must be captured. The basic architecture of LSTM is shown in Figure 2. The first step in LSTM is to decide what information should be discarded from the cell state. A sigmoid layer, named the forget gate layer, is used for this, as shown in Eq. 12.

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (12)$$

The next step decides what new information should be stored in the cell state. A sigmoid layer, named the input gate layer, decides which values to update, and a tanh layer generates a vector of new candidate values that can be added to the state, as shown in Eq. 13 and Eq. 14.

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \quad (13)$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \quad (14)$$

Using Eqs. 12, 13, and 14, the old cell state $C_{t-1}$ is updated into the new cell state $C_t$ by the following equation.

$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t \quad (15)$$

The final step is the output decision based on the cell state. A sigmoid layer decides which parts of the cell state are passed to the output; the cell state is then pushed through tanh (forcing values into [-1, 1]) and multiplied by the output of the sigmoid gate, as in Eq. 16.

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \times \tanh(C_t) \quad (16)$$
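The gate equations above can be sketched for a single scalar cell as follows; the parameter-dictionary layout (a per-gate triple of input weight, recurrent weight, bias) is our own illustrative choice:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, p):
    """Scalar LSTM cell sketch following Eqs. 12-16.
    p maps gate name -> (input weight, recurrent weight, bias)."""
    f = sigmoid(p["f"][0] * x_t + p["f"][1] * h_prev + p["f"][2])       # forget gate, Eq. 12
    i = sigmoid(p["i"][0] * x_t + p["i"][1] * h_prev + p["i"][2])       # input gate, Eq. 13
    c_tilde = math.tanh(p["c"][0] * x_t + p["c"][1] * h_prev + p["c"][2])  # candidate, Eq. 14
    c_t = f * c_prev + i * c_tilde                                      # Eq. 15: blend states
    o = sigmoid(p["o"][0] * x_t + p["o"][1] * h_prev + p["o"][2])       # output gate
    h_t = o * math.tanh(c_t)                                            # Eq. 16
    return h_t, c_t
```

With all weights and biases zero, every gate outputs 0.5 and the candidate is 0, so the cell halves its previous state each step, showing the forget gate's effect in isolation.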
3) SUPPORT VECTOR MACHINE
Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms. They are used for classification and regression analysis of data. In the context of two-dimensional space, the hyperplane is a line that divides the plane into two subsets [44].

E. ENSEMBLE LEARNING
Ensemble techniques are becoming popular for solving forecasting problems, especially in climate forecasting.

1) AVERAGE ENSEMBLE
The Average Ensemble is one of the simplest ensemble techniques: it combines the outputs of the base regressors and calculates their mean. In this work, the technique aggregates the outputs of LSTM, SVM, and NN and calculates the mean value, as shown in Fig. 3. The average ensemble is used as a reference ensemble model against which the performance of the proposed ensemble model is evaluated.
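A minimal sketch of this averaging step, assuming the base-model predictions (e.g., from LSTM, SVM, and NN) are already available as lists of per-sample outputs:

```python
def average_ensemble(predictions):
    """Average ensemble sketch: per-sample mean of the base regressors' outputs.
    `predictions` is a list of prediction lists, one per base model."""
    return [sum(sample) / len(sample) for sample in zip(*predictions)]
```

`zip(*predictions)` groups the base models' outputs sample by sample before taking the mean.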

2) KNN ENSEMBLE
KNN is one of the simplest and oldest regressors and is commonly used to decide the regression value of an unknown instance [60]. KNN uses the training instances directly when evaluating test data: the target of the KNN algorithm is to discover the k objects in the training data that are closest to the testing data, and these objects are then used for forecasting. The KNN strategy is more effective in scenarios where the size of the training data is small. Estimating a proper value of k is one of the major shortcomings of KNN.
Regression approaches are usually employed to predict the output variables. The regression approach divides the dataset into two groups: training data and testing data. The testing data is compared against the training data using the Euclidean distance: the distances between training and testing data are calculated, and heuristics are used to select the optimal k nearest neighbors. Voting is performed for labels; then, with the k nearest multivariate neighbors, an inverse distance weighted average is calculated.
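The procedure above can be sketched as a small KNN regressor with inverse-distance weighting; the `1e-9` smoothing term is our own guard against division by zero when a query coincides with a training point:

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """KNN regression sketch: take the k nearest training points by Euclidean
    distance, then return their inverse-distance weighted average target."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = dists[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)
```

A query that exactly matches a training point receives an enormous weight for that point, so the prediction collapses onto its target value.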

III. PROPOSED MODEL
One of the biggest issues in regression problems is the use of imprecise data containing many non-relevant features, which ultimately increases the error rate of the algorithm. To overcome this problem, we propose a machine learning ensemble model that consists of two phases, a data pre-processing phase and a training ensemble phase, as shown in Figure 3.

A. DATA PREPROCESSING
Feature analysis is often recommended before regression analysis so that only the most relevant features are considered. From the dataset, we therefore removed the rows containing null values or missing information in order to avoid misleading results. The ranges and units of the different attributes vary widely, and since we use the Euclidean distance between data points, this can strongly affect our algorithm. Therefore, we use a Min-Max scaler to bring all values between 0 and 1, as shown in Eq. 17.

$$\hat{I}_i = \frac{I_i - I_{min}}{I_{max} - I_{min}} \quad (17)$$

where $\hat{I}_i$ is the scaled value.
To measure the correlation between solar radiation and the other attributes, Pearson's correlation is applied, as in Eq. 18. It helped us to identify strongly correlated and weakly correlated attributes. The correlation coefficient, $\mu$, can be defined as

$$\mu_{i,j} = \frac{E\left[(i - \mu_i)(j - \mu_j)\right]}{SD_i \times SD_j} \quad (18)$$

where $SD_i$ is the standard deviation of variable $i$, $SD_j$ is the standard deviation of variable $j$, and $E$ denotes the expected value.
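Both pre-processing steps can be sketched in Python; the helper names are illustrative, with `min_max_scale` corresponding to Eq. 17 and `pearson` to Eq. 18:

```python
import statistics

def min_max_scale(values):
    """Min-Max scaling (Eq. 17): map each value into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient (Eq. 18): covariance divided by the
    product of the standard deviations (population statistics)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))
```

Perfectly linearly related attributes give a coefficient of +1 (or -1 for an inverse relation), which is the basis for dropping weakly correlated attributes later.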

B. ADVANCED SINE COSINE ALGORITHM
The proposed Advanced Sine Cosine Algorithm (ASCA) is inspired by Newton's law of gravity and Newton's law of motion for objects. ASCA uses sine and cosine functions to update the agents' positions for the set of masses, as shown in Figures 4 and 5. Based on the agent's mass, ASCA switches between the sine/cosine components and the position/velocity components. As discussed earlier, SCA's performance degrades when many locally optimal solutions exist, and GSA often suffers from a low accuracy rate; therefore, we propose a hybrid solution that combines the strengths of both algorithms.
Steps of ASCA are described in Algorithm (2). Mathematically, Eq. 19 is used by ASCA to update the agent's position when the random parameter rand_SC, which takes values in [0, 1], satisfies rand_SC > 0.5.

$$x_i^d(t+1) = \begin{cases} x_i^d(t) + r_1 \sin(r_2) \left| r_3 P_i^d(t) - x_i^d(t) \right|, & r_4 < 0.5 \\ x_i^d(t) + r_1 \cos(r_2) \left| r_3 P_i^d(t) - x_i^d(t) \right|, & r_4 \geq 0.5 \end{cases} \quad (19)$$

where $x_i^d(t)$ is the current solution position in the $d$th dimension and $P_i^d(t)$ is the best solution position in the $d$th dimension. The parameters $r_2$, $r_3$, and $r_4$ are random values in [0, 1]. As shown in Eq. 19, the agent's position is updated based on the best agent position if rand_SC > 0.5. In the proposed ASCA algorithm, to balance between the exploitation and exploration processes, the $r_1$ parameter is updated as follows.

$$r_1 = a \left( 1 - \frac{t}{t_{max}} \right) \quad (20)$$

where $t$ is the current iteration number, $a$ indicates a constant, and $t_{max}$ is the total number of iterations.
For rand_SC ≤ 0.5, the agent position $x_i^d$ is updated in the ASCA algorithm by the following equation

$$x_i^d(t+1) = x_i^d(t) + v_i^d(t+1) \quad (21)$$

where the agent velocity $v_i^d$ is updated, as shown in Figure 5, as follows.

$$v_i^d(t+1) = rand_i \times v_i^d(t) + a_i^d(t) \quad (22)$$

where $rand_i$ represents a random number in [0, 1]. The parameter $a_i^d(t)$ represents the acceleration of agent $i$ at time $t$ and is calculated from the gravitational force and the inertial mass as follows.

$$a_i^d(t) = \frac{F_i^d(t)}{M_{ii}(t)} \quad (23)$$
The computational complexity of the proposed ASCA algorithm is as follows: for a population of size N and t_max iterations, the time complexity can be defined as O(t_max × N).
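One ASCA iteration can be sketched as follows, assuming the gravitational-branch accelerations have already been computed (for example by a GSA-style force computation). The branch draw per agent and the helper names are our own illustrative choices, not the paper's implementation:

```python
import math
import random

def asca_update(X, V, P, acc, t, t_max, a=2.0, rng=None):
    """One ASCA position update sketch: per agent, a random draw (rand_SC)
    chooses between the sine/cosine move toward the best position P (Eq. 19)
    and the velocity/acceleration move of the gravitational branch (Eqs. 21-22)."""
    rng = rng or random.Random(0)
    r1 = a * (1.0 - t / t_max)                        # Eq. 20
    for i in range(len(X)):
        if rng.random() > 0.5:                        # sine/cosine branch
            for d in range(len(X[i])):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3, r4 = rng.uniform(0.0, 2.0), rng.random()
                trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                X[i][d] += r1 * trig * abs(r3 * P[d] - X[i][d])
        else:                                         # gravitational branch
            for d in range(len(X[i])):
                V[i][d] = rng.random() * V[i][d] + acc[i][d]
                X[i][d] += V[i][d]
    return X, V
```

An agent already sitting at the best position with zero velocity and zero acceleration stays put under either branch, which is the expected fixed point of the update.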

C. TRAINING ENSEMBLE MODEL
The ensemble model, instead of picking the single best model from the candidates, combines all the models by assigning a weight to each model. The ensemble technique has proved to be one of the significant methods for improving the predictive capacity of standard models [61]. An ensemble model normally has two phases: in the first phase, the output variable of the best ensemble member is selected in order to obtain the final prediction; the second phase mixes the output variables of the ensemble members using a combination algorithm [62].

Algorithm 2 Pseudo-Code of the Proposed ASCA
1: Initialize the ASCA population X_i = x_i^1, . . . , x_i^d, . . . , x_i^n, population size n, total iterations t_max, and objective function F_n
2: Initialize r_2, r_3, r_4, rand_i, rand_j, rand_SC, t = 1
3: while t ≤ t_max do
4: Calculate the objective function F_n for each agent x_i^d
5: Set P = best agent position
6: Update r_1 by r_1 = a(1 − t/t_max)
7: if (rand_SC > 0.5) then
8: for (i = 1 : i < N + 1) do
9: if (r_4 < 0.5) then
10: Update agent position by the sine component of Eq. 19
11: else
12: Update agent position by the cosine component of Eq. 19
13: end if
14: end for
15: else
16: for (i = 1 : i < N + 1) do
17: Update agent acceleration
18: Update agent velocity
19: Update agent position
20: end for
21: end if
22: Set t = t + 1
23: end while
24: Return best agent P

IV. EXPERIMENTAL RESULTS
Three different scenarios are considered for the experiments in order to validate the proposed solution. In the first scenario, the correlation between the input attributes and the solar radiation is analyzed. In the second scenario, the base models, LSTM, SVM, and NN, are analyzed. The third scenario analyzes the ensemble models, using the average ensemble, the KNN ensemble, and the proposed optimized ensemble weights model. Meteorological data from the HI-SEAS dataset (Solar Radiation Prediction) [55], [56] is randomly divided into two parts, where 80% of the data is used for training and 20% for testing. The dataset parameters include radiation, temperature, pressure, and related attributes.

A. PERFORMANCE METRICS
The performance metrics used for the experiments are the RMSE, the Mean Absolute Error (MAE), and the MBE [63]. The RMSE metric is calculated as follows in order to assess the performance:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( H_{p,i} - H_i \right)^2}$$

where $H_{p,i}$ represents a predicted value and $H_i$ indicates the actual measured value. The parameter $n$ represents the total number of values. The MAE calculates the average magnitude of the errors in a set of predictions:

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| H_{p,i} - H_i \right|$$

The MBE can show whether the tested model is under-predicting or over-predicting. It measures the mean bias of the prediction based on the average signed differences between the predicted and measured values:

$$MBE = \frac{1}{n} \sum_{i=1}^{n} \left( H_{p,i} - H_i \right)$$
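The three metrics can be sketched directly from their definitions:

```python
import math

def rmse(pred, actual):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

def mae(pred, actual):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def mbe(pred, actual):
    """Mean bias error: negative values indicate under-prediction on average."""
    return sum(p - a for p, a in zip(pred, actual)) / len(pred)
```

Note that MBE can be small even when individual errors are large, since positive and negative errors cancel; that is why it is reported alongside RMSE and MAE.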

B. FIRST SCENARIO: CORRELATION ANALYSIS
The correlation between the input attributes and the solar radiation is presented in Table 2. The regression prediction is shown in Fig. 6, and the correlation is shown in Fig. 7. It can be observed from Table 2 that wind direction is weakly correlated with solar radiation; therefore, this parameter is excluded from the experiments. On the other hand, temperature has a higher correlation with solar radiation than humidity, wind speed, and sun hours. In short, weakly correlated attributes should not be considered, as they can adversely affect the accuracy of ensemble models, including our proposed ensemble model.

C. SECOND SCENARIO: BASE MODELS
The second scenario is designed for testing the performance of base models including LSTM, SVM, and NN without involving ensemble techniques. The results of base models using the RMSE, MAE and MBE performance metrics are shown in Table 3. From the results mentioned in Table 3, it can be concluded that the LSTM model, with RMSE of 0.04041579, MAE of 0.03240178, and MBE of -0.00840223, has promising values among the tested base models. However, these results can be improved using the ensemble models.

D. THIRD SCENARIO: ENSEMBLE MODELS
This scenario considers different ensemble models: the average ensemble, the KNN ensemble, and the proposed optimized ensemble weights model. The ensemble models in this experiment utilize the training instances, instead of building new models, to combine/average the results of the three base models (LSTM, SVM, and NN). This assigns unknown observations to the regression of the majority and yields the solar radiation predictions. Figure 8 shows the optimized weights of the proposed model. The results of the different ensemble models are shown in Table 4. From Table 4, it can be seen that the three ensemble models show promising results compared to the base models tested in the second scenario. The proposed optimized ensemble weights model, based on the advanced sine cosine algorithm, with an RMSE of 0.00175482, an MAE of 0.00161235, and an MBE of -0.00036521, gives competitive results compared to the average ensemble and KNN ensemble models. Figure 9 shows the Receiver Operating Characteristic (ROC) curves of the proposed optimized ensemble weights model versus the average ensemble and KNN ensemble models. The figures show that the proposed model, based on the proposed ASCA algorithm, is able to distinguish data with a high Area Under the Curve (AUC) value of 0.9875. The ANOVA test is applied to measure the statistical differences between the proposed model and the other models used for comparison. The hypothesis testing is formulated using two hypotheses: the null hypothesis (H_0: the means are all equal) and the alternate hypothesis (H_1: the means are not all equal). Figure 10 shows the ANOVA test results for the proposed and other models based on the objective function, and the detailed ANOVA results are shown in Table 5. The alternate hypothesis H_1 is preferred based on the results of the test, but another test is needed to decide the best algorithm. Wilcoxon's rank-sum statistical analysis of the proposed ensemble model in comparison to the other models is shown in Table 6.
Wilcoxon's rank-sum test determines whether the results of the proposed model and the other models differ significantly; a p-value < 0.05 indicates significant superiority. The hypothesis testing is formulated using two hypotheses: the null hypothesis (H_0: µ_Proposed model = µ_KNN Ensemble, µ_Proposed model = µ_Average Ensemble, µ_Proposed model = µ_SVM, µ_Proposed model = µ_NN, and µ_Proposed model = µ_LSTM) and the alternate hypothesis (H_1: the means are not all equal). The results in Table 6 show that the p-values achieved between the proposed model and the other models are less than 0.05, which demonstrates the superiority of the ASCA based proposed model.

E. COMPARISON WITH OTHER OPTIMIZATION ALGORITHMS
In this section, the proposed ASCA algorithm is evaluated in comparison with the PSO [42], [43], WOA [44], [45], GA [46], GWO [47], SSA [48], HHO [32], [49], HGSCADE [50], HMSCACSA [51], MPA [52], ChOA [53], and SMA [54] algorithms. For a fair comparison, the proposed ASCA algorithm and the compared algorithms start the experiment with the same number of agents (population size) and are applied to the same objective function using the same number of iterations, dimensions, and boundaries. Table 7 shows the classification accuracy and the descriptive statistics of the proposed ASCA algorithm compared to the other algorithms. The table indicates that the ASCA algorithm achieved better results than the compared algorithms.
ANOVA and Wilcoxon's rank-sum tests are performed using 20 runs for a fair comparison between the ASCA algorithm and the compared algorithms. The ANOVA test is applied to measure the statistical differences between the proposed algorithm and the compared algorithms, as shown in Table 8; the null hypothesis is that the means are all equal. In addition, Wilcoxon's rank-sum statistical analysis of the proposed algorithm in comparison to the other algorithms is shown in Table 9. The hypothesis testing is formulated using two hypotheses: the null hypothesis (H_0: µ_ASCA = µ_PSO, µ_ASCA = µ_WOA, µ_ASCA = µ_GA, µ_ASCA = µ_GWO, µ_ASCA = µ_SSA, µ_ASCA = µ_HHO, µ_ASCA = µ_HGSCADE, µ_ASCA = µ_HMSCACSA, µ_ASCA = µ_MPA, µ_ASCA = µ_ChOA, µ_ASCA = µ_SMA) and the alternate hypothesis (H_1: the means are not all equal). The results show the superiority and indicate the statistical significance of the ASCA algorithm; thus, the alternate hypothesis H_1 is accepted.
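For illustration, the core of the rank-sum test (pooling both samples, averaging the ranks of ties, and summing the ranks of the first sample) can be sketched as follows; this computes the test statistic only, not the p-value, and the helper name is our own:

```python
def rank_sum_statistic(a, b):
    """Wilcoxon rank-sum sketch: pool both samples, rank them (averaging
    ranks across ties), and return the rank sum of sample `a`. A rank sum
    far from its expectation n_a * (n_a + n_b + 1) / 2 signals a difference."""
    pooled = sorted((v, src) for src, sample in ((0, a), (1, b)) for v in sample)
    values = [v for v, _ in pooled]
    ranks = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1                       # find the run of tied values
        avg = (i + 1 + j) / 2            # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    return sum(ranks[k] for k, (_, src) in enumerate(pooled) if src == 0)
```

In practice a library routine (for example, the rank-sum test in scipy.stats) would also convert this statistic into the p-value used in Tables 6 and 9.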
In Figure 11, the algorithms' performance versus the objective function is shown based on the RMSE parameter. It can be noted that the minimum, maximum, and average values of the proposed algorithm are almost the same, which indicates the stability of the proposed ASCA algorithm. The histogram of the RMSE values for the different algorithms, shown in Figure 12, confirms the stability of the ASCA algorithm. Figure 13 shows the residual, homoscedasticity, and quantile-quantile (QQ) plots and the heat map. A QQ plot contrasts two probability distributions by plotting their quantiles against each other. As the figure shows, the point distribution in the QQ plot approximately fits the line; therefore, the actual and forecasted residuals are linearly related, validating the efficiency of the recommended ASCA.

V. DISCUSSION
From the results of the different experiments conducted to evaluate the performance of the proposed solution, it can be seen that the proposed ASCA ensemble weights model outperforms the other models in terms of forecasting accuracy. The algorithms are based on atmospheric variables including temperature, pressure, humidity, etc. These variables should be collected, stored, and analyzed to obtain more accurate and reliable forecasting. The placement of radiation measuring sensors (pyranometers) is required to collect this kind of data. Since constructing such a network is resource-consuming and costly, it is not feasible to have these sensors in every place. Therefore, empirical models need to be developed using meteorological data from nearby available stations.
The proposed optimized ensemble weights model, based on the ASCA algorithm, with an RMSE of 0.00175482, an MAE of 0.00161235, and an MBE of -0.00036521, gives competitive results compared to the average ensemble and KNN ensemble models. The proposed model based on the ASCA algorithm is able to distinguish data with a high AUC value of 0.9875. The analysis based on ANOVA and Wilcoxon's rank-sum tests using 20 runs shows the superiority of the ASCA based proposed model and indicates the statistical significance of the algorithm. The classification accuracy and the descriptive statistics of the proposed ASCA algorithm and the other algorithms indicate that the proposed algorithm achieves better results than the compared algorithms, validating the efficiency of the recommended ASCA.
The proposed ASCA algorithm was also tested on a binary classification problem, but it showed slow convergence in those experiments. This can be considered a disadvantage of the ASCA algorithm for such problems, and the algorithm needs further improvement to address this limitation.

VI. CONCLUSION AND FUTURE DIRECTION
This paper forecasts solar radiation using the proposed advanced sine cosine algorithm-based ensemble model. The proposed ensemble model shows superiority over the reference base models, including LSTM, NN, and SVM. The ASCA based ensemble weights model provides better results than the average ensemble and the KNN ensemble models. Several experiments were conducted, and different performance metrics were considered, leading to the conclusion that the proposed ensemble weights model is well suited for forecasting solar radiation. The proposed ASCA algorithm was evaluated in comparison with the PSO, WOA, GA, GWO, SSA, HHO, HGSCADE, HMSCACSA, MPA, ChOA, and SMA algorithms. The significant superiority of the proposed ensemble model was also confirmed using statistical analyses such as ANOVA and Wilcoxon's rank-sum tests. In future work, the proposed ASCA algorithm will be applied to other continuous problems, to binary problems with a high number of attributes for feature selection and classification, and to constrained engineering problems. Novel optimization techniques to select the best machine learning algorithms and to calculate the weight of each can also be considered in the future.

ACKNOWLEDGMENT
This work was supported by Taif University, Taif, Saudi Arabia, through Taif University Researchers Supporting Project, under Grant TURSP-2020/34.