HIFA: Promising Heterogeneous Solar Irradiance Forecasting Approach Based on Kernel Mapping

The rapid deployment of photovoltaic (PV) systems has highlighted the importance of accurate solar irradiance forecasting in grid operation. However, the intermittent nature of solar irradiance is a major challenge that degrades the accuracy of forecasting techniques, which has motivated the development of ensemble-based approaches. Most ensemble approaches generate weights based on the performance of individual forecasting models (IFMs), and linear operations are often used to aggregate them. The generalization of such weights cannot be guaranteed in practice because of the high variability among the predictions obtained by IFMs. To tackle these issues, a novel heterogeneous solar irradiance forecasting approach, called HIFA, is proposed in this article. Specifically, we propose an aggregation strategy based on kernel mapping for aggregating the predictions of accurate deep learning based IFMs. The proposed aggregation strategy maps the predictions of IFMs onto a consensus prediction. HIFA builds the IFMs with deep recurrent neural networks, which can exploit long-term information from previous computations to model the fluctuating solar irradiance. The results reveal that HIFA substantially improves the accuracy of solar irradiance forecasting when compared to ensemble-based approaches, thanks to the generalization capability of the proposed aggregation strategy and the high accuracy of deep IFMs.


I. INTRODUCTION
Worldwide, interest in clean energy has increased due to environmental and economic considerations. Solar energy is a suitable type of clean energy and is widely employed in power systems to meet the massive load demand. In this regard, photovoltaic (PV) technology can directly convert solar energy to electricity, avoiding the need for complex energy conversion systems. PV systems can be connected at diverse power system levels, such as low/medium voltage AC distribution systems, DC distribution systems, and transmission systems [1]-[4]. Specifically, PV can be employed to feed domestic, industrial, and commercial consumers due to the persistent reduction in its cost and its quiet operation. Notably, the intermittent nature of solar irradiance, which is the most dominant factor influencing PV generation, causes fluctuating power flows and voltages, which are considered a potential threat to grid security. Such characteristics call for accurate solar irradiance forecasting methods to maintain the security and optimality of grids with high PV penetration [5]-[8].
The associate editor coordinating the review of this manuscript and approving it for publication was Dipankar Deb.
The literature on solar irradiance forecasting covers both individual forecasting models (IFMs) and ensemble techniques applied to IFMs. In general, most forecasting approaches use historical measurements to construct the forecasting models. Of note, ensemble approaches can potentially improve the precision of IFMs. In [9], random forest (RF) and artificial neural network (ANN) algorithms have been used to forecast three components of solar irradiance: global, normal beam, and horizontal diffuse. The authors of [10] proposed an ensemble approach based on optimized ANNs with a median operator-based aggregation strategy for forecasting solar PV power. In [11], different IFMs, namely ANN and auto-regressive integrated moving average (ARIMA), have been coupled to build a solar irradiance forecasting model. In [12], an improved ensemble learning method for solar power forecasting has been proposed based on an RF algorithm, an adaptive residual compensation method, and the NSGA-II optimization algorithm. In [13], a solar irradiance forecasting method has been proposed based on multi-stage multi-variate decomposition, ant-colony optimization and RF algorithms. In [14], different IFMs, namely k-nearest neighbor (k-NN), ARIMA, and adaptive network-based fuzzy with a search algorithm, have been used to build a solar irradiance forecasting model.
In recent years, many studies have focused on proposing efficient forecasting approaches based on deep learning [15]-[18]. In [19], a deep convolutional neural network (CNN) and a salp swarm optimization algorithm have been combined to forecast PV power generation. In [20], a generative deep neural network has been proposed to forecast solar irradiance. Recently, two variants of deep recurrent neural networks (RNNs) have been widely used for time-series forecasting: long short-term memory (LSTM) and gated recurrent units (GRU). In [21], five LSTM-RNN models have been proposed to forecast solar power generation. These LSTM-RNN models achieve accurate forecasting results when compared to several traditional machine learning methods. In [22], the stationary wavelet transform has been used to extract features from PV power time-series, which are then fed into an LSTM to predict the PV power. In [23], a solar power forecasting method has been proposed based on CNN and LSTM without complicated preprocessing steps to eliminate outliers. Besides, the application of GRU has been introduced in [24] to forecast hourly solar irradiance in Arizona. In [25], GRU and CNN have been combined to improve the forecasting results. In [26], a k-means technique has been employed to split training sets into many groups, and then GRU has been used with each group. The authors of [27] have assessed five standalone models, including recurrent deterministic policy gradient, LSTM, Gaussian process regression, extreme gradient boosting, and support vector regression, for solar irradiance forecasting while proposing an improved ensemble method. The study of [28] has adopted a multi-task learning method to perform a multi-time scale forecast to enhance the accuracy as well as the computational efficiency.
As stated above, either individual or ensemble-based forecasting models have been employed in the literature for solar irradiance forecasting. Generally, there is no individual model that can give accurate solar irradiance forecasting results with all data, according to the no-free-lunch theorem [29]. In turn, the integration of various deep learning models to construct an ensemble model could significantly improve the results of IFMs while exploiting the merits of each model for addressing the fluctuating nature of solar irradiance. However, many factors can limit the performance of ensemble-based forecasting models, such as the accuracy of IFMs forming the ensemble, the characteristics of the output of IFMs, as well as the strategy used to construct the ensemble. Besides, most ensemble approaches in the literature assume a linear relationship between the predictions of IFMs, which is not a general case in practical scenarios. Most existing approaches apply weighted average techniques to aggregate IFMs, in which the weights can be obtained in different ways, including the use of meta-heuristic optimization techniques (e.g. [13], [30]). However, there is no guarantee about the validity of the utilized weights, and their reproducibility or applicability to unseen data. Although ensemble models can enhance the results of individual learners, they still cannot provide perfect input-output mappings for unseen data [31].
To tackle the issues mentioned above, a novel solar irradiance forecasting approach, called HIFA, is proposed in this article. Specifically, we propose an aggregation strategy based on kernel mapping for combining the predictions of IFMs. HIFA utilizes deep LSTM and GRU networks to build heterogeneous IFMs. LSTM and GRU can utilize long-term information from previous computations to model the fluctuating solar irradiance data. The use of heterogeneous IFMs could provide a full characterization of the input data, as each model has a different mechanism for recognizing the pattern of the data. The proposed aggregation strategy can efficiently map the predictions of IFMs to a consensus value. Kernel methods, which are relevant to regression techniques, are efficient for handling nonlinear relationships between targets and input data, large-scale data and structured information. Further, kernel mapping can facilitate weighting the outputs of IFMs in a unified framework. To realize such a multi-input single-output (MISO) combinatory function, we introduce the use of kernel mapping with a support vector regression technique, which has a high generalization capability. Accordingly, HIFA could be an effective tool not only for forecasting the fluctuating solar irradiance but also for diverse time-series prediction problems, thanks to the proposed aggregation strategy and the employed deep IFMs.
The key contributions of this article are:
• Propose a promising heterogeneous solar irradiance forecasting approach, called HIFA, which does not necessitate sophisticated meteorological infrastructure;
• Introduce a novel aggregation strategy based on kernel mapping for solar irradiance forecasting;
• Boost the precision of solar irradiance forecasting compared to existing ensemble forecasting approaches;
• Assess the efficacy of HIFA at geographically distant sites in Finland with realistic datasets.
Fig. 1 depicts the HIFA framework, which includes the deep IFMs and the proposed aggregation strategy based on kernel mapping. In HIFA, we consider m past solar irradiance values and employ deep LSTM and GRU RNNs to construct heterogeneous IFMs. Indeed, GRU and LSTM have shown promising forecasting results in sequential time-series data, thanks to their ability to utilize long-term information from previous computations. Consequently, the employment of heterogeneous IFMs could maintain a full characterization for the input solar irradiance data because each IFM model has different mechanisms for handling the data. Besides, the proposed aggregation strategy based on kernel mapping can handle the nonlinear relationships between predictions of IFMs, large-scale data and structured information while facilitating weighting predictions of IFMs. HIFA, with its ability to integrate various deep learning models, could significantly improve the results of individual forecasting models. Further, it exploits the merits of each model for addressing the fluctuating nature of solar irradiance.

II. HIFA FRAMEWORK
In Sections III-V, we explain HIFA in detail. The construction of the solar irradiance IFMs based on LSTM and GRU networks is presented in Section III, the proposed aggregation strategy based on kernel mapping is explained in Section IV, and the data structure, architectures of IFMs as well as the implementation of HIFA are described in Section V.

III. SOLAR IRRADIANCE FORECASTING USING HETEROGENEOUS FORECASTING MODELS
HIFA employs deep LSTM and GRU networks to build heterogeneous IFMs. Each IFM is trained using sequence data and the corresponding targets, i.e., $(x_1, y_1), (x_2, y_2), \ldots, (x_q, y_q)$. The trained IFM can then be used to handle a new input $x_i \in \mathbb{R}^D$ to predict the target $y_i$ given the preceding inputs $\{x_1, \ldots, x_{i-1}\}$. In this regard, deep learning techniques have been recently used to build efficient forecasting models. Deep learning models are a kind of ANN comprising successive layers, allowing high-level abstraction to model the data. Specifically, LSTM- and GRU-RNNs have been widely used in the literature for time-series forecasting and have achieved promising results [21], [23], [24], [26]. The main merit of the utilized deep LSTM and GRU models is that they can automatically extract the relevant features of input solar irradiance data using a general-purpose learning framework. Below, we briefly introduce the LSTM and GRU basic blocks.
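A. LONG SHORT-TERM MEMORY (LSTM)
Fig. 2(a) shows the basic unit of LSTM. It includes input, forget, and output gates. The state of LSTM comprises two vectors: a hidden state vector $h \in \mathbb{R}^D$ and a cell state vector $c \in \mathbb{R}^D$. At each time step $t$, the activation vectors of the input gate $i_t \in \mathbb{R}^D$, forget gate $f_t \in \mathbb{R}^D$, output gate $o_t \in \mathbb{R}^D$ and block input $g_t \in \mathbb{R}^D$ can be described as follows [32]:
$$i_t = \sigma(W_i x_t + R_i h_{t-1} + b_i),$$
$$f_t = \sigma(W_f x_t + R_f h_{t-1} + b_f),$$
$$o_t = \sigma(W_o x_t + R_o h_{t-1} + b_o),$$
$$g_t = \tanh(W_g x_t + R_g h_{t-1} + b_g),$$
where $\sigma$ is the logistic sigmoid function, $W_i, W_f, W_o, W_g$ are the input weight matrices, $R_i, R_f, R_o, R_g$ are the recurrent weight matrices, and $b_i, b_f, b_o, b_g$ are the bias vectors. The hyperbolic tangent $\tanh(x)$ is used as an activation function for the block input and output. After the activation vectors of the gates are computed, the next cell state and hidden state are updated as follows:
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t,$$
$$h_t = o_t \odot \tanh(c_t),$$
where $\odot$ refers to the element-wise product.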
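As an illustration of the gate and state updates described above, the following is a minimal sketch of a single LSTM time step in plain Python. The parameter layout (a dictionary mapping each gate name to a tuple of input weights `W`, recurrent weights `R`, and bias `b`) and the function names are assumptions made for this example; they are not taken from the article or from any library.

```python
import math

def sigmoid(v):
    """Logistic sigmoid used by the input, forget, and output gates."""
    return 1.0 / (1.0 + math.exp(-v))

def affine(gate, x, h):
    """Compute W x + R h + b for one gate; gate = (W, R, b) as nested lists."""
    W, R, b = gate
    return [
        sum(w * xj for w, xj in zip(W_row, x))
        + sum(r * hj for r, hj in zip(R_row, h))
        + b_k
        for W_row, R_row, b_k in zip(W, R, b)
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden state and cell state."""
    i = [sigmoid(v) for v in affine(params["i"], x, h_prev)]    # input gate
    f = [sigmoid(v) for v in affine(params["f"], x, h_prev)]    # forget gate
    o = [sigmoid(v) for v in affine(params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(params["g"], x, h_prev)]  # block input
    # Cell state: keep part of the old state, add the gated block input.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Hidden state: gated, squashed cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Hypothetical all-zero parameters for D = 2 hidden units and a 1-D input.
zero_gate = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
zero_params = {name: zero_gate for name in ("i", "f", "o", "g")}
h_next, c_next = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], zero_params)
```

With all-zero parameters every gate evaluates to 0.5 and the block input to 0, so the cell state is simply halved, which makes the update easy to check by hand.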

B. GATED RECURRENT UNIT (GRU)
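As shown in Fig. 2(b), the GRU includes two internal gating variables: the update gate $z_t$, which protects the D-dimensional hidden state $h_t \in \mathbb{R}^D$, and the reset gate $r_t$, which allows overwriting of the hidden state and controls the interaction with the input $x_t \in \mathbb{R}^p$. They can be described as follows [33]:
$$z_t = \sigma(W_z x_t + R_z h_{t-1} + b_z),$$
$$r_t = \sigma(W_r x_t + R_r h_{t-1} + b_r),$$
$$\tilde{h}_t = \tanh(W_h x_t + R_h (r_t \odot h_{t-1}) + b_h),$$
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t,$$
where $W_z, W_r, W_h \in \mathbb{R}^{D \times p}$ and $R_z, R_r, R_h \in \mathbb{R}^{D \times D}$ are the input and recurrent weight matrices, $b_z, b_r, b_h \in \mathbb{R}^D$ are the bias vectors, $\tilde{h}_t$ is the candidate hidden state, $\sigma$ is the logistic sigmoid function, and $\odot$ is the element-wise product.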
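The gating behaviour described above can be sketched as one GRU time step in plain Python. This is a hedged illustration: the parameter layout and function names are assumptions, and the sketch adopts the convention in which an update gate close to 1 retains the previous hidden state (conventions differ between references, and some swap the roles of $z_t$ and $1 - z_t$).

```python
import math

def sigmoid(v):
    """Logistic sigmoid used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-v))

def affine(gate, x, h):
    """Compute W x + R h + b for one gate; gate = (W, R, b) as nested lists."""
    W, R, b = gate
    return [
        sum(w * xj for w, xj in zip(W_row, x))
        + sum(r * hj for r, hj in zip(R_row, h))
        + b_k
        for W_row, R_row, b_k in zip(W, R, b)
    ]

def gru_step(x, h_prev, params):
    """One GRU step: returns the new hidden state."""
    z = [sigmoid(v) for v in affine(params["z"], x, h_prev)]  # update gate
    r = [sigmoid(v) for v in affine(params["r"], x, h_prev)]  # reset gate
    # The reset gate scales the previous state before it enters the candidate.
    gated = [rk * hk for rk, hk in zip(r, h_prev)]
    cand = [math.tanh(v) for v in affine(params["h"], x, gated)]
    # Assumed convention: z near 1 keeps (protects) the previous hidden state.
    return [zk * hk + (1.0 - zk) * ck for zk, hk, ck in zip(z, h_prev, cand)]

# Hypothetical all-zero parameters for D = 2 hidden units and a 1-D input.
zero_gate = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
zero_params = {name: zero_gate for name in ("z", "r", "h")}
h_next = gru_step([0.3], [1.0, -2.0], zero_params)
```

Compared with the LSTM unit, the GRU keeps a single state vector and two gates, which is one reason both are used side by side as heterogeneous IFMs.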

IV. PROPOSED AGGREGATION STRATEGY
In HIFA, we propose a learnable aggregation strategy based on kernel mapping to minimize the variance of forecasting errors. Such a MISO aggregation function $f$ maps the predictions of IFMs $(x_1, x_2, \ldots, x_n)$ to a consensus value $y$:
$$y = f(x_1, x_2, \ldots, x_n).$$
Kernel functions are introduced here to map the predictions of IFMs into an implicit high-dimensional space, where a linear model can be sufficient to aggregate the predictions of the IFMs. Indeed, kernel mapping enables effective association testing at the prediction level of IFMs. In kernel mapping, samples $x$ can be mapped into a feature space of higher dimensions, $x \mapsto \phi(x)$, where $\phi$ is a mapping function whose kernel generates a symmetric positive semidefinite (PSD) matrix for any subset of data. Mathematically, a kernel function is a map $K : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$. Fig. 3 depicts the framework of kernel mapping for the predictions of IFMs. As shown, the framework has different layers: the input data space and the transformed feature space created by the kernel mapping function, which plays a vital role in large-scale data aggregation [35], [36].
A kernel $K$ is a function that takes two vectors $x_i$ and $x_j$ as arguments and returns the value of the inner product of their mappings $\phi(x_i)$ and $\phi(x_j)$:
$$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle.$$
Notably, the dimensionality of the resulting space is not vital because only the inner product value of the two vectors in the resulting space is returned. This process is called the kernel trick, where all inner products of the learning technique in the original space are replaced by kernels.
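A small numerical check can make the kernel trick concrete. The sketch below uses an illustrative homogeneous degree-2 polynomial kernel on 2-D inputs, chosen only because its feature map can be written out explicitly; it is not one of the specific kernel settings used in HIFA. It shows that evaluating the kernel in the input space gives the same number as an inner product in the mapped feature space.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit feature map of the kernel (x . z)^2 for a 2-D input."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

def poly2_kernel(x, z):
    """Same quantity computed without ever building phi(x) or phi(z)."""
    return dot(x, z) ** 2

x, z = [1.0, 2.0], [3.0, 0.5]
explicit = dot(phi(x), phi(z))   # inner product in the 3-D feature space
implicit = poly2_kernel(x, z)    # kernel evaluated in the 2-D input space
```

Both routes give 16 for these inputs. For kernels such as the RBF kernel the feature space is infinite-dimensional, so only the implicit route is available in practice.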
In the higher-dimensional space, the data can be linearly separated, and the resulting decision function $f$ becomes:
$$f(x) = w^{T} \phi(x) + b,$$
where $w$ and $b$ are the variables of the decision plane in the resulting space. It is worth noting that the function $\phi(x)$ has the following characteristics: 1) it is a kernel-induced implicit mapping, and 2) it does not need to be identified explicitly because the vectors $x$ appear only in inner products.
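In this study, we consider three widely used kernels: linear, polynomial and radial basis function (RBF). Assume $x = [x_1, \cdots, x_n]^T$ and $z = [z_1, \cdots, z_n]^T$. The linear kernel can be defined as:
$$K(x, z) = x^{T} z.$$
The polynomial kernel can be defined as:
$$K(x, z) = (x^{T} z + v_2)^{v_1},$$
where $v_1$ and $v_2$ refer to the kernel order and a constant that controls the trade-off between the higher- and lower-order terms, respectively. The RBF kernel can be defined as:
$$K(x, z) = \exp\left(-\gamma \lVert x - z \rVert^{2}\right),$$
where $\gamma > 0$ is the kernel width parameter.
In this study, support vector regression (SVR) is used as a learning technique to build the aggregation function $f$. SVR can be expressed as follows [35]:
$$\min_{w, b, \zeta, \zeta^{*}} \; \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i} (\zeta_i + \zeta_i^{*})$$
subject to
$$y_i - w^{T} \phi(x_i) - b \le \varepsilon + \zeta_i, \qquad w^{T} \phi(x_i) + b - y_i \le \varepsilon + \zeta_i^{*}, \qquad \zeta_i, \zeta_i^{*} \ge 0,$$
where $\zeta_i$ and $\zeta_i^{*}$ are slack variables, and $C > 0$ controls the trade-off between the flatness of $f(x)$ and the amount to which deviations larger than $\varepsilon$ are tolerated. Indeed, the construction of the hard margins of SVR necessitates a full separation of the training data in the hyper-plane, which is done using kernel functions.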
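The three kernels named above, together with the epsilon-insensitive penalty that underlies SVR, can be sketched in a few lines of plain Python. The default parameter values (`v1=2`, `v2=1.0`, `gamma=0.5`) and the `gamma` parameterization of the RBF kernel are assumptions for illustration only; the article does not state the values it used.

```python
import math

def linear_kernel(x, z):
    """Linear kernel: the inner product of x and z."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, v1=2, v2=1.0):
    """Polynomial kernel of order v1 with offset v2 (assumed defaults)."""
    return (linear_kernel(x, z) + v2) ** v1

def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel; gamma is an assumed width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def gram_matrix(points, kernel):
    """Pairwise kernel values for a list of IFM prediction vectors."""
    return [[kernel(p, q) for q in points] for p in points]

def eps_insensitive_loss(residual, eps):
    """Deviations inside the eps tube cost nothing; larger ones cost linearly."""
    return max(0.0, abs(residual) - eps)

# Hypothetical normalized predictions of two IFMs at three time steps.
preds = [[0.20, 0.25], [0.50, 0.45], [0.90, 0.80]]
K_rbf = gram_matrix(preds, rbf_kernel)
```

The Gram matrix is symmetric with ones on the diagonal for the RBF kernel, and it is the only view of the data that a kernel learner such as SVR needs.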

V. IMPLEMENTATION OF HIFA
In this section, we present the data structure used to train IFMs, the architectures of each IFM, and HIFA implementation steps.

A. DATA STRUCTURE FOR IFMs
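The solar irradiance data are restructured to train supervised machine learning techniques, where recent time-steps are used as input variables and the next time-step as the output variable, as follows:
$$\beta_1 = \begin{bmatrix} s_1 & s_2 & \cdots & s_k \\ s_2 & s_3 & \cdots & s_{k+1} \\ \vdots & \vdots & \ddots & \vdots \\ s_{d-k} & s_{d-k+1} & \cdots & s_{d-1} \end{bmatrix}, \qquad \beta_2 = \begin{bmatrix} s_{k+1} \\ s_{k+2} \\ \vdots \\ s_d \end{bmatrix},$$
where $s_t$ refers to a solar irradiance measurement at time-step $t$, $k$ is the number of look-back steps, and $d$ is the number of measurements collected from a certain site. Both $\beta_1$ and $\beta_2$ can be merged into a single matrix whose rows pair each input window with its target. Given a dataset of a particular site that includes historical solar irradiance measurements, we train each IFM separately. Before being fed into an IFM, the elements of this matrix are normalized using (25).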
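The restructuring step described above can be sketched as a sliding window over the series. The function names are hypothetical, and the min-max scaling shown is an assumed choice: the normalization formula is not reproduced in this text, so this should be read as one common option rather than the article's definition.

```python
def make_windows(series, k):
    """Pair each window of k past values with the value that follows it."""
    inputs = [series[i:i + k] for i in range(len(series) - k)]
    targets = [series[i + k] for i in range(len(series) - k)]
    return inputs, targets

def min_max_scale(series):
    """Assumed normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in series]
    return [(s - lo) / (hi - lo) for s in series]

# Hypothetical irradiance readings, including night-time zeros.
raw = [0.0, 0.0, 50.0, 100.0, 75.0]
scaled = min_max_scale(raw)
X, y = make_windows(scaled, k=2)
```

With five readings and `k=2` this yields three input windows and three targets, one fewer row per extra look-back step.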

B. ARCHITECTURES OF IFMs
In this study, we use five IFMs, namely IFM1, IFM2, IFM3, IFM4, and IFM5, with different architectures based on deep LSTM and GRU. The architecture of IFM1 has a single hidden layer of LSTM units and an output layer used to predict the next solar irradiance value. Multiple recent time-steps are used to predict solar irradiance at the next time step. The recent observations are not used as separate input features, but as time-steps of a single input feature. IFM2 has the same architecture as IFM1, but LSTM is replaced by GRU. In IFM3, a CNN is employed in a hybrid model with an LSTM backend, in which the CNN is utilized to automatically extract features from the input solar irradiance sequence. The input sequences are divided into subsequences to be processed by the CNN. The creation of subsequences can be parameterized by the number of subsequences and the number of time-steps per subsequence. The CNN model has a convolutional layer followed by a max-pooling layer. The output of the CNN is flattened to a one-dimensional vector and fed into the LSTM layer as an input sequence. IFM4 has the same architecture as IFM3 except that GRU is utilized instead of LSTM. In IFM5, a bidirectional LSTM is used to model solar irradiance in both forward and backward directions, and then the forward and backward interpretations are concatenated.
To build the IFMs, we use the sequential models of the Keras library. Each solar irradiance dataset is divided into 70% for training and 30% for testing. Each model has a hidden layer with four blocks and a ReLU activation function. The loss function of LSTM is the mean squared error, and the adaptive moment estimation (ADAM) optimizer is employed. Both LSTM and GRU models are trained for a total of 100 epochs. The convolutional layer of IFM3 and IFM4 comprises 64 filters. All hyperparameters are experimentally tuned. The parameters of IFMs are tabulated in Table 1.

C. HIFA ALGORITHM AND ASSESSMENT
In Algorithm 1, we present the steps to implement HIFA. In the training phase, we restructure and normalize the dataset of each site as described in (24) and (25), respectively. Then, the five IFMs are built, as explained in Section V.B. The predictions of IFMs are used to train the aggregation function $f$ as explained in Section IV. In the testing phase, given recent time-steps ($rs = \{s_{t-l}, \ldots, s_t\}$) of solar irradiance, the outputs of the IFMs $p = \{IFM1(rs), \ldots, IFM5(rs)\}$ are fed into the trained aggregation function $f$ to get the consensus prediction of the solar irradiance $f(p)$.
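The two phases described above (fit an aggregation function on IFM predictions, then map new IFM outputs to a consensus value) can be illustrated with a dependency-free sketch. Note the substitution: training an SVR requires a quadratic-programming solver, so this example swaps in kernel ridge regression with an RBF kernel, which shares the kernel-expansion form of the predictor but uses a squared loss. It is not the learner used in HIFA, and all data, names, and parameter values (`gamma`, `lam`) are hypothetical.

```python
import math

def rbf(x, z, gamma=10.0):
    """RBF kernel with an assumed width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_aggregator(P, y, lam=1e-6):
    """Stand-in training phase: kernel ridge weights on IFM prediction rows."""
    n = len(P)
    K = [[rbf(P[i], P[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)

def aggregate(p, P, alpha):
    """Testing phase: consensus forecast for one row p of IFM predictions."""
    return sum(a * rbf(p, pi) for a, pi in zip(alpha, P))

# Hypothetical normalized outputs of three IFMs (columns) and observed targets.
P_train = [[0.10, 0.12, 0.09], [0.40, 0.43, 0.38],
           [0.70, 0.66, 0.72], [0.95, 0.90, 0.97]]
y_train = [0.10, 0.41, 0.69, 0.94]
alpha = fit_aggregator(P_train, y_train)
consensus = aggregate([0.42, 0.40, 0.41], P_train, alpha)
```

In practice an SVR implementation from an established library would replace `fit_aggregator` and `aggregate`; the surrounding flow of stacking IFM outputs into rows and learning a nonlinear map from them stays the same.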

Algorithm 1 Pseudo Code of HIFA
1: Input historical SI data.
2: Divide SI data into training and testing sets.
3: Structure SI data as described in (24), and then normalize SI data using (25).
12: Load f, IFM1, IFM2, IFM3, IFM4, IFM5
13: foreach time step t do
14: Read recent SI values (rs)
15: Compute p ← {IFM1(rs), . . . , IFM5(rs)}
16: Forecast value ← f(p)
17: end for
18: end while
19: end
To assess the accuracy of HIFA, we use the root mean square error (RMSE) and mean absolute error (MAE), formulated as follows [37]:
$$RMSE = \sqrt{\frac{1}{N_s} \sum_{i=1}^{N_s} \left(SI_{P,i} - SI_{E,i}\right)^{2}},$$
$$MAE = \frac{1}{N_s} \sum_{i=1}^{N_s} \left| SI_{P,i} - SI_{E,i} \right|,$$
where $N_s$, $SI_P$ and $SI_E$ are the number of samples, the predicted values and the observed values of solar irradiance, respectively. It is worth noting that the proposed forecasting approach makes no assumption about the type of the data, and therefore, it is applicable to solar irradiance forecasting in various countries.
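The two error metrics named above can be computed with a short sketch in plain Python. The function names and the sample values are illustrative only.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def mae(predicted, observed):
    """Mean absolute error between predicted and observed values."""
    n = len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n

# Hypothetical predicted and observed irradiance values.
si_p = [1.0, 2.0, 3.0]
si_e = [1.0, 2.0, 5.0]
err_rmse = rmse(si_p, si_e)
err_mae = mae(si_p, si_e)
```

Because RMSE squares the residuals before averaging, a single large miss raises it more than it raises MAE, which is why the two metrics are reported together.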

VI. RESULTS AND DISCUSSION
In this section, we first describe the solar irradiance datasets used to validate HIFA and other forecasting models. To demonstrate the diversity of the datasets, we compare solar irradiance at the three sites in terms of statistical metrics, namely the mean ($\mu$), standard deviation ($\sigma$), kurtosis, and skewness. The value of $\mu$ at Site I is 120.0257 (kW/m$^2$), which is higher than those of Sites II and III, where the $\mu$ values are 113.1965 and 82.2772 (kW/m$^2$), respectively. The $\sigma$ values of solar irradiance at Sites I, II and III are 199.4447, 194.3888 and 142.8869 (kW/m$^2$), respectively, implying that the solar irradiance variation is highest at Site I. The kurtosis values at Sites I, II and III are 4.9938, 5.7901, and 6.9278, respectively. The high kurtosis value at Site III indicates that the solar irradiance profile in this dataset fluctuates strongly and contains more outliers than the other two sites. The skewness values of solar irradiance at Sites I, II and III are 1.7455, 1.9247, and 2.0966, respectively, indicating that the distribution of solar irradiance at Site I is more symmetrical than those of Sites II and III. Such large variation of solar irradiance profiles across the sites shows that an efficient forecasting approach is required to obtain precise results. Note that the utilized IFMs can model all sequences of solar irradiance, even the ones that include zeros.
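The four descriptive statistics used above can be computed as in the following sketch. Two conventions are assumed here because the text does not state them: population (biased) moments, and non-excess kurtosis, for which a normal distribution scores 3. The sample values are hypothetical.

```python
import math

def describe(values):
    """Mean, standard deviation, skewness, and kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4 (population form, assumed).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std ** 3,   # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,    # non-excess kurtosis (assumed)
    }

# Hypothetical readings: many night-time zeros and a few daytime peaks.
stats = describe([0.0, 0.0, 0.0, 0.0, 10.0, 40.0, 200.0])
```

A series dominated by zeros with occasional peaks gives positive skewness, which matches the sign of the skewness values reported for the three sites.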

B. ACCURACY ASSESSMENT OF HIFA AND ABLATION STUDY
In this subsection, we present an ablation study for HIFA. In Table 2, the RMSE and MAE values of the five IFMs (IFM1-5) are shown for the three geographically distant sites in Finland. In the case of Site I, IFM2 obtains an MAE of 7.5606 and an RMSE of 14.4949, which are slightly lower than those of IFM1, IFM3 and IFM4. However, IFM5 gives the highest forecasting errors for this site. With Site II, the lowest RMSE is achieved by IFM1 (17.482) while the lowest MAE, 7.9708, is obtained by IFM4. For Site III, IFM1, IFM3 and IFM4 give RMSE values lower than 5.20 and MAE values lower than 2.20, which are much better than those of IFM2 and IFM5. These results emphasize that no single IFM is best suited to forecast solar irradiance for all sites or all solar irradiance profiles, in line with the no-free-lunch theorem. Consequently, we can conclude that each IFM gives lower RMSE and MAE for a specific site, not for all of them. Therefore, a combinatory function that integrates efficient forecasting models could yield accurate forecasting results for all sites and all solar profiles, if an appropriate aggregation strategy is utilized.
Besides, in Table 2 we show the RMSE and MAE values of the proposed HIFA, in which we aggregate the best IFMs. With Sites I, II, and III, HIFA obtains RMSE values of 11.8928, 11.7097, and 3.3675, respectively. In turn, it achieves MAE values of 5.7879, 7.8446, and 1.0284 with the three sites, respectively. As we can see, the forecasting errors of HIFA are lower than those of the five individual models at the three sites. In Fig. 4, we show the forecast values of HIFA at Sites I, II and III for a day. Interestingly, Fig. 4(a), Fig. 4(b) and Fig. 4 [...] 8.1525, and 4.8761, respectively. As noticed, the forecasting errors of HIFA-a, HIFA-b, and HIFA-c are lower than those of the individual models in most cases, thanks to the proposed aggregation strategy. Based on the ablation study, the best four IFMs (IFM1, IFM2, IFM3 and IFM4) are aggregated to construct HIFA.
In Table 3, we examine the accuracy of the solar irradiance forecasts of HIFA under various solar profiles (clear, partially cloudy, and cloudy). We quantify these three solar irradiance profiles according to the clearness index proposed in [39]. As shown, HIFA performs well with the three profiles of solar irradiance at Sites I, II and III. It is worth mentioning that HIFA can provide accurate forecasting results even in cases with a low clearness index, which occur with highly fluctuating profiles of solar irradiance. Accordingly, HIFA could be an efficient approach for forecasting the highly fluctuating solar irradiance in Finland, thanks to the proposed aggregation strategy and the adopted deep IFMs.
VOLUME 9, 2021

C. IMPACT OF KERNEL FUNCTIONS ON THE PERFORMANCE OF HIFA
In HIFA, we use the RBF kernel when building the aggregation function. Here, we also study the performance of two other kernel functions with the proposed aggregation strategy, namely linear (Lin) and polynomial (Poly). Fig. 5 compares the RMSE and MAE values for the RBF-, Lin- and Poly-based aggregation functions at Sites I, II, and III. As noticed, the RBF and Lin kernels yield RMSE and MAE values less than 12 at all sites. Besides, the errors at Site III are lower than those at Sites I and II, where higher solar irradiance profiles are observed. The Poly-based aggregation function gives the highest RMSE and MAE values at all sites, especially at Site III, where solar irradiance is low. This analysis demonstrates that the RBF-based aggregation function is the most suitable for the solar irradiance forecasting task, thanks to its generalization ability and tolerance to noise.

D. EVALUATION AGAINST EXISTING ENSEMBLE-BASED METHODS
To demonstrate the validity of HIFA, we compare its accuracy against three ensemble integration approaches: Ensemble 1 is the average ensemble approach employed in [40], Ensemble 2 is the median ensemble approach presented in [41], and Ensemble 3 is the weighted average ensemble given in [42]. Ensemble 1 [40] works in an online way by weighting the individual forecasting models according to their current performance. Such an online strategy enables the ensemble to deal with possible nonstationarities. Ensemble 2 [41] computes the median of the predictions of all IFMs. The main rationale behind choosing the median is that it is robust to outliers. The median indirectly handles the poor estimation performance of some individual models on parts of the target variable space, meaning that submodels that overestimate or underestimate some instances may still provide an appropriate estimate for other instances. This interchanging estimation performance is settled by using the median operator as a nonlinear ensemble mechanism. In the case of Ensemble 3 [42], a weighted average method is used to fuse the predictions of IFMs. An improved differential evolution algorithm is employed to search for the optimal combination weights of the IFMs. It should be mentioned that the IFMs used to build HIFA are also employed to construct Ensemble 1, Ensemble 2, and Ensemble 3.
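The three baseline combination rules described above can be sketched in simplified form as follows. This is an illustration of the operators only: the online re-weighting of [40] and the differential evolution weight search of [42] are not reproduced, and the weights and prediction values below are hypothetical.

```python
from statistics import mean, median

def average_ensemble(preds):
    """Simple average of the IFM predictions for one time step."""
    return mean(preds)

def median_ensemble(preds):
    """Median of the IFM predictions; insensitive to a single outlier."""
    return median(preds)

def weighted_ensemble(preds, weights):
    """Weighted average with weights normalized to sum to one."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, preds)) / total

# Hypothetical outputs of five IFMs; the last one is an outlier.
preds = [100.0, 102.0, 98.0, 101.0, 300.0]
avg = average_ensemble(preds)
med = median_ensemble(preds)
wavg = weighted_ensemble(preds, [0.3, 0.3, 0.2, 0.15, 0.05])
```

The outlier pulls the average far from the cluster of four agreeing models, while the median stays inside it. Both the average and the weighted average are linear in the IFM outputs, which is the limitation the kernel-based aggregation is meant to address.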
As shown in Table 4, Ensemble 3 gives RMSE values of 14.8239, 19.0611, and 8.4671 at Sites I, II, and III, respectively, which are higher than those of Ensemble 1 and Ensemble 2. The same trend can be noticed with the MAE values. It is noticeable that HIFA reduces the forecasting errors significantly compared with Ensemble 1, Ensemble 2, and Ensemble 3. For instance, HIFA achieves RMSE values of 11.8928, 11.7097, and 3.3675, which are much lower than those of the three ensemble techniques. To demonstrate this superiority, we compare the daily solar irradiance forecasts of HIFA, Ensemble 1, Ensemble 2, and Ensemble 3 with the actual values for three days, as shown in Fig. 6. In general, the daily profiles forecast by HIFA are very close to the actual ones. In turn, the profiles forecast by the other ensemble techniques are far from the actual profiles on the three days. Further, the relative performance of Ensemble 1, Ensemble 2, and Ensemble 3 differs from day to day. For instance, Ensemble 2 outperforms Ensemble 1 and Ensemble 3 with the clear solar irradiance profile shown in Fig. 6(a), while it shows the worst performance in Fig. 6(c). Unlike the compared ensemble-based techniques, HIFA shows consistently high performance with diverse solar irradiance profiles at all sites.
Furthermore, we use the skill score to evaluate the forecasting methods, which can be expressed as $\eta = 1 - RMSE_f / RMSE_r$, where $RMSE_f$ and $RMSE_r$ are the RMSE of the forecasting method and of the persistence method (reference method), respectively [37]. As shown in Fig. 7, Ensemble 1, Ensemble 2, and Ensemble 3 yield their worst performance at Site II, with low skill scores when compared to Sites I and III. Ensemble 1, Ensemble 2, Ensemble 3, and most ensemble approaches in the literature assume a linear relationship between the predictions of IFMs and employ weighted average approaches to combine them. It should be noted that there is no guarantee about the validity of such an assumption or of the utilized weights, nor about their reproducibility on unseen solar irradiance data. Although Ensemble 1, Ensemble 2, and Ensemble 3 can improve the forecasting results of IFMs, they still cannot provide perfect input-output mappings for unseen solar irradiance data [31]. In contrast, HIFA outperforms the other three ensemble approaches at the three sites. This is accomplished by the efficient individual deep forecasting models as well as the aggregation strategy. Thanks to the heterogeneous IFMs, which fully characterize the solar irradiance data, and the proposed aggregation strategy, HIFA can handle the non-linearity of the forecasting results in the combination process, yielding improved forecasting performance. Another positive feature of HIFA is that it could be used with various forecasting problems in smart grids, such as PV power, load and wind power forecasting.
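The skill score defined above can be computed against a persistence reference as in the following sketch. Persistence is taken here to mean that the forecast for each step equals the previous observation, which is the usual reading of the term but is an assumption about the reference used in the article. All values are hypothetical.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def skill_score(rmse_forecast, rmse_reference):
    """Skill score: 1 is perfect, 0 matches the reference, negative is worse."""
    if rmse_reference <= 0:
        raise ValueError("reference RMSE must be positive")
    return 1.0 - rmse_forecast / rmse_reference

# Hypothetical observed series and model forecasts for steps 2..4.
observed = [1.0, 2.0, 4.0, 7.0]
targets = observed[1:]
persistence = observed[:-1]        # forecast for step t is the value at t-1
model = [1.5, 3.5, 6.5]            # hypothetical model forecasts

rmse_ref = rmse(persistence, targets)
rmse_model = rmse(model, targets)
eta = skill_score(rmse_model, rmse_ref)
```

A positive score means the method beats persistence, which is why the score is convenient for comparing methods across sites whose raw RMSE levels differ.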

VII. CONCLUSION
In this article, we have proposed HIFA, a promising solar irradiance forecasting approach. HIFA uses efficient deep LSTM and GRU networks to build heterogeneous IFMs. To effectively aggregate IFMs, a novel aggregation strategy based on kernel mapping has been presented. The state-of-the-art ensemble-based solar irradiance forecasting approaches generate weights based on the performance of each IFM; however, the generalization of these weights is not guaranteed due to the variability of the forecasting results of IFMs. In contrast, the proposed aggregation strategy can efficiently map the predictions of IFMs to a consensus prediction. To validate the effectiveness of HIFA, we have used three realistic solar irradiance datasets collected from three geographically distant sites in Finland. An ablation study for HIFA has been presented, in which the performance of heterogeneous IFMs, variants of HIFA (HIFA-a, HIFA-b, and HIFA-c), and different kernel functions (Lin, Poly, and RBF) have been analyzed. The experimental results reveal that solar irradiance profiles at different sites vary widely. Further, no single IFM is best suited to forecast solar irradiance for all sites or all solar profiles. Thus, it has been concluded that each IFM obtains lower forecasting errors only for a specific site. To demonstrate the validity of HIFA, we have compared it with three ensemble integration approaches. HIFA achieves RMSE values of 11.8928, 11.7097, and 3.3675 at the three sites, which are much lower than those of the three ensemble techniques. Accordingly, HIFA is an efficient tool for forecasting the fluctuating solar irradiance, thanks to the proposed aggregation strategy and the adopted IFMs. Future work will be directed toward applying HIFA to various applications in modern power systems, such as electricity price forecasting, demand forecasting, and power generation forecasting of wind turbines.