A Review of Machine Learning-based Photovoltaic Output Power Forecasting: Nordic Context

Motivated by factors such as the reduction in cost and the need for a shift towards achieving UN’s Sustainable Development Goals, PV (Photovoltaic) power generation is getting more attention in the cold regions of the Nordic countries and Canada. The cold climate and the albedo effect of snow in these regions present favorable operating conditions for PV cells and an opportunity to realize the seasonal matching of generation and consumption respectively. However, the erratic nature of PV brings a threat to the operation of the grid. PV power forecasting has been used as an economical solution to minimize and even overcome this limitation. This paper is therefore a comprehensive review of machine learning-based PV output power forecasting models in the literature in the context of Nordic climate. The impact of meteorological parameters and the soiling effect due to snow, which is unique to this climate, on the performance of a prediction model is discussed. PV power forecasting models in the literature are systematically classified into multiple groups and each group is analyzed and important suggestions are made for choosing a better model for these regions. Ensemble methods, optimization algorithms, time-series decomposition, and weather clustering are identified as important techniques that can be used to enhance performance. And notably, this work proposed two conceptual approaches that can be used to incorporate the effect of snow on PV power forecasting. Future research needs to focus on this area, which is crucial for the development of PV in these regions.


I. INTRODUCTION
The demand for electrical energy is increasing very significantly because of the continued increase in the world population. In the International Energy Outlook 2019 (IEO2019) Reference case release, the US Energy Information Administration (EIA) projected that world energy usage will grow by nearly 50% between 2018 and 2050 [1]. This increased demand requires more energy generation from coal and other fossil fuels, which not only produce a significant amount of pollution but also ultimately result in the depletion of the limited resources. In addition, more generation in the conventional ways requires up-scaling of the existing electric grid to accommodate this huge change, which is economically demanding and technically inefficient.
This creates the need for sustainable alternatives that are in line with the UN's Sustainable Development Goals (SDG 7: Affordable and Clean Energy). Motivated by this and the continued advancement in PV (photovoltaic) and wind power technology, large wind and PV power plants have been deployed throughout the world. The reduced cost of these technologies is also a huge contributing factor. According to The Sustainable Development Goals Report 2020, the share of renewable energy in total final energy consumption reached 17.3% in 2017, up from 17.0% in 2015 and 16.3% in 2010 [2]. The largest increase in the use of renewables has come from the electricity sector, driven by the rapid expansion of solar and wind power. Unfortunately, these efforts are not yet at the required scale, and more has to be done if Goal 7 is to be achieved by 2030.
Wind and solar energy have become mainstream electricity sources and are increasingly cost-competitive with fossil fuel power plants. Nearly everywhere in the world, producing electricity from new renewables is more cost-effective than producing it from new coal-fired power plants [3]. Although several key countries and regions, such as China, South and Central Europe, and the United States, have driven these trends and continued to have a large impact in 2019, renewable power, especially PV, is growing in all corners of the world. Even in colder and high-latitude climate regions such as the Nordic countries and Canada, solar electricity has gained huge interest in recent years. The PV installation trend in these regions for the last 5 years, based on data from the statistics bureaus of the respective countries and Our World in Data (footnote 1), is shown in Fig. 1. In 2020 alone, PV installed capacity grew by 33%, 56%, 20%, 21%, and 2.5% for Norway, Sweden, Denmark, Finland, and Canada respectively when compared with the previous year. This can possibly be attributed to the reduced price of PV modules and favorable operating conditions for the solar cells due to the cold temperature. These countries have a combined installed PV power capacity of more than 5800 MW as of 2020, and with the ongoing promising steps in government policy in these regions, solar electricity will grow at an even more significant rate.
Despite the hydro and wind power dominated power system, solar electricity is gaining momentum in Norway. Solar energy production in Norway is growing rapidly (currently 0.1 TWh annually) and it will be an important component in the power system. Solar electricity has become more attractive both in new zero-emission housing projects and for homeowners who wish to be wholly or partly self-sustained with electricity. According to the Norwegian Water Resources and Energy Directorate (NVE), a total of 40 MW of new solar power was installed in Norway during the year 2020 alone [4]. By 2030, the annual energy generation from PV is estimated to be between 1.2 TWh [5] and 4 TWh [6]. Solar electricity is mainly a distributed resource in Norway, but recent projects are looking more at the use of solar on a larger scale, especially in connection with hydro power reservoirs and larger office buildings.
The transition to renewable energy (PV and wind power), however, is not smooth. Solar electricity has significant limitations since it is time reliant and has a stochastic nature. This is the result of the high dependence of PV output power on meteorological conditions and parameters. A typical PV output power variation is shown in Fig. 2. This figure shows the PV output power for three consecutive days (September 24-26, 2020) selected at random for a PV plant located on the rooftop of the Department of Electric Power Engineering, Norwegian University of Science and Technology, Norway. It can be seen from the figure that the power output varies strongly over a short period, which brings a threat to the operation of the power grid. Although solar irradiation and temperature are the two most important meteorological parameters that directly affect the amount of energy generated from PV panels, the contribution of the geographical area where the plant is located is also significant because of a phenomenon called the soiling effect.
Footnote 1: https://ourworldindata.org/
The soiling effect can result from snow, dirt, dust, and other particles that cover the surface of a PV module. This effect is significant especially in areas with extreme weather conditions. For a desert type PV (arid and semi-arid areas), for example, a reduction of up to 10 % in energy production was measured as a result of soiling loss due to dust [7]. Similarly, the soiling effect of snow and ice coverage in colder climates is significant on winter days. However, quantification of the exact reduction in the generation is a complicated process due to the complex nature of the optical characteristics of snow [8]. More on factors affecting PV power is covered in Section III.
Due to the concerns discussed above, high penetration of PV power into the grid brings a challenge in the operation of the existing power system. The inherent variability of solar power creates challenges in matching variable load with variable supply. Maintaining the instantaneous balance between production and demand becomes a difficult and expensive task [9]. This will in turn affect the decision-making ability of dispatch centers and energy trading companies on critical issues such as alternate adjustments for conventional power sources, scheduling arrangements, storage requirements, and overall planning [10].
The unique climate of cold regions makes the task of integrating PV even more challenging. These regions are generally characterized by heavy snow and short sun hours during the winter, long sun hours during the summer, and fast and frequent weather variation over a short period throughout the year due to fast-moving clouds (especially in Norway). The consequence of fast-moving clouds can be catastrophic, especially when a significant number of PV plants are integrated into a low-voltage grid. This can cause fast and large changes in solar irradiance resulting in a sudden drop in PV output power, which ultimately leads to disruptions and grid instability. Addressing these issues is therefore very important to exploit and realize the full potential of solar electricity in the Nordic region.
The use of storage systems and consumption flexibility are usually suggested as means to overcome these limitations [11], [12]. However, their success is greatly impacted by the high cost of batteries and the lack of clear financial incentives that motivate consumption flexibility. In recent years, there has been an increasing interest in the use of energy informatics technologies to reduce the negative impacts of grid-integrated PV systems.
Energy informatics technologies (AI (Artificial Intelligence) and ML (Machine Learning)) can overcome these PV power limitations and make solar electricity an equal contributor to the energy mix. ML and AI methods can improve the adoption of solar electricity resulting in a modernized electrical grid supporting the reliability and resilience of the overall grid. These technologies are usually used in the forecasting of global irradiance and solar power output [13]. Accurate PV power output forecasting enables power system operators to make proper scheduling and operation planning, allow accurate energy trading decisions in power markets and significantly reduce the cost and size of balancing reserves.
Given the unique weather characteristics in these regions, much uncertainty exists about the design of a PV output power forecasting model. The commonly available AI/ML-based PV power forecasting models in the literature could be inadequate and could result in inaccurate and unstable forecast results. The optimal size and type of informative environmental parameters to use and the specific type of AI algorithm to implement are not fully understood. Very little is also known about the impact of the forecast horizon, the areal scale, and the time step on the performance of a forecast model in these regions. This study offers some important insights into the above critical shortcomings. Our main objective in this paper is therefore to quantitatively and qualitatively assess the role that AI/ML technologies play in forecasting the amount and variation of output power, focusing on PV plants located in cold regions. This work will potentially serve as a starting point from which more area-specific AI-based PV power forecasting models can be designed to further promote the deployment of PV in these regions.

OBJECTIVES:
The following are some of the important issues that are addressed in this paper by reviewing recent research works:
1) Identify relevant parameters or inputs that affect AI-based PV output power forecasting in general and under a cold climatic condition in particular.
2) Investigate different PV power forecasting model types and suggest the 'one' that works relatively well for these regions.
3) Identify a forecast time step that can capture all the variations in the PV output power and identify a forecast horizon that suits a particular application.
4) Identify and suggest techniques and methods that can improve the performance of a forecast model in terms of accuracy and stability.
The rest of this paper is organized as follows. Section II begins by discussing the general steps that are used in the design of an AI-based PV output forecasting model. It then briefly describes the working principles of the widely used conventional and deep learning AI algorithms for PV power forecasting. The impact of various environmental parameters on PV power forecasting is addressed in Section III. In Section IV, special focus is given to the case of PV in colder climates. Classification of various PV power forecasting models based on forecast horizon, model type, time step, areal scale, and approach used is covered in Section V. Finally, the important conclusions and observations from this review work are outlined in Section VII.

II. AI FOR PV OUTPUT POWER FORECASTING
This section addresses the generic AI-based PV power forecasting procedures and the common AI algorithms that are used in the design of a typical AI-based PV power output forecasting model.

A. AI PROCESS IN PV OUTPUT POWER FORECASTING
AI algorithms are currently the most widely used PV output power forecasting methods in the literature. Their popularity is due to their ability to effectively map the highly nonlinear relationship that exists between environmental input parameters and the PV power. A generic flowchart of the AI-based PV power forecasting process is shown in Fig. 3. The whole process can be summarized into three phases. A closer look at the figure shows that the forecasting model uses both power and weather parameter measurements as inputs. It should be noted, however, that several high-performing AI models exist in the literature that only require the measurement of PV power as input, especially for very short-term forecast horizons.

1) Input phase
AI is a data-driven method; therefore, an important step in AI-based PV power forecasting is data collection and analysis. This phase involves the collection of input-output data (environmental and PV power) from the site where the plant is located and the preprocessing of these data. Environmental data including solar irradiance and air temperature are usually gathered from the weather station at or near the plant. Similarly, PV output power records can be obtained from the SCADA (Supervisory Control And Data Acquisition) system in the case of large plants or from data loggers in the case of smaller plants.
Preprocessing consists of (but is not limited to) dealing with missing values, outlier detection, data resampling, data scaling, and time series decomposition. Widely used methods to deal with missing values in PV forecasting include dropping the affected rows from the training data (if the missing data are unimportant and/or few) and data imputation using interpolation techniques. Data resampling is also required when the different input data have different granularity.
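As a minimal sketch of the imputation option mentioned above, the following Python function fills gaps in a power series by linear interpolation. The choice of linear interpolation, the function name `interpolate_missing`, and the sample values are assumptions made for illustration; the reviewed works may use other interpolation schemes.

```python
def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known samples.

    Leading and trailing gaps are filled with the nearest known value.
    """
    values = list(series)
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest known indices on each side of the gap (None if absent).
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical PV power samples (kW) with missing readings marked as None.
power_kw = [0.0, 1.0, None, None, 4.0, None]
filled = interpolate_missing(power_kw)
```

Long gaps (for example, a logger outage lasting hours) are usually better dropped than interpolated, since a straight line cannot reproduce cloud-driven variability.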
The presence of outliers in the training data can lead to inaccurate prediction and may result in requiring a longer time to fully train the model. It is therefore very important to filter out the outliers that exist both in the power series and environmental data. These kinds of data are usually observed in the early periods of sunrise and sunset. In colder climates, snow can be the main reason for outliers during winter. Authors in [14] used the Hampel filter to detect and remove outliers from the measured PV output power. To benefit from various filtering algorithms, Pan et al. [15] used an ensemble filter algorithm to manage abnormally extreme weather inputs.
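To make the outlier-removal step more concrete, the sketch below shows a basic Hampel filter in plain Python. It is an illustration of the general technique and not the implementation used in [14]; the window size, the threshold of three scaled MADs, and the sample data are assumed values.

```python
from statistics import median


def hampel_filter(series, half_window=3, n_sigmas=3.0):
    """Replace points that deviate from the local median by more than
    n_sigmas scaled median absolute deviations (MAD)."""
    k = 1.4826  # scales the MAD to a standard-deviation estimate for Gaussian data
    cleaned = list(series)
    outliers = []
    for i in range(len(series)):
        lo = max(0, i - half_window)
        hi = min(len(series), i + half_window + 1)
        window = series[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        if abs(series[i] - med) > n_sigmas * k * mad:
            cleaned[i] = med  # replace the outlier with the local median
            outliers.append(i)
    return cleaned, outliers


# Hypothetical power readings (kW) with one spike at index 3.
power_kw = [1.0, 1.1, 1.2, 9.0, 1.3, 1.4, 1.5, 1.6]
cleaned, outliers = hampel_filter(power_kw)
```

Because the filter relies on the median rather than the mean, a single spike does not distort the local reference value it is compared against.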
Data scaling is important to obtain a high-performing model both in terms of accuracy and computational resource requirement. Due to the Gaussian distribution nature of PV power and other parameters, the most common data scaling technique in PV power forecasting is normalization [16], [17]. This rescales all the input features so that their values always lie between 0 and 1. It is given by (1), where x is the value in SI units, x_min and x_max are the minimum and maximum values of feature X respectively, and x_nor is the new scaled value.
$$x_{nor} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (1)$$
Another very significant preprocessing step that is frequently used in AI-based PV power forecasting is time series decomposition. Common tools that are used for treating time-series signals are WPD (Wavelet Packet Decomposition) [12] and WT (Wavelet Transform) [18]. These methods offer filtering ability and thus result in better performance characteristics.
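The min-max normalization described above can be sketched in a few lines of Python. The function name and the irradiance values are hypothetical and serve only to illustrate the rescaling to the range 0 to 1.

```python
def min_max_scale(values):
    """Rescale a feature to [0, 1] using its minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical solar irradiance samples in W/m^2.
irradiance = [0.0, 250.0, 500.0, 1000.0]
scaled = min_max_scale(irradiance)
```

In practice the minimum and maximum are normally computed on the training set only and then reused for validation and test data, so that no information from unseen data leaks into the model.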

2) Training phase
The training phase involves selecting a particular type of AI algorithm and training the algorithm with the training dataset. Evaluating the model with the validation dataset and finetuning the internal parameters of the algorithm to further improve performance is also part of the training process. The common AI algorithms that are widely used for PV forecasting are briefly discussed in Subsections II-B1 and II-B2 below.
For conventional ML algorithms, fine-tuning the internal parameters is usually done using extensive grid searching and optimization algorithms such as PSO (Particle Swarm Optimization) [18], GA (Genetic Algorithm) [19], [20], and ACO (Ant Colony Optimization) [15]. Due to the complex nature of the network, forecasting models based on DL (Deep Learning) algorithms use the trial and error approach to adjust their parameters.

3) Forecasting phase
Once the model is tested and validated to give satisfactory performance, it will be directly used on a new set of input test data for forecasting purposes. The input test data, in this case, is the forecasted weather data (usually from NWP (Numerical Weather Prediction)) unlike historical weather data used to train the model. Sometimes, post-processing can be implemented to further improve the prediction ability of the model.

B. COMMON AI ALGORITHMS
The common AI algorithms that are widely used in the literature for PV power forecasting can be broadly grouped into conventional ML models and DL models. A brief description of these models is given below.

1) Conventional ML Models
These are groups of ML algorithms that have limited ability to process data in its original form [21]. These methods require considerable understanding and expert domain knowledge. Consequently, the selection of features is an important step and requires careful engineering. The common conventional ML algorithms that are widely used for PV output power forecasting are discussed below very briefly.
• Support vector machines (SVM): SVM was originally introduced by Vapnik in 1992 to solve classification problems and was later extended to handle regression problems too. The input data is first mapped into a high-dimensional feature space through non-linear mapping so that a linear regression approximation can be achieved in that space. SVM generalization to SVR (Support Vector Regression) is achieved by introducing an ϵ-insensitive region around the function, called the ϵ-tube [22]. An optimization problem is solved to find the tube that best approximates the continuous-valued function, which is a tradeoff between model complexity and prediction error. The ϵ-insensitive loss function penalizes only predictions that are farther than ϵ from the desired output. The value of ϵ determines the width of the tube; a smaller value indicates a lower tolerance for error. Because deviations smaller than ϵ are ignored, SVR is less sensitive to noisy inputs, which makes the model more robust.
• Ensemble of trees: Ensembling is a technique that combines multiple weak machine learning models to create a more powerful model. Two common ensemble models that have proven to be effective on a wide range of datasets, including PV power forecasting, are random forests and gradient boosted decision trees, both of which use decision trees as their building blocks [23].
The theory behind an RF (Random Forest) is to average multiple DTs (Decision Tree) that are slightly different from each other and that suffer from high variance/overfitting, to build a more robust model that has a better generalization ability and is less prone to overfitting [24]. The DTs in an RF are randomized either by selecting the data points used to build a tree (bootstrapping) or by selecting different features in each split test or both.
In contrast to the RF approach which allows parallel running of DTs, GBDT (Gradient Boosted Decision Tree) works by building trees in a serial manner, where each tree tries to correct the errors of the previous step [23]. The learning rate is the hyperparameter that controls how strongly each tree tries to correct the errors of the previous trees.
• Multilayer Perceptron (MLP): MLP is a typical example of a feedforward ANN (Artificial Neural Network) where each layer serves as the input to the next layer without loops, and it is an entry point towards complex neural networks such as CNN (Convolutional Neural Network). A basic MLP network structure with one input layer, two hidden layers, and one output layer is shown in Fig. 4. The number of hidden layers and units in the MLP are hyperparameters that need to be optimized for a given problem. However, as more layers are added to a network, the error gradient that is calculated using the backpropagation algorithm becomes increasingly small [24]. This vanishing gradient problem makes model learning more challenging.
[Fig. 4 panel labels: Input Layer, Hidden Layer, Output Layer]
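Returning to the SVR item in the list above, the ϵ-insensitive loss can be illustrated with a short Python sketch. The function name and the numerical values are assumptions for illustration; the sketch shows only the loss, not the full SVR optimization problem.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss: zero inside the epsilon-tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical measured and predicted PV power (kW) with a tube half-width of 0.5 kW.
measured = [5.0, 5.0, 5.0]
predicted = [5.2, 4.0, 7.0]
losses = epsilon_insensitive_loss(measured, predicted, epsilon=0.5)
```

The first prediction falls inside the tube and contributes no loss, while the other two are penalized only by the amount by which they exceed ϵ. This is why small measurement noise has little influence on the fitted function.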

2) Deep Learning Models
The term deep learning refers to artificial neural networks with multiple layers. Networks with deeper hidden layers have recently begun to surpass the performance of conventional ML methods in various fields [25]. This is also evident from the literature on PV output power forecasting. The commonly used DL models for PV forecasting are CNN and LSTM (Long Short-Term Memory).
• CNN: CNN can automatically extract high-level features from raw input data, which are much more powerful than human-designed features [26]. It is a multi-level representation learning method, where abstract features are learned from raw data through successive non-linear transformations [27]. Consequently, it has brought significant improvement in the performance of DL models in various applications. The basic structure of CNN consists of convolutional layers, pooling layers, and fully connected layers.
→ Convolutional layer: A convolutional layer comprises a set of filters (a grid of discrete numbers) which are convolved with a given input to generate an output feature map [28]. The weights of each filter are learned during the training of CNN through multiple iterations.
→ Pooling layer: A pooling layer operates on blocks of the input feature map using pooling functions such as average or max. This operation effectively downsamples the input feature map and is useful for obtaining a compact feature representation. It has no learnable parameters.
→ Fully connected layer: Fully connected layers are typical feedforward neural network layers like MLP. Each unit is densely connected to all the units of the previous layer. They use non-linear combinations of the extracted features to make the final prediction.
• LSTM: Unlike feedforward neural networks, RNNs (Recurrent Neural Network) are ideal candidates for modeling time-dependent and sequential data problems, such as stock market prediction, machine translation, and time series prediction. However, conventional RNNs suffer from the problem of vanishing gradients. The gradients become too small, and the weight updates become very insignificant. This makes the learning of long-term dependencies difficult. An LSTM network is a popular RNN architecture that solves this problem of vanishing gradient. LSTM deals with the vanishing gradient problem by not imposing any bias toward recent observations; instead, it keeps a constant error flowing back through time [29]. This is made possible by the introduction of gates (input, forget, and output) into the internal structure of LSTM-based neurons (also called memory cells). This structure allows better control of the gradient flow and enables better preservation of long-term dependencies.
VOLUME 4, 2016
Fig. 5. The internal structure of LSTM memory cell
The internal structure of a typical LSTM memory cell is shown in Fig. 5 [24]. Here, ⊙ represents element-wise multiplication and ⊕ represents element-wise summation. C, H, and X represent the cell state, the hidden unit, and the input respectively. Similarly, f_t, i_t, g_t, and o_t denote the forget gate, input gate, input node, and output gate respectively. Finally, the sigmoid and hyperbolic tangent activation functions are represented by σ and tanh.
Each gate has a specific functionality. The forget gate decides which information from the previous time step to keep or discard. It is calculated using (2), where W_xf is the weight between the input and the forget gate, W_hf is the weight between the hidden unit and the forget gate, and b_f is the bias term for the forget gate.
$$f_t = \sigma(W_{xf} X_t + W_{hf} H_{t-1} + b_f) \qquad (2)$$
The input gate is responsible for updating the current cell state based on the cell state of the previous unit as updated by the forget gate (3)-(5), where W_xi is the weight between the input and the input gate, W_hi is the weight between the hidden unit and the input gate, W_xg is the weight between the input and the input node, W_hg is the weight between the hidden unit and the input node, and b_i and b_g are the bias terms for the input gate and input node respectively.
$$i_t = \sigma(W_{xi} X_t + W_{hi} H_{t-1} + b_i) \qquad (3)$$
$$g_t = \tanh(W_{xg} X_t + W_{hg} H_{t-1} + b_g) \qquad (4)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot g_t \qquad (5)$$
Finally, the output gate is computed from the input and the hidden unit of the previous time step, and its value is used to compute the hidden unit at the current time step (6)-(7), where W_xo is the weight between the input and the output gate, W_ho is the weight between the hidden unit and the output gate, and b_o is the bias term for the output gate.
$$o_t = \sigma(W_{xo} X_t + W_{ho} H_{t-1} + b_o) \qquad (6)$$
$$H_t = o_t \odot \tanh(C_t) \qquad (7)$$
A high-level qualitative comparison between the different AI-based PV power output forecasting algorithms is given in Table 1. In this table, explainability is defined as a term that indicates the level of understanding of the internal decision-making rules of a PV output power prediction model, as used in [30]. It can vary between the extreme cases of black-box (not interpretable directly) and white-box (easily interpretable) models. The choice of a particular algorithm for a given application should therefore be made considering all the parameters given in the table.
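As a rough illustration of the LSTM memory cell described earlier in this section, the following plain-Python sketch performs one forward step of a cell, assuming the standard LSTM formulation. The dictionary keys mirror the weight names used in the text; the function names and the toy zero-valued parameters are assumptions chosen so that the result is easy to verify by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Compute W_x @ x + W_h @ h + b for list-based matrices and vectors."""
    return [
        sum(wx * xi for wx, xi in zip(row_x, x))
        + sum(wh * hi for wh, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of an LSTM cell; returns (h_t, c_t)."""
    f_t = [sigmoid(z) for z in affine(p["W_xf"], x_t, p["W_hf"], h_prev, p["b_f"])]
    i_t = [sigmoid(z) for z in affine(p["W_xi"], x_t, p["W_hi"], h_prev, p["b_i"])]
    g_t = [math.tanh(z) for z in affine(p["W_xg"], x_t, p["W_hg"], h_prev, p["b_g"])]
    o_t = [sigmoid(z) for z in affine(p["W_xo"], x_t, p["W_ho"], h_prev, p["b_o"])]
    # Forget part of the old cell state and add the gated candidate values.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy cell with 1 input and 2 hidden units; all weights and biases are zero,
# so every gate equals 0.5 and the input node equals 0.
params = {}
for gate in ("f", "i", "g", "o"):
    params["W_x" + gate] = [[0.0], [0.0]]
    params["W_h" + gate] = [[0.0, 0.0], [0.0, 0.0]]
    params["b_" + gate] = [0.0, 0.0]

h_t, c_t = lstm_step([0.8], [0.0, 0.0], [1.0, -2.0], params)
```

With these toy parameters the forget gate halves the previous cell state, which makes the role of each gate visible without any training.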

III. ENVIRONMENTAL PARAMETERS AFFECTING PV POWER FORECASTING
Meteorological parameters are the most important factors that directly determine the performance of any PV power forecasting model. Solar irradiance, air temperature, wind speed, wind direction, relative humidity, air pressure, and cloud cover are some of the parameters that are widely used as input to a PV power forecasting model. However, solar irradiance is by far the most significant and universal (not area-specific) of all the parameters and it is directly related to PV output power. Sudden and abrupt changes in PV power production are determined by the movement of clouds. So, in regions where the weather changes quite frequently over a short period, cloud coverage can be an equally important parameter as solar irradiance. Correlation study between meteorological variables and measured PV power is, therefore, an important step in the design of a forecasting model. Selecting an optimum number of informative inputs is crucial to obtain a high-performing forecasting model which uses the smallest computational resources.
The relative importance of most meteorological parameters for PV output power forecasting highly depends on the geographical area where the plant is located [32]. This can be observed from the web chart shown in Fig. 6 for randomly selected locations. This figure shows the correlation of the different meteorological parameters with the PV output power. It can be seen from the figure that solar irradiance is a very important parameter irrespective of the plant's location. This is consistent with the observation that solar irradiance is used as an important input parameter in the majority of the literature reviewed in this paper and elsewhere. On the other hand, the importance of wind speed and relative humidity is highly affected by the location of the plant.
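A correlation study of the kind discussed above can be sketched as follows. The example ranks candidate inputs by the absolute value of their Pearson correlation with measured PV power. The function names and all sample values are made up for illustration and do not come from any of the reviewed datasets.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, power):
    """Sort features by the strength of their linear relationship with power."""
    scores = {name: pearson(values, power) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical measurements taken at the same five time steps.
power_kw = [0.0, 1.0, 2.0, 3.0, 4.0]
features = {
    "wind_speed": [3.0, 1.0, 4.0, 1.0, 5.0],
    "irradiance": [0.0, 200.0, 400.0, 600.0, 800.0],
}
ranking = rank_features(features, power_kw)
```

Pearson correlation captures only linear dependence, so a low score does not by itself prove that a parameter is uninformative for a non-linear model.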
Fig. 6. Relative importance of weather parameters for various locations based on correlation study with respect to the PV output power [12], [33]-[37]
In addition to the common meteorological parameters affecting PV output power forecasting, other location-specific factors such as snow and dust cover on PV panels should also be considered for regions with extreme weather conditions. The impact of snow and ice cover on the yield of a PV plant in colder climates during winter days is very complex [8]. It can vary from partial to total obstruction of solar radiation reaching the PV modules, which results in reduced generation or no generation at all. In contrast, on days when the modules are clear of snow but the surrounding area is covered with snow, the consequence is the opposite. In this case, the reflectance property of snow (albedo effect) tends to increase the solar radiation reaching the surface of the PV modules, and hence increases the PV power generation. If these situations are not taken into consideration in the design of a PV output power forecasting model for these regions, they will lead to huge forecast errors. Similarly, in arid and semi-arid areas, issues associated with the accumulation of dust should be taken into consideration. More on the case of PV in a colder climate is covered in the following section (Section IV).
Besides the parameters themselves, seasonal variation of the parameters is another key aspect worth considering in the design of a PV power forecasting model. This is because the performance of a prediction model is directly affected by the season of the year. This is evident from Table 2, which shows how the prediction performance (measured in terms of RMSE (Root Mean Squared Error) and MAPE (Mean Absolute Percentage Error)) of a forecast model depends on the season of the year. As seen in Table 2, the performance of the prediction models is better during the winter period in terms of RMSE.
In contrast, the worst performance is observed during the summer. The RMSE increased by 77%, 994%, 38%, 80% and 157% from winter to summer for [12], [16], [18], [38], and [39] respectively. A possible explanation for such observation can be that the weather changes quite frequently in summer and less in winter in these regions. The forecast model is unable to capture the fast and sudden variations. It should be noted however that this conclusion is only made with a limited reference and cannot be claimed to be the case for all regions. Another explanation can be in the definition of RMSE itself, which magnifies larger errors calculated between the measured and predicted power. No such conclusive statement can be made from the observation of MAPE across the seasons.
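The remark that RMSE magnifies larger errors can be checked with a small numerical sketch. The two hypothetical forecasts below have the same total absolute error, but one concentrates it in a single sample. Skipping zero measurements in MAPE (for example, night-time values) is an assumption made here to avoid division by zero.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mape(measured, predicted):
    """Mean absolute percentage error over samples with non-zero measurement."""
    pairs = [(m, p) for m, p in zip(measured, predicted) if m != 0]
    return 100.0 * sum(abs((m - p) / m) for m, p in pairs) / len(pairs)


# Hypothetical measured power (kW) and two forecasts.
measured = [10.0, 10.0, 10.0, 10.0]
forecast_even = [8.0, 8.0, 8.0, 8.0]     # four errors of 2 kW
forecast_spike = [10.0, 10.0, 10.0, 2.0]  # one error of 8 kW

rmse_even = rmse(measured, forecast_even)
rmse_spike = rmse(measured, forecast_spike)
mape_even = mape(measured, forecast_even)
mape_spike = mape(measured, forecast_spike)
```

Both forecasts give a MAPE of 20%, yet the RMSE doubles from 2 kW to 4 kW when the error is concentrated in one sample. A season with a few abrupt, poorly predicted ramps can therefore show a much higher RMSE without a comparable change in MAPE.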

IV. PV IN COLDER CLIMATE
Since the focus of this paper is on ML-based PV power forecasting in a Nordic context, it is important to emphasize the case of PV in a cold climate. This section covers how this unique climate offers ideal conditions for PV operation and, at the same time, how it brings its own challenges. Finally, ways to reduce these challenges are also covered in this section.

A. HOW COLD CLIMATE IS 'IDEAL' FOR PV
Many experiments show that solar cells perform very well under cold climate conditions. This is due to the low operating temperature inside the PV modules, which is the direct result of the low temperature of the area where the plant is located. Manganiello et al. [40] demonstrated that PV modules located in colder regions result in higher seasonal energy yield and negligible energy loss. Their work also indicated that such a high-latitude climate has the potential to realize seasonal matching of production and consumption. This is also evident from the work of [41]. Their study has shown that by optimally placing PV installations at a relatively higher tilt angle in colder climates, it is possible to bring the temporal production profile of PV into better correlation with typical electricity consumption patterns. They investigated and quantified the potential of PV installations that favor high winter irradiance, high ground-reflected radiation (albedo effect), and steeper panel tilt angles. Adaramola et al. [42] conducted similar work for a PV plant in Ås, Norway. An annual average daily yield of 2.55 kWh/kWp obtained from their analysis suggests that PV installations in this location and similar locations in Norway are technically feasible.

B. CHALLENGES OF COLD CLIMATE
Despite the opportunities that come with PV installations in colder climates, they also have a few important limitations. Lower sun angle, short solar duration hours, thick clouds, and soiling due to snow are the typical challenges that PV installations in high-latitude climates face during the winter. The impact of snow is by far the most significant one. Modeling the impact of snow is widely reported and extensively explored in the literature [43]-[45], but it is still difficult to fully understand and account for snow melting and sliding processes. In this paper, for better understanding, the impact of snow on the PV output power is treated as a three-stage process.

1) During snowing
During the time of fresh snow, the solar irradiation reaching the surface of the PV modules is also significantly reduced due to the thick cloud. Consequently, the power output from the PV system is close to zero, if not zero already.

2) Days after extended period of snowing
This is the most challenging period where the impact of snow is difficult to quantify precisely. The snow cover acts as a shading that prevents solar irradiation from entering the solar cells. The effect can vary from full shading resulting in zero generation to partial shading which results in a reduced generation. The impact of snow on yield in such cases depends on various factors such as snow depth, snow weight, the tilt angle of the module, ambient temperature, wind speed, and surface property of the module. Another important factor to account for the snow effect is the snow clearing process and the time required to clear a module completely.
An example of total obstruction of irradiance due to snow accumulation on the surface of the PV module can be seen from Figure 7, which is recorded for the same PV plant described earlier. On 27th November 2020, the PV output power is zero although there is a significant amount of solar radiation (a relatively clear sky day during the month of November in Norway). This can possibly be explained by the accumulation of snow on the PV modules (total obstruction). Such kinds of effects will lead to a huge forecast error if not taken into consideration. For the next two days, it is also possible to see that the snow is cleared from the modules and the plant is operating as it should be.
An early example of research into the impact of snow on the yield of a PV plant is the work of Becker et al. [43]. They obtained a strong correlation between the incidence of freshly fallen snow and a decrease in yield. An estimated average yearly yield loss between 0.3% and 2.7% was measured for a 1016 kWp PV plant in Munich, Germany between 1999 and 2004. What was surprising in their finding is the observed sliding of snow from the PV panels. Even for small values of solar irradiation (less than 100 W/m²) and temperature (less than 0 °C), the snow on the modules began to slip. Another important work, which quantifies the losses associated with snow for PV plants in Colorado and Wisconsin, is that of Marion et al. [46]. They developed a model that can estimate yield loss due to snow. The model uses daily snow depth, plane-of-array irradiance, air temperature, PV array tilt angle, and the extent of snow coverage on the PV array. The model performed well in most cases and gave results in agreement with the measured values. Similarly, a simplified two-stage snow modeling method is proposed in [44]. In the first stage, the estimated hourly solar energy output is calculated using module plane-of-array irradiance, module temperature, and a derating factor. In the second stage, a binary decision-making matrix is used to decide whether to disregard the estimated energy depending on the number of days since the last snowfall, snow depth, and ambient temperature. Despite its simplicity, this approach suffers from the binary decision-making process. It also ignores the fact that snow can be partly transparent depending on its depth, and it does not consider the time required for clearing the modules.
A snow loss prediction model that takes into consideration the transmittance property of snow is proposed in [45]. Snow-covered modules are modeled in such a way that they can generate power based on snow depth. The Bouguer-Lambert law was used to estimate the amount of insolation that is received on the surface of uniformly snow-covered PV modules. The results of their study showed that the orientation (portrait or landscape) of the modules plays a significant role in reducing the impact of snow in snowy conditions. This paper only considered snow sliding as the dominant snow removal process and did not consider snow melting and snow removal due to wind. The surface of the PV modules is also assumed to be uniformly covered by snow at every time. Despite the success of this work in bringing significant improvement in the modeling of snow loss, the fact that it ignored other snow clearing procedures and the assumption of uniform snow cover on the modules at all times can be important constraints.
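The Bouguer-Lambert law mentioned above states that radiation passing through a uniform medium is attenuated exponentially with the thickness of the medium. The following is a minimal sketch of that relationship for a uniform snow layer. The function name and the extinction coefficient value are assumptions made for illustration and are not the parameters used in [45].

```python
import math

def transmitted_irradiance(poa_w_m2, snow_depth_m, extinction_per_m):
    """Irradiance reaching the module surface under a uniform snow layer.

    Applies the Bouguer-Lambert law I = I0 * exp(-k * d), where k is an
    extinction coefficient (1/m) and d is the snow depth (m).
    """
    if snow_depth_m < 0 or extinction_per_m < 0:
        raise ValueError("depth and extinction coefficient must be non-negative")
    return poa_w_m2 * math.exp(-extinction_per_m * snow_depth_m)

# Hypothetical values: 500 W/m^2 plane-of-array irradiance and an assumed
# extinction coefficient of 50 1/m (placeholder, not a measured value).
for depth_cm in (0, 1, 2, 5, 10):
    g = transmitted_irradiance(500.0, depth_cm / 100.0, 50.0)
    print(f"{depth_cm:>3} cm snow -> {g:6.1f} W/m^2")
```

Under this assumption a thin snow layer still transmits a usable fraction of the irradiance, which is why a purely binary (covered or not covered) treatment can overestimate the loss.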
Due to the complex nature of factors such as the extent of snow cover, tilt angle, snow transmittance, and snow clearing processes on snow loss prediction, all the work we have seen above used at least one or two simplifying assumptions. Inspired by this limitation, Hashemi et al. [47] proposed a snow loss prediction model based on machine learning algorithms for a PV plant located in Ontario, Canada. They designed an algorithm that can capture all the inherent complexity using the available meteorological data from the location of the plant. A prediction model based on a gradient boosted tree resulted in the lowest mean squared error.
Further details on the impact of snow on yield reduction are beyond the scope of this paper and the reader is directed to Pawluk et al. [48] for more information. In this paper, they have discussed the impact of snow, identified factors that influence the generation loss, examined existing snow impact estimation techniques, and finally concluded by discussing various mitigation strategies to reduce the impact of snow.

3) Clear modules after snowfall
The effect of snowfall on the performance of a PV plant is not always negative, and it can be used to optimize system design in colder climates [49]. This is due to a phenomenon called the albedo effect. Albedo describes the reflectance of a surface, which is particularly high for snow, and it plays a huge role for PV in high-latitude and colder climates. The light reflected from the surroundings can be 3 to 6 times higher when the ground is covered in snow [8]. By properly choosing a location with a strong albedo effect and optimally sizing the tilt angle, it is possible not only to maximize generation but also to shift the temporal production patterns to match the typical demand [41]. Solar electricity generation could effectively be shifted from summer to winter without compromising the total annual yield.
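The combined influence of albedo and tilt angle can be sketched with the commonly used isotropic ground-reflection model, in which the reflected irradiance on a tilted plane is GHI × albedo × (1 − cos β)/2. This model and the albedo values below (roughly 0.2 for grass and 0.8 for fresh snow) are typical textbook assumptions and are not taken from [8] or [41].

```python
import math

def ground_reflected_irradiance(ghi_w_m2, albedo, tilt_deg):
    """Ground-reflected irradiance on a tilted module (isotropic model).

    ghi_w_m2: global horizontal irradiance in W/m^2
    albedo:   ground reflectance between 0 and 1
    tilt_deg: module tilt from horizontal in degrees
    """
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must be between 0 and 1")
    view_factor = (1.0 - math.cos(math.radians(tilt_deg))) / 2.0
    return ghi_w_m2 * albedo * view_factor

# Assumed illustrative albedo values (not from the cited works)
ALBEDO_GRASS = 0.2
ALBEDO_FRESH_SNOW = 0.8

for tilt in (0, 30, 60, 90):
    grass = ground_reflected_irradiance(200.0, ALBEDO_GRASS, tilt)
    snow = ground_reflected_irradiance(200.0, ALBEDO_FRESH_SNOW, tilt)
    print(f"tilt {tilt:>2} deg: grass {grass:5.1f} W/m^2, snow {snow:5.1f} W/m^2")
```

In this model the reflected contribution is zero for a horizontal module and grows with tilt, which is one reason steeper tilt angles are attractive where snow cover is persistent. With the assumed albedo values the snow-covered ground reflects four times as much as grass.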

C. HOW TO COPE WITH THE CHALLENGES
PV output power forecasting models that are designed for colder climates should take into consideration all the above limitations. Incorporating short-term hourly snow loss prediction models is very important, as they can play a key role in the operational management of electric grids in such locations. Either a correction strategy is applied to the PV power forecast to account for the loss due to snow, or snow cover and the albedo effect are considered as additional input parameters in the design of the forecast model in the first place.

V. CLASSIFICATION OF PV POWER FORECASTING MODELS
Various forecast models have been used effectively for PV output power forecasting. These models can generally be categorized into multiple groups based on forecast horizon, type of prediction model, time step, areal scale, and approach used (Fig. 8). It should be noted however that there is no universally accepted way to classify PV power forecasting models and numerous other approaches have been used widely in the literature.

A. BASED ON FORECAST HORIZON
Based on forecast horizon (the length of time into the future for which forecasts are to be made), PV power forecasting models can be categorized into three main groups: short-term, medium-term, and long-term forecasts. Although this classification is widely used, it is also common to find a fourth category in the literature (i.e., ultra-short-term forecast). Each forecast horizon serves different purposes and applications. It is worth noting that there is a lack of consensus among researchers in defining the boundary between different forecast horizons, and the groups usually overlap. Fig. 9 shows this overlapping nature and the different application areas of each forecast horizon.

1) Short-term Forecast
This includes a forecast horizon of up to one hour, several hours, one day, or even a week. This kind of forecast allows power system operators to ensure unit commitment, scheduling and dispatching. It is also crucial in PV-integrated energy management systems and in energy market operation [50]. Ultra short-term forecasts (from minutes to hours) on the other hand are highly beneficial to electricity pricing, power smoothing, and monitoring of real-time electricity dispatch.

2) Medium-term Forecast
Medium-term PV power forecasting is usually done for one week to one month. However, some also consider the forecast between one day and one week ahead as being in this group. This type of forecast is particularly important to schedule the maintenance of PV integrated power systems by considering the availability of generation in the future.

3) Long-term Forecast
Long-term PV power forecasting includes a forecast horizon of a month and up to a year. It is typically used by power system owners and operators for long-term planning of electricity generation, transmission, and distribution. The interested reader is directed to the work of [50] for further reading on forecast horizons for PV power output forecasting.
As weather parameters are usually difficult to forecast with acceptable accuracy beyond a certain short period, PV power forecasting also suffers as the horizon is increased. Table 3 shows how the performance of a prediction model degrades as the forecast horizon is increased. It can be seen from the data in the table that there is a noticeable difference in the value of the evaluation metrics as the forecast horizon is increased. For example, in [18], the RMSE increased by 37.5%, 87.5%, and 212.87% when the forecast horizon is increased from 3 hours to 6, 12, and 24 hours respectively. Such performance discrepancies can be particularly large in areas where the weather changes quite frequently over a short period. The Norwegian climate is a good example of this. Due to the frequent movement of clouds, cloud coverage and solar irradiance cannot be forecasted with high accuracy beyond a certain very short period. Poorly forecasted weather parameters directly result in inaccurate PV power output prediction. So, emphasis should be given to the choice of forecast horizon before starting to design a prediction model. Based on this point and closer examination of the discussion above, a short-term or ultra-short-term forecast could be a good choice for the Nordic climate. This is also evident from the work of [51]. Their results suggest that for a PV plant in Norway, a PV output power forecast cannot be made for a forecast horizon longer than 1 hour ahead without introducing a significant error.
Comparative study of the performance of PV power prediction models across all the forecast horizon categories has received limited attention in the literature. This issue has been addressed by the recent work of [53]. Here the authors evaluated the performance of various ML algorithms such as LR (Linear Regressor), PR (Polynomial Regressor), DTR (Decision Tree Regressor), SVR, RFR (Random Forest Regressor), MLP, and LSTM for short-term (24 hours ahead), medium-term (1 week ahead), and long-term (1 year ahead) forecasts. Their overall analysis shows that a PV power forecast model based on RFR resulted in better performance. This kind of study is very helpful for identifying a forecast horizon that specifically matches the weather conditions of a given area.

B. BASED ON MODELS USED
Based on the particular type of model they use, PV power forecasting methods can generally be categorized into four groups: Physical, Statistical, AI-based, and Hybrid models. The first group depends on physical modeling of the PV plant and weather system, but the remaining are data-driven methods that rely only on the measurement of PV power and other environmental parameters. It is important to note that there is no universally suitable approach for classifying PV power output forecasting based on the type of models they use. Different approaches have been proposed in different works of literature.
Authors in [50] classified PV power forecasting models into three groups consisting of a physical model, persistence model, and statistical models. The statistical model included both the traditional time-series forecasting methods (ARIMA (Auto Regressive Integrated Moving Average) and SARIMA (Seasonal ARIMA)) and models based on  [54] where PV power forecasting models are classified into two groups: model-based and data-driven approaches. An alternative and comprehensive approach, which is also similar to the one used in this paper, is developed by [32]. Here they classified PV power forecasting models into physical, statistical, ML, persistence, and hybrid methods. To summarize, the classification of PV output power forecast models is usually a matter of subjective opinion, a view with which the authors of this paper agree, and we believe that the system of classification adopted in this paper satisfies the needs of the study.

1) Physical Models
In this approach, the PV plant is first modeled mathematically with plant-specific parameters such as module inclination angle and module efficiency. Then this model is used directly to calculate the PV generation with forecast information of solar irradiance and temperature obtained from NWP. This approach is particularly important as part of a feasibility study to determine the amount of PV generation before it is constructed. Due to its dependence on the forecast of NWP, which has poor resolution for short periods, this approach is usually used for long-term forecast purposes [55], [56].
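A physical model of this kind can be sketched as a direct calculation from forecast irradiance and temperature. The sketch below uses two widely used simplifications, the NOCT cell-temperature estimate and a linear power temperature coefficient. The function names and default parameter values are assumptions for illustration and do not reproduce the models in [55] or [56].

```python
def cell_temperature(ambient_c, poa_w_m2, noct_c=45.0):
    """Estimate cell temperature with the NOCT approximation.

    noct_c is the nominal operating cell temperature, defined at
    800 W/m^2 irradiance and 20 degC ambient (assumed default 45 degC).
    """
    return ambient_c + (noct_c - 20.0) / 800.0 * poa_w_m2

def pv_power_kw(poa_w_m2, ambient_c, rated_kwp,
                temp_coeff_per_c=-0.004, noct_c=45.0):
    """DC power from forecast irradiance and ambient temperature.

    temp_coeff_per_c is the relative power change per degC above 25 degC
    (assumed -0.4 %/degC, a typical order of magnitude for silicon modules).
    """
    t_cell = cell_temperature(ambient_c, poa_w_m2, noct_c)
    return rated_kwp * (poa_w_m2 / 1000.0) * (1.0 + temp_coeff_per_c * (t_cell - 25.0))

# Hypothetical 100 kWp plant under the same irradiance on a cold and a warm day
p_cold = pv_power_kw(800.0, -10.0, 100.0)  # approx. 83.2 kW
p_warm = pv_power_kw(800.0, 25.0, 100.0)   # approx. 72.0 kW
print(p_cold, p_warm)
```

Under these assumptions the same irradiance yields more power at low ambient temperature, in line with the discussion of cold climates in Section IV-A. The accuracy of such a model depends directly on the quality of the NWP inputs.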

2) Statistical Models
Statistical models assume that the future value of the target variable is a linear function of past observations and random errors. Popular statistical forecasting methods for PV output prediction include ARIMA and SARIMA. SARIMA is an improved version of ARIMA, and it is designed to support seasonality which is an important characteristic of PV power data.
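The assumption that the next value is a linear function of past observations can be made concrete with a first-order autoregressive model, y(t) = c + φ·y(t−1), fitted by ordinary least squares. This is a deliberately reduced sketch: it is not ARIMA or SARIMA, since it has no differencing, moving-average, or seasonal terms, and the series below is synthetic.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def predict_next(series, c, phi):
    """One-step-ahead forecast from the last observation."""
    return c + phi * series[-1]

# Synthetic series generated exactly by y[t] = 1 + 0.5 * y[t-1]
power = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]
c, phi = fit_ar1(power)
print(c, phi, predict_next(power, c, phi))
```

Because the model is linear in the past values, it cannot represent abrupt, cloud-driven changes in PV power, which is the limitation discussed in the following paragraph.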
Kushwaha et al. [57] implemented SARIMA for very short-term PV power forecast and concluded that despite the satisfactory performance of the model during clear sky days, the performance degraded significantly on cloudy days where frequent weather changes are quite common. To overcome such shortcoming, SARIMA is usually used in a combination with other techniques (Wavelet-SARIMA) and models. Implementing this approach is the work of [58], where SARIMA is used together with ANN in a parallel structure. In doing so, they showed that both the accuracy and resilience of the prediction model are improved. Based on the work of [57] and others, statistical methods are not recommended forecasting models for Nordic climate, given the high variation of weather parameters over a very short period.

3) AI-based Models
The majority of works in PV output power forecasting in the literature are implemented using AI algorithms. These methods rely on the ability of a model to learn from historical data and to further refine its predictive ability through training. AI-based PV output power forecasting models in this section include both the conventional AI/ML algorithms and the DL algorithms as described in Sections II-B1 and II-B2 respectively. Here we give an overview of how these algorithms are used in the design of a PV power forecasting model. The ensemble method is also discussed as a technique that improves the performance of a prediction model.
The performance of various ML models including ANN, Linear Regression, M5P Decision Tree, and GPR (Gaussian Process Regression) is compared in [59] for a PV plant in Qatar. CFS (Correlation Feature Selection) and RFS (Relief Feature Selection) techniques are used to identify the relevant features for the models. A forecast model based on the ANN outperformed all the other models in terms of RMSE and R². The prediction accuracy and error distribution of SVM and ANN are compared in [10]. This information is in turn used to estimate the capacity of the energy storage system required to absorb mismatch in energy trading applications. Both methods gave satisfactory results, but ANN marginally outperformed SVM. Authors in [60] assessed the performance of ANN, SVR, and RT for predicting the power output of a PV plant against a PM (Persistence Model). Their comparative analysis shows that the ANN outperformed the other models, resulting in the lowest normalized RMSE and MAPE of 0.6% and 0.76% respectively.
Three commonly used conventional AI/ML models in PV output power forecasting, i.e., MLR (Multiple Linear Regression), GB (Gradient Boost), and ANN are compared in [61]. These models are tested with different training windows and features for a PV site in the state of Florida. The forecast model based on GB resulted in the lowest RMSE and lowest variance. ANN produced the lowest performance in terms of both accuracy and variance of forecast results. Similar work that implements conventional ML algorithms for PV output power forecasting has also been pursued by others. SVM [62], BNN (Binarized Neural Network), SVR, and RT [63], and ANN [64], [65] are a few examples. An important limitation of implementing the above methods arises when they are used for a PV plant located in a region where large and sudden weather variations over a short period are frequent. Both the forecast accuracy and stability of the prediction model suffer in these conditions. To reduce this effect, an ensemble technique is usually applied. The predictions of individual ML models are combined to make a model that has a characteristic of better generalization and robustness.
Ensemble of optimized ANN models is used in [54]. Here, they used the bagging technique to create diversified base estimators that can capture different aspects or characteristics of the PV power series. Statistical aggregation strategy (median) was used to make the final prediction. The ensemble approach showed superior performance in comparison with the individual ANN models and a smart PM. Conceptually identical work, utilizing DT instead of ANN as a base estimator, was proposed by [66]. Two tree-based ensemble methods, i.e., RF and ET (Extra Trees) were compared with the SVM model. The significance of their ensemble approach is marginal at most in terms of improved accuracy over SVM, but much better forecast stability was achieved.
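The bagging idea with a median aggregation step can be sketched as follows. For brevity, the base estimator here is a simple least-squares line rather than the ANN used in [54] or the DT used in [66], and the irradiance and power values are synthetic. All names are hypothetical.

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def bagged_median_forecast(x, y, x_new, n_estimators=25, seed=0):
    """Train base estimators on bootstrap resamples and return the median prediction."""
    if len(set(x)) < 2:
        raise ValueError("need at least two distinct input values")
    rng = random.Random(seed)
    n = len(x)
    predictions = []
    while len(predictions) < n_estimators:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples that cannot define a line
        ys = [y[i] for i in idx]
        intercept, slope = fit_line(xs, ys)
        predictions.append(intercept + slope * x_new)
    return statistics.median(predictions)

# Synthetic training data: power (kW) proportional to irradiance (W/m^2)
irradiance = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0]
power = [0.05 * g for g in irradiance]
print(bagged_median_forecast(irradiance, power, 900.0))
```

Using the median rather than the mean makes the aggregate less sensitive to a few base estimators that were trained on unrepresentative resamples.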
Similarly, an ensemble of DTs but with a boosting algorithm is implemented in [16]. Unlike bagging which randomly selects subsamples from the data set and train each estimator, boosting uses the same data set but each estimator learns from the last prediction sequentially. This ensemble model outperformed both the statistical model (ARMA) and SVM in all seasons of the year. Other literature that implements an ensemble approach for PV power forecasting includes the work of [67] and [68] where they used NN (Neural Network) and KDE (Kernel Density Estimation) respectively as base estimators.
A closer look at the above works and other pieces of literature reveals an important gap and shortcoming of PV output power forecasting models based on the conventional ML algorithms. That is, even though these methods perform better than the statistical approaches, they still suffer from over-fitting and insufficient generalization to fully capture the highly nonlinear characteristics of PV power. They are more effective in a region where the weather stays relatively stable. In other words, they perform well only in cases where the deterministic component of the PV power (which is explained by the location of the sun) is more dominant than the stochastic component (which is explained by the movement of clouds). In an area where the weather changes quite frequently over a short period (i.e., when both the stochastic and deterministic components are equally important), such as Norway, there is a need for a better approach. Motivated by this, several DL models are proposed for short-term PV power forecasting. A concise summary of some of the work in the literature implementing conventional ML algorithms is given in Table 4 in Appendix A. The key finding, the forecast horizon, and the input parameters are important points to notice here. The term conventional ML algorithms here represents all AI algorithms (including shallow NN) except those based on deep learning networks.
The performance of three DL PV power forecasting models (LSTM, CNN, and CLSTM (Convolutional LSTM)) is compared in [69] for a 23.4 kW PV plant in Alice Springs. For a smaller data set (half-year data), the performance of all three models was poor. A possible explanation for this result is the lack of adequate training samples to extract the spatial and temporal features required to make a good forecast. The LSTM model was able to extract the temporal features for an input data size of one and a half years. Hence, LSTM outperformed both CNN and the hybrid model (CLSTM). As the size of the data set is further increased (3 years), the CLSTM model, which incorporates the advantages of both LSTM (temporal features) and CNN (spatial features), showed superior performance. An important finding of this work is that the input sequence length (training dataset size) plays a huge role in the performance of a DL prediction model. A longer input sequence doesn't necessarily result in better performance.
For a location where the weather (especially solar radiation) changes quite frequently over a short period, PV output power prediction becomes an even more challenging task. To address this issue, Zhang et al. [70] proposed Autoencoder-LSTM. The autoencoder reduced the uncertainties between perceptron mapping in the training process in response to complex weather conditions. The performance of this architecture was compared with conventional LSTM, FNN (Feedforward Neural Network), and PM models for various locations with different weather conditions. The autoencoder-LSTM based model resulted in the best accuracy for a day-ahead PV power forecast on 15-minute time steps.
Another work involving the LSTM network, similar in principle to [70] but using an attention mechanism instead, was proposed in [38]. Here an ensemble of two LSTM neural networks is proposed for power and module temperature time series data. The attention mechanism is added so that the forecast model can adaptively focus only on input features that are more important to the current output. In comparison with other forecast models such as conventional LSTM, PM, ARIMAX, and MLP, the proposed method has the best accuracy for various forecast horizons. The performance improvement is more apparent for forecast horizons longer than 15 minutes. Similar works involving DL algorithms have also been pursued by others, including LSTM [71]-[73], CNN [74], [75], and GRU (Gated Recurrent Unit) [76]. A brief summary of some of the works in PV output power forecasting using DL algorithms is given in Table 5 in Appendix A. It is apparent from this table that LSTM is the most widely used DL algorithm for PV output power forecasting.

4) Hybrid Models
Due to the complex nature of PV output power forecasting, a single ML/DL model is usually unable to fully capture the highly non-linear characteristics between the inputs and output. Inspired by this, there are many works in the literature where more than one technique or model is used for PV power forecasting. This approach is commonly known in the literature as a hybrid model. The definition of a hybrid model has not been consistent throughout different works of literature. Here it is used to represent a model which can be a combination of the above three PV power forecasting models or techniques from multiple other domains.
One way to form a hybrid model is to use an optimization algorithm to fine-tune the internal parameters of a conventional ML model. Pan et al. [15] used SVM as a base forecasting model and I-ACO (Improved ACO) optimization technique to fine-tune the internal parameters of the model. The R² score, RMSE, and MAE of the hybrid model are significantly improved to 0.997, 0.1868 kW, and 0.1569 kW respectively. Conceptually identical work, utilizing SVM, is proposed in [20]. Here, the authors used GA instead of I-ACO for optimization purposes and significant performance improvement is achieved as compared with the base SVM model without optimization. The RMSE and MAPE reduced from 680.85 W and 100.47% to 11.23 W and 1.7052% respectively.
The performance of a hybrid model can be further improved by including time series decomposition algorithms on the above approach (i.e. using optimization techniques). A hybrid forecast model combining the SVM model, signal decomposition (WT), and optimization technique (PSO) is proposed in [18] for a 480 kW PV plant in Beijing, China. The WT decomposes and filters historical PV power measurements and weather parameters into subcomponents and these noise-treated subcomponents are used to train the SVM model. The PSO is used to optimally tune the internal parameters of the SVM. Their result demonstrated the adequacy of the proposed method in terms of higher forecasting accuracy. A similar approach that combines WPD and LSTM is implemented in the work of [12]. WPD decomposes the power series into four low (represent the power output trend) and high (represent varying and random output) frequency components. These components are trained by four LSTM neural networks and the final prediction is made by the linear weighting method. Their result showed significant forecast accuracy improvement over other methods such as LSTM, GRU, RNN, and MLP in all four seasons of the year. The RMSE improved by 77.3% as compared with the forecast model involving LSTM without WPD.
Another technique where a hybrid model can be implemented to increase the forecast accuracy is to use a weather classification algorithm at the initial stage of the forecasting process. Based on the classification result, two approaches can be implemented. First, each weather cluster can be trained with a separate model and the final prediction will be the aggregate of the prediction from each model. Second, only a subset of the original data set that is similar to the forecast day will be used to train a common model for all clusters.
Implementing the first approach is the work of authors in [77]. They trained separate RF models for each cluster that is obtained using historical PV power data. Each model predicts on a new test sample, and the final prediction is made by integrating the prediction results from each cluster according to the assigned weight by the ridge regression algorithm. A PV forecasting model that implements the second approach is proposed by Jiang et al. [78]. Here, the authors used PCC (Pearson Correlation Coefficient) to measure the similarity between different days based on meteorological variables, and only samples similar to those from the target forecast day are selected as the training set to train GA-optimized ELM (Extreme Learning Machine). They were able to achieve an accurate forecast model that is also stable and computationally efficient. What can be seen from their work is that clustering analysis can effectively improve the prediction accuracy and stability of solar output power forecasting. A comprehensive summary of various hybrid PV output power forecasting models is given in Table 6 in Appendix A.
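The second approach, selecting only historical days that resemble the forecast day, can be sketched with the Pearson correlation coefficient. The sketch compares a single irradiance profile per day, whereas [78] uses several meteorological variables. The threshold, the day labels, and the profiles below are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / math.sqrt(var_a * var_b)

def select_similar_days(history, target_profile, threshold=0.9):
    """Return labels of historical days whose profile correlates with the target."""
    return [day for day, profile in history.items()
            if pearson(profile, target_profile) >= threshold]

# Hypothetical irradiance profiles in W/m^2 at five times of day
history = {
    "day_a": [0.0, 220.0, 520.0, 210.0, 0.0],   # clear-sky shape
    "day_b": [0.0, 300.0, 100.0, 350.0, 0.0],   # midday cloud cover
}
target = [0.0, 200.0, 500.0, 200.0, 0.0]        # forecast for the target day

print(select_similar_days(history, target))
```

Only the samples from the selected days would then be used to train the forecast model, which reduces the training set and keeps it representative of the expected weather.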

C. BASED ON TIME STEP
PV power forecasting models can be divided into two groups depending on the number of future time steps under consideration [52]. A forecast model that requires only the prediction of the next immediate time step is called a single time step forecast. On the other hand, a forecast problem that requires the prediction of more than one time-step is called a multiple time-step forecast. The number of time steps to look into the future depends on the specific application area where the forecasting will be implemented. The more time steps to be predicted into the future, the more challenging the task will be. This is the direct result of both the compounding nature of uncertainty on each forecast time step and the stochastic nature of input weather parameters.

1) Single Time Step Forecast
Most PV power forecasting models in the literature fall in this category. This involves forecasting a single time step into the future on a minute, 5 minutes, 15 minutes, or hourly basis. This category of forecast is relatively simple and accurate. It is more appropriate and ideal especially in areas where the weather changes frequently in a short interval.

2) Multiple Time Step Forecast
In some applications such as scheduling and dispatching, predicting only the next time step is not sufficient, and a multiple time step forecast becomes very important. The two widely used approaches for multiple time step forecast are the direct and iterative methods (Fig. 10). In the direct approach, a separate model is developed for each time step; if the model is NN-based, multiple output units representing each time step are used in the output layer. In contrast, the iterative approach uses a single model and appends the prediction from a previous time step as an input for the next step. Although the iterative approach is relatively less complex, it suffers from error accumulation from each previous prediction step. If the true observation of the power is available (which is the case in PV output power forecasting), this value can be used instead of the predicted value as part of the input for making the prediction on the next time step. This way it is possible to eliminate the error propagation problem.
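The difference between the direct and iterative strategies can be sketched with a very simple base model. Here each model is a least-squares line that maps the latest observation to a future value. This is an assumption made to keep the example short and is not the model used in any of the cited works.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_direct(series, horizon):
    """Direct strategy: one separate model per step h, mapping y[t] to y[t+h]."""
    return [fit_line(series[:-h], series[h:]) for h in range(1, horizon + 1)]

def forecast_direct(models, last_value):
    return [c + s * last_value for c, s in models]

def forecast_iterative(one_step_model, last_value, horizon):
    """Iterative strategy: a single one-step model fed with its own predictions."""
    c, s = one_step_model
    forecasts = []
    current = last_value
    for _ in range(horizon):
        current = c + s * current  # the prediction becomes the next input
        forecasts.append(current)
    return forecasts

# Synthetic series generated by y[t] = 1 + 0.5 * y[t-1]
series = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25, 2.125]
direct = forecast_direct(fit_direct(series, 3), series[-1])
iterative = forecast_iterative(fit_line(series[:-1], series[1:]), series[-1], 3)
print(direct)
print(iterative)
```

On this noise-free series both strategies agree. On real PV data the iterative forecasts inherit the error of every earlier step, while the direct strategy requires training and maintaining one model per step.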
Rana et al. [52] proposed a new hybrid approach for multiple time step forecast that integrates data re-sampling technique with an ML algorithm. For every prediction step, the time-series input data is first re-sampled to a new representation that is ideal for that particular step. Then a single step ML model is trained on the new re-sampled data avoiding the error at higher prediction steps. They have shown that their approach is a better alternative to the conventional approaches in the literature for multiple steps ahead prediction. The technique proposed in their work is very inspirational, but an important constraint can be when downsampling the original time series, a piece of important information describing the true variation in PV power can be lost, especially for higher time steps.
The performance of the three multiple time step forecast approaches discussed above was compared in [79] for a PV plant located at the University of Queensland in Brisbane, Australia. They used a heterogeneous ensemble model which dynamically assigns weights to the individual predictions based on the error information from the previous and current time steps. Their results show that a multiple-step forecast based on the approach proposed by [52] results in better accuracy. This approach achieved a 12.2% and 24.8% performance improvement in terms of MAE when compared with the direct and iterative approaches respectively.

D. BASED ON AREAL SCALE
PV output power forecasting models can also be grouped into single-field and regional forecasts based on the number of PV plants and the areal coverage considered in the study. All the PV power forecasting models discussed so far in this paper involve a single plant and thus belong to the single-field category. A regional forecast, on the other hand, involves a group of PV plants spread over a wider area. This kind of forecast is particularly important for large grid operators for planning unit commitment, determining reserve requirements, contingency analysis, and energy storage dispatch [80].

1) Single-field Forecast
Single-field forecast implies the prediction of solar output power from a single PV plant. This approach is more accurate and effective because the weather condition of the geographical area where the plant is located can be precisely represented by localized parameters.

2) Regional Forecast
Integrating PV plants spread over different locations into a grid that connects many regions is a major technical challenge for system operators, who have to ensure the balance between production and consumption at all times. This also has major economic implications for energy trading companies operating across multiple regions (e.g., Nord Pool in the case of Europe). It is therefore important to have a regional PV power forecast that aggregates the PV power prediction of individual plants located in various geographical areas. Two common approaches exist in the literature for regional PV power output forecasting: bottom-up and up-scaling approaches (Fig. 11).
• Bottom-up Approach: In this approach, prediction for each PV plant in the regional area under consideration is made first, and then the results are aggregated statistically to obtain the regional forecast. This is shown in Fig. 11 (a). This approach is not widely used practically as it requires large computational resources and detailed knowledge about each plant in the region. Its application is limited to the cases where there are few PV plants in the area under consideration.
• Up-scaling Approach: This can be implemented in two ways. In the first approach, the various PV sites in the forecast region are replaced by a virtual power plant, and prediction is made directly at the regional level by using input data aggregated at a lower level (Fig. 11 (b)). In the second approach, representative PV sites are sampled carefully and then the PV power forecast of such plants is re-scaled to obtain the regional power prediction according to the total capacity in the area (Fig. 11 (c)). Choosing the right PV plant subsets that represent the spatial distribution of the overall region is equally important as the prediction itself in this approach [82].
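The bottom-up approach and the second (reference-plant) variant of the up-scaling approach can be contrasted in a short sketch. It assumes that aggregation is a plain sum and that re-scaling is proportional to installed capacity; the function names and the plant values are hypothetical.

```python
def bottom_up(plant_forecasts_kw):
    """Regional forecast as the sum of every plant's own forecast."""
    return sum(plant_forecasts_kw)


def up_scale(reference_forecasts_kw, reference_capacities_kwp, regional_capacity_kwp):
    """Rescale the specific yield of the reference plants to the regional capacity."""
    specific_yield = sum(reference_forecasts_kw) / sum(reference_capacities_kwp)
    return specific_yield * regional_capacity_kwp


# Hypothetical region: four plants, the first two used as reference sites.
forecasts = [40.0, 80.0, 20.0, 60.0]      # kW, per plant
capacities = [100.0, 200.0, 50.0, 150.0]  # kWp, per plant
print(bottom_up(forecasts))
print(up_scale(forecasts[:2], capacities[:2], sum(capacities)))
```

In this toy case every plant has the same specific yield, so both results coincide. In practice the up-scaled value deviates whenever the reference plants are not representative of the rest of the region.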
A regional PV power forecast implementing a bottom-up approach for PV plants located in Luxembourg is proposed in the work of Koster et al. [83]. The hourly regional power generation is predicted using exogenous input (solar irradiance) obtained from NWP and a physical PV performance model. The prediction results from the proposed approach were compared with the true measured values and high accuracy is achieved with an average RMSE value of 7.4%. Authors in [81] compared the performance of different ML-based regional forecasting methods using the up-scaling approach for PV plants located in the regions of Italy and the Netherlands. The forecast model based on an analog ensemble algorithm outperformed all other models including RF and GBDT. Aillaud et al. [84] used nationally aggregated PV generation data and meteorological inputs from NWP for regional PV power output forecast in Germany. A forecast model based on CNN-LSTM gave the best performance when compared with other conventional and DL models. This is directly attributed to the ability of this hybrid model to extract both spatial (CNN) and temporal (LSTM) features from the input.
An up-scaling approach using a sample of reference plants is the practical and economical way of estimating PV output power on a regional level. However, this approach has its own challenges and introduces some uncertainties. These uncertainties arise from the sparsity of reference plants, the number of unknown plants, and, most importantly, the difference in characteristics between the reference and unknown plants in the area. Saint-Drenan et al. [85] addressed and quantified these uncertainty issues for 366 PV plants in the regions of Germany. Their analysis shows that the RMSE decreased as the number of reference plants increased and as the number of unknown PV plants decreased. Another interesting finding of their work is that the RMSE varied between 0.01 and 0.025 kW/kWp depending on the choice of the reference plants.
To improve the performance of the regional PV output power forecast model using the up-scaling approach, authors in [86] used module orientation as an additional input variable. They determined the module orientation of the unknown PV plants using GIS-based (Geographic Information System) data sources and spatial interpolation techniques. Their approach resulted in an improvement of the RMSE by 5% when compared with the conventional up-scaling approaches.
The impact of local variability of weather parameters is minimal in the regional PV power forecast due to the spatial smoothing effect. However, it can still be very challenging to make regional forecasts for some locations such as Norway. Norway is an elongated country where the southern and western parts are fully exposed to the Atlantic Ocean and experience very different weather characteristics than the eastern and northern areas. Furthermore, the inland regions of Norway also have their own unique climate. This significant weather variation across the country makes the task of determining the sample PV plants to be used in the up-scaling approach very difficult. Another major bottleneck that makes regional forecasting challenging in Norway is the uneven distribution of PV plants across the country, which again directly affects the choice of reference samples for the up-scaling approach. Based on the data from NVE [4] for 2020, this uneven distribution is shown in Fig. 12 and Fig. 13. These figures show the spatial distribution of PV plants across the country in different electricity power market zones. From these figures, it is possible to see that almost all (≈ 98%) of the PV capacity in Norway is concentrated in the southern and western parts of the country, which negatively affects PV power prediction at the regional level.

E. BASED ON APPROACH
Depending on the type of forecast value they return, forecast models can be grouped into two categories: point forecast and probabilistic forecast.

1) Point Forecast
Most works in the literature fall in this group. In this approach, the goal is to determine the precise value of the power production for each forecast step. The limitation of this approach is that it ignores information such as upper and lower bounds of possible forecasts that are very valuable for system operators and energy trading companies [55].

2) Probabilistic Forecast
The probabilistic forecast provides, in addition to precise values, prediction intervals within which the forecast is expected to fall with some predefined confidence level or probability. The additional information about the uncertainty of the prediction is very important to decision-makers such as PV-based electricity market operators [87]. This knowledge is used to provide a precise generation schedule for the day-ahead and real-time markets. A day-ahead probabilistic PV output power forecasting model based on QRNN (Quantile Regression Neural Network) is proposed in [87] for a 7 MW PV plant in the south-east of Spain. The internal settings of the NN and the size of informative inputs are optimized using a GA. Their analysis shows that a point forecast corresponding to the QRNN for quantile 0.5 has lower RMSE compared with a persistence reference model and with the best MLP model.
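Quantile regression models are commonly trained by minimizing the quantile (pinball) loss, which penalizes under- and over-prediction asymmetrically. The sketch below shows this loss in its standard form; it is not taken from [87], and the sample measurements are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical measurements (kW) and two constant forecasts.
measured = [2.0, 4.0, 6.0, 8.0, 10.0]
low, high = [4.0] * 5, [8.0] * 5
print(pinball_loss(measured, low, 0.1), pinball_loss(measured, high, 0.1))
print(pinball_loss(measured, low, 0.9), pinball_loss(measured, high, 0.9))
```

For a low quantile level the lower forecast scores better, and for a high level the higher one does. At q = 0.5 the loss reduces to half the MAE, which is why the 0.5 quantile can serve as a point forecast.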
The authors in [54] used the bootstrap technique to quantify the uncertainties associated with each forecast. This technique allowed predictions to be made with a wider prediction interval at a confidence level of 84%. This kind of forecast is very important for proper planning, scheduling, and generation control of the available energy sources. It is also crucial information for ensuring power system reliability and for efficient energy market operation.
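One simple way to obtain such an interval is to resample historical forecast errors around a point forecast. The sketch below assumes this residual-bootstrap variant, which is not necessarily the procedure followed in [54]; the function name, the error values, and the default parameters are hypothetical.

```python
import random


def bootstrap_interval(point_forecast, past_errors, level=0.84, n_boot=2000, seed=0):
    """Percentile interval built from resampled historical forecast errors."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(point_forecast + rng.choice(past_errors) for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * (n_boot - 1))]
    upper = samples[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper


# Hypothetical past errors (measured minus forecast, kW) and a new point forecast.
errors = [-1.5, -0.8, -0.3, 0.0, 0.2, 0.4, 0.9, 1.6]
low, high = bootstrap_interval(50.0, errors)
print(low, high)
```

The width of the interval reflects the spread of past errors, so a model with larger historical errors automatically reports a wider band.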
Thus far, this work has argued that solar electricity can be a good and economical alternative to conventional ways of generating electricity even in the colder climates of the Nordic regions and Canada. The cold climate can provide a unique opportunity in terms of seasonal matching of generation and consumption, and of improving the efficiency of the PV modules. However, the cold climate also brings important challenges to the operation of PV plants in these regions, such as the soiling effect due to snow. Frequent and abrupt change of weather parameters over a short period is another important challenge that is worth considering in these regions. Despite these limitations, this work argued that accurate PV output power forecasting that considers the needs of this region's climate can further improve the adoption and deployment of PV plants.
Section II addressed the basic procedures and steps that are commonly followed in the design of AI-based PV power forecasting models. The discussion of the common AI algorithms that are widely used in relation to PV power forecasting was also part of this section. The different environmental parameters that directly affect PV power forecasting are discussed in section III. A more focused discussion on the case of PV in a cold climate is covered in section IV, where both the merits and demerits of the cold climate are thoroughly covered. It is argued that the advantages that come with the cold climate outweigh the challenges, and that a PV power forecasting model that takes this into consideration can result in a more efficient system. Finally, the classification of PV power forecasting models based on multiple criteria was presented in section V. In the two sections that follow, a thorough discussion of the main findings from this review paper, together with a final comment and conclusion, is provided.

VI. DISCUSSION
In recent years, solar electricity is gaining huge interest in the cold and high latitude climate regions such as the Nordic countries and Canada. Although the cold climate offers many opportunities, the unique climate in these regions also makes the task of integrating PV into the existing grid more challenging. Accurate PV output power forecasting can overcome this limitation and make PV power an equal contributor to the energy mix in these regions. This paper presented a comprehensive review and evaluation of the state of the art in the use of AI/ML algorithms for PV output power forecasting in the Nordic region. The main goals of this review paper were to identify relevant input parameters and evaluate their importance in PV power forecasting and assess the performance of various PV output power forecasting models. The analysis of the impact of the forecast time step, forecast horizon, and areal scale on the performance of a prediction model was also the point of interest in this review. This review was also aimed at qualitative assessment of some of the techniques that are used to enhance the accuracy and stability of a prediction model. The following are some of the important issues that are discussed in this review work.
The selection of weather parameters as inputs to a PV output power forecasting model is a vital step in achieving a good model. Proper feature selection and using an optimum number of informative inputs ensure a high-performing model with minimal computational resource requirement. This is one of the important required characteristics of a PV forecast model for practical deployment. Since the weather in the Nordic region, especially in Norway, varies very frequently over a short period, it results in fast and abrupt changes in PV output power. This study indicates that such variations can only be precisely accounted for by including solar irradiance and cloud coverage information as inputs in the prediction model.
The impact of the soiling effect due to the accumulation of snow on PV modules is another significant issue that should be taken into consideration in the design of a PV output power forecast model in the Nordic region. This effect can vary from partial to total obstruction of solar irradiance reaching the surface of the PV module, resulting in a reduced or zero generation of PV power. In general, prior works are limited to the calculation and prediction of yield reduction due to snow cover. To the authors' knowledge, no previous study has investigated how to directly incorporate the impact of snow on PV output power forecasting.
This study suggests two possible strategies that can be used to incorporate this effect in the design of a PV output power forecast model. The first strategy is to consider snow depth and snow cover as additional inputs in the initial stages of the design process of a forecast model. In this way, the algorithm learns how to associate snow depth and snow cover with PV output power. In the second strategy, on days with significant snow cover, a correction term is added to the forecast output that is obtained without taking snow accumulation as an input feature. These strategies provide new insight into the problem and can serve as the first step towards a more profound understanding of PV output power forecasting in Nordic regions. The success of the two proposed approaches, however, depends greatly on the precise measurement of snow depth and snow cover on the surface of the PV modules. This can be of major concern, and the authors believe that this information may be currently unavailable from most or all PV plants in the Nordic region. This suggests a definite need for effort to be directed towards installing measurement stations for monitoring the snow effect. It is also very important to bear in mind that these approaches require addressing the issues related to the snow clearing process due to snow sliding and snow melting. More research is also needed to understand and consider the reflective property of snow and ice (albedo effect) in PV output power prediction in Nordic regions.
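The two conceptual strategies can be sketched as follows. This is only an illustration under strong assumptions: the feature layout, the snow-cover threshold, and the linear loss used for the correction term are all hypothetical choices and are not prescribed by this review.

```python
def with_snow_features(weather_features, snow_depth_cm, snow_cover_fraction):
    """Strategy 1: extend the model input vector with snow measurements."""
    return list(weather_features) + [snow_depth_cm, snow_cover_fraction]


def snow_corrected(forecast_kw, snow_cover_fraction, threshold=0.2):
    """Strategy 2: add a (non-positive) correction term on snowy days only."""
    if snow_cover_fraction < threshold:
        return forecast_kw  # no significant snow: keep the snow-free forecast
    # Assumed linear loss: output drops in proportion to the covered fraction.
    correction_kw = -snow_cover_fraction * forecast_kw
    return max(0.0, forecast_kw + correction_kw)


# Hypothetical inputs: (irradiance in W/m2, temperature in degrees C), snow data.
print(with_snow_features([450.0, -5.0], 12.0, 0.5))
print(snow_corrected(80.0, 0.5), snow_corrected(80.0, 0.0))
```

In the first strategy the model itself learns the effect of snow during training, whereas the second keeps the existing snow-free model untouched and adjusts only its output.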
The forecast time step and forecast horizon are also important parameters that directly affect the accuracy of a prediction model. These parameters should be chosen considering the stability of the weather condition where the plant is located. This study shows that for a relatively stable weather condition, a prediction model with a forecast time step of up to 1 hour and a forecast horizon of up to one week or more can be designed without significantly compromising the accuracy. On the other hand, if the weather condition of the PV plant's location is frequently changing over a short period, such as in Norway, a prediction model with a forecast time step and a forecast horizon of more than one hour could lead to substantial forecast error.
This review also shows that, similar to the forecast time step and horizon, the choice of a particular type of ML algorithm for PV output power forecasting depends on the weather condition of the area where the plant is located. For stable weather conditions, the deterministic component of the PV output power (which is explained by the movement of the sun) is more dominant than the stochastic component (which is explained by the movement of clouds). In such cases, conventional ML algorithms such as RF and SVM can result in a sufficiently good-performing prediction model. However, in areas where the stochastic component is as important as the deterministic component, the conventional ML algorithms are found to be mostly inadequate. In cases like this, DL algorithms such as LSTM and CNN have been implemented to overcome this inherent limitation. It was found in our analysis that such approaches can capture highly complex input-output relationships and result in a high-performing prediction model.
This research has identified several techniques and approaches that have been implemented to improve the performance (accuracy and stability) of a PV power prediction model. One such technique is the use of an ensemble approach. Ensemble techniques such as bagging and boosting, which aggregate the prediction ability of multiple base estimators, are found to considerably improve the poor generalizing ability and the over-fitting problem of conventional ML methods. Such an approach can be particularly beneficial to enhance the performance of conventional ML-based PV power forecasting models in a region with highly varying weather conditions.
Time-series decomposition is another technique that is mostly applied to improve the performance of a PV power forecast model. WT and WPD are the two commonly utilized and efficient time-series decomposition tools that are used as preprocessing steps to treat time series data against noise. In this way, the algorithm can be trained on the noise-treated data, which is found to improve the training speed and prediction accuracy substantially.
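As an illustration of wavelet-based noise treatment, the sketch below applies a one-level Haar wavelet transform, removes small detail coefficients by hard thresholding, and reconstructs the signal. The choice of the Haar wavelet, the single decomposition level, and the threshold value are assumptions made for brevity; the reviewed works may use other wavelets and multiple levels.

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar wavelet transform with hard thresholding of details."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # low-frequency part
    detail = [(a - b) / s for a, b in pairs]  # high-frequency part
    detail = [d if abs(d) > threshold else 0.0 for d in detail]  # drop small details
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])  # inverse transform
    return out


# Hypothetical PV power samples (kW): a small fluctuation and a real ramp.
print(haar_denoise([1.0, 3.0, 10.0, 2.0], threshold=2.0))
```

With this threshold the small fluctuation in the first pair is smoothed out, while the larger change in the second pair is preserved.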
This research work has also shown that for PV output power forecasting models based on DL algorithms, autoencoder and attention mechanism techniques can be included to improve performance. These approaches denoise the input time series data and ensure that the algorithm adaptively focuses only on the input features that are more important to the current prediction and avoids interference from other features. This enables the prediction model to handle uncertainties in the training process and deal with complex and unstable environmental conditions.
Weather clustering is another approach that this study has identified as a useful tool with enormous potential for improving the performance of a prediction model. This approach ensures that the model is trained, and the subsequent prediction is made, using only data from days that have similar characteristics to the target forecast day. This method not only results in higher accuracy and stability but also needs fewer computational resources, which is a required attribute in the practical deployment of a PV power prediction model.
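The underlying idea of training only on days that resemble the target day can be sketched with a nearest-neighbor similar-day selection. This is a simplified stand-in for a full clustering algorithm; the daily features, their values, and the function name are hypothetical.

```python
def similar_days(history, target, k=2):
    """Return the indices of the k historical days closest to the target day."""
    def dist(day):
        # Euclidean distance between two daily feature vectors.
        return sum((a - b) ** 2 for a, b in zip(day, target)) ** 0.5

    ranked = sorted(range(len(history)), key=lambda i: dist(history[i]))
    return ranked[:k]


# Hypothetical daily features: (mean cloud cover fraction, mean temperature in degrees C).
history = [(0.1, 5.0), (0.9, 2.0), (0.2, 4.0), (0.8, 1.0)]
print(similar_days(history, target=(0.1, 4.8)))
```

In practice the features would need to be standardized first so that no single variable dominates the distance, and the selected days would then form the training set for the forecast model.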

VII. CONCLUSION
In general, the findings in this research suggest the need to adapt the available AI/ML algorithms in the literature for the design of PV output power forecasting to match the unique climate of the Nordic region. More emphasis should be given to the impact of the soiling effect due to snow during the winter season in addition to the highly varying weather conditions. This ensures an accurate PV power forecasting model which potentially is a better and more economical alternative to other methods to stabilize a grid with high penetration of PV. This work therefore can serve as a groundwork for further research into the design of a high-performing AI/ML PV power forecast model and can contribute in several ways to the limited current literature available for PV in the Nordic region.

[10] | 2019 | Korea | 1 day | GHI, SD, CC, SH, RH, Precipitation, Temp, and WS | ANN and SVM | The effect of PV power prediction errors is characterized on energy storage system-based PV power trading markets
[16] | 2018 | Oregon, USA | 1 day | GHI and Temp | GBDT | High accuracy, stable performance, and strong model interpretability
[36] | 2018 | Jiangsu, China | 5 minutes | GHI, Temp, WS and WD | SVR and ARIMA | Weather classification and pattern recognition algorithms increased the forecasting accuracy
[54] 2019 Amman, Jordan