Solar Power Forecasting Using Deep Learning Techniques

The recent rapid growth of solar photovoltaic (PV) technology presents a challenge for the electricity sector agents responsible for the coordination and distribution of electricity, given the direct dependence of this technology on climatic and meteorological conditions. Therefore, the development of models that allow reliable short-term prediction of solar PV generation will be of paramount importance in order to maintain a balanced and comprehensive operation. This article discusses a method for predicting the short-term power generated by photovoltaic power plants by means of deep learning techniques. To this end, a deep learning technique based on the Long Short Term Memory (LSTM) algorithm is evaluated with respect to its ability to forecast solar power data. The performance of the LSTM network is evaluated and compared with that of the Multi-layer Perceptron (MLP) network using the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), and Coefficient of Determination ($R^2$). The prediction results show that the LSTM network gives the best results for each category of days. Thus, it provides reliable information that enables more efficient operation of photovoltaic power plants in the future. The combination of deep learning and energy efficiency seems to have a promising future, especially regarding promoting energy sustainability, decarbonization, and the digitization of the electricity sector.


I. INTRODUCTION
Renewable energy, especially solar PV, will gain prominence as a major source of energy in the future. But as its share of the energy mix grows, ensuring the safety, reliability, and profitability of power generation assets will be a top priority. Therefore, the successful integration of solar energy into the electrical grid requires an accurate prediction of the power generated by photovoltaic panels. The unpredictable behavior of solar energy in particular brings with it a series of problems for power generation, such as voltage variations, power factor issues, and stability problems. For this reason, new tools are constantly being created that contribute to the prediction of future events, with the aim of reducing prediction errors [1]. Auto-Regressive Integrated Moving Average (ARIMA) models have proven to be extremely useful for the short-term prediction of high-frequency time series. In contrast to ARIMA models and other statistical methods, artificial neural networks are more powerful, especially in representing complex relationships that exhibit nonlinear behaviors. In recent decades, the application of artificial neural networks (ANNs) in time series prediction has grown due to the characteristics that make ANNs well suited to working with nonlinear models. Likewise, the development of applications that facilitate simulations with ANNs continues to increase. In [2], artificial neural networks are highlighted as one of the prediction methods for time series thanks to their great adaptability and capacity to solve nonlinear and complex problems. In recent years, as a result of research on artificial intelligence, deep learning based on ANNs has become very popular due to its capability to accelerate the solution of some difficult computational problems [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Giambattista Gruosso.
While multi-layer perceptron (MLP)-type ANNs can be used to model complex relationships, they are incapable of assimilating the long- and short-term dependencies present in historical data [4]. These dependencies refer to the ability of an ANN to identify and remember behavior patterns from the distant past and the near past, respectively. An ANN with this ability is needed to make predictions of sequential data behavior [5]. As an attempt to address this problem, the first Recurrent Neural Networks (RNNs) emerged in the 1980s, where the term "recurrent" refers to the internal feedback loops of these networks. However, these networks presented a great disadvantage, known in the literature as the vanishing gradient problem. This problem refers to the difficulty of training these networks with gradient-based methods or backpropagation algorithms. When this type of method calculates the weight adjustment using the chain rule, factors between 0 and 1 are multiplied together n times. Multiplying numbers less than 1 together n times produces an exponential decrease in the error signal, which significantly affects the training of these RNNs [6]. Due to these drawbacks, Long Short Term Memory (LSTM)-type RNNs emerged in 1997 as a solution, thanks to the memory units in their network cells [7]. The prediction of solar radiation is key to increasing the supply of electrical energy generated from this source to the distribution networks. In the energy market, when an electric energy producer does not comply with the scheduled offer, it is penalized in proportion to the deviation between the energy actually produced and the amount stated in the contract. The integration of these renewable energy sources intensifies the complexity of managing the grid and maintaining the ongoing balance between electricity consumption and production, due to their unpredictable and intermittent nature.
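The exponential decay of the error signal behind the vanishing gradient problem can be illustrated numerically. The following is a minimal sketch and not part of the original study: the function name `backpropagated_signal` and the per-step factor of 0.9 are assumptions chosen only for illustration.

```python
def backpropagated_signal(factor: float, n_steps: int) -> float:
    """Magnitude of a unit error signal after being multiplied by `factor` n_steps times."""
    signal = 1.0
    for _ in range(n_steps):
        signal *= factor  # one chain-rule factor per time step
    return signal


if __name__ == "__main__":
    # Illustrative factor between 0 and 1 (assumed value, not taken from the paper).
    for n in (1, 10, 50, 100):
        print(f"n = {n:3d}: remaining signal = {backpropagated_signal(0.9, n):.3e}")
```

Even with a factor as close to 1 as 0.9, the signal shrinks by several orders of magnitude over 100 steps, which is why long sequences are hard to train with a plain RNN.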
Several tools and methodologies have been developed for the prediction of solar energy at different horizons. In this study, the task of predicting the photovoltaic power for the coming days at short time intervals (30 minutes), from data previously recorded during one year, was addressed using Long Short Term Memory (LSTM) Recurrent Neural Networks (RNNs). This is a type of ANN that estimates the next value from the data received in the past and the current input data. It is compared with a multi-layer perceptron (MLP) neural network, whose performance is limited because it learns only the one-to-one relationship between input and output and does not consider the characteristics of time series data. The results of each scheme are evaluated using the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), and Coefficient of Determination ($R^2$). A preliminary study of the data was performed to find patterns and apply certain corrections, such as replacing missing data with the average of the previous 60 minutes' values (the 12 previous values). The remainder of the paper is organized as follows: Section 2 provides a review of previous works on the topic, Section 3 provides an explanation of the various deep learning techniques based on LSTM and MLP, Section 4 presents and discusses the results, and Section 5 concludes the paper with suggestions for future work.

II. LITERATURE REVIEW
Artificial Neural Networks (ANNs) are of great importance in contemporary research on the mathematical modeling of systems based on renewable energies. They have been reported to give better solar forecasting results than those obtained using traditional statistical methods [8]. In the presence of noisy data, the LSTM model provides a more accurate prediction than the ARIMA model [9]. The use of deep learning techniques has allowed excellent results to be obtained in classification and regression problems. This is due to their automatic adjustment of internal parameters by means of supervised learning algorithms [10]. Research has focused on describing various artificial intelligence (AI) techniques used in renewable energy for prediction and optimization [11]. Existing artificial intelligence tools have been used for the modeling and simulation of solar energy systems [12]. One study examines the importance and benefits of artificial neural networks in predicting environmental variables and estimating some unconventional energy systems, and describes the main methodologies and neural models used to date [13]. Recently, LSTM has become one of the most widely used artificial neural networks for short-term wind speed predictions [14], [10]. The LSTM has been proposed for making 24-hour wind speed forecasts for wind farms. That work compares the results obtained by LSTM with those of MLP, deep versions of MLP, and classical methods. LSTM showed better prediction results than MLP and classical methods [15]. Deep learning based on neural networks is widely used in solar radiation applications. An evaluation of the performance of the multilayer perceptron (MLP) and boosted decision trees combined with linear regression for the estimation of solar energy showed that the MLP model performed better according to the coefficient of determination ($R^2$), which was 97.7%, and the RMSE, which was 0.033 [16].
A long-term prediction approach was used with the MLP network and an iterative strategy; however, the results show that the support vector machine (SVM) model with a direct strategy yields better results [17]. LSTMs are able to process large amounts of data and to generalize, that is, to adapt to unseen data, which allows them to obtain better results than SVM-based models [18]. The Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) has been proposed to accurately predict the output power of solar PV systems using hourly data sets for a year, and its results have been compared with those of multiple linear regression (MLR), bagged regression trees (BRT), and neural network (NN) methods for photovoltaic prediction. The LSTM networks showed lower prediction error than the other methods [19]. Long short term memory (LSTM) neural networks were used to predict solar panel power and wind power in the medium and long term. The errors obtained were lower than those of the persistence model and the SVM model [20]. As an alternative to the conventional approaches, ANNs have been successfully applied to estimating solar irradiance. The multilayer perceptron (MLP) structure is the most common neural network [21]. An MLP method has been proposed to predict solar radiation for the next 24 hours using current data of mean solar radiation and air temperature for a region in Italy [22].
VOLUME 10, 2022

III. METHODOLOGY
In this section, the sources of information and data processing methods are described, and the building of deep learning-based neural networks to predict solar radiation is presented.

A. SOURCES OF INFORMATION
In this study, MATLAB (R2019b) was used for the training process of the LSTM, an advanced RNN architecture, to predict the values of future time steps of a sequence. A sequence-to-sequence regression LSTM network was trained, where the responses are the training sequences with values shifted by one time step. That is, for each time step of the input sequence, the LSTM network learns to predict the value of the next time step. To evaluate the effectiveness of the proposed method, a case study was performed using a data set obtained from Nova Scotia Community College in Halifax, Nova Scotia, Canada, which covers one year (from January 1, 2017, to December 31, 2017). For each day, data was selected only during daylight hours, from 8 am to 5 pm. The original photovoltaic data were collected at 5-minute intervals and included 365 × 120 = 43800 measurements. Each missing value is replaced by the average of the previous 60 minutes' values. The data is summarized at 30-minute intervals, as the task is to make a forecast for each half-hour of the next day. Thus, there are 20 values in one day and 20 × 365 × 1 = 7300 values in one year. The data is then normalized to the interval [0, 1]. Upon completion of the forecasting process, the results obtained for the proposed models are compared with the actual values. Because the observation recording period is 5 minutes, there are 12 observations within one hour and, in turn, 288 (12 × 24) observations within a full day. The number of past values of the series that are considered to forecast the target vector is independent of the size of the target vector and depends in each case on the algorithm used and the nature of the problem. In this strategy, a single execution of the algorithm, with the past data that it needs, produces the forecast for the required horizon. Figure 1 presents the original photovoltaic data during daylight hours, from 8 am to 5 pm, at 5-minute intervals.
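The preprocessing steps described above (filling missing values, summarizing to 30-minute intervals, and normalizing) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, missing values are assumed to be encoded as `None`, a gap with no preceding data is assumed to be filled with 0.0, and "summarized" is assumed to mean averaging six consecutive 5-minute samples.

```python
from statistics import mean
from typing import List, Optional


def fill_missing(values: List[Optional[float]], window: int = 12) -> List[float]:
    """Replace each missing value (None) with the mean of up to `window` preceding values.

    With 5-minute data, window=12 corresponds to the previous 60 minutes.
    """
    filled: List[float] = []
    for v in values:
        if v is None:
            history = filled[-window:]
            # Assumption: a gap with no preceding data is filled with 0.0.
            v = mean(history) if history else 0.0
        filled.append(float(v))
    return filled


def aggregate(values: List[float], factor: int = 6) -> List[float]:
    """Average consecutive blocks of `factor` samples (six 5-minute samples = 30 minutes)."""
    usable = len(values) - len(values) % factor  # drop an incomplete trailing block
    return [mean(values[i:i + factor]) for i in range(0, usable, factor)]


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]
```

Applied to one day of 120 five-minute samples, `aggregate` returns 20 half-hourly values, which matches the 20 × 365 = 7300 values per year stated in the text.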

B. PREDICTION TECHNIQUES FOR SOLAR POWER FORECASTING
Artificial Neural Networks (ANNs) are part of the area of knowledge of Artificial Intelligence (AI) and Deep Learning. They are simulated through computer programs to mimic the human capacity to learn, memorize, and find relationships. ANNs, in particular, attempt to reproduce the behavior of biological neural networks in an extremely simple way [23]. Their ability to learn nonlinear relationships and to model complex systems has made them a useful tool in different scientific fields [24], [25]. The basic unit of ANNs is the artificial neuron, which is a simplified mathematical abstraction of the behavior of a biological neuron. ANNs are made up of a large number of artificial neurons grouped in layers and highly connected to each other, which work together to solve a problem. Deep learning techniques used in forecasting can be characterized by several features, among them the stages implemented for the approximation, the accuracy with which the power generation is approximated, the convergence time, and the uncertainty associated with the forecast. In the case of neural networks, studies have demonstrated the ability of this technique to accurately model time series data.

C. MULTILAYER PERCEPTRONS (MLP)
The Multilayer Perceptron (MLP) network is the most popular ANN architecture used in solving scientific problems [26], due to its demonstrated ability to approximate nonlinear relationships [27]. MLPs are a type of ANN capable of modeling complex functions. They tend to ignore noise and irrelevant inputs, they can easily adapt their weights, and they are easy to use. The learning process of an MLP can be divided into two main phases: the feeding of data through the inputs of the MLP, and the correction of the errors, during which the errors are calculated by comparing the real data against the response delivered by the model and are propagated back through a technique known as backpropagation. This iteration is repeated multiple times to reduce the error, using a training algorithm to obtain a better fit, Bayesian Regularization being one of the most common algorithms [28]. A multilayer perceptron is made up of an input layer, an output layer, and one or more hidden layers, although it has been shown that for most problems a single hidden layer will suffice [29], [30]. In Figure 2, a neuron $j$ is represented, where $x_i$ are the inputs, $w_{ij}$ are the weights that relate each input $i$ to the neuron $j$, and $y_j$ is the output.
The neuron performs two types of operations: first, the propagation rule, and then, the activation function. The propagation rule ($Z$) is defined by the inputs and the synaptic weights. The most commonly used rule is the sum of the products of the inputs $x_i$ and the weights $w_{ij}$ that join them to the neuron $j$. This operation represents a linear function that passes through the origin. To remove this limitation, a parameter called the threshold $b_j$ is added. This can be considered one more input, with a fixed value of 1, whose weight $b_j$ must also be updated. The propagation rule is calculated as

$$Z_j = \sum_{i} w_{ij} x_i + b_j$$

where $Z_j$ is the result of the propagation rule applied to neuron $j$, $x_i$ is input $i$, $w_{ij}$ is the weight that joins input $i$ with neuron $j$, and $b_j$ is the threshold associated with neuron $j$. The activation function ($A$) is responsible for evaluating the activation of the neuron and obtaining the output of neuron $j$. It is determined from the result of the propagation rule as

$$A_j = f(Z_j)$$

where $A_j$ is the activation of neuron $j$ and $f$ is the activation function.
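The two operations of a single neuron, the propagation rule followed by the activation function, can be sketched as follows. This is a minimal illustration with hypothetical function names; the logistic function is used here only as one example of an activation function $f$.

```python
import math
from typing import Callable, Sequence


def propagation_rule(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Weighted sum of the inputs plus the threshold: Z_j = sum_i w_ij * x_i + b_j."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def logistic(z: float) -> float:
    """Logistic (sigmoid) activation with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(
    x: Sequence[float],
    w: Sequence[float],
    b: float,
    f: Callable[[float], float],
) -> float:
    """Activation of the neuron: A_j = f(Z_j)."""
    return f(propagation_rule(x, w, b))


if __name__ == "__main__":
    # Arbitrary example inputs, weights, and threshold (not from the paper).
    print(neuron_output([0.2, 0.7], [0.5, -0.3], 0.1, logistic))
```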
The most common activation functions are step, linear, and sigmoidal functions, among which the logistic function and the hyperbolic tangent stand out. Depending on the type of problem to be solved, one type of response or another will be required. For classification problems, where binary outputs are desired, activation functions of the sigmoid type are often used. These functions have a narrow range of output values, with saturation at the extremes. The most used are the logistic function, whose output ranges from 0 to 1, and the hyperbolic tangent, whose output ranges from −1 to 1. On the other hand, regression problems usually use functions of a linear type, since more variability in the response is needed. A neural network is organized into layers of neurons that are connected to each other. A layer is understood to be the set of neurons that are located at the same level in the network and that process information at the same time. In addition to the activation function of each neuron, the behavior of the neural network depends on the topology and on the training carried out to establish the values of the weights.
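The layered organization described above can be illustrated with a forward pass through a small MLP. This is a sketch under assumptions, not the network used in the study: it uses one hidden layer with a hyperbolic tangent activation and a linear output layer, as is common for regression, and all names and weights are hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per neuron in the layer


def dense(x: Vector, weights: Matrix, biases: Vector,
          activation: Callable[[float], float]) -> Vector:
    """Apply one fully connected layer: every neuron sees all inputs at once."""
    return [
        activation(sum(w * x_i for w, x_i in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x: Vector, w_hidden: Matrix, b_hidden: Vector,
                w_out: Matrix, b_out: Vector) -> Vector:
    """Hidden layer with tanh (output in [-1, 1]) followed by a linear output layer."""
    hidden = dense(x, w_hidden, b_hidden, math.tanh)
    return dense(hidden, w_out, b_out, lambda z: z)


if __name__ == "__main__":
    # Arbitrary example weights: 2 inputs, 2 hidden neurons, 1 output.
    y = mlp_forward([0.4, 0.6],
                    [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1],
                    [[1.0, -1.0]], [0.05])
    print(y)
```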

D. LONG SHORT TERM MEMORY (LSTM)
Due to the rapid growth of the field of deep learning, which has manifested itself in the development of new technologies and architectures as well as greater integration in many areas of research, structures such as long short-term memory (LSTM) have recently been used in the development of new solar forecasting techniques [31]. LSTM networks are currently one of the most popular models in deep learning applied to time series prediction. This kind of prediction is a difficult problem due to the presence of long-term trends, seasonal and cyclical fluctuations, and random noise [32]. LSTM structures are distinguished by their ability to model and recognize temporal patterns in data sequences, thanks to their memory cells and the way information is transferred between their units. In this way, they are able to process the sequences and records of the available operational data, thus extracting temporal information that allows forecast errors to be reduced [9]. In general, past events influence future events, and based on this idea, the recurrent neural network has a structure where information from the previous step is passed to the next step and used for estimation. Accordingly, the recurrent neural network has achieved great results in estimating sequential information, that is, time series data. However, as the length of the past data required for estimation increases, the vanishing gradient problem occurs in the conventional recurrent neural network. This problem can be solved by using the gates of the LSTM algorithm, as proposed in [7], [33]. Long Short-Term Memory (LSTM) networks are structurally similar to the MLP: they have an input layer, hidden layers, and an output layer. However, the hidden layer of an LSTM contains a memory unit [18]. LSTMs are a type of ANN classified as Recurrent Neural Networks, which are characterized by not being strictly feedforward, unlike MLPs.
This is because LSTMs use inputs from previous iterations for future output calculations, thus providing feedback. This type of ANN is potentially more powerful than the MLP and is characterized by temporal behavior [15]. The memory unit consists of three gates, the input gate ($i_t$), the forget gate ($f_t$), and the output gate ($o_t$), and a recurrent connection, as shown in Figure 3. The unit has one input $x_t$ and two feedbacks from its previous state: the previous state output $s_{t-1}$ and the state variable $c_{t-1}$. The gates use a sigmoid activation function $g$, while the states use a tanh function. The memory unit of the LSTM can be defined by a set of equations, where $w$ are the weight parameters and $b$ are the biases [18].
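In general, the LSTM structure consists of three different layers: forget, input, and output. In the LSTM architecture, first, the $x_t$ and $s_{t-1}$ information are used as inputs, and it is decided which information to delete. These operations are done in the forget layer $f_t$, which in the standard LSTM formulation is determined by

$$f_t = g\left(w_f \, [s_{t-1}, x_t] + b_f\right)$$

where $g$ is the activation function, which is the sigmoid in this work, and $[s_{t-1}, x_t]$ denotes the concatenation of the previous state output and the current input.
In the second step, the input layer $i_t$, where new information will be determined, is given by

$$i_t = g\left(w_i \, [s_{t-1}, x_t] + b_i\right)$$

Then, the candidate information that will form the new information is expressed by

$$\tilde{c}_t = \tanh\left(w_c \, [s_{t-1}, x_t] + b_c\right), \qquad c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

where $\odot$ denotes element-wise multiplication. Finally, the output data is obtained by using the following expressions in the output layer:

$$o_t = g\left(w_o \, [s_{t-1}, x_t] + b_o\right), \qquad s_t = o_t \odot \tanh(c_t)$$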
The process described above continues iteratively. Weight parameters (w) and bias parameters (b) are learned by the model in a way that minimizes the difference between actual training values and LSTM output values [34], [35].
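One iteration of the memory unit can be sketched in plain Python. This is a minimal illustration that assumes the standard LSTM formulation with sigmoid gates and tanh states; it is not the MATLAB implementation used in the study, and the function names and the `params` dictionary layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(w: List[Vector], v: Vector, b: Vector) -> Vector:
    """Compute w @ v + b with plain lists (one weight row per hidden unit)."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(w, b)]


def lstm_step(x_t: Vector, s_prev: Vector, c_prev: Vector,
              params: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One time step of an LSTM memory unit; returns (s_t, c_t)."""
    z = s_prev + x_t  # concatenation [s_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["w_f"], z, params["b_f"])]      # forget gate
    i_t = [sigmoid(a) for a in affine(params["w_i"], z, params["b_i"])]      # input gate
    c_hat = [math.tanh(a) for a in affine(params["w_c"], z, params["b_c"])]  # candidate
    o_t = [sigmoid(a) for a in affine(params["w_o"], z, params["b_o"])]      # output gate
    # New state: keep part of the old state and add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # New output: gated, squashed state.
    s_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return s_t, c_t
```

Running `lstm_step` repeatedly over a sequence, feeding each returned `(s_t, c_t)` back in as the previous state, reproduces the iterative process described above. Training the weights and biases is outside the scope of this sketch.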

E. PERFORMANCE EVALUATION
Four different statistical evaluation criteria were used to evaluate the prediction performance of the proposed LSTM model. These criteria are:

Mean absolute error (MAE)
It expresses the mean absolute deviation between the predicted values and the actual values and is calculated by

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| o_i - p_i \right|$$

Root mean square error (RMSE)
It represents the standard deviation of the estimation errors and is calculated by

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( o_i - p_i \right)^2}$$

Mean absolute percentage error (MAPE)
It measures the prediction accuracy of the models as a percentage and is calculated by

$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{o_i - p_i}{o_i} \right|$$

Coefficient of determination (R 2 )
It represents the strength of the linear relationship between the predicted values of the models and the actual values and is calculated by

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( o_i - p_i \right)^2}{\sum_{i=1}^{N} \left( o_i - \bar{o} \right)^2}$$

where $N$ is the number of samples used for the statistical evaluation criteria, $o_i$ is the actual value of the observation, $p_i$ is the forecasted value of the observation, and $\bar{o}$ is the average of the actual observation values.
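The four evaluation criteria can be computed with a few lines of Python. This sketch assumes the standard definitions of MAE, RMSE, MAPE, and the coefficient of determination; the function names are hypothetical, and MAPE is assumed to be evaluated only on non-zero actual values, since it is undefined otherwise.

```python
import math
from typing import Sequence


def mae(o: Sequence[float], p: Sequence[float]) -> float:
    """Mean absolute error between actual values o and forecasts p."""
    return sum(abs(o_i - p_i) for o_i, p_i in zip(o, p)) / len(o)


def rmse(o: Sequence[float], p: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((o_i - p_i) ** 2 for o_i, p_i in zip(o, p)) / len(o))


def mape(o: Sequence[float], p: Sequence[float]) -> float:
    """Mean absolute percentage error; assumes every actual value is non-zero."""
    return 100.0 / len(o) * sum(abs((o_i - p_i) / o_i) for o_i, p_i in zip(o, p))


def r2(o: Sequence[float], p: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    o_mean = sum(o) / len(o)
    ss_res = sum((o_i - p_i) ** 2 for o_i, p_i in zip(o, p))
    ss_tot = sum((o_i - o_mean) ** 2 for o_i in o)
    return 1.0 - ss_res / ss_tot
```

Because PV output is at or near zero around sunrise and sunset, MAPE can become very large or undefined at those time steps, which is worth keeping in mind when comparing models on this criterion.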

IV. RESULTS AND DISCUSSIONS
In order to implement the proposed model, 43,800 solar power records were used, taken during daylight hours at 5-minute intervals in Halifax from January 1, 2017, to December 31, 2017, and available from the Nova Scotia Community College (NSCC), after the missing values had been filled in. A prediction was made every 30 minutes for a winter day (December 31) and a summer day (June 30). The total of 7300 values was divided into 7280 training values and the last 20 values for the prediction test. To evaluate the effectiveness of the proposed method, MATLAB R2019b was used for an LSTM training process with an initial learning rate of 0.01 and a maximum of 1000 epochs. A case study was carried out using the original photovoltaic data for one year during daylight hours, from 8 am to 5 pm, at 5-minute intervals.
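The split into 7280 training values and 20 test values, together with the next-step training targets used for sequence regression, can be sketched as follows. This is an illustrative sketch with hypothetical function names; it only prepares the data and does not reproduce the MATLAB training configuration.

```python
from typing import List, Tuple


def split_train_test(series: List[float], n_test: int = 20) -> Tuple[List[float], List[float]]:
    """Hold out the last n_test values (one day of half-hourly data) for testing."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


def next_step_pairs(series: List[float]) -> Tuple[List[float], List[float]]:
    """Inputs and targets shifted by one time step: the target at step t is the value at t + 1."""
    return series[:-1], series[1:]


if __name__ == "__main__":
    # Placeholder series with the same length as the data set (7300 half-hourly values).
    series = [float(i) for i in range(7300)]
    train, test = split_train_test(series)
    x_train, y_train = next_step_pairs(train)
    print(len(train), len(test), len(x_train), len(y_train))
```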
In this work, the data is summarized at 30-minute intervals to make a half-hourly forecast for the next day. In Figures 4 and 5, the actual daily power generation curves are shown in comparison with the curves predicted by the forecast models. Figure 4 shows the results for one day in December, and Figure 5 shows the results for one day in June. The results show that the forecast is close to the actual data, especially during the winter period, when the system did not generate much power. In the summer, the LSTM model has a significant advantage over the MLP model. Tables 1 and 2 show the performance criteria for the two models for the winter day and the summer day, respectively. In general, the results show that the LSTM model gives the best results in all performance criteria on both days.
The time graph of the values forecast by the proposed LSTM model and the actual solar power values shows that the forecasted and actual values follow the same trend. This suggests that the LSTM algorithm rarely suffers from exploding or vanishing gradients. Figure 7 shows the results of the LSTM solar forecast model for the month of June 2017. The graph represents the daily average solar energy during daylight hours, from 8 am to 5 pm. Although the graph shows that the prediction is close to reality, solar energy is not affected only by the parameters of this model. It also varies with changes in the atmosphere, such as water vapor, clouds, or pollution.
The results show the importance of the time series used to train the models. The quality of the data, especially the presence of outliers, has a significant impact on the performance obtained by the forecasting model. A specific time series may perform better with one model, while the same model may perform poorly with another time series. Consequently, it is advisable to consider multiple models when searching for suitable predictions. Several techniques for time series forecasting can be implemented according to the methodology proposed in this study.

V. CONCLUSION
The LSTM model, which is based on a deep learning approach, is proposed to predict daily solar power values. The prediction was evaluated using solar power data from Halifax, Nova Scotia, Canada, over the course of a year (January 1, 2017, to December 31, 2017). This data was split into training and test sets. Only the training data was used in the learning process of the model; the test data was held out. The task of predicting photovoltaic power for the coming days at 30-minute intervals was considered. The results of the suggested model were compared with those of the Multi-Layer Perceptron (MLP) algorithm, which is the most extensively used technique in the literature, in order to evaluate and examine their accuracy and performance. Compared to the MLP algorithm, the suggested LSTM model achieved better values in all performance criteria (MAE, MAPE, RMSE, and $R^2$) for each category of days, according to the findings of simulation tests and the results presented in Tables 1 and 2. The proposed model gives trustworthy information that will allow photovoltaic power plants to operate more efficiently in the future. The combination of artificial intelligence and energy efficiency appears to have a bright future, particularly in terms of boosting energy sustainability, decarbonization, and electrical sector digitization. The importance of the data (time series) used to train models was highlighted in this study. A specific time series may perform better with one model, while the same model may have poorer performance with another time series. This suggests focusing research and development on multiple models in order to arrive at suitable predictions. In addition, the preprocessing approaches used to reduce noise, eliminate outliers, and reduce the errors of prediction models should be taken into account.
When used to predict solar energy, artificial intelligence algorithms have demonstrated their aptitude and superiority in obtaining favorable outcomes. However, obtaining such results would necessitate a significant amount of hyperparameter adjustment. Furthermore, the quality of the data, particularly the outliers, has a substantial impact on the prediction model's performance. Several time series prediction approaches can be implemented using the methodology provided in this work.