Solar PV Power Estimation and Upscaling Forecast Using Different Artificial Neural Networks Types: Assessment, Validation, and Comparison

Owing to its many advantages, the solar photovoltaic (PV) system is regarded as a promising energy source for coping with energy shortages and environmental impacts such as pollution. It is therefore essential to estimate and predict its output power over the prediction intervals to avoid power outages or sudden disturbances in the utility grid. These are challenging tasks because the solar PV output power depends on weather variables such as temperature and solar radiation. In this article, the estimation and forecast of solar PV output power are investigated with an upscaling method using three different types of artificial neural networks (ANNs) in order to reduce the estimation errors of current ANN types. The multilayer feedforward neural network (MLFFNN), recurrent neural network (RNN), and nonlinear autoregressive exogenous (NARX) model neural network (NARXNN) are applied to estimate and forecast the total output power of four real solar PV substations in Egypt. The surface temperature and the solar radiation of each PV substation are the inputs of each designed NN, whereas the total output power of the four PV substations is its output. For training each ANN and investigating its effectiveness, data covering two months (60 days) are collected from the four PV substations. The data of the first 45 days are used to train the three designed NNs, while the data of the remaining 15 days, which are not used for training, are used to check the effectiveness and the generalization capability of the trained NNs. The estimation process is considered a prior step for the forecast of the output power. An upscaling method is then utilized for assessing and forecasting the regional solar PV output power because of the limited number of monitored plants and available data.
The results provide evidence that the trained NNs are running very well and efficiently to estimate the power correctly. The performance of the MLFFNN is the best compared with the other NNs, whereas the NARXNN’s performance is the lowest one. The MLFFNN achieves the lowest mean squared error (MSE) of 0.27533 and the lowest absolute approximation error of 0.2099 MWh. Finally, the assessment and comparison among the three trained NNs and other techniques in recently published articles are highlighted and presented which reveal the performance superiority of the three trained NNs compared to other ANNs.


I. INTRODUCTION
Recently, the growing penetration of renewable energy sources (RESs), which supports the global electrical energy demand and mitigates the drawbacks of fossil fuels, has become a promising research trend for developing new industrial technologies that cope with their various obstacles and challenges [1], [2]. Among the available RESs, solar photovoltaic (PV) systems are widely utilized because they are environmentally friendly and long-lasting [3], [4], [5]. However, the power generation, reliability, and performance of the system depend directly on atmospheric conditions such as solar irradiation, air temperature, and humidity. These conditions are also crucial for the grid operator, who must optimize the penetration of electrical power into the grid, allocate the solar PV substations, and predict complications and energy needs in the near future. Because of the variations in weather conditions and the sudden disturbances of RESs in grid-connected or islanded modes, various studies adopt one of two options to keep the performance indices of the utility grid within acceptable limits. The first is to employ high-capacity energy storage systems such as batteries, which increases the initial implementation cost and adversely affects the power quality of the utility grid. The second is to develop accurate models for forecasting energy production from climate conditions, which is a vital solution that avoids the need for new energy storage components [6].
To overcome the adverse influence of weather conditions on solar PV power generation, several artificial neural networks (ANNs) have been implemented to estimate, forecast, analyze, and model the weather conditions. They are used to reduce generation drawbacks and to predict the output power so that the maximum hosting capacity of the grid can be estimated during the penetration of RESs, especially solar PV systems, and the performance indices of the utility grid can be kept within acceptable limits [6]. High-accuracy output power prediction algorithms therefore lead to a significant enhancement in utility grid energy management, which in turn decreases the energy cost, supports grid reliability, and reduces adverse environmental impacts [6], [7], [8], [9]. The prediction horizon of the solar PV output power can be classified as short-term, medium-term, or long-term depending on its duration [10], and several articles have elaborated on long-term machine learning (ML) techniques [11], [12], [13], [14], [15]. It is thus essential to specify the forecast horizon, which contributes to decision-making in smart grids based on the designers' assumptions and usage, when selecting a suitable strategy for predicting solar PV output power [16]. To reduce prediction errors, the estimation process plays a prominent role as a prior step for the forecast [17], [18]. Moreover, ML techniques are used to estimate the performance ratio of a solar PV system as one of its reliability indicators. This helps system operators plan the timely maintenance or replacement of solar PV systems, in addition to predicting solar PV power generation [19]. In [20], the day-ahead solar PV output power was monitored and estimated from time-series data using ML techniques; however, this estimation application is still at an early stage.
Then the solar PV output power estimation or forecast can be accomplished using deterministic [21], [22] or data-driven models based on ML or probabilistic methods [23], [24], [25].
Solar PV power forecasting approaches are classified into model-based and data-driven approaches [26]. In model-based methodologies, physics-based models that involve weather variables, e.g., solar radiation, are applied for solar PV power prediction [27]. Although such models can forecast power accurately, the operator assumptions and adopted simplifications cause uncertainties that limit their practical use [28], [29], [30]. Data-driven methods, in contrast, are based on ML techniques and rely on the available solar PV data to learn the relationship between the input variables and the solar PV output power without requiring any physics-based model [16]. In [31], support vector regression (SVR) was used for solar PV power prediction and compared with a physical modelling method using extensive measurements, numerical weather prediction (NWP), and cloud motion vector (CMV) data. The root mean squared error (RMSE) and the mean bias error (MBE) were applied to assess the model performance on the solar PV output power predictions. Using various solar radiation parameters and the cell temperature, which were estimated analytically, as the prediction model inputs, the authors in [32] investigated the prediction accuracy of three ML models (support vector machine (SVM), ANNs, and weighted k-NN) based on historical weather data. The prediction accuracy increased significantly when the weather variables acquired from solar PV physical models were used.
Several articles studied the effect of preprocessing the weather variable data on the prediction accuracy of data-driven methods. In [33], wavelet decomposition (WD) and principal component analysis (PCA) were combined as preprocessing techniques to separate the atmospheric input data. A group least square SVM (GLS-SVM) technique was applied for short-term solar PV power prediction, and its results emphasized the vital role of the PCA-WD preprocessing technique in the prediction process. In [34], a hybrid prediction model combining the wavelet transform (WT), particle swarm optimization (PSO), and SVM was investigated for short-term solar PV power prediction based on actual PV power measurements and NWP data for solar radiation and other weather variables. Using the mean absolute percentage error (MAPE) and normalized mean absolute error (NMAE) indices, the prediction results confirmed the superiority of the hybrid WT-PSO-SVM model over other prediction models. A dual-stage prediction model using three different ANN algorithms, namely the generalized regression NN (GRNN), extreme learning machine NN (ELMNN), and Elman NN, with the aid of a genetic-algorithm-optimized back propagation (GA-BP) algorithm, was proposed in [35]; it demonstrated the strong relationship between variations in solar radiation and solar PV power generation. Other studies proposed prediction models based on the ELM technique combined with maximum power point tracking (MPPT) techniques and various PSO algorithms to enhance the prediction accuracy [27], [36]. To validate the performance of 68 ML techniques, a fair comparative study was established in terms of ''three sky conditions, seven locations, and five different climate regions'' [37].
This emphasized the effectiveness of tree-based algorithms for long-term prediction only, so it was required to combine other algorithms to boost the merits and minimize the demerits of the usage of the sole algorithm.
Solar PV power forecasting can be achieved with NN algorithms because of their ability to generalize under changed circumstances [38], [39]. Kumar et al. [6] established three NNs (Elman NN, feed-forward NN (FFNN), and GRNN) to predict the solar PV output power from the relationships among weather variables and solar radiation. The results confirmed that the NNs yield a precise forecast, with an RMSE of 0.25 for the Elman NN, 0.30 for the FFNN, and 0.426 for the GRNN. However, the generalization ability of these NNs was not examined under changed circumstances. In [40], a multilayer perceptron-based ANN model was proposed for short-term power prediction. Another model in [41] used two learning algorithms, namely Levenberg-Marquardt (LM) and Bayesian regularization (BR), with various weather variables, and its results indicated that the BR algorithm outperforms the LM algorithm (RMSE = 0.0706 and 0.0753, respectively). In [42], 10 different learning algorithms were investigated under different training datasets (23 different combinations of the time stamp of the year), which confirmed the effectiveness of combined NN algorithms in significantly increasing the prediction accuracy. Other NNs were established to predict solar PV power in [41], [43], and [44]; however, evaluating NNs under varying natural conditions still needs extensive study.
Upscaling methods can be utilized for estimating, assessing, and forecasting regional solar PV output power when the number of monitored plants and the amount of available data are limited. Moreover, the limited historical data available to train ML techniques is the main drawback in some cases [25]. In [45], four different upscaling methods were investigated with diverse scenarios of solar PV substation information and data availability. Other upscaling methods were based on chosen subsets of solar PV stations whose single output power was taken as representative of the total output power of several solar PV stations, in terms of subset capacity, for regional forecasting [21], [45], [46], [47]. A hybrid upscaling strategy based on power measurements, NWP, and ''cloud motion vector forecast datasets'' was applied using the SVR technique [31]. In [25] and [48], new ANN-based upscaling methods were proposed for estimating and day-ahead forecasting of regional solar PV power.
As discussed above, the prediction accuracy depends essentially on the weather variables data, time stamps, and solar radiation, and it can be boosted using combined types of ML techniques in terms of the training error, the size of the input layer, and the generalization ability of the ML techniques, especially NN systems, under various circumstances. Therefore, the main points studied in this article are:
• Assessment, validation, and comparison of various ANN types under different training datasets.
• Investigation of the estimation and prediction accuracy of the different ANN types in terms of the weather variables data, time stamps, and solar radiation by validating historical solar PV power as their inputs.
To accomplish these objectives, the contribution and novelty of this article can be summarized as follows:
• The total output power of four real solar PV substations in Egypt is estimated under different weather variables data, time stamps, and solar radiation using three types of NNs: the multilayer feedforward neural network (MLFFNN), recurrent neural network (RNN), and nonlinear autoregressive exogenous (NARX) model neural network (NARXNN).
• The training of the three ANN types is verified, and the training process is carried out using real data collected over 45 days (1.5 months) from four real solar PV substations in Egypt.
• To assess the training process of the three ANN types, the mean-squared-error (MSE) and the training-error (TE) indices are applied.
• To evaluate the generalization ability and the prediction accuracy of the three trained ANN algorithms, separate data (15 days) that were not used in the training process are applied. The results emphasize that the three ANN algorithms are trained very well and that the MSE and the TE are very low. In addition, they can generalize under different conditions and data. Consequently, the three trained ANN algorithms can accurately estimate the solar PV output power under different weather variables data, time stamps, and solar radiation.
• The errors in the estimation and forecast of solar PV output power are investigated with an upscaling method. The data-driven upscaling method is trained and tested on the four real solar PV substations using a specific series of ANN algorithms.
• A comparative study is presented between the results of the three trained ANN types and other techniques in recently published articles.
The article is organized as follows. Section II offers a mathematical analysis for analytically calculating the output power of the solar PV substation. In section III, the proposed model structure and the problem statement are investigated. In sections IV and V, the design, training, and testing of the three ANN algorithms for predicting the power are presented in detail. Section VI highlights and compares the effectiveness and validation of the three trained ANN algorithms using the data of the 15 days that are not used for the training; it also discusses the applied data-driven upscaling method and its results. Section VII provides a comprehensive comparison and discussion of the applied NNs and other recently published articles, followed by the conclusion and future work.

II. ANALYSIS OF SOLAR PV OUTPUT POWER CALCULATION
To calculate analytically the electrical power obtained from the solar PV module, the following equation is used in [49] and [50]:

$$P = \eta_s \tau_g \alpha_s R A \left[ 1 - \mu_s \left( T_s - T_r \right) \right] \qquad (1)$$

where
$\eta_s$: the reference efficiency of the solar PV cells
$\tau_g$: the glass transmissivity
$\alpha_s$: the solar cell absorptivity
$R$: the solar radiation (W/m$^2$)
$A$: the total area of the solar cell (m$^2$)
$\mu_s$: the thermal coefficient of solar PV cell efficiency (%/°C)
$T_s$: the solar cell temperature (°C)
$T_r$: the reference temperature (°C)

The detailed specifications of the solar PV stations are presented in Table 1.
To calculate the output power from equation (1), many parameters of the solar PV station must be known accurately: the reference efficiency of the solar PV cells, the glass transmissivity, the solar cell absorptivity, the solar radiation, the total area of the solar cell, the thermal coefficient of solar PV cell efficiency, the solar cell temperature, and the reference temperature. The main aim of the proposed work is to calculate the electrical power more easily and efficiently, depending on only two parameters: the surface temperature and the solar radiation. Therefore, the three ANN types are proposed and implemented for this purpose. The results presented later in this paper support this approach.
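As a rough illustration of the analytical route, the sketch below evaluates the commonly used form P = η_s τ_g α_s R A [1 − µ_s (T_s − T_r)], which matches the symbols defined for equation (1). The exact form of equation (1), the function name `pv_power`, and all numeric parameter values are assumptions chosen for illustration; they are not taken from Table 1.

```python
def pv_power(eta_s, tau_g, alpha_s, R, A, mu_s, T_s, T_r=25.0):
    """Analytical PV output power in watts.

    Assumed form: P = eta_s * tau_g * alpha_s * R * A * (1 - mu_s * (T_s - T_r)),
    with mu_s given as a fraction per degree C (e.g. 0.45 %/C -> 0.0045).
    """
    return eta_s * tau_g * alpha_s * R * A * (1.0 - mu_s * (T_s - T_r))


# Hypothetical parameter values, for illustration only.
params = dict(eta_s=0.15, tau_g=0.9, alpha_s=0.9, R=1000.0, A=1.6, mu_s=0.0045)
p_ref = pv_power(T_s=25.0, **params)   # cell at the reference temperature
p_hot = pv_power(T_s=45.0, **params)   # same irradiance, hotter cell
print(f"P at 25 C: {p_ref:.1f} W, P at 45 C: {p_hot:.1f} W")
```

Even this small example needs six station parameters in addition to the temperature and radiation, which is the practical burden the NN-based approach is meant to avoid.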
Here, the proposed model strategy is designed and implemented using the three proposed NN algorithms, as depicted in Fig. 1. It can be divided into four stages:
1) Attaining the original solar PV time series data, such as the surface/solar cell temperature and solar radiation, from four solar PV substations in Egypt.
2) Applying the data preprocessing technique to organize the data and initialize the missing parameters.
3) Running the ANN algorithms (MLFFNN, RNN, and NARXNN), which involves a training and testing process and an effectiveness (generalization) check.
4) Visualizing the results.
MATLAB is used to execute the last three stages on an Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz processor.
For accurate presentation, the flow chart of the proposed model methodology and the summarized article contribution is shown in Figure 2.

IV. NN DESIGN FOR PV OUTPUT POWER PREDICTION
In this section, the design of the three proposed NNs types is discussed. The main criteria during the NN design are obtaining high performance which is the very low MSE and TE (close to zero). The main inputs of any type of the designed NNs that achieve this high performance are the surface temperature, T s , and the solar radiation, R, of each substation of the four PV substations. Therefore, the number of user inputs is eight (2 × 4=8). The collected input data are presented in Figure 3. The output of the designed NN is the total power of the four PV substations. As known, the NN with three layers (input layer, hidden layer, and output layer) can solve many complex problems. Therefore, all three types are composed of three layers: the input layer, hidden layer, and output layer. The comparison between the design of the three NN types is presented in Table 3 and their structures are shown in Figure 4. The actual total power P from the real four PV power stations is used for training the designed NNs and it is compared with the estimated total power P ′ from the designed NN. The equations of the feedforward part of these designed NNs are presented as follows.

A. THE EQUATIONS OF THE FEEDFORWARD PART OF THE MLFFNN
The output of hidden neuron j in the NN's hidden layer is given as

$$h_j = \sum_{i=0}^{8} w_{ji} x_i$$

where $x_i$ are the inputs to the NN, with $x_0 = 1$, $x_1 = T_{s1}$, $x_2 = R_1$, $x_3 = T_{s2}$, $x_4 = R_2$, and so on, and $w_{ji}$ is the weight between input i and hidden neuron j.
The activation function of the hidden layer, denoted here by $\varphi$, gives

$$y_j = \varphi(h_j)$$

The total power estimated by the MLFFNN, $P'$, is given by

$$P' = \sum_{j} b_{1j} y_j$$

where $b_{1j}$ is the weight between hidden neuron j and the output of the MLFFNN.
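The feedforward pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hidden activation is taken to be tanh, the hidden-layer size `n_hidden`, the random weights, and the input readings are all made up, and the function name `mlffnn_forward` is hypothetical.

```python
import math
import random


def mlffnn_forward(x, w, b):
    """Estimate the total power P' from eight inputs with one hidden layer.

    x: [T_s1, R_1, ..., T_s4, R_4] (8 values); a bias input x_0 = 1 is prepended.
    w: hidden weights, w[j][i] links input i (0..8) to hidden neuron j.
    b: output weights, b[j] links hidden neuron j to the single output.
    """
    xb = [1.0] + list(x)                       # x_0 = 1 acts as the bias input
    h = [sum(w_ji * x_i for w_ji, x_i in zip(w_j, xb)) for w_j in w]
    y = [math.tanh(h_j) for h_j in h]          # assumed activation: tanh
    return sum(b_j * y_j for b_j, y_j in zip(b, y))


random.seed(0)
n_hidden = 5                                   # hypothetical hidden-layer size
w = [[random.uniform(-0.1, 0.1) for _ in range(9)] for _ in range(n_hidden)]
b = [random.uniform(-0.1, 0.1) for _ in range(n_hidden)]
# Made-up readings; in practice the inputs would normally be scaled first,
# since raw radiation values would saturate tanh.
x = [30.0, 800.0, 31.0, 790.0, 29.5, 810.0, 30.5, 805.0]
print(mlffnn_forward(x, w, b))
```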

B. THE EQUATIONS OF THE FEEDFORWARD PART OF THE RNN
The output of hidden neuron j in the NN's hidden layer at time step k is given as

$$h_j(k) = \sum_{i=0}^{8} w_{ji} x_i(k) + \sum_{n} c_{jn} y_n(k-1)$$

where $x_i$ are the inputs to the NN, with $x_0 = 1$, $x_1 = T_{s1}$, $x_2 = R_1$, $x_3 = T_{s2}$, $x_4 = R_2$, and so on. $w_{ji}$ is the weight between input i and hidden neuron j, and $c_{jn}$ is the weight between the fed-back input $y_n(k-1)$ and hidden neuron j.
The activation function of the hidden layer, denoted here by $\varphi$, gives

$$y_j(k) = \varphi\left(h_j(k)\right)$$

The power estimated by the RNN, $P'$, is given by

$$P'(k) = \sum_{j} b_{1j} y_j(k)$$

where $b_{1j}$ is the weight between hidden neuron j and the output of the RNN.
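The recurrent variant differs from the feedforward one only in the feedback of the previous hidden outputs y_n(k − 1). A minimal sketch follows, assuming a tanh activation and a zero initial hidden state; neither is stated in the text, and the names `rnn_step` and `rnn_run` are hypothetical.

```python
import math


def rnn_step(x, y_prev, w, c, b):
    """One time step k of the recurrent network.

    x: eight inputs at step k; y_prev: hidden outputs y_n(k-1).
    w[j][i]: input weights (i = 0 is the bias), c[j][n]: feedback weights,
    b[j]: output weights. Returns (P'(k), y(k)).
    """
    xb = [1.0] + list(x)
    y = []
    for w_j, c_j in zip(w, c):
        h_j = sum(a * v for a, v in zip(w_j, xb))
        h_j += sum(a * v for a, v in zip(c_j, y_prev))   # recurrent term
        y.append(math.tanh(h_j))                          # assumed activation
    return sum(b_j * y_j for b_j, y_j in zip(b, y)), y


def rnn_run(xs, w, c, b):
    """Run over a sequence, starting from a zero hidden state (an assumption)."""
    y = [0.0] * len(w)
    out = []
    for x in xs:
        p, y = rnn_step(x, y, w, c, b)
        out.append(p)
    return out


# Tiny example: one hidden neuron driven only by its bias and its own feedback.
demo = rnn_run([[0.0] * 8, [0.0] * 8], w=[[0.5] + [0.0] * 8], c=[[1.0]], b=[1.0])
print(demo)
```

Because the hidden state is carried between steps, the second output in the example differs from the first even though the inputs are identical.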

C. THE EQUATIONS OF THE FEEDFORWARD PART OF THE NARXNN
The output of hidden neuron j in the NN's hidden layer at time step k is given as

$$h_j(k) = \sum_{i=0}^{8} w_{ji} x_i(k) + c_{j1} P'(k-1)$$

where $x_i$ are the inputs to the NN, with $x_0 = 1$, $x_1 = T_{s1}$, $x_2 = R_1$, $x_3 = T_{s2}$, $x_4 = R_2$, and so on. $P'(k-1)$ represents the previous value of the power estimated by the NARXNN. $w_{ji}$ is the weight between input i and hidden neuron j, and $c_{j1}$ is the weight between the input $P'(k-1)$ and hidden neuron j.
The activation function of the hidden layer, denoted here by $\varphi$, gives

$$y_j(k) = \varphi\left(h_j(k)\right)$$

The power estimated by the NARXNN, $P'$, is given by

$$P'(k) = \sum_{j} b_{1j} y_j(k)$$

where $b_{1j}$ is the weight between hidden neuron j and the output of the NARXNN.
The training error resulting from each designed NN should be as close to zero as possible and is calculated from the following equation:

$$e = P - P'$$

The training process of the designed NNs is discussed in detail in the next section.
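For the NARXNN, the fed-back quantity is the previous power estimate P′(k − 1) rather than the hidden outputs. The sketch below runs the network in closed loop and computes the per-sample error e = P − P′. The tanh activation, the start value `p_init`, and the function names are assumptions made for illustration.

```python
import math


def narx_run(xs, w, c, b, p_init=0.0):
    """Closed-loop NARX estimate: the previous estimate P'(k-1) is fed back.

    w[j][i]: input weights (i = 0 is the bias), c[j]: weight on P'(k-1)
    for hidden neuron j, b[j]: output weights. p_init is an assumed start value.
    """
    p_prev, out = p_init, []
    for x in xs:
        xb = [1.0] + list(x)
        y = [math.tanh(sum(a * v for a, v in zip(w_j, xb)) + c_j * p_prev)
             for w_j, c_j in zip(w, c)]          # assumed activation: tanh
        p_prev = sum(b_j * y_j for b_j, y_j in zip(b, y))
        out.append(p_prev)
    return out


def training_error(p_actual, p_est):
    """Per-sample error e = P - P'."""
    return [p - q for p, q in zip(p_actual, p_est)]


# Tiny example: one hidden neuron that only sees the fed-back estimate.
est = narx_run([[0.0] * 8, [0.0] * 8], w=[[0.0] * 9], c=[1.0], b=[1.0], p_init=0.5)
print(est, training_error([1.0, 2.0], [0.5, 2.5]))
```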

V. NN TRAINING AND TESTING FOR PV OUTPUT POWER PREDICTION
In this section, the training and testing of the designed NNs (MLFFNN, RNN, and NARXNN) are presented in detail. The steps followed during these processes are presented in Table 4. The designed NNs are trained using the Levenberg-Marquardt (LM) algorithm, which has the following properties:
1) It processes the data quickly [52], [53], [54], [55].
2) It is a second-order optimization technique with a strong theoretical basis and significantly fast convergence, and it is considered an approximation to Newton's method [81], [82].
3) Compared with other learning algorithms, LM learning offers a trade-off between the fast learning speed of the classical Newton's method and the guaranteed convergence of gradient descent [81], [83].
4) It is suitable for larger datasets and converges in fewer iterations and in a shorter time than other training methods.
The data used for training the three designed NN types are obtained from four connected PV power stations in Egypt. These data are generated based on equation (1); in other words, the real PV station calculates the electrical output power according to equation (1). The collected data cover two months (58 days). The data of one and a half months (44 days) are used for the training process of the designed NNs, and the rest of the data, which cover half a month (15 days), are used for investigating the effectiveness and the generalization of the trained NNs. In this section, the training process is illustrated. The total number of input-output pairs in the training data is 12672. In training each NN type, 85% of these data are used for training, 10% for testing, and the remaining 5% for validation.
Using a trial-and-error methodology, and after investigating many different weight initializations and numbers of hidden neurons, the best parameters of the three designed NNs that achieve high performance are presented and compared in Table 5. It should be noted that the training time is not very important because the training occurs offline. The main purpose is to obtain a well-trained NN that can predict the PV power correctly and efficiently.
The MSE resulting from the training of the designed NNs is shown in Figure 5. As is clear from Figure 5 and Table 5, the MSE resulting from training the designed MLFFNN is lower than that of the other NN structures. This means that the convergence and the approximation are better when the MLFFNN is used. In addition, its training time is lower than that of the other architectures, which means that the MLFFNN converges faster. However, the MSE resulting from the designed RNN and NARXNN is also satisfactorily low. It should be noted that ''Best'' in Figure 5 denotes the best validation performance. The best performance of the NN, i.e., the lowest MSE, is always taken from the epoch with the lowest validation error, as is clear from the figure.
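The trade-off attributed to the LM algorithm above comes from its damped update, which solves (JᵀJ + µI) Δw = Jᵀe: a small damping factor µ gives a Gauss-Newton step, while a large µ gives a short gradient-descent-like step. The sketch below applies one such update to a two-parameter linear model rather than to an NN, purely to keep the example small; the function name `lm_step` and the toy data are hypothetical, and a full trainer would also adapt µ between iterations.

```python
def lm_step(xs, ys, w, mu):
    """One LM update for the model y = w[0] + w[1] * x.

    Solves (J^T J + mu * I) dw = J^T e with residual e = y - y_hat,
    where each Jacobian row of the model output is [1, x].
    """
    a00 = a01 = a11 = g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        e = y - (w[0] + w[1] * x)
        a00 += 1.0          # accumulate J^T J
        a01 += x
        a11 += x * x
        g0 += e             # accumulate J^T e
        g1 += x * e
    a00 += mu               # damping on the diagonal
    a11 += mu
    det = a00 * a11 - a01 * a01
    dw0 = (a11 * g0 - a01 * g1) / det
    dw1 = (a00 * g1 - a01 * g0) / det
    return [w[0] + dw0, w[1] + dw1]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 5.0, 8.0, 11.0]                              # toy data from y = 2 + 3x
w_newton = lm_step(xs, ys, [0.0, 0.0], mu=0.0)          # Gauss-Newton limit
w_damped = lm_step(xs, ys, [0.0, 0.0], mu=1000.0)       # heavily damped step
print(w_newton, w_damped)
```

With µ = 0 the linear problem is solved in a single step, whereas the heavily damped step moves only slightly from the starting point.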
Once the training is completed, the same data that were used for the training process are used to test the three trained NNs. The approximation error between the total power estimated by the three trained NNs and the actual total power from the real PV is presented in Figure 6. As shown in Figure 6, the approximation error is low whether the MLFFNN, RNN, or NARXNN is used. This means that the NN is trained very well and is able to estimate the PV output power efficiently and correctly. It is also clear from Fig. 6 that the approximation error is lowest with the MLFFNN and highest with the NARXNN. For further illustration, the average, maximum, minimum, and standard deviation (std.) of the absolute value of this approximation error for the three trained NNs are presented in Table 6.
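The error indices reported for the trained NNs (average, maximum, minimum, and std. of the absolute error, together with the MSE and RMSE) can be computed as in the sketch below. The function name `abs_error_stats` and the sample values are made up for illustration, and the population form of the standard deviation is an assumption, since the text does not say which form is used.

```python
import math


def abs_error_stats(p_actual, p_est):
    """Mean, max, min and (population) std of |P - P'|, plus MSE and RMSE."""
    err = [p - q for p, q in zip(p_actual, p_est)]
    abs_err = [abs(e) for e in err]
    n = len(err)
    mean = sum(abs_err) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in abs_err) / n)
    mse = sum(e * e for e in err) / n
    return {"mean": mean, "max": max(abs_err), "min": min(abs_err),
            "std": std, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up actual and estimated power values.
stats = abs_error_stats([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 5.0, 4.0])
print(stats)
```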
For more illustrations, the convergence, and the comparison between the actual total power from PV and the estimated total power by the three trained NNs, are presented in Figure 7. As shown in Figure 7, the convergence and the approximation between them are good and satisfactory. These results support the ones presented in Figure 6. The generalization ability and the effectiveness of the trained NNs are presented in the next section.

VI. NN GENERALIZATION AND UPSCALING TECHNIQUE FOR PV OUTPUT POWER PREDICTION
In this section, the generalization ability, and the effectiveness of the three trained NNs are presented using different data from the training ones. The rest of the available/collected data which is the data of the half month (15 days) is used for this purpose. The inputs of the NN (temperature and radiation) in this case are shown in Figure 8.
The comparison and the error between the actual total power from the PV and the total power estimated by the three trained NNs (MLFFNN, RNN, and NARXNN) are shown in Fig. 9 and Fig. 10. In addition, the average, std., maximum, and minimum of the absolute error between the two total powers are presented in Table 7. Figs. 9 and 10 and Table 7 show that the approximation between the actual total power and the one estimated by the NN is very good and that the error between them is low. This means that the three NNs are trained well and can estimate the PV output power correctly and efficiently under different data and conditions, which demonstrates the effectiveness and the generalization of each trained NN. The results also show that the trained MLFFNN performs better than the other trained NNs: its error is lower than that of the trained RNN and NARXNN. The performance of the trained NARXNN is the lowest, and its resulting error is the highest among the trained NNs.
The upscaling technique is implemented for estimating and forecasting the distributed generation of the four PV models. Indeed, the main goal of the upscaling technique is to scale the output power of the subset to obtain the output power of the complete set, [17], [21], [84]. For developing the upscaling technique two steps should be followed. Firstly, the subset must be chosen in such a way that its behavior, regarding power output, is representative of the behavior of the complete PV total station. Then, models based on MLFFNN are developed for the prediction of the total station output power. In this technique, one of the four PV models is selected as a subset randomly. Therefore, the proposed MLFFNN uses the inputs of this subset system to predict the overall output power of the complete set, as shown in Figure 11.
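The subset-selection step of the upscaling technique can be sketched as follows: from the eight-column input layout [T_s1, R_1, T_s2, R_2, ...], only the two columns of one randomly chosen station are kept, and these become the inputs of the upscaling MLFFNN whose target remains the total power of all four stations. The function name `select_subset_inputs` and the sample rows are hypothetical.

```python
import random


def select_subset_inputs(x_all, station):
    """Keep only (T_s, R) of one station from rows laid out as
    [T_s1, R_1, T_s2, R_2, T_s3, R_3, T_s4, R_4]; station is 0-based."""
    i = 2 * station
    return [row[i:i + 2] for row in x_all]


random.seed(1)
station = random.randrange(4)        # random subset choice, as in the text
x_all = [[30.0, 800.0, 31.0, 790.0, 29.5, 810.0, 30.5, 805.0],
         [28.0, 650.0, 28.5, 640.0, 27.9, 660.0, 28.2, 655.0]]  # made-up rows
x_sub = select_subset_inputs(x_all, station)
print(station, x_sub)
```

The network trained on `x_sub` therefore has two inputs instead of eight, while its output is still compared with the total power of the complete set.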
The upscaling results show an improvement in the accuracy of regional power estimation and forecast with respect to the previous values. The resulting MSE of the power estimation is reduced by 4.2% and the RMSE by 20%, as shown in Fig. 12. Furthermore, the error between the power predicted by the upscaling MLFFNN and the actual output power of the complete PV system, shown in Figure 13, is very small and close to zero. The average, maximum, minimum, and standard deviation of the absolute error are 0.0720, 5.2702, 0.0, and 0.2299, respectively.
To ensure the effectiveness of the upscaling prediction method, its generalization ability is checked. Simply, two weeks' data from the selected subset inputs, solar radiation, and surface temperature, are used for verification of this method. The approximation error in the case of using the generalization ability is low, as clear in Figure 14. To be exact, the average, maximum, minimum, and standard deviation (std.) of the absolute value of this approximation error using upscaling MLFFNN forecasting are presented in Table 8.

VII. COMPARATIVE STUDY
The proposed NN types are compared with other previous related approaches presented in [11], [12], [13], [14], and [15]. The comparison is made in terms of the developed method, the number of user inputs, the resulting MSE, the resulting RMSE, the regression, and whether the generalization ability was investigated under different conditions. It also covers the number of hidden neurons, hidden layers, and epochs and the activation function when the method is based on NNs. This comparison is presented in Table 9. It should be noted that the regression equation is different for each developed system or method, depending on the structure of the system; therefore, there is no need to include the regression equation in Table 9. The regression measures the correlation between the power estimated by the developed method and the actual power obtained from the solar PV station. Its value should be very close to 1.0 to obtain a very good convergence/approximation between the estimated and actual powers.
As presented in Table 9, our proposed MLFFNN, the method developed by Kazem et al. [14], MFFNN-MVO [11], and the random forest regression [12] record the lowest MSE values compared with the other previous approaches. This means that these methods are more accurate in predicting solar PV output power. The regression obtained with our proposed structures (MLFFNN, RNN, and NARXNN) is the highest among all approaches. This means that the approximation and the convergence between the actual power and the estimated one are better with our proposed structures. In addition, only two input types are used in our case and in the case of Kazem et al., whereas the other approaches use more than two. This means that the complexity of our proposed methods is lower than that of the others. The generalization ability under different cases and conditions is checked only with our proposed approach, whereas it is not stated or investigated for the others. From this comparison, we conclude that our proposed approach is reliable and efficient in predicting solar PV output power correctly under different conditions. However, further investigation is required in future work on the proposed NN types to minimize the resulting MSE and the training/approximation error. In addition, more data (e.g., 6 months) can also be taken into consideration.
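The regression value discussed above can be illustrated with the Pearson correlation coefficient between the actual and estimated powers. Treating the reported regression as this coefficient is an assumption, and the function name `regression_r` and the sample values are made up.

```python
import math


def regression_r(p_actual, p_est):
    """Pearson correlation coefficient between actual and estimated power."""
    n = len(p_actual)
    ma = sum(p_actual) / n
    me = sum(p_est) / n
    cov = sum((a - ma) * (e - me) for a, e in zip(p_actual, p_est))
    va = sum((a - ma) ** 2 for a in p_actual)
    ve = sum((e - me) ** 2 for e in p_est)
    return cov / math.sqrt(va * ve)


# Made-up values: a close estimate and a poor, inverted one.
r_good = regression_r([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.9])
r_bad = regression_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(r_good, r_bad)
```

A value near 1.0 indicates that the estimate tracks the actual power closely; note that the coefficient alone does not reveal a constant offset or scale error, which is why it is reported alongside the MSE and RMSE.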

VIII. CONCLUSION AND FUTURE WORK
It is essential to boost the prediction accuracy of NNs for estimating and forecasting the solar PV output power in order to overcome power outages and disturbances in the utility grid caused by variations in environmental conditions. In this article, three ANN types (MLFFNN, RNN, and NARXNN) are proposed to estimate the total power of four solar PV power substations. The surface temperature and solar radiation are used as the inputs of these NNs, and the total PV power is the output. Data covering 60 days (2 months) are obtained from four real solar PV power substations in Egypt. The data of the first 45 days are applied to the training process, which is carried out using the LM learning algorithm. The rest of the data, covering 15 days (half a month), are used for checking the effectiveness and the generalization ability of the trained NNs. The results of the training process show that the designed NNs achieve good and satisfactory performance (very low MSE and training/approximation error). This means that the NNs can estimate the total PV output power efficiently and correctly. These results also show that the MLFFNN has the best performance compared with the RNN and NARXNN. The MSE and the approximation error of the designed NARXNN are the highest among the NNs. The results of investigating the generalization ability of the trained NNs with data different from the training data support the results of the training process. All the trained NNs can work under different conditions and data. In addition, the trained MLFFNN has the best performance compared with the other NNs, whereas the trained NARXNN achieves the lowest performance. After the estimation process is validated, the upscaling method is utilized for assessing and forecasting the regional solar PV output power of the four solar PV stations.
A comparison is developed between our proposed method and other previously related published work. From this comparison, we deduce that the proposed method has low MSE, and complexity compared with others. In addition, the generalization ability is checked and investigated only with the proposed method.
In future work, deep learning algorithms can be considered and compared. In addition, collected data from six months or one year can be used. Predicting the power of other renewable energy sources such as the energy of wind can also be applied.

DATA AVAILABILITY STATEMENT
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

CONFLICTS OF INTEREST
The authors declare no conflict of interest.

HIGHLIGHTS
• Three types of ANNs are utilized for solar PV power estimation and upscaling forecast.
• The total output power of four real solar PV substations in Egypt is estimated.
• The lowest mean squared error (MSE) and training error are attained.
• The better performance and effectiveness of the proposed model are proven for solar PV power estimation and upscaling forecast.