A Local Training Strategy-Based Artificial Neural Network for Predicting the Power Production of Solar Photovoltaic Systems

Power production prediction from Renewable Energy (RE) sources has been widely studied in the last decade. Such prediction is extremely important for utilities to match electricity supply with consumer demand across centralized grid networks. In this context, we propose a local training strategy-based Artificial Neural Network (ANN) for predicting the power productions of solar Photovoltaic (PV) systems. Specifically, the timestamp, weather variables, and corresponding power productions collected locally at each hour interval <inline-formula> <tex-math notation="LaTeX">$h$ </tex-math></inline-formula>, <inline-formula> <tex-math notation="LaTeX">$h =$ </tex-math></inline-formula> [1, 24] (i.e., an interval of <inline-formula> <tex-math notation="LaTeX">$\Delta h=1$ </tex-math></inline-formula> hour), are exploited to build, optimize, and evaluate <inline-formula> <tex-math notation="LaTeX">$H=24$ </tex-math></inline-formula> different ANNs for the 24 hourly solar PV production predictions. The proposed local training strategy-based ANN is expected to provide more accurate predictions, with shorter computational times, than those obtained by a single (i.e., <inline-formula> <tex-math notation="LaTeX">$H=1$ </tex-math></inline-formula>) ANN model (hereafter called benchmark) built, optimized, and evaluated globally on the entire available dataset. The proposed strategy is applied to a case study regarding a 264kWp solar PV system located in Amman, Jordan, and its effectiveness compared to the benchmark is verified by resorting to different performance metrics from the literature. Further, its effectiveness is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs, and when the Persistence model is used.
The prediction performance of the two training strategies-based ANNs is also investigated and compared in terms of i) different weather conditions (i.e., seasons) experienced by the solar PV system under study and ii) different hour intervals (i.e., <inline-formula> <tex-math notation="LaTeX">$\Delta h=2$ </tex-math></inline-formula>, 3, and 4 hours) used for partitioning the overall dataset and, thus, establishing the different ANNs (i.e., <inline-formula> <tex-math notation="LaTeX">$H =12$ </tex-math></inline-formula>, 8, and 6 models, respectively).


I. INTRODUCTION
A. BACKGROUND
Sustainable energy sources have attracted keen interest around the world for several important reasons, such as the depletion of fossil fuels, the rise in energy prices, industrial pollution, the energy crisis, and the surge of ecological concerns [1]-[3]. Renewable Energy (RE) generation has been strongly encouraged and supported by government policies and technology advancements [3]. In 2018, the share of RE (181 gigawatts) in cumulative production capacity worldwide increased rapidly, contributing more than 50% of the annual average power production potential added in that year [4]. Besides, the energy produced from solar irradiation is considered the most promising and safe energy supplied by Photovoltaic (PV) systems [5]. PV energy sources are among the most extensively accessible and highly attractive RE sources, owing to their significant potential for energy production [6]-[9].

(The associate editor coordinating the review of this manuscript and approving it for publication was Alfeu J. Sguarezi Filho.)
The main difficulties in PV systems are the complexity, parasitic capacitance, harmonic distortion, and sophistication of the current-voltage and power-voltage characteristic equations [10]. The relationship between PV current and voltage is both implicit and complex, depending on variables such as the ambient temperature, solar irradiation, wind speed, and dust accumulation [11], [12]. On hot days, the cell module temperature can quickly reach 70 °C, at which the power output can drop significantly below nominal values [13]. The production of PV systems mainly depends on the amount of global solar irradiation received by the modules; any change in the power implies that the solar irradiation changes during the day or is affected by shading. In addition, wind speed can be a significant factor in the accumulation of dust, dirt, and soiling on the PV system [14]. Such phenomena prevent the effective absorption of solar irradiance by the PV cells and significantly reduce the overall PV power generation. This reduction in power can reach 50% in arid and semiarid regions, where the solar irradiation is usually high [15]. Thus, the production from such energy sources depends on intermittent (stochastic) weather variables, which calls for prediction (forecasting) models and tools capable of accurately estimating the PV power productions by accommodating such inherent stochasticity in the weather variables [16]-[18].
In this context, the prediction of PV power generation would make a significant contribution to the management and maintenance of modern energy systems, such as the connection to microgrids [19]. Prediction plays a critical role in managing the efficiency of the power system [7], [8], [20].

B. LITERATURE REVIEW AND MOTIVATION
Numerous methods for predicting PV production have been published in the literature. Nevertheless, an effective method is still needed for enhancing the performance of PV prediction to decrease the adverse effects of system instability. In general, the prediction methods are classified into model-based and data-driven [16], [21]-[23].
Model-based methods are based on analytical equations that describe the PV power production process. The equations typically use weather conditions to predict power output [23]. Usually, such methods do not require historical data, but they strongly depend on comprehensive station location details and reliable meteorological data. They can be simple, focusing only on solar irradiation, or more complicated if additional weather variables, like ambient temperature, wind speed, and dust, are used. Thus, the effectiveness of their forecasts depends heavily on the precision of the Numerical Weather Prediction (NWP) details. Although such methods can increase prediction accuracy, the uncertainty resulting from the approximations and/or assumptions in the adopted models could constitute a limitation on their realistic implementation [18].
Conversely, data-driven methods (developed using various Machine Learning (ML) techniques) rely solely on the availability of historical pairs of weather variables and the associated solar power productions. They aim to build models (called black-box models) to capture the hidden mathematical relationship between the weather variables and the associated PV power productions [4], [16]-[18].
For example, Ding et al. [24] proposed an improved version of the Back-Propagation (BP) learning algorithm-based Artificial Neural Network (ANN) to predict the power output of a PV system under different environmental conditions. The improved BP algorithm was shown to be superior to the traditional BP algorithm in enhancing the accuracy of the power output prediction.
Zeng and Qiao [25] designed a Radial Basis Function-based Neural Network (RBF-NN) for short-term solar PV power prediction using past values of meteorological data (e.g., sky cover, transmissivity). Results showed that the RBF-NN outperforms the linear autoregressive (AR) and the Local Linear Regression (LLR) models. The authors concluded that the use of transmissivity and other extra meteorological data, particularly the sky cover, could substantially improve the power prediction performance.
Li et al. [26] predicted the PV output power using the Auto Regressive Moving Average with exogenous inputs (ARMAX) and Auto-Regressive Integrated Moving Average (ARIMA). The two models used as exogenous inputs the ambient temperature, insolation duration, precipitation amount, and relative humidity to predict the power output of a 2.1 kW grid-connected PV system. Results revealed that the ARMAX model significantly enhances the predictability of the power output over the ARIMA.
De Leone et al. [27] used Support Vector Regression (SVR) to predict the energy production of a PV plant located in Italy. The method used the past meteorological data (e.g., solar radiation, ambient temperature) and power outputs to predict future power outputs. The obtained results revealed that the quality of the expected power output depends heavily on the accuracy of the meteorological data.
Yang et al. [28] predicted the PV power in the short term using an Auto-Regressive with exogenous input based Spatio-Temporal (ARX-ST) model. The results were evaluated against the conventional Persistence model. The authors noted that the existing ARX-ST can be expanded with more meteorological data to help boost the prediction precision.
Khademi et al. [29] proposed a Multi-Layer Perceptron equipped with an Artificial Bee Colony (MLP-ABC) algorithm to predict the power output of a 3.2kW PV plant. The collected data were separated into sunny and cloudy days and used to develop the MLP-ABC prediction model. The findings were compared to the MLP-ABC model when both sunny and cloudy days were used to establish the prediction model. It was concluded that the separation of different weather conditions enhanced the accuracy of the PV power output predictions.
Li et al. [30] used the Multivariate Adaptive Regression Splines (MARS) model for daily power output prediction of a grid-connected 2.1 kW PV system. This model maintains the flexibility of the traditional Multi-Linear Regression (MLR) paradigm, thus having the ability to handle non-linearity. The obtained results using the MARS model were compared with linear models, such as MLR, ARIMA, and ARMAX, as well as some non-linear models, such as SVR, K-Nearest Neighbors (K-NN), and Classification and Regression Trees (CART). Results showed that non-linear models tend to provide higher performance than linear models, on average. The authors concluded that no model could do consistently better than the others at both the training and prediction levels.
Muhammad Ehsan et al. [31] implemented an MLP-based ANN model for 1-day ahead power output prediction of a 20 kWp grid-connected solar plant situated in India. The authors examined different combinations of hidden layers, hidden neuron activation functions, and learning algorithms for reliable 1-day ahead power predictions. They concluded that an ANN characterized by a single hidden layer, the Linear Sigmoid Axon neuron activation function, and the Conjugate Gradient learning algorithm was able to deliver reliable power output predictions.
Theocharides et al. [32] examined the performance of three different ML methods, namely ANNs, SVR, and Regression Trees (RTs), with different hyper-parameters and sets of features, in predicting the power production of PV systems. Their performance was compared to that of the Persistence model through the computation of the Mean Absolute Percentage Error (MAPE) and the normalized Root Mean Square Error (nRMSE). The obtained enhancements were then evaluated using the Skill Score (SS). It was found that the ANNs outperform the other prediction models from the literature.
Alomari et al. [33] proposed an ANN model for PV power production prediction. The proposed model investigated the strengths of two different learning algorithms (i.e., Levenberg-Marquardt (LM) and Bayesian Regularization (BR)) by utilizing different variations of the ANN model's inputs. The conclusions drawn revealed that a BR-based ANN provides more accurate predictions than an LM-based ANN (i.e., RMSE = 0.0706 and 0.0753, respectively).
Al-Dahidi et al. [16] investigated the capability of the Extreme Learning Machine (ELM) in predicting the PV power output. The obtained results revealed that the ELM provides better generalization capability with negligible computational times compared to the traditional BP-ANN.
Later, Al-Dahidi et al. [18] suggested a comprehensive ANN-based ensemble solution for enhancing the 24h-ahead solar PV power output predictions. The authors also used the bootstrap technique to quantify the sources of uncertainty that influence the model predictions, in the form of Prediction Intervals (PIs). The efficacy of the recommended ensemble solution was illustrated on a real case study of a solar PV system (264kWp capacity) located in Amman, Jordan. The suggested method was shown to be advantageous over various benchmarks in providing more accurate power predictions and in accurately quantifying multiple sources of uncertainty.
Behera et al. [34] proposed a prediction technique based on a combination of ELM, Incremental Conductance (IC), and Maximum Power Point Tracking (MPPT) techniques. The obtained results revealed that the ELM provides better performance compared to the standard BP-ANN and that performance can be further enhanced using the Particle Swarm Optimization (PSO) technique.
Huang and Kuo [35] proposed a high-precision PVPNet model based on Deep-Learning Neural Networks (DLNNs) for 1-day ahead power output prediction. The prediction results obtained by the proposed PVPNet model were evaluated (in terms of RMSE and MAE) and compared to other ML techniques from the literature. The authors concluded that the proposed PVPNet model has an excellent generalization capability and can boost the prediction performance, while reducing monitoring expenses, initial costs of hardware components, and long-term maintenance costs of future PV plants.
Catalina et al. [36] proposed two linear ML models (i.e., Least Absolute Shrinkage and Selection Operator (LASSO) and linear SVR), and two non-linear ML models (i.e., MLPs and Gaussian SVRs) with satellite-measured radiances and clear sky irradiance as inputs to nowcast the PV energy outputs over peninsular Spain. Results revealed that the two non-linear ML models were better than the two linear ML models.
From the above research works, it is apparent that the efforts were mainly dedicated to enhancing the employed data-driven prediction model or investigating other advanced models from the literature. Differently, this work aims to propose a local training strategy applicable to any data-driven prediction model for ultimately boosting the prediction accuracy of the solar PV power outputs, while reducing the computational times. Specifically, the hour-by-hour variability (i.e., the 24-hour seasonality patterns of each day) arising in the solar data, of both the weather variables and the corresponding power productions, has never been exploited in developing the prediction models. The consideration of such seasonality while developing the prediction models is expected to be beneficial in enhancing the prediction accuracy while reducing the computational times.

C. CONTRIBUTIONS
The proposed training strategy requires splitting the available inputs-output patterns collected from the actual operation of a PV system, based on an hour interval of Δh = 1 hour, into H = 24 datasets; each dataset represents the data collected at the h-th hour interval, h ∈ [1, 24]. The established datasets are then used to build H = 24 feedforward ANN models. The selection of ANNs is driven by the fact that they are simple, easy to understand and implement, and capable of solving non-linear interpolation problems [37], [38].
Each built ANN is first optimized on a validation dataset in terms of the number of hidden neurons to further enhance the prediction accuracy and is then utilized online to estimate the corresponding hourly production of a day on an ''unseen'' test dataset.
The effectiveness of the proposed training strategy-based ANN is examined on a grid-connected solar PV system (264kWp capacity) located in the Applied Science Private University (ASU), Amman, Jordan [4], [16], [18], [39]. Specifically, the accuracy of the power production predictions and the computational times required to develop and evaluate the built-ANNs are verified by resorting to three performance metrics from the literature [4], i.e., the RMSE, the MAE, and the Weighted MAE (WMAE), and to the computational time in minutes, respectively.
For comparison and validation, a single prediction model (i.e., for a fair comparison, an ANN model is considered) developed and optimized, in terms of the number of hidden neurons, globally on the entire dataset is used as a benchmark to verify the effectiveness of the proposed strategy on the ASU solar PV system. Moreover, the ELMs are used instead of the ANNs, and the Persistence prediction model is adopted further to verify the superiority of the proposed training strategy-based ANN.
Therefore, the significant contributions of the present work are two-fold:
• The development of a local training strategy-based ANN for an accurate estimation of the solar PV power productions with short computational times;
• The comparison of the obtained results to those of the global and local training strategies-based ANN and ELM, respectively, as well as to the Persistence prediction model of the literature, to further explore the effectiveness of the proposed local training strategy.
The remainder of this article is organized as follows. In Section II, the work objectives are illustrated, and the problem of predicting the solar PV power output is stated. In Section III, the ASU solar PV system case study is described, and the proposed local training strategy-based ANN is illustrated, also providing an essential background on ANNs. In Section IV, the application of the proposed training strategy-based ANN to the ASU case study is shown, and the obtained results are discussed and compared with those obtained by the global and local training strategies-based ANN and ELM, respectively, as well as with the Persistence model of the literature. Section V investigates the influence of using different hour intervals on the prediction performance. Lastly, conclusions are drawn, and future works are recommended in Section VI.

II. WORK OBJECTIVES
This work aims to develop a data-driven model for an accurate estimation of the power productions of a solar Photovoltaic (PV) system with convenient computational times. We consider the availability of historical weather data (W) and corresponding power production data (P) of the PV system collected during Y years (or D days) (see Fig. 1). The weather data (W) consist of four weather variables collected at hour h, h ∈ [1, 24], of each day during the period Y; they are:
• wind speed (S_h);
• relative humidity (RH_h);
• ambient temperature (Tamb_h); and
• global solar irradiation (Irr_h).
Together with the timestamp and the corresponding power data (P_h), one can establish an overall inputs-output dataset from these data vectors. The timestamp is here represented by the chronological hour (hr) and day number (d) from the beginning of each year of data during the period Y, hr = 1, ..., 8760, d = 1, ..., 365.
We aim to effectively exploit the pre-processed dataset during the development/training stage of a data-driven prediction model for providing accurate predictions of the PV power production with convenient computational times. To this aim, the available dataset is divided into H = 24 datasets. Each dataset is composed of the timestamp and the corresponding weather variables and power productions collected locally at each h-th hour of a day, h ∈ [1, 24]. The datasets are used to develop and optimize H = 24 data-driven prediction models. Feedforward Artificial Neural Networks (ANNs) are employed as prediction models due to their simplicity and the convenient computational efforts they require [40]. Still, the proposed training strategy is general and could be applied to any data-driven ML technique from the literature (e.g., ELMs, SVMs, etc.). Finally, the built prediction models are individually used to predict the h-th power production of the solar PV system. Thus, our contribution entails proposing an intuitive way of handling the available dataset to accurately build/develop a data-driven prediction model, such as the ANN in this work.
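As an illustration, the assembly of the overall inputs-output dataset from the timestamp (hr, d), the four weather variables, and the power production can be sketched as follows. This is a minimal Python sketch (the paper's implementation is in MATLAB), and the column layout chosen here is an assumption:

```python
import numpy as np

def build_dataset(hr, d, S, RH, T_amb, I_rr, P):
    """Stack the timestamp (hr, d), the four weather variables, and the
    power production P column-wise into one inputs-output matrix X.
    The last column is the output; all other columns are inputs."""
    cols = [np.asarray(c, dtype=float) for c in (hr, d, S, RH, T_amb, I_rr, P)]
    n = len(cols[0])
    assert all(len(c) == n for c in cols), "all series must share one length"
    return np.column_stack(cols)  # shape: (N, 7)

# toy example: two hourly records of a single day (values are illustrative)
X = build_dataset(hr=[1, 2], d=[1, 1], S=[3.1, 2.8], RH=[40.0, 42.0],
                  T_amb=[15.2, 15.0], I_rr=[0.0, 120.5], P=[0.0, 18.3])
```

For the full case study, N = 31824 such rows would be stacked (1326 days × 24 hours).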
The proposed local training strategy is expected to provide more accurate power production predictions with short computational efforts, compared to the traditional global training strategy, in which the entire available dataset is used to develop and optimize a single (H = 1) ANN prediction model. Additionally, further comparisons and analyses are carried out to explore the effectiveness of the proposed training strategy-based ANN.

III. MATERIAL AND METHODOLOGY
In this Section, the real data of the solar PV system used in this work, together with the proposed (local) and benchmark (global) training strategies-based ANN, are presented in Section III.A and Section III.B, respectively.

A. THE ASU SOLAR PV SYSTEM DATA
The dataset utilized in this work comprises real weather data, W (i.e., inputs), measured by a weather station located around 172 m away from the Al-Khawarizmi Building (Fig. 2), and the corresponding PV power productions (i.e., output), P (in kW), measured by the inverters of the PV system [39]. This dataset has been collected for Y = 3.625 years (from 16th May 2015 to 31st December 2018) with a time step of 1 hour, from 12 a.m. to 11 p.m. daily, i.e., D = 1326 days with N = 31824 inputs-output patterns.

1) ASU SOLAR PV SYSTEM
The ASU PV system comprises 14 SMA Sunny Tripower inverters (13 inverters with a power of 17kW and 1 inverter with a power of 10kW) connected to Yingli Solar panels (type YL 245P-29b-PC) tilted by 11° and oriented 36° (from S to E). This orientation is chosen to collect as much solar radiation as possible during the day (as depicted in Fig. 3) [39].
The design characteristics of the ASU PV system are reported in Table 1.

2) ASU WEATHER STATION
The ASU weather station (depicted in Fig. 4) is 36 m high and equipped with the latest instruments used to measure 45 different weather variables, such as global solar irradiation, relative humidity, precipitation amounts, wind speeds and directions, barometric pressure, and ambient temperatures collected at various levels from the ground.
Among the 45 weather variables, engineering and professional opinion suggested using the weather variables most highly correlated with the PV power productions as inputs to the prediction model. Those variables, together with the instruments installed for their measurement and their detailed characteristics, are [42]:
• the wind speed at 10 m (S) (in m/s). The wind speed transmitter is used for measuring the horizontal component of the wind speed with high accuracy. The transmitter is equipped with electronically regulated heating to provide smooth running of the ball bearings during winter operation and to prevent the shaft and slot from icing up. The technical specifications of this instrument are reported in Table 2;
• the relative humidity at 1 m (RH) (in %). The hygro-thermo transmitter with a capacitive sensing element is used for measuring the relative humidity with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the humidity sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2;
• the ambient temperature at 1 m (Tamb) (in °C). The hygro-thermo transmitter with an RTD is used for measuring the ambient air temperature with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the temperature sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2; and
• the global solar irradiation (Irr) (in W/m²). The pyranometers are used for measuring the global (total) irradiation on a plane surface with high accuracy. The technical specifications of this instrument are reported in Table 2.
Additionally, the corresponding timestamps (i.e., the number of hours and days from the beginning of each year of data) are also considered as inputs [4], [16], [18], [33], [43]. The remaining variables have been excluded from the analysis.

3) DATA PRE-PROCESSING
For proper utilization of the dataset, it has been pre-processed following the guidelines reported in [4], [16], [18], [33], [43] for the same case study. In particular:
• missing values have been excluded from the analysis;
• negative Irr values and the corresponding missing P values (recognized in the early morning (i.e., 12 a.m.-6 a.m.) and late evening (i.e., 6 p.m.-11 p.m.)) have been set to zero; and
• the overall data have been normalized between 0 and 1.
The pre-processed data exhibit (see Fig. 5): a) limited variability in the parameters' values collected at each h-th hour interval of 1 hour, h ∈ [1, 24], of the days; this justifies the motivation of using solely the collected h-th parameters' values to develop H = 24 different ANN prediction models, each dedicated to estimating the corresponding h-th power production value; and b) extensive and random variability in the parameters' values collected over all hours of the days; typically, an ANN prediction model built on the whole dataset is expected to estimate the PV power productions at each h-th hour interval, h ∈ [1, 24], less accurately than in a).
The pre-processed Y = 3.625 years of inputs-output patterns are appended in an overall matrix X. This matrix will be used later for the development of the ANN and ELM prediction models using both the proposed (local) and the benchmark (global) training strategies, as well as for the implementation of the Persistence prediction model.
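The pre-processing steps above can be sketched as follows. This is an illustrative Python sketch (the paper's implementation is in MATLAB); the column indices for Irr (5) and P (6) follow the column layout assumed earlier and are not specified by the paper:

```python
import numpy as np

def preprocess(X):
    """Apply the pre-processing steps: set negative irradiation readings
    and their (missing) power values to zero, drop remaining patterns with
    missing values, then min-max normalize every column to [0, 1]."""
    X = X.copy()
    neg = X[:, 5] < 0                      # night-time negative Irr readings
    X[neg, 5] = 0.0                        # clip Irr to zero
    X[neg, 6] = np.nan_to_num(X[neg, 6])   # missing P at those hours -> 0
    X = X[~np.isnan(X).any(axis=1)]        # drop remaining incomplete patterns
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0) # guard against constant columns
    return (X - lo) / span
```

Normalizing inputs and output to [0, 1] keeps the scales comparable, which generally helps gradient-based ANN training converge.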

B. METHODOLOGY
In this Section, the proposed local and the benchmark global training strategies-based ANN, implemented with an in-house MATLAB code, are presented in Section B.1 and Section B.2, respectively.

1) THE PROPOSED LOCAL TRAINING STRATEGY-BASED ANN
The proposed training strategy-based ANN is sketched in Fig. 6, and it goes along the following four steps:

Step 1 (Establishing H = 24 Different Datasets): This step entails partitioning the overall available pre-processed dataset (X) into H = 24 different datasets (X_h) of equal size. Each dataset comprises the timestamp, the weather variables, and the corresponding power productions (P_h) collected at each hour interval h (i.e., h ∈ [1, 24/Δh], using an hour interval Δh = 1 hour) during the period Y = 3.625 years. This partitioning is motivated by the fact that the inputs-output patterns collected at different hours and on different days are highly variable. Thus, utilizing solely the data collected at the h-th hour for the estimation of the corresponding h-th power production, in the development of the ANN prediction models, is expected to produce more accurate power production predictions. Additionally, this partitioning will, indeed, reduce the computational effort required by the ANN prediction models to capture the hidden ''unknown'' relationship between the inputs and the output power.
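Step 1 can be sketched as follows. This is an illustrative Python sketch; it assumes the chronological hour hr sits in the first column of X and that hr = 1 corresponds to the first hour of a day:

```python
import numpy as np

def partition_by_hour(X, hour_col=0, H=24):
    """Step 1: split the overall matrix X into H per-hour datasets X_h.
    The chronological hour hr = 1..8760 is mapped to the hour of the
    day h = 1..24 via (hr - 1) mod 24 + 1."""
    hour_of_day = (X[:, hour_col].astype(int) - 1) % H + 1
    return {h: X[hour_of_day == h] for h in range(1, H + 1)}
```

With Δh = 1 hour every pattern lands in exactly one of the 24 datasets, so the partition is of equal size when each day contributes 24 patterns.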
Step 2 (Establishing the Training, Validation, and Test Datasets): Each dataset X_h, h ∈ [1, 24], is randomly divided into training, validation, and test datasets with arbitrary fractions of γ = 50%, β = 20%, and α = 30%, respectively. The motivation for using such fractions is to ensure that the annual seasonality appearing in the datasets is sufficiently captured while developing the ANN prediction models.
To further ensure this while sampling randomly, a Cross-Validation (CV) procedure is employed, as we shall see in Step 3. Thus, other arbitrary fractions could be considered, and the conclusions drawn would, indeed, be the same.
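One CV trial of the day-wise random split in Step 2 can be sketched as follows (an illustrative Python sketch; the paper's code is in MATLAB):

```python
import numpy as np

def split_days(n_days, gamma=0.5, beta=0.2, alpha=0.3, rng=None):
    """Step 2: randomly assign whole days to training (gamma), validation
    (beta), and test (alpha) sets, as in one CV trial. Splitting by days
    (not by single patterns) keeps each day's 24 hourly patterns together."""
    rng = np.random.default_rng(rng)
    days = rng.permutation(n_days)
    n_tr = round(gamma * n_days)
    n_va = round(beta * n_days)
    return days[:n_tr], days[n_tr:n_tr + n_va], days[n_tr + n_va:]
```

For the D = 1326 days of the case study, this yields 663, 265, and 398 days for training, validation, and testing, respectively; repeating the draw 100 times gives the 100 CV trials.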
Step 3 (Build and Optimize the H = 24 ANNs): Each ANN comprises three layers:
• Input layer. It receives the j-th input pattern x_j_h (i.e., the timestamp and the weather variables) and passes it to the hidden layer;
• Hidden layer. It comprises N_h hidden neurons that are used to process the received inputs via a hidden neuron activation function, f1(), and send the processed information to the output layer. In practice, the hidden neuron activation function is a continuous non-polynomial function (e.g., ''Log-Sigmoid'', ''Linear'', ''Radial Basis'', etc.) established to capture the hidden non-linear ''unknown'' mathematical relationship between the inputs and the outputs;
• Output layer. It provides an estimation of the corresponding h-th power production (P̂_j_h) via an output neuron activation function, f2(), which is typically a linear transfer function (''Purlin'') [4], [44].
To adequately define the ANN configurations, different candidate numbers of hidden neurons (n_candidate) are explored, spanning a predefined interval with a step size of 5. For each possible configuration, the h-th power production (whose actual value is P_j_h) is estimated (P̂_j_h), and the Levenberg-Marquardt (LM) error BP learning algorithm is adopted to minimize the mismatch (typically computed as the Mean Square Error (MSE)) between the actual and estimated power productions by exploring different random initializations of the ANN internal parameters (i.e., β_n, w_n, b_n, b_o). The built ANNs (ANN_h, h ∈ [1, 24]) are those whose internal parameters are optimally selected to minimize the MSE on the N_train training inputs-output patterns. The configuration of each built ANN is then selected on the validation dataset in terms of performance metrics, including the:
• Weighted Mean Absolute Error (WMAE) (Eq. (6)). This metric computes the average relative error between the actual (true) and estimated power productions produced by the built ANN models. Similar to the RMSE and MAE metrics, small WMAE values indicate that the predictions are accurate, and vice versa.
In practice, this metric is of interest for comparing the prediction accuracy when the production capacities are changing.

Step 4 (Test the Optimum ANNs): The optimum ANNs, whose configurations are the best selected among all possible combinations as reported in Step 3 and obtained on the validation datasets, are evaluated on the unseen test datasets (X_test_h). The predictability of the optimum ANNs is evaluated using the above-mentioned performance metrics, and their average and standard deviation results calculated over the 100 CV trials are reported.
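The three performance metrics used for validation and testing (RMSE, MAE, and WMAE) can be sketched as follows. The WMAE definition below (total absolute error normalized by the total actual production) is a common form consistent with the description above, but its exact equation (Eq. (6)) is not reproduced here, so treat it as an assumption:

```python
import numpy as np

def rmse(p, p_hat):
    """Root Mean Square Error between actual and estimated productions."""
    p, p_hat = np.asarray(p, float), np.asarray(p_hat, float)
    return float(np.sqrt(np.mean((p - p_hat) ** 2)))

def mae(p, p_hat):
    """Mean Absolute Error between actual and estimated productions."""
    p, p_hat = np.asarray(p, float), np.asarray(p_hat, float)
    return float(np.mean(np.abs(p - p_hat)))

def wmae(p, p_hat):
    """Weighted MAE: total absolute error relative to total actual
    production (assumed form), useful when capacities change."""
    p, p_hat = np.asarray(p, float), np.asarray(p_hat, float)
    return float(np.sum(np.abs(p - p_hat)) / np.sum(p))
```

Smaller values of all three metrics indicate more accurate predictions.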

2) THE BENCHMARK GLOBAL TRAINING STRATEGY-BASED ANN
The benchmark global training strategy-based ANN entails building/training (using a training dataset), optimizing (using a validation dataset), and evaluating (using a test dataset) a single H = 1 ANN prediction model. To this aim, the overall available pre-processed dataset X is divided into the following datasets:
• the training dataset (X_train). It is formed by N_train = 15912 patterns (collected from 663 days, each comprising 24 inputs-output patterns, thus establishing 24 × 663 = 15912 patterns). X_train is used for building/training the single H = 1 ANN model;
• the validation dataset (X_valid). It is formed by N_valid = 6360 patterns (collected from 265 days, each comprising 24 inputs-output patterns, thus establishing 24 × 265 = 6360 patterns). X_valid is used for optimizing the configuration of the single H = 1 ANN model in terms of the number of hidden neurons;
• the test dataset (X_test). It is formed by the N_test = 9552 remaining patterns (collected from 398 days, each comprising 24 inputs-output patterns, thus establishing 24 × 398 = 9552 patterns). X_test is used to evaluate the performance of the optimum single H = 1 ANN model.
In other words, the benchmark global training strategy aims at exploiting the complete inputs-output patterns collected at the different hour intervals of different days for the development of the single ANN prediction model (thus called the ''global'' training strategy).
For a fair comparison with the proposed approach, the 663, 265, and 398 days considered here in each simulation (CV) trial are the same as those that have been obtained (using the arbitrary fractions γ = 50%, β = 20%, and α = 30%) and used in the proposed training strategy for establishing the training, validation, and test datasets, respectively. The simulations are then repeated 100 times, and the ultimate average performance metrics and their standard deviations are reported and compared with those obtained by the proposed training strategy (Section IV).

IV. APPLICATION RESULTS
The application results of the proposed local training strategy-based ANN (Section III.B.1) on the ASU real case study (Section III.A) are here described and compared with the results obtained by the benchmark (Section III.B.2). Further, its effectiveness is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs, and when the Persistence model is used.
Because there is no solar radiation during the early morning and late evening hours, no power production needs to be estimated there, and the reported results are therefore shown solely for h ∈ [7,21]. The optimum configurations of the proposed ANN models are reported in Table 3, together with the best obtained average performance metrics. The results are obtained using the ''Radial basis'' and ''Purelin'' hidden/output neuron activation functions (f 1 () and f 2 (), respectively) and the 100-fold CV procedure. The ''Radial basis'' function takes input values in (−∞, ∞) and produces output values in (0,1) [44], [45], whereas ''Purelin'' transfers the inputs to the outputs without any change [44], [45].
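The two transfer functions named above can be written out explicitly. This is a sketch under the assumption that ''Radial basis'' follows the common radbas form exp(-n²) (which maps any real input into (0, 1]) and that ''Purelin'' is the identity map, consistent with the description in the text.

```python
import math

def radbas(n: float) -> float:
    """Radial-basis transfer function: equals 1 at n = 0 and decays
    towards 0 as |n| grows, so outputs lie in (0, 1]."""
    return math.exp(-n * n)

def purelin(n: float) -> float:
    """Linear (Purelin) transfer function: passes the input through
    to the output without any change."""
    return n
```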
It is worth mentioning that one could follow an exhaustive search procedure to select the best hidden/output neuron activation functions (f 1 () and f 2 (), respectively) among the other activation functions available in the literature [44]. However, to reduce the complexity of the optimization step, the ''Radial basis'' and ''Purelin'' functions are selected in this work for the different ANN models, following the recommendations and guidelines reported in [16] on the same dataset. The number of hidden neurons for each h-th optimum ANN (N opt h ) is selected as the one at which the overall multiplied performance metric is minimized (i.e., min(RMSE h · MAE h · WMAE h )).
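The selection rule above can be sketched as a small grid search. This is an illustration, not the authors' implementation: the helper names and the exact metric definitions are assumptions (WMAE is taken in its usual form, the total absolute error normalized by the total actual production).

```python
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def wmae(y_true, y_pred):
    # Absolute error weighted by the total actual production (assumed form).
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / sum(y_true)

def select_hidden_neurons(candidates, train_and_score):
    """Pick the candidate hidden-layer size minimizing RMSE * MAE * WMAE.

    train_and_score(n) is assumed to train an ANN with n hidden neurons and
    return (y_true, y_pred) on the validation set.
    """
    best_n, best_score = None, float("inf")
    for n in candidates:
        y_true, y_pred = train_and_score(n)
        score = rmse(y_true, y_pred) * mae(y_true, y_pred) * wmae(y_true, y_pred)
        if score < best_score:
            best_n, best_score = n, score
    return best_n
```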
Notice that:
• The optimum number of hidden neurons obtained for each h-th ANN model is, in general, proportional to the level of variability exhibited by the data collected at the corresponding hour interval, h ∈ [7,21]. For example, at h = 7 and h = 12 (small and large data variability, respectively, as depicted in Fig. 5), small and large optimum numbers of hidden neurons are obtained, respectively;
• For a fair comparison, the ''Radial basis'' and ''Purelin'' hidden/output neuron activation functions (f 1 () and f 2 (), respectively) and the 100-fold CV procedure are used.
For clarification purposes, Fig. 8 shows the evolution of the overall multiplied performance metric (RMSE · MAE · WMAE) obtained by the benchmark over the 100-fold CV procedure for the candidate numbers of hidden neurons, n candidate , spanning the considered interval with a step size of 5. One can recognize that the optimum number of hidden neurons is obtained at N opt = 5 (star) and that, as the number of hidden neurons increases, the overall multiplied performance metric increases (i.e., the prediction performance degrades).
Table 4 reports the average performance metrics obtained by the optimum ANN configurations of the proposed model (as reported in Table 3), accompanied by those produced by the optimum ANN configuration of the benchmark model (as shown in Fig. 8), on the entire day hours (h ∈ [7,21]) of the validation datasets (X valid h and X valid , respectively) and training datasets (X train h and X train , respectively) over the 100-fold CV. Results show that the proposed training strategy-based ANN (Section IV.A) outperforms the benchmark training strategy-based ANN (Section IV.B).

C. COMPARISONS AND DISCUSSIONS
The performances of the two models are also verified using the test ''unseen'' datasets. In this regard, Table 5 reports the average performance metrics obtained by the proposed model, accompanied by those produced by the benchmark model, on the entire day hours (h ∈ [7,21]) of the test datasets (X test h and X test , respectively) over the 100-fold CV. To effectively compare the performance metrics of the two prediction models on the test datasets, the Performance Gain (PG Metric ) of each performance metric (Metric) is calculated as per Eq. (7) [4], [17]:

PG Metric = (Metric benchmark − Metric proposed ) / Metric benchmark × 100%  (7)

This gain quantifies the improvement of the proposed model over the benchmark for each of the computed performance metrics [4], [17]. Positive values of the PG calculated for the RMSE, MAE, and WMAE indicate that the proposed approach outperforms the benchmark model. The computational efforts required by the prediction models for their development, validation, and evaluation are reported in Table 5 as well. Looking at Table 5, one can easily recognize that the proposed prediction model significantly outperforms the benchmark. Specifically:
• The proposed approach enhances the prediction performance by ∼25% (RMSE), ∼30% (MAE), and ∼22% (WMAE);
• Additionally, the computational efforts required by the proposed approach are significantly reduced, by ∼40%, as expected;
• Thus, the proposed training strategy boosts the prediction performance beyond that of the benchmark with shorter computational efforts.
Further, Fig. 9 shows the evolutions of the average performance metrics (Fig. 9 (left)) and the corresponding standard deviations (Fig. 9 (right)) at each hour h, h ∈ [7,21], using the proposed (circles) and the benchmark (squares) models computed over the 100-fold CV on the test datasets.
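The performance-gain computation of Eq. (7) amounts to a one-line relative-improvement formula; a minimal sketch, assuming the usual form in which a lower error metric is better:

```python
def performance_gain(metric_benchmark: float, metric_proposed: float) -> float:
    """Relative improvement of the proposed model over the benchmark, in
    percent. Positive values mean the proposed model has a lower (better)
    error metric than the benchmark."""
    return 100.0 * (metric_benchmark - metric_proposed) / metric_benchmark
```

For instance, a benchmark RMSE of 100 kW reduced to 75 kW by the proposed model yields a gain of 25%.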
Looking at Fig. 9 (left), one can recognize that:
• The predictions provided by the two models are comparable. The accuracy benefit of the proposed model over the benchmark (in particular at the early and late day hours) for the three performance metrics is justified by the use of solely the hourly data in building/developing the ANNs for predicting the corresponding hourly power productions, whereas the entire dataset of all day hours is used to build/develop the single benchmark ANN model. For example, at h = 12, the performance metrics obtained by the proposed training strategy (RMSE = 19.05 kW, MAE = 12.52 kW, and WMAE = 0.095) are smaller and, thus, superior to those obtained by the benchmark (RMSE = 20.9 kW, MAE = 14.63 kW, and WMAE = 0.111).
Whereas looking at Fig. 9 (right), one can recognize that:
• the variability (standard deviation) of the three performance metrics obtained by the proposed model is smaller than that obtained by the benchmark. Again, this is because the proposed model exploits solely the data collected at each hour h to predict the corresponding power productions, whereas the benchmark utilizes the entire data collected at all hours to predict the power production at each hour h;
• the difference in variability between the two models is reduced at the middle day hours, as expected, due to the large variability of the data collected at those particular hours and utilized in the proposed model, compared to the early and late day hours.
Further insights into the superiority of the proposed model can be gained from Fig. 10, which shows the average performance metrics computed over the 100-fold CV for each season (i.e., different weather conditions) on the test datasets (Fig. 10 (left)), together with the obtained performance gains (Fig. 10 (right)).
It can be seen that:
• the proposed model (dark shade) provides more satisfactory prediction accuracy of the power productions, i.e., lower metrics, for all seasons compared to the benchmark (light shade) (Fig. 10 (left));
• the highest performance gains achieved by the proposed model over the benchmark for the three metrics are obtained in the Summer season, whereas the lowest are obtained in the Winter season, with almost equal intermediate performance gains in both Autumn and Spring (Fig. 10 (right)). This indicates the superiority of the proposed model in achieving more accurate predictions in the high-production season (Summer) compared to the low-production season (Winter).
For clarification purposes, Fig. 11 shows four examples of the best (Fig. 11 (top)) and worst (Fig. 11 (bottom)) power production predictions obtained by the proposed model (circles) on four different days (one per season), compared with the corresponding predictions obtained by the benchmark model (squares), together with the actual productions (solid lines). The predictions provided by the two models are comparable: the accuracy benefit of the proposed model over the benchmark is justified by the use of solely the hourly data for training the proposed model to predict the corresponding hourly power productions, whereas the complete hourly data are used to train the benchmark model.
For completeness, Table 6 reports the average performance metrics and the corresponding performance gains obtained by the proposed model over the benchmark for these particular days of the four seasons, for one CV trial (CV = 6). One can recognize the superiority of the proposed model over the benchmark for all of the selected days across the four seasons. For example, the most significant enhancement obtained by the proposed training strategy reaches up to ∼58% (RMSE), ∼60% (MAE), and ∼60% (WMAE) for Day 4 (6th June 2015, Summer), whereas the lowest enhancement reaches up to ∼2% (for all three performance metrics) for Day 1 (25th February 2017, Winter). This indicates the capability of the proposed training strategy to enhance the prediction performance across the four seasons, even for its worst predictions.

D. COMPARISONS WITH OTHER PREDICTION TECHNIQUES
In this Section, the effectiveness of the proposed local training strategy with respect to the global training strategy is investigated when other ML techniques are adopted (refer to Fig. 6). Mainly, Extreme Learning Machines (ELMs) are employed as prediction models instead of the ANNs. In addition, the choice of the ANNs is justified by comparing the prediction performances obtained using the ANNs to those obtained using the ELMs. Further, for completeness, the prediction performances obtained using the ANNs and the ELMs are compared to the well-known Persistence prediction model from the literature.
The ELM, initially developed by [46], is a learning algorithm for single-hidden-layer neural networks. Similar to the ANN architecture, the ELM comprises an input layer, a hidden layer consisting of N h hidden neurons, and an output layer. The idea underpinning the ELM is two-fold: i) it randomly chooses the input parameters (i.e., weights and biases) of the hidden neurons, instead of using the iterative (traditional) Back-Propagation learning algorithm, and then ii) it determines the output weights analytically. Applications of the ELM in different industrial fields show that it has good generalization capability and requires low computational efforts [46].
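The two-step idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this work: the class name, random initialization scheme, and use of the Moore-Penrose pseudo-inverse for the analytic output weights are assumptions; the radial-basis activation mirrors the one adopted for the ANNs.

```python
import numpy as np

class ELM:
    """Minimal Extreme Learning Machine sketch: random, untrained hidden
    layer; output weights fitted in closed form (no back-propagation)."""

    def __init__(self, n_hidden: int, seed: int = 0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random projection followed by the radial-basis activation exp(-n^2).
        return np.exp(-(X @ self.W + self.b) ** 2)

    def fit(self, X, y):
        # Step i): draw the hidden-layer weights and biases at random.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Step ii): determine the output weights analytically via the
        # Moore-Penrose pseudo-inverse of the hidden-layer output matrix.
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

Because fitting reduces to a single pseudo-inverse, training is much cheaper than iterative back-propagation, which is the computational advantage claimed for the ELM.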
The Persistence model [47] for PV power production prediction is an intuitive and straightforward approach commonly used as a benchmark for evaluating the effectiveness of any proposed prediction technique. Basically, it assumes that the PV power production at time h, h ∈ [1,24], of the next day will be the same as the present PV power production collected at the same time h.
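With an hourly series, the Persistence assumption amounts to repeating the last observed day. A minimal sketch (the function name and list-based representation are assumptions):

```python
def persistence_forecast(hourly_series, horizon_hours=24):
    """Predict the next `horizon_hours` values by repeating the most
    recent `horizon_hours` observations, i.e., tomorrow's production at
    hour h equals today's production at the same hour h."""
    if len(hourly_series) < horizon_hours:
        raise ValueError("need at least one full day of history")
    return list(hourly_series[-horizon_hours:])
```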
Both the proposed (local) and the benchmark (global) training strategies are employed for developing H = 24 and H = 1 ELMs, respectively, following the steps reported in Section III.B. For the two training strategies, the ELMs are built using the training datasets (X train h and X train , respectively), optimized in terms of the number of hidden neurons using the validation datasets (X valid h and X valid , respectively), and evaluated and compared using the test datasets (X test h and X test , respectively). It is worth mentioning that different numbers of hidden neurons are considered and examined: n candidate = [25, 50, 100, 500, 900, 1300, 1700, 1900]. In addition, the ''Radial basis'' function is used as the hidden neuron activation function [16]. Table 7 reports the average performance metrics obtained by the proposed (local) training strategy-based ELM, together with those obtained by the benchmark (global) training strategy-based ELM, on the entire day hours (h ∈ [7,21]) of the test datasets (X test h and X test , respectively) over the 100-fold CV. Looking at Table 7, one can easily recognize that:
• the utilization of the local training strategy for the development of the ELMs largely enhances the solar PV power production predictions compared to the global training strategy. Specifically, enhancements reach up to ∼34%, ∼36%, and ∼30% for the RMSE, MAE, and WMAE, respectively;
• the predictability obtained using the ELMs is slightly lower than that obtained with the ANNs. Future works can be devoted to enhancing the prediction model embedded within the proposed local training strategy.
In addition, Fig. 12 shows the evolutions of the average performance metrics (Fig. 12 (left)) and the corresponding standard deviations (Fig. 12 (right)) at each hour h, h ∈ [7,21], using the proposed (local) training strategy-based ANN (circles), the proposed (local) training strategy-based ELM (squares), and the Persistence model (diamonds) computed over the 100-fold CV on the test datasets. Looking at Fig. 12 (left), one can recognize that:
• The predictions provided by the three models are comparable. Specifically, the Persistence model seems to slightly/largely outperform the proposed (local) training strategy-based ANN/ELM, respectively, at the early morning (h = 7, 8, 9, 10) and late evening hours (h = 18, 19, 20, 21). This can be justified by the fact that at those hours the variability of the weather conditions is small, which makes the intuitive assumption of the Persistence model valid (i.e., the PV power production at time h, h ∈ [1,24], of the next day equals the present PV power production collected at the same time h). However, the performance of the Persistence model notably degrades (the RMSE, MAE, and WMAE increase) at the middle day hours (h ∈ [11,17]) with respect to the performances obtained using both the ANN and the ELM. This is due to the large variability of the weather conditions experienced by the ASU PV plant at those hours;
• The proposed (local) training strategy-based ANN provides more accurate power predictions throughout the whole day hours than the proposed (local) training strategy-based ELM.
Whereas looking at Fig. 12 (right), one can recognize that:
• the variability (standard deviation) of the three performance metrics obtained by the Persistence model is the smallest among the three models. This is due to the intuitive operation of the Persistence model, which considers solely the data collected at each hour h to predict the corresponding power production;
• the variability obtained with the ANN is lower than that obtained with the ELM, in particular at the middle day hours. This confirms the effectiveness of the proposed (local) training strategy-based ANN in providing accurate solar PV power production predictions with small variability (i.e., tight confidence bounds).

V. INFLUENCE OF USING DIFFERENT HOUR INTERVALS FOR DATASET PARTITIONING ON THE PREDICTION PERFORMANCE
In this Section, the influence of using different hour intervals, Δh, for partitioning the overall available pre-processed dataset (X) into H different datasets on the predictability of the ASU power production is investigated. Specifically, three different hour intervals are considered in this work; they are:
• Δh = 2 hours. This interval entails partitioning the dataset (X) into H = 24/Δh = 12 different datasets and, thus, H = 12 ANNs will be built, optimized, and evaluated;
• Δh = 3 hours. This interval entails partitioning the dataset (X) into H = 24/Δh = 8 different datasets and, thus, H = 8 ANNs will be built, optimized, and evaluated;
• Δh = 4 hours. This interval entails partitioning the dataset (X) into H = 24/Δh = 6 different datasets and, thus, H = 6 ANNs will be built, optimized, and evaluated. Each dataset represents the timestamps, weather variables, and corresponding power productions collected at each hour interval h during the Y = 3.625 years, h ∈ [1,6].
Once the overall dataset (X) is partitioned using the three considered hour intervals, i.e., Δh = 2, 3, and 4 hours, into H = 12, 8, and 6 different datasets (X h ), respectively, the proposed training strategy is applied following the steps illustrated in Section III.B (Fig. 6). Fig. 13 shows the evolutions of the average performance metrics at each hour h, h ∈ [7,21], using the proposed (circles) and the benchmark (squares) prediction models (as depicted in Fig. 9), and the proposed training strategy-based ANN using the three different hour intervals, i.e., Δh = 2, 3, and 4 hours (diamonds, triangles, and stars, respectively), computed over the 100-fold CV on the test datasets.
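The Δh-based partitioning above can be sketched as grouping the hourly patterns into H = 24/Δh buckets of Δh consecutive hours each, with one model later trained per bucket. A minimal illustration (the function name and dict-of-lists representation are assumptions):

```python
def partition_by_interval(patterns_by_hour, delta_h):
    """Group hourly pattern sets into H = 24 / delta_h datasets.

    patterns_by_hour : dict mapping hour (1..24) -> list of patterns
    delta_h          : interval size in hours; must divide 24
    Returns a list of H buckets, where bucket k collects the patterns of
    hours k*delta_h + 1 .. (k+1)*delta_h.
    """
    if 24 % delta_h != 0:
        raise ValueError("delta_h must divide 24")
    H = 24 // delta_h
    buckets = [[] for _ in range(H)]
    for hour, patterns in patterns_by_hour.items():
        buckets[(hour - 1) // delta_h].extend(patterns)
    return buckets
```

With Δh = 1 this reduces to the proposed local strategy (H = 24), while Δh = 24 would collapse everything into the single global benchmark dataset (H = 1).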
For more clarification, Fig. 14 shows the average performance metrics obtained by the proposed training strategy-based ANN with the different hour intervals (light shade) accompanied by those produced by the benchmark training strategy-based ANN (dark shade) on the entire day hours (h ∈ [7,21]) of the test datasets (X test h and X test , respectively) over the 100-fold CV. The analysis of Fig. 13 and Fig. 14 concludes that:
• The predictions provided by the proposed (local) and the benchmark (global) training strategies-based ANN are comparable. Specifically, the accuracy benefit obtained by the proposed training strategy with the different hour intervals over the benchmark (in particular at the early and late day hours) for the three performance metrics is still justified by the use of the local hourly data for building/developing the ANNs for predicting the corresponding local hourly power productions;
• The prediction performance shifts towards that of the global training strategy (i.e., the benchmark with H = 1) when large hour intervals are used (e.g., Δh = 4 hours) and, thus, fewer ANN models are established.
For completeness, Fig. 15 shows the computational efforts, in minutes, required by the proposed and the benchmark strategies during the training, optimization, and evaluation phases. One can notice that the computational efforts needed by the proposed approach using the three different hour intervals (Δh = 2, 3, and 4 hours) and using the suggested hour interval (Δh = 1 hour) are mostly less than those required by the benchmark model. In fact, the variable computational efforts of the local training strategy are due to the differences in the number of ANN models (H) established and in the amount of data used to train and optimize the H models.
To conclude, considering the hour-by-hour variation (i.e., the 24-hour seasonality pattern of each day) while building/training a data-driven prediction model is shown to be beneficial in enhancing the predictability of the solar PV power productions while reducing the computational efforts required by the adopted model compared to the traditional global training strategy. In practice, these enhancements are valuable for balancing power supply and demand across centralized grid networks through economic dispatch decisions among the energy sources.

VI. CONCLUSION AND FUTURE WORKS
In this work, a local training strategy-based Artificial Neural Network (ANN) is proposed to enhance the prediction of solar PV power productions with short computational efforts. Specifically, the proposed training strategy is local in the sense that solely the timestamp, weather variables, and corresponding power productions collected at each hour interval h of size Δh = 1 hour, h ∈ [1,24], are used to build, optimize, and evaluate H = 24 ANN prediction models, each used for estimating the h-th hour power production. The proposed strategy is validated on a solar PV system of the Applied Science Private University (ASU) located in Amman, Jordan. The effectiveness of the proposed strategy is evaluated against a state-of-the-art ANN model built and optimized using the entire available dataset. Three performance metrics are used for the comparisons, namely the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and the Weighted MAE (WMAE), in addition to the computational times (Time) measured in minutes. Results show that the proposed training strategy-based ANN outperforms the benchmark training strategy-based ANN, with performance gains reaching up to 25% (RMSE), 30% (MAE), 22% (WMAE), and 40% (computational training and test times). Further, the effectiveness of the proposed training strategy-based ANN is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs and when the Persistence prediction model is used. Lastly, the proposed training strategy-based ANN is evaluated in terms of i) different weather conditions (i.e., seasons), to further confirm its superiority over the benchmark training strategy-based ANN, and ii) different hour intervals (i.e., Δh = 2, 3, and 4 hours) used for partitioning the overall dataset and, thus, establishing the different ANNs (i.e., H = 12, 8, and 6 models, respectively).
Future works can be devoted to the application of deep learning techniques, e.g., Long Short-Term Memory (LSTM) networks and/or Echo State Networks (ESNs), instead of the employed ANN to further enhance the prediction performance.

His current research interests include the development of analytics and models for prognostics and health management (PHM), operation, maintenance and reliability, availability, maintainability and safety (RAMS) analysis of engineering systems, and the development of artificial intelligence (AI)-based methods for renewable energy production prediction. His research interests also include renewable energy systems and mechanical engineering fields, such as thermal sciences, heating, ventilation and air-conditioning (HVAC), and others.

MOHAMED LOUZAZNI received the B.Sc. degree in electronics from the Faculty of Science, Ibn Tofail University, the M.Sc. degree in electronics and telecommunication from the Faculty of Science, Abdelmalek Essaâdi University, and the Ph.D. degree in electrical and renewable energy engineering from the National School of Applied Science, Tangier. He was a visiting Ph.D. student with the Faculty of Electrical Engineering, University Polytechnic of Bucharest, in 2015. In 2017, he started his research at the Solar Tech Laboratory, Energy Department, University Polytechnic of Milan, Italy. In 2018, he was a Ph.D. visitor at the Higher Polytechnic School of Algeciras, University of Cadiz, Spain, for five months. He is currently a Researcher with Abdelmalek Essaâdi University. His research interests include mathematical modeling, optimization, meta-heuristic algorithm, computational intelligence, photovoltaic and power energy, forecasting, fuel cell, GPR, radio frequency, electromagnetic, and electronic.
NAHED OMRAN received the B.Sc. degree in mechanical engineering from the Faculty of Engineering Technology, Al-Balqa Applied University, Amman, Jordan, in 2015. She is currently a full-time Teacher Assistant with Applied Science University, Jordan. She is also working as a Renewable Energy Engineer with the Renewable Energy Center, ASU. Her research interests include PV panels' technologies, machine learning applications in renewable energy, and thermal science.
VOLUME 8, 2020