Modeling and Fault Categorization in Thin-Film and Crystalline PV Arrays Through Multilayer Neural Network Algorithm

Categorization of PV faults is an essential task for improving the efficiency and reliability of a photovoltaic (PV) system. The output characteristics of a PV system can be severely affected under various fault conditions, including short circuit, module mismatch, open circuit, and multiple faults under shading conditions. Such faults can be analyzed through PV characteristic curve analysis using a multilayer neural network trained with the scaled conjugate gradient (SCG) algorithm. This paper presents an extensive investigation of the categorization, i.e., classification, of the above-mentioned PV faults using the SCG algorithm. The major contribution of the presented research work is the categorization of PV faults into sixteen different classes, considering polycrystalline and thin-film PV technologies with two different configurations, series-parallel (SP) and total cross-tied (TCT). The fault classification is achieved with a high accuracy of 99.6% and a fast computation time of 0.08 s. The results are validated through confusion matrix and receiver operating characteristic (ROC) plots, with their performance evaluated in MATLAB. The achieved accuracy and fast computation time demonstrate the effectiveness of the multilayer neural network-based approach for classifying PV faults to increase the power output, efficiency, and lifespan of PV systems.


I. INTRODUCTION
A massive body of research has focused on the advancement of solar photovoltaic (PV) technology, with the aim of improving its efficiency and reducing the high variability that results from its non-linear nature and strong dependence on external atmospheric conditions [1]. PV systems are highly sensitive to weather conditions such as thunderstorms, rain, humidity, high ambient temperatures, and non-uniform shading, all of which can severely impact the output characteristics of PV arrays, which are connected in various configurations to harness the maximum power output [2]. Under uncertain environmental conditions, the various interconnections of PV modules can develop severe internal faults, such as open circuits, short circuits, and hotspot heating, which can lead to a blackout of the entire PV system [3]. Therefore, early detection and timely diagnosis of faults are necessary to prevent excessive power losses and to ensure reliable operation of PV arrays.
The associate editor coordinating the review of this manuscript and approving it for publication was Zhehan Yi.
In this regard, the classification of PV faults has been reported in the literature for the proper diagnosis of faults and to increase the life span of a PV power system. Different fault detection techniques for recognizing PV array faults have been reported, such as time domain reflectometry [3] and earth capacitance measurement [4]. An online fault diagnostic technique based on infrared imaging is used for fault identification in [5]; it differentiates faulty PV modules from normal modules through the change in apparent temperature. The authors in [6], [8], and [9] attempted to determine the location and type of fault by collecting data from current and voltage sensors installed on small PV arrays, but those techniques do not effectively detect faults in large PV systems. Time domain reflectometry (TDR) is an effective technique for PV fault diagnosis in series-connected PV farms. However, it requires a precision instrument for signal analysis to diagnose the fault accurately, and it can only be applied to series-connected PV modules, as presented in [6].
As PV arrays are mostly connected in various series-parallel configurations to extract the required amount of power, the techniques presented in the above-cited works cannot diagnose faults in series-parallel configurations of PV arrays. Many investigations have been conducted to analyze environmental impacts, such as severe shading, on various series-parallel topologies [7]-[10]. It is known that the impact of a fault can vary with the form of interconnection and the type of PV technology. The impact of different PV array faults on different configurations, including honey comb (HC), total cross-tied (TCT), and bridge-linked (BL), has been investigated in [11]-[16] with consideration of thin-film and monocrystalline PV arrays. The authors in [16] restrict their investigation of open circuit, short circuit, and shading faults to a reasonably small 6 × 6 PV array. Power loss minimization has been achieved in [11] with thin-film PV technology under various faulty scenarios, with power-voltage (P-V) curve analysis only. From the given literature, it is found that there is room to investigate the impact of different PV faults on different PV materials in larger interconnected PV arrays through analysis of both P-V and I-V curves. It is pertinent to mention that the I-V curve is considered more crucial for the recognition and classification of PV faults in a PV system.
In fact, faults in an interconnected PV array are challenging to diagnose due to their unpredictability and non-linearity. Artificial neural networks (ANN) can, fortunately, characterize the relationship between input states and expected output through their complex structure, connecting weights, biases, and thresholds. The probabilistic neural network (PNN) has been used in [12], where it classifies different PV faults with 85% precision. Accuracy of the computed results is necessary for the proper diagnosis and categorization of such faults. It is believed that fault classification through ANN can be a good solution in terms of accuracy, fast computation, and ease of use [13]. In [14], a neural network model was designed to predict the output power of PV modules. A neural network (NN) was used for categorization of different PV faults in [15], but none of the studies has categorized and classified diverse faults of TCT, BL, and SP interconnected PV arrays in different PV materials with high precision and accuracy through ANN, as summarized in Table 1. In particular, classification of multiple faults in thin-film and crystalline PV technologies remains unexplored in the literature.
To bridge the aforementioned research gap, the presented research work investigates various topologies, including SP, TCT, and BL, and gives a detailed analysis through I-V and P-V characteristic curves. Two different PV technologies, thin-film and polycrystalline, have been considered for this analysis. An extensive input data set of 5 × 1248 has been developed by computing and analyzing the change in the output of the considered interconnected PV arrays. The performance of the PV arrays is analyzed with classification and recognition of the PV faults through the backpropagation algorithm of neural networks. The adopted procedure recognizes all the faults with high accuracy and categorizes the faults with respect to different PV materials. The contributions of this research work are as follows:
1. Different types of faults in PV arrays are analyzed on thin-film and crystalline 9 × 7 PV arrays to investigate the faults' impact on the characteristic curve of a PV system. The combined impact of faults is also analyzed on TCT, BL, and SP interconnections of the PV array.
2. All faults are categorized and classified through a multilayer neural network with a high accuracy of 99.6% and a computation time of 0.08 s, which is faster than that reported in the literature [12]. This research has also classified all simulated faults in the PV array while differentiating between PV materials, including thin-film and crystalline technology, which is not reported in the literature.
3. PV faults are categorized for two different PV configurations, SP and TCT, through neural networks with a high accuracy of 99.6%, which is not achieved in the previous research works.
4. An extensive input data set of 5 × 1248 is collected through analysis and computation of the parameters of thin-film and polycrystalline PV arrays. Five input parameters are considered, each having 1248 samples, for the classification of faults through the NN algorithm with a high accuracy of 99.6%.
5. The results are validated by plotting the best validation performance, the confusion matrix, and the receiver operating characteristic (ROC) analysis for the classification of faults in dissimilar PV technologies, i.e., thin-film and crystalline.
Thermal imaging for feature extraction is used in [17] with an NN as the classifier for fault detection in PV modules, which achieved 92.8% overall accuracy. A comparison of the NN classifier with conventional classifiers such as K-nearest neighbor (KNN) and support vector machine (SVM) is also made in [17]: SVM and KNN achieved accuracies of 80.3% and 56.8%, respectively, while the NN classifier performed better with an overall accuracy of 92.8% in classifying faults in the PV module. Accurate detection of shading faults in PV systems is performed through principal component analysis (PCA) with an average accuracy of 97% in [18]. A convolutional NN-based approach is used in [19] to extract features from scalograms and to classify faults with 73.5% accuracy, without considering different PV materials. PV faults were categorized with 99% accuracy in [16] without consideration of the PV material or the configuration of the PV array. Classification of open and short circuit faults in a PV array through PNN is achieved in [20] with an accuracy of nearly 98%. Diagnosis of various faults has been conducted through various methods, including I-V measurements with machine learning techniques and a multiclass exponential loss function (SAMME-CART), in [21]-[24], where over 95% average accuracy is achieved in the classification of each fault. None of the previous research works has classified PV module mismatch, open and short circuit, and multiple faults under shading conditions in different PV materials and configurations with 99.6% accuracy and fast computation time [26]-[32].
The rest of the article is organized as follows: mathematical modeling of the developed system is explained in Section-II. The SCG algorithm is explained in Section-III. Obtained results and computations are detailed in Section-IV. The conclusion is given in Section-V.

II. SYSTEM MODELING
Two different PV arrays are considered for the classification of faults: p-n hetero-junction (thin-film; amorphous silicon) and p-n homo-junction. The five-parameter model is selected over the seven-parameter model due to its higher accuracy for fault analysis in crystalline PV arrays. As depicted in Figure 1, a single PV cell is modeled as a current source ($I_L$) connected in parallel with one diode [11]. The shunt resistance and series resistance are denoted by $R_{sh}$ and $R_s$, respectively, and $n$ is the ideality factor. $I_{sc}$ denotes the short-circuit current. $T_R$ represents the working temperature, while $T_{STC}$ is the temperature at standard test conditions (STC), i.e., 25 °C. The temperature coefficients of $I_{sc}$ and $V_{oc}$ are denoted by $k_i$ and $k_v$, respectively. The irradiance at STC is denoted by $G_{STC}$. $N_{sr}$ and $N_{pa}$ are the total numbers of series and parallel PV cells, respectively. The semiconductor's energy bandgap is denoted by $E_{go}$. The output current of the PV module is represented by $I_{output}$, as given in Eq. (1).
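For orientation, the single-diode five-parameter model is commonly written in the implicit form below. This is the standard textbook expression, offered only as a sketch: the symbols $I_0$ (diode saturation current) and $V$ (terminal voltage) are assumptions not defined in the text above, and the exact form of Eq. (1), in particular how $N_{sr}$ and $N_{pa}$ scale the terms, may differ.

```latex
I_{output} = I_L
  - I_0 \left[ \exp\!\left( \frac{V + I_{output} R_s}{n V_t} \right) - 1 \right]
  - \frac{V + I_{output} R_s}{R_{sh}}
```

The five parameters in this form are $I_L$, $I_0$, $n$, $R_s$, and $R_{sh}$; because $I_{output}$ appears on both sides, the equation is usually solved numerically.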
Thermal voltage, open-circuit voltage, and short-circuit current are denoted by $V_t$, $V_{oc}$, and $I_{sc}$, respectively, as given in Eqs. (2) and (3) [33], [34]. $I_{STC}$ and $V_{STC}$ are the current and voltage at STC, while $G_R$ is the irradiance.
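The dependence of these quantities on temperature and irradiance can be illustrated with a short Python sketch. The linear temperature corrections and the proportional irradiance scaling used here are commonly used approximations and are an assumption; they are not necessarily identical to Eqs. (2) and (3). The function names, the default $G_{STC}$ of 1000 W/m², and the numerical values in the usage lines are hypothetical.

```python
# Physical constants (exact SI values)
BOLTZMANN = 1.380649e-23          # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_c, n_series=1):
    """Thermal voltage V_t = N_sr * k * T / q, with T converted to kelvin."""
    return n_series * BOLTZMANN * (temp_c + 273.15) / ELECTRON_CHARGE


def short_circuit_current(i_stc, k_i, temp_c, irradiance,
                          t_stc=25.0, g_stc=1000.0):
    """Assumed form: I_sc scales with irradiance and shifts linearly with temperature."""
    return (i_stc + k_i * (temp_c - t_stc)) * irradiance / g_stc


def open_circuit_voltage(v_stc, k_v, temp_c, t_stc=25.0):
    """Assumed form: V_oc shifts linearly with temperature (k_v is usually negative)."""
    return v_stc + k_v * (temp_c - t_stc)


# Hypothetical module values, for illustration only
print(thermal_voltage(25.0))
print(short_circuit_current(8.0, 0.004, 35.0, 500.0))
print(open_circuit_voltage(40.0, -0.1, 35.0))
```

Under these assumptions, a shaded and heated module (lower irradiance, higher temperature) shows a lower short-circuit current and a lower open-circuit voltage, which is the qualitative behavior the fault scenarios rely on.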
Homojunction cells follow the superposition theorem. The authors in [23] adopted an analytical model of a heterojunction solar array, which computes the current and voltage parameters of a PV cell, denoted by $J$ and $V$, respectively. The a-Si cells use a triple-layer p-i-n structure having thick p-n layer and an i-layer nearly 1 micrometer thick. Voltage-dependent charge collection is the dominant charge collection mechanism. An analytical expression for the voltage-dependent photocurrent is obtained in [23] to describe the current ($J$) versus voltage ($V$) characteristics of a thin-film cell. The total current density is given in Eq. (4), where $J_d$ and $J_L$ represent the forward diode current density and the photocurrent density, respectively. The total photocurrent density $J_L(\lambda, V)$ is the sum of the current density of the carriers drifting towards the bottom contact, $J_b(\lambda, V)$, and that of the carriers drifting towards the top contact, $J_t(\lambda, V)$, as given in Eq. (5) and Eq. (6). These expressions involve a normalized absorption depth, together with $\tau_b$ and $\tau_t$, the normalized carrier terms for carriers drifting towards the bottom and top contacts, respectively. The total photo-generated current density is found by integrating over all incident photon wavelengths of the spectrum, as given in Eq. (7). The modeling of both PV materials is explained because their different material compositions lead to different behavior under fault conditions.
A. DEVELOPED FAULTS IN PV ARRAY
All developed faults are analyzed in three configurations: BL, SP, and TCT. SP is a widely used interconnection of PV arrays. The BL interconnection is a type of SP interconnection with additional internal connections arranged in a bridge-like formation, while in the TCT interconnection the PV modules are tied together, as shown in Figure 2.
A Simulink model of a 9 × 7 PV array under electrical faults is developed to study the performance of the faulted PV array, as shown in Figure 3. Four different fault scenarios are analyzed in this study: module mismatch (F1), short circuit (F2), open circuit (F3), and the combined impact of faults in a multiple-fault scenario (F4).
Module mismatch fault (F1): The module mismatch fault is developed by applying non-uniform irradiance to the designed 9 × 7 PV array, as shown in Figure 4. The temperature of the PV components due to heating and non-uniform irradiance is also analyzed by applying 35 °C (above the STC value of 25 °C) to the first and second parallel strings, and 38 °C to the fourth, fifth, sixth, and seventh parallel PV strings. A 9 × 7 PV array is treated as a non-square matrix because its number of rows 'm' and number of columns 'n' are unequal.
In this matrix, element '11' denotes row '1' and column '1', i.e., PV module '1' in column '1' (the first parallel string), and '31' refers to the third PV module in the first parallel string. A sudden decrease in current and voltage is encountered due to non-uniform shading, which indicates a severe impact of low irradiance and high temperature on the PV array. The change in current and voltage reduces the power significantly. The impact on system performance is less severe with thin-film PV technology because the current loss is smaller.
Short circuit fault with bypass diode failure (F2): This fault case is created by introducing a short circuit fault between modules in the PV string. Bypass diode failure is also analyzed by developing a short circuit fault across the diode connection. After the short circuit fault occurs in the PV array, the drop in peak voltage and peak current reduces the power output significantly. The use of thin-film PV can improve the system's performance, but power loss still occurs in both PV arrays. The occurrence of a short circuit fault is indicated by a significant change in the current, voltage, and power of the PV system.
Open circuit fault (F3): This fault appears when a loose connection leads to module disconnection, and it is analyzed in terms of the decrease in current, voltage, and power of the PV array.
Multiple faults (F4): All developed faults are analyzed under shading to investigate the combined effect of PV faults on the adopted interconnected PV arrays. A sudden decrease in current and voltage indicates the occurrence of faults under low-irradiance and high-temperature conditions. The changes in PV parameters after the occurrence of each fault are detailed in the results section. Samples of the affected PV parameters, such as current, voltage, power, irradiance, and temperature, are also collected for classification of PV faults.
The characteristic parameters of a PV array, such as irradiance, temperature, short-circuit current, open-circuit voltage, and peak power, are analyzed under the different fault scenarios, and a novel data set of 5 × 1248 is developed by analyzing and computing 1248 samples of 5 inputs. The proposed method classifies faults into sixteen different classes by applying the SCG algorithm of a multilayer neural network. To classify the faults accurately, feature quantities must be selected that characterize and recognize the fault signal accurately. The 5 × 1248 input data set corresponds to 5 PV parameters (the features of the fault signal used for fault categorization), each with 1248 samples that characterize the different classes of faults. A 16 × 1248 target data set corresponds to 1248 associated class vectors, which define the one of sixteen classes to which each input is assigned. A flow chart of the adopted method is depicted in Figure 5. The multilayer NN uses SCG as the training algorithm for the categorization of faults; it updates the weight vectors after comparing the output with the 16 × 1248 target data set. It computes the minimum global error, which is used for accurate classification of faults through computation of the confusion matrix. The cross-entropy (CE) is computed to check the performance. The algorithm successfully classifies the faults in TCT and SP interconnected PV arrays, with categorization by thin-film and crystalline PV array, as shown in Figure 6.
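The shapes of the input and target data sets described above can be made concrete with a minimal Python sketch. The one-hot encoding of the class vectors is an assumption about how the 16 × 1248 target matrix is built, and the random arrays are placeholders that stand in for the simulated PV measurements, which are not reproduced here.

```python
import numpy as np

N_FEATURES = 5    # irradiance, temperature, I_sc, V_oc, peak power
N_CLASSES = 16    # fault classes
N_SAMPLES = 1248  # samples per feature


def one_hot_targets(labels, n_classes=N_CLASSES):
    """Build an (n_classes x n_samples) target matrix with a single 1 per column."""
    labels = np.asarray(labels)
    targets = np.zeros((n_classes, labels.size))
    targets[labels, np.arange(labels.size)] = 1.0
    return targets


# Placeholder data only: random features and random class labels
rng = np.random.default_rng(0)
inputs = rng.random((N_FEATURES, N_SAMPLES))
labels = rng.integers(0, N_CLASSES, size=N_SAMPLES)
targets = one_hot_targets(labels)

print(inputs.shape, targets.shape)
```

Each column of `targets` marks the class assigned to the corresponding column of `inputs`, which matches the column-per-sample layout implied by the 5 × 1248 and 16 × 1248 dimensions.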

III. SCG ALGORITHM OF NEURAL NETWORK
The scaled conjugate gradient (SCG) algorithm, a fast supervised learning algorithm, is used to train the multilayer network for PV fault classification. It adjusts the weights starting from the steepest descent direction (the negative of the gradient) and uses a step-size scaling mechanism that avoids a line search at each learning iteration. A backpropagation method is used in this study with SCG as the training algorithm. The network consists of three layers: an input layer, a hidden layer, and an output layer. Data propagate forward from the input layer to the output layer, which is known as forward propagation. A total of five inputs, each having 1248 samples, propagate to a hidden layer of 10 neurons, and the weights and biases are updated at each iteration. The five inputs used in the input layer are irradiance, temperature, open-circuit voltage, short-circuit current, and peak power. The mean squared error (MSE) is computed from the output of the developed network, based on the difference between the predicted and actual outcomes. The derivative of the error is computed with respect to each weight in the network so that the error can be backpropagated and the model updated to reduce it. The same process is repeated multiple times to learn suitable weights. The output layer has sixteen outputs, one per fault class. Each layer uses the tangent sigmoid as its activation function. The sigmoid function used for classification of faults constrains the output between 0 and 1. The error is computed by comparing the estimated output with the real output, and the biases and weights are then updated accordingly. Error values close to 0 are desirable for estimating the correct output. The cross-entropy is a loss function for classification problems. The process is repeated for a specified number of iterations or until the minimum global error is achieved.
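The forward propagation through the 5-10-16 structure and the cross-entropy computation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the network used in the study: the hidden layer uses `tanh` as a stand-in for the tangent sigmoid, the output layer uses a softmax so that the sixteen outputs lie between 0 and 1 and sum to 1, and the weight initialization is arbitrary.

```python
import numpy as np


def init_params(n_in=5, n_hidden=10, n_out=16, seed=0):
    """Small random weights and zero biases (arbitrary initialization)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_hidden, n_in)),
        "b1": np.zeros((n_hidden, 1)),
        "W2": rng.normal(0.0, 0.1, (n_out, n_hidden)),
        "b2": np.zeros((n_out, 1)),
    }


def forward(params, x):
    """Forward pass for inputs x of shape (n_in, n_samples)."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    logits = params["W2"] @ hidden + params["b2"]
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=0, keepdims=True)  # softmax (assumed output layer)


def cross_entropy(probs, targets, eps=1e-12):
    """Mean cross-entropy over samples; targets are one-hot columns."""
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=0))


# Placeholder batch of 8 samples with hypothetical class labels
rng = np.random.default_rng(1)
x = rng.random((5, 8))
t = np.zeros((16, 8))
t[rng.integers(0, 16, size=8), np.arange(8)] = 1.0
p = forward(init_params(), x)
print(p.shape, cross_entropy(p, t))
```

With untrained weights the outputs are close to uniform over the sixteen classes, so the cross-entropy starts near ln(16); training reduces it towards 0 as the predicted class probabilities approach the targets.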
The SCG algorithm is shown in Figure 7. Let $\tilde{w}$ be the weight vector and $\tilde{E}$ the global error function. $\tilde{E}$ may be any appropriate error function that can be calculated with one forward pass, with its gradient $\tilde{E}'$ calculated with one forward and one backward pass, where $p$ is the number of patterns during training and $E_p$ is the error. The term $\tilde{s}_k = \tilde{E}''(\tilde{w}_k)\tilde{p}_k$ is estimated with a non-symmetric approximation, as in Eq. (4) [18].
The approximation tends to the true value of $\tilde{E}''(\tilde{w}_k)\tilde{p}_k$. The complexity in computation is $O(3N^2)$ and $O(N)$, and this is combined with the conjugate gradient (CG) approach to obtain fast computation and higher accuracy [18]. In this algorithm, the CG approach is combined with the model-trust-region approach known from the Levenberg-Marquardt algorithm. $\tilde{s}_k$ carries the second-order information and is computed as given in Eq. (5).
If the term $\delta_k \le 0$ in a given iteration, the increase in $\lambda_k$ is determined by Eq. (6). $\lambda_k$ is raised, and $\tilde{s}_k$ is estimated again as $\bar{s}_k$.
Eq. (6) implies that if $\lambda_k$ is raised by more than $-\delta_k / |\tilde{p}_k|^2$, then the updated $\delta_k$ becomes positive. The value of $\bar{\lambda}_k$ is chosen according to Eq. (7) to obtain an optimal solution. A comparison parameter (CP) is introduced to raise and lower the scale parameter (SC), i.e., $\lambda_k$, for a good approximation, even with a positive definite Hessian matrix. The value of $\lambda_k$ directly scales the step size: the larger $\lambda_k$ is, the smaller the step size becomes. The step size $\alpha_k$ is given in Eq. (10).
$\Delta_k$ is the comparison parameter (CP), whose value is close to 1 when the approximation is good. If CP is less than 0.25, the scale parameter (SC) is updated according to Eq. (11).
If the steepest descent direction is not equal to zero, set $k = k + 1$ and update the weight vector; otherwise, terminate and return $\tilde{w}_{k+1}$ as the desired minimum. The scale parameter should be greater than zero for a successful reduction in error. The SCG algorithm does not involve user-dependent parameters, which is a significant advantage over line-search-based algorithms. The samples are divided randomly: 70% are used for training, 15% for testing, and 15% for validation in the training phase of this algorithm. The mean squared error (MSE) is used to judge the performance of the algorithm, as given in Eq. (12).
where $Y_e$ and $Y_m$ are the estimated and measured values, respectively, of the faults over the PV array.
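The steps described in this section can be sketched in Python as shown below. The sketch follows the commonly published formulation of the scaled conjugate gradient method and is an assumption about details that the text only references by equation number: the finite-difference estimate of the second-order term, the thresholds 0.25 and 0.75 on the comparison parameter, the restart every N iterations, and the initial values of `sigma` and `lam` are all taken from that general formulation, not from this paper. The function name `scg_minimize` and the quadratic test problem are hypothetical.

```python
import numpy as np


def scg_minimize(f, grad, w0, sigma=1e-4, lam=1e-6, max_iter=200, tol=1e-6):
    """Minimize f with a scaled conjugate gradient sketch (no line search)."""
    w = np.asarray(w0, dtype=float)
    n = w.size
    lam_bar = 0.0
    r = -grad(w)          # steepest descent direction
    p = r.copy()          # initial search direction
    success = True
    delta = 0.0
    for k in range(1, max_iter + 1):
        if np.linalg.norm(r) < tol:   # steepest descent direction is (near) zero
            break
        p2 = p @ p
        if success:
            # Non-symmetric finite-difference estimate of s_k = E''(w_k) p_k
            sigma_k = sigma / np.sqrt(p2)
            s = (grad(w + sigma_k * p) - grad(w)) / sigma_k
            delta = p @ s
        delta += (lam - lam_bar) * p2          # scale delta with lambda
        if delta <= 0:                          # make the curvature term positive
            lam_bar = 2.0 * (lam - delta / p2)
            delta = -delta + lam * p2
            lam = lam_bar
        mu = p @ r
        alpha = mu / delta                      # step size
        comparison = 2.0 * delta * (f(w) - f(w + alpha * p)) / mu**2
        if comparison >= 0:                     # successful reduction in error
            w = w + alpha * p
            r_new = -grad(w)
            lam_bar = 0.0
            success = True
            if k % n == 0:
                p = r_new.copy()                # periodic restart
            else:
                beta = (r_new @ r_new - r_new @ r) / mu
                p = r_new + beta * p
            r = r_new
            if comparison >= 0.75:
                lam *= 0.25                     # good approximation: lower lambda
        else:
            lam_bar = lam
            success = False
        if comparison < 0.25:
            lam += delta * (1.0 - comparison) / p2   # poor approximation: raise lambda
    return w


# Hypothetical test problem: a small convex quadratic
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
w_min = scg_minimize(lambda w: 0.5 * w @ A @ w - b @ w,
                     lambda w: A @ w - b,
                     np.zeros(2))
print(w_min)
```

In a neural network setting, `f` would be the network error over the training patterns and `grad` its backpropagated gradient with respect to the flattened weight vector; the quadratic is used here only because its minimizer is easy to check.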

A. CLASSIFICATION OF FAULTS
The layout of the developed neural network is shown in Figure 8. It consists of 5 inputs, each having 1248 samples, which propagate to a hidden layer of 10 neurons, and an output layer of 16 outputs that classify the data into sixteen different classes of faults. As in Section-III, a sigmoid function constrains the output between 0 and 1, the cross-entropy serves as the loss function, and the biases and weights are updated at each iteration until the minimum global error is achieved. The 5 × 1248 data set is collected in MATLAB through characteristic curve analysis and computation of PV parameters after fault occurrence; its input parameters are shown in Table 2. Of this data set, 70% is used for training, while the remaining 30% is divided equally between testing and validation. The collected 5 × 1248 input data set and the 16 × 1248 target data set are then used to categorize the faults into sixteen different classes with the SCG training algorithm. The developed fault scenarios are classified by applying the neural network backpropagation algorithm after analysis of the characteristic curve, as described above. The obtained graphical results are presented in the next section.
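The random 70/15/15 division of the 1248 samples can be sketched in Python as follows. The rounding rule and the fixed random seed are assumptions made for illustration; the function name `split_indices` is hypothetical.

```python
import numpy as np


def split_indices(n_samples=1248, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly divide sample indices into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]   # remaining samples
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
```

With 1248 samples this rounding yields 874 training, 187 validation, and 187 testing samples, and every sample index appears in exactly one subset.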

IV. RESULTS AND DISCUSSION
All the developed faults are analyzed through I-V curve analysis in this section. A 9 × 7 PV array is modeled in MATLAB with PV modules of 150 W each. The technical specifications of two different PV modules, a crystalline module from Ningbosolar electric power and a thin-film (a-Si) module from Xunlight, are used for collection of the data set, as given in Table 3. A total peak power of 9.1 kW is generated during fault-free operation of the developed 9 × 7 PV array.

Module Mismatch Fault (F1):
The calculated values under this fault are given in Table 4. The characteristic curve of the PV array under fault-free operation is shown in Figure 9. Severe partial shading of the PV array develops the module mismatch fault, as shown in Figure 10. Three interconnections, TCT, BL, and SP, are compared in terms of their characteristic curves. A sudden peak appears in the characteristic curve due to an unexpected decrease in current. The abrupt reduction in current due to shading indicates a high impact of irradiance on the PV array. The change in current and voltage reduces the power from 9.20 kW to 5.80 kW in total cross-tied (TCT), 5.10 kW in bridge-link (BL), and 4.95 kW in series-parallel (SP), with multiple peaks in the PV curve, as depicted in Figure 10 and Figure 11. The thin-film PV technology improves the performance in severe shading conditions by reducing the sudden current loss. The power increases from 5.80 kW to 6.03 kW in the TCT configuration, from 5.10 kW to 5.80 kW in the BL interconnection, and from 4.95 kW to 5.70 kW in the SP arrangement.
It is worthwhile to note that thin-film performs better than polycrystalline, with an improved power peak. This fault analysis shows that irradiance and temperature are the parameters with a significant impact on the power output of the PV system. TCT performs better than the BL and SP interconnections under the module mismatch fault (F1) and helps to optimize the performance of the PV array.
Short Circuit Fault With Bypass Diode Failure (F2): After the short circuit fault occurs in the PV array, the drop in peak voltage and peak current reduces the power output significantly, from 9 kW to 7.2 kW in SP, 6.6 kW in BL, and 6.08 kW in the TCT configuration, as shown in Figure 12. The thin-film PV array performs better than the polycrystalline PV array by reducing the power loss, increasing the power from 7.2 kW to 7.6 kW in SP, from 6.6 kW to 7.1 kW in the BL interconnection, and from 6.08 kW to 6.25 kW in the TCT configuration, as demonstrated in Figure 13. The computed values of the PV model under 'F2' are given in Table 5. The use of thin-film can reduce the power loss and improve the system's performance, but power loss still occurs in both PV arrays.
The SP arrangement performs better than the other interconnections under the 'F2' scenario, which indicates that the selection of a suitable interconnection can affect the performance of the PV system.
A higher peak power is attained in the SP configuration than in the TCT and BL interconnections with the use of thin-film (a-Si) PV technology, as evident from Figure 13.
Open Circuit Fault (F3): The behavior under this fault is depicted in Figure 14, and the corresponding values are given in Table 6. It is seen that thin-film performs better than polycrystalline under this fault, as shown in the curve analysis.
Impact of Multiple Faults (F4): This case analyzes the performance of all adopted interconnections with thin-film and crystalline material. With the polycrystalline PV array, the power is reduced from 8 kW to 5.0 kW in SP, 4.85 kW in BL, and 5.0 kW in the TCT interconnection, as illustrated in Figure 16 and Figure 17. The computed values of the PV system under the multiple-fault scenario are given in Table 7.
With thin-film PV technology, the produced power decreases from 8 kW to 5.21 kW in the SP interconnection, 5.4 kW in BL, and 5.3 kW in TCT, as shown in Figure 16. The thin-film PV array performs better than polycrystalline in all developed fault cases. A data set of 5 × 1170 is generated by computing and analyzing the parameters of the developed PV model. All considered faults are then classified and categorized through the neural network in the next section.

A. RESULTS FOR CLASSIFICATION OF PV FAULTS
The input data, consisting of PV parameters, are trained over 101 iterations with the NN. The results are evaluated in terms of cross-entropy (CE), which indicates that the minimum global error (MGE) is reached. The data set is trained with the SCG algorithm, and performance is measured by plotting the confusion matrix, by ROC plot analysis, and by plotting the cross-entropy against the number of epochs. The best validation performance is identified by plotting the CE of the training, testing, and validation data on the Y-axis against the number of iterations (101 epochs) on the X-axis, as shown in Figure 18. The performance is computed for each epoch, and the best performance is chosen at the point where the curves coincide. Training should be stopped at that point, because further iterations are not required and may degrade the predictions. The best validation performance is 0.0001392 at epoch 95, as depicted in Figure 18. A low value of CE indicates proper classification with little error, which is achieved by this NN classifier. The onset of overfitting is also visible in the graph as a difference between the CE values of the training and testing data at epoch 101.
Training states are shown in Figure 19. The first plot shows a low gradient value of 0.37929e-05 at epoch 101, which indicates that the network has learned to a great extent through reasonable adjustment of the weights and biases. Proper adjustment of the weights and biases makes the network more reliable and increases the chances of accurate classification. The validation plot shows six validation checks at epoch 101. In this plot, reaching the limit on validation failures marks the endpoint of training and indicates the start of overfitting. Failures can be seen after epoch 50, where overfitting of the data also begins, as shown in Figure 19 (see also Table 8).
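The relationship between the best epoch (95) and the six validation checks reached at epoch 101 can be illustrated with a patience-based early stopping rule, sketched below in Python. This is a common formulation and is an assumption; the exact stopping criterion applied by the training tool is not restated in the text. The function name and the synthetic validation curve are hypothetical.

```python
def early_stopping_epoch(val_losses, max_fail=6):
    """Return (best_epoch, stop_epoch) for a patience-based stopping rule.

    Training stops once the validation loss has failed to improve on its
    best value for max_fail consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, fails = loss, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Synthetic validation curve: improves up to epoch 95, then degrades
val_curve = [1.0 / (e + 1) for e in range(96)] + [0.02 + 0.001 * i for i in range(1, 10)]
print(early_stopping_epoch(val_curve))
```

On this synthetic curve the rule returns a best epoch of 95 and stops at epoch 101, after six consecutive epochs without improvement; the weights from the best epoch would be the ones retained.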
The training data are used to determine the weights and thresholds of the PV model, given the input and target data sets. The validation data set is a non-training set, whereas the testing data provide an unbiased estimate of performance. A separate confusion matrix is plotted for each of the three data sets, as shown in Figure 20, to describe the performance of the classification model. It allows the training algorithm's performance to be visualized by identifying confusion between different classes. Information about overfitting can also be extracted from the confusion matrix: a significant difference between the results for the training and testing data can indicate overfitting, as seen in Figure 20 and Figure 22. Nearly 96.6% accuracy is achieved in predicting class 1 faults for the training data, as shown in Figure 20, while 75% accuracy is achieved in predicting class 1 for the testing data, as shown in Figure 22. All the remaining classes are likewise reported in Figure 22. The histogram shows a very small classification error, i.e., 0.007901, in the testing, validation, and training samples. The confusion matrix is used to describe and evaluate the performance of the classifier on the test data, while the ROC graph summarizes the confusion matrices generated for each threshold without actually calculating them. In other words, the receiver operating characteristic (ROC) curve can assist in determining the best threshold. The confusion matrix and ROC are selected for assessing the classification capabilities of the trained NN classifier in this study. A confusion matrix gives a complete picture of the classifier's performance and allows various classification metrics to be computed, which is the main reason for selecting it together with the ROC to assess the proposed neural network training algorithm. The confusion matrix is well suited to performance evaluation of multiclass problems.
The receiver operating characteristic (ROC) results are given in Figure 23. The ROC curve uses two parameters, the false positive rate and the true positive rate (sensitivity), evaluated at various thresholds to show the performance of the classification model. All results are also given for accurate evaluation, as shown in Table 9 and Table 10. Sensitivity is a measure of the True Positive Rate (TPR), which is the proportion of correctly identified samples of a class. It denotes the correctly classified faults among the fault-positive population (class = 1), represented as Eq. (13).

$$\text{Sensitivity} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{13}$$
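Specificity is a measure of the True Negative Rate (TNR), which denotes the proportion of correctly identified negatives among the fault-negative population (class = 0), denoted as Eq. (14).

$$\text{Specificity} = \frac{\text{True Negatives}}{\text{True Negatives} + \text{False Positives}} \tag{14}$$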
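For a multiclass problem, sensitivity and specificity are usually computed per class in a one-vs-rest fashion. The sketch below assumes a confusion matrix whose rows are true classes and whose columns are predicted classes. The function name `one_vs_rest_rates` and the toy matrix are illustrative and not taken from the study.

```python
def one_vs_rest_rates(matrix, k):
    """Return (sensitivity, specificity) for class k treated as positive.

    matrix[i][j] is the number of samples of true class i predicted as j
    (assumed layout). Each class is assumed to have at least one positive
    and one negative sample, otherwise a division by zero occurs.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                               # class k missed
    fp = sum(matrix[r][k] for r in range(n_classes)) - tp  # others called k
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Toy 3-class confusion matrix (not the paper's results).
toy = [[3, 1, 0],
       [0, 2, 0],
       [0, 0, 2]]
for k in range(3):
    print(k, one_vs_rest_rates(toy, k))
```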
The False Positive Rate (FPR) is the proportion of samples identified as positive (identified faults) among the population that does not belong to that fault class. It is equal to 1 − specificity. The ROC curve plots sensitivity, i.e., True Positives/(True Positives + False Negatives), the True Positive Rate among the fault-positive population, against the FPR. The overall performance can be evaluated by the area under the curve in the receiver operating characteristic (ROC) plot, as shown in Figure 23. It represents the ability of the classification algorithm to distinguish 1s (positives) from 0s (negatives). The plot shows accurate classification, as both specificity and sensitivity are close to 1, indicating 99.6% accuracy of fault classification.
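How the ROC curve and its area follow from these definitions can be sketched as below. The scores and binary labels are invented for illustration, the helper names `roc_points` and `auc` are hypothetical, and the area is approximated with the trapezoidal rule. The study's own ROC plots were produced in MATLAB, so this is only a conceptual sketch.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold.

    scores: classifier outputs for one fault class (higher = more likely).
    labels: 1 if the sample belongs to that class, 0 otherwise.
    Both classes must be present, otherwise a rate is undefined.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Perfectly separated toy scores hug the upper-left corner (area of 1).
separable = roc_points([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0])
# Partially overlapping toy scores give a smaller area.
overlapping = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(auc(separable), auc(overlapping))
```

For a sixteen-class classifier, one such curve would be drawn per class by treating that class as positive and all others as negative.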
The ROC plot shows points in the upper left corner, indicating accurate classification of 8 fault classes with 99.6% sensitivity and specificity. Class 1 faults are predicted more accurately in the training ROC than in the test ROC, as shown in Figure 23. All remaining classes are successfully predicted for both the training ROC and the test ROC. The same results are observed in the confusion matrix, which also shows overfitting of data for predicting faults in both classifiers. A comparison of the presented results with those reported in the literature is shown in Table 9. All faults are accurately classified in sixteen different classes with 99.6% accuracy. None of the previous research has classified faults in different PV configurations and materials, which is a significant contribution of this study. The confusion matrix clearly shows a classification accuracy of 99.6% and a misclassification rate of 0.4% in the prediction of fault classes. Different PV materials are also differentiated in the classification of faults. The sixteen fault classes are differentiated based on irradiance, temperature of individual PV modules, open circuit voltage, short circuit current, and generated peak power.

Table 10 shows the computation of errors and cross-entropy (CE) for the separate samples of the data set. The 1248 samples of five inputs are separated into 874 samples for training, which achieved a CE of 5.78017e-0 and an error of 1.14416e-1 %, 187 samples for validation, and 187 samples for testing, which achieved CE values of 16.44026e-0 and 16.39030e-0, respectively. Errors of approximately 0% and 8.02e-1 % are obtained for the validation and testing data, respectively. Low values of CE indicate excellent performance, which is achieved by this training algorithm. The low value of error indicates a reduced misclassification rate, which is desirable for proper classification of faults.
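The sample counts and the two quantities reported in Table 10 can be related through a short Python sketch. The 15%/15% validation and testing ratios are an assumption that happens to reproduce the 874/187/187 split of 1248 samples. The cross-entropy shown is the generic mean categorical form and may be normalized differently from the value reported by the MATLAB toolbox. All function names are illustrative.

```python
import math


def split_sizes(n_samples, val_ratio=0.15, test_ratio=0.15):
    """Return (train, validation, test) sizes; ratios are assumed values."""
    n_val = round(n_samples * val_ratio)
    n_test = round(n_samples * test_ratio)
    return n_samples - n_val - n_test, n_val, n_test


def cross_entropy(targets, outputs, eps=1e-12):
    """Mean categorical cross-entropy for one-hot targets.

    outputs are class probabilities per sample; eps guards against log(0).
    """
    total = 0.0
    for t_row, y_row in zip(targets, outputs):
        total -= sum(t * math.log(max(y, eps)) for t, y in zip(t_row, y_row))
    return total / len(targets)


def percent_error(targets, outputs):
    """Percentage of samples whose most probable class is wrong."""
    wrong = sum(
        1 for t_row, y_row in zip(targets, outputs)
        if t_row.index(max(t_row)) != y_row.index(max(y_row))
    )
    return 100.0 * wrong / len(targets)


print(split_sizes(1248))

# Two-class toy outputs (not the paper's data): the second sample is wrong.
toy_targets = [[1, 0], [0, 1]]
toy_outputs = [[0.9, 0.1], [0.8, 0.2]]
print(cross_entropy(toy_targets, toy_outputs))
print(percent_error(toy_targets, toy_outputs))
```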

V. CONCLUSION
In this study, various types of PV faults, including module mismatch, open circuit, short circuit, and multiple faults under partial shading, are classified in SP and TCT interconnections of polycrystalline and thin-film PV arrays through a multilayer neural network (MNN). To make the study comprehensive, a large input data set of 5×1248 and a target data set of 16×1248 are developed to categorize the faults into sixteen different classes. The characteristic curves of three different configurations of a 9 × 7 PV array, namely SP, BL, and TCT, are analyzed to assess the impact of faults on PV array parameters such as short circuit current, open circuit voltage, and peak power. The SCG training algorithm classified all the developed faults in thin-film and polycrystalline PV materials with a high accuracy of 99.6% and a fast computational time of 0.08 sec, which is not reported in the related literature.
The results are validated by plotting the best validation performance, the confusion matrix, and the receiver operating characteristic (ROC) curve for the classification of faults in different PV technologies, which may be considered a unique attempt of the presented research study. The ROC plot and the confusion matrix plot show 99.6% accuracy with a fast computational time of 0.08 sec. This research work not only helps the timely diagnosis of PV faults but also categorizes them in two different PV materials, which may lead to a longer lifespan and better performance of a PV system. Further research is needed to classify faults such as thin cracks in crystalline and thin-film PV modules using advanced techniques. Practical implementation of the proposed work can also be considered as a future enhancement, since the proposed method can be implemented on low-cost microcontrollers for real-time applications.