Interval-Valued Reduced Ensemble Learning based Fault Detection and Diagnosis Techniques for Uncertain Grid-Connected PV Systems

Nowadays, Grid-Connected Photovoltaic (PV) energy systems attract considerable interest, and Fault Detection and Diagnosis (FDD) is becoming increasingly important to guarantee their high reliability. FDD of PV systems using machine learning aims at developing effective models that provide a high rate of accuracy. Recently, numerous machine learning-based ensemble models have been applied to FDD using different combination techniques. Ensemble learning is a machine learning technique that combines several base models in order to produce one optimal predictive model. In this study, we propose six effective Ensemble Learning (EL)-based FDD paradigms for uncertain Grid-Connected PV systems. First, EL techniques based on interval centers and ranges and on interval upper and lower bounds are proposed to deal with the PV system uncertainties (current/voltage variability, noise, measurement errors, etc.). Next, in order to further improve the diagnosis abilities, two interval kernel PCA (IKPCA)-based EL classifiers are developed. In the IKPCA-EL techniques, the feature extraction and selection phases are performed using the IKPCA models, and the sensitive and significant interval-valued characteristics are transmitted to the EL model for classification purposes. Finally, the number of observations in the training data set is reduced using hierarchical K-means techniques in order to overcome the problems of computation time and storage cost; accordingly, two interval reduced KPCA-EL (IRKPCA-EL) techniques are proposed. The study demonstrates the feasibility and efficiency of the proposed techniques for fault diagnosis of Grid-Connected PV systems.

Support vector machines (SVM) are among the most widely used techniques for classification and nonlinear function estimation [9]. In [10], the authors propose a method based on a single-diode model to simulate the characteristics of the PV array in order to detect faults; SVM is used to analyze the output power residual of the simulation model. In other studies, a PV fault detection technique based on multi-resolution signal decomposition (MSD) and an SVM classifier has been developed to detect line-to-line faults in PV arrays [11]. This technique requires the current data and total voltage from a PV array and a limited amount of labeled data for training the SVM. The main drawback of the classical SVM technique lies in the selection of features, which may sometimes lead to wrong outputs, especially when the data set is noisy. In [12], a fault diagnosis method based on experimental data, combined with the KNN technique, is proposed. The main idea behind this proposal is to detect and classify different faults, such as open circuits, line-line faults, and partial shading, in real time. KNN is one of the most used machine learning algorithms thanks to its simplicity and high capacity [13]. KNN requires neither an assumption about the underlying data distribution nor a training phase for model generation, which in turn gives high performance on real datasets [13], [14]. However, KNN suffers from some limitations on large datasets, because computing the distances between every pair of samples is very costly. The DT algorithm belongs to the family of supervised learning algorithms and has been widely used in the literature. The main idea of this technique is to predict the class or value of the target variable by learning simple decision rules inferred from the training data. The DT algorithm performs classification without requiring much computation and generates understandable rules.
In [15], a decision tree method has been proposed to detect and classify open circuits, line-line short circuits, partial shading, and degradation. This proposal can accurately detect different conventional faults. Its main drawback is the assumption that the PV array is operating at the maximum power point (MPP), which is not ensured in real PV systems [16]. Besides, it generally suffers from overfitting, especially for large data sizes. The convolutional neural network (CNN) has become dominant in various fields, including fault diagnosis. The main idea of CNN is to use a backpropagation algorithm over multiple building blocks, such as convolution layers, to automatically and adaptively learn spatial hierarchies of features [17]. The main drawbacks of the CNN model are the requirement of a large training dataset and the challenges of class imbalance, gradient explosion, and overfitting when training the model [18]. In [19], the authors propose a generative adversarial network based on a tri-networks form (tnGAN) to handle leak detection problems with incomplete sensor data. This proposal presents a hybrid approach based on a generative model, a multiview awareness strategy, and a dual-discriminative network architecture, and it achieves better results than a CNN model. During the last decades, ensemble learning (EL) models have gained significant attention from the scientific community [20]. EL is a technique that creates and combines multiple machine learning models in order to produce one optimal predictive model with improved results [21]. Bagging, boosting, stacking, and random subspace are the main types of ensemble methods [22]. Bagging is used as a way to decrease the variance in order to improve the accuracy of models based on decision trees. Boosting aims to learn from previous predictor errors to make better predictions in the future (decrease bias).
Stacking allows a learning algorithm to combine the predictions of several other learning algorithms (improve predictions). Random subspace combines the predictions of multiple decision trees trained on different subsets of columns in the training dataset, using simple majority voting in the final decision rule [23]. Bagging helps eliminate model overfitting by decreasing variance. However, the resulting bagged model can suffer from considerable bias when the proper procedure is ignored, and bagging reduces the interpretability of the model [24]. The boosting algorithm seeks to reduce the model's bias and is used when high bias and low variance are present. In addition, boosting generates a unified model with fewer errors, as it focuses on maximizing benefits and reducing shortcomings in a single model [24]. The main advantage of the random subspace technique is the random selection of subsets of features, resulting in weakly correlated weak learners [25]. In conclusion, when the challenge in a single model is overfitting, the bagging method performs better than the boosting technique, since boosting itself is prone to overfitting. When the challenge is to obtain low-correlated multiple weak learners, the random subspace technique is better than boosting and bagging. Detailed descriptions of commonly used ensemble techniques are given in Section 3. A comparison between single and ensemble learning algorithms has been carried out in different works [22], [26]. It has been claimed and proven that ensemble learning techniques can outperform classical single machine learning methods in several cases [26]. The first one is when the training algorithm fails to find the best solution (computational problems). The second one is when the available training data are too small compared to the search space (statistical problems).
The last one is when the hypothesis space of the learning algorithm cannot represent the true target function (representation problems) [26]. Another study demonstrates that boosting ensemble techniques outperform bagged ensemble techniques for stock market prediction [27]. In the literature, ensemble learning algorithms are widely used to achieve maximum performance, and they have been applied in a variety of real-world applications [22], [28]. In [29], the authors propose a technique to improve the predictive performance of existing conventional machine learning (ML) algorithms for arc fault detection. This proposal is based on the superposition of conventional ML algorithms to create an enhanced classifier that decreases the bias and decision variance. Another fault detection method based on ensemble machine learning (EML) is introduced in [30]. In this study, different EML algorithms and the associated hyperparameters are compared to select the most accurate configuration for series DC arc fault detection. In [22], an enhanced ensemble learning method was proposed to provide a better and higher rate of prediction accuracy for stock market prediction. In this proposal, boosting, bagging, stacking, blending, and simple maximum voting combination techniques are used to construct twenty-five different ensemble classifiers using DT, SVM, and multilayer perceptron (MLP) neural networks. Despite numerous studies revealing the dominance of ensemble learning methods over single learning methods, most of these works ensemble only a specific type of classifier. In addition, the previously investigated ensemble learning-based fault classification approaches use only single-valued data, and the uncertainties of the system are not taken into account.
The uncertainty in the systems, which is represented by interval-valued data, consists in considering the minimum and maximum recorded values, while the single-valued representation is obtained by a simplification of the data during the mining procedure. Thus, the interval-valued representation offers a better overview of the measured phenomenon than the representation by the average value. Indeed, inaccuracy, uncertainty, or parameter variability might characterize the important information describing real systems [31], and classical single-valued data are not able to represent these dissimilarities; for this reason, it is important to represent the data as interval-valued data. In [32], a KNN approach is proposed to deal with uncertainties by using data in the form of intervals. In other studies, a new approach for constructing regression and classification models for interval-valued data using the support vector machine method is proposed [33]. An uncertainty analysis technique based on a nonparametric statistical modelling method for photovoltaic array output is proposed in [34]. This proposal aims to resolve the problem of differences between the parameter estimation (PE) results and the real output distributions by using nonparametric kernel density estimation (NKDE) methods. Besides, another main drawback of classical ensemble learning classifiers is the direct use of the raw information from the process data. In the literature, different FDD techniques based on feature extraction and selection steps using a single classifier have been proposed [35], [36]. The main idea behind the extraction and selection steps is to extract and select the most pertinent and informative data features, which consequently enhances the ML algorithm in the classification step for diagnosis purposes [37]. The literature has shown that applying feature extraction and selection techniques significantly enhances classification accuracy.
In [38], a fault classification method based on multiscale interval PCA (MSIPCA) and an ML method was proposed for uncertain HVAC systems. The MSIPCA technique enhances the diagnosis performance by extracting the most significant linear features from the data. However, most complex systems show strong nonlinear correlations between their variables. Various interval kernel PCA (IKPCA) methods have therefore been presented [39], [40]. The main objective of the IKPCA method is to (i) transform the interval-valued data matrix into a numerical data matrix, (ii) map the input numerical data onto the feature space using a nonlinear mapping function, and (iii) apply PCA in the feature space [40].
In this work, we propose innovative ensemble learning paradigms to deal with the problem of fault detection and diagnosis of uncertain PV systems. The principal contributions of this article are threefold.
1) The first contribution of this paper is to develop innovative EL models for interval-valued data with KNN, SVM, and DT classifiers using bagging, boosting, and random subspace combination tools. The developed paradigms are the so-called interval EL (IEL) based on centers and ranges (IEL-CR) and on upper and lower bounds (IEL-UL). The objective behind these proposed methods is to show the impact of using interval-valued data instead of single-valued data to improve the fault diagnosis abilities. The main idea of the developed techniques is to represent the interval-valued data matrix using the centers and ranges or the upper and lower bounds approach. Then, the feature matrices are constructed and introduced to the proposed EL classifier for fault classification purposes. In this study, we use two methods based on interval-valued data to further assess the effectiveness of modeling uncertainties. To summarize, six multi-class (MC) classifiers, called IEL-CR, IEL-UL, IKPCA-CR-based EL, IKPCA-UL-based EL, IRKPCA-CR-based EL, and IRKPCA-UL-based EL, are used. The main goal behind the proposed methods is to show, step by step, the efficiency of using interval-valued data, feature extraction and selection, and data size reduction. The MC classifiers classify instances into one of several classes. To further improve the classification performance of the developed classifiers, a set of one-class (OC) classifiers is proposed. To do so, a bank of OC IEL-CR, IEL-UL, IKPCA-CR-based EL, IKPCA-UL-based EL, IRKPCA-CR-based EL, and IRKPCA-UL-based EL classifiers is developed (there are as many classifiers as classes). An emulated PV system is used to demonstrate the effectiveness of the proposed diagnosis methods.
The rest of the work is organized as follows. Section 2 presents the GCPV system description and data collection. A brief description of machine learning-based ensemble techniques is given in Section 3. Section 4 presents the proposed paradigms. The performance of the proposed methods is evaluated in Section 5. Finally, some conclusions are drawn in Section 6.

Figure 1 shows the synoptic of the GCPV system under study, where PV and grid emulators are used to emulate the operation of PV panels and a 3-phase grid, respectively (under different operating modes). Table 1 shows the system variables considered in this study, where the measurements are recorded every 5-15 s depending on the nature of the faults and their occurrence.

III. PV IMPLEMENTATION AND DATA COLLECTION
The faults were emulated at different system stages (common coupling point, inverter, sensors, emulated PV arrays,...) to ensure a comprehensive analysis [35], [37]. A first fault F1 was emulated by introducing an open-circuit fault on one of the inverter switches at a time (inverter fault). Another AC-side fault F3 was emulated by disconnecting the grid at the common coupling point (islanding, referred to as grid-connection fault). On the PV side, three types of faults were emulated. The fault F2 was introduced at the sensor level (output current sensor fault) to emulate sensor wiring/reading issues. Moreover, using the PV emulator features, a 10-20 % permanent partial shading was introduced to emulate the PV panel fault (F4), and the remaining PV-side fault corresponds to connection faults in the emulated PV arrays. The monitored variables listed in Table 1 are the three-phase inverter output currents, the output current of the PV panel emulator, the three-phase voltages, the output voltage of the PV panel emulator, and the output voltage of the DC-DC converter. The considered fault scenarios are summarized in Table 2.

A. CLASSIFICATION TECHNIQUES
In this study, we use SVM, KNN, and DT models to construct different ensemble classifiers. The main advantage of the SVM technique is its ability to handle high-dimensional data without overfitting problems. Moreover, the kernel trick is a real strength of SVM, with which one can solve complex nonlinear problems [41]. However, SVM does not perform very well when the data set is noisy, which affects the final decision. The KNN model is a very efficient classifier that adapts readily to the available data [42]. A tree model is very useful for solving decision-related problems and can work well even if its assumptions are somewhat violated by the dataset from which the data are extracted [43]. Therefore, in this work, we use three well-established machine learning algorithms, each of which differs from the others in its way of training, to overcome the shortcomings that result from the use of a single classifier. Thus, they work in a complementary way.

1) Support Vector Machines (SVM)
SVM was first introduced by Vapnik [44]. There are two main categories of SVM: support vector classification (SVC) and support vector regression (SVR). In this study, an overview of the basic ideas underlying support vector (SV) machines for classification is presented. Consider a training data set with N samples {x_k, y_k}, k = 1, ..., N, with input data x_k ∈ R^m and output labels y_k ∈ {−1, 1}. The SVM decision function for classification is given by

y(x) = sign(w^T x + b),

where w ∈ R^m is the weight vector and b ∈ R is the bias.
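As a minimal illustration of the linear soft-margin SVM described above, the sketch below trains w and b by sub-gradient descent on the hinge loss. This is an assumption for demonstration only (a toy solver and toy data, not the paper's SVM implementation or its hyperparameters):

```python
import numpy as np

# Minimal sketch: linear soft-margin SVM trained by sub-gradient descent
# on the hinge loss 0.5*||w||^2 + C * sum(max(0, 1 - y*(w.x + b))).
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    n, m = X.shape
    w, b = np.zeros(m), 0.0
    for _ in range(epochs):
        for k in range(n):
            if y[k] * (X[k] @ w + b) < 1:       # sample violates the margin
                w -= lr * (w - C * y[k] * X[k])
                b += lr * C * y[k]
            else:
                w -= lr * w                     # only the regularizer acts
    return w, b

def svm_predict(X, w, b):
    return np.sign(X @ w + b)                   # decision rule sign(w.x + b)

# Toy separable data: class +1 around (2, 2), class -1 around (-2, -2)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2, 0.3, (20, 2)), rng.normal(-2, 0.3, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)
w, b = train_linear_svm(X, y)
acc = float(np.mean(svm_predict(X, w, b) == y))  # 1.0 on this separable toy set
```

In practice a kernelized solver (the "kernel trick" mentioned above) replaces the inner product `X[k] @ w` with a kernel evaluation.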

2) Decision Tree (DT)
Decision Tree (DT) is a well-known technique that has been applied to real-world problems [45]. DT is a symbolic learning technique that organizes information extracted from a training dataset in a hierarchical structure composed of nodes and ramifications. The main advantage of using DT algorithms is that they involve minimal requirements for data preparation and are robust on large datasets.

3) K-Nearest Neighbors (KNN)
KNN is a non-parametric algorithm, which means it does not make any assumption on the underlying data [46]. The main step of the KNN technique is to classify samples from the available data based on similarity. Therefore, when new data appear, they can easily be classified into a well-suited category using the KNN method [47]. The Euclidean distance is used to compute the KNN class as follows. For a given sample of known class X = [x_1, x_2, ..., x_k] and a sample to be classified Y = [y_1, y_2, ..., y_k], the distance is given by

d(X, Y) = sqrt((x_1 − y_1)^2 + (x_2 − y_2)^2 + ... + (x_k − y_k)^2).   (2)

Then the sample is assigned to the class for which the distance defined in Eq. (2) is minimum.
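The Euclidean-distance rule of Eq. (2) with a majority vote over the k nearest samples can be sketched as follows (an illustrative implementation; the data and parameter names are ours):

```python
import numpy as np

# Minimal KNN sketch: Euclidean distances (Eq. 2) + majority vote.
def knn_predict(X_train, y_train, x, k=3):
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # distance to every sample
    nearest = np.argsort(d)[:k]                     # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                # majority vote

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array([0, 0, 1, 1])
pred = knn_predict(X_train, y_train, np.array([4.8, 5.1]), k=3)  # -> 1
```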

B. ENSEMBLE LEARNING TECHNIQUES
Ensemble technique is a machine learning technique that combines the decisions from multiple models in order to generate one optimal predictive model and to enhance global performance. The main idea behind ensemble techniques is to improve predictability in models and decrease bias and variance to boost the accuracy of models [48]. Boosting, bagging, and random subspace are the most popular ensemble methods. Next, we discuss the three advanced ensemble methods.

1) Boosting
Boosting is one of the most popular ensemble techniques. The main objective of the boosting algorithm is to combine many weak learners into a strong learner [49]. It learns from previous predictor mistakes to make improved predictions in the future, which significantly improves the predictability of models [50]. The main steps of the boosting technique are threefold: i) bias the training data towards those examples that are difficult to predict, ii) add ensemble members to correct the predictions of previous models, and iii) combine the predictions using a weighted average of the models. Some commonly used boosting algorithms are adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and gradient boosting.
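Steps i)-iii) above can be sketched with a minimal AdaBoost on one-dimensional threshold "stumps" as weak learners (toy data and learner are our assumptions, not the paper's setup):

```python
import numpy as np

# Minimal AdaBoost sketch with threshold stumps as weak learners.
def stump_predict(X, thr, pol):
    return pol * np.where(X >= thr, 1.0, -1.0)

def adaboost(X, y, rounds=5):
    n = len(X)
    w = np.ones(n) / n                                # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        # exhaustive search for the best weighted stump
        err, thr, pol = min(((w[stump_predict(X, t, p) != y].sum(), t, p)
                             for t in X for p in (1.0, -1.0)),
                            key=lambda z: z[0])
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)         # learner weight
        # step i): re-weight toward misclassified (hard) samples
        w *= np.exp(-alpha * y * stump_predict(X, thr, pol))
        w /= w.sum()
        ensemble.append((alpha, thr, pol))            # step ii)
    return ensemble

def boost_predict(ensemble, X):
    # step iii): weighted vote of all weak learners
    return np.sign(sum(a * stump_predict(X, t, p) for a, t, p in ensemble))

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
model = adaboost(X, y)
acc = float(np.mean(boost_predict(model, X) == y))    # 1.0 on this toy set
```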

2) Bagging
Bagging, also called bootstrap aggregating, is an ensemble learning method that decreases the variance and improves the accuracy of different models combined into one ensemble model [49]. The first step of the bagging technique is to create multiple models. Each model is then trained on a random sub-sample drawn from the original dataset with replacement using the bootstrap sampling technique [50].
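The bootstrap-and-vote procedure above can be sketched as follows; the nearest-centroid base learner and toy data are our assumptions for illustration:

```python
import numpy as np

# Minimal bagging sketch: bootstrap-resample the training set, fit one
# weak learner per replicate, aggregate by majority vote.
def fit_centroids(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def centroid_predict(model, x):
    return min(model, key=lambda c: np.linalg.norm(x - model[c]))

def bagging_predict(X, y, x, n_models=15, seed=0):
    rng = np.random.default_rng(seed)
    n, votes = len(X), []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)        # bootstrap sample (with replacement)
        votes.append(centroid_predict(fit_centroids(X[idx], y[idx]), x))
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]              # majority vote

X = np.vstack([np.random.default_rng(1).normal(0, 0.5, (20, 2)),
               np.random.default_rng(2).normal(4, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
pred = bagging_predict(X, y, np.array([3.8, 4.1]))  # -> 1
```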

3) Random Subspace
Random subspace (RS) is similar to the bagging method, but the variables (rather than the samples) are randomly sampled for each learner [51]. By training the estimators on random subsets of the features instead of the whole feature set, RS aims to decrease the correlation between models. RS outperforms other ensemble techniques in terms of computational cost thanks to the use of random subsets [25].
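A minimal random-subspace sketch, where each weak learner sees only a random subset of the columns and the final decision is a simple majority vote (the nearest-centroid base learner is our assumption):

```python
import numpy as np

# Minimal random-subspace sketch: random feature subset per learner,
# majority voting over the learners' predictions.
def rs_predict(X, y, x, n_models=9, n_feats=2, seed=0):
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_models):
        cols = rng.choice(X.shape[1], size=n_feats, replace=False)  # feature subset
        centroids = {c: X[y == c][:, cols].mean(axis=0) for c in np.unique(y)}
        votes.append(min(centroids,
                         key=lambda c: np.linalg.norm(x[cols] - centroids[c])))
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.4, (15, 4)), rng.normal(3, 0.4, (15, 4))])
y = np.array([0] * 15 + [1] * 15)
pred = rs_predict(X, y, np.full(4, 2.9))  # -> 1
```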

V. PROPOSED TECHNIQUES
The main contributions are threefold. First, two alternative and effective interval-valued learning methods (IEL-CR and IEL-UL), based on the direct use of variables measured with uncertainties, are presented. In this study, we use three classification algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), and K-Nearest Neighbors (KNN), and three ensemble techniques, namely bagging, boosting, and random subspace. The proposed methods are assessed and compared to ensemble learning (EL) for single-valued data in order to show the impact of the interval-valued data representation on the diagnosis performance. The main steps of the interval-valued raw data-based EL (IEL) techniques are illustrated in Figure 2. Then, in order to further improve the efficiency of the developed IEL methods, two additional interval KPCA (IKPCA)-based FDD techniques are developed, where the most relevant characteristics are first extracted and selected from the original data, and the final features are then fed to the proposed EL model for classification purposes. Once samples representing the healthy and the different possible faulty scenarios of the process are available, the IKPCA models are constructed using only the healthy data. The built models are applied to extract and select the most significant features. However, the main disadvantage of IKPCA-EL is its computational cost, which grows with the number of measurements. To overcome this challenge, an improved IKPCA technique based on a data reduction scheme using H-K-means clustering is proposed. The objective of this technique is to reduce the number of samples: the improved IRKPCA-EL not only decreases the computation time and storage cost but also preserves the diagnosis capacity. Next, some arbitrary groups of selected features are used to train the EL classifier.
Finally, to make efficient decisions, we compare the EL output results obtained with the different arbitrary groups of features.

A. INTERVAL-VALUED DATA
In order to keep the variable information, it is more relevant to represent these measurements by interval values instead of single values. Let x_ij, i = 1, ..., N, j = 1, ..., m, denote the i-th sample of the j-th variable. The interval representation of the measurement x_ij is given by [x_ij] = [x_ij^L, x_ij^U], where x_ij^L and x_ij^U are the lower and upper bounds of the interval, respectively, and the interval-valued matrix [X] collects all the intervals [x_ij]. A generic interval [x_jk] can also be represented as a couple {x_jk^c, x_jk^r}, where the center of the interval is given by [52], [53]

x_j^c(k) = (x_jk^L + x_jk^U) / 2,

and the range of the interval is expressed by

x_j^r(k) = (x_jk^U − x_jk^L) / 2.

Usually, the data are composed of variables that belong to different physical quantities with different scales and spreads. To deal with this problem, the data matrix is scaled to zero mean and unit variance. This pre-processing step is very important and is recommended before applying any model in order to enhance the results.

B. ENSEMBLE LEARNING FOR INTERVAL-VALUED DATA (IEL) METHOD
In this section, EL techniques based on interval centers and ranges (IEL-CR) and on interval upper and lower bounds (IEL-UL) are presented.

1) EL based on interval centers and ranges (IEL-CR)
In this method, the Center and Range (CR) approach, one of the most used models for analyzing interval-valued data, is adopted. Let [X] be the training data set, where m is the number of variables and N is the number of observations.
In the CR technique, the interval-valued data matrix is first transformed into a center matrix X^c and a range matrix X^r, whose entries are the interval centers x_ij^c and ranges x_ij^r, respectively. Then, the input matrix is constructed by the concatenation of the center and range matrices, so that the new input data matrix is

X_CR = [X^c  X^r] ∈ R^{N×2m}.
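The CR coding can be sketched in a few lines; we assume the half-range convention (center = (lower + upper)/2, range = (upper − lower)/2), and the variable names are ours:

```python
import numpy as np

# Sketch of the center-and-range (CR) coding of interval-valued data:
# the EL input is the column-wise concatenation [centers | ranges].
def to_cr(X_lower, X_upper):
    X_c = (X_lower + X_upper) / 2.0      # interval centers
    X_r = (X_upper - X_lower) / 2.0      # interval half-ranges
    return np.hstack([X_c, X_r])         # N x 2m feature matrix

X_lower = np.array([[1.0, 4.0], [2.0, 6.0]])
X_upper = np.array([[3.0, 6.0], [4.0, 10.0]])
X_CR = to_cr(X_lower, X_upper)           # shape (2, 4)
```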

2) EL based on interval upper and lower bounds (IEL-UL)
For the IEL-UL method, an upper-lower bounds approach is considered to classify the data. Let X^L and X^U be the matrices of the lower and upper bounds of the input intervals, respectively.
The upper and lower matrices can be considered at the same time. According to the above definitions, the upper-lower matrix X_LU is represented by

X_LU = γ X^L + (1 − γ) X^U,

where γ ∈ [0, 1] is an adjustment weight that balances the relationship between the upper and lower bounds of the interval-valued data unit. When γ = 1, the scheme reduces to the lower bound; when γ = 0, it is represented by the upper bound, which contains the size information of x. The next section proposes two EL algorithms based on IKPCA models. In the proposed IKPCA-EL methods, only the most informative features extracted from the dataset are selected and applied to the EL algorithm for classification in the diagnosis problem.
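The upper-lower combination can be sketched as below; our reading of the text is X_LU = γ·X_L + (1 − γ)·X_U, so that γ = 1 recovers the lower bound and γ = 0 the upper bound (an assumption inferred from the limiting cases described above):

```python
import numpy as np

# Sketch of the upper-lower (UL) coding with adjustment weight gamma.
def to_ul(X_L, X_U, gamma=0.5):
    assert 0.0 <= gamma <= 1.0
    return gamma * X_L + (1.0 - gamma) * X_U   # gamma=1 -> lower, gamma=0 -> upper

X_L = np.array([[1.0, 4.0], [2.0, 6.0]])
X_U = np.array([[3.0, 6.0], [4.0, 10.0]])
X_LU = to_ul(X_L, X_U, gamma=0.5)              # element-wise interval midpoints
```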

C. ENSEMBLE LEARNING BASED INTERVAL KPCA METHODS
The main idea behind the proposed IKPCA-EL methods is to extract and select the most pertinent nonlinear features from interval-valued data using two IKPCA models. Then, the selected features are fed to the EL to address the fault classification problem. The feature extraction and selection steps retain only the most relevant and effective measurements in order to better represent the system under its different operating modes. The IKPCA method consists of transforming the interval-valued dataset into a numerical dataset, to which a KPCA is then applied. It aims to calculate the interval kernel principal components (IKPCs) in the feature space using nonlinear kernel functions and integral operators [54]. Let us consider the three data matrices X_CR ∈ R^{N×2m}, X^L ∈ R^{N×m}, and X^U ∈ R^{N×m}, which represent the center-and-range matrix, the lower-bound matrix, and the upper-bound matrix, respectively. The IKPCA technique consists of applying the KPCA model to these interval-coded data matrices.

1) Feature extraction using IKPCA
Given a training interval data matrix [X], the matrix regrouping the mapped interval vectors is arranged as

Φ = [φ([x_1]), ..., φ([x_N])]^T ∈ R^{N×h},

where h >> m is the dimension of the feature space. Using the kernel trick, the kernel principal components (KPCs) can be computed from the eigenvalue problem

K α = λ α,

where λ and α are an eigenvalue and the corresponding eigenvector of the Gram matrix K. The interval kernel matrix K is expressed entry-wise as K_ij = k([x_i], [x_j]), where k is a nonlinear kernel function, e.g., the Gaussian kernel k([x_i], [x_j]) = exp(−||[x_i] − [x_j]||² / 2σ²).

2) Feature selection using IKPCA
Consider the eigenvectors of the kernel matrix expressed in the feature space as v = λ^{-1} Φ^T α [39]. The matrix of the ℓ leading eigenvectors is V = [v_1, ..., v_ℓ], where Λ = diag{λ_1, ..., λ_ℓ} collects the ℓ largest eigenvalues of [K]. Then, the kernel principal components of a sample [x] are defined as [39]

t_j([x]) = Σ_{i=1}^{N} α_{ji} k([x_i], [x]),  j = 1, ..., ℓ.

In addition to the first KPCs, feature selection is performed using Hotelling's T², the squared prediction error (SPE, or Q), and the combined ϕ statistics, which are used to select the optimal features [55]:

T² = t Λ^{-1} t^T,  SPE = ||φ([x])||² − ||t||²,  ϕ = SPE / τ_α^{SPE} + T² / τ_α^{T²},

where τ_α^{T²} and τ_α^{SPE} represent the thresholds of T² and SPE at the confidence level α, respectively. The T² threshold is τ_α^{T²} = [ℓ(N_r − 1)/(N_r − ℓ)] F_α(ℓ, N_r − ℓ), where F_α(ℓ, N_r − ℓ) is an F-distribution with ℓ and N_r − ℓ degrees of freedom; the SPE threshold is τ_α^{SPE} = (b/2a) χ²_α(2a²/b), where a and b are the mean and variance of the SPE index, respectively.

For the IKPCA based on upper and lower bounds, a new interval squared prediction error (ISPE) index is given by

ISPE = γ SPE^L + (1 − γ) SPE^U,

where γ ∈ [0, 1] is the weight that defines the trade-off between the upper and lower bounds. In the same way, the interval Hotelling's IT² statistic is given by

IT² = γ T²^L + (1 − γ) T²^U,

where SPE^L and T²^L are the statistical characteristics for the lower bound and SPE^U and T²^U are those for the upper bound of the interval-valued data. The interval combined index Iϕ is given by

Iϕ = ISPE / τ_α^{ISPE} + IT² / τ_α^{IT²},

where τ_α^{IT²} and τ_α^{ISPE} represent the control limits of IT² and ISPE at the confidence level α = 95%, obtained analogously from the F-distribution with ℓ and N_r − ℓ degrees of freedom and from the mean a and variance b of the ISPE index, respectively. Finally, the variance D², mean m, kurtosis K, and skewness S of the first ℓ retained KPCs t = [t_1, ..., t_N]^T, where t_k = [t_k1, ..., t_kℓ], k = 1, ..., N, are computed as in [39].
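Once the interval data has been coded into a numeric matrix (CR or UL), the KPCA extraction and the T²/SPE monitoring features can be sketched numerically as below. The Gaussian kernel, σ, and the number of components are our illustrative choices, and the SPE here is the simple "unexplained kernel variance" residual, not necessarily the paper's exact estimator:

```python
import numpy as np

# Minimal KPCA sketch: Gaussian Gram matrix, double centering, leading
# eigenpairs, kernel principal components, plus T^2 and an SPE-style residual.
def gaussian_gram(X, sigma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kpca_features(X, n_comp=2, sigma=1.0):
    n = len(X)
    K = gaussian_gram(X, sigma)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one        # double centering
    lam, A = np.linalg.eigh(Kc)                       # ascending eigenvalues
    order = np.argsort(lam)[::-1][:n_comp]            # keep the leading pairs
    lam, A = lam[order], A[:, order]
    T = Kc @ (A / np.sqrt(np.maximum(lam, 1e-12)))    # kernel principal components
    t2 = ((T ** 2) / np.maximum(lam, 1e-12)).sum(1)   # Hotelling's T^2 per sample
    spe = np.diag(Kc) - (T ** 2).sum(1)               # unexplained kernel variance
    return T, t2, spe

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))                          # stand-in for X_CR or X_LU
T, t2, spe = kpca_features(X)
```

Thresholding t2 and spe (F-distribution and χ² approximations above) then flags which features and samples are retained.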

D. EL BASED ON INTERVAL REDUCED KPCA METHODS (IRKPCA-HKMEANS)

1) Hierarchical clustering
Hierarchical clustering aims to group similar objects into groups called clusters [56]. The distance between two clusters S and C can be computed as follows:
• Single linkage: the minimum distance d(s, c) between any data points s ∈ S and c ∈ C;
• Complete linkage: the maximum distance d(s, c) between any data points s ∈ S and c ∈ C;
• Ward's linkage: merge the pair of clusters for which the within-cluster inertia loss ∆(S, C) at each step is smallest,

∆(S, C) = (m_S m_C / (m_S + m_C)) d²(g_S, g_C),

where m_S and m_C are the total weights of the observations, g_S and g_C are the centers of gravity of S and C, respectively, and d²(g_S, g_C) is the squared Euclidean distance between g_S and g_C. In this work, we use Ward's linkage [36]. Let us consider the original matrix X = [x_1 x_2 ··· x_N]^T ∈ R^{N×m}. Using agglomerative hierarchical clustering [57], N_r clusters {C_1, C_2, ..., C_Nr} are obtained, where x_j ∈ C_i, j = 1, ..., n_i, i = 1, ..., N_r, with n_i the number of samples in C_i.

2) K-means clustering
K-means clustering is one of the simplest and most popular machine learning techniques [58]. The main idea behind K-means clustering is to assign each sample to the cluster whose centroid is closest. Here, K-means is used to improve the quality of the clusters obtained with agglomerative hierarchical clustering: it computes the squared distances between the data and the centroids, and assigns each data point to the nearest centroid. We thus refine the N_r clusters {C_1, C_2, ..., C_Nr} into N_r disjoint subsets, each containing n_i observations, where x_j ∈ C_i, j = 1, ..., n_i, i = 1, ..., N_r, by minimizing the mean-square-error cost function

J = Σ_{i=1}^{N_r} Σ_{x_j ∈ C_i} ||x_j − x_r(i)||².

The resulting reduced input data set obtained using H-K-means is X_r = [x_r(1), ..., x_r(N_r)]^T, where

x_r(i) = (1/n_i) Σ_{x_j ∈ C_i} x_j,  i = 1, ..., N_r,

with x_j ∈ R^m, j = 1, ..., n_i, and N_r < N.
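The H-K-means reduction can be sketched as follows: agglomerative (Ward-style) clustering down to N_r clusters, a K-means refinement, and replacement of each cluster by its mean so the training set shrinks from N to N_r rows. This is our simplified implementation of the standard algorithms the section cites:

```python
import numpy as np

# Agglomerative clustering with Ward's inertia-loss merge criterion.
def ward_merge(X, n_clusters):
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ga, gb = X[clusters[a]].mean(0), X[clusters[b]].mean(0)
                na, nb = len(clusters[a]), len(clusters[b])
                # Ward's inertia loss Delta(S, C) for merging a and b
                delta = (na * nb) / (na + nb) * ((ga - gb) ** 2).sum()
                if best is None or delta < best[0]:
                    best = (delta, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

# K-means refinement; returns the cluster means, i.e. the reduced data X_r.
def kmeans_refine(X, clusters, iters=10):
    cents = np.array([X[c].mean(0) for c in clusters])
    for _ in range(iters):
        d = ((X[:, None, :] - cents[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)                       # nearest centroid per sample
        cents = np.array([X[labels == j].mean(0) for j in range(len(cents))])
    return cents

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(5, 0.3, (10, 2))])
X_r = kmeans_refine(X, ward_merge(X, 2))           # 20 samples reduced to 2
```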

3) Feature extraction and selection using IRKPCAHKmeans
Let us consider the mapped reduced data X_r, with mapped vectors φ(x_r(1)), ..., φ(x_r(N_r)). The reduced kernel matrix K_r ∈ R^{N_r×N_r} is constructed entry-wise as K_r(i, j) = k(x_r(i), x_r(j)). The eigenvalues λ_r and the corresponding eigenvectors α_r of the reduced kernel matrix K_r are determined by solving

K_r α_r = λ_r α_r.

Next, we extract and select the most significant features from the reduced interval-valued data using the IKPCA methods, as given in Section V-C2.

E. FAULT CLASSIFICATION METHODS
During the classification stage, once the global characteristics are extracted and selected using the four proposed methods IKPCA-UL, IKPCA-CR, IRKPCA-UL, and IRKPCA-CR, they are used as input data for the proposed EL technique. Finally, to make efficient decisions, we compare the EL output results and choose the best one. The main steps of the proposed techniques are illustrated in Algorithm 1.
Input: Collect the normal N × m interval data matrix X.

VI. RESULTS AND DISCUSSIONS
This section presents the results and discussions of our experimental study.

A. EVALUATION PARAMETERS
In this section, a set of emulated PV system data is used to assess the effectiveness of the proposed methods. Four criteria are adopted: the Normalized Classification Accuracy (NCA), which is the number of correct predictions divided by the total number of input samples; the Normalized Recall (NR), which is the number of correct positive results divided by the number of all relevant samples; the Normalized Precision (NP), which is the number of correct positive results divided by the number of positive results predicted by the classifier; and the computation time (CT (s)), which is the time needed to execute the algorithm.
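These criteria can be computed directly from the true and predicted label vectors; a minimal sketch for one positive class (the label names are illustrative) is:

```python
import numpy as np

def diagnosis_metrics(y_true, y_pred, positive):
    """Accuracy (NCA), recall (NR) and precision (NP) for one class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    acc = np.mean(y_true == y_pred)                       # NCA
    tp = np.sum((y_pred == positive) & (y_true == positive))
    recall = tp / np.sum(y_true == positive)              # NR
    precision = tp / np.sum(y_pred == positive)           # NP
    return acc, recall, precision

acc, rec, prec = diagnosis_metrics(
    ["C0", "C1", "C1", "C0", "C1"],
    ["C0", "C1", "C0", "C0", "C1"],
    positive="C1",
)
print(round(acc, 2), round(rec, 2), prec)  # 0.8 0.67 1.0
```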

B. MULTI-CLASS (MC) CLASSIFICATION RESULTS
In this study, the minimum root-mean-square error (RMSE) is used as a selection criterion for the different ML classifiers. The 10-fold cross-validation approach was used to obtain the classification accuracy and to illustrate the efficiency of the proposed techniques for FDD purposes. For the proposed ensemble learning techniques, the DT was tested with 50 trees, the K and C parameters for SVM are selected as those giving the lowest RMSE value, and the K value for KNN is set to 1, 3, and 5. For the FFNN, MNN, GRNN, CFNN, PNN, NN, RNN, and CNN classifiers, ten hidden layers are used with 50 neurons per hidden layer.
The first step of this work compares the performance of the presented interval IEL CR and interval IEL U L techniques against the EL applied to single-valued data. The number of variables m is equal to 9 and the number of samples N is equal to 1501 for both the IEL CR and IEL U L techniques. The multi-class classification results are summarized in Table 3, where it can be distinctly noticed that the classification metrics obtained using the two proposed methods are higher than those obtained using the EL for single-valued data. It is easy to conclude that using an interval representation instead of a single-valued representation enhances the fault classification performance.
In the second phase, in order to further improve the performance of the proposed IEL-based techniques, novel EL-based frameworks (IKPCA CR , IKPCA U L , IRKPCA CR , and IRKPCA U L ) were proposed. Firstly, the data set is normalized under normal operating modes. Secondly, the interval KPCA (IKPCA) and interval reduced KPCA (IRKPCA) models are constructed with the cumulative percent variance (CPV) criterion set to 95% as the confidence level; the CPV criterion is adopted to select the number of retained kernel principal components.
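The CPV criterion used to retain the kernel principal components can be sketched as follows, assuming the eigenvalues of the (centered) kernel matrix are available; the example eigenvalues are illustrative:

```python
import numpy as np

def n_components_by_cpv(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative
    percent variance reaches the threshold (e.g. 95%)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cpv = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(cpv, threshold) + 1)

print(n_components_by_cpv([5.0, 3.0, 1.5, 0.4, 0.1]))  # -> 3
```

Applying this rule to the IKPCA and IRKPCA eigenvalue spectra yields the 31, 18, and 17 retained components reported below.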
The reduced datasets obtained through H-K-means (N = 806 and N = 800 samples) are fed to the IRKPCA CR and IRKPCA U L techniques, respectively. The IKPCA models are structured by 31 interval kernel principal components (IKPCs), while the number of IKPCs selected using the CPV criterion is equal to 18 and 17 for the IRKPCA CR and IRKPCA U L models, respectively. To generate the simulation data, six operating modes are considered: one healthy mode, referred to as class C0, and five faulty modes (F 1 -F 5 ) assigned to classes C1-C5 (Table 2). In this study, five groups of features are extracted and the best one is then selected. Table 4 lists the groups of features; they include, for example, the sample mean, kurtosis, variance, and skewness of the retained IKPCs, with Group 5 consisting of the first IKPCs. The main goal of this part is to extract and select the most effective characteristics from the raw data in order to obtain the best classification results. In the first stage, emulation data are used to collect and label the database in faulty mode. Then, the labeled data are applied as inputs to the proposed techniques. For this purpose, a comparison between the five groups using the proposed techniques is presented in Table 5. From this table, it can be seen that the proposed methods based on the data reduction scheme achieve higher accuracy using feature group 5. The EL-based methods obtain an accuracy of 0.99 (EL-IKPCA) and 1 (EL-IRKPCA) using group 5. As shown in Table 5, the proposed EL-based methods outperform the IEL techniques: the overall accuracy improves from about 0.72 using IEL to 1 using the EL-based IRKPCA techniques. Additionally, to further evaluate the results, the recall and precision classification metrics are used. As shown in Table 6, the proposed EL-based methods present perfect results for all used classification metrics.
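The statistical features listed in Table 4 (sample mean, kurtosis, variance, and skewness of the retained components) can be computed per component as below; the score matrix here is random stand-in data, not the actual IKPC scores:

```python
import numpy as np

def statistical_features(T):
    """Per-component sample mean, kurtosis, variance and skewness
    of a score matrix T (N samples x q retained components)."""
    mu = T.mean(axis=0)
    z = (T - mu) / T.std(axis=0)          # standardized scores
    var = T.var(axis=0)
    kurt = (z ** 4).mean(axis=0)          # sample kurtosis (normal ~ 3)
    skew = (z ** 3).mean(axis=0)          # sample skewness (normal ~ 0)
    return np.concatenate([mu, kurt, var, skew])

rng = np.random.default_rng(0)
T = rng.normal(size=(1000, 4))            # stand-in for retained IKPC scores
f = statistical_features(T)
print(f.shape)  # (16,)
```

The concatenated feature vector is what would then be passed to the EL classifier for each group of features.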
Additionally, confusion matrices are used to further demonstrate the diagnosis performance of the proposed methods (see Tables 7, 8 and 9). A confusion matrix visualizes the performance of a classification algorithm: the rows represent instances of the actual classes, while the columns represent instances of the predicted classes, so the matrix shows the correctly classified and misclassified samples for the condition modes (C 0 to C 5 ). Referring to the results given in Tables 7, 8 and 9, the proposed EL-based methods achieved the highest accuracy, correctly identifying 1501 out of 1501 measurements during the healthy case (C0). Furthermore, using both IRKPCA-EL methods, the NP and the recall are equal to 1 for all modes, with zero misclassifications during all faulty cases. We can conclude from these results that the proposed methods are able to distinguish the six different modes and obtain good classification results.
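The confusion-matrix layout described above (rows = actual classes, columns = predicted classes) can be reproduced in a few lines; the labels below are illustrative:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, classes):
    """cm[i, j] = number of samples of actual class i predicted as j."""
    index = {c: k for k, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm

classes = ["C0", "C1", "C2"]
cm = confusion_matrix(
    ["C0", "C0", "C1", "C1", "C2", "C2"],
    ["C0", "C0", "C1", "C2", "C2", "C2"],
    classes,
)
# diagonal = correctly classified, off-diagonal = misclassified
print(np.trace(cm), cm.sum())  # 5 6
```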
To further evaluate the effectiveness of the proposed techniques, a comparative study between 14 machine learning (ML) methods is carried out. The ML techniques include the proposed methods, interval principal component analysis based EL (IPCA-based EL) [59], Feed-Forward Neural Network (FFNN) [60], Multilayer Neural Network (MNN) [61], Generalized Regression Neural Network (GRNN) [62], Cascade Forward Neural Network (CFNN) [61], Probabilistic Neural Network (PNN) [7], Neural Network (NN) [60], Recurrent Neural Network (RNN) [63], and Convolutional Neural Network (CNN) [64]. Table 10 presents the results according to the NCA and computation time (CT). The classification outcomes given in Table 10 demonstrate that the enhanced ensemble methods using the IKPCA and IRKPCA models provide the best results in terms of NCA compared to the other techniques. Besides, one can notice from Table 10 that the results are significantly improved compared to the IPCA-based EL. The IPCA-based EL classifier reached quite high performance, with an NCA value of 0.92 and a misclassification rate of 0.08. From Table 10, it is shown that both IKPCA and IRKPCA improve the feature extraction results and outperform the linear IPCA model because they can handle the nonlinearity of the PV system. It can also be noticed that the presented IEL classifiers alone are less efficient for fault classification: the IEL CR and IEL U L classifiers provide NCA values of 0.72 and 0.74, i.e., classification errors of 0.28 and 0.26, respectively. The poor NCA of IEL CR and IEL U L is due to the use of the measured variables without the characteristics extraction and selection steps, which confirms the effectiveness of the developed IKPCA-EL and IRKPCA-EL techniques for the classification task.
In addition, we can conclude from the results summarized in Table 10 that the developed IRKPCA CR -EL and IRKPCA U L -EL methods afford the best trade-off between NCA and computation time (CT). Therefore, the proposed methods based on the characteristics extraction and selection phases and the data reduction scheme are good alternatives for fault classification due to their high NCA and reliability. Among the FFNN, MNN, GRNN, CFNN, PNN, NN, and RNN classifiers, the best results in terms of NCA are obtained using the MNN, with an NCA value of 0.86 and a misclassification value of 0.14. An excerpt of Table 10 for the comparison classifiers reads:

Method               NCA    CT (s)
IPCA-based EL [59]   0.92   93.14
FFNN [60]            0.83   59.12
MNN [61]             0.86   25.87
GRNN [62]            0.69   35.96
CFNN [61]            0.85   71.16
PNN [7]              0.71   31.73
NN [60]              0.72   13.7
RNN [63]             0.84   267.1
CNN [64]             0.76   389.16

C. ONE-CLASS (OC) CLASSIFICATION RESULTS
To further highlight the effectiveness of the developed techniques, a bank of one-class classifiers is presented. One-class classification is a specific type of classification task in which each classifier is trained using instances of only one class. In our case study, one healthy and five faulty classes are used [65]. As shown in Table 11, each classifier is trained to recognize a specific class, labeled by 1 or -1. The performance of the proposed methods in terms of NCA is presented in Table 12 using the selected features of group 5. The classification results of all classifiers, given in Table 12, demonstrate the effectiveness of the proposed techniques based on the feature extraction and selection steps, thanks to the high ability of the proposed kernel-based methods to extract and select the most pertinent and significant characteristics from the interval raw data.
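The ±1 relabeling used by each member of the one-class classifier bank can be sketched as follows (the class labels are illustrative):

```python
import numpy as np

def to_one_class_labels(y, target):
    """Relabel a multi-class vector for one classifier of the bank:
    +1 for the target mode, -1 for every other mode."""
    return np.where(np.asarray(y) == target, 1, -1)

y = ["C0", "C1", "C2", "C0", "C1"]
print(to_one_class_labels(y, "C0").tolist())  # [1, -1, -1, 1, -1]
print(to_one_class_labels(y, "C1").tolist())  # [-1, 1, -1, -1, 1]
```

Training one binary classifier per mode on these relabeled vectors yields the bank described in Table 11.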

VII. CONCLUSION
New fault detection and diagnosis (FDD) techniques dealing with uncertain Grid-Connected Photovoltaic (PV) systems have been proposed in this paper. The uncertainty was addressed by using the interval-valued data representation. Firstly, two interval-valued ensemble learning (IEL) classifiers based on the direct application of the interval-valued dataset were proposed. Secondly, two enhanced IEL methods based on feature extraction, selection, and fault classification steps were developed. For the feature extraction and selection steps, two interval KPCA (IKPCA) methods were performed to extract and select the most significant features by transforming the single-valued data set into interval-valued latent variables. Then, the most pertinent characteristics were fed to the proposed EL technique for classification purposes. Finally, in order to further improve the diagnosis results in terms of computation time, improved IEL techniques based on data reduction and interval reduced KPCA (IRKPCA) were proposed. The proposed methods applied Hierarchical K-means (H-K-means) clustering to remove irrelevant and redundant samples. The simulation results using a grid-connected PV system under healthy and faulty conditions showed the impact of using the interval-valued instead of the single-valued representation, and the effectiveness of the proposed feature extraction and selection techniques in providing the best compromise between diagnosis metrics and low computation time.
The obtained results showed the effectiveness and robustness of the proposed FDD techniques. Nevertheless, the fault classification results obtained when applying the proposed paradigms still exhibited some missed detections and false alarms, and some faults were not correctly classified. Therefore, one future research direction is to implement online and adaptive NN-based tools to update the model, which can provide reduced misclassification rates. On the other hand, despite the power of creating classifiers using ensemble learning techniques, a class imbalance problem is usually observed in real-world applications. Hence, another future research perspective is to develop learning algorithms that smooth out the imbalance between classes and exploit the advantages of multi-objective optimization (MOO) design to improve the performance of ensemble classifiers.
KAIS BOUZRARA is a professor of Electrical Engineering at the Laboratory of Automatic Signal and Image Processing, National Engineering School of Monastir, Monastir, Tunisia. He has more than 15 years of combined academic and industrial experience. His research interests are in the area of systems engineering and control, with emphasis on process modeling, monitoring, and estimation. He has published more than 80 refereed journal and conference publications and book chapters. Email: bouzrara.kais@gmail.com

HAZEM NOUNOU (SM'08) is a Professor of Electrical and Computer Engineering at Texas A&M University at Qatar. He has more than 19 years of academic and industrial experience. He has served as an Associate Editor and on the technical committees of several international journals and conferences. He has significant experience in research on control systems, data-based control, system identification and estimation, fault detection, and systems biology. He has been awarded several NPRP research projects in these areas. He has successfully served as the lead PI and a PI on five QNRF projects, some of which were in collaboration with other PIs in this proposal. He has published more than 200 refereed journal and conference papers and book chapters. He is a senior member of the IEEE. Email: hazem.nounou@qatar.tamu.edu

MOHAMED NOUNOU (SM'08) is a professor of Chemical Engineering at Texas A&M University at Qatar. He has more than 19 years of combined academic and industrial experience. His research interests are in the area of systems engineering and control, with emphasis on process modeling, monitoring, and estimation. He has published more than 200 refereed journal and conference publications and book chapters. He has successfully served as the lead PI and a PI on several QNRF projects (6 NPRP projects and 3 UREP projects). He is a senior member of the AIChE (American Institute of Chemical Engineers) and a senior member of the IEEE (Institute of Electrical and Electronics Engineers). Email: mohamed.nounou@qatar.tamu.edu