A Novel Fault Diagnosis of Uncertain Systems Based on Interval Gaussian Process Regression: Application to Wind Energy Conversion Systems

Fault detection and diagnosis (FDD) of wind energy conversion (WEC) systems plays an important role in reducing maintenance and operational costs and increasing system reliability. Thus, this paper proposes a novel Interval Gaussian Process Regression (IGPR)-based Random Forest (RF) technique (IGPR-RF) for diagnosing uncertain WEC systems. In the proposed IGPR-RF technique, the effective interval-valued nonlinear statistical features are extracted and selected using the IGPR model and then fed to the RF algorithm for fault classification purposes. The proposed technique is characterized by a better handling of WEC system uncertainties such as wind variability, noise, and measurement errors, which leads to an improved fault classification accuracy. The obtained results show that the proposed IGPR-RF technique achieves a high diagnosis accuracy (an average accuracy of 99.99%) compared to the conventional classifiers.


I. INTRODUCTION
The deployment of Wind Energy Conversion (WEC) systems has witnessed an increasing need for the reduction of maintenance and operational costs [1], [2], where the most effective solutions are found in condition monitoring and diagnosis [3]. Indeed, the operation of WEC systems is usually accompanied by unexpected faults, which should be detected and classified at an early stage to avoid a system collapse. The wind variability, vibrations, and mainly the power electronics interfaces remain the main sources of failures [4], [5].
Many fault detection and diagnosis (FDD) approaches have been proposed for WEC systems in the literature [6], [7]. Generally, FDD techniques can be categorized into two main classes: data-driven [8], [9] and model-based techniques [10], [11]. Data-driven FDD techniques rely only on the available diagnosis data [8], [9]. The data are first used to build a model in the training phase, which is then applied in the testing phase for diagnosis purposes.
The associate editor coordinating the review of this manuscript and approving it for publication was Yuan Zhuang.
On the other hand, model-based FDD techniques consist of comparing the system measurements with system variables calculated from a mathematical model, which is usually derived from a fundamental understanding of the system under normal operating conditions [10], [12], [13]. The residual, which represents the difference between the measurements and the model predictions, can be used as an indicator for fault diagnosis.
In [1], [3], the authors presented a brief description of different kinds of faults, their generated signatures, and diagnosis solutions. Using the gearbox vibration signal, the authors in [14] proposed a deep learning technique, while a multiscale convolutional neural network was proposed in [15] to extract the faulty wind turbine features under different operating modes. In [16], the authors proposed fault detection and identification approaches which can identify faults, determine their time of occurrence and location, and estimate their severity. The authors in [17] proposed observer-based FDD techniques for wind turbines, where the residuals are generated using a Kalman filter, the detection phase is addressed using a generalized likelihood ratio test, and the isolation phase is achieved using dual sensor redundancy. The performance of these FDD techniques was assessed using Monte Carlo schemes. In [18], the authors developed a data-driven FDD approach for the gearbox of a WEC system. Moreover, in [19], the authors proposed an unknown-input-observer-based scheme for detecting faults in a wind turbine converter. In [20], the authors proposed a data-driven multimode FDD technique to discriminate the WEC system faults. In the developed technique, the wind turbine nonlinear characteristics were approximated by multiple piecewise linear systems.
Furthermore, several approaches have been developed to improve the overall performance of WEC systems [21]-[23]. The first phase in the WEC system diagnosis is the extraction of the most relevant patterns/features from the original dataset. Gaussian Process Regression (GPR) is one of the most well-known feature extraction and modeling strategies. In [24], it has been shown that GPR provides an improved modeling and prediction accuracy when compared to the classical techniques. However, the most commonly used GPR-based diagnosis techniques consider only single-valued data and do not take the system uncertainties into account.
To address the above issue, this paper proposes an interval GPR (IGPR) algorithm in which the data are represented as interval-valued quantities. The developed IGPR is characterized by a better handling of WEC system uncertainties such as wind variability, noise, and measurement errors, which leads to an improved fault classification accuracy.
The IGPR method is applied to extract the multivariate and interval-valued features, including the interval mean vector $M_{IGPR}$ and the interval variance matrix $C_{IGPR}$. It efficiently extracts multivariate and uncertain patterns from any data set. The developed approach, the so-called $IGPR_{CR}$, consists of concatenating the center and range matrices into a new numerical matrix and then fitting a GPR model to that matrix.
The interval-valued statistical parameters obtained from the IGPR model, including the interval mean vector $M_{IGPR}$ and the interval variance matrix $C_{IGPR}$, are then selected as features and fed to the RF classifier for decision making. Indeed, the RF classifier, a combination of tree predictors, has recently been presented as one of the most effective classification techniques in FDD problems [25], [26].
Therefore, the main contribution of the current work is to develop a feature extraction and selection method using IGPR and then to feed the selected interval-valued multivariate features to several RF algorithms for classification purposes.
To summarize, the developed approach consists of two phases. First, the IGPR model is applied to the original data in order to extract and select the most accurate features (including the mean vector $M_{IGPR}$ and the variance matrix $C_{IGPR}$). Then, $M_{IGPR}$ and $C_{IGPR}$ are fed to the RF classifier to perform the detection and classification of faults. The main difference between the proposed solution and the conventional RF algorithm is the introduction of a phase that performs feature extraction and selection from the entire data. Two kinds of classifiers are considered in this work: a multi-class classifier and a set of one-class classifiers. The multi-class classifier assigns each instance to one of several classes. To further improve the diagnosis abilities, a bank of one-class classifiers is proposed. To illustrate the feasibility and effectiveness of the proposed technique, a WEC system is used as a validation platform. Open-circuit, wear-out, and short-circuit faults are the three transistor faults considered in this paper. In addition, a comparative study between the proposed technique and other machine learning (ML)-based classifiers, including interval kernel PCA-based RF [26], Support Vector Machines (SVM) [27], Decision Tree (DT) [28], Naive Bayes (NB) [29], Discriminant Analysis (DA) [30], and K-Nearest Neighbors (KNN) [31], is presented.
The performance of the proposed techniques is investigated using sets of emulated data extracted under different operating conditions. The presented results confirm the high effectiveness of the developed technique in monitoring uncertain WEC systems, owing to the high diagnosis capabilities of the interval-valued features-based IGPR and its ability to distinguish between the different operating modes of the WEC system.
The rest of the paper is structured as follows. Section II describes the proposed IGPR-RF technique. The diagnosis results are evaluated using the WEC system data in Section III. The interpretations and conclusions are drawn in Section IV.

II. DESCRIPTION OF THE PROPOSED METHODOLOGY

A. IGPR FOR FEATURE EXTRACTION AND SELECTION
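Nonlinear GPR is a machine learning technique based on Bayesian theory and statistical learning theory. The main idea of GPR is to assume that the learning samples follow a Gaussian process prior and then to determine the corresponding posterior probability. It is suitable for complex regression problems, such as nonlinear and high-dimensional ones. However, GPR is used for single-valued data, which are the result of a simplification during the data mining procedure. Thus, GPR based on an interval-valued data representation is required to describe the data uncertainty and variability. Assume that $[x] = [\underline{x}, \overline{x}]$ represents the input interval-valued data unit, where $\underline{x}, \overline{x} \in \mathbb{R}$ and $\underline{x} \le \overline{x}$; $\underline{x}$ and $\overline{x}$ are called the lower and upper boundary, respectively.
Similarly, $[y] = [\underline{y}, \overline{y}]$ represents the output interval-valued data unit, where $\underline{y}$ and $\overline{y}$ are called the lower and upper output boundary, respectively. Denote $[X] = ([x_{ij}])$ as the $(N \times m)$ input interval-valued matrix:
$$[X] = \begin{pmatrix} [\underline{x}_{11}, \overline{x}_{11}] & \cdots & [\underline{x}_{1m}, \overline{x}_{1m}] \\ \vdots & \ddots & \vdots \\ [\underline{x}_{N1}, \overline{x}_{N1}] & \cdots & [\underline{x}_{Nm}, \overline{x}_{Nm}] \end{pmatrix}$$
The output interval-valued matrix $[Y]$ is built in the same way from the output units $[y]$.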

B. IGPR BASED CENTERS AND RANGES
The basic idea of the proposed IGPR-based centers and ranges method ($IGPR_{CR}$) is to fit a GPR model to interval-valued data using the information contained in the centers and ranges of the intervals, in order to improve the model prediction performance compared to the classical GPR technique.
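The proposed $IGPR_{CR}$ model first transforms the input $[X]$ and output $[Y]$ matrices into numerical matrices based on the interval centers and ranges. The input center $X^c$ and range $X^r$ matrices, and the output center $Y^c$ and range $Y^r$ matrices, are defined by:
$$X^c = [x^c_1, \ldots, x^c_N]^T, \quad X^r = [x^r_1, \ldots, x^r_N]^T, \quad Y^c = [y^c_1, \ldots, y^c_N]^T, \quad Y^r = [y^r_1, \ldots, y^r_N]^T$$
where the input center $x^c_i$ and range $x^r_i$ vectors, and the output center $y^c_i$ and range $y^r_i$ vectors are defined, respectively, by:
$$x^c_i = \tfrac{1}{2}\left(\underline{x}_i + \overline{x}_i\right), \quad x^r_i = \tfrac{1}{2}\left(\overline{x}_i - \underline{x}_i\right), \quad y^c_i = \tfrac{1}{2}\left(\underline{y}_i + \overline{y}_i\right), \quad y^r_i = \tfrac{1}{2}\left(\overline{y}_i - \underline{y}_i\right)$$
The new input $X^{CR}$ and output $Y^{CR}$ data matrices are constructed by concatenating the center and range data matrices as:
$$X^{CR} = [X^c \;\; X^r], \quad Y^{CR} = [Y^c \;\; Y^r]$$
For an input vector $x^{CR} = [x^c, x^r]$ and its corresponding output vector $y^{CR} = [y^c, y^r]$, an interval Gaussian process $f(x^{CR})$ can be fully specified by its mean function $m(x^{CR})$ and covariance function $k(x^{CR}, x'^{CR})$. The interval Gaussian process is defined as:
$$f(x^{CR}) \sim \mathcal{GP}\left(m(x^{CR}),\; k(x^{CR}, x'^{CR})\right)$$
where $m(x^{CR}) = E[f(x^{CR})]$ and $k(x^{CR}, x'^{CR}) = E\left[\left(f(x^{CR}) - m(x^{CR})\right)\left(f(x'^{CR}) - m(x'^{CR})\right)\right]$.
The covariance function $k(x^{CR}, x'^{CR})$, or kernel, plays an important role in the IGPR operation. A large variety of kernel functions can be used depending on the specific application. In this study, a Gaussian kernel function was chosen for the GPR, which takes the following form [32]:
$$k(x^{CR}, x'^{CR}) = \exp\left(-\frac{\left\|x^{CR} - x'^{CR}\right\|^2}{2\delta^2}\right)$$
where $\delta$ is the characteristic length-scale. The output vector $y^{CR}$ can be related to an underlying arbitrary regression function $f(x^{CR})$ with an additive independent identically distributed Gaussian noise $\varepsilon$, which represents the noise component of the interval data. This relationship is expressed by:
$$y^{CR} = f(x^{CR}) + \varepsilon$$
where $\varepsilon$ is the additive white noise, assumed to be independent and identically distributed Gaussian noise such that $\varepsilon \sim \mathcal{N}(0, \sigma_n^2)$, where $\sigma_n^2$ is the variance of this noise.
The interval Gaussian process defined above then becomes:
$$y^{CR} \sim \mathcal{GP}\left(m(x^{CR}),\; k(x^{CR}, x'^{CR}) + \sigma_n^2 I\right)$$
The prior joint distribution of the observation values $Y^{CR}$ and the prediction value $y^{CR}_*$ can be obtained by:
$$\begin{bmatrix} Y^{CR} \\ y^{CR}_* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K + \sigma_n^2 I & k_* \\ k_*^T & k_{**} \end{bmatrix}\right)$$
where $I$ is the identity matrix, $K$ is the Gram matrix of the training dataset, $k_* = \left[k(x^{CR}_1, x^{CR}_*) \cdots k(x^{CR}_N, x^{CR}_*)\right]^T$, and $k_{**} = k(x^{CR}_*, x^{CR}_*)$. Conditioning the joint Gaussian prior distribution on $X^{CR}$, $Y^{CR}$, and $x^{CR}_*$, the predictive distribution can be calculated by:
$$y^{CR}_* \mid X^{CR}, Y^{CR}, x^{CR}_* \sim \mathcal{N}\left(M_{IGPR}, C_{IGPR}\right)$$
where $M_{IGPR}$ is the predictive mean and $C_{IGPR}$ is the predictive variance, which are given, respectively, by:
$$M_{IGPR} = k_*^T \left(K + \sigma_n^2 I\right)^{-1} Y^{CR}, \quad C_{IGPR} = k_{**} - k_*^T \left(K + \sigma_n^2 I\right)^{-1} k_*$$
The choice of $M_{IGPR}$ and $C_{IGPR}$ as input features to the RF classifier should enhance the diagnosis performance. In the following, more details on the methodology are presented.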
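As an illustration of the centers-and-ranges construction and of the GPR predictive mean and variance, the following minimal Python sketch may help. It is not the authors' implementation. It assumes that the center is the interval midpoint and the range is the half-width, uses a zero-mean prior with a Gaussian kernel, and predicts a single scalar output (the output centers only). The function names, the toy data, and the values of `length_scale` and `noise_var` are all hypothetical. Only the standard library is used so that the sketch runs anywhere; a numerical library would normally replace the small Cholesky solver.

```python
import math

def centers_and_ranges(lower, upper):
    """Map interval bounds to the concatenated [centers, ranges] vector.

    Assumption: center = midpoint, range = half-width of each interval.
    """
    centers = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
    ranges = [(up - lo) / 2.0 for lo, up in zip(lower, upper)]
    return centers + ranges

def gaussian_kernel(a, b, length_scale):
    """k(a, b) = exp(-||a - b||^2 / (2 * length_scale^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_spd(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X_cr, y, x_star, length_scale=1.0, noise_var=1e-2):
    """Predictive mean and variance of a zero-mean GP for one scalar output."""
    n = len(X_cr)
    # Gram matrix of the training inputs plus noise on the diagonal.
    A = [[gaussian_kernel(X_cr[i], X_cr[j], length_scale)
          + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k_star = [gaussian_kernel(xi, x_star, length_scale) for xi in X_cr]
    L = cholesky(A)
    alpha = solve_spd(L, y)       # (K + noise_var * I)^{-1} y
    v = solve_spd(L, k_star)      # (K + noise_var * I)^{-1} k_*
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    k_ss = gaussian_kernel(x_star, x_star, length_scale)
    var = k_ss - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, var

# Toy interval-valued inputs: one variable, three samples (lower, upper).
lower = [[0.9], [1.9], [2.8]]
upper = [[1.1], [2.1], [3.2]]
X_cr = [centers_and_ranges(lo, up) for lo, up in zip(lower, upper)]
y_c = [1.0, 4.0, 9.0]  # hypothetical output centers

# Predict at a query interval [1.9, 2.1], which matches the second sample.
m_igpr, c_igpr = gp_predict(X_cr, y_c, centers_and_ranges([1.9], [2.1]))
```

In this sketch `m_igpr` and `c_igpr` play the role of the predictive mean and variance that would be passed on as features; extending it to the full output vector (centers and ranges) would amount to repeating the mean computation for each output column.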

C. RANDOM FOREST FOR FAULT CLASSIFICATION
Once the statistical quantities $M_{IGPR}$ and $C_{IGPR}$ are computed using the IGPR model, the system faults should be isolated. In the current paper, the RF algorithm is applied to isolate/classify these faults and distinguish between the different operating modes. The RF classifier algorithm was developed by Breiman [33] based on the bagging idea. It combines multiple decision trees to create a forest [34], [35]. The features of each generated tree are randomly chosen, and the output of the classifier is obtained by a majority vote of the trees in the forest. The RF classifier is one of the most prevalent algorithms adopted to address multi-class classification problems. However, the RF implementation suffers from certain drawbacks when considering the correlations between variables. In addition, to perform diagnosis, the RF uses only the raw data through the direct use of the measured variables, which might lead to a low performance due to data redundancy and noise. Therefore, to improve the diagnosis effectiveness of the conventional RF classifier, the IGPR-based features should be extracted and selected before being fed to the RF for classification. Figure 1 shows the flowchart of the proposed FDD technique. First, the developed IGPR-RF divides the input data set (step 1) into training (used for learning) and testing (used for validation) data sets in order to distinguish between the healthy and faulty operating modes. During the training phase, the interval-valued model is first built using the IGPR algorithm. Second, the IGPR model extracts and selects the most effective features (step 2). Then, the RF uses the statistical IGPR parameters (selected features) for training (step 2). Finally, the classification is performed as shown in step 3. In the testing phase (step 3), the statistical IGPR parameters of the test sample data (belonging to a respective class) are extracted and selected using the IGPR technique. Then, based on the classification model computed in the training phase, the RF classifies the statistical IGPR parameters (step 3).
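The three ingredients of the RF described above (bootstrap resampling, random choice of features per tree, and majority voting) can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not a full random forest: the tree-growing step is omitted, and the helper names, the seed, and the example votes are hypothetical.

```python
import random
from collections import Counter

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (the bagging step)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_feature_subset(n_features, n_selected, rng):
    """Pick, without replacement, the feature indices one tree may use."""
    return sorted(rng.sample(range(n_features), n_selected))

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
idx = bootstrap_indices(8, rng)
feats = random_feature_subset(n_features=4, n_selected=2, rng=rng)

# Hypothetical predictions of five trees for one test sample.
votes = ["C0", "C4", "C0", "C0", "C6"]
forest_output = majority_vote(votes)
```

Here three of the five hypothetical trees vote for `C0`, so `forest_output` is `"C0"`.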

III. APPLICATION TO WEC SYSTEMS
The proposed FDD approach is implemented on a WEC system. Different comparative studies are investigated in this work. The proposed IGPR-RF technique is compared to IKPCA-RF, SVM, DT, NB, DA, and KNN. In this work, the radial basis function (RBF) kernel, with a kernel parameter σ, is used for all machine learning techniques. All the experiments are conducted using 10-fold cross-validation on the training set, after which the models are applied in the testing phase. The minimum root mean-square error (RMSE) is taken as the selection criterion for the different classifiers. In the IKPCA algorithm, the parameter σ is set equal to the minimum distance between the training data. The number of kernel principal components is determined using the cumulative percent variance (CPV) with a threshold equal to 95%. Naive Bayes assumes that each attribute follows a normal distribution. The K value for KNN is set to 3. For the SVM classifiers, the parameters C and σ that give the lowest RMSE value are selected and used to train the SVMs on the whole data set. The parameters of the IGPR model are optimized using the maximum marginal likelihood criterion. For Discriminant Analysis (DA), the regularization parameter is set to 1. For DT and RF, 50 trees are utilized. The performance is evaluated using the following criteria: Accuracy, Recall, Precision, and $F_1$ Score [36].
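As a rough illustration of the 10-fold cross-validation scheme mentioned above, the sketch below splits sample indices into ten validation folds. It is a simplified outline under stated assumptions: the function name is hypothetical, the folds are contiguous, and no shuffling or class stratification is performed, although either could be used in practice.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Example with 2000 samples: each fold validates on 200 and trains on 1800.
folds = list(k_fold_indices(2000, n_folds=10))
```

Each sample appears in exactly one validation fold, so averaging the per-fold error gives one score per candidate parameter setting.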

A. SYSTEM DESCRIPTION
In this paper, the studied WEC system consists of a serial connection of a WT, a squirrel cage induction generator (SCIG), and a grid-connected back-to-back converter (Figure 2). The whole system is controlled to feed a fixed-frequency current to the grid at unity power factor. The system parameters are presented in [23]. However, any fault in one of the abovementioned system stages could strongly affect the power production rate [37]. As recent studies have shown that the power electronics interface is the WEC system stage most sensitive to faults, the inverter operation should be monitored in order to ensure an effective and continuous operation. Indeed, many factors lead to the aging of the power semiconductors, which mainly affects the time response and could lead to additional switching losses. Moreover, the excessive switching of the transistors might be the origin of different types of faults. Therefore, it is highly recommended to detect transistor aging early in order to prevent the overall inverter failure. For instance, the IGBT fault is preceded by an abrupt increase of the collector-emitter voltage, which is considered a good predictive maintenance indicator [38]. In this study, the transistor aging is modeled by an increase of the internal resistance, while a null value represents the normal operating condition. The rectifier-side and inverter-side transistors $S_{11}$ and $S_{21}$, respectively, are encompassed in the FDD approach (see Figure 3).
The open-circuit, wear-out, and short-circuit faults are considered in this study (Table 1). The wear-out fault is emulated by increasing the internal resistance to 2 Ω. Figures 4 to 8 show the behavior of the mechanical torque, generator speed, generator current, grid current, and DC bus voltage, respectively, under normal and faulty conditions.

B. DIAGNOSIS RESULTS AND COMPARISON STUDIES
Twelve variables are generated for diagnosis purposes (Table 2). Seven operating modes, including one healthy and six faulty modes, are used as generated simulation data series (Table 3). Each mode is adequately described over 2000 10-time-lagged samples within a 1 s time period and a 20 kHz sampling frequency [23]. The IGPR model is built using 2000 extracted samples. The IKPCA model is built under normal operating conditions using the CPV criterion with a 95% threshold. Finally, the statistical quantities ($M_{IGPR}$ and $C_{IGPR}$) obtained from the IGPR model are fed to the RF algorithm for classification. To illustrate the classification accuracy of the developed approach, a 10-fold cross-validation scheme was adopted. The labeled data are used as inputs for all classifiers. Two types of classifiers are presented in this work: multi-class classifiers (see Table 4) and a set of one-class classifiers (Table 7). For the multi-class classifiers, Table 4 illustrates the results using the IGPR-RF, IKPCA-RF, SVM, DT, NB, DA, and KNN techniques to assess the diagnosis performance in terms of accuracy and $F_1$ score.
It can be noticed from Table 4 that the developed IGPR-RF technique gives a better classification accuracy than the IKPCA-RF, and both of them outperform the raw-data-based classifiers. The good performance of the developed approach is due to its effectiveness in excluding the ineffective samples and selecting the most accurate features from the predictive posterior distribution, while the IKPCA-RF uses the first principal components as inputs to the RF classifier. The SVM, DT, NB, DA, and KNN classifiers are based on the direct use of the raw data. To further assess the effectiveness of the proposed FDD method, the results are presented using the confusion matrix (Tables 5 and 6). The confusion matrix lists the predicted labels in columns and the actual labels in rows. The diagonal of the confusion matrix presents the correct classifications for the seven classes ($C_0$ to $C_6$). For the healthy testing data, assigned to class $C_0$, the IKPCA-RF classifier (see Table 5) correctly identifies only 1966 samples among 2000 (true positives). In addition, the detection precision is 99.14% and the recall is 98.30%, which also represents the classification accuracy for this class. Hence, 1.70% of the samples of this class are misclassified (false alarms). A classification error of 0.20% is found for class $C_3$ in the testing data. For the faulty case ($C_4$), the precision is 98.64% and the recall is 98.25%, with 1.75% of misclassification for the testing data set, whereas the misclassification rate is 0.70% for the faulty class $C_6$ and 0% for the faulty classes $C_1$, $C_2$, and $C_5$. However, using the proposed IGPR-RF, the precision and the recall are both 100% for all cases (Table 6), which means that the classification errors are equal to 0%.
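The per-class figures quoted above follow from the confusion matrix in the usual way, which the short Python sketch below illustrates for class $C_0$ of the IKPCA-RF. The 1966 correct samples out of 2000 come from the text. The split of the remaining entries is an assumption made only for this example: the other six classes are lumped into one "other" class of 12000 samples, and 17 false positives are assumed because that value reproduces the reported 99.14% precision. The function name is hypothetical.

```python
def per_class_metrics(confusion, k):
    """Precision, recall and F1 for class k; rows are actual, columns predicted."""
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                 # actual k, predicted otherwise
    fp = sum(row[k] for row in confusion) - tp  # predicted k, actually otherwise
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical 2x2 reduction: class C0 versus all other classes.
confusion = [
    [1966, 34],      # actual C0: 1966 correct, 34 misclassified (from the text)
    [17, 11983],     # actual others: 17 predicted as C0 (assumed), 11983 not
]
precision, recall, f1 = per_class_metrics(confusion, 0)
```

With these entries the recall is 1966/2000 = 98.30%, and the precision is 1966/1983, which rounds to 99.14%.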
A set of one-class classifiers is presented here in order to further improve the classification capabilities of the proposed IGPR-RF strategy. For this purpose, a classifier bank that uses two classifiers based on kernel methods (IKPCA-RF, IGPR-RF) is applied to distinguish between the WEC faults. Each classifier is trained to classify a specific class with a label of 1 or −1. The classification results are presented in Table 7.
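One plausible reading of the classifier bank is a one-vs-rest relabeling, where the classifier dedicated to a given class sees that class as +1 and every other class as −1. The Python sketch below illustrates only this relabeling step; the interpretation itself, the function names, and the sample labels are assumptions and do not come from the paper.

```python
def one_vs_rest_labels(labels, target):
    """Relabel samples as +1 for the target class and -1 for all others."""
    return [1 if label == target else -1 for label in labels]

def build_label_bank(labels, classes):
    """One binary label vector per class, i.e. one per classifier in the bank."""
    return {c: one_vs_rest_labels(labels, c) for c in classes}

classes = [f"C{i}" for i in range(7)]      # C0 (healthy) to C6
labels = ["C0", "C3", "C3", "C6", "C0"]    # hypothetical sample labels
bank = build_label_bank(labels, classes)
```

Each entry of `bank` would then be used to train one binary classifier, and a test sample is attributed to the class whose classifier outputs +1.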
It can be seen from Table 7 that the average accuracy obtained using the proposed method is 99.99% in both the training and testing cases. In contrast, the IKPCA-RF achieves a classification accuracy of 99.13% in the training case and 99.18% in the testing case. Thus, the developed IGPR-RF technique presents a very good accuracy in the training and testing cases compared to IKPCA-RF.

IV. CONCLUSION
In this paper, a new fault detection and diagnosis (FDD) strategy was proposed to improve the reliability of uncertain wind energy conversion (WEC) systems. To achieve a fast and reliable FDD, the most effective interval-valued features were extracted and selected using an interval GPR (IGPR) model. Then the selected interval-valued features were fed as inputs to the RF classifier for diagnosis purposes. Simulation results were presented to demonstrate the effectiveness of the proposed fault diagnosis strategy. The presented results showed that the IGPR-RF, with nonlinear statistical features that depend on small selected samples of the dataset, performed better than the IKPCA-RF technique, which explicitly depends on the entire dataset, and considerably better than the conventional techniques (SVM, DT, NB, DA, and KNN) using raw data.