Enhanced PSO-Based NN for Failures Detection in Uncertain Wind Energy Systems

Ensuring the validity of measurements in wind energy systems (WES) is a challenging task in system diagnosis and data validation. This work develops new approaches aimed at improving the operation of WES through intelligent and innovative fault diagnosis frameworks. To this end, an enhanced particle swarm optimization (PSO), data reduction, and interval-valued representation are proposed. First, a feature selection tool using the PSO algorithm is developed. Then, the Euclidean distance metric is used to reduce the data and maximize the diversity between data samples, which improves the effectiveness of the PSO algorithm for feature selection. Finally, PSO- and RPSO-based interval techniques, using centers and ranges and upper and lower bounds, are developed to deal with model uncertainties in WES. The features retained by the proposed PSO-based methods are fed to the neural network (NN) classifier. The proposed methodology improves the diagnosis abilities, reduces the computation time, and decreases the storage cost. The presented experimental results demonstrate the high performance of the suggested paradigms in terms of computation time and accuracy.

techniques should be applied [7]. Generally, two main classes of FD techniques can be identified: model-based and data-driven methods. Model-based approaches rely on a consistency test between the observed process behavior of the sensor and the expected behavior of a mathematical process model, which is generally derived from a basic understanding (using physical and chemical principles) of the process under fault-free conditions. However, these methods depend on the capability of the mathematical model to correctly characterize the system's behavior. Data-driven techniques are based on historical data collected during fault-free process operations [8]. Using the training data, an empirical model is built and used for fault detection on future measurement data [8].
The most well-known data-driven diagnosis techniques applied to detect faults in WES include machine learning and statistical process control [9]. For instance, density-based spatial clustering was used to classify fault states, and decision tree (DT) and random forest (RF) methods were used to build predictive models for WES anomalies. In other works, a diagnostic technique based on signal analysis has been developed [10]. This technique eliminates redundant information using empirical mode decomposition and the discrete wavelet transform (DWT). In [11], a fault diagnosis technique combines a support vector machine (SVM) model with the optimal composition of symptom parameters algorithm to diagnose motor faults in WES. However, this proposal suffers from limitations in the selection of the features, which may lead to the misclassification of faults. To overcome the limits of classical SVM, a fault diagnosis technique based on a large margin distribution machine (LDM), which introduces the margin mean and variance on the basis of SVM, is presented in [12]. This proposal shows better classification results than classical SVM. A decision tree has been developed to diagnose faults in WT systems [13]. Although this method is easy to perform, it has drawbacks in dealing with missing values. Another fault detection method based on artificial neural networks (ANN) is presented in [14]. Recently, deep learning techniques such as convolutional neural networks (CNN), recurrent neural networks, and long short-term memory (LSTM) have been widely used in fault diagnosis of WES [15], [16], [17]. In [18], a fusion neural network model is developed by combining a long short-term memory neural network (LSTM NN) and the broad learning system (BLS) algorithm to predict the lithium-ion battery capacity and remaining useful life (RUL).
In [19], wavelet packet decomposition is used to extract characteristics that are fed to a CNN classifier to perform the classification task. Deep learning techniques can accurately detect several conventional faults, but they are limited by a difficult training step and high time complexity. Different ways may improve the use of machine learning (ML) for FD purposes. Generally, the existing intelligent FD methods consist of two main steps: feature selection and fault classification. In [5], a fault diagnosis technique for WES based on an ML algorithm is developed to solve performance problems flexibly and reliably. In this case, the reduced kernel principal components algorithm is used to extract the most significant features from raw data, and the classification task is performed using a random forest classifier. This proposal presents high diagnosis accuracy, but it needs a high computation time. Another fault diagnosis methodology based on a feature extraction and selection step is introduced in [20]. This technique consists of extracting and selecting features from raw sensor data using an improved Gaussian process regression (GPR). The main disadvantage of Gaussian processes is the loss of efficiency in high-dimensional spaces. In [21], a hybrid approach is proposed by combining variational modal decomposition (VMD), particle filter (PF), and GPR to forecast battery future capacity and remaining useful life (RUL). In our previous work [22], an efficient feature selection technique based on particle swarm optimization (PSO) is proposed. The basic objective behind the PSO algorithm is to remove irrelevant features and extract only the most significant ones from raw data in order to improve the classification task using a neural network classifier. Besides, in this proposal, an improved extension of the PSO model based on the use of Euclidean distance (ED) is developed.
The main idea behind the use of the ED method is to avoid the problem of premature convergence to local sub-optimal areas when using the classical PSO optimization algorithm. Also, the proposed reduced PSO-NN (RPSO-NN) algorithm aims to improve the results in terms of accuracy as well as computation time and storage cost. Generally, the significant information characterizing real systems may be uncertain and inaccurate [23]. Classical single-valued data is a simplification made during the data mining process, and it may cause severe loss of information. For this reason, interval-valued data representation is important [24], [25]. Recently, different fault diagnosis techniques have been proposed to deal with noise in data, imbalanced data types, and inadequate fault sample data [26], [27]. In [26], an improved label-noise robust generative adversarial network was used to ensure the quality of the generated data and improve the generalizability of the model under actual operating conditions by performing a batch comparison between generated and actual data.
In this paper, different classification paradigms, merging the benefits of particle swarm optimization (PSO) and neural network (NN) algorithms, are proposed for the classification of faults in uncertain WES. In the first developed method, the so-called PSO-based NN, PSO is applied for feature selection, and the selected pertinent features are then fed to the NN classifier to perform the classification task. Unfortunately, the application of PSO-based NN algorithms to complex problems is limited by a lack of diversity, which causes premature convergence (to local sub-optimal areas). To overcome this limitation, a Euclidean distance (ED)-based data reduction tool is used.
One of the main objectives of this work is to improve data-based diagnosis techniques in uncertain WES. Generally, the significant information characterizing real systems may be uncertain and inaccurate [23]. Classical single-valued data is a simplification made during the data mining procedure, and it may cause severe loss of information. Thus, interval-valued data representation is of high importance [24], [25]. In this work, more robust techniques are obtained by describing the process measurements with interval-valued data instead of single-valued data. Therefore, interval models are applied to handle this new nature of the data.
The objective is to improve data-based monitoring methods, especially in experimental industrial applications where imperfect measurements significantly complicate the fault diagnosis task. The developed techniques are able to improve both fault diagnosis robustness and sensitivity while maintaining a satisfactory and stable performance over long periods of process operation. To summarize, the main contributions of this paper are as follows:
1) An effective PSO-based feature selection method is proposed. The aim is to remove irrelevant features and extract only the most significant ones from raw data (better classification accuracy using NNs).
2) A Euclidean distance (ED)-based reduced PSO-NN (RPSO-NN) algorithm is proposed (higher accuracy, lower complexity, and reduced storage cost) to overcome the premature convergence and local sub-optimum limitations of classical PSO optimization algorithms.
3) The proposed techniques are extended to interval-valued data to deal with model uncertainties such as noise, measurement errors, and variability.
The proposed techniques are characterized by their high performance in terms of accuracy, robustness, and ability to detect incipient/drift faults and their severity in WES.
This paper is organized as follows: Section II presents the proposed techniques, while the description of the studied WES is presented in Section III. Then, Section IV presents the results and the performance evaluation. Finally, Section V concludes the paper.

II. INTERVAL ENHANCED PSO-BASED NN FOR FAULT DETECTION

A. PROPOSED DIAGNOSIS PARADIGM
In this paper, the aim is to detect incipient faults while taking into consideration the WES uncertainties. The developed interval enhanced PSO-based NN technique exploits the benefits of feature selection based on PSO, data-size reduction using ED, interval-valued representation of raw data, and NN for classification. First, the proposed algorithms use the PSO method for feature selection and a neural network classifier for classification to improve the diagnosis of WES. PSO is a very effective global search technique based on the movement and intelligence of swarms (number of samples). A Euclidean distance (ED)-based reduced PSO-NN (RPSO-NN) algorithm is proposed (higher accuracy, lower complexity, and reduced storage cost) to overcome the premature convergence and local sub-optimum limitations of classical PSO optimization algorithms.

Algorithm 1 IRPSO-Based NN Algorithm
Input: N × m data matrix X.
1. Normalize the data set.
2. Determine the new interval data matrix (X^UL, X^CR).
3. Compute the reduced matrix using the Euclidean distance metric, conserving only one observation in the case of redundancy.
4. Select features using the PSO model.
5. Use the selected features as input to the NN classifier for the training phase.
6. Evaluate the NN classifier using the testing features.
7. Classify the different operating modes (healthy and faulty).

B. TECHNIQUES
In this section, a brief description of the proposed methods is presented.

1) ENHANCED PARTICLE SWARM OPTIMIZATION (PSO)
Particle Swarm Optimization (PSO) is a stochastic optimization technique based on the intelligence and movement of swarms [28], [29]. PSO is a very efficient global search algorithm that needs very few algorithm parameters compared to other optimization algorithms, such as genetic algorithms, which require setting different evolutionary operators such as crossover and mutation [29].
VOLUME 11, 2023
To overcome the limitation of premature convergence (due to the lack of diversity), which limits the effectiveness of PSO algorithms, an ED-based data reduction tool is proposed. The main objective of the ED method is to retain only a single sample in case of redundancy when constructing the reduced data. This increases the diversity between data samples and thereby mitigates the problem of premature convergence to local sub-optimal areas. Consider a data matrix $X$ with $N$ samples and $m$ process variables. The ED between the rows $X_i$ and $X_j$ of the data matrix $X$ is computed by:
$$d_{ij} = \lVert X_i - X_j \rVert_2 = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
The dissimilarity matrix $D$, representing the dissimilarity between all pairs of samples of the data matrix $X$, is computed by:
$$D = [d_{ij}]_{i,j = 1, \dots, N} \in \mathbb{R}^{N \times N}$$
Thus, we obtain a reduced data matrix $X'$ with $N'$ samples and $m$ process variables, where $N' < N$. Then, the PSO algorithm is applied to select the most pertinent features from the reduced data.
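The reduction step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the redundancy threshold `tol` are assumptions introduced here, since the text does not state how close two samples must be to count as redundant.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two samples (rows) of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_matrix(X):
    """Symmetric N x N matrix D with D[i][j] = distance between rows i and j."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = euclidean_distance(X[i], X[j])
    return D


def reduce_samples(X, tol=1e-6):
    """Keep one representative per group of redundant samples.

    Assumption: two samples are redundant when their distance is <= tol.
    Returns the reduced matrix and the indices of the retained rows.
    """
    D = dissimilarity_matrix(X)
    kept = []
    for i in range(len(X)):
        # Retain row i only if it differs from every row retained so far.
        if all(D[i][k] > tol for k in kept):
            kept.append(i)
    return [X[i] for i in kept], kept


# Toy data: rows 1 and 3 duplicate rows 0 and 2.
X = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0], [3.0, 4.0], [1.0, 1.0]]
X_reduced, kept = reduce_samples(X)
print(kept)  # indices of the retained samples
```

Building the full dissimilarity matrix costs memory quadratic in N, so for large data sets a blockwise or nearest-neighbor variant may be preferable.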

2) INTERVAL REDUCED PSO (IRPSO)
Uncertainties in the collected data can be represented by interval-valued data (IVD). The IVD methodology is very important for preserving variable information. In this step, four techniques, IPSO$_{CR}$, IPSO$_{UL}$, IRPSO$_{CR}$, and IRPSO$_{UL}$, are used. The main idea behind these proposals is to use a specific interval-valued data matrix instead of the single-valued data matrix. The interval data matrices are $X^{CR}$ and $X^{UL}$. $X^{CR}$ is constructed by the concatenation of the center and range matrices, and $X^{UL}$ is constructed from the lower and upper bounds of the interval values of the variables.
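To make the two interval layouts concrete, a minimal Python sketch follows. Several choices in it are assumptions rather than the paper's definitions: intervals are built as x ± delta per variable, the range is taken as the half-width of the interval, and the upper-lower matrix is formed by plain concatenation without any additional weighting.

```python
def to_intervals(X, delta):
    """Build lower and upper bound matrices.

    Assumption: variable j is known up to +/- delta[j].
    """
    X_L = [[x - d for x, d in zip(row, delta)] for row in X]
    X_U = [[x + d for x, d in zip(row, delta)] for row in X]
    return X_L, X_U


def centers_and_ranges(X_L, X_U):
    """Center = interval midpoint; range = half-width (assumed convention)."""
    X_c = [[(lo + up) / 2 for lo, up in zip(rl, ru)] for rl, ru in zip(X_L, X_U)]
    X_r = [[(up - lo) / 2 for lo, up in zip(rl, ru)] for rl, ru in zip(X_L, X_U)]
    return X_c, X_r


def hconcat(A, B):
    """Concatenate two matrices column-wise: N x m and N x m -> N x 2m."""
    return [ra + rb for ra, rb in zip(A, B)]


X = [[1.0, 10.0], [2.0, 20.0]]   # two samples, two variables
delta = [0.5, 2.0]               # hypothetical measurement uncertainty
X_L, X_U = to_intervals(X, delta)
X_c, X_r = centers_and_ranges(X_L, X_U)
X_CR = hconcat(X_c, X_r)         # centers followed by ranges
X_UL = hconcat(X_L, X_U)         # lower bounds followed by upper bounds
print(X_CR)
print(X_UL)
```

Both layouts double the number of columns, so the feature selection step that follows operates on 2m candidate features instead of m.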
Let $x_{ij}$, where $i = 1, \dots, N$ and $j = 1, \dots, m$, be the $i$-th sample of the $j$-th variable. The interval representation of the data observation $x_{ij}$ is defined using the lower bound $\underline{x}_{ij}$ and the upper bound $\overline{x}_{ij}$ as follows:
$$[x_{ij}] = [\underline{x}_{ij}, \overline{x}_{ij}]$$
The interval data matrix $[X]$ is constructed using the interval-valued samples for the different description variables as follows:
$$[X] = \big([x_{ij}]\big)_{i = 1, \dots, N;\; j = 1, \dots, m}$$
For the interval $X^{UL}$ technique, an upper-lower technique is presented to define the new data. The lower and upper bound matrices $X^{L}$ and $X^{U}$ are computed as:
$$X^{L} = \big(\underline{x}_{ij}\big), \qquad X^{U} = \big(\overline{x}_{ij}\big), \qquad i = 1, \dots, N,\; j = 1, \dots, m$$
The interval matrix $X^{UL}$ is then constructed from the lower and upper matrices together, using a regulation weight $\theta \in [0, 1]$ of the interval-valued data unit. It is worth mentioning that $\theta = 1$ represents a lower scheme with one feature, while $\theta = 0$ represents an upper bound including the size information of $x$.
The interval $[x_{ij}]$ can also be represented as a couple $\{x^{c}_{ij}, x^{r}_{ij}\}$. The center $x^{c}_{ij}$ is the midpoint of the interval,
$$x^{c}_{ij} = \frac{\underline{x}_{ij} + \overline{x}_{ij}}{2}$$
and the range $x^{r}_{ij}$ is derived from the difference $\overline{x}_{ij} - \underline{x}_{ij}$ between the upper and lower bounds. The center and range matrices $X^{c} = (x^{c}_{ij})$ and $X^{r} = (x^{r}_{ij})$ are computed from the interval-valued data matrix. Then, the computed center and range matrices are concatenated into one matrix $X^{CR}$ as:
$$X^{CR} = [X^{c} \;\; X^{r}] \in \mathbb{R}^{N \times 2m}$$

3) ARTIFICIAL NEURAL NETWORK
Neural network techniques have been widely used for FD in various applications [30], [31]. Neural networks (NNs) contain different layers, including an input layer, one or more hidden layers, and an output layer [31]. Neurons and weighted connections between neurons are the main components of neural networks [31]. The weights and the input-output function are the main evaluation criteria of the NN performance [31]. The main known network architectures are the feedforward and recurrent architectures. The main difference between the two architecture types lies in the feedback connections, which are present only in the recurrent architecture. In this paper, a multilayer artificial neural network (ANN) trained with the Levenberg-Marquardt backpropagation (LMBP) method is adopted [32]. The developed ANN is constructed with 10 hidden layers and 50 hidden neurons in the hidden layer.
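The layer structure described above can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: a single hidden layer of 50 tanh neurons, 12 inputs and 7 outputs (matching the variable and mode counts reported later in the paper), a softmax output, and random untrained weights. The LMBP training used in the paper is not reproduced here.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


def dense(x, W, b):
    """Weighted sum of the inputs plus bias for each neuron of the layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    """Numerically stable softmax turning scores into class probabilities."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, layers):
    """Feedforward pass: tanh on hidden layers, softmax on the output layer."""
    h = x
    for W, b in layers[:-1]:
        h = [math.tanh(v) for v in dense(h, W, b)]
    W, b = layers[-1]
    return softmax(dense(h, W, b))


rng = random.Random(0)
n_features, n_hidden, n_classes = 12, 50, 7   # illustrative sizes
layers = [init_layer(n_features, n_hidden, rng),
          init_layer(n_hidden, n_classes, rng)]
x = [rng.uniform(-1, 1) for _ in range(n_features)]  # one normalized sample
probs = forward(x, layers)
predicted_mode = max(range(n_classes), key=lambda k: probs[k])
print(predicted_mode, round(sum(probs), 6))
```

With untrained weights the predicted mode is arbitrary; the point is only the data flow from selected features through weighted connections to one probability per operating mode.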

III. SYSTEM DESCRIPTION
The WES under study is presented in Figure 2, while Figure 3 illustrates the topology of the back-to-back converter used. The detailed description of the WES under study was presented in [1]. The WTC includes two main parts. The first one is the model of the turbine and the squirrel cage induction machine (SCIG). In this case, the stator-side AC/DC converter is used for the control. The second one is the grid-side DC/AC converter sub-system. This configuration allows unlimited variable speed operation. The generated voltage is rectified and transformed into direct current and voltage whatever the rotation speed of the machine. A detailed description of the turbine is presented in [1]. In the wind chain, the power converter topology has two levels (Figure 3). Each converter consists of three arms. Each arm contains high and low IGBTs. Failures in WEC systems are mainly caused by variations in weather conditions and faults in power converters. Recent studies have shown that more than 21% of the failures in WEC systems are caused by faults in the power conversion stage [33], though the WT and the gearbox failures cause the longest downtime [34]. Many elements might cause the fatigue of the switching devices. The fatigue particularly affects the dynamics of the component and consequently may result in extra switching losses and even system failure. The fatigue is usually modeled by increasing the internal resistance of the component. In this work, it is assumed that the internal resistance is equal to zero during normal operating conditions. Then, the resistance increases to reflect fatigue. Therefore, it is necessary to detect the onset of fatigue to prevent the overall failure of the converter. Several electrical and mechanical variables need to be analyzed in detail to narrow down the failed parts. For instance, the collector-emitter voltage of an IGBT rises sharply just before the failure occurs.
This can be used as an excellent indicator for preventive maintenance purposes in wind systems [35]. For the sake of simplicity, only IGBT11 (rectifier stage) and IGBT21 (inverter stage) are included in the FD study. Three types of faults are considered: open-circuit, wear-out, and short-circuit (Table 1). It is worth noting that the wear-out fault is modeled by an internal resistance of 2 Ω.
The behavior of some electrical and mechanical variables for different fault scenarios is presented in Figures 4 to 8.
In this study, 12 variables have been generated for modeling and fault classification as listed in Table 2 [36]. To demonstrate the effectiveness of the developed methodologies, real bearing vibration data are used as an example [1].

IV. RESULTS AND DISCUSSIONS
Seven working modes (1 healthy and 6 faulty modes) are evaluated as illustrated in Table 3. Each operating mode is adequately qualified over 2000 10-time-lagged observations. The swarm size, maximum number of iterations, cognition coefficient, and social coefficient of the PSO algorithm are equal to 20, 100, 2, and 2, respectively. The number of hidden layers of the NN structure is set to 10, while the number of hidden neurons is equal to 50 in the hidden layer.
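The PSO settings above can be sketched as a binary PSO for feature selection. The following is a minimal Python illustration under stated assumptions, not the authors' code: the binary (sigmoid-transfer) variant, the inertia weight `w`, the velocity clamp `v_max`, and the toy fitness function are all introduced here for illustration. In the paper's setting, the fitness would presumably reflect the classification performance obtained with the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, swarm_size=20, max_iter=100,
               c1=2.0, c2=2.0, w=0.7, v_max=4.0, seed=0):
    """Maximize `fitness` over 0/1 feature masks with a binary PSO.

    swarm_size, max_iter, c1, c2 follow the values reported in the text;
    w and v_max are assumed values.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(swarm_size)]
    vel = [[0.0] * n_features for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(swarm_size), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(max_iter):
        for i in range(swarm_size):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Sigmoid transfer: velocity sets the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical fitness: reward three "relevant" features, penalize extras.
RELEVANT = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for d in RELEVANT if mask[d])
    extras = sum(mask) - hits
    return hits - 0.1 * extras


best_mask, best_fit = binary_pso(toy_fitness, n_features=12)
selected = [d for d, bit in enumerate(best_mask) if bit]
print(selected, best_fit)
```

Because each fitness evaluation in the real setting involves the classifier, reducing the number of samples beforehand directly shortens every one of the swarm_size × max_iter evaluations.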
In the multi-class classification stage, one healthy case (C0) and six faulty cases (C1-C6) are considered (Table 3). Table 4 shows that the proposed algorithms outperform the NN algorithm in terms of accuracy in the training and testing phases. Besides, the PSO-based NN provides the best results compared to methods based on other optimization techniques, such as genetic algorithm (GA), differential evolution (DE), ant colony optimization (ACO), simulated annealing (SA), and gravitational search algorithm (GSA). In addition, the presented results demonstrate that the proposed reduced PSO-based NN (RPSO-based NN) (19.87/1.19) provides an important reduction in computation time compared to the PSO-based NN (36.14/2.47) technique.
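The size of this reduction can be quantified directly from the reported training/testing times. The short Python sketch below computes the relative reduction; the helper name is introduced here for illustration, and the four numbers are the ones quoted above.

```python
def relative_reduction(baseline, reduced):
    """Percentage by which `reduced` is smaller than `baseline`."""
    return 100.0 * (baseline - reduced) / baseline


# Reported computation times (training / testing).
pso_train, pso_test = 36.14, 2.47      # PSO-based NN
rpso_train, rpso_test = 19.87, 1.19    # RPSO-based NN

train_gain = relative_reduction(pso_train, rpso_train)
test_gain = relative_reduction(pso_test, rpso_test)
print(f"training time reduced by {train_gain:.1f}%")
print(f"testing time reduced by {test_gain:.1f}%")
```

With these figures, the reduced variant needs about 45% less training time and about 52% less testing time than the PSO-based NN.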
To prove the effectiveness of using interval-valued data compared to single-valued data, the results of the proposed interval techniques are compared to those of the proposed single-valued techniques. One can notice from the reported results that the proposed IRPSO-based NN techniques reduce the computation time compared to the IPSO-based NN CR (38.86/2.68) and IPSO-based NN UL (40.03/2.76) techniques for the training and testing phases. In addition, the proposed IRPSO-based NN techniques not only decrease the computation time but also enhance the accuracy. The accuracy obtained using IRPSO-based NN CR and IRPSO-based NN UL is equal to 99.18/99.15 and 99.71/99.68 for the training and testing phases, respectively. A comparison study between the proposed methodologies and other existing methods, such as recurrent neural network (RNN), general regression neural network (GRNN), SVM, RF, and KNN, is presented. KNN and SVM present a low accuracy, and they are not able to differentiate between the different operating modes. The RNN classifier presents good results in terms of accuracy, but it suffers from a high computation time for both the training and testing phases.
It is clear from Table 4 that the proposed IRPSO-based NN UL method achieves the best accuracy compared to other techniques.
In the second stage, a one-class classification is carried out (Table 5). In this case, each classifier is trained to classify a specific class with a label of 1 or −1. Table 6 summarizes the results of the developed techniques in terms of accuracy and mean computation time. The results in Table 6 show that all the proposed techniques provide good accuracy. In addition, the techniques based on the data-size reduction tool substantially reduce the computation time while achieving almost the same accuracy. Thus, the developed techniques, based on the PSO algorithm for feature selection and ED for data-size reduction, can strongly decrease the size of the dataset while keeping the most informative features. In addition, the presented results demonstrate that the proposed IRPSO-based NN UL technique gives the best trade-off between accuracy (99.71/99.68) and computation time (23.17/1.35) for the training and testing phases.
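The 1/−1 labeling used in this stage can be illustrated with a short Python sketch. It assumes a one-vs-rest reading of the text, in which the samples of the targeted class receive the label 1 and all other samples receive −1; the function name and the example label vector are hypothetical.

```python
def one_vs_rest_labels(labels, target):
    """Map multi-class labels to +1 (target class) or -1 (any other class)."""
    return [1 if y == target else -1 for y in labels]


modes = ["C0", "C1", "C2", "C3", "C4", "C5", "C6"]   # 1 healthy + 6 faulty
y = ["C0", "C3", "C3", "C6", "C0"]                   # hypothetical sample labels

# One binary target vector per operating mode, one classifier per vector.
binary_targets = {m: one_vs_rest_labels(y, m) for m in modes}
print(binary_targets["C3"])
```

Each of the seven binary target vectors would then be used to train a separate classifier on the same selected features.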

V. CONCLUSION
Diagnosis in WES is important to ensure reliable power production and optimal energy harvesting, because WES usually suffer from several faults due to a difficult outdoor environment. This work develops new approaches aimed at improving the operation of WES through intelligent and innovative WES fault diagnosis frameworks. To do this, an enhanced NN-based classifier using PSO and data reduction is proposed. The improved PSO algorithm is applied to enhance the classification accuracy by removing the irrelevant features and extracting the most significant ones from data reduced using the Euclidean distance method. The reduced PSO-NN (RPSO-NN) technique is characterized by higher accuracy and reduced computation time and storage cost. To deal with model uncertainties (measurement errors, noise, variable variability, etc.) in WES, the developed technique is extended to interval-valued data with the aim of achieving greater accuracy. Therefore, based on the application of the interval-valued dataset, four techniques were proposed to deal with uncertainties in WES. The proposed methodologies not only improve the diagnosis abilities but also reduce the computation time and storage cost. The presented results demonstrate the effectiveness of the proposed approaches for fault diagnosis of WES. As future work, we will explore other machine learning and deep learning algorithms with single-valued and interval-valued representations, allowing a comparison of the fault diagnosis performance of each algorithm for certain and uncertain WES. Finally, the developed fault diagnosis techniques will be utilized in practice to help improve the operation of WEC systems. They will be tested and validated using simulated and real data under extreme conditions and using different simulation scenarios.

He has published more than 120 journals and conference papers, and he is the author of two books and two book chapters. His research interests include systems control with applications arising in the contexts of power electronics, energy conversion, renewable energies integration, and smart grids.
KAMALELDIN ABODAYEH received the M.Sc. degree in functional analysis from University College Dublin, the Ph.D. degree from University College Cork, Ireland, in 1997, and the Ph.D. degree from the Department of Process Engineering, University College Cork. Since 2001, he has been with Prince Sultan University, Saudi Arabia. He has published more than 60 articles in various areas of pure and applied mathematics. His research interests include functional analysis, theoretical physics, discrete potential theory, fixed point theory, and quality monitoring and statistical hypothesis testing.
KAIS BOUZRARA is currently a Professor of electrical engineering with the Laboratory of Automatic Signal and Image Processing, National Engineering School of Monastir, Monastir, Tunisia. He has more than 15 years of combined academic and industrial experience. He has published more than 80 refereed journals and conference publications and book chapters. His research interests include systems engineering and control, with emphasis on process modeling, monitoring, and estimation.
HAZEM NOUNOU (Senior Member, IEEE) is currently a Professor of electrical and computer engineering with Texas A&M University at Qatar. He has more than 19 years of academic and industrial experience. He has significant experience in research on control systems, database control, system identification and estimation, fault detection, and system biology. He has been awarded several NPRP research projects in these areas. He has successfully served as the lead PI and a PI for five QNRF projects, some of which were in collaboration with other PIs in this proposal. He has published more than 200 refereed journals and conference papers and book chapters. He has served as an associate editor and on the technical committees for several international journals and conferences.

MOHAMED NOUNOU (Senior Member, IEEE) is currently a Professor of chemical engineering with Texas A&M University at Qatar (TAMU). He has more than 19 years of combined academic and industrial experience. He has successfully served as the lead PI and a PI for several QNRF projects (six NPRP projects and three UREP projects). He has published more than 200 refereed journals and conference publications and book chapters. His research interests include systems engineering and control, with emphasis on process modeling, monitoring, and estimation. He is a Senior Member of the American Institute of Chemical Engineers (AIChE).