Fault Detection and Isolation of a Pressurized Water Reactor Based on Neural Network and K-Nearest Neighbor

Nuclear power plants (NPPs) are complex dynamic systems with multiple sensors and actuators. Faults in these actuators and sensors can degrade the system's performance and cause serious safety issues, which calls for the development of systems that detect and isolate such faults. In this study, fault detection and diagnosis (FDD) based on a neural network (NN) and the K-nearest neighbor (KNN) algorithm is applied to a pressurized water reactor (PWR). Faults are first detected with the NN; the KNN algorithm is then used to classify them. The proposed approach is capable of classifying a variety of actuator faults, sensor faults, and multiple simultaneous actuator and sensor faults. A set of simulation results is provided to demonstrate the accuracy of the FDD method, and the classifier's performance is compared with that of other machine learning techniques.


I. INTRODUCTION
Nuclear power plants (NPPs) play a key role in reducing greenhouse gas emissions. However, the safety of their operation remains a significant concern. Nuclear plants are complex dynamic systems with many actuators and sensors. Because of their vital role in NPPs, any fault in actuators and sensors can degrade the system's performance and cause serious safety issues. Therefore, particular attention should be paid to the detection and diagnosis of such problems to prevent their degradation, which can lead to catastrophic damage. To this end, model-based fault detection and diagnosis (FDD) has been applied to NPPs [1]. This approach uses a mathematical model to describe the normal behavior of the plant; faults in the process are detected and isolated by comparing the system's behavior with that of the fault-free model. However, the difficulty of obtaining exact and accurate models of NPPs hinders the practical application of model-based FDD techniques. As opposed to model-based approaches, data-driven methods do not rely on explicit knowledge about the process. Instead, they use data acquired from the process to construct an empirical model. Recent studies have developed data-driven approaches for monitoring NPPs. For instance, the support vector machine (SVM) algorithm has been used for FDD of NPPs [2], [3]. The simple and flexible structure of principal component analysis (PCA) has garnered widespread interest in the past decade. For instance, Farhan et al. applied data-driven techniques based on PCA along with Fisher discriminant analysis to a control rod withdrawal fault and an external reactivity insertion fault [4]. In another study, an improved PCA was employed for FDD of sensors in an NPP [5]. Another approach to the diagnosis of faults in an NPP was proposed in [6], which used data acquired from a full-scope simulator for a kernel PCA. More recently, the PCA approach was used with multivariate contribution plots (MCP) [7].
Another alternative approach is the neural network (NN), a network of neurons that learns complex functions through a series of non-linear transformations. NNs have been successfully employed for complex classification tasks such as image recognition [8], speech recognition [9], and system identification [10]-[12]. NNs have also been used to address the fault diagnosis problem in NPPs [13]. A convolutional NN model was developed for abnormality diagnosis in an NPP in [13]. Three types of sensor fault signals were simulated in [14] using a modified ensemble empirical mode decomposition and a probabilistic NN. In [14], a distributed fault diagnosis approach based on a fuzzy NN and data fusion was proposed, and the efficiency of the diagnostic approach was improved in [15].
In more recent work [16], fault diagnosis performance was tested via a comparison of a radial basis network and the Elman NN. The K-nearest neighbor (KNN) algorithm is a nonparametric method that can be used for classification, regression, and pattern recognition problems [17]. The KNN algorithm is simple and easy to implement. The purpose of KNN classification is to categorize data points based on the classification of their neighbors, where K represents the number of nearest neighbors considered when determining the object's class. Although KNN has been applied successfully to fault detection in industrial processes [18]-[20], there are only a few studies that use this approach for fault diagnosis and identification. In this study, NN and KNN are applied for the first time to the detection and classification of single and multiple simultaneous sensor and actuator faults in a pressurized water reactor (PWR). In this framework, faults are first detected using an NN, and the KNN method is then used to classify them. Compared to existing techniques, the KNN approach is more accurate and requires less computational time.
Studies in the existing body of literature have primarily focused on faults that affect either the sensor or the actuator, but not both. Moreover, most studies assume that only a single fault is injected into the system at a time. A realistic study should establish an FDD scheme for both actuator and sensor failures and should consider the injection of multiple simultaneous faults. The FDD proposed in [5] is capable of detecting and isolating multiple simultaneous faults, but it is limited to sensor faults. In [21], a simple case of sensor and actuator faults was studied through the development of an NN-based fault detection technique to determine the presence of saturation faults in the actuator and bias faults in the power and temperature sensors of a PWR. The present study can be considered an extension of [21]: it examines drift faults in the sensors and offset faults in the actuator, and it studies the injection of multiple simultaneous sensor and actuator faults in addition to single faults. The KNN algorithm is used to perform the classification of faults. The main contributions of this paper are as follows:
• The NN technique and KNN algorithm are employed for fault detection and classification in a PWR.
• Actuator offset, actuator saturation, sensor bias, and sensor drift faults are studied.
• Both single faults and multiple simultaneous actuator and sensor faults are considered.
• The proposed classifier is compared to other machine learning techniques.
The rest of this paper is organized as follows: Section II provides a description of the PWR process. Section III presents the data collection for FDD. The classification methods are described in Section IV, and Section V describes the efficacy of the proposed technique. Finally, conclusions are drawn in Section VI.

II. PRESSURIZED WATER REACTOR
The PWR model used in this study can be found in the literature [22]. The PWR mathematical model assumes a point kinetics equation coupled with six delayed neutron groups and a lumped thermal hydraulic model. The dynamic model is described in (1) through (5) [22].
where P is the neutronic power; Λ is the prompt neutron generation time; C_i, λ_i, and β_i are the delayed neutron precursors' concentration, decay constant, and fraction of delayed neutrons, respectively; H_f and H_c denote the proportionality constants; τ_f, τ_c, and τ_r denote the time constants; and T_f, T_c1, and T_c2 are the temperatures of the fuel and of coolant nodes 1 and 2, respectively.
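For reference, the neutronics part of such a model takes the standard point kinetics form sketched below. This is a sketch of the structure in [22], not a verbatim reproduction: ρ(t) denotes the net reactivity and β = Σ β_i the total delayed neutron fraction. The remaining equations of (1)-(5) are lumped heat balances for T_f, T_c1, and T_c2 parameterized by H_f, H_c, τ_f, τ_c, and τ_r; see [22] for their exact form.

```latex
\frac{dP}{dt} = \frac{\rho(t) - \beta}{\Lambda}\,P + \sum_{i=1}^{6} \lambda_i C_i,
\qquad
\frac{dC_i}{dt} = \frac{\beta_i}{\Lambda}\,P - \lambda_i C_i,
\quad i = 1, \dots, 6.
```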

III. DATA COLLECTION FOR FDD
This section discusses the application of the NN to detect faults in the actuator and sensors of the previously described PWR plant. The PWR plant is assumed to be controlled by a robust PID controller that is carefully tuned and operating in the range of 80%-100% full power.

A. TYPES OF FAULTS
Six single faults and two simultaneous faults are considered in this study, as shown in Table 1. The fault types considered are bias, drift, actuator offset, and actuator saturation faults, described as follows:
Bias fault. This is one of the most common sensor faults, corresponding to a constant offset added to the sensor output; it may be caused by inappropriate calibration or physical changes in the sensor [23]. Bias failures are common in NPPs, and their maintenance can be costly [1]. In this study, the bias fault is injected into the power and temperature sensors at a certain time.
Drift fault. This consists of a time-varying offset [24]. A drift fault is difficult to detect because the drifting amplitude is initially low [25]; it is therefore important to have a performant sensor drift FDD. Drift faults are common in NPPs and can cause power reduction [1]. As with the bias fault, the drift fault is injected into the power and temperature sensors at a certain time.
Actuator saturation fault. This occurs when the actuator (control rod system) output saturates at a set limit value. This phenomenon must be considered because of physical limitations that, in practice, can lead to significant deterioration of the system [26].
Actuator offset fault. This corresponds to an offset added to the control rod system at a certain time. This failure can occur because of design or manufacturing defects in the actuator [27].
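To make the four fault types concrete, the sketch below shows one way to inject them into a signal; the magnitudes, onset time tf, and drift rate are illustrative placeholders, not the values used in the study.

```matlab
% Illustrative fault injection on a signal y sampled at times t (seconds).
% All magnitudes and the onset time tf are hypothetical examples.
t  = 0:0.001:300;            % time vector, 1000 Hz sampling
y  = ones(size(t));          % placeholder healthy signal (e.g., normalized power)
tf = 161;                    % fault onset time (s)
on = t >= tf;                % logical index of post-fault samples

y_bias  = y + 0.05 * on;               % sensor bias: constant offset after tf
y_drift = y + 0.001 * (t - tf) .* on;  % sensor drift: ramp growing from tf
u_sat   = min(y, 0.9);                 % actuator saturation: output clipped at a limit
u_off   = y + 0.03 * on;               % actuator offset: constant shift on the command
```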

B. RESIDUAL GENERATION WITH NEURAL NETWORK
An NN is used to detect faults in the PWR: two NNs are trained to represent the power and temperature of the original (non-faulty) power plant, so that when the nuclear plant presents faults, residuals are generated between the faulty plant and the non-faulty NN models. The data generated for training these two NNs are described in detail in [21]. Both networks are trained independently to capture the behaviour of the closed-loop process during normal operation: one NN is dedicated to learning the dynamics of the power, whereas the other is dedicated to the reactor temperature. Both NNs used in this study are two-layer feedforward networks, as shown in Fig. 1. They have a tanh activation function in the hidden layer (layer 1) and a linear function in the output layer (layer 2). Levenberg-Marquardt optimization is selected to train the networks; this algorithm uses an approximation of Newton's method rather than the gradient descent method. The best NN obtained for the power has five neurons in the hidden layer, 999 training epochs, and a mean square error (MSE) of 5.9 × 10^-4. The best NN obtained for the temperature has six neurons in the hidden layer, 1000 training epochs, and an MSE of 0.10225.
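Such a network can be assembled, for example, with MATLAB's fitnet, whose defaults match the architecture described (tanh hidden layer, linear output, Levenberg-Marquardt training). The input matrix X and power target Ppow below are synthetic placeholders, not the plant data of [21].

```matlab
% Minimal sketch of training the power-estimation network.
% Data layout per the Deep Learning Toolbox: rows = features, columns = samples.
X    = rand(3, 2000);                    % placeholder inputs (e.g., power demand and past values)
Ppow = sum(X, 1) + 0.01*randn(1, 2000);  % placeholder measured-power target

net = fitnet(5, 'trainlm');              % 5 hidden neurons (tanh), linear output, LM training
net.trainParam.epochs = 1000;            % cap on training epochs
[net, tr] = train(net, X, Ppow);         % train on normal-operation data
mseP = perform(net, Ppow, net(X))        % mean squared error of the fit (cf. 5.9e-4 reported)
```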
The proposed fault detection method is based on the scheme shown in Fig. 2. The input P_dem corresponds to the power demand, and the outputs P and T correspond to the power and temperature measurements, respectively. The residuals result from the measurement error between the sensor measurements and the NN estimations: e_1 is the error between the measured power and the power estimated by neural network 1, and e_2 is the error between the measured temperature and the temperature estimated by neural network 2. Threshold alarms are defined to detect the faults: a fault is flagged when a residual value exceeds the alarm threshold. (Notation in Fig. 1: x_i, i = 1, ..., p, are the inputs; w_ij are the weights from the input to the hidden layer; b_j, j = 1, ..., k, are the biases of the neurons in the hidden layer; A_1 is the activation function in the hidden layer; w2_j are the weights from the hidden to the output layer; b is the bias of the output neuron; A_2 is the linear activation function of the output layer; and Y is the output.)
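A minimal residual evaluation consistent with Fig. 2 is sketched below; the signals and the threshold values th1 and th2 are hypothetical placeholders, chosen per application.

```matlab
% Residual-based alarm logic (all values below are illustrative placeholders).
Pmeas = ones(1, 2000);   Phat = Pmeas + 0.001*randn(size(Pmeas));  % measured vs NN-estimated power
Tmeas = 300*ones(1, 2000); That = Tmeas + 0.05*randn(size(Tmeas)); % measured vs NN-estimated temperature

e1 = Pmeas - Phat;                                % power residual
e2 = Tmeas - That;                                % temperature residual
th1 = 0.02;  th2 = 0.5;                           % hypothetical alarm thresholds
faultDetected = (abs(e1) > th1) | (abs(e2) > th2); % true wherever a fault is flagged
```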

C. DATA GENERATION FOR FAULT CLASSIFICATION
The fault classification problem is solved using the data of power, temperature, and the two residuals (e_1 and e_2). To create a meaningful representation of the data, four data points are collected in each of the eight fault cases for 100 simulation runs and sampled at a frequency of 1000 Hz. The four data measurements in the presence of bias-type faults in the power sensor are shown in Figs. 3-6: the variation of the reactor power in Fig. 3, the variation of the temperature in Fig. 4, and the variations of the two residuals e_1 and e_2 in Fig. 5 and Fig. 6, respectively. Only five simulation runs are presented here. The sensor bias fault was injected at a random time in each simulation run; for instance, a bias fault occurs at 161 seconds in the first simulation. The fault effect is more obvious for the temperature, whereas it is less noticeable for the power, which is regulated by the PID controller.

IV. FAULT CLASSIFICATION METHODS
Three classification algorithms are analysed in this study: the KNN, SVM, and NN classifiers. The SVM and NN classifiers are employed to benchmark the performance of the KNN classifier.

A. NN-BASED CLASSIFICATION
The standard NN used for classification purposes is a two-layer feedforward network with a sigmoid function in the hidden layer and a softmax transfer function in the output layer. The NN structure is shown in Fig. 7. The hidden layer transforms the input data into a higher representation through the nonlinear transformation
h = σ(w x + b),   (7)
where x and h are the input vector and hidden representation, respectively; b are the biases of the neurons in the hidden layer; w are the weights from the input to the hidden layer; and σ is the sigmoid activation function. The transformation of (7) without the activation function is applied to the output of the last hidden layer:
z = w_2 h + b_2,   (8)
where h represents the last hidden layer, b_2 the biases of the output neurons, and w_2 the weights from the hidden to the output layer. The softmax function is then used to compute each output neuron. In this study, the neural network is trained using scaled conjugate gradient backpropagation. The goal of training the network is to maximize its accuracy, defined as the fraction of samples that are correctly classified [28].

B. KNN-BASED CLASSIFICATION
The KNN method is a simple non-parametric classification method. Despite the simplicity of the algorithm, it is known to perform well, and it is an important benchmark method [29], [30]. KNN performs the classification task based on a similarity index given by a distance measure; k is an integer value, typically lying in the range of 3 to 10. It is advisable to choose an odd value of k to obtain a clear prediction. Among all the input classes stored by the algorithm, the class decision is predicted by the majority vote of the neighboring points nearest to the query. Distance is central to this algorithm: distance measurements quantify how far apart individuals are in a space, and the Euclidean distance is the most common distance measurement method. For example, if x = (x_1, x_2, x_3, ..., x_n) and y = (y_1, y_2, y_3, ..., y_n) are two points in Euclidean space, then the Euclidean distance between x and y can be expressed as follows [31]:
d(x, y) = sqrt( (x_1 - y_1)^2 + (x_2 - y_2)^2 + ... + (x_n - y_n)^2 ).
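The decision rule fits in a few lines. The sketch below is a plain majority-vote KNN with Euclidean distance, written from scratch for illustration; the study itself uses MATLAB's built-in classifiers (Section IV-D).

```matlab
function yhat = knn_predict(Xtrain, ytrain, xquery, k)
% Majority-vote KNN with Euclidean distance (illustrative implementation).
% Xtrain: n-by-d training samples; ytrain: n-by-1 labels; xquery: 1-by-d point.
d = sqrt(sum((Xtrain - xquery).^2, 2));  % distance from the query to every training sample
[~, idx] = sort(d);                      % order training samples by increasing distance
yhat = mode(ytrain(idx(1:k)));           % most frequent class among the k nearest
end
```

For instance, yhat = knn_predict(X, y, X(1,:), 5) returns the majority label among the five training samples nearest to the first sample.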

C. SVM CLASSIFIER
The SVM is a machine learning algorithm based on structural risk minimization and statistical learning that is used for data classification and regression [32]. The SVM method has been successfully employed for various applications separating data into two or more classes. The aim of using the SVM is to find an optimal hyperplane that separates data points of one class from those of another class. An optimal hyperplane is defined as one that maximizes the margin of separation between two classes. Thus, the SVM is basically employed to address classes that are linearly separable. For nonlinear cases, the classifier may not perform well. Hence, kernel functions are used for nonlinear transformation. A kernel function turns a nonlinearly separable object into a linearly separable one by mapping it into a higher dimensional feature space. Common kernel functions include the linear kernel, polynomial kernel, and Gaussian radial basis function kernel [33].
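For instance, the Gaussian radial basis function kernel (the basis of the fine Gaussian SVM retained in Section IV-D) scores the similarity of two points x and y as

```latex
K(x, y) = \exp\!\left( -\frac{\lVert x - y \rVert^{2}}{2\sigma^{2}} \right),
```

where the kernel scale σ controls how quickly the similarity decays with distance.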

D. TRAINING PROCESS
1) DATA PREPROCESSING
The data collected in Section III are used to train the three classifiers. Before training, the raw measured data must be transformed into a form suitable as input to the learning classifiers. To this end, the data pass through several preprocessing steps: first, the data are normalized between 0 and 1; the data are then re-sampled to 100 Hz, because the full datasets are too large for a personal computer to accommodate; finally, the data are reshaped into a vector matrix.
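These steps can be expressed compactly. The sketch below assumes the raw signals are stored as a samples-by-features matrix and uses a simple decimation by 10 (1000 Hz to 100 Hz); the matrix Xraw is a placeholder.

```matlab
% Preprocessing sketch: min-max normalization, downsampling, reshaping.
Xraw = rand(300000, 4);                              % placeholder raw data: power, temperature, e1, e2
Xn = (Xraw - min(Xraw)) ./ (max(Xraw) - min(Xraw));  % scale each column to [0, 1]
Xd = Xn(1:10:end, :);                                % resample 1000 Hz -> 100 Hz (keep every 10th sample)
x  = Xd(:);                                          % reshape into a vector as needed by the classifier
```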

2) NEURAL NETWORK
To train the NN classifier, the collected data are sorted into a training set (50%), a validation set (25%), and a testing set (25%). The validation and testing sets are used to avoid overfitting and to check the generalization properties, respectively. The NN is trained using the scaled conjugate gradient for 1000 training epochs. The number of neurons in the hidden layer is set to five, as this is where the largest improvement is achieved. The number of neurons in the output layer is fixed at eight, corresponding to the number of elements in the target vector.
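In MATLAB this configuration corresponds roughly to the following sketch; the 50/25/25 split and training options mirror the text, while the feature matrix X and one-hot target matrix T are placeholders.

```matlab
% NN classifier sketch: patternnet (sigmoid hidden layer, softmax output by default).
X = rand(4, 800);                        % placeholder features (rows) by samples (columns)
T = full(ind2vec(randi(8, 1, 800)));     % placeholder one-hot targets for the 8 fault cases

net = patternnet(5, 'trainscg');         % 5 hidden neurons, scaled conjugate gradient
net.divideParam.trainRatio = 0.50;       % 50% training
net.divideParam.valRatio   = 0.25;       % 25% validation (early stopping)
net.divideParam.testRatio  = 0.25;       % 25% testing (generalization check)
net.trainParam.epochs = 1000;            % training epoch budget
[net, tr] = train(net, X, T);
yhat = vec2ind(net(X));                  % predicted class indices
```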

3) KNN
For the KNN classifier, six different methods available in MATLAB are trained, namely the fine KNN, medium KNN, coarse KNN, cosine KNN, cubic KNN, and weighted KNN. Table 2 presents the definition of each classifier. In this paper, fivefold cross-validation is performed to avoid overfitting: the data are divided into five folds, of which four are used for training and one for testing, and this operation is repeated five times so that each fold is used for testing exactly once. The average test error is obtained by averaging over the five folds. Among the six KNN classifiers, the one with the best performance is found to be the weighted KNN. The classification accuracies of the six classifiers are detailed in the appendix (Table 8). For the rest of the paper, the KNN model considered is the weighted one.
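A weighted KNN of the kind offered by MATLAB's Classification Learner can be reproduced with fitcknn; the sketch below assumes its usual preset (10 neighbors, squared-inverse distance weighting), and X and y are placeholder data.

```matlab
% Weighted KNN with fivefold cross-validation (X: samples-by-features, y: labels).
X = rand(800, 4);  y = randi(8, 800, 1);         % placeholder data, 8 fault classes
mdl = fitcknn(X, y, 'NumNeighbors', 10, ...
    'Distance', 'euclidean', 'DistanceWeight', 'squaredinverse');
cv  = crossval(mdl, 'KFold', 5);                 % fivefold cross-validation
acc = 1 - kfoldLoss(cv)                          % average cross-validated accuracy
```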

4) SVM
For the SVM classifier, six different methods available in MATLAB are trained: linear SVM, quadratic SVM, cubic SVM, fine Gaussian SVM, medium Gaussian SVM, and coarse Gaussian SVM. The definition of each classifier is presented in Table 2. As with KNN, the performance of the SVM is evaluated using fivefold cross-validation. Among the six SVM classifiers, the fine Gaussian SVM is found to perform best. The classification accuracies of the six classifiers are provided in the appendix (Table 8). For the remainder of the paper, the SVM model considered is the fine Gaussian SVM.
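Multiclass SVM in MATLAB is built from binary learners via error-correcting output codes. A sketch of a fine Gaussian variant (a Gaussian kernel with a small kernel scale) under the same fivefold protocol follows; the kernel scale and the data are illustrative.

```matlab
% Fine Gaussian SVM via the ECOC multiclass wrapper (kernel scale is illustrative).
X = rand(800, 4);  y = randi(8, 800, 1);     % placeholder data, 8 fault classes
tpl = templateSVM('KernelFunction', 'gaussian', 'KernelScale', 0.5);
mdl = fitcecoc(X, y, 'Learners', tpl);       % one-vs-one binary SVMs by default
cv  = crossval(mdl, 'KFold', 5);             % fivefold cross-validation
acc = 1 - kfoldLoss(cv)                      % average cross-validated accuracy
```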

V. FAULT CLASSIFICATION RESULTS
The simulation is performed to test the performance of the three classifiers: the KNN, NN, and SVM classifiers. The three classifiers are trained to classify the eight fault cases (Table 1). Fig. 8 shows the training performance of the NN, which can be seen to be good. The performance of the three classifiers is evaluated with confusion matrix tables. The basic statistical results of the confusion matrix can be extended to three indicators, namely accuracy, precision, and recall, calculated as follows [31]:
Accuracy = (TP + TN) / (TP + TN + FP + FN),
Precision = TP / (TP + FP),
Recall = TP / (TP + FN),
where TP and TN are true positives and true negatives, respectively, and FP and FN are false positives and false negatives, respectively.
Table 3 is the confusion matrix of the NN classifier. A substantial share of the faults is misclassified, with fractions of up to 15.1% and 20.2% incorrectly classified as F2, F5, and F6. These results demonstrate that the NN classifier is imprecise and unreliable.
Table 4 is the confusion matrix of the SVM classifier. In general, the results are broadly correct, except for two fault modes that are poorly classified (those belonging to F3 and F4). Of the faults belonging to F3, 24.1% are misclassified as F1 and 22.7% as F4. Faults belonging to F4 are only 31.6% correctly classified and 54.5% incorrectly classified; this means that faults belonging to F4 have a greater chance of being wrongly classified than of being correctly classified. The overall 65.3% accuracy of the SVM shows that it has better overall accuracy than the NN classifier (57.5%). However, the SVM classification accuracy is poor in one fault mode (F4).
Table 5 is the confusion matrix of the KNN classifier. It is worth noting that the KNN classifier is more accurate than the others. The overall accuracy is good, although faults belonging to F4 remain the least accurate: 69% are correctly classified and 24.9% are incorrectly classified as F3. The KNN classifier presents an overall accuracy of 85.3%, compared to 68.5% and 57.5% for the SVM and NN classifiers, respectively, meaning that KNN is undoubtedly the best performer.
Receiver operating characteristic (ROC) curves are also used to analyse the performance of the classifiers. By definition, an ROC curve shows the true positive rate versus the false positive rate for different thresholds of the classifier output. It thus visualizes the classification performance under different decision thresholds, making it a good tool for evaluating the performance of an algorithm. The ROC curves generated for the NN classifier are shown in Fig. 9; from their shapes, the NN can be said to be a reasonably accurate algorithm for the eight faults because the curves all lie away from the diagonal. For brevity, the ROC curves for KNN and SVM are not provided. Instead, the area under the curve (AUC) is used as a summary of the ROC curve, describing how much the curve is stretched toward the upper left corner away from the diagonal [35]. The AUC for the different classifiers is summarized in Table 6, and the overall performance is given in Table 7. The AUC measurements reveal good classification accuracy for all three classifiers. Nevertheless, the KNN classifier has by far the best classification performance, with an average AUC of 0.95; the SVM and NN classifiers present average AUCs of 0.91 and 0.87, respectively.
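The sketch below computes these indicators from predicted and true labels, treating each class one-versus-rest, along with a per-class ROC AUC via perfcurve; the labels and scores are placeholders, where in practice the scores would come from the trained classifier's predict output.

```matlab
% Confusion-matrix indicators and one-vs-rest AUC (all data here are placeholders).
ytrue  = randi(8, 800, 1);                 % true fault classes F1..F8
ypred  = randi(8, 800, 1);                 % predicted classes from a classifier
scores = rand(800, 8);                     % classifier scores, one column per class

C = confusionmat(ytrue, ypred);            % rows: true class, columns: predicted class
[acc, prec, rec, auc] = deal(zeros(1, 8)); % preallocate per-class metrics
for c = 1:8
    TP = C(c, c);  FP = sum(C(:, c)) - TP;  FN = sum(C(c, :)) - TP;
    TN = sum(C(:)) - TP - FP - FN;
    acc(c)  = (TP + TN) / (TP + TN + FP + FN);
    prec(c) = TP / (TP + FP);
    rec(c)  = TP / (TP + FN);
    [~, ~, ~, auc(c)] = perfcurve(ytrue == c, scores(:, c), true);  % ROC AUC, class c vs rest
end
```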
In addition to being accurate, the KNN classifier has the lowest computational time, with a training time of 3.7 × 10^1 seconds, compared to 1.15 × 10^3 seconds and 4.5 × 10^4 seconds for the NN and SVM classifiers, respectively.

VI. CONCLUSION
This paper proposes a fault detection and diagnosis (FDD) method using NN and KNN approaches, where the NN is used for fault detection and the KNN is used for fault classification. In the diagnosis process, data obtained from the PWR are used to train the classifiers. Actuator faults and several types of sensor faults, such as drift and bias faults, are employed to test the classifiers. Moreover, the injection of multiple simultaneous faults is considered. The proposed classification method demonstrates good performance and can be used effectively for the diagnosis of PWR faults; eight different faults have been successfully classified.
The KNN classifier is also compared to two other machine learning algorithms (NN and SVM). The KNN classifier outperforms the other classifiers in identifying single and multiple simultaneous sensor and actuator faults. A detailed analysis has been performed to compare the classifiers by computing ROC curves, AUC measures, and confusion matrices. The KNN has a higher average AUC than the NN and SVM classifiers, and the confusion matrices confirm the outstanding performance of the KNN classifier over the other techniques: it has an overall accuracy of 85.3%, compared to 68.5% and 57.5% for the SVM and NN classifiers, respectively. The KNN algorithm is thus doubtlessly a better performer than the other techniques. The SVM has a better overall accuracy than the NN classifier; however, the SVM classifier is defective in that it fails to classify fault F4, which has a greater chance of being wrongly classified as F1 (54.5%) than of being correctly classified (31.6%). The NN classifier performs poorly in classifying faults belonging to F1 and F4 but is not considered defective. In addition to providing better classification accuracy, the KNN is also found to be less computationally expensive than the NN and SVM methods.
The simple architecture of the proposed KNN algorithm allows easy implementation. Before integration into a real power plant, however, the proposed classifier must undergo several stages of verification and validation (V&V), review, and approval [36], [37]. Future work will investigate an ensemble classifier that combines NN and KNN to improve the efficiency and accuracy of the detection and diagnosis of faults in a PWR.

APPENDIX
Table 8 presents the classification accuracy of the KNN and SVM classifiers.