Generalized Principal Component Analysis-Based Subspace Decomposition of Fault Deviations and Its Application to Fault Reconstruction

In the present work, we propose a new approach, based on generalized principal component analysis, to decompose the subspace of fault deviations; the decomposition is used for reconstruction-based fault diagnosis in a principal component analysis (PCA) monitoring system. The proposed method lightens the computational burden by eliminating irrelevant information and simplifying the fault subspace. The fault effects are extracted by analyzing the generalized principal components of the normal operating data and the fault data, and the significant fault deviations that cause alarming monitoring statistics are calculated. This is achieved by a two-part feature decomposition procedure. In the first part, the normal operating subspace is extracted by analyzing the generalized principal components of both the historical normal data and the fault data, and the fault-free part of the data is eliminated by projecting the data onto the normal operating subspace. In the second part, PCA is performed on the remaining part of the data, where the largest fault deviation directions are extracted in order. The two-part decomposition yields an integrated fault subspace for all monitoring statistics, which separates the measurement data into two parts for fault reconstruction. One part is related to the normal operating subspace; it is deemed to follow normal rules and therefore does not need to be corrected to remove alarming monitoring statistics. The other part is related to the fault subspace, which contributes to the out-of-control signals. Theoretical support is provided and the related statistical characteristics are analyzed. The feasibility and performance of the method are illustrated with data from the Tennessee Eastman (TE) benchmark process.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Youqing Wang. VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Fault detection and diagnosis in modern industrial processes [1]-[11] has become one of the most critical areas of research in process control over the past decades, and an essential element in the operation of modern engineering systems for avoiding serious consequences and reducing maintenance costs. Since manufacturing processes often have a large number of highly correlated measured variables, dimensionality reduction techniques have been widely used for process data analysis and process improvement. Representative techniques include principal component analysis (PCA) [12], partial least squares (PLS) [13], [14], and Fisher discriminant analysis (FDA) [15], which provide a refined, low-dimensional analysis space by projecting measurement data onto a low-dimensional latent space. These techniques extract basic features from measured data and define the normal operating regions by adapting to acceptable changes. When a process moves outside the normal operating regions, it is hoped that the operator can immediately detect it, diagnose an attributable cause for the deviation, and take the necessary corrective actions to restore the process to a normal state. Although fault detection methods have been studied extensively, fault isolation still requires in-depth consideration. Data-driven fault diagnosis methods have attracted great attention since they require less process knowledge and are more flexible to implement. They diagnose faults by analyzing historical data and current data. There are two major strategies: reconstruction-based methods and classification-based methods.
Classification-based methods, such as Fisher discriminant analysis [16], [17], focus on discriminant feature extraction and matching. Reconstruction-based methods extract the fault subspace and use it as a reconstruction model that eliminates the alarming signals and thereby yields the diagnostic result. Dunia and Qin [18] designed a model of fault deviations to estimate the fault-free part of the measurement data, and they were the first to define the concept of fault reconstruction. The conventional reconstruction methods are based on PCA. These algorithms extract the general fault information from the input data and may not be able to discriminate fault patterns well from normal conditions. To some extent, the dependence of fault reconstruction on the monitoring statistics limits how far the reconstruction method can be improved. For reconstruction, the input data can be separated into a normal operating part and a fault part. A normal operating sample has a large normal operating part and almost no fault part. In contrast, a fault sample has a small normal operating part and an out-of-control fault part, which can cause alarming statistics in monitoring systems. Generalized principal component analysis (GPCA), which can capture the most common variations of two input data sequences, has attracted significant attention in recent years [19]-[21]. If the two input sequences of GPCA are the fault data set and the normal operating data set, then the generalized principal components (GPCs) extracted by GPCA can represent the normal operating part of the fault data set. Here, we call the subspace spanned by these GPCs the normal subspace. Using the normal subspace, we can eliminate the normal operating part from the fault data set to obtain the fault part. The fault subspace can then be found by processing the fault part of the fault data set.
The main objective of this paper is to explore more efficient ways to model the fault information in order to correct the fault more effectively. A GPCA model is built and applied to the normal operating data set and the fault data set, and a PCA monitoring system is established with the normal operating data set. Instead of modeling the fault data directly, we analyze the normal operating part in detail and decompose it through a GPCA model; the fault subspace is then obtained via PCA on the remaining part of the fault data set. In the fault data set, the fault deviations that can cause alarming monitoring statistics are separated from the others in two steps. The first step is to find all the normal operating directions by applying GPCA to the normal operating data set and the fault data set; a normal operating subspace is formed by adaptively selecting these directions with a PCA monitoring system. In the fault data set, the normal operating part is not responsible for the alarming statistics of the monitoring system, whereas the fault part is. The second step is to use the normal operating subspace to extract the major responsible fault deviations for fault reconstruction. The normal operating subspace adjusts its size according to the complexity of the industrial process during GPC extraction, so that the fault subspace can represent the fault deviations more accurately. Moreover, since GPCA is independent of the monitoring system, the performance of the proposed algorithm can be improved by changing the fault detection algorithm. The fault effects are accurately modeled based on the relevant alarming monitoring statistics. Consequently, fault reconstruction can be achieved effectively by paying attention to the fault deviations that cause the alarming monitoring statistics. Relevant statistical analysis and discussions are carried out to further understand the proposed solution.
The rest of the paper is organized as follows. First, the standard PCA algorithm used for fault detection is reviewed, which lays the groundwork for the theoretical support. Subsequently, the proposed algorithm is developed mathematically and its statistical properties are analyzed, with emphasis on the suitability and rationality of the algorithm. Then, the preceding analysis is verified with data from the well-known Tennessee Eastman (TE) benchmark chemical process, and the results are discussed. Finally, the last section draws the conclusions.

II. PRINCIPAL COMPONENT ANALYSIS-BASED FAULT DETECTION
This section describes the fault detection system based on PCA. In general, it uses two subspaces, the principal component subspace (PCS) and the residual subspace (RS), to monitor different types of process variations. Two monitoring statistics, $T^2$ and SPE, are used, reflecting the abnormal changes in each subspace.
Let $X$ be an $n \times j$ normal data matrix in which the rows are observations and the columns are process variables. It is assumed that $X$ is normalized to have zero mean and unit variance. PCA is carried out to separate the systematic information from the residuals in $X$:

$$X = TP^T + E, \qquad T = XP,$$

where $P$ ($j \times r$) is the loading matrix and $T$ ($n \times r$) is the matrix of principal scores, which is derived from the measurement data $X$ by $P$. Here $n$ and $j$ are the numbers of rows and columns of $X$, and $r$ denotes the number of retained principal components (PCs). In this way, the systematic information in $X$ is characterized by $TP^T$ and separated from the residuals $E$, which are considered to be noise. From the perspective of projection, an observation $x$ can be decomposed as $x = \hat{x} + \tilde{x}$, where $\hat{x} = PP^T x$ is the projection onto the PCS and $\tilde{x} = (I - PP^T)x$ is the projection onto the RS.
Similarly, if the projection space is separated into fault and fault-free parts, an observation $x$ can also be decomposed as $x = x^* + x_f$, where $x_f = \Xi f$ is the projection onto the fault subspace and $x^* = x - \Xi f$ is the projection onto the normal operating subspace. $\Xi$ denotes the fault directions; it is an orthonormal matrix of dimension $j \times r_f$ whose columns span the fault systematic subspace, and $r_f$ denotes the number of major fault directions along which the fault systematic information varies. $f$ represents the fault score in the fault systematic subspace, so that $\|f\|$ represents the magnitude of the fault. For a given fault subspace $\Xi$, the fault score $f$ can be calculated by

$$f = \Xi^{\dagger} x, \tag{3}$$

where $(\cdot)^{\dagger}$ is the generalized inverse operator. For a fault observation, $f$ is large; for a fault-free observation, $f$ is very small. Therefore, the information shared by fault and fault-free observations lies in the normal operating subspace.
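As a minimal sketch of the fault-score calculation described above, the following NumPy snippet builds a synthetic faulty observation and recovers its fault score with the generalized inverse. The matrix name `Xi`, the dimensions, and the synthetic values are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
j, r_f = 5, 2

# Hypothetical orthonormal fault directions (j x r_f), obtained here by QR.
Xi, _ = np.linalg.qr(rng.standard_normal((j, r_f)))

# Synthetic fault-free part, made orthogonal to the fault subspace.
x_star_true = 0.1 * rng.standard_normal(j)
x_star_true -= Xi @ (Xi.T @ x_star_true)

f_true = np.array([4.0, -3.0])        # assumed fault scores
x = x_star_true + Xi @ f_true         # faulty observation

f = np.linalg.pinv(Xi) @ x            # fault score via the generalized inverse
x_star = x - Xi @ f                   # reconstructed fault-free part
magnitude = np.linalg.norm(f)         # fault magnitude ||f||
```

Because the columns of `Xi` are orthonormal in this sketch, the generalized inverse reduces to the transpose, and the recovered score equals the one used to build the observation.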
In the PCS and RS, two popular statistical indices [22] are used for fault detection in statistical process monitoring: the squared prediction error (SPE) and Hotelling's $T^2$ statistic. The SPE index is defined as the squared norm of the residual vector $\tilde{x}$,

$$\mathrm{SPE} = \|\tilde{x}\|^2 = \|(I - PP^T)x\|^2, \tag{4}$$

and the process is considered normal if SPE does not exceed its control limit, which is determined from the eigenvalues of the data covariance, with $\lambda_i$ the $i$th eigenvalue. Hotelling's $T^2$ measures the variation in the PCS and is defined as

$$T^2 = x^T P \Lambda^{-1} P^T x. \tag{5}$$

The process is normal if $T^2 \le \tau^2$, where $\tau^2 = \chi^2_{\alpha}(r)$ with confidence level $(1 - \alpha) \times 100\%$, $\alpha$ is the significance level of a weighted Chi-squared distribution [24], and $\Lambda = T^T T/(n - 1)$ denotes the second-order statistics of the principal part of the normal training data.
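The monitoring model of this section can be sketched in a few lines of NumPy. The function names `fit_pca` and `t2_spe` and the synthetic data are assumptions made for illustration; control limits are deliberately left out, so the snippet only evaluates the two statistics.

```python
import numpy as np

def fit_pca(X, r):
    """Return loadings P (j x r) and score variances lam (r,) for normalized X (n x j)."""
    n = X.shape[0]
    cov = X.T @ X / (n - 1)
    eigval, eigvec = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:r]      # indices of the r largest
    return eigvec[:, order], eigval[order]

def t2_spe(x, P, lam):
    """Hotelling's T^2 in the PCS and SPE in the RS for one sample x (j,)."""
    t = P.T @ x                    # principal scores
    x_hat = P @ t                  # projection onto the PCS
    x_tilde = x - x_hat            # projection onto the RS
    t2 = float(np.sum(t**2 / lam))
    spe = float(x_tilde @ x_tilde)
    return t2, spe

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6)) @ rng.standard_normal((6, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # zero mean, unit variance
P, lam = fit_pca(X, r=3)
t2, spe = t2_spe(X[0], P, lam)
```

Here `lam` plays the role of the diagonal of $T^T T/(n-1)$, since the scores of the training data are uncorrelated and their variances equal the retained eigenvalues.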

III. SUBSPACE EXTRACTION APPROACH OF RESPONSIBLE FAULT DEVIATIONS
The proposed method develops a unified fault reconstruction model for any monitoring statistic. It decomposes the responsible fault effects in two parts, which are introduced in Sections III-A and III-B. Before the method is implemented, two data sets are prepared, $y \in \mathbb{R}^{n \times j}$ and $x \in \mathbb{R}^{n_f \times j}$, which have the same number of variables and possibly different numbers of samples. $y$ denotes the normal data, and $x$ denotes one data set collected under one fault status. To develop the PCA monitoring models [25], the normal data set is centered and scaled to zero mean and unit standard deviation. The preprocessing information of $y$ (its mean and standard deviation) is then used to normalize $x$, so that the preprocessed $x$ retains the fault deviation information relative to the normal center.
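The preprocessing step can be illustrated as follows. This is a small sketch on synthetic data; the function name `normalize_with_reference` and the simulated step fault on the first variable are assumptions made for the example.

```python
import numpy as np

def normalize_with_reference(y, x):
    """Scale both data sets with the mean and standard deviation of the normal data y."""
    mu = y.mean(axis=0)
    sigma = y.std(axis=0, ddof=1)
    return (y - mu) / sigma, (x - mu) / sigma

rng = np.random.default_rng(0)
loc, scale = [10.0, 50.0, 3.0], [1.0, 5.0, 0.5]
y = rng.normal(loc, scale, size=(480, 3))     # normal operating data
x = rng.normal(loc, scale, size=(100, 3))     # fault data
x[:, 0] += 5.0                                # assumed step fault on variable 0

y_n, x_n = normalize_with_reference(y, x)
```

After this scaling, the fault data are not re-centered on their own mean, so the offset of the first variable from the normal center is preserved in `x_n`.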

A. GENERALIZED PRINCIPAL COMPONENT EXTRACTION ALGORITHM
The generalized principal components (GPCs) are the generalized eigenvectors corresponding to the $r$ largest generalized eigenvalues of the autocorrelation matrix pencil composed of two data vectors, where $r$ is the number of GPCs. Techniques that extract the GPCs from input signals are called generalized principal component analysis (GPCA). The concept can be stated as follows [26]:

$$R_y v = \lambda R_x v, \tag{6}$$

where $R_y$ and $R_x$ are two $j \times j$ symmetric positive definite matrices. The positive scalar $\lambda$ and the vector $v$ are called the generalized eigenvalue and the corresponding generalized eigenvector of the matrix pencil $(R_y, R_x)$, respectively. According to matrix theory [27], the matrix pencil $(R_y, R_x)$ has $j$ positive generalized eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, j$) and corresponding $R_y$-orthonormal generalized eigenvectors $v_i$, which satisfy

$$R_y v_i = \lambda_i R_x v_i, \qquad v_i^T R_y v_j = \delta_{i,j},$$

where $\delta_{i,j}$ is the Kronecker delta function. If the generalized eigenvalues of the matrix pencil $(R_y, R_x)$ are arranged in descending order, i.e., $\lambda_1 > \lambda_2 > \cdots > \lambda_r > \cdots > \lambda_j > 0$, the generalized eigenvectors corresponding to the first $r$ generalized eigenvalues ($\lambda_i$, $i = 1, 2, \ldots, r$) are referred to as the GPCs of the matrix pencil $(R_y, R_x)$. The matrix pencil $(R_y, R_x)$ is composed of two autocorrelation matrices, $R_y$ and $R_x$. In this paper, $R_y = E(y^T y)$ is the autocorrelation matrix of the historical normal data, and $R_x = E(x^T x)$ is the autocorrelation matrix of the fault data. The common variations of the two input data sequences are captured by the generalized eigenvectors, and the degree to which they are shared is evaluated by the corresponding generalized eigenvalues. Based on the AMEX criterion [28], we propose a generalized information criterion $J(w)$, with $w \in \mathbb{R}^{j \times 1}$, for estimating the GPC of the matrix pencil $(R_y, R_x)$.
The gradient of $J(w)$ with respect to $w$ is given by (10). After some discretization operations, (10) can be rewritten as the nonlinear stochastic learning rule (11), in which $\eta \in (0, 1]$ is the learning rate. The iteration result $w$ is taken as the generalized principal component of the matrix pencil $(R_y, R_x)$ once the difference of $w$ between iterations is less than a threshold $\delta$.
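For readers who want to experiment with the generalized eigenproblem $R_y v = \lambda R_x v$, the sketch below solves it in batch form with a Cholesky factorization. This is a swapped-in technique, not the stochastic learning rule of this section; the function name `generalized_eig`, the sample-average autocorrelation estimates, and the synthetic data are assumptions.

```python
import numpy as np

def generalized_eig(R_y, R_x):
    """Solve R_y v = lam R_x v for symmetric positive definite R_y and R_x.

    Returns the eigenvalues in descending order and the R_y-orthonormal
    eigenvectors as columns.
    """
    L = np.linalg.cholesky(R_x)              # R_x = L L^T
    L_inv = np.linalg.inv(L)
    S = L_inv @ R_y @ L_inv.T                # equivalent symmetric standard problem
    lam, U = np.linalg.eigh(S)               # ascending order
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    V = L_inv.T @ U                          # satisfies V^T R_x V = I
    V = V / np.sqrt(lam)                     # rescale columns so that V^T R_y V = I
    return lam, V

rng = np.random.default_rng(0)
y = rng.standard_normal((480, 4))            # stand-in for normal data
x = rng.standard_normal((100, 4))            # stand-in for fault data
x[:, 0] *= 5.0                               # assumed extra variation on variable 0
R_y = y.T @ y / len(y)
R_x = x.T @ x / len(x)
lam, V = generalized_eig(R_y, R_x)
```

The leading columns of `V` are the directions in which the two data sets vary most alike relative to the fault data, which is the role the GPCs play in the text.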
Remark 1: From Eq. (6), the generalized principal eigenvector $v$ is the common projection direction along which the projections of $R_x$ and $R_y$ have the strongest linear correlation. Therefore, we can obtain the common part of the input data $x$ and $y$. If we regard $x$ as the fault data and $y$ as the normal operating data, then the generalized principal eigenvector(s) and the complementary subspace represent the normal operating subspace and the fault subspace of the industrial process, respectively.
Due to the complexity of the industrial process, the number of generalized eigenvectors is not easily determined in advance. However, the exact number can be obtained through the PCA monitoring model: the normal operating subspace is considered to be exactly extracted when the data reconstructed by the generalized eigenvector(s) are judged to be under the normal operating condition.

B. FAULT RECONSTRUCTION STRATEGY
Before introducing the strategy, we explain the relationship between the normal operating subspace and the fault subspace, as shown in Fig. 1. From the perspective of fault reconstruction, the input data are considered to be composed of two parts, the normal operating part and the fault part. For the normal operating part, a common subspace $W_n$ can be extracted, and each individual sample can be represented by its corresponding vector $f_n$, so that the normal operating part of the sample equals $W_n f_n$. For the fault part, a common subspace $W_f$ can be extracted, and each individual sample can be represented by its corresponding vector $f_f$, so that the fault part of the sample equals $W_f f_f$. Clearly, for one industrial process, the normal operating subspace $W_n$ is unique, whereas the fault subspaces vary with the fault, i.e., $W_{f_1}, W_{f_2}, \ldots, W_{f_r}$. A normal operating sample has a negligibly small $f_f$ and a non-negligible $f_n$, whereas both $f_f$ and $f_n$ of a fault sample are non-negligible. The fault reconstruction strategy is to find the fault subspaces corresponding to the different faults, so that the reconstruction models correct the fault data until they yield no alarming signals in the PCA monitoring system; the model constructed from a fault subspace can then identify the corresponding fault during online operation.
To the best of our knowledge, conventional algorithms develop separate fault reconstruction models to account for the two different monitoring statistics. They therefore depend heavily on a specific monitoring system, such as a PCA-based monitoring system, to build the model. The proposed algorithm is designed to develop fault reconstruction models with any monitoring system. Here, the PCA-based monitoring system is used only for fault detection, which is the basic function of any monitoring system. The flow of the fault subspace extraction is shown in Fig. 2, and its details are introduced in the following steps.
First, the normal data set $y$ and the fault data set $x$ are input into the GPCA model (11), where the autocorrelation matrices of the normal and fault data are estimated from $y$ and $x$, respectively; the generalized eigenvectors $w_i$, $i = 1, \ldots, j$, are then obtained. The $w_i$ are arranged in descending order of the corresponding generalized eigenvalues. The largest generalized eigenvalue corresponds to the variation direction that contributes most to the normal operating condition. If a generalized eigenvalue is significantly larger than one, the projection of the fault data onto the corresponding eigenvector has a strong linear relation with that of the normal operating data. In terms of fault reconstruction, these linear relations are closely related to the common pattern between the fault data and the normal operating data. Therefore, all the strong linear relations can be collected to form a subspace that represents the common pattern. This subspace is defined as the normal subspace $W$, which is built up iteratively, starting from the first direction ($i = 1$).
Second, the normal subspace is estimated by $W = [W \; w_i]$. The fault-free part $x_c$ of the fault data is obtained by double-projecting the fault data set $x$ onto the normal subspace $W$,

$$x_c = W W^T x. \tag{14}$$

Here, the first projection is $W^T x$, which is related only to the normal operating condition; the pattern of the fault data that is closely related to the fault condition is eliminated by this operation. The second projection, $W(W^T x)$, maps the projection related to the normal operating condition back to the original space, which gives the fault-free part $x_c$ of the fault data. Note that if the results (4) and (5) of the PCA monitoring model are "fault", then set $i = i + 1$ and $W = [W \; w_i]$ and apply Eq. (14) again. The PCA monitoring model is used to judge whether the obtained normal operating subspace can sufficiently describe all the normal operating information of the fault data.
Third, the PCA monitoring model, which was previously built from the normal data $y$ through (1) and (2), is used to test the fault-free data set $x_c$. In the PCA monitoring model, two statistical indices, Hotelling's $T^2$ and SPE, are calculated. If the statistical indices indicate that the fault-free data set $x_c$ is "fault", go back to the second step. If it is "normal", go to the next step.
Fourth, eliminate the fault-free part $x_c$ from the fault data set $x$, which gives $x_f$, the almost pure fault part of the fault data set $x$,

$$x_f = x - x_c.$$

The autocorrelation matrix of the fault part is

$$R_f = E(x_f^T x_f).$$

The principal components, which are the eigenvectors associated with the largest eigenvalues of the autocorrelation matrix $R_f$, contain the key fault information of the fault data. Therefore, by analyzing the principal components of $x_f$ we can obtain the fault subspace $W_f$. The number of columns of $W_f$ is determined by the complexity of the fault: a threshold value is defined, and the directions whose eigenvalues are greater than this threshold are retained. The number of retained principal directions is $r_f$.
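The four steps above can be sketched end to end on synthetic data. This is only an illustration under several assumptions that are not taken from the paper: the GPCs are computed in batch form rather than with the learning rule (11), the control limits are empirical quantiles of the training statistics, $W$ is orthonormalized by QR before the double projection, data rows are samples so that Eq. (14) becomes `x @ W @ W.T`, and the names `max_alarm_rate` and `eig_threshold` are hypothetical tuning parameters.

```python
import numpy as np

def gpca(R_y, R_x):
    """Batch solution of R_y v = lam R_x v, eigenvalues in descending order."""
    L_inv = np.linalg.inv(np.linalg.cholesky(R_x))
    lam, U = np.linalg.eigh(L_inv @ R_y @ L_inv.T)
    order = np.argsort(lam)[::-1]
    return lam[order], (L_inv.T @ U)[:, order]

def monitor_stats(X, P, lam):
    """Row-wise Hotelling's T^2 and SPE."""
    T = X @ P
    E = X - T @ P.T
    return np.sum(T**2 / lam, axis=1), np.sum(E**2, axis=1)

def fit_monitor(y, r, q=0.99):
    """PCA monitoring model with empirical control limits (an assumption)."""
    val, vec = np.linalg.eigh(y.T @ y / (len(y) - 1))
    idx = np.argsort(val)[::-1][:r]
    P, lam = vec[:, idx], val[idx]
    t2, spe = monitor_stats(y, P, lam)
    return P, lam, np.quantile(t2, q), np.quantile(spe, q)

def alarm_rate(X, model):
    P, lam, t2_lim, spe_lim = model
    t2, spe = monitor_stats(X, P, lam)
    return float(np.mean((t2 > t2_lim) | (spe > spe_lim)))

def extract_fault_subspace(y, x, r, max_alarm_rate=0.05, eig_threshold=5.0):
    model = fit_monitor(y, r)
    # Step 1: GPCs of the pencil (R_y, R_x), sorted by generalized eigenvalue.
    _, V = gpca(y.T @ y / len(y), x.T @ x / len(x))
    for k in range(1, y.shape[1] + 1):
        # Step 2: add one GPC, orthonormalize, and double-project the fault data.
        W, _ = np.linalg.qr(V[:, :k])
        x_c = x @ W @ W.T
        # Step 3: stop once the monitor judges x_c to be normal.
        if alarm_rate(x_c, model) <= max_alarm_rate:
            break
    # Step 4: PCA on the fault part; keep eigenvalues above the threshold.
    x_f = x - x_c
    val, vec = np.linalg.eigh(x_f.T @ x_f / len(x_f))
    keep = np.argsort(val)[::-1]
    keep = keep[val[keep] > eig_threshold]
    return W, vec[:, keep]

rng = np.random.default_rng(0)
d = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)   # hypothetical fault direction
y = rng.standard_normal((480, 5))
x = rng.standard_normal((100, 5)) + np.outer(rng.normal(6.0, 1.0, 100), d)
mu, sigma = y.mean(axis=0), y.std(axis=0, ddof=1)
y, x = (y - mu) / sigma, (x - mu) / sigma
W, W_f = extract_fault_subspace(y, x, r=3)
```

In this toy setting the leading column of `W_f` should line up closely with the simulated direction `d`, and it is orthogonal to the columns of `W` by construction.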
Following the above four steps, we obtain the fault subspace of one fault data set. The fault data set $x$ refers to one fault when it is used in one cycle of the four steps. Another fault subspace can then be obtained by selecting another fault data set and processing it with another cycle of the four steps. Proceeding one by one in this way, subspaces for all types of faults can be obtained from the corresponding fault data sets. Based on these fault subspaces, a fault library that can identify different types of faults is constructed. When the proposed algorithm is used in a real industrial process, the fault data set collected immediately after a fault is detected by the monitoring system is input into the above four steps, and the fault subspace $\hat{W}_f$ is obtained. Then the inner products of $\hat{W}_f$ with all the candidate fault subspaces in the fault library are calculated. The candidate with the most significant inner product is diagnosed as the fault that has occurred in the industrial process. Moreover, the amplitude of the fault can also be calculated from the fault subspace to illustrate the degree of the fault over time.
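The library lookup can be illustrated with a short sketch. The text does not specify how the "most significant inner product" is scored, so the similarity used here (the squared Frobenius norm of $\hat{W}_f^T W_k$ divided by the smaller subspace dimension) is an assumption, as are the function names and the toy library.

```python
import numpy as np

def subspace_similarity(W_a, W_b):
    """Similarity in [0, 1] between two subspaces given by orthonormal bases."""
    k = min(W_a.shape[1], W_b.shape[1])
    return float(np.linalg.norm(W_a.T @ W_b, "fro") ** 2 / k)

def diagnose(W_hat, library):
    """Return the library entry whose subspace is closest to W_hat, plus all scores."""
    scores = {name: subspace_similarity(W_hat, W) for name, W in library.items()}
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(2)
I6 = np.eye(6)
# Toy fault library with two hypothetical fault subspaces.
library = {"fault_1": I6[:, [0, 1]], "fault_2": I6[:, [2, 3]]}
# Online estimate: a noisy version of the second subspace, re-orthonormalized.
W_hat, _ = np.linalg.qr(library["fault_2"] + 0.05 * rng.standard_normal((6, 2)))
best, scores = diagnose(W_hat, library)
```

A normalized score has the practical advantage that subspaces of different dimensions can be compared on the same scale.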

C. COMMENTS AND PROPERTY ANALYSES
1) THE MODEL STRUCTURE
The conventional fault reconstruction algorithm and the proposed algorithm extract the directions of fault variation from different aspects. The conventional algorithm focuses on the general major variations over the whole fault data space. The proposed algorithm instead focuses on the relative variations from normal to fault, i.e., the main fault effects that are responsible for alarming signals, and on the common variations between normal and fault, i.e., the normal operating conditions that do not cause alarming signals. When the monitoring statistics issue alarms, significant changes have occurred in the industrial process. The GPCA model is used to identify what has and has not changed significantly, revealing the relative changes between the normal case and the fault case.

2) THE FAULT SUBSPACE $W_f$ IS PART OF THE COMPLEMENTARY SUBSPACE OF THE NORMAL OPERATING SUBSPACE $W$
For the proposed algorithm, the fault data set is composed of the alarm-responsible part and the alarm-irresponsible part. Combining this with the definition of $R_f$, we have

$$R_f = E(x_f^T x_f) = (I_j - WW^T)\, R_x\, (I_j - WW^T).$$

Let $P$ denote the eigenvectors of $R_x$. Combining this with the definition of $W_f$, $W_f$ is contained in the subspace spanned by $(I_j - WW^T)P$.
Since $W^T (I_j - WW^T) = 0$ when the columns of $W$ are orthonormal, which means that $I_j - WW^T$ projects onto the complementary subspace of $W$, we conclude that the fault subspace $W_f$ is part of the complementary subspace of the normal operating subspace $W$.
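The reasoning behind this conclusion can be written out step by step. The sketch below assumes that the columns of $W$ are orthonormal ($W^T W = I$), uses the row form $x_f = x(I_j - WW^T)$ of the fault part, and writes $w_f$ for any eigenvector of $R_f$ with a nonzero eigenvalue $\mu$; the symbols $w_f$ and $\mu$ are introduced here for illustration.

```latex
\begin{aligned}
R_f &= E\!\left(x_f^T x_f\right) = (I_j - WW^T)\, R_x\, (I_j - WW^T), \\
R_f\, w_f &= \mu\, w_f,\ \ \mu > 0
  \;\Longrightarrow\;
  w_f = \tfrac{1}{\mu}\,(I_j - WW^T)\, R_x\, (I_j - WW^T)\, w_f, \\
W^T w_f &= \tfrac{1}{\mu}\,\underbrace{W^T (I_j - WW^T)}_{=\,0}\, R_x\, (I_j - WW^T)\, w_f = 0 .
\end{aligned}
```

Every retained fault direction is therefore orthogonal to all columns of $W$, which is exactly the statement that $W_f$ lies in the complementary subspace of $W$.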

3) THE NUMBER OF RETAINED RECONSTRUCTION DIRECTIONS
With conventional algorithms, a higher-dimensional reconstruction model may overcorrect the fault data, while a lower-dimensional reconstruction model may not remove the fault effects sufficiently. In the proposed algorithm, all possible directions with variations equal to normal, i.e., the columns of $W$, are obtained through the GPCA model. The number of columns of $W$ is determined by the monitoring system, so the reconstruction model built from $W$ adapts to the industrial process. Moreover, it is possible that $\mathrm{rank}(W_f) + \mathrm{rank}(W) < j$, which means that the final model needs only the smallest number of directions required to restore the alarming statistics to normal. This can happen because some directions in the complementary subspace of $W$ do not necessarily cause alarming signals even though they show larger variations than normal. These insignificant directions can be ignored.

A. TENNESSEE EASTMAN BENCHMARK PROCESS
In this subsection, the proposed fault reconstruction method is evaluated on the well-known Tennessee Eastman (TE) benchmark chemical process. Since its introduction by Downs and Vogel [33], the TE process has been widely used to test and evaluate process monitoring and fault diagnosis methods [29]-[32]. As illustrated in Fig. 3, the TE process consists of five main operating units: a reactor, a product condenser, a vapor-liquid separator, a recycle compressor, and a product stripper. It contains two blocks of process variables: 41 measured variables and 11 manipulated variables. The process involves four gaseous reactants (A, C, D, and E), two liquid products (G and H), and one by-product (F). Data for 21 faults are also available for simulation, comprising sixteen known faults and five unknown faults. Details of the process description can be found in [31].
The Tennessee Eastman process (TEP) provides an excellent simulation platform to verify the fault diagnosis performance of the proposed method. Process measurements are sampled with an interval of 3 min. Nineteen composition measurements are sampled with time delays that vary from six minutes to fifteen minutes. Four hundred and eighty normal samples are used to develop the monitoring models. Twelve known faults are selected as described in [31], since they can be clearly detected by at least one monitoring statistic. They include four different fault types. Faults 1-6 are associated with step changes in different process variables, e.g., in the A/C feed ratio or the D feed temperature. Faults 7-10 are associated with random variations in certain variables, e.g., an increase in the variability of the reactor cooling water inlet temperature. For Fault 11, there is a slow drift in the reaction kinetics. For Fault 12, two cooling water valves are stuck.
All variables in the normal data space are centered on the mean and scaled to unit variance. First, a GPCA model is developed, including an information criterion and an extraction algorithm. The gradient descent method is used to derive the extraction algorithm from the information criterion. We have also developed PCA monitoring models, with a systematic subspace of dimension 27 and a residual subspace of dimension 15. The number of PCs is determined by the cumulative explained variance rate (CVER) so as to retain 90% of the process variability. The monitoring system is used for online monitoring of the 12 faults, all of which can be clearly detected. Then the normal operating subspace is extracted by applying GPCA to the normal operating data set and the fault data. The number of directions in the normal operating subspace is determined by the detection results of the PCA monitoring models. The fault part of the fault data set is separated by eliminating the normal operating part. The fault subspace is obtained by applying PCA to the fault part of the fault data set. To improve the fault diagnosis performance immediately at the beginning of a process disturbance, only the first 100 fault samples are used. For the 12 faults concerned, the fault space is decomposed into different parts according to their relative changes with respect to the normal case. The modeling results are shown in Table 2, where the most common parts under the fault conditions and normal conditions are evaluated by the maximal values of $T^2$ and SPE.
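The CVER rule for choosing the number of PCs can be sketched as follows. The function name `n_components_by_cver` and the toy data, constructed so that the covariance eigenvalues are known exactly, are assumptions made for illustration; the TE data are not used.

```python
import numpy as np

def n_components_by_cver(X, target=0.90):
    """Smallest number of PCs whose cumulative explained variance reaches target."""
    cov = X.T @ X / (X.shape[0] - 1)
    eigval = np.sort(np.linalg.eigvalsh(cov))[::-1]     # descending eigenvalues
    cver = np.cumsum(eigval) / np.sum(eigval)
    return int(np.searchsorted(cver, target) + 1), cver

rng = np.random.default_rng(0)
variances = np.array([5.0, 3.0, 1.5, 0.3, 0.2])         # assumed eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((50, 5)))       # orthonormal columns
X = Q * np.sqrt(49 * variances)                         # covariance equals diag(variances)
r, cver = n_components_by_cver(X, target=0.90)
```

With these assumed eigenvalues the cumulative rates are 0.50, 0.80, 0.95, 0.98, and 1.00, so three components are needed to pass the 90% target.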
Alternatively, a reconstruction model can also be developed by applying the conventional PCA algorithm directly to the fault data and then used for fault diagnosis. Compared with the proposed algorithm, the main difference of the conventional PCA modeling method is that the relative fault effects that cause the out-of-control signals are not separated out before PCA modeling. The reconstruction models developed by the proposed algorithm are then applied online to correct the faulty data for both the $T^2$ and SPE monitoring statistics. The reconstruction results vary with the number of fault directions. The results of correcting the data of the 12th fault for the $T^2$ and SPE monitoring statistics are shown in Figs. 6 and 7, respectively. In both the principal and residual subspaces, the reconstruction with only one direction produces the largest change from the original monitoring statistics, whereas comparing the results with five and seven directions shows that the last directions produce the smallest change. This means that only the correct fault reconstruction models can bring the out-of-control statistics back within the control limits, so that the causes of the fault can be properly diagnosed.
In addition, to show the difference between the proposed method and the conventional method, another numerical experiment is carried out to compare the two methods. In the corresponding table, the proposed method and the conventional method are used to model the twelve known faults. Both methods are capable of efficiently reconstructing the faults. For the proposed method, the number of generalized principal components (GPCs), which represent the normal operating subspace, and the number of fault directions, which represent the fault subspace, are recorded in the second and third columns. For the conventional method, the numbers of directions in the principal subspace and the residual subspace are recorded in the fourth and fifth columns. Comparing these numbers shows that, in most fault cases, the proposed method needs fewer dimensions to construct the fault model than the conventional PCA method. This illustrates that the proposed method has a lighter computational burden than the conventional PCA method in fault reconstruction.

B. DISCUSSIONS AND SUMMARY
According to the above findings, the proposed method can better decompose all fault states, so that the process can be better understood and the performance of fault diagnosis can be improved. By combining the fault data with the normal data, the influence of the fault deviations with respect to the normal state is analyzed to develop reconstruction models.
In summary, the advantages of the proposed reconstruction modeling and fault diagnosis analysis can be enumerated below.
First, the generalized principal directions are extracted from the combination of the fault data and the normal data and sorted, which reveals the directions common to both. Once these common directions are removed, the most significant fault effects are isolated and used effectively for fault diagnosis via reconstruction. Therefore, a simple reconstruction structure is expected.
Second, the modeling method proposed in this paper can be combined with other fault diagnosis methods, such as the reconstruction contribution plot, to disclose the process variables that reveal the most relevant fault deviations. That is, once the effects common to the normal data and the fault data have been sorted and separated from the fault data, the critical variables that drive the fault effects can be identified.

V. CONCLUSION
In the present work, a new reconstruction modeling algorithm is proposed through a two-stage subspace decomposition procedure, and its application to fault diagnosis is demonstrated. The common effects between the fault and normal data are analyzed, so that the fault parts of the fault data are decomposed and used effectively to reconstruct the fault. By extracting the most common variations and separating them from the fault data, the underlying characteristics of various faults are revealed. The feasibility of the proposed algorithm was confirmed on the TE process. The method proposed in this paper is based on the PCA monitoring system, but it can readily be combined with other statistical monitoring methods. Many issues remain for future investigation, and the results of this study provide a basis for further work and improvement.