Gas Path Fault Diagnosis of Gas Turbine Engine Based on Knowledge Data-Driven Artificial Intelligence Algorithm

As the core power source for the aviation, shipbuilding, and power generation industries, gas turbines must operate safely, reliably, cleanly, and efficiently. Drawing on the strengths and weaknesses of thermodynamic-model-based and data-driven artificial-intelligence-based gas-path diagnosis methods, this paper proposes a new gas turbine gas-path diagnosis approach based on knowledge data-driven artificial intelligence, a hybrid of deep learning and gas path analysis. First, a thermodynamic model of the engine to be diagnosed is constructed using an adaptation modeling strategy, and this model serves as the basal model for simulating various gas path faults. Second, a large body of knowledge data, relating component health parameters to gas turbine boundary condition parameters and gas-path measurable parameters, is simulated with the basal model by setting different component health parameter values and different boundary conditions. Next, the vector composed of the boundary condition parameters and the gas-path measurable parameters in the knowledge database is defined as the input vector, the component health parameter vector is defined as the output vector, and a deep learning model is designed for regression modeling of this knowledge database. Finally, the trained deep learning model is deployed to the corresponding gas turbine power plant, where it outputs component health parameters in real time as the engine runs. Simulation results show that the proposed method yields accurate, quantified health parameters for each gas path component: the overall root mean square error does not exceed 0.033% and the maximum relative error does not exceed 0.36%, which indicates that the method has considerable application potential.


I. INTRODUCTION
A gas turbine is an internal combustion engine that uses a continuous flow of gas as the working medium to drive the impeller at high speed and convert the chemical energy of the fuel into useful work. During operation, the temperature, pressure, and speed inside the engine remain at high levels, which is likely to damage the gas turbine. In addition to these harsh working conditions, the engine may also be affected by a polluted surrounding environment. For example, the air drawn in by the compressor often carries large amounts of sand, dust, and carbon oxides. Although an air intake filter removes many pollutants, some unfiltered impurities stick to the compressor blades or other internal parts. As operating time increases, the main components suffer performance degradation or damage, such as fouling and leakage, and serious failures may follow [1]. The routine maintenance strategy currently used for gas turbine engines in power plants at home and abroad is usually preventive maintenance; in other words, the equivalent operating hours (EOH) specified by the manufacturer determine whether the gas turbine needs a minor inspection, a gas path inspection, or a major overhaul. Shutting down a gas turbine for maintenance, whether planned or unplanned, always incurs substantial operation and maintenance costs. These costs should therefore be reduced as far as possible while ensuring safe operation. To enhance the reliability and availability of gas turbine engines while prolonging their life cycle and minimizing operation and maintenance costs, the engine user needs to adopt a maintenance strategy based on actual health performance, established through continuous monitoring and diagnosis, that is, condition-based maintenance.
Gas path analysis (GPA) is a technology that can provide early warning of a deteriorating condition that is developing or about to occur [2], [3].
At present, according to the diagnosis mechanism, GPA methods comprise diagnostic methods based on thermodynamic model decision-making [4] and methods based on data-driven artificial intelligence [5]-[7]. The health status of a gas turbine can usually be indicated by the health parameters of its main gas-path components: the flow characteristic index of the compressor and turbine, which characterizes component flow capacity; the efficiency characteristic index of the compressor and turbine, which characterizes component operating efficiency; and the efficiency characteristic index of the combustion chamber [8]. However, this important health information cannot be measured directly and is therefore difficult to monitor. During engine operation, when the health performance of a component degrades or the component is damaged, its internal performance parameters (such as pressure ratio and isentropic efficiency) change, which in turn forces the measurable parameters of the external gas path (such as temperature and pressure) to change, as shown in Fig. 1. Thermodynamic model decision-making based diagnosis is therefore an inverse mathematical process: component health parameters are obtained from the gas path measurable parameters of the engine through the thermodynamic coupling relationship and are then used to assess the health condition of the engine [9].
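To make the inverse process concrete, the sketch below recovers two health parameters from two measurements by Newton-Raphson iteration on a toy forward model. The forward model, its coefficients, and the names `forward_model` and `diagnose` are hypothetical illustrations and are not the thermodynamic model used in this paper.

```python
def forward_model(sf_flow, sf_eff):
    """Hypothetical toy forward model: health parameters -> measurements."""
    pressure = 15.0 * sf_flow ** 1.2 * sf_eff ** 0.3
    temperature = 800.0 * sf_flow ** 0.1 / sf_eff ** 0.8
    return pressure, temperature


def diagnose(measured, guess=(1.0, 1.0), tol=1e-9, max_iter=50, h=1e-6):
    """Invert the forward model: find health parameters matching `measured`."""
    x, y = guess
    for _ in range(max_iter):
        f1, f2 = forward_model(x, y)
        r1, r2 = f1 - measured[0], f2 - measured[1]
        if abs(r1) < tol and abs(r2) < tol:
            break
        # Finite-difference Jacobian of the forward model.
        fx1, fx2 = forward_model(x + h, y)
        fy1, fy2 = forward_model(x, y + h)
        j11, j21 = (fx1 - f1) / h, (fx2 - f2) / h
        j12, j22 = (fy1 - f1) / h, (fy2 - f2) / h
        det = j11 * j22 - j12 * j21
        # Solve J * step = -r with Cramer's rule (2x2 system).
        dx = (-r1 * j22 + r2 * j12) / det
        dy = (-j11 * r2 + j21 * r1) / det
        x, y = x + dx, y + dy
    return x, y


# Simulate a degraded engine, then recover its health parameters.
true_health = (0.97, 0.98)
measured = forward_model(*true_health)
estimated = diagnose(measured)
print(estimated)
```

The forward pass is cheap and well behaved, while the inverse needs an iterative solver and a reasonable starting guess, which is where divergence can occur in practice.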
VOLUME 9, 2021
The characteristic of thermodynamic model decision-making based diagnostic methods is that they do not require a large accumulated set of fault data samples, and they can quantify the diagnostic results to provide detailed diagnostic information. According to the complexity of the performance model used, these methods can be further divided into small deviation linearization diagnosis methods [10] and nonlinear diagnosis methods [11], [12]. Because the diagnostic accuracy of the small deviation linearization methods is strongly affected by variations in the boundary conditions (environmental and operating conditions) and by sensor measurement noise, nonlinear diagnosis methods are the mainstream of research. The solution algorithms that drive nonlinear gas path diagnosis include local optimization algorithms (such as the Newton-Raphson algorithm [13] and the Kalman filter algorithm [14], [15]) and global optimization algorithms (such as the particle filter algorithm [16] or the genetic algorithm [17], [18]). Scholars have made considerable improvements [19]-[21] to address the low diagnostic accuracy caused by linearization of the thermodynamic system and the large deviations in diagnostic accuracy caused by sensor measurement noise and bias. However, when the above nonlinear diagnostic methods are deployed in a gas turbine power plant, three difficulties arise in practice:
a. Current gas turbine forward thermodynamic calculation (i.e., gas turbine performance simulation) has high accuracy and reliability. However, the accuracy and reliability of gas turbine reverse thermodynamic calculation (i.e., the thermodynamic model decision-making based diagnostic methods) have yet to be verified in actual engineering projects, and this calculation is still mainly at the stage of theoretical testing.
b. At present, most gas turbine power plants are peak-shaving power plants. Their gas turbines usually operate under transient off-design conditions, such as frequent dynamic loading and unloading and rapid start and stop, which easily causes the algorithm to diverge during real-time monitoring and calculation.
c. Users of gas turbine power plants often have no gas turbine thermodynamic modeling capability, let alone a thermodynamic model decision-making based diagnostic capability.
Commonly used data-driven artificial intelligence diagnosis methods, such as neural networks [22]-[25] and fuzzy logic [26]-[29], illustrated in Fig. 2, usually need to be built on an existing set of component fault data samples. However, accumulating fault data sample sets from engine operating experience and on-site operating data is time-consuming and labor-intensive. In addition, for fault types not covered by the fault data sample set, these approaches usually struggle to provide accurate diagnosis results, and large deviations occur.
Distributed process monitoring [30]-[32], which has recently attracted attention, maintains strong correlation and improves monitoring performance, but it has difficulty quantifying the diagnosis results. The method proposed in this paper can obtain more accurate diagnosis results and quantify them, making the results more intuitive.
For a newly commissioned engine, labeled fault data samples are often scarce and unbalanced. First, the information obtained during fault diagnosis is usually incomplete. Traditional machine learning algorithms often require a large number of labeled data samples as a training set, but in real scenarios producing that many manually labeled samples is very costly. How to use improved expert techniques and knowledge to label high-quality data samples, and how to use greater system intelligence to screen out more high-value labeled samples, are questions worth discussing. Second, different types of failures do not occur with the same frequency. Because the categories in the original data are imbalanced, the model's learning is biased toward the categories with many samples. In actual fault classification, however, the model needs to pay more attention to the minority fault samples, and a biased model causes the system to ignore their characteristics. How to generate a large number of valuable labeled data samples through technical means such as scenario generation and data augmentation needs further study. These problems restrict the application of traditional data-driven artificial intelligence diagnostic technologies.
Combining the advantages and disadvantages of thermodynamic-model-based and data-driven gas path fault diagnosis methods, this paper proposes a knowledge data-driven gas path fault diagnosis method for gas turbines. A data set containing different fault types is obtained from a gas turbine thermodynamic model constructed with the adaptive modeling strategy, and deep learning is then used for regression modeling to obtain a gas path fault diagnosis model. The trained diagnosis model diagnoses and outputs the health parameter vector of each gas path component in real time. Simulation experiments show that the proposed method can accurately obtain the quantified health parameters of each gas path component.

A. DEEP NEURAL NETWORK
Neural networks are a research hotspot in the field of artificial intelligence. They can capture strongly nonlinear relationships between network input parameters and output targets and can flexibly handle multiple inputs and complex variables. With the rapid development of artificial intelligence and deep learning, deep neural networks (DNN) have received great attention and found wide application. Through multi-layer nonlinear transformation, a DNN combines low-level features into more abstract high-level features [33]. Compared with a traditional single-hidden-layer neural network (such as a BP neural network), a deep neural network has a stronger ability to process nonlinear data, and its hidden layers are fully connected. In other words, every neuron in the i-th layer is connected to every neuron in the (i+1)-th layer. From a local perspective, each unit computes a linear relationship (equation (1)) followed by an activation function (equation (2)).
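In the usual notation, these two relations can be written as follows. This is a standard form consistent with the symbol definitions in this section; the input symbol $x_i$ and the names $z$ and $f$ are assumptions introduced here for illustration.

```latex
z = \sum_{i} w_i x_i + b \tag{1}

f(z) = \max(0, z) \tag{2}
```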
where $w_i$ is the weight of the connection between the i-th layer and the (i+1)-th layer, and $b$ is the bias of unit i. The rectified linear unit (ReLU) activation function (equation (2)) in the deep neural network helps mitigate the vanishing gradient and local minimum problems to which the BP neural network is prone. A typical deep neural network structure comprises an input layer, hidden layers, and an output layer, as shown in Fig. 3. Increasing the number of layers and nodes can improve the fitting accuracy of the model. However, too many layers and nodes cause over-fitting and reduce the generalization ability of the model. Model training should therefore weigh the benefit of depth against the risk of over-fitting while still achieving good prediction results.
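The layer-by-layer computation described above can be sketched in a few lines. This is a minimal illustration with made-up weights, assuming ReLU on the hidden layers and a linear output layer for regression; the function names `relu`, `dense`, and `forward` are hypothetical and the network is not the one given in Table 5.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: every input feeds every output unit."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b  # linear part
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(inputs, layers):
    """Forward pass through a list of (weights, biases) layers."""
    a = inputs
    for k, (weights, biases) in enumerate(layers):
        is_output_layer = k == len(layers) - 1
        a = dense(a, weights, biases, None if is_output_layer else relu)
    return a


# Two inputs -> two hidden units -> one output, with illustrative weights.
layers = [
    ([[1.0, 0.5], [-1.0, 1.0]], [0.0, -3.0]),  # hidden layer
    ([[0.5, 2.0]], [0.25]),                    # output layer
]
print(forward([1.0, 2.0], layers))  # [1.25]
```

The second hidden unit receives a negative pre-activation and is zeroed by ReLU, so only the first hidden unit contributes to the output.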

B. THE PROPOSED METHOD
In view of the gas-path diagnosis problems described above, and drawing on the advantages and disadvantages of thermodynamic-model-based and data-driven artificial-intelligence-based diagnostic methods, a new knowledge data-driven artificial-intelligence-based gas turbine diagnostic approach is proposed. It is a hybrid of deep learning and GPA, as illustrated in Fig. 4. In this paper, the health parameters of the gas turbine components are used as the evaluation index of the unit's health status.
The specific diagnosis steps are as follows:
a. Based on the gas-path measurable parameters and the adaptation modeling strategy, a thermodynamic model of the gas turbine to be diagnosed is constructed. This engine thermodynamic model is taken as the basal model for simulating various gas path faults.
b. According to the different gas path component fault modes and the climate and operating conditions of the gas turbine throughout the year, a large amount of knowledge data relating component health parameters to gas turbine boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$ is obtained by simulation with the basal model, as shown in Fig. 5. A data set (knowledge database) is then generated with one-to-one correspondence between the component health parameter vector and the vector of boundary condition parameters and gas path measurable parameters.
c. The vector of boundary condition parameters and gas path measurable parameters in the knowledge database is defined as the input vector and the component health parameter vector as the output vector, and a deep learning model is designed for regression modeling of this knowledge database, as shown in Fig. 6.
d. The trained deep learning model is deployed to the corresponding gas turbine power plant. As the engine runs, the trained model outputs component health parameters in real time from the practical boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$.
The proposed method yields accurate, quantified health parameters for each gas path component; its detailed application and analysis through a simulation experiment are described below.
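Steps a to c can be sketched as follows. This is a minimal illustration only: `basal_model` is a hypothetical toy surrogate standing in for the adapted thermodynamic model, and the degradation range, ambient temperature range, and load range are assumed values chosen for the example, not figures from this paper.

```python
import random


def basal_model(sf, u):
    """Hypothetical toy surrogate for the adapted thermodynamic model.

    sf = (flow index, efficiency index); u = (ambient temperature in K,
    load fraction). Returns two pseudo-measurements Z; the formulas are
    illustrative only.
    """
    sf_flow, sf_eff = sf
    t_amb, load = u
    pressure = 15.0 * load * sf_flow ** 1.2 * sf_eff ** 0.3
    temperature = 2.7 * t_amb * sf_flow ** 0.1 / sf_eff ** 0.8
    return (pressure, temperature)


def build_knowledge_database(n_samples, seed=0):
    """Simulate faults, then pair each input (u, Z) with its output SF."""
    rng = random.Random(seed)
    database = []
    for _ in range(n_samples):
        # Implant a fault: assumed degradation of up to 5 % per index.
        sf = (rng.uniform(0.95, 1.0), rng.uniform(0.95, 1.0))
        # Assumed boundary conditions: ambient temperature and load.
        u = (rng.uniform(268.0, 310.0), rng.uniform(0.5, 1.0))
        z = basal_model(sf, u)
        database.append((u + z, sf))  # (input vector, output vector)
    return database


knowledge_db = build_knowledge_database(1000)
inputs, outputs = knowledge_db[0]
print(len(knowledge_db), len(inputs), len(outputs))
```

Because every sample is simulated, the label (the health parameter vector) is known exactly, which is what removes the need for manually labeled field data.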

III. APPLICATION AND ANALYSIS
A heavy-duty gas turbine is taken as the research object. The thermodynamic working principle of this type of single-shaft engine is illustrated in Fig. 7.
The detailed gas turbine boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$ of this heavy-duty engine are listed in Table 1.
First, the thermodynamic model of the gas turbine to be diagnosed is constructed with the adaptation modeling strategy, based on the engine boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$ recorded at the newly commissioned stage, when the engine is healthy and clean. The principle of the adaptation modeling strategy is described in our previous work [34]. The gas turbine thermodynamic model is then taken as the basal model for simulating various gas path faults.
The detailed component health parameters of this heavy-duty gas turbine are listed in Table 2.
The effects of the different gas path component fault modes on the component health parameters, and the range of component degradation, are taken from our previous work [35] and are listed in Tables 3 and 4. Second, according to the different gas path component fault modes (Table 3) and the hypothetical climate conditions (Fig. 8) and hypothetical operating conditions (Fig. 9) of the gas turbine, 15620 knowledge data samples relating $\overrightarrow{SF}$ to the engine boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$ are simulated by implanting different values of $\overrightarrow{SF}$ (Table 4) and different boundary conditions into this performance model. A data set (knowledge database) is then generated with one-to-one correspondence between the component health parameter vector $\overrightarrow{SF}$ and the vector of boundary condition parameters and gas path measurable parameters $(\vec{u}, \vec{Z})$. The hypothetical atmospheric environment parameters are selected according to actual weather changes, with the Shanghai area taken as the working location of the target gas turbine. One group of atmospheric environment data is selected every 3 hours, giving 8 groups per day and 3905 groups in total. Next, the vector $(\vec{u}, \vec{Z})$ in the knowledge database is defined as the input vector and the component health parameter vector $\overrightarrow{SF}$ as the output vector, and a deep learning model is designed for regression modeling of this knowledge database, as shown in Fig. 10. The structure of the deep learning model is given in Table 5. To test its effectiveness, 12496 knowledge data samples (80%) are randomly selected to train the deep learning model, and the remaining 3124 samples (20%) are used for testing.
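The random partition of the knowledge database can be sketched as below. The sample counts come from the text; the function name `split_indices` and the fixed seed are assumptions made so the example is reproducible, and the paper does not state how its random selection was implemented.

```python
import random

N_SAMPLES = 15620  # knowledge data samples in the database
N_TRAIN = 12496    # samples randomly selected for training


def split_indices(n_samples, n_train, seed=0):
    """Shuffle sample indices and cut them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(N_SAMPLES, N_TRAIN)
print(len(train_idx), len(test_idx))  # 12496 3124
```

Splitting by index rather than copying samples keeps the two sets disjoint by construction and makes the partition easy to reuse across the compared models.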
Finally, the trained deep learning model can be deployed to the corresponding gas turbine power plant. As the gas turbine engine runs, the trained model outputs component health parameters in real time from the boundary condition parameters $\vec{u}$ and gas path measurable parameters $\vec{Z}$ at the operating stage. In this paper, the remaining 3124 knowledge data samples are used to test this scenario. Three common neural networks, GRNN, BP, and RBF, from our previous work [36] are used as comparative machine learning models. The relative errors of these methods are shown in Figs. 11 to 15, and their root mean square errors are given in Table 6. Table 6 and Figs. 11 to 15 show that the proposed deep learning model achieves the best diagnostic accuracy among the compared models: its overall root mean square error does not exceed 0.033% and its maximum relative error does not exceed 0.36%, which indicates that the proposed method has considerable application potential.
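The two accuracy metrics quoted above can be computed as in the following sketch. The exact definitions used in this paper are not spelled out, so the formulas here are the conventional ones and should be read as an assumption; the function names and the sample values are hypothetical.

```python
import math


def relative_errors_percent(predicted, actual):
    """Absolute relative error of each prediction, in percent."""
    return [abs(p - a) / abs(a) * 100.0 for p, a in zip(predicted, actual)]


def rmse(predicted, actual):
    """Root mean square error between predictions and true values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative health parameter values for one component (made up).
actual = [1.000, 0.990, 0.980, 0.970]
predicted = [1.001, 0.989, 0.981, 0.970]

errors = relative_errors_percent(predicted, actual)
print(max(errors))               # maximum relative error, in percent
print(rmse(predicted, actual))   # RMSE in the units of the parameter
```

The maximum relative error captures the worst single test sample, while the RMSE summarizes accuracy over the whole test set, so the two are reported together.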

IV. CONCLUSION
To address the existing problems of gas turbine fault diagnostic methods, and drawing on the advantages and disadvantages of thermodynamic-model-based and data-driven artificial-intelligence-based diagnostic methods, a new knowledge data-driven artificial-intelligence-based gas turbine diagnostic approach, a hybrid of deep learning and gas path analysis, is proposed. The following conclusions can be drawn:
(1) Unlike traditional data-driven artificial intelligence diagnosis methods, which are used for anomaly detection and/or fault classification, the proposed method yields accurate, quantified health parameters for each gas path component.
(2) Once the data set (knowledge database) with one-to-one correspondence between the component health parameter vector and the vector of boundary condition parameters and gas-path measurable parameters has been generated by the performance model and learned by the deep learning model, the trained model can be used easily by gas turbine power plant users who have no gas turbine thermodynamic modeling capability.
(3) The proposed deep learning model achieves the best diagnostic accuracy among the compared machine learning models: its overall root mean square error does not exceed 0.033% and its maximum relative error does not exceed 0.36%, which indicates that the proposed method has considerable application potential.