Flight Test Sensor Fault Diagnosis Based on Data-Fusion and Machine Learning Method

Fault diagnosis and classification (FDC) is an important part of prognostics and health management for ensuring safety and performance in flight. However, it is challenging to achieve accurate FDC based only on single-sensor readings. In this paper, a fused FDC model spanning multiple different sensors is established using a hybrid deep learning architecture combining a sparse autoencoder (SAE) and a convolutional neural network (CNN). The hybrid model uses the SAE to enhance the hidden fault signal features in the multiple sensor signals, and then classifies the obtained feature map using the CNN. This method, which combines the advantages of the SAE in feature extraction and of the CNN in local feature recognition, fully utilizes the spatiotemporal coupling characteristics of multi-sensor signals. The FDC accuracy obtained by the proposed method when applied to a flight test data set is 93.78%, compared with 66.67% obtained using the combined SAE and feedforward neural network method and 83.11% obtained using the CNN only.


I. INTRODUCTION
Prognostics and health management (PHM) aims to minimize maintenance costs by evaluating, predicting, diagnosing, and managing the health of engineering systems. It incorporates incipient failure detection (fault detection), identifying specific failure types and isolating their origin (fault diagnosis), and predicting remaining useful life (prediction) [1], [2]. PHM can therefore prevent unexpected catastrophic failures, reduce maintenance frequency, optimize the storage of spare parts and other resources, etc., and has significantly influenced the industry in recent years. It is increasingly valued in fields such as aerospace, intelligent manufacturing, and marine and ocean engineering. Aircraft manufacturers have adopted PHMs, examples being Boeing's Aircraft Health Management system [3] and Airbus's Aircraft Maintenance Analysis system, to improve the overall performance of aircraft and spacecraft. The National Aeronautics and Space Administration (NASA) will use the PHM paradigm for all future manned and unmanned space missions [4], [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Guillermo Valencia-Palomo.
Fault diagnosis and classification (FDC), the first task of a PHM system, aims to detect performance degradation early enough to prevent serious damage. It is closely related to fault-tolerant control and thus greatly influences operational cost and safety. Commonly, FDC methods can be divided into three categories: model-based, signal-based, and data-driven methods [6], [7], [8]. Model-based methods require a model of the industrial process that is created using physical principles or system identification techniques. Examples of system modeling methods are observer/residuals, parameter estimation, parity space, and bond graphs. However, model-based methods are limited by modeling errors, measurement noise, and system uncertainties, and therefore tend to generate false alarms. Additionally, the increased complexity and nonlinearity of modern systems make system modeling difficult, or even impossible at times [9]. Signal-based methods utilize historical signal data and prior knowledge to analyze signal symptoms in the time domain, frequency domain, or time-frequency domain. Commonly used signal-based methods include the autocorrelation function, fast Fourier transform (FFT), short-time Fourier transform (STFT), and wavelet transform. These methods are widely applied in real-time situations, but prior knowledge is not always available and real signals are always noisy, both of which decrease their accuracy [8]. Therefore, data-driven methods have attracted wide attention and have developed rapidly in recent years. Unlike model-based and signal-based methods, data-driven methods mine hidden information solely from historical data and require neither physical knowledge nor an explicit mathematical model [10].
When a large volume of historical measurement data is available, data-driven methods can model even complex nonlinear systems [11]. They have become one of the most popular PHM methods in the aerospace industry. Data-driven methods can be classified into statistical methods (such as principal component analysis (PCA) [12] and its extensions [13], [14], independent component analysis (ICA) [15], [16], [17], and partial least squares [18], [19], [20]) and artificial intelligence methods.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In particular, artificial intelligence methods are widely used in every aspect of PHM, such as fault diagnosis and the prediction of the remaining useful life (prognostics). Lee et al. [11] and Seo et al. [21] explored a hybrid method combining a support vector machine (SVM) and an artificial neural network (ANN) for gas turbine engine fault diagnosis. Alrifaey et al. [22] presented a new deep learning (DL) framework for fault feature extraction, fault detection, and parameter optimization based on a recurrent neural network with long short-term memory (LSTM), the sparse autoencoder (SAE), and particle swarm optimization. Zhou et al. [23] proposed a DL fault diagnosis method that applied a global optimization scheme to a generative adversarial network (GAN) to generate more discriminable fault samples and thus diagnose faulty bearings. Chen et al. [24] developed a transfer learning framework for small sample sizes, known as the transfer fault diagnosis model from structurally complete data, and established a transfer learning mechanism to improve the fault diagnosis accuracy.
For complex aeroindustrial systems, data from multiple sensors are often combined to form sensor arrays to improve measurement accuracy and obtain more reliable inferences than from a single sensor [25], [26]. Due to the harsh working conditions, sensor signal data often contain high levels of measurement noise. For multi-sensor coupled signals, when a fault occurs in one sensor, the fault signal is often assimilated into the measurement noise or is inconspicuous, making it difficult to identify and isolate the fault feature. Thus, a multi-sensor fusion architecture that makes full use of the spatiotemporal coupling information of multi-source signals is important for addressing fault-tolerant control, incipient system failure detection, and other aeroindustrial problems. Various efforts have been made to fuse multiple types of signals and thereby improve FDC ability using data-driven methods. Han et al. [27] presented a spatiotemporal convolutional neural network (ST-CNN) fault diagnosis framework that combines a Spatiotemporal Pattern Network (STPN) and a CNN for multivariate time-series data from complex systems. Shao et al. [28] constructed a stacked wavelet autoencoder (SWAE) for multisensory data fusion and designed an enhanced voting fusion strategy for collaborative fault diagnosis. Tribeni et al. [29] developed a hybrid method combining SVM and short-time Fourier transform (STFT) techniques for nonlinear motor system fault signal classification. Serdio et al. [30] combined multivariate orthogonal space transformations and data-driven system identification models, applying them to vectorized time-series models to enhance the performance of residual-based fault detection for multi-sensor networks. Huang et al. [31] proposed a sliding window processing CNN-LSTM model to extract fault features with a time delay for the fault diagnosis of complex systems. The CNN layers extracted the features, and the LSTM layers captured the time delay information. This method improved the predictive accuracy and noise sensitivity.
Among all the neural network structures, the autoencoder and the CNN are powerful tools for spatiotemporal feature extraction and can be applied to FDC. An autoencoder is a special type of neural network architecture whose input and output have the same structure. Autoencoders are trained to capture input data in lower dimensions by unsupervised means [32]. Unlike traditional linear reduced-order models such as PCA and dynamic mode decomposition (DMD), the autoencoder provides a nonlinear low-dimensional feature representation. This method is widely used in feature extraction and image reconstruction. In FDC, the autoencoder and its variants are usually combined with other neural network models to enhance fault features and thereby improve the identification and classification capabilities of fault diagnosis and isolation methods [33], [34]. As one of the most popular neural network architectures, CNNs are widely used in computer vision, pattern recognition, and image classification because of their powerful feature extraction capability. In the field of FDC and PHM, the CNN is also a potential classifier and has been widely used; however, high noise and inconspicuous fault features in multi-sensor data often mislead the classification results.
In existing studies, although SAE and CNN methods have been intensively examined in motor image classification [35], price forecasting [36], and other fields, most studies have focused on only a single type of data. In the field of PHM, only a few studies have used this approach for FDC of 1D signals [37], [38]; with limited information, however, existing methods suffer from low accuracy in identifying complicated fault signals with background noise. FDC of multi-source sensor data from aeroindustrial systems remains substantially unexplored. Inspired by previous research, we propose a hybrid DL architecture combining an SAE and a CNN for flight test sensor fault diagnosis and classification of multivariate coupled sensor signals. First, an SAE is used to extract the hidden features of multi-source coupled signals. The CNN-based DL method is then used to distinguish the fault characteristics of the extracted feature maps and classify the fault types.
The main contributions of this paper are as follows.
1) By combining an SAE and a CNN, this paper proposes a novel multi-source data fusion and fault diagnosis DL model that makes full use of the abundant and complementary information in the multi-source signals of complex systems.
2) The proposed hybrid architecture fully utilizes the advantages of the SAE in feature extraction and of the CNN in local feature recognition, reducing the influence of background noise and enhancing the robustness of fault diagnosis.
3) The fault diagnosis ability of the proposed method was evaluated using flight test sensor data from a commercial aircraft braking system. The results showed that the proposed architecture can significantly improve fault diagnosis accuracy.
The rest of the paper is organized as follows. Section II introduces the three DL algorithms used for FDC of multi-source coupled signals: the SAE, the CNN, and the proposed algorithm. Section III describes fault classification for flight test data using the proposed method, and compares the results with those of other methods. Finally, Section IV summarizes the conclusions.

A. AUTOENCODER AND SPARSE AUTOENCODER
An autoencoder is a special symmetrical feedforward neural network (FN) that is mainly used for dimension reduction and feature extraction of data through unsupervised learning [38]. The network can be regarded as consisting of an encoder and decoder, with its output and input layers having the same structure. During training, the input data are firstly transformed into lower-dimensional representations, and in the output part of the neural network, the low-dimensional information is reconstructed back to high-dimensional information.
The basic structure of an autoencoder is shown in Figure 1, where the input is $X_i = \{x_{i1}, x_{i2}, \dots, x_{im}\}$, $i = 1, 2, \dots, n$ indexes the $n$ samples, $j = 1, 2, \dots, m$ indexes the $m$ dimensions of each sample, and the output layer is $\hat{X}_i = \{\hat{x}_{i1}, \hat{x}_{i2}, \dots, \hat{x}_{im}\}$. The term labeled ''+1'' is the bias unit and corresponds to the intercept term. The encoding process tries to find a low-dimensional approximation of the input, $h = f(x)$, and the decoding process learns to reconstruct the input, $r = g(h(x))$. The autoencoder and the cost function can be constructed as
$$h = f(x) = s(Wx + b), \qquad r = g(h) = s(W'h + b'),$$
$$J(W, b, W', b') = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2} \left\| \hat{X}_i - X_i \right\|^2,$$
where $s(\cdot)$ is the activation function and $\hat{X}_i = g(f(X_i))$ is the reconstruction of sample $X_i$; $W$ and $b$ are respectively the weights and biases between the input layer and the hidden layer; $W'$ and $b'$ are respectively the weights and biases between the hidden layer and the output layer. Intuitively, the autoencoder is similar to PCA, but its performance is better because the nonlinear coding and decoding process can extract more effective new features. However, an autoencoder is useless if it simply learns to set $g(f(x)) = x$; i.e., the neural network performs an identity mapping and overfits the data. Thus, we usually need to impose some constraints on the encoder so that it can learn useful features. There are several types of autoencoder, such as the stacked autoencoder, regularized autoencoder, undercomplete autoencoder, SAE, and denoising autoencoder. The SAE is a classical autoencoder that adds regularization terms to the hidden layer; imposing a sparsity constraint on the hidden units forces the autoencoder to discover and collate features in the latent space. It can improve the performance of the classical autoencoder and has greater practical application value in feature extraction and classification [40], [41], [42].
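As a minimal sketch of the encoder, decoder, and reconstruction cost described above, the following pure-Python example runs one forward pass through a single-hidden-layer autoencoder and evaluates a mean squared reconstruction error. The sigmoid activation, the squared-error form of the cost, the layer sizes, the random initialization, and the helper names (dense, init_layer, reconstruction_cost) are illustrative assumptions, not the configuration used in this study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, W, b):
    """Apply sigmoid(W x + b) to one sample x (a list of floats)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def reconstruction_cost(samples, enc, dec):
    """Mean over samples of 0.5 * ||x_hat - x||^2."""
    total = 0.0
    for x in samples:
        h = dense(x, *enc)        # encoder: h = f(x)
        x_hat = dense(h, *dec)    # decoder: x_hat = g(h)
        total += 0.5 * sum((p - q) ** 2 for p, q in zip(x_hat, x))
    return total / len(samples)


rng = random.Random(0)
m, hidden = 6, 2                  # illustrative sizes only
enc = init_layer(hidden, m, rng)
dec = init_layer(m, hidden, rng)
samples = [[rng.random() for _ in range(m)] for _ in range(4)]
cost = reconstruction_cost(samples, enc, dec)
print(f"reconstruction cost before training: {cost:.4f}")
```

Training would adjust the weights and biases by backpropagation to reduce this cost; that step is omitted here to keep the sketch short.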
The structure diagram of the SAE is shown in Figure 2. As with the autoencoder, its input and output have the same structure, but most hidden-layer units are suppressed (the so-called sparsity constraint). In the figure, the light-colored hidden-layer units are the suppressed ones. Taking the sigmoid activation function as an example, a neuron is considered activated when its output is close to 1 and inhibited when its output is close to 0. The activation of the hidden layer is given by $\xi_i(x) = \mathrm{sigmoid}(Wx + b)$, and the average activation of hidden unit $i$ over the $n$ training samples can be written as
$$\hat{\rho}_i = \frac{1}{n} \sum_{k=1}^{n} \xi_i(x_k).$$
A penalty term on $\hat{\rho}_i$ is added to the objective function to keep most of the hidden neurons inactive and thus achieve ''sparsity''. The penalty term is
$$\sum_{i=1}^{p} \mathrm{KL}\left(\rho \,\|\, \hat{\rho}_i\right),$$
where $p$ is the number of hidden units, $\rho$ is the target sparsity level, and $\mathrm{KL}(\cdot)$ is the Kullback-Leibler (KL) divergence, which is a measure of the difference between two probability distributions:
$$\mathrm{KL}\left(\rho \,\|\, \hat{\rho}_i\right) = \rho \log \frac{\rho}{\hat{\rho}_i} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_i}.$$
If $\hat{\rho}_i = \rho$, the KL divergence $\mathrm{KL}(\rho \,\|\, \hat{\rho}_i) = 0$; otherwise it increases monotonically as $\hat{\rho}_i$ moves away from $\rho$. The average activation $\hat{\rho}_i$ is computed over all training examples to obtain the sparse error, after which the weights and biases can be updated by the backpropagation algorithm. Through this process, the SAE extracts the sparse input features, providing a better starting point for the CNN.
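The sparsity penalty can be illustrated with a short sketch that averages each hidden unit's activation over the training samples and sums the KL divergence between a target sparsity and those averages. The target value of 0.05 and the function names are assumptions chosen for illustration; the section does not state the value used in this study.

```python
import math


def kl_bernoulli(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def sparsity_penalty(activations, rho=0.05):
    """Sum of KL(rho || rho_hat_i) over hidden units.

    activations: one list of hidden-unit outputs per training sample,
    each value strictly between 0 and 1 (e.g. sigmoid outputs).
    rho: target average activation (0.05 is an assumed example value).
    """
    n = len(activations)
    n_hidden = len(activations[0])
    rho_hat = [sum(sample[i] for sample in activations) / n
               for i in range(n_hidden)]
    penalty = sum(kl_bernoulli(rho, r) for r in rho_hat)
    return penalty, rho_hat


# Unit 0 is mostly inactive (matches the target); unit 1 is mostly active.
acts = [[0.05, 0.9], [0.05, 0.7]]
penalty, rho_hat = sparsity_penalty(acts)
print(rho_hat, penalty)
```

Only the second unit contributes to the penalty in this example, which is the behavior the constraint relies on: units whose average activation drifts away from the target are penalized.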

B. CNN
A CNN is an FN characterized by a certain depth of convolution operation [41]. Compared with fully connected neural networks, a CNN greatly reduces the number of network parameters by fully utilizing local correlation and weight sharing to improve the training efficiency [43]. Its excellent performance in feature extraction makes it one of the most popular neural network categories. Typical CNN structures are convolutional layers, pooling layers, and fully connected layers, as shown in Figure 3.
The convolutional layer, which is the key component of a CNN, contains a set of filters, also called convolutional kernels. Each kernel is a grid of discrete numbers, and its function is to convert the input to feature maps with a sliding convolution operation. Consider a convolution input with a two-dimensional grid structure. At the beginning, the kernel is positioned over the upper-left corner of the input and performs a dot product with the matching grid of the input; the kernel then slides to the right by a specified stride and performs the same operation. This sliding procedure is carried out from left to right and from top to bottom until the whole input is covered. The stored results represent the feature map. The output feature maps depend on the shape and dimension of the kernels.
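The sliding procedure can be sketched in a few lines of plain Python. The example below assumes no padding and, as in most DL frameworks, applies the kernel without flipping it (strictly a cross-correlation); the function name conv2d and the toy input are illustrative only.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over a 2D input (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    fmap = []
    for r in range(out_h):                 # top to bottom
        row = []
        for c in range(out_w):             # left to right
            top, left = r * stride, c * stride
            # Dot product between the kernel and the matching input patch.
            row.append(sum(image[top + i][left + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        fmap.append(row)
    return fmap


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]                         # responds to diagonal differences
print(conv2d(image, kernel))               # 3 x 3 feature map
print(conv2d(image, kernel, stride=2))     # 2 x 2 feature map
```

With a 4 × 4 input and a 2 × 2 kernel, a stride of 1 yields a 3 × 3 feature map and a stride of 2 yields a 2 × 2 feature map, which shows how the output size depends on the kernel shape and stride.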
A pooling function replaces the output of the net at a certain location with a summary statistic of the nearby outputs [41]. The pooling operation reduces the spatial resolution and data volume of the feature map captured by the convolutional layer, but keeps the representation approximately invariant. The most widely used pooling forms are max pooling and average pooling. Similar to the convolutional layer, the pooling operation works by sliding a window across the input, taking the maximum or average value of the window at each subregion, and storing the result as its output.
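A minimal sketch of both pooling forms follows. The 2 × 2 window with stride 2 is a common default assumed here for illustration, and pool2d is a hypothetical helper name.

```python
def pool2d(fmap, size=2, stride=2, mode="max"):
    """Summarize each size x size window of a 2D feature map."""
    if mode == "max":
        summarize = max
    elif mode == "avg":
        summarize = lambda window: sum(window) / len(window)
    else:
        raise ValueError("mode must be 'max' or 'avg'")
    out = []
    for top in range(0, len(fmap) - size + 1, stride):
        row = []
        for left in range(0, len(fmap[0]) - size + 1, stride):
            window = [fmap[top + i][left + j]
                      for i in range(size) for j in range(size)]
            row.append(summarize(window))
        out.append(row)
    return out


fmap = [[1, 3, 2, 4],
        [5, 6, 7, 8],
        [9, 2, 1, 0],
        [3, 4, 5, 6]]
print(pool2d(fmap, mode="max"))   # [[6, 8], [9, 6]]
print(pool2d(fmap, mode="avg"))   # [[3.75, 5.25], [4.5, 3.0]]
```

Each output halves the resolution of the 4 × 4 input in both directions while retaining one summary value per window.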
The end part of a CNN architecture usually consists of fully connected layers, whose form is the same as that of the FN. The function of the layers is to classify the feature map detected and extracted from convolutional layers and pooling layers. For the purpose of classification, the output of the fully connected layer can be flattened into a single value.

C. THE PROPOSED METHOD
In the present research, based on data fusion of multiple signals, a hybrid deep neural network combining an SAE and a CNN is proposed and verified to extract specific sensor fault features submerged in multivariate coupled sensor signals and classify fault types. The core of the proposed algorithm comprises two aspects. First, unlabeled historical signal data obtained from multiple sensors with different sampling rates are modeled together and input to the SAE to extract meaningful features of normal/fault signals and reduce the influence of noise. Second, the reconstructed feature maps obtained from the SAE are labeled and then used to train the CNN to extract more features and classify the fault type. By building correlations among multiple sensors, the fault of one sensor can be more easily recognized by evaluating the deviation of the faulty sensor's signal from the signals of the other sensors.
The flowchart of the proposed method and a schematic diagram of the hybrid deep neural network are shown in Figure 4. The proposed framework can be divided into offline and online stages. The offline stage is used to collect and preprocess historical flight test data and train the model. The preprocessing of raw signal data mainly includes interception and reconstruction using down-/over-sampling to fuse the imbalanced data set. In the online stage, the multi-sensor flight test data are monitored, and the well-trained model obtained in the offline stage is used for fault diagnosis and classification. These procedures are further explained below:

A. FLIGHT TEST DATA
The datasets employed in this study are the normalized sensor data sampled from a commercial aircraft braking system in the flight test. The braking system is complex and comprises the hydraulic system, braking components, brake oil tank, and other components. During the flight test, various types of sensor data were recorded simultaneously to monitor the brake system's health state. The sampling rates differ among the sensors because the sensors are of different types. In our data preprocessing stage, we sub-sampled the signals with high sampling rates and obtained 10 sensor datasets recorded at a 32 Hz sampling rate for the subsequent data fusion and reconstruction procedure. The typical data distribution of this braking process is shown in Figure 5. The data include the cruise phase (t = 0-140 s), the deceleration phase (t = 140-1040 s), and the stop phase (t > 1040 s). During the cruise phase, the sensors stably monitor the aircraft's health status and detect environmental changes. Since the braking system is not used, the sensors have a low probability of failure. In the deceleration phase, the pilot operates the braking system to decelerate the aircraft, and the speed, temperature, pressure, and other sensor signals follow the aircraft's status. At this stage, due to the rapid change of flight environment and manual operation, sensor signal noise and instability increase significantly, and the sensors are prone to failure. Therefore, the multi-sensor signals recorded in this stage were selected to build the fault diagnosis datasets. For compatibility with the input structure of the proposed FDC method, sensor data segments with a length of 50 samples in the deceleration phase are intercepted to construct the training data set; that is, each segment is converted to a 2D image-like data matrix with a shape of 50 × 10. For each flight test, five sets of data were captured, and data from 30 flight tests were used.
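The interception step can be sketched as follows, assuming plain decimation (keeping every k-th sample, with no anti-aliasing filter) for the sub-sampling and a fixed start index for each window. The section does not specify the actual sub-sampling scheme, so the 128 Hz source rate, the synthetic signals, and the helper names subsample and make_window are hypothetical.

```python
def subsample(signal, src_hz, dst_hz=32):
    """Keep every (src_hz // dst_hz)-th sample; src_hz must be a multiple of dst_hz."""
    if src_hz % dst_hz != 0:
        raise ValueError("src_hz must be an integer multiple of dst_hz")
    return signal[:: src_hz // dst_hz]


def make_window(channels, start, length=50):
    """Stack equally sampled channels into a length x n_sensors matrix."""
    if any(len(ch) < start + length for ch in channels):
        raise ValueError("window exceeds channel length")
    return [[ch[start + t] for ch in channels] for t in range(length)]


# Synthetic stand-in: ten normalized channels recorded at 128 Hz for 2 s.
raw = [[((t + 7 * k) % 256) / 256 for t in range(256)] for k in range(10)]
channels = [subsample(ch, src_hz=128) for ch in raw]   # 64 samples each at 32 Hz
window = make_window(channels, start=0)
print(len(window), "x", len(window[0]))                # 50 x 10
```

Each row of the resulting matrix holds the ten sensor readings at one time step, which gives the 50 × 10 image-like layout used as the model input.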
To evaluate the fault classification capability of the proposed algorithm, fabricated fault signals were added to the original normal data set. Information about the common sensor fault types of aircraft is available in [44] and [45]. Four typical sensor faults (slow oscillation, increased noise, slow drift, and catastrophic failure) were used to construct the fault data set. However, other types of failure, such as square wave, bias, and spiky, can also be classified by the proposed framework. Figure 6 illustrates the fault types used in this study:
1) Slow oscillation: An additive fault that superimposes a regular oscillation on the original signal, described by $Y_s = X + a \sin(\omega t) + N$, where $Y_s$ is the output signal data, $X$ is the original signal data, $a$ is the scale factor (a constant, $a = 0.2$, in our study), and $N$ is zero-mean noise.
2) Increased noise: The response of the sensor is replaced by a random time series that does not represent any system information. For analytical simplicity, it is assumed to be zero mean and can be represented as $Y_s = N$. Since sensor data are normalized, the range $N \in (0, 1)$ is used here.
4) Catastrophic failure: Complete signal loss occurs when the sensor suffers a catastrophic failure. The output can be described by $Y_s = 0$.
To construct the sensor data set for fault diagnosis and classification, the abovementioned fault signals are in all cases added to a specific sensor (sensor 9), which has a high failure rate. Normal sensor signals are labeled 0, and slow oscillation, increased noise, slow drift, and catastrophic failure signals are labeled 1, 2, 3, and 4, respectively. The overall data set therefore consists of 150 sets of normal data and 150 sets of each of the four types of fault data, constituting 750 sets of training and test data. Subsequently, all the input samples are randomly assigned to the training and test sets in proportions of 70% and 30%, respectively.
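The construction of the fault data set can be sketched as below for the three fault types whose formulas appear in this section; slow drift is left unimplemented because its expression is not given here. The oscillation frequency omega (per sample index), the noise standard deviation, the uniform stand-in for N, the zero-based column index 8 for sensor 9, and the function names are all assumptions made for illustration.

```python
import math
import random

LABELS = {"normal": 0, "slow_oscillation": 1, "increased_noise": 2,
          "slow_drift": 3, "catastrophic": 4}


def inject_fault(window, kind, sensor=8, a=0.2, omega=0.5,
                 noise_std=0.01, rng=None):
    """Return a copy of a time x sensors window with a fault on one column."""
    if kind not in LABELS:
        raise ValueError(f"unknown fault type: {kind}")
    if kind == "slow_drift":
        raise NotImplementedError("drift formula is not given in this section")
    rng = rng or random.Random(0)
    out = [row[:] for row in window]          # leave the input untouched
    for t, row in enumerate(out):
        if kind == "slow_oscillation":        # Y = X + a sin(wt) + N
            row[sensor] += a * math.sin(omega * t) + rng.gauss(0.0, noise_std)
        elif kind == "increased_noise":       # Y = N, drawn from [0, 1)
            row[sensor] = rng.random()
        elif kind == "catastrophic":          # Y = 0
            row[sensor] = 0.0
    return out


def split_dataset(items, train_frac=0.7, seed=0):
    """Shuffle and split into training and test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = round(len(items) * train_frac)
    return items[:cut], items[cut:]


normal = [[0.5] * 10 for _ in range(50)]      # synthetic 50 x 10 window
faulty = inject_fault(normal, "catastrophic")
train, test = split_dataset(range(750))
print(LABELS["catastrophic"], len(train), len(test))   # 4 525 225
```

With 750 samples, the 70%/30% split gives 525 training and 225 test samples.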

B. PARAMETER SET UP
The proposed network is mainly composed of two parts: the SAE layers and the CNN layers. The SAE part is designed to capture complex features from raw data and consists of five hidden layers with 500, 400, 20, 400, and 500 neurons, respectively. Unlabeled training data are first input into the SAE network to obtain the feature maps; the corresponding labels are then added, and the labeled feature maps are delivered to the CNN network. The CNN part consists of four convolutional layers, each of which is followed by a pooling layer. A ''flatten layer'' is used to reshape the CNN layers' output matrix and pass it to the fully connected layer for fault type prediction. Table 1 summarizes the detailed settings used in this study.
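To give a sense of the SAE's size, the sketch below counts the weights and biases of a fully connected stack with the hidden-layer widths listed above. It assumes that each 50 × 10 window is flattened to 500 inputs and that the output layer has the same 500 units; these input and output sizes are inferred from the window shape rather than taken from Table 1.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for consecutive fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


window_shape = (50, 10)                       # time steps x sensors
n_inputs = window_shape[0] * window_shape[1]  # 500 values after flattening
hidden = [500, 400, 20, 400, 500]             # hidden-layer widths from the text
sizes = [n_inputs] + hidden + [n_inputs]      # assumed input and output widths

total = dense_param_count(sizes)
compression = n_inputs / min(hidden)
print(total, compression)                     # 918320 25.0
```

Under these assumptions the 20-unit bottleneck compresses each window by a factor of 25 before the decoder expands it back to the input size.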
In Figure 7, to show how features change after data processing, feature graphs are used to represent groups of signal data before and after SAE processing. Note that the normalized sensor data are converted to a 2D image-like data matrix, and the color range indicates the normalized data value. Raw signal data under different operating conditions are shown in the left column. As can be seen from the figure, the characteristics of single-sensor signals under different states might differ from each other. However, when the system receives coupled signal data from multiple sensors, the fault signals are assimilated into the system noise or hidden by the varying operating environments, which is likely to cause the system to ignore the fault. As most sensor signals are normal signals, which are considered interference signals in fault classification, the expectation is that they will be suppressed. The right column shows the feature maps corresponding to the fault types obtained from the SAE. The data fluctuation is smaller on the right side. Although most graphs from different data sets show a similar distribution of characteristics, there are differences too. Although it is difficult to interpret the feature map obtained by the SAE, the output matrix shows the desired characteristics: suppression of the background noise and enhancement of the fault features. Therefore, it is reasonable to conclude that the SAE has learned the information hidden in different types of fault data.

C. CLASSIFICATION RESULTS AND COMPARISONS
To evaluate the performance of the proposed SAE+CNN method, the classification results obtained by the SAE+FN method and by the CNN only method are also provided. In the SAE+FN architecture, the structure of the SAE is the same as in the SAE+CNN method: the SAE layers are followed by a three-layer fully connected FN, which is used to classify the fault types.
FIGURE 7. Graphs of different types of raw multi-sensor data and the corresponding feature maps extracted by the sparse autoencoder: a) normal data, b) slow oscillation data, c) increased noise data, d) slow drift data, e) catastrophic failure data.
The classification accuracies obtained with the training and test sets by the three methods are listed in Table 2. The SAE+FN method's performance is not satisfactory: its training and test accuracies are 68.38% and 66.67%, respectively. The CNN-only method is more accurate than the SAE+FN method, obtaining 88.57% and 83.11% accuracies in the training and test data sets, respectively. These results are not surprising. For the SAE+FN model, although the SAE enhances the signal features, the mapping ability of the fully connected layer is limited, and the extracted feature information cannot be used effectively. For the CNN-only model, as the fault data are usually assimilated in the background noise and the features are not prominent, it is difficult for the model to achieve satisfactory feature recognition even when convolution and pooling operations are used to enhance the local information. The proposed SAE+CNN framework achieves the best performance, with training and test accuracies of 96.19% and 93.78%, respectively. The results show that the SAE can learn effective fault features as preprocessing for the CNN, suppressing noise and enhancing the feature recognition capability of the CNN. The proposed architecture improves the model's fault diagnosis capability and enhances its robustness by fully utilizing the spatiotemporal information in the multi-sensor signals.
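As a rough cross-check, the test accuracies quoted in the abstract and conclusion can be related to sample counts. The sketch below assumes the test set is exactly 30% of the 750 samples (225 samples); the implied numbers of correctly classified samples are inferred from the percentages and are not figures reported in the paper.

```python
def accuracy_pct(correct, total):
    """Accuracy as a percentage rounded to two decimals."""
    return round(100.0 * correct / total, 2)


def implied_correct(pct, total):
    """Nearest whole number of correct predictions for a reported percentage."""
    return round(pct / 100.0 * total)


n_total = 750
n_train = n_total * 70 // 100          # 525 training samples
n_test = n_total - n_train             # 225 test samples

reported_test = {"SAE+FN": 66.67, "CNN only": 83.11, "SAE+CNN": 93.78}
for name, pct in reported_test.items():
    k = implied_correct(pct, n_test)
    print(f"{name}: {k}/{n_test} correct -> {accuracy_pct(k, n_test)}%")
```

Under this assumption each quoted percentage corresponds to a whole number of test samples (150, 187, and 211 out of 225), so the gain from the CNN-only model to the SAE+CNN model amounts to 24 additional correctly classified test windows.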
The offline process of the SAE+CNN method achieves satisfactory diagnostic and classification results. In the online stage, the multi-sensor data need only be truncated and reshaped in accordance with the input requirements of the model. At the sampling rate of 32 Hz, acquiring one 50-sample input window takes 50/32 = 1.5625 s, whereas classifying it takes about 0.07 s. Because classification is much faster than data acquisition, the proposed SAE+CNN model satisfies the requirements of online fault diagnosis and classification.

IV. CONCLUSION
In the present research, a hybrid DL architecture combining an SAE and a CNN is proposed for FDC of multivariate coupled sensor signals. The proposed method is applied to flight test braking system sensor data and improves the fault classification accuracy from 66.67% (SAE+FN method) and 83.11% (CNN only method) to 93.78%.
The proposed method has many advantages for real-world applications. First, it uses the spatiotemporal coupling information characteristics of multi-sensor signals, which reduces the influence of background noise. Second, the hybrid architecture fully utilizes the advantages of the different types of DL methods, which enhances the robustness of feature extraction of the DL fault diagnosis model. However, as the proposed architecture uses supervised learning, it cannot process unlabeled (unknown) failure types. To improve the FDC ability of the proposed method, the training data set must be expanded, which will, however, undoubtedly increase the network training time and reduce the accuracy. Future work can therefore focus on FDC of unknown fault types.