Pressure Signal Prediction of Aviation Hydraulic Pumps Based on Phase Space Reconstruction and Support Vector Machine

In view of the difficulty of fault prediction for aviation hydraulic pumps and the poor real-time performance of state monitoring in practical applications, a hydraulic pump pressure signal prediction method is proposed to monitor and predict the health status of hydraulic pumps in advance. First, based on the on-line real-time acquisition of time series flight parameters and pressure signal data, the chaotic characteristics of the system are analyzed using chaos theory to establish that the pressure time series is predictable. Second, phase space reconstruction (PSR) of the one-dimensional time series data is conducted. The embedding dimension $m$ and time delay $\tau$ are obtained by the C-C method. The reconstructed matrix is divided in a fixed proportion into the training set and test set of the support vector regression (SVR) model, and the genetic algorithm (GA) is then used to optimize the parameters of the SVR model. Finally, the SVR model optimized by the genetic algorithm based on phase space reconstruction (PSR-GA-SVR) is evaluated on the test set. The results show that the prediction accuracy of the proposed method is higher than that of the BP neural network based on phase space reconstruction (PSR-BPNN) and the SVR model based on phase space reconstruction (PSR-SVR). Relative to PSR-BPNN and PSR-SVR, PSR-GA-SVR reduces the root mean square error (RMSE) by 73.40% and 68.16%, respectively, and the mean absolute error (MAE) by 90.41% and 90.87%, respectively. The confidence level for PSR-GA-SVR was increased, and the coefficient of determination was greater than 0.98.


I. INTRODUCTION
The aviation hydraulic pump is the 'heart' of an aircraft's hydraulic system. Its failure has serious consequences: it can compromise flight missions or even lead to the loss of the aircraft and of human life. Condition monitoring and fault prediction of the hydraulic pump therefore enable online monitoring of the pump source system, early detection of faults, and safe operation of the system, and provide a reference for the development of condition-based maintenance. In recent years, fault prognosis and health management (PHM) technology has been developing rapidly. This technology, which integrates data acquisition, data processing, condition monitoring, fault diagnosis, fault isolation and health management [1], provides the quantitative basis and theoretical support for the fault prediction and condition monitoring of hydraulic pumps [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Nagarajan Raghavan.
There are three types of fault prediction methods: data-driven methods, knowledge-driven methods and model-driven methods [3]. Model-driven methods generally need complex mathematical and physical models, which require a large amount of calculation and are difficult to establish. Knowledge-driven methods rely on accumulated experience and historical laws, and they are easily affected by expert subjectivity. In contrast, data-driven methods work with data that are easy to obtain. With little prior knowledge, they can make full use of the implicit relationships among the data to conduct predictions [4]. Therefore, data-driven prediction methods have become the preferred research direction for the monitoring and prediction of hydraulic pump status.
The degradation of hydraulic pumps is a continuous process, and the degradation of their performance is generally accompanied by changes in certain parameters, such as the pump outlet pressure, flow, vibration, temperature and oil pollution [5]-[7]. Through the analysis and mining of these performance parameters, data-based health status monitoring and fault prediction of hydraulic pumps can be accomplished. Wang addressed the problem that the fault mechanism of an axial piston pump is not clear, which makes fault detection difficult. She proposed a multi-fault classification and diagnosis method based on a deep belief network, and the accuracy of the results reached 97.40% [8]. She also used a new convolutional neural network based on minimum entropy deconvolution to classify faults in an axial piston pump, and the results showed that this method is better than the traditional method [9]. Li et al. used the Monte Carlo numerical simulation sampling method to analyze and model the amount of wear debris in the oil at the micro level, and the micro model was applied to the macro process of pump degradation to predict the remaining useful life [10]. Lin proposed a method based on the entropy weight and gray prediction to solve the problem of insufficient fault data, which makes the fault prediction of aircraft hydraulic pumps more accurate and objective [11]. Li et al. analyzed the characteristic parameters of pump vibration signals and established a DBN model to accurately predict the degradation trend of pumps [12]. A hydraulic pump is a highly reliable mechanical and electrical product [13]. However, accelerated life tests are very time-consuming, and the preset working conditions deviate from the actual service conditions, which biases the experimental results. Therefore, monitoring and predicting hydraulic pump faults using on-line monitoring data is an economical and reliable approach.
In recent years, research on hydraulic pumps has mainly focused on accelerated life tests. In these experiments, the vibration, wear and oil analysis data of hydraulic pumps are extracted, and the life and failure mode of the pumps are predicted. The pressure signal of a hydraulic system contains rich fault characteristic information [14]. In practical applications, because a hydraulic pump always works together with the system, other parameters such as vibration are difficult to measure. In contrast, the pump outlet pressure signal is easy to collect and is closely related to the system's operating status [15]. Therefore, this article proposes a prediction method that uses the pressure signal data of the hydraulic pump to conduct real-time monitoring of hydraulic pump status. Jiao used a method combining adaptive regression and a particle filter to analyze the whole life cycle pressure signal of a fuel pump, which was related to the performance of the fuel pump, and predicted the pressure signal trend and the remaining life of the pump [16]. Ergin et al. predicted the pressure of a speed regulating pump in a hydraulic system by using a structured recurrent neural network [17], and Tian et al. [18] accomplished pressure signal prediction by combining the wavelet packet and a chaotic support vector machine, achieving good prediction accuracy. Lu et al. decomposed the pressure signal data of an aviation hydraulic pump using EEMD and identified different fault modes of the pump using the SVR model. The results showed that the proposed method can effectively identify the fault mode with good accuracy [19]. They also proposed an SVR method using the Gaussian mixture model to predict the degradation process based on the pump outlet pressure signal, and its prediction effect was better than that of the existing methods [20].
Artificial intelligence-based methods are widely used in mechanical fault detection and prediction and provide good results. However, they all depend on large amounts of sample data [21]-[23]. The on-line time series monitoring signal of the hydraulic pump pressure is non-stationary, fluctuating and nonlinear, and only small samples are available. This article therefore combines chaos theory with the advantages of the support vector machine [24] in small-sample nonlinear time series prediction, and proposes a fault prediction method using the support vector machine based on phase space reconstruction and the genetic algorithm for the real-time monitoring and prediction of pressure signals. The PSR-GA-SVR prediction algorithm is proposed, and the corresponding modeling steps are given. The advantages of the proposed algorithm over traditional algorithms and the difference between the predicted value and the real value are analyzed. Finally, the model is applied to the data sets of two other aircraft for evaluation, which further supports the superiority of the proposed algorithm.

II. MODEL CONSTRUCTION
A. SVR PREDICTION MODEL
The support vector machine (SVM) is one of the typical machine learning algorithms based on statistical learning theory and the principle of structural risk minimization. Compared with a neural network, decision tree and other algorithms, the SVM has more obvious advantages in prediction using small sample and nonlinear data. A support vector machine prediction model is usually used for classification prediction and regression prediction [25]. For general time series data prediction, the support vector regression is selected as the prediction model [26].
The basic theory of the SVR prediction is as follows.
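Consider a set of $l$ data samples $(x_i, y_i)$ $(i = 1, 2, \cdots, l)$, where $x_i \in R^n$ and $y_i \in R$ are the input and output sample data, respectively. The sample set is mapped from the original space to the feature space by a nonlinear mapping $\phi(\cdot)$, and the nonlinear fitting problem in the low-dimensional space is transformed into a linear fitting problem in the high-dimensional feature space:
$$f(x) = \omega^{T}\phi(x) + b \tag{1}$$
where $\omega$ is the weight vector and $b$ is the threshold offset. For a given sample set $(x_i, y_i)$ $(i = 1, 2, \cdots, l)$, the ultimate goal of the SVR is to find an appropriate function $f(x)$ that minimizes the error between the actual output and the predicted output. Based on the theory of structural risk minimization and choosing the $\varepsilon$-insensitive loss function as the error benchmark, the objective function of the SVR model is:
$$\min_{\omega, b} \; \frac{1}{2}\|\omega\|^{2} + C\sum_{i=1}^{l} L_{\varepsilon}\big(f(x_i), y_i\big) \tag{2}$$
$$L_{\varepsilon}\big(f(x), y\big) = \max\big(0, \; |y - f(x)| - \varepsilon\big) \tag{3}$$
where $L_{\varepsilon}$ is the $\varepsilon$-insensitive loss function, $C$ is the penalty factor, and $C > 0$. By introducing the slack variables $\xi_i$ and $\xi_i^{*}$, formula (2) can be transformed into a quadratic programming problem:
$$\min_{\omega, b, \xi, \xi^{*}} \; \frac{1}{2}\|\omega\|^{2} + C\sum_{i=1}^{l}(\xi_i + \xi_i^{*}) \quad \text{s.t.} \quad \begin{cases} y_i - \omega^{T}\phi(x_i) - b \le \varepsilon + \xi_i \\ \omega^{T}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*} \\ \xi_i, \; \xi_i^{*} \ge 0 \end{cases} \tag{4}$$
By introducing the Lagrange function and solving the corresponding dual problem, the following regression function can be obtained:
$$f(x) = \sum_{i=1}^{l}(\alpha_i - \alpha_i^{*})K(x_i, x) + b \tag{5}$$
$\alpha_i$ and $\alpha_i^{*}$ are Lagrange multipliers that satisfy $0 \le \alpha_i, \alpha_i^{*} \le C$. A sample for which they are not both 0 is a support vector. $K(\cdot, \cdot)$ is a kernel function satisfying Mercer's theorem, and its relationship to the mapping $\phi(\cdot)$ is shown in formula (6).
$$K(x_i, x_j) = \phi(x_i)^{T}\phi(x_j) \tag{6}$$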
Many kernel functions are commonly used, such as the linear kernel function, polynomial kernel function, RBF kernel function, and sigmoid kernel function, and the kernel function needs to be selected according to the prediction effect.
Generally, the RBF kernel function, which can map the samples into a feature space of arbitrary dimension, is selected. Therefore, the following two parameters need to be determined when SVR theory is used for regression prediction: the penalty factor $C$ and the kernel parameter $\gamma$.
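As a minimal illustration of the quantities introduced in this subsection, the following Python sketch evaluates an RBF kernel, the ε-insensitive loss, and a kernel-expansion prediction of the form $f(x) = \sum_i (\alpha_i - \alpha_i^{*})K(x_i, x) + b$. The kernel is assumed here to take the common form $K(x, z) = \exp(-\gamma\|x - z\|^2)$, since the article does not print its RBF formula; the function names and the numerical values are illustrative and are not taken from the article.

```python
import math


def rbf_kernel(x, z, gamma):
    """Assumed RBF form: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* that belongs
    to support_vectors[i].
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + b


# Illustrative values only (hypothetical, not from the article).
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [0.5, -0.25]
y_hat = svr_predict([0.0, 0.0], svs, coefs, b=0.1, gamma=0.5)
```

In this form, a larger $\gamma$ makes each support vector influence only a narrower neighbourhood, while $C$ bounds the magnitude of every dual coefficient, which is why the two parameters are tuned jointly.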

B. GENETIC ALGORITHM MODEL
Because the penalty factor $C$ determines the generalization ability of the regression model and $\gamma$ reflects the distribution characteristics of the training samples, the selection of these two parameters determines the accuracy of the regression prediction. The genetic algorithm is a heuristic parameter optimization algorithm that simulates the law of genetic evolution and the survival of the fittest to search for the optimal parameters. It does not need to traverse all possible combinations of parameters, and it has the advantages of simple operations, a small amount of calculation and a fast operating speed. Altobi et al. [27] used the genetic algorithm to optimize two parameters of a neural network (the numbers of neurons and hidden layers), which made the optimized prediction model more accurate. Using the genetic algorithm to optimize the parameters $C$ and $\gamma$ can shorten the training time of the model, help to avoid overfitting and underfitting in the training process, and improve the prediction accuracy. The steps of the genetic algorithm are as follows:
(1) The coding strategy used to encode the genotype data is selected.
(2) The fitness function is determined.
(3) The genetic strategy (selection, crossover, or mutation) is selected.
(4) The initial population is randomly initialized.
(5) The fitness value is calculated.
(6) The genetic operators are applied to produce the next generation population.
(7) Whether the genetic iteration is terminated is judged according to the termination conditions.
The modeling process of optimizing the parameters of the objective function using the genetic algorithm is shown in Fig. 1.
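The loop described above can be sketched in a few lines of Python. This is a simplified real-coded variant under stated assumptions, not the article's implementation: it uses tournament selection, arithmetic crossover, Gaussian mutation and one elite individual per generation, none of which the article specifies. `toy_fitness` is a hypothetical stand-in for the quantity that would be minimized in practice (for example a validation error of the SVR for a given pair of $C$ and $\gamma$), and `BOUNDS` uses the search intervals reported in the experimental section.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=50,
                   crossover_prob=0.8, mutation_prob=0.1, seed=0):
    """Minimise `fitness` over the box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def clip(ind):
        # Keep every gene inside its search interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(ind, bounds)]

    def tournament(pop, scores):
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(range(len(pop)), 2)
        return pop[a] if scores[a] < scores[b] else pop[b]

    # Random initial population.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best, best_score = None, float("inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        for ind, score in zip(pop, scores):
            if score < best_score:
                best, best_score = list(ind), score
        next_pop = [list(best)]  # elitism: carry the best individual over
        while len(next_pop) < pop_size:
            p1, p2 = tournament(pop, scores), tournament(pop, scores)
            if rng.random() < crossover_prob:
                w = rng.random()  # arithmetic crossover weight
                child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            else:
                child = list(p1)
            # Gaussian mutation, applied gene by gene.
            child = [v + rng.gauss(0.0, 0.1 * (hi - lo))
                     if rng.random() < mutation_prob else v
                     for v, (lo, hi) in zip(child, bounds)]
            next_pop.append(clip(child))
        pop = next_pop
    return best, best_score


def toy_fitness(ind):
    """Hypothetical stand-in for an SVR validation error at (C, gamma)."""
    c, gamma = ind
    return (c - 4.0) ** 2 + ((gamma - 150.0) / 100.0) ** 2


BOUNDS = [(0.1, 10.0), (0.1, 300.0)]  # intervals for C and gamma
best_params, best_score = genetic_search(toy_fitness, BOUNDS)
```

Replacing `toy_fitness` with a function that trains an SVR on the training samples and returns its validation error would turn the sketch into a parameter search of the kind described in this subsection.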

C. PHASE SPACE RECONSTRUCTION OF TIME SERIES
For any nonlinear univariate time series $\{x(t_i) \mid x(t_i) \in R, \; i = 1, 2, \cdots, n\}$ with a time interval of $\Delta t$, the data information contained therein is limited. If the one-dimensional data signal is mapped to a high-dimensional space through phase space reconstruction, the historical data information can be mined effectively, and the subjectivity and randomness caused by manual selection can be avoided [28]. For a chaotic system, according to Takens's theorem and the G-P algorithm, if an appropriate embedding dimension $m$ and time delay $\tau$ can be found for the time series $x(t_i)$, then the reconstructed phase space can be expressed as follows:
$$X_i = \big[x(t_i), \; x(t_i + \tau), \; \cdots, \; x(t_i + (m-1)\tau)\big], \quad i = 1, 2, \cdots, M \tag{8}$$
The number of phase points is
$$M = n - (m-1)\tau \tag{9}$$
If the embedding dimension satisfies the relation
$$m \ge 2d + 1 \tag{10}$$
where $d$ is the dimension of the dynamic system, then the attractor traced out in the reconstructed phase space $R^m$ is topologically equivalent to that of the original system; that is, any invariant of the original system can be calculated from the state variables of the reconstructed phase space. According to phase space reconstruction theory, the matrix form of the learning and training sample set $(x_i, y_i)$ of the SVR input can be given as:
$$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(m-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_{M-1} & x_{M-1+\tau} & \cdots & x_{M-1+(m-1)\tau} \end{bmatrix}, \quad Y = \begin{bmatrix} x_{2+(m-1)\tau} \\ x_{3+(m-1)\tau} \\ \vdots \\ x_{n} \end{bmatrix} \tag{11}$$
$X$ is the input data of the sample, and $Y$ is the expected one-step-ahead prediction output of the sample.
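The construction of the SVR samples from a one-dimensional series can be sketched as follows. The sketch assumes one-step-ahead targets, which matches the single-step prediction discussed in the conclusion; the function name `delay_embed` is illustrative. The synthetic series and the values $m = 7$ and $\tau = 14$ are used only to check the sample count against the 1132 training and 283 test samples reported in the experimental section (1415 in total for 1500 data points).

```python
def delay_embed(series, m, tau):
    """Return (X, Y): delay vectors and their one-step-ahead targets.

    Row i of X is [x_i, x_{i+tau}, ..., x_{i+(m-1)tau}] and Y[i] is the
    value that immediately follows the last coordinate of that row.
    """
    span = (m - 1) * tau              # offset of the last coordinate
    n_pairs = len(series) - span - 1  # one phase point is lost to the target
    if n_pairs <= 0:
        raise ValueError("series too short for this m and tau")
    X = [[series[i + k * tau] for k in range(m)] for i in range(n_pairs)]
    Y = [series[i + span + 1] for i in range(n_pairs)]
    return X, Y


# Synthetic stand-in for the 1500-point pressure record.
series = list(range(1500))
X, Y = delay_embed(series, m=7, tau=14)
```

Each row of `X` spans $(m-1)\tau = 84$ samples of history, so the first usable target is the 86th data point.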

D. CONSTRUCTION OF PSR-GA-SVR MODEL
To combine the advantages of the three models and predict the pressure signal of an aircraft hydraulic pump more accurately, this article fuses the three models and proposes a prediction method based on the fusion model. The overall modeling process is shown in Fig. 2, and the specific modeling steps are as follows:
Step 1: After preprocessing the collected data, use phase space reconstruction theory to reconstruct the data and judge the chaotic characteristics of the data.
Step 2: Construct the SVR model.
Step 3: Use the genetic algorithm to optimize the two parameters of the SVR ($C$ and $\gamma$). The specific process is as follows:
1) The SVR is used as the objective function of the genetic algorithm, and the genetic algorithm parameters (population size, number of iterations, and the selection, crossover and mutation probabilities) are initialized.
2) The value range of the optimization parameters is given, and each parameter is coded.
3) The fitness value is calculated to determine whether the convergence condition or the maximum number of iterations is reached. If the condition is met, the optimal value is output and decoded. Otherwise, the genetic operation is performed and the fitness value is calculated again.
Step 4: Divide the reconstructed data into a training set and a test set at a ratio of 8:2, and input the training set into the GA-SVR model to train the model.
Step 5: Input the test set, compare the predicted value with the real value, and verify the accuracy of the model.
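The 8:2 division in Step 4 can be sketched as below. The sketch assumes that the split is chronological (the earliest 80% of the reconstructed samples for training, the remainder for testing), which is a common choice for time series but is not stated explicitly in the article; the function name is illustrative.

```python
def chronological_split(X, Y, train_parts=8, total_parts=10):
    """Split samples in time order, without shuffling."""
    # Integer arithmetic avoids floating-point rounding of the split index.
    n_train = len(X) * train_parts // total_parts
    return (X[:n_train], Y[:n_train]), (X[n_train:], Y[n_train:])


# Placeholder samples with the size used in the experiment (1415 pairs).
X_all = list(range(1415))
Y_all = list(range(1415))
(X_train, Y_train), (X_test, Y_test) = chronological_split(X_all, Y_all)
```

Keeping the test samples strictly after the training samples prevents information from later measurements leaking into the fitted model.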

III. MODEL PARAMETER SELECTION
The main parameters to be determined in the support vector regression model optimized by the genetic algorithm and phase space reconstruction theory are the penalty factor $C$, the kernel parameter $\gamma$, the time delay $\tau$ and the embedding dimension $m$. $C$ and $\gamma$ are determined by the genetic algorithm, while $\tau$ and $m$ are determined from the time series data. The selection of $\tau$ and $m$ is very important but relatively difficult. The common methods for determining $\tau$ include the autocorrelation function method, the mutual information method, etc., and the methods for determining $m$ include the geometric invariant method, Gao's method, etc. To simplify the calculation, the C-C method can be used to determine $\tau$ and $m$ simultaneously [29].
VOLUME 9, 2021
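For the reconstructed phase points $X_i$ $(i = 1, 2, \cdots, M)$, define the correlation integral as:
$$C(m, n, r, \tau) = \frac{2}{M(M-1)}\sum_{1 \le i < j \le M}\theta\big(r - \|X_i - X_j\|_{\infty}\big), \quad r > 0 \tag{12}$$
$C(m, n, r, \tau)$ is the correlation integral of the embedded time series, where $\theta(\cdot)$ is the Heaviside step function, with $\theta(a) = 0$ for $a < 0$ and $\theta(a) = 1$ for $a \ge 0$. Then, the time series $x(t_i)$ is decomposed into $t$ disjoint subsequences, and the statistic of the subsequences is calculated:
$$S(m, n, r, t) = \frac{1}{t}\sum_{s=1}^{t}\Big[C_s\big(m, \tfrac{n}{t}, r, t\big) - C_s^{m}\big(1, \tfrac{n}{t}, r, t\big)\Big] \tag{13}$$
where the range of variable $t$ is $0 \le t \le 200$. According to the statistical results, the three functions of $t$ in formula (14) are calculated, following the C-C method [29]:
$$\bar{S}(t) = \frac{1}{16}\sum_{m=2}^{5}\sum_{j=1}^{4} S(m, n, r_j, t), \quad \Delta\bar{S}(t) = \frac{1}{4}\sum_{m=2}^{5}\Delta S(m, t), \quad S_{cor}(t) = \Delta\bar{S}(t) + \big|\bar{S}(t)\big| \tag{14}$$
where $\Delta S(m, t) = \max_j S(m, n, r_j, t) - \min_j S(m, n, r_j, t)$, $r_j = j\sigma/2$ $(j = 1, 2, 3, 4)$, and $\sigma$ is the standard deviation of the time series.
According to the above functional relationships, the corresponding curves are plotted and the first minimum point of S(t) is found. The $t$ corresponding to this point is the time delay $\tau$, and the global minimum point of $S_{cor}(t)$ gives the delay time window:
$$\tau_w = (m - 1)\tau \tag{15}$$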
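The basic building block of the C-C method, the correlation integral, can be illustrated with the short Python sketch below. It assumes the sup (maximum) norm for the distance between delay vectors and a Heaviside step that equals 1 at zero, as is usual for the C-C method; the function name and the test series are illustrative, and the sketch does not implement the full statistics over subsequences.

```python
def correlation_integral(series, m, tau, r):
    """Fraction of delay-vector pairs whose sup-norm distance is at most r."""
    n_points = len(series) - (m - 1) * tau
    if n_points < 2:
        raise ValueError("series too short for this m and tau")
    vectors = [[series[i + k * tau] for k in range(m)]
               for i in range(n_points)]
    close = 0
    for i in range(n_points):
        for j in range(i + 1, n_points):
            dist = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            if r - dist >= 0:  # Heaviside step: theta(r - dist) = 1
                close += 1
    return 2.0 * close / (n_points * (n_points - 1))


# Tiny illustrative series: delay vectors [0,1], [1,2], [2,3], [3,4].
c_half = correlation_integral([0, 1, 2, 3, 4], m=2, tau=1, r=1.0)
```

The double loop costs $O(M^2)$ distance evaluations for $M$ phase points, which is affordable for a record of about 1500 points but is the part worth vectorizing for longer series.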

IV. ANALYSIS AND PROCESSING OF TEST DATA
A. TEST DATA SET
The data set used in this experiment is the pressure data of a hydraulic pump recorded during the flight of a certain type of aircraft. The hydraulic source system of the aircraft is shown in Fig. 3. The model of the hydraulic pump is ZB-34. The working medium is No. 15 aviation hydraulic oil (YH-15). When the hydraulic system is working normally, the working pressure is $210^{+5}_{-20}$ kgf/cm$^2$ and the working flow does not exceed 40 L/min. Because the hydraulic pump outlet pressure is part of the flight parameter data record in actual aircraft monitoring, the data selected in this article are the hydraulic pump outlet pressure data recorded by the flight recorder of a certain aircraft in one flight state. The pressure is recorded once per second for 25 minutes (1500 s) in this state, giving a total of 1500 data points. The pressure data trend is shown in Fig. 4.

B. DATA PREPROCESSING AND EVALUATION CRITERIA
To avoid the influence of data deviations on the regression prediction, it is necessary to normalize the collected pressure data and convert the data into dimensionless data. The normalization method adopted in this article is maximum and minimum normalization, and the calculation formula is as follows [30]:
$$x_i' = \frac{2(x_i - x_{\min})}{x_{\max} - x_{\min}} - 1 \tag{16}$$
After normalization, the value range of the data is [−1, 1]. After the final prediction results are obtained, they need to be restored to the original scale by the following formula:
$$x_i = \frac{(x_i' + 1)(x_{\max} - x_{\min})}{2} + x_{\min} \tag{17}$$
The evaluation criteria used to evaluate the accuracy of the prediction results are the root mean square error (RMSE), the mean absolute error (MAE) and the coefficient of determination ($R^2$).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\big(x_i^{pre} - x_i^{rea}\big)^2} \tag{18}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\big|x_i^{pre} - x_i^{rea}\big| \tag{19}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\big(x_i^{pre} - x_i^{rea}\big)^2}{\sum_{i=1}^{N}\big(x_i^{rea} - \bar{x}^{rea}\big)^2} \tag{20}$$
where $N$ is the number of test samples, $x_i^{pre}$ is the predicted value, $x_i^{rea}$ is the true value, and $\bar{x}^{rea}$ is the mean of the true values. $R^2$ reflects how closely the predicted values agree with the real values. $R^2 = 1$ means that the predicted value is completely consistent with the real value, and the closer $R^2$ is to 1, the better the prediction effect.
FIGURE 5. Calculation of τ and m using the C-C method.
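The preprocessing and the three evaluation criteria can be sketched in plain Python as follows. The scaling maps the data to [−1, 1] as stated above, and $R^2$ is assumed to be the usual coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares; the function names and sample values are illustrative.

```python
import math


def normalize(values):
    """Min-max scale to [-1, 1]; also return the bounds needed to invert."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Invert normalize() to recover values in the original units."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]


def rmse(pred, real):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, real)) / len(real))


def mae(pred, real):
    return sum(abs(p - r) for p, r in zip(pred, real)) / len(real)


def r2(pred, real):
    """Assumed definition: 1 - SS_res / SS_tot."""
    mean_real = sum(real) / len(real)
    ss_res = sum((p - r) ** 2 for p, r in zip(pred, real))
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    return 1.0 - ss_res / ss_tot


# Illustrative pressure values in kgf/cm^2 (not measured data).
scaled, p_min, p_max = normalize([200.0, 205.0, 210.0])
restored = denormalize(scaled, p_min, p_max)
```

The bounds `p_min` and `p_max` should be kept from the normalization step so that predictions made in the scaled space can be mapped back to pressure units before the errors are computed.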

A. SIMULATION OF PHASE SPACE RECONSTRUCTION
According to the operating flow of the C-C method, the corresponding statistical calculation program is compiled for the simulation calculations, and the statistical relationship as shown in Fig. 5 is obtained.
According to the graph analysis, the $t$ corresponding to the first minimum value of S(t) is 14, that is, the time delay $\tau = 14$; and the $t$ corresponding to the global minimum value of $S_{cor}(t)$ is 75, that is, the time window $\tau_w = 75$. From formula (15), $m = \tau_w/\tau + 1 = 75/14 + 1 \approx 6.36$, which is rounded up to the embedding dimension $m = 7$, and the SVR learning sample matrix can be constructed accordingly. The maximum Lyapunov exponent is 0.0288 > 0, which indicates that the system is chaotic and that the series can be used for time series prediction.

B. EXPERIMENTAL SIMULATION OF THE GA-SVM MODEL
The GA-SVM model is constructed using the Python language, and the reconstructed phase space matrix from Section A is input into the GA-SVM model for training. 80% of the data are used as the training set, and 20% of the data are used as the test set. Using these data, the constructed model was trained many times. The final model was set with 100 iterations, a population size of 5, a crossover probability of 0.01 and a mutation probability of 0.8; the search interval of the penalty factor $C$ was [0.1, 10], and that of the kernel parameter $\gamma$ was [0.1, 300]. After the genetic algorithm is used to optimize the parameters of the SVR, the final optimized parameters are $C = 4.37$ and $\gamma = 151.81$. The parameters are substituted into the SVR model, the model is trained using the training set data, and the trained model is then applied to the test set data to test the model accuracy. The comparison between the predicted values and the real values is shown in Fig. 6. The test set contains 283 points and the training set contains 1132 points. Fig. 6 shows that the difference between the predicted value and the real value obtained by the proposed algorithm is small, and that the prediction tracks the original pressure signal closely. To further verify the accuracy of the algorithm proposed in this article, the algorithm is compared with the BPNN algorithm based on phase space reconstruction and the SVR algorithm based on phase space reconstruction. The comparison results are shown in Fig. 7, and the evaluation index calculation results of each algorithm are shown in Table 1.
The analysis of the figures and tables shows that the GA-SVR model based on phase space reconstruction has the highest prediction accuracy and the best tracking performance. Compared with the PSR-BPNN and PSR-SVR algorithms, the GA-SVR model reduces the RMSE by 73.40% and 68.16%, respectively, and the MAE by 90.41% and 90.87%, respectively. The coefficient of determination $R^2$ is greater than 0.98, close to 1, which further illustrates the accuracy and credibility of the prediction results.
In addition, this article also takes the error between the predicted values of the three algorithms and the true values (error = predicted value − true value) as the object of quantitative analysis. The error distributions of the three algorithms are shown in Fig. 8, and Table 2 gives the statistical characteristics of the errors for the three algorithms. The analysis of Figure 7 and Table 2 shows that the prediction results of the PSR-GA-SVR model proposed in this article are the best.
To further verify the applicability of the proposed algorithm, this article selects the flight data of two other aircraft (aircraft 2 and aircraft 3) and processes the monitored hydraulic pump pressure data according to the algorithms from section 2 to section 4. The selected data conditions are shown in Table 3. The comparison between the predicted values and the real values of the PSR-GA-SVR algorithm is shown in Fig. 8 and Fig. 9. Figure 10 shows the error trend between the predicted values and the true values of the two aircraft. According to these statistical characteristics, the prediction results are good: the predicted values are slightly smaller than the real values, and the error distribution is relatively uniform.

VI. CONCLUSION
Through the simulation, calculation and verification of the prediction model proposed in this article, the prediction results are analyzed and the following conclusions are drawn:
(1) Based on the on-line monitoring pressure time series signal of the hydraulic pump, a maximum Lyapunov exponent greater than 0 is calculated by using chaos theory, which shows that the extracted pressure monitoring sequence comes from a chaotic system. The phase space reconstruction of the one-dimensional time series signal is conducted to construct a high-dimensional matrix, and the embedding dimension $m$ is determined to be 7 by using the C-C method.
(2) The genetic algorithm is used to optimize the parameters of the support vector regression model, and the reconstructed hydraulic pump pressure data are analyzed and predicted. The results show that the PSR-GA-SVR model proposed in this article has higher prediction accuracy and better reliability. The RMSE and MAE of the PSR-GA-SVR method were decreased by 73.40% and 90.41%, respectively, relative to those of the PSR-BPNN method and by 68.16% and 90.87%, respectively, relative to those of the PSR-SVR method. Therefore, the PSR-GA-SVR method can predict the pressure signal of hydraulic pump more accurately.
(3) The support vector machine model proposed in this article can effectively solve the time series prediction problem, and the generalization ability of the learning model is good. This method is applied to the condition monitoring and fault prediction of a hydraulic pump, which can better predict the change trend of a pressure signal and provide referential value for the realization of health management of hydraulic pumps.
However, the algorithm proposed in this article only performs single-step (short-term) prediction, and further research is needed to achieve multi-step prediction of hydraulic pump pressure signals to achieve a better predicted effect.
YUAN LI received the bachelor's degree in engineering from Air Force Engineering University, in 2019. He is currently pursuing the master's degree. His main current research interests include aviation equipment fault prediction and health condition monitoring.
ZHUOJIAN WANG received the Ph.D. degree in system engineering from Air Force Engineering University, in 2006. He is currently an Associate Professor and a Master Supervisor. He has published more than 30 academic articles. His main research interests include equipment reliability engineering and comprehensive support.
ZHE LI received the Ph.D. degree in aeronautics and astronautics science and technology from Air Force Engineering University, in 2019. He is currently a Lecturer. His main research interests include flight simulation and flight safety. His current research interests include fault prediction and health management of complex systems.
ZIHAN JIANG received the bachelor's degree in engineering from Air Force Engineering University, in 2018. He is currently pursuing the master's degree. His main research interest includes aircraft mission reliability assessment.