Intelligent Deep Learning Method for Forecasting the Health Evolution Trend of Aero-Engine With Dispersion Entropy-Based Multi-Scale Series Aggregation and LSTM Neural Network

Accurate forecasting of the health evolution trend of an aero-engine is essential for the operational reliability and maintenance costs of aeronautical equipment. In this study, an intelligent deep learning method that systematically blends a dispersion entropy-based multi-scale series aggregation scheme with a long short-term memory (LSTM) neural network is proposed for forecasting the health evolution trend of aero-engines. Firstly, a comprehensive measurement of health levels, namely the integrated health state index (IHSI), is developed from a high-dimensional dataset. Secondly, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is used to decompose the IHSI sequence into several multi-scale series to further capture the internal characteristics of the original sequence. Subsequently, multi-scale series aggregation assisted by dispersion entropy analysis is conducted to obtain the aggregated sub-series (ASS). Finally, the ASS serve as the inputs of the LSTM network to complete the health evolution trend forecasting of the aero-engine. To demonstrate the effectiveness of the proposed method, six approaches are presented for comparison of forecasting performance. The experimental results indicate that the proposed method can effectively measure the health evolution process of an aero-engine and obtain more accurate trend forecasting results.


I. INTRODUCTION
With the improvement of system integration and automation levels, complete and reasonable health state monitoring is of great significance for ensuring the stable operation of equipment and contributes to condition-based maintenance (CBM) [1]-[3]. The aero-engine, as a core part of aeronautical equipment, plays an important role in system safety and operational reliability [4]-[6]. Because of this significance, the health state of engines should be monitored and analyzed effectively over the whole life cycle of the equipment, which makes it possible to extract real-time state information of the engine and avoid serious safety incidents. More specifically, health evolution trend forecasting is an essential step in monitoring and status analysis: it helps to evaluate future operation states and to make maintenance plans in advance [7]-[9].
The associate editor coordinating the review of this manuscript and approving it for publication was Omid Kavehei.
In order to accurately forecast the health evolution trend of an aero-engine, an appropriate measurement that effectively quantifies the health levels of the engine should be built in advance. Note that plenty of data from various parts are acquired during runtime for subsequent analysis [10]. Accordingly, how to make full use of these data to build a health state measurement is the primary problem to be solved. To address this problem, some studies on building a suitable indicator for the effective description of health levels have been conducted. For example, the power condition has been adopted to characterize the performance levels of battery packs [11]. Kral obtained the operation status of vehicle cooling systems based on the analysis of oil conditions [12]. Gebraeel utilized the collected vibration signals to analyze the status of rolling bearings [13]. However, the above studies usually adopt a single physical signal for analysis, so the health status is reflected from only one aspect and some signals containing health state information may be neglected. Furthermore, a wide variety of signals are collected by the acquisition system, which makes the selection of a representative signal very difficult. Therefore, the construction of a good health state index remains a difficulty that urgently needs to be overcome for health evolution trend prediction of aero-engines.
Health evolution trend forecasting is fundamentally a time series prediction problem, i.e., obtaining the future health conditions. On the whole, prediction models fall into three categories: knowledge-based, physics-based and data-driven models. Because prior knowledge and physical principles are difficult to obtain, the former two kinds of models often run into difficulties in practical applications. Data-driven models, which take full advantage of monitoring data, can realize accurate prediction without prior knowledge or physical rules [14], [15]. In recent years, the artificial neural network (ANN), a popular data-driven model, has been widely used for time series forecasting with many successful cases [16], [17]. Adhikari established an ANN-based model to predict financial time series [18]. Khashei adopted a back propagation neural network (BPNN) for time series prediction [19]. Tian utilized an ANN model to forecast the remaining useful life of bearings [20]. However, time series in practical engineering are mostly non-linear and non-stationary and usually show evident fluctuations, which makes it quite difficult to forecast them accurately and reliably. Because the change of health status is a gradual accumulation process, health trend forecasting relies not only on the current condition but also on past conditions. If only the present condition is considered, the information contained in earlier conditions is ignored. To address this problem, the recurrent neural network (RNN), a deep learning model, was developed; it builds associations between hidden layer nodes so that the memory of recent states can be stored [21]. Because of this memory characteristic, RNN can handle data that are correlated in the time domain. Many studies have demonstrated the effectiveness of RNN for time series prediction [22], [23].
As an improved version of RNN, the long short-term memory (LSTM) neural network has been applied in time series prediction, language processing and other fields, and has attracted more and more attention [24], [25]. Owing to the "gate" structure of LSTM, useless information can be strictly filtered and more valuable information can be extracted from the historical dataset.
Based on the analysis above, an intelligent deep learning method that systematically blends the dispersion entropy (DE)-based multi-scale series aggregation scheme with the LSTM neural network is proposed in this paper to forecast the health evolution trend of aero-engines. More specifically, a comprehensive measurement of health levels, namely the integrated health state index (IHSI), is constructed from a high-dimensional dataset. Different from the indicators adopted in [11]-[13], the constructed IHSI fully retains useful health status information and reflects the condition of engines synthetically. Subsequently, the CEEMDAN algorithm is used to decompose the IHSI sequence into several multi-scale series to further capture the internal characteristics of the original sequence. Because of its excellent decomposition performance, CEEMDAN helps to obtain intrinsic mode function (IMF) components with a single frequency and provides the data foundation for the subsequent steps. Then, multi-scale series aggregation assisted by DE analysis is conducted to obtain the aggregated sub-series (ASS). The ASS comprise fewer sequences than the original multi-scale series, which improves prediction efficiency and accuracy. Finally, the ASS serve as the inputs of the LSTM network to complete the health evolution trend forecasting of the aero-engine. Because of its excellent information filtering ability, LSTM helps to improve the final forecasting accuracy. To demonstrate the effectiveness of the proposed method, six approaches are presented for comparison of forecasting performance. The experimental results indicate that the proposed method can effectively measure the health evolution process of an aero-engine and obtain more accurate trend forecasting results.
The rest of this paper is arranged as follows: Section 2 briefly introduces CEEMDAN and the LSTM neural network. Section 3 provides a detailed description of the proposed method. In Section 4, the experiment and results analysis are presented. Finally, we conclude this study in Section 5.

II. BACKGROUND KNOWLEDGE

A. CEEMDAN ALGORITHM
To adaptively analyze the time-frequency characteristics of a signal, the empirical mode decomposition (EMD) method was developed by Huang et al. [26]. With this method, the original signal can be decomposed into several multi-scale series, i.e., IMFs, which reveal the features of the signal at different time scales [27]. Following the implementation procedure of EMD described in [27], the raw signal can be reconstructed from n IMFs and one residue $r_n(t)$:
$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t) \tag{1}$$
where $x(t)$ is the original signal and $\mathrm{IMF}_i(t)$ denotes the i-th IMF obtained after signal decomposition with EMD.
Although EMD has been shown to perform well in analyzing non-stationary signals, inherent limitations such as the mode mixing problem and the end point effect restrict its application. To address these problems, a noise-assisted signal analysis approach named ensemble empirical mode decomposition (EEMD) was proposed by Wu and Huang [28]. However, EEMD cannot completely eliminate the added Gaussian white noise after signal reconstruction, and the high time consumption caused by the added noise greatly limits its application. For these reasons, the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method was developed to make up for the deficiencies of EEMD [29]. It resolves the problem of mode mixing, and the reconstruction error is always zero. Meanwhile, the computational cost of CEEMDAN is significantly lower than that of EEMD. The implementation steps of CEEMDAN are summarized below.
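(1) With the EMD method, each noisy copy $x(t) + \varepsilon_0 w_i(t)$, $i = 1, 2, \ldots, I$, can be decomposed, where $\varepsilon_0$ denotes a noise factor and $w_i(t)$ denotes Gaussian white noise with zero mean and homogeneous variance. Then, the first IMF of the original signal using CEEMDAN can be obtained:
$$\mathrm{IMF}_1(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\big(x(t) + \varepsilon_0 w_i(t)\big) \tag{2}$$
(2) Calculate the first residue,
$$r_1(t) = x(t) - \mathrm{IMF}_1(t) \tag{3}$$
(3) Decompose $r_1(t) + \varepsilon_1 E_1(w_i(t))$ to calculate the second IMF,
$$\mathrm{IMF}_2(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\big(r_1(t) + \varepsilon_1 E_1(w_i(t))\big) \tag{4}$$
where $E_j(\cdot)$ denotes the operator that yields the j-th IMF obtained by EMD.
(4) Steps (2) and (3) are repeated until all IMFs have been obtained. The final residue is calculated as
$$r_m(t) = x(t) - \sum_{i=1}^{m} \mathrm{IMF}_i(t) \tag{5}$$
where m is the number of IMFs.
Thus, through the procedures mentioned above, the raw signal $x(t)$ can be reconstructed as follows:
$$x(t) = \sum_{i=1}^{m} \mathrm{IMF}_i(t) + r_m(t) \tag{6}$$
Equation (6) presents a reconstruction form of the raw signal. Note that the IMFs indicate the features of the signal at different time scales, while the residue changes more smoothly and helps to improve forecasting accuracy.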
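The recursion in steps (1) to (4) can be illustrated with a short sketch. It is not a real CEEMDAN implementation: the EMD operator E_j(·) is replaced by a hypothetical stand-in (`toy_mode`, detail left after a moving average), and the names `ceemdan_sketch`, `ensemble`, `eps` and the constant noise factor are assumptions made only for illustration. What the sketch does show is the ensemble averaging over noisy copies and the residue update, which make the decomposition sum back to the input exactly.

```python
import math
import random

def moving_average(x, width=5):
    """Centered moving average with truncated edges (stand-in smoother)."""
    half = width // 2
    out = []
    for t in range(len(x)):
        window = x[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out

def toy_mode(x, j):
    """Hypothetical stand-in for E_j(.): NOT EMD sifting.

    The j-th 'mode' is the detail removed by the j-th successive smoothing.
    """
    residue = list(x)
    detail = residue
    for _ in range(j):
        smooth = moving_average(residue)
        detail = [r - s for r, s in zip(residue, smooth)]
        residue = smooth
    return detail

def ceemdan_sketch(x, n_modes=3, ensemble=20, eps=0.2, seed=0):
    """Ensemble-averaged mode extraction following steps (1) to (4)."""
    rng = random.Random(seed)
    n = len(x)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(ensemble)]
    imfs = []
    residue = list(x)
    for k in range(1, n_modes + 1):
        acc = [0.0] * n
        for w in noises:
            # First IMF: raw noise is added; later IMFs: the (k-1)-th mode of the noise.
            noise_part = w if k == 1 else toy_mode(w, k - 1)
            noisy = [r + eps * e for r, e in zip(residue, noise_part)]
            mode = toy_mode(noisy, 1)
            acc = [a + m for a, m in zip(acc, mode)]
        imf = [a / ensemble for a in acc]
        imfs.append(imf)
        # Residue update: subtract the newly obtained IMF.
        residue = [r - m for r, m in zip(residue, imf)]
    return imfs, residue

signal = [math.sin(0.3 * t) + 0.01 * t for t in range(60)]
imfs, residue = ceemdan_sketch(signal)
reconstructed = [sum(m[t] for m in imfs) + residue[t] for t in range(len(signal))]
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

Because each residue is defined by subtraction, `max_error` stays at floating-point rounding level regardless of which mode extractor is plugged in; a real application would use a proper EMD sifting routine in place of `toy_mode`.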

B. LSTM NEURAL NETWORK
The LSTM neural network is a deep learning method with a wide range of applications, in which inputs and outputs can change with time. Because of its excellent performance in dealing with the long-term dependency problem, the LSTM neural network has been used in the field of time series forecasting and has attracted a lot of attention [24], [25]. Specifically, the memory cell in the network, which realizes the function of temporal state storage, is the foundation of the whole architecture. Based on this memory cell, the information can be updated through three gates: the input gate, the forget gate and the output gate. The implementation process of LSTM can be summarized as follows [30]:
(1) The input activations are controlled by the input gate, and new information can be stored by the memory cell when the input gate is activated.
(2) Extraneous information can be removed by the forget gate, and the past cell conditions can be forgotten when the forget gate is activated.
(3) The output activations are controlled by the output gate, and the final status can be updated by the latest cell output when the output gate is activated.
For the purpose of health state tendency prediction, the LSTM neural network adopts a three-layer structure in this paper. Suppose that $x = (x_1, x_2, \ldots, x_T)$ and $y = (y_1, y_2, \ldots, y_T)$ represent the historical data and the corresponding predicted data, respectively. The forecasting results are then obtained with the formulas of [31], in which $i_t$ is the input gate, $W$ is the weight matrix, $k_t$ is the vector for activating the memory block, $a_t$ is the vector for activating the memory cell, $f_t$ denotes the forget gate, $o_t$ denotes the output gate, $b$ denotes the bias and "•" denotes the dot product. Besides, $\sigma(\cdot)$ represents the sigmoid function, $\sigma(x) = 1/(1 + e^{-x})$, and $g(\cdot)$ and $h(\cdot)$ represent centered logistic functions. The basic structure of LSTM is depicted in Fig. 1.
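To make the gate mechanism concrete, the following is a minimal sketch of one time step of a single-unit LSTM cell in its common textbook form. It is an assumption-laden illustration, not the formulation of [31]: `tanh` is used in place of the centered logistic functions g(·) and h(·), there are no peephole connections, and the parameter names (`w_ix`, `b_f`, and so on) are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (textbook variant, assumed here)."""
    # Input gate: how much new information enters the memory cell.
    i_t = sigmoid(p["w_ix"] * x_t + p["w_ih"] * h_prev + p["b_i"])
    # Forget gate: how much of the previous cell state is kept.
    f_t = sigmoid(p["w_fx"] * x_t + p["w_fh"] * h_prev + p["b_f"])
    # Output gate: how much of the cell state is exposed as output.
    o_t = sigmoid(p["w_ox"] * x_t + p["w_oh"] * h_prev + p["b_o"])
    # Candidate cell content (tanh assumed instead of g(.)).
    g_t = math.tanh(p["w_cx"] * x_t + p["w_ch"] * h_prev + p["b_c"])
    c_t = f_t * c_prev + i_t * g_t
    # Cell output (tanh assumed instead of h(.)).
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t

# With all weights and biases at zero every gate equals 0.5 and the
# candidate is 0, so the cell state is simply halved.
zero_params = {name: 0.0 for name in (
    "w_ix", "w_ih", "b_i", "w_fx", "w_fh", "b_f",
    "w_ox", "w_oh", "b_o", "w_cx", "w_ch", "b_c")}
h_out, c_out = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
```

The zero-parameter call is only a sanity check of the update rule; trained weights would determine how strongly each gate opens for a given input.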

III. THE PROPOSED HEALTH EVOLUTION TREND FORECASTING METHOD
In this paper, an intelligent deep learning method that systematically blends the DE-based multi-scale series aggregation scheme with the LSTM network is proposed to predict the health evolution trend of aero-engines. This section covers the construction of the integrated health state index, the aggregation of multi-scale series based on dispersion entropy analysis and the implementation steps of the proposed method.

A. CONSTRUCTION OF INTEGRATED HEALTH STATE INDEX
To accurately and comprehensively evaluate the health levels of an aero-engine, it is necessary to construct a proper health state index. With a large number of available sensory signals, a comprehensive index, named the integrated health state index (IHSI), is developed in this paper based on the idea of linear transformation. In fact, the construction of the IHSI can be regarded as a transformation from a high-dimensional dataset to a one-dimensional series. Suppose the $U_1 \times V$ matrix $Q_1$ and the $U_2 \times V$ matrix $Q_2$ denote two high-dimensional datasets, which represent the faulty and healthy states of the engine, respectively. Based on $Q_1$ and $Q_2$, a transformation matrix $T$ can be generated to build a mapping relationship between the original high-dimensional sensory dataset and the constructed one-dimensional IHSI sequence as follows:
$$T = (Q^{\mathsf{T}} Q)^{-1} Q^{\mathsf{T}} G \tag{16}$$
where $Q = [Q_1; Q_2]$ stacks the two datasets, $G = [G_1, G_2]^{\mathsf{T}}$, $G_1$ denotes a $1 \times U_1$ zero vector and $G_2$ denotes a $1 \times U_2$ unit vector. On the basis of the obtained matrix $T$ and the collected dataset $R$, the IHSI, written $h$, can be calculated as follows:
$$h = R\,T \tag{17}$$
Specifically, the value of the IHSI lies in [0, 1], where "1" represents the healthy condition and "0" represents the fault condition. Thus, the constructed IHSI provides a feasible way to comprehensively measure the health levels of an aero-engine.
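As a hedged illustration of the linear-transformation idea, the sketch below fits T by ordinary least squares so that faulty samples map to 0 and healthy samples map to 1, and then applies it to new data. Treating the transformation as a least-squares fit is an assumption of this sketch, and the helper names (`solve_linear`, `fit_transformation`, `ihsi`) and the toy data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_transformation(q1, q2):
    """Least-squares T with targets 0 (faulty rows q1) and 1 (healthy rows q2)."""
    q = q1 + q2
    g = [0.0] * len(q1) + [1.0] * len(q2)
    v = len(q[0])
    # Normal equations: (Q^T Q) T = Q^T G.
    qtq = [[sum(row[a] * row[b] for row in q) for b in range(v)] for a in range(v)]
    qtg = [sum(row[a] * g[k] for k, row in enumerate(q)) for a in range(v)]
    return solve_linear(qtq, qtg)

def ihsi(r, t):
    """Map each row of the sensory dataset R to a scalar health index."""
    return [sum(x * w for x, w in zip(row, t)) for row in r]

# Toy data (hypothetical): two sensors, two faulty and two healthy samples.
faulty = [[2.0, 2.0], [3.0, 3.0]]
healthy = [[3.0, 2.0], [5.0, 4.0]]
t_vec = fit_transformation(faulty, healthy)
index = ihsi([[4.0, 3.5]], t_vec)
```

On this toy data the fit is exact, with T close to (1, -1), so the intermediate sample receives an index of about 0.5. No clipping to [0, 1] is applied in the sketch.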

B. AGGREGATION OF MULTI-SCALE SERIES BASED ON DISPERSION ENTROPY ANALYSIS
Because of the impact of operational conditions and environmental noise, the sensory signals usually show significant non-stationary and non-linear features, and the corresponding CEEMDAN decomposition results always contain several IMFs. This makes it difficult to train the prediction model effectively and to improve the forecasting accuracy. To address this problem, a multi-scale series aggregation scheme assisted by DE analysis is developed in this paper to retain all effective information of the IMFs while clearly decreasing their number. The process of DE-based multi-scale series aggregation can be summarized below.
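First, each multi-scale series $c_i(t)$ $(i = 1, 2, \ldots, m)$ is mapped to $d_i(t)$ by the normal cumulative distribution function:
$$d_i(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{c_i(t)} \exp\!\left(-\frac{(\tau - \mu)^2}{2\sigma^2}\right) d\tau \tag{18}$$
where µ is the expectation, σ is the standard deviation and the value of $d_i(t)$ lies between 0 and 1. Then, each element of $d_i(t)$, denoted as $d_i(j)$ $(j = 1, 2, \ldots, N)$, is transformed to an integer from 1 to S by a linear mapping model:
$$z_i^{s}(j) = \mathrm{round}\big(S \cdot d_i(j) + 0.5\big) \tag{19}$$
where S is the number of categories and $z_i^{s}(j)$ is the j-th element of the mapped sequence. On this basis, define an embedding vector $z_i^{\xi,s}(j)$ as follows:
$$z_i^{\xi,s}(j) = \big\{ z_i^{s}(j),\ z_i^{s}(j+\omega),\ \ldots,\ z_i^{s}(j+(\xi-1)\omega) \big\}, \quad j = 1, 2, \ldots, N-(\xi-1)\omega \tag{20}$$
where ξ is the embedding dimension and ω is the time delay. Based on Eq. (20), each vector $z_i^{\xi,s}(j)$ is mapped into a dispersion mode $\nu_0\nu_1\cdots\nu_{\xi-1}$. Specifically, $z_i^{s}(j) = \nu_0$, $z_i^{s}(j+\omega) = \nu_1$, ..., $z_i^{s}(j+(\xi-1)\omega) = \nu_{\xi-1}$. Subsequently, the frequency of each mode $\nu_0\nu_1\cdots\nu_{\xi-1}$ is calculated as:
$$p(\nu_0\nu_1\cdots\nu_{\xi-1}) = \frac{\mathrm{Number}\big\{ j \mid j \le N-(\xi-1)\omega,\ z_i^{\xi,s}(j) \text{ has mode } \nu_0\nu_1\cdots\nu_{\xi-1} \big\}}{N-(\xi-1)\omega} \tag{21}$$
According to the above equation, the DE value of each multi-scale series $c_i(t)$ can be obtained by the equation below:
$$H_i = -\frac{1}{\ln(S^{\xi})} \sum_{\nu_0\nu_1\cdots\nu_{\xi-1}} p(\nu_0\nu_1\cdots\nu_{\xi-1}) \ln p(\nu_0\nu_1\cdots\nu_{\xi-1}) \tag{22}$$
Note that the value of DE is bounded in [0, 1]. In essence, DE can effectively measure the randomness of sequences. Based on this property, the original multi-scale series can be recombined to generate the aggregated sub-series, denoted as ASS, according to the following criterion:
$$\mathrm{ASS}_k(t) = \sum_{i:\ H_i \in [H_{\max}-k\lambda,\ H_{\max}-(k-1)\lambda]} c_i(t), \qquad \lambda = \frac{2\,(H_{\max}-H_{\min})}{m} \tag{23}$$
where $H_{\max}$ and $H_{\min}$ are the largest and smallest DE values and m is the number of multi-scale series. With the proposed multi-scale series aggregation scheme, a group of ASS can be acquired by analyzing the DE distribution of the raw multi-scale series. In essence, the generated ASS not only completely retain the information of the original series, but also significantly decrease the number of series. Therefore, the constructed ASS can serve as the inputs of the forecasting model to further improve prediction accuracy and efficiency.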
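The dispersion entropy computation described above can be sketched in a few lines of standard-library Python. The sketch assumes the usual normalized form (division by ln(S^ξ)), which matches the stated [0, 1] bound; the parameter values `classes=4`, `dim=2`, `delay=1` and the two test signals are illustrative assumptions, not settings reported in the paper.

```python
import math
import random

def dispersion_entropy(series, classes=4, dim=2, delay=1):
    """Normalized dispersion entropy of a non-constant series."""
    n = len(series)
    mu = sum(series) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in series) / n)
    # Map every sample into (0, 1) with the normal cumulative distribution function.
    d = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in series]
    # Linear mapping to integer classes 1..S (clamped to guard the boundaries).
    z = [min(classes, max(1, round(classes * v + 0.5))) for v in d]
    # Count the dispersion modes formed by the embedding vectors.
    count = n - (dim - 1) * delay
    freq = {}
    for j in range(count):
        pattern = tuple(z[j + k * delay] for k in range(dim))
        freq[pattern] = freq.get(pattern, 0) + 1
    entropy = -sum((c / count) * math.log(c / count) for c in freq.values())
    # Normalize by the maximum entropy ln(S^dim).
    return entropy / math.log(classes ** dim)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
slow_wave = [math.sin(2.0 * math.pi * t / 100.0) for t in range(500)]
de_noise = dispersion_entropy(noise)
de_slow = dispersion_entropy(slow_wave)
```

The white-noise series yields a DE value close to 1, while the slowly varying sine yields a clearly smaller one, which is the property the aggregation scheme relies on when it orders the multi-scale series by randomness.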

C. IMPLEMENTATION STEPS OF THE PROPOSED METHOD
In order to accurately forecast the health evolution trend of an aero-engine, an intelligent deep learning method combining DE-based multi-scale series aggregation and the LSTM neural network is proposed. The flowchart of the proposed method is presented in Fig. 2, and the implementation steps are described in detail below.
Step 1: With the signal acquisition system, monitoring data during the whole lifecycle of aero-engine should be collected.
Step 2: Based on the collected datasets, the IHSI sequence can be built to measure the health levels of aero-engine.
Step 3: The generated IHSI sequence should be randomly split into training set and testing set. Specifically, training set is used to search optimal parameters of LSTM neural network and further obtain the forecasting results with high accuracy.
Step 4: Based on CEEMDAN decomposition algorithm, the IHSI sequence of training set can be decomposed into several multi-scale series, i.e., IMFs.
Step 5: All the IMFs are aggregated with the developed DE-based multi-scale series aggregation scheme to obtain a series of ASS.
Step 6: An optimal LSTM model is trained for each ASS to forecast the sequence of the corresponding ASS.
Step 7: The predicted sequences of all ASS are summed to obtain the final forecasting results.
Step 8: The forecasting results are compared with the testing set to evaluate the performance of the prediction method.
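Steps 6 and 7 amount to forecasting each sub-series separately and summing the predictions. The sketch below shows only that combination logic; `naive_forecaster` is a hypothetical stand-in that repeats the last observed value and takes the place of the trained LSTM models, and the component values are made up for illustration.

```python
def naive_forecaster(history, horizon):
    """Hypothetical stand-in for a trained LSTM: repeat the last value."""
    return [history[-1]] * horizon

def forecast_by_components(components, forecaster, horizon):
    """Forecast every sub-series on its own, then sum the predictions."""
    total = [0.0] * horizon
    for series in components:
        prediction = forecaster(series, horizon)
        total = [a + b for a, b in zip(total, prediction)]
    return total

# Two made-up sub-series: a fluctuating one and a smooth trend.
components = [[0.10, -0.10, 0.05], [0.90, 0.85, 0.80]]
final_forecast = forecast_by_components(components, naive_forecaster, horizon=2)
```

Since the sub-series add up to the original index sequence, summing their forecasts gives a forecast on the original scale; any per-series model with the same call signature could be passed in place of `naive_forecaster`.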

IV. EXPERIMENT AND RESULTS ANALYSIS

A. DESCRIPTIONS OF EXPERIMENTAL DATA AND ERROR CRITERIA
To demonstrate the effectiveness of the proposed method, a classical dataset from the 2008 prognostics and health management problem is used, and the corresponding results are analyzed and evaluated in this section. Specifically, the dataset is collected from the aero-engine model shown in Fig. 3 and consists of the unit number, cycle index, operational setting parameters and 21 sensory signals for each cycle [32]. As described in Reference [33], there are 6 kinds of operational conditions of the aero-engine according to different groups of setting parameters. In view of the differences between the sensory data, it is unreasonable to adopt these data directly for measuring the health levels of the engine. Therefore, a comprehensive measurement that can accurately quantify the health levels of the engine is urgently needed, on the basis of which the health evolution trend of the aero-engine can be effectively predicted.
In addition, three commonly used error criteria are utilized to analyze the performance of the prediction method: mean absolute error (MAE), root mean square error (RMSE) and mean relative percentage error (MRPE) [34], [35]. MAE and RMSE are two scale-dependent measurements, which reflect the proximity and the departure between the forecasting results and the actual data, respectively. MRPE is a percentage measurement that describes the average prediction ability of the forecasting method. The definitions of these three criteria are as follows:
$$\mathrm{MAE} = \frac{1}{k}\sum_{i=1}^{k} \left| Y_i - \hat{Y}_i \right| \tag{24}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{k}\sum_{i=1}^{k} \left( Y_i - \hat{Y}_i \right)^2} \tag{25}$$
$$\mathrm{MRPE} = \frac{1}{k}\sum_{i=1}^{k} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right| \times 100\% \tag{26}$$
where k denotes the number of testing samples, and $Y_i$ and $\hat{Y}_i$ are the i-th actual value and predicted value, respectively. Furthermore, note that all experiments are executed in MATLAB 2014.
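The three error criteria are straightforward to compute. The sketch below assumes the usual definitions, with MRPE taken as the mean absolute relative error expressed in percent; the sample values are made up for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual))

def mrpe(actual, predicted):
    """Mean relative percentage error (absolute relative error, in percent)."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)

# Made-up values: every prediction is off by 10 percent of the actual value.
actual = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
scores = (mae(actual, predicted), rmse(actual, predicted), mrpe(actual, predicted))
```

MRPE is undefined when an actual value is zero, so it should be applied with care to an index that can reach the fault level of 0.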

B. IHSI SEQUENCE CONSTRUCTION
As mentioned in Reference [32], some sensory signals contain valuable health status information of the aero-engine while others do not. We select those signals that effectively present the development trend of the health status to establish the IHSI sequence. Following the signal filtration strategy described in [33], seven sensory signals, as listed in Table 1, are chosen for the subsequent study. Based on these selected signals and Eq. (16), matrices $T_i$ $(i = 1, 2, \cdots, 6)$ can be built for the six different operational conditions. Specifically, $Q_1$ and $Q_2$ should be established in advance for the construction of the transformation matrices. In this paper, $Q_1$ is constructed from sensory data under faulty states, where the remaining cycle life $C_r$ ($C_r = C_o - C_w$, where $C_o$ is the current operation cycle and $C_w$ is the whole cycle) is bounded in [−3, 0]. Analogously, $Q_2$ is built under healthy states, where $C_r$ is less than −200. Based on $T_1 \sim T_6$ and the historical dataset, an IHSI sequence can be obtained by Eq. (17), as depicted in Fig. 4. It can be seen from this figure that the constructed IHSI sequence shows a clear tendency of health development with the increase of the cycle index. Consequently, the developed IHSI can be viewed as an effective and comprehensive measurement for quantifying the health levels of an aero-engine. Moreover, the first 150 health indexes in the IHSI sequence $(h(1), h(2), \cdots, h(150))$ can be chosen to [...]
From Fig. 4, it can be found that the constructed IHSI sequence tends to fluctuate significantly, so that the important information contained in the sequence would be submerged. In order to further capture the internal characteristics of the original IHSI sequence and obtain prediction results with high accuracy, the CEEMDAN algorithm is used to decompose the IHSI sequence into several multi-scale series, i.e., IMFs. Fig. 5 presents the decomposition results of the IHSI sequence in detail, which consist of seven IMF components and one residue component.
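The selection of the faulty set Q1 and the healthy set Q2 by remaining cycle life, as described above, can be sketched as a simple filter. The record layout `(current_cycle, whole_cycle, features)` and the function name are assumptions made for illustration; only the thresholds [−3, 0] and −200 come from the text.

```python
def split_by_remaining_life(records, faulty_range=(-3, 0), healthy_below=-200):
    """Collect faulty (Q1) and healthy (Q2) samples by C_r = C_o - C_w.

    records: iterable of (current_cycle, whole_cycle, features) tuples
    (hypothetical layout).
    """
    q1, q2 = [], []
    for current_cycle, whole_cycle, features in records:
        c_r = current_cycle - whole_cycle
        if faulty_range[0] <= c_r <= faulty_range[1]:
            q1.append(features)   # last few cycles before failure
        elif c_r < healthy_below:
            q2.append(features)   # far from the end of life
    return q1, q2

# Made-up records for one engine whose whole life is 250 cycles.
records = [
    (10, 250, [0.9, 0.1]),    # C_r = -240, healthy
    (100, 250, [0.7, 0.3]),   # C_r = -150, used for neither set
    (248, 250, [0.2, 0.8]),   # C_r = -2, faulty
    (250, 250, [0.1, 0.9]),   # C_r = 0, faulty
]
q1_rows, q2_rows = split_by_remaining_life(records)
```

Samples between the two thresholds are deliberately left out, so that the transformation is fitted only on states whose health label is unambiguous.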
Due to the potential impact of the complexity of the multi-scale series on model training and forecasting efficiency, a multi-scale series aggregation scheme assisted by DE analysis is developed to generate new ASS. The DE distribution of all IMFs and the residue component is shown in Fig. 6, and Table 2 lists the corresponding DE values of these components in detail. From Fig. 6, it can be found that the DE values of the IMFs decrease monotonically until the DE value of the residue component reaches zero, which indicates that the randomness of the decomposed multi-scale series shows a declining tendency. With the aggregation formula (23), the interval length of DE for the recombination of multi-scale series, i.e., $2(H_{\max} - H_{\min})/m$, can be calculated as 0.223. Based on Eq. (23) and the actual distribution of DE, all multi-scale series can be divided into 4 groups, and 4 ASS are aggregated accordingly, as presented in Table 3. The DE value of IMF1 lies within [0.669, 0.892], and thus IMF1 is used to construct the first ASS component. Similar procedures are implemented to construct ASS2, ASS3 and ASS4. Fig. 7 depicts the generated ASS components and reflects the different characteristics of the four ASS. Specifically, ASS1 is a non-stationary series with the highest fluctuation frequency, whereas ASS4 shows a smooth trend during the whole cycle life. Besides, there are fewer sub-series than original multi-scale series, which contributes to the improvement of prediction efficiency and accuracy. Therefore, the aggregated ASS can be directly employed as the inputs of the forecasting model to forecast the health evolution trend of the aero-engine.
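The grouping arithmetic in this paragraph can be checked with a short sketch. The interval length 2(H_max − H_min)/m with m = 8 components, H_max = 0.892 and H_min = 0 reproduces the reported 0.223. The individual DE values below are hypothetical placeholders (only the largest and smallest follow from the text), the way boundary values are assigned to groups is an assumption, and the tiny constant series merely stand in for the IMFs and the residue.

```python
import math

def aggregate_by_de(series_list, de_values):
    """Sum series whose DE values fall into the same interval.

    Intervals are counted downward from the largest DE value; the interval
    length is 2 * (H_max - H_min) / m. Requires H_max > H_min.
    """
    m = len(series_list)
    h_max, h_min = max(de_values), min(de_values)
    width = 2.0 * (h_max - h_min) / m
    n_groups = math.ceil(m / 2)  # (H_max - H_min) / width equals m / 2
    length = len(series_list[0])
    groups = [[0.0] * length for _ in range(n_groups)]
    members = [[] for _ in range(n_groups)]
    for idx, (series, h) in enumerate(zip(series_list, de_values)):
        # Assumed boundary handling: the smallest DE value joins the last group.
        k = min(n_groups - 1, int((h_max - h) / width))
        groups[k] = [a + b for a, b in zip(groups[k], series)]
        members[k].append(idx)
    return width, members, groups

# Hypothetical DE values for seven IMFs and one residue (decreasing order).
de_values = [0.892, 0.60, 0.50, 0.40, 0.30, 0.15, 0.05, 0.0]
series_list = [[float(i)] * 3 for i in range(8)]  # placeholder components
interval, members, ass = aggregate_by_de(series_list, de_values)
```

With these placeholder values the eight components fall into four groups, and the sum of all groups equals the sum of all components, which is the sense in which the aggregation retains the information of the original series.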

D. HEALTH EVOLUTION TREND FORECASTING OF AERO-ENGINE
As described in Fig. 2, on the basis of the obtained ASS, the forecasting values for all ASS can be calculated by the LSTM neural network. Fig. 8 presents detailed comparisons between the actual values and the forecasting values of the LSTM network for the 4 ASS. From the figure, it can be seen that the significant non-stationarity of ASS1 and ASS2 causes considerable deviation between the actual values and the corresponding predictions, while the forecasting sequences of ASS3 and ASS4 accurately fit the real sequences owing to their gradual variation and stationarity. With the forecasting values of ASS1 to ASS4, the final results of the proposed intelligent deep learning method for forecasting the health evolution trend of the aero-engine are calculated by summing all predicted sequences. Fig. 9 shows the comparison between the original IHSI sequence and the final forecasting series in detail, which indicates that the predicted results achieve good performance in terms of accuracy and fitting degree.
Besides, for comparison, five other prediction methods, including EMD-DE-LSTM, CEEMDAN-LSTM, LSTM, extreme learning machine (ELM) and BPNN, are also used for health evolution trend forecasting of the aero-engine. More specifically, the comparison between the proposed method and EMD-DE-LSTM is conducted to validate the decomposition performance of the CEEMDAN approach, the comparison between the proposed method and CEEMDAN-LSTM is intended to show the superiority of the multi-scale series aggregation scheme assisted by DE analysis, and the comparison between LSTM and the other two single models is intended to demonstrate the better prediction performance of LSTM. Finally, the comparison between the proposed method and the other five methods further indicates the feasibility and superiority of combining the CEEMDAN decomposition algorithm, the DE-based multi-scale series aggregation scheme and LSTM for improving prediction accuracy. In addition, the forecasting accuracies of the six methods are depicted in Fig. 11, and the corresponding values of the three error criteria are listed in Table 4. From Fig. 10, it can be observed that the predicted results of the proposed method fit the actual IHSI series better over the entire time scale than those of the other methods. More specifically, based on the analysis of the error criteria depicted in Fig. 11 and Table 4, we observe that the MAE, RMSE and MRPE of the proposed method are 0.030, 0.042 and 3.621%, respectively, which are clearly lower than those of the other prediction methods.
The following conclusions can be drawn from the above results:
(1) Compared with the other approaches, the proposed forecasting method based on multi-scale series aggregation and the LSTM network has the best prediction performance, whereas BPNN has the worst. This is because the optimal model parameters of BPNN are difficult to determine.
(2) The combination prediction methods, including the proposed method, EMD-DE-LSTM and CEEMDAN-LSTM, achieve higher forecasting accuracy than the other three single methods, which illustrates the effectiveness and necessity of series decomposition for improving the final prediction accuracy.
(3) With the CEEMDAN algorithm, the mode mixing phenomenon occurring in the decomposed series of EMD can be effectively eliminated, so that the prediction errors are further decreased.
(4) Because of the developed DE-based multi-scale series aggregation scheme, the forecasting performance of the proposed method is significantly enhanced.
(5) Compared with the other single methods, LSTM achieves higher accuracy and is better able to track the jumping IHSI points, which indicates the feasibility and superiority of adopting the LSTM network in the proposed forecasting method.
Consequently, the proposed intelligent deep learning method not only contributes to the accurate measurement of the health levels of an aero-engine, but also achieves superior forecasting performance compared with other traditional methods.

V. CONCLUSION
In this paper, an intelligent deep learning method that systematically blends the DE-based multi-scale series aggregation scheme with the LSTM neural network is proposed for forecasting the health evolution trend of aero-engines. Compared with existing approaches, the proposed method has three significant advantages. First, a comprehensive measurement of health levels, i.e., the IHSI, is constructed from a high-dimensional dataset to accurately quantify the health status of an aero-engine. Second, to deal with the fluctuations of the IHSI series, CEEMDAN is applied to decompose the IHSI sequence and further capture the internal characteristics of the original sequence; additionally, a DE-based multi-scale series aggregation scheme is developed to retain all effective information of the multi-scale series while clearly decreasing the number of series. Lastly, the aggregated sequences, i.e., the ASS, serve as the inputs of the LSTM neural network to forecast the health evolution trend of the aero-engine.
To demonstrate the effectiveness of the proposed method, six different approaches are applied in the analysis of a well-known case study. The experimental results indicate that the proposed method can effectively measure the health evolution process of an aero-engine and obtain more accurate trend forecasting results. Furthermore, because of the fluctuations and non-stationarity of actual operation data, a forecasting strategy based on a combination pattern is better suited to actual engineering than single forecasting models.
WEI JIANG received the B.S. degree in mechanical engineering and the Ph.D. degree in water resources and hydropower engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2014 and 2019, respectively. He is currently a Lecturer with the Faculty of Mechanical and Material Engineering, Huaiyin Institute of Technology. His research interests include fault diagnosis, trend prognosis, and intelligent management for mechanical systems.
NAN ZHANG received the B.S. degree in electronic information engineering from Wuhan Textile University, Wuhan, China, in 2014, and the Ph.D. degree in water resources and hydropower engineering from the Huazhong University of Science and Technology, Wuhan, in 2019. He is currently a Lecturer with the Faculty of Mechanical and Material Engineering, Huaiyin Institute of Technology. His research interests include control theory, system identification, and modeling of power generation systems.