A Data-Driven Maintenance Framework Under Imperfect Inspections for Deteriorating Systems Using Multitask Learning-Based Status Prognostics

This paper proposes a data-driven, condition-based maintenance framework (DCBM) for deteriorating equipment under the impact of varying environments and natural aging. The equipment’s degradation status is determined by a prognostic and health monitoring method. Generally, monitoring data and maintenance inspections are imperfect because of uncertainties in the equipment degradation process, which may prevent a reliable evaluation of a system’s deterioration. Using a deep learning technique, we construct a new stacked autoencoder long short-term memory (SAE-LSTM) network-based multitask learning model that extracts state features from the monitoring data and then performs multistep forecasting to obtain performance degradation and failure probability information. The developed SAE-LSTM-based multitask learning model achieves prognosis results close to the actual values, which indicates its strong feature extraction capability. We therefore introduce this deep multitask learning model into the optimization of the maintenance process. Probabilistic forecasting is used as one of the criteria for maintenance decisions made with imperfect inspections to address the influence of the uncertainties involved in the prognosis results. The effectiveness of the proposed DCBM framework is illustrated by its application to an engine degradation dataset, where it is more cost-effective than the baseline maintenance policies.


I. INTRODUCTION
In engineering environments, a system's performance deteriorates due to aging devices and various external shocks, and this process is prone to nonstationary and stochastic features. In reliability theory, equipment deterioration is usually simulated by a stochastic process, allowing the risks caused by the uncertainties to be depicted by failure probabilities or reliability metrics [1]. Probabilistic forecasting can provide quantitative descriptions of uncertainty information [2]-[4]. Thus, the prognostics of both performance degradation and failure probability are necessary for comprehensively evaluating the variability and uncertainty in the degradation process [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Jiajie Fan.
Generally, prognostics technologies focus on forecasting the time at which a system fails to perform its expected function. The technologies are implemented during the operation of the system to predict the failure and arrange the maintenance activities as early as possible. Depending on the situation of the actual deteriorating system, two types of prognosis methods are available: the physical-based method (PBM) [7] and the data-driven method (DDM) [8]. For a system with multiple degradation parameters, it is almost impossible to derive a precise theoretical model, and the assumptions adopted in the PBM also lead to additional uncertainty. In contrast to the PBM, the DDM establishes the mapping relations between the monitoring signals and the underlying physical characteristics of the potential faults without presuming the parameters of the degradation process [9], [10]. Among the DDM models used in the fault diagnosis field, long short-term memory (LSTM) network-based methods have shown excellent performance in modeling the long-term and short-term dependencies of degradation data [11]. For instance, a CNN- and LSTM-based prognostic approach was provided in [13] to improve the prediction accuracy and lessen the requirement for expert experience in diagnosing bearing faults. Cabrera et al. [14] adopted a Bayesian search to optimize the hyperparameters of the LSTM-based model, and the optimized LSTM model was then used to capture faulty patterns and diagnose faults. In addition, as an unsupervised deep learning framework, autoencoders (AEs) can effectively reconstruct monitoring data to eliminate the effect of noise before further analyses. For example, an integrated learning method that combines the autoencoder and LSTM for rare fault event diagnoses and fault type identification was proposed in [12].
In this paper, a prognostic model based on the stacked autoencoder long short-term memory network (SAE-LSTM) is constructed to implement data reconstruction and capture the hidden patterns of equipment failure from degradation data. Current research on equipment degradation using deep learning methods mainly focuses on fault diagnoses, and few studies include both degradation estimations and failure probability predictions in one prognostic model [8], [15]. In this paper, we propose a probabilistic forecasting method, based on the SAE-LSTM classifier, that provides the probability of equipment failure, and we then integrate it into a multitask learning (MTL) model together with a regressor that is used for performance estimation. This approach can be used as a general surrogate model to forecast the performance degradation and the failure probability of the system simultaneously. As a model training method that can achieve higher efficiency by capturing the relationships among different tasks, multitask learning has been studied through comparisons with independent learning task models [16], [17]. The representative works and applications of the MTL can be found in the general overview presented in [18]. Then, using the constructed SAE-LSTM-based multitask learning model as the prognosis module, a data-driven condition-based maintenance framework (DCBM) for the deteriorating system is proposed.
For deteriorating equipment, an unscheduled failure means additional replacement costs and downtime costs. With the development of prognostics and health management technologies, continuous or periodic monitoring of the system degradation process has greatly promoted the application of condition-based maintenance (CBM) strategies [19]-[21], which have shown that it is more cost-effective to arrange the maintenance schedules according to the status prognostic results. For example, in [22], an unsupervised representation learning method was developed to address life estimations using high-dimensional run-to-failure data for CBM. Zhou et al. [23] proposed a dynamic reliability-centered maintenance model with fault diagnostic and prognostic technology, where the failure probability was calculated by a statistical method. By estimating residual life based on the observed sensor data, preventive maintenance schedules were executed in [24]. Lou et al. [25] adopted the backpropagation neural network and the Boltzmann machine to model the nonstationary degradation processes and applied them in a maintenance model. However, most of the existing CBM models assume that the system state's prognosis and maintenance inspection are perfect [26]-[28]. This problem has raised the concern of some researchers. The authors of [29] proposed a Kalman filter approach to estimate the system reliability and showed that relying solely on the prognosis of performance degradation to schedule the maintenance may lead to suboptimal solutions. Therefore, an assumption of an ideal prognosis will negatively impact the maintenance plan, making it necessary to introduce probabilistic forecasting as a supplementary index for maintenance with imperfect inspections [30], [31] to address the influence of the uncertainties involved in the real-time condition monitoring data.
As mentioned above, this paper establishes a probabilistic forecasting approach based on the SAE-LSTM classifier to provide the predicted failure probability of the deteriorating equipment. The paper also develops a DCBM framework that consists of a complete analysis process from the prognosis module of both the performance degradation and the future failure probability, to the maintenance module under imperfect inspections.
This study proposes a prognostic approach to simultaneously predict the performance degradation and failure probability of deteriorating systems using monitoring data, and applies this prognostic model to achieve a more cost-effective maintenance process. The contributions of this research can be summarized as follows: (1) To the best of our knowledge, this work is the first MTL-based maintenance framework using imperfect inspections that involves both degradation estimations and probabilistic forecasting. (2) The proposed DCBM model can adaptively learn the system's deteriorating features from historical monitoring data and perform maintenance decision-making using these features. (3) As an end-to-end dynamic predictive maintenance strategy, the DCBM is composed of modules from degradation data-based state prognoses to the final maintenance decision, and it does not require predefined features or reliance on prior knowledge. This paper is arranged as follows: In Section 2, the proposed prognosis model is explained, and the construction of the SAE-LSTM-based MTL network and the procedures of the DCBM framework with imperfect inspections are introduced. Section 3 presents the average maintenance cost model of the proposed maintenance strategy and compares it with two baseline maintenance strategies. In Section 4, the application results of the proposed DCBM model on an engine dataset are presented. Section 5 provides the conclusions of this paper and possibilities for future work.

II. PROPOSED MODEL
In engineering applications, an estimation of the current deteriorating status of equipment is essential for the arrangement of system operation plans and maintenance activities. As shown in Fig. 1, with the features extracted from the historical data, the corresponding maintenance measures at inspection point T can be determined through one-step, or multistep, forecasting of the future degradation process. However, due to measurement errors, the prognosis results are not precisely consistent with the true degradation process. Thus, maintenance activities are implemented with imperfect inspections. It is natural that the more degradation features that are extracted from the historical data, the higher chance there is to make an optimal maintenance decision. Therefore, we construct an SAE-LSTM network-based multitask learning model to implement state forecasting, which can provide a more comprehensive reference for a DCBM using imperfect inspections.

A. SAE-LSTM BASED MULTITASK LEARNING MODEL
As shown in Fig. 2, the SAE-LSTM network is composed of two parts: encoder E(·) and decoder D(·). Both the encoder and decoder consist of two hidden LSTM layers. For the fault prognosis of a deteriorating system, LSTM can not only review the history of the system degradation process but can also track the current state of the system. The encoder first compresses the input sequence into context vectors. Then, according to the input timesteps, the context vectors are transformed to form the repeat vector $Y_{1:n} = [x_1, x_2, \ldots, x_n]$, which is used as the initial state of the decoder D(·). The sequential outputs are then generated by applying the decoder to this representation, with θ_1 and θ_2 denoting the parameters of the encoder and decoder, respectively. Then, by modifying the output part of the model, we set two specific fully connected layers with different activation functions to construct two learning tasks: regression and classification. As shown in Fig. 4, with the same input dataset and the shared SAE-LSTM hidden layers, we propose a multitask supervised learning-based prognosis model that is composed of the regressor and the classifier. The shared network layers can learn the representations across the two tasks, and the status prognosis for the deteriorating system can then be performed by the task-specific layers.
In Fig. 4, assuming there are m deteriorating devices, the operation monitoring data of n sensors collected in the time window T can be expressed as $X_i = [x_{t,1}, x_{t,2}, \ldots, x_{t,n}]$, $i = 1, 2, \ldots, m$. The input layer processes the above data into a three-dimensional tensor (sample, time step, feature), where sample is the number of samples of the monitoring data, time step is the sliding window size of the input data, and feature is the number of features extracted from the monitoring data [32]. To avoid the overfitting problem in the training process, all the LSTM layers of the hidden layer adopt the ''dropout'' regularization method, and the training process adopts the ''early stopping'' strategy to obtain the best trained model while minimizing the training time.
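The construction of the (sample, time step, feature) input can be illustrated with a minimal sketch in plain Python. The function names `sliding_windows` and `tensor_shape`, the toy record, and the window size of 4 are illustrative assumptions and are not taken from the paper; the sketch only shows how a sliding window with step one turns one device's record into overlapping samples.

```python
def sliding_windows(series, window, step=1):
    """Cut one device's record into overlapping windows.

    `series` is a list of time steps; each time step is a list of sensor
    readings. Each returned sample holds `window` consecutive time steps.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [series[start:start + window]
            for start in range(0, len(series) - window + 1, step)]


def tensor_shape(samples):
    """Return (sample, time step, feature) for equally sized windows."""
    return (len(samples), len(samples[0]), len(samples[0][0]))


# Toy record: 6 time steps from n = 2 sensors (values are made up).
record = [[0.1, 1.0], [0.2, 1.1], [0.3, 1.2],
          [0.4, 1.3], [0.5, 1.4], [0.6, 1.5]]

samples = sliding_windows(record, window=4)
shape = tensor_shape(samples)
print(shape)
```

With 6 time steps and a window of 4, the sketch yields 3 samples, each with 4 time steps and 2 features.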
For the regression task, the historical data collected by n different sensors may have different ranges and should be preprocessed by a dataset normalization method, such as min-max normalization, which linearly maps the raw data to the range [0,1]. To reduce the influence of singular values and ensure the differentiability of the loss function, the Huber loss function is adopted in this model. It avoids both the derivative discontinuity of the L1 loss function and the sensitivity of the L2 loss function to outliers. The Huber loss function can be regarded as a combination of the L1- and L2-loss functions and is defined as follows:

$$L_{\delta}(y, f(x)) = \begin{cases} \frac{1}{2}\,(y - f(x))^2, & |y - f(x)| \le \delta \\ \delta\,|y - f(x)| - \frac{1}{2}\,\delta^2, & \text{otherwise} \end{cases}$$

where y and f(x) are the target value and the predicted value, and δ is the model hyperparameter that adjusts the weight between the L1- and L2-loss functions. For the classifier, the historical data should be labeled according to the inspection interval τ, and the system states can be defined as: Safe, that is, the system degradation will not exceed the threshold in the time τ; Risky, that is, the system degradation will exceed the threshold value in the time τ. The failure probability mentioned in this paper refers to the possibility of equipment failure in the future time τ. If the time window τ has multiple values, a multistate model can be defined, and the corresponding labels should be set for each state. The softmax function is used to normalize all the output values of the classification model, as in:

$$P_i = \frac{e^{x_i}}{\sum_{k} e^{x_k}}$$

where $P_i$ represents the normalized probability that the input sample is associated with the i-th state, and the probabilities sum to 1; $x_i$ represents the output of the SAE-LSTM hidden layers. The classification task in this paper is a binary problem with mutually exclusive classes, and we use categorical_crossentropy as the loss function, as in:

$$L_{CE}(Y_i, P_i) = -\sum_{j=1}^{n} y_{ij} \log P_{ij}$$

where n is the number of categories; $Y_i$ is the one-hot encoded target vector $(y_{i1}, y_{i2}, \ldots, y_{in})$, with $y_{ij} = 1$ if category j is the category of sample i and $y_{ij} = 0$ otherwise; and $P_{ij}$ is the predicted probability $f(X_i)$ that the observed sample i belongs to category j. Then, for the proposed SAE-LSTM-based multitask learning model, a joint loss function is defined for the joint training process of the regressor and the classifier, which can be set as the combination of the two abovementioned loss functions, as in:

$$L = L_{\delta} + \sigma\, L_{CE}$$

where σ is the parameter that adjusts the weight between the two loss functions. When σ = 0, this multitask learning model degenerates to a regressor that only performs the regression for the performance degradation; when σ → ∞, it becomes a classifier used only for probabilistic forecasting. Like the hyperparameter δ in the Huber loss function, σ should be adjusted during the training process to achieve the best results for all the training tasks. The training process of the proposed model is presented in Algorithm 1. First, the historical data are preprocessed, and a sliding window with a sliding step of one is used to construct time series data pairs as training data. Then, the model is trained by the batch gradient descent and backpropagation algorithms.

VOLUME 9, 2021
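The preprocessing and loss terms described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Huber and cross-entropy expressions are the standard textbook definitions, and the additive form "Huber + σ × cross-entropy" is an assumption chosen because it matches the limiting cases σ = 0 (regression only) and σ → ∞ (classification dominates) described in the text.

```python
import math


def min_max_normalize(values):
    """Linearly map raw sensor readings to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant channel: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def huber(y, y_hat, delta):
    """Huber loss for one sample: quadratic near zero, linear in the tails."""
    r = abs(y - y_hat)
    if r <= delta:
        return 0.5 * r * r
    return delta * r - 0.5 * delta * delta


def softmax(logits):
    """Normalize the classifier outputs into probabilities that sum to 1."""
    m = max(logits)                   # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def joint_loss(y, y_hat, one_hot, logits, delta, sigma):
    """Assumed joint loss: Huber (regression) + sigma * cross-entropy."""
    return (huber(y, y_hat, delta)
            + sigma * categorical_crossentropy(one_hot, softmax(logits)))


# Made-up values: a normalized target, a prediction, and a "Risky" label.
loss = joint_loss(y=0.40, y_hat=0.45, one_hot=[0, 1], logits=[0.2, 1.3],
                  delta=0.03, sigma=1.5)
print(loss)
```

Setting `sigma=0` in `joint_loss` reduces it to the Huber term alone, which mirrors the degenerate regressor case described in the text.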

Algorithm 1: Training Process of the Proposed MTL Model
Input: Historical input time series; historical target time series
For each training batch do
    Get the output of the learner on the training batch
    θ_t ← ∇_{θ_{t−1}} L_t: update the parameters from the gradient of the loss L_t, using the EarlyStopping method
End for

The DCBM framework with imperfect inspections [30], [31] is presented in this section (see Fig. 5). ''Imperfect inspection'' means that the system degradation level obtained through inspection has a measurement error when compared with the actual degradation level, so the inspection results may be inconsistent with the actual situation. In this section, as shown in Table 1, the inspection results of the system degradation level obtained from the prognosis results fall into three types: good (TR), defective (FR), and failure (TF). The DCBM model is constructed based on these results. In general, this framework consists of three parts.

The first part is the module that constructs an offline trained model based on historical data. Generally, there is a data preprocessing step in which the field engineer collects the recorded operating status data and historical failure information and converts the data into the required format. After preprocessing, the historical dataset is divided into the training dataset and the validation dataset for the training process of the multitask learning network. The training process demands a suitable adjustment of the network structure and hyperparameters based on the training accuracy and computational complexity. Maintenance engineers can estimate the performance of the network by evaluation metrics such as the MAE and accuracy. When the model can effectively capture the potential features of the degradation process, and the losses on both the training set and the validation set no longer decrease, the training stops, and the model can be applied in the second part.
The second part is the module of the online status prognosis. In this module, the collected real-time monitoring data will be used as the input data of the trained prognosis model to obtain the outputs of both the current degradation of the system, and the failure probability in the next maintenance interval. This process does not repeat the time-consuming training process of the first part, so it can be completed within an acceptable time scope, which can satisfy the requirements for rapid access to the system status and fast maintenance decision-making in engineering applications.
The third part is the module of maintenance decision-making. Because the prognosis results obtained in the second part cannot be completely accurate, a bias exists between the predictions and the actual values. Thus, according to the prognosis results of the performance degradation (PD) and the failure probability (FP), three possible statuses for the deteriorating system are defined: good (true reliable, TR), defective (false reliable, FR), and failure (true failure, TF); see Table 1. Then, the maintenance strategy can be divided into two types: (1) a radical maintenance strategy, in which maintenance measures are required for both FR and TF; and (2) a conservative maintenance strategy, in which maintenance measures are required only when the status is TF. In this paper, preventive maintenance is performed in the ''FR'' state to ensure the reliability of the deteriorating system. In addition, the prognosis results are only valid in the current maintenance period. For the next inspection point, the above DCBM program should be updated.
This DCBM maintenance strategy comprehensively considers both the estimation of the status degradation and the effect of the uncertainty in the outputs, which can compensate for the adverse impact of imperfect inspections on maintenance decision-making. Therefore, by introducing probabilistic forecasting as a criterion to incorporate the influence of uncertainty, the MTL-based prognosis module can bring more flexibility to the DCBM model, and make it better adapt to the dynamic demands of deteriorating equipment.
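The three-status decision rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `classify_status` is hypothetical, and the comparisons (predicted RUL above the interval τ means reliable, predicted failure probability below a criterion f_0 means reliable) follow the threshold definitions given in Section III. The first two calls use the engine #94 values reported in Section IV with τ = 25 and f_0 = 0.01; the third uses made-up values.

```python
def classify_status(predicted_rul, failure_prob, tau, f0):
    """Map the two prognosis outputs to an inspection status.

    Assumed rule: the performance-degradation criterion is reliable when
    predicted_rul > tau, and the failure-probability criterion is reliable
    when failure_prob < f0.
    """
    rul_ok = predicted_rul > tau
    fp_ok = failure_prob < f0
    if rul_ok and fp_ok:
        return "TR"   # good: both criteria indicate a reliable system
    if rul_ok or fp_ok:
        return "FR"   # defective: the two criteria disagree
    return "TF"       # failure: both criteria indicate imminent failure


tau, f0 = 25, 0.01
print(classify_status(59.03, 0.0075, tau, f0))   # values reported for engine #94 at 8*tau
print(classify_status(33.77, 0.2419, tau, f0))   # values reported for engine #94 at 9*tau
print(classify_status(10.0, 0.90, tau, f0))      # made-up values for illustration
```

Treating any disagreement between the two criteria as ''FR'' reflects the paper's choice to perform preventive maintenance whenever the status is defective.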

III. PERFORMANCE OF THE DCBM MODEL
To evaluate the performance of the proposed maintenance strategy, the indices include the average maintenance cost rate (MCR, the ratio of the average cost per maintenance cycle to the average maintenance cycle length, defined by Ross [33]) and the availability [27], [28], [30]. The availability is the ratio of the average uptime per maintenance cycle to the sum of the uptime and the downtime per maintenance cycle. Generally, maintenance cost and availability are two contradictory indices. In this paper, we assume that preventive maintenance will not cause system downtime, so the MCR is adopted as the performance metric of the maintenance policies, as defined by

$$MCR = \lim_{t \to \infty} \frac{C_M(t)}{t} = \frac{E(C_T)}{E(T_M)}$$

where $C_M(t)$ represents the cumulative maintenance cost until time t, $E(C_T)$ is the expected value of the maintenance cost incurred in a cycle, and $E(T_M)$ is the expected length of a cycle. The DCBM adopts the periodic inspection strategy with the successive maintenance inspection interval τ, and maintenance is only performed during the inspection period. The maintenance solutions include corrective maintenance and preventive maintenance. Corrective maintenance aims to eliminate equipment failures, and its cost is $C_c$. Preventive maintenance refers to routine maintenance activities that reduce the failure probability and is usually implemented according to the operation schedule. The cost for preventive maintenance is $C_p$. Considering the system status in Table 1, the corresponding maintenance cost is presented in Table 2. The ''FR'' inspection status indicates that the predicted system degradation has exceeded one of the thresholds of the PD or the FP, so preventive maintenance is required. The ''TF'' status indicates that both the predicted PD and FP have exceeded the thresholds, which means that the system is in danger, so corrective maintenance is required.
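The MCR definition can be illustrated with a short sketch that estimates it from observed maintenance cycles as the mean cycle cost divided by the mean cycle length. The function name `maintenance_cost_rate` and the cycle data are hypothetical and serve only to show the arithmetic.

```python
def maintenance_cost_rate(cycles):
    """Estimate the MCR as mean cycle cost divided by mean cycle length.

    `cycles` is a list of (cost, length) pairs, one per maintenance cycle.
    """
    if not cycles:
        raise ValueError("at least one maintenance cycle is required")
    mean_cost = sum(cost for cost, _ in cycles) / len(cycles)
    mean_length = sum(length for _, length in cycles) / len(cycles)
    return mean_cost / mean_length


# Hypothetical cycles: (total cost in the cycle, cycle length in time units).
cycles = [(2.4, 200), (1.6, 150), (2.0, 250)]
mcr = maintenance_cost_rate(cycles)
print(mcr)
```

Note that this is a ratio of means rather than a mean of per-cycle ratios; the two generally differ when cycle lengths vary.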
The procedure for evaluating the MCR of the DCBM is presented in Fig. 6. At maintenance inspection point T, if a failure has occurred, corrective maintenance will be performed to repair the failed equipment. If no failure has occurred, the prognosis module will perform forecasting using the current monitoring data. Next comes the decision-making module based on imperfect inspections. If the inspection status is ''TR'', the system is operating well, so it will continue to run until the next maintenance inspection point. If the inspection status is ''FR'', the two prognosis results of the system degradation lead to a conflicting judgment about the equipment status and may cause an incorrect maintenance decision, so preventive maintenance is required. If the inspection status is ''TF'', there will be two options: (1) ''corrective maintenance'', that is, replace the equipment immediately; and (2) ''do nothing'', that is, let the equipment work until failure, and then replace the equipment. The choice depends on the relationship between $MCR_{failed}$ and $MCR_{work}$, as shown below. If the system fails between two inspection points, the system will stay in the faulty state from the failure time point until the next maintenance inspection point, and corrective maintenance will then be adopted to repair the failed equipment.
By using the average maintenance cost rate model, the maintenance cost per unit time of the DCBM is given by (7) and (8). While the system works, the cost incurred at an inspection point in (7) is $\delta_1 C_I + \delta_2 (C_I + C_p) + \delta_3 (C_I + C_c)$, with one term for each inspection status in Table 2. Here, RUL is the performance metric of the deteriorating equipment in this paper, representing the remaining useful life; $RUL_j$ is the remaining useful life predicted at the j-th inspection point; Threshold-D is the maintenance time interval τ, so $RUL_j > \tau$ indicates the reliable state; $f_j = P(RUL_j < \tau)$ represents the predicted probability of equipment failure in the next maintenance interval τ at the j-th inspection point; Threshold-F is $f_0$, so $f_j < f_0$ means reliable; and δ(x) = 1 when x is true and δ(x) = 0 otherwise. The inspection and maintenance costs satisfy the following relationship:

For comparison, the average maintenance cost models of the periodic maintenance strategy (PM) [34] and the ideal predictive maintenance strategy (IPM) [35] are derived as follows:

(1) Periodic maintenance policy (PM)
First, it is necessary to estimate the mean time to failure TF based on historical data: $TF = \frac{1}{m}\sum_{i=1}^{m} TF_i$. Regular preventive maintenance is performed with the time interval τ, and corrective maintenance is performed at the last inspection point before the time TF, so the average maintenance cost is:

IV. AN ENGINEERING APPLICATION STUDY
This section uses the degradation dataset of the turbofan engine provided by the NASA Prognostics Data Repository as an application case. This dataset consists of sensor data under multiple operating conditions and different degradation modes of the same type of engine [36]. Each group of data in the dataset is composed of 26 columns representing various operating information from the engine. The first two columns are the engine ID and the degradation time step, the next three columns represent the operating modes of the engine, and the last 21 columns are the sensor data obtained from the monitoring system. The settings of the proposed SAE-LSTM-based multitask learning network are presented in Table 3. We use ''Glorot_normal'' as the initializer of the layer weights. The hyperparameter in the Huber loss function is δ = 0.03, which was obtained by test experiments during the training process. The Adam optimization algorithm is adopted in the training process [37]. This case study uses the FD001 data of the engine dataset as the research object, which is composed of run-to-failure data from 100 engines. The data of the first 90 engines are used for the training process, of which 70% is the training set (12821) and 30% is the validation set (5495). Then, the data of the remaining 10 engines are used as the test set (2250) to simulate real-time collected monitoring data. By observing the historical data collected from the sensors, the trend of the equipment degradation process can be roughly estimated. Fig. 7 provides the run-to-failure sensor data of engine #60 as an example. We can see that there is a stable stage at the beginning of the degradation process, and then the sensor data gradually change as the equipment deteriorates. This means that the engine suffers little degradation at the initial stage.
Therefore, the piecewise linear function is adopted to depict the changing trend of the true RUL; it uses an upper bound to represent the initial remaining useful life, and then the RUL will decrease after a certain time point. In addition, the sensor data are preprocessed by the sliding window to construct the input data samples, as shown in Fig. 8. This paper constructs the deep neural network in the Python 3.6 and TensorFlow programming environment.
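The piecewise linear RUL target and the Safe/Risky labels for the classifier can be sketched as follows. The function names, the toy sizes, and the convention that the RUL reaches 0 at the final recorded cycle are assumptions made for illustration; the paper does not state the value of the upper bound in this passage, so `rul_cap` is left as a parameter. The label uses the condition RUL < τ, in line with the paper's definition of the failure probability over the next interval τ.

```python
def piecewise_rul(n_cycles, rul_cap):
    """True-RUL target for a run-to-failure record of `n_cycles` time steps.

    The RUL is held at `rul_cap` during the early stable stage and then
    decreases by one per cycle, reaching 0 at the final recorded cycle.
    """
    return [min(rul_cap, n_cycles - 1 - t) for t in range(n_cycles)]


def risk_labels(rul, tau):
    """Binary labels: 1 ("Risky") if the RUL is below tau, else 0 ("Safe")."""
    return [1 if r < tau else 0 for r in rul]


# Toy sizes for readability, not the settings used in the paper.
rul = piecewise_rul(n_cycles=8, rul_cap=4)
labels = risk_labels(rul, tau=2)
print(rul)
print(labels)
```

Several values of `tau` can be passed in turn to obtain one label sequence per inspection interval, which corresponds to the multistate setting mentioned for the classifier.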

A. THE PROGNOSIS MODEL AND THE RESULTS
As mentioned before, for the constructed MTL-based prognosis model, the MAE and accuracy are used as the evaluation metrics for the regressor and classifier tasks. The MAE represents the average absolute error between the predicted value and the true value, and the accuracy represents the proportion of the samples with correct forecasting results in the total samples. In addition, the root mean squared error (RMSE) and confusion matrix [TP, FP; FN, TN] are also given for reference, where TP means the true positive, FP represents the false positive, FN is the false negative, and TN is the true negative.
$$\text{Accuracy} = \frac{n_{correct}}{n_{samples}} = \frac{TP + TN}{TP + FP + FN + TN}$$
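The evaluation metrics named above can be computed with a few lines of plain Python. The function names and the sample values are illustrative assumptions; the formulas are the standard definitions of the MAE, the RMSE, and the accuracy derived from the confusion matrix.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large deviations more than the MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(tp, fp, fn, tn):
    """Share of correctly classified samples from the confusion matrix."""
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up RUL values with one large deviation in the last sample.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 30, 50]
print(mae(y_true, y_pred), rmse(y_true, y_pred))
print(accuracy(tp=40, fp=5, fn=5, tn=50))
```

In this toy example the single large error raises the RMSE well above the MAE, which illustrates why the RMSE is the more outlier-sensitive of the two.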
The values of these evaluation metrics are presented in Table 4. Within the range of the joint loss function parameter σ from 0.1 to 100, the performances of the regressor and the classifier, reflected by the values of the MAE and the accuracy, show opposite trends. To achieve better prognosis results for both the regressor and classifier tasks, we set σ = 1.5. The results listed in Table 4 show that the accuracy of the one-step forecasting is higher than that of the three-step forecasting, which can be explained by the influence of uncertainty in the monitoring data. The longer the forecast time step is, the more uncertain factors need to be considered, so the deviation in the forecast results becomes larger and leads to lower forecasting accuracy. In addition, the RMSE values are larger than those of the MAE because the RMSE is more sensitive to outliers in the prognosis results, which indicates a negative effect of the deviation.
To visualize the performance of the prognosis model, we also provide the ROC (receiver operating characteristic) curve with its AUC (area under the curve) and the training loss curve as measurements. The ROC curve is a probability curve that represents the distinguishing capability of the classifier at different threshold settings. It plots the TPR (true positive rate) on the y-axis against the FPR (false positive rate) on the x-axis. A larger AUC (between 0 and 1) indicates better performance of the model. As shown in Fig. 9, the AUC value of the proposed model on the test set is 0.9766, indicating that the multitask learning-based classifier has good accuracy. For the regression learning task, the loss curves of the training set and the validation set in Fig. 10 also show the effectiveness of the multitask learning compared to the regression learning task.

For the prognosis results obtained at inspection point T, a larger RUL value indicates less performance degradation of the equipment and corresponds to a lower failure probability before the next maintenance inspection point T + τ. Fig. 11 shows the prognosis results of four engine samples in the test dataset, and the forecasting curve of the RUL fits the true degradation trend of the engine well. In addition, the curves of the predicted failure probability with the different inspection intervals τ = 20, τ = 25, and τ = 30 are also presented. Using Fig. 11(a) as an example, at the given inspection time point 143, the true RUL is 12, the predicted RUL is 16, and the failure probabilities with τ = 20, τ = 25, and τ = 30 are 0.55, 0.91, and 0.93, respectively. That is, when the predicted RUL of the engine is 16, the probabilities that the engine will fail in the next 20, 25, and 30 time cycles are 0.55, 0.91, and 0.93, respectively. The results in Fig. 11 demonstrate the good capacity of the proposed model in state prognosis for the deteriorating system and show that the predicted RUL and failure probability can effectively describe the degradation process of the engines.
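The AUC used above can be computed without plotting the ROC curve, since it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The following sketch uses this pairwise formulation; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, with label 1 standing for the ''Risky'' class and the score for the predicted failure probability.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Every (positive, negative) pair contributes 1 if the positive has the
    higher score, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two "Safe" (0) and two "Risky" (1) samples.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.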

B. THE IMPERFECT INSPECTION BASED MAINTENANCE DECISIONS
In this section, the prognosis results are used as the decision basis for the maintenance schedule under imperfect inspections. To illustrate the effectiveness of the proposed model, the inspection interval τ and the criterion f 0 are assumed to be 25 and 0.01, respectively. The prognosis results at each inspection point and the corresponding maintenance measures are listed in Table 5. Using the results of engine #94 as an example, at inspection point 8τ, the predicted value of the RUL is 59.03 > τ, and the predicted failure probability is 0.0075 < f 0, so the engine status is ''TR''. There is no need to adopt maintenance measures, and the equipment can keep running. Then, at the 9τ inspection point, the predicted RUL is 33.77 > τ, and the predicted failure probability is 0.2419 > f 0, so the inspection status is ''FR'', and preventive maintenance should be performed. After that, the status of the equipment is ''TF'' at the 10τ inspection point, since both the predicted RUL and the failure probability cross their thresholds τ and f 0, satisfying the condition that the equipment is about to fail. In this case, the DCBM model (shown in Fig. 6) chooses between the two options, ''Do nothing'' and ''Corrective maintenance'', and adopts the one with the smaller MCR. If the equipment fails before the inspection point, then corrective maintenance should be performed. Fig. 12 shows the predicted status of engine #98 at the 5τ inspection point. Both the real RUL and the predicted RUL are larger than Threshold-D τ, while the predicted failure probability is larger than Threshold-F f 0, so the engine state is ''FR'', and preventive maintenance is required. Here, however, the maintenance measure is adopted while the real RUL is still larger than the threshold, which indicates that the threshold setting f 0 = 0.01 is conservative for the actual situation; the maintenance engineer can adjust the threshold settings to improve the decision process.
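The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names `inspection_status` and `maintenance_action` are hypothetical, the handling of values exactly equal to a threshold is an arbitrary choice, and the mapping of a single triggered indicator to ''FR'' is an interpretation of the worked examples rather than a reproduction of Table 1.

```python
def inspection_status(pred_rul, pred_fp, tau, f0):
    """Map the two prognosis outputs to an inspection status.

    Assumed rule (an interpretation of the worked examples):
      "TR": neither indicator signals a defect,
      "FR": exactly one indicator signals a defect,
      "TF": both indicators signal a defect.
    """
    rul_alarm = pred_rul <= tau  # predicted RUL no longer exceeds the interval
    fp_alarm = pred_fp >= f0     # predicted failure probability reaches f0
    if rul_alarm and fp_alarm:
        return "TF"
    if rul_alarm or fp_alarm:
        return "FR"
    return "TR"


def maintenance_action(status, mcr_do_nothing=None, mcr_corrective=None):
    """Pick the maintenance measure for a given status.

    In the "TF" case the option with the smaller MCR is taken; the two
    MCR values are hypothetical inputs that must be computed elsewhere.
    """
    if status == "TR":
        return "Do nothing"
    if status == "FR":
        return "Preventive maintenance"
    if status == "TF":
        if mcr_do_nothing is None or mcr_corrective is None:
            raise ValueError("both MCR values are required in the TF state")
        # Ties are resolved in favor of "Do nothing" (an arbitrary choice).
        if mcr_do_nothing <= mcr_corrective:
            return "Do nothing"
        return "Corrective maintenance"
    raise ValueError(f"unknown status: {status!r}")


# Engine #94 values quoted above, with tau = 25 and f0 = 0.01.
print(inspection_status(59.03, 0.0075, tau=25, f0=0.01))  # TR
print(inspection_status(33.77, 0.2419, tau=25, f0=0.01))  # FR
```

Keeping the status classification separate from the action selection makes it straightforward to test other threshold settings, for example a less conservative f 0, without touching the cost comparison used in the ''TF'' state.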

C. COMPARISON OF THE MCR UNDER DIFFERENT MAINTENANCE POLICIES
First, the MCRs of the maintenance models that separately use the predicted RUL (DCBM-RUL) and the predicted failure probability (DCBM-FP) are compared with that of the proposed model. From Fig. 13, we can see that with increasing inspection interval τ and cost C p, the MCR of the proposed model is larger than that of the other two models. This can be explained by the status of the system under imperfect inspections shown in Table 1. The proposed model considers two types of ''FR'' status, while the other two models each include only one ''FR'' status. As indicated by the results in Fig. 12, the threshold f 0 is conservative for the actual situation, and the prognosis model tends to overestimate the RUL, so the MCR of the model based on the predicted failure probability is larger than that of the model based on the predicted RUL. Therefore, considering the impact of the imperfect inspections, the proposed maintenance model accounts for more defective-state information and adopts a conservative strategy to avoid missing necessary maintenance.
Then, the MCRs of the three maintenance strategies presented in Section III are compared for different values of the inspection interval τ, as shown in Fig. 14. The mean time to failure TF in the IPM model is approximately 205. The values of C p, C c, C d, and C I are assumed to be 1, 2, 0.5, and 0.1, respectively. It can be seen from Fig. 14 that with an increasing inspection interval τ, the MCR of the three maintenance strategies shows a decreasing trend, and the MCR of the DCBM is higher than that of the IPM and lower than that of the PM. When τ < 15, the MCRs of the three strategies are quite different, and the MCR of the PM is significantly higher than that of the DCBM and the IPM. When τ > 15, the MCR of the DCBM and the PM gradually decreases. These results demonstrate that under a smaller inspection interval τ, the DCBM strategy can achieve a lower MCR than the PM, and that a larger inspection interval τ can significantly reduce the MCR. With a further increase in the inspection interval, however, the MCR can increase again due to the influence of the downtime cost. The MCR of the DCBM and the PM fluctuates as τ increases because, under a larger inspection interval, the MCR is mainly influenced by the downtime cost C d, and some specific values of τ make the equipment downtime shorter than others, which leads to a lower maintenance cost and hence a smaller MCR. Therefore, to obtain the minimal MCR, it is necessary to select an appropriate inspection interval τ for the maintenance strategy. For instance, for engine #93, the best inspection interval is 26, at which its MCR is close to that of the IPM, the perfect-inspection model.
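The effect of τ on the downtime, and hence on the cost rate, can be illustrated with a deliberately simplified toy model. The sketch below is not the DCBM, PM, or IPM cost model: it assumes a single unit that runs to failure at a known time `t_fail`, with the failure detected and corrected only at the next periodic inspection. The names `toy_cost_rate` and `best_interval`, the search range for τ, and the use of 205 as a single failure time are assumptions made for illustration; the cost values C c = 2, C d = 0.5, and C I = 0.1 are those quoted above.

```python
import math


def toy_cost_rate(tau, t_fail=205.0, c_c=2.0, c_d=0.5, c_i=0.1):
    """Cost per unit time of one run-to-failure cycle (toy assumption).

    The unit fails at t_fail; the failure is found at the first
    inspection at or after t_fail, so the downtime depends on how
    well tau happens to line up with the failure time.
    """
    n_insp = math.ceil(t_fail / tau)   # inspections until the failure is found
    detect_time = n_insp * tau         # cycle length
    downtime = detect_time - t_fail    # time spent failed before detection
    total_cost = c_c + c_d * downtime + c_i * n_insp
    return total_cost / detect_time


def best_interval(taus, cost_rate=toy_cost_rate):
    """Grid search: return the candidate interval with the lowest cost rate."""
    return min(taus, key=cost_rate)


taus = range(5, 61)  # hypothetical search range for the inspection interval
best = best_interval(taus)
print(best, round(toy_cost_rate(best), 4))
```

Even in this toy setting the cost rate is not monotonic in τ, because intervals that happen to place an inspection close to the failure time incur little downtime. A real selection of τ would replace `toy_cost_rate` with the MCR of the maintenance policy under study.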
Furthermore, using all the samples in the test dataset, the sensitivity analysis of the three MCR models under different C p, C c, C d, and τ is presented in Fig. 15. As mentioned above, the relations of these costs are assumed to be C c > C p > C I > C d. Sub-Fig. 15(a) shows that when C p = 5C d, τ = 15, and C I = 0.5, changes in C d play a more important role in the MCR of the three maintenance policies than changes in C c. The PM and DCBM are more sensitive to changes in C d. The MCR of the DCBM is significantly lower than that of the PM. In sub-Fig. 15(b), C c = 0.5C d, C p = 2C d, and C I = 0.5. With constant C d, the MCR of the DCBM and the PM gradually decreases as τ increases. Therefore, a larger inspection interval can significantly reduce the maintenance cost. When τ is small and C d increases from 0 to 10, the MCR of the PM increases from 0 to 20, while that of the DCBM increases from 0 to 4.2, which shows the excellent performance of the DCBM in reducing the downtime cost of the equipment. τ and C d have no impact on the IPM because it is the ideal case in which the failure time can be exactly predicted and the downtime cost is 0. It can be seen from sub-Fig. 15(c) that when C d = 0.5C p, τ = 15, and C I = 0.5, the MCR of the PM grows faster with increasing C p than that of the other maintenance policies, at approximately twice the rate of the DCBM. When C p and C d are small, C c has a larger effect on the MCR of the DCBM, which can be explained by the conservative strategy of the DCBM under the inspection status ''TF''. In sub-Fig. 15(d), C c = 0.5C p, τ = 15, and C I = 0.5. In the region satisfying the relation C p > C d, the MCR of the DCBM is lower than that of the PM. Therefore, the DCBM can achieve a lower MCR than the PM policy in most situations.
In summary, based on the prognosis results provided by the deep multitask learning model, the proposed DCBM policy shows promising performance in reducing the maintenance cost rate under the influence of imperfect inspections and related parameter uncertainties. The comparison and sensitivity analysis of the MCR of the PM, IPM, and DCBM illustrate the feasibility and efficiency of the proposed DCBM policy in optimizing the maintenance process. The proposed data-driven maintenance model is superior to the other two data-driven baseline models in terms of the cost rate MCR.

D. THE PERFORMANCE OF THE IMPERFECT PROGNOSTIC BASED DCBM
To evaluate the performance of the proposed DCBM, a further comparison between the MCRs of the DCBM and the PM is performed. According to the results shown in Fig. 11, the RUL curves obtained from the constructed prognostic model fit the degradation trend of the engine well, but they still fluctuate around the real RUL curve. Therefore, the prognosis results are imperfect, and we use the term ''imperfect prognostic'' to indicate the difference between the prognosis results and the true value. The influence of the imperfect prognostic results on the maintenance cost is evaluated by the relative difference in the maintenance cost, C RD, between these two models. As shown in Fig. 16, C RD between the DCBM and the PM is plotted for different values of C c /C p, with C I = 0.5, C d = 5, and C p = 10. When the inspection interval τ increases from 2 to 8, C RD rapidly decreases from 70% to 10%. This means that the difference C RD of the MCR between the DCBM and the PM is relatively large when τ is small, and that C RD decreases as the inspection interval τ increases. This phenomenon can be explained by the impact of the costs associated with C I and C p. As τ increases, fewer inspection actions are performed, and both the inspection cost and the preventive cost decrease, so the MCR becomes more sensitive to the cost of corrective maintenance, and the difference in the MCR between the DCBM and the PM decreases. When the inspection interval τ > 14, the C RD in all three cases of C c /C p is less than 5%. Next, as the ratio between C c and C p increases, the corresponding C RD curve shows a downward trend. This can be explained by the maintenance decisions based on the imperfect prognostic results. To achieve the lowest maintenance cost, the DCBM model chooses between ''Corrective maintenance'' and ''Do nothing'' according to the MCR of these two choices in the ''TF'' situation.
When the ratio of C c /C p is sufficiently large, the ''Do nothing'' option will be chosen by the DCBM model to avoid corrective maintenance in the state ''TF'', and the equipment will work until failure. Then, the MCRs of both the DCBM and the PM are affected by the penalty cost of downtime C d, so C RD is lower than that under a smaller ratio of C c /C p. Therefore, the C RD between the DCBM and the PM increases when the inspection interval τ and the ratio of C c /C p decrease, and the DCBM can then achieve a much lower MCR than the PM. However, as discussed in Section IV(C), the maintenance cost rate MCR increases as τ decreases. Thus, in practical applications, the values of τ, C c, and C p need to be adjusted appropriately to balance C RD against the MCR so that the maintenance process is completed at the optimal cost.
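A relative cost difference of this kind is commonly expressed as the gap between the two cost rates normalized by the baseline. The sketch below assumes C RD = (MCR of the PM − MCR of the DCBM) / MCR of the PM; the normalization by the PM value, the function name `relative_difference`, and the numerical inputs are assumptions made for illustration and are not necessarily the exact definition used for Fig. 16.

```python
def relative_difference(mcr_pm, mcr_dcbm):
    """Relative MCR difference between the PM baseline and the DCBM.

    Assumed form: (MCR_PM - MCR_DCBM) / MCR_PM, so a positive value
    means the DCBM has the lower cost rate.
    """
    if mcr_pm <= 0:
        raise ValueError("mcr_pm must be positive")
    return (mcr_pm - mcr_dcbm) / mcr_pm


# Hypothetical MCR values, chosen only to illustrate the arithmetic.
c_rd = relative_difference(mcr_pm=2.0, mcr_dcbm=0.6)
print(f"{c_rd:.0%}")  # 70%
```

Under this assumed form, a C RD close to zero means that the two policies cost almost the same per unit time, which corresponds to the behavior described above for large inspection intervals.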

V. CONCLUSION
The contributions of this work are twofold. First, we constructed a novel SAE-LSTM-based MTL model, which implements the joint training of the regressor and the classifier to make status prognostics for the deteriorating equipment; the prognosis outputs of both the RUL and the failure probability provide more accurate descriptions of the degradation process. Second, we developed a DCBM framework under imperfect inspections, which uses the constructed deep multitask learning model as the prognosis module. This maintenance strategy shows excellent performance in dealing with the impact of the uncertainties and the imperfect prognosis results. The DCBM provides a complete analysis process, from the preprocessing of the monitoring data to the quick prognosis of both the RUL and the future failure probability. The prognosis results are then used as the decision basis for the imperfect inspection based maintenance module to achieve an optimal maintenance cost rate. The deep multitask learning-based prognosis model proposed in this paper shows promising feature extraction and representation learning capabilities for degradation estimation and probabilistic forecasting, which suggests broad application prospects in the fields of reliability assessment and maintenance decision-making. In future studies, building on the work in this paper, we will focus on probabilistic forecasting-based fault diagnosis and maintenance optimization for deteriorating systems with multiple failure modes.