Improved Ensemble Feature Selection Based on DT for KPI Prediction

In the production process of large-scale machinery and complex industries, key performance indicator (KPI) prediction is an essential part of project scheduling and cost estimation. The continuous enrichment of sensor types and functions provides massive soft-sensing parameters for regression, but also brings severe challenges to algorithm learning. In this paper, an improved ensemble feature selection based on decision tree (EFS-DT) strategy for KPI prediction is developed. On the one hand, the ensemble of multi-criteria filtering results broadens the selector's perspective without the time cost of superposition. On the other hand, credibility and similarity analyses are designed to address the conflict concerns of Dempster's combination rule. After re-evaluating the variable scores, more high-quality variables can be selected to build a more accurate and robust KPI prediction model. Finally, a real shield tunnel case in China is used to evaluate the feasibility and effectiveness of the proposed approach.


I. INTRODUCTION
From the perspective of safety, schedule, and cost, key performance indicators (KPIs), e.g., the core parameters of major equipment and product quality variables, are vital during construction or production [1], [2]. In recent years, the development of sensors and data analysis technology has made KPI evaluation possible, which has attracted widespread attention from scholars [3]-[5]. Sun and Ge [6] combined ensemble and semi-supervised learning to effectively use unlabeled data for KPI prediction. Si et al. [7] divided the kernel matrix into KPI-related and KPI-unrelated subspaces; two statistics were then designed for process monitoring based on the improved kernel partial least squares (KPLS) method. Shao et al. [8] applied decision tree (DT) to KPI prediction in the medical field and achieved satisfactory results. DT has become an excellent tool for KPI prediction because it is easy to understand and interpret [9], [10]. However, due to the complex working environment, indirect control, and difficulty in real-time acquisition of key indicators, it remains both necessary and challenging to identify positive soft-sensing parameters and realize accurate KPI prediction.
With the continuous enrichment of sensor types and functions, numerous variables, also known as parameters or features, are available in DT-based modeling for KPI prediction [11], [12]. However, invalid parameters containing irrelevant or redundant information not only increase the overfitting risk and pruning difficulty, but also adversely affect convergence and performance [13]-[15]. Therefore, achieving effective and robust dimensionality reduction becomes particularly critical. Filter models are popular in feature selection for high-dimensional data due to their high efficiency and low computational cost compared with wrapper and embedded models [16]. However, the accuracy of a separate filter model is not guaranteed because of its one-sided perspective and independence from the learning algorithm, so hybrid serial feature selection methods have emerged [17], [18]. Based on the feature subset selected by gain ratio (GR), Saengsiri et al. [19] further pruned the candidate features through greedy search (GS) modeled on the support vector machine (SVM). Akadi et al. [20] used a genetic algorithm (GA), a wrapper model, to obtain a more condensed candidate feature set from the subset selected by minimum redundancy maximum relevance (mRMR). However, serial refinement depends heavily on the initial selection strategy and incurs substantial time costs, so ensemble methods have aroused interest [21]-[23]. Bolón-Canedo and Alonso-Betanzos [24] introduced the basic concepts of ensemble feature selection, and discussed state-of-the-art advances and future trends. Jin et al. [25] made two improvements to the objective function of the common space model (CSP), and fused the candidate feature subsets obtained under the two objectives through Dempster-Shafer (DS) evidence theory.
Although ensemble learning has become very popular for classification, studies on ensemble feature selection remain rare. Moreover, most existing methods are neither universal nor extensible. Therefore, this study develops an improved ensemble feature selection based on DT (EFS-DT) strategy for KPI prediction. The ensemble feature selection fuses multi-criteria filtering results to retain more informative variables at less time cost.
Ensemble feature selection fuses the variable scores from different selectors, i.e., feature selection models. It helps enhance the robustness of feature selection and obtain prediction results closer to unbiased [21], [26]. Decision fusion is a strong tool for realizing ensemble feature selection, and DS evidence theory shines in this field due to its compact computation and freedom from prior knowledge [27], [28]. However, it is prone to counterintuitive results when dealing with the inevitable conflicts of non-homologous decisions [29]. Unfortunately, differences in criteria between selectors often cause significant divergences in feature scores, known as evidence conflicts in decision fusion. This makes reckless fusion infeasible. Therefore, conflict management has received widespread attention from scholars [30], [31]. Li et al. [32] took the consistency of evidences as the basis for weighting, and modified the fusion rules. Song and Deng [33] developed a conflict management method based on evidence belief divergence and information volume. However, most existing conflict management methods focus only on the evidence itself and ignore the credibility of the evidence sources. Since the performance of each selector differs, it is not reasonable to treat all evidences equally. Therefore, this study proposes a novel weight distribution strategy that combines the credibility and similarity of evidences for conflict management. Credibility analysis, based on the evaluation results of the selectors, focuses the fusion on evidences with high reliability. Similarity analysis, based on support, focuses the fusion on evidences with high reputation. Stacking the two makes the final decision more accurate and robust.
The main contributions of this paper are threefold: (1) An improved EFS-DT strategy is developed for KPI prediction, which implements preliminary evaluation and revaluation of variable scores. (2) Ensemble feature selection is utilized to fuse multi-criteria filtering results to retain more informative variables with less time cost. (3) A novel weight distribution strategy is proposed, which uses credibility analysis and similarity analysis for conflict management.
The remainder of this paper is structured as follows: In Section II, variables are scored and modeled based on the multi-criteria selectors. Section III implements KPI prediction based on ensemble feature selection. Section IV gives an overview of the overall framework. Section V applies the proposed method to a real shield tunneling case in China and produces the experimental results. Section VI concludes this paper.

II. MULTI-CRITERIA SCORING AND MODELING
In this section, multiple selectors with different criteria evaluate the quality of variables to obtain variable scores. Then, feature selection is performed based on the scoring results of each selector. Finally, the selected variable sets are used for modeling, and the prediction performance of the models is evaluated.

A. MULTI-CRITERIA FILTERING
Multi-criteria filtering is the process in which multiple filter models with different criteria obtain their respective variable scores on the training set.
Given the input data normalized by Z-score, it is tabulated as N samples and M variables X = {x_j, j = 1, 2, ..., M} with output data y = {y_i, i = 1, 2, ..., N}. The data is split into training, validation, and test sets with the commonly used ratios of 60%, 20%, and 20%, respectively.
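The normalization and split described above can be sketched in NumPy as follows. This is a minimal illustration only; the function name and the random shuffling are our assumptions, since the text does not specify how samples are assigned to the three sets:

```python
import numpy as np

def zscore_split(X, y, ratios=(0.6, 0.2, 0.2), seed=0):
    """Z-score normalize X column-wise, then shuffle and split into train/val/test."""
    rng = np.random.default_rng(seed)
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)   # Z-score per variable
    idx = rng.permutation(len(Xn))
    n_tr = int(ratios[0] * len(Xn))
    n_val = int(ratios[1] * len(Xn))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_val], idx[n_tr + n_val:]
    return (Xn[tr], y[tr]), (Xn[va], y[va]), (Xn[te], y[te])
```

For time-series shield data, a chronological (unshuffled) split may be preferable to avoid leakage; the sketch uses shuffling only for generality.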
In regression tasks, the quality of variables can be evaluated according to information, distance, and dependence [34]. Therefore, three selectors based on these criteria are used in this study: mRMR, RReliefF, and Pearson's correlation coefficient (PCC). In essence, all of them strive to preserve relevance (x_i-y) and remove redundancy (x_i-x_j) [35]. Their respective evaluation criteria are as follows:

\max_{x_j \in X - S_m} \Big[ I(x_j; y) - \frac{1}{m} \sum_{x_i \in S_m} I(x_j; x_i) \Big] \quad (1)

where S_m contains the m selected variables, and the remaining variable set is X - S_m. I(a; b) represents the mutual information between a and b [36]. This information-based mRMR method adopts an incremental search strategy, so its variable ranking, denoted as Rank_m, is the order in which variables are selected.
w_V = \frac{p_{diffL|diffV} \, p_{diffV}}{p_{diffL}} - \frac{(1 - p_{diffL|diffV}) \, p_{diffV}}{1 - p_{diffL}} \quad (2)

where w_V is the quality score of variable V, and p_{diffL|diffV}, p_{diffV}, and p_{diffL} are probabilistic approximation terms based on relative distance [37]. The variable ranking of this distance-based RReliefF method is the descending order of w, denoted as Rank_R.
r_{x_j, y} = \frac{\sum_{i=1}^{N} (x_{ij} - \bar{x}_j)(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_{ij} - \bar{x}_j)^2} \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}} \quad (3)

where r_{x_j, y} represents the PCC between x_j and y, and \bar{x}_j and \bar{y} are the means of x_j and y, respectively. The variable ranking of this dependence-based PCC method is the descending order of r, denoted as Rank_P.

The preliminary scores of the above selectors can be calculated by Eqs. 1-3. However, the scoring rules are inconsistent due to differences in the criteria and perspectives of the filter models. Specifically, the incremental selection strategy of mRMR cannot guarantee that each newly selected variable scores lower than the previous one, whereas the scores and gaps of both RReliefF and the PCC method gradually decrease with the ranking. Therefore, the reciprocal of the ranking is used as the variable score in this study, which possesses the property that both the score and the gap gradually decrease with the ranking:

Score_{sm} = \frac{1}{Rank_{sm}} \quad (4)

where Rank_{sm} and Score_{sm} represent the ranking and reciprocal score assigned by filter model s to the m-th variable, respectively, and s belongs to {mRMR, RReliefF, PCC method} in this study. The variable scores obtained from selectors with different criteria broaden the one-sided perspective of a separate filter model through the subsequent ensemble.
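As a concrete illustration of Eqs. 3-4, the following sketch computes PCC-based raw scores for each variable and converts any selector's raw scores into reciprocal-rank scores (the function names are ours, not the paper's):

```python
import numpy as np

def pcc_scores(X, y):
    """Absolute Pearson correlation (Eq. 3) of each column of X with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(r)

def reciprocal_rank_scores(raw_scores):
    """Eq. 4: the top-ranked variable scores 1, the second 1/2, the m-th 1/m."""
    order = np.argsort(-raw_scores)            # descending order of raw scores
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(raw_scores) + 1)
    return 1.0 / ranks
```

Note that `reciprocal_rank_scores` applies to any of the three selectors, since only the ranking (not the raw score scale) enters the ensemble.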

B. MODELING AND EVALUATING
According to the scores evaluated by each selector, different subsets of variables can be determined for modeling. Evaluating a selector is to evaluate the performance of the variable set selected by it in modeling and prediction.
Classification and regression tree (CART) makes DT competent for KPI prediction. The growth of a regression tree is a recursive process of constructing a piecewise constant approximation by minimizing error [38]. To be self-contained, we summarize the main steps of regression DT as follows. First, traverse variable j and splitting point s to find the optimal splitting pair (j, s) that satisfies:

\min_{j,s} \Big[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \Big] \quad (5)

where x_{ij} is the j-th variable value of sample x_i, R_1(j, s) and R_2(j, s) are the left and right regions of the j-th variable divided by s, and c_1 and c_2 are the predictions on the two regions. Then, the partition and the corresponding outputs are determined by (j, s):

\hat{c}_d = \frac{1}{N_d} \sum_{x_i \in R_d(j,s)} y_i, \quad d = 1, 2 \quad (6)

where N_d is the number of samples in R_d and \hat{c}_d represents the optimal prediction of the d-th region, i.e., the mean of the labels in that region.
Repeat the above steps in the new sub-regions until the termination conditions are met. Finally, the input space is divided into D partitions, and the predicted output is denoted as follows:

\hat{y} = f(x) = \sum_{d=1}^{D} \hat{c}_d \, \mathbb{I}(x \in R_d) \quad (7)

where \mathbb{I}(\cdot) is the indicator function.

In the training set, the top h (h ≤ M) variables in the ranking results of each filter model in Section II-A are selected to build the DT-based KPI prediction model. Their respective prediction error metrics on the validation set are obtained to form the performance metrics matrix (PMM):

PMM = [p_{sj}]_{S \times J} \quad (8)

where S and J are the numbers of filter models and performance metrics, respectively, and p_{sj} represents the j-th performance metric of filter model s. Three minimal metrics are used in this study: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), defined as follows:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \quad (9)

MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| \quad (10)

MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{|y_i| + \varepsilon} \quad (11)

where \varepsilon is an arbitrarily small but strictly positive number to avoid undefined results, and y and \hat{y} are the ground-truth and estimated target values, respectively.

The PMM is the evaluation of the filtering results, which reflects the quality of each selector. However, a separate selector has limited effect on improving the prediction accuracy. During iteration, a wrapper model recalibrates the selected variables based on the prediction errors to ensure accuracy. Inspired by this, we incorporate the resulting PMM into the subsequent ensemble analysis to assist in the revaluation of variable scores. This not only points the way in favor of the ensemble, but is also less cumbersome than the wrapper models.
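The three validation metrics of Eqs. 9-11 are straightforward to implement; a minimal sketch, with ε exposed as a keyword argument:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean square error (Eq. 9)."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    """Mean absolute error (Eq. 10)."""
    return np.mean(np.abs(y - yhat))

def mape(y, yhat, eps=1e-8):
    """Mean absolute percentage error (Eq. 11); eps avoids division by zero."""
    return np.mean(np.abs(y - yhat) / (np.abs(y) + eps))
```

Stacking these per-selector validation errors row by row yields the S x J PMM of Eq. 8.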

III. KPI PREDICTION BASED ON ENSEMBLE FEATURE SELECTION
In this section, the filtering results of the selectors with different criteria are integrated. Integration is completed by decision fusion. Conflict management, i.e. revision, is required before fusion. The process of re-evaluating variable scores through fusion and selecting high-quality variables for modeling is ensemble feature selection. Finally, KPI prediction is implemented based on the result of ensemble feature selection.

A. REVISION OF EVIDENCES
In the revision stage, the proposed credibility analysis and similarity analysis are used to manage the inevitable conflicts between variable scores.

1) CREDIBILITY ANALYSIS
Credibility analysis utilizes the PMM from Section II-B to analyze the performance differences of the selectors. First, the minimal metrics are forwarded so that larger values indicate better performance, yielding a new PMM:

PMM' = [p'_{sj}]_{S \times J} \quad (12)

where p'_{sj} represents the j-th positive performance metric of filter model s. Specifically, if metric j is RMSE, then p'_{sj} = 1/p_{sj} forwards the metric. Then, each column is divided by its own 2-norm to obtain the standardized PMM:

\tilde{p}_{sj} = \frac{p'_{sj}}{\sqrt{\sum_{s=1}^{S} (p'_{sj})^2}} \quad (13)

Next, the ideal and nadir solutions are determined:

p^+ = (p^+_1, p^+_2, \cdots, p^+_J), \quad p^+_j = \max\{\tilde{p}_{1j}, \tilde{p}_{2j}, \cdots, \tilde{p}_{Sj}\} \quad (14)

p^- = (p^-_1, p^-_2, \cdots, p^-_J), \quad p^-_j = \min\{\tilde{p}_{1j}, \tilde{p}_{2j}, \cdots, \tilde{p}_{Sj}\} \quad (15)

i.e., the ideal and nadir solutions correspond to the maximum and minimum values of each column, respectively. The credibility of a selector can be described by the relative distance between its performance metrics and the ideal and nadir solutions:

d^+_s = Dis(\tilde{p}_s, p^+), \quad d^-_s = Dis(\tilde{p}_s, p^-) \quad (16)

w^{Cre}_s = \frac{d^-_s}{d^+_s + d^-_s} \quad (17)

where w^{Cre}_s represents the credibility of filter model s, whose value range is [0, 1]; the closer to 1, the more credible the selector. Here \tilde{p}_s = (\tilde{p}_{s1}, \tilde{p}_{s2}, \cdots, \tilde{p}_{sJ}), and the distance is calculated by

Dis(a, b) = \sqrt{\sum_{l=1}^{L} (a_l - b_l)^2} \quad (18)

where L is the length of a and b. The larger the distance, the lower the proximity and similarity. The variable score vector modified by the credibility analysis is as follows:

\widetilde{Score}_s = w^{Cre}_s \cdot Score_s \quad (19)

Therefore, credibility analysis puts more weight on selectors with high credibility. In other words, the higher the historical accuracy of a selector, the more dominant the results it produces, and the more accurate the KPI prediction model based on the ensemble results will be.
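The credibility weighting above follows a TOPSIS-style ranking over the PMM. A minimal NumPy sketch, under the assumption that all J metrics are error-type so that forwarding reduces to a simple reciprocal:

```python
import numpy as np

def credibility_weights(pmm):
    """TOPSIS-style credibility weights from an (S x J) matrix of error metrics.
    Errors are inverted so larger = better (Eq. 12), column-normalized (Eq. 13),
    then scored by relative closeness to the ideal solution (Eqs. 14-17)."""
    p = 1.0 / pmm                                   # forward the minimal metrics
    p = p / np.linalg.norm(p, axis=0)               # divide each column by its 2-norm
    ideal, nadir = p.max(axis=0), p.min(axis=0)     # per-column max / min
    d_plus = np.linalg.norm(p - ideal, axis=1)      # distance to ideal solution
    d_minus = np.linalg.norm(p - nadir, axis=1)     # distance to nadir solution
    return d_minus / (d_plus + d_minus)
```

The returned vector lies in [0, 1] per selector; multiplying each selector's score vector by its weight implements Eq. 19.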

2) SIMILARITY ANALYSIS
Similarity analysis utilizes distance to measure the support between evidences. The distance metrics matrix (DMM) between variable score pairs after credibility analysis can be obtained through Eq. 18:

DMM = [d_{sj}]_{S \times S}, \quad d_{sj} = Dis(\widetilde{Score}_s, \widetilde{Score}_j) \quad (20)

where d_{sj} represents the distance between the s-th and j-th variable score vectors. Then, the average difference between the s-th variable score and the other variable scores is calculated as follows:

\bar{d}_s = \frac{1}{S - 1} \sum_{j=1, j \neq s}^{S} d_{sj} \quad (21)

The support degree is the reciprocal of the average difference:

Sup_s = \frac{1}{\bar{d}_s} \quad (22)

The correction weight of the similarity analysis can be calculated by normalizing the support degree as follows:

w^{Sim}_s = \frac{Sup_s}{\sum_{s=1}^{S} Sup_s} \quad (23)

The variable score vector modified by the similarity analysis is as follows:

\widehat{Score}_s = w^{Sim}_s \cdot \widetilde{Score}_s \quad (24)

Therefore, the similarity analysis puts more weight on variable scores with high reputation (supported by more evidences). In other words, the more a piece of evidence is supported by the other evidences, the more dominant it is. The divergence between the revised evidences is weakened, making the ensemble results more robust.
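The support-based weighting of Eqs. 20-23 can be sketched as follows; the input is assumed to be an S x M matrix of credibility-revised score vectors, one row per selector:

```python
import numpy as np

def similarity_weights(scores):
    """Support-based weights (Eqs. 20-23): evidences lying closer (on average)
    to the other evidences receive larger weights."""
    S = scores.shape[0]
    # Eq. 20: pairwise Euclidean DMM (diagonal is zero)
    dmm = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    avg_dist = dmm.sum(axis=1) / (S - 1)       # Eq. 21: mean distance to the others
    support = 1.0 / avg_dist                   # Eq. 22: support degree
    return support / support.sum()             # Eq. 23: normalized weights
```

Note the division in Eq. 22 assumes no two evidences are identical; a small epsilon could be added to `avg_dist` for robustness.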

B. FUSION OF EVIDENCES
In the fusion stage, a unified conclusion is obtained by decision fusion. Before this, the revised variable scores need to be summed and normalized as follows:

\overline{Score}_m = \frac{\sum_{s=1}^{S} \widehat{Score}_{sm}}{\sum_{m=1}^{M} \sum_{s=1}^{S} \widehat{Score}_{sm}} \quad (25)

where \overline{Score} = [\overline{Score}_1, \overline{Score}_2, \cdots, \overline{Score}_M] and \sum_{m=1}^{M} \overline{Score}_m = 1. The purpose of normalization is to convert the variable scores into a probability distribution for the subsequent fusion.
DS evidence theory is increasingly widespread in decision fusion. It combines multiple specific evidences to derive abstract fusion conclusions within the frame of discernment [27]. The fusion rule of DS evidence theory, also known as Dempster's combination rule, is as follows:

M(A) = \frac{1}{1 - K} \sum_{B \cap C = A} M_1(B) M_2(C), \quad A \neq \emptyset; \quad M(\emptyset) = 0 \quad (26)

K = \sum_{B \cap C = \emptyset} M_1(B) M_2(C) \quad (27)

where \emptyset is the empty set and K is the conflict factor. M_1(B) and M_2(C) range over [0, 1] and represent the confidence of the first and second evidences (probability distributions) in propositions B and C, respectively. M(A) is the trust in proposition A resulting from the fusion of the two evidences. In this study, the score of an individual variable is the confidence in that variable's proposition; specifically, \overline{Score}_m is the quality score of the m-th variable, i.e., M(m). Without correction, S - 1 fusions need to be performed, so fusion is still performed S - 1 times after correction for fairness:

F(Score) = \overline{Score} \oplus \overline{Score} \oplus \cdots \oplus \overline{Score} \quad (S - 1 \text{ fusions}) \quad (28)

where \oplus represents the fusion operation of Eq. 26, and F(Score) is the comprehensive quality score of all variables.
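Since every proposition here is a singleton variable (the focal elements are mutually exclusive), Dempster's rule in Eqs. 26-27 reduces to a normalized elementwise product. A minimal sketch with illustrative function names:

```python
import numpy as np

def dempster_fuse(m1, m2):
    """Dempster's rule (Eqs. 26-27) for two BPAs over mutually exclusive
    singleton propositions: B ∩ C = A only when B = C = A, so the combined
    mass is the normalized elementwise product; K is the conflicting mass."""
    prod = m1 * m2
    K = 1.0 - prod.sum()              # conflict factor (Eq. 27)
    return prod / (1.0 - K)           # renormalize by the non-conflicting mass

def fuse_repeated(score, times):
    """Fuse a normalized score vector with itself `times` times (Eq. 28)."""
    out = score.copy()
    for _ in range(times):
        out = dempster_fuse(out, score)
    return out
```

Each fusion sharpens the distribution toward the dominant variables, which is why repeating it S - 1 times amplifies the comprehensive ranking.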
So far, the ensemble has been completed by decision fusion. The variables that stand out in the comprehensive variable rankings will be used to build KPI prediction models.

C. KPI PREDICTION
The variable subsets with cardinality h can be selected from the training, validation and test sets of X according to F(Score), denoted as I train h , I val h , and I test h . The corresponding real KPIs are y train , y val , and y test , respectively. h is determined at the lowest point of the curve where the performance of the model on the validation set varies with the number of retained variables.
In the prediction phase, I train h and y train are fed into the DT model for training. The model is then verified on I val h and y val. After adjusting h, retraining and revalidation are performed to obtain the curve of model performance versus the number of retained variables; the h corresponding to the lowest point of the curve is the number of variables to retain. When unseen data I test h arrive, the predicted output ŷ test of the KPI can be obtained through Eq. 7. Ultimately, the prediction performance can be evaluated and compared in terms of Eqs. 9-11 and time consumption.
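Selecting h from the validation curves (combined with the Max-Min normalization used later in Section V) can be sketched as follows. We assume candidate values h = 1, ..., n and non-constant metric curves; the function name is ours:

```python
import numpy as np

def choose_h(metric_curves):
    """metric_curves: (n_metrics x n_h) validation errors, column k for h = k + 1
    retained variables. Each metric is Max-Min normalized to [0, 1]; the best h
    minimizes the sum of the normalized metrics (the 'comprehensive optimum')."""
    lo = metric_curves.min(axis=1, keepdims=True)
    hi = metric_curves.max(axis=1, keepdims=True)
    norm = (metric_curves - lo) / (hi - lo)     # assumes hi > lo per metric
    return int(np.argmin(norm.sum(axis=0))) + 1
```

Normalizing each metric before summing prevents a large-scale metric such as RMSE from dominating the smaller-scale MAPE.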

IV. SOLUTION OVERVIEW
The flowchart of improved EFS-DT for KPI prediction is depicted in Fig. 1, which mainly includes multi-criteria filtering, modeling and evaluating, revision and fusion of evidences, and KPI prediction. In order to avoid the risks of dimensionality disasters, overfitting caused by high model complexity, and high computational cost of serial feature selection, the ensemble idea runs through the whole paper. In the stage of multi-criteria filtering, multiple filter models with different criteria evaluate and score variables according to their own standards. In the stage of modeling and evaluating, the variables selected according to the filtering results are fed into the DT model, and the PMM is obtained. In the stage of evidence revision, conflict management is performed through credibility analysis and similarity analysis. In the stage of evidence fusion, the revised variable scores are fused by Dempster's combinational rule to determine the variable subset that contributes more to KPI prediction. Finally, the DT-based KPI prediction model is built and the prediction performance is evaluated.

V. EXPERIMENTAL STUDIES

A. CASE BACKGROUND
Tunnel boring machines (TBMs) are widely used in underground projects due to their efficiency, safety, and economy. Since the introduction of TBMs in the 1950s, engineers and contractors have been committed to studying their mechanisms, the operation analysis of core components, and TBM-soil interactions. The development of sensors and data analysis technology has made it possible to evaluate the KPI of a TBM, i.e., the advance rate (AR). However, the operator cannot control AR directly, but only indirectly by adjusting other parameters, such as thrust, torque, etc. Therefore, identifying positive soft-sensing parameters and realizing accurate KPI prediction are critical to project safety and the construction schedule.
The application case in this study is the right line of the double-lane tunnel crossing the Q river in Hangzhou, China, which connects the north and south shores. The design mileage is YK0+000.000∼YK3+586.566, of which the shield section is 1830 m long (YK1+000.000∼YK2+830.000). Fig. 2 depicts the cross-section and overview of the Q river tunnel project. According to Articles 3.1.1∼3.1.4 of the ''Geotechnical Engineering Survey Code'' (2009 edition), this project is a river-crossing tunnel of importance level I. The site has diverse rock and soil types, poor uniformity, and variable geology, so the foundation complexity is level II (medium).

B. EXPERIMENT RESULTS AND DISCUSSION
The performance of a TBM is closely related to the complex geological environment. However, precise geological surveys, limited by above-ground construction, water depth, and cost, are a luxury. Therefore, this study divides the geology into different soil units based on the geological exploration report, laying the foundation for the feature engineering and data analysis of Sections II-III in each soil unit. Next, the experiments are analyzed and compared from three aspects: stratigraphic division, feature selection, and KPI prediction. The experiments are performed on a Windows Server with a dual 3.60-GHz CPU and 16 GB of RAM.

1) THE STUDY OF STRATIGRAPHIC DIVISION
In this project, the exploration points are 20∼25m apart, and the hole depth is 1.5∼2 times the tunnel diameter below the tunnel buried depth. However, the difference in the sampling frequency of long-interval geological exploration and operating parameters (1 sample every 6 sec) makes data supplement or alignment infeasible. Nevertheless, TBMs often exhibit distinct performance on different cross-sections, but analogous behavior on similar cross-sections at different locations. Therefore, this study divides the strata into four major engineering soil units (ESUs) based on the geological exploration report. The sections S 1 to S 4 of ESUs are presented in Fig. 2. As shown in the figure, there are 915 rings in the project. S 1 is composed of descending section from 0∼147 rings and ascending section from 801∼915 rings, S 2 is 148∼380 rings, S 3 is 381∼554 rings and S 4 is 555∼800 rings. The geological composition of each ESU is diverse, but the soil properties within individual units are similar. On this basis, ensemble feature selection can be implemented and KPI prediction models can be built in different soil units. Geotechnical conditions are also implicit in the analysis.

2) THE STUDY OF FEATURE SELECTION
The parameters collected by TBM have direct or indirect influence on the tunneling quality mainly concentrated in three modules. They are the main drive electrical module, the propulsion module and the mud conveying module. Parameters that do not contribute to the TBM KPI prediction are excluded from candidates, such as binary power flags, component positions, and parameters that are valid only in a specific mode. In particular, weekend mode is rarely used on-site. The final candidates for feature selection are the remaining 99 sensor parameters. The sensor types involved are listed in Table 1, one of which may correspond to multiple sensors in different locations.
However, feeding all the candidate parameters into the KPI prediction model is computationally intensive. Therefore, the improved EFS-DT strategy is implemented to find the raw parameters in each ESU that contribute to KPI prediction. The number of retained parameters in each ESU is determined by the performance on the validation set. The predictive performance metrics include RMSE, MAE, and MAPE. The results are shown in Fig. 3. To facilitate the discovery of the comprehensive optimum, the three minimal prediction performance metrics are mapped to [0, 1] by Max-Min normalization. The comprehensive optimum, i.e., the minimum sum, is obtained at S 1 = 32, S 2 = 24, S 3 = 22, and S 4 = 30, respectively. The dimension is reduced to less than one third of the original number.
The variable scores and selection results under the four ESUs are shown in Fig. 4. The green bars indicate the selected variables, and the # in the horizontal-axis ticks separates the sensor type and project number. As shown in Fig. 4, the scores of variables ranked below 40 are generally less than 0.1. Adding low-quality variables not only adversely affects the prediction, but also increases the burden on the learning algorithm. Conversely, retaining too few variables provides insufficient information, which also leads to performance degradation. In Fig. 4, there are significant diversities in the number and type of retained variables across units, indicating the geological differences and the necessity of distinctive analysis. Variable scores and gaps gradually decrease with ranking, which is consistent with intuitive perception. In all units, the score of AR (F11 in Table 1) is the highest, indicating that it contributes the most to KPI prediction. Since the data fed into the model are current samples, it is not surprising that the KPI prediction for the next moment appears autoregressive, i.e., F11 is retained. Overall, the proposed method effectively reduces the data dimension, thus revealing more granularly how the raw parameters affect the KPI.

3) THE STUDY OF KPI PREDICTION
After the ensemble, the variable subset of each ESU used to build the KPI prediction model is determined. To verify the effectiveness of this strategy, the proposed method is compared with separate filter models and conventional serial feature selection methods. The results on the test set are listed in Table 2. In the time columns, + separates the time costs of the two stages: the time for feature selection, and the time required for modeling and evaluation on the validation set using the selected variables. To ensure the fairness of the comparative experiment, the proposed method and all other compared methods use the DT model with default hyperparameters for KPI prediction.
As shown in Table 2, omitting feature selection saves time in the screening process. However, it shifts the burden to modeling and cannot eliminate the negative impact of irrelevant or redundant information on prediction accuracy. The prediction errors of the three separate filter models in all ESUs are reduced compared with no feature selection. The reduction in time cost during the modeling and evaluation stage is also obvious, roughly a factor of 2∼3. Interestingly, the most mundane of the three filter models, the PCC method, not only consumes less selection time but also achieves considerable performance. This indicates that variables with a strong linear correlation with AR in this study are not only easier to map to the labels, but also contribute greatly to the prediction.
Compared with the filter models alone, the prediction errors of the serial methods are further reduced. This is attributed to serial feature selection relieving the one-sidedness of a separate filter model. However, these methods cannot escape their inherent defects, i.e., trading time for accuracy and depending on the reliability of the initial feature selection result. Compared with the most time-consuming filter model, the proposed method takes only about 4 s more to achieve a greater improvement, while the serial methods take 3 or even 10 times more time. This is because once the revision weights are determined in the validation phase, the computation required for fusion is minimal. The proposed method is far ahead in prediction accuracy. Moreover, since the proposed method is a fusion of filter models with natural stability, it is not as elusive as heuristic methods. Therefore, it is more conducive to locking in the soft-sensing parameters with positive effects. In summary, the effectiveness and feasibility of the proposed method in feature selection are verified.
To verify the effectiveness of the proposed revision and fusion scheme, comparison results with other revision and fusion methods are listed in Table 3. Since the fusion time is minimal, time cost is not involved in this comparison. In the table, soft voting and DS evidence theory directly integrate the variable scores of the three filter models, i.e., the evidences, without revision. Compared with the results in Table 2, reckless fusion is worse than the separate filter models; the RMSE of DS evidence theory in S 1 and S 2 is even close to that of no selection. The existing revision methods hardly improve, and even slightly impair, the prediction accuracy. This is because those revision strategies are adopted without considering the reliability of the evidence sources, and less credible evidence deals a devastating blow to the fusion. The proposed method implements evidence revision through credibility analysis and similarity analysis. The results demonstrate that the improved EFS-DT strategy retains more of the informative parameters that the filter models attend to from different perspectives, and achieves more accurate predictions. Overall, the effectiveness and feasibility of the proposed method in conflict management and fusion are verified.

VI. CONCLUSION
In this paper, an improved EFS-DT strategy is developed for KPI prediction. On the one hand, it integrates multi-criteria filtering results to retain more informative variables with less time cost. On the other hand, it uses credibility analysis and similarity analysis for conflict management to achieve more accurate and robust prediction. Finally, the feasibility and effectiveness of the proposed method are evaluated on a real shield tunnel case in China. The experimental results indicate that the proposed method can achieve effective dimensionality reduction and accuracy improvement with as little time overhead as possible.

ACKNOWLEDGMENT
The authors would like to thank those who provided data about the operation process of the shield machine, and Shouzhi Guo, Lei Guo, and Yao Ma for their kind help.
FULIN GAO received the B.S. degree in electrical engineering and automation from Ludong University, China, in 2019. He is currently pursuing the master's degree in control science and engineering with the School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China.
His research interests include information fusion, operation status assessment of industrial process, and fault monitoring and diagnosis.
SHUAI TAN received the B.S. degree in automation and the Ph.D. degree in control theory and control engineering from Northeastern University, China, in 2005 and 2012, respectively.
She is currently an Associate Professor with the East China University of Science and Technology, Shanghai, China. Her research interests include operation state evaluation for complex industrial process, fault monitoring and diagnosis, and machine learning of image information.
HONGBO SHI received the B.E. degree in chemical automation and the Ph.D. degree in control theory and control engineering from the East China University of Science and Technology, China, in 1986 and 2000, respectively.
He was the 2003 Shu Guang Scholar of Shanghai. He is currently a Professor with the East China University of Science and Technology. His research interests include modeling of industrial process and advanced control technology, theory and methods of integrated automation systems, and condition monitoring and fault diagnosis of industrial process.
YANG TAO received the B.E. degree in automation from Zhengzhou University, Zhengzhou, China, in 2015. He is currently pursuing the Ph.D. degree in control theory and control engineering with the East China University of Science and Technology, Shanghai, China.
His research interests include feature extraction, process operating performance assessment, fault detection, and fault diagnosis.
BING SONG received the B.E. degree in automation and the Ph.D. degree in control theory and control engineering from the East China University of Science and Technology, Shanghai, China, in 2012 and 2017, respectively.
He is currently an Associate Professor with the Department of Automation, East China University of Science and Technology. His research interests include feature extraction, fault detection, fault diagnosis, and multimode process monitoring.