Feature Engineering and Artificial Intelligence-Supported Approaches Used for Electric Powertrain Fault Diagnosis: A Review

The electric powertrain, which comprises the electric machine, transmission unit, inverter, battery packs, etc., is a highly integrated system. Its reliability and safety affect not only industrial costs but, more importantly, human safety. This review is the first contribution to comprehensively summarize both the feature engineering methods and the artificial intelligence (AI) algorithms (including machine learning, neural networks and deep learning) used in electric powertrain condition monitoring and fault diagnosis. Specifically, this paper systematically divides the AI-supported method into two main steps: feature engineering and the AI approach. On the one hand, it introduces data and feature processing in AI-supported methods; on the other hand, it summarizes the input signals, feature methods and AI algorithms used in the reported cases. First, this review offers guidance on choosing an appropriate feature engineering method in further research. Second, the up-to-date AI algorithms adopted for powertrain health monitoring are presented in detail. Finally, the current approaches are discussed and future trends are proposed.


I. INTRODUCTION
The electric powertrain system plays an extremely important role in both daily life and industry, and it is attracting increasing attention and research. Especially in electric vehicles (EVs), which account for an increasing proportion of vehicles, human safety depends on the reliability and safety of the powertrain components. Historical data show that reliability issues may appear in any component of the powertrain system. For example, in a permanent magnet synchronous motor (PMSM), which is widely used in industry and in EVs, the permanent magnet of the rotor can be demagnetized by high temperature during operation; the gearbox and bearings closely connected to the motor can cause safety hazards through wear or damage; and in the power electronics, DC-bus capacitor or switch faults in the inverter may occur for reasons such as ageing and over-stress. Faults of mechanical or electronic components threaten the safety of drivers and passengers. Therefore, real-time condition monitoring and fault detection of the EV powertrain (Fig. 1) are increasingly important, high-priority tasks [1]. To date, various technologies have been applied to monitor the health condition of electric powertrains. The commonly used methods are vibration analysis (VA) [2], [3], electromagnetic monitoring [4][5][6], motor current signature analysis (MCSA) [7], [8], thermal monitoring [9], [10], infrared monitoring [11], [12], sound signal analysis [13], [14], etc. In [15], a combinational-logic-based method is proposed to identify possible faults in EV powertrains. However, such methods are not sensitive to fault time and cannot predict faults in advance. AI is an excellent tool that is frequently employed to improve the performance of MCSA, VA [2], [3], thermal monitoring [16], etc.
In recent years, MCSA has become increasingly popular among motor monitoring technologies. Transient motor current signature analysis (TMCSA), which relies on the starting current of the motor, provides more accurate fault information than MCSA, especially in non-ideal situations such as the application of multiple actuators, unstable power supply voltage, and load torque oscillations [17]. Both are widely used to detect various faults of various motors and even gearboxes [7], [8]. Nevertheless, VA remains the mainstream of AI-supported motor bearing and gearbox fault detection. AI-supported approaches have received strong attention from both industry and academia due to their low cost, model independence and superior performance. In industry, electric powertrain systems require condition monitoring systems to prevent catastrophic faults and damage to major industrial machines, as well as to enable predictive maintenance. For example, in the automotive industry, AI helps make the electric powertrain of a vehicle more transparent and visible so that possible faults can be predicted in advance. Driven by the rapid development of machine learning and deep learning, AI has therefore played an important role in solving many types of problems in powertrain systems. A variety of AI algorithms have been applied to electric powertrains in the past few decades, such as expert systems [18], fuzzy logic algorithms [19] and artificial neural networks (ANNs). In recent years, more algorithms have been applied in the field of powertrain health monitoring. Improved versions of classic ML algorithms such as K-nearest neighbor (K-NN), decision tree (DT) and support vector machine (SVM), and even ensemble algorithms (Fig. 2) such as adaptive boosting (AdaBoost) and random forest (RF), are often employed.
These methods classify the time-domain, frequency-domain or time-frequency-domain features that feature extraction methods obtain from the measured signal (such as current or vibration). The feature extraction methods they require are introduced in Section II. Meanwhile, neural networks and deep learning methods can avoid the feature extraction step to a certain extent and are becoming increasingly popular among researchers. The neural-network-based models roughly include the multilayer perceptron (MLP), convolutional neural network (CNN), deep belief network (DBN), recurrent neural network (RNN) (including the long short-term memory network (LSTM) and the gated recurrent unit (GRU)), autoencoder (AE), etc. Several papers have reviewed AI-supported condition monitoring and fault diagnosis techniques. The review in [20] covers only the fault diagnosis of induction machines (IMs), not the complete EV powertrain, and is now outdated. The review in [21] covers fields so broad that none of them can be introduced in detail. The review in [22] addresses AI fault diagnosis methods from the perspective of rotating machinery, but the AI algorithms involved are extremely limited. This paper aims to present a comprehensive review of recent research on the whole process of AI-based condition monitoring and fault detection in the EV powertrain system. First, this paper systematically divides AI-supported condition monitoring and fault diagnosis into two main steps, feature engineering and AI algorithms. In particular, this review analyses the literature from the feature engineering perspective, not just from the AI algorithm point of view. In feature engineering, the raw signal is processed for feature extraction and dimensionality reduction, and the result is used as the input of the AI algorithm.
The application cases of these methods are reviewed, and their advantages and shortcomings are pointed out. Then, the EV powertrain is divided into four components, and applications of the up-to-date AI algorithms are analysed in detail together with their input signals and feature engineering methods. In recent years, almost no review has focused on AI-supported EV powertrain condition monitoring and fault diagnosis. This review aims to guide further research in this field. The framework of the review is shown in Fig. 3. The rest of the paper is organized as follows. Section II introduces the feature engineering methods from the feature extraction and feature selection points of view. Section III summarizes the application of AI algorithms in the four components of the EV powertrain. Section IV presents a discussion of the current AI algorithms adopted in EV powertrain health monitoring, the future trends of algorithm application, and the future improvement of neural networks. A general conclusion is given in Section V.

II. FEATURE ENGINEERING
Feature extraction is usually used to analyse waveform signals such as current and vibration signals [23], [24]. However, if the extracted features do not all contribute to fault classification, feature selection is needed to find representative features, which reduces the feature dimension and improves the efficiency of fault diagnosis [23].

A. FEATURE EXTRACTION
The step after data preprocessing is often to extract features from the vibration signal, current signal, etc., in order to obtain the fault information contained in the signal. The best-known and most important method in signal processing is the Fourier transform (FT), which is widely used because it transforms a signal back and forth between the time domain and the frequency domain and helps analyze the components of the signal. For a signal that is discretized into sampling points, the FT becomes the discrete Fourier transform (DFT). However, because the DFT requires a large amount of calculation and has high computational complexity, the earliest version of the fast Fourier transform (FFT) was proposed by J. Cooley and T. Tukey in 1965 [25], which greatly reduced the amount of calculation. This allows computers to process signals more quickly, thereby promoting the rapid development of communications and signal processing. The FFT has greatly advanced fault diagnosis methods based on Fourier analysis [26]. Yang et al. [27] apply the FFT in MCSA to analyze the stator current data of IMs and call the obtained feature the FFT-ICA feature of the stator current. Romero-Troncoso et al. [28] use an improved FFT after fractional resampling for IM periodic monitoring tasks. In [29], FFT banded RMS values are fed into a CNN for gearbox fault identification. The FFT is also used to extract the fault spectrum characteristics of the three-phase current when diagnosing multiple insulated gate bipolar transistor (IGBT) open-circuit faults and current sensor faults in a three-phase pulse width modulation inverter [30]. Where necessary, the quantum Fourier transform (QFT) can offer an exponential speed-up over the FFT [31]. The short-time Fourier transform (STFT) addresses the FT's loss of all time-domain information. E. H. E. Bouchikhi [32] applies the STFT to analyse the IM stator current in order to diagnose bearing faults.
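As an illustration of frequency-domain feature extraction, the following minimal sketch computes a one-sided magnitude spectrum and reduces it to band-wise RMS values, in the spirit of the banded RMS features mentioned above. It is not the implementation of any cited work: the function names, the sampling rate, the test signal and the number of bands are all assumptions chosen for the example, and a direct DFT is used only to keep the sketch dependency-free (an FFT returns the same values faster).

```python
import cmath
import math

def dft_magnitudes(signal):
    """One-sided magnitude spectrum (scaled by 1/N) from a direct O(N^2) DFT."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags

def banded_rms(mags, n_bands):
    """RMS of the spectrum in equal-width bands; leftover trailing bins are ignored."""
    width = len(mags) // n_bands
    feats = []
    for b in range(n_bands):
        band = mags[b * width:(b + 1) * width]
        feats.append(math.sqrt(sum(m * m for m in band) / len(band)))
    return feats

# Hypothetical test signal: 5 Hz and 20 Hz components sampled at 64 Hz for 1 s,
# so that DFT bin k corresponds to k Hz.
fs = 64
signal = [math.sin(2 * math.pi * 5 * t / fs) + 0.5 * math.sin(2 * math.pi * 20 * t / fs)
          for t in range(fs)]
mags = dft_magnitudes(signal)
features = banded_rms(mags, 4)  # four band-RMS values usable as classifier inputs
```

In this toy case the first band (containing the 5 Hz line) and the third band (containing the 20 Hz line) carry almost all of the spectral content, which is the kind of compact representation a downstream classifier can consume.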
The continuous wavelet transform (CWT) introduces a mother wavelet function to achieve dynamic resolution in the time domain and the frequency domain. The CWT provides the complete time-frequency information of a signal and avoids, as far as possible, losing information from the original signal [33]. The discrete wavelet transform (DWT) is a spectrum analysis tool that discretizes the scale and translation of the basic wavelets. It can observe not only the frequency-domain characteristics of local time-domain processes but also the time-domain characteristics of local frequency-domain processes, so even non-stationary processes can be transformed and processed well. The DWT can extract the features used in fault diagnosis methods, but when the fault characteristics of motor bearings are extracted, key information may be lost near the fault characteristic frequency. Some research on IMs shows that the DWT can make fault identification more accurate than the traditional FFT [34]. Therefore, it is one of the time-frequency-domain techniques preferred by researchers. In [35], the authors use the DWT to diagnose faults during motor transient operation. Frequency-domain methods often fail to detect faults from non-stationary signals, such as bearing vibration signals [36].
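To make the idea of wavelet-based features concrete, the sketch below performs a multi-level DWT with the Haar wavelet and returns the energy of each sub-band. The Haar wavelet, the function names and the toy signal are assumptions made for illustration; the cited works generally use other wavelets and dedicated libraries.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_dwt_energies(x, levels):
    """Energies of the detail bands (finest first) followed by the final approximation."""
    energies = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies

# Hypothetical signal: a constant level with one short transient at sample 5.
signal = [1.0] * 16
signal[5] = 3.0
energies = haar_dwt_energies(signal, levels=3)
total = sum(energies)
relative = [e / total for e in energies]  # relative sub-band energies as a feature vector
```

Because the Haar transform used here is orthonormal, the sub-band energies add up to the energy of the original signal, so the relative energies form a normalized feature vector. The short transient shows up in the detail bands, whereas the constant level is captured by the approximation band.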
The wavelet transform (WT) is a very successful time-frequency-domain method for EV powertrain health monitoring, and it can also reduce noise in a noisy working environment. However, the WT does not split the high-frequency band, where the modulation information of machine faults exists. The wavelet packet transform (WPT) can extract key components through band-pass filters that decompose the signal into different levels, which helps extract fault feature information correctly. In [39], the authors extract fault features from the time domain, statistical features and the frequency domain through the WPT, and then use the CAA algorithm to improve feature correlation according to the feature weights. The dimensionality of the fused features is then reduced through principal component analysis (PCA), and a support vector machine (SVM) serves as the diagnostic classifier. This is the basic process of a fault diagnosis method based on fault features. Through its multi-scale time-frequency analysis, the WPT yields a variety of wavelet packet feature quantities: energy, fluctuation coefficient, skewness, and margin factor. To pursue higher diagnostic accuracy, T. Gao [40] uses generalized discriminant analysis (GDA) to fuse wavelet packet features and eliminate redundant information. However, WPT feature extraction may fail to extract effective fault information because it lacks adaptability [41], [42]. The WFM method combines the WPT with manifold learning, which can suppress background noise in the time-frequency domain and enhance the signal [43]. The small-amplitude transient pulse at the beginning of a bearing failure can be extracted by the WFM method [44].
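The statistical quantities named above can be computed from the coefficients of any sub-band (or from the raw signal). The sketch below computes energy, skewness and margin factor. The margin factor is assumed here to follow its common definition (peak amplitude divided by the squared mean of the square-rooted absolute values); the fluctuation coefficient is omitted because its definition varies between papers. The function name and the sample data are hypothetical.

```python
import math

def band_indicators(coeffs):
    """Energy, skewness and margin factor of a list of coefficients or samples."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in coeffs) / n)
    energy = sum(v * v for v in coeffs)
    # Skewness: third standardized moment (0 for a symmetric distribution).
    skewness = sum((v - mean) ** 3 for v in coeffs) / (n * std ** 3) if std > 0 else 0.0
    # Margin factor (assumed definition): peak / (mean of sqrt(|x|))^2.
    peak = max(abs(v) for v in coeffs)
    root_amplitude = (sum(math.sqrt(abs(v)) for v in coeffs) / n) ** 2
    margin_factor = peak / root_amplitude if root_amplitude > 0 else 0.0
    return {"energy": energy, "skewness": skewness, "margin_factor": margin_factor}

smooth = band_indicators([1.0, -1.0, 1.0, -1.0])   # regular oscillation
impulsive = band_indicators([0.0, 0.0, 0.0, 4.0])  # single impulse
```

The impulsive sequence yields a much larger margin factor and a positive skewness, which is why such indicators are sensitive to the impacts produced by localized defects.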
Methods such as the FFT, STFT and Wigner-Ville distribution (WVD) are more suitable for processing linear signals with high stability. The WT has great advantages in analyzing non-stationary and non-linear signals, but it also has shortcomings, such as the restriction of defect information to certain bands and the need to select a basis function [45]. In addition, empirical mode decomposition (EMD), local mean decomposition (LMD) [46] and other adaptive methods are applied to the time-frequency-domain analysis of powertrain fault signals. EMD is a signal decomposition method that can adaptively decompose any signal into a set of intrinsic mode functions (IMFs) with different frequency characteristics. This is a major advancement in analysing non-stationary signals. Ge et al. proposed a rolling bearing fault diagnosis method based on ensemble EMD (EEMD) [47], wavelet semi-soft threshold (WSST) signal reconstruction and multi-scale entropy (MSE). First, EEMD decomposes the bearing vibration signal into IMFs, and the Pearson correlation coefficient is used to select the high-frequency IMFs that contain more noise. The WSST method then denoises these high-frequency IMFs for signal reconstruction. Finally, the feature vector is constructed by calculating the MSE value of the reconstructed signal. In [48], the authors use the EMD-SVM method to diagnose neutral-point-clamped (NPC) three-level inverters. EMD-based feature extraction is widely used in the intelligent diagnosis of bearings and rotating machinery in EVs [49], [50], and of the stator as well [51]. However, EMD methods decompose the signal according to its own time-scale characteristics. If the signals collected from different sensors are processed without a pre-set basis function, the position of the fault feature in the EMD result will be uncertain.
Therefore, the best choice is to use WT related methods with a Author Name: Preparation of Papers for IEEE Access (February 2017) VOLUME XX, 2017 2 fixed wavelet basis function to decompose all signals when using deep learning diagnosis methods. Empirical wavelet transform (EWT) combines the adaptability of EMD decomposition and the advantages of WT, which is a useful adaptive tool for vibration signal processing and can also decompose the original signal into different modes [52]. In the paper [53], there is a hybrid automatic bearing fault detection method that combines EWT and fuzzy logic system (FLS) to locate the early degradation of the bearing state under different working conditions. Paper [54] proved that EWT is more effective than EMD in the diagnosis of rolling bearings. However, the EWT method still has some problems. On the one hand, it is difficult to determine adaptive and robust boundaries of the EWT segments; on the other hand, the filtered signal still has noise and redundant vibrations, which will mask weaker fault features, which is not conducive to detecting early-stage faults [55]. There is also spectral kurtosis (SK) analysis in time-frequency domain technology that can handle both stationary and nonstationary signals [56].
To take advantage of the powerful ability of the convolutional neural network (CNN) (Fig. 4) to extract features automatically, time-domain vibration signals that contain information on one or more types of faults are transformed into two-dimensional (2D) grey-scale images through the continuous wavelet transform (CWT) [57]. In addition, the CNN is good at processing the continuous wavelet transform scalogram (CWTS) generated by the CWT. S. Guo [58] adds a Pythagorean spatial pyramid pooling (PSPP) layer to the top layer of the CNN, so that the fault features obtained by the PSPP layer from the CWTS (Fig. 5) can be passed to the convolutional layer below for secondary extraction. Feature extraction methods related to neural networks include the AE, but traditional AEs cannot stably obtain various meaningful signals from vibrations. The sparse autoencoder (SAE) is another neural-network-based method for fault feature extraction. The authors of [59] proposed using a normalized sparse autoencoder (NSAE) to construct a local connection network (LCN), namely NSAE-LCN. NSAE-LCN overcomes two shortcomings of traditional autoencoders: the learned features may be similar to one another, and shift-variant properties may lead to misclassification of the machine health state.
Entropy-related methods are also an important tool for feature extraction in fault diagnosis, and many derivative methods now exist. Multiscale diversity entropy (MDE), a novel method based on diversity entropy (DE), extends DE to multiscale analysis by combining it with the coarse-graining process, giving a comprehensive feature description [60]. This entropy method is designed for fault diagnosis of rotating machinery such as rotors and gearboxes [61]. In [62], frequency band entropy (FBE), based on information entropy (IE) and the STFT, is introduced to extract the fault feature frequency of rolling element bearings. In [63], this method is improved by a WPT based on the Daubechies wavelet.
Some entropy methods play an auxiliary role in the construction of feature vectors. Permutation entropy (IPE), multi-scale entropy (MSE), multiscale permutation entropy (MPE), weighted permutation entropy (WPE), and fine-sorted dispersion entropy (FSDE) are calculated on signals reconstructed by EMD and variational mode decomposition (VMD) [64], [65]. The resulting entropy values form the feature vectors that serve as the input data and classification basis for subsequent AI fault classification methods such as variants of the support vector machine (SVM) [66][67][68][69][70].
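As an example of an entropy feature, the sketch below computes the basic (normalized) permutation entropy of a time series. It implements only the plain ordinal-pattern definition, not the improved, multiscale or weighted variants listed above; the function name, the embedding order and delay, and the test signals are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns of length `order` found in x."""
    n_windows = len(x) - (order - 1) * delay
    patterns = Counter()
    for i in range(n_windows):
        window = [x[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index permutation that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        patterns[pattern] += 1
    h = -sum((c / n_windows) * math.log(c / n_windows) for c in patterns.values())
    if normalize:
        h /= math.log(math.factorial(order))  # maximum entropy over order! patterns
    return h

ramp = [float(i) for i in range(100)]          # fully regular signal
rng = random.Random(0)                         # fixed seed for repeatability
noise = [rng.random() for _ in range(500)]     # irregular signal
pe_ramp = permutation_entropy(ramp)
pe_noise = permutation_entropy(noise)
```

A regular signal produces a single ordinal pattern and therefore an entropy of zero, whereas an irregular signal approaches the normalized maximum of one. Computing such values on each decomposed mode gives one feature per mode for the classifier.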
A summary of the pros and cons of the methods in the literature reviewed above is shown in TABLE I.
Principal component analysis (PCA) prepares the dataset before fault diagnosis by converting the existing features to a low-dimensional representation, which reduces the feature space and avoids high-dimensional data redundancy [81], [82]. PCA uses a set of basis functions to minimize the error of the data model. Independent component analysis (ICA) has been combined with the kernel technique to improve feature extraction for condition monitoring and fault diagnosis in IMs. ICA is formulated in the kernel-induced feature space and developed through a two-phase kernel ICA algorithm: whitening using kernel principal component analysis (kernel PCA), followed by ICA. Kernel PCA spheres the data and makes the data structure as linearly separable as possible by virtue of an implicit nonlinear mapping determined by the kernel. ICA then seeks the projection directions in the kernel-PCA-whitened space that make the distribution of the projected data as non-Gaussian as possible. In [82], the effect of the choice of kernel function on classification performance is presented to show the excellent characteristics of the kernel function. This research also demonstrates that the clustering of features obtained using ICA is better than that obtained using PCA. In [83], a machine-learning-based fault diagnosis method for single variable frequency drive (VFD)-fed IMs has been developed. Gharavian et al. [84] compared FDA-based and PCA-based feature selection results in the fault diagnosis of EV gearboxes. PCA may not be the optimal feature reduction method for a specific problem. PCA and ICA are the most popular unsupervised feature dimensionality reduction techniques. For feature selection through supervised learning, linear discriminant analysis (LDA) and the random forest decision tree (RFDT) can be used [85].
The difference is that PCA follows the direction of maximum variance for optimal reconstruction, whereas LDA obtains the optimal low-dimensional representation of the original data set by maximizing the between-class scatter matrix while minimizing the within-class scatter matrix. Since LDA operates on labelled information, it can obtain better results than PCA when there are sufficient labelled samples [86]. LDA has been used in many fault detection works. Many studies explore the combination of LDA and MCSA for various components, with online monitoring of operating conditions through voltage and current analysis and estimation of fault severity from the magnitude of the sub-harmonic amplitude. Paper [87] addresses the difficulty of detecting faults with sub-harmonics in the MCSA-LDA method by using harmonic amplitudes instead of sub-harmonics as the fault detection and classification features. Jin et al. [86] proposed a combination of trace ratio linear discriminant analysis (TR-LDA) and VA, which was applied to the bearing fault analysis of IMs and brushless DC motors. TR-LDA is a variant of LDA based on the trace ratio (TR) criterion. It is an excellent solution to the TR problem and can be extended to deal with the non-Gaussian data sets encountered in many real-world fault diagnosis problems. LDA can also be used in the global spectrum analysis of vibration signals to improve the fault diagnosis of ball bearings [88].
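The contrast between PCA and LDA can be illustrated with a small two-dimensional, two-class example. In the sketch below, the PCA direction is the principal axis of the pooled scatter matrix, and the LDA direction is the classical two-class Fisher solution, proportional to the inverse within-class scatter matrix times the difference of the class means. The data points, class names and function names are hypothetical and chosen so that the largest variance lies along an axis that does not separate the classes.

```python
import math

def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about `center`."""
    sxx = sum((p[0] - center[0]) ** 2 for p in points)
    sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
    syy = sum((p[1] - center[1]) ** 2 for p in points)
    return sxx, sxy, syy

def pca_direction(points):
    """Unit vector along the maximum-variance axis (unsupervised)."""
    sxx, sxy, syy = scatter2(points, mean2(points))
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # principal axis of a symmetric 2x2 matrix
    return (math.cos(theta), math.sin(theta))

def lda_direction(class0, class1):
    """Unit Fisher direction w ~ Sw^-1 (m1 - m0) (supervised, two classes)."""
    m0, m1 = mean2(class0), mean2(class1)
    a0, b0, c0 = scatter2(class0, m0)
    a1, b1, c1 = scatter2(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1           # within-class scatter Sw
    det = a * c - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    wx, wy = (c * dx - b * dy) / det, (-b * dx + a * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)

# Hypothetical features: feature 1 varies strongly but is uninformative,
# feature 2 varies little but separates the two classes.
healthy = [(-4.0, -0.1), (-2.0, 0.1), (0.0, 0.0), (2.0, 0.1), (4.0, -0.1)]
faulty = [(-4.0, 0.9), (-2.0, 1.1), (0.0, 1.0), (2.0, 1.1), (4.0, 0.9)]

pca_w = pca_direction(healthy + faulty)
lda_w = lda_direction(healthy, faulty)
project = lambda p, w: p[0] * w[0] + p[1] * w[1]
lda_healthy = [project(p, lda_w) for p in healthy]
lda_faulty = [project(p, lda_w) for p in faulty]
```

Here PCA selects the high-variance first feature, along which the classes overlap completely, while LDA selects the second feature, along which the projected classes are cleanly separated. This is the sense in which label information lets LDA outperform PCA when enough labelled samples are available.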
In addition, AI-based optimization algorithms, such as particle swarm optimization (PSO) and differential evolution (WBDE) [89], perform outstandingly in feature selection. Lee et al. [90] showed experimentally that PSO is the key technology for finding the optimal weights of damage-sensitive feature vectors in the bearing system. In [91], the genetic algorithm (GA) is used to reduce the number of features by selecting the most important ones from the feature database, which reduces power consumption and the computational complexity of the model. In [92], the GA is combined with the random forest (RF) algorithm to produce a new method, RFOGA, for machine fault detection, which is more accurate than RF alone.

III. AI ALGORITHMS FOR CONDITION MONITORING AND FAULT DETECTION OF ELECTRIC POWERTRAIN SYSTEM
Commonly used AI algorithms fall into three levels: machine learning, neural networks and deep learning. This section introduces their applications in EV powertrains. Fig. 6 shows the complete flowchart from the original signal to the fault or prediction result (not all variants of the feature methods and AI algorithms are listed in the figure). Tables II, III, IV and V show the main contributions and the applied AI algorithm of each paper, together with the signals and feature methods used.
The electric machine is an important part of the electric powertrain system. For instance, the permanent magnet synchronous motor (PMSM) [93] plays an important role in many industrial applications such as EV traction and steering, robotics and wind power due to its high efficiency, reliability, wide working range and high torque density [87]. Induction machines are also popular in EV powertrains [94]. Therefore, ensuring their correct operation is the key task of motor fault diagnosis.
Motor faults are described below in the following order: bearing faults, rotor faults, and stator faults.

1) BEARING FAULTS
Most AI-supported diagnosis of bearing faults (Fig. 7) relies on the bearing vibration signal, and a few methods use motor current signals [7], rotor speed [107], acoustic emission (AE) [108], [109], etc. For bearing diagnosis based on VA, MCSA, etc., signal processing and feature extraction must be applied to the vibration or stator current signal. The most used feature extraction and selection techniques have been described in Section II. New AI-based feature extraction methods are increasingly being proposed and applied to bearing fault identification. Among them, H. Shi [95] proposed the SDLSTM algorithm, which combines a stacked denoising autoencoder (SDAE) with a long short-term memory (LSTM) model to detect anomalies in the vibration signal of the bearing. In [96], the authors propose a deep wavelet autoencoder (DWAE) based on the extreme learning machine (ELM). By replacing the nonlinear activation function with a wavelet function, the DWAE gains an unsupervised feature learning ability that captures the characteristics of the signal. Researchers are interested in the CNN because it combines powerful automatic feature extraction with classification: a number of convolutional and pooling layers extract the fault features, and the subsequent fully connected layers perform fault classification. In [57], a CNN based on LeNet-5 extracts fault feature information from the time-domain vibration signal, and RF classifiers classify the extracted multilevel features, which contain both local and global information. Rather than using the CNN directly for classification, this approach follows it with a machine learning classifier: the feature maps of different CNN layers are used to train independent RF classifiers, and a winner-take-all strategy yields the best classification result.
Similarly, in [58], the authors propose a new fault diagnosis method based on the continuous wavelet transform scalogram (CWTS) and a Pythagorean spatial pyramid pooling convolutional neural network (PSPP-CNN). Its novelty is that it addresses the low accuracy of existing methods for variable-speed bearing fault diagnosis and diagnoses bearing faults over the full speed range. Fig. 8 illustrates an advanced fault diagnosis process based on the CNN. It does not use a 2-D CNN directly to classify the image inputs; instead, the CNN is followed by a machine learning classifier. To overcome the drawback that the pooling operation of the CNN loses a lot of useful fault feature information, H. Li [80] proposed a new bearing fault diagnosis model, CNNEPDNN, based on an ensemble of DNN and CNN, which combines features with different discriminative powers to identify faults; the validity of the method was verified using the bearing data from the Case Western Reserve University (CWRU) bearing data center. Other recent papers applying CNNs to bearing fault diagnosis include [103][104][105][106]. Among them, a 1-D CNN is used for both signal feature extraction and classification in [103] and [105]. The CNN is used only as a classifier in [104], whereas in [106] and [80] the CNN is used for feature extraction and an SVM classifier identifies the type of fault. Although the CNN is powerful, it suffers from the vanishing gradient problem as the number of layers increases. According to [101] and [100], the residual network can solve the vanishing gradient problem of the CNN.
Commonly used machine learning classifiers such as SVM and RF are not good at extracting fault features and are usually used only for fault classification. However, they can be connected to a CNN structure for direct fault classification [106], [57]. Alternatively, feature extraction methods can be used to extract the fault features manually from the initial signal, after which machine learning classifiers perform the classification [2], [7], [98], [108]. There are also many other excellent designs for bearing fault diagnosis. For example, in [107], k-NN determines the fault type according to a specific graph distance metric in the graph-mapped spectrum (GMS). M. Kuncan [99] diagnoses bearing faults by training a grey relational analysis (GRA) model. M. Kang [109] uses the GA to optimize the classification accuracy of discriminative feature analysis (DFA) and applies time-varying and multiresolution envelope analysis (TVMREA). According to [107], rotor speed can also be used as a signal source for fault diagnosis, and in a variable-speed environment the accuracy of absolute-value-based PCA (AVPCA) based on rotor speed is even significantly higher than that of vibration-based AVPCA.
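The second route described above, in which hand-crafted features are passed to a classical classifier, can be sketched with a minimal k-nearest-neighbor classifier using the Euclidean distance. This is a generic illustration only, not the graph-distance k-NN of [107]; the feature values (assumed to be RMS and kurtosis), the class labels and the function name are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training samples closest to `query`."""
    distances = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (RMS, kurtosis) feature vectors extracted from vibration segments.
# In practice the features should be scaled so that no single one dominates the distance.
train_features = [
    (0.10, 3.0), (0.12, 3.1), (0.11, 2.9),   # healthy bearing
    (0.30, 6.0), (0.35, 6.5), (0.32, 5.8),   # bearing with an outer-race fault
]
train_labels = ["healthy"] * 3 + ["outer_race_fault"] * 3

prediction_a = knn_predict(train_features, train_labels, (0.31, 6.1))
prediction_b = knn_predict(train_features, train_labels, (0.11, 3.05))
```

The classifier itself contains no feature learning: its accuracy depends entirely on how well the extracted features separate the fault classes, which is why the feature engineering step of Section II matters for this family of methods.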

2) ROTOR FAULTS
There have been a large number of studies on rotor fault diagnosis, and the methods are diverse, including data-driven methods, magnetic-flux-based methods [4], [5], and model-based methods. The broken rotor bar (BRB) is a common fault in IMs. Among machine learning algorithms, the SVM is an advanced technique for classifying rotor conditions. Paper [110] combines directed acyclic graph SVMs (DAGSVMs) with the recursive undecimated wavelet packet transform (RUWPT) to classify BRB faults in IMs. In multi-class classification problems, the multi-class SVM (MSVM) constructs an optimal hyperplane to decide between classes, and sufficient outliers are needed to improve its performance. With the addition of support vector data description (SVDD), the accuracy on rotor bar fault detection is higher than that of the traditional MSVM method [111]. NN-based algorithms are widely used for BRB detection [112]. In addition, a DNN can complete the feature extraction and fault classification of the rotor system under unsupervised conditions [113]. In [113], the authors convert raw vibration signals into vibration images, as is also done in [114]. The difference is that the former designs a DBN-based model, whereas the latter designs a direct-connection-based CNN (DC-CNN). To make use of the CNN, the vibration signal can also be converted into a symmetrized dot pattern (SDP) image [115]. Researchers in the field of bearing faults have also shown increasing interest in the image feature extraction capability of the CNN in recent years. In addition to the vibration image, the displacement-based shaft vibration image (SVI) and even the thermal image can be used to diagnose bearing faults when fed into a CNN-based model [116], [16].
In the PMSM in particular, the rotor permanent magnet (PM) demagnetizes to varying degrees as the temperature rises. Among traditional methods, Ding et al. propose a temperature estimation method that uses model reference fuzzy adaptive control to estimate the PM temperature [4]. In this method, a model reference adaptive system (MRAS) is established to estimate the PM flux linkage, a fuzzy control method is introduced to improve the accuracy of the parameters estimated by the MRAS and its applicable speed range, and finally the flux linkage and PM temperature are derived from the back electromotive force (BEMF). Paper [5] likewise predicts the rotor temperature by estimating the PM flux linkage. Current research on rotor temperature prediction through neural networks and other AI algorithms is very limited because AI-related methods are driven by large amounts of data. It is very difficult to measure the rotor temperature accurately in experiments, so a large and highly reliable data set cannot be provided for the AI algorithm.

3) STATOR FAULTS
Stator fault types include inter-turn faults, dynamic eccentricity faults, current sensor faults, insulation faults, open-circuit faults, short-circuit faults, turn faults, etc. Traditional stator detection methods are mature, such as magnetic flux [6] and infrared thermography (IRT) based diagnostic technology, which can effectively detect stator winding faults, turn-to-turn faults, and cooling system faults. PCM et al. [6] suggest the magnetic flux and vibration signals as the basis for distinguishing the main electrical faults of the stator winding (turn-to-turn short circuit and unbalanced power supply). In papers [11], [117], IRT images are analyzed thermally to monitor inter-turn faults and cooling system failure in IM. Turning to AI approaches: according to the survey in [6], the inter-turn short-circuit (ITSC) of the stator winding accounts for one-third of all IM faults, second only to bearing faults, which account for two-fifths. Paper [118] applies the orthogonal least squares regression (OLSR) algorithm to reduce the size and computational cost of the neural network: the least squares regression algorithm finds the best size of the probabilistic neural network (PNN), and the dual tree complex wavelet transform (DTCWT) extracts the features of the vibration signal. There is also a stator-current-based method for ITSC faults: paper [119] uses the discrete wavelet energy ratio (DWER) as the classification feature fed to an improved Elman Neural Network (ENN), which can effectively detect ITSC faults. For stator faults, the most common signal source is the stator current [119][120][121][122][123][124]. In the early years, AI-based stator fault research mainly relied on improved ANN models [118], [120], [121], [124]. In recent years, CNN-related articles have appeared [122].
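The energy-ratio idea behind wavelet features such as DWER can be illustrated with a small sketch. The exact DWER definition and the wavelet used in [119] are not given here, so the code assumes an orthonormal Haar wavelet and defines each ratio as the sub-band energy divided by the total energy. The helper names `haar_dwt` and `energy_ratios` and the synthetic signal are hypothetical.

```python
import math


def haar_dwt(signal, levels):
    """Return [detail_1, ..., detail_L, approx_L] of an orthonormal Haar DWT."""
    bands, approx = [], list(signal)
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        bands.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    bands.append(approx)
    return bands


def energy_ratios(bands):
    """Energy of each sub-band divided by the total energy of all sub-bands."""
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]


# Synthetic stand-in for a stator current: a slow component plus a fast ripple.
current = [math.sin(2 * math.pi * k / 32) + 0.2 * math.sin(2 * math.pi * k / 4)
           for k in range(64)]
features = energy_ratios(haar_dwt(current, 3))  # 3 detail bands + 1 approximation
```

The resulting ratios sum to one and form a compact feature vector that could be fed to a classifier such as an ENN or an SVM.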
On the other hand, fewer researchers have used classic machine learning classifiers to diagnose stator faults in recent years. Notably, [123] designs a novel SVM structure to detect ground faults and inter-turn faults of the stator winding. The authors extract the features of the two faults by setting different frequency bands of the Stockwell transform (ST) and feed them into two separate SVM models.
From the perspective of the motor as a whole, AI algorithms can serve further functions. For example, the RNN-based LSTM network can estimate the remaining useful life (RUL) of the motor [125]. B. Wang [126] proposes a multiscale convolutional attention network (MSCAN) that can also achieve accurate RUL prediction. In addition, multi-sensor signal fusion is a possible way to detect faults in different parts of the motor at the same time [127].
... [130], multiview sparse filtering (MVSF) [8], deep recursive dynamic principal component analysis (Deep RDPCA) [131], Resonance Residual Technique [132] and deep learning based methods [133], [134].
In the ML field, C. Li [135] introduced a multimodal deep support vector classification (MDSVC) approach for gearbox fault diagnosis, in which a support vector classifier (SVC) is used to fuse Gaussian-Bernoulli deep Boltzmann machines (GDBMs) in different ways to build the MDSVC model. Among the classic ML classifiers, SVC and its derived classifiers are valued by researchers for their excellent performance. Paper [136] proposed a gearbox fault detection method based on the wavelet support vector machine (WSVM) and the immune genetic algorithm (IGA). Other studies also use support vector machines [137][138][139]. In [3], an optimized k-NN algorithm identifies the severity of gear cracks under different motor speeds and loads. Paper [140] also applies a k-NN-based algorithm to gear fault identification. The Decision Tree (DT), as a basic classifier, can convert complex gearbox failure data into simple decision-making processes to extract rules [141]. Deep learning, which can analyze fault features at a deeper level than classic classification algorithms, has also achieved great success in gearbox classification tasks. The BP neural network is a classic and widely used neural network. Cheng et al. [142] use the IQPSO algorithm, which is based on the PSO algorithm, to optimize the classic BP neural network; the trained IQPSOBP achieves higher accuracy on gearbox faults than PSOBP. In [143], the authors compare the fault detection performance of classic ML algorithms with the proposed 2-D CNN architecture (Fig. 9). The input data comes from multi-sensor fusion information. This two-dimensional network analyzes the current signals of multiple current sensors in real time and can be used directly to classify fault types, partially eliminating the need for complicated intermediate processing.
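As an illustration of how a k-NN classifier assigns a severity class to a feature vector, a minimal sketch follows. It is a plain majority-vote k-NN with Euclidean distance, not the optimized variant of [3]; the function name `knn_predict`, the two features (assumed here to be RMS and kurtosis of a vibration signal) and all numeric values are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda item: item[0],
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (RMS, kurtosis) features with gear-crack severity labels.
train_x = [(0.10, 3.0), (0.12, 3.1), (0.30, 4.5),
           (0.32, 4.8), (0.60, 7.9), (0.65, 8.3)]
train_y = ["healthy", "healthy", "small crack",
           "small crack", "large crack", "large crack"]

severity = knn_predict(train_x, train_y, (0.31, 4.6), k=3)
```

In practice the features would usually be normalized first, because Euclidean distance is dominated by whichever feature has the largest numeric range.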
A new multiscale convolutional neural network (MSCNN) [144] has also been proposed to identify the health status of gearboxes; it can be used for both feature extraction and fault classification. Reference [29] is another study that uses a CNN to diagnose gearboxes.
Ref. | Signal | Feature method | AI algorithm | Remarks
[130] | Vibration | - | mRVM | To design two features: accumulative amplitudes of carrier orders (AACO) and energy ratio based on difference spectra (ERDS)
[141] | Vibration | - | DT | Calculation process is simple and fast by using DT
[8] | Motor current | MVSF | SVM | To implement easily; to propose MVSF for integrating multi-view feature representation effectively
[137] | Stator and/or rotor current | - | Multiclass SVM | To define different time domain and frequency domain features of gearbox faults in DFIG stator and rotor current signals; to use decision-level information fusion for different solutions
[138] | Rotor current | HT | AE, SVM | To design angle resampling algorithm; to design and apply a new type of deep structure classifier
[139] | Motor current | - | Radial basis function kernel-SVM | To be used under variable speed conditions; to integrate an adaptive signal resampling algorithm, a frequency tracker and a feature generation algorithm

C. AI APPROACHES FOR DETECTING FAULTS OF INVERTER
Due to the rapid growth of the EV and renewable-energy-related industries, increasing attention is being paid to low withstand voltage, high-power inverters such as multilevel inverters. Most inverters use insulated gate bipolar transistors (IGBTs) as power devices because of their high input impedance and low on-state voltage drop. The condition of inverter components such as power transistors is closely related to equipment reliability. Current inverter diagnosis technologies mainly include model-based methods, expert systems, and AI methods [145]. At present, relatively few fault types can be monitored and diagnosed by AI-supported approaches. Fig. 10 shows the failure rate of IGBTs and when the AI-based approach should be used. The most widely used ML algorithm for IGBT faults is SVM. In [146], FFT extracts the fault features of the inverter's output voltage, RPCA reduces the dimension of the feature space, and an SVM-based classifier determines the fault type. Moosavi et al. [147] use WT to extract waveform features from the three-phase current to recognize IGBT open-circuit faults in the DC/AC inverter of an electric vehicle, and compare the diagnostic performance of the MLP, SVM, SOM, and K-means algorithms; the results show that SVM and MLP perform better. For detecting faults in NPC inverters, Cheng et al. [148] design a least squares support vector machine with gradient information (G-LS-SVM) to classify fault voltage signals sparsely represented by compressive sensing theory, Chen et al. [149] use a multi-layer SVM to identify the upper, middle and lower bridge voltage signals, and Hu et al. [150], in contrast, use a neural network as the fault classifier.
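Frequency-domain feature extraction of the kind used ahead of an SVM can be sketched as follows. This is only an assumed illustration, not the pipeline of [146]: a direct DFT over exactly one fundamental period yields the DC term and the first harmonic amplitudes as a feature vector. The function name `harmonic_features` is hypothetical, and the half-wave-clipped sinusoid is a purely synthetic stand-in for a distorted waveform.

```python
import cmath
import math


def harmonic_features(samples, n_harmonics=3):
    """Amplitudes of the DC term and the first harmonics of a window that
    spans exactly one fundamental period (direct DFT, no windowing)."""
    n = len(samples)
    features = []
    for h in range(n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * h * k / n)
                    for k, x in enumerate(samples)) / n
        # The DC bin is used as is; other bins are doubled to give amplitudes.
        features.append(abs(coeff) if h == 0 else 2 * abs(coeff))
    return features


n = 64
healthy = [math.sin(2 * math.pi * k / n) for k in range(n)]
distorted = [max(0.0, x) for x in healthy]  # synthetic half-wave-clipped signal

healthy_features = harmonic_features(healthy)      # close to [0, 1, 0, 0]
distorted_features = harmonic_features(distorted)  # DC and 2nd harmonic appear
```

The two feature vectors differ clearly in the DC and second-harmonic entries, which is the kind of separation a downstream classifier relies on; a dimensionality-reduction step would sit between this stage and the classifier.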
The trend shows that researchers increasingly favor the deep analysis offered by neural networks to detect open-circuit faults of the inverter [151][152][153][154]. For instance, an ANN is used for the online monitoring of IGBTs [155]. Paper [156] ...
Although an IGBT open-circuit fault will cause the entire system to collapse due to overheating, it is relatively easy to detect in terms of timing. An IGBT in the inverter can withstand short-circuit current only for a short period, sometimes only a few milliseconds. Therefore, short-circuit faults are more difficult to detect through algorithms.
The battery system is the energy source of the powertrain and therefore plays an important role. As an energy storage device, it is also more hazardous, and accidents may occur under extreme environmental conditions. It is therefore very important to estimate and diagnose the condition of the battery system. Machine learning and neural-network-based deep learning methods are effective for fault detection and for estimating the state of health (SOH), state of charge (SOC), remaining useful life (RUL) and state of energy (SOE). The main fault types include internal short circuit (ISC) faults, external short circuit (ESC) faults, overcharge/over-discharge faults, inconsistency within the battery pack, connection faults, insulation faults, thermal management system faults, sensor faults, battery management system (BMS) hardware faults, contactor faults, etc. Battery system fault diagnosis methods generally include model-based methods, signal processing methods and AI-based methods. Here we focus mainly on methods based on signal processing and AI. Because the battery system has many fault types, it is necessary not only to study single-fault diagnosis but also to solve multi-fault diagnosis problems.
SVM and RF are still popular classifiers for fault detection in battery systems. Yao et al. [158] proposed an SVM-based model for intelligent diagnosis of electric vehicle battery systems. In [159], the authors use an RF-based classification method to analyze the influence of temperature and SOC on the ESC fault characteristics of lithium-ion batteries. Yao et al. [160] propose a method for classifying the fault status of lithium-ion batteries based on general regression neural networks (GRNN) and show that the voltage difference (VD) is well suited as the input feature for fault diagnosis. Paper [161] uses only voltage information to predict the current in the battery pack through a BPNN and then predicts the battery surface temperature based on a 3D electro-thermal battery model. The LSTM algorithm (an improved RNN) is frequently used for battery fault diagnosis and prediction. Hong et al. [162] use an LSTM neural network to predict the multi-forward-step voltage of electric vehicles and thereby detect voltage abnormalities of the battery system. Paper [163] is similar: it combines the equivalent circuit model (ECM) with an LSTM neural network to detect voltage abnormalities for battery fault diagnosis. Paper [164] proposes an LSTM-based method for predicting the surface temperature of lithium-ion batteries that requires neither complex mathematical modelling nor specialist knowledge of battery physics.
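Prediction-based abnormality detection of this kind generally compares the predicted voltage with the measured one and flags large residuals. The sketch below shows only that comparison step. It deliberately replaces the LSTM predictor of [162] and [163] with a trivial persistence forecast (each sample is predicted as the previous measurement); the function names, the threshold and the voltage values are hypothetical.

```python
def persistence_forecast(measured):
    """Stand-in predictor (not an LSTM): predict each sample as the previous one."""
    return [measured[0]] + measured[:-1]


def flag_voltage_anomalies(measured, predicted, threshold_v):
    """Return the indices where |measured - predicted| exceeds the threshold."""
    return [i for i, (m, p) in enumerate(zip(measured, predicted))
            if abs(m - p) > threshold_v]


# Hypothetical cell voltages in volts, with one sudden dip at index 4.
measured = [3.70, 3.69, 3.69, 3.68, 3.40, 3.67, 3.67]
predicted = persistence_forecast(measured)
anomalies = flag_voltage_anomalies(measured, predicted, threshold_v=0.1)
```

With this naive predictor both the dip and the recovery sample are flagged, because the forecast follows the outlier; a trained sequence model would be expected to flag only the dip, which is one motivation for using a stronger predictor.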
Real-time monitoring of battery health is closely related to the performance and cost of electronic equipment and electric vehicles. Proper battery management helps avoid failures and catastrophic hazards and improves the durability of the battery pack. Paper [165] is the first to combine sample entropy with sparse Bayesian predictive modelling (SBPM) to predict battery SOH. In [166], the discharge point voltage data of a lithium battery under high-temperature operation is used as the input to train a deep neural network classifier. Feng et al. [167] build a predictive diagnosis model based on SVM, in which the mapping between the battery's inherent characteristics and the support vectors is found from the historical charging data of healthy batteries; the algorithm needs only a few charging curves to achieve fast on-board SOH diagnosis. Paper [168] extracts health features (HFs) from the charging curve of the used battery and feeds them into a weighted least squares support vector machine (WLS-SVM). In [169], the authors compare data-driven ML algorithms such as SVM, RF and ANN for estimating the maximum releasable capacity of lithium-ion batteries, and optimize the ANN algorithm.
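Sample entropy, the feature used in [165], measures how irregular a time series is: it is the negative logarithm of the ratio between the number of matching template pairs of length m+1 and of length m. A minimal sketch of the standard definition follows. The function name, the absolute tolerance r and the example series are assumptions for illustration; how [165] selects the input signal and parameters is not reproduced here.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy -ln(A / B), where B and A count pairs of templates of
    length m and m + 1 whose Chebyshev distance is at most r."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    matches += 1
        return matches

    b, a = count_matches(m), count_matches(m + 1)
    if a == 0 or b == 0:
        raise ValueError("no template matches; increase r or use a longer series")
    return -math.log(a / b)


regular = [1, 2] * 5                               # perfectly periodic series
irregular = [1, 2, 1, 2, 1, 2, 1, 3, 1, 2, 1, 2]   # one disturbed sample

entropy_regular = sample_entropy(regular, m=2, r=0.1)
entropy_irregular = sample_entropy(irregular, m=2, r=0.1)
```

The periodic series gives zero entropy and the disturbed one a positive value, which is the property that makes sample entropy usable as a degradation indicator. In practice r is often chosen relative to the standard deviation of the series.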
The battery is a very sophisticated system; its SOC is affected by many factors, and the relationship is not simply linear. Accurate SOC estimation is of great significance for extending the life cycle of the battery pack and reducing the cost of charging. Traditional SOC estimation methods include the open-circuit voltage (OCV) method, the current integration method, equivalent circuit models, electrochemical models and others. The main focus here is on data-driven AI-supported methods. According to [170][171][172], SVM remains popular in the SOC field. In [170], SVM is combined with coulomb counting and Forgetting Factor Recursive Least Square (FFRLS) to estimate the SOC of a Li-S battery. The parameters of the SVM model in [171] are optimized by the PSO algorithm. Paper [172] shows that SVM can also be used for SOC estimation of high-capacity lithium iron manganese phosphate (LiFeMnPO4) battery cells. In [173], the sparse learning machine for real-time SOC estimation is also based on LS-SVM. NN-based SOC methods include [174] and [175], which use an RBF neural network and NARX-LSTM, respectively. Compared with methods such as SVM, NN-based models are generally more time-consuming, but they offer network structures that can process time-series data, such as LSTM and GRU.
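The current integration (coulomb counting) method mentioned above serves as the baseline that data-driven estimators are usually compared against or combined with. A minimal sketch, assuming a fixed sampling interval, discharge current counted as positive and a known nominal capacity, is given below; the function name and all numeric values are illustrative.

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Integrate current over time: SOC_k = SOC_{k-1} - I_k * dt / Q.

    Discharge current is positive; the result is clamped to [0, 1].
    """
    capacity_as = capacity_ah * 3600.0  # ampere-hours to ampere-seconds
    soc, trace = soc0, []
    for current in currents_a:
        soc -= current * dt_s / capacity_as
        soc = min(1.0, max(0.0, soc))
        trace.append(soc)
    return trace


# Hypothetical case: a 2 Ah cell discharged at 1 A for one hour, sampled at 1 s.
trace = coulomb_count(soc0=0.9, currents_a=[1.0] * 3600, dt_s=1.0, capacity_ah=2.0)
final_soc = trace[-1]  # about 0.4
```

Because the method is an open-loop integral, sensor bias and an inaccurate initial SOC accumulate over time, which is one reason it is often paired with a corrective estimator.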
RUL prediction depends more heavily on models that can process time-series data. Neural-network-based models such as the RNN variants LSTM and GRU are researchers' first choice, especially LSTM [176][177][178]. In addition, models such as extreme learning machines also perform well in RUL prediction [179].
As far as fault diagnosis is concerned, the two most important steps are extracting fault features from a large amount of data and classifying the faults. Accurate extraction of fault features determines whether the fault can be classified correctly. Being able to extract fault characteristics from the weak signal present when a fault first appears is the key to predicting the fault in advance. Current diagnosis methods focus on the application and innovation of feature methods, while less attention is paid to new applications of AI algorithms. AI classification algorithms also affect the speed and accuracy of fault classification, and multi-fault identification likewise relies on machine learning and deep learning. From the literature in Section III, the application trends of the main AI algorithms can be briefly described as follows:

1) K-NN
The K-NN algorithm is widely used in fault diagnosis research because of its simple structure, and it can serve as a classifier of fault features.

2) Classic machine learning classifiers (SVM, RF, etc.)
SVM has long been a classic classifier, with good classification performance and fast computation. The RF algorithm is a representative ensemble learning algorithm with excellent classification performance. If the training data set is small, these classifiers are likely to outperform deep learning classifiers based on neural networks. In recent years, however, fault diagnosis research based on this kind of algorithm has been less common than research based on neural networks.

3) ANN
Fault diagnosis models based on various ANN variants were popular in the early years and were often combined with feature extraction methods such as WT. In this type of diagnostic model, the ANN takes features extracted from the original signal as input and outputs the classification result. The ANN algorithm runs more slowly than the classic machine learning algorithms in items 1) and 2), but it has a greater advantage when the amount of training data is large.

4) CNN
CNN has attracted great interest in recent years, as reflected in the accelerating growth of CNN-related articles on fault diagnosis. This is because a CNN combines powerful automatic feature extraction with fault classification, which makes the diagnostic model more integrated. The 1-D CNN is mainly used to process the raw signals collected by sensors. The 2-D CNN is more often combined with signal-to-image conversion methods, which helps to reveal the two-dimensional correlation between fault features.

5) RNN and LSTM and GRU
RNN is good at processing time-series data and can predict system parameters and indicators from historical data. Because standard RNNs often suffer from vanishing or exploding gradients, they can be difficult to train. In recent years, LSTM has therefore been used more often in prediction tasks, such as RUL prediction for electric motors and battery systems, and SOC and SOH prediction for battery systems. GRU has fewer gates than LSTM (an update gate and a reset gate instead of input, forget and output gates) and thus fewer parameters, so it trains faster. It can perform better than LSTM on datasets with a small amount of data.

6) Cluster
Clustering algorithms such as K-Means and Fuzzy K-Means often serve two functions. On the one hand, in unsupervised or self-supervised diagnosis models, clustering algorithms can group feature vectors with similar features into one category. On the other hand, clustering algorithms also have a certain dimensionality reduction effect on features.
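The grouping role of clustering can be illustrated with a minimal K-Means sketch (Lloyd's algorithm) in plain Python. The initial centroids are passed in explicitly so that the result is deterministic; the function name `kmeans`, the 2-D feature vectors and the choice of two clusters are hypothetical and do not come from any of the reviewed papers.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical 2-D feature vectors from two operating conditions.
points = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
          (0.90, 1.00), (1.00, 0.90), (0.95, 0.95)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

Replacing each feature vector by its cluster index, or by its distances to the centroids, is one way in which clustering also acts as a coarse dimensionality reduction.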
Most of the algorithms introduced in Section III are supervised learning approaches, which usually require manually designed training datasets in practical applications. The application of unsupervised learning to health monitoring, however, needs further research. AE is a classic unsupervised feature learning method, but transfer learning is often needed to apply the trained model to other applications, and additional training is needed to adjust the parameters. Self-supervised learning is also a feasible research direction. As paper [181] shows, unsupervised learning has already been used in fault diagnosis: rather than simply extracting fault features from historical fault signals and then using labels for supervised classification, the algorithm detects the severity of multiple faults online under variable speed and load.
At present, machine health monitoring in engineering applies a variety of machine learning and deep learning algorithms, but it has not kept pace with the development of advanced algorithms such as dynamic neural networks and the transformer architecture that is very popular in natural language processing (NLP). These cutting-edge algorithms could also benefit AI-supported health monitoring, but the application of these new frameworks in engineering is lagging behind. A likely reason is that few researchers are able to transfer algorithms that are not yet fully mature in computer science to the field of engineering. For example, the attention mechanism of the transformer model, applied to fault diagnosis or to engineering more broadly, may produce better results.
Detecting a single fault in EV powertrains can no longer meet practical needs. In actual fault detection applications, more and more fault types often need to be detected. Therefore, accurate and rapid multi-fault diagnosis is also an important direction of current development. The idea of lifelong learning/continual learning from computer science can be adopted here. Traditional machine learning classification algorithms and neural networks do not always achieve good accuracy in distinguishing each fault in multi-fault detection problems. The difference between lifelong learning and transfer learning is that a model adapted by transfer learning tends to forget what it learned in previous tasks after learning a new task, whereas continual learning retains the earlier tasks and aims to make the trained model perform as well as possible on all tasks. In practical applications, updating such an algorithm does not require the old data, and the model in use can be updated quickly to extend the functions of the existing algorithms.

V. CONCLUSION
This study presents a comprehensive review of feature engineering methods and AI algorithms for EV powertrain condition monitoring and fault diagnosis. Fault diagnosis based on AI technology avoids the need for precise physical models and parameters, so realizing a data-driven automatic diagnosis system is of great significance in applications. The main contributions include:
1) This review summarizes and analyzes the feature engineering methods used in EV powertrain condition monitoring and fault diagnosis in recent years, providing a reference for method selection in further research.
2) This review summarizes the up-to-date AI algorithms adopted for condition monitoring and fault diagnosis of each EV powertrain component, drawing on application and research cases.
3) The suggested future trends may inform further research in the field of EV powertrain condition monitoring and fault diagnosis.