Convolutional Neural Network-Based Inter-Turn Fault Diagnosis in LSPMSMs

Diagnosing stator inter-turn faults in electric motors is of considerable concern because of their significant effect on industrial production. In this paper, a new method for detecting the inter-turn fault and quantifying its severity in the line start permanent magnet synchronous motor (LSPMSM) is proposed. The method monitors the stator current during the steady-state period to detect the fault. A convolutional neural network (CNN) is used to correlate the motor steady-state current with the condition of the motor winding and to detect the presence of inter-turn faults. The data used in this study are extracted from both an experimental setup of a one-horsepower LSPMSM and the corresponding verified mathematical model through several test cases under various loading conditions. One of the main features of the proposed technique is that it does not require a separate feature extraction phase. The results indicate that the proposed technique is able to detect the inter-turn fault under loading conditions ranging from 0 N·m to 4 N·m with an accuracy of 97.75% for all defined fault levels. Because it uses the steady-state current regardless of motor load, the proposed technique can detect the fault online without disturbing the system functionality and reliability and without adding any extra hardware to the system.


I. INTRODUCTION
The use of line start permanent magnet synchronous motors (LSPMSMs) in industry, such as in pumps, fans, compressors, and other constant speed applications, is in its early stages [1]. LSPMSMs are considered among the most promising types of motors due to their high efficiency, high operational power factor, self-starting capability, high power density, high operational torque, and low operational temperature [2], [3]. According to the literature, LSPMSMs are among the most efficient motors on the market, with an efficiency that meets the IE4 super-premium efficiency class [4]. Therefore, LSPMSMs are an excellent choice for applications where the reduction of energy consumption is a priority.
Due to internal and external stresses such as destruction of insulation material, inefficient cooling, voltage stress, overloading, chemical contamination, and partial discharge, motors can experience several types of faults. The significant failures are the inter-turn fault, eccentricity, broken bars, and demagnetization. According to the IEEE and Electric Power Research Institute (EPRI) surveys, the primary cause of inter-turn faults is insulation breakdown [5], [6]. Such failures affect normal manufacturing processes and operations, resulting in a significant loss of revenue. In addition, some of these faults may decrease the efficiency and reliability of the motors. Since the number of LSPMSMs used in industry is increasing, there is a subsequent need for maintenance programs. As such, it is crucial to develop a diagnostic tool that predicts faults in their early stages [1]. It is worth mentioning that the reported research on fault diagnosis of LSPMSMs has focused mostly on rotor faults and demagnetization [7]-[12]. Recently, a few research works have concentrated on stator winding faults [1], [13]. This research work is a serious attempt to close this gap and investigate the stator inter-turn fault in interior-mount LSPMSMs.
The associate editor coordinating the review of this manuscript and approving it for publication was Yu Wang.
Different raw indicators have been used in detecting electric machine faults. Among those indicators are Motor Current Signature (MCS), Instantaneous Angular Speed (IAS), Acoustic Signature (AS), and Surface Vibration Signature (VS). The use of MCS in fault detection basically depends on the decomposition of current into its harmonic components that are discriminating each type of fault. Since the current signature is sensitive to almost all motor faults, it is the most non-invasive technique used in machine faults diagnosis [14]- [16]. Moreover, the use of MCS is preferred over the other signatures since it does not need additional measurement sensors [17].
On the other hand, techniques based on instantaneous angular speed, vibration, and acoustic signatures are considered convenient for mechanical faults such as bearing defects and broken bars [17]-[21]. In [17], MCS, IAS, and VS were individually employed in detecting the broken bar fault in an induction motor. The effectiveness of these signatures was investigated with the help of the Fourier transform, and the results demonstrated that IAS outperforms the other signatures in detecting the broken bar fault. In [22], the use of the vibration signal in detecting bearing and inter-turn faults in an induction motor was investigated. The findings indicate that time-frequency analysis of the vibration signature was better at detecting bearing failures than inter-turn failures. In [23], both MCS and VS were used in detecting the inter-turn fault in a permanent magnet synchronous motor (PMSM). The authors observed that the simultaneous use of two signatures in detecting machine faults could give more accurate results. However, using more than one signature comes at the expense of additional data collection and analysis. In this paper, the use of MCS only in detecting the inter-turn fault is proposed.
Earlier research shows that various early fault detection techniques have been developed for electric motors where the signal-, model-, and knowledge-based techniques are widely implemented. These techniques exhibited good performance in fault monitoring for all types of electric motors [24]. In signal-based methods, the fault is detected by analysing the signatures directly collected from the faulty motors and compared with the healthy signatures. The analysis study is normally done by time domain, frequency domain, enhanced frequency domain, or time-frequency domain. Unfortunately, the performance of the signal-based method degrades with unknown abnormalities and unbalance conditions [25].
The model-based detection method needs a precise mathematical model for the motor. In this method, for fault detection, the field collected data from a motor is compared with the mathematical model output [24]. On the other hand, the knowledge-based model is achieved by using machine-learning tools such as fuzzy logic, artificial neural network, support vector machine, self-organizing maps or partial least squares [26]. However, the knowledge-based model requires extensive experience to perform well in detecting the motor faults. The quality of training data and the selected features are the major factors that affect the performance of the knowledge-based methods.
Artificial Neural Networks (ANN) are widely used in fault diagnosis and detection in all types of electric motors. They are characterized by fast processing capabilities, robustness, and the ability to find implicit nonlinear relations between different variables. No prior information related to motor parameters is required with ANN. Researchers have reported different types and topologies of neural networks applied for monitoring inter-turn faults [27]-[35]. A Multistage Modular Neural Network (MNN) was proposed for detecting the size of the inter-turn fault in induction motors [27]. A set of statistical features was extracted from the approximated levels of current wavelet components and then used as inputs to the proposed tool. Results indicated that the modular neural network outperforms the Multi-Layer Feed Forward Neural Network (MLFFNN) in terms of accuracy, simplicity and learning capability. In [28], both the MLFFNN and the Radial Basis Neural Network (RBNN) were used in detecting the inter-turn fault in an induction machine. Results showed that the MLFFNN gives better performance in detecting the inter-turn fault. In [29], the MLFFNN was proposed for detecting the inter-turn fault under variable speed, load and fault severity for a PMSM. The harmonic components of the stator current were used as inputs to the ANN. The detection of the inter-turn fault in the LSPMSM was recently investigated in [30], [31], where an MLFFNN was used in the detection process. The input is a set of time-domain and frequency-domain statistical features. Based on the literature, the crucial phase in the design of a diagnostic tool based on MLFFNN, RBNN or MNN is the extraction of distinctive features. Feature extraction is done separately before the design of the neural network. In a Convolutional Neural Network (CNN), feature extraction and selection are part of the neural network. This results in a more efficient network in terms of both hardware and speed.
Therefore, CNN is capable of working with the raw data. Several studies have utilized the CNN in detecting inter-turn fault in induction and PMSM motors [32]- [35], while no work is found in the literature for using CNN in LSPMSM. The performance of CNN in detecting inter-turn fault was compared with the Recurrent Neural Network (RNN), the Support Vector Machine (SVM), RBNN and MLFFNN. Results showed that CNN outperforms them in terms of accuracy [32]- [35]. Therefore, this research work suggested the use of CNN in detecting inter-turn fault for the LSPMSM.
In this paper, a 2D CNN based diagnostic tool for detecting the stator inter-turn fault in the LSPMSM has been developed. The developed tool uses the raw steady-state current data as input and gives the inter-turn fault severity as output. A 1.0-hp interior-mount LSPMSM has been used in developing and testing the proposed tool. A large experimental and simulation data set was collected and used in the training and testing of the developed tool. The data set was collected under different load conditions, fault severity levels, fault resistances and starting conditions (rotor position and supply voltage zero crossing point). Results show that the proposed tool is able to detect the inter-turn fault severity with high accuracy.
VOLUME 8, 2020

II. CONVOLUTIONAL NEURAL NETWORK
CNNs are artificial intelligence architectures that mainly simulate the behavior of the human visual system [36], [37]. They are designed based on multi-layer neural networks that extract features from collected data. A CNN can perform multiple tasks such as segmentation, detection, classification, and general data correlation. For classification applications, a CNN is used to identify labeled data by employing supervised learning techniques. Supervised learning is a machine learning mechanism that classifies collected data based on a prior training process on labeled examples in order to find the target values.
CNN has three main design ideas: weight sharing, spatial sub-sampling, and local receptive fields. The CNN is composed of four types of layers: the convolutional layer, the pooling layer, the fully connected layer, and the softmax layer. These layers comprise a set of neurons with biases, weights, and activation functions [38]. The CNN consists of two main stages, the feature extraction stage and the classification stage. The feature extraction stage includes the convolutional layer and the pooling layer, while the classification stage involves both the fully connected layer and the softmax layer. The block diagram of a typical CNN architecture is shown in Figure 1. One of the key points that attracted the authors to use the CNN over other techniques is that it includes feature extraction in its architecture. Accordingly, the CNN can minimize the data pre-processing stages compared with other classification techniques. In the CNN, feature extraction is done in the convolutional and pooling layers. The convolutional layer consists of neurons that are structured to form a set of filters (kernels) with specific heights and lengths (in pixels). A filter is a matrix or vector of numerical values, of the same size as the kernel, that is applied to a part of the input pixels. Each input pixel under the filter is multiplied by the corresponding kernel value, and the products are summed into a single value that represents one grid cell (pixel) of the output feature map. In low-level techniques, filters are configured manually for classification purposes, whereas a CNN, given a sufficiently large training dataset, is able to learn these filters in order to extract the main features that improve the classification accuracy of the system [37], [38].
Two-Dimensional (2-D) CNN was applied on the steady state current signals directly to find the fault level [39]. As reported in literature, 1-D and 2-D CNNs are mostly used in fault detection scenarios due to their high performance in feature extraction [40]. The mechanism of learning main features from any raw signal by using 2-D extraction features is illustrated in Figure 2. The input is the raw signal amplitude with respect to time while the output of this feature extraction stage represents a set of local features extracted from the raw data.
Typically, a CNN consists of convolutional layers and pooling layers along with other supporting layers (e.g., activation and normalization layers), which are grouped into submodules; fully connected layers are then used at the end of the CNN structure based on the design requirements [41].

A. CONVOLUTION LAYER
The convolution layer convolves an array of the raw signals coming from the input layer with a set of filters of defined size to obtain suitable feature maps. The feature maps are generated by moving these filters over the targeted dataset. Usually, the Rectified Linear Unit (ReLU) is used in the CNN model to generate the targeted output feature map. Moreover, Batch Normalization (BN) can be used to speed up the training of the CNN model by reducing fluctuation and internal covariate shift. Consequently, better classification accuracy can be achieved. The output of the convolution layer can be represented by:
$$Y_j^n = f\left(\sum_{i \in M_j} X_i^n * W_{ij}^n + b_j^n\right) \tag{1}$$
where $X_i^n$ and $Y_j^n$ represent the input and output of the $n$th convolution layer, respectively. $W_{ij}^n$ represents the convolution kernel of the $n$th layer with a specific size. $b_j^n$ represents the $n$th bias value. The operator ($*$) is the convolution operation. $M_j$ is the input feature map and $f(\cdot)$ represents the activation function.
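As a rough illustration of the convolution-plus-activation operation described above, the following pure-Python sketch slides one filter over a small 2-D array and applies ReLU. The function name `conv2d_relu`, the 5 × 3 patch, and the kernel values are hypothetical; no padding is applied and the kernel is used without flipping, as most CNN libraries do, so this is an assumption-laden sketch rather than the implementation used in the paper.

```python
def conv2d_relu(x, kernel, bias=0.0):
    """Slide one filter over x (no padding, stride 1), add a bias, apply ReLU.

    x and kernel are lists of rows. The kernel is applied without flipping
    (cross-correlation), which is an assumption of this sketch.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = bias
            # Multiply each input value under the filter by the kernel value
            # and accumulate the products into one output cell.
            for i in range(kh):
                for j in range(kw):
                    s += x[r + i][c + j] * kernel[i][j]
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out


# Hypothetical 5x3 patch: 5 time samples of 3 phase currents.
patch = [
    [0.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
]
# Hypothetical 3x3 kernel that simply copies the centre value.
identity_kernel = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
feature_map = conv2d_relu(patch, identity_kernel)
```

In a trained network the kernel values are learned rather than fixed, and many such filters run in parallel to produce one feature map each.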

B. POOLING LAYER
The pooling layer comes directly after the convolutional layer in order to decrease the dimension of the resulting convolved features. This is done by down-sampling the feature maps constructed by the previous layer. The input signals are divided into sub-parts and a pooling function is applied to each part to evaluate a new value. Among pooling functions, the two most widely used are the average pooling function and the max pooling function. The average pooling function evaluates the average value of all selected inputs, while the max pooling function evaluates the maximum value, using a suitable filter and stride values; the resulting values are then propagated to the next layer. As mentioned before, this layer reduces the dimension of the extracted feature maps by replacing each window with a single output. Therefore, the computational time is reduced and the most important features are extracted. Max-pooling and average-pooling can be evaluated, respectively, as:
$$P_{ij}^n = \max_{(p,q) \in B} Y_{pq}^n \tag{2}$$
$$P_{ij}^n = \frac{1}{l_B W_B} \sum_{(p,q) \in B} Y_{pq}^n \tag{3}$$
where $P_{ij}^n$ represents the output of the $n$th pooling layer, $Y_{pq}^n$ denotes the convolved feature values inside the pooling window, $B$ is the pooling window size, and $l_B$ and $W_B$ are the length and width of the window, respectively.
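The two pooling functions described above can be sketched in a few lines of pure Python. The helper name `pool2d`, the sample feature map, and the choice of a stride of 2 along the time axis and 1 across the phase axis are assumptions made for illustration only.

```python
def pool2d(x, window=(2, 1), stride=(2, 1), mode="max"):
    """Down-sample a 2-D feature map with max or average pooling.

    window and stride are (rows, columns). The defaults pool pairs of
    consecutive time samples and keep every column, which is an assumption
    of this sketch.
    """
    wh, ww = window
    sh, sw = stride
    out = []
    for r in range(0, len(x) - wh + 1, sh):
        row = []
        for c in range(0, len(x[0]) - ww + 1, sw):
            # Collect the values that fall inside the current window.
            vals = [x[r + i][c + j] for i in range(wh) for j in range(ww)]
            row.append(max(vals) if mode == "max" else sum(vals) / len(vals))
        out.append(row)
    return out


# Hypothetical 4x3 feature map.
fmap = [
    [1.0, 5.0, 2.0],
    [3.0, 4.0, 8.0],
    [7.0, 0.0, 6.0],
    [2.0, 9.0, 1.0],
]
max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="avg")
```

Both variants halve the number of rows here; max pooling keeps the strongest response in each window, while average pooling smooths it.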

C. FULLY CONNECTED LAYER
The main goal of the fully connected layer is to take the output feature maps resulting from the convolution and pooling layers and use them to classify the input data into a label. In the fault detection problem, the output of the fully connected layer represents the class of a specific fault. The output of the fully connected layer is determined as:
$$Y^f = W_f X^f + B_f \tag{4}$$
where $X^f$ and $Y^f$ denote the input feature vector and the output of the fully connected layer, and $W_f$ and $B_f$ are the weights and biases of the fully connected layer, respectively.

D. SOFTMAX LAYER
The softmax layer allows a multi-class task to be run by the CNN. It maps the output vector of the preceding layer into a set of values between 0 and 1 whose sum is equal to 1. Therefore, the number of outputs is the same as the number of classes. This layer is the last layer of the fault classification stage, and its output is calculated as:
$$S(z_k) = \frac{e^{z_k}}{\sum_{c=1}^{C} e^{z_c}} \tag{5}$$
where $z_k$ is the $k$th output of the fully connected layer and $C$ is the number of classes.
The 2-D convolution can be applied to an array that includes various data and shared weights by using a set of neurons [42]. The backward propagation technique can be used to adjust the shared weights. The benefit of using the convolution operation is to identify the main features of the input signals that will be used in the classification stage. As mentioned before, the convolution layer is combined with a pooling layer to decrease the dimensionality. Generally, many activation functions can be used in the learning process along with ReLU, such as Softmax, the hyperbolic tangent function (Tanh) and the sigmoid function (Sigmoid).
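The classification stage described above (a fully connected layer followed by softmax) can be illustrated with the minimal sketch below. The function names `dense` and `softmax`, the three-element feature vector, and the identity weights are hypothetical; the softmax subtracts the maximum score before exponentiating, a common numerical-stability step that does not change the result.

```python
import math


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Map raw scores to values in (0, 1) that sum to 1."""
    m = max(z)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical flattened features and a 3-class layer with identity weights.
features = [0.5, -1.0, 2.0]
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
biases = [0.0, 0.0, 0.0]

probs = softmax(dense(features, weights, biases))
predicted_class = probs.index(max(probs))
```

The predicted class is simply the index of the largest probability; with 11 fault classes the weight matrix would have 11 rows instead of 3.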
One of the popular optimization algorithms is the stochastic gradient descent with momentum (SGDM). It is widely used during the training process for any machine learning approach to find the weights and biases with minimal error rate values. The gradients of the weights and biases in the CNN model can be obtained by using SGDM algorithm based on the defined loss function. In general, there are two performance metrics used in identifying the efficiency of optimization technique: generalization and speed of convergence.
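The SGDM update can be sketched as follows, assuming the common form in which a velocity term accumulates past gradients. The learning rate, momentum value, and the toy quadratic loss are hypothetical choices for illustration; the paper does not report the values it used.

```python
def sgdm_step(theta, grad, velocity, lr=0.01, momentum=0.9):
    """One stochastic-gradient-descent-with-momentum update.

    velocity <- momentum * velocity - lr * grad
    theta    <- theta + velocity
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
    new_theta = [t + v for t, v in zip(theta, new_velocity)]
    return new_theta, new_velocity


# Toy loss L(theta) = 0.5 * sum(theta_k ** 2), whose gradient is theta itself.
theta = [1.0, -2.0]
velocity = [0.0, 0.0]
for _ in range(200):
    grad = list(theta)  # gradient of the toy loss at the current point
    theta, velocity = sgdm_step(theta, grad, velocity, lr=0.1)
```

On this toy problem the parameters spiral in toward the minimum at zero; the momentum term is what lets the update keep moving through flat or noisy regions of a real loss surface.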

III. PROPOSED METHOD
The goal of this research is to develop a 2D CNN based stator inter-turn monitoring tool for the LSPMSM that provides early warning of possible failure. The following steps summarize the work done to develop the proposed tool:
Step 1: Building an experimental setup and a validated mathematical model to be used to investigate and study the inter-turn fault and its effect on the stator currents.
Step 2: Collecting the three-phase steady-state stator currents for different cases under different loads, fault severities, fault resistances, rotor starting positions, and supply voltage zero crossing points.
Step 3: Designing and training a 2D CNN that correlates the steady-state stator currents with the existing inter-turn fault severity, if found.
Step 4: Testing the developed correlation using unseen fault cases.

IV. EXPERIMENTAL SETUP AND DATA COLLECTION
Motor current signature analysis (MCSA) has been widely used in the literature for detecting several motor faults [17], [43]-[45]. The main advantage of MCSA lies in avoiding the installation of additional hardware or sensors [17]. The current signature may also be taken from the starting condition, in which case the transient period is considered. In this case, the technique is called Advanced Transient Current Signature Analysis (ATCSA).
Although MCSA may provide less information than ATCSA, it offers benefits such as avoiding the complication of knowing the initial conditions (motor rotor position and voltage angle at the starting point). Furthermore, MCSA facilitates online detection of faults while the motor is running, without the need to isolate the motor. Therefore, the steady-state stator currents are selected as the fault indicators. The effect of the inter-turn fault on the steady-state stator currents is investigated below.
In this study, both experimental and simulation investigations of the inter-turn fault have been carried out. The experimental investigation was carried out on a 1.0-hp, 60-Hz, 4-pole interior-mount LSPMSM with 344 turns per phase. The simulation study was carried out using the validated mathematical model in [1]. Equations (6)-(14) represent the final mathematical model in the qd0 frame. The validated model has been implemented and simulated using MATLAB software. Equations (6) and (7) represent the stator and rotor voltage equations. Equations (8)-(13) represent the flux-current relations in matrix form, while (14) represents the torque equation. In these equations, $\mu$ is the shorted turns ratio. $v_q^s$, $v_d^s$ and $v_0^s$ are the qd0 stator voltages. $v_d^r$ and $v_0^r$ are qd0 rotor voltages. $i_q^s$, $i_d^s$ and $i_0^s$ are the qd0 stator currents. $i_q^r$, $i_d^r$ and $i_0^r$ are the qd0 rotor currents. $\omega_r$ is the rotor speed. $\lambda_q^s$, $\lambda_d^s$ and $\lambda_0^s$ are the qd0 stator flux linkages. $r_s$ is the stator resistance per phase. $r_{rq}$ is the rotor q-axis resistance. $r_{rd}$ is the rotor d-axis resistance. $r_{r0}$ is the rotor 0-axis resistance. $\lambda_q^r$, $\lambda_d^r$ and $\lambda_0^r$ are the qd0 rotor flux linkages. $i_f$ is the fault current. $\lambda_m$ is the flux of the permanent magnet. $T_{em}$ is the electromagnetic torque. $L_{lrq}$ and $L_{lrd}$ are the q- and d-axis leakage inductances of the rotor, respectively. $R_f$ is the external fault resistance. $L_m$ is the magnetizing inductance. $L_{ls}$ is the stator leakage inductance. $v_{a2}^s$ is the shorted turns voltage. $L_{md}$ and $L_{mq}$ are the d- and q-axis mutual inductances, respectively. $\lambda_{a2}^s$ is the shorted turns flux linkage. $L_m$ is the inductance due to saliency. $L_{asas}$ is the stator phases mutual inductance.
Figure 3 shows the experimental setup built to carry out the different tests and collect the required data.
The setup consists of a CASSY system (CASSY software, current sensors, voltage sensor), a multi-function meter, an instantaneous speed sensor, and a magnetic brake.
To investigate the effect of the inter-turn fault on the stator currents, four inter-turn fault cases were tested experimentally on phase-a while the other motor phases were in healthy condition. These cases are 4 (1.16%), 9 (2.61%), 26 (7.55%) and 40 (11.62%) shorted turns. For the tested cases, the stator three-phase currents and the fault current in the shorted turns were recorded at 10,000 samples per second. Figure 4 and Figure 5 show the results of the tested cases and the time of applying the fault. It is worth mentioning that the fault resistance for all cases was 0.3 ohm. It is clear from the results that as the size of the inter-turn fault increases, both the fault current and the current of the faulted phase increase as well. Additionally, the other un-faulted phases are slightly affected. Figure 4.a shows that at 4 shorted turns, the motor currents are almost unaffected, and the fault current reached a peak of around [...]. At 40 shorted turns (Figure 4.b), by contrast, the current of the faulted phase is strongly affected. Additionally, the fault current reached a peak of more than 20 A, which is very high and could damage the winding if sustained for a longer time. In conclusion, the results show that the current is sufficiently sensitive to this type of fault and can be used as a reliable indicator in developing the proposed diagnostic tool.
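The percentages quoted for the four fault cases follow from dividing the number of shorted turns by the 344 turns per phase. The short sketch below reproduces them; the function name `shorted_turns_percent` is hypothetical, and truncation to two decimals (rather than rounding) is an assumption inferred from the quoted values.

```python
import math

TURNS_PER_PHASE = 344  # turns per phase of the tested motor, from the text


def shorted_turns_percent(shorted, total=TURNS_PER_PHASE):
    """Shorted-turns ratio as a percentage, truncated to two decimals.

    Truncation is an assumption of this sketch; it matches the values
    quoted for the four experimental cases.
    """
    return math.floor(shorted / total * 10000) / 100


percentages = {n: shorted_turns_percent(n) for n in (4, 9, 26, 40)}
```

Dividing by 100 instead of multiplying gives the shorted turns ratio as a fraction, for example about 0.0116 for 4 shorted turns.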
To show the accuracy of the mathematical model used in the simulation study, the model has been implemented using MATLAB. Figure 6 shows the simulated and experimental stator current of phase-a (the faulted phase) during steady state. The figure shows that the simulated and experimental stator currents are in good agreement under different fault conditions. Successful development of the diagnostic tool requires a large data set. Therefore, 5821 test cases (experimental and simulation) were collected. These were collected under different load conditions, fault severity levels, fault resistances and starting conditions (rotor position and supply voltage zero crossing point).

V. THE PROPOSED DIAGNOSTIC TOOL DESIGN
The proposed CNN approach begins with 2-D raw data collected from the motor considered. These data represent three cycles of the steady-state stator current signals of the three phases. The data then pass through the first layer of the proposed CNN model, where a convolutional layer along with the ReLU function extracts the main features. Then a pooling layer is used to down-sample the data. These two layers are reused several times to search for proper features. At the end, the output of the previous layers is fed into more than one fully connected layer. Softmax is used as the top classifier for the required fault classes, as per Table 1. As mentioned in Section II, any activation function such as Tanh, ReLU, Sigmoid, or Softmax can be used. However, the ReLU function is selected in this study due to its efficient and superior performance [41]. For the pooling layer, max-pooling is used since it achieves higher efficiency than average pooling. The proposed CNN model is shown in Figure 7.
$$\begin{bmatrix} x_{11} & x_{12} & x_{13} \end{bmatrix} = \begin{bmatrix} \mu\left(\tfrac{3}{2}k_1\right)\cos\theta_r & \mu\left(\tfrac{3}{2}k_2\right)\sin\theta_r & 0 \end{bmatrix} \tag{12}$$
In Figure 7, the feature extraction process is achieved during the first five stages. Firstly, the input consists of a 2-D array that represents three cycles of the three-phase steady-state motor currents, with a size of 501×3 for each pattern. It is worth mentioning that the stator current signals during steady state have been used as the fault indicator. In this work, 3 cycles of the current signal (50 ms in a 60-Hz system) are found to be enough for detecting the inter-turn fault. In this paper, the sampling rate for acquiring the current is 10,000 samples per second. Therefore, the number of samples per three cycles is around 501 samples. Then, 16 and 32 filters were used with a filter size of 3 × 3 in the convolution layers for convolving the input data. In the same layers, the ReLU function was used to generate the targeted output feature map. Moreover, batch normalization was used in these layers to enhance the performance, as mentioned in Section II. After that, the max-pooling function was selected for all pooling layers with size 2 × 1 and stride = 2, since max-pooling performs better than average-pooling in the CNN model. These layers were used several times in order to go deeper into the raw data and extract the main features.
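The input size quoted above can be checked with a little arithmetic: three cycles at 60 Hz last 50 ms, and at 10,000 samples per second this spans 500 sampling intervals, or 501 samples when both end points are counted. The sketch below also tracks how repeated 2 × 1 pooling with stride 2 would shrink the time axis, assuming the convolutions preserve the length (same-padding). That padding choice, the end-point counting, and the helper names are assumptions, not details stated in the paper.

```python
def samples_in_window(cycles, line_freq_hz, sample_rate_hz):
    """Number of samples spanning the window, counting both end points."""
    intervals = round(cycles * sample_rate_hz / line_freq_hz)
    return intervals + 1


def pooled_length(n, window=2, stride=2):
    """Length of the time axis after one pooling layer without padding."""
    return (n - window) // stride + 1


duration_ms = 3 / 60 * 1000              # three cycles of a 60-Hz signal
n_samples = samples_in_window(3, 60, 10_000)

# Time-axis length after each of three successive pooling layers, assuming
# each preceding convolution keeps the length unchanged.
lengths = [n_samples]
for _ in range(3):
    lengths.append(pooled_length(lengths[-1]))
```

With these assumptions the time axis shrinks from 501 to 250, 125 and then 62 samples, which shows how quickly pooling reduces the data passed to the fully connected stage.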
For the classification process, a fully connected layer is used along with the SGDM optimization algorithm for achieving the best weights. The output of this layer is a vector with a size of 11 × 1, which is equal to the number of targeted classes. Then, Softmax layer is used after the fully connected layer at the end of the network in order to change the output behaviour of the fully connected network into probability distribution values and the output here is the same as the number of classes. This gives the number of shorted turns in the motor considered (see Table 1).

VI. RESULT AND DISCUSSION
The experiments and simulations were conducted using MATLAB along with the CNN toolbox library. In order to demonstrate the potential of the proposed intelligent diagnostic tool, a total of 4666 data samples were used to train the CNN, and 1155 data samples were used to test it. The data samples are composed of experimental and simulation data for the targeted motor. As the three-phase motor currents are utilized in this paper, each phase current is considered when preparing the data samples. Consequently, a total of 5821 raw current data samples were prepared for the healthy motor and for the ten faulty classes, based on the classification and the number of shorted turns described in Table 1. The structure of the intelligent diagnostic tool was specified in the previous section, together with the number of filters and their sizes. The CNN architecture has parameters, such as the learning rate, that may change the evaluation results. The value of the learning rate was varied within a specific range during the training process in order to achieve better performance for the whole system. Different convolutional neural network structures with different numbers of layers and neurons were tried to form a suboptimal convolutional neural network that correlates the three cycles of the steady-state current with the corresponding number of shorted turns. The simplest structure with the highest efficiency was the one with 7 layers, as shown in Figure 7. The training phase was completed successfully after 40 epochs, which corresponds to 264.2 sec. Results showed that the training accuracy was 100%, while the overall testing accuracy for detecting the occurrence and class of different fault levels was 97.7% under loading conditions ranging from 0 N·m to 4 N·m.
The confusion matrix of this model is used to represent how many true and false predictions of each class along with the achieved accuracy. Furthermore, it represents not only the prediction values made by the classification layer, but also which kind of faults are detected. Figure 8 shows the confusion matrix, which demonstrates the classification outputs during the testing process using our trained CNN model described before. Around 98% of the tested data samples were correctly identified. Few wrong identified cases are recorded due to other faulty conditions during the experiment and simulation steps.
The overall results collected from the experiments using these real motor data samples are studied with respect to common standard performance metrics from the literature, namely Accuracy (Acc), Sensitivity (Sen), Specificity (Spe), Positive predictivity ratio (Ppr), F-Measure, and G-Mean [46]. These metrics are defined as:
$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Sen} = \frac{TP}{TP + FN}, \qquad \mathrm{Spe} = \frac{TN}{TN + FP}$$
$$\mathrm{Ppr} = \frac{TP}{TP + FP}, \qquad \text{F-Measure} = \frac{2 \cdot \mathrm{Ppr} \cdot \mathrm{Sen}}{\mathrm{Ppr} + \mathrm{Sen}}, \qquad \text{G-Mean} = \sqrt{\mathrm{Sen} \cdot \mathrm{Spe}}$$
where TP: True Positive, FP: False Positive, TN: True Negative, and FN: False Negative. The accuracy measures the overall performance of the detection system over the 11 classes of the tested motor data samples. The rest of the metrics are evaluated for each class separately based on the classification layer. Table 2 shows the percentages of these metrics for the 11 classes. From Table 2, it can be noticed that the accuracy is around 97.75% for all classes. The achieved sensitivity is between 91.71% for C2 and 100% for C6, C7, and C10. However, the range of all classes for the specificity metric is approximately between [...].
TABLE 2. Accuracy, sensitivity, specificity, precision, F-measure, G-mean results.
From the classification results and the confusion matrix, it can be clearly seen that the proposed 2D CNN classifier can extract the local features directly from the raw current data and achieve a high fault classification performance. Moreover, the receiver operating characteristic (ROC) curves are shown in Figure 9 for better visualization of the performance of the proposed model over the 11 classes. The area under the ROC curve is an indicator of the performance of the classifier. Larger area values indicate better classifier performance. The maximum area is 1, which corresponds to a perfect classifier. The area of the developed classifier is around 0.98 (micro-averaging), which indicates that the developed classifier performs very well.
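The per-class metrics named above can be computed from the four confusion counts using their standard definitions, as in the sketch below. The function name `class_metrics` and the example counts are hypothetical and are not taken from Table 2 or Figure 8.

```python
import math


def class_metrics(tp, fp, tn, fn):
    """Standard one-vs-rest metrics from true/false positive/negative counts."""
    sen = tp / (tp + fn)                   # sensitivity (recall)
    spe = tn / (tn + fp)                   # specificity
    ppr = tp / (tp + fp)                   # positive predictivity (precision)
    acc = (tp + tn) / (tp + fp + tn + fn)  # accuracy
    f_measure = 2 * ppr * sen / (ppr + sen)
    g_mean = math.sqrt(sen * spe)
    return {"Acc": acc, "Sen": sen, "Spe": spe, "Ppr": ppr,
            "F-Measure": f_measure, "G-Mean": g_mean}


# Hypothetical counts for one fault class out of 1155 test samples.
metrics = class_metrics(tp=90, fp=5, tn=1050, fn=10)
```

For a multi-class confusion matrix, the counts for class k are obtained by treating class k as positive and all other classes as negative.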
In order to evaluate and compare the performance metrics of the proposed model with other models, two learning models are selected for comparison purposes with the same dataset. Table 3 shows a brief comparison between the proposed model and the MLFFNN that was introduced in [30] and [31]. The proposed model outperforms MLFFNN in terms of the accuracy and number of classes that can be discovered without using any pre-processing stages.

VII. CONCLUSION
In this paper, a 2D six-stage CNN based diagnostic tool for detecting the stator inter-turn fault in the LSPMSM has been presented. The developed tool uses the raw steady-state three-phase currents as inputs and gives the fault severity as output. The diagnostic tool inputs and outputs are based on experiments and simulations for a 1.0-hp interior-mount LSPMSM. The results demonstrate the effectiveness of the proposed tool in detecting the inter-turn fault severity with high accuracy. The proposed tool uses 7 layers with 11 classification outputs based on three cycles of the three-phase steady-state motor currents. The tool accuracy for all classes under loading conditions ranging from 0 N·m to 4 N·m is 97.75%, which is superior to that of the tools developed in the literature. It has been observed that the classification of the inter-turn fault level is performed with higher accuracy and shorter training time without the need for a feature extraction phase. The potential of the developed tool in detecting low fault levels was also demonstrated. Extension of the tool to generalize the detection regardless of the motor size is under investigation.