Open-circuit Fault Diagnosis of Traction Inverter Based on Compressed Sensing Theory

Abstract: This study proposes a new fault diagnosis method based on the least squares support vector machine with gradient information (G-LS-SVM) to solve the insulated-gate bipolar transistor (IGBT) open-circuit failure problem of the traction inverter in a catenary power supply system. First, a simulation model based on the traction inverter topology is built, and various voltage fault signal waveforms are simulated based on the IGBT inverter open-circuit fault classification. Second, compressive sensing theory is used to sparsely represent the voltage fault signal so that it has a high degree of sparseness. An overcomplete dictionary containing the feature vectors of the voltage fault signals is built based on a double sparse dictionary model to match the sparse signal characteristics. Finally, the space vector transform is used to represent the three-phase voltage scalars in the traction inverter as a composite quantity to reduce the redundancy of the fault signals and the data-processing load. A G-LS-SVM fault diagnosis model is then built to diagnose and identify the voltage fault signal feature vectors in the overcomplete dictionary. The simulation results show that the accuracy of this method for various types of IGBT tube fault diagnosis is over 98.92%. Moreover, the G-LS-SVM model is robust to Gaussian white noise.


Introduction
The traction inverter is the power source of contactless energy storage power supply (CES-PS) systems [1]; hence, its smooth operation is very important for these systems [2]. However, the probability of failure increases as the number of traction inverter levels and the number of IGBT tubes in a CES-PS system increase. Many methods can be used to extract fault signal features, such as the Fourier transform, wavelet transform, and S transform [3]. These methods are based on the Nyquist sampling theorem and consume substantial storage space and hardware resources. The compressed sensing (CS) theory was proposed to overcome the limitations of the Nyquist sampling theorem and has become a new research hotspot in the signal processing field. Building a complete atomic library as a training library for K-SVD was first proposed in Ref. [4] and was then applied in the field of image noise reduction with significant results. Today, the K-SVD algorithm is used to learn overcomplete dictionaries in related applications, such as signal noise reduction and compressed sensing. By representing the signals approximately sparsely and reducing the measurement data, it also improves the completeness and representativeness of samples. The complete dictionary is adaptable and flexible [5][6][7][8], but its structure is poor, and the training time is long. Moreover, the detection error after training is large. In Ref. [9], a performance degradation prediction method was proposed based on higher-order differential mathematical morphological gradient spectral entropy, phase space reconstruction, and extreme learning machine, to extract the initial characteristics of performance degradation from the original bearing signal. A principal component analysis method was proposed in Ref. [10]; it reduces the dimension of the constructed characteristic signal matrix and the linear correlation between the data.
The abovementioned methods focus on the extraction of bearing fault signals and are seldom applied to the extraction of power quality fault signals.
Fault signal classification and recognition methods mainly include artificial neural networks, expert systems, support vector machines (SVM), and other methods [11]. In Ref. [12], a fault diagnosis strategy of an H-bridge multilevel inverter based on the PCA-SVM model was proposed, but this ignored the influence of noise. In Ref. [13], a three-phase inverter based on the combination of SVM and adaptive noise was proposed. In Ref. [14], the reverse control strategy and the PID control method were used to study a grid-free, three-phase inverter with an LC filter. Meanwhile, an improved ant colony optimization algorithm based on multiple swarm strategies, a co-evolution mechanism, a pheromone update strategy, and a pheromone diffusion mechanism, as well as a genetic ant colony adaptive collaborative optimization algorithm, were proposed in Refs. [15][16] to balance the convergence speed and improve the performance on large-scale optimization problems with better search capability and accuracy. In Ref. [17], a semi-supervised generalized learning system for fault classification and recognition of the extracted feature data vectors was proposed. The abovementioned methods overcome the shortcomings of being easily trapped in a local optimal solution and of long training time, and they offer high search ability and strong search accuracy. However, their regression quality is poor when there is less training data. Therefore, a least squares support vector machine model that introduces gradient information into the training data (G-LS-SVM) is proposed herein [18]. However, it should be noted that it also makes the SVM model lose sparseness and reduces the generalization ability. The completeness and representativeness of the samples have a great impact on the accuracy of the fault diagnosis results.
Aiming to solve the problems related to sparseness of voltage fault signals and completeness and representativeness of samples, the compressed sensing theory is proposed [19][20] .
Considering this phenomenon, this study proposes a sparse dictionary model based on the double sparse dictionary model to construct an overcomplete dictionary to represent the sparseness, completeness, and representativeness of the voltage fault signal for the IGBT open-circuit fault in a CES-PS system. To reduce the redundancy of the fault signal and the data-processing load, the space vector transform is used to represent the three-phase voltage scalars in the traction inverter with a composite quantity. For the five representative types of voltage fault signals, the double sparse dictionary under the CS theory is used to perform the feature extraction and sparse representation of the voltage fault signals, build an overcomplete dictionary containing the voltage fault signals, and extract the feature vectors of the sparse fault signals. The feature vectors are then input into the G-LS-SVM model for diagnosis. The diagnosis results show that when the G-LS-SVM model has a sufficiently sparse input and complete and representative voltage fault signals, the fault recognition rate is higher than 98.92%, and the G-LS-SVM model has strong anti-noise capability and robustness.

Traction inverter fault classification
The working process of the traction inverter is shown in Fig. 1. Here, 380 V AC power was input to the power base station, and power was received through the CES-PS system. The output voltage was 750 V DC after the rectifier. The inverter supplied power, which drove the normal work of the traction motors, air conditioning systems, and other subsystems. The topology of the traction inverter is shown in Fig. 2. The inverter is regarded as a three-phase, two-level, voltage-type inverter consisting of three half-bridges. The output pulse-width modulation (PWM) wave contains only two levels, U and 0. The inverter uses fully controlled switches with anti-parallel reactive feedback diodes, and the AC asynchronous motor is treated as an equivalent three-phase symmetrical inductive load.

LS-SVM principle
The SVM learning algorithm aims to solve the problems of classification and regression of small sample data. Its core idea is to find a hyperplane in space that can divide all data samples and shorten the distance between all data in the sample set and this hyperplane [21] . However, the SVM takes a long time to solve the model parameters and reduces the model efficiency. To this end, the inequality constraint condition in the SVM is changed to an equality constraint to derive the LS-SVM. The model for solving the LS-SVM turns into a simple solution of linear equations.
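The objective function and constraints of the LS-SVM are presented as follows:

$$\min_{w,b,e} J(w,e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{n} e_i^{2}, \quad \text{s.t.}\ \ y_i = w^{T} \phi(x_i) + b + e_i,\ \ i = 1, 2, \ldots, n \tag{1}$$

where $w$ and $b$ denote the parameters of the model sought; $\gamma$ is the penalty factor, which weights the optimal hyperplane term against the error term; $e_i$ is the error between the true and predicted values; $x_i$ denotes the coordinates of the training data; $y_i$ is the real value; $\phi$ is the mapping function; and $n$ is the total number of training data.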
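As an illustration of how the equality constraints turn training into a linear solve, the following is a minimal Python sketch of plain LS-SVM regression (without the gradient terms of the G-LS-SVM). It assumes a Gaussian RBF kernel; the function names, the kernel width `sigma`, the value of `gamma`, and the toy sine data are hypothetical choices made only for this example.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def lssvm_fit(x, y, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM dual linear system for the bias b and multipliers alpha."""
    n = x.shape[0]
    k = rbf_kernel(x, x, sigma)
    # KKT system: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = k + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(system, rhs)
    return sol[0], sol[1:]


def lssvm_predict(x_train, b, alpha, x_new, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


# Toy data (hypothetical): one period of a sine wave.
x_train = np.linspace(0.0, 2.0 * np.pi, 40).reshape(-1, 1)
y_train = np.sin(x_train).ravel()
b, alpha = lssvm_fit(x_train, y_train, gamma=1000.0, sigma=1.0)
y_fit = lssvm_predict(x_train, b, alpha, x_train, sigma=1.0)
max_err = float(np.max(np.abs(y_fit - y_train)))
```

The first row of the system enforces that the multipliers sum to zero, and the remaining rows restate the equality constraints with $e_i = \alpha_i / \gamma$, so no quadratic programming solver is needed.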

Improved G-LS-SVM model
The LS-SVM regression model only uses linear data to limit the regression trend and cannot accurately predict the regression model for discrete data. Because gradient information expresses the rate of change of discrete data, an LS-SVM regression model based on gradient information is formed. Its objective function is given by formula (2), where $w$ is the hyperplane normal vector. Substituting the Lagrange multipliers into formula (2) yields the G-LS-SVM regression model, whose parameters include the gradient step size, the regression function intercept $b$, and the inner product of the kernel function $K$.

Kernel function selection
The key to the LS-SVM based on gradient information is the selection of the kernel function. The choice of the kernel function affects the learning ability and the generalization performance of the G-LS-SVM, and different kernel functions produce different G-LS-SVM algorithms. The Fourier transform spectrum of the Gaussian radial basis kernel function is single-lobed; hence, mapping a low-dimensional linearly inseparable signal to a high-dimensional space makes it linearly separable. The RBF kernel function is selected and expressed as

$$K(x_i, x_j) = \exp\left( -\frac{\| x_i - x_j \|^{2}}{2 \sigma^{2}} \right)$$

where $\sigma$ is the kernel width.

Construction of the overcomplete dictionary based on the CS theory

Sparse representation of the CS theory
The CS theory shows that as long as an N-dimensional signal is compressible or sparse in a certain transform domain, an observation matrix that is not related to the transform basis can be used to project the transformed high-dimensional signal onto a low-dimensional space. An M-dimensional signal ($M \ll N$) is obtained, and the original signal is reconstructed with a high probability from these few projections by solving an optimization problem.
For a non-sparse discrete signal $\mu$, an approximate sparse representation $x$ can be obtained under an orthogonal transform basis. A measurement matrix is then found to project the signal $\mu$ onto a low-dimensional measurement, as in formula (6). However, the sparseness of the voltage fault signal under the orthogonal transform basis is still not high. Therefore, an overcomplete dictionary adapted to the signal characteristics is used to sparsely represent the signal. According to the CS theory, if such a transformation exists, formula (7) can be rewritten in terms of the perception matrix, that is, the product of the measurement matrix and the transform.
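For reference, the standard compressed sensing relations behind this subsection can be written as follows. This is a hedged restatement of the textbook formulation, not a recovery of the paper's own equations: the symbols $B$ (orthogonal transform basis), $S$ (measurement matrix), $a$ (sparse coefficient vector), and $\Theta$ (perception matrix) are names assumed here, and the $\ell_1$ reconstruction problem is the usual convex formulation.

```latex
\begin{aligned}
\mu &= B x
  && \text{(approximate sparse representation under an orthogonal basis)} \\
y &= S \mu, \quad S \in \mathbb{R}^{M \times N},\ M \ll N
  && \text{(projection onto a low-dimensional measurement)} \\
\mu &= D a \ \Rightarrow\ y = S D a = \Theta a, \quad \Theta = S D
  && \text{(overcomplete dictionary and perception matrix)} \\
\hat{a} &= \arg\min_{a} \| a \|_{1} \quad \text{s.t.}\ \ y = \Theta a
  && \text{(sparse reconstruction)}
\end{aligned}
```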

Overcomplete dictionary design based on the double sparse dictionary model for the electric energy disturbance signal
For the sparse decomposition of the voltage fault signals, an overcomplete dictionary should first be constructed to adapt to the characteristics of the signal itself. The signal sparsity is closely related to the quality of the designed overcomplete dictionary, which further affects the robustness of the G-LS-SVM model and the accuracy of fault classification and recognition.
The sparse dictionary can be divided into the fixed and learning dictionaries. The fixed dictionary has poor self-adaptability and relies on the prior knowledge of the original signal. The learning dictionary is obtained by machine learning on the sample signals, and its flexibility and adaptability are good [21]. However, the experimental results showed that in the training and learning process, the atoms in the atomic library are updated column by column at each iteration, which takes a long time. The algorithm is also complicated. Therefore, a double sparse dictionary model is formed by combining the two dictionary models [22]. This model shortens the dictionary training time, reduces the algorithm complexity, and identifies the fault categories more accurately.
The double sparse dictionary model aims to further sparsely represent each atom in the overcomplete dictionary $D \in \mathbb{R}^{N \times N}$, which is constructed on a predefined base dictionary $\Phi$:

$$D = \Phi A \tag{9}$$

In formula (9), $A$ is the sparse dictionary whose columns represent the atoms of $D$ on the base dictionary $\Phi$. Let $X$ be the sample set of $N$ atoms. The objective function of the double sparse dictionary can be defined as

$$\min_{A, \Gamma} \| X - \Phi A \Gamma \|_{F}^{2} \quad \text{s.t.}\ \ \| \Gamma_i \|_{0} \le p\ \ \forall i, \quad \| a_j \|_{0} \le k\ \ \forall j \tag{10}$$

In formula (10), $\Gamma$ is the sparse representation coefficient matrix of the voltage signal under dictionary $D$; $\Gamma_i$ is the coefficient column vector corresponding to the $i$th dictionary atom in the coefficient matrix; $a_j$ is the coefficient column vector corresponding to the $j$th atom in the sparse dictionary $A$; $p$ is the sparseness of the voltage signal; and $k$ is the sparseness of the atoms of the sparse dictionary $A$. The specific steps of its algorithm are as follows.
(1) Input: the sample set $X$ formed by the voltage fault signal, the sparsity $k$, the base dictionary $\Phi$, the initial dictionary representation coefficient $A_0$, and the number of iterations $t$.
(2) Initialization: the dictionary representation coefficient is initialized from $A_0$.
(3) While (number of iterations < $t$):
① Sparse coding: use the SPG-LIC decomposition algorithm to solve the sparse representation of the electrical energy signals in the current dictionary;
② Dictionary update: update each column $a_j$ in $A$.
(4) Output: output $A$ after satisfying the iteration stop condition, then multiply it with the base dictionary $\Phi$ to obtain the overcomplete dictionary $D$.
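The alternating structure of these steps can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: orthogonal matching pursuit (OMP) is swapped in for the SPG-LIC sparse coding step, the column update follows a sparse K-SVD style rule, the base dictionary is an orthonormal DCT basis, and random data stand in for the voltage fault samples. All function names are hypothetical.

```python
import numpy as np


def omp(dictionary, signal, sparsity):
    """Greedy orthogonal matching pursuit with at most `sparsity` nonzeros."""
    coef = np.zeros(dictionary.shape[1])
    residual = signal.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(sparsity):
        idx = int(np.argmax(np.abs(dictionary.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def dct_basis(n):
    """Orthonormal DCT-II basis (assumed base dictionary); columns are atoms."""
    t = (np.arange(n) + 0.5) / n
    basis = np.cos(np.pi * np.outer(t, np.arange(n)))
    return basis / np.linalg.norm(basis, axis=0)


def double_sparse_dictionary(x, phi, a0, p, k, n_iter):
    """Alternate sparse coding of X over D = Phi A with sparse updates of A."""
    a = a0.copy()
    gamma = np.zeros((a.shape[1], x.shape[1]))
    for _ in range(n_iter):
        d = phi @ a
        # Sparse coding: each sample uses at most p atoms of D.
        gamma = np.column_stack([omp(d, x[:, i], p) for i in range(x.shape[1])])
        # Dictionary update: refit each column a_j with at most k nonzeros.
        for j in range(a.shape[1]):
            users = np.flatnonzero(gamma[j])
            if users.size == 0:
                continue
            g = gamma[j, users] / np.linalg.norm(gamma[j, users])
            # Residual of the samples using atom j, with atom j's part added back.
            err = (x[:, users] - phi @ (a @ gamma[:, users])
                   + np.outer(phi @ a[:, j], gamma[j, users]))
            a_j = omp(phi, err @ g, k)
            norm = np.linalg.norm(phi @ a_j)
            if norm < 1e-12:
                continue
            a[:, j] = a_j / norm
            gamma[j, users] = (phi @ a[:, j]) @ err
    return phi @ a, a, gamma


rng = np.random.default_rng(0)
n, n_samples, p, k = 16, 200, 3, 4
phi = dct_basis(n)
x = rng.standard_normal((n, n_samples))  # placeholder for voltage fault samples
d, a, gamma = double_sparse_dictionary(x, phi, np.eye(n), p, k, n_iter=5)
```

Because only the sparse matrix `a` is updated, each atom of the learned dictionary stays a combination of at most `k` base atoms, which is what keeps the model compact compared with updating a full dictionary column by column.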

Implementation steps of the fault early warning model based on the double sparse dictionary
The core of constructing an overcomplete dictionary lies in the selection and parameter value of the base dictionary in the double sparse dictionary model. Choosing a suitable base dictionary and parameter values is particularly important in ensuring that the voltage fault signal is sufficiently sparse. The overcomplete dictionary constructed in this way takes less time to train fault category recognition and has a higher accuracy in diagnosing voltage fault signals. The algorithm flow is presented as follows.
(1) The feature vectors of the various types of voltage fault signals are extracted. The sample data x measured as compressed sensing constitute an overcomplete dictionary D.
(2) A part of the overcomplete dictionary D is randomly selected as the training samples and used as the base dictionary Φ after a successful training. A coefficient decomposition on the base dictionary Φ is then performed for the successfully trained samples. The coefficient matrix obtained after decomposition constitutes a sparse dictionary A.
(3) The parameters in the double sparse dictionary model are reasonably selected. The training samples are selected from the sample set X . The obtained signals are sufficiently sparse and can accurately represent the characteristics of various voltage fault signals.
(4) Using the random Bernoulli matrix [23] as the measurement matrix $\Psi$, according to $y = \Phi x$, the n-dimensional fault signal is projected to obtain the m-dimensional measurement value $y$ ($m \ll n$).
(5) The measured value y is transmitted to the data-processing center. The sparse coefficient is reconstructed using the measured value y , measurement matrix Ψ , and overcomplete dictionary D through the SPG-LIC algorithm [24] .
(6) The reconstructed sparse coefficient $a'$ is used to obtain the fault signal $y'$ through $y' = D a'$.
(7) The relative errors of $y$ and $y'$ are calculated.
(8) The relevant parameters in the double sparse dictionary model are adjusted such that the relative error is lower than $1 \times 10^{-5}$.
(9) The reconstructed signal y′ is input into the G-LS-SVM model to diagnose the fault type.
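The measurement and reconstruction steps (4) to (7) can be illustrated with the following Python sketch. It rests on several assumptions that are not taken from the paper: an orthonormal DCT basis stands in for the learned overcomplete dictionary, OMP replaces the SPG-LIC reconstruction algorithm, the test signal is synthetic and exactly sparse, and the relative error is computed between the original and the reconstructed signal. All names are hypothetical.

```python
import numpy as np


def bernoulli_matrix(m, n, rng):
    """Random +/-1 Bernoulli measurement matrix scaled by 1/sqrt(m)."""
    return rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)


def dct_basis(n):
    """Orthonormal DCT-II basis used here as a stand-in dictionary."""
    t = (np.arange(n) + 0.5) / n
    basis = np.cos(np.pi * np.outer(t, np.arange(n)))
    return basis / np.linalg.norm(basis, axis=0)


def omp(dictionary, signal, sparsity):
    """Greedy orthogonal matching pursuit with at most `sparsity` nonzeros."""
    coef = np.zeros(dictionary.shape[1])
    residual = signal.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(sparsity):
        idx = int(np.argmax(np.abs(dictionary.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


rng = np.random.default_rng(1)
n, m, sparsity = 128, 80, 3

# Synthetic fault-like signal that is exactly sparse in the stand-in dictionary.
d = dct_basis(n)
a_true = np.zeros(n)
support = rng.choice(n, size=sparsity, replace=False)
a_true[support] = rng.uniform(1.0, 2.0, size=sparsity)
x = d @ a_true

# Step (4): project the n-dimensional signal to m measurements.
psi = bernoulli_matrix(m, n, rng)
y = psi @ x

# Step (5): reconstruct the sparse coefficients from y, psi, and d.
a_rec = omp(psi @ d, y, sparsity)

# Steps (6) and (7): rebuild the signal and compute the relative error.
x_rec = d @ a_rec
rel_err = float(np.linalg.norm(x - x_rec) / np.linalg.norm(x))
```

In this idealized noise-free setting the reconstruction is essentially exact once the correct atoms are selected; with real fault signals the error depends on how sparse the signal is in the learned dictionary, which is why step (8) tunes the dictionary parameters against an error threshold.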

Fault diagnosis under the G-LS-SVM model
This study used Matlab/Simulink to build a traction inverter model in the CES-PS system and collect its IGBT open-circuit fault voltage signal. The feature extraction and the sparse representation of the voltage fault signals were performed using a double sparse dictionary under the CS theory. An overcomplete dictionary library containing the voltage fault signals was constructed. Space vector transformation was used to represent the three scalars of the three phases with a composite quantity to ensure the fault diagnosis accuracy [20][21] .

Space vector transformation
In a three-phase DC/AC inverter, a three-phase voltage is usually represented by three scalars. If the three scalars in a three-phase inverter are represented by a composite quantity, the three-phase problem can be reduced to a single-phase problem while reducing the fault signal redundancy and the data-processing load [25]. Suppose that the three-phase voltage scalars at the traction inverter output are $U_A$, $U_B$, and $U_C$ and satisfy formula (11). Formula (11) can be transformed into formula (12), where $a = e^{j 2\pi/3}$ and $a^{2} = e^{j 4\pi/3}$. In formula (12), $U_1$ represents a vector on the complex plane, which can represent the three scalars of the inverter. Its real and imaginary parts are given by formulas (13) and (14), which involve the coefficients $\cos(2\pi/3)$ and $\cos(4\pi/3)$. Formulas (11), (13), and (14) together give the transformation of formula (15), and the inverse transformation of formula (15) recovers the three-phase voltages.
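A minimal Python sketch of such a space vector transformation is given below. It assumes the common amplitude-invariant scaling factor 2/3 and defines the zero-sequence component as the mean of the three phase voltages; both are assumptions of this example rather than values confirmed by the paper, and the function names and the clipped "fault" waveform are hypothetical.

```python
import numpy as np

# Rotation operator a = e^{j2*pi/3}; a**2 = e^{j4*pi/3}.
A_OP = np.exp(2j * np.pi / 3)


def space_vector(u_a, u_b, u_c):
    """Map three phase voltages to (real part, imaginary part, zero sequence)."""
    u1 = (2.0 / 3.0) * (u_a + A_OP * u_b + A_OP**2 * u_c)  # assumed 2/3 scaling
    u0 = (u_a + u_b + u_c) / 3.0
    return u1.real, u1.imag, u0


def inverse_space_vector(re, im, u0):
    """Recover the three phase voltages from the transformed components."""
    u_a = re + u0
    u_b = -0.5 * re + (np.sqrt(3.0) / 2.0) * im + u0
    u_c = -0.5 * re - (np.sqrt(3.0) / 2.0) * im + u0
    return u_a, u_b, u_c


# Balanced three-phase voltages: the space vector traces a unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
u_a = np.cos(theta)
u_b = np.cos(theta - 2.0 * np.pi / 3.0)
u_c = np.cos(theta + 2.0 * np.pi / 3.0)
re, im, u0 = space_vector(u_a, u_b, u_c)

# Hypothetical distorted waveform: phase A loses its positive half cycle.
u_a_fault = np.minimum(u_a, 0.0)
re_f, im_f, u0_f = space_vector(u_a_fault, u_b, u_c)
rec_a, rec_b, rec_c = inverse_space_vector(re_f, im_f, u0_f)
```

For the balanced case the zero-sequence component vanishes and the vector magnitude is constant, so any deviation of the trajectory or a nonzero zero-sequence voltage is an indicator of an asymmetry such as an open-circuit fault.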

Various fault voltage waveforms of the inverter
(1) Normal operation ($C_1$). The three-phase voltage waveforms of A, B, and C when the traction inverter works normally in the CES-PS system are shown in Fig. 3. The waveform after the space vector transformation is shown in Fig. 4, where Fig. 4a represents the real part of the complex vector; Fig. 4b represents the imaginary part of the complex vector; and Fig. 4c represents the zero-sequence voltage signal.
(2) Single-tube failure ($C_2$). When a single-tube failure occurred in the traction inverter, the T1 tube was selected as the monitoring object of the oscilloscope, and the space vector transformation was performed on the voltage fault signal. The waveform is shown in Fig. 5, where Figs. 5a and 5b represent the real and imaginary parts of the space voltage vector, respectively, and Fig. 5c depicts the zero-sequence voltage.
(3) Single-phase double-tube failure ($C_3$). When a single-phase double-tube fault occurred in the traction inverter, the T1T4 tubes were selected as the monitoring object of the oscilloscope, and the space vector transformation was performed on the voltage fault signal. Figs. 6a and 6b represent the real and imaginary parts of the space voltage vector, respectively, and Fig. 6c represents the zero-sequence voltage.
(4) Double-tube failure on the same side of the adjacent bridge arms ($C_4$). When a double-tube failure occurred on the same side of the adjacent bridge arms of the traction inverter, the T1T2 tubes were selected as the monitoring object of the oscilloscope, and the space vector transformation was performed on the voltage fault signal. The waveform is shown in Fig. 7, where Figs. 7a and 7b represent the real and imaginary parts of the space voltage vector, respectively, and Fig. 7c represents the zero-sequence voltage.
(5) Double-tube failure on the opposite sides of the adjacent bridge arms ($C_5$). When a double-tube fault occurred on the opposite sides of the adjacent bridge arms of the traction inverter, the T1T5 tubes were selected as the monitoring object of the oscilloscope, and the space vector transformation was performed on the voltage fault signal. The waveform is shown in Fig. 8, where Figs. 8a and 8b represent the real and imaginary parts of the space voltage vector, respectively, and Fig. 8c represents the zero-sequence voltage.

Experimental verification
The feature vectors of the 22 kinds of IGBT open-circuit faults were extracted to construct an overcomplete dictionary based on the double sparse dictionary model. The feature vectors in the overcomplete dictionary were input into the G-LS-SVM model for fault identification and diagnosis. The fault diagnosis results in Tab. 2 show that the G-LS-SVM model had a higher accuracy rate and a shorter diagnosis time for fault-free and single-tube fault diagnosis, and a lower accuracy rate for double-tube faults on the same and opposite sides of the adjacent bridge arms. To verify the stability and robustness of the G-LS-SVM model, Gaussian white noise was added to each type of voltage fault signal at signal-to-noise ratios (SNR) of 50 dB, 40 dB, 30 dB, and 20 dB. The test set consisted of 100 randomly extracted fault feature vectors. The results are shown in Tab. 3. The G-LS-SVM model had a high accuracy for the voltage fault signal recognition and had strong anti-noise capability and robustness.
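The noise injection used in such a robustness test can be sketched in Python as follows. The helper name `add_awgn`, the 50 Hz sine used as a stand-in for a voltage fault signal, and the sampling settings are assumptions made for this illustration only.

```python
import numpy as np


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that the result has the requested SNR in dB."""
    signal_power = np.mean(signal**2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.04, 20000, endpoint=False)
clean = np.sin(2.0 * np.pi * 50.0 * t)  # placeholder for a voltage fault signal

# Measure the SNR actually obtained at each of the four noise levels.
measured_snr = {}
for snr_db in (50, 40, 30, 20):
    noisy = add_awgn(clean, snr_db, rng)
    noise = noisy - clean
    measured_snr[snr_db] = float(10.0 * np.log10(np.mean(clean**2) / np.mean(noise**2)))
```

Each noisy signal would then go through the same feature extraction and classification path as the clean one, so that the recognition rate can be compared across SNR levels.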

Conclusions
This study aimed to solve the problems related to open-circuit faults of the IGBT tube of the traction inverter in a CES-PS system. This was achieved by extracting the feature vector of the voltage fault signal, building an overcomplete dictionary based on the double sparse dictionary model, and using the G-LS-SVM model to diagnose the feature vectors in the overcomplete dictionary. The fault diagnosis results showed that when the feature vectors of various IGBT tube faults were included in the overcomplete dictionary, the model accurately identified various IGBT tube faults with an accuracy rate of more than 98.92%. In addition, the G-LS-SVM model had strong anti-noise ability and robustness.
The simulation experiments herein were based on a Matlab/Simulink platform. This study also attempted to incorporate the feature vectors of multiple fault signals to build a fault recognition model based on multi-feature fusion signals, making the classification method more universal and the fault recognition more accurate.