Application of SCNGO-VMD-SVM in Identification of Gas Insulated Switchgear Partial Discharge

Partial discharge (PD) is one of the main causes of insulation deterioration in gas insulated switchgear (GIS), so identifying PD signals efficiently and accurately is essential for the stable operation of GIS. In this paper, an improved northern goshawk optimization (SCNGO) is proposed, which automatically optimizes the parameters of variational mode decomposition (VMD) and support vector machine (SVM) to realize fault identification of GIS PD. Firstly, to overcome the tendency of NGO to fall into local optima and its slow convergence, the opposite learning of refraction strategy and the sine cosine algorithm (SCA) are introduced to improve NGO. Comparison with several other algorithms on test functions demonstrates the superiority of SCNGO. Then, a GIS PD experiment is designed for fault signal acquisition and algorithm verification. SCNGO-VMD is used for adaptive parameter optimization in the decomposition of PD signals. On this basis, the effective intrinsic mode functions (IMFs) are screened by a composite index. Furthermore, time-domain, frequency-domain, and entropy features are combined into mixed features, and t-SNE is used to reduce their dimension. Finally, the feature vectors are input to SCNGO-SVM for fault identification. Experimental analysis shows that, compared with other algorithms, the proposed model achieves good identification accuracy for GIS PD fault diagnosis. The paper provides a reference for the application of optimization algorithms in GIS PD fault identification.


I. INTRODUCTION
Gas insulated switchgear (GIS), which integrates circuit breakers, isolation switches, ground switches, voltage transformers, current transformers, arresters, bus bars, cable terminals and other equipment, is widely used in modern power systems. Compared with traditional open high-voltage equipment, it has a series of advantages such as small footprint, compact structure, little interference from the external environment, safe and reliable operation, easy maintenance, and long maintenance cycle [1], [2], [3], [4]. In recent years, with the development of electric power and the improvement of the national economy, the requirements for power grid security have been gradually increasing. GIS equipment has been installed in power grids above 35 kV.
The associate editor coordinating the review of this manuscript and approving it for publication was Guillermo Valencia-Palomo.
Although GIS has many advantages, actual operation experience shows that there are still some defects and hidden dangers in the operation of GIS due to some inevitable problems in the process of design, manufacture, transportation and installation [5], [6], [7]. Among them, partial discharge (PD) caused by insulation failure seriously threatens the safe operation of the power system. PD is a precursor to the aging of electromechanical components, so it is very important to identify PD accurately.
When PD is generated in the insulation structure, physical phenomena such as electrical pulses and ultrasonic waves are produced. Phase resolved partial discharge (PRPD) images and phase resolved pulse train (PRPS) images [8], [9] are common identification methods at present. However, these methods are prone to aliasing when there are many PD sources. The application of neural networks in PD fault diagnosis has increased in recent years [10], [11], [12]. However, this approach requires a large amount of experimental and operational data. High-voltage experiments take a long time, are costly, and are not easy to implement. On the other hand, physical signals are nonlinear and non-stationary with significant pulse characteristics, and they contain very rich fault information. Useful information can be extracted from these signals for GIS fault identification, so that faults are detected in a timely and accurate manner and the normal operation of power equipment is ensured. Some decomposition methods are helpful for fault identification. Empirical mode decomposition (EMD) [13] can adaptively decompose a signal into a series of intrinsic mode functions (IMFs) through the local scale features of the signal, so as to reveal its internal properties. However, EMD has some problems such as the endpoint effect and mode mixing [14], [15]. Some new decomposition methods have also been proposed, such as feature mode decomposition (FMD) [16], which has been demonstrated to be superior in feature extraction of machinery faults. However, the new method has not been popularized, and more cases are needed to test its feasibility. Singular value decomposition (SVD), as an effective signal denoising tool, has been attracting considerable attention in recent years. Reference [17] proposed an approach to integrally address large dimensionality or large instance size by using SVD within a learning algorithm for a one-layer feedforward neural network. However, SVD is mainly used for data dimension reduction and compression storage.
VMD [18] is a recently proposed adaptive signal decomposition method that is widely applied in fault identification due to its excellent noise reduction performance. In VMD, the center frequency and bandwidth of each component are determined by iteratively searching for the optimal solution of the variational model. This method has excellent filtering characteristics and a solid theoretical basis, and offers high precision, fast convergence and good robustness [19], [20], [21]. VMD has also been widely studied and applied in the field of PD. Reference [22] proposed a PD pattern recognition method based on the VMD-Choi-Williams distribution (CWD) spectrum. A PD signal is decomposed into several components by the VMD algorithm, and CWD analysis of the obtained components is carried out. The results show that this method captures the internal characteristics of the fault well. Reference [23] proposed an adaptive denoising algorithm which combines VMD with the Ljung-Box (LB) white noise test. Experiments show that the proposed method can effectively remove white noise and recover the PD signal more accurately. In Reference [24], a denoising method combining singular value decomposition (SVD) and VMD was proposed to eliminate noise in on-site PD signals from high-voltage electrical equipment. The proposed method can effectively eliminate periodic narrowband interference and white noise in different PD signals. In addition, PD characteristics were investigated in [25]. In order to exclude the effect of oscillation components on real-time assessment, VMD is used to extract the primary development trend of the feature parameter. Qin et al. [26] studied PD pattern recognition of 10 kV XLPE cable defects. The entropy values of each modal component were studied by applying VMD to the collected PD signals. Reference [27] proposed a self-adaptive technique for PD signal denoising with automatic threshold determination based on VMD and the wavelet packet transform (WPT). The results show that the proposed method performs better with respect to several performance indexes. However, the above VMD-based methods have the following disadvantages: 1) the mode number K and penalty factor α in some experiments are set manually by experience, which is not convincing enough; 2) some optimization algorithms for VMD suffer from slow convergence, insufficient precision and a tendency to fall into local optima.
An intelligent optimization algorithm is a good solution, since it can adaptively determine appropriate parameters and improve performance. Northern goshawk optimization (NGO) is a new swarm-based algorithm proposed by Dehghani et al. [28] in 2021 and has been shown to be superior to existing algorithms in terms of convergence accuracy and convergence speed. NGO was used to optimize the load power shortage rate and energy storage energy in [29], which reduced the system cost and improved the convergence speed of the model. Santosh Kumar et al. [30] utilized NGO to identify the distinct characteristics of the speech signal based on the extracted features, and the performance of the new system was significantly improved. In addition, NGO performs well in other situations [29], [31]. However, NGO also shares the common problems of swarm-based algorithms, and there is a lot of room for improvement in convergence speed and accuracy. To solve these problems, SCNGO is introduced in the paper. Opposition-based learning of refraction strategy [32] and the sine-cosine strategy [33] are used to improve NGO, and SCNGO-VMD is adopted to optimize the decomposition of PD signals.
There are many machine learning methods to identify PD, such as the artificial neural network, the support vector machine, K-nearest neighbors and so on [34]. Due to the high voltage and danger of the GIS PD experiment, data acquisition is limited and cannot support neural network training, which needs a large sample size. SVM classifies according to the principle of margin maximization, and its classification performance is related not only to the number of training samples but also to their distribution. When the number of training samples is small, the SVM algorithm can better deal with unevenly distributed data. SVM is a small-sample learning method with a solid theoretical foundation. Therefore, this paper chooses SVM for classification and recognition. In addition, the SVM parameters can be optimized through SCNGO.
The remaining parts of the paper are organized as follows: Section II introduces the theory of VMD and NGO. Section III introduces the proposed SCNGO-VMD method and demonstrates the excellent performance of the SCNGO algorithm by using test functions. Section IV extracts the fault signal and carries out the feature extraction. Section V conducts fault identification. Finally, Section VI concludes this paper.

II. A DESCRIPTION OF THEORETICAL BACKGROUND

A. VARIATIONAL MODE DECOMPOSITION
The VMD algorithm generalizes the classical Wiener filter to multiple adaptive bands, and can decompose a signal into multiple smooth IMFs according to the center frequency of each mode. The original fault data are decomposed into IMFs by first constructing a constrained variational problem and then solving it. In the constrained variational problem, µ_k is the IMF component obtained after VMD decomposition of the original signal; K is the number of IMFs; f(t) is the original signal to be decomposed, which in this paper is one of four PD fault signals: corona discharge, particle discharge, floating discharge and air-gap discharge; δ(t) is the Dirac distribution; t is the sampling time; and ω_k is the center frequency corresponding to each IMF component.
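For reference, the constrained variational problem to which these symbols belong is usually written as below. This is the standard VMD formulation from the VMD literature [18], given as an assumed reconstruction because the displayed equation does not appear in this text. The imaginary unit j and the convolution operator * are notation assumed here.

```latex
\min_{\{\mu_k\},\{\omega_k\}} \; \sum_{k=1}^{K}
  \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} \mu_k(t) = f(t)
```

Each term measures the bandwidth of one mode after it has been shifted to baseband around ω_k, and the constraint requires the K modes to sum to the original signal.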

2) SOLVE THE OPTIMAL SOLUTION OF THE CONSTRAINED VARIATIONAL PROBLEM
The specific steps of solving are as follows:
Step 1: The constrained variational problem is transformed into an unconstrained variational problem by introducing a Lagrange multiplier and carrying out the Lagrange transformation. Here, λ is the Lagrange multiplier and α is a second-order penalty factor, which is used to ensure the accuracy of signal reconstruction in the presence of Gaussian noise.
Step 2: The alternating direction method of multipliers is used to update the multiplier, each IMF component and its center frequency. When the iteration termination condition in Step 3 is met, the optimal solution µ_k of the unconstrained model is obtained.
Step 3: When the iteration results satisfy Eq. (6) for a given precision ε > 0, the iteration is terminated, and the K IMFs are output after the inverse Fourier transform.
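The three steps above correspond to the following standard VMD expressions, sketched here under the assumption that the paper follows the usual formulation of [18], since the displayed equations are not present in this text. The hat denotes the Fourier transform, the superscript n is the iteration counter, and ρ is the dual ascent step. The symbol ρ is chosen here only to avoid a clash with the scale τ used later and is not taken from the source.

```latex
% Step 1: augmented Lagrangian
L(\{\mu_k\},\{\omega_k\},\lambda) =
  \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} \mu_k(t) \right\|_2^2
  + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} \mu_k(t) \right\rangle

% Step 2: updates in the frequency domain (the sum uses the most recent iterates)
\hat{\mu}_k^{\,n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{\mu}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}
       {1 + 2\alpha\,(\omega - \omega_k^{n})^2}
\qquad
\omega_k^{\,n+1} =
  \frac{\int_0^{\infty} \omega\, |\hat{\mu}_k^{\,n+1}(\omega)|^2 \, d\omega}
       {\int_0^{\infty} |\hat{\mu}_k^{\,n+1}(\omega)|^2 \, d\omega}
\qquad
\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega)
  + \rho \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{\mu}_k^{\,n+1}(\omega) \right)

% Step 3: termination condition
\sum_{k=1}^{K} \frac{\left\| \hat{\mu}_k^{\,n+1} - \hat{\mu}_k^{\,n} \right\|_2^2}
                    {\left\| \hat{\mu}_k^{\,n} \right\|_2^2} < \varepsilon
```

The mode update has the form of a Wiener filter centered at ω_k, which is why a larger α yields narrower-band modes.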

B. VARIABLE-STEP MULTISCALE SINGLE THRESHOLD SLOPE ENTROPY
Slope entropy (SloEn) is a new nonlinear dynamic method proposed by Cuesta-Frau et al. To portray the time series more accurately, [40], [41], and [42] introduce the variable-step multiscale theory and propose the VSM-StSloEn algorithm, which uses sliding windows to obtain more accurate complexity values and overcomes the insufficient coarse-graining of traditional multiscale entropy. The specific steps of VSM-StSloEn are as follows.
Step 1: A given time series X = {p_1, p_2, ..., p_N} is converted into several variable-step coarse-grained series y^(τ)_{λ,j}, where τ and λ represent the scale and step size, respectively, and y^(τ)_{λ,j} is the j-th element of the λ-th variable-step coarse-grained sequence. When τ = 1, the variable-step coarse-grained sequence is the original time series.
Step 2: The StSloEn is calculated for all variable-step multiscale sequences. Then, for scale factor τ, the mean of the StSloEn values of the sequences y^(τ)_λ is taken as the VSM-StSloEn at that scale.
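Step 1 can be illustrated with a minimal sketch. The coarse-graining formula is not present in this text, so the code assumes the usual variable-step form: a window of length τ is averaged and moved forward by λ samples, with λ ranging from 1 to τ. The function names are hypothetical, and StSloEn itself is not implemented.

```python
def variable_step_coarse_grain(series, scale, step):
    """Average `scale` consecutive points, moving the window by `step` each time.

    Assumed form: y_j = mean(series[j*step : j*step + scale]).
    """
    if scale < 1 or step < 1:
        raise ValueError("scale and step must be positive integers")
    n_windows = (len(series) - scale) // step + 1
    return [
        sum(series[j * step : j * step + scale]) / scale
        for j in range(max(n_windows, 0))
    ]


def variable_step_sequences(series, scale):
    """All coarse-grained sequences at one scale, keyed by step size.

    The range 1..scale for the step size is an assumption, not taken from the text.
    """
    return {
        step: variable_step_coarse_grain(series, scale, step)
        for step in range(1, scale + 1)
    }


x = [1, 2, 3, 4, 5, 6]
sequences = variable_step_sequences(x, 3)
```

With scale 1 the function returns the original series, which matches the statement in Step 1. A step equal to the scale gives the non-overlapping coarse-graining of traditional multiscale entropy, while smaller steps give the overlapping sliding windows.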

C. NORTHERN GOSHAWK OPTIMIZATION
The northern goshawk is a medium-to-large bird of prey with sharp eyes and extremely fast speed, which often catches prey off guard. NGO, proposed by Dehghani in 2021 [28], is a swarm-based algorithm that simulates the behavior of northern goshawks in capturing prey, including searching for and identifying prey, attacking it, and chasing it again when it escapes.

1) INITIALIZATION PROCESS
The first step of the simulation is to treat each goshawk as a vector, so that a group of goshawks forms the population matrix of the algorithm. In the initial phase, the position of each goshawk in the population matrix is initialized randomly.
Here, X is the population matrix of the northern goshawks; X_i is the initial position of the i-th goshawk; x_{i,j} is the position of the i-th goshawk in the j-th dimension; M and N are the total number of goshawks and the dimension of the search space, respectively. For the objective function, F(X) is the column vector of objective function values, and F_i is the value of the objective function corresponding to the position of the i-th goshawk. As the number of iterations increases, the best objective function value found so far is retained until the optimal solution is found.
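The initialization can be sketched as follows. This is only an illustration: the uniform sampling within per-dimension bounds, the function names and the sphere objective are assumptions rather than details taken from the paper.

```python
import random


def init_population(m, lower, upper, rng):
    """Return m goshawks; each is a list holding one coordinate per dimension."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(m)
    ]


def evaluate(population, objective):
    """Objective value F_i for every goshawk X_i (the vector F(X))."""
    return [objective(x) for x in population]


def sphere(x):
    """Hypothetical objective to be minimized."""
    return sum(v * v for v in x)


rng = random.Random(0)
lower, upper = [-5.0] * 3, [5.0] * 3          # N = 3 dimensions
population = init_population(4, lower, upper, rng)  # M = 4 goshawks
fitness = evaluate(population, sphere)
best_index = min(range(len(fitness)), key=fitness.__getitem__)
```

The list `population` plays the role of the M × N population matrix X, and `fitness` plays the role of F(X).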

2) PREY IDENTIFICATION AND ATTACK
Relying on the goshawk's wide field of vision, the exploration phase identifies prey and launches an attack, which involves a global search of the search space to determine the optimal area. In the identification process, P_i is the prey position corresponding to the i-th goshawk; F_{P_i} is the corresponding objective function value; x^{new,P1}_{i,j} is the new position of the i-th goshawk in the j-th dimension; F^{new,P1}_i is the objective function value corresponding to the new position; and the parameters q and E are random numbers used to generate random NGO behavior in search and update, where q ∈ [0, 1] and E = 1 or 2.
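A minimal sketch of this phase, following the update rule of the original NGO algorithm [28] as commonly stated: the prey is another randomly chosen goshawk, the goshawk moves toward it if the prey has a better objective value and away from it otherwise, and the move is kept only if it improves the objective. Drawing E once per goshawk and q once per dimension, and clamping to the bounds, are assumptions made here.

```python
import random


def exploration_step(population, fitness, objective, lower, upper, rng):
    """One pass of prey identification and attack (minimization, in place)."""
    m = len(population)
    for i in range(m):
        # The prey P_i is the position of another randomly selected goshawk.
        k = rng.choice([idx for idx in range(m) if idx != i])
        prey, prey_fit = population[k], fitness[k]
        e = rng.choice((1, 2))
        candidate = []
        for j, x in enumerate(population[i]):
            q = rng.random()
            if prey_fit < fitness[i]:
                new = x + q * (prey[j] - e * x)   # move toward better prey
            else:
                new = x + q * (x - prey[j])       # move away from worse prey
            candidate.append(min(max(new, lower[j]), upper[j]))
        cand_fit = objective(candidate)
        if cand_fit < fitness[i]:                 # greedy acceptance
            population[i], fitness[i] = candidate, cand_fit
    return population, fitness


def sphere(x):
    """Hypothetical objective to be minimized."""
    return sum(v * v for v in x)


rng = random.Random(1)
lower, upper = [-5.0] * 2, [5.0] * 2
population = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(6)]
fitness = [sphere(x) for x in population]
before = list(fitness)
exploration_step(population, fitness, sphere, lower, upper, rng)
```

Because of the greedy acceptance, no goshawk ever ends the step with a worse objective value than it started with.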

3) CHASE AND ESCAPE OPERATION
After the northern goshawk attacks the prey, the prey tries to escape, and the northern goshawk continues to chase it. Owing to its extreme speed, the northern goshawk can chase and eventually catch its prey in almost any situation. Simulating this behavior improves the local search ability in the search space. In this phase, x^{new,P2}_i is the new position of the i-th northern goshawk in the exploitation phase; R is the radius of the chase area; η is the current iteration number; T is the maximum number of iterations; and F^{new,P2}_i is the objective function value of the new position in the exploitation phase.
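The chase phase can be sketched in the same way. The radius R = 0.02(1 − η/T) and the perturbation x + R(2r − 1)x follow the original NGO algorithm [28] as commonly stated and are assumed here, because the displayed equations are not present in this text. The function names are hypothetical.

```python
import random


def chase_radius(iteration, max_iterations, scale=0.02):
    """Radius R of the chase area; it shrinks linearly to zero."""
    return scale * (1.0 - iteration / max_iterations)


def exploitation_step(population, fitness, objective, iteration, max_iterations, rng):
    """One pass of the chase and escape phase (minimization, in place)."""
    radius = chase_radius(iteration, max_iterations)
    for i, position in enumerate(population):
        # Perturb every coordinate by at most radius * |x| around its current value.
        candidate = [x + radius * (2.0 * rng.random() - 1.0) * x for x in position]
        cand_fit = objective(candidate)
        if cand_fit < fitness[i]:                 # greedy acceptance
            population[i], fitness[i] = candidate, cand_fit
    return population, fitness


def sphere(x):
    """Hypothetical objective to be minimized."""
    return sum(v * v for v in x)


rng = random.Random(2)
population = [[rng.uniform(-5.0, 5.0) for _ in range(2)] for _ in range(6)]
fitness = [sphere(x) for x in population]
before = list(fitness)
exploitation_step(population, fitness, sphere, iteration=10, max_iterations=100, rng=rng)
```

Since the perturbation is proportional to the current coordinate and the radius shrinks with the iteration count, the search becomes increasingly local as the run proceeds.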

III. THE PROPOSED METHODS (SCNGO-VMD)

A. OPPOSITE LEARNING OF REFRACTION
To address the tendency of the NGO algorithm to fall into local extrema in the late optimization period, which leads to insufficient convergence accuracy, the paper adopts the opposite learning of refraction mechanism to initialize the NGO population. Opposition-based learning is an optimization strategy proposed by Tizhoosh [32]. The basic idea is to expand the search scope by calculating the opposite solution of the current solution, so as to find a better alternative solution for a given problem. However, opposition-based learning still has some shortcomings. Introducing opposite learning in the early stage of optimization can enhance the convergence performance of the algorithm, but it easily causes premature convergence in the later stage. Therefore, a refraction principle [35] is introduced into the opposition-based learning strategy to reduce the probability of the algorithm falling into premature convergence in the late search period. The principle of opposite learning of refraction is shown in Fig. 1, where the optimization range of the solution on the x-axis is [l, u]; the y-axis is the normal; α and β represent the incidence angle and refraction angle, respectively; h and h* are the corresponding lengths of the incident and refracted rays; and O is the midpoint of the optimization range [l, u].
According to the geometric relationships in Fig. 1, sin α and sin β can be written down directly. By defining the refractive index as n = sin α / sin β, an equation for n is obtained. Taking the scaling factor k = h/h* and substituting it into Eq. (18) gives the deformation equation. When n = 1 and k = 1, Eq. (17) reduces to the ordinary opposition-based learning equation. When Eq. (19) is extended to the high-dimensional space of the NGO algorithm with n = 1, the refraction opposite position is obtained, where x_{i,j} is the position of the i-th goshawk in the population in the j-th dimension (i = 1, 2, ..., M; j = 1, 2, ..., N), M is the population size and N is the dimension; x*_{i,j} is the refraction opposite position of x_{i,j}; and l_j and u_j are the minimum and maximum values of the j-th dimension of the search space, respectively.
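The derivation can be reconstructed from the definitions above as follows. This is a sketch based on the usual refraction opposition-based learning geometry, with x the current solution and x* its refraction opposite on the x-axis. The equations are deliberately left unnumbered because the numbering in the source cannot be recovered from this text.

```latex
\sin\alpha = \frac{(l+u)/2 - x}{h}, \qquad
\sin\beta  = \frac{x^{*} - (l+u)/2}{h^{*}}

n = \frac{\sin\alpha}{\sin\beta}
  = \frac{h^{*}\left[(l+u)/2 - x\right]}{h\left[x^{*} - (l+u)/2\right]}

% with k = h / h^{*}:
x^{*} = \frac{l+u}{2} + \frac{l+u}{2kn} - \frac{x}{kn}

% n = 1 and k = 1 recover ordinary opposition-based learning:
x^{*} = l + u - x

% high-dimensional form with n = 1:
x^{*}_{i,j} = \frac{l_j + u_j}{2} + \frac{l_j + u_j}{2k} - \frac{x_{i,j}}{k}
```

The third line follows from the second by writing kn = [(l+u)/2 − x] / [x* − (l+u)/2] and solving for x*.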
The flow of Algorithm One is as follows:
(1) M northern goshawk positions x_{i,j} are randomly initialized as the initial population.
(2) The refraction opposite position x*_{i,j} of each goshawk is calculated to form the refraction opposite population.
(3) The initial population and the refraction opposite population are combined, and the best M northern goshawks are selected as the initial population according to the fitness value.
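The initialization flow above can be sketched as follows. The opposite position uses the refraction form x* = (l + u)/2 + (l + u)/(2kn) − x/(kn), which is an assumption based on the usual refraction opposition-based learning derivation. The function names, the clamping to the bounds and the sphere objective are likewise illustrative.

```python
import random


def refraction_opposite(position, lower, upper, k=1.0, n=1.0):
    """Refraction opposite of one position; k = n = 1 gives x* = l + u - x."""
    kn = k * n
    return [
        (lo + hi) / 2.0 + (lo + hi) / (2.0 * kn) - x / kn
        for x, lo, hi in zip(position, lower, upper)
    ]


def refraction_init(m, lower, upper, objective, rng, k=1.0):
    """Random population plus its refraction opposites; keep the m fittest."""
    population = [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(m)
    ]
    opposites = [
        # Clamp in case k < 1 pushes an opposite point outside the bounds.
        [min(max(v, lo), hi) for v, lo, hi in zip(refraction_opposite(x, lower, upper, k), lower, upper)]
        for x in population
    ]
    merged = sorted(population + opposites, key=objective)  # minimization
    return merged[:m]


def sphere(x):
    """Hypothetical objective to be minimized."""
    return sum(v * v for v in x)


rng = random.Random(3)
initial = refraction_init(5, [-5.0, -5.0], [5.0, 5.0], sphere, rng, k=2.0)
```

Selecting from the merged set means the starting population is never worse, in terms of its best member, than the purely random one.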

B. SINE COSINE ALGORITHM
In the process of goshawk predation, the location of the food source plays a very important role, affecting the direction of the whole population. However, food sources may differ and their locations may not be the same, so when the search for food is trapped in a local optimum, the whole population stagnates. This results in a loss of diversity in population positions, which increases the possibility of falling into local extrema. In view of this phenomenon, the sine cosine algorithm (SCA) [33] is introduced in the position update of NGO. By using the oscillating characteristics of the sine and cosine model to affect the location of discoverers, the diversity of discoverers is maintained, and the global search ability of NGO is improved.
The step search factor of the basic sine cosine algorithm, r_1 = a − a·t/Iter_max (where a is a constant, set to a = 1 in this paper, and t is the current iteration number), decreases linearly, which is not conducive to further balancing the global search and local development capabilities of NGO. The step search factor is therefore improved, and its curve is shown in Fig. 2. The new nonlinearly decreasing search factor, given in Eq. (22), has a large weight and a slow decline rate in the early stage, which is conducive to improving the global optimization ability. In the later stage, the weight factor is small, which strengthens the local development of the algorithm and accelerates convergence to the optimal solution.
Throughout the search process of NGO, the position update of each individual is affected by its current position. Therefore, the nonlinear weight factor ω of Eq. (23) is introduced to adjust how strongly the updated position depends on the individual's current information. In the early stage of optimization, a smaller ω reduces the influence of the current position on the update and improves the global optimization ability of the algorithm. In the later stage, a larger ω exploits the strong dependence between the current position and the updated position to accelerate convergence. The change curve is shown in Fig. 2, and the new position update is given in Eq. (24).
In Eq. (24), x_best is the current overall optimal position; r_2 determines how far the northern goshawks move, with r_2 ∈ [0, 2π]; and r_3 controls the influence of the optimal individual on the next position of the northern goshawk. The superiority of NGO has been demonstrated in [28]. By combining opposite learning of refraction and SCA to modify NGO, a new northern goshawk optimization (SCNGO) is proposed. SCNGO is now compared with some algorithms proposed in recent years, namely Dung beetle optimization (DBO) [36], Golden eagle optimization (GEO) [37], Rime-ice optimization (RIME) [38], and Beluga whale optimization (BWO) [39]. All algorithms are iterated 1000 times, and the number of particles for each algorithm is set to 100. The test functions are shown in Table 1, and the test results are shown in Fig. 3 and Table 2. From the test of the peak functions above, it is concluded that SCNGO performs well compared with the other algorithms, with fast convergence and strong optimization ability.
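A minimal sketch of a weighted sine cosine position update is given below. It assumes the standard SCA form with an added weight on the current position, x_new = ω·x + r_1·sin(r_2)·|r_3·x_best − x|, and a cosine branch chosen with probability 0.5. The nonlinear forms of r_1 and ω from Eqs. (22) and (23) are not reproduced in this text, so the linear r_1 of the basic algorithm is used and ω is passed in as a plain argument. The range [0, 2] for r_3 and all function names are assumptions.

```python
import math
import random


def linear_step_factor(t, max_iter, a=1.0):
    """Step search factor of the basic SCA: r1 = a - a * t / Iter_max."""
    return a - a * t / max_iter


def sca_update(position, best, r1, weight, rng):
    """Weighted sine cosine update of one position toward or around `best`."""
    updated = []
    for x, b in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)            # assumed range, as in the basic SCA
        wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
        updated.append(weight * x + r1 * wave * abs(r3 * b - x))
    return updated


rng = random.Random(4)
r1 = linear_step_factor(t=100, max_iter=1000)
moved = sca_update([1.0, -2.0], [0.5, 0.5], r1, weight=0.8, rng=rng)
```

When r_1 reaches zero the oscillating term vanishes and the update depends only on the weighted current position, which is the sense in which a small step factor favors local refinement.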

C. SCNGO-VMD
In summary, opposite learning of refraction and SCA are introduced to improve NGO. The penalty parameter α and decomposition parameter K of VMD are optimized by SCNGO, and the minimum envelope entropy is used as the fitness function. Finally, the optimal parameters are obtained to complete the optimization process. The flow chart of SCNGO-VMD is shown in Fig. 4.
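The envelope entropy used as the fitness function can be sketched as follows. The definition is not spelled out in this text, so the code assumes the common one: the envelope a(j) is the magnitude of the analytic signal, it is normalized to p_j = a(j)/Σa(j), and the entropy is −Σ p_j ln p_j. The natural logarithm and the plain O(N²) DFT, used to keep the example dependency-free, are choices made here.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform, adequate for short illustrative signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n) for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def envelope(signal):
    """Magnitude of the analytic signal (discrete Hilbert transform via the DFT)."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0          # DC and Nyquist bins are kept once
        elif k < (n + 1) // 2:
            gain = 2.0          # positive frequencies are doubled
        else:
            gain = 0.0          # negative frequencies are removed
        spectrum[k] *= gain
    return [abs(z) for z in dft(spectrum, inverse=True)]


def envelope_entropy(signal):
    """Shannon entropy of the normalized envelope; lower means more impulsive."""
    env = envelope(signal)
    total = sum(env)
    if total == 0.0:
        raise ValueError("signal has zero envelope")
    probs = [a / total for a in env]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


n = 32
cosine = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
impulse = [0.0] * n
impulse[5] = 1.0
entropy_cosine = envelope_entropy(cosine)
entropy_impulse = envelope_entropy(impulse)
```

A steady tone has a flat envelope and therefore the maximum entropy ln N, while an impulsive signal concentrates the envelope in a few samples and scores lower. This is why minimizing envelope entropy favors decompositions whose modes carry pulse-like fault content.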

IV. GIS PD EXPERIMENT
Fig. 5 shows four typical fault models of PD: corona discharge, particle discharge, floating discharge, and air-gap discharge. As shown in Fig. 5(a), the corona discharge is produced by a simulated protruding tip with a radius of 15 µm and a diameter of 1 mm. To simulate the free particle discharge, small spheres with a diameter of 2.0 mm are placed on a concave grounding electrode, and the high-voltage electrode is fixed at a distance of 10 mm from the grounding electrode, as represented in Fig. 5(b). To simulate floating discharge, epoxy with a thickness of 10 mm is placed, and the suspended material sits directly between the high-voltage electrode and the epoxy. The height of the aluminum is about 4 mm, and there is a certain gap between the high-voltage electrode and the aluminum, as shown in Fig. 5(c). For air-gap discharge, there is a small gap between the epoxy and the high-voltage electrode, as shown in Fig. 5(d). The time-domain pulse PD signals of the four different defects are collected by UHF sensors.
Fig. 6 shows the experimental GIS cavity and related equipment and experimental wiring circuit.
In order to ensure that as much PD data as possible can be collected under each defect model, the test voltage is uniformly increased to above the initial discharge voltage through the voltage regulator, and the step size is 1 kV to prevent the GIS from being broken down. The oscilloscope has a maximum sampling rate of 50 MSa/s. Through the experiment, the initial discharge voltages of corona discharge, particle discharge, floating discharge, and air-gap discharge     3.
The evolution curves of different algorithms are shown in Fig. 8. They indicate that SCNGO is superior in convergence speed and accuracy.
The optimization results of the SCNGO-VMD parameters for the four types of PD signals are shown in Table 4.

B. SELECTION OF IMFS
The original signals are decomposed by SCNGO-VMD to obtain K IMFs. The more fault features an IMF component contains, the smaller its envelope entropy (E_p) value will be. In addition, PD faults are accompanied by strong pulse signals, resulting in an increase in kurtosis (Q). At the same time, in order to determine the correlation between the IMFs and the original signal, the Pearson correlation coefficient (M) is introduced. Based on these three indexes related to fault characteristics, namely envelope entropy, kurtosis and correlation coefficient, the paper proposes a composite index F for screening IMFs, which is defined in Eq. (25). The composite index F can reasonably evaluate the amount of fault feature information contained in each IMF component. From Eq. (25), the larger the value of F, the richer the fault feature information contained in the IMF. The IMFs with larger composite index values are selected as the effective IMFs to complete the screening process. Taking particle discharge as an example, Fig. 9 shows the results of SCNGO-VMD decomposition, and Table 5 lists the calculation results of the corresponding composite index.
The three IMFs with the highest composite index are selected as effective components. In this case, IMF1, IMF2, and IMF3 are selected for corona discharge, particle discharge, and air-gap discharge, while IMF2, IMF3, and IMF4 are selected for floating discharge.
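Two of the three ingredients of the composite index, kurtosis Q and the Pearson correlation coefficient M, can be computed as in the sketch below. The way they are combined with the envelope entropy in Eq. (25) is not reproduced in this text and is therefore not implemented. The non-excess (fourth standardized moment) form of kurtosis is an assumption, and the function names are hypothetical.

```python
import math


def kurtosis(x):
    """Fourth standardized moment; large for impulsive signals."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


flat = [-1.0, 1.0, -1.0, 1.0]
spiky = [0.0] * 7 + [10.0]
q_flat, q_spiky = kurtosis(flat), kurtosis(spiky)
m_linear = pearson([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
```

In an IMF screening loop, `kurtosis` would be applied to each IMF and `pearson` to each IMF paired with the original signal, so that impulsive components that still resemble the source signal score highly on both.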

C. FEATURE EXTRACTION

1) CONSTRUCTION OF MIXED FEATURES
Entropy is an important feature of fault signals, which can accurately reflect the operation of equipment in different states. The threshold parameter γ of VSM-StSloEn affects the entropy value. To overcome the threshold selection problem, the paper utilizes SCNGO to optimize the threshold γ of VSM-StSloEn. The specific optimization steps are shown in [40].
In signal feature extraction, time-domain and frequency-domain analysis are the simplest and most basic methods, and PD faults can also be represented by time-domain and frequency-domain features. The IMF components obtained by SCNGO-VMD decomposition contain clearer and more accurate fault information than the original signal. Therefore, for each effective IMF component, 6 time-domain features T_i and 2 frequency-domain features S_i are extracted to construct a time-frequency domain mixed feature, where i represents the i-th IMF component after SCNGO-VMD decomposition. The selected time-frequency domain features are shown in Table 6.
In summary, time-domain, frequency-domain, and entropy features each represent the fault characteristics of GIS equipment from one aspect. Therefore, a high-dimensional mixed domain feature Q = [T, S, E] is established by combining the three features. The mixed domain feature provides a comprehensive representation of fault information for fault diagnosis.

2) T-SNE DIMENSIONALITY REDUCTION
As an unsupervised nonlinear manifold learning algorithm [43], t-SNE can fully extract low-dimensional sensitive feature information from high-dimensional data, thereby achieving dimensionality reduction and secondary feature extraction of the data. It uses conditional probability distributions to model both the original high-dimensional data and the low-dimensional data in the embedded space, mapping the high-dimensional data to the low-dimensional space while keeping the probability distribution as unchanged as possible. t-SNE is used to reduce the dimension of the 69-dimensional mixed feature set, and a 3-dimensional feature vector is obtained, as shown in Fig. 11.
It is indicated that after t-SNE dimensionality reduction, the four types of data samples are clearly separated, and sample points under the same fault cluster together. Sample data under different fault states are far apart in the low-dimensional space, which meets the requirements of feature stability well.

V. PD FAULT IDENTIFICATION
A total of 320 groups of PD fault training samples are input into the SVM classifier for training, and the SVM model for fault diagnosis is obtained. SCNGO is used to optimize the penalty factor c and the kernel parameter g of the SVM. In this step, the population size is set to 20, the maximum number of iterations is set to 50, c ∈ [1, 2000], and g ∈ [1, 200]. The optimization results are shown in Table 7. The remaining 80 sets of test samples are classified using the trained SVM model, and the classification results are shown in Fig. 12. As Fig. 12 shows, particle discharge and air-gap discharge are all correctly identified, one set of corona discharge is misidentified as particle discharge, and two sets of floating discharge are misidentified as air-gap discharge. The overall identification accuracy reaches 96.25%.
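The reported accuracy follows directly from the counts stated above, with three misidentified sets out of 80 test samples:

```latex
\text{accuracy} = \frac{80 - (1 + 2)}{80} = \frac{77}{80} = 96.25\%
```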
In order to verify the effectiveness and superiority of SCNGO-VMD-t-SNE-SVM, it is compared with other methods, and the results are shown in Table 8.
Comparing #1 and #2, the fault identification accuracy and running time of SCNGO are significantly better, which reflects the superiority of the opposition-based learning of refraction strategy and SCA proposed in this paper. It can be seen from #1, #3 and #4 that dimensionality reduction greatly improves the performance, and that the dimensionality reduction effect of t-SNE is better than that of PCA. From #1, #5, #6 and #7, it is indicated that SCNGO still has a large advantage over DBO, GEO and other algorithms. From #1, #8, #9, and #10, it can be concluded that SVM performs better than ELM, BP and PNN in this case of small sample data.

VI. CONCLUSION
The paper presents a new fault identification model for GIS PD. Firstly, aiming at the shortcomings of NGO, two strategies are introduced to improve its performance. Then, SCNGO-VMD-SVM is used to realize adaptive parameter optimization. Effective IMFs are screened using the composite index F. Meanwhile, the mixed features are constructed, and the t-SNE algorithm is used to reduce the dimension of the high-dimensional mixed features. Finally, the feature vectors are input to SCNGO-SVM to realize the identification of GIS PD faults. The conclusions obtained are as follows:
1) By introducing the opposition learning of refraction strategy and SCA, the slow convergence of NGO and its tendency to fall into local optima are clearly improved.
2) The construction of mixed features and the dimensionality reduction of t-SNE greatly improve the performance of the algorithm model.
3) Compared with other models, the classification accuracy of SCNGO-VMD-t-SNE-SVM reaches 96.25%, and its running speed is the fastest, which demonstrates the superiority and robustness of the proposed method.

FIGURE 1. The principle of opposite learning of refraction.

FIGURE 10. Mean of the entropy values at different scales.
FIGURE 3. Results of test functions.

TABLE 6. Time-domain and frequency-domain features.
FIGURE 12. Classification results of the test samples.

TABLE 8. Comparative results of some algorithms.