A Power Transformer Fault Diagnosis Method-Based Hybrid Improved Seagull Optimization Algorithm and Support Vector Machine

Support Vector Machine (SVM) methods based on Dissolved Gas Analysis (DGA) have been studied in the field of power transformer fault diagnosis. However, some shortcomings remain: the boundaries of DGA data are fuzzy, and the SVM parameters are difficult to determine. Therefore, this paper proposes a power transformer fault diagnosis method based on Kernel Principal Component Analysis (KPCA) and a hybrid improved Seagull Optimization Algorithm that optimizes the SVM (TISOA-SVM). First, KPCA is used to extract features from the DGA feature quantities. Next, TISOA is proposed to optimize the SVM parameters and build the optimal SVM-based diagnosis model. Three improvements are made to the SOA. An improved tent map replaces the original population initialization to improve population diversity. In addition, a nonlinear inertia weight and a random double helix formula are proposed to improve the optimization accuracy and efficiency of the SOA. Then, benchmark functions are used to test the optimization performance of TISOA and six other algorithms, and the results show that TISOA has the best optimization accuracy and convergence speed. Finally, the fault diagnosis method based on KPCA and TISOA-SVM is obtained, and three examples are tested to verify its diagnostic performance. The results show that the proposed method has higher diagnostic accuracy, shorter diagnosis time, and stronger significance and validity than other methods. Therefore, a research idea is provided for solving practical engineering problems in the field of fault diagnosis.


I. INTRODUCTION
The power transformer is one of the most important pieces of equipment in the power grid, and the failure of a power transformer will affect the stable operation of a power grid, resulting in very large economic losses and even endangering people's lives [1], [2]. On June 19, 2017, a transformer explosion accident, which caused great economic losses, one death and seven passer-by injuries, occurred in Yongzhou City, Hunan Province, China [3]. Thus, it is crucial to perform fault diagnosis of power transformers quickly and accurately to ensure safe and stable operation [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Hiram Ponce.
In the field of power transformer fault diagnosis, dissolved gas analysis (DGA) is the most versatile method [5]. There are several traditional DGA methods: the three-ratio method, Rogers ratio method, and Duval triangle method [6]- [9]. Nevertheless, the above methods mostly rely on manual experience and historical data. They encounter defects, such as incomplete coding and coding boundaries, so these methods cannot perform fault diagnosis of power transformers quickly and accurately. With the development of artificial intelligence, many new methods, such as artificial neural network (ANN) [10], [11], support vector machine (SVM) [12], [13], and fuzzy theory [14], [15], have been proposed and applied in the field of power transformer fault diagnosis.
Zhou et al. proposed a transformer fault diagnosis model based on improved gray wolf optimization (IGWO) and a probabilistic neural network (PNN). Comparison with the simulation results of eleven models demonstrated the stability of the proposed model and a significant improvement in diagnostic accuracy. However, the authors did not study the initialization of GWO. A poor initial population position degrades the optimization performance, so improving only the update strategy cannot reliably improve GWO [16]. Kim et al. proposed a Semisupervised Autoencoder with an Auxiliary Task (SAAT) for power transformer fault diagnosis using DGA. The results show that the fault diagnosis method based on SAAT has good diagnostic performance, and the proposed health feature space (HFS) supports intuitive monitoring of the health state of transformers. However, the method only reflects the health of transformers and cannot identify specific fault types, so it cannot be applied to accurate fault identification [17]. Tahir et al. developed an intelligent monitoring and classification algorithm for detecting transformer winding faults based on frequency response analysis (FRA) and analyzed it with four cases. The model can accurately identify various winding states in the transformer. However, the authors did not address the difficulty of determining the ANN parameters, and the diagnostic accuracy of the model cannot be optimized by using an ANN alone [18]. Yuan et al. proposed a transformer fault diagnosis model based on chemical reaction optimization (CRO) and twin support vector machines (TWSVMs). The results show that the model achieves both high simulation accuracy and fast diagnosis. However, CRO suffers from premature convergence and poor optimization accuracy. The authors did not study these shortcomings, and the results were not compared with recent research methods. Moreover, the improvement in diagnostic accuracy over the traditional diagnosis model is small [19].
The literature [10]-[20] can be summarized as follows. ANN requires a large amount of sample data, and the validity of the resulting model is not strong; fuzzy theory has poor stability and robustness. Compared with ANN and fuzzy theory, SVM, which is suitable for nonlinear and small-sample data, is not only more effective but also more widely used in the field of fault diagnosis [20]. Thus, this paper selects SVM as the basic model. SVM is a classification algorithm with a well-founded mathematical model based on statistical learning theory. SVM generalizes well, but its parameters remain difficult to determine [21], [22]. Therefore, many optimization algorithms, such as the whale optimization algorithm (WOA) [23], differential evolution (DE) [24], particle swarm optimization (PSO) [25], and GWO [26], have been successively applied to SVM transformer fault diagnosis models to find the optimal parameters.
Power transformer fault diagnosis based on SVM has been studied by many scholars at home and abroad. Zeng et al. proposed a transformer fault diagnosis model based on the IGWO-least squares SVM (LSSVM) model and used kernel principal component analysis (KPCA) to extract features from mixed DGA data. The results show that the model has high diagnostic performance. Nevertheless, DE has some defects, such as slow convergence and a tendency to fall into local optima, which the authors did not address. Thus, the performance improvement of the diagnosis model is not distinct [27]. Li et al. proposed a transformer fault diagnosis model based on an improved sparrow search algorithm (ISSA)-SVM, and the constructed model can accurately diagnose transformer faults. However, the population diversity of SSA is poor and traditional DGA data are noisy, and the authors did not study these problems further [28]. Benmahamed et al. proposed a transformer fault diagnosis system that uses Gaussian classification and the bat algorithm (BA) to optimize SVM, with gas concentration and gas percentage as the two inputs for diagnosing faults. The results show that the diagnostic accuracy of BA-SVM reached 93.75%. However, the traditional BA easily falls into local optima and has poor population diversity, so it cannot find the optimal parameters of the SVM. Improving BA by studying its population behavior would improve the diagnostic performance of the proposed model [29]. Zhang et al. used a genetic algorithm (GA) combined with an SVM to binary code the DGA feature quantities, used an improved krill herd algorithm (IKH) to optimize the SVM parameters, and thus built a fault diagnosis model based on IKHSVM. The results show that the IKHSVM can improve the accuracy of transformer fault diagnosis. However, the authors only added adaptive random disturbance to the krill population; improving the initialization of the KH population as well would increase the optimization performance and the stability of the diagnostic model [30].
Three shortcomings of transformer fault diagnosis based on SVM are summarized: 1) a single fault diagnosis model cannot greatly improve the fault diagnosis performance; 2) the noise of transformer fault data will reduce the stability of the model; 3) research on optimization algorithms is not targeted and cannot significantly improve the optimization performance. Thus, a transformer fault diagnosis method based on the KPCA and TISOA-SVM method is proposed in this paper. It is noteworthy that the innovations and contributions of this paper are mainly divided into the following four improved methods. First, KPCA is used to extract the features of DGA data to reduce the influence of noise on the diagnosis results. In addition, SOA can be improved by the following three methods to obtain the TISOA. A modified tent map (MTent) is proposed to improve the initial diversity of the seagull population. Considering the influence of the inertia weight of SOA on population convergence and the optimization mode, a nonlinear inertia weight method, which can effectively improve the convergence speed of the SOA, is proposed, and a random double helix foraging formula is proposed to improve the optimization accuracy of the SOA. Then, it is noteworthy that TISOA can be obtained from the above three improved methods, and the benchmark functions are used to test the optimization performance of TISOA and the other six algorithms. The results show that TISOA has the best optimization performance. Finally, three examples were used to test the diagnostic performance of the proposed method. The results show that the proposed method has the highest diagnostic accuracy, the shortest diagnosis time, the best validity, and the strongest significance. These results can prove that the proposed method has the best diagnostic performance.
The rest of this paper is organized as follows. Section II introduces the basic theory of KPCA. Section III introduces the basic theory of SVM and SOA and the improvement methods of SOA, and tests the optimization performance of TISOA. Section IV describes the diagnosis model proposed in this paper. Section V analyzes the diagnosis results through three examples. Section VI summarizes the research conclusions and shortcomings of this paper and proposes future research directions.

II. BASIC THEORY OF KPCA
KPCA is a feature extraction method based on principal component analysis (PCA) [31]. PCA can extract only the features of linear data; KPCA overcomes this limitation by introducing a kernel function. Relying on the kernel function, KPCA maps nonlinear data to a higher-dimensional space, extracts the eigenvalues and eigenvectors of the mapped data, and obtains the required principal components by PCA [32], [33]. The mathematical model is as follows, and the flow of data processing is shown in Figure 1.
For data samples x_i (i = 1, 2, ..., m), a nonlinear function ϕ(x_i) maps the original sample data to a high-dimensional space, and the covariance matrix C of Eq. (1) is constructed.
where λ is an eigenvalue of C and V is the corresponding eigenvector.
An N-order matrix K is defined as follows.
The above derivation assumes that the mapped data have zero mean. In most cases this assumption does not hold, so K is centered according to Eq. (6), in which the N-order auxiliary matrix has every entry equal to 1/N. Eq. (6) ensures that the zero-mean assumption holds for the mapped data.
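The centering step of Eq. (6) can be illustrated with a minimal Python sketch. It assumes the usual centering rule (subtract the row and column means of K and add back the overall mean) and the Gaussian form exp(-||x - y||^2 / (2σ^2)) for the RBF kernel. The function names `rbf_kernel`, `kernel_matrix` and `center_kernel` and the toy samples are illustrative assumptions, not taken from this paper.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel; the exp(-||x-y||^2 / (2*sigma^2)) form is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(samples, sigma):
    """N x N matrix K with K[i][j] = k(x_i, x_j)."""
    n = len(samples)
    return [[rbf_kernel(samples[i], samples[j], sigma) for j in range(n)]
            for i in range(n)]

def center_kernel(K):
    """Center K in feature space: K - 1N*K - K*1N + 1N*K*1N, where 1N has entries 1/N."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]                             # entries of K*1N
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]  # entries of 1N*K
    total_mean = sum(row_mean) / n                                     # entries of 1N*K*1N
    return [[K[i][j] - col_mean[j] - row_mean[i] + total_mean
             for j in range(n)] for i in range(n)]

# Toy samples (hypothetical values, not DGA data).
samples = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.7], [0.3, 0.8]]
K_centered = center_kernel(kernel_matrix(samples, sigma=8.0))
print([round(sum(row), 12) for row in K_centered])
```

After centering, every row and every column of the matrix sums to zero up to rounding, which is the kernel-space counterpart of zero-mean mapped data.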
SVM has a strong classification ability [34]. Its principle is to construct the optimal hyperplane in space to classify the sample data. The kernel function is used to map x_i to the high-dimensional space ϕ(x_i), in which the optimal hyperplane is constructed. The model is as follows: where ω is the weight vector and b is the bias. The slack variable ξ_i is introduced to construct the maximum-margin classifier.
where ξ_i ≥ 0 and C is a penalty factor that balances ξ_i. The Lagrange multiplier λ_i is introduced to turn the original constrained problem into a simpler dual problem.
where λ_i = 0 for part of the samples. Solving Eq. (10) yields the optimal ω and b, which are substituted into Eq. (7) to obtain the optimal decision model.
in which K(·, ·) is the kernel function. In this paper, the radial basis function (RBF) is selected as the kernel function of SVM and KPCA. RBF has only one parameter and handles nonlinear data well. The expression is as follows: where σ is the kernel function parameter. To establish the transformer fault diagnosis model, TISOA is used to find the parameters C and σ, and KPCA is used to process the data.
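As a hedged illustration of how a kernel SVM of this kind classifies a sample, the sketch below evaluates the standard dual decision rule sign(Σ λ_i y_i K(x_i, x) + b). The support vectors, labels, multipliers and bias are hypothetical hand-picked values rather than a trained model, and the Gaussian form of the RBF kernel is an assumption.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel; the exact form used in the paper is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svm_decision(x, support_vectors, labels, multipliers, b, sigma):
    """Return +1 or -1 from the kernel decision rule sign(sum_i lam_i * y_i * K(x_i, x) + b)."""
    score = b
    for sv, y, lam in zip(support_vectors, labels, multipliers):
        score += lam * y * rbf_kernel(sv, x, sigma)
    return 1 if score >= 0 else -1

# Hypothetical model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [1, -1]
multipliers = [1.0, 1.0]   # nonzero Lagrange multipliers of the support vectors

print(svm_decision([0.1, 0.1], support_vectors, labels, multipliers, b=0.0, sigma=1.0))
print(svm_decision([1.9, 2.0], support_vectors, labels, multipliers, b=0.0, sigma=1.0))
```

Only samples with nonzero multipliers contribute to the sum. The rule shown is binary; how the six fault types are combined into a multi-class decision is not specified in this section.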

B. SOA
SOA is a new metaheuristic optimization algorithm proposed by Gaurav Dhiman and Vijay Kumar in 2018 [35], [36]. The basic principle of SOA is to construct optimization rules by studying the migration and foraging behaviors of seagull populations. Migratory behavior refers to the migration of seagull populations from one place of residence to the next; foraging behavior refers to the behavior of seagull population predation in their residence. The optimization strategy of SOA is as follows and the optimization process is shown in step (a) of Figure 4.

1) MIGRATION
The migration behavior of seagulls should meet three conditions: avoiding collisions, moving toward the direction of the best neighbor, and remaining close to the best search agent.
1) Avoiding collisions. To avoid collisions among seagulls, SOA uses the variable A to adjust the position of each seagull.
where C_s represents a new location that does not collide with other seagulls; P_s represents the current location of the seagull; and A represents the seagull's movement behavior in a given search space.
where f_c is set to 2; t is the current iteration number; and T_max is the maximum number of iterations.
2) Movement toward the best neighbor's direction. After ensuring that there is no collision between seagulls, all seagulls move toward the best position.
where M_s indicates that the seagull moves toward P_best, the best-positioned seagull. B is a random value that balances exploration and exploitation.
3) Remaining close to the best search agent. After the convergence direction of each seagull is calculated, the seagull begins to move.
where D_s represents the new position of the seagull after moving.
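The three migration conditions can be sketched in Python as follows. Because Eqs. (13)-(17) are referenced here by number only, the sketch assumes the update rules of the original SOA by Dhiman and Kumar: C_s = A·P_s, A = f_c - t·(f_c/T_max), M_s = B·(P_best - P_s), B = 2·A²·rd with rd uniform in [0, 1], and D_s = |C_s + M_s|. The function names are illustrative.

```python
import random

def linear_inertia_weight(t, t_max, f_c=2.0):
    """A = f_c - t * (f_c / T_max): decreases linearly from f_c to 0."""
    return f_c - t * (f_c / t_max)

def migrate(position, best_position, t, t_max, f_c=2.0, rng=random):
    """One migration step for a single seagull (assumed standard SOA rules)."""
    A = linear_inertia_weight(t, t_max, f_c)
    rd = rng.random()                     # random number in [0, 1)
    B = 2.0 * A ** 2 * rd                 # balances exploration and exploitation
    C_s = [A * p for p in position]                                  # collision avoidance
    M_s = [B * (pb - p) for pb, p in zip(best_position, position)]   # move toward the best
    D_s = [abs(c + m) for c, m in zip(C_s, M_s)]                     # new position after moving
    return D_s

rng = random.Random(0)
print(migrate([1.0, -2.0, 0.5], [0.2, 0.1, 0.0], t=10, t_max=100, rng=rng))
```

With f_c = 2, the linear weight A equals 1 halfway through the run and reaches 0 at the last iteration, at which point D_s collapses to the origin.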

2) ATTACKING
The seagull population engages in foraging behavior according to the experience gained from migration behavior. Seagulls constantly change their attack angle and flight speed during migration, and they spiral in the air to attack prey.
where r is the helix radius, which is controlled by u and v; u and v are constants that define the helix shape; and k is a random angle in [0, 2π]. Combined with the new location of the seagulls, the location update formula of the seagull population is obtained.
where P_s(t) is the updated location of the seagull population.
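A minimal sketch of the spiral attack follows, assuming the formulation of the original SOA: x' = r·cos(k), y' = r·sin(k), z' = r·k with r = u·e^(kv), and P_s(t) = D_s·x'·y'·z' + P_best. The default values u = v = 1 and the function name `spiral_attack` are assumptions made for illustration.

```python
import math
import random

def spiral_attack(D_s, best_position, u=1.0, v=1.0, rng=random):
    """Update one seagull position with the spiral attack (assumed standard SOA form)."""
    k = rng.uniform(0.0, 2.0 * math.pi)   # random angle in [0, 2*pi]
    r = u * math.exp(k * v)               # helix radius controlled by u and v
    x = r * math.cos(k)
    y = r * math.sin(k)
    z = r * k
    # The migrated position D_s is scaled by the spiral term and shifted to the best seagull.
    return [d * x * y * z + pb for d, pb in zip(D_s, best_position)]

rng = random.Random(0)
print(spiral_attack([0.3, 0.1], [0.0, 0.0], rng=rng))
```

When D_s is zero in every dimension, the updated position coincides with P_best, so the spiral term only matters while the seagull is still away from the origin of the migrated coordinates.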

C. IMPROVING METHODS OF SOA
SOA has strong optimization performance. However, it still has some defects. For example, it has poor location diversity of the initial population and easily becomes stuck in local optima. Aiming at the above shortcomings, this paper proposes the following three improvement methods to improve SOA: MTent, nonlinear inertia weight and the random double helix foraging formula.

1) MTENT
This paper uses a tent map to remedy the low ergodicity and weak randomness of the initial population of SOA. The tent map is a traditional chaotic mapping method [37].
VOLUME 10, 2022
However, small periodic points cause the tent map to leave the chaotic state [38]. For example, the initial values x_0 = {0.2, 0.4, 0.6, 0.8} lead to small periodic points. Furthermore, there are unstable periodic points, such as x_n = {0, 0.25, 0.5, 0.75} and x_n = x_{n−i}, i = {0, 1, 2, 3, 4}. Therefore, MTent is used to replace the original initialization method of SOA to enhance the population diversity. The formula is as follows: In Eq. (21), rand() is a random number in [0, 1]. When the chaotic sequence is generated, the random disturbance term rand()/20 makes x_n leave the periodic point and enter a chaotic state again. The bifurcation diagram of MTent is shown in Figure 2. Figure 3 shows the population distributions in the (0, 1) space generated by MTent and Tent. With MTent, the random disturbance can push seagulls across the boundary; every seagull that crosses the boundary is placed on the boundary.
As shown in Figure 3, most of the population generated by the tent map clusters at the boundary and has poor ergodicity and randomness. In contrast, the initial population generated by MTent has stronger ergodicity and randomness, and the generated population has better diversity.
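The periodic points mentioned above can be checked with exact rational arithmetic, and the effect of the disturbance can be sketched alongside. The sketch assumes the classical tent map x_{n+1} = 2x_n for x_n < 0.5 and 2(1 - x_n) otherwise, and assumes that MTent adds rand()/20 to each iterate and clips the result to [0, 1]; the exact form of Eq. (21) may differ. The function names are illustrative.

```python
from fractions import Fraction
import random

def tent(x):
    """Classical tent map (assumed form of the undisturbed map)."""
    return 2 * x if x < Fraction(1, 2) else 2 * (1 - x)

def orbit(x0, steps):
    """Return the next `steps` iterates of the tent map starting from x0."""
    values = []
    x = x0
    for _ in range(steps):
        x = tent(x)
        values.append(x)
    return values

def mtent_sequence(x0, n, rng=random):
    """Tent map with a rand()/20 disturbance, clipped to [0, 1] (assumed MTent form)."""
    values = []
    x = x0
    for _ in range(n):
        x = (2.0 * x if x < 0.5 else 2.0 * (1.0 - x)) + rng.random() / 20.0
        x = min(max(x, 0.0), 1.0)   # points that leave the boundary are placed on it
        values.append(x)
    return values

print(orbit(Fraction(1, 5), 6))   # cycles between 2/5 and 4/5
print(orbit(Fraction(1, 4), 6))   # reaches 0 and stays there
print(mtent_sequence(0.2, 5, random.Random(0)))
```

Exact fractions are used for the undisturbed map because binary floating point cannot represent 0.2 or 0.4 exactly, which would hide the period-2 cycle.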

2) NONLINEAR INERTIA WEIGHT
Migration behavior is an important part of the optimization strategy of SOA. Inertia weight A plays a key role in the process of migration behavior. The size of A, which can avoid the collision of seagulls during migration, will affect the global search behavior of SOA.
In this section, a nonlinear formula is proposed to improve the inertia weight A of the SOA to make the seagull population behavior update more flexible and to increase the optimization efficiency.
The cosine function in Eq. (22), which can make the attenuation of the inertia weight smoother, is selected to replace Eq. (14). Figure 4 shows the function image of the inertia weight before and after improvement.
As shown in Figure 4, when the iteration ratio of the improved inertia weight reaches 0.33, the value of A is equal to 1 and decays slowly in the subsequent iteration. The population starts a local search to ensure that the optimization accuracy and optimization efficiency of the SOA are effectively improved.

3) THE RANDOM DOUBLE HELIX FORAGING FORMULA
In this paper, the foraging behavior of SOA is studied. Inspired by the bubble-net attacking method of WOA, a double helix formula with randomness is proposed to improve the foraging formula of SOA. The basic theory of WOA is as follows.

a: WOA
The WOA is a metaheuristic algorithm that simulates the predation behavior of humpback whales and was proposed by Mirjalili in 2016 [39]. WOA has the advantages of a simple structure and few adjustment parameters.

i) ENCIRCLING PREY
All whales in the population swim near the optimal whale in the population, causing the whale population to conduct local optimization.

ii) SEARCH FOR PREY
The whales randomly select a whale in the population and swim near it to conduct global optimization. The location update formula is as follows: where a is a linear function that decreases from 2 to 0 monotonically.

iii) BUBBLE-NET ATTACKING METHOD
Bubble-net predation is used to surround the prey by simulating the predation behavior of humpback whales. The location update formula is as follows: where b is a constant that defines the shape of the spiral; l is a random number between [−1, 1]; and p is a random number between [0, 1]. When p < 0.5, WOA surrounds prey or searches prey. When p ≥ 0.5, WOA carries out bubble-net attacking.
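The three WOA behaviors can be sketched together as one position update. The sketch assumes the standard WOA rules of Mirjalili and Lewis: A = 2a·r - a and C = 2r with r uniform in [0, 1], X(t+1) = X_target - A·|C·X_target - X| for encircling (|A| < 1, target is the best whale) and searching (|A| ≥ 1, target is a random whale), and X(t+1) = |X_best - X|·e^(bl)·cos(2πl) + X_best for the bubble-net spiral. The function name `woa_update` and the default b = 1 are assumptions.

```python
import math
import random

def woa_update(position, best_position, random_position, a, b=1.0, rng=random):
    """One WOA position update for a single whale (assumed standard rules)."""
    p = rng.random()
    if p < 0.5:
        A = 2.0 * a * rng.random() - a        # lies in [-a, a]
        C = 2.0 * rng.random()
        # |A| < 1: encircle the best whale; otherwise search around a random whale.
        target = best_position if abs(A) < 1.0 else random_position
        return [t - A * abs(C * t - x) for t, x in zip(target, position)]
    l = rng.uniform(-1.0, 1.0)
    spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    return [abs(bp - x) * spiral + bp for bp, x in zip(best_position, position)]

rng = random.Random(0)
a = 2.0 * (1.0 - 10 / 100)   # a decreases linearly from 2 to 0 over the iterations
print(woa_update([1.0, 2.0], [0.5, 0.5], [3.0, -1.0], a, rng=rng))
```

Because a shrinks to 0, |A| ≥ 1 becomes impossible late in the run, so the random search is gradually replaced by encircling and spiraling around the best whale.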

b: THE RANDOM DOUBLE HELIX FORAGING FORMULA
The variable parameter A in WOA is introduced to control the search state, the seagull position is randomly changed in the SOA iteration process, and the control parameter p is introduced to control the optimization method. When |A| > 1 and the control parameter is p < 0.5, SOA will conduct a random search.
where D_s represents the new position of the seagull after it has moved and X_rand represents a random seagull in the seagull population. Finally, the double helix formula is constructed by combining Eq.
When |A| > 1, the formula of p < 0.5 is replaced with Eq. (29). In this paper, the double helix with randomness in Eqs. (29) and (30) is used to improve the foraging behavior of SOA in Eq. (19), the diversity of the seagull population is increased, and the optimization accuracy of SOA is improved.

D. TISOA

1) SUMMARY OF IMPROVEMENT METHODS
The improvement methods proposed in Part C of Section III improve SOA in three respects to obtain TISOA. 1) MTent replaces the original random initialization of SOA to initialize the seagull population, which increases the diversity of the population and greatly enhances ergodicity and randomness. 2) Given good population diversity, the inertia weight A is improved to obtain Eq. (22). The improved weight decays to 1 more quickly, which reduces the migration behavior of the seagull population and increases its foraging behavior. 3) To greatly improve the optimization accuracy of SOA, building on the previous two improvements, a double helix formula with randomness is proposed to replace the original attacking behavior of SOA. Together, these improvements comprehensively improve the convergence speed and optimization accuracy of SOA. To verify the validity and superiority of the three improvements, benchmark functions are used to test the optimization performance of TISOA in Part E of Section III.

2) OPTIMIZATION PROCESS OF TISOA
In this section, the optimization process of TISOA is introduced in detail, and Figure 5 is the specific flow chart.
Step 1: parameter initialization. Initialize the algorithm parameters of the SOA and WOA, and set the population size M, the maximum number of iterations T, the dimension dim, F_c, A, a, l and B;
Step 2: population initialization. MTent is used to initialize the seagull population and randomly generate the individual seagull population;
Step 3: fitness calculation. The initial fitness values of the seagull population and the individuals are calculated and compared to find the optimal individual;
Step 4: migration behavior. SOA performs the migration behavior using Eqs. (13), (15) and (17). Then, the linear inertia weight A (Eq. (14)), which controls the motion behavior, is replaced by Eq. (22);
Step 5: foraging behavior. The double helix with randomness in Eqs. (29) and (30)

1) TRADITIONAL BENCHMARK FUNCTION
In this section, nine benchmark functions are selected to test the above algorithms, including three unimodal functions F_1, F_2 and F_3; three multimodal functions F_4, F_5 and F_6; and three fixed-dimension multimodal functions F_7, F_8 and F_9. The specific formulas are given in Appendix 1. Figure 6 shows three-dimensional graphs of the nine benchmark functions. Table 2 shows the simulation results of the seven algorithms, in which the best value (Best), average value (Ave), and standard deviation (Std) of the fitness values are recorded as evaluation items.
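The three evaluation items can be computed from the final fitness values of repeated independent runs, as in the following sketch. The sphere function stands in for a unimodal benchmark, the random search stands in for an optimizer, and the population standard deviation is used for Std; all three choices are assumptions made for illustration and are not taken from Appendix 1 or Table 2.

```python
import random
import statistics

def sphere(x):
    """Sphere function, a common unimodal benchmark with minimum 0 at the origin."""
    return sum(v * v for v in x)

def random_search(objective, dim, lower, upper, iterations, rng):
    """Placeholder optimizer: keeps the best of `iterations` uniform samples."""
    best = float("inf")
    for _ in range(iterations):
        candidate = [rng.uniform(lower, upper) for _ in range(dim)]
        best = min(best, objective(candidate))
    return best

def summarize(final_values):
    """Best, Ave and Std of the final fitness values over repeated runs."""
    return {
        "Best": min(final_values),
        "Ave": statistics.mean(final_values),
        "Std": statistics.pstdev(final_values),
    }

rng = random.Random(0)
runs = [random_search(sphere, dim=5, lower=-100.0, upper=100.0, iterations=200, rng=rng)
        for _ in range(20)]
print(summarize(runs))
```

A small Std indicates that the optimizer reaches similar fitness values on every run, which is the sense in which stability is compared here.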
In Table 2, the best results for Best, Ave and Std are shown in bold. For the unimodal benchmark functions F_1, F_2 and F_3, the results of TISOA are better than those of the other algorithms: the optimal fitness value it finds is 95 orders of magnitude better than that of the traditional SOA and dozens of orders of magnitude better than those of the other algorithms. For the multimodal benchmark functions F_4, F_5 and F_6, ISOA and TISOA obtain the best results, although TISOA, ISOA, GWO and WOA can all find the minimum fitness value for F_4 and F_6. For the fixed-dimension multimodal benchmark function F_7, the three evaluation items obtained by TISOA are the best; for F_8 and F_9, GWO, WOA, SOA and TISOA can all find the best evaluation items.
Across the above benchmark functions, the three evaluation items obtained by TISOA are optimal compared with the other algorithms, which indicates that TISOA has strong optimization performance and stability. However, for several benchmark functions both TISOA and other algorithms find the optimal results. Therefore, the convergence results are shown in Figure 7 to further compare and analyze the optimization performance of these algorithms. For the unimodal benchmark functions F_1, F_2 and F_3, TISOA not only finds the optimal value of the function but also converges faster than the other algorithms. For F_3, TISOA finds the optimal value after only 100 iterations. For the multimodal benchmark functions F_4, F_5 and F_6, TISOA and ISOA can stably find the minimum values of the functions, and GWO and SOA can also find the minimum values, but the convergence speed of TISOA is significantly higher than that of the other algorithms.
For F_4 and F_6, TISOA converges quickly at the beginning of the iteration and finds the minimum fitness values. For the fixed-dimension multimodal test functions F_7 and F_9, TISOA has the fastest convergence speed and also finds the minimum values of the functions. For F_8, TISOA converges significantly faster than SOA and ISOA.

2) CEC2015
The CEC2015 benchmark functions are used to further verify the optimization performance of TISOA. All algorithms are tested 20 times, and the average (Ave) and standard deviation (Std) are taken as the evaluation items. The specific function formulas are given in Appendix 2, and the results are shown in Table 3.
The boldface entries in Table 3 are the best test results. The results of TISOA are the best for these functions. For the CEC-9, CEC-12 and CEC-13 benchmark functions, the average values of several optimization algorithms are close, but TISOA has a smaller standard deviation and therefore stronger optimization stability. For the other functions, both the average value and the standard deviation of TISOA are the smallest. TISOA thus has better optimization performance than the other traditional algorithms.

IV. POWER TRANSFORMER TROUBLESHOOTING BASED ON KPCA AND TISOA-SVM
In this section, a power transformer fault diagnosis method based on KPCA and TISOA-SVM is described in detail as follows.

A. DATA PROCESSING BASED ON KPCA
The DGA data used in this paper are 300 sets of DGA data obtained from the Hubei Power Grid in China, and some of the data are given in Table 4. There are five characteristic quantities in the existing DGA data, namely, H_2, CH_4, C_2H_6, C_2H_4 and C_2H_2. There are six fault types: medium- and low-temperature overheating (T_1), high-temperature overheating (T_2), partial discharge (PD), low-energy discharge (D_1), high-energy discharge (D_2) and normal state (NC). The fault types are numbered as shown in Table 5.
KPCA is used to extract features from the existing data. First, the DGA data are divided into a training set and a test set. Two hundred groups of data are selected as the training set, and 100 groups of data are selected as the test set. The relationship between the principal component contribution rate and the eigenvalue (Lambda) is shown in Figure 8 and Table 6, in which the kernel width in KPCA is set to 8.
In this paper, the cumulative contribution rate threshold is 95%, so the first four principal components are extracted as fault diagnosis data. Table 7 shows some of the extracted data.
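Selecting the number of principal components from a cumulative contribution rate can be sketched as follows. The eigenvalues below are hypothetical values chosen so that four components are needed; they are not the values of Table 6, and the function name `components_for_threshold` is illustrative.

```python
def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative contribution reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, lam in enumerate(ordered, start=1):
        cumulative += lam
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues; cumulative rates are 0.50, 0.80, 0.92, 0.97, 1.00.
eigenvalues = [5.0, 3.0, 1.2, 0.5, 0.3]
print(components_for_threshold(eigenvalues, threshold=0.95))
```

The contribution rate of a component is its eigenvalue divided by the sum of all eigenvalues, so the threshold controls how much of the total variance in the mapped data is retained.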
Because the sample sizes of the fault types in the data differ, 30 groups of data are selected for each fault type, giving a total of 180 groups of data for fault diagnosis. The groupings of the training set and test set are shown in Table 8 and have a ratio of 2:1.

B. DATA NORMALIZATION
Because the existing fault data features differ in magnitude and are highly complex, it is necessary to normalize the data after KPCA feature extraction and to map all data into (0, 1).
where i = 1, 2, 3, 4; x_i^* represents the normalized data; x_min represents the minimum value in the data; x_max represents the maximum value in the data; and x_i represents the raw data.
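The symbols defined above match min-max normalization, x_i^* = (x_i - x_min) / (x_max - x_min), which the following sketch applies to one feature column. Treating each of the four principal components as a separate column, and returning zeros for a constant column, are assumptions of this sketch; the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Map a list of numbers linearly so that the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; return zeros to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values of one principal component across five samples.
component = [2.0, 4.0, 6.0, 3.0, 5.0]
print(min_max_normalize(component))
```

The smallest and largest samples map exactly to 0 and 1, so the normalized data lie in the closed interval [0, 1].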

C. THE PROCESS OF FAULT DIAGNOSIS
This section elaborates the specific flow of power transformer fault diagnosis based on KPCA and TISOA-SVM, and Figure 9 is the specific flow chart.
1) Feature extraction. KPCA is used to extract the features of the DGA data, and the principal components whose total contribution rate reaches 95% are extracted as fault data;
2) Data preprocessing. The extracted fault data are loaded; the training set includes 120 groups, and the test set includes 60 groups. The data are normalized to (0, 1);
3) Parameter optimization. TISOA is used to optimize the parameters C and σ of the SVM. The specific optimization process is described in Part C of Section III;
4) Termination check. If the optimization algorithm reaches the maximum number of iterations, the optimization is stopped and the optimal parameters are output; otherwise, the optimization process continues;
5) Result output. After the optimal parameters C and σ are obtained, SVM starts fault diagnosis and outputs the results, including the algorithm running time, fault classification, test set accuracy, training set accuracy, and number of misclassifications.

V. EXAMPLE DIAGNOSIS RESULTS
In this paper, the method based on TISOA-SVM is used to diagnose faults from the KPCA-processed DGA data. Furthermore, three examples are compared and analyzed to verify the superiority of the proposed diagnosis method. The parameter settings of the algorithms used in all methods are shown in Table 1, with T_max = 100; Population = 30; ub = 10^3; lb = 10^−3.

A. EXAMPLE 1
First, TISOA-SVM is used to diagnose the data before and after KPCA processing to verify the validity of KPCA in data processing. Table 9 and Figure 10 show the diagnostic results.
The results show that the diagnostic accuracy of the proposed method is 96.67%, with 2 misjudgments, and its diagnostic time is 7.037 s, whereas the diagnostic time of TISOA-SVM on the original DGA data is 8.762 s. Thus, TISOA-SVM achieves higher accuracy and a shorter diagnosis time on the KPCA-processed DGA data than on the original DGA data, which shows that DGA data processed by KPCA are more effective for fault diagnosis.
To carry out the diagnosis tests more effectively and intuitively, all the following fault diagnosis tests are based on the DGA data processed by KPCA.
Then, the performance of TISOA-SVM is preliminarily tested: the fault diagnosis results of GWO-SVM, WOA-SVM, SOA-SVM and SFO-SVM are compared with those of KPCA-TISOA-SVM. The diagnostic results include the diagnostic accuracy, diagnostic time and fitness curve, and they are shown in Figure 11 and Table 10.
According to Figure 11 and Table 10, the fault diagnosis accuracy of GWO-SVM is 90%, with 6 misjudgments; the fault diagnosis accuracy of WOA-SVM is 90%, with 6 misjudgments; the fault diagnosis accuracy of SOA-SVM is 88.33%, with 7 misjudgments; and the fault diagnosis accuracy of SFO-SVM is 88.33%, with 7 misjudgments. The diagnostic accuracy of the proposed method (96.67%) is therefore 6.67 to 8.33 percentage points higher than that of the other methods, with 4 to 5 fewer misjudgments. The proposed method is significantly better than the four comparison models, and its diagnostic accuracy is greatly improved.
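The reported accuracies follow directly from the number of misjudgments on the 60-group test set, as the short check below shows. The function name `accuracy_percent` is illustrative; the misjudgment counts are those reported for Example 1.

```python
def accuracy_percent(misjudged, total=60):
    """Diagnostic accuracy in percent, rounded to two decimals."""
    return round(100.0 * (total - misjudged) / total, 2)

# Misjudgment counts reported for Example 1 (test set of 60 groups).
misjudgments = {
    "KPCA-TISOA-SVM": 2,
    "GWO-SVM": 6,
    "WOA-SVM": 6,
    "SOA-SVM": 7,
    "SFO-SVM": 7,
}

for name, count in misjudgments.items():
    print(name, accuracy_percent(count))
```

Each additional misjudgment costs 1/60 of the test set, or about 1.67 percentage points, so a gap of 4 to 5 misjudgments corresponds to about 6.67 to 8.33 percentage points.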
Then, the diagnostic time of the five methods is analyzed. The diagnostic time of the proposed method is 7.037 s, the shortest of the five methods. Thus, it can be concluded that the proposed method has higher diagnostic efficiency than the other methods. Finally, fitness curves are given in Figure 12 and Table 11 to compare the convergence performance of the proposed method.
As shown in Figure 12 and Table 11, the optimal fitness curve of TISOA-SVM converges fastest, at the 3rd generation, and reaches the highest fitness value. In addition, the average fitness curve of the proposed method converges at about the 50th generation, and as the number of iterations increases, the average fitness approaches the optimal fitness. GWO-SVM finds a relatively high fitness value, but it only starts to converge at the 22nd generation. The optimal fitness curve of WOA-SVM converges at the 17th generation. SOA-SVM converges quickly, starting at the 5th generation, but its fitness value is not high. The optimal fitness curve of SFO-SVM converges at the 3rd generation, but it has the lowest fitness value of the five methods. It is worth noting that, except for the proposed method, the average fitness curves of the other methods do not converge to their optimal fitness values.
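One way to read a "convergence generation" off a best-fitness curve is sketched below. The criterion is an assumption made for illustration (the earliest generation after which the curve stays on its final plateau); the paper does not state how the generations above were determined. The helper name `convergence_generation` and the toy curve are hypothetical.

```python
def convergence_generation(best_fitness, tol=1e-9):
    """Return the 1-indexed generation at which the curve reaches its final plateau.

    The curve is treated as converged at the earliest generation g such that
    every value from g onwards stays within `tol` of the final value.
    """
    if not best_fitness:
        raise ValueError("fitness curve is empty")
    final = best_fitness[-1]
    generation = len(best_fitness)
    # Walk backwards while the values stay on the final plateau.
    for g in range(len(best_fitness), 0, -1):
        if abs(best_fitness[g - 1] - final) > tol:
            break
        generation = g
    return generation


# Toy curve over 100 generations (illustrative values only, not read from Figure 12).
toy_curve = [90.0, 93.3, 96.7] + [96.7] * 97
print(convergence_generation(toy_curve))
```

A tolerance-based rule like this is sensitive to late, small improvements, so a looser `tol` may be appropriate for noisy average-fitness curves.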
In summary, the proposed method outperforms the other methods in diagnostic accuracy, diagnostic time, and fitness curves.

B. EXAMPLE 2
Meanwhile, to demonstrate the diagnostic performance of the proposed method more convincingly, five methods from the literature are used in Example 2 to diagnose faults from the KPCA-processed data, and their results are compared with those of the proposed method.
Firstly, Figure 13 and Table 12 show the diagnostic accuracy and time of these methods.
The results show that the fault diagnosis accuracy of IGWO-SVM is 93.33%, with 4 misjudgments; that of IGWO-PNN is 93.33%, with 4 misjudgments; that of HGWO-LSSVM is 91.67%, with 5 misjudgments; that of BA-SVM is 90%, with 6 misjudgments; and that of IKH-SVM is 95%, with 3 misjudgments. Among them, IKH-SVM has the highest diagnostic accuracy, but it is still lower than that of the proposed method.
Then, the diagnostic time of the six methods is analyzed. Because the PNN model has high complexity, IGWO-PNN is the slowest of the methods, with a diagnostic time of 21.013 s. Similar improvement strategies are used in IGWO-SVM and HGWO-LSSVM, but they do not reduce the complexity of the diagnostic model, so the diagnostic time of IGWO-SVM is not reduced. However, LSSVM has lower complexity than SVM, so the diagnostic time of HGWO-LSSVM is slightly lower than that of IGWO-SVM. The diagnostic times of BA-SVM and IKH-SVM are 11.467 s and 8.182 s, respectively. It can be concluded that the proposed method is faster than all the other methods.

C. EXAMPLE 3
In this section, a statistical test is used to assess the superiority of the proposed method. The results of 20 runs of the six methods are compared using the Wilcoxon rank-sum test to obtain p-values, that is, to determine whether the proposed method differs significantly from the other methods at a significance level of 0.05. When p > 0.05, the null hypothesis H0 is accepted: there is no significant difference between the two methods, and their diagnostic abilities are considered the same. When p < 0.05, H0 is rejected, that is, there is a significant difference between the two methods. Table 13 shows the results.
The results show that the p-values are all far below 0.05, so it can be concluded that the proposed method differs significantly from the other methods.
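The decision rule described above can be illustrated with a minimal, standard-library sketch of the two-sided Wilcoxon rank-sum test. It uses the normal approximation without tie or continuity correction, which may differ from the exact procedure used in the paper. The function names and the two samples of 20 accuracies are hypothetical and are not taken from Table 13.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). No tie or continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0          # expected rank sum under H0
    std_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / std_w
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided p-value
    return z, p


# Toy accuracies (%) for 20 runs of two hypothetical methods; not data from Table 13.
method_a = [96.67, 95.0, 96.67, 98.33, 96.67] * 4
method_b = [90.0, 88.33, 91.67, 90.0, 88.33] * 4
z, p = rank_sum_test(method_a, method_b)
print(f"z = {z:.3f}, p = {p:.3g}, reject H0: {p < 0.05}")
```

Note that rejecting H0 only indicates that the two result distributions differ; which method is better is read from the accuracies themselves, for example from the sign of `z`.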
A single test set does not prove the validity of the method. Therefore, 3-fold cross-validation is used to verify the validity of the proposed method. Table 14 shows the results of all methods.
It can be seen that the result of IGWO-PNN is 4% lower than that on the single test set because the PNN model has low stability. Compared with the other methods, the proposed method has the highest accuracy. In this section, the significance and validity of the proposed method are demonstrated. Summarizing the results of Sections A, B, and C, it can be concluded that the proposed method has the best diagnostic performance among the compared methods.
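The 3-fold cross-validation procedure used above can be sketched as follows. This is a standard-library illustration only: a nearest-centroid classifier stands in for the TISOA-SVM model, the data are a toy two-class set rather than DGA samples, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=3, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the paper's SVM): one mean vector per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for d, value in enumerate(features):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, features):
    """Predict the label whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_val_accuracy(X, y, k=3, seed=0):
    """Mean accuracy over k folds; each fold is used once as the test set."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = nearest_centroid_fit([X[i] for i in train_idx],
                                     [y[i] for i in train_idx])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i]
                      for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k


# Toy, well-separated two-class data (illustrative only, not DGA data).
X = [[0.1 * i, 0.0] for i in range(6)] + [[5.0 + 0.1 * i, 5.0] for i in range(6)]
y = [0] * 6 + [1] * 6
print(f"3-fold accuracy: {cross_val_accuracy(X, y):.2%}")
```

Because every sample is held out exactly once, the averaged score depends less on one particular train/test split than a single test set does, which is the motivation given above for the cross-validation check.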

VI. CONCLUSION
This paper proposes a new power transformer fault diagnosis method based on the SVM. First, KPCA is used to extract the features of DGA data. Then, to address the defects of SOA, three improvements are proposed: improved tent chaos initialization, nonlinear inertia weight, and the stochastic double helix formula. Finally, a transformer fault diagnosis method based on KPCA and TISOA-SVM is constructed, and three examples are used to test its diagnostic performance. The following conclusions are obtained:
1) The traditional benchmark functions and CEC2015 benchmark functions are used to test TISOA and six optimization algorithms. The results show that TISOA has the highest optimization accuracy and the fastest convergence speed. These results indicate that TISOA has excellent optimization performance and can optimize SVM parameters quickly and accurately.
2) The KPCA-processed DGA data are used for fault diagnosis, and three examples are used to verify the diagnostic performance of the proposed method. Compared with the other methods, the proposed method greatly improves diagnostic accuracy and diagnostic time, and it also shows the strongest significance and validity.
In conclusion, the fault diagnosis method proposed in this paper has excellent diagnostic performance, can diagnose transformer faults quickly and accurately, and has high reference value. However, the research on DGA data in this paper is not comprehensive. In the future, DGA data will be analyzed in greater depth to further improve the accuracy and stability of transformer fault diagnosis.

APPENDIX A
See Table 15.

APPENDIX B
See Table 16.