Artificial Generation of Partial Discharge Sources Through an Algorithm Based on Deep Convolutional Generative Adversarial Networks

The measurement of partial discharges (PD) in electrical equipment or machines subjected to high voltage can be considered as one of the most important indicators when assessing the state of an insulation system. One of the main challenges in monitoring these degradation phenomena is to adequately measure a statistically significant number of signals from each of the sources acting on the asset under test. However, in industrial environments the presence of large amplitude noise sources or the simultaneous presence of multiple PD sources may limit the acquisition of the signals and therefore the final diagnosis of the equipment status may not be the most accurate. Although different procedures and separation and identification techniques have been implemented with very good results, not having a significant number of PD pulses associated with each source can limit the effectiveness of these procedures. Based on the above, this research proposes a new algorithm of artificial generation of PD based on a Deep Convolutional Generative Adversarial Networks (DCGAN) architecture which allows artificially generating different sources of PD from a small group of real PD, in order to complement those sources that during the measurement were poorly represented in terms of signals. According to the results obtained in different experiments, the temporal and spectral behavior of artificially generated PD sources proved to be similar to that of real experimentally obtained sources.


I. INTRODUCTION
When an insulation system is subjected to high voltage levels, certain areas of the material commonly experience electric fields strong enough to release, under certain conditions, some electrons strongly bound to their nuclei [1]. This generates free charges that converge in current flows, which degrade the dielectric properties of the insulation over time [1], [2]. Such current flows are usually of short duration (0.1-10 ns) and limited to a microscopic area of the insulation. That is why they are called partial discharges (PD): the electric field, despite its great magnitude, is unable to rupture the insulation totally and instead does so partially [2]. In general, PD can occur in air (corona PD), in gas vacuoles inside the insulation (internal PD) or on the surface of the insulation (surface PD), where permittivity is lower than in the rest of the material [3]. PD do not cause imminent breakdown, but repeated activity gradually degrades the insulation until failure occurs [2], [4], [5]. Furthermore, the presence of PD can also indicate the existence of other degradation mechanisms acting on the insulation system [6]. Therefore, PD detection is relevant to establish the real condition of the insulation system in any equipment or electrical machine subjected to high voltage.
The associate editor coordinating the review of this manuscript and approving it for publication was Mehdi Bagheri.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Traditionally, commercial measurement systems represent PD activity as pulses in pC or mV, superimposed on the network signal. These representations are termed Phase Resolved Partial Discharge (PRPD) patterns and are used to identify a PD source through the set of characteristics that each type of PD possesses [7], [8]. One of the main advantages of using PRPD patterns for identification is that they allow adequate recognition of the type of PD source acting on the test object, and also of its evolution over time [8]. For example, once internal PD sources are detected in the insulation system of a generator or power transformer, the efforts of maintenance engineers must focus on monitoring the evolution of the PRPD pattern obtained, because this type of source cannot be mitigated easily with a simple maintenance process.
Although this way of identifying sources allows adequate recognition of the type of PD (under ideal conditions), in industrial environments the presence of external noise or multiple sources of PD can make the PRPD patterns difficult to interpret, making it necessary to perform a separation process prior to identification [9]-[11]. Generally, this separation is done through clusters that form in different areas of a separation map, where each cluster represents a source of PD or electrical noise.
In recent years, other PD identification techniques based on machine learning (ML) have begun to be used with very good results [12]-[17]. Although these techniques have a very wide range of application, many works have focused on direct identification from the information obtained from the signals emitted by each type of PD source [13]. The results obtained so far have shown that ML-based techniques can be an adequate solution for identifying different PD types. However, many aspects remain to be addressed and investigated, such as the decrease in the success rate of these algorithms when, after training, the PD signals show temporal or spectral changes caused by some variation in the test object or measurement circuit [9], [12], [13], [18].
Likewise, regardless of the identification technique or process used, it is very important to have a significant number of signals from each of the sources acting on an asset. This makes the separation and/or identification of the type of PD source more accurate, and the diagnosis of the equipment's condition more precise [1]-[3], [7].
In this paper, the authors propose a novel algorithm for the artificial generation of PD, based on a Deep Convolutional Generative Adversarial Network (DCGAN) architecture. This algorithm has proven able to generate different PD sources from a small number of signals obtained experimentally with different test objects. The results described in Section V show that the artificially generated signals have a temporal and spectral behavior similar to that of the signals generated experimentally with real test objects; therefore, these signals can be correctly classified as PD signals of different nature. To compare the results obtained with both generation methods, the spectral content of each PD source was analyzed. Likewise, the spectral power clustering technique (SPCT), usually employed to separate PD sources and electrical noise, was applied in order to evaluate the location and shape of the clusters obtained for both real and artificial signals. Finally, the implemented algorithm was trained with a data set including real and artificial signals of each type of PD source. The results show that, in each case, the algorithm recognized both types of signals as belonging to the same class.

II. EXPERIMENTAL ASSEMBLY AND TEST OBJECTS USED FOR PD GENERATION
The electrical method for detecting PD has long been the most used; its characteristics and procedures are defined in the standard IEC 60270 [7]. To ensure reproducible and comparable measurements, IEC 60270 recommends two basic measurement circuits: direct and indirect. Both circuits are formed by a high voltage source, a coupling capacitor acting as a low impedance path for high frequency pulses (PD), a quadrupole or measurement impedance, and a sensor element, preferably of high bandwidth. The main difference between these two measuring circuits is the connection of the quadrupole and the sensor: in the direct circuit these two elements are connected in series with the test object, while in the indirect circuit they are connected to the branch of the coupling capacitor. In this paper, the PD sources were generated using an indirect measurement circuit. Further details on the components and connections of this circuit are provided in [7], [9].
Basically, the measurement process consisted of gradually increasing the voltage in the measurement circuit until the partial discharges in the test objects were measurable using a commercial high-frequency current transformer (HFCT) of 80 MHz bandwidth [9], [19]. This sensor obtains the data to be transmitted to the acquisition system for subsequent signal processing. The HFCT was connected to an NI-PXI-5124 acquisition system with 150 MHz of bandwidth, 12 bits of vertical resolution and a sampling rate of 200 MS/s. The acquisition system was programmed to digitize and store the PD pulses generated in each test object. Each captured signal has a duration of 1 µs and consists of 200 samples.
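The record length follows directly from the acquisition figures quoted above. The short sketch below only restates that arithmetic; the variable names are illustrative, and the bin spacing shown applies to an unpadded record, which is an assumption about how the spectra were computed.

```python
# Acquisition figures quoted in the text.
SAMPLING_RATE_HZ = 200e6   # 200 MS/s
RECORD_LENGTH = 200        # samples stored per PD pulse

sample_period_s = 1.0 / SAMPLING_RATE_HZ            # time between samples (5 ns)
record_duration_s = RECORD_LENGTH * sample_period_s  # duration of one stored pulse
nyquist_hz = SAMPLING_RATE_HZ / 2.0                  # highest representable frequency
bin_width_hz = SAMPLING_RATE_HZ / RECORD_LENGTH      # DFT bin spacing without zero padding

print(f"record duration: {record_duration_s * 1e6:.2f} us")
print(f"Nyquist frequency: {nyquist_hz / 1e6:.0f} MHz")
print(f"DFT bin spacing: {bin_width_hz / 1e6:.0f} MHz")
```

The 80 MHz bandwidth of the HFCT lies below the 100 MHz Nyquist frequency of the digitizer, so the sensor band is fully covered by the sampling rate.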
Three different test objects were used for the generation of three types of partial discharges (see Fig. 1):
Corona discharges: As shown in Fig. 1, this type of PD occurs in the gases that surround a sharp or pointed conductor subjected to a strong electric field, causing the local rupture of the insulation [1]-[5]. Corona PD sources have a very typical PRPD pattern because they tend to occur in the negative half-cycle of the voltage signal and their magnitude is almost constant. However, when the voltages are very high, some pulses may appear in the positive half-cycle, also with low dispersion in magnitude. For the outcomes described in Section V, the test object used to generate this type of source was a point-plane configuration, in which the distance between the point and the plane was set to 1.5 cm and the voltage applied to the point was 5.6 kV.
Internal discharges: These discharges typically occur inside an insulation system, in gas vacuoles, in particle inlays of greater or lesser dielectric strength, in metal protuberances on one of the electrodes or in internal cracks of the material [1], [5], [8], [20]. PD activity under these conditions originates from the strong electrical stress that results from the difference in permittivity between the insulating medium and the imperfection. If this type of PD occurs constantly in the insulation, the material may be eroded until full destruction [3], [6]. The test object used to generate this type of source consists of nine sheets of NOMEX paper, stacked and vacuum encapsulated. These sheets were placed between two electrodes and submerged in transformer mineral oil to remove all surface discharges. In addition, the three central sheets were previously perforated to form a cylindrical vacuole, simulating an imperfection in an insulation system. For this test object, stable PD activity was obtained at 8.9 kV.
Surface discharges: These usually appear in the insulation system of any equipment when there is contamination or moisture on the surface of the material [7], [8]. The continuous presence of high-intensity tangential electric field components gives rise to current flows that extend beyond their site of origin along the surface of the insulation. In practice, this type of discharge may occur in high-voltage cables, insulators, generator windings, transformers and any other equipment or machine with some degree of surface contamination. To generate stable surface PD activity, high voltage was applied to a ceramic bushing (see Fig. 1). Additionally, the insulator surface was contaminated with a saline solution to facilitate the occurrence of PD at a lower voltage (9.5 kV).

III. CLASSIFICATION AND IDENTIFICATION OF PD SOURCES
Whether measured in the field or in unshielded high-voltage laboratories, the signals that the sensors capture and pass to the measurement channels of PD monitoring systems originate from electrical noise and PD sources that usually act simultaneously. This simultaneity makes the PRPD difficult to interpret even for experts in the field, because noise signals without phase correlation can often reach higher magnitudes than those of the PD [7], [10], [21], [22]. This problem has gradually increased with the growing use of systems based on power electronics, such as switched-mode power supplies, frequency inverters, rectifiers or other electrical-electronic devices capable of generating similar switching [17], [23]. Likewise, in many measurement processes it is very common to find multiple PD sources present simultaneously. The PRPD measured in any real equipment or test object is then complex to interpret, since certain less harmful sources with greater amplitudes can hide the presence of more critical sources, such as internal PD, whose presence can indicate accelerated deterioration of the equipment insulation [9], [13], [18], [24]-[29]. In addition, certain types of discharges, such as corona PD, usually do not have a significant influence on the life expectancy of insulation systems. Something similar occurs with certain sources of surface discharges. For example, the discharges originating in contaminated insulators could be harmful, but once identified, they can be easily mitigated during the maintenance period [9], [24], [28].
For these reasons, proper identification of the type of PD source is essential for adequate diagnosis of the insulation's actual condition, thus avoiding erroneous evaluations of the equipment or machine being tested. In the scientific literature, different approaches have been proposed to address the PD identification process when multiple sources act simultaneously [7], [8], [26], [30]-[33]. Two of the main approaches are summarized below.

A. VISUAL OR AUTOMATIC IDENTIFICATION OF PRPD PATTERNS USING CLUSTERING TECHNIQUES
As indicated above, a large part of the measurement processes carried out on real equipment such as transformers, generators and high voltage cables yields complex PRPD patterns formed by multiple sources of PD and/or electrical noise acting simultaneously. For this reason it is necessary, prior to any identification process, to perform a separation stage on the PRPD obtained, in order to classify the captured PD or noise sources into clusters or point clouds on a two-dimensional or three-dimensional classification map [9], [11], [24], [26], [27], [33]. This separation process allows the sources to be examined individually, analyzing only the PRPD of the signals corresponding to each cluster (see Fig. 2), which makes source identification much more accurate.
Accordingly, one of the advantages of applying the separation process is that the signals from the different sources are grouped into clusters in different areas of the separation map, so that the number of clusters corresponds to the number of sources. It is therefore necessary to obtain a sufficient number of pulses for each PD source during the measurement, so that the presence of every source is noticeable both in the cluster and in the corresponding PRPD. Once the separation process is finished, the sources associated with each cluster can be identified through a simple visual analysis of the individual PRPD patterns, or automatically from the analysis of different statistical parameters that are normally obtained from the same PRPD patterns [8], [9], [26], [33].

B. DIRECT IDENTIFICATION OF PD SOURCES BY APPLYING MACHINE LEARNING (ML) TECHNIQUES
Another widely used procedure for identifying PD sources is the application of algorithms based on machine learning techniques [12]-[18], [26], [28], [31]. Recent advances in computing have brought significant improvements in data storage and processing capacity, allowing these techniques to be widely used in stochastic applications, where the signals obtained cannot be easily identified by conventional mathematical techniques or by simple visual inspection of their spectral or temporal content. ML techniques can identify different types of PD with high success rates by directly analyzing the signals captured by the acquisition system [13], [18]. This analysis is only possible if the algorithms are previously trained with a large, correctly labeled data set that includes all the possible characteristics of the sources to be identified, so that the problem domain is sufficiently represented [34]. For the identification process, these algorithms take each captured signal and classify it according to the information acquired during training. Other works have focused on the direct identification of the PRPD patterns obtained during the acquisition [28]. In this case, the training is done with PRPD patterns generated from different controlled test objects, in which the type of source is fully characterized.
Regardless of the approach used, one of the limitations when applying separation methods or any ML-based identification algorithm is not being able to capture a significant number of pulses associated with each type of source during the acquisition [9], [27]-[29]. In particular, when separation techniques are applied, clusters represented by few points could erroneously be associated with spurious signals from external noise sources (electric heaters, luminaires turning on and off, etc.), which tend to be randomly located anywhere on the separation map [10], [21]. The performance of ML-based identification algorithms could also be affected, because if the algorithm is not trained with enough data, the recognition rate during the identification process could be low [18]. In addition, if the algorithm identifies sources from PRPD patterns, testing a PRPD with few pulses would be practically unfeasible, since it would not be possible to identify the typical characteristics that shape the corresponding source.
On the other hand, many researchers who focus only on the mathematical treatment of this type of signal may find it difficult to access the instrumentation and sophisticated equipment required to generate and measure PD whenever they wish to evaluate a new separation technique or identification algorithm. For these reasons, the main challenge when characterizing PD sources is to have, for each type of PD, an appropriate set of signals in terms of quantity (statistically significant) and quality (free of spectral components associated with electrical noise or other simultaneous PD sources), in order to perform an adequate separation or training for each type of source and then obtain accurate results during the identification stage.
Unfortunately, in this context, one of the many factors that normally make it difficult to acquire PD signals is an incorrect setting of the trigger level in the acquisition system. If a very high trigger level is used (to filter unwanted electrical noise signals), the low-magnitude PD sources are also filtered, and only those PD pulses that exceed the established trigger threshold are captured. If, on the contrary, a very low trigger level is set, the measurement channel tends to be saturated by the large number of noise pulses and will likely capture only some of the pulses associated with the PD. In any case, the trigger level of the acquisition system must be properly adjusted so that, during the measurement, the lower magnitude PD sources are not discarded and the acquisition system is not saturated by the large amount of low magnitude noise usually present in any measurement environment [9], [22], [24].
For all of the above reasons, it may be of great interest to quickly generate PD signals that have the same temporal and spectral behavior as the signals captured experimentally. This would make it possible to generate a large number of pulses for poorly represented sources of PD or electrical noise, thus improving the separation processes, since the number of signals belonging to a given source could be increased so that its cluster does not go unnoticed or is not confused with some source of external noise. Additional signals could also be generated to enlarge the data used in the training of ML-based identification algorithms.

IV. ARTIFICIAL GENERATION OF PD SIGNALS FROM DCGAN
The field of artificial intelligence (AI) has been revolutionized by the results obtained with deep neural networks [35]. Tasks such as computer vision [36]-[38], natural language processing [39], speech recognition [40] and speech generation [41] have shown breakthrough performance using these techniques [42], [43]. Advances such as autonomous driving [44] and intelligent assistants [45], to name just a few, are examples of how AI has been applied in commercial products and services. Moreover, AI is progressively stepping into other fields such as healthcare [46], finance [47], transportation [48], human resources [49] and environment preservation [50]. It is in fact one of the most promising technologies for human performance transformation. Among the different machine learning techniques applied in data processing, deep neural networks stand out for their ability to generate data; networks with this ability are grouped under the name of generative models [51].
In this regard, deep generative models are those that have the capability to produce a structured data sequence [43]. The output sequence could be text, using Long Short Term Memory (LSTM) networks [52]; images, using Generative Adversarial Networks (GAN) [51]; or audio, using the Wave Generative Adversarial Network (WaveGAN) [53], [54]. In particular, the GAN architecture represents a new paradigm in deep learning. Essentially, this architecture is composed of two networks: one generates data (the generator), and the other discriminates artificially generated data from real data (the discriminator); see Fig. 3.
Functionally, both networks play an adversarial game against each other, so that the generating network is trained to fool the discriminative one. Once the precision of the discriminator reaches about 50%, it can be concluded that the discriminative network no longer differentiates artificially generated data from real data. A valuable byproduct of this process is the trained generative model, which can be used to generate data. GAN networks have been used with a great deal of success, mostly in image generation tasks [55], [56]. Besides being successful in generating very realistic images, the GAN paradigm is starting to be used in the field of signal generation, such as audio signals [53]. Sound generation in particular remains a challenge, since the signal has dependencies at different scales in frequency and in time. Even recent developments (WaveGAN [53] and GANSynth [54]) still produce sound with artifacts that are noticeable to the human ear.
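The adversarial game described above is commonly written as the minimax objective of the original GAN formulation. It is given here as a reference form only, not necessarily the exact cost function used in the implemented algorithm; x denotes a real sample, z the input noise, D the discriminator and G the generator, while p_data and p_z are notation assumed for the two distributions.

```latex
\min_{G}\,\max_{D}\; V(D,G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

At the equilibrium of this game the optimal discriminator outputs D(x) = 1/2 for every input, which corresponds to the situation mentioned above in which real and generated data can no longer be told apart.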
The first adversarial network proposed was based on a Multi-Layer Perceptron (MLP) [57]. Since then, many architectures have been proposed to improve the image generation task [58]-[60]. Some architectures (Pix2Pix, cGAN) include pooling layers for image resizing, which eliminate information between the convolution layers [61]. For the artificial signal generation model proposed in this paper, it was necessary to use an architecture that preserves most of the characteristics that PD signals have in time and frequency. If features are eliminated at some point in the process, the new signals generated might not correspond to any type of PD.
For the artificial generation of PD, an algorithm based on DCGAN was implemented. Unlike the traditional convolution-pooling arrangement, this architecture relies on strided convolutions, which both sample features and resize the image without discarding the characteristics found, until a final representation vector is reached whose only output is the class of the image (real or fake) [62].
In the generator, a sample of Gaussian noise is taken and, using the value of the discriminator's cost function for tuning, the new image G(z) is reconstructed with normal convolutions. Fig. 4 shows the detail of the discriminator and generator structures that make up the implemented algorithm. The discriminator is a network whose input is a single-channel image of 16 × 16 pixels. In the following stages the image goes through a series of convolutions, with a different convolution mask size in each layer, until a vector of image characteristics is assembled as output. The objective of the discriminating network is to determine whether an image comes from the space of real samples (x) or is generated by the model G(z), which corresponds to the output of the generating network. The generating network, in turn, takes Gaussian noise (z) as input. In its first stage it projects the noise towards the size that the output image should have, and then applies the same number of convolutions, with the same features, as the discriminative model.
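How strided convolutions resize a 16 × 16 image without pooling can be sketched with the standard output-size formulas for convolution and transposed convolution. The layer settings below (kernel 4, stride 2, padding 1, three layers) are assumptions chosen for illustration; the text states that the mask size differs per layer but does not give the values.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial size after a strided convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Spatial size after a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(size, layers, step):
    """Apply `step` layer by layer and record every intermediate size."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = step(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Hypothetical layer settings (kernel, stride, padding); not taken from the paper.
LAYERS = [(4, 2, 1)] * 3

discriminator_sizes = trace_sizes(16, LAYERS, conv_out)       # downsampling path
generator_sizes = trace_sizes(2, LAYERS, conv_transpose_out)  # upsampling path

print(discriminator_sizes)  # [16, 8, 4, 2]
print(generator_sizes)      # [2, 4, 8, 16]
```

With these assumed settings each layer halves (discriminator) or doubles (generator) the image side, so the two networks mirror each other with the same number of convolutions, as described above.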
This model is trained using the values obtained from the discriminator to tune the weights of the different neurons. At the end of this process, the inference model corresponds to the values of the weights in each of the convolutional layers. This model produces the required images, which the algorithm subsequently transforms into signals.
Fig. 5 shows an example of the images obtained for the different types of PD before training the implemented algorithm. Each 16 × 16 image was constructed from a real corona PD, internal PD or surface PD signal obtained from the three test objects described in Section II. More details on the behavior of these signals are given in the experimental results in Section V.
Finally, as shown in Fig. 6, the proposed artificial PD generation algorithm integrates three stages: Preprocessing, Training and Generation. In each case, during the training stage, some hyperparameters that are important in generative networks were adjusted in order to achieve convergence. As mentioned earlier, the two networks of a GAN compete with each other, and they are directly connected through the cost function of the discriminative network, which becomes the input to the generating network. To prevent this value from saturating or becoming undefined, it was necessary to bound the input data so that, when the weights in each convolutional layer are initialized, an out-of-range value does not break the training and lead to null results. In this case, the data were bounded between −1 and 1. For this particular process, no validation data set was used; the data were divided only into training and testing sets.
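Bounding the input data between −1 and 1 can be sketched as follows. The text only states the target range; per-signal min-max scaling and the function name are assumptions made for this illustration.

```python
def scale_to_unit_range(samples):
    """Map a sequence of samples linearly onto [-1, 1].

    Assumption: min-max scaling per signal. The paper only states that
    the data were bounded between -1 and 1, not how.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        # A constant signal has no range to stretch; map it to zero.
        return [0.0] * len(samples)
    span = hi - lo
    return [2.0 * (value - lo) / span - 1.0 for value in samples]


print(scale_to_unit_range([0.0, 5.0, 10.0]))  # [-1.0, 0.0, 1.0]
```

The [−1, 1] range matches the output range of a tanh activation, which is the usual output layer of a DCGAN generator, so real and generated images share the same scale.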
In the last stage, Generation, inference models were used. Each model took a Gaussian noise image as input to generate an artificial image associated with the corresponding PD. Finally, the artificial PD image was converted into a vector, from which the first 200 values (or samples) were extracted and stored in a new vector to be subsequently transformed into a time domain signal.
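The conversion between a 200-sample pulse and a 16 × 16 image can be sketched as below. Only the image-to-signal direction (flatten, keep the first 200 values) is stated in the text; the forward mapping with zero padding and row-major ordering is an assumption, as are the function names.

```python
IMAGE_SIDE = 16      # image side used by the networks
SIGNAL_LENGTH = 200  # samples per PD pulse


def signal_to_image(signal, pad_value=0.0):
    """Pack a 200-sample pulse into a 16 x 16 image (assumed: row-major, padded)."""
    if len(signal) != SIGNAL_LENGTH:
        raise ValueError(f"expected {SIGNAL_LENGTH} samples, got {len(signal)}")
    padding = [pad_value] * (IMAGE_SIDE * IMAGE_SIDE - SIGNAL_LENGTH)
    flat = list(signal) + padding
    return [flat[row * IMAGE_SIDE:(row + 1) * IMAGE_SIDE] for row in range(IMAGE_SIDE)]


def image_to_signal(image):
    """Flatten the image into a vector and keep the first 200 values."""
    flat = [value for row in image for value in row]
    return flat[:SIGNAL_LENGTH]


pulse = [float(i) for i in range(SIGNAL_LENGTH)]
image = signal_to_image(pulse)
recovered = image_to_signal(image)
print(len(image), len(image[0]), recovered == pulse)  # 16 16 True
```

A 16 × 16 image holds 256 values, so the 56 values beyond the first 200 carry no signal information under this assumed layout and are simply discarded on the way back.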

V. EXPERIMENTAL RESULTS
To compare the artificially generated signals with those experimentally obtained, the spectral content of both groups was studied and compared. Likewise, the spectral power clustering technique (SPCT) that is normally applied in the separation of PD sources [9], [21], [22], [23], [30] was used to evaluate whether the real and artificial signals for each type of PD source were grouped in the same zone of the separation map.
During the separation process, the SPCT classifies the signals that come from different types of PD sources or electrical noise into clusters. Each cluster on the map is formed by signals whose spectral characteristics are similar, that is, signals that belong to the same source. One of the main advantages of this separation technique is its ability to sense any change in the signal, which is reflected in the shape and position of the cluster on the separation map.
The clustering process with SPCT is done through a 2-D separation map called the power ratio map (PR map): for each signal, the spectral power in two different frequency bands is obtained and divided by the total spectral power of the signal. These two ratios are called PRL (power ratio for low frequencies) and PRH (power ratio for high frequencies); see (1) and (2).
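Equations (1) and (2) are not reproduced in this text. From the verbal definition above and the symbols defined below, they can be reconstructed, as an assumed form, as:

```latex
\mathrm{PRL} = \frac{\sum_{f=f_{1L}}^{f_{2L}} |s(f)|^{2}}{\sum_{f=0}^{f_{t}} |s(f)|^{2}} \tag{1}

\mathrm{PRH} = \frac{\sum_{f=f_{1H}}^{f_{2H}} |s(f)|^{2}}{\sum_{f=0}^{f_{t}} |s(f)|^{2}} \tag{2}
```

The dispersion values reported later in this section suggest that both ratios are expressed as percentages, in which case each expression is multiplied by 100; this factor is an inference, not something stated in the text.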
where s(f) is the magnitude of the FFT of the pulse time signal s(t), $[f_{1H}, f_{2H}]$ is the high frequency band, $[f_{1L}, f_{2L}]$ is the low frequency band and $f_t$ is the maximum frequency under analysis. In this technique, the frequency bands are chosen based on the spectra observed in the signals. Each signal is therefore represented on the separation map by a pair of PRL and PRH values [9]. The separation intervals used for this analysis were adjusted as described in [45], i.e. [10, 30] MHz for the PRL and [30, 50] MHz for the PRH.
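A minimal sketch of how one PR-map point could be computed is given below. It uses a direct DFT from the standard library so that it stays self-contained. The percentage scaling, the inclusive band edges and the choice of 100 MHz as the maximum frequency under analysis are assumptions, and the function names are illustrative.

```python
import cmath
import math


def power_spectrum(signal, sampling_rate_hz):
    """Return (frequencies, |S(f)|^2) for the non-negative DFT bins."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        freqs.append(k * sampling_rate_hz / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def power_ratios(signal, sampling_rate_hz, low_band, high_band, f_max_hz):
    """Return (PRL, PRH) in percent for one pulse.

    Band edges are treated as inclusive, so a bin exactly on a shared
    edge (30 MHz here) is counted in both bands; this is an assumption.
    """
    freqs, power = power_spectrum(signal, sampling_rate_hz)

    def band_power(f_lo, f_hi):
        return sum(p for f, p in zip(freqs, power) if f_lo <= f <= f_hi)

    total = band_power(0.0, f_max_hz)
    if total == 0.0:
        raise ValueError("signal has no spectral power in the analysis range")
    prl = 100.0 * band_power(*low_band) / total
    prh = 100.0 * band_power(*high_band) / total
    return prl, prh


# Synthetic check: a 20 MHz tone sampled at 200 MS/s for 200 samples.
FS = 200e6
tone = [math.sin(2 * math.pi * 20e6 * i / FS) for i in range(200)]
prl, prh = power_ratios(tone, FS, (10e6, 30e6), (30e6, 50e6), f_max_hz=100e6)
print(round(prl, 3), round(prh, 3))  # close to 100.0 and 0.0
```

A tone inside the low band lands at the PRL end of the map, which is the behavior the text describes for sources whose spectral content is concentrated at lower frequencies.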
If the spectral behavior and the position on the separation map of a certain type of source are similar for both groups of signals, then the obtained signals with both forms of generation can be considered equivalent and could be used together to improve the performance of any separation or identification process.

A. COMPARISON OF REAL AND ARTIFICIAL CORONA PD SOURCES
In order to experimentally generate this type of PD source, the point-plane configuration described in Section II was used as test object. During the measurement process, a high trigger level was maintained (above the noise level, at 3.6 mV) to measure only the signals associated with corona PD, thus avoiding any low-magnitude noise sources. The voltage level of the test object was increased to 5.6 kV, where a stable PD level was obtained. For this experiment, a total of 1,000 pulses were stored.
The normalized average spectral power and the SPCT were obtained for the signals captured during the acquisition and are shown in Fig. 7. Fig. 7(a) shows three different spectral power peaks clearly detected at 3.4 MHz, 32 MHz and 54 MHz, with the highest magnitude at 3.4 MHz, which is typical for corona PD sources, where the highest spectral components of the signals are at low frequencies [30]. Fig. 7(b) shows the PR map generated by applying the SPCT to each signal, with separation intervals set at [10, 30] MHz for the PRL and [30, 50] MHz for the PRH. As shown in the figure, the cluster points have slightly higher spectral content in the PRL band, which places the cluster below the main diagonal of the PR map, with greater dispersion in PRL than in PRH (σ_PRL = 4.81 and σ_PRH = 3.06, where σ is the standard deviation with respect to the centroid of each cluster). This cluster position agrees with the spectral content observed in Fig. 7(a), where greater average spectral power is evident in the [10, 30] MHz band.
As can be seen in Fig. 8, the artificial signals generated by the algorithm exhibited a temporal behavior similar to that of the real signals obtained experimentally. It should be noted that, with this single training, the algorithm can generate many more signals if required.
To evaluate the spectral behavior of this new group of signals, the normalized average spectral power and the PR map were obtained. As in Fig. 7(a), three spectral power peaks were clearly identified in Fig. 9(a) at the same frequencies (3.4 MHz, 32 MHz and 54 MHz). In addition, the highest spectral power peak was also at 3.4 MHz. As shown in Fig. 9(b), when the SPCT is applied to the artificial signals, the cluster obtained is located in the same zone of the separation map as the cluster obtained with the real signals (Fig. 7(b)). However, when both clusters are compared, the points are distributed differently, which rules out overfitting during the generation of the artificial signals. Regarding dispersion, the behavior of this cluster matches that of the cluster obtained with the real signals, that is, the dispersion in PRL was greater than in PRH (σ_PRL = 5.10 and σ_PRH = 3.84).
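The dispersion figures quoted for each cluster can be computed as the per-axis standard deviation about the cluster centroid, as sketched below. Whether the original values use the population or the sample form of the standard deviation is not stated; the population form and the function name are assumptions, and the example points are made up for illustration.

```python
import math


def cluster_dispersion(points):
    """Return the centroid and the per-axis standard deviation of a cluster.

    `points` is a sequence of (PRL, PRH) pairs. The population standard
    deviation (division by N) is assumed.
    """
    n = len(points)
    if n == 0:
        raise ValueError("cluster is empty")
    centroid_prl = sum(prl for prl, _ in points) / n
    centroid_prh = sum(prh for _, prh in points) / n
    sigma_prl = math.sqrt(sum((prl - centroid_prl) ** 2 for prl, _ in points) / n)
    sigma_prh = math.sqrt(sum((prh - centroid_prh) ** 2 for _, prh in points) / n)
    return (centroid_prl, centroid_prh), (sigma_prl, sigma_prh)


# Illustrative points only; these are not measured values.
example_cluster = [(1.0, 2.0), (3.0, 2.0), (5.0, 2.0)]
centroid, (sigma_prl, sigma_prh) = cluster_dispersion(example_cluster)
print(centroid, round(sigma_prl, 3), sigma_prh)
```

Comparing the centroid and the two σ values of a real cluster with those of its artificial counterpart gives a compact numerical version of the visual comparison between the two PR maps.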

B. COMPARISON OF REAL AND ARTIFICIAL INTERNAL PD SOURCES
Internal PD were obtained by applying 8.9 kV to the test object, which was based on nine sheets of NOMEX paper (see Section II). For this experiment, a total of 1,000 signals were stored while maintaining a high trigger level (5.1 mV) to avoid low-magnitude noise pulses. The measurement process began five minutes after applying high voltage, when the PD activity was stable. Fig. 10 shows the normalized average spectral power and the PR map for this type of PD source. According to Fig. 10(a), when evaluating the average frequency spectrum of these signals, the two main peaks of spectral power can be observed at 3 MHz and 12 MHz, respectively, the peak at 3 MHz being higher.
On the other hand, the PR map in Fig. 10(b) shows that on the PRL axis, the cluster is highly dispersed (σ PRL = 8.17), while for the PRH, the dispersion is low (σ PRH = 1.42). Also, this cluster shows that the spectral power content is low in the PRH, which places the cluster near the PRL axis. As expected, the shape and position of the cluster are completely different from those obtained with corona PD.
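The dispersion values quoted for each cluster can be reproduced with a short calculation. The following sketch assumes that σ is the per-axis standard deviation of the cluster points about their centroid, as the text describes; the function name and the use of the population form (division by N rather than N - 1) are assumptions.

```python
import numpy as np


def cluster_dispersion(prl, prh):
    """Return the centroid and the per-axis standard deviation of a PR-map cluster.

    Assumption: population standard deviation (ddof = 0) about the centroid.
    """
    points = np.column_stack([prl, prh]).astype(float)  # one row per signal
    centroid = points.mean(axis=0)                      # (mean PRL, mean PRH)
    sigma = np.sqrt(((points - centroid) ** 2).mean(axis=0))
    return centroid, sigma
```

A cluster that is stretched along the PRL axis and narrow along the PRH axis, like the internal PD cluster described above, yields a first component of `sigma` much larger than the second.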
For the training process of the GAN algorithm, 500 signals out of the 1,000 experimentally obtained signals were used. Once the training was finished, 1,000 artificial signals from internal PD were generated. Fig. 11 shows the normalized average spectral power and the PR map for this new group of signals. In both cases, the results agree with those obtained experimentally; the spectral power peaks are in the same frequency bands, and the highest spectral power peak is also at 3 MHz. When applying the SPCT to the artificial signals obtained for this type of source, a cluster similar to that obtained in Fig. 10(b) can be seen in Fig. 11(b), also located close to the PRL axis and showing the same shape, that is, the dispersion of the cluster in the PRL remains greater than in PRH (σ PRL = 8.59 and σ PRH = 1.51).

C. COMPARISON OF REAL AND ARTIFICIAL SURFACE PD SOURCES
Finally, for this last experiment, the ceramic bushing described in Section II was used as the test object. This test object was contaminated on the surface with saline solution to obtain stable PD activity at a voltage level of 9.5 kV. During the acquisition, a high trigger level was maintained at 6.3 mV, and 1,000 pulses associated with PD were stored. Fig. 12(a) shows the spectral power peaks measured with this type of PD source; the main three peaks are above  The PR map in Fig. 12(b) shows that the cluster associated with this type of PD source is in an area close to that of the cluster obtained for corona PD (below the main diagonal of the separation map); however, the dispersion and shape of the cluster are completely different. For this type of PD, the cluster dispersion is similar for both axes (σ PRL = 2.89 and σ PRH = 2.65); therefore, a much more homogeneous cluster is evident.
Once the 1,000 artificial signals were generated for this type of PD source (from the previous training with the 500 pulses of the real partial discharges), the normalized average spectral power and the PR map were obtained and are shown in Fig. 13. According to the frequency content observed in Fig. 13

VI. DISCUSSION AND CONCLUSION
According to the results obtained in Section V, subsections A, B and C, the spectral content of the real and the artificially generated signals showed similar behavior for the three different types of sources (corona PD, internal PD and surface PD). Likewise, the spectral power peaks for each signal obtained with both generation methods were in the same frequency bands; this correspondence was also observed above 75 MHz, where the spectral power values in each of the analyzed signals were low, that is, no spectral components of interest were evidenced. As shown in Fig. 15, when comparing the normalized average spectral power of each source, it was observed that, for some frequency bands, the spectral power was slightly higher in the artificially generated signals. For example, when analyzing the results for the corona PD sources, the average spectral power obtained from the artificial signals in some frequency bands was above that obtained with the real signals ([7 MHz-31 MHz], [38 MHz-53 MHz] and ). A similar behavior, but with different frequency bands, was also observed for internal PD and surface PD sources.
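Frequency bands such as those listed above can be located automatically once both average spectra are sampled on a common frequency axis. The following is a minimal sketch; the function name and the assumption that both spectra share the same frequency grid are illustrative and are not taken from the original work.

```python
import numpy as np


def bands_where_higher(freqs, power_a, power_b):
    """Return (f_start, f_end) for each contiguous run where power_a > power_b.

    Assumption: `freqs`, `power_a` and `power_b` have the same length and
    refer to the same frequency bins.
    """
    freqs = np.asarray(freqs, dtype=float)
    higher = np.asarray(power_a) > np.asarray(power_b)
    # Pad with False on both sides so every run has a rising and a falling edge.
    padded = np.concatenate(([False], higher, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2] - 1  # inclusive bin indices of each run
    return [(float(freqs[s]), float(freqs[e])) for s, e in zip(starts, ends)]
```

Called with the artificial spectrum as `power_a` and the real spectrum as `power_b`, the function returns the bands in which the artificial signals carry more average spectral power.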
On the other hand, when the SPCT was applied to each of the obtained signals, the technique placed each source in a different area of the separation map, and the experimentally and artificially generated sources that belonged to the same type of discharge fell in the same area. Likewise, for both forms of generation, the dispersion of the obtained clusters showed the same behavior, that is, σ PRL > σ PRH. However, the dispersion values in PRL and PRH for the clusters associated with the artificially generated sources were greater than those of the clusters obtained with the real PD sources (see Table 1). This variability in the artificially generated signals helps to rule out overfitting in the signals delivered by the implemented algorithm.
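Both statements can be checked directly against the dispersion values quoted in Section V. The sketch below only restates those reported numbers; the dictionary layout, the function name and the relative-increase figure are illustrative additions.

```python
# (sigma_PRL, sigma_PRH) as quoted in Section V, subsections A, B and C.
reported = {
    "corona":   {"real": (4.81, 3.06), "artificial": (5.10, 3.84)},
    "internal": {"real": (8.17, 1.42), "artificial": (8.59, 1.51)},
    "surface":  {"real": (2.89, 2.65), "artificial": (3.34, 3.25)},
}


def check_dispersion(values):
    """Summarize, per source type, the two relations stated in the text."""
    summary = {}
    for source, groups in values.items():
        real, artificial = groups["real"], groups["artificial"]
        summary[source] = {
            # sigma_PRL > sigma_PRH for both real and artificial clusters
            "prl_gt_prh": all(s[0] > s[1] for s in (real, artificial)),
            # artificial dispersion exceeds real dispersion on both axes
            "artificial_gt_real": all(a > r for a, r in zip(artificial, real)),
            # fractional growth of dispersion from real to artificial, per axis
            "relative_increase": tuple((a - r) / r for a, r in zip(artificial, real)),
        }
    return summary


if __name__ == "__main__":
    for source, result in check_dispersion(reported).items():
        print(source, result)
```

For the three source types, both relations hold with the quoted values.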
In general, the previous results indicate that for each type of PD analyzed, the artificial signals, in addition to having a spectral and temporal behavior similar to the real signals, were classified by the SPCT as signals of the same type.
This was also confirmed in Section V, subsection D, where the implemented algorithm was trained with a mixed set of signals (real and artificial) that randomly included a higher percentage of artificial signals (70%). The results of this analysis showed that, for training and signal generation, the algorithm identified a single type of source, which was reflected in the presence of a single cluster when the artificially generated signals for each type of PD were clustered.
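A mixed training set of this kind can be assembled as sketched below. The group size of 500 signals and the 70%/30% split are taken from Section V, subsection D; the function name, the random draw without replacement and the fixed seed are assumptions, since the text does not state how the individual signals were selected.

```python
import numpy as np


def build_mixed_training_set(real, artificial, size=500, artificial_fraction=0.7, seed=0):
    """Draw a shuffled training set containing real and artificial signals.

    Returns the mixed array and a boolean mask marking the artificial rows.
    Assumption: signals are drawn at random without replacement.
    """
    real = np.asarray(real)
    artificial = np.asarray(artificial)
    rng = np.random.default_rng(seed)
    n_artificial = int(round(size * artificial_fraction))
    n_real = size - n_artificial
    real_idx = rng.choice(len(real), size=n_real, replace=False)
    artificial_idx = rng.choice(len(artificial), size=n_artificial, replace=False)
    mixed = np.concatenate([real[real_idx], artificial[artificial_idx]])
    is_artificial = np.concatenate(
        [np.zeros(n_real, dtype=bool), np.ones(n_artificial, dtype=bool)]
    )
    order = rng.permutation(size)  # shuffle so the two groups are interleaved
    return mixed[order], is_artificial[order]
```

The mask is not needed for training; it is returned only so that the composition of the set can be verified afterwards.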
Likewise, the position and shape of the clusters obtained with this new group of artificial signals were similar to those shown in Section V, subsections A, B and C. However, some slight variations in the dispersion of the clusters remain, that is, the dispersion in terms of PRL and PRH of the clusters formed from the training with mixed signals was greater than that of the clusters of artificial signals formed from training with real signals, which in turn was greater than that of the clusters obtained with real signals.
Despite the differences in the dispersion values of the clusters, all these results confirm that the artificial signals generated by the DCGAN algorithm implemented in this work can be considered an adequate alternative to the experimental generation of PD signals. However, this type of algorithm must still be tested for other types of sources, for example, white noise or impulse noise. Likewise, it would be of great interest to evaluate the behavior of the algorithm when training with simultaneous sources, either different types of PD sources or PD sources with electrical noise.
The authors hope that this new form of signal generation can be used to improve separation or identification processes by completing clusters with little representation: the algorithm is trained with the few acquired signals and then generates many more artificial signals, which can be used to complete incipient clusters. Likewise, these artificially generated signals could also be used for preliminary tests of new algorithms designed to separate or identify sources of PD or electrical noise without the need for the instrumentation and equipment required at the experimental level, requiring only a small group of signals that can then be multiplied while maintaining the same temporal and spectral characteristics of the real signals.

FIGURE 2. Schematic diagram of the separation and identification process for PD sources and/or electrical noise.

FIGURE 3. Schematic diagram of a GAN in which there are two networks competing with each other to generate samples similar to the real data (x).

FIGURE 4. Distribution of the algorithm based on DCGAN with the main components in each of its networks: (a) discriminator and (b) generator.

FIGURE 5. Images obtained from a single real (a) corona PD, (b) internal PD and (c) surface PD signals.

FIGURE 6. Schematic diagram of the proposed algorithm to artificially generate PD.

FIGURE 7. Experimentally generated corona partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 8. Example of a corona PD pulse: (a) experimentally obtained and (b) artificially obtained.

FIGURE 9. Artificially generated corona partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 10. Experimentally generated internal partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 11. Artificially generated internal partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 12. Experimentally generated surface partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 13. Artificially generated surface partial discharges: (a) normalized average spectral power and (b) PR map.

FIGURE 14. Clusters obtained from artificial signals generated with the DCGAN algorithm when the training was done with real and artificial signals: (a) corona PD, (b) internal PD and (c) surface PD.
(a), for these artificial signals, the three main spectral power peaks are in the same frequency bands as those obtained with the real signals (27 MHz, 36 MHz and 48 MHz). Likewise, in the PR map of Fig. 13(b), the cluster is located in the same position as the one obtained from the real signals (see Fig. 12(b)). As for the dispersion obtained for the cluster in both axes, the same behavior is shown (σ PRL = 3.34 and σ PRH = 3.25).

D. GENERATION OF ARTIFICIAL PD SIGNALS FROM TRAINING WITH MIXED SIGNALS
To evaluate the combined use of real and artificial signals in the training processes of ML-based algorithms, the implemented DCGAN algorithm was trained with three different groups of signals. Each group corresponded to one type of PD and consisted of 500 signals; 70% of them were artificial and 30% were real signals obtained experimentally from the tests described above. This procedure aimed to evaluate whether the new artificial signals generated with the DCGAN algorithm behaved similarly to the artificial signals generated when the algorithm is trained only with real signals. Likewise, this new form of training aimed to verify whether the algorithm was able to recognize the real and artificial signals as a single class or, on the contrary, found differences between both groups of signals, which would erroneously produce two clusters in the classification map for a single source type.
Fig. 14 shows the PR maps obtained, where the three clusters for each type of PD have a similar position and shape to those obtained previously for real and artificial signals (Sections V.A and V.B). For these new results, when analyzing the dispersion in terms of PRL and PRH of each cluster obtained, a behavior similar to that observed in clusters formed from real and artificial signals is evidenced:
• Corona PD: σ PRL = 6.13 and σ PRH = 4.3
• Internal PD: σ PRL = 8.8 and σ PRH = 1.06
• Surface PD: σ PRL = 5.26 and σ PRH = 4.99
For each type of PD source, it can be observed that the dispersion in PRL is greater than that obtained for PRH. Also, when comparing the dispersion of these new clusters to that obtained in the clusters from the previous

FIGURE 15. Comparison of spectral behavior between the experimentally and artificially generated PD sources.

TABLE 1. Dispersion values in PRL and PRH for the clusters obtained experimentally and artificially with the different types of PD.