Deep learning-based method for the robust and efficient fault diagnosis in the electric power system

The robust and efficient diagnosis of power quality disturbances (PQDs) in electric power systems (EPSs) is one of the most important steps in protecting a power system with minimal damage. However, the conventional fault detection methods used in the EPS mainly rely on heavy mathematical calculations, resulting in delayed actions against PQDs. To overcome these limitations, deep learning has recently been proposed to diagnose PQDs in the EPS, as it allows the extraction of features from a huge amount of data to delineate subtle differences in electrical waveforms under faulty conditions. In this study, a deep learning-based diagnostic method for PQDs was proposed by exploiting a convolutional neural network (CNN) and realistic simulated three-phase voltage and current waveforms obtained from the PSCAD/EMTDC software. Specifically, PQDs related to various faults in EPSs were assessed to demonstrate the applicability of the deep learning method as a fault diagnostic method. The proposed CNN model, trained by end-to-end learning and supervised learning approaches, successfully classified the type and location of the faults. Moreover, we found that the CNN model trained on simulated data obtained at a sampling rate of 50 Hz also diagnosed the faults with an accuracy of over 99%; therefore, the proposed method could be a potential diagnostic tool in practice.


I. INTRODUCTION
Electric power systems (EPSs) are likely to be severely damaged by natural disasters, such as earthquakes, lightning strikes, or floods. When a disturbance such as a short-circuit fault occurs, the EPS becomes unstable owing to changes in the line impedance. If proper actions are not taken promptly against a disturbance, there is a possibility of cascading failures, resulting in power outages over a large area. Therefore, the rapid detection of fault types and locations is crucial to protect the power system with minimal damage [1].
The fault diagnosis methods used in the EPS mainly rely on complex mathematical calculations. One limitation of these conventional methods is that they require a long computation time owing to the heavy calculations, which might worsen the damage by delaying the right decision to protect the EPS. To overcome these shortcomings, recent studies have attempted to apply deep learning (DL) methods to the detection and classification of power quality disturbances (PQDs) [2]-[4]. A DL model is trained to maximize the probability of obtaining a correct output from a huge amount of data, and a well-trained model can provide an output almost instantly [5]. Therefore, DL methods have been widely applied to problems requiring rapid diagnosis, such as power system failure [6]. A few studies have demonstrated that DL methods can detect and classify PQDs using mathematically generated signals [7]-[9]. However, the data used in these reports were produced via a simple mathematical model and lack information about the effect of faults on the entire power system. Although these studies reported promising results of using artificial intelligence (AI) to diagnose PQDs, accurate and efficient diagnosis of PQDs in the EPS using an AI model remains challenging. One important issue in training DL models is acquiring real voltage and current waveforms under fault conditions with high sampling rates from every point in the EPS. This is difficult because relays, the protective equipment that collects voltage and current information at a certain sampling frequency to detect abnormal waveforms, are installed at only a few places in the power system owing to cost and space issues.
In this study, we demonstrated that a convolutional neural network (CNN), a popular and powerful deep learning method for analyzing images [10] and signals [11], can accurately classify the location and type of the faults in the EPS. Data for training and testing the CNN models were prepared via a power system analysis program, which allows a more realistic simulation for obtaining three-phase voltage and current waveforms affected by the fault compared to the simple fault models used in previous studies. The simulation model used in this work is a standard tool for analyzing the complicated phenomena in the power system; thus, simulated data shows outcomes that involve various interactions in the power system under fault conditions. To avoid biases in the training and test data, a simulation was performed with random variations of the fault type, line fault location, and onset of the fault. We employed end-to-end learning of the CNN model and demonstrated that the fault type and location were accurately classified via the trained CNN model. Moreover, the optimal sampling rate for data acquisition was investigated to enhance the training speed with low computational costs by reducing the dimension of input data while retaining the performance of the CNN model. From this proof-of-concept study, we showed the CNN model could be a novel solution for the fault diagnosis in the EPS.
The rest of the paper is organized as follows: In Section II, we briefly describe the background of the CNN method and DL application in EPS. In Section III, we provide a detailed description of the test system configuration, data acquisition method, architecture of the CNN model, and validation method for the performance of the CNN model. The results are presented in Section IV. Finally, discussion and open challenges are provided in Section V.

II. BACKGROUND

A. DEEP LEARNING
Due to advances in the accessibility of big data and computational power, DL, a class of machine learning algorithms, has been widely used in various fields, including autonomous driving [12], speech recognition [13], computer vision [14], and medical imaging [15]. Traditional machine learning techniques require handcrafted feature extraction and selection, which limits their ability to delineate data with subtle differences [5], [16]. In contrast, DL, which consists of multiple layers, allows automatic feature extraction from raw data. During training of the DL model, all parameters are optimized to maximize the probabilities of correct outputs; thus, large amounts of data and high computational power are required to establish an accurate model [5]. Although it takes considerable time to complete the training process, the trained model provides an output almost in real time. As DL can accurately recognize patterns in data that are otherwise hard to distinguish, several studies have exploited DL for diagnosing PQDs in the field of power systems.

B. CONVOLUTIONAL NEURAL NETWORK
In this study, a CNN was employed as the DL method to diagnose PQDs. The architecture of a CNN consists of an input layer, hidden layers, and an output layer. The hidden layers are stacked as convolution layers, pooling layers, and fully connected layers [17]. The convolution operation is the key component of a CNN; it extracts features from data via sliding filters (kernels). Pooling layers reduce the data dimension and the number of trainable parameters, and fully connected layers flatten the input data to increase learnability. The output layer is the last fully connected layer and is used for classification or regression. Various CNN models, such as LeNet-5, AlexNet, ZFNet, and VGG, have been developed and employed in classification and regression problems [18].
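The sliding-filter and pooling operations described above can be illustrated with a minimal sketch in plain Python. The toy signal, the three-tap difference kernel, and the function names are assumptions chosen for illustration and are not taken from the proposed model; as in most DL frameworks, the "convolution" here is implemented as a cross-correlation without padding.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` (no padding) and take the dot product at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified linear unit: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Keep the maximum of each non-overlapping window; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Toy waveform with a step change, and a difference kernel that responds to the step.
signal = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
kernel = [-1.0, 0.0, 1.0]

features = relu(conv1d_valid(signal, kernel))  # non-zero only around the step
pooled = max_pool1d(features, size=2)          # half-length summary of the features
```

In this toy case, the feature map is non-zero only where the waveform changes, and pooling halves its length while retaining that response. In a trained CNN the kernel weights are learned rather than fixed by hand.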
The error backpropagation algorithm effectively trains the CNN model by computing the loss function between the expected outputs and the outputs propagated through the network [5]. During the iterations of the error backpropagation process, trainable parameters, such as kernel weights and biases, are optimized to minimize the loss function. Therefore, sufficient learning data are needed to establish an accurate CNN model. In the field of machine learning, there are two main approaches to training an AI model: supervised learning and unsupervised learning [19]. Supervised learning trains the AI model using data with ground-truth labels (the correct answers), allowing the computation of the loss function. This is extremely useful when teaching a model to yield the desired outcome. In contrast, unsupervised learning is used to find patterns in data or for clustering without labeling information.
We utilized a supervised learning approach to classify PQDs (the purpose of this study) because it is a multiclass classification problem. One of the problems that occurs in supervised learning is overfitting, a phenomenon in which the model analyzes the training data with high accuracy but does not work properly on other data not shown during training [19]. To address this issue, only about 80% of the total learning data was used for model training, and the remaining data were used for validation of the model. A few techniques, such as batch normalization and dropout, have also been developed to minimize overfitting problems.
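The hold-out split described above can be sketched as follows. This is a minimal illustration in plain Python, assuming a simple random shuffle of sample indices; the function name, the seed, and the use of the 13,000-sample dataset size reported in Section III are choices made for this example, not details of the original implementation.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into a training part and a held-out part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is reproducible
    n_train = int(round(n_samples * train_fraction))
    return indices[:n_train], indices[n_train:]


# 13,000 is the dataset size reported in Section III; 80% gives 10,400 training samples.
train_idx, held_out_idx = split_indices(13000)
```

Because the held-out indices never appear in the training part, accuracy measured on them indicates how well the model generalizes rather than how well it memorized the training data.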

C. DEEP LEARNING FOR ELECTRIC POWER SYSTEM
Load forecasting is a representative field in which deep learning techniques can be applied to the EPS. The load forecasting results are used to determine electricity prices in the market and to plan generator operations to secure the power reserve and transmission capacity. Therefore, accurate power demand forecasting is helpful for stable power system operation and efficient power market operation. In general, future power demand is predicted using a mathematical combination of power demand and related information, and several studies have been conducted to apply DL to this field. Numerous studies on DL applications achieved fast results through the use of proper DL methods. Load forecasting can be categorized according to the time domain, and many studies with DL applications deal with short-term load forecasting (STLF). These studies cover load forecasting from the aggregated level down to the building level. Bouktif et al. and Kong et al. performed load forecasting using the long short-term memory (LSTM) recurrent neural network (RNN) method [20], [21]. Kuo et al. proposed a DeepEnergy model based on a CNN for power demand prediction [22]. It has also been shown that the aggregation of all individual forecasts provides better performance than aggregated load forecasting [23]. In [24], a pooling-based deep RNN approach was proposed for household load forecasting [25], [26]. DL methods have also been attempted in long-term load forecasting (LTLF), wherein forecasts for a relatively long period are made.
Furthermore, attempts to apply DL in terms of renewable energy in EPS have been made. Wind and solar, which are primary renewable energy sources, fluctuate frequently, thereby leading to variation in the power generated by renewable sources. The output forecasting of these distributed power sources is necessary for the stable operation, control, and planning of each power source. Various methods (statistical, physical, and ensemble methods) have been used for output forecasting, and recent studies have utilized DL in this field [27]- [35]. Forecasting studies using DL are usually conducted in a short-term domain, and a few of them propose a hybrid form in which multiple DL methods are mixed [36]- [38].
Studies on the application of DL for the detection and classification of PQDs are also essential. It is important to determine the type of fault in the power system and immediately protect the system. DL is suitable for the detection and classification of PQDs because it has the advantage of making quick judgments. With the development of metering technology in smart grids, voltage, current, and active and reactive power consumption can be measured in detail, which makes it easy to acquire learning data. Studies have been conducted to apply various DL methods to the detection and classification of PQDs. Specific attempts have been made to classify PQDs using stacked auto-encoders (SAEs) and stacked sparse auto-encoders (SSAEs), and they produced positive results in terms of classification [7], [39], [40]. Deng et al. [41] attempted to detect PQDs using an RNN, and studies using a deep belief network (DBN) have also been conducted [42], [43]. Various CNN architectures have been used to classify PQDs, and studies on hybrid CNN utilization methods have also been conducted for detection and classification [9], [44]-[47].

III. METHODS AND MATERIALS
TABLE 1. Types of faults (types 5 to 10).
5   Phase A,C to Ground
6   Phase B,C to Ground
7   Three-phase fault: Phase A,B,C to Ground
8   Line-to-line: Phase A,B
9   Line-to-line: Phase A,C
10  Line-to-line: Phase B,C

A. ELECTRICAL POWER SYSTEM FOR SIMULATING PQDS
PQDs in the EPS were simulated using the PSCAD/EMTDC software, which has been widely used to study the behavior of complicated electrical systems. This simulation tool allows for the acquisition of realistic voltage and current waveforms under various fault conditions, which carry important information about the effect of a fault at a single point on the entire power system. As described in the Introduction, it is very challenging to obtain real data at every point of the EPS; therefore, simulation is the best option for a proof-of-concept study of applying DL to fault detection in the EPS. To obtain the data required for training and testing a DL model, a simple power system consisting of three transmission lines (T/L) and one load was used. The load was modeled as a constant-power load. The current and voltage were measured at each point, as shown in Figure 1. Although a single-line diagram is shown in Figure 1, a three-phase system was utilized to obtain three-phase voltage and current information. Faults occurred at one of the fault locations 1-3. Figure 2 shows the representative three-phase voltage and current waveforms under normal and faulty conditions. An appropriate T/L model comprising the conductors, ground wire, and bundling position data was selected and used to obtain simulation data close to real electric signals.

B. ACQUIRING DATA FOR TRAINING A CNN MODEL
For training and testing a CNN model, a total of 13,000 samples with and without faults were acquired through iterative simulation using the multiple-run component provided by the PSCAD/EMTDC program. Table 1 lists the types of faults that occurred during the simulation. Among the various fault types, fault types 1, 4, 7, and 10 were selected as representative fault types.
To minimize biases in the simulated data, the fault location, fault type, fault onset time, and fault duration were randomly selected. In addition, random harmonic generation parts were included in the power system to make the data more realistic. These randomly selected parameters allow the acquisition of an unbiased dataset. The specific details of the random parameters are listed in Table 2. Information related to the fault in each dataset was recorded to be used as the ground-truth data for training the CNN model. The simulation was performed for 2 s with a sampling rate of 2000 Hz; therefore, each signal has 4,000 data points. Because of the four measuring points in the power system, there were 12 output signals in each simulation, including the three-phase current and voltages measured at fault locations 1, 2, and 3. Simulations for each fault class (four fault types at three locations, i.e., 12 classes) were performed 1,000 times with random parameters; therefore, a total of 12,000 fault simulations were performed. Moreover, 1,000 no-fault simulations were performed for normal-condition data. Therefore, a total of 13,000 simulation samples were used for training and testing the CNN model. The details of the iterative simulation are presented in Table 3.
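The dataset dimensions stated above follow from simple arithmetic, which can be checked with a short sketch. The constant names, the integer label encoding, and the (samples, time steps, channels) array layout are assumptions made for illustration; the reading that each of the 12 combinations of fault type and location was simulated 1,000 times is inferred from the reported totals.

```python
SAMPLING_RATE_HZ = 2000
DURATION_S = 2
SIGNALS_PER_SIMULATION = 12        # output signals per simulation, as stated in the text
FAULT_TYPES = (1, 4, 7, 10)        # representative fault types selected from Table 1
FAULT_LOCATIONS = (1, 2, 3)
RUNS_PER_FAULT_CLASS = 1000        # assumed: 1,000 runs per (type, location) pair
NO_FAULT_RUNS = 1000

points_per_signal = SAMPLING_RATE_HZ * DURATION_S
fault_classes = [(t, loc) for t in FAULT_TYPES for loc in FAULT_LOCATIONS]
total_simulations = len(fault_classes) * RUNS_PER_FAULT_CLASS + NO_FAULT_RUNS

# Hypothetical integer labels: 0 for the normal condition, 1..12 for (type, location).
label_of = {"normal": 0}
for index, fault_class in enumerate(fault_classes, start=1):
    label_of[fault_class] = index

# Assumed array layout for the whole dataset: (samples, time steps, channels).
dataset_shape = (total_simulations, points_per_signal, SIGNALS_PER_SIMULATION)
```

With this encoding, a type 7 fault at location 2 maps to a single integer label, which is the form of ground truth expected by a sparse categorical cross-entropy loss.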

C. 1D-CNN MODEL FOR THE DETECTION AND CLASSIFICATION OF PQDS
We established a one-dimensional CNN (1D-CNN) model based on a vanilla CNN architecture. The proposed 1D-CNN model consists of two 1D convolutional layers, a max-pooling layer, and a fully connected layer followed by a softmax classifier and an output layer (Figure 3). A constant kernel size of 1 × 7 was applied to the convolutional layers. A rectified linear unit (ReLU) was employed as the activation function of the convolutional layers. The max-pooling layer with a 2 × 2 filter down-samples the input signal to reduce the complexity for further layers. Batch normalization and dropout with a rate of 0.1 were applied to the CNN model to prevent overfitting. The number of output classes was set to 13: the normal condition plus four fault types at each of three locations. The proposed CNN model was trained using a supervised learning method. The simulated data were randomly divided into a training set (80% of the total dataset, i.e., 10,400 samples) and a testing set (20% of the total dataset, i.e., 2,600 samples). No pre- or post-signal processing was applied, to demonstrate the end-to-end capability of the proposed CNN model for the detection and classification of PQDs. The batch size and number of epochs were fixed at 64 and 50, respectively. The proposed model was trained using the Adam optimizer (learning rate of 0.001) with a sparse categorical cross-entropy loss function.
The entire training and testing process was implemented in Python 3.0 using the TensorFlow, Keras, and NumPy libraries. A machine with an i7-9700K CPU, 64 GB RAM, and an NVIDIA RTX 2070 SUPER graphics processing unit was utilized for the experiments. Source code and data are available upon request.
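To make the softmax classifier and the sparse categorical cross-entropy loss concrete, the following sketch computes both for a single sample with 13 classes. It is a plain-Python illustration of the standard definitions, not the TensorFlow/Keras code used in this study; the logit values and the chosen true class are arbitrary.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the integer ground-truth class."""
    return -math.log(probabilities[true_class])


NUM_CLASSES = 13  # normal condition plus four fault types at three locations

# An untrained model that scores every class equally is maximally uncertain.
uniform = softmax([0.0] * NUM_CLASSES)
loss_uniform = sparse_categorical_cross_entropy(uniform, true_class=3)

# A model that scores the correct class much higher than the others has a small loss.
confident_logits = [0.0] * NUM_CLASSES
confident_logits[3] = 10.0
loss_confident = sparse_categorical_cross_entropy(softmax(confident_logits), true_class=3)
```

For the uniform case the loss equals ln(13), roughly 2.56, which is a useful reference point: a training loss near this value means the model is doing no better than guessing among the 13 classes.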

D. EVALUATION OF PERFORMANCE OF THE PROPOSED CNN MODEL
The performance of the proposed model was evaluated using the following metrics: accuracy, specificity, precision, recall, F1-score, and Kappa. Prior to evaluating the performance of the CNN model, a confusion matrix of the classification result was retrieved using the trained CNN model and the test dataset. It showed the numbers of true positives (TPs), false positives (FPs), true negatives (TNs), and false negatives (FNs). The mathematical equations for evaluating the model performance are as follows:
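The listed metrics can be computed from a confusion matrix as sketched below. The formulas are the standard textbook definitions in terms of TP, FP, TN, and FN, with Cohen's kappa computed from observed and chance agreement; whether the study averaged per-class values in a particular way is not stated, so the one-vs-rest treatment and the 3 × 3 toy matrix here are assumptions for illustration only.

```python
def one_vs_rest_counts(confusion, cls):
    """TP, FP, TN, FN for class `cls`; rows are true classes, columns are predictions."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp
    fp = sum(row[cls] for row in confusion) - tp
    tn = total - tp - fn - fp
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from the four counts (denominators are assumed to be non-zero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def cohen_kappa(confusion):
    """Kappa = (p_o - p_e) / (1 - p_e): agreement corrected for chance agreement."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy 3-class confusion matrix (not data from this study).
toy_confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [1, 0, 9],
]
metrics_class0 = binary_metrics(*one_vs_rest_counts(toy_confusion, 0))
kappa = cohen_kappa(toy_confusion)
```

For the toy matrix, class 0 has a recall of 0.8 and a specificity of 0.95, and the overall kappa is 0.8.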

IV. RESULTS

A. CLASSIFICATION OF FAULT TYPES AND LOCATIONS USING THE CNN MODEL
To test whether the CNN model could accurately classify the fault type and location from the simulated voltage and current waveforms, the CNN model was trained using the full simulation data. As mentioned above, the simulation data include time-varying voltage and current waveforms obtained at a sampling rate of 2000 Hz for 2 s, resulting in 4,000 data points per signal. Because of this high sampling density, the waveform shows a continuous profile, as shown in Figure 4a. The simulation data were used directly for training the CNN model without any data pre-processing, such as normalization or standardization; this approach is known as end-to-end learning. As the end-to-end learning method is a practical tool for PQD diagnosis because it saves time for data communication and model training, we tested the capability of the end-to-end learning-based CNN model for diagnosing PQDs. The graph in Figure 5 demonstrates the learning performance of the CNN model during the 50-epoch training process. Training and test accuracies greater than 0.99 were achieved within 10 epochs. Moreover, the training and test losses approached zero as the accuracy increased. Table 4 shows a confusion matrix that describes the performance of the CNN model on the test data. When the full information was exploited for the training process, the performance metrics, including accuracy, specificity, precision, recall, F1-score, and Kappa, had values close to 1, indicating that the CNN model shows excellent performance for the classification of the types and locations of PQDs.

B. EFFECTS OF REDUCING SAMPLING RATES ON THE PERFORMANCE OF THE CNN MODEL AND TRAINING SPEED
There are trade-offs between the performance and efficiency of exploiting full data for CNN model training. On the one hand, an accurate CNN model can be trained by using full information; however, it requires a long training period and huge computational power owing to the large data size. On the other hand, reduced information can efficiently train the CNN model, but the performance may have to be compromised.
To test whether the CNN model could be accurately trained with reduced information, training and test data were prepared with sampling rates of 2000 Hz, 1500 Hz, 1000 Hz, 500 Hz, 300 Hz, and 50 Hz. Figure 4 shows the representative voltage waveforms of the type 1 fault obtained at various sampling rates over 0.4 s. The representative voltage signals become discrete as the sampling rates are decreased, whereas the original voltage signal shows a continuous profile over time. Next, we tested whether the CNN model could accurately detect fault types and locations from low-sampled data.
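The effect of the sampling rate on the input dimension can be sketched as follows. The number of points per 2-s signal follows directly from the stated rates. How the low-rate data were actually produced (for example, by re-running the simulation or by resampling) is not described here, so the stride-based decimation below is only an assumed illustration; it applies no anti-aliasing filter and works only when the target rate divides the full rate.

```python
DURATION_S = 2
FULL_RATE_HZ = 2000
RATES_HZ = (2000, 1500, 1000, 500, 300, 50)

# Input length per signal at each sampling rate (rate x duration).
points_per_signal = {rate: rate * DURATION_S for rate in RATES_HZ}


def decimate_by_stride(samples, full_rate_hz, target_rate_hz):
    """Keep every (full/target)-th sample; only valid when the ratio is an integer."""
    if full_rate_hz % target_rate_hz != 0:
        raise ValueError("target rate must divide the full rate for stride decimation")
    step = full_rate_hz // target_rate_hz
    return samples[::step]


# Placeholder samples 0..3999 standing in for one 2-s waveform at 2000 Hz.
full_signal = list(range(FULL_RATE_HZ * DURATION_S))
signal_50hz = decimate_by_stride(full_signal, FULL_RATE_HZ, 50)
```

At 50 Hz each signal shrinks from 4,000 to 100 points, a 40-fold reduction in input length, which is the source of the reduced training cost examined in this section. Rates such as 1500 Hz and 300 Hz are not integer divisors of 2000 Hz and would require a different resampling method.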
The learning performance of the CNN models trained using different datasets is shown in Figure 5. Overall, all CNN models achieved an accuracy greater than 0.99 within 50 epochs. The number of epochs required to reach an accuracy of over 0.99 increased as the sampling rate decreased. The CNN model trained using the full information attained an accuracy greater than 0.99 within 10 epochs, whereas 30 epochs were required to train the CNN model using data obtained at a sampling rate of 50 Hz. Table 4 summarizes the performance of each CNN model. Next, the time required to complete 50 epochs of training was assessed to determine how the training speed depends on the sampling rate. Table 5 shows the elapsed time to complete 50 epochs. Training became significantly faster as the sampling rate decreased.

C. COMPARISON WITH THE SUPPORT VECTOR MACHINE MODEL
We adopted a support vector machine (SVM) model for comparison [48]. The SVM model has been widely used for classification and regression problems, as it can efficiently find a hyperplane that separates multiple classes. The proposed CNN model was compared with the SVM model to check whether the proposed model is an appropriate method for fault detection in the EPS. For a fair comparison, the SVM model was trained using the same dataset used for training the CNN model. Principal component analysis of the voltage and current waveforms was performed, and the first 20 principal components were used as the input data of the SVM model. Figure 6 shows the accuracy of each model trained using data obtained at various sampling rates. The SVM model achieved its highest accuracy of 96.8% when trained with the full information, whereas it showed 95.5% accuracy with data obtained at a sampling rate of 50 Hz. Overall, the CNN model outperformed the SVM model even when the CNN model was trained using low-sampled data.

V. DISCUSSION AND OPEN CHALLENGE
This study demonstrated that the CNN model trained on simulated PQD data enables the accurate classification of fault types and locations in the EPS. Notably, the proposed method was trained using an end-to-end learning approach without data pre-processing, which reduced the time required for diagnosing faults using the CNN model. Moreover, we demonstrated that the CNN model trained using low-sampled data could diagnose faults with an accuracy of over 99%, which indicates that the CNN can be used with electrical waveform data obtained from a cost-effective relay in a power protection system. A comparison between the proposed CNN model and the SVM model was performed, and we found that the CNN model outperformed the SVM model even when the training was conducted with low-sampled data. State-of-the-art relays measure voltage and current waveforms at a high sampling rate of over a few thousand Hz, but it is challenging to install relays at every place in the EPS owing to cost and space issues. If accurate fault diagnosis in the EPS could be performed using low-sampled waveforms obtained via a cost-effective relay, then the proposed method might overcome the aforementioned cost and space issues of placing relays. Collectively, the finding that the CNN model accurately diagnoses faults in the EPS from low-sampled electrical waveform data suggests that the proposed method could be implemented as a fault diagnostic tool in practice.
Although this proof-of-concept study showed that the proposed method is capable of diagnosing the types and locations of PQDs by exploiting a deep learning method, a few aspects remain to be improved. First, as the purpose of the study was to show the applicability of the CNN model to fault detection, a controlled simulation was performed with less complicated data in a simple electrical power system. To overcome this limitation, a simulation will be performed in a complicated electrical power system to obtain more realistic data. Second, the proposed method utilized a vanilla CNN model, which might not be an optimized CNN model for complicated PQD data obtained from a real power system. In the future, advanced CNN models used in image processing, such as VGG, AlexNet, and GoogLeNet, will be tested to implement an accurate CNN model for diagnosing the fault types and locations of PQDs from complicated data. Finally, only harmonic noise was added to the simulation data as random noise to avoid dataset bias. However, other real-world noise sources also affect the power system. Thus, realistic random noise will be added to the simulation data in future studies.
Despite these limitations, the proposed method demonstrates that the CNN model can accurately diagnose faults in a power system using an end-to-end learning method and reduced data. This suggests that the proposed method is a versatile and efficient diagnostic tool for fault type and location. Moreover, by testing the effects of low-sampled data on training the CNN model, we identified the range of sampling frequencies required for measuring voltage and current data. Data acquired at a sampling rate of 50 Hz were sufficient for training the CNN model to achieve high performance. If the proposed method is validated using data obtained from a complicated power system, it will open up a new era for diagnosing PQDs.