A Novel Methodology for the Determination of Impulse Response Coefficients Applied to Transmission Line Protection Relays

Impulse Response Coefficients (IRC) of digital filters is an imperative step in the development of transmission line protection relay algorithms. Traditionally, Fourier-based filters are used in real applications, where IRC are fixed values of sine and cosine functions with a data window of one or more cycles. Based on state-of-the-art, Mother Wavelet coefficients used in Multiresolution Analysis, and Structuring Element coefficients used in Mathematical Morphology are usually proposed to develop protection algorithms. However, the proper choice of these coefficients is based on empirical process of trial and error. This paper proposes a novel methodology for optimally determining coefficients that depend on the waveform structure analyzed, which is determined using variance as the metric. Assessment of methodology for three case studies considering requirements of relay manufactures (response time, design, harmonic attenuation and other) is presented. The first assessment is to extract coefficients useful for distinguishing among non-fault conditions, harmonics, and arcing faults. The second one is to extract coefficients to filter harmonic components. The assessment is carried out considering different data windows and sampling rates. Test results highlight the efficiency of the model to determine specific coefficients for each case study analyzed. Interestingly, results also showed that the discovered coefficients can be used in another filtering technique. Thus, the third case study involves developing two fault classifiers, which are developed using mathematical morphology where the structuring elements used correspond to the coefficient vectors determined through the proposed methodology. There is a notable paucity of scientific literature focusing on this topic. Therefore, there are several important areas where this study makes an original contribution regarding protection relays.

INDEX TERMS Transmission lines, power system faults, power system protection, relays, convolution and digital filters. IEDs are applied to minimize faults effects and abnormal phenomena on the operation of EPS, especially on TLs which are elements with the highest probability of faults occurring [1]. The 4th generation of protection devices has promoted the development of new relay algorithms, where the DSP has a pivotal role in the IEDs behavior [2]- [5]. Hence, the digital Filter design is an important aspect for analyzing the state of TLs, where FIR filters are usually used in real applications [5]- [11]. Currently, mostly protective relays use Fourier-based filters, which extract the fundamental component about one cycle of the fundamental frequency [7]. However, several transient signals produced by faults, LSs and other phenomena can produce an input signal highly contaminated with noise and high-frequency components [12]- [14]. For example, LS is a phenomenon with the highest noise level and high-frequency components. Obviously, in this case, the estimation of the fundamental component is calculated at a higher time than one cycle (3-5 cycles, approximately) [13]- [15]. Currently, the difficulty of traditional relays is to optimally find the IRCs that correspond to the desired response [2], [11], [16], [17]. As a result, trial-error empirical methods of filter design are usually applied by relay manufacturers [17], [18]- [21]. Other methods have been proposed to solve drawbacks of fixed coefficients but they are not adapted to protection relaying and no one has succeeded in replacing them [15] (see Section I.B). To sum up, IRCs of those methods are constant, and they cannot be modified or adapted to analyze data with different shapes and magnitudes. Therefore, one of the main obstacles is adequately choosing these IRCs. In addition, relay manufacturers indicate that there is an urgent need to address the IRC choice problem as an optimization problem [16], [18]- [22].

ATP Alternative Transients
Over the past four decades, most research in TL protection relays has emphasized the use of phasor estimation filters based on FT in real applications [6], [7], [22]- [25]. However, the major problem with this kind of application is not only its operation time, which uses data windows composed by IRCs of one or more cycles, but also the fact that these coefficients are predefined and assumed to be independent. In order to improve those drawbacks of traditional IRCs, several filters such as LES, Kalman, Walsh, Cosine, IIR and others have been proposed for TL protection relays [7], [11], [22], [7], [26]- [32].
As regards the Walsh filter, it uses IRCs at the interval [0,1] which only takes on the values ±1 [7], [26]. Thus, the amplitude and phase angle of the phasor of the fundamental component can be determined using those coefficients. As regards the Cosine filter, it is a simple filter with unitary filter impulse response that uses fixed IRCs of the cosine function with a data window of one cycle approximately. Another filter used by relays is based on LES, which minimizes the mean-square error between the analyzed waveform and the mathematical model coefficients of the waveform used as a reference [28]. However, since the upper boundary of the frequency response of this filter is higher than the Fourier-based filter, more errors can be introduced during phasor calculation [33]. A generalized filter from LES is the Kalman filter, which also transforms the input signal through the signal model coefficients. This filter must consider all possible unwanted components that are intended to be filtered out to obtain adequate phasor measurements. Moreover, Kalman and LES filters require not only a high computational effort compared to the Fourier filter, but also use predefined signal model coefficients that are chosen empirically [11]. Based on this information, regardless of the filter type used, the behavior depends on coefficients which are fixed and predefined. As regards the IIR filters, since they are based on the entire input history, they delay the analysis of fault waveform. Thus, FIR filters as opposed to these IRR filters are widely used for TL protection [5].
In addition, recent developments in the field of digital processing have led to a renewed interest in using SPT to try replacing traditional protection relays. As a consequence, MRA is viewed as an improvement over the Fourier-based filter because it deals with time-frequency resolution differently [34]. Similar to filters based on sine, cosine, Walsh, etc., MRA uses coefficients or weights corresponding to a MW [35]. However, the performance of MRA is limited by the proper choice of these coefficients [33], [11], [36], [37], which depends on the nature of the application and the nature of the signal analyzed. Therefore, that choice can be extremely harmful and difficult for researchers [11], [34]- [38]. Besides that, since the use of MRA in protection relay on TLs has been based on the generation of a huge amount of data in the form of wavelet coefficients concerning changes in scale and position, a large computational burden has been established [34].
A large and growing body of literature has proposed MRA to analyze most phenomena on TLs [39]- [63]. Furthermore, although there has been a great effort to prove that one wavelet is more suitable than another, currently there is no procedure for optimally choosing those coefficients. As a result, such approaches have been empirical in nature. Table 1 presents several MWs that were widely used in TL protection algorithms. From this table and other research based on the bibliographic review, it is possible to see that different MWs were used and chosen based on trial and error procedure. It is important to note that one study [63] has shown that the procedure of trial and error for choosing the best wavelet IRCs is very difficult when 40 MW were tested. See Table 2.
In addition, a considerable amount of literature has been published on MM and its application to TL protection relay algorithms, which uses several SEs such as linear, square, disk or ball-shaped, semicircular and others in order to transform shapes of original signals [64]- [73]. Similarly to MRA, the choice of an SE depends directly on the particular application of the protection type and database analyzed. However, there is no guideline for choosing the SE optimally, which is also chosen based on a trial-error procedure [66]. Table 1 presents some SEs used in research applied to this topic.
Overall, these studies clearly demonstrate that values of sine-cosine functions, ±1, and others with data windows of one or more cycles are used in real applications. In addition, all studies based on SPT reviewed here use empirical methods in order to find the best IRCs. Previous studies of digital filters have not dealt with the optimal extraction of IRCs applied to TL relaying.  [45] and [63].
In contrast to the above mentioned, this paper offers a novel methodology for optimally determining IRCs that can be applied to TL protection relay algorithms. These coefficients are based on the signal's variance value (see Section II).
The proposed work is divided into different sections: In Section I the introduction is presented. The mathematical basis of the proposed approach is presented in Section II. Transient models used in this approach are introduced in Section III. The first case study applied to IRC extraction, useful to distinguish between normal operations, harmonics, and arcing fault conditions is described in Section IV. The second case study, where the extracted IRCs are used as a transformation matrix to filter harmonic components is included in Section V. The third case study, where these extracted IRCs are applied as SEs in MM to classify fault types on TLs is explained in Section VI. An analysis of results according to the manufacturer requirements is described in Section VII. Finally, the main conclusions, contributions and future researches of this work are included in Section VIII.

II. MATHEMATICAL BASIS
MA with PCA is a useful method with time representation for the analysis of signals with and without transient features [74], [75]. Usually, MA of signals is expressed by an eigen decomposition fast algorithm that uses orthogonal coefficients called eigenvectors to transform the signal into new components called PCs. MA is useful to analyze an off-line database, where characteristic patterns are extracted and dimensional reduction is achieved. Then only the most relevant eigenvector is used in order to on-line analyze new transient signals.
As regards the off-line analysis, an original X database denoted by mxp, where m represents the number of transient signals and p represents the size of the signal which depends on the data window and sampling rate used, are transformed into a lower-dimensional space. Here, most of the original information is preserved, and most crucial eigenvectors are extracted, as follows: where the V new transformed matrix is of mxp size and the W matrix is of p × p size. The W matrix plays a crucial role in PCA behavior. Later on, the W and ψ arrays are calculated from VC matrix, which is obtained from the X matrix. Thus, it is necessary to calculate a VC matrix of p × p size as follows: Once the VC matrix is calculated, eigenvectors and eigenvalues matrixes are calculated as follows: where w j is a coefficient vector denoted in this research as VBMC, which satisfies the orthogonality condition and they are optimally extracted. Reference [74] describes the process optimization for calculation of VBMC. New transformed signals are determined only multiplying the original database by W.
It is necessary to note that in off-line analysis we discovered the W matrix formed by a series of VBMC, which gives a complete representation of transient signals. It is necessary to note that by using this projection, different characteristic patterns of original signals can be determined.
In order to choose the most important VBMC, the sum of eigenvalues is used, where if the variance value is higher than the threshold values, then those VBMC are chosen.
Each eigenvalue contributes to a percentage of the total variability as follows: where the numerator of eqn. (6) represents the variability of the first m eigenvalues, and the denominator represents the total variability of the original database.
By using the previous equation, it is possible to determine the number of coefficients that can be used for transforming data. In real applications, a threshold value (0.9) is usually used. Thus, coefficients are optimally determined to use variance value, which can also be chosen through the sum of eigenvalues. In addition, it is necessary to point out that the previous procedure is unique for each database analyzed. Therefore, coefficients obtained are considered the most suitable for analyzing said data, and a trial-error procedure is not necessary to determine these coefficients.

A. PROJECTING REAL-TIME SIGNALS
In addition, new signals that can happen unpredictably on the TL, corresponding to faults, LSs, and others must be analyzed in real-time. A special feature of PCA is that the W matrix can also be used to compute PCs for new signals that were not included in the off-line analysis.
The real-time analysis is developed by first positioning these testing signals into the PCA space and then projecting them onto the PCs. Thus, a new signal measured by the relay is represented by a discrete-time signal x test (t) that is, a 1 x p row vector, which is analyzed in real-time by means of the VBMC chosen, as follows: where, VBMC 1 , VBMC 2 are the number of coefficients based on eqn. (6), Xnw and Xmax are the mean vector and the maximum value of the original database calculated in the off-line analysis.
The previous mathematical procedure shows that the proposed methodology can be used in order to analyze signals not only off-line but also on-line.

III. TRANSIENT MODELS USED IN THIS APPROACH
To analyze the behavior of the proposed methodology data from events corresponding to normal waveforms, harmonics, arcing faults and LSs are simulated. In this paper, VOLUME 8, 2020 the ATP/EMTP software [76] is used to simulate two EPS. The first EPS used in this research contains six buses, five TLs, two transformers, and six generators. The 220 kV singlecircuit TLs with two ground wires are modeled with the frequency-dependent model (J. Marti) [77]. Spans of 400m between transmission towers are simulated. Two transmission towers on each side of the LS fault point are simulated. Elements and features such as transmission tower model, insulator string, resistance, and others have been simulated using international references [78], [79]. The second EPS used in this research consists of two areas connected by a 350km 500 kV TL and breakers that are activated when the fault occurs. In this case, voltage sources are considered. Similarly to the first EPS, in this case, the TLs are modeled using the model proposed by Marti [77]. The primary arc model for permanent faults and primary and secondary arc models for temporary faults are also simulated [80]. In both EPSs those events analyzed and their databases are registered by the R 1 relay at M bus. It is necessary to note that these databases were used for another research as they are previously reported by [81]- [83].
Also, to verify the feasibility of the proposed methodology with data windows and sampling rates similar to these used for the TBP and Phasor-based protection, in this research, data windows of 50µs, 4ms, one cycle, and 47 cycles are simulated. See Table 3. Moreover, Table 3 presents the EPS used in each case study carried out in this research work.
As regards harmonics, six signals made up of different harmonic levels corresponding to the 3 rd harmonic, 5 th harmonic, 7 th harmonic, (3 rd and 7 th harmonic), (5 th , 7 th and the 13 th harmonic), and (5 th , 7 th , 11 th and 13 th harmonic) are used. As regards temporary and permanent faults (auto reclosing), different arcing faults, considering primary and secondary arcs, are analyzed. In this case, the ESP shown in Fig. 1.b is simulated where the TL is divided into thirty-four parts with 10km at each section. Therefore, the permanent or temporary fault occurrence at 10km, 20km, 30km, and so on up to a distance of 340km from the location of the protection relay can be simulated. A total of 142 signals were generated. These signals are simulated with two data windows and sampling rates. Firstly, data windows of 100µs with 50 samples are simulated in case study 1 (see Section IV). Secondly, data windows of 48 cycles with 4000 samples during 0.8sg are simulated in case study 3 (see Section VI.B). This length of time is necessary to analyze and consider the generation and extinction of primary and secondary arcs. More details about these signals are presented in [83]. Finally, LSs are considered as ultra-high frequency transitory. In this research, there are two types of LSs, strike with fault (i.e. fault) and strike without fault (i.e. disturbance). Extensive simulation studies are carried out for nineteen different combinations of FPCM (6, 6.2, 7, 7.5, 8.5, 9, 9.5, 10, 20, 80, 100, 111, 120, 143, 160, 170, 200, 220, 250 kA), positive and negative polarity, TFR from 10 to 200 , and LS on the tower, on phase directly, or on mid span. Consequently, a database of 4758 direct LS signals, for which 1352 generate permanent faults and 3406 do not generate a permanent fault, are used. The data window of this database is 4ms with 4000 samples. Once samples were extracted, the VBMC analysis was performed using the proposed methodology applied to three case studies as presented below.

IV. CASE STUDY 1: VBMC S FOR DISTINGUISHING AMONG NO DISTURBANCE OR NO-FAULT, HARMONIC AND ARCING FAULT CONDITIONS
In this case, the proposed methodology is applied in order to distinguish sinusoidal waveforms, harmonics, and arcing faults. In order to achieve this, in first place, VBMCs corresponding to normal waveforms are determined. In second place, using these coefficients, harmonics and arcing faults are analyzed to determine a useful criterion for decisionmaking in the TL protection algorithm. In this case study, data windows of 50µs for sinusoidal and harmonic waveforms, and 100µs for arcing faults were used (see Table 3).
The A point at Fig. 2 shows the three-phase sinusoidal waveforms recorded by the R 1 relay at M bus with data windows of one cycle. Prior to analyzing the data, these previous waveforms are divided into data windows of 50 points where each point is sampled every 1µs. For example, at point B in Fig. 2, the first three data windows of 50µs are shown. As a result, the X data matrix is composed of the first data windows of 50 points. Next, a new sample becomes available and the oldest of the sample values is discarded. Consequently, the X matrix is composed of the first two data windows of 50 points and so on until the waveform cycle is completed. As a final result, 1200 data windows of 50 points are analyzed, forming an X matrix of 1200 × 50 size (see point C in Fig. 2). Once X matrix is determined these signals are processed through the proposed methodology, where 50 coefficients are determined to form a W matrix of 50 × 50 sizes (see point D in Fig. 2). Based on eqn. (6), the first and second column vector has the highest variance, which contains 100% of the variability of the original data. Therefore, only the first VBMC (see Point D1 in Fig.2) and the second VBMC (see Point D2 in Fig.2) are used.
Once the first two VBMCs are determined, the original data are sequentially projected on the 2D space. Point E in Fig. 2 shows that these data transformed or altered (PCs) through the first two VBMCs have a special characteristic represented by circular behavior. Based on that circular behavior, these values on both axes can be used to define a non-fault or non-disturbance zone (normal operation). Thus, in this research work, that zone is formed by its minimum and maximum values where for the first and second axes, the values are ±7.0710 and ±0.0319. See point E in Fig. 2. Apart from the circular behavior of sinusoidal waveforms that is determined in this research, other criteria are determined, which are presented in detail in [82]. Based on [2], and in order to increase stability against unwanted operations, it is appropriate that protection relays have more than one criterion in order to differentiate the normal operating conditions from the abnormal ones.
After these two VBMCs are extracted, the harmonic signals are processed or transformed using these coefficients. Hence, harmonic signal instantaneous values are recorded to form an x(t) vector of 1×50 size, where those values are sampled every 1µs (see Table 3). Then, that first data window is processed through the two previous VBMC. Next, a new data window of 50 points is formed where the oldest instantaneous value is discarded and the new sample is considered to form VOLUME 8, 2020 the new data window. Finally, this procedure is carried out continuously so that all instantaneous values recorded during one cycle are processed.
It is necessary to note that to optimally extract the VBMC of the sinusoidal waveform; the methodology presented in Section II is used. In the case of harmonic signals transformation, only these first two VBMCs must be used, i.e. those data windows of 1×50 size must be analyzed and processed in real-time/online, as is presented below. The new data window denoted by Xtest and which represents the harmonic signal is processed through eqn. (7). Afterwards, these harmonic signals are processed through the first two VBMCs. Interestingly, these signals present special behaviors. For instance, as it is shown in Fig. 3, the pattern for the 1 st and the 3 rd harmonic vary significantly more than the normal operation group. In Fig. 3, the red color spectrum represents the pattern related to the non-fault or non-disturbance condition, and the green color spectrum corresponds to 3 rd harmonic added to the nominal frequency. In the same way, other examples are presented from Fig. 4 to Fig. 6, which correspond to the 5 th harmonic (Fig. 4), 7 th harmonic (Fig. 5), and the 3 rd and 7 th harmonic ( Fig. 6). Table 4 and Fig. 7 provide a summary of the values of both VBMC axes for harmonic signals. In case of the 3 rd harmonic, it represents the smallest variation on the normal operation spectrum in comparison to other harmonics. Thus, the magnitude of these values increases according to the harmonic level. For the 5 th harmonic, the spectrum value is higher than the 3 rd harmonic on both axes. For the 7 th harmonic, its performance is similar to the previous, with a level higher than the 5 th and 3 rd harmonics on both axes. See Fig. 7. Therefore, it can be seen that the projected values on both VBMC axes tend to be higher than the non-fault or nondisturbance condition values. Table 4 presents the minimum and maximum values projected on both VBMC axes of the harmonic waveforms. Based on Table 4, it is possible to see that these minimum and maximum values are different from normal waveforms (first axis = ±7.0710 and second axis = ±0.0319). It is necessary to note that in Fig. 7,     As regards the assessment of arcing faults, it is necessary to note that in this case, the database used has special features. First, the data window of the x(t) vector is of 100µs formed by 50 samples recorded every 2µs. Thus, a vector of 1 × 50 size is used. Second, unlike those signals of normal operation and harmonics that correspond to the ESP shown in Fig. 1.a, arcing faults correspond to the EPS shown in Fig. 1.b. Therefore, a database with data windows different from the previous case and with different topology is used to test the viability of the first two VBMCs previously extracted. Similar to the case of harmonic signals, these permanent and temporary faults are processed in real time applying eqn. (7). In this context, the first data window is processed through the first two VBMCs corresponding to the non-fault or non-disturbance condition. Then, a new data window of 50 points is formed where the oldest instantaneous value is discarded and the new samples are considered in order to form the new data window. Results of the projection for the first one hundred data windows are set out in Fig. 8, where blue dots represent permanent faults and green dots represent temporary faults.
What is interesting in these data is that the variation of all arcing fault data windows on both axes increases drastically related to harmonics and normal operation. Thus, values  higher than 20 on both axes are determined. The projection of the first four data windows using the first VBMC for 15 permanent and temporary faults is presented in Table 5. Here it is possible to see that these values are higher than normal operation values even in the first data windows. For example, the projection of the first data window of the first permanent and temporary fault has values of 25.83 and 24.78 for the first axis (red in Table 5). The projection of the second data window has values of 24.83 and 23.73 for the first axis. As a result, it can be seen that in the axis corresponding to the first VBMC, arcing faults have considerable variation related to the highest harmonic spectrum value (±13) and to the nonfault or non-disturbance condition (±7.0710). The behavior of these different events in the space of the first VBMC is shown in Fig. 9, where A represents the zone for the nonfault or non-disturbance condition, B represents the zone for harmonic signals, and C represents the zone for arcing faults.
This finding has important implications for developing a criterion based on values of the first axis as follows: where li is a logic index that is used as a discriminator for the identification of the event type, and v fa is the value of the first axis. As a result, two threshold values are determined. In the first place, the threshold value for normal operation denoted by v fa <= t hn with t hn = 7.0710 is used. Thus, if the v fa index of a new signal is inside that range, it is identified as normal operation. In the second place, t hn <v fa <= t hh for harmonic conditions is used, and finally v fa >t hh for temporary or permanent fault is used. Based on Table 4, it is possible to determine that the absolute maximum value for the harmonic signals is 13 on the first axis. Thus, in this research, a safety margin is considered between harmonic and arcing fault signals, set at 1.3 of the maximum value of the harmonic i.e. t hh =1.3 × 13≈17. It is necessary to note that traditional TL protection relays use a safety margin based on a counting scheme to avoid unnecessary trip orders [3]. Therefore, in this research, a safety margin is also used. Point A in Fig. 1 shows the parameter setting of that proposed algorithm, which can be modified and determined if new events are considered, which is used by relay manufacturers [18].
In addition, based on eqn. (8), it can be seen that the l i logic index is 1 for arcing fault and 0 for normal operation and harmonic condition, due to the fact that the TL protection relay must send a trip order only when a fault condition is produced. Thus, when a condition different from fault is identified, the relay does not send a trip order. In this context, it must be noted that harmonic conditions must not be considered as faults and when these events are produced on TLs, the TL must not be disconnected. Several results for temporary and permanent faults are represented in Table 6, corresponding to a matrix of 74 new fault signals which are projected on the space of the first VBMC. Results show that these values are higher than the t hh threshold value. Consequently, these arcing faults correctly converge to the required values.
It is necessary to point out that this section does not involve the classification of permanent or temporary fault type. This is because, the primary and secondary arcs, which have a length in the order of milliseconds, must be considered. Thus, large data windows must be used. However, to show the potential of the proposed methodology, the classification of these faults is presented in Section VI.B.

V. CASE STUDY 2: VBMC S APPLIED AS FILTERS TO HARMONIC SIGNALS
Based on section II, an X database can be transformed or altered through a W orthogonal matrix, which allows us to determine a new V matrix of similar size to the original matrix. This method has an advantage which is that the transformation can be reversed and the original matrix can be exactly restored. Once these new coefficients (VBMC) are determined, it is possible to use them to filter different frequency signal components. Therefore, this transformation provides a means of noise reduction, where the signal data contained in the p-q component is assumed to be mostly due to noise, and the signal data contained in the first q component is assumed to be mostly due to relevant information from the database. As a result, a perfect reconstruction of the original signal is possible as follows: Based on eqn. (9), depending of the number of coefficients chosen based on the variability, the signal can be reconstructed, eliminating or reducing the noise. In this section, the database corresponds to 6 harmonic signals with 64 samples per cycle (see Section III and Table 3) that form the X matrix of 6×64 size. The database is therefore 64-dimensional and a SVD [84] will lead to 64 eigenvalues and 64 coefficient vectors. It is necessary to note that SVD is another alternative for calculating PCs. According to the SVD, for an X matrix of 6 × 64 size, a U matrix of 6 × 6 size, a transposed matrix of the impulse response coefficients W matrix of 64 × 64 size, and a eigenvalues diagonal matrix of 6 × 64 size consequentially exist. Therefore, X can be equivalently represented in the SVD form as follows: Based on eqn. (10), it is clear that according to the number of row vectors of the W matrix, the signal can be reconstructed. In this context, the W matrix is formed by 64 row vectors which are analyzed in detail in this section to show the behavior of these coefficients for filtering the signal. Also, in order to show the behavior quantitatively, based on international standards such as IEC 61000-3-2 [85], the THD index is used. The THD index is analyzed in different ways. Firstly, the THD index is calculated for the original harmonic signals. Secondly, the THD index is calculated for the reconstructed signals using only one VBMC (i.e. the reconstructed signal using only the first VBMC, the reconstructed signal using only the second VBMC and so on). Thirdly, the THD index is calculated for the reconstructed signals using different VBMCs together (i.e. the reconstructed signal using the first+second VBMCs, the reconstructed signal using the first+second+third VBMCs, and so on. The process is presented as follow: • The frequency spectrum and the THD values of the original harmonic signals are set out in Fig. 10.a and Table 7. From the data in Fig. 10, those harmonic levels from the fundamental component to the 13 th harmonic component can be seen. Prior to filtering out the harmonic component, the variability of data is carried out. Thus, from eqn. (6), it can be determined that the first eigen-VOLUME 8, 2020 value gives 68.4558% of the total variability, the first two eigenvalues give 86.9197% of the total variability, and so on until the first five eigenvalues give 100% of the total variability. Therefore, using only q = 3 or 4, i.e. the first three or four row vectors of the W matrix, the database could be reconstructed acceptably. The first three VBMCs for the harmonic signals of this case study are shown in Fig. 11.
• Once VBMCs were extracted, they are used to reconstruct the signal. For example, the reconstruction of the signal corresponding to the signal whose harmonic level is 5 th , 7 th , 11 th and 13 th (sixth signal in Table 7), is shown in Fig. 10.b, which is reconstructed using the first VBMC vector, and the first four VBMC vectors. In Fig.10.b there is a clear reconstruction of the signal using the first VBMC, and the best reconstruction of the signal is done using the first four VBMCs.
• Once the signals reconstruction is done, frequency spectrum and the THD values of these signals are determined. For example, from Fig. 10.c it can be seen that by using the first four VBMCs, the frequency spectrum of the reconstructed signal corresponding to the signal whose harmonic level is 5 th , 7 th , 11 th and 13 th is very similar to the original spectrum. By using the first VBMC, 5 th , 7 th , and 11 th harmonics levels are decreased. Besides this, as it can be seen in Fig. 10.c, the frequency spectrum of the reconstructed signal using the 5 th to the 64 th VBMC (red in Fig. 10.c) represents the noise of the signal. The right half of Table 7 provides values of THD from the first to the fifth VBMC separately. As a result, if the signal is reconstructed by using only the first VBMC, a THD index of 23% is determined, where 30.341% of the original THD is reduced. On the other hand, if signals are reconstructed using the second, third, fourth and fifth VBMCs, the THD indices of the reconstructed signal increase considerably.
• Similar to that previous case, all signals that are contaminated with different harmonic levels (see Fig. 10.d) are also reconstructed and the new THD index is determined. Thus, signals reconstructed by using the first four coefficient vectors are shown in Fig. 10.e, where it is possible to see that those reconstructed signals are very similar to these original signals, which states that the first four eigenvalues give variability very close to 100%. Unlike the previously said, the surface represented from the 5 th to the 32 nd coefficients, which is very different from the original signal is shown in Fig. 10.f. Therefore, that spectrum can be considered as the noise of original signals. Finally, the surface of the database reconstructed using only the first VBMC is shown in Fig. 12 where the harmonic level tends to be eliminated or decreased while waveforms are filtered, achieving more sinusoidal behavior. In conclusion, these results show that the noise and most of peaks have been eliminated from the original signals. THD values of these reconstructed signals through different VBMCs are presented in Table 8 all together. From these results, it is possible to see that by using the first two VBMCs, the THD index decreases 0.55377% from the original THD index. Using the first three VBMCs, the THD index increases 4.2261% from the original THD index. Using the first four VBMCs, the THD index increases 2.06% from the original THD index and finally, using the first five VBMCs, the THD index increases 0.005% from the original THD index.

VI. CASE STUDY 3: VBMC S APPLIED AS STRUCTURING ELEMENTS IN MATHEMATIC MORPHOLOGY TO CLASSIFY FAULTS ON TLS
The basic concept of MM is to modify or transform a signal through its intersection with a SE, where the Multiscale MM can be calculated at different scales [86]. In this section, it is used to develop two classifiers of faults on TLs. The procedure used is made up of three steps. In the first place, the function below is defined as a multi-scale opening operation by g n (SE) at scale n, with n = 1, 2, 3, 4,. . . , and so on.
Based on eqn. (11), the process results can be significantly affected by the choice of the SE (g n ). Secondly, a discrete pattern spectrum [87] is calculated, which is based on the opening operation. The pattern spectrum is defined as follows: where α is the scale used in the analysis, A(f • αg) is the measure of the pattern content of f relative to pattern αg, and A(f ) is the area under f . It is necessary to indicate that a variation of α and the shape of the SE g give the complete shapesize pattern spectrum of f . In this context, the properties of the pattern spectrum depend critically on the selection of SE. Finally, after the pattern spectrum is calculated, and in order to classify different pattern spectra, the average spectrum correlation coefficients of p pattern spectrum are expressed as follows [84]: where P 1 and P 2 represent two different pattern spectra and ρ is their correlation coefficient, which in this study is used to measure the similarity of two signal types (lightning with and without fault in section VI.A and temporary and permanent fault in section VI.B).
Considering the aforementioned information, in this section two novel classifiers of faults are presented using the previous procedure. These classifiers use two SEs. The first corresponds to the semicircular SE widely used in protection relay algorithms, and the second corresponds to the VBMC extracted or discovered through the methodology proposed in section II. The general procedure used in this section is shown in Fig. 13.

A. CLASSIFICATION OF LSS THAT GENERATE OR NOT PERMANENT FAULT
The main objective of TL protection relays is to correctly classify LSs that either generate or do not generate permanent faults. Thus, when an LS hits the TL, the transmission tower or the shielding wire, and if the overvoltage on the insulator string is higher than the BIL, a fault is produced. As a result, the protection relay must send a trip order. On the other hand, if the BIL is not exceeded, a fault is not produced, and the protection relay must not send a trip order. A description of the importance of the classification of these phenomena is presented in [81]. In this section, this case study is carried out in three steps. First, a database composed of 1352 LSs that generate permanent faults and 3406 LSs that do not generate permanent faults is used (see Section III). Thus, the X matrix of a 4758 × 4000 size recorded by R 1 relay of Fig. 1.a with a data window of 4ms is used. Second, the first classifier is established through the traditional Multi-scale MM, i.e. the X matrix is transformed through coefficients of SE traditionally used. Then, by using the procedure presented in Fig. 13, the two types of LSs are classified. In this research, the semicircular SE is used, an SE which is widely used in protection algorithms. Finally, the second classifier is identified through the Multi-scale MM combined with the VBMC, i.e. the X matrix is processed through the methodology proposed in Section II to determine these coefficients optimally. Then, these coefficients are used as SE in the Multi-scale MM (green in Fig. 13). Using the procedure presented in Fig. 13, the two types of LSs are classified.
Using the proposed methodology, the first VBMC of the X matrix is shown in Fig. 14, which, based on eqn. (6), gives 96% of the variability of the database.
Differences between the semicircular SE and the first VBMC applied to Multi-scale MM analysis are highlighted in Fig. 15. From these data, we can see that the first VBMC resulted in the best classification performance, which demonstrates that using the proposed coefficients the multi-scale MM can classify these two types of LSs more clearly than the semicircular SE. Consequently, it is possible to determine a threshold value th= 0.79 which is used to classify these signals. Therefore, if the correlation coefficient is higher than th, an LS that generates permanent fault is determined. On the contrary, if the correlation coefficient is smaller than th, an LS that does not generate permanent fault is determined. The parameter setting of that proposed algorithm is shown at Point B in Fig. 1. Some cases for distinguishing between permanent and not permanent faults produced by LSs are presented in Table 9.

B. CLASSIFICATION OF TEMPORARY AND PERMANENT FAULTS (AUTO RECLOSING)
85% of faults in overhead TLs are single-phase and temporary. If a temporary fault on TLs occurs, which is usually accompanied by an electric arc, protection relays must isolate the faulted line temporarily until the fault arc path de-ionizes and re-energizes it, a process which is called reclosing. In many cases, the fault on TLs is not permanent and is caused by conditions such as flying birds, tree branches and others, where the electrical arc plays a major role. Therefore, an important issue in reclosing is identifying the fault type, i.e., permanent or temporary. A description of the importance of the classification of these phenomena is presented in [79]. Similar to the previous case, this one is completed in three steps. First, a database composed of 36 permanent faults and 36 temporary faults is used (see Section III). Thus, the X matrix of 72×4000 size recorded by R 1 relay of Fig. 1.b with a data window composed by 4000 samples is used. Second, the semicircular SE is used in the procedure shown in Fig. 13. Third, similar to the case presented in section VI.A, the X matrix is processed through the methodology proposed in Section II to determine the coefficients optimally. The VBMC used as SE is shown in Fig. 16.
The behavior of the ρ between permanent and temporary faults using the second VBMC as SE is shown in Figure 17. It is possible to see that the semicircular SE and the second VBMC give acceptable results. Thus, using the second VBMC, these temporary and permanent faults are more identifiable whereas with the semicircular SE the first six signals are not well distinguished. In order to show the complete analysis of the classifier performance, all classification percentages using both SEs are presented. Thus, threshold values based on the average value are used. For the classifier based on the second VBMC, a t h1 threshold value of 0.387081 is determined. Thus, from the data in Fig. 17.a, the temporary and permanent faults are 100% correctly classified. For the classifier based on the semicircular SE, a t h2 threshold value of 0.81205 is used. As shown in Fig. 17.b, the temporary faults are correctly classified. However, as opposed to the permanent faults, six faults are now well classified. As a result, 83.33% of these 36 permanent faults are correctly classified. The parameter setting of that proposed algorithm is shown at Point C in Fig. 1.  These findings have important implications for the robustness of the proposed methodology in optimally determining coefficients related to TL protection relays. In addition, it is necessary to indicate that a complete analysis of the case studies presented in section VI.A and VI.B will be presented in another paper, because after the process a wealth of outstanding information was obtained, such as multi-scale MM with different scales, classification percentages, several VBMCs, and others. Thus, a full discussion of these case studies lies beyond the scope of this research.

VII. DISCUSSION
In order to guarantee that the design of a new digital filter will be successful in real applications, it must be examined on different features such as filtering response, harmonic attenuation, transient behavior, and others. In [18], based on the cutting-edge technology developed by many SEL engineers, including technical papers and others, it is presented that currently, in modern protection solutions, automation, and monitoring, microprocessor-based relays must provide many advanced functions and abilities such as faster fault clearing without sacrifying security for high-speed DF and TL protection [19], [20]. In addition, in [18], it is indicated that protection scheme complexity is a measure of the logic required to implement the protection, where the design and building are imperative. In addition, [18]- [21] describes that modern protection algorithms provide harmonic deleted and restraint. In this context, a detailed analysis of results of this research as regards these features is presented in this section.

A. DIMENSIONAL REDUCTION
In the first place, by using optimal coefficients, it is possible to reduce the size of data corresponding to disturbances and faults. As a result, from case study 1, these phenomena can be distinguished using only the first two VBMCs of these 50 possible VBMCs. Thus, a dimensional reduction of 96% is achieved. Another important finding was that from case study 2, using the first five VBMCs of these 64 possible VBMCs, these harmonic signals can be reconstructed. In addition, using the proposed methodology, different harmonic levels can be eliminated or decreased according to those coefficients used. An important feature of that harmonic filter is that it is possible to develop p levels of filtering, i.e. in Section V, an X matrix of size 6 × 64 is used, where 64 coefficient vectors are determined related to their eigenvalue and variance. The most interesting finding was that in all of the case studied, the first VBMCs are necessary for analyzing these database, and produce a very good dimensional reduction.

B. DATA WINDOWS
Regarding the response time, there are several factors in a digital relay that can affect its operation time. One of these is VOLUME 8, 2020 associated with dimensional reduction (see Section VII.A). The second factor is the time delay associated with the nature of the sampled data. Unlike the traditional relay operation time (one cycle approximately), the current study found that specific coefficients can be determined for short data windows with different sampling rates.
Surprisingly, another important finding was that data with different window sizes can be analyzed using a common VBMC. For example, the arcing fault data formed by 50 samples recorded every 2µs over 100µs total time is analyzed through the first VBMC of 1 × 50 size recorded every 1µs, corresponding to sinusoidal operation.
These results further support the idea of how this methodology can be applied to different protection principles. Thus, the first case study confirms that short data windows (50µs) can be used similarly to the data windows used in TW protection. Results of case study 2 allow the data windows of one cycle to be analyzed through the proposed methodology, similar to the traditional phasor protection. Finally, according to results of the first section of case study 3, it is possible to analyze data windows of 4ms size, similar to TBP. It can therefore be assumed that the proposed methodology can be applied to different protection principles.
The third factor that affects the operating time of protection algorithms is the time delay related to digital filtering. In this case, based on eqn. (7), vectors of different events are processed through coefficients using a simple and easy process, similar to the Fourier-based FIR filter. In addition, the fastest times for a given FIR filter are directly related to the highest sampling rate. Thus, based on previous results, these events can be analyzed with a very high sampling rate. In fact, in this paper, a high sample frequency is considered, where it is possible to optimally determine coefficients.

C. TRANSIENT AND HARMONIC BEHAVIOR
Regarding the transient and harmonic analysis in this research, events that produce high and ultra-high transient, corresponding to arcing faults and LSs and harmonic signals were tested. Results of the third case study confirm that those transient signals can be used in conjunction with the proposed methodology in order to develop protection algorithms. Thus, it is possible to see that using short data windows (4ms) for LSs and long data windows (48 cycles) for arcing faults, different coefficients can be optimally extracted. As regards harmonic behavior, it is possible to see that signals are clearly reconstructed using only the first VBMC. A possible explanation for this might be that these first VBMCs (first, second, third, fourth and fifth) give the total variability of data. The most surprising aspect of these data is the similar THD values for all harmonic signals when the VBMCs are used separately. However, when these coefficients are considered together, the THD values are different. It is necessary to note that, using the proposed methodology, in table 7 and 8 is presented that the first VBMC gives the smallest value of THD. Therefore, only that VBMC can be used in order to analyze the THD. In addition, as regards the THD different values in table 7 and table 8, different filters can be achieved according to the VBMC choice. The aim of this section is to show that the first VBMC should be used in the THD analysis, but other VBMC can also perform different filtering of signals. Thus, different filters can be developed applied to voltage signals. However, that analysis is out of the scope of this research.
In addition, it is necessary to note that different protection functions and events were also analyzed for the proposed methodology in other studies [81]- [83] and that other results corresponding to the identification of internal and external faults on DFs with DG integration have also been achieved, which will be presented later on.

D. DESIGN, BUILD AND MANUFACTURE
As regards the design, building and manufacture of digital filters, all proposed algorithms must be easy and simple, two essential requirements in the protection relay environment. In this research, the determination of features for the database analyzed is achieved only through different coefficients in the time domain. It is independent of phasor calculations. Besides this, the signal analysis is developed over a very short data window, where the time delay is shorter than the traditional filter. Therefore, these algorithms can have a higher-speed performance. In addition, the traditional phasorial estimation method has a drawback regarding operation time (it is usually done in one cycle), and in case that operation occurs before one cycle, there is greater error of phasor estimation and consequently, incorrect operation of the protection relay may happen. However, its calculation is simple and easy, i.e. the phasor estimation is calculated using a convolution process between the analyzed signal and two mother functions called sine and cosine. In this paper, the process of the proposed methodology has similar behavior to the phasor estimation method, i.e. the transformed matrix is calculated using a similar convolution process between the analyzed signal and optimal coefficients (VBMC).

VIII. CONCLUSIONS, CONTRIBUTIONS AND FUTURE WORKS
No previous study has investigated or proposed a methodology to optimally calculate the IRCs applied to TL protection algorithms. This study presents a novel methodology for optimally determining coefficients that depend on the structure of the waveform being analyzed. This paper has argued that variance offers an effective way of choosing the best coefficients.
Returning to those features of MA mentioned at the beginning of this study, it is now possible to state that several protection functions, such as identification of normal and fault conditions, harmonic analysis and classification of a fault on TLs, considering different data windows, sampling rates, EPS topology, and others, can be analyzed through the proposed methodology.
Unlike traditional digital filters where coefficients are predefined and assumed to be independent, this study has identified that those coefficients proposed are found in the data by looking for a set of independent and orthogonal axes.
The first case study was designed to identify these events, and using the first VBMC they can be clearly identifiable with an ultra-fast operation time.
The second case study set out to evaluate how the extracted coefficients can be used to reconstruct the waveform and reduce the THD value, where the first VBMC produces the highest reduction percentage.
The third case study confirmed that those coefficients discovered can be used in another filtering technique. Thus, these VBMCs are used as a SE in Multi-scale MM, where it is possible to see that the proposed coefficients have the best behavior related to traditional SE. It is necessary to note the ability of PCA in order to extract patterns in a 2D and 3D space and how these patterns can be useful in order to determine different criteria.
The validation of the behavior of proposed algorithms in real-time is the cornerstone in a medium time perspective. Specifically, due to the advances in SPT, sampling rate and other algorithms may be implemented into the next generation of protection relays. Besides that, based on the test cases, the proposed methodology can achieve the characteristics of the FIR filter required by manufacturers.
The key strengths of this study are its simple optimization process, its robustness for analyzing different phenomena with different variables and characteristics, its adaptability to other mathematical tools, and its simple final design applied to online conditions. Based on the above information and on results obtained in this work, this research could be considered as a reference for future research applied to protection relays. Therefore, the proposed work can be considered as novel research for researchers, manufacturers, and others related to protection relay.
It is recommended that further research be undertaken in the following areas: • Further research might explore whether the VBMC extracted with this proposed methodology could be used as MW in MRA and in other SPT.
• A future study investigating TL protection relays considering the penetration of renewable energy sources would be very interesting.
• In future, it would be important to explore the potential use of the methodology applied to other EPS elements such as transformers, buses, and others.
• It also would be important to explore the potential use of this methodology for developing a real-time evaluation of the COMTRADE files that IEDs provide.
• A number of possible future studies using the same experimental set up are expected.