Harmonic Current Estimation of Unmonitored Harmonic Sources with a Novel Oversampling Technique for Limited Datasets

In modern power systems, harmonics are among the significant issues attributed to renewable energy sources and nonlinear loads. Direct harmonic monitoring of the entire power system may be too costly or impractical, and measured data could be limited. In this paper, a new methodology is proposed to estimate harmonic current rms values of unmonitored harmonic sources, based only on harmonic voltage rms magnitudes measured at a limited number of monitored buses. A new technique of output curve-normalization is employed in pre-processing. Subsequently, a method is proposed to refine the architecture of the Artificial Neural Networks (ANNs), after which ANN-based harmonic current estimators are developed for each harmonic order and each harmonic source. Furthermore, a novel Neural Oversampling Consensus Algorithm for Regression (NOCAR) is proposed to improve estimation accuracy. K-Nearest Neighbor (KNN) and ANN are combined in developing NOCAR. A comparison is made with state-of-the-art techniques by using synthetic data, which demonstrates both the proposed method's robustness and its capability to perform when minimal information is available. The implementation for real data demonstrates the efficiency of the ANN-based harmonic current estimators with oversampling. The influence of the number of harmonic meters is investigated, revealing the ability of this data-driven technique to reduce the number of harmonic meters, and hence monitoring costs. Moreover, the correlation between different harmonic orders is studied, with results suggesting that, contrary to the widely accepted notion, this correlation should not be ignored in harmonic analysis. This study highlights the advantages of integrating intelligent techniques into harmonic monitoring systems.

This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

I. INTRODUCTION
Harmonics and waveform distortion are of major concern for operators of modern power systems, due to the ever-increasing penetration of Renewable Energy Sources (RES) and nonlinear loads. RES, primarily wind and solar photovoltaics, generally connect to the grid via inverters, while nonlinear loads typically incorporate power electronics switching devices and other nonlinear elements. In addition, the emerging tendency towards use of DC microgrids, solid-state transformers and high-voltage DC transmission [1] also increases the degree of nonlinearity in power systems. A high degree of nonlinearity will increase the flow of harmonic currents, which in turn will result in harmonic voltages and currents throughout the network. Power quality (PQ) standards, such as IEEE Std 519-2014 [2] and IEC 61000-3 [3], impose limits on the levels of harmonic voltages and currents. Reliable and effective harmonic analysis and mitigation techniques are vital in order to ensure that these PQ standards are met [4].

A. Inspiration and Motivation
Harmonic analysis is generally quite complex, particularly given the typically high correlation between harmonic variables within a power network. The authors in [5], for example, reported a high correlation between harmonic orders 5 and 7 in low-voltage networks, and also a high correlation between the 5th harmonic in the low-voltage and medium-voltage sections of a network. The intermittent and uncertain nature of wind and solar generation, which represent the main sources of harmonics on the generation side, further adds to the complexity. On the other hand, direct monitoring of the entire power system may be too costly or impractical, with deregulated power systems now comprising many prosumers and various ownership models [6]. Therefore, the concept of Harmonic State Estimation (HSE) has been addressed by researchers [7]-[10]. HSE involves solving the reverse of the harmonic power flow problem, to provide estimation of harmonic voltages and currents of unmonitored harmonic sources, based on measurements at other buses and lines. HSE is thus a problem which requires analysis in the frequency domain. While conventional signal processing techniques such as Fourier Transform (FT), S-Transform (ST) and Wavelet Transform (WT) can be employed to convert data from the time domain to the frequency domain, they cannot be applied to the HSE problem itself. HSE requires more sophisticated methods. This is illustrated in Fig. 1 [11]. In addition to monitoring occurring at a limited number of locations, the measured data itself may also be limited, with, for example, harmonic phase angles being unavailable. However, even a limited and incomplete dataset can still contain useful information and provide the network operator with immediate insight into harmonic levels in a power system. Hence, HSE methods which are able to address limited information could be valuable for network operators.
The main motivation of this paper is to develop a harmonic monitoring technique that works when the number of direct measurements in a network is limited and/or the measured data contains incomplete information. (See Fig. 1 at the end of the manuscript.)

B. Assessment of the State-of-the-Art
Estimation of harmonic voltages and currents of unmonitored harmonic sources has been studied both by using conventional analytical models and by using data-driven models. Least squares (LS) methods and their variants, such as weighted LS and Genetic Algorithm LS (GALS), have been used to address HSE [9], [12]-[14]. Despite its interesting technique for locating harmonic sources, the GALS method in [13] is only applicable when the number of simultaneously operating harmonic sources is lower than the number of measurements. In [12], weighted LS was used to solve the HSE problem, by considering uncertainty in network parameters. The number of measurements was around 1.35 times the number of state variables. Kalman filter (KF) methods have also been applied in HSE [7], [15] and were reported to be fast. An adaptive KF was applied in [7], which did not require explicit knowledge of the noise covariance matrix. Other numerical methods, such as sparsity maximization assuming sparsity in state variables (harmonic source injections) [8] and combined GA and sensitivity analysis [16], have also been applied to solve the HSE problem. In [8], [13] and [16], small numbers of harmonic sources were considered to operate simultaneously, compared to the number of measurements. However, the number of sources is likely to be much higher in power systems with high penetration of RES. When a limited number of harmonic meters are available, conventional HSE methods may exhibit errors due to the lack of harmonic observability [8]. Hence, nonconventional methods have been proposed to minimize the number of harmonic meters required, while preserving the accuracy of HSE and harmonic observability. For example, a method based on elimination of the row with minimum condition in the measurement matrix, and GA combined with grid topology analysis, were proposed for optimal harmonic meter placement in [17] and [18], respectively.
In spite of the merits of such techniques, they did not address the variation of network parameters. Researchers have moved toward Artificial Intelligence (AI) methods for HSE because of their learning capabilities. Traditional HSE methods require prior knowledge of power system parameters (impedance/admittance model) and quite a large number of harmonic monitors. These requirements are reduced by applying AI. AI techniques for HSE encompass Artificial Neural Networks (ANNs) and their variants, Bayesian Learning (BL), Fuzzy Inference System (FIS), and hybrid techniques. In the early works of [19] and [20], ANN was combined with traditional HSE and constrained estimation, respectively, to estimate the current injection of harmonic sources. The estimates by ANNs were treated as pseudo-measurements and modified in the complementary stages of either HSE or constrained estimation. In [21], ANN was employed to estimate the voltage and Total Harmonic Distortion (THD) of an unmonitored bus, based on the voltage/current measurements at Distributed Generation (DG) buses. The algorithm was tested on a small power system and under slightly varying load conditions. The authors in [22] proposed a method based on Nonlinear AutoRegressive models with eXogenous inputs (NARX), which estimated the voltage of an unmonitored sensitive bus as a function of the voltage and current at a monitored bus. However, no test set was used to assess the generality of the trained model. BL and Sparse BL were utilized to solve HSE in [23] and [24], respectively. The former employed regression analysis and recurrent NN for power flow calculation and demand prediction, respectively. In [25] and [26], FIS and ANN were combined to locate harmonic sources based on the signs of active and reactive harmonic powers, while their harmonic levels were estimated as qualitative variables.
Conventional techniques such as LS and statistical methods such as BL usually require an accurate network model and many harmonic meters, to ensure accurate estimation. Their accuracy is generally low for underdetermined systems. In addition, techniques such as [8], [13] and [16] assume sparsity of simultaneous harmonic sources, which is likely not the case for systems with high penetration of RES. The iterative nature of the methods of [13] and [24] makes them inappropriate for real-time applications. Last but not least, ANN techniques for HSE generally require large datasets.
To the best of the authors' knowledge, application of HSE techniques in systems where harmonic phase angles and network models are both not available and where the number of datapoints is limited, has not been addressed in the literature.

C. Contribution and Paper Organization
In this paper, harmonic current estimation of unmonitored harmonic sources based on harmonic voltage measurements at monitored buses of a power system has been addressed. ANNs have been used to develop the harmonic monitoring system, with input features extracted using FT. After pre-processing, the feature vectors are fed to the ANNs, to map harmonic voltage rms values at monitored buses to harmonic current rms values of the harmonic sources. The methodology is demonstrated first on synthetic data using the IEEE 14-bus network, and then on real data measured in the Tasmanian transmission network, which features wind farms as the major harmonic sources. It is notable that the test sets, which have not been used in any stage of training the harmonic estimators, are employed to evaluate the performance and to compare the proposed technique with others reported in the literature, including GALS [13], Singular Value Decomposition (SVD) [17] and BL [24]. The contributions of the paper are as follows.

… algorithm and ANN, can be useful in any regression problem for any application.

4) For real data measured in the Tasmanian transmission network, cases of different numbers of harmonic voltage monitors have been studied. In addition, harmonic orders in the input feature vectors have been considered separately and simultaneously. In the former, harmonic voltages of each order are only considered in estimating the harmonic currents of the same order, while in the latter, harmonic voltages of all orders under study are considered in estimating the harmonic currents of each order. The analysis of the real data suggests that, contrary to the widely accepted convention in the literature, the correlation between different harmonic orders is important for harmonic analysis.

The rest of the paper is organized as follows. Harmonic current estimation is presented in Section II. Section III presents the proposed methods and the mathematical model for developing the AI system for HSE. Comparison of the proposed technique with the state-of-the-art techniques is made in Section IV. Implementation for a real transmission network and the discussion of the results are presented in Section V. Finally, the main findings of the paper and suggestions for further research are provided in Section VI.

II. HARMONIC CURRENT ESTIMATION
The basic formulation of conventional HSE is a mapping of monitored harmonic voltage and current phasors to unmonitored harmonic voltage and current phasors as (1) [11]. For a limited dataset, where only harmonic rms values (or magnitudes) are available, but not phase angles, the linear matrix equation of (1) does not hold and traditional HSE methods are not applicable. However, a mapping of monitored harmonic voltage/current rms values to unmonitored harmonic voltages/currents can be defined as (2), where F and G are nonlinear functions. While it may be analytically very hard or even impossible to derive these nonlinear functions, ANNs can efficiently address such problems by learning the nonlinear mapping from data of input/output pairs.
The relationship between the measurements and the currents injected by harmonic sources can be defined as (3) and (4). Two approaches can then be considered: 1) unknown harmonic currents of each harmonic order are obtained only as a function of measurements of the same harmonic order, the HSE problem for each harmonic order being independent of other harmonic orders, as in (3); and 2) all relevant harmonic orders are considered simultaneously in estimating the currents of each harmonic order, as in (4). The independent harmonic approach is commonly employed for conventional harmonic analysis [27]. In (3) and (4), I^h_SH is the vector of h-order harmonic current rms values of the harmonic sources, and the inputs are the harmonic voltage rms values measured at the metering locations. ANNs will be employed to estimate either the function G or the functions Gh for each harmonic order h.
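The two approaches can be written compactly as sketched below. This is only a plausible form of the two mappings, not a reproduction of the published equations: the measurement symbol V_M^h (harmonic voltage rms values of order h at the metering locations) and the order set {h_1, …, h_K} are assumed notation, and the assignment of G_h to the independent approach and G to the simultaneous approach is inferred from the surrounding text.

```latex
% Approach 1: each harmonic order treated independently, cf. (3)
I_{SH}^{h} = G_h\!\left(V_{M}^{h}\right)

% Approach 2: all orders under study considered simultaneously, cf. (4)
I_{SH}^{h} = G\!\left(V_{M}^{h_1}, V_{M}^{h_2}, \dots, V_{M}^{h_K}\right)
```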

III. THE PROPOSED AI SYSTEM FOR HSE
To develop the AI system for harmonic current estimation, three stages are employed: data refinement and feature extraction (pre-processing stage), ANN estimation, and post-processing. FT, conventional scaling normalization of the ANN inputs and a proposed normalization technique for the ANN outputs, called curve-normalization, are employed in the pre-processing stage. In developing the ANNs, the training input/output pairs are first used to refine the ANN structure by solving an optimization problem. ANNs with this refined structure are then trained using the training sets. In the final stage, denormalization of the outputs is conducted. The test sets are only used to evaluate the performance of the AI system and are not used in the development stages.

A. PRE-PROCESSING STAGE: FOURIER TRANSFORM AND DATA NORMALIZATION
Frequency domain analysis is a common practice in HSE and harmonic estimation. FT and its variants such as Fast FT (FFT) and Short-Time FT (STFT) can be used to transform from the time domain to the frequency domain [28]. The intelligent harmonic estimators in this paper are also developed in the frequency domain. Data normalization is used to improve the generality of AI systems. One of the prevalent data normalization techniques, known as feature scaling, is based on the minimum and the maximum of each variable in the dataset. This normalization is usually applied to the input features of the AI system so that each feature is transformed to a range between 0 and 1. With Zi denoting the ith feature vector in the dataset, feature scaling results in (5) [29].
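Min-max feature scaling as described above can be sketched in a few lines. The array values and the function name `min_max_scale` are illustrative assumptions, not taken from the paper; the guard for constant columns is an added safety measure.

```python
import numpy as np

def min_max_scale(Z):
    """Scale each feature (column) of Z to [0, 1] using its own min and max."""
    Z = np.asarray(Z, dtype=float)
    z_min = Z.min(axis=0)
    z_max = Z.max(axis=0)
    # Guard against division by zero for constant features (assumption)
    span = np.where(z_max > z_min, z_max - z_min, 1.0)
    return (Z - z_min) / span

# Hypothetical harmonic voltage rms values: 4 samples x 2 monitored buses
V = np.array([[0.010, 0.20],
              [0.014, 0.25],
              [0.012, 0.30],
              [0.018, 0.22]])
V_scaled = min_max_scale(V)
```

Each column of `V_scaled` then spans exactly the interval from 0 to 1.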
Sometimes, different output variables of an AI system may also have high variance and wide ranges. The same normalization approach can be followed to normalize the training set outputs. However, this approach cannot be applied to the test set, since the maximum and minimum parameters of the test sets are assumed to be unknown in the training stage. Accordingly, an output normalization method is proposed which operates based only on the minimum and the maximum values of the target outputs in the training sets. This approach is also applicable to other normalization methods, such as standardization based on mean and standard deviation parameters. Consider the output vector X, which is divided into two portions, X_train and X_test. The normalized output vectors will be

\[ X_{train}^{norm} = \frac{X_{train} - \min(X_{train})}{\max(X_{train}) - \min(X_{train})}, \qquad X_{test}^{norm} = \frac{X_{test} - \min(X_{train})}{\max(X_{train}) - \min(X_{train})} \qquad (6) \]

As can be seen, the original value of any arbitrary data point in the dataset can be recovered.
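A minimal sketch of this output normalization follows, assuming scalar outputs and using hypothetical helper names (`fit_output_scaler`, `normalize`, `denormalize`) and made-up values. The key point is that the test outputs are scaled with the training-set minimum and maximum, so they may fall slightly outside [0, 1] yet remain exactly recoverable.

```python
import numpy as np

def fit_output_scaler(x_train):
    """Return the min and max of the training targets only."""
    x_train = np.asarray(x_train, dtype=float)
    return x_train.min(), x_train.max()

def normalize(x, x_min, x_max):
    """Normalize any targets with the training-set parameters."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def denormalize(x_norm, x_min, x_max):
    """Invert normalize() to recover values in the original units."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

# Hypothetical harmonic current rms targets (per unit)
x_train = np.array([0.02, 0.10, 0.05, 0.40])
x_test = np.array([0.03, 0.45])

x_min, x_max = fit_output_scaler(x_train)
x_train_n = normalize(x_train, x_min, x_max)
x_test_n = normalize(x_test, x_min, x_max)  # second value exceeds 1
```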
In spite of the benefit of the output normalization, there still may be deficiencies in the process of learning for datasets containing outliers. This is the case for our study which uses data obtained from a real power system. There are spikes in the injected harmonic currents of the windfarms due to wind speed change, particularly when an interval of no wind or low wind is followed by an interval of high wind, or vice versa. The data outliers and significant variations within the data can be problematic when using normalization, since the vast majority of normalized data points become crowded near to zero. Hence it can be advantageous to further pre-process the data in such a way as to stretch out data points that are close to 0 and compress data points close to 1. In the domain of [0,1] and for datapoints which are crowded near to zero with outliers close to one, nth root functions operate to reduce the variance of the dataset and the effect of spikes. For example, cube roots of 0.1 and 0.9 are 0.464 and 0.965, respectively. The cube root function is a continuous and invertible function on R, and hence the original data can be easily recovered. The process of curve-normalization (C-Norm), via application of the cube root function, is depicted in (7). As a result, the ANNs will subsequently be trained to estimate the curve-normalized value of the outputs, from which the original outputs can be obtained, accordingly.
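The effect of the cube root on normalized data can be illustrated with a short sketch. The function names and sample values are assumptions for illustration; `np.cbrt` is used because it also handles any slightly negative normalized test values.

```python
import numpy as np

def curve_normalize(x_norm):
    """Apply the cube root to min-max normalized outputs (C-Norm)."""
    return np.cbrt(np.asarray(x_norm, dtype=float))

def curve_denormalize(y):
    """Invert C-Norm by cubing."""
    return np.asarray(y, dtype=float) ** 3

# Normalized outputs crowded near zero, with outliers close to one
x_norm = np.array([0.0, 0.001, 0.01, 0.1, 0.9, 1.0])
y = curve_normalize(x_norm)
x_back = curve_denormalize(y)
```

Small values are stretched away from zero (0.001 maps to 0.1) while values near one barely move, which is the behaviour the text relies on.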

B. PROCESSING STAGE: ANN HARMONIC CURRENT ESTIMATOR
A Feed Forward Neural Network (FFNN) is used to estimate the function G and hence the harmonic current rms values. An FFNN comprises an input layer, several hidden layers, and an output layer. Each layer consists of a number of neurons (nodes) with activation functions such as sigmoid and hyperbolic tangent. The neurons in Layer k are connected to the neurons in Layer k+1 through adjustable synaptic weights [30]. In the feed forward calculation, the output of node j in Layer k+1 is derived from the nodes in Layer k as follows.
where f is an activation function such as sigmoid, tanh or a linear function.
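The feed forward calculation can be sketched as a weighted sum followed by the activation function. This is a generic FFNN forward pass, not the paper's exact formulation: the bias term, the layer sizes and the random weights are assumptions made for illustration.

```python
import numpy as np

def layer_forward(y_prev, W, b, f=np.tanh):
    """Outputs of Layer k+1 from outputs of Layer k.

    y_prev: (n_k,), W: (n_k1, n_k), b: (n_k1,); f is the activation.
    """
    return f(W @ y_prev + b)

def ffnn_forward(x, weights, biases):
    """tanh hidden layers followed by a linear output layer."""
    y = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        y = layer_forward(y, W, b, np.tanh)
    return layer_forward(y, weights[-1], biases[-1], f=lambda z: z)

# Hypothetical network: 3 inputs, two hidden layers, 1 output
rng = np.random.default_rng(0)
sizes = [3, 8, 4, 1]
weights = [rng.normal(scale=0.5, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

i_hat = ffnn_forward([0.2, 0.5, 0.7], weights, biases)
```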
In supervised learning, ANN training is an iterative process in which the ANN outputs are calculated in the feed forward path. The ANN estimated outputs are compared with the target outputs to form a loss function. The gradients of the loss function with respect to the weights of each layer are then back-propagated and the weights are updated. This process continues until the loss function is reduced to a desirable threshold or the maximum number of epochs is reached. Other stopping criteria such as validation failure can also be used. Different training algorithms such as back-propagation gradient descent, stochastic gradient descent and Levenberg-Marquardt (LM) have been applied in the state-of-the-art literature [31]. Fig. 3 shows an eight-layer FFNN initially used in this paper, which will later be replaced by a more appropriate architecture obtained from the optimization. The inputs to this ANN are the normalized harmonic voltage rms values of the monitored buses, and its output is the curve-normalized h-order rms harmonic current of the nth harmonic source. Hence, an ANN-based harmonic current estimator will be trained and developed for each harmonic order injected by each harmonic source. This procedure outperforms the approach where only one ANN is considered to estimate all variables, mainly because, with a separate ANN trained for each harmonic, the task of each ANN is more specific. This approach has also been recommended in [19].

1) REFINEMENT OF THE ANN ARCHITECTURE
When an AI system overfits a dataset, it performs almost perfectly on the training set while the same performance is not achieved on the test set. The gap between the training error and test error is generally observed to be too large [32]. In other words, the generality of the model is low in the case of overfitting. A validation failure check is a process which checks the performance on a validation set as part of the training process. The learning process will be stopped if the performance on the validation set does not improve for several consecutive iterations [32]. It is worth noting that validation is part of the training itself and thus contributes to the learning process, whereas testing is a process conducted after training and is used to evaluate the generality of the model. Without any validation stopping criterion, if the initial architecture of the ANN leads to overfitting, there is no need to make the ANN more complex. Instead, lighter structures may lead to better generality. Consider a neural network with P layers, indexed by l = 1, 2, …, P. Starting from the overfitted network, a discrete search space is defined for the size of each layer l = 1, 2, …, P, where square brackets, [ ], denote the floor function and Q is the number of discrete points in the search space of each layer. Each member of the search space corresponds to the size of a layer, and zero corresponds to removing the layer. To illustrate this with an example, consider the ANN of Fig. 3, for which each combination of layer sizes can be modelled as a ternary number with six digits. For example, 202112 corresponds to the ANN architecture of {32, 10, 16, 16, 16}, where layer B is removed, and so forth. This way of defining the search space can be suitable for evolutionary optimization algorithms like GA. In this paper, the regularized Mean Squared Error (Reg_MSE) and conventional MSE have been used for training and testing as depicted in (9a) and (9b), respectively.
Reg_MSE can be considered as the objective function used to refine the ANN structure, and the sizes of layers (number of neurons in each layer) are the control variables. In Section V, a minimization problem has been solved to obtain the refined ANN architecture by using a brute force algorithm [33] and by considering the training sets with validation check.
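The encoding of candidate architectures as base-Q numbers can be sketched as follows. Several details here are assumptions rather than the paper's definition: the candidate sizes of a layer with full size N are taken as floor(q·N/(Q-1)) for q = 0, …, Q-1, and the full layer sizes in `full_sizes` (in particular the hypothetical value 20 for layer B) are chosen only so that the decoding reproduces the 202112 example from the text.

```python
from itertools import product

def candidate_sizes(full_size, q_points=3):
    """Discrete size options for one layer; 0 means the layer is removed."""
    return [(q * full_size) // (q_points - 1) for q in range(q_points)]

def decode(code, full_sizes, q_points=3):
    """Map a base-Q string (one digit per layer) to a list of layer sizes."""
    layers = []
    for digit, n in zip(code, full_sizes):
        size = candidate_sizes(n, q_points)[int(digit)]
        if size > 0:  # digit 0 removes the layer
            layers.append(size)
    return layers

# Assumed full sizes of the six hidden layers A..F (layer B is hypothetical)
full_sizes = [32, 20, 10, 32, 32, 16]
arch = decode("202112", full_sizes)

# Exhaustive (brute force) enumeration of all ternary codes
all_codes = ["".join(map(str, c))
             for c in product(range(3), repeat=len(full_sizes))]
```

A brute force search would train one network per code in `all_codes` and keep the architecture with the lowest objective value.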

2) OVERSAMPLING FOR OVERFITTED ANNS
An oversampling technique which is applicable to regression problems could be useful in dealing with limited and skewed datasets. However, conventional oversampling techniques are not applicable to regression problems, although they are efficient tools for dealing with unbalanced datasets and minor classes in classification tasks by AI. In this subsection, the new Neural Oversampling technique, NOCAR, is proposed, which can efficiently address the task of oversampling for any regression problem. As the datasets used in this paper are limited, NOCAR can lead to improvements in the accuracy of the harmonic current estimators. Traditional AI classifiers are vulnerable in learning highly skewed data, as they are designed to expect different classes to contribute equally to the minimization of the classifiers' loss functions [34]. One of the well-known oversampling techniques, called Synthetic Minority Over-sampling Technique (SMOTE), was proposed in [35] and formed the basis of many other oversampling techniques for classification, such as hybrid K-means SMOTE [36] and SMOTE combined with self-organizing maps [34]. The main reason which hinders the application of oversampling to regression problems is the determination of the target output of the oversampled datapoints. This issue does not arise in classification problems, as the generated datapoint inherits its output class from the original datapoint. A few methods have been proposed for oversampling in regression. In [37], a SMOTE for Regression (SMOTER) was proposed, which oversampled the ANN input data based on the K-nearest-neighbor method of the original SMOTE algorithm [35] and estimated the target output of the oversampled data by interpolation between the two datapoints contributing to the generation of the new point.
SMOTER may be insufficient, as the interpolation between two original datapoints does not necessarily follow the innate pattern in the dataset which the ANNs aim to find. It can lead to distortion of the dataset from the original pattern in dimensionally large problems. In NOCAR, the approach for generating ANN input datapoints is similar to the traditional SMOTE algorithm. However, the process of generating the final datasets (combination of newly generated datasets and original datasets) and the process of calculating the target outputs of the generated datapoints are completely novel. In addition, a consensus stage is also proposed for any regression task performed by the AI system. Two oversampling procedures can be conducted: 1) oversampling on the upper portion of the output data, and 2) oversampling on the lower portion of the output data. The target outputs of the newly generated datapoints are obtained by training additional ANNs for the initial upper and lower portions of the dataset. Then, a consensus between the upper-oversampling, the lower-oversampling and no-oversampling will be formed. Let us consider a dataset D: (VD, ID) which is comprised of Ns datapoints (input-output pairs). First, ID should be sorted in descending order, and two data percentiles pu% and pl% should be selected for upper-oversampling and lower-oversampling, respectively. The inputs corresponding to the percentiles are VD(pu%) and VD(pl%). Based on the definition of percentile, (100-pu)% of the data have output values greater than ID(pu%) and pl% of the data have output values lower than ID(pl%). Hence, corresponding to pu% and pl%, two subsets for upper-oversampling (VD,u, ID,u) and lower-oversampling (VD,l, ID,l) are extracted. Two oversampling numbers, Nu and Nl, are chosen, which are the numbers of generated datapoints per original datapoint.
Initially, for each oversampling subset, an ANN with a structure similar to the main ANN will be trained to map the input-output pairs, e.g. (VD,u, ID,u), which is denoted by ANNu (or ANNl). Afterward, for all candidate samples in the oversampling subsets (VD,u, ID,u) and (VD,l, ID,l), the Nu and Nl nearest neighbors, respectively, will be obtained by the KNN algorithm. The difference between the input features of each obtained neighbor and the initial sample is calculated. A uniformly random array (Rand) with the same size as the input feature vector is also generated. The generated sample's input feature vector, OVu, is depicted in (10) (a similar equation can be used to calculate OVl). Then, the target output vectors corresponding to OVu and OVl will be obtained using the trained ANNs as OIu and OIl. By using ANNs (ANNu or ANNl) to calculate the target output of the generated datapoints, the features of the oversampled subsets will be better projected in the generated data. However, the new datapoints are still randomly generated and they should not dominate the datasets. Hence, for each generated sample, the original sample will be replicated in the dataset. For example, if two new datapoints are generated for each candidate in VD,u (Nu=2), the final dataset will be VD,u_new ={VD, OVu,1, VD,u, OVu,2, VD,u} and ID,u_new ={ID, OIu,1, ID,u, OIu,2, ID,u}. By replicating the datapoints in the upper/lower subsets in addition to oversampling, the contribution of the upper/lower portions of the datasets to the minimization of the loss function will increase, which could be a good practice in dealing with outliers. The choice of upper-oversampling, lower-oversampling or both depends on the application and on the problem to be solved. Algorithm 1 depicts the step-by-step implementation of the oversampling part of NOCAR.
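The oversampling stage described above can be sketched as follows. This is an illustrative approximation rather than the authors' implementation: the interpolation rule (sample plus a uniform random array times the neighbor difference) follows the usual SMOTE form, the nearest-neighbor search is brute force, `predict_fn` is a hypothetical stand-in for the trained ANNu, and the toy data and subset selection are made up.

```python
import numpy as np

def nocar_oversample(V_sub, n_new, predict_fn, rng):
    """Generate n_new synthetic inputs per sample of an oversampling subset.

    V_sub      : (n_samples, n_features) inputs of the upper or lower subset
    n_new      : Nu or Nl, new datapoints per original datapoint (< n_samples)
    predict_fn : stand-in for the trained ANNu / ANNl (hypothetical callable)
    """
    V_sub = np.asarray(V_sub, dtype=float)
    generated = []
    for i, v_sample in enumerate(V_sub):
        # Brute-force nearest neighbours of the sample within the subset
        dist = np.linalg.norm(V_sub - v_sample, axis=1)
        dist[i] = np.inf  # a sample is not its own neighbour
        for j in np.argsort(dist)[:n_new]:
            rand = rng.uniform(0.0, 1.0, size=v_sample.shape)
            generated.append(v_sample + rand * (V_sub[j] - v_sample))
    OV = np.array(generated)
    OI = predict_fn(OV)  # target outputs come from the subset model
    return OV, OI

rng = np.random.default_rng(0)
V_D = rng.uniform(0.0, 1.0, size=(20, 3))  # hypothetical normalized inputs
I_D = V_D.sum(axis=1)                      # hypothetical outputs
upper = np.argsort(I_D)[-5:]               # toy upper subset
V_u, I_u = V_D[upper], I_D[upper]
N_u = 2
OV_u, OI_u = nocar_oversample(V_u, N_u, lambda X: X.sum(axis=1), rng)

# Final dataset: originals, generated points, and one replica of the
# subset for each generated point
V_new = np.vstack([V_D, OV_u] + [V_u] * N_u)
I_new = np.concatenate([I_D, OI_u] + [I_u] * N_u)
```

In practice `predict_fn` would be the forward pass of the network trained on the subset, so that generated targets follow the pattern learned from that portion of the data instead of a linear interpolation.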
NOCAR also contains a consensus stage. After the oversampling stage, three categories of datasets are obtained: D: (VD, ID), Du_new: (VDu_new, IDu_new), Dl_new: (VDl_new, IDl_new). The number of generated datasets can be increased by using more oversampling runs. For each dataset, a neural network will be trained to estimate the outputs (ANN, ANNu_new, and ANNl_new). The consensus between the trained ANNs can be achieved by averaging as (11). For the case study of this paper, averaging consensus revealed accurate performance. Other statistical indices can also be employed based on the application.

Algorithm 1. Oversampling part of NOCAR
…
2. Vsample=VD,u(i) // Initial Sample
3. K=Nu // K is equal to either Nu or Nl
…
8. end for
9. end for

(See Fig. 4 at the end of the manuscript.)
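The averaging consensus can be sketched as a simple mean over the outputs of the individually trained estimators. The prediction values below are hypothetical, and the function name `consensus_average` is an assumption.

```python
import numpy as np

def consensus_average(predictions):
    """Average the estimates of several trained estimators, sample by sample."""
    return np.mean(np.vstack(predictions), axis=0)

# Hypothetical estimates for three test samples
pred_base = np.array([0.40, 0.55, 0.90])   # ANN trained on D
pred_upper = np.array([0.44, 0.53, 0.96])  # ANN trained on Du_new
pred_lower = np.array([0.36, 0.57, 0.84])  # ANN trained on Dl_new

i_consensus = consensus_average([pred_base, pred_upper, pred_lower])
```

Replacing `np.mean` with `np.median` would give one of the alternative statistical indices mentioned in the text.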

IV. COMPARISON WITH STATE-OF-THE-ART TECHNIQUES
To show the competence of the proposed technique, it has been compared with GALS [13], SVD [17] and BL [24]. The methods have been compared on a synthetic dataset generated on the IEEE 14-bus test system, as shown in Fig. 5. The network information can be found in [38]. The uncertainty in harmonic voltage measurements and network parameters has been modelled similarly to [24]. An underdetermined harmonic state estimation has been performed for four harmonic sources at Buses 5, 10, 12 and 13 and harmonic orders 3, 5, 7 and 11. Three harmonic voltage meters are assumed at Buses 2, 6 and 11, based on the optimal meter placement technique of [17]. As the locations of harmonic sources are assumed to be known, the harmonic location algorithm of [13] was skipped. However, the method of [13] requires information of at least one Zero Injection Bus (ZIB) for each harmonic order. Hence, an extra equation regarding zero harmonic current injection at Bus 3 is included to implement the method of [13]. NOCAR is applied with Nu=Nl=2 and a consensus of five members. 1000 datapoints were simulated and randomly divided into the training sets, validation sets and the test set with the ratio of 65%, 15% and 20%. The ANN structure of Fig. 3 is utilized in this section. Hyperbolic tangent and linear functions have been used as activation functions in the hidden layers and output layers, respectively, which revealed better performance than the sigmoid function for the case study. Reg_MSE is used for training and MSE for performance evaluation and testing. The error threshold is set to 0.0001; the regularization factor and learning rate are chosen as 0.05 and 0.1, respectively; the maximum number of epochs is 100. The proposed methodology was implemented in MATLAB 2020b.
The results are presented for the test sets, which were not used in the training stage. Fig. 6 shows the comparison of MSE on per unit harmonic currents for different HSE techniques. It can be seen that the proposed method performs as well as GALS and SVD, even though it does not employ harmonic phase angle measurements. The identical behavior of the GALS and SVD methods is because the two techniques are mathematically analogous when the locations of harmonic sources are known. For harmonic sources at Buses 6 and 10, the estimation by all four methods is accurate. BL revealed more error in some cases, owing to the underdetermined nature of the problem. Although the proposed technique may have a high computational burden during the training stage, it only requires one calculation of the forward path for each ANN at the estimation stage, in contrast to the iterative nature of BL [24] and GALS [13]. Accordingly, the simulations show that only SVD has a faster response than the proposed technique. Table I summarizes the measurements and information required for each technique, highlighting how the proposed technique has the minimum requirement. Although currently not used, harmonic phase angles could also be embedded into the proposed method, by using real and imaginary values of harmonic voltage measurements instead of rms values, potentially leading to further accuracy improvements.
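The random 65%/15%/20% division of the 1000 datapoints can be sketched as below. The study itself was implemented in MATLAB; this NumPy version, the function name `split_indices` and the fixed seed are assumptions for illustration only.

```python
import numpy as np

def split_indices(n_samples, ratios=(0.65, 0.15, 0.20), seed=0):
    """Randomly partition sample indices into train, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    # The test set takes whatever remains, so every index is used once
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
```

Keeping `test_idx` out of every development step (architecture refinement, oversampling and training) matches the evaluation protocol described in the text.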

V. IMPLEMENTATION FOR A REAL TRANSMISSION NETWORK
In this section, a real transmission network with high penetration of wind energy (four operating wind farms: wf1-wf4) has been studied. The simplified single line diagram of the Tasmanian transmission network is shown in Fig. 7. "See Fig. 7 at the end of the manuscript" The proposed harmonic estimation method uses harmonic voltage rms values of monitored buses to determine the harmonic current rms values injected by each of the wind farms. The problem is challenging because the ANNs' inputs (harmonic voltages) have low variance, while the target outputs (harmonic currents) have high variance. In addition, only 10-minute average harmonic data is available, so wind farm output, and thus injected harmonic currents, can change considerably between two adjacent samples in the dataset. In this case study, harmonic current estimation has been evaluated for harmonic orders h = 3, 5, 7 and 11. In (3) and (4), $I_{SH}^{h}$ should be replaced by $I_{wf}^{h}$, where $I_{wf}^{h}$ is the vector of h-order harmonic current rms values injected by the wind farms.
For each harmonic order and each wind farm, an intelligent harmonic monitor comprising several ANNs is trained. Hence, 16 intelligent monitors have been trained separately. The effect of using different numbers of harmonic voltage meters has been assessed in developing the intelligent harmonic monitors. In addition, simultaneous and separate consideration of harmonic orders in the input feature vectors has been compared. This evaluation addresses the question of whether the variables of one harmonic order can be analyzed and estimated independently of other harmonic orders.
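The difference between separate and simultaneous consideration of harmonic orders in the input feature vectors can be sketched as follows. The array layout `v[sample, meter, order]`, the toy sizes and the random data are assumptions made for illustration; they do not reproduce the measured dataset.

```python
import numpy as np

orders = [3, 5, 7, 11]
farms = ["wf1", "wf2", "wf3", "wf4"]

# One intelligent monitor per (harmonic order, wind farm) pair.
monitors = [(h, wf) for h in orders for wf in farms]
print(len(monitors))  # 16

# Toy data: v[s, m, k] is the rms voltage of order orders[k] at meter m, sample s.
n_samples, n_meters = 6, 4
rng = np.random.default_rng(1)
v = rng.random((n_samples, n_meters, len(orders)))

def separate_features(v, k):
    """Inputs restricted to the harmonic order being estimated."""
    return v[:, :, k]

def simultaneous_features(v):
    """Inputs containing every harmonic order at every meter."""
    return v.reshape(v.shape[0], -1)

x_sep = separate_features(v, orders.index(7))
x_sim = simultaneous_features(v)
print(x_sep.shape, x_sim.shape)  # (6, 4) (6, 16)
```

With four meters, the separate variant gives each estimator 4 inputs, whereas the simultaneous variant gives it 16, so any information shared between harmonic orders becomes available to the ANN.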
The ANNs have been trained with the same activation functions and training parameters mentioned in Section IV. The data is again randomly divided into 65% training, 15% validation and 20% test sets. All the methods and procedures elaborated in Section III use only the training sets, together with the LM training algorithm. It is worth noting that all error curves and error values presented in this section are calculated using curve-normalized outputs, while the graphs of estimated and real values are in electrical per unit. As in Section IV, MATLAB 2020b has been used to implement the AI harmonic current estimators.

A. OVERFITTING IN THE HARMONIC ESTIMATORS
Initially, the ANN structure depicted in Fig. 3 has been used to develop the harmonic current estimators of the wind farms, with monitors at M2, M3, M4 and M6. Validation and oversampling are not used in this instance, which reveals that the ANNs may overfit the data. For all ANNs, the estimation is almost exact on the training sets, whereas the performance on the test sets is poor. Fig. 8 shows the estimated and real values of the 3rd harmonic current of wf4 and the 7th harmonic current of wf2 for 100 datapoints in the training sets and in the test sets. As can be seen from the figure, the ANNs suffer from overfitting because of the limited datasets. Remedial actions are required to mitigate this problem.
"See Fig. 8 at the end of the manuscript"

B. REFINING THE ANN ARCHITECTURE
The initial ANN architecture leads to overfitting, but it can be simplified and lightened to improve accuracy. Although optimizing the ANN structure is not the focus of this paper, the process proposed and described earlier can be used to obtain a more suitable structure. First, ANNs with the initial architecture of Fig. 3 are trained using 15% of the training sets randomly selected for validation in each run. Fig. 9 shows the MSE values on the test sets for each of the 16 harmonic current estimators. Since the outputs of all estimators are curve-normalized, their MSEs are comparable with each other. As can be seen in Fig. 9, the error for the 7th harmonic of wf2 is significantly higher than the errors for the other estimated harmonics. Hence, this estimator (h=7, wf2) has been used to find a single refined structure for all ANNs. This methodology could be applied to each ANN separately, but this is not the focus of this paper.
"See Fig. 9 at the end of the manuscript"
The refined structure of the ANNs is obtained using the training dataset of the 7th harmonic of wf2. Based on the coding method introduced in Section III, the initial ANN structure is coded as $(22222)_3$. Every candidate ANN structure has a 6-digit ternary code. To illustrate the search space and the objective function in 3-D, the first three digits are mapped to the X-axis and the last three digits to the Y-axis. For example, $(212022)_3$ is divided into X = $(212)_3$ = 23 and Y = $(022)_3$ = 8. Accordingly, the discrete range of both the X-axis and the Y-axis is 0 to 26. The search space and the best solution are shown in Fig. 10.
The optimal solution is X = 26 = $(222)_3$ and Y = 16 = $(121)_3$, which implies ANN layer sizes of {32, 10, 10, 16, 32, 8}. This refined structure is used for all ANNs in the upcoming subsections of this study. Fig. 11a and Fig. 11b show the real and estimated values of the 3rd harmonic current of wf4 and of the 7th harmonic current of wf2 for the test sets, respectively, using the refined ANN structure. Despite the improvement, the accuracy is still not high at some points. In Subsection E, oversampling is employed to improve the performance further.
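The mapping between a 6-digit ternary structure code and its (X, Y) coordinates in the search space can be checked with a short Python sketch. The helper names `split_code` and `join_code` are hypothetical; the sketch covers only the coordinate mapping and does not reproduce the Section III rule that turns digits into layer sizes.

```python
def split_code(code):
    """Split a 6-digit ternary string into decimal (X, Y) coordinates."""
    if len(code) != 6 or any(c not in "012" for c in code):
        raise ValueError("expected six ternary digits")
    return int(code[:3], 3), int(code[3:], 3)

def join_code(x, y):
    """Inverse mapping: X and Y in 0..26 back to a 6-digit ternary string."""
    def to_ternary(n):
        digits = ""
        for _ in range(3):
            digits = str(n % 3) + digits
            n //= 3
        return digits
    if not (0 <= x <= 26 and 0 <= y <= 26):
        raise ValueError("coordinates must lie in 0..26")
    return to_ternary(x) + to_ternary(y)

print(split_code("212022"))  # (23, 8)
print(join_code(26, 16))     # 222121
```

Three ternary digits span 27 values, which is why each axis runs from 0 to 26 and the full search space contains 27 × 27 = 729 candidate structures.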
"See Fig. 10 at the end of the manuscript" "See Fig. 11 at the end of the manuscript"

C. IMPACT OF NUMBER OF HARMONIC VOLTAGE MONITORS ON HSE
Reducing the number of required harmonic monitors leads to a reduction in cost of a harmonic monitoring system [17,39].
There are eight harmonic voltage meters (M1-M8) in the transmission network of our case study. There is a trade-off between the cost of the harmonic monitoring system and harmonic observability. When the number of monitors falls below the number of unknown harmonic variables, the system of equations becomes mathematically underdetermined, which is more challenging to solve. To study the impact of the number of harmonic voltage meters on the estimation of the wind farms' harmonic currents, eight separate cases have been created, each using successively fewer meters to train and test the ANNs, as listed in Table II. Case 1 employs the maximum number of harmonic voltage monitors available; however, the wind farms still remain unmonitored. Case 5 is based on the four harmonic voltage meters that are further from the wind farms than the other harmonic meters. Hence, among the choices of four harmonic monitors out of eight, Case 5 is the most challenging. Fig. 12 shows the box plot of the MSE on the test sets for the 16 trained ANNs. The estimation errors increase as the number of monitors decreases, which is generally as expected. The red pluses (+), treated as outliers in the box plot, correspond to the estimation of the 7th harmonic current of wf2 in each case. Figs. 13a-13d depict the 3rd harmonic current of wf1, the 5th harmonic current of wf2, the 7th harmonic current of wf3 and the 11th harmonic current of wf4, respectively. The ANN estimations and real values are compared on 50 datapoints of the test sets for Cases 1, 5 and 7. It can be seen that Case 1 and Case 5, which use more monitors than Case 7, perform better than Case 7 for the majority of the samples. The accuracy in Case 1 is mostly better than the accuracy in Case 5.
The ANNs struggle to estimate the high harmonic values, as there are not enough such points in the datasets for the ANNs to learn their features. This is addressed by the proposed oversampling technique in Subsection E. "See Fig. 12 at the end of the manuscript" "See Fig. 13 at the end of the manuscript"
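Why fewer monitors than unknowns makes the problem underdetermined can be illustrated with a toy linear model. The sketch below assumes a real-valued relation v = Z i with a random 3 × 4 matrix Z; this is a simplification for illustration only (actual harmonic quantities are phasors), and it is not the ANN-based method of this paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n_meters, n_sources = 3, 4                    # fewer equations than unknowns
Z = rng.random((n_meters, n_sources)) + 0.1   # hypothetical transfer matrix
i_true = np.array([0.05, 0.02, 0.08, 0.01])   # illustrative source currents
v = Z @ i_true                                # "measured" voltages

# Minimum-norm solution obtained with the Moore-Penrose pseudoinverse.
i_min_norm = np.linalg.pinv(Z) @ v

# Any vector in the null space of Z can be added without changing v.
_, _, vt = np.linalg.svd(Z)
null_dir = vt[-1]
i_other = i_min_norm + 0.01 * null_dir

print(np.linalg.matrix_rank(Z))
print(np.allclose(Z @ i_min_norm, v), np.allclose(Z @ i_other, v))
print(np.allclose(i_min_norm, i_other))
```

Both candidate current vectors reproduce the same voltages, so the measurements alone cannot distinguish between them; additional information, such as patterns learned from historical data, is needed to select a plausible estimate.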

D. SIMULTANEOUS AND SEPARATE CONSIDERATION OF HARMONIC ORDERS
It is common practice to analyze each harmonic order independently of other harmonic orders [23], [40], [41]. However, the correlation between different harmonic orders may not be negligible, particularly in power systems with high penetration of RES. In the dataset of the transmission network, there is high correlation between different harmonic orders of the same wind farm, as they are generated by the same harmonic source. In this subsection, harmonic currents have been estimated by considering harmonic voltage orders both simultaneously and separately. Cases of eight harmonic monitors and four harmonic monitors (Case 1 and Case 5 in Table II

E. NOCAR TO IMPROVE HARMONIC CURRENT ESTIMATION
In this subsection, the dataset will be oversampled for cases of eight and four harmonic voltage monitors (Case 9 and Case 10 in Table III Table III. The oversampling has reduced the errors, and Ic31 and Ic32 lead to the best performance, which corresponds to a consensus with five participants. The accuracy of Ic31 with four harmonic voltage monitors is slightly better than that of the initial case (I12) with eight harmonic voltage monitors and no oversampling. This highlights the significance of suitable intelligent and soft computing techniques, which could result in performance identical to, or even better than, that of hardware-based solutions.
Moreover, hardware-based solutions such as installing more harmonic monitors can be much more costly than data analysis and intelligent techniques. "See Fig. 16 at the end of the manuscript" Harmonic current estimations by various ANNs are presented in Figs. 17a-17h to demonstrate the effectiveness of NOCAR along with the other techniques proposed in this paper. The harmonic current outputs of consensus Ic31 and Ic32 are compared with real values for 100 datapoints. Two points are noticeable in Fig. 17. First, the ANN-based harmonic current estimations are accurate for the majority of datapoints, and the performance has been improved by the integration of ANN and oversampling as NOCAR. Second, the network performance on the 7th harmonic is not as good as on the other harmonic orders, which can also be seen from the outliers of the box plots in Fig. 16. The Error Reduction Rate (ERR), a measure of the improvement introduced by NOCAR, is defined in (13). Fig. 18 shows the ERR of the 16 harmonic current estimators for consensus #3 in the cases of four and eight harmonic voltage monitors (Ic31 and Ic32). As revealed in the figure, NOCAR has led to error reduction and accuracy improvement in both cases. For some estimators, the improvement with eight monitors is smaller than with four monitors because the initial error with eight monitors was already lower. "See Fig. 19 at the end of the manuscript"

VI. CONCLUSION
In this paper, an intelligent monitoring system based on ANN was developed for harmonic current estimation of unmonitored harmonic sources, based on harmonic voltages at monitored buses. Synthetic and real datasets were used, both of which were limited in terms of the number of meters, the number of datapoints and the measured variables available.
The developed system maps rms harmonic voltages measured at some locations in a power system to rms harmonic currents injected at other locations. A new curve-normalization of the target outputs was introduced to reduce the skewness of the outputs, and a novel neural oversampling technique, NOCAR, was proposed to address overfitting. Estimation performance improved when these proposed methodologies were employed. Comparison with GALS, SVD and BL, state-of-the-art techniques in the literature, demonstrated the effectiveness of the proposed technique. Using data collected from a real power system, the performance of the harmonic estimators was studied with a limited number of harmonic voltage meters. Moreover, both simultaneous and separate consideration of harmonic voltage orders in training the ANN-based current estimators for each harmonic order were investigated and analyzed.

A. MAJOR FINDINGS AND FUTURE RESEARCH
The major findings of this paper are as follows.
1) Simultaneous analysis of different harmonic orders yields better harmonic estimation accuracy than estimating each harmonic order independently of other orders.
2) It was demonstrated that, by employing NOCAR and ANNs, the performance of harmonic current estimators with only four voltage monitors approached the performance achieved with eight harmonic monitors. This suggests that software-based data analysis techniques are worthy of investigation, as opposed to relying only on hardware-based remedies such as the installation of new monitors at harmonic source injection points.
3) NOCAR, as an oversampling technique for regression problems, introduced some improvement in the performance of the proposed harmonic monitoring system. In addition, it can be a beneficial tool in the development of AI systems for other applications with regression tasks.
4) Compared to other techniques, ANN-based harmonic estimators can provide accurate results in real time with minimum required information.
The following research directions could be pursued in future.
1) The advantage of curve-normalization could be revealed for estimation of all harmonic current sources using only one ANN.
2) Development of methods, combining data techniques and direct harmonic monitoring, for determining the minimum number of harmonic monitors and their optimal locations.
3) Integration of harmonic phase angles as input features of the ANNs for more accurate harmonic estimation.
4) Accurate assessment of harmonic measurement outliers, and analysis of their importance from an electrical system outlook.
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.