Automatic Load Model Selection Based on Machine Learning Algorithms

Technology development and decentralized operations are changing conventional electric systems, where load modeling has long been a challenge in dynamic analysis. Consequently, accurate dynamic load models are required to ensure the quality of studies on current systems. This paper presents an automatic strategy based on clustering, classification, and optimization algorithms to obtain load models under several system operating conditions. The obtained load models are helpful for the planning, operation, and protection of electric power systems. The proposed approach is validated using the IEEE 14-bus test system, where high performance is obtained. The average cross-validation error obtained for the load models assigned to the 13 clusters of disturbances is $5.36\times 10^{-3}$. The cross-validation error is used as a tolerance value to determine when an online assigned load model is suitable to represent the measured disturbance. The tests show the strategy’s capability to define the load model online, making this approach suitable for field applications.

Error estimation for the first of the V folds.
Error function to determine the differences between the measured and the estimated active and reactive power.
par: Load model parameters.
par*: Optimal load model parameters.

I. INTRODUCTION
A. MOTIVATION
Load modeling has been a research topic due to its particular aspects, such as the type, composition, variability, and diversity of loads. Many strategies have been published, considering different approaches (component-based or measurement-based modeling) and mathematical structures (static, dynamic, or composite) or their physical (induction motor equations) or non-physical meaning (data-based black-box modeling), among other aspects [1], [2].
Two questions frequently appear when load modeling is analyzed: Which model best represents the load behavior? and How can the model's generalization capability be improved? Regarding the former question, several models can represent some of the required characteristics in specific applications, e.g., static load models for power flow calculation [3], dynamic load models for transient and voltage stability [1], and black-box or grey-box models for active distribution networks (ADN) [4], [5].
On the other hand, due to the load variability in both composition and power demand, it is difficult to guarantee that the same load model adequately represents operating conditions that differ from those used in the parameter estimation when the measurement-based load approach is applied. This characteristic is known as generalization capability and must be evaluated for each load model. Usually, the generalization capability is tested for disturbances similar to those used in the model parameter estimation, which is not a recommended test of the load models [6]. Moreover, in conventional distribution systems, a load model type is generally defined for a specific bus, and its parameters are not changed regardless of the studies performed. However, considering the stochastic nature of distributed energy sources, the everyday use of power converters, and the characteristics of modern loads, new modeling approaches must be developed that allow one to shift between model types according to the time-varying load composition. This paper proposes an automatic load model selection strategy based on machine learning and optimization algorithms to solve these problems. The proposed strategy obtains the best load model to analyze the electric system, considering different disturbances and operating conditions.

B. STATE OF THE ART - COMPARISON TO THE RELATED PROBLEM
Several approaches have been considered for load modeling; however, this paper focuses only on the measurement-based load modeling approach enhanced by machine learning and optimization algorithms.
Supervised learning algorithms have been applied in load forecasting in several studies [17], [18], [19], [20], [21]. However, according to the literature review summarized in Table 1, few papers consider a combination of data-based learning approaches to obtain an automatic adaptive load model that is useful for system analysis under different disturbances. In [22], k-means clustering defines the consumption patterns used as a training set to obtain generalized load models constructed by RBF neural networks. Zhang et al. [23] used k-medoids clustering to analyze the parameter dependency for the complex WECC dynamic composite load model (CMPLDW). Tray et al. [24] identified the load and classified it into residential, commercial, agricultural, and mixed, using Artificial Neural Networks (ANN). Kontis et al. [25] proposed a statistical analysis and an ANN-based approach to obtain generic dynamic load models for different operating conditions. Another ANN-based modeling approach is found in [26], where the data of an induction motor load under several operating conditions were used to develop a black-box model. In [27], clustering and ANN-based strategies were proposed to derive parameters for a variable-order dynamic equivalent model. Wang et al. [28] used the support vector machine (SVM) approach to propose a parameter identification for composite ZIP and electronic loads. Liang et al. [29] proposed SVM to identify parameters for a differential equation-based dynamic load model structure.
As shown in Table 1, most proposed approaches focus on applying machine learning to identify load composition or specific load models that only perform correctly under specific operating conditions. Unfortunately, these strategies are not fully automated for online model determination.

C. CONTRIBUTIONS
The measurement-based load modeling approach requires a time-consuming parameter estimation process; thus, a strategy based on clustering and classification algorithms is proposed to reduce the computation effort in determining suitable load models for online applications. This allows the automatic definition of a parameterized load model that best represents the online measured power disturbance. The proposed approach offers a quick and reliable load model to perform forensic or planning studies.

II. PROPOSED STRATEGY
Supervised learning uses labeled datasets to train algorithms that later classify data or predict outcomes accurately [30]. This feature is used in the proposed approach to select online the most suitable parameterized load model from a set of predefined models. As load model determination through an estimation process can be time-consuming, this strategy reduces the effort using a two-stage methodology. The offline stage obtains several load models for a power system disturbance database through the measurement-based approach. Then, using clustering and cross-validation techniques, the load model that best fits is selected for each disturbance in the database. For a new disturbance, the online stage quickly determines the load model, using the information obtained in the offline stage and a classification technique that reduces the computation time required to obtain new load models. Figure 1 shows the general framework of the proposed strategy. The stages and respective steps involved are described below.

A. OFFLINE STAGE: ESTIMATION, CLUSTERING AND SELECTION
The main objective of this stage is to determine several load models for each disturbance by applying a measurement-based approach, optimization, and clustering algorithms. Finally, using cross-validation, the best load model is selected. This stage comprises five steps: the determination of the disturbance database, the normalization of the database, the clustering of the normalized disturbance database, the parameterization of the selected load models, and the selection of the most representative model to describe the load behavior for each cluster of disturbances. All of the steps are explained below.

1) STEP 1. DETERMINATION OF THE DISTURBANCE DATABASE
The database contains RMS voltage and power (active and reactive) measurements at the load bus, obtained during disturbances in the analyzed system or from simulation responses (synthetic data). The database must contain a wide variety of large and small disturbances to achieve diversity and successfully apply machine learning tools. This strategy considers D disturbances caused by, for example, shunt faults, line/load switching, and variations of the transformer tap changer.

2) STEP 2. NORMALIZATION OF THE DISTURBANCE DATABASE
Once the database has been obtained, a normalization process of the 3 × D signals (voltage and active/reactive power) is performed before using the machine learning tools. The objective is to change the data set values to a standard scale without distorting the differences in the ranges of values. Not every machine learning data set requires normalization; however, it is required here because the three measurements (voltage, active power, and reactive power) have different scale ranges.
There are several normalization strategies, such as min-max, mean, median, and standard deviation [31]. In this strategy, the min-max normalization algorithm is applied, considering the adequate results obtained in similar engineering applications. The normalized voltage and active and reactive power signals are merged for each disturbance in the database into one single vector $S_d = \{V_d, P_d, Q_d\}$, as presented in (1), to generate D normalized sampled signals. This set is used as input for the clustering algorithm.
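As an illustration of this step, the following minimal Python sketch applies min-max normalization per signal type and merges the result into one vector per disturbance. It assumes that the minimum and maximum are taken per channel over the whole database (consistent with the later reuse of stored extremes in the online stage); the function names, the dictionary layout, and the sample values are hypothetical and are not taken from the paper.

```python
def fit_ranges(database):
    """Return {channel: (min, max)} computed over every disturbance in the database."""
    ranges = {}
    for ch in ("V", "P", "Q"):
        samples = [x for d in database for x in d[ch]]
        ranges[ch] = (min(samples), max(samples))
    return ranges


def normalize_and_merge(disturbance, ranges):
    """Min-max normalize V, P, and Q and concatenate them into one vector S_d."""
    merged = []
    for ch in ("V", "P", "Q"):
        lo, hi = ranges[ch]
        span = (hi - lo) or 1.0  # guard against a flat channel (assumption of this sketch)
        merged.extend((x - lo) / span for x in disturbance[ch])
    return merged


# Hypothetical toy database with D = 2 disturbances and three samples per signal.
database = [
    {"V": [1.0, 0.8, 1.0], "P": [50.0, 40.0, 50.0], "Q": [20.0, 10.0, 20.0]},
    {"V": [1.0, 0.9, 1.0], "P": [50.0, 45.0, 50.0], "Q": [20.0, 15.0, 20.0]},
]
ranges = fit_ranges(database)  # stored so the online stage can reuse the same scale
signals = [normalize_and_merge(d, ranges) for d in database]
```

Storing `ranges` separately from the normalized signals is what allows a new disturbance to be scaled consistently later on.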

3) STEP 3. CLUSTERING OF THE DISTURBANCE DATABASE
A clustering step is applied to group the available D normalized sampled signals based on their similar behavior. The k-means algorithm is used in this approach, considering its simplicity and the adequate performance obtained in similar clustering applications, although the strategy can use any clustering approach [32]. This unsupervised clustering method differentiates samples based on their similarity [33]. The algorithm aims to minimize an objective function, in this case a squared error function, as shown in (2), where $c_k$ is the centroid of cluster k.
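For reference, the squared-error objective that k-means minimizes is commonly written as follows. This is the standard textbook form, not necessarily the exact notation of the paper: the set $C_k$ of signals assigned to cluster $k$ is a symbol assumed here, while $S_d$, $c_k$, and $K$ follow the text.

```latex
\min_{c_1,\dots,c_K} \; \sum_{k=1}^{K} \; \sum_{S_d \in C_k} \left\lVert S_d - c_k \right\rVert^{2}
```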
A suitable value for the number of clusters (K) can be defined using several strategies, such as the elbow and silhouette methods [34]. The elbow method, used in this paper, is based on calculating the Within-Cluster Sum of Squared Distances (WSSD) to the cluster centroid for different values of K, and choosing the value at which the decrease in WSSD first starts to diminish. In the WSSD-versus-K plot, this point is visible as an elbow, which defines a suitable value of K [35].
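The elbow idea can be sketched in plain Python as below. This is only an illustrative sketch: the farthest-point seeding, the "largest second difference" rule used to locate the elbow, and the toy two-dimensional points are assumptions made here, not the paper's implementation (which would operate on the merged normalized signals).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point seeding (a choice made for this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans_wssd(points, k, max_iter=100):
    """Run plain k-means and return the within-cluster sum of squared distances."""
    centroids = init_centroids(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:  # assignments no longer change
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow(wssd):
    """Pick the K where the WSSD curve bends most (largest second difference)."""
    candidates = sorted(wssd)[1:-1]
    return max(candidates, key=lambda k: wssd[k - 1] - 2 * wssd[k] + wssd[k + 1])


# Hypothetical toy data: three well-separated groups of three points each.
points = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [1.0, 1.0], [1.1, 1.0], [1.0, 1.1],
    [0.0, 1.0], [0.1, 1.0], [0.0, 1.1],
]
wssd = {k: kmeans_wssd(points, k) for k in range(1, 6)}
best_k = elbow(wssd)
```

In practice the WSSD curve is usually also inspected visually, since an automatic rule such as the second difference can be sensitive to noise in the curve.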

4) STEP 4. PARAMETER ESTIMATION FOR LOAD MODELS
In parallel with the normalization and clustering steps, a parameter estimation process is performed for each disturbance d in the database. Measurement-based load modeling requires an initial selection of the model structure for identification and parameter fitting; therefore, several load models are considered, because each disturbance is better represented by a specific load model [1]. The best model for each disturbance is selected automatically to avoid delegating the model selection to the power system analyst. M models are considered in this strategy to represent the static and the dynamic load behavior.
The load model parameter estimation is performed in this step using the particle swarm optimization (PSO) technique, considering the adequate performance reported in other applications [36], [37]. Nevertheless, the approach presented in this paper is not constrained to any specific optimization algorithm; thus, other options can be considered.
The accuracy of the estimated and parameterized load model is assessed based on the error function given in (3). This considers the difference between the power measured at the load bus ($P_s$ and $Q_s$) and the power estimated using the parameterized model ($P_e$ and $Q_e$) for each model m, set of model parameters $par$, and a specific disturbance d.
VOLUME 10, 2022
In (3), $J$ is the total number of samples in the measured signals. The load model parameters $par$ are updated through the optimization algorithm to minimize the error. Finally, the optimal parameters for each disturbance are defined as $par^*$. At this step, and for each disturbance d, the optimal parameters for the M load models are obtained.
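The estimation loop described in this step can be sketched as follows. This is a minimal illustration under stated assumptions: the error is taken as the mean squared mismatch of P and Q over the J samples (the paper's exact expression in (3) may differ), the load model is a simple two-parameter exponential model chosen for brevity, and the PSO settings (inertia `w`, coefficients `c1` and `c2`, swarm size, seed) are generic textbook values, not those of the paper.

```python
import random


def power_error(p_s, q_s, p_e, q_e):
    """Assumed error form: mean squared mismatch of P and Q over the J samples."""
    J = len(p_s)
    return sum((ps - pe) ** 2 + (qs - qe) ** 2
               for ps, qs, pe, qe in zip(p_s, q_s, p_e, q_e)) / J


def exponential_model(u, par, p0=1.0, q0=0.5, u0=1.0):
    """Static exponential load: P = P0 (U/U0)^alpha, Q = Q0 (U/U0)^beta."""
    alpha, beta = par
    return ([p0 * (x / u0) ** alpha for x in u],
            [q0 * (x / u0) ** beta for x in u])


def pso(cost, bounds, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm optimization with box constraints."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (w * vel[i][k]
                             + c1 * r1 * (best_pos[i][k] - pos[i][k])
                             + c2 * r2 * (g_pos[k] - pos[i][k]))
                lo, hi = bounds[k]
                pos[i][k] = min(max(pos[i][k] + vel[i][k], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


# Synthetic "measured" disturbance generated with known exponents (1.5, 2.5).
u = [1.0, 0.8, 0.85, 0.9, 0.95, 1.0]
p_s, q_s = exponential_model(u, (1.5, 2.5))
cost = lambda par: power_error(p_s, q_s, *exponential_model(u, par))
par_star, err = pso(cost, bounds=[(0.0, 4.0), (0.0, 4.0)])
```

Because the optimizer only needs a `cost` callable, another algorithm could replace `pso` without changing the rest of the sketch, in line with the remark that the approach is not tied to a specific optimizer.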

5) STEP 5. SELECTION OF THE MOST REPRESENTATIVE MODEL FOR EACH CLUSTER
As previously explained, the set of disturbances D has been grouped into K clusters. Each cluster contains several disturbances ($D_k$) with similar behavior, as presented in Step 3, and each disturbance d has been represented by M load models, as described in Step 4. Therefore, each cluster has $D_k \times M$ load models.
Step 5 is therefore oriented to determining the most representative load model $m_k^*$ for each cluster k. The cross-validation technique is selected in this step because it allows an adequate estimation of the error defined in (3). The selection of the best load model using cross-validation is depicted in Figure 2.
For cross-validation, the disturbance set in cluster k ($D_k$) is divided into F folds. The method is repeated F times, such that each time one of the F folds is used as the validation set and the other F − 1 folds are grouped to form a fitting set that defines models with optimal average parameters. The error estimation for the first fold is presented in (4), where d indexes the disturbances in fold $D_{k,f}$ of cluster k, and $par^*$ corresponds to the parameters of each model averaged over the disturbances in the other F − 1 folds.
The error estimation is averaged over all F trials to obtain the total effectiveness of the model m for representing the disturbances in cluster k, as presented in (5). This reduces bias, as the data are used for fitting, and it simultaneously reduces variance, as the data are also used in the validation set.
The criterion used to define the representative model of each cluster k is that the model must have the minimum cross-validation error, according to (5). The parameters of this model are obtained using the average of the model parameters for the $D_k$ disturbances.
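The selection procedure of this step can be sketched as follows. The sketch assumes that the per-fold error is the mean error over the validation disturbances and that folds are formed by simple striding; the paper's expressions (4) and (5) may aggregate differently. The names `fitted`, `error_fn`, and the toy targets are hypothetical stand-ins for the Step 4 results and for the error in (3).

```python
def average_params(param_sets):
    """Element-wise average of several parameter vectors."""
    return [sum(col) / len(col) for col in zip(*param_sets)]


def cv_error(model, disturbances, fitted, error_fn, n_folds):
    """F-fold cross-validation error of one model within one cluster.

    Assumes n_folds does not exceed the number of disturbances.
    """
    folds = [disturbances[f::n_folds] for f in range(n_folds)]
    fold_errors = []
    for f, validation in enumerate(folds):
        fitting = [d for g, fold in enumerate(folds) if g != f for d in fold]
        par_avg = average_params([fitted[model][d] for d in fitting])
        fold_errors.append(
            sum(error_fn(model, par_avg, d) for d in validation) / len(validation))
    return sum(fold_errors) / n_folds


def select_representative(models, disturbances, fitted, error_fn, n_folds):
    """Return the minimum-CV-error model, its cluster-averaged parameters, and all errors."""
    errors = {m: cv_error(m, disturbances, fitted, error_fn, n_folds) for m in models}
    best = min(errors, key=errors.get)
    par_best = average_params([fitted[best][d] for d in disturbances])
    return best, par_best, errors


# Hypothetical toy cluster: six disturbances, each summarized by one target value.
target = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8]
disturbances = list(range(len(target)))
fitted = {
    "good": {d: [target[d]] for d in disturbances},        # fits each disturbance well
    "poor": {d: [target[d] + 1.0] for d in disturbances},  # systematically biased
}
error_fn = lambda model, par, d: (par[0] - target[d]) ** 2  # stand-in for (3)
best, par_best, errors = select_representative(
    ["good", "poor"], disturbances, fitted, error_fn, n_folds=3)
```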

B. ONLINE STAGE: AUTOMATIC LOAD SELECTION
In the online stage, the information obtained in the offline stage (clustering, parameter estimation, and model selection) is used to derive a load model automatically when a new disturbance is presented. This strategy reduces the computation time required to obtain load models when a measurement-based approach is used. The proposed approach automates the load model selection, giving a significant advantage over other proposals, which generally require the continuous execution of the parameter estimation process for each new disturbance.

1) STEP 6. PRE-PROCESSING OF THE ONLINE RECORDED DISTURBANCE DATA
In this step, a signal processing phase is applied to each new disturbance, similar to that performed in the offline stage. The voltage and active and reactive power measurements are normalized using the strategy in Section II-A2 and the same maximum and minimum values used in the offline stage; this avoids errors caused by comparing signals normalized using different scales.
This step allows the new disturbance signal to be prepared, leading to the classification step.

2) STEP 7. CLASSIFICATION OF THE DISTURBANCE
A distance-based classification algorithm known as the k-nearest neighbors (kNN) is applied to automatically associate the normalized measurements representing the new disturbance to a specific cluster previously defined in the offline stage. This cluster's best load model (representative model) is selected to represent the load behavior in the new disturbance.
This classification algorithm is used due to its simplicity and its adequate results in a wide range of applications [38]. As in the case of the clustering algorithm, the proposed methodology is not constrained to a specific classification method.
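A minimal sketch of this classification step is given below. It assumes Euclidean distance and majority voting among the k nearest offline signals; the value of k, the cluster labels, and the two-dimensional toy signals are hypothetical and only illustrate the mechanism.

```python
from collections import Counter


def knn_classify(sample, train_signals, train_labels, k=3):
    """Assign `sample` to the majority cluster among its k nearest training signals."""
    order = sorted(
        range(len(train_signals)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(sample, train_signals[i])))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized offline signals and the cluster each one belongs to.
train_signals = [[0.0, 0.0], [0.1, 0.1], [0.0, 0.2],
                 [1.0, 1.0], [0.9, 1.1], [1.0, 0.8]]
train_labels = [4, 4, 4, 9, 9, 9]

cluster = knn_classify([0.8, 0.9], train_signals, train_labels)
```

The returned cluster index would then be used to look up the representative parameterized model stored for that cluster in the offline stage.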

3) STEP 8. LOAD MODEL VALIDATION
Once a parameterized load model (the representative model) is associated with the online acquired disturbance, the model performance is evaluated through the error presented in (3). The results of this process are compared to the defined reference error (tol). This paper uses the error for each cluster obtained from the cross-validation as (tol). In the case of an error higher than the reference, the offline obtained load models do not adequately represent the actual disturbance behavior; then, this disturbance serves to update the database as described in the next step. Otherwise, the proposed approach quickly offers a reliable load model for forensic or planning studies.

4) STEP 9. ORIGINAL DISTURBANCE DATABASE UPDATE
When the behavior of the assigned load model has a low similarity with the corresponding disturbance, this measurement is included in the disturbance database, and the complete offline stage is performed again. Otherwise, the assigned model adequately represents the new disturbance, and the database need not be updated.

III. RESULTS AND ANALYSIS
A. TEST SYSTEM
The methodology is validated using several operating conditions of the IEEE 14-bus system. This system has five synchronous machines with IEEE type-1 exciters, three of which are synchronous compensators used for reactive power support; it additionally contains 11 loads (259 MW and 81.3 Mvar in total). The dynamic data for the generator exciters are obtained from [39], and the static data are obtained from [40].

B. OFFLINE STAGE: ESTIMATION, CLUSTERING AND SELECTION
1) STEP 1 - DETERMINATION OF THE DISTURBANCE DATABASE
The dynamic load behavior is captured through voltage and active and reactive power signals. A total of 319 disturbances are included in the database (D = 319). The disturbances consider shunt faults, load size variation, and the connection or disconnection of generators and lines.

2) STEP 2 - NORMALIZATION OF THE DISTURBANCE DATABASE
Sampled voltage and active and reactive power signals are normalized using the min-max strategy. The maximum and minimum values for each signal are stored, as these are required in the online stage.
Next, the normalized signals of voltage and active and reactive power are merged for each disturbance in the database, $S_d = \{V_d, P_d, Q_d\}$, to generate D normalized sampled signals. Figure 3 depicts some of the merged normalized sampled signals; it is worth noting that a diversity of disturbances is needed (from small to large voltage disturbances) to represent the expected system behavior adequately.

3) STEP 3 - CLUSTERING OF THE DISTURBANCE DATABASE
At this stage, clustering is applied to assemble the available normalized signals into K groups. Based on the elbow criterion presented in Section II-A3, a value of K = 13 is obtained. As presented in Figure 4, the clustering algorithm adequately differentiates the behavior of the load response for each disturbance. Some clusters contain small perturbations, such as cluster 12; others, such as clusters 4 and 10, contain oscillatory signals or high variations of the normalized signal.

4) STEP 4 - PARAMETERIZATION OF THE LOAD MODELS
As described in Section II-A4, for each disturbance d represented by the sampled signals of V , P and Q, the selected M load models are parameterized.
Four load models (M = 4), which represent static and dynamic behavior, are considered in this paper: the polynomial load model (ZIP), the exponential recovery load (ERL), the classical composite model (CL), and the induction motor with an exponential model (EL+IM). These models are adequately described in [41], [42], [43] and [44] and summarized in the appendix.
In the proposed tests, as D = 319 and M = 4, 1276 load models are estimated. This process involves a high computation time; however, it is performed offline.

5) STEP 5 - SELECTION OF THE MOST REPRESENTATIVE MODEL OF EACH CLUSTER
Only one load model is selected to represent the disturbances in a cluster at this stage. This is the most representative model of the cluster, and it is selected considering the cross-validation strategy presented in Section II-A5. The error estimated according to (5) for each load model in each cluster is presented in Figure 5. In this figure, the blue-colored [...]
• Although there are no significant differences for some models, it is notable that the ZIP model responds with good precision to severe disturbances, which are included in clusters 1, 4, 8, 10, and 11. However, for some critical disturbances, such as those included in cluster 5, dynamic load models such as EL+IM show a better response because the disturbances present high signal variations from the pre-disturbance value. It is essential to note that these conclusions should not be generalized because they are related to the system behavior under specific disturbances. In power systems with different characteristics (topology, compensation systems, and size, among others), the response will be different, and therefore other models may better represent the load behavior. This indicates that a load model that works correctly for one specific condition will not necessarily suit another.
• Although three dynamic load models are considered, each one has unique characteristics that make a specific load model more appropriate under specific operating conditions (more or fewer oscillations in the load response, a fast load response, and a low voltage drop, among others). For instance, cluster 5 can be better represented by an EL+IM model than by an ERL model.
• When there is a similar error among load models, the computation time is used as a selection criterion. This time is usually associated with the number of parameters. For instance, cluster 6 can be represented by the ERL or EL+IM models (if the CL load model is not available) due to the similarity between errors; however, ERL has fewer parameters (6 parameters) than EL+IM (11 parameters), representing a reduction in computation time, especially in large power systems.
• The selected load model must suit the proposed application. For example, a dynamic load model is preferred over a static one for stability analysis, and a static load model is preferred when a power flow is executed.
According to these observations, it is worth developing adaptive approaches to derive measurement-based equivalent models to manage the changing conditions of electric systems.
Finally, the minimum cross-validation error criterion explained in Section II-A5 is applied to define the load model parameters. The parameters obtained for the best load model of each cluster are presented in Tables 2 to 5.

C. ONLINE APPLICATION OF THE PROPOSED STRATEGY
The proposed strategy is validated using ten new disturbances obtained from the IEEE 14-bus system, not previously used in the offline stage. Variations in load size and voltage disturbances are considered to generate additional operating conditions not considered in the offline stage.

1) STEP 6 - PRE-PROCESSING OF THE ONLINE RECORDED DISTURBANCE DATA
Once a new disturbance is presented, signal normalization is performed using the same minimum and maximum values obtained in the offline stage, as mentioned in Section III-B2.
As a result of the data processing, the normalized signals representing each disturbance are shown in Figure 6. This figure includes the normalized time-domain signals of the RMS values of voltage and active and reactive power.

2) STEP 7 - CLASSIFICATION OF THE ONLINE RECORDED DISTURBANCES
A load model is assigned quickly to each new disturbance, avoiding the time-consuming parameter estimation process, by applying the classification task explained in Section II-B2. The results for the online disturbance classification are presented in Table 6. According to the results, each disturbance is assigned to one cluster, and consequently, the respective average model defined in the offline stage for that cluster is selected. As shown in Table 6, although the ZIP or ERL models are assigned to several of the new disturbances, each one has different parameters (e.g., the ZIP model parameters for disturbance A are different from the ZIP model parameters for disturbances F, G, or J) because each model corresponds to a different cluster that groups similar load responses to different operating system conditions. This variability of the load behavior under several disturbances reinforces the applicability of the adaptive approach presented in this paper.

3) STEP 8 - LOAD MODEL VALIDATION
Once a load model is assigned to the new disturbances, the model's adaptability to each signal is evaluated. Table 6 shows the model assigned to each disturbance and the error obtained when using the assigned model (column four) to represent the new disturbance. The error calculated according to (3) shows that it is possible to obtain load models with an appropriate fit. This table shows that, although the ZIP load model can represent four of the ten disturbances, dynamic load models must sometimes be incorporated, such as the EL+IM model (disturbance B) or the ERL model (disturbances C, D, E, and I).
In all of the previous tests, the load at the bus where the measurements are obtained is strictly constant (in magnitude and composition) and is composed of induction motors. It might seem evident that a single model can represent a fixed load at a bus; however, as demonstrated, the load response varies according to the system operating conditions. This shows that a fixed load model for a bus would not be appropriate across several operating conditions. Table 6 also includes the minimum-error model obtained among the other average load models. Although the selected model's error is higher than the minimum error given by other models for three disturbances (A, F, and I), these errors are very close (a percentage error of approximately 3.7% for disturbances A and I, and 2% for disturbance F). Finally, to demonstrate the applicability of the proposed strategy, Figure 7 depicts disturbance H, classified in cluster 9, and the response estimated with the assigned load model (in this case CL); the responses of the other two load models are used for comparison. As expected, the CL load model assigned to cluster 9 allows a better adjustment to the disturbance than the ZIP or ERL load models, which are unable to follow the system's power response.

4) STEP 9 - UPDATING THE ORIGINAL DISTURBANCE DATABASE
This step determines if the error of the load model assigned to the cluster is higher than the cross-validation error obtained in the offline stage. In this case, the analyzed disturbance is included in the disturbance database, and the entire offline stage is repeated, updating the clustering and the corresponding load models.
As shown in Table 6, disturbances A, F, and G meet the criterion stated above. The error analysis gives $2.55 \times 10^{-3} > 1.2 \times 10^{-3}$ for disturbance A, $5.32 \times 10^{-4} > 1.82 \times 10^{-4}$ for disturbance F, and $1.08 \times 10^{-3} > 4.3 \times 10^{-4}$ for disturbance G. According to these results, disturbances A, F, and G have to be included in the offline stage to update the disturbance database.
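The update decision amounts to a simple threshold test. The short sketch below applies it to the three error and tolerance pairs quoted in the text; the function name and the dictionary layout are hypothetical, and the remaining disturbances are omitted because their values are not given here.

```python
def needs_database_update(error, tol):
    """True when the assigned model's error exceeds the cluster's cross-validation error."""
    return error > tol


# (model error, cross-validation tolerance) as quoted in the text for A, F, and G.
results = {
    "A": (2.55e-3, 1.2e-3),
    "F": (5.32e-4, 1.82e-4),
    "G": (1.08e-3, 4.3e-4),
}
to_update = [name for name, (err, tol) in results.items()
             if needs_database_update(err, tol)]
```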

IV. CONCLUSION
The characteristics of the current power systems, where new elements and technologies appear, increase the importance of achieving adequate load models according to the study type. Although the measurement-based load approach is extensively used, considering the availability of measurements in an electric system, these models' adaptability is limited. A load model obtained from specific data measurements may not be adequate to represent other operating scenarios since it may lack the ability to fit new data measurements.
This paper presents a strategy that uses optimization, clustering, and classification algorithms to determine a suitable load model. The two-stage strategy allows the offline definition of several parameterized load models for each system bus. In the case of an online measured disturbance, the strategy quickly selects the most suitable load model with a low computational burden. This is a remarkable advantage because the proposed strategy can be extended to online monitoring schemes.
According to the results obtained in the proposed tests, applying the proposed approach improves the generalization capability of the online estimated load models. These models are usually required for adequate control, operation, protection, and planning studies in electric systems.

APPENDIX. LOAD MODELS
A. ZIP MODEL
The ZIP model is commonly used as a static load model and can be described by (A.1), where $a_0$, $a_1$, $a_2$, $b_0$, $b_1$, and $b_2$ are the proportional coefficients of the constant impedance, constant current, and constant power components, respectively, for the static active and reactive load; $P_0$ and $Q_0$ are the rated powers; and $U_0$ is the pre-disturbance voltage.
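As an illustration, the polynomial ZIP structure can be evaluated as in the sketch below. The mapping of coefficients to terms follows the order of the "respectively" in the description above (index 0 for constant impedance, 1 for constant current, 2 for constant power); this ordering, the function name, and the sample values are assumptions of this sketch, and the paper's (A.1) may index the terms differently.

```python
def zip_power(u, u0, p0, q0, a, b):
    """Evaluate the polynomial ZIP load at voltage u.

    a = (a0, a1, a2) and b = (b0, b1, b2) are taken here as the
    constant-impedance, constant-current, and constant-power shares.
    """
    r = u / u0
    p = p0 * (a[0] * r ** 2 + a[1] * r + a[2])
    q = q0 * (b[0] * r ** 2 + b[1] * r + b[2])
    return p, q


# Hypothetical coefficients that sum to one, so that P = P0 and Q = Q0 at u = u0.
a = (0.3, 0.3, 0.4)
b = (0.5, 0.2, 0.3)
p_nom, q_nom = zip_power(1.0, 1.0, 100.0, 40.0, a, b)
p_sag, q_sag = zip_power(0.9, 1.0, 100.0, 40.0, a, b)
```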

B. COMPOSITE LOAD MODEL (CL)
The composite load model (CL) combines a static polynomial load model (ZIP) and a third-order induction motor model (IM), as presented in (B.1) and (B.2) [41], where $T' = (X_r + X_m)/R_r$; $X = X_s + X_m$; $X' = X_s + X_m X_r/(X_m + X_r)$; $E'_d$ and $E'_q$ are the d-axis and q-axis transient EMF, respectively; $w$ is the rotor speed; $I_d$ and $I_q$ are the d-axis and q-axis stator currents, respectively; $R_s$, $X_s$, $X_m$, $R_r$, and $X_r$ are the stator resistance, stator reactance, magnetizing reactance, rotor resistance, and rotor reactance, respectively; $T_0$ is the mechanical torque; $T'$ is the time constant; and $X'$ is the transient reactance.

C. EXPONENTIAL LOAD AND INDUCTION MOTOR MODEL (EL+IM)
The exponential load model (EL) and the induction motor (IM), defined as EL+IM, use the third-order induction model shown in (A3) and (A4) and the static part defined in (C.1) [36]:
$P_{EL} = P_{EL0}\,(U/U_0)^{\alpha}$; $Q_{EL} = Q_{EL0}\,(U/U_0)^{\beta}$ (C.1)
where $U$ is the measured voltage at the load bus, $P_{EL0}$ and $Q_{EL0}$ are the pre-disturbance active and reactive power consumed by the static load, and $\alpha$ and $\beta$ are the static exponent parameters.

D. EXPONENTIAL RECOVERY LOAD (ERL)
The exponential recovery load model (ERL) is based on the exponential power response after a step disturbance in the bus voltage [44]. This is modeled through the first-order nonlinear differential equations shown in (D.1), where $U$ is the measured voltage at the load bus; $U_0$ is the rated voltage; $x_p$ is the state variable related to the active power dynamics; $T_p$ represents the time constant of the exponential recovery response; $N_{ps}$ and $N_{pt}$ are related to the steady-state and transient load response, respectively; $P_r$ is the active power recovery; $P_l$ is the total active power response; and $P_0$, $Q_0$ are the rated active and reactive power at the load bus. A similar notation applies to reactive power.
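To illustrate the recovery behavior, the sketch below integrates the active power part of an exponential recovery model with forward Euler. It assumes the commonly used form $T_p\,\dot{x}_p = -x_p + P_0 (U/U_0)^{N_{ps}} - P_0 (U/U_0)^{N_{pt}}$ and $P_l = x_p + P_0 (U/U_0)^{N_{pt}}$, which may differ in detail from the paper's (D.1); the time step, the parameter values, and the voltage step are hypothetical.

```python
def simulate_erl(u, dt, u0, p0, t_p, n_ps, n_pt):
    """Forward-Euler simulation of the active power of an exponential recovery load.

    Returns the total active power response P_l for each voltage sample in u,
    starting from x_p = 0 (pre-disturbance steady state at rated voltage).
    """
    x_p = 0.0
    p_l = []
    for u_k in u:
        p_steady = p0 * (u_k / u0) ** n_ps     # steady-state characteristic
        p_transient = p0 * (u_k / u0) ** n_pt  # transient characteristic
        p_l.append(x_p + p_transient)
        x_p += dt * (p_steady - p_transient - x_p) / t_p
    return p_l


# Hypothetical 10 % voltage step applied after 1 s, simulated for 20 s more.
dt = 0.01
u = [1.0] * 100 + [0.9] * 2000
p_l = simulate_erl(u, dt, u0=1.0, p0=1.0, t_p=1.0, n_ps=0.5, n_pt=2.0)
```

Right after the step the power follows the transient exponent, and it then recovers exponentially toward the steady-state characteristic with time constant `t_p`.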