Optimum Partition of Power Networks Using Singular Value Decomposition and Affinity Propagation

Due to coupling and correlation between nodes and buses in the power system, Power Grid Partitioning (PGP) is a promising approach to analyze large power systems and provide timely actions during disturbances. From this perspective, this paper proposes an efficient framework for fast and optimal PGP, based on singular value decomposition analysis of the graph's Laplacian. An Affinity Propagation clustering algorithm-based PGP is tailored for automatically forming highly interconnected clusters based on pairwise similarities without requiring a predefined number of partitions. The core objective is to quantify the clustering performance based on internal clustering validity indices, such as the Silhouette Index, Calinski-Harabasz Index, and Davies-Bouldin Index. The adopted methodology aims to enhance partitioning efficiency substantially while preserving a high level of partitioning quality. The proposed framework is verified on IEEE 14, 39, 118, and 2000-bus systems and compared to nine other well-known and widely used clustering techniques, including K-Means and Gaussian Mixture models. The simulation results demonstrate the scalability of the proposed approach and its high-quality partitioning output with a Silhouette index of 0.6162, 0.6597, 0.6664, and 0.6555 for the IEEE 14, 39, 118, and 2000-bus systems, respectively.

Index Terms-Clustering algorithms, complex networks, grid partitioning, machine learning, power system analysis.

I. INTRODUCTION
The current power grid is a dynamic and sophisticated system that combines multiple power generators, transmission lines, distribution stations, and numerous sensors to control and monitor energy systems [1]. The growing complexity and scale of the power grid, along with the requirement for instantaneous decision-making, make it a challenging domain for situational awareness and decision-support systems. Power grid situational awareness necessitates real-time monitoring and analysis of variables such as power flow, voltage, and frequency [2]. These variables are influenced by a variety of factors, including weather, generation, and demand changes, as well as equipment breakdowns. In such a large-scale system, simultaneously analyzing all these variables can be computationally demanding and may not be practical in real time [3]. To overcome such limitations and construct a secure network, Power Grid Partitioning (PGP) has become a focus of vigorous ongoing research. In power systems, PGP is used for operational control to facilitate the analysis and enhance grid resiliency and situational awareness [4]. The concept of cluster control mode involves partitioning the power grid into coherent sub-regions. These sub-regions have homogeneous properties and operate in harmonious environments and functionalities [4]. Traditionally, PGP was mainly based on administrative regions and the long-term accumulation of operational knowledge. Yet, these traditional PGP methods do not consider the inner characteristics of the power network, because they ignore the functional and structural community operations. Currently, the conventional PGP approach calculates the electrical distance between nodes and accomplishes reasonable regional partitioning by applying various clustering algorithms [4].

In [5], a double-layer voltage control strategy based on partitioning distribution networks was proposed to address voltage violation issues in large-scale photovoltaic plants. Inspired by community detection algorithms, the authors introduced a cluster performance index (Q) that considered electrical distance based on the voltage amplitude sensitivity matrix and regional voltage regulation cluster capability. The method's effectiveness was demonstrated through simulation tests on a 10.5 kV feeder in China and the IEEE 123-bus system. However, the proposed method lacks a generalized cluster quality evaluation and a scalability study.

A balanced-depth-based community detection algorithm has been proposed for voltage control in complex power networks [6]. The method calculates the distance between buses based on their var-voltage response when a disturbance occurs. To validate the efficacy of the partitioning approach, a zonal pilot bus selection method was proposed for secondary voltage control across various network sizes. Applying the method to the IEEE 39-bus system yields a modularity value of 0.65662. While its results are comparable to those of the genetic algorithm, the method is unstable because it relies on the nature and magnitude of the disturbance, so its output may vary across scenarios and disturbances.

Instead of using topological characteristics of the grid network, the authors in [7] advocated grid partitioning based on functional community structure. An enhanced Newman fast algorithm of community discovery based on the Electrical Coupling Strength (ECS) matrix has been introduced and assessed using the Boundary Power Flow Factor (BPFF). For the IEEE 39-bus system, the BPFF of the partitioning is 0.1264. The achieved BPFF value reflects a stronger internal power supply and weaker external interaction. Unfortunately, standard clustering evaluation metrics such as the Silhouette Index (SI) and Davies-Bouldin Index (DBI) were not considered in the clustering analysis.

An Improved Louvain Algorithm (ILA) has been designed for community detection based on the ECS matrix for PGP [8]. The ILA was assessed based on electrical modularity, denoted as $Q_e$, and compared with a basic Louvain algorithm. For instance, for the IEEE 39-bus system, the ILA achieved a higher $Q_e$ (0.6434) than the Louvain algorithm (0.569).
In [9], Quantum Annealing with Integer Slack Variables (QAISV) has been introduced to optimize the trade-off between accuracy and computation speed for the PGP problem. Using the admittance matrix, the authors represented the partitioned power grid. Since linearly constrained integer programming problems are unsuitable for Quantum Annealing, they introduced integer slack variables. This allowed them to transform the PGP task into a quadratic unconstrained integer optimization problem. The method's performance was tested on the Pan European Grid Advanced Simulation and State Estimation (PEGASE) 89-bus system. It was evaluated using the modularity Q index, SI, and expansion metrics. Additionally, the computational time required by the method was presented. The approach exploits quantum computing resources to offer lower computational costs without compromising accuracy. In fact, the QAISV generates the grid partitions in only 53.3% of the time required by the classical optimization method. While this approach yielded intriguing results, the introduction of slack variables and the conversion process can introduce complexity, potentially making it less feasible for larger or more intricate grids.
In [10], the authors presented a comparative study between static and dynamic grid clustering approaches. They investigated Spectral Clustering (SC) and mixed integer programming for static clustering, and slow coherency based on the eigenvector of the inter-area oscillation mode for dynamic clustering. A 6-node graph is used to demonstrate the properties of the Laplacian matrix, where the number of zero eigenvalues represents the number of connected components in the graph. The admittance matrix used in [10] sheds light on the correlation between the power system's static structure and the generator's dynamic model. Unfortunately, the algorithm was tested only on a small grid topology (the IEEE 14-bus system), losing sight of the significance of scalability and of a quantitative assessment of the clustering approaches.

Paper [11] aimed to evaluate coherency in power systems using the Affinity Propagation (AP) algorithm. The objective was to find the best distance metric for clustering frequency datasets with coherent patterns. The study compared the AP algorithm with K-Means (KM) and hierarchical clustering. The method mainly clustered generator buses and was tested on simulated signals with noise from the 11-bus Klein-Rogers-Kundur two-area power system. The proposed method scored an SI value of 0.800751. However, when tested on real-world signals captured by 94 Phasor Measurement Units (PMUs), the performance dropped to an SI value of 0.218848, which indicates a scalability problem. One of the study's limitations is its exclusive focus on frequency datasets for power systems and the lack of generalization analysis.
Various PGP methodologies exist; however, they face limitations such as data availability constraints, topological changes [6], scalability challenges [5], [10], and parameter sensitivity [11]. Meanwhile, clustering-based Machine Learning (ML) offers ample opportunities to decipher complex patterns in large-scale power networks. Still, there has been limited progress in integrating ML clustering approaches with power grid embeddings [10]. Unlike previous works, this study applies the AP algorithm to the embeddings derived from the Singular Value Decomposition (SVD) analysis of the graph's Laplacian of the power system. The primary goal of the proposed PGP technique is to identify nodes with specific coupling levels and then include them in one cluster. Partitioning the grid into smaller regions or zones allows the development of decentralized control schemes for faster and more localized decision-making. Furthermore, by identifying weakly connected components or clusters, operators can more easily detect and isolate faults within a specific region without affecting the entire grid, which prevents cascading failures and improves grid resilience. The method has been tested on multiple IEEE bus systems and compared to a wide variety of existing clustering techniques to demonstrate its efficacy. To the authors' best knowledge, this is the first systematic analysis of diverse ML-based clustering approaches on the PGP using three IEEE bus systems. The main contributions of the paper are summarized as follows.
• An efficient PGP technique based on the network characteristics is proposed. The AP algorithm searches the optimum number of partitions and the feature space by deploying SVD analysis. The proposed methodology is validated on IEEE 14, 39, and 118-bus systems to test its feasibility and scalability. Additionally, the proposed technique is tested on the 2000-bus synthetic grid to test its efficiency on a larger network.
• A comparative study of the clustering techniques for grid partition is realized. While numerous clustering algorithms have been studied in other literature, there is no consensus on which methods are more suitable for PGP. This paper enables researchers to justify the appropriate clustering methods to meet the actual requirements of PGP.
• A quantification of the grid clustering is conducted using three clustering indices to equip the Supervisory Control and Data Acquisition (SCADA) system with a degree of confidence regarding the clustering outcomes.

The rest of this article is structured as follows. The problem formulation is stated in Section II. The methodology is outlined in Section III. The clustering models are introduced in Section IV. Section V describes the case studies in detail. The final conclusions are presented in Section VI.

II. PROBLEM FORMULATION
The physical PGP problem aims to divide a power system into coherent or highly interconnected zones. The behavior of elements within these zones should be more closely related to each other than to those outside the zone. This is often done to simplify analysis, control, or isolate disturbances. Let the power system be defined by a graph $G(V, E)$, where $V$ is the set of nodes (or buses) and $E$ is the set of edges (or lines). A partition of the system is a division of $V$ into $k$ disjoint subsets $V_1, V_2, \dots, V_k$. The objective is to minimize the interactions (or power flows) between different partitions while maximizing the coherence within a partition. This can be mathematically represented as [4]

$$\min \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j \neq i}}^{k} f(V_i, V_j)$$

where $f(V_i, V_j)$ is the power flow between partitions $V_i$ and $V_j$. The ability to transmit power between buses is characterized by the equivalent impedance [7]. As the impedance of the line rises, the transmission capability decreases. This can be described as the electrical distance between buses. As such, the admittance matrix $Y$ (the reciprocal of impedance) effectively illustrates the connectivity of buses within a power grid. For two buses ($i$ and $j$) with a robust connection, a high value of $Y_{ij}$ is expected. Conversely, a weaker connection results in a lower $Y_{ij}$ value. If no connection exists between the buses, $Y_{ij}$ equals zero. Using this approach reduces the PGP complexity and minimizes the interconnections across different partitions [10]. Each node belongs to exactly one partition as [4]

$$\bigcup_{i=1}^{k} V_i = V, \qquad V_i \cap V_j = \emptyset \quad \forall\, i \neq j$$

Each partition should be internally connected, meaning there should be a path between any two nodes in the partition that does not pass through a node outside the partition. In this paper, the internal clustering validation indexes provide insights into the compactness and separation of the clusters, ensuring that elements within a partition are more similar or highly interconnected to each other than to those in other partitions (further discussed in Section V-B).
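To make the objective concrete, the following minimal sketch evaluates the inter-partition flow for two candidate partitions of a toy network. The function name `inter_partition_flow`, the 4-bus chain, and the flow values are illustrative assumptions, not data from the paper; each line is counted once, which only rescales the objective.

```python
import numpy as np

def inter_partition_flow(flow, labels):
    """Total |flow| on lines whose two end buses lie in different partitions."""
    flow = np.abs(np.asarray(flow, dtype=float))
    labels = np.asarray(labels)
    # crosses[i, j] is True when buses i and j belong to different partitions.
    crosses = labels[:, None] != labels[None, :]
    # A symmetric matrix lists every line twice, so halve the sum.
    return 0.5 * float(flow[crosses].sum())

# Hypothetical 4-bus chain 0-1-2-3 with made-up line flows (p.u.).
flow = np.zeros((4, 4))
for i, j, f in [(0, 1, 2.0), (1, 2, 0.5), (2, 3, 3.0)]:
    flow[i, j] = flow[j, i] = f

good = inter_partition_flow(flow, [0, 0, 1, 1])  # cuts only the lightly loaded line 1-2
bad = inter_partition_flow(flow, [0, 1, 1, 0])   # cuts the two heavily loaded lines
```

A labels vector assigns exactly one partition to each bus, so the "each node belongs to exactly one partition" constraint holds by construction in this representation.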

III. PROPOSED METHODOLOGY
The proposed PGP approach involves assessing the electrical distance in a power grid by using its admittance matrix $Y$ to construct the graph's Laplacian matrix, which reflects the connectivity between the grid's buses. Notably, the main assumptions in this paper are the availability of the graph information of the target power system and that each bus belongs to only one cluster or sub-grid.
Spectral graph theory explores the relationships between a graph's properties, including its characteristic polynomial, eigenvalues, and eigenvectors, and associated matrices such as the adjacency or Laplacian matrix. The adjacency matrix of a simple graph is real and symmetric, and can be diagonalized by orthogonal transformations. Consider the IEEE topology with graph $G = (V, E)$ with vertex set $V$ of $n$ buses and edge set $E$ of $m$ transmission lines. The connectivity of the graph is expressed by the $n \times n$ adjacency matrix $A$ whose elements are $a_{ij} = a_{ji} = 1$ if the vertices $i$ and $j$ are directly connected and $a_{ij} = a_{ji} = 0$ otherwise. Generally, weak connectivity is attributed to transmission lines with higher impedance or a limited number of connections between two sub-regions [10]. Fig. 1 presents the workflow of our proposed method, which is explained in the following steps:
1) Admittance matrix $Y$: The admittance matrix is a square matrix used in electrical network analysis to represent the linear relationship between the input currents and the voltages across the nodes of an electrical network. The electrical distance matrix is obtained from the admittance matrix $Y$ of a grid and reflects the interconnections between different nodes (buses) in the grid. When two nodes, $i$ and $j$, have a strong relationship, a large value of $Y_{i,j}$ is expected. On the other hand, a small value of $Y_{i,j}$ indicates a weak connection between the two nodes. When there is no connection between them, the admittance is equal to zero, that is, $Y_{i,j} = 0$ [10].
2) Laplacian matrix $L$: The eigenvalues $\lambda_i$ and eigenvectors $v_i$ of the Laplacian matrix $L$ are foundational in spectral clustering and graph signal processing [12], satisfying the equation $L v_i = \lambda_i v_i$. Notably, the second smallest eigenvalue $\lambda_2$, termed the algebraic connectivity or Fiedler value, is defined as [12]

$$\lambda_2 = \min_{x \neq 0,\; x \perp \mathbf{1}} \frac{x^{T} L x}{x^{T} x}$$

where $\mathbf{1}$ is the all-ones vector. The value of $\lambda_2$ offers insight into the graph's overall connectivity. The normalized Laplacian $\mathcal{L}$, given by [13]

$$\mathcal{L} = D^{-1/2} L D^{-1/2}$$

offers a size-invariant perspective of the graph. Given its intrinsic ability to encapsulate the graph's topology, the Laplacian matrix stands as an indispensable tool in the analysis of intricate networks. The Laplacian matrix is the difference between the degree matrix and the adjacency matrix [13]. In this study, the Laplacian matrix depends on the admittance matrix $Y$ to represent the strength of the connection between buses. $L$ is defined as follows [14].

$$L = D - W \tag{5}$$
where $D$ is a diagonal matrix with diagonal element $d_i = \sum_{j} W_{ij}$ [15], while $W$ is the weighted adjacency matrix, defined from the absolute values of the mutual admittances with zero diagonal elements as follows [10].

$$W_{ij} = \begin{cases} |Y_{ij}|, & i \neq j \\ 0, & i = j \end{cases}$$
3) Singular Value Decomposition: The SVD decomposes a matrix $M$ into three matrices: $U$, $\Sigma$, and $V^{T}$. Here, $U$ and $V$ are orthogonal matrices representing the left and right singular vectors, respectively, while $\Sigma$ is a diagonal matrix containing the singular values. These singular values, in general distinct from eigenvalues, represent the magnitude or energy of each mode in the data. The decomposition provides a compact and often low-rank representation of the original matrix, making it invaluable in applications such as data compression, noise reduction, and feature extraction. In our paper, the SVD is applied to obtain the eigenvalues $\lambda$ and eigenvectors $V$ of the Laplacian matrix $L$; because $L$ is symmetric and positive semidefinite, its singular values coincide with its eigenvalues. The decomposition can be defined as follows [16].

$$L = U \Sigma V^{T} \tag{7}$$
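4) Eigengap vector: The eigengap vector $e$ represents the difference between adjacent eigenvalues $\lambda$ in (7) as follows [16].

$$e_i = \lambda_{i+1} - \lambda_i, \quad i = 1, \dots, N-1 \tag{8}$$

5) Optimum number of clusters: According to the eigengap heuristic, the number of clusters $K$ is chosen such that all the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_K$ of the Laplacian matrix $L$ are very small but $\lambda_{K+1}$ is relatively large [13]. Thus, the optimum number of clusters $k$ can be determined as the index of the first maximum of the eigengap vector $e$ as follows.

$$k = \arg\max_{i} \; e_i$$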
The value of $k$ plays a significant role in the next step of selecting the spectral embeddings, and it serves as the predefined number of clusters for some algorithms.
6) Spectral embedding: The smallest $k$ eigenvalues and their corresponding eigenvectors are computed to obtain the matrix of $k$-dimensional spectral embedding $S \in \mathbb{R}^{k \times N}$ [16].
7) ML clustering models: The AP clustering algorithm is applied to $S$ to obtain the clusters. Moreover, the performance is compared to different ML clustering algorithms (e.g., KM, Mini-batch K-Means (MBKM) clustering, etc.), knowing that some ML clustering models need a predefined number of clusters (e.g., KM).
8) Clustering quality evaluation: Internal clustering evaluation metrics are used for each clustering algorithm to evaluate the performance, then random search hyperparameter tuning is applied to enhance the clustering performance [17].
9) Clustering model selection: Finally, the best clustering model is selected depending on the evaluation metrics, which are further discussed in Section V-B.
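Steps 2) to 6) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 6-bus toy admittance matrix, and its values are hypothetical, and the embedding is stored with one row per bus (the transpose of $S$ as written above).

```python
import numpy as np

def laplacian_from_admittance(Y):
    """L = D - W with W_ij = |Y_ij| off the diagonal and zero on it."""
    W = np.abs(np.asarray(Y)).astype(float)
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    return D - W

def spectral_embedding(L):
    """Return ascending eigenvalues, the N x k embedding, and k from the eigengap."""
    # For a symmetric positive semidefinite L the singular values equal the
    # eigenvalues and the right singular vectors are eigenvectors.
    _, sigma, Vt = np.linalg.svd(L)
    order = np.argsort(sigma)            # SVD returns descending order
    lam = sigma[order]
    V = Vt.T[:, order]
    e = np.diff(lam)                     # e_i = lambda_{i+1} - lambda_i
    k = int(np.argmax(e)) + 1            # 1-based index of the first maximum
    return lam, V[:, :k], k

# Hypothetical toy grid: two tightly coupled 3-bus groups joined by one weak line.
Y = np.zeros((6, 6), dtype=complex)
strong, weak = -10j, -0.1j               # made-up mutual admittances
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                Y[i, j] = strong
Y[2, 3] = Y[3, 2] = weak

lam, S, k = spectral_embedding(laplacian_from_admittance(Y))
```

In this toy case the largest eigengap separates the two near-zero eigenvalues from the rest, so the sketch selects two clusters, and the second embedding column takes opposite signs on the two bus groups.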

IV. AFFINITY PROPAGATION CLUSTERING MODEL
Given the absence of labeled data in the context of the PGP problem, an unsupervised strategy concentrating on clustering methods is adopted. The AP is an unsupervised clustering algorithm that identifies exemplars, i.e., representative data points, to form clusters based on the similarity between data points. The AP algorithm uses a message-passing mechanism to iteratively refine cluster assignments based on the inherent similarities between data points, and it converges once a stable set of clusters is identified. In this work, instead of passing the system similarity matrix (of size $N \times N$, where $N$ is the number of buses) to the AP algorithm, the SVD embeddings of the graph's Laplacian matrix of the system are passed. Thus, the algorithm will converge faster as the feature space is reduced from $N \times N$ to $N \times k$, where $k$ is the predefined number of clusters. The main advantage of the AP clustering algorithm is that it does not require specifying the number of clusters in advance, unlike other traditional clustering algorithms. AP clustering iteratively updates the values of two matrices, responsibility ($r$) and availability ($\alpha$), by passing messages between data points until convergence, generally resulting in a stable set of exemplars that define the clusters. The $r$ matrix reflects how well suited a data point is to be the exemplar of another data point. It uses a similarity metric $s$, often based on the negative squared Euclidean distance, to measure the closeness between data points, defined as [18]

$$s(i, j) = -\lVert x_i - x_j \rVert^{2}$$

where $x_i$ and $x_j$ are the feature vectors of data points $i$ and $j$, respectively. The $r$ matrix can be defined as follows [19].

$$r(i, k) \leftarrow s(i, k) - \max_{k' \neq k} \left\{ \alpha(i, k') + s(i, k') \right\}$$
where $i$ is the index of the data point, and $i'$ is the index of the other points. $k$ is the index of the exemplar data point, and $k'$ is the index of other exemplars. $s(i, k')$ is the similarity between the data point and other exemplars, and $\alpha(i, k')$ is the availability value sent from the candidate exemplar point to the data point.
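The availability matrix reflects how appropriate it is for a point $i$ to choose exemplar $k$, given the support of other points, and it is defined as follows [19].

$$\alpha(i, k) \leftarrow \min\left\{ 0,\; r(k, k) + \sum_{i' \notin \{i, k\}} \max\{0, r(i', k)\} \right\}$$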
The iterative process of updating responsibilities and availabilities refines the cluster assignments.With each iteration, the algorithm gets closer to the optimal grouping of data points.
The algorithm converges when changes between consecutive iterations are below a certain threshold or when the exemplar assignments remain constant for a specified number of iterations.
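Mathematically, the convergence condition Conv compares the responsibility and availability values of consecutive iterations against a threshold $\epsilon$, where $\epsilon$ is a small positive value, and $c$ and $p$ denote the values from the current and previous iterations, respectively.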
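The message-passing loop described in this section can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the damping factor, the self-availability update $\alpha(k, k) = \sum_{i' \neq k} \max\{0, r(i', k)\}$, the median default for the preference, and the stopping rule based on an unchanged exemplar set are standard choices from the AP literature that are assumed here, and the six toy points are made up.

```python
import numpy as np

def affinity_propagation(X, preference=None, damping=0.5,
                         max_iter=200, stable_iter=15):
    """Return (exemplar indices, labels) for the rows of X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Similarity: negative squared Euclidean distance.
    S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    if preference is None:               # assumed default: median similarity
        preference = np.median(S[~np.eye(n, dtype=bool)])
    np.fill_diagonal(S, preference)      # s(k, k) controls how many exemplars emerge
    R = np.zeros((n, n))                 # responsibilities r(i, k)
    A = np.zeros((n, n))                 # availabilities alpha(i, k)
    rows = np.arange(n)
    last, stable = None, 0
    exemplars = np.array([], dtype=int)
    for _ in range(max_iter):
        # r(i, k) = s(i, k) - max_{k' != k} [alpha(i, k') + s(i, k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new
        # alpha(i, k) = min(0, r(k, k) + sum_{i' not in {i, k}} max(0, r(i', k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new.diagonal().copy()   # alpha(k, k), not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new
        # Points with r(k, k) + alpha(k, k) > 0 are the current exemplars.
        exemplars = np.flatnonzero((A + R).diagonal() > 0)
        key = tuple(exemplars)
        if key == last and len(key) > 0:
            stable += 1
            if stable >= stable_iter:    # exemplar set unchanged long enough
                break
        else:
            last, stable = key, 0
    if len(exemplars) == 0:
        raise RuntimeError("no exemplars found; try a different preference")
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(len(exemplars))
    return exemplars, labels

# Made-up embedding: two groups of three collinear points.
X = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
exemplars, labels = affinity_propagation(X, preference=-10.0)
```

On this toy input the middle point of each group is expected to emerge as its exemplar. The preference value is the main tuning knob: more negative values tend to yield fewer clusters.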

V. CASE STUDIES
This section comprehensively validates the proposed network partitioning strategy. The strategy's scalability is validated on the IEEE 14, 39, 118, and 2000-bus systems as test cases, and their topological information is obtained from MATPOWER [30]. Furthermore, this part presents the data description, scoring errors, and simulation results for PGP.

A. Data Description
This part describes the different topologies used to verify the scalability of the proposed clustering approach, including IEEE 14, 39, and 118-bus systems. Table I presents some features for each topology. The IEEE 14-bus test case represents a simple approximation of the American Electric Power system as of February 1962. It has 14 buses and 20 transmission lines, as shown in Fig. 2(a). The IEEE 39-bus system is well known as the 10-machine New England power system. It has 39 buses and 28 transmission lines, as illustrated in Fig. 2(b). The IEEE 118-bus test case represents a simple approximation of the American Electric Power system (in the U.S. Midwest) as of December 1962. This system contains 118 buses and 186 transmission lines. The power grid clustering process takes the bus and transmission line parameters as input to calculate the admittance matrix $Y$, which is considered as the electrical distance matrix.

TABLE I. IEEE BUS SYSTEMS DESCRIPTION

Fig. 2. IEEE grid topologies with (a) 14-bus system, (b) 39-bus system.

B. Internal Clustering Validation Indexes
The methods employed in the existing literature to assess the quality of algorithmically generated groups can be categorized into external and internal measures. External measures rely on a known and accurate clustering, providing information about the number of groups and the constituents belonging to each group. Conversely, internal measures focus on evaluating the degree of compactness and distinctiveness within the partitions created by the algorithm. These measures frequently utilize concepts of similarity or distance to ascertain clustering quality. This form of measurement aims to capture the cohesiveness within each cluster by minimizing the distance between elements within the same cluster while simultaneously maximizing the gap between different clusters. These assessment techniques can prove particularly valuable in analyzing clustering quality within power systems, especially when clear ground truth is absent [11]. The assessment of clustering quality is not standardized, because validity indices usually depend on the data type and on the application, and applications span many fields. Internal indicators rely solely on the dataset and the clustering labels to evaluate whether the disclosed clustering structure is meaningful. Due to their wide applicability [11], this paper selects the SI, DBI, and Calinski-Harabasz Index (CH) to evaluate the quality of the obtained clusters, where a high-quality cluster has strong interconnections among its own buses and weak interconnections with buses outside the cluster [31], [32].
1) Silhouette Index (SI): The SI method reflects the compactness and separation of clusters by measuring the optimal clustering effect [11]. The SI determines the optimal number of clusters by comparing the average distance within the cluster (intra-cluster cohesion) and the minimum distance between the clusters (inter-cluster separation) [11]. The silhouette value ranges from -1 to 1, where a value > 0.71 indicates a strong structure, a value ∈ [0.51, 0.7] indicates a reasonable structure, a value ∈ [0.25, 0.5] reflects a weak structure, and a value < 0.25 suggests that the resulting clusters have no significant structure [33]. In this method, $a(i)$ represents the average distance of sample $i$ to the other samples in the same cluster, and $b(i)$ represents the minimum, over all other clusters, of the average distance of sample $i$ to the samples in that cluster, defined as follows [34].

$$a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\, j \neq i} d(i, j)$$

$$b(i) = \min_{C_k \neq C_i} \frac{1}{|C_k|} \sum_{j \in C_k} d(i, j)$$

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

where $j$ is another object in the cluster $C_i$ of size $|C_i|$, $d(i, j)$ is the Euclidean distance between objects $i$ and $j$, and $b(i)$ is the smallest average distance from data point $i$ to all points in another cluster. The SI is the mean of $s(i)$ over all samples.
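As a worked illustration of the silhouette definition, the sketch below computes the SI directly from pairwise Euclidean distances. The function name, the four toy points, and the convention $s(i) = 0$ for singleton clusters are assumptions made for the example.

```python
import numpy as np

def silhouette_index(X, labels):
    """Mean silhouette value s(i) = (b - a) / max(a, b) over all samples."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue                     # singleton cluster: s(i) = 0 (assumed convention)
        # a(i): mean distance to the other members of the same cluster.
        a = dist[i, own].sum() / (own.sum() - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean()
                for c in np.unique(labels) if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

# Made-up points: two tight pairs that are far apart.
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
si_good = silhouette_index(X, [0, 0, 1, 1])   # pairs kept together
si_bad = silhouette_index(X, [0, 1, 0, 1])    # pairs split across clusters
```

For the first labeling every sample has $a(i) = 1$ and $b(i) \approx 10.02$, giving an SI of about 0.90; the second labeling yields a negative SI, which signals misassigned samples.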

2) Calinski-Harabasz Index (CH):
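The CH is commonly used as an optimization criterion to evaluate clustering solutions as an internal validity measure. The CH is a variance ratio criterion that assesses the relevance of the clustering by calculating the Between-Clusters Sum of Squares (BSS) and Within-Clusters Sum of Squares (WSS), calculated as follows [35].

$$\mathrm{CH} = \frac{\mathrm{BSS} / (k - 1)}{\mathrm{WSS} / (N - k)}$$

$$\mathrm{BSS} = \sum_{i=1}^{k} n_i \, d(c_i, c)^{2}, \qquad \mathrm{WSS} = \sum_{i=1}^{k} \sum_{x \in x_i} d(x, c_i)^{2}$$

where $d(x, y)$ is the Euclidean distance between $x$ and $y$, $x_i$ is the cluster $i$ with $n_i$ members, $c_i$ is the centroid of cluster $x_i$, $c$ is the centroid of the whole dataset, $N$ is the total number of samples, and $k$ is the number of clusters used. It is worth mentioning that the objective is to maximize the CH.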
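The variance ratio can be checked by hand on a small example. The sketch below is a minimal NumPy version of the CH computation; the function name and the four toy points are illustrative assumptions.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """CH = [BSS / (k - 1)] / [WSS / (N - k)]."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = len(X)
    clusters = np.unique(labels)
    k = len(clusters)
    overall = X.mean(axis=0)             # centroid of the whole dataset
    bss = wss = 0.0
    for lab in clusters:
        pts = X[labels == lab]
        centroid = pts.mean(axis=0)
        bss += len(pts) * np.sum((centroid - overall) ** 2)
        wss += np.sum((pts - centroid) ** 2)
    return float((bss / (k - 1)) / (wss / (n - k)))

# Made-up points: two tight pairs that are far apart.
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
ch = calinski_harabasz(X, [0, 0, 1, 1])
```

Here BSS = 2·25 + 2·25 = 100 and WSS = 4·0.25 = 1, so CH = (100 / 1) / (1 / 2) = 200.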

3) Davies-Bouldin Index (DBI):
The DBI evaluates the clustering solutions by maximizing the inter-cluster distance while minimizing the distance between points within a cluster. When the inter-cluster distance is maximized, the characteristics of each cluster are more distinct, allowing for clearer differentiation between the clusters. The DBI defines each cluster index $R_i$ as the maximum comparison between the cluster $C_i$ and other clusters in the partition, defined as follows [36].

$$R_i = \max_{j \neq i} \frac{e_i + e_j}{D_{ij}}$$

where $D_{ij} = \lVert m_i - m_j \rVert_2$ is the distance between the centroids $m_i$ and $m_j$ of clusters $C_i$ and $C_j$, and $e_i$ and $e_j$ are the average errors for clusters $C_i$ and $C_j$, respectively, given as

$$e_i = \frac{1}{|C_i|} \sum_{x \in C_i} \lVert x - m_i \rVert_2$$

The DBI can be computed as [36]

$$\mathrm{DBI}(K) = \frac{1}{K} \sum_{i=1}^{K} R_i$$

where the objective is to minimize the DBI(K) to obtain the best clustering result.

Choosing the best index depends on the specific characteristics of the data and the goals of the clustering analysis. The SI emphasizes the compactness and separation of clusters. Here, compactness refers to the spectral coherence of buses within a cluster, indicating strong electrical interconnections. Separation, on the other hand, reflects the spectral distance between different clusters, suggesting weak electrical interconnections. The CH focuses on maximizing the separation between clusters and minimizing the variance within clusters.
The DBI balances inter-cluster spectral or electrical distance and intra-cluster compactness.
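The DBI computation described above can be sketched as follows, using the mean Euclidean distance to the centroid as the average error of a cluster. The function name and the toy points are assumptions made for illustration.

```python
import numpy as np

def davies_bouldin(X, labels):
    """DBI = mean over clusters of max_{j != i} (e_i + e_j) / D_ij."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in clusters])
    # e_i: average distance of the members of cluster i to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - m, axis=1).mean()
        for c, m in zip(clusters, centroids)
    ])
    K = len(clusters)
    R = np.zeros(K)
    for i in range(K):
        R[i] = max(
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(K) if j != i
        )
    return float(R.mean())

# Made-up points: two tight pairs that are far apart.
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
dbi = davies_bouldin(X, [0, 0, 1, 1])
```

Each cluster has an average error of 0.5 and the centroids are 10 apart, so both $R_i$ equal 0.1 and the DBI is 0.1; lower values indicate better separation.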
In practice, it is beneficial to consider multiple indices together rather than relying on a single one. Each index provides a different perspective on the quality of the clusters, and considering them concurrently can offer a more comprehensive evaluation of the clustering results and give a better interpretation of the clustering solution. Let the normalized values of SI, CH, and DBI for a given partitioning be denoted as SI(p), CH(p), and DBI(p), respectively, where $p$ represents the partitioning output. The objective function $F(p)$ can be formulated as

$$F(p) = w_1\,\mathrm{SI}(p) + w_2\,\mathrm{CH}(p) - w_3\,\mathrm{DBI}(p) \tag{23}$$

where $w_1$, $w_2$, and $w_3$ are weights assigned to each index to balance their influence on the objective function, such that $w_1 + w_2 + w_3 = 1$. The goal is to find a partitioning $p^{*}$ such that:

$$p^{*} = \arg\max_{p} F(p)$$
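One way to combine the three indices when comparing candidate partitions is sketched below. Several choices here are assumptions rather than details given in the text: min-max normalization across the candidates, equal default weights, and a negative sign on the DBI term (because a lower DBI is better). The index values in the example are made up and are not results from this paper.

```python
import numpy as np

def select_partition(si, ch, dbi, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (index of best candidate, scores) for a weighted index combination."""
    def minmax(values):
        v = np.asarray(values, dtype=float)
        span = v.max() - v.min()
        return np.zeros_like(v) if span == 0 else (v - v.min()) / span

    w1, w2, w3 = weights
    # Higher SI and CH are better; lower DBI is better, hence the minus sign.
    scores = w1 * minmax(si) + w2 * minmax(ch) - w3 * minmax(dbi)
    return int(np.argmax(scores)), scores

# Hypothetical index values for three candidate partitions.
best, scores = select_partition(
    si=[0.60, 0.40, 0.50],
    ch=[100.0, 80.0, 120.0],
    dbi=[0.50, 0.90, 0.60],
)
```

With these made-up values the first candidate wins: it has the best SI and DBI, and only a moderately lower CH than the third candidate. Changing the weights shifts that balance.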

C. Simulation Results
This section summarizes the case study results to verify the solution's scalability on different topologies, including visualization of the clustering results in the feature space and quantified analysis based on the evaluation metrics. Lastly, the best-performing clustering algorithm results are presented for each topology structure. This study uses IEEE 14, 39, and 118-bus systems to check the scalability of the clustering approach. For each network $T_n$, where $n \in \{14, 39, 118\}$, steady-state analysis is applied to collect the network's topology and physical parameter information to calculate the admittance matrix $Y_n$ using MATPOWER [30]. The admittance matrix is used to calculate the Laplacian matrix $L_n$ for each bus system topology $T_n$ as described in (5), then the SVD defined in (7) is applied to calculate the eigenvalues $\lambda_n$ and eigenvectors $V_n$ of the Laplacian matrix. These results are used to calculate the eigengap between every two consecutive eigenvalues in the $\lambda_n$ vector as in (8), where the index that maximizes the eigengap is considered the optimum number of partitions $k_n$ for each $T_n$. For $T_{14}$, the maximum eigengap occurs at $\mathrm{index}(\max(e_{14})) = 3$, which indicates that the optimum number of clusters is $k_{14} = 3$. Thus, the predefined number of clusters is set to 3 for $T_{14}$, and the spectral embeddings are represented by the first three eigenvectors as $S_{14} = V_{14}[1:3]$. Similarly, the optimal number of clusters for both $T_{39}$ and $T_{118}$ is calculated to be 4, as $\mathrm{index}(\max(e_{39})) = 4$ and $\mathrm{index}(\max(e_{118})) = 4$, respectively. Therefore, the predefined number of clusters is set to four for both topologies $T_{39}$ and $T_{118}$, and the spectral embeddings are defined as $S_{39} = V_{39}[1:4]$ and $S_{118} = V_{118}[1:4]$. At this stage, the feature space of each topology $T_n$ is ready to be passed to the AP algorithm described in Section IV. This study assesses the performance of the clustering algorithms based on two aspects: the quality of the resulting clusters, including visualization in the feature space and internal clustering evaluation metrics, and computation time.
1) Feature Space Visualization: This part presents and discusses the feature space cluster visualization for each topology under the different clustering algorithms. Fig. 3(a) shows the IEEE 14-bus system clustering results in the feature space, where each dot represents one bus of the clustered network. AP, which does not require a pre-defined k, suggested three clusters as the optimal number. The clustering results of the AP algorithm were similar to those of KM, MBKM, and GMM, all of which require a pre-defined number of clusters. AHC and BIRCH did not require a pre-defined k and produced clusterings that differ from that of AP. The MS and SC algorithms both suggested four clusters but produced different clustering results. Fig. 3(b) presents the clustering outcomes for the IEEE 39-bus system in the feature space. The AP, GMM, and MS clustering algorithms do not require a pre-defined k; nevertheless, they selected k = 4 for the provided spectral embedding, which validates the eigengap-based estimate of the optimum number of clusters. The clustering results of the algorithms that required a pre-defined k (i.e., AHC, BIRCH, KM, and MBKM) were similar to each other and to the AP output. On the other hand, DBSCAN and OPTICS suggested three clusters for the provided spectral embedding. While DBSCAN formed connected, dense clusters, OPTICS produced clusters that appear sparse. Fig. 3(c) displays the feature space clustering results for the IEEE 118-bus system. AP alone proposed six clusters, which appear to be well distributed within connected units. The GMM and OPTICS clustering algorithms do not require a pre-defined k; according to the simulation results in Fig. 3(c), they identified four as the best number of clusters for the provided spectral embedding. GMM produced clustering results similar to those of the algorithms that required a pre-defined k, while the OPTICS result contains disconnected clusters. On the other hand, the DBSCAN and MS clustering algorithms decided on only three clusters for the provided spectral embedding, and the resulting clusters have similar distributions. Density-based clustering algorithms perform better with larger networks. However, the other algorithms that required a pre-defined number of clusters (i.e., AHC, BIRCH, KM, and MBKM) produced slightly different clustering results, and some of their clusters had disconnected regions. To support the obtained results, this study uses internal clustering evaluation metrics to quantify the performance.
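The difference between an algorithm that infers the number of clusters and one that needs it in advance can be sketched as follows. The snippet assumes scikit-learn (the paper does not name the library used) and uses a synthetic two-dimensional feature space as a stand-in for a spectral embedding; the data, the damping value, and the variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, KMeans

# Synthetic feature space: 30 points scattered around three centers (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X = np.vstack([c + 0.05 * rng.standard_normal((10, 2)) for c in centers])

# AP: no number of clusters is supplied. With affinity="euclidean" (the default),
# similarities are negative squared distances, and the default preference is the
# median similarity; exemplars emerge from message passing.
ap = AffinityPropagation(damping=0.9, random_state=0).fit(X)
k_ap = len(ap.cluster_centers_indices_)

# K-Means: the number of clusters must be fixed beforehand, e.g. from the eigengap.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print("clusters found by AP:", k_ap)
print("AP labels:     ", ap.labels_)
print("K-Means labels:", km.labels_)
```

The number of clusters AP returns depends on the `preference` value, so lowering it is a common way to obtain fewer, larger partitions.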
2) Grid Partitioning Score Error: Quantifying the performance of clustering algorithms offers insight into the quality of the formed clusters and allows for a fair comparison between different algorithms. Table II presents the results using SI, CH, and DBI. For the IEEE 14-bus system, AP, BIRCH, and MBKM achieved the highest SI score of 0.6162, indicating that buses within these clusters are highly interconnected with their own cluster and have fewer connections to neighboring clusters. MS achieved the highest CH value of 30.8790 and the lowest DBI of 0.421, which suggests that its clusters are dense, well separated from each other, and have a high between-cluster variance relative to the within-cluster variance. However, its relatively lower SI suggests less cohesion within clusters. Fig. 4 visualizes the performance of the different clustering algorithms using the score error indices for the three IEEE bus systems, with the 14-bus results indicated by blue bars. In the IEEE 39-bus system, AP, KM, and MBKM showed identical clustering performance. Their SI and CH values, 0.6597 and 126.8339 respectively, are the highest, and their DBI of 0.4503 is the lowest, indicating that the clusters are well defined, with buses highly interconnected within clusters. This consistency in the performance metrics highlights the robustness of the proposed PGP. The 39-bus results are indicated by red bars in Fig. 4. For the larger network, the IEEE 118-bus system, AP outperformed the other algorithms, especially with its CH score of 568.8009, which suggests well-defined clusters with high between-cluster variance. MS, despite having the best SI and DBI scores, had a lower CH value, hinting at a potential increase in the variance within its clusters. The SI and DBI of the proposed AP algorithm are acceptable compared to those of the other algorithms, and its CH value is outstanding; it is therefore considered the best-performing algorithm. The 118-bus results are indicated by green bars in Fig. 4.
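The three internal validity indices can be computed as in the following sketch, which assumes scikit-learn's metric functions and a synthetic feature space rather than the paper's data. A partition that follows the true groups is contrasted with a deliberately scrambled one to show the direction of each index: higher is better for SI and CH, lower is better for DBI.

```python
import numpy as np
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

def validity_indices(X, labels):
    """Return the SI, CH, and DBI of a partition as a dictionary."""
    return {
        "SI": silhouette_score(X, labels),
        "CH": calinski_harabasz_score(X, labels),
        "DBI": davies_bouldin_score(X, labels),
    }

# Synthetic feature space: three compact groups of ten points each (illustrative only).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X = np.vstack([c + 0.05 * rng.standard_normal((10, 2)) for c in centers])

good_labels = np.repeat(np.arange(3), 10)   # one label per true group
bad_labels = np.arange(30) % 3              # labels interleaved across the groups

good = validity_indices(X, good_labels)
bad = validity_indices(X, bad_labels)
for name in ("SI", "CH", "DBI"):
    print(f"{name}: good = {good[name]:.4f}, scrambled = {bad[name]:.4f}")
```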
3) Computational Time and Memory Requirements: The technical setup covers two stages: the embedding calculation and the clustering process. First, the optimal power flow of the IEEE test cases was solved in the MATPOWER environment [30]. The embedding calculation was then conducted in MATLAB R2022b on a machine with an i7-6700 3.40 GHz CPU and 64.0 GB of memory. Lastly, the clustering process was executed on the Google Colaboratory platform [37]. The computational time of the first stage varies across the bus systems: 2.396 msec for the 14-bus system, 6.539 msec for the 39-bus system, and 34.163 msec for the 118-bus system. Table III presents a comparative analysis of the computational times of the simulated algorithms, measured in microseconds (μsec). As the network size increases from n = 14 to n = 118, the computational time increases consistently for almost all algorithms. For the IEEE 118-bus system, BIRCH and SC record the longest computational times, both 9.78 μsec. Conversely, within the same topology, the GMM, MS, and MBKM algorithms prove to be time-efficient, each posting a computational time below 8.5 μsec. AP falls within the mid-range of computational times across all topologies. Specifically, for the 118-bus system, its time of 8.34 μsec is shorter than that of SC but slightly longer than those of GMM and MS. Across the topologies, AP's computational time shows a modest rise, suggesting good scalability with network size and complexity. Clustering algorithms such as DBSCAN and BIRCH exhibit increased computational times when moving from simpler to more complex topologies. The KM and OPTICS algorithms start with comparable computational times of 7.15 and 7.63 μsec, respectively, for the 14-bus topology. However, they differ more noticeably in the 118-bus system, with OPTICS requiring 9.06 μsec and KM requiring 8.82 μsec. This divergence may suggest that distance-based algorithms converge faster than density-based algorithms. The choice of clustering algorithm should consider both accuracy and computational needs. The proposed AP algorithm provides satisfactory computational performance and outperforms most of the algorithms in terms of the three evaluation metrics (SI, CH, and DBI). Fig. 5 visualizes the computation time required by each algorithm to cluster the different topologies.
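The paper does not state how the execution times were measured. One common approach, sketched below under that caveat, takes the best wall-clock time over several repetitions with `time.perf_counter`; here it is applied to the SVD of randomly generated Laplacian-like matrices of the three network sizes. The matrices, the helper names, and the repetition count are illustrative assumptions, and the absolute timings depend entirely on the machine.

```python
import time
import numpy as np

def random_laplacian(n, seed=0):
    """Laplacian of a random dense weighted graph (illustrative stand-in for L_n)."""
    rng = np.random.default_rng(seed)
    W = rng.random((n, n))
    W = (W + W.T) / 2.0                 # symmetric weights
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def best_time(fn, *args, repeats=5):
    """Best wall-clock time in seconds over several repetitions."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Time the SVD stage for matrices matching the three IEEE network sizes.
timings_ms = {
    n: 1e3 * best_time(np.linalg.svd, random_laplacian(n)) for n in (14, 39, 118)
}
for n, t in timings_ms.items():
    print(f"n = {n:3d}: {t:.3f} ms")
```

Taking the minimum rather than the mean reduces the influence of background load on shared platforms.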
Memory efficiency is crucial for clustering algorithms, especially when addressing larger networks or working within restricted environments. Therefore, Table IV presents a comparative analysis of the memory sizes of the clustering algorithms, measured in kilobytes (KB). Across all IEEE bus system benchmarks, the BIRCH algorithm consistently requires the most memory, implying that its memory footprint escalates with the complexity of the network. Specifically, it consumes 17.00 KB, 31.91 KB, and 60.64 KB for the IEEE 14, 39, and 118-bus systems, respectively. In contrast, the DBSCAN, MS, and KM algorithms maintain a consistently low memory size across all datasets, indicating a steady and conservative memory usage pattern. The proposed AP algorithm demonstrates a memory footprint comparable to, or even better than, that of most algorithms, particularly with larger networks such as the IEEE 118-bus system. Notably, it outperforms SC in the IEEE 39-bus system but has the same performance in the IEEE 118-bus system. Although it is more memory-efficient than BIRCH, AP uses more memory than algorithms like AHC and DBSCAN, which belong to the hierarchical-based and density-based clustering approaches, respectively. From a scalability standpoint, as the network size increases from n = 14 to n = 118, certain algorithms, such as AHC and GMM, display an almost linear increase in memory usage. Conversely, algorithms such as DBSCAN and KM exhibit remarkably consistent memory usage across all benchmarks, suggesting superior scalability. Notice that the GMM algorithm maintains a stable memory size of 4.60 KB when moving from the IEEE 39-bus to the IEEE 118-bus system. This consistent behavior may indicate a constraint embedded in its design, as both networks are clustered into four clusters. The proposed AP algorithm may not have the highest memory efficiency across all benchmarks, but it does provide a commendable performance. This becomes particularly significant when considering the trade-offs between memory usage, algorithmic precision, and potential computational demands. Prospective studies might focus on enhancing the memory management of the AP algorithm using distributed computing and sparse matrix representation.
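The measurement procedure behind Table IV is not described. As one possible approach, the sketch below uses Python's standard `tracemalloc` module to record the peak memory traced while building a dense pairwise similarity matrix, the kind of n × n structure a dense AP implementation operates on. The helper names, the four-dimensional random features, and the choice of similarity are assumptions for illustration.

```python
import tracemalloc
import numpy as np

def peak_kb(fn, *args):
    """Peak memory (KB) traced by tracemalloc while fn(*args) runs."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1024.0

def dense_similarity(X):
    """Negative squared Euclidean distance between every pair of rows (n x n, dense)."""
    diff = X[:, None, :] - X[None, :, :]
    return -(diff ** 2).sum(axis=-1)

rng = np.random.default_rng(0)
peaks = {
    n: peak_kb(dense_similarity, rng.standard_normal((n, 4))) for n in (14, 39, 118)
}
for n, kb in peaks.items():
    print(f"n = {n:3d}: peak of about {kb:.1f} KB")
```

Because the similarity matrix grows with n², a sparse representation that keeps only similarities between electrically connected buses is a natural candidate for reducing this footprint.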

4) Scalability Analysis: The study evaluates the performance of the proposed method on the 2000-bus power system (ACTIVSg2000). The ACTIVSg2000 network is designed based on the Texas grid, with 2000 buses, 3207 transmission lines, and 327 generators [38]. Following the proposed methodology in Section III, the first stage of finding the SVD of the graph Laplacian requires a computational time of 0.6758 sec and yields the optimum number of pre-defined clusters k_2000 = 11, based on eigengap maximization as formulated in (9). Therefore, the spectral embedding formed by the first 11 eigenvectors is passed to the different clustering algorithms for PGP. Table V provides the comparative results of the clustering techniques. Notably, the proposed AP clustering algorithm converged to 10 distinct clusters and stands out with an SI of 0.6555, indicating superior cluster separation, a CH index of 2074.4521, reflecting cohesive clusters, and a DBI of 0.6103, demonstrating low inter-cluster similarity. These metrics underscore the effectiveness of the proposed AP clustering algorithm in partitioning large-scale networks, compared to density-based algorithms and algorithms that require a pre-defined number of clusters. Fig. 7 illustrates the clustering results of AP on the ACTIVSg2000 system in the feature space. Some clusters, such as 1, 4, and 7, show high density, indicating that the data points within these clusters are closer to each other in the feature space. However, some overlap is observed between clusters 2, 3, 5, and 6. AP clustered this large-scale network with a computational time of 3.39 sec and a memory usage of 30.53 MB, which demonstrates its capability in large-scale network clustering.
5) Discussions: Simulation results validate the scalability of the proposed PGP approach, as the SI and CH increase with the network size. Furthermore, it can be noticed that there is room for enhancement with larger and more complex networks such as the IEEE 118-bus system. Fig. 6 illustrates the best-resulting clusters represented on each network topology. It is worth noting that the resulting partitions are tightly clustered in densely connected groups, which could simplify the process of isolating a faulty cluster and help prevent fault propagation. For the IEEE 118-bus system, some clusters appear to be disconnected. For example, the blue cluster in Fig. 6(c) has bus 117 disconnected from the rest of the cluster. Additionally, the fact that each cluster contains at least one generator can further aid the isolation process, as these generators can be used to supply critical loads in the affected clusters while the faults are being addressed. Overall, these results demonstrate the potential of the proposed approach to enhance the resiliency and stability of power grids. The proposed method also compares favorably with a previously proposed method in terms of the SI. For instance, the authors of [11] utilized the AP algorithm for PGP; however, for a 94-node system, they recorded an SI value of 0.218848 (< 0.25, no significant structure [33]). In contrast, the proposed method recorded an SI value of 0.6664 (∈ [0.51, 0.7], logical structure (appropriate) [33]) for 118 nodes, which is far better than the method proposed in [11]. Moreover, the proposed approach proved its efficiency on large-scale networks when tested on the ACTIVSg2000 system, with an SI of 0.6555 (∈ [0.51, 0.7], logical structure (appropriate) [33]).

Fig. 6. Cluster plot of the optimal partitioning results for the IEEE (a) 14-bus, (b) 39-bus, and (c) 118-bus test systems.
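The SI interpretation bands used in this comparison can be written as a small lookup. In the sketch below, the "no significant structure" and "logical structure (appropriate)" labels follow the wording attributed to [33] above; the "weak" and "strong" bands and the exact handling of the band boundaries are assumptions based on the commonly cited silhouette interpretation scale, and the function name is hypothetical.

```python
def interpret_silhouette(si):
    """Map a Silhouette Index value to a qualitative interpretation band."""
    if not -1.0 <= si <= 1.0:
        raise ValueError("the Silhouette Index must lie in [-1, 1]")
    if si > 0.70:
        return "strong structure"                    # assumed band label
    if si > 0.50:
        return "logical structure (appropriate)"     # band quoted from [33]
    if si > 0.25:
        return "weak structure"                      # assumed band label
    return "no significant structure"                # band quoted from [33]

# SI values reported in this section.
for label, si in [
    ("94-node system in [11]", 0.218848),
    ("IEEE 118-bus", 0.6664),
    ("ACTIVSg2000", 0.6555),
]:
    print(f"{label}: SI = {si} -> {interpret_silhouette(si)}")
```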

VI. CONCLUSION
Partitioning the power grid into distinct regions serves as a strategy to restore power dispatch in areas of the grid impacted by severe cascading failures. To secure power grid operations, this paper deployed the Affinity Propagation (AP) clustering algorithm and compared it to other popular clustering models based on topological features. The study assessed the algorithm's performance on the IEEE 14-bus, 39-bus, and 118-bus systems and the ACTIVSg2000 power system. The clustering quality indices, namely the Silhouette Index (SI), Calinski-Harabasz Index (CH), and Davies-Bouldin Index (DBI), demonstrated that the AP clustering algorithm offers high-quality clusters, with average values of 0.6474, 241.0480, and 0.4911, respectively, when tested on the IEEE {14, 39, 118}-bus systems. The AP model formed logically structured clusters on the large-scale network, with an SI of 0.6555, a CH of 2074.4521, and a DBI of 0.6103. On average, the proposed clustering algorithm provided a competitive partitioning speed of 7.87 μsec and a reasonable memory requirement of 43.31 KB. According to the simulation results, the algorithm consistently combined fast partitioning speed, high-quality partitioning results, and reasonable computational requirements, making it a robust choice for achieving well-defined clusters. The results also showed that the algorithm performed particularly well when the number of clusters it detected coincided with the optimum number suggested by maximizing the eigengap. The adoption of the proposed AP clustering method can help prevent large-scale outages and widespread blackouts by isolating disturbances to the affected partitions. Future studies will focus on distribution grids, controlled islanding, and time-domain validation of the created power grid partitions. Time-domain validation includes modeling each partition and contrasting its dynamic behavior with that of the original unpartitioned system.
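As a consistency check, the average SI quoted for the IEEE {14, 39, 118}-bus systems follows directly from the three per-system SI values reported for AP (0.6162, 0.6597, and 0.6664):

```latex
\overline{\mathrm{SI}} = \frac{0.6162 + 0.6597 + 0.6664}{3} = \frac{1.9423}{3} \approx 0.6474
```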

Fig. 4. Performance comparison of the clustering algorithms for different IEEE bus systems based on (a) SI (b) DBI (p.u.) (c) CH index.

Fig. 5. Computation time comparison of the clustering algorithms for different IEEE bus systems, based on execution time (µsec).

Fig. 7. Feature space clustering results for the ACTIVSg2000 network using the proposed AP clustering algorithm.

TABLE III
COMPARISON OF THE DEPLOYED ALGORITHMS AND THEIR COMPUTATIONAL TIME

TABLE IV
MEMORY REQUIREMENTS OF THE CLUSTERING ALGORITHMS

TABLE V
COMPARISON OF THE DEPLOYED ALGORITHMS FOR ACTIVSG2000 NETWORK