A Model-free Approach for Estimating Service Transformer Capacity Using Residential Smart Meter Data

Before residential photovoltaic (PV) systems are interconnected with the grid, various planning and impact studies are conducted on detailed models of the system to ensure safety and reliability are maintained. However, these model-based analyses can be time-consuming and error-prone, representing a potential bottleneck as the pace of PV installations accelerates. Data-driven tools and analyses provide an alternate pathway to supplement or replace their model-based counterparts. In this article, a data-driven algorithm is presented for assessing the thermal limitations of PV interconnections. Using input data from residential smart meters, and without any grid models or topology information, the algorithm can determine the nameplate capacity of the service transformer supplying those customers. The algorithm was tested on multiple datasets and predicted service transformer capacity with >98% accuracy, regardless of existing PV installations. This algorithm has various applications from model-free thermal impact analysis for hosting capacity studies to error detection and calibration of existing grid models.

Index Terms-Data-driven analysis, distribution system planning, parameter estimation, photovoltaic (PV) integration, smart meters.

I. INTRODUCTION
THE adoption rate of distributed energy resources (DERs), such as residential photovoltaic (PV) systems, is limited in part by the tools available to study their potential impacts on the electric grid. These engineering analyses also contribute to the non-hardware "soft" costs of a PV installation that represent the dominant portion of PV system costs [1]. Conventionally, PV impact studies are conducted on detailed models of the grid and often require time-consuming simulations, which can lead to bottlenecks and delays when processing PV interconnection requests. Grid models are also created and updated manually, meaning they are prone to a variety of errors. Unsurprisingly, these errors can have significant implications, such as leading to severe over- and underestimations of residential PV hosting capacities on distribution feeders [2], representing another limitation of existing model-based tools. For example, consider a hypothetical interconnection request for a 15 kW PV system in a neighborhood with two existing 10 kW rooftop PV systems. The customers are served by a 50 kVA single-phase transformer, but the grid model says that transformer is only rated for 25 kVA, so the request is denied based on overloading concerns. The utility could send a crew to verify the transformer ratings and update their model, but this process is labor-intensive and costly. In practice, creating and maintaining reliable grid models is an ongoing challenge for many utilities.
Fortunately, the widespread deployment of advanced metering infrastructure (AMI), such as intelligent reclosers and residential smart meters, has opened the door to a variety of data-driven model calibration techniques [3], which can be faster and more scalable than manual verification processes. Many of the recently proposed techniques apply to facilitating PV integration. A deep neural network approach was presented that can accurately determine the size, tilt, and azimuth of existing PV systems using only smart meter data [4], which can improve utility records of existing PV installations. A method for estimating the locations of existing PV systems in distribution grid models was proposed in [5] that leverages voltage sensitivity calculations on measurements from smart meters and other grid sensors. AMI data has also been leveraged to disaggregate reactive power contributions from PV systems with advanced inverter functions to improve time-series modeling and control setting estimation [6].
While data-driven model calibration techniques can be useful for detecting certain types of common modeling errors, there are limitations to these approaches. First, they can only be applied when detailed grid models have already been established and are able to be compiled. Many smaller utilities and electric cooperatives still rely on paper models of their systems or have grid models that are in such poor shape that the model-calibration techniques cannot be applied. Second, once the grid model errors have been detected, manual effort may still be required to make the corrections. Finally, even when grid models are error-free, model-based PV impact studies often require time-consuming power flow simulations that can be computationally prohibitive for some utilities.

TABLE I LITERATURE SUMMARY OF STATE-OF-THE-ART METHODS
To address these drawbacks, data-driven analyses have recently been receiving a great deal of attention as alternatives to their model-based counterparts. Instead of leveraging AMI data to calibrate grid models for simulations, the data are input directly into model-free tools that can yield the same type of results. As such, data-driven analyses are often more robust, given that any changes made to the underlying system would be captured directly in the AMI measurements. Many data-driven analyses for PV integration studies have focused on the voltage impacts of PV systems. In [7], a neural network-based approach was proposed for model-free voltage calculations from smart meter data. A similar approach was also proposed that utilized a deep neural network to calculate voltage changes associated with PV power injections from smart meter data [8]. In [9], smart meter data were utilized in a linear regression-based approach to conduct a voltage-constrained PV hosting capacity (HC) analysis. While these data-driven tools can be utilized to assess the voltage impacts of grid-connected PV systems, they do not address potential thermal impacts, such as the overloading of conductors and transformers.
The main challenge with assessing the potential thermal impacts of residential PV installations is that grid models are rarely available for the low-voltage (LV) secondary circuits that connect residential customers to their service transformers; even when those models do exist, they are often inaccurate or oversimplified. Many data-driven model calibration techniques have been proposed to address this challenge. Most of the literature has focused on determining the topology and line parameters of the LV circuit and validating the customer-transformer groupings; these tasks were addressed in [10], which applied a combination of voltage correlation and linear regression techniques. Many subsequent methods have been proposed to estimate the parameters and topologies of LV circuits, such as implementing principal component analysis and a graph-theoretic approach [11], utilizing a multiple linear regression model [12], or using an improved statistical approach that combines correlation analysis with parallel-circuit regression [13]. Other works have focused on validating customer-transformer groupings, such as by applying pairwise correlation coefficients from smart meter data to identify errors in service transformer connections [14] and a linear regression formulation for error correction [15], or by implementing a knowledge-driven identification model [16].
Determining the parameters of the service transformers that supply the LV networks is an important yet often overlooked task. Having accurate knowledge of service transformer capacity (kVA) ratings is critical for distribution system operation and planning tasks, such as PV integration or electric vehicle (EV) adoption. A summary of the state-of-the-art methods for data-driven estimation of service transformer capacity is given in Table I. There are two main types of approaches used to estimate service transformer parameters: regression-based [17], [18], [19] and terminal measurement-based [20], [21], [22]. In this article, it is assumed that none of the transformers are equipped with terminal measurement devices, meaning only the regression-based approaches are relevant here. The methods in [17] and [19] are similar in that they each rely on linear voltage drop approximations to estimate service transformer impedance, and [19] can be applied when reactive power measurements are unavailable. The drawbacks of these methods are twofold: they both assume that the primary MV system topology is known, and they rely on an iterative correlation-based approach to estimate the LV circuit topology. The methods in [18] also rely on the iterative topology estimation approach, but propose using location data to estimate line parameters when a transformer is serving a single customer. However, only one other location is utilized, meaning the approach is susceptible to error when the nearby transformer is physically close but connected to a different branch. Also, the transformers must each supply just a single customer, an assumption that may not always hold. Lastly, none of the existing methods in Table I have been tested on feeders with distributed PV systems.
In contrast to existing methods, this article proposes a spatially-aware, topology-agnostic algorithm for estimating service transformer capacity ratings using only residential smart meter measurements and metadata (e.g., geographic location). The algorithm does not require any grid models of the primary or secondary circuit topologies and has significant computational advantages over existing methods. The novel contributions of this article include the following.
1. A robust, model-free algorithm for estimating service transformer capacity ratings, directly applicable to data-driven thermal impact studies for PV and other DER.
2. A computationally-efficient, topology-agnostic method for estimating measurements at transformer LV terminals.
3. A spatially-aware weighted voting scheme for service transformer impedance estimation.
The rest of this article is organized as follows. Section II introduces and describes the proposed algorithm, Section III details the distribution circuit and smart meter datasets used to evaluate the performance of the algorithm, Section IV presents the results, and Section V discusses the limitations and future work. Finally, Section VI concludes the article.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

II. PROPOSED MODEL-FREE ALGORITHM
Thermal constraints (e.g., conductor or service transformer capacity ratings) are among the most common limiting factors faced by residential PV interconnection requests. To streamline the analysis of these potential limitations, the proposed algorithm leverages and expands upon existing parameter estimation techniques from the literature to identify these thermal constraints using smart meter measurements without the need for detailed grid models or topology information.

A. Algorithm Overview
The diagram in Fig. 1 provides a high-level overview of the inputs and outputs of the proposed algorithm. For a given distribution feeder, "Customer-Transformer Groupings" refers to the knowledge of which groups of customers are served by the same service transformer. While this information is typically known, there are data-driven approaches available to extract these groupings from smart meter data (e.g., [15]). Next, smart meter measurements must be available for each of the customers in the groupings. Specifically, measurements for real power (P), reactive power (Q), and voltage (V) should be available for each customer. In this article, all customers had a full year of smart meter data available with measurements taken at 15-min intervals. Note that 30-min or hourly data may also be suitable, but evaluating the performance impacts of low-resolution data is beyond the scope of this article. Finally, metadata from each smart meter should be available, including the physical location of the meter (e.g., geographic coordinates or street address) and the phase to which it is connected. While customer phasing may not be known, there are methods available to extract this information from smart meter measurements (e.g., using voltage drop approximations [10] or co-association matrix ensemble clustering [23]).
It is important to emphasize that the proposed algorithm does not require any information about the medium-voltage (MV) feeder topology or the topology of the LV circuit connecting the customers and their service transformers. Utility models of LV circuits are nearly always over-simplified or missing entirely, which severely limits the practical use of existing parameter estimation methods from literature that require LV circuit topology.

B. Algorithm Description
The proposed model-free algorithm for estimating service transformer capacity consists of a multistep approach to leverage the input data outlined in Fig. 1 and convert it into actionable information for streamlining PV interconnections. Overall, the algorithm utilizes and adapts the principles of linearized voltage drop approximations [24] and secondary circuit parameter estimation [17] to estimate service transformer impedances that can be converted to capacity ratings using a lookup table. The diagram in Fig. 2 presents a flowchart of the proposed algorithm; the color-coded columns of the flowchart represent the three main components of the algorithm that are each explained in greater detail in the following sections.
1) Aggregation for Service Transformer Measurements: The goal of the algorithm is to estimate the resistance (R) and reactance (X) of a given service transformer as accurately as possible using the available input data from Fig. 1 and without any circuit topology information (i.e., topology-agnostic). To apply an existing parameter estimation approach (e.g., [17]) to this problem, measurements at the LV terminal of the service transformers would be needed, but few utilities in the US, if any, have meters on their service transformers. Therefore, these measurements must be estimated from the available input data.
The diagram in Fig. 3 depicts an example of a typical secondary circuit serving three customers. For this example, it is assumed that the topology is known and that smart meter data is available for each of the three customers. Since the topology is known, the conventional parameter estimation approach [17] can be applied to estimate the voltage at "node 2," $V_{Node2}$, from the measurements at Customer 1 (i.e., $P_1$, $Q_1$, and $V_1$) and Customer 2 (i.e., $P_2$, $Q_2$, and $V_2$) by applying the linearized voltage drop equations, such that

$$V_{Node2} = V_1 + I_{R,1} R_1 + I_{X,1} X_1 \tag{1}$$

$$V_{Node2} = V_2 + I_{R,2} R_2 + I_{X,2} X_2 \tag{2}$$

where the general expressions for real current, $I_R$, and reactive current, $I_X$, are described as:

$$I_R = \frac{P}{V} \tag{3}$$

$$I_X = \frac{Q}{V} \tag{4}$$

where (1) and (2) can be combined and rearranged to form the linear regression problem in

$$V_1 - V_2 = -I_{R,1} R_1 - I_{X,1} X_1 + I_{R,2} R_2 + I_{X,2} X_2. \tag{5}$$

As long as there are enough measurements from Customer 1 and Customer 2 from Fig. 3, the linear regression problem (5) can be applied and solved, yielding an estimate of the impedance values (i.e., $R_1$, $X_1$, $R_2$, $X_2$) of the conductors between those two customers and their nearest common node (i.e., "node 2" in this case). In this work, the linear least squares problem (5) was solved in MATLAB using the fitlm() function with the "intercept" parameter set to "false," which utilizes QR decomposition as its fitting algorithm. This function outputs values for each of the four coefficients in (5) as well as a variety of summary statistics describing the overall fit of the model. The root-mean-square error (RMSE) of the model is leveraged in the weighted voting scheme described in a later section. Next, $V_{Node2}$ can be estimated as an average of the two branch voltage estimates from the full voltage drop

$$V_{Node2} = \frac{1}{N} \sum_{i=1}^{N} \left\| V_i + (R_i + jX_i)(I_{R,i} - jI_{X,i}) \right\| \tag{6}$$

where N is the number of parallel branches (e.g., 2 in this case for the branches to Customer 1 and Customer 2) and ||•|| refers to the magnitude of the complex number.
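As an illustration of the pairwise regression step, the following minimal Python sketch fits the four branch impedances from two customers' measurements without an intercept. It is not the MATLAB fitlm() implementation used in this work: it solves the normal equations with a small Gaussian elimination routine rather than QR decomposition, and it assumes the voltage difference $V_1 - V_2$ is regressed on $-P_1/V_1$, $-Q_1/V_1$, $P_2/V_2$, and $Q_2/V_2$ (powers in W and var, voltages in V, impedances in ohms). The function names `fit_pair` and `synth_meter_voltage` are hypothetical, and the RMSE definition (residual degrees of freedom n - 4) is an assumption.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_pair(p1, q1, v1, p2, q2, v2):
    """No-intercept least squares; returns ([R1, X1, R2, X2], rmse)."""
    # Regressors are the real and reactive currents, I_R = P / V and I_X = Q / V.
    rows = [[-pa / va, -qa / va, pb / vb, qb / vb]
            for pa, qa, va, pb, qb, vb in zip(p1, q1, v1, p2, q2, v2)]
    y = [va - vb for va, vb in zip(v1, v2)]
    k = 4
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(gram, rhs)
    resid = [t - sum(c * ri for c, ri in zip(coef, r)) for r, t in zip(rows, y)]
    # RMSE with n - k degrees of freedom (assumed definition of the statistic).
    rmse = math.sqrt(sum(e * e for e in resid) / (len(y) - k))
    return coef, rmse


def synth_meter_voltage(v_node, p, q, r, x):
    """Synthetic meter voltage satisfying V_node = V + (P*R + Q*X) / V exactly."""
    return [(vn + math.sqrt(vn * vn - 4.0 * (pi * r + qi * x))) / 2.0
            for vn, pi, qi in zip(v_node, p, q)]
```

On synthetic data generated with `synth_meter_voltage`, the fit should recover the branch impedances used to create the data; with real measurements, the residual RMSE indicates how well the linearization holds for that pair.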
The same process can be repeated using measurements from Customer 3 (i.e., $P_3$, $Q_3$, and $V_3$) and estimates from "node 2" to estimate the branch impedances (i.e., $R_{L1}$, $X_{L1}$, and $R_3$, $X_3$) and estimate the voltage at "node 1." In that case, $V_{Node2}$ would be calculated from (6), while $P_{Node2}$ and $Q_{Node2}$ would be the real and reactive power sums of Customer 1 and Customer 2. After that, a representative location could be determined for the service transformer by averaging the coordinates of all three customers, at which point all the relevant information for estimating the service transformer impedances would be known.
However, since the secondary circuit topologies are rarely known, the process for determining the "node 1" measurements in Fig. 3 was modified for the proposed algorithm. The diagram in Fig. 4 provides a more realistic representation of the challenge faced by the proposed algorithm. In this case, the secondary circuit topology is unknown, so determining the exact topology, estimating each branch impedance, and calculating all intermediate node voltages becomes more challenging and requires more computational resources.
The proposed algorithm, outlined in Fig. 2, circumvents the unknown topology issue by leveraging certain characteristics of radial secondary circuits. First, a filter is applied to the real and reactive power measurements of all customers to ensure unidirectional power flow from the service transformer to the customer premises. This step guarantees that the voltage at the LV terminal of the service transformer will always be greater than the voltage at all of the customer locations. Next, the algorithm applies (1) through (6) to every possible combination of customer pairs supplied by the same service transformer. This step provides a voltage estimation for the nearest common node between the two customer locations. For example, in Fig. 3, the nearest common node between Customer 1 and Customer 2 is "node 2," whereas "node 1" is the nearest common node between Customer 1 and Customer 3 (and between Customer 2 and Customer 3, as well).
After every unique pair of customers has been evaluated, the voltage profile with the highest average voltage would be selected to represent the voltage at the service transformer LV terminals (i.e., "node 1" in Fig. 4), as long as it is also greater than each of the average customer voltages. If not, this would indicate that the nearest common node is actually further downstream from the service transformer than one of the customer locations or that there is only one customer served by that transformer. In these cases, the voltage profile from that customer would be selected instead. Finally, the real and reactive powers would be derived by summing all the customer power measurements, while the representative geographic location would be set by averaging all the customer coordinates. This process would then be repeated for all the service transformers before moving on to the next main component of the algorithm (middle, orange column of Fig. 2).
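The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data layout (a dictionary per customer with `p`, `q`, `v`, and `xy` entries) and the callable `pair_voltage`, which stands in for the pairwise regression and voltage averaging steps, are hypothetical. The power-flow filter is assumed to have been applied beforehand.

```python
from itertools import combinations
from statistics import fmean


def aggregate_to_transformer(customers, pair_voltage):
    """Estimate LV-terminal measurements for one service transformer.

    customers: dict of name -> {"p": [...], "q": [...], "v": [...], "xy": (x, y)}
    pair_voltage: callable(cust_a, cust_b) -> common-node voltage profile
        (hypothetical stand-in for the pairwise estimation step).
    """
    names = sorted(customers)
    # Fallback: the customer profile with the highest average voltage.
    best = max((customers[n]["v"] for n in names), key=fmean)
    # One common-node voltage estimate per unique customer pair.
    candidates = [pair_voltage(customers[a], customers[b])
                  for a, b in combinations(names, 2)]
    if candidates:
        top = max(candidates, key=fmean)
        # Keep the pair estimate only if it exceeds every customer average.
        if fmean(top) > fmean(best):
            best = top
    steps = range(len(best))
    p_sum = [sum(customers[n]["p"][t] for n in names) for t in steps]
    q_sum = [sum(customers[n]["q"][t] for n in names) for t in steps]
    x_mean = fmean(customers[n]["xy"][0] for n in names)
    y_mean = fmean(customers[n]["xy"][1] for n in names)
    return {"v": best, "p": p_sum, "q": q_sum, "xy": (x_mean, y_mean)}
```

A transformer serving a single customer produces no pairs, so the sketch falls through to that customer's own voltage profile, matching the special case described in the text.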

2) Pairwise Estimation of Service Transformer Impedance:
Once the customer smart meter measurements have been aggregated to the LV terminals of each service transformer on the distribution feeder, the linear regression in (5) can be applied after filtering out the instances of reverse power flow in the aggregated service transformer measurements to minimize the linearization error (see [17] for more details). In this case, the branch series impedances being estimated represent the R and X values of two neighboring service transformers connected to the same phase, as depicted in Fig. 5. Since different classes of service transformers have distinct impedance values, the estimated R and X values generated can then be converted to nameplate kVA ratings using a lookup table.
When the MV feeder topology is known, the regression problem (5) would simply be applied once to each set of neighboring service transformers. However, since the topology is unknown, selecting the most appropriate pairs of service transformers is more challenging. Therefore, the proposed algorithm leverages the service transformer location estimates (i.e., spatial awareness), whereby physical distance would be used as a proxy for electrical distance. Often, these two quantities are highly correlated, meaning that pairing the target transformer with the physically closest transformer on the same phase would often result in an accurate impedance estimate. An example of this scenario is depicted in Fig. 6, which shows a portion of a circuit plot with pink triangle markers indicating all service transformers connected to the same phase. The dotted red circle represents a search radius around the target transformer (yellow star marker), and the nearby transformers are circled in green. In this case, the nearby transformers physically closest to the target transformer are located on the same branch of the circuit, meaning they would be the best options to use for the impedance estimations.
However, there are also instances where the nearby transformers are physically close to the target transformer but located on a different branch of the distribution feeder, as shown in Fig. 7. In this case, the closest nearby transformers would likely not result in the most accurate impedance estimates for the target transformer.
Instead of relying on the single closest service transformer, the proposed algorithm creates multiple impedance estimates for each service transformer by iteratively pairing it with all nearby service transformers connected to the same phase, as outlined in Fig. 2. For example, in Fig. 5, "Transformer 1" is the "Target" transformer for which multiple impedance estimates will be generated. So, the algorithm implements a search radius around that target transformer, as shown in Figs. 6 and 7, to identify at least two other nearby transformers on the same phase. The algorithm would then generate unique estimates of the target transformer impedances (i.e., $R_{T1}$ and $X_{T1}$) using the aggregated measurements from each of the nearby transformers. In other words, each nearby transformer will serve as "Transformer 2" in Fig. 5. For the example shown in Fig. 6, the algorithm would generate ten unique estimates of $R_{T1}$ and $X_{T1}$. However, since R and X must be positive values, any of those ten estimates that includes a negative value for either the target transformer or the nearby transformer is removed from consideration.
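The iterative pairing and sign check can be sketched as follows. This is a minimal illustration, assuming a hypothetical callable `fit_pair_fn` that wraps one regression between the target and a nearby transformer and returns the four fitted impedance values plus the model RMSE; the dictionary layout of the kept estimates is likewise an assumption.

```python
def collect_impedance_estimates(target_id, neighbor_ids, fit_pair_fn):
    """Pair a target transformer with each nearby same-phase transformer.

    fit_pair_fn(target_id, neighbor_id) is a hypothetical callable returning
    (r_target, x_target, r_neighbor, x_neighbor, rmse) from one regression.
    """
    kept = []
    for neighbor_id in neighbor_ids:
        r_t, x_t, r_n, x_n, rmse = fit_pair_fn(target_id, neighbor_id)
        # Discard the estimate if any of the four fitted values is negative.
        if min(r_t, x_t, r_n, x_n) < 0:
            continue
        kept.append({"neighbor": neighbor_id, "r": r_t, "x": x_t, "rmse": rmse})
    return kept
```

Each surviving entry keeps its RMSE so that it can later be used to weight the corresponding vote.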
3) Determination of Service Transformer Capacity: The last stage of the proposed algorithm (rightmost, purple column in Fig. 2) is responsible for converting the multiple impedance estimates per target transformer into a single estimated capacity value using a weighted voting scheme. In this scheme, each impedance estimate is compared to a lookup table of all known transformer types used by a utility, which contains the transformers' apparent power rating (kVA), R, and X values. The set of R and X values from the lookup table that best matches the impedance estimate, i.e., has the lowest error according to (7), receives a vote. In (7), $R_{est}$ and $X_{est}$ are the estimated resistance and reactance values, and $R_{Lookup}$ and $X_{Lookup}$ represent the full set of impedance values from the lookup table.
After all impedance estimates for a target transformer get converted into votes for selections in the lookup table, the votes are then weighted according to the RMSE of their linear regression model from (5) using (8), where $WF_t$ is the weighting factor for impedance estimate t, belonging to the set of all impedance estimates T for the target service transformer. The weighting factors are then applied to the votes and tallied. Whichever entry in the lookup table received the most weighted votes represents the final kVA estimate for that target transformer. Once all the transformer capacity ratings have been estimated, the overall accuracy of the algorithm is determined by taking the percentage of correctly predicted ratings out of the total number of service transformers in the model, as in (9). Note that while the proposed algorithm is model-free, a synthetic circuit model was utilized for testing purposes that provided the ground-truth capacity ratings of all transformers.
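To make the voting procedure concrete, a minimal Python sketch is given below. It rests on several assumptions that are not taken from this article: the matching error is taken to be the Euclidean distance between the estimated and tabulated (R, X) pairs, the weighting factor is taken to be the inverse RMSE normalized over all estimates for the target transformer, and the lookup table values are purely illustrative. The accuracy function follows the verbal definition given above.

```python
import math
from collections import defaultdict

# Hypothetical lookup table: kVA rating -> (R, X) in ohms (illustrative values).
LOOKUP = {25: (0.030, 0.040), 50: (0.014, 0.020), 75: (0.009, 0.014)}


def nearest_rating(r_est, x_est, lookup=LOOKUP):
    """Return the kVA rating whose (R, X) is closest to the estimate.

    Euclidean distance is an assumed error metric for this sketch.
    """
    return min(lookup, key=lambda kva: math.hypot(r_est - lookup[kva][0],
                                                  x_est - lookup[kva][1]))


def weighted_vote(estimates, lookup=LOOKUP):
    """estimates: list of (r_est, x_est, rmse) tuples with rmse > 0.

    Weights are normalized inverse RMSE values (an assumed weighting rule).
    Returns (winning kVA rating, dict of weighted vote shares).
    """
    inverse = [1.0 / rmse for _, _, rmse in estimates]
    total = sum(inverse)
    tally = defaultdict(float)
    for (r_est, x_est, _), w in zip(estimates, inverse):
        tally[nearest_rating(r_est, x_est, lookup)] += w / total
    return max(tally, key=tally.get), dict(tally)


def accuracy(predicted, actual):
    """Percentage of correctly predicted ratings out of all transformers."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)
```

With the assumed weights, a single low-RMSE estimate can outvote two high-RMSE estimates that point to a different rating, which is the behavior the weighting is intended to provide.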

III. METHODS
The performance of the proposed algorithm was evaluated using two realistic datasets corresponding to the same underlying distribution circuit shown in Fig. 8: (a) one version without any existing PV and (b) another with a high PV penetration. The test circuit in Fig. 8 is a modified version of the EPRI Ckt5 test feeder [25], which will be referred to as "Ckt5." This test feeder represents an actual 3.2-mile-long 12.47 kV distribution circuit that serves 1379 customers. There are six different types of service transformers in the Ckt5 model (see Table II) and 591 service transformers in total (marked as pink triangles in Fig. 8), all of which are single-phase. The transformers serve between 1 and 6 total customers, and each customer is directly connected to the service transformer through its own conductor with a length ranging from 40 to 180 ft. Recall from Fig. 1 that the required inputs include P, Q, and V measurements for all customers along with their physical location data. For this article, the P and Q measurements (1 year at 15-min resolution) were provided from an actual utility database and assigned to represent each of the 1379 customer load profiles accordingly. To preserve anonymity, the physical location data for each of the 1379 customers corresponds to their location in Fig. 8 (i.e., not their true location). Finally, since the actual utility measurements are from a different circuit, a yearlong quasi-static time-series (QSTS) simulation is required to calculate semi-synthetic V measurements that correspond precisely to the P and Q measurements from the utility data. Note that a QSTS simulation is a form of time-series power flow analysis in which the converged state of each iteration is used as the beginning state of the next. At the conclusion of the QSTS simulation, all input data needed to test the proposed algorithm has been established for the first dataset (i.e., for Fig. 8(a)).
To generate the required inputs for the second dataset (i.e., for Fig. 8(b)), a few additional steps were taken. First, PV systems were added to 701 of the customer locations, selected at random; as a result, there were instances where multiple PV systems were connected downstream of the same transformer. The PV systems were modeled with generation profiles from an actual utility dataset that represented a diverse set of directly-metered residential PV systems with a variety of PV array orientations, dc power ratings ranging from 2 to 10 kW, and dc/ac ratios ranging from 0.8 to 1.3 [25]. The PV generation measurements had the same resolution and time horizon as the smart meter measurements used for modeling the customer load profiles (i.e., 1 year of data at 15-min resolution). Once the PV systems were modeled, an additional QSTS simulation was then conducted to align the new semi-synthetic V measurements with the utility P and Q measurements for this case.
Finally, the impact of measurement noise on the performance of the algorithm was explored. A series of Gaussian noise masks were generated and applied to the customer voltage measurements. The noise masks were created using the nominal base voltage (240 V) as the mean, and several different standard deviation values were investigated corresponding to different meter accuracy classes (specifically, 0.1, 0.2, 0.3, 0.4, and 0.5). For example, a revenue-grade meter with an accuracy class of 0.5 requires all measurements to be within ±0.5% of the true value [26]. Because Gaussian noise follows a normal distribution, setting the standard deviation of the noise to 1/3 of the meter accuracy rating (e.g., 0.167% for the 0.5 class) ensures that nearly all (99.7%) of the generated noise is within the accuracy range of that meter. The results from these analyses are presented in terms of equivalent meter class accuracy, or simply "meter class," in the results section.
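The three-sigma rule used to size the noise can be checked with a short Python sketch. How the noise mask is applied to the voltage measurements is not fully specified here, so the sketch assumes zero-mean additive noise whose standard deviation is one third of the meter class percentage of the 240 V nominal voltage; the function names are hypothetical.

```python
import random

NOMINAL_V = 240.0  # nominal base voltage from the text


def noise_sigma_volts(meter_class_pct, nominal_v=NOMINAL_V):
    """Standard deviation in volts: one third of the class limit at nominal voltage."""
    return (meter_class_pct / 100.0) * nominal_v / 3.0


def add_voltage_noise(voltages, meter_class_pct, seed=0):
    """Add zero-mean Gaussian noise to a voltage profile (assumed additive form)."""
    rng = random.Random(seed)
    sigma = noise_sigma_volts(meter_class_pct)
    return [v + rng.gauss(0.0, sigma) for v in voltages]
```

For the 0.5 class, the standard deviation is 0.4 V, so the ±0.5% band (±1.2 V at 240 V) corresponds to three standard deviations and should contain nearly all of the generated noise.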
In summary, the proposed model-free algorithm was tested on two realistic datasets (i.e., Fig. 8(a) and (b)) and under increasing levels of measurement noise, representing 12 distinct cases in total. For each case, the algorithm estimated the capacity rating of all 591 service transformers on the circuit, and the prediction accuracy was recorded.

IV. RESULTS
The proposed algorithm was first applied to the two variations of Ckt5 from Fig. 8 without measurement noise added. Recall that the algorithm generates multiple impedance estimates for each service transformer depending on how many nearby service transformers were identified. For Ckt5, a minimum search radius of 1000 ft was used, and the algorithm iteratively expanded the radius by 100 ft until at least 2 nearby transformers on the same phase were identified. This searching process resulted in 2 to 33 impedance estimates per transformer with an average of 15 estimates. The full distribution of the estimates per transformer is presented as a histogram in Fig. 9.
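The expanding search can be sketched in Python as follows. This is a minimal illustration, assuming transformer locations are given as planar coordinates in feet and that each transformer is a dictionary with hypothetical `id`, `phase`, and `xy` keys; the default values mirror the settings reported for Ckt5.

```python
import math


def find_nearby(target, transformers, min_radius_ft=1000.0, step_ft=100.0,
                min_neighbors=2):
    """Expand the search radius until enough same-phase neighbors are found.

    Returns (final radius in ft, list of nearby transformers).
    """
    same_phase = [t for t in transformers
                  if t["id"] != target["id"] and t["phase"] == target["phase"]]
    if len(same_phase) < min_neighbors:
        raise ValueError("not enough same-phase transformers on the circuit")
    radius = min_radius_ft
    while True:
        nearby = [t for t in same_phase
                  if math.dist(t["xy"], target["xy"]) <= radius]
        if len(nearby) >= min_neighbors:
            return radius, nearby
        radius += step_ft  # widen the circle and try again
```

Every same-phase transformer inside the final radius is returned, so dense areas can yield many more pairings than the minimum, consistent with the spread of estimates per transformer reported above.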
Note that the minimum number of nearby transformers and the minimum search radius are tunable parameters that each influence the distribution in Fig. 9, so it may be prudent to adjust these values for other circuits depending on the density of the customer locations. While the development of a generalized approach to optimally set these parameters is beyond the scope of this article, there are some tradeoffs to be aware of. In general, increasing the number of impedance estimates per target transformer is beneficial (i.e., to mitigate the risk highlighted in Fig. 7) but only up to a certain point. The weighted voting scheme does help to filter out erroneous results, but as the physical distance from the target transformer increases, so does the probability that an impedance estimate would better match a different, incorrect transformer in the lookup table and thus dilute the value of the correct votes. To mitigate this risk and to avoid unnecessarily including additional transformers, the minimum number of nearby transformers should be kept low (e.g., 2 or 3) and the minimum search radius should not be too large (e.g., it should not cover the entire circuit).
After all the impedance estimates are calculated for a given target transformer, they are merged into a single prediction according to the weighted voting scheme described in Section II-B. To further explain this approach, detailed results for the target transformer in Fig. 7 are provided in Table III. First, (7) was applied to the $R_{est}$ and $X_{est}$ columns in Table III to determine which transformer type from Table II would get a vote. Then, the weighting factors of those votes were calculated using (8). After tallying the weighted votes from Table III, transformer type 3 had received 79% of the weighted votes, followed by 14% for type 4 and 7% for type 5. In this case, the algorithm correctly identified the target transformer as type 3.
The results in Table III highlight the importance of the weighted voting scheme as well as the search parameters of the algorithm. For instance, if the minimum search radius had been set to 530 ft, there would have been two votes for type 4 and one vote for type 3 (if no weights were assigned). After applying the weights, type 3 would have gotten 55% of votes compared to 45% for type 4, meaning the algorithm would have still selected the correct type but by a much smaller margin.
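To make the tallying step concrete, the following sketch sums weighted votes per transformer type and normalizes them to vote shares. It is illustrative only: the function name `tally_weighted_votes` is hypothetical, and the weights below are placeholder values chosen to reproduce the 55%/45% split quoted above, not the actual weighting factors, which are computed from (8).

```python
from collections import defaultdict


def tally_weighted_votes(votes):
    """Sum weights per transformer type and return (winner, vote_shares).

    `votes` is an iterable of (transformer_type, weight) pairs, one pair per
    impedance estimate.
    """
    totals = defaultdict(float)
    for transformer_type, weight in votes:
        totals[transformer_type] += weight
    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("total vote weight must be positive")
    shares = {t: w / total_weight for t, w in totals.items()}
    winner = max(shares, key=shares.get)
    return winner, shares


# Placeholder weights (assumed for illustration): one vote for type 3 and
# two votes for type 4, as in the 530 ft example.
weighted = [(3, 0.55), (4, 0.25), (4, 0.20)]
winner_weighted, shares_weighted = tally_weighted_votes(weighted)

# Same votes with equal weights, i.e., a simple majority count.
unweighted = [(t, 1.0) for t, _ in weighted]
winner_unweighted, shares_unweighted = tally_weighted_votes(unweighted)
```

With these placeholder weights, the weighted tally selects type 3, whereas the equal-weight tally would select type 4 on a two-to-one count.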
The weighted voting scheme described in Section II-B was then implemented to merge the estimates from Fig. 9 into a single prediction per target transformer. The results in Table IV summarize the performance of the algorithm when applied to the two variations of Ckt5 from Fig. 8 without measurement noise added. The algorithm was able to correctly predict the capacity ratings of over 98% of the service transformers on Ckt5, using (9), even when distributed PV systems had been added to more than half of all customer locations. As discussed in Section II-B, the algorithm filters out instances of reverse power flow before calculating the impedance estimates, which explains why the presence of distributed PV systems had only marginal impacts on prediction accuracy.
Fig. 10. Impacts of measurement noise on prediction accuracy.
Next, the performance of the algorithm was re-evaluated after various levels of measurement noise were added to the customer voltage measurements. The results from these analyses, presented in Fig. 10, suggest that the proposed algorithm is fairly robust to measurement noise. Even for the 0.5 class meters (i.e., the least accurate class covered in [26]), the prediction accuracy remained above 95%. While these analyses only considered measurement noise on the voltage signal, prior work has shown the impacts of power measurement noise to be negligible [9]. Similarly, the impacts of missing data would be negligible as well, since those time steps would simply be filtered out prior to solving the linear regression problem.
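The filtering of unusable time steps can be illustrated with a short sketch. It is a simplified stand-in, not the regression formulated in Section II-B: the regressors `a_r` and `a_x`, the function name `filter_and_fit`, and the sign convention (positive power means the customer is importing) are all assumptions made for illustration. Time steps with missing samples or reverse power flow are dropped, and two coefficients are then fitted by ordinary least squares.

```python
import numpy as np


def filter_and_fit(dv, a_r, a_x, p_kw):
    """Drop unusable time steps, then fit dv ~ R * a_r + X * a_x.

    Returns (R, X, number_of_time_steps_used). All inputs are equal-length
    time series; NaN marks a missing sample.
    """
    dv, a_r, a_x, p_kw = (np.asarray(v, dtype=float)
                          for v in (dv, a_r, a_x, p_kw))
    # Keep a time step only if every signal is present ...
    valid = ~(np.isnan(dv) | np.isnan(a_r) | np.isnan(a_x) | np.isnan(p_kw))
    # ... and power flows toward the customer (assumed convention: p_kw > 0).
    valid &= p_kw > 0
    if valid.sum() < 2:
        raise ValueError("not enough valid time steps to fit two coefficients")
    design = np.column_stack([a_r[valid], a_x[valid]])
    coef, *_ = np.linalg.lstsq(design, dv[valid], rcond=None)
    return float(coef[0]), float(coef[1]), int(valid.sum())


# Synthetic, noise-free series with known coefficients (illustrative values).
rng = np.random.default_rng(0)
n = 50
a_r = rng.uniform(1.0, 10.0, n)
a_x = rng.uniform(1.0, 10.0, n)
p_kw = rng.uniform(0.5, 5.0, n)
dv = 0.02 * a_r + 0.03 * a_x
dv[3] = np.nan               # a missing voltage sample
p_kw[5], dv[5] = -1.0, 99.0  # a reverse-flow step with an unusable value
r_est, x_est, n_used = filter_and_fit(dv, a_r, a_x, p_kw)
```

Because the two corrupted time steps are removed before the fit, the remaining 48 samples recover the synthetic coefficients, which mirrors why isolated gaps in the data have little effect on the estimates.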
The bar chart in Fig. 11 compares the number of transformers predicted for each type to the actual totals of each type, using the results from the 0.5 meter class case. Overall, the errors were consistent across the different transformer types. In total, the actual cumulative thermal capacity of all the service transformers in Ckt5 was 17.55 MVA, and the predicted cumulative thermal capacities for the results in Fig. 11 were 17.37 and 17.44 MVA without and with PV, respectively. So, even in the worst case, the proposed algorithm only underestimated the total thermal capacity by 1.01%.

V. DISCUSSION
Overall, the proposed algorithm successfully addressed several gaps in existing methods. Specifically, this algorithm leverages smart meter metadata, such as geo-spatial location, to eliminate the need for knowledge of the grid topology. As long as smart meter measurements and location data are available for all customer locations, the service transformer ratings can be accurately estimated, meaning that potential thermal limitations of PV interconnections can be directly evaluated once the rating is determined. Thus, the proposed algorithm has applications in PV HC analyses, as well as more traditional applications like model calibration, identifying underutilized transformers, and flagging existing overloaded transformers. However, smart meter data quality and availability remain limitations for utilities lacking widespread AMI deployments.
While the results presented in this article are promising, and the computational advantage of the topology-agnostic approach provides enhanced scalability, future work is required to ensure the proposed algorithm maintains its accuracy across a diverse set of distribution circuits. For example, future work could include testing the algorithm on circuits with voltage regulating equipment or with PV systems providing grid support functions. It may also be prudent to test the algorithm on circuits with more complex LV secondary circuit topologies with larger numbers of customers, although utility models for such systems are hard to come by. Finally, the algorithm could be evaluated under other smart meter data limitations, such as incomplete smart meter coverage of all customer locations.

VI. CONCLUSION
The model-free algorithm proposed in this article provides a way to assess the thermal limitations of residential PV interconnections using smart meter data. Unlike existing parameter estimation approaches, the proposed algorithm does not require any utility models, topology information, or time-consuming simulations. Instead, the algorithm leverages smart meter measurements and metadata as inputs and implements a novel pairwise impedance estimation approach and weighted voting scheme to estimate the capacity of each service transformer on a distribution feeder. The proposed algorithm was tested on two different versions of an actual distribution circuit (i.e., with and without distributed PV systems present) and was found to have a prediction accuracy of >98%. Even when measurement noise was included, the algorithm remained above 95% accuracy. While future work is necessary, the initial results of the proposed algorithm are promising, highlighting its potential application for a variety of data-driven tools, including model-free PV impact analyses.

ACKNOWLEDGMENT
This article has been authored by an employee of National Technology & Engineering Solutions of Sandia, LLC under Contract No. DE-NA0003525 with the U.S. Department of Energy (DOE). The employee owns all right, title and interest in and to the article and is solely responsible for its contents.
The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a nonexclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this article or allow others to do so, for United States Government purposes. The DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan https://www.energy.gov/downloads/doe-public-access-plan.

Fig. 6. Example of when nearby transformers are on the same branch.

Fig. 7. Example of when nearby transformers are on a different branch.

Fig. 9. Histogram of the number of impedance estimates per transformer.

Fig. 11. Predicted versus actual transformer types for the 0.5 meter class case.

TABLE III
DETAILED ESTIMATES OF ALL NEARBY TRANSFORMERS IN FIG. 7