Real-Time Estimation of Support Provision Capability for Poor-Observable Distribution Networks

An indispensable step towards coordinating the actions of distribution and transmission system operators (dsos-tso) is to estimate the range of flexibility that can be offered to the tso by dsos. Within this context, a data-driven probabilistic approach is proposed to evaluate the capability of a distribution network in providing active and reactive power support in real-time. To this end, in an offline phase, a linear discriminant analysis model, together with a piecewise linear model of the distribution network are trained to delineate the boundary of a region representing the adherence to distribution network operational constraints. In the implementation phase, this region comprises the feasible set of a series of optimization problems, formulated to determine the support provision capability. These optimization problems are of iterative linear programming type, which allows for real-time applicability. The evaluated support capability can be deemed as the available reserve in real-time transmission operation, which enables providing a coordinated response towards unexpected events, and facilitates the participation of distributed resources in the balancing market by granting an up-to-date estimation of available supports. This approach is tested on the IEEE 123-node system and verified through comparison with an AC optimal power flow technique.


I. INTRODUCTION
T HE power system is undergoing a series of significant operational challenges due to the large-scale integration of renewable energies. Specifically, the increasing generation share from the variable renewables, in line with the decommission of conventional power plants, have hindered the preservation of the generation-demand balance, and the prevention of congestion in bulk power systems. Within this context, it is envisioned for distributed energy resources (DERs), located at the distribution level, to consider bulk power system requirements when Manuscript  operated. However, distribution and transmission networks are traditionally operated independently with transmission system operators (TSOs) and distribution system operators (DSOs) considering only local goals and constraints [1]. Enhancing the TSO-DSOs coordination has been recommended as a prominent solution towards addressing the future operational challenges of the Irish power system in regard to frequency control, voltage stability, congestion management, and system restoration [2].
To coordinate the TSO-DSOs interactions, one way is to combine the optimal operation of both systems to form a consolidated operating strategy, where decentralized [1], [3]- [6], e.g., master-slave splitting, and centralized (unified) solutions [7] have been suggested. On this subject, dynamic and multi-period coordinated economic dispatch problems are formulated in [1] and [3], respectively. In [4], a coordinated restoration framework is introduced. Two coordinated optimal power flow (OPF) approaches are proposed in [5] and [6]. In [7], the optimal scheduling of battery systems is proposed to secure the voltage stability in an integrated transmission and distribution system. To reduce the computational burden of centralized solutions, an estimate of the centralized operational model is used in [8] for economic coordination of TSO-DSOs, where four different approaches are used to model the behavior of the distribution network in transmission network operations. Whilst the cooptimization of the operation of transmission and distribution systems can lead to an optimal solution, its implementation in practice can be arduous due to barriers like escalation of the computational burden of the problem, data protection concerns, and the possibility of inconsistent objectives of TSO and DSOs.
A more practical strategy has been sought in recent studies, where DSOs provide ancillary services to the TSO at the HV/MV substation, through coordinated control of DERs in their grid. On this subject, in [9], to provide a specific amount of active power support requested by the TSO, a convex optimization problem is formulated to find the DER set-points that grant such support. In [10], providing the requested active and reactive power flexibility through coordination of DERs and battery systems is addressed as a linearized dynamic OPF model. An OPF-based approach is developed in [11] to obtain the DER set-points that optimize the distribution network operation, while also providing voltage support to the transmission network. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ Notwithstanding the necessity of approaches devised for fulfilling the TSO requests, a prior step towards establishing such a framework is to contrive to advise the TSO of the range of flexibility available at the distribution level. Such flexibility ranges can be regarded as available reserves when the planning or operation of the transmission network is formulated, as illustrated for example in [12]. To obtain these flexibility ranges, different approaches have been developed that evaluate the capability of a distribution network in providing power support at the HV/MV substation, through controlling the DERs. The majority of the proposed approaches are based on applying AC-OPF techniques [13]- [16]. In [17], a two-part AC-OPF framework is developed for quantifying the flexibility ranges, and within that framework, the computational burden and solution quality of three different formulations are compared for several networks. To apply an AC-OPF technique, however, a presumption has to be made that the consumption of network nodes is known to the DSO, which is not the case in many distribution networks. Besides, solving these AC-OPF problems entails a high computational time for real-scale distribution systems which confines their applicability mainly to the planning phase of the transmission network. That is why it is necessary to have approaches with real-time applicability to provide the TSO with an up-to-date estimation of flexibility ranges, which can improve the transmission network management in facing unexpected events in the network, e.g., forced outages. This estimation also facilitates the participation of distributed resources in the balancing market. The approach in [13] also does not consider the coordination of the DERs actions and only acknowledges the DERs local voltage limits for evaluating the reactive power flexibility ranges.
To account for the uncertainty of loads estimation in evaluating the support provision capability, scenario-generation-based approaches have been suggested in some studies [12], [18]. In [12], probable operating points of the system are simulated by varying the loads and production of DERs within their uncertainty range, and the power support is evaluated as probabilistic capability charts. The distribution network operation is studied under different scenarios in [18], and a linear robust optimization is adopted to extract a curve that characterizes the power support capability. The main concern over these scenario-generationbased approaches, as it has been emphasized in [18], is that the resulting evaluated capability is applicable only for the planning of the upstream transmission network as the generation and analysis of the scenarios are conducted during the evaluation process which takes a considerable amount of time. The other concern regarding these approaches is that they have made some simplifications that restricts their generality for practical problems, e.g., in [12], the DERs active power generation is excluded in the decision variables, or in [18], the network behavior is assumed to be linear over all of its operating range. This paper is aimed at evaluating the capability of a distribution network in providing active and reactive power support to the upstream transmission network in real-time, through coordinated control of DERs. To this end, a data-driven statistical model and a piecewise linear model of the distribution network are developed, in an offline phase, to establish a probabilistic criterion delineating the boundary of the region corresponding to the safe operation of the network. In the implementation phase, the developed network model is employed to formulate a series of optimization problems to determine the interdependent extrema of the active and reactive powers that can be provided at the HV/MV substation. Consideration of the region of safe operation, developed in the offline phase, as the feasible set of these optimization problems ensures the adherence to the distribution network and DERs operational constraints. The propounded optimization problems are of iterative linear programming type that can be solved efficiently in real-time. The efficacy of this approach is demonstrated through comparison with an AC-OPF technique.
The main contributions of this work are listed below. 1) A piecewise linear criterion is proposed to probabilistically represent the adherence to the network constraints through exploring different possible operating points of the network in an offline training phase. This technique obviates the need for cumbersome multi-scenario OPF simulations to evaluate the support provision capability.
2) The proposed probabilistic criterion allows the evaluation of the support provision capability in the absence of reliable estimation of the network nodal power consumption.
3) The evaluation of the support provision capability is formulated as a series of straightforward iterative linear programming problems that involve a low computational burden. Consequently, the support provision capability can be evaluated in real-time, and hence, be regarded as available reserves when the ongoing operation of the upstream transmission network is formulated.

II. METHODOLOGY
Evaluating the support provision capability can be addressed by employing an AC-OPF technique if the consumption of all network nodes is known. However, since generally, only limited measurements are available in most practical distribution networks, a reliable estimation of the nodal consumption is not attainable. To address the limitations concerned with the lack of a reliable load estimation, in this study, a probabilistic data-driven approach is adopted for evaluating the support provision capability of a distribution network in real-time, when only scarce measurements are available from the network. These measurements could be from the smart devices in the network, such as the DERs inverters, or the voltage/current/flow measurement units. The proposed approach is comprised of the training phase conducted offline, and the implementation phase conducted in real-time.
In the offline training phase (Section II-A, II-B, II-C), firstly, Monte Carlo simulations are performed to explore different operating points of the network by examining different nodal consumption and DERs generations. In each operating point, the adherence to network operational constraints is investigated, and the corresponding simulated values of the network variables that are being measured, referred to here as measurement variables, are recorded. Subsequently, the linear discriminant analysis (LDA) technique is employed to train a set of probabilistic models aimed at evaluating the adherence to the network operational constraints based on the real-time value of the measurement variables. These LDA models determine the boundary of the normal operation of the distribution system as a linear region in the space of the measurement variables. In this way, the LDA models make it possible to evaluate the probability of the violation of network constraints in real-time, based on only the online values of measurement variables and specifically without knowing the ongoing network nodal power consumption. Therefore, the LDA models are used to address the uncertainties associated with the network nodal power consumption. The results of Monte Carlo simulations are also employed to train a polynomial regression model that presents a piecewise linear estimation of the network equations. This model maps the region of the normal operation to the space of the DERs active and reactive power generations. Consideration of the DERs operational limits constricts the mapped region and yields the region of safe operation, within the boundary of which the DERs power generations can be varied without violating the network or DERs operational constraints.
In the implementation phase (Section II-D), using again the model for estimating the network equations, a series of optimization problems of linear programming type are formulated to determine the support provision capability. These optimization problems have the DERs active and reactive power generations as the decision variables, and the region of safe operation, obtained in the training phase, as the feasible set.

A. Formation of Probabilistic Models Representing Adherence to Distribution Network Operational Constraints
Generally, the system operation is subjected to some restrictive constraints, e.g., no over-or under-voltages at any nodes. Let oc k describe the status of the kth network operational constraint, e.g., if the voltage magnitude of a specific phase at a specific node is above the under-voltage statutory limit. In this way, oc k = 1, when this constraint is satisfied, and oc k = 0, otherwise. Assume that OC = [oc 1 , . . ., oc n oc ] T is a set that represents the status of all network constraints, with n oc denoting the total number of constraints. Suppose that n z variables of the network (e.g., the voltage magnitude of a specific phase at a specific node, the active power flow of a specific phase at a specific line, etc.) are being measured. These variables are referred to here as the measurement variables. Let Z = [z 1 , . . ., z n z ] T denote the value of measurement variables, indexed by r, at a given operating point of the network.
During the offline training phase, Monte Carlo simulations are conducted by applying different nodal consumption and different DERs generation set-points. Consequently, a set of observations representing different network operating conditions is provided. Each observation includes the simulated value of measurement variables and the corresponding OC set. Considering the variations of the measurement variables over all the observations, a region, i.e., Ω Z , is formed that encompasses the probable operating points of the system in the n z -dimensional space of the measurement variables.
In this study, following the work in [19], the adherence to each network constraint is considered as a probabilistic phenomenon, and LDA, as a widely used classifier, is employed to build a statistical model that evaluates the probability of such an adherence, i.e., oc k is 0 or 1. To do so, the measurement variables are treated as the classification features (explanatory variables of the models). Considering a specific network constraint, i.e., oc k , LDA assigns two separate multivariate Gaussian distributions to model the probability distribution function of the incidents associated with the satisfaction (f + k ) and violation (f − k ) of this constraint, as (1) and (2), respectively, where μ + k and μ − k are the mean vectors associated with each of the distributions, respectively, and Σ k is the covariance matrix of these distributions [20]. The parameters of these distributions, i.e., μ + k , μ − k , and Σ k , for all k ∈ {1, . . ., n oc }, are determined in the training phase.
Following the training, given a new value for measurement variables, i.e., Z * , the probability of the satisfaction and violation of the kth constraint, i.e., η + k and η − k , respectively, can be evaluated using the Bayes theorem as (see, e.g., [20]): where ρ + k and ρ − k denote the prior probability that an observation is associated with the class in which the kth network operational constraint is satisfied or violated, respectively. On this subject, when using the Bayes theorem in LDA, the common practice is to assume that the prior probability of being associated with each class is equal to the ratio of the number of observations in the training set belonging to that class to the total number of observations in the training set.
In order to ensure the satisfaction of each constraint, a confidence level, e.g., α k for the kth constraint, is considered such that η + k (Z * ) ≥ α k . This condition separates the space of measurement variables, i.e., Ω Z , into two half-spaces, i.e., oc + k and oc − k , each encircling the incidents where oc k is satisfied, or violated, respectively, with the given confidence level, i.e., α k . In this way, the boundary, i.e., Z b k , that determines the satisfaction of this constraint can be found by solving η + k (Z b ) = α k , which, by considering (1), (2), and (3), yields: Equation (4), which determines the boundary of the satisfaction of η + k (Z * ) ≥ α k , is linear with respect to Z b k . Therefore, LDA separates oc + k and oc − k with a hyper-plane, referred to as the Bayes decision boundary. This boundary is a flat subset of Ω Z with dimension n z − 1. For example, if n z = 2, this hyper-plane will be a straight line. This property of LDA is particularly of interest since, as will be shown, allows formulating the network operational constraints as linear constraints to the optimization problems aimed at determining the support provision capability. The operating condition of the distribution network is considered normal if all operational constraints are satisfied. Therefore, the interSection of oc + k , for all k ∈ {1, . . ., n oc }, forms a region with linear boundaries, i.e., Ω + Z , that represents the region of the network normal operation in the space of measurement variables. Fig. 1 a presents a schematic diagram elaborating how Ω + Z is determined.

B. Polynomial Regression to Estimate Sensitivity Matrices
Let n D denote the total number of DERs. Suppose G = (g m ) is a vector of length 2n D containing the decision variables, i.e., the DERs active and reactive power generations, as: where P d and Q d denote the active and reactive power generation of the dth DER, respectively. Note that (· m ) and (· r,m ) represent the mth element of a vector and the element of a matrix at the rth row and mth column, respectively. An operating point of the network is determined by the active and reactive consumption of all network nodes, i.e., Υ 0 , the DERs generations, i.e., G 0 , and the corresponding value of the measurement variables, i.e., Z 0 . Let ΔG = G * − G 0 , G * denote a new set of DERs generations, Δg be a ceiling considered for the variations of G around G 0 , and || || ∞ give the infinity norm. For a small variation in the DERs generations, i.e., ||ΔG|| ∞ ≤ Δg, the associated variations in the value of the measurement variables, i.e., ΔZ, can be estimated using a matrix of sensitivities, i.e., H = (h r,m ), as expressed in (6). Network sensitivity matrices have been widely used in similar studies, e.g., in [9], [21]- [23] To extract H, the partial derivatives of the power flow equations with respect to g m , for all m ∈ {1, .., 2n D }, can be employed. For example, if Z contains the voltage magnitudes of a subset of network nodes, H will be the inverse of a particular segment of the network's Jacobian matrix. Using (6), for G * , the corresponding value of measurement variables, i.e., Z * , can be evaluated using (7).
It will be discussed in Section II-D3 that for variations beyond the considered range, (7) is rewritten around a new set-point and the value of H is updated accordingly. Therefore, (7) provides a piecewise linear estimation of the network equations. In practice, however, Υ 0 is not known to the DSO, and hence, the aforementioned partial derivatives cannot be evaluated directly. To address this issue, different data-driven approaches have been proposed, e.g., in [9], [22], [23], to estimate H based on the available information from the network. One interesting study on this subject is [24], where it is demonstrated that a multivariate 2-degree polynomial can effectively approximate the aforementioned partial derivatives. In this order, for a given G 0 and Z 0 , each element of H, e.g., h r,m , is estimated as: where W 2 r,m , W 1 r,m , and W 0 r,m are matrices of proper sizes that constitute the coefficients of this polynomial. In a concise form, H can be expressed as a function, i.e., Θ Z , of Z 0 and G 0 , using (9). The coefficients in Θ Z are determined in the training phase, using the results of Monte Carlo simulations. To do so, for each simulated operating point, the elements of the network sensitivity matrix are evaluated, and the corresponding value of Z and G is recorded. Then, the least-squares estimation method is applied over the set of all observations.
The concept of network sensitivity matrix is also employed to make a piecewise linear approximation of how ΔG causes a change in the value of active and reactive power flows (from the distribution network towards the transmission network), at the HV/MV substation. Let P S and Q S denote these active and reactive power flows, respectively. Let also U P S and U Q S denote the corresponding network sensitivity matrices, respectively. The polynomial regression approach in [24] is used to approximate the value of the elements of these sensitivity matrices, based on the given G 0 and Z 0 , as: where ΔP S = P S − P S 0 and ΔQ S = Q S − Q S 0 denote the variations of P S and Q S from P S 0 and Q S 0 , respectively, P S 0 and Q S 0 denote the measured values of P S and Q S corresponding to the current operating point of the system, respectively, and Θ P S and Θ Q S are the polynomial functions to estimate U P S and U Q S , respectively. Similar to Θ Z , Θ P S and Θ Q S are formed in the training phase. Similar to (6), for variations beyond ||ΔG|| ∞ ≤ Δg, (10) and (11) are rewritten around a new setpoint and the values of U Q S and U P S are updated accordingly. These piecewise linear approximation models, expressed in (10) and (11), are used to formulate the objective functions aimed at determining the support provision capability curve, as will be discussed in Section II-D2.

C. Constructing the Feasible Region
As stated, to ensure the satisfaction of a specific operational constraint of the network, i.e., oc k , a confidence level, i.e., α k , is assumed such that η + k (Z * ) ≥ α k is held. Considering (3), this condition can equivalently be expressed as (12). By considering (1) and (2), and replacing Z * with H(G * − G 0 ) + Z 0 form (7), (12) can be written as (13), where A k and b k are defined in (14) and (15), respectively.
Equation (13) is linear with respect to G * . Therefore, the proposed model for estimating the network equations maps each network constraint onto the space of the decision variables as a linear constraint. The interSection of all the constraints of (13), for all k ∈ {1, . . ., n oc }, forms a linear region in the space of the DERs generation. Let denote this region by Φ + G . Generally, the operation of DERs is subjected to some restrictive limits associated with their capacity and the lead/lag regulations on their power generation. These limits are listed in (16), (17), and (18), where (16) describes the floor and cap for the active power generation of DERs, denoted by P d and P d for the dth DER, respectively, and (17) and (18) are concerned with the minimum lead and lag power factor limits, denoted by pf le d and pf la d , respectively.
The interSection of (16), (17), and (18) forms a linear region, denoted by Φ DER , in the space of the DERs generations. Therefore, the interSection of Φ + G and Φ DER G , i.e., Φ ++ G , forms a region with linear boundaries that represents the adherence to the operational constraints of the network and DERs. This region comprises the feasible set of a series of optimization problems, formulated in Section II-D2, to evaluate the support provision capability. Fig. 1(b) elaborates how Φ ++ G is determined by mapping Ω + Z into the space of the DERs generations and considering the DERs operational limits.
Depending on the network regulations and the type of the DERs, the DERs operational limits can be different from the ones considered here. In some cases, for example, when the power production of the DER is restricted by its apparent power capacity, these limits can be nonlinear. However, as it has been shown in similar studies, like [18], these nonlinear limits can effectively be approximated by a set of linear constraints. Therefore, the type of DERs operational limits will not influence the generality of the proposed approach.

D. Real-Time Extraction of Support Provision Capability Curve
To determine the support provision capability, one way is to separately evaluate the maximum capability of the provision of P s and Q s . However, TSO requires to regard the interdependency of P s and Q s . Therefore, to account for such an interdependency, the maximum active power injection, i.e., max P S , and active power absorption, i.e., min P S , are evaluated as a function of Q S . In this way, a closed curve, referred to as the support provision capability curve (SPC-CURVE), is achieved encircling all possible values of P S and Q s that can be supported by changing the DERs set-points.

1) Specification of Support Provision Capability Curve:
To extract the SPC-CURVE, firstly, regardless of the value of P S , the maximum reactive power injection, i.e., Q + S = max Q S , and the maximum reactive power absorption, i.e., Q − S = min Q S , are evaluated. In this way, two points are determined in the Q S -P S plain. Let these points be denoted by O + Q S and O − Q S , respectively. Then, for n J different values of Q S , i.e., Q j S for j ∈ {1, 2, . . ., n J }, distributed uniformly between Q − S and Q + S , the corresponding value of the maximum active power injection, i.e., {P j+ S = max P S s.t. Q S = Q j S }, and maximum active power absorption, i.e., {P j− S = min P S s.t. Q S = Q j S }, are evaluated. In this way, for each Q j S , two points are determined in the Q S -P S plain. Let denote these points as O j+ P S and O j− P S , respectively. The points that are determined in this order, i.e., . ., n J }, are referred to as the support provision capability points (SPC-POINTs). Now, the SPC-POINTs can be linearly interpolated to delineate the SPC-CURVE. Fig. 2 illustrates how the SPC-POINTs and the SPC-CURVE are extracted. Note that the selection of the number of middle points, i.e., n J , depends on the accuracy and solution time that suit the application. Next, the objective functions to determine the SPC-POINTs are elaborated.
2) Objective Functions to Evaluate Support Provision Capability Points: As stated, in the first step of extracting the SPC-POINTs, Q + S and Q − S are to be evaluated. To this end, two separate optimization problems, with max Q S and min Q S as the objective functions, respectively, are formulated. In these problems, G comprises the optimization variables, and the feasible set is confined to Φ ++ G to ensure the satisfaction of the DERs and network operational constraints. To formulate these objective functions, the piecewise linear approximation that relates ΔG to ΔQ S , expressed in (10), is employed, which results in (19) and (20) for determining Q + S and Q − S , respectively. Equations (19) and (20) are linear in terms of G and can effectively be solved using available linear programming techniques, e.g., the dual-simplex approach.
Following the determination of Q + S and Q − S , P j+ S and P j− S are to be determined for different values of Q s between Q + S and Q − S , i.e., Q j S ∀j ∈ {1, . . ., n J }. To do so, similar to (19) and (20), (21) and (22) are formed by considering (10) and (11). These relationships are also linear in terms of G and can be solved using linear programming techniques.
3) Iterative Solution: Since a piecewise linear estimation of the network equations has been used to establish the network constraints, i.e., (13), and evaluate the extrema of Q S and P s , i.e., (19), (20), (21), and (22), these relationships are accurate only if ||ΔG|| ∞ ≤ Δg. However, to determine the maximum support, it is likely that ΔG should take larger values. This issue can be resolved effectively by solving the optimization problems in an iterative way. To do so, in each iteration, the considered ceiling, i.e., Δg, is imposed on the variations of the DERs generations in that iteration, as ||ΔG i || ∞ ≤ Δg, where i stands for the iteration number, ΔG i = G i − G i−1 , and G i represents the DERs generations evaluated in the ith iteration for i ≥ 1, and the initial DERs generations for i = 0. The values of H, U Q S , and U P S in each iteration, i.e., H i , U Q S i , and U P S i , are updated based on the values of G and Z in the previous iteration, i.e., G i−1 and Z i−1 , as: where Z i denotes the value of the measurement variables corresponding to the ongoing operating for i = 0, and the simulated values for i ≥ 1 (using (7)).
In each iteration, i.e., i, the values of A k and b k , i.e., A k i and b k i , respectively, are calculated based on H i , which is the value of H in that iteration. As a result of restricting the variations of G in each iteration, it may not be possible to sat- in the early iterations when solving (21) and (22). To resolve this issue, (21) and (22) are reformulated for the ith iteration as (26) and (27), respectively, where γ is a relatively large number to ensure that S } tends to zero as iterations go forward. Note that (26) and (27) can easily be transformed into linear programming problems by replacing the term with the absolute value operator, i.e., with an auxiliary variable, i.e., y, and introducing two new linear constraints as For each objective function, the iterations continue until the improvement of the objective function with respect to the previous iteration becomes less than a small threshold, i.e., . Figs. 3 and 4 present the flowcharts of the algorithms applied to calculate Q − S and P j− S , respectively. These flowcharts can also be used to calculate Q + S and P j+ S , respectively, by replacing "argmin" with "argmax," and changing the sign of γ.

A. Study System
The IEEE 123-node test feeder [25], depicted in Fig. 5, is considered as the test system. With a nominal voltage of 4.16 kV, this system contains 3-phase, 2-phase, and single-phase overhead lines, and 3-phase underground cables. Three DERs, with a nominal capacity of 3000 kVA, are integrated into this system, i.e, D83, D104, and D113. To simulate under-voltage and over-voltage operating conditions, the nominal active and reactive consumption of the network loads is multiplied by a factor of 4, and the regulators are assumed to be operated in the fixed-step mode.
Six different case studies, i.e., CS 1 to CS 6 , are simulated. In CS 1 , among network constraints, only the over-and undervoltage constraints are considered. The available measurements include P s , Q s , and the voltage magnitude of all three phases at the point of common coupling of each DER. In addition, 300 (kw or kvar) is chosen for Δg, and a relatively conservative value, i.e., 0.8, is considered for α k ∀k. CS 2 is similar to CS 1 , only a lower value, i.e., 0.6, is chosen for α k ∀k. In CS 3 , besides the voltage constraints, the thermal limits of network segments are also considered. CS 4 is formed by including 4 extra measurement variables, i.e., 3-phase voltage magnitudes at node 107 and the voltage magnitude of phase a at node 119. These measurements are located near to some of the critical nodes, where under-voltage conditions are likely to happen. CS 5 represents a fully observable network where voltage magnitudes at all critical nodes are measured. In CS 6 , the restriction on the variations of G in each iteration is ignored by setting a very large value for Δg. This is equivalent to assuming that the network's behavior is linear over all the operating range of the DERs. Table I describes the defined case studies.

B. Monte Carlo Simulations
To explore different network operating conditions, 20000 Monte Carlo simulations are conducted, which yields a set of 20000 data samples for training and test. Each sample is composed of the active power consumption and the power factor of network loads, the active power generation and the power factor of DERs, and the resultant nodal voltages achieved by running a load flow study. In each simulation, the active power generation and the power factor of DERs, together with the power factor of the loads are generated randomly: the active powers from 0 to the nominal value, and the power factors from 0.9 lag to 0.9 lead. By studying the Irish standard load profiles, developed by ESB networks (the Irish DSO) for the Irish market [26], the active power consumption of network loads is generated randomly from a Gaussian distribution with 0.6 and 0.5 pu as the mean and relative standard deviation, respectively, where a correlation of 30% is considered between the consumption of each pair of loads.
The over and under-voltage statutory limits of the network are assumed to be 1.05 and 0.95, respectively. Based on [25], 530 Ampere is considered as the ampacity of over-head lines. The same value is assumed for the ampacity of other segments. If the voltages of all nodes and the currents of all segments are within the associated limits, the network operating condition TABLE II NETWORK CONSTRAINTS FOR CRITICAL NODES AND SEGMENTS * V n = max{V a n , V b n , V c n }, V n = min{V a n , V b n , V c n } I n−m = max{I a n−m , I b n−m , I c n−m } V φ n : voltage magnitude of phase φ at bus n I φ n−m : current magnitude of phase φ of segment between buses n and m.
is regarded as normal, otherwise, it is regarded as an overvoltage, under-voltage, or over-current condition. Half of the data samples, i.e., 10000 samples, are employed to train the LDA and network characterizing models. Let Γ 1 denote the set that includes these samples. Let Γ 2 denote the set that includes the remaining samples. Γ 2 is used to test the performance of the LDA models.

C. Training LDA Models
For each node, two LDA models can be trained, one for the recognition of the under-voltage condition and the other for the over-voltage condition at that bus. In addition, for each network segment, one LDA model can be trained for the recognition of the violation of thermal limits on that segment. Subsequently, the associated μ − k , μ + k , and Σ k , and then A k and b k are calculated according to (14) and (15). Corresponding to each model, a constraint in the form of A k G ≥ b k is considered for the optimization problems. However, studying the nodal voltages of the training samples in Γ 1 reveals that to assess the under-voltage condition, it is sufficient to study the voltage of 7 critical nodes, including nodes 78, 82, 107, 112, 115, 119, and 122, since in all samples, always the voltage of one of these nodes has the lowest value. These nodes comprise the end nodes in the Sections of the feeder that are heavily loaded. Similarly, the over-voltage critical nodes are the nodes the voltages of which have the highest value in at least one of the samples. These nodes include nodes 90, 108, and 123.
The receiver operating characteristic (ROC) analysis is a prominent tool to evaluate the performance of classifiers in the training phase [27]. Here, this analysis is employed to assess the success in training the LDA models and is applied to the training samples (set Γ 1 ). In ROC analysis, difference values, from 0 to 1, are considered as the confidence level, i.e., α k for oc k , and the fraction of correct classification is depicted versus the fraction of incorrect classification. The resultant curve is referred to as the ROC curve. In this order, the area under the ROC curve, i.e., AU C, reflects the overall performance of the classifier [27], such that AU C = 1 indicates a perfect classifier, while smaller values show retreating from an ideal classifier, e.g., AU C = 0.5 represents a total random classifier. Table III presents the evaluated AU C for each network constraint and each critical node/segment, for CS 3 . As seen, for all critical nodes/segments, AU C is quite close to 1 demonstrating the success in training the LDA models. Another index to evaluate the performance of classifiers is the misclassification percentage of the classifier on test samples, where α = 0.5 is regarded as the confidence level. This index is also employed to assess the performance of the developed LDA models on test samples in Γ 2 . Table III presents the results. As seen, the misclassification percentage is low for all critical nodes/segments demonstrating that the LDA models perform desirably for unseen data samples.

D. Training Polynomial Regression Models to Estimate Network Equations
In order to find the coefficients of the polynomial regression models which estimate the network equations, i.e., to form Θ Z , Θ Q s , and Θ P s , for each data sample in Γ 1 , the partial derivatives of the intended parameter (Z, Q S , and P S , respectively) with respect to the DERs generations are evaluated at the point of the convergence of the power flow equations. Then, {Z 0 , G 0 } are assumed as the explanatory variables, and the evaluated values of the partial derivatives are considered as the response variables to the multivariate polynomial models. Subsequently, the least square estimation approach is employed to find the parameters of these polynomials.

IV. RESULTS AND DISCUSSION
The set Γ 2 , which was used to examine the performance of the LDA models, includes data samples with normal and abnormal operating conditions. Given that the distribution network, logically, cannot provide support if it is experiencing operational problems, evaluating the support provision capability is only  Fig. 6. The SPC-CURVE obtained using the proposed approach in CS 1 , compared to the one obtained using an AC-OPF approach, for one of the test samples. Positive values of P S and Q S stand for injection to the upstream network at the substation. meaningful for samples with the normal operating condition. Therefore, Γ 3 is constructed by including all samples in Γ 2 with the normal operating condition. The samples in Γ 3 are, hence, regarded for determining the SPC-CURVEs. To have a balance between accuracy and speed, the number of points between Q − S and Q + S , i.e., n J , is chosen equal to 4. Using the developed approach, the SPC-POINTs, and subsequently, the SPC-CURVEs are determined for all the test samples in Γ 3 , in all the defined case studies. The proposed approach is coded in Matlab R2019a and simulations are performed on a PC with an Intel(R) Core(TM) i7-8700 K 3.7 GHz CPU and 16 GB of RAM. The average and maximum simulation times, as well as the average number of iterations to extract the SPC-CURVE over all test samples in Γ 3 , are presented in Table IV. As seen, it takes less than 2.2 seconds for the proposed approach, in the worst case, to evaluate the SPC-CURVE. These results verify the real-time applicability of the proposed approach.
To establish a benchmark for evaluating the performance of the proposed approach, an AC-OPF approach is also applied on the test data samples to determine the SPC-CURVE. To apply the AC-OPF approach, as opposed to what is considered in the proposed approach, the nodal consumption, i.e., Υ 0 , used to generate each particular test sample is presumed to be known. Fig. 6 compares the SPC-CURVEs obtained using the developed approach in CS 1 and the AC-OPF approach applied without considering the network thermal constraints, for one of the test samples. As seen, the curve of the proposed approach is similar in shape to that of the AC-OPF approach, except that it provides a more conservative evaluation of the support capability. This conservation was expected since a conservative value, i.e., 0.8, was chosen for α in CS 1 . The main constraints that have confined r Regarding O 4− P S , the confining constraints are different for the proposed and AC-OPF approaches: under-voltage limits for the former, and DERs operational limits for the latter. Fig. 7 compares the SPC-CURVEs obtained in CS 2 , CS 3 ,..., and CS 6 , to that obtained in CS 1 , for the same test sample of Fig. 6. As seen in Fig. 7(a), with α = 0.6 (in CS 2 ), as expected, the SPC-CURVE expands and nearly sits on the SPC-CURVE of the AC-OPF approach. Therefore, with α = 0.6, a more optimistic estimation of the SPC-CURVE is achieved. Fig. 7(b) compares the SPC-CURVE obtained in CS 3 , to that obtained in CS 1 , and also to those obtained with the AC-OPF approach with and without the consideration of the network thermal limits. The first observation is that the proposed method is able to successfully provide a conservative estimation for the SPC-CURVE when the network thermal limits are considered in addition to the network voltage limits. The second observation is that the inclusion of thermal limits has capped the provisioning capability for the injection of P S and absorption of Q S . This is because both scenarios require the DERs to generate their maximum active power, which brings about congestion in some network segments. Note that the slight difference in the lower part of the SPC-CURVEs obtained in CS 1 and CS 3 is originated from the fact that only 4 points have been considered between Q − S and Q + S to depict the SPC-CURVEs. If more points are included, both of these curves will have the same shape in that part. Considering Fig. 7(c), with the introduction of extra measurements in CS 4 and CS 5 , the estimation of the proposed approach gets closer to the results  of the AC-OPF approach. Regarding Fig. 7(d), it can be seen that employing a linear model for the network over all operating range of the DERs (as in CS 6 ), instead of using a piecewise linear model (as in CS 1 ), results in an inaccurate estimation of the SPC-CURVE. This highlights the importance of taking an iterative approach towards solving the optimization problems.
To further study the performance of the proposed approach when extra measurements are provided, Fig. 8 presents the SPC-CURVEs achieved for all test samples in Γ 3 , using the AC-OPF approach, and the proposed approach in CS 1 , CS 4 , and CS 5 . As seen, the SPC-CURVEs have the same trapezoidal shape for the majority of the samples, however, as more measurements are provided (in CS 4 and CS 5 ), as expected, the size of the trapezoids gets closer to that of the AC-OPF approach.
To analyze the efficacy of the proposed approach, the distance of the proposed approach SPC-POINTs to the AC-OPF SPC-CURVE is calculated. For O − Q S , this distance is defined as the difference between Q S − of the proposed approach and Q S − of the AC-OPF approach. This difference is then expressed as a percentage of the Q-length of the AC-OPF SPC-CURVE, which is defined as Q + S − Q − S . The same concept is applied to O + Q S , O j+ P S , and O j− P S , only for O j+ P S and O j− P S , the differences are expressed as a percentage of the P -length of the AC-OPF SPC-CURVE, which is defined as {max j P j+ S − min j P j− S }. Fig. 9 shows how these distances are calculated for each SPC-POINT. The distributions of these distances over all test samples, separately for each SPC-POINT, are depicted in Fig. 10. The following conclusions can be made: r In CS 1 , as expected, the proposed approach has provided a conservative estimation of the SPC-POINTs, given that the distances are mainly distributed in the positive part of the vertical axis of Fig. 10.
r As expected, the range of the variations of the distances is lower in CS 2 . However, for some points, especially O 1− P S and O 2− P S , the capability of support provision has been over-estimated as the distances are partly distributed in the negative part of the vertical axis.
r The distances are slightly larger in CS 3 compared to those in CS 1 . This is basically because these distances have been evaluated as a percentage of the P -length or Q-length of the AC-OPF SPC-CURVE, and given that these lengths are smaller when the network thermal limits are included, larger percentages have been achieved in CS 3 .
r With the inclusion of extra measurements in CS 4 and CS 5 , the distances have been reduced, especially in CS 5 .
r The distribution of distances is more widely dispersed in CS 6 compared to the achieved distributions in other case studies, which suggests that without considering a legitimate value for Δg, the estimations of the approach may be inaccurate. Section V provides more details on selecting the value of Δg.
. This is mainly originated from the fact that for the former and latter points, the network over-voltage and under-voltage limits, respectively, are the confining constraints, and since the measurement nodes are electrically closer to the critical nodes where over-voltages happen, the LDA model has a higher level of accuracy in regard to modeling the overvoltage condition. Since the DERs active and reactive power generations comprise the optimization variables, for each SPC-POINT, the proposed approach also delivers the DERs set-points that grant the evaluated support capability. To validate that these DERs setpoints will not result in the violation of the network operational constraints if they are implemented, for each test data sample and each SPC-POINT, the DERs set-points, together with the loading scheme used to generate each particular test sample (assumed to be unknown for the proposed approach), are employed to run a power flow simulation and calculate the nodal voltages. Afterward, the minimum and maximum values of the nodal voltages (on all the phases), i.e., V and V , are calculated. To quantify the violation of network under-and over-voltage limits, UV and OV , respectively, are defined as:  Now, the RMS value of UV and OV over all test samples in Γ 3 , i.e., UV and OV , are considered as two indices to assess the performance of the proposed approach. These indices are evaluated for all SPC-POINTs in each case study. Fig. 11 presents the results. The following conclusions can be made.  r Under-voltages mostly occur in regard to evaluating O − , while over-voltages mostly occur for O 2+ P S , O 3+ P S , and O 4+ P S . This is because the network under-voltage and over-voltage limits are the confining constraints regarding the former and latter points, respectively. As presented in Fig. 10(b), with α k = 0.6 in CS 2 , the approach overestimates the support provision capability for some of the SPC-POINTs, as the distribution of distances has taken negative values. As an example, consider SPC-POINTs O 2− P S and O 4+ P S . According to Fig. 11, if the DERs set-points offered by the approach in this case study are implemented, the system may experience under-or over-voltages, regarding O 2− P S and O 4+ P S , respectively. Fig. 12 Fig. 12, UV or OV increases as the distance takes larger negative values. Also, there are some limited samples for which the distance is slightly negative, while UV or OV is equal to zero. This effect vanishes by reducing Δg, i.e., increasing the accuracy of the network piecewise linear model (see Section V).
Flexibility service providers can also effectively be considered in the evaluation of support provision capability via the proposed approach. Given that this method is specifically developed for real-time applications, services like energy storage systems are modeled in the same way as an available resource, and their maximum active power generation, i.e., P d in (16), is set based on their capacity and state of charge at the moment. Regarding demand-side flexibility services, only balancing flexibility services, e.g., deferrable and interruptible loads, are of importance in the short-term. Each of these services can be modeled as a resource. However, considering the response rate of these services, their maximum contribution, i.e., P d , is regarded as stochastic with a known probability density function (PDF). Using these PDFs, the point estimation method can be applied to extract the PDF of each SPC-POINT, similar to the method adopted in [28] for quantifying the uncertainty of transfer capability. Subsequently, the extracted PDFs can be employed to calculate the mean value for each SPC-POINT or the value corresponding to probabilities larger than a threshold.

V. ADJUSTING HYPER-PARAMETERS
As shown, the values chosen for α k and Δg, which can be regarded as the hyper-parameters of the proposed approach, have a noticeable impact on the accuracy and simulation time. In this Section, different criteria are introduced for adjusting the values of these hyper-parameters.
Regarding α k , choosing a very large value can result in the under-estimation of the providable flexibility, while a small value may bring about the risk of over-or under-voltages. To study how changing α k can affect the estimations provided for the SPC-POINTs, the Lagrange multipliers achieved through solving the linear programming problems can be employed. As an example, consider point O − Q S , where Q − s is to be found using (20). Let G * denote the optimal value found for G. If in the last iteration, none of the constraints associated with the network operation, i.e., A k G ≥ b k , has been activated, it shows a zero sensitivity for Q − s with respect to α k . If, however, the kth constraint has been activated, the associated Lagrange multiplier, denoted by λ k , gives the partial derivative of the optimal objective value, i.e., Q − S , with respect to b k , as λ k = ∂Q − s /∂b k | G=G * . In addition, considering (15), we have, ∂b k /∂α k = 1/(α k (1 − α k )). Therefore, the sensitivity of Q − S with respect to α k can be found as (30).
To assess the risk of the violation of network constraints involved with a certain value of α k , one way is to compare the probability of satisfaction of each network constraint, calculated by the associated trained LDA model using (1), with the actual value of the violation of that constraint for the training samples. As an example, consider the over-voltage constraint associated with bus 90, i.e., V 90 ≤ 1.05. Fig. 13 depicts the calculated probability of the satisfaction of this constraint, i.e., η + , versus the actual value of V 90 , for all training samples in Γ 1 . As seen, none of the samples with η + ≥ 0.8 have an over-voltage  problem, while some of the samples with η + ≥ 0.5 have a minor over-voltage problem. Therefore, if the α associated with this constraint is chosen equal to 0.8, it is expected that the set-points offered by the proposed approach do not result in over-voltages at this bus. However, with α = 0.5, minor over-voltages are likely to occur.
Regarding Δg, while better accuracy is achieved with a smaller Δg, more iterations are necessary to find the SPC-POINTs. This hyper-parameter is concerned with the range of the variations of G within which the values of network sensitivity matrices, i.e., H, U Q S , and U P S , do not change noticeably. As an example, Fig. 14 shows the variations of ||H|| 2 versus ||ΔG|| ∞ , when the elements of G are varied randomly around the DERs setpoints (within the DERs operational limits), for one of the training samples. Note that the variations of ||H|| 2 are presented as a percentage of ||H 0 || 2 as Δ||H|| 2 = 100(||H|| 2 − ||H 0 || 2 )/||H 0 || 2 , where || || 2 denotes the 2-norm, and H 0 stands for the value of H with the original DERs set-points, i.e., G 0 . As seen, when ||ΔG|| ∞ is below 300 (kW or kvar), Δ||H|| 2 is close to zero. However, as ||ΔG|| ∞ grows, Δ||H|| 2 may take very large values.

VI. CONCLUSION
A probabilistic data-driven approach was proposed to evaluate the support provision capability of a distribution network. To do so, a set of LDA models are trained to probabilistically represent the adherence to network statutory limits. A polynomial regression model was developed delivering an estimation of the network equations. By employing these models, evaluating the support provision capability was formulated as a series of iterative linear optimization problems granting a closed curve that delineates the interdependent extrema of active and reactive powers that can be offered at the HV/MV substation. With this formulation, the support provision capability can be evaluated in real-time, and hence, be regarded as available reserves in the ongoing operation of the transmission system. Given that this approach does not rely on the estimation of network nodal power consumption to evaluate the support provision capability, it is of particular interest in practical distribution systems. By defining different indices, it was shown that the SPC-CURVEs obtained by this approach match those obtained by an AC-OPF approach developed as a benchmark, although the former did not have access to the nodal power consumption. This comparison validated the piecewise linear approximation proposed for modeling the adherence to network operational constraints.
In this paper, it was demonstrated that with the employment of certain data-driven models, it is viable to establish a piecewise linear boundary delineating the region of the safe operation of an unbalanced distribution network. The linearity of this boundary makes it viable to achieve a straightforward formulation of the problem. As a consequence, in the implementation phase, evaluating the support provision capability can be conducted in real-time. In regard to evaluating the support provision capability, the extrema of active and reactive powers are interdependent. Therefore, this evaluation, inherently, involves solving a series of optimization problems, the multiplicity of which escalates the problem complexity. This escalation in complexity vindicates the necessity of employing data-driven approaches which can effectively deal with the computational burden of this problem by conducting the required simulations in an offline phase.
Alireza Nouri (Member, IEEE) received the Ph.D. degree in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 2016. He is currently a Senior Energy Lab Researcher with the Energy Institute and the School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland. His current focuses on power system optimization and control.
Valentín Rigoni received the B.Eng. degree from the Universidad Nacional de Córdoba, Córdoba, Argentina, in 2014, the M.Sc. degree from the Politecnico di Torino, Turin, Italy, in 2014, and the Ph.D. degree from University College Dublin, Dublin, Ireland, in 2020. He is a Modeling and Optimization Engineer with Kevala, Inc. His research interests include distribution systems planning and operation, smart grids and distributed energy resources integration, and forecasting.
Andrew Keane (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from University College Dublin (UCD), Dublin, Ireland, in 2007. He is currently a Professor and the Director of the Energy Institute with UCD. His research interests include power systems planning and operation, distributed energy resources, and distribution networks.