Probabilistic Load Flow Based on Parameterized Probability-Boxes for Systems With Insufficient Information

The increased penetration of intermittent renewable energy sources and random loads has introduced many uncertainties into the power system, and it is essential to analyze the effect of these uncertain factors on system behavior. This study presents a new approach based on probability-boxes (p-boxes), which account for these uncertainties by combining interval and probability models. The proposed method is appropriate for problems with insufficient information. In this paper, the uncertainty in distribution functions is modeled according to the influence of natural factors such as light intensity and wind speed. First, the p-box load flow problem is studied using an appropriate point estimation method to calculate the statistical moments of the probabilistic load flow (PLF) outputs. Then, the Cornish–Fisher expansion series is used to obtain the probability bounds. The proposed approach is analyzed on the IEEE 14-bus and IEEE 118-bus test systems, which include loads, solar farms, and wind farms as p-box input variables. The obtained results are compared with the double-loop sampling (DLS) approach to show the proposed method's precision and efficiency.


I. INTRODUCTION
The load flow (LF) problem is widely used in electric power system studies, such as generation scheduling and operation. The LF problem involves solving non-linear equations, and several reliable techniques, such as the Fast-Decoupled and Newton-Raphson methods [2], have been proposed to solve it. However, the input data of the LF problem are subject to uncertainty from several causes, such as power forecast errors [3]-[5] and measurement errors in grid parameters. Ignoring these uncertainties leads to errors in analyzing the behavior of the power network. So far, three general methods have been proposed to address these uncertainties in the LF problem: probabilistic, fuzzy, and interval methods.
In probabilistic methods, it is assumed that the input data obey precise probability distributions and that the parameters of the distribution functions are precisely specified. Three different approaches are proposed to solve the PLF [6]: the numerical method, the analytical method, and the approximation method. Monte Carlo simulation (MCS) is the most widely used numerical method and provides accurate results [7]. Nevertheless, MCS is time-consuming because it requires analyzing many samples to obtain accurate results. The analytical techniques rely on assumptions and simplifications of the LF equations [8], [9]. Linearization can significantly reduce the computational burden, but the simplifications cause larger errors than MCS. In approximation techniques, the PLF is studied by employing deterministic methods. They are faster than numerical and analytical methods [10], [11].
The associate editor coordinating the review of this manuscript and approving it for publication was Siqi Bu.
The uncertainty in renewable energy sources (RESs) and loads is typically complicated, and obtaining accurate data about the associated probability distributions is challenging. To overcome this problem, the interval approach can be used to specify the range of the outputs. In the interval approach, the uncertainty in an input variable is expressed as an interval. Interval calculations are performed in two ways: interval arithmetic or affine arithmetic. Several techniques have been suggested to solve the LF based on interval arithmetic. They largely solve the non-linear equations through iterative approaches such as Newton's technique [12] and the Krawczyk-Moore technique [13]. These techniques are not well suited to non-linear equations because the intervals widen excessively when the dependency among intervals is ignored [14]. To address this shortcoming, affine arithmetic has been suggested. In affine arithmetic, an interval is represented as a central value plus weighted partial deviations, each associated with an independent uncertainty source. Because the relationship among intervals is preserved, the precision of the interval calculation is improved [15].
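The interval widening mentioned above can be illustrated with a minimal Python sketch. The `Interval` and `Affine` classes below are hypothetical helpers written for this illustration only (they are not taken from [12]-[15]) and implement subtraction alone. Subtracting an interval from itself should give zero, but plain interval arithmetic forgets that both operands are the same quantity, whereas the affine form keeps the shared noise symbol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __sub__(self, other):
        # Plain interval subtraction treats the operands as unrelated.
        return Interval(self.lo - other.hi, self.hi - other.lo)


@dataclass(frozen=True)
class Affine:
    center: float
    dev: tuple  # partial deviations, one per shared noise symbol

    def __sub__(self, other):
        # Deviations attached to the same noise symbol cancel term by term.
        # Both operands are assumed to use the same ordered noise symbols.
        return Affine(self.center - other.center,
                      tuple(a - b for a, b in zip(self.dev, other.dev)))

    def to_interval(self):
        radius = sum(abs(d) for d in self.dev)
        return Interval(self.center - radius, self.center + radius)


x_int = Interval(0.9, 1.1)      # x in [0.9, 1.1]
x_aff = Affine(1.0, (0.1,))     # x = 1.0 + 0.1 * eps1, eps1 in [-1, 1]

diff_interval = x_int - x_int               # widened: roughly [-0.2, 0.2]
diff_affine = (x_aff - x_aff).to_interval()  # dependency kept: [0, 0]
print(diff_interval, diff_affine)
```

The same effect compounds over the many operations of an iterative LF solution, which is why the resulting interval bounds can become overly conservative.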
The main disadvantage of the probabilistic model is that the parameters of the probability distribution functions (PDFs) must be precisely determined beforehand. However, constructing an accurate probability distribution is difficult, so the use of probabilistic approaches is not always possible. In the interval LF, only the upper and lower bounds of the input uncertainties are determined, so the modeling is easy and the obtained results are intuitive. However, the probabilistic structure of the input uncertainties is not considered, which causes the probabilistic behavior of the variables to be ignored.
The fuzzy LF is based on fuzzy set theory [16], [17], in which the uncertain variables are represented by possibility distributions and the possibility distribution of the LF state is computed. However, it is not easy to use directly in the LF problem. In short, the fuzzy LF is difficult to model; therefore, its application is limited.
In recent years, a model that combines probability and interval descriptions, known as the p-box, has been developed to represent uncertainties [18], [19]. In this model, the uncertainty in a random variable is described by the upper and lower bounds of its cumulative distribution function (CDF). Because the p-box model combines interval and probability models to express the uncertainty of a stochastic variable, it can deal efficiently with problems that lack adequate data. P-boxes are divided into two types, parameterized and non-parameterized [20]. A parameterized p-box represents all feasible distributions that result from a specified distribution function whose parameters are given as intervals. In other words, for a parameterized p-box, the distribution type of a random variable is known beforehand, but some of its distribution parameters can only be given as intervals due to insufficient data. A non-parameterized p-box contains all feasible non-decreasing distributions lying between its lower and upper CDFs. Theoretically, the interval-valued distribution parameters can be specified simply by employing the interval estimation method [19]. In Table 1, the p-box method is compared with the other methods. According to this table, p-boxes are simple to model and can be utilized in problems without sufficient information.
In this paper, the uncertain LF is studied based on the parameterized p-box to consider the uncertainties of loads and RESs. A new PLF model is presented that combines interval and probability approaches to avoid the need for exact CDFs of the PLF inputs in cases where historical data are insufficient. The proposed approach includes two procedures: (i) statistical moment estimation and (ii) obtaining the probability bounds. The moments of the PLF outputs are calculated by applying the point estimation method, and the Cornish-Fisher series is employed to obtain the CDFs of the outputs. As a result, the proposed method estimates the statistical moments faster than the double-loop sampling (DLS) method, and the probability bounds are obtained with high accuracy.
The proposed method is numerically studied, and the results obtained by the present method are compared with the DLS approach in the standard IEEE 14-bus and IEEE 118-bus systems.
This paper is organized as follows. The problem definition is presented in Section II. The proposed algorithm is explained in Section III. The numerical results for the IEEE 14-bus and IEEE 118-bus systems are discussed in Section IV, and the conclusion is provided in Section V.

II. PROBLEM DEFINITION

A. BASIC CONCEPTS
In many practical problems, adequate data are not available to obtain precise probability distributions of the input variables. In these cases, the p-box model can be applied to represent the uncertainty in a variable. The p-box of a random variable $x$ is specified by its lower and upper bounds as
$$\underline{F}(x) \le F(x) \le \overline{F}(x)$$
where $\underline{F}(x)$ is the lower bound of the p-box and $\overline{F}(x)$ is the upper bound of the p-box.
The imprecision in the parameters of the distribution function is described using the interval model. The parameterized p-box of a random variable $x$ is expressed as [20]:
$$\left\{ F(x, \theta) : \theta \in [\theta^L, \theta^R] \right\}$$
where $F(x, \theta)$ is the CDF of the random variable, and $\theta$ contains all distribution parameters that are defined as intervals. Also, $L$ and $R$ denote the lower and upper bounds of an interval value, respectively. The parameterized p-box of a normal random variable with an exact standard deviation $\sigma$ and an imprecise mean $\mu \in [\mu^L, \mu^R]$ is expressed as:
$$\left\{ \Phi\!\left(\frac{x - \mu}{\sigma}\right) : \mu \in [\mu^L, \mu^R] \right\}$$
where $\Phi(\cdot)$ is the standard normal CDF.
The LF problem is the most important tool in power system studies for calculating the bus voltage angles, voltage magnitudes, etc. The LF equation is defined as follows:
$$W = g(X)$$
where the vectors $X$ and $W$ are the input and output LF variables, respectively, and $g(\cdot)$ denotes the non-linear LF equations. According to the above equation, the stochastic behavior of the input variables causes the stochastic behavior of the LF outputs. The deterministic LF does not consider uncertainties in input data such as power generation and consumption. Therefore, probabilistic, interval, and fuzzy approaches should be applied to consider the uncertain parameters in the LF. In the PLF model, the distribution functions of the input data must be precisely specified beforehand, but accurate data are often not available to calculate the exact distributions of the input variables. In the interval approach, only the upper and lower bounds of the inputs are specified, so the probabilistic structure of the variables is not considered, which makes the modeling of the variables unrealistic. Finally, the application of the fuzzy LF is limited because it is hard to model. Therefore, the p-box can be employed to specify the inputs of the LF problem.
The purpose of the p-box analysis is to obtain the output bounds given p-box inputs. In the p-box model, the inputs are p-boxes, so the outputs are also p-boxes. Assume that the vector $X = (x_1, x_2, \ldots, x_m)^T$ represents an m-dimensional vector of independent input random variables of the LF problem, which are defined by the distribution functions $F_X(x, \theta)$, where $F_X(\cdot)$ is the CDF of a random variable and $\theta$ contains all interval distribution parameters.
The p-box analysis needs to calculate the lower and upper bounds of the CDFs of the LF outputs, denoted $\underline{W}_X$ and $\overline{W}_X$, respectively. These bounds include all feasible CDFs of the outputs as $\theta$ varies. In this paper, the probability bounds of the PLF outputs are obtained with high accuracy by employing the point estimation method to calculate the statistical moments and the Cornish-Fisher series to calculate the CDFs.
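As an illustration of the parameterized p-box of a normal variable with an exact standard deviation and an interval-valued mean, the following Python sketch evaluates the lower and upper CDF bounds at a single point. The function name and the numerical values are assumptions chosen for this example only. Because the normal CDF decreases as the mean increases, the upper bound is attained at the lower end of the mean interval and the lower bound at its upper end.

```python
from statistics import NormalDist


def normal_pbox_cdf_bounds(x, mu_lo, mu_hi, sigma):
    """Return (lower, upper) CDF bounds at x for N(mu, sigma), mu in [mu_lo, mu_hi]."""
    upper = NormalDist(mu_lo, sigma).cdf(x)  # smallest mean gives the largest CDF
    lower = NormalDist(mu_hi, sigma).cdf(x)  # largest mean gives the smallest CDF
    return lower, upper


# Assumed example: nominal mean 1.0 with a +/-10% interval, sigma = 0.05.
lower, upper = normal_pbox_cdf_bounds(1.0, 0.9, 1.1, 0.05)
print(f"F_lower(1.0) = {lower:.4f}, F_upper(1.0) = {upper:.4f}")
```

Every CDF obtained with a mean inside the interval lies between these two values at the chosen point.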

III. PROPOSED ALGORITHM
This section addresses the LF problem with parameterized p-box input variables. DLS is a straightforward method for analyzing p-boxes to obtain probability bounds [21]. The DLS has two sampling loops:
1) Parameter loop: This loop is associated with the parameters of the distribution functions. It samples different values for the set of distribution parameters specified as intervals.
2) Probability loop: This loop is associated with the PDFs. It samples from distribution functions whose parameters are known. The probability loop is essentially an MCS that determines the statistical moments of the outputs.
These two nested sampling loops make the method very inefficient. In the proposed approach, the MCS in the probability loop of the DLS is replaced by the point estimation method, which calculates the statistical moments. Finally, the CDFs of the outputs are calculated using the Cornish-Fisher series. The proposed approach is divided into two main steps: 1) the statistical moment bounds of the PLF outputs are computed using the point estimation method, and 2) the probability bounds are obtained by applying the Cornish-Fisher expansion series. The calculation steps of the proposed approach are as follows:
Step 1: Set the number of iterations in the parameter loop (n = number of iterations).
Step 2: Define all uncertain input parameterized p-boxes, where $x$ denotes an input random variable and $p$ is the number of input parameterized p-boxes.
Step 3: Define the interval distribution parameters (parameter space) as $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$, $\theta \in [\theta^L, \theta^U]$, where $\theta$ denotes the interval distribution parameters, and $L$ and $U$ represent the lower and upper bounds of the interval distribution parameters, respectively. These bounds are essential because they express all the imprecision in the PLF model.
Step 4: Randomly select a point $\theta_{p,j}$ from the parameter space defined in the previous step.
Step 5: Calculate the statistical moments of the outputs for the point $\theta_{p,j}$ selected in Step 4 using the appropriate point estimation method, $m_{k,j} = [m_{1,j}, m_{2,j}]$, where $k$ is the moment order. In the appropriate point estimation method, a suitable transformation is applied to map the non-normal input stochastic variables to their standard space before the statistical moments of the PLF are calculated. This approach is described in [9].
Step 6: Go back to Step 4 and repeat for $j = 1, \ldots, n$, so that the moments are computed for n random points in the parameter space. When all iterations are complete, compare the results to determine the interval of each statistical moment.
Step 7: For each iteration, use the statistical moments calculated in Step 5 to obtain the CDFs with the Cornish-Fisher series [22].
Step 8: Combine all the CDFs from the previous step to specify the probability bounds of the PLF outputs. The flowchart of the above method is given in Fig. 1.
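The steps above can be sketched in Python under strong simplifying assumptions. The sketch is not the implementation used in this paper. `toy_model` is a hypothetical stand-in for a deterministic LF solve, and the inputs are taken as independent normal variables with interval means and exact standard deviations. The moments come from a simple three-point univariate rule (nodes at the mean and at ±√3 standard deviations), which stands in for the appropriate point estimation method of [9]. Only the mean and standard deviation are passed to the Cornish-Fisher expansion, so with zero skewness and excess kurtosis it reduces to normal quantiles.

```python
import math
import random
from statistics import NormalDist

SQRT3 = math.sqrt(3.0)


def toy_model(x):
    """Hypothetical stand-in for one deterministic LF solve (not a power-flow model)."""
    return x[0] + 2.0 * x[1] + 0.1 * x[0] * x[1]


def point_estimate_moments(g, means, stds):
    """Mean and std of g(X), independent normal X, using 2m + 1 evaluations of g."""
    m = len(means)
    g0 = g(means)                       # one evaluation at the mean point
    mean_sum, var_sum = 0.0, 0.0
    for i in range(m):                  # two more evaluations per input variable
        lo, hi = list(means), list(means)
        lo[i] -= SQRT3 * stds[i]
        hi[i] += SQRT3 * stds[i]
        g_lo, g_hi = g(lo), g(hi)
        e1 = (2.0 / 3.0) * g0 + (g_lo + g_hi) / 6.0
        e2 = (2.0 / 3.0) * g0 ** 2 + (g_lo ** 2 + g_hi ** 2) / 6.0
        mean_sum += e1
        var_sum += e2 - e1 ** 2         # univariate contributions add up
    return mean_sum - (m - 1) * g0, math.sqrt(max(var_sum, 0.0))


def cornish_fisher_quantile(q, mean, std, skew=0.0, ex_kurt=0.0):
    """Four-term Cornish-Fisher quantile from mean, std, skewness, excess kurtosis."""
    z = NormalDist().inv_cdf(q)
    w = (z + (z ** 2 - 1.0) * skew / 6.0
         + (z ** 3 - 3.0 * z) * ex_kurt / 24.0
         - (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0)
    return mean + std * w


def pbox_bounds(g, mean_intervals, stds, n_iter=100, probs=(0.05, 0.5, 0.95), seed=0):
    """Parameter loop (Steps 4-8): envelope of the mean and of the quantiles."""
    rng = random.Random(seed)
    mean_lo, mean_hi = math.inf, -math.inf
    q_lo = [math.inf] * len(probs)
    q_hi = [-math.inf] * len(probs)
    for _ in range(n_iter):
        theta = [rng.uniform(a, b) for a, b in mean_intervals]   # Step 4
        mu, sd = point_estimate_moments(g, theta, stds)          # Step 5
        mean_lo, mean_hi = min(mean_lo, mu), max(mean_hi, mu)    # Step 6
        for k, q in enumerate(probs):
            x_q = cornish_fisher_quantile(q, mu, sd)             # Step 7
            q_lo[k], q_hi[k] = min(q_lo[k], x_q), max(q_hi[k], x_q)  # Step 8
    return (mean_lo, mean_hi), q_lo, q_hi


# Assumed mean intervals and standard deviations for two inputs.
mean_bounds, q_lo, q_hi = pbox_bounds(toy_model, [(0.9, 1.1), (1.8, 2.2)], [0.05, 0.1])
print(mean_bounds, q_lo, q_hi)
```

At each probability level, the smallest quantile over the parameter loop traces the upper CDF bound and the largest quantile traces the lower CDF bound. Correlated or non-normal inputs would additionally require the transformation described in [9], which this sketch omits.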

IV. RESULTS
The performance of the proposed method is studied using the modified IEEE 14-bus and IEEE 118-bus systems. These cases consist of loads, solar farms, and wind farms.
The results calculated by the proposed method are compared with those of the DLS method. The probabilistic models of the wind turbine power generation and the solar farm output power are taken from [23]. The precision of the proposed approach is evaluated using relative error indices of the mean value µ and the standard deviation σ. These indices quantify the error of the moments computed by the proposed method relative to the DLS. The simulations were run in the MATLAB environment, and MATPOWER [24] was employed to solve the deterministic LFs on a personal computer with a 2.2-GHz processor and 4 GB of RAM.
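A minimal sketch of such a relative error index is given below. It assumes the common form |proposed − DLS| / |DLS| × 100%, which may differ from the exact definition used in this paper. The function name and the numerical values are hypothetical and are not results from the test systems.

```python
def relative_error_percent(proposed, reference):
    """Relative error of a moment against the DLS reference, in percent (assumed form)."""
    return abs(proposed - reference) / abs(reference) * 100.0


# Hypothetical mean values, for illustration only.
err_mean = relative_error_percent(proposed=1.01, reference=1.00)
print(f"relative error of the mean: {err_mean:.3f}%")
```

The same function applies to the standard deviation by passing the two standard deviation values instead of the means.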

A. IEEE 14-BUS TEST SYSTEM
In the modified IEEE 14-bus test system, six additional RESs are integrated: three wind farms at buses 4, 5, and 6 and three solar farms at buses 9, 13, and 14. The probabilistic models of the loads are taken from [25]. The correlations among loads, solar farms, and wind farms are defined by a correlation coefficient matrix. A correlation coefficient of ρ = 0.2 is considered for the uncertain loads. The correlation coefficient between the wind farms located at buses 6 and 13 is ρ = 0.6, and the wind farms located at buses 3 and 6 are considered independent. The correlation coefficient between the solar farms located at buses 5 and 9 is ρ = 0.4, and the solar farm located at bus 14 is considered independent. For both methods, 100 iterations are executed in the parameter loop, and the same points of the parameter space are selected. In the DLS method, the number of simulations in the probability loop (MCS) is limited to 10000, and a stopping rule based on the second moment is used [26]. For the proposed approach, the appropriate point estimation method needs (ω × τ) + 1 simulations [9]; if the number of iterations in the parameter loop is n, the computational burden of the proposed method is n(ω × τ + 1). In this modified IEEE 14-bus test case, there are 17 random variables (ω), and four points (τ) are considered in the univariate integration. Therefore, 69 simulations are needed for each iteration of the parameter loop. The CDFs are estimated by employing the Cornish-Fisher series.
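The simulation counts quoted above can be checked with a few lines of Python. The function names are hypothetical, and the DLS count assumes that all 10000 MCS samples are used in every probability loop, which the stopping rule may reduce in practice.

```python
def proposed_lf_runs(n_param_iter, n_vars, n_points):
    """Deterministic LF runs of the proposed method: n * (omega * tau + 1)."""
    return n_param_iter * (n_vars * n_points + 1)


def dls_lf_runs(n_param_iter, n_mcs_samples):
    """Deterministic LF runs of DLS if every probability loop uses all samples."""
    return n_param_iter * n_mcs_samples


runs_per_iteration = proposed_lf_runs(1, 17, 4)    # omega = 17, tau = 4
runs_14_proposed = proposed_lf_runs(100, 17, 4)    # n = 100 parameter iterations
runs_14_dls = dls_lf_runs(100, 10000)
print(runs_per_iteration, runs_14_proposed, runs_14_dls)  # 69 6900 1000000
```

Under these assumptions, the proposed method needs roughly two orders of magnitude fewer deterministic LF solutions than the DLS in this test case.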
In this case, to analyze the uncertainty in the parameters of the distribution functions, an uncertainty level of 10% around the mean of the normal distribution function, [µ − 0.1 × µ, µ + 0.1 × µ], and 5% around the parameters of the Weibull and beta distributions is considered. The proposed approach is used to analyze the LF problem with p-box inputs. Fig. 2 shows the probability bounds of the active power flow from bus 4 to bus 9 for the DLS method and the proposed method. As can be seen from Fig. 2, the CDFs obtained by the two approaches are almost the same, and the fit in this case is very close. Also, the lower and upper probability bound curves have almost the same shape, which shows that the standard deviation does not change much; therefore, it is mainly the variation in the mean of the outputs that separates the two probability bounds. Table 2 lists the bounds of the mean and standard deviation values obtained using the two methods. In this table, $V_{mag-2}$ is the voltage magnitude at bus 2, and $V_{ang-4}$ is the voltage angle at bus 4. $P_{br,4-9}$ is the active power flow through line 4-9, and $Q_{br,2-3}$ is the reactive power flow through line 2-3. It can be seen from Table 2 that the mean values computed by the proposed method based on (5) deviate by no more than 0.0356% from those of the DLS method, which verifies the accuracy of the proposed approach. The standard deviation bounds are also close to each other, which is consistent with the CDF comparison discussed above.

B. COMPARISON OF RESULTS WITH PLF
In this section, the p-box LF is compared with the PLF, which uses precise distribution parameters. For the PLF, the p-box variables are replaced with random variables that have exact parameters. The PLF model is run in two scenarios. In the first scenario, the mean values of the interval distribution parameters are used as the PLF inputs, and in the second scenario, the values are randomly selected from the intervals. Fig. 3 shows the comparison of the two scenarios with the p-box LF. As can be observed from the figure, the CDFs obtained in these cases lie between the probability bounds obtained by the p-box LF method. Table 3 shows the mean and standard deviation values of the selected variables for the two scenarios. The results obtained from these scenarios lie within the bounds obtained by the proposed method.

C. IEEE 118-BUS TEST SYSTEM
In the modified IEEE 118-bus system, eight additional energy sources are integrated: five wind farms at buses 2, 28, 38, 57, and 108 and three solar farms at buses 67, 70, and 71. The solar irradiation and wind speed characteristics are given in Tables 4 and 5, respectively. $P_{sf}$ and $P_{wf}$ are the rated power of the solar farms and the capacity of the wind farms, respectively. $S_{std}$ is the solar irradiation and $S_c$ is a certain irradiation level. $V_i$, $V_r$, and $V_o$ denote the cut-in, rated, and cut-out wind speeds of the wind turbine, respectively. $\alpha^L$ and $\alpha^U$ are the lower and upper bounds of the scale parameter of the distribution functions, respectively, and $\beta^L$ and $\beta^U$ are the lower and upper bounds of the shape parameter. The active power at each node obeys the uniform and normal distributions given in [25]. The parameters of the loads that are modeled as p-boxes are given in Table 6. The accuracy of each distribution function can differ according to the amount of information available for each input random variable. Therefore, in this case, different levels of variation are considered for the parameters. In this test case, for the p-box LF analysis, all the added energy sources and 20 loads are considered as p-box random variables; the other loads are modeled by distribution functions with precise parameters. As in the previous case, the correlation coefficients among the uncertainties are considered. A correlation coefficient of ρ = 0.2 is considered for the uncertain loads. The correlation coefficient between the wind farms located at buses 2 and 28 is ρ = 0.6, and the correlation coefficient between the wind farms located at buses 38 and 57 is ρ = 0.4.
VOLUME 9, 2021
The wind farm located at bus 108 is considered independent. The correlation coefficient between the solar farms located at buses 67 and 71 is ρ = 0.4, and the solar farm located at bus 83 is considered independent. In this case, for both methods, 100 iterations are executed in the parameter loop. In this modified IEEE 118-bus test case, there are 107 random variables, and two points are considered in the univariate integration. Therefore, 215 simulations are needed for each iteration of the parameter loop. Fig. 4 shows the p-boxes of the active power flow from bus 19 to bus 34 for both methods. Based on this figure, the CDFs of the two approaches are almost the same, and the fit in this case is also very close. As shown in Fig. 4, the range of the mean includes both negative and positive values. This indicates that the uncertainty in the distribution function parameters has a significant effect on the outputs. As can be seen, the statistical moments estimated by the proposed method are very close to those of the DLS method.

D. COMPARISON OF RESULT WITH PLF
In this section, two scenarios are considered to compare the p-box LF results with the PLF. In the first scenario, the mean values of the interval parameters are used as the PLF inputs, and in the second scenario, the values are randomly selected. The results of this comparison are shown in Fig. 5. Based on this figure, the PLF results lie between the probability bounds of the p-box LF. Table 8 gives the mean and standard deviation values of the selected variables for both scenarios. The values obtained lie within the intervals of the p-box LF.
The computation time for the IEEE 14-bus system is 5532.67 seconds for the DLS method and 57.24 seconds for the proposed method. For the 118-bus system, the DLS method takes 11819.74 seconds, and the proposed method takes 227.69 seconds. Thus, the proposed approach not only provides accurate results but is also faster than the DLS method.

E. CONVERGENCE ANALYSIS
Because the parameter loop is a stochastic simulation, more iterations in the parameter loop increase the accuracy of the results. Convergence can be assessed by solving the problem with different sample sizes. In this subsection, the sensitivity to the number of parameter loop iterations is studied. For this study, the p-box LF was run with different numbers of iterations in the parameter loop (5, 30, 100, 300, and 500). The bounds of the mean are used to measure the accuracy of the parameter loop. The results are shown in Tables 9 and 10. It is observed that 100 iterations are appropriate for the two case studies, and with more iterations, the changes in the bounds of the mean are small.
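The idea of this convergence study can be sketched in Python with a toy example. The function `toy_output_mean` and the parameter intervals are hypothetical stand-ins for the mean of a PLF output and the interval distribution parameters; they are not taken from the test systems. Using the same seed for every run makes the larger samples extend the smaller ones, so the width of the mean bounds can only grow with the number of iterations.

```python
import random


def toy_output_mean(theta):
    """Hypothetical output mean as a function of the sampled interval parameters."""
    return theta[0] + 2.0 * theta[1]


def mean_bounds(n_iter, intervals, seed=1):
    """Min and max of the output mean over n_iter random parameter points."""
    rng = random.Random(seed)   # same seed: larger runs extend smaller ones
    values = []
    for _ in range(n_iter):
        theta = [rng.uniform(lo, hi) for lo, hi in intervals]
        values.append(toy_output_mean(theta))
    return min(values), max(values)


intervals = [(0.9, 1.1), (1.8, 2.2)]   # assumed parameter intervals
widths = {}
for n in (5, 30, 100, 300, 500):
    lo, hi = mean_bounds(n, intervals)
    widths[n] = hi - lo
    print(f"n={n:4d}  mean bounds=[{lo:.4f}, {hi:.4f}]  width={hi - lo:.4f}")
```

When the width stops changing noticeably between successive sample sizes, additional parameter loop iterations add little information, which is the criterion applied to the iteration counts above.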

V. CONCLUSION
In this paper, the PLF based on the parameterized p-box is analyzed; loads and renewable energy sources (RESs) are modeled as p-boxes, and the obtained results are the probability bounds of the PLF outputs. Because the input data contain p-box variables, the result is not a precise probability distribution of the outputs but a set of feasible distributions between the upper and lower probability bounds. The proposed framework allows the uncertainty in the parameters of the distribution functions to be analyzed. Therefore, it can be used in problems that lack sufficient information. This approach was analyzed on the IEEE 14-bus and IEEE 118-bus test systems, including loads, solar farms, and wind farms. Load power, solar radiation, and wind speed were modeled as p-box random variables by Gaussian, Beta, and Weibull distribution functions, respectively. The precision of the results was compared with the DLS approach. In the proposed method, the statistical moments are calculated by replacing the MCS in the probability loop of the DLS method with the point estimation method. The CDFs of the outputs are computed using the Cornish-Fisher expansion series. The comparison showed that the proposed approach provides results close to those of the DLS method, with a much lower computation time.