An Improved Integrated Cumulant Method by Probability Distribution Pre-Identification in Power System With Wind Generation

The increase of wind generation (WG) has challenged the conventional way of probabilistic load flow (PLF) calculation. A reliable and efficient PLF method is required to face the stochastic nature of various power systems with WG. Firstly, the paper analyzes several typical cumulant methods (CMs) for PLF, such as Gram-Charlier expansion of type A (GCA), Gram-Charlier expansion of type C (GCC), and maximum entropy (ME). Then, an improved integrated CM by probability distribution pre-identification is proposed for power systems with WG based on doubly fed induction generations (DFIGs). The skewness and kurtosis are used as probability distribution pre-identification indices in the CM framework. Meanwhile, the influence of the DFIG control strategy on reactive power is considered in the load flow model and the moment calculation. Finally, the accuracy and efficiency of the proposed method are validated with the IEEE test system. In various scenarios, suitable CM is selected and applied to the PLF based on pre-identifying distribution characteristics. Results reveal that probabilistic density functions (PDFs) of bus voltages and line flows obtained by the proposed method have both accuracy and efficiency.


I. INTRODUCTION
In recent years, the development of renewable energy is growing rapidly worldwide, which plays an important role in alleviating fossil energy depletion and environmental pollution [1], [2]. However, intermittence, randomness, and weak controllability are the main features of renewable energy, such as wind and photovoltaic. The high penetration of renewable energy leads to increasing the uncertainties of power systems [3], [4]. The uncertainties affect the operation and stability of power systems [5]- [8]. In order to consider the uncertainties arising from the stochastic nature of loads and generators, the method of probabilistic load flow (PLF) is proposed [9]. Currently, wind generation (WG) is one of the premier renewable power generations [10]- [12]. Considering the uncertainties arising from WG in steady-state operation, The associate editor coordinating the review of this manuscript and approving it for publication was Hao Wang . the PLF method is further studied to analyze the power system integrated with WG [13], [14].
The methods for PLF include simulation method, analytical method, and approximate method [15], [16]. Among them, the analytical method obtains probabilistic distributions of variables with high computational efficiency and the method has attracted wide attention from scholars [6]. The most widely used method in the analytical method is the cumulant method (CM) that replaces the complex convolution operations with simple cumulants arithmetic operations [1]. In the research of CMs, Gram-Charlier expansion (GC) [17]- [20], Cornish-Fisher expansion (CF) [21]- [23], and maximum entropy (ME) [24]- [26] are used more.
In traditional power systems, uncertainties mainly come from common units and loads. The generation satisfies 0-1 distribution and the load characteristic satisfies normal distribution in steady-state operation [17], which makes the state variables of power systems such as bus voltages and line flows meet quasi-normal distributions. In this situation, Gram-Charlier expansion of type A (GCA) is used in the PLF method to fit the probability density function (PDF) of the state variable in power systems with enough accuracy and efficiency [17], [18]. Moreover, the form of GCA is simple and easy to implement.
However, the distribution characteristic of WG is different from that of common generators because wind speed in nature generally approximates Weibull distribution [6]. The WG's output characteristic deviates from a normal distribution, which causes state variables not to meet quasi-normal distributions. To calculate non-quasi-normal distribution based on cumulants, CF, Gram-Charlier expansion of type C (GCC), and ME have attracted researchers' attention for their good performances in PLF calculation. CF is the inverse function expansion of the variables' cumulative distribution function (CDF) based on GC. In [23], with few iterations and low time consumption, CF is more effective for evaluating the impact of photovoltaic generators on voltage profiles in an IEEE 33-bus radial system. In [19], GCC that adopts an exponential form is introduced to the power grid integrated with large-scale wind farms. In the IEEE 30-bus test system, the computing time of GCC is significantly faster than that of the Monte Carlo simulation (MCS). Taking the MCS results as the comparative baseline, the average root mean square (ARMS) of the results of GCC is mostly less than 0.01%. ME is used in PLF to make a reasonable inference on the unknown probability distribution of random power flow. In [25], one bus is integrated with 30 MW wind turbines based on the IEEE 118-bus system. As the change of scenarios, ME always gets accurate PDFs. For time consumptions, PLF with 50000 times MCS is 242.6 s and with 8 th ME is 1.04 s in a scene. In [26], results show that ME is superior to GCA in accuracy and better than MC in computational efficiency. The effectiveness of the CMs is closely related to whether the variable satisfies the normal distribution [27]. Most current studies focus on improving a certain PDF fitting method, with the purpose to obtain results from the method closer to MCS results. But the calculational efficiency and time-consuming of PLF are not fully taken into account. In addition, doubly fed induction generation (DFIG) is widely used in power systems due to flexible control strategies and good performance [28], [29]. But in references above, the relevance between reactive power and active power of wind power is determined by fixed power factor, which cannot reflect the effect of WG's output characteristic in the PLF method.
In this paper, an improved integrated CM framework is proposed with fully considering the accuracy and efficiency of the above-mentioned three CMs that are GCA, GCC, and ME. The improved integrated CM is based on a probability distribution pre-identification and is used for PLF in a power system with WG. In the method framework, there are two steps. In the first step, two judgment indices (kurtosis and skewness) are calculated. In the second step, more effective CM from GCA, GCC, or ME is selected in PLF by judging whether the indices meet the threshold. To make the PLF method more suitable for the power system with DFIGs, according to the reactive power functions of DFIG under different control strategies, the corresponding elements of the Jacobian matrix are revised in the load flow model. And the PDFs of DFIG outputs are derived under two control strategies including the constant power factor control and the constant voltage control.
The paper is organized as follows. In Section II, several typical CMs are introduced, such as GCA, GCC, and ME. In Section III, PDFs and cumulants of wind power are derived based on the mathematical theory of cumulant and the output characteristics of DFIG. In Section IV, an improved integrated CM for PLF with DFIGs is proposed, where the kurtosis and skewness of probabilistic distribution are used as pre-identification indices. In Section V, the accuracy and efficiency of the proposed method are verified by the IEEE test system in different operation scenarios. Finally, the conclusions are drawn in Section VI.

II. TYPICAL CUMULANT METHODS IN PROBABILISTIC LOAD FLOW A. GRAM-CHARLIER EXPANSION OF TYPE A
GCA expresses the PDF of a random variable as a combination of normal distribution and its derivatives [30]. p(x) is defined as the PDF of a random variable and its expression is as follows: where r is the order; α(x) is the PDF of the standardized normal frequency function; x is a random variable in standard measure; H r (x) is called as Chebyshev-Hermite polynomial; the coefficient c r is determined by the cumulants κ r of x [30]. A great advantage of GCA is that its coefficients have a very simple relationship with the cumulants of the variables. However, GCA has the uncomfortable drawbacks due to the poor accuracy of the strong non-normal distribution and negative values in certain regions of the PDF.

B. GRAM-CHARLIER OF TYPE C
To overcome drawbacks of GCA sometimes, GCC as an exponential variant of GCA is expressed as follows [31]: where H r (x) is same as (1) and γ r is the unknown coefficient of the series expansion.
The exponential form of GCC ensures that the integral of the variables' PDF in the defined domain is 1 and the values of the PDF are non-negative.
In (3), A is a k × k dimensional symmetric matrix and its element expression is shown as (5), where i−1,j−1,m is expressed as (6), when i+j+m = even and C. MAXIMUM ENTROPY ME is a nonlinear PDF reconstruction method that also ensures that the sum of probabilities is equal to unit and the probability values do not exist negative. Under certain constraints, when the entropy reaches the maximum, the probability distribution of the variables is estimated to be the closest to the true distribution. The model of ME is formulated as [26]: where h(x) is the entropy of the random variable x and µ r is the moment of x. Based on the available input moments, ME fits PDF p(x) according to the measurement of distribution entropy to achieve the best distribution fit. p(x) is the PDF of x which is generally given as (8).
In (8), λ r is the Lagrange multiplier that is solved by combing (7) and (8) with setting the function φ(x) = x r . The solving formula of λ r is shown in (9).
Newton-Raphson iteration method is used to solve the nonlinear equation (9) to obtain λ r .

III. CUMULANTS AND PDFS OF WIND GENERATION BASED ON DFIGS
The cumulant κ r of the random variables x is defined as (10) [32].
If x is a continuous random variable satisfying the CDF F(x), its moment µ r about the mean is obtained by: If x i (i = 1, 2, . . . , n) is a discrete random variable, and its PDF is p(x i )(i = 1, 2, . . . , n), its moment µ r about the mean is obtained by: In the PLF method, PDFs of power system injected variables (power generation and load) are known, then the moments of the injected are obtained from (11) or (12). According to the equations of moments and the cumulants as shown in (13), the injected cumulants are obtained by the injected moments [30].
Assume that the injected variables in power systems are independent of each other. Defining W r as the sum of cumulants of the injected variables in the power system. W r G is the cumulant of power generation and W r L (negative) is the cumulant of load. According to the properties of cumulant, their relationship is shown in (14).
The PDFs of wind power are defined as p(P Gw ) and p(Q Gw ), and defining F(P Gw ) and F(Q Gw ) are the integral of p(P Gw ) and p(Q Gw ), respectively. In this paper, the total power generated by a wind farm is assumed as the sum VOLUME 9, 2021 of power generated by each independent DFIG in the farm regardless of the interaction between two wind turbines. The active power P Gw of the wind farm with DFIGs is a function of wind speed v as shown in (15): (15) where v N is the rated wind speed; P GNw is the rated active power; a 0 , a 1 , and a 2 are the coefficients that are obtained by the least square curve fitting method. Wind speed v satisfies Weibull distribution which is expressed as: where the constant k and c are the scale and shape parameters which are obtained by analyzing the historical data [14].
The distributions of P Gw are expressed as follows: where P −1 Gw (v) is the inverse function of (15). The reactive power from a DFIG is a function of voltage V w at the bus where the wind farm is located, and is determined by the control strategies of DFIGs [33]: a) When DFIGs are under the constant power factor control on the stator-side with power factor cosϕ, the expression of the reactive power Q Gw of the wind farm with DFIGs is shown as: where P 1 and Q 1 are the active and reactive power on the DFIG stator-side. The relationship of Q Gw , P 1 , and P Gw is obtained by (19): where P Gw is obtained by (15); R 1 and X 1 are the stator resistance and reactance; R 2 and X 2 are the reduction values of the rotor resistance and reactance; X m is the excitation reactance; s is the slip. b) When DFIGs are controlled at a constant voltage V w , Q Gw ≈ Q 1 satisfies: where I 2 is the reduction value of the rotor-side current. c) When the reactive power exceeds the adjustable range under the constant voltage control, it satisfies the inequation as shown in (21) where I 2m is the maximum current.
Then the function Q Gw is obtained by (19) and (20) with According to the above equations P Gw and Q Gw , Q Gw is expressed by v and V w . When defining the equation Q Gw (v, V w ), the distributions are expressed as follows: (22) where Q Gwmin is the minimum reactive power and Q Gwmax is the maximum.
Applying the aforementioned theoretical background and derivation, the moments of the wind farm with DFIGs outputs are expressed as (23)-(24).

IV. AN IMPROVED INTEGRATED CM BY PROBABILITY DISTRIBUTION PRE-IDENTIFICATION
At present, CM widely used in literature is mostly based on quasi-normal distribution, such as GCA. When the stochastic nature of the state variable is similar to normal distribution, GCA can obtain accurate probability distribution. But when GCA is used to fit a non-quasi-normal distribution, the negative values appear at the tail of the PDF. However, although GCC and ME are suitable for fitting non-quasi-normal distributions, their calculations are complicated and their time consumptions are relatively long. In order to make use of the advantages of various typical CMs, an integrated CM for PLF analysis method with WG based on quasi-normal distribution pre-identification is proposed in this section.

A. PRE-IDENTIFICATION INDICES FOR CM SELECTION
The skewness S k and kurtosis K u of the distribution reflect the extent of departure from the symmetry of the normal distribution [34]. Through mathematical derivation, the S k and K u can be expressed by the variable's third cumulant κ 3 and fourth cumulant κ 4 in standard measure, usually as shown in (25).
The skewness and kurtosis of the standard normal distribution are 0 and 3. When the skewness and kurtosis of the distribution are within a certain range, it can be regarded as a quasi-normal distribution. Therefore, the S k and K u are used as pre-identification indices for the next step in the proposed CM framework and the ranges [S,S] and [K ,K ] are set as the threshold ranges to pre-identify whether the distribution meets the quasi-normal distribution.

B. FRAMEWORK OF THE PROPOSED METHOD
The process framework of the proposed integrated CM for PLF with WG by pre-identification of probability distributions is shown in Fig. 1. In step a of Fig. 1, the cumulants of the DFIGs outputs are obtained by their moments that are derived as (23)- (24). Meanwhile, the Jacobian matrix is used to calculate the state variables' cumulants with the injected variables' cumulants and is obtained by the load flow model with DFIGs. The Jacobian matrix J in this paper is modified as follows according to the output characteristics of DFIG: In PLF calculation, the relationship between the cumulant of the injected W and the cumulant of the state variable X is built as a load flow model as shown in (26) [14].
In (26), to obtain J at the injected expected value, X and W are regard as the correction and error during the load flow iteration; δ and V are the corrections of voltage magnitude and angle; P and Q are the errors of active power and reactive power; the subscript ''w'' indicates the bus where the DFIGs are located. In a power system with n conventional buses i and n w buses w k integrated with DFIGs, for conventional bus i, the errors P i and Q i are specifically the difference between the constant of injected power and the power function of grid state variables, so the expressions of part elements in the J are shown as: where G ij + jB ij = Y ij and Y ij is the bus admittance matrix. For a bus w k with DFIGs, the reactive power Q Gw k is as a function of bus voltage V w i and not a constant. The expressions of Q Gw k are shown in (18)- (21), so the expression of elements in J is: According to the above, the Jacobian matrix of the power load calculation with DFIGs is obtained.
In step b, to obtain the state variable's cumulant W r in each order for the CM to calculate the PLF result, W r is calculated by the injected cumulant X r and the Jacobian matrix J, which is expressed as: In step c, the S k and K u of a state variable's distribution with their threshold ranges are used to select a CM to fit the variable's PDF. In the proposed method, the threshold ranges are derived by quasi-normal distribution [34]. Within the threshold ranges, the variable's PDF is considered to satisfy quasi-normal distribution and GCA is used, otherwise, GCC and ME are used which are more suitable for fitting a non-quasi-normal distribution.
In step d, according to the CMs' introductions in Section II of this paper, the selected CM in step c is to fit the distribution function of the state variable as the PLF result.

V. CASE STUDY
The proposed method described in the preceding sections is applied to an IEEE-14 test system. The network diagram is shown in Fig. 2. A wind farm with DFIGs is connected to bus 2. All the case studies are performed with the MATLAB platform. The bus active and reactive basic loads in power system are generally considered to satisfy the normal distribution and their standard deviations are assumed to be 10% [19], [27] or 30% [25] of their expectations. In the below case studies, assume that the standard deviations of all bus active and reactive basic loads are 30% of their expectations.
The capacity of each DFIG is 2MW with the cut-in wind speed v ci = 5 m/s, the rated wind speed v N = 18 m/s, and the cut-out wind speed v co = 25 m/s. The DFIG's motor parameters are R 1 = 0.0062 , X 1 = 0.0024 , R 2 = 0.0024 , X 2 = 0.0694 , X m = 4.2233 , and s = −0.208.
As the comparative baseline, the sampling number N of MCS is set to be 20000 and ARMS is calculated for comparing the accuracy of different calculation methods. The value of ARMS is defined as: where PLF i is the i th point's value on the CDF calculated by the proposed PLF method; MCS i is the i th point's value on the CDF calculated by MCS. The orders of GCA, GCC, and ME are selected as 7 th , 4 th , and 7 th in the proposed method, considering both accuracy  and efficiency. The 7 th GCA and 7 th ME use from 1 st to 7 th cumulants, and the 4 th GCC uses from 1 st to 6 th cumulants. The accuracy of PLF results obtained by GCA, GCC, and ME with other orders are shown in Appendix.

A. VALIDITY OF THE PROPOSED METHOD
This case is to verify the necessity of the pre-identification method in this paper. The verification scenario is in case a): bus 2 is integrated with 20 × 2 MW DFIGs under a constant power factor (cosϕ = 0.98, ϕ <0) and wind speed satisfies Weibull distribution with parameters k = 2.84 and c = 7.35. Table 1 shows the ARMS results in part of the line reactive power distributions obtained by the 7 th GCA, 4 th GCC, and 7 th ME.
From Table 1, the ARMS values of 7 th GCA, 4 th GCC, and 7 th ME are close, although the calculation accuracy of GCA is generally considered not as good as the other two CMs [27]. It can be seen that the accuracy of GCA results is only slightly lower than other CMs in some cases and the accuracy for reactive power distributions in line 2-3 and line 6-12 is slightly better than that of GCC or ME. Table 2 shows the time consumption of each CM to obtain bus voltages' PDFs in the power system.
It is found that the 7 th GCA is the fastest, the 4 th GCC takes twice as long as the 7 th GCA, and the 7 th ME method is the slowest due to iterative operations.
In order to clearly show the results of the three CMs, Fig. 3 and Fig. 4 show the PDFs of voltage V 7 at bus 7 and active power P 1−2 in line 1-2, respectively. From Fig. 3, the fitting errors of the three CMs are all small and similar. But from Fig. 4, the tail part of the PDF fitted by GCA appears negative, which is unreasonable and indicates that the third and fourth cumulants of the variables exceed a certain range according to the above analysis.
In general, GCC and ME have better applicability and accuracy, and GCA has a high calculation speed. In order to realize precision and efficiency in the PLF with the full use of the advantages of CMs, an improved integrated  CM framework is proposed that selects suitable CM for certain variable distribution based on cumulants by pre-identification.

B. ACCURACY AND EFFICIENCY OF THE PROPOSED METHOD IN DIFFERENT SCENARIOS
The following verifies the accuracy and efficiency of the proposed method, when the wind farm with DFIGs is under different scenarios, such as different control strategies, different wind speed distributions, and different wind power penetration levels. The case a) in the following scenarios is the same as the situation of the power system in the validity verification above case study. The quasi-normal distribution pre-identification indices are that S k is within ± 1.0493 and K u is between 3 and 7 [34].

1) SCENARIO 1: CONTROL STRATEGIES
In this scenario, bus 2 is integrated with 20 × 2 MW DFIGs. Wind speed satisfies Weibull distribution with parameters  Table 3 shows pre-identification indices, selected CM, and ARMS results of part of state variable distributions under different control strategies.
From Table 3, GCA, GCC, or ME is selected to be used in the proposed PLF by range judgment of pre-identification indices. GCC has greater applicability in PLF for power grids with DFIG and the calculation speed is better than ME. Take the first row for example. Under control strategies a) and b), the K u of the V 2 's PDF does not meet the GCA applicable conditions, hence GCC is selected in the proposed CM framework, whereas under control strategies c) the S k and K u of the V 2 's PDF are both within their threshold ranges so GCA is selected.
It can also be seen from Table 3 that control strategies of DFIGs have different influences on distributions of load flow. Under the constant voltage control c), most of the state variables' PDFs are obtained by GCA that is easy implemented. The control strategy a) and b) adopt constant power factor control. In contrast, under the control strategy b) the power factor lag of WG is more serious, hence some state variables need to be fitted by GCC or ME to obtain more accurate PDFs.
To show the method effect more clearly, PDFs of the active power P 1−2 and reactive power Q 1−2 in line 1-2 under control strategy b) are drawn in Fig. 5 and Fig. 6.
From Fig. 5 and Fig. 6, the selected CMs that are ME and GCC respectively are closer to MCS. GCA is not selected that appears negative tail part and deviates from MCS. In the same way, Fig. 7 shows PDFs of P 1−2 in line 1-2, when the VOLUME 9, 2021  DFIGs are under control strategy c) where the selected method is ME. Fig. 8 shows PDFs of Q 2−4 in line 2-4, when the DFIGs are under control strategy c).
In Fig. 8, the selected method is CCA. Although, GCA is considered that the accuracy is slightly lower in many cases. By the pre-identification, GCA is applied to the case of quasi-normal distribution, the calculation accuracy is nearly the same with other CMs, meanwhile, the calculation time is saved.    Table 4 shows pre-identification indices, selected CM, and ARMS results of part of state variable distributions under different wind speeds. Fig. 9 and Fig. 10 show the PDFs of V 2 and P 2−4 in line 2-4 respectively, while wind speed satisfies Weibull distribution with k = 2 and c = 8.5.
In scenario 2, it is found that the accuracy of PLF is influenced by the wind speed distribution which directly affects wind power injection. From Table 4, Fig. 9 and Fig. 10, under wind speed b), the distributions of state variables in power system deviate from the quasi-normal distribution to a greater extent and the selected CM gives enough reliable results in the case without negative values appearing at the tail part of PDF.

3) SCENARIO 3: WIND POWER PENETRATION
In this scenario, wind speed satisfies the Weibull distribution with parameters k = 2.84 and c = 7.35. The control strategy  of DFIG is constant power factor (cosϕ = 0.98, ϕ <0). The installed capacity of the wind farm integrated at bus 2 is as follows: a) 20 × 2 MW; b) 40 × 2 MW. Table 5 shows pre-identification indices, selected CM, and ARMS results of part of state variable distributions under different wind power penetration. Fig. 11 and Fig. 12 show the PDFs of V 2 and Q 1−2 in line 1-2 respectively, while bus 2 is integrated 40 × 2 MW DFIGs.
In scenario 3, when the penetration rate of wind power increases, the deviation degree of the state variable distribution from the quasi-normal distribution intensifies. Under b) with more wind power penetration, GCC is selected to fit the PDFs of more state variables by pre-identification indices. From Fig. 11 and Fig. 12, the PDFs of V 2 and Q 1−2 obtained by GCA have obvious negative values at the tail while the PDFs obtained by selected method have better accuracy. Overall, in the proposed PLF method, the state variable's   distribution is predicted before calculating, and then a CM is selected to fit the distribution to get accurate results.

VI. CONCLUSION
In order to obtain results of PLF with DFIGs with both accuracy and efficiency, this paper proposes an improved integrated CM by probability distribution pre-identification. Before fitting the distributions of the state variables of power systems, the state variables' cumulants are used to calculate the pre-identification indices that assess whether the distributions of the state variables meet the quasi-normal distribution. In the proposed CM framework, GCA is used to fit the quasi-normal distribution, and GCC or ME is used to fit the non-quasi-normal distribution. The proposed framework method makes full use of the advantages of various CMs for PLF with DFIGs. It can not only give reliable and efficient PLF results under various random variables but also avoid the negative values at the tail of the nonquasi-normal PDFs. Compared with several typical CMs, the proposed method has enough accuracy and efficiency. In addition, based on the output characteristics of DFIG, the influence of control strategy on the DFIG's reactive power is considered in the load flow model and the PDF of wind power.
In case studies, the accuracy and efficiency of the proposed method are validated under different WG conditions. It is concluded that when the skewness and kurtosis of the variable's distribution within the threshold ranges, the selection of GCA can halve the calculation time, and by comparison, the ARMS of the three CMs are almost the same at this time. Therefore, the applicable pre-identification is introduced to predict the distributions of variables, so that the advantage of each CM is flexibly applied to make PLF with DFIG have both computational accuracy and efficiency.
Large-scale wind power integration also affects the stability of the power system, such as voltage stability, lowfrequency oscillation, and sub-synchronous oscillation of the power system. The effect of wind uncertainty on stability will be carried out in the future.