Online Dynamic Total Transfer Capability Estimation Using Cotraining-Style Semi-Supervised Regression

With the increasing integration of wind power, the operating condition of the power system varies more rapidly. Because the total transfer capability (TTC) of a transmission interface changes with the operating condition, offline TTC estimation has become less suitable for online security control. In this paper, an efficient online dynamic TTC estimation method using a semi-supervised learning approach is proposed. First, considering the high-order uncertainties of wind and load, a sample database of expected operating conditions with or without corresponding TTCs is generated. Then, the pivotal features that correlate strongly with the TTC are selected. Finally, the relationship between the TTC and the pivotal features is learned using the cotraining-style semi-supervised regression algorithm (COREG), which establishes the dynamic TTC estimation model. With real-time data as the model input, the TTC can be estimated. The proposed method is validated on the Gansu Province Power Grid in China. The results, together with an accuracy and efficiency comparison against other typical existing methods, indicate that the proposed method provides accurate TTC estimation and that, because of the high efficiency of the semi-supervised learning approach, the whole process of model establishment and TTC estimation can be refreshed every 15 minutes. Therefore, the proposed method of online dynamic TTC estimation is suitable for online security control.


I. INTRODUCTION
The TTC of the transmission interface is one of the most important operating rules in the power system. It is the ability of a power system in transferring electric power from the sending-side to the receiving-side through the transmission interface in a secure and stable manner. During online security control, operators control the power flow through the transmission interface to be less than the TTC. The TTC used in traditional online security control is a fixed single number which is calculated offline based on a typical operating condition, and the same offline TTC is used for a long period, such as a month or a year.
In recent years, the increasing integration of wind power has made the operating condition of the power system vary more rapidly. As the true value of the TTC changes with the operating condition, the offline TTC has become less suitable for online security control. If the offline TTC is greater than the true value, it brings a safety risk to the system. If the offline TTC is less than the true value, the full transmission capability cannot be utilized. The offline TTC is therefore both insecure and uneconomical for present-day power systems, and an effective online dynamic TTC estimation method is of great significance for online security control.
The associate editor coordinating the review of this manuscript and approving it for publication was Canbing Li.
Online dynamic TTC estimation consists of two parts: the first is to establish the dynamic TTC estimation model, and the second is to estimate the TTC under the real-time operating condition. The development and wide application of artificial intelligence has provided a new technical foundation for establishing the dynamic TTC estimation model. The idea is inspired by the concept of automatic learning put forward by Dy-Liacco [1], which turns a complicated, mechanism-based dynamic security assessment problem into a data-driven problem. On this basis, the concept of automatically learning the fine operating rule of a transmission interface is proposed in [2], that is, establishing the dynamic TTC estimation model using a supervised learning approach. The dynamic TTC estimation model is composed of the TTC of the base operating condition and its sensitivities to the operating condition [3]. Using a supervised learning approach to establish a dynamic TTC estimation model involves three key steps:
(1) Generate a labeled sample database. Numerous operating conditions near the base operating condition are randomly simulated and then manually labeled by calculating their corresponding TTCs. Monte Carlo simulation is the most widely used method to generate operating conditions. Security constrained Continuation Power Flow (CPF) and Repeated Power Flow (RPF) are both effective methods for calculating the TTC [4], [5].
(2) Select the pivotal features that correlate strongly with the TTC to lower the dimension of the samples. The Heuristic Regress Feature Selection (HRFS) method and the Hybrid Mutual Information (HMI) method have been used successfully in pivotal feature selection [3], [6]-[8].
(3) Learn the relationship between the TTC and the pivotal features using a supervised learning approach. Deep Belief Network (DBN) and Deep Network (DN) have recently been used to learn the precise relationship between the TTC and the pivotal features [9], [10]. However, a real power system is massive, and non-linear regression learning approaches are time-consuming. Some studies have found that the relationship between the TTC and the main pivotal features is linear in both test and real-world systems [3].
Once the dynamic TTC estimation model is established, the TTC can be estimated by feeding the real-time operating condition into the model.
There are two application modes of online dynamic TTC estimation: the first is ''model established offline, TTC estimated online'', and the second is ''model established online, TTC estimated online''. The former does not require high efficiency but has poor online adaptability; the latter is the opposite. Most studies choose the first application mode, because using a supervised learning approach to establish the dynamic TTC estimation model requires numerous labeled samples. Manually labeling samples by calculating TTCs needs a large number of transient differential computations and power flow calculations, which takes more time than online security control can afford. In [11], several dynamic TTC estimation models are established offline based on different historical scenarios, and the real-time TTC is estimated online using the most relevant model, that is, the one whose corresponding scenario matches the real-time operating condition most closely. However, the historical scenarios cannot cover all possible real-time operating conditions, so the online adaptability of this method is poor. To improve this situation, the studies in [12] and [13] establish the dynamic TTC estimation model offline based on the day-ahead wind power prediction, and estimate the real-time TTC online according to the deviation between the real-time operating condition and the day-ahead operating condition. However, the error of the day-ahead wind power prediction is large (more than 40% [14]), so the deviation of the operating condition becomes too large for the accuracy of the online TTC estimation to be guaranteed.
To sum up, the online dynamic TTC estimation method has been studied and improved in many ways. However, due to the online computing time limit and the insufficient efficiency of the existing methods, the dynamic TTC estimation model has to be established offline based on historical data or long-term predictions, and then the TTC is estimated online with real-time data. The offline dynamic TTC estimation model and online real-time operating condition don't match very well, causing poor TTC estimation accuracy and bad online adaptability. Thus, an efficient online dynamic TTC estimation method that can achieve ''model established online, TTC estimated online'' within the time cycle of the online security control remains an important issue to be addressed.
This paper presents an efficient online dynamic TTC estimation method using the semi-supervised learning approach COREG. The dynamic TTC estimation model is established online based on the ultra-short-term predictions of wind and load, considering the high-order uncertainties of the prediction errors. Then, with the real-time data as the model input, the real-time TTC is estimated online. The case study on a Chinese provincial power grid demonstrates that, because of the high efficiency of the semi-supervised learning approach, the whole process can be refreshed within the time cycle of the online security control. The two contributions of this paper are summarized as follows:
(1) The sample database is generated considering the high-order uncertainties of the prediction errors. The traditional prediction error model, which uses a Gaussian distribution with known mean and variance, is inaccurate. Here, the uncertainties in the mean and variance of the prediction error distribution (the high-order uncertainties) are considered, so the prediction error is described more precisely and the sample database for establishing the dynamic TTC estimation model covers the expected operating conditions more accurately.
(2) The semi-supervised learning approach COREG is used to establish the dynamic TTC estimation model to reduce online computing time. In the process of establishing the model, manually labeling samples by calculating TTCs is the most time-consuming part. COREG makes use of unlabeled training samples to assist learning and to enhance the learning performance obtained from the labeled training samples, so fewer labeled training samples are needed and the computation efficiency is improved.
This paper is organized as follows. In Section II, the sample database is generated considering the high-order uncertainties of the prediction errors. In Section III, the pivotal features are selected using the HRFS method. In Section IV, the dynamic TTC estimation model is established using COREG. In Section V, the architecture of the proposed method is presented. In Section VI, a case is studied and analyzed. Finally, conclusions are given in Section VII.
VOLUME 8, 2020

II. SAMPLE DATABASE GENERATION
A semi-supervised learning approach is used in Section IV to establish the dynamic TTC estimation model, so a sample database composed of both unlabeled and labeled samples is needed. An unlabeled sample is in the form of s, which is a vector denoting the operating condition of the system, and the indexes of s are the variables of loads, outputs of power plants, power flow through transmission lines and voltage of nodes. A labeled sample is in the form of (s, TTC), where TTC is a real number label denoting the corresponding TTC of operating condition s.
There are two steps in sample database generation: first, simulate a set of expected operating conditions; second, select a certain portion of the operating conditions and calculate their TTCs to generate labeled samples. The remaining operating conditions are used as unlabeled samples.

A. EXPECTED OPERATING CONDITION SIMULATION
Because the uncertainty of the system is mainly caused by wind power and load, the 2-dimensional array of centralized wind power and total load (''wind-load'') is used to represent the operating condition. First, a set of expected ''wind-load'' arrays are randomly sampled, based on the ultra-short-term predictions of ''wind-load'', considering the high-order uncertainties of the prediction errors. And then based on each of the ''wind-load'' arrays, the corresponding operating conditions are simulated through power flow calculation.
In online applications, wind power output is the sum of its ultra-short-term prediction and the prediction error. The ultra-short-term prediction can be obtained from wind farms. The prediction error is modeled using a Gaussian distribution with random distribution parameters, which is the high-order uncertainty [15]; this theory has been successfully applied to robust optimization scheduling in power systems integrated with large-scale wind power [16].
The output of the centralized wind power is as follows:
$$P_W = P_W^{\mathrm{prediction}} + \varepsilon_{P_W^{\mathrm{prediction}}} \tag{1}$$
where $P_W^{\mathrm{prediction}}$ is the ultra-short-term prediction of the centralized wind power and $\varepsilon_{P_W^{\mathrm{prediction}}}$ is the error of $P_W^{\mathrm{prediction}}$. Considering the high-order uncertainty of the prediction error, it is assumed that $\varepsilon_{P_W^{\mathrm{prediction}}}$ follows a Gaussian distribution whose mean and variance are random variables within constant intervals as follows:
$$\varepsilon_{P_W^{\mathrm{prediction}}} \sim N(\mu, \sigma^2), \quad \mu \in [\underline{\mu}, \overline{\mu}], \quad \sigma^2 \in [\underline{\sigma}^2, \overline{\sigma}^2] \tag{2}$$
The total load is treated in the same way, assuming the load and wind power are mutually independent.
The steps of simulating the operating conditions are as follows:
1) Use the Monte-Carlo sampling approach to sample a set of two-dimensional ''wind-load'' arrays, based on the ultra-short-term predictions and the high-order uncertainty probability distributions of the prediction errors. The number of ''wind-load'' arrays is $N$.
2) Simulate the base operating condition $s_0$ through power flow calculation based on the ultra-short-term predictions of ''wind-load'' and the corresponding online scheduling plan. Then, starting from $s_0$ and adjusting the output of the generator units in proportion to their initial outputs, simulate the operating conditions for the other ''wind-load'' arrays through power flow calculation.
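Step 1 above can be illustrated with a minimal sketch. It assumes that the mean and variance of each prediction error are drawn uniformly from their intervals (the text only states that they are random variables within constant intervals), and the interval values and function names are hypothetical placeholders; the predictions 4236 MW and 15127 MW are the ones used in Section VI. The power flow calculation of step 2 is not reproduced.

```python
import numpy as np

def sample_output(prediction, mu_range, var_range, n, rng):
    """Prediction plus Gaussian error whose mean and variance are themselves random."""
    # High-order uncertainty: draw the distribution parameters first
    # (uniform draws are an assumption, not stated in the paper).
    mu = rng.uniform(mu_range[0], mu_range[1], size=n)
    var = rng.uniform(var_range[0], var_range[1], size=n)
    # Then draw one prediction error per parameter pair.
    error = rng.normal(loc=mu, scale=np.sqrt(var))
    return prediction + error

def sample_wind_load(n, seed=0):
    """Return an (n, 2) array of 'wind-load' samples in MW."""
    rng = np.random.default_rng(seed)
    # Interval bounds below are illustrative placeholders only.
    wind = sample_output(4236.0, (-50.0, 50.0), (100.0**2, 200.0**2), n, rng)
    load = sample_output(15127.0, (-100.0, 100.0), (150.0**2, 300.0**2), n, rng)
    # Wind and load are sampled independently, as assumed in the text.
    return np.column_stack([wind, load])

arrays = sample_wind_load(2000)
print(arrays.shape)
```

Each row of `arrays` would then be turned into an operating condition by a power flow calculation around the base operating condition.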

B. SAMPLE DATABASE GENERATION
A certain portion (from 10% to 50% [17]) of the operating conditions simulated above is selected using the Monte-Carlo sampling approach. These operating conditions are manually labeled by calculating their corresponding TTCs with the method below, which generates the labeled samples. In particular, the ultra-short-term prediction operating condition and its corresponding TTC, $(s_0, TTC_0)$, form the base sample. The rest of the operating conditions are used as unlabeled samples. The number of labeled samples is $N_L$ and the number of unlabeled samples is $N_U$.
With each of the selected operating conditions as the initial condition, the corresponding TTC is calculated using the security constrained RPF method as follows. According to [18], TTC calculation is an optimization model as in (3):
$$\max \lambda \quad \text{s.t.} \quad G(s, \lambda) = 0, \quad H(s, \lambda) \le 0 \tag{3}$$
where $\lambda$ denotes the increase in generation in the sending-side as well as the increase in load in the receiving-side, and it is the decision variable of the optimization model; $G(s, \lambda)$ is the power flow equality constraint set; $H(s, \lambda)$ is the inequality constraint set of the voltage constraints, the generation constraints, the equipment thermal constraints and the transient stability constraints.
In $G(s, \lambda)$, $P_{Gi}$ (generation in the sending-side), $P_{Dj}$ (active load in the receiving-side), and $Q_{Dj}$ (reactive load in the receiving-side) are modified with $\lambda$ as in (4), where the superscript ''0'' denotes the initial condition, and $k_{Gi}$ and $k_{Dj}$ are constants denoting the increase step-size in $P_{Gi}$, $P_{Dj}$ and $Q_{Dj}$ as $\lambda$ varies.
The transient stability constraint included in $H(s, \lambda)$ is the well-known criterion in (5):
$$\left| \delta_{Gi}(t) - \delta_{COI}(t) \right| \le \delta_{max}, \quad \forall Gi \in G \tag{5}$$
where $G$ is the set of generator units, $\delta_{Gi}$ and $\delta_{COI}$ are the rotor angle of generator unit $Gi$ and the rotor angle of the center of inertia, respectively, $t$ is the moment in the transient process of the N-1 contingencies, and $\delta_{max}$ is the pre-defined threshold, which is set to 180° in this paper.
The TTC is the power flow from the sending-side to the receiving-side through the transmission interface when $\lambda$ reaches its maximum value. It can be calculated as the total generation in the sending-side minus the total active load in the sending-side, or as the total active load in the receiving-side minus the total generation in the receiving-side. Here, the latter is used, as in (6):
$$TTC = \sum_{j \in \text{receiving-side}} P_{Dj}(\lambda_{max}) - \sum_{g \in \text{receiving-side}} P_{Gg} \tag{6}$$
where $\sum_{j \in \text{receiving-side}} P_{Dj}(\lambda_{max})$ is the total active load in the receiving-side when $\lambda$ reaches $\lambda_{max}$, and $\sum_{g \in \text{receiving-side}} P_{Gg}$ is the total generation in the receiving-side.
In the above model, each modification of $\lambda$ requires a time-domain simulation to validate the transient stability constraints, which includes several transient differential computations and power flow calculations. This makes labeling operating conditions by calculating TTCs the most time-consuming part of dynamic TTC estimation model establishment.
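The search for the maximum $\lambda$ can be sketched as a repeated-power-flow loop with step halving. This is only an illustration under stated assumptions: `is_secure` is a hypothetical callback standing in for the power flow calculation and time-domain simulation that check $G(s, \lambda)$ and $H(s, \lambda)$, and security is assumed to be lost monotonically as $\lambda$ grows.

```python
def repeated_power_flow(is_secure, step=0.1, tol=1e-3, lam_cap=10.0):
    """Approximate the largest lambda for which is_secure(lambda) is True.

    The returned value is secure, and the true limit lies less than
    2 * tol above it (provided the limit is below lam_cap).
    """
    if not is_secure(0.0):
        raise ValueError("initial operating condition violates the constraints")
    lam = 0.0
    while step > tol:
        trial = lam + step
        if trial <= lam_cap and is_secure(trial):
            lam = trial      # constraints hold: accept the larger transfer
        else:
            step /= 2.0      # violation (or cap reached): retry with a smaller step
    return lam

# Toy stand-in for the security check: secure up to lambda = 0.4237.
lam_max = repeated_power_flow(lambda lam: lam <= 0.4237)
print(lam_max)
```

Every call to `is_secure` corresponds to one full constraint check, which is why reducing the number of labeled samples has such a large effect on the computing time.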

III. PIVOTAL FEATURE SELECTION
In this section, based on the sample database above, the pivotal features which greatly correlate with the TTC are selected to lower the dimension of the samples, and it includes two steps: first, calculate the approximate TTCs for unlabeled samples as their pseudo labels using the k-nearest neighbor (k-NN) algorithm; second, select pivotal features using the HRFS method.

A. PSEUDO LABEL BASED ON K-NN ALGORITHM
If there are too few labeled samples, the effectiveness of the pivotal feature selection may be reduced. In this case, approximate TTCs can be calculated for some of the unlabeled samples as their pseudo labels using the k-NN algorithm [19]. The steps are as follows:
1) Use the Monte-Carlo sampling approach to select a certain portion of the unlabeled samples.
2) For each of the selected unlabeled samples, locate its $k$ nearest labeled samples based on the ''distance'' $\| s_i - s_j \|$.
3) Calculate the approximate TTC of each of the selected unlabeled samples as the average of the TTCs of its $k$ nearest labeled samples.
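Steps 2 and 3 can be sketched as follows. The Euclidean norm is assumed for the ''distance'', the function name and the value of `k` are illustrative, and any scaling of the elements of $s$ is left out.

```python
import numpy as np

def knn_pseudo_labels(s_labeled, ttc_labeled, s_unlabeled, k=3):
    """Pseudo label of each unlabeled sample: mean TTC of its k nearest labeled samples."""
    s_labeled = np.asarray(s_labeled, dtype=float)
    ttc_labeled = np.asarray(ttc_labeled, dtype=float)
    s_unlabeled = np.asarray(s_unlabeled, dtype=float)
    # Pairwise Euclidean distances, shape (n_unlabeled, n_labeled).
    diff = s_unlabeled[:, None, :] - s_labeled[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # Indices of the k closest labeled samples for every unlabeled sample.
    nearest = np.argsort(dist, axis=1)[:, :k]
    return ttc_labeled[nearest].mean(axis=1)

labels = knn_pseudo_labels([[0.0], [1.0], [2.0], [10.0]],
                           [0.0, 1.0, 2.0, 10.0],
                           [[1.1]], k=3)
print(labels)  # the three nearest labeled samples have TTCs 1, 2 and 0
```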

B. PIVOTAL FEATURE SELECTION USING HRFS METHOD
Taking the deviation between the TTC of a sample (labeled or pseudo labeled) and $TTC_0$ as the target feature ($\Delta TTC$), and taking the deviations between the elements of $s$ of a sample and the elements of $s_0$ as the candidate features, the pivotal features that correlate strongly with the target feature are selected using the HRFS method [3].
The selection accuracy standard $R_{as}(F)$ is defined in (7), where $F$ is the pivotal feature set, $N_{LP}$ is the total number of labeled and pseudo labeled samples, $TTC_i$ is the TTC of sample $i$, and $TTC_{iF}$ is the estimated TTC of sample $i$, obtained by linear regression on only the features in $F$.
The steps of pivotal feature selection using the HRFS method are as follows:
1) Initialization: $S$ is the input candidate feature set. $F$ is the output pivotal feature set and is initialized to the empty set. $R_{as}(F)$ is initialized to 1. In this paper, the accuracy threshold $\eta$ of $R_{as}(F)$ is 0.1%, which means a candidate feature can only be selected as a pivotal feature if it reduces $R_{as}(F)$ by at least 0.1%.
2) Forward Selection: Repeat the following until no new candidate feature can be selected into $F$: find the candidate feature $s$ from $S$ that minimizes $R_{as}(F \cup s)$; if $R_{as}(F) - R_{as}(F \cup s) > \eta$, then $s$ is selected into $F$ and removed from $S$.
3) Backward Replacement: Let $N_F$ be the size of $F$ after step 2. Set $i = 0$ and repeat the following until $i = N_F$: find the candidate feature $s$ from $S$ that minimizes $R_{as}((F \setminus f_i) \cup s)$; if $R_{as}((F \setminus f_i) \cup s) < R_{as}(F)$, then $s$ is moved from $S$ into $F$, and $f_i$ is moved from $F$ back into $S$. Then set $i = i + 1$. The final size of $F$ is $m$.
After the pivotal features are selected, the training sample database that is directly used in the learning part in Section IV is obtained as follows:
$$L = \{(f_i, TTC_i)\}_{i=1}^{N_L}, \quad U = \{f_i\}_{i=1}^{N_U} \tag{8}$$
where $L$ is the labeled training sample set, $U$ is the unlabeled training sample set, $f_i \in R^m$ is the vector of pivotal features in $F$ of the $i$th sample, and $TTC_i$ is the label of the $i$th sample.
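The forward selection and backward replacement steps of Section III-B can be sketched as below. The exact form of the accuracy standard is not reproduced here; the sketch assumes it is the residual sum of squares of a no-intercept linear fit divided by the sum of squared targets, which equals 1 for an empty feature set and so matches the stated initialization. The function names are illustrative, and `X` holds the candidate feature deviations while `y` holds the target deviations.

```python
import numpy as np

def r_as(X, y, cols):
    """Assumed accuracy standard: normalized residual of a no-intercept linear fit."""
    if not cols:
        return 1.0
    A = X[:, cols]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid / (y @ y))

def hrfs(X, y, eta=1e-3):
    """Return the indices of the selected pivotal features."""
    S = list(range(X.shape[1]))  # candidate feature set
    F = []                       # pivotal feature set, initially empty
    # Forward selection: add the best candidate while it lowers R_as by more than eta.
    while S:
        best = min(S, key=lambda s: r_as(X, y, F + [s]))
        if r_as(X, y, F) - r_as(X, y, F + [best]) > eta:
            F.append(best)
            S.remove(best)
        else:
            break
    # Backward replacement: try to swap each selected feature for a better candidate.
    for i in range(len(F)):
        if not S:
            break
        rest = F[:i] + F[i + 1:]
        best = min(S, key=lambda s: r_as(X, y, rest + [s]))
        if r_as(X, y, rest + [best]) < r_as(X, y, F):
            S.remove(best)
            S.append(F[i])
            F[i] = best
    return F

# Synthetic check: only columns 1 and 4 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.01 * rng.normal(size=200)
print(sorted(hrfs(X, y)))
```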

IV. MODEL ESTABLISHMENT USING COREG
In this section, the relationship between TTC and the pivotal features is first qualitatively analyzed in a test system and a real-world system with wind power integration, and then quantitatively learned using the semi-supervised learning approach COREG. COREG uses two diverse regressors that label the unlabeled training samples for each other in order to be updated, and the final output is the average output of the two regressors [18]. Here, two Support Vector Machines (SVM) are used as the regressors. The theory of the SVM regressor is first briefly introduced, and then the dynamic TTC estimation model is established using COREG.

FIGURE 1. Relationships between TTC and the first two pivotal features in test system.

FIGURE 2. Relationships between TTC and the first two pivotal features in real-world system.

A. QUALITATIVE RELATIONSHIP BETWEEN TTC AND MAIN PIVOTAL FEATURES
The relationships between TTC and the main pivotal features in the test system and the real-world system with wind power integration are shown in Figure 1 and Figure 2, respectively. Each black spot represents an operating condition of the system and its corresponding TTC. Each system considers two different scenarios. The detailed information of the test system and the real-world system is given in the Appendix and Section VI, respectively. Figure 1 and Figure 2 show that there is a linear relationship between TTC and the main pivotal features in both the test system and the real-world system with wind power integration. Therefore, the relationship between TTC and the pivotal features can be expressed using a linear function.

B. SVM REGRESSOR
The SVM regressor is constructed by seeking the optimal hyperplane that minimizes the total deviation of all training samples from it [20]. Given a labeled training sample set $\{(x_i, y_i), i = 1, \cdots, n\}$, where $x_i \in R^m$, $y_i \in R$ and $n$ is the number of samples, the SVM regressor has the following form:
$$h(x) = \omega^{\top} \varphi(x) \tag{9}$$
where $\varphi(\cdot)$ denotes the mapping that transforms the vector $x$ to a high dimensional space, and $\omega$ denotes the weight vector. Note that in this paper the pivotal features in $F$ and the target feature $\Delta TTC$ are all deviations from the base sample, so there is no bias term in the regressor. By introducing the slack variables $\xi_i$ and $\bar{\xi}_i$, $\omega$ is estimated by minimizing the regularized risk as follows:
$$\min_{\omega, \xi, \bar{\xi}} \ \frac{1}{2} \|\omega\|^2 + \beta \sum_{i=1}^{n} (\xi_i + \bar{\xi}_i) \quad \text{s.t.} \quad y_i - \omega^{\top} \varphi(x_i) \le \varepsilon + \xi_i, \quad \omega^{\top} \varphi(x_i) - y_i \le \varepsilon + \bar{\xi}_i, \quad \xi_i, \bar{\xi}_i \ge 0 \tag{10}$$
where $\varepsilon$ is the radius of the insensitive zone and $\beta$ is the penalty coefficient. By introducing the Lagrange multipliers $\alpha_i$ and $\bar{\alpha}_i$, the optimization problem (10) is transformed into its dual problem and solved, and the SVM regressor is expressed as follows:
$$h(x) = \sum_{i=1}^{n} (\alpha_i - \bar{\alpha}_i) K(x_i, x) \tag{11}$$
where $K(x_i, x)$ is the linear kernel function, which is the inner product of the features mapped to the high dimensional space.
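The following short derivation shows why a linear kernel leads to a linear function of the pivotal features. It assumes the standard dual form of the SVM regressor, a sum over training samples of $(\alpha_i - \bar{\alpha}_i) K(x_i, x)$, and the linear kernel $K(x_i, x) = x_i^{\top} x$, for which $\varphi$ is the identity mapping; $\omega_k$ and $x_k$ denote the $k$th components of $\omega$ and $x$.

```latex
h(x) = \sum_{i=1}^{n} (\alpha_i - \bar{\alpha}_i)\, x_i^{\top} x
     = \Big( \sum_{i=1}^{n} (\alpha_i - \bar{\alpha}_i)\, x_i \Big)^{\top} x
     = \omega^{\top} x
     = \sum_{k=1}^{m} \omega_k x_k
```

Under these assumptions the weight vector is a fixed combination of the training samples, so each component $\omega_k$ acts as the sensitivity of the output to the $k$th pivotal feature.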

C. DYNAMIC TTC ESTIMATION MODEL ESTABLISHMENT USING COREG
COREG uses two labeled training sample sets to train two diverse regressors, and the diversity of the regressors comes from the difference in their training samples. Each regressor estimates the unlabeled training samples and adds the one with the highest training confidence to the labeled training sample set of the other regressor; the regressors are then retrained with the updated labeled training sample sets. The training process is repeated until the given stopping conditions are met, and the final output is the average output of the two regressors. Based on the training sample database in (8), the detailed steps of dynamic TTC estimation model establishment using COREG are as follows:
1) Divide the labeled training sample set $L$ into two sets, $L_1$ and $L_2$.
2) Train two diverse SVM regressors $h_1$ and $h_2$ from $L_1$ and $L_2$, respectively.
3) Randomly extract a subset $U'$ from the unlabeled training sample set $U$. For each $f_{ui}$ in $U'$, $\Omega_i$ is the set of its $k$ nearest neighbor labeled training samples in $L_1$, and the most confident unlabeled training sample $f_u$ is identified by maximizing the deviation (reduction) of the Mean Squared Error (MSE) over $\Omega_i$ as in (12):
$$f_u = \arg\max_{f_{ui} \in U'} \sum_{(f_j, TTC_j) \in \Omega_i} \left[ \left( TTC_j - h_1(f_j) \right)^2 - \left( TTC_j - h_1'(f_j) \right)^2 \right] \tag{12}$$
where $h_1$ is the original regressor and $h_1'$ is the regressor modified with the information of $(f_{ui}, h_1(f_{ui}))$.
Then, $(f_u, h_1(f_u))$ is added to $L_2$ and deleted from $U'$ and $U$. The same method is applied to $h_2$ and $L_1$, so that both $L_1$ and $L_2$ are updated.
4) Return to step 2 and repeat until the maximum number of iterations is reached or no unlabeled training sample can reduce the MSE of the regressors.
5) The final output is the average output of the two regressors:
$$h(f) = \frac{1}{2} \left( h_1(f) + h_2(f) \right) \tag{13}$$
Using the linear kernel function $K$, the relationship between TTC and the pivotal features $f$ is obtained as follows:
$$\Delta TTC = b_1 f_1 + \cdots + b_m f_m \tag{14}$$
where $b_1, \cdots, b_m$ denote the linear relationship parameters.
The dynamic TTC estimation model is as follows:
$$TTC = TTC_0 + \Delta TTC \tag{15}$$
The dynamic TTC estimation model is composed of two parts: the first part, $TTC_0$, is the TTC of the ultra-short-term prediction operating condition, and the second part, $\Delta TTC$, is the TTC deviation caused by the deviation between the real-time operating condition and the ultra-short-term prediction operating condition. Therefore, once the real-time pivotal features $f$ are obtained from online measurements, the model can provide the estimate of the real-time TTC.
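The co-training loop of Section IV-C can be sketched in a few lines. Note the substitution: this sketch uses two k-NN regressors, as in the original COREG formulation, instead of the SVM regressors of this paper, so it returns a predictor rather than the linear coefficients. A k-NN regressor is used because its predictions change when a self-labeled sample is added, which keeps the confidence measure informative in a small example. All names and parameter values are illustrative.

```python
import numpy as np

def knn_predict(F, y, x, k):
    """Mean label of the k rows of F nearest to x."""
    order = np.argsort(np.linalg.norm(F - x, axis=1))
    return y[order[:k]].mean()

def confidence_gain(F, y, x_u, k):
    """Squared-error reduction on the k labeled neighbours of x_u after adding (x_u, h(x_u))."""
    y_u = knn_predict(F, y, x_u, k)
    omega = np.argsort(np.linalg.norm(F - x_u, axis=1))[:k]
    F_new, y_new = np.vstack([F, x_u]), np.append(y, y_u)
    gain = 0.0
    for i in omega:
        before = knn_predict(F, y, F[i], k)
        after = knn_predict(F_new, y_new, F[i], k)
        gain += (y[i] - before) ** 2 - (y[i] - after) ** 2
    return gain, y_u

def coreg(FL, yL, FU, k=3, pool_size=20, max_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    FL, yL, FU = (np.asarray(a, dtype=float) for a in (FL, yL, FU))
    # Step 1: split the labeled set into L1 and L2.
    halves = np.array_split(rng.permutation(len(FL)), 2)
    sets = [[FL[h], yL[h]] for h in halves]
    for _ in range(max_iter):
        added = False
        for a, b in ((0, 1), (1, 0)):  # regressor a labels a sample for regressor b
            if len(FU) == 0:
                break
            pool = rng.choice(len(FU), size=min(pool_size, len(FU)), replace=False)
            gains = [confidence_gain(sets[a][0], sets[a][1], FU[j], k) for j in pool]
            best = max(range(len(pool)), key=lambda j: gains[j][0])
            if gains[best][0] <= 0:
                continue  # no candidate reduces the error
            sets[b][0] = np.vstack([sets[b][0], FU[pool[best]]])
            sets[b][1] = np.append(sets[b][1], gains[best][1])
            FU = np.delete(FU, pool[best], axis=0)
            added = True
        if not added:
            break

    def predict(f):
        # Final output: average of the two regressors.
        f = np.asarray(f, dtype=float)
        return 0.5 * (knn_predict(*sets[0], f, k) + knn_predict(*sets[1], f, k))
    return predict

rng = np.random.default_rng(0)
FL = rng.uniform(-1, 1, size=(40, 2))
yL = 2.0 * FL[:, 0] - FL[:, 1]
FU = rng.uniform(-1, 1, size=(100, 2))
predict = coreg(FL, yL, FU)
print(predict([0.0, 0.0]))
```

Replacing `knn_predict` with a trained linear SVM regressor would recover the structure described in the text, at the cost of retraining the regressor for every candidate in the pool.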

V. ARCHITECTURE OF ONLINE DYNAMIC TTC ESTIMATION
The dynamic TTC estimation model is established online based on the ultra-short-term predictions of wind power and load. With the real-time data of pivotal features obtained from online measurements like SCADA as the input, the model can provide the estimation of the real-time TTC. The architecture of the proposed online dynamic TTC estimation method is shown in Figure 3.

VI. CASE STUDIES
The proposed online dynamic TTC estimation method is applied to Gansu Province Power Grid in the northwest area of China. The simulation and calculation are done on a computer with an Intel Core i7 CPU and 16GB RAM, using the Power System Analysis Software Package (PSASP). The base power is S B = 100MVA in the system.

A. TEST SYSTEM
The diagram of the simplified Gansu Province Power Grid is shown in Figure 4. Wind farms are centralized in the sending-side. The sending-side and receiving-side are connected by the 750kV AC transmission interface, which is composed of two 750kV transmission lines of Hexi-Wusheng and two 750kV transmission lines of Shazhou-Yuka.

B. SAMPLE DATABASE GENERATION
The ultra-short-term predictions of the centralized wind power and the total load are assumed to be 4236MW and 15127MW, respectively. Their prediction errors follow mutually independent Gaussian distributions, and the means and variances of the Gaussian distributions are random variables within the constant intervals given in (16) and (17). According to the ultra-short-term predictions and the high-order uncertainty probability distributions of the prediction errors, 2000 two-dimensional ''wind-load'' arrays are sampled using the Monte-Carlo sampling approach, as shown in Figure 5.
Based on each of the two-dimensional arrays, the corresponding operating condition is obtained through power flow calculation. 500 operating conditions are randomly selected, represented by the red spots in Figure 5, and their TTCs are calculated using the security constrained RPF. The TTCs are not calculated for the other 1500 operating conditions, which are represented by the blue spots in Figure 5. Thus, the sample database is generated with 500 labeled samples and 1500 unlabeled samples. In particular, the ultra-short-term prediction operating condition and its TTC, $(s_0, TTC_0)$, form the base sample.

C. PIVOTAL FEATURE SELECTION
Based on the sample database above, using the pivotal feature selection method in Section III, η = 0.1%, the pivotal features which greatly correlate with the TTC are selected from 160 candidate features like bus voltage deviations, power flow deviations over transmission lines, load deviations and generation power deviations. The pivotal features are shown in Table 1.
After the pivotal features are selected, the training sample database is obtained. A labeled training sample is composed of a vector of pivotal features of a sample and its corresponding TTC, and an unlabeled training sample is a vector of pivotal features of a sample without TTC.

D. DYNAMIC TTC ESTIMATION MODEL
The relationship between TTC and the pivotal features is learned using COREG as presented in Section IV, and is given in (18) in per-unit values:
$$\Delta TTC = 20.128\,\Delta U_{Hexi} + 12.836\,\Delta U_{Shazhou} - 2.207\,\Delta P_T + 1.895\,\Delta P_{SL} - 0.808\,\Delta P_W \tag{18}$$
The TTC of the ultra-short-term prediction operating condition is 75.231 p.u., so the dynamic TTC estimation model is as follows:
$$TTC = 75.231 + \Delta TTC \tag{19}$$
Equation (19) can be read as follows: as the operating condition deviates from the ultra-short-term prediction operating condition, the TTC changes from $TTC_0$ accordingly. For example, when the voltage of Hexi Substation increases by 0.1 p.u., the TTC increases by 2.01 p.u. This is because increasing the voltage of Hexi Substation is beneficial to the transient stability of the system, and therefore the TTC of the transmission interface is increased.
The dynamic TTC estimation model can estimate the real-time TTC from the pivotal features read from online measurements such as SCADA. In addition, it provides a quantitative way to increase the TTC of the transmission interface.
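As a worked example, the model can be evaluated directly. The sketch reads the coefficients of (18) as multipliers of the deviations of the pivotal features from the base sample, in per unit, and adds the base value of 75.231 p.u. The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Coefficients of (18); keys are illustrative names for the pivotal feature deviations.
COEFFS = {
    "dU_Hexi": 20.128,
    "dU_Shazhou": 12.836,
    "dP_T": -2.207,
    "dP_SL": 1.895,
    "dP_W": -0.808,
}
TTC0 = 75.231  # TTC of the ultra-short-term prediction operating condition, p.u.

def estimate_ttc(deviations):
    """Estimated TTC in p.u.; features missing from `deviations` are taken as zero."""
    delta = sum(COEFFS[name] * deviations.get(name, 0.0) for name in COEFFS)
    return TTC0 + delta

# Hexi voltage 0.1 p.u. above its predicted value, all other features unchanged.
print(round(estimate_ttc({"dU_Hexi": 0.1}), 4))  # 77.2438
```

The difference from the base value, 2.0128 p.u., is the increase of about 2.01 p.u. quoted in the text.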

E. ACCURACY AND EFFICIENCY
The performance of the online dynamic TTC estimation method is evaluated by accuracy analysis and efficiency analysis.

1) ACCURACY ANALYSIS
The average error of the dynamic TTC estimation is calculated according to (20), and the maximum error is calculated according to (21), where $TTC_i$ is the estimated TTC of the $i$th training sample in the sample database, obtained using the dynamic TTC estimation model, and $TTC_i^*$ is the true TTC of the $i$th training sample, calculated using the RPF method. In order to analyze the estimation error, the true TTCs of all the unlabeled samples are also calculated here using the RPF method.
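A minimal sketch of the two error measures follows. It assumes that both are relative errors with respect to the true TTC, expressed in percent, which is consistent with the percentages reported in this section; the function name is illustrative.

```python
import numpy as np

def ttc_errors(ttc_est, ttc_true):
    """Return (average, maximum) relative estimation error in percent."""
    ttc_est = np.asarray(ttc_est, dtype=float)
    ttc_true = np.asarray(ttc_true, dtype=float)
    rel = np.abs(ttc_est - ttc_true) / np.abs(ttc_true)
    return 100.0 * rel.mean(), 100.0 * rel.max()

avg_err, max_err = ttc_errors([101.0, 98.0], [100.0, 100.0])
print(avg_err, max_err)  # relative errors of 1% and 2% give about 1.5 and 2.0
```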
In this case study, the proportion of labeled samples is 25%. The comparison between the estimated TTC and the true TTC in the sample database is shown in Figure 6, and the fitted curve of the ''estimated TTC - true TTC'' spots is sufficiently close to the line y = x. According to (20) and (21), the average error is 1.42% and the maximum error is 6.55%.
The relationship between the proportion of labeled training samples and the errors is shown in Figure 7. Figure 7 shows that when the proportion of labeled training samples is over 20%, the errors of the dynamic TTC estimation become stable and relatively small.
It is noted that the pseudo labeled samples are used for pivotal feature selection, so they affect the accuracy of TTC estimation through the accuracy of pivotal feature selection. The comparison between the 1500 pseudo labels and their true TTCs is shown in Figure 8.
The average error of the pseudo labels is 3.36% and the maximum error is 11.17%. These errors are relatively large, so the pseudo labeled samples are not used in the quantitative learning step. The following analyzes the impact of using pseudo labeled samples for pivotal feature selection on the accuracy of TTC estimation.
The first ten pivotal features are selected twice: once using 2000 fully labeled samples, and once using 500 labeled samples plus 1500 pseudo labeled samples. The first five pivotal features selected from these two sample databases are the same, so ten features are selected in order to expose the differences. The relationships between TTC and the two sets of pivotal features are then learned, using the same labeled and unlabeled training samples in both cases.
The dynamic TTC estimation model established using the pivotal features selected from all labeled samples is shown in (22), and the model established using the pivotal features selected from labeled samples plus pseudo labeled samples is shown in (23). The comparison of TTC estimation errors between model (22) and model (23) is shown in Table 2. The accuracy of TTC estimation model (22) is higher than that of model (23), because pivotal feature selection using all labeled samples is more accurate than selection using labeled samples plus pseudo labeled samples. However, as shown in models (22) and (23), their first seven pivotal features are the same, and the remaining three pivotal features are much less relevant to TTC, so the error differences between the two TTC estimation models are small. Therefore, using pseudo labels for pivotal feature selection has little effect on the accuracy of TTC estimation.

2) EFFICIENCY ANALYSIS
The computing time of each step of the proposed method is shown in Table 3. It can be seen that the whole process VOLUME 8, 2020  of dynamic TTC estimation model establishment and TTC estimation can be updated within 15 minutes which is the time cycle of the online security control, therefore, it is suitable for online application.
It is worth mentioning that if there is more than one transmission interface in the system, running the computation and simulation in parallel on several computers can ensure that the dynamic TTC estimation is updated within the time cycle of the online security control.
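The timing argument can be made concrete with a rough budget check. The sketch below assumes that each transmission interface is processed independently on one computer and takes the same time. The function name and the per-interface time of 600 s are hypothetical and are not values from Table 3.

```python
import math


def fits_cycle(seconds_per_interface, n_interfaces, n_computers, cycle_seconds=15 * 60):
    """Return (total_seconds, ok) for processing all interfaces in batches.

    Assumption: interfaces are independent, one interface per computer at a
    time, and every interface needs the same computing time.
    """
    batches = math.ceil(n_interfaces / n_computers)
    total = batches * seconds_per_interface
    return total, total <= cycle_seconds


# Hypothetical case: 4 interfaces, 600 s each.
print(fits_cycle(600, 4, 1))  # one computer: four sequential runs
print(fits_cycle(600, 4, 4))  # four computers: one run each, in parallel
```

Under these assumptions, one computer would need 2400 s and miss the 15-minute cycle, whereas four computers working in parallel would finish in 600 s.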

F. COMPARISON WITH OTHER METHODS
To further evaluate the performance of the proposed method, its accuracy and efficiency are compared with those of three other methods of different kinds. These methods are as follows:
Method 1: In [3], the TTC estimation model is established using the supervised method with linear regression approach.
Method 2: In [9], the TTC estimation model is established using the Deep Belief Network (DBN) supervised method with multivariate polynomial non-linear regression approach.
Method 3: In [10], the TTC estimation model is established using the Deep Network (DN) supervised method with Neural Network non-linear regression approach.
The comparison of errors is shown in Table 4. According to Table 4, methods 1-3 have smaller errors than the proposed method. This is because those methods use supervised approaches with 2000 labeled samples to learn the relationship between the TTC and the pivotal features, whereas the proposed method uses a semi-supervised learning approach with 500 labeled samples and 1500 unlabeled samples. Because the relationship between the TTC and the main pivotal features is linear, the error difference between the linear method and the non-linear methods is small. The accuracy of the proposed method is sufficient for online security control.
The comparison of computing time is shown in Table 5. For step 1, sample generation, the proposed method takes much less time than methods 1-3, because the semi-supervised learning approach needs far fewer labeled samples than the supervised learning approaches, so fewer TTC calculations are needed to label samples, which saves a large amount of computing time. For step 2, pivotal feature selection, the proposed method first calculates approximate TTCs for some of the unlabeled samples as their pseudo labels before selecting the pivotal features, so it takes slightly more computing time. For step 3, model establishment, the proposed method uses the unlabeled training samples and learns the relationship between the TTC and the pivotal features by repeatedly training two linear SVM regressors, so it takes more computing time than training one linear regressor in method 1. The non-linear regression approaches of methods 2 and 3 also take more computing time than method 1, so repeatedly training two non-linear regressors would take much more time. For step 4, TTC estimation with real-time data, all methods take less than 1 s, which is negligible. Because the proposed method saves a large amount of computing time in the most time-consuming step, its total computing time is less than those of methods 1-3.
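To illustrate why step 3 involves repeated training, the following is a deliberately simplified cotraining-style loop, not the COREG implementation used in this paper. It assumes a single feature, ordinary least-squares lines in place of linear SVM regressors, and a confidence check evaluated on the receiving regressor's nearest labeled samples. All function names and data are hypothetical.

```python
def fit_line(samples):
    """Ordinary least squares for y = a*x + b on (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def predict(model, x):
    a, b = model
    return a * x + b


def local_sse(model, labeled, x, k=3):
    """Squared error of `model` on the k labeled samples nearest to x."""
    nearest = sorted(labeled, key=lambda s: abs(s[0] - x))[:k]
    return sum((y - predict(model, xi)) ** 2 for xi, y in nearest)


def cotrain(labeled, unlabeled, rounds=10, k=3):
    """Simplified cotraining: each regressor pseudo-labels samples for the other."""
    # The two regressors start from different halves of the labeled set.
    sets = [list(labeled[0::2]), list(labeled[1::2])]
    pool = list(unlabeled)
    for _ in range(rounds):
        changed = False
        for j in (0, 1):
            other = 1 - j
            teacher = fit_line(sets[j])
            student = fit_line(sets[other])
            best_gain, best = 0.0, None
            for x in pool:
                y_hat = predict(teacher, x)
                # Retrain the receiving regressor with the candidate added and
                # keep the candidate that most reduces its local squared error.
                trial = fit_line(sets[other] + [(x, y_hat)])
                gain = (local_sse(student, sets[other], x, k)
                        - local_sse(trial, sets[other], x, k))
                if gain > best_gain:
                    best_gain, best = gain, (x, y_hat)
            if best is not None:
                sets[other].append(best)
                pool.remove(best[0])
                changed = True
        if not changed:
            break
    models = [fit_line(s) for s in sets]
    # Final estimate: average of the two regressors.
    return lambda x: 0.5 * (predict(models[0], x) + predict(models[1], x))


# Synthetic data: y = 2x + 1 plus small fixed noise (illustration only).
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2]
labeled = [(float(i), 2.0 * i + 1.0 + noise[i]) for i in range(8)]
unlabeled = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
estimate = cotrain(labeled, unlabeled)
print(estimate(5.0))
```

Each candidate evaluation retrains a regressor, so the cost grows with the number of rounds and the size of the unlabeled pool. This is why cheap linear regressors keep this step affordable, whereas repeatedly retraining two non-linear regressors would be much slower.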
In conclusion, the proposed method saves a large amount of online computing time while maintaining sufficient accuracy, and is therefore more suitable for online application.

VII. CONCLUSIONS
In this paper, an efficient online dynamic TTC estimation method is proposed. First, the dynamic TTC estimation model is established in three key steps: sample database generation, pivotal feature selection, and semi-supervised regression. Then, with real-time data as the model input, the TTC is estimated. When generating the sample database, the high-order uncertainties of wind and load are considered so that the expected operating conditions are covered more accurately. The HRFS method is used to select pivotal features and lower the dimension of the samples. The qualitative analysis shows that there is a linear relationship between the TTC and the main pivotal features in both the test system and the real-world system with wind power integration. Therefore, to reduce online computing time, the relationship between the TTC and the pivotal features is learned using the semi-supervised learning approach COREG with linear regressors. This approach requires fewer labeled training samples than supervised learning approaches, and thus saves much of the online simulation and calculation time spent on labeling samples. The case study results demonstrate that, with the proposed method, the whole process of dynamic TTC estimation model establishment and TTC estimation can be refreshed every 15 minutes on a Chinese provincial power system, and that the TTC estimation is sufficiently accurate. The accuracy and efficiency comparison with other typical existing methods shows that the proposed method saves a large amount of online computing time while maintaining sufficient accuracy, and is therefore more suitable for online security control.

APPENDIX
The test system is the modified IEEE 39-bus system shown in Figure 9. Based on the original IEEE 39-bus system, a wind farm is integrated at bus 17. The rated capacity of the wind farm is 400 MW. The wind farm is represented by an equivalent of 200 identical doubly fed induction generators (DFIGs), and the wind turbine parameters are shown in Table 6. The transmission interface is composed of line 1-39, line 2-3, line 18-3, and line 16-15. When calculating the TTC of the transmission interface, the N-1 criterion is considered: the contingencies are N-1 three-phase short circuits on each of the transmission lines in the transmission interface, with a fault-clearing time of 0.1 s.