Finding Optimal Point Features in Transient Multivariate Excursions by Horizontally Integrated Trilateral Hybrid Feature Selection Scheme for Transient Analysis

In transient analysis (TA), the processing time (PT) and prediction accuracy (PA) are the most significant indices influencing the decisions of grid operators who must take timely and accurate corrective actions. Achieving low PT and high PA (high-performance TA) requires a comprehensive feature selection scheme that selects optimal transient point features (OTPFs). Hence, a partial-injective trilateral hybrid (filter-wrapper) scheme called PITHS is introduced in this paper. First, the transient dataset, in the form of multivariate time series, is constructed by an integrated programming platform. Next, based on PITHS, the first univariate trajectory feature (UTF1) is entered into the nested trilateral filter phase (NTFP), equipped with three intertwined information-theoretic criteria, for selecting the filter-OTPFs of UTF1 (f-1OTPFs). Then, the f-1OTPFs are fed to the nested trilateral wrapper phase (NTWP) for selecting the filter-wrapper-1OTPFs (fw-1OTPFs). The NTWP includes a hyperplane-based predictive approach accompanied by a triple kernel. After conducting NTWP, the fw-1OTPFs are taken as the first ultimate optimal point features (1UOPFs). Next, the surviving fw-1OTPFs are injected into the subsequent trajectory (UTF2), and the neo-formed trajectory (fw-1OTPFs plus UTF2) drives a new round of NTFP and NTWP for finding the fw-2OTPFs (2UOPFs). By conducting this procedure on the last neo-formed trajectory (fw-(k-1)OTPFs plus UTFk), the fw-kOTPFs are obtained (kUOPFs). Finally, the 1:kUOPFs set is tested with the cross-validation technique to verify its efficacy for TA. The obtained results show that the proposed framework has a prediction accuracy of 98.75 % and a processing time of 152.591 milliseconds for TA.


I. INTRODUCTION
Emerging wide-area monitoring systems (WAMS) like phasor measurement units (PMUs) have caused a soft-hard restructuring of grid monitoring platforms, which has a direct impact on the precise reliability assessment of power systems [1]. In fact, a PMU-based monitoring dashboard shows grid operators the real-time variations of dynamic responses, which raises their awareness of the degree of synchronism among power system components [2]. In this regard, one of the most significant concerns in power system operation is maintaining the synchronism of power system components under severe and sudden disturbances, called transient stability [3]. Hence, transient stability assessment (TSA), which identifies unstable conditions by mining the dynamic characteristics of system variables, is a vital task. Detecting transient instability via TSA makes it possible to take corrective control action for a secure and adequate power supply. However, such actions can keep the system in normal operation only if they are conducted in a timely manner. Timely corrective action requires fast detection of the transient stability status (stable or unstable), which is achieved by applying robust, lightweight predictive data mining (DM) techniques to a small observation window (SOW) of transient features [4,5]. With these points in mind, the processing time of transient stability prediction (TSP), which includes the prediction and observation times, must be less than one second (< 1 s) [6]. Besides the time complexity of the classifier that labels transient samples, the SOW plays a major role in reducing the processing time of TSP. However, neglecting the intrinsic characteristics of the transient features in the chosen SOW negatively affects the training and testing of classification techniques, which leads to low TSP accuracy.
In other words, selecting the most relevant transient features in the form of the best-laid SOW is necessary to achieve high performance (time and accuracy) in TSP. This challenge can be solved by conducting a feature selection process, which is known as the most prominent category in DM technology. Hence, designing feature selection schemes (FSS) has become an active research topic in TSA and pattern recognition in recent years. Generally, FSS-centric transient studies fall under two categories. 1) Filter method: in [7], a ReliefF-based feature selection algorithm is used to select the most discriminative features for diagnosing faults of induction motors. Also, dynamic stability features are selected by minimum-redundancy maximum-relevance (mRMR) for transient stability assessment of large-scale power systems in [8,9]. In [10], with regard to transient stability constraints, a feature pre-screening strategy based on the fast correlation-based filter (FCBF) method selects optimal features to achieve a high-performance total transfer capability calculation model. 2) Filter-wrapper method: in [11], the optimal trajectory cluster features are extracted from large observations of rotor angle and voltage magnitude by a hybrid Relief-support vector machine framework for TSP. Also, extracting the most relevant features by a segment-oriented filter-wrapper method on reactive power-based two-variate time series has been considered for TSP in [12]. In [13], the bi-mode hybrid feature selection scheme (BMHFSS) finds optimal transient features in multivariate time series by coupling point-based and trajectory-based filter-wrapper schemes. A closer look at the structure of these classical FSS shows that they were designed based on a vertically integrated strategy.
For example, in a vertical-oriented hybrid FSS, the entire feature space [13] or a fragmented feature space (fragment 1, fragment 2, and so on) [12] is first entered into information theory-based approaches (filter), and the selected primary optimal features are then fed to predictive algorithms (wrapper) for finding the final optimal features. Such a cohesive-mode strategy extracts the intrinsic characteristics of some transient multivariate time series as optimal features. Although the optimal features selected in vertical mode induce high accuracy in TSP, the relevant features of some trajectories (called optimal-blurred features) are overshadowed by this strategy. In fact, such an approach may lose discriminative transient features per univariate trajectory and negatively affect TSP performance in the presence of a severe transient space. Hence, designing a horizontally integrated hybrid FSS, which considers all trajectories in a generalized form, is essential for selecting the most discriminative features. Applying the proposed hybrid FSS based on the partial-injective scenario provides the opportunity to find the optimal-blurred features as the best-laid SOW.
According to the above, designing a comprehensive FSS based on a proper strategy is one of the main ways to achieve the key indices, namely processing time (PT) and prediction accuracy (PA), in TSA. In fact, achieving low PT and high PA based on the most discriminative transient features in the presence of a severe transient space provides the necessary context for timely and accurate corrective actions in power systems. To this end, as can be seen in Fig. 1, we consider a three-step scenario for TSA: 1) in the first step, the transient dataset is generated by transient simulation on the New England-New York interconnection (NETS-NYPS) test case; 2) next, the partial-injective trilateral hybrid scheme (PITHS), based on the horizontally integrated mode, is applied to the transient multivariate trajectory features (TMTFs) and consists of two nested trilateral phases: a) in the nested trilateral filter phase (NTFP), the TMTFs are entered into the NTFP, equipped with three intertwined information theory criteria, for selecting the filter-optimal transient point features (f-OTPFs), and b) in the nested trilateral wrapper phase (NTWP), the f-OTPFs are entered into the hyperplane-based predictive approach based on the triple kernel to find the filter-wrapper optimal transient point features (fw-OTPFs); and 3) in the final step, the performance of TSP based on the selected ultimate optimal point features (UOPFs) is evaluated by the cross-validation technique.
The rest of the paper is organized as follows: the structure of PITHS is described in Section 2. Experimental results of the proposed framework are presented in Section 3. Finally, the conclusion is given in Section 4.

II. OVERALL PITHS PROCEDURE
The conjoined steps of PITHS for selecting the best-laid SOW in TMTFs for TSA are depicted in Fig. 2. A glance at the structure of the proposed scheme shows that PITHS implements a partial-injective policy on transient multivariate excursions during the feature selection process. In fact, the fw-1 OTPFs of the first univariate trajectory feature (UTF 1) that survive the first round of the dual phase of PITHS (partial manner) are taken as the first ultimate optimal point features (1 UOPFs) and are then joined with the subsequent univariate trajectory feature (UTF 2) (injection) to run a new round of the dual phase for finding the fw-2 OTPFs (2 UOPFs). This horizontal integration continues until the last neo-formed trajectory (LnfT), the combination of the fw-k-1 OTPFs and UTF k, is obtained. Next, the LnfT is entered into the final dual phase of PITHS to select the fw-k OTPFs as the k UOPFs. Besides the partial-injective scenario used by PITHS to obtain the UOPFs set (green-face circles in Fig. 2), the two nested trilateral phases (filter and wrapper), called 2NTPs, play a driving role in extracting the UOPFs. The 2NTPs consist of the nested trilateral filter phase (NTFP) and the nested trilateral wrapper phase (NTWP). In NTFP, the filter-optimal transient point features, called f-OTPFs, are selected via three information theory-based criteria, namely relevance, interdependence, and redundancy (RIR). The RIR analysis is based on basic quantities such as entropy, conditional entropy, mutual information (MI), and conditional MI. After the RIR analysis, the NTWP is applied to the f-OTPFs for finding the fw-OTPFs. In fact, the NTWP is a supplementary phase in PITHS that covers the weakness of the RIR analysis, which ignores supervised learning-based analysis in the feature selection process. In this regard, the hyperplane-based approach equipped with elastic and non-elastic kernels is used in NTWP for selecting the fw-OTPFs.
The detailed descriptions of the NTFP and NTWP are considered in the following sections.
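The partial-injective loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow: the function name `piths_rounds`, the encoding of a point feature as a (trajectory index, cycle index) pair, and the two toy phases are assumptions made for the sketch, and the real NTFP and NTWP are passed in as placeholders.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a point feature is identified by (trajectory index, cycle index).
PointFeature = Tuple[int, int]
Phase = Callable[[List[PointFeature]], List[PointFeature]]


def piths_rounds(
    utfs: Sequence[Sequence[PointFeature]], ntfp: Phase, ntwp: Phase
) -> List[List[PointFeature]]:
    """Return the per-round outputs [1 UOPFs, 2 UOPFs, ..., k UOPFs]."""
    uopfs_per_round: List[List[PointFeature]] = []
    survivors: List[PointFeature] = []  # fw-(k-1) OTPFs; empty before round 1
    for utf in utfs:
        neo_formed = survivors + list(utf)  # inject survivors into UTF k
        f_otpfs = ntfp(neo_formed)          # filter phase (placeholder)
        fw_otpfs = ntwp(f_otpfs)            # wrapper phase (placeholder)
        uopfs_per_round.append(fw_otpfs)
        survivors = fw_otpfs
    return uopfs_per_round


# Toy stand-ins for the two phases, used only to exercise the loop.
def toy_ntfp(pfs: List[PointFeature]) -> List[PointFeature]:
    return [pf for pf in pfs if pf[1] % 2 == 0]  # keep even cycles


def toy_ntwp(pfs: List[PointFeature]) -> List[PointFeature]:
    return pfs[-2:]  # keep the last two candidates


utf1 = [(1, c) for c in range(1, 5)]
utf2 = [(2, c) for c in range(1, 3)]
rounds = piths_rounds([utf1, utf2], toy_ntfp, toy_ntwp)
print(rounds)
```

In this sketch the output of each round is both recorded and carried forward, which mirrors the dual role of the fw-k OTPFs as the k UOPFs and as part of the next neo-formed trajectory.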

A. NESTED TRILATERAL FILTER PHASE (NTFP)
In the filter phase of PITHS, we consider the statistical and intrinsic characteristics of the trajectory features based on information theory. Three significant information-theoretic criteria, which measure how strongly the point features of each trajectory are related to the target class, are considered in the NTFP, the leading transient data analytic package in each stage of PITHS (See Fig. 2). As can be seen in Fig. 3, each NTFP round of PITHS consists of the following nested steps:
Step 1) Specifying the input trajectory: In the first stage of PITHS, the input trajectory, consisting of the first univariate trajectory UTF 1 (See Fig. 2), is entered into NTFP 1. In the next stage of PITHS (2nd round), the input trajectory, consisting of the fw-1 OTPFs (the optimal point features extracted by applying NTFP 1 and NTWP 1 (See Section 2.2) to UTF 1) and UTF 2, is fed to NTFP 2. This procedure continues until the kth UTF, where the fw-k-1 OTPFs and UTF k (called LnfT) are entered into NTFP k in the last stage of PITHS. So, the input of NTFP k is given by (1), where k indicates the number of univariate time series, constrained to $\{k \mid k = 1, 2, \ldots, 28\}$:

$$\mathrm{Input}_k = \begin{cases} \mathrm{UTF}_1, & k = 1 \\ \text{fw-}(k-1)\,\mathrm{OTPFs} \cup \mathrm{UTF}_k, & k = 2, \ldots, 28 \end{cases} \tag{1}$$

Step 2) Calculating relevance rate: The first component of the RIR analysis, namely the relevance rate, is considered in this step. To this end, symmetric uncertainty (SU) [14] is used for selecting the point features (PFs) of the univariate input trajectory that are tightly related to the target class. In terms of SU, entropy, conditional entropy, and mutual information (MI) are the main quantities for measuring the information shared between pf ∈ PFs of the input trajectory and the target class. In this regard, the entropy H(pf) is defined as:

$$H(pf) = -\sum_{x} p(x) \log_2 p(x) \tag{2}$$

where pf ∈ PFs is a discrete random variable with probability mass function $p(x) = \Pr\{pf = x\}$. Also, the conditional entropy calculates the entropy of pf given knowledge of the target class C as follows:

$$H(pf \mid C) = -\sum_{c} p(c) \sum_{x} p(x \mid c) \log_2 p(x \mid c) \tag{3}$$

Now, MI is defined as Eq. (4):

$$MI(pf; C) = H(pf) - H(pf \mid C) \tag{4}$$

Regarding Eqs. (2) to (4), the SU index is calculated as a normalized form of MI, given by:

$$SU(pf, C) = \frac{2\, MI(pf; C)}{H(pf) + H(C)} \tag{5}$$

Step 3) After calculating SU per pf ∈ PFs of the input trajectory, each pf is placed, by setting proper thresholds, in one of three bundles (high SU PFs, middle SU PFs, and low SU PFs) according to its SU value. The main reason for this decision is to avoid absolute reliance on the SU-based results and to prevent the forced removal of pf ∈ PFs in the middle and low SU bundles. In fact, not settling for the SU-oriented analysis gives the middle SU and low SU point features a chance to be re-analyzed via the interdependence-redundancy (IR) analysis and the 1-persistence trilateral filter (See Step 4).
Step 4) After bundling the PFs of the input trajectory, each bundle (high SU PFs, middle SU PFs, and low SU PFs) is entered into a complementary analysis based on the IR index as (6). In IR analysis, the effect of the presence of [...] Consequently, by following Step 1 to Step 4, the filter-optimal transient point features (f-OTPFs) per bundle are obtained (high SU f-OTPFs, middle SU f-OTPFs, and low SU f-OTPFs). For the details of the RIR analysis, see the pseudo-code of the NTFP (Step 1 to Step 4) in Table 1.
Step 5) The obtained high SU f-OTPFs, middle SU f-OTPFs, and low SU f-OTPFs are joined together and entered into the IR function once again (1-persistence scenario) for selecting the f-OTPFs of the input trajectory. After these steps, the NTFP k-specific f-k OTPFs are fed to NTWP k (k = 1 to 28) (See Fig. 2 and Section 2.2) for finding the NTWP k-specific fw-k OTPFs at the end of each round of PITHS as the k UOPFs (e.g., in the last round, k = 28: the f-28 OTPFs are entered into NTWP 28 for finding the fw-28 OTPFs as the 28 UOPFs).
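As an illustration of the relevance and bundling steps (Step 2 and Step 3), the following Python sketch computes the symmetric uncertainty of discrete point features from the standard definitions of entropy, conditional entropy, and mutual information, with SU taken as twice the MI divided by the sum of the feature and class entropies, and then sorts the features into three bundles. It assumes the point features have already been discretized. The cut-off values 0.3 and 0.7, the function names, and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(values: Sequence) -> float:
    """Shannon entropy (base 2) of a discrete sample."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(values: Sequence, labels: Sequence) -> float:
    """Entropy of the feature given the class label."""
    n = len(values)
    total = 0.0
    for label, count in Counter(labels).items():
        subset = [v for v, l in zip(values, labels) if l == label]
        total += (count / n) * entropy(subset)
    return total


def symmetric_uncertainty(values: Sequence, labels: Sequence) -> float:
    """SU = 2 * MI / (H(feature) + H(class)), in the range [0, 1]."""
    h_x, h_c = entropy(values), entropy(labels)
    if h_x + h_c == 0.0:
        return 0.0
    mi = h_x - conditional_entropy(values, labels)
    return 2.0 * mi / (h_x + h_c)


def bundle_by_su(
    su_scores: Dict[str, float], low_cut: float = 0.3, high_cut: float = 0.7
) -> Dict[str, List[str]]:
    """Place each feature in the high, middle, or low SU bundle.

    The two cut-offs are assumed values chosen for this sketch only.
    """
    bundles: Dict[str, List[str]] = {"high": [], "middle": [], "low": []}
    for name, su in su_scores.items():
        if su >= high_cut:
            bundles["high"].append(name)
        elif su >= low_cut:
            bundles["middle"].append(name)
        else:
            bundles["low"].append(name)
    return bundles


# Toy data: four transient samples, three discretized point features.
labels = ["stable", "stable", "unstable", "unstable"]
features = {"pf1": [0, 0, 1, 1], "pf2": [0, 1, 0, 1], "pf3": [0, 0, 0, 1]}
su = {name: symmetric_uncertainty(vals, labels) for name, vals in features.items()}
bundles = bundle_by_su(su)
print(su, bundles)
```

Keeping the middle and low bundles, rather than discarding them, is what lets the later IR analysis re-examine features that SU alone would have removed.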

B. NESTED TRILATERAL WRAPPER PHASE (NTWP)
For more analysis on transient features, the outputs of NTFP are fed to the predictive-oriented analysis called NTWP to extract the most discriminative transient features.
As can be seen in Fig. 2, the NTFP k-specific f-k OTPFs are entered into NTWP k (e.g., first round: f-1 OTPFs into NTWP 1; last round: f-28 OTPFs into NTWP 28), which serves as a complementary analysis in the proposed FSS. The NTWP is equipped with a hyperplane-based classifier accompanied by the triple kernel to evaluate the efficacy of the features selected by NTFP. In fact, the point features in each NTFP k-specific f-k OTPFs that induce high performance in TSP survive the NTWP. Generally, each NTWP in PITHS is formed from three components as follows:
1) The wrapper approach based on an incremental mechanism: Regardless of the effective role of the filter method in selecting f-OTPFs, the wrapper method, which applies supervised learning models, is the key approach for evaluating the predictive capacity of the f-OTPFs. In fact, selecting from the f-OTPFs the discriminative transient point features (DTPFs) that lead to the correct prediction of unseen transient samples (stable or unstable) is the main goal of incorporating this approach in the FSS process. Moreover, a mechanism that checks the performance of the DTPFs subset for TSP in an incremental manner is another important aspect of the wrapper phase. One such mechanism is incremental wrapper subset selection (IWSS) [15], which uses the SU value of each member of the f-OTPFs as the criterion for ordering the entry of features into the learning model in each iteration. In IWSS, first, the feature with the highest SU is added to the DTPFs subset, which is fed to the learning algorithm. Then, the classification accuracy is recorded as the best result. In the next iteration, the feature with the second-highest SU is added to the DTPFs subset, and the training and testing procedure is conducted on the current members of DTPFs.
If adding this feature to DTPFs increases the prediction accuracy over that of the preceding DTPFs subset, the feature remains in the DTPFs subset; otherwise, it is removed, and the next feature is added to DTPFs for the subsequent iteration of IWSS.
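The IWSS iterations described above can be sketched as follows. The learning model is abstracted as a callable `evaluate` that returns the accuracy of a candidate subset; in the paper this role is played by the kernel SVM train-test procedure, whereas the `toy_accuracy` function and the SU values below are hypothetical and serve only to exercise the loop.

```python
from typing import Callable, Dict, List


def iwss(su: Dict[str, float], evaluate: Callable[[List[str]], float]) -> List[str]:
    """Incremental wrapper subset selection over SU-ranked features.

    Features enter in descending SU order; a feature is kept only if it
    strictly improves the accuracy of the current subset.
    """
    ranked = sorted(su, key=su.get, reverse=True)
    selected: List[str] = []
    best_acc = float("-inf")  # so the top-ranked feature is always accepted
    for pf in ranked:
        candidate = selected + [pf]
        acc = evaluate(candidate)
        if acc > best_acc:
            selected, best_acc = candidate, acc
    return selected


# Hypothetical evaluator: pf1 and pf7 help, pf3 slightly hurts.
def toy_accuracy(subset: List[str]) -> float:
    return (
        0.5
        + 0.3 * ("pf1" in subset)
        + 0.1 * ("pf7" in subset)
        - 0.02 * ("pf3" in subset)
    )


su_scores = {"pf3": 0.4, "pf1": 0.9, "pf7": 0.8}
dtpfs = iwss(su_scores, toy_accuracy)
print(dtpfs)
```

Passing the evaluator as a parameter keeps the selection loop independent of the classifier, so the same loop can be reused with each of the three kernels.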
2) The hyperplane-based classifier: The type of classification algorithm embedded in the IWSS iterations is a significant issue that affects the learning process (train-test). This becomes doubly important because the processing time (PT) in TA is a crucial concern. As mentioned in the introduction, a reasonable PT for TSP is less than one second (< 1 s); this PT constraint is the necessary condition for timely corrective control action in the power grid. Two factors are influential in achieving low PT, namely the observed length of transient cycles (observation window) and the time complexity of the classification model. With these points in mind, picking a small observation window (SOW) and labeling transient samples in a short time is the only definitive solution to the PT bottleneck. To this end, we need a robust-lightweight classifier, where robust reflects the algorithm's ability to precisely learn the decision boundary in the SOW (high-accuracy prediction), and lightweight refers to its capability of fast detection of the transient stability status (low time complexity). As the best option, the support vector machine (SVM) [16] is a robust learning model that employs a separating hyperplane with low structural risk to classify the limited and not linearly separable transient feature space. Furthermore, several kernels can be embedded in the SVM classifier for optimal matching (point-based or trajectory-based alignment) between transient samples, which increases the generalization capacity of the learning model. On the other hand, the capability of the kernel-based SVM to maximize prediction accuracy without overfitting the training SOW indirectly affects the train-test computational complexity and turns it into a lightweight classifier. This motivated us to use SVM in the IWSS iterations.
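The optimization problem and the constraints of SVM are defined according to (7):

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \tag{7}$$

The optimal separating surface in the transient feature space is given by (8):

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b \right) \tag{8}$$

3) Type of kernel function: In (7) and (8), $K(\cdot, \cdot)$ indicates the kernel function plugged into SVM to learn the optimal decision boundary without raising the computational complexity. In this regard, three efficient kernels based on elastic and non-elastic functions are introduced as follows:
a) The non-elastic kernel; standard Gaussian radial basis function (standard GRBF) kernel [16]: The GRBF kernel works based on point-to-point alignment as (9):

$$K_{GRBF}(x, x') = \exp\left( -\frac{\|x - x'\|^2}{2\sigma^2} \right) \tag{9}$$

where $\|x - x'\|^2$ is the squared Euclidean distance between the two time series features.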
b) The elastic kernel; DTW in GRBF kernel (DTW-GRBF kernel) [17]: Since pattern matching with dynamic time warping (DTW) outperforms Euclidean distance in most cases because of its non-linear alignment, using the DTW distance in the GRBF kernel can help to build a high-performance SVM model for time series classification. Hence, the DTW-GRBF kernel is defined as (10):

$$K_{DTW\text{-}GRBF}(x, x') = \exp\left( -\frac{DTW(x, x')^2}{2\sigma^2} \right) \tag{10}$$

According to the main components of NTWP described above, the NTWP is applied to the f-OTPFs as shown in Fig. 4. As can be seen in Fig. 4, the NTFP k-specific f-k OTPFs are entered into the three stages of NTWP k. In the first stage, the f-k OTPFs are entered into IWSS-SVM REDK. By applying SVM REDK in the IWSS iterations, the REDK fw-k OTPFs are obtained (See panel (a) of Fig. 4). Next, the f-k OTPFs are fed to IWSS-SVM DTW-GRBF, and consequently the DTW-GRBF fw-k OTPFs are obtained as the output of the second stage of NTWP k (See panel (b) of Fig. 4). Unlike the previous two stages of NTWP k, which plug elastic kernels into SVM to feed the IWSS iterations, in the third stage the f-k OTPFs are entered into the non-elastic face of NTWP k with IWSS-SVM GRBF. In this way, the GRBF fw-k OTPFs are extracted as the DTPFs subset of the f-k OTPFs. Next, the pairwise intersections of the kernel fw-k OTPFs ([REDK fw-k OTPFs ∩ DTW-GRBF fw-k OTPFs], [REDK fw-k OTPFs ∩ GRBF fw-k OTPFs], and [DTW-GRBF fw-k OTPFs ∩ GRBF fw-k OTPFs]) are calculated, and the intersection with the most members is taken as the fw-k OTPFs of NTWP k, called the k UOPFs. Furthermore, if the pairwise intersections have the same number of members, we combine their members as the k UOPFs. Also, if the prediction accuracy of one set differs from that of the other two sets by more than 10%, its members are added to the k UOPFs (numeric example in Fig. 4, UOPFs: pf1, pf2, pf3).
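To make the kernel and intersection steps concrete, the sketch below implements a GRBF kernel, a DTW-based variant, and the pairwise-intersection rule in plain Python. Several choices are assumptions of this sketch rather than statements from the paper: the kernels use the common form exp(-d^2 / (2 sigma^2)), the DTW distance uses squared local costs with a final square root and no warping-window constraint, the REDK kernel is not implemented, and the 10% accuracy rule is omitted. The feature sets in the example are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Sequence, Set


def grbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian RBF kernel on the point-to-point (Euclidean) distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dtw_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Dynamic time warping distance via the classic O(n*m) recursion."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return math.sqrt(cost[n][m])


def dtw_grbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """GRBF kernel with the Euclidean distance replaced by the DTW distance."""
    return math.exp(-dtw_distance(x, y) ** 2 / (2.0 * sigma ** 2))


def select_uopfs(kernel_sets: Dict[str, Set[str]]) -> Set[str]:
    """Union of the largest pairwise intersections (ties are combined)."""
    pairs = [a & b for a, b in combinations(kernel_sets.values(), 2)]
    largest = max(len(p) for p in pairs)
    result: Set[str] = set()
    for p in pairs:
        if len(p) == largest:
            result |= p
    return result


# Two series with the same shape, shifted by one cycle.
a, b = [0.0, 0.0, 1.0, 2.0], [0.0, 1.0, 2.0, 2.0]
print(grbf_kernel(a, b, 1.0), dtw_grbf_kernel(a, b, 1.0))

# Hypothetical per-kernel IWSS outputs for one round.
uopfs = select_uopfs(
    {
        "REDK": {"pf1", "pf2", "pf3"},
        "DTW-GRBF": {"pf1", "pf2"},
        "GRBF": {"pf2", "pf3"},
    }
)
print(sorted(uopfs))
```

The shifted pair illustrates the difference between the two alignments: the point-to-point kernel penalizes the shift, while the elastic kernel treats the two series as identical.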

A. TRANSIENT DATASET GENERATION
As can be seen in Fig. 1, transient dataset generation based on dynamic contingency simulation is the preliminary step of the proposed framework for TSP. This step is implemented by the transient dataset generation workflow (TDGW), which consists of two parts, as shown in Fig. 5. In the first part of TDGW, Python, the SIEMENS power system simulator for engineering (PSS/E) planning tools, and the case study (top funnel in Fig. 5) are combined to record transient data from the output channels of basic features (OCBF-X). The X in OCBF-X stands for the basic features, including bus voltages (VOLT), voltage phase angle (VANGLE), machine active power (PELEC), machine reactive power (QELEC), and reactive power consumption (QLOAD) [19]. In this part, the transient sequence of OCBF-X is recorded by coupling a Python script with the PSS/E application program interface (API) routines [20], executed on the 68-bus New England-New York interconnection system (NETS-NYPS) (See Fig. 6) [21]. Note that the obtained transient data of OCBF-X result from applying several disturbances, such as substation outages, generator outages, and line outages, to the NETS-NYPS grid case. In terms of simulation time, the fault duration is set to 0.23 seconds (the time step is 0.0167 seconds). Also, the fault clearing time is set after the end of the fault duration. Furthermore, for recording severe transient samples, different load characteristics are considered by converting the constant MVA load of a specified group of network loads to a specified mixture of constant MVA, constant current, and constant admittance load characteristics. In the second part of TDGW, the OCBF-X-specific univariate trajectories, accompanied by the required add-ons, are entered into the MATLAB-based feature calculation module (bottom funnel in Fig. 5) which leads to construct transient multivariate

B. UOPFs SET
The UOPFs set ( 1 UOPFs to 28 UOPFs) obtained by applying 2NTPs (NTFP and NTWP) in each round of PITHS (rounds 1 to 28) is elaborated in this section. According to (1), in the first round of PITHS, UTF 1 is entered into NTFP.
According to what was mentioned about the NTFP scenario (See Section 2.1), based on the first three steps of NTFP, the PFs of UTF 1 (9 cycles) fall into three bundles based on the relevance rate (SU measure), namely high SU, middle SU, and low SU (Lines 1-10 of Table 1). As can be seen in Fig. 8 [...] Fig. 3). Fig. 9 to Fig. 11 show how the level SU f-OTPFs are selected by IR analysis. As can be seen in Fig. 9 to Fig. 11, {pf1, pf7}, {pf3, pf5}, and {pf2, pf9} are selected as the high SU f-OTPFs, middle SU f-OTPFs, and low SU f-OTPFs, respectively (See the green-border exploded points in the 3-D pies).
After conducting NTFP 1 on UTF 1 for selecting the f-1 OTPFs, the f-1 OTPFs are fed to NTWP (See Section 2.2) for finding the NTWP 1-specific fw-1 OTPFs at the end of the first round of PITHS as the 1 UOPFs. To this end, by applying [...] In each iteration of IWSS based on SVM kernel, the maximum value of the Acc package (retrieved by the optimal pair of learning parameters) is recorded. In Fig. 13, we depict the IWSS-SVM kernel Acc variations over the learning parameters (C, σ) for the selected kernel fw-1 OTPFs. For example, the SVM REDK performance variations (Acc) in the selected iteration of IWSS are shown in panel (a) of Fig. 13 (in the first iteration: pf7 as REDK fw-1 OTPFs). Also, the SVM DTW-GRBF and SVM GRBF performance variations related to the DTW-GRBF fw-1 OTPFs (first iteration with pf7) and GRBF fw-1 OTPFs (first iteration with pf7) are shown in panels (b) and (c) of Fig. 13, respectively. Note that the obtained 1 UOPFs are the final output of the first round (NTFP 1 and NTWP 1) of PITHS and are injected into the second round of PITHS (Refer to (1)). For more clarity, the round-specific results of the 2nd to 28th rounds of PITHS are shown in Table 3 (inputs and outputs of NTFP per round) and Table 4 (inputs and outputs of NTWP per round). In the final step of PITHS, the fw-OTPFs (UOPFs) obtained per round of PITHS (See the last column of Table 4) are joined together to obtain the UOPFs set. The members of the UOPFs set, i.e., the selected cycles of each UTF that will be used for transient stability prediction (See Section 3.3), are listed in Table 5.

C. TSP BASED ON UOPFs SET
After selecting the UOPFs set based on PITHS, the performance of the UOPFs set in TSP was evaluated on the transient dataset using the cross-validation technique with SVM GRBF. In the learning scenario (SVM GRBF) plugged into the cross-validation technique, the optimal values of the learning parameters (C, σ) for high-performance prediction in each fold (train-test in fold 1 to fold 10) were selected from $\{2^{n} \mid n = 0, 1, \ldots, 15\}$. Furthermore, for the performance evaluation of TSP in each fold, three metrics were considered, as listed in Table 6.
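A minimal sketch of the per-fold evaluation is given below. It assumes that the parameter grid consists of the powers of two with exponents 0 to 15, that the three metrics are accuracy (Acc), true positive rate (TPR), and true negative rate (TNR) in their standard confusion-matrix definitions, and that one class is arbitrarily encoded as the positive label 1. The `toy_score` function is a hypothetical stand-in for the fold accuracy obtained by training and testing the SVM with a given (C, σ) pair.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

# Assumed search grid for both C and sigma: 2^0, 2^1, ..., 2^15.
PARAM_GRID = [2 ** n for n in range(16)]


def tsp_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Acc, TPR, and TNR from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "Acc": (tp + tn) / len(pairs),
        "TPR": tp / (tp + fn) if tp + fn else float("nan"),
        "TNR": tn / (tn + fp) if tn + fp else float("nan"),
    }


def best_parameters(score: Callable[[float, float], float]) -> Tuple[int, int]:
    """Exhaustive search for the (C, sigma) pair with the highest score."""
    return max(product(PARAM_GRID, PARAM_GRID), key=lambda pair: score(*pair))


# Hypothetical fold: eight samples, two misclassified.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = tsp_metrics(y_true, y_pred)


# Hypothetical score surface that peaks at C = 8, sigma = 2.
def toy_score(c: float, sigma: float) -> float:
    return -abs(c - 8) - abs(sigma - 2)


best = best_parameters(toy_score)
print(metrics, best)
```

Reporting TPR and TNR next to Acc shows whether a high accuracy is shared by both classes or driven by the majority class.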
After conducting the above-mentioned train-test procedure, the value of the Acc index per fold in TSP is shown in Table 7. The Acc result per fold is the maximum value of the Acc variations. For more details of the Acc variations over different values of the learning parameters, the Acc variation of some folds is depicted in Fig. 14. Also, the TPR and TNR related to the obtained Acc per fold are considered for an in-depth analysis of the capacity of the UOPFs set in TSP. The mean values of Acc, TPR, and TNR over the 10 folds are given in the last row of Table 7 as the final report of the efficacy of the UOPFs set in TSP. As can be seen in Table 7, the classification accuracy based on the UOPFs set shows a high-performance capacity (high PA) for TSP (Acc: 98.75 %, TPR: 98.75 %, and TNR: 98.75 %). In addition to the PA analysis, the PT index, including observation time and prediction time, was considered in the performance evaluation. The selected cycles of the UTFs that form the UOPFs set show that recording the UTF cycles takes at most 9 cycles (e.g., the 9th cycle of UTF 23 and UTF 25; for the rest of the UTFs, the observation window takes less than 9 cycles). Hence, the observation window (OW) is equal to 9 cycles (150.3 milliseconds (ms)). Also, the prediction time of SVM GRBF applied to the UOPFs is 2.291 ms. Consequently, the processing time is 152.591 ms (See Table 8), which gives the system operator suitable time conditions to take corrective actions.
[...] features selected by FCBF, ReliefF, and BMHFSS [13]). Also, based on a 9-cycle observation window, PITHS has a higher processing time (152.591 ms) than 4vFSS (mRMR: 68.793 ms, FCBF: 68.930 ms, ReliefF: 68.910 ms, BMHFSS: 52.948 ms), which uses a four-cycle observation window for TSP [13] (See Table 10). However, the processing time of PITHS still leaves the system operator enough time to take corrective actions. For more details on the processing time of PITHS and 4vFSS, refer to Table 8 and Table 10 [13].
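The processing time reported for PITHS can be reproduced with a short calculation. The sketch assumes that one cycle corresponds to the 0.0167 s simulation time step (16.7 ms), which matches the stated 150.3 ms for 9 cycles; the function and constant names are illustrative.

```python
CYCLE_MS = 16.7  # one cycle in milliseconds, from the 0.0167 s time step


def processing_time_ms(
    observed_cycles: int, prediction_ms: float, cycle_ms: float = CYCLE_MS
) -> float:
    """Processing time = observation window + prediction time."""
    observation_ms = observed_cycles * cycle_ms
    return observation_ms + prediction_ms


piths_pt = processing_time_ms(observed_cycles=9, prediction_ms=2.291)
print(round(piths_pt, 3))
```

The observation window dominates the total, so the number of observed cycles is the main lever on processing time, while the classifier contributes only a few milliseconds.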

V. CONCLUSION
This study aimed to achieve high prediction accuracy and low processing time in transient stability assessment based on data mining technology. To satisfy these coupled indices, a horizontally integrated feature selection scheme is introduced for finding discriminative transient point features. We propose the partial-injective trilateral hybrid scheme (PITHS), which includes the nested trilateral filter phase (NTFP) and the nested trilateral wrapper phase (NTWP). In NTFP, an information theory-based analysis by the relevance, interdependence, and redundancy (RIR) criteria, and in NTWP, a supervised learning-based analysis by the nonlinear support vector machine (SVM), are applied to the 28-variate trajectory features. According to the two nested trilateral phases (2NTPs) mounted on the partial-injective scenario, the first univariate trajectory feature (UTF 1) is entered into the NTFP for selecting the filter-OTPFs of UTF 1 (f-1 OTPFs).
Next, the obtained f-1 OTPFs are fed to NTWP for finding filter-wrapper-1 OTPFs (fw-1 OTPFs) as 1 UOPFs. After conducting the first round of PITHS, the new round is triggered by feeding the fw-1 OTPFs and UTF 2 to 2NTPs for finding fw-2 OTPFs ( 2 UOPFs