A Wind Farm Equivalent Method Based on Multi-View Transfer Clustering and Stack Sparse Auto Encoder

A large-scale wind farm (WF) comprises dozens or even hundreds of wind turbines (WTs), which makes it complex and often impractical to model every WT in detail when building a WF model. An equivalent WF model that reasonably reduces the detailed model is therefore essential. In this paper, we propose a WF equivalent method based on multi-view transfer clustering and a stack sparse auto encoder (SSAE), which can be used in the low voltage ride through (LVRT) analysis of a WF. First, to obtain a distinguishable, deep-level and multi-view representation of each WT, the SSAE is used to extract features from the time series of several WT physical quantities, and these features are used as the clustering indicator (CI). Then, a multi-view transfer FCM (MVT-FCM) clustering algorithm, which combines transfer learning with multi-view fuzzy C-means (MV-FCM), is put forward for WTs clustering. Two transfer rules are designed in this algorithm, and the clustering centers and membership degrees in the source domain are transferred to guide the clustering of the target domain samples. Finally, the calculation method for the equivalent parameters is presented. To verify the effectiveness of the proposed method, a modified actual system in East Inner Mongolia, China, is used as a case study, and the performance of the proposed model is compared with several state-of-the-art models. Simulation results show that the equivalent errors of the proposed model are at least 3% lower than those of the other models. In addition, the error fluctuations stay within 6% under different simulation conditions, which illustrates the robustness of the proposed model.

$v_{i,k}$: the $i$th clustering center in the $k$th view, $i = 1, \ldots, C$, $k = 1, \ldots, K$
$d_{ij,k}$: distance between the pattern $x_{j,k}$ and the clustering center $v_{i,k}$ in view $k$
$\mu_{ij,k}$: membership degree of $x_{j,k}$ to $v_{i,k}$
$\alpha_{j,k}$: importance coefficient of $x_{j,k}$ in rewarding membership degree
$\beta_{j,k}$: penalty coefficient used to weaken $\alpha_{j,k}$
$\lambda$: regularization parameter, $\lambda > 0$
$\gamma$: trade-off factor, $\gamma \in (0,1)$
$\hat{\mu}_{ij}$: overall membership

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION
The installed capacity of wind turbines (WTs) in China is projected to reach 2.1 billion kW by 2020 [1]. In a wind farm (WF) system, voltage dips occur frequently at the output port because of short-circuit faults in the power grid, threatening the safe operation of the WF. The dynamic process of a WF under a voltage dip is referred to as low voltage ride through (LVRT), and LVRT studies are essential for developing strategies that prevent WFs from disconnecting from the grid and that improve the stability of power systems. Hence, a dynamic WF model that can simulate the LVRT process accurately is necessary [2]. However, a WF usually contains dozens or even hundreds of WTs, and modeling each WT in detail is not feasible because of the huge model size and simulation time [3]. Thus, it is critical to develop a WF equivalent model for LVRT analysis on the basis of a reasonable reduction [4]. In general, WF equivalent methods fall into two types: single-machine equivalence and multi-machine equivalence (MME). Studies have shown that MME-based models simulate WF characteristics better and have broader application in practice [5]. MME involves three steps.
Step 1: Clustering indicator (CI) selection. The operation characteristics differ among the WTs in a WF, so it is important to select a proper CI to assess these differences. So far, wind speed [3], pitch angle [4], rotor speed [5], and active power [6] have been investigated as CIs. These CIs are taken at a specific time instant; as time passes they may change, so the clustering results may not remain valid over the whole time span. In [7], the time series of active power is selected as the CI, and the clustering result suits simulation over the selected time span, which widens the range of use of the WF equivalent model. This paper also uses time series-based CIs for this reason.
Step 2: WTs clustering. This step groups WTs with the same or similar operation characteristics together and places WTs that operate quite differently in different groups. Fuzzy C-means (FCM) clustering [6], K-means clustering [8], and hierarchical clustering [9] are the commonly used clustering methods.
Step 3: Equivalent parameters calculation. After the clustering in Step 2, the WTs in the WF are divided into several groups. Each group of WTs is represented by one equivalent WT, whose parameters need to be calculated. The equal power loss method [10] and the equal voltage loss method [4] are the most widely used for collector network equivalence, while the capacity weighted method is commonly used to estimate the parameters of the equivalent transformer and equivalent WT [4].
The MME-based WF equivalence is implemented through the above three steps. Although favorable equivalent results have been achieved, existing research still has some limitations: 1) existing time series-based CIs often focus on a single view and fail to reflect the WT operation characteristics comprehensively; 2) the useful information in time series-based CIs is partly hidden and needs to be extracted; 3) the clustering results of traditional clustering methods are unstable and can be inaccurate. These limitations are analyzed below.
Time series-based CIs often focus on a single view. A single-view CI reflects only a few aspects of the WT operation characteristics and fails to give a comprehensive picture. For example, the active power time series is used as the CI in [7]. The active power of some WTs may be similar while their reactive power behaves quite differently, because different WTs can adopt different reactive power controls. With this CI, the equivalent accuracy of the WF active power may be high, but the accuracy of the reactive power cannot be ensured. Therefore, CI selection should cover more views, so that the equivalent accuracy of more state variables in the equivalent WF can be ensured.
In addition, existing studies cluster time series-based CIs directly. However, the useful information in a lengthy time series may exist only in some fragments; clustering algorithms are generally unable to dig out these features and may be disturbed by the uninformative fragments. Extracting the useful information from time series-based CIs, which eases the work of the clustering algorithm, is therefore important.
In the WTs clustering step, the results of traditional clustering methods depend on the initialization, which is random. The clustering results can therefore vary from run to run [11], and it is hard to choose the best one. If several WTs are assigned to the wrong groups, the final equivalent accuracy of the WF is severely affected. A clustering method that yields stable and accurate WTs clustering results is therefore needed.
To address these three problems, this paper proposes an equivalent method for large-scale WFs based on multi-view transfer clustering and a stack sparse auto encoder (SSAE) [12]. The main contributions of this paper are summarized as follows.
1) Time series of active power, reactive power, voltage and current, which together provide multiple views, are used to evaluate the WT operation characteristics. The useful information in these four quantities is extracted using SSAE, an effective tool that extracts features by applying nonlinear transformations to the original signals layer by layer. The multi-view extracted features are used as the CI in this paper. This CI not only evaluates the WT operation characteristics more comprehensively, but is also easier for the clustering algorithm to process because its feature information is more obvious and distinguishable.
2) To make the WTs clustering results more stable and accurate, a new clustering method, the multi-view transfer fuzzy C-means (MVT-FCM) algorithm, is put forward. In this method, the transfer learning [13] mechanism is introduced into multi-view clustering [14]. Specifically, transfer learning migrates beneficial knowledge from the source domain, i.e., the source domain clustering centers and membership degrees, into the target domain to guide the clustering process. This auxiliary guidance significantly enhances the stability and accuracy of WTs clustering.
The paper is organized as follows. Section II presents the multi-view CI, extracted using SSAE, for clustering WTs. Section III proposes MVT-FCM by applying transfer learning to MV-FCM and also gives the method for determining the source domain data. Section IV gives the equivalent parameter calculation method after WTs clustering. Section V provides the case study. Section VI concludes the paper.

II. FEATURE EXTRACTION OF WT BASED ON SSAE
A. DESCRIPTION OF SSAE
SSAE belongs to the field of deep learning, and its basic unit is the sparse auto encoder (SAE). By combining a sparsity penalty on the coding layer with the reconstruction error, the SAE encourages the model to learn features of the data rather than merely copying the input to the output. SAE is generally used for feature extraction, and the extracted features are then used in classification or clustering algorithms. However, SAE cannot obtain deeper features of the data, since it has only one layer [15]. SSAE is therefore built by feeding the hidden layer of one SAE into the input layer of the next. The low-level features of the data are combined successively until an abstract, easily distinguishable high-level feature representation is obtained. The structure of the SSAE is shown in Fig. 1. Due to space limitations, the detailed theory of SSAE is not repeated here and can be found in [15].
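As a rough illustration of the sparsity penalty described above, the following Python sketch computes a commonly used SAE cost: reconstruction error, plus a weight-decay term, plus a Kullback-Leibler penalty that pushes the mean activation of each hidden neuron toward a small target value. This is the textbook formulation and only an assumption about the form used in this paper; the detailed theory is in [15]. The names `rho`, `theta` and `delta`, and the roles assigned to them (sparsity target, weight-decay coefficient, sparsity weight), are likewise assumptions made for illustration.

```python
import math


def kl_sparsity_penalty(rho, rho_hat):
    """Sum of KL(rho || rho_hat_j) over the hidden neurons.

    rho     -- target mean activation (a small value such as 0.01)
    rho_hat -- observed mean activations, each strictly between 0 and 1
    """
    return sum(
        rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
        for r in rho_hat
    )


def sae_cost(recon_error, weight_sq_sum, rho_hat, rho=0.01, theta=1e-4, delta=3.0):
    """Assumed SAE cost: reconstruction + weight decay + sparsity penalty."""
    decay = 0.5 * theta * weight_sq_sum
    return recon_error + decay + delta * kl_sparsity_penalty(rho, rho_hat)
```

The penalty is zero when every hidden neuron already has mean activation `rho` and grows as activations move away from it, which is what discourages the network from simply copying its input.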

B. FEATURE EXTRACTION FROM WT BASED ON SSAE
The overall goal of WF equivalence is to keep the WF output characteristics consistent before and after the equivalence, so a proper CI is important. In this paper, the time series of active power, reactive power, voltage and current are used to evaluate the WT operation characteristics, and SSAE is used to extract features from these four physical quantities. The multi-view extracted features are used as the CI. The method of acquiring the samples used to train the SSAEs and the SSAE construction method are given as follows.
To acquire a large number of samples for training the SSAEs, the important factors that affect the active power, reactive power, voltage and current of a WT are analyzed, as shown in TABLE 1. Each factor in TABLE 1 takes multiple values. The value combinations of the three factors are used as the inputs of the WT, giving 102 × 11 × 5 = 5610 combinations. The time series of the four physical quantities obtained from these 5610 simulation experiments are used as the samples to train the SSAEs.
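The enumeration of experiment conditions can be sketched in a few lines of Python. Only the counts (102 wind speeds, 11 voltage dips, 5 control settings) come from the text; the actual values are in TABLE 1, so the entries below are placeholder labels. The split of the five control settings into two constant voltage and three constant power factor settings is inferred from the counts $M_v$ and $M_{pf}$ given in Section III.

```python
from itertools import product

# Placeholder labels: the real values are listed in TABLE 1.
wind_speeds = [f"ws_{i}" for i in range(102)]
voltage_dips = [f"dip_{i}" for i in range(11)]
# Assumed split: 2 constant-voltage settings, 3 constant-power-factor settings.
control_modes = ["v_0", "v_1", "pf_0", "pf_1", "pf_2"]

# Every combination of the three factors is one LVRT simulation experiment.
experiments = list(product(wind_speeds, voltage_dips, control_modes))

n_voltage = sum(1 for _, _, mode in experiments if mode.startswith("v_"))
n_pf = len(experiments) - n_voltage
print(len(experiments), n_voltage, n_pf)  # 5610 2244 3366
```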
Based on the acquired samples, separate SSAE models are constructed for active power, reactive power, voltage and current. In constructing an SSAE, the number of stacked layers and the number of neurons per layer strongly influence the final feature extraction [17]. In this paper, both are determined by the method shown in Fig. 2: the optimal number of neurons is determined layer by layer, and the optimal number of stacked layers is determined from the error obtained after adding each layer [17].
Specifically, several candidate numbers of neurons are set for the first layer and a training experiment is performed for each. The candidate with the minimum mean absolute percentage error (MAPE), defined in (1), is selected as the number of neurons in the first layer. The number of neurons in the first layer is then fixed, and the MAPE for the second layer is obtained in the same way. Over the first few layers, MAPE generally decreases because the feature extraction becomes more precise; beyond that, MAPE increases because of over-fitting. Therefore, layers are added until MAPE starts to rise, and the number of layers just before the rise is taken as optimal.
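The layer-by-layer procedure above can be sketched as a greedy search. In this minimal Python sketch, `evaluate_mape` is a hypothetical caller-supplied function that would train an SSAE with the given layer widths and return its MAPE; here it is replaced by a toy stand-in (`toy_mape`) so that the example runs on its own. The stopping test (stop as soon as MAPE no longer improves) is one reading of the rule in Fig. 2, not a confirmed detail.

```python
def select_architecture(candidate_sizes, evaluate_mape, max_layers=10):
    """Greedy layer-by-layer search for layer widths and depth.

    evaluate_mape(layers) is a hypothetical function that trains an SSAE
    with the given list of layer widths and returns its MAPE.
    """
    layers, best_mape = [], float("inf")
    for _ in range(max_layers):
        # Keep the already chosen layers fixed and try each width for the next one.
        mape, width = min((evaluate_mape(layers + [n]), n) for n in candidate_sizes)
        if mape >= best_mape:
            break  # MAPE stopped improving: keep the depth reached so far
        layers.append(width)
        best_mape = mape
    return layers, best_mape


def toy_mape(layers):
    """Stand-in for real training: error is lowest at 3 layers of width 100."""
    return abs(len(layers) - 3) + abs(layers[-1] - 100) / 1000


layers, mape = select_architecture([50, 100, 200], toy_mape)
print(layers)  # [100, 100, 100]
```

Because earlier layers are frozen once chosen, the search costs one training run per candidate width per layer rather than one per full architecture.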

III. WTS CLUSTERING BASED ON MULTI-VIEW TRANSFER LEARNING
A. TRADITIONAL MV-FCM
MV-FCM [14] is a multi-view clustering method that rewards or suppresses the fuzzy membership degrees of the current view and combines the clustering results of all views into a global clustering result. Assuming that the data set can be divided into $C$ clusters, the objective function and constraints of MV-FCM are given in (2)-(3). Using Lagrange optimization, the update equations of the clustering center $v_{i,k}$ and the membership $\mu_{ij,k}$ are derived as (4) and (5).
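For orientation, the sketch below implements the standard single-view FCM iteration, which is the building block that multi-view variants extend. It is not the MV-FCM of [14]: the view-specific reward and penalty terms ($\alpha_{j,k}$, $\beta_{j,k}$) are omitted, and the function names and the fuzzifier default `m=2.0` are illustrative assumptions.

```python
import math


def memberships(points, centers, m=2.0):
    """u[i][j]: membership of point j in cluster i (each column sums to 1)."""
    u = [[0.0] * len(points) for _ in centers]
    for j, x in enumerate(points):
        d = [max(math.dist(x, v), 1e-12) for v in centers]  # avoid division by zero
        for i in range(len(centers)):
            u[i][j] = 1.0 / sum((d[i] / dl) ** (2.0 / (m - 1.0)) for dl in d)
    return u


def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by u^m."""
    dim = len(points[0])
    centers = []
    for row in u:
        w = [mu ** m for mu in row]
        centers.append(tuple(
            sum(wj * x[k] for wj, x in zip(w, points)) / sum(w) for k in range(dim)
        ))
    return centers


def fcm(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates from the given initial centers."""
    for _ in range(n_iter):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, u = fcm(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the initial centers passed in, which is the sensitivity to initialization that motivates the transfer mechanism introduced next.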

B. THE PROPOSED CLUSTERING METHOD MVT-FCM
Note that when two variables share the same symbol, the one with an overbar belongs to the source domain and the one without belongs to the target domain; their meanings are otherwise identical.
In recent years, transfer learning has received wide attention. It uses information from the source domain as a reference for the target domain. In clustering analysis, transfer learning can effectively address low clustering accuracy and instability [18]. This paper introduces transfer learning into MV-FCM to improve the stability of the WTs clustering results.
We first transfer the clustering centers of the source domain to the target domain; this first transfer rule is given in (6). In MV-FCM, $\mu_{ij,k}^{m} d_{ij,k}^{2}$ is used to estimate the deviation between the data and the clustering center, with $\mu_{ij,k}$ acting as the weighting factor. To transfer $\bar{\mu}_{ij,k}$ to the target domain, we define the second transfer rule, given in (7). Based on (6)-(7), the optimization objective and constraints of MVT-FCM are given in (8)-(9). The necessary conditions for minimizing the objective function in (8) yield the clustering center and membership update equations (10) and (11). After obtaining $\mu_{ij,k}$, the overall membership can be calculated using (12).

MVT-FCM presupposes suitable source domain data, which are used to guide the clustering of the target domain data. However, not all source domain data play a positive role in the clustering of the target domain data. If the data distributions of the two domains differ, introducing the source domain data worsens the target domain result. In transfer learning, a similarity test is generally used to determine which source domain data can be transferred to the target domain [19].
In the LVRT experiments, wind speed, voltage dip and control method are selected as the main factors affecting the WT output. Therefore, we use these three factors to judge the similarity of the data in the source and target domains, and transfer only the source domain samples that are most similar to the target domain. The comprehensive similarity coefficient $s_{ij}$ is defined in (13). In (13), $M_v = 102 \times 11 \times 2 = 2244$ and $M_{pf} = 102 \times 11 \times 3 = 3366$ are the numbers of LVRT experiments that adopt constant voltage control and constant power factor control, respectively. The smaller the value of $s_{ij}$, the more similar the operation characteristics of the $i$th WT and the $j$th LVRT experiment. For the $i$th WT, the $n_{sim}$ most similar LVRT experiments are selected and denoted $Ex_i$. Finally, the multi-view features corresponding to the LVRT experiments given by (14) are transferred to the target domain.
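The selection step described above amounts to a top-$n_{sim}$ search per WT. The Python sketch below assumes the similarity coefficients have already been computed and stored as a matrix `s`, with `s[i][j]` for the $i$th WT and the $j$th LVRT experiment; the function name and the data layout are illustrative, and the computation of $s_{ij}$ itself in (13) is not reproduced.

```python
def select_source_experiments(s, n_sim):
    """For each WT i, return the indices of the n_sim experiments with the
    smallest similarity coefficient s[i][j] (smaller means more similar)."""
    return [
        sorted(range(len(row)), key=row.__getitem__)[:n_sim]
        for row in s
    ]


# Toy example: 2 WTs, 4 candidate experiments.
s = [
    [0.30, 0.10, 0.90, 0.20],
    [0.50, 0.60, 0.05, 0.70],
]
print(select_source_experiments(s, n_sim=2))  # [[1, 3], [2, 0]]
```

With 61 WTs and $n_{sim} = 20$ this yields 61 lists of 20 experiment indices each.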

D. WTs CLUSTERING AFTER INTRODUCING SOURCE DOMAIN DATA
In this paper, SSAE is used to extract features of the active power, reactive power, voltage and current of each WT, and these features are input to MVT-FCM to cluster the WTs. The WTs clustering process after introducing source domain data is shown in Fig. 3.
(1) According to the wind speed, voltage dip value and control mode of each WT, the active power, reactive power, voltage and current of each WT are obtained through simulation.
(2) The active power, reactive power, voltage and current time series of each WT are input into the trained SSAEs, and the features of the four physical quantities are obtained; these are used as the target domain data for WTs clustering.
(3) Select an appropriate $n_{sim}$, and use (13) and (14) to obtain the LVRT experiments for data transfer. The active power, reactive power, voltage and current in the selected experiments are input into the trained SSAEs, and the resulting features are used as the source domain data for WTs clustering.
(4) For MVT-FCM, set the algorithm parameters, and input the source domain and the target domain data. Finally, the clustering result based on the proposed MVT-FCM algorithm is obtained.

IV. CALCULATION OF EQUIVALENT PARAMETERS AFTER WTs CLUSTERING
After WTs clustering using the MVT-FCM algorithm, the topology of the equivalent WF is shown in Fig. 4. The parameters of the equivalent WT, equivalent transformer and equivalent collector network need to be calculated.
For the equivalent WT, the equivalence of wind speed, WT capacity, stator and rotor winding reactance and resistance, shafting inertia time constant, shafting stiffness coefficient and shafting damping coefficient can be found in [20]. For the equivalent transformer, the equivalence of capacity and reactance can be found in [20]. For the equivalent collector network, the equivalence of resistance, reactance and capacitance can be found in [10].
In addition, constant power factor control and constant voltage control are adopted in this paper. The control mode of each equivalent WT is determined as follows: 1) Calculate the proportion of WTs under constant power factor control among all WTs, denoted $p_{pf}$. Then $N_{pf} = [p_{pf} \times C]$ equivalent WTs take constant power factor control, where $C$ is the number of equivalent WTs and $[\cdot]$ denotes rounding to the nearest integer. 2) Calculate the proportion of WTs under constant power factor control within each equivalent WT, denoted $p_{pf,i}$. The $N_{pf}$ equivalent WTs with the largest $p_{pf,i}$ take constant power factor control, and the other equivalent WTs take constant voltage control.
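The two-step rule for the control mode can be written compactly. The following Python sketch is one possible reading of it: the function and argument names are illustrative, rounding is taken as round-half-up, and ties between equivalent WTs with equal $p_{pf,i}$ are broken by cluster index; the last two choices are assumptions, since the text does not specify them.

```python
import math


def assign_control_modes(uses_pf, labels, n_clusters):
    """Return the control mode ("pf" or "v") of each equivalent WT.

    uses_pf[j] -- True if WT j uses constant power factor control
    labels[j]  -- index of the equivalent WT (cluster) that WT j belongs to
    """
    # Step 1: overall share of PF-controlled WTs fixes how many clusters get PF.
    p_pf = sum(uses_pf) / len(uses_pf)
    n_pf = math.floor(p_pf * n_clusters + 0.5)  # round half up (assumed)

    # Step 2: rank clusters by their own share of PF-controlled WTs.
    share = []
    for c in range(n_clusters):
        members = [pf for pf, lab in zip(uses_pf, labels) if lab == c]
        share.append(sum(members) / len(members) if members else 0.0)
    ranked = sorted(range(n_clusters), key=lambda c: share[c], reverse=True)

    pf_clusters = set(ranked[:n_pf])
    return ["pf" if c in pf_clusters else "v" for c in range(n_clusters)]


# Toy example: 6 WTs in 3 clusters, half of them PF-controlled.
modes = assign_control_modes(
    uses_pf=[True, True, True, False, False, False],
    labels=[0, 0, 1, 1, 2, 2],
    n_clusters=3,
)
print(modes)  # ['pf', 'pf', 'v']
```

Explicit half-up rounding is used because Python's built-in `round()` rounds halves to the nearest even integer, which would give a different $N_{pf}$ in tie cases such as 2.5.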

V. CASE STUDY
A. THE CONSTRUCTIONS OF SSAEs
To obtain the extracted features used as the multi-view CI, the SSAEs need to be well trained. Using the wind speeds, voltage dips and control modes shown in TABLE 1, a total of 5,610 LVRT experiments are performed to acquire the active power, reactive power, voltage and current time series, which are used as the training samples of the SSAEs. In these LVRT experiments, a three-phase short-circuit fault occurs at the output port of the WT at 90 s and lasts 0.15 s. Fig. 5 shows the active power curves of the WT under various experimental conditions. In training, the time series of active power, reactive power, voltage and current from 89.9 s to 90.5 s are used as the inputs of the respective SSAEs. With $\rho = 0.01$, $\theta = 0.0001$ and $\delta = 3$, Fig. 6 shows $I_{MAPE}$ for different numbers of stacked layers and neurons. The number of stacked layers $d$ and the number of neurons per layer $s_i$ ($i = 1, \ldots, d$) are obtained using the method in Fig. 2. Taking active power as an example, Fig. 6 (a) shows that $I_{MAPE}$ decreases as the number of stacked layers increases to four but increases when it reaches five. Therefore, the active power SSAE uses four stacked layers. In addition, because the dimension of the data input to MVT-FCM should be the same in every view, the number of neurons in the last layer must be the same for the SSAEs of all four physical quantities. The resulting numbers of stacked layers and neurons per layer for the SSAEs of active power, reactive power, voltage and current are shown in TABLE 2.

B. THE TEST WF AND ITS EQUIVALENT RESULT
The simulation is based on an actual system in East Inner Mongolia, China, with a certain degree of expansion. The layout of the test case is shown in Fig. 7. In this system, the WF consists of 61 WTs; all WTs are connected to the collector lines through 0.69/35 kV box transformers and to the external system through a 35/220 kV transformer. The main parameters of the system are shown in TABLE 3.
Constant voltage control and constant power factor control are randomly assigned to the WTs in the WF at a ratio of 1:2. A three-phase short-circuit fault is set at the PCC with a voltage dip value of 0.2 pu. When the fault occurs, the wind speed of each WT is set randomly between 10 m/s and 20 m/s. The wind speed, voltage dip and control mode of each WT in the WF are shown in Fig. 8, and the time series of active power, reactive power, voltage and current of each WT are simulated.
The time series of these four quantities are then input to the respective well-trained SSAEs, and the extracted features are shown in Fig. 9. In Fig. 9 (a)-(d), each curve represents the extracted features of one WT in one view; the feature dimension is 500, so each curve is a 500-dimensional vector. The multi-view extracted features in Fig. 9 are used as the target domain samples in MVT-FCM for WTs clustering.
With $n_{sim} = 20$, the LVRT experiments most similar to the WTs in the WF are obtained using (13) and (14), giving 20 × 61 = 1220 WT experiments in total. The multi-view extracted features corresponding to these 1220 experiments are used as the source domain samples in MVT-FCM for WTs clustering.
Using the above target domain and source domain samples, the WTs in the WF can be clustered with MVT-FCM. According to the transfer rules in (6) and (7), the source domain samples themselves are not transferred to the target domain; instead, the clustering center $\bar{v}_{i,k}$ and the membership degree $\bar{\mu}_{ij,k}$ are transferred. Thus, $\bar{v}_{i,k}$ and $\bar{\mu}_{ij,k}$ are first calculated from the source domain samples using the traditional MV-FCM. The parameter settings of MV-FCM are shown in TABLE 4, and $\bar{v}_{i,k}$ and $\bar{\mu}_{ij,k}$ are computed iteratively using (4) and (5).
After $\bar{v}_{i,k}$ and $\bar{\mu}_{ij,k}$ of the source domain are obtained, they are transferred and used to guide the clustering of the target domain samples. The parameter settings of MVT-FCM are also shown in TABLE 4. Based on the target domain samples and the transferred $\bar{v}_{i,k}$ and $\bar{\mu}_{ij,k}$, $v_{i,k}$ and $\mu_{ij,k}$ are computed iteratively using (10) and (11). Finally, $\hat{\mu}_{ij}$ is calculated using (12), and the result is shown in Fig. 10, where a darker color indicates a larger value. The largest value in each column is circled, and the WTs circled in the same row are clustered into one group. The resulting WTs clustering is shown in TABLE 5. After WTs clustering, the equivalent parameters are calculated. As for the control mode, Weq1 takes constant voltage control and Weq2-Weq4 take constant power factor control. The other parameters can be calculated using the methods in [10] and [20] and are not listed here.
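Reading the groups off the overall membership matrix, as done with the circled entries of Fig. 10, is a column-wise argmax. A minimal Python sketch follows; it assumes the matrix is stored as a list of rows with one row per cluster and one column per WT, and the function name is illustrative.

```python
def groups_from_membership(u):
    """u[i][j]: overall membership of WT j in cluster i.
    Each WT goes to the cluster with its largest membership (column argmax)."""
    groups = [[] for _ in u]
    for j in range(len(u[0])):
        best = max(range(len(u)), key=lambda i: u[i][j])
        groups[best].append(j)
    return groups


# Toy example: 2 clusters, 4 WTs.
u_hat = [
    [0.8, 0.3, 0.6, 0.1],
    [0.2, 0.7, 0.4, 0.9],
]
print(groups_from_membership(u_hat))  # [[0, 2], [1, 3]]
```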

C. SUPERIORITY OF THE PROPOSED WF EQUIVALENT METHOD
In this paper, the features of the time series are extracted using SSAE. In addition, MVT-FCM is proposed to increase the stability of the clustering results and to provide more views, so that the equivalent accuracy is ensured in more aspects. To illustrate the superiority of the proposed method, the proposed model (M0) is compared with several MME models (M1 to M5) that we designed for this purpose. These models are described in TABLE 6. Among them, M1 is used to illustrate the benefit of feature extraction, M2-M3 the advantages of applying transfer learning in the clustering process, and M4-M5 the benefits of the multi-view consideration in CI selection.
To verify the accuracy of the proposed WF equivalent model statistically, the root mean square error ($e_{RMSE}$), mean absolute error ($e_{MAE}$) and mean absolute percentage error ($e_{MAPE}$), which are widely used to evaluate time series errors, are introduced. Their expressions are given in (15). The $e_{RMSE}$, $e_{MAE}$ and $e_{MAPE}$ of M0-M5 are shown in TABLE 7. Comparing these values across M0-M5 shows that M0 is more accurate than the other five models. The reasons are as follows.
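For reference, the three error measures can be computed as in the Python sketch below. It uses the standard textbook definitions; any normalization specific to (15) is not reproduced here, so the exact scaling is an assumption. `ref` stands for the output of the detailed WF model and `est` for that of the equivalent model; both names are illustrative.

```python
import math


def rmse(ref, est):
    """Root mean square error between two equally long series."""
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(ref, est)) / len(ref))


def mae(ref, est):
    """Mean absolute error."""
    return sum(abs(e - r) for r, e in zip(ref, est)) / len(ref)


def mape(ref, est):
    """Mean absolute percentage error, in percent; ref must contain no zeros."""
    return 100.0 * sum(abs((e - r) / r) for r, e in zip(ref, est)) / len(ref)


ref = [1.0, 2.0, 4.0]
est = [1.0, 1.0, 5.0]
print(mape(ref, est))  # 25.0
```

MAPE is undefined where the reference signal is zero, which matters for quantities that cross zero during a fault.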
First, comparing M0 with M1: with the help of SSAE in M0, the clustering algorithm can identify data distribution characteristics that are otherwise hard to find. M1 has no feature extraction, and the data features may be reflected only in some fragments of the lengthy time series, which degrades the clustering accuracy.
Next, comparing M0 with M2 and M3: although feature extraction is applied in both M0 and M2, M2 does not use transfer learning in the clustering process. Only the target domain samples are clustered in M2, without reference to source domain knowledge, so samples without obvious distribution characteristics may be assigned to the wrong groups. M3 uses neither feature extraction nor transfer learning, so its equivalent accuracy is even worse than that of M2.
Finally, comparing M0 with M4 and M5: M4 uses only single-view extracted features, corresponding to active power, as the CI, so its active power accuracy is quite high but the accuracy of the other physical quantities cannot be ensured. This illustrates the necessity of the multi-view consideration in CI selection. M5 uses the active power time series directly as the CI, whose features are hard for the clustering algorithm to find, so its accuracy is lower than that of the other models.

D. PERFORMANCE OF THE PROPOSED CLUSTERING ALGORITHM
The superiority of the presented WF equivalent method was discussed qualitatively in Section V-C. In particular, the proposed clustering algorithm MVT-FCM helps to increase the equivalent accuracy through the introduction of transfer learning. In this part, the advantages of MVT-FCM are analyzed quantitatively by comparison with other state-of-the-art clustering algorithms. In addition to the MV-FCM and FCM used in M2-M5, several recent multi-view clustering algorithms, namely multi-view fuzzy clustering (MV-FC) [21], multi-view fuzzy k-means (MV-FKM) [22] and multi-view expectation maximization (MV-EM) [23], are compared with MVT-FCM. In MV-FC, a local inertia term is defined by the inertia of the fuzzy clusters in a given view and is penalized by a disagreement with the other views. MV-FKM uses a penalty term in its objective function that aims to reduce the disagreement between organizations on different views. In MV-EM, the objective function combines the local log-likelihoods from all views with a disagreement term.
The test datasets comprise two parts, both acquired in Section V-B: the multi-view extracted features of the source and target domains, and the multi-view time series of the source and target domains. Not all samples in a dataset are fed to every algorithm, because the algorithms work on different principles. MVT-FCM uses transfer learning, so the multi-view samples of both domains are fed to it. MV-FCM, MV-FC, MV-FKM and MV-EM do not use transfer learning, so only the multi-view samples in the target domain are fed to them. FCM is a single-view clustering algorithm without transfer learning, so only the single-view samples (we use active power) in the target domain are fed to it.
Three popular validity indices, DVI, SI and DBI [24], are adopted to verify the clustering performance of these algorithms. Larger values of DVI and SI indicate better clustering performance, whereas smaller values of DBI are preferred. A smaller standard deviation indicates more stable clustering. The experimental results are reported as means and standard deviations over 100 repeated runs. The clustering performance of the different algorithms is shown in TABLE 8. The mean values of DVI and SI for MVT-FCM are clearly larger than those of the other clustering algorithms, and its DBI values are smaller, which indicates that MVT-FCM performs better. The reason is that MVT-FCM introduces knowledge from the source domain through transfer learning, and the transferred knowledge compensates for the indistinct data distribution of the target samples.
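To make the "larger is better" reading of a validity index concrete, the sketch below computes a Dunn-type index: the smallest distance between points of different clusters divided by the largest cluster diameter. It is assumed here that DVI denotes the Dunn validity index and that this basic variant applies; the exact definitions used in this paper are those of [24]. The function name and data layout are illustrative.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """clusters: list of clusters, each a list of coordinate tuples.
    Requires at least two clusters and one cluster with two or more points."""
    # Smallest distance between any two points in different clusters.
    min_between = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between any two points in the same cluster.
    max_diameter = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_between / max_diameter


compact = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
print(dunn_index(compact))  # 10.0
```

Compact, well-separated groups give a large value, while a partition that mixes the two groups gives a value below 1.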
In addition, the standard deviations of DVI, SI and DBI for MVT-FCM are 0, which shows that the same clustering result is obtained in all 100 repeated runs. In contrast, the standard deviations of the other algorithms are larger than 0, which means that different clustering results are obtained. This is because algorithms without transfer learning are sensitive to the initialization. Thus, applying the transfer learning mechanism helps to increase the stability of the clustering results.
Comparing the clustering performance on the two test datasets shows that performance is better when the extracted features are used as the clustered samples. Taking MVT-FCM as an example, DVI and SI are larger and DBI is smaller when clustering the extracted features. The reason is that the SSAEs have fully mined the features of the time series, and these features are easily recognized by clustering algorithms. In the raw time series, the features may be reflected only in some fragments, and it is difficult for the clustering algorithms to mine them. This indicates the positive effect of feature extraction on WTs clustering.
The proposed MVT-FCM introduces transfer learning into its clustering process, which increases the algorithm complexity compared with the clustering algorithms in TABLE 8. The complexity of MVT-FCM is therefore analyzed. Algorithm complexity can generally be divided into time complexity and space complexity. Here the run time of the algorithm is used as a measure of time complexity, and the maximum memory usage as a measure of space complexity. Note that the clustering center matrix and membership degree matrix of the source domain, which are transferred to the target domain, are computed beforehand using the traditional MV-FCM. Therefore, only the calculation that uses the target domain samples and these two matrices is taken into account when assessing the complexity of MVT-FCM.
The above two test datasets are also used to assess the algorithm complexity. The computer configuration is an Intel(R) Core(TM) i5-8250 CPU @ 1.60 GHz with 8.00 GB of RAM. For each dataset, each algorithm is run 100 times and the mean results are calculated. TABLE 9 shows the complexity assessment results of the proposed method and the compared methods. The proposed clustering algorithm is slightly more time-consuming than the other multi-view clustering algorithms; for example, the run time of MVT-FCM is about 36% longer than that of MV-FCM. This is because introducing knowledge from the source domain increases the complexity of the objective function of MVT-FCM. The increase in run time is acceptable given the better clustering accuracy and stability obtained. In terms of space complexity, the maximum memory usage of MVT-FCM increases only slightly. This is because only the source domain knowledge (the source domain clustering centers and membership degrees), not the source domain samples, is transferred to the target domain, so the amount of data that MVT-FCM must process increases only slightly. This also indicates that the proposed MVT-FCM is able to handle WTs clustering for larger WFs.
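Run time and peak memory of a clustering routine can be measured along the following lines. This Python sketch is only one way to do it and is not the measurement setup used for TABLE 9, which the text does not describe: the helper name `profile` is hypothetical, and `tracemalloc` counts only memory allocated through Python and adds overhead to the timed run.

```python
import time
import tracemalloc


def profile(fn, *args, repeats=100):
    """Run fn(*args) several times; return (mean run time in s, peak bytes)."""
    times, peaks = [], []
    for _ in range(repeats):
        tracemalloc.start()
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        peaks.append(peak)
    return sum(times) / repeats, max(peaks)


# Stand-in workload; a clustering function would be passed in the same way.
mean_time, peak_bytes = profile(sorted, list(range(1000)), repeats=5)
```

Timing and memory tracing are combined here for brevity; for tighter timing figures the two measurements would be taken in separate passes.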

E. COMPARISON WITH OTHER WF EQUIVALENT METHODS
In addition to comparing the proposed model M0 with our designed contrast models M1-M5, the equivalent results are also compared with several recent WF equivalent approaches from the literature [7], [25], [26]. In [7], the researchers proposed a WF equivalent model that takes the consistency of the WF output over a selected time span as the objective. In [25], a dynamic voltage equivalent method for WF was proposed, in which the randomness and fuzziness of WTs are used to evaluate the similarity among WTs based on a cloud model. In [26], the WF equivalent model was established from the viewpoint of energy conservation.
To facilitate the following analysis, the CIs and clustering algorithms used by the models in [7], [25], [26] are listed in TABLE 10. The WF system shown in Fig. 5 is again used as the test system, together with the experimental conditions in Section V-B. The clustering algorithm in each compared model is run 100 times, and one result is randomly selected from these runs for each model.
The $e_{\mathrm{RMSE}}$, $e_{\mathrm{MAE}}$ and $e_{\mathrm{MAPE}}$ values of the compared models are shown in TABLE 11, and their DVI, SI and DBI values are shown in TABLE 12. The error results and the clustering performance of the proposed model are shown in TABLE 7 and TABLE 8, respectively. Comparing the values of $e_{\mathrm{RMSE}}$, $e_{\mathrm{MAE}}$ and $e_{\mathrm{MAPE}}$, the proposed model is as accurate as, or more accurate than, the three compared models. The values of DVI, SI and DBI also show an advantage over the other models. The analysis is as follows. The compared models all use a single-view CI for WT clustering, so each is most accurate for the physical quantity corresponding to its CI. Specifically, the model in [7] uses active power as the CI, so its active power accuracy is higher than its accuracy for other physical quantities. Similarly, the model in [25] achieves higher voltage accuracy because the dynamic voltage is selected as its CI. The model in [26] clusters the WTs using their rotating speed, which is a physical quantity related to active power, so this model performs better in the equivalence of active power. In contrast, the proposed model uses a multi-view CI, so its equivalent accuracy is ensured in more aspects.
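As a reminder of what the three error indices measure, the sketch below computes them with the standard textbook definitions of RMSE, MAE and MAPE. Treating these as the definitions used for TABLE 7 and TABLE 11 is an assumption, and the function names and sample values are hypothetical; `detailed` stands for the response of the detailed WF model and `equivalent` for the response of the equivalent model.

```python
import math


def rmse(detailed, equivalent):
    """Root mean square error between two equally long response sequences."""
    n = len(detailed)
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(detailed, equivalent)) / n)


def mae(detailed, equivalent):
    """Mean absolute error."""
    n = len(detailed)
    return sum(abs(q - p) for p, q in zip(detailed, equivalent)) / n


def mape(detailed, equivalent):
    """Mean absolute percentage error, as a fraction; assumes no zero reference values."""
    n = len(detailed)
    return sum(abs((q - p) / p) for p, q in zip(detailed, equivalent)) / n


# Hypothetical sample values, not taken from the paper
detailed = [1.0, 2.0, 4.0]
equivalent = [1.1, 1.8, 4.4]
print(rmse(detailed, equivalent), mae(detailed, equivalent), mape(detailed, equivalent))
```

Because MAPE normalizes each error by the reference value, it is sensitive to samples where the detailed-model response is close to zero, which is one reason to report all three indices together.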
From TABLE 12, we can see that the standard deviations of DVI, SI and DBI are greater than zero, which means that the clustering algorithms produce different results across runs. This is because no transfer learning is introduced in these clustering algorithms, so their clustering results are easily influenced by the random initialization. This again verifies that applying transfer learning in MVT-FCM helps to increase the stability of the WT clustering results.

F. THE ROBUSTNESS ANALYSIS
In this paper, robustness refers to the equivalent accuracy of the proposed model under different LVRT scenarios. To evaluate it, the ability of the model to represent the dynamic responses under different scenarios is tested. There are 21 scenarios with different wind speeds, voltage dips and control modes for the WTs in the WF, and these scenarios are shown in Fig. 11. In Fig. 11, the x-axis denotes the WT number, and the y-axis denotes the scenario number. Taking the 1st scenario as an example, the voltage dip value is 0.2 pu; WT1-WT41 use constant power factor control and WT42-WT61 use constant voltage control; and the wind speeds of the WTs are given by the 1st row of the figure.
The $e_{\mathrm{RMSE}}$, $e_{\mathrm{MAE}}$ and $e_{\mathrm{MAPE}}$ values in the different LVRT scenarios are shown in Fig. 12. Taking Fig. 12 (a) as an example, the expectation and median of $e_{\mathrm{RMSE}}$ are $7.18 \times 10^{-2}$ and $7.11 \times 10^{-2}$, respectively. Sorting the error sequence in ascending order and taking the median $e_{\mathrm{RMSE}}$ as the center, the 50% of errors around this center lie in the range $[6.95 \times 10^{-2}, 7.41 \times 10^{-2}]$, and the 82% of errors around this center lie in $[6.86 \times 10^{-2}, 7.52 \times 10^{-2}]$. The minimum error is $6.81 \times 10^{-2}$, and the maximum error is $7.62 \times 10^{-2}$. Analyzing the other errors in the same way, we find that the errors are small and that their fluctuation is less than 6% of their expectation. In other words, the errors remain stable when the LVRT scenario changes. Thus, the proposed model is robust against various LVRT circumstances. The model is suitable for the dynamic response analysis of WF and has high practical value.
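The statistics used above (expectation, median, and the range covered by a central share of the sorted errors) can be sketched as follows. This is one possible reading of the procedure, not the authors' code: `central_range` and `max_relative_deviation` are hypothetical helpers, trimming an equal number of values from each tail is an assumption, and the error values are made up rather than read from Fig. 12. The exact definition of "fluctuation" is not given in this section, so the largest relative deviation from the mean is used here only as an illustration.

```python
import statistics


def central_range(errors, fraction):
    """Range spanned by roughly `fraction` of the sorted errors, centered on the median."""
    ordered = sorted(errors)
    # Assumption: drop the same number of values from the low and high tails
    trim = round((1 - fraction) / 2 * len(ordered))
    kept = ordered[trim:len(ordered) - trim]
    return kept[0], kept[-1]


def max_relative_deviation(errors):
    """Largest absolute deviation from the mean, relative to the mean."""
    mean = statistics.mean(errors)
    return max(abs(e - mean) for e in errors) / mean


# Hypothetical errors for 21 scenarios (not the values behind Fig. 12)
errors = [0.068 + 0.0004 * i for i in range(21)]

print("expectation:", statistics.mean(errors))
print("median:", statistics.median(errors))
print("central 50% range:", central_range(errors, 0.50))
print("central 82% range:", central_range(errors, 0.82))
print("max relative deviation:", max_relative_deviation(errors))
```

With 21 scenarios, this trimming rule keeps 11 values for the 50% band and 17 values for the 82% band, so the nominal percentages are only approximate for such a small sample.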

VI. CONCLUSION
An MVT-FCM and SSAE based WF equivalent method is proposed in this paper. Based on the simulation study of the actual system, several conclusions can be drawn:
(1) Using the SSAE-extracted features as the multi-view CI and the proposed MVT-FCM for WT clustering, the equivalent errors of the proposed model decrease by at least 3% compared with other state-of-the-art models.
(2) With MVT-FCM, the knowledge of the source domain is transferred to guide the clustering process of the target domain samples. Introducing transfer learning helps to increase the clustering stability of MVT-FCM. The tests show that the clustering results are unique and are not influenced by initialization.
(3) The equivalent error fluctuations are within 6% under different LVRT scenarios. Thus, the proposed WF equivalent model is applicable under different LVRT circumstances and is highly robust in the dynamic response analysis of WF.
WEICHEN YANG was born in Henan, China, in 1995. He received the B.S. degree in electrical engineering from North China Electric Power University, Baoding, China, in 2017. He is currently pursuing the Ph.D. degree with the Huazhong University of Science and Technology (HUST).
His research interests include high-voltage direct current (HVDC) technology and renewable energy technology.
HAORAN YIN was born in Hebei, China, in 1997. He received the B.S. degree in electrical engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2019, where he is currently pursuing the M.S. degree.
His research interests include application of machine learning in power systems and fault diagnosis of power systems.