Static Grid Equivalent Models based on Artificial Neural Networks

Power systems are rapidly and significantly changing due to the increasing penetration of distributed energy resources (DERs) and the rapid growth of widespread grid interconnections. An increasing number of grid operators is thus interested in the reduced equivalent representation of a large, interconnected power system to reduce the amount of required computational resources and data exchange, e.g., between grid operators. However, state-of-the-art grid equivalents become more and more inapplicable since they are analytically calculated for one specific grid state. They cannot properly be adapted to grid state changes and the behavior of the increasingly used controllers, such as reactive power controllers of DERs. Therefore, we propose an innovative grid equivalent based on artificial neural networks (ANN) which overcomes the drawbacks of the state-of-the-art grid equivalents as follows: 1) Using supervised ANNs with feedforward and recurrent architectures, power systems can be equivalently represented adaptively and thus more accurately. 2) A feature selection method identifies the elements in the grid with high sensitivity on the boundary enabling a reduction of grid data required for the ANN-based equivalent. 3) To guarantee data confidentiality and cybersecurity, an additional unsupervised ANN, an Autoencoder, is used for obfuscation of the data which is required for the proposed grid equivalent to be exchanged among grid operators, while the relevant information of the original data is preserved, maintaining the estimation accuracy. Our ANN-based approach is analyzed and evaluated with two German benchmark grids and representative scenarios. The simulation results demonstrate that the proposed ANN-based grid equivalent outperforms the state-of-the-art radial equivalent independent method.


A. MOTIVATION
In 2014, the European Commission set a new target of 15% electricity interconnection by 2030, i.e., all Member States should achieve a level of electricity interconnection (electricity transport across borders to neighboring countries) of at least 15% of their installed generation capacity [1]. Power systems worldwide increase in size and complexity due to the increasing penetration of distributed energy resources and the rapid growth of widespread interconnections. Grid operators have to cooperate closely at their borders with neighbors on the same or on other voltage levels. Due to the interconnection, there are interdependencies between neighboring grid operators that need to be considered. Special attention has to be given to the modeling and simulation of a large and interconnected power grid. A standard solution is to use an equivalent model in place of a complete detailed model of the neighboring grids. There can be various reasons for using an equivalent model. The main reasons are the following:
1. Practical limitations on the computational resources for power market behavioral analysis, grid monitoring, operational and planning studies of large power systems [2] [3] [4] [5].
2. As the electrical distance from the point of interest increases, the required detail of modeling of the remote location is lower [3] [6].
3. An increasing number of grid operators are interested in cross-network cooperation, such as horizontal cooperation (among transmission system operators (TSOs) and distribution system operators (DSOs)) and vertical cooperation (TSO-DSO) [7] [8]. The involved grid operators are usually unwilling to share their grid data for confidentiality and security reasons [1] [4] [9].
Therefore, obtaining an equivalent of the part of the power system that is analyzed is of great importance. According to the representation of the model and its intended use, grid equivalent techniques can be broadly classified into static and dynamic [10]. In this paper, the term "grid equivalent" only refers to the static equivalent that is used for the analysis of quasi-steady states, such as grid operation, planning, and market-oriented studies. Fig. 1 shows the general case of an interconnected power system, which is divided into the internal subsystem (IS) and the external subsystem (ES) (IS and ES could belong to the same or different grid operators). The former, in which engineers are interested, remains unmodified, and a simple grid equivalent model represents the latter by using grid reduction methods. The reduced interconnected power system should represent the original system as accurately as possible.

B. LITERATURE REVIEW ON STATIC EQUIVALENTS
Many different grid equivalent methods have been developed over the last decades. The most classical grid equivalents are the Ward equivalent and the REI equivalent.
The Ward equivalent was initially proposed in [11] and then further discussed in [12][13] [14]. It disregards all of the buses in the ES and models the ES as a set of equivalent impedance, shunts, and power injections attached to the boundary buses via the Gaussian elimination. To approximately consider the reactive power response of the ES to the IS, the extended Ward equivalent was proposed where a fictitious PV bus is added without power injection at each boundary bus (see Fig. 2 a) and c)). The significant limitation of the Ward equivalent is that the operating points of the assets in the ES cannot be simply adapted when they change [11] [15]. Therefore, attempts have been made to minimize load flow errors caused by operating point change; one of them is called radial equivalent independent method (REI). The idea behind the REI equivalent is to shift the power injection in the ES to one or more fictitious REI buses [16]. Afterward, the passive external grid area resulting from the power shift can be reduced by the Gaussian elimination, and the original power injection is substituted by equivalent power injections at the REI buses attached to the boundary buses (see Fig. 2 a) and b)). An adaptation to other operating points is possible without repeating the equivalent calculation process, with the help of a simple scaling of the operating points of the equivalent devices [10].
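The network part of the Ward reduction can be illustrated with a minimal Python sketch. It assumes that the bus admittance matrix is given as a nested list and eliminates the external buses one at a time by Gaussian elimination. The function names and the three-bus example are hypothetical, and the equivalent power injections at the boundary buses are not covered.

```python
def eliminate_bus(Y, k):
    """Eliminate bus k from the admittance matrix Y by one Gaussian elimination step."""
    keep = [i for i in range(len(Y)) if i != k]
    return [[Y[i][j] - Y[i][k] * Y[k][j] / Y[k][k] for j in keep] for i in keep]


def reduce_external(Y, external_buses):
    """Eliminate all external buses; the remaining buses keep their relative order."""
    # Eliminating from the highest index down keeps the lower indices valid.
    for k in sorted(external_buses, reverse=True):
        Y = eliminate_bus(Y, k)
    return Y


# Hypothetical example: boundary buses 0 and 1, external bus 2.
# Two lines (0-2 and 1-2), each with a series admittance of -10j p.u.
y = -10j
Y_full = [
    [y, 0j, -y],
    [0j, y, -y],
    [-y, -y, 2 * y],
]
Y_eq = reduce_external(Y_full, external_buses=[2])
print(Y_eq)
```

In this example, the two series lines collapse into a single equivalent branch with an admittance of -5j p.u. between the two boundary buses.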
In addition, several researchers have used bus aggregation techniques to reduce large power networks. E.g., approximate equivalent networks have been generated based on power transfer distribution factor (PTDF) matrices [17] [18], where buses with a similar contribution to a designated internal subsystem are grouped in zones. Each of these zones is aggregated to a single bus. Another bus aggregation approach is based on the local marginal prices (LMPs) [19] [20]. Clusters of buses with almost identical LMPs are reduced using the REI equivalent. The application of this kind of equivalents is intended mainly for transmission cost allocations and electricity markets.
An obtained equivalent grid based on the methods mentioned above is generally only accurate for a specific grid state. Larger deviations from this grid state lead to less precision [15] [21]. In the intelligent grid paradigm, the switching statuses can be changed for congestion management [22], and different controllers, e.g., Q(V) control for distributed energy resources (DERs), are widely used and play a key role in improving the grid stability by adjusting operating points. These complicated interactions are difficult to be accurately represented and render the use of grid equivalents with acceptable accuracy more challenging. Also, the strong fluctuations of DERs lead to frequent grid state changes. The fundamental solution to this problem is to repeat calculating the grid equivalent after each grid state change. However, frequently recalculating the grid equivalent is usually not applicable in practice due to the high requirement of grid data, the high computational complexity, and the fact that this would require an increased data exchange, e.g., between adjacent grid operators [15].

C. CONTRIBUTION AND ORGANIZATION
Inspired by the application of an artificial neural network (ANN) for state estimation in [23], an innovative adaptive grid equivalent method based on ANN with feedforward architecture has been proposed in our previous works [15] [24]. The method utilizes ANN to approximate a power grid through extensive training without using the original grid model. After the ANN model has been trained to sufficient accuracy, it yields as outputs the power flows between the IS and the ES at the boundary buses at a very low computational complexity and a high accuracy (see Fig. 2 a) and d)). Input to the ANN is the current grid parameter vector containing, e.g., the operating points of the loads and the DERs and the switching states. However, our previous approach has some limitations: 1) A high amount of grid data was required to train the ANN, and some of the grid data had only a small impact on the accuracy; 2) Considering grid data confidentiality and cybersecurity, exchanging grid data may not be acceptable when different grid operators are involved; 3) Only the feedforward architecture was used, although other kinds of ANN architectures have the potential to increase the estimation accuracy; 4) Lack of automation and generalization. We further develop this approach in this paper by implementing the following improvements and extensions.
1. A component feature selection is implemented such that grid elements (and the corresponding grid data) with a high impact on the grid equivalent can be identified. By considering only the identified (featured) grid elements, the amount of grid data required to train the ANN can be reduced.
2. An ANN-based data obfuscation (Autoencoder) is proposed to obfuscate the grid data for confidentiality reasons. The obfuscated data rather than the original data is then used as input for the subsequent grid equivalent ANN. In the data obfuscation process, the data properties required for the grid equivalent ANN to estimate the interactions at the boundary buses are maintained, while the original data cannot be reconstructed from the obfuscated one without knowledge of the data obfuscation ANN.
3. An ANN with recurrent architecture is considered for the grid equivalent. Its feedback connections are more appropriate in power grids where feedback is involved, e.g., in DERs with local voltage control.
These developments are implemented as a module for grid equivalents with automated parametrization, training, simulation, and evaluation. Comprehensive simulations are carried out, considering horizontal and vertical equivalent scenarios, operating points of grid assets, voltage fluctuation, switching statuses of lines, tap changer statuses, reactive power controllers, and measurement errors. To validate the advantages of the ANN approach, its performance is compared to that of the previous ANN-based approach and the state-of-the-art REI equivalent. Due to the above-mentioned limitations, the REI equivalent could be shown to outperform the Ward and xWard equivalents in our preliminary work [15]. Therefore, in this paper, we only compare the performance of the proposed equivalent to the REI equivalent and not to the Ward and xWard equivalents.
The structure of this paper is arranged as follows: In Section II, together with introductions of theoretical fundamentals, the proposed ANN-based scheme is described. In Section III, the test grids and scenarios are presented. The evaluation with different grids and scenarios is performed in Section IV. Finally, in Section V, conclusions are drawn.

II. GRID EQUIVALENT USING ARTIFICIAL NEURAL NETWORKS
The most desirable property of a grid equivalent method is that it should represent the effect of the ES on the interconnection between the IS and the ES (i.e., on the boundary buses) as accurately as possible. This effect depends on the current grid state of the overall grid (IS + ES). Therefore, the goal of the grid equivalent is, in effect, to adaptively estimate the relationship between the grid state (inputs) and the interactions at the interconnection (outputs). This goal is represented by the core component interaction estimation of the proposed approach, see Fig. 3 a) middle. Starting with this component, we successively describe our proposed scheme for the ANN-based grid equivalent.

A. ANN-CONFIGURATION FOR INTERACTION ESTIMATION
An artificial neural network imitates a biological neural network and can discover patterns adaptively in data sets because of its flexibility and nonlinearity. It has been used to learn complex input-output relationships from experience with high accuracy [33]. The training of an ANN-based grid equivalent is realized within the component interaction estimation. As Fig. 3 illustrates, it receives the training input data (grid data set of historical grid states) and training output data (interactions, e.g., power exchange at the interconnection) and trains an ANN-estimator such that the outputs of the ANN approximate the given output training data as closely as possible. In this component, two supervised ANNs with different architectures are considered.

1) FEEDFORWARD ARCHITECTURE
The first supervised ANN is a feedforward neural network (FNN). As its name implies, FNN is a framework having only direct connections between neurons of one layer and those of the next layer as the ANN depicted in Fig. 2 d).
The input vector $x$ of each neuron is multiplied with the vector of weights $w$, and the resulting scalar is passed through an activation function $\varphi$ to produce the output $y = \varphi(w^T x)$ [32]. In many cases, a bias $b$ is added to the weighted sum, i.e., $y = \varphi(w^T x + b)$.
For the regression task in this paper, a multi-layer FNN is used.
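The forward pass of such a multi-layer FNN can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes, weights, and function names are hypothetical, ReLU is used in the hidden layer, and a linear output layer is assumed for the regression output.

```python
def relu(v):
    return max(0.0, v)


def identity(v):
    return v


def neuron(x, w, b, activation):
    """Output of one neuron: activation of the weighted sum plus bias."""
    return activation(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fnn(x, layers):
    """Feed x through the layers; each layer is (weight rows, biases, activation)."""
    for weights, biases, activation in layers:
        x = [neuron(x, w, b, activation) for w, b in zip(weights, biases)]
    return x


# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 linear output.
layers = [
    ([[1.0, 0.5], [-1.0, 1.0]], [0.0, 0.5], relu),
    ([[1.0, -2.0]], [0.1], identity),
]
out = fnn([1.0, 2.0], layers)
print(out)
```

Each layer only uses the outputs of the previous layer, which is the defining property of the feedforward architecture.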

2) RECURRENT ARCHITECTURE
Unlike in FNNs, in recurrent neural networks (RNNs), feedback connections exist from the output of neurons to the inputs of neurons (either their own inputs or the inputs of other neurons). An RNN is a dynamic system, and it can use its internal state to process sequences of inputs [36] [37]. The state $h_t$ depends not only on the current input $x_t$ but also on the previous state $h_{t-1}$, see Fig. 4 a). Accordingly, the general mathematical representation of an RNN for one pattern is given by (1) and (2):

$$h_t = f\left(W_{xh}^T x_t + W_{hh}^T h_{t-1}\right) \quad (1)$$
$$y_t = g\left(W_{hy}^T h_t\right) \quad (2)$$

where $f$ and $g$ denote the activation functions and
$W_{xh}^T$: transposed matrix of weights connecting the input to the state
$W_{hh}^T$: transposed matrix of weights connecting the previous state $h_{t-1}$ to the current state $h_t$
$W_{hy}^T$: transposed matrix of weights connecting the state to the output
The output for the current input depends on the past computation such that the ANN exhibits a memory [37].
In power systems, many recursive processes exist, e.g., in controllers such as local voltage controllers. For example, in a Q(V) control, the reactive power Q fed in by a DER depends on the local voltage V, which in turn depends on Q. When the local voltage V changes, Q changes as well, which changes V again, forming a feedback loop. To represent such recursive feedback processes, the recurrent architecture is modified, i.e., instead of a time series $[x_t, x_{t+1}, x_{t+2}]$, we consider the static, duplicated grid state of a single time step $[x_t, x_t, x_t]$ as input data, see Fig. 4 b). The output of each RNN layer is obtained after two or more recursive iterations. Among the many varieties of RNNs, the long short-term memory (LSTM) architecture has been used in a wide range of applications because of its ability to overcome the vanishing gradient problem [36] [39]. Thus, we have implemented the LSTM-RNN for the ANN-based grid equivalent.
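The idea of feeding a duplicated static grid state into a recurrent layer can be sketched as follows. Note the assumptions: a plain tanh recurrent cell without biases stands in for the LSTM cell actually used, the weights are arbitrary, and all function and variable names are hypothetical.

```python
import math


def rnn_step(x, h_prev, W_xh, W_hh):
    """One recursion: h = tanh(W_xh^T x + W_hh^T h_prev)."""
    n_h = len(h_prev)
    return [
        math.tanh(
            sum(W_xh[i][j] * x[i] for i in range(len(x)))
            + sum(W_hh[k][j] * h_prev[k] for k in range(n_h))
        )
        for j in range(n_h)
    ]


def rnn_static_input(x, W_xh, W_hh, W_hy, repeats=3):
    """Apply the same grid state x `repeats` times and read the output at the end."""
    h = [0.0] * len(W_hh)
    states = []
    for _ in range(repeats):  # duplicated input [x, x, x] instead of a time series
        h = rnn_step(x, h, W_xh, W_hh)
        states.append(h)
    y = [sum(W_hy[k][m] * h[k] for k in range(len(h))) for m in range(len(W_hy[0]))]
    return y, states


# Hypothetical one-dimensional example.
y_out, states = rnn_static_input([0.5], W_xh=[[1.0]], W_hh=[[0.5]], W_hy=[[2.0]])
print(states, y_out)
```

Although the input is identical in every iteration, the state changes from one recursion to the next because it feeds back into itself, which loosely mirrors the way a Q(V) control loop settles.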

B. TRAINING DATA PRE-PROCESSING

1) FEATURE SELECTION
The observed grid elements (see Table I) for the ANN-based equivalent have different impacts on the interconnections due to their various capacities and distances to the boundary. The elements with a small impact on the interaction require the same amount of grid data for the ANN training as those with a high impact. However, they improve the estimation accuracy only marginally. Thus, we propose a component feature selection, where those elements with a high impact on the interconnections, which are the critical (featured) elements, are identified based on a set of sensitivity analyses. In general, a sensitivity analysis is based on a single static grid state. To find the critical elements associated with power injection, we carry out a sensitivity analysis for each time step (grid state) of the used grid data set. The analysis yields the sensitivity of the power exchange at the interconnection to a power injection. This set of calculations is treated as the base case sensitivities. A DER or load is critical for the grid equivalent if any corresponding result for any time step meets the predefined thresholds, cf. Table II. To determine the sensitivities of the switching status of a single branch element (e.g., a line), a set of sensitivities is calculated for each switching status change (e.g., outages of every single line). The effect of the switching status change on the interconnection is measured by its deviation from the corresponding base case sensitivities. The switching status of the branch element is considered for the ANN-based equivalent if its effect on the interaction meets the predefined thresholds or if any other branch element becomes overloaded. The sensitivity analyses mentioned above are computationally very intensive. We have used the fast parallel power flow calculation algorithms in [40], with which the computing time is substantially reduced.
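The selection rule for power injections can be sketched as follows, assuming the per-time-step sensitivities have already been computed. The element names, the sensitivity values, the threshold of 0.1, and the use of the absolute value are assumptions for illustration only and are not taken from the paper.

```python
def critical_elements(sensitivities, threshold):
    """Return the elements whose sensitivity meets the threshold in any time step.

    sensitivities: dict mapping an element name to its list of
    sensitivities, one value per time step (grid state).
    """
    return sorted(
        name
        for name, series in sensitivities.items()
        if any(abs(s) >= threshold for s in series)
    )


# Hypothetical sensitivities for three elements over two time steps.
example = {
    "load_1": [0.01, 0.02],
    "der_7": [0.00, 0.30],
    "load_9": [-0.50, 0.00],
}
selected = critical_elements(example, threshold=0.1)
print(selected)
```

An element is kept as soon as a single time step meets the threshold, so rare but strong interactions with the boundary are not discarded.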

2) DATA OBFUSCATION
In practice, the IS and the ES could belong to the same or different grid operators. In the latter case, the grid operators are usually not willing to share their grid data with others. In view of this situation, it is essential to obfuscate the grid data set before sharing. In the obfuscation process, the relevant information for the training of the ANN-estimator has to be maintained. For this purpose, another ANN with unsupervised learning, an Autoencoder (AE), is used. Unsupervised learning is supplied with unlabelled data sets (containing only the input data) and is left to find properties in the data set and build a new model from it. While supervised learning leads to regression and classification, an unsupervised-learning-based AE performs feature learning, clustering, obfuscation, and data dimensionality reduction [41], which is what the proposed method needs to obfuscate the grid data. As Fig. 4 c) illustrates, an AE consists of an encoder and a decoder. The encoder $f_{enc}$ transforms the input $x$ and produces the obfuscated code $z$:

$$z = f_{enc}(x) \quad (3)$$

The decoder $f_{dec}$ then reconstructs the input data by producing outputs $x'$ that are as close as possible to the original input data $x$:

$$x' = f_{dec}(z) \quad (4)$$

Different kinds of AEs aim to achieve different kinds of properties [42]. In this paper, an AE with stacked architecture, which is most appropriate for cybersecurity applications [41], is chosen. The stacked AE we have trained is symmetrical, and both the encoder and the decoder are fully connected FNNs. The middle-most layer, i.e., the output of the encoder (cf. Fig. 4 c)), is used as the obfuscated data for data exchange and the subsequent training of the ANN-estimator. The original input data can only be reconstructed by the decoder that is simultaneously trained in the same training process, i.e., by the FNN of the decoder $f_{dec}$.
Since the latter is not shared, it is unknown to the other grid operator, and a reconstruction of the original input data would only be possible for the grid operator delivering the obfuscated data $z$.
The data obfuscator is only used for the input data. The output data set, which is the data at the interconnection, is generally the same and open for both involved grid operators (cf. Fig. 3 a)) and can be directly used without obfuscation.
It is assumed that an RNN-estimator based on obfuscated data is used for a grid equivalent task. The original grid data of the IS and the ES ($x_{IS}$ and $x_{ES}$) are separately obfuscated by (5) and (6). The vector concatenation of their results $z_{IS}$ and $z_{ES}$ (equation (7)) is the input for the RNN that estimates the interaction $y$ at the interconnection, see equations (1), (2), and (8).

C. MODELING AND MODIFICATION
Based on the selected and obfuscated grid data, through proper training, an ANN can be used to find a mapping from different grid states to the interactions at the interconnection. In our proposed approach, we model the estimated interactions as equivalent devices attached to the boundary in the IS. The kind of equivalent device to be implemented depends on the data type of the estimated interaction, i.e., the estimated power values are represented by equivalent adaptive loads, and the estimated voltage values are modeled as equivalent adaptive generators or external grids, such that the effect of the ES is approximated. As shown in Fig. 3 b), grid operators exchange their selected and obfuscated grid data in the operation phase. The corresponding interactions estimated by the ANN-estimator are implemented by modifying the equivalent devices at the boundary, e.g., by modifying the values of the equivalent adaptive loads.

III. TEST POWER GRIDS AND SCENARIOS
To better understand the functionalities of the proposed approach, in this section, the ANN-based grid equivalent is applied and investigated based on two German benchmark grids and two scenarios, i.e., as a horizontal equivalent and as a vertical equivalent, see Table II.
The motivation for the horizontal equivalent is that within one grid (and one voltage level), the requirements for detailed modeling of distant grid areas decrease, while the requirements for calculations with large grids increase (cf. the first two reasons for using a grid equivalent in Section I-A). We represent the related grid areas by using the proposed approach. The grid is a slightly modified version of the medium voltage grid Oberrhein in [30]. The configurations for the ANN-based approach are listed in Table II.
Grid equivalents are also required in cross-grid studies where different grid operators are involved. Thus, we use the modified SimBench grid [43] with multiple voltage levels and operators for the vertical scenario. The extra-high-voltage (EHV) level containing eight connection points operated by the TSO and the high-voltage (HV) level operated by the DSO are interconnected via three transformers. In addition to the voltages at the connection points, the operating points, the switching statuses of lines, and time-varying tap-changer positions are considered for this scenario (see Table II). To train an ANN-estimator to adaptively represent the interactions that are influenced by both grid operators, TSO and DSO have to exchange their grid data sets. Hence, we activate the component data obfuscation. The following analyses are performed: 1. effects of different configurations on data obfuscation; 2. effects of obfuscated data on accuracy; 3. comparison of the proposed approach with conventional methods.
Also, horizontally interconnected grids can belong to different grid operators (e.g., DSOs' or TSOs' cooperation). The application of the proposed approach in the latter case is comparable to the scenario described above.

Grid information
The medium voltage grid Oberrhein in [30] is located in the section of the Rhine in the Upper Rhine Plain between Basel in Switzerland and Bingen in Germany. It has 179 buses, 51 DERs, 147 loads, and 1 connection point.
In the project SimBench [43], a benchmark dataset to support research in grid planning and operation has been developed. The SimBench grid used for the vertical equivalent consists of an EHV grid in northeast Germany and an HV grid for the region Schwerin, which belong to the control area of the German TSO 50Hertz.

A. IMPLEMENTATION AND SIMULATION ASSUMPTION
The implementation of the proposed approach and the classical equivalent methods is done in the programming language Python, based on the grid analysis tool pandapower [30], the PyTorch package [29], and the estimation tool in [23]. The implemented REI, Ward, and extended Ward methods are validated by comparing the results of our implementation with those provided by DIgSILENT PowerFactory [31]. Before and after the calculation of an equivalent, the static bus voltages deviate by up to $10^{-6}$ p.u. for standard IEEE benchmark grids such as case9, case39, and case118. We use pandapower as the tool for grid simulation.
For an ANN to accurately approximate the interaction at the interconnection for a given grid state, it is important to obtain a meaningful training data set, which covers as many grid situations as possible. We use an improved version of the scenario generator of our preliminary work [23] such that grid parameters for scenario generation are configurable through a simple string, i.e., in the framework of pandapower, each observed grid parameter follows its element, separated by a dash (-), and the element-parameter pairs are connected with each other by slashes (/), e.g., load-p_mw/line-in_service/controller-cos_phi means that the active power of loads, the switching status of lines, and the controller parameter cos φ are considered during the scenario generation.
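The structure of such a configuration string can be made concrete with a small parsing sketch. The function name is hypothetical and the code is not the scenario generator itself; it only splits the string into element-parameter pairs according to the convention described above.

```python
def parse_scenario_string(config):
    """Split 'element-parameter/element-parameter/...' into (element, parameter) pairs."""
    pairs = []
    for item in config.split("/"):
        # Split at the first dash only, so the parameter name stays intact.
        element, parameter = item.split("-", 1)
        pairs.append((element, parameter))
    return pairs


pairs = parse_scenario_string("load-p_mw/line-in_service/controller-cos_phi")
print(pairs)
# [('load', 'p_mw'), ('line', 'in_service'), ('controller', 'cos_phi')]
```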
For the horizontal equivalent scenario, we generate an annual time series (inputs) with 15-min resolution for the input elements mentioned in Table I. The corresponding annual interactions at the boundary (outputs) are obtained by power flow calculations. To simulate grid topology changes, a single random line is switched off at each time step. This generated data set is divided into two parts for training (first half) and grid simulation (second half). For the vertical equivalent scenario, the SimBench grid provides data sets of realistic time series for loads and DERs in 15-min resolution over one year (35040 time steps). We have added random tap changer positions and voltages of the eight external grids for each time step (cf. Table II). After an annual simulation, the corresponding power exchanges between the TSO and the DSO (outputs) are calculated. To make the simulation more realistic, we have added measurement errors (maximum error of 1.5% for power measurements and maximum error of 0.5% for voltage measurements according to IEC 61869) to the input data set for the ANN. In addition, to let the ANN learn the periodicity of the realistic data set, we have added a number (from 0 to 95) to the input data for each time step to represent the time of day (24 hours per day, 4 time steps per hour → 96 time steps per day). We have used the data set from January to August for training and that from September to December for simulation. The complete TSO-DSO grid is only used for the grid data set generation using power flow calculations such that the outputs are correctly calculated according to the given inputs. In practice, it is not required to perform power flow calculations on the complete TSO-DSO grid as measurements of the boundary buses should be available for both grid operators.
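The two pre-processing steps for the vertical scenario, i.e., the time-of-day index and the bounded measurement errors, can be sketched as follows. The function names are hypothetical, and the uniform error distribution is an assumption, since only the maximum errors are stated above.

```python
import random


def time_of_day_index(step, steps_per_day=96):
    """Map a 15-min time step of the year to its position within the day (0 to 95)."""
    return step % steps_per_day


def add_measurement_error(value, max_rel_error, rng):
    """Perturb a value by a relative error bounded by max_rel_error.

    Assumption: the error is drawn uniformly from [-max_rel_error, +max_rel_error].
    """
    return value * (1.0 + rng.uniform(-max_rel_error, max_rel_error))


rng = random.Random(0)  # fixed seed for reproducibility
noisy_p = add_measurement_error(100.0, 0.015, rng)  # power: at most 1.5% error
noisy_v = add_measurement_error(1.02, 0.005, rng)   # voltage: at most 0.5% error
last_step_index = time_of_day_index(35039)          # last time step of the year
print(noisy_p, noisy_v, last_step_index)
```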
To find a proper ANN, the optimization method in [25] is used to obtain the number of layers and other hyperparameters. Adam [26] is chosen to optimize the ANN's weights, and the loss functions L1Loss [27] for the ANN-estimator and MSELoss [28] for the ANN-obfuscator are used. All results are produced with an Intel i7-4702MQ CPU (2.2 GHz), 16 GB of RAM (800 MHz), SSD storage, and Python 3.7 on Ubuntu Linux with GPU (2*GeForce GTX 1080) acceleration.
In this paper, the accuracy of the ANN-estimator is assessed using the Weighted Average Percentage Error (WAPE):

$$\mathrm{WAPE} = \frac{\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|}{\sum_{i=1}^{n} \left| y_i \right|} \cdot 100\%$$

where $\hat{y}_i$ is the estimated vector for time step $i$, $y_i$ is the actual vector for time step $i$, and $n$ is the number of time steps of the observed data set. Compared to the most commonly used key performance indicator to measure estimation accuracy, the mean absolute percentage error, the WAPE overcomes the high sensitivity to outliers and the issues that arise when the actual value is zero [44].
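A minimal sketch of the WAPE computation is given below. It assumes the common definition of the WAPE, i.e., the sum of absolute errors over all time steps and vector entries divided by the sum of absolute actual values; the function name and the example values are hypothetical.

```python
def wape(estimated, actual):
    """Weighted average percentage error in percent.

    estimated, actual: lists of vectors (one vector per time step).
    """
    abs_error = sum(
        abs(e - a)
        for est_vec, act_vec in zip(estimated, actual)
        for e, a in zip(est_vec, act_vec)
    )
    abs_actual = sum(abs(a) for act_vec in actual for a in act_vec)
    return 100.0 * abs_error / abs_actual


# Hypothetical example: two time steps, each with [P, Q] at one boundary bus.
y_est = [[9.0, 0.5], [11.0, 0.0]]
y_act = [[10.0, 0.0], [10.0, 0.0]]
error = wape(y_est, y_act)
print(error)  # 12.5
```

In this example, the actual reactive power is zero in both time steps, so a per-sample percentage error would be undefined, whereas the WAPE remains well defined.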

1) FEATURE SELECTION RESULTS
To find the critical elements for the interconnection, we have activated the component feature selection. With the predefined conditions of Table II, the feature selection has found 273 critical elements. Compared to the original elements, the amount of required data could be reduced by more than one-third (157 of 430), see Table I. The original data set and the critical data set are used for the subsequent ANN training.

2) ANN MODELS AND TRAINING RESULTS
According to our previous works [15] [24], an ANN-estimator with feedforward architecture, with the ID FO (see Table IV for the definition of the IDs), is trained as a benchmark model based on the original grid data set (labeled with an O in its ID) and optimized hyperparameters. To make a fair comparison, we have created three other models with the IDs ROI, FCI, and RCI, which inherit the hyperparameters of FO, see Table IV. The differences between the FNN and RNN models are: 1) Instead of the activation function ReLU used in the FNN models, the LSTM architectures represent the layers in the RNN models; 2) Each LSTM layer has a memory size of two, cf. Fig. 4. Models trained with critical data are labeled with a C in their IDs. To verify the effect of the critical data set and the component feature selection, we have re-optimized the hyperparameters and trained two additional models, FCO and RCI2, i.e., for the latter two models, the hyperparameters are re-optimized with the reduced data set (the input size is 273) of only the critical data. Fig. 5 illustrates the training results for the different ANN models. The differences between the results of the FNN models (dark color) and those of the RNN models (light color) are obvious. In the diagrams for the train loss, the RNN models can almost always reach a lower loss with fewer epochs during the training process. The related training time (b) and estimation time for all the samples (d) are higher for the RNN models than for the FNN models due to the recursive architecture of the RNN models. With RNN, the outputs are more accurately estimated (c), and the accuracy is improved by up to 40%.

FIGURE 5. Results comparison for different ANN-models: a) train loss; b) training time; c) estimation accuracy measured in WAPE; d) estimation time for all samples
Compared to FO and ROI (blue), with the same hyperparameters, the ANN models FCI and RCI based on critical data (orange) are successfully trained in less time due to the smaller input size. However, the estimation accuracy is slightly decreased. With the re-optimized hyperparameters, the ANN models FCO and RCI2 (green) trained with the reduced data set even outperform those trained with the original data set, FO and ROI (blue), in terms of estimation accuracy. The reasons for the longest training time for FCO and RCI2 are: 1) During the re-optimization and the retraining, ANNs need more time to find an optimal relationship between the limited inputs (critical data set) and the outputs; 2) The RNN architecture is more complicated than the FNN architecture so that its training time is longer.
To summarize, the results shown in Fig. 5 match the expectations: 1) The reduced and critical data set has nearly no effect on the accuracy, and, therefore, the data volume required for data exchange in practice can be reduced substantially, cf. Table I; 2) The implemented RNN models improve the accuracy significantly.

3) INTERACTION EVALUATION
We have applied the best performing ANN models FO, ROI, FCO, and RCI2 in the operation phase in a time-series simulation. The estimated values are implemented as equivalent loads attached to the boundary buses and modified for each time step. Fig. 6 shows the deviation of the estimated active and reactive power for the different equivalent models, i.e., for the ANN-equivalents and for the repeated and non-repeated REI equivalent. REIrep means that the REI equivalent is recalculated in every time step to adapt it to the changing grid conditions. Analogous to the results in Fig. 5 c), the interaction estimation is improved with the recurrent architecture (the improvement of the P-estimation is more significant than that of the Q-estimation). With the reduced grid data and re-optimized hyperparameters, the accuracy for RCI2 (light green) is closest to that based on the original data (ROI (light blue)). The deviation for the conventional REI method (purple) without recalculation is about eight times larger than that of the ANN estimators. With the REIrep model, the interaction can be precisely calculated within 1.2 s, see Table V. However, the computation time for the REI calculation depends on the size and the complexity of the grid model and increases as the grid grows larger and more complex. Thus, frequent recalculation can require a long computation time, which is usually not feasible. In addition, according to our experience, users often have to deal with a large sparse matrix occurring during the REI calculation process. The sparsity can cause convergence problems in the calculation process. The most obvious drawback of the REIrep model is that grid operators have to provide comprehensive grid information (including a grid model) for each time step. Grid operators are reluctant to do this, and grid models for some grid fields are not even available in detail or lack updates, which can still cause deviations.
Considering these drawbacks of the REI models, the proposed approach clearly outperforms the state-of-the-art REI approach since it yields almost instantaneous re-estimation and high estimation accuracy with only limited grid data and without grid models. Among all the ANN models, the RNN models ROI and RCI2 yield the best results. RCI2 needs only reduced information on the current grid condition. The very short computation time enables an application in real time.

1) FEATURE SELECTION RESULTS
The original data set for this scenario has a total of 359 parameters for 35040 (96*365) time steps, from which the feature selection has identified 166 critical parameters for the ANN-equivalent, see Table VI. Note that none of the single line outages has a significant impact on the interconnection between the TSO and the DSO due to the N-1 security criterion and the meshed grid topology. The observed loads and DERs are mostly residual devices that represent underlying medium voltage grids or high-power devices. Their large power fluctuations cannot be disregarded. Therefore, only a few loads and DERs are ignored. The deviations between using the original grid data and considering only the critical grid data are comparable to those for the horizontal scenario, cf. Fig. 5. Thus, for the remainder of this section, i.e., for the vertical scenario, we only show the results for the critical (reduced) data set.
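The data volumes implied by these numbers can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it assumes one stored value per parameter and time step, and it infers a 15-minute resolution from the 96 time steps per day; the variable names are hypothetical.

```python
# Figures quoted in the text above.
STEPS_PER_DAY = 96
DAYS = 365
N_ORIGINAL = 359   # parameters in the original data set
N_CRITICAL = 166   # parameters kept by the feature selection

time_steps = STEPS_PER_DAY * DAYS            # number of time steps per year
resolution_min = 24 * 60 // STEPS_PER_DAY    # inferred time resolution in minutes

# Assumption: exactly one value per parameter and time step.
values_original = N_ORIGINAL * time_steps
values_critical = N_CRITICAL * time_steps
reduction = 1.0 - N_CRITICAL / N_ORIGINAL    # share of parameters dropped

print(f"time steps: {time_steps}, resolution: {resolution_min} min")
print(f"values (original): {values_original}, values (critical): {values_critical}")
print(f"parameter reduction: {reduction:.1%}")
```

Under these assumptions, slightly more than half of the parameters no longer need to be exchanged.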

2) DATA OBFUSCATION EVALUATION
As explained in Section II-B-2), we use the outputs of the encoder as the obfuscated data. The size of the encoder's output layer is configurable, enabling obfuscated data with different dimensionality. A reduced dimensionality allows a data exchange with reduced data volume. We obfuscate the original data set with different reduction degrees (RD). Fig. 7 shows the results for an array consisting of 10 randomly selected active power operating points from the original data set. The obfuscated data with different RD and the related reconstructed data for one time step are shown in Fig. 7 a) and b). The obfuscated values deviate from the original data. With increasing RD, the dimensionality of the obfuscated data is reduced, and some information of the original data is lost due to the data compression (e.g., the deviation between the blue points and the black points in Fig. 7 b) is the largest). As a consequence, the reconstruction loss increases (see Fig. 7 b) and c)) [42].
The general criteria for grid equivalents are "accurate", "anonymous", and "less data exchange". In practice, under acceptable accuracy, grid operators can further reduce the data volume for data exchange with a nonzero RD. For the remainder of this section, we use the obfuscated data with RD0% and RD20% (e.g., purple and green points in Fig. 7 a)) to investigate different RD effects on the accuracy of the equivalent.
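The relation between RD, the dimensionality of the obfuscated data, and the reconstruction loss can be illustrated with a minimal sketch. Note the assumptions: instead of the stacked Autoencoder used in this paper, the sketch uses a linear autoencoder obtained from a singular value decomposition (equivalent to PCA), the data are synthetic, and RD is assumed to be the fraction of input dimensions removed by the encoder. The function names are hypothetical.

```python
import numpy as np

def latent_size(n_features: int, rd: float) -> int:
    """Encoder output size; assumes RD is the fraction of dimensions removed."""
    return max(1, round(n_features * (1.0 - rd)))

def fit_linear_autoencoder(X: np.ndarray, k: int):
    """Best linear autoencoder with k latent units, computed via SVD (PCA).
    This is a stand-in for the stacked Autoencoder, not the paper's model."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    W = vt[:k].T  # shape (n_features, k)
    return mean, W

def encode(X, mean, W):
    """Obfuscated (and possibly compressed) representation."""
    return (X - mean) @ W

def decode(Z, mean, W):
    """Reconstruction of the original data from the obfuscated data."""
    return Z @ W.T + mean

# Synthetic stand-in for 10 correlated operating points over 500 time steps.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))

losses, shapes = {}, {}
for rd in (0.0, 0.2, 0.5):
    k = latent_size(X.shape[1], rd)
    mean, W = fit_linear_autoencoder(X, k)
    Z = encode(X, mean, W)
    shapes[rd] = Z.shape
    losses[rd] = float(np.mean((X - decode(Z, mean, W)) ** 2))

for rd in losses:
    print(f"RD={rd:.0%}: obfuscated shape={shapes[rd]}, MSE={losses[rd]:.3e}")
```

In this simplified setting, an RD of 0% already changes the values that are exchanged (the data are rotated into the latent space) while allowing a practically lossless reconstruction, and the reconstruction loss grows as RD increases.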

3) ANN MODELS AND TRAINING RESULTS
After successful training with the real grid data set of the first eight months, the ANN-models in Table VII are evaluated by estimating the last four months, see Fig. 8. It can be seen (left) that the estimation accuracy decreases when obfuscated data are used and as the RD increases. However, the improvements brought by the RNN offset this deviation to some extent, i.e., the accuracy of FOCE0 with obfuscated data (light orange) is very close to the best (light blue). A lower RD is more likely to be chosen for better accuracy.
Plot b) in Fig. 8 shows how the accuracy changes over one day. The power exchange between the DSO and the TSO between 8 am (time step 32) and 3 pm (time step 60) is estimated with a WAPE of around 1-2%. The reason is that the operating points of the observed grid assets typically vary more significantly and regularly from 8 am to 3 pm. Correspondingly, the training data set for this time window covers almost all relevant grid conditions, enabling the ANN to learn the relations correctly. The accuracy is worse at other time steps due to the lack of a "meaningful" training data set and the "inaccurate" estimation of values close to zero. The TSO-DSO exchange from September to December ranges from 34.4 MW to -180 MW. Values close to zero cause high relative deviations, e.g., the estimated feed-in to one boundary bus is -0.02 MW, but the actual feed-in is 0.03 MW, yielding an estimation error of 250% for that time step. Such an error, however, has a negligible impact on the grid operation in this scenario, cf.
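The effect of near-zero values on percentage errors can be reproduced with a short sketch. It assumes the common definition of the WAPE as the sum of absolute errors divided by the sum of absolute actual values; the numbers are hypothetical and are not taken from the data set of this paper.

```python
import numpy as np

def wape(actual, estimated) -> float:
    """Weighted absolute percentage error in percent (assumed standard definition)."""
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return 100.0 * np.abs(actual - estimated).sum() / np.abs(actual).sum()

def per_step_percentage_error(actual, estimated) -> np.ndarray:
    """Absolute error of each time step relative to its own actual value, in percent."""
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return 100.0 * np.abs(actual - estimated) / np.abs(actual)

# Hypothetical exchange values in MW; the last actual value is close to zero.
actual = [-120.0, -60.0, 0.05]
estimated = [-118.0, -61.0, -0.05]

step_errors = per_step_percentage_error(actual, estimated)
total_wape = wape(actual, estimated)

print("per-step errors in %:", np.round(step_errors, 2))
print("WAPE over all steps in %:", round(total_wape, 2))
```

The near-zero step shows a very large relative error although its absolute error is only 0.1 MW, while the error measure aggregated over all steps remains small.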

4) INTERACTION EVALUATION
We carry out a time-series simulation with the real grid data set of the last four months, considering measurement errors.
Only the ANN-models with recurrent architecture (RCE, ROCE0, ROCE20) are considered. Note that the ANN-obfuscators are trained and constructed at the DSO and the TSO, respectively, cf. Fig. 3, and they are only valid for the grid data of the DSO or the TSO, respectively. In ROCE0 and ROCE20, obfuscated grid data of the DSO and the TSO are always used. The DSO-TSO interactions are estimated by different equivalent models and realized as equivalent loads attached to the boundary buses of the TSO. Fig. 9 shows the maximum deviations of bus voltage, line loading, and line loss in the TSO grid for 99% of the simulated time window. The resulting grid state deviations for RCE and ROCE0 are very small, i.e., bus voltage deviations up to about 0.005%, line loading deviations up to 2%, and line loss deviations up to 0.05%. The ANN-model ROCE20 estimates the DSO-TSO interaction less accurately due to the compressed data set with an RD of 20%, which causes larger deviations (green). The REIrep-model (pink) exhibits the smallest deviations, although some remain due to the measurement errors. However, the shortcomings of the REIrep-model are distinct: as mentioned in Section IV-B-3), the DSO has to provide the recalculated REI equivalent model in every time step, which requires a complete grid model and probably causes a high computational burden. With realistically limited data sharing between the DSO and the TSO, e.g., sharing only the modifications of the PQ-operating points, the corresponding errors of the REI-model (purple), which is not recalculated in every time step, increase dramatically, i.e., to more than 1% voltage deviations, more than 20% line loading deviations, and more than 3% line loss deviations. In this scenario, these significant errors are mainly caused by the time-varying tap changer positions, which are only considered when the REI model is recalculated based on the current and complete grid data of the DSO.
In contrast, under the same condition of limited data exchange, the ANN-models estimate the DSO-TSO interactions much more accurately. Although their effect on the TSO operation is still larger than that of the REIrep-model, the estimation errors are small, e.g., the line loading errors are all around 1-2%. Despite using an additional ANN model for data obfuscation (Autoencoder), the computation of the ANN equivalents is still almost instantaneous and substantially faster than that of the REIrep-model, cf. Table V. In practice, the ANN-obfuscator guarantees the grid data confidentiality and cybersecurity in the process of data exchange, with a slight effect on the accuracy of the equivalent.
There are different levels at which the exchange of grid data/models is necessary. Apart from the TSO-DSO exchange in this scenario, exchanges can be realized between TSOs or between DSOs. The involved grid operators aim to perform common studies using shared data. The Common Information Model standard has helped to standardize the exchange of grid data/models [44] [45]. However, there are many methods for constructing these data or models. Different models describing the same networks and equipment may use different hierarchies, classes, attributes, associations, and nomenclature. These differences in modeling make it challenging to ensure consistency of the network model between each usage [45]. The proposed ANN-based approach "skips" the model construction and power flow calculation processes, and the interoperability inside business processes is met through the ANN-estimation. Meanwhile, our approach is more flexible (configurable inputs and outputs) and confidential (data obfuscation).
In another context, to build a (regional or pan-European) common grid model, the European TSOs share their information with regional security coordinators (RSCs). Based on the resulting broader overview of the electricity system, RSCs provide TSOs with different services such as the calculation of cross-border capacities, outage planning coordination for relevant transmission facilities, etc. With the help of the proposed approach and based on the common grid model, RSCs could make their services more efficient, e.g., in the future, TSOs could obtain or provide the required information with less effort (sharing of limited grid data).

V. CONCLUSION AND OUTLOOK
In this paper, we have proposed an innovative machine-learning-based grid equivalent approach for reducing an external grid area and equivalently representing its effect on an internal grid area accurately, adaptively, and confidentially. The proposed approach comprises three main components which ensure its superiority over state-of-the-art schemes, such as the REI equivalent:
• The component feature selection finds the elements most sensitive to any target buses such that the ANN-training can be performed using a reduced data set while maintaining the estimation accuracy.
• The component data obfuscation obfuscates the original grid data set with a stacked Autoencoder for grid data confidentiality and cybersecurity reasons, and the ANN-estimation accuracy is not seriously degraded even when the original data dimensionality is not preserved.
• Within the component interaction estimation, an RNN architecture is considered to provide a high estimation accuracy of the interactions between the external and the internal grid area, even if, e.g., local controllers are involved.
Based on two typical equivalent scenarios with German benchmark grids and complicated grid conditions, the proper functionality of the proposed approach and its advantages over the state-of-the-art REI equivalent are demonstrated. Our proposed approach is flexibly extendable, and its components and tools can be applied independently, e.g., the ANN-obfuscator can benefit other studies related to confidential data.
The practical implementation of the proposed approach is expected to be relatively simple and efficient. There are no additional requirements for hardware. Meanwhile, the ANN-obfuscator makes the data exchange (exchange of an array or a JSON file consisting of obfuscated grid data) convenient and confidential. The proposed equivalent grid can be adapted to different power system studies according to the types of considered data, e.g., an ANN trained with switching status can be applied for congestion management. Nowadays, power systems are changing rapidly and significantly. In order to obtain an accurate equivalent model, the timeliness of the training data is essential. It is necessary to regularly retrain the ANN with the new grid data set to cope with the continuous changes of power systems. Thus, further improvement is needed to update the training data and retrain the ANN model automatically.
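The exchange of obfuscated grid data as a JSON file, as mentioned above, can be sketched as follows. The message layout (the field names "time_step", "rd_percent", and "obfuscated") and the function names are purely hypothetical and only serve as an illustration; they do not describe an existing exchange format.

```python
import json
import numpy as np

def pack_obfuscated(values, time_step: int, rd_percent: float) -> str:
    """Serialize one obfuscated data array into a JSON string (hypothetical layout)."""
    payload = {
        "time_step": int(time_step),
        "rd_percent": float(rd_percent),
        "obfuscated": np.asarray(values, dtype=float).tolist(),
    }
    return json.dumps(payload)

def unpack_obfuscated(message: str):
    """Restore the time step, the RD, and the obfuscated array on the receiver side."""
    payload = json.loads(message)
    values = np.array(payload["obfuscated"], dtype=float)
    return payload["time_step"], payload["rd_percent"], values

# Hypothetical encoder output of the sending grid operator for one time step.
encoded = np.array([0.12, -1.7, 0.45, 2.03])

message = pack_obfuscated(encoded, time_step=32, rd_percent=20)
step, rd, received = unpack_obfuscated(message)

print(message)
print(step, rd, received)
```

In such a setup, only the encoder outputs leave the sending grid operator; the original operating points and the grid model are not part of the message.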
As an outlook, the compatibility of an ANN-based equivalent grid with more sophisticated grid operation studies, e.g., optimal power flow, is worth studying. Furthermore, in this paper, we concentrated on the technical use of static equivalents for grid operation. The approach could also be adapted and applied to grid planning and market-oriented studies.