Parallel-in-Time Power System Simulation Using a Differential Transformation Based Adaptive Parareal Method

For parallel-in-time simulation of large-scale power systems, this paper proposes a differential transformation based adaptive Parareal method for significantly improved convergence and time performance compared to a traditional Parareal method, which iterates a sequential, numerical coarse solution over extended time steps to connect parallel fine solutions within respective time steps. The new method employs the differential transformation to derive a semi-analytical coarse solution of power system differential-algebraic equations, by which the order and time step, as well as the window length with a multi-window solution strategy, can adaptively vary with the response of the system. Thus, the new method can reduce divergences and also speed up the overall simulation. Extensive tests on the IEEE 39-bus system and the Polish 2383-bus system have verified the performance of the proposed method.


I. INTRODUCTION
R ENEWABLE intermittent energy resources keep penetrating electrical power systems and bring more uncertainties and challenges to power system operations [1]. Specifically, a day-ahead power system dynamic security assessment for representative or predicted operating conditions is no longer adequate for an interconnected power grid with a high penetration of renewable energy resources because its generation is far less predictable than before. Thus, dynamic security assessment tools that possess a realtime or even faster-than-real-time computing capability will be highly valuable to accommodate more uncertain and diversified operating conditions [2].
Time-domain simulation has been widely used for dynamic security assessment, especially transient stability assessment, of a large-scale power grid under contingencies [3], [4]. Compared to direct methods for transient stability analysis, time-domain simulation has high accuracy and is capable of dealing with a wide variety of detailed power system models. On the other hand, its major drawback lies in the huge computational burden and numerical divergence issues when solving many nonlinear differential-algebraic equations (DAEs) that model the power grid [5], [6]. Hence, solutions to boost the computational performance while ensuring the convergence of the solver algorithm in time-domain simulation to meet the real-time or faster-than-real-time requirements have been a challenging problem in power system research.
In the parallel computing category, existing methods mainly parallelize computations among multiple simulation runs for different contingencies. They still have limited ability of parallelization on a single simulation run, even with simplified models and many parallel processors in highperformance computers [26]. In this respect, the Parareal method has been successful in temporal parallelization of computations with low-dimensional models in other fields such as analysis of molecular dynamics and fluid-structure computation [27], [28]. The basic idea of the Parareal method is below. First, a computationally efficient coarse solver is employed to initialize the state variables for future time steps. Then, a fine solver is used to solve all coarse time steps in parallel, whose solutions are used to update the initial values of each coarse time step. Finally, the above procedure is performed iteratively until convergence. However, when applied to simulations of high-dimensional power system DAE models, a conventional Parareal method can face numerical divergence issues as discussed in [12], [13], [14], [15], [16], [17], [18], and [19].
For the methods in the semi-analytical simulation category, the derivation of a high-order semi-analytical solution (SAS) for high-dimensional power system DAEs can be challenging due to intensive symbolic computations of analytical solutions. Such challenges can be overcome by the recently proposed differential transformation (DT) method [23], [24], which allows a more efficient, recursive procedure for deriving an SAS from low order to high order. Thus, it allows one to use an SAS of any desired order and accuracy for the purpose of, e.g., an expected shorter or longer time step length in time-domain simulations. Considering these observations, power system simulation requires and benefits from further improvements, which innovatively integrate methods from the two categories to create a more adaptive and effective parallel-in-time algorithm for enabling the prediction of power system dynamic behaviors in real-time.
It is worth mentioning that an adaptive Parareal method [29] was recently proposed in the applied mathematics field and was tested on simple low-dimensional differential equations. However, it focuses on adjusting the accuracy of the fine solver to improve the computational efficiency, which is different from this paper that aims at developing the adaptive coarse solver and other adaptive strategies for improving both the convergence performance and computational efficiency in a high-dimensional nonlinear power system model. Currently, there is no mature method to overcome the divergence issues of the Parareal method in high-dimensional nonlinear power systems models.
In this regard, the contributions of this paper include: 1) a variable-order variable-step variable-window (VO-VS-VW) adaptive Parareal method is proposed for pareallelin-time power system simulations with greatly enhanced convergence and computation speed; 2) the DT method, for the first time, is employed as the coarse solver in a Parareal method to enable its better adaptivity than a numerical coarse solver; 3) the performance of the proposed method is compared to a traditional Parareal method by extensive tests on a 39-bus system and a 2383-bus system.
In the rest of the paper, Section II introduces the traditional Parareal method using numerical-based coarse solvers with fixed multi-window and first-order coarse solution update strategies. Section III presents the proposed DT-based adaptive Parareal method, with case studies presented in Section IV, and conclusions drawn in Section V.

II. TRADITIONAL PARAREAL METHOD
Notation: The following notations are used in the rest of the paper: where x is the vector of state variables, v is the vector of bus voltages, and f and g are functions in differential equations and algebraic equations, respectively. For simplicity, let Fig. 1 illustrates the basic idea of the Parareal method for simulating a power system model. First, the desired simulation length [t 0 , t N ] is decomposed into N coarse steps with length h c , and each coarse step is further decomposed into many fine steps with length h f . Then, the block of initial coarse evaluation computes the initial coarse solutions x c,0 n by performing the simulation over [t 0 , t N ] in serial, using a coarse operator C with the coarse step length h c . Afterward, the iteration process is performed between the blocks of fine evaluation and the coarse solution update. The block of fine evaluation is to provide a better solution x f ,0 n at each interval by simulating each coarse interval [t n , t n+1 ] in parallel using a fine operator F with the fine step length h f . The block of coarse solution update is to correct the coarse solution at each coarse interval. The iteration process continues until the stopping criteria is met, e.g., the differences of coarse solutions between the m th iteration and the (m + 1) th iteration are smaller than a pre-defined threshold, or the maximum number of iterations is reached.
The Parareal method described above, having been successful in other fields and having shown promising results for speeding up power system simulations, can face divergence issues when being applied to nonlinear high-dimensional power system models. The simulation length to obtain the fast and robust convergence of the Parareal method for a power system model is often restricted to around 1 second. When a longer simulation length is desired, e.g., 5 to 10 seconds in transient stability assessment, a multi-window strategy was utilized in [12], [13], [14], [15], [16], [17], [18], and [19] which decomposes the desired simulation length [t 0 , t N ] into multiple fixed windows with a window length of h w . Then, the Parareal method is applied to each window sequentially, and the trajectories of state variables in all windows are connected together to give the trajectory of desired simulation length. Fig. 2 illustrates the basic idea of the proposed method. The following improvements are made compared with the conventional Parareal method in Fig. 1. First, the DT method is employed as the coarse solver to replace a traditional numerical integration method as detailed in Section III-B. Second, a variable-order variable-step DT method is further designed to utilize the adaptivity of the DT method to adjust the accuracy of the coarse solver adaptively, as is shown in Section III-C-1). Third, the Parareal method is applied to multiple time windows and a variable-window strategy is designed to allow further adaptivity of the algorithm, with details presented in Section III-C-2). Finally, an improved coarse solution update strategy is designed as discussed in Section III-C-3).

B. DT-BASED PARAREAL METHOD
The selection of the coarse operator plays a significant role in the convergence performance and computational efficiency of the Parareal method. The coarse operator needs to have sufficient accuracy to ensure the convergence of the Parareal method on one hand and needs to be fast enough to achieve considerable speedup on the other hand. The latter is due to the coarse operator evaluation in serial, which can offset the efficiency gain from the parallel computing and should be minimized. Therefore, selecting a proper coarse solver with sufficient accuracy and efficiency would be key to improving the convergence performance and time performance of the Parareal method. For this purpose, this section proposes a DT-based Parareal method that replaces the coarse solver in the traditional Parareal method by a DT solver. Our previous work has demonstrated that the DT method achieves better accuracy and efficiency than various conventional numerical integration methods. Moreover, the coarse solutions obtained VOLUME 10, 2023 by the DT method are in the form of power series of time, which further allow adaptively adjusting the order and time step length to balance between the accuracy and computation time in evaluating the coarse solutions, thus improving the overall performance of the Parareal method.
The motivation, definition, and transformation rules of the DT method were described in [23] and [24] and are briefly summarized in the Appendix. In short, the DT method provides various transformation rules for numerous generic functions such that a differential equation in a continuous set about the variable t (time) is converted to a new set of difference equations in a discrete set about the variable k (the power series order). The transformation rules for widely used linear and nonlinear functions in power system models are provided in [18] and [19]. Fig. 3 illustrates the idea of deploying the DT method in deriving semi-analytical solutions of differential equations or differential-algebraic equations in the form of high-order power series of time.
The left-hand side of Fig. 3 gives the procedure of applying the DT method. First, the original differential equations of power system models are transformed to a new set of difference equations by the DT method. The derivation of the difference equation is performed offline and is a one-time effort. Then, based on the obtained difference equation, the power series coefficients up to any desired order could be efficiently computed recursively. Finally, the solutions are approximated by summating all the calculated power series terms.
The right-hand side of Fig. 3 uses the swing equation of a single-machine-infinite-bus (SMIB) system to illustrate the derivation procedure. Then, the rotor angle trajectories obtained from the power series solution with different orders are compared with the benchmark solution obtained by the RK4 method with a small enough time step length for accuracy. It shows the convergence region of the DT method increases with the order of the power series solution. Especially, the solution with an order of 15 is accurate enough up to 0.35 seconds, which is more than half period of the rotor angle trajectory.
Algorithm 1 shows the detailed procedure of the DT method where: the offline stage derives a recursive equation about power series coefficients; line 1 in the online stage is the initialization of variables; lines 3 to 5 are the online evaluation of power series coefficients; line 6 is to compute x(t n+1 ) by summing the power series terms from 0 th to K th orders; line 7 moves the simulation to the next time step.
Algorithm 2 shows the detailed procedure of the DT-based Parareal method where: lines 1 and 2 correspond to the block of initial coarse evaluation in Fig. 2 and C DT is the DT-based coarse solver in Algorithm 1; lines 4 to 6 correspond to the block of fine evaluation; lines 7 to 10 correspond to the coarse solution update; lines 11 to 13 are the stopping criterion. In the algorithm, x c,m n , x f ,m n , x * ,m n respectively mean the coarse solution, the fine solution, and the corrected coarse solution at time instant t n in the m th iteration; tol is a predefined threshold; M is the recorded number of iterations for the Parareal method to converge; m max is the pre-defined for k = 0 : K // computer power series coefficients 4.
x n ← x * ,m n , n = 1, 2, . . . N ; M ← m; break; 13. end 14. end maximum number of iterations. Note that in m th iteration, the fine evaluation in line 5 is performed from m to N , not 1 to N . This is because the coarse values in the previous (m−1) steps have been corrected by the fine solver in the previous (m − 1) iterations.
The proposed DT-based Parareal method has the following advantages: 1) the DT method is more efficient than numerical methods since it can reduce the sequential time and improve the overall speedup; 2) the DT method, at the same time, is more accurate than numerical methods since the coarse solution by the DT method is closer to the true solution and fewer iterations are needed to make the Parareal method converge; and 3) the DT method is highly flexible since the order and the time step can be adjusted adaptively (as shown in the next section) to balance the accuracy and efficiency of the coarse solver during the simulation, thus, improving the overall performance. Algorithm 3 VOVS-DT Strategy input: t 0 , t end , x 0 , h 0 , K 0 output: x(t n ), h(t n ), K(t n ), t 0 ≤ t n ≤ t end offline stage: derive the recursive equation X(k + 1) = F(X(0 : k)) online stage: 1.

C. ADAPTIVE STRATEGIES FOR DT-BASED PARAREAL METHOD 1) VOVS-DT STRATEGY
Algorithm 3 insert shows the proposed VOVS-DT strategy, where the outputs include not only the trajectories of state variables but also the variable time step lengths h(t n ) and variable orders K(t n ) during the simulation; lines 1 to 7 are similar to the DT method in Algorithm 1, except for the initialization of additional variables associated with the time step length and the order of DT; line 8 is error estimation using the (K + 1) th power series term; lines 9 to 12 are the adaptive change of time step length and the order of DT based on the error estimation. In Algorithm 3, ε 1 , ε 2 , ε 3 , ε 4 are predefined error thresholds; q 1 , q 2 are the factors to adjust the time step length; K is an incremental adjustment of the order of DT; h max , h min are the pre-defined maximum and minimum time step length; K max , K min are the maximum and minimum order.
The proposed VOVS-DT strategy has the following compelling advantages when used as the coarse solver in the Parareal method: 1) it increases the order and reduces the time step length of the DT-based coarse solver when the error is larger than a threshold during the simulation, which could reduce the needed number of iterations for the Parareal method to converge; 2) it decreases the order and increases the time step length of the DT-based coarse solver when the error is smaller than a threshold during the simulation, which might slightly increase the number of iterations but can avoid unnecessary computation burden and improve the overall efficiency. Note that these advantages are not easily attainable from other existing methods. VOLUME 10, 2023 Algorithm 4 VW-Parareal Strategy input: t 0 , t end , h w , x(t 0 ) output: call Algorithm 2: DT-based Parareal method to compute The proposed VW-Parareal strategy has the following advantages. First, the use of the multi-window strategy makes it easier for the Parareal method to converge because the length of a window is smaller than the total simulation length. Second, the proposed VW strategy can provide more suitable window sizes during the simulation than a pre-defined fixed size, because the Parareal method with a fixed window size may diverge when the size is too large and may be inefficient when the size is too small. Therefore, the proposed VW-Parareal strategy can balance the convergence with efficiency performance.

3) IMPROVED COARSE SOLUTION UPDATE STRATEGY
The coarse solution update strategy in line 9 of ), could be interpreted as a first-order Newton iteration method to solve the multi-shooting problem, where the Jacobian matrix in the Newton iteration method is approximated by the coarse solutions [30]. Following the interpretation in [30], this section extends the Parareal method to the second-order Newton iteration method to solve the multi-shooting problem, where both the Jacobian matrix in the first-order term and the Hessian matrix in the second-order term are approximated by the coarse solutions. The procedure is similar to [30] and the details are omitted. The coarse solution update strategy including the second-order term can be represented as follow: Here, note that the value of x c,m+1 n is unknown in the m th iteration. In this work, it is approximated by x c,m+1 n ≈ x c,m n , which results in the proposed improved coarse update strategy. This simple yet practically effective modification improves the coarse solution update strategy and, thus, can reduce the number of Parareal iterations if the Parareal method converges. Despite its practical effectiveness, rigorous validation of this assumption and the proposed improved coarse solution update strategy is also important and necessary to advance the understanding of the temporal parallelization algorithms and would be an interesting research topic that deserves further exploration.

IV. CASE STUDY
The proposed DT-based adaptive Parareal method is first tested on the IEEE 39-bus system [26] with classical generator models, and the Polish 2383-bus system [31] with detailed models of generators, exciters, governors, and turbines. Then, various scenarios are tested to demonstrate the reliable performance of the proposed method. In each test, the following three methods are compared: 1) M1: the traditional Parareal method, 2) M2: the proposed DT-based Parareal method, and 3) M3: the proposed DT-based adaptive Parareal method. Compared with M2, M3 further integrates the proposed adaptive strategies. In all three methods, the fine solvers are selected as the 4th order Runge-Kutta method.
For all the tests in Section IV-A and Section IV-B, the test settings are the following. The time step lengths of the coarse and fine solver are selected as 0.1 seconds and 0.001 second, respectively; the parameters in the algorithms are selected as h max = 0.2, h min = 0.0001, K max = 20, K min = 2, q 1 = 1.1, q 2 = 0.9, K = 1, m max = 15, tol = 0.1. For the 39-bus system, the simulated contingency is a fictitious temporary three-phase fault at bus 8 applied at t = 1 second that clears after 0.2 seconds. For the 2383-bus system, the simulated contingency is a permanent three-phase fault at bus 9 applied at t = 1 second and cleared after 0.4 seconds by tripping the line between bus 6 and bus 9.
Note that these two specific scenarios are used to provide detailed simulation results and corresponding descriptions. A wide range of different scenarios to further validate the proposed method is considered in Section IV-C. The tests are conducted in MATLAB R2021b environment on a personal computer with i5-8250U CPU. Both the DT method and the Parareal method are implemented in MATLAB using the research code developed by the authors. The Matpower toolbox is used to solve the power flow and initialize the dynamic simulation. Table 1 gives the number of iterations and CPU time for the three methods when the Parareal window size is 0.75 seconds. It shows the traditional Parareal method takes an average of 7 iterations in each window to converge while the proposed DT-based Parareal method and the DT-based adaptive Parareal method need only one iteration in each window. Moreover, the computation time of the DT-based Parareal method and the DT-based adaptive Parareal method is reduced by 37.5% and 50% respectively, compared to the traditional Parareal method. When the Parareal window size is further increased from 0.75 seconds to 1 second, both the traditional Parareal method and the DT-based Parareal method diverge, while the DT-based adaptive Parareal method can still converge. For the DT-based adaptive Parareal method, Fig. 4 shows the trajectories of rotor angles of all generators, voltages of all buses, and electrical power outputs of all generators. Fig. 5 shows the window length and the number of iterations of each Parareal window. Fig. 6 shows the time step length and order of the DT-based coarse solver. It can be seen from these results that: 1) the first Parareal window needs 11 iterations to converge, and 2) the proposed DT-based adaptive Parareal method can adaptively adjust the window size, order, and time step length of the coarse solver during the simulation such that convergence is achieved, as the number of iterations is reduced in the subsequent windows (as shown also with the CPU time in Table 1). Table 2 gives the number of iterations and CPU time of the three methods when the Parareal window size is 0.75 seconds. The traditional Parareal method diverges while both the proposed DT-based Parareal method and the DT-based  adaptive Parareal method converge and need only one iteration in each window. Since the DT-based adaptive Parareal method adjusts the window size, the order, and the time step VOLUME 10, 2023  length adaptively during the simulation, the computation time is reduced by 25% compared with the DT-based Parareal method without adaptive strategies.

B. PERFORMANCE WITH THE 2383-BUS SYSTEM
When the Parareal window size is further increased from 0.75 seconds to 1 second, the results are similar to the 39-bus system test, i.e., only the DT-based adaptive Parareal method converges. Figs. 7 to 9 respectively give the trajectories of rotor angles of all generators, voltages of all buses, and electrical power outputs of all generators; the window length and the number of iterations of each Parareal window; and the time step length and order of the DT-based coarse solver. From Figs. 8 and 9, it can be seen that the number of iterations is 12 in the first window but it is reduced in the subsequent windows benefiting from the flexibility of adaptively adjusting the window size, the order, and the time step length of the coarse solver during the simulation. These results demonstrate the better convergence performance of the proposed DT-based adaptive Parareal method in a large power system with detailed dynamic models.

C. RELIABLE PERFORMANCE IN VARIOUS SCENARIOS
In this study, the three methods M1, M2, and M3 are tested on the 39-bus system under 54 scenarios in total, with different contingencies and parameter settings, as follows: 1) Three contingencies with increased severity, i.e., Contingency #1: a temporary fault at bus 8, cleared after 0.1 seconds; Contingency #2: a permanent fault at bus 8, cleared after 0.2 seconds by tripping the branch between bus 8 and bus 9, and Contingency #3: a permanent fault at bus 8, cleared after 0.3 seconds by tripping the branch between bus 8 and bus 9; and 2) Different time step length of the coarse solver, i.e., h c = 0.1 seconds, 0.08 seconds, and 0.05 seconds.
3) Different Parareal window size, i.e., h w = 0.5 seconds, 0.75 seconds, 1.0 second, 1.25 seconds, 1.5 seconds, and 2 seconds. Table 3 to Table 5 give the average number of iterations in each window, respectively for the three methods. They show that there are 23 divergent scenarios (42.5% of the total number of scenarios) for the traditional Parareal method and 4 divergent scenarios (7.4% of the total number of scenarios) for the proposed DT-based Parareal method; while the proposed DT-based adaptive Parareal method converges in all the 54 scenarios. These results demonstrate the effectiveness of the proposed DT-based adaptive Parareal method in improving the convergence performance and robustness across various scenarios.
A similar study is conducted on the Polish 2383-bus system. First, the performance under severe contingencies is validated by creating three contingencies with increasing severities, where contingency #3 leads to unstable trajectories: 1) Contingency #1 has a temporary fault at bus 9 cleared  after 0.4 seconds; Contingency #2 has a permanent fault at bus 9 cleared after 0.4 seconds by tripping the branch between bus 6 and bus 9, and Contingency #3 has a permanent fault at bus 9 cleared after 0.6 seconds by tripping the branch between bus 6 and bus 9. None of the three cases is converged with the conventional Parareal method. In contrast, the proposed method converges and takes an average of 2.56, 2.56, and 3.46 iterations, respectively, for the three contingencies. Second, the impact of parameter settings is validated by testing different time step lengths of the coarse solver, i.e., h c = 0.1 seconds and 0.08 seconds, respectively, and different Parareal window sizes, i.e., h w = 1.0 seconds and 1.25 seconds, respectively. Similar results are obtained where the conventional Parareal method diverges with those large time step lengths for the coarse solver and window lengths while the proposed method converges by less than 5 iterations (averagely 1.42 and 2.56 iterations when h c = 0.1 seconds and 0.08 seconds, respectively; and 2.56 and 4.57 iterations when h w = 1.0 second and 1.25 seconds, respectively). These results show the proposed DT-based Adaptive Parareal method has reliable performance on the Polish 2383-bus system as well.
In the above tests, the tolerance tol=0.1 is selected in the Parareal algorithm. However, the proposed method could still converge for a smaller tolerance while the conventional method would diverge. Two cases are tested on the Polish 2383-bus system with tol=0.1 and tol=0.01, respectively. The conventional Parareal method diverges in both cases while the proposed method converges in an average of 2.56 and 3.25 iterations, respectively.

V. CONCLUSION
The Parareal-based parallel-in-time method and the differential transformation-based semi-analytical solution method are two of the state-of-the-art techniques in power system time domain simulation. Leveraging the temporal parallelization capability of the Parareal method and the high accuracy, efficiency, and adaptivity of the differential transformation method, this paper proposes a differential transformationbased variable-order variable-step variable-window adaptive Parareal method for temporal parallelization of power system simulation with greatly enhanced convergence performance and efficiency. Since the coarse solver plays a key role in the performance of the Parareal method, this paper employs the DT method as the coarse solver in the Parareal method and further designs a set of adaptive strategies to enhance the performance. Extensive simulations on a 39-bus system and a 2383-bus system demonstrate that the proposed approach improves the convergence performance and computational efficiency of the Parareal method and has reliable performance under various contingencies and parameter settings.

APPENDIX
The motivation, definition, and selected transformation rules of the DT method are summarized below, and more details could be found in [23], [24].
Motivation: The DT method is a promising semi-analytical solution method that was originally proposed in the applied mathematics field and introduced to the power system field recently by [23]. It aims at deriving an approximate solution of nonlinear differential equations in the form of finite power series of time with any desired order. Different from other SAS methods such as the Adomian decomposition method [20], [21] and the power series method [22], a unique benefit of the DT method is that it provides various transformation rules to convert a nonlinear function directly to its power series coefficients. As a result, the DT method avoids the complicated process of calculating explicit high-order derivatives and provides a flexible and efficient procedure for deriving SAS up to any desired order.
Definition: The DT of a real continuous function x(t), t ∈ R is defined by (A1), where k ∈ N is the order.
Transformation rules: The DT method provides transformation rules for various generic functions (both linear and nonlinear functions; both simple and compositional functions). The widely used transformation rules in the power system model are summarized below. Note that these rules apply to not only scalar functions but also vector-valued functions.
The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, and world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for the U.S. Government purposes. The Department of Energy (DOE) will provide public access to these results of federally sponsored research in accordance with the DOE public access plan.