A Locally-Implicit Discontinuous Galerkin Time-Domain Method to Simulate Metasurfaces Using Generalized Sheet Transition Conditions

The generalized sheet transition conditions (GSTCs) are incorporated into a discontinuous Galerkin time-domain (DGTD) method to efficiently simulate metasurfaces. The numerical flux for GSTCs is derived for the first time using the Rankine–Hugoniot jump conditions. This numerical flux includes the time derivatives of fields, and therefore, explicit time integration schemes, which are traditionally used with DGTD, do not yield a stable time marching method. To alleviate this bottleneck, a new time marching scheme, which solves a local matrix system for the unknowns of the elements touching the same GSTC face, is developed. This locally implicit method maintains its high-parallel efficiency just like the traditional explicit DGTD schemes. Numerical results, which validate the accuracy of the proposed method against analytical solutions and demonstrate its applicability to the simulation of curved and space/time-varying metasurfaces, are presented.


I. INTRODUCTION
I N RECENT years, metasurfaces have attracted significant attention due to their unprecedented ability to manipulate electromagnetic fields [1], [2], [3]. Building blocks of metasurfaces are subwavelength meta-atoms that can introduce amplitude and phase changes in electromagnetic fields. These meta-atoms are arranged into a 2-D array, and depending on the shape of this array and the meta-atoms' phase and amplitude profiles, the resulting metasurface is capable of reshaping wavefronts at scales beyond the operation wavelength [1], [2], [3]. Metasurfaces have found a diverse range of applications, such as anomalous reflection and refraction, beam steering and focusing, and cloaking [1], [2], [3], [4], [5], [6], [7], [8], [9]. In more recent years, curved [10], [11], [12], [13], space/timevarying [14], tunable and active [15], [16], and nonlinear [17] metasurfaces have also been explored and shown to have many benefits in various real-life applications. Numerical modeling has played a vital role in the development of metasurfaces and their widespread use. While full-wave methods are commonly used to simulate a single meta-atom [often under periodic boundary conditions (PBCs)], the simulation of a whole metasurface, which often consists of hundreds of meta-atoms with subwavelength and detailed geometries, is computationally challenging [1], [18], [19]. To circumvent this bottleneck, the generalized sheet transition conditions (GSTCs) [1], [11], [20], [21], [22], which model a metasurface as a zero-thickness sheet and account for the field discontinuities across its sides using surface susceptibility functions, have been successfully incorporated into several full-wave electromagnetic solvers. These include finitedifference frequency-domain (FDFD) method [23], finiteelement method (FEM) [24], finite-difference time-domain (FDTD) method [25], [26], [27], [28], and surface integral equation (SIE) solvers [29], [30], [31]. With the inclusion of GSTCs, these methods can efficiently account for metasurfaces in full-wave simulations since they do not need to discretize the meta-atoms with fine meshes anymore. Having said that, electromagnetic solvers with GSTCs retain the strengths and the weaknesses of their original versions without GSTCs. For example, the frequency-domain FEM and SIE solvers are well-equipped to model arbitrarily shaped metasurfaces, but they cannot be directly used if the metasurface is time-varying, active, or nonlinear. FDTD can be used in such cases [25], [26], [27], [28], but since it uses orthogonal grids for spatial discretization, it loses its efficiency and/or accuracy when the computation domain includes complex geometries with details, curved surfaces, and so on. The finite-element timedomain (FETD) method can easily model arbitrarily shaped metasurfaces and account for time variation and nonlinearities. However, it calls for the solution of a global (but sparse) matrix system at each time step. This increases FETD's computational cost and makes its parallelization on distributed memory computer clusters a nontrivial task [32]. In this context, the discontinuous Galerkin time-domain (DGTD) method combines advantages of FDTD and FEM [33], [34], [35], [36], [37], [38], [39]. Like FDTD, it can account for nonlinearities This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ and has a very high-parallel efficiency; meanwhile, like FEM, it can model arbitrarily shaped geometries using unstructured meshes. These properties render DGTD an ideal candidate for simulating space/time-varying and nonlinear metasurfaces with complex shapes. Even for planar metasurfaces, DGTD might be preferred if other arbitrarily shaped scatterers are present in the computation domain. It is worth mentioning here that a 2-D or 3-D FETD method that can account for GSTCs has not been reported in the literature, and FDTD implementations have only been limited to 2-D problems [25], [26], [27], [28].
In this work, GSTCs are incorporated into a DGTD scheme to efficiently simulate metasurfaces in 3-D electromagnetic problems. The numerical flux for GSTCs is derived for the first time using the Rankine-Hugoniot jump conditions. The resulting expression includes the time derivatives of the electromagnetic fields averaged across the discontinuity introduced by the metasurface. This makes the explicit time integration schemes, such as leap-frog [40], multistep [41], and Runge-Kutta [34], [39], which are often used with the traditional DGTD scheme, unstable in the presence of a metasurface mathematically modeled using GSTCs. To overcome this problem, a new time marching scheme, which solves a local matrix system for the unknowns of the elements touching the same GSTC face, is developed. The accuracy of the resulting locally implicit DGTD method is validated against analytical solutions. Numerical examples involving space/time-varying and curved metasurfaces are provided to demonstrate the applicability of the proposed method. Examples also show that even with the inclusion of the locally implicit time updates, the proposed method maintains the high-parallel efficiency of the traditional explicit DGTD schemes.
It should be noted here that the idea of representing physical structures using boundary conditions enforced on a zero-thickness sheet has been explored before. For example, impedance boundary conditions (IBCs) have been incorporated into DGTD to permit computationally efficient modeling of thin conductive layers [42], [43]. Having said that, thin conductive layers and metasurfaces have different physical properties, and IBCs are quite different from GSTCs. As a result, the numerical flux expressions for GSTCs are also different from those of IBCs [42], [43]. As briefly explained above, the numerical flux for GSTCs includes time derivatives of the fields averaged across the GSTC surface. Therefore, the time integration schemes developed in [42] and [43] cannot be directly used for a DGTD scheme that implements GSTCs.
The rest of this article is organized as follows. Section II provides the formulation underlying the proposed method, which includes the derivation of the numerical flux for GSTC and description of the locally implicit time marching scheme. In Section III, first, the accuracy of the proposed method is validated against analytical solutions, and then, its applicability to the simulation of space/time-varying and curved metasurfaces is demonstrated with several examples. Section III also presents an example that shows that the proposed method maintains its high-parallel efficiency. Finally, Section IV concludes this article and provides some remarks on future research directions.

A. Generalized Sheet Transition Conditions
GSTCs describe the "jumps" in the electromagnetic fields across the two sides of an arbitrarily shaped metasurface as [20] Here, E(r, t) and H(r, t) are the electric and the magnetic field intensities, respectively, ε 0 and μ 0 are the permittivity and the permeability in free space, respectively, the jump operator is defined as [[ f ]] = f − − f + , superscript "+" or "−" attached to a variable means that the variable is on the exterior (+) or the interior (−) side of the metasurface,n(r) is the normal unit vector pointing from the interior side to the exterior side, subscript "⊥" or " " attached to a variable means that only the normal [alongn(r)] or the tangential [transverse ton(r)] component of that variable is considered, and finally, P(r, t) and M(r, t) are the electric and the magnetic polarization densities. Hereinafter, to simplify the notation, the explicit space (r) and time (t) dependencies of the variables are dropped unless they are needed for clarification. The electric and the magnetic polarization densities are expressed in terms of electric and magnetic field intensities using the constitutive relations [20] where χ ab , a, b ∈ {e, h}, are susceptibility and the average operator is defined as For the sake of simplicity and without loss of generality, it is assumed that the metasurface is mono-anisotropic (χ eh = χ he = 0) and uniaxial (χ aa , a ∈ {e, h}, are diagonal matrices with entries χ xx aa , χ yy aa , and χ zz aa ), and M ⊥ = 0 and P ⊥ = 0 [20], [27]. With these simplifications, inserting the tangential components of the polarization densities from (2) into (1) yields [27]n It is assumed here that χ ab are time-varying but not dispersive [25], [26], [27], [44], [45]. The proposed method can easily be extended to simulate dispersive metasurfaces using the well-known auxiliary differential equation method that accounts for susceptibility dispersion models in the time domain [39], [42], [46], [47], [48].

B. Numerical Flux for GSTCs
The derivation of the numerical flux for GSTCs starts with the conservation form of the Maxwell equations [34] Q∂ t U + ∇ · F (U) = 0 ( 4 ) Fig. 1. Schematic of the Rankine-Hugoniot jump condition on a discontinuity/interface. The normal vector of the interface is along the x-direction. For the discontinuity across a metasurface described by GSTCs, U * and U * * are given by (11). where Here, ε and μ are the material permittivity and permeability, respectively, and F ν is the component of F along the direction ν ∈ {x,ŷ,ẑ}. Consider the Riemann problem and let Q + and Q − represent the material properties on the exterior (+) and the interior (−) sides of a discontinuity. Then, the conservation of U across this discontinuity yields the Rankine-Hugoniot jump conditions [34], [49] where A is the flux operator defined bŷ n is the unit normal vector of the interface and points from "−" side to "+" side, ± = di ag{λ ± 1 , . . . , λ ± 6 }, λ ± 1,4 = −c ± , λ ± 2,5 = c ± , and λ ± 3,6 = 0 are the eigenvalues of (Q ± ) −1 A, c ± = 1/(ε ± μ ± ) 1/2 are characteristics mode speeds, and U * and U * * are intermediate modes connecting the characteristic modes U − and U + (see Fig. 1) [34], [49]. Inserting the expressions of the operators defined in (5) and (7) and the eigenvalues defined above into (6) yieldŝ where Z ± = (μ ± /ε ± ) 1/2 and Y ± = (ε ± /μ ± ) 1/2 . For a simple material interface (i.e., μ or ε or both are discontinuous across the interface), the tangential continuity of the fields yieldsn which provides the well-known expressions of the upwind flux as [34] The jumps in the fields due to GSTCs [as described in (3)] are enforced aŝ Inserting the expressions of the jumps in (11) into (8) yieldŝ Then, the expressions of the numerical flux for GSTCs are obtained from the expressions of E * and H * as Here, F H-up and F E-up are the upwind flux given by (10) and ∂ t F H-GS and ∂ t F E-GS are the new numerical flux terms due to GSTCs. Their expressions are given as , and γ HE = ε 0 χ ee /(2{{Y }}) are tensor-valued coefficients. Note that while deriving (14), Unlike the numerical flux expressions for simple interfaces [e.g., those of upwind flux as shown in (10)], the numerical flux for GSTCs as shown in (13) includes time derivatives of the fields averaged across the sides of the metasurface. These derivatives should be accounted for by the time marching scheme that integrates Maxwell equations (4) in time. This time marching scheme is described next.

C. Time Marching Scheme
To facilitate the numerical solution, the computation domain is discretized into K tetrahedral elements with volumetric support k and boundary surface ∂ k , k = 1, . . . , K [34], [50]. It is assumed that the permittivity and the permeability in element k are constant and represented by k and μ k . In element k, testing (4) using the Lagrange polynomials i (r) [34] and applying the divergence theorem twice yield the strong form (15) Here, i = 1, . . . , N p , N p = ( p + 1)( p + 2)( p + 3)/6 is the number of interpolation nodes, and p is the order of the Lagrange polynomials. Let E ν and H ν represent the components of the fields in element k along the direction ν ∈ {x,ŷ,ẑ}. These six field components are expanded using the Lagrange polynomials i (r) as where r i , i = 1, . . . , N p , are the interpolation nodes and vectorsĒ ν k (t) andH ν k (t) store the unknown time-dependent expansion coefficients to be solved for.
Inserting the expansion in (16) into (15) yields the semi-discrete system as T store the unknown coefficients andF H k, f andF E k, f represent the vectors of the numerical flux on face f of element k. In (17), Here, ∂ k, f represents the surface of face f of element k.
Using the expressions of the numerical flux from (13) in (17) yields Here,F H-up k, f andF E-up k, f are vectors of the upwind flux and their treatment in time marching is the same as that in traditional DGTD schemes and is not described in detail here. Also, note Assume that face g of element k overlaps the metasurface. This face coincides with face g of element k , which neighbors element k via the metasurface (see Fig. 2). In this scenario, using (14), one can express the numerical flux vectors associated with GSTCs,F H-GS Here, vectorsẼ kk = (Ē k,g +Ē k ,g )/2 andH kk = (H k,g + H k ,g )/2 store the average of the coefficients on face g of element k and face g of element k ,n k,g is the unit normal vector of face g pointing from element k to element k , and γ ab k,g , a, b ∈ {E, H}, represent the values of the (timedependent) tensor coefficients γ ab [see (14)] on face g.
Approximating the time derivatives in (18) using forward finite differences and inserting (19) in the resulting expressions yield the time-update equations as In (20) and the rest of the text, superscript n attached to a variable means that the variable is sampled at time step n, i.e., f n = f (n t), where t is the time step size. Note thatẼ n+1 kk andH n+1 kk store unknowns that are associated with elements k and k and sampled at time (n + 1) t. Re-arranging the terms that involveĒ n+1 k ,H n+1 k ,Ē n+1 k , and H n+1 k in (20) yields a matrix system that is written in a compact form as The expressions for the blocks of A n k and B n k are given by (30) in the Appendix and the expressions of the column vectors ofC n k are given as Similarly, a matrix system is obtained for the field expansion coefficients in element k (Fig. 2) as where the expressions of the matrices and the vectors are the same as those in (30) and (24) except that the indices k and g are replaced with k and g , respectively.
Matrix equations (21) and (25) (21) and (25) as Here During time marching, at time step n, first, A n k and A n k are computed using (22) and (30) (in the Appendix), respectively, andC n k andC n k are computed using (23) and (24), respectively. Then,Ā n k ,Ā n k ,B n k , andB n k are constructed and the matrix equations in (26)  is updated using (20) without the terms in the curly brackets (terms corresponding to the numerical flux for GSTCs). Consequently, the time marching becomes locally implicit where only the unknowns associated with elements that touch the metasurface are obtained by solving element-level matrix systems. Also, note that the time step size t of this time marching scheme is restricted by the Courant-Friedrichs-Lewy (CFL) condition just like the traditional explicit DGTD methods [34], [39], [40].
Again just like these traditional schemes, the locally implicit time marching algorithm can be parallelized efficiently. A n k , B n k , andC n k and A n k , B n k , andC n k are computed locally by the message passing interface (MPI) processes assigned to elements k and k , respectively. If these processes are different (i.e., reside on different processors), then A n k , B n k , andC n k and A n k , B n k , andC n k are communicated between these processors to enable the computation ofĀ n k andB n k andĀ n k andB n k . Then, each processor locally solves the matrix equations in (26). The MPI communications associated with the explicit part of time marching (for elements that do not touch the metasurface) remain the same as those in traditional explicit DGTD schemes without GSTCs. As shown by the numerical results presented in Section III, for a 3-D computation domain, the number of elements that touch the metasurface is usually small compared to the total number of elements, and therefore, the computational cost of locally implicit part of time marching is relatively low.
Note that, approximating the time derivative terms ∂ t F H-GS and ∂ t F E-GS on the right-hand side of (18) using backward finite differences and approximating the time derivative terms ∂ t E k and ∂ t H k on the left-hand side of (18) yield an explicit time marching scheme. However, numerical experiments presented in Section III-A show that this scheme is not stable.
The locally implicit scheme described in this section is a first-order scheme (in time). One can easily modify this scheme to yield a leap-frog-type time integration method. However, further generalizations to higher order schemes, such as Runge-Kutta methods [34], are not trivial and are subjects for future research.
One last remark about the formulation described in this section is in order. The nodal basis functions used to discretize the fields in space can be replaced by the vector basis functions used in [32], [42], [43], [52], and [53], but the numerical flux expressions and the resulting locally implicit time marching scheme would essentially stay the same.

A. Planar Metasurfaces
First, the proposed method is validated against analytical solutions. Consider a mono-anisotropic, uniaxial, and time-independent planar metasurface located on the z = 0 plane in free space. A plane wave with electric field E inc (r, t) =x E 0 cos [ωt − k 0 (z − z 0 )] is normally incident on the metasurface. Here, E 0 is the amplitude of the electric field, ω = 2π f , f is the frequency of excitation, and k 0 = ω/c 0 and c 0 = 1/(μ 0 ε 0 ) 1/2 are the wavenumber and the speed of light in free space, respectively. Note that the phase of the plane wave is zero at z = z 0 . In the following examples, the (synthesized) susceptibilities are given as [20], [21]: where R and T are the reflection and the transmission coefficients of the electric field on the metasurface, respectively. Solving (27) for T and R yields Let χ xx ee = χ yy hh = χ, and then, R = 0, i.e., the metasurface becomes reflectionless. Furthermore, let χ = αc 0 /( j ω), and then, T = (2 − α)/(2 + α) (in addition to R = 0). For α = 2, T = 0, i.e., the metasurface becomes reflectionless and fully absorbing. For α = 2/3, T = 0.5, i.e., the metasurface is reflectionless and partially absorbing.
The proposed locally implicit DGTD scheme is used to numerically investigate the above cases. The dimensions of the computation domain are L x = 6 m, L y = 1.5 m, and L z = 18 m along the x-, y-, and z-directions, respectively. PBCs [54], [55] are used along the x-and y-directions and perfectly matched layers (PMLs) [52], [56], [57] are used along the z-direction. The excitation parameters are f = 100 MHz, E 0 = 1 V/m, and z 0 = −6 m. The average edge length of the elements in the mesh used to discretize the computation domain is 0.25 m (λ 0 /11.99, where λ 0 is the free-space wavelength at f = 100 MHz), p = 2, and t = 26.04 ps. Fig. 3(a) and (b) shows E x (the x-component of the electric field) computed using the proposed locally implicit DGTD scheme at t = 80 ns in the whole computation domain for χ = 1 and χ = 5, respectively. One can observe different phase jumps between the fields on both sides of the metasurface when different susceptibility values are used. Fig. 3(c) compares E x computed using the locally implicit DGTD scheme and the analytical expression from [20] and [21] at (0, 0, 1.5 m). Excellent agreement is observed between the analytical and numerical results.
Assume that ∂ t F H-GS and ∂ t F E-GS in (18) are approximated using backward finite differences. As briefly explained in Section II-C, this yields an explicit DGTD scheme. Fig. 3(d) and (e) shows E x computed using the locally implicit DGTD scheme, the explicit DGTD scheme, and the analytical expression from [20] and [21] on the z = 0 + side of the metasurface for χ = 0.0530 and χ = 0.0554, respectively. For the larger value of χ, the solution obtained by the explicit scheme becomes unstable even though it matches well with the analytical solution at early times. This is because large values of χ introduce large phase jumps between the fields on both sides of the metasurface and the explicit scheme becomes unstable no matter how small the time step size is. Further tests show that, for this problem, the largest value of χ that the explicit scheme can accommodate is 0.055, which corresponds to a phase angle of 4.9 • in T . Table I shows the value of the largest time-step size t max that ensures the stability of this explicit scheme with and without GSTCs and the locally implicit scheme (with GSTCs) for different values of χ. All other simulation parameters are kept the same for all three schemes and are the same as those described above. "Without GSTCs" refers to the simulation where GSTCs are removed from the computation domain, while everything else is kept the same. Table I also shows that t max of the locally implicit scheme is the same as that used by the explicit scheme without GSTCs and is independent of χ.
Next, the case where χ = −αc 0 /( j ω) is considered. For this particular selection of χ xx ee and χ yy hh , the time derivatives in (3) are canceled [27] and ∂ t F H-GS and ∂ t F E-GS in (18) are replaced by F H-GS and F E-GS , respectively. Note that in this updated formulation, one should replace expressions of γ EE , γ EH , γ HH , and γ HE by −Z + ε 0 αc 0 /(2{{Z }}), −μ 0 αc 0 /(2{{Z }}),  [20] and [21] at (0, 0, 1.5 m). E x computed using the locally implicit DGTD scheme, the explicit DGTD scheme, and the analytical expression from [20] and [21]  which involve only the averaged quantities from the previous time step n. Consequently, the time-update equation in (20) becomes explicit. Fig. 4(a) and (b) shows E x computed using this explicit DGTD scheme at t = 80 ns in the whole computation for α = 2 and α = 2/3, respectively. As expected, for α = 2, there is no reflection or transmission at the metasurface, and for α = 2/3, there is no reflection, but the amplitude of the fields transmitted through the metasurface is halved.

B. Space/Time-Varying Metasurface
In this example, the nondispersive space/time-varying metasurface described in [27] is considered. It is assumed that the response of this metasurface to electromagnetic fields is instantaneous and the time dependence of χ ee and χ hh reflects the real-time control of the effective susceptibilities of the metasurface [27], [44], [45], [58], [59]. Entries of χ ee and χ hh are expressed as , ν ∈ {x, y, z} (29) where  time. At the beginning of each time period, the metasurface is fully absorbing everywhere on its surface. As time approaches to the middle of the period, the center of the metasurface gradually becomes half-transmitting, while the two ends of the metasurface (along the x-direction) stay fully absorbing. The metasurface is located on the z = 0 plane. The dimensions of the computation domain are L x = 3.15 m, L y = 0.04 m, and L z = 6.15 m along the x-, y-, and z-directions, respectively. PBCs [54], [55] are used along the y-direction and PMLs [52], [56], [57] are used along the x-and z-directions. The metasurface is excited by the fields generated by an aperture source located on the z = 2.85 m plane. The electric and magnetic current densities impressed on this aperture are given by J s =x J 0 sin(2π f t)e −x 2 /0.02 and M s =ŷ J 0 Z 0 sin(2π f t)e −x 2 /0.02 , where Z 0 = (μ 0 /ε 0 ) 1/2 is the wave impedance in free space, J 0 = 1 A/m 2 , and f = 2 GHz. The average edge length of the elements in the mesh used to discretize the computation domain is 0.0125 m (λ 0 /12, where λ 0 is the free-space wavelength at f = 2 GHz), p = 2, and t = 0.439 ps. Fig. 5(a) shows E x computed using the locally implicit DGTD scheme at t = 28 ns in the whole computation domain. One can observe that the fields transmitted to z < 0 region are stronger near the x = 0 plane, i.e., the center of the metasurface. This agrees with the specified properties of the metasurface: it switches between totally absorbing and half-transmitting near its center, while it is more absorbing toward the ends (along the x-direction). Even though the source is monochromatic, the fields transmitted to z > 0 region show pulsed patterns with wideband characteristics (modulated with a lower frequency). This is because ∂ t (χ ee {{E}}) and ∂ t (χ hh {{H}}) in (3) generate new frequency components when χ ee and χ hh are time-dependent [27]. Fig. 5(b) shows E x computed using the locally implicit DGTD at (0, 0, 0 − ) and (0, 0, 0 + ). The modulation effect introduced by the metasurface is also visible in this figure. The results presented in Fig. 5 agree well with the results obtained using a 2-D FDTD method as reported in [27].

C. Curved Metasurfaces
In this section, first, curved counterparts of the examples presented in Section III-A are considered. Planar metasurfaces are replaced with curved metasurfaces, while all other simulation parameters are kept the same. In the examples considered here, the curved surface is described by z = 0.75 sin(2π x/L x ). This selection ensures that the surface curvature is larger than the wavelength of the excitation frequency and the synthesis technique used for normally incident plane waves (as done in Section III-A) can still be used here [11], [20], [21]. Fig. 6(a) and (b) shows E x computed at 80 ns in the whole computation domain for χ = 5 (reflectionless metasurface) and χ = −2 c 0 /( j ω) (fully absorbing metasurface), respectively. As expected, the curved metasurfaces perform similar to their planar counterparts. In Fig. 6(a), one can see that the fields are totally transmitted and they retain the wavefront of a plane wave. In Fig. 6(b), it is clearly shown that the fields are fully absorbed as soon as their wavefront touches the metasurface. Next, a more complex scattering scenario is considered. The computation domain is shown in Fig. 7(a). It includes two cylindrical dieclectric scatterers (with relative permittivity 3.0) and one cylindrical perfect electrically conducting (PEC) scatterer. Each dielectric scatterer is surrounded by a reflectionless and partially absorbing metasurface (similar to the examples in Fig. 4) [52], [56], [57] are used along the x-and z-directions and PBCs [54], [55] are used along the y-direction. The incident fields are generated by an aperture source that is centered at (11.21 m, 0, 11.21 m) and makes an angle of 45 • with x-and z-axes. The electric and magnetic current densities impressed on this aperture are given by J s = −ŷ J 0 sin(2π f t)G(x, z) and M s = (−x sin α +ẑ cos α)J 0 Z 0 sin(2π f t)G(x, z), where G(x, z) = e −[(x−11.21) 2 +(z−11.21) 2 ]/144 , α = 45 • , J 0 = 1 A/m 2 , and f = 100 MHz. The average edge length of the elements in the mesh used to discretize the computation domain is 0.375 m (λ 0 /8, where λ 0 is the free-space wavelength at f = 100 MHz), p = 2, and t = 18.20 ps. Fig. 7(a) shows E y (the y-component of the electric field) computed at 120 ns in the whole computation domain. For comparison, the same scattering problem without the metasurfaces is simulated. All simulation parameters are kept the same, only GSTCs are removed. The corresponding result is shown in Fig. 7(b). These figures clearly show the properties assigned to each metasurface. The incident fields are transmitted through the two partially absorbing metasurfaces with reduced amplitudes. The transmitted field is stronger on the right-bottom scatterer as the transmittance on the surrounding metasurface (T = 0.75) is larger than that of the left-top one (T = 0.25). On the other hand, the field patterns on both scatterers are almost the same (except the field amplitude) due to the symmetry of the problem. On the right-top metasurface, the incident field is fully transmitted but with a phase jump as expected (see Section III-A). The fields scattered away from the PEC cylinder also show the same phase jump while being transmitted through the surrounding metasurface.

D. Computational Efficiency
As discussed in Section II-C, updating the unknown field coefficients of the discretization elements that touch the metasurface calls for solving the element-level matrix systems in (26). This operation is computationally more expensive than doing explicit time updates for unknown coefficients of the elements that do not touch the metasurface. However, since the metasurface is a 2-D structure in a 3-D computation domain, the number of elements that touch the metasurface is often much smaller than the total number of elements. For instance, in the last example in Section III-C, only 9 264 elements touch the metasurface, while the total number of the elements in the computation domain is 1 005 767.
On a shared-memory system, where the workload is allocated to all processor cores, the additional computational cost introduced by the metasurface does not significantly change the computation time and memory requirement. For the example in Fig. 7, the wall time and the peak memory required to complete all the operations in one time step on a single CPU core (Intel 1 Xeon 1 E5-2680 v4 processor) are 4.44 s and 4.05 GB, respectively. These numbers reduce to 3.47 s and 3.89 GB, respectively, when the metasurface is removed.
On a distributed-memory system, if all elements are evenly distributed between all processes, the processes with the elements that touch the metasurface would have heavier workloads. Clearly, to balance the workload, one should assign a smaller number of elements to these processes. This can be achieved by simply assigning a larger partition weight to each element that touches the metasurface [60]. For the example shown in Fig. 7(a), let w GSTC denote the partition weight for the elements touching the right-top metasurface. The reflectionless and partially absorbing metasurfaces (the lefttop and right-bottom ones) are accounted for using an explicit scheme (see Section III-A). The workload of updating the elements touching these metasurfaces is smaller than that of the elements touching the right-top metasurface. The partition weight assigned to the elements touching the left-top and rightbottom metasurfaces is the same as that of the elements in the PML region and is denoted by w PML . The weight of all other elements in the computation domain is denoted by w. Fig. 8(a) shows the MPI partitions for the example in Fig. 7(a) when 24 CPU cores are used and w GSTC = 100, w PML = 2, and w = 1. One can see that the numbers of elements in the partitions that touch the right-top metasurface are smaller. Fig. 8(b) and (c) shows the wall time required to complete all operations in one time step and the parallel efficiency for different values of w GSTC , respectively. The label "Without GSTCs" in this figure refers to the simulation where the metasurfaces are removed and the time integration is fully explicit. All simulations required to generate Fig. 8(b) and (c) are executed on Cray XC40 Shaheen-II (https://www.hpc.kaust.edu.sa). Each computing node has 128 GB memory and two 16-core Intel Haswell CPUs running at 2.3 GHz. The computing nodes are connected with the Cray Aries High Speed Network. Note that the wall time required by 32 cores (from a single computing node) is used as the baseline to calculate the parallel efficiency.
For the locally implicit scheme,H n k andĒ n k are communicated between MPI partitions, but A n k , B n k , andC n k are communicated only when metasurface coincides with the MPI partition boundary. Therefore, the MPI communication graph is more complicated than the traditional explicit DGTD schemes without GSTCs. Fig. 8(b) shows that the overall computation time for the simulations with GSTCs is slightly larger than that of the simulation without GSTCs. Nevertheless, with the locally implicit scheme, the computation time decreases almost linearly with the increasing number of CPU cores. The measured parallel efficiency is very high, which is around 94% for w GSTC = 50 and around 97% for w GSTC = 100.

IV. CONCLUSION AND DISCUSSION
In this work, GSTCs are incorporated into DGTD to efficiently simulate metasurfaces. The numerical flux for GSTCs is derived for the first time using the Rankine-Hugoniot jump conditions. This numerical flux includes the time derivatives of the fields averaged across the sides of the metasurface. These terms make the explicit times marching schemes used by the traditional DGTD schemes unstable. To alleviate this bottleneck, a new marching scheme, which solves an element-level matrix system for the unknowns of the elements that touch the metasurface, is developed. The resulting locally implicit DGTD scheme is stable and maintains the high-parallel efficiency of the traditional explicit DGTD methods. The accuracy of this new scheme is validated against analytical solutions and its ability to simulate space/time-varying and arbitrarily shaped metasurfaces is demonstrated by numerical results.
In many applications of metasurfaces, such as beam steering and cloaking, the synthesized susceptibilities of GSTCs are usually frequency-dependent. This strong dispersion usually means that GSTC is also strongly dissipative in the relevant frequency range. Extension of the DGTD scheme developed in this work, which will efficiently and accurately account for dispersive and dissipative metasurfaces, is underway.  (21) and (22) n ν k,g represents the component ofn k,g along the direction ν ∈ {x,ŷ,ẑ}, γ ab,νη k,g , a, b ∈ {E, H}, μ, η ∈ {x, y, z} is the νη component of the tensor γ ab k,g , and I and O are identity and zero matrices. Note that F k,g has the same matrix entries as F k,g but with its columns reorganized such that it operates on the unknowns of element k , i.e., if node j of element k is connected to node j element k , column j of F k,g is equal to column j of F k,g [61].