Deep Learning for the Design of Toroidal Metasurfaces

In recent years, toroidal dipoles have had a profound impact on several fields, including electromagnetism. However, the on-demand design of toroidal metasurfaces is still a very time-consuming process. In this paper, a deep learning method is proposed in which neural networks model the nonlinear relationship between the structural parameters of a metasurface and its multipole scattered powers. The forward network quickly predicts the scattered powers from the input structural parameters with an accuracy comparable to that of electromagnetic simulations. In addition, given a required scattering spectrum as input, the inverse network automatically calculates and outputs appropriate structural parameters, achieving a low mean square error of 0.074 on the training set and 0.18 on the test set. Compared with the conventional design process, the proposed deep learning model can guide the design of toroidal dipole metasurfaces faster and pave the way for the rapid development of toroidal metasurfaces.

Fig. 1. Schematic diagram of the TD and the toroidal metasurface. Poloidal currents flowing on the surface of a torus along its meridians create a toroidal dipole moment T, which can be seen as a closed loop of magnetic dipoles arranged head-to-tail. A metal disc with a dumbbell-shaped aperture is the structural element of the toroidal metamaterial.

exhibit that the TD scattered power has been clearly enhanced to a measurable range while the scattered power of the ED is strongly suppressed [5], [6]. In 2010, the resonant excitation of a TD in a metamaterial was first demonstrated [7]. Since then, TD metasurfaces have attracted the attention of many researchers and have been applied in a variety of fields, such as electromagnetically induced transparency, highly sensitive sensors, modulators and absorbers [8], [9], [10], [11], [12], [13].
However, designing metasurfaces with a TD resonance is often difficult. Traditional design methods for TD metasurfaces usually rely on electromagnetic simulation software such as HFSS or CST Microwave Studio, but the TD scattered power is not directly available in such software, and further calculations are required for each candidate structure. In addition, the conventional design approach involves a continual trial-and-error search over a large number of candidate metasurfaces. Whenever the candidate is changed, the electromagnetic simulation and the calculation of the scattered power must be performed again, and these repetitive calculations require significant computational resources and designer effort. Therefore, an important challenge in designing TD metasurfaces is to obtain their multipole scattering characteristics directly from the given structural parameters, so that the TD resonance can be evaluated quickly. In addition, the TD response depends heavily on the shape and size of the resonators. Conventional methods for designing and optimizing metasurfaces with a TD response still rely on the intuition, working experience and expertise of researchers. How to quickly obtain the corresponding structure for a given multipole scattered power is therefore a worthwhile research question.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Deep learning, currently a popular numerical computation method, demonstrates a superior ability to learn complex relationships between input and output data. In principle, sufficiently large neural networks can approximate a broad class of nonlinear functions, and deep learning has been widely used in areas such as pattern recognition, image processing, robotics, and data mining. In recent years, deep learning has been introduced to many physical systems, such as plasmonic nanostructure design [14], [15], [16], digital coding metasurfaces [17], [18], intelligent metasurfaces [19] and other systems [20], [21], [22], [23], [24], [25]. Deep learning algorithms can learn the correlation between the structural parameters and the electromagnetic response of a material, replacing the traditional time-consuming electromagnetic simulation and avoiding the complicated process of solving Maxwell's equations [26].
In 2018, J. Peurifoy et al. [27] proposed an artificial neural network (ANN) that can approximate light scattering from multilayer nanoparticles. The ANN takes the thickness of each nanoparticle layer as input and, after training, is able to output the transmission spectrum with an average error of less than 3% at each point. It is worth noting that the ANN is fixed after training: even if the structural parameters of the nanoparticles are changed, the network does not need to be retrained. In addition, the training time of the ANN is only 50 seconds, and the prediction accuracy can exceed 95%. Compared with traditional electromagnetic simulation, the ANN is more efficient, which largely reduces the time cost for researchers.
In 2019, S. So et al. [28] proposed a neural network with a specially devised loss function to learn the correlation between the ED and MD extinction spectra and the design of core-shell nanostructures. The neural network, which takes both material information and structural parameters as the inverse design output, achieved the on-demand design of independent ED and MD resonances. In 2020, L. Xu et al. [29] demonstrated the inverse design of a high quality factor (Q) metasurface consisting of two nanorods using deep learning, and the proposed tandem network can obtain high-Q resonances with specific desired spectral locations, linewidths, and transmission amplitudes. However, to the best of our knowledge, the inverse design relating the structural parameters and scattering spectra of TD metasurfaces has not been demonstrated.

In this paper, a data-driven forward neural network (FNN) is applied to learn the nonlinear relationship between the structural parameters and the ED, MD and TD scattered powers from a small dataset. As with conventional simulations, the FNN is able to predict the scattered powers of metasurfaces from the design parameters. In addition, a reverse neural network (RNN) is proposed for the fast design of TD metasurfaces, which can directly obtain an appropriate metasurface from a given spectrum of ED, MD and TD scattered powers. To demonstrate the predictive capability of the proposed method, metasurface designs with arbitrary TD resonances are presented, and the corresponding surface current and magnetic field distributions are simulated to further verify the TD responses. Compared with the traditional design method of parametric scanning and iterative verification, this method saves designers a great deal of time by eliminating the complex design and optimization process, and provides an efficient way to design TD metasurfaces on demand.

II. DESIGN GOAL AND SETUPS
To demonstrate the feasibility of deep learning for TD metasurfaces, a randomly selected target structure, consisting of a disc resonator with a dumbbell-shaped aperture etched out of it, is shown in Fig. 1. The disc resonator is fabricated from a copper sheet of 0.035 mm thickness and affixed to a dielectric substrate with relative permittivity $\varepsilon_r = 6.5$. The periodicity of the meta-atom ($P_x = P_y$) is fixed at 18 mm, and the geometric parameters $r_1$, $r_2$, $a$ and $d$ of the resonator are selected as the independent variables of this structure. Because the period of the structure and the shape of the resonator are fixed, the parameters of the metal resonator are physically limited to the ranges shown in Table I. In addition, as shown in Fig. 1, two conditions are necessary to stay within the physical limits of the metasurface: $r_1 > 2r_2 + a$ and $d < 2r_2$.
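The two geometric conditions above lend themselves to a simple validity filter applied before any simulation is launched. The following is a minimal sketch under stated assumptions: the parameter order (r1, r2, a, d), the function names and the sampling ranges are illustrative and hypothetical, and the ranges in particular are not the values of Table I.

```python
from itertools import product


def is_valid_geometry(r1, r2, a, d):
    """Return True if the parameters (in mm) satisfy r1 > 2*r2 + a and d < 2*r2."""
    return r1 > 2 * r2 + a and d < 2 * r2


def frange(start, stop, step):
    """Equally spaced values from start to stop (inclusive)."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 6) for i in range(n + 1)]


# HYPOTHETICAL sampling ranges in mm, chosen only for illustration.
# They are NOT the ranges listed in Table I of the paper.
R1_VALUES = frange(5.0, 8.5, 0.5)
R2_VALUES = frange(0.5, 3.0, 0.5)
A_VALUES = frange(0.0, 6.0, 1.0)
D_VALUES = frange(0.5, 3.5, 0.5)


def candidate_geometries():
    """Enumerate the equally spaced grid and keep only physically valid points."""
    grid = product(R1_VALUES, R2_VALUES, A_VALUES, D_VALUES)
    return [g for g in grid if is_valid_geometry(*g)]
```

Filtering the grid this way means that no simulation time is spent on geometries in which the dumbbell aperture would not fit inside the disc.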
When a linearly y-polarized beam is normally incident on the metasurface, the conduction current density for different structures can be extracted from the electromagnetic simulation results as the variables are changed. The scattered powers of the TD metasurface can then be obtained according to multipole scattering theory [30].
The radiated powers of the multipoles can be calculated from the following formulas: where the moments of the conventional ED and MD are represented by P and M, respectively, and their values can be obtained from the following equations: The third term contains the TD moment (T), which takes the following form: The fourth term expresses the radiated power of the electric quadrupole ($Q_e$), and the fifth term is derived from the magnetic quadrupole ($Q_m$). The last term indicates the higher-order correction. In all the above equations, c represents the speed of light, j is the current density and r is the position vector from the origin to the point (x, y, z). To reduce the dimensionality and simplify the design process, only the three main multipole scattering contributions, ED, MD and TD, are considered in this study.
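For orientation, a form of the multipole expansion that is commonly used in the toroidal metamaterial literature is sketched below. It is given as an assumption consistent with the term order described above (ED, MD, TD, electric quadrupole, magnetic quadrupole, higher-order correction), and not necessarily as the exact expressions used in this work. In particular, the Gaussian-unit convention, the angular frequency symbol $\omega$, the tensor indices $\alpha, \beta$ and the quadrupole prefactors are assumptions, and the prefactors vary between references.

```latex
\begin{align}
I &= \frac{2\omega^{4}}{3c^{3}}\,|\mathbf{P}|^{2}
   + \frac{2\omega^{4}}{3c^{3}}\,|\mathbf{M}|^{2}
   + \frac{2\omega^{6}}{3c^{5}}\,|\mathbf{T}|^{2}
   + \frac{\omega^{6}}{5c^{5}}\sum_{\alpha\beta}\bigl|Q^{(e)}_{\alpha\beta}\bigr|^{2}
   + \frac{\omega^{6}}{40c^{5}}\sum_{\alpha\beta}\bigl|Q^{(m)}_{\alpha\beta}\bigr|^{2}
   + O\!\left(\frac{1}{c^{5}}\right), \\
\mathbf{P} &= \frac{1}{i\omega}\int \mathbf{j}\,\mathrm{d}^{3}r, \qquad
\mathbf{M} = \frac{1}{2c}\int \left(\mathbf{r}\times\mathbf{j}\right)\mathrm{d}^{3}r, \\
\mathbf{T} &= \frac{1}{10c}\int \left[\left(\mathbf{r}\cdot\mathbf{j}\right)\mathbf{r}
   - 2r^{2}\,\mathbf{j}\right]\mathrm{d}^{3}r .
\end{align}
```

In this form, the first three terms of $I$ are the ED, MD and TD scattered powers that serve as the network targets in this study.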
To save sampling time, the structural parameters are sampled at equal intervals using a joint CST and Python simulation, and each scattered spectrum is discretized into 201 equally spaced points. In addition, the scattering spectra are processed with a logarithmic function to improve the predictive and inverse design capabilities of the networks. In total, 2330 samples are collected as the dataset for the FNN and RNN; each sample comprises four structural parameters and 201 spectral points for each of the ED, MD and TD scattered spectra over the frequency range from 4 to 20 GHz. Furthermore, 80% of the samples are randomly selected as the training set, while the remaining 20% are divided into a validation set (10%) and a test set (10%). The validation set is used to evaluate the network after each training epoch to avoid overfitting and to determine the best number of iterations. After training on the training set, the network is checked on the test set by comparing the loss between the true and predicted values.
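The preprocessing and splitting steps described above can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' pipeline: the base-10 logarithm, the small floor that guards against log(0), the random seed and all function names are assumptions.

```python
import math
import random


def frequency_grid(f_start=4.0, f_stop=20.0, n_points=201):
    """Equally spaced frequency points in GHz (4 to 20 GHz, 201 points)."""
    step = (f_stop - f_start) / (n_points - 1)
    return [f_start + i * step for i in range(n_points)]


def log_scale(spectrum, floor=1e-12):
    """Apply log10 to each scattered power.

    The base (10) and the floor value are assumptions; the paper only
    states that a logarithmic function is applied.
    """
    return [math.log10(max(p, floor)) for p in spectrum]


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Randomly split sample indices into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_samples * train_frac))
    n_val = int(round(n_samples * val_frac))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(2330)
```

With 2330 samples, this split yields 1864 training, 233 validation and 233 test samples.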

A. Forward Neural Network
To achieve forward prediction for the metasurface, we separately trained two network models, one with fully connected layers only and one with residual blocks. The fully connected neural network is constructed as shown on the left of Fig. 2, and the network with residual blocks is shown on the right of Fig. 2. The forward neural network (FNN) aims to learn the mapping from structural parameters to scattering properties (ED, MD and TD) by training with a small amount of data, so that the scattered powers can be accurately predicted for a given metasurface. Thus, the structural parameters are used as inputs to the FNN, while the three scattered powers of ED, MD and TD are the outputs. The entire FNN consists of fully connected (FC) layers, including an input layer, an output layer and six hidden layers.
At the same time, residual blocks from the deep residual learning framework are introduced as hidden layers to avoid the degradation problem that occurs with increasing network depth [31]. The number of neurons in each layer is set to the same value, 603, so that the input X of each shortcut connection matches the dimension of F(X). Each weight layer, except the output layer, is followed by a Rectified Linear Unit (ReLU) nonlinearity. During training, the network model is evaluated with a loss function, which expresses the difference between the predicted value and the true value. The smaller the difference, the better the fit of the network model. In this network, the loss function is defined as the mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(S_{\mathrm{pre}}^{(i)} - S_{\mathrm{sim}}^{(i)}\right)^{2}$$

where n is the number of samples in the calculation, $S_{\mathrm{pre}}$ is the scattered spectrum predicted by the neural network and $S_{\mathrm{sim}}$ is the true spectrum given during training. The Adam optimizer [32] is used in training to update the weights through error backpropagation so as to minimize the MSE. The training set is used to train the FNN model, and after training, the test set, which consists of samples never used in the previous training steps, is used to evaluate the model. Finally, we trained both network models with the same dataset, the same number of hidden layers and the same set of hyperparameters. The results after 600 epochs are shown in Table II. As can be seen from Table II, the training loss of the network with residual blocks is less than half that of the fully connected network, and its error on the test set is 0.025, which is also much lower than the 0.034 of the fully connected network, indicating that the network with residual blocks predicts the metasurface scattering spectrum better.
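The role of the equal layer width and of the MSE loss can be illustrated with a small pure-Python sketch. It follows the standard residual formulation of [31], y = ReLU(F(x) + x), which is assumed here rather than taken from Fig. 2. The function names are hypothetical, a toy width stands in for the 603 neurons of the paper, and the MSE is averaged over spectral points as well as samples, which is also an assumption.

```python
def relu(vector):
    """Element-wise Rectified Linear Unit."""
    return [max(0.0, x) for x in vector]


def linear(weights, bias, vector):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def residual_block(vector, w1, b1, w2, b2):
    """Compute ReLU(F(x) + x) with F = Linear -> ReLU -> Linear.

    The identity shortcut requires F(x) and x to have the same length,
    which is why every hidden layer uses the same width.
    """
    hidden = relu(linear(w1, b1, vector))
    f_x = linear(w2, b2, hidden)
    return relu([f + x for f, x in zip(f_x, vector)])


def mse(predicted, target):
    """Mean squared error averaged over all samples and spectral points."""
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    return total / count
```

When both weight layers are zero, the block reduces to ReLU(x): the input passes through the shortcut unchanged. This is the property that eases the optimization of deeper stacks.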
Moreover, the training process of the residual network takes only 12 minutes, compared with 25 minutes for the fully connected network, which makes the residual network more efficient. Specifically, the MSE on the training set can be minimized to 0.012, and the MSE on the test set reaches 0.025. The maximum MSEs on the training set and test set are 0.087 and 0.156, respectively. Therefore, the residual network was chosen as the FNN in this work for the prediction of the metasurface scattered spectra. These statistics indicate that the proposed FNN with residual blocks learns well from the dataset and properly predicts the scattered powers of ED, MD and TD from the geometrical parameters.
For a more intuitive illustration, a set of structural parameters of a TD metasurface is freely chosen, the corresponding scattering spectra are obtained both by FNN prediction and by numerical calculation in CST, and the results are compared in Fig. 3. The comparison shows that the FNN is well trained: for the given input structural parameters, it provides ED, MD and TD scattering spectra that are very similar to the simulation results. Compared with conventional methods for calculating electromagnetic multipole scattering, the proposed network, once trained, is able to predict the scattering spectra of ED, MD and TD within a few seconds. Therefore, this method can replace the traditional electromagnetic simulation and scattered power calculation, and the designer can obtain the excitation patterns of an arbitrary TD metasurface in a short time.

B. Reverse Neural Network
To solve the inverse design problem of TD metasurfaces, a reverse neural network (RNN) is constructed for fast design. As shown in Fig. 4, the RNN consists of five layers of processing units, and its main function is to automatically design a TD metasurface with a specific TD scattered power. The first three layers are composed of residual blocks, which are used for feature extraction from the scattered powers of ED, MD and TD. Each residual block contains three one-dimensional convolutional layers, one of which performs the linear projection in the shortcut connection, as shown in Fig. 4. The last two layers are fully connected layers, which reduce the output dimension to generate the final output data. The kernel size is 1 × 3, except for the convolutional layer in the shortcut connections, where the kernel size is 1 × 1. As before, each weight layer is followed by the ReLU activation function. The output of Layer3 is then flattened and fed to the fully connected layers, which reduce it from 250 dimensions to 4 dimensions to generate the corresponding structural parameters of the TD metasurface. The ReLU activation function is also applied to the output layer to ensure that the output is non-negative. The loss function is the MSE between the predicted structural parameters and the target structural parameters.
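The structure of one convolutional residual block, with two 1 × 3 convolutions on the main path and a 1 × 1 convolution as the shortcut projection, can be sketched in pure Python as follows. Stride 1 and zero "same" padding are assumptions, since the text does not state them, and the channel counts, function names and example values are hypothetical rather than read from Fig. 4.

```python
def relu2d(channels):
    """Element-wise ReLU over a list of channels."""
    return [[max(0.0, v) for v in row] for row in channels]


def conv1d(x, kernels, bias):
    """1-D convolution with stride 1 and zero 'same' padding (assumed).

    x:       list of input channels, each a list of equal length
    kernels: kernels[out][in] is a list of taps with odd length
    bias:    one bias value per output channel
    """
    length = len(x[0])
    out = []
    for k_out, b in zip(kernels, bias):
        half = len(k_out[0]) // 2
        row = []
        for pos in range(length):
            acc = b
            for channel, taps in zip(x, k_out):
                for j, w in enumerate(taps):
                    idx = pos + j - half
                    if 0 <= idx < length:  # positions outside count as zero
                        acc += w * channel[idx]
            row.append(acc)
        out.append(row)
    return out


def conv_residual_block(x, k1, b1, k2, b2, k_short, b_short):
    """Main path: conv(1x3) -> ReLU -> conv(1x3); shortcut: conv(1x1) projection."""
    main = conv1d(relu2d(conv1d(x, k1, b1)), k2, b2)
    shortcut = conv1d(x, k_short, b_short)
    summed = [[m + s for m, s in zip(m_row, s_row)]
              for m_row, s_row in zip(main, shortcut)]
    return relu2d(summed)
```

The 1 × 1 projection lets the shortcut change the number of channels so that it can be added to the main path, whereas an identity shortcut would require the same channel count on both sides of the block.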
To improve the convergence rate, mini-batching is used, which also helps the RNN model avoid being trapped in local minima [33]. The optimized hyperparameters used in the RNN are a learning rate of 0.001 and a batch size of 20. The RNN is then trained on the training set, and the evolution of its MSE during training is shown in Fig. 5, where the MSE is the mean square error between the output structural parameters and the true values. As can be seen from

In addition, the average MSE between the predicted and target responses over the 233 test samples is 0.18, showing quantitatively that the RNN indeed provides appropriate designs with the desired optical properties.
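Mini-batching with the stated batch size of 20 can be sketched as a simple generator. This is a minimal illustration: the per-epoch reshuffling, the seed handling and the function name are assumptions, and the 1864 training samples follow from the 80% share of the 2330-sample dataset.

```python
import random


def minibatches(samples, batch_size=20, seed=0):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


# Stand-in for the 1864 training samples (80% of 2330).
training_samples = list(range(1864))
batches = list(minibatches(training_samples))
```

Each epoch then consists of 94 weight updates: 93 full batches of 20 samples and one final batch of 4. Passing a different seed per epoch would change the batch composition between epochs.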
As shown in Fig. 6, two examples from the test results show that the predicted parameters, (8.2, 1, 5.5, 0.8) and (7.2, 3, 0, 3.5), are very close to those of the target metasurfaces, (8.22, 0.98, 5.36, 0.74) and (6.9, 2.96, 0, 3.13). The ED, MD and TD scattered powers obtained from the predicted designs (circles) show good agreement with the desired spectra (solid lines), especially for the TD and MD responses. Therefore, the proposed RNN can be considered well trained to inversely design the parameters of a TD metasurface according to the target ED, MD and TD responses.
In practice, deep-learning-based inverse design is used to find a structure that reproduces an input spectrum chosen for a specific purpose. To demonstrate the ability of the proposed network to inversely design TD metasurfaces from user-required spectra, metasurfaces with arbitrary TD resonances are engineered. As indicated by the gray bands on the solid lines in Fig. 7, scattered spectra with arbitrary TD resonances are used as the input of the method to automatically design the metasurfaces. It takes only 2 seconds to batch design three user-required metasurfaces, which is much faster than the parameter optimization process of previous work [34]. The parameters of the designed metasurfaces are (5.9, 1.37, 2.9, 0.3), (6.29, 1.35, 1.63, 0.69) and (7.15, 1.28, 1.63, 1.48), and the ED, MD and TD scattered spectra obtained from the predicted designs by numerical calculation are shown as circles in Fig. 7. In addition, the MSEs between the required and calculated spectra are 0.023, 0.016 and 0.012, respectively.
As can be seen from Fig. 7, the spectra of the RNN-predicted designs largely overlap with the target scattering spectra, and in particular the trends of the ED and TD scattering fit the given scattered spectra well. Although there is a slight frequency shift at the intersection of the ED and TD scattering, the TD scattering remains dominant near the gray band. The accurate prediction at different resonant frequencies indicates that the method can design TD resonances at arbitrary target frequencies. In addition, the TD mode is further verified by calculating the surface current and cross-sectional magnetic field distributions at the resonance. In Fig. 7(d)-(f), a pair of counter-flowing currents can be clearly observed, which form a head-to-tail magnetic field, that is, a TD in the y-direction, as shown in Fig. 7(g)-(i).
To sum up, our network model can make good predictions for TD metasurfaces and further reduce the time-consuming numerical simulations needed to calculate the scattered spectra. The proposed model cannot obtain the S parameters at the same time as the scattered spectrum is calculated. However, if a large relevant dataset of metamaterials with S parameters is available, the model will be able to learn the connection between the scattered spectra and the S parameters and predict accurate TD results.

IV. CONCLUSION
In conclusion, a method has been presented to inversely design the structural parameters of TD metasurfaces based on independently given ED, MD and TD spectra. First, a forward prediction network is designed as a fully connected network with residual blocks. After training, it quickly predicts the scattered power from the structural parameters provided by the user and approximates the numerical calculation results with high accuracy. In addition, a convolutional residual network is proposed for the inverse design of TD metasurfaces. To show the capability of the RNN, a deep-learning-assisted inverse design for TD resonances is specifically demonstrated.
Using this model to inversely design TD metasurfaces avoids the highly nonlinear problem of solving Maxwell's equations and the large amount of time consumed by traditional numerical calculation methods, and the model can predict appropriate structural parameters that meet user requirements from an arbitrary scattered power spectrum. We believe that the proposed efficient inverse design method will provide new ideas for the design and development of TD metasurfaces in the future.