On Inadequacy of Sequential Design of Experiments for Performance-Driven Surrogate Modeling of Antenna Input Characteristics

Design of contemporary antennas necessarily involves electromagnetic (EM) simulation tools. Their employment is imperative to ensure evaluation reliability but also to carry out the design process itself, especially, the adjustment of antenna dimensions. For the latter, traditionally used parameter sweeping is more and more often replaced by rigorous numerical optimization, which entails considerable computational expenses, sometimes prohibitive. A potentially attractive way of expediting the simulation-based design procedures is the replacement of expensive EM analysis by fast surrogate models (or metamodels). Unfortunately, due to the curse of dimensionality and considerable nonlinearity of antenna characteristics, applicability of conventional modeling methods is limited to structures described by small numbers of parameters within narrow ranges thereof. A recently proposed nested kriging technique works around these issues by allocating the surrogate model domain within the regions containing designs that are of high quality with respect to the selected performance figures. This paper investigates whether sequential design of experiments (DoE) is capable of enhancing the modeling accuracy over one-shot space-filling data sampling originally implemented in the nested kriging framework. Numerical verification carried out for two microstrip antennas indicates that no noticeable benefits can be achieved, which contradicts the common-sense expectations. This result can be explained by a particular geometry of the confined domain of the performance-driven surrogate. As this set consists of nearly-optimum designs, the average nonlinearity of the antenna responses therein is almost location independent, therefore optimum training data allocation should be close to uniform. This is indeed corroborated by our experiments.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Giambattista Gruosso.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The design of modern antennas is a demanding and multi-stage endeavour that involves conceptual development, topology evolution, as well as the adjustment of geometry parameters [1]-[3]. The latter may be quite extensive and often pertains to all antenna dimensions [4]. The reasons are strictly related to the geometrical complexity of contemporary antenna systems, where the fulfillment of stringent performance requirements [5] and implementation of various functionalities, e.g., circular polarization [6], multi-band [7] or MIMO operation [8], pattern/polarization diversity [9], let alone meeting additional requirements such as reduction of the physical size of the radiator [10], [11], call for unconventional layouts [12]-[19]. These include incorporation of stubs [12], [13], slots [14], [15], defected ground structures [16], [17] or complex (e.g., spline-parameterized) profiles [18], [19], the exact effects of which cannot be quantified using analytical or equivalent network representations. Thus, utilization of full-wave electromagnetic (EM) simulation tools is imperative at all design stages to ensure the reliability of antenna evaluation [20], [21]. It is especially crucial in parameter tuning, which is nowadays frequently realized through rigorous numerical optimization. The high computational cost of such procedures, a result of the massive EM analyses required by both local [22] and global [23] search routines, is a serious practical problem. It is even more pronounced for uncertainty quantification procedures, especially robust (tolerance-aware) design [24], [25]. Expediting simulation-driven design is therefore a practical necessity. It can be accomplished using strictly algorithmic means, an example of which is the incorporation of adjoint sensitivities [26] into gradient optimization [27], [28]. Nonintrusive methods include gradient-based routines with sparse sensitivity updates, e.g., [29], [30]. Computational speedup can also be obtained using variable-fidelity simulation models such as equivalent networks in the case of microwave components [31] or coarse-mesh EM analysis in the design of antenna structures [32]. In either case, the low-fidelity model has to undergo an appropriate enhancement to be used as a reliable predictor.
Popular techniques include space mapping [33] as well as various response correction schemes (manifold mapping [34], adaptive response scaling [35], shape-preserving response prediction [36]). For certain purposes, especially global optimization, machine learning techniques are often employed [37], [38], typically in connection with surrogate modeling methods [39] and adaptive sampling [40]. Local surrogates are becoming indispensable for uncertainty quantification, either to replace EM analysis when performing Monte Carlo analysis [41] or to directly yield the statistical moments of the system outputs (e.g., polynomial chaos expansion [42]).
A more aggressive approach is to replace expensive EM simulations in their entirety by globally accurate surrogates [43]. Ensuring a sufficient predictive power of the metamodel enables conducting all conceivable simulation-based design tasks at a negligible cost. The initial expenditures related to training data acquisition, even if considerable, may be justified by multiple use of the model. Due to their generality, data-driven (or approximation) surrogates are by far the most popular [44]. Some widely applied techniques include kriging [45], Gaussian process regression [46], radial basis functions [47], artificial neural networks [48], and, recently, polynomial chaos expansion [49]. Although appealing, practical realization of the above concept is hindered by several factors, including high nonlinearity of antenna characteristics and the curse of dimensionality, i.e., a rapid increase, as a function of the number of antenna parameters, of the training data set size required to render a reliable model. Another issue is the utility demands: a design-ready surrogate should be valid over broad ranges of the system operating conditions, material parameters (e.g., substrate permittivity) and geometry parameters. Establishing an accurate model that fulfills such conditions is challenging for modern multi-parameter antennas, to the extent of being virtually impossible beyond a few variables with narrow ranges thereof [44]. Available mitigation techniques, e.g., high-dimensional model representation (HDMR) [50], smart basis function selection schemes (orthogonal matching pursuit, OMP [51], least-angle regression, LAR [52]), or variable-fidelity surrogates (co-kriging [53], Bayesian model fusion [54]) are only applicable to specific situations.
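To make the curse of dimensionality concrete, the following minimal sketch counts the samples needed by a full-factorial grid with a fixed number of levels per parameter. The grid design and the function name `grid_samples` are assumptions made purely for illustration; they are not the sampling schemes discussed in this work.

```python
def grid_samples(levels_per_parameter: int, n_parameters: int) -> int:
    """Number of points in a full-factorial grid (hypothetical helper)."""
    return levels_per_parameter ** n_parameters


# Five levels per parameter: the sample count grows exponentially
# with the number of antenna parameters.
for n in (2, 4, 6, 8):
    print(n, grid_samples(5, n))
```

Even at a modest resolution, the number of EM simulations quickly exceeds what is affordable at roughly one minute per analysis, which is the practical motivation for confining the model domain.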
The recently proposed performance-driven modeling offers an alternative approach to overcoming the dimensionality and parameter range issues [55]. As suggested in [55], by constraining the model domain to a region that contains high-quality designs with respect to the performance figures relevant to the antenna structure at hand, it is possible to render a surrogate that is accurate and valid over wide ranges of operating conditions (e.g., antenna center frequency [56] or permittivity/thickness of the substrate the antenna is implemented on [57]). At the same time, the required number of training samples is significantly lower than for conventional methods. Identification of the region of interest is carried out using a pre-optimized set of reference designs [55]. Due to the complex geometry of the constrained domain, some practical problems may arise that are related to design of experiments but also to model optimization [58]. These have been greatly alleviated by the nested kriging framework [59], the formulation of which involves a surjective transformation between the unity interval and the model domain.
This paper investigates whether introducing sequential design of experiments (DoE) into the nested kriging framework may bring further benefits (in terms of improving the predictive power of the surrogate) over the uniform sampling originally implemented in [59]. Sequential DoEs [60], especially the exploitative ones [20], aim at identifying and filling in the regions characterized by higher nonlinearity of the system outputs. This allows for redistribution of the training data samples by putting more emphasis on such areas while sparing samples in the plateau or low-nonlinearity regions. Here, the infill criterion is maximization of the mean square error [61] because the overall goal is to improve the global accuracy of the surrogate. Numerical experiments conducted for two dual-band microstrip antennas indicate that sequential DoE does not bring any computational benefits over uniform sampling. This is an interesting and counterintuitive result. Notwithstanding, it can be explained by the specific geometry of the domain of the nested kriging surrogate model, i.e., the fact that it only contains designs that are nearly optimum with respect to the selected figures of interest. The latter implies that the frequency-averaged nonlinearity of the antenna responses is almost location independent. Consequently, the expected optimum allocation of the training data samples should be close to uniform. This seems to be an inherent feature of performance-driven modelling techniques in general, and the nested kriging framework in particular.

II. SURROGATE MODELING BY NESTED KRIGING. UNIFORM AND SEQUENTIAL SAMPLING
This section briefly recalls the nested kriging modelling method with the emphasis on the uniform domain sampling technique utilized in the original version of the framework.
Subsequently, incorporation of sequential sampling into the nested kriging methodology is discussed. Section III compares the predictive power of the surrogate models constructed using these two sampling strategies and provides qualitative explanations for the obtained results.

A. NESTED KRIGING MODELING FORMULATION
The nested kriging procedure constructs two kriging interpolation surrogates [45]. The first-level model serves to identify the domain for the second-level model, the latter being the actual surrogate that represents the antenna characteristics. The surrogate domain $X_S$ is a confined region of the box-constrained design space $X$, typically delimited by the lower bounds $\mathbf{l}$ and the upper bounds $\mathbf{u}$ for design variables. The domain $X_S$ accommodates high-quality designs, i.e., the designs that are optimal or nearly optimal w.r.t. the performance figures that are of interest in a given design context. Exemplary figures of merit may refer to antenna electrical characteristics (e.g., operating frequencies in the case of multi-band antennas) or material parameters, such as relative permittivity of the dielectric substrate the antenna is to be implemented on. These are denoted as $f_k$, $k = 1, \ldots, N$. The ranges of the performance figures, $f_{k.\min} \le f_k \le f_{k.\max}$, determine the objective space $F$. The designs $\mathbf{x}^{(j)}$, $j = 1, \ldots, p$, optimized with respect to the performance vectors $\mathbf{f}^{(j)} \in F$, are referred to as the reference designs. Thus, the first-level model $s_I(\mathbf{f})$, which maps the objective space $F$ into the parameter space $X$, is an inverse model that, for a given performance vector $\mathbf{f} \in F$, yields a corresponding vector $\mathbf{x} \in X$.

The intended domain for the surrogate is to contain all the designs that are optimal w.r.t. all performance vectors $\mathbf{f} \in F$. As the set $s_I(F) \subset X$ is a mere approximation of such a region, it has to be expanded. This is carried out by an orthogonal extension of $s_I(F)$ in its normal directions. Let us denote by $\{\mathbf{v}_n^{(k)}(\mathbf{f})\}$, $k = 1, \ldots, n - N$, the orthonormal basis of vectors normal to $s_I(F)$ at $\mathbf{f} \in F$ [59]. Furthermore, we denote by $\mathbf{x}_d = \mathbf{x}_{\max} - \mathbf{x}_{\min}$ the ranges of design variables within $s_I(F)$, where $\mathbf{x}_{\max} = \max\{\mathbf{x}^{(k)}, k = 1, \ldots, p\}$ and $\mathbf{x}_{\min} = \min\{\mathbf{x}^{(k)}, k = 1, \ldots, p\}$. Using these, the following manifolds can be defined

$$M_{\pm} = \left\{ \mathbf{x} \in X : \mathbf{x} = s_I(\mathbf{f}) \pm \sum_{k=1}^{n-N} \alpha_k(\mathbf{f})\, \mathbf{v}_n^{(k)}(\mathbf{f}) \right\} \tag{1}$$

where

$$\boldsymbol{\alpha}(\mathbf{f}) = [\alpha_1(\mathbf{f}) \; \ldots \; \alpha_{n-N}(\mathbf{f})]^T = 0.5\, T \left[ |\mathbf{x}_d \mathbf{v}_n^{(1)}(\mathbf{f})| \; \ldots \; |\mathbf{x}_d \mathbf{v}_n^{(n-N)}(\mathbf{f})| \right]^T \tag{2}$$

are the extension coefficients, with $T$ being a domain thickness parameter. The domain $X_S$ is then established as

$$X_S = \left\{ \mathbf{x} = s_I(\mathbf{f}) + \sum_{k=1}^{n-N} \lambda_k\, \alpha_k(\mathbf{f})\, \mathbf{v}_n^{(k)}(\mathbf{f}) : \mathbf{f} \in F,\; -1 \le \lambda_k \le 1,\; k = 1, \ldots, n - N \right\} \tag{3}$$

The second-level kriging surrogate is set up in $X_S$ using the data pairs $\{\mathbf{x}_B^{(k)}, R(\mathbf{x}_B^{(k)})\}$, $k = 1, \ldots, N_B$, with $R$ being the response of the EM antenna model. Allocation of the training samples is of paramount importance for model reliability. Section II.B outlines the design of experiments strategy utilized by the original nested kriging framework [59], which is replaced in this work by the sequential sampling scheme described in Section II.C.

B. DESIGN OF EXPERIMENTS FOR NESTED KRIGING: UNIFORM SAMPLING
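In the original nested kriging, one-shot design of experiments is carried out, i.e., the entire data set is allocated prior to constructing the model [59]. Despite a potentially complex geometry of the model domain, space-filling sampling is greatly facilitated by exploiting the domain definition (3) and a two-stage surjective transformation from the unit hypercube $[0,1]^n$ onto $X_S$. In the first step, the data samples $\{\mathbf{z}^{(k)}\}$, $k = 1, \ldots, N_B$, $\mathbf{z}^{(k)} = [z_1^{(k)} \ldots z_n^{(k)}]^T$, are uniformly distributed using Latin Hypercube Sampling (LHS) [62] and mapped using an auxiliary transformation $h_1$ onto the Cartesian product $F \times [-1, 1]^{n-N}$

$$\mathbf{y} = h_1(\mathbf{z}) = \left[ f_{1.\min} + z_1 (f_{1.\max} - f_{1.\min}) \; \ldots \; f_{N.\min} + z_N (f_{N.\max} - f_{N.\min}) \;\; -1 + 2 z_{N+1} \; \ldots \; -1 + 2 z_n \right]^T \tag{4}$$

whereas the second transformation

$$\mathbf{x} = h_2(\mathbf{y}) = s_I([y_1 \ldots y_N]^T) + \sum_{k=1}^{n-N} y_{N+k}\, \alpha_k([y_1 \ldots y_N]^T)\, \mathbf{v}_n^{(k)}([y_1 \ldots y_N]^T) \tag{5}$$

maps $F \times [-1, 1]^{n-N}$ onto $X_S$. The samples $\mathbf{x}_B^{(k)}$ within the constrained domain $X_S$ (being a subset of the design space $X$) are obtained by applying the composed transformation $H : [0,1]^n \to X_S$, $H(\cdot) = h_2(h_1(\cdot))$, to the data set $\{\mathbf{z}^{(k)}\}$ as follows

$$\mathbf{x}_B^{(k)} = H(\mathbf{z}^{(k)}) = h_2(h_1(\mathbf{z}^{(k)})) \tag{6}$$

Note that the uniform distribution of $\{\mathbf{z}^{(k)}\}$ is understood with respect to the objective space $F$. This is generally more advantageous than a uniform distribution in $X_S$ because of the normally nonlinear dependence between the performance figures and the geometry parameter values corresponding to the designs optimized with respect to these figures. Figure 1 shows a graphical illustration of the sampling procedure outlined above.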
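The two-stage mapping can be sketched in a few lines of NumPy. Everything specific in the block below is an assumption made for illustration only: the toy inverse model `s_I` (a curve in a two-dimensional parameter space, so that $N = 1$ and $n = 2$), the range `F_MIN`, `F_MAX`, the value of `T`, and the reading of $|\mathbf{x}_d \mathbf{v}_n^{(k)}|$ as the magnitude of an inner product. In the actual framework, $s_I$ is a kriging model trained on the reference designs.

```python
import numpy as np

F_MIN, F_MAX = 1.0, 2.0   # hypothetical range of one performance figure (N = 1)
T = 0.1                   # domain thickness parameter


def s_I(f):
    """Toy first-level (inverse) model: objective space -> parameter space (n = 2)."""
    return np.array([f, -0.5 * f**2])


def normal(f):
    """Unit vector normal to the curve s_I(F) at f (n - N = 1 normal direction)."""
    tangent = np.array([1.0, -f])                 # derivative of s_I w.r.t. f
    v = np.array([-tangent[1], tangent[0]])       # rotate the tangent by 90 degrees
    return v / np.linalg.norm(v)


# Parameter ranges x_d = x_max - x_min taken over a few stand-in reference designs.
_refs = np.array([s_I(f) for f in np.linspace(F_MIN, F_MAX, 5)])
X_D = _refs.max(axis=0) - _refs.min(axis=0)


def alpha(f):
    """Extension coefficient (assumed here to be 0.5 * T * |x_d . v_n(f)|)."""
    return 0.5 * T * abs(X_D @ normal(f))


def h1(z):
    """Unit square -> F x [-1, 1]."""
    return np.array([F_MIN + z[0] * (F_MAX - F_MIN), -1.0 + 2.0 * z[1]])


def h2(y):
    """F x [-1, 1] -> X_S: orthogonal relocation from s_I(F)."""
    f, lam = y
    return s_I(f) + lam * alpha(f) * normal(f)


def H(z):
    """Composed transformation from the unit square onto X_S."""
    return h2(h1(z))


def lhs(n_samples, dim, rng):
    """Basic Latin Hypercube Sampling: one sample per stratum in every dimension."""
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    for j in range(dim):
        u[:, j] = rng.permutation(u[:, j])
    return u


rng = np.random.default_rng(0)
Z = lhs(50, 2, rng)                   # samples in the unit square
X_B = np.array([H(z) for z in Z])     # training locations inside X_S
```

Because the samples are uniform in the normalized domain, their density along $s_I(F)$ follows the objective space rather than the parameter space, which is the property noted above.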

C. SEQUENTIAL DESIGN OF EXPERIMENTS FOR NESTED KRIGING
In this section, sequential design of experiments is considered as an alternative sampling strategy for the nested kriging modeling framework. The aim is to improve the predictive power of the surrogate model without increasing the training data set size. As explained in Section II.B, in the original nested kriging, the samples are allocated using a one-shot procedure, based on LHS [62] and a mapping from the normalized domain (unity interval) onto the surrogate domain [59]. One of the objectives of sequential sampling [60], [63] is to concentrate the training data samples in the regions of higher nonlinearity of the system outputs. This normally allows for reducing the modeling error as compared to uniform distributions, assuming comparable training data set sizes [64].

FIGURE 1. One-shot sampling procedure in the domain $X_S$ (for a two-dimensional objective space $F$ and a three-dimensional parameter space $X$) [59]: (i) function $h_1$ (see (4)) maps LHS-allocated samples onto the Cartesian product of $F$ and $[-1, 1]^{n-N}$; (ii) next, function $h_2$ (see (5)) maps the samples onto $X_S$; the penultimate picture from the bottom shows samples $s_I(h_1(\mathbf{z}))$ mapped into the image $s_I(F)$ of $F$; (iii) orthogonally relocated samples within the entire $X_S$ (see (6)).
In this work, we focus on exploitation-based sequential sampling [20], where the new (infill) data samples are allocated iteratively using information acquired at the previously allocated points. To that end, the choice of kriging interpolation, among various data-driven surrogates, is beneficial, because the kriging surrogate provides information about the expected model error [65]. Here, the adopted infill criterion is maximization of the mean square error [61].
For the convenience of the reader, a brief formulation of kriging interpolation is provided below. Let $X_{B.KR} = \{\mathbf{x}^{1}, \mathbf{x}^{2}, \ldots, \mathbf{x}^{N_B}\}$ be the training set, with $R_f(X_{B.KR})$ referring to the corresponding high-fidelity model outputs. The kriging surrogate $s_{KR}(\mathbf{x})$ is defined as follows [45]

$$s_{KR}(\mathbf{x}) = \boldsymbol{\varphi}\boldsymbol{\beta} + \boldsymbol{\rho}(\mathbf{x})\, \boldsymbol{\Psi}^{-1} \left( R_f(X_{B.KR}) - \boldsymbol{\mu}\boldsymbol{\beta} \right) \tag{7}$$

In (7), $\boldsymbol{\mu}$ stands for an $N_B \times t$ model matrix of the training set $X_{B.KR}$ and $\boldsymbol{\varphi}$ refers to a $1 \times t$ vector of the evaluation point $\mathbf{x}$, with $t$ being the number of terms used in the regression function [66] described by the coefficients $\boldsymbol{\beta}$, whereas $\boldsymbol{\rho}(\mathbf{x}) = [\psi(\mathbf{x}, \mathbf{x}^{1}), \ldots, \psi(\mathbf{x}, \mathbf{x}^{N_B})]$ is a $1 \times N_B$ vector of correlations between $\mathbf{x}$ and $X_{B.KR}$, and $\boldsymbol{\Psi} = [\Psi_{i,j}]$ is a correlation matrix with $\Psi_{i,j} = \psi(\mathbf{x}^{i}, \mathbf{x}^{j})$. Frequently, the following correlation function is utilized [67]

$$\psi(\mathbf{x}, \mathbf{x}') = \exp\left( -\sum_{k=1}^{n} \theta_k\, |x_k - x'_k|^{P} \right) \tag{8}$$

where $\theta_k$, $k = 1, \ldots, n$ ($n$ being the parameter space dimensionality), are the hyperparameters, whereas $P$ is typically constant and decides upon the prediction 'smoothness' (for many practical problems $P = 2$, i.e., a Gaussian correlation function, is assumed).

In each iteration of the sampling procedure, the global maximum of MSE over the surrogate model domain (10) has to be sought [68]; a new sample is allocated therein [61]. Typically, the global search is realized using population-based metaheuristics, the CPU cost of which is usually high [23], [69]. In this work, an alternative approach is taken, which exploits the particular structure of the surrogate model domain of the nested kriging technique as well as the mathematical formalism defining the domain and its relationships with the normalized domain (a unity interval). More specifically, the maximum of MSE is found in a two-step process, where the first step is an exhaustive grid search leading to identification of a good initial point for the subsequent local improvement.
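As an illustration of how a kriging model supplies both a prediction and an error estimate, the block below gives a minimal sketch of constant-trend (ordinary) kriging with the correlation function described above. Several simplifications are assumptions of this sketch and not part of the source: the response is scalar rather than vector-valued, the hyperparameters `theta` are fixed rather than fitted, a small `nugget` is added for numerical stability, and the MSE expression is the standard textbook form for ordinary kriging, which may differ in detail from the one used by the authors.

```python
import numpy as np


def gauss_corr(A, B, theta, P=2.0):
    """psi(x, x') = exp(-sum_k theta_k |x_k - x'_k|^P) for all pairs of rows."""
    diff = np.abs(A[:, None, :] - B[None, :, :]) ** P
    return np.exp(-(diff * theta).sum(axis=2))


class OrdinaryKriging:
    """Constant-trend kriging for a scalar response; theta is fixed, not fitted."""

    def __init__(self, X, y, theta, nugget=1e-10):
        self.X, self.theta = X, theta
        Psi = gauss_corr(X, X, theta) + nugget * np.eye(len(X))
        self.Psi_inv = np.linalg.inv(Psi)
        ones = np.ones(len(X))
        self.denom = ones @ self.Psi_inv @ ones
        self.beta = (ones @ self.Psi_inv @ y) / self.denom     # constant trend
        self.resid = y - self.beta
        self.sigma2 = (self.resid @ self.Psi_inv @ self.resid) / len(X)

    def _rho(self, x):
        return gauss_corr(x[None, :], self.X, self.theta)[0]

    def predict(self, x):
        """Trend plus correlation-weighted residuals."""
        return self.beta + self._rho(x) @ self.Psi_inv @ self.resid

    def mse(self, x):
        """Predicted mean square error at x (standard ordinary-kriging form)."""
        rho = self._rho(x)
        u = 1.0 - np.sum(self.Psi_inv @ rho)
        return self.sigma2 * (1.0 - rho @ self.Psi_inv @ rho + u**2 / self.denom)


# Usage on a 3 x 3 grid of training points in the unit square.
g = np.linspace(0.0, 1.0, 3)
X = np.array([[a, b] for a in g for b in g])
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
model = OrdinaryKriging(X, y, theta=np.array([10.0, 10.0]))
```

The predicted MSE vanishes (up to the nugget) at the training points and grows away from them, which is what makes it usable as an infill criterion.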
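We use the following notation:
• $N_0$ - initial number of data samples allocated in the surrogate model domain $X_S$ (here, using the method of Section II.B);
• $N_M$ - grid density;
• $M_F$ - rectangular grid over the objective space:

$$M_F = \left\{ \mathbf{f} = [f_1 \ldots f_N]^T : f_k = f_{k.\min} + j\, \frac{f_{k.\max} - f_{k.\min}}{N_M},\; k = 1, \ldots, N,\; j \in \{0, 1, \ldots, N_M\} \right\} \tag{11}$$

Thus, $M_F$ contains $(N_M + 1)^N$ points uniformly covering $F$. The initial approximation $\mathbf{x}_{\max.tmp}$ of the point maximizing the MSE is found through the exhaustive search on the grid as

$$\mathbf{x}_{\max.tmp} = h_2\left( [\mathbf{f}_{tmp}^T \; 0 \ldots 0]^T \right) \tag{12}$$

where

$$\mathbf{f}_{tmp} = \arg\max_{\mathbf{f} \in M_F} \mathrm{MSE}\left( h_2\left( [\mathbf{f}^T \; 0 \ldots 0]^T \right) \right) \tag{13}$$

The number of zeros in (12) is $n - N$, i.e., $\mathbf{x}_{\max.tmp} \in s_I(F)$. The design $\mathbf{x}_{\max.tmp}$ is refined to obtain the new sample point through local search by solving

$$\mathbf{x}_B^{(N_0 + i + 1)} = H(\mathbf{z}_{\max}) \tag{14}$$

with

$$\mathbf{z}_{\max} = \arg\max_{\mathbf{z} \in [0,1]^n} \mathrm{MSE}(H(\mathbf{z})) \tag{15}$$

The starting point for (15) is

$$\mathbf{z}^{(0)} = h_1^{-1}\left( [\mathbf{f}_{tmp}^T \; 0 \ldots 0]^T \right) \tag{16}$$

The mappings $h_1$, $h_2$, and $H$ were defined in Section II.B. The overall design of experiments procedure can be summarized as follows:
1. Allocate the initial sample set $\{\mathbf{x}_B^{(k)}\}$, $k = 1, \ldots, N_0$, using the method of Section II.B;
2. Set $i = 0$;
3. Construct the second-level surrogate $s_{KR}$ using the current training set;
4. Find $\mathbf{x}_{\max.tmp}$ using (12), (13);
5. Find the new sample $\mathbf{x}_B^{(N_0 + i + 1)}$ by solving (14), (15) with the initial design (16);
6. Set $i = i + 1$;
7. If the termination condition is not satisfied, go to 3;
8. Construct the final second-level surrogate $s_{KR}$ using the current training set.

The termination condition in Step 7 can be based on: (i) exceeding the maximum budget, $N_0 + i > N_{\max}$ (user-defined maximum number of samples), (ii) achieving the target value of maximum MSE, or (iii) achieving the target predictive power of the model (estimated using, e.g., cross-validation [70]). Note that (ii) and (iii) do not coincide because the system responses are vector-valued and the error measure applied for model quality assessment may be selected to, e.g., reflect visual agreement between the surrogate-predicted and EM-simulated antenna characteristics.

FIGURE 2. Identifying infill samples in sequential design of experiments for the nested kriging framework. The point $\mathbf{f}_{tmp}$ is an initial approximation of the MSE maximizer, found using (13). Its image through the mapping $h_2$ (cf. (5)) becomes an initial point for local refinement as in (14)-(16); however, the optimization process is formally conducted in the unit interval using the mapping $H$ (cf. (6)), which maps the unit interval onto the surrogate model domain $X_S$.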
Figure 2 provides a graphical illustration of the overall process of identifying the infill samples.
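The two-step infill search (exhaustive grid search followed by local refinement in the unit hypercube) can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the MSE is the simple-kriging variance with unit process variance and fixed `theta`, evaluated directly in the normalized domain (the mapping $H$ is omitted), and the unspecified local optimizer is replaced by a basic coordinate pattern search. Grid points are padded with 0.5 in the last $n - N$ coordinates, which corresponds to zero orthogonal offset from $s_I(F)$.

```python
import itertools
import numpy as np


def corr(A, B, theta):
    """Gaussian correlation between all rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * theta).sum(axis=2)
    return np.exp(-d2)


def mse(z, Z, theta):
    """Simple-kriging variance (unit process variance) at z given samples Z."""
    Psi = corr(Z, Z, theta) + 1e-10 * np.eye(len(Z))
    rho = corr(z[None, :], Z, theta)[0]
    return float(1.0 - rho @ np.linalg.solve(Psi, rho))


def grid_start(Z, theta, N, n, N_M):
    """Step 1: exhaustive search over (N_M + 1)^N grid points, rest fixed at 0.5."""
    ticks = np.linspace(0.0, 1.0, N_M + 1)
    best, best_val = None, -np.inf
    for g in itertools.product(ticks, repeat=N):
        z = np.concatenate([np.array(g), np.full(n - N, 0.5)])
        v = mse(z, Z, theta)
        if v > best_val:
            best, best_val = z, v
    return best


def refine(z0, Z, theta, step=0.1, tol=1e-3):
    """Step 2: coordinate pattern search maximizing the MSE inside [0, 1]^n."""
    z, val = z0.copy(), mse(z0, Z, theta)
    while step > tol:
        improved = False
        for k in range(len(z)):
            for s in (step, -step):
                c = z.copy()
                c[k] = np.clip(c[k] + s, 0.0, 1.0)
                v = mse(c, Z, theta)
                if v > val + 1e-12:
                    z, val, improved = c, v, True
        if not improved:
            step /= 2.0          # shrink the pattern when no move helps
    return z, val


def sequential_doe(Z0, n_add, theta, N, N_M=10):
    """Add n_add infill samples, one per iteration, at the (local) MSE maximum."""
    Z = Z0.copy()
    for _ in range(n_add):
        z0 = grid_start(Z, theta, N, Z.shape[1], N_M)
        z_new, _ = refine(z0, Z, theta)
        Z = np.vstack([Z, z_new])
    return Z


theta = np.array([5.0, 5.0])
Z0 = np.array([[0.2, 0.2], [0.8, 0.8], [0.2, 0.8]])   # hypothetical initial set
Z_seq = sequential_doe(Z0, n_add=3, theta=theta, N=1)
```

The grid search costs only $(N_M + 1)^N$ MSE evaluations because it is restricted to the objective-space directions; the local stage then moves the candidate off $s_I(F)$ when a larger predicted error lies in the orthogonal directions.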

III. VERIFICATION CASE STUDIES
This section provides numerical verification of the nested kriging with sequential design of experiments, including its comparison with uniform sampling of Section II.B. Our considerations are complemented by discussion that gives a qualitative interpretation of the obtained results.

A. CASE STUDIES
For the sake of numerical verification, the following antenna structures are considered:
• A dual-band uniplanar dipole antenna (Antenna I) shown in Fig. 3(a) [71]. The antenna is implemented on RO4350 substrate ($\varepsilon_r = 3.5$, $h = 0.76$ mm). The EM model is implemented in CST Microwave Studio and evaluated using its time-domain solver (∼100,000 cells; simulation time 60 s). The objective is to construct the surrogate model valid for the following ranges of operating frequencies: 2.0 GHz ≤ $f_1$ ≤ 3.0 GHz (lower band), and 4.0 GHz ≤ $f_2$ ≤ 5.5 GHz (upper band). The details about the reference designs and the parameter space can be found in [71].
• A trapezoid dual-band dipole antenna (Antenna II) shown in Fig. 3(b) [72]. The structure is implemented [...] In this case, the surrogate model is to be constructed for the objective space parameterized by the operating frequencies $f_1$ and $f_2 = K f_1$ for 2.0 GHz ≤ $f_1$ ≤ 3.5 GHz, and 1.2 ≤ $K$ ≤ 1.6. The details about the reference designs and the parameter space can be found in [72].

B. EXPERIMENTAL SETUP AND RESULTS
For all considered test antennas, the nested kriging surrogate has been constructed using several training sets of various sizes: 100, 200, 400 and 800 samples. In both cases, the surrogate was constructed for two different values of the thickness parameter $T$ (cf. Section II.A). For all cases, the surrogate was constructed using training data allocated according to the uniform sampling method of Section II.B as well as the sequential DoE of Section II.C. Additionally, conventional kriging interpolation and radial basis function surrogates have been included to emphasize the overall benefits of the nested kriging framework. The numerical results have been gathered in Tables 1 and 2 for Antennas I and II, respectively.

The results indicate that sequential DoE brings no noticeable improvement of the predictive power over uniform sampling, and this holds for both antennas and both values of $T$ as well as different training data set sizes. This interesting outcome is counterintuitive and not in line with the intended performance of sequential DoEs. A closer look into the formulation of the performance-driven surrogates, specifically, nested kriging, helps to explain this phenomenon. The fundamental reason is the very definition of the surrogate model domain, which, by design, contains the parameter vectors that are optimum or nearly optimum with respect to the performance figures of choice. Because the domain is the extended image of the objective space through the first-level surrogate, it contains uniformly distributed representations of the optimum designs for all combinations of the relevant figures of interest (e.g., operating conditions). This means that the (frequency-averaged) nonlinearity of antenna characteristics is essentially independent of the location within the domain. In particular, there are no regions where the typical nonlinearity of the functional landscape to be modeled is higher than elsewhere. This is illustrated in Fig. 6, which shows random samples distributed within the original (box-constrained) domain $X$ and the constrained domain $X_S$ of the nested kriging. Consequently, a uniform distribution of the training samples is preferred (from the point of view of improving the global accuracy of the surrogate) and sequential design of experiments leads to such a distribution. Figure 7 provides an illustration for Antenna I, where the distributions obtained using uniform and sequential DoEs are very much comparable.

FIGURE 6. Random samples distributed within the original (box-constrained) domain $X$ and the constrained domain $X_S$ of the nested kriging. Because large regions of the domain $X$ contain poor-quality designs with shallow resonances, the nonlinearity of the functional landscape within $X$ is not uniform and sequential DoE may bring some benefits by concentrating samples in the areas of higher nonlinearities. For the constrained domain, the nonlinearity of the antenna responses is more or less the same throughout $X_S$ (the resonances are deep and just allocated at different frequencies); consequently, uniform sampling seems to be the optimum choice and sequential DoE does not improve the model predictive power.

FIGURE 7. Training sample distributions obtained for Antenna I using uniform and sequential DoE. It can be observed that the sample distributions for both sets are quite comparable in terms of uniformity, which is an indication that sequential design of experiments does lead to uniform sample allocation when applied to the constrained domain of the nested kriging.
Based on the above conjectures, the predictive power of the models rendered with uniform and sequential DoEs should indeed be comparable. It appears that (error-wise) optimality of uniform sampling is an inherent feature of performance-driven modeling methods in general, and the nested kriging framework in particular.
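One simple way to put a number on "comparable in terms of uniformity" is the coefficient of variation of nearest-neighbour distances, computed for each sample set in the normalized domain. The metric and the helper name `nn_distance_cv` are assumptions introduced here for illustration; the source compares the distributions visually and does not state that such a metric was used.

```python
import numpy as np


def nn_distance_cv(points):
    """Coefficient of variation of nearest-neighbour distances.

    Close to 0 for a regular (uniform) layout, larger for clustered sets.
    """
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # ignore the zero self-distances
    nn = d.min(axis=1)
    return nn.std() / nn.mean()


# A regular 5 x 5 grid versus the same grid with a tight cluster added.
g = np.linspace(0.0, 1.0, 5)
grid = np.array([[a, b] for a in g for b in g])
clustered = np.vstack([grid, 0.1 + 0.01 * grid])

cv_grid = nn_distance_cv(grid)
cv_clustered = nn_distance_cv(clustered)
```

Two sample sets of the same size with similar values of this statistic are similarly uniform in this sense, whereas a sequential design that concentrated samples in selected regions would show a markedly larger value.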

IV. CONCLUSION
The paper addressed design of experiments for computationally efficient surrogate modelling of antenna input characteristics. In particular, we compared the performance of the nested kriging modelling framework using uniform (LHS-based) and sequential sampling, the latter involving maximization of the mean square error as the primary infill criterion. Comprehensive numerical experiments conducted for two microstrip antennas lead to counterintuitive results, demonstrating no improvement of the predictive power for the surrogate rendered using sequential DoE over uniform sampling. The results are consistent for all considered test cases and various sizes of the training sets. This phenomenon has been explained based on the inherent properties of the constrained domain of the nested kriging model, specifically the fact that the domain only contains nearly optimum parameter vectors uniformly representing the design objectives selected for the antenna structure at hand. This leads to the conclusion that uniform sampling seems to be an optimum choice and the improvement due to sequential DoEs (if any) would be negligible. This seems to be an inherent feature of the nested kriging framework as well as other performance-driven modelling techniques.

He is currently a Professor with the School of Science and Engineering, Reykjavik University, Iceland. His research interests include CAD and modeling of microwave and antenna structures, simulation-driven design, surrogate-based optimization, space mapping, circuit theory, analog signal processing, evolutionary computation, and numerical analysis.