Controlled Stratification Based on Kriging Surrogate Model: An Algorithm for Determining Extreme Quantiles in Electromagnetic Compatibility Risk Analysis

An electromagnetic compatibility failure is a consequence of an applied interference level being in excess of the susceptibility level of the electronic equipment under investigation. Both interference and susceptibility levels depend on various configurations of coupling paths described by sets of unknown or uncertain parameters. It is therefore convenient to describe the applied interference and the susceptibility levels as random variables. As extreme values may have a strong impact on the risk of failure, we focus in this article on the estimation of extreme values of interference level (relevant applied fields, currents or voltages) by means of a restricted set of numerical simulations. The controlled stratification method aims at reducing the variance of estimation of extreme quantile, based on a correlated simple model. We recently highlighted that a kriging surrogate model was a good candidate to provide this simple model. Combined with controlled stratification, we obtained better estimation performances than using a standalone kriging model with the same output sample size. In practice, this sample size is limited due to the excessive simulation time of electromagnetic solvers. In this paper, we propose an original algorithm, which aims at checking whether the sample size is adequate to perform an acceptable estimation or not. We first validate the algorithm using analytical models. Finally, we apply this method to estimate the 99% quantile of the total radiated power of a source located inside an open cavity with 16 uncertain inputs. In that case, the algorithm reduces the number of calls to the initial model to approximately 40% of the budget that is required using a standard Monte Carlo approach. Moreover, it provides almost 4 times more extreme outputs. More remarkably, our proposed algorithm provides guidance for assessing the performance of quantile estimation according to the initially sample size of the design of experiment.


I. INTRODUCTION
Electromagnetic Compatibility (EMC) risk analysis, in the context of intentional electromagnetic interference (IEMI), often requires solving Maxwell equations with 3D numerical solvers based for instance on the method of moment, the finite difference time domain, or the finite element method. Such models are deterministic and are supposed to provide ''exact'' solutions, their errors being neglected. Such models are The associate editor coordinating the review of this manuscript and approving it for publication was Wen-Sheng Zhao . denoted as true models in the rest of this paper. Unfortunately, a true model is very time consuming. In a realistic EMC test case, many input data are not well known due to epistemic uncertainties. These uncertainties propagate through the model. As a result, the calculated output exhibits nontractable fluctuations and may be described as a random variable. A proper IEMI risk assessment requires estimation of some extreme quantiles of the output for some input distributions.
A α-quantile is the output variable value for which its cumulated distribution function (cdf) reaches the probability level α. Any quantile estimation resorts to the construction of an adequate design of experiment (DOE) [1]. The DOE includes on the one hand some random realizations of the inputs and on the other hand, the corresponding outputs computed with the model. There are various strategies to build an adequate DOE in order to speed-up the convergence of the searched quantile estimation.
The standard Monte Carlo approach consists in building an empirical cdf from a DOE based on a sampling of input variables according to Latin hypercube sampling (LHS). Any statistic, such as the α-quantile is then directly estimated from the empirical cdf of the output sample. Re-sampling techniques [2] may also refine this estimation. However, it requires a large sample size of the DOE to compute extreme quantiles.
In recent years, some authors took advantage of advanced statistic concepts to solve EMC risk analysis problems. Most of these techniques are based on an adequate definition of the DOE to build a Surrogate Model (SM) [3] that can predict many outputs at low computation cost, and ultimately any statistic of interest. Different stochastic techniques have been successfully applied in order to provide a more realistic view of EM simulations including uncertainties. Among them, the unscented transform [4] or the ''Lagrange'' stochastic collocation [5], [6] methods, the kriging technique [7]- [11], the polynomial chaos expansion [12]- [16] were proposed. The previous quoted techniques were mainly applied to numerical simulations including various applications: shielding effectiveness [17], scattering and propagation [18] and/or EMC/EMI problems. Other important aspects of statistical developments (especially considering the EMC framework) deal with extreme quantile estimation based on reliability techniques [19]- [22] as well as on the so-called controlled stratification (CS) method [23], [24]. This last technique in particular reduces the variance of estimation of extreme quantiles, which is the purpose of our study.
The CS technique uses a specific SM to design an adequate DOE of the true model. It may be seen as a pre-sampling technique dedicated to the estimation of an extreme quantile without resorting to the entire output distribution. A SM for CS needs to be specifically correlated to the true model in terms of probability of reaching extreme values for both SM and the true model, with the same input realizations. In [24], the SM was a physical approximation of the true model. Finding a systematic approach to build an appropriate SM for the CS is an open question in literature. In [25], we rather investigated generic SMs as candidates for CS. The kriging interpolation was selected as a suitable candidate for the CS, with regard to the above-mentioned specific correlation. Recently, we applied the CS method based on a kriging SM [26]. This method, called KCS was compared to direct quantile estimations with the standalone kriging model. For a fair comparison, the sample size of the DOE (i.e. the number of calls to the true model) was identical for the extreme quantile estimation through KCS and kriging. As regard KCS, we allocated half of the sample size of the DOE to build the kriging SM, and the other half to the CS method. We then demonstrated that KCS may significantly outperform the standalone kriging SM for estimating extreme quantiles.
This paper deals with solving the question of quality or convergence of extreme quantiles estimation from a practical point of view and introduces a systematical approach to solve this problem. Obviously, the performance of KCS (or of any statistical estimation) improves as the sample size of the DOE increases, but for most EMC simulations models, only a very limited simulation budget is available due to limited time and cost resources. A single computation may last a few minutes for very simple geometries up to several hours or even days for entire systems (e.g. a vehicle).
For a given sample size of the DOE, one can hardly predict the variance of a α-quantile estimation with the KCS method. In addition, the variance of an α-quantile estimation varies rapidly with the sample size if it is too limited. This paper introduces an original algorithm to check the consistency of the sample size of the DOE according to the searched α-quantile.
The proposed algorithm starts with one DOE of a carefully chosen maximum sample size. From this DOE, a SM is built (in this paper a kriging). This SM is then used to predict the output of new and independent inputs thus forming new DOEs of smaller sizes. At each smaller DOE, a new SM is built and used to estimate the targeted quantile. While the true model is called only to create the first DOE of maximum sample size, several quantile estimations are therefore available at smaller sample sizes with no extra cost. Consequently, the fluctuations of the α-quantile estimation with regard to the sample size can be checked at a much-reduced cost. This information is useful to indicate if the simulation budget is large enough. Once the quantile estimation of SM is considered stable enough, the CS with the most accurate SM (i.e. built at the maximal Sample size of the DOE) is used to provide the final quantile estimation. To our knowledge, the KCS was presented for the first time in [26] and this implementation of the KCS has not been introduced so far in the literature. This paper is organized as follows. The KCS method is first introduced in section II. It is followed by a complete description of the proposed algorithm in section III. Then, we provide a validation of the critical part of the algorithm based on toy examples in section IV. The last section deals with the practical implementation of the method on a classical EMC case study. Finally, a conclusion and perspectives are given. Table 1 provides the nomenclature of the numerous variables used throughout the paper.

A. THE KRIGING
The goal of a SM like kriging is to approximate the true model. Once the SM is built up from the DOE of size n, it can predict an unlimited number of output realizations at negligible cost. Universal kriging assumes the output is a Gaussian process realization described as the sum of a trend term and a random term:Ŷ The term β T f (x) represents the process mean, composed of P basis functions f, and P regression coefficients β.
The term σ 2 Z (x) represents random noise, composed of a centered and reduced Gaussian random variable Z (x) multiplied by the process variance σ 2 .
If the process is stationary (i.e. constant yet unknown mean), a constant trend is sufficient (P = 1). In that case, kriging is called ordinary and the trend is written as: The core assumption of kriging is that outputs corresponding to neighboring inputs are more correlated than outputs from more distant inputs. Correlation functions, like the Matérn-5/2, are used to model the relationship between the correlation (R) and the Euclidean distance (h). Such functions are parametrized by a vector θ (with as many components as input variables): with D the number of input variables, the distance between two input points x and x is: The main task when building a kriging is to choose the best θ according to the DOE. Thanks to an optimization algorithm, like the Hybrid Genetic Algorithm and Gradient, the best θ may be found by minimizing the negative log-likelihood function: with S SM , the size of the DOE. The other unknowns (β and σ 2 ) are deduced from θ such as: where the matrix F is: Once the kriging is built, it can be used as a predictor at any new input x i in the design space. The corresponding new output follows a Gaussian distribution with mean: and variance: where: with r the vector of cross-correlations between a prediction point and every output point of the DOE: In this paper, we used the ''UQLab'' framework implementation of ordinary kriging. Additional theoretical details and practical implementation advices can be found in the ''UQLab'' user manual [27].

B. THE CONTROLLED STRATIFICATION
CS is a technique dedicated to find extreme quantiles. CS samples the input space in order to generate more extreme output events in much less trials than direct sampling.
CS requires beforehand a correlated simple model. In our case, this simple model is a kriging SM. From the total simulation budget of N , S SM realizations are reserved for building the SM and the remaining S CS realizations are used for the CS. We chose arbitrarily: The SM, once built, is used to predict a large number of output realizations. The predicted output sample is stratified, or sliced, hence the name ''stratification''. The strata limits are SM quantiles estimation, thus the adjective ''controlled''. The number of strata (n s ) is arbitrary as well as the strata's limits, except for the extreme stratum which has to be the targeted α-quantile. We used four strata. The limits are quantiles whose probabilities are grouped in a vector A: A budget of S CS simulations of the true model is allocated to the CS. Each stratum has a dedicated number of simulations N j , j = 1 . . . n s . We chose a uniform allocation strategy: A more advanced allocation strategy based on an adaptive allocation of realizations in each stratum is also available. However, adaptive allocation relies on the estimation of the optimal allocation, which does not always warrant performance improvement, in contrast with the uniform strategy [28].
The SM identifies N j output realizations, Y (j) belonging to the j th stratum. The model is called for each identified input in order to compute the true output Y (j) . If the correlation (as defined in [23]) is high enough, most of the Y (j) outputs remain in the j th stratum of the true model. In fact, CS performance relies rather on the high correlation associated to input sensitivity between the SM and the model than on the accuracy of SM prediction. In such conditions, an input producing an extreme event with the SM is likely to produce an extreme event of the true model. From the sample of outputs generated by CS, the output cdfF Y (y) is computed: Finally, the estimation of the targeted α-quantile is:

III. PROPOSED ALGORITHM
The goal of the algorithm is to estimate the SM performance improvement rate according to the sample size of the DOE. It aims at examining the relative change of statistical fluctuations (mean and standard deviation) of the quantile estimator. For example, if it is insignificant for smaller sample sizes of the DOEs, increasing the simulation budget is irrelevant. On the contrary, if this relative change is important, according to the user perspective, increasing the simulation budget might be useful. Note that this algorithm is mainly based on the observation of kriging performance. The KCS approach estimates a quantile thanks to the kriging prediction capability. Therefore, the quantile estimation performed with the CS improves accordingly.
The algorithm is described in Fig. 1. As indicated by the color code, it is composed of three main parts:

Maximal size DOE (red) 2. Monte Carlo Simulation of Surrogate Model (blue) 3. Controlled Stratification (violet).
Note that the true model is called in part 1 and 3 only. Therefore, the overall number of times (N ) the true model is called is Smax + S SM ,add in part 1 and +S cs + S cs,add in part 3.

A. MAXIMAL SIZE DOE
A DOE of maximum allowed sample size, Smax,is set up: the input sample X max is created from a LHS sampler [29]. Then, Smax calls of the initial model are performed in order to compute the corresponding Y max output sample.
The controlled stratification performance depends on the extreme realizations prediction capability of the SM. Indeed, the SM is responsible for identifying good candidates of the input space that should be in the targeted stratum. However, such a SM must be learned with a sample that contains at least one extreme realization to be reliable in the tail of the output distribution. In order to satisfy this criterion, we provide a simple approach based on the negative distribution [30].
The problem of counting output realizations above (or below if α < 0.5) the targeted α-quantile in a DOE of size N constitutes a N trials Bernoulli experiment where each output realization has a success probability p = min(α, 1 − α) to be higher (or below if α < 0.5) than the targeted α-quantile. Let K be a random variable that counts the number of failure k, i.e. realizations below (or higher if α < 0.5) the α-quantile, before observing n success trials. In that case, K follows a negative binomial distribution and its cdf is: The inverse cdf (also known as quantile function) of such distribution is used to compute the minimum sample size of the DOE, N min (at 95% confidence interval), to observe at least n extreme values: We used the Matlab function icdf in order to compute the inverse cdf. Some numerical values of N min are given in Table 2 as a function of the probability of the targeted α-quantiles and of the number of extreme values n. For instance, with 299 random realizations, the probability of getting at least one output below the 1% quantile is 95%.
An enrichment strategy allows adding more realizations to the DOE without breaking the LHS properties of the enriched DOE and, above all, without wasting previously computed samples [31]. This enrichment may be needed according to the Monte Carlo simulation of the SM as described in the following section. Moreover, the simulation budget may be increased for other reasons.

B. MONTE CARLO OF SURROGATE MODEL
From the maximum sample size of input and output realizations, a SM is built. Two indices are also initialized (ind and mc) as shown in Fig. 1. The index ind identifies the size of the DOE currently tested. Every tested size (always smaller or equal to the maximum size) is grouped in a vector of sizes S = [S 1 , . . . , S ind , . . . , Smax]. The choice of the sizes to be tested is not critical; the simplest strategy consists of equipartitioning within the range of possible sizes, according to Table 2, up to the maximal affordable size. The mc index serves as a counter in the Monte Carlo simulation loop. The selected SM is a kriging implemented in the UQLab framework as described in section II.1.
A new sample of X of size S ind is provided from the same underlying distribution associated with each input variable, using the same LHS sampler. The SM (and not the initial model) is called (at low computation cost) to predict the corresponding Y outputs. As a result, a new DOE execution is created without further runs of the initial model. From this new DOE execution, another SM is set up, which is used to estimate the targeted α-quantile.
This process is repeated MC times in order to perform a statistical analysis from MC α-quantiles associated with the sample size S ind . From this sample of quantiles, we compute its mean µ qind and standard deviation σ qind . Because we needed only a rough estimate of the first two moments, we chose MC = 20.
Eventually, we obtain a multivariate sample of estimated quantiles (one sample of quantiles for each tested sample size).
Two information can be inferred from the multivariate sample: 1. Detect if the chosen maximum sample size actually results in falling in the too small size region where performances are catastrophic. 2. Evaluate the rate of improvement of the SM observing the relative change of the statistics. In the first situation, the affordable size is clearly not large enough. This is not likely to happen if the DOE max size has been chosen wisely as discussed previously. VOLUME 8, 2020 The second situation falls into two categories. If the rate of change tends to be insignificant, the available maximum sample size of the DOE is suited for this quantile estimation. On the contrary, if it is significant, this is a strong indication that enriching the DOE (through LHS enrichment) would be profitable, using additional calls to the true model, if affordable.

C. CONTROLLED STRATIFICATION
A new DOE is created for applying the CS using the realizations of the kriging SM (called SM max ) built at the maximal size, Smax. The stratification of the output space with adequate quantiles is achieved with the SM, and the corresponding inputs realizations are identified. We chose four strata defined by (14) and a uniform allocation strategy as defined by (15).
The initial model is then called with the identified input realizations to compute the corresponding outputs. Finally, the empirical weighted output distribution is deduced with (16) and a quantile is computed according to (17).
In order to improve the precision of the CS quantile without computing an entire new stratified DOE, a simple enrichment strategy is used. Namely, from the same SM max , we add a few more predicted extreme realizations (allocated to the two last strata). The empirical weighted distribution is updated to compute another quantile. This process is repeated a few times.

IV. ALGORITHM VALIDATION A. VALIDATION METHODOLOGY
The algorithm described in Fig. 1 is partly based on a Monte Carlo simulation of the statistics of the α-quantile estimation performed with the SM as a function of the sample size of the DOE. This corresponds to the blue-colored part of the flowchart in Fig. 1. Note that a common and single DOE execution of maximum sample size is used. In this section, we validate our methodology by checking its sensitivity to the initial DOE of maximum sample size.
To that end, the algorithm depicted in Fig. 1 is run 10 times (from BEGIN to END), except the DOE enrichment and CS processes which are useless for this specific validation analysis. Moreover, we provide a pseudo-reference through a Monte Carlo simulation as described in Fig. 2 (with MC = 100). For each sample size of the vector S, the execution of Fig.2 provides a new DOE execution per Monte Carlo trial. Therefore, we obtain a pseudo-reference for quantile estimation for each component of S.
The maximum sample size is Smax = 770 for both algorithms, which are also run with the same vector S.
In order to carry out this validation step, we use two analytical models as toy examples. The first model deals with the calculation of the reflection coefficient (in linear) of a series R, L, C circuit with 4 random inputs. It exhibits a non-linear behavior due to its resonating nature. The second For this specific problem, we aim at identifying the 1% quantile that corresponds to very low magnitudes of the reflection coefficient S 11 at circuit's resonance. Figures 3 and 4 show the computed mean µ qind and standard deviation σ qind of the 1% quantile for each of the 10 executions of the flowchart in Fig. 2 as a function of the sample size of the DOE, respectively. The pseudo-reference obtained from the algorithm in Fig. 2 appears as a black curve. Colored dots enable the distinction between the 10 different executions of the flowchart of Fig. 1.
Both Fig. 3 and Fig. 4 exhibit two distinct regions. First, we identify a small sample size region, below 300, where the 1% quantile is overestimated with a large and unstable  standard deviation from one run to another. This is in line with expectations from Table 2. The second region corresponds to a rather monotonic convergence of the 1% quantile estimation. The mean and variance estimators remain slightly biased, but the rate of improvement is well estimated and in agreement with the pseudo-reference result. The proposed algorithm very clearly manages to distinguish the small size region. Beyond that region, the improvement rate is well estimated despite a slight bias.

2) PCB MODEL
The 11 random variable inputs are: the frequency (fr), geometrical characteristics (the substrate thickness (h), the trace width (W ) and length (l)), electrical characteristics (substrate permittivity (ε r ), voltage source (V s ), impedance source (Z s ), impedance load (Z l )), and the position where the electric field strengh is measured (spherical coordinates r,θ , and ϕ).  For this specific problem, we aim at identifying the 99% quantile that corresponds to extreme values of radiated field, i.e. at risk from an EMC point of view.
The results for this case study are plotted in Fig. 6 for the computed mean and in Fig. 7 for the standard deviation. Since we target a quantile in the upper tail, the mean is underestimated in opposition to the S 11 model case study. The standard deviation remains large for small size samples, but looks stable. The monotonic convergence of the 99% quantile estimator appears for rather small sizes but the rate of change is weak beyond a limit size of approximately 200. Clearly, the proposed algorithm ( Fig. 1) is successful, at a limited cost (a single DOE at maximum available size), to identify the trend of the estimation quality. This provides a very useful guidance for the user.

V. APPLICATION CASE
In this section, the full algorithm (including CS and LHS enrichments) is implemented in a radiation emissivity VOLUME 8, 2020 test scenario with epistemic uncertainties about the model parameters.

A. MODEL PRESENTATION
The model is a cavity with 2 rectangular holes made on the top face. A current flows on the external surface of a filled cylinder inside the cavity. Walls of the cavity as well as the cylinder are made of perfect electrical conductors.
The model is described by input parameters considered to be known with a constant value and 16 uncertain input variables. All input parameters and random variables are listed in Table 3. Each random variable follows a uniform distribution centered to their nominal value with a tolerance, arbitrarily set to +/ − 10% around the latter. Furthermore, there is no correlation between them. A 3D view of the geometry with the inputs set to their nominal value is represented in Fig. 8.   FIGURE 8. Geometry of the cavity with two apertures on its top face. A current source, i , is located at the surface of a cylinder inside the cavity. The output is the maximum electric field strength radiated at distance d .
The output of interest is the maximal radiated electric field strength (E) on a sphere at a distance d (with reference to c in Fig. 8) of the cavity position. The nominal output (i.e. from inputs set to their nominal value) is equal to 0.01 V /m. The targeted quantile of E is the 99% quantile.

B. REFERENCE SAMPLE
In order to have a reference sample of the output, the HFSS software is used to generate inputs sample and compute the corresponding output. The simulation was stopped after 72 hours providing a DOE of size 3225. As there are a finite number of outputs, the 99% quantile computed from the reference sample is still an estimation. In order to assess the estimation uncertainty of this reference of the 99% quantile, we have used bootstrapping [2] which allows computing likelihood bounds of the reference quantile. Figure 9 shows a histogram of the reference sample output distribution. The mean of the bootstrapped sample of 99% quantiles is plotted in Fig. 10 with its bootstrap confidence bounds (2.5% and 97.5% quantiles of the bootstrapped sample of quantiles). This reference mean and bounds appear as three horizontal lines in Fig. 10.

1) INITIAL DOE
The simulation of one realization (inputs set to their nominal value) takes 30 seconds. As far as the 99% quantile is concerned, Table 2 requests a sample size of 299 (n = 1) which would result, under the linear hypothesis, in a 2.5 hours  simulation. For this specific example, we assume that such a budget of simulation is obviously affordable.

2) SM MONTE CARLO SIMULATIONS AND DOE ENRICHMENTS
After the computation of the DOE (Smax = 299) and of the output of the initial HFSS model (red colored section of Fig. 1), we run the SM MC simulation section of our algorithm (blue colored section of Fig. 1) for 10 sizes between 10 and 299. The mean of the quantile is plotted in Fig. 10 (in blue color for sample sizes between 10 and 299) with error bars (a bar width is equal to 3.92 times the empirical standard deviation).
Considering this initial sample size of 299, the blue portion of the curve in Fig. 10 shows that the rate of change of the estimation tends to be limited above 150 realizations. We may therefore conclude that 299 realizations are adequate to yield a reasonable approximation before applying the CS approach. We may however expect some improvement for higher sample sizes.

3) ENRICHING THE DOE
Since the HFSS model runs fast enough, we assume we can afford the enrichment of the initial DOE before applying the CS. Back to table 2, we have chosen to increase the sample size of the DOE so that we probably find at least 2 extreme output values, which yield to a new total sample size of 473 (n = 2). Thus, we add 174 more realizations to the previous DOE which lead to a new DOE of 473 realizations. Therefore, the SM MC simulation is carried out with different sizes ranging between 299 and the new maximum size of 473. The blue part of the curve of Fig. 10 is now prolonged with a new portion in magenta color with its corresponding errors bars. The improvement rate is clearly slow. However, there is a discontinuity observed at a sample size of 299 between the blue and magenta curve. Indeed, results shown in blue all originated from a SM built with a sample size of the DOE of 299 whereas the results shown in magenta color are predicted from a SM built at a sample size of the DOE of 473. Therefore, a perfect continuity would have been in fact surprising.
For sake of confirmation of the slow improvement rate, we have carried out a second and final enrichment. Therefore, we have added another set of 155 realizations, raising the total simulation budget to 628 (n = 3). The same type of SM MC simulation is performed with sizes ranging from 473 to 628 and the result appears in the green portion of the curve in Fig. 10. There is also a discontinuity at 628 for the same reason mentioned above.

4) FINAL QUANTILE ESTIMATION WITH CS
CS is performed in association with the SM built from 628 realizations. The CS enables to record 628 new realizations to accelerate the quantile estimation convergence. Therefore, the quantile estimated with CS is carried out from a total budget of 1256 realizations and is plotted in a red unfilled circle in Fig. 11. Finally, in order to check the stability of the quantile estimation by the CS, a few more realizations are allocated in the VOLUME 8, 2020 last stratum. The quantiles estimated by this CS enrichment strategy are plotted also in red but filled circles.
The CS estimated quantiles in Fig. 11 oscillate around the reference confidence upper bound also plotted in Fig.11. The variation is negligible compared to the width of the reference confidence bound, which confirms the relevance of our algorithm.
The entire number of calls to the true model is 1296, which represents 40% of the reference sample size (3225). It could have been further reduced avoiding the unnecessary final enrichment of the supporting surrogate model. Moreover, the number of realizations above the estimated quantile was 32 with the reference sample and 125 with the KCS method (number of realizations above 0.01931071, i.e. the final estimated quantile). More generally, the CS enables to reduce the estimation bias of the standalone SM model.

D. SENSITIVITY ANALYSIS OF EXTREME VALUES
Thanks to the SM built at 628 and the application of the KCS method, we are able to identify a large sample of extreme events. We use this information to analyze the range of inputs that are likely to produce them. This information provides a useful physical insight to improve the design.
In Fig. 12, we provide a selection of histograms attached to 4 particular inputs. These histograms are normalized for visualization purposes such that the sum of the bar areas is equal to 1 (i.e. pdf normalization). The blue colored histograms represent the inputs causing the reference output sample in Fig. 9. Each variable follows a uniform distribution with 10% variation off their nominal as intended (see Table 3). The orange colored histograms represent the distribution of inputs that generated extreme values (i.e. values above the quantile estimated by the CS). A uniform distribution is not informative. This is the case with the a1 and b1 variables (length and width of the first hole). An impact may exist in conjunction with other variables but cannot be seen through these histograms. A global sensitivity analysis with Sobol indices [33] would be needed. However, we may conclude that this aperture does not determine itself the probability of obtaining extreme values of radiated emissions.
On the other end, if the distribution is clearly not uniform, that definitely proves a strong impact. This is the case for b2 and yt1 (second hole length and first hole position). For example, an extreme output will not be observed if yt1 > 48 mm, no matter the other inputs values. The height of the cavity (i.e. the distance from the current source) and the length of the aperture in front of the source play a key role with regard to strong radiated emissions.
Consequently, the decrease of b2 and increase of yt1 are the preferred options to reduce radiation outside the cavity rather than modifying a1 and b1.

VI. CONCLUSION
Extreme quantiles provide useful information for EMC risk analysis. In this paper, we estimated extreme quantiles of the output distribution when the electromagnetic model is costly to simulate. We introduced a practical implementation of a new approach called KCS, which associates kriging to controlled stratification. The proposed algorithm answers the practical question about the simulation budget needed to reach a sufficient quality for extreme quantile estimation. In addition, this algorithm provides useful insight about the improvement rate based on quantile estimation with smaller simulation budget. The critical part of the algorithm has been validated on analytical models. Then, we have applied the complete calculation procedure to an EMC case study. In that case, we showed that it clearly outperformed the classical Monte Carlo approach. In particular, KCS provides a significant number of extreme outputs for either sensitivity analysis or extreme value distribution analysis. In addition of its interesting performances, the proposed algorithm can be very easily implemented. Kriging is already implemented in most popular programing languages (i.e. Matlab, Python and R). Although the Controlled Stratification is not, it can be easily done.
However, for models in very high dimensions (couple of dozens), building a kriging surrogate model may be not considered computationally negligible so the proposed algorithm may not efficient. An interesting perspective would be to adapt the kriging through partial least squares approach as in [34] for very high dimensions.