A Spline-High Dimensional Model Representation for SRAM Yield Estimation in High Sigma and High Dimensional Scenarios

Traditional Static Random-Access Memory (SRAM) yield estimation through Monte Carlo analysis is extremely time-consuming because it runs millions of expensive transistor-level simulations to reach the specified precision, especially for large-scale circuits. In this paper, we develop an efficient yield analysis framework by integrating our novel performance meta-model into a state-of-the-art importance sampling method. The meta-model, named Spline-High Dimensional Model Representation (SP-HDMR), substitutes for the expensive transistor-level simulations in yield estimation. SP-HDMR provides a computationally efficient formula expansion that uses spline functions as kernels to describe the varied relations between the process parameters and the SRAM read access delay. An adaptive sampling method with sparsity analysis is developed to support SP-HDMR modeling. Experiments on 40nm SRAM circuits validate the accuracy and efficiency of the proposed yield analysis framework based on our SP-HDMR model, which achieves a 1.3X to 5X speedup over other state-of-the-art methods within 9% relative error.


I. INTRODUCTION
As semiconductor technology continues to advance, SRAM cells designed with minimum sizes are increasingly susceptible to process fluctuations [1]. As a result, yield degradation has become a bottleneck for robust SRAM design [1]-[4]. To guarantee a robust design, traditional corner-based analysis methods lead to overly pessimistic results at the worst-case corner and have to be verified on thousands of corners if more process parameters are considered. Thus, statistical methods are required to make yield estimation reasonably realistic. An SRAM chip typically consists of millions of SRAM cells, so to ensure an acceptable chip yield, the failure rate of each cell must be extremely low. For a 1Kbit SRAM chip with 99% yield, the failure rate of an SRAM cell should be lower than $10^{-5}$.
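A quick numeric check (not from the paper) makes the chip-yield arithmetic concrete: with 1 Kbit = 1024 independent cells and a 99% chip-yield target, $(1 - p_{cell})^{1024} \ge 0.99$ bounds the per-cell failure rate near $10^{-5}$.

```python
# Sketch of the per-cell failure-rate bound implied by a 99% chip yield
# over 1024 independent cells; the independence assumption is ours.
p_cell_max = 1 - 0.99 ** (1.0 / 1024)
print(f"max per-cell failure rate: {p_cell_max:.2e}")
```

Solving the inequality gives roughly $9.8 \times 10^{-6}$, consistent with the $10^{-5}$ figure quoted above.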
To estimate the failure rate of SRAM accurately, many statistical methods have been proposed. Among them, standard Monte Carlo (MC) analysis is the traditional method and remains the gold standard. It samples the whole variation space directly and simulates each sample at the transistor level to get the corresponding performance. However, it is extremely time-consuming for estimating SRAM failure rates due to the huge number of simulations required, e.g., over $10^7$ simulations for a 4-sigma yield result. Besides, in an SRAM array, a dynamic functional failure, such as a read delay failure (defined in our work as a read operation exceeding a specified time), depends not only on the state of the weakest cell but also on the states of the other cells in the same column [5]. Hence, we must consider the process variation of all these cells to estimate the SRAM yield. As a result, this brings a high dimensional variation space for SRAM yield estimation and an extremely expensive computational overhead per simulation.
To accelerate the traditional MC method, many statistical approaches have been proposed, which can be grouped into two categories. Importance Sampling (IS): The basic idea of IS is to find a distorted sampling distribution that samples near the failure region. The efficiency and accuracy of IS-based methods depend heavily on this distorted distribution. Most approaches [6]-[17] construct such a distribution by shifting the mean vector of the original distribution to the boundary or center of the failure region. Recently, Shi and Liu et al. [6] proposed Adaptive Importance Sampling (AIS), which improves the tolerance of poor initialization by searching the failure boundary dynamically over resampling iterations through an unbiased estimator. However, these methods are infeasible in high dimensional scenarios because the likelihood ratios between the original and distorted sampling distributions suffer huge numerical instability [18], [24]. Wu and Gong et al. [9] proposed High Dimensional Importance Sampling (HDIS) to address this issue by constructing a new subset to calculate the failure rate indirectly, but it consumes a large number of transistor-level simulations to converge. Unlike most digital circuits, which can be efficiently analyzed at the gate level, SRAM, like most analog/mixed-signal circuits, must be simulated at the transistor level. Although IS methods can decrease the number of simulations, the time for a single simulation is still large.
Meta-Modeling: To further speed up yield estimation, many works [13], [16], [17], [31], [32] construct a meta-model to replace the expensive transistor-level simulations by mapping the process variation to the circuit performance metrics. The works [13], [16], and [17] leverage Polynomial Regression, Gaussian Processes, and Radial Basis Networks respectively, which show good accuracy in low dimensional scenarios. However, these meta-models suffer from two challenges.
(I) High Dimensionality: Most works [8], [10], [16] only consider the effect of the threshold voltage $V_{th0}$, while other process variations, such as the offset voltage $V_{off}$ (which compensates threshold voltage shifting in the BSIM model) and the electron mobility $u_0$, also have significant effects on SRAM performance. These parameters increase the number of unknown coefficients in the model construction. Fig. 1 shows the simulation results of SRAM read access delay with different variations near the failure region, where 1000 samples are sorted by their read performance. There is a huge difference between the simulation results of samples that only consider $V_{th0}$ and those that consider all parameters. Furthermore, the number of transistor-level simulations required by the meta-models [13], [16], [17] grows exponentially with the total number of process parameters to obtain acceptable accuracy. For example, there are a total of 4608 process parameters in a 256-depth 6T SRAM column when considering only the variations of $V_{th0}$, $V_{off}$, and $u_0$ for each transistor. Other state-of-the-art works, such as [31] and [32], construct computationally efficient meta-models by utilizing the sparsity of the underlying problem. However, the yield estimation result is very sensitive to the accuracy of the meta-model, which still needs an extremely large training set to converge.
(II) Discontinuity: For SRAM yield estimation, the model should guarantee accuracy in the high-sigma region of SRAM performance. Thus, a wide range of process parameter variation needs to be considered in performance meta-modeling. However, variation in a certain direction (e.g., a positively increased $V_{off}$) will increase the read access delay and can even cause the simulation to fail, meaning the simulator cannot read a value in these cases. In Fig. 1, the 1000 samples are sorted by their read access delay, and we found that the simulation fails at about the 900th sample due to the too large process variation. Notice that the maximum continuous result is $5.8 \times 10^{-8}$. To facilitate our statistics, we set the results of these failed samples to a constant, e.g., ``$1 \times 10^{-7}$'' in Fig. 1; the constant can be set to any value larger than the maximum continuous result. This phenomenon has two causes. First, the reliability of bit cells is no longer guaranteed at low supply voltages, resulting in extreme cases where the data stored in the cells cannot be read out correctly due to the large process variation. Second, the transient analysis time is limited in the .TRAN statement; it is unrealistic to set an overly long analysis time in the expensive simulation. Instead, the time limit is set to the wordline enable time in practical SRAM design. The discontinuity region brings a huge challenge to performance modeling. Most previous works [30]-[32] try to construct a high dimensional model to approximate the circuit performance metric by decomposing $f(x)$ into a combination of $N$ orthonormal basis functions. Once the basis functions are determined, the modeling problem is transformed into solving for the model coefficients. Such algorithms terminate upon finding a set of coefficients that minimizes the loss function, typically the Mean Square Error (MSE).
However, these regression-based models can only fit continuous performance because of their nature of approaching all training samples to minimize the MSE. It is infeasible to apply them to a scenario that is simultaneously high dimensional and discontinuous, as in SRAM read performance modeling.
In this paper, a statistical SRAM performance meta-model is constructed and applied to SRAM yield estimation. Our contributions are summarized as follows: • We developed a Spline-High Dimensional Model Representation (SP-HDMR) to replace expensive transistor-level simulations in yield estimation. The model is based on Sobol's theory [21], which decomposes a high dimensional problem into multiple low dimensional ones. Meanwhile, a strategy of adaptive modeling with sparsity analysis is developed to further minimize the number of unknown coefficients of SP-HDMR in high dimensional scenarios. The spline function is chosen as the kernel of SP-HDMR and properly trained to address the discontinuity problem caused by the large variations of process parameters within limited analysis time.
• A yield analysis framework is developed by integrating our SP-HDMR model into a state-of-the-art importance sampling method to replace expensive transistor-level simulations. It drastically reduces the yield estimation overhead compared to both the traditional MC method and the pure IS-based method, and its accuracy is guaranteed by a re-simulation technique. The rest of this paper is organized as follows. In Section II, the rare event analysis problem and related works are revisited. The yield analysis framework based on SP-HDMR is introduced in Section III. Section IV provides details of the construction of SP-HDMR. The accuracy and efficiency of our method are demonstrated by several experiments in Section V. Section VI concludes the paper.

II. BACKGROUND

A. RARE EVENT ANALYSIS
The $n$ process variables $P = [p_1, p_2, \ldots, p_n]$ are modeled as a vector of independent Gaussian variables $x = [x_1, x_2, \ldots, x_m]$ by Principal Component Analysis (PCA) [26] in a commercial Process Design Kit (PDK). Without loss of generality, each variable is normalized to a standard Normal, and $H(x)$ is the joint probability density function (PDF) of $x$. Let $f(x)$ be the performance metric of interest, measured through expensive transistor-level simulation; in our work it is the SRAM read access delay.
For the failure rate evaluation of SRAM, we denote $S$ as the tiny failure region and say that the circuit performance does not meet the specification when $f(x) \in S$. We further introduce the indicator function $I(x)$ to identify pass/fail of $f(x)$:
$$I(x) = \begin{cases} 1, & f(x) \in S \\ 0, & f(x) \notin S \end{cases} \tag{1}$$
Therefore, the failure probability can be calculated as:
$$P_{fail} = \int I(x) H(x)\,dx \tag{2}$$
Unfortunately, formulation (2) is difficult to calculate analytically because the exact distribution of $I(x)$ is unknown. Traditionally, Monte Carlo is used to estimate the failure probability by sampling from $H(x)$ directly, giving the unbiased estimate of $P_{fail}$:
$$P_{MC,fail} = \frac{1}{N}\sum_{k=1}^{N} I(x_k) \tag{3}$$
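The MC estimator above can be sketched in a few lines. This is an illustrative toy, not the paper's setup: the "performance metric" is a cheap analytic stand-in for a transistor-level simulation, chosen so the true tail probability is known.

```python
import numpy as np

# Minimal sketch of the MC estimator (3): sample x ~ H(x) (standard
# Normal), evaluate the performance, and average the indicator I(x).
def mc_failure_rate(f, spec, dim, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, dim))   # x ~ H(x)
    return np.mean(f(x) > spec)                 # (1/N) * sum of I(x_k)

# Toy "delay": a normalized linear combination, itself standard Normal,
# so the true failure rate for spec=3.0 is the Normal tail ~1.35e-3.
toy_f = lambda x: x.sum(axis=1) / np.sqrt(x.shape[1])
p = mc_failure_rate(toy_f, spec=3.0, dim=16, n_samples=1_000_000)
```

Even this mild 3-sigma event needs a million samples for a stable estimate; a 4-sigma SRAM failure rate pushes the required count past $10^7$, which is the motivation for the IS and meta-modeling techniques below.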

B. HIGH DIMENSIONAL IMPORTANCE SAMPLING
For SRAM failure rate estimation, $f(x) \in S$ is a rare event.
Standard MC needs hundreds of millions of expensive circuit simulations to capture such a ``rare event''. Although MC can be run in parallel mode, it is still a time-consuming process.
The IS methods reduce the number of simulations by constructing a ``distorted'' PDF $G(x)$ that generates samples near the failure region. The failure probability can be expressed as (5):
$$P_{fail} = \int I(x)\frac{H(x)}{G(x)}G(x)\,dx = \int I(x)\,w(x)\,G(x)\,dx \tag{5}$$
where $w(x) = H(x)/G(x)$ denotes the likelihood ratio between the original PDF $H(x)$ and the distorted PDF $G(x)$, which compensates for the discrepancy between them. An unbiased IS estimator $P_{IS,fail}$ can be calculated as (6):
$$P_{IS,fail} = \frac{1}{N}\sum_{k=1}^{N} I(x_k)\,w(x_k), \quad x_k \sim G(x) \tag{6}$$
With a proper $G(x)$, $P_{IS,fail}$ approximates the MC result. However, the likelihood ratio $w(x_k)$ shows huge numerical instability in the high dimensional scenario [18]: some $w(x_k)$ become dominant or even infinite, so the estimate in equation (6) becomes unreliable. Wu et al. [9] proposed a provably bounded failure analysis method, High Dimensional Importance Sampling (HDIS). The basic idea of HDIS is to set a new threshold $t$ with $t < t_c$, such that $f(x) > t$ is not a ``rare'' event but dominates the ``rare event'' $f(x) > t_c$. Hence the failure rate of SRAM can be estimated as:
$$P(f(x) > t_c) = P(f(x) > t)\cdot P(f(x) > t_c \mid f(x) > t) \tag{7}$$
$P(f(x) > t)$ can be calculated by MC, while $P(f(x) > t_c \mid f(x) > t)$ is less than 1 by conditional probability theory [37], which avoids the huge numerical instability of estimating $P(f(x) > t_c)$ directly in high dimensions. However, HDIS still needs a large number of expensive transistor-level simulations to converge to a stable result.
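A one-dimensional sketch shows the mechanics of (5)-(6): sample from a mean-shifted Normal $G$ centered on the failure boundary and weight each hit by the likelihood ratio. The threshold and shift values are illustrative, not taken from the paper.

```python
import math
import numpy as np

# Sketch of mean-shift importance sampling for P(x > t_c), x ~ N(0,1).
def norm_pdf(z, mu=0.0):
    return np.exp(-0.5 * (z - mu) ** 2) / math.sqrt(2 * math.pi)

def is_failure_rate(t_c, shift, n, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(shift, 1.0, n)              # sample from distorted PDF G
    w = norm_pdf(x) / norm_pdf(x, shift)       # likelihood ratio w(x) = H/G
    return np.mean((x > t_c) * w)              # unbiased estimator (6)

p_is = is_failure_rate(t_c=4.0, shift=4.0, n=100_000)
p_true = 0.5 * math.erfc(4.0 / math.sqrt(2))   # exact Normal tail, ~3.17e-5
```

With the shift at the boundary, $10^5$ samples suffice for a few-percent relative error on a $3 \times 10^{-5}$ event; in high dimensions, however, the ratio $w(x)$ is a product over all coordinates and becomes numerically unstable, which is exactly the failure mode HDIS works around.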

C. META-MODELING
Although IS methods reduce the number of samples to a certain degree, the expensive simulation overhead still makes estimation time-consuming, especially when the circuit size is large. IS alone is not enough to decrease the estimation cost drastically compared to parallel MC.
Sobol [21] proposed the High Dimensional Model Representation (HDMR), which decomposes an integrable function into a sum of low dimensional ones and thereby greatly improves computational efficiency. It can be formulated as:
$$f(x) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j) + \cdots + f_{12\ldots n}(x_1, x_2, \ldots, x_n) \tag{8}$$
where $f_0$ is a constant measuring the zeroth-order effect of the variable vector $x$ on the circuit response $f(x)$, and $f_i(x_i)$ represents the first-order effect of a single variable $x_i$ acting independently upon $f(x)$. Similarly, $f_{ij}(x_i, x_j)$ represents the second-order effect of the variables $x_i$ and $x_j$ on $f(x)$, and the later terms capture the higher-order effects. The HDMR expansion aims to represent multivariate functions arising in physical contexts rather than arbitrary function interpolation [19]. It gives us room to carefully analyze the effects of different variables and to train proper basis functions that characterize these effects. The accuracy and convergence speed of HDMR are determined by its basis functions and the corresponding modeling method. Commonly used basis functions, such as Polynomials [24] and Gaussian Processes [36], are not accurate enough to handle the discontinuity of the SRAM read access delay.

III. YIELD ESTIMATION BASED ON SP-HDMR MODEL
The framework of the proposed yield analysis method, named HDMRIS, is summarized in Algorithm 1. We integrate our SP-HDMR model into HDIS [9] to further speed up yield estimation. The initialization step and the failure rate calculation follow HDIS [9] as described in the background. The pre-training samples generated by the distorted probability distribution in procedure 2.1 are used to determine the cut point $x_0$ and the training scope of the variables in Algorithm 1. After the construction of SP-HDMR, most samples in step 3 are predicted by the model, and the failure rate is calculated as in equation (7) in step 3.4. Notice that transistor-level simulations are only used to evaluate the samples whose predictions fall within the predefined range of the failure specification in step 3.2, because IS has strict accuracy requirements on the boundary of the failure region. To illustrate this, review the failure rate calculation of importance sampling in (6): whenever the model prediction wrongly exceeds the failure specification, $I(x)$ is incorrectly judged as 1. This accumulates an unnecessary likelihood $w(x_i)$ in $P_{IS,fail}$, resulting in a large error in the final estimation.
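The re-simulation guard of step 3.2 can be sketched as follows. All numbers here are made up for illustration; only the 7% band mirrors the experiment, and `simulate` stands in for the expensive transistor-level simulator.

```python
import numpy as np

# Sketch of step 3.2: trust the SP-HDMR prediction except inside a band
# around the failure spec, where the simulator re-evaluates the sample
# before the indicator I(x) is accumulated.
def pass_fail(pred, spec, simulate, band=0.07):
    fails = pred > spec
    near = np.abs(pred - spec) <= band * spec   # ambiguous predictions
    fails[near] = simulate(near) > spec         # re-simulate only these
    return fails

pred = np.array([1.06, 0.95, 1.04, 2.00])       # model predictions
true = np.array([0.98, 0.90, 1.02, 2.00])       # "simulator" ground truth
result = pass_fail(pred, 1.0, simulate=lambda m: true[m])
```

The first sample, wrongly predicted as failing, is corrected by re-simulation, while the sample far above the spec is accepted without a simulator call; this is how only a small fraction of the sample set (532 samples in the experiment) needs re-evaluation.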
As the black and red lines in Fig. 2 show, the convergence results of yield estimation using transistor-level simulations (HSPICE results) and SP-HDMR predictions alone are completely different. As the blue and green lines show, estimation with the re-simulation technique greatly reduces the prediction error; re-simulation within a 7% range of the failure specification performs best in this comparison. In our experiment, there are only 532 re-evaluated samples in total, which is a small fraction of the whole sample set. The range of re-simulation depends on the accuracy of the SP-HDMR: the more accurate the meta-model, the smaller the range that satisfies the requirement of importance sampling.

IV. SPLINE HIGH DIMENSIONAL MODEL REPRESENTATION

A. ALGORITHM OVERVIEW
In most well-defined physical systems, only relatively low-order correlations among input variables have significant impacts on the output [19]. Besides, the process variation variables $x = [x_1, x_2, \ldots, x_n]$ have been modeled as independent ones by principal component analysis (PCA) [26] in the PDK. Hence, we retain the top two order terms in (8) to balance complexity against accuracy. As shown in Table 1, there is almost no difference between the results predicted with the top two order terms and with the top three order terms, whereas the training cost grows drastically with the number of higher-order interaction terms.
The proposed SP-HDMR can thus be formulated as:
$$f(x) \approx f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j) \tag{9}$$
where $f(x)$ denotes the SRAM read access delay. The variation range of each $x_i$ is set to $\pm 8\sigma$, which corresponds to a probability of $10^{-15}$, meaning samples outside this range are infeasible.
There is no unique expansion for equation (9) [19], and the modeling cost heavily depends on the chosen expansion. The cut-HDMR expansion [19] provides an exact representation of equation (9) along a hyperplane passing through the ``cut'' point, or reference point: $f(x)$ is represented by superposing the values of $f(x)$ on the lines, planes, and hyper-planes passing through the cut point. As a result, each term of (9) can be modeled as:
$$f_0 = f(x_0), \qquad f_i(x_i) = f(x_i, x_0^i) - f_0, \qquad f_{ij}(x_i, x_j) = f(x_i, x_j, x_0^{ij}) - f_i(x_i) - f_j(x_j) - f_0 \tag{10}$$
where $x_0 = [x_{10}, x_{20}, \ldots, x_{n0}]^T$ is the cut point and $x_0^i = [x_{10}, x_{20}, \ldots, x_{(i-1)0}, x_{(i+1)0}, \ldots, x_{n0}]^T$ represents the vector $x_0$ without the element $x_{i0}$ (similarly, $x_0^{ij}$ omits both $x_{i0}$ and $x_{j0}$).
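The first-order part of (10) is easy to implement and check numerically. The sketch below (with an illustrative additive test function, not a circuit metric) builds $f_0$ and the $f_i(x_i)$ terms by evaluating $f$ along lines through the cut point; for a function with no variable interactions, the first-order cut-HDMR expansion is exact.

```python
import numpy as np

# Sketch of a first-order cut-HDMR surrogate per (10):
#   f0 = f(x0),  f_i(x_i) = f(x_i, x0 without i) - f0.
def cut_hdmr_first_order(f, x0):
    f0 = f(x0)
    def surrogate(x):
        total = f0
        for i in range(len(x0)):
            xi = x0.copy()
            xi[i] = x[i]              # vary only coordinate i
            total += f(xi) - f0       # first-order term f_i(x_i)
        return total
    return surrogate

f = lambda x: 1.0 + 2 * x[0] + x[1] ** 2 - 0.5 * x[2]  # additive test fn
x0 = np.zeros(3)
fhat = cut_hdmr_first_order(f, x0)
x = np.array([0.3, -1.2, 2.0])
```

Each first-order term costs only evaluations along a single line, which is why the expansion scales to thousands of process parameters; second-order terms add evaluations on planes through the cut point.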
The cut-HDMR expansion only involves simple arithmetic computation, which gives us the flexibility of selecting basis functions. In this work, spline basis functions and an adaptive training method with sparsity analysis are developed to minimize cost, as discussed in the next subsection.

B. IMPLEMENTATION DETAILS

1) BASIS FUNCTION SELECTION
The key to constructing an accurate HDMR is to obtain proper basis functions that measure each order effect accurately. Due to the too large process variations in the failure region, the simulations often fail to produce numerical values. Here, we define SRAM read access failure as the read access delay exceeding $4.8 \times 10^{-8}$ seconds ($4.5\sigma$) at 0.7V; the word ``failed'' is then written into the final data file. For modeling convenience, these discontinuous responses are represented by a constant. However, this brings a ``jump'' into the trend of the performance metric, i.e., a step point as marked in Fig. 3 a). The characteristic of regression functions, such as Polynomials [24] and Gaussian Processes [32], is to approach all data points as closely as possible to minimize some cost (e.g., the squared error), which makes them fail to fit discontinuous curves.
To address this issue, we note that an interpolation method passes through all samples and can model discontinuous functions given proper intervals. Among the different interpolation functions, the spline function $p(x), x \in [a, b]$ is the most widely used interpolant for its smoothness and robustness (high order interpolation leads to the ``Runge phenomenon'' [38]). It is defined by piecewise third-order polynomials:
$$p(x) = p_k(x) = a_k + b_k(x - x_k) + c_k(x - x_k)^2 + d_k(x - x_k)^3, \quad x \in [x_k, x_{k+1}] \tag{11}$$
where $p(x_i)$ is the spline basis function of the $i$th variable $x_i$. If the training samples are far away from the step point, $p(x_i)$ will have a relatively large error near the step point; hence the training samples must be carefully generated in our adaptive modeling method, which is discussed in the next subsection. Fig. 3 compares the fitting effects of different basis functions for three process parameter variables, the offset voltage $V_{off}$, the threshold voltage $V_{th0}$, and the electron mobility $u_0$, using 50 training samples near the failure boundary. The ``$f(x_i, x_0^i) - f_0$'' on the y-axis is the first-order effect of each variable on the SRAM read access delay, and 100 sample predictions are sorted by the normalized variable. As shown in Fig. 3 a), the SRAM read access time fails to be evaluated by transistor-level simulation in the tail of $V_{off}$. The variable $V_{off}$ affects the drain-source current of the MOSFET [27]: a too large variation of $V_{off}$ may lower the gate-source voltage so that the channel between drain and source cannot open and $I_{ds} \approx 0$. This is why the simulation fails to produce an exact SRAM read access delay. As a result, all basis functions fail to predict the correct effects except for the spline function. Notice that the trend of the two regression functions, Gaussian Process and Polynomial, is like a ``balancing'' process; the discontinuous results make them ``balance'' wrongly to average out the overall error.
In contrast, the spline function, as an interpolant, successfully jumps to the discontinuous value ``$1 \times 10^{-7}$'' because it must pass through all training samples. As shown in Fig. 3 b), the spline function also outperforms the other methods in measuring the effect of $V_{th0}$ on the performance with 50 training samples, although there is strong nonlinearity between $V_{th0}$ and the SRAM read access delay. All functions fit the linear effect of $u_0$ perfectly, as shown in Fig. 3 c).
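The contrast between interpolation and regression on a stepped response can be reproduced with a small sketch. The step location, delay scale, and the $1 \times 10^{-7}$ "failed" constant below are illustrative, not the paper's data.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Toy delay curve: smooth growth, then a plateau at the "failed" constant.
x = np.linspace(-3.0, 3.0, 50)
y = np.where(x < 2.0, 1e-9 * np.exp(0.5 * x), 1e-7)

spline = CubicSpline(x, y)                        # passes through every sample
poly = np.polynomial.Polynomial.fit(x, y, deg=3)  # global least-squares fit

x_probe = 2.9                                     # inside the "failed" region
err_spline = abs(spline(x_probe) - 1e-7)
err_poly = abs(poly(x_probe) - 1e-7)
```

The interpolating spline sits on the plateau away from the jump, while the least-squares cubic "balances" the discontinuity across the whole range and misses the plateau; this mirrors the Fig. 3 a) comparison.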
Notice that weighted regression can also fit the curve by penalizing deviations near the discontinuous region more heavily. The goal of weighted regression is to ensure that each data point has an appropriate level of influence on the parameter estimates, which requires knowing exactly what the weights are. However, the optimal weights, based on the true variances of each data point, are never known, so estimated weights must be used instead, and the effect of using estimated weights is difficult to assess. When the weights are estimated from a small number of training samples, the results of an analysis can be badly and unpredictably affected. As shown in Fig. 4, weighted regression with 200 training samples can fit the discontinuous region accurately, whereas the spline takes just 50 training samples. Notice that there is no need to consider model sensitivity to noise, because we only estimate the deterministic effects of process parameters on the circuit responses without injecting any noise in this work.

2) ADAPTIVE MODELING METHOD
To reduce the training cost, we developed an adaptive sampling strategy with sparsity analysis to support the proposed model. Compared with the traditional method of collecting a large number (or even all) of the required samples at once by a certain algorithm, the proposed strategy collects only one or a few samples at a time according to the different impacts of the variables on the performance metric. It can be viewed as a local sensitivity analysis of these variables before constructing their basis functions. In SP-HDMR's training method, each term of (10) is trained in turn until convergence. Given that it is important to fit the tail of the delay distribution for estimating the SRAM failure rate accurately, the cut point is chosen as the mean of the samples near the failure region.
We find that the majority of variables have only weak effects on the SRAM read access delay, which can be viewed as a sparsity constraint on SRAM performance modeling. To exploit this sparsity, we execute the sparsity analysis for each variable in steps 2 and 3. If the slope of $f_i(x_i)$ in step 3 is zero, the variable $x_i$ does not affect the SRAM read access delay and can be filtered out of the modeling. In another scenario, if $f_i(x_i)$ is a straight line through $x_0$, a linear function is sufficient to model the relationship between the variable $x_i$ and the target performance metric. Otherwise, $f_i(x_i)$ is reconstructed with the spline basis function.
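The three-way screen above can be sketched as follows. Probe locations, tolerances, and the toy metric are illustrative assumptions, not the paper's procedure verbatim.

```python
import numpy as np

# Sketch of the sparsity screen: probe each variable along its cut-HDMR
# line, drop variables with no effect, keep a linear term when the effect
# is straight, and flag the rest for spline refinement.
def screen_variable(f, x0, i, probes=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    f0 = f(x0)
    ys = []
    for v in probes:
        xi = x0.copy()
        xi[i] = v                       # move only along coordinate i
        ys.append(f(xi) - f0)           # first-order effect samples
    ys = np.array(ys)
    if np.allclose(ys, 0.0, atol=1e-9):
        return "drop"                   # no effect: filter out
    slope = (ys[-1] - ys[0]) / (probes[-1] - probes[0])
    if np.allclose(ys, slope * np.array(probes), atol=1e-9):
        return "linear"                 # a linear term suffices
    return "spline"                     # needs a spline basis function

f = lambda x: 2.0 * x[0] + np.tanh(x[1])   # toy metric; x[2] is inert
x0 = np.zeros(3)
```

Applied to the toy metric, the screen keeps a linear term for $x_0$, flags the saturating $x_1$ for spline modeling, and drops the inert $x_2$, which is the mechanism that keeps the model small in a space of thousands of parameters.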
To characterize the discontinuous effects of the variables, the training samples must cover a relatively wide range crossing both the continuous and discontinuous regions. Hence, we adopt Latin-hypercube sampling [35] in step 4 to generate the initial samples across different intervals of the cumulative probability density for each variable. For efficiency, we start sampling from 10 intervals empirically and increase the number of intervals until finding the split point $x_i^{(k\text{-}dis)}$ closest to the step point.
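The stratified initial sampling can be sketched with SciPy's quasi-Monte Carlo module; the 10 intervals follow the empirical starting point in the text, and the 3-variable dimension is illustrative.

```python
import numpy as np
from scipy.stats import norm, qmc

# Sketch of step 4: Latin-hypercube points stratified over the cumulative
# probability, mapped through the Normal quantile function so that every
# probability interval of each variable receives exactly one sample.
sampler = qmc.LatinHypercube(d=3, seed=0)
u = sampler.random(n=10)          # one point per probability decile
x = norm.ppf(u)                   # map to standard-Normal variates

# Which decile each point fell into, per variable, sorted for inspection.
strata = np.sort(np.floor(u * 10).astype(int), axis=0)
```

Because each variable gets one point per decile, the wide continuous-plus-discontinuous range is covered with few samples; refining the interval count then localizes the step point.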

V. EXPERIMENTAL RESULT AND COMPARISON

A. EXPERIMENT SETUP
The proposed SRAM performance meta-model is first verified on an SRAM column and a sense amplifier by comparing with SPICE simulations. We also implement other state-of-the-art meta-models, LRTA [32] and OMP [31], for accuracy comparison under different operating voltages. The yield analysis based on our model is then illustrated by comparing with Monte Carlo, run in parallel on a 60-core server, and with two other importance sampling methods, HDIS [9] and AIS [6]. All experiments are performed with a 40nm SMIC model on a server with an Intel Xeon Gold 5118 CPU @ 2.30 GHz. Fig. 5 shows the simplified schematic of the read path of the 128-row SRAM column, which has 2306 variables. The read operation begins by activating the word-line (WL) and the precharged bit-lines. One bit-line, BL, discharges through the first accessed cell and enlarges the voltage difference between the two bit-lines. The read access delay is defined as the time required to generate a voltage difference between the two bit-lines that can be sensed by the sense amplifier. Notice that, to generate the worst case for the read operation, the accessed Bit cell 1 stores ``0'' while the other idle cells store ``1'', which maximizes the leakage current through the idle bits, increases the read access delay, and impedes a successful read. For the SRAM yield estimation, we consider the read access delay failure, in which the read operation exceeds a specified time.

B. EXPERIMENT ON SRAM COLUMN 1) MODEL ACCURACY VALIDATION
To verify the accuracy and efficiency of our model and modeling method, we trained the other state-of-the-art high dimensional models, LRTA [32] and OMP [31], with 3000 training samples near the failure region.
As shown in Fig. 6, a hundred samples near the failure region are predicted by the meta-models and by transistor-level simulations, respectively, with all samples sorted by the read delay measured at the transistor level. Fig. 7 summarizes the average errors of the three models from Fig. 6. In Fig. 6 a) and b), although the SRAM performance shows strong nonlinearity, all models fit the relationship with sufficient accuracy at 0.8V and 0.9V: the average errors of SP-HDMR, OMP, and LRTA at 0.8V are 2.1%, 8.8%, and 4.3% respectively, and the relative errors of all models at 0.9V are even lower than 2%. Fig. 6 c) and d) show the predicted read delay at 0.7V and 0.6V, respectively, where the SRAM read access delay shows varying degrees of discontinuity. The predictions of SP-HDMR are closest to the simulation results, whereas OMP and LRTA deviate from the HSPICE results in the first and second halves of the curve. The relative errors of OMP and LRTA at 0.6V reach 29.6% and 14.8% respectively, while that of SP-HDMR is just 4.6%. The huge errors of OMP and LRTA in modeling the read access delay can be attributed to the natural characteristic of regression, which attempts to come as close to all real values as possible to minimize some cost, usually the mean squared error. The discontinuous values break the normal trend of the performance, bringing a larger error for the regression models; if we set the ``failed'' result to ``$1 \times 10^{-4}$'', the prediction errors of OMP and LRTA are enlarged dramatically for the same reason. Notice that the relative errors of OMP and LRTA at 0.7V are much larger than those at 0.6V in Fig. 6, because of the too large gap at 0.7V between the successfully simulated results, of magnitude $10^{-9}$, and the assigned failed results, of magnitude $10^{-7}$.

2) YIELD ANALYSIS EFFICIENCY VALIDATION
We compared the accuracy and efficiency of the failure rate estimated by our method against MC, AIS, and the original, unmodified HDIS.
We apply the Figure of Merit (FOM) $\rho$ defined in [10] to verify the proposed method:
$$\rho = \frac{\sqrt{\mathrm{VAR}(P_{fail})}}{P_{fail}}$$
where $\mathrm{VAR}(P_{fail})$ is the variance of $P_{fail}$. The condition $\rho < \varepsilon\sqrt{\log(1/\delta)}$ means an estimate has reached $(1-\varepsilon)\,100\%$ accuracy with $(1-\delta)\,100\%$ confidence. Here, we set $\rho = 0.1$, which corresponds to 90% accuracy with a 90% confidence level. As shown in Fig. 8, AIS fails to converge to the correct result when the FOM reaches 0.1, because the diversity of samples decreases over the resampling iterations in the high dimensional scenario. HDIS and HDMRIS successfully converge to the required precision with 3.4% and 8.9% error respectively.
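The FOM computation itself is a one-liner over the indicator samples. The sketch below uses a toy Bernoulli failure rate (illustrative, not the paper's circuit) to show how $\rho$ shrinks as $1/\sqrt{N \cdot P_{fail}}$.

```python
import numpy as np

# Sketch of the stopping criterion: rho = sqrt(VAR(P_fail)) / P_fail
# estimated from pass/fail indicator samples; stop once rho < 0.1.
def fom(indicators):
    n = len(indicators)
    p_hat = indicators.mean()
    var_hat = indicators.var(ddof=1) / n   # variance of the mean estimator
    return np.sqrt(var_hat) / p_hat

rng = np.random.default_rng(1)
ind = (rng.random(200_000) < 1e-3).astype(float)  # toy failure rate 1e-3
rho = fom(ind)                             # roughly 1/sqrt(n * p) ~ 0.07
```

For rare events, $\rho \approx 1/\sqrt{N \cdot P_{fail}}$, which is why hitting $\rho = 0.1$ at a $10^{-5}$ failure rate by plain MC requires on the order of $10^7$ samples, and why the IS and meta-model machinery above pays off.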
The efficiency of MC, AIS, HDIS, and HDMRIS is shown in Table 2. In this experiment, 2160 training samples are included in the importance sampling step of HDMRIS. Both HDIS and HDMRIS consume over 60,000 samples to reach stable results. Although the relative error of HDMRIS is larger than that of HDIS, it only consumes 0.67 hours, which is 5.3X faster than HDIS and 99.9X faster than MC.
Besides, we also evaluate HDMRIS under different supply voltages, as shown in Fig. 9. Our method predicts the yield accurately over a wide range of supply voltages. The yield estimation results deviate from the simulations when VDD is larger than 0.68V, because the relative error of SP-HDMR is enlarged there, so the 7% range of re-simulation is not enough to ensure an extremely accurate result. We can increase the predefined range of re-simulation to reduce the error at the expense of additional simulations.

C. EXPERIMENT ON SENSE AMPLIFIER
We also conduct an experiment on the Sense Amplifier (SA), one of the most important components of SRAM circuits. As shown in Fig. 10, it is mainly composed of two cross-coupled inverters, and the voltage difference between BL and NBL is enlarged to the expected value by the positive feedback. We focus on the estimation of the offset voltage, the most important performance metric of the SA: it is the voltage difference between the SA inputs (bit-lines) at which the inverters of the SA remain at the metastable point. The larger the offset, the more time the SRAM needs to discharge one of the bit-lines, which leads to additional read delay. We define SA failure as the offset voltage exceeding 28 mV due to process variation. Fig. 11 shows 100 offset voltage predictions at different voltages. The predicted results under different operating voltages differ only slightly, because the SA is less sensitive to process parameters due to its large transistor sizes; all models achieve good accuracy on the SA within a relative error of 5%. Fig. 12 shows the yield analysis result at 0.6V. The MC result remains the ``gold standard'' and converges to 3.13e-5 in 26 hours, while AIS, HDIS, and HDMRIS take 15 minutes, 36.7 minutes, and 10.9 minutes to converge to 3.21e-5, 3.19e-5, and 3.2e-5, respectively. Notice that the time cost of HDMRIS includes 6957 simulations and 18640 model predictions, of which 960 simulations are used to train the SP-HDMR model; AIS and HDIS take 9600 and 23400 simulations respectively. All methods converge to a reasonable result with less than 4% relative error when their FoMs are lower than 0.1. However, our method achieves a 1.3X speedup over AIS and 3.4X over HDIS, and a 143.1X gain over MC.

VI. CONCLUSION
In this paper, we developed a yield analysis framework, HDMRIS, based on our SP-HDMR model, which accelerates yield analysis by substituting for expensive transistor-level simulations. To construct the SP-HDMR model, we utilize the cut-HDMR expansion to provide a computationally efficient model representation. The spline basis function is then carefully analyzed and trained to address the discontinuity problem brought by the large variations of process parameters in SRAM, and the model is implemented efficiently by an adaptive sampling strategy with sparsity analysis. The proposed HDMRIS achieves great speedup compared to state-of-the-art yield estimation methods with sufficient accuracy.
LIANG PANG received the B.S. degree in communication engineering from the Hefei University of Technology, Hefei, China, in 2017. He is currently pursuing the Ph.D. degree in microelectronics and solid state electronics with Southeast University, Nanjing, China. His research interests include low-voltage SRAM design and reliability analysis.
SHAN SHEN was born in 1993. He received the B.S. degree from the Department of Microelectronics, Jiangnan University, in 2016. He is currently pursuing the Ph.D. degree in microelectronics with the School of Microelectronics, Southeast University.
His research interests include hardware designs in computer architecture and memory systems.
MENGYUN YAO received the B.S. degree from Southwest Jiaotong University, China, in 2017, and the M.S. degree from Southeast University, Nanjing, China, in 2020. Her research interests include low-voltage SRAM design and reliability optimization. VOLUME 9, 2021