Simulation Acceleration of Bit Error Rate Prediction and Yield Optimization of 3D V-NAND Flash Memory

When designing 3D V-NAND technologies with a gate induced drain leakage (GIDL) assisted erase scheme, many experiments must be conducted to determine the optimal GIDL design targets to achieve fast erase performance and secure yield characteristics. However, only a limited amount of data can be used since V-NAND processes are time-consuming and expensive in the early stage of development. TCAD and numerical methods also require a considerable amount of time and effort to calculate bit error rate (BER), and it is impossible to explore the entire design spaces in time. In this paper, we propose a novel simulation acceleration technique for bit error rate prediction and yield optimization in 3D V-NAND technology. This acceleration framework includes a machine learning (ML)-based compact model for the lognormal variability of GIDL currents and a physics-inspired slow cell model for the read margin reduction. Using a combination of these models with efficient Monte Carlo (MC) circuit simulations, we can accurately estimate threshold voltage ( $V_{th}$ ) distributions to explore the entire design spaces using a limited amount of data. Based on the proposed technique, the predictive model achieves high accuracy in the current 176-layer V-NAND technology, and it also provides high scalability with respect to GIDL transistor geometries, temperatures, supply voltages, variabilities, and the number of stacking layers. Moreover, a contour map of bit error rate is newly introduced for the efficient design space exploration and read margin prediction. Therefore, the results indicate that the proposed framework can be further extended to large-scale experimental data and new architectures to accelerate the yield optimization in next-generation 3D V-NAND flash memory development.


I. INTRODUCTION
Three-dimensional vertical-NAND (3D V-NAND) devices for high speed and capacity products have been extensively utilized in data-driven computing environments, over conventional computation-driven computing, and has been further strengthened by the critical involvement of big data. This has The associate editor coordinating the review of this manuscript and approving it for publication was Cristian Zambelli . also enabled the modern everywhere-always-connected life through smart handheld devices, along with telecommunication technologies [1]. Based on vertical stacking technologies of many memory layers, cell arrays of the 3D V-NAND induced the tipping point into more aggressive scaling of the bit storage density without relying on the reduction in cell dimensions. Fig. 1(a) shows the structure of 3D V-NAND devices with a single word line (WL) scheme which was the leading candidate for the early 3D V-NAND technologies.
In this era of V-NAND development, the horizontal placement of the cell arrays with peripheral circuits was common and reducing the cell pitch (XY shrink) was a key driver. However, it was extremely vulnerable to deep channel holes because of leaning problems during the high aspect ratio (HAR) plasma etch process [2]. Therefore, the word plane scheme was utilized to overcome these HAR issues [3], and it was specialized in the 3D vertical stacking process, which has over 100 layers with the multiple-stack technologies, as shown in Fig. 1(b). However, it is estimated that the allowed maximum number of stacking layers would be limited to 400 stages without any cell dimension reductions (Z shrink) [4]. To prolong the historical scaling trends of the storage density, the cell over peripheral (COP) structure is newly introduced [5], which offers the chance to reduce the chip area by arranging the peripheral circuits under the memory cell array rather than alongside the array. However, the COP structure cannot operate the bulk erase scheme, which increases the channel string potential (V ch ) using direct contact with the memory cells, as shown in Fig. 2(a). The bulk erase has excellent erase performances and large read voltage windows. To simplify the technological process and increase the integration density of the COP structures, the GIDL-assisted erase scheme (GIDL erase) has become the gold standard for modern state-of-the-art 3D V-NAND products [6]. The problem is that its erase efficiency is very poor and GIDL-related erase failures are frequent due to the inherent GIDL generation process of hole carriers. The GIDL erase is also sensitive to the supplying electric fields for bandto-band-tunneling [7], and it causes the inevitable channel potential delays, as shown in Fig. 2(b).
In the flash memory industry, technology computer aided design (TCAD) and numerical simulations are actively utilized to resolve these GIDL-related failures and yield problems, because many experiments must be conducted to obtain the optimal design parameters in the early stages of development. However, the numerical methods require a considerable amount of time and effort to predict the bit error rate (BER), and it is nearly impossible to explore the whole design space for optimizing product yield in time. In addition, we cannot perform as many experiments as desired in a process development step because only a limited amount of experimental data can be obtained since the V-NAND processes are timeconsuming and expensive.
Therefore, an efficient and accurate BER estimation technique is essential to evaluate the various design options and optimize the yield in a timely manner. However, prior studies investigating BER estimation in the GIDL-assisted erase scheme have been very limited. In [8] and [9], an analytical compact model is presented to describe the time dynamics of the GIDL-assisted erase operation in the 3D V-NAND structures. This analytical model is suitable for reproducing the GIDL-assisted transient analysis results from TCAD simulations. However, this approach does not support a scalable model for widely different channel sizes, bias voltages, and temperatures. This lack of scalability makes it difficult to perform an evaluation with various technological options accurately in circuit simulators. In addition, considering process variation is essential for the bit error rate prediction, and this approach does not provide the variability models for GIDL characteristics. In [10] and [11], the authors proposed a machine learning approach that reproduces the variations of threshold voltage (V th ) and current (I on ) in the 3D V-NAND cells. The model is based on an artificial neural network (ANN) whose inputs are variability sources and electrical parameters. However, this method focuses on predicting the variations of V th and I on for the wear-out in pre-production steps that can achieve the same accuracy of TCAD simulations. However, this model cannot reproduce I-V characteristics that can be implemented in SPICE simulators for accurate bit error prediction. In [12], [13], [14], and [15], an analytical fitting method, ANN model, and support vector regression are proposed to predict the bit error rate as a function of program/erase (P/E) cycle, read cycle, retention time, and the total ionizing dose effects. However, these predictive models are entirely based on the data measured from an experiment with various P/E cycle and retention time. They aim to improve the error correction codes and the wear equalization algorithms regardless of time and cost. Therefore, these approaches cannot be applied to accelerate and evaluate the read margin loss of the GIDL-assisted erase operations. In [16], the authors present a parameter estimation algorithm to find the means and variances of the threshold voltage distribution that is modeled as a Gaussian mixture. However, this approach has limits on predicting the read margin of GIDL-assisted erase, because the distributions are approximated to the Gaussian mixture and the errors from non-Gaussian characteristics are relatively large. To overcome this limitation, the work proposed in [17] to choose from the various distributions as well as Gaussian mixture are evaluated to find the accurate predictive model VOLUME 11, 2023 FIGURE 2. Threshold voltage distributions of the body erase and GIDL-assisted erase scheme. (a) The body erase scheme operates bulk erase, which increases the channel potential using the direct contact with p-doped substrates and has excellent erase performance. (b) The GIDL erase scheme of COP is built on n+doped poly-silicon and increases the channel potential using GIDL current, which has an inherent channel potential delay and slow cells (V th loss and small read margin).
of the threshold voltage distribution. However, the evaluation results show that the average accuracy is 95%, and it also causes a large error in the bit error rate prediction at the very low probability density. Because of the lack of effective yield estimation methods in the GIDL-assisted erase scheme, we propose a novel simulation acceleration technique for bit error rate prediction and optimization using an ANN-based compact model and Monte Carlo circuit simulations.
The rest of this paper is organized as follows. Section II describes the mechanisms of channel potential delay and read margin reduction in the GIDL-assisted erase scheme, section III discusses the methodology of the acceleration models with their scalability and accuracy, and also describes a Monte Carlo (MC) simulation technique, and section IV highlights the bit error rate prediction and yield optimization for next-generation candidate structures. The last section concludes by outlining the efficient yield prediction framework to deal with the challenges of the extreme high stack flash memories.   Table 1.

A. MECHANISMS OF CHANNEL POTENTIAL DELAY
To understand the mechanism of the hole carrier accumulation process, we performed a TCAD simulation by solving the Poisson and drift-diffusion equations with the dynamic nonlocal BTBT model in a channel string structure. Fig. 4 shows the simulated channel potential (V ch ), external biases (V d and V g ), internal potential differences (V gs , V ds ) in the floating body, and hole current injected into the channel string. V BTBT is the supply voltage (V g -V d ), which determines the BTBT electron-hole pair generation rate, and V ds is the internal bias, which affects the hole injection rate into the channel.
The GIDL-assisted erase scheme has two operation regimes, (ramp-up and execution), and they have distinct phases (I-VI) and singular points (1)- (6). Phase I is the quasi-static region before the start of pulse-type ramp-up at point 1 to prevent the slow BTBT response. Phase II is the acceleration region, which has the biggest increase rate of V ch before the V ds reaches a peak at point 2. During phase III, the BTBT rate faces its maximum limit with the deceasing V ds due to the stored hole carriers in the floating channel, and the hole current has its peak at point 3. During phase IV, these exponential behaviors of V ch increase ( V ch > V d ) come to the finish with the start of the constant V BTBT at point 4, and the linear behavior of V ch increase ( V ch = V d ) starts, and the hole current is saturated during phase V in the rampup regime. During phase VI (also in the execution regime), it works in a negative feedback configuration due to the accumulated carriers in the floating channel region. Therefore, the erase execution always finishes without reaching the body erase potential (V d ), and this channel potential delay makes the GIDL-assisted scheme vulnerable to erase performance variations.   . 5 shows the simulation results with three different GIDL transistors. In this simulation, we used the same external voltages and cell array structure, except for the BTBT rates of the GIDL transistors. The transient results show that the injection levels of the GIDL currents are significantly different in only small V BTBT phases (II, III, and IV) due to the increasing V ds and V BTBT . However, the small V BTBT phases are relatively short, and they are less than 20 percent of the total erase operation. In contrast, the injection GIDL currents start to saturate after their peak levels, and they finally converge to similar levels in the high V BTBT phases (V and VI) due to decreasing V ds and the negative feedback configuration. These high V BTBT phases take most of the erase operation time and V ds is bound to stay low for the entire period. This shows that high V BTBT and low V ds are the general operating conditions in the GIDL current trajectories, and the industry normally extract the characteristic current (I gidl ) from these conditions to represent the delay of channel potential instead of using the entire GIDL current values. Therefore, we define the characteristic current (I gidl ) as a condition of V gs = −8V and V ds =3V in this work.

B. MECHANISMS OF READ MARGIN REDUCTION
In the GIDL-assisted erase, the V th distribution can be separated into two independent components, as shown in Fig. 6(a).  One is (1) Gaussian distribution (V th_med , V th_std ), which comes from the random variations of tunneling oxide thickness, gate critical dimension, trap site, and Fowler Nordheim (FN) tunneling sensitivities, and this component is identical to the distribution of a body erase scheme. The other is (2) tail distribution, which originates from the inherent channel potential delay of GIDL generation process and its impact on the slow cells. Therefore, the read margin reduction and threshold voltage loss of slow cells ( V th_gidl in Fig. 6(b)) caused by the potential delay can be defined as To investigate the relationship between the channel potential delay and the V th loss of slow cells, we performed TCAD simulations in a 176-layer V-NAND structure. In this simulation step, we ignored the random variations of Gaussian distribution (V th_std =0) to reproduce the tail distribution only, and the programmed cells are erased with different channel potential delays using random variables of the BTBT rate, channel string resistance, and capacitance which are significantly related to the delays. Fig. 7(a) shows the simulation results based on the random variables, and we can define the channel potential delays ( V FN ) on the reference of the body erase scheme, which is primarily responsible for the tail shift of slow cells, as follows, During the erase operation, the carriers are gradually removed from the floating gates due to FN tunneling, and the tunneling effects finish at the erase execution time (t ers ). In addition, the amount of V th shift can be determined by the number of carriers remaining in the floating gates at t ers , and it is significantly related to the shortages of applied electric field ( E FN ) for the FN tunneling as These field quantities ( E FN (t)) during the erase operation are shown in Fig. 7(b). Therefore, the cumulative factor of channel potential delay and read margin loss from the slow cells is the areas under the curves, and it is the integral E FN (t) from t FN to t ers as follows, where t FN is a model parameter depending on the ramp-up slops and FN tunneling parameters which vary throughout the technologies and processes.

III. FAST SIMULATION MODELS FOR BIT ERROR RATE PREDICTION AND YIELD OPTIMIZATION A. SLOW CELL MODEL
The bit error rate prediction of slow cells and yield optimization can be accelerated based on the computational efficient and accurate circuit simulation based on the physics-inspired slow cell model and GIDL variability compact model. Using large-scale Monte-Carlo (MC) circuit simulations with these computationally efficient models, the analysis on the V th shift of slow cells and yield loss in the GIDL-assisted erase can be easily performed by the accelerated data.
To obtain the cumulative field factor ( E gidl in Eq. (4)) in the current 176-layer V-NAND technology, we used t FN =200µs and t ers =1400µs, and the result of the calculation 93960 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.  1)). It shows that the two random variables are highly correlated, therefore, the slow cell model can be described as follows, where A FN is a threshold voltage coefficient parameter and B FN is a FN tunneling exponential coefficient of the slow cell which depend on the processes. In this work, these fitting parameters are extracted with 1.85 × 10 1 and 8.9 × 10 4 for A FN and B FN , respectively, and the slow cell model has good agreement with the numerical simulation data, as shown in Fig. 8.  Table 1 summarizes these calibration parameters of TCAD and circuit simulations. The accelerated circuit simulation requires an accurate compact GIDL transistor model, which can reproduce the logarithmic characteristics of the hole injection into a macaroni-shaped floating channel and its variability of lognormal distribution. Based on the measured I-V data, we take the natural logarithm of the corresponding current values to generate a normal distribution, which has a mean (µ Y ), standard deviation (σ Y ), and its coefficient of variation (CV). Then, we arrange the log-transformed data into the ANN-based compact model [18], [19], [20] for feeding, as shown in Fig. 9(a). This ANN-based compact model has two hidden layers with 20 and 15 neurons in each layer (N 1 =20, N 2 =15). The input features are channel width (W), length (L), temperature (T), and bias voltages (V g ,V d ,V s , and V b ), and the targets are the log-transformed current values (ln(I d ), ln(I s ), and ln(I b )), as shown in fig. 9(b). In the Verilog-A implementation step for the SPICE circuit simulations, we multiply the inference results of the ANN-based compact model by a multiplication factor, which follows a normal distribution of N(1,σ Y ), and then take the exponential of them to change the distribution back into a lognormal, as shown in    Fig. 10(a). In an MC simulation, the inference results of the characteristic current, I gidl follow the lognormal distributions in the cumulative probability plot for three different V BTBT values, as shown in Fig. 10(b).

C. SCALABILITY OF THE PROPOSED MODELS
The high generality of an ANN-based compact model enables high scalability in modeling complex characteristics of tunneling physics. Therefore, we perform additional simulations at different transistor geometries and temperatures to evaluate the model scalability and accuracy. In the TCAD simulations, the GIDL current changes nonlinearly with the channel width because a stronger electric field is applied and a higher BTBT current density is generated at the channel edge region as the channel width decreases. The ANN-based compact model is trained and verified with the nonlinear simulation data and shows high average accuracy of 99.3% as shown in Fig. 11. The GIDL current also has a positive temperature dependence because the bandgap narrows as temperature increases, resulting in an increasing BTBT rate. In this process, the BTBT rate is varied by a factor of 10.8 from 248K (cool temperature, CT) to 358K (hot temperature, HT). The ANN-based compact model is also jointly trained with the temperature dependent data and shows high average accuracy of 99.2% as shown in Fig. 12.
In addition, the slow cell model (read margin reduction model) is implemented in the SPICE simulator. The model verification results of the different GIDL transistor widths show good agreement (average R2 score is 0.991) with TCAD simulation data as shown in Fig. 13. For the temperature variation, the effects of an increase in GIDL current (I gidl ) and a decrease in channel potential delay (V FN ) according to increasing temperature needs to be considered in a GIDL-assisted erase scheme. The operating temperature can be different from the nominal temperature (TNOM) at which the slow cell model parameters are extracted. In this work, the slow cell model parameters (t FN A FN , and B FN ) are extracted and verified at the nominal room temperature (RT), TNOM=298K. Therefore, the slow cell model can account for the effects of temperature by making parameter B FN temperature dependent, as follows: FNT is introduced as a FN tunneling temperature exponent for the exponential coefficient (B FN ) which is an image of the potential barrier that corresponds to the effective band gap of the barrier materials. The temperature dependence model shows good agreement (average R2 score is 0.992) with TCAD simulation data from 248K (CT) to 358K (HT) as shown in Fig. 14. Table 2 summarizes these fitting parameters of the slow cell model for bit error rate prediction and yield optimization.

IV. ACCELERATION SIMULATION FOR BIT ERROR RATE PREDICTION AND YIELD OPTIMIZATION
Based on the ANN-based compact model and the slow cell model, the V th shift evaluation and BER estimation can be accelerated in the efficient circuit simulation domain. The flow of this acceleration work is demonstrated in Fig. 15(a): (1) the Gaussian distribution of V th is generated using the agauss(V th_med , V th_std ) function of the SPICE simulators, (2) the random variables of the V th shift ( V th_gidl ) are obtained based on the large-scale analysis data from the acceleration simulation using the instantaneous calculation of Eq. (5). Finally, we can obtain the entire V th distribution of the GIDL-assisted erase operation by calculating the sum of the two independent random variables and also evaluate the tail bit of the slow cells using a metric, V th loss in Fig. 15(b).
We performed the large-scale MC circuit simulations based on the present generation of a 176-layer V-NAND technology as shown in Table 1. The circuit simulation includes the fast models which is trained and calibrated with the same TCAD simulations of the 176-layer V-NAND supplying -8V BTBT voltage (V BTBT = −8V). Fig. 16 shows the comparison results of the calibrated TCAD simulation (ground truth) and the acceleration simulation. The fast models show good agreement with the ground truth depending on the various medians of the GIDL currents. Fig. 17 represents the comparison results of the ground truth and the proposed models depending on the variation levels of the GIDL current, which have good agreement. The temperature dependent models in Eq. (6) and (7) are also implemented and performed in the accelerated circuit simulation. Fig. 18 shows good agreement between the ground truth and the temperature dependent models.
To explore the design space and evaluate the impact of the GIDL transistor performances on the V th loss and yield, we can perform a large-scale simulation based on the accelerating models. Fig. 19(a) shows the acceleration simulation results of V th loss according to the GIDL transistor performances of V BTBT = −8V in the 176-layer structure. The optimal design target of the GIDL transistor performances (I gidl , CV of I gidl , and V BTBT ) is the value set when the V th loss becomes small enough to distinguish the erase from the programmed states. It generally depends on the valid window and bit error rate criteria of various technologies. In this work,  we define the V th loss of 0.5V or less (V th_loss ≤0.5) at the probability density of 1 × 10 −3 as an optimal boundary to achieve the successful read operation in this demonstration. A small V th loss indicates larger read voltage window, higher error correction probability, and a smaller number of read   retry. In addition, the yield characteristics is determined by bit error rate, which strongly depends on the error correction   abilities and read retry techniques of each industry. To demonstrate the flow of yield prediction and optimization in this 176-layer V-NAND technology, we assume that the optimal boundary criterion of valid window (V th_loss ≤0.5) is equivalent to a bit error rate of 0.1%, and it increases by a factor of 3 for every 0.5V increase in V th loss. Therefore, a contour map of bit error rate for yield analysis can be derived from the V th loss as shown in Fig. 19(b).
Based on the acceleration simulations combined with the bit error occurrences, we can obtain a metric for predictive  yield modeling in the 176-layer V-NAND structure, as shown in Fig. 20. Starting with the reference point of this current GIDL transistor (R in Fig. 20), any performance shifts of the median I gidl or its CV immediately will reduce the read margin (V th_loss >0.5V) and degrade the bit error rate (BER>0.1%). However, we can boost the read margins by supplying the higher BTBT voltage from −8V to −10V to drive the same GIDL transistor, as shown in Fig. 21, which must be carefully examined from various side effects on the degradation of reliability and program-erase cycles.

V. PATHFINDING FOR NEXT GENERATION 3D V-NAND CANDIDATE STRUCTURES
Based on the calibrations in the current 176-layer V-NAND technology, the bit error rate prediction and yield optimization of the next-generation candidate technologies can be performed for the pathfinding activity. We extend the acceleration simulations to the next-generation 256 and 352layer V-NAND technologies to determine the impact of the high aspect ratio etch processes of vertical stack-up on the GIDL-assisted erase characteristics.
A. 256-LAYER 3D V-NAND STRUCTURE Fig. 22(a) shows the contour map of yield and its projection ot the 3D plot of V th loss in the 256-layer V-NAND structure with the same −8V BTBT voltage (V BTBT = −8V) as the previous 176-layer V-NAND. The prediction results show that the V th loss will be degraded by up to 1.2V due to the extra RC delay (R in Fig. 22(a)), and three paths of yield improvement are possible. Firstly, nearly 1.3nA GIDL  injection current is required without any variation reduction techniques (A1 in Fig. 22(a)). Secondly, the CV of the GIDL currents needs to be reduced up to 20% while the median remains the same (A2 in Fig. 22(a)). Lastly, only 1.1nA injection current will be sufficient if the CV can be reduced to 22% as a compromise. In addition, we can supply the higher BTBT voltages in the identical GIDL transistor to avoid any process challenges. In Fig. 22(b), the same GIDL transistor of the previous technology is successfully reused to operate the erase in the 256-layer V-NAND structure, and it represents the minimum BTBT voltage of -10V required for ensuring the yield criteria (V th_loss ≤0.5 and BER≤0.1%). Fig.23(a) shows the prediction results for the 352-layer V-NAND structure with the same -8V BTBT voltage (V BTBT = −8V) as the current 176-layer simulation. The results indicate that V th loss is increased to 2.2V (R in Fig. 23(a)) due to the extra delays of additional layers. In the 352-layer V-NAND, there are also three possible paths for yield improvement. Firstly, the GIDL injection current is required to increase to 1.7nA without any variation reductions (A1 ′ in Fig. 23(a)). Secondly, the CV needs to be improved up to 18% (A2 ′ in Fig. 23(a)). Lastly, both the 1.3nA GIDL injection current and the 21% CV can be one of the optimal design targets as a compromise solution. To retain the same GIDL transistor process of the current 172-layer V-NAND technology, a minimum of -12V BTBT voltage is required to obtain the same read margin and bit error rate in the next-generation 352-layer V-NAND, as shown in Fig. 23(b).

VI. CONCLUSION
In this paper, we proposed a variability-aware artificial neural network compact model and a fast Monte Carlo circuit simulation technique that accelerate the bit error rate estimation and yield optimization of the GIDL-assisted erase scheme in state-of-the-art flash memories. The GIDL-induced channel potential delay and time dynamics are thoroughly investigated, highlighting the read margin reduction mechanism due to the V th loss of slow cells in the GIDL erase operation. The ANN-based compact model accurately reproduces the GIDL current and its lognormal statistical variations, and the physics-inspired slow cell model is efficiently implemented in circuit simulation to regenerate the underlying relationships between the two probability distributions of electric field and read margin loss. This acceleration simulation technique is a valuable tool for exploring the GIDL design space, yield optimization, and pathfinding activities in next-generation 3D V-NAND flash memory.