Development and Challenges of Reliability Modeling From Transistors to Circuits

The integration density of electronic systems is limited by the reliability of the integrated circuits. To guarantee overall performance, integrated circuit reliability must be modeled and analyzed at the early design stage. This paper reviews some of the most important intrinsic aging mechanisms of MOSFETs and elaborates on the physical mechanism of the coupling between aging effects. The progress in reliability modeling under static and dynamic operational voltages is then reviewed. It is found that although these models can accurately predict short-term degradation, they show large errors in long-term degradation prediction. In addition, problems remain to be solved in circuit-level reliability modeling and simulation. This article aims to provide guidance for researchers and practitioners in the integrated circuit field and to highlight the challenges for reliability research, which is of great significance for optimizing the reliability of integrated circuits.


I. INTRODUCTION
The supply voltage of transistors scales down more slowly than their size, which results in soaring current density and temperature in the transistor channel and accelerates transistor degradation. Aggressive transistor downscaling therefore causes unavoidable reliability issues in modern integrated circuits (ICs). On the one hand, among all the reliability issues, aging effects such as bias temperature instability (BTI) [1], [2], [3] and hot carrier injection (HCI) [4], [5], [6] manifest themselves as an increase in the absolute threshold voltage (V_th) and a reduction in the carrier mobility (μ), thus gradually degrading circuit performance [7], [8]. On the other hand, there is booming demand for high-reliability circuits in the markets of autonomous vehicles, space operations and biomedical electronics. To better analyze circuit degradation and improve circuit lifetime, it is necessary to conduct degradation modeling and simulation at the early design stage.
The studies describing aging mechanisms have matured as a result of years of refinement to the existing theories.
Many previous works have extended the aging models for transistors to circuit-level reliability analysis [9], [10], [11], [12]. However, this extension is difficult. At the circuit level, BTI and HCI can occur either simultaneously or sequentially, and the two effects can also interact with each other, which complicates the modeling. A model that considers only one degradation mechanism cannot accurately predict the lifetime of the circuit, which forces unnecessary compromises in IC design. A physics-based reliability model that can deliver accurate lifetime prediction at both the transistor and circuit levels is urgently required, but accurate modeling of the aging effects remains challenging.
In this review, we take a comprehensive look at the progress of reliability research. First, we briefly introduce the main physical mechanisms of reliability degradation and the coupling between them. Next, we provide a comprehensive survey of the reliability models for BTI and HCI. Then, the relationship between the transistor degradation model and circuit lifetime prediction is studied. At the end of this paper, we summarize the challenges of reliability simulation and potential ways of advancing research in this field.

II. RESEARCH ON MOSFET DEGRADATION MECHANISMS
The basis for studying the degradation mechanism is the analysis of the defect states inside the transistors, for which the most direct method is to measure the defect density generated by the degradation mechanism. Charge pumping and low-frequency (1/f) noise measurements are widely used to evaluate the defect states [13], [14], [15]. Besides studies of the spatial distribution of traps in high-k and ultra-thin oxide layers, charge pumping techniques based on the variation of the pumped charge with frequency, and techniques employing ring-oscillator-connected devices, are proposed in [16], [17]. Based on the analysis of the trap density, a number of studies have focused on revealing the physical mechanisms underlying aging in semiconductor devices [18], [19], [20], [21], [22], [23], [24], [25], [26], [27].
From the physical mechanisms, it can be found that the charge in the gate oxide layer mainly includes four types: potential interface traps, potential oxide traps, mobile ions and fixed charges, as shown in Fig. 1(a). The Si/SiO2 interface has defects that trap holes and degrade the transistor performance, as shown in Fig. 1(b). Due to the non-epitaxial amorphous structure on the crystalline silicon, the Si-H or Si-O bonds are prone to dissociate at the substrate/gate oxide interface. When an electric field is applied between gate and source, holes can be captured by these bonds, resulting in bond breaking [19], [20], as seen in Fig. 1(c) and Fig. 1(d). These are the key factors in the parameter degradation of transistors.

A. HOT CARRIER INJECTION
As described in the literature [21], [22], HCI occurs when carriers leave the source of a transistor, are accelerated in the channel, and undergo ionization near the drain end. As shown in Fig. 2(a), the injection into the gate oxide of carriers that gain sufficient energy at the drain is called drain avalanche hot carrier (DAHC) injection. This effect generally occurs under stress conditions with high drain voltage and low gate voltage. When the gate voltage is approximately equal to the drain voltage, carriers gain sufficient energy to surmount the Si/SiO2 barrier at the drain end of the channel without losing energy in collisions with atoms in the channel. This is called the channel hot electron (CHE) injection effect, as shown in Fig. 2(b).
Hot-carrier-induced damage has been found to cause trapping of carriers on defect sites in the oxide, the creation of interface states at the silicon-oxide interface, or both. The damage causes a degradation in the transistor's transconductance (G_m), a shift in the threshold voltage (V_th), and a decrease in drain current (I_ds). Since the carrier velocity, which determines the trapping rate, is sensitive to the electric field, temperature, and frequency, the HCI effect is a complex function of these factors [23].

B. BIAS TEMPERATURE INSTABILITY
BTI includes negative BTI (NBTI) in PMOS transistors and a similar wearout mechanism, positive BTI (PBTI), in more advanced high-k metal gate NMOS transistors. Based on published works, both the reaction-diffusion (RD) [24] and trapping/de-trapping (TD) [25] theories can explain the BTI process in current semiconductor devices.
The major difference between the two theories is the nature of the diffusion behavior and the medium that facilitates the diffusion. In the RD theory, the reaction process is the breaking of Si-H or Si-O bonds, triggered by hot holes in NBTI or hot electrons in HCI. The diffusion process is the movement of the reaction-generated species away from the interface toward the gate, driven by their density gradient. When the stress voltage is removed, they diffuse back toward the interface, re-passivating the broken bonds and recovering part of the NBTI degradation, as shown in Fig. 1(d). The TD theory essentially describes the capture and emission mechanism of the oxide traps, as shown in Fig. 3. When the gate voltage is positive, a positive electric field is formed across the channel interface; positive charges are captured by acceptor-type traps and negative charges are emitted from the donor-type traps. Positive gate bias leads to positive drift, while negative gate bias leads to negative drift [26].
According to the mechanisms of the individual aging effects, the traps generated by BTI are distributed uniformly along the channel, while the traps generated by HCI are located mainly at the drain end [27]. When operating voltages are applied to both the gate and the drain of the transistor, the traps created by the two aging effects act together. As shown in Fig. 4, there is trap competition at the drain end. The two mechanisms can therefore occur either simultaneously or sequentially, and they can also interact with each other. In practice, BTI and HCI can occur sequentially in the transistors of ring oscillators, and the final degradation is much smaller than the sum of the degradation induced by each mechanism alone. This can be understood as follows: although they occur at different energy and spatial locations, both mechanisms involve similar types of traps, so the traps generated by one mechanism suppress the degradation caused by the subsequent stress. If this coupling effect is not considered, the degradation can be overestimated. However, research on this case is limited; most studies consider only the dominant mechanism and ignore the secondary ones. As transistor sizes decrease in advanced technologies, the coupling effect will become stronger. It is therefore necessary to develop decoupling methods and establish a more accurate physical model.

III. THE RELIABILITY MODEL AT TRANSISTOR LEVEL
To evaluate a transistor's lifetime and optimize its reliability, it is crucial to establish aging models based on the degradation mechanisms. The prerequisite for accurate modeling is to obtain degradation data from the transistors. To save time, elevated voltage and temperature stress are applied in the experiments to accelerate the degradation. Conventionally, the stress level is determined from the substrate current (I_sub) measurement [28]. An I-V test is then used to measure the parameter shifts of the transistor after degradation. However, when measuring I_sub, it is difficult in practice to clearly separate the degradation data from noise. We instead select the voltage stress for the aging test according to the median degradation of the electrical indicators [29].

A. AGING MODEL UNDER STATIC STRESS
By studying the degradation mechanisms and incorporating test data on aging effects, a great number of studies have established BTI and HCI reliability models to estimate the transistor's expected lifetime. As shown in Table 1 (see [30], [32], [33], [34], [35], [36], [37], [38], [39] for more details), most aging models show a power-law dependence of the degradation on time. The exception is [38], which reports that BTI degradation depends logarithmically on time. Relevant verifications are also carried out in their experiments, as shown in Fig. 5 and Fig. 6. In [29], we used transistors of a 0.18 μm technology to test the HCI effect; the power-law relationship was verified again, and the degradation value could saturate. Generally, the industry takes a 10% degradation of I_ds or a 50 mV shift of V_th as the failure criterion [40].
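To make the failure criterion concrete, the sketch below inverts a generic power law, ΔV_th = A·t^n, for the time at which the shift reaches the 50 mV limit quoted above. The values of A and n are hypothetical placeholders chosen for illustration; they are not fitted values from Table 1 or from [29].

```python
def time_to_failure(a: float, n: float, limit: float) -> float:
    """Invert delta = a * t**n for the time at which delta reaches `limit`.

    a     : prefactor (shift after 1 s of stress, same unit as `limit`)
    n     : time exponent of the power law
    limit : failure criterion, e.g. 0.050 V for a 50 mV V_th shift
    """
    if a <= 0 or n <= 0 or limit <= 0:
        raise ValueError("a, n and limit must be positive")
    return (limit / a) ** (1.0 / n)


# Hypothetical fit values, for illustration only.
A_FIT = 0.005      # V, shift after 1 s of stress
N_FIT = 0.25       # dimensionless time exponent
VTH_LIMIT = 0.050  # V, the 50 mV criterion from the text

ttf_seconds = time_to_failure(A_FIT, N_FIT, VTH_LIMIT)
```

With these placeholder values the criterion is reached after roughly 10^4 s of stress. Because the exponent n is small, the extrapolated lifetime is very sensitive to fitting errors in n, which is one reason long-term predictions are harder than short-term ones.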
The degradation of transistor parameters predicted by these models is accurate in the short term. In most of them, the degradation of the transistor is represented by the parameter V_th or I_ds. As shown in these models, the relationship between parameter degradation and temperature follows the Arrhenius equation. The degradation induced by HCI is related not only to the stress voltage and temperature, but also to the transistor size. This is because the degradation is mainly caused by charges trapped at the junction, which result from impact ionization. The ionization probability decreases dramatically as the transistor size increases. Compared with smaller transistors, impact ionization is less severe in larger ones, resulting in fewer interface states and oxide traps.
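The Arrhenius dependence mentioned above is commonly used to translate accelerated-stress results to use conditions. The following is a minimal sketch of that conversion. The activation energy and the two temperatures are illustrative assumptions; the sign and magnitude of the activation energy depend on the mechanism and the technology and must be extracted from measurements.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor between a stress and a use temperature.

    AF = exp( Ea / k * (1/T_use - 1/T_stress) ), temperatures in kelvin.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_use_k - 1.0 / t_stress_k))


# Hypothetical example: Ea = 0.1 eV, use at 25 C, stress at 125 C.
af_example = arrhenius_acceleration(0.1, 25.0, 125.0)
```

An acceleration factor above 1 means that the stress temperature ages the device faster than the use temperature; a negative activation energy would give a factor below 1.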
According to current research, the HCI effect is affected by the drain voltage (V_ds), the gate voltage (V_gs) and temperature, while NBTI is affected only by V_gs and temperature, and its relationship with V_ds remains unclear. We summarize the transistor-level degradation models for calculating the HCI and BTI effects under static stress as (1), (2) and (3), where P is the degradation of the transistor parameters, such as the shift of V_th and G_m; n, r_vds, r_vgs, r_L, r_W and A are the model parameters that need to be extracted by data fitting; t is the stress time; and W and L are the width and length of the transistor, respectively. In some literature [30], [31], the size and temperature parameters in (1) are integrated into A for calculation. Likewise, (2) and (3) indicate that the degradation caused by BTI follows a power-law (short-term) and a logarithmic (long-term) relationship with time, respectively. The studies mentioned above all concern static stress. However, transistors seldom operate under a static operation voltage (SOV). Instead, transistors experience a series of dynamic voltages and various workloads in large systems. As a result, these models are likely to overestimate the degradation, and a large margin is designed into the circuit to guarantee its functions over the entire lifetime.
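The parameter extraction by data fitting can be sketched as follows for the simplest case, in which the voltage, size and temperature terms are lumped into the prefactor A, so that P = A·t^n. Taking logarithms turns this into a straight line, log P = log A + n·log t, which an ordinary least-squares fit recovers. The function name and the synthetic data are assumptions for illustration; real extraction would use measured stress data and the full model.

```python
import math


def fit_power_law(times, shifts):
    """Least-squares fit of shifts = A * times**n in log-log space.

    Returns (A, n). All inputs must be positive.
    """
    if len(times) != len(shifts) or len(times) < 2:
        raise ValueError("need at least two (time, shift) pairs")
    xs = [math.log(t) for t in times]
    ys = [math.log(p) for p in shifts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                          # slope = time exponent
    a = math.exp(mean_y - n * mean_x)      # intercept = log(A)
    return a, n


# Synthetic, noise-free data generated from hypothetical A = 2 mV, n = 0.2.
stress_times = [10.0, 100.0, 1000.0, 10000.0]
measured = [0.002 * t ** 0.2 for t in stress_times]
a_fit, n_fit = fit_power_law(stress_times, measured)
```

On noise-free data the fit returns the generating values. With measured data, the residuals of this fit at long stress times are a quick check of whether the power law still holds or whether saturation or logarithmic behavior has set in.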

B. AGING MODEL UNDER DYNAMIC STRESS
Accordingly, some researchers have worked on predicting the lifetime of transistors under dynamic operation voltage (DOV), as shown in Table 2 (see [32], [38], [40], [41], [42], [43], [44] for more details). These models can be broadly divided into two categories. In the first, a duty ratio factor is added to the SOV degradation model, as highlighted in red in Table 2; this approach is mainly used for the HCI effect. In the second, the degradation value is calculated according to the stress state; this approach is mainly used for the BTI effect. Relevant verification was also carried out in their experiments, as shown in Fig. 7 and Fig. 8.
For the HCI effect under DOV, [45] concluded that different alternating stresses can reduce or enhance the degradation, depending mainly on the stress condition of the latter cycle, as shown in Fig. 9. Under DOV, HCI occurs only during switching, that is, during the transistor's transition from the non-conducting to the conducting state [43], as shown in Fig. 10. HCI-induced degradation is also affected by operating frequency [46], [47]. However, whether the HCI effect has a recovery phase needs further study.
Although the degradation induced by BTI partially recovers under DOV, the recovery phase does not fully reverse the aging process. Therefore, NBTI degradation under DOV stress is modeled using a comprehensive framework [19], which can roughly predict the long-term degradation of transistors. In some works [48], [49], the aging model for NBTI-induced degradation follows a logarithmic dependence on time under DOV stress.
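The first category of DOV models can be illustrated with a deliberately simple sketch in which the duty ratio scales the effective stress time of a static power-law model, P = A·(α·t)^n. This particular form is an assumption chosen for illustration; the models in Table 2 may introduce the duty ratio differently, and the parameter values below are hypothetical.

```python
def hci_shift_dynamic(a: float, n: float, duty: float, t: float) -> float:
    """Power-law shift with the stress time scaled by the duty ratio.

    a, n : static-stress prefactor and time exponent
    duty : fraction of the period during which the stress is active (0..1)
    t    : total operating time in seconds
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty ratio must lie in [0, 1]")
    if t < 0:
        raise ValueError("time must be non-negative")
    return a * (duty * t) ** n


# Hypothetical comparison of static stress and a 50 % duty ratio.
static_shift = hci_shift_dynamic(0.005, 0.25, 1.0, 1.0e4)
dynamic_shift = hci_shift_dynamic(0.005, 0.25, 0.5, 1.0e4)
```

A duty ratio of 1 reduces the expression to the static model, and any smaller duty ratio gives a smaller predicted shift. The sketch contains no recovery term, so it does not represent the second category of models used for BTI.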
Previous studies mainly focused on NBTI only and ignored PBTI. Both NBTI and PBTI are studied in [50] by applying a stochastic, workload-dependent, atomistic trap-based BTI model. The impact of PBTI degradation may be lower than, equal to, or even higher than the NBTI impact [51]. The physical process of PBTI and NBTI is shown in Fig. 11. Under dynamic gate stress, the voltage applied to the gate of the transistors alternates continuously between the on-state and the off-state. Holes are captured under negative gate bias, while electrons are captured under positive gate bias. The AC BTI depends on the capture efficiency of electrons and holes. In addition, [52], [53] indicate that acceptor-like interface traps, positive oxide charges, and neutral electron traps are generated by the off-state stress. Their generation is accelerated by a high lateral electric field, which induces impact ionization and causes reliability problems. The defects induced by the off-state stress should therefore also be taken into account in the modeling. Reliability models for down-scaled transistors with new materials are also worth studying. For transistors with novel materials, such as typical SiC and GaN power devices, the performance under dynamic gate bias stress has been experimentally investigated in [54], [55]. A unified physical and statistical compact model of BTI for down-scaled technology nodes was developed in [56], enabling cycle-to-cycle and device-to-device reliability evaluations. In [57], workload data were extracted through machine learning techniques to establish the aging model. However, these studies are still fairly preliminary.
The above studies show that degradation model development has shifted from predicting short-term aging to predicting long-term aging. Long-term degradation modeling needs to consider not only the off-state stress of the transistors but also the actual process and the operational conditions. In addition, modeling the interaction of different aging mechanisms is still a challenge.

IV. CIRCUIT RELIABILITY ANALYSIS BASED ON TRANSISTOR AGING MODEL
The purpose of establishing the aging models is to accurately estimate the guard bands required to sustain reliability over the entire circuit lifetime. Once the long-term reliability of the circuit is estimated through simulation, the results can be compared with predetermined reliability specifications or limits. If the predicted reliability does not satisfy the requirements, appropriate design modifications are made to improve it.
To simulate the effect of transistor aging on circuit functionality, the transistor-level aging effects can be characterized with additional circuit elements in the SPICE model. For example, [39] uses a voltage-controlled voltage source (VCVS) to characterize the V_th change due to the HCI effect, as illustrated in Fig. 12. The output voltage of the VCVS increases as the HCI effect intensifies. A voltage-controlled current source (VCCS) is used to characterize the decrease of I_ds caused by the HCI effect. Essentially, SPICE simulation is no more than solving a group of transistor-model equations to predict the interactions of all the transistors under external stimuli. Circuit degradation can therefore be viewed as the transistor-level aging effects expressing themselves at the circuit level by changing the transistor model structures. If the change in the transistor model structures can be correctly modeled by including additional circuit elements, and the relations between these additional elements and the time-dependent aging parameters can be built and calibrated with simple testing work, then circuit-reliability simulation can be expected to become a natural and simple step of the overall circuit functional simulation.
Further, for predicting the lifetime degradation of circuits, the aging models are developed independently of the transistor models and therefore need to be embedded into circuit simulation. The typical flow of circuit aging simulators includes three steps. First, the circuit under investigation is simulated without any aging, which is the fresh SPICE simulation. For each transistor, the operating points it runs through are recorded. With this information, the degradation values of the transistor parameters are calculated. This step is implemented in a Verilog-A module [58], as illustrated in Fig. 13(a). Then, the model parameters of the transistor are changed accordingly. Finally, the circuit is simulated again with the newly adapted (aged) transistor model parameters, as illustrated in Fig. 13(b). This makes it possible to extract circuit performance under transistor degradation and to identify the key issues at end of life. At each time step, the parameters of all transistors are updated and the corresponding new circuit conditions are used for the next iteration.
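The three-step flow can be sketched with a toy stand-in for the circuit simulator. In the sketch below, an alpha-power-law gate delay expression replaces the SPICE run and a power-law V_th shift replaces the Verilog-A aging module. Both substitutions, and all function names and parameter values, are assumptions for illustration; in a real flow the stress seen by each transistor comes from the operating points recorded during the fresh simulation.

```python
def gate_delay(vdd: float, vth: float, k: float = 1.0, alpha: float = 1.3) -> float:
    """Toy alpha-power-law delay, used here in place of a SPICE simulation."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * vdd / (vdd - vth) ** alpha


def vth_shift(a: float, n: float, stress_time: float) -> float:
    """Toy aging model: power-law threshold voltage shift."""
    return a * stress_time ** n


def fresh_and_aged_delay(vdd, vth_fresh, a, n, stress_time):
    # Step 1: fresh simulation with unaged parameters.
    fresh = gate_delay(vdd, vth_fresh)
    # Step 2: compute the parameter degradation for the stress time.
    dvth = vth_shift(a, n, stress_time)
    # Step 3: update the model parameter and simulate again.
    aged = gate_delay(vdd, vth_fresh + dvth)
    return fresh, aged


# Hypothetical values: 1.8 V supply, 0.45 V fresh threshold voltage.
fresh_delay, aged_delay = fresh_and_aged_delay(1.8, 0.45, 0.005, 0.25, 1.0e4)
```

The ratio of the aged to the fresh delay gives the timing degradation that would be compared against the guard band.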
Based on these steps, a simulation framework for studying the NBTI-induced delay degradation in circuits was discussed in [44], [59]. As shown in Fig. 14, the lifetime is divided into N time periods, and the V_th degradation is calculated at the end of each time interval. The process continues until the end of the simulation time is reached (10 years in that study). The timing degradation of the circuit is obtained at the end of the simulation. Reference [60] thoroughly investigated the influence of BTI-induced transistor degradation on the dynamic write performance of a 20 nm bulk CMOS SRAM. As shown in Fig. 15, the aging model is also used to monitor the impact of mixed-mode stress on real circuits [61].
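The interval-by-interval update can be sketched as follows. The sketch assumes a power-law shift whose prefactor may change from one interval to the next (for example with workload) and carries the accumulated shift across intervals through an equivalent stress time. This bookkeeping is one common choice and is an assumption here, not necessarily the method of [44], [59]; recovery is not modeled.

```python
def accumulate_vth_shift(prefactors, n, dt):
    """Accumulate a power-law V_th shift over consecutive intervals.

    prefactors : one prefactor A_i per interval (stress-dependent)
    n          : time exponent, shared by all intervals
    dt         : length of each interval in seconds
    Returns the shift at the end of every interval.
    """
    shift = 0.0
    history = []
    for a in prefactors:
        if a <= 0:
            raise ValueError("prefactors must be positive")
        # Time that would have produced the current shift under this stress.
        t_eq = (shift / a) ** (1.0 / n) if shift > 0.0 else 0.0
        shift = a * (t_eq + dt) ** n
        history.append(shift)
    return history


# Hypothetical run: 10 years split into N = 10 intervals of constant stress.
TEN_YEARS_S = 10 * 365 * 24 * 3600.0
N_PERIODS = 10
shift_history = accumulate_vth_shift([0.001] * N_PERIODS, 0.2, TEN_YEARS_S / N_PERIODS)
```

Under constant stress the final value equals the closed-form result A·t^n for the full 10 years, which is a useful consistency check before varying the prefactor per interval.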
Nonetheless, process variations (PVs) in manufacturing and the workload are also increasingly important to circuit reliability evaluation. Circuit degradation estimates can be too optimistic if these factors are not considered. Thus, [41] proposed a new framework for analyzing the impact of BTI and PVs on large-scale digital logic circuits, which reduces the complexity of circuit-level analysis and makes it possible to handle large-scale circuits. A new V_th drift model was proposed in [62] and used to analyze the delay and subthreshold current of circuits under the combined effect of BTI and PVs. In addition, a framework considering the interwoven effects of workload and aging was proposed in [44] to study the NBTI-induced delay degradation in digital circuits. A simulation setup that updates BTI degradation dynamically according to the workload during the simulation was introduced in [48], as shown in Fig. 16. The setup consists of the transistor model combined with the BTI model, and the V_th value of each transistor is updated dynamically during the simulation depending on the workload and time. The circuit simulations are conducted with SPICE or Spectre to obtain the circuit performance (for example, delay).
At present, circuit reliability evaluation is mainly based on equivalent circuits or on SPICE models of the aging effects that update the transistor model parameters after degradation. The degradation of a transistor strongly depends on its 'stress history,' which is determined by its actual operation in the circuit. For circuit aging simulation, the degradation, recovery and variability behavior have to be modeled as accurately as possible. Degradation can become permanent or be compensated, depending on the order in which the aging effects are activated during circuit operation [63]. Thus, lifetime prediction procedures must take the interdependency of the aging mechanisms into account.
The HCI and NBTI effects of transistors are tested using an efficient aging test method [29]. From the I_ds degradation at different channel lengths, we found no obvious channel length dependence for the degradation induced by NBTI. In addition, the degradation of I_ds caused by HCI decreases with increasing device size. In a long-channel transistor, the HCI-induced degradation of I_ds is small and can be ignored. Therefore, we test the I_ds degradation of a 0.5 μm transistor by applying voltage stress to both the gate and the drain (so that the conditions for both the NBTI and HCI effects are met). Likewise, we test the I_ds degradation of a 0.18 μm transistor. To obtain the I_ds degradation induced by HCI alone, we subtract the I_ds degradation of the 0.5 μm transistor from that of the 0.18 μm transistor. This is one way of decoupling aging effects. However, how to apply the decoupling method to circuit reliability simulation still needs to be considered.
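The subtraction can be written out as a short sketch. It treats the 0.5 μm result as NBTI only (HCI being negligible in the long channel) and the 0.18 μm result as NBTI plus HCI, and it assumes that the two contributions add, which is only an approximation when the mechanisms are coupled. The degradation values are hypothetical and are not measurements from [29].

```python
def hci_only_degradation(short_channel, long_channel):
    """Estimate the HCI-only I_ds degradation at each stress time.

    short_channel : I_ds degradation (%) of the short device (NBTI + HCI)
    long_channel  : I_ds degradation (%) of the long device (NBTI only)
    Both lists must be sampled at the same stress times.
    """
    if len(short_channel) != len(long_channel):
        raise ValueError("both series must cover the same stress times")
    return [s - l for s, l in zip(short_channel, long_channel)]


# Hypothetical I_ds degradation in percent at three stress times.
ids_deg_018um = [1.5, 2.5, 4.0]   # 0.18 um device
ids_deg_05um = [0.5, 1.0, 1.5]    # 0.5 um device
hci_component = hci_only_degradation(ids_deg_018um, ids_deg_05um)
```

The resulting series can then be fitted with an HCI model on its own, while the long-channel series is used for the NBTI model.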

V. SUMMARY AND CHALLENGE
The literature reviewed above shows that the degradation of transistor performance can be calculated with aging models under SOV or DOV. The degradation values can then be converted into SPICE model parameters or equivalent circuit models to perform the reliability analysis of the circuit. For digital circuits, the transistor degradation model is used for timing analysis after aging. For analog circuits, it is used for analyzing critical parameters (such as gain and bandwidth) after aging. However, transistor aging simulation either requires too much effort for an entire system, or its accuracy is hard to guarantee at the system design stage because of the coupled aging effects. As channel geometries scale down, the physical mechanisms of these aging effects become more stochastic. Hence, there is a trend toward combining the degradation models for state-of-the-art technologies with machine-learning models to make them more general.
In view of this, the following problems still need to be solved urgently and will be the focus of our future research: 1) an accurate model of the combined effect of HCI and BTI under dynamic stress voltage is lacking; 2) it is difficult to use aging models to describe the degradation of multiple transistor parameters (I_ds, V_th and G_m) in the circuit at the same time; 3) the lack of reliability models that take off-state degradation into account makes it difficult to accurately evaluate the circuit lifetime; 4) to provide a practical basis for the reliable design and optimization of the circuit, the sensitive transistors in the circuit need to be located quickly.
In addition, a challenge for the future is to establish reliability models with only a few parameters that can analyze transistor degradation in mixed digital and analog circuits and correctly describe the circuit's degradation under the different operation voltage conditions described above. At present, the available simulation tools also do not include important dynamic effects (such as recovery). Thus, much work remains to find a satisfactory way to analyze the interdependency of these degradation effects and to perform accurate circuit reliability analysis.