Design of Robust Latch for Multiple-Node Upset (MNU) Mitigation in Nanoscale CMOS Technology

Multiple-node upsets (MNUs) caused by charge sharing are increasing dramatically in advanced nanoscale digital latches, so latches that are robust against MNUs are increasingly important. Although some existing robust latches can recover from MNUs, they incur significant hardware redundancy and have more sensitive nodes because they rely solely on multiple circuit instances (e.g., C-elements (CEs)). To balance high tolerance capability against low overheads, this paper proposes a novel radiation hardened latch (RHL) based on the polarity of the radiation-induced voltage pulse (positive or negative). The proposed latch tolerates every possible single node upset (SNU) and MNU in all considered nodes while using fewer transistors and sensitive nodes. Its timing (transparent and hold) function and its reliability are verified by simulation in a TSMC 65 nm bulk CMOS process. In addition, the cost comparison shows that the proposed RHL latch has a moderate area and power dissipation but provides a significant benefit in both delay and power-delay-area product (PDAP) compared with the alternative latches.


I. INTRODUCTION
A single event upset (SEU) is generated when, in a radiation particle strike, the charge collected at the struck node is larger than the critical charge, and the probability of incurring an SEU is increasing dramatically in sequential cells [1]-[3]. Thus, latches, which are widely used to hold key signals in data propagation paths, need to be protected to avoid data corruption [4].
The radiation hardening techniques that protect latches are generally implemented at three levels: 1) process level: the use of alternative manufacturing processes (e.g., Silicon-On-Insulator (SOI)) [5]; 2) layout level: special layout modifications such as H-gate, guard rings, shallow trench isolation (STI), and increased node spacing [6]-[8]; 3) circuit level: hardware redundancy through circuit duplication, and novel hardened latches with tolerance structures [9]. Circuit-level techniques are the dominant approach to tolerating an SEU in latches because they provide higher reliability without requiring any modification of the commercial process [10]. For instance, the first robust latch (latch1) in [11] and the latch in [12] use dual modular redundancy (DMR) to mask a single node upset (SNU). However, their main drawback is that when a particle changes the value of an internal node, the output node is forced into a floating state because there are no proper feedback paths between the internal nodes and the output node. As a result, the value latched at the output node will be charged or discharged by the higher leakage current of nanoscale CMOS processes [13]. Based on multiple circuit instances such as C-elements (CEs) and the dual interlocked storage cell (DICE) [4], the second and third latches (latch2 and latch3) in [11] are proposed to tolerate a fault in a single node. The latch in [13] achieves SNU tolerance by adding an extra feedback loop. Unfortunately, it only tolerates an SEU in its internal nodes. An SEU at the output must also be considered, since a particle strike can upset the output node as well (i.e., the output node is also a sensitive node).
The associate editor coordinating the review of this manuscript and approving it for publication was Jenny Mahoney.
The main disadvantage of the above latches is that they can tolerate only an SNU. Their protection level is therefore insufficient for upsets that occur in two adjacent nodes, which are widely referred to as multiple-node upsets (MNUs) and are caused by charge sharing effects [14]. MNUs in sequential cells such as SRAMs and latches are increasing dramatically because the physical distance between adjacent nodes shrinks as nanoscale processes scale, so latches must provide a higher protection level to improve their reliability in radiation environments [3]. A series of robust latches has been designed to meet this reliability requirement [15]-[24], but their cost penalties make them less attractive, so they may not be commercially viable in some applications:
1) Although the latches designed in [15], [20] and [23] use few transistors to filter an MNU, many MNU cases cannot be recovered, so a floating state is generated at the output. This floating state is easily altered by the leakage current when the clock interval is long enough in low-speed systems [13]. In addition, short-circuit paths from VDD to GND inevitably exist, which increases the power dissipation (i.e., the short-circuit power dissipation).
2) Layout optimization techniques such as lengthening node spacing are used in some MNU tolerance latches. For example, the authors in [21] propose a DICE-based latch that corrects an upset by duplicating the internal nodes; this latch has nine sensitive nodes, so it has 36 node pairs. To avoid an MNU in every node pair, adequate node spacing must be provided, which greatly increases the layout area; otherwise, many MNU cases can leave the output node floating in a high-impedance state.
3) To recover all upset cases, multiple circuit instances such as CEs are used repeatedly to construct robust latches, at the cost of a larger area penalty (more transistors) and more sensitive nodes (node pairs) [22], [24]. Thus, this hardening approach is not a good choice. For example, Fig. 1(a) gives the schematic of the robust latch in [24], in which nine CEs are used; it has 60 transistors and 23 (253) sensitive nodes (node pairs). Fig. 1(b) shows the schematic of the latch in [22]; it requires 70 transistors and 21 (210) sensitive nodes (node pairs).
In this paper, to balance high tolerance against low overheads, a radiation hardened latch (RHL) is proposed. The proposed latch has the following advantages:
1) It relies on the polarity of the radiation-induced voltage pulse (positive or negative) to provide SEU protection, so the number of sensitive nodes is reduced.
2) The number of transistors is reduced because there are fewer sensitive nodes, and no layout hardening techniques (layout optimization and node isolation) are used, so the area and power overheads are reduced. Besides, the propagation path is shorter, so the propagation delay is smaller.
3) All possible upset cases can be recovered.
4) It does not require a keeper circuit to maintain the value of the output node, because the output node never becomes a high-impedance node.
FIGURE 1. The latches in [22] and [24] use nine CEs to recover an MNU: a) the schematic of the latch in [24]; and b) the schematic of the latch in [22] where 9 CEs are required.
The remainder of the paper is organized as follows. Section II presents the proposed RHL latch and analyzes its timing and protection mechanism. In Section III, the simulation and evaluation are carried out using the TSMC 65 nm bulk CMOS process design kit (PDK), and the effects of process variations on the proposed RHL latch are assessed by Monte Carlo simulation. Finally, Section IV concludes the paper.

II. PROPOSED LATCH DESIGN
A. PROPOSED LATCH
Fig. 2 shows the schematic of the proposed RHL latch, in which TP1 ∼ TP20 are PMOS transistors and TN1 ∼ TN20 are NMOS transistors, so 46 transistors are needed when the three inverters I1, I2 and I3 are included. Compared with the latches in [22] and [24], the proposed RHL latch effectively decreases the number of transistors. The value of the output Q is captured by the next-stage circuits. S1 ∼ S8 are its internal nodes; they maintain the latching value of node Q through TP13 ∼ TP15 and TN17 ∼ TN19. Transistors TP16 ∼ TP19 are controlled by the CLK signal to drive the values of nodes S1, S4, S5 and S8 to the high (1) or low (0) state. Transistors TP20 and TN20 form a transmission gate that drives node Q to the high (1) or low (0) state, depending on the value of the input D. They are controlled by the CLK signal and its complementary signal CLKN, respectively. Here, the CLKN signal is generated by an inverter consisting of a PMOS and an NMOS transistor. The other transistors are driven by the nodes S1 ∼ S8, which establishes the feedback loop. TP1, TP2, TP5 ∼ TP8, TP11 and TP12 guarantee that if a particle strikes node S1, S2, S5 or S6, its value is not driven to 0, because only positive charge is collected [25], [26]. A buffer consisting of two inverters (I2 and I3) is connected to the transmission gate so that the input capacitance is independent of the output load.
2) During the high clock phase (CLK = 1), the propagation path is cut off (both transistors TP20 and TN20 are off), so the value of the output Q is latched according to the values held at nodes S3 and S7: the latch loop preserves the values of all the internal nodes S1 ∼ S8 until they are overwritten when CLK is lowered to 0 again. When S3 = S7 = 0, TP13 ∼ TP15 are on, propagating the supply voltage VDD to the output node Q (Q = 1); when S3 = S7 = 1, the output Q is held at GND through the on transistors TN17 ∼ TN19 (Q = 0).
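The clocking behavior can be summarized with a purely behavioral sketch: the latch is transparent when CLK = 0 and holds its value when CLK = 1. The class name `TransparentLowLatch` and its `step` method are illustrative assumptions; the model ignores the transistor-level structure (S1 ∼ S8, TP/TN devices) and only mirrors the logical function.

```python
class TransparentLowLatch:
    """Behavioral model only: transparent when CLK = 0, holding when CLK = 1."""

    def __init__(self, initial_q: int = 0) -> None:
        self.q = initial_q

    def step(self, clk: int, d: int) -> int:
        if clk == 0:
            # Transparent mode: the transmission gate conducts, so Q follows D.
            self.q = d
        # Hold mode (clk == 1): the gate is off and the feedback loop keeps Q.
        return self.q


latch = TransparentLowLatch()
stimulus = [(0, 1), (1, 0), (1, 1), (0, 0), (1, 1)]  # (CLK, D) pairs
trace = [latch.step(clk, d) for clk, d in stimulus]
print(trace)  # [1, 1, 1, 0, 0]
```

Changes on D while CLK = 1 leave Q untouched, which is the hold behavior described above.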

B. SEU TOLERANCE
Knowledge of the radiation-induced voltage pulse is essential for the construction of the proposed latch. It is induced when the deposited charge is collected, so the upset polarity purely depends on the struck location [25]. Let us utilize an inverter to explain this phenomenon, as shown in Fig. 3 [26]: 1) When the input is 0, transistors TP1 and TN1 are on and off respectively, so its output is 1. If the drain of NMOS TN1 is struck, the output will be pulled down to 0, causing a negative voltage pulse (see Fig. 3(a)); if the drain of PMOS TP1 is struck, the output will be driven to a higher voltage than VDD, thus causing a positive pulse (see Fig. 3(b)).
2) If the input is 1, transistors TP1 and TN1 are off and on respectively, so the output is 0. If the struck location of a particle is the drain of PMOS TP1, a positive pulse will be induced, so that the value of the output node is altered to its complementary value (see Fig. 3(c)); on the contrary, if the struck location is the drain of NMOS TN1, the output will be pulled down to a lower voltage than GND, resulting in a negative pulse (see Fig. 3(d)).
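The four inverter cases of Fig. 3 can be captured in a small lookup sketch. The function name `strike_effect` and the string labels are assumptions made for illustration; the logic only restates the rule above that a strike on an NMOS drain produces a negative pulse and a strike on a PMOS drain produces a positive pulse.

```python
from typing import Tuple


def strike_effect(inverter_input: int, struck_device: str) -> Tuple[str, bool]:
    """Return (pulse polarity, whether the output logic value flips)."""
    if inverter_input not in (0, 1):
        raise ValueError("inverter_input must be 0 or 1")
    if struck_device not in ("NMOS", "PMOS"):
        raise ValueError("struck_device must be 'NMOS' or 'PMOS'")
    output = 1 - inverter_input
    # An NMOS drain strike pulls the node down; a PMOS drain strike pulls it up.
    polarity = "negative" if struck_device == "NMOS" else "positive"
    # The stored value flips only when the pulse moves the node away from its level.
    flips = (output == 1 and polarity == "negative") or (
        output == 0 and polarity == "positive"
    )
    return polarity, flips


for inp in (0, 1):
    for device in ("NMOS", "PMOS"):
        print(inp, device, strike_effect(inp, device))
```

Only two of the four combinations flip the output (Fig. 3(a) and Fig. 3(c)), which is the property the proposed latch exploits to reduce the number of sensitive nodes.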
As per the upset mechanism discussed above, the proposed latch reduces the number of sensitive nodes. Determining the sensitive nodes, which depend on the latching value, is important because it allows the number of transistors and sensitive nodes (node pairs) to be reduced. Consider the proposed latch illustrated in Fig. 2; if Q = 1, the internal nodes S2 ∼ S4 and S6 ∼ S8, the floating nodes F1, F2, F4, F7, F8, F10 and F16, and the output Q (Q = F15) are susceptible to a particle strike. Nodes S1 and S5 are not the drain of an off NMOS transistor, so if these two nodes are struck, only positive charge is collected, which can induce only a positive transient pulse (Fig. 3(b)). This indicates that nodes S1 and S5 are not sensitive nodes because their values are never flipped. For the same reason, when Q = 0, only nodes S1, S3 ∼ S5, S7, S8, F3, F5, F6, F9 and F11 ∼ F13, and the output node Q (Q = F14) are sensitive nodes. Thus, compared with the latches in [22] and [24], the number of sensitive nodes (node pairs) is significantly reduced to 14 (91).
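The node-pair counts quoted in this paper follow from the number of unordered pairs among n sensitive nodes, n(n − 1)/2. The short sketch below checks the reported figures and enumerates the pairs for the Q = 1 case; the dictionary keys and variable names are illustrative, while the node names are those listed above.

```python
from itertools import combinations
from math import comb

# Sensitive-node counts as reported in the text.
reported_nodes = {
    "latch in [21]": 9,
    "latch in [24]": 23,
    "latch in [22]": 21,
    "proposed RHL": 14,
}
pair_counts = {name: comb(n, 2) for name, n in reported_nodes.items()}
for name, pairs in pair_counts.items():
    print(name, reported_nodes[name], pairs)

# Sensitive nodes of the proposed latch when Q = 1, as listed in the text.
sensitive_q1 = ["S2", "S3", "S4", "S6", "S7", "S8",
                "F1", "F2", "F4", "F7", "F8", "F10", "F16", "Q"]
pairs_q1 = list(combinations(sensitive_q1, 2))
print(len(sensitive_q1), len(pairs_q1))  # 14 91
```

A list such as `pairs_q1` could also serve as the set of double-node injection cases to cover in a fault-injection campaign.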
The tolerance performance of the proposed RHL latch is explained by using the values shown in Fig. 2. To simplify the analysis, the latching structure of the proposed latch is divided into two modules (cell-1 and cell-2):
1) When the flipped node is S2 (an SNU alters the value of node S2), TP1 and TP10 are turned off, and TN7 is turned on; the other nodes, such as S1, S3 and S4, preserve their values, so TP6, TN8 and TN4 remain on, quickly recovering node S2.
2) When the flipped node is S3, TN3, TN6 and TN10 are turned on, and TP6 is turned off. However, these changes cannot disturb the values of the remaining nodes, so nodes S4 and S8 preserve their values, which turn on TN5 and TN1; node S3 is then quickly recovered to 0.
3) If the upset occurs on the drain of transistor TN6 (node S4), TN5, TN4 and TN9 are quickly turned off, and only TP5 is turned on. However, this upset is corrected through the on transistor TP4.
4) Because nodes are placed closer together in the layout in advanced nanoscale CMOS technologies, a single particle can affect two adjacent nodes, causing an MNU in a node pair through charge sharing [14]. Hence, if the node pair (S2, S3) is upset, TN7, TN3, TN6 and TN10 are turned on, and TP1, TP10 and TP6 are quickly turned off. However, TN5 and TN1 are on, thus restoring the value of node S3; since nodes S1 and S4 are not altered, TN8 and TN4 are on. As a result, node S2 recovers its value through the on transistors TP6, TN8 and TN4.
5) If charge sharing flips the node pair (S2, S4), TP1, TP10, TN5, TN4 and TN9 are turned off, and TN7 and TP5 are turned on; because nodes S3, S7 and S6 preserve their values, TN6 and TN2 are off and TP4 is on, so the erroneous value of node S4 is recovered by charging. As a result, TN4 is turned on again; because node S1 keeps its value, TN8 stays on. Hence, the upset on node S2 can be recovered by discharging through the on transistors TP6, TN8 and TN4.
6) If charge sharing affects the node pair (S3, S4), the latching values of nodes S3 and S4 are upset, turning on TN3, TN6, TN10 and TP5, and turning off TP6, TN5, TN4 and TN9. However, since TP4 remains on, node S4 is restored, turning on transistor TN5. Finally, node S3 is recovered through the on transistors TN1 and TN5.
7) If one floating node (F1, F2, or F4) is struck, this node can collect positive or negative charge, which is not propagated to other nodes, so the latching value at the output Q is retained. Similarly, when two floating nodes are affected by charge sharing, the output node always maintains the latching value.
8) If one floating node (F1, F2, or F4) and node S2, S3, or S4 are simultaneously affected by an MNU, the charge deposited on the floating node cannot affect other nodes, because node S2, S3, or S4 is a recoverable node. As a result, the output Q maintains the correct value.
9) The proposed latch has a symmetrical structure (the cell-1 and cell-2 modules are symmetrical), so an SNU or MNU occurring in the cell-2 module can also be corrected.
10) If two nodes, one in cell-1 and one in cell-2, incur an MNU, it can be corrected because the case is equivalent to two SNUs occurring in the two modules, respectively.
11) When the output node Q and one sensitive node of the two modules incur an MNU, it can also be corrected because a change of the output node Q cannot alter the latching values of the cell-1 and cell-2 modules.
Due to the symmetrical design, the proposed latch can also recover all possible SNU and MNU cases when 0 is latched. The tolerance of the proposed design is independent of any layout hardening technique; thus, designers can draw the minimal layout to save silicon area. Fig. 4 shows its layout, which occupies an area of 25.35 µm², where only the M1 metal layer is used for routing the interconnects, so it does not affect the overall routing in automated VLSI design.

A. SIMULATION RESULTS OF TIMING AND TOLERANCE
Using the Cadence Spectre tool (the process library is the TSMC bulk 65 nm PDK, and the supply voltage is VDD = 1.2 V), functional simulations of the proposed latch, including timing and tolerance verification, are carried out. Fig. 5 shows the post-layout simulation waveforms, in which dual double-exponential current pulses are injected to mimic the induced transient current; the dual double-exponential current source is used as the fault injection model because it can accurately simulate charge sharing and collection [1]. It can be seen that the proposed latch successfully propagates the input to the output in the low clock phase (CLK = 0) and latches the right value in the high clock phase (CLK = 1). Moreover, all SNU and MNU cases are restored by its fault tolerance mechanism:
1) SNU cases occurring on one internal node from 30 ns to 55 ns, and MNU cases occurring on an internal node pair from 60 ns to 180 ns (Fig. 5(a));
2) Charge sharing occurring on the (F1, S2), (F1, S3) and (F1, S4) node pairs (scenario 1 in Fig. 5(b));
3) Charge sharing occurring on the (F2, S2), (F2, S3) and (F2, S4) node pairs (scenario 2 in Fig. 5(b));
4) Charge sharing occurring on the (F4, S2), (F4, S3) and (F4, S4) node pairs (scenario 3 in Fig. 5(b));
5) Charge sharing occurring on the two floating nodes F1 and F2, with node F1 collecting negative charge (scenario 4 in Fig. 5(b));
6) Charge sharing occurring on the two floating nodes F1 and F2, with node F1 collecting positive charge (scenario 5 in Fig. 5(b));
7) Charge sharing occurring on the (F1, Q) node pair (scenario 6 in Fig. 5(b));
8) Node F1 collecting negative charge, so that its value is not changed (scenario 7 in Fig. 5(b)).
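As a rough illustration of the fault injection model, the sketch below evaluates a commonly used double-exponential current pulse, I(t) = Q/(τf − τr) · (exp(−t/τf) − exp(−t/τr)), and checks numerically that it deposits the charge Q. The time constants and the primary/secondary charges are illustrative assumptions, not values from this paper, and the function names are hypothetical. A "dual" source is modeled here simply as two such pulses applied at the same time to two nodes.

```python
import math


def double_exp_current(t, charge, tau_rise, tau_fall):
    """Double-exponential current (A) at time t (s) depositing `charge` (C) in total."""
    if t < 0.0:
        return 0.0
    scale = charge / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))


def collected_charge(charge, tau_rise, tau_fall, t_end, steps=20000):
    """Integrate the pulse from 0 to t_end with the trapezoidal rule."""
    dt = t_end / steps
    total = 0.0
    previous = double_exp_current(0.0, charge, tau_rise, tau_fall)
    for k in range(1, steps + 1):
        current = double_exp_current(k * dt, charge, tau_rise, tau_fall)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total


# Illustrative values only (assumptions, not taken from the paper).
TAU_RISE = 10e-12         # s
TAU_FALL = 200e-12        # s
PRIMARY_CHARGE = 20e-15   # C, charge deposited on the primary node
SECONDARY_CHARGE = 8e-15  # C, charge shared with the secondary node

primary_q = collected_charge(PRIMARY_CHARGE, TAU_RISE, TAU_FALL, t_end=5e-9)
secondary_q = collected_charge(SECONDARY_CHARGE, TAU_RISE, TAU_FALL, t_end=5e-9)
print(f"primary: {primary_q:.3e} C, secondary: {secondary_q:.3e} C")
```

In a circuit simulator the same waveform would be attached as a current source to each struck node; sweeping the two charges gives the kind of primary/secondary charge plane used later in the robustness comparison.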

B. COST COMPARISON
In this subsection, the hardware overheads in terms of layout area, power dissipation and delay, as well as the traditional power-delay-area product (PDAP) metric, are used to compare the proposed latch with the SNU tolerance latches in [11] and the MNU tolerance latches in [15]-[24]. The R-latch is also assessed as a reference (see Fig. 6); it uses 12 transistors to transmit and latch the value, including four inverters (I1 ∼ I4) and two transmission gates (TG1 and TG2), and it has three sensitive nodes A, B and Q. Table 1 reports the results of storage nodes, sensitive nodes and node pairs. Fig. 7 plots the layout results. As can be seen, apart from the R-latch, latch2 in [11] has the smallest layout area because it uses the fewest transistors. The issue for this hardened latch is that it can recover only an SNU. Among all the considered MNU tolerance latches, the latches in [15] and [19]-[21] have a smaller area than the proposed latch. However, the latches in [15], [20] and [21] cannot recover all MNU cases in the considered node pairs, so short-circuit paths are formed if an MNU is not recovered, resulting in short-circuit power dissipation. Moreover, the output of the latch in [20] is isolated in a floating state; this means that its value can be easily changed by charging/discharging, and this floating state also increases the leakage power dissipation, as mentioned before. The latch proposed in [22] has the largest area because it requires the most transistors (70). The latch in [24] has the most sensitive nodes (node pairs), so it needs a large number of transistors to tolerate an MNU (the number of node pairs is 253). The area of the latch in [19] is smaller than that of the proposed RHL latch, but its delay overhead is larger (see Fig. 8).

FIGURE 8. Delay results for the considered latches.
Fig. 8 gives the delay comparison; the proposed latch has the minimum delay among the MNU tolerance latches, and the delay overhead of the other MNU tolerance latches ranges from 0.72% to 148.65%. The main reasons are that their propagation paths are long and that their outputs are hard to drive because of the keeper circuit (which requires a stronger driving capability). Considering the power comparison in Fig. 9, latch3 in [11] has the minimum power dissipation due to fewer transistors and the stacked topology. Although the latches in [15], [17], [21], [22] and [24] have a smaller dynamic power dissipation than the proposed latch, they can tolerate only some MNU cases, which can result in larger power dissipation (short-circuit and leakage power dissipation). The hardened latches in [16], [20] and [23] consume 33.47%, 1009.13% and 472.40% more power than the proposed latch, respectively. The latch in [20] has the maximum power dissipation because it uses an isolation structure.
A traditional metric, PDAP, is used to show the benefit of the proposed latch. It is obtained by using the following equation:
$$\mathrm{PDAP} = \mathrm{Power} \times \mathrm{Delay} \times \mathrm{Area} \tag{1}$$
The normalized PDAP comparison result is plotted in Fig. 10. As can be seen, the proposed RHL latch has the minimum PDAP value among the MNU tolerance latches in [15]-[24]. The three robust latches in [11] feature a smaller PDAP, but they can correct only an SNU occurring in a single node.
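The metric can be sketched in a few lines, taking PDAP as the product of power, delay and area, as its name indicates. All numeric values below are hypothetical placeholders except the 25.35 µm² layout area reported for the proposed latch; the choice of normalization reference and the function names are also assumptions, since the text does not state which latch the plot is normalized to.

```python
def pdap(power, delay, area):
    """Power-delay-area product; units follow whatever the inputs use."""
    return power * delay * area


def normalized_pdap(latches, reference):
    """Divide every latch's PDAP by the PDAP of the chosen reference latch."""
    ref_value = pdap(*latches[reference])
    return {name: pdap(*values) / ref_value for name, values in latches.items()}


# (power in uW, delay in ps, area in um^2). Power and delay are hypothetical;
# only the 25.35 um^2 area of the proposed latch comes from the text.
latches = {
    "reference": (1.0, 20.0, 10.0),
    "proposed RHL": (2.0, 30.0, 25.35),
}
normalized = normalized_pdap(latches, reference="reference")
for name, value in normalized.items():
    print(f"{name}: {value:.3f}")
```

Because the three factors are multiplied, a latch with moderate area and power can still achieve a low PDAP if its delay is small, which is the trade-off highlighted in this comparison.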
Overall, from the above results, it can be demonstrated that the proposed RHL latch features a moderate layout area and power dissipation to recover all possible upset cases with the minimum delay and PDAP, compared with existing MNU tolerance latches.

C. ROBUST COMPARISON
Charge sharing can induce an MNU in a node pair if the pair shares the charge deposited by a particle event: a large amount of charge is deposited in the primary node, and the remaining charge is shared by a nearby node, which is regarded as the secondary node [27], [28]. However, because charge sharing strongly relies on the layout topology of a circuit, the collected charge strongly depends on the distance between the primary and secondary nodes. Thus, an increase (decrease) in the distance between two nodes can result in a dramatic decrease (increase) in charge sharing and collection [14]. Moreover, an MNU scenario that affects more than two nodes is unlikely to produce a significant state upset due to the extensive charge diffusion in the sequential elements and the wider spread of an SEU strike [29]-[31], so the term MNU commonly refers to a double-node upset [32]. Fig. 11 depicts the deposited charge curves of the various latches, in which the closest nodes of each latch are selected as the injection nodes. In the proposed RHL latch, the closest nodes are the (F1, F2), (S4, F4), (F7, F8), and (S8, F10) node pairs. However, charge sharing occurring on the (F1, F2) and (F7, F8) node pairs cannot change the values of other nodes, so the (S4, F4) node pair is selected as the injection node pair (the same results are obtained if the charge sharing occurs on the (S8, F10) node pair, due to the symmetrical design). From Fig. 11, it can be seen that an SNU occurring on the R-latch is not tolerated because its curve intersects the X and Y axes. The latches in [11] can tolerate an upset in any node, so their curves do not intersect the X and Y axes; meanwhile, the areas of their curves are smaller than those of the MNU tolerance latches, apart from the latch proposed in [20]. This proves that the tolerance of the latches in [11] against an MNU is weaker than that of the proposed latch. The latch in [20] is regarded as an MNU tolerance latch by its authors, but the results in Fig. 11 demonstrate that this latch is an SNU tolerance latch, because its curve does intersect the X axis. The reason is that its output is a sensitive node, and only a small amount of deposited charge can flip the latching value, which is ignored by the authors. The curves of the proposed latch and the latches in [15]-[19] and [21]-[24] coincide. This proves that the proposed RHL latch has superior tolerance against an MNU.
FIGURE 11. Comparison of the deposited charge in the primary and secondary nodes for the considered latches (when the area of the curve is larger, the corresponding tolerance against an MNU is stronger). Two dual double-exponential current sources are simultaneously used to simulate the charge collection on the closest nodes: one is connected with a sensitive node, and another is simultaneously connected with its closest node [27].

D. PROCESS VARIATIONS
In nanoscale CMOS process, the effects of process variations such as oxide thickness, and channel length should be strictly investigated since they can degrade circuit performance [27], [28]. Because Monte Carlo simulation can effectively model process variations as statistical distributions, in this section it is used to measure the effects of statistical process variations as accurately as possible [33].
The results of Monte Carlo simulations of different latches are given in Table 2. Failure probability is defined as [28]:
$$\text{Failure Probability} = \frac{\text{Total Number} - \text{Tolerance Number}}{\text{Total Number}} \tag{2}$$
in which Total Number is the total number of simulations (3000), and Tolerance Number is the number of simulations in which an MNU is successfully recovered. A higher failure probability means that the tolerance capability of a latch against an MNU is affected more seriously. As can be seen, the robust latches in [11] and [20] as well as the R-latch have a higher failure probability; the failure probability of the proposed latch is zero. Thus, these results demonstrate that the process variations do not degrade the tolerance performance of the proposed latch against an MNU.
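The failure probability can be computed directly from the Monte Carlo counts. In the sketch below the function name is an assumption and the count of 2700 recovered runs is a hypothetical example; only the total of 3000 simulations and the zero failure probability of the proposed latch come from the text.

```python
def failure_probability(total_number: int, tolerance_number: int) -> float:
    """Fraction of Monte Carlo runs in which the MNU was not recovered."""
    if total_number <= 0:
        raise ValueError("total_number must be positive")
    if not 0 <= tolerance_number <= total_number:
        raise ValueError("tolerance_number must lie between 0 and total_number")
    return (total_number - tolerance_number) / total_number


TOTAL_RUNS = 3000  # number of Monte Carlo simulations stated in the text

# Every run recovered: matches the zero failure probability of the proposed latch.
proposed = failure_probability(TOTAL_RUNS, 3000)
# Hypothetical latch that recovers 2700 of the 3000 runs.
example = failure_probability(TOTAL_RUNS, 2700)
print(proposed, example)
```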

IV. CONCLUSION
In this paper, a radiation hardened latch (RHL) is proposed to tolerate all single node upsets (SNUs) and multiple-node upsets (MNUs) by using the polarity of the radiation-induced voltage pulse, without any layout hardening techniques. The timing and recovery functions of the proposed latch have been demonstrated by using a circuit-level simulation tool in a TSMC 65 nm CMOS process. Additionally, a cost comparison in terms of area, delay, power, and the traditional PDAP metric has been carried out. The results show that the proposed latch provides a significant benefit in terms of both delay and PDAP. Monte Carlo simulation confirms that the MNU tolerance of the proposed RHL latch is not affected by process variations.