Nontraditional Design of Dynamic Logics Using FDSOI for Ultra-Efficient Computing

In this article, we propose a nontraditional design of dynamic logic circuits using fully-depleted silicon-on-insulator (FDSOI) FETs. FDSOI FET allows the threshold voltage (<inline-formula> <tex-math notation="LaTeX">$V_{\text {t}}$ </tex-math></inline-formula>) to be adjustable (i.e., low-<inline-formula> <tex-math notation="LaTeX">$V_{\text {t}}$ </tex-math></inline-formula> and high-<inline-formula> <tex-math notation="LaTeX">$V_{\text {t}}$ </tex-math></inline-formula> states) by using the back gate (BG) bias. Our design utilizes the front gate (FG) and BG of an FDSOI FET as the input terminals and proposes the dynamic logic gates (like NAND, NOR, AND, OR, XOR, and XNOR) and circuits (like a half-adder and full-adder). It requires fewer transistors to build dynamic logic gates and achieves high performance with low power dissipation compared to conventional dynamic logic designs. The compact industrial model of FDSOI FET (BSIM-IMG) has been used to simulate dynamic logic gates and is fully calibrated to reproduce the 14 nm FDSOI FET technology node data. Calibration is performed for both electrical characteristics and process variations. The simulation results show an average improvement in transistor count, propagation delay, power, and power-delay product (PDP) of 23.43%, 57.16%, 47.05%, and 77.29%, respectively, compared to the conventional designs. Further, our design reduces the charge-sharing effect, which affects the drivability of the dynamic logic gates. In addition, we have analyzed the impact of the process, supply voltage, and load capacitance variations on the propagation delay of the dynamic logic family in detail. The results show that these variations have a minor impact on the propagation delay of the proposed FDSOI-based dynamic logic gates compared to the conventional dynamic logic gates.


I. INTRODUCTION
In modern microprocessors, the design of dynamic logic [see Fig. 1(a)] is essential because it offers advantages over other logic families such as static logic, pseudo-nMOS logic, and complementary pass-transistor logic (CPL) [1] in terms of high switching speed, small area, and low power consumption [2]. Both dynamic and static logic designs can eliminate static power dissipation; however, dynamic logic gates can be twice as fast as their static counterparts.
Dynamic logic gates are beneficial for applications where improved performance and reduced area are required, e.g., ARM Cortex A8 processors [4], dynamic ternary content-addressable memory (TCAM) cells [5], and many other low-power applications [6]. However, dynamic logic gates suffer from circuit design constraints such as noise sensitivity and from several reliability problems such as charge loss and charge sharing [7]. Trommer et al. [8] have used a new class of field-effect transistors with multiple independent gates (MIGFETs) to mitigate the charge-sharing issue. Still, that approach fails to capture two critical effects: velocity saturation and the lower switching threshold of the dynamic logic gate. Extensive transistor scaling, together with process and supply voltage variations, strongly impacts the performance of dynamic logic circuits [6]. Several transistor-level and gate-level topologies have been proposed to mitigate these issues. Pal and Islam [6] have proposed a modified logic topology, the Dynamic Schmitt Trigger topology, to improve the reliability of dynamic circuits against process and supply voltage variations. This topology uses an additional Schmitt Trigger connected to the precharge transistor to mitigate noise and has a stable output voltage irrespective of supply voltage scaling or process variations. However, the delay increases, and thus the speed of operation is reduced. Also, the extra transistors for the Schmitt Trigger increase the overall area and power consumption. Azizi and Najm [9] have used adaptive body biasing to control the systematic variations in the delay of dynamic logic gates. The correlation between such systematic variations and threshold shift is calculated and used to design a suitable monitor circuit. However, this technique cannot be implemented for a system consisting of many blocks of dynamic logic gates, as this would require setting body bias dependencies for each block separately. Also, it cannot mitigate the effect of random process variations, which are increasingly dominant at scaled technology nodes [10].
Our Main Contribution Within the Article: We present a nontraditional circuit topology that exploits the back-bias of fully-depleted silicon-on-insulator (FDSOI) FETs to design dynamic logic gates and circuits. Our design requires fewer transistors than the conventional dynamic logic design. The proposed design is faster, more power-efficient, and less prone to variations. Our nontraditional design also reduces the charge-sharing effect in dynamic logic gates. Further, we demonstrate that process, supply voltage, and load capacitance variations have a smaller impact on the delay of our proposed logic gates than on that of conventional dynamic logic gates.
The rest of the article is organized as follows. Section II describes the BSIM-IMG compact model used for the simulation and its validation with measurement data. Section III presents the proposed FDSOI-based dynamic logic design and covers two- and three-input logic gates. Circuit implementation based on the proposed design is presented in Section IV. Results and discussion are presented in Section V. Section VI summarizes and concludes this work.

II. TRANSISTOR MODEL CALIBRATION WITH MEASUREMENT DATA
Fig. 1(b) shows the schematic view of the 14 nm technology node FDSOI FET. The transistor dimensions of the fabricated high-κ/metal gate FDSOI FET [3] used for compact model calibration are as follows: gate length = 20 nm, buried oxide (BOX) thickness (T_box) = 25 nm, and FG oxide thickness = 1.1 nm. We have used the industry-standard BSIM-IMG compact model [11] to calibrate the FDSOI FET. Fig. 2(a) shows the model validation of the 14 nm technology p-FDSOI FET against the measured data [3] for drain current (I_ds) as a function of FG voltage (V_fg) for different BG voltages (V_bg). We have assumed symmetrical drain current characteristics for the p- and n-FDSOI FETs. Fig. 2(b) shows the simulated transfer characteristics of the n-FDSOI FET.
To realize the nontraditional dynamic logic, we have utilized the threshold voltage (V_t) tuning feature of the FDSOI FET using back-bias. The V_t of an n-FDSOI FET can be tuned to a high (low) value when a negative (positive) voltage is applied to its BG terminal. Fig. 2(b) shows I_ds as a function of V_fg for an n-FDSOI where V_bg is varied from −2 to 2 V at V_ds = 0.75 V. For V_bg = 2 (−2) V, the inversion layer forms in the channel at a lower (higher) V_fg than for V_bg = 0 V. Hence, the transistor operates in the low-V_t (high-V_t) state. The separation between the low-V_t and high-V_t states is depicted in Fig. 2(c) for V_bg = 2 and −2 V.
The T_box of an FDSOI FET plays an important role in varying the V_t [3]. Fig. 3 shows the change in V_t as a function of V_bg for different T_box values. A decrease in T_box increases the voltage difference between the low-V_t and high-V_t states. For T_box = 5 nm and V_bg = 2 (−2) V, the channel inverts more (less) quickly than for T_box = 25 nm and requires less (more) V_fg to turn on the transistor. With an optimized T_box value, the delay of the circuit can be reduced because the transistors turn on faster and discharge the output node quickly. The logic gate implementation using the optimized T_box = 5 nm is presented in Sections III and IV.

Fig. 4(a) and (b) compare the two-input conventional dynamic logic gates and the proposed FDSOI-based dynamic logic gates. For the proposed dynamic logic gates, inputs "A" and "B" are considered bipolar signals. During the precharge phase, the clock signal (CLK) is low, the precharge transistor (M_pre) turns on, the OUT node charges to V_dd, and the evaluation transistor (M_eva) is kept in the OFF state to prevent any leakage. Input signal "A" is applied at the FG of the n-FDSOI, and its value is chosen such that it lies between the high-V_t and low-V_t values [as shown in Fig. 2(c)]. The second input signal, "B," is applied at the BG of the n-FDSOI. Therefore, only when both input signals "A" and "B" are high (i.e., "A" = 0.75 V and "B" = 2 V) is the n-FDSOI in the low-V_t state and conducting. For any other input condition, it remains non-conducting; this property of the n-FDSOI forms the basis of the proposed circuit operation. We proposed the basic idea in [12] for the nontraditional dynamic XNOR logic gate. Here, we extend the idea to basic dynamic logic gates such as NAND, OR, XOR, AND, and NOR, and also to combinational circuits (such as a half-adder and a full-adder) that form the basis for more complex circuits.
Furthermore, we have also analyzed the impact of process, supply voltage, and load capacitance variations for each gate and show how our proposed design performs better than the conventional dynamic logic gates.

Fig. 4(b)(i) and (b)(ii) show the FDSOI-based dynamic NAND gate and OR gate, respectively. Each consists of only one n-FDSOI in the pull-down network (PDN), compared to the two nFETs required in the conventional dynamic logic design. For the two-input NAND logic, the input signal "A" is applied at the FG, and signal "B" is applied simultaneously at the BG of the n-FDSOI. For the OR logic, the complemented input signals are applied: the complement of "A" at the FG and the complement of "B" at the BG of the n-FDSOI.
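The conduction rule described above can be illustrated with a small behavioral model. The following Python sketch is only an illustration under stated assumptions: the two threshold values (`V_T_LOW`, `V_T_HIGH`) are hypothetical placeholders chosen so that the 0.75 V FG level lies between them, and the BG low level of −2 V is assumed from the bipolar-input description; none of these are calibrated device data.

```python
# Behavioral sketch of one n-FDSOI used as a two-input switch.
# Threshold values are hypothetical placeholders, not calibrated data.
V_T_LOW = 0.4    # assumed low-V_t (BG at +2 V), in volts
V_T_HIGH = 1.0   # assumed high-V_t (BG at -2 V), in volts

FG_LEVELS = {0: 0.0, 1: 0.75}   # FG voltage for input A, in volts
BG_LEVELS = {0: -2.0, 1: 2.0}   # BG voltage for input B, in volts (assumed bipolar)


def threshold_voltage(v_bg: float) -> float:
    """Two-state threshold model: positive BG bias selects the low-V_t state."""
    return V_T_LOW if v_bg > 0 else V_T_HIGH


def conducts(a: int, b: int) -> bool:
    """The device conducts only if the FG voltage exceeds the selected V_t."""
    return FG_LEVELS[a] > threshold_voltage(BG_LEVELS[b])


for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b} conducts={conducts(a, b)}")
```

In this simplified model the device conducts only for A = 1 and B = 1, which is the single-transistor AND-type pull-down condition that the NAND gate relies on.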

2) WORKING OF THE PROPOSED DYNAMIC NAND GATE
During the evaluation phase, if both input signals "A" and "B" are logic high (as shown in Fig. 5, during the interval 10-20 ps), the n-FDSOI turns on because of the 0.7 V at the FG and is in the low-V_t state due to the 2 V at the BG. The OUT node discharges to GND through the ON resistance of M_eva. For all other input combinations, the OUT node remains high (at the precharged value) because the n-FDSOI is OFF, either due to a logic low at the FG or because it operates in the high-V_t state (see Fig. 5). Thus, the circuit realizes the dynamic NAND operation at the output node using only one transistor in the PDN. The OR gate works in the same way, with complemented input signals applied to the structure used for the NAND gate.
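The precharge and evaluation phases described above can be sketched at the logic level. This Python model is a simplified illustration, not the simulated circuit: it treats the transistors as ideal switches and ignores timing, leakage, and voltage levels. The function name `dynamic_nand_cycle` is introduced here for illustration only.

```python
def dynamic_nand_cycle(a: int, b: int) -> list:
    """Return the logic value of OUT after the precharge and evaluation phases."""
    trace = []

    # Precharge phase (CLK = 0): M_pre is on, M_eva is off, OUT charges high.
    out = 1
    trace.append(out)

    # Evaluation phase (CLK = 1): the single n-FDSOI conducts only when
    # A (front gate) is high and B (back gate) selects the low-V_t state.
    pull_down_on = (a == 1) and (b == 1)
    if pull_down_on:
        out = 0  # OUT discharges to GND
    trace.append(out)
    return trace


for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b} OUT(precharge, evaluate)={dynamic_nand_cycle(a, b)}")
```

Applying complemented inputs to the same function, `dynamic_nand_cycle(1 - a, 1 - b)`, gives the OR behavior by De Morgan's law.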

4) WORKING OF THE PROPOSED DYNAMIC XNOR GATE
When both input signals (A and B) are logic low or both are logic high, neither N1 nor N2 conducts, and the OUT node stays high. If one of the inputs, A or B, is logic high and the other is logic low, N1 or N2 starts conducting, and the OUT node discharges to GND. Thus, the circuit performs the XNOR operation, as shown in Fig. 5. Similarly, for the proposed dynamic XOR gate, when both input signals "A" and "B" are logic high (low), N1 (N2) is in the OFF state, whereas N2 (N1) is in the ON state, and the OUT node discharges. When one of the inputs is logic high and the other is logic low, neither N1 nor N2 conducts, and the OUT node remains high. Thus, the two n-FDSOIs constitute an XOR logic gate.

... input signal at FG; the BG terminal is grounded. For the NOR logic, the polarity of the input signals is reversed.

6) WORKING OF THE PROPOSED DYNAMIC AND GATE
In the evaluation phase, if both input signals (A and B) are logic high, N1 and N2 remain off, and the OUT node stays at the precharged value V_dd. When A = high (low) and B = low (high), N1 (N2) turns on, and the OUT node discharges to GND. If both input signals are logic low, N2 is on and discharges the OUT node to GND. Hence, this circuit performs the AND logic operation. The working of the NOR gate can be understood similarly. Fig. 5 shows the timing diagram of all the above-mentioned dynamic logic gates.
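The discharge conditions stated in the text for the two-transistor gates can be cross-checked against the intended Boolean functions. The sketch below is a logic-level illustration only: it encodes, for each gate, the input combinations for which the text says the PDN conducts, and it does not model which transistor (N1 or N2) is responsible or how the inputs are wired. The dictionary and function names are introduced here for illustration.

```python
from itertools import product

# Input combinations for which the PDN conducts, as described in the text.
DISCHARGE = {
    "XNOR": lambda a, b: a != b,          # one input high, the other low
    "XOR": lambda a, b: a == b,           # both inputs high or both low
    "AND": lambda a, b: not (a and b),    # every case except both high
}

# Reference Boolean functions the OUT node is expected to realize.
REFERENCE = {
    "XNOR": lambda a, b: int(a == b),
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
}


def evaluate(gate: str, a: int, b: int) -> int:
    """OUT after evaluation: stays precharged (1) unless the PDN discharges it."""
    return 0 if DISCHARGE[gate](a, b) else 1


def matches_reference(gate: str) -> bool:
    """Check the gate against its reference function for all four input pairs."""
    return all(
        evaluate(gate, a, b) == REFERENCE[gate](a, b)
        for a, b in product((0, 1), repeat=2)
    )


for gate in DISCHARGE:
    print(gate, matches_reference(gate))
```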

B. THREE-INPUT FDSOI-BASED DYNAMIC XNOR GATE
Most modern digital circuits use three-input logic gates (e.g., in a full-adder [13]). Fig. 6(a) shows the PDN of a conventional three-input dynamic XNOR logic gate, which requires ten transistors, whereas our proposed three-input FDSOI-based dynamic XNOR logic gate requires only six n-FDSOIs. When all three inputs are high, N1, N3, and N5 will be off. Since N2 has B at its BG, it will be in the high-V_t state and off. However, N4 and N6 will be on, and the OUT node will discharge to GND, which results in a high output state. For the other input combinations, the logic can be deduced from the states of the n-FDSOIs, and the XNOR gate functionality is realized. Other three-input FDSOI-based dynamic logic gates can also be constructed using n-FDSOIs. This idea can also be generalized to n-input dynamic logic gates.

IV. PROPOSED FDSOI-BASED COMBINATIONAL CIRCUITS
A. HALF-ADDER
A half-adder consists of two logic gates that perform the addition of the two input signals, A and B. The XOR logic calculates the SUM (S = A ⊕ B) of the two inputs, and the AND logic computes the CARRY (C_o = A · B). The basic PDN structure of the half-adder is shown in Fig. 7. The conventional half-adder [see Fig. 7(a)] requires six transistors (four for SUM and two for CARRY evaluation), whereas the proposed FDSOI-based half-adder [see Fig. 7(b)] consists of only four n-FDSOIs (two for SUM and two for CARRY).
To compute the SUM bit, S = A ⊕ B, two n-FDSOIs (N1 and N2) are connected at their drains and act as an XOR logic. The other n-FDSOIs, N3 and N4, are connected to perform the AND logic and compute the CARRY (C_o) bit. Fig. 8 shows the layouts of the conventional and proposed half-adder circuits. The layouts are drawn using the commercial 22FDX PDK from GlobalFoundries. The proposed design achieves a 33% area improvement because it requires fewer transistors. The proposed structure correctly performs the half-adder operation. The timing diagram for the half-adder is shown in Fig. 9.

B. FULL-ADDER
To extend the half-adder circuit to a full-adder circuit, another input, the carry-in (C_i), is added to compute the CARRY. The basic PDN of the full-adder structure is shown in Fig. 10. The conventional full-adder circuit [see Fig. 10(a)] requires 16 transistors (ten for SUM and six for CARRY). In contrast, the proposed FDSOI-based full-adder circuit [see Fig. 10(b)] requires ten n-FDSOIs (six for SUM and four for CARRY); hence, the proposed full-adder design is very compact. To compute the SUM, S = A ⊕ B ⊕ C_i, N1 and N2 are connected in parallel with N5 in series with them, and N3 and N4 are connected in parallel with N6 in series with them. The other four n-FDSOIs, N7-N10, are connected to produce C_o, as shown in Fig. 10(b). The simulation result of the full-adder circuit is shown in Fig. 11.
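The logic functions that the half-adder and full-adder must realize can be checked exhaustively against integer addition. The sketch below is a logic-level illustration only and says nothing about the transistor-level PDN. The SUM expressions follow the text; the carry expression used for the full-adder, C_o = A·B + C_i·(A ⊕ B), is the standard textbook form and is an assumption here, since the text does not state which form the proposed circuit implements.

```python
from itertools import product


def half_adder(a: int, b: int) -> tuple:
    """Return (SUM, CARRY) with S = A xor B and C_o = A and B."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c_in: int) -> tuple:
    """Return (SUM, CARRY) with S = A xor B xor C_i.

    The carry uses the textbook form A*B + C_i*(A xor B), assumed here.
    """
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def adders_are_correct() -> bool:
    """Compare both adders with integer addition for every input combination."""
    half_ok = all(
        2 * c + s == a + b
        for a, b in product((0, 1), repeat=2)
        for s, c in [half_adder(a, b)]
    )
    full_ok = all(
        2 * c + s == a + b + c_in
        for a, b, c_in in product((0, 1), repeat=3)
        for s, c in [full_adder(a, b, c_in)]
    )
    return half_ok and full_ok


print(adders_are_correct())
```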

V. RESULTS AND DISCUSSION
A. PERFORMANCE METRICS
The performance of the proposed FDSOI-based dynamic logic circuits is compared with that of the conventional logic circuit designs in terms of transistor count, worst-case delay, power dissipation, and power-delay product (PDP) in Table 1. The simulations use the standard SPICE simulator HSPICE and the calibrated BSIM-IMG model.
The delay of a dynamic logic gate (t_pd) is calculated from t_phl and t_pre, where t_phl is the time taken by the output signal to go from 90% to 10% of V_dd and t_pre is the time taken to precharge the OUT node from 10% to 90% of V_dd. The dynamic power dissipation in a dynamic logic circuit is data-dependent. The total dynamic power consumption (P_dy) of a dynamic logic gate connected to a constant power supply (V_dd) is determined by V_dd and I, where I is the current drawn from the supply during the evaluation phase. I depends on factors such as the gate capacitance, mobility, transistor dimensions, and gate overdrive voltage. To first order, I is directly proportional to the square of the gate overdrive voltage. In the proposed design, the transistor operates near the sub-threshold region, where the gate overdrive voltage is much lower than in conventional dynamic logic gates. Thus, the proposed design's power consumption is much lower than that of the conventional design. A conventional design biased in the near-sub-threshold region loses functionality and suffers increased delay. Our design does not suffer from such problems because both the FG and BG drive the transistor.

The conventional and the proposed dynamic logic designs are simulated under identical temperature, supply voltage, and load conditions. However, the input voltages of the designs vary according to the requirements of the operation. Table 2 provides the nominal values of parameters such as channel length (L), channel width (W), channel thickness (TSI), effective oxide thickness (EOT1), and BOX thickness (EOT2).

The XNOR and XOR gates achieve a 33.33% gain in transistor count over the conventional ones, whereas the OR and NAND gates achieve a 25% gain. However, the AND and NOR designs require the same number of transistors as the conventional designs. In terms of propagation delay and power dissipation, the proposed logic gates improve considerably on the conventional ones.
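The 90%-to-10% and 10%-to-90% transition times used in the delay definition above can be extracted from a sampled waveform by linear interpolation between samples. The Python sketch below shows one way to do this; it is not the measurement setup used in the paper, the function names are introduced here for illustration, and the example waveform is an idealized linear ramp rather than simulated data.

```python
def crossing_time(times, values, level, falling=True):
    """Linearly interpolate the first time the waveform crosses `level`."""
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crossed = (v0 >= level > v1) if falling else (v0 <= level < v1)
        if crossed:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def fall_time_90_10(times, values, vdd):
    """Time for a falling output to go from 90% to 10% of V_dd."""
    t90 = crossing_time(times, values, 0.9 * vdd, falling=True)
    t10 = crossing_time(times, values, 0.1 * vdd, falling=True)
    return t10 - t90


def rise_time_10_90(times, values, vdd):
    """Time for a precharging output to go from 10% to 90% of V_dd."""
    t10 = crossing_time(times, values, 0.1 * vdd, falling=False)
    t90 = crossing_time(times, values, 0.9 * vdd, falling=False)
    return t90 - t10


# Idealized example: OUT ramps linearly between 0.75 V and 0 V over 10 ps.
VDD = 0.75
t_ps = [0.0, 10.0]
print(fall_time_90_10(t_ps, [VDD, 0.0], VDD))
print(rise_time_10_90(t_ps, [0.0, VDD], VDD))
```

For a linear ramp, 80% of the swing takes 80% of the ramp duration, so both calls give approximately 8 ps.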
Additionally, we have evaluated the performance of half-adder and full-adder circuits based on the proposed logic design. The gains in transistor count are 33.33% and 37.5% for the half-adder and full-adder, respectively. Also, a significant reduction in propagation delay and power dissipation is obtained in both circuits, as shown in Table 1.
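The transistor-count gains quoted above follow from simple arithmetic on the counts given in the text. The sketch below recomputes the adder gains from the PDN transistor counts stated in Section IV and averages the per-design gains reported in this section. That the average is taken over the six gates and the two adders with equal weight is an assumption made here; under that assumption the result is about 23.4%, in line with the average transistor-count improvement quoted in the abstract.

```python
def gain_percent(conventional: int, proposed: int) -> float:
    """Relative reduction in transistor count, in percent."""
    return 100.0 * (conventional - proposed) / conventional


# PDN transistor counts stated in the text for the adder circuits.
half_adder_gain = gain_percent(6, 4)     # six conventional vs. four proposed
full_adder_gain = gain_percent(16, 10)   # sixteen conventional vs. ten proposed

# Per-design gains as reported in the text, in percent.
reported_gains = {
    "XNOR": 33.33,
    "XOR": 33.33,
    "OR": 25.0,
    "NAND": 25.0,
    "AND": 0.0,
    "NOR": 0.0,
    "half-adder": 33.33,
    "full-adder": 37.5,
}

# Equal-weight average over all eight designs (assumed averaging scheme).
average_gain = sum(reported_gains.values()) / len(reported_gains)

print(f"half-adder gain: {half_adder_gain:.2f}%")
print(f"full-adder gain: {full_adder_gain:.2f}%")
print(f"average gain:    {average_gain:.2f}%")
```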
The post-layout performance metrics of our proposed design are shown in Table 3. These improvements are attributed to the reduced number of transistors. As the number of transistors is reduced, the parasitics and capacitances are also reduced, which increases the speed without increasing the power. Additionally, by using the BG as one of the input terminals, the channel can be made to conduct at a lower voltage in the sub-threshold region, where the current is much lower. This explains the significantly lower power consumption of the proposed design.

B. CHARGE SHARING ANALYSIS
Charge sharing is a critical problem that occurs in conventional dynamic logic gates when internal nodes are present between the transistors. These internal nodes share the charge stored on the OUT node during the precharge phase and reduce the drivability of the gate. Fig. 12(a) shows a conventional dynamic two-input NAND logic gate where an internal node C_1 is present between N1 and N2. When CLK = 0, the OUT node precharges to V_dd, and when CLK = 1, A = 1, and B = 0, the charge stored in C_L is shared with the C_1 node. In this case, the voltage at the OUT node can be estimated from the redistribution of charge between C_L and C_1. This charge-sharing effect becomes critical if the induced voltage drop goes below the threshold voltage of the transistor in the evaluation network. The proposed design avoids the charge-sharing problem for the two-input logic gates due to the absence of internal nodes [see Fig. 12(b)], which helps the output node maintain its precharged value. Also, for a higher number of input signals, fewer internal nodes are present in the proposed design, which reduces the charge-sharing problem compared to the conventional dynamic logic designs.
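A first-order estimate of the OUT voltage after charge sharing follows from charge conservation between C_L and C_1. The Python sketch below uses the standard textbook relation V_OUT = V_dd · C_L / (C_L + C_1), which assumes an ideal switch between the two nodes and an initially discharged internal node; it is not necessarily the exact expression used in the paper, and the capacitance values in the example are hypothetical.

```python
def shared_output_voltage(vdd, c_load, c_internal, v_internal_initial=0.0):
    """Estimate the OUT voltage after C_L shares its charge with C_1.

    Assumes an ideal switch, so both nodes settle at the same voltage and
    the total charge is conserved.
    """
    total_charge = c_load * vdd + c_internal * v_internal_initial
    return total_charge / (c_load + c_internal)


# Hypothetical example values: C_L = 1 fF, C_1 = 0.25 fF, V_dd = 0.75 V.
V_DD = 0.75
v_out = shared_output_voltage(V_DD, c_load=1.0e-15, c_internal=0.25e-15)
print(f"OUT after charge sharing: {v_out:.3f} V (drop of {V_DD - v_out:.3f} V)")
```

With these example values the OUT node drops from 0.75 V to 0.6 V. With no internal node (C_1 = 0), as in the proposed two-input gates, the function returns V_dd unchanged.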

C. VARIABILITY ANALYSIS
This section explores the impact of process variations on the FDSOI device by performing Monte-Carlo (MC) simulations using 10 000 samples in the SPICE simulator (Synopsys HSPICE). For the FDSOI technology, the sources of variability are the gate work function (φ_g), channel length (L), channel width (W), channel thickness (TSI), equivalent FG dielectric thickness (EOT1), and equivalent BG dielectric thickness (EOT2) [14]. This work assumes that all the sources of variability have a Gaussian distribution. The standard deviation of the gate work function due to metal gate granularity (σ_φg) is assumed to be 15 mV for the n-FDSOI, as demonstrated in [15]. The mean value is µ_φg = 4.425 V, which amounts to σ_φg/µ_φg = 0.34% [15]. Fig. 13 shows the regression curve (I_ON as a function of I_OFF), which we have employed to calibrate and benchmark our results against the measurement data [3]. MC simulations are performed keeping the σ_φg value for the n-FDSOI as explained above and assuming an equal contribution from the rest of the variability sources (i.e., L, W, TSI, EOT1, and EOT2). The simulations are repeated over the σ/µ values until the slope of the regression curve of our MC data points matches the slope of the regression curve of the measurement data points. We find that σ/µ = 3.12% for the remaining variability sources gives a good match between the regression curves of our MC data points and the measurement data points, as demonstrated in Fig. 13. Using this calibrated setup, the amount of variability originating from the other sources, i.e., L, W, TSI, EOT1, and EOT2, is calculated.
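The sampling step of such a Monte-Carlo setup can be sketched in a few lines. The Python example below only draws Gaussian parameter samples using the σ values quoted in the text; it does not run a circuit simulation or reproduce the regression-slope calibration, which in the paper is done in HSPICE. The nominal values for L, EOT1, and EOT2 are taken from the device dimensions quoted in the text (20 nm, 1.1 nm, and the optimized 5 nm), while the nominal values for W and TSI are hypothetical placeholders.

```python
import random
import statistics

MU_PHI_G = 4.425          # mean gate work function, from the text
SIGMA_PHI_G = 0.015       # 15 mV standard deviation, from the text
SIGMA_OVER_MU = 0.0312    # 3.12% for L, W, TSI, EOT1, EOT2, from the text

NOMINAL_NM = {
    "L": 20.0,     # gate length quoted in the text
    "W": 100.0,    # hypothetical placeholder
    "TSI": 6.0,    # hypothetical placeholder
    "EOT1": 1.1,   # FG oxide thickness quoted in the text
    "EOT2": 5.0,   # optimized T_box quoted in the text
}


def sample_device(rng: random.Random) -> dict:
    """Draw one device instance with independent Gaussian variations."""
    sample = {"phi_g": rng.gauss(MU_PHI_G, SIGMA_PHI_G)}
    for name, mean in NOMINAL_NM.items():
        sample[name] = rng.gauss(mean, SIGMA_OVER_MU * mean)
    return sample


def run_monte_carlo(n_samples: int = 10_000, seed: int = 0) -> list:
    """Generate the full set of sampled devices with a fixed seed."""
    rng = random.Random(seed)
    return [sample_device(rng) for _ in range(n_samples)]


samples = run_monte_carlo()
phi = [s["phi_g"] for s in samples]
print(f"sigma/mu of phi_g (specified): {100 * SIGMA_PHI_G / MU_PHI_G:.2f}%")
print(f"sigma/mu of phi_g (sampled):   {100 * statistics.stdev(phi) / statistics.mean(phi):.2f}%")
```

In a full flow, each sampled parameter set would be passed to the circuit simulator, and the resulting I_ON/I_OFF points would be compared with the measured regression slope.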
The variability analysis is performed for the XNOR, XOR, OR, NAND, AND, and NOR gates. Fig. 14 shows the decomposition of the overall σ_Delay into its individual variation sources, i.e., process, supply voltage, and output load capacitance. Supply voltage and output load capacitance variations have been calculated using 3σ variation. The proposed dynamic logic designs are less affected by all the variation sources than the conventional logic design. The main reason is the reduction in transistor count, which translates into weaker parasitic and coupling effects. This makes the delay of the proposed logic designs less sensitive to variations than that of the conventional design.
However, the challenge associated with our proposed dynamic logic gates is cascading, because different voltages are required at the FG and BG. Also, the output of our design is always 0.75 V, so if the output needs to drive the BG of the next gate, it has to be amplified using charge pump circuits. Nevertheless, our proposed dynamic logic gates target standalone circuit applications rather than being a generic solution for dynamic logic gates. Examples include mismatch calculation between two signals using an XNOR gate; applications where highly parallel computations are needed, such as Hamming distance calculation using parallel-connected XNOR cells for hyperdimensional computing and XNOR-based binary neural networks; and other analog computing scenarios such as crossbar arrays.

VI. CONCLUSION
We have proposed a nontraditional design of dynamic logic gates and circuits with FDSOI FETs. The performance metrics in terms of transistor count, propagation delay, power dissipation, and PDP show substantial improvements compared to conventional designs. We have also presented a design methodology for the three-input XNOR logic gate that can be extended to other logic gates and generalized to n-input dynamic logic gates. Our design avoids the charge-sharing problem for the two-input dynamic logic gates and reduces its effect for higher-input dynamic logic gates. Finally, a comprehensive analysis of the process, supply voltage, and load capacitance variations of the n-FDSOI is performed to study their effects on the delay of the logic gates. The proposed design is less sensitive to variations than the conventional design.

VOLUME 9, NO. 1, JUNE 2023