Ultracompact and Low-Power Logic Circuits via Workfunction Engineering

An extensive analysis of sub-10-nm logic building blocks utilizing ultracompact logic gates based on the recently proposed gate workfunction engineering (WFE) approach is provided. WFE sets the WF in the contacts as well as in the two independent gates of an ambipolar Schottky-barrier (SB) FinFET to alter the threshold of the two channels, as a unique lever to modify the logic functionality obtained from a single transistor. Thus, a single-transistor (1T) CMOS pass gate, 2T NAND and NOR gates, as well as 3T or 4T XOR gates with substantial reduction in overall area (50%) and power dissipation (up to $\times 10$) can be implemented. To harness this potential and illustrate the capabilities of these compact ambipolar transistors, novel logic building blocks, including a 6T multiplexer, 8T full adder, 4T latch, 6T D-type flip-flop, and 4T AND-OR-invert (AOI) gate, are developed. Besides the logic verification using 7-nm devices, the dynamic performance of the proposed logic circuits is also analyzed. The comparative simulation study shows that WFE in independent-gate SB-FinFETs can lead to absolutely minimalist CMOS logic blocks without significant degradation to overall power-delay product (PDP) performance.


I. INTRODUCTION
As silicon-based CMOS technology searches for alternative devices and approaches to extend its dominance and to delay the imminent demise of Moore's scaling, minimal changes to the established FinFET architecture are still welcome due to the associated cost savings and rapid adoption cycle [1], [2]. Although less revolutionary, this "more-of-Moore" approach can provide additional time for the paradigm-shifting alternatives to Si CMOS to be developed fully [3]. Recently, we proposed such an effort via gate workfunction engineering (WFE) that can extend FinFET-based logic design down to 5 nm while also resulting in ultracompact XOR/NAND/NOR logic gates with significant reduction in area (∼50%) and power dissipation (up to ×10) [4]. In the proposed gates, the WFE is selectively applied to Schottky-barrier (SB) FinFETs with independent gate inputs, which leads to entirely novel logic elements such as single-transistor (1T) CMOS pass gates as well as 3T/4T XOR gates and 2T NAND/NOR gates with noninverting inputs. In this article, we further extend the WFE approach on SB-FinFETs and introduce additional designs for basic logic building blocks, such as latches, multiplexers (MUXs), and full adders. Hence, we illustrate how the compact gate designs based on ambipolar SB-FinFETs in sub-10 nm can be conveniently utilized to build novel logic elements that have never been explored before.
The basic goal of this article is to introduce the WFE paradigm and SB-FinFETs to the Circuits and Systems community and show that most essential logic elements can be redesigned in this low-power minimalist approach down to 5-nm gate length. Particularly, we propose novel logic building blocks that require very small (≤50%) area and can operate with supply voltage as low as 0.5 V. Besides the elimination of the CMOS pair in pass gates, we also eliminate the inverted inputs in several of the circuits introduced in this article. This is accomplished by the creative use of independent gates and WFE, whereby the transistor threshold can be shifted by the choice of gate WF and the independent gates allow access to single-transistor AND functionality. Thus, the proposed logic building blocks are extremely compact and novel, using the smallest number of CMOS transistors logically possible. In doing so, only two or three WFs are utilized in the logic design to alleviate fabrication concerns. For each proposed circuit, following the logic verification via TCAD simulations, we analyze the switching performance in terms of power-delay product (PDP), area required (in units of $\lambda^2$, where $\lambda$ refers to the smallest feature size in the layout), and noise margin (NM) losses (resistive drops in the pull-up and pull-down networks that compromise logic-level regeneration) and compare these results with conventionally built CMOS (p-n junction FinFET) counterparts. Thus, not only does this article provide unique insights into the design and operation of novel logic building blocks, but it also explores the potential advantages in pursuing both combinational and sequential logic systems via the WFE approach.

II. DEVICE STRUCTURES AND MODELING
The main building block of the proposed logic gates is the generic ambipolar FinFET transistor with SB source/drain contacts. It is assumed to have a gate length $L_g = 7$ nm, oxide thickness $t_{ox} = 1$ nm with a dielectric constant $\varepsilon_{ox} = 12$, made up from a SiO$_2$/HfO$_2$ stack with a 4/6 ratio, and silicon fin thickness $t_{Si} = 5$ nm. The SB-FinFET is capable of substituting for either the n- or p-type device in CMOS circuits, as shown in Fig. 1. This is an innovative design that requires implementation of a FinFET with SB source and drain (S/D) contacts as well as independently driven gates, which is not trivial to achieve. In particular, the number and level of gate WFs must be limited to what is practically achievable. However, both modifications have been experimentally demonstrated [5]-[8] and it is assumed that they can be combined with some effort in the near future, as traditional Moore's scaling concludes. Similarly, it may be possible to finely tune the required WFs using alloys and novel gate dielectrics [9].
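As a quick sanity check on the gate stack quoted above, the equivalent oxide thickness (EOT) and areal gate capacitance can be estimated from the stated thickness and dielectric constant. The sketch below is not part of the TCAD setup. It assumes that the value 12 is the effective relative permittivity of the whole 1-nm stack and uses the textbook SiO2 permittivity of 3.9; the function names are illustrative.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_SIO2 = 3.9             # textbook relative permittivity of SiO2 (assumed)


def equivalent_oxide_thickness(t_ox_m, k_stack):
    """SiO2 thickness giving the same capacitance as the high-k stack."""
    return t_ox_m * K_SIO2 / k_stack


def areal_capacitance(t_ox_m, k_stack):
    """Parallel-plate gate capacitance per unit area, in F/m^2."""
    return EPS0 * k_stack / t_ox_m


t_ox = 1e-9     # 1 nm physical thickness, from the text
k_stack = 12.0  # effective dielectric constant, from the text

eot = equivalent_oxide_thickness(t_ox, k_stack)
c_ox = areal_capacitance(t_ox, k_stack)
print(f"EOT  = {eot * 1e9:.3f} nm")
print(f"C_ox = {c_ox * 100:.2f} uF/cm^2")  # 1 F/m^2 = 100 uF/cm^2
```

Under these assumptions the stack behaves like roughly a third of a nanometer of SiO2, which is the kind of gate coupling a 7-nm gate length needs.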
All devices and circuits in this article are modeled using the Synopsys Sentaurus TCAD suite, where nanoscale transistors can be accurately simulated via quantum (density-gradient) corrected drift-diffusion transport formalism, high-density 2-D meshes, Fermi-Dirac statistics, barrier tunneling modeling via the Wentzel-Kramers-Brillouin (WKB) approximation, and field-dependent mobilities [10]. Hence, two major impacts of quantum mechanics (barrier tunneling and threshold shifts due to small device dimensions) are included in the present simulation study with sufficient detail. More rigorous models, such as nonequilibrium Green's function (NEGF) and energy-balance models, are neither practical nor necessary for the circuit implementations in this article for several reasons, such as overestimation of on-current [11], comparable errors in threshold voltage ($\sim k_B T$, where $k_B$ is Boltzmann's constant and $T$ is the temperature) [12], and exceedingly long simulation times. In any case, since a relatively low supply voltage (≤0.8 V) is used in all logic simulations, the extracted performance parameters have an acceptable level of error compared to higher order models, such as NEGF and full quantum solutions [13]. More importantly, the performance comparisons are fair within the current framework, as identical mathematical and physics models are employed in all devices across the board. Nonetheless, the absolute accuracy of the reported performance parameters could be improved if full-3D simulations with a more rigorous solution of the underlying quantum mechanical transport problem were attempted [14].
An independent-gate ambipolar SB-FinFET has two conduction modes, depending on the size of the SB height dictated by the choice of S/D contact metal [15]. The presence of the SB at the contacts "lifts up" the potential at the end of the channel that would otherwise slip away from gate control due to the depletion fields of S/D junctions, thus improving device scalability. According to Fig. 1, the choice of effective mass, which plays an important role in the SB tunneling, has negligible impact on the current-voltage ($I$-$V$) curves and the overall ON/OFF current ratio important for logic performance. As a result, the electron and hole effective masses adopted for this article are $m_e^* = 0.2m_0$ and $m_h^* = 0.6m_0$, where $m_0$ is the free electron mass. These values are similar to the density-of-states effective masses used by others [16].
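To illustrate how the effective mass and barrier height enter SB tunneling, the sketch below evaluates the textbook WKB transmission through a triangular barrier in a uniform field, $T = \exp\left(-4\sqrt{2m^*}\,(q\phi_b)^{3/2} / (3q\hbar E)\right)$. This closed form is far cruder than the tunneling model in the TCAD tool and is not the authors' implementation. The 1 MV/cm field is an illustrative assumption, the 0.15-eV hole barrier is the value quoted in Section V, and the function name is hypothetical.

```python
import math

Q = 1.602176634e-19     # elementary charge, C
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
M0 = 9.1093837015e-31   # free-electron mass, kg


def wkb_triangular(phi_b_ev, m_rel, field_v_per_m):
    """WKB transmission through a triangular barrier of height phi_b_ev (eV)."""
    barrier_j = phi_b_ev * Q
    exponent = (4.0 * math.sqrt(2.0 * m_rel * M0) * barrier_j ** 1.5
                / (3.0 * Q * HBAR * field_v_per_m))
    return math.exp(-exponent)


FIELD = 1e8  # assumed junction field of 1 MV/cm, illustrative only

# Hole barrier of 0.15 eV with the two effective masses used in the text.
t_hole = wkb_triangular(0.15, 0.6, FIELD)
t_hole_light = wkb_triangular(0.15, 0.2, FIELD)
print(f"T (m* = 0.6 m0) = {t_hole:.3e}")
print(f"T (m* = 0.2 m0) = {t_hole_light:.3e}")
```

In this simple picture the transmission depends exponentially on the square root of the mass, so the weak mass sensitivity reported for the full device in Fig. 1 is a property of the complete simulation rather than of this formula.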
The impact of independent gate control and varying WFs on the operation of the SB-FinFETs can be understood via Fig. 2, where both changes are studied as a function of bottom-gate bias ($V_{BG}$). Fig. 2(a) shows that increasing $V_{BG}$ shifts the current minimum to the left since the electron (hole) branch has a lower (higher) threshold voltage as a result. However, the impact is stronger for the p-type branch than for the n-type one for two reasons: the hole barrier is more readily affected by the depletion field established by the drain bias $V_{DS}$, and it is much smaller than the electron barrier on the source side ($\phi_M = 5.0$ eV). This result also shows that the balance between the ON currents of the two branches can be quickly upset by varying gate or drain bias conditions. Fig. 2(b) discloses a more complicated picture for the workfunction dependence in the presence of independent gate control. In a 3 × 2 arrangement of three WFs ($\phi_{TG} = \phi_{BG} = 4.2$, 4.6, or 5.0 eV) and two drain bias conditions ($V_{DS} = 0.1$ or 0.6 V), the figure shows the changes to the transfer curves ($I_D$ versus $V_{TG}$) when the bottom gate is held at the same potential as the top gate, grounded (logic LOW), or maintained at ∓1 V (logic HIGH) for the hole- or electron-rich channels. Higher drain bias shifts the ambipolar response to the right, as would be expected from depletion from the drain and the resulting enhancement of hole injection. When the bottom gate is grounded, the ON current is reduced as only one channel is contributing to the conduction. On the contrary, when the bottom gate is held at high potential, the corresponding electron-rich ($\phi_{TG} = \phi_{BG} = 4.2$ eV) or hole-rich ($\phi_{TG} = \phi_{BG} = 5.0$ eV) channels essentially cannot be turned OFF with the action of the top gate alone. Of course, for different gate WFs, this picture will change and the conductivity of the channels can be further enhanced or reduced.
Although not very linear or intuitive, if properly used and balanced in a given circuit topology, these competing elements of transistor conductivity can lead to compact logic functions as explored in this article.
Other important insights obtained from Fig. 2(b) include that the independent gate control generally leads to lower ON/OFF current ratios and that SB-MOSFETs have substantially lower ON current compared to the p-n junction FinFETs. In addition to serving as a reference point, these symmetrically driven conventional FinFETs with midgap gate WFs will also serve as the building blocks of the CMOS circuits used in Section IV for performance comparison. They possess the identical gate structure and mesh as the SB-FinFETs and only differ in terms of the S/D contacts, which are formed by heavily doped n- or p-type Si.
With the correct contact metal choice, ambipolar devices, such as the SB-FinFET considered here, can deliver equal current drive for both types of carriers [16], [17] and have been suggested by several groups as a reconfigurable logic element in CMOS circuits [18]. When realized in the form of an SB-FinFET with two independent gates optimized with unequal WFs, the same device can operate as a pass gate with separate electron and hole conduction channels in a single-transistor body [19]. It can pass logic 0 and 1 equally well in a single transistor, a function that would normally take two separate MOSFETs in conventional CMOS. As such, it can provide significant (up to 50%) area reduction in circuits that heavily use CMOS pass gates. A sketch of device operation and the resulting characteristics of this novel 1T pass-gate device is provided in Fig. 3. Due to the extremely short gate length and fin thickness, its effective resistance varies by only three orders of magnitude during logic switching, which could be enhanced if longer transistors or thicker Si fin layers are used.
Another important capability used in the proposed logic implementations is the fine threshold adjustment via WFE that leads to lateral shifts in the ambipolar $I$-$V$ characteristics for SB-FinFETs, as shown in Fig. 4. Such higher or lower threshold FinFETs are needed to create pass gates that only turn on when both independent gates are driven (AND function) in a single transistor, which is key to combining two series FinFETs in the NAND/NOR logic gates into one transistor body [20], [21]. Thus, it is clear that the ability to set independent gate WFs in a single FinFET enables designers to pursue novel circuits, as explored next [4].

III. ULTRACOMPACT LOGIC GATES
Application of WFE to sub-10-nm SB-FinFETs allows us to redesign XOR, NAND, and NOR gates in a minimal fashion. This is accomplished by setting the S/D contacts at the same WF (5.0 eV) and specific adjustments to the independent (top and bottom) gates as necessary to configure each gate for a given logic function. This must be done with care so that not only devices work with acceptable static and dynamic performance but also utilize as few metals as possible, to avoid process complexity and cost [4].

A. XOR
Three types of XOR gates can be built via WFE-optimized SB-FinFETs. The first one, AG-XOR in Fig. 5, is based on the use of two ambipolar pass gates introduced in the previous section and is driven by complementary inputs ($A$, $B$, $\overline{A}$, and $\overline{B}$). Thus, the AG-XOR requires a total of six transistors (6T), including the inverters, and only 2T if the inverted inputs are available. In comparison, the XOR with conventional CMOS pass gates would require 8T with the inverters.
The second XOR, named ambipolar noninverted gate (ANIG) XOR, is essentially a novel and superior implementation of the standard 6T CMOS XOR [22] using SB-FinFETs optimized via the WFE approach. In an ANIG-XOR, $A$ is the input to all gates, and $B$ and $\overline{B}$ drive the positive and negative supplies, respectively, as shown in Fig. 5. Although having the same S/D WFs (5.0 eV), the two pass gates differ in gate WFs (5.2 and 3.6 eV) to operate either as p- or n-type pass gates. Therefore, this device can also be built with conventional FinFETs available today, provided that the WFs indicated in Fig. 5(b) are adopted in the design. Like the earlier AG-XOR case, if $\overline{B}$ is already available, this design results in a 2T-XOR implementation as well. If the inputs $B$ and $\overline{B}$ are replaced with different logic variables, it can also serve as a 2-to-1 MUX in the full-adder (FA) carry-out calculation (see Fig. 7).
The last XOR circuit [ANI-XOR; Fig. 5(c)] is a unique and hitherto unexplored gate, composed of only three SB-MOSFETs and no inverters at all. It relies on a NAND-like pull-down network based on a high-$V_t$ nMOSFET (which can only turn on if $A = B = 1$) and two low-$V_t$ pMOSFETs for the pull-up network, reminiscent of the ANIG configuration. The inverted $B$ input is eliminated because there is an actual ground. The two p-type pass gates generate the alternating ($A\overline{B} + \overline{A}B$) states, while the lower SB-FinFET with coinputs of $A$ and $B$ is in the high state. Since no inverter is needed, this XOR gate can ensure the lowest transistor count for logic units where the inverted inputs are not readily available. It requires only two different gate WFs to optimize the thresholds. Fig. 6 clearly shows that all three proposed XOR gates operate correctly and evenly with similar rise/fall times. The glitches at the transitions are due to the relatively slow (10 ps) 0/1 or 1/0 edges used in the simulation to ensure fast and good convergence in the demanding TCAD simulations. AG-XOR has the best NMs (regenerative loss in output logic levels, which is indicative of the static dc leakage as a switch), and ANIG-XOR is the worst, especially in its pull-down network that drops ∼30 mV.
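The logic of the ANIG-XOR and the 3T ANI-XOR can be checked with a small switch-level model in which each transistor is an ideal switch that either drives a level onto the output node or is off. This is only a behavioral sketch: the wiring of the two p-type pass gates (one gated by $B$ and passing $A$, the other gated by $A$ and passing $B$) is an assumption made for illustration, the exact topology being the one in Fig. 5, and all function names are hypothetical.

```python
from itertools import product


def resolve(drivers):
    """Return the level on a node given (conducting, level) pairs.

    Raises if no switch conducts (floating) or if conducting switches
    disagree (contention), so a wrong topology is caught immediately.
    """
    levels = {level for conducting, level in drivers if conducting}
    if len(levels) != 1:
        raise ValueError(f"floating or contended node: {drivers}")
    return levels.pop()


def anig_xor(a, b):
    """2T ANIG-XOR: A gates both devices, B and NOT B act as the rails."""
    b_bar = 1 - b
    return resolve([
        (a == 0, b),      # p-type device passes the B rail when A is low
        (a == 1, b_bar),  # n-type device passes the NOT B rail when A is high
    ])


def ani_xor(a, b):
    """3T ANI-XOR: no inverted inputs, true ground on the pull-down."""
    return resolve([
        (a == 1 and b == 1, 0),  # high-Vt n-type: conducts only if A = B = 1
        (b == 0, a),             # low-Vt p-type gated by B, passing A (assumed)
        (a == 0, b),             # low-Vt p-type gated by A, passing B (assumed)
    ])


for a, b in product((0, 1), repeat=2):
    print(a, b, anig_xor(a, b), ani_xor(a, b))
```

An ideal-switch model says nothing about the NM losses discussed above; it only confirms that the output node is driven to a single, correct level for every input pair.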

B. NAND AND NOR
It has already been shown that a single high-$V_{tn}$ independent-gate FinFET can be utilized as a two-input AND logic element that only conducts if both gates are biased with a logic 1 ($V_{DD}$) input [23]. This creates an extremely compact 2T NAND gate that has been shown to reduce both power (40%) and delay (10%), culminating in an absolutely minimal logic gate arrangement. A similar arrangement can be made for the pMOS pull-up network in CMOS NOR gates, replacing two series pMOS devices with a single independent-gate high-$V_{tp}$ SB-FinFET. However, the original work suggested using oxide thickness to adjust for high-$V_{tn,p}$, which is impractical and suboptimal for sub-10-nm devices that would suffer from poor subthreshold slope. In our approach, using the $I$-$V$ characteristics optimized only by the choice of gate WFs [see Fig. 5(d) and (e)], it is possible to implement high and low threshold variants of the n- or p-type SB-FinFETs, which can be used to build 2T NAND/NOR gates.
VOLUME 5, NO. 2, DECEMBER 2019
According to the logic outputs in Fig. 6, a single pair of ambipolar SB-FinFETs with low-$V_{tp}$ (empty FinFET symbol) and high-$V_{tn}$ (filled symbol) devices in series arrangement is sufficient to operate as a NAND logic gate. Similarly, in the opposite arrangement, a NOR logic operation is also confirmed. Moreover, in all cases, the logic gates are found to operate correctly with supply voltages as low as 0.6 V and gate lengths as short as 5 nm using this exact same arrangement of WFs [4]. Thus, together with the use of 2T-XOR gates, WFE can be used to design ultracompact logic circuits in terms of transistor count and area required. This is true as long as the independent gate electrode metals (or silicides) with the correct WF can be introduced to the FinFET architecture, which is not trivial, but within the reach of CMOS engineering, especially in sub-10 nm when the options to reduce area and power are truly limited.
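The role of the two threshold variants can be made concrete with an ideal-switch sketch of the 2T gates. It assumes that a high-threshold independent-gate device conducts only when both of its gates are asserted (the AND behavior described above) and that a low-threshold device conducts when either gate is asserted. The function names are illustrative, and no analog effects are modeled.

```python
from itertools import product


def resolve(pull_up, pull_down):
    """Output of a complementary stage; rejects floating or shorted states."""
    if pull_up == pull_down:
        raise ValueError("pull-up and pull-down are not complementary")
    return 1 if pull_up else 0


def nand_2t(a, b):
    pull_down = (a == 1) and (b == 1)  # high-Vtn device: both gates must be high
    pull_up = (a == 0) or (b == 0)     # low-Vtp device: either low gate suffices
    return resolve(pull_up, pull_down)


def nor_2t(a, b):
    pull_up = (a == 0) and (b == 0)    # high-Vtp device: both gates must be low
    pull_down = (a == 1) or (b == 1)   # low-Vtn device: either high gate suffices
    return resolve(pull_up, pull_down)


for a, b in product((0, 1), repeat=2):
    print(a, b, nand_2t(a, b), nor_2t(a, b))
```

Because `resolve` raises whenever both networks conduct or neither does, running the loop also confirms that each pair of threshold assignments yields a fully complementary stage.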

IV. EXAMPLE LOGIC UNITS
Having introduced the basic gates, this section will focus on the range of logic blocks that can be designed using the proposed WFE approach. To provide a basis for comparison, data from alternative designs and, where applicable, CMOS counterparts that employ conventional p-n junction FinFETs are also included in each of the sections.

A. FULL ADDERS
In addition to their general logic use, XOR gates are especially important for the efficient implementation of FAs. In this article, due to the novel XORs designed using the WFE approach, two novel FAs can be implemented. The first of these is an eight-transistor implementation, named AFA-8T (see Fig. 7), that requires no inverted inputs. It is based on two ANI-XOR blocks introduced earlier and a single 2T ANIG block repurposed as a 2-to-1 MUX circuit via its top and bottom inputs. Note that the carry-out bit ($C_{out}$) takes advantage of the equivalence $A \cdot B = B \cdot \overline{(A \oplus B)}$ so that a second ANI-XOR gate can be used, eliminating the need for an inverter. Operation of the AFA-8T circuit is verified via TCAD mixed-mode simulations in Fig. 8. For the chosen set of WFs, the SUM term works with minimal (∼25 mV) loss in NMs, whereas the carry bit has a more significant (∼95 mV) loss for one input combination ($A = 1$, $B = 0$, and $C_{in} = 1$). This loss can be corrected if the WFs are slightly altered. However, this may be a zero-sum game as losses may appear in other input combinations. In most cases, the culprit is the second stage that inherits a slight loss of logic levels from the first XOR stage. Thus, if the WFs for consecutive stages are slightly varied in an alternating manner, at the expense of process complexity, or by adjusting the width-to-length ($W/L$) ratios in each stage, it may be possible to mitigate these losses further.
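The decomposition of the full adder into two XOR stages and one 2-to-1 MUX can be verified exhaustively at the logic level. The sketch below reads the equivalence quoted above as $A \cdot B = B \cdot \overline{(A \oplus B)}$ and assumes that the MUX is selected by $A \oplus B$, passing $C_{in}$ when the select is high and $B$ otherwise. That wiring is an illustrative assumption, the actual one being the circuit in Fig. 7.

```python
from itertools import product


def xor(a, b):
    return a ^ b


def mux2(sel, d0, d1):
    """2-to-1 multiplexer: d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def full_adder(a, b, c_in):
    p = xor(a, b)             # first XOR stage (propagate signal)
    s = xor(p, c_in)          # second XOR stage produces SUM
    c_out = mux2(p, b, c_in)  # MUX produces the carry without any inverter
    return s, c_out


# Identity behind the carry path: A AND B equals B AND NOT (A XOR B).
for a, b in product((0, 1), repeat=2):
    assert (a & b) == (b & (1 - xor(a, b)))

# Exhaustive check against binary addition.
for a, b, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder(a, b, c_in)
    assert 2 * c_out + s == a + b + c_in
    print(a, b, c_in, "->", s, c_out)
```

When $A \oplus B = 0$ the two operands agree and the carry equals either of them, which is why the MUX can pass $B$ directly in that case.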
The second FA circuit in Fig. 7 is called AFA-10T since it requires ten transistors to function, including the two inverters. Thus, if inverted inputs are available, it would take only 6T to operate this FA design, employing only three ANIG-XOR stages. Its operation is also verified via TCAD simulations, as shown in Fig. 9. Unlike the AFA-8T design, both sum and carry outputs suffer from ≤60-mV losses in NMs. Smaller losses are also visible in the SUM = 1 state, indicating that the WFs chosen for the correct XOR operation may still be suboptimal.
Besides the obvious area gains, the proposed FA designs offer significant gains in dynamic performance. This may be ascertained by comparing their PDP with that of the conventional CMOS designs, as provided in Table 1. The total PDP of the AFA-10T and AFA-8T cases is found to be 8.6 and 20.7 aJ, respectively, as opposed to the conventional CMOS FA design with a PDP of 65.5 aJ. While these values are not extremely accurate, due to the finite parasitics included in the TCAD model (S/D resistance $R_{S,D} = 50$ Ω and load capacitance $C_L = 1$ fF) and the lack of gate tunneling in the present model, they do indicate a very competitive performance and relative order in terms of PDP for the proposed FA circuits. However, the NM losses are greater in the proposed FAs, which need to be improved further, possibly by fine-tuning the WFs and Si fin dimensions. Doing so will most likely lower the power dissipation and improve the PDP even further. Instead of the XOR gate level, such additional optimization of WFs might be more beneficial at the functional level, i.e., altering the WF of a specific transistor so as to improve the overall FA performance, an approach currently being studied for a follow-up publication.
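The relative PDP gains implied by the three values quoted above follow from simple ratios, as the short calculation below shows. Only the numbers stated in the text are used, and the dictionary keys are illustrative labels.

```python
# PDP values quoted in the text, in attojoules (1 aJ = 1e-18 J).
PDP_AJ = {"AFA-10T": 8.6, "AFA-8T": 20.7, "CMOS": 65.5}

# Factor by which each proposed adder improves on the CMOS reference.
gain = {name: PDP_AJ["CMOS"] / value
        for name, value in PDP_AJ.items() if name != "CMOS"}

for name, factor in gain.items():
    print(f"{name}: PDP is {factor:.1f}x lower than CMOS")
```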

B. MULTIPLEXERS
The potential of the proposed novel XOR gates for compact and high-performance combinational logic can be assessed via two simple 4-to-1 MUX circuits built using three XOR pass gates. This can be accomplished in two different compact MUX designs based on ANIG (6T) and AG (10T) XOR circuits, compared to the conventional CMOS implementation that would require 16 FinFET transistors (16T), including the inverters for the select bits. The ANIG-XOR version does not need any inverters, which makes it extremely compact. Fig. 10 shows the topology of the novel MUX circuits as well as their logic responses to the application of full 16-bit input combinations using 7-nm transistors and a $V_{DD}$ bias of 0.8 V. The MUX circuits operate correctly for all input conditions and for transistors as small as 5 nm (not shown). The dynamic performance of the MUX circuits extracted from these plots via MATLAB postprocessing is provided in Table 2. The SB-FinFET versions are clearly slower, by up to an order of magnitude. However, they consume five times less power and at least 50% less area. Hence, the resulting PDP is ∼30% and ∼65% larger for the 6T and 10T MUX circuits, respectively. Moreover, despite the significant area and PDP advantage of the 6T MUX circuit, its NM loss is higher than that of the 10T version. Therefore, the main advantage of the ambipolar MUX designs appears to be their lower power consumption and area, as opposed to their speed. This is actually an acceptable outcome since the performance of nano-CMOS logic circuits, dominated by interconnection overheads, is limited only by power dissipation and not by switching speed.
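The three-stage structure and the quoted transistor counts can be reproduced with a short behavioral sketch. It assumes a tree of three 2-to-1 stages and one possible accounting of devices that is consistent with the 6T, 10T, and 16T figures above: 2T per ANIG or AG stage, 4T per stage built from two CMOS transmission gates, and a 2T inverter per select bit where inverted selects are needed. The names and the accounting itself are illustrative, not taken from Fig. 10.

```python
from itertools import product


def mux2(sel, d0, d1):
    """2-to-1 multiplexer: d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def mux4(s1, s0, d):
    """4-to-1 multiplexer built as a tree of three 2-to-1 stages."""
    low = mux2(s0, d[0], d[1])
    high = mux2(s0, d[2], d[3])
    return mux2(s1, low, high)


def transistor_count(per_stage, inverters, per_inverter=2, stages=3):
    """Total devices for a three-stage MUX tree plus select-bit inverters."""
    return stages * per_stage + inverters * per_inverter


COUNTS = {
    "ANIG": transistor_count(per_stage=2, inverters=0),
    "AG": transistor_count(per_stage=2, inverters=2),
    "CMOS": transistor_count(per_stage=4, inverters=2),
}

# Exhaustive functional check over all data and select combinations.
for s1, s0 in product((0, 1), repeat=2):
    for d in product((0, 1), repeat=4):
        assert mux4(s1, s0, d) == d[2 * s1 + s0]

print(COUNTS)
```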

C. SR LATCH AND D FLIP-FLOPS
As an additional example of fundamental logic circuits, we also designed sequential logic units, including a minimalist SR latch (4T) circuit and two different implementations of D-type flip-flops (6T and 10T DFF). All three circuits are based on the novel ambipolar pass gates and NOR gates presented earlier and are verified by mixed-signal TCAD simulations. Both the topology and simulated characteristics of these sequential circuits are given in Fig. 11. Extracted figures of merit for the dynamic response of the proposed circuits are listed in Table 3. T- and JK-type flip-flops designed using the same minimalist approach as above also resulted in comparable gains in area and power.
The first sequential logic circuit demonstrated is the simple SR latch with cross-coupled NOR gates, which has a 33-ps average delay while consuming ∼0.2-µW power. A similar characteristic is also observed for the SR latch counterpart built using the ambipolar NAND gates (not shown). It is possible to optimize the choice of gate WFs [see Fig. 5(e)] in the SR latch to balance the switching performance of the two complementary outputs ($Q$ and $\overline{Q}$). One can also fine-tune the SR latch by using the dc transfer curves, which is not pursued here for brevity.
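The behavior of the cross-coupled NOR latch, and the PDP implied by the delay and power quoted above, can be illustrated with a few lines of code. The latch model is purely logical: it iterates the two NOR gates until the outputs settle, with an iteration limit chosen arbitrarily for this sketch. The PDP uses the rounded 33 ps and 0.2 µW from the text, so it is only indicative.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar, max_iter=10):
    """Settle a cross-coupled NOR latch from state (q, q_bar) for inputs S, R."""
    for _ in range(max_iter):
        q_new = nor(r, q_bar)
        q_bar_new = nor(s, q_new)
        if (q_new, q_bar_new) == (q, q_bar):
            return q, q_bar
        q, q_bar = q_new, q_bar_new
    raise RuntimeError("latch did not settle")


state = (0, 1)                  # start in the reset state
state = sr_latch(1, 0, *state)  # set
after_set = state
state = sr_latch(0, 0, *state)  # hold
after_hold = state
state = sr_latch(0, 1, *state)  # reset
after_reset = state
print(after_set, after_hold, after_reset)

# Power-delay product from the rounded figures quoted in the text.
delay_s = 33e-12
power_w = 0.2e-6
pdp_j = delay_s * power_w
print(f"PDP = {pdp_j / 1e-18:.1f} aJ")
```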
The second demonstration example comprises two D flip-flop circuits designed to show the multiple degrees of freedom provided by the WFE approach in sequential logic synthesis. First, a 10T D flip-flop circuit employing four 2T NAND gates and one inverter is designed, which is the fastest among the three flip-flop circuits and only slightly (∼20%) slower than the conventional FinFET circuit (not shown in Fig. 11). Second, as shown in Fig. 11, a novel 6T D flip-flop is also designed and verified using two individual (n- or p-type) 1T SB-MOS pass gates and two inverters. The use of 1T transmission (pass) gates with optimized gate workfunctions (4.1 and 5.0 eV, respectively) is noteworthy. It illustrates that the WFE approach can also be used to build nMOS- or pMOS-only 1T transmission gates, eliminating the need for an inverted CLK input and hence saving two transistors, a 25% reduction in the total number of transistors in the circuit. The same circuit can also be implemented using the proposed 1T ambipolar pass gate introduced in Fig. 3. However, this would require an additional inverter. The simulated power consumption of the novel 6T D flip-flop is only 98 nW, about ten times less than that of the 10T-DFF design, despite being 1.5 times slower than this larger two-stage design. Therefore, the 6T-DFF has an advantage over the 10T design not only in terms of transistor count and area but also in having a more than seven times better PDP.

D. AOI GATE
As the final example for the use of the WFE approach in the design of minimalist logic building blocks, we showcase a four-input AND-OR-invert (AOI) circuit using only four SB-FinFETs. AOI gates provide a straightforward, one-stage compact synthesis of logic minterms in open-form logic functions. They are especially preferred in cases where the Karnaugh map is readily available and cannot be reduced further [21], [24]. An implementation of a four-input AOI gate via the WFE approach is shown in Fig. 12, along with its TCAD-simulated transient response that validates its output functionality. The gate WFs selected for the proposed AOI design are also indicated in the table inset of Fig. 12. As before, extracted figures of merit for the switching and design characteristics of the proposed circuit are provided in Table 4. In order to serve as baselines for performance comparison, we also provide in this table two alternative AOI implementations: one with a three-stage design based on separate AND, OR, and inverter gates utilizing WFE (as introduced in Section III) and one implemented as a compact one-stage static-CMOS counterpart with standard p-n junction FinFETs. Fig. 12 shows that the AOI circuit has very little loss in NMs, which was a concern in the full adder circuits. This is accompanied by an extremely efficient logic switching performance: the power dissipation is almost an order of magnitude lower than that of the conventional CMOS implementation, which is offset by equally slower switching, resulting in comparable PDP figures. The PDP of the compact AOI implementation is better than that of the three-stage WFE implementation by a factor of two. This is similar to the previous observations; lower ON and OFF currents in SB-FinFETs lead to substantial power savings while also slowing the proposed gates down.
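A four-transistor AOI function can be sketched at switch level by reusing the AND and OR behavior of the high- and low-threshold independent-gate devices. The topology assumed below, two high-threshold n-type devices in parallel for the pull-down and two low-threshold p-type devices in series for the pull-up, is one arrangement that realizes $\overline{AB + CD}$ with four devices. It is an illustrative guess and not necessarily the circuit of Fig. 12.

```python
from itertools import product


def resolve(pull_up, pull_down):
    """Output of a complementary stage; rejects floating or shorted states."""
    if pull_up == pull_down:
        raise ValueError("pull-up and pull-down are not complementary")
    return 1 if pull_up else 0


def aoi22(a, b, c, d):
    # Pull-down: two high-Vt n-type devices in parallel, each an AND of its gates.
    pull_down = (a == 1 and b == 1) or (c == 1 and d == 1)
    # Pull-up: two low-Vt p-type devices in series, each on if either gate is low.
    pull_up = (a == 0 or b == 0) and (c == 0 or d == 0)
    return resolve(pull_up, pull_down)


for a, b, c, d in product((0, 1), repeat=4):
    expected = 1 - ((a & b) | (c & d))
    assert aoi22(a, b, c, d) == expected

print("AOI22 verified for all 16 input combinations")
```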

V. DISCUSSION
This article introduces two novel (4T and 3T) XORs, two extremely compact full adders (8T and 10T), one 6T MUX, and two flip-flops, all based on the WFE-optimized SB-FinFETs. To the best of our knowledge, the 8T FA proposed here is the first fully CMOS logic block of its kind, compared to an nMOSFET-only example proposed earlier, which clearly had a significant loss of NMs that would disqualify it in real applications [25]. Similarly, the 6T MUX design and the 3T ANI-XOR circuit are the smallest CMOS logic gates in their class that deliver correct function without inverted inputs. Although 3T and 4T CMOS XOR gates are reported in the literature [26], [27], they rely on nMOS- or pMOS-only pass gates that do not pass logic 0 and logic 1 states equally well. Unlike these previous examples or conventional CMOS pass gates that demand twice the area or almost an order of magnitude larger power dissipation, the proposed WFE-optimized 4T and 3T XOR gates utilize full CMOS blocks that can pass logic 0/1 levels either equally well or better than single-gate counterparts owing to the dual-gate action of the FinFET.
Several broad observations can be made on the WFE approach for logic design based on the above-mentioned simulation study. First, the use of SB S/D contacts leads to additional series resistance associated with the "tunneling process" that appears to slow SB-MOSFETs down (3-10×) in all logic gates and circuits designed. However, substantial (30%-50%) savings in area and significant (5-10×) power reductions offset this slower response. This tradeoff of lower speed for lower power dissipation appears to be the hallmark of the WFE design in SB-FinFETs. Since power, parasitics, and chip size have become more important figures of merit in post-Moore scaling, the WFE approach as applied to SB-MOSFETs can become a significant option to consider for enhancing the performance of logic circuits in the sub-10-nm regime. It is important to underline that the lower ON current in SB S/D contacts is primarily responsible for the slower speed of the proposed gates in this article, not the changes in the gate WFs. Consequently, the best way to mitigate the reduction of switching speed in SB-FinFET-based designs is to further optimize the S/D WFs, which were fixed at 5.0 eV (i.e., hole barrier height $\phi_b = 0.15$ eV) in this article to balance ON currents and to simplify comparisons across different gate WFs. Another possible avenue to explore is to introduce unequal S/D barrier heights or to utilize SiGe heterojunction contacts in order to optimize the barrier height and contact resistances further.
There is also room for improvement in terms of loss of NMs in the output of multistage functions, such as the full adder. NM loss or lack of full supply-voltage swing at the output can severely limit the logic depth in the synthesis of complex logic functions. Although it can be corrected by insertion of high-gain inverter pairs, this would erode the area savings implied by the novel gates presented here. The worst case of NM loss in the proposed circuits was ∼100 mV drop in the carry-out bit of the full adders that requires four-stage logic. This level of loss is rather high and must be reduced for better reliability in operation. Hence, a more refined optimization or additional mitigation approaches, such as the introduction of an additional WF for one of the gate inputs [28], [29] to compensate or tune out such losses, may be necessary for more complex functions that require additional logic depth.
Due to space and scope limitations, only a limited number of logic circuit examples were provided. However, these examples can be easily expanded. Similarly, the WFE approach can also be applied to other types of transistors, with perhaps less effectiveness, and become an additional tool for designers to optimize logic transistors in sub-10-nm technologies. This, of course, would place an additional burden on process engineering, which would have to make independent-gate FinFETs and dual-gate metal choices available to logic designers [17], [30]. While indeed challenging, these are more manageable and lower cost solutions compared to the conventional scaling approach that can no longer be pursued due to financial, process, and material limitations.

VI. CONCLUSION
A novel approach to design ultracompact logic circuits in sub-10-nm CMOS was proposed and verified via TCAD simulations. The approach relies on the ambipolar characteristics and WFE of independent-gate SB-FinFETs to build extremely minimalist logic gates, including a 1T CMOS pass gate, a 3T XOR gate that does not require any inverters, and 2T NAND/NOR gates. Based on these novel logic gates, several low-power logic building blocks were implemented, including two different (8T and 10T) full adder circuits, two types (6T and 10T) of 4-to-1 MUX circuits, a 4T SR latch, two alternative D-type flip-flops employing six or ten transistors, and a 4T AOI gate. In addition to the logic verifications, the switching performance of these compact logic blocks was also studied using quantum-corrected TCAD simulations. In general, logic implementations using SB-FinFETs with engineered WFs display up to an order of magnitude slower response while dissipating 5-10 times lower power and taking up ∼50% less area. As a result, the proposed ultracompact circuits generally suffer from 20% to 60% higher PDP figures at the $10^{-17}$ J scale compared to the conventional designs based on the standard FinFETs with p-n junction contacts. This level of PDP increase can be easily tolerated, given the significant reduction in power dissipation and substantial gains in logic density. Besides the power and area advantages, the WFE approach can allow designers an additional degree of freedom in sub-10-nm logic design that cannot be afforded by any other means, provided that the additional process complexity is practical and manageable.