Cryo-CMOS Voltage References for the Ultrawide Temperature Range From 300 K Down to 4.2 K

This article presents a family of sub-1-V, fully CMOS voltage references that use MOS devices in weak inversion to achieve continuous operation from room temperature (RT) down to cryogenic temperatures. Their accuracy limitations due to curvature, body effect, and mismatch are investigated and experimentally validated. Implemented in 40-nm CMOS, the references show a line regulation better than 2.7%/V from a supply as low as 0.99 V. By applying dynamic element matching (DEM) techniques, a spread of 1.2% ($3\sigma$) from 4.2 to 300 K can be achieved, resulting in a temperature coefficient (TC) of 111 ppm/K. As the first significant statistical characterization extending down to cryogenic temperatures, the results demonstrate the ability of the proposed architectures to work in harsh cryogenic environments, such as space and quantum-computing applications.

circuits must typically only ensure operation over the military temperature range from −55 °C to +125 °C, several applications, such as space exploration [5], require electronics capable of operating over a significantly extended temperature range; for example, lunar temperatures range from −230 °C to +120 °C [6]. Control electronics for particle detectors [7] or quantum-computing applications even require operating temperatures as low as 100 mK and below, and up to a few tens of kelvins due to self-heating [8]. Given its very-large-scale-integration (VLSI) capabilities, high-frequency operation, and wide operating temperature range, nanometer CMOS technology is an ideal candidate to implement such cryogenic electronics. Cryogenic CMOS (cryo-CMOS) voltage references are, therefore, extremely relevant for the development of such wide-temperature-range applications.
For the standard temperature range, state-of-the-art voltage references typically use Si bipolar junction transistors (BJTs) [9], [10], [11], where a proportional-to-absolute-temperature (PTAT) voltage and a complementary-to-absolute-temperature (CTAT) voltage are summed to generate a first-order temperature-independent reference voltage, fundamentally equal to the bandgap voltage of silicon. However, bandgap references suffer from poor performance at cryogenic temperatures due to freeze-out effects in the base region [12], [13], rendering Si BJTs unsuitable for cryogenic electronics. Moreover, at cryogenic temperatures BJTs are fundamentally incompatible with the low supply voltages used in nanometer CMOS technologies, because the base-emitter voltage $V_{be}$ is higher than 1.1 V at cryogenic temperatures, even for nA collector currents. The SiGe heterojunction bipolar transistor (HBT) can overcome the freeze-out limitation, as it is functional down to mK temperatures, and has already been used in references [13], [14]. However, HBTs are not available in standard CMOS processes and are not suitable for cryogenic sub-1-V designs, since they also require a $V_{be}$ above 1 V at cryogenic temperatures.
Alternatively, MOS devices in weak inversion have been employed at room temperature (RT) [15], [16] and remain well-behaved down to mK temperatures [17], [18], [19]. However, all prior works employing MOS devices instead of BJTs in cryo-CMOS voltage references lack statistical characterization and require high supply voltages (3 and 5.5 V) [12], [20], making them unsuitable for sub-1-V applications. Besides combining voltages with complementary temperature dependence, MOS-based references can exploit the zero-temperature-coefficient (ZTC) point, which is a specific gate-source voltage $V_{gs}$ at which the drain current $I_d$ is constant over temperature [21], [22]. However, extending this principle to cryogenic temperatures would require reliable CAD-compatible cryogenic device models, which are scarce and have significant limitations, such as coverage of only a limited set of geometries [17], [19], [23]. Although a cryogenic ZTC-based reference has been demonstrated [13], the lack of statistical characterization still leaves uncertainty about its robustness to process variations.
As an alternative, this article presents a series of MOS-based voltage references employing NMOS, PMOS, or DTMOS devices as core elements and capable of operating from a sub-1-V supply from 300 down to 4.2 K. Extending [24], we present extensive characterization over process, supply voltage, and temperature, together with an assessment of the performance improvement obtained with dynamic element matching (DEM) and trimming. By providing a systematic study of the main error sources, this work lays the basis for the design of the accurate low-voltage, wide-temperature-range cryo-CMOS voltage references presented in this article.
The article is organized as follows. Section II presents a brief study of the changes in CMOS device behavior at cryogenic temperatures, after which Section III describes the implementation of the proposed voltage reference architectures. Finally, Section IV shows the measurements of the fabricated chips, and Section V provides a conclusion.

II. CRYO-CMOS DESIGN CHALLENGES
One of the major design challenges for cryo-CMOS circuits is the lack of CAD-compatible cryogenic device models, which makes it difficult to quantitatively predict circuit performance. Standard foundry models cannot be used at cryogenic temperatures either, because of the cryogenic shift in device performance and the numerical instability of these models when extrapolated beyond their range of validity. Still, by comparing characterization data [25], [26] at 300 and 4.2 K, bounds on the most relevant changes in device and circuit behavior can be derived to ensure robust circuit design, although no circuit simulations can be performed at cryogenic temperatures.
First, the threshold voltage $V_{th}$ increases by 100-150 mV, which reduces the available headroom by the same amount and makes cryo-CMOS low-voltage circuit design even more challenging than at 300 K. For example, pass-gates can stop conducting in a dead zone around mid-supply due to the increased threshold voltage of both the PMOS and NMOS transistors [27]. In this work, this challenge is overcome by maximizing the overdrive on the switches, or by using pass-gates only where a higher ON-resistance can be tolerated.
Second, the subthreshold slope (SS) is steeper at cryogenic temperatures, causing transistors to behave more like an ideal switch. As a consequence, $V_{gs}$ cannot be significantly reduced, even in weak inversion, which exacerbates the headroom limitations.
Third, mismatch increases at cryogenic temperatures [25], [26]. Due to the steep SS, the impact of $V_{th}$ mismatch on the drain current is more significant. DEM techniques are employed to investigate and mitigate these effects.
Finally, the resistors required for most references are also temperature dependent. To minimize this effect, n-type unsilicided poly resistors are used, which vary less than 5% over temperature [28]. Furthermore, the reference voltage is mostly set by a ratio of resistors, making it less vulnerable to changes in absolute resistance.

A. Working Principle
A MOS transistor operating in weak inversion can emulate the exponential I-V characteristic of a BJT that is required for classical bandgap references. The drain current $I_d$ of a MOS transistor is then given as

$$I_d = \mu C_{ox} \frac{W}{L} (n-1) V_T^2 \exp\left(\frac{V_{gs} - V_{th}}{n V_T}\right) \tag{1}$$

where $\mu$ is the mobility, $C_{ox}$ is the oxide capacitance per unit area, $W$ and $L$ are the width and length, respectively, $n$ is the nonideality factor, and $V_T$ is the thermal voltage. Looking at Fig. 1(a), and assuming $M_1$ and $M_2$ are in weak inversion and have nominally equal size, the voltage $V_{R1}$ across $R_1$ can be computed as

$$V_{R1} = V_{gs2} - V_{gs1} = n V_T \ln(p) \tag{2}$$

where $V_{gs1,2}$ is the gate-source voltage of $M_{1,2}$, and $p$ is the ratio of current densities between $M_2$ and $M_1$, set by the 1:$p$ gain of the current mirror $M_3$-$M_4$. Note that $V_{R1}$ is a PTAT voltage. Because the sources of $M_1$ and $M_2$ are freely accessible (unlike the collector in parasitic pnp BJTs), $V_{R1}$ can be generated without the typically adopted operational amplifier (e.g., in [9]), resulting in lower power consumption, higher accuracy, and improved reliability under unexpected environmental conditions. Resistor $R_1$ converts $V_{R1}$ into a current (as in [29]), which is mirrored into the series connection of $M_6$ and $R_2$ using $M_3$ and $M_5$; hence, the voltage across $R_2$ is a scaled version of $V_{R1}$. A corresponding CTAT voltage is generated from the gate-source voltage of $M_6$, provided that $M_6$ is also in weak inversion. The reference voltage $V_{ref}$ is then given as

$$V_{ref} = V_{gs6} + m \frac{R_2}{R_1} V_{R1} = V_{th} + n V_T \ln\left(\frac{I_{d6}}{\mu C_{ox} \frac{W}{L} (n-1) V_T^2}\right) + m \frac{R_2}{R_1} n V_T \ln(p) \tag{3}$$

where $m$ is the gain of the current mirror $M_{3,5}$, and $I_{d6}$ is the drain current of $M_6$. By appropriately choosing $m R_2/R_1 \cdot \ln(p)$, the temperature coefficient (TC) of the PTAT component can be scaled to obtain a first-order temperature-independent reference voltage $V_{ref}$, approximately equal to the threshold voltage $V_{th}$. Since $V_{gs}$ of a MOS transistor is typically lower than $V_{be}$ of a BJT, MOS-based architectures do not necessarily require low-voltage techniques to implement sub-1-V references, unlike traditional bandgap references.

Fig. 1(b) shows the dual circuit implemented with PMOS core devices and NMOS current sources, so that $V_{R1}$ and $V_{ref}$ are now referred to $V_{dd}$. By placing the core PMOS transistors in separate n-wells, their bulk can be connected either to their source, to avoid the body effect, or to their gate, to create a reference based on DTMOS transistors [15]. Compared to PMOS transistors, (P-)DTMOS transistors require a lower $V_{gs}$, have a nonideality factor $n$ closer to unity, and, at least at RT, exhibit lower process variations [15], [18], which reduces the minimum $V_{dd}$ and improves the linearity and variation of $V_{ref}$.
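As a rough numerical illustration of the PTAT/CTAT balance described in this subsection, the following Python sketch evaluates $V_{R1} = n V_T \ln(p)$ and the scaling $m R_2/R_1$ that would cancel a linear CTAT slope to first order. The value $p = 10$ and the CTAT slope of −0.9 mV/K are quoted elsewhere in this article; the nonideality factor `N_ASSUMED` is a hypothetical value chosen only for illustration, and the ideal PTAT model is not expected to hold at deep-cryogenic temperatures.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_k):
    """Thermal voltage V_T = kT/q in volts."""
    return K_OVER_Q * temp_k


def ptat_voltage(temp_k, p, n):
    """V_R1 = n * V_T * ln(p) for equal-size devices in weak inversion."""
    return n * thermal_voltage(temp_k) * math.log(p)


def ptat_gain_for_zero_tc(ctat_slope_v_per_k, p, n):
    """Scaling m*R2/R1 that cancels a linear CTAT slope to first order."""
    ptat_slope = n * K_OVER_Q * math.log(p)  # d(V_R1)/dT in V/K
    return -ctat_slope_v_per_k / ptat_slope


N_ASSUMED = 1.3  # hypothetical nonideality factor (assumption, not from the article)
P = 10           # current-density ratio used in this design

v_r1_300k = ptat_voltage(300.0, P, N_ASSUMED)
gain = ptat_gain_for_zero_tc(-0.9e-3, P, N_ASSUMED)
print(f"V_R1 at 300 K: {v_r1_300k * 1e3:.1f} mV, m*R2/R1 for zero TC: {gain:.2f}")
```

Under these assumptions, $V_{R1}$ stays below 100 mV at 300 K and the required scaling is only a few units, which illustrates why a MOS-based core needs less PTAT amplification than a BJT-based one.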

B. Proposed Architecture
A drawback of the circuit in Fig. 1 is the limited supply rejection due to the noncascoded current sources. Via the finite output impedance, the difference in drain-source voltage between $M_3$ and $M_4$, $\Delta V_{ds} = V_{ds3} - V_{ds4}$, translates into an error in the 1:$p$ current ratio. Furthermore, $\Delta V_{ds}$ depends linearly on $V_{dd}$, thereby limiting the supply rejection. However, inserting cascodes in this architecture is nontrivial due to the required biasing and the limited headroom. A 5× change in absolute current is expected (as the current is set by $V_{R1}/R_1$), which is likely to bring the cascodes from strong into weak inversion. Due to the lack of accurate cryogenic device models, reliably designing bias networks that can deal with such widely shifting operating points is challenging. Moreover, using an operational amplifier (opamp) to keep $V_{ds3}$ and $V_{ds4}$ equal is challenging, since the required input common-mode level of such an opamp (equal to $V_{gs2}$) would not leave sufficient headroom to reliably implement it, especially in the absence of accurate device models. As current-mode voltage references typically need an opamp with similar requirements [30], they are not suitable for the target wide-temperature-range low-voltage applications.
As a solution, the proposed architecture in Fig. 2 employs an additional feedback branch to keep the drain voltages $V_{d3,4}$ of $M_{3,4}$ at the same potential, inspired by [31], but now further reducing the required headroom. The transistor $M_7$ ($M_8$) is a copy of $M_2$ ($M_4$). Since $V_{gs7} = V_{gs2}$, $M_7$ and $M_2$ carry equal currents, resulting in $V_{gs8} = V_{gs4}$ and thus $V_{ds3} = V_{gs8} = V_{gs4} = V_{ds4}$, which is independent of $V_{dd}$ and hence reduces the supply sensitivity. The proposed architecture (Fig. 2) ensures a much better matching of $V_{ds3}$ and $V_{ds4}$ than the simplified architecture (Fig. 1): the simulated supply sensitivity of the difference $V_{ds3} - V_{ds4}$ is only −64 mV/V (Fig. 2) versus −960 mV/V (Fig. 1). Simulations show that the supply sensitivity is then limited by the finite impedance of the output branch. As for the simplified architecture in Fig. 1, PMOS and DTMOS flavors of the proposed architecture have also been implemented, in which all voltages are referred to $V_{dd}$.

Adding the feedback branch also affects the loop gain of this architecture, thus potentially impacting stability. The simplified architecture (Fig. 1) has a loop gain $A_{simp} \approx (g_{m4}/g_{m2}) \cdot (G_{m1}/g_{m3})$, set by the gain of the two gm/gm amplifiers formed by $M_4$ and $M_2$, and by $M_1$ and $M_3$, where $G_{m1} = g_{m1}/(1 + g_{m1} R_1)$ is the equivalent transconductance of the source-degenerated $M_1$, and $g_{mi}$ is the transconductance of $M_i$. Since $G_{m1} < g_{m1}$, the loop gain is positive and below unity ($A_{simp} = 0.4$ for the simplified NMOS architecture), and hence the circuit is stable. For the proposed architecture, the gain of the feedback branch equals $A_{fb} = -g_{m8}/g_{m7}$, noting that $M_8$ and $M_7$ form a gm/gm amplifier. Effectively, this can be modeled by replacing $G_{m1}$ with $A_{fb} \cdot G_{m1}$. The gain of the loop can now be expressed as

$$A_{prop} \approx \frac{g_{m2}}{g_{m4}} \cdot \frac{g_{m3}}{A_{fb} \cdot G_{m1}} = \frac{1}{A_{fb} \cdot A_{simp}}$$

Note that the direction of the loop is now opposite to that in Fig. 1. Since $M_7$ and $M_8$ carry the same current, with $M_7$ in weak inversion, $g_{m7} > g_{m8}$, so that $-1 < A_{fb} < 0$ and therefore $A_{prop} < -A_{simp}^{-1} < -1$, and the circuit is stable.
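The stability argument can be checked numerically with a small Python sketch. It assumes that the loop gain of the proposed architecture is the inverse of $A_{fb} \cdot A_{simp}$, which is our reading of the stated inequality $A_{prop} < -A_{simp}^{-1} < -1$ and of the reversed loop direction; the transconductance values for $M_7$ and $M_8$ are hypothetical, and only $A_{simp} = 0.4$ is taken from the text.

```python
def degenerated_gm(gm1, r1):
    """Equivalent transconductance Gm1 of the source-degenerated M1."""
    return gm1 / (1.0 + gm1 * r1)


def loop_gain_simplified(gm1, gm2, gm3, gm4, r1):
    """A_simp ~ (gm4/gm2) * (Gm1/gm3) for the simplified architecture."""
    return (gm4 / gm2) * (degenerated_gm(gm1, r1) / gm3)


def loop_gain_proposed(a_simp, gm7, gm8):
    """A_prop, assuming the reversed loop has gain 1 / (A_fb * A_simp)."""
    a_fb = -gm8 / gm7  # gain of the gm/gm amplifier formed by M8 and M7
    return 1.0 / (a_fb * a_simp)


A_SIMP = 0.4               # value quoted for the simplified NMOS architecture
GM7, GM8 = 10e-6, 5e-6     # hypothetical transconductances with gm7 > gm8

a_prop = loop_gain_proposed(A_SIMP, GM7, GM8)
print(f"A_prop = {a_prop:.2f}, bound -1/A_simp = {-1.0 / A_SIMP:.2f}")
```

Any choice with $g_{m7} > g_{m8}$ and $0 < A_{simp} < 1$ gives a negative loop gain with magnitude above unity in this model, which matches the inequality used in the text.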
The sizing of the proposed architecture is shown in the table in Fig. 2. The sizing process starts by finding the current-density range in which the core transistors are in weak inversion. This range, divided by the expected change in (PTAT) current due to the temperature change, determines the maximum current-density ratio $p$. A larger $p$ reduces the required scaling of $V_{R1}$ and therefore reduces error propagation from $V_{R1}$ to $V_{ref}$. The available cryogenic device characterization data show that devices that are in weak inversion at 300 K can be assumed to be in weak inversion at cryogenic temperatures as well [12]. Moreover, the PTAT nature of the bias current ensures that the current at cryogenic temperatures is fundamentally lower than at 300 K. As a next step, the absolute currents are set based on leakage considerations, so that leakage currents, such as the gate leakage, are negligible with respect to the bias currents. This current can be defined using $R_1$ according to (2). In this design, the current at 300 K equals 425 nA to limit the effect of leakage. Given that the core transistors ($M_{1,2,6,7}$) need to be in weak inversion, the minimum current sets their aspect ratio. To prevent the current sources ($M_{3,4,5,8}$) from entering weak inversion when the current decreases at cryogenic temperatures, which would compromise their matching [25], they must be biased far into strong inversion, hence the long channel length. The remaining parameters $m$ and $R_2$ can be set based on the scaling factor required for $V_{R1}$ [see (2)], where there is a tradeoff between power (higher $m$) and area (higher $R_2$). Because the scaling factor $m \cdot R_2/R_1$ depends on a ratio of resistors, it is independent of the resistor TC.
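The first two sizing steps can be sketched in Python as follows. The 425 nA bias current at 300 K and $p = 10$ come from this article; the nonideality factor, the width of the weak-inversion current-density range, and the interpretation of 425 nA as the current through $R_1$ are assumptions made only for this illustration.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def max_density_ratio(weak_inversion_range, current_change):
    """Largest p: usable current-density range divided by the expected current change."""
    return weak_inversion_range / current_change


def size_r1(i_bias, temp_k, p, n):
    """R1 such that the PTAT voltage n*V_T*ln(p) across it sets the bias current."""
    v_r1 = n * K_OVER_Q * temp_k * math.log(p)
    return v_r1 / i_bias


# Hypothetical inputs: a 50x weak-inversion range and n = 1.3 are assumptions.
p_max = max_density_ratio(weak_inversion_range=50.0, current_change=5.0)
r1 = size_r1(i_bias=425e-9, temp_k=300.0, p=10, n=1.3)
print(f"p_max = {p_max:.0f}, R1 = {r1 / 1e3:.0f} kOhm")
```

With these placeholder numbers, $R_1$ lands in the range of a few hundred kilohms; the actual value depends on the measured nonideality factor of the core devices.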
The left part of Fig. 2 shows the implementation of the startup network. When the reference is in the OFF-state and no current is flowing, the gate-source voltage $V_{gs} = 0$ for all transistors, and $V_{ref} = 0$. A comparator ($M_{p1,2}$ and $M_{n1,2}$) senses whether the circuit is on ( /2). Two cascaded inverters restore the comparator output to full logic levels. Although a basic digital inverter could be employed to efficiently detect the reference being in the OFF-state, the target reference voltage is close to mid-supply and hence to the threshold of the digital logic, which would compromise the PVT robustness of digital-based detectors. Using the comparator avoids this issue and improves the robustness of the startup. In the OFF-state, the startup transistor $M_9$ is enabled, forcing a current to flow in the reference. After startup is detected by the comparator, $M_9$ is disabled again. For characterization purposes, an enable signal EN was added to allow turning off the reference, startup circuit, and resistive divider. Measurements (see Section IV) showed that the startup network in Fig. 2 is not effective below 60 K. The low $V_{dd}$ limits $V_{gs9} + V_{gs4}$ to 1.1 V, causing those transistors to be either off or too far in subthreshold due to the high threshold voltage at low temperatures (about 600 mV for PMOS [26]). In the second batch, the startup transistor was changed to an NMOS with its drain connected to the gates of $M_3$ and $M_4$ and its source connected to ground. This startup is also not yet fully reliable, as it does not guarantee the startup of the feedback branch. For future designs, it is recommended to connect the source of the (NMOS) startup transistor to ground and its drain to the gate of $M_8$. This will ensure the startup of the feedback loop, which in turn starts up the rest of the circuit, as proven in a different test chip (not shown in this work).

Fig. 3. Schematic showing the proposed reference implemented with core-transistor DEM (on $M_{1,2}$), current-source DEM (on $M_{3,i}$), and a resistive trimming network (on $R_2$). The 16 unit current sources from $M_{3,4,5}$ in Fig. 2 have been combined into the transistor indicated as $M_{3,1-16}$. Each of the units in $M_{3,1-16}$ can be individually configured to be connected to the drain of $M_1$, $M_2$, or $M_6$. The bulk of $M_{1,2,6,7}$ is connected to ground. The chopper, cascode, and trimming switches are SVT devices, and $S_{f1-4}$ are LVT devices.

C. Trimming
By making $R_2$ tunable, the PTAT term of $V_{ref}$ in (3) can be scaled. Consequently, all errors resulting in a PTAT error in $V_{ref}$, such as a mismatch in the ratio $R_2/R_1$, can be compensated for by trimming $R_2$. To allow for this, $R_2$ has been implemented as a fixed resistor $R_0$ in series with a 7-bit, binary-weighted resistor ladder, as depicted in Fig. 3. A series structure is chosen to minimize the required area. To circumvent the switch limitations mentioned in Section II, $R_2$ is not placed at the drain of $M_6$, but at its source. The transistors switching $R_{2,i}$ thus have a source voltage ranging from ground to <70 mV, allowing for sufficient overdrive. If $R_2$ and $M_6$ were interchanged, a voltage of roughly 450 mV would be present on the source of the switches at cryogenic temperatures. The smaller resistors are placed closer to ground to minimize the source voltage of their switches. The switches were sized to optimize their ON/OFF-resistance, taking into account their different source voltages. The simulated worst-case error due to the nonzero ON-resistance is below 700 µV (or 5 ppm/K in terms of TC).
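The trimming range of such a series ladder can be sketched in Python as follows. The 7-bit binary weighting and the fixed resistor $R_0$ follow the description above; the resistor values, the bit polarity (a set bit leaves its resistor in series), and the helper names are hypothetical, and the switch ON-resistance is ignored.

```python
def trimmed_r2(code, r0, r_lsb, bits=7):
    """R2 for a trim code: R0 in series with the binary-weighted segments left in circuit.

    Assumes (hypothetically) that a set bit means the segment's bypass switch is open.
    """
    if not 0 <= code < 2 ** bits:
        raise ValueError("trim code out of range")
    ladder = sum(r_lsb * 2 ** i for i in range(bits) if (code >> i) & 1)
    return r0 + ladder


def code_for_target(target_r2, r0, r_lsb, bits=7):
    """Nearest trim code for a desired R2, clamped to the available range."""
    code = round((target_r2 - r0) / r_lsb)
    return max(0, min(2 ** bits - 1, code))


# Hypothetical values: R0 = 100 kOhm and a 1 kOhm LSB segment.
R0, R_LSB = 100e3, 1e3
code = code_for_target(150.4e3, R0, R_LSB)
print(code, trimmed_r2(code, R0, R_LSB))
```

In this model the trim step of the PTAT term is simply `R_LSB / R1` times $m \cdot V_{R1}$, so the LSB size directly sets the finest achievable PTAT correction.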

D. Dynamic Element Matching
Any mismatch in the current mirrors will affect the 1:$p$:$m$ mirror ratio and therefore the accuracy of $V_{ref}$. By applying DEM to the current sources, this error can be removed. Given that $p = 10$ and $m = 5$, it is a natural choice to implement 16 unit current sources. As confirmed by simulations, any mismatch in the feedback branch translates into a mismatch between $V_{ds3}$ and $V_{ds4}$, which is negligible with respect to the residual error after applying DEM. Fig. 3 shows the implementation of the 16 unit current sources, each having three switches ($S_{c1,i}$-$S_{c3,i}$) that can be individually and statically controlled by an on-chip SPI module, allowing the current to be directed to any of the branches. The switches are implemented as PMOS transistors, which can be opened by applying $V_{dd}$ to their gate. To close a switch, 150 mV is applied (via an external bias source) to its gate. Using 150 mV instead of ground lets the switch double as a cascode, which improves the supply rejection of the circuit. As the source of each switch is at 1.1 V − $V_{ds}$, sufficient overdrive can be guaranteed at cryogenic temperatures. In the first phase, $M_{3,1}$ is connected to the drain of $M_1$, $M_{3,2-11}$ to the drain of $M_2$, and $M_{3,12-16}$ to the drain of $M_6$. In the next phase, these are $M_{3,2}$, $M_{3,3-12}$, and $M_{3,13-16}$ together with $M_{3,1}$, respectively. After a total of 16 phases, $M_1$ and $M_2$ are interchanged with the chopping switches, and the procedure is repeated, yielding 32 phases. As the branch with only one unit current source is the dominant source of variation, each of the 16 unit current sources is thus connected to this branch once every 16 phases. Behavioral simulations with Spectre and MATLAB show that the statistical error in $p$ and $m$ is around 2.8% at −40 °C before DEM and is expected to reduce by about two orders of magnitude to below 0.025%.
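The rotation scheme described above can be illustrated with a short behavioral sketch in Python. It is not the Spectre/MATLAB model used for this work: the 1% unit-current mismatch is a hypothetical value, the chopping of $M_1$ and $M_2$ is omitted, and the sketch averages the mirror ratios directly rather than the resulting $V_{ref}$.

```python
import random


def dem_phase_ratios(units, p=10, m=5):
    """Mirror ratios (p_k, m_k) for each phase of a cyclic rotation of the unit sources."""
    if len(units) != 1 + p + m:
        raise ValueError("expected 1 + p + m unit current sources")
    ratios = []
    for k in range(len(units)):
        rotated = units[k:] + units[:k]
        i_m1 = rotated[0]               # single unit feeding M1
        i_m2 = sum(rotated[1:1 + p])    # p units feeding M2
        i_m6 = sum(rotated[1 + p:])     # m units feeding M6
        ratios.append((i_m2 / i_m1, i_m6 / i_m1))
    return ratios


rng = random.Random(0)
SIGMA = 0.01  # hypothetical 1% (1-sigma) unit-current mismatch
units = [1.0 + rng.gauss(0.0, SIGMA) for _ in range(16)]

ratios = dem_phase_ratios(units)
p_first, m_first = ratios[0]
p_avg = sum(r[0] for r in ratios) / len(ratios)
m_avg = sum(r[1] for r in ratios) / len(ratios)
print(f"p: first phase {p_first:.4f}, DEM average {p_avg:.4f}")
print(f"m: first phase {m_first:.4f}, DEM average {m_avg:.4f}")
```

Averaging over all 16 phases cancels the mismatch to first order, leaving only a second-order residual on the order of the mismatch variance, which is why the error is expected to drop by orders of magnitude rather than by a fixed factor.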
Mismatch in the core transistors $M_1$ and $M_2$ affects the reference voltage, as any mismatch-induced difference between $V_{gs1}$ and $V_{gs2}$ directly appears in $V_{R1}$, which is then amplified to $V_{ref}$ by $m \cdot R_2/R_1$. Note that since the TC of the $V_{gs}$ of a MOS transistor (below −0.9 mV/K in our case) is smaller in magnitude than for a BJT (typically −2 mV/K), a lower value for $m \cdot R_2/R_1$ can be used compared to BJT-based references (for the same $p$), which reduces the amplification of error sources associated with $M_{1,2,3,4}$ and $R_1$ to the output. This is a beneficial property of MOS-based references, especially for uncompensated error sources. In case there is both a threshold-voltage and a beta mismatch between the two core transistors, $V_{R1}$ can be computed as

$$V_{R1} = n V_T \ln\left(p \frac{\beta_1}{\beta_2}\right) + (V_{th2} - V_{th1})$$

where $\beta_{1,2}$ and $V_{th1,2}$ are the beta factor and threshold voltage of $M_{1,2}$, respectively. By exchanging $M_1$ and $M_2$ and averaging $V_{ref}$, the mismatch is removed. Ignoring the body effect (see following subsection), the residual $V_{th}$ mismatch is below 0.2 mV. The implementation of the required switches is shown in Fig. 3. A chopper using NMOS switches at the source of the core transistors can be made sufficiently low impedance, as $V_{R1}$ is below 100 mV. The chopper at the drain of the core transistors can be conveniently combined with the already present cascode switches $S_{c1,i}$-$S_{c3,i}$. Pass-gates $S_{f1-4}$ ensure that proper feedback is maintained when interchanging $M_1$ and $M_2$. Since these pass-gates are in series with a gate (with a gate leakage below 5 nA at 300 K), this only requires $V_{gs} > V_{th}$. Since $V_{th}$ is larger for PMOS than for NMOS in this process, this requirement is always met. At 4.2 K, the ON-resistance is estimated to be below 12.5 kΩ.
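The cancellation obtained by interchanging $M_1$ and $M_2$ follows directly from the weak-inversion model and can be checked with a minimal Python sketch. The expression for $V_{R1}$ is the one implied by the exponential drain-current law; the mismatch values and the nonideality factor are hypothetical.

```python
import math


def v_r1_with_mismatch(n, v_t, p, beta_unit, beta_p, vth_unit, vth_p):
    """V_R1 when one device carries the unit current and the other carries p units.

    beta_unit/vth_unit belong to the unit-current device, beta_p/vth_p to the other.
    """
    return n * v_t * math.log(p * beta_unit / beta_p) + (vth_p - vth_unit)


# Hypothetical mismatch: 2% beta mismatch and 3 mV threshold mismatch.
N, V_T, P = 1.3, 0.02585, 10
BETA_1, BETA_2 = 1.00, 1.02
VTH_1, VTH_2 = 0.450, 0.453

phase_a = v_r1_with_mismatch(N, V_T, P, BETA_1, BETA_2, VTH_1, VTH_2)  # M1 carries the unit current
phase_b = v_r1_with_mismatch(N, V_T, P, BETA_2, BETA_1, VTH_2, VTH_1)  # devices interchanged
average = 0.5 * (phase_a + phase_b)
ideal = N * V_T * math.log(P)
print(f"phase A error: {(phase_a - ideal) * 1e3:.2f} mV, averaged error: {(average - ideal) * 1e3:.4f} mV")
```

In this idealized model the average is exactly $n V_T \ln(p)$; in the circuit, the body effect and the finite switch impedance leave a residual, as discussed in the text.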

E. Configurable Bulk
Whereas DEM can be used to remove statistical mismatch between $M_1$ and $M_2$, it cannot remove systematic mismatch due to the body effect, since $M_1$ and $M_2$ have a different source potential. Besides $M_1$, $M_6$ also suffers from the body effect due to the voltage drop across $R_2$. Interchanging $M_6$ and $R_2$ would solve this problem, but it would also make it challenging to implement a tunable $R_2$ (see Section II). Using the available deep n-well, the architecture in Fig. 4 has been implemented, where the NMOS core transistors are all placed in isolated p-wells. Using the switches, the p-wells can be connected to either the source ($\varphi_1$) or ground ($\varphi_2$), making it possible to assess the impact of the body effect on the PTAT, CTAT, and reference voltages. For PMOS references, source and bulk are always shorted. Finally, the bulk can also be connected to the gate ($\varphi_3$), essentially creating an N-DTMOS configuration. As the gate voltage in the N-DTMOS configuration is expected to be below 450 mV, leaving 650 mV of headroom, these switches can also be implemented with NMOS transistors without the risk of insufficient headroom.

IV. MEASUREMENT RESULTS
Two batches have been fabricated in a commercial 40-nm bulk CMOS process (Fig. 5), similar to the nanometer processes commonly used in cryo-CMOS quantum-computing applications [3], [32], and packaged in ceramic DIP packages. Characterization was performed using a dipstick in a Dewar with liquid helium (LHe) (Fig. 6). Due to the high input impedance (>100 GΩ) of the multimeter (Keithley 2002), no buffering of the references was needed. Seven chips from the first batch (two NMOS, four PMOS, and four DTMOS instances per chip) and four chips from the second batch (nine NMOS, seven PMOS, and seven DTMOS instances per chip) were measured. All architectures are exactly the same in both batches, except for the slight modification of the startup network in the NMOS-based architecture. The NMOS architectures with DEM and configurable bulk connection are only present in the second batch. Data for all presented plots can be found in [33].
A. Reference Voltage-NMOS
Fig. 7(a) shows $V_{ref}$ versus temperature of the NMOS-based architecture for both batches. The value of $R_2$ is set to optimize the TC determined with the box method over the temperature range from 4 to 300 K. The same value of $R_2$ is used for all instances in both batches. Applying a single-point scaling trim at 150 K to both batches in MATLAB, where a temperature-independent scaling factor is applied postmeasurement to the reference voltage such that all references coincide at 150 K, yields the curves in Fig. 7(b). A TC of 258 ppm/K and a spread of 3.8% ($3\sigma$) are achieved, where the TC is computed using the box method, with the box fitting all curves from both batches. It is clearly visible that the box size, and therefore the TC, is dominated by the variation at cryogenic temperatures, attributed to the more severe effects of mismatch at cryogenic temperatures [25], [26], and by the systematic nonlinearity below 20 K. Before trimming, a TC and $3\sigma$ spread of 141 ppm/K and 2.7% for batch 1, and of 348 ppm/K and 4.8% for batch 2, are achieved. Due to the startup issue, batch 1 has a temperature range limited to above 60 K (see Section III), which explains the performance difference between batches 1 and 2.
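The postprocessing described above (single-point scaling trim followed by a box-method TC) can be sketched in Python as follows. This is not the MATLAB script used for the measurements: normalizing the box height to the mean voltage is an assumption, and the sample curves are made-up numbers used only to exercise the functions.

```python
def single_point_scaling_trim(curve, temps, trim_temp, target):
    """Scale one V_ref curve by a temperature-independent factor so it hits target at trim_temp."""
    scale = target / curve[temps.index(trim_temp)]
    return [v * scale for v in curve]


def box_method_tc_ppm(curves, temps):
    """TC in ppm/K from the smallest box enclosing all curves over the full temperature range."""
    all_v = [v for curve in curves for v in curve]
    v_nom = sum(all_v) / len(all_v)  # assumption: normalize to the mean voltage
    return (max(all_v) - min(all_v)) / (v_nom * (max(temps) - min(temps))) * 1e6


# Made-up example data for two samples at three temperatures (in K and V).
temps = [4.2, 150.0, 300.0]
raw = [[0.500, 0.500, 0.510], [0.520, 0.520, 0.530]]

trimmed = [single_point_scaling_trim(c, temps, 150.0, 0.500) for c in raw]
tc_raw = box_method_tc_ppm(raw, temps)
tc_trimmed = box_method_tc_ppm(trimmed, temps)
print(f"TC before trim: {tc_raw:.0f} ppm/K, after trim: {tc_trimmed:.0f} ppm/K")
```

Because the box must enclose every curve of every sample, the resulting TC combines the systematic temperature dependence with the sample-to-sample spread, which is why it is sensitive to the cryogenic variation noted above.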
In addition to $V_{ref}$, the PTAT voltage was characterized by measuring the voltage $V_{R2}$ across the output resistor $R_2$, which in turn allows the CTAT voltage $V_{gs6}$ to be computed using (3). As can be seen in Fig. 8(a), the CTAT voltage $V_{gs6}$ shows an offset between the two batches, but the PTAT voltage $V_{R2}$ overlaps. This low susceptibility to process corners is attributed to the fact that the spread of $V_{R2}$ in (3) mainly depends on mismatch (between $M_1$/$M_2$ and $R_1$/$R_2$ in Fig. 2) rather than on process spread, in addition to any spread in the nonideality factor $n$.
The CTAT voltage $V_{gs6}$ in (3) is directly affected by the spread in $V_{th}$, $R_1$ (via $I_{d6}$), $\mu$, and $n$. Given that $V_{th}$ is outside the logarithm, batch-to-batch spread in $V_{th}$ is the main source of offset in $V_{ref}$ in Fig. 7(a) and in the CTAT voltage in Fig. 8(a). This is also confirmed by corner simulations (about 60 mV change in $V_{ref}$ and 80 mV in $V_{th}$ between extreme corners). The saturation of $V_{gs6}$ at low temperatures is caused by saturation of $V_{th}$, induced by the saturation of the bulk Fermi potential [34], which has been previously observed [25], [35].

B. Reference Voltage-P/DTMOS
As can be seen from the measured reference voltage generated by the PMOS-based [Fig. 7(c)] and DTMOS-based [Fig. 7(e)] references, the reference voltage of the DTMOS-based reference is roughly 100 mV lower than that of the PMOS-based reference. This is caused by the lower threshold voltage of the DTMOS devices, resulting from the bulk of the DTMOS being at a potential lower than $V_{dd}$. Similar to Fig. 7(a), a small offset is present between the two batches, which is again mainly attributed to the spread in $V_{th}$ between both batches and is well within the corner simulations (50 mV change in $V_{ref}$ and 60 mV in $V_{th}$ between extreme corners). In Fig. 7(d) and (f), the reference voltage is depicted after a single-point scaling trim at 150 K. The TC and $3\sigma$ spread are computed on all samples from both batches together. Contrary to the NMOS case, for both PMOS and DTMOS the TC is limited by the systematic nonlinearity below 100 K rather than by statistical errors. In fact, the variation for PMOS and DTMOS is lower than for NMOS (2.6% and 2.7% versus 3.8%). Again, the $3\sigma$ spread is larger below 50 K, pointing to mismatch at cryogenic temperatures as the dominant factor for the variation.

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
As for the NMOS, $V_{R2}$ from both batches overlaps for the PMOS and DTMOS in Fig. 8(b) and (c). Furthermore, an offset is present when comparing $V_{gs6}$ from both batches. As observed in Fig. 7(c) and (e), the PMOS and DTMOS $V_{ref}$ suffer from a large systematic nonlinearity. Based on Fig. 8(b) and (c), this can be traced back to both the PTAT and the CTAT voltage. First, a saturation of $V_{R2}$ can be observed, which is fundamentally caused by a saturation of the SS [18], [36]. Second, $V_{gs6}$ starts increasing below 50 K, which is attributed to the increase in PMOS $V_{th}$ also observed in the literature [34], although a saturation of PMOS $V_{th}$ has also been reported [25]. Given that the increase in $V_{gs6}$ and the saturation of $V_{R2}$ have the same sign, a significant systematic nonlinearity appears in $V_{ref}$ below 100 K, which turns out to be the dominant error that sets the TC. Mostly for the P/DTMOS-based references, but also for the NMOS-based references, a strong nonlinearity in $V_{ref}$ appears below 20 K (Fig. 7), which can be traced back to the PTAT voltage $V_{R2}$. A similar nonlinearity was observed in [18], where the data suggested that the nonlinearity may depend on the operating region of the transistor. Using the model and data in [23], it was verified that the core transistors in the proposed architecture are in weak inversion at all temperatures, making it unlikely that the nonlinearity is caused by the core transistors leaving weak inversion below 20 K. Whereas the model in [23] can be used to investigate whether the devices are in weak inversion, numerical issues cause the model to be inconclusive about the physical origin of the nonlinearity.

C. Dynamic Element Matching
When DEM is not enabled [Fig. 9(a)], that is, for $V_{ref}$ in the first of the 32 DEM phases, the circuit in Fig. 3 exhibits a TC (255 versus 348 ppm/K) and $3\sigma$ spread (5.1% versus 4.8%) comparable to those of the second-batch NMOS $V_{ref}$ in Fig. 7 [...] Consequently, 3.8% can be attributed to the current sources, and 3.0% to the core transistors. DEM is particularly effective at cryogenic temperatures, as it prevents mismatch in the current sources and core transistors from being the dominant source of variation, which is expected in view of the increased mismatch in both weak and strong inversion. With DEM and a scaling trim applied, the residual TC of 111 ppm/K in Fig. 9(d) is no longer limited by the spread but by the systematic nonlinearity below 20 K. As this nonlinearity does not benefit from mismatch-compensation techniques in the circuit core, it cannot be attributed to random or systematic mismatch effects. Because the current magnitude in the circuit reduces significantly at cryogenic temperatures (by approximately 5×), gate leakage could potentially induce nonlinearity. Simulations from −40 °C to 27 °C indicate a maximum gate leakage of about 5 nA, which would lead to an error in $V_{R2}$ at 4.2 K of up to 4%. However, the lack of suitable cryogenic device models, and even the absence of cryogenic gate-leakage characterization data, prevents us from drawing a definitive conclusion.

D. Impact of Body Effect
The impact of the body effect can be analyzed by observing the reference voltage $V_{ref}$ [Fig. 10(a)] and the corresponding CTAT ($V_{gs6}$) and PTAT ($V_{R2}$) components [Fig. 10(b)] when switching the core-transistor bulks in the circuit in Fig. 4 to their sources, to ground, and to their gates, respectively. Looking at Fig. 4 and neglecting the statistical mismatch between $M_1$ and $M_2$, $V_{R2}$ can be written as

$$V_{R2} = m \frac{R_2}{R_1} \left( n V_T \ln(p) + \Delta V_{th} \right) \tag{5}$$

where $\Delta V_{th} = V_{th2} - V_{th1}$ is due to the body effect. When the bulk of each NMOS is connected to its source, $V_{th}$ of all core devices is nominally equal to $V_{th0} = V_{th}|_{V_{bs}=0}$, and $\Delta V_{th} = 0$. When the bulk is connected to ground, $M_7$ and $M_2$ have the same threshold voltage $V_{th0}$, whereas the raised source potential of $M_1$ increases $V_{th1}$. As $\Delta V_{th} < 0$ in this case, $V_{R2}$ is lower than for $V_{bs} = 0$, as shown in Fig. 10(b). Due to the lower PTAT voltage, the bias current reduces (since $R_1$ is fixed), and $V_{gs6}$ is also expected to reduce. However, given that $V_{b6} = 0$, $V_{th6}$ increases, which has a stronger effect on $V_{gs6}$ than the reduced bias current, explaining why $V_{gs6}$ is higher than for $V_{bs6} = 0$. As the source voltage of $M_1$ and $M_6$ is a PTAT voltage, the circuit with $V_b = 0$ ($\varphi_2$) converges to the configuration with $V_{bs} = 0$ ($\varphi_1$) when the temperature approaches absolute zero. As a result, both $V_{gs6}$ and $V_{R2}$ converge at low temperatures in this case, which is indeed observed in Fig. 10(b). From the difference in $V_{R2}$ between the two bulk configurations, $\Delta V_{th} = \Delta V_{R2}/(m R_2/R_1)$ can be computed to be −13 and −2.0 mV at 300 and 4.2 K, respectively, corresponding to a body-effect coefficient of 0.17 V/V and 0.15 V/V. Moreover, the behavior of $V_{R2}(\varphi_1) - V_{R2}(\varphi_2)$ shows that $\Delta V_{th}$ is essentially a PTAT voltage as well, implying that the body effect can be mitigated by applying a PTAT trim. By trimming $R_2$, $V_{ref}$ for $V_b = V_s$ and $V_b = 0$ can be made equal to within 0.6 mV, so that the body effect is not the limiting factor for the TC. It is, therefore, not required to use a deep n-well process to achieve a lower TC.
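The extraction of the threshold shift and body-effect coefficient can be expressed as a short Python sketch. The −13 mV shift and the 0.17 V/V coefficient at 300 K are the values reported above; the value of $m R_2/R_1$, the measured difference in $V_{R2}$, the source-bulk voltage of $M_1$, and the definition of the coefficient as $|\Delta V_{th}|/V_{sb}$ are assumptions made for illustration.

```python
def delta_vth_from_vr2(delta_v_r2, m, r2_over_r1):
    """Threshold shift inferred from the change in V_R2 between two bulk configurations."""
    return delta_v_r2 / (m * r2_over_r1)


def body_effect_coefficient(delta_vth, v_sb):
    """Body-effect coefficient in V/V, assumed here to be |delta V_th| / V_sb."""
    return abs(delta_vth) / v_sb


# Hypothetical inputs: m*R2/R1 = 3.5 and a 45.5 mV drop in V_R2 with the bulk grounded.
d_vth = delta_vth_from_vr2(-45.5e-3, m=5, r2_over_r1=0.7)

# Assumption: the source-bulk voltage of M1 equals V_R1, taken as 76.5 mV at 300 K.
coefficient = body_effect_coefficient(d_vth, v_sb=76.5e-3)
print(f"delta V_th = {d_vth * 1e3:.1f} mV, body-effect coefficient = {coefficient:.2f} V/V")
```

With these placeholder inputs the sketch reproduces the reported order of magnitude, which suggests that the quoted coefficient relates the threshold shift to a source-bulk voltage of several tens of millivolts.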
When the gate is connected to the bulk ($\phi_3$, $V_b = V_g$), we form an N-DTMOS device. As the bulk-source voltage $V_{bs1} < V_{bs2}$, also $V_{th1} > V_{th2}$ and thus $\Delta V_{th} < 0$, implying that $V_{R2}$ in $\phi_3$ is lower than in $\phi_1$, where $V_{bs} = 0$. Due to both the reduced bias current (since $V_{R1}$ is smaller and $R_1$ is fixed) and the reduced $V_{th}$ of $M_6$, $V_{gs6}$ for $\phi_3$ is therefore smaller than for $V_{bs} = 0$ ($\phi_1$). This reduction in $V_{gs6}$ is mostly induced by the N-DTMOS configuration, which essentially lowers $V_{th}$. In terms of headroom, using the deep n-well to form an N-DTMOS structure is thus beneficial for cryogenic low-voltage designs where headroom is a limiting factor. Note that because the nonlinearity in Fig. 10(a) is consistent over the bulk arrangements, it can be excluded that the systematic nonlinearity below 20 K in $V_{ref}$ is caused by the body effect.

E. Line Regulation and Power Consumption
TABLE I PERFORMANCE COMPARISON

Fig. 12. Measured power consumption from a 1.1-V supply for the proposed architecture core (average of 9/7/7 samples for N/P/DTMOS) versus the simplified architecture (1 sample for N/P/DTMOS). The plot excludes the 2.8 µA drawn by the resistive divider formed by $R_3$ and $R_4$, which varies by less than 5% over temperature.

To assess the effectiveness of the additional feedback loop in the proposed architecture in Fig. 2, the line regulation was measured for both the proposed architecture and the simplified architecture. The line regulation has been computed using a first-order fit of $V_{ref}$ for $V_{dd} \in \{1.05, 1.15\}$ V at 300 K and $V_{dd} \in \{1.0, 1.15\}$ V at 4.2 K. Datapoints for which the reference did not start up were discarded (mostly below 0.95 V). As can be seen in Fig. 11, the proposed architecture achieves better line regulation than the simplified architecture, demonstrating the effectiveness of the additional feedback branch. An important observation is that at 4.2 K, the reference is either on or off, and there is no smooth transition region as there is at 300 K. This effect is caused by the steeper SS at cryogenic temperatures, which makes the transistor behave more like an ideal switch. If insufficient headroom is available, the circuit then turns off completely. Combined with the increased $V_{th}$ at cryogenic temperatures, the references consistently need a higher minimum $V_{dd}$ than at 300 K. Two instances even exhibit negative line regulation, which is likely caused by the strongly shifting operating point of the circuit (and thus the variation of the loop gain) during the measurements, combined with the very low current levels, cryogenic device effects, and mismatch effects.
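The first-order fit used for the line regulation can be sketched as follows. The example fits a straight line to $V_{ref}$ versus $V_{dd}$ by least squares, discards points where the reference did not start up, and expresses the slope in %/V. Normalizing the slope to the mean $V_{ref}$ is an assumption made here, and the sweep values are hypothetical, not measured data.

```python
from statistics import mean


def line_regulation_pct_per_v(vdd, vref):
    """Least-squares slope of Vref versus Vdd, in %/V.

    Points with vref set to None (reference did not start up) are
    discarded. The slope is normalized to the mean Vref, which is an
    assumption of this sketch.
    """
    points = [(x, y) for x, y in zip(vdd, vref) if y is not None]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_mean, y_mean = mean(xs), mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    slope = sxy / sxx  # V/V
    return 100.0 * slope / y_mean


# Hypothetical supply sweep; None marks a point without startup.
vdd = [0.95, 1.05, 1.075, 1.10, 1.125, 1.15]
vref = [None, 0.5000, 0.5003, 0.5006, 0.5009, 0.5012]

lr = line_regulation_pct_per_v(vdd, vref)
print(f"line regulation: {lr:.2f} %/V")
```

A negative return value corresponds to the negative line regulation observed for two instances, where $V_{ref}$ decreases as the supply increases.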
The measured power consumption is shown in Fig. 12, where the power consumption of the proposed architecture (Fig. 2) is about 1.5× higher than that of the simplified architecture (Fig. 1) due to the additional feedback branch. The microwatt-level power consumption is in line with the typically assumed power budget of roughly 1 mW/qubit for quantum computing applications [3]. The absence of the typically adopted amplifier in the proposed architecture (as mentioned in Section III-A) allows for low power and low noise. However, as the DEM in this architecture is only static, 1/f noise is the dominant noise contribution. A performance comparison with other works is presented in Table I.

V. CONCLUSION
Harsh-environment applications, such as quantum computing, require electronics to operate far below the standard temperature range. A family of voltage references is presented that can reliably operate from 300 down to 4.2 K from a sub-1-V supply. Prototypes fabricated in a commercial 40-nm CMOS process achieve a TC below 547 ppm/K and a $3\sigma$ variation below 3.8% after a single-point trim over 56 samples from 2 batches. The adoption of a feedback-regulated architecture ensures a line regulation below 2.7%/V for sub-1-V operation. After applying DEM techniques, the TC and the spread can be reduced to 111 ppm/K and 1.2%, respectively, mainly limited by systematic nonlinearity below 20 K. When no deep n-well is employed, the body effect manifests itself mainly as a PTAT error and can, therefore, be easily removed with a PTAT trim. Furthermore, the nonlinearity and the core-transistor and current-source mismatch have been experimentally analyzed. Thus, the proposed architectures reliably provide a PVT-robust reference voltage, allowing for use down to extremely low temperatures.

Fig. 1. Simplified schematic of the proposed CMOS voltage references with core devices $M_{1,2,6}$: (a) NMOS as core devices and (b) PMOS as core devices when the bulk of $M_{1,2,6}$ is connected to their source, and, alternatively, DTMOS as core devices when the bulk of $M_{1,2,6}$ is connected to their gate.

Fig. 2. Schematic and sizing of the proposed architecture based on NMOS core transistors, where p = 10 and m = 5. All transistors are low-Vt (LVT) devices, except for $M_R$ and $M_{p1,2}$ (standard Vt, SVT). No stacked devices were needed to obtain the desired transistor channel length. The bulk of $M_{1,2,6,7}$ is connected to ground. The arrows indicate the main feedback loop. Resistors are implemented as unsalicided n-poly resistors. The startup and enable transistors are depicted in gray. A dual architecture was also implemented with PMOS and DTMOS as core devices.

Fig. 5. Die micrographs for both batches. Insets show instances of the proposed architecture in Fig. 2 with NMOS, PMOS, and DTMOS as core device, as well as the architecture in Fig. 3 (NMOS DEM).

Fig. 7. (a), (c), and (e) Measured $V_{ref}$ from the proposed references implemented with either NMOS, PMOS, or DTMOS as core device, without trim, and (b), (d), and (f) after applying a single-point scaling trim in MATLAB at 150 K. The mean and $\pm 3\sigma$ are indicated using the red and blue lines, respectively, where the dashed lines are for batch 1 and the solid lines for batch 2.
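A single-point scaling trim of the kind applied in post-processing can be sketched as below: each measured curve is multiplied by one gain so that all samples coincide at the trim temperature. The data are hypothetical, the choice of the sample mean at 150 K as the trim target is an assumption, and the peak-to-peak spread is used only as a simple stand-in for the $3\sigma$ spread.

```python
def scaling_trim(temps, vref, t_trim, target):
    """Scale one measured curve by a single gain so that Vref(t_trim) equals target."""
    gain = target / vref[temps.index(t_trim)]
    return [gain * v for v in vref]


def relative_spread(values):
    """Peak-to-peak spread relative to the mean (stand-in for the 3-sigma spread)."""
    avg = sum(values) / len(values)
    return (max(values) - min(values)) / avg


# Hypothetical Vref curves (V) for three samples; not measured data.
temps = [4.2, 50.0, 150.0, 250.0, 300.0]
samples = [
    [0.512, 0.508, 0.505, 0.507, 0.510],
    [0.498, 0.493, 0.489, 0.490, 0.492],
    [0.505, 0.501, 0.499, 0.502, 0.506],
]

i_trim = temps.index(150.0)
target = sum(s[i_trim] for s in samples) / len(samples)  # assumed trim target
trimmed = [scaling_trim(temps, s, 150.0, target) for s in samples]

for i, t in enumerate(temps):
    before = relative_spread([s[i] for s in samples])
    after = relative_spread([s[i] for s in trimmed])
    print(f"{t:6.1f} K: spread {100 * before:.2f}% -> {100 * after:.2f}%")
```

The trim removes the spread at 150 K by construction; what remains at the other temperatures reflects differences in curve shape between samples, which a single gain cannot correct.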

Fig. 8. PTAT ($V_{R2}$) and CTAT ($V_{gs6}$) voltages corresponding to the measured $V_{ref}$ in Fig. 7(a), (c), and (e) for the proposed architecture, implemented with either (a) NMOS, (b) PMOS, or (c) DTMOS as core device. The setting of $R_2$ is the same for all curves of the same device flavor.

Fig. 9. Measurements of the architecture in Fig. 3 showing $V_{ref}$ (a) without any compensation, (b) with a single-point scaling trim, (c) with core-transistor and current-source DEM, and (d) with core-transistor and current-source DEM together with a single-point scaling trim. The setting of $R_2$ is the same in all plots.
(a). The same holds when considering the single-point scaling trim as in Figs. 7(b) and 9(b) (218 versus 258 ppm/K, and 4.0% versus 3.8%). Enabling DEM on the current sources and the core transistors reduces the spread by up to 3× [Fig. 9(c)]. $V_{ref}$ is now computed by taking the average of all 32 DEM phases. By applying DEM (without trim) only on the current sources, the variation reduces to 3.4%, and to 4.1% (without trim) if DEM is applied only on the core transistors.

Fig. 10. Output voltage from a typical sample: (a) $V_{ref}$, (b) $V_{gs6}$ and $V_{R2}$, and the differences between (c) $V_{gs6}$ and (d) $V_{R2}$ in the three configurations for the circuit in Fig. 4. The setting of $R_2$ is the same for all curves.

Fig. 11. Measured supply dependence of $V_{ref}$ for two instances (blue curves) of the proposed architecture and one instance (red curves) of the simplified architecture, measured at 300 and 4.2 K.