A Cryo-CMOS SAR ADC With FIA Sampling Driver Enabled by Cryogenic-Aware Back-Biasing

This paper presents a floating inverter amplifier (FIA) that performs high-linearity amplification and sampling while driving a $2\times$ time-interleaved (TI) SAR ADC, operating from room temperature (RT) down to 4.2 K. The power-efficient FIA samples the continuous-time input signal by windowed integration, thus avoiding the traditional sample-and-hold. Cascode switching, a floating supply, and accurate pulse-width timing calibration enable high-speed operation and interleaving. In addition, by exploiting the behavior of CMOS devices at cryogenic temperatures, forward-body-biasing (FBB) is pushed well beyond what is possible at RT to ensure performance down to 4.2 K, and its impact on the performance of cryogenic circuits is analyzed. The resulting ADC, implemented in 40-nm bulk CMOS and including the FIA driver, achieves SNDR = 38.7 dB (38.2 dB), SFDR > 50 dB (> 50 dB), and $\mathrm{FOM}_W$ = 25.4 fJ/conv-step (31.3 fJ/conv-step) with Nyquist-rate input at 1.0 GS/s (0.9 GS/s) at 4.2 K (RT), respectively.


I. INTRODUCTION
Quantum computers promise significant speed advantages for many applications that are excessively demanding for classical computers. To achieve such a speed-up, the number of quantum bits (qubits) used to store quantum information in such machines must scale up by orders of magnitude from the currently available 100s [1]. However, due to the fragile nature of the qubits, the most promising quantum computing platforms must operate at cryogenic temperatures ≤4.2 K [2], [3], [4], posing significant challenges to the realization of large-scale quantum computers. Crucial to achieving this goal is an electronic interface for the quantum processor located close to the cryogenic quantum substrate, or even on the same chip [5], [6], and hence also operating at cryogenic temperatures.
Out of the many candidates, here we target semiconductor spin-based quantum computers due to their inherent compatibility with CMOS fabrication and good scaling properties [5]. For the compact cryogenic readout of spin qubits, a cryogenic wide-band ADC is required to digitize the frequency-multiplexed channels in a reflectometry readout scheme [7], as proposed in [8], [9], and [10]. The power dissipation of such circuitry is strictly constrained by the limited cooling power available in deep-cryogenic environments. Nevertheless, prior works focused only on the power efficiency of the ADC itself, while either neglecting the ADC driver or using traditional power-hungry settling drivers, e.g., in [8], or high-linearity source followers that cannot provide any gain or filtering. This is a substantial shortcoming, as these settling drivers can require a power budget even larger than that of the ADC itself [11].
As an alternative to settling amplifiers, open-loop dynamic amplifiers have been proposed for their high efficiency combined with high linearity [12]. These dynamic amplifiers have been used as sample-and-hold circuits [13], as drivers for ADCs [14], [15], [16], and as interstage amplifiers in pipeline ADCs. For the latter, common-mode control has been eased by adopting floating supplies, forming floating inverter amplifiers (FIA) [17], [18]. A detailed analysis of FIAs can be found in [19]. However, employing dynamic amplifiers at cryogenic temperatures is a daunting task due to the lack of reliable device models and the significant cryogenic increase in threshold voltage $V_{th}$ (0.1/0.18 V for NMOS/PMOS) [20], which prevents biasing power-efficient inverter-based amplifiers in the high-linearity region. Although independently AC-coupling the PMOS and the NMOS could alleviate this, it would limit the usable ADC bandwidth near DC. The increased $V_{th}$ complicates even the adoption of standard techniques, such as pass gates for switching mid-rail voltages [21]. Thus, clock boosting, bootstrapping, or high-voltage supply domains [9], [22] are necessary, deteriorating the power efficiency and increasing the design complexity.
To address those issues, we propose the use of cryogenic-aware forward body-bias (FBB). FBB has been used in FDSOI technology to mitigate the cryogenic increase in threshold voltage by applying a large back-gate biasing voltage (up to −5.8 V for PMOS) [23]. Although the control range for the body voltage in bulk technologies is severely limited by the forward conduction of the bulk-source diode, the modeling and the characterization in [24] suggest that a level of control comparable to FDSOI can also be achieved in bulk CMOS, given the lowered forward-bias diode leakage at cryogenic temperature [25]. In this work, we employ, for the first time, cryogenic-aware FBB in bulk CMOS to control the $V_{th}$ of individual transistors in a wide range of cryogenic analog circuits, thus enabling the first dynamic ADC driver at cryogenic temperatures. The presented driver and ADC combination achieves high linearity with more than 50 dB SFDR and also a competitive $\mathrm{FOM}_W$ = 31.3/25.4 fJ/conv-step with Nyquist-rate input at 0.9/1.0 GS/s at RT/4.2 K. These advances are enabled, in addition to the cryogenic-aware FBB, by the use of cascode switching, the adoption of a floating supply, and the use of accurate pulse-width timing calibration.
The article is organized as follows: after a description of the impact of body-biasing in analog design at cryogenic temperatures (Section II), we describe the amplifier design (Section III), its experimental validation (Section IV), and draw the conclusions in Section V.

II. FORWARD-BODY-BIASING (FBB) IN CRYOGENIC ANALOG CIRCUIT DESIGN
FBB primarily affects the transistor's $V_{th}$. This effect can approximately be described by the body factor $\zeta = |\partial V_{th} / \partial V_{bb}|$, where $V_{bb}$ is the voltage applied via the body contact, as in Fig. 1. In the 40 nm bulk process adopted here, $\zeta$ varies between ≈0.1 and ≈0.35 V/V at 4.2 K when the body-bias is swept from 0 to 1.1 V, with an average of ≈0.25 V/V [24], which is higher than in common FDSOI technologies with, for example, 0.085 V/V in [23]. While the body contact has been used at RT either as a tuning knob for mitigating mismatch [26] or as an additional input [27], the usable range for FBB is much wider at cryogenic temperature thanks to the reduced forward-bias leakage of the bulk-source diode. For 40-nm CMOS, a 5 µm × 0.2 µm P+/N-well diode conducts ≈1 nA when forward biased with the full nominal supply voltage (1.1 V) at 4.2 K [25], more than 5 orders of magnitude less than at RT. For more sensitive applications, the diode leakage can be decreased by applying a lower FBB, since the leakage decreases by ≈10× for a 100 mV decrease in $V_{BB}$, as estimated from Fig. 20. With a full FBB of $V_{BB} - V_S = V_{dd} = 1.1$ V, the threshold voltage can be shifted by >200 mV in the adopted technology. Combined with the available threshold flavors, this offers a wide range of viable threshold values.
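As a rough numerical cross-check of the figures quoted above, the following Python sketch multiplies the average body factor by the full body bias and extrapolates the diode leakage with the stated trend of about one decade per 100 mV. The function names are hypothetical, and treating the body factor as constant and the leakage trend as purely exponential are simplifying assumptions, not results from the measurements.

```python
# Values quoted in the text for the 40-nm bulk process at 4.2 K.
ZETA_AVG = 0.25        # average body factor, V/V
V_DD = 1.1             # full forward body bias, V
I_LEAK_FULL = 1e-9     # leakage of the 5 um x 0.2 um P+/N-well diode at full FBB, A


def threshold_shift(zeta: float, v_fbb: float) -> float:
    """First-order V_th shift in volts, treating the body factor as constant."""
    return zeta * v_fbb


def leakage_at_reduced_fbb(i_full: float, reduction_v: float,
                           volts_per_decade: float = 0.1) -> float:
    """Diode leakage after lowering the FBB by reduction_v volts.

    Assumes the ~10x per 100 mV trend holds over the whole reduction,
    which is an extrapolation rather than a measured result.
    """
    return i_full * 10.0 ** (-reduction_v / volts_per_decade)


if __name__ == "__main__":
    shift_mv = threshold_shift(ZETA_AVG, V_DD) * 1e3
    print(f"Estimated V_th shift at full FBB: {shift_mv:.0f} mV")
    print(f"Estimated leakage with FBB lowered by 200 mV: "
          f"{leakage_at_reduced_fbb(I_LEAK_FULL, 0.2):.1e} A")
```

Under these assumptions the average body factor gives a shift of roughly 275 mV at full FBB, which is consistent with the >200 mV stated in the text.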
In the following subsections, we analyze two examples of circuits enabled by cryogenic-aware FBB and their limitations. For the analysis, we use data measured at RT and 4.2 K from a characterization chip, as no accurate simulation model for the cryogenic behavior of the adopted process was available at design time. Both circuits will be used in the driver design described in Section III.

A. Pass-Gate for Fast Switching of Mid-Rail Voltages
A pass gate (Fig. 2a) can be easily designed to switch mid-rail voltages at room temperature, as shown by the limited spread in the Monte Carlo (MC) simulation of its mid-rail (550 mV) on-resistance (Fig. 2b). Considering the $V_{th}$ increase of 110/180 mV in NMOS/PMOS measured at cryogenic temperatures in triode for 40 nm devices and, for simplicity, no further changes in device behavior, the standard deviation of the mid-rail resistance increases dramatically, by >4×. Assuming no change in the spread of model parameters from RT to cryogenic temperatures may even underestimate the variation, as variability, such as device mismatch [28], typically degrades at cryogenic temperatures. To recover RT performance with traditional methods, the pass gate must either be significantly enlarged to contain the spread or be replaced by a boosted or bootstrapped switch.
Alternatively, applying FBB can bring the $V_{th}$ back to its RT value, or even below, thus reducing the on-resistance as shown in [24]. At mid-rail, the switch also benefits from the generally increased mobility at cryogenic temperatures [29], allowing for smaller sizing than possible at RT. Although the increase in subthreshold leakage associated with a lower threshold may be a concern, this effect is contained by the approximately 3× steeper subthreshold slope at cryogenic temperature reported in [29]. This allows reducing the transistor threshold voltage even below RT values without deteriorating leakage performance.

B. DC-Coupled Linear Inverter Amplifier
The inverter amplifier (see Fig. 3a) is a core building block of many efficient amplifier architectures, thanks to its power efficiency obtained by current reuse and its beneficial scaling with technology. At RT, this amplifier is also moderately linear when biased at mid-rail and used in a differential configuration. This is illustrated in Fig. 4, where we show the inverter transconductance ($g_m = g_{m,N} + g_{m,P}$, with $g_{m,N/P}$ the transconductance of the individual transistors) derived from the measured $I_d$ ($V_d$ = 550 mV) of individual devices. A sizeable linear region can be observed at the mid-rail point in the differential transconductance $g_{m,diff} = (g_{m,N,1}(V_{in}) + g_{m,P,1}(V_{in})) - (g_{m,N,2}(-V_{in}) + g_{m,P,2}(-V_{in}))$ of an inverter-based pseudo-differential pair (Fig. 4b). This breaks down at cryogenic temperatures, where, due to the increased threshold voltage, a significant dip in the $g_m$ is observed, corresponding to a limited linearity. To avoid this dip and recover the linear behavior, the 4.2 K characteristic needs to be shifted by 100/140 mV for NMOS/PMOS, see Fig. 4. After this shift, the transconductance of the differential pair in Fig. 4b) shows linearity similar to that at RT. The mobility increase at lower temperatures does not severely compromise the linearity in the nominal case, as shown in Fig. 4. Although the linearity could degrade over different process corners, RT corner simulations showed the linearity to be robust against process spread. Since, to the best of the authors' knowledge, no data about the process spread at cryogenic temperature has been reported, we assume that the linearity at cryogenic temperature would be comparable to the one predicted by corner simulations at room temperature, as is the case for the particular example shown in Fig. 4.
For implementing this shift, we could use a bias-T as shown in Fig. 3 and apply bias voltages $V_{b,1/2}$, but the amplifier bandwidth would be reduced around DC by the bias-T high-pass characteristic and the signal would suffer attenuation due to the parasitics of the passive network. Alternatively, FBB can shift the transfer characteristics by shifting $V_{th}$ without significantly altering the transistor characteristics. This allows recovering the linearity without introducing any additional components into the signal path or limiting the input bandwidth.

C. Limitations of Cryogenic-Aware FBB
Applying FBB via the bulk contact may be limited by the high substrate resistance at cryogenic temperature, as indicated by typical N-well resistances up to a few GΩ/□ at 4.2 K [29], [30]. If such a large bulk resistance ($R_B$ in Fig. 1) were effectively present, the applied bias $V_{bb}$ would only set the DC operating point, around which capacitively coupled excitations could alter the bulk potential, causing unexpected effects. For instance, the capacitive coupling via the drain-bulk diode ($D_D$) could lower the output resistance due to modulation of the bulk potential. With the bulk terminal floating, the size of this effect is about 8% in RT simulation. The influence of the gate in this context is largely reduced due to shielding by the channel. Fortunately, field-dependent ionization might significantly reduce the effective resistance as soon as potential differences in the order of mV build up over the bulk resistance [31], which is in line with the steep drop in substrate resistance with increasing bulk current shown in [29]. To mitigate the effects of the unknown substrate resistance, substrate contacts can be placed near the active devices to ensure field-dependent ionization in case of potential differences. We have chosen a contact distance in the order of 1 µm in this design, maximizing the field strength while still allowing for a dense layout.
The application of body-bias is restricted by the available process. It is applicable to planar bulk technologies with a triple-well option, as well as to FDSOI technologies. Effective FBB is precluded in FinFET technologies, as these generally have a very low body factor and are therefore ill-suited for adopting body-bias [32].
If circuits employing FBB must operate both at RT and cryogenic temperatures, measures must be taken to ensure correct operation, especially when using high FBB values. For instance, to avoid excessive diode leakage at RT, the body potential must be switched depending on the operating temperature, or DACs adjusting the body-bias are required. This is not an issue for the target application in quantum-computer interfaces, which always operate at cryogenic temperatures.
If a triple-well layout is used for minimizing leakage paths when employing FBB, additional area might be necessary due to the design rules of such processes, see, e.g., the layout in Fig. 13c). In particular, the spacing between a deep-N-well (DNW) and an N-well (NW) of different potential typically carries a significant distance requirement. The additional area may also cause increased parasitic capacitance due to the necessary routing between the now spaced transistors, which may be critical for parasitic-sensitive scenarios like the input of a latching comparator. To avoid this space constraint, the PMOS can be placed in the DNW surrounding the NMOS P-well (PW). While reducing the required extra area to a minimum, this leads to some additional leakage via the P-well/N-well diode if the PMOS transistors inside the DNW are also using FBB. Additionally, this implies using the same body-bias for all PMOS transistors sharing the DNW.

III. ARCHITECTURE AND CIRCUIT DESIGN
The acquisition front-end in Fig. 5 comprises the ADC core with its two time-interleaved slices A, B driven by the FIA. All body biases used in the amplifier are static and generated by the on-chip DAC. The FIA and ADC are clocked by the timing generator, which synthesizes all necessary timing signals from a single full-rate clock signal. The front-end operates in three phases on each slice in alternation, see Fig. 6. First, during $T_R$, the slice is reset and its input settled to $V_{CM}$. Second, during $T_S$, the differential input signal $V_{in}$ is amplified via windowed integration on the top-plate of the ADC sampling capacitor, $V_{A/B,+/-}$. Finally, during $T_{conv}$, the amplified signal is converted by the slice to the output word. The slices are 7-bit SAR ADCs that are loop-unrolled for speed and equipped with foreground calibration for the comparator offset. The slices' design is identical to [22] except for changes in the timing circuitry necessary to integrate the amplifier and a slightly increased capacitive DAC (CDAC) to retain the input voltage range of 600 mV$_{pp,d}$ after adding the amplifier parasitics. In an optional bypass mode included to verify the ADC stand-alone performance, the FIA is disabled and the input is directly sampled on the DAC top-plates by clock-boosted sampling transistors (W = 1.5 µm to minimize feed-through), similar to [22].
Our target front-end specification required >50 dB SFDR and >38 dB SNDR when operating at a conversion rate of ≥1 GS/s [9]. As the ADC slices described in [22] meet these specifications, the following sections focus on the driver design. Although the target application requires only cryogenic operation, the chip was also designed for RT operation to allow RT characterization, thus easing the chip testing, and to showcase the state-of-the-art RT performance of the proposed architecture, which can also be employed in other non-cryogenic applications.
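To relate the SNDR target to resolution and to the Walden figure of merit quoted in this paper, the following Python sketch applies the standard ENOB and $\mathrm{FOM}_W$ definitions. The function names are hypothetical, and the power value in the usage example is a placeholder chosen for illustration, not a measured number from this work.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR, using the standard (SNDR - 1.76) / 6.02 rule."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, sndr_db: float, fs_hz: float) -> float:
    """Walden figure of merit in J/conversion-step: P / (2^ENOB * fs)."""
    return power_w / (2.0 ** enob_from_sndr(sndr_db) * fs_hz)


if __name__ == "__main__":
    # The 38 dB SNDR target corresponds to about 6 effective bits,
    # which fits the 7-bit SAR slices described in the text.
    print(f"ENOB at 38 dB SNDR: {enob_from_sndr(38.0):.2f} bits")

    # Placeholder power of 1 mW, used only to show how the metric is evaluated.
    fom = walden_fom(power_w=1e-3, sndr_db=38.0, fs_hz=1e9)
    print(f"FOM_W for the placeholder power: {fom * 1e15:.1f} fJ/conv-step")
```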

A. Core FIA
TABLE I: FIA CORE SIZING

The core differential amplifier (see Fig. 7 for the schematic and Table I for the device sizes) uses the same set of amplifying inverters ($M_1$-$M_4$) for driving both ADC slices. Instantiating a separate amplifier for each slice would not result in a direct power penalty, due to the fully dynamic operation, but would require an extended amount of inter-slice calibration. The inverters are designed to deliver an output current signal for windowed integration, rather than settling to a voltage, for the associated benefits in power efficiency [12], [16]. Therefore, $M_1$-$M_4$ are chosen with a length of 100 nm to increase the intrinsic gain of the amplifying transistors, approximating an integrating behavior.
Interleaving of the shared inverters is implemented by a separate set of cascodes ($M_{5/6,A/B}$-$M_{7/8,A/B}$) and pass-gate reset switches ($SW_{+/-,A/B}$) for each of the two slices (A, B) [15]. First, during $T_R$ (see Fig. 6), $SW_{+/-,A/B}$, controlled by $R_{A/B}$, reset the output of the amplifier to $V_{CM}$. In case of a metastability event causing the previous ADC slice conversion time ($T_{conv}$) to extend up to $T_R$, the data-out bits are latched in their incomplete state and the CDAC undergoes a forced reset to avoid propagating the error to the following conversion. Then, during $T_S$, the cascodes connecting to the target slice are turned on using the $S_{A/B}$ signal and the input signal is integrated on the cap-DAC top-plate. This windowed-integration operation during $T_S$ dictates the circuit transfer function, whose magnitude can be approximated as [16]:
$$|H(f)| \approx \frac{g_m T_S}{C_{DAC}} \left| \frac{\sin(\pi f T_S)}{\pi f T_S} \right|$$
where $g_m$ is the differential-inverter transconductance, and $C_{DAC}$ is the load capacitance. In addition to the limited intrinsic gain of the devices, deviations from this sinc shape are caused by the stray capacitance at the drain of the input transistors [16]. Both the cascodes and the reset switches contribute charge to the output node due to charge injection and clock feed-through. This charge is predominantly an input-signal-independent common-mode contribution, with a minor differential contribution creating a slight increase in offset. The duration of $R_{A/B}$ and $S_{A/B}$ can be configured in the timing generator, see Section III-B. $S_{A/B}$ is shorter than 400 ps, leading to an output attenuation below 7% for a 0.5 GHz input compared to the DC gain, which is acceptable in the scope of our application. At the end of $T_S$, the slice conversion $T_{conv}$ and the supply reset supply-R are triggered. During supply-R, the amplifier's floating-supply capacitor $C_{supply}$ is reset via $M_9$/$M_{10}$ to ground/$V_{dd}$, respectively. The process continues at the next clock edge with a reset on the other slice.
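The attenuation figure quoted above can be checked with a short Python sketch of an ideal sinc-shaped windowed-integration response. It assumes an ideal integrator (infinite output impedance, no stray capacitance), so it only captures the sinc roll-off and none of the deviations mentioned in the text; the function names are hypothetical.

```python
import math


def windowed_integration_gain(f_hz: float, t_s: float,
                              gm: float = 1.0, c_load: float = 1.0) -> float:
    """Magnitude of an ideal integrate-over-window response.

    Models |H(f)| = (gm * Ts / C) * |sin(pi f Ts) / (pi f Ts)|.
    The default gm and c_load of 1.0 are normalisation placeholders.
    """
    dc_gain = gm * t_s / c_load
    x = math.pi * f_hz * t_s
    if x == 0.0:
        return dc_gain
    return dc_gain * abs(math.sin(x) / x)


def attenuation_vs_dc(f_hz: float, t_s: float) -> float:
    """Fractional gain loss at f_hz relative to DC for a window of length t_s."""
    return 1.0 - windowed_integration_gain(f_hz, t_s) / windowed_integration_gain(0.0, t_s)


if __name__ == "__main__":
    # Longest window in the text (400 ps) with a 0.5 GHz input.
    loss = attenuation_vs_dc(0.5e9, 400e-12)
    print(f"Attenuation at 0.5 GHz for a 400 ps window: {loss * 100:.1f} %")
```

For the 400 ps upper bound the ideal model yields a loss of about 6.5%, in line with the "below 7%" stated in the text; shorter windows give less attenuation.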
The choice of a floating supply allows for the stable definition of the output common mode without using a power-hungry full-rate common-mode feedback circuit [17], [18]. Since $C_{supply}$ is disconnected from the ground/$V_{dd}$ supply during $T_S$, it acts as a floating battery-like supply. As the current is now sourced from this floating supply, the amplifier has (ideally) no common-mode drive capability, and can therefore not alter the output common mode that was reset to $V_{CM}$ during $T_R$. Both $V_{CM}$ and $V_{in}$ are nominally set to 550 mV, with the amplifier gain showing only minor variations within a ±25 mV common-mode range in RT simulations. In practice, the amplifier is not fully floating due to the parasitic capacitance of $C_{supply}$ and the core transistors towards the AC ground. The amplifier's common-mode specifications are especially important for the loop-unrolled ADC driven here, as the architecture has poor common-mode rejection caused by the common-mode dependence of the comparator offset [22]. The floating supply also reduces the common-mode gain to 0.5 in RT simulation for a small power overhead, while it would equal the differential gain without any common-mode control. With a full-scale differential output signal, the amplifier produces a 4 mV common-mode signal in extracted RT simulations, resulting in negligible comparator offset variation. This common-mode signal is caused by second-order distortion in the signal inputs, which is canceled in the differential signal domain. $C_{supply}$ is designed to be large (1.3 pF) compared to the load capacitance (113 fF), largely avoiding the degenerative effect of the floating supply to enable a larger gain and sustained bandwidth during amplification. We did not target the narrow high-linearity condition outlined in [17] in favor of robustness, as the achieved linearity is sufficient for the application. The amplifier shows robust linearity performance over corners and temperature within the validity of the RT device models. This robustness, in combination with the analysis in Section II-B showing how a linearity comparable to RT can be reached at cryogenic temperatures by means of a threshold shift, was used to extrapolate the cryogenic linearity behavior after application of FBB. A detailed analysis of the linearity of capacitively degenerated inverter amplifiers can be found in [17].
The cascode-sampling scheme used here replaces an otherwise needed sampling switch at the output, while also providing a small boosting of the inverter output impedance. The limitation in boosting is caused by the cascodes' operation close to triode due to the full-swing $S_{A/B}$ control signals. As the cascodes are not shared between slices A and B, mismatch in them causes differences in impedance boosting. This in turn adds a small gain error that can be calibrated by the timing generator, see below. A downside of implementing interleaving with the cascodes is the introduced inter-slice feed-through via $C_{DS}$ during $S_{A/B}$ onto the top-plate of slice B/A. This feed-through happens during the sensitive conversion phase $T_{conv,B/A}$. To address this, different strategies can be employed. To cancel the feed-through, an additional pair of cross-coupled always-off transistors could be employed, as done for the switches in [33], but at the cost of significant additional capacitive load and layout complexity. In [34], the coupling capacitance was minimized by spacing the source and drain contacts apart. Here, we pursue a third approach for isolation, by increasing the diffusion-contact-to-gate distance of the cascode transistors to allow for metal shielding above the gate, see Fig. 8. RT simulations show negligible feed-through due to the residual coupling through $C_{DS}$. In addition to implementing the interleaving, the cascodes also turn the amplifier off during supply-R, as they are open outside $S_{A/B}$. This removes the need for the additional switches at the source of the input transistors $M_1$-$M_4$ used in [17] and [18], which can cause additional source degeneration.
The design uses back-biasing for all core transistors to enhance operation at cryogenic temperatures. Most importantly, the input transistors $M_1$-$M_4$ need to be back-biased at cryogenic temperatures if using DC coupling, as discussed in Section II-B. By biasing the body of the input transistors separately ($V_{bb,n+/-}$ and $V_{bb,p+/-}$), we also allow for input offset cancellation. We target an offset of 1 LSB to avoid significant SNDR degradation, which dictates the body-bias DAC resolution. For an expected gain of 7, an LSB of ≈5 mV at the ADC input, a body-bias factor $\zeta$ = 0.25, and a total DAC range of 1.1 V, we require approximately 8b resolution to cover the expected mismatch range when applying the body-bias to one of the four input transistors $M_1$-$M_4$. To get a reliable pass-gate operation, the complementary transistors in $SW_{+/-,A/B}$ need to be back-biased, as discussed in Section II-A. Finally, back-bias can also be applied to the cascode transistors $M_{5,A/B}$-$M_{8,A/B}$ for additional swing, avoiding the cascode transistors driving the input pairs towards triode. According to RT simulation, we would be able to adjust for the expected increase in $V_{th}$ at cryogenic temperatures and recover the target driver linearity of >50 dB. In addition to enabling cryogenic operation, the adjustable body-bias also allows for compensation of the process spread affecting open-loop amplifiers, as the spread in the threshold can now be compensated in the field.
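The resolution estimate above can be reproduced approximately with the Python sketch below. It refers the 1-LSB offset target to the amplifier input, converts it into a body-bias step through the body factor, and compares that step with the DAC range. The `weight` parameter, which models how strongly the threshold shift of a single transistor moves the input-referred offset of the pseudo-differential inverter pair, is an assumption introduced here for illustration and is not specified in the text; the function name is likewise hypothetical.

```python
import math


def required_dac_bits(dac_range_v: float, zeta: float, gain: float,
                      lsb_v: float, weight: float = 1.0) -> float:
    """Estimate the body-bias DAC resolution (in bits, not rounded).

    weight is an assumed factor (0..1) describing how much of one transistor's
    V_th shift appears as input-referred offset; 1.0 means a one-to-one mapping.
    """
    offset_step_target = lsb_v / gain                     # input-referred offset step, V
    vbb_step = offset_step_target / (zeta * weight)       # body-bias step giving that offset, V
    return math.log2(dac_range_v / vbb_step)


if __name__ == "__main__":
    for w in (1.0, 0.5):
        bits = required_dac_bits(dac_range_v=1.1, zeta=0.25, gain=7.0,
                                 lsb_v=5e-3, weight=w)
        print(f"weight = {w}: about {bits:.1f} bits")
```

Both assumed weights land between 7.5 and 8.6 bits, which is in the neighbourhood of the approximately 8b quoted in the text; the exact value depends on how the single-transistor threshold shift maps to offset.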
As discussed in Section II, FBB can cause leakage by forward-biasing the device diodes. To identify possible sources of leakage, we show a sketch of the amplifier's well layout in Fig. 9, which does not differ from the usual layout in a triple-well process. The problematic diodes in this context are formed by the source/drain diffusions of the transistors (labeled $D_{S/D}$ in Fig. 9). The well-to-well diodes ($D_{DNW,1}$, $D_{DNW,2}$, and $D_{NW}$) are never forward biased for FBB within the supply rails. Among the $D_{S/D}$ diodes, the worst case for leakage is found at the source of the cascodes $M_{5,A/B}$-$M_{8,A/B}$ when a full nominal supply is applied as FBB. During reset, $M_1$-$M_4$ are in triode and the supply is reset to the nominal ground/$V_{dd}$ rails. Hence, the forward voltage for the source-bulk diodes of the cascodes is a full $V_{dd}$. This leads to an estimated leakage of 58 nA from the PMOS cascode onto a node in reset (see the discussion of Fig. 20), causing only additional power dissipation at a negligible magnitude in the context of our application. All other diodes carry less FBB, specifically the ones connecting to the ADC top-plate, and are therefore not expected to contribute measurable effects.
The static body-bias DAC uses a simple resistive ladder between ground and $V_{dd}$, which is tapped by a set of switches addressed by binary decoders, see Fig. 10. For compactness, the DAC uses the surrounding DNW to contain all PMOS circuitry, as explained in Section II-C. As the DAC is fully passive, decoupling is added at the output to isolate the resistive ladder from kickback. The small-size pass-gates implementing these switches must be operational for switching mid-rail voltages at cryogenic temperatures. To ensure that, in this prototype chip the switches themselves are also back-biased by externally supplied voltages $V_{bb,n/p,ext}$. In a future iteration, these voltages can be generated with a low-resolution, low-accuracy DAC, as the body-bias values necessary for guaranteeing full functionality (around 0.4 V for NMOS, 0.7 V for PMOS) are easily switchable by switches without body-bias and low precision is acceptable for these biases. The DAC also allows for using the external voltages $V_{bb,n/p,ext}$ instead of the resistive ladder, as well as read-back of the control voltages to detect abnormalities via $V_{debug}$, connected to a pad.

B. Timing Generation
The timing-generation block, see Fig. 5, produces all pulses shown in Fig. 6 from the full-rate input clock. The output of the pseudo-differential clock receiver is divided and aligned on the negative clock edge, while the primary pulse is initiated at each positive edge. The entire timing calibration block, except for the clock divider, is implemented with open-loop delays and combinational logic. This saves power compared to using the high-frequency clock required to produce all the phases and fine-grain adjustments necessary here. A DLL-based alternative would improve the robustness but at the cost of increased power consumption and design complexity. Care was therefore taken to make the delay-based logic robust to PVT variations by only using relative delays and carefully matching driving capabilities of parallel paths, thus achieving reliable operation from RT to 4.2 K.
The primary pulse generator (Fig. 11) is shared between both slices to avoid the additional calibration necessary to generate the control pulses via separate blocks. The produced pulses are multiplexed in the timing generator, see Fig. 5. Both $T_R$ and $T_S$ are ideally kept short to allow more conversion time for the ADC, and are adjustable from 120 ps to 400 ps. To generate this range, three functions are used: a full delay step ($\Delta T$) defined by the combined delays of two inverters and a pass-gate, a half step ($\Delta T_{half}$) corresponding to two inverter delays, and a 5b binary-weighted capacitor array for fine steps. For applications requiring the FIA gain to be robust against extended PVT variations, circuit techniques as proposed in [35] can be employed. While the main effect of adjusting $T_S$ is varying the amplifier gain, the duration of $T_S$ also affects the inherent filtering introduced by the windowed integration [16]. As the windowed integration corresponds to a sinc response, this could allow, for example, adjustment of the notch to reject a spurious out-of-band tone like mixer LO feed-through.
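The notch placement mentioned above can be illustrated with a short Python sketch. For an ideal sinc response the nulls fall at integer multiples of $1/T_S$, so the sketch lists which integration windows inside the 120 ps to 400 ps adjustment range would place a null on a given spur. The 5 GHz spur frequency and the function names are hypothetical and chosen only for illustration; the ideal sinc model ignores the deviations discussed in Section III-A.

```python
def first_notch_hz(t_s: float) -> float:
    """Frequency of the first null of an ideal sinc response with window t_s."""
    return 1.0 / t_s


def windows_for_notch(f_spur_hz: float, t_min: float = 120e-12,
                      t_max: float = 400e-12) -> list:
    """Integration windows T_S = k / f_spur (k = 1, 2, ...) within [t_min, t_max]."""
    tol = 1e-15  # small tolerance against floating-point rounding at the range edges
    windows = []
    k = 1
    while k / f_spur_hz <= t_max + tol:
        t_s = k / f_spur_hz
        if t_s >= t_min - tol:
            windows.append(t_s)
        k += 1
    return windows


if __name__ == "__main__":
    print(f"First notch for 400 ps window: {first_notch_hz(400e-12) / 1e9:.2f} GHz")
    print(f"First notch for 120 ps window: {first_notch_hz(120e-12) / 1e9:.2f} GHz")
    # Hypothetical 5 GHz LO feed-through tone.
    for t_s in windows_for_notch(5e9):
        print(f"T_S = {t_s * 1e12:.0f} ps places a null at 5 GHz")
```

Because changing $T_S$ also changes the gain, any such notch tuning would trade off against the gain setting, as noted in the text.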
While most of the timing blocks are shared between the slices, the non-shared sections cause inter-slice mismatch, most notably the relative pulse-timing mismatch $T_{S,timing}$ and the gain mismatch caused by the pulse-width mismatch $T_{S,PW}$.
$T_{S,timing}$ is calibrated by a capacitor array with a 3b binary and 2b unary control, see Fig. 12a), which allows for delay adjustments for each slice of up to 40 ps in ≈1 ps steps. The $T_{S,PW}$ calibration allows for gain calibration by adjusting unary-coded inverter weights, see Fig. 12b), allowing calibration of up to ±20 ps of mismatch per slice. The described calibration circuits are sufficient to reduce the interleaving spurs below 60 dBc in RT simulations. The total jitter contributed by the timing generation is about 0.5 ps/0.6 ps for the rising/falling edge of $S_{A/B}$ in extracted simulation, and is therefore not limiting the amplifier's SNR [16]. Typical quantum computing systems require spectral purity significantly beyond this level [36], resulting in no additional system constraints due to the amplifier.
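As a first-order plausibility check of the jitter statement, the Python sketch below evaluates the textbook aperture-jitter SNR bound for a sinusoidal input, $\mathrm{SNR} = -20\log_{10}(2\pi f_{in}\sigma_t)$. Applying this single-edge sampling formula to a windowed integrator is an assumption made here for illustration; the dedicated analysis is in [16]. The function name is hypothetical.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, sigma_t_s: float) -> float:
    """Textbook jitter-limited SNR in dB for a full-scale sine at f_in_hz."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_t_s)


if __name__ == "__main__":
    f_nyquist = 0.5e9  # Nyquist-rate input for a 1 GS/s converter
    for sigma in (0.5e-12, 0.6e-12):
        snr = jitter_limited_snr_db(f_nyquist, sigma)
        print(f"sigma_t = {sigma * 1e12:.1f} ps -> SNR bound of {snr:.1f} dB")
```

Under this simplified model both edge jitter values give a bound in the mid-50 dB range, well above the >38 dB SNDR target, which is consistent with the jitter not being the limiting factor.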

IV. MEASUREMENT RESULTS
A micrograph of the test chip, implemented in a 40 nm LP bulk technology, is shown in Fig. 13a), with more details of the analog core in b). In the amplifier layout (Fig. 13c), we minimized the distance of the active devices to the body contacts, keeping it below 1 µm for most of the circuit. The triple-well layout shown in Fig. 9 consumes more area than minimally required (approximately 4×), but such an increase is insignificant compared to the size of the floating capacitor $C_{supply}$ or the ADC slice.
The chip was tested in a dip-stick setup with chip-on-board assembly, similar to the test setup in [9]. We concentrate on testing the ADC with the driver, as the stand-alone ADC achieves performance similar to [22] thanks to the minor changes in slice design. Both input and clock signals are provided by a single signal generator (SMA100B) and converted to differential signals by on-board baluns (BAL-3SMG). The conversion result is recorded at full rate in the on-chip SRAM and then read back via a low-speed opto-coupled serial link to an RT FPGA for analysis. The timing and ADC slice calibrations are performed in the foreground via loop-back through the RT equipment. The calibration decks differ between RT and cryogenic temperature due to the drastic changes in transistor characteristics. For applications requiring background calibration due to higher expected PVT variations than in the target use case, calibration techniques as proposed in [37] may be applied. All reported measurements have been performed under the following conditions unless otherwise noted: all supplies are kept at the nominal value of 1.1 V, the amplifier input and output common mode are set to 550 mV, and the gain is set to 6.6/8.9 at RT/4.2 K, corresponding to the same timing calibration setting ($T_S$ code in Fig. 11b)). This gain was chosen as a representative value in the mid-range of available gain settings, see Fig. 18. SFDR/SNDR values always exclude the spurs at DC and Nyquist, as these are outside the band of interest for the target application.
Fig. 14. Measured spectrum at 4.2 K a) without FBB, b) with FBB on the input pair, c) with FBB on the input pair and the reset switches, d) with FBB on the input pair, the reset switches and the cascodes. e) Spectrum at RT.
In Fig. 14, we activate the body-biasing for different parts of the circuit in succession to observe their influence on the amplifier performance. With no body-bias applied (Fig. 14 a), the circuit is still operational at 4.2 K but shows numerous spurious tones, with the 3rd harmonic dominating at 35.7 dB due to the input inverters entering weak inversion in the middle of the amplifier input voltage range, as discussed in Section II-B. In Fig. 14 b), turning on the FBB on the input pair (with 539/−669 mV for NMOS/PMOS, similar to the expectation from Section II-B) leaves the 2nd harmonic as the dominant spur. This is attributed to the incomplete reset via SW$_{+/-,A/B}$, which leaves significant differences in starting conditions between the two slices. In Fig. 14 c), additionally applying a full $V_{bb}$ of FBB to the reset switches brings the performance to the design target (SFDR > 50 dB), with the 3rd harmonic dominating. This is expected to be caused by the cascodes compressing the input pair towards the edge of triode for part of the swing. In Fig. 14 d), a full $V_{bb}$ of FBB is also applied to the cascodes, achieving the optimal SFDR performance. Note that in the body-bias calibration, the 8-bit body-bias DAC is necessary only for the input pair's offset cancellation and threshold compensation, as the switches and cascodes are operated at the inverted supply. The spectrum also demonstrates that the timing calibration reduces the gain and timing mismatch spurs sufficiently not to limit the amplifier's performance. Fig. 14 e) shows that the RT performance is similar, but at a slightly lower sampling rate, as discussed next.
In Fig. 15 a), we show the flexibility of the proposed amplifier over a wide range of sampling frequencies. The circuit has a speed advantage when operating at lower temperature, thanks to the speedup of the ADC logic [21]. As required, the SFDR stays above 50 dB at maximum sampling speed over the entire bandwidth (Fig. 15 b), in accordance with RT simulation expectations.
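To help interpret spectra such as those in Fig. 14, the sketch below lists where the low-order harmonics and the interleaving artifacts of an M-way TI ADC are expected to appear after aliasing. It uses the standard relations: harmonics at $h f_{in}$ folded into the first Nyquist zone, gain/timing mismatch images at $k f_s/M \pm f_{in}$, and offset mismatch tones at $k f_s/M$. The 100 MHz input tone and all function names are hypothetical choices for illustration.

```python
def fold(freq, fs):
    """Alias a frequency into the first Nyquist zone [0, fs/2]."""
    f = freq % fs
    return fs - f if f > fs / 2 else f


def spur_map(f_in, fs, n_slices=2, max_harmonic=3):
    """Expected spur locations (same unit as f_in and fs)."""
    spurs = {}
    # Harmonic distortion products, aliased into the first zone.
    for h in range(2, max_harmonic + 1):
        spurs[f"HD{h}"] = fold(h * f_in, fs)
    for k in range(1, n_slices):
        centre = k * fs / n_slices
        # Gain and timing mismatch create images of the input tone.
        images = {fold(centre - f_in, fs), fold(centre + f_in, fs)}
        spurs[f"TI image k={k}"] = sorted(images)
        # Offset mismatch creates a tone independent of the input.
        spurs[f"TI offset k={k}"] = fold(centre, fs)
    return spurs


# Hypothetical example in MHz: 2x interleaving at 1 GS/s.
spurs = spur_map(f_in=100, fs=1000)
for name, freq in spurs.items():
    print(name, freq)
```

For a 2× interleaved converter, the offset tone falls exactly at Nyquist, which is one of the two spurs excluded from the reported SFDR/SNDR values, while the gain/timing image at $f_s/2 - f_{in}$ remains in band.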
In Fig. 16 a), we show the circuit's DNL and INL, which are on the order of half an LSB and likely limited by the comparator calibration accuracy. In Fig. 16 b), a two-tone test, performed with two signal generators (both SMA100B) and a passive combiner, yields a maximum IM3 spur at 57 dB. When exciting the input with a comb of continuous-wave (CW) tones produced by a VSG (SMW200A) and removing one tone, we observe a multi-tone power ratio of 35.8 dB (Fig. 17), indicating sufficient isolation between the channels.
The VGA functionality is demonstrated in Fig. 18 by setting the gain (4.0-7.9/7.7-10.4 at RT/4.2 K) using the pulse-width control ($T_S$) showcased in Fig. 11 b). The deviation from a linear scaling with $T_S$ code is caused by the limited intrinsic gain of the amplifying transistors. The lower bound of the gain range is due to the impossibility of reliably triggering the ADC conversion with a very short $T_S$ pulse. In a future redesign, this could be easily solved by using the wider primary RS pulse, see Fig. 6, to trigger the ADC. At RT, the maximum gain setting is limited by the time $T_S$ reducing the conversion time available to the ADC slice, while this is not a limitation at 4.2 K thanks to the ADC slice being faster. The higher gain at cryogenic temperature can be traced to the increased $g_m$ at cryogenic temperature [29].

Fig. 17. Multi-tone power ratio.

Fig. 18. Measured gain at RT/4.2 K; the x-axis is the same as in Fig. 11 b).

Fig. 19. a) Measured magnitude response beyond the first Nyquist band, b) SNDR vs $f_{in}$, c) SFDR vs $f_{in}$; 4.2 K at 1 GS/s, RT at 0.9 GS/s.

TABLE II GAIN SETTING OVERVIEW
In Fig. 19, we explore the circuit behavior beyond the first Nyquist zone for various gain settings, while keeping a constant input signal amplitude and calibration. The amplifier's output swing, normalized to the swing at low frequency, approximately follows the expected sinc shape of Eq. 2 and allows for estimation of $T_S$ in the circuit, shown in Table II. The deviations from the ideal sinc shape are caused by the parasitic capacitance at the cascode node [16]. We observe sustained SFDR performance >50 dB in the 3rd Nyquist zone, as SFDR tracks the driver output swing. The SNDR performance drops as the swing reaching the ADC input is reduced: the sinc-shaped transfer characteristic significantly reduces the circuit gain beyond the 2nd Nyquist zone, limiting sub-sampling operation to this zone.
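The $T_S$ estimation mentioned above can be sketched as a one-parameter fit. The example assumes that Eq. 2 takes the textbook form of an ideal integration window, $|\sin(\pi f T_S)/(\pi f T_S)|$, and recovers $T_S$ from normalized magnitude points by a simple grid search. The data are synthetic; the 300 ps window, the frequency grid and the function names are hypothetical, and the measured values remain those of Table II.

```python
import math


def sinc_mag(f, t_s):
    """Magnitude response of an ideal integration window of length t_s
    (assumed form of the windowed-integration transfer function)."""
    x = math.pi * f * t_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def estimate_ts(freqs, mags, ts_min, ts_max, steps=2000):
    """Grid-search the t_s that minimises the squared error to the data."""
    best_ts, best_err = ts_min, float("inf")
    for i in range(steps + 1):
        ts = ts_min + (ts_max - ts_min) * i / steps
        err = sum((sinc_mag(f, ts) - m) ** 2 for f, m in zip(freqs, mags))
        if err < best_err:
            best_ts, best_err = ts, err
    return best_ts


# Synthetic "measurement": hypothetical 300 ps window, tones up to 2 GHz.
true_ts = 300e-12
freqs = [k * 0.25e9 for k in range(1, 9)]
mags = [sinc_mag(f, true_ts) for f in freqs]

ts_hat = estimate_ts(freqs, mags, ts_min=100e-12, ts_max=600e-12)
print(f"estimated T_S = {ts_hat * 1e12:.1f} ps")
```

With real data, the residual of such a fit would also give a rough indication of how strongly the cascode-node parasitics distort the ideal sinc shape.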
To estimate the junction leakage due to FBB, Fig. 20 reports the back-bias leakage of the full body-bias DAC, measured via $V_{bb,n,ext}$/$V_{bb,p,ext}$ in Fig. 10. For the measurement, the biases are swept individually with the other terminal fixed to its nominal supply. The measured leakage is produced by a total transistor width of 4.7 mm each for the NMOS and PMOS devices in the decoders and switches of the body-bias DAC. As mentioned in Section II-C, this includes the DNW-to-PW diode leakage, making the leakage in Fig. 20 an upper bound for a layout avoiding these diodes as in Fig. 9. At RT, we measure significant leakage upon reaching the diode thresholds. At 4.2 K, we observe no leakage above the measurement noise floor of 100 nA for NMOS and up to 10 µA for the PMOS. Normalized to the total transistor width, this corresponds to a leakage below 20 pA/µm (2 nA/µm) for NMOS (PMOS) over the full FBB range, and below the measurement noise floor within the bias ranges used for the DAC in cryogenic measurements, i.e., 0.7/0.4 V for NMOS/PMOS in all measurements shown here. We did not find any signature possibly caused by diode leakage in any measurement, including when observing the DAC output voltages via $V_{debug}$. This demonstrates that, except for extremely leakage-sensitive circuits, cryogenic-aware FBB opens new design options for bulk-CMOS circuits.
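The width normalization can be reproduced with a few lines of Python. The sketch simply divides the quoted current limits by the 4.7 mm total width; since the quoted noise floor is itself approximate (≈100 nA), the result should be read as an order-of-magnitude check (roughly 20 pA/µm and 2 nA/µm), not as a new measurement. The helper name is hypothetical.

```python
def leakage_per_um(current_a, total_width_mm):
    """Leakage current normalised to transistor width, in A/um."""
    return current_a / (total_width_mm * 1e3)


WIDTH_MM = 4.7  # total width per device type, as quoted in the text

nmos = leakage_per_um(100e-9, WIDTH_MM)  # approximate noise floor, NMOS
pmos = leakage_per_um(10e-6, WIDTH_MM)   # maximum observed, PMOS

print(f"NMOS: {nmos * 1e12:.1f} pA/um")
print(f"PMOS: {pmos * 1e9:.2f} nA/um")
```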
Looking at the power breakdown in Fig. 21, the power is approximately equally split between the ADC core and the FIA (including the timing generation for both the FIA and the ADC). Simulations at RT in Fig. 21 c) show that the core ADC power is dominated by the logic, while the FIA core and the timing generation use approximately the same power, each about half of their combined power. Despite the need for timing circuitry for the dynamic amplifier, this performance results in a FOM$_W$ of 31.3 and 25.4 fJ/conv.-step at RT and 4.2 K, respectively. When compared to prior ADCs with similar sample rate and resolution at RT and 4.2 K in Table III, the proposed ADC achieves comparable FOM$_W$ while also including the driving amplifier. Among the ADCs including a driver, we improve the FOM$_W$ by 2× over the state-of-the-art at RT and report the first ADC with a dynamic driver at 4.2 K.
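For reference, the sketch below evaluates the Walden figure of merit, assuming the common definition FOM$_W$ = $P / (2^{ENOB} f_s)$ with ENOB = (SNDR − 1.76)/6.02, and inverts it to back-calculate the total power implied by the SNDR, sampling rate and FOM$_W$ values given in the abstract. The resulting power is an estimate derived under this assumed definition, not a measured value, and the function names are hypothetical.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, fs_hz):
    """Walden figure of merit in J/conv.-step (assumed definition)."""
    return power_w / (2 ** enob(sndr_db) * fs_hz)


def implied_power(fom_j, sndr_db, fs_hz):
    """Invert the Walden FOM to obtain the implied power in W."""
    return fom_j * 2 ** enob(sndr_db) * fs_hz


# Values from the abstract: 4.2 K at 1.0 GS/s, RT at 0.9 GS/s.
p_cryo = implied_power(25.4e-15, sndr_db=38.7, fs_hz=1.0e9)
p_rt = implied_power(31.3e-15, sndr_db=38.2, fs_hz=0.9e9)

print(f"ENOB at 4.2 K: {enob(38.7):.2f} bit, implied power: {p_cryo * 1e3:.2f} mW")
print(f"ENOB at RT:    {enob(38.2):.2f} bit, implied power: {p_rt * 1e3:.2f} mW")
```

Under these assumptions, both operating points correspond to a total power slightly below 2 mW.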

V. CONCLUSION
In this paper, we have presented an FIA driving a time-interleaved SAR ADC at RT and 4.2 K. The driver pioneers the extensive usage of FBB in bulk technologies in cryo-CMOS analog circuit design, thus enabling cryo-CMOS designers to use techniques and topologies that were usually confined to RT applications. The proposed driver uses an effective combination of dynamic amplification, floating supply, cascode sampling and cryogenic-aware FBB to efficiently drive interleaved SAR ADCs. The design also shows the reliable performance of a dynamic amplifier under an extreme temperature variation, irrespective of the drastic changes in all transistor parameters. To the authors' knowledge, this is the first reported dynamic ADC driver operating at cryogenic temperatures. Furthermore, the proposed circuit achieves the best FOM among state-of-the-art RT ADCs with a driver and a comparable FOM among cryogenic and RT ADCs operating at similar sampling speeds and resolution, while also including the driver.

Fig. 1. Sketch of NMOS in DNW with resistances and diodes considered here.

Fig. 2. a) Pass-gate, b) RT MC simulation of pass-gate resistance at $V_{CM}$ = 550 mV, $10^4$ samples, cryogenic behavior only modeled by $V_{th}$ increase with an equivalent series voltage.

Fig. 4. a) Transconductance ($g_m$) of an inverter amplifier and of its individual PMOS and NMOS; the x-axis corresponds to the inverter input voltage as shown in Fig. 3. The $g_m$ is derived from measured $I_d$ of individual NMOS/PMOS with L = 100 nm, W = 1.2/2.4 µm and 6 fingers. b) Transconductance of an inverter-based differential pair.

Fig. 11. a) Primary pulse generator, b) RT simulation of pulse-width control.

Fig. 13. a) Micrograph of the test chip, b) micrograph of core analog blocks, c) layout details of the amplifier core.

Fig. 20. Back-bias leakage for the full DAC. The 4.2 K leakage on $V_{bb,n,ext}$ was below the measurement floor (≈1 × $10^{-7}$ A) and therefore not plotted.

TABLE III COMPARISON
The FIA core power (excluding the timing) degrades the FOM$_W$ by only 6.6/5.5 fJ/conv.-step at RT/4.2 K, thus demonstrating the efficiency of the proposed driver.