A 1-GS/s 6–8-b Cryo-CMOS SAR ADC for Quantum Computing

This article presents a two-times interleaved, loop-unrolled SAR analog-to-digital converter (ADC) operational from 300 down to 4.2 K. The 6–8-bit resolution and the sampling speed of up to 1 GS/s are targeted at digitizing the multi-channel frequency-multiplexed input in a spin-qubit reflectometry readout for quantum computing. To optimize the circuit for the altered device behavior at cryogenic temperatures, a modified common-mode switching scheme and a flexible calibration are adopted. The design is implemented in 40-nm CMOS technology and achieves a 36.2-dB signal-to-noise-and-distortion ratio (SNDR) for a Nyquist input at 4.2 K while maintaining a Walden figure of merit (FOM_W) of 200 fJ/conv-step (for a 10.8-mW power consumption), including the clock receiver, and 15 fJ/conv-step (for a 0.8-mW power consumption) for just the core ADC. With these specifications, the ADC can support the simultaneous readout of 20 qubit channels with a power consumption of 0.5 mW/qubit, thus advancing toward the full integration of the cryogenic readout for future large-scale quantum processors.

such an up-scaling by leveraging the fabrication techniques and facilities of the semiconductor industry [2]. In addition, the classical (i.e., non-quantum) electronic interface needed to control and read out the qubit state must also scale together with the quantum processor. To avoid connecting thousands or even millions of cables from the general-purpose room-temperature (RT) equipment to the cryogenic quantum processor, the interface electronics must be operated close to the qubits, hence at the same cryogenic temperature. Thus, CMOS circuits and systems operating at cryogenic temperature (cryo-CMOS) have been proposed, since CMOS is the only technology offering the VLSI required to interface large-scale quantum processors [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]. Semiconductor-based spin qubits are even compatible with CMOS technology [12], thus offering the potential of co-integration on the same chip of the quantum processor and its electronic interface.
The main disadvantage of moving the electronics to cryogenic temperature is the need for reduced power consumption to comply with the cooling constraints of the refrigerators. Most solid-state qubit platforms need to operate below 100 mK, where typically ≤1 mW of cooling power is available. An advantage of spin qubits in this context is their operation above 1 K [13], [14], which relaxes the power constraint to a few Watts, enabling the use of more extensive circuit architectures. For the above reasons, this article focuses on the design of the electronic readout of semiconductor spin qubits.
To read a spin qubit, a spin-to-charge conversion is first performed. The two commonly employed spin-to-charge conversion methods are Elzerman-type [15] and Pauli-Spin-Blockade (PSB)-type readout [16]. Both result in a spin-state-dependent displacement of an electron, which, in turn, modulates either the capacitance of an electrode coupled to a quantum dot [17], [18] or the resistance of a single electron transistor (SET) [19], as depicted in Fig. 1(b). The change in SET resistance can be detected by a dc readout, in which a transimpedance front end directly acquires the signal current induced by the impedance change under a constant voltage bias [6]. Although the circuitry for such a readout can be extremely compact, it suffers from low-frequency noise, as well as from reduced bandwidth and increased noise due to the parasitic capacitance on the readout line. Furthermore, its scalability is limited by the need for a separate wire for each SET to be read.
Fig. 2. ADC design space [21], showing speed and efficiency of previous RT and cryogenic ADCs.
As an alternative, in an RF readout [3], [7], [8], the quantum-state-dependent impedance Z_Q is matched to the 50-Ω cable impedance by an LC network, and it is probed via a directional coupler; see Fig. 1(a). The reflected signal is amplified by a low-noise amplifier (LNA) and then digitized by an analog-to-digital converter (ADC), either directly or after I/Q mixing. The downsides of such an RF readout, when compared with the dc readout, are a higher component count (matching network and directional coupler) and higher power consumption for various sub-blocks, such as the LNA and the ADC. Those disadvantages can be mitigated and counterbalanced by frequency multiplexing [Fig. 1(c)]: here, not only one but multiple matching networks are placed in parallel, thus allowing for sharing both the RF components and a single cable over multiple qubits, consequently lowering the power consumption per qubit. Furthermore, in contrast to the dc readout, the parasitic cable capacitance does not limit the achievable readout bandwidth, which we target at 1 MHz to avoid creating a throughput bottleneck in the execution of, e.g., a quantum error correction scheme [20]. In the context of RF readout, moving the digitization to cryogenic temperatures is crucial to enable a compact and reliable system: it avoids routing high-frequency sensitive analog signals through the several stages of a cryogenic refrigerator and potentially allows closing the algorithm execution loop completely inside the cryostat; see Fig. 1.
For the digitization of the input signal, a moderate resolution (≤8 bits) is sufficient (see Section II), while the sampling speed must be maximized to allow for more frequency-multiplexed qubit channels, thus lowering the readout power per qubit. Specifically, as the system's cooling power is on the order of several W, we require circuits to consume below a few mW per qubit for a system with thousands of qubits.
Among possible candidates for a cryogenic ADC, superconducting ADCs [22], [23] are inherently cryogenic and offer high speed at a low power consumption but are incompatible with the integration in a cryo-CMOS system on chip (SoC). A cryogenic FPGA-based ADC has demonstrated a sampling rate of 1.2 GS/s and offers great flexibility [24], but its power efficiency is much lower than RT CMOS ADCs, e.g., [25]. Previously reported cryo-CMOS ADCs tested at liquid helium temperature (4 K) [26], [27], [28] showed insufficient sampling speeds and low power efficiency due to the adoption of mature technologies and different application requirements. ADCs tested at liquid nitrogen temperature (77 K) [29], [30], [31] are not suitable for below 1-K spin-qubit applications and reach a maximum speed of only 100 MS/s. The 7.5-bit cryo-CMOS SAR ADC within the 4-K cryo-CMOS SoC in [9] was tested at sampling speeds up to 400 MS/s as part of a spin-qubit readout system, but its power consumption and design details have not been reported.
To cover the unexplored area in the cryo-CMOS ADC design space shown in Fig. 2, this article presents a 4.2-K 1-GS/s ADC with energy efficiency comparable to RT state-of-the-art ADCs and with an expected power consumption of only 0.5 mW per qubit when employed in spin-qubit readout. To achieve this performance, design techniques specifically targeting cryogenic operation and optimization have been employed, including cryogenic-aware comparator optimization and offset-calibration design, an ad hoc capacitive DAC (CDAC) switching scheme, and thick-oxide front end and clocking, coupled with experimental techniques optimized for cryogenic characterization.
This article, an extension of our work in [32], is organized as follows. In Section II, we model the system to derive the specifications for the ADC. The ADC circuit design is described in Section III, including the design of peripheral blocks for testing and supply. The ADC testing with an emphasis on cryogenic testing is reported in Section IV, and the conclusions are drawn in Section V.

II. SYSTEM MODELING AND SPECIFICATIONS
The use case to derive the ADC specification is a receiver for the RF-readout scheme, as shown in Fig. 3. Each of the quantum sensors is placed in a matching network designed to match its nominal impedance, typically 0.1–1 MΩ for an SET, to the 50-Ω cable impedance. A number N_ch of these networks, each tuned to a different frequency, are placed in parallel and excited at their resonance via a directional coupler. For the "zero" state of the quantum-state-dependent impedance Z_Q, we assume a matching condition; for the "one" state, we assume signal reflection, resulting in a change of amplitude and phase of the signal at the LNA output. The modulation depth µ_mod of this amplitude-modulated signal can be significantly affected by the spread of the component values in the matching network (L and C) and the accuracy of the SET tuning. As those parameters vary between different systems and the modulation depth impacts the receiver specifications, we will derive the specifications taking into account a practical range of values for the modulation depth.
As the magnitude of the excitation needs to be low not to disturb the quantum sample, e.g., −110 dBm at the matching network input in [18], the input-referred noise power of the LNA is typically dominating the system noise. Therefore, we assume white noise over the signal bandwidth in the following. After the LNA, a variable gain amplifier (VGA) adjusts the amplitude to occupy the full scale of the following ADC.
As the model concentrates on the ADC, we abstract the signal chain preceding it as an ideal signal source producing N_ch frequency-multiplexed channels equally spaced over the ADC bandwidth. Although both amplitude and phase information are typically used to minimize the demodulation errors [8], the modulation parameters vary greatly between applications. Thus, without loss of generality, we assume purely amplitude-modulated pulse data superimposed on a white-noise background dominated by the LNA. The demodulation after the ADC is performed by a digital coherent receiver.
The main performance specification is the resulting bit error rate (BER), which, for binary amplitude modulation, is a function of the per-channel SNR through the Q function. The SNR for each channel, computed at the output of the digital demodulator following the ADC, is the ratio of the rms signal power to the sum of the quantization noise within the channel bandwidth f_ch and the noise from the signal source. Here, ENOB is the effective number of bits of the ADC, the quantization noise is approximated as white, and SNR_source is the input-signal-limited SNR that would appear with an ideal ADC. As the readout error rate directly translates into a lower bound for the qubit readout infidelity, a BER below 10⁻⁵ is expected not to limit the system's performance in typical error correction schemes [33]. Therefore, we assume an input SNR of 13 dB for the ADC and compare the influence of the ADC ENOB on the attainable BER. The ADC quantization noise should be designed to be negligible in the total receiver noise budget, as a higher ADC resolution is expected to be cheaper in terms of power consumption than a lower LNA noise floor. For a fixed energy per conversion in the ADC, the power consumption per qubit is independent of the ADC sample rate, since the sample rate scales proportionally to the number of qubit channels. However, a higher sample rate fitting more qubit channels is preferred, as it would reduce the number of required receivers and optimize the compactness of the whole quantum computer. A 1-GS/s ADC sampling rate is then chosen, as this is expected to allow for a highly power-efficient ADC in the target technology (40-nm CMOS) [25], [34]. Unless otherwise noted, we assume a per-channel data rate of 1 MHz, a modulation depth of 0.5, a channel count N_ch = 20, and an input SNR of 13 dB.
The channels are assumed to be equally distributed from 100 to 500 MHz, with the minimum frequency being motivated by the size requirements of the matching network. Fig. 4 shows the simulation results for the above model for a range of channel counts and modulation depth settings. Diminishing returns are reached for up to 40 channels for an ENOB above 6 bit, while 80 channels require up to 8 bit. When decreasing the modulation depth, the requirements change significantly: a modulation depth of 0.1 requires a resolution of at least 8 bit for 20 channels.
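The trend described above can be explored with a minimal numerical sketch. It is not the model used for Fig. 4; it rests on several assumptions that are not stated in the text: the BER is taken as Q(√SNR), the ADC full scale (peak amplitude normalized to 1) is shared equally among the channels so that their sum never clips, the modulated signal amplitude per channel is µ_mod/N_ch, and the quantization noise is white over F_s/2. The function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def channel_snr(enob, n_ch=20, mu_mod=0.5, f_ch=1e6, f_s=1e9, snr_source_db=13.0):
    """Per-channel SNR (linear) after demodulation, under the stated assumptions."""
    # Assumption: full scale (peak amplitude 1) split equally over n_ch carriers.
    p_signal = (mu_mod / n_ch) ** 2 / 2.0
    # White quantization noise, LSB^2/12 spread over F_s/2, integrated over f_ch.
    lsb = 2.0 / 2 ** enob
    p_quant = (lsb ** 2 / 12.0) * (f_ch / (f_s / 2.0))
    snr_adc = p_signal / p_quant
    snr_source = 10 ** (snr_source_db / 10.0)
    # Source noise and quantization noise add in power.
    return 1.0 / (1.0 / snr_source + 1.0 / snr_adc)

def ber(enob, **kwargs):
    """Assumed BER expression: Q(sqrt(SNR))."""
    return q_function(math.sqrt(channel_snr(enob, **kwargs)))

for bits in (4, 6, 8):
    print(bits, ber(bits), ber(bits, n_ch=80), ber(bits, mu_mod=0.1))
```

Under these assumptions, the ADC contribution becomes negligible once the quantization-limited SNR is well above the 13-dB source SNR, which happens at a lower ENOB for fewer channels and larger modulation depth.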
Based on these results, the target specifications are 6-8-bit ENOB at 1 GS/s. This would allow 20 channels with a modulation depth above 0.2 without limiting the achievable BER. If budgeting 0.5 mW per qubit, this requires a Walden figure of merit (FOM W ) better than 156 fJ/conv-step.
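The figure-of-merit bound follows directly from the Walden definition, FOM_W = P / (2^ENOB · F_s). The short check below uses the 6-bit end of the ENOB range together with the 20-channel, 0.5-mW-per-qubit budget; the function name is a hypothetical helper.

```python
def walden_fom(power_w, enob_bits, f_s_hz):
    """Walden figure of merit in J/conv-step."""
    return power_w / (2 ** enob_bits * f_s_hz)

# Budget: 0.5 mW per qubit, 20 frequency-multiplexed qubit channels per ADC.
budget_w = 0.5e-3 * 20
fom_limit = walden_fom(budget_w, enob_bits=6, f_s_hz=1e9)
print(f"{fom_limit * 1e15:.0f} fJ/conv-step")
```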

III. ADC ARCHITECTURE AND COMPONENT DESIGN
We have chosen the SAR architecture for the ADC, as it combines two essential features for our application: first, it offers good power efficiency, even for sampling speeds above 100 MS/s [25]; second, it is robust against the expected variations in device behavior at cryogenic temperature because of its predominantly digital and capacitor-based operation. An overview of the proposed SAR ADC is shown in Fig. 5. To minimize the time for non-critical comparator decisions, the ADC core is asynchronous [35] and employs the loop-unrolled technique [34]. In such an architecture, each decision is performed by a different comparator that directly drives the CDAC control. This minimizes the logic delay in the critical loop, as the ready logic and the digital-to-analog converter (DAC) settling now run in parallel, allowing for a higher sampling speed. The final comparator used in 8-bit mode is designed for lower thermal noise, as it carries the largest probability of noise-critical decisions [36]. Two identical cores are time interleaved for a further increase in speed. All comparators are used for 8-bit conversions, while the LSB comparators are disabled for lower resolution settings, effectively ending the conversion earlier.

A. Cryogenic Circuit Design
No cryogenic simulation models were available for the target process at the design time. To achieve a robust design, we had to account for various changes in device behavior without using an exact model.
The most consequential changes when operating at cryogenic temperature occur in the transistor's dc characteristics. Both threshold voltage and mobility increase significantly at cryogenic temperatures [37]. The threshold voltage increases by about 120 mV in the adopted technology, while g_m approximately doubles for the same drain current; see the measurement of minimum-length NMOS and PMOS devices in Fig. 6. We also observe a significant (>2×) increase in g_m/I_d when biasing in deep sub-threshold, but in a much lower bias range, due to the much steeper sub-threshold slope. To approximate this behavior in circuit simulation, we simulated all circuits at −55 °C using the model provided by the foundry. For additional functional verification, we included voltage sources mimicking the threshold voltage increase in series with crucial transistors, e.g., in the sampling front end. In addition, degradation of transistor matching is reported in [38]. While the increase in V_th mismatch is small, the current gain (β) mismatch increases by 20% and requires a slightly widened range in the comparator offset calibration (see Section III-C). A moderate increase of about 10% is expected for the speed of digital circuits [39], [40]. This relatively minor change is due to the counterbalancing effects of the increases in both threshold and mobility. Although the absolute temperature decreases by 70×, the broadband noise of the active devices is expected to reduce by a much smaller factor, as the white noise is believed to be caused by a mix of temperature-independent shot noise and thermal noise [41]. In this design, we conservatively assumed no improvement in noise, as this did not affect the design apart from a slightly enlarged comparator for the final decision.
Apart from the effects on the transistors, the change in temperature also influences the passive components used. Most relevant for this design is the effect on metal-oxide-metal (MOM) capacitors, which have been characterized in [42]. While the absolute value of the capacitance is almost unaffected, the quality factor increases by 5× because of the improvement in metal conductivity and higher substrate resistance, which must be taken into account for the decoupling capacitors. For the implementation of all resistors in this work, we used unsilicided n-type polysilicon due to its stability at cryogenic temperatures. In [43], a change in resistance of ≈5% was measured between cryogenic temperature and RT.

B. Comparator
The strong-arm comparator [44], shown in Fig. 7, offers a good balance between power efficiency and speed for moderate-resolution SAR designs [25]. In Fig. 8, the simulated speed and noise performance of the comparator is shown as a function of the input common mode V_CM at RT and −55 °C. The common mode at the calibration pairs (N3–N6, see Section III-C) was kept constant at 800 mV, while the differential input was 1 mV. With increasing V_CM, the input pair's overdrive voltage V_OD rises, initially resulting in increased speed due to faster activation of the latch (N7, N8, P1, and P2 in Fig. 7). This improvement levels off with higher V_CM, due to the reduction in input pair gain outweighing the improvement in latch activation speed, creating an optimum range. At −55 °C, the comparator is overall slower, with a stronger onset of slowdown at decreased V_CM due to the increased V_th. For the noise, we observe a similar optimum at moderate V_CM. At low V_CM, the constant calibration pair V_CM dominates the noise, while at high V_CM, the amplification of the input pair drops and the latch noise becomes important. At cryogenic temperatures, we expect a further shift of these curves toward higher V_CM, due to the further increase in V_th, but also increased speed due to the higher mobility (see Fig. 6). Due to the uncertainty in the device noise behavior at cryogenic temperatures, we designed the comparator to meet the noise requirements even at RT, thus resulting in a moderate over-design at cryogenic temperature.
After sampling, the ADC common mode is predominantly affected by the DAC switching scheme (see Section III-D) and the comparator kickback after the decision. Every time one of the eight comparators in the loop-unrolled ADC decides, the clock feedthrough via its NMOS input transistors (N1 and N2 in Fig. 7) reduces the common-mode voltage. The decreased V_CM for later decisions will cause an associated slowdown of the decisions. As discussed above, this effect is significantly more pronounced at cryogenic temperatures because of the increased threshold voltage. Such a slowdown could be mitigated by increasing the input common-mode voltage. This, however, comes at the cost of reduced available swing at the ADC input due to headroom limitations in the ADC driver. To allow for a common-mode input voltage around mid-rail and still alleviate the slowdown due to the increased threshold, we adopted a variable common-mode switching scheme detailed in Section III-D.

C. Comparator Calibration
Due to the loop-unrolled nature of the core ADC, individual comparator offsets cause distortion [34], thus requiring calibration of the comparator offsets. As the circuit will be operating in a highly temperature-controlled environment, we adopt a single foreground calibration. The offset calibration is performed via additional coarse (N3, 4, sized at 1/4 of N1, 2) and fine calibration pairs (N5, 6, sized 1/16 of N1, 2) in parallel with the main comparator pair, as shown in Fig. 7. Using two separate pairs relaxes the requirements of the calibration DAC used to generate the controlling voltages V ci+,− and V fi+,− . A resistive-ladder calibration DAC is shared among all comparators and among both fine and coarse calibration pairs; see Fig. 9. The voltages for each calibration pair are tapped from the ladder by a separate set of switches. Sign reversal is possible via an additional switch that swaps the positive and negative voltages. The nominal DAC resolution is 3 and 4 b for the coarse and fine pair, respectively. Missing codes between coarse and fine ranges are prevented by creating a 1-b overlap. To avoid crosstalk between the calibration pairs of different comparators, decoupling capacitors are added after the selection switch. The resistive DAC ladder consumes up to 30 µA of static current, which is negligible in the total ADC power budget. To account for the increased mismatch at cryogenic temperatures [38], the calibration range needs to be widened by 20% compared with RT requirements, primarily to account for the increase in β mismatch. To set the calibration range, 3-b resistive DACs defining the reference voltages V cm+ and V cm− , labeled R DAC,− and R DAC,+ in Fig. 9, are added.
As mentioned above, the comparator common mode varies during the conversion. The comparator offset is a function of the input common mode, as this changes the relative weight of the input pair and latch mismatch. To calibrate the comparators at the correct common-mode voltage, a switch shorting the CDAC top plates (see Fig. 5, labeled Cal) is activated, and then, normal conversions are performed. The differential input voltage generated by the changing DAC codes during the conversion now decays to zero via the shorting switch, while the common-mode voltage varies as described in Section III-D. To ensure sufficient settling of the differential input voltage in between comparator decisions, the shorting switch must have an ON-resistance below 500 Ω to settle the input error to less than 0.25 LSB according to simulations. To reliably achieve this low ON-resistance at cryogenic temperatures, the shorting switch is implemented using thick-oxide (2.5 V) devices, as the resistance of a standard transistor would be heavily affected by the increased threshold voltage at mid-rail voltages. The comparator output is, therefore, determined by the sign of the comparator offset voltage. As the individual decisions will be noisy, many conversions can be averaged, and, based on this sign information, a binary search algorithm can find the optimum calibration setting for all comparators in parallel.
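The sign-based foreground calibration can be illustrated with a small behavioral sketch. It is a simplification rather than the routine used for the measurements: a single comparator is modeled, the coarse and fine calibration pairs are merged into one monotonic code, and the step size, noise level, and number of averaged conversions are arbitrary assumptions. All names are hypothetical.

```python
import random

def majority_sign(residual_mv, noise_rms_mv, n_avg, rng):
    """Average n_avg noisy decisions taken with the input shorted."""
    ones = sum(1 for _ in range(n_avg)
               if residual_mv + rng.gauss(0.0, noise_rms_mv) > 0.0)
    return 1 if 2 * ones >= n_avg else -1

def calibrate(offset_mv, step_mv=1.0, n_codes=64,
              noise_rms_mv=1.0, n_avg=1001, seed=0):
    """Binary search for the calibration code that nulls the offset."""
    rng = random.Random(seed)
    lo, hi = 0, n_codes - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # Residual offset with this code applied (mid-scale code = no correction).
        residual = offset_mv - (mid - n_codes // 2) * step_mv
        if majority_sign(residual, noise_rms_mv, n_avg, rng) > 0:
            lo = mid + 1   # offset still positive: apply more correction
        else:
            hi = mid       # sign flipped: the crossing is at or below mid
    return lo

code = calibrate(offset_mv=5.5)
print(code, 5.5 - (code - 32) * 1.0)
```

Averaging matters near the end of the search, where the residual offset is comparable to the comparator noise and a single decision would often point the search in the wrong direction.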

D. CDAC
The CDAC is implemented using a custom-designed capacitor array. The unit capacitors are 500 aF in size and arranged in a common-centroid layout.
As discussed above, we want to implement a variable common mode in the DAC switching to counteract the comparator slowdown for low common mode at cryogenic temperatures. Variable common modes have been employed before in RT designs [45], [46], [47]. To increase V_CM during conversion, the MSB and MSB-1 bottom plates are reset to ground during tracking, while the rest of the bottom plates are reset to V_ref, as shown in Fig. 10(a). Upon the decision of the MSB comparator after the top-plate sampling, one side of the DAC switches the MSB capacitor accordingly to V_ref. Thereby, the differential input amplitude is decreased in accordance with the binary search algorithm, but also V_CM increases [Fig. 10(b)]. This increase in V_CM allows for a higher conversion speed in the MSB-1 decision. The same procedure is repeated for the MSB-1 capacitor. Afterward, all the remaining decisions are switched down, as the speed advantage deriving from the increased V_CM becomes marginal, and the NMOS used for switching down requires a lower driver fan out because of the smaller necessary transistor. It needs to be noted that this variable V_CM is possible only due to the loop-unrolled architecture of the ADC core, as, otherwise, the variable comparator offset would cause distortion.
Fig. 11. Chip micrograph and layout details.
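The common-mode trajectory produced by such a switching sequence can be sketched with an idealized behavioral model. The model below assumes binary-weighted capacitors with one dummy unit per side, no parasitics, no comparator kickback, and a normalized V_ref; none of these values are taken from the design, and the function name is hypothetical.

```python
def convert(v_p, v_n, v_ref=1.0, weights=(64, 32, 16, 8, 4, 2, 1), n_up=2):
    """Top-plate-sampled SAR conversion: first n_up steps switch up, rest down."""
    c_tot = sum(weights) + 1           # assumed: one dummy unit capacitor per side
    bits = []
    v_cm_trace = [(v_p + v_n) / 2.0]   # common mode right after sampling
    for i, w in enumerate(weights):
        bit = 1 if v_p > v_n else 0
        bits.append(bit)
        step = v_ref * w / c_tot
        if i < n_up:
            # Switch-up: raise the lower side, so V_CM increases.
            if bit:
                v_n += step
            else:
                v_p += step
        else:
            # Switch-down: lower the higher side, so V_CM decreases.
            if bit:
                v_p -= step
            else:
                v_n -= step
        v_cm_trace.append((v_p + v_n) / 2.0)
    bits.append(1 if v_p > v_n else 0)  # final decision, no DAC switching
    return bits, v_cm_trace

bits, v_cm_trace = convert(0.83, 0.27)
print(bits)
print([round(v, 4) for v in v_cm_trace])
```

In this idealized picture, V_CM rises by a quarter and then an eighth of V_ref during the first two decisions and only then starts to fall, so the later decisions begin from a higher common mode than with a purely switch-down sequence.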
Due to the uncertainty in the device models, there was a risk of incomplete DAC settling in the case of higher speedup in the ready logic at cryogenic temperatures than in the settling. To avoid an over-design to mitigate this risk, an optional additional delay has been implemented in the ready logic, but was kept deactivated during measurement. It has been used in calibration mode to allow for additional settling time of the differential signal across the input-shorting switch.

E. Sampling Front End
Both the input and clock signals are terminated differentially on chip with 100 Ω. At the center tap of the input termination, a 1.5-Ω resistor is added to absorb the common-mode kickback from the track-and-hold switches in the ADC front end.
The sampling front end is implemented using thick-oxide NMOS transistors with an additional half-size charge injection cancellation pair. The choice of thick-oxide transistors for sampling rather than the usual pass gate, which would be sufficient at RT for the speed and linearity required here [25], is motivated by the increased threshold voltage at cryogenic temperature. This increase leads to an estimated 100-mV dead zone around mid-rail for pass gates at cryogenic temperatures, which would require exponentially larger transistors for sufficient settling. For design robustness and simplicity, we choose the thick-oxide sampling switch over alternative solutions, such as bootstrapping.

F. Clock Receiver
The sampling switches are controlled by a thick-oxide-based clock receiver. The thick-oxide nature of the front-end switches motivates implementing the clock receiver using thick-oxide devices, as a supply domain crossing would necessitate significant power to ensure low-enough jitter, as well as additional alignment calibration.
After the on-chip clock termination, pseudo-differential self-biased inverters followed by a differential to single-ended converter are used to generate a CMOS-level clock. The single-ended full-rate clock signal is subsequently divided by two, and the resulting 180° phase-shifted clock phases are aligned by cross-coupled inverters; see Fig. 5. The sampling pulses are then shaped to optimize sampling and conversion time. By default, the ADC sampling pulse duration is shaped to be 650 ps for a 1-GHz input clock. To allow for more time for the conversion for the low-resolution settings, the sampling time can be shortened by up to 215 ps in steps of 80 ps by delaying the rising edge of the pulse while leaving the falling (jitter sensitive) edge unaffected.
Due to interleaving, artifacts as analyzed in [48] can occur. The two-times interleaving employed here causes dc and F_s/2 spurs due to offset, which are not detrimental to our application, as well as an F_s/2 − F_in spur due to gain and timing mismatch. The timing mismatch can be calibrated in the clock receiver by a variable capacitive load per slice in the fan out toward the sampling switch. The timing calibration is simulated to have a 1.5-ps step and a maximum range of 45 ps. At cryogenic temperature, both the calibration range and the step are expected to slightly decrease due to the increased carrier mobility [37]. By design, the gain interleaving spur is below the noise floor, so no calibration is included for it.

G. Digital Back End
Due to the 3-m-long cables in the available cryogenic measurement setup, real-time streaming of the ADC output is unfeasible. To evaluate the performance of the proposed ADC at cryogenic temperatures, the data are stored in an on-chip memory and read out slowly in a second step. As the storage medium, we chose an on-chip 290-kb SRAM generated using a compiler supplied by the foundry. The SRAM's specified speed at RT was not high enough to store the samples at full speed, and an additional margin was necessary for a possible speed degradation of the SRAM at cryogenic temperatures, as these are outside the SRAM's specified operational range. For these reasons, we chose to operate the SRAM at F_s/8, with an 8× wider parallel data interface. For the digital implementation flow, an additional safety margin of 50 ps was used in the hold and setup constraints to ensure correct operation at cryogenic temperatures, similar to the precautions taken in [5].

H. Supply Decoupling
The ADC's analog supplies and the digital supply are decoupled by on-chip capacitors. As the chip is wire bonded, these capacitors are prone to oscillation with the bondwire inductance. As this effect is more pronounced at cryogenic temperatures due to the higher quality factor of the MOM capacitors, all supplies are degenerated with a 10-Ω resistor in series with the bondwire. This low-value resistor is also realized with unsilicided n-type polysilicon for its temperature stability. To realize the low value and comply with the current-density requirements, the degeneration resistors are implemented with a width of 200 µm. Furthermore, numerous bondwires are used in parallel for the ground connection to reduce its inductance. The ground connection is not degenerated.

IV. MEASUREMENT RESULTS
The proposed ADC is manufactured in a 40-nm LP process; see micrograph and layout in Fig. 11. The core ADC and the clock receiver occupy an area of 130 µm × 240 µm and 100 µm × 80 µm, respectively. The chip is wire bonded and tested in a dip-stick setup, shown in Fig. 12. In the dip stick, the components to be tested are inserted directly into liquid helium to provide a 4.2-K environmental temperature. The offset-calibration algorithm is implemented off-chip as part of the measurement routine. Nominal supply voltages of 1.1 and 2.5 V are used for all measurements. The data underlying the plots shown here can be found under [49].
The ADC has a differential interface for both input signal and clock. When generating differential signals at RT and delivering them through the dipstick to cryogenic temperature, a differential phase relationship between the signals cannot be accurately ensured. To solve this issue, we tested RT baluns and found the Marki BAL-3SMG works well at cryogenic temperatures; see measurements in Fig. 13. The only limitation is a much increased loss at cryogenic temperatures for signals below 50 MHz. This is acceptable as the target signal and clock frequencies for our application lie above this limit.
Integral nonlinearity (INL) and differential nonlinearity (DNL) measurements are shown in Fig. 14. Here, we observe <0.5 LSB DNL/INL for the 6-bit case in Fig. 14(a) and a significant degradation to <1.5 LSB DNL/INL for the 8-bit case in Fig. 14(b). This issue can be traced to the comparator schematic. The tail node at the drain of N7 in Fig. 7 is not reset with the other nodes. This leads to the input pair acting as a source follower, effectively implementing a "max hold" on the tail node. Depending on the history of the DAC voltages, this tail node gives a different starting condition to the comparator operation, resulting in offset and producing the linearity limitations. To alleviate this, the tail of the comparator can be reset using a pull-up PMOS in a future redesign. Due to this issue, the optimal calibration settings could not be found in a straightforward binary search but needed manual fine-tuning. As predicted, the calibration range needed to be enlarged (by 25%) at 4.2 K with respect to RT via R DAC,+ and R DAC,− .
In Fig. 15, we show a summary of dynamic measurements performed on the core. In Fig. 15(a), the output spectrum at F_s = 1 GHz and T = 4.2 K in 6-bit mode is shown, resulting in a signal-to-noise-and-distortion ratio (SNDR) of 36.2 dB. No spur is visible at the mirror of the input frequency, suggesting a sufficiently accurate timing calibration. As the spurious-free dynamic range (SFDR) is 46.9 dB at 1 GS/s, the residual skew between the two ADC slices after calibration is estimated to be below 3 ps. Therefore, the ADC shows sufficient performance for our target application of frequency-multiplexing 20 qubit channels with a modulation depth of 0.5. For the 8-bit mode in Fig. 15(b), at F_s = 0.5 GHz and T = 4.2 K, the SNDR is limited to 42.7 dB due to the linearity limitation found in the INL/DNL measurement. Still, the higher SNR allows the coverage of use cases requiring higher ENOB at a lower speed.
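The skew estimate can be reproduced with the standard first-order expression for a two-way interleaved converter, in which a timing skew Δt between the slices creates an image spur of relative amplitude π·F_in·Δt. The sketch below assumes that the largest spur is attributed entirely to skew (which gives an upper bound) and that the input tone sits close to the 500-MHz Nyquist frequency; the function name is hypothetical.

```python
import math

def skew_bound_from_sfdr(sfdr_db, f_in_hz):
    """Upper bound on inter-slice skew if the largest spur were due to skew alone.

    First-order model for 2x interleaving: spur/signal = pi * f_in * skew.
    """
    spur_ratio = 10 ** (-sfdr_db / 20.0)
    return spur_ratio / (math.pi * f_in_hz)

# Assumed input frequency: approximately Nyquist for F_s = 1 GS/s.
skew_s = skew_bound_from_sfdr(sfdr_db=46.9, f_in_hz=500e6)
print(f"{skew_s * 1e12:.2f} ps")
```

Because no distinct image spur is visible in the spectrum, the actual residual skew is expected to be smaller than this bound.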
In Fig. 15(c), we sweep the sampling frequency keeping the input frequency close to Nyquist for different bit settings. For the 6- and 7-bit resolution settings, the sampling pulse was shortened in the clock receiver configuration to allow for more time spent in the conversion. Cryogenic and RT performance closely track each other up to moderate frequencies. At high frequency, we observe a speed advantage when operating at cryogenic temperatures. At cryogenic temperature, sampling rates as high as 1.1 GHz are possible but at a slightly reduced SNDR of 35 dB. Also, the flexibility of the ADC is shown: at reduced sampling speed and higher resolution settings, an SNDR around 45 dB can be achieved. The RT results closely follow the expectations derived from simulations. The speed advantage at cryogenic temperatures can be explained by the increased performance of the logic and the comparators. For the comparators, the threshold voltage increase does not hinder the comparator speed because of the CDAC switching scheme, while the mobility increases significantly. This allows for a faster comparator decision. To investigate further, we sweep the input frequency for a series of fixed sampling frequencies in Fig. 15(d). The ADC's performance is largely constant with respect to the input frequency, indicating no drops in performance for a particular input frequency.
We repeated the same analysis for the SFDR in Fig. 15(e) and (f), ignoring the dc and F s spur contributions. At moderate speed, little difference between RT and 4.2 K is observed, while the same improvement as for the SNDR is observed toward high sampling speeds. We can conclude that the front-end linearity is not majorly impacted by cooling the circuit due to the choice of a thick-oxide front-end switch.
In Fig. 16(a), we perform a two-tone measurement showing a −51.4-dB third-order intermodulation spur. The dominating interleaving spurs at Nyquist and dc due to the offset and offset mismatch between the slices are not important in our application. The variation in the amplitude between the two tones is attributed to the imperfect calibration of the long measurement cables.
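To illustrate where third-order intermodulation products appear in such a two-tone test, the short Python sketch below computes the 2·f1 − f2 and 2·f2 − f1 products and folds them into the first Nyquist zone. The tone frequencies in the example are hypothetical placeholders, since the frequencies used in the measurement are not stated here; the function names are likewise illustrative.

```python
def alias_to_first_nyquist(f_hz: float, fs_hz: float) -> float:
    """Fold a frequency into the first Nyquist zone [0, fs/2]."""
    f = abs(f_hz) % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def im3_products(f1_hz: float, f2_hz: float, fs_hz: float) -> tuple:
    """Aliased locations of the third-order products 2*f1 - f2 and 2*f2 - f1."""
    return (
        alias_to_first_nyquist(2 * f1_hz - f2_hz, fs_hz),
        alias_to_first_nyquist(2 * f2_hz - f1_hz, fs_hz),
    )

# Hypothetical tones (NOT the frequencies used in the measurement), fs = 1 GS/s.
low, high = im3_products(400e6, 410e6, 1e9)
print(f"IM3 products at {low / 1e6:.0f} MHz and {high / 1e6:.0f} MHz")
```

For closely spaced tones, both products fall right next to the tones themselves, which is why they cannot be filtered out and are the relevant linearity metric for a frequency-multiplexed input.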
We also tested the multi-tone performance of the ADC in Fig. 16(b) where we removed one peak out of a comb of frequencies generated by a VSG. We observe that no significant tone is generated by intermodulation at the frequency of the missing tone. The other spurious tones are similar in level to the previous test.
Finally, in Fig. 16(c), we plot the FOM_W of the ADC core excluding the clock receiver. For sampling frequencies below 900 MHz, the ADC shows better than 10-fJ/conv-step performance at both cryogenic temperature and RT. At cryogenic temperature, an FOM_W of 15 fJ/conv-step is achieved at 1 GS/s. This is on par with the performance of RT designs at similar sampling speeds, as shown in Table I. When including the thick-oxide clock receiver, this increases to 200 fJ/conv-step.
Fig. 17 gives a breakdown of the ADC's power consumption at a sampling speed of 1 GS/s. Most of the power is dissipated in the thick-oxide clock receiver, which was not the central focus of the design. Replacing the thick-oxide front-end sampling switches with bootstrapped switches would allow a more efficient clock receiver and reduce this power drastically. In the ADC core, most of the power is dissipated in the logic and the CDAC reference. The comparators contribute only 12% of the total power dissipation, as the noise requirements for an 8-bit ADC allowed using small devices for their implementation. In summary, this power consumption means that, for the case of 20 qubit channels, 0.5 mW is consumed per qubit.
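The efficiency figures can be reproduced approximately from the numbers given in the abstract. The Python sketch below assumes the usual Walden definition FOM_W = P / (2^ENOB · F_s), takes the ENOB from the 36.2-dB SNDR via (SNDR − 1.76)/6.02, and uses the 0.8-mW (core) and 10.8-mW (including clock receiver) power figures at 1 GS/s. The function name is hypothetical, and the result is only a consistency check on rounded values.

```python
def walden_fom(power_w: float, enob_bits: float, fs_hz: float) -> float:
    """Walden figure of merit in J/conv-step: P / (2**ENOB * fs)."""
    return power_w / (2.0 ** enob_bits * fs_hz)

ENOB = (36.2 - 1.76) / 6.02  # from the 36.2-dB SNDR quoted in the abstract
FS = 1e9                     # sampling rate, 1 GS/s

fom_core = walden_fom(0.8e-3, ENOB, FS)    # core ADC only
fom_total = walden_fom(10.8e-3, ENOB, FS)  # including the clock receiver
power_per_qubit_mw = 10.8 / 20             # 20 frequency-multiplexed channels

print(f"core: {fom_core * 1e15:.1f} fJ/conv-step")
print(f"total: {fom_total * 1e15:.0f} fJ/conv-step")
print(f"power per qubit: {power_per_qubit_mw:.2f} mW")
```

The computed values land close to the quoted 15 and 200 fJ/conv-step, and the per-qubit power of about 0.54 mW rounds to the stated 0.5 mW/qubit.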
One important consideration in the design of cryogenic chips, in general, is the on-chip temperature, which might largely deviate from the environmental temperature of 4.2 K, e.g., as shown in [5] and [50]. As the measurements are performed in a dip stick, the chip is fully submerged in liquid helium. Although this results in an expected good thermal coupling, to verify the absence of any local hot spots on the chip, an array of temperature-sensing diodes is placed on the chip, distributed over the ADC core, clock receiver, digital, and termination resistors, as shown in Fig. 18(a). These P+/N-well diodes are multiplexed via thick-oxide selection switches on the anode side and read via a sense-and-force connection by a source measure unit (SMU) using a current of 1 µA. For calibration, the diode voltage was recorded at a range of environmental temperatures, measured with an external temperature sensor, with the ADC being inactive. Due to the loss of sensitivity in the diodes at lower temperatures in the calibration curve shown in Fig. 18(b), temperatures below 6 K are affected by a non-negligible error. In Fig. 18(c), we show the heat map when operating the ADC with a clock of 1 GHz and Nyquist-rate input signal. Very little global self-heating was observable. This is in line with the results in [50], where for a very small heater dissipating 6.3 mW, a self-heating below 0.5 K was observed at a distance of just 20 µm, which is the minimum distance of the on-chip sensing diodes from the power-dissipating circuit elements. Also, the power consumption in this chip is spread over the area of the clock receiver and ADC, rather than being concentrated in a single point, further decreasing the effect. We can conclude that the global chip temperature was very close to 4.2 K.
In Table I, we compare the presented ADC with prior works showing operation at cryogenic and RT. Compared with other cryogenic designs, we have made significant advances in efficiency and speed, while maintaining similar performance to RT state-of-the-art designs in similar technologies.

V. CONCLUSION
In this article, we demonstrated a loop-unrolled ADC operational at 4.2 K. Using a DAC switching scheme that increases the comparator input common mode for most decisions, we reached higher speed at cryogenic temperatures. By providing an ENOB of 5.7 b at 1 GS/s, the ADC is well suited for an RF reflectometry readout for spin qubits, allowing for 20 frequency-multiplexed qubit channels while maintaining a power efficiency of 0.5 mW/qubit. The ADC power efficiency can be further improved by including a more efficient clock receiver, which becomes possible by replacing the thick-oxide front-end sampling switches with bootstrapped switches. To the best of the authors' knowledge, this design is the fastest-sampling cryo-CMOS ADC reported to date. With the demonstrated performance, the proposed ADC contributes to the progress toward the fully integrated cryo-CMOS readout required in future large-scale quantum computers.