A 100–750 MS/s 11-Bit Time-to-Digital Converter With Cyclic-Coupled Ring Oscillator

This paper presents the first measured cyclic-coupled ring oscillator (CCRO) time-to-digital converter (TDC). The CCRO realizes a robust true time-domain delay interpolation with sub-gate-delay resolution. The architecture employs real-time quantization to reduce conversion time and hence maximize bandwidth. Furthermore, the CCRO phase progression is encoded with bubble error suppression logic, thereby building resilience to delay mismatches from circuit/layout imperfections. The prototype circuit, implemented in a 28 nm CMOS process, demonstrates a combination of high resolution and high sample rate over a wide range of sample rates. The TDC achieves its peak figure-of-merit (FoM) of 0.051 pJ/conv.-step at 100 MS/s while delivering 8.38-bit linear resolution and 15.4 ps time resolution, operating from a 0.55 V supply. At 125 MS/s and a 0.9 V supply, the TDC demonstrates the highest reported linear resolution among converters operating above 100 MS/s, 9.29 bits, while achieving 4.4 ps time resolution and 0.16 pJ/conv.-step FoM. Further, the real-time quantizing architecture allows fast operation up to 750 MS/s, where the TDC delivers 6-bit linear resolution and 0.48 pJ/conv.-step FoM operating from a 0.9 V supply.


I. INTRODUCTION
Integrated time-to-digital converter (TDC) circuits that measure time intervals with high precision, accuracy, and speed have a wide range of applications. They are used as phase detectors in all-digital phase-locked loops and frequency synthesizers [1], time-of-flight (ToF) sensors in radio or laser ranging (RADAR/LIDAR) and 3D imaging systems [2], nuclear instrumentation [3], and medical 3D imaging solutions [4]. An application that has recently gathered interest is time-domain analog-to-digital conversion [5], where time is utilized as a medium for signal representation. The design of conventional analog-to-digital converter (ADC) architectures with voltage/current-domain signal representation becomes increasingly difficult as semiconductor technology scales down deep into the nanometer regime, due to the diminishing voltage headroom and the decreasing intrinsic gain of the devices. Time-domain converters help circumvent this limitation by taking advantage of the increasing time resolution offered by modern technologies to deliver converter performance that improves with technology scaling. Furthermore, time-domain signal representation enables increased or even exclusive use of digital components, thereby improving design automation with the computer-aided design (CAD) framework available for digital design, and hence reducing design and porting effort relative to traditional full-custom analog design [6]-[9].
The associate editor coordinating the review of this manuscript and approving it for publication was Sai-Weng Sin.
The time resolution of a TDC with a time-domain quantizer employing a delay line or ring oscillator is limited by the minimum delay of an inverter in the given technology. Most of the existing solutions enabling sub-gate-delay time resolution, such as the Vernier architecture [10]-[16] or the pulse-shrinking architecture [17], exhibit limited sample rate and bandwidth because of the increased time required for conversion.
In this paper, we propose a TDC circuit with real-time quantization and a cyclic-coupled ring oscillator (CCRO) quantizer. The CCRO realizes a robust true time-domain delay interpolation to provide a sub-gate-delay resolution time reference for quantization, enabling high converter resolution for a given conversion time. Furthermore, real-time conversion reduces the conversion time to ensure maximal sample rate and signal bandwidth. The design also employs a pattern-independent bubble error suppression technique based on ones-counters, which effectively suppresses large conversion errors caused by transition reordering in the presence of delay mismatches due to circuit/layout imperfections. This work provides the results from a chip implementation of a modified design as well as rigorous validation of the concepts, whereas some of the related theoretical aspects are covered by our recent publication [18].
The measurement results of a prototype 11-bit TDC design in a 28 nm CMOS process demonstrate versatile, highly scalable operation over a wide range of sample rates from 100 MS/s to 750 MS/s. The converter delivers a state-of-the-art peak linear resolution of 9.29 bits at a 125 MS/s sample rate with 4.4 ps time resolution. Further, the converter achieves a figure-of-merit (FoM) of 0.051 pJ/conv.-step when operating at 100 MS/s from a 0.55 V supply with 15.4 ps time resolution, while still delivering a high linear resolution of 8.38 bits. Overall, the TDC demonstrates a combination of high sample rate, high resolution, and high energy efficiency over a wide range of sample rates.
The remaining part of the paper is organized as follows. Section II describes the proposed TDC design and implementation details. Section III presents the measurement results and provides performance comparison with state-of-the-art TDCs. Section IV concludes the discussion.

II. PROPOSED TDC DESIGN
The proposed TDC design is shown in Fig. 1. A multi-phase cyclic-coupled ring oscillator generates a technology-independent sub-gate-delay time resolution for input delay quantization [18]. A counter connected to one phase tap of the CCRO further extends the dynamic range of the phase accumulation hardware by counting integer cycles of the CCRO. The CCRO and counter phases are sampled by registers clocked by the start and stop pulses. The counter phase is sampled by two sets of registers, where the second sampling instant is delayed by τ. The two resulting counter phase samples, φ_ctr,start/stop,A and φ_ctr,start/stop,B, are used to align the sampled counter and CCRO phases coherently, and to correct any sampling errors originating from path delays and asynchronous counter sampling. Error-free counter samples are selected based on the counter correction logic described in Section II-B. The CCRO phase samples are encoded into a binary number with a ones-counter encoder discussed in Section II-C. The integer (counter) and fractional (CCRO) phases are combined, and the difference between the start and stop channels is calculated to generate the final digital output. The digital clock, denoted as Clk in the figure, is the start signal with a slightly leading phase. The resulting TDC performs Nyquist-rate conversion, in good alignment with our goal of maximizing conversion bandwidth. The finer details regarding the architecture as well as the circuit and layout implementation are described in the following subsections.

A. CYCLIC-COUPLED RING OSCILLATOR QUANTIZER
This work employs real-time quantization [18], where the time available for quantization is limited. Hence, a technology-independent sub-gate-delay LSB time step is desired to maximize converter resolution for a given conversion time. We employ a CCRO quantizer to achieve a robust sub-gate-delay time resolution with true time-domain delay interpolation by means of injection locking, which is a significant improvement over the passive [19], [20] or active [21] delay interpolation employed in earlier-reported real-time quantizing converters. Both passive and active interpolation operate in the voltage domain, as they construct the interpolated transitions from the instantaneous voltage of the reference transitions, and hence fail to operate well when the propagation delay between the reference transitions exceeds the transition time [22].
The CCRO consists of M ring oscillators (each with N delay stages) coupled in a cyclic manner. Cyclic coupling forces the oscillators to a steady state where the phase difference between adjacent oscillators becomes equal, assuming that the oscillators are identical. Fig. 2 shows, for easy illustration, a 5 × 3 (N = 5, M = 3) cyclic-coupled ring oscillator circuit where respective taps of the ring oscillators are cyclically connected with coupling inverters labeled 'c' [23], [24]. Coupling inverters are sized to have a drive strength k times that of regular inverters, with k < 1. The coupling network implements multi-phase injection locking [25] between the adjacent ring oscillators, which forces the phase difference between adjacent oscillators to be equal, thus causing a uniform temporal distribution of the transitions in the oscillators. In steady state, the phase difference ψ between respective nodes of adjacent ring oscillators can have M − 1 possible values given by ψ = i · 2π/M, where i is an integer with 0 < i < M. Each solution defines a mode of oscillation, which is analyzed further in [23]. The uniformly distributed transitions provide a factor-of-M sub-gate-delay resolution: the M oscillators together produce 2 · N · M transitions within a cycle, which results in a factor-of-M improvement in phase resolution as illustrated in Fig. 3, which shows the output waveforms of the CCRO in Fig. 2 for the mode ψ = 120°. Here, t_inv-min is the minimum inverter delay in the technology for the given load conditions, and t_q is the sub-gate-delay time step achieved with a CCRO, which becomes the LSB of the converter. The achieved sub-gate-delay resolution can be considered technology-independent since the minimum time step t_q can be reduced below the minimum inverter delay by design (with M > 1).
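The uniform distribution of transitions can be illustrated with a small idealized phase model. The sketch below is not taken from the chip design: it assumes identical oscillators, a per-stage phase shift of an inversion plus π/N, and a ring-to-ring offset ψ = i · 2π/M, and the function name `transition_slots` is hypothetical. Under these assumptions, every tap transition falls on its own slot of a grid with N · M slots per half period (2 · N · M per full period).

```python
def transition_slots(n_stages, m_rings, mode):
    """Map each tap (n, m) to its transition slot within one half period.

    Idealized model (an assumption, not the measured circuit):
    tap n of ring m is delayed by n/N of a half period within its ring,
    and ring m is offset by psi = mode * 2*pi/M, i.e. 2*mode*m/M half periods.
    Slots are expressed in units of (half period) / (N*M).
    """
    total = n_stages * m_rings
    slots = {}
    for m in range(m_rings):
        for n in range(n_stages):
            # n/N + 2*mode*m/M half periods, scaled by N*M and folded
            # into a single half period.
            slots[(n, m)] = (m_rings * n + 2 * mode * n_stages * m) % total
    return slots


# 5 x 3 example of Fig. 2 in the mode psi = 120 degrees (mode = 1).
example = transition_slots(5, 3, 1)
print(sorted(example.values()))  # every slot of the 15-slot grid is used once

# 9 x 7 dimensions of the implemented CCRO, checked for all M - 1 modes.
for mode in range(1, 7):
    used = sorted(transition_slots(9, 7, mode).values())
    print(mode, used == list(range(63)))
```

In this model each slot is occupied exactly once, so the spacing between consecutive transitions is 1/M of the stage-to-stage spacing of a single ring, which corresponds to the factor-of-M improvement described above.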
The dimensions of the CCRO employed in the design are set as N = 9 and M = 7, yielding 2 · N · M = 126 phase steps when both rising and falling transitions are utilized (≈7-bit precision). A high-speed 4-bit synchronous counter is used to extend the dynamic range of the converter by 4 bits to approximately 11 bits. Fig. 2 shows the transistor widths used in the chip implementation. The channel lengths are kept at the minimum. The CCRO is designed with k = 0.25 due to layout related reasons discussed later in Section II-E.
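The resolution arithmetic above can be reproduced in a few lines. The following sketch uses only N, M, and the counter width given in the text, together with the 4.4 ps LSB reported for the 0.9 V supply; the derived CCRO period and full-scale range assume ideally uniform phase steps and are illustrative values, not measured ones.

```python
import math

N_STAGES = 9        # delay stages per ring oscillator
M_RINGS = 7         # number of coupled ring oscillators
COUNTER_BITS = 4    # synchronous counter width
LSB_PS = 4.4        # reported LSB at the nominal 0.9 V supply

# Both rising and falling transitions are used, hence the factor of two.
phase_steps = 2 * N_STAGES * M_RINGS
fractional_bits = math.log2(phase_steps)

# The counter extends the range by 2**COUNTER_BITS CCRO cycles.
total_codes = phase_steps * 2 ** COUNTER_BITS
total_bits = math.log2(total_codes)

# Derived quantities, assuming perfectly uniform phase steps.
ccro_period_ps = phase_steps * LSB_PS
full_scale_ns = total_codes * LSB_PS / 1000.0

print(f"phase steps     : {phase_steps}")
print(f"fractional bits : {fractional_bits:.2f}")
print(f"total codes     : {total_codes} ({total_bits:.2f} bits)")
print(f"CCRO period     : {ccro_period_ps:.1f} ps")
print(f"full-scale range: {full_scale_ns:.2f} ns")
```

The fractional part contributes slightly less than 7 bits, which is why the total is described as approximately 11 bits rather than exactly 11.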

B. COHERENT SAMPLING OF INTEGER AND FRACTIONAL PHASE
In this work, the coherence between integer and fractional phases is achieved by double sampling the counter output with two registers that are triggered with a constant delay τ between them. The integer-fractional coherence has also been addressed in [26], [27]. The proposed counter sampling correction logic is presented in Fig. 4. In the figure, φ_CCRO and φ_counter are the sampled CCRO and counter phases, respectively. The CCRO phase tap connected to the counter is selected with a multiplexer as shown in Fig. 1, allowing the effective counter delay τ_c to be tuned. The CCRO and counter samples capture a common circuit event occurring in the CCRO, which allows φ_CCRO to be used as a reference for the counter sample selection logic. For lower-half phases of φ_CCRO, the counter sample A may be erroneous because of a recent transition in the counter output, so the B-sample is selected. Otherwise, the A-sample can be assumed to be error-free due to sufficient temporal distance from the most recent counter transition. After applying the counter sample selection, the final encoded start/stop sample is constructed from the two phase samples by combining the encoded CCRO value and the scaled counter value as φ_start/stop = 2 · N · M · φ_counter + φ_CCRO. Error-free counter sampling and integer-fractional coherence can be ensured as long as the effective counter delay between the CCRO zero-phase and the counter phase is less than the sampling delay, τ_c < τ, and the sampling delay is less than the half-period of the CCRO, τ < T_CCRO/2. Additionally, the correction is PVT insensitive when the PVT-induced delay variation is within the above-mentioned delay margins. Furthermore, the correction is robust against global variations in supply and temperature, since all of the time/delay quantities, τ_c, τ, and T_CCRO/2, change in a common direction with supply and temperature. In the measurement, the value of τ_c is tuned by sweeping the control codes ('Ctrl' in Fig. 1) and measuring the DC output of the converter. The standard deviation of the DC output is minimized when τ_c is correctly set. In this implementation, the fixed sampling delay τ has a value of 90 ps according to post-layout simulation.
VOLUME 9, 2021
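The selection rule and the phase combination described above can be sketched behaviorally as follows. This is a simplified model under stated assumptions, not the on-chip logic of Fig. 4: the function names are hypothetical, "lower half" is taken to mean encoded CCRO values below N · M, and the wrap-around of the start/stop difference modulo the full code range is an assumption that the text does not spell out.

```python
N_STAGES, M_RINGS = 9, 7
PHASE_STEPS = 2 * N_STAGES * M_RINGS   # 126 fractional codes per CCRO cycle
COUNTER_MODULUS = 2 ** 4               # 4-bit counter


def timing_margins_ok(tau_c_ps, tau_ps, t_ccro_ps):
    """Check the stated condition tau_c < tau < T_CCRO / 2."""
    return tau_c_ps < tau_ps < t_ccro_ps / 2


def select_counter_sample(phi_ccro, counter_a, counter_b):
    """Pick the delayed B-sample for lower-half CCRO phases, else the A-sample."""
    if phi_ccro < PHASE_STEPS // 2:
        return counter_b   # A may have been caught during a counter transition
    return counter_a       # A is far enough from the last counter transition


def combined_phase(phi_ccro, counter_a, counter_b):
    """phi = 2*N*M * phi_counter + phi_CCRO for one channel (start or stop)."""
    counter = select_counter_sample(phi_ccro, counter_a, counter_b)
    return PHASE_STEPS * counter + phi_ccro


def tdc_output(phi_start, phi_stop):
    """Start/stop difference; the modulo wrap is an assumption of this sketch."""
    return (phi_stop - phi_start) % (PHASE_STEPS * COUNTER_MODULUS)


# Hypothetical samples: just after a counter increment the A-sample is stale.
print(combined_phase(5, counter_a=2, counter_b=3))     # uses B
print(combined_phase(100, counter_a=3, counter_b=4))   # uses A
print(timing_margins_ok(tau_c_ps=50, tau_ps=90, t_ccro_ps=554.4))
```

With the 90 ps sampling delay quoted above and a CCRO period on the order of 126 LSBs, the condition τ < T_CCRO/2 is met with a wide margin at the nominal supply in this model.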

C. PATTERN-INDEPENDENT BUBBLE ERROR SUPPRESSION
This work employs a robust bubble error suppression technique with dual ones-counters, which is capable of effectively suppressing bubble errors in the sampled phase signal regardless of the pattern of bubbles. The solution is based on our earlier published encoder developed for ring oscillators [28], which is here adapted to the transition pattern observed at the output of a CCRO. Robust error suppression is crucial to avoid large conversion errors caused by delay mismatch among the phase taps in the CCRO-register connection, particularly when designing converters with picosecond-range LSB in modern wire-delay dominated processes. Factors contributing to the delay mismatch include drive mismatch among inverters in the CCRO, PMOS-NMOS drive mismatch, load mismatch at the output nodes due to layout-induced wire-load mismatch, and clock skew among the flip-flops in the sampling register. When the combined mismatch exceeds the LSB of the converter, the temporal order of transitions seen by the register becomes different from the order of transitions in the CCRO, resulting in large conversion errors with a digital phase encoder which directly maps the phase patterns to a numerical representation. Moreover, the pattern of such transition reordering is difficult to predict, since it arises largely from the layout-dependent wire-delay mismatch in the CCRO-register connection and the clock distribution network within the sampling registers, calling for a robust pattern-independent error suppression.
The proposed phase encoder is illustrated in Fig. 5. The sampled phase signal is first subject to selective inversion, where a predetermined set of bits in the data are inverted. The inversion pattern consists of a chess-board pattern starting at an arbitrarily chosen pivot tap, on which another mode-dependent pattern is superimposed, as illustrated in Fig. 5. The M − 1 mode-dependent patterns are predefined based on possible CCRO oscillation modes and their known node phases [18]. The selective inversion maps the CCRO patterns to a cyclic unary code, where transition reordering errors appear as bubble errors. The cyclic unary code is then compressed by counting the number of ones in the code, which effectively corrects bubble errors irrespective of their pattern or the position of bubbles. The output is unfolded [29] with the knowledge about the position of the pivot tap to convert the non-unique mapping obtained with ones-counting back to a unique mapping [28]. When the transition circulating in the CCRO is temporally close to the pivot tap at the time of sampling, large errors can result. To mitigate this problem, two separate signal paths with pivot taps separated by π/2 are used, and the correct output is chosen based on a coarse detection of the position of the transiting tap at the time of sampling. In effect, the logic ensures monotonicity of LSB accumulation even in the presence of transition reordering errors, regardless of the pattern of reordering. Each data point is generated from a test vector with 2^14 patterns having random reordering of transitions around the switching bit. As can be seen, the technique effectively suppresses large errors over a wide range of bubble error depths, even with random transition reordering patterns. The maximum error remains at two LSBs up to a bubble depth of six taps, whereas the error can be as large as 63 LSBs without error correction. Fig. 7 shows the spectra from a behavioral model of the CCRO encoder developed with Python, simulated with varying bubble error depths. Start and stop signals corresponding to a sinusoid input are generated with an ideal amplitude-to-time conversion. In Fig. 7, the bubble error suppression is disabled in the top plot and enabled in the bottom plot. The proposed correction technique ensures that the errors remain low, thus maintaining good SNDR even with high orders of reordering, compared to the case without correction.
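The core idea of ones-counting can be illustrated with a deliberately simplified model. The sketch below is not the behavioral model used for Fig. 7: it starts from a plain (non-cyclic) unary code, so selective inversion, unfolding, and the dual pivot paths are omitted, and the transition reordering is modeled as a random shuffle of the taps within a window around the switching bit. All function names are hypothetical. A naive encoder that reports the position of the first zero stands in for a direct pattern-to-number mapping.

```python
import random


def thermometer(value, length):
    """Ideal unary code: `value` ones followed by zeros."""
    return [1] * value + [0] * (length - value)


def reorder_near_transition(bits, value, depth, rng):
    """Randomly reorder the taps within `depth` positions of the switching bit."""
    lo = max(0, value - depth)
    hi = min(len(bits), value + depth)
    window = bits[lo:hi]
    rng.shuffle(window)
    return bits[:lo] + window + bits[hi:]


def ones_counter_encode(bits):
    """Pattern-independent encoding: count the ones."""
    return sum(bits)


def first_zero_encode(bits):
    """Naive encoding: position of the first zero in the code."""
    return bits.index(0) if 0 in bits else len(bits)


def worst_case_errors(length=126, depth=6, trials=2000, seed=1):
    """Return the worst absolute error of both encoders over random trials."""
    rng = random.Random(seed)
    worst_ones = worst_naive = 0
    for _ in range(trials):
        value = rng.randrange(length + 1)
        sampled = reorder_near_transition(thermometer(value, length), value, depth, rng)
        worst_ones = max(worst_ones, abs(ones_counter_encode(sampled) - value))
        worst_naive = max(worst_naive, abs(first_zero_encode(sampled) - value))
    return worst_ones, worst_naive


ones_err, naive_err = worst_case_errors()
print("worst ones-counter error:", ones_err)
print("worst first-zero error  :", naive_err)
```

Because a pure reordering preserves the number of ones, the ones-counter is error-free in this idealized model, whereas the first-zero encoder can be off by up to the bubble depth. The residual error of up to two LSBs reported above comes from effects that this sketch does not model.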

D. REDUCTION OF POWER CONSUMPTION IN SAMPLING REGISTER
This work implements a gating technique to reduce the power consumption of the sampling circuits, as presented in Fig. 8. In our design, the input stage of the flip-flop buffers and inverts the high-frequency signal of the CCRO to drive the differential inputs of the sense-amplifier flip-flop. The flip-flop is required to be active only during the start/stop sampling instant. Consequently, the input node can be gated during a significant portion of the sampling period to prevent unnecessary switching. This results in reduced power consumption, since the frequency at the clock node is lower than the frequency of the CCRO. A similar issue is also addressed in [30], where clock gating is applied to a delay-line TDC to conserve power. The flip-flop input is activated by the rising edge of the start/stop signal using a positive-edge-triggered pulse generator, which generates a narrow pulse with duration τ_p. The sampling clock is aligned to the approximate middle of the narrow pulse. In this design, the post-layout simulated value of τ_p is 220 ps. Modifying the pulse width can modulate the frequency of the CCRO and potentially result in nonlinearity at the output of the TDC. However, no linearity degradation was observed at the converter output in simulations or measurements, due to other more dominant error sources. In the measurements, the TDC linearity is equal with and without the pulsed sampling (disabled by a NAND-gate with the On/Off signal denoted in Fig. 8). The additional power consumed by the generation of τ_p is lower than the power saved by the gating, because the inverters used to generate τ_p are switching at a lower frequency than the input inverters of the flip-flops. The measured power saving at a 250 MS/s sample rate and the nominal 0.9 V supply is approximately 1.5 mW.
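As a rough indication of how much switching the gating removes, the fraction of each sampling period during which the flip-flop input stage is enabled can be estimated from τ_p and the sample rate. The sketch assumes one enable pulse per channel per sampling period and ignores edge effects; it does not estimate power, which depends on circuit details not modeled here. The function name is hypothetical.

```python
TAU_P_PS = 220.0          # post-layout simulated pulse width
SAMPLE_RATE_HZ = 250e6    # operating point of the reported power saving


def enabled_fraction(tau_p_ps, sample_rate_hz):
    """Fraction of the sampling period during which the input stage is enabled.

    Assumes exactly one enable pulse of width tau_p per sampling period.
    """
    return tau_p_ps * 1e-12 * sample_rate_hz


fraction = enabled_fraction(TAU_P_PS, SAMPLE_RATE_HZ)
print(f"input stage enabled for {100 * fraction:.1f} % of the sampling period")
```

Under this assumption the input stage follows the CCRO signal for only a few percent of the period at 250 MS/s, and the fraction grows linearly with sample rate.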

E. LAYOUT CONSIDERATIONS
Careful layout of the CCRO is crucial to achieve good performance in terms of time resolution and linearity. It is desired to minimize the load mismatch among the phase taps of the CCRO to reduce delay mismatch. An interleaved placement of the delay cells of the CCRO is employed in this work to achieve a balanced load distribution, as illustrated in Fig. 9 with the example of a 7 × 5 CCRO. The interleaved placement avoids the relatively long wiring present at the boundaries of a layout with non-interleaved placement. Further, two-dimensional interleaved placement is adopted to achieve wire-load balancing in both directions. The white squares in the layout illustration to the right of Fig. 9 represent delay cells containing the two inverters, and black arrows represent wiring between the cells. The indices are in the form (n, m), where the column index is 1 ≤ n ≤ N and the row index is 1 ≤ m ≤ M. The remaining delay mismatches are dynamically matched by the inherent dynamic element matching (DEM) present in the free-running oscillator based architecture [31].
Due to the grid-like layout, inter-node coupling through metals and the substrate in the CCRO can be significant, which can degrade the desired cyclic-coupling by introducing additional coupling paths. The resultant degradation of modal stability of the CCRO was observed in post-layout simulations. The observed impact of parasitics was minimized by increasing the widths (W_PMOS/W_NMOS) of the main and coupling inverters to 3.2/1.6 µm and 800/400 nm, respectively. The device lengths were kept at minimum.
FIGURE 9. The two-dimensional interleaved placement of delay cells to minimize wire-load mismatch among CCRO phase taps, illustrated with the example of a 7 × 5 CCRO.
Coupling of the start and stop clocks to the free-running CCRO can potentially cause nonlinearity at the output. Thus, in order to minimize this effect, the supply and ground nets of the CCRO and sampling flip-flops are isolated in the layout. Additionally, sufficient supply decoupling capacitance is important for stabilizing the supply and thus minimizing nonlinearity caused by interference.

III. MEASUREMENT RESULTS
The prototype chip with the proposed design, fabricated in a 28 nm CMOS process, is measured after directly wire-bonding the die to the test PCB. The chip micrograph is shown in Fig. 10. The TDC occupies an active area of approximately 0.078 mm². The circuit is measured using RF signal sources for the start and stop inputs and a logic analyzer for recording the 11-bit parallel LVDS digital output. The CCRO mode detection is carried out off-chip using a histogram method as shown in Fig. 5, and the effective counter delay tuning is automatically calibrated in the Matlab measurement routine. The TDC does not require any active fine calibration to operate correctly. A sub-gate-delay LSB size of 4.4 ps is obtained at the nominal supply of 0.9 V.

A. RAMP TEST AND STATIC NONLINEARITY
The response of the converter at a sample rate of 250 MS/s to a slow ramp input, generated by feeding two sinusoids with a small frequency difference to the start and stop inputs, is plotted in Fig. 11 along with the respective histogram computed from around 1 million samples. The converter delivers DNL well within ±0.5 LSB over a wide range of sample rates and supply voltages. This corroborates the effect of the inherent dynamic element matching present in the free-running CCRO TDC and the bubble error suppression of the encoder. Increasing the sample rate reduces the time available for quantization, which can be observed as a reduced conversion range in the 750 MS/s case. However, the real-time quantizing architecture is able to use the limited available conversion time efficiently. Fig. 13 shows the INL of the converter at sample rates of 100, 250, and 750 MS/s, and respective supply voltages. The INL performance shows no significant degradation with scaling of the supply, demonstrating the resilience of injection locking and delay interpolation in the CCRO against large variations in supply voltage. However, some degradation in linearity is observed when the sample rate is increased. This can be attributed to interference caused by clock coupling to the oscillator supply, which becomes more pronounced with increased frequencies. A similar effect is observed in [32].
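The text does not detail how DNL and INL are extracted from the ramp histogram. A common approach is the code-density method, sketched below under the assumptions that the ramp covers the analyzed codes uniformly and that the end codes have already been trimmed; the function name and the example counts are hypothetical.

```python
def dnl_inl_from_histogram(counts):
    """Code-density estimate of DNL and INL, both in LSB.

    counts[k] is the number of ramp samples that produced code k.
    With a uniform ramp, every code of an ideal converter is hit equally often,
    so the deviation of each bin from the mean bin height estimates its DNL.
    """
    ideal = sum(counts) / len(counts)
    dnl = [c / ideal - 1.0 for c in counts]

    # INL is the running sum of the DNL.
    inl = []
    running = 0.0
    for step in dnl:
        running += step
        inl.append(running)
    return dnl, inl


# Hypothetical 4-code histogram: code 0 is 10 % narrow, code 1 is 10 % wide.
dnl, inl = dnl_inl_from_histogram([90, 110, 100, 100])
print([round(x, 3) for x in dnl])
print([round(x, 3) for x in inl])
```

Because the bin heights are normalized to their mean, the INL returns to zero at the last code by construction in this formulation.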

B. DC TEST AND SINGLE-SHOT PRECISION
The converter is excited with a DC input generated with two phase-shifted frequency-locked 250 MHz signals connected to the start and stop inputs to evaluate the single-shot precision (SSP). The single-shot precision, measured from 2 million samples at a 0.9 V supply for low, mid, and high codes, is shown in Fig. 14 along with the respective histograms. The converter delivers good precision, with a standard deviation of around 0.86 LSB. Fig. 15 presents the converter output for DC excitation with the bubble error suppression logic enabled and disabled. Numerous large conversion errors appear at the output when the logic is turned off, which are effectively suppressed by the logic, demonstrating the robustness of the proposed pattern-independent bubble error suppression technique. Note that such a high number of transition reordering errors occurs due to the delay mismatches in the CCRO-register connection, even with the careful interleaved layout of the CCRO (Fig. 9), showing the necessity of robust error suppression techniques to avoid large conversion errors when designing converters with picosecond-range sub-gate-delay LSB in modern wire-delay dominated processes. The measured DC response of the converter with and without proper adjustment of the coherent integer and fractional phase sampling (Section II-B) is shown in Fig. 16. The large conversion errors that occur due to timing mismatch in the integer and fractional signal paths in the absence of proper delay correction show the necessity of robust delay mismatch correction techniques.
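The single-shot precision quoted above is given in LSB; it can be expressed in time units by scaling with the LSB size. The sketch below uses the 4.4 ps LSB reported at the 0.9 V supply, a hypothetical function name, and made-up code samples purely for illustration.

```python
import statistics

LSB_PS = 4.4  # reported LSB at the 0.9 V supply


def single_shot_precision(codes, lsb_ps=LSB_PS):
    """Return the standard deviation of DC-input codes in LSB and in ps.

    Uses the population standard deviation; for millions of samples the
    difference from the sample standard deviation is negligible.
    """
    sigma_lsb = statistics.pstdev(codes)
    return sigma_lsb, sigma_lsb * lsb_ps


# Hypothetical output codes for a constant input.
sigma_lsb, sigma_ps = single_shot_precision([1000, 1000, 1002, 1002])
print(f"{sigma_lsb:.2f} LSB = {sigma_ps:.2f} ps")

# The reported 0.86 LSB corresponds to roughly this many picoseconds.
print(f"{0.86 * LSB_PS:.2f} ps")
```

With the reported figures, 0.86 LSB at a 4.4 ps LSB corresponds to a single-shot precision of just under 4 ps.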

D. PERFORMANCE COMPARISON
The measured performance of the proposed converter is tabulated in Table 1, along with other state-of-the-art TDCs. The proposed TDC delivers the highest linear resolution of 9.29 bits among TDCs operating above 100 MS/s, while operating at 125 MS/s. It achieves a high sample rate of 750 MS/s while still maintaining a linear resolution of around 6 bits. Overall, the converter delivers a combination of high linear resolution and sample rate over a wide range of sample rates, as illustrated by the scatter plot in Fig. 17. Further, the converter reports the lowest supply voltage of 0.55 V while still maintaining a very good linear resolution of 8.38 bits and a high bandwidth of 50 MHz (100 MS/s sample rate). Additionally, the converter achieves competitive FoM over a wide range of sample rates, as illustrated by the scatter plot in Fig. 18. The FoM, indicating energy per conversion step, is computed using the linear resolution of the converters as shown in Table 1. Among the four operating points reported, the one with 125 MS/s and a 0.9 V supply trades energy efficiency for improved bandwidth-resolution performance, thus lining up with the 250 MS/s and 750 MS/s points in the N_linear vs. sample rate plot in Fig. 17. On the other hand, the 100 MS/s point with a highly scaled-down 0.55 V supply trades some performance for maximal energy efficiency, thus lining up with the 250 MS/s and 750 MS/s points in the FoM vs. sample rate plot in Fig. 18. The results demonstrate robust operation over a wide range of sample rates and supply voltages, hence enabling a flexible trade-off between performance metrics without hardware modification.
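The exact FoM expression is given in Table 1, which is not reproduced here. The sketch below assumes the commonly used definition FoM = P / (2^N_linear · f_s), with hypothetical function names. The power values it prints are back-calculated from the reported FoM, linear resolution, and sample rate under that assumption; they are not measured figures from the paper.

```python
def fom_pj_per_step(power_w, n_linear_bits, sample_rate_hz):
    """Energy per conversion step in pJ, assuming FoM = P / (2**N_linear * f_s)."""
    return power_w / (2 ** n_linear_bits * sample_rate_hz) * 1e12


def implied_power_mw(fom_pj, n_linear_bits, sample_rate_hz):
    """Power in mW that the assumed FoM definition implies for an operating point."""
    return fom_pj * 1e-12 * 2 ** n_linear_bits * sample_rate_hz * 1e3


# Operating points quoted in the text: (FoM in pJ/conv.-step, N_linear, f_s).
operating_points = {
    "100 MS/s, 0.55 V": (0.051, 8.38, 100e6),
    "125 MS/s, 0.9 V": (0.16, 9.29, 125e6),
    "750 MS/s, 0.9 V": (0.48, 6.0, 750e6),
}

for label, (fom, bits, rate) in operating_points.items():
    print(f"{label}: implied power {implied_power_mw(fom, bits, rate):.1f} mW")
```

The exponential dependence on N_linear is what lets the 0.55 V point reach the best FoM despite its coarser time resolution: the supply reduction lowers power by much more than the loss of about one bit of linear resolution costs.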

IV. CONCLUSION
This paper presents a cyclic-coupled ring oscillator based time-to-digital converter circuit. The CCRO is employed to achieve sub-gate-delay time resolution with robust true time-domain phase interpolation. The CCRO-based TDC operates at high sample rates up to 750 MS/s due to real-time conversion, without relying on sliding-scale or time-amplification based methods. The mismatch-related errors originating from layout and manufacturing non-idealities are suppressed by a ones-counter based encoder backend. The converter achieves a high linear resolution of 9.29 bits due to the inherent dynamic element matching present in the free-running oscillator architecture. The robust phase interpolation and sub-gate-delay time resolution of the CCRO, and the error mitigation capabilities of the encoder, are corroborated by the measured results. The presented design achieves state-of-the-art linearity performance with high energy efficiency over a wide range of sample rates and supply voltages.