An Automated Design Methodology for Ring Voltage-Controlled Oscillators in Nanometer CMOS Technologies

This paper presents a design automation methodology for ring voltage-controlled oscillators (RVCOs) that captures their realistic, physical characteristics. Given multiple sets of input constraints such as target frequency, phase noise, and control voltage range, the proposed algorithm automatically finds the design candidates that satisfy the target constraints by running iterative post-layout simulations with auto-generated layouts and testbenches. The number of post-layout simulations is significantly reduced by a backtracking algorithm that observes the simulation results and determines the search direction. The proposed algorithm is applied to generate RVCOs in 40-nm planar and 7-nm FinFET technologies for DDR5 applications, and the methodology produces sets of design parameters that meet the target specification in both technologies.


I. INTRODUCTION
Various clock generation circuits have been widely used to provide synchronization signals for modern computing and communication systems. For example, transceiver circuits for processor-memory interfaces require voltage-controlled oscillators (VCOs) to produce high-frequency (>10 GHz) and low-jitter (several ps-RMS) clocks to transmit and receive data symbols at high rates. Typically, two types of oscillators have been used for clock generation: LC and ring oscillators. LC oscillators utilize the resonance of LC tanks to generate clock signals and generally exhibit superior noise performance. Ring oscillators (Fig. 1) are typically composed of delay stages (such as CMOS inverters) connected in a loop and tend to occupy compact areas. They are also capable of generating multiple phases, which makes them suitable for system-on-a-chip (SoC) applications where area consumption should be minimized and multi-phase clocks are used for the I/O interfaces [1], [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Dusan Grujić.
However, several difficulties are associated with the design of high-quality ring oscillators. First, various design parameters are involved in implementing ring oscillators that achieve the desired performance, which complicates the design process significantly. For example, the oscillation frequency of a ring oscillator is affected in non-linear ways by multiple variables such as process parameters, number of stages, transistor sizes, and supply (or control) voltage levels. Therefore, establishing an empirical and analytic relationship between those design parameters is intractable. To make matters worse, designing ring oscillators requires time-consuming transient simulations to extract time-domain metrics such as operating frequency and RMS jitter, unlike typical small-signal analog circuits such as amplifiers [3]. In addition, the behavior of ring oscillators is heavily affected by layout-dependent effects such as parasitic capacitance, which cause more than 30% deviation between pre-layout and post-layout characteristics in recent CMOS technology nodes such as FinFET. Yet constructing the physical layout of a ring oscillator takes significant time and effort due to the increased complexity of device structures and design rules [4], [5].
There have been several attempts to reduce the design time and overhead related to RVCO and analog circuit design. Reference [6] utilizes a reinforcement-learning technique with layout generators to find the sizing parameters of analog circuits. Reference [7] trains a neural network model to automate the circuit sizing process without running SPICE simulations after the training step. Both approaches capture the layout-dependent effects by applying transfer learning techniques. However, the machine-learning-based approaches require multi-disciplinary knowledge (model selection, hyperparameter tuning, circuit design, and so on), and additional time and effort for training the neural network. References [8], [9] apply optimization techniques to design RVCO schematics with a fixed number of stages. Reference [5] utilizes digital place-and-route (PNR) techniques to produce various analog circuits including RVCOs, assuming that the circuit sizing parameters are predetermined. In summary, additional research effort is needed to implement a resource-efficient RVCO design algorithm that supports automated circuit sizing and extensive parameterization with layout effects captured.
To develop a practical circuit design technique for RVCOs that fulfills the aforementioned requirements, we propose an automated RVCO design methodology based on a resource-efficient, simulation-based search algorithm with circuit schematic and layout generators. The proposed search algorithm receives various constraints such as oscillation frequency, phase noise, and control voltage range, then automatically finds the optimal set of design parameters with the minimum power consumption by running iterative post-layout simulations over design candidates. For the post-layout validations to capture layout-dependent effects, the sized schematic and layout are generated by circuit design automation frameworks [10], [11], which enables end-to-end automatic execution with parasitic effects captured. The number of iterative simulations associated with the search is reduced by a dedicated backtracking algorithm that continually narrows down the search space by exploiting the fundamental characteristics of RVCOs, as detailed in Section III. The RVCO structure with a controlled supply voltage (Fig. 1) is used for the evaluation of the proposed algorithm, as the topology is widely used in typical clocking systems [2], [12], [13], though the proposed technique can be extended to other RVCO control methods as well [1], [14], [15], [16], [17]. Prototype RVCOs with completed layouts are generated and characterized based on the proposed method, meeting the target specifications in 40-nm planar and 7-nm FinFET technologies.
Our contributions in this work are summarized as follows:
1) An automated design method for RVCOs is presented, which captures their transient and phase-noise characteristics accurately without requiring interdisciplinary knowledge such as machine learning and/or optimization.
2) Iteration-efficient searching methods are proposed to minimize the number of post-layout simulations and converge to the optimal design point quickly.
3) The proposed search algorithm is combined with circuit generation frameworks to consider layout-dependent effects and achieve process portability in advanced CMOS technology nodes.
The remainder of this paper is structured as follows: Section II summarizes major considerations for RVCO design and automation. Section III describes the proposed design automation method for RVCOs. Finally, Section IV presents RVCO generation examples for DDR5 applications from the proposed method.

II. MAJOR CONSIDERATIONS FOR RVCO DESIGN AND AUTOMATION
A. OSCILLATION CONDITION
To sustain the oscillation of an RVCO, its open-loop transfer function must meet the Barkhausen criteria in (1). Here, ω_0 indicates the angular frequency of the oscillator when the loop is completely closed and the conditions in (1) are satisfied. While configurations with an odd number of stages always satisfy the oscillation condition [18], for an even number of stages the loop establishes positive feedback rather than negative feedback, which may result in latch-up [19].
VOLUME 11, 2023
Cross-coupled (and properly sized) latches can be attached between differential nodes to prevent latch-up, as shown in Fig. 1. They reinforce the differential signal and meet the oscillation condition by providing enough phase shift, at the expense of a reduced oscillation frequency. Therefore, the strength ratio between the original delay element and the latch should be carefully selected based on post-layout evaluations. Additional complexity is involved if the number of stages also needs to be determined for further design optimization.
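For reference, the Barkhausen criteria are commonly written as below for a loop drawn with negative feedback, where H(jω) is the open-loop transfer function. This is the textbook form under that sign convention; the exact notation used for (1) is an assumption here.

```latex
\begin{equation}
  \left| H(j\omega_0) \right| \ge 1,
  \qquad
  \angle H(j\omega_0) = 180^\circ
\end{equation}
```

Under this convention the inversion of the negative-feedback loop contributes the remaining 180° of phase, so the total phase shift around the loop is 360° at ω_0.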

B. OSCILLATION FREQUENCY
The oscillation frequency of the ring oscillator in Fig. 1 is expressed in (2), where t_pd is the propagation delay of the core inverter that forms the loop, N is the number of stages, k is the coefficient that represents the first-order sensitivity of t_pd to the cross-coupled latches, and a is the latch ratio coefficient, defined as the width ratio of the core inverter and latch.
Assuming the relative strengths of PMOS and NMOS are balanced, t_pd can also be approximated by (3) [20], where W_core and L_core are the channel width and length of the devices composing the core inverter, µ is the carrier mobility, V_th is the threshold voltage, C_ox is the gate-oxide capacitance per unit area, V_ctrl is the controlled supply voltage applied to the ring oscillator, and C_L is the load capacitance, composed of the input capacitance (C_core = C_ox·W_core·L_core) and the capacitance associated with other components (C_other) such as latches and buffers. Since C_L = C_core + C_other, (3) can be rewritten as (4), and combining (2) and (4) yields the expression (5) for the oscillation frequency.
Equation (5) reveals that f_osc is determined by various factors, such as technology parameters, device width and length, the voltage across the oscillator, and capacitance components, which complicates the design process significantly. Multiple solutions might exist for the same value of f_osc, and designers need to choose between candidates to find the optimal design parameters, which requires extensive design space exploration with post-layout parameters captured. Fig. 2 illustrates one example scenario in which two designs with different numbers of stages (3 versus 2) and latch ratios (0.25 versus 0.875) achieve the same oscillation frequency (20 GHz). Finding these two candidates requires two-dimensional, post-layout exploration with respect to the number of stages and the latch ratio, which is intractable in manual design methods. To make matters worse, various large-signal characteristics and voltage-dependent capacitances cause the oscillation frequency to deviate from (5), which increases design complexity even further.
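The chain from (2) to (5) can be sketched with a first-order, long-channel delay model. The forms below are an assumption built only from the symbol definitions in this subsection: the (1 + k·a) latch term and the omission of order-unity constants are illustrative choices, not confirmed expressions.

```latex
\begin{align}
  f_{osc} &\approx \frac{1}{2 N\, t_{pd}\,(1 + k a)} \\
  t_{pd} &\approx \frac{C_L V_{ctrl}}
           {\mu C_{ox} \frac{W_{core}}{L_{core}} (V_{ctrl} - V_{th})^2} \\
  t_{pd} &\approx \frac{(C_{ox} W_{core} L_{core} + C_{other})\, V_{ctrl}}
           {\mu C_{ox} \frac{W_{core}}{L_{core}} (V_{ctrl} - V_{th})^2} \\
  f_{osc} &\approx \frac{\mu C_{ox} \frac{W_{core}}{L_{core}} (V_{ctrl} - V_{th})^2}
           {2 N (1 + k a)\,(C_{ox} W_{core} L_{core} + C_{other})\, V_{ctrl}}
\end{align}
```

Under these assumed forms, f_osc falls as N, L_core, or the latch ratio a grows, and rises with V_ctrl for V_ctrl > V_th, which matches the monotonic trends exploited later by the search algorithm.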

C. PHASE NOISE
The phase noise represents the fidelity of clock signals in the frequency domain in the presence of noise and disturbances. From [19], the phase noise of inverter-based ring oscillators affected by transistor thermal noise can be expressed as (6), where k, T, and γ denote Boltzmann's constant, the absolute temperature, and the excess noise factor for the device's thermal noise, respectively, f is the frequency offset, and I_d is the device drain current. Equation (6) indicates that the phase noise is inversely proportional to current and voltage when the oscillation frequency is fixed. In other words, increasing the power consumption improves the phase noise [19], [21], [22]. However, as the design parameters in the phase noise expression (6) are coupled to the oscillation frequency as well, as expressed in (5), simply increasing the power consumption to improve the phase noise may shift the operating frequency away from the original point. Therefore, additional fine-tuning is needed to preserve the oscillation frequency, which takes a significant amount of effort in manual design approaches.
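As a worked example of the proportionality stated above, the sketch below converts a scaling of drain current and voltage into a phase noise change in dB. It assumes only that phase noise scales as 1/(I_d·V) at a fixed oscillation frequency and offset; the function name is hypothetical, and the full expression (6) is not reproduced.

```python
import math


def phase_noise_delta_db(current_scale: float, voltage_scale: float = 1.0) -> float:
    """Phase noise change in dB when I_d and V are scaled.

    Assumes phase noise is proportional to 1 / (I_d * V) at a fixed
    oscillation frequency and offset (the trend stated in the text).
    """
    if current_scale <= 0 or voltage_scale <= 0:
        raise ValueError("scale factors must be positive")
    return -10.0 * math.log10(current_scale * voltage_scale)


if __name__ == "__main__":
    # Doubling the drain current at a fixed voltage and frequency.
    print(f"{phase_noise_delta_db(2.0):.2f} dB")
```

Doubling the current therefore buys about 3 dB of phase noise, at roughly twice the power.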

III. DESIGN METHODOLOGY FOR AUTOMATED RVCO GENERATION
As discussed in the previous section, the oscillation frequency and phase noise characteristics are affected by multiple design factors and layout effects, which results in layout iterations and/or sub-optimal designs. The methodology suggested in this section automatically discovers candidates that meet the target constraints (target frequency, phase noise, and control voltage range) and their layouts with reduced design efforts and iterations. Detailed explanations are provided in the following subsections.

A. AUTOMATED RVCO SCHEMATIC & LAYOUT GENERATION
To implement an automated sizing algorithm for RVCOs, their design entities (such as schematics, layouts, and testbenches) must be generated automatically for specific design parameters. Therefore, we utilize an automatic circuit generation framework [10] and its associated layout generation engine [11]. They are used to describe circuit generators for RVCOs that receive design parameters (e.g., the transistor width of the core inverters) and produce the schematic and layout of the RVCO, together with the associated testbenches to measure the frequency, power consumption, and phase noise for the target operating conditions. The conceptual diagrams of the generated schematic and layout of the RVCO are illustrated in Fig. 3 (only two delay/buffer stages are displayed for simplicity). The unit delay cell is composed of four CMOS inverters, two for the forwarding delay elements (or core inverters, INV_core) and two for the latching elements (or latch inverters, INV_latch). The sizing parameters for INV_core and INV_latch are chosen independently. The supply voltage of the inverters is connected to the control voltage of the RVCO, V_ctrl, to set the propagation delay and thereby the oscillation frequency. The buffer circuits are generated and considered together with the delay cells, as their input loading affects the oscillation frequency as well. The layout of the integrated delay/buffer stages is depicted in Fig. 3(c). The layout shares the same sizing parameters with the schematic, and its differential wiring patterns are matched across all stages to suppress any systematic phase mismatch. The effective width and length are configured by setting the number of fingers and vertical stacks of transistors, to make the proposed design method feasible for advanced technologies [4], [5].
For the RVCO design, five design parameters must be configured: the length of transistors (l), the width of core inverters (w_c), the width of latch inverters (w_l), the control voltage (v), and the number of delay stages (n). These five variables form a 5-dimensional search space X := {x = (l, w_c, w_l, n, v)}. Note that a significant number of design points must be checked if the entire space is scanned (N^5 points when each design parameter has N possible values). Therefore, an efficient way of searching the design space with a minimal number of observations is essential to design the RVCO within a reasonable time.
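To make the N^5 growth concrete, the following minimal sketch enumerates the full search space as a Cartesian product. The function name and the parameter grids are illustrative assumptions, not values from the paper.

```python
from itertools import product


def build_search_space(lengths, core_widths, latch_widths, stages, voltages):
    """Return every design vector x = (l, w_c, w_l, n, v) on the given grids."""
    return list(product(lengths, core_widths, latch_widths, stages, voltages))


if __name__ == "__main__":
    # Illustrative grids only: four values per parameter.
    space = build_search_space(
        lengths=[40, 60, 80, 100],       # nm (hypothetical)
        core_widths=[1, 2, 3, 4],        # unit widths (hypothetical)
        latch_widths=[1, 2, 3, 4],       # unit widths (hypothetical)
        stages=[2, 3, 4, 5],
        voltages=[0.75, 0.80, 0.85, 0.90],
    )
    print(len(space))  # 4 ** 5 design points
```

With only four values per parameter there are already 1024 design points, each of which would need a post-layout simulation in an exhaustive scan.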

B. CORE ALGORITHM
1) PROBLEM DESCRIPTION
The main goal of the proposed algorithm is formulated in (7). As illustrated in the formulation, the algorithm searches the available design space and finds a set of candidates that meets key constraints such as oscillation frequency, phase noise, and control voltage range. Additionally, optional constraints such as the absolute maximum power consumption or the number of available phases (N_φ in (7)) can be included to narrow down the search space. While the proposed algorithm favors the candidate that consumes the least power (P_osc), it also outputs a set of candidates that meet the frequency, phase noise, and control voltage constraints with additional power overhead. This provides further opportunities to decide the final design among near-optimum candidates based on other metrics, such as oscillator FOM [19], area, and K_VCO.
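The problem statement can be sketched as the constrained minimization below. This is a reconstruction under assumptions, pieced together from the constraints named in this paragraph and the specification used in Section IV; the symbols f_target, L_target, Δf, and P_max are illustrative names, and the exact form of (7) may differ.

```latex
\begin{equation}
\begin{aligned}
  \text{search:}\quad     & x = (l, w_c, w_l, n, v) \in X \\
  \text{minimize:}\quad   & P_{osc}(x) \\
  \text{subject to:}\quad & f_{osc}(x) = f_{target} \\
                          & \mathcal{L}(x, \Delta f) \le \mathcal{L}_{target} \\
                          & v_{min} \le v \le v_{max} \\
  \text{optional:}\quad   & P_{osc}(x) \le P_{max}, \quad
                            n \text{ compatible with } N_{\phi}
\end{aligned}
\end{equation}
```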
2) AUTOMATED DESIGN PROCESS
The overall RVCO design process is illustrated in Fig. 4. As mentioned in previous sections, the key idea of the proposed algorithm is to avoid running excessive simulations when exploring the search space composed of the five design variables. For this purpose, we construct five steps and four loops. First, we construct a five-dimensional search space X to be explored based on the design parameter vectors x (Search Space Construction). After that, candidates that satisfy the frequency constraint (X_1 = {x_1}) are discovered for each possible combination of l and w_c (Coarse/Fine Frequency Searching). The frequency search is divided into coarse and fine steps to efficiently exclude candidates that cannot meet the frequency constraint without running simulations. During frequency searching, loops 1 and 2 sweep three design variables while measuring the transient response. The resulting X_1 is then further examined by running phase noise simulations (Phase-Noise Searching). The search direction is designed to minimize the number of time-consuming phase noise simulations. The frequency and phase noise searching steps are iterated over pairs of l and w_c through loops 3 and 4, while the scope and resolution of the iteration are dynamically adjusted for rapid searching. Finally, the candidates that meet the frequency and phase noise constraints (X_3,final = {x_3}) and the optimal candidate x_4 that achieves the lowest power consumption are suggested (Candidate Suggestion). Each step is elaborated in the following paragraphs with examples.
Step 1. (Search Space Construction): During the first phase of the design process, we construct a 5-dimensional search space X composed of design parameter vectors x = (l, w_c, w_l, n, v) to be explored. When the desired number of phases N_φ is provided, candidates that do not meet the phase requirements in (7) are dropped. For example, for four-phase clock generation (N_φ = 4), three-, five-, and seven-stage oscillators are removed from the search space.
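The phase-based filtering in Step 1 can be sketched as follows. The divisibility rule is an assumption: it supposes a differential ring in which n stages provide 2n equally spaced phases, which reproduces the N_φ = 4 example above but is not stated explicitly in the text. The function names are hypothetical.

```python
def supports_phases(n_stages: int, n_phases: int) -> bool:
    """True if an n-stage differential ring can supply n_phases equally spaced phases.

    Assumption: n differential stages give 2n equally spaced phases, so the
    requested phase count must divide 2n.
    """
    return (2 * n_stages) % n_phases == 0


def filter_stage_counts(stage_counts, n_phases):
    """Keep only the stage counts compatible with the requested phase count."""
    return [n for n in stage_counts if supports_phases(n, n_phases)]


if __name__ == "__main__":
    # Four-phase (quadrature) clock generation over 2..8 stages.
    print(filter_stage_counts(range(2, 9), 4))
```

For N_φ = 4 this drops the three-, five-, and seven-stage rings and keeps the even stage counts, consistent with the example in Step 1.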
After the search set is constructed, the algorithm enters loop 4 and loop 3 to find the candidates. Among the five design parameters, only the channel length l is iterated over its full range (loop 4 in Fig. 4), as the oscillation frequency and phase noise characteristics vary significantly with the channel length. For each value of l, the core transistor width w_c is swept from its lowest value (loop 3 in Fig. 4) to find candidates that meet the oscillation frequency and phase noise conditions.
Step 2. (Coarse Frequency Searching): As the first performance evaluation, the transient behavior is observed to search for the X_1 that fulfills the frequency requirements. Coarse frequency searching precedes fine frequency searching for the selected values of l and w_c. The second step explores a 3-dimensional subspace X_sub ⊂ X. In other words, the 3-dimensional subspace is composed of vectors x = (l_0, w_c0, w_l, n, v) for a given pair (l_0, w_c0), and only the maximum and minimum voltages are used in this step (v_max and v_min, respectively).
Instead of fully exploring the subspace, the monotonic properties of the oscillation frequency (with respect to n, w_l, and v) and of the oscillation condition (with respect to w_l/w_c) are exploited to reduce the number of explorations. In particular, for the oscillation frequency of RVCOs, the following inequality is derived from (5),
\[ \frac{\partial f_{osc}}{\partial l} < 0, \quad \frac{\partial f_{osc}}{\partial w_l} < 0, \quad \frac{\partial f_{osc}}{\partial n} < 0, \quad \frac{\partial f_{osc}}{\partial v} > 0 \tag{8} \]
where f_osc is the oscillation frequency at a given x. As stated in the inequality, f_osc has negative partial derivatives with respect to l, w_l, and n, and a positive one with respect to v. This characteristic is also the reason for separating coarse and fine frequency searching. As shown in Fig. 5(a), which plots the frequency with respect to the number of stages, the target frequency f_target must be bracketed by the oscillation frequencies at v_max and v_min. This bracket forms the boundary within which x_1 exists. We use a coarse search to find this boundary set (the collection of pairs of upper- and lower-boundary x) and a fine search to sample X_1 from the boundary set by sweeping v. As the exploration space in which v is adjusted is reduced from X_sub to the boundary set, the computation is reduced.
To sample X_1 from X_sub, three design variables are updated with post-layout transient simulations in loop 1, which ends when X_sub has been explored completely. The algorithm first chooses v ∈ {v_max, v_min}, then fixes a valid value of w_l. Next, for the current parameters (l, w_c, w_l, v), it scans the valid values of n. Instead of iterating over all possible values of n linearly, the proposed algorithm utilizes the 1/n dependency of f_osc (which is revealed in (5)). To be specific, as the oscillation period is roughly proportional to the number of stages, the value of n that yields the target oscillation frequency can be computed from (9), where f_osc,start is the measured oscillation frequency at the initial point (n = n_start, v ∈ {v_max, v_min}), and n_ub and n_lb correspond to the upper and lower boundary values of n based on the frequency constraint, respectively. As the oscillation frequency slightly deviates from the value estimated from the 1/n dependency, several iterations are required to find the actual values of n_ub and n_lb. To increase accuracy, we use the 1/n^α model for the second estimation (see Fig. 5(b)). After the range of n is found, the candidates {x : n_lb ≤ n ≤ n_ub} are evaluated at the end of loop 1. This approach quickly finds X_1, where the target oscillation frequency is covered by the control voltage range (v_max, v_min). For example, the case illustrated in Fig. 5(b) takes 11 simulation steps to discover X_1, while a linear search would take 20 steps to reach the same point.
While searching for the boundary values of n, candidates x that are incapable of meeting the frequency constraint are excluded from the search list, based on the observations at the boundary conditions and the monotonicity in (8). This process is called pruning, as multiple candidates with related parameter values are removed from the search scope without any simulations.
To be specific, if the measured oscillation frequency of a candidate is lower than f_target at v = v_max, candidates with a longer channel length l cannot meet the oscillation frequency constraint when they share all other parameters with the measured candidate. Therefore, those candidates are pruned without running actual simulations. The same property applies to w_l and n (see Fig. 5(c)). It should be noted that the core inverter width w_c is not considered to have a monotonic effect on the oscillation frequency, as increasing it does not always lead to a higher oscillation frequency due to increased wiring capacitance. In addition to the frequency-based pruning, the allowable latch ratio for each value of (l, n) is estimated and exploited for pruning. When the post-layout simulation reveals that the current candidate is not strong enough to sustain oscillation, its latch ratio is computed and all unexplored candidates with a smaller latch ratio than the current value are excluded from further exploration. The exclusion list is checked during each coarse frequency searching step to reduce the number of post-layout simulations.
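The stage-count estimation and the frequency-based pruning of Step 2 can be sketched as follows. All names are hypothetical. The floor/ceil rounding of the bounds and the combined (componentwise) pruning rule are assumptions layered on the 1/n and 1/n^α models and on the monotonicity in (8); in the actual flow the frequencies would come from post-layout transient simulations.

```python
import math
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    l: float    # channel length
    w_c: float  # core inverter width
    w_l: float  # latch inverter width
    n: int      # number of delay stages


def estimate_n(n_start: int, f_start: float, f_target: float,
               alpha: float = 1.0) -> float:
    """Solve f ~ 1/n**alpha for the n expected to give f_target."""
    return n_start * (f_start / f_target) ** (1.0 / alpha)


def fit_alpha(n1: int, f1: float, n2: int, f2: float) -> float:
    """Fit the exponent of the 1/n**alpha model from two measured points."""
    return math.log(f1 / f2) / math.log(n2 / n1)


def stage_bounds(n_start: int, f_at_vmax: float, f_at_vmin: float,
                 f_target: float, alpha: float = 1.0) -> Tuple[int, int]:
    """Return (n_lb, n_ub) such that f_target lies between f(v_min) and f(v_max).

    Assumed rounding: the largest n still fast enough at v_max is the upper
    bound, the smallest n still slow enough at v_min is the lower bound.
    """
    n_ub = math.floor(estimate_n(n_start, f_at_vmax, f_target, alpha))
    n_lb = math.ceil(estimate_n(n_start, f_at_vmin, f_target, alpha))
    return n_lb, n_ub


def prune_too_slow(pending: List[Candidate], measured: Candidate) -> List[Candidate]:
    """Drop candidates that cannot be faster than one already too slow at v_max.

    Uses the monotonicity in l, w_l, and n; w_c must match because it is not
    treated as monotonic.
    """
    return [c for c in pending
            if not (c.w_c == measured.w_c and c.l >= measured.l
                    and c.w_l >= measured.w_l and c.n >= measured.n)]
```

A second measured point can be passed to `fit_alpha` to refine the exponent before calling `stage_bounds` again, mirroring the second estimation step described above. The mirrored rule (a candidate that is too fast at v_min prunes shorter, smaller-latch, fewer-stage candidates) follows the same pattern.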
Step 3. (Fine Frequency Searching): After the set of candidates (X_1 in Fig. 4) is determined, an additional process is performed in loop 2 to find the actual value of v that yields the target oscillation frequency. This step is called fine frequency searching. During fine frequency searching, the potential value of v is estimated by linear interpolation (the initial interpolation seeds are provided by v_max and v_min). The oscillation frequency is then measured at the estimated control voltage, and the result is used for the next interpolation step. By iterating the interpolation process, the current design parameter vector x converges to the point that yields the target oscillation frequency f_target, with its measured power consumption provided.
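The interpolation loop of Step 3 can be sketched as a bracketed linear interpolation (a regula falsi iteration). This is an illustrative sketch: `measure` stands in for a post-layout transient simulation, and `toy_rvco` is a made-up monotonic frequency model used only to exercise the loop, not a model from the paper.

```python
from typing import Callable, Tuple


def fine_frequency_search(measure: Callable[[float], float],
                          v_min: float, v_max: float, f_target: float,
                          tol: float = 0.01, max_iter: int = 20) -> Tuple[float, float]:
    """Find a control voltage whose measured frequency is within tol of f_target."""
    v_lo, v_hi = v_min, v_max
    f_lo, f_hi = measure(v_lo), measure(v_hi)
    # The coarse search is expected to guarantee that the target is bracketed.
    if not (f_lo <= f_target <= f_hi):
        raise ValueError("f_target is not bracketed by f(v_min) and f(v_max)")
    for v, f in ((v_lo, f_lo), (v_hi, f_hi)):
        if abs(f - f_target) <= tol * f_target:
            return v, f
    for _ in range(max_iter):
        # Linear interpolation between the two current bracket points.
        v = v_lo + (f_target - f_lo) * (v_hi - v_lo) / (f_hi - f_lo)
        f = measure(v)
        if abs(f - f_target) <= tol * f_target:
            return v, f
        if f < f_target:
            v_lo, f_lo = v, f
        else:
            v_hi, f_hi = v, f
    raise RuntimeError("fine frequency search did not converge")


def toy_rvco(v: float) -> float:
    """Made-up monotonic frequency model (Hz) standing in for a simulation."""
    return 1e10 * (v - 0.3) ** 2 / v


if __name__ == "__main__":
    v, f = fine_frequency_search(toy_rvco, 0.75, 0.85, 3.2e9)
    print(f"v = {v:.4f} V, f = {f / 1e9:.3f} GHz")
```

Each call to `measure` corresponds to one simulation, so a tight initial bracket from the coarse search directly reduces the number of runs.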
Step 4. (Phase-Noise Searching): After finishing the frequency searching step, we obtain the set of candidates that meet the frequency constraint (X_1 in Fig. 4). The vectors are then passed to the next step to find candidates that fulfill the phase noise requirement, with an (optional) preclusion process based on the power consumption constraint. For the candidates that meet the frequency and power constraints, we run periodic-steady-state and phase noise simulations to measure their phase noise profile. If the phase noise performance does not meet the target specification, the candidate is discarded. After finishing the phase noise evaluations over the candidates with the current length and core inverter width (l, w_c), the algorithm advances to the next value of w_c for further exploration. Instead of sweeping w_c with a uniform step, the step size is adaptively adjusted to speed up convergence, based on the number of candidates that pass the phase noise constraint among the candidates that meet the frequency constraint during the current trial. To be specific, the metric that determines the step size of w_c is defined as follows:
\[ \text{score} = \frac{\text{number of candidates that meet the phase noise constraint}}{\text{number of candidates that meet the frequency constraint}} \]
If the score is zero, a coarse step is applied to the increment of w_c, because no candidate meets the phase noise constraint with the current value of w_c. This means the candidates dissipate too little power to achieve the phase noise constraint. As the value of w_c increases, some of the associated candidates meet the phase noise requirement, raising the score (0 < score < 1), as illustrated in Fig. 6. The algorithm then reduces the step size of w_c to find as many near-optimal values of w_c and their associated candidates as possible.
After the score becomes one (all candidates that meet the frequency constraint also meet the phase noise constraint), loop 3 is terminated (optionally after a few additional trials) to avoid running unnecessary simulations that would only discover candidates with excessive phase noise margin. Fig. 6 shows the dynamic step adjustment for the w_c sweep and the resulting candidate trajectories. As shown in the figure, more candidates meet the phase noise requirement as the value of w_c increases (at the expense of additional power consumption). When loop 3 ends, the algorithm holds a set of design parameter vectors (w_c, w_l, n, v) that achieve the target frequency and phase noise constraints for the current value of l. By repeating the candidate searching process over the entire range of l, the algorithm finally constructs a complete set of x for the target constraints.
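The adaptive sweep of w_c in Step 4 can be sketched as below. The function names and step sizes are hypothetical, and this sketch stops as soon as the score reaches one, whereas the text allows a few optional extra trials.

```python
from typing import Optional


def pass_score(n_freq_pass: int, n_noise_pass: int) -> float:
    """Fraction of frequency-passing candidates that also pass phase noise."""
    if n_freq_pass == 0:
        return 0.0  # nothing met the frequency constraint at this w_c
    return n_noise_pass / n_freq_pass


def next_core_width(w_c: int, score: float,
                    coarse_step: int, fine_step: int) -> Optional[int]:
    """Return the next w_c to try, or None when loop 3 should stop."""
    if score >= 1.0:
        return None                # every candidate already passes
    if score == 0.0:
        return w_c + coarse_step   # far from the passing region: move fast
    return w_c + fine_step         # partially passing: refine the sweep


if __name__ == "__main__":
    # Illustrative trial results as (frequency-passing, noise-passing) counts.
    w_c = 4
    for n_freq, n_noise in [(5, 0), (6, 2), (6, 6)]:
        s = pass_score(n_freq, n_noise)
        print(w_c, round(s, 2))
        nxt = next_core_width(w_c, s, coarse_step=4, fine_step=1)
        if nxt is None:
            break
        w_c = nxt
```

The integer widths here stand for unit device widths (for example, finger or fin counts), which is an assumption made to keep the step arithmetic exact.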
Step 5. (Candidate Suggestion): After the algorithm finishes the phase noise searching step, it outputs the optimal design vector x_4 that yields the lowest power consumption, together with other near-optimal candidates for further tradeoff exploration. An example of candidate suggestion is described in the next section.

IV. CASE STUDY: DDR5
To verify the proposed RVCO design methodology, RVCOs oscillating at the maximum clock rate (3.2 GHz) of DDR5 applications [24] are produced in 40-nm and 7-nm technologies. The target control voltage range is set between 75% and 85% of the nominal supply voltage, and the desired number of phases is set to 4 for quadrature clock generation. The overall target specification is summarized as follows:
search: x = (l, w_c, v, w_l, n)
minimize: P_osc(x)
subject to: f_osc(x) = 3.2 GHz with 5% tolerance
            L(x, 1 MHz) ≤ −90 dBc/Hz
The proposed method suggests 32 final candidates (including the best candidate that achieves the lowest power consumption). Fig. 7 shows the characteristics of the output candidates in 40-nm (black dots) and 7-nm (red stars) technologies. All selected candidates meet the target specification with 5% tolerance in frequency error. The power consumption does not include additional circuits for control voltage generation (such as a voltage regulator), as those circuit topologies vary significantly across designs. It is worth mentioning that the oscillator figure of merit (FOM) does not improve with technology scaling. Among the candidates, the one that achieves the smallest power consumption is suggested, while all other candidates are summarized for the designer's consideration as shown in Table 1. Fig. 8 shows a schematic with the parameters of the lowest-power candidate in each technology and the layout in 40-nm technology. By utilizing the proposed methodology, we can analyze the relationship between the design variables and the performance of the candidates. Fig. 9 shows the relationships between the FOM and two design variables: the latch ratio (w_l/w_c) and l. The FOM is defined in Table 1. From Fig. 9, it is observed that to achieve low phase noise with low power consumption at the same frequency, the device length should be larger and the latch ratio should be smaller, which is consistent with [19].
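The specification check and the final candidate suggestion for this case study can be sketched as follows. The thresholds (3.2 GHz with 5% tolerance, −90 dBc/Hz at 1 MHz offset) come from the specification above; the class, function names, and the sample results are made up for illustration and are not measured data from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Result:
    name: str
    f_osc_hz: float    # measured oscillation frequency
    pn_dbc_hz: float   # phase noise at 1 MHz offset
    power_w: float     # oscillator power consumption


def meets_spec(r: Result, f_target: float = 3.2e9, tol: float = 0.05,
               pn_max: float = -90.0) -> bool:
    """Check the frequency tolerance and the phase noise ceiling."""
    return abs(r.f_osc_hz - f_target) <= tol * f_target and r.pn_dbc_hz <= pn_max


def suggest(results: List[Result]) -> Tuple[Optional[Result], List[Result]]:
    """Return (lowest-power passing candidate, all passing candidates by power)."""
    passing = sorted((r for r in results if meets_spec(r)), key=lambda r: r.power_w)
    return (passing[0] if passing else None), passing


if __name__ == "__main__":
    # Illustrative values only.
    demo = [
        Result("a", 3.25e9, -91.0, 0.70e-3),
        Result("b", 3.10e9, -90.5, 0.60e-3),
        Result("c", 3.50e9, -93.0, 0.50e-3),  # frequency error above 5%
        Result("d", 3.20e9, -88.0, 0.40e-3),  # phase noise too high
    ]
    best, passing = suggest(demo)
    print(best.name, [r.name for r in passing])
```

Returning the full sorted list alongside the best candidate mirrors the way the method reports near-optimal candidates for further tradeoff decisions.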

V. CONCLUSION
In this paper, we present an automated design methodology for ring voltage-controlled oscillators. The proposed methodology explores the design space and finds the candidates that meet the target specification by automatically generating sized schematics and layouts and running post-layout simulations iteratively. The number of post-layout simulations is reduced by utilizing the nature of ring oscillators and intermediate exploration results to exclude unsuitable candidates without running simulations. The method is applied to generate RVCOs in 40-nm and 7-nm CMOS technologies for DDR5 applications. The optimal candidates achieve 0.57 mW power consumption in 40-nm technology and 0.55 mW power consumption in 7-nm technology at 3.2 GHz.

ACKNOWLEDGMENT
The EDA Tool was supported by the IC Design Education Center (IDEC).