A Digital PLL-Based Phase Modulator With Non-Uniform Clock Compensation and Non-linearity Predistortion

In this article, we present a low-power digital phase-locked loop (PLL)-based phase modulator targeting low error vector magnitude (EVM). We introduce a new non-uniform clock compensation (NUCC) scheme to tackle an EVM degradation resulting from the beneficial use of a time-varying sampling clock that is re-timed to the phase-modulated carrier. We also employ a phase-domain digital predistortion (DPD) to combat the intrinsic non-linearity of an LC-type digitally controlled oscillator (DCO), thus avoiding the complications of frequency-dependent calibrations. The prototype, implemented in 40-nm CMOS, modulates the carrier in the range of 2.7–3.9 GHz from a 40-MHz reference. The measured EVM is −47 dB for a 60-Mb/s 64-PSK modulation under the case that the phase-modulated output is frequency-divided by $K=8$, i.e., when the DCO exhibits the most significant non-linearity due to the large fractional FM bandwidth. When $K=8$ or 4, the measured EVM remains below −43 dB across the carrier-frequency tuning range and without re-calibrating the DCO non-linearity.


I. INTRODUCTION
LIFETIME of a battery-operated radio for Internet-of-Things (IoT) applications is severely limited by the power consumption of its wireless transmitter (TX). Energy-efficient TX realizations are therefore a subject of great interest, which favors a digital polar TX architecture [1], [2], [3], [4]. The polar TX utilizes a phase modulation (PM) path in parallel with an amplitude modulation (AM) path to compose a complex-valued RF signal, as shown in Fig. 1. Implementations of the PM path typically perform a direct or a two-point frequency/phase modulation of an RF oscillator, e.g., a digitally controlled oscillator (DCO) in a phase-locked loop (PLL) [4], [5], [6], [7]. This solution renders unnecessary such power-hungry PM blocks as delay lines [3], [8], [9] and IQ interpolators [2], [10], thus maximizing the system energy efficiency, especially at lower output power. On the other hand, IoT standards such as Wi-Fi HaLow are currently evolving toward high-order modulation schemes, such as 256-QAM, which requires an error vector magnitude (EVM) below −32 dB for the entire TX. From a system perspective, the AM path is usually allowed to consume a greater portion of the EVM budget since it handles a large signal amplitude and is more prone to nonlinearity and EVM degradation. As a result, the PM path is allocated a much lower portion of the EVM budget (e.g., ≤−40 dB).
Although the recently published PLL-based phase modulators have reported EVM below −40 dB [11], [12], maintaining such performance is challenging under some practical system-level constraints. One is that the ever-widening signal bandwidth ($BW_{sig}$) in advanced communication standards tends to become a large fraction of the RF channel frequency ($f_{RF}$), i.e., $BW_{sig}/f_{RF}$ grows, ultimately aggravating the $1/\sqrt{LC}$-induced nonlinearity of the DCO. For example, Wi-Fi HaLow may use a signal bandwidth up to 16 MHz around 800 MHz, resulting in $BW_{sig}/f_{RF} \approx 2\%$. If this signal is transmitted by a polar TX, the DCO on the PM path needs to update at a frequency much higher than $BW_{sig}$ to suppress the replicas and spectral regrowth due to the FM expansion [5]; e.g., the update frequencies in [7], [13], and [14] are over 16× $BW_{sig}$. The FM bandwidth ($BW_{FM}$) must in turn be comparable to the update frequency, or even equal to it to guarantee the PM range of [−π, π] [11], [15]. Consequently, $BW_{FM}$ can be many times wider than $BW_{sig}$, covering a portion of $f_{RF}$ much higher than 2%. Across such a wide FM range, an LC-tank DCO will exhibit significant nonlinearity due to its $1/\sqrt{LC}$ conversion law [16].
So far, the DCO nonlinearity has been tackled by predistorting the oscillator tuning word (OTW). Noting that the predistortion setting is highly frequency-sensitive, [17], [18] calibrate the settings in the foreground at multiple frequency points. This not only costs extra power but may also fail to maintain the optimum EVM since a foreground calibration cannot track the relevant parameters under temperature and supply drift. Although the background calibration in [7] and [12] addresses the drawbacks of the foreground calibration, the convergence times are long, e.g., up to 100 ms in [7]. Considering that the background calibration there involves not only the nonlinearity but also the DCO gain ($K_{DCO}$) [12], which is cubically related to the channel frequency [16], the calibration results can easily become invalid after hopping to a reasonably distant channel. Therefore, re-calibration may be frequently needed during channel hopping, wasting considerable time and energy.
Another challenging system-level constraint is that the phase modulator should operate at a non-uniform clock aligned with the channel-dependent and phase-modulated RF clock [14], [19], [20], [21], [22], such as the variable clock (CKV) in Fig. 1. As shown, the digital polar TX uses multiple clock domains (i.e., CKU, CKV, CKD) to allow sufficiently high clock sampling rates of each block while being aware of their effects on power consumption. Aligning all the clocks with a common reference, i.e., CKV, helps to avoid data misalignment and glitches during cross-clock-domain data synchronization. This prevents the EVM and output spectrum from getting degraded by glitches of AM data [21] and misalignment between AM and PM signals [22].
Two strategies are widely utilized to generate the phase modulator's updating clock (CKU) that is synchronous with CKV. One is to frequency-divide the CKV [14], [17], [19], [22], [23]; the other is to re-time the significant edge of the PLL's reference clock (FREF) by that of CKV [1], [21], [24], as exemplified by the CKU generation timing diagram in Fig. 1 (in this design, the significant edges of FREF and CKV are both falling, while those of CKU are rising). Since CKV is phase modulated, any clock synchronous with CKV will exhibit some non-uniformity: the clock periods are time-varying, and the offsets between its significant edges and those of an ideal uniform clock (e.g., those between CKU and FREF in Fig. 1) vary across cycles. PLL-based phase modulators have overwhelmingly adopted the two-point modulation scheme [25], [26], which directly modulates the DCO phase through one feed point and eliminates the excess phase prior to the phase detector through the other feed point. The non-uniform period and time-varying offset of the generated clock will therefore affect, respectively, the DCO PM and the excess phase elimination (details will come in Section II-B). These two mechanisms will disturb the PLL and finally degrade the EVM. Currently, the prior art [14], [23] merely tackles the effects of period variation, but ignores the impairment related to offset variation. Even for the period variation compensation, the existing methods are only valid for the CKU generated by dividing CKV, whose period is determined by the instantaneous CKV frequency, but cannot be extended to the case of using the reference clock re-timed to CKV, whose period is affected by the accumulative CKV phase.
Fig. 2. Discrete-time domain model of an ideal PLL-based phase modulator with a two-point modulation. The gains of DCO and phase detector, respectively, $K_{DCO}$ and $K_{PD}$, are implied as normalized, as in [27], hence hidden.
In this article, being an extended version of [28], we propose a phase modulator for a polar TX that utilizes a two-point PLL modulation scheme and updates data at a non-uniform digital clock, which is generated by re-timing the reference clock to CKV, thus inevitably disturbing the PLL and degrading the EVM of the output signal. To analyze the variations and effects of the re-timed clock, we extend the conventional discrete-time phase modulator model to a hybrid-time domain (Section II). Based on this new model, we propose a non-uniform clock compensation (NUCC) scheme to suppress the disturbance on the PLL and improve the PM accuracy (Section III). Furthermore, a phase-domain digital predistortion (DPD) is also proposed to combat the $1/\sqrt{LC}$-induced DCO nonlinearity (Section IV). Parameters of the proposed DPD are established analytically, thereby avoiding the hardship of a frequency-dependent calibration. The implemented phase modulator (Section V) was experimentally verified with a 60-Mb/s 64-PSK signal to prove the efficacy of the proposed NUCC and phase-domain DPD (Section VI).

II. MODELING A PLL-BASED PHASE MODULATOR
A. Ideal Phase Modulator Model in Discrete-Time Domain
Fig. 2 shows a discrete-time domain model of an ideal PLL-based phase modulator. To produce the CKV clock with the excess phase $\phi'_V$ (i.e., excluding the carrier component), the desired modulation commanding phase $\theta_M$ is first normalized by $1/(2\pi)$ to $\phi_M$. Then, $\phi_M$ is differentiated to $\Delta\phi_M$, which is the target phase shift to be developed by $\phi'_V$ during a single reference cycle. $\Delta\phi_M$ modulates the PLL through two feeding points [29], defined as direct modulation (DM) and phase prediction (PP). Through the DM point, $\Delta\phi_M$ directly modulates the DCO. Due to its phase integration nature [30], the DCO accumulates $\Delta\phi_M$ cycle by cycle such that the excess phase $\phi'_V$ follows $\phi_M$. Meanwhile, the PP-related path also emulates the DCO behavior for its elimination purpose, i.e., by accumulating $\Delta\phi_M$ and then delaying it to predict the DCO excess phase as $\phi'_R$. Any residual difference between the sampled and predicted phases, $\phi_E$, will be detected and gradually corrected by the loop.
Ideally, $\phi'_R$ exactly cancels the sampled $\phi'_V$, so $\phi_E = 0$ signifies that the loop is oblivious to the modulation "perturbations." In practice, however, errors will occur in relation to these two feed points. The DM-induced error is denoted as $\phi_{E,DM}$ and stems from various impairments of the DCO, such as its phase noise and frequency quantization, as well as the nonlinearity of its FM characteristics. Without the feedback loop, even a tiny but persistent $\phi_{E,DM}$ can accumulate without bound in the DCO as a PM error. Fortunately, a closed-loop PLL will gradually correct it, thus preventing the accumulation in the long run. A wider PLL bandwidth helps to suppress the effects of $\phi_{E,DM}$, but it makes the PM accuracy more vulnerable to the PP-induced error, i.e., $\phi_{E,PP}$, which stems from the phase detector's noise and nonlinearity, as well as the prediction error of $\phi'_R$. This implies an optimum PLL bandwidth that balances the PM error due to $\phi_{E,DM}$ and $\phi_{E,PP}$. However, the optimum bandwidth is merely a trade-off. To achieve a lower EVM, this work focuses on minimizing both $\phi_{E,DM}$ and $\phi_{E,PP}$.
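The cancellation between the two feed points can be illustrated with a minimal open-loop sketch. It assumes normalized gains, omits the loop filter and the common path delay, and models the DM impairment as a simple DCO gain error; the function name `open_loop_phase_error` and the parameter `dco_gain_error` are hypothetical and not taken from the article.

```python
import numpy as np

def open_loop_phase_error(phi_m, dco_gain_error=0.0):
    """Return phi_E[n] = phi'_V[n] - phi'_R[n] with the loop filter omitted."""
    dphi_m = np.diff(phi_m, prepend=0.0)      # differentiate phi_M
    # DM path: the DCO integrates the (possibly mis-scaled) increments.
    phi_v = np.cumsum((1.0 + dco_gain_error) * dphi_m)
    # PP path: an ideal accumulator predicts the DCO excess phase.
    phi_r = np.cumsum(dphi_m)
    return phi_v - phi_r

# Normalized modulation phase in cycles (illustrative values only).
phi_m = np.array([0.0, 0.125, -0.25, 0.375, 0.5])
ideal_error = open_loop_phase_error(phi_m)
skewed_error = open_loop_phase_error(phi_m, dco_gain_error=0.01)
print(ideal_error)
print(skewed_error)
```

With matched paths the error stays at zero, whereas a 1% gain error in the DM path leaves a residue proportional to the modulation phase, which a closed loop would then have to correct.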

B. DCO Model in Hybrid-Time Domain
The DCO model in Fig. 2 is merely a discrete-time domain approximation. It assumes that both the modulating input $\Delta\phi_M$ and the developed output phase $\phi'_V$ update simultaneously on the same uniform clock grid, and is thus incapable of properly handling the effects of clock impairments, i.e., the FM-induced skew and period variations. To include these non-idealities, the DCO model is expanded to a hybrid (i.e., discrete/continuous) time domain, with the diagram and waveforms shown in Fig. 3. The DCO is basically an FM device whose offset frequency $f_M$ from the $f_0$ carrier changes instantaneously in response to the OTW that is updated by the CKU clock. This FM characteristic is modeled in the discrete-time domain. To be consistent with the discrete-time DCO in Fig. 2, we expediently use an ideal CKU aligned with the PLL's reference (FREF), but we will add the timing non-idealities to the CKU later. Considering that OTW is denormalized from $\Delta\phi_M$ by $f_{REF}/K_{DCO}$, where $f_{REF}$ is the frequency of FREF and $K_{DCO}$ is the DCO FM transfer gain, $f_M$ during the nth clock cycle is related to $\Delta\phi_M$ by
$$f_M[n] = \Delta\phi_M[n] \cdot f_{REF} = \frac{\Delta\phi_M[n]}{T_{REF}} \tag{1}$$
where $T_{REF}$ is the period of FREF. On the other hand, the DCO also exhibits a phase-accumulation characteristic, with which it acquires the excess phase $\phi'_V$ by integrating $f_M$ over time [31], i.e., $\phi'_V(t) = \int f_M(t)\,dt$. This characteristic is modeled in the continuous-time domain, and a zero-order hold is added to convert the discrete-time $f_M[n]$ to the continuous-time $f_M(t)$ [32]. Thus, the continuous-time $\phi'_V(t)$ can be described as
$$\phi'_V(t) = \phi'_V(nT_{REF}) + f_M[n] \cdot (t - nT_{REF}), \quad nT_{REF} \le t < (n+1)T_{REF} \tag{2}$$
With the ideal CKU, sampling $\phi'_V(t)$ at the FREF grid therefore returns exactly the phase predicted by the PP path in Fig. 2. Consequently, no error will be detected and so the PLL remains unperturbed. Note that two conditions should be satisfied to perfectly cancel the sampled and predicted phases.
First, from the phase accumulation aspect, the excess phase shift in the nth clock cycle should exactly equal the input $\Delta\phi_M[n]$. Aside from an $f_M$ error caused by the DCO FM nonlinearity, this condition can also be impaired by the DCO phase-accumulation time ($T_{acc}$) deviating from $T_{REF}$ [33]. This occurs if CKU is time-varying, as in Fig. 1. Then, the CKU period variation will degrade the PM accuracy through $\phi_{E,DM}$. Second, from the phase-detection perspective, the DCO update clock CKU should ideally align with the sampling clock FREF. If any offset exists (this will be discussed in Section II-C), $\phi'_R$ will not precisely predict $\phi'_V$. The associated error adds to $\phi_{E,PP}$, thereby disturbing the PLL and affecting the EVM.
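The first condition can be made concrete with a small numeric sketch. It assumes a zero-order-held offset frequency $f_M = \Delta\phi_M / T_{REF}$ and an ideal DCO integrator, so that the phase developed in one cycle is $f_M \cdot T_{acc}$. The function name `dm_phase_error` and the 100-ps period deviation are illustrative assumptions, not values from the article.

```python
def dm_phase_error(dphi_m, t_acc, t_ref):
    """Excess-phase error when the DCO integrates f_M for t_acc instead of t_ref."""
    f_m = dphi_m / t_ref        # offset frequency held during the cycle
    developed = f_m * t_acc     # phase actually accumulated by the DCO
    return developed - dphi_m   # deviation from the target phase shift

T_REF = 25e-9                   # period of a 40-MHz reference
# Hypothetical case: the CKU period is 100 ps longer than T_REF.
err = dm_phase_error(dphi_m=0.25, t_acc=T_REF + 100e-12, t_ref=T_REF)
print(err)                      # phase error in cycles
```

Under these assumptions the error scales with both the commanded phase step and the relative period deviation, which is why wideband signals are affected the most.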

C. Hybrid-Time Model of Phase Modulator
A realistic CKU might not be perfectly aligned with FREF due to various circuit delays on the FM path, e.g., CKU's propagation delay and DCO's settling time. For simplicity, all these delays are included in the nominally constant offset between FREF and CKU, i.e., $t_{cnst}$ (exaggerated) in Fig. 4(a). Then, $\phi'_R$ predicts $\phi'_V(t)$ sampled at the CKU grid, instead of that at FREF. Therefore, using $\phi'_R$ for the phase detection leaks some $\phi'_V$ information to $\phi_{E,PP}$, resulting in an error equal to the phase that $\phi'_V(t)$ develops over $t_{cnst}$. To capture this leakage mechanism due to the $t_{cnst}$ skew, the hybrid model emphasizes the clock domains: FREF is used in the $\phi'_V$ sampling, whereas CKU drives all the remaining discrete-time blocks and updates the DCO's $f_M$. Furthermore, this model also introduces the correction term $\phi_{R2S}$, which converts $\phi'_R$ to $\phi'_S$, the prediction of $\phi'_V$ at the FREF grid. Utilizing $\phi'_S$ for phase detection can completely avoid the $\phi'_V$ leakage. It should be noted that [15] has also found this $\phi'_V$ leakage mechanism, defined as "delay spread," and compensated for it by recursively predicting $\phi'_S$. However, [15] considers only the case of constant $t_{cnst}$. In the non-uniform CKU case (to be discussed in Section III), CKU's offset relative to FREF becomes time-varying. Under such a condition, using $\phi_{R2S}$ to predict $\phi'_S$ can be more convenient, since it only involves the phase accumulation within one CKU cycle, and the prediction error would not propagate to or accumulate on subsequent cycles due to the non-recursive form.

III. NON-UNIFORM CLOCK COMPENSATION
A. Foundation for NUCC: $t_S$ Estimation
Due to the system-level constraints discussed in Section I, the proposed phase modulator adopts the update clock CKU that is generated by re-timing the FREF falling edge to the 5th subsequent CKV falling edge (for timing reasons), as shown in Fig. 5(a). Consequently, CKU shows a time-varying offset (relative to FREF) and period, thus contributing errors to $\phi_{E,PP}$ and $\phi_{E,DM}$, respectively. To tackle these errors, the first step is to estimate the variations of the CKU offset and period. This entails knowing $t_S$, i.e., the instantaneous time offset between FREF and its first subsequent CKV edge, for two reasons. First, regarding the CKU's offset from FREF, $t_S$ dominates the variation component because this offset breaks down into two parts: $t_S$ and four CKV periods (i.e., $4T_{CKV}[n]$, where $T_{CKV}[n]$ is the CKV period during the nth CKU cycle). The former varies across CKU cycles; the latter is roughly constant at about four times the average CKV period, i.e., $t_{cnst} \approx 4T_{CKV}$, given that $BW_{FM}$ is sufficiently smaller than the DCO carrier frequency ($f_0$). Second, regarding the CKU period, its variation can be simply derived by differentiating the relevant offsets, more specifically the $t_S$ values.
Actually, the $t_S$ prediction is widely used in recent PLLs to narrow down the phase detectors' input range [34], [35], [36], [37], [38]. Predicting $t_S$ requires the absolute phase of CKV, i.e., $\phi_V$, which counts not only the excess phase $\phi'_V$ due to modulation, but also the carrier phase $\phi_C$. Using the predicted $\phi_V$ at the FREF grid, i.e., $\phi_S$, $t_S$ in the nth CKU cycle can be predicted as follows:
$$t_S[n] = \left(1 - \phi_{S,frac}[n]\right) \cdot T_{CKV} \tag{5}$$
where $\phi_{S,frac}$ is the fractional part of $\phi_S$. To facilitate the $t_S$ prediction, the phase modulator model in Fig. 5(c) includes the DCO's carrier phase $\phi_C$. On the direct-modulation side, $\phi_C$ is modeled by integrating the DCO carrier frequency $f_0$ over time. Then $\phi_C$ adds to $\phi'_V$ to represent the absolute CKV phase $\phi_V$. On the phase-prediction side, the frequency control word (FCW), i.e.,
$$FCW = \frac{f_0}{f_{REF}} \tag{6}$$
is accumulated to reflect the behavior of $\phi_C$ at the FREF grid. The accumulated FCW adds to $\phi'_S$ (the prediction of $\phi'_V$ at the FREF grid), yielding $\phi_S$. With its fractional part $\phi_{S,frac}$, the NUCC block can predict $t_S$ as well as estimate the CKU's period and offset deviation relative to FREF, and then compensate the associated effects on $\phi_{E,DM}$ and $\phi_{E,PP}$ with $\phi_{DMC}$ and $\phi_{R2S}$, respectively.
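A minimal sketch of this prediction follows, assuming that $\phi_S$ is formed by accumulating FCW plus the per-cycle phase target and that $t_S$ equals the remaining fraction of a CKV period, $(1 - \phi_{S,frac}) \cdot T_{CKV}$, as also described for the fine path of the phase detector. Path delays are ignored, and the function name `predict_t_s` and the FCW value of 79.75 are hypothetical choices for illustration.

```python
import math

def predict_t_s(fcw, dphi_m, t_ckv):
    """Predict t_S per reference cycle from the fractional part of phi_S."""
    phi_s = 0.0
    t_s = []
    for d in dphi_m:
        phi_s += fcw + d                          # carrier plus modulation phase
        phi_s_frac = phi_s - math.floor(phi_s)    # fractional part of phi_S
        t_s.append((1.0 - phi_s_frac) * t_ckv)    # time to the next CKV edge
    return t_s

FCW = 79.75                      # hypothetical frequency control word
T_CKV = 1.0 / (FCW * 40e6)       # average CKV period for a 40-MHz reference
t_s = predict_t_s(FCW, [0.0, 0.125, 0.0], T_CKV)
print([t / T_CKV for t in t_s])  # t_S normalized to the CKV period
```

Even without modulation, a fractional FCW makes $t_S$ walk from cycle to cycle; the modulation increment in the second cycle shifts it further.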

B. Tackling φ E,DM Due to CKU Period Variation
The $T_{acc}[n]$ variation relative to $T_{REF}$ can be estimated from the difference between the $t_S$ values of consecutive CKU cycles. Substituting (5), (6), and (9) into (8) yields the estimate of the resulting excess phase error, $\phi'_{V,E}[n]$, in terms of $\phi_{S,frac}$. To cancel it, the NUCC core adds to the direct-modulation-related path a compensation phase, $\phi_{DMC}$, equal to $-\phi'_{V,E}[n]$ in the next CKU cycle, as expressed in (10). One may also notice that $\phi'_{V,E}$ is post-compensated, i.e., corrected with one CKU cycle latency, and wonder if it would be better to predistort $\phi'_{V,E}$ to prevent this error from occurring. In fact, these two methods would result in the same simulated EVM. The reason is clarified in Fig. 7. Due to the phase integration feature of the DCO, compensating $\phi'_{V,E}$ takes one CKU cycle, instead of being completed immediately. Therefore, the $\phi'_{V,E}$-compensation error would stay on the $\phi'_V(t)$ trajectory for one clock cycle, whichever strategy is adopted.
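The post-compensation can be sketched as follows. This is an illustration under stated assumptions rather than the article's exact formulation: the per-CKV-period phase $\Delta\phi_M/FCW$ is scaled by $(\phi_{S,frac}[n] - \phi_{S,frac}[n-1])$, as Section V-B describes, and the choice of the previous cycle's $\Delta\phi_M$ (because the correction is applied one CKU cycle late) is an assumed indexing. The function name `phi_dmc` is hypothetical.

```python
def phi_dmc(dphi_m, phi_s_frac, fcw):
    """Compensation phase for the CKU period variation (assumed indexing)."""
    out = [0.0]                                   # nothing to correct in cycle 0
    for n in range(1, len(dphi_m)):
        # Change of the fractional phase; its negative is the CKU period
        # deviation normalized to the CKV period.
        frac_step = phi_s_frac[n] - phi_s_frac[n - 1]
        # Phase per CKV period of the previous cycle, scaled by that step.
        out.append(dphi_m[n - 1] / fcw * frac_step)
    return out

dphi = [0.25, 0.25, 0.25]          # per-cycle phase targets in cycles
frac = [0.75, 0.5, 0.25]           # fractional part of phi_S per cycle
comp = phi_dmc(dphi, frac, fcw=80.0)
print(comp)
```

In this example the fractional phase decreases, so each CKU period is a quarter CKV period longer than $T_{REF}$; the DCO over-accumulates a positive phase step, and the compensation is correspondingly negative.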

C. Addressing φ E,PP Due to CKU Offset Variation
Compared to the delay spread compensation in Fig. 4, the $\phi_{E,PP}$ compensation in NUCC specifically addresses the $\phi_{R2S}$ prediction error raised by the time-varying component of the offset between FREF and CKU. Similar to the scenario in (4), calculating $\phi_{R2S}[n]$ requires the instantaneous modulation [...] (see Fig. 6). This observation helps to locate $\phi'_R[n]$ on the $\phi'_V(t)$ trajectory in Fig. 8, and finally leads to the conclusion that $t_{R2S}[n]$ equals the time offset between FREF and CKU in the preceding CKU cycle [see (12)], where $t_{acc,S}[n]$ denotes the duration between the nth CKU and the subsequent FREF edges. Substituting (5), (6), and (12) into (11) yields a $\phi_{S,frac}$-based $\phi_{R2S}$ prediction, i.e., (13), where the $\phi_{DMC}$ term is ignored due to its negligible influence (on the order of $\Delta\phi_M/FCW^2$). $t_{cnst}$ in this expression characterizes the constant component of the offset between FREF and CKU, and can thus be estimated with the least mean squares (LMS) algorithm in [15]. Consequently, $\phi_{R2S}$, $\phi'_S$, and $\phi_S$ can be accurately predicted [see Fig. 5(c)]. This will not only compensate the $\phi_{E,PP}$ error due to the non-uniform CKU, but will also provide an accurate $\phi_{S,frac}$ for the $\phi_{E,DM}$ compensation in the next cycle [see (10)].

IV. DCO FREQUENCY ERROR COMPENSATION
A. Characterizing the Error Induced by $1/\sqrt{LC}$
Fig. 9 sketches an open-loop representation of the direct-modulation path in a PLL-based phase modulator. The instantaneous resonant frequency of the LC tank is controlled by a switched-capacitor (SC) bank, thereby suffering from errors related to the $1/\sqrt{LC}$-induced nonlinearity. As mentioned in Section I, these errors increase dramatically at higher values of the fractional FM bandwidth $BW_{FM}/f_0$. The quantitative analysis starts with the DCO carrier frequency $f_0 = 1/(2\pi\sqrt{L_0 C_0})$, where $L_0$ and $C_0$ are the tank's inductance and capacitance, respectively. With a capacitance change of $\Delta C$, the resonant frequency shifts by
$$\Delta f = f_0 \left[\left(1 + \frac{\Delta C}{C_0}\right)^{-1/2} - 1\right] \tag{14}$$
However, nearly all published frequency modulators utilize just the linear (or first-order) approximation of (14) to estimate the frequency shift due to $\Delta C$, i.e.,
$$\Delta f_{lin} = -\frac{f_0}{2} \cdot \frac{\Delta C}{C_0} \tag{15}$$
Consequently, a realistic DCO frequency shift deviates from the expected $\Delta f_{lin}$ with a relative error of
$$\frac{\Delta f - \Delta f_{lin}}{\Delta f_{lin}} \approx \frac{3}{2} \cdot \frac{\Delta f_{lin}}{f_0} \tag{16}$$
Considering that the maximum $\Delta f_{lin}$ during modulation equals half of the FM bandwidth (i.e., $BW_{FM}/2$), $BW_{FM}/f_0$ thus reflects the level of the $1/\sqrt{LC}$-induced FM error. According to the discussion above, a polar TX with invariant signal characteristics (e.g., $BW_{sig}$ and $BW_{FM}$) suffers from a higher $1/\sqrt{LC}$-induced PM error when it generates a lower RF channel frequency $f_{RF}$, simply due to the increased $BW_{FM}/f_0$, if the DCO directly oscillates at $f_{RF}$, i.e., $f_0 = f_{RF}$. However, in a practical polar TX, the DCO output may first be scaled down by a programmable frequency divider ÷K before it is input to the AM part (see Fig. 9) so as to extend the lower operational range of $f_{RF}$ [17]. Since ÷K allows the DCO to maintain the resonance at high frequency, i.e., $f_0 = K \cdot f_{RF}$, one may wonder how this would affect the nonlinearity characterized by $BW_{FM}/f_0$. Actually, ÷K also attenuates the DCO phase by K. To ensure the divided output maintains the desired phase $\theta_M$, it should be amplified by K before modulating the DCO (see Fig. 9). This forces $BW_{FM}$ to also expand by K. In the end, $BW_{FM}/f_0$ and the $1/\sqrt{LC}$-induced nonlinearity remain the same as in the basic case of $f_0 = f_{RF}$.
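The growth of this error with the fractional FM bandwidth can be checked numerically. The sketch below assumes an ideal LC tank, for which the exact shift is $\Delta f/f_0 = (1 + \Delta C/C_0)^{-1/2} - 1$ and the linear approximation is $\Delta f_{lin}/f_0 = -\Delta C/(2C_0)$. The function name `lc_fm_relative_error` and the sample bandwidth ratios are illustrative assumptions.

```python
import math

def lc_fm_relative_error(df_lin_over_f0):
    """Relative deviation of the exact LC-tank shift from the linear estimate."""
    # Capacitance change that the linear model associates with the target shift.
    dc_over_c0 = -2.0 * df_lin_over_f0
    # Exact normalized frequency shift of an ideal LC tank.
    df_over_f0 = 1.0 / math.sqrt(1.0 + dc_over_c0) - 1.0
    return (df_over_f0 - df_lin_over_f0) / df_lin_over_f0

# The peak linear shift equals BW_FM / 2, so sweep BW_FM / f0.
for frac_bw in (0.01, 0.04, 0.16):
    rel = lc_fm_relative_error(frac_bw / 2.0)
    print(f"BW_FM/f0 = {frac_bw:.2f}: relative FM error = {rel:.4f}")
```

For small shifts the relative error approaches 1.5 times the normalized shift, so the error roughly quadruples each time the fractional bandwidth quadruples in this sweep.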

B. Phase-Domain DPD
Since the DCO nonlinearity due to the $1/\sqrt{LC}$ law is well captured by the formulas above, it can be compensated by polynomials whose coefficients are determined purely analytically. As shown in Fig. 10(a), we predistort the nonlinearity in the phase domain with a second-order polynomial term, i.e., adding it to $\Delta\phi_M$. Derivation of this coefficient relies on the LC-DCO model in Fig. 9. Considering (14) and the capacitance change due to OTW, i.e., $\Delta C = -OTW \cdot C_U$, where $C_U$ is the capacitance of the SC units, the DCO frequency shift of $\Delta f$ would require an OTW of
$$OTW = \frac{C_0}{C_U}\left[1 - \left(1 + \frac{\Delta f}{f_0}\right)^{-2}\right] \tag{17}$$
By applying a Taylor series to (17) and exploiting (1) and (6), OTW can be written as a function of $\Delta\phi_M$:
$$OTW = \frac{2C_0}{C_U \cdot FCW} \sum_{i=1}^{\infty} (-1)^{i+1}\,\frac{i+1}{2} \cdot \frac{\Delta\phi_M^{\,i}}{FCW^{\,i-1}} \tag{18}$$
The coefficient of the linear $\Delta\phi_M$ term also equals $f_{REF}/K_{DCO}$, which is the denormalization factor from $\Delta\phi_M$ to OTW in the linearized DCO models, e.g., Fig. 3(a). Therefore, (18) can be rewritten as follows:
$$OTW = \frac{f_{REF}}{K_{DCO}}\left(\Delta\phi_M + \phi_{DPD}\right) \tag{19}$$
$$\phi_{DPD} = \sum_{i=2}^{\infty} (-1)^{i+1}\,\frac{i+1}{2} \cdot \frac{\Delta\phi_M^{\,i}}{FCW^{\,i-1}} \tag{20}$$
where $\phi_{DPD}$ can be used for the phase-domain DPD. In the implemented system, the terms with $i > 2$ are discarded as negligible.
Interestingly, prior works tend to predistort the DCO nonlinearity exclusively in the OTW domain [12], [17], [18], i.e., by adding a compensation signal $OTW_{DPD}$ into OTW [Fig. 10(b)], rather than into $\Delta\phi_M$. According to (19) and (20), $OTW_{DPD}$ significantly correlates with $K_{DCO}$, i.e.,
$$OTW_{DPD} = \frac{f_{REF}}{K_{DCO}} \cdot \phi_{DPD} = \sum_{i=2}^{\infty} (-1)^{i+1}\,\frac{i+1}{2}\left(\frac{K_{DCO}}{f_0}\right)^{i-1} OTW_{lin}^{\,i} \tag{21}$$
where $OTW_{lin}$ is the OTW linearly denormalized without DPD, i.e., $OTW_{lin} = \Delta\phi_M \cdot f_{REF}/K_{DCO}$. Considering that $K_{DCO}$ varies dramatically across frequency [16], it comes as no surprise that the prior works suffer from a frequency-dependent $OTW_{DPD}$ and thus require extensive calibration. In contrast, the phase-domain DPD can be calibration-free because the coefficients in (20) rely only on the foreknown FCW.
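The effect of a second-order phase-domain term can be illustrated numerically. The sketch assumes an ideal LC tank, a linear denormalization of the commanded phase to OTW, and a second-order coefficient of $-3/(2 \cdot FCW)$, which follows from a Taylor expansion of the ideal $1/\sqrt{LC}$ law under these assumptions. The helper names `dco_shift` and `predistort` and the operating point are hypothetical.

```python
import math

def dco_shift(dphi_cmd, fcw):
    """Normalized shift df/f0 of an ideal LC tank for a commanded phase step.

    With OTW linearly denormalized from dphi_cmd, dC/C0 = -2 * dphi_cmd / fcw.
    """
    dc_over_c0 = -2.0 * dphi_cmd / fcw
    return 1.0 / math.sqrt(1.0 + dc_over_c0) - 1.0

def predistort(dphi_m, fcw):
    """Add the second-order phase-domain DPD term (assumed coefficient)."""
    return dphi_m - 1.5 * dphi_m ** 2 / fcw

FCW = 80.0                       # hypothetical frequency control word
dphi_m = 0.5                     # half-cycle phase step per reference cycle
target = dphi_m / FCW            # desired df/f0
err_without = dco_shift(dphi_m, FCW) - target
err_with = dco_shift(predistort(dphi_m, FCW), FCW) - target
print(err_without, err_with)
```

In this toy model the second-order term removes the dominant quadratic error and leaves a much smaller cubic residue, and the only parameter the correction needs is FCW.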
Note that the phase-domain DPD mainly tackles the nonlinearity caused by the $1/\sqrt{LC}$ law. As for that caused by device mismatches, the OTW-domain DPD can address it with relatively fixed settings since the mismatch is expected to be stable after fabrication [16]. Therefore, combining the OTW- and phase-domain DPD ultimately leads to a frequency-insensitive solution to address the DCO nonlinearity, i.e., the combinational DPD in Fig. 10(c).

A. System Overview
Fig. 11(a) presents an overview of the implemented phase modulator. The main body is a time-mode-arithmetic-unit (TAU)-based PLL reported in [39], which natively operates in a fractional-N regime. The phase error (i.e., normalized timing of CKV relative to FREF), $\phi_E$, is extracted by the TAU-based phase detector, then passed through the digital loop filter to be iteratively corrected by tuning the DCO through $OTW_{TRC}$ (the OTW for carrier tracking). The phase detector extracts $\phi_E$ according to $\phi_S$, i.e., the predicted CKV phase $\phi_V$ at the FREF grid, in a coarse-fine style. The coarse path counts the number of CKV edges, representing the integer part of $\phi_V$, then cancels it with the integer portion of $\phi_S$, i.e., $\phi_{S,int}$. On the fine path, the TAU samples $t_S$, reflecting the fractional $\phi_V$, and cancels it with $T_{CKV}$ scaled by $(1 - \phi_{S,frac})$ to extract the time error $t_E$. After $t_E$ is quantized by a time-to-digital converter (TDC) and normalized by the TDC gain ($K_{TDC}$), the resulting phase adds to that of the coarse path, constituting $\phi_E$. The TAU also launches the CKU, which aligns with the fifth CKV falling edge after FREF and clocks the main digital block.
The PM function is realized through the two-point modulation scheme. On the DM side, the phase shift target $\Delta\phi_M$ is added to $\phi_V$ by tuning the DCO's offset frequency through $\phi_{DM}$; on the PP side, $\Delta\phi_M$ accumulates with FCW so that $\phi_S$ reflects the excess phase and ideally cancels with the sampled $\phi_V$ prior to the digital loop filter. As discussed in Sections III and IV, the PM accuracy suffers from two significant error sources. One is the DCO's FM nonlinearity raised by $1/\sqrt{LC}$, which is compensated by the proposed second-order phase-domain DPD. The other is the non-uniform characteristic of CKU. It is tackled by the NUCC introduced in Fig. 5(c), whose separate accumulators for FCW and $\Delta\phi_M$ are combined here without affecting the functionality.
B. Implementation of NUCC
Fig. 11(b) shows the implemented NUCC. The $\phi_{E,DM}$ and $\phi_{E,PP}$ compensation paths share the common term $\Delta\phi_M/FCW$, which characterizes the expected phase accumulation on the DCO during the average CKV period, i.e.,
$$\frac{\Delta\phi_M}{FCW} = f_M \cdot T_{CKV} \tag{22}$$
Scaling $\Delta\phi_M/FCW$ with $(\phi_{S,frac}[n] - \phi_{S,frac}[n-1])$ yields $\phi_{DMC}$, which compensates $\phi_{E,DM}$ due to the CKU period variation. This matches (10). To compensate $\phi_{E,PP}$ due to the CKU offset variation, $\Delta\phi_M/FCW$ is scaled to generate $\phi_{R2S}$ according to (23), which is a re-arranged version of (13). There, $N_{Tcnst}$ represents the constant component of the CKU offset (relative to FREF) normalized by the average CKV period, i.e.,
$$N_{Tcnst} = \frac{t_{cnst}}{T_{CKV}} \tag{24}$$
$N_{Tcnst}$ is estimated by an LMS algorithm that correlates the differentiated $\phi_M$ with the detected phase error $\phi_E$, emulating [15]. The diagram is also shown in Fig. 11(b), where the factor $\mu_{NT}$ adjusts the calibration convergence speed. Larger amplitudes in $\phi_{R2S}$ and $\phi_{DMC}$ indicate that more PM error is compensated by NUCC. Since $\Delta\phi_M/FCW$ is the base scaling term in both (10) and (23), NUCC improves the PM accuracy more conspicuously when a wideband signal (with a higher probability of large $\Delta\phi_M$ amplitudes) modulates the PLL with a small FCW. Besides, the impact of $\phi_{DMC}$ outweighs that of $\phi_{R2S}$. The former scales $\Delta\phi_M/FCW$ with a factor (i.e., $\phi_{S,frac}[n] - \phi_{S,frac}[n-1]$) ranging from −1 to 1, and reduces $\phi_{E,DM}$, which could directly accumulate on the DCO and interfere with the PM signal across multiple CKU cycles until corrected by the PLL. The latter scales $\Delta\phi_M/FCW$ with a factor (i.e., $\phi_{S,frac}[n-1]$) distributed within [0, 1), and reduces $\phi_{E,PP}$, which can be attenuated by the loop filter before disturbing the DCO.
Since NUCC tackles the $\phi_{E,DM}$ and $\phi_{E,PP}$ errors whose impacts depend on the PLL bandwidth (see Section II-A), the EVM improvement due to NUCC is also bandwidth-dependent. To demonstrate that, time-domain simulations of the 3188-MHz PLL-based phase modulator shown in Fig. 11 have been carried out. The simulation conditions (e.g., using a 64-PSK signal, $f_{REF}$ of 40 MHz, feedforward frequency division $K = 8$, and so on) and the way of evaluating EVM are identical to those of the measurements presented later in Fig. 20(b). The DCO in this simulation has perfect linearity and ultrafine resolution, thereby contributing negligible distortion and quantization error to EVM. This makes it easier to observe the impacts of the non-uniform CKU and NUCC. The simulated EVM versus the PLL bandwidth is shown in Fig. 12. Enabling NUCC (see the "NUCC on" curve) improves EVM by at least 10 dB compared with the case when NUCC is disabled (see the "NUCC off" curve). Hence, the "NUCC off" behavior is dominated by the impact of the non-uniform CKU, thereby roughly reflecting the EVM degradation due to the non-uniform CKU. According to the "NUCC off" curve, the non-uniform CKU degrades EVM more severely at narrower PLL bandwidths because the degradation is dominated by the $\phi_{E,DM}$ error, which is less suppressed by the PLL loop. Therefore, especially at low PLL bandwidths, the bulk of the EVM improvement from NUCC is obtained by merely enabling $\phi_{DMC}$ (see the curve of "only $\phi_{DMC}$ of NUCC on"). The EVM associated with the $\phi_{DMC}$-only option increases at wider PLL bandwidths because the non-uniform CKU contributes more PM error through $\phi_{E,PP}$ when the PLL bandwidth is wider. This necessitates activating the $\phi_{R2S}$ component of NUCC at wide PLL bandwidths. Finally, simultaneously utilizing both options in NUCC nearly entirely removes the effects of the non-uniform CKU and lowers the EVM to the level limited by phase noise across a wide range of PLL bandwidths.

C. DCO With Calibration
Fig. 13(a) depicts a schematic of the DCO core, consisting of the LC tank and complementary cross-coupled transistor pairs. The resonant frequency is tuned by the switched-capacitor (SC) banks. While performing PM, the active banks can be functionally categorized into two types. The first tracks the carrier, i.e., the 32-b unary tracking bank (TB). The second is used for FM and configured in a segmented style, i.e., consisting of an 8-b unary coarse modulation bank (MCB) and a 16-b unary fine modulation bank (MFB). All the encoded OTWs are resampled by CKU before toggling the DCO SC units in order to avoid the data-dependent propagation delay, which may vary the effective phase accumulation time in each CKU cycle and finally degrade the PM accuracy.
All the banks adopt the SC-unit structure sketched in Fig. 13(a), whose unit capacitor $C_U$ is inspired by the layout of a SAR ADC [40]. Here, the ground and output (VP/VN) nets can shield the internal switching node from the surroundings to minimize the systematic capacitance mismatch. This layout style also allows the SC units to abut each other, thereby shortening critical connection lines (i.e., VP and VN) to minimize the FM error related to the parasitic routing inductance. Fig. 13(b) illustrates the control logic surrounding the DCO core. Regarding the carrier phase tracking, the integer portion of $OTW_{TRC}$, i.e., $OTW_{TB}$, directly tunes the number of active TB units, and the fractional $OTW_{TRC}$ dithers one TB unit through a high-speed (HS) modulator clocked by CKVD4 at 1/4 of the CKV frequency to improve resolution [27].
For PM, $\phi_{DM}$, i.e., the compensated $\Delta\phi_M$, is first denormalized [...]. Among the three SC banks, MCB has the coarsest resolution and affects the DCO FM linearity the most significantly. To address the frequency error associated with each $OTW_{MCB}$ codeword (9 in total), a lookup table (LUT) adds an $OTW_{MCB}$-dependent compensation code, $OTW_C$, to the TB-tuning path. However, the control words from the scaled $OTW_{M,F}$ and LUT contain fractional bits, incompatible with the integer $OTW_{TB}$. Therefore, their sum is noise-shaped by a first-order low-speed (LS) modulator (at the CKU rate) before being added to $OTW_{TB}$ to prevent the quantization error from accumulating on the DCO. To further suppress the quantization error, one can also add the fractional bits to the high-speed modulator, as in [11].
Fig. 14. Behavioral description of the LUT with off-line calibration in Fig. 13: (a) calibrating the LUT content with the piecewise LMS algorithm in [12] and (b) updating the LUT with an LMS algorithm emulating $K_{DCO}$ calibration.
Two categories of parameters need to be estimated in Fig. 13(b). The first category is related to $K_{DCO}$, i.e., $f_{REF}/K_{DCO,M}$ and $K_{DCO,M}/K_{DCO,T}$. They are calibrated by an LMS-based algorithm, which correlates the detected phase error $\phi_E$ [input of the digital loop filter, see Fig. 11(a)] and the relevant phase tuning target (i.e., $\phi_{DM}$ or $OTW_{M,F}$), as in [15]. The second category is the LUT content, which is updated by correlating $\phi_E$ with $OTW_{MCB}$. The detailed algorithm depends on the dominant mechanism of non-idealities in MCB. For example, if the mismatch between the MCB units dominates, the piecewise LMS algorithm shown in [12] is preferred. Fig. 14(a) sketches the calibration principle. The LUT function is represented by the mux, which conditionally passes the $OTW_{MCB}$-associated compensation codes, VAL[0], ..., VAL[7], to $OTW_C$. After the chosen VAL[n] is used, the corresponding $\phi_E$ difference is scaled by $\mu_{DCO}$ and added to that VAL[n] (enabled by EN[n]). VAL[n] finally converges at the value that exactly compensates for the error of the associated $OTW_{MCB}$ codeword. One may notice that only 8 VAL units (VAL[0] to VAL[7]) are adopted to compensate the 9 $OTW_{MCB}$ codewords, i.e., integers ranging from −4 to 4 (considering MCB is 8-b unary). In fact, the frequency error associated with the codeword $OTW_{MCB} = 0$ is implicitly counted in the carrier frequency $f_0$ and automatically corrected by the PLL, since $OTW_{MCB} = 0$ is used when the PLL locks the DCO to $f_0$.
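The piecewise update can be sketched with a toy model. The sketch assumes that the phase-error sample observed after using a codeword equals that codeword's true error minus the current compensation value, which ignores loop dynamics, noise, and the differencing of $\phi_E$; the sign convention is likewise an assumption. The names `calibrate_lut` and `code_error`, the step size, and the error values are hypothetical.

```python
import numpy as np

# OTW_MCB = 0 needs no entry: its error is absorbed into the carrier by the PLL.
CODEWORDS = [-4, -3, -2, -1, 1, 2, 3, 4]

def calibrate_lut(code_error, mu=0.1, n_samples=4000, seed=0):
    """Piecewise LMS: each VAL entry is updated only when its codeword is used."""
    rng = np.random.default_rng(seed)
    val = {c: 0.0 for c in CODEWORDS}
    for _ in range(n_samples):
        c = CODEWORDS[rng.integers(len(CODEWORDS))]   # codeword in this cycle
        phi_e = code_error[c] - val[c]                # residual error (toy model)
        val[c] += mu * phi_e                          # scaled error added to VAL
    return val

# Hypothetical per-codeword errors, e.g., from unit-capacitor mismatch.
true_error = {c: 0.01 * c * c - 0.002 * c for c in CODEWORDS}
lut = calibrate_lut(true_error)
print(lut)
```

Because every entry adapts independently, the table can absorb arbitrary codeword-specific errors, at the cost of needing enough occurrences of each codeword to converge.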
On the other hand, if the dominant DCO non-ideality mechanism arises from the gain mismatch between MCB and MFB, i.e., the resolution ratio between MCB and MFB deviates from the nominal 16, all the desired VAL's linearly correlate with OTW MCB through the same factor, say K C . Consequently, the piecewise calibration in Fig. 14(a) simplifies to a K DCO -calibration-like algorithm shown in Fig. 14(b), where all the OTW MCB codewords and their corresponding φ E difference data are correlated to estimate the same gain factor K C . Then, K C · OTW MCB replaces

To maintain flexibility in modifying the algorithm, the LUT is updated in an off-line style [see Fig. 13(b)]: φ E and OTW MCB sequences are collected and stored in an SRAM for debugging. The software reads the data, processes it, and updates the LUT. With the new content in the LUT, φ E and OTW MCB samples are collected again to update the LUT, whose content settles after several iterations.

D. Calibrated Parameters in Face of Channel Hopping
The implemented system utilizes, in total, four calibration loops related to PM, i.e., those for N T cnst , f REF / K DCO,M , K DCO,M / K DCO,T , and the LUT tackling the OTW MCB -associated error. Blindly re-calibrating all these parameters after channel hopping may take a long time before the EVM returns to its optimum. To accelerate this re-calibration process, we first examine the frequency dependence of these parameters and then roughly compensate them according to the change in FCW.
Considering (24), N T cnst is designed to be a constant 4 because t cnst ideally represents an offset between CKU and the first CKV edge after FREF, and roughly equals 4T CKV . However, the DCO modulation frequency f M does not change immediately after the rising edge of CKU. An additional delay, i.e., t prop in Fig. 15, is always present, mainly due to the propagation latency of control signals (e.g., OTW's). This delay is substantially constant in the time domain but becomes frequency-dependent after being normalized by T CKV . Since the estimated N T cnst also counts t prop , the t prop -related part of N T cnst should be re-normalized according to the FCW (inversely proportional to T CKV ) after each channel hopping, i.e.,

$$N_{T\mathrm{cnst}}\big|_{\mathrm{new}} = 4 + \left(N_{T\mathrm{cnst}}\big|_{\mathrm{old}} - 4\right)\cdot\frac{\mathrm{FCW}|_{\mathrm{new}}}{\mathrm{FCW}|_{\mathrm{old}}} \tag{25}$$

where the subscripts "old" and "new" distinguish the corresponding parameters in the previous and newly hopped channels. After the channel hopping, if t prop does not significantly change (for example, due to environmental variations, such as supply voltage or temperature), (25) can directly set N T cnst to a value accurate enough to achieve optimum EVM in the new frequency channel. Consequently, re-calibration will be unnecessary.

Per the mathematical derivation in [16], K DCO exhibits a cubic relationship with the resonant frequency. Hence, after hopping to a new channel, f REF / K DCO,M should be re-calculated by (26). This equation is derived under the assumption of an ideal inductor. Considering that a real inductor behaves somewhat differently due to its parasitic capacitance, the estimated value might not be accurate enough for low EVM. Hence, some further calibration might still be needed. In contrast, K DCO,T / K DCO,M is determined by the capacitance ratio of the SC units in MFB and TB, and is thus independent of frequency and in no need of any further adjustment.
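The rough compensation after a channel hop can be sketched as below. The N T cnst update follows (25). The f REF / K DCO,M update is an assumption on our part: it only encodes the stated cubic dependence of K DCO on the resonant frequency (so the ratio scales with the inverse cube of the FCW ratio) and is not a transcription of (26). The function name and the numbers in the usage example are hypothetical.

```python
def rescale_after_hop(n_tcnst_old, fref_over_kdco_old, fcw_old, fcw_new):
    """Return rough initial values of the PM calibration parameters after a hop."""
    ratio = fcw_new / fcw_old
    # (25): only the t_prop-related part (the excess over 4) scales with FCW.
    n_tcnst_new = 4.0 + (n_tcnst_old - 4.0) * ratio
    # Assumption: K_DCO is proportional to f0**3 for an ideal inductor,
    # hence f_REF / K_DCO scales with the inverse cube of the FCW ratio.
    fref_over_kdco_new = fref_over_kdco_old / ratio ** 3
    return n_tcnst_new, fref_over_kdco_new


if __name__ == "__main__":
    # Hypothetical hop from FCW = 80 to FCW = 100.
    print(rescale_after_hop(4.4, 1.0, 80.0, 100.0))
```

In this toy case the excess of N T cnst over 4 grows by the FCW ratio of 1.25, while the gain-normalization value shrinks by 1.25 cubed.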
Regarding the LUT for MCB, it is utilized in combination with the phase-domain DPD, which tackles the 1/√(LC)-induced parabolic nonlinearity. Hence, the LUT mainly compensates for the non-idealities caused by device mismatches, e.g., the capacitance mismatch between MCB units or the gain mismatch between MCB and MFB. Considering that these mismatch ratios are roughly constant after fabrication, the LUT content does not need a frequency-dependent adjustment unless extremely low EVM is targeted.
In summary, after channel hopping, the values of N T cnst and f REF /K DCO,M need to be modified using (25) and (26)

E. Simplified Implementation Details of TAU
TAU is utilized here for phase detection because it exhibits high linearity (i.e., showing low fractional spurs in [39]), which helps to minimize the PM error due to φ E,PP . The TAU in Fig. 11(a) is a universal timestamp-signal processor which outputs a weighted sum of an arbitrary number of timestamp inputs. In the implemented system, to extract the time error t E induced by the phase noise and PM error, the TAU calculates the weighted sum of T CKV and t S given by (27). A simplified diagram of the TAU is shown at the bottom of Fig. 16. The controller programs the differential weighted time registers (WTR) to calculate (27). Fig. 16 (top) shows the details of a WTR [39]. It outputs a constant time offset minus the weighted sum of all the time inputs, t i 's. The WTR consists of a variable resistor R V , a variable capacitor C V , and a level-crossing slicer. The variable resistor and capacitor are, respectively, realized by switched-resistor and switched-capacitor banks, whose values are controlled by RT and CT codewords. Before processing the time inputs, the capacitor's voltage V C is initially preset to V init by a charging switch SWC. Then, LOW levels of the SWD signal discharge C V through R V . The widths of these active-low SWD pulses define the time inputs of the WTR, i.e., t i 's. These t i 's are stored and summed as voltage drops on V C during the discharging events. The weights of t i 's in the summation are controlled by the RC product of R V and C V , i.e., τ = R V · C V . To properly read the weighted sum stored in the WTR, SWD should stay LOW until the slicer launches a CMP falling edge, indicating the moment V C crosses V th , the threshold voltage of the slicer. The time offset between the last SWD falling edge and CMP asserting (i.e., falling edge) is defined as the WTR's time output t out , which equals a constant time offset t os minus the desired weighted sum of t i 's.

Fig. 16. Simplified diagram and waveforms of the TAU, which utilizes differential WTRs to calculate the weighted sum of input times (i.e., T CKV and t S ) and outputs the result as t E .
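The relation between t out and the inputs can be made explicit with a short derivation. It is a sketch under idealized assumptions that are ours rather than the paper's: a pure exponential RC discharge, an ideal slicer, and a time constant τ i that is held during the i-th input and τ out during the final read-out discharge (the symbols τ i , τ out , and w i are introduced here for illustration only).

```latex
% Each active-low SWD pulse of width t_i discharges C_V with time constant \tau_i:
V_C = V_{\mathrm{init}} \exp\!\left(-\sum_i \frac{t_i}{\tau_i}\right)

% The read-out discharge (time constant \tau_{out}) ends when V_C reaches V_th:
V_{\mathrm{init}} \exp\!\left(-\sum_i \frac{t_i}{\tau_i} - \frac{t_{\mathrm{out}}}{\tau_{\mathrm{out}}}\right) = V_{\mathrm{th}}

% Solving for the output time gives a constant offset minus a weighted sum:
t_{\mathrm{out}}
  = \underbrace{\tau_{\mathrm{out}} \ln\frac{V_{\mathrm{init}}}{V_{\mathrm{th}}}}_{t_{\mathrm{os}}}
  - \sum_i \underbrace{\frac{\tau_{\mathrm{out}}}{\tau_i}}_{w_i}\, t_i
```

Under these assumptions each weight is a ratio of RC products, which is consistent with the weights being set through the RT and CT codewords.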
The implemented TAU ultimately uses two WTRs in a pseudo-differential manner to cancel t os and to apply ± signs to the weighting factors of the t i 's. Accordingly, the differential WTRs' inputs and output are, respectively, redefined as the width differences of the SWD pulse pairs and the time offsets between CMPs. The controller in the TAU programs the differential WTRs to calculate (27). The controller samples T CKV and t S from the CKV and FREF clocks, and converts them to the differential WTRs' inputs; it also encodes the CT and RT sequences according to (1 − φ R,frac ). In addition, the controller generates the master clock for the main digital blocks, i.e., CKU. More details are discussed in [39].

VI. MEASUREMENT RESULTS
The proposed phase modulator is fabricated in TSMC 40-nm CMOS and occupies an active area of 0.31 mm² [excluding the pad drivers and SRAMs, see Fig. 17(a)]. With a reference clock of 40 MHz, it generates a phase-modulated clock whose carrier frequency f 0 ranges from 2.7 to 3.9 GHz. Fig. 17(b) shows the power consumption breakdown. The overall power drain is 4.6 mW, which is dominated by the DCO and its buffer, costing 2.35 mW at a 1.1 V supply. All other blocks are supplied with 1.0 V. The power consumption for the TAU-based phase detector sub-system (including TAU, TDC, counter, and so on) and digital logic are,   Fig. 14(a) were to be implemented on-chip, it would add a negligible power penalty to the overall 4.6-mW figure.

A. Measurement of the DCO's FM-INL
To measure the integral nonlinearity (INL) of the DCO's FM characteristic ("FM-INL"), we adopt the flow in Fig. 18(a). All possible combinations of the FM-related OTW's are input to a free-running DCO to measure the frequency differences relative to the corresponding f 0 , as in [41]. Such measured frequency difference reflects f M in a realistic FM operation. Meanwhile, the three OTW's are combined into OTW M , then "restored" to φ M through a reversed data flow relative to Fig. 13(b). Afterward, φ M is converted to the expected f M according to (1). The difference between the measured and expected f M 's reflects the FM-INL of the DCO.

Fig. 18(b) shows the measured FM-INL at f 0 = 3188 MHz. The "linear" (blue) case restores φ M by assuming that the φ M -to-OTW function [in Fig. 18(a)] contains only the first-order term, thereby reflecting the FM-INL of the DCO under the conventional linear assumption, as in Fig. 3(a). In reality, the INL curve is parabolic, and the maximum frequency deviation can exceed 7 MHz. After including the second-order term in the φ M -to-OTW function, which emulates the case of applying the proposed phase-domain DPD, the INL curve (green) becomes a linear staircase. This residual error after the DPD can be attributed to the fact that the resolution ratio between MCB and MFB deviates from the nominal value of 16, as evidenced by the nine stairs in this curve, which coincide with the number of MCB codewords. To compensate for this error, we introduce a small correction factor K E when combining the OTW's [see Fig. 18(a)]. With K E = 0.023, the maximum INL reduces to 0.5 MHz, below 0.26% of the full FM range (i.e., 197 MHz). K E merely describes the nonlinear behavior, and the associated effect will be addressed by the LUT for OTW MCB when characterizing the PM accuracy.

Fig. 18(c) shows the FM-INL curves at multiple f 0 's under the same DCO linearization settings, i.e., using the second-order phase-domain DPD and K E = 0.023.
From 2708 to 3786 MHz, the frequency error is always below 0.45% of the full range, validating the efficacy of the phase-domain DPD in a wide range of carrier frequencies. The declining trend of the 3948-MHz curve can be attributed to the behavior of the physically realized inductor, whose effective value (defined as the reactance X L over angular operating frequency ω = 2π f 0 , i.e., L eff = X L /ω) was assumed to be constant in the derivation of the phase-domain DPD, but it actually changes with frequency due to the distributed parasitic capacitance [42].

B. PM Signal Generation and Measurement Setup
Although a GMSK signal is commonly used to evaluate the accuracy of phase modulators, it may fail to reflect the performance across the full PM range because it employs only two possible phase shifts between symbols (i.e., ±0.5π), exercising limited OTW codewords. Therefore, using M-PSK signals is deemed more reasonable. To avoid AM in conventional M-PSK signals [43], we generate the test signal by interpolating the symbols using a frequency pulse-shaping filter from the continuous phase modulation (CPM) [44]. Fig. 19(a) illustrates how the symbol is interpolated in this work. The frequency pulse-shaping filter g(t) lasts four sampling clock (FREF) cycles, equal to one symbol period T sys . The integral of g(t) defines the transition between symbol phases, i.e., θ sys . During the first three T REF 's, g(t) traverses the shape of a raised-cosine filter to smoothen the phase trajectory θ M (t). In the last T REF , g(t) = 0, thus freezing θ M (t) at the associated θ sys . Consequently, the symbols can be simply restored by sampling the transmitted signal during this period.
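The symbol interpolation described above can be sketched as follows. This is a minimal illustration under our own assumptions: the raised-cosine frequency pulse is taken as a unit-area pulse spanning three T REF periods, its integral scales the phase step between consecutive symbols, no phase wrapping or shortest-path selection is applied, and T REF = 25 ns follows from the 40-MHz reference. The function names are hypothetical.

```python
import math


def phase_transition(t, t_pulse):
    """Integral of a unit-area raised-cosine pulse of length t_pulse (0 to 1)."""
    if t <= 0.0:
        return 0.0
    if t >= t_pulse:
        return 1.0
    x = t / t_pulse
    return x - math.sin(2.0 * math.pi * x) / (2.0 * math.pi)


def interpolate_symbols(symbol_phases, t_ref=25e-9):
    """Return the phase trajectory sampled at every FREF edge (4 per symbol)."""
    t_pulse = 3 * t_ref  # g(t) is nonzero only during the first three T_REF
    theta = [symbol_phases[0]]
    for prev, nxt in zip(symbol_phases, symbol_phases[1:]):
        for k in range(1, 5):
            # k = 3 and k = 4 both return the settled symbol phase.
            theta.append(prev + (nxt - prev) * phase_transition(k * t_ref, t_pulse))
    return theta


if __name__ == "__main__":
    traj = interpolate_symbols([0.0, math.pi / 2, math.pi / 4])
    print([round(v, 4) for v in traj])
```

The last two samples of each symbol period are equal, which reflects the frozen phase during the final T REF and is what allows the symbols to be recovered by sampling in that interval.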
The measurement setup is shown in Fig. 19(b). The desired phase, i.e., the discrete-time θ M , is processed to φ M , loaded into an on-chip SRAM, and then input to the proposed phase modulator. The modulated output centers at f 0 and is further frequency-divided off-chip by K (programmable from 1 to 8). The division extends the carrier to a lower RF channel frequency, emulating a realistic multiband polar TX, and helps to evaluate the effects of DCO nonlinearity at large BW FM / f 0 . The divided clock is sampled by a high-speed oscilloscope, then processed in MATLAB to evaluate the EVM.

C. Modulation Performance at 64-PSK
A 64-PSK signal with a data rate of 60 Mb/s is finally adopted to evaluate the PM accuracy. Fig. 20 shows the measured constellation diagram at f 0 = 3188 MHz. According to Fig. 20(a), when the feedforward division ratio K increases from 1 to 8 with all compensation options turned off (i.e., phase-domain DPD, LUT for OTW MCB , and NUCC), EVM degrades from −35.1 to −24.4 dB. This is because the large K requires wider BW FM (expanding from 24 to 192 MHz),³ which boosts BW FM / f 0 (increasing from 0.75% to 6.02%), and finally intensifies the 1/√(LC)-induced DCO nonlinearity. Fig. 20(b) begins with the worst case (K = 8) in Fig. 20(a). After enabling the phase-domain DPD, EVM is improved to −38.3 dB. However, as indicated by the DCO FM-INL curve in Fig. 18(b), the DPD performance is masked by the error in the resolution ratio between MCB and MFB, i.e., K E in Fig. 18(a). To combat this K E error, the LUT for OTW MCB [see Fig. 13(b)] is updated by the K DCO -calibration-like algorithm shown in Fig. 14(b), where the compensation gain K C is equivalent to 16K E · K DCO,M / K DCO,T . Then, EVM    Fig. 18(c)]. The difference in EVM before and after applying NUCC suggests that NUCC removes a PM error around −47.9 dB, agreeing with the simulation result (see the "NUCC off" curve in Fig. 12) at a large PLL bandwidth (around 3 MHz according to the phase noise profile in Fig. 21). In addition, the output spectrum of this case is shown in Fig. 22.

Fig. 23(a) shows the measured EVM versus the fractional FCW (FCW frac ) at different forward frequency division ratios (K ) when the integer FCW and all compensation options remain the same as in the final state of Fig. 20(b). Under a constant K , EVM varies within 1 dB across FCW frac .⁴ With K increasing from 1 to 8, EVM shows a 10.6 dB improvement, similar to the trend of quantization noise that decreases with −20 log₁₀ K .
However, the EVM is actually dominated by the DCO nonlinearity according to the EVM breakdown for the rightmost case on the K = 1 curve. The contribution due to the DCO's finite resolution is −43 dB. This is because the TB's frequency resolution f res = 156 kHz and update interval T REF = 25 ns result in a phase resolution of $\theta_{\mathrm{res}} = 2\pi \cdot f_{\mathrm{res}} \cdot T_{\mathrm{REF}}$, which adds to the modulated phase as a quantization noise with the power of $\theta_{\mathrm{res}}^{2}/12$, given that the noise transfer function of the low-speed first-order ΣΔ modulator in Fig. 13(b), i.e., $N(z) = 1 - z^{-1}$ [45], cancels out the accumulation characteristic of the DCO, i.e., $1/(1 - z^{-1})$ in the transfer function (see Fig. 9). Additionally, the integrated phase noise (IPN) of the unmodulated carrier degrades the EVM by −44 dB, which is 3 dB higher than the double-sided IPN of −47 dBc shown in Fig. 21, since the modulated signal spreads over both positive and negative offset frequencies.
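The −43 dB figure can be checked numerically from the stated resolution and update interval. The sketch below only evaluates the uniform-quantization noise power θ res ²/12 in decibels; the function name is hypothetical.

```python
import math


def quantization_evm_db(f_res_hz, t_ref_s):
    """EVM contribution (dB) of a phase step theta_res = 2*pi*f_res*T_REF."""
    theta_res = 2.0 * math.pi * f_res_hz * t_ref_s  # phase resolution in rad
    noise_power = theta_res ** 2 / 12.0             # uniform quantization noise
    return 10.0 * math.log10(noise_power)


if __name__ == "__main__":
    # f_res = 156 kHz and T_REF = 25 ns, as stated in the text.
    print(round(quantization_evm_db(156e3, 25e-9), 1))  # -43.0
```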
The combined EVM contribution from these two sources is −40.5 dB, which is 3.5 dB lower than the measured EVM of −37 dB. The DCO nonlinearity appears the only candidate to explain this gap.
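The budget arithmetic above can be reproduced with a power-domain sum, assuming (as the text implicitly does) that the contributions are uncorrelated. The residual computed at the end is our own illustrative estimate of what is left for the DCO nonlinearity under that assumption, not a number reported in the measurement; the function names are hypothetical.

```python
import math


def combine_db(*levels_db):
    """Power-sum uncorrelated EVM contributions given in dB."""
    return 10.0 * math.log10(sum(10.0 ** (x / 10.0) for x in levels_db))


def residual_db(total_db, *known_db):
    """Contribution (dB) left after removing known terms from a measured total."""
    rest = 10.0 ** (total_db / 10.0) - sum(10.0 ** (x / 10.0) for x in known_db)
    if rest <= 0.0:
        raise ValueError("known contributions exceed the measured total")
    return 10.0 * math.log10(rest)


if __name__ == "__main__":
    print(round(combine_db(-43.0, -44.0), 1))          # -40.5
    print(round(residual_db(-37.0, -43.0, -44.0), 1))  # residual estimate
```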
To further explore why the DCO nonlinearity affects EVM in a similar trend as does the quantization noise, Fig. 23(b) provides the f M distribution together with the DCO's FM-INL curve, on which the 9 discrete segments correlate with the 9 MCB codewords, and the V-shape of each segment arises from the mismatch between the MFB units. When K = 1, the exercised f M range almost overlaps with the central V-shape segment, so only the FM-INL related to MFB degrades the EVM. However, when K increases to 8, the INL grows 2.5×, i.e., from 0.2 to 0.5 MHz. Considering that the operational f M range is also multiplied by 8, the INL relative to the exercised range scales by a factor of 0.31 (i.e., 2.5/8), agreeing with the 10 dB improvement in EVM. Therefore, the high EVM at small K is mainly attributed to the MFB exhibiting unexpectedly strong nonlinearity, which is even higher than that due to MCB considering the frequency-tuning range. To further improve the EVM, additional measures are needed to combat the MFB-related INL, e.g., an additional LUT for OTW MFB or the dynamic element matching (DEM) in [46].

Fig. 24 shows the measured EVM versus the DCO carrier frequency f 0 at different forward frequency division ratios K . EVM generally decreases at low f 0 and large K because these cases exercise a wider portion of the DCO's frequency-tuning range, which dilutes the effect of MFB's nonlinearity. To demonstrate that the combinational DPD addressing the DCO nonlinearity, i.e., the DPD simultaneously applied in both phase and OTW domains, can achieve frequency-insensitive performance, the EVM is measured in two scenarios. In the first scenario (solid lines in Fig. 24), the compensation settings (i.e., phase-domain DPD, the OTW MCB LUT, and NUCC) are kept the same as in the final state in Fig. 20(b) irrespective of f 0 . In the second scenario (dashed lines), the OTW MCB LUT is updated at each frequency point with the piecewise calibration shown in Fig. 14(a) to represent the optimum EVM of this design. At most points, the solid lines coincide with the dashed ones. In the case of K = 4 and K = 8, EVM on the solid lines remains below −43 dB across the full tuning range of f 0 . This validates the frequency-insensitive performance of the combinational DPD solution.

⁴ In the realized phase modulator, the FREF signal couples to and periodically disturbs the DCO. The disturbance strength depends on the instantaneous phase difference between the FREF and DCO clocks, thus fluctuating at the frequency of FCW frac · f REF . At lower FCW frac , the disturbance experiences less filtering by the DCO (described by the DCO's phase-domain transfer function, i.e., 1/s). The unfiltered disturbance not only directly degrades EVM by increasing the PM error, but also results in a larger detected phase error φ E . A large φ E can saturate the TDC (detecting time errors ranging from −3.5 to 3.5 ps) and slow down the PLL's transient response. Therefore, PM errors stay uncorrected for a longer time, thereby further degrading the EVM. This is a possible explanation as to why the EVM increases at very small FCW frac .
One may notice a greater deviation between the solid and dashed lines at relatively high frequencies ( f 0 > 3.4 GHz) and K = 8. This is because the DCO exhibits a larger FM-INL [after being compensated by a fixed gain factor, K C , shown in Fig. 14(b)] at higher resonant frequencies and across wider exercised f M ranges (i.e., BW FM , which scales with K ) according to Fig. 18(c).
Due to its relatively frequency-insensitive performance, the combinational DPD can reduce the efforts required in the DCO nonlinearity calibration and shorten the time to reach optimum EVM after each channel-frequency hop. To prove this, we hopped the PLL's center frequency f 0 between 2868 and 3948 MHz, then measured [recorded by the debugging SRAM in Fig. 13(b)] the settling curves of f REF / K DCO,M (the only parameter that will likely require a re-calibration according to Section V-D), as shown in Fig. 25.
At each new frequency, f REF / K DCO,M starts with an initial value that is calculated from the final value of the was also written back to the phase modulator to measure the corresponding EVM in the K = 4 case (where the calibration process also used the same PM sequence in accordance with K = 4). As shown in Fig. 25, EVM settles to the optimum value within 15 µs. This time is much shorter than the 100 ms needed by the phase modulator to calibrate the DCO's nonlinearity with the piecewise LMS algorithm [7]. One might argue that this comparison is unfair since the aforementioned 100 ms is the calibration time during an initialization, which can be shorter if optimized for channel hopping. However, the assumed shorter calibration time after channel hopping does not hold for the piecewise LMS, since its calibration results are related not only to the DCO nonlinearity but also to the estimated K DCO 's [12]. After the DCO hops to the frequency associated with a faraway channel, the K DCO 's will change significantly. Consequently, the piecewise LMS will need to correct rather large errors, and so the corresponding calibration time will not considerably differ from that in the original initialization.

D. Performance Comparison
Table I compares this work with state-of-the-art PLL-based phase modulators. While running the DCO at 3188 MHz, this design produces a transmitted RF carrier at 398.5 MHz after the division by K = 8. When generating the 64-PSK signal, the DCO exercises an FM bandwidth (BW FM ) of 192 MHz, corresponding to 6.02% fractional BW FM (BW FM / f 0 ); hence, it results in a large FM error due to the 1/√(LC)-induced DCO nonlinearity. Despite this, the proposed phase modulator achieves the lowest EVM and energy per bit, i.e., −47.6 dB and 0.08 nJ/bit, respectively.
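The headline figures quoted above follow from simple ratios of numbers already given in the text (4.6 mW total power, 60 Mb/s, 3188 MHz, K = 8, 192 MHz). The sketch below recomputes them as a consistency check; the function and key names are hypothetical.

```python
def summarize_operating_point(f_dco_hz, k_div, bw_fm_hz, power_w, bit_rate_bps):
    """Derive the divided carrier, fractional FM bandwidth, and energy per bit."""
    return {
        "carrier_hz": f_dco_hz / k_div,
        "fractional_bw_percent": 100.0 * bw_fm_hz / f_dco_hz,
        "energy_per_bit_nj": 1e9 * power_w / bit_rate_bps,
    }


if __name__ == "__main__":
    point = summarize_operating_point(3188e6, 8, 192e6, 4.6e-3, 60e6)
    for name, value in point.items():
        print(name, round(value, 3))
```

The energy per bit evaluates to roughly 0.077 nJ/bit, which rounds to the 0.08 nJ/bit quoted above.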
It should be noted that the issue of comparing EVMs across designs is still an open question in the literature. Cherniak et al. [11] have chosen to normalize the EVMs to the same output frequency. This is equivalent to measuring the EVM after virtually dividing⁵ the PM clock by K rescale = f reported / f chosen , where f reported is the original output frequency reported in a given reference paper, and f chosen is our chosen target output frequency for re-scaling (here equal to 398.5 MHz). Under an expedient assumption that the PM error is dominated by random jitter (i.e., thermal phase noise), the rescaled EVM in dB, i.e., EVM rescale , equals the original EVM minus 20 log₁₀(K rescale ) because the divided carrier period becomes K rescale times larger, but the random jitter remains the same. Table I also lists the calculated EVM rescale values of each work.
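Under the jitter-dominated assumption stated above, the rescaling is a one-line calculation. The sketch below applies it to a purely hypothetical design (−40 dB EVM reported at 3188 MHz) to show the size of the correction; the function name and the example values are not taken from Table I.

```python
import math


def evm_rescaled_db(evm_db, f_reported_hz, f_chosen_hz):
    """Rescale an EVM (dB) to a chosen output frequency, assuming random jitter dominates."""
    k_rescale = f_reported_hz / f_chosen_hz
    return evm_db - 20.0 * math.log10(k_rescale)


if __name__ == "__main__":
    # Hypothetical example: virtual division by 8 lowers the EVM by about 18 dB.
    print(round(evm_rescaled_db(-40.0, 3188e6, 398.5e6), 2))
```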
However, the above −20 log₁₀(K rescale ) scaling assumption does not hold under a realistic scenario of a wideband TX when distortion dominates the PM error because the distortion increases with K rescale . This can be understood by inspecting the distortion induced by the error in the modulation frequency ( f M ): according to Section IV-A, the relative f M error due to the 1/√(LC) nonlinearity is roughly reflected by BW FM / f 0 . If the original PM clock at f 0 were to be (virtually) frequency-divided by K rescale (for the EVM rescaling), BW FM should multiply by K rescale to keep the PM characteristics (e.g., data rate and constellation) unchanged after the division. Hence, a larger K rescale increases BW FM / f 0 , indicating a stronger relative f M error and a higher EVM contribution. This is verified by Fig. 20(a), contradicting the EVM-rescaling trend indicated by the jitter-dominant assumption. Although linearizing the DCO can suppress the f M error, the residue increases dramatically with BW FM / f 0 due to the high-order nonlinearities indicated in (18).⁶ This will ultimately dominate the EVM.
Considering that the EVM contributions due to jitter and distortion change differently in the frequency rescaling, we prefer to separately compare these two contributors, rather than merely considering the overall EVM. In Table I, the former one is already covered by the integrated rms jitter, and the latter is reflected by the EVM excluding IPN (integrated phase noise) at their original output frequencies. The "EVM excl. IPN" is calculated by the following equation: where 3 dB is added to IPN because it integrates phase noise over positive or negative offset frequencies and counts merely half of the EVM contribution. The proposed phase modulator exhibits the lowest distortion level compared with other works.
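One plausible way to compute such a figure is a power-domain subtraction of the IPN contribution (raised by 3 dB, as explained above) from the measured EVM. The sketch below encodes that reading; it is our assumption about the form of the calculation and should not be taken as the paper's exact equation. The function name is hypothetical, and the example reuses the K = 1 numbers quoted in Section VI-C.

```python
import math


def evm_excluding_ipn_db(evm_db, ipn_dbc):
    """Remove the IPN contribution (IPN + 3 dB) from a measured EVM, in power."""
    total = 10.0 ** (evm_db / 10.0)
    ipn_part = 10.0 ** ((ipn_dbc + 3.0) / 10.0)
    if total <= ipn_part:
        raise ValueError("IPN contribution is not smaller than the measured EVM")
    return 10.0 * math.log10(total - ipn_part)


if __name__ == "__main__":
    # Measured EVM of -37 dB with an IPN of -47 dBc (K = 1 case).
    print(round(evm_excluding_ipn_db(-37.0, -47.0), 2))
```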

VII. CONCLUSION
This article has demonstrated a digital PLL-based phase modulator of high accuracy yet low power consumption. Although the DCO updates at a non-uniform clock and suffers from strong nonlinearity due to the wide FM bandwidth, the phase modulator can still achieve EVM below −47 dB for a 60-Mbit/s 64-PSK signal. This benefits from the two proposed innovations: 1) the NUCC that addresses PLL disturbances arising from the time-varying period and offset of the updating clock and 2) the phase-domain DPD that compensates the 1/√(LC)-induced DCO nonlinearity. From the methodology perspective, the NUCC analysis entails the improved PM model in the hybrid-time domain. The new model is effective in analyzing the time-related distortions in general PLL-based phase modulators. Moreover, combining the proposed phase-domain predistortion with the conventional OTW-domain counterpart could constitute a frequency-insensitive solution compensating for DCO nonlinearity. These two powerful tools would benefit low-power PLL-based phase modulators in improving accuracy, thereby paving the way for future polar TXs supporting high-data-rate applications.