A Type-II Phase-Tracking Receiver

We present a new analog-to-digital converter (ADC)-based architecture of a phase-tracking receiver (PT-RX) optimized for ultra-low-power (ULP) and ultra-low-voltage (ULV) operations for the Internet of Things (IoT). The RX employs a type-II loop configuration that offers improved stability compared with the previous type-I PT-RX solutions. In addition, the type-II loop is also very tolerant of long run-lengths of consecutive “1” or “0” symbol sequences. Fabricated in 28-nm CMOS, the prototype PT-RX targets Bluetooth low energy (BLE) standard consuming only 1.5 mW at a supply of ≤0.7 V. It maintains an adjacent-channel rejection (ACR) of ≥−11/3.5/17/27 dB at 0/±1/±2/±3 MHz offset and can tolerate out-of-band (OOB) blockers of minimum −21 dBm across 1.0–3.5 GHz while also offering a best-in-class figure of merit (FoM) of 181 dB, with a 1-Mb/s BLE sensitivity of −93 dBm.

where RF blocks [i.e., low-noise amplifier (LNA) and local oscillator (LO)] dominate most of the power budget in order to satisfy the stringent sensitivity and linearity requirements. Therefore, rather than focusing on the power reduction of individual blocks incrementally, we aim at re-visiting the RX architecture from the ULP and ULV standpoints.
Several innovative RX architectures have been published recently. A hybrid-loop RX in [7] achieves good adjacent-channel rejection (ACR) performance by using an all-digital phase-locked loop (ADPLL) as an analog-to-digital converter (ADC) for quantization, with the dynamic range enhanced via digital-to-analog converter (DAC) feedback. However, it is vulnerable to an RF carrier-frequency offset because its IF must align exactly with the deviation frequency of the FSK symbols, and it also suffers from image issues due to its low-IF conversion. Besides, the 1.1-mW ADPLL limits its overall power efficiency.
In contrast to the twin-path topology of conventional I/Q RXs, a new single-path phase-tracking RX (PT-RX) in [2] demodulates the input symbols directly at RF. This significantly reduces the overall power consumption by removing the need for a separate LO PLL/RF synthesizer (typically, the most power-hungry block in an RX) and further removes the need for quadrature I/Q signal processing circuits in the baseband. However, it suffers from poor sensitivity (degraded by inadequate frequency-deviation control and excessive loop latency) and limited ACR (deteriorated by the comparator-induced increase in sidelobe energy and a non-robust locking loop). As a result, it cannot fully meet the Bluetooth standard, which is extensively used in IoT applications. A follow-up PT-RX in [3] has demonstrated improved ACR and sensitivity performance that satisfies the Bluetooth LE standard while consuming only 2.3 mW. However, that PT-RX still compromises its power efficiency by using an aggressive hybrid-loop filter in order to achieve better ACR. It also shares the power-hungry ADPLL-based digital frequency-modulation (FM) interface with the transmitter to calibrate the symbol deviation frequency and define the initial frequency in order to improve sensitivity. Most significantly, the existence of a non-zero phase error in the type-I phase-tracking loop [2], [3] degrades the stability of the locking loop and prevents it from supporting long sequences of "0" and "1" symbols.
To solve the fundamental issue with the type-I PT-RX and to further reduce its power consumption, a type-II loop arrangement is proposed here in which the 1-bit phase quantizer of the prior solution is replaced with a 10-bit successive-approximation-register (SAR) ADC. In addition, the analog continuous-time low-pass filter (LPF) is replaced with a more efficient switched-capacitor (sw-cap)-based discrete-time (DT) LPF. The new RX consumes only 1.5 mW at a supply of ≤0.7 V, and it offers three key benefits: 1) a robust locking/tracking loop; 2) improved performance that satisfies the BLE standard with the best sensitivity FoM of 181 dB; and 3) tolerance of long run lengths of consecutive "1" and "0" symbols.
This article is organized as follows. In Section II, a detailed review of state-of-the-art PT-RXs is carried out, and then, the new type-II PT-RX is introduced. A linear s-domain model of the proposed PT-RX is further presented, followed up by noise analysis. Section III discusses the circuit implementation. Finally, to show the effectiveness of the proposed system, Section IV discloses the experimental results.

II. ARCHITECTURE OF PHASE-TRACKING RX
A. Review of Type-I Phase-Tracking RX Architectures
Fig. 1(a) shows a simplified diagram of the original PT-RX in [2]. This single-path RX is designed to demodulate IEEE 802.15.4 signals. To fairly compare our proposed RX with this PT-RX, we apply the same Gaussian frequency-shift keying (GFSK) modulation to both RXs, at a data rate of 1 Mb/s. A GFSK-modulated baseband signal (peak deviation frequency f_pk = 250 kHz) with a carrier frequency f_C is applied to the LNA and mixed down by the digitally controlled oscillator (DCO) frequency f_DCO. Apart from operating as an RF downconverter, the mixer, along with the one-bit quantizer, also behaves as a bang-bang (BB) PD in the loop. The mixer's output phase error φ_MX is applied to an LPF to remove undesired components. Thus, the LPF not only behaves as a loop filter but also operates as a channel-selection filter in the RX baseband. The filtered phase error φ_LPF is then limited by a one-bit quantizer, whose logic output is applied to a proportional-integral (PI) loop filter with programmable coefficients α and ρ that tunes the DCO to track the RF input frequency. Thus, the PI controller completes the automatic frequency calibration (AFC) loop such that its output, fed to the DCO as an oscillator tuning word (OTW), represents the carrier frequency together with the recovered input modulating waveform.
However, the AFC is sensitive to non-50% duty-cycle symbol patterns, which could result in a DCO frequency drift [see Fig. 1(b)]. Assume that there is no initial TX drift, and a sequence of "1011" (the first symbol "1" has been tracked) is fed into the RF input. This 67% duty-cycle symbol pattern ("011") will result in the accumulation of offset at the LPF output. This AFC offset depends on ρ/α and also on the length of consecutive sequences of "0" and "1" symbols. It results in a DCO frequency drift, which then limits the ACR and sensitivity performance. For long consecutive symbol sequences or large ρ/α, φ_LPF falls into the negative region (θ = π/2 to 3π/2), which flips the comparator output to logic 0 and results in a demodulation error. Due to this AFC offset accumulation, the PT-RX in [2] is only capable of supporting up to seven consecutive symbols.
Apart from the constrained symbol patterns, another issue of this type-I PT-RX is the instability of the locking loop in steady state, which can further degrade the ACR and sensitivity performance. Fig. 1(c) shows the characteristic of the phase detector (PD) function. The PD operates relatively linearly with a maximum small-signal gain (∂φ_LPF/∂θ) centered near the transitory zero-crossings. However, it exhibits a small-signal gain of zero if θ = kπ (k = 0, ±1, ±2, ...), exactly at the points where the type-I PT-RX is kept in steady state. Now, suppose that the input symbol is tracked (either "0" or "1" symbol), which means that the PD operates at the peak of its characteristic. The PD gain drops to zero there, which leads to an equivalent open loop. Essentially, with a very small but deterministic frequency drift, the RX cannot lock the loop reliably.
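The vanishing PD gain at the type-I steady-state points can be checked numerically. The sketch below is only illustrative: it assumes an idealized sinusoidal PD characteristic V_0 cos θ (the same form used for the LPF output in the next subsection) with V_0 normalized to 1, and the function names are hypothetical.

```python
import math

V0 = 1.0  # assumed normalized PD output amplitude

def pd_output(theta: float) -> float:
    """Idealized mixer-based PD characteristic, V0*cos(theta)."""
    return V0 * math.cos(theta)

def pd_gain(theta: float) -> float:
    """Small-signal gain d(pd_output)/d(theta) = -V0*sin(theta)."""
    return -V0 * math.sin(theta)

# Peaks of the characteristic (theta = k*pi) versus a zero crossing (pi/2)
for label, theta in [("theta = 0 (peak)", 0.0),
                     ("theta = pi/2 (zero crossing)", math.pi / 2),
                     ("theta = pi (peak)", math.pi)]:
    print(f"{label}: output = {pd_output(theta):+.3f}, "
          f"gain = {pd_gain(theta):+.3f}")
```

Under this assumption, the gain magnitude is maximal at the zero crossings and vanishes at θ = kπ, which is why a loop parked at a peak of the characteristic behaves as if it were open.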
A frequency-domain ON/OFF keying (F-OOK) modulation was proposed in [8] to address the constrained data-pattern issue by means of adapting the input modulation scheme to F-OOK at 100 kb/s. However, it does not support the GFSK modulation, which is a requirement in BLE. Our proposed type-II PT-RX addresses the two above issues in [2], [3], and [9] architecturally and in a power-efficient manner.

B. Proposed ADC-Based Type-II Phase-Tracking RX
A new architecture of a type-II PT-RX is proposed in Fig. 1(d). A SAR-ADC is now utilized to quantize the phase error so that, along with the mixer, it can operate as a multi-level PD. In addition to the AFC applied for the fine calibration, a coarse carrier frequency offset (CFO) calibration is realized for the initial correction of the carrier frequency contained in the received BLE preamble. The multi-level ADC is an essential block in the type-II loop structure, and it achieves two key benefits over the prior-art type-I solution with the one-bit quantizer: 1) by means of the digital accumulator, it zeroes out the mean value of the phase error, which eliminates the AFC offset and helps to achieve loop robustness and 2) it improves the RX SNR, which is otherwise degraded by the DCO sidelobe energy and quantization noise.
The operation is illustrated via an example. Suppose that the input data stream is "1011," and the first "1" symbol has been properly tracked. This implies that θ = π/2, φ_LPF = 0, and f_DCO = f_C + f_pk [see Fig. 1(f)], with f_pk = 250 kHz. The next symbol changes to "0" (f_IN changes to f_C − f_pk); then, the LPF output φ_LPF follows V_0 cos(2π f_pk t + π/2), and it gradually goes negative [θ = π/2 → 3π/2 in Fig. 1(f)]. The negative value of φ_LPF is digitized by the SAR ADC, causing the AFC output to decrease [see Fig. 1(e)]. After a certain time, the phase error φ_LPF traverses the blue trajectory in Fig. 1(f) and reaches the location of θ = 3π/2, where it returns to zero again. The DCO output frequency is forced to f_C − f_pk, which indicates that the input symbol is tracked again. Hence, it is evident that in the proposed PT-RX, the phase error returns to zero after the symbol is tracked, which is an essential characteristic of the type-II loop. Apart from the phase error gravitating to zero, this type-II PT-RX tracks the transitions between symbols rather than the steady states as in type-I PT-RXs. In particular, the blue trajectory in Fig. 1(f) stands for symbol "0," and the black trajectory represents symbol "1."
The proposed PT-RX can track any symbol pattern since the AFC offset inherent in the type-I PT-RX is now eliminated [see Fig. 1(e)]. Suppose that the first "1" of the "011" symbol sequence is tracked, and the second one is coming. Since the latter does not change the RF input frequency, the DCO is kept stationary due to the lack of residual phase error from the ADC output [see Fig. 1(e)]. Consequently, the length of consecutive data is not limited at all in the proposed solution.
Perhaps, a bit deeper insight can be gained with an analogy to a BB (AD)PLL [10], where locking to a reference clock signal there can be loosely compared with locking to a long sequence of "1" symbols here. Due to the 1-bit quantization of the phase error in the BB-ADPLL, the loop exhibits chattering around the zero phase-error point, and a bit of loop delay can cause an oscillation of the oscillator's tuning input or even cycle slips. Because its one-bit quantizer does not provide the necessary stable point at exactly the zero phase error state, so as to keep the expected integrator output at zero, the DCO will continue chattering between the values above and below zero. In contrast, applying a multi-bit ADC here is equivalent to applying a multi-bit TDC there such that a stable locking point can be found. In other words, to avoid such toggling near the threshold in the steady state, the comparator (BB-PD) of the type-I PT-RX has to operate in the non-linear range with nearly zero gain; hence, the tracking loop is essentially free-running within that dead zone.
A behavioral model of the PT-RX is built in Simulink to perform time-domain simulations. Fig. 2 shows waveforms at various key nodes (i.e., input frequency; LPF, ADC, and OTW outputs). Four pairs of "10" denote the BLE preamble. As expected, after a few repeated symbols, the LPF and ADC outputs return to zero. The OTW waveform tracks the input frequency deviation of the transmitted symbols quite faithfully.
As an additional benefit, the ADC-based type-II loop arrangement mitigates the DCO sidelobe energy level compared with the type-I solution. The sidelobe energy at the DCO output is unavoidable due to the binary modulation, which is abrupt [see Fig. 1(b)] and contains many harmonics. It mixes with the RF input and results in a residual interference, which limits the ACR performance. The multi-level ADC in the proposed PT-RX directly conveys the GFSK-like waveform (OTW) with less harmonic distortion into the DCO. Fig. 3 demonstrates that the higher the ADC resolution, the lower the BER. To meet the BLE specification, a 20-dB rejection at 2-MHz offset is targeted with 3-dB margin; hence, the ADC requires at least 8 bits of resolution.

C. Linear S-Domain Model
A simplified linear s-domain model of the proposed PT-RX is shown in Fig. 4. Similar to the modeling of a conventional ADPLL, it is a continuous-time approximation of a DT z-domain model with a sampling frequency of f_0 = 25 MHz and is valid since the signal bandwidth of interest (1 MHz) is much smaller than the sampling frequency [11], [12]. The PT-RX is modeled as an arrangement of three main blocks (the PD, the loop filter or PI controller, and the normalized DCO). To simplify the analysis, the other blocks [i.e., LNA, transimpedance amplifier (TIA), and LPF] are represented as merely a constant gain, which is denoted by the forward gain K_G.
The passive mixer, together with the ADC, is modeled as a phase subtractor with the gain of K_PD since, in a steady state, its small-signal gain is always linear (see Fig. 1). By contrast, the PD of the type-I PT-RXs in [3] and [13] cannot be modeled as a constant gain due to its operation beyond the small linear range [see Fig. 1(c)] [14], [15]. One frequency-to-phase integrator is absorbed in the PD. The PI controller provides another integration pole apart from the pole due to the frequency-to-phase conversion of the DCO. The normalization of the DCO ensures that its transfer function (TF) is largely independent of process, voltage, and temperature (PVT). LSB is a unit step of the DCO tracking bank. The feedforward-path TF H_ff(s) can be expressed as
The feedback-path TF H_fb(s) is
Let us suppose that the DCO gain is estimated/normalized correctly [11]. Then, H_fb(s) is simplified to
For mathematical convenience, assuming that K_PD K_G = 1, the closed-loop TF can be simplified as
Two noise sources, φ_n,RX and φ_n,DCO, are also introduced in the s-domain model, where φ_n,RX represents the RX chain error source (e.g., thermal noise, flicker noise of the baseband, dc offset, and quantization error), and φ_n,DCO stands for the DCO phase error (i.e., phase drift, flicker, and thermal PN). Their TFs are also derived as
The pair of pole frequencies of (4)-(6) are at
Since (7) can be reduced to
Equations (4)-(6) have a common zero [z_1 = −(f_0 ρ/α)], roughly of the same value as p_1, which leads to a compensation of the pole at p_1. p_2 defines the bandwidth of the PT-RX loop. Since there is no zero at the origin, a flat low-pass characteristic of the signal TF (STF) is confirmed in Fig. 5(a). This low-pass characteristic of the STF ensures that the proposed PT-RX does not suffer from consecutive symbol patterns.
The loop responses for different values of ρ in the time domain are also shown in Fig. 6. As expected, higher values of ρ make tracking more rapid but also less robust.
At sufficiently high ρ, a tracking error happens due to the limited damping. In the time-domain simulations shown in Fig. 6, the loop latency is also included, which aggravates this scenario compared with the linear s-domain model in Fig. 5(b). The value of α is set so that the DCO tracking range is slightly larger than the peak deviation frequency, so as to decrease the settling time, while the ratio ρ/α is kept smaller than 1/100 to ensure enough damping.
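The pole/zero relationship described for the s-domain model can be illustrated with a generic type-II loop. The following is only a sketch under stated assumptions, not a reconstruction of the exact expressions of this work: the PI controller is assumed to have the common ADPLL form α + ρ/(1 − z^{-1}), K_PD K_G = 1, the DCO gain is assumed perfectly normalized, and H_ol and H_cl are notation introduced here for the open- and closed-loop TFs.

```latex
% PI controller, using z^{-1} \approx 1 - s/f_0 for |s| \ll f_0
\alpha + \frac{\rho}{1 - z^{-1}} \;\approx\; \alpha + \frac{\rho f_0}{s}
  = \alpha\,\frac{s + f_0\rho/\alpha}{s}
  \quad\Rightarrow\quad z_1 = -\frac{f_0\,\rho}{\alpha}

% Open loop: PI controller followed by the frequency-to-phase integrator
H_{ol}(s) = \left(\alpha + \frac{\rho f_0}{s}\right)\frac{f_0}{s}

% Closed loop
H_{cl}(s) = \frac{H_{ol}(s)}{1 + H_{ol}(s)}
  = \frac{\alpha f_0 s + \rho f_0^2}{s^2 + \alpha f_0 s + \rho f_0^2}

% Poles, and their approximation for \rho \ll \alpha^2
p_{1,2} = -\frac{f_0}{2}\left(\alpha \mp \sqrt{\alpha^2 - 4\rho}\right),
\qquad p_1 \approx -\frac{f_0\,\rho}{\alpha}, \quad p_2 \approx -\alpha f_0
```

Under these assumptions, the zero z_1 nearly cancels p_1, leaving p_2 ≈ −α f_0 to set the loop bandwidth, and H_cl(0) = 1 corresponds to a flat low-pass STF. A larger ρ moves the two poles closer together and eventually makes them complex, which is consistent with the reduced damping observed at high ρ.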
A 20-dB/decade suppression for the noise source φ_n,RX is achieved in the proposed PT-RX, which indicates the inherent suppression of the dc offset caused by mismatch or self-mixing.
where L_1MHz is the PN of the DCO at 1-MHz offset, and BW_Loop is the loop bandwidth. In this work, a DCO with a PN of −114.4 dBc/Hz at 1-MHz offset is designed in order to minimize the sensitivity degradation. By substituting the value of L_1MHz, (9) is reduced to
Then, the SNR of the RX due to the PN of the DCO can be expressed roughly as
To obtain a Bluetooth BER of 0.1%, an SNR of 11 dB is required for the PT-RX [13]. With 33 dB of SNR_PN in (11), the DCO phase noise (PN) does not significantly impact the sensitivity performance here.
To summarize, Fig. 5(a) shows the plots of the STF and the noise TF (NTF) in both type-I and type-II PT-RXs. In contrast to the type-II PT-RX, the low-frequency attenuation of the STF in the type-I PT-RX limits the number of repetitive symbols. Both of them have good RX noise suppression in terms of dc offset or flicker noise (of LNA and baseband circuits) and also provide 20-dB/decade suppression of DCO PN. However, the type-I PT-RX's high-pass corner p_1 cannot be made practically large enough to suppress the flicker noise of the DCO since a higher p_1 results in bit errors. In [13], p_1 ≈ 10 kHz and BW_1/f^3 ≈ 100 kHz. Considering the integrated 1/f^3 PN from 10 to 100 kHz, its SNR_PN is expressed as

III. CIRCUIT IMPLEMENTATION
In terms of circuit implementation, this work focuses on yielding competitive performance while operating at ULP and ULV. The power management unit (PMU) that supplies multiple voltage domains is an essential block in contemporary SoCs. Hence, providing multiple individually optimized supply levels is, nowadays, a common practice. In [17], a 0.18-V supply goes to an on-chip PMU that then powers the RX front end and baseband circuitry and provides overall biasing at higher voltage levels. Reference [18] demonstrates a 0.5-V ADPLL with a DCO operating directly at 0.5 V. However, its TDC is supplied at 1 V by an internal doubler. An ADPLL in [19] runs its DCO at 0.23 V and a doubler at 0.35 V to power the voltage-sensitive TDC. Reference [20] demonstrates a DT-RX directly supplied at 0.275 V via an sw-cap-based voltage doubler. In this work, the RF front end is optimized at 0.3 V and a DCO at 0.25 V. A 0.7-V supply is applied to mixed-signal circuits. No dc-dc converters are utilized. A detailed introduction of individual block implementation is given in the following.

A. ULV Two-Stage LNTA and Passive Mixer
A single-ended two-stage low-noise transconductance amplifier (LNTA) operating at 0.3 V is shown in Fig. 7(a). A cascode architecture is selected for the first stage to provide low NF and large gain. The bias voltages are optimized to obtain competitive linearity, which is the main concern in ULV designs. For an extra-low-V_T (ELVT) MOS (M_1−3) with a threshold voltage of ∼450 mV, suppose that V_CM,1 is biased at 510 mV; then, the overdrive voltage V_ov,1 = 60 mV, which is enough to tolerate a blocker of −14 dBm (BLE requires −30 dBm) without pushing M_1 into the triode region. Assuming that V_ov,1−2 = 60 mV, only a 120-mV supply voltage is required to maintain M_1−2 in saturation. The measured RX noise figure (NF) is 6.3 dB, and the LNTA gain (with TIA as its load) is 32 dB.
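The link between the 60-mV overdrive and the −14-dBm blocker level can be sanity-checked with a simple power-to-voltage conversion. This is a rough sketch only: it assumes a 50-Ω reference impedance and that the blocker appears at the gate of M_1 with no voltage gain or loss from the input matching network, neither of which is stated in the text; the function name is hypothetical.

```python
import math

def dbm_to_peak_volts(p_dbm: float, r_ohm: float = 50.0) -> float:
    """Peak voltage of a sinusoid delivering p_dbm into r_ohm."""
    p_watt = 1e-3 * 10 ** (p_dbm / 10)
    return math.sqrt(2 * p_watt * r_ohm)

V_OV = 0.060  # overdrive voltage quoted in the text, in volts

# BLE blocker requirement (-30 dBm) and the tolerated level (-14 dBm)
for p_dbm in (-30.0, -14.0):
    v_pk = dbm_to_peak_volts(p_dbm)
    print(f"{p_dbm:+.0f} dBm: {v_pk * 1e3:.1f} mV peak "
          f"({v_pk / V_OV:.2f} x V_ov)")
```

Under these assumptions, a −14-dBm blocker corresponds to roughly 63 mV peak, i.e., about the 60-mV overdrive, whereas the −30-dBm level required by BLE is only about 10 mV peak.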
A passive mixer is utilized due to its better linearity and lower power consumption. LO clocks with a convenient 50% duty cycle are used to translate the RF input signal since only one path is needed in the PT-RX. To achieve better linearity, a higher voltage VG_MX,P-N is applied to lower R_ON while maintaining a small parasitic capacitance of the passive mixer. Two grounded capacitors load the mixer to suppress the even-order distortion.
Common-mode feedback (CMFB) in Fig. 7(c) is implemented to stabilize the output common-mode voltage across PVT variations. Two current sources regulated by the error amplifier, which detects the averaged common-mode voltage from the main transistors, are used to sink/source current from/into V_OUT,N and V_OUT,P. M_5−8 are required to have a large length so as not to degrade the output impedance of the main amplifier significantly. Also, only a small portion of the current needs to be sunk/sourced from/into the main amplifier. Hence, M_5−6 and M_7−8 are sized with a large L but a small W/L of 1 μm/1 μm and 2 μm/1 μm, respectively. Furthermore, a body bias technique is applied as another tuning "knob" to combat PVT variations. The simulated output common-mode voltage across process and temperature is shown in Fig. 8. The amplifier can tolerate fast-fast (FF) and slow-slow (SS) corners without resorting to any adjustments of the body bias. For the FS and SF corners, adjusting the body bias brings the common-mode voltage back to its typical value. The ELVT transistor is used in the amplifier with a threshold voltage of ∼380 mV.

C. Seventh-Order IIR Discrete-Time Low-Pass Filter
Fig. 9(a) shows the seventh-order charge-rotating IIR DT LPF, which functions as both a baseband filter for attenuating interferers and an anti-aliasing filter prior to the ADC [21], [22] (detailed discussion is deferred to the ADC subsection). In each cycle at the first phase φ_1, a sampling capacitor C_S samples the stored charge from the history capacitor C_H1, which is charged by the G_m-cell. From φ_2 to φ_7, at each phase, C_S shares the residual charge with C_H2 to C_H7, respectively. After sharing with the last history capacitor C_H7, C_S is reset to virtual ground at φ_8, and then, a new cycle starts. With the aim of attenuating in-band interferers or other undesired components (e.g., the mixing harmonics), the DT LPF's bandwidth f_3dB must be set as low as possible. On the other hand, it must be large enough to keep the loop latency low. To achieve the best simultaneous performance of sensitivity and ACR, f_3dB is set to ∼1.3× the input signal bandwidth [13]. The targeted signal bandwidth is 500 kHz in BLE; hence, 700 kHz is set as the f_3dB bandwidth in this tradeoff. Due to the TF preciseness of the sw-cap circuitry, its bandwidth only depends on the sampling rate and the ratio of C_H and C_S [22]. An accurate 3-dB bandwidth could be expressed as [23]
where f_S represents the sampling frequency and α is a coefficient, with N representing the order of the DT filter (α ≈ 3 when N = 7). f_S is chosen as 128 MHz (∼8 ns) here in order to minimize the loop latency, and with the requirement of f_3dB ≈ 700 kHz, we obtain the values of C_S (0.2 pF) and C_H (3 pF). Thus, the oversampling frequency (f_OS) of the seventh-order DT LPF would be ∼1 GHz [= (N + 1) f_S]. It is generally not desired for ULP designs to operate at a sampling clock of ∼1 GHz; hence, a pipelining structure is utilized to divide down f_OS, as shown in Fig. 9(a). With the eight interleaved banks, f_OS is 8× lower at 128 MHz. Furthermore, a four-latch-based eight-phase waveform generator is implemented, as shown in Fig. 9(b), so as to further lower f_OS by 2×. This waveform generator takes advantage of both rising and falling edges so that only N/2 latches are required for the eight-phase waveform. Therefore, in this work, f_S could be expressed as
In addition, the waveform generator can start up by itself from noise by returning Q̄_4 rather than Q_4 [22], [23] to the input of Latch 1. Fig. 9(b) shows the four-latch-based waveform generator, with the schematic of the latch unit shown in Fig. 9(d). By passing Q_1 and Q_4 through the AND gate, followed by the single-to-differential clock buffer, the complementary phases φ_1 and φ̄_1 are generated [see Fig. 9(c)]. Fig. 10(a) and (b) shows the latch output waveforms (Q_1−4 and Q̄_1−4), as well as the desired eight-phase waveform of φ_1−8, respectively.
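Returning to the bandwidth coefficient of the DT LPF, the value α ≈ 3 for N = 7 is consistent with the usual bandwidth shrinkage of cascaded identical poles. The sketch below is a rough check under an assumption not taken from the source: each of the N stages is treated as an identical first-order (continuous-time-equivalent) low-pass section, for which the overall −3-dB corner is lower than the single-section corner by 1/sqrt(2^(1/N) − 1). It is not necessarily the exact expression of [23], and the function names are hypothetical.

```python
import math

def bandwidth_shrink_factor(n: int) -> float:
    """Ratio of the single-section corner to the overall -3 dB corner for
    n identical cascaded first-order low-pass sections."""
    return 1.0 / math.sqrt(2.0 ** (1.0 / n) - 1.0)

def cascade_gain_db(f_hz: float, f_pole_hz: float, n: int) -> float:
    """Magnitude (dB) of n identical cascaded first-order sections."""
    return -10.0 * n * math.log10(1.0 + (f_hz / f_pole_hz) ** 2)

N = 7
factor = bandwidth_shrink_factor(N)
f_pole = 1.0  # normalized single-section corner frequency

print(f"N = {N}: shrink factor = {factor:.3f}")
# The cascade should be 3 dB down at f_pole / factor
print(f"gain at f_pole/factor = "
      f"{cascade_gain_db(f_pole / factor, f_pole, N):.3f} dB")
```

For N = 7, the factor evaluates to about 3.1, in line with the α ≈ 3 quoted above.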
In this work, the differential value of C S ranges from 64 to 445 fF, digitally selected by 3-bit binary switches. Capacitors C H1-H7 range from 0.6 to 8.5 pF differentially. Single unit

D. SAR-ADC
A 10-b SAR ADC is implemented in this work with a margin of 2 bits, and its simplified diagram is shown in Fig. 11(a). A split binary-weighted capacitive DAC (apart from the LSB capacitors C_0, which are not split) is utilized with a fringe-capacitor-based unit of 1 fF. In order to obtain better linearity and speed (low R_ON), the input sampling switches are realized with a bootstrap structure. The comparator is based on a simple dynamic latch stage, while the SAR logic consists of only TSPC flip-flops and gates. The output data stream D_0−9 is re-sampled by the LSB compare-ready signal (RDY) and fed into the digital PI block as an input.
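The 2-bit margin can be put in decibel terms with the textbook quantization-noise relation. This is a minimal sketch assuming an ideal converter driven by a full-scale sine wave (SQNR = 6.02 N + 1.76 dB); it ignores comparator noise, DAC mismatch, and oversampling gain, so it does not represent the effective resolution of the implemented ADC.

```python
def ideal_sqnr_db(bits: int) -> float:
    """Ideal quantization-limited SNR of a full-scale sine, in dB."""
    return 6.02 * bits + 1.76

required_bits = 8      # minimum resolution derived from the ACR target
implemented_bits = 10  # resolution of the SAR ADC in this work

margin_db = ideal_sqnr_db(implemented_bits) - ideal_sqnr_db(required_bits)
print(f"{required_bits} b: {ideal_sqnr_db(required_bits):.2f} dB")
print(f"{implemented_bits} b: {ideal_sqnr_db(implemented_bits):.2f} dB")
print(f"margin: {margin_db:.2f} dB")
```

In this idealized view, each extra bit adds about 6 dB, so the 2-bit margin corresponds to roughly 12 dB of additional quantization SNR.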
An anti-aliasing filter is typically indispensable for a Nyquist ADC. An active filter would conventionally be used for anti-aliasing but at a cost of power drain and the need for calibration [see Fig. 11(b)]. Reference [24] presents a concept of integrating an FIR filter with a SAR ADC to remove the active filter. In this work, as mentioned earlier, a 128-MS/s IIR DT LPF is implemented with lower complexity and delay compared with an FIR filter. Fig. 11(c) shows the proposed anti-aliasing scheme. The DT LPF is chosen to operate at a higher sampling rate than the ADC in order to reduce the clock delay (∼8 ns for 128 MHz) introduced into the PT-RX loop. As a result of the tradeoff between power and delay, this SAR ADC operates at 25 MHz (i.e., 10× faster than the total loop delay) with a 25× oversampling rate (for 1-Mb/s BLE). With the inherent sinc function of the DT LPF and the 25× oversampled ADC, this RX omits the active filter, so as to achieve better power efficiency without sacrificing performance.

E. DCO
Fig. 12 shows a DCO with low 1/f^3 noise in which all the area-hungry switched-capacitor banks and the cross-coupled pair are vertically integrated inside the transformer coils. A version of this DCO was published in [25]. It uses a supply voltage of 0.3 V and produces a PN of −119 dBc/Hz at 1-MHz offset. However, as studied in Section II, the PN of the DCO only negligibly affects the sensitivity of the proposed PT-RX. Therefore, rather than striving for an improved PN performance at a cost of increased power consumption, this work implements a 0.25-V DCO with a PN of −114 dBc/Hz at 1-MHz offset.
A simplified schematic is shown in Fig. 12(a) with a cross-coupled pair and a 2:3 transformer. Compared with the conventional 1:2 transformer in [1], this 2:3 transformer achieves a higher k_m of 0.82, which provides enough gain to afford the ULV operation. Fig. 12(b) shows the vertical integration of the active devices used in the DCO.

IV. EXPERIMENTAL RESULTS
Fig. 13(a) shows the chip micrograph of the proposed PT-RX, which is fabricated in TSMC 28-nm LP CMOS technology. The core area occupies 0.69 mm². This chip is aimed at the BLE standard; thus, it is measured in the 2.402-2.480-GHz frequency band with 1-MHz channel spacing and 1-Mb/s symbol rate. The initial DCO frequency is set close to the carrier frequency via SPI controlling the medium and coarse banks of the DCO [see Fig. 12(a)]. This way, a reasonable residual frequency error between the DCO and carrier can be tolerated due to the AFC capture range of ±2.5 MHz. Fig. 13(b) shows the power breakdown of the RX, which consumes 1.5 mW in total at maximum gain. All the blocks are supplied by 0.7 V, apart from the DCO (0.25 V) and RF front end (0.3 V). The DCO and RF front end draw the majority (62%) of the budget in order to ensure good RX sensitivity. The inverter-based TIA and G_m-stage consume 23% of the RX budget, so as to provide enough signal swing for the ADC to digitize the phase error. Due to the well-known power efficiency of the sw-cap circuitry, only 10% of the power goes to mixed-signal blocks, including the seventh-order DT LPF and 10-b SAR ADC, as well as to the digital PI controller. Since the sidelobe energy is mitigated by the ADC compared with the comparator-based type-I PT-RX, an aggressive digital filter is not required for the digital PI block, which helps to reduce its power to only 56 μW.
The measured data points are superimposed on the simulated TF of the DT LPF in Fig. 14. It is evident that the obtained DT-LPF TF is precise due to the PVT-independent characteristic of sw-cap circuitry. Hence, no calibration is needed there.
The proposed PT-RX meets the BLE specifications of both interferer and blocker rejection. Fig. 15 shows ACR performance. In the BLE's physical layer specification, the ACR performance requirement is defined at a frequency spacing of 1 MHz for a symbol rate of 1 Mb/s and at 2-MHz spacing for 2 Mb/s. Therefore, in this work, the ACR performance is measured at 0/±1/±2/±3 MHz away from the desired signal. Due to the ADC-based type-II solution (which provides robust locking loop and improves the worsened sidelobe level), compared with the prior-generation of PT-RXs [2], [3], our ACR at the 1-MHz offset has an 8.5-dB improvement. At the 2-MHz offset, a 1-dB improvement of ACR is achieved compared with [2] with similar power consumption and 1 dB worse than [3] that consumes nearly 50% more power. For the ACR of 3-MHz offset, this work is 4 dB better than [2] and 2.5 dB worse than [3]. For the co-channel and first channel offset, it can tolerate 10 and 18 dB higher interferer than the BLE specification, respectively. At 2 and 3 MHz away, the performance is limited by the deterministic loop latency and DCO pulling of direct conversion RX. Halving [9] or doubling LO frequency could be a method of mitigating the DCO pulling. Simply increasing the distance between the coils of DCO and LNA (>600 μm in [3]) is another way to address this issue without an extra cost from the power-hungry divider or DCO with higher oscillation frequency.
In Fig. 16, the out-of-band (OOB) rejection is measured from 1 to 3.5 GHz with the desired signal at 2.44 GHz, and it satisfies the requirements of the BLE standard with at least 15-dB margin. It also has a >10-dB advantage compared with [2] at 2-2.5 GHz. For the far-out frequency band, the proposed RX is saturated at up to 0 dBm. Reference [3] has a higher saturation point on the low-frequency side due to its higher supply (0.8 V) of the front end. At the upper end frequency band, the OOB rejection of both works is similar.
In Fig. 17(a), the measured results of the ADC output and recovered symbol samples (OTW of DCO) are demonstrated for a scenario of 128 consecutive "1" and then "0" bits as input symbols. In addition, an HMO3054 oscilloscope is used to observe the DT LPF's output. Fig. 17(b) and (c) shows the case in which the input data stream is 16 bits of "1" and 16 bits of "0" after the 8-bit preamble data "10101010". The case with the input symbols comprising 128 bits of consecutive "1" and "0" is presented in Fig. 17(d) and (e). As mentioned earlier, this type-II PT-RX tracks the transitions between symbols. In particular, the positive rising part of the DT LPF's output, as shown in Fig. 17(c), represents a transition from symbol "0" to "1". On the other hand, when the symbol transitions from "1" to "0", a negative falling part can be observed at the output of the DT LPF. After the transition, the observed phase error (φ_LPF in Fig. 1), as expected, returns to 0 in Fig. 17, and the type-II loop is locked robustly.
Fig. 18(a) demonstrates the measured S_11 across various configurations. The 3-bit binary switches for both C_gs and C_Pad cover the BLE frequency band and PVT variations. The solid and dashed circle lines present the programmability of C_gs and C_Pad, respectively. Fig. 18(b) demonstrates the measured NF and RX gain across the supply voltage of the RF front end. As expected, the RX gain proportionately follows the supply voltage because g_m is proportional to the supply. With the supply ranging from 0.25 to 0.35 V, there is a 2.5-dB increment in the RX open-loop gain. Across two different chips, both RX gain and NF vary by less than 0.8 dB at each supply level. Fig. 18(c) shows the plots of the bit error rate (BER) of GFSK-modulated pseudorandom data across the RF input power. The RF input level of −93 dBm corresponds to a BER of 0.1% (i.e., the standard Bluetooth requirement).
For the PN measurements in Fig. 19, the PT-RX is configured in open loop. The free-running DCO consumes 487 μW at 0.25 V. The PN referred to a 2.48-GHz carrier is −91 and −114 dBc/Hz at 100-kHz and 1-MHz offsets, respectively. The 1/f^3 corner is measured as 180 kHz.
Table I summarizes the PT-RX performance and compares it with state-of-the-art ULP RXs operating in the 2.4-GHz ISM band. A figure of merit (FoM) for low-power RXs was defined in [26] as
FoM = −S_RX − 10 log_10(P_RX/R)    (14)
where S_RX and P_RX stand for the sensitivity and dc power of the RX, respectively, and R represents the symbol rate. However, the digital baseband (DBB) power, which is an essential part of transceivers, was not included in the FoM of [26]. Therefore, the FoM is re-defined with P_RX ← P_RX+DBB in order to account for the DBB contribution. Our PT-RX achieves the best BLE sensitivity FoM among the listed RXs. Compared with the conventional analog-intensive ULP RXs [5], [27], this work burns ∼4× less power at similar sensitivity. Compared with the DT-RX in [1], the power budget is reduced by nearly 50% due to the single-path architecture. Reference [20] operates at 0.275 V via an external voltage doubler and achieves 1-mW power consumption but with a 7-dB degradation of FoM compared with our work. Reference [7] has better selectivity due to an enhanced dynamic range of its DPLL-based ADC by means of a feedback DAC. However, this is at a cost of more than two-thirds of extra power. Besides, it suffers from image issues and vulnerability to RF carrier-frequency offset. The proposed RX reduces power consumption by nearly 50% compared with the prior-generation PT-RX in [3] and does it without dramatically degrading its sensitivity or interferer rejection as in [2]. The DBB circuitry in this chip can be simplified due to the presence of the ULP SAR-ADC, in contrast to the type-I PT-RXs in [2] and [3], which employ only a 1-bit comparator.
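Returning to the PN measurement of Fig. 19, the three reported numbers (−114 dBc/Hz at 1 MHz, −91 dBc/Hz at 100 kHz, and a 180-kHz 1/f^3 corner) can be cross-checked against each other. The sketch below assumes a simple asymptotic PN profile, −20 dB/decade above the corner and −30 dB/decade below it, anchored at the 1-MHz point; this piecewise model and the function name are illustrative assumptions, not the measurement procedure of this work.

```python
import math

def dco_pn_dbc_hz(f_offset_hz: float,
                  pn_ref_dbc_hz: float = -114.0,
                  f_ref_hz: float = 1e6,
                  f_corner_hz: float = 180e3) -> float:
    """Asymptotic PN: -20 dB/dec above the 1/f^3 corner, -30 dB/dec below.
    Assumes the reference offset f_ref_hz lies at or above the corner."""
    if f_offset_hz >= f_corner_hz:
        return pn_ref_dbc_hz + 20.0 * math.log10(f_ref_hz / f_offset_hz)
    pn_corner = pn_ref_dbc_hz + 20.0 * math.log10(f_ref_hz / f_corner_hz)
    return pn_corner + 30.0 * math.log10(f_corner_hz / f_offset_hz)

for f in (100e3, 180e3, 1e6):
    print(f"{f / 1e3:6.0f} kHz: {dco_pn_dbc_hz(f):7.2f} dBc/Hz")
```

Under this model, the 100-kHz value comes out near −91.4 dBc/Hz, in line with the measured −91 dBc/Hz.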
In the measurements, an off-chip digital control block, which is identical to the on-chip PI controller consuming only 56 μW, and a moving-average filter with a window size of up to 24 as a post-processing filter are applied for symbol recovery. Most significantly, unlike the type-I PT-RXs in [2] and [3], which do not support long runs of consecutive symbols and lock unstably, the proposed type-II PT-RX solves these issues fundamentally.
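The FoM in (14) can be evaluated directly from the reported numbers. The following sketch assumes that P_RX is expressed in watts and R in bit/s, a unit convention that is not stated explicitly in the text but reproduces the quoted 181 dB. It is also an assumption that the 56-μW digital power may be added on top of the 1.5-mW total; the second result is therefore a conservative bound rather than a reported value.

```python
import math

def rx_fom_db(sens_dbm: float, power_w: float, rate_bps: float) -> float:
    """FoM = -S_RX - 10*log10(P_RX / R), with P_RX in watts and R in bit/s."""
    return -sens_dbm - 10.0 * math.log10(power_w / rate_bps)

SENS_DBM = -93.0   # 1-Mb/s BLE sensitivity
P_RX_W = 1.5e-3    # RX power
P_DBB_W = 56e-6    # digital PI controller power
RATE_BPS = 1e6     # symbol rate

fom_rx = rx_fom_db(SENS_DBM, P_RX_W, RATE_BPS)
fom_rx_dbb = rx_fom_db(SENS_DBM, P_RX_W + P_DBB_W, RATE_BPS)
print(f"FoM (RX only)  = {fom_rx:.1f} dB")
print(f"FoM (RX + DBB) = {fom_rx_dbb:.1f} dB")
```

Both cases round to 181 dB, so the small digital contribution does not change the headline figure.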

V. CONCLUSION
The current PT-RXs feature low power consumption and low occupied area but suffer from two main issues of their type-I configuration: 1) degraded stability of the locking loop and 2) constrained data pattern. This article proposes a new type-II PT-RX to address them architecturally. A multi-level ADC is employed to provide the necessary stable locking state so that a digital integrator can zero out the residual phase error. Furthermore, the worsened DCO sidelobe issue in the prior art is also mitigated by the ADC. As a result, the new PT-RX maintains good ACR performance while achieving the best-in-class sensitivity FoM of 181 dB due to the avoidance of power-hungry blocks (i.e., RF frequency synthesizer and hybrid-loop filter) with a type-II loop configuration.