A Compact CMOS Broadband Bidirectional Digital Transceiver Frontend With Capacitor Bank and Transformer Matching Network Reuse

This article presents a fully integrated bidirectional class-G digital Doherty switched-capacitor transmitter (TX) and N-path quadrature receiver (RX) in CMOS. By sharing the on-chip capacitor banks, which typically occupy a major portion of the digital TX or RX chip area, as well as the RF passive matching networks, the overall size can be radically reduced. Moreover, the overall performance can be further improved by eliminating the need for an integrated T/RX switch and its corresponding loss and area overhead. Class-G operation is used within the Doherty TX to increase the output power and back-off efficiency, while the capacitive stacking technique is used in the RX to increase the voltage gain. A transformer network presents the optimum impedance for both the parallel Doherty TX and RX modes of operation, as well as for the class-G Doherty active load modulation. As a proof of concept, the joint bidirectional class-G digital Doherty switched-capacitor TX and N-path quadrature RX with capacitor bank sharing is implemented in a 45-nm CMOS SOI process. The TX demonstrates a $P_{\mathrm{out}}$ 1-dB bandwidth (BW) of 1.6-3.1 GHz, a fractional BW >63%, and a peak output power ($P_{\mathrm{out}}$) of 22.5 dBm at 2.4 GHz. The peak drain efficiency (DE) of the TX is 49.5% at 1.8 GHz, and the DE is 41.5%/38.7%/31.6%/18.1% at peak/2.5/6/12 dB power back-off (PBO) at 2.4 GHz. The DE improvement compared to a class-B PA is $1.24\times/1.51\times/1.72\times$ at 2.5/6/12 dB PBO. The TX is measured using 64-QAM/20 MHz modulation without AM-PM pre-distortion or pattern-based DPD. It achieves −27.1 dB EVM, −31.31 dBc ACLR, 14.6 dBm average $P_{\mathrm{out}}$, and 25.8% average DE at 1.6 GHz. The RX achieves a noise figure (NF) of 7.6 dB at 2.2 GHz and a conversion gain of 17 dB with a 12 MHz bandwidth. In addition, the proposed RX front-end achieves $<-60$ dBm LO leakage over the operating frequency range.


I. INTRODUCTION
Spectrally efficient complex modulation schemes are widely employed to support the exponential growth in data traffic. However, they place stringent requirements on RF electronic frontends, including strict linearity due to the high peak-to-average power ratio (PAPR). A large PAPR also results in reduced energy efficiency for power amplifiers (PAs), as the PAs often need to operate in power back-off (PBO). This poses major challenges for traditional analog RF transceiver designs. In parallel, with the advance and maturity of electronic frontend technology at RF frequencies, it is increasingly important to explore ultra-compact RF transceiver (TRX) frontend solutions that support low-cost, small form-factor, high-volume applications, such as IoT and portable mobile devices. Most existing analog TRX systems consist of chains of separate TX/RX functional blocks (Fig. 1) with corresponding inter-stage matching/filtering networks to ensure performance, making further area reduction difficult. In contrast, digital TRXs realize multiple block functionalities within a single block to drastically reduce the overall TRX complexity and overhead. For example, the digital TX combines the digital-to-analog converter (DAC), mixer, filters, and PA [1], and thus offers excellent RF performance within a compact area. In addition, the digital TRX is compatible with aggressive CMOS process scaling and supports extensive reconfigurability and wideband operation for multiband multi-standard radios. Therefore, there has recently been a surge of interest in high-performance digital TRX frontends.
The associate editor coordinating the review of this manuscript and approving it for publication was Mauro Fadda.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Digital PAs (DPAs), a key building block within the digital TX, have been studied extensively [2], [3], [4] and can be broadly divided into two categories. One is the voltage-mode DPA, such as the switched-capacitor PA (SCPA), which achieves good linearity and efficiency but suffers from limited output power, since its output voltage swing is lower than the supply voltage [5], [6], [7], [8]. The alternative is the current-mode DPA, such as the class-D$^{-1}$ PA, which realizes higher output power because its output voltage swing can exceed its supply voltage, but may exhibit large nonlinearities due to distortion [9], [10], [11], [12], except for some recent designs with AM-PM compensation [13]. Due to the increased PAPR in complex modulation schemes, PAs with PBO efficiency enhancement have become essential to ensure high overall system efficiency [14]. Examples include class-G PAs [16], [17], [18] with digital supply modulation, and Doherty PAs [19], [20], [21], [22], [23] and outphasing PAs [24], [25], [26], [27] with active load modulation. A fully integrated digital Doherty PA was first demonstrated in [15]. Other solutions include the Sub-Harmonic Switching (SHS) PA [28], [29], [30] and the Switched/Floated Capacitor Power Amplifier (SFCPA) [31], [32] with tri-state floating capacitors. Recent designs also demonstrate hybrid use of voltage- and current-mode PAs in digital Doherty PAs for balanced linearity and efficiency [33]. In parallel, mixer-first RX architectures have also attracted extensive research interest for wideband, compact, and high-dynamic-range radios [34], [35], [36]. In particular, N-path mixer-first receivers are a popular choice due to their inherently high linearity, sharp and tunable front-end filtering, and wide range of carrier frequencies [37], [38], [39], [40], [41], [42], [43], [44].
The N-path mixer-first RXs also potentially consume low power due to the removal of the LNA and are area efficient due to their inherent reconfigurable bandpass filtering, removing the need for bulky and narrowband SAW filters. Recent designs also show that mixer-first RXs can be extended to wideband high mm-Wave frequencies [45], [46], [47].
While RF digital TRX frontends naturally offer excellent RF performance, reconfigurability, and wideband operation, they commonly rely on binary and/or unary arrays of sliced active and passive devices, resulting in a substantial area overhead compared to their analog RF counterparts. Among digital TXs, the current-mode type requires a large array of class-D$^{-1}$ RF current sources, while the voltage-mode switched-capacitor type requires a large capacitor bank with scaled switches. In both cases, the sliced PA arrays require large binary/unary array configurations to ensure a high effective number of bits (ENOB), e.g., 8 to 14 bits [48], and a high TX dynamic range. Although mixed-signal PAs with hybrid analog/digital configurations substantially reduce the ENOB requirements and achieve super-resolution, they require separate analog paths and are not fully digital in nature [49]. On the other hand, N-path mixer-first RXs also require large differential capacitor banks, which scale with the quadrature operation and the number of paths. In summary, although digital RF TRXs merge multiple functional blocks, their practical implementations often occupy large chip areas, fundamentally because of their sliced, digitized operation.
In this paper, we propose a compact CMOS broadband bidirectional digital TRX frontend with capacitor bank and transformer matching network reuse. The bidirectional digital TRX functions as a class-G digital Doherty switched-capacitor TX in its TX mode and as an N-path quadrature differential RX in its RX mode [50]. To radically save chip area, we exploit extensive sharing of the switched capacitor banks and the TRX-antenna transformer matching networks between the TX and RX modes, combined with the inherent configurability of the digital TRX. Moreover, the bidirectional digital TRX front-end eliminates the need for an integrated T/RX switch, and hence the area overhead and loss/linearity constraints of that switch. This paper is organized as follows. Section II presents the proposed bidirectional digital TRX architecture, its operation, and the theoretical analysis of the digital class-G Doherty switched-capacitor TX. Section III describes the detailed circuit implementation. Section IV presents the measurement results and a comparison with state-of-the-art works. Section V concludes this paper.

Figure 2 shows the architecture of a 2-way digital TX, an N-path mixer-first receiver, and the proposed bidirectional digital transceiver. The 2-way digital TX can operate as a 2-way digital Doherty TX with parallel power combining. The proposed bidirectional digital TRX merges a 2-way switched-capacitor digital TX and an N-path mixer-first receiver through passive reuse. The transformer-based TRX-antenna interface matching network is shared by the TX and RX operations. As the on-chip transformer antenna-interface matching network is often the most area-intensive passive component in RF TRX frontends, this reuse results in a major area saving. In addition, the capacitor banks ($C_u$), which often occupy a large chip area in the switched-capacitor PA and the N-path switched-capacitor RX, are also shared between the TX and RX modes in our bidirectional TRX architecture.
Finally, our proposed TRX architecture eliminates the need for an integrated T/RX switch. This reduces antenna-interface loss/linearity constraints and further saves chip area, which improves the overall TRX system performance, such as NF, output power, linearity, and energy efficiency.

A. DIGITAL TRANSMITTER OPERATION MODE
In TX mode, the proposed digital TRX frontend consists of two PA paths combined through a parallel transformer-based power combiner. The two PA paths serve as the Main and auxiliary (Aux) paths, realizing a 2-way parallel Doherty PA configuration [51]. A class-G 2-level supply modulator is added to the 2-way switched-capacitor Doherty PA to boost the back-off efficiency (Fig. 3). The TX mode is divided into two operations, the VDD mode and the 2VDD mode, depending on the class-G operation and the output power level. In TX mode, the Main path operates exactly as a class-G PA, and the Aux-path transistors $M_{17}$ to $M_{19}$, which are not used for TX operation, are turned off, so the Aux path also acts identically to a class-G PA. The transistor $M_{18}$ is added to prevent breakdown when the TX operates in 2VDD mode. We adopt the natural-supply-transition class-G architecture [16] to minimize phase discontinuity. The proposed TX operation is as follows. As the output power increases, the Main switched-capacitor PA cells turn on sequentially until 12 dB PBO. Once the Main path is fully turned on, the Aux cells gradually turn on, which improves the efficiency at 6 dB PBO through Doherty load-pull operation. After the Main and Aux PAs are fully switched on in VDD mode, the Main PA cells are gradually switched from VDD to 2VDD mode, which improves the efficiency at 2.5 dB PBO. The Aux PA cells are then gradually switched to 2VDD mode as well. The complete turn-on sequence is detailed in Fig. 4 (a)-(c)-(d)-(e). An alternative turn-on sequence is the following. All the cells of the Main path are sequentially turned on in VDD mode (up to 12 dB PBO) and then sequentially switched to 2VDD mode (up to 6 dB PBO). The Aux cells are then gradually turned on in VDD mode (up to 2.5 dB PBO) and sequentially switched to 2VDD mode (Fig. 4 (a)-(b)-(d)-(e)).
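The back-off levels quoted for the two turn-on sequences can be checked with a short sketch. It assumes, purely for illustration, that each path contributes an output amplitude proportional to its supply level (0 when off, 1 in VDD mode, 2 in 2VDD mode) and that the Main and Aux banks are equal in size. The mapping of the corner labels (a) to (e) onto Main/Aux states is inferred from the description above, and the names `LEVEL`, `CORNERS`, `backoff_db`, and `sequence_backoffs` are hypothetical.

```python
import math

# Assumption: amplitude contribution per path is proportional to its supply level.
LEVEL = {"off": 0, "VDD": 1, "2VDD": 2}

# Corner states as (Main state, Aux state); labels inferred from the text.
CORNERS = {
    "a": ("VDD", "off"),    # Main fully on in VDD mode
    "b": ("2VDD", "off"),   # Main switched to 2VDD, Aux still off
    "c": ("VDD", "VDD"),    # Main and Aux fully on in VDD mode
    "d": ("2VDD", "VDD"),   # Main in 2VDD mode, Aux in VDD mode
    "e": ("2VDD", "2VDD"),  # peak power, both paths in 2VDD mode
}


def backoff_db(corner):
    """Power back-off in dB relative to corner (e), under the ideal model."""
    main, aux = CORNERS[corner]
    amplitude = LEVEL[main] + LEVEL[aux]
    full_scale = 2 * LEVEL["2VDD"]
    return 20.0 * math.log10(full_scale / amplitude)


def sequence_backoffs(sequence):
    """Back-off (dB, one decimal) at each corner of a sequence like 'a-c-d-e'."""
    return [round(backoff_db(c), 1) for c in sequence.split("-")]


print(sequence_backoffs("a-c-d-e"))  # proposed sequence
print(sequence_backoffs("a-b-d-e"))  # alternative sequence
```

Under these assumptions both sequences pass through the same 12, 6, 2.5, and 0 dB back-off points, so they differ only in which path is active at each point, which is what the loss comparison between the sequences addresses.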
We explain in detail below the differences between the two turn-on sequences and how they work within the proposed structure. The major performance metrics $V_{out}$, $P_{out}$, $Q_{load}$, and $\eta$ (DE) of the differential class-G SCPA in Fig. 4 follow the expressions in [17], where $m$, $n$, and $Q_{load}$ are the fraction of unit cells operating in 2VDD mode, the fraction of unit cells operating in VDD mode, and the loaded quality factor of the output matching network (OMN), respectively. The subscripts M and A indicate the Main and Aux paths, respectively. The power dissipation of the capacitive divider ($P_{cd}$) and the switching loss ($P_{sw}$) are defined as in [5]. When the losses are expressed relative to the power delivered to the output matching network, $P_{cd}$ is the same regardless of the turn-on sequence, whereas $P_{sw}$ differs between the sequences.
Here, $R_{on}$ and $f$ denote the switch on-resistance and the operating frequency, respectively. Letting $A = \frac{\pi}{4}\,\frac{R_L}{R_{on}}\,\frac{f}{f_{sw}}$, the switching loss relative to the power delivered to the output matching network at 6 dB back-off depends on the sequence in Fig. 4. Figure 5 compares the normalized efficiency, load impedance, and normalized losses for sequence (a-b-d-e) and sequence (a-c-d-e), assuming that the other losses (insertion loss and conduction loss) are the same. Based on these results, the TX uses sequence (a-c-d-e) to achieve the best PBO efficiency.

B. MIXER FIRST RECEIVER OPERATION MODE
The overall operation of the mixer-first RX and its equivalent circuit are highlighted in Fig. 3 and Fig. 6 (a)-(b). During RX-mode operation, VDD is connected to the gate of $M_1$ ($V_{\mathrm{ctl}}\langle 1 \rangle$ in Fig. 3) of the Main path, and the Main path is shorted to GND through $M_1$, resulting in a low impedance. Since the TX cells are turned off, this is equivalent to the TX operating in the very-low-power region. Therefore, due to the Doherty operation of the OMN, the impedance presented to the Aux path increases, enhancing the OOB linearity and voltage gain. In the Aux path, the transistors associated with TX operation ($M_{11}$ to $M_{16}$) are turned off and floated, as detailed in Fig. 3. The TX path therefore has minimal impact on the RX operation, and the RX can operate properly without an additional switch in the Main PA path.
In the RX mode, the proposed circuit performs bottom plate sampling to achieve better linearity by reducing the modulated on-resistance [52]. The RX operation in our proposed architecture adopts the capacitive stacking technique to achieve additional voltage gain for enhanced sensitivity [53].
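As a rough numerical illustration of why stacking two capacitors that sample opposite LO phases yields about 6 dB of extra voltage gain, the sketch below uses an idealized model: a unit-amplitude input at the LO frequency, 25% duty-cycle sampling windows, and capacitors that settle to the average of the input over their on-window. The function name `window_average`, the 30-degree input phase, and the variable names are hypothetical choices for this example and do not come from the measured design.

```python
import math


def window_average(rf_phase_deg, lo_phase_deg, steps=10000):
    """Average of cos(theta + rf_phase) over one 25% duty-cycle LO window.

    In the ideal model this average is the voltage the sampling capacitor
    settles to after many cycles when f_in equals f_LO.
    """
    start = math.radians(lo_phase_deg)
    width = math.pi / 2  # a 25% duty cycle spans a quarter of the LO period
    rf_phase = math.radians(rf_phase_deg)
    total = 0.0
    for i in range(steps):
        theta = start + (i + 0.5) * width / steps  # midpoint rule
        total += math.cos(theta + rf_phase)
    return total / steps


# Two capacitors sampling on LO phases 180 degrees apart (illustrative input phase).
v_cap_0 = window_average(30.0, 0.0)
v_cap_180 = window_average(30.0, 180.0)

# Stacking the antiphase capacitors in series adds their magnitudes.
v_stacked = v_cap_0 - v_cap_180
gain_db = 20.0 * math.log10(abs(v_stacked) / abs(v_cap_0))

print(f"stacking gain = {gain_db:.2f} dB")
```

In this ideal model the two sampled voltages are equal in magnitude and opposite in sign, so the stacked voltage is twice the single-capacitor voltage, about 6.02 dB. Switch resistance, parasitic capacitance, and charge sharing with the read-out capacitor would reduce this in a real circuit.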
We now detail the conceptual operation of the proposed receiver cell. To ease the analysis, we assume that $V_{RF}$ is a single sinusoidal input, and we consider the behavior of a bottom-plate N-path mixer-first receiver with a resistor and capacitors, as shown in Fig. 6. The switches are ideal and have negligible on-resistance. We assume that the RC time constant is much larger than the on-time ($T_{on}$) of the switch. All the switches are driven by 4-phase non-overlapping 25% duty-cycle clocks provided by a divide-by-2 circuit. As the single-ended RX input is converted to a differential signal at the N-path mixer through the transformer-based balun, the output is differential quadrature as well. The capacitor $C_{ua}$, which is connected to the bottom plate, is connected to the read-out capacitor ($C_b$) through the switch $M_{18}$. After a large number of switching cycles, each capacitor is assumed to have stored the average value of the input signal seen during its on-time. Applying the 180-degree LO to $M_{20}$ turns it on and turns off $M_{17}$, which operates on a different phase. Since this structure uses a 4-phase clock for quadrature reception, the voltage charged on $C_{ua}$ and the voltage charged on $C_{ub}$ are antiphase. Therefore, the read-out capacitor $C_b$ obtains the combined voltage of $C_{ua}$ and $C_{ub}$. The voltage on each $C_u$ is the down-converted baseband voltage stored in the capacitor, because $f_{in} = f_{LO}$. This results in a 6-dB voltage gain, which occurs similarly at 0°, 90°, and 270°. For these in-band signals, the baseband (BB) current is converted to a BB voltage $V_{BB}$ via transimpedance amplifiers (TIAs) [54].

Figure 7 shows the top-level schematic of the proposed bidirectional digital transceiver with a single-transformer-footprint parallel Doherty output matching network.
The proposed structure consists of a parallel Doherty OMN, a conventional class-G DPA, a newly proposed T/RX class-G PA, a phase-modulated (PM) digital TX input driver, an RX 4-phase non-overlapping clock generator, an 8-bit AM driver, and an RX op-amp. In TX mode, the Main and Aux paths are controlled by an 8-bit AM code. One bit enables the RX mode; when it is set, all TX-related circuits are disabled, as shown in Fig. 3. The proposed transceiver is implemented in a 45-nm SOI CMOS process with a chip area of 1.98 mm × 2.57 mm (including pads). The IC is wire-bonded to a PCB that provides the DC and control inputs, and the RF ports are probed. Figure 8 shows the schematics and layouts in detail. Figure 8 (a) shows the schematic and layout of the conventional class-G unit cell, which occupies a total area of 623 µm². Within the cell, the capacitor dominates, accounting for 47% of the core area of the class-G switched-capacitor PA unit cell. Figure 8 (b) shows the schematic and layout of the proposed bidirectional T/RX cell. The shared capacitor has the same value, and the layout grows to approximately 740 µm², primarily due to the additional read-out capacitor ($C_b$) required for the receiver. By sharing the capacitors that occupy the largest area in the SCPA and the N-path mixer-first receiver, we add the RX functionality with an area overhead of only 19%. The Aux TX path is the one extended and shared with the RX mode, because the Doherty Aux path typically has a lower impact on the overall TX gain and linearity than the Main path [49]. Figure 9 shows the simulated TX results for the conventional class-G structure and the newly proposed T/RX class-G structure. As seen in Fig. 9, the results of the proposed T/RX structure are almost the same as those of the conventional class-G structure.
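The 19% figure follows directly from the two unit-cell areas quoted above. The short check below reproduces the arithmetic and also estimates the capacitor area implied by the 47% share; the variable names are hypothetical, and the capacitor estimate assumes the 47% applies to the full 623 µm² cell.

```python
# Unit-cell areas quoted in the text, in square micrometres.
conventional_cell_um2 = 623.0  # conventional class-G unit cell
trx_cell_um2 = 740.0           # proposed bidirectional T/RX unit cell (approximate)

# Relative area added by the RX functionality.
overhead = (trx_cell_um2 - conventional_cell_um2) / conventional_cell_um2

# Assumption: the 47% capacitor share is taken over the whole conventional cell.
capacitor_um2 = 0.47 * conventional_cell_um2

print(f"area overhead = {overhead:.1%}")
print(f"estimated shared capacitor area = {capacitor_um2:.0f} um^2")
```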

B. OUTPUT MATCHING NETWORK
The 3-D EM model of the single-footprint output network and its simulated passive efficiency versus frequency are shown in Fig. 10. This OMN supports Doherty operation and parallel power combining. The OMN occupies an area of 0.212 mm². It achieves wideband operation from 1.2 GHz to 5 GHz with a loss of less than 1 dB over the band of interest. Figure 11 shows the output waveforms of the Main and Aux paths when the RX mode and the TX mode are enabled, and the corresponding load impedance seen by each path during operation. Note that there is an inherent tradeoff between the transmitter and the receiver. In RX mode, a high input impedance is desired for better OOB linearity (IIP3) [53]. In TX mode, however, the output impedance should be lower to increase the output power. To meet both requirements, the proposed circuit uses a parallel Doherty output matching network. In RX mode, the Main path is turned off and presents a low impedance thanks to voltage-mode operation [33]. Thus, its impedance is around 50 ohms. In TX mode, the cells turn on gradually according to the required output power, following the turn-on sequence analyzed in Section II to improve the efficiency at PBO. Once the Main cells are turned on in VDD mode (Fig. 11 (b)), the Aux path also turns on in VDD mode (c), which enables Doherty operation and back-off efficiency enhancement. After all the turned-on cells are operating in VDD mode, the Main cells start to switch to 2VDD mode (d), and then the Aux cells start to change from VDD to 2VDD mode (e). The active load-modulation impedance of each PA path is derived in terms of the following parameters: k is the magnetic coupling coefficient of the OMN, n is the turn ratio of the OMN, and m and n are the numbers of turned-on power-cell bits in 2VDD and VDD mode, respectively. The overall active load-modulation results are shown in Fig. 11.

C. ROUTING LINE FLOORPLAN
Figure 12 shows the floorplan of the routing lines to the passive network. In TX mode, the Main and Aux paths are controlled by an 8-bit AM code. Seven of these bits control the Main and Aux paths: six bits set the number of active unit cells in a combined unary and binary manner, and one bit selects the supply level between VDD mode and 2VDD mode. The remaining bit is the Rx-on control. When the Rx-on bit is set, all the TX cells are turned off. The Main and Aux cells are distributed symmetrically along the output feed to minimize phase offsets. Additional minimum-sized cells are added in the Aux path to balance the I/Q signals in RX mode.
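The control word described above can be illustrated with a small decoding sketch. The text specifies only the bit budget (six cell-count bits, one supply bit, one Rx-on bit); the bit positions used here, the `PathControl` type, and the `decode_am_code` function are assumptions made for illustration, and the sketch does not model how the six bits are split between unary and binary cells.

```python
from typing import NamedTuple


class PathControl(NamedTuple):
    rx_on: bool        # True: RX mode, all TX cells off
    supply_2vdd: bool  # True: 2VDD mode, False: VDD mode
    cell_code: int     # 6-bit code selecting how many unit cells are active


def decode_am_code(code: int) -> PathControl:
    """Decode an 8-bit control word under an assumed bit layout.

    Assumed layout (hypothetical): bit 7 = Rx-on, bit 6 = supply select,
    bits 5..0 = cell-count code.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("control word must fit in 8 bits")
    rx_on = bool((code >> 7) & 1)
    if rx_on:
        # Per the text, setting Rx-on turns off all TX cells.
        return PathControl(rx_on=True, supply_2vdd=False, cell_code=0)
    supply_2vdd = bool((code >> 6) & 1)
    cell_code = code & 0x3F
    return PathControl(rx_on=False, supply_2vdd=supply_2vdd, cell_code=cell_code)


print(decode_am_code(0b01000101))  # TX mode, 2VDD supply, cell code 5
print(decode_am_code(0b10111111))  # RX mode overrides the TX fields
```

Returning a fully cleared TX state whenever Rx-on is set mirrors the stated behavior that the Rx-on bit disables all TX cells regardless of the remaining bits.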

IV. MEASUREMENT RESULTS
The bidirectional digital transceiver is implemented in the GlobalFoundries 45-nm CMOS SOI process, and the chip photograph is shown in Fig. 13. This is a fully integrated design with two sub-PAs, the output passive network, the TX PM input driver and AM buffer array, and the RX-mode PM driver and op-amp. The DC supply is 2.2 V for the PA and 1.1 V for the drivers. The chip is mounted on an FR4 printed circuit board (PCB) and probed on-wafer for the following measurements.

A. TRANSMITTER CW MEASUREMENT RESULTS
Figure 14 shows the test setup for the CW measurement. The PA is first characterized using continuous-wave (CW) signals with a standard 50-ohm load. A CW signal is generated by a Keysight N5193B signal generator, converted to a differential signal by an off-chip balun (Krytar 4010180), and fed to the PA differential input. The PA output is monitored by a power meter (N1913A) to measure $P_{out}$ and the PA efficiency. The AM sequence is controlled using a USB-1024LS with custom LabVIEW code. Figure 15 shows the measured PA output power and drain efficiency.

B. TRANSMITTER MODULATION RESULTS
The PA is then characterized with modulated signals. High-order modulated signals are generated in Advanced Design System (ADS) and separated into AM and PM signals for polar operation. A memoryless 1-D AM-AM lookup table (LUT), which can be readily generated, is used to control the … Table 1. We achieve competitive efficiency, bandwidth, and area with respect to the state of the art while also supporting embedded RX functionality.

C. RECEIVER RESULTS
Figure 18 shows the test setup for the RX measurement. A CW signal is generated by a Keysight N5193B signal generator, converted to a differential signal by an off-chip balun (Krytar 4010180), and fed to the LO differential input. The input signal is fed to the mixer-first receiver, and the I/Q outputs are used to measure the gain. We achieve a conversion gain of 17 dB from 1.6 GHz to 3.0 GHz with a 12 MHz bandwidth. The RX input reflection coefficient is below −10 dB across the operational bandwidth, i.e., $F_{LO}$ = 1.6 to 3 GHz. The OOB IIP3 is −30dB and the OOB IIP2 is −59 dB. The proposed front-end achieves <−60 dBm LO leakage across the operating frequency range. Figure 19 summarizes the RX performance of the proposed bidirectional transceiver, and Table 2 compares it with other state-of-the-art N-path mixer-first receivers. As the results show, the proposed architecture occupies a small area compared to the state of the art owing to the transformer and capacitor reuse.

V. CONCLUSION
This article presents a bidirectional digital transceiver architecture that shares the OMN and capacitor banks between the TX and RX. The proposed architecture exhibits the highest peak DE among the compared designs and a competitive bandwidth. Furthermore, the proposed design supports additional receiver functionality without any performance degradation. To the authors' knowledge, this is the first demonstration of a bidirectional digital T/RX sharing a switched-capacitor bank and a single-transformer matching network. This work also presents the smallest RX core area among the reported designs, as it shares the capacitors with the TX. The potential area savings and performance enhancement of this design make it a promising candidate for compact form-factor applications.