A CMOS Dual-Polarized Phased-Array Beamformer Utilizing Cross-Polarization Leakage Cancellation for 5G MIMO Systems

This article introduces a power-efficient and low-cost CMOS 28-GHz phased-array beamformer supporting fifth-generation (5G) dual-polarized multiple-in-multiple-out (DP-MIMO) operation. To improve the cross-polarization (cross-pol.) isolation degraded by the antennas and propagation, a power-efficient analog-assisted cross-pol. leakage cancellation technique is implemented. After the high-accuracy cancellation, more than 41.3-dB cross-pol. isolation is maintained from the transmitter array to the receiver array. The element-beamformer in this work adopts the compact neutralized bi-directional architecture, which minimizes the manufacturing cost. The proposed beamformer achieves 22% per-path TX-mode efficiency and a 4.9-dB RX-mode noise figure. The required on-chip area for the beamformer is only 0.48 mm². In over-the-air measurement, a 64-element dual-polarized phased-array module achieves 52.2-dBm saturated effective isotropic radiated power (EIRP). The 64-element modules support 5G standard-compliant OFDMA-mode modulated signals of up to 256-QAM. With the help of the cross-pol. leakage cancellation technique, the proposed array module realizes improved DP-MIMO EVMs even under severe polarization coupling and rotation conditions. The measured DP-MIMO EVMs are 3.4% in both 64-QAM and 256-QAM. The consumed power per beamformer path is 186 mW in the TX mode and 88 mW in the RX mode.


I. INTRODUCTION
EMERGING technologies are being developed to improve the wireless throughput in the fifth-generation (5G) mobile network enhanced mobile broadband (eMBB). High-performance and large-scale millimeter-wave phased-array transceivers have been introduced and researched for delivering wideband data streams over long communication distances [1]-[18]. Further improvement in wireless data rate can be provided by the multiple-in-multiple-out (MIMO) technique, while high power efficiency and low manufacturing cost must be maintained for such 5G millimeter-wave MIMO systems.
Dual-polarized MIMO (DP-MIMO) transceivers are capable of transmitting two independent data streams simultaneously through the H-polarized (H-pol.) and V-polarized (V-pol.) waves. Spatial diversity is offered by the cross-polarization (cross-pol.) isolation of the dual-polarized antennas. In recent years, special attention has been focused on realizing low-cost and power-efficient 5G millimeter-wave DP-MIMO systems [6]-[11], [19]. Pang et al. [6] introduce an eight-element beamformer chip for 5G single-user DP-MIMO communications. The area-efficient bi-directional technique is utilized for minimizing the manufacturing cost. However, the performance of such a DP-MIMO transceiver is sensitive to the cross-pol. coupling from the antennas and the polarization rotation during propagation [20], [21]. Therefore, the achievable DP-MIMO EVMs and power efficiencies are limited. Kibaroglu et al. [7] and Nafe et al. [8] introduce a feed rotation technique to provide additional cross-pol. isolation between H-pol. and V-pol. signals, but the achieved isolation is direction-selective and still sensitive to the polarization coupling caused by the module placement and propagation.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To realize high-efficiency and low-cost 5G millimeter-wave DP-MIMO systems, this work introduces a 28-GHz CMOS phased-array beamformer chip. A cross-pol. leakage cancellation technique is implemented for improving the degraded cross-pol. isolation. After the cancellation, the cross-pol. leakages are suppressed to lower than −40 dB from the transmitter array to the receiver array. The consumed power for the canceller is 20 mW. This work also adopts the neutralized bi-directional architecture and power-efficient design to minimize the manufacturing cost and power dissipation. Over 22% maximum TX-mode efficiency and 4.9-dB RX-mode noise figure (NF) are obtained by each element-beamformer. The occupied area for each beamformer is only 0.48 mm². Due to the proposed cancellation technique, the 64H + 64V array modules in this work can support 5G new radio (NR) standard-compliant DP-MIMO communication even under severe cross-pol. leakage conditions. The achieved TX-to-RX DP-MIMO EVMs are 3.4% in both 64-QAM and 256-QAM.
This article, which is an extension of [22], is organized as follows. A detailed analysis of the DP-MIMO capacity is provided in Section II. The specific requirements for the proposed cancellation circuits are also included in this section. Section III describes the circuit implementations of the power-efficient canceller and the bi-directional beamformer. Section IV presents the results of the on-wafer measurement and the over-the-air (OTA) measurement. Finally, the conclusion is drawn in Section V.

II. MILLIMETER-WAVE DP-MIMO COMMUNICATIONS
As demonstrated in Fig. 1(a), 5G millimeter-wave DP-MIMO systems utilize the cross-pol. isolation between H-pol. and V-pol. to simultaneously transmit two data streams. To evaluate the performance of DP-MIMO systems, the channel capacity under a 2 × 2 DP-MIMO configuration can be represented with (1) [23], [24], where $B$ is the channel bandwidth and $\gamma_0$ is the signal-to-noise ratio (SNR). $n = 2$ is selected here to keep the total transmitted power constant in DP-MIMO. The channel information is included in the 2 × 2 channel matrix $\mathbf{H}$. For ideal line-of-sight (LOS) DP-MIMO operation, the H-pol. and V-pol. signals are completely isolated. Therefore, the channel matrix $\mathbf{H}_{\mathrm{ideal}}$ can be represented as in (2), where $\phi_{11}$ and $\phi_{22}$ denote the phase shifts caused by propagation. Considering a communication distance of $d$, $\phi_{11}$ and $\phi_{22}$ can be derived with $2\pi(d/\lambda)$. By applying $\mathbf{H}_{\mathrm{ideal}}$ to (1), the ideal DP-MIMO channel capacity is obtained as (3). From (3), the channel capacity is significantly boosted by DP-MIMO under ideal conditions. In practice, however, cross-pol. leakage always limits the achievable DP-MIMO capacity. As shown in Fig. 1(b), both the coupling from the antenna and the polarization rotation due to module placement can generate such leakage.
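The expressions referenced as (1)-(3) can be sketched as follows. This is a reconstruction under stated assumptions, not a copy of the original typesetting: (1) is taken to be the standard equal-power MIMO capacity formula, the identity matrix $\mathbf{I}_2$ and the Hermitian transpose $(\cdot)^{\mathsf{H}}$ are notation introduced here, and the diagonal entries of the ideal channel are assumed to have unit magnitude.

```latex
\begin{align}
C &= B \log_2 \det\!\left( \mathbf{I}_2 + \frac{\gamma_0}{n}\,
     \mathbf{H}\mathbf{H}^{\mathsf{H}} \right), \qquad n = 2 \tag{1} \\
\mathbf{H}_{\mathrm{ideal}} &=
  \begin{bmatrix} e^{j\phi_{11}} & 0 \\ 0 & e^{j\phi_{22}} \end{bmatrix} \tag{2} \\
C_{\mathrm{ideal}} &= 2B \log_2\!\left( 1 + \frac{\gamma_0}{2} \right) \tag{3}
\end{align}
```

Under these assumptions, $\mathbf{H}_{\mathrm{ideal}}\mathbf{H}_{\mathrm{ideal}}^{\mathsf{H}} = \mathbf{I}_2$, so the determinant in (1) reduces to $(1 + \gamma_0/2)^2$, which gives (3).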
To explore the influence of the leakages on the channel capacity, the channel matrix $\mathbf{H}_{\mathrm{leak}}$ under leakage conditions can be represented as in (4), where $\sqrt{\alpha}$ and $\sqrt{\beta}$ are the magnitudes of the cross-pol. leakage components, with $0 \le \alpha \le 1$ and $0 \le \beta \le 1$. The phases of the leakages are denoted by $\phi_{12}$ and $\phi_{21}$. The corresponding channel capacity $C_{\mathrm{leak}}$ can be derived by applying $\mathbf{H}_{\mathrm{leak}}$ to (1). The calculated result is shown in (5). If we assume that $\alpha = \beta = A$ and $\phi_{21} - \phi_{11} = \phi_{12} - \phi_{22} = \phi$, the channel capacity under leakage conditions can be further simplified to (6). From (6), the first term inside the brackets of $C_{\mathrm{leak}}$ is the same as the one inside $C_{\mathrm{ideal}}$, while the second term, which is always negative, represents the capacity degradation caused by the cross-pol. leakages. The worst condition is obtained when $A = 0.5$ and $\phi = 0°$. The corresponding channel capacity $C_{\mathrm{leak}}$ is given by (7), which is the channel capacity for the single-in-single-out (SISO) scenario.
To compensate for the degradation caused by the cross-pol. leakage, the channel SNR $\gamma_0$ is required to be improved. Fig. 2(a) and (b) show the required $\gamma_0$ against $A$ and $\phi$, respectively. The DP-MIMO capacity is fixed at 1.5 Gb/s in this calculation, and the channel bandwidth is 100 MHz. In 5G millimeter-wave transceivers, the channel SNR $\gamma_0$ is usually dominated by the transmitter side. Additional power back-off (PBO) will be required in the transmitter for improving $\gamma_0$. Thus, the system power efficiency will be, in turn, degraded. An example illustrating the efficiency degradation is also presented in Fig. 2.
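The capacity analysis above can be checked numerically with a minimal sketch. It assumes the standard equal-power capacity formula for (1) and a power-conserving leakage matrix with through terms of magnitude $\sqrt{1-A}$ and leakage terms of magnitude $\sqrt{A}$. This through-term normalization is an assumption of the sketch, chosen because it reproduces the SISO worst case stated in the text. All function names are hypothetical.

```python
import numpy as np


def capacity_bps(H, gamma0, bandwidth_hz):
    """Equal-power MIMO capacity B*log2 det(I + (gamma0/n) H H^H), in bit/s."""
    n = H.shape[1]
    m = np.eye(H.shape[0]) + (gamma0 / n) * (H @ H.conj().T)
    return bandwidth_hz * np.log2(np.linalg.det(m).real)


def h_leak(A, phi, phi11=0.0, phi22=0.0):
    """Assumed leakage channel with alpha = beta = A.

    The leakage phases follow phi21 - phi11 = phi12 - phi22 = phi.
    """
    through = np.sqrt(1.0 - A)  # assumed power-conserving through term
    leak = np.sqrt(A)
    return np.array([
        [through * np.exp(1j * phi11), leak * np.exp(1j * (phi22 + phi))],
        [leak * np.exp(1j * (phi11 + phi)), through * np.exp(1j * phi22)],
    ])


def capacity_closed_form_bps(A, phi, gamma0, bandwidth_hz):
    """Closed form that follows from the assumed model:
    B*log2[(1 + gamma0/2)^2 - gamma0^2 * A*(1-A)*cos^2(phi)]."""
    inner = (1 + gamma0 / 2) ** 2 - gamma0 ** 2 * A * (1 - A) * np.cos(phi) ** 2
    return bandwidth_hz * np.log2(inner)


def required_snr_db(A, phi, target_bps, bandwidth_hz):
    """SNR gamma0 (in dB) needed to reach target_bps under the assumed model."""
    T = 2.0 ** (target_bps / bandwidth_hz)
    a = 0.25 - A * (1 - A) * np.cos(phi) ** 2
    if np.isclose(a, 0.0):
        gamma0 = T - 1.0  # worst case: the capacity collapses to SISO
    else:
        # Positive root of a*g^2 + g + (1 - T) = 0
        gamma0 = (-1.0 + np.sqrt(1.0 + 4.0 * a * (T - 1.0))) / (2.0 * a)
    return 10 * np.log10(gamma0)


if __name__ == "__main__":
    B, C = 100e6, 1.5e9  # bandwidth and target capacity used for Fig. 2
    print("ideal required SNR (dB):", required_snr_db(0.0, 0.0, C, B))
    print("worst required SNR (dB):", required_snr_db(0.5, 0.0, C, B))
```

Under these assumptions, a 1.5-Gb/s target in 100 MHz needs roughly 25.6 dB of SNR without leakage and roughly 45.2 dB in the worst case ($A = 0.5$, $\phi = 0°$), which illustrates why uncancelled leakage translates into additional transmitter back-off.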
As shown in Fig. 3(c) and (d), conventional 5G MIMO operation usually relies on the receiver digital baseband (DBB) or an analog-domain leakage cancellation. However, the channel capacity cannot be recovered for severely coupled MIMO channels. In DP-MIMO, this situation becomes even worse because the H-pol. and V-pol. streams usually come from the same direction. In this condition, the transmitter EVM is required to be further improved, which leads to larger PBO and lower power efficiency, as mentioned previously. To improve the DP-MIMO capacity without sacrificing power efficiency, the cross-pol. leakage cancellation is required to be performed at the transmitter side. This function could be realized at the transmitter DBB, as shown in Fig. 3(a) and (e). However, digital processing at the multi-Gb/s data rates of 5G is power-hungry. The digital-to-analog converters (DACs) would also need to be improved to support accurate cancellation, which consumes further power. Therefore, a power-efficient analog-assisted cross-pol. leakage cancellation technique is introduced in this work. Fig. 3(b) shows the operation of the proposed technique. A cross-pol. leakage canceller is utilized for generating the cancellation signals. Fig. 3(g) shows the signal flowchart of the proposed cancellation. The settings of the canceller can be initially decided by a one-time factory calibration. During the transmitter operation, a magnitude and phase loopback calibration can be performed to compensate for the temperature variation. For severely coupled DP-MIMO channels, a receiver-to-transmitter loopback cancellation, similar to precoding, can be further performed, as shown in Fig. 3(f). If the latency requirement is relaxed, the strength of the leakage signals can be detected at the receiver side and sent back to the transmitter for this cancellation. The cross-pol. leakages introduced from the transmitter to the receiver can be canceled by the proposed cancellation technique in this condition.
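The idea of cancelling the leakage at the transmitter side can be illustrated with an idealized 2 × 2 model. The sketch below is not the circuit-level behavior of the proposed canceller: it assumes a narrowband channel matrix `H`, a canceller that injects a scaled copy of each stream into the other polarization path, and perfect knowledge of `H`. The function names and the example phases are hypothetical.

```python
import numpy as np


def canceller_matrix(H):
    """Cross-injection coefficients that null the off-diagonal terms of H @ P.

    P = [[1, c_v_into_h], [c_h_into_v, 1]] acts on the stream vector [x_H, x_V].
    """
    c_v_into_h = -H[0, 1] / H[0, 0]  # cancels the V-to-H leakage
    c_h_into_v = -H[1, 0] / H[1, 1]  # cancels the H-to-V leakage
    return np.array([[1.0, c_v_into_h], [c_h_into_v, 1.0]], dtype=complex)


def leakage_db(H_eff):
    """Return (V-to-H, H-to-V) leakage relative to the wanted signal, in dB."""
    tiny = np.finfo(float).tiny  # floor to avoid log10(0) after ideal cancellation
    v_to_h = 20 * np.log10(max(abs(H_eff[0, 1]), tiny) / abs(H_eff[0, 0]))
    h_to_v = 20 * np.log10(max(abs(H_eff[1, 0]), tiny) / abs(H_eff[1, 1]))
    return v_to_h, h_to_v


# Example channel with about -15-dB leakage in both directions (phases are arbitrary).
leak = 10 ** (-15 / 20)
H = np.array([[1.0, leak * np.exp(1j * 0.7)],
              [leak * np.exp(-1j * 1.1), 1.0]])

P_ideal = canceller_matrix(H)

# Emulate imperfect settings: 0.05-dB magnitude and 0.3-degree phase error.
error = 10 ** (0.05 / 20) * np.exp(1j * np.deg2rad(0.3))
P_real = P_ideal.copy()
P_real[0, 1] *= error
P_real[1, 0] *= error

if __name__ == "__main__":
    print("before:", leakage_db(H))
    print("after (with setting errors):", leakage_db(H @ P_real))
```

The residual leakage after cancellation is set entirely by how accurately the injected coefficients match the channel, which is why the settings need fine magnitude and phase resolution.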
The proposed canceller is designed based on the vector modulator. For a leakage component $A_L e^{j\phi_L}$, the achievable suppression ratio with a cancellation signal $A_C e^{j\phi_C}$ can be calculated with (8), where $\Delta A = A_C / A_L$ is the magnitude error and $\Delta\phi = \phi_C - \phi_L$ is the phase error. Fig. 4 presents the calculated suppression ratio against the corresponding $\Delta A$ and $\Delta\phi$. To maintain over 40-dB cross-pol. leakage suppression ratio, $\Delta A$ should be less than 0.3 dB, and $\Delta\phi$ should be less than 1.8°. The proposed canceller is, therefore, designed with fine-grained phase tuning and magnitude tuning for a high-accuracy and fast cancellation.
Fig. 5 shows the block diagram of the proposed 28-GHz beamformer chip. The area-efficient neutralized bi-directional architecture is employed to share the same signal chain between the TX and RX [6]. A total of eight element-beamformers (4H + 4V) are integrated to support DP-MIMO. In this work, the magnitude and phase-detection circuits along with on-chip coupling networks are implemented on the chip [12]. The output signals from each element-transmitter can be selected and distributed to the on-chip detection block through the coupling network. The signals are down-converted to a much lower frequency for accurate detection. The cross-pol. leakage canceller mentioned in the previous section is also inserted at the H/V combining port for suppressing the cross-pol. leakage introduced by the polarization coupling and rotation. Low-cost, power-efficient, and high-performance 5G DP-MIMO systems can be realized with this work. The detailed circuit implementation of the proposed beamformer chip is introduced in the remaining part of this section.
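For the vector-modulator canceller, the dependence of the suppression ratio on the setting errors can be sketched as below. The sketch assumes that the residual leakage is the phasor difference between the leakage and the cancellation signal, so that the normalized residual is $|1 - \Delta A\, e^{j\Delta\phi}|$. This definition and the function name are assumptions, and the exact form of the original expression may differ.

```python
import cmath
import math


def suppression_db(mag_error_db, phase_error_deg):
    """Suppression ratio in dB for a given magnitude and phase setting error.

    Assumes residual = |1 - dA * exp(j*dphi)|, with dA the magnitude ratio
    A_C / A_L (given here in dB) and dphi the phase error (given in degrees).
    """
    ratio = 10 ** (mag_error_db / 20)
    dphi = math.radians(phase_error_deg)
    residual = abs(1 - ratio * cmath.exp(1j * dphi))
    if residual == 0.0:
        return math.inf  # perfect cancellation
    return -20 * math.log10(residual)


if __name__ == "__main__":
    # Illustrative inputs only; they are not read from Fig. 4.
    for mag_db, ph_deg in [(0.05, 0.2), (0.1, 0.5), (0.2, 1.0)]:
        print(mag_db, ph_deg, round(suppression_db(mag_db, ph_deg), 1))
```

Because the residual grows roughly linearly with both errors, magnitude and phase must be tuned finely and independently, which motivates the orthogonal high-resolution tuning of the cancellation path.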

A. Cross-Pol. Leakage Canceller
To improve the DP-MIMO EVMs and the corresponding system power efficiency, a high-accuracy cross-pol. leakage cancellation circuit is introduced in this work. Fig. 6 shows the block diagram of the proposed cross-pol. leakage canceller. It consists of two bi-directional variable gain amplifiers (VGAs) and two cross-pol. leakage cancellation paths. The proposed canceller can be configured into three operating modes. In the normal TX mode, the H/V bi-directional VGAs are operating in the TX mode, and the H/V cancellation paths are disabled. The RX mode operation is similar to the normal TX mode. Only the RX-mode H/V bi-directional VGAs are turned on to support the bi-directional operation. In the cross-pol. leakage cancellation mode, the cancellation paths are operating together with the TX-mode H/V bi-directional VGAs. The required H-to-V and V-to-H cancellation signals are created by the magnitude and phase control circuits along the cancellation paths.
The cancellation path in this work is designed to realize high-resolution and orthogonal magnitude and phase tuning. Each cancellation path includes two VGAs and a phase shifter. Fig. 7(a) shows the circuit schematic of the VGA. The VGA is designed with a cascode stage, a common-source stage, and an adjustable attenuator. The passive components for matching are designed based on the low-loss and configurable transmission lines (TLs) [25]. To realize high-resolution magnitude tuning, a 10-bit DAC is utilized to control the adjustable attenuator. A shunt TL stub is connected to the attenuator to compensate for the parasitic capacitance and suppress the phase variation. Fig. 7(b) shows the circuit schematic of the cancellation phase shifter. The reflection-type phase shifter is selected here to achieve fine phase shifting with minimized insertion loss and area [26], [27]. The 90° directional coupler is realized with two top thick metal layers. The phase difference between the through port and the coupling port is optimized in electromagnetic (EM) simulations. The reflection load in this work includes two series LC resonators with different resonating points. The 360° gain-invariant phase shifting can be obtained with the dual-voltage control. The occupied area for the phase shifter is 0.16 mm². Further area reduction could be realized by using switch-type phase shifters along with the resonator-based fine-tuning stage. Fig. 7(c) presents the circuit schematic of the H/V single-ended bi-directional VGAs. Two single-direction VGA chains are directly combined for supporting the bi-directional operation with a minimized area. The mode selection is realized by switching the bias. Fig. 8(a) and (b) summarize the measured variable gain performance of the cancellation path with a 2-dB tuning step. The simulated single-stage VGA performance with transistor process corners is also included. The two VGAs in this work provide a total gain tuning range of 32 dB.
The achieved tuning resolution is less than 0.2 dB with the help of the high-resolution DACs. Furthermore, the measured phase variation over all gain tuning states is less than 2.3° at 28 GHz. Phase-invariant gain tuning is obtained by the cancellation path. Fig. 8(c) and (d) present the phase-shifting performance of the cancellation path. The simulated phase coverage over process corners of the varactor and MOM capacitor is also included. The 360° phase shifting is always achieved from 26.5 to 29.5 GHz in both simulation and measurement. The corresponding rms gain error over the whole 5G 28-GHz band is less than 0.98 dB.
The cancellation performance in this work is further evaluated over temperature and supply voltage. Four H-pol. and four V-pol. transmitters along with the cancellation circuits are utilized for this simulation. The outputs of the H-pol. transmitters and V-pol. transmitters are ideally combined, respectively. Before the cancellation, around −10-dB cross-pol. coupling is manually added between the output nodes of the H-pol. and V-pol. transmitters. Fig. 9(a) demonstrates the H-to-V isolations with different cancellation temperatures. With a cancellation at 27 °C, the cross-pol. isolation is always better than 20 dB from −40 °C to 120 °C. Within 0 °C-60 °C, the cross-pol. isolation is better than 30 dB. Fig. 9(b) presents the H-to-V isolations over different supply voltages. With a cancellation at 1 V, the cross-pol. isolation is always better than 23 dB considering ±5% supply voltage variation.

B. Neutralized Bi-Directional Gain Amplifier
In this work, the element-beamformer is designed based on the neutralized bi-directional architecture for saving chip area and manufacturing cost [6]. The occupied on-chip area for each beamformer is reduced to half by the bi-directional technique. As shown in Fig. 5, each beamformer in this work consists of a bi-directional gain amplifier, a bi-directional active vector-summing phase shifter, and a two-stage PA-LNA. Fig. 10(a) shows the circuit schematic of the bi-directional gain amplifier. Two differential pairs in the cross-coupling connection are included in the neutralized bi-directional core. The mode selection of the core is realized by switching the tail transistors M3 and M6. Fig. 10(b) and (c) further explain the TX- and RX-mode core operations. By selecting the same transistor size among M1, M2, M4, and M5, the gate-drain capacitance neutralization can be maintained in both operating modes [28]-[30]. Improved amplifier gain and reverse isolation are achieved. Moreover, to minimize the required chip area, the TL-based passive matching components for the gain amplifier are shared between the TX mode and the RX mode. At millimeter-wave frequencies, the required matching conditions for the proposed core do not change dramatically during the mode switching. Therefore, properly sized TLs can be selected for realizing a low-loss matching in both TX and RX modes. A high-performance and area-efficient bi-directional amplifier can be realized. Fig. 11(a) and (b) show the simulated performance of the bi-directional gain amplifier. The achieved TX- and RX-mode gains are around 8 and 10 dB, respectively. Within 26.5-29.5 GHz, the return losses are always better than −8 dB. The power consumptions are 9 mW in the TX mode and 10 mW in the RX mode. Furthermore, orthogonal gain and phase tuning are always demanded by millimeter-wave beamformers for a simple control algorithm and fast calibration.
In this work, the TX-mode amplifier is further reused as the VGA in each beamformer path. The tail bias is controlled by the high-resolution DACs for providing the fine gain tuning. Due to the neutralized bi-directional technique, the phase variation during the gain tuning is suppressed by the gate-drain capacitance neutralization [31]. As shown in Fig. 11(d), within the 8-dB gain tuning range, the phase variation is less than 2.5° at 28 GHz.

C. PA-LNA
Compact and low-loss design of the antenna interface is essential for realizing low-cost and power-efficient 5G millimeter-wave phased-array transceivers. The PA power delivery and LNA NF are required to be optimized together with the TRX switch considering the insertion loss and chip area. Facing the complex modulated signals in 5G, the PA efficiency in the deep PBO region also demands further improvement [32], [33]. Moreover, the packaging design is critical to the overall performance of 5G millimeter-wave phased-array systems. Accurate package modeling and optimizations are required for decreasing the insertion loss. Due to the reduced array antenna pitch at millimeter-wave frequencies, minimized packaging size and chip area become significant not only for a reduced manufacturing cost but also for a low-loss distribution from the chip to the antenna.
Fig. 12 shows the circuit schematic of the PA-LNA along with the packaging connection. To minimize the chip area cost, the unbalanced neutralized bi-directional technique is adopted in this work. Compared with the neutralized bi-directional technique mentioned in Section III-B, two extra capacitors $C_{\mathrm{comp}}$ are attached to the LNA transistors for compensating the extra gate-drain capacitance from the PA transistors. Therefore, even with different transistor sizes, the unbalanced neutralized bi-directional core can still maintain the gate-drain capacitance neutralization in both the PA mode and the LNA mode. To improve the PA-mode power efficiency in the deep PBO region, the proposed PA is biased in the class AB condition. An adaptive antenna sharing network is co-designed with the core circuits to improve the PA-mode power delivery and LNA-mode NF. At 28 GHz, the simulated insertion loss of the antenna sharing network is 0.5 dB in the PA mode and 1.5 dB in the LNA mode.
To minimize the implementation size and loss, the proposed chip is packaged with a wafer-level chip-scale package (WLCSP). The influence of the WLCSP is carefully modeled with EM simulation. Fig. 13(a) and (b) show the package interconnection models for the H/V antenna port and the H/V combined input/output port, respectively. The signal interconnection is kept in GSG fashion all the way from the chip pad to the PCB pad for better signal shielding. To compensate for the impedance mismatch, a TL-based re-matching network is further designed together with the models. Fig. 13(c) and (d) present the simulated insertion loss after the compensation. Within the frequency range from 26.5 to 29.5 GHz, the introduced packaging loss is around 1.5 dB for both ports.
To evaluate the performance of the PA-LNA, a standalone PA-LNA excluding the WLCSP and re-matching network is fabricated for on-wafer measurement. Fig. 14(a) presents the measured PA-mode linearity. The achieved saturated output power is 16.2 dBm, and the output $P_{1\,\mathrm{dB}}$ is 13.4 dBm. Fig. 14(b) demonstrates the corresponding power-added efficiency (PAE). The proposed PA reports a maximum PAE of 30.7% including the TRX switch. Due to the class AB bias condition, the maintained PAE at 6-dB PBO is 11.5%. The measured LNA-mode NF is further shown in Fig. 15. A Keysight PNA-X N5247A along with the cold-source method is used for the NF measurement. Within the frequency range of 26.5-29.5 GHz, the NF is from 4.3 to 5.3 dB.

D. Balanced Active Bi-Directional Phase Shifter
The usage of active phase shifters further helps the bi-directional beamformers to shrink in size [6]. The area consumption overhead for gain compensation, which is required in beamformers with passive phase-shifting solutions, can be removed. However, conventional active bi-directional vector-summing phase shifters [see Fig. 16(a)] based on the switchable polyphase filters (PPFs) suffer from the imperfect switching operation [6]. Unbalanced vector summing is introduced by the parasitic resistance of the switches. Complicated control tables are required for compensating for the magnitude and phase errors. In addition, the overall achievable RF-path gain is also degraded by the usage of multiple switchable PPFs.
To address these issues, Fig. 16(b) presents the balanced active bi-directional phase shifter in this work. The proposed phase shifter consists of a non-switchable PPF, bi-directional VGAs A1 and A3 for the I path, and bi-directional VGAs A2 and A4 for the Q path. The VGA circuits for A1-A4 are similar to the neutralized bi-directional gain amplifier. Two bi-directional cores with flipped output are included for covering the complete 360° phase shifting in vector-summing operation. Fig. 16(c) and (d) further explain the TX- and RX-mode operations, respectively. In the TX mode, the I/Q signals generated by the PPF are first sent to A1 and A2. Due to their invariant input impedance, A1 and A2 directly function as the I/Q VGAs. In this condition, A3 and A4 are set to a fixed gain and serve as another driving stage for the PA. Balanced I/Q summing and improved linearity over phase shifting can be achieved in the TX mode. In the RX mode, the PPF can operate as the quadrature adder owing to its reciprocal characteristic. However, the orthogonality of the PPF is sensitive to the mismatch between the I/Q input impedances. Therefore, A3 and A4 in the RX mode function as the I/Q VGAs, while A1 and A2 with fixed gain serve as the isolation buffers. Identical I/Q input impedances can be provided to the PPF, and balanced vector summing can be realized also in the RX mode. Moreover, the overall gain performance of the proposed phase shifter is improved due to the usage of one single non-switchable PPF.
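The vector-summing principle behind the phase shifter can be sketched at the behavioral level. The sketch assumes ideal quadrature I/Q signals, a constant output magnitude, and sign selection through the flipped cores. The dictionary keys and the `q_imbalance_db` parameter are hypothetical, and the model does not represent the transistor-level VGAs.

```python
import math


def iq_weights(phase_deg):
    """I/Q gain magnitudes and polarity flips that synthesize phase_deg."""
    theta = math.radians(phase_deg)
    i, q = math.cos(theta), math.sin(theta)
    return {
        "i_gain": abs(i), "i_flipped": i < 0,  # flipped core selects the sign
        "q_gain": abs(q), "q_flipped": q < 0,
    }


def synthesized_phase_deg(weights, q_imbalance_db=0.0):
    """Phase of the summed vector; q_imbalance_db models an unbalanced Q path."""
    i = -weights["i_gain"] if weights["i_flipped"] else weights["i_gain"]
    q = -weights["q_gain"] if weights["q_flipped"] else weights["q_gain"]
    q *= 10 ** (q_imbalance_db / 20)
    return math.degrees(math.atan2(q, i)) % 360


def phase_error_deg(target_deg, q_imbalance_db=0.0):
    """Signed phase error, wrapped to the range [-180, 180) degrees."""
    actual = synthesized_phase_deg(iq_weights(target_deg), q_imbalance_db)
    return (actual - target_deg + 180) % 360 - 180


if __name__ == "__main__":
    for target in (0, 45, 135, 225, 315):
        print(target, round(phase_error_deg(target, q_imbalance_db=1.0), 2))
```

With balanced paths the synthesized phase equals the target, whereas a 1-dB imbalance in one path already shifts a 45° setting by a little more than 3° in this model, which is the kind of error that balanced summing avoids without compensation tables.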
The non-switchable PPF used in this work is designed with tunable capacitance. It can be configured for different operating center frequencies. The frequency band from 26.5 to 29.5 GHz is covered by the PPF. The I/Q imbalance caused by the process variation can also be compensated by the tunable capacitance through a calibration. Fig. 17 shows the simulated image rejection ratio (IRR) of the PPF in corner conditions after the compensation. Process corners of the varactors and resistors are considered in this simulation. Considering the 400-MHz channel bandwidth defined in the 5G NR standard, the simulated IRRs are always better than 37.1 dB. Fig. 18 demonstrates the measured phase-shifting performance of the proposed beamformer. The PPF setting is fixed in this measurement. The full 360° phase-shifting range is covered. From 26.5 to 29.5 GHz, the introduced rms phase error and rms gain error are less than 2° and 0.4 dB, respectively, without any compensation tables. The proposed phase shifter also realizes 13-dB gain in the TX mode and 10-dB gain in the RX mode. The PA can be directly driven by the phase shifter in this work. The area and power efficiencies of the proposed beamformer are further improved.

IV. MEASUREMENT
The proposed phased-array beamformer chip is fabricated in a 65-nm CMOS process with WLCSP. Fig. 19 shows the chip micrograph, including the package. The chip size is 4 mm × 4 mm. Table I summarizes the core area consumption breakdown for the whole chip. The proposed element-beamformer based on the compact neutralized bi-directional architecture occupies only 0.48 mm² of on-chip area. Table I also summarizes the power consumption. The consumed power per element-beamformer is 186 mW in the TX mode and 88 mW in the RX mode. The cancellation path introduced in this work consumes only 10 mW. Due to the power-efficient circuits introduced in this work, a peak TX-mode efficiency of 22% per antenna path is achieved. High-efficiency and low-cost 5G millimeter-wave DP-MIMO systems can be realized by the proposed chip.
The on-wafer characteristics of the proposed chip are evaluated first. Fig. 20 summarizes the on-wafer measured performance of the single-path element-beamformer. The simulated performance at different temperatures is also provided. Fig. 20(a) and (c) show the measured TX- and RX-mode frequency responses. Within 26.5-30.5 GHz, the proposed beamformer achieves around 25-dB gain in the TX mode and around 18-dB gain in the RX mode. Fig. 20(b) presents the TX-mode linearity. The measured on-wafer saturated output power is 16.1 dBm at 28 GHz. The corresponding output $P_{1\,\mathrm{dB}}$ is 13.7 dBm. Fig. 20(d) demonstrates the measured RX-mode NF. At 28 GHz, the achieved NF is 4.9 dB.
The TX-mode beamformer is further evaluated with modulated signals. Single-carrier-mode (SC-mode) and standard-compliant 5G NR OFDMA-mode signals are utilized in this measurement. Fig. 21(a) and (b) present the measured OFDMA-mode EVMs with 100- and 400-MHz channel bandwidths, respectively. The measured EVMs are normalized to the rms magnitude of the constellations. The peak-to-average power ratio (PAPR) for the 400-MHz, 64-QAM OFDMA-mode modulated signal is 11.6 dB. When the output power $P_{\mathrm{out}}$ is low, the EVMs are dominated by the output noise floor. Therefore, in this region, the measured EVMs with 100-MHz bandwidth are better than the ones with 400-MHz bandwidth. When $P_{\mathrm{out}}$ is large, the nonlinearity of the TX-mode beamformer mainly decides the achievable EVMs. Thus, the measured EVMs are almost the same between Fig. 21(a) and (b) in the large-$P_{\mathrm{out}}$ region. This work achieves the minimum 64-QAM EVMs [...] Fig. 21(c) and (d) further demonstrate the measured EVMs with SC-mode modulated signals against the output power. Compared with the 5G NR OFDMA-mode signals, the SC-mode modulated signals have lower PAPRs (7.7 dB for 64-QAM and 8.2 dB for 256-QAM). Therefore, the EVMs of the TX-mode beamformer in the large output power region are improved. With 400-MHz channel bandwidth, the TX-mode beamformer in this work can deliver 11-dBm SC-mode output power in 64-QAM with −25-dB EVM. An output power of 6.1 dBm can also be supported in 256-QAM with −32.4-dB EVM.
Fig. 22(a) demonstrates the measured performance of the RX-mode beamformer. The output power, output noise floor, and IM3 are measured at 28 GHz against the input power. The input $P_{1\,\mathrm{dB}}$ is −29 dBm. The corresponding signal-to-noise-and-distortion ratio (SNDR) is calculated with a 400-MHz channel bandwidth. In this work, the RX-mode beamformer achieves a peak SNDR of 41.2 dB. The measured dynamic range for a 25-dB SNDR is from −57 to −30 dBm.
The SNDR in the high input power region can be further improved by reducing the RX-mode gain. Fig. 22(b) shows the RX SNDRs with different gain settings. With a −6.9-dB gain, the measured NF of the RX-mode beamformer is 16.3 dB at 28 GHz. The achieved SNDR is improved to 35.4 dB for an input power of −20 dBm.
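The EVM and SNDR figures in this section mix dB and percentage units and depend on the channel bandwidth, so a small conversion sketch may help when comparing them. It assumes the usual amplitude-ratio definition of EVM and a thermal noise density of −174 dBm/Hz (roughly 290 K). The function names are hypothetical, and distortion is ignored in the SNR estimate.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # assumed kT at roughly 290 K


def evm_db_to_percent(evm_db):
    """Convert an rms-normalized EVM from dB to percent."""
    return 100 * 10 ** (evm_db / 20)


def evm_percent_to_db(evm_percent):
    """Convert an rms-normalized EVM from percent to dB."""
    return 20 * math.log10(evm_percent / 100)


def noise_floor_dbm(bandwidth_hz, nf_db):
    """Input-referred noise floor for a given bandwidth and noise figure."""
    return THERMAL_NOISE_DBM_PER_HZ + 10 * math.log10(bandwidth_hz) + nf_db


def input_snr_db(input_power_dbm, bandwidth_hz, nf_db):
    """Noise-limited SNR at the receiver input (no distortion modeled)."""
    return input_power_dbm - noise_floor_dbm(bandwidth_hz, nf_db)


if __name__ == "__main__":
    print("-25 dB EVM in %:", round(evm_db_to_percent(-25.0), 2))
    print("-32.4 dB EVM in %:", round(evm_db_to_percent(-32.4), 2))
    print("3.4 % EVM in dB:", round(evm_percent_to_db(3.4), 2))
    print("noise floor, 400 MHz, NF 4.9 dB:", round(noise_floor_dbm(400e6, 4.9), 1))
    print("SNR at -57 dBm:", round(input_snr_db(-57.0, 400e6, 4.9), 1))
```

Under these assumptions, −25 dB corresponds to about 5.6%, −32.4 dB to about 2.4%, and 3.4% to about −29.4 dB. With the 4.9-dB NF and a 400-MHz bandwidth, the noise floor is about −83.1 dBm, so a −57-dBm input yields a noise-limited SNR of about 26 dB, in line with the lower edge of the reported 25-dB-SNDR dynamic range. The same noise model also explains the roughly 6-dB difference expected between 100- and 400-MHz bandwidths in the noise-limited region.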
To evaluate the OTA performance of this work, the packaged chips are further implemented into the 64H + 64V dual-polarized phased-array transceiver modules. Fig. 23 shows a photograph of the module; 16 packaged chips are mounted on the backside of the module. Each chip has eight antenna ports and is connected to a 2 × 2 dual-polarized antenna array on the frontside through the PCB vias. For distributing H-pol. and V-pol. signals among the chips, a total of four 1-to-8 dividers/combiners are utilized on the PCB. To improve the cross-pol. isolation of the H/V signal distribution network, ground-wall shieldings are included between the distributions. The simulated cross-pol. isolation for the H/V signal distribution network on the PCB is better than 37 dB from 26.5 to 29.5 GHz. On the frontside of the module, a 16 × 4 dual-polarized antenna array is implemented. The antenna spacing is 6 mm. In this work, the dual-polarized antennas are fed in a non-rotated fashion. Identical spacing can be maintained between the feed points for suppressing the magnitude and phase errors. Fig. 24 shows the simulated radiation patterns in the azimuth plane for the dual-polarized patch antenna. The simulation frequency is 28 GHz. Including the feedlines, the antenna gain is around 5 dBi within −50° to +50° for both polarizations. At 0°, the H-to-V cross-pol. isolation is 26.7 dB, and the V-to-H cross-pol. isolation is 26.3 dB. Fig. 25(a) demonstrates the measured V-pol. beam patterns in the azimuth plane for the proposed phased-array module. The proposed module is capable of scanning the beam from −40° to +40°. The measured sidelobe level is always less than −9 dB without any amplitude tapering. The observed asymmetry of the beam patterns is possibly caused by the element pattern and the imperfections during the measurement. Fig. 25(b) presents the measured elevation-plane beam patterns within ±30°. A 2 × 4 array is used in this measurement.
The sidelobe level is always less than −9.6 dB. Fig. 26 shows the measured V-pol. effective isotropic radiated power (EIRP) against the activated element number. The saturated EIRP realized by 64 TX-mode beamformers is 52.2 dBm.
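The array-level figures above can be related to the element-level numbers with a simple link-budget sketch. It assumes coherent combining of identical elements (EIRP growing with $20\log_{10}N$), a uniform 6-mm pitch, and a progressive phase shift for beam steering. The default values are taken from the text where available, while the function names and the way the loss is lumped are assumptions.

```python
import math

C0 = 299_792_458.0  # speed of light in m/s


def eirp_dbm(n_elements, p_element_dbm=16.1, antenna_gain_dbi=5.0, loss_db=1.5):
    """Idealized EIRP of N coherently combined, identical elements."""
    return p_element_dbm - loss_db + antenna_gain_dbi + 20 * math.log10(n_elements)


def progressive_phase_deg(steer_deg, spacing_m=6e-3, freq_hz=28e9):
    """Element-to-element phase step that steers the beam to steer_deg."""
    wavelength = C0 / freq_hz
    return 360.0 * spacing_m * math.sin(math.radians(steer_deg)) / wavelength


def grating_lobe_free(steer_deg, spacing_m=6e-3, freq_hz=28e9):
    """True if no grating lobe enters visible space at this scan angle."""
    wavelength = C0 / freq_hz
    return spacing_m / wavelength < 1.0 / (1.0 + abs(math.sin(math.radians(steer_deg))))


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, "elements:", round(eirp_dbm(n), 1), "dBm")
    print("phase step at 40 deg:", round(progressive_phase_deg(40.0), 1), "deg")
    print("grating-lobe free at 40 deg:", grating_lobe_free(40.0))
```

With the reported 16.1-dBm on-wafer saturated power, 5-dBi element gain, and about 1.5-dB simulated package loss, this idealized budget gives roughly 55.7 dBm for 64 elements, which is above the measured 52.2 dBm. The text does not break down the remaining difference, so the sketch should be read only as an upper-bound estimate. It also shows that each doubling of the active element count adds about 6 dB of EIRP, and that a 6-mm pitch at 28 GHz stays free of grating lobes over the reported ±40° scan range.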
Two of the 64-element phased-array transceiver modules are evaluated in a SISO scenario with modulated signals. In this measurement, one phased-array module operates in the TX mode, while the other one operates in the RX mode. The communication distance is 1 m, which is limited by the size of the microwave anechoic chamber. The beam direction is fixed at 0°. The SC- and OFDMA-mode modulated signals for the TX-mode module are generated by the Keysight arbitrary waveform generator (AWG) M8195A along with an up-conversion mixer. The LO for the up-conversion mixer is generated by the Keysight signal generator N5183B. Its phase noise is low enough not to influence the measured EVMs. The received signals from the RX-mode module are directly analyzed by the Keysight real-time oscilloscope UXR0334A. The upper side of Fig. 27 presents the measured SC-mode constellations and EVMs. In the SC-mode measurement, eight TX-mode beamformers and four RX-mode beamformers are activated. As shown in Fig. 27, [...] Fig. 27 demonstrates the measured constellations and EVMs with standard-compliant 5G NR OFDMA-mode signals; 64 TX-mode beamformers and four RX-mode beamformers are utilized in this measurement. The PBO of the TX-mode module is reduced for improving both the EIRP level and the transmitter power efficiency. Considering the increased EIRP from the TX side, the RX-mode beamformer in the low-gain mode is used in this measurement for improving the input-referred linearity. As shown in Fig. 27, [...]
As mentioned in Section II, the cross-pol. leakage introduced from the PCB and propagation degrades the channel capacity of DP-MIMO systems. An analog-assisted cross-pol. leakage cancellation technique is proposed in this work to improve both the DP-MIMO EVM and the transmitter power efficiency. Fig. 28(a) demonstrates the equipment setup and the cancellation procedure. In this measurement, 4H + 4V TX-mode beamformers and 4H + 4V RX-mode beamformers are adopted.
To verify the cancellation performance over propagation, the TX- and RX-mode modules are placed with a 30° rotation. Single-tone test signals with frequencies of 27.0 and 26.9 GHz are first sent to the H-pol. and V-pol. of the TX-mode module, respectively. At the RX side, the cross-pol. leakage can be observed in the H-pol. and V-pol. output spectra. According to the leakage observed at the RX side, the canceller at the TX side is activated and configured to suppress the cross-pol. leakage. Fig. 28(b) shows the output spectra of the RX-mode module before and after the cancellation. Before the cancellation, around −15-dB cross-pol. leakage can be observed in both the H-pol. and V-pol. output spectra due to the rotation and coupling. After the cancellation, the V-to-H leakage is suppressed to −41.3 dB, while the H-to-V leakage is suppressed to −45.4 dB.
The DP-MIMO performance before and after the cross-pol. leakage cancellation is also evaluated. Fig. 29(a) demonstrates the equipment setup. The settings for the TX- and RX-mode modules are kept the same as those used in Fig. 28. The rotation between the modules is also 30°. Two-stream 400-MHz standard-compliant 5G NR modulated signals in the SC-FDMA mode are generated simultaneously by the Keysight AWG M8195A. The data patterns of the two data streams are fully independent. The generated data streams are up-converted by two external mixers for the TX-mode module. The H-pol. and V-pol. output signals of the RX-mode module are directly analyzed by the Keysight real-time oscilloscope UXR0334A. Fig. 29(b) summarizes the measured TX-to-RX constellations and EVMs before and after the proposed cancellation. The output EIRPs for the H- and V-pol. signals are kept the same for comparison. The achieved TX-to-RX EVM is improved from −22.7 to −25.1 dB in 64-QAM. The DP-MIMO EVMs, which include the effects of coupling and rotation, are thus improved by the cross-pol. leakage cancellation technique.
To perform the cross-pol. leakage cancellation for 64H + 64V TX-mode beamformers, the designed chip can operate in a primary-secondary configuration. As shown in Fig. 30(a), a 4H + 4V transceiver module with a coaxial connector interface can be utilized as the primary transceiver. It provides the cross-pol. cancellation function and input signal buffering for the secondary 64H + 64V TX-mode module. In this condition, the secondary 64H + 64V TX-mode transceiver module bypasses the cancellation function and operates in the normal TX mode.
In Table II, this work is compared with several state-of-the-art 28-GHz phased-array transceivers supporting DP-MIMO. With the proposed analog-assisted cross-pol. leakage cancellation technique, this work achieves over 40-dB cross-pol. isolation. The 64H + 64V phased-array transceiver modules in this work can support 2 × 2 DP-MIMO communications. Moreover, the area-efficient neutralized bi-directional beamformer achieves 16.1-dBm TX-mode saturated output power and 22% per path peak TX-mode efficiency. The measured RX-mode NF is 4.9 dB at 28 GHz. The occupied on-chip area for each beamformer is only 0.48 mm² due to the completely shared circuits between the TX and RX modes. Low-cost and power-efficient millimeter-wave DP-MIMO systems can be supported by this work for the 5G NR.
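The reported per-path figures can be related by simple arithmetic. The sketch below converts the 16.1-dBm saturated output power to milliwatts and divides it by the 186-mW TX-mode power consumption per beamformer path. It assumes, as a labeled assumption rather than a statement from the measurement, that the efficiency is defined as RF output power over consumed power and that the 186-mW figure applies at the saturated operating point.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

P_SAT_DBM = 16.1     # TX-mode saturated output power per path
P_DC_TX_MW = 186.0   # TX-mode consumed power per path (assumed to apply at saturation)

p_sat_mw = dbm_to_mw(P_SAT_DBM)
# Assumed definition: efficiency = RF output power / consumed power.
efficiency = p_sat_mw / P_DC_TX_MW

print(f"Saturated output power: {p_sat_mw:.1f} mW")
print(f"Estimated per-path efficiency: {100 * efficiency:.1f} %")
```

Under these assumptions, the estimate lands close to the reported 22% per path TX-mode efficiency.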

V. CONCLUSION
In this work, an analog-assisted cross-pol. leakage cancellation technique is utilized for 28-GHz DP-MIMO systems to improve the DP-MIMO EVMs and efficiency. Over 40-dB cross-pol. isolation is achieved after the proposed high-accuracy cancellation. The element-beamformer in this work maintains low-cost and high-efficiency features with the neutralized bi-directional architecture. A 16.1-dBm TX-mode saturated output power along with a 22% maximum TX-mode efficiency is realized within a 0.48-mm² on-chip area. The proposed 64H + 64V array modules achieve 3.4% EVM with standard-compliant DP-MIMO signals in 256-QAM. Low-cost and high-efficiency 5G NR millimeter-wave DP-MIMO systems could be realized.
Zheng Li (Graduate Student Member, IEEE) received the B.E. and M.E. degrees in microelectronics and solid electronics from Xidian University, Xi'an, China, in 2014 and 2017, respectively. He is currently pursuing the Ph.D. degree in electrical and electronic engineering with the Tokyo Institute of Technology, Tokyo, Japan, with a focus on 5G RF front-end and system design.
His current research interests include millimeter-wave CMOS wireless transceivers and 5G mobile systems.
Xueting Luo received the B.E. and M.E. degrees in electrical engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2018 and 2020, respectively.
She is currently with Sandisk Ltd., Yokohama, Japan.
Joshua Alvin (Student Member, IEEE) received the B.E. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2019, where he is currently pursuing the master's degree in electrical and electronic engineering.
Rattanan Saengchan received the B.E. degree in electrical engineering from Chulalongkorn University, Bangkok, Thailand, in 2016, and the M.S. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020.
From 2016 to 2017, he was with Cypress Semiconductor, Bangkok. He is currently with Renesas Electronics Corporation, Tokyo.
Ashbir Aviat Fadila received the B.S. degree in electrical engineering from the Institut Teknologi Bandung, Bandung, Indonesia, in 2015, and the M.S. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020, where he is currently pursuing the Ph.D. degree.
From 2015 to 2016, he was a Standard Cells Mask Layout Engineer with Marvell Technology Indonesia, Jakarta, Indonesia. From 2016 to 2017, he was a Research Assistant with the Institut Teknologi Bandung, where he was involved in the research of SoC for IoT application. His current research interests include analog-mixed signals, data converters, and synthesizable analog circuits.
Kiyoshi Yanagisawa received the B.E. and M.E. degrees in information science from Tohoku University, Sendai, Japan, in 1998 and 2000, respectively.
From 2013 to 2019, he was with Rohde & Schwarz, Tokyo, Japan, where he was engaged in the test and measurement of wireless systems. He is currently a Researcher with the Department of Electrical and Electronic Engineering, Tokyo Institute of Technology, Tokyo, Japan. His current research interests include new technology in wireless systems and devices. Mr. Yanagisawa is also a member of the Institute of Electronics, Information and Communication Engineers (IEICE).
Yi Zhang (Student Member, IEEE) received the B.E. degree in microelectronic science and engineering from the Southern University of Science and Technology, Shenzhen, China, in 2018, and the M.E. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020, where he is currently pursuing the Ph.D. degree in electrical and electronic engineering.
His research interests include analog and RF circuit design for millimeter-wave CMOS transceivers in 5G mobile systems.
Zixin Chen received the B.E. degree in microelectronic science and engineering from Xi'an Jiaotong University, Xi'an, China, in 2018, and the M.S. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020.
He is currently with Sony Semiconductor Solutions Corporation, Atsugi, Japan.
Zhongliang Huang received the B.E. degree in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2016, and the M.S. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020.
He is currently with Socionext Inc., Tokyo.
Xiaofan Gu received the B.E. degree in electrical engineering and automation from Wuhan University, Wuhan, China, in 2018, and the M.S. degree in electrical and electronic engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2020.
She is currently with Micron Memory Japan, G.K., Kanagawa, Japan.