A Study of a Millimeter-Wave Transmitter Architecture Realizing QAM Directly in RF Domain

Realization of high-order modulation schemes directly in the RF domain enables the generation of spectrally efficient $4^{M}$ quadrature-amplitude-modulated ($4^{M}$-QAM) symbols using the vectorial summation of $M$ quadrature phase-shift keying (QPSK) signals whose amplitudes are progressively scaled by a constant factor of two. Called RF-QAM, this approach leads to numerous advantages, including the elimination of the power-hungry digital-to-analog converter (DAC) and the mitigation of the stringent linearity requirement of the front-end power amplifier (PA). This paper presents a comprehensive comparative study of RF-QAM and conventional transmitters. The design issues associated with the front end and the mixed-signal blocks for both architectures are investigated, and the performance of these two designs is compared. Various circuit- and system-level simulations verify the superior performance of the RF-QAM transmitter compared to the conventional counterpart.

in a significant drop in the performance and efficiency of key active circuit blocks such as the power amplifier (PA) and oscillator. (2) Increasing the bandwidth requires back-end blocks such as the digital-to-analog converter (DAC) and digital-signal-processor (DSP) unit to operate at ultra-high sampling rates, which leads to excessively high power consumption (e.g., ≥ 300 mW). (3) Increasing the bandwidth increases the system's integrated noise, thereby degrading the signal-to-noise ratio (SNR).
To alleviate the above concerns, high-spectral-efficiency modulation techniques are commonly used, enabling higher data rates without the need to increase the bandwidth [10]. However, the generation of high-order modulations at high frequencies requires a power-hungry back-end DSP as well as high-speed, high-resolution data converters [11]. The realization of high-order modulation directly in the RF domain promises to markedly relax these issues [12], thereby facilitating a high-speed end-to-end transceiver. One possible solution to generate QAM signals in the RF domain is the use of an RF-DAC [13], [14]. However, an RF-DAC offers limited bandwidth due to the need for multi-stacked transistor stages. Additionally, its error-vector-magnitude (EVM) degradation is substantial at high data rates. Finally, switching transistors with accurate binary weighting and low dynamic error above 10 GHz are extremely difficult to implement.
Traditionally, transmitting a digitally modulated signal involves two main tasks: (1) symbol generation and (2) frequency upconversion. The former is responsible for generating the digital baseband symbols and is usually done on a DSP, whereas the latter is implemented in the analog domain. A transmitter implementing the modulation directly in the RF domain blends these two steps together, eliminating the power- and area-hungry on-chip digital circuitry. Recently, bits-to-RF above-100-GHz RF-8PSK and RF-16QAM transmitters in silicon were disclosed [1], [9]. These transmitters achieved 15 and 20 Gbps data rates, respectively, while consuming less than 600 mW.
The conventional direct-conversion transmitter incorporating the $4^{M}$-QAM scheme is depicted in Fig. 1(a). The digital baseband $4^{M}$-QAM symbols (originally comprised of two $2^{M}$-PAM symbols) are generated in the DSP and are then fed to two DACs to be converted to analog signals. The baseband I and Q components are then upconverted to RF with the aid of I and Q mixers fed by quadrature local oscillator (LO) signals. The two upconverted I and Q components are combined to generate the $4^{M}$-QAM RF signal, ready to be transmitted following the power amplification. Shown in Fig. 1(b) is the system block diagram of the RF-QAM transmitter. It consists of $M$ QPSK modulators, where the amplitude ratio of any two side-by-side QPSK signals is a constant factor of two [1], [2], [12], [15]. Each QPSK modulator directly receives two input binary streams (which eliminates the use of DAC and DSP), and simultaneously generates and upconverts the QPSK symbols. Moreover, each QPSK path employs an explicit PA. Finally, a power combiner adds these $M$ QPSK waveforms to construct a $4^{M}$-QAM RF signal prior to transmission.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This paper provides a thorough analytical study as well as circuit- and system-level simulations of both the RF-QAM and conventional transmitters. It also presents a performance comparison between these two architectures. The remainder of this paper is organized as follows: Section II presents an analytical study of the conventional transmitter. Section III provides an in-depth analysis of the RF-QAM transmitter architecture and compares it to the conventional counterpart. Section IV provides the simulation results of both architectures. Finally, Section V provides concluding remarks.

II. CONVENTIONAL TRANSMITTER
A. Power Amplifier
1) EVM Degradation Due to PA Non-Linearity: It is widely known that PA non-linearity leads to amplitude compression, whose detrimental effect is exacerbated when dealing with envelope-variable modulation schemes such as high-order QAM [16]. To quantify this performance degradation in the conventional architecture, the EVM due to PA non-linearity is calculated. Prior work analyzed nonlinearity-induced constellation distortion and EVM degradation using a polynomial model for the PA [17]. In this work, PA non-linearity is modeled using the method introduced in [18], which relates the PA output's phase shift and amplitude to the input amplitude. Suppose that the modulated input signal is expressed as $x(t) = a(t)\cos\left(\omega_{c}t + \phi(t)\right)$, so that the PA output becomes
$$y(t) = A(t)\cos\left(\omega_{c}t + \phi(t) + \Phi(t)\right), \tag{1}$$
where $A(t)$ and $\Phi(t)$ capture the "AM/AM conversion" and "AM/PM conversion", respectively, and both are functions of the input signal's amplitude, $a(t)$, i.e.,
$$A(t) = \frac{\alpha_{1}\,a(t)}{1+\beta_{1}\,a^{2}(t)}, \qquad \Phi(t) = \frac{\alpha_{2}\,a^{2}(t)}{1+\beta_{2}\,a^{2}(t)}, \tag{2}$$
where $\alpha_{1}$, $\alpha_{2}$, $\beta_{1}$, and $\beta_{2}$ are empirical fitting parameters [16]. PA non-linearity causes the transmitted symbols to deviate from the ideal ones in the constellation diagram in two different ways: (1) AM/PM conversion acts as a phase shift rotating the symbols around the origin of the constellation diagram, while maintaining a fixed distance from the origin. (2) AM/AM conversion changes the radial distance of the rotated symbols from the origin. These effects are shown in Fig. 2 for only one of the QAM symbols for the sake of clarity.
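The two conversions can be sketched numerically. The snippet below is a minimal illustration, assuming the AM/AM and AM/PM functions take the Saleh-type rational form $A = \alpha_{1}a/(1+\beta_{1}a^{2})$ and $\Phi = \alpha_{2}a^{2}/(1+\beta_{2}a^{2})$ with $\Phi$ in radians, and assuming that ($\alpha_{1}$, $\beta_{1}$) belong to AM/AM and ($\alpha_{2}$, $\beta_{2}$) to AM/PM. The parameter values are the ones quoted later in this section for the simulated PA; the function names are hypothetical.

```python
import cmath
import math

# Fitting parameters quoted in the text for the simulated 125-GHz CMOS PA.
# Pairing (alpha1, beta1) with AM/AM and (alpha2, beta2) with AM/PM is an assumption.
ALPHA1, BETA1 = 8.34, 10.47
ALPHA2, BETA2 = 11.18, 19.67


def am_am(a):
    """Output amplitude for input amplitude a (assumed Saleh-type form)."""
    return ALPHA1 * a / (1.0 + BETA1 * a * a)


def am_pm(a):
    """Output phase shift in radians for input amplitude a (assumed form)."""
    return ALPHA2 * a * a / (1.0 + BETA2 * a * a)


def pa_output(symbol):
    """Apply both conversions to one complex baseband symbol."""
    a = abs(symbol)
    if a == 0.0:
        return 0j
    # AM/AM sets the new radius, AM/PM adds a rotation about the origin.
    return cmath.rect(am_am(a), cmath.phase(symbol) + am_pm(a))


if __name__ == "__main__":
    a_u = 0.05  # hypothetical unit amplitude
    for n, m in [(1, 1), (3, 1), (3, 3)]:
        z = a_u * complex(n, m)
        out = pa_output(z)
        rotation_deg = math.degrees(cmath.phase(out / z))
        print(f"({n},{m}): gain={abs(out) / abs(z):.3f}, rotation={rotation_deg:.2f} deg")
```

Outer symbols see both a lower gain and a larger rotation than inner ones, which is the symbol-dependent distortion the EVM analysis quantifies.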
In a conventional transmitter generating $4^{M}$-QAM symbols in the digital domain, each symbol has its own error vector (EV) determined by the symbol power (i.e., the square of the distance from the origin of the constellation diagram). $EV_{n,m}$ is defined to be the EV associated with symbol $(I, Q) = (n, m)$. Because the constellation is symmetric, it suffices to average over the $4^{M-1}$ symbols of one quadrant. Therefore, EVM is obtained to be
$$\mathrm{EVM} = \sqrt{\frac{1}{ASP}\cdot\frac{1}{4^{M-1}}\sum_{n,m}\left|EV_{n,m}\right|^{2}}, \tag{3}$$
where $n$ and $m$ are odd numbers (i.e., $n, m = 2l - 1$ where $l \in \mathbb{N}$), ASP denotes the average symbol power, and $M$ is the order of modulation. The following steps are taken to calculate EVM: (1) The ASP of the transmitted constellation diagram as well as the average rotation angle of all QAM symbols are calculated. Next, a perfect $4^{M}$-QAM constellation with no impairment is considered, which is rotated by this average rotation angle to obtain a reference constellation with a symbol-to-symbol spacing of $2d$. Furthermore, $d$ is calculated such that the transmitted and the reference constellation ASPs are equal. (2) The effective phase difference between the two symbols in the transmitted and the corresponding reference constellation diagrams, $\psi_{n,m}$, is calculated (Fig. 2). (3) Using $d$ and $\psi_{n,m}$, the EV for each symbol in a $4^{M}$-QAM constellation diagram is derived and EVM is calculated accordingly. The PA input-signal amplitude during the transmission of symbol $(I, Q) = (n, m)$ within the $4^{M}$-QAM constellation is denoted by $a_{n,m} = a_{u}\sqrt{n^{2} + m^{2}}$, where $a_{u}$ is the unit amplitude. The average symbol power, ASP, of the distorted constellation diagram is derived to be:
$$ASP = \frac{1}{4^{M-1}}\sum_{n,m} SP_{n,m}, \tag{4}$$
where $SP_{n,m}$, the power of the distorted symbol $(I, Q) = (n, m)$ at the PA output, is
$$SP_{n,m} = A^{2}\left(a_{n,m}\right). \tag{5}$$
As previously stated, the reference constellation is adjusted in two ways: (1) it is scaled by changing the minimum symbol-to-symbol spacing to match the transmitted average symbol power obtained in Eq. (4); (2) it is rotated by the average rotation angle of all constellation points, $\theta_{avg}$, derived in Appendix A. Additionally, the ASP of the reference constellation (assuming $2d$ symbol-to-symbol spacing) is calculated to be
$$ASP_{ref} = \frac{2}{3}\left(4^{M}-1\right)d^{2}. \tag{6}$$
The derivation steps to calculate $d$ are found in Appendix A. $EV_{n,m}$ caused by AM/AM and AM/PM conversions is calculated to be: where $E^{P}_{n,m}$ and $E^{A}_{n,m}$ represent the EVs of symbol $(I, Q) = (n, m)$ generated by the AM/PM and AM/AM conversions, respectively, which are derived to be where $\psi_{n,m}$ is the effective phase difference between the transmitted $(I, Q) = (n, m)$ symbol and its associated reference constellation point. Moreover, $d_{n,m}$ is the distance of the $(I, Q) = (n, m)$ symbol in the reference constellation to the origin. These two parameters are calculated in Appendix A. Using Eqs. (3)-(5), and (7), the EVM of a conventional transmitter handling $4^{M}$-QAM is derived, as follows
A CMOS 125-GHz PA, whose topology and design specifications will be disclosed in Sec. IV-A, is considered. The circuit-simulated fitting parameters capturing the PA non-linearity are $\alpha_{1} = 8.34$, $\beta_{1} = 10.47$, $\alpha_{2} = 11.18$, and $\beta_{2} = 19.67$. Fig. 3(a) shows the plots of EVM as calculated by Eq. (10) for three modulation schemes, namely, 16QAM, 64QAM, and 256QAM. It is observed that EVM is degraded as the PA input amplitude grows. Additionally, the rate of this degradation increases with the modulation order.
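The three-step procedure above can be mirrored by a brute-force calculation. The sketch below is not the closed-form Eq. (10); it is a numerical stand-in under stated assumptions: the PA follows a Saleh-type AM/AM and AM/PM model with the quoted fitting parameters, the average rotation angle is taken as the plain mean of the per-symbol rotations (Appendix A may define it differently), and the EV is the direct complex difference between each transmitted symbol and its reference point. All function names are hypothetical.

```python
import cmath
import math


def pa(z, a1=8.34, b1=10.47, a2=11.18, b2=19.67):
    """Assumed Saleh-type PA model acting on a complex baseband symbol."""
    a = abs(z)
    gain = a1 / (1.0 + b1 * a * a)
    phase = a2 * a * a / (1.0 + b2 * a * a)  # radians (assumption)
    return z * gain * cmath.exp(1j * phase)


def qam_points(M):
    """Ideal 4^M-QAM grid with odd integer coordinates (spacing 2)."""
    levels = range(-(2 ** M - 1), 2 ** M, 2)
    return [complex(n, m) for n in levels for m in levels]


def reference_asp(M, d):
    """ASP of an ideal 4^M-QAM constellation with spacing 2d."""
    return (2.0 / 3.0) * (4 ** M - 1) * d * d


def evm_db(M, a_u):
    ideal = qam_points(M)
    tx = [pa(a_u * s) for s in ideal]
    # Step 1: ASP of the distorted constellation and average rotation angle.
    asp_tx = sum(abs(t) ** 2 for t in tx) / len(tx)
    theta_avg = sum(cmath.phase(t / s) for t, s in zip(tx, ideal)) / len(tx)
    # Choose d so that the reference ASP equals the transmitted ASP.
    d = math.sqrt(asp_tx / reference_asp(M, 1.0))
    ref = [d * s * cmath.exp(1j * theta_avg) for s in ideal]
    # Steps 2-3: per-symbol error vectors and their mean power.
    ev_power = sum(abs(t - r) ** 2 for t, r in zip(tx, ref)) / len(tx)
    return 10.0 * math.log10(ev_power / asp_tx)


if __name__ == "__main__":
    for M in (2, 3, 4):
        print(4 ** M, "QAM:", [round(evm_db(M, a), 1) for a in (0.005, 0.01, 0.02)])
```

Under these assumptions the computed EVM grows with the unit amplitude and, at a fixed unit amplitude, with the modulation order, which matches the qualitative trend described for Fig. 3(a).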
2) The Impact of Bandwidth Limitation on EVM: The limited bandwidth of PA and DAC as well as other blocks contribute to intersymbol interference (ISI). In a conventional 4 M QAM architecture, a non-zero EV is generated due to the unsettled transition from one symbol to another because of the transmitter's limited bandwidth. Taking a similar approach to [1], for 4 M QAM, the EV's probability density function (PDF) of in-phase and quadrature components are , and BW T X and f B B are the transmitter's low-pass-equivalent bandwidth and the baseband symbol rate, respectively. Therefore, the average EV power is readily obtained: Additionally, ASP of the reference constellation is derived as Hence, Fig. 4(a) shows the plot of Eq. (14) for QPSK and three different QAM schemes (i.e., 16QAM, 64QAM, and 256QAM). It is observed that EVM induced by the transmitter's limited bandwidth increases with the modulation order.
3) PA Efficiency and Output Power: This section studies the PA's available output power and its efficiency in the conventional architecture. The peak symbol power in a $4^{M}$-QAM scheme with a symbol-to-symbol spacing of $2d$ is
$$P_{max} = 2\left(2^{M}-1\right)^{2}d^{2}. \tag{15}$$
Likewise, the average power ($P_{avg} = ASP$) was calculated in Eq. (6). Therefore, the PAPR for a $4^{M}$-QAM signal is
$$PAPR = \frac{P_{max}}{P_{avg}} = \frac{3\left(2^{M}-1\right)}{2^{M}+1}, \tag{16}$$
which is plotted in Fig. 4(b) with respect to the modulation order. This plot shows that the PAPR grows more slowly as the modulation order increases. In a conventional $4^{M}$-QAM transmitter, the PA should operate in its linear region, dictating $P_{max} < P_{1dB}$. Therefore,
$$P_{avg} < \frac{P_{1dB}}{PAPR}, \tag{17}$$
meaning that the PA should back off from its 1-dB compression point by at least the PAPR value. For instance, a PA handling a 64QAM signal with a PAPR of 2.33 should operate at a minimum of 3.7 dB backoff from its $P_{1dB}$. The PA efficiency is bounded accordingly:
$$\eta = \frac{P_{avg}}{P_{DD}} < \frac{P_{1dB}}{PAPR \times P_{DD}}. \tag{18}$$
Eq. (18) explicitly indicates that the efficiency drops, at least, by the PAPR value. For the CMOS PA circuit in Section IV-A(b) with $P_{DD} = 211$ mW, Fig. 3(b) shows the efficiency of the PA in terms of $a_{u}$ for 16QAM, 64QAM, and 256QAM, where the peak efficiency for all modulation schemes remains the same at around 7.5%. Additionally, Figs. 5(a) and 5(b) show EVM and efficiency, respectively, for the aforementioned QAM schemes in terms of the PA backoff from its 1-dB compression point. It should be noted that: (1) These plots confirm the common knowledge that backing off from the 1-dB compression point improves the EVM and deteriorates the efficiency. (2) Even if the PA operates at its $P_{1dB}$ (i.e., zero backoff), the peak efficiency cannot be reached because $P_{1dB} < P_{sat}$. Figs. 3(a)-3(b) and 5(a)-5(b) reveal a tight trade-off between EVM and efficiency for the PA in a conventional transmitter. Specifically, if the PA is biased at the operating point for maximum efficiency, the EVM is severely degraded. For instance, to achieve −30-dB EVM, the PA efficiency would be less than 1.5% for all three QAM schemes. On the other hand, the 7.5% peak efficiency is only attained at a poor EVM, i.e., EVM ≥ −12 dB.
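The PAPR figures quoted in this subsection can be checked with a few lines of arithmetic. The sketch below assumes a square $4^{M}$-QAM grid with odd-integer coordinates and defines symbol power as the squared distance from the origin, as in the text; the closed form $3(2^{M}-1)/(2^{M}+1)$ follows from that definition, and the function names are hypothetical.

```python
import math


def papr_closed_form(M):
    """PAPR of square 4^M-QAM (linear ratio), peak power over average power."""
    return 3.0 * (2 ** M - 1) / (2 ** M + 1)


def papr_brute_force(M):
    """Same ratio computed directly from the constellation points."""
    levels = range(-(2 ** M - 1), 2 ** M, 2)
    powers = [n * n + m * m for n in levels for m in levels]
    return max(powers) / (sum(powers) / len(powers))


def min_backoff_db(M):
    """Minimum backoff from P_1dB implied by P_max < P_1dB."""
    return 10.0 * math.log10(papr_closed_form(M))


if __name__ == "__main__":
    for M in (1, 2, 3, 4):
        print(f"4^{M}-QAM: PAPR = {papr_closed_form(M):.3f} "
              f"({min_backoff_db(M):.2f} dB)")
```

For 64QAM ($M = 3$) this gives a ratio of 7/3, i.e., roughly 3.7 dB, consistent with the backoff example in the text; QPSK ($M = 1$) gives a ratio of one.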

B. DAC Challenges
In a conventional $4^{M}$-QAM transmitter, the baseband signal is generated using two DACs in the I and Q paths. In a direct-conversion transmitter (DCT), the minimum DAC sampling rate is twice the baud rate of the baseband signal (i.e., $f_{DAC} > 2 f_{BB}$; in practice, it could be as high as four times). Additionally, a heterodyne architecture mandates the minimum DAC sampling rate to be twice the IF frequency. A high-IF transmitter may thus need high-speed DACs with high power consumption. On the other hand, higher-order modulation schemes require higher-resolution DACs. Not only are high-speed and high-resolution DACs difficult to implement, they are also extremely power-hungry [19]. A number of recently published DACs are summarized in Table I. A quantitative study of the DAC power consumption and its dependency on resolution and modulation order will be provided later in this section.
Fig. 6. EVM due to DAC INL for a 64QAM scheme and four DAC resolutions (i.e., 3-6 bits).
The DAC linearity, characterized by its integral non-linearity (INL) and differential non-linearity (DNL), directly degrades the transmitter EVM, distorts its constellation, and closes the eye-diagram. The distortion caused by non-zero INL is quantified, as follows. The least-significant-bit voltage, L S B, corresponding to the symbol-to-symbol spacing in the constellation diagram, is defined as where V max and V min are the output voltage when input code is 2 N − 1 and 0, respectively. A pair of decimal numbers (n, m) at the inputs of I and Q DACs generate the symbol Assuming the I and Q DACs to be identically matched, the EV caused by their non-zero INLs for symbol (I, With the ASP of a 4 M QAM signal expressed in Eq. (6), EVM is readily derived, i.e., A simplifying assumption is first adopted where all input digital codes are assumed to have the same INL. This assumption enables us to plot EVM with respect to the DAC INL in Fig. 6 that demonstrates EVM for a 64QAM scheme under four different DAC resolutions (i.e., N ). The DAC resolution is determined based on a targeted EVM and DAC INL. Additionally, the accuracy within which the digital pulse-shaping filter is reconstructed in the analog domain is another factor in determining the DAC resolution.
The DAC in a conventional $4^{M}$-QAM transmitter targeting data rates above 100 Gbps consumes considerable power. Theoretically, analog reconstruction of a $4^{M}$-QAM symbol requires a minimum of $M$ bits (i.e., $N_{min} = M$). Based on the plot of Fig. 6, a targeted EVM imposes an upper limit for the INL ($INL_{p}$). As an example, a 64QAM scheme with the desired BER of $10^{-5}$ should have an EVM better than −25 dB [24]. Suppose that the three impairments investigated in this work (i.e., PA and DAC nonlinearities and device noise) make equal contributions to the EVM degradation. Under this special-case scenario, the DAC's contribution to EVM should not exceed −35 dB. This EVM translates to a maximum INL of 0.05 LSB for a 3-bit DAC. High-speed DACs with such a low INL are difficult to realize, especially in CMOS. Referring to Fig. 6, to increase the maximum tolerable INL and bring it to practically viable levels (e.g., 0.4 LSB) for the given EVM, the DAC's resolution should be increased. It is readily proved that increasing the resolution by one bit will double the maximum tolerable INL. The required DAC resolution for this new INL value is
Dynamic performance parameters contributing to linearity, such as third-order harmonic distortion (HD3), will impose more adverse effects at high frequencies than INL. Specifically, it is proved that the required output impedance of a widely used current-steering DAC for a given HD3 is [25] $|Z_{o}| \geq$ In a high-resolution DAC (e.g., 8-bit) operating at high frequencies above 10 GHz, the overall shunt capacitance dominates the output impedance. Therefore, (22) becomes extremely challenging to satisfy, as was ascertained in [1].
To quantify the DAC performance, a simple figure-of-merit (FoM) is introduced in this work, as follows
$$FoM = \frac{P}{2^{N} \times f_{s}}, \tag{23}$$
where $P$ is the DAC power consumption, $f_{s}$ is the sampling rate, and $N$ is the resolution. Rearranging Eq. (23), the minimum power consumption of a DAC is calculated, i.e.,
$$P_{min} = FoM_{min} \times 2^{N} \times f_{s}. \tag{24}$$
To appreciate the impact of the DAC on the overall power consumption of the transmitter, consider the above 64QAM example targeting a 100 Gbps data rate. This data rate requires a DAC with a minimum sampling rate of 33.3 GS/s. Based on a comprehensive survey of recently published DACs, the minimum FoM ever achieved at such high speeds of operation is around 209 fJ/conversion-step [20]. Moreover, based on Eq. (21), the minimum resolution for this DAC would be 6 bits. Therefore, the minimum total power consumption of the I and Q DACs in a direct-conversion architecture is calculated to be around 890 mW. It is noteworthy that this calculation does not account for the power dissipation of the clock buffers and clock generator circuits, which could be significant [26]. From (24), it is seen that the power consumption increases linearly with the sampling rate and exponentially with the resolution. The absence of data converters in an RF-QAM transmitter thus results in significant power saving.
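The 64QAM example can be reproduced step by step. The sketch below assumes the FoM is defined as $P/(2^{N} f_{s})$, which is the form that reproduces the quoted 890 mW, and uses the stated rule that each extra bit doubles the tolerable INL to obtain the resolution; the exact form of Eq. (21) is not reproduced here, and the function names are hypothetical.

```python
import math


def min_sampling_rate(bit_rate, M, oversampling=2.0):
    """Minimum DAC rate: oversampling times the baud rate of 4^M-QAM."""
    symbol_rate = bit_rate / (2 * M)  # 4^M-QAM carries 2M bits per symbol
    return oversampling * symbol_rate


def required_resolution(n_min, inl_at_n_min, inl_target):
    """Bits needed when every extra bit doubles the tolerable INL (in LSB)."""
    extra_bits = math.log2(inl_target / inl_at_n_min)
    # Small guard so that floating-point noise does not add a spurious bit.
    return n_min + math.ceil(extra_bits - 1e-9)


def dac_power(fom_joule_per_step, n_bits, f_s):
    """Power of one DAC from the assumed FoM definition P = FoM * 2^N * f_s."""
    return fom_joule_per_step * (2 ** n_bits) * f_s


if __name__ == "__main__":
    f_s = min_sampling_rate(100e9, M=3)            # about 33.3 GS/s
    n_bits = required_resolution(3, 0.05, 0.4)     # 3-bit start, 0.05 -> 0.4 LSB
    total = 2 * dac_power(209e-15, n_bits, f_s)    # I and Q DACs
    print(f"f_s = {f_s / 1e9:.1f} GS/s, N = {n_bits} bits, "
          f"P_total = {total * 1e3:.0f} mW")
```

Because the power scales with $2^{N}$, relaxing the INL target from 0.05 LSB to 0.4 LSB costs a factor of eight in DAC power under this model.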

C. Noise
The majority of noise sources in transceivers are either Gaussian by nature (e.g., device thermal noise) or can be approximated by a Gaussian random process (e.g., LO phase noise and communication-link noise [27]). Gaussian noise is characterized by a two-dimensional probability density function (PDF) with zero mean, i.e., $f_{XY}(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$. Changing the coordinates from Cartesian to polar, the square of the distance from the origin, $V = X^{2} + Y^{2}$, is calculated to be an exponential random variable with a PDF of [28] and [29]
$$f_{V}(v) = \frac{1}{2\sigma^{2}} \exp\left(-\frac{v}{2\sigma^{2}}\right), \quad v \geq 0, \tag{25}$$
where $\sigma^{2} = \eta B/2 = N_{0}/2$, and $B$ and $\eta$ are the system bandwidth and the white noise source PSD, respectively. The random variable $V$ has a mean value of $E[V] = 2\sigma^{2} = N_{0}$. In the case of the conventional $4^{M}$-QAM transmitter, EVM due to white Gaussian noise (WGN) is calculated to be [27] where $\rho_{avg}$ is the average SNR referenced to a 1-Ω resistance. It should be noted that: (1) this value can readily be transformed to an SNR referenced to an arbitrary resistance $R_{0}$, i.e., $\rho_{R_{0}} = \rho_{1\Omega}/R_{0}$, and (2) the average SNR is also related to $\rho_{min} = a_{u}^{2}/N_{0}$, which is defined to be the minimum SNR of the $4^{M}$-QAM signal, i.e., ρ avg = 2
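The statement that the squared distance of two-dimensional Gaussian noise is exponentially distributed with mean $2\sigma^{2}$ can be checked by simulation. The sketch below is a small Monte Carlo experiment using only the standard library; the value of $\sigma$, the sample count, and the function name are arbitrary choices made for illustration.

```python
import math
import random


def squared_distance_samples(sigma, count, seed=0):
    """Draw V = X^2 + Y^2 for independent zero-mean Gaussian X and Y."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
            for _ in range(count)]


if __name__ == "__main__":
    sigma = 0.5
    v = squared_distance_samples(sigma, 200_000)
    mean_v = sum(v) / len(v)
    # For an exponential variable with mean 2*sigma^2, P(V > mean) = exp(-1).
    tail = sum(1 for x in v if x > 2 * sigma ** 2) / len(v)
    print(f"sample mean = {mean_v:.4f}, expected = {2 * sigma ** 2:.4f}")
    print(f"P(V > 2 sigma^2) = {tail:.4f}, expected = {math.exp(-1):.4f}")
```

Both the sample mean and the tail probability should land close to the values predicted by the exponential PDF, with agreement improving as the sample count grows.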

III. RF-QAM TRANSMITTER
A. An Overview of Prior Work
This section summarizes the studies conducted by prior work, especially our work in [1], on various aspects of the RF-QAM transmitter.
1) RF-QAM EVM Calculation: Assuming that the data bits fed to the QPSK modulators are statistically independent, the mean power of EVs can be added together to obtain the total EV. Therefore, EVM is readily obtained to be [1]: For a special case where all QPSK signals have equal EVMs, the high-order QAM EVM is basically the same as the QPSK EVM. This highlights an important advantage of QAM generation based on vectorial summation of QPSK signals in RF domain. Specifically, while low-EVM generation/upconversion/amplification of 4 M QAM using a conventional scheme faces incredible challenges, it is far easier to achieve low-EVM constant-amplitude QPSK modulation. As a consequence, RF-QAM structure is capable of attaining low EVM values, not achievable using conventional architectures, at near-f max carrier frequencies.
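The way independent QPSK error vectors add up can be illustrated numerically. The sketch below assumes that the overall EVM is the power-weighted root-mean-square of the individual QPSK EVMs, with the $i$th QPSK carrying $4^{i}$ times the power of the smallest one because the amplitudes scale by two. This is one consistent reading of the additive-EV-power argument, not a transcription of the expression in [1]; the function names are hypothetical.

```python
import math


def rf_qam_evm(qpsk_evms):
    """Combine linear QPSK EVMs; index 0 is the smallest-amplitude QPSK."""
    weights = [4 ** i for i in range(len(qpsk_evms))]  # relative QPSK powers
    ev_power = sum(w * e * e for w, e in zip(weights, qpsk_evms))
    return math.sqrt(ev_power / sum(weights))


def db(evm_linear):
    return 20.0 * math.log10(evm_linear)


if __name__ == "__main__":
    equal = [0.03, 0.03, 0.03]        # three QPSKs -> 64QAM, equal EVMs
    unequal = [0.10, 0.03, 0.03]      # degraded smallest QPSK
    print(f"equal EVMs:   {db(rf_qam_evm(equal)):.2f} dB")
    print(f"unequal EVMs: {db(rf_qam_evm(unequal)):.2f} dB")
```

With equal inputs the result equals the QPSK EVM, as the text states. With unequal inputs the largest QPSK dominates, since it carries most of the average symbol power.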
2) Bandwidth Limitation: As was discussed in Section II-A.2, a non-zero EV is generated due to the unsettled transition from one symbol to another because of the transmitter's limited bandwidth. For QPSK, the EV's PDF of in-phase and quadrature components are obtained to be the same as (11), ). Therefore, Fig. 4 includes the plot of EVM for the QPSK scheme obtained by Eq. (28). This plot clearly shows that the adverse effect of the transmitter's limited bandwidth on EVM is less pronounced for the QPSK compared to high-order QAM.
3) Amplitude Mismatch: As previously mentioned, ideally, the amplitude ratio of different QPSK signals should be exactly two. However, due to PVT variations, this ratio is not always maintained, causing non-zero EVs. As was thoroughly discussed in [1], in the case of 16QAM, if the amplitude ratio is $2/(1 + \Delta_{1})$, the EVM caused by this amplitude mismatch is readily calculated: Additionally, for a higher order QAM scheme, the ampli- To approximate the EVM of a $4^{M}$-QAM signal ($M \geq 3$) due to amplitude mismatches, only the three largest QPSKs are considered [1], resulting in where $\Delta_{M-2}$ is defined as
4) Phase Mismatch: Starting with the special case of a 16QAM scheme, the EVM caused by the phase mismatches between different QPSK signals is readily calculated to be where $\varphi_{1}$ is the phase difference between the first and second QPSK signals. For the general case of $4^{M}$-QAM, the EVM for $M \geq 3$ is estimated by considering only the three largest QPSKs [1], resulting in where $\Delta\varphi = \varphi_{M-2} - \varphi_{M-1}$, and $\varphi_{i}$ is the phase difference between the $M$th and the $i$th QPSK signals (i.e., $\varphi_{i} = \theta_{M} - \theta_{i}$).
5) Local LO I/Q Phase and Gain Imbalance: Each path deals with a QPSK signal. Therefore, the LO I/Q phase imbalance in each path distorts the corresponding QPSK signal. It is shown in [16] that the constellation of a QPSK signal in the presence of LO I/Q phase imbalance is compressed along one diagonal and stretched along the other. Therefore, a non-zero EVM for the corresponding QPSK signal is produced by this phenomenon, as follows where $\delta_{\theta}$ is the phase mismatch between the LO I/Q signals. The effect of the amplitude imbalance of the LO I/Q signals (i.e., gain mismatch) is also analyzed in [16]. The amplitude imbalance causes the QPSK symbols to be stretched in one Cartesian direction (i.e., I or Q axis) and compressed in the other direction (i.e., Q or I axis), thereby causing a non-zero EV. Therefore, EVM, in this case, can be calculated, as follows where $\delta_{a}$ is the amplitude imbalance of the I and Q paths.
B. Power Amplifier
1) EVM Degradation Due to PA Non-Linearity: In the RF-QAM transmitter, the $4^{M}$-QAM constellation is constructed by $M$ constant-amplitude QPSK signals in which the ratio of symbol-to-symbol spacings in any two adjacent QPSK sub-constellations must be kept at a constant value of two. This ratio is easily maintained by fine-tuning the DC bias current of each QPSK modulator [1]. Moreover, as noted in [12] and [15] and recited in Section I, the RF-QAM architecture has the unique advantage that the power amplification can be performed on each QPSK signal prior to the power combining. This means that each PA is now handling a constant-amplitude signal (i.e., QPSK), which does not substantially suffer from PA non-linearity. This notion suggests that if the PA is fed with a QPSK rather than a QAM signal, the adverse effects such as EVM degradation due to the PA non-linearity will be noticeably reduced. The use of a constant-envelope PA means that the RF-QAM transmitter performance will not be degraded by AM/AM and AM/PM distortions, suggesting that EVM is invariant with respect to the PA input power; a remarkable advantage. Therefore, non-linear PA topologies such as class-D can be used to significantly improve PA efficiency.
Since each PA in RF-QAM transmitter is fed with a QPSK, the data is encoded in the phase of the input signal to the PA. Therefore, although the non-linearity introduced by the PA causes distortion to the PSD of the transmitted signal (i.e., spectral regrowth), it does not impact the information encoded into the phase of each QPSK signal. Even in the case where pulse shaping has made each QPSK amplitude variable, non-linear PAs can still be incorporated, as the amplitude distortion caused by the PAs does not affect the data bits encoded in the phase. However, this phenomenon increases the out-of-band emission of the transmitter.
2) The Impact of Bandwidth Limitation on EVM: The limited RF bandwidth in any transmitter including the one incorporating a constant-envelope modulation causes EVM degradation. Particularly, the limited bandwidth degrades the EVM of each QPSK modulated signal in the RF-QAM transmitter of Fig. 1(b), as was quantified in (28). Furthermore, since each QPSK signal is now of limited bandwidth, its amplitude is no longer considered to be constant. Therefore, AM/AM and AM/PM distortions should be taken into account. These two effects are investigated first under the Nyquist channel condition for zero-ISI where the symbol rate remains smaller than twice the baseband bandwidth [30]. This leads to the condition ϵ ≪ d, which implies a small input amplitude variation. Starting with AM/AM distortion, the PA in each QPSK path of the RF-QAM transmitter in Fig. 1(b) should always operate at its P sat , where the input power variation does not change the output power substantially, to achieve maximum efficiency. Therefore, the AM/AM distortion has a negligible impact on the PA performance. As for the effect of AM/PM distortion, based on (2), small input variation causes every constellation point, displaced because of the limited transmitter bandwidth, to rotate approximately with the same rotation angle. Hence, EVM remains more or less unchanged.
Using the same simulated PA in this work (details in Sec IV-A), when ϵ/d varies from 0.01 to 0.1, calculations show only a maximum of 0.1 dB deterioration in EVM due to this AM/PM distortion. From (2), one way of reducing the effect of AM/PM distortion is to design the PA pre-drivers of each QPSK path so that the PA operates in a region where the variation rate of the AM/PM distortion with respect to input power is very small. The distortion rate in terms of the simulated PA input power is shown in Fig. 8(a). The EVM due to AM/PM distortion is shown to experience its worst value for a = 1/ √ 3β 2 , or equivalently, when P in = 5.23 − 10 log β 2 dBm. In cases where the zero-ISI condition during signal transmission is relaxed, higher symbol rates can be allowed [31], and ϵ/d can thus assume appreciable values (e.g., 0.1 ≤ ϵ/d ≤ 0.25, where 0.25 value corresponds to symbol rate as high as 3 times the baseband bandwidth). In such cases, PAPR of the QPSK signal is approximated to be: Since the PAPR of the bandwidth-limited QPSK signal is still low (at least three times lower) than that of a QAM signal with a minimum PAPR of 2.6 dB (corresponding to 16QAM constellation), AM/AM and AM/PM distortions will impose negligible degradation on EVM compared to the direct effect of limited bandwidth given by (28). Fig. 8(a) shows EVM under different values of ϵ/d. It is observed that EVM is degraded by as much as a maximum of 0.2 dB in the presence of AM/PM distortion, implying that EVM due to bandwidth limitation dominates the one caused by the AM/PM distortion. As a consequence, the impact of AM/AM and AM/PM distortions on the RF-QAM architecture can be ignored. This notion points to a unique advantage of RF-QAM architecture, i.e., the PA non-linearity has a negligible impact on the RF-QAM transmitter, not only in the case of ideal QPSK signals, but also in a practical scenario where the bandwidth is limited. 
This indicates that the PA in the proposed scheme can reliably operate at its P sat for maximum efficiency.
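The worst-case operating point quoted above, $a = 1/\sqrt{3\beta_{2}}$ or $P_{in} = 5.23 - 10\log\beta_{2}$ dBm, can be recovered numerically. The sketch below assumes a Saleh-type AM/PM function $\Phi(a) = \alpha_{2}a^{2}/(1+\beta_{2}a^{2})$, interprets $a$ as a peak voltage amplitude in volts, and assumes a 50-Ω reference impedance for the conversion to dBm. These assumptions reproduce the 5.23 constant but are not stated explicitly in the text; the function names are hypothetical.

```python
import math

ALPHA2, BETA2 = 11.18, 19.67  # AM/PM fitting parameters quoted for the PA


def am_pm_slope(a, alpha2=ALPHA2, beta2=BETA2):
    """d(Phi)/da for the assumed Saleh-type AM/PM function."""
    return 2.0 * alpha2 * a / (1.0 + beta2 * a * a) ** 2


def worst_case_amplitude(beta2=BETA2):
    """Amplitude at which the AM/PM slope is largest."""
    return 1.0 / math.sqrt(3.0 * beta2)


def amplitude_to_dbm(a, r_load=50.0):
    """Power in dBm of a sinusoid with peak amplitude a (volts) into r_load."""
    return 10.0 * math.log10(a * a / (2.0 * r_load) / 1e-3)


if __name__ == "__main__":
    a_w = worst_case_amplitude()
    print(f"worst-case amplitude = {a_w:.4f} V")
    print(f"worst-case input power = {amplitude_to_dbm(a_w):.2f} dBm")
    print(f"5.23 - 10 log10(beta2) = {5.23 - 10 * math.log10(BETA2):.2f} dBm")
```

Designing the pre-drivers so that the PA input sits well away from this amplitude keeps the phase rotation nearly identical for all displaced constellation points.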
3) PA Efficiency and Output Power: Each PA in the proposed RF-QAM transmitter is fed by a constant-envelope signal (i.e., QPSK). Therefore, the transmitter performance is not limited in any shape or form by the PAPR, and the PA can thus be designed to operate at its maximum efficiency (thus minimizing the PA power dissipation) while its EVM remains unchanged. The PA efficiency, in this case, is increased to A comparison between the two efficiencies in Eqs. (18) and (37) reveals a remarkable advantage of RF-QAM transmitter over the conventional architecture. The PA output power in an RF-QAM transmitter does not have to operate in its power backoff regime, while, at the same time, it can be reliably boosted beyond its P 1d B to P sat . It is also worth mentioning that the power combiner is the last stage prior to the antenna in the RF-QAM architecture. Therefore, its power loss directly impacts the output power and efficiency. The power combiner contribution on the transmitter performance will be investigated in Section III-C.

C. Power Combining
In the RF-QAM transmitter, the power combining is done after the QPSK PAs, either on-chip electronically or in the air using beamforming. As a major distinction, in the conventional transmitter, only a pair of orthogonal I and Q signals are combined, whereas, in the RF-QAM transmitter, $M$ QPSK signals should be fed to an $M$-to-1 power combiner.
The linearity of the power combiner is crucial, as it produces an amplitude-varying signal (i.e., $4^{M}$-QAM) at its output. Since the power combiner employs a passive network, it exhibits negligible non-linearity. However, power combiners exhibit two other non-idealities with detrimental effects, namely, power loss and imperfect port-to-port isolation. Any power loss associated with the power combiner directly manifests itself as efficiency degradation of the transmitter chain. The output power delivered to the antenna is: where $L_{C}$ is the power combiner loss. According to Eqs. (17) and (38), so long as $L_{C}|_{dB} < \left(P_{sat}|_{dB} - P_{1dB}|_{dB}\right) + PAPR|_{dB}$, which is usually the case, the RF-QAM transmitter outperforms the conventional counterpart in terms of PA output power and efficiency. The output power levels of both transmitters, accounting for PAPR and power-combining loss, are shown in Fig. 9. This figure clearly shows that the output power is higher in the RF-QAM transmitter compared to the conventional architecture.
Finite port-to-port isolation in an $M$-to-1 power combiner results in the signal of port $i$ appearing at port $j$ with attenuated amplitude and a possible phase shift. This, in turn, can indirectly degrade EVM by adversely influencing the optimum load-pull matching requirement of the PAs prior to the power combiner in Fig. 1(b). Furthermore, this phenomenon degrades the EVM by producing phase and amplitude offsets of the QPSK signals, as will be described in this section.
The phasor representation of a QPSK symbol with an amplitude of | − → A i | is shown in Fig. 10 in blue. Assuming that this signal phasor is injected to the i th input port of the power combiner, due to finite port-to-port isolation, it will appear at the input of the j th port with an attenuated amplitude of | − → A i j | and a phase shift of δ i j (the red vector in Fig. 10). Port i to port j isolation seen from port i, I i j , is defined to be the ratio of the residue over the original phasor amplitudes (i.e., The power combiner adds the residue phasor, − → A i j , to the original phasor, − → A i , producing the normalized phasor − → S i j (= − → A i + − → A i j ) at its output (green vector in Fig. 10). Furthermore, the resultant offset amplitude, A i j , is defined as The following parameters are defined where i j shows the percentage of the amplitude change of the i th QPSK phasor at the power combiner output.
The second parameter gives the phase difference between the modified and ideal $i$th QPSK phasors. In the case of good port-to-port isolation (e.g., -15 dB), $I_{ij} \ll 1$, and (39)-(40) simplify to (41)-(42). It is noteworthy that (39)-(42) only capture the impact of the leakage from port $i$ to port $j$ on the $i$th QPSK signal. In a general power combiner with $M$ input ports, each input port $i$ leaks to all other $M-1$ input ports. The combined impact on the amplitude and phase of the $i$th QPSK signal is quantified by first calculating the resultant phasor $\vec{S}_i$ and the corresponding offset amplitude, from which the amplitude and phase mismatch parameters of the $i$th signal are readily calculated. In conclusion, leakage can cause phase and amplitude mismatches between the $i$th and $j$th QPSK signals, degrading EVM. These mismatches are captured by the two parameters of the $i$th signal, which are then used in Eqs. (29), (30), (32) and (33) to calculate the EVM induced by power combiner leakage.
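The phasor addition described above can be sketched numerically. The following Python snippet is only an illustration under stated assumptions: the isolation is interpreted as an amplitude ratio (so that a value in dB maps to $I_{ij}$ through $20\log_{10}$), the original phasor is normalized to unit amplitude, and the first-order expressions $I_{ij}\cos\delta_{ij}$ and $I_{ij}\sin\delta_{ij}$ are derived here from $\vec{S}_{ij} = \vec{A}_i + \vec{A}_{ij}$ for $I_{ij} \ll 1$; they are not claimed to match the form of (39)-(42). The function names are hypothetical.

```python
import cmath
import math

def leakage_offsets(isolation_db, delta_rad):
    """Exact relative amplitude change and phase offset of a unit phasor A_i
    after the leaked replica I_ij * exp(j * delta_ij) is added to it."""
    i_ij = 10.0 ** (isolation_db / 20.0)          # assumed amplitude ratio |A_ij| / |A_i|
    s = 1.0 + i_ij * cmath.exp(1j * delta_rad)    # S_ij normalized to |A_i| = 1
    return abs(s) - 1.0, cmath.phase(s)

def leakage_offsets_small(isolation_db, delta_rad):
    """First-order approximation of the same two quantities for I_ij << 1."""
    i_ij = 10.0 ** (isolation_db / 20.0)
    return i_ij * math.cos(delta_rad), i_ij * math.sin(delta_rad)

for iso_db in (-15.0, -25.0, -40.0):
    exact = leakage_offsets(iso_db, math.pi / 3)
    approx = leakage_offsets_small(iso_db, math.pi / 3)
    print(iso_db, exact, approx)
```

The gap between the exact and first-order values shrinks quadratically with $I_{ij}$, which indicates how good the isolation must be before the simplified expressions can be trusted.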
Various power-combining techniques have been proposed in the literature [32], [33], [34], [35], [36], which can be utilized to combine different QPSK signals in an RF-QAM transmitter. Among these, three main techniques are (1) transformer-based power combining, as shown in Fig. 11(a); (2) transmission-line-based power combining (e.g., the Wilkinson structure shown in Fig. 11(b)); and (3) spatial power combining, which requires multiple antennas and beam-forming, as shown in Fig. 11(c). The choice of one structure over another boils down to a trade-off between the EVM degradation caused by leakage and the efficiency reduction due to loss.

... quadrature LO, to form a $4^{M}$QAM constellation. These $M$ LO signals are often generated from a single core PLL at a lower frequency and are subsequently distributed to the $M$ QPSK modulators using an H-tree distribution network, as shown in Fig. 12(a) [12]. The LO frequency appearing at the output of the network is multiplied by $N$ to produce the desired frequency for each QPSK modulator. The inherent mismatches between different paths within the network degrade the EVM, a degradation that can be largely mitigated with a symmetrical layout. The EVM degradation due to asymmetries is exacerbated when the operating frequency is high (e.g., above 100 GHz). At (sub-)terahertz frequencies, the LO distribution network behaves as a distributed transmission-line (t-line) structure in which the branches at each stage are match-terminated to the same characteristic impedance, $Z_0$. An overall length difference of $\Delta l_{ij}$ between the $i$th and $j$th input-output paths of a 1-to-$M$ H-tree results in a delay of $\Delta t_{ij} = \Delta l_{ij}\sqrt{L_0 C_0}$ ($L_0$ and $C_0$ are the per-unit-length inductance and capacitance). The total phase mismatch between the outputs of the $i$th and $j$th frequency multipliers in the network of Fig. 12(a) is thus given by (47), where $f_c = N \times f_{PLL}$ is the carrier frequency.
It is observed from (47) that the network length mismatch $\Delta l_{ij}$ creates an excess phase mismatch between the $i$th and $j$th QPSK signals, which is linearly dependent on the carrier frequency and $\Delta l_{ij}$. This, in turn, degrades EVM, as quantified by Eqs. (32) and (33). Assuming an on-chip transmission line with $L_0 = 600$ nH/m and $C_0 = 200$ pF/m [37], for a 16QAM transmitter with a length mismatch of $\Delta l$ between the two LO paths, based on Eqs. ... ports of the QPSK modulator, a minimum length mismatch of $k\%$ will define an EVM floor given by (48). If this EVM floor is not satisfactory, phase-mismatch compensation (e.g., the use of explicit phase shifters) should be incorporated into the LO path to calibrate out this mismatch. Two important notes regarding the LO distribution network should be taken into consideration: (i) Accounting for the t-line loss, the LO power and phase noise are improved when the PLL and distribution network operate at $1/N$ of the carrier frequency [38]. The oscillation frequency in this case is boosted to the desired value by frequency multipliers placed next to each modulator circuit [12]. Additionally, the interconnect loss generates thermal noise, which degrades the LO phase noise. However, this degradation is inconsequential since it has been shown to affect only the far-out tail of the phase noise profile [1]. (ii) To eliminate any phase imbalance caused by the LO distribution network in the RF-QAM architecture, a symmetrical layout becomes increasingly critical at (sub-)terahertz carrier frequencies. However, even in the absence of a completely symmetrical layout, phase shifters can be employed in the LO path to compensate for this mismatch.
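The dependence of the LO phase mismatch on length mismatch and carrier frequency can be illustrated with a short Python sketch. It assumes that the phase mismatch takes the form $2\pi f_c\,\Delta l_{ij}\sqrt{L_0 C_0}$, which follows from the stated delay and from the fact that frequency multiplication by $N$ also multiplies the phase by $N$; this is a reading of the text rather than a reproduction of (47). The line parameters are those quoted from [37], the 125 GHz carrier is the operating frequency used in Section IV-A, and the function name is hypothetical.

```python
import math

def lo_phase_mismatch_rad(delta_l_m, f_c_hz, l0_h_per_m=600e-9, c0_f_per_m=200e-12):
    """Phase mismatch (rad) at the carrier for a t-line length mismatch delta_l_m.

    Assumed form: 2 * pi * f_c * delta_l * sqrt(L0 * C0).
    """
    delta_t = delta_l_m * math.sqrt(l0_h_per_m * c0_f_per_m)   # excess delay in seconds
    return 2.0 * math.pi * f_c_hz * delta_t

for delta_l_um in (1.0, 5.0, 10.0, 20.0):
    phi = lo_phase_mismatch_rad(delta_l_um * 1e-6, 125e9)
    print(f"{delta_l_um:5.1f} um -> {math.degrees(phi):.2f} deg")
```

Because the relation is linear in both $\Delta l_{ij}$ and $f_c$, halving the tolerable phase error halves the tolerable length mismatch at a given carrier frequency.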
2) Impedance Mismatch: Another factor contributing to the EVM of the RF-QAM transmitter is the impedance mismatch between corresponding blocks located at each stage of the H-tree network (e.g., Buf$_{31}$-Buf$_{34}$ in Fig. 12(a)). The impedance mismatch between the output branches of Stage $i$ causes a phase mismatch between the output signals appearing at the power-splitting junction of the subsequent stage. This phenomenon is quantified in this section.
The impedance mismatch between the output branches of each stage of the H-tree network is modeled using the lumped circuit in Fig. 12(b), where $V_{s,nm}$ and $Z_{o,nm}$ represent the Thevenin equivalent of the sub-network prior to Buf$_{nm}$. $Z_{L,nm}$ is the input impedance of the subsequent blocks, which could be line buffers, the frequency multiplier, or the QPSK modulator. However, the buffers at each stage of the H-tree as well as the frequency multipliers located at the output branches of the H-tree are identical, and their impedance mismatches are thus negligible. On the other hand, the QPSK modulators, producing amplitudes with a scaling factor of two (cf. Fig. 1(b)), are designed with different bias conditions and transistor sizes. Hence, they are the major contributors to impedance mismatch. Under this assumption, as shown in Fig. 12(c), the Thevenin equivalent of each path prior to each QPSK modulator is identical and is modeled by a voltage source, $V_S$, and an output impedance, $Z_o = R_o + jX_o$. The $i$th path is terminated with the input impedance of its QPSK modulator, denoted by $Z_{L,i} = R_{L,i} + jX_{L,i}$. The voltage at the LO port of the $i$th QPSK modulator is thus given by (49) as $\gamma_i V_S$, where $\gamma_i$ is the complex attenuation factor of the $i$th path, i.e.,

$$\gamma_i = \frac{Z_{L,i}}{Z_o + Z_{L,i}}. \tag{50}$$

Assuming the conjugate matching principle for maximum power transfer, $R_{L,i} = R_o + \Delta R_i$ and $X_{L,i} = -X_o + \Delta X_i$. Here, $\Delta R_i$ and $\Delta X_i$ denote the impedance mismatch. Hence, (50) is re-expressed as

$$\gamma_i = \frac{R_o + \Delta R_i + j(\Delta X_i - X_o)}{2R_o + \Delta R_i + j\Delta X_i}. \tag{51}$$

Focusing on the phase of $\gamma_i$, (51) is simplified to (52). A first-order Taylor-series approximation reduces (52) to (53), which, in the case of purely resistive matching (e.g., 50 Ω), is further simplified to (54). The LO signal's phase at the far-end termination of each path is shifted by the value given by (52). Therefore, any pair of LO paths within the network exhibits a total phase mismatch which can be readily calculated.
These LO phase mismatches appear as phase mismatches at the outputs of the QPSK modulators [27], degrading the EVM as predicted by (32) and (33).
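The phase shift introduced by a mismatched termination can be checked numerically from the Thevenin model described above. The Python sketch below is a minimal illustration: it evaluates the voltage-divider factor $Z_{L,i}/(Z_o + Z_{L,i})$ with a conjugate-matched load perturbed by a resistive and a reactive mismatch, and compares its phase with the first-order value $\Delta X_i/(2R_o)$ that results for purely resistive matching ($X_o = 0$). That first-order expression is derived here and is only assumed to be consistent with the simplified result of the analysis. The mismatch values and function names are hypothetical.

```python
import cmath

def attenuation_factor(r_o, x_o, d_r, d_x):
    """Voltage-divider factor Z_L / (Z_o + Z_L) for a conjugate-matched load
    perturbed by a resistive mismatch d_r and a reactive mismatch d_x (ohms)."""
    z_o = complex(r_o, x_o)
    z_l = complex(r_o + d_r, -x_o + d_x)
    return z_l / (z_o + z_l)

def lo_phase_shift_rad(r_o, x_o, d_r, d_x):
    """Phase of the attenuation factor, i.e. the LO phase shift of one path."""
    return cmath.phase(attenuation_factor(r_o, x_o, d_r, d_x))

def resistive_first_order_phase_rad(r_o, d_x):
    """First-order phase shift for purely resistive matching (x_o = 0)."""
    return d_x / (2.0 * r_o)

# Two 50-ohm paths with hypothetical reactive mismatches of 1 ohm and 3 ohm.
phi_1 = lo_phase_shift_rad(50.0, 0.0, 0.0, 1.0)
phi_2 = lo_phase_shift_rad(50.0, 0.0, 0.0, 3.0)
print("path 1:", phi_1, "approx:", resistive_first_order_phase_rad(50.0, 1.0))
print("path 2:", phi_2, "approx:", resistive_first_order_phase_rad(50.0, 3.0))
print("pairwise mismatch (rad):", phi_2 - phi_1)
```

Only the difference between the phase shifts of two paths matters for EVM, so a mismatch that is common to all modulators does not contribute.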

E. Noise
The EVM of the RF-QAM transmitter was calculated in [1] and is restated in Eq. (27). A viable design approach is based on minimizing the transmitter EVM due to existing impairments. If all QPSK modulators within the RF-QAM transmitter have the same EVM and EVM floor, the EVM of the resulting $4^{M}$QAM signal induced by white Gaussian noise will reach its minimum. Pursuing this approach leads to (55), where $\rho_s$ is the QPSK symbol SNR referenced to a 1 Ω resistance [27]. Comparing (26) and (55) reveals that the proposed RF-QAM transmitter's performance under white Gaussian noise will be superior to that of its conventional counterpart if the QPSK SNR in the former is greater than the average QAM SNR in the latter (i.e., $\rho_s > \rho_{avg}$).
To compare the SNR of both architectures, the noise contribution of each block needs to be considered. The RF-QAM transmitter exhibits three major advantages in terms of noise: (1) The SNR at the input of the QPSK modulators in the RF-QAM transmitter can potentially be higher than that at the input of the I/Q mixers in the conventional counterpart. This is because the signal in the conventional architecture is fed to the mixers by a DAC, which itself contributes to the overall noise. In contrast, the input bits in the RF-QAM transmitter are fed directly to the modulators. (2) Since the DAC output signal must be backed off because of the PAPR of the multi-level PAM signal generated in the baseband, the average signal power at the input of the I/Q mixers is less than the maximum output power, thereby degrading the SNR. However, in the RF-QAM transmitter, the inputs of the modulators are fed by raw data bits whose amplitude can be maximized, thus improving the SNR. Specifically, assuming Gray-coded symbols and a random input bit stream, the average voltage at the DAC output in the conventional architecture is $V_{DD}/2$. However, in the RF-QAM transmitter, it can be increased to $V_{DD}$. Therefore, the SNR ratio for the two architectures is given by (56), where $N_{MX}$, $N_{PA}$, and $N_{PC}$ are the input-referred noise of the mixer, PA, and power combiner, respectively, and $G_{MX}$ and $G_{PA}$ are the mixer and PA power gains, respectively. Assuming that the power gain of the PA is large, Eq. (56) is simplified to (57). It is observed that the SNR in the RF-QAM architecture is at least 6 dB higher than that in the conventional architecture. (3) The input data streams in the RF-QAM architecture are in the form of square-wave signals with sharp transitions. Swapping the LO signal with the input data stream results in an abrupt switching of the differential pair in the modulator, thereby reducing the noise contribution of the differential pair [27].
Based on the aforementioned advantages, the EVM due to white Gaussian noise in the RF-QAM transmitter can be lower than that in the conventional counterpart.
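The 6 dB figure quoted in advantage (2) can be traced to the factor-of-two difference in average drive voltage. The short Python sketch below assumes, for illustration only, that both architectures see the same noise power, so that the SNR ratio reduces to the squared voltage ratio; any additional DAC noise in the conventional path would only widen the gap, which is in line with the "at least" qualifier. The function name is hypothetical.

```python
import math

def snr_gain_db(v_rf_qam, v_conventional):
    """SNR improvement (dB) due to a larger drive amplitude at equal noise power."""
    return 20.0 * math.log10(v_rf_qam / v_conventional)

v_dd = 1.0  # arbitrary supply voltage; only the ratio matters
print(f"SNR advantage: {snr_gain_db(v_dd, v_dd / 2.0):.2f} dB")
```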

IV. SIMULATION RESULTS
Various circuit- and system-level simulations were conducted to verify the accuracy of the developed models and to compare the two transmitter architectures studied in this work.

A. Circuit-Level Simulation
Fig. 14 shows the complete schematic of the RF-QAM transmitter incorporating a 16QAM scheme in a 45 nm CMOS RF-SOI process. In this simulation, the effect of routing parasitics was also accounted for. A high-power and a low-power PA with the same topology were designed for this transmitter. Reducing the bias and supply voltages causes the low-power PA to achieve approximately 6 dB lower saturated output power. For the high-power PA, a two-stage class-AB PA with transformer-based matching [12], shown in Fig. 15, operating at 125 GHz was designed and simulated (involving post-layout extraction and electromagnetic simulation of passive components). Figs. 16(a)-16(b) show the first and second active stages of the stand-alone high-power PA, respectively. The following fitting parameters were extracted to be used in MATLAB as the PA behavioral model: $\alpha_1 = 8.34$, $\beta_1 = 10.47$, $\alpha_2 = 11.18$, and $\beta_2 = 19.67$. The AM/AM and AM/PM characteristics of the simulated PA are shown in Figs. 17(a)-17(b) in terms of the input power, $P_{in}$, referenced to 50 Ω. It is seen that in the low-input-power regime (e.g., ≤ -20 dBm), where the transfer characteristic remains linear, the distortion due to AM/AM and AM/PM conversions is negligible. The simulated PA power gain vs. $P_{in}$ is shown in Fig. 18(a), exhibiting input- and output-referred 1-dB compression points of -9.25 dBm and 8.15 dBm, respectively. Additionally, the saturated output power of the PA was 12.2 dBm at an input power of around 0 dBm. In this scenario, the PA's DC power consumption was simulated to be 211 mW. Therefore, the efficiency and the power-added efficiency (PAE) of this PA were simulated to be 7.5% and 7%, respectively. The frequency response of this PA is shown in Fig. 18(b); as seen in this figure, the 1-dB and 3-dB bandwidths of the simulated PA were 13 GHz and 29 GHz, respectively. Moreover, a 2-to-1 Wilkinson power combiner was designed to combine the output signals of the two PAs.
It is noteworthy that although the power combiner exhibited finite port-to-port isolation, the reciprocity and symmetry of the passive structure prevented EVM degradation due to its leakage.
A double-balanced QPSK modulator was also designed to simultaneously generate and upconvert the QPSK symbols from the input bit streams. The size of each transistor was chosen to be $W/L = 12\,\mu\mathrm{m}/40\,\mathrm{nm}$, and the bias current source for each mixer path was designed to be $I_{SS} = 2$ mA. To test this circuit and obtain the eye diagrams and the output signal's PSD, a pseudo-random bit-stream generator produced 4 bit streams, each at a bit rate of 5 Gbps, for a total transmission rate of 20 Gbps. The first two bit streams are depicted in Fig. 19(a), and the simulated output QPSK signal generated by one of the QPSK modulators is shown in Fig. 19(b). Figs. 20(a)-20(b) show the eye diagrams of the down-converted I and Q PAM signals. Additionally, the spectrum of the transmitter's output 16QAM signal is depicted in Fig. 21.
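The way four parallel bit streams form a 16QAM symbol in this architecture can be illustrated at the symbol level. The following Python sketch sums $M$ QPSK symbols whose amplitudes scale by a factor of two, as described in the abstract, and confirms that the result is a square $4^{M}$-point grid. The bit-to-symbol mapping (bit 0 to $-1$, bit 1 to $+1$) and the function names are assumptions made for illustration and do not necessarily match the coding used in the simulated transmitter.

```python
from itertools import product

def qpsk(b_i, b_q):
    """Map two bits to a QPSK symbol on a unit grid (assumed: 0 -> -1, 1 -> +1)."""
    return complex(2 * b_i - 1, 2 * b_q - 1)

def rf_qam_symbol(bits):
    """Vector sum of M QPSK symbols with amplitudes 2^(M-1), ..., 2, 1."""
    m = len(bits) // 2
    total = 0j
    for k in range(m):
        weight = 2 ** (m - 1 - k)                     # amplitude scaling by two per path
        total += weight * qpsk(bits[2 * k], bits[2 * k + 1])
    return total

# Enumerate all 4-bit inputs: two QPSK paths summed into one 16QAM symbol.
constellation = {rf_qam_symbol(bits) for bits in product((0, 1), repeat=4)}
print(len(constellation), "distinct points")
print(sorted({int(p.real) for p in constellation}), "per-axis levels")
```

Passing six or eight bits to `rf_qam_symbol` yields the 64QAM and 256QAM grids in the same way, since each added QPSK path doubles the number of levels per axis.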

B. System-Level Simulation
The conventional and RF-QAM transmitters incorporating different QAM schemes were simulated in MATLAB using behavioral models of the blocks shown in Figs. 1(a)-1(b). To isolate the impact of each individual block on the performance of the chain in these architectures, the non-idealities of a single block at a time were taken into account while the other blocks were considered ideal. It should be noted that these simulations include a pseudo-random bit-stream generator producing 4 bit streams, each at a bit rate of 10 Gbps. Furthermore, the WGN-induced EVM in both architectures was also captured.
1) PA Non-Linearity: The designed PA of Fig. 15 was used in the conventional transmitter of Fig. 1(a). The 16QAM waveform at the output of the PA is shown in Figs. 22(a) and 22(b) for a low input amplitude (i.e., $a_u = 10$ mV) and a high input amplitude (i.e., $a_u = 100$ mV), respectively. It is observed that when the input amplitude of the PA grows, different symbols at the output become indistinguishable due to the PA's nonlinear characteristic. Additionally, Fig. 23 plots the EVM based on the developed analysis (i.e., Eq. (10)) and MATLAB simulations for three different $4^{M}$QAM schemes, namely, 16QAM, 64QAM, and 256QAM. A comparison between the EVM plots derived from Eq. (10) (solid lines) and those obtained from MATLAB simulations (black asterisks) confirms that the two follow the same trend. Additionally, a test bench for the RF-QAM transmitter was developed in MATLAB using the behavioral models of two PAs. The first PA had the following fitting parameters: $\alpha_1 = 8.34$, $\beta_1 = 10.47$, $\alpha_2 = 11.18$, and $\beta_2 = 19.67$. As for the second PA, the fitting parameters were $\alpha_1 = 10.17$, $\beta_1 = 6.04$, $\alpha_2 = 13.26$, and $\beta_2 = 17.12$. For a large $a_u$ (i.e., $a_u = 100$ mV), the output 16QAM waveform is shown in Fig. 24. Comparing Figs. 22 and 24 shows that, unlike in the conventional architecture, the multi-level output QAM signal in the RF-QAM transmitter is not distorted.
2) DAC Non-Linearity: Behavioral models of four DACs with four resolutions (i.e., 3-6 bits) were developed in MATLAB for use in the conventional transmitter. These DACs exhibited various levels of INL ranging from 0.1 LSB to 0.36 LSB. The output voltages of these DACs were then fed to ideal I/Q mixers driven by a quadrature LO signal and combined afterwards to generate a 64QAM signal. Subsequently, this signal was amplified by a linear PA. Fig. 25 plots the EVM with respect to the DAC INL for a 64QAM scheme under these four DAC resolutions. This figure also includes MATLAB system-level simulation results, indicated by black asterisks. The simulation results clearly verify the accuracy of the developed model.
3) Noise Contribution: A WGN source was added to the signal prior to the PA in Fig. 1(a) and prior to each QPSK PA in Fig. 1(b). Fig. 26(a) shows the simulation results as black asterisks as well as the EVM degradation due to limited system SNR as the blue solid line for these two architectures. It is noteworthy that the SNR in this simulation is referenced to a 1 Ω resistance. From Fig. 26(a), it is observed that the two architectures have a common value for EVM as long as they have the same SNR.

Fig. 26. Theory-based and simulated EVM due to (a) Gaussian noise, and (b) lack of perfect port-to-port isolation of the power combiner.

4) Power Combiner: A behavioral model of an RF-QAM transmitter incorporating a 16QAM scheme was simulated in MATLAB to investigate the effect of the power combiner leakage on EVM. For the sake of simplicity, the simulated power combiner was designed to have a uni-directional leakage from port 1 to port 2 without introducing any phase shift to the leaked signal. Fig. 26(b) shows the EVM due to the finite port-to-port isolation of the power combiner obtained from Eqs. (29) and (45) as a red solid line, along with the simulated results for the TX as black asterisks.
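The noise experiment in 3) can be mimicked with a small symbol-level Monte Carlo sketch. The Python snippet below is not the MATLAB test bench used in this work: it adds complex white Gaussian noise to ideal 16QAM symbols and compares the measured RMS EVM, normalized to the average symbol power, with the widely used relation $\mathrm{EVM} \approx 1/\sqrt{\mathrm{SNR}}$. That relation is an assumption here and may differ in normalization from Eqs. (26) and (55). All function names are hypothetical.

```python
import math
import random

def qam16_symbols(n, rng):
    """Draw n random 16QAM symbols from the grid {-3, -1, 1, 3}^2."""
    levels = (-3, -1, 1, 3)
    return [complex(rng.choice(levels), rng.choice(levels)) for _ in range(n)]

def add_wgn(symbols, snr_db, rng):
    """Add complex white Gaussian noise for a target SNR relative to average symbol power."""
    p_sig = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    sigma = math.sqrt(p_sig / 10.0 ** (snr_db / 10.0) / 2.0)   # per-dimension std deviation
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for s in symbols]

def evm_rms(ideal, received):
    """RMS EVM normalized to the average power of the ideal symbols."""
    err = sum(abs(r - s) ** 2 for s, r in zip(ideal, received))
    ref = sum(abs(s) ** 2 for s in ideal)
    return math.sqrt(err / ref)

rng = random.Random(0)            # fixed seed for repeatability
tx = qam16_symbols(50_000, rng)
for snr_db in (15, 20, 25, 30):
    rx = add_wgn(tx, snr_db, rng)
    print(snr_db, "dB: simulated", evm_rms(tx, rx), "vs assumed", 10.0 ** (-snr_db / 20.0))
```

Since the measured EVM depends only on the ratio of noise power to average signal power, any two transmitters operating at the same SNR yield the same value in this sketch, regardless of how the 16QAM symbols were formed.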
V. CONCLUSION
This paper presented a comprehensive analysis of the RF-QAM transmitter. A comparative study of the RF-QAM and conventional architectures was undertaken, accounting for key performance-determining factors such as PA linearity and efficiency, DAC power consumption and linearity, impedance mismatch in the LO distribution network, and noise. Table II provides an estimate of the power consumption of each block in the two architectures operating in the mm-wave frequency range. Various system-level simulations were conducted to verify the analytical studies developed in this paper. The results of the comparative study are summarized in Table III.

APPENDIX A
The details and mathematical derivations of the conventional $4^{M}$QAM transmitter's EVM are presented in this section. The average rotation angle of the constellation diagram's symbols is calculated to be