Multilevel Outphasing With Over-the-Air Combining in Large Antenna Arrays

This article investigates the feasibility of a combinerless multilevel outphasing transmitter as a potential architecture for large millimeter-wave (mmWave) phased arrays. We consider two distinct ways of distributing the component signals to the antennas and develop a model for the received signal at each radiated spatial direction from a phased array. Based on the received signal model, we derive expressions for the signal-to-distortion ratio as well as the total power experienced at each spatial direction. Furthermore, antenna branch mismatches, overload distortion, and quantization are considered, and an analytical model for the signal-to-distortion ratio at the intended receiver is derived. We additionally establish a model for comparing the achievable energy efficiency to those of the relevant reference methods. Extensive numerical experiments are carried out to verify the analytical work, and to assess the commonly used metrics of error vector magnitude (EVM) and total radiated power adjacent channel leakage ratio (TRP-ACLR). It is shown that the combinerless architecture is a valid option for mmWave phased arrays, demonstrating favorable EVM results and TRP-ACLR beyond the 28 dBc limit imposed by the 3GPP, even in the presence of the considered distortions. The conducted energy efficiency assessment shows that the efficiency of the reference methods can be exceeded with a sufficient number of outphasing levels. The considered architecture is thus an interesting alternative for addressing the linearity vs. energy-efficiency challenge in mmWave phased-array systems.

It is estimated that the adoption of 5G networks will triple the number of base stations (BSs) deployed compared to the 4G/LTE networks, with the total number of BSs estimated to reach 65 million worldwide by the year 2025 [1]. In addition, due to the widespread adoption of multiple-input multiple-output (MIMO) techniques, the BSs will then employ large-scale antenna arrays, with antenna counts in the order of hundreds or even thousands [2], [3]. This is especially true at the millimeter wave (mmWave) range of frequencies, where, due to increased path losses, higher antenna gains are required [2], [4]. Therefore, it is imperative that the individual transmitter elements in the BSs operate as efficiently as possible. Moreover, as the volume of the devices increases, cheaper components suffering from nonidealities are used in the production [5]. Naturally, the energy efficiency and nonideality issues in large-scale antenna arrays need to be dealt with, without compromising the coverage and quality of service.

A. Background and State-of-the-Art
In order to maximize the power efficiency of antenna arrays, individual power amplifiers (PAs) are driven close to saturation. A widely used transmitter architecture is the Cartesian one, which induces considerable nonlinear distortion in the PA, especially in the presence of signals with a high peak-to-average power ratio (PAPR), such as the widely used orthogonal frequency-division multiplexing (OFDM) waveform. A classical approach to diminish the effects of the nonlinear distortion is to utilize sophisticated digital predistortion (DPD) techniques [6], [7], [8], which aim to linearize the PA output by applying the inverse of the nonlinear response of the PA to the signal before passing it through the PA stage. The DPD techniques are studied extensively in scientific literature, also in the large antenna array and MIMO contexts, see, e.g., [7], [8].
An alternative way to circumvent the issue of the nonlinear distortion is to utilize constant envelope (CE) waveforms, which have a PAPR of 0 dB and do not induce intermodulation distortions around the carrier frequency, even in PAs driven in deep saturation [9]. The CE signals are appealing in MIMO systems, as they require simpler, cheaper hardware [10]. Additionally, the utilization of CE signals facilitates the use of highly efficient PA classes, such as switch-mode PAs (e.g., class-D, class-E, class-F), where the transistors are used as switches, as opposed to amplifiers, as is the case in traditional PAs (e.g., class-B, class-AB) [11].
One particular example of the utilization of CE signals is to perform CE precoding [12], [13], [14], [15]. In CE precoding, the transmit signals are precoded on symbol level, such that the signal transmitted by each antenna unit is CE. Although enabling the use of power-efficient PAs, these precoders suffer from high computational complexity due to their nonlinear, signal-dependent nature. Additionally, CE precoding has an inherent beamforming gain loss compared to non-CE precoders, as a considerable amount of the available power is transmitted to the channel null-space [12], thus deteriorating the link-level efficiency.
The full potential of the CE signals can also be exploited with so-called digital transmitters, such as the digital polar and digital outphasing (OP) architectures, which utilize phase-modulated signals in combination with switching PA structures. Although these architectures in their traditional forms have been known for a long time (see, e.g., [16] and [17]), only fairly recently have they attracted attention in the scientific community, due to the development of nanoscale complementary metal oxide semiconductor (CMOS) technology, which has made them feasible for digital-intensive implementations. Particularly the OP transmitter has been drawing attention in recent years. The basic principle of such an OP transmitter is to divide the transmit signal into two CE component signals, which are amplified separately and then combined to produce an amplified version of the original transmit signal [17], [18], [19], [20]. The feasibility of the OP architecture has been demonstrated with CMOS implementations, for example, in [21], [22], [23], and [24], which showcase the linearity and general feasibility of the structure. The drawback of OP is that it suffers from low efficiency when the instantaneous amplitude of the transmit signal is low [25], which, by definition, occurs frequently with high-PAPR signals.
Many variations of the OP transmitter have been introduced, especially to tackle the efficiency issue. One promising solution is the so-called multilevel outphasing (ML-OP) transmitter, where instead of a single amplitude level, the component signals can have an arbitrary number of discrete amplitude levels, which increases the combiner efficiency by limiting the outphasing angle [26], [27], [28]. In [29], the improvement of the signal quality and out-of-band (OOB) emissions under branch mismatches with an increased number of amplitude levels was evidenced. The works in [30] and [31] studied the optimization of the amplitude levels using signal statistics, showing substantial improvement over standard OP. An asymmetric ML-OP structure, where the component signals can have different amplitudes, was introduced to improve efficiency in [32] and [33]. Similar to the multilevel approach, a multi-mode architecture was proposed, where the outphasing decomposition is only carried out for certain signal amplitudes [34], [35]. Moreover, the polar-LINC structure [25] combines the polar and regular OP architectures, improving the back-off efficiency. An ML-OP variant with three component signals, termed tri-phasing, was proposed and implemented in [36] and [37], to improve the linearity by addressing issues such as pulse swallowing and harmonic spreading. Various OP and ML-OP architectures have been successfully implemented, e.g., in [25], [33], [37], [38], [39], and [40].
The overall efficiency of the OP transmitters is hindered by the power combiner losses when combining the component signals [41], [42], [43], [44], [45], [46], [47]. This issue can be bypassed, and improved efficiency achieved, by transmitting the component signals separately and letting them combine over-the-air (OTA). Additionally, the removal of the typically very frequency-selective combiner enables wider frequency operation for the transmitter. This combinerless OP structure has already seen some consideration in the literature. The work in [41] used the Alamouti code to improve the link and to allow higher antenna separation with an OP transmitter in a combinerless two-antenna configuration. In [42], the OP transmitter was used in a multiantenna configuration, the effects of nonidealities were considered, and a functioning prototype was implemented. Similarly, in [43], a multiantenna configuration was considered, this time also taking into account the possibility to use ML-OP. Additionally, effects of the branch mismatches on the received signal quality were considered, and the benefit of multiple amplitude levels was evidenced in terms of OOB emissions. The work in [44] proposed a zero-forcing equalizer to take into account differences in the OP component signal propagation channels. An encoder and a list Viterbi algorithm were proposed in [45], to achieve maximum likelihood detection in a combinerless OP system. The work in [46] leverages the combinerless OP structure to correct the gain and phase mismatches on the receiver side. In [47], a large antenna array with an OP transmitter was implemented, and the total power beampatterns were shown. Recently, [48] showcased the combinerless OP structure in simulations and a prototype in large antenna arrays, along with sequential (termed Doherty in [48]) and quadrature combining, highlighting the total power beampattern and considering the negative effects of the mismatches.

B. Contributions
While some research on the combinerless OP exists, as reviewed above, a comprehensive analysis of its characteristics and feasibility in the large array context, in terms of transmitted signal quality and integrity in the spatial domain, is missing. In this article, building on our early work in [49] on single-level OP, we aim to fill this gap. Unlike prior art, this work analytically considers the spatial domain, i.e., emissions not just towards the intended user but to all spatial directions. We also consider the effects of the branch mismatches, clipping, and quantization on signal quality in the ML-OP case, both at the intended user and in other spatial directions. Furthermore, the energy efficiency, in terms of the supply power required to meet the wanted signal quality at the receiver with the ML-OP structure, is analyzed and compared against those of a Cartesian and a sequential transmitter. Overall, we showcase how the combinerless ML-OP structure performs in large antenna arrays with varying numbers of antennas and amplitude levels. The main contributions and novelties are as follows:
• We derive an analytical model for the total radiated power signal-to-distortion ratio (TRP-SDR), which quantifies the useful signal to distortion across the spatial domain, and the total power beampatterns in the two beamforming cases, and the models are then used in analyzing the signal quality.
• We develop a model taking into account branch mismatches, overload, and quantization distortions for the signal quality experienced at the intended user as signal-to-distortion ratio (SDR). Extensive simulations are carried out to verify the accuracy of the analytical work.
• Further, we extensively simulate the total radiated power adjacent channel leakage ratio (TRP-ACLR) without and with the branch mismatches and quantization, as well as the error vector magnitude (EVM).
• The required DC supply power to meet the wanted spectral efficiency (SE) is analyzed and assessed in the ML-OP scheme and in two reference transmitter methods, namely quantized Cartesian and sequential architectures.
• All in all, this study and the results show that the combinerless ML-OP architecture is a valid alternative for phased array transmitters, when utilizing, e.g., 8 amplitude levels, 6 phase modulator bits, and 32 or more antennas. This configuration also achieves a better energy efficiency than the presented reference methods.
The remainder of this paper is organized as follows. Section II introduces the overall system model, where an ML-OP transmitter feeds an antenna array through an analog beamformer, and two distinct beamforming schemes are introduced. In Section III, we analyze the received signal at each spatial direction and develop analytical models for total useful signal power versus distortion power, and the total power beampatterns. Section IV analytically examines the effects of distortions stemming from antenna branch mismatches, clipping, and quantization of the transmit signal on the received signal quality, specifically at the intended user direction. In Section V, the power efficiency of the combinerless ML-OP scheme is assessed and compared to that of the quantized Cartesian and sequential transmitter architectures, under the assumption of class-B PAs. The aforementioned analytical results are verified and the system is further analyzed via numerical simulations in Section VI. Finally, Section VII concludes the paper.
Mathematical Notation: In this paper, standard complex baseband signal modeling is adopted. The imaginary unit is denoted as j. The ceiling function is expressed with ⌈•⌉, and the floor function with ⌊•⌋, while (•)* denotes the complex conjugate. E[•] is the statistical expectation operator, while exp(•), erfc(•), and ln(•) denote the exponential function, the complementary error function, and the natural logarithm, respectively. N denotes the Gaussian distribution. Time-dependent variables are denoted using a generic index n in brackets.

II. SYSTEM MODEL
Here, we introduce the overall system model, which will then be used to analyze the performance in the upcoming sections. In our model, we consider a combinerless ML-OP transmitter, which transmits the component outphasing signals separately. The transmitter employs multiple antennas, which allows the signals to be beamformed in a phased array towards the user. The combining of the signals then occurs at the receiving antenna, after the signals have propagated through a wireless channel. In this work, we consider a single user with a single antenna as the receiver. The overall system models are depicted in Fig. 1, where Fig. 1(a) presents a realistic implementation of the system, in which the beamforming is carried out before the amplification. However, since the beamformer only affects the phase of the input signals, which in Fig. 1(a) are the phase-modulated, i.e., CE, signals $p_1[n]$ and $p_2[n]$, we can swap the beamformer and the amplification stage, as shown in Fig. 1(b), and obtain a model equivalent to the one presented in Fig. 1(a). The beamformer takes only two inputs, divides the signals to the antennas, and applies the required phase shifts. We base our modeling on the latter equivalent model in Fig. 1(b), which allows us to use established notation for the OP transmitters.

A. Multilevel Outphasing Transmitter
In the complex baseband signal context, an arbitrary input signal x[n] can be written as

$$x[n] = A[n]\exp(j\phi[n]), \tag{1}$$

where A[n] is the amplitude and ϕ[n] the phase of the signal over time. We assume that A[n] follows the Rayleigh distribution, while ϕ[n] is uniformly distributed, as can be approximated, e.g., in an OFDM system, while the variance of the baseband input signal x[n] is denoted by $\sigma_x^2$. For the purposes of the ML-OP transmitter, the amplitude A[n] needs to be normalized, i.e., max(A[n]) = 1. Cutting the amplitude signal causes overload distortion, which sophisticated OFDM systems may mitigate by employing iterative clipping and filtering methods. In this work, for tractable analysis, we assume a simple clipper, which assigns the value 1 to amplitude levels exceeding it. Simple arithmetic then shows the following relation between the signal variance and the clipping probability $P_{\mathrm{clip}}$, under the assumption of Rayleigh distributed amplitude:

$$\sigma_x^2 = -\frac{1}{\ln(P_{\mathrm{clip}})}. \tag{2}$$

The clipping is carried out prior to the transmitter, as is depicted in Fig. 1, and the clipped signal x[n] is defined as where o[n] is the overload signal, limiting the envelope of x[n] to 1.
Therefore, o[n] can be written as The idea behind OP is to divide the input signal x[n] into two CE signals, which are amplified separately. In the ML-OP structure, the amplitude of the component signals can be within a predetermined set of levels, for improved energy efficiency and accuracy [26], [27]. Let us define $N_A$ as the number of amplitude levels in the OP system, and let the difference between levels be uniform. The ML-OP signal baseband equivalent can then be written as [26], [27], [36] where are the ML-OP component signal baseband equivalents after the amplification stage, and are generated in a phase modulator (PM) using the OP angle signals Φ[n], which are produced by the signal component separator (SCS), and are defined as Developing (6) using (7), we can identify an additive model for the ML-OP component signals where Further developing, the ML-OP component signals can be written as where the term o Fig. 2 illustrates the concept of the ML-OP as a vector diagram with one and two amplitude levels, i.e., $N_A = 1$ and $N_A = 2$, where the former corresponds to the regular, single-level OP structure. As can be seen from Fig. 2 and from (9), the phasor of ẽ[n] is always perpendicular to the phasor of x[n], and the amplitude is set such that the component signal amplitude is within the given set of amplitudes.
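The clipping relation and the outphasing decomposition described above can be illustrated numerically. The following Python sketch is not the article's implementation. It assumes a Rayleigh amplitude with E[A²] = σ_x², and it assumes one common ML-OP formulation: the component amplitude is the smallest of N_A uniform levels k/N_A that is not below the clipped amplitude, the outphasing angle is arccos(A/level), and the clipped signal is recovered as the average of the two components. The function names and the 1/2 scaling are choices made for this sketch only.

```python
import cmath
import math


def sigma2_from_pclip(p_clip: float) -> float:
    """Variance for which a Rayleigh amplitude exceeds 1 with probability p_clip."""
    # P(A > 1) = exp(-1 / sigma2)  =>  sigma2 = -1 / ln(p_clip)
    return -1.0 / math.log(p_clip)


def clip(x: complex) -> complex:
    """Simple clipper: limit the envelope to 1 and keep the phase."""
    a = abs(x)
    return x if a <= 1.0 else x / a


def mlop_decompose(x: complex, n_levels: int) -> tuple[complex, complex]:
    """Split a clipped sample into two equal-amplitude components (assumed form)."""
    a, phi = abs(x), cmath.phase(x)
    # Smallest discrete level k / n_levels that is >= a (never below the lowest level).
    level = max(1, math.ceil(a * n_levels - 1e-12)) / n_levels
    angle = math.acos(min(1.0, a / level))  # outphasing angle
    s1 = level * cmath.exp(1j * (phi + angle))
    s2 = level * cmath.exp(1j * (phi - angle))
    return s1, s2


if __name__ == "__main__":
    sigma2 = sigma2_from_pclip(1e-3)
    print(f"sigma_x^2 = {sigma2:.4f}, PAPR = {10 * math.log10(1 / sigma2):.2f} dB")
    x = clip(0.37 * cmath.exp(1j * 0.8))
    s1, s2 = mlop_decompose(x, n_levels=4)
    e = (s1 - s2) / 2  # residual component, perpendicular to x in this formulation
    print("reconstruction error:", abs((s1 + s2) / 2 - x))
    print("Re{x * conj(e)}:", (x * e.conjugate()).real)
```

For a clipping probability of 0.1%, the first function reproduces the variance of about 0.1448 used in Section VI. Under the stated assumptions, the two components always share one of the N_A discrete amplitudes, and their difference is perpendicular to the clipped sample, in line with the vector diagram description.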
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

B. Beamforming Schemes
In this paper, we consider a uniform linear array (ULA) type antenna layout, which employs a total of M antennas, where M is even, so that each component signal is transmitted by an equal number of antennas. Under the ULA phased array scheme, the signals may be beamformed with a traditional phase-shifting beamformer, which steers the beam towards the intended user angle θ, measured from the array normal. The applied phase shift ψ for the signal in antenna indexed m is then given as where k is the antenna separation, λ the wavelength and ρ = k/λ the ratio between antenna separation and wavelength.
The ML-OP component signals $S_1[n]$ and $S_2[n]$ are divided evenly among the transmitting antennas; therefore, the transmit signal at antenna m can be written as where depending on the adopted beamforming scheme. In this paper, we consider two distinct beamforming schemes: alternating and block, similar to [42] and our early work in [49]. In the former, the component signals alternate in the antennas, such that even-indexed antennas transmit $S_1[n]$ and odd-indexed antennas transmit $S_2[n]$, while in the latter the signals are grouped in equally sized blocks, such that the first M/2 antennas transmit $S_1[n]$ and the latter M/2 antennas transmit $S_2[n]$. The beamforming schemes are shown in Fig. 3. In practical terms, the block scheme is simpler to implement, since the alternating scheme requires crossing signal paths, which complicates the circuit design. Performance-wise, we will see in Section VI that the alternating scheme is superior; however, due to its simple nature, the block scheme is considered in this paper for reference.
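The two distribution schemes and the steering phase can be sketched as follows. This Python sketch is an illustration only: the zero-based antenna index m = 0, ..., M−1 and the sign of the steering phase are assumptions of the sketch, and the function names are hypothetical.

```python
import math


def steering_phase(m: int, rho: float, theta_deg: float) -> float:
    """Phase shift (radians) at antenna m for steering towards theta (assumed sign)."""
    return -2.0 * math.pi * rho * m * math.sin(math.radians(theta_deg))


def component_index(m: int, num_antennas: int, scheme: str) -> int:
    """Return 1 or 2: which component signal antenna m transmits."""
    if num_antennas % 2:
        raise ValueError("the number of antennas must be even")
    if scheme == "alternating":
        return 1 if m % 2 == 0 else 2  # even index -> S1, odd index -> S2
    if scheme == "block":
        return 1 if m < num_antennas // 2 else 2  # first half -> S1, second half -> S2
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    for scheme in ("alternating", "block"):
        print(scheme, [component_index(m, 8, scheme) for m in range(8)])
```

In both schemes exactly half of the antennas carry each component signal; only the spatial ordering differs, which is what later separates the two beampatterns.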

C. Line-of-Sight Channel and Receiver Combining
The transmit signals propagate through line-of-sight (LOS) channels to observation receivers, located at various angles around the transmitter. We assume that the channels do not impose multipath effects on the transmit signals. This is a fair approximation in 5G NR FR2 systems, which operate at mmWave frequencies, where the channels are heavily dominated by the LOS component, due to the high reflection and scattering losses at these frequencies. Under the ULA scheme, the channel from antenna indexed m towards angle θ′ then merely phase shifts the signal transmitted from the antenna, according to The observed signal r[n, θ′] at an observation receiver at the angle under investigation θ′ can then be defined as the sum of the signals after their respective channel propagation, given by Independent of how the component signals are distributed to the antennas, half of the antennas transmit each component signal, and therefore the total signal at the intended receiver angle θ is given as which indicates that however the component signals are distributed to the transmitting antennas, they combine perfectly at the intended receiver. For the sake of simplicity, we omit the effects of noise from the analysis.
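The over-the-air combining can be checked with a few lines of Python. This is a sketch under stated assumptions: antennas are indexed from zero, and the beamformer and LOS channel phases are merged into the single progressive phase m·2πρ(sin θ′ − sin θ). The function name is hypothetical.

```python
import cmath
import math


def observed_signal(s1: complex, s2: complex, num_antennas: int, rho: float,
                    theta_deg: float, obs_deg: float, scheme: str) -> complex:
    """Noise-free sample observed at obs_deg when the beam is steered to theta_deg."""
    beta = 2.0 * math.pi * rho * (
        math.sin(math.radians(obs_deg)) - math.sin(math.radians(theta_deg))
    )
    total = 0j
    for m in range(num_antennas):
        if scheme == "alternating":
            s = s1 if m % 2 == 0 else s2
        else:  # block scheme
            s = s1 if m < num_antennas // 2 else s2
        total += s * cmath.exp(1j * m * beta)  # steering phase plus LOS channel phase
    return total


if __name__ == "__main__":
    s1, s2 = 0.5 + 0.2j, 0.5 - 0.2j
    for scheme in ("alternating", "block"):
        at_user = observed_signal(s1, s2, 8, 0.25, -27.0, -27.0, scheme)
        off_user = observed_signal(s1, s2, 8, 0.25, -27.0, 10.0, scheme)
        print(scheme, at_user, off_user)
```

At the user angle both schemes give (M/2)(S1 + S2), while away from it the results differ, because the two component signals then see different partial array sums.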

III. DISTORTION AND BEAMPATTERN ANALYSIS
The radiation patterns of the proposed system are analyzed here, considering both the ratio between useful signal power and distortion, and the total observed power. The observed signal in the ML-OP alternating scheme at angle θ′, after (14), is given as where the identity is utilized and $\beta_{\theta'} = 2\pi\rho(\sin(\theta') - \sin(\theta))$ is defined for simplicity. Meanwhile, the observed signal under the block scheme at angle θ′ can be written as We can identify a useful signal part u[n, θ′] from the observed signals in both beamforming schemes in (16) and (17). This useful signal part for both cases at angle θ′ is given as which is also the observed signal at each angle when an ideal transmitter, without the clipper, is utilized, as can be seen from (14) by substituting $y_m[n] = x[n]$. At the user angle we have u[n, θ] = M x[n], as per (15). By subtracting the useful signal from the observed signals in both cases, we can identify the total distortion signals d[n, θ′], which for the alternating scheme at angle θ′ is defined as and for the block scheme at angle θ′ as The average useful signal power at each angle is then given as while at the intended angle it equals $M^2\sigma_x^2$. Likewise, we can define the expected distortion power at each angle for the alternating scheme as where and We can employ L'Hôpital's rule to see that at the intended angle we have $P_{d,\mathrm{alt}}[\theta] = M^2\sigma_{\mathrm{OL}}^2$. The expected distortion power of the block scheme at each angle is given as which at the intended angle also gives us $P_{d,\mathrm{bl}}[\theta] = M^2\sigma_e^2$. We can now define a metric, which considers the ratio between the useful signal power and the distortion power at each angle. We call this metric the total radiated power signal-to-distortion ratio (TRP-SDR), and it is defined as where $P_d[\theta']$ is the distortion power from either the alternating scheme shown in (22) or from the block scheme shown in (25). Lastly, we define the total power beampattern P[θ′] as the expected power of the total observed signal at each angle under investigation θ′, i.e.,

$$P[\theta'] = \mathrm{E}\left[\left|r[n,\theta']\right|^2\right], \tag{27}$$

where r[n, θ′] is the observed signal over time, at angle θ′, as described in (14). Following the alternating scheme's observed signal definition from (16), the total power beampattern in the alternating scheme is given as where $f_k(z)$ includes a second-order Taylor series approximation at z The closed form of the integral term in (28) is omitted for brevity; however, it is straightforward to find numerical values for it. To find the beampattern value at the intended angle, we can leverage the result from (15), since the distortion cancels out at the intended user angle. The total power beampattern in the block scheme can be determined as As was the case in the previous derivations, the beampattern in this case is also not defined at exactly the intended user angle. However, we can rely on the result from (15) also in this case.
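The role of L'Hôpital's rule at the intended angle can be made concrete with the array factor of the useful signal. The Python sketch below assumes that the useful part is the same sample sent from all M antennas with progressive phase m·β (zero-based index), so that its power pattern is σ_x² times the squared magnitude of a geometric sum. The function names are hypothetical.

```python
import cmath
import math


def array_factor_power(num_antennas: int, beta: float) -> float:
    """Closed form of |sum_{m=0}^{M-1} exp(j*m*beta)|^2."""
    half = beta / 2.0
    if abs(math.sin(half)) < 1e-9:
        # 0/0 form at the steering direction; the limit from L'Hopital's rule is M^2.
        return float(num_antennas ** 2)
    return (math.sin(num_antennas * half) / math.sin(half)) ** 2


def direct_sum_power(num_antennas: int, beta: float) -> float:
    """Same quantity evaluated by summing the phasors directly."""
    return abs(sum(cmath.exp(1j * m * beta) for m in range(num_antennas))) ** 2


if __name__ == "__main__":
    for beta in (0.0, 1e-6, 0.1, 0.7):
        print(beta, array_factor_power(32, beta), direct_sum_power(32, beta))
```

The closed form is undefined at β = 0 as written, which is why the expressions at the intended angle are obtained as limits; the limit M² matches the useful power M²σ_x² stated above.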

IV. ANTENNA BRANCH MISMATCH AND QUANTIZATION DISTORTION
So far, we have assumed that the only source of distortion in the ML-OP transmitter is the clipper. However, such real-life systems exhibit various types of nonidealities, such as amplitude and phase mismatches in the component signal propagation paths [18], [50], [51], and timing delays between the phase and amplitude paths [37], which cause distortion to the received signal. In this section, we focus on assessing the effects of the mismatches in the transmitting antenna branches in large antenna arrays on the received signal quality. Furthermore, we also consider the effect of quantization on the signal quality, which only affects the phases of the component signals $S_1[n]$ and $S_2[n]$, since the amplitude quantization is inherent in the ML-OP architecture and does not affect linearity. Under these nonidealities, we develop an analytical model for the signal-to-distortion ratio (SDR) for the received signal at the intended user, which will be used as a figure of merit for the system in Section VI.
Here, we consider the quantization of the component signal phases in the PM. As mentioned before, the amplitudes of the component signals are already drawn from a discrete set, therefore we can write the quantized component signals as where is the phase error induced by the quantization. Considering a typical mid-rise quantizer with $B_p$ phase bits, i.e., $L_p = 2^{B_p}$ different phase levels, the phase errors can be written as which is difficult to analyze due to the presence of the floor function ⌊•⌋. Therefore, for analytical purposes, we will assume that the difference of the phase errors, i.e., $\Delta_q$ and has empirically been found to have the following approximate probability density function (PDF) $h(\Delta_q[n])$: The above approximation holds well when $B_p > 3$. Fig. 4 demonstrates histograms of the difference of the phase errors $\Delta_q[n]$ using a randomly modulated OFDM signal with $10^6$ samples, also showing the approximate PDF of (33), evidencing a good match.
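A mid-rise phase quantizer of the kind referred to above can be sketched in Python as follows. The exact expression used in the article is not reproduced here; the sketch assumes the textbook mid-rise rule with step 2π/L_p and reconstruction points at the centers of the quantization cells. The function names are hypothetical.

```python
import math


def quantize_phase(phase: float, bits: int) -> float:
    """Mid-rise quantization of a phase (radians) with 2**bits uniform levels."""
    step = 2.0 * math.pi / (2 ** bits)
    # The floor picks the cell, and the half-step offset moves to the cell center.
    return (math.floor(phase / step) + 0.5) * step


def phase_error(phase: float, bits: int) -> float:
    """Quantization-induced phase error for a single sample."""
    return quantize_phase(phase, bits) - phase


if __name__ == "__main__":
    bits = 6
    step = 2.0 * math.pi / (2 ** bits)
    worst = max(abs(phase_error(k * 0.001, bits)) for k in range(6283))
    print(f"step = {step:.5f} rad, largest observed error = {worst:.5f} rad")
```

Each individual error is bounded by half a quantization step, so the error shrinks as the number of phase bits grows. The floor operation is what makes the error a discontinuous function of the phase, which is the analytical difficulty mentioned in the text.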
Let us then assume that there is random phase and gain mismatch in each of the antenna branches. Let us further assume that these mismatches are Gaussian distributed for analytical tractability, with the phase mismatch having zero mean and the gain mismatch having a mean of 1. The variances of the mismatches are denoted as $\sigma_\delta^2$ and $\sigma_g^2$, for the phase and gain, respectively. Therefore, we can write the transmit signal $\hat{y}_m^q[n]$ at antenna branch m under the mismatches, overload distortion, and quantization, following (12), as where $g_m \sim \mathcal{N}(1, \sigma_g^2)$ and $\delta_m \sim \mathcal{N}(0, \sigma_\delta^2)$. The received signal at the intended user angle $r^q[n, \theta]$ under the antenna branch mismatches, clipping, and quantization, independent of the signaling scheme, can be written as where $\gamma_m = g_m \exp(j\delta_m)$ and $S_{1,2}^q[n]$ is chosen to be either $S_1^q[n]$ or $S_2^q[n]$ for each index m depending on the beamforming scheme. However, since the mismatches $g_m$ and $\delta_m$ are random variables, from the distortion's point of view, it makes no difference which scheme is adopted as long as there are
where $\mathcal{M}_1$ and $\mathcal{M}_2$ are sets both containing exactly half of the antenna indices, such that $\mathcal{M}_1 \cap \mathcal{M}_2 = \emptyset$. From (36) we can identify three parts which are summed: the quantized useful signal part $u^q[n]$, the overload distortion part $\omega^q[n]$, and the mismatch part $\mu^q[n]$, the latter two of which constitute the total quantized distortion signal $d^q[n] = \omega^q[n] + \mu^q[n]$ at the receiver. Let us then define the SDR as the ratio of the average useful signal power, i.e., $\mathrm{E}[|u^q[n]|^2]$, to the average total distortion signal power, i.e., $\mathrm{E}[|d^q[n]|^2]$, stemming from the overload distortion, mismatches, and quantization. We can then write the average useful signal power as shown in (38). Similarly, the distortion signal average power can be written as shown in (39), which can then be further rewritten as in (40), where $\sigma_{\mathrm{OL}}^2$ and $\sigma_e^2$ are defined in (23) and (24), respectively. Combining Equations (38), (40), (23), and (24), we can finally write the SDR as It is worth noting that when $L_p \to \infty$, we have indicating that the effects of the quantization error disappear in (41) with infinite phase resolution, as expected.
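The way branch mismatches let the outphasing residual leak into the received signal can be seen from a small Python sketch. It is not the article's SDR model. It assumes the additive form in which the two components equal the clipped sample plus and minus a residual ẽ, it assumes the alternating assignment, and it ignores phase quantization. The standard deviations passed to the generator correspond to the square roots of the variances used in the text, and the function names are hypothetical.

```python
import cmath
import random


def draw_mismatches(num_antennas: int, sigma_g: float, sigma_delta: float,
                    rng: random.Random) -> list[complex]:
    """Per-branch factors gamma_m = g_m * exp(j * delta_m) with Gaussian g_m, delta_m."""
    return [
        rng.gauss(1.0, sigma_g) * cmath.exp(1j * rng.gauss(0.0, sigma_delta))
        for _ in range(num_antennas)
    ]


def combining_coefficients(gammas: list[complex]) -> tuple[complex, complex]:
    """Weights on the clipped signal and on the residual at the user angle."""
    c1 = sum(gammas[0::2])  # branches carrying S1 = x + e
    c2 = sum(gammas[1::2])  # branches carrying S2 = x - e
    return c1 + c2, c1 - c2


if __name__ == "__main__":
    rng = random.Random(0)
    for sigma in (0.0, 0.05):
        c_x, c_e = combining_coefficients(draw_mismatches(32, sigma, sigma, rng))
        print(f"sigma = {sigma}: |c_x| = {abs(c_x):.3f}, |c_e| = {abs(c_e):.3f}")
```

Without mismatches the residual weight is zero and the signal weight equals M, which is the perfect combining stated earlier. With mismatches the residual weight is the difference of two random sums, so it depends only on how many branches carry each component and not on their ordering.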

V. SYSTEM POWER EFFICIENCY MODELING
This section covers a system power efficiency model, which we will use to compare the efficiencies of the reference transmitter schemes and the combinerless ML-OP structure. For a fair comparison, we model the efficiency of the systems by the amount of power required to meet a certain spectral efficiency (SE), by determining the required radiated power from a link budget, and by translating the required radiated power to the required supply power by considering the instantaneous efficiencies of the utilized PAs.
We start by determining and assessing the relevant reference methods. Aside from the traditional Cartesian transmitter architecture, we consider a sequential transmitter [52], [53], where the carrier and peaking signals are transmitted separately. In [48], it was established that the combinerless sequential architecture (termed Doherty spatial combining in [48]) operates more efficiently than the combinerless single-level OP transmitter; however, ML-OP, which substantially improves the outphasing structure's efficiency, was not considered. In order to retain the same beamforming gain in the sequential transmitter as in the Cartesian and ML-OP structures, the clipped input signal x[n] is first multiplied by 2, and then divided into the carrier signal $x_c[n]$ and the peaking signal $x_p[n]$. Let us then determine the total required radiated power to meet a wanted SE at the user, set by the wanted inband signal-to-noise-and-distortion ratio (SNDR), while considering the overloading and quantization noise as sources of distortion. In the Cartesian and sequential architectures, the mid-rise quantizer is used on the real and imaginary parts of the signal separately in the digital-to-analog converter (DAC), the accuracy of which affects the signal quality. The SNDR at the user is then given as where $\sigma_x^2$, $\sigma_d^2$, and $\sigma_n^2$ are the simulated inband useful signal, distortion signal, and noise variances, respectively, and g is the gain of the PA required for the transmitted signals to meet the wanted SNDR. The total distortion signal d[n] at the user can be determined as where $r^q[n, \theta]$ is the total combined signal at the user, including overloading and quantization noise, $c_B$ is the Bussgang coefficient, which can be determined using least-squares fitting, and x[n] the original unclipped, unquantized TX signal. Then, the distortion power $\sigma_d^2$ is the inband power of d[n]. With all the other parameters known via simulation, the required PA gain g can be determined from (44) for each architecture separately.
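The least-squares determination of the Bussgang coefficient and of the residual distortion can be sketched in Python as follows. This is a minimal illustration rather than the article's simulation chain: it works on plain lists of complex samples and does not apply the inband filtering mentioned above. The function name is hypothetical.

```python
def bussgang_split(received: list[complex], reference: list[complex]):
    """Least-squares fit received ~ c_b * reference; return c_b and the residual."""
    num = sum(r * x.conjugate() for r, x in zip(received, reference))
    den = sum(abs(x) ** 2 for x in reference)
    c_b = num / den  # minimizes sum |r - c_b * x|^2 over complex c_b
    distortion = [r - c_b * x for r, x in zip(received, reference)]
    return c_b, distortion


if __name__ == "__main__":
    x = [1 + 1j, -0.5 + 0.2j, 0.3 - 0.9j, -1.2 - 0.4j]
    # Toy nonlinearity standing in for clipping and quantization effects.
    r = [0.9 * v - 0.05 * v * abs(v) ** 2 for v in x]
    c_b, d = bussgang_split(r, x)
    print("c_B =", c_b)
    print("distortion power =", sum(abs(v) ** 2 for v in d) / len(d))
```

By construction, the residual is uncorrelated with the reference over the fitted samples, which is the property that allows it to be treated as a separate distortion term in the SNDR.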
The instantaneous per-antenna TX signal power in the Cartesian architecture is , where $x^q[n]$ contains both the clipping and quantization effects, while the instantaneous per-antenna TX signal power in the ML-OP structure is Finally, the total average required consumed power from the supply to meet the wanted SE per antenna is given as , where $P_{\mathrm{tx}}[n]$ is the instantaneous TX power of the antenna and η[n] is the instantaneous efficiency of the PA. For comparison's sake, we assume that each antenna employs a class-B PA, for which the instantaneous efficiency is given as [34] where $P_{\mathrm{in}}[n]$ is the instantaneous input power, $P_{\max}$ the maximum input power, $A_{\max}$ the maximum input amplitude and $\eta_{\max}$ the maximum efficiency, achieved at For the ML-OP scheme $\eta_B[n] = \eta_{\max}$, since the individual PA units operate on CE signals.
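The translation from transmit power to supply power can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the article: the textbook ideal class-B behavior, in which the instantaneous efficiency scales linearly with the amplitude and peaks at π/4, and an average supply power computed as the mean of instantaneous TX power divided by instantaneous efficiency. The function names are hypothetical.

```python
import math

ETA_MAX_CLASS_B = math.pi / 4  # ideal class-B peak efficiency (textbook value)


def class_b_efficiency(amplitude: float, max_amplitude: float = 1.0) -> float:
    """Idealized instantaneous class-B efficiency, linear in the drive amplitude."""
    return ETA_MAX_CLASS_B * min(amplitude, max_amplitude) / max_amplitude


def average_supply_power(amplitudes: list[float], gain: float = 1.0,
                         constant_envelope: bool = False) -> float:
    """Mean of P_tx[n] / eta[n] over the samples, with P_tx[n] = gain * A[n]^2."""
    total = 0.0
    for a in amplitudes:
        if a <= 0.0:
            continue  # no output power, hence no supply power in this idealized model
        eta = ETA_MAX_CLASS_B if constant_envelope else class_b_efficiency(a)
        total += gain * a * a / eta
    return total / len(amplitudes)


if __name__ == "__main__":
    print(average_supply_power([0.5]))                          # varying-envelope PA
    print(average_supply_power([0.5], constant_envelope=True))  # PA at peak efficiency
```

In this idealized model, a PA delivering a given amplitude at peak efficiency draws less supply power than one backed off to the same amplitude. In the ML-OP case the amplitudes passed in would be those of the component signals, which are at least as large as the combined signal amplitude, so the net benefit depends on how closely the discrete levels track the envelope.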

VI. NUMERICAL RESULTS AND ANALYSIS
In this section, the derived results from Sections III and IV are verified numerically by comparing them to simulated ones. Additionally, the inband and OOB performance of the radiated signals is simulated, both in the ideal case and in the case with mismatched antenna branches and quantization. Further, the energy efficiency is assessed using the models and methodology introduced in Section V. Quantification of the signal quality in the simulations is based on the widely used metrics of EVM and TRP-ACLR, which model the inband and OOB performance, respectively. EVM measures the average Euclidean distance of the measured and equalized symbol from the ideal one, formally defined as where $s_{\mathrm{meas}}[k]$ is the kth received symbol measured at the intended angle θ, s[k] the kth ideal transmit symbol, while K is the total number of transmitted symbols. Meanwhile, TRP-ACLR measures the inband and adjacent channel powers at each spatial direction from the transmitter, both in the azimuth and elevation angles. In this paper, we restrict our analysis to the azimuth angles, and the TRP-ACLR is determined as where $P_{\mathrm{IB}}[\theta']$ is the inband power at angle θ′ and $P_{\mathrm{OOB}}[\theta']$ the higher adjacent channel power at angle θ′. In order to avoid interfering with transmissions in the adjacent channels, 3GPP imposes a limit of 28 dBc for the TRP-ACLR in lower (24.25-33.4 GHz) frequency and 26 dBc in higher (37-52.6 GHz) frequency FR2 systems [54]. Here, we consider the more stringent 28 dBc limit.
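The two figures of merit can be computed along the following lines. This Python sketch uses commonly adopted forms that are assumed here rather than copied from the article: EVM as the root of the error power normalized by the ideal symbol power, in percent, and TRP-ACLR as the ratio of the inband power to the adjacent channel power, each summed over a uniform azimuth grid. The function names are hypothetical.

```python
import math


def evm_percent(measured: list[complex], ideal: list[complex]) -> float:
    """RMS error vector magnitude relative to the ideal symbol power, in percent."""
    err = sum(abs(m - s) ** 2 for m, s in zip(measured, ideal))
    ref = sum(abs(s) ** 2 for s in ideal)
    return 100.0 * math.sqrt(err / ref)


def trp_aclr_db(inband_powers: list[float], adjacent_powers: list[float]) -> float:
    """Ratio of angle-summed inband power to angle-summed adjacent channel power (dB)."""
    return 10.0 * math.log10(sum(inband_powers) / sum(adjacent_powers))


if __name__ == "__main__":
    ideal = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    measured = [s * 1.1 for s in ideal]  # 10 % gain error on every symbol
    print(f"EVM = {evm_percent(measured, ideal):.2f} %")
    print(f"TRP-ACLR = {trp_aclr_db([1000.0, 1000.0], [1.0, 1.0]):.1f} dB")
```

Because the powers are summed over all angles before the ratio is taken, a direction with a poor local leakage ratio but little radiated power contributes little to the total, which is the sense in which the metric is a total radiated power quantity.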
The following parametrization is adopted for all the simulation cases, unless otherwise noted. The transmit signal $x[n]$ is a 5G NR compliant OFDM signal with a standard cyclic prefix (CP), 200 MHz bandwidth, and 60 kHz subcarrier spacing. The signal is oversampled by a factor of 8. The intended angle is set to $\theta = -27^\circ$; however, similar behavior is seen with other user angles as well. Additionally, in order to avoid generating grating lobes in the beampattern, antenna elements transmitting similar signals need to have a separation of less than $\lambda/2$. Therefore, to meet this criterion also in the alternating beamforming case, the antenna separation to wavelength ratio $\rho$ is set to 0.25 [48], [49]. In the simulations, in order to avoid spatial aliasing, we consider an angle range between $-90^\circ$ and $90^\circ$, with an increment of $0.25^\circ$.
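The choice of $\rho = 0.25$ can be checked with a short sketch of the standard grating-lobe condition for a uniform linear array, $\sin\theta_g = \sin\theta + m/d_\lambda$ for nonzero integer $m$. It assumes that, in the alternating scheme, every second element carries the same component signal, so that same-signal elements are $2\rho$ wavelengths apart. The function name is hypothetical.

```python
import math


def grating_lobe_angles(spacing_wavelengths, steer_deg, max_order=4):
    """Angles (degrees) of the grating lobes of a ULA steered to steer_deg."""
    s0 = math.sin(math.radians(steer_deg))
    lobes = []
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue  # m = 0 is the main lobe itself
        s = s0 + m / spacing_wavelengths
        if -1.0 < s < 1.0:  # strictly inside the visible region
            lobes.append(math.degrees(math.asin(s)))
    return lobes


rho = 0.25
same_signal_spacing = 2 * rho  # assumed spacing of same-signal elements
lobes_quarter = grating_lobe_angles(same_signal_spacing, -27.0)
lobes_half = grating_lobe_angles(2 * 0.5, -27.0)  # what rho = 0.5 would give
```

With $\rho = 0.25$ no grating lobe falls inside the visible region for the user angle of $-27^\circ$, whereas $\rho = 0.5$ would place a grating lobe near $33^\circ$ under the same assumption.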

A. Beampatterns and TRP-SDR
First, the derived total power beampattern expressions from Section III are verified by simulations. Fig. 5 illustrates the derived and simulated beampatterns for both the alternating and block beamforming schemes, with $M = 32$, $N_A = 4$, and $\sigma_x^2 = 0.1448$, which corresponds to a clipping probability of 0.1% and a PAPR of around 8.4 dB. Additionally, the beampattern produced by the ideal architecture is plotted for reference. It is clear from Fig. 5 that the derived beampattern expressions from (28) and (30) match the simulated results extremely well: the mean squared error (MSE) of the alternating beamforming scheme is in the order of $10^{-8}$ and the MSE of the block scheme is in the order of $10^{-6}$. The remaining small errors stem from the utilization of the Taylor series approximations in (28) and (30). The skew of the beampatterns (meaning that the maximum power is not directed exactly towards the user) as well as the high side lobes are due to the distortion terms in (28) and (30), which are significantly higher in the block beamformer, especially near the user angle. These terms stem from the imperfect elimination of the $\tilde{e}[n]$ signal.
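The quoted relation between $\sigma_x^2 = 0.1448$, the 0.1% clipping probability, and the PAPR of roughly 8.4 dB can be reproduced with a short calculation. The sketch assumes that the OFDM samples are well modeled as circularly symmetric complex Gaussian (so that the amplitude is Rayleigh distributed) and that the clipping level is unity. It also approximates the PAPR by neglecting the small amount of average power removed by the clipping. The function names are hypothetical.

```python
import math


def clipping_probability(sigma2, clip_level=1.0):
    """P(|x| > clip_level) for a circular complex Gaussian x with variance sigma2."""
    return math.exp(-clip_level ** 2 / sigma2)


def papr_db(sigma2, clip_level=1.0):
    """Approximate PAPR of the clipped signal, with peak power clip_level**2."""
    return 10.0 * math.log10(clip_level ** 2 / sigma2)


p_clip = clipping_probability(0.1448)
papr = papr_db(0.1448)
# Variance that would give a 1% clipping probability under the same model.
sigma2_for_1_percent = -1.0 / math.log(0.01)
```

Under these assumptions the variance follows directly from the target clipping probability as $\sigma_x^2 = -1/\ln P_\mathrm{clip}$.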
Fig. 6(a) plots the TRP-SDR with various numbers of antennas and amplitude levels of the ML-OP transmitter, showing both the analytical results from (21), (22), (25), and (26) and the simulated results, which can be seen to match the analytical work perfectly. The clipping probability is set to 0.1%. It is clear that increasing the number of amplitude levels increases the TRP-SDR, indicating that the distortion powers decrease with more amplitude levels in the ML-OP system. Additionally, the TRP-SDR increases with the number of antennas in the alternating beamforming scheme; that is, the total radiated distortion increases more slowly than the useful transmit power, or not at all, as the number of antennas increases. Conversely, in the block beamforming scheme, the TRP-SDR stays approximately invariant w.r.t. the number of antennas, indicating that the distortion power increases along with the useful signal power.
Fig. 6. Total radiated power signal-to-distortion ratio (TRP-SDR) (a) and simulated TRP-ACLR (b) with various numbers of antennas and amplitude levels in the considered ML-OP beamforming schemes, $P_\mathrm{clip} = 0.1\%$.

Fig. 7. Simulated and derived SDR at the intended user angle for various numbers of utilized antennas ($M$) and amplitude levels ($N_A$), clipping probabilities ($P_\mathrm{clip}$), and numbers of PM bits ($B_p$). In the simulations, $\sigma_g = \sigma_\delta = 0.1$.
The received signal power can be divided into inband and OOB powers through Fourier transformation of the time-domain signal and selection of the appropriate frequencies, in order to determine the TRP-ACLR according to (48). For this, only simulated results are given. Fig. 6(b) shows the TRP-ACLR values for the considered ML-OP architectures, with various numbers of transmitting antennas and amplitude levels, with 0.1% clipping probability. Similar conclusions can be drawn here as for the analytical TRP-SDR above. This is due to the distortion terms of (22) and (25) being the only sources of nonlinearity in the TRP-SDR, and thus also affecting the TRP-ACLR. The TRP-ACLR, however, also considers the inband power of the distortion terms, which is the source of the difference between the TRP-SDR and TRP-ACLR results. Again, increasing the number of amplitude levels improves the performance of the OP-based systems. The alternating beamforming scheme is able to meet the 28 dBc limit even with a single amplitude level if sufficiently many antennas are employed, as was also shown in [49]. A more moderate number of antennas is required to meet the limit if the system has more amplitude levels. For example, with 4 amplitude levels, the 28 dBc limit is reached with 8 antennas. Meanwhile, the block beamforming scheme cannot meet the 28 dBc limit in any of the considered cases, and the TRP-ACLR is almost invariant with respect to the number of transmitting antennas.
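The division of the received power into inband and adjacent channel parts can be sketched as follows for a single observation angle. The example uses a naive DFT to stay self-contained, and a hypothetical two-tone test signal in place of the OFDM waveform; the bin ranges chosen for the occupied and adjacent channels are likewise illustrative. In a TRP evaluation, the two band powers would additionally be summed over all observation angles before the ratio is taken.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform, adequate for a short example."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def band_power(spectrum, bins):
    """Sum of |X[k]|^2 over the given bin indices (negative indices wrap around)."""
    n = len(spectrum)
    return sum(abs(spectrum[k % n]) ** 2 for k in bins)


N = 64
# Hypothetical test signal: a strong tone at bin 3 and a weak tone at bin 12.
x = [cmath.exp(2j * math.pi * 3 * t / N) + 0.01 * cmath.exp(2j * math.pi * 12 * t / N)
     for t in range(N)]
X = dft(x)
inband_bins = range(-4, 5)     # occupied channel: bins -4..4
adjacent_bins = range(8, 17)   # upper adjacent channel: bins 8..16
aclr_db = 10 * math.log10(band_power(X, inband_bins) / band_power(X, adjacent_bins))
```

The weak tone is 40 dB below the strong one, so the resulting single-angle ratio is 40 dB.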

B. Branch Mismatch and Quantization Distortion Assessment
Here, we will first verify the results of the SDR derivations of Section IV by simulations. Fig. 7 illustrates the simulated and derived SDR values with respect to different numbers of antennas, amplitude levels, clipping probabilities, and PM bits, while the standard deviations of the branch mismatches are $\sigma_g = \sigma_\delta = 0.1$. The simulated results of Fig. 7 are averages over 250 realizations and match the analytical work extremely well, with minor differences emerging simply from the use of a finite number of realizations per simulated result and from the approximation used for the PDF of $\Delta_q[n]$. It can be seen that the effects of the mismatches diminish when the number of antennas is increased, and the level approaches a limit set by either the number of modulator bits, as with $B_p = 4$, or the overload distortion, when $B_p = 6$. The limit is reached with fewer antennas when more amplitude levels are employed. Also, with higher signal power, i.e., higher clipping probability, the overload distortion and/or quantization noise becomes dominant, and the ceiling set by these is reached, with fewer employed antennas. The above-mentioned effects are intuitive: with an increased number of antennas, the effects of the antenna branch mismatches decrease as more independent Gaussian-distributed random effects are summed together.

Fig. 8. Simulated EVM at the intended user angle for various numbers of utilized antennas ($M$) and amplitude levels ($N_A$), clipping probabilities ($P_\mathrm{clip}$), and numbers of PM bits ($B_p$). In the simulations, $\sigma_g = \sigma_\delta = 0.1$.
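The averaging of independent branch mismatches over the array can be illustrated with a small Monte Carlo sketch. It assumes a simplified model in which each branch applies a complex gain $(1 + g_m)e^{j\delta_m}$ with independent zero-mean Gaussian $g_m$ and $\delta_m$, and it measures the mean squared deviation of the array-averaged gain from its ensemble mean. This is not the SDR model of Section IV; the function name and the number of trials are hypothetical.

```python
import cmath
import math
import random


def mismatch_error_power(num_antennas, sigma_g, sigma_delta, trials, rng):
    """Mean squared deviation of the averaged branch gain from its ensemble mean."""
    # E[(1 + g) exp(j*delta)] for independent zero-mean Gaussian g and delta.
    mean_gain = math.exp(-sigma_delta ** 2 / 2)
    acc = 0.0
    for _trial in range(trials):
        total = 0j
        for _branch in range(num_antennas):
            g = rng.gauss(0.0, sigma_g)
            d = rng.gauss(0.0, sigma_delta)
            total += (1.0 + g) * cmath.exp(1j * d)
        acc += abs(total / num_antennas - mean_gain) ** 2
    return acc / trials


rng = random.Random(1)  # fixed seed for repeatability
err_8 = mismatch_error_power(8, 0.1, 0.1, 2000, rng)
err_128 = mismatch_error_power(128, 0.1, 0.1, 2000, rng)
```

In this model the per-branch variance is $(1 + \sigma_g^2) - e^{-\sigma_\delta^2} \approx 0.02$ for $\sigma_g = \sigma_\delta = 0.1$, and the error power of the averaged gain falls as $1/M$, i.e., by about 12 dB when going from 8 to 128 antennas.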
Furthermore, the signal quality suffers less when more amplitude levels are considered in Fig. 7. This is due to the absolute error of the phasors of the component signals being smaller under the mismatches with smaller amplitude levels. This effect is prominent, since most of the samples in OFDM signals have amplitudes in the lower end, which explains the high PAPR of the signals. Mathematically, this diminishing effect can be seen in (24), where the amplitude-level-dependent terms decrease with an increasing number of levels, indicating diminishing mismatch distortion power when other parameters are kept unchanged. The joint effect of the $\tilde{e}[n]$ terms and overloading causes the SDR to be worse with a clipping probability of 0.1% than with 1% when employing a single amplitude level. We leave the analysis of this minor effect for future work. Still, there seems to be a clear benefit in terms of signal quality in employing as many amplitude levels and antennas as possible in the system. Practically, however, the benefit in the SDR when going from, for example, 8 amplitude levels to 16 might not be worth the extra cost and complexity of the transmitter, especially if the clipping or the number of PM bits is the limiting factor in the achievable signal quality.
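Why lower amplitude levels reduce the absolute phasor error can be seen from a minimal sketch of a multilevel outphasing decomposition. It assumes the normalization $x[n] = (S_1[n] + S_2[n])/2$ with $|x[n]| \le 1$, picks the smallest level from $\{1/N_A, 2/N_A, \ldots, 1\}$ that is not below $|x[n]|$, and then applies the same phase error to one component in the single-level and multilevel cases. The function name, the normalization, and the test values are assumptions made for illustration.

```python
import cmath
import math


def ml_outphasing(x, num_levels):
    """Split x (|x| <= 1) into two equal-amplitude components with x = (s1 + s2) / 2."""
    r = abs(x)
    # Smallest level k / num_levels that is >= |x|; the epsilon guards exact levels.
    level = max(1, math.ceil(r * num_levels - 1e-12)) / num_levels
    theta = math.acos(min(1.0, r / level))  # outphasing angle
    phi = cmath.phase(x)
    s1 = level * cmath.exp(1j * (phi + theta))
    s2 = level * cmath.exp(1j * (phi - theta))
    return s1, s2


x = 0.3 * cmath.exp(1j * 0.7)
s1_single, s2_single = ml_outphasing(x, 1)  # component amplitude 1
s1_multi, s2_multi = ml_outphasing(x, 4)    # component amplitude 0.5

delta = 0.1  # hypothetical phase mismatch in radians on the first component
err_single = abs(s1_single * cmath.exp(1j * delta) - s1_single)
err_multi = abs(s1_multi * cmath.exp(1j * delta) - s1_multi)
```

For the same phase error, the absolute error of the component phasor scales with its amplitude, so halving the level halves the error in this example.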
The EVM metric is simulated in Fig. 8 with the same parametrizations as in Fig. 7. Unlike the SDR, the EVM in Fig. 8 considers only the inband effects of the branch mismatches, overload distortion, and quantization; yet, since the SDR and EVM are related metrics, we can draw similar conclusions here. For reference, the EVM is also simulated with an ideal transmitter where linear PAs are utilized. The ideal data thus acts as the EVM floor in the different clipping probability cases, since the distortion then stems only from the clipping, which cannot be diminished by the use of ML-OP structures. Additionally, the calculation of EVM requires the employment of an equalizer, which removes the effects of the branch mismatches in the ideal transmitter case. In the ML-OP scheme, we see again that increasing the number of antennas averages the mismatches out, which improves the EVM. Employing more PM bits also improves the EVM, with 6 bits showing more than adequate results. Moreover, increasing the number of amplitude levels has a major impact on the EVM: already with 2 amplitude levels, the gap to the EVM floor set by the ideal transmitter is much reduced compared to employing only a single amplitude level. In terms of the inband distortion, the proposed multilevel architecture is thus a viable substitute even in the presence of the presented nonidealities, if sufficient parametrization is employed.
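The sensitivity to the number of PM bits can be given a rough scale with a simplified model that is not the $\Delta_q[n]$ analysis of Section IV. The sketch assumes a single unit-amplitude phasor whose phase is rounded to $2^{B_p}$ uniform levels, with the rounding error uniformly distributed over one quantization step. The function name is hypothetical, and the resulting numbers are not claimed to match the ceilings observed in Figs. 7 and 8.

```python
import math


def pm_quantization_sdr_db(num_bits):
    """SDR of a unit phasor whose phase is rounded to 2**num_bits uniform levels."""
    step = 2 * math.pi / 2 ** num_bits
    half = step / 2
    # E|exp(j*eps) - 1|^2 for eps uniform on [-half, half] is 2 * (1 - sin(half) / half).
    error_power = 2 * (1 - math.sin(half) / half)
    return -10 * math.log10(error_power)


sdr_4_bits = pm_quantization_sdr_db(4)
sdr_6_bits = pm_quantization_sdr_db(6)
```

In this simplified model each additional PM bit improves the SDR by about 6 dB, so moving from 4 to 6 bits gives roughly 12 dB.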
Fig. 9 illustrates the TRP-ACLR under the branch mismatches, clipping, and quantization, utilizing an ideal transmitter and the ML-OP architecture with the alternating beamforming scheme. Again, the ideal structure employs only linear PAs, and therefore it acts as the limit, with only the mismatches and clipping distorting the received signal at each angle. Since the entire band of the signal is affected by the mismatches, they have no effect on the TRP-ACLR in the ideal case, the only contributor then being the clipping probability, such that more clipping induces more nonlinear distortion, lowering the TRP-ACLR. For the ML-OP architectures, only the alternating beamforming scheme is investigated, since, as was seen in Fig. 6(b), the block beamforming cannot reach the 28 dBc limit even without the distorting effects of the branch mismatches, clipping, and quantization. As was the case in Fig. 6(b), the TRP-ACLR can be improved by employing more amplitude levels in the ML-OP architecture in the presence of the mismatches and overload distortion. Like before, the TRP-ACLR of the alternating beamforming scheme also improves with an increasing number of antennas. Again, an increase in the TRP-ACLR is evident with an increased number of PM bits, and the 28 dBc limit can be reached with a moderate number of amplitude levels with 6 bits. As we saw with the SDR before with a single amplitude level, the TRP-ACLR seems to counterintuitively increase slightly with higher clipping probability when employing a small number of amplitude levels, due to the joint effect of the $\tilde{e}[n]$ terms and overloading. We again leave further analysis of this effect for future work. Nevertheless, it is shown that with sufficient parametrization it is possible to surpass the 28 dBc limit, which can be achieved, for example, with $M = 32$, $N_A = 4$, and $B_p = 6$, with some margin for other unconsidered effects. Thus, also from the OOB emission point of view, the combinerless ML-OP architecture employing the alternating beamforming scheme is a feasible choice in large antenna array cases.

Fig. 9. Simulated TRP-ACLR for various numbers of utilized antennas ($M$) and amplitude levels ($N_A$), clipping probabilities ($P_\mathrm{clip}$), and numbers of PM bits ($B_p$), with the 28 dBc limit highlighted. In the simulations, $\sigma_g = \sigma_\delta = 0.1$.

Fig. 10. Power-efficiency results in terms of the required supply power relative to the Cartesian architecture for different numbers of DAC/PM bits, clipping probabilities ($P_\mathrm{clip}$), target SE (measured in SNDR), and numbers of amplitude levels ($N_A$), $M = 128$.

C. Comparison to Reference Architectures
Fig. 10 shows a comparison of the required total supply power in three different transmitter architectures to meet the wanted SE, discussed in Section V. In Fig. 10, the required powers are given relative to the Cartesian architecture, and the maximum efficiency $\eta_\mathrm{max}$ is set to a moderate 30% for FR2. The noise power $\sigma_n^2$ is set 20 dB lower than the received useful signal power $M^2 \sigma_x^2$ of (44), while the number of antennas is $M = 128$. However, since we are comparing the required powers to those of the Cartesian architecture, the exact parameter values are inconsequential, and similar results are obtained with other parametrizations as well. Additionally, the ML-OP is assumed to utilize $N_A$ parallel PA branches in the generation of the component signals, requiring an additional 2 dB of power to overcome the loss in the combiner, which can be realized for up to $N_A = 16$, for example, by two stages of 4-way combiners with approximately 1 dB of insertion loss each [55]. It can be seen that the sequential architecture benefits from splitting the signal in two parts before quantization with limited resolution, which results in it requiring less power than the Cartesian TX. This effect disappears with an increasing number of DAC bits. Meanwhile, the single-level OP ($N_A = 1$) requires much more power than the reference methods, which is in line with the results of [48]. However, the required power drops with an increasing number of amplitude levels in the ML-OP scheme, even with the considered 2 dB loss in the power combiners, and the lower clipping probability, i.e., lower signal variance, favors the ML-OP scheme. It can be seen that 8 levels are required to match, or in some scenarios improve on, the supply power requirement of the reference schemes. This requirement can be brought down to 4 amplitude levels if the combiner loss could in practice be lowered to less than 1 dB, as, e.g., in [56]. Further improvement in the ML-OP scheme can be gained by considering non-uniform division of the amplitude levels [30], [31], or asymmetric architectures [32], [33], where the component signals can have different instantaneous amplitudes. These are, however, out of the scope of this paper.
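The combiner budget quoted above amounts to simple decibel arithmetic, sketched below. The example assumes that the insertion losses of cascaded combiner stages add in dB and that each stage is a 4-way combiner with 1 dB of loss; the function names are hypothetical.

```python
def db_to_linear(db):
    """Convert a power ratio from decibels to linear scale."""
    return 10 ** (db / 10)


def combiner_chain(ways_per_stage, loss_db_per_stage, stages):
    """Number of combined branches and total power penalty of cascaded combiners."""
    branches = ways_per_stage ** stages
    total_loss_db = loss_db_per_stage * stages
    return branches, total_loss_db, db_to_linear(total_loss_db)


branches, loss_db, power_factor = combiner_chain(4, 1.0, 2)
```

Two such stages combine 16 branches at a cost of 2 dB, which corresponds to roughly 1.58 times the PA output power for the same radiated power.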
Lastly, Fig. 11 illustrates the TRP-ACLR achieved with the three different TX architectures, using various numbers of antennas and DAC/PM bits. Here, the branch mismatches are omitted for brevity of presentation. It can be seen that the TRP-ACLR of the sequential architecture falls between those of the ML-OP utilizing 4 and 8 amplitude levels, exceeding the 28 dBc limit in all cases, albeit with a negligible margin with $B_p = 4$ and $M = 8$. Meanwhile, the TRP-ACLR of the Cartesian structure is greatly affected by the number of DAC bits: with 4 bits, the 28 dBc limit cannot be reached, while with 6 bits there is already a margin of around 10 dB. It is worth noting that the nonlinear distortion stemming from PAs is omitted in this study; including it would greatly deteriorate the TRP-ACLR performance of the sequential and Cartesian architectures.

VII. CONCLUSION
In this article, we have thoroughly investigated the combinerless ML-OP transmitter in a large phased array context. Stemming from the introduced system model, we derived analytical expressions for the TRP-SDR and the total power beampattern for the two considered beamforming schemes. Additionally, we considered antenna branch mismatches, along with quantization, and formulated the SDR at the intended user direction. The analytical results were verified through simulations, and the TRP-SDR expressions were shown to predict the radiated OOB behavior in terms of TRP-ACLR. Similarly, the derived SDR was shown to predict the inband quality metric EVM, as was evidenced through simulations. Comprehensive simulation results indicate that the TRP-ACLR limit of 5G NR systems at FR2 (28 dBc) can be reached with the considered transmitter architecture if sufficient parametrization is employed. Specifically, the limit could be reached, for example, by employing 4 amplitude levels, 6 PM bits, and 32 transmitting antennas. Strictly speaking, this was shown to be true only for the alternating beamforming scheme, as no improvement in TRP-ACLR could be obtained in the block beamforming case by increasing the number of antennas. The positive impact of a larger number of antennas and amplitude levels on EVM was also demonstrated. Investigation of power efficiency revealed that the performance of the reference transmitter schemes can be met and surpassed with the ML-OP scheme with a sufficient number of amplitude levels. Overall, this study has shown that the combinerless ML-OP is a potential candidate for communication solutions in mmWave large phased arrays, where the tradeoff between transmitter linearity and power efficiency is one of the most notable implementation challenges. In our future work, extending the combinerless ML-OP structure to multi-user cases will be considered, including appropriate digital precoding to reduce multi-user interference and OOB emissions.

Fig. 1. Considered transmitter system, where (a) is a realistic implementation and (b) is an analytical equivalent, both consisting of a multilevel outphasing (ML-OP) transmitter and a phase-based beamformer, serving $M$ total antennas in a ULA configuration.

Fig. 2. Vector diagram of the OP concept with (a) single amplitude level ($N_A = 1$) and (b) two amplitude levels ($N_A = 2$), showing the partition of the transmit signal $x[n]$ to the OP component signals $S_1[n]$ and $S_2[n]$.

2 − 1 is always equal to 0, since for any time instance, either $o[n] = 0$ or $\tilde{A}[n] = 1$, making the square root term equal to 0. The amplitudes of the ML-OP component signals $S_1[n]$ and $S_2[n]$ are now within the set $\{1/N_A, 2/N_A, \ldots, 1\}$.

Fig. 4. Comparison of histograms and the approximated PDF of $\Delta_q[n]$, using random OFDM signal inputs.

exactly the same amount of antennas transmitting each of the component signals. Further development of the received signal then yields

Fig. 5. Derived and simulated beampatterns of the combinerless ML-OP architecture, utilizing (a) alternating and (b) block beamforming schemes, normalized to the user angle power with $M = 32$, $N_A = 4$, and $\sigma_x^2 = 0.1448$. The dotted vertical line indicates the angle of the intended user, $\theta = -27^\circ$.

Fig. 11. TRP-ACLR for various numbers of antennas ($M$) and different numbers of PM/DAC bits, utilizing different TX architectures.

since the power of $S_1^q[n]$ and $S_2^q[n]$ is the same, and both contain the effects of clipping and quantization. The instantaneous TX power of the carrier signal antennas of the sequential architecture is $P_{\mathrm{tx},c}[n] = |g_\mathrm{seq} x_c^q[n]|^2$, and similarly the power in the peaking signal antennas is $P_{\mathrm{tx},p}[n] = |g_\mathrm{seq} x_p^q[n]|^2$, where $x_c^q[n]$ and $x_p^q[n]$ are the carrier and peaking signals containing the clipping and quantization, respectively.