Parallel Delta-Sigma Modulator-Based Digital Predistortion of Wideband RF Power Amplifiers

In this article, we propose a new robust and highly efficient digital predistortion (DPD) concept for the linearization of wideband RF power amplifiers (PAs). The proposed approach is based on the combination of a parallelized delta-sigma modulator (DSM) and a forward model of the PA. This concept applies multi-rate techniques to a DSM that incorporates the forward PA model in its feedback loop to perform the required signal predistortion. Such a technique eliminates the need for reverse modeling and its associated problems. The multi-rate approach greatly relaxes the clock speed requirement of the DPD, which allows handling high signal bandwidths at feasible sampling rates. Moreover, enhanced performance can be achieved without increasing the order of the modulator, which reduces the sensitivity of the system to gain variations and phase distortions caused by the nonlinear PA characteristics. Three time-interleaved parallel DPD (P-DPD) variants are introduced, all of which are shown to offer increased accuracy and, consequently, better linearization performance than the state-of-the-art DSM-based DPD. The proposed architectures are tested and assessed using extensive real-world RF measurements at the 3.6 GHz band utilizing wideband 100 MHz 5G New Radio (NR) transmit waveforms, evidencing excellent transmit signal quality.


I. INTRODUCTION
Modern wireless communication systems are evolving with enhanced support for increasing numbers of connected devices, while providing higher data rates and better quality of service (QoS). To ensure high throughputs, advanced physical-layer schemes that generate transmit waveforms with highly dynamic envelopes are applied. However, when using these types of waveforms and modulation schemes, enhanced spectral efficiency is achieved at the expense of strict requirements in the design of the involved radio transmitters. To this end, the generated non-constant envelope signal triggers the nonlinearities in the transmitter's nonlinear components, in particular, in the power amplifier (PA). When the PA is operated with signals with a high peak-to-average-power ratio (PAPR) [1], the power efficiency will decrease significantly and, consequently, a trade-off between the linearity and power efficiency must be taken into consideration. In this context, various linearization techniques have been presented as preferred solutions for this trade-off by restoring the required linearity while allowing the PA to operate in its nonlinear region with feasible input back-off (IBO) [2], [3]. One of the most dominant and well-established linearization techniques is digital predistortion (DPD) [4], [5], [6], [7], [8], [9], [10], [11], [12]. It consists of preceding the PA with the inverse of its nonlinearity, operating on the digital transmit waveform, such that the cascade of the DPD and the PA ideally behaves as a linear system. However, the reverse modeling in DPD systems is easily associated with instability issues and additional computational cost [10], [12], and thus needs to be handled with care.
In [13], [14], and [15], a novel digital predistortion technique was proposed that consists of placing the forward model of a PA in the feedback path of a delta-sigma modulator (DSM), such that the overall transfer function would be the inverse model. Therefore, the linearization happens automatically without the need to perform explicit reverse modeling. However, in this context, having a good DSM performance is crucial for the desired DPD accuracy. In fact, one of the major implementation challenges in DSM-based transmitters is the high clock speed [16], [17], [18] required to reach sufficiently high oversampling ratios (OSRs) that are essential to obtain good performance. This is a particular concern in wideband linearization problems where the modulation bandwidth is already large. A high OSR is known to increase considerably the cost and complexity of the system. Moreover, the very high processing speed requirement of DSMs may limit their applicability in processing and linearizing wideband signals. Although the performance of the DSM can be improved by increasing the order of the modulator, the stability of high-order modulators applied in digital predistortion becomes more sensitive to the possible gain variations and phase distortions in the nonlinear PA characteristics [13], [14].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In this article, building on the DSM-based DPD principles, we propose a new and highly efficient transmitter linearization concept that alleviates the requirement of a high oversampling rate and obviates the need for high-order modulators. The proposed concept consists of a parallel DPD (P-DPD) approach that employs the time-interleaving technique to enhance the linearization performance with a lower sampling rate per processing channel. Specifically, the parallel processing principle enables using multiple interconnected sub-modulators operating in parallel, each at a lower sampling speed. As a result, the effective sampling rate becomes the clock rate of each sub-modulator multiplied by the number of sub-modulators. In other words, the required sampling rate can be achieved not by increasing the oversampling rate, but by increasing the number of parallel modulators. Consequently, signals with wider bandwidths can be processed without the requirement of high processing speed. In addition, this technique obviates the need for increasing the order of the modulator and, as a result, offers more robustness against nonlinear characteristics in the feedback loop of the modulator. The well-known Volterra-based generalized memory polynomial (GMP) is used in the proposed architectures for forward modeling, as it is a frequency-dependent model and is thus able to reflect accurately the dynamic behavior of wideband PAs [1], [2]. However, the general P-DPD concept as such allows for the utilization of any given nonlinear forward model. Furthermore, the presented concept offers the advantage of supporting the use of fast low-cost finite-resolution digital-to-analog converters (DACs), further increasing the practical implementation feasibility of the overall system, though the use of low-resolution DACs is not as such limited to the P-DPD concept.
The rest of this article is organized as follows. In Section II, we give first a brief overview of DSMs followed by the description of the time-interleaved or parallelized DSM concept. Section III then describes the proposed parallel DPD method, while introducing also three alternative exact DPD system architectures or types stemming from the parallelism. In Section IV, we describe the RF measurement environment together with the obtained RF measurement results and their analysis. Section V provides some further complementary discussion about the potential advantages that can be obtained by using the proposed DPD concept compared to the existing methods. Finally, conclusions are provided in Section VI while complementary details are provided in the Appendix.

A. Delta-Sigma Modulators
The DSM concept is based on oversampling and quantizing the input signal, together with shaping the generated quantization noise outside the bandwidth of the signal of interest [16], [17]. In a typical DSM, the signal and the quantization noise are shaped by two different transfer functions, which means that the quantization noise can be pushed outside the band of interest, while the signal is kept essentially unchanged. DSMs are characterized by the well-known tradeoff between the order, the number of quantization levels, and the oversampling ratio (OSR) [16], [17]. If an increase in the signal-to-noise-and-distortion ratio (SNDR) is desired, the system designer can increase the modulator order, the number of quantization levels, the OSR, or a combination of the three. Formally, the OSR can be expressed as

$$\mathrm{OSR} = \frac{f_s}{2 f_B} \tag{1}$$

where $f_s$ is the sampling frequency, and $f_B$ is the maximum signal frequency.
The frequency-band decomposition (FBD) technique consists of using a bank of filters to split the input signal into a certain number of sub-signals with smaller bandwidths. As a result, the required sampling frequency for each of these sub-bands can be considerably reduced. The other approach is based on decomposing the spectrum of the input signal into multiple sub-bands working at a lower speed using the so-called Hadamard transformer technique. After modulation, the Hadamard transformer is applied again to mix the modulated sub-bands. The time-interleaving concept, in turn, uses $M$ interconnected sub-modulators working simultaneously in parallel, which reduces the clock speed by a factor of $M$. In this work, this parallelization approach is adopted to design a DSM-based DPD with improved performance without increasing the processing speed. To the best of the authors' knowledge, this is the first time that such a parallel DPD approach is presented in the scientific literature.

C. Time-Interleaving Technique
The time-interleaved architectures use the polyphase decomposition principle, in which an arbitrary transfer function is decomposed into pseudo-circulant transfer function matrices. The resulting transfer function matrices are then implemented in parallel channels with multirate signal processing. To derive the time-interleaved version of a single-input single-output discrete-time system $Y(z) = H(z)X(z)$, an equivalent block system $\bar{H}(z)$ is obtained by applying the polyphase multirate technique. If the time-interleaving factor is $M$, the equivalent transfer function $\bar{H}(z)$ can be represented with an $M \times M$ matrix that reads [22], [23], [24], [25]

$$\bar{H}(z) = \begin{bmatrix} E_0(z) & E_1(z) & \cdots & E_{M-1}(z) \\ z^{-1}E_{M-1}(z) & E_0(z) & \cdots & E_{M-2}(z) \\ \vdots & \vdots & \ddots & \vdots \\ z^{-1}E_1(z) & z^{-1}E_2(z) & \cdots & E_0(z) \end{bmatrix} \tag{2}$$

where $E_i(z)$ are the polyphase components of $H(z)$. Mathematically, the relation between $E_i(z)$ and $H(z)$ can be expressed as

$$H(z) = \sum_{i=0}^{M-1} z^{-i} E_i(z^M). \tag{3}$$

At the input of $\bar{H}(z)$, $M$ down-samplers are used, which means that $\bar{H}(z)$ will be working at a clock speed $M$ times lower than $H(z)$, while providing the same function with the same performance. At the output of $\bar{H}(z)$, $M$ up-samplers and $M-1$ delays are used to construct the final signal.
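The polyphase principle above can be checked numerically with a short Python/NumPy sketch. It is only an illustration under stated assumptions: an arbitrary FIR impulse response stands in for H(z), the function names are hypothetical, and the channel ordering (channel 0 carries the newest sample of each block of M samples) is one common pseudo-circulant convention in which the entries below the main diagonal carry one extra low-rate delay.

```python
import numpy as np

def polyphase_components(h, M):
    """Split an impulse response into M components e_i(n) = h(M*n + i)."""
    h = np.asarray(h, dtype=float)
    h = np.concatenate([h, np.zeros((-len(h)) % M)])  # pad to a multiple of M
    return [h[i::M] for i in range(M)]

def _lowrate_filter(e, u):
    """Causal convolution at the low rate, truncated to the input length."""
    return np.convolve(e, u)[:len(u)]

def _delay_one(u):
    """One-sample delay at the low rate (the z^-1 factor in the block matrix)."""
    return np.concatenate([[0.0], u[:-1]])

def block_filter(x, h, M):
    """Filter x with h using M parallel channels running at 1/M of the rate."""
    x = np.asarray(x, dtype=float)
    if len(x) % M:
        raise ValueError("len(x) must be a multiple of M")
    e = polyphase_components(h, M)
    # Assumed convention: channel k holds x(M*n + M-1-k), so channel 0 is the newest sample.
    X = [x[M - 1 - k::M] for k in range(M)]
    y = np.zeros_like(x)
    for k in range(M):                 # output channel (matrix row)
        acc = np.zeros(len(x) // M)
        for c in range(M):             # input channel (matrix column)
            if c >= k:                 # on or above the diagonal: E_{c-k}
                acc += _lowrate_filter(e[c - k], X[c])
            else:                      # below the diagonal: z^-1 * E_{M+c-k}
                acc += _delay_one(_lowrate_filter(e[M + c - k], X[c]))
        y[M - 1 - k::M] = acc          # up-sample and interleave the channel outputs
    return y

rng = np.random.default_rng(0)
h_demo = rng.standard_normal(7)
x_demo = rng.standard_normal(32)
print(np.allclose(block_filter(x_demo, h_demo, 4), np.convolve(h_demo, x_demo)[:32]))
```

Every convolution inside `block_filter` runs on sequences that are M times shorter than the input, which is the software counterpart of each channel being clocked M times slower.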

III. PROPOSED PARALLEL DPD METHODS
In this section, we describe the proposed new P-DPD concept as well as the related design approaches and three alternative implementation variants.

A. Nonlinearity Inversion Concept
The main idea is to invert the nonlinearity of the PA by embedding its corresponding behavioral forward model in the feedback loop of the modulator such that the overall transfer function acts essentially as the inverse function of the PA [13], [14], [15]. Fig. 1(a) illustrates this approach by showing the association of a feedback loop system including a nonlinear function in its feedback path, and a block with the same function at the output, where $T$ is the forward gain and $f(\cdot)$ is a nonlinear function. For a stable feedback system, in which $T$ has a sufficiently high value, the transfer characteristic from the input $x(t)$ to the output $y(t)$ is $f^{-1}(\cdot)$ [13]. In other words, for a stable feedback system with sufficiently high forward and loop gains, the loop error $\varepsilon(t) = x(t) - f(y(t))$ is small compared to $x(t)$, $y(t)$, and $f(y(t))$. As a result, $x(t) \approx f(y(t))$, and thus, $y(t) \approx f^{-1}(x(t))$. When the forward gain approaches infinity, the obtained approximations come closer and closer to the corresponding equalities, as long as the system remains stable. Finally, since $f^{-1}(x(t))$ represents the inverse behavior of the PA, the nonlinearity would ideally be suppressed at the output of the nonlinear device.
This familiar property of feedback systems can be applied in the DSM context, as illustrated in Fig. 1(b). The forward path contains a delayed integrator followed by a quantizer, and the feedback incorporates the nonlinearity to be inverted.
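The inversion property can be illustrated with a minimal single-rate Python sketch of such a loop: a delayed integrator, a uniform quantizer, and a nonlinearity in the feedback path. The real-valued memoryless cubic `pa_model`, the 8-bit mid-tread quantizer, and all function names are assumptions made purely for illustration; they are not the forward model or quantizer used in the measurements.

```python
import numpy as np

def quantize(w, n_bits=8, full_scale=1.0):
    """Uniform mid-tread quantizer with saturation (hypothetical helper)."""
    step = 2.0 * full_scale / (2 ** n_bits)
    return np.clip(np.round(w / step) * step, -full_scale, full_scale)

def pa_model(y):
    """Assumed compressive, memoryless stand-in for the PA forward model f(.)."""
    return y - 0.2 * y ** 3

def dsm_inverse(x, f, n_bits=8):
    """First-order loop with f(.) in the feedback path; returns the loop output y."""
    y = np.zeros_like(x)
    w = 0.0                          # integrator state
    for m in range(len(x)):
        y[m] = quantize(w, n_bits)   # quantizer output is the predistorted sample
        v = x[m] - f(y[m])           # loop error between input and model output
        w = w + v                    # delayed integrator: w(m+1) = w(m) + v(m)
    return y

n = np.arange(4000)
x_demo = 0.6 * np.sin(2 * np.pi * n / 2000)      # slowly varying (oversampled) input
y_demo = dsm_inverse(x_demo, pa_model)
err = pa_model(y_demo) - x_demo                  # residual after passing y through f(.)
smoothed = np.convolve(err, np.ones(64) / 64, mode="valid")
print(float(np.max(np.abs(smoothed))))           # small in-band residual: f(y) tracks x
```

Because the integrator accumulates the loop error, the low-frequency part of f(y) - x stays small, so passing y through the nonlinearity approximately reproduces x within the signal band.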

B. Power Amplifier Forward Model
With focus on linearizing RF PAs in wideband system applications, the well-known GMP model is adopted in this work as the nonlinear function f (.). The GMP is a baseband model that is, in general, widely employed in the literature for the modeling and predistortion of RF PAs [2], [11]. It is built by augmenting the conventional memory polynomial (MP) model with additional basis functions which include the crossterms resulting from the combination of the instantaneous complex signal with the leading and lagging terms [11].
For readers' convenience and presentation completeness, the GMP expressions are next stated in a compact manner. We denote with $y(n)$ and $y_{\mathrm{mod}}(n)$ the complex baseband envelopes of the model input and output, respectively. The GMP model can then be expressed as [2], [11]

$$y_{\mathrm{mod}}(n) = \boldsymbol{\phi}(n)\,\boldsymbol{\zeta} \tag{4}$$

where the $L \times 1$ vector $\boldsymbol{\zeta}$ stacks all the model parameters while the $1 \times L$ total data or basis function vector reads

$$\boldsymbol{\phi}(n) = \begin{bmatrix} \boldsymbol{\gamma}(n) & \boldsymbol{\theta}(n) & \boldsymbol{\lambda}(n) \end{bmatrix}. \tag{5}$$

The corresponding $1 \times L_a$, $1 \times L_b$, and $1 \times L_c$ data vectors $\boldsymbol{\gamma}(n)$, $\boldsymbol{\theta}(n)$, and $\boldsymbol{\lambda}(n)$ read as shown in (6), where $Q_a$ and $K_a$ are the memory depth and the nonlinearity order of the polynomial function applied to the time-aligned input samples, respectively; $Q_b$ and $K_b$ are the memory depth and nonlinearity order of the second underlying polynomial function, respectively, with $I$ denoting the order of the lagging cross-terms; $Q_c$ and $K_c$ are respectively the memory depth and nonlinearity order of the third underlying polynomial function, in which $J$ is the order of the leading cross-terms. In general, the linear-in-parameters expression in (4) is directly applicable in least-squares based parameter estimation, utilized also in our numerical results in Section IV. It is also noted that the more ordinary MP model can be obtained as a special case for which $L_b = L_c = 0$. Finally, for clarity, it is noted that in the feedback path application shown in Fig. 1, the signal $y(n)$ serves as the model input.
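As an illustration of the linear-in-parameters structure, the following Python/NumPy sketch builds a GMP basis matrix and estimates the parameter vector by least squares. The exact index ranges are assumptions borrowed from the commonly used GMP formulation (delays q = 0..Q, aligned envelope powers k = 0..Ka-1, cross-term powers k = 1..K, cross-term shifts 1..I and 1..J) and may differ from the paper's own definitions; the function names are hypothetical.

```python
import numpy as np

def shift(u, d):
    """Return u(n - d) with zero padding; negative d gives leading samples."""
    out = np.zeros_like(u)
    if d == 0:
        out[:] = u
    elif d > 0:
        out[d:] = u[:-d]
    else:
        out[:d] = u[-d:]
    return out

def gmp_basis(y, Ka, Qa, Kb, Qb, I, Kc, Qc, J):
    """Stack the aligned, lagging and leading GMP basis functions column-wise."""
    cols = []
    for q in range(Qa + 1):                      # time-aligned terms
        for k in range(Ka):
            cols.append(shift(y, q) * np.abs(shift(y, q)) ** k)
    for q in range(Qb + 1):                      # lagging-envelope cross-terms
        for k in range(1, Kb + 1):
            for i in range(1, I + 1):
                cols.append(shift(y, q) * np.abs(shift(y, q + i)) ** k)
    for q in range(Qc + 1):                      # leading-envelope cross-terms
        for k in range(1, Kc + 1):
            for j in range(1, J + 1):
                cols.append(shift(y, q) * np.abs(shift(y, q - j)) ** k)
    return np.column_stack(cols)

def fit_gmp(y_in, y_out, **orders):
    """Least-squares estimate of the parameter vector from input/output data."""
    Phi = gmp_basis(y_in, **orders)
    zeta, *_ = np.linalg.lstsq(Phi, y_out, rcond=None)
    return zeta

rng = np.random.default_rng(0)
y_in_demo = 0.5 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
orders_demo = dict(Ka=3, Qa=2, Kb=2, Qb=1, I=1, Kc=2, Qc=1, J=1)
print(gmp_basis(y_in_demo, **orders_demo).shape)   # one column per basis function
```

Setting `Kb = Kc = 0` removes all cross-term columns and leaves the ordinary MP basis, which mirrors the special case noted above.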

C. Proposed Parallel DPD Architectures
In Fig. 2(a), (b), and (c), we describe and illustrate the proposed P-DPD method and its three possible variants. The entire transmitter architectures consist of a P-DPD followed by a DAC, a low-pass filter to remove the out-of-band noise, an up-conversion stage, and a nonlinear PA before the signal gets transmitted by the antenna. For all the three P-DPD architectures, the outputs of the polyphase equivalents $W_i$ would be essentially expressed as follows

$$\begin{bmatrix} W_1 & W_2 & \cdots & W_M \end{bmatrix}^{T} = \bar{H}(z) \begin{bmatrix} V_1 & V_2 & \cdots & V_M \end{bmatrix}^{T} \tag{9}$$

where $V_i$ denotes the input of the $i$-th channel of the block filter $\bar{H}(z)$, i.e., the corresponding polyphase component of the error between the input signal and the PA forward model output. The main differences between the three variants stem from how the quantizers and sample rate converters are organized in the processing chains, and how the PA forward model(s) are utilized. These are described further in the following.
1) P-DPD Type 1: The first architecture in Fig. 2(a) consists of placing an equivalent block filter $\bar{H}(z)$ in place of the original filter $H(z)$ of the DSM as in Fig. 1, and embedding the PA forward model in the feedback loop. The quantization and the behavioral model in this topology are applied to the entire signal, i.e., after the signal construction. In this architecture, only the integrator is replaced with its equivalent circuit, while the overall structure of the DSM-based DPD is otherwise kept unchanged.
2) P-DPD Type 2: In this structure, instead of recombining the sub-signals $w_i$ into one signal and then quantizing it to finally obtain $y(n)$, we can alternatively quantize each of the $w_i$ component signals and then recombine those into $y(n)$. In other words, the original architecture has been modified by placing $M$ quantizers before the up-samplers instead of using only one after signal construction, as described in Fig. 2(b). In this topology, since each of the $w_i$ signal components is quantized separately, the quantizers work at the reduced clock speed. Ideally, the output $y(n)$ remains the same as in the first architecture.
3) P-DPD Type 3: The third topology in Fig. 2(c) modifies the previous architectures further by merging the input adder into the section operating at the lower speed. More specifically, each channel will have its own feedback loop including its own PA behavioral model. This way, the feedback loop including the PA model block will operate at a lower speed as well. Since this third variant applies the PA model for each subset of samples in the internal channels, it is important to mention that the models in all branches should be the same in order to keep the same input-output behavior. Consequently, this topology will perform as $M$ sub-predistorters working simultaneously in parallel at a lower rate. In this final architecture, the nonlinearity inversion principle described in Section III-A would be applied on each sub-modulator. Therefore, under the stability conditions, $v_i(n)$ is small compared to $x_i(n)$, $y_i(n)$, and $f(y_i(n))$; as a result, $x_i(n) \approx f(y_i(n))$, and thus $y_i(n) \approx f^{-1}(x_i(n))$. This architecture enables the predistortion operation to be performed entirely at the lower sampling speed, which can be a remarkable implementation advantage.
Further discussion on the advantages and potential challenges is provided later, in Section V.
Considering that the Type 3 variant keeps the same pseudo-circulant circuit or transfer function $\bar{H}(z)$, and considering that the $M$ PA models are equal, the changes applied in this Type 3 structure do not change the overall behavior of the previous two variants. Therefore, under the same initial conditions, all three topologies are equivalent from the input-output performance point of view, and they thus provide additional flexibility to the system designer. However, from the point of view of the involved sample rates and the corresponding complexity, Type 3 can be considered the preferred choice.

D. Design Examples
In this sub-section, focusing on the P-DPD Type 3, we demonstrate and provide two concrete P-DPD design examples with M = 2 and M = 4 parallel branches. These are then utilized also along the numerical results in Section IV.

1) P-DPD Type 3, Concrete Example With M = 2 Channels:
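Here, the goal is to determine the expression of the polyphase components for the case of $M = 2$. To this end, since most DSMs are composed of integrators, the transfer function for a conventional first-order DSM reads

$$H(z) = \frac{z^{-1}}{1 - z^{-1}}. \tag{10}$$

Using (10), the expression in (3) can be rewritten for a 2-channel interleaved system as

$$H(z) = \frac{z^{-2}}{1 - z^{-2}} + z^{-1}\,\frac{1}{1 - z^{-2}}. \tag{11}$$

According to (11), the polyphase components $E_i(z)$ are

$$E_0(z) = \frac{z^{-1}}{1 - z^{-1}} \tag{12}$$

and

$$E_1(z) = \frac{1}{1 - z^{-1}}. \tag{13}$$

Therefore, substituting (12) and (13) in (9), we obtain the following

$$\begin{bmatrix} W_1 \\ W_2 \end{bmatrix} = \begin{bmatrix} E_0(z) & E_1(z) \\ z^{-1}E_1(z) & E_0(z) \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} = \frac{1}{1 - z^{-1}} \begin{bmatrix} z^{-1} & 1 \\ z^{-1} & z^{-1} \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \end{bmatrix}$$

or

$$W_1 = \frac{z^{-1}V_1 + V_2}{1 - z^{-1}}, \qquad W_2 = \frac{z^{-1}\left(V_1 + V_2\right)}{1 - z^{-1}}$$

where $V_i = X_i - Y_i$, for $i = 1, 2$, while knowing that $Y_i$ are the outputs of the PA forward models. Fig. 3 illustrates the topology of the 2-channel P-DPD at the computing level. For readers' convenience, the corresponding complete time-domain characterization is provided in the Appendix, including also explicitly the GMP forward model for $f(\cdot)$.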
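The two-channel structure can be illustrated in the time domain with a short Python sketch that compares a single-rate first-order loop with a two-channel interleaved version of the same loop. Several assumptions are made for illustration only: the forward model is a real-valued memoryless cubic rather than a GMP, channel 1 is taken to carry the newer (odd-indexed) sample and channel 2 the older (even-indexed) sample of each pair, the integrator is delayed, and all names are hypothetical. Under these assumptions the channel recursions are w_2(n) = w_2(n-1) + v_1(n-1) + v_2(n-1) and w_1(n) = w_1(n-1) + v_1(n-1) + v_2(n).

```python
import numpy as np

def quantize(w, n_bits=8, full_scale=1.0):
    """Uniform mid-tread quantizer with saturation (hypothetical helper)."""
    step = 2.0 * full_scale / (2 ** n_bits)
    return np.clip(np.round(w / step) * step, -full_scale, full_scale)

def f(y):
    """Assumed memoryless stand-in for the PA forward model."""
    return y - 0.2 * y ** 3

def dpd_single_rate(x):
    """Reference loop running at the full rate: w(m+1) = w(m) + x(m) - f(y(m))."""
    y = np.zeros_like(x)
    w = 0.0
    for m in range(len(x)):
        y[m] = quantize(w)
        w = w + (x[m] - f(y[m]))
    return y

def dpd_two_channel(x):
    """Two interleaved channels, each updated once per pair of input samples."""
    x2, x1 = x[0::2], x[1::2]        # channel 2: even samples, channel 1: odd samples
    y1, y2 = np.zeros_like(x1), np.zeros_like(x2)
    w1 = w2 = 0.0                    # integrator states from the previous low-rate step
    v1 = v2 = 0.0                    # loop errors from the previous low-rate step
    for n in range(len(x1)):
        w2 = (w2 + v2) + v1          # depends on past errors only
        y2[n] = quantize(w2)
        v2_now = x2[n] - f(y2[n])
        w1 = (w1 + v1) + v2_now      # needs the current error of channel 2
        y1[n] = quantize(w1)
        v1, v2 = x1[n] - f(y1[n]), v2_now
    y = np.empty_like(x)
    y[0::2], y[1::2] = y2, y1        # up-sample and interleave the channel outputs
    return y

x_demo = 0.6 * np.sin(2 * np.pi * np.arange(512) / 128)
print(np.allclose(dpd_two_channel(x_demo), dpd_single_rate(x_demo)))
```

In this sketch each channel state is updated only once per two input samples, while the interleaved output matches the full-rate loop, in line with the equivalence argument for the parallel variants.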
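2) P-DPD Type 3, Concrete Example With M = 4 Channels: Considering next an example with $M = 4$, the expression of $H(z)$ can be written as follows

$$H(z) = \frac{z^{-4}}{1 - z^{-4}} + \left(z^{-1} + z^{-2} + z^{-3}\right)\frac{1}{1 - z^{-4}}.$$

Therefore, the expressions of the polyphase components $E_i(z)$ can be expressed as

$$E_0(z) = \frac{z^{-1}}{1 - z^{-1}}$$

and

$$E_1(z) = E_2(z) = E_3(z) = \frac{1}{1 - z^{-1}}.$$

Stemming from above, the expressions for $W_i$ read then

$$W_1 = \frac{z^{-1}V_1 + V_2 + V_3 + V_4}{1 - z^{-1}}, \qquad W_2 = \frac{z^{-1}\left(V_1 + V_2\right) + V_3 + V_4}{1 - z^{-1}},$$

$$W_3 = \frac{z^{-1}\left(V_1 + V_2 + V_3\right) + V_4}{1 - z^{-1}}, \qquad W_4 = \frac{z^{-1}\left(V_1 + V_2 + V_3 + V_4\right)}{1 - z^{-1}}.$$

A complete transmitter architecture utilizing a 4-channel P-DPD building on the above expressions is illustrated in Fig. 4.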

A. Measurement Setup
In order to demonstrate and evaluate the proposed DPD concept, three different RF experiments are conducted using PA samples with different implementation technologies. All the forthcoming experiments use 5G NR standard-compatible OFDM waveforms with a modulation bandwidth of 100 MHz and a PAPR of 8 dB. The experiments are specifically intended for the 5G NR band n78 (3300-3800 MHz) with a carrier frequency of 3.6 GHz. Fig. 5 illustrates the experimental RF measurement setup utilized to evaluate and demonstrate the concept of the proposed P-DPD architectures. It is composed of a power amplification unit (PAU), an attenuation block, and a vector signal transceiver (VST) acting as the transmitter and observation receiver. The employed National Instruments PXIe-5840 VST includes a vector signal generator (VSG) and a vector signal analyzer (VSA). The frequency range of the VST is from 9 kHz to 6 GHz with an instantaneous bandwidth of 1 GHz, which is sufficient for the bandwidths used in this work. The VST includes an additional host processor-based computing environment that executes the system functions including digital waveform generation, PA forward modeling, and P-DPD processing. In addition, N-bit DACs are considered in the P-DPD-based transmitters, which are also implemented or mimicked in the VST host environment. Specifically, we introduce a Gaussian-distributed mismatch that is commensurate with finite-resolution N-bit performance.
First, the host processor generates the digital baseband signal, then after the data is subdivided into 8 blocks of size 10,000 samples each, they are transferred to the VST hardware, where the signal modulation and frequency upconversion to the desired carrier frequency of 3.6 GHz are carried out. The modulated RF signal is then amplified via the PAU. The output of the PAU is connected to the RF input of the VST through two attenuators, a coaxial N-Type Fixed attenuator and an SMA fixed attenuator. After attenuation, the signal is brought back to baseband after downconversion and demodulation performed by the VST. Finally, the host processor, after time aligning the received signal with the input data, extracts the parameters of the forward model -in our case, either the GMP or more ordinary MP model. Now, after the model has been properly built, the host processor performs the P-DPD function by transferring the predistorted signal to the VST. In order to optimize the performance and prevent instability, it is important to normalize the gain and compensate for the potential phase offset that appears during the data capturing for the PA forward modeling.
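The gain normalization and phase-offset compensation step can be sketched as follows. The text does not state how the normalization is computed, so this Python snippet assumes one common choice, a single least-squares complex gain estimated between the time-aligned input and the captured output; the function name is hypothetical.

```python
import numpy as np

def normalize_gain_phase(x, y):
    """Divide y by the least-squares complex gain of y relative to x.

    Assumes x and y are already time aligned. Returns the normalized capture
    and the estimated complex gain (magnitude = gain, angle = phase offset).
    """
    g = np.vdot(x, y) / np.vdot(x, x)   # np.vdot conjugates its first argument
    return y / g, g

rng = np.random.default_rng(0)
x_demo = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
y_demo = 3.0 * np.exp(1j * 0.7) * x_demo          # pure gain and phase rotation
y_norm, g_demo = normalize_gain_phase(x_demo, y_demo)
print(abs(g_demo), np.angle(g_demo))
```

After this step the capture has unit average gain and no average phase rotation relative to the input, which is the condition the forward model is then fitted under.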

B. Modeling and Linearization Evaluation Metrics
The modeling accuracy is evaluated in this work using the normalized mean square error (NMSE) metric, which enables assessing the deviations between the modeled and measured output of the PA. It can be expressed in dB as

$$\mathrm{NMSE\ (dB)} = 10 \log_{10} \left( \frac{\sum_{n=1}^{N_{\mathrm{data}}} \left| y_{\mathrm{measured}}(n) - y_{\mathrm{model}}(n) \right|^2}{\sum_{n=1}^{N_{\mathrm{data}}} \left| y_{\mathrm{measured}}(n) \right|^2} \right) \tag{21}$$

where $y_{\mathrm{measured}}(n)$ and $y_{\mathrm{model}}(n)$ are the measured and modeled PA output signals, respectively, while $N_{\mathrm{data}}$ is the length of the available data in the discrete time domain. For PA modeling, a systematic NMSE study is commonly performed to select the parameters of the forward model, in which the nonlinearity orders and memory depths are increased until satisfactory performance is achieved. A similar approach is taken in this work.
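For reference, the NMSE metric can be computed with a few lines of Python/NumPy. This is a minimal sketch with a hypothetical function name; it assumes the two sequences are time aligned and of equal length.

```python
import numpy as np

def nmse_db(y_measured, y_model):
    """NMSE in dB between the measured and the modeled PA output."""
    y_measured = np.asarray(y_measured)
    y_model = np.asarray(y_model)
    err_energy = np.sum(np.abs(y_measured - y_model) ** 2)
    ref_energy = np.sum(np.abs(y_measured) ** 2)
    return 10.0 * np.log10(err_energy / ref_energy)

rng = np.random.default_rng(0)
y_demo = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
print(nmse_db(y_demo, 0.9 * y_demo))   # a 10 % amplitude error corresponds to about -20 dB
```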
As for the P-DPD system figures of merit, we adopt the signal-to-noise-and-distortion ratio (SNDR) and the adjacent channel power ratio (ACPR). The SNDR is used to evaluate the in-band signal quality and is defined as the ratio between the in-band signal power and the in-band noise and distortion power. It can be expressed in dB as

$$\mathrm{SNDR\ (dB)} = 10 \log_{10} \left( \frac{\text{In-band signal power}}{\text{In-band noise and distortion power}} \right). \tag{22}$$

The ACPR, in turn, focuses on measuring the out-of-band performance and is defined as the ratio of the transmitted power within the desired channel ($P_{\text{desired ch.}}$) to that in the right or left adjacent channel ($P_{\text{adjacent ch.}}$). It can thus be formulated in dB as

$$\mathrm{ACPR\ (dB)} = 10 \log_{10} \left( \frac{P_{\text{desired ch.}}}{P_{\text{adjacent ch.}}} \right). \tag{23}$$
In this work, in the ACPR measurements, the channel bandwidth is defined as the bandwidth containing 99% of the total power. Furthermore, the basic OFDM waveform processing contains weighted-overlap-and-add (WOLA) type of windowing for better spectral containment. In the following numerical results, we primarily focus on Type 3 P-DPD-based linearization results; however, the corresponding results with Types 1 and 2 are also provided for reference purposes. Additionally, it is noted that the baseline P-DPD parametrization contains $M = 4$ channels and 8-bit quantization, while results with $M = 2$ and $M = 8$ channels and with 4-bit and 6-bit quantizers are also provided for comparison purposes.
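The ACPR definition can be turned into a simple periodogram-based estimate, sketched below in Python/NumPy. The sketch takes the channel bandwidth and the adjacent-channel spacing as given inputs (it does not estimate the 99% power bandwidth), uses a plain unwindowed FFT, and the function names are hypothetical. The demo tones are synthetic and only serve to check the arithmetic.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Power of a complex baseband signal within [f_lo, f_hi) from a plain FFT."""
    spectrum = np.fft.fft(x) / len(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return np.sum(np.abs(spectrum[mask]) ** 2)

def acpr_db(x, fs, bw, spacing):
    """Return (lower, upper) ACPR in dB for a channel centered at 0 Hz."""
    p_main = band_power(x, fs, -bw / 2, bw / 2)
    p_lower = band_power(x, fs, -spacing - bw / 2, -spacing + bw / 2)
    p_upper = band_power(x, fs, spacing - bw / 2, spacing + bw / 2)
    return 10 * np.log10(p_main / p_lower), 10 * np.log10(p_main / p_upper)

# Synthetic check: unit tone in the main channel, weak tones in both adjacent channels.
fs_demo, n_demo = 1200e6, 1200
t = np.arange(n_demo) / fs_demo
x_demo = (np.exp(2j * np.pi * 10e6 * t)
          + 1e-2 * np.exp(2j * np.pi * 100e6 * t)
          + 1e-3 * np.exp(-2j * np.pi * 100e6 * t))
print(acpr_db(x_demo, fs_demo, bw=100e6, spacing=100e6))
```

For measured waveforms, an averaged and windowed spectrum estimate would normally replace the single FFT used here.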

C. RF Experiment 1: Asymmetrical Doherty GaN PA
In the first experiment, the main PA is the RTH36016M-23 provided by RFHIC. It is built in an asymmetrical Doherty configuration and fabricated using a high-density GaN semiconductor process. This PA has a gain of 23 dB and a frequency range of 3550-3700 MHz. In this experiment, a driver amplifier (Mini-Circuits ZHL-4240) is placed before the main amplifier and operated at a relatively linear point in order to ensure that the nonlinear distortion added to the transmit signal is primarily due to the main PA. The transmit signal is a 5G NR OFDM waveform with a bandwidth of 100 MHz, and the center frequency is 3.6 GHz.
1) Forward Modeling Performance: Fig. 6 presents the forward modeling performances of the GMP and MP models at an output power of +39.8 dBm. The considered memory depths and nonlinearity orders are stated in Table I, together with the corresponding NMSE values. The figure also shows the spectra of the error of both models. It is clear that the GMP model achieves more accurate results with lower modeling error. Such improved performance compared to the MP model is generally obtained by the inclusion of the cross-terms of the envelope. Taking these terms into account offers more robustness against the strong nonlinear memory effects of the PA, which can be severe with wideband signals. For comparison purposes, however, both the GMP and MP models are used to design the P-DPD in the forthcoming results.
2) Linearization Performance: In order to illustrate the linearization performance of the proposed P-DPD concept, Fig. 7 depicts the measured spectra without and with linearization for the two considered models (MP and GMP). Again, the model parameters are as shown in Table I. A Type 3 first-order modulator-based P-DPD with $M = 4$ channels is considered, while the number of quantization bits is 8. The corresponding SNDR and ACPR figures are shown in Table II, covering also the cases with 4 and 6 quantization bits. We can observe that the GMP model offers enhanced performance also in the actual linearization task, reaching excellent SNDR and ACPR values. This is particularly so when the number of quantization bits is 6 or 8.
Next, to confirm the equivalency of the three variants of the proposed P-DPD concept, in terms of linearization performance, all three variants are measured. The implementations and measurements build on the 4-channel, 8-bit, first-order modulator approach, similar to above. Table III compares the obtained SNDRs and ACPRs when employing a 100 MHz 5G NR waveform and using the GMP approach for forward modeling. As expected, the obtained results are essentially the same for the three architectures with only very minor mutual differences which are within any reasonable measurement uncertainty.
In general, the P-DPD technique can be applied using more channels, and the per-channel sampling rate will be significantly reduced. This is presented and highlighted in Table IV that summarizes the RF measurement results of the first-order 8-bit P-DPD when applying an effective OSR of 12 and using different numbers of channels. Again, the GMP approach is used for forward modeling inside the P-DPD system. It is clear that the same performance can be achieved with reduced clock rates. For instance, the 8-channel P-DPD is able to achieve the same performance with a clock rate that is 8 times smaller than that of the ordinary 1-channel architecture. These results demonstrate and highlight that the increase of the number of channels offers the same results but with a much lower clock speed, enabling the linearization of wider modulation bandwidths with the same sampling frequency.
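The relation between the effective sampling rate and the per-channel clock rate can be made concrete with a small Python calculation. It assumes, as in the reported measurements, an effective OSR of 12 defined relative to the 100 MHz modulation bandwidth; the function name is hypothetical.

```python
def per_channel_clock_hz(bandwidth_hz, osr, num_channels):
    """Clock rate of each parallel branch for a given effective oversampling ratio."""
    effective_rate_hz = osr * bandwidth_hz      # rate seen at the interleaved output
    return effective_rate_hz / num_channels

for m in (1, 2, 4, 8):
    rate_mhz = per_channel_clock_hz(100e6, 12, m) / 1e6
    print(f"M = {m}: {rate_mhz:.0f} MHz per channel")
```

With eight channels the branches run at 150 MHz while the interleaved output still corresponds to an effective rate of 1.2 GHz.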
Finally, Fig. 8 presents the measured AM/AM and AM/PM characteristics of the proposed 4-channel P-DPD using a GMP model. As can be observed, the proposed architecture is able to well linearize the S-shaped characteristics of the RTH36016M-23, which are known to be difficult to compensate. Overall, the results show that the proposed P-DPD structure is successfully able to compensate for the dynamic nonlinear behavior of the RTH36016M-23.

D. RF Experiment 2: HMC1114 GaN PA
In order to further validate the applicability of the proposed P-DPD concept, a second experiment with a different PA unit is next reported. This experiment is performed using an HMC1114 broadband GaN PA, which is designed to operate in the frequency range of 2700-3800 MHz. Within the more specific frequency range of 3200-3800 MHz, this amplifier has a gain of 32 dB. This experiment also applies the same ZHL-4240 as the driver amplifier, similar to the previous experiment, working at a relatively linear point.
1) Forward Modeling Performance: We start again by showing the forward modeling performance of the GMP model in comparison to that of the MP. To this end, Fig. 9 presents the measured spectra of the GMP and MP models, measured at an output power of +32.7 dBm. The model parameters are as shown in Table I which also shows the corresponding NMSE results. According to the measured results, a good accuracy is achieved by both models, which supports implementing them in the P-DPD linearizer structure.
2) Linearization Performance: The measured PA output spectra without and with the proposed 4-channel Type 3 P-DPD concept with 8 quantization bits are shown in Fig. 10, which clearly demonstrates the capability of the P-DPD in linearizing the HMC1114. This is well supported by the corresponding measured AM/AM and AM/PM characteristics in Fig. 11. The quantitative ACPR and SNDR figures of merit are available in Table II, for all the three cases of 4-bit, 6-bit, and 8-bit quantizers. Additionally, Table III shows the performance numbers also for Types 1 and 2, assuming 8-bit quantizers and GMP as the forward model, while the potential impact of the number of channels $M$ is shown in Table IV. We can clearly conclude that the proposed P-DPD system can very accurately linearize the HMC1114 PA, especially when GMP is used as the forward model and when the number of quantization bits is 6 or 8. The best measured SNDRs exceed 50 dB while the ACPR also approaches 48 dB.
TABLE I: FORWARD MODELING RESULTS OF THE MP AND GMP MODELS IN EXPERIMENTS 1-3
TABLE II: SUMMARY OF THE MEASUREMENT RESULTS FOR DIFFERENT NUMBERS

E. RF Experiment 3: Small-Cell PA
The third and final experiment focuses on evaluating the linearization of the Skyworks SKY66292-21 PA module, shown in Fig. 5(e). It is a low-to-medium-power amplification unit suitable, e.g., for small-cell base stations or large-scale antenna array RF transmitters as an antenna-specific PA entity. This PA module is designed to be operated in the 5G NR Band n78, providing a gain of 34 dB, and having a 1-dB compression point (P1dB) of +31.5 dBm. The transmit signal is again a 5G NR OFDM waveform with a bandwidth of 100 MHz, and the center frequency is 3.6 GHz. In this experiment, the I/Q modulated RF waveform is transmitted via the RF output port of the VST directly to the PA, which facilitates providing output powers up to around +28 dBm. Then, the proposed P-DPD schemes are adopted to carry out the performance quantification measurements.
1) Forward Modeling Performance: Fig. 12 presents the measured forward modeling results of the MP and GMP models, with the parametrization given in Table I. In this case, the GMP model significantly outperforms the MP model. This is due to the strong dynamic nonlinear behavior of the adopted PA, particularly when driven to a strongly nonlinear operation point, as done in these measurements. Specifically, as shown in Table I, the GMP provides more than 3 dB better forward modeling NMSE. However, both the MP and GMP are still utilized in the actual linearization experiments for comparison purposes.
2) Linearization Performance: The measured spectra of the proposed system are shown in Fig. 13, with Type 3 P-DPD with M = 4 channels and 8-bit quantization. The corresponding AM/AM and AM/PM characteristics are presented in Fig. 14. Furthermore, Table II summarizes the measurement results of the P-DPD using different numbers of quantization bits, while the corresponding results with the alternative implementation Types 1 and 2 are provided in Table III. Finally, Table IV provides again the comparative results when the number of channels M is varied from M = 1 (ordinary DSM-based DPD) to M = 8, while keeping the same OSR of 12 relative to the modulation bandwidth.
Based on the provided measurement results, it can be clearly concluded that the proposed P-DPD system can also very accurately linearize the SKY66292-21 PA module, despite the highly nonlinear operation point, with the best measured SNDRs and ACPRs exceeding 45 dB. Altogether, the extensive set of three experiments demonstrates that linearizing a 100 MHz modulation bandwidth is technically feasible even with clock rates as low as 150 MHz in the parallel branches of the proposed P-DPD system.

Fig. 10. Measured PA output spectra with the proposed Type 3 P-DPD with M = 4 channels using MP- and GMP-based models in Experiment 2 (HCM1114) at an output power of +32.7 dBm. The corresponding PA output spectrum without DPD is also shown for reference.

F. Comparison to Other DSM-Based DPD Works
Finally, a short comparison against other published DSM-based DPD works is provided in Table V, considering the proposed P-DPD Type 3 and the results from Experiment 1. We compare the considered bandwidth, number of channels, number of bits, order of the DSM, clock speed, and achieved SNDR. Considering the supported high bandwidth and the excellent linearity performance achieved, the proposed concept clearly outperforms the existing DSM-based DPD solutions. This makes the proposed DPD methods appealing for future wideband radio transmitters, where the growing bandwidth requirements present an important challenge.

V. FURTHER DISCUSSION RELATED TO IMPLEMENTATION BENEFITS AND CHALLENGES
The proposed P-DPD concept offers important advantages in implementation terms. Firstly, the proposed architectures support the use of fast finite-resolution DACs due to the involved noise shaping. In general, several new communication systems are tending towards using low- and medium-resolution DACs to avoid the complexity and high power consumption of high-resolution DACs, which grow exponentially with the number of bits. The proposed architectures can be used together with a reduction of the length of the digital words applied to the DAC input. More specifically, besides being predistorted, the oversampled signal is re-quantized to a shorter word length equal to the number of bits of the DAC, and the quantization error resulting from this operation is spectrally shaped.
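The word-length reduction with spectral shaping described above can be illustrated with a minimal sketch. It assumes a plain first-order error-feedback requantizer acting on a real-valued signal in [-1, 1), without clipping and without any PA model in the loop; the function name and test signal are hypothetical, and this is not the P-DPD structure itself.

```python
import math

def requantize_noise_shaped(samples, bits):
    """Requantize samples in [-1, 1) to `bits` bits with first-order error feedback."""
    step = 2.0 / (2 ** bits)          # quantizer step for a [-1, 1) full scale (assumption)
    carried_error = 0.0               # quantization error made on the previous sample
    out = []
    for x in samples:
        u = x + carried_error         # add back the error made one sample ago
        y = step * round(u / step)    # uniform mid-tread quantizer, clipping not modeled
        carried_error = u - y         # error carried into the next sample
        out.append(y)
    return out

# Hypothetical oversampled test tone (OSR of 12 chosen to mirror the experiments).
osr = 12
x = [0.5 * math.sin(2 * math.pi * n / (20 * osr)) for n in range(2400)]
y = requantize_noise_shaped(x, bits=8)

# Each output obeys y[n] = x[n] + e[n-1] - e[n], so the error is first-order
# high-pass shaped and its running sum never exceeds half a quantizer step.
accumulated_error = sum(b - a for a, b in zip(x, y))
```

Because the accumulated error stays bounded, the low-frequency (in-band) part of the quantization error is suppressed, while the error power is pushed towards higher frequencies, where oversampling leaves room for it.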
Another advantage is that the presented P-DPD system applies the forward model of the PA, which eliminates the problems associated with explicit reverse modeling, such as stability and/or convergence issues. Therefore, the approach leads to reduced implementation, control, and configuration complexity. Moreover, the use of dynamic models in the DSM's feedback path enables the linearization of wideband signals, since memory effects can be compensated.
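To make the role of the forward model concrete, the following is a minimal single-channel sketch of a first-order loop with a PA model in its feedback path. It is a simplification under stated assumptions: the signal is real-valued rather than complex baseband, and `pa_forward_model` is a hypothetical memoryless cubic, whereas the measurements in this article use MP and GMP models with memory. It should not be read as the exact P-DPD structure.

```python
def pa_forward_model(y):
    """Hypothetical memoryless compressive PA model with unit small-signal gain."""
    return y - 0.2 * y ** 3

def dsm_dpd(samples, bits=8):
    """First-order DSM whose feedback path contains the PA forward model."""
    step = 2.0 / (2 ** bits)                     # quantizer step (assumed [-1, 1) full scale)
    integrator = 0.0                             # state of the delayed integrator
    predistorted = []
    for x in samples:
        y = step * round(integrator / step)      # finite-resolution DPD output sample
        predistorted.append(y)
        # Accumulate the error between the target and the modeled PA output.
        integrator += x - pa_forward_model(y)
    return predistorted

# Constant target level: the loop settles so that the modeled PA output
# matches the target on average, without ever inverting the PA model.
target = [0.5] * 2000
dpd_out = dsm_dpd(target)
modeled_pa_out = [pa_forward_model(v) for v in dpd_out]
mean_pa_out = sum(modeled_pa_out) / len(modeled_pa_out)
mean_dpd_out = sum(dpd_out) / len(dpd_out)
```

In this sketch the predistorted samples settle slightly above the target level, which pre-compensates the compression of the assumed model. Only forward evaluations of the model are needed, which is the property the paragraph above relies on.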
The performance of DSM-based transmitters depends, in general, most strongly on the effective oversampling factor. The proposed P-DPD alleviates the associated costs through parallelism: the physical sampling frequency per channel is significantly reduced, so high accuracy is obtained at reduced cost. This makes it possible to exploit the benefits of DSM-based DPDs in wideband applications while benefiting from relaxed processing speed requirements. Furthermore, the P-DPD allows improving the linearization performance without increasing the order of the modulator, and therefore offers robustness against instability issues caused by the gain variation and phase distortion in the nonlinear characteristics of the PA. This applies to all three P-DPD variants, all of which can employ first-order feedback loops. Introducing the PA nonlinearity into such first-order loops is therefore not problematic, as the dynamics of the P-DPD system remain simple.
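The reduction in per-channel sampling frequency can be checked with a short calculation. The sketch assumes that the effective sampling rate equals the OSR times the modulation bandwidth and that it is divided evenly across the M parallel channels, which is consistent with the 150 MHz branch clock rate quoted for the 100 MHz experiments; the function name is hypothetical.

```python
def branch_clock_rate_hz(bandwidth_hz, osr, num_channels):
    """Per-channel clock rate for a given effective OSR and channel count."""
    effective_rate_hz = osr * bandwidth_hz      # rate a single-channel DSM would need
    return effective_rate_hz / num_channels     # rate of each parallel branch

# 100 MHz modulation bandwidth and OSR of 12, as in the reported experiments.
for m in (1, 2, 4, 8):
    rate_mhz = branch_clock_rate_hz(100e6, 12, m) / 1e6
    print(f"M = {m}: {rate_mhz:.0f} MHz per channel")
```

Under these assumptions the single-channel case requires 1200 MHz, while M = 8 brings each branch down to 150 MHz at the same effective oversampling.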
It is also fair to state that the proposed P-DPD system achieves reduced sampling rates by increasing the number of parallel channels. Consequently, the numbers of, for example, delays and adders, and potentially also the number of PA forward model replicas, are increased. For example, a conventional 1-channel DSM-based DPD has one two-input adder, one explicit delayed integrator, a quantizer, and one PA forward model. On the other hand, in addition to the input and output multiplexers, a Type 3 P-DPD with two channels has two dual-input adders, at least two delay elements, four internal dual-input adders, two explicit integrators, two quantizers, two cross-connections, and two parallel PA forward models.
It is also worth mentioning that, in order to reconstruct the data precisely, the recombination of the samples must be accurately time-aligned while operating at the original sampling speed. Since this data reconstruction requires no complex operations, containing essentially only delays, subsamplers, and adders, it can be expected to be straightforward and feasible. Another important point is that although we utilized and demonstrated up to 8 channels in this work, the concept is flexible, allowing the designer to choose a higher or lower number of channels depending on the targeted sampling frequency and the available resources. Even in the basic case with two channels, the proposed approach enables the DPD function to operate at half the original sampling rate, which is still technically very beneficial in terms of complexity and costs.
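The time-interleaved splitting and recombination of samples can be sketched as follows. This is an illustration of the multiplexing only, with hypothetical function names: the channels are treated as independent sample streams, whereas in the actual P-DPD variants the branches are cross-coupled and each contains its own processing.

```python
def split_into_channels(samples, num_channels):
    """Time-interleaved split: channel k receives samples k, k + M, k + 2M, ..."""
    return [samples[k::num_channels] for k in range(num_channels)]

def recombine_channels(channels):
    """Recombine at the full rate: upsample each channel by M, delay by k, and add."""
    m = len(channels)
    total = sum(len(channel) for channel in channels)
    out = [0.0] * total
    for k, channel in enumerate(channels):
        for n, value in enumerate(channel):
            out[n * m + k] += value   # sample n of channel k lands at full-rate index n*M + k
    return out

full_rate = [float(n) for n in range(10)]
branches = split_into_channels(full_rate, 4)   # each branch runs at 1/4 of the rate
restored = recombine_channels(branches)
```

If the channel delays are misaligned by even one sample, the recombined sequence is permuted, which is why accurate timing alignment at the original sampling speed matters.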
Ultimately, the final assessment of the true implementation benefits, and of the related tradeoffs between linearization performance, power consumption, and silicon area, is subject to actual integrated circuit or FPGA implementations, which constitute an important topic for future work in this area.

VI. CONCLUSION
This paper proposed a new concept for wideband power amplifier linearization through a parallelized DSM-based DPD structure. Three implementation variants of the new P-DPD architecture were developed and described, which relax the processing and clock rate requirements while still facilitating wide linearization bandwidths. This is achieved by making a DSM behave as a nonlinearity inverter by embedding the PA forward model in its feedback loop and by using multi-rate filtering principles to relax the involved sampling rate requirements. The proposed P-DPD architectures were assessed and validated through extensive RF measurements at the 3.6 GHz band, utilizing 5G NR OFDM waveforms with 100 MHz modulation bandwidth and three different types of PA systems. The results show that the proposed P-DPDs were able to compensate for the dynamic nonlinear behavior of different types of PAs, and that the three proposed architecture variants perform equivalently from the linearization performance point of view.

APPENDIX
COMPLETE TIME-DOMAIN CHARACTERIZATION FOR M = 2 CHANNELS
For completeness, the full time-domain characterization of the proposed P-DPD concept is provided here in terms of difference equations. For simplicity, the 2-channel Type 3 P-DPD is considered. To this end, with reference to Fig. 3, we can first write