A Time-Domain Multi-Tone Distortion Model for Effective Design of High Power Amplifiers

This paper proposes a new time-domain multi-tone distortion (TD-MTD) model suitable for accurately predicting the non-linear behavior of packaged high power radio frequency (RF) transistors over a range of discrete non-uniformly distributed frequencies. This proposed TD-MTD model uses a single expression rather than multiple distinct frequency specific behavioral models to describe the underlying behavior of the high power RF transistor at multiple fundamental frequencies. Furthermore its extraction is carried out using a time-domain representation of the travelling waves that can be acquired using a generic vector load-pull characterization system and without imposing additional requirements. The proposed model is extracted as an artificial neural network (ANN) and is implemented as a Netlist to serve in a harmonic balance simulator based power amplifier design process. The proposed model is validated in two phases. First, its ability to reproduce the large-signal behavior of a high power RF LDMOS transistor was demonstrated in simulation. Then, the TD-MTD model was used to validate the design of a high power two-way asymmetric Doherty power amplifier and the simulated output-power-dependent power efficiency, AM/AM, AM/PM and input return loss characteristics were compared to those obtained in measurement. The excellent agreement between the simulation and measurement results confirms the usefulness of the proposed model despite the simplicity of its extraction routine and measurement data.


I. INTRODUCTION
The ever-increasing demand for broadband mobile communication services has been a driving factor for the ubiquitous deployment of macro base stations worldwide. These base stations are required to operate over a wide range of radio frequencies (RF) and to efficiently broadcast high power levels in order to minimize their environmental impact. Given that the RF high power amplifiers (HPA) dominate the power consumption of the mobile broadband communication base stations, significant research and development endeavors have recently focused on the investigation of effective methodologies for designing high efficiency HPAs using advanced packaged RF transistor technologies and circuit topologies. These endeavors are increasingly resorting to electronic computer-aided design (ECAD) tools to cope with the unprecedented HPA design challenges, especially as the designs are expected to meet strict RF performance over a wide range of frequencies. Furthermore, they increasingly call for accurate large-signal models capable of mimicking the non-linear behavior of high power transistors. In the literature one can distinguish two types of large-signal models that can be used in an ECAD tool to support the design of HPAs, namely compact and behavioral models.
The associate editor coordinating the review of this manuscript and approving it for publication was Rocco Giofré.
Compact models [1], [2] use a set of physically inspired and specially arranged circuit elements to reproduce the small and large signal behavior of RF transistors in a simulation environment. The choice of the compact model topology is non-trivial and requires an intimate knowledge of the construction of power transistors and the underlying physical phenomena involved in their operation. Furthermore, the extraction of its parameters calls for specialized characterization systems with advanced measurement capabilities. Compact models have recently been widely used to automate the design of high efficiency low to medium power amplifiers. Furthermore, excellent agreement between simulated and measured large signal RF performance metrics such as efficiency, output power and gain has been reported [3]-[6]. However, as RF transistors scale up to high powers, these models have shown lower accuracy in predicting back-off power efficiency for Doherty HPA designs [7], [8]. Often the prediction accuracy of an initially extracted compact model is unacceptable and post-tuning of the model parameters based on load-pull measurement data is required to improve it. For very high power RF transistors, HPA designers often resort to design methodologies that rely completely on load-pull measurement data. These design methodologies exploit the load-pull system as a design framework that allows the full exploration of the design space by determining the optimum load conditions needed to achieve the target RF metrics. While these methodologies were found useful when designing single transistor based HPAs, they fall short in handling the complexity associated with the design of advanced HPA topologies that include more than one transistor, such as Doherty amplifiers and load modulated balanced amplifiers. This motivated the attempts to use behavioral models in the design of advanced HPA topologies [9]-[13].
Behavioral models mimic the measured behavior of RF transistors and other non-linear devices explicitly, since they are extracted directly from measurements and the device is merely represented as a black box that responds to stimuli at its ports. This makes behavioral models technology-agnostic, as they are strictly mathematical models of the behavior of the device that assume nothing about its underlying physics and dynamics. For this reason there is virtually no difference in how a behavioral model would model an LDMOS power transistor compared to a GaN power transistor or any other future RF power transistor technology that might be developed. In the literature, the Poly-Harmonic Distortion (PHD) models [14] have been developed to describe the non-linear behavior of RF transistors in the frequency domain. Given their formulation and extraction complexity, efforts were made to derive simplified versions of the PHD models, including S-functions [15] and X-parameters [16]. These models apply the harmonic superposition principle as a first order approximation of the transistor non-linear behavior around a large signal operating point (LSOP). Consequently, modeling the RF transistor behavior over a wide range of LSOP conditions (input power levels and load impedances) requires a set of X-parameters that are specifically extracted at each LSOP condition. Alternative PHD model implementations resorted to higher-order polynomials [17]-[20] or rational functions (Padé fractional forms [21]) in order to improve their modeling capacity.
It is worth noting that the PHD models use dedicated functions for each harmonic frequency to describe the behavior of the high power transistor, even though a common underlying non-linearity produces the behavior observed at all of those harmonic frequencies. This property was exploited in [22], where the authors proposed a single time-domain expression to describe the poly-harmonic distortions of RF power transistors. The extraction of the resulting time-domain poly-harmonic distortion (TD-PHD) model required a small number of representative multi-harmonic load-pull measurements to capture the non-linear behavior. It was also successfully integrated into ECAD tools and exploited to design single-ended harmonically tuned PAs. Nevertheless, the TD-PHD model was fixed to a single fundamental frequency, requiring a wideband HPA designer to extract a completely separate model to fit the load-pull measurements at each fundamental frequency, even though, as before, a common underlying non-linearity produces the behavior observed at all the individual fundamental frequencies.
In this paper, the previously proposed TD-PHD framework is generalized such that a single time-domain model can fit load-pull measurements that were performed over a non-uniformly spaced discrete frequency grid. The proposed model is a discrete-time model that is tuned to act on a discrete set of frequencies. As a result, the model will not be able to predict the performance of the power transistor at frequencies that are not in the discrete frequency set of the model. On the other hand, the discrete frequencies can be chosen such that they span a significant bandwidth over which load-pull measurements have been performed.
To validate this generalized modeling framework, the large-signal one-tone simulation of a two-way Doherty HPA with packaged LDMOS power transistors is investigated and compared against the measurements of the HPA and against a simulation of the HPA using a compact transistor model, over the discrete set of frequencies at which the model was extracted from load-pull measurements. Unlike other attempts at Doherty HPA design via behavioral models, load-pull measurements and compact models [8], [10], [23], [24], this paper proposes a generalized framework that produces a single time-invariant behavioral model, defined in the time domain, that approximates all the measured load-pull data at a set of discrete non-uniformly spaced fundamental frequencies.

II. PROPOSED TIME-DOMAIN MULTI-TONE DISTORTION MODEL FOR HIGH POWER TRANSISTORS
A. DISCRETE PROJECTION OF A CONTINUOUS-TIME VOLTERRA SERIES ON A FIXED FREQUENCY GRID
If the power transistor is assumed to be an equi-continuous and uniformly bounded non-linear time-invariant system, then according to the Arzelà-Ascoli theorem and Fréchet's approximation theorem [25], its behavior around a quiescent bias can be approximated uniformly to an arbitrary degree of precision by a sufficiently high order Volterra series [26]-[28]. For simplicity of representation, a one-port system is used in the following analysis; the arguments are generalized to two-port systems at the end of this Section. Suppose a power-wave scattering model is used for behavioral modeling, where the input signal is the incident power wave $a(t)$ and the output signal is the reflected power wave $b(t)$. The time-domain continuous Volterra series gives the response $b(t)$ of the non-linear time-invariant system for a known input signal $a(t)$. The Volterra series represents the output signal as a series sum of multi-dimensional convolutions of the input signal with the kernel functions $h_n$, where $n$ is the polynomial order of the kernel function [29]:
$$b(t) = \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\tau_1, \cdots, \tau_n) \prod_{j=1}^{n} a(t - \tau_j) \, d\tau_j \tag{1}$$
The real-valued kernel functions $h_n(\tau_1, \cdots, \tau_n)$ represent the underlying dynamic non-linearity of the power transistor, including its short and long term memory effects. Assume that a fixed set of time-domain continuous kernel functions provides a frequency-independent description of the non-linear time-invariant response of the power transistor for all potential continuous input signals $a(t)$. In order to create a behavioral model that targets signals that are fixed to a certain discrete frequency grid, a projection of the output of the frequency-independent infinite series expression of (1) can be found for the subset of all possible signals that lie on a specific fixed frequency grid.
In order to theoretically compute the instantaneous output $b(t)$ of the Volterra series expression of (1), the circuit simulator requires complete knowledge of the input signal at all instants in time and not just at time $t$, even though the circuit simulator expects an instantaneous (that is, available at time $t$) time-domain expression for the non-linearity. To emulate the instantaneous-time computation of the Volterra series, the information about the past values of the input signal needs to be made available to the circuit simulator at every instant. This can be achieved through auxiliary signals in the model Netlist that reveal the history of the input signal to the circuit simulator, allowing an emulated computation of the instantaneous output based only on the instantaneous values of these auxiliary signals.
The TD-PHD model [22] is such a projection of the continuous-time Volterra series kernels onto a discrete frequency grid of frequencies that are multiples of a single fundamental frequency $f_1$. Although $f_1$ is not the only frequency that divides all the frequencies of the frequency grid, it is the largest frequency that does so. Any integer division of $f_1$ (e.g. $f_1/2$ or $f_1/3$ and so on) could also have been used as the common fundamental frequency ($f_{CF}$) of the model. For the TD-PHD model as proposed in [22], the $f_{CF}$ parameter was explicitly fixed to $f_1$, the fundamental frequency of the multi-harmonic load-pull measurement set, as shown in Fig. 1. This $f_{CF}$ parameter is the key to the generalization of the TD-PHD modeling framework proposed in this paper. The TD-PHD model uses a set of auxiliary signals that are delayed versions of the input signal, evenly spaced in time to span the fundamental period ($T = 1/f_1$) of the multi-harmonic load-pull measurement dataset. Let the auxiliary signals $x_i(t)$ be $N$ time-delayed versions of $a(t)$ spanning its period $T$:
$$x_i(t) = a\left(t - (i-1)\frac{T}{N}\right), \quad i = 1, \cdots, N \tag{2}$$
By making the signals $x_i(t)$ available in the circuit simulator, the value of $a(t)$ at any time can be theoretically evaluated and made available for the computation of the right-hand side of (1). A discrete projection of the Volterra series can therefore be found with these auxiliary signals as its constituting basis. Since the choice of auxiliary signals in the TD-PHD model is fixed to the period of a single fundamental frequency, a single TD-PHD model cannot be used to represent the behavior of a non-linear power transistor over multiple non-uniformly spaced frequencies that were captured during a load-pull measurement.
To overcome this limitation, a generalization of the time-spacing of the auxiliary signals is proposed, resulting in a new generalized discrete projection that allows a single behavioral model, defined in the time domain, to be extracted for a non-uniformly spaced frequency grid. The models using this generalized framework are referred to as time-domain multi-tone distortion (TD-MTD) models. TD-MTD models are a generalization of TD-PHD models in the sense that a TD-PHD model is a TD-MTD model where $f_{CF}$ is fixed to a single fundamental frequency.
In order to allow for an instantaneous-time computation of the Volterra series at time $t_1$, there must exist a time-invariant interpolation function that reveals the value of the input signal $a(t)$ at another time $t_2$ from only the evaluation of the auxiliary signals $x_i(t)$ at time $t_1$ and the knowledge of the time offset between these two times, $\Delta t = t_2 - t_1$. That is, the condition required to approximate the Volterra series instantaneously and to provide a projection is the existence of a smooth continuous interpolation function $f_{interp}$:
$$a(t_2) = f_{interp}\big(x_1(t_1), \cdots, x_N(t_1), \Delta t\big) \tag{3}$$
If the function $f_{interp}$ exists, then the continuous-time Volterra series expression collapses onto the discrete projection:
$$b(t) = g\big(x_1(t), \cdots, x_N(t)\big) \tag{4}$$
where the continuous simulation-time-independent static function $g$ approximates the operations on the right-hand side of (1). In fact, the multi-variable non-linear function $g$ in (4) approximates the output of the Volterra series expression of (1), where the time-dependent operand $a(t)$ is replaced with a set of simulation-time-independent operands $x_i(t)$ by taking advantage of the interpolation function. This can be achieved without having an explicit expression for $f_{interp}$ by adopting a multi-variable polynomial expression for $g$. Alternatively, in this paper an artificial neural network was used to fit the multi-variate function $g$, as it is an effective fitting tool for continuous multi-variate functions and it avoids the numerical instability problems that arise when a high non-linear polynomial model order is required.

B. FORMULATION OF THE MULTI-TONE DISTORTION MODEL
In the previous Section, a multi-variable non-linear function $g$ was introduced to approximate the Volterra series when modeling the behavior of high power transistors over multiple non-uniformly spaced frequencies. In this Section, the auxiliary signals $x_i(t)$ that were used as operands in the function $g$ are defined such that they allow for the extraction of the parameters of the multi-variable function $g$ based on load-pull characterization data spanning multiple fundamental frequencies.
Since typically these frequencies are all integer multiples of a common frequency $f_{CF}$, a common period exists for the time-domain representations of the load-pull measurement data. In Fig. 2 the location of the spectral content of load-pull measurements is shown for multiple fundamental frequencies. This common period is longer than the period of any individual fundamental frequency in the load-pull characterization data and contains more periods of the higher frequency characterization data and fewer periods of the lower frequency data. A uniform sampling of the input signals over this much longer common period can reveal the signals that are on a non-uniformly spaced frequency grid through the Non-Uniform Discrete Fourier Transform Type I (NUDFT-I) [30], [31].
Suppose the auxiliary signals $x_i(t)$ of (2) are defined for $T = 1/f_{CF}$ and are evaluated at time $t = 0$:
$$x_i(0) = a\left(-(i-1)\frac{T}{N}\right), \quad i = 1, \cdots, N \tag{5}$$
The set of discrete samples $x_i(0)$ can be used to generate a Fourier series representation of the input signal $a(t)$. These Fourier series coefficients are obtained from the NUDFT-I as follows:
$$A_k(0) = \frac{1}{N} \sum_{i=1}^{N} x_i(0) \, e^{j \omega_k (i-1) T / N} \tag{6}$$
The Fourier series coefficients $A_k(0)$ are obtained when the functions $x_i(t)$ are evaluated at $t = 0$. Using the time-shifting property, the Fourier series coefficients at an arbitrary simulation-time can be obtained by:
$$A_k(t) = A_k(0) \, e^{j \omega_k t} \tag{7}$$
where $\omega_k$ is the angular frequency of each of the Fourier series coefficients $A_k$.
Comparing the auxiliary signals of (5) and the elements of the Fourier series expression of (6), it can be noted that the Fourier series coefficient $A_k(0)$ is a function of the delayed input signals $x_i(0)$, written here as $F_k$:
$$A_k(0) = F_k\big(x_1(0), \cdots, x_N(0)\big) \tag{8}$$
By the time-shift property for periodic signals, the Fourier series coefficient at simulation-time $t_1$ can be determined as:
$$A_k(t_1) = F_k\big(x_1(t_1), \cdots, x_N(t_1)\big) \tag{9}$$
The time-domain input signal at another arbitrary simulation-time $t_2$ can be expressed as the following Fourier series expression:
$$a(t_2) = \sum_{k} A_k(0) \, e^{j \omega_k t_2} \tag{10}$$
Using the time-shift property of (7), the expression of (10) can be re-written using the time-varying Fourier series coefficients $A_k(t_1)$:
$$a(t_2) = \sum_{k} A_k(t_1) \, e^{j \omega_k \Delta t} \tag{11}$$
This expression interpolates the input signal at an arbitrary simulation-time $t_2$ based on the auxiliary signals $x_i(t)$ evaluated at another arbitrary simulation-time $t_1$ and the time offset between these two simulation-times, $\Delta t = t_2 - t_1$. Thus (3) holds for this choice of auxiliary signals $x_i(t)$ and, from (4), the output at an arbitrary time $t$ can be expressed using a simulation-time-independent function of the auxiliary signals evaluated at the same simulation-time $t$:
$$b(t) = g\big(x_1(t), \cdots, x_N(t)\big) \tag{12}$$
Now that the auxiliary signals needed to model the behavior of a non-linear one-port system over a non-uniformly spaced set of frequencies have been determined, the expression of (12) can be generalized to two-port systems, which is the form of the model used for power transistors:
$$b_1(t) = g_1\big(x_{1,1}(t), \cdots, x_{1,N}(t), x_{2,1}(t), \cdots, x_{2,N}(t)\big) \tag{13}$$
$$b_2(t) = g_2\big(x_{1,1}(t), \cdots, x_{1,N}(t), x_{2,1}(t), \cdots, x_{2,N}(t)\big) \tag{14}$$
where $N$ is the time resolution of the model and $x_{1,k}(t)$ and $x_{2,k}(t)$ are the auxiliary signals based on the two input signals $a_1(t)$ and $a_2(t)$. In this generalization from one-port to two-port systems, the complete dependence of each of the output signals ($b_1(t)$ or $b_2(t)$) on both of the input signals ($a_1(t)$ and $a_2(t)$) is made explicit.
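The interpolation argument behind (3) and (11) can be checked numerically. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: it takes $f_{CF}$ = 5 MHz and N = 17 from Section III-A, assumes a two-sided frequency grid containing DC and the three fundamentals, and uses a hypothetical test tone. The names `aux_signals`, `fourier_coeffs`, `interpolate` and `tone` are illustrative. The sketch relies on the grid indices being distinct modulo N, which holds for these values.

```python
import cmath
import math

F_CF = 5e6               # common fundamental frequency in Hz (value from Section III-A)
N = 17                   # model time resolution (value from Section III-A)
T_D = 1.0 / (F_CF * N)   # spacing between adjacent auxiliary signals

# Assumed two-sided frequency grid as integer multiples of F_CF:
# DC plus +/-790 MHz, +/-805 MHz and +/-820 MHz.
GRID = [0, 158, -158, 161, -161, 164, -164]


def aux_signals(a, t):
    """Return [x_1(t), ..., x_N(t)], where x_i(t) = a(t - (i - 1) * T_D)."""
    return [a(t - i * T_D) for i in range(N)]


def fourier_coeffs(x):
    """Estimate the time-shifted Fourier coefficients A_k(t) from the N samples."""
    return {
        m: sum(xi * cmath.exp(2j * math.pi * m * i / N) for i, xi in enumerate(x)) / N
        for m in GRID
    }


def interpolate(x, dt):
    """Estimate a(t + dt) using only the auxiliary signals evaluated at time t."""
    coeffs = fourier_coeffs(x)
    total = sum(A * cmath.exp(2j * math.pi * m * F_CF * dt) for m, A in coeffs.items())
    return total.real


def tone(t):
    """Hypothetical test input: a DC offset plus one 805 MHz tone."""
    return 0.5 + 2.0 * math.cos(2 * math.pi * 805e6 * t + 0.3)


t1, dt = 1.0e-9, 0.37e-9
estimate = interpolate(aux_signals(tone, t1), dt)
exact = tone(t1 + dt)
```

Here `estimate` is computed only from the 17 auxiliary values at `t1`, yet it should agree with `exact` to numerical precision, which is the property that lets a static function $g$ stand in for the Volterra series on this grid.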

III. EXTRACTION AND VALIDATION OF THE PROPOSED TD-MTD MODEL AND ITS IMPLEMENTATION IN A HARMONIC BALANCE SIMULATOR
In Section III-A the procedure to extract a TD-MTD model from load-pull measurements of a high power transistor is outlined and its implementation in a harmonic balance simulator is described. Any non-linear load-pull-based behavioral model should at least reproduce the load-pull data that was used to extract it. In Section III-A, the extracted model is therefore placed in a simulated load-pull testbench and the load-pull contours obtained from the simulation are compared to the raw measurements used to extract the behavioral model. To demonstrate the modeling capabilities of behavioral models of power transistors in the practical design of power amplifiers, in Section III-B a two-way Doherty power amplifier is simulated based on two behavioral models extracted from the main and peaking power transistors respectively.
It should be noted that the signals used during the load-pull measurement and the validation of the two-way Doherty PA are all narrowband pulsed-RF signals, even though the Doherty PA is designed to amplify wider bandwidth modulated signals with a high Peak to Average Power Ratio (PAPR). Even though narrowband characterization of the transistor does not completely capture all the dynamics of the power transistor, a necessary but not sufficient requirement of wideband Doherty PA design is that it at least meets the narrowband RF performance requirements across the design band of interest. This means that an extracted narrowband model that is correct across the band can be used to tune the performance of the wideband PA across the band. On the other hand, the extracted TD-MTD model is only able to model the large-signal narrowband performance across the band and cannot be used to simulate modulated-signal metrics such as the Adjacent Channel Power Ratio (ACPR).

A. EXTRACTION OF THE PROPOSED TD-MTD MODEL FROM LOAD-PULL MEASUREMENTS AND ITS IMPLEMENTATION IN A HARMONIC BALANCE SIMULATOR
To showcase the ability of the proposed TD-MTD model to fit load-pull measurement data spanning multiple fundamental frequencies, a set of fundamental frequency load-pull measurements at three frequencies (790 MHz, 805 MHz and 820 MHz), which have a common fundamental frequency $f_{CF}$ of 5 MHz, was performed on both the main and peaking power transistors of the NXP A2V09H525 packaged high power LDMOS device (525 W peak power), which is intended for an asymmetrical two-way Doherty HPA design. A dual-device load-pull fixture was designed for the NXP A2V09H525 device as shown in Fig. 3. The fixture parameters were extracted from a custom built Thru-Reflect-Line calibration kit and used to de-embed vector-corrected passive load-pull measurements to the package plane of the transistor devices in a setup similar to the block diagram of Fig. 4. A set of load-pull measurements was obtained that includes DC drain current measurements and pulsed RF waveform measurements at the fundamental frequency at the input and output of the power transistors. Pulsed RF measurements with a 10% duty cycle are performed on high power transistors during load-pull, since a non-pulsed RF signal at the peak power of the power transistor would excessively heat up the device. The load-pull measurement sweep involved setting a passive load tuner to different fundamental load impedances at each of the frequencies and performing a pulsed-RF power sweep at each of the tuner positions. These power sweeps were bound at the upper end by a maximum gain compression of 5.5 dB for the main device and 3.5 dB for the peaking device. Since the compression of the power transistor is highly dependent on the load impedance, the input power ranges in the measurement data vary with impedance and frequency. The load-pull measurements were performed over a range of impedances that covered the high power and high efficiency operations of the transistor.
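The common fundamental frequency quoted above can be reproduced as the greatest common divisor of the measured fundamentals. The short Python sketch below is illustrative only; the helper name `common_fundamental_hz` is an assumption, and the frequencies are expressed in integer hertz so that an exact integer GCD can be used.

```python
from functools import reduce
from math import gcd


def common_fundamental_hz(freqs_hz):
    """Largest frequency (in integer Hz) that divides every listed frequency."""
    return reduce(gcd, freqs_hz)


fundamentals_hz = [790_000_000, 805_000_000, 820_000_000]
f_cf_hz = common_fundamental_hz(fundamentals_hz)

# Position of each fundamental on the f_CF grid, and the common period in microseconds.
grid_indices = [f // f_cf_hz for f in fundamentals_hz]
t_cf_us = 1e6 / f_cf_hz
```

For these three frequencies the fundamentals sit at multiples 158, 161 and 164 of the 5 MHz common frequency, so one common period of 0.2 µs contains a whole number of cycles of each tone.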
The drain model of (14) models the non-linear output power generation of the power transistor, while the gate model of (13) models its non-linear input impedance. This behavioral model is implemented as a Netlist for a harmonic balance simulator in the form shown in Fig. 5. To generate the auxiliary signals $x_{1,i}$ and $x_{2,i}$ in a harmonic balance simulation Netlist, the time-delays can be implemented as a frequency-defined block in the Netlist that applies a frequency-proportional phase shift to each frequency component of the input signals. A train of fractional-period time-delay blocks, each creating a delay of $t_d = T_{CF}/N$, in front of the $a_1$ and $a_2$ signals reveals all the auxiliary signals $x_{1,i}$ and $x_{2,i}$ to the time-domain simulation-time-independent non-linear functions $g_1$ and $g_2$. This allows the harmonic balance simulator to compute the time-domain output signals $b_1(t)$ and $b_2(t)$.
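The equivalence between a time delay and a frequency-proportional phase shift can be illustrated outside the simulator. The Python sketch below is an assumption-laden illustration rather than the Netlist itself: it represents one tone by a hypothetical complex phasor and builds the delay chain by repeatedly multiplying by $e^{-j 2 \pi f t_d}$. The names `delay_chain_phasors` and `time_value` are illustrative.

```python
import cmath
import math

N = 17                 # model time resolution (value from the text)
T_D = 0.2e-6 / N       # fractional-period delay t_d = T_CF / N


def delay_chain_phasors(phasor, freq_hz, t_d, n_taps):
    """Phasors of x_1..x_N: tap i is the input delayed by (i - 1) * t_d."""
    step = cmath.exp(-2j * math.pi * freq_hz * t_d)  # phase shift of one delay block
    taps = [phasor]
    for _ in range(n_taps - 1):
        taps.append(taps[-1] * step)
    return taps


def time_value(phasor, freq_hz, t):
    """Instantaneous value of the tone represented by a phasor."""
    return (phasor * cmath.exp(2j * math.pi * freq_hz * t)).real


a1_phasor = cmath.rect(2.0, 0.3)   # hypothetical 790 MHz incident wave
taps = delay_chain_phasors(a1_phasor, 790e6, T_D, N)

# Tap 4 should equal the undelayed tone evaluated 3 * t_d earlier.
t = 0.4e-9
delayed = time_value(taps[3], 790e6, t)
reference = time_value(a1_phasor, 790e6, t - 3 * T_D)
```

Because 790 MHz is an integer multiple of $f_{CF}$, one more delay after the last tap brings the phasor back to its starting value, which reflects the periodicity of the auxiliary signals over $T_{CF}$.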
To extract the proposed model, the load-pull measurement dataset needs to be converted from the frequency domain to the time domain by means of a Fourier series evaluation of the DC and fundamental frequencies:
$$\begin{aligned} a_1(t) &= V_{gate,DC} + |A_{11}| \cos(\omega_f t + \angle A_{11}) \\ a_2(t) &= V_{drain,DC} + |A_{21}| \cos(\omega_f t + \angle A_{21}) \\ b_1(t) &= I_{gate,DC} + |B_{11}| \cos(\omega_f t + \angle B_{11}) \\ b_2(t) &= I_{drain,DC} + |B_{21}| \cos(\omega_f t + \angle B_{21}) \end{aligned} \tag{15}$$
If the model time resolution $N$ is too low, the model fitting algorithm will have difficulty in finding a good fit to the data. A good model fitting threshold is a Normalized Mean Squared Error (NMSE) of better than −30 dB for the time-domain measurement dataset. The NMSE for a parameter $y$ that is modeled with the values $y_{model,i}$ and measured with the values $y_{meas,i}$ over the measurement dataset is defined as follows:
$$\mathrm{NMSE}_y = 10 \log_{10} \left( \frac{\sum_i \left| y_{meas,i} - y_{model,i} \right|^2}{\sum_i \left| y_{meas,i} \right|^2} \right) \tag{16}$$
For this load-pull measurement dataset, a model time resolution of $N = 17$ was found to give better than −30 dB NMSE for each of the time-domain parameters $b_1(t)$ and $b_2(t)$ with the chosen fitting function for the TD-MTD non-linear functions $g_1$ and $g_2$. Since the $f_{CF}$ of this measurement set is 5 MHz, the common fundamental period is $T = 1/f_{CF} = 0.2$ µs. This makes the sampling time delay $t_d = T/N = (0.2/17)$ µs. The discrete set of functions $x_{1,1}(t)$ through $x_{1,17}(t)$ and $x_{2,1}(t)$ through $x_{2,17}(t)$ denote the time-domain delayed incident waves at the input port, $a_1(t)$, and the output port, $a_2(t)$, respectively and follow the definition outlined in (2).
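The conversion to the time domain and the NMSE fitting threshold can be sketched as follows. This Python example is a hedged illustration: the NMSE is assumed to take the common form of error energy divided by measured energy, expressed in dB, and the DC value, phasor and the 1% scaling error are hypothetical numbers chosen only to exercise the functions `waveform` and `nmse_db`.

```python
import cmath
import math


def waveform(dc, phasor, freq_hz, t):
    """DC term plus one fundamental tone with the phasor's magnitude and phase."""
    return dc + abs(phasor) * math.cos(2 * math.pi * freq_hz * t + cmath.phase(phasor))


def nmse_db(measured, modeled):
    """Assumed NMSE definition: error energy over measured energy, in dB."""
    err = sum((m - p) ** 2 for m, p in zip(measured, modeled))
    ref = sum(m ** 2 for m in measured)
    return 10 * math.log10(err / ref)


N = 17
T_D = 0.2e-6 / N                  # sampling time delay t_d = T / N
B21 = cmath.rect(3.0, -0.8)       # hypothetical fundamental phasor of b_2
I_DRAIN_DC = 0.1                  # hypothetical DC term

# One frequency-domain measurement becomes N time-domain samples.
measured = [waveform(I_DRAIN_DC, B21, 790e6, n * T_D) for n in range(N)]

# Hypothetical model output with a uniform 1% amplitude error.
modeled = [1.01 * s for s in measured]
fit_db = nmse_db(measured, modeled)
```

Under this definition a uniform 1% error corresponds to an NMSE of about −40 dB, which gives a feel for how tight the −30 dB threshold is.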
Since the entire load-pull measurement dataset is periodic with the period $T_{CF}$, the functions $b_1(t)$, $b_2(t)$, $x_{1,1}(t)$ through $x_{1,17}(t)$, and $x_{2,1}(t)$ through $x_{2,17}(t)$ are evaluated at times $t = 0, t_d, 2t_d, \cdots, 16t_d$. This means that each frequency-domain load-pull measurement at a fixed power level in the dataset is converted into 17 equivalent discrete time-sampled data points. This is the discrete time-domain dataset used for fitting the simulation-time-independent non-linear output functions $g_1$ and $g_2$ that implement the following TD-MTD mappings:
$$b_1(t) = g_1\big(x_{1,1}(t), \cdots, x_{1,17}(t), x_{2,1}(t), \cdots, x_{2,17}(t)\big) \tag{17}$$
$$b_2(t) = g_2\big(x_{1,1}(t), \cdots, x_{1,17}(t), x_{2,1}(t), \cdots, x_{2,17}(t)\big) \tag{18}$$
The multivariate non-linear functions $g_1$ and $g_2$ can be implemented with multivariate polynomial functions of the form:
$$g_1 = \sum_{y_1, \cdots, y_{17}, z_1, \cdots, z_{17}} K_{(y_1, \cdots, y_{17}, z_1, \cdots, z_{17})} \prod_{i=1}^{17} x_{1,i}^{y_i}(t) \, x_{2,i}^{z_i}(t) \tag{19}$$
$$g_2 = \sum_{y_1, \cdots, y_{17}, z_1, \cdots, z_{17}} L_{(y_1, \cdots, y_{17}, z_1, \cdots, z_{17})} \prod_{i=1}^{17} x_{1,i}^{y_i}(t) \, x_{2,i}^{z_i}(t) \tag{20}$$
The polynomial coefficients $K_{(y_1, \cdots, y_{17}, z_1, \cdots, z_{17})}$ and $L_{(y_1, \cdots, y_{17}, z_1, \cdots, z_{17})}$ have unique real values for each combination of the polynomial powers $y_i$ and $z_i$ of the auxiliary signals. The main complication of a multivariate polynomial implementation is that the number of required model coefficients increases as the model order is increased, and the choice of which coefficients to include and which to leave out becomes important for the stability of the model extraction. In addition, while polynomial models can allow for good interpolation, they are not well suited for extrapolation beyond the training data.
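The growth in the number of polynomial coefficients can be made concrete with a short count. The Python sketch below assumes a full polynomial, meaning every monomial up to a given total order in the 34 auxiliary signals is kept, including the constant term; a pruned polynomial would use fewer terms. The function name `full_polynomial_terms` is illustrative.

```python
from math import comb


def full_polynomial_terms(n_inputs, order):
    """Number of monomials of total degree <= order in n_inputs variables."""
    return comb(n_inputs + order, order)


N_INPUTS = 2 * 17   # x_{1,1}..x_{1,17} and x_{2,1}..x_{2,17}

# Coefficient count of one output function for a few total orders.
term_counts = {order: full_polynomial_terms(N_INPUTS, order) for order in (1, 3, 5)}
```

Under this assumption a third-order model already needs 7770 coefficients per output and a fifth-order model more than half a million, which illustrates why the selection of terms matters for a polynomial implementation.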
Since the non-linear representation capability of polynomials is limited, an artificial neural network model of $g_1$ and $g_2$ is used instead to avoid these limitations. The decision to use artificial neural networks is made for convenience of implementation and is not a requirement of a TD-MTD model, as other non-linear functional implementations of the TD-MTD model could also be effective modeling tools.
The neural network topology used for modeling each of the two multi-variable non-linear functions $g_1$ and $g_2$ has a distinct input neuron for each of the auxiliary signals $x_{1,i}$ and $x_{2,i}$ and a single output neuron for $b_1$ or $b_2$ respectively, similar to what is shown in Fig. 6. All the inputs to the artificial neural network are normalized to a value between 0 and 1 (using Min-Max Normalization). The layers of neurons in between the input and output layers are referred to as the hidden layers. Each neuron in a hidden layer gets an input from all the neurons in the previous layer and computes its output based on the following neuron model:
$$y = S\left(\sum_{i=1}^{k} w_i x_i + b\right) \tag{21}$$
where $x_i$ are the $k$ inputs to the neuron and $S(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function. The choice of the sigmoid function as the activation function of the neuron model ensures that the output of the artificial neuron $y$ will be a value between −1 and 1. Each artificial neuron has input weights $w_i$ and a bias value $b$ as parameters that need to be solved for during the ANN training. The output neuron of each of the $b_1(t)$ and $b_2(t)$ ANNs denormalizes the value of the output neuron from a value between −1 and 1 to the minimum and maximum values that are available in the training dataset for the output variable. It is known that a single hidden layer with enough neurons should be enough to model any continuous non-linear multivariate expression. In practice we noticed that, to achieve the −30 dB threshold NMSE, a single hidden layer topology is capable of modeling the gate (input impedance) non-linear model for $b_1$, but for the drain (output power) non-linear model a two-hidden-layer neural network topology was required to achieve the target NMSE with respect to the load-pull data with fewer neurons. It was observed by the authors that artificial neural networks with up to two hidden layers can be simulated with modern harmonic balance simulators without much trouble.
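The neuron model and the input normalization described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the trained network: the weights, biases and input values are hypothetical, and the helper names `min_max_normalize`, `neuron` and `dense_layer` are illustrative.

```python
import math


def sigmoid(x):
    """Sigmoid activation S(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def min_max_normalize(value, lo, hi):
    """Map a value from the range [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)


def neuron(inputs, weights, bias):
    """Single neuron: weighted sum of the inputs plus bias, passed through S."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def dense_layer(inputs, weight_rows, biases):
    """Fully connected layer: every neuron sees every input of the previous layer."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical raw auxiliary-signal values and their training-set range.
raw_inputs = [-3.0, 1.5, 4.0]
normalized = [min_max_normalize(v, -5.0, 5.0) for v in raw_inputs]

# Hypothetical hidden layer with two neurons, followed by one output neuron.
hidden = dense_layer(normalized, [[0.4, -1.2, 0.7], [1.1, 0.3, -0.5]], [0.1, -0.2])
output = neuron(hidden, [0.8, -0.6], 0.05)
```

A flattened expression of the kind used in a simulator is the same computation written out as one nested formula, with the trained weights and biases substituted as constants.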
The authors also observed that using a bounded non-linear activation function, such as the sigmoid function, for the output neuron provides better simulation convergence than using an unbounded linear activation function, even if the ANN trained with the linear output neuron activation function achieved the required threshold NMSE.
The gate model mapping of (17) was implemented with a two-hidden-layer artificial neural network with 30 neurons in each layer, achieving an NMSE of −37.6281 dB for $b_1(t)$, while the drain model mapping of (18) was implemented with a two-hidden-layer artificial neural network with 50 neurons in each layer, achieving an NMSE of −31.1715 dB for $b_2(t)$, for the main device of the NXP A2V09H525. The ANNs were extracted using the Levenberg-Marquardt algorithm available in MATLAB. The extracted artificial neural networks implementing (17) and (18) were converted into a flattened expression. This flattened expression can be recognized by the circuit simulator when implemented as a simulation-time-independent multi-variable time-domain non-linearity. In the Keysight ADS harmonic balance simulation environment, this can be implemented using the Symbolically Defined Device (SDD) component, which implements the multi-variable non-linear functions $g_1$ and $g_2$. The inputs to these multi-variable functions, $x_{1,1}(t)$ through $x_{1,17}(t)$ and $x_{2,1}(t)$ through $x_{2,17}(t)$, are made available in the model Netlist using a time-delay chain of 16 fractional-period time delays $t_d$ applied to the input signals $a_1(t)$ and $a_2(t)$ respectively. The time-delays are implemented in the Netlist as a frequency domain equation block implementing the frequency-proportional phase shift of (7). A simulated load-pull measurement was performed on the implementation of the extracted TD-MTD model to test its ability to reconstruct the measurement dataset used to extract the model. Since a mature compact model of this power transistor is also available, load-pull simulations of this compact model at the same DC bias condition and frequencies were obtained and are included in this comparison. Table 1 summarizes how well the compact model and the extracted TD-MTD model fit the load-pull data over the 3 fundamental frequencies in terms of the RF reflected wave and DC drain current parameters.
Unsurprisingly the TD-MTD NMSE is spectacular here as it was extracted from the same load-pull data. Fig. 7 and Fig.8 show the loadpull contours of the real and imaginary part of the input impedance looking into the transistor gate (Z in = R in + jX in ) at 2dB of gain compression. Since behavioral models fit the load-pull measurement data directly, it is not surprising that the TD-MTD model has a better prediction of the input impedance of the power transistor compared to the compact model. Fig. 9, Fig. 10 and Fig. 12 show the comparison of the load-pull simulation of the compact model, the extracted TD-MTD model and the load-pull measurements in terms of the output power, operating gain and drain efficiency at 2dB gain compression respectively. The compact model has its best accuracy towards the high power and high efficiency regions of the load-pull data but the accuracy is less at impedances further away. Fig. 11 compares the AMPM loadpull contours to the simulation of the two power transistor models at 2dB of gain compression for the main device. Since the TD-MTD model is a behavioral model that does not use a look-up table, the resulting simulated performance smooths out the noise in the measurement data resulting in smooth AMPM contours that track the trend observed in the noisy load-pull measurement data. Overall the TD-MTD model can faithfully model the load-pull measurement data spanning multiple frequencies with a single smooth timedomain fitting function.
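The time-delay chain that feeds the model inputs can be sketched in Python as follows. The sketch applies a frequency-proportional phase shift to each harmonic of one period of a periodic signal, which is the general idea behind the frequency-domain delay block; the exact form of (7), the value of $t_d$ (taken here as one seventeenth of the period) and the test waveform are assumptions for illustration and are not taken from the extracted model.

```python
import numpy as np


def delay_periodic(x, f0, t_d):
    """Delay one period of a real periodic signal by t_d seconds.

    Each harmonic k*f0 is multiplied by exp(-j*2*pi*k*f0*t_d), i.e. a
    phase shift proportional to frequency. The signal is assumed to be
    band-limited below the Nyquist bin.
    """
    n = x.size
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer harmonic indices
    spectrum = np.fft.fft(x)
    shifted = spectrum * np.exp(-2j * np.pi * k * f0 * t_d)
    return np.real(np.fft.ifft(shifted))


def delay_taps(x, f0, t_d, n_delays=16):
    """Return the undelayed signal plus n_delays delayed copies.

    Row m holds x(t - m*t_d), so 16 delays give 17 rows in total.
    """
    return np.stack([delay_periodic(x, f0, m * t_d)
                     for m in range(n_delays + 1)])


def example_wave(t, f0):
    """Hypothetical test waveform: fundamental plus a third harmonic."""
    return (np.cos(2 * np.pi * f0 * t)
            + 0.3 * np.cos(2 * np.pi * 3 * f0 * t + 0.4))


f0 = 805e6                       # one of the simulated frequencies
period = 1.0 / f0
n = 64                           # samples per period (assumption)
t = np.arange(n) * period / n
t_d = period / 17                # assumed fractional-period delay
taps = delay_taps(example_wave(t, f0), f0, t_d)
```

Each row of `taps` is one candidate input to the multi-variable non-linearity. Implementing the delay as a phase shift rather than as a sample shift allows delays that are not integer multiples of the sampling interval, which is what a harmonic balance simulator requires.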

B. DOHERTY HIGH POWER AMPLIFIER SIMULATION OF MULTI-TONE DISTORTION MODELS
In this section, the models extracted in Section III-A are used to simulate the narrowband large-signal operation of a Doherty HPA design over a set of discrete fundamental frequencies. Since the load-pull measurement data used for model extraction only includes DC and fundamental frequency measurements, these behavioral models do not react to harmonic impedance terminations in the circuit simulator. Under the assumption that packaged LDMOS power transistors are not highly sensitive to harmonic impedance terminations, an HPA design can be simulated based solely on the collected DC and fundamental frequency load-pull behavior. It should be noted that the theoretical derivation of this model does not forbid the inclusion of harmonic data in the training of the model. For GaN power transistors, where the harmonic terminations become significant, it is suggested to perform harmonic load-pull measurements in conjunction with fundamental frequency load-pull at each of the fundamental frequencies. This load-pull measurement space will contain a higher number of measurements, but a TD-MTD model could be fit to such a measurement set with exactly the same procedure outlined in this paper. The validation of a TD-MTD model for a multi-harmonic, multi-fundamental-frequency load-pull measurement space is not demonstrated in this paper but is within the theoretical possibilities of TD-MTD models. The 790 MHz NXP A2V09H525-04NR6 Test Circuit was used as the reference circuit to validate the main and peaking device models extracted from the two power transistors in the NXP A2V09H525 package. Since the top-copper PCB drawings and the bill-of-materials of this reference circuit are available, the S-parameters of the input and output matching networks of the reference PA circuit were extracted using an EM simulation, and vendor S-parameter models were used for the passive components.
Harmonic balance simulations at 790 MHz, 805 MHz and 820 MHz were then performed on the schematic representation of the reference PA circuit, which includes the extracted main and peaking power transistor TD-MTD models, the extracted EM structures and the discrete passive models. The results of the simulation are compared to large-signal pulsed-RF measurements of the reference PA as well as to a simulation of the circuit with the compact model of the devices. Table 2 summarizes the NMSE of the compact model and the extracted TD-MTD model with respect to the measurement data, in terms of how well they reproduce each of the fundamental frequency RF waves and the DC drain current consumption over the power and frequency sweep. Overall, this table shows that the TD-MTD model performed significantly better than the compact model in predicting the input-side RF behavior, while on the output side the compact model performed slightly better in predicting the DC drain current but slightly worse than the TD-MTD model in predicting the RF behavior. Fig. 13 and Fig. 14 show the gain magnitude and phase compression curves of the measured PA compared to the PA simulated with the TD-MTD model and with the compact model, while Table 3 lists the numerical values of the gain magnitude and phase compression at the average power (49 dBm) and peak power (57 dBm) levels across the three load-pull frequencies. In Fig. 15 and Fig. 16, the input return loss and the drain efficiency obtained with the two transistor models are compared against the HPA measurement, while Table 4 lists the numerical values of the input return loss and drain efficiency at the average power and peak power levels across the three load-pull frequencies. While the TD-MTD model gives a better approximation of the back-off efficiency of the PA, the compact model under-estimates the efficiency in the mid-power region of the power sweep less than the behavioral model does. Fig. 15 compares the simulated and measured input return loss as it varies with the output power in this reference PA. The extracted TD-MTD behavioral model gives a better prediction of the return loss, which is expected given that it fit the input impedance of the power transistors much better than the compact model. Part of the discrepancy between the simulation of the HPA netlist, whether with the compact or the behavioral model, and the measurement of the fabricated HPA can be attributed to the inaccuracy of the simulation models used for the passive segments of the HPA circuit. The level of error seen in Table 2 compared to Table 1 suggests that most of the error can be attributed not to the active device models but to other elements in the circuit, as both the compact model and the extracted TD-MTD model had similar performance in predicting the RF performance of the reference Doherty PA design. In addition, the transistor package used for the load-pull measurement was not the same unit as the transistor used in the reference circuit, so some of the difference between the modeled and measured performance can be attributed to device variation.
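For reference, the NMSE figures quoted in Table 1 and Table 2 can be reproduced from paired measured and simulated samples with a computation along the following lines. The sketch assumes the common definition of NMSE as the ratio of error energy to measured-signal energy, expressed in dB; the function name and the sample values are hypothetical.

```python
import numpy as np


def nmse_db(measured, modeled):
    """Normalized mean square error in dB.

    Assumed definition: 10*log10( sum|meas - model|^2 / sum|meas|^2 ).
    Works for real samples (e.g. DC drain current) and complex samples
    (e.g. reflected-wave phasors) alike.
    """
    measured = np.asarray(measured)
    modeled = np.asarray(modeled)
    err_energy = np.sum(np.abs(measured - modeled) ** 2)
    ref_energy = np.sum(np.abs(measured) ** 2)
    return 10.0 * np.log10(err_energy / ref_energy)


# Hypothetical reflected-wave phasors and a model with 1% amplitude error.
b2_meas = np.array([1.0 + 1.0j, 2.0 - 1.0j, -0.5 + 0.25j])
b2_model = 0.99 * b2_meas
nmse_example = nmse_db(b2_meas, b2_model)
```

Under this definition a uniform 1% amplitude error corresponds to an NMSE of about −40 dB, which gives a sense of scale for the values reported for the load-pull fit and for the full Doherty PA simulation.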
Designing an analog Doherty HPA with a behavioral model allows the HPA designer to simulate the non-linear load modulation of both the main and the peaking power transistors as the power is ramped up. The designer can then track the performance of the HPA against the extracted load-pull characterization data and visualize the load modulation provided by the input matching network and the output matching and combining network at all the intermediate power levels from back-off to peak power. Fig. 17 and Fig. 18 show how the load impedance at the fundamental frequency varies for the main and peaking transistors, respectively, as the Doherty HPA is driven up in power.
As can be seen from the load-modulation simulation of the peaking transistor in Fig. 18, the load impedance of the peaking device starts from around the complex conjugate of the peaking transistor's off-state impedance and moves towards its optimal design impedance, close to the peak of the peaking device load-pull power contours at 1 dB of gain compression. Fig. 18 also shows that, at back-off power, the drain impedance presented to the peaking device in the Doherty PA simulation lies outside the passive load-pull characterization region of the peaking device. This means that the low-power behavior of the peaking device TD-MTD model is smoothly extrapolated to a region outside of where the load-pull measurements were performed, in order to provide some prediction of the turn-on characteristics of the peaking device when it is being modulated with significant power from the main device.
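The load-modulation trajectories discussed above are obtained from the travelling waves at the drain reference plane. A minimal Python sketch of this conversion is given below, assuming a 50 Ω reference impedance, peak-amplitude power-wave normalization, and the convention that $b_2$ leaves the device while $a_2$ returns from the load; the function names and the numerical values are hypothetical.

```python
def load_impedance(a2, b2, z0=50.0):
    """Load impedance seen by the drain at the fundamental frequency.

    The load reflection coefficient is a2/b2 (wave returned by the load
    over wave emitted by the device), mapped to an impedance using the
    reference impedance z0.
    """
    gamma_load = a2 / b2
    return z0 * (1 + gamma_load) / (1 - gamma_load)


def delivered_power(a2, b2):
    """Power delivered to the load in watts.

    Assumes the waves are in sqrt(W) with peak-amplitude normalization.
    """
    return 0.5 * (abs(b2) ** 2 - abs(a2) ** 2)


def drain_efficiency(a2, b2, v_dd, i_dc):
    """Ratio of delivered RF power to DC power drawn from the supply."""
    return delivered_power(a2, b2) / (v_dd * i_dc)


# Hypothetical operating point: a load of (10 + 5j) ohm.
z_load_true = 10.0 + 5.0j
gamma = (z_load_true - 50.0) / (z_load_true + 50.0)
b2 = 2.0 + 0.0j
a2 = gamma * b2
z_load = load_impedance(a2, b2)
p_out = delivered_power(a2, b2)
```

Evaluating `load_impedance` at every step of the simulated power sweep, separately for the main and the peaking device, yields the kind of impedance trajectory that can be overlaid on the load-pull contours.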

IV. CONCLUSION
Traditional ECAD-based design of Doherty HPAs has relied on compact models of power transistors. Since compact models are often not available at the same pace as transistor development, load-pull based designs of HPAs are often employed to allow for quick turnaround. In this paper, the TD-MTD behavioral model is proposed, which allows the HPA designer to take load-pull measurements of power transistors captured over a discrete set of non-uniformly spaced frequencies and convert them into a behavioral model of the power transistor for use in a simulation-based design environment. Using a TD-MTD model simulation as the cornerstone of HPA design can allow for fast turnaround of matching network designs without trading off accuracy. As validation of the model presented in this paper, a simulation of an LDMOS Doherty HPA design with the TD-MTD model is shown to have less than 1 dB error in the prediction of the input return loss at both back-off and peak power levels, less than 0.8 dB and 2° error in the back-off gain and phase compression, less than 1.4 dB and 5.6° error in the peak power gain and phase compression, and less than 6% error in drain efficiency over the range of simulated frequencies. The model achieves an NMSE of −18.89 dB for predicting the $B_2$ wave and an NMSE of −13.17 dB for predicting the DC drain current over the power and frequency sweep. These errors were no worse than those of the compact model based design. The use of a TD-MTD model therefore allows the HPA designer to perform ECAD-based design of HPAs relying solely on load-pull measurements, without compromising simulation accuracy compared to using a compact model of the power transistor.