Software-Defined Radio Transceiver Design Using FPGA-Based System-on-Chip Embedded Platform With Adaptive Digital Predistortion

In this paper, a software-defined radio (SDR) based transceiver system is designed and implemented on a system-on-chip (SoC) platform, which consists of a high-speed Arm embedded processor and a reconfigurable field-programmable gate array (FPGA). In the proposed SDR transceiver, the real-time baseband signal generation and adaptive digital predistortion (ADPD) units are implemented on the SoC platform. A memory polynomial model based ADPD solution is implemented to linearize radio frequency (RF) power amplifiers (PAs). Implementing the ADPD on a reconfigurable FPGA platform makes the system flexible and cost-effective. The PA characterization, in terms of model extraction and coefficient calculation, is done in real time. The calculated coefficients are updated in the transmission path to precondition the transmitted signal before it is applied to the PA. Because the proposed ADPD is applied at the baseband level, it can be used for different classes of PA operating at different RF carrier frequencies. A long-term evolution (LTE) signal with 20 MHz bandwidth and 11 dB peak-to-average power ratio (PAPR) is used for simulation and measurement. In measurement, the LTE signal is amplified using a GaN-based harmonically tuned continuous Class-F PA. The performance of the implemented ADPD scheme is analyzed in terms of NMSE, ACPR and EVM.

signal processing algorithms and communication network protocols are implemented on a reconfigurable hardware platform such that they can be easily upgraded without any hardware intervention [5]. This eases the problem of upgrading devices whenever a network moves from one generation to the next. Because of these advantages, most analog transceivers have been replaced in the last few years by SDR based transceivers for different wireless communication applications [5]. SDR based transceivers generally operate in the lowest two layers of the open systems interconnection (OSI) model, i.e., the data link layer and the physical layer, as shown in Fig. 1 [6]. In SDR transceivers, some or all of the transceiver functions are software-defined and implemented on a flexible and reconfigurable hardware platform [5]. These platforms include field-programmable gate arrays (FPGAs), digital signal processing (DSP) processors, general-purpose processors (GPPs), embedded processors, and programmable systems-on-chip (SoCs). By using these platforms, new features and capabilities can be added to an existing transceiver system without adding any new hardware. The physical layer of an SDR transceiver can be divided into two sub-blocks, the baseband processing block and the radio frequency (RF) front end block [6], as shown in Fig. 1(a). As the name suggests, the baseband processing block performs signal processing at the baseband level, such as baseband signal generation, modulation, demodulation, encoding, decoding and implementation of link-layer protocols. Baseband processing requires DSP processors or embedded processors and FPGAs. The RF front end, in contrast, contains the RF transceiver circuitry, which is used for baseband-to-RF and RF-to-baseband signal conversion and for transmission and reception of the signal at different RF carrier frequencies.
SDR based transceivers consist of a generic hardware/software co-platform with DSP processors/embedded processors, FPGAs, and programmable RF front end. Such SDR transceivers provide software control over most of the transceiver parameters such as signal modulation/demodulation technique, information security functions, waveform specification requirement, RF front end parameters, etc., as shown in Fig. 1 (b).
As communication standards and network generations evolve, the demand for energy- and spectrum-efficient transmission at high data rates has increased significantly. Long-term evolution (LTE) and LTE advanced (LTE-A) signals with high peak-to-average power ratio (PAPR) are used in advanced communication systems to achieve a high data rate within the limited available spectrum [7], [8]. Any modern communication system is required to amplify these high-PAPR signals with high efficiency and linearity. However, it is difficult to achieve both because of the nonlinear behavior of power amplifiers (PAs). At the amplification stage, high-efficiency harmonically tuned or switch-mode PAs are most commonly operated in the saturation region, where the PA behavior is highly nonlinear [9], [10]. Because of this nonlinear behavior, the PA creates distortion at its output in the form of spectral regrowth in the adjacent channels [8]. This spectral regrowth is quantified by the adjacent channel power ratio (ACPR). In order to maintain good linearity with high efficiency, it is desirable to extend the linear operating region of the PA [8]. Therefore, the linearization of the PA is an important aspect of improving the power and spectrum efficiency of any SDR transceiver system [11]-[13]. Several methods have been proposed in the literature for linearizing RF PAs, such as the feed-forward technique, analog predistortion (APD), linear amplification with nonlinear components (LINC) and digital predistortion (DPD). Of these, DPD is the most favorable technique because it can be applied directly at the baseband level, which makes it suitable for SDR based transceivers [11]-[15].
The main challenge in successfully implementing a DPD scheme in any SDR based transceiver is the time required to calculate the inverse nonlinear characteristics of the PA using an appropriate model and to apply them to generate the predistorted signal.

II. STATE-OF-THE-ART
A number of DPD models are discussed in the literature, but most of them use a fixed sample set of the input baseband signal, and transmission and reception are done by commercial equipment [16]-[20]. The required system/PA modeling and predistorted signal generation are performed offline on a personal computer (PC) running MATLAB [21]-[28]. The predistorted signal is then loaded directly into the signal generator to check the system performance. Such DPD systems cannot be considered real-time solutions because all the processing is done offline and they require costly commercial equipment [16]-[28]. They also lack the real-time adaptability to change the DPD parameters automatically because the DPD coefficient extraction is performed externally.

VOLUME 8, 2020
A look-up-table (LUT) based memory polynomial model is proposed in [16], [17] for DPD. These works use a fixed sample set of the input signal for hardware verification without any real-time adaptability. In [16], a co-simulation test setup is developed to evaluate the DPD performance. Here, the DPD coefficients are extracted in MATLAB and uploaded to the LUTs of the FPGA design. DPD solutions based on a memory polynomial [18], [19] and a memoryless polynomial [20] have also been proposed; these are verified in a simulation test bench but not validated in hardware. In other works [21]-[28], all the computations required for DPD are performed externally on a PC, and signal transmission and reception are carried out through commercial instruments such as MXG/VSG/MXA. In [27], the signal is captured on the FPGA board and then transferred to the PC for further processing. The DPD coefficients are calculated to generate the predistorted signal, and this signal is then transferred back to the FPGA board for transmission in the RF domain. Such solutions cannot be called real-time systems because a PC is used for DPD. In [25], an adaptive DPD system is proposed in which the dynamic properties of the PA are handled by using a set of DPD coefficients. In that system, the initial DPD coefficients are calculated offline for different PA output powers and temperatures using the MATLAB deep-learning toolbox. The different sets of DPD coefficients are generated and stored in the LUTs of the FPGA. In such systems, the DPD coefficients are PA specific and require manual intervention if the PA unit changes. The adaptation in the above-discussed DPD systems [16]-[28] requires manual intervention and therefore cannot be used as a real-time adaptive DPD solution.
Many commercial SDR platforms are also available. However, most of these platforms have limited reconfigurability for generating a signal of the required bandwidth and sampling frequency [29]-[31]. Moreover, these platforms do not provide a means to add user-specific applications. These limitations are addressed by the National Instruments (NI) universal software radio peripheral (USRP) SDR transceivers [32]. A limitation of the NI USRP, however, is that its application programming interface (API) is supported through LabVIEW (NI software). As a result, all the programming, implementation of user-specific applications and modifications require an external PC on which LabVIEW is running.
In this paper, a LUT based real-time adaptive DPD (ADPD) solution is designed and implemented on a flexible and programmable SoC platform [33], as shown in Fig. 2. The proposed design provides a software and hardware interface to develop an SDR based transceiver system in which, in real time, a baseband signal with different bandwidths is generated and transmitted at different RF carrier frequencies. At the same time, the proposed SDR transceiver can perform real-time DPD to remove the PA nonlinearities. The behavior of the PA changes with variations in temperature and input signal properties. Therefore, the adaptability of DPD is essential to respond automatically to these changes [34]. The proposed system fulfills this requirement by providing real-time adaptability, which recalculates the DPD coefficients without any manual intervention, unlike in [25], [27].
The implementation of the SDR transceiver with the real-time ADPD technique involves three main steps: (1) baseband signal generation, (2) characterization of the PA nonlinear characteristics, and (3) implementation of the inverse nonlinear characteristics of the PA in LUTs to linearize the system. The PA is characterized by driving it with a realistic LTE signal in the saturation region. The input and output of the PA are used to calculate the amplitude and phase distortion added by the transceiver system and the PA. Once this information is calculated, the inverse nonlinear characteristics of the system and PA are generated and stored in the LUTs in the form of inverse gain and inverse phase. The LUTs are used to synthesize, in the FPGA, the predistortion function generated from the PA characterization. The proposed real-time ADPD solution is capable of transmitting and updating the predistorter coefficients simultaneously to remove the system/PA nonlinearities in real time. The predistorter coefficients are updated only when the system performance falls below the required value. The real-time ADPD can be enabled and disabled automatically according to the calculated system performance to save power and extra computation.
In this paper, the timing and adaptivity issues of PA characterization are resolved by using an advanced SoC evaluation board that combines a reconfigurable FPGA platform with high-speed embedded processors [33]. With such SoC evaluation boards, complex and time-consuming signal processing algorithms can be implemented on the high-speed embedded processors, while the FPGA platform provides the processing speed needed to perform different complex mathematical functions. The embedded processors also provide: (1) communication between the SDR platform and the user interface to display the system performance, (2) a means for the user to design the different applications required for SDR and DPD implementation, (3) software control over the transceiver parameters along with the RF front end, and (4) reading and writing of data to and from the FPGA and the RF front end board [35]. These features provide high-level performance and service quality in the communication link. If the transmitted and received signals are quantized with enough bits, the performance of the DPD system can be further improved [36]. The proposed adaptive SDR transceiver has the ability to monitor its real-time performance and modify its operating parameters whenever its performance deteriorates.
Since the DPD technique is applied directly at the baseband level, it can be applied to any type of modulated signal and on any class of PA. Such a DPD scheme with high-speed digital signal processing is favorable in SDR transceiver applications where reconfigurability in terms of transmitter configuration is sought.

III. PROPOSED SDR TRANSCEIVER ARCHITECTURE
This section discusses the detailed architecture and working of the implemented SDR transceiver with an ADPD solution. The proposed SDR transceiver architecture with ADPD solution, shown in Fig. 3, is implemented on the Zynq 7000 SoC platform with the ADRV9371 RF front-end board [33], [35]. The SDR transceiver architecture is divided into two sub-blocks, namely the baseband signal processing block and the RF front end block. The baseband signal processing block consists of a processing system unit (PSU) containing an embedded processor and a programmable logic unit (PLU) containing the FPGA. The embedded processor (PSU) and the FPGA (PLU) are fabricated on a single chip [37]. Therefore, the latency of data transfer between the PSU and PLU is very low. The on-chip PSU is connected to the PLU through a multilayer advanced microcontroller bus architecture (AMBA) advanced extensible interface (AXI) interconnect, which gives very high throughput with very low latency [37]-[39]. These interconnects are non-blocking and support multiple simultaneous transactions between the PSU and PLU. They are designed such that the PSU and PLU are linked by the shortest path to keep the latency low. Moreover, the PLU is connected to the PSU by over 3000 interconnects and provides up to 100 Gb/s of I/O bandwidth [37]. The PSU-PLU interface has a total of nine configurable 32/64-bit AXI interfaces linking the PSU to the PLU. In addition to the 32/64-bit configurable AXI ports, there are four more 32-bit AXI ports connecting the Zynq PSU and PLU [37]-[39]. These ports provide a connection between the PSU and any IP blocks implemented in the PLU. Since a large number of AXI interconnect ports are available in the SoC, the total communication load on the AXI is distributed, which prevents loading effects. The PSU in the SoC consists of a dual-core Arm Cortex-A9 embedded processor with a maximum clock frequency of 800 MHz [37].
The dual-core architecture and high operating clock speed of the embedded processor accelerate the software execution, which is independent of the design implemented in the PLU. This allows fast calculation of the PA inverse model coefficients. Together, the high clock speed, the high-speed interconnects and the dual-core architecture of the embedded processor reduce the time spent on data processing.
The PSU performs tasks such as baseband signal generation, PA inverse modeling, DPD model coefficient calculation and system error calculation. The LUT implementation, the generation of the predistorted signal using the LUT coefficients and the user interface with the SDR transceiver system are performed in the PLU. To reduce the time required for DPD implementation, the time-intensive data processing can be done in the PSU in parallel with the PLU [40], [41]. The user display screen and universal serial bus (USB) input devices such as a keyboard/mouse are connected to the SoC using a high definition multimedia interface (HDMI) and high-speed AXI interconnects, respectively. The PSU is responsible for controlling all the important parameters of the SDR transceiver. The parameters of the RF front end transceiver board, such as the RF carrier frequency of transmission and reception, the sampling frequency of the digital to analog converters (DACs)/analog to digital converters (ADCs), filter coefficients, low noise amplifier (LNA) gain, attenuation, etc., are also controlled by the PSU. The RF front end has a wideband RF-to-baseband and baseband-to-RF conversion system that includes high-performance DACs and ADCs. A software-controlled attenuator is used at the end of the transmitter chain to select different power levels for driving a wide range of PAs. Several filter stages are used before the ADC and after the DAC to suppress aliasing effects.
The operation of the proposed SDR transceiver starts with assigning all the transceiver parameters to the PSU, such as the required bandwidth of the baseband signal, the baseband sampling frequency, the RF carrier frequency for transmission and reception, the output power of the transmitter and the required NMSE after DPD. After all the system parameters are initialized, the signal generation block shown in Fig. 3 generates a baseband signal and passes it through the pulse shaping filter and interpolation filter to make a bandlimited signal. The output of the interpolation filter is used to generate the baseband LTE signal X_BB with the specified bandwidth and sampling frequency. This baseband signal X_BB is passed through the predistortion unit (PDU), which generates the predistorted signal Y_PD. For the initial transmission, the PDU passes the baseband signal without any changes, so the predistorted signal Y_PD is the same as the baseband signal X_BB. The output of the PDU, Y_PD, is then passed to the RF transceiver board, where it is converted into analog form using DACs and upconverted to the required RF carrier frequency using a quadrature modulator. The RF upconverted signal is amplified by the PA before transmission. In order to model the nonlinearity of the system and PA, the attenuated output of the PA is received back by one of the receivers of the RF transceiver board. The received signal is then down-converted to baseband using the quadrature demodulator and converted into digital form using ADCs. The LNA gain at the receiver is also optimized in order to utilize the full dynamic range of the ADCs. The digitized output of the ADCs is received by the baseband signal processing block over the AXI interconnect and is represented as the distorted signal Y_D. The received distorted signal Y_D is time-aligned with the transmitted predistorted signal Y_PD.
The time-aligned input and output signals are received by the system performance calculation (SPC) block. In this block, the system performance in terms of normalized mean square error (NMSE) is calculated and displayed on the user interface. If the calculated NMSE is worse than the required value (i.e., higher in dB), the SPC block invokes the next section to calculate the model coefficients for DPD. If the NMSE already meets the requirement, the next section is not invoked, which saves the extra computation when DPD is not required. When the model coefficients have to be calculated, PA inverse behavioral modeling is performed using the time-aligned signals received from the time alignment block. This modeling step directly yields the inverse PA characteristics. From these coefficients, the inverse gain and inverse phase values are calculated and written to the respective LUTs in the PDU. The input baseband signal is passed through this nonlinear function in the PDU, which generates a predistorted signal that makes the system linear and suppresses spectral regrowth and signal distortion at the PA output. The detailed working of the PDU and the hardware-software interfacing in the SoC are discussed next.
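The decision made by the SPC block can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the `recompute_luts` callback are hypothetical, the NMSE uses the common ratio of error power to reference power, and an NMSE that is higher in dB than the target is assumed to mean that the requirement is not met.

```python
import numpy as np

def nmse_db(x, y):
    """NMSE in dB between a reference x and a measured y (complex baseband)."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    return 10.0 * np.log10(np.sum(np.abs(y - x) ** 2) / np.sum(np.abs(x) ** 2))

def needs_update(x, y, required_nmse_db):
    """True when the measured NMSE is worse (higher in dB) than the target."""
    return nmse_db(x, y) > required_nmse_db

def adapt_once(x, y, required_nmse_db, recompute_luts):
    """Run the (hypothetical) coefficient update only when it is needed."""
    if needs_update(x, y, required_nmse_db):
        return recompute_luts(x, y)
    return None  # requirement met: skip the costly inverse modeling
```

Skipping the update when the target is met is what keeps the processor load low between re-checks.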

A. PDU WORKING
The PDU shown in Fig. 4 is used to generate the predistorted signal using the stored LUT coefficients. The baseband signal X_BB, in the form of in-phase (I) and quadrature-phase (Q) components, is received by the PDU and converted to polar form with magnitude M_BB and phase φ_BB. Dual-port block random access memories (BRAMs) are designed and implemented in the PDU to store the LUT coefficients in terms of inverse gain g_pd and inverse phase φ_pd. The system modeling and the calculation of the LUT coefficients are explained in detail in Section IV. The dual-port BRAM gives the advantage of simultaneous read and write operations. A BRAM controller is designed in the FPGA to provide an interface between the user applications running in user space and the implemented BRAM. The magnitude of the baseband signal M_BB is connected to the address port of both BRAMs, which store the inverse gain and inverse phase values. For each received address, the BRAMs output the inverse gain g_pd and inverse phase φ_pd at their data ports. The magnitude of the baseband signal is multiplied by the inverse gain g_pd and the inverse phase φ_pd is added to the phase of the baseband signal, which gives a predistorted signal with magnitude M_PD and phase φ_PD. The predistorted signal is converted back to complex form, i.e., the I_PD and Q_PD components, represented as Y_PD.
The flow chart of the implemented ADPD algorithm is shown in Fig. 5. As Fig. 5 shows, the first step is the initialization of the system parameters, followed by baseband signal generation. The generated baseband signal is passed through the PDU to generate a predistorted signal. If the behavior of the PA is unknown, the gain and phase LUTs are initialized to 1 and 0, respectively. If the behavior of the PA is already known, the LUTs in the PDU can be initialized with previously calculated coefficients. After the LUT coefficients in the PDU are updated, the PA output is captured and the system performance is analyzed. One can see from Fig. 5 that the ADPD system calculates the coefficients only if the measured NMSE fails to meet the required value. In all other cases, it remains off and does not perform any operation. This saves the unnecessary computation involved in the DPD coefficient calculation. It is also seen from Fig. 5 that if the system performance (NMSE) is in the acceptable range, the system re-checks the system performance after some delay. This delay can be set according to the user application, which further reduces the computation load of the PSU.
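The PDU data path described above (rectangular to polar conversion, LUT lookup addressed by magnitude, gain multiplication, phase addition, polar to rectangular conversion) can be sketched in floating point as below. This is an illustrative model only: the linear mapping from magnitude to LUT address, the `full_scale` parameter and the function names are assumptions, and the fixed-point behavior of the FPGA is not modeled.

```python
import numpy as np

LUT_SIZE = 4096  # LUT depth reported in the measurement section

def identity_luts(size=LUT_SIZE):
    """Gain LUT of ones and phase LUT of zeros: the PDU passes the signal unchanged."""
    return np.ones(size), np.zeros(size)

def pdu(x_bb, gain_lut, phase_lut, full_scale=1.0):
    """Apply LUT-based predistortion to complex baseband samples."""
    x_bb = np.asarray(x_bb, dtype=complex)
    mag = np.abs(x_bb)        # M_BB
    phase = np.angle(x_bb)    # phi_BB
    # Assumed address mapping: magnitude scaled linearly onto the LUT index range.
    addr = np.clip((mag / full_scale * (len(gain_lut) - 1)).astype(int),
                   0, len(gain_lut) - 1)
    mag_pd = mag * gain_lut[addr]        # M_PD = M_BB * g_pd
    phase_pd = phase + phase_lut[addr]   # phi_PD = phi_BB + phi_pd
    return mag_pd * np.exp(1j * phase_pd)  # back to I_PD + j*Q_PD
```

With the identity tables, `pdu(x, *identity_luts())` returns the input samples, which corresponds to the initial transmission when the PA behavior is unknown.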

B. EMBEDDED LINUX AND WORKING
This section discusses the details of the interfacing between the PSU and PLU. In order to perform the software-hardware interfacing and provide a real-time user interface, embedded Linux is deployed in the PSU, as shown in Fig. 6. The libiio library runs on this embedded Linux to provide communication between the PSU and the hardware devices. The libiio library provides device registration and device handling, and holds information about all the connected devices. Whenever a new device or new design is added, this information is transferred to libiio. The libiio library has a user space and a kernel space. The user space provides an application programming interface (API) for writing the code of user-specific applications. The user application creates an internal buffer for each device to store the device information. The kernel space provides an interface between the user space and the connected devices and is used for real-time data acquisition from the hardware devices. A device driver is designed for each device to mediate between the user space and the connected devices. The role of the device driver is to identify the required device, the device channels and the channel attributes. For each user-defined internal buffer, the kernel space creates a kernel buffer. The kernel buffer is the only part of the kernel space that actually interacts with the hardware. The kernel buffers have a ring structure, which supports writing data to devices and reading data from user space, or vice versa, at the same time shown in

IV. DIGITAL PREDISTORTION ALGORITHM AND MODEL
Initially, a baseband LTE signal of the desired bandwidth and sampling frequency is generated. With the assumption that the PA behavior is unknown to the system, the gain and phase LUTs are initialized to ones and zeros, respectively, for the initial transmission. Therefore, the PDU transmits the baseband signal without any changes, such that Y_PD(n) = X_BB(n). The output of the PDU is upconverted to the RF carrier frequency and fed to the PA. The input to the PA can be expressed as: where ω_c is the angular RF carrier frequency, and g(t) and φ(t) are the amplitude and phase variation of the baseband modulated signal, respectively. Due to the nonlinear characteristics of the PA operating in the saturation region, gain and phase distortions are likely to occur in the amplified output of the PA. The distorted PA output (y_d) can be expressed as: where g_d(t) and φ_d(t) represent, respectively, the amplitude and phase distortion added at the amplified output due to the nonlinear behavior of the PA. Both g_d(t) and φ_d(t) are functions of the amplitude g(t) of the input signal. Hence, g_d(t) and φ_d(t) are referred to as the amplitude to amplitude modulation (AM/AM) distortion and amplitude to phase modulation (AM/PM) distortion of the PA, respectively. The attenuated output of the PA is captured and demodulated in the RF transceiver board. This demodulated signal is passed through a low pass filter to select the required frequency band. The filtered received signal is time-aligned with the transmitted baseband signal using cross-correlation [42], given by: where µ_x and µ_y represent the mean values and σ_x and σ_y the standard deviations of the input and output signals, respectively, and N is the length of the time-aligned signal. The captured output signal is divided by its complex small-signal gain. The time-aligned signals are then used to calculate the inverse nonlinear characteristics of the PA using a suitable model.
In order to generate the predistorted signal, the inverse gain and inverse phase are calculated to cancel the extra nonlinear terms g_d(t) and φ_d(t) added in (2). In the PDU, the calculated inverse gain is multiplied by the magnitude of the baseband signal and the inverse phase is added to the phase of the baseband signal to generate a predistorted signal. This predistorted signal (y_pd) can be written as: where g_pd(t) and φ_pd(t) are the inverse gain and inverse phase applied to the baseband signal, respectively. The AM/AM and AM/PM response of this predistorted signal must be an exact inverse of the PA response to make the system linear. When this predistorted signal y_pd(t) is passed through the PA, it cancels the amplitude and phase distortion added by the PA and makes the system linear. The linearized output of the PA (y_L) can be written as: where the product of g_d(t) and g_pd(t) gives a constant linear gain and the sum of φ_d(t) and φ_pd(t) gives nearly zero phase distortion. Equation (5) represents the linearized output after DPD.
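The time-alignment step can be illustrated with a normalized cross-correlation search for the integer sample delay. This is a sketch under stated assumptions rather than the expression from [42]: only integer delays are handled, the helper names are hypothetical, and fractional-delay compensation and small-signal gain normalization are left out.

```python
import numpy as np

def estimate_delay(x, y):
    """Integer lag (in samples) by which y trails x, from the correlation peak."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    # Remove the means and normalize by the standard deviations.
    xn = (x - x.mean()) / x.std()
    yn = (y - y.mean()) / y.std()
    corr = np.correlate(yn, xn, mode="full")  # lags -(len(x)-1) .. len(y)-1
    return int(np.argmax(np.abs(corr))) - (len(xn) - 1)

def align(x, y):
    """Trim x and y so that sample n of both refers to the same instant."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    d = estimate_delay(x, y)
    if d >= 0:
        x_al, y_al = x[: len(x) - d], y[d:]
    else:
        x_al, y_al = x[-d:], y[: len(y) + d]
    n = min(len(x_al), len(y_al))
    return x_al[:n], y_al[:n], d
```

For example, if `y` is a copy of `x` delayed by five samples, `align(x, y)` reports `d = 5` and returns two equal-length segments that overlap sample by sample.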
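The relations referred to as (1), (2), (4) and (5) can be written in one form that is consistent with the definitions given above. The expressions below are an assumption, not the original typeset equations; in particular, they assume that g_d(t) and g_pd(t) act multiplicatively on the envelope g(t) and that the phase terms add.

```latex
\begin{align}
x(t)      &= g(t)\cos\bigl(\omega_c t + \phi(t)\bigr) \tag{1}\\
y_d(t)    &= g_d(t)\,g(t)\cos\bigl(\omega_c t + \phi(t) + \phi_d(t)\bigr) \tag{2}\\
y_{pd}(t) &= g_{pd}(t)\,g(t)\cos\bigl(\omega_c t + \phi(t) + \phi_{pd}(t)\bigr) \tag{4}\\
y_L(t)    &= g_d(t)\,g_{pd}(t)\,g(t)\cos\bigl(\omega_c t + \phi(t) + \phi_d(t) + \phi_{pd}(t)\bigr) \tag{5}
\end{align}
```

Under this form, choosing g_pd(t) so that g_d(t) g_pd(t) = G_0 (a constant, introduced here only for illustration) and φ_pd(t) = -φ_d(t) reduces the linearized output to G_0 g(t) cos(ω_c t + φ(t)), a scaled copy of the PA input.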

A. LUT COEFFICIENT CALCULATION
In this paper, the memory polynomial model is used to extract the nonlinear coefficients of the RF PA. The memory polynomial model can be expressed as: where C_{m,k} represents the coefficients of the memory polynomial, k is the nonlinearity order, M is the memory depth and n is the discrete-time index of the signal. A subset of the time-aligned input and output signals of the PA is used to calculate these coefficients. The coefficients C_{m,k} of the memory polynomial model are then calculated using the following equation: where C_{m,k} is the set of memory polynomial coefficients used for calculating the inverse nonlinear characteristics of the PA. The number of memory polynomial coefficients is equal to the product of the nonlinearity order and the number of memory taps used for calculating the PA nonlinear characteristics (M + 1 taps when m runs from 0 to M). PA_in is the PA input signal matrix of size 1×n, where n is the total number of samples of the PA input and output used for the PA inverse modeling. Matrix A is calculated by using the output signal of the PA as follows: where y(n) is the PA output signal expressed as: Direct inversion of the matrix A has a high computational cost when the matrix is large. Therefore, the Moore-Penrose pseudoinverse [43], [44] is used for matrix inversion.
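The coefficient extraction can be sketched with NumPy as below. The sketch assumes the commonly used memory polynomial basis y(n-m)|y(n-m)|^(k-1) built from the PA output, with the PA input as the regression target (inverse modeling), and solves the least-squares problem with the Moore-Penrose pseudoinverse. The function names, the column ordering and the zero padding of delayed samples are illustrative choices, not details taken from the paper.

```python
import numpy as np

def mp_basis(y, K, M):
    """Matrix with columns y(n-m)|y(n-m)|^(k-1), m = 0..M, k = 1..K."""
    y = np.asarray(y, dtype=complex)
    cols = []
    for m in range(M + 1):
        # Delay by m samples, padding the start with zeros.
        y_m = np.concatenate([np.zeros(m, dtype=complex), y[: len(y) - m]])
        for k in range(1, K + 1):
            cols.append(y_m * np.abs(y_m) ** (k - 1))
    return np.column_stack(cols)

def extract_inverse_coeffs(pa_in, pa_out, K=9, M=2):
    """Least-squares inverse-model coefficients via the pseudoinverse of A."""
    A = mp_basis(pa_out, K, M)  # built from the PA output
    return np.linalg.pinv(A) @ np.asarray(pa_in, dtype=complex)
```

With K = 9 and M = 2 as used in the measurements, this basis has 9 × 3 = 27 columns, so 27 coefficients are estimated.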
After calculating the inverse of the matrix A, the memory polynomial coefficients are calculated using (7) and represented as: The next step is to calculate the inverse gain and inverse phase using these coefficients. Since the gain and phase distortion added by the PA depend on the amplitude of the input baseband signal, the inverse gain and inverse phase values are calculated for each possible input amplitude. The size of each LUT is selected such that an inverse gain value and an inverse phase value are stored in the LUT for each possible input magnitude. For the calculation of the inverse gain and inverse phase values, a signal containing all possible magnitudes is generated. This generated signal with length L can be defined as: Using the memory polynomial coefficients calculated by (7), an inverse modulated signal is generated using (6) as: where op_inv is the inverse signal used to calculate the inverse gain and inverse phase using the following equation: where l is the length of the calculated inverse gain and inverse phase matrix. Since a memory depth of up to M = 2 is used in this paper, m = 0, 1, 2. Therefore, three sets of inverse gain and inverse phase values are calculated using (14) and (15) for m = 0, 1, 2, respectively. In the proposed ADPD technique, a table-based LUT is used in which the inverse gain and inverse phase are calculated for each possible input power level. Therefore, the LUTs have a length equal to the length of the generated signal X_new. The inverse gain and inverse phase values are sorted in the LUTs according to the instantaneous envelope power of the baseband signal. In the transmission path, the magnitude of the baseband signal is used as the address of the BRAMs, and the corresponding gain and phase values are applied to generate the predistorted signal. The architecture for generating the predistorted signal using the LUT based memory polynomial ADPD is shown in Fig. 8. One can observe from Fig. 8 that three PDUs of the type shown in Fig. 4 are used to precondition the baseband signal to make the system linear. In the LUTs of each PDU, the inverse gain and inverse phase values are stored according to the value of m. Therefore, the inputs of the PDUs are multiplied by different gain/phase values. For m = 0, the baseband signal is applied directly to the PDU in which the inverse gain/phase for m = 0 is stored. For m = 1 the input signal is delayed by 1 sample and for m = 2 it is delayed by 2 samples, and it is then multiplied by the inverse gain/phase calculated for m = 1 and m = 2, respectively, in the corresponding PDU. The outputs of the PDUs are summed to generate the predistorted signal, which is passed to the RF transceiver board for transmission.
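The three-branch structure of Fig. 8 can be sketched as a sum of LUT-based branches fed with progressively delayed copies of the baseband signal. This is an illustrative floating-point model: the LUT contents are placeholders supplied by the caller, the linear magnitude-to-address mapping and `full_scale` are assumptions, and the helper names are hypothetical.

```python
import numpy as np

def delay(x, m):
    """Delay x by m samples, padding the start with zeros."""
    return np.concatenate([np.zeros(m, dtype=complex), x[: len(x) - m]])

def lut_branch(x, gain_lut, phase_lut, full_scale=1.0):
    """One PDU: look up inverse gain/phase by magnitude and apply them."""
    addr = np.clip((np.abs(x) / full_scale * (len(gain_lut) - 1)).astype(int),
                   0, len(gain_lut) - 1)
    return np.abs(x) * gain_lut[addr] * np.exp(1j * (np.angle(x) + phase_lut[addr]))

def predistort(x_bb, gain_luts, phase_luts):
    """Sum the branch outputs for m = 0, 1, ..., one LUT pair per branch."""
    x_bb = np.asarray(x_bb, dtype=complex)
    return sum(lut_branch(delay(x_bb, m), g, p)
               for m, (g, p) in enumerate(zip(gain_luts, phase_luts)))
```

Passing three gain tables and three phase tables reproduces the m = 0, 1, 2 arrangement; setting the m = 1 and m = 2 gain tables to zero reduces the structure to a single memoryless PDU.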

V. HARDWARE IMPLEMENTATION AND MEASUREMENT RESULTS
The Xilinx Zynq SoC ZC706 evaluation board is used as the SoC and the ADRV9371 evaluation board from Analog Devices is used as the RF front end [33], [35]. The experimental setup used in this paper is shown in Fig. 9. The ADRV9371 uses a single chip for transmission and reception of the signal to maintain synchronization between the transmitter and receiver. The implementation of the PDU in the FPGA requires fixed-point algorithms, which are more complex than integer-based algorithms. In the RF transceiver board, the DAC resolution is 16 bits and the ADC resolution is 14 bits. Therefore, the generated baseband signal in the transmission path is stored at 16-bit resolution and the received data is stored at 14-bit resolution. The implemented LUT has 4096 entries to cover all the possible magnitudes of the baseband modulated signal. The performance of the proposed method is evaluated in terms of NMSE, power spectral density (PSD), error vector magnitude (EVM), and hardware utilization. The NMSE can be calculated as: where x(n) is the complex baseband input signal, y_meas(n) is the measured baseband output signal and N is the length of the signals. The EVM is calculated after decoding the symbols from the received signal. It represents the error of the detected symbol position compared to the original symbol position.
The EVM is expressed as a percentage. The AM/AM and AM/PM response of the PA at a center frequency of 2 GHz and 40 dBm output power at 10 dB gain is shown in Fig. 10. The gain compression of the PA is 4 dB, as shown in Fig. 10(a), and the phase compression is shown in Fig. 10(b). From Fig. 10, one can observe that the transmitter exhibits a scattering effect and nonlinearity up to the saturation region. For the calculation of the memory polynomial coefficients, a nonlinearity order of K = 9 and a memory depth of M = 2 are selected and the performance of the DPD is checked. Converting the complex signal into polar form, multiplying it by the inverse gain and phase, and converting it back to complex form involves rounding and truncation of bits. Therefore, the PDU itself may generate some distortion due to this error. However, the performance of the PDU is verified by calculating the NMSE between its input and output, which is -60.5 dB and hence shows good linearity. The inverse gain and inverse phase values calculated using (14) and (15) for m = 0, 1, 2 are shown in Fig. 11(a) and Fig. 11(b), respectively. The baseband signal is multiplied by these LUT values, as shown in Fig. 8, to generate a predistorted signal that linearizes the system. The linearized output signal of the PA is then used for further calculation.
The performance of the implemented ADPD system using the memory polynomial method is shown in Fig. 12, which plots the output power spectrum of the PA for a 20 MHz LTE signal with and without DPD. Fig. 12 shows a significant reduction in spectral regrowth, and the ACPR is improved by 13.8 dBc after applying DPD. The combined AM/AM and AM/PM response of the PDU and PA is shown in Fig. 13. One can observe that after applying DPD the scattering behavior of the PA is reduced, and the relationship between the input and output of the PA is linear over the entire range of input power. Fig. 14 shows the variation in ACPR and NMSE with the number of iterations. It can be observed from Fig. 14 that the proposed scheme is able to correct the PA nonlinearity in at most two iterations. ACPR and NMSE are almost constant after the second iteration, and the change in the ACPR after the second iteration is within ±1 dB. Here, iteration 0 represents the case where DPD is not applied.
To assess the in-band distortion, the EVM is measured and summarized in Table 1, along with the NMSE and ACPR results. Table 1 shows that after DPD, the NMSE improves from -5.65 dB to -30.29 dB. The improvement in left and right ACPR is 13.8 dBc and 13.9 dBc, respectively, after DPD is applied. Table 1 also shows that after DPD the EVM is 2.95 %, which is within the limit. The constellation plot at the PA output with and without DPD is shown in Fig. 15. The FPGA resources used for designing the PDU are summarized in Table 2. From Table 2, it is clear that the proposed DPD requires only a small portion of the available resources. Table 3 compares the proposed SDR transceiver with the state of the art. From Table 3, it is clear that the proposed scheme provides a real-time adaptive DPD solution that can be used for any SDR transceiver system.
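To make the EVM figure in Table 1 concrete, the sketch below computes an RMS EVM in percent from decoded symbols. It assumes the widely used definition in which the error vector power is normalized to the average power of the ideal (reference) constellation points. The function name `evm_percent` and this normalization are assumptions for illustration and may differ from the exact expression used in the measurement.

```python
import math


def evm_percent(ideal_symbols, received_symbols):
    """RMS EVM in percent.

    Assumed definition: 100 * sqrt( sum |r - s|^2 / sum |s|^2 ),
    where s are the ideal symbols and r the received (decoded) symbols.
    """
    if len(ideal_symbols) != len(received_symbols) or not ideal_symbols:
        raise ValueError("symbol lists must be non-empty and of equal length")
    error_power = sum(
        abs(r - s) ** 2 for s, r in zip(ideal_symbols, received_symbols)
    )
    reference_power = sum(abs(s) ** 2 for s in ideal_symbols)
    if reference_power == 0.0:
        raise ValueError("ideal symbols have zero power")
    return 100.0 * math.sqrt(error_power / reference_power)
```

As a usage example, shifting every point of a QPSK constellation at (±1, ±1) by an error vector of magnitude 0.05 gives an EVM of about 3.54 %, since the ratio of error power to reference power is 0.0025 / 2.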

VI. CONCLUSION
The designed SDR transceiver provides an efficient, flexible and low-cost multi-functional solution that can be easily reconfigured under software control. A DPD platform for the SDR transceiver is developed for advanced LTE signals. A high-speed SoC is used to reduce the time consumed in coefficient extraction. A memory polynomial with 9th-order nonlinearity and a memory depth of 2 is used. The performance of the proposed algorithm is evaluated in terms of NMSE, EVM and ACPR. The performance of the PDU is also evaluated, which shows that it does not degrade the system performance. The proposed algorithm is validated using a 20 MHz LTE signal passed through a 10 W PA, and the implemented algorithm shows good improvement after linearization.
NISHANT KUMAR (Member, IEEE) received the B.Tech. degree in electronics and communication engineering from Rajasthan Technical University, Kota, India, in 2010, and the M.Tech. degree from the Maulana Azad National Institute of Technology, Bhopal, India, in 2013, with specialization in VLSI and embedded system design. His M.Tech. project was based on the design and analysis of a dual-core pipelined processor implementation using FPGA. He is currently pursuing the Ph.D. degree with the Department of Electronics and Communication Engineering, Indian Institute of Technology (IIT) Roorkee, India. He is the Founding Director of Linearized Amplifier Technologies and Services Private Limited. His research interests include software-defined radio, digital transmitter, digital signal processing, and power amplifier design. His research has resulted in publications in reputed journals and conferences and two patents (under review).