Real Time Receiver Baseband Processing Platform for Sub 6 GHz PHY Layer Experiments

Wireless communication is rapidly evolving to fulfill diverse requirements in a number of application areas. Researchers are experimenting with novel ideas to improve different aspects of communication systems and performance metrics. While computer simulation is the first step to validating these approaches, testing platforms are needed to transform them to implementations for further validation and experimentation. Software Defined Radios (SDR) and System on Chip (SoC) offer a great deal of flexibility and versatility allowing researchers to experiment with wireless algorithms. However there are challenges; noise, channel effects, and synchronization errors have to be dealt with, and measures to mitigate their effects on received symbols should be implemented. Existing research using state of the art SDR platforms has not leveraged the powerful processing capabilities of Field Programmable Gate Arrays (FPGAs) in SoCs for receiver backend operations. In this work, we use high level tools to describe hardware, and automatically synthesize and implement receiver baseband wireless signal processing algorithms for FPGA targets. We demonstrate the ability to use such platforms for real world applications using over the air waveforms. We use Xilinx ZC706, Zedboard, ADI’s FMComms3, and NXP’s BGA7210 variable gain amplifier (VGA) for the experiments presented in this paper. Use cases considered include testing the performance of higher order modulation schemes, adjacent channel interference, power amplifier (PA) gain compression and effects on the bit error rate (BER) performance.

cause performance bottlenecks in terms of spectral efficiency and multiple access, which are fundamental considerations in future communication systems. Orthogonal Frequency Division Multiplexing (OFDM), for example, suffers from a low spectral decay rate [2], which causes high out-of-band radiation [3]. OFDM characteristics impose strict synchronization and orthogonality requirements, which prevent IoT based systems from operating in asynchronous communication scenarios where latency becomes a critical performance metric [4].
As the radio spectrum becomes more crowded, researchers are investigating how to use the legacy sub 6 GHz band as well as millimeter wave bands efficiently for future communication systems. 5G New Radio (NR) access technology for next generation mobile networks is expected to operate in two frequency bands: FR1 (sub 6 GHz band) and FR2 (24.25 GHz to 52.6 GHz). Design of communication systems in FR2 requires significant effort in the design of multi-element antenna arrays, considering the high free space path loss (FSPL) and penetration issues. The technical challenges associated with radio frequency (RF) and antenna design to handle wide frequency bands, and mitigation of phase noise (a major problem at high frequencies), need to be addressed for such systems to support higher order modulation formats [5]. In contrast, the sub 6 GHz band possesses some favorable characteristics that help solve many issues that could be challenging in millimeter wave (mmW) based deployments. Lower base station density, reasonably good coverage distance, and the ability to penetrate structures are the key advantages of this band compared to mmW that make it a strong candidate for initial 5G rollouts.
Exponential growth in demand for higher data rates has necessitated higher order modulation schemes since they support transmission of more bits per unit of bandwidth (bits/s/Hz). These increased requirements cannot be fully met with Quadrature Amplitude Modulation (QAM) formats such as 16/64 QAM, which have widely been used in WiFi and 4G Long Term Evolution (LTE) applications. A notable feature of 5G will be dense infrastructure networks in which the distance between a base station (BS) and user equipment (UE) is reduced and the link budget is improved, allowing higher order modulation based channels to operate [6]. The 3rd Generation Partnership Project (3GPP) release 12 started supporting 256 QAM for LTE downlink, and now with release 15, 1024 QAM support has been added. The increase of data rate with the order of modulation up to 256 QAM for different bandwidths is shown in Fig. 1. However, the applicability of higher order modulation techniques depends on the behavior of the communication link in terms of $E_b/N_0$, the ratio between energy per bit and noise spectral density. An important performance metric to be considered for all types of communication systems is BER, which is a measure of how many erroneous bits are received per bit transmitted. A plot of BER vs different values of signal to noise ratio (SNR) in dB is given in Fig. 2, which shows how BER increases at low SNR values for different orders of modulation. Since higher order modulation schemes demand more SNR to maintain a target BER that could have been achieved with a lower order modulation scheme at a lower SNR, systems using higher order modulation require significant design effort to minimize the noise level in the RF data path and data conversion modules.
[Fig. 2: BER vs SNR for different modulation formats assuming that no error correction coding is performed. In practice, error correction performed at the receiver can relax the SNR requirement to achieve a target BER.]
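The trend described above (higher order QAM needs more SNR for the same BER) can be illustrated with a short sketch. It uses the widely used nearest-neighbour approximation for uncoded, Gray-mapped square M-QAM in AWGN, and it assumes that "SNR" means $E_s/N_0$; both choices are assumptions made here for illustration and are not taken from the paper's simulation setup. The function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def qam_ber(m, snr_db):
    """Approximate uncoded BER of square M-QAM in AWGN.

    Assumes Gray mapping and that snr_db is Es/N0 in dB (an assumption
    made for this sketch). The approximation is accurate at moderate
    to high SNR and is loose at very low SNR.
    """
    bits_per_symbol = math.log2(m)
    es_n0 = 10.0 ** (snr_db / 10.0)
    arg = math.sqrt(3.0 * es_n0 / (m - 1))
    return (4.0 / bits_per_symbol) * (1.0 - 1.0 / math.sqrt(m)) * q_function(arg)


if __name__ == "__main__":
    for m in (16, 64, 256, 1024):
        row = ", ".join(f"{qam_ber(m, snr):.2e}" for snr in (10, 20, 30, 40))
        print(f"{m:5d}-QAM BER at 10/20/30/40 dB: {row}")
```

At a fixed SNR, the printed BER grows with the modulation order, which is the behavior the text attributes to Fig. 2.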
Ideally, a receiver side low noise amplifier (LNA) should not add any noise to a received signal; i.e., amplification should raise the signal power and the noise floor by the same amount, so that the amplifier's input and output have the same SNR. In reality, amplifiers add noise to received signals, causing a degradation of SNR. This degradation is measured in terms of the noise figure (NF) of the LNA and, with both SNRs expressed in dB, is given by $\mathrm{NF} = \mathrm{SNR}_{\mathrm{input}} - \mathrm{SNR}_{\mathrm{output}}$.
In addition to noise, there are other impairments such as channel effects, time and frequency offsets. In a practical wireless communication system, it is critical to analyze such impairments in both transmit and receive chains and take appropriate actions to mitigate them in both analog and digital domains. SDR based infrastructure provides researchers with a convenient way to design and implement complex systems. However most published academic research has not made full use of the processing capabilities of such platforms to demonstrate live wireless communication systems. In our work, we leverage the high level of parallelism of FPGAs to implement advanced signal processing algorithms on the receiver baseband such as channel estimation and equalization, and show, with experiments, the possibility of detection and demodulation of multi carrier waveforms with higher order modulation schemes. In particular, we focus on real world use cases with natural impairments and no extra instrumentation in the transmit or receiver chains. Our platform takes into account various impairments of a practical wireless system and takes measures to mitigate their adverse effects in order to improve the BER performance. We include experiments that involve PAs and adjacent interfering channels. Thus, our platform can be used as a baseline framework on top of which other enhancement techniques such as digital pre-distortion (DPD) and spectrally efficient modulation schemes can be implemented and characterized.
The main contributions of our platform are:
• The ability to transmit/receive over the air LTE based waveforms and perform baseband processing in real time,
• support for higher order modulation schemes such as 64, 256 and 1024 QAM with quantification of received signals based on the receiver side EVM,
• a channel estimation technique that is robust against multi-path/synchronization errors and channel effects,
• the ability to perform wireless experiments under controllable interference levels, and
• the ability to measure and characterize effects of peak to average power ratio (PAPR) of transmitted signals and PA nonlinearities.
A list of acronyms used in the paper is given in Table 1. The rest of the paper is organized as follows. In Sec. II, we present background related to SDR and SoC design. In Sec. III, we present related work. We describe our workflow in Sec. IV. In Sec. V, we describe impairments in general, and how we mitigate some of them using signal processing algorithms implemented on the FPGA. Sec. VI describes the experimental setup and Sec. VII lists results. Sec. VIII concludes and presents future work.

A. SOFTWARE DEFINED RADIO
The inception of SDR dates back several decades, when researchers came up with techniques to realize in software some of the functions that were traditionally performed using discrete components. However, it was not until the latter part of the 20th century that the functional and architectural aspects of SDR were fully defined [7]. Recently, direct conversion (zero-IF) technology has emerged, eliminating the IF stage of a traditional superheterodyne receiver. This approach is a cost effective method for SDR and helps improve the size, weight, and power of receivers. The high level architecture of a zero-IF transceiver is given in Fig. 3. The processing of such SDR can be divided into three categories: The analog RF section consists of RF filters, attenuators, LNAs, and mixers. The analog baseband section consists of analog reconstruction and anti-aliasing filters for bandlimiting the transmit/receive signals, so that the power of frequencies above the Nyquist frequency is highly attenuated to prevent out of band emission in the case of transmitted signals and interference in the case of received signals. Direct conversion receivers have IQ imbalance and DC offset problems which need to be corrected at some point in the receiver chain.

B. SYSTEM ON CHIP
Advancements in VLSI technology over the last few decades have enabled highly complex functional blocks of a computer (IP blocks) to be integrated inside a single chip. A major benefit of this approach is that the re-usability of IP cores allows SoC designers to expedite the chip design process by integrating IPs in a short period. A modern SoC consists of one or more processor cores, peripherals and on chip memory, as well as interfaces for on and off chip peripherals. An SoC that includes an FPGA inherits all advantages of the FPGA, and signal processing can be partitioned between the processor and the FPGA depending on the application requirements and design trade-offs, a technique known as hardware software co-design [8]. In latency critical applications, for example, an FFT operation can be offloaded to the FPGA while the processor is assigned operations that are non-time-critical.

III. RELATED WORK
Previously, we have created a ZC706+FMComms3 based platform for detection and demodulation of WiFi and LTE signals using a single RF front end [9]. The front end works at the LTE sampling rate regardless of the protocol being received, and a matched filter is used to detect the type of the received signal. In [10], we have demonstrated receiving multiple protocols such as WiFi, LTE and Zigbee using the same RF front end and baseband processing the received signals. Openwifi [11] is a full stack implementation of an SDR based WiFi radio on ZC706+FMComms2 platform. The developers of this project have used OpenOFDM, a Verilog implementation of an 802.11 decoder in their design and developed an SDR driver for handling communication between the processor and the SDR through the FPGA. In [12], an experimental testbed using USRP has been created to perform wireless experiments using different 5G physical layer technologies based on CP-OFDM, BF-OFDM and WOLA-OFDM. All baseband processing such as waveform generation, QAM modulation/demodulation, equalization are done using MATLAB running on a host computer, and data communication to and from the USRP are done using 1 Gbps Ethernet links. OpenAirInterface [13] is another platform that offers open source software implementation of the LTE/LTE Advanced protocol stack and also provides emulation capability to carry out laboratory scale experiments. The basic design does not come with any signal processing implemented on a Spartan 6 LX150T FPGA, and based on the user requirement, some processing done in software can be offloaded to the FPGA. In [14], physical layer features of 5G NR were implemented using National Instruments Communications LTE Application Framework, and tests involving transmission/reception of a 4K video have been performed.
The platform presented in this paper differs from previous research in a number of ways. Our work focuses on the baseband operations performed on received over-the-air waveforms on the FPGA in real time. We use fully separated transmitter and receiver setups with no common LO. Mismatches between the transmitter and receiver introduce frequency offsets which need to be corrected in order to accurately demodulate received signals. Despite frequency offset correction, residual offsets still exist which, if not corrected, can significantly degrade the EVM of the received signal. We implement a robust yet hardware friendly channel estimation and equalization technique to mitigate such impairments. Accurate channel estimation enables demodulation of higher order modulated signals such as 256 and 1024 QAM which will be used in 5G applications. Software configurability of the platform can be leveraged to allow coexistence case studies that involve multiple users transmitting asynchronously; a situation that poses a major challenge in mMTC applications. Use cases involving PA nonlinearity and their effect on higher order modulations are also presented as a basis to help determine linearization requirements.

IV. WORKFLOW
The high level FPGA targeting workflow of our receiver is given in Fig. 4. Note that we process data on the FPGA only at the receiver, and not at the transmitter. Signals to be transmitted are simply passed through the transmit side FPGA. The target is a Xilinx ZC706 evaluation board. This board includes a Xilinx Zynq 7Z045 SoC which consists of a dual-core ARM Cortex-A9 processor as the PS and a Kintex-7 FPGA as the PL. Depending on the application requirements, it is possible to use either or both of these for signal processing, or to bypass them and perform processing offline using a general purpose computer. We use MATLAB Simulink to describe the high level behavior of the system to be targeted on the FPGA. This model is converted to HDL and then to an IP core using the MATLAB HDL code and IP core auto generation workflow. We use Xilinx Vivado for logic synthesis and implementation. Vivado generates the bitstream that is downloaded to the FPGA.
We use the PS to configure parameters such as center frequency, bandwidth and gain of the RF front end. We use ADI's AD9361+FMComms3 SDR as the front end which has a digital interface that connects the SoC and AD9361. When the digital data is received by the PL, the AXI AD9361 IP core performs IQ imbalance and DC offset correction, and forwards the data to the IP through a FIFO. This IP core performs operations such as data synchronization, demodulation and channel estimation at receiver baseband. More details of the IP functionality are given in Sec. V.

V. IMPAIRMENTS
An OFDM based wireless communication system is susceptible to a number of impairments which must be taken into account and corrected in order to improve the BER performance [15]- [18]. In this section we provide details related to some commonly found impairments in an OFDM system; we use LTE signals in our experiments.

A. SYNCHRONIZATION ERRORS
1) TIMING SYNCHRONIZATION
LTE uses two types of synchronization signals: PSS and SSS. The PSS is based on a frequency-domain Zadoff-Chu sequence [19]. In an LTE frame, the PSS occupies the central 62 subcarriers of the last OFDM block of subframe 0 and subframe 5, as shown in Fig. 5(a). The complex baseband representation of this signal is shown in Fig. 5(b). In an actual implementation, a copy of the PSS is stored in the receiver's memory and is cross correlated with the incoming signal. Correlation peaks shown in Fig. 5(c) indicate the presence of the PSS, and the rest of the receiver processing, such as OFDM demodulation, can be carried out based on the timing reference obtained using these peaks. It is possible that different multi-path components of a signal arrive at different times at the receiver, making it difficult to precisely synchronize the signal for the input of the OFDM demodulator. In addition, a practical implementation of an OFDM synchronizer may also introduce timing synchronization errors. In an ideal synchronizer this would not happen: the PSS can be sampled at the data rate of the PDSCH, and the incoming signal can be cross-correlated with the sampled PSS. Doing so can create a sample accurate timing reference for the FFT. In our case, since we use the PSS sampled at 1.92 MHz to synchronize a PDSCH signal sampled at 30.72 MHz, timing synchronization errors are inevitable. To circumvent this issue, we position the start of the FFT somewhere close to the center of the CP, as shown in Fig. 6. This does not cause any data loss because the CP is a copy of the tail part of the OFDM block, and any loss in data at the tail of the block is compensated by the fraction of the CP that is included in the FFT calculation.
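The correlation-based detection described above can be sketched in a few lines. This is a simplified illustration, not the FPGA implementation: it correlates directly against a generic Zadoff-Chu sequence (root 25, length 63) rather than the time-domain PSS waveform, and the noise level, stream length, and function names are hypothetical choices made for the example.

```python
import cmath
import random


def zadoff_chu(root, length):
    """Generic odd-length Zadoff-Chu sequence (illustrative reference signal)."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def correlation_magnitudes(rx, ref):
    """Sliding cross-correlation magnitude of rx against the stored reference."""
    span = len(ref)
    return [abs(sum(rx[start + i] * ref[i].conjugate() for i in range(span)))
            for start in range(len(rx) - span + 1)]


def find_timing(rx, ref):
    """Index of the strongest correlation peak, used as the timing reference."""
    mags = correlation_magnitudes(rx, ref)
    return max(range(len(mags)), key=mags.__getitem__)


def make_test_signal(ref, offset, total_len, noise_std, seed):
    """Noise-only stream with the reference sequence embedded at `offset`."""
    rng = random.Random(seed)
    rx = [complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
          for _ in range(total_len)]
    for i, sample in enumerate(ref):
        rx[offset + i] += sample
    return rx


if __name__ == "__main__":
    ref = zadoff_chu(25, 63)
    rx = make_test_signal(ref, offset=57, total_len=200, noise_std=0.1, seed=1)
    print("detected start index:", find_timing(rx, ref))
```

When the reference is sampled at a lower rate than the data, as in the setup described in the text, the detected index is only known to within the decimation factor, which is the source of the residual timing error discussed above.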
However, doing so results in a phase shift at the $k$-th subcarrier of the FFT output, which is given by:
$\theta(k) = -\frac{2\pi k D}{N}$ (1)
where $D \in \{88, 79\}$ is the number of samples to take in advance from the boundary between the CP and the OFDM block, and is taken as a fraction of the CP length. $N$ is the FFT size. These values were obtained using the default CP fraction value (0.55) provided in the MATLAB LTE Systems Toolbox. $D = 88$ corresponds to the 160 sample long CP of the first and seventh blocks of a subframe, and $D = 79$ corresponds to the 144 sample long CP of the rest of the blocks. At the output of the FFT we place phase correction logic (as shown in Fig. 7) to compensate for the offset that results from the FFT being performed away from the boundary of the CP and the OFDM block. Phase correction coefficients are stored in two look up tables. LUT 1 stores the coefficients corresponding to the OFDM blocks that are prefixed with a 160 sample long CP, and LUT 2 is for the blocks that are prefixed with a 144 sample long CP. FFT control logic determines which set of LUT coefficients to multiply the FFT result with, so that the channel estimation input is properly phase aligned. This is a required step before channel estimation.
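The effect of starting the FFT window early, and its correction with a per-subcarrier coefficient table, can be checked numerically. The sketch below uses a small FFT size and CP length so that a direct DFT stays cheap; the sizes and function names are hypothetical, and the sign of the correction follows the standard forward DFT convention, which is an assumption about the implementation.

```python
import cmath


def dft(x):
    """Direct forward DFT, adequate for the small sizes used here."""
    n_fft = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_fft)
                for n in range(n_fft))
            for k in range(n_fft)]


def early_window(symbol, cp_len, advance):
    """Prepend a CP, then take N samples starting `advance` samples before the body."""
    block = symbol[-cp_len:] + symbol
    start = cp_len - advance
    return block[start:start + len(symbol)]


def phase_correction_lut(advance, n_fft):
    """Coefficients that undo the linear phase caused by the early window."""
    return [cmath.exp(2j * cmath.pi * k * advance / n_fft) for k in range(n_fft)]


if __name__ == "__main__":
    n_fft, cp_len, advance = 16, 4, 2
    symbol = [cmath.exp(1j * 0.9 * n * n) for n in range(n_fft)]
    reference = dft(symbol)
    shifted = dft(early_window(symbol, cp_len, advance))
    lut = phase_correction_lut(advance, n_fft)
    corrected = [y * c for y, c in zip(shifted, lut)]
    worst = max(abs(a - b) for a, b in zip(corrected, reference))
    print(f"largest residual after correction: {worst:.2e}")
```

In a design with two CP lengths, one such table would be generated per value of $D$, matching the two-LUT arrangement described in the text.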

2) FREQUENCY SYNCHRONIZATION
FO between the transmitter and receiver is a major problem in OFDM which, if not corrected, can significantly degrade the performance. In a laboratory scale experiment, it is possible to use a single LO to provide the reference clock for both transmitter and receiver. Our platform is intended to provide support for more realistic use cases where the transmitter and receiver are completely separated from each other, and a frequency offset is always present. The effect of frequency offset on a 64 QAM constellation is shown in Fig. 8. The subplots of the first and second columns show how various symbols in an LTE frame rotate with time in 3-D and 2-D IQ plots, respectively. For an LO frequency of 3.5 GHz, which we used in our experiments, a frequency offset of a few kHz can be noticed. We use the same technique used in the MATLAB LTE HDL MIB Recovery example [20] to correct this offset. This technique consists of two steps: (i) FO estimation and (ii) FO correction. FO estimation is done by using a CP correlator, which provides a complex number based on the value of the FO; this number is then converted to an angle and filtered to remove transients in the estimate. The estimated value is fed to an NCO, which produces a complex exponential signal that is multiplied with the original signal to compensate for the FO. However, even if the accuracy of this estimation is high, a residual offset is always present, which makes symbols rotate progressively in subsequent OFDM blocks by a factor of $e^{-j\psi}$, where $\psi$ is the residual offset. Such rotation leads to symbols moving out of the decision boundary for any type of constellation and eventually results in an undecodable block. This rotation affects higher order modulations more strongly than lower order ones. However, pilots in an OFDM block rotate by the same angle, and this can be leveraged to correct the rotation. This is done in channel estimation and equalization, which are discussed in Sec. V-C.
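The two-step procedure above (CP correlation followed by NCO correction) can be sketched as follows. This is a generic CP-based estimator written for illustration; it is not claimed to match the MATLAB example in detail, it omits the filtering of the estimate, and the sizes, sample rate, and function names are hypothetical. The estimator is unambiguous only for offsets smaller than half the subcarrier spacing.

```python
import cmath
import math


def estimate_cfo(rx, cp_start, cp_len, n_fft, fs):
    """Estimate the frequency offset in Hz from one cyclic prefix.

    Each CP sample is correlated with its copy n_fft samples later; the
    angle of the accumulated product is proportional to the offset.
    """
    corr = sum(rx[cp_start + n].conjugate() * rx[cp_start + n + n_fft]
               for n in range(cp_len))
    return cmath.phase(corr) * fs / (2.0 * math.pi * n_fft)


def nco_correct(rx, cfo_hz, fs):
    """Multiply by a complex exponential of opposite frequency (NCO output)."""
    return [s * cmath.exp(-2j * math.pi * cfo_hz * n / fs)
            for n, s in enumerate(rx)]


def apply_cfo(signal, cfo_hz, fs):
    """Impose a frequency offset on a clean signal (test helper)."""
    return [s * cmath.exp(2j * math.pi * cfo_hz * n / fs)
            for n, s in enumerate(signal)]


if __name__ == "__main__":
    n_fft, cp_len = 64, 16
    fs = n_fft * 15000.0  # 15 kHz subcarrier spacing, small FFT for the demo
    body = [cmath.exp(1j * 0.7 * n * n) for n in range(n_fft)]
    clean = body[-cp_len:] + body
    rx = apply_cfo(clean, 2000.0, fs)
    est = estimate_cfo(rx, 0, cp_len, n_fft, fs)
    print(f"estimated offset: {est:.3f} Hz")
```

Any error left in the estimate shows up as the progressive rotation described in the text, which is why the pilots are used afterwards to remove the residual.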

B. PHASE NOISE
In theory, an LO is expected to produce a spur free carrier signal which is used to upconvert a baseband signal to a higher frequency. However, in practice, some PN will always surround the carrier signal. From a communication engineer's standpoint, the effect of such noise can be linked to the symbol constellation, where it rotates symbols away from their ideal constellation points by an angle defined by the random variable $\varphi$. We analyze the effect of PN from this perspective because we are more concerned with the effect of such noise on symbols than with the noise itself. The combined effect of thermal noise and PN on the transmit signal can be represented as: In (2), $e^{j\varphi}$ in the first term denotes the PN component acting on the transmit signal $x(n)$ that causes phase rotation of symbols, and $E_s/N_0$ in the second term denotes the signal to noise ratio of the signal. $\sigma^2$ is the variance of the additive white Gaussian noise. For simulation purposes, we use $\varphi_{\mathrm{rms}} = 0.4^\circ$, a reasonable number at 3.5 GHz LO frequency, based on the SDR manufacturer's specifications. Fig. 9(a) and (b) show the transmit side 64 QAM constellations when $\varphi_{\mathrm{rms}}$ is $0^\circ$ and $0.4^\circ$, respectively, and the SNR is 30 dB. The effect of PN in this case is negligible, since the transmit side EVM in (b) is higher than in (a) by only about 0.5 dB. However, in (c) and (d), where the SNR is as high as 50 dB, this effect becomes more obvious, with the difference in EVM as high as 7 dB. Typically, the best SNR of the SDR that we could obtain when using a 20 MHz 64 QAM LTE signal was about 50-60 dB. Therefore, we have an EVM degradation of about 7 dB due to the PN component. Nevertheless, the transmit side EVM is well within the requirements of the LTE standards for higher order QAMs, and we do not consider PN to be a deciding factor for the frequency at which we operate.
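A back-of-the-envelope check of these numbers is possible with a small-angle model. The sketch below assumes that a small rms phase error $\varphi_{\mathrm{rms}}$ (in radians) contributes an error power of roughly $\varphi_{\mathrm{rms}}^2$ relative to the signal, and that this adds to the thermal noise power; this model and the function names are assumptions made here, not the simulation used for Fig. 9.

```python
import math


def evm_db(snr_db, phi_rms_deg):
    """Approximate EVM in dB from thermal noise plus small-angle phase noise."""
    thermal_power = 10.0 ** (-snr_db / 10.0)
    phase_noise_power = math.radians(phi_rms_deg) ** 2
    return 10.0 * math.log10(thermal_power + phase_noise_power)


def evm_degradation_db(snr_db, phi_rms_deg):
    """EVM increase caused by phase noise relative to the noise-only case."""
    return evm_db(snr_db, phi_rms_deg) - evm_db(snr_db, 0.0)


if __name__ == "__main__":
    for snr in (30.0, 50.0):
        print(f"SNR {snr:.0f} dB: degradation "
              f"{evm_degradation_db(snr, 0.4):.2f} dB")
```

Under this model the degradation is a fraction of a dB at 30 dB SNR and several dB at 50 dB SNR, which is in line with the trend reported above: phase noise only becomes visible once the thermal noise floor is low enough.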

C. CHANNEL EFFECTS
Since the signals traveling through a communication channel undergo fading, channel estimation and equalization methods must be implemented at the receiver in order to correctly decode data symbols with as few bit errors as possible. The role of channel estimation becomes critical with increasing the order of modulation since EVM requirements become more stringent.
In LTE, CSR signals (also called pilots) are used for channel estimation. These pilots are placed on an LTE resource grid, as shown in Fig. 10(a) [21]. The position of pilots depends on the value of the Cell ID, a unique number that is used to identify a cell. In the simplest case, channel estimates for pilots are first calculated by taking the ratio between the received subcarrier and the transmitted subcarrier, and then averaging over multiple OFDM blocks. Channel estimates for unknown subcarriers are calculated by interpolating pilot channel estimates along the frequency axis (see Fig. 10(b)). This method is simple and convenient to implement in hardware. However, if the averaging window size is large, estimates for some blocks of symbols are likely to be inaccurate, since residual phase errors can rotate the constellation and cause the EVM to increase. For higher order modulations, the simple pilot averaging and interpolation method does not provide reasonable EVM values regardless of the size of the averaging window.
In order to perform channel estimation in a hardware friendly manner, we use a time-localized approach; i.e., we compute channel estimates for the PDSCH using its nearby reference signals by using bilinear interpolation [22]. In this approach, pilot symbols can be considered as data points of a low resolution surface plot with time, frequency and complex amplitude axes. The channel estimate for a particular subcarrier in terms of the channel estimates of the pilots ($h_i$) can be calculated by: where $w_i$ is the weight for $h_i$ and is calculated by taking the inverse of the Euclidean distance $d_i$ from the position $x$ of the subcarrier of interest, which is given by: In (3), $R$ is the ROI over which the interpolation is performed, and $p$ is the exponent for weights that is used to control the flatness of the function near data points. The value of $p$ needs to be greater than 1 for the interpolation function to be differentiable. Fig. 10(c) shows two ROIs, each consisting of 3 pilots within their perimeter. Depending on the position of the PDSCH symbol, its nearest 3 pilots are considered for calculating the channel estimate. For ease of implementation, the ROI was increased to include 3 pilots when the calculation point is close to the edge of the block. After calculating $\hat{H}(k)$, we use a CORDIC divider to obtain its inverse, and estimate the value of the transmit symbol using the single tap equalizer: Then we measure the error vector for each subcarrier and calculate the RMS and peak EVM values as: and where $N$ is the total number of subcarriers. When analyzing the effect of $p$ on the EVM by changing it from 1 to 5, it was noticed that the highest peak EVM corresponds to $p = 1$, and the lowest corresponds to $p = 5$.
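The weighting scheme described above can be illustrated with a minimal sketch. It assumes the common inverse distance weighting form, in which each pilot estimate is weighted by $1/d_i^p$ and the weighted sum is normalized by the sum of the weights; the exact expressions of (3) and (4) are not reproduced here, so this form, the grid coordinates in units of (OFDM block index, subcarrier index), and the function names are assumptions made for illustration.

```python
import math


def idw_estimate(position, pilots, p=2.0):
    """Inverse distance weighted channel estimate at `position`.

    position: (time_index, subcarrier_index) of the data symbol.
    pilots:   list of ((time_index, subcarrier_index), h_i) pairs in the ROI.
    p:        exponent applied to the distance when forming the weights.
    """
    weighted_sum = 0j
    weight_total = 0.0
    for (t, f), h in pilots:
        distance = math.hypot(position[0] - t, position[1] - f)
        if distance == 0.0:
            return complex(h)  # the symbol sits on a pilot position
        weight = 1.0 / distance ** p
        weighted_sum += weight * h
        weight_total += weight
    return weighted_sum / weight_total


def equalize(received, h_hat):
    """Single tap equalizer: divide the received subcarrier by its estimate."""
    return received / h_hat


if __name__ == "__main__":
    roi = [((0, 0), 1.0 + 0.1j), ((0, 6), 0.8 - 0.2j), ((4, 3), 0.9 + 0.0j)]
    h_hat = idw_estimate((2, 2), roi, p=2.0)
    print("channel estimate:", h_hat)
    print("equalized symbol:", equalize((1 - 1j) * h_hat, h_hat))
```

Because the weights depend only on the grid geometry and on $p$, they can be precomputed, which matches the later remark that changing $p$ only changes coefficients stored in a LUT.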
In our design, $p = 2$ has been used with the aim of obtaining the lowest RMS EVM, which was about 2%. However, it is possible to use 2.5 or 3 so that the peak EVM can be reduced further at the cost of an additional 0.1% increase in RMS EVM. These changes do not incur any additional cost in terms of hardware utilization, because they only change the weight matrix coefficients, which are mapped to a LUT on the FPGA.
Although bilinear interpolation based channel estimation is attractive in terms of hardware cost and accuracy, there are still some interpolation artifacts that could lead to small errors in channel estimation and eventually result in high EVM for higher orders of modulation. Such artifacts can be removed by using an MA filter. The resulting symbol estimate including the MA filter is given by: where is the response of the filter, and $L$ is the MA length. By changing $L$, it is possible to control the smoothness of the output signal. If $L$ is very small, most of the artifacts remain, and the EVM of received symbols does not decrease by a significant amount. When $L$ is too large, too much smoothing takes place, and the EVM becomes higher than in the non-MA-filtered case. Experimentally, $L = 13$ gave the best EVM and was used in the filter design. Fig. 11 and Fig. 12 show channel estimates $\hat{H}(k)$, where $k$ is the subcarrier index, obtained using a received signal when the timing synchronization error ($\Delta n$) is equal to zero and −8, respectively. $\Delta n = -8$ is the highest negative error resulting from the PSS correlation, which works at a sixteenth of the sampling rate of the PDSCH. In Fig. 12, an oscillation can be noticed due to the time offset. In both cases, MA filtering helps smooth out artifacts. Therefore, MA filtering is a necessary element in the channel estimator implementation. The EVM results with and without MA filtering are presented in Sec. VII-A.
We use the bilinear interpolation function from (3) and the MA filter from (9) to implement the channel estimation logic in hardware. Fig. 13 shows the block diagram of the channel estimator implemented on the FPGA. Channel estimates of pilots, which are stored in a LUT, are multiplied by weights computed according to the distance measure given in (4), and a weighted sum is calculated, normalized and MA filtered to produce the final channel estimate, which is used for equalizing the symbols.
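The smoothing step can be sketched as a centered moving average over the subcarrier index. The window length of 13 is taken from the text; the centered alignment and the shrinking window at the band edges are assumptions made for this sketch, since the hardware edge handling is not described, and the function name is hypothetical.

```python
def moving_average(estimates, length=13):
    """Centered moving average over the subcarrier index.

    Near the band edges the window shrinks to the samples that exist
    (an assumption made here; other edge treatments are possible).
    """
    half = length // 2
    smoothed = []
    for k in range(len(estimates)):
        lo = max(0, k - half)
        hi = min(len(estimates), k + half + 1)
        window = estimates[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # A flat channel with one interpolation artifact at subcarrier 10.
    raw = [1.0 + 0.0j] * 21
    raw[10] = 1.5 + 0.3j
    out = moving_average(raw, length=13)
    print("raw error at k=10:     ", abs(raw[10] - 1.0))
    print("smoothed error at k=10:", abs(out[10] - 1.0))
```

A longer window suppresses isolated artifacts more strongly but also flattens genuine frequency selectivity, which is the trade-off behind the choice of $L$ discussed above.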
Channel equalized symbols are compared against a reference constellation to measure the EVM. These symbols have undergone different levels of scaling due to gain control at the front end and various signal processing steps on the programmable logic. Therefore, it is necessary to align received and reference constellations before calculating the EVM. Once this alignment is made, it is possible to use (6) and (7) to compute the EVM.
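The alignment and EVM computation can be sketched as below. The sketch aligns the received constellation to the reference with a single least-squares complex gain and normalizes both EVM figures by the RMS magnitude of the reference constellation; the alignment method and the normalization are assumptions made for illustration, since the text does not specify them, and the function names are hypothetical.

```python
import math


def align_to_reference(received, reference):
    """Scale received symbols by the least-squares gain that best matches the reference."""
    numerator = sum(s * r.conjugate() for r, s in zip(received, reference))
    denominator = sum(abs(r) ** 2 for r in received)
    gain = numerator / denominator
    return [gain * r for r in received]


def evm(aligned, reference):
    """Return (rms_evm, peak_evm) as fractions of the reference RMS magnitude."""
    errors = [abs(a - s) for a, s in zip(aligned, reference)]
    ref_rms = math.sqrt(sum(abs(s) ** 2 for s in reference) / len(reference))
    rms = math.sqrt(sum(e ** 2 for e in errors) / len(errors)) / ref_rms
    peak = max(errors) / ref_rms
    return rms, peak


if __name__ == "__main__":
    reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    received = [2.0 * s + 0.02 * (1 - 1j) for s in reference]
    rms, peak = evm(align_to_reference(received, reference), reference)
    print(f"RMS EVM: {100 * rms:.2f} %, peak EVM: {100 * peak:.2f} %")
```

Multiplying the returned fractions by 100 gives percentages comparable to the RMS EVM figure of about 2% quoted earlier.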

A. EXPERIMENTAL SETUP
We use a transmitter and a receiver setup to send/receive over the air LTE waveforms, and then perform baseband processing of the received signals on an FPGA. Our setup consists of two Xilinx ZC706 boards, each with an ADI FMComms3 board, one for the transmitter and the other for the receiver. The ADI transceiver supports carrier frequencies up to 6 GHz and a maximum bandwidth of 56 MHz. It has an FMC slot that is used to connect to an FPGA or SoC based module with a compatible interface. This connection provides a programmable interface to configure device registers using software control and also enables high throughput data transfer to and from the SoC. The transmitted signal is generated offline using MATLAB LTE Systems Toolbox, and the specifications of the generated signals are given in Table 2. The generated digital baseband signal, sampled at 30.72 MHz, is passed through the Xilinx Zynq SoC to the AD9361 FMComms3, where operations such as interpolation, D/A conversion and signal reconstruction take place. The interpolation filter chain gradually increases the sampling rate of the signal from 30.72 MHz to 491.52 MHz, at which the DAC operates. After the D/A conversion, reconstruction filtering is performed to remove sampling artifacts, and the signal is up-converted to the carrier frequency of 3.5 GHz using the transmit side LO before transmission. At the receiver, the received signal is down-converted to analog baseband, and anti-aliasing filtering is performed before the A/D stage. The A/D conversion output signal is sampled at 245.76 MHz, and is sent through a decimation filter chain to bring the baseband signal rate down to 30.72 MHz. Receiver chain processing such as signal detection, frame synchronization, and channel estimation is performed on the FPGA of the Zynq SoC.
$\text{Noise Floor}_{TX} = \text{Noise Floor}_{\text{per BW}} + 10 \log_{10}(BW)$ (10)
and is approximately −81 dBm. At the receiver, more noise gets added to the signal due to the LNA.
Usually, the NF of the LNA increases with the carrier frequency. Device specifications state an NF of 3 dB and 3.8 dB for carrier frequencies of 2.4 GHz and 5.5 GHz, respectively. Therefore, we approximate the NF to be 3.3 dB at 3.5 GHz. The noise floor at the receiver can then be expressed as:
$\text{Noise Floor}_{RX} = \text{Noise Floor}_{TX} + \mathrm{NF}$ (11)
and is approximately −78 dBm. Note that these calculations are approximate; exact values may differ slightly from those in the calculations. The SDR's maximum output transmit power is 7.5 dBm at 2.4 GHz and 6.5 dBm at 5.5 GHz, and the calculated transmit power is 7.1 dBm at 3.5 GHz. However, the maximum transmit power is limited to about −5 dBm when taking into account the high PAPR of 12 dB of the transmit signal. More details about PAPR are discussed in Sec. VI-E. Since we use a wideband signal, the total transmit power is given by integrating the power spectral density $S_{xx}$ over the bandwidth: For a discrete power spectrum with power calculated per RBW, we can approximate the transmit power as: 10 N FFT (OBW ) ; i = 0 (13) where $N_{FFT}(OBW)$ is the total number of FFT bins in the OBW and is related to the total number of FFT bins $N_{FFT}(TBW)$ in the spectrum as:
We have placed the transmitter and receiver about 1 m apart, and the FSPL according to is about 43 dB. This loss is compensated at the receiver when it is set to fast/slow attack AGC modes. Further increase in the distance between the transmitter and receiver is possible as long as the receiver amplifier does not enter saturation. We measure the SNR by first taking a recording of the signal with noise when the transmitter is enabled, and then taking another recording of the noise alone when the transmitter is disabled. Fig. 15 shows these two spectra separately. We use a sampling rate of 61.44 MHz, although the original sampling rate of LTE is 30.72 MHz. This is because some of the elements in the PL run twice as fast as the baseband rate.
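The approximate figures quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that the 3.5 GHz values were obtained by linear interpolation between the two specified frequencies and that the FSPL follows the standard free-space formula $20\log_{10}(4\pi d f / c)$; both are assumptions made here for illustration, and the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation between two specified points."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


def fspl_db(distance_m, freq_hz):
    """Free space path loss in dB for isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def rx_noise_floor_dbm(tx_noise_floor_dbm, nf_db):
    """Receiver noise floor as the transmit noise floor plus the LNA noise figure."""
    return tx_noise_floor_dbm + nf_db


if __name__ == "__main__":
    nf = interpolate(3.5, 2.4, 3.0, 5.5, 3.8)
    p_tx = interpolate(3.5, 2.4, 7.5, 5.5, 6.5)
    print(f"NF at 3.5 GHz:        {nf:.2f} dB")
    print(f"max TX power:         {p_tx:.2f} dBm")
    print(f"RX noise floor:       {rx_noise_floor_dbm(-81.0, nf):.1f} dBm")
    print(f"FSPL at 1 m, 3.5 GHz: {fspl_db(1.0, 3.5e9):.1f} dB")
```

Since the FSPL grows by 6 dB for each doubling of distance, the same function gives a quick estimate of how much extra AGC gain a longer link would need.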

B. CHANNEL ESTIMATION
In order to show the effect of n in channel estimation, we recorded a received signal and performed channel estimation in MATLAB with and without MA filtering. This was done because it is not possible to get the exact value of n in hardware. In this experiment, we have recorded RMS and peak EVM values for different timing synchronization offsets.
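For reference, the RMS and peak EVM figures mentioned above can be computed from equalized symbols as sketched below. This is an illustrative sketch only: it assumes the error is normalized by the average power of the reference constellation symbols, and the function name `evm_percent` is hypothetical rather than taken from the platform.

```python
import math


def evm_percent(received, reference):
    """Return (rms_evm, peak_evm) in percent for equal-length symbol sequences.

    Assumption: the error vector is normalized by the average power of the
    reference (ideal) symbols.
    """
    if len(received) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    ref_power = sum(abs(s) ** 2 for s in reference) / len(reference)
    err_power = [abs(r - s) ** 2 for r, s in zip(received, reference)]
    rms = math.sqrt(sum(err_power) / len(err_power) / ref_power)
    peak = math.sqrt(max(err_power) / ref_power)
    return 100.0 * rms, 100.0 * peak


# Toy example: four QPSK symbols, one of which is slightly displaced.
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [1.1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
rms_evm, peak_evm = evm_percent(received, reference)
print(f"RMS EVM = {rms_evm:.2f}%, peak EVM = {peak_evm:.2f}%")
```

The peak EVM is dominated by the single worst symbol, which is why it can respond to filtering of the channel estimates even when the RMS EVM barely changes.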

C. HIGHER ORDER MODULATIONS
Using higher order QAM formats for increased data rates has been discussed in early 3GPP standards related to LTE and LTE-Advanced. In [23] it was suggested to use 1024 QAM for stationary wireless links with high SINR to improve the network capacity. Given the fact that 1024 QAM can encode 10 bits per symbol, it is possible to increase the theoretical capacity of such a link by 25% over a 256 QAM link. We use our platform to demonstrate that 1024 QAM LTE transmission and reception is possible while meeting EVM requirements mentioned in the standard. In this experiment, we create OFDM blocks to be transmitted based on the symbol mapper equation for 1024 QAM given in [23], and create an LTE frame just like for other QAM formats by using MATLAB LTE Systems Toolbox.
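As a quick check of the 25% figure, and assuming an idealized comparison in which both formats use the same symbol rate and coding overhead, the ratio of bits per symbol is:

```latex
\frac{\log_2 1024}{\log_2 256} = \frac{10}{8} = 1.25
```

That is, a 1024 QAM symbol carries 25% more bits than a 256 QAM symbol.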

D. ADJACENT CHANNEL EXPERIMENTS
The software configurability of the SDR allows us to conduct experiments that involve more than one frequency channel operating close to each other. The center frequency and the bandwidth of the front end can be changed so that the receiver can capture a desired channel out of many, and perform baseband processing. The experimental setup shown in Fig. 16 shows a scenario with a single receiver and two adjacent channel transmitters in operation. In this setup, the receiver has been configured to a center frequency of 3.5 GHz which is the frequency of the main channel. The adjacent channel operating 20 MHz apart thus has a center frequency of 3.52 GHz. For each channel, the OBW is 18 MHz which results in a guard band of 2 MHz between channels.
Although 20 MHz LTE operates at a sampling frequency of 30.72 MHz, we operate the receiver baseband at a maximum clock frequency of 61.44 MHz due to advantages in implementation. Specifically, baseband side correlators that are used to detect the PSS and CellID can be clocked at 61.44 MHz, which is 32 times the sampling rate of the PSS. High frequency clocking enables efficient use of DSP resources on the FPGA by implementing the correlation filter using a partly serial systolic architecture. Note that, when the receiver operates at this frequency, the spectrum of the signal is compressed by a factor of two compared to that of the signal sampled at 30.72 MHz.
The AD9361 receiver chain has an analog anti-aliasing filter whose magnitude response is shown in Fig. 17. This is a third order Butterworth low pass filter, and is configured to have a passband frequency (f_c) which is 1.4 times the baseband bandwidth, i.e., half of the OBW. Its magnitude response is given by:

$$|H(j\omega)| = \frac{1}{\sqrt{1 + (\omega/\omega_c)^{2n}}} \qquad (16)$$

where n = 3 and ω_c = 2πf_c. In our setup, f_c = 12.6 MHz since the signal has an OBW of 18 MHz. f_c is also the 3 dB bandwidth of the filter. We perform an adjacent channel experiment in which both channels use the same transmit power. In this situation, the response of the anti-aliasing filter attenuates the adjacent channel power as shown in Fig. 18(a). The front end still operates at a sampling frequency of F_s = 61.44 MHz. If the signal is downsampled without any filtering, the adjacent channel causes interference on the main channel. This is shown in Fig. 18(b). To avoid this, we use a decimation filter which has a cut-off frequency of F_s/4. This value is chosen because it allows a half band filter to be used, in which nearly half of the coefficients are equal to zero, which reduces the implementation complexity. Fig. 18(a) shows that having such a filter does not completely filter out the adjacent channel power. Nevertheless, since there is a significant guard band between the two channels, the effect of this interference is negligible.
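The two filtering stages described above can be explored numerically. The sketch below is illustrative and makes two assumptions: the analog filter is an ideal third order Butterworth response with f_c = 12.6 MHz, and the half band filter is designed as a Hamming-windowed sinc with a cut-off of F_s/4. The design method and tap count actually used on the FPGA are not stated in the text, and the function names are hypothetical.

```python
import math


def butterworth_attenuation_db(f_hz, fc_hz, order=3):
    """Attenuation in dB of an ideal Butterworth low pass filter at f_hz."""
    return 10.0 * math.log10(1.0 + (f_hz / fc_hz) ** (2 * order))


def halfband_taps(num_taps=31):
    """Hamming-windowed-sinc low pass filter with cut-off F_s/4.

    num_taps must be of the form 4k + 3 so that the outermost taps are nonzero.
    """
    if num_taps % 4 != 3:
        raise ValueError("num_taps must be of the form 4k + 3")
    mid = (num_taps - 1) // 2
    taps = []
    for k in range(num_taps):
        n = k - mid
        if n == 0:
            ideal = 0.5                   # centre tap of the ideal response
        elif n % 2 == 0:
            ideal = 0.0                   # sin(pi*n/2) vanishes for even n
        else:
            ideal = math.sin(math.pi * n / 2.0) / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps


fc = 12.6e6
for f in (9e6, 12.6e6, 20e6):
    print(f"{f / 1e6:5.1f} MHz: {butterworth_attenuation_db(f, fc):5.2f} dB")

taps = halfband_taps(31)
zero_taps = sum(1 for t in taps if t == 0.0)
print(f"{zero_taps} of {len(taps)} taps are zero")
```

With these assumptions the analog filter attenuates the centre of the adjacent channel (20 MHz offset) by roughly 12 dB only, which is consistent with the need for a subsequent digital filter. In the half band filter every even-indexed tap except the centre one is zero, so the corresponding multipliers can be omitted in hardware.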

E. PA NONLINEARITY EXPERIMENTS
The AD9361 does not include a PA. However, it is possible to connect its RF output to a PA to amplify the power level of the transmitted signal in order to transmit over longer distances. Practical PAs show some undesirable characteristics that could degrade the BER performance of a communication system. Unlike LNAs used at the receiver side, which have very low noise levels, PAs cause the noise floor to increase. However, the major bottleneck of the PA is non-linearity, which prevents it from operating at high power levels. This problem worsens for OFDM signals, which have high PAPR. Modern communication systems using OFDM are designed to support high spectral efficiency and wider bandwidths that result in high PAPR values. This poses a significant challenge for designers to meet bandwidth, linearity and power efficiency requirements simultaneously. Table 3 shows the PAPR values for the three types of modulation formats that were used in the experiments. Fig. 19 shows the power vs time plot for a 64 QAM modulated LTE signal that indicates peak and average power levels.
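The PAPR of a recorded or generated baseband waveform can be estimated directly from its complex samples, as in the minimal sketch below. It assumes the standard definition of PAPR as the ratio of peak to average instantaneous power, expressed in dB; the function name `papr_db` is hypothetical.

```python
import math


def papr_db(samples):
    """Peak-to-average power ratio in dB of a sequence of complex samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    powers = [abs(x) ** 2 for x in samples]
    p_avg = sum(powers) / len(powers)  # average power
    p_pk = max(powers)                 # peak instantaneous power
    return 10.0 * math.log10(p_pk / p_avg)


# A constant-envelope signal has a PAPR of 0 dB.
print(papr_db([1j, -1, 1, -1j]))

# A single large peak raises the PAPR: peak power 9, average power 3.
print(papr_db([1, 1, 1, 3]))
```

For an OFDM waveform the estimate depends on the length of the observation window, since the largest peaks occur only rarely.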
We use the BGA7210 variable gain amplifier to boost the signal being transmitted. This PA has a maximum gain of 31 dB and a software configurable attenuator whose attenuation can be changed from 0 to 31 dB; i.e., when the attenuation is set to 0, the PA output power is maximum, and when set to 31 dB, the PA output power is minimum. However, in practice the maximum gain depends on the device operating frequency. According to the manufacturer's specifications, the typical gain range within the frequency range of 3.4-3.8 GHz is 22-30 dB under minimum attenuation. Since we use 3.5 GHz, it is safe to assume that the gain we get is around 27 dB. Now, since the device is subject to high PAPR, it becomes impossible to operate it at this gain setting, and we will likely see a degradation of EVM performance when the transmit gain G_t > G_max − PAPR. Therefore, the condition for the PA to operate in the linear region is:

$$G_t \le G_{max} - \mathrm{PAPR} \qquad (17)$$

and PAPR is given by:

$$\mathrm{PAPR} = 10 \log_{10}\left(\frac{P_{pk}}{P_{avg}}\right) \qquad (18)$$

where P_pk and P_avg are the peak and average power of the transmit signal, respectively. The inequality (17) imposes a restriction on the maximum gain achievable by the PA, thereby demanding a wider linear dynamic range. This is highly inefficient because the PA needs headroom equal to the PAPR just to accommodate the peak power (P_pk) of the transmit signal. We use the setup shown in Fig. 20 to demonstrate the effect of PA non-linearity on the received symbol constellation. We perform experiments using a 20 MHz LTE signal with 16, 64 and 256 QAM modulation formats. According to (17), the maximum PA gain should be around 15 dB for a PAPR value of 12 dB. This can be verified using the EVM results obtained for different gain values.

The synchronization error here is equal to −0.26 µs. Fig. 21(b) shows the constellation for the same situation but with MA filtering. The EVM results in Table 4 show that the RMS EVM has decreased from 1.83% to 1.58% due to the use of the MA filter at this value of n. When n is zero, the difference in RMS EVM with and without the MA filter was insignificant.
However, even in that situation, MA filtering could improve the peak EVM compared to the non-MA case. It should be noted that the improvement in EVM performance is due to the removal of interpolation artifacts, which is not surprising given the limited number of points considered in the interpolation. Alternatively, it is also possible to use a larger number of points and perform bicubic interpolation to achieve a smoother surface of channel estimates. However, in the context of implementation, trade-offs have to be considered, since selecting more points for better resolution requires more memory and may also incur additional latency. Ultimately, choosing the optimal channel estimation method for a target application requires a proper evaluation of these trade-offs before it is implemented on a target device.

B. 1024 QAM EXPERIMENTS
The 1024 QAM constellation at the receiver side is shown in Fig. 22. The RMS EVM value obtained was 2%.

C. ADJACENT CHANNEL EXPERIMENTS
Fig. 23 shows power spectra at different sampling rates when an adjacent channel is present. In (a) and (b), the sampling frequency is F_s. In (a), the effect of the anti-aliasing filter on the adjacent channel can be noticed. In (b), a decimation filter is used to attenuate frequencies above F_s/4. (c) and (d) show the spectra when the signals in (a) and (b) are downsampled, respectively. Direct downsampling results in the constellation given in Fig. 24(a), in which considerable interference is present. The RMS EVM in this case was 4.2%. Fig. 24(b) shows the constellation obtained with decimation filtering, where the effect of adjacent channel interference is no longer present. The purpose of the anti-aliasing filter is to remove frequency components and noise that are above the Nyquist frequency, which has been done in this situation. An analog filter with a steep roll-off that effectively attenuates all ACI would require increased hardware resources, and would be challenging to integrate inside an SDR. This makes it necessary to have a subsequent digital filtering stage, which we have implemented in a hardware friendly manner by using a half band filter. However, if the guard band is too narrow, there is a possibility that power from an adjacent channel leaks into the edge subcarriers of the main channel. In such cases, a digital filter with a steeper roll-off and a cut-off frequency close to the edge subcarriers is required.

Even before this transition, the PA should not be considered as operating linearly because its power transfer characteristics do not follow a linear model when it operates close to the 1 dB compression point. In addition, unlike LNAs, PAs have a larger NF, which further degrades the EVM. Therefore, a high EVM even in the PA's linear region can be anticipated. The high EVM results in Fig. 25(b,c,d) can be explained by looking at Fig. 19. There are some sparse peaks in this plot which, even when clipped, do not cause significant EVM degradation. This is the point at which the headroom of the PA is unable to accommodate the peak power, which causes a small loss in EVM. In addition, all other peaks which are about 3 dB below P_pk are affected as the gain is further increased, causing a severe loss in EVM.
A more intuitive performance metric from a communication engineer's perspective is the bit error probability P_b, which can be approximated in terms of EVM using the expression given in [24], referred to here as (19). Similar to the approach used to obtain the EVM values for 64 QAM shown in Fig. 25, we obtain those corresponding to 16 and 256 QAM. Then we use (19) to calculate the bit error probability from those values. Fig. 26 shows the corresponding P_b vs G_t plot. It is noted that for 16 and 64 QAM, P_b is negligible, i.e., it is very close to zero when the PA gain is below 15 dB. However, when the gain is increased to 15.5 dB, an abrupt increase in P_b can be noticed, which is consistent with the high EVM values obtained. At this level of PA gain, P_b of 256 QAM is on the order of 10^-2, compared to 10^-4 for 64 QAM and 10^-6 for 16 QAM. This is expected because the number of symbols crossing the decision boundary due to nonlinearity induced noise and phase rotation goes up when the decision regions are smaller, as in higher order modulations.
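The dependence of P_b on EVM and modulation order can be illustrated with a short sketch. Note that this is not necessarily the expression (19) from [24]: it uses the common textbook approximation for Gray-coded square M-QAM in additive Gaussian noise, together with the assumption that SNR ≈ 1/EVM_RMS². The function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def qam_bit_error_probability(evm_rms, m):
    """Approximate bit error probability of Gray-coded square M-QAM.

    evm_rms is given as a fraction (e.g. 0.05 for 5%).
    Assumptions: Gaussian error vector, SNR approximated by 1 / evm_rms**2.
    """
    bits_per_symbol = math.log2(m)
    snr = 1.0 / evm_rms ** 2
    arg = math.sqrt(3.0 * snr / (m - 1))
    scale = (4.0 / bits_per_symbol) * (1.0 - 1.0 / math.sqrt(m))
    return scale * q_function(arg)


# Same EVM, different modulation orders.
for m in (16, 64, 256):
    print(f"{m:4d} QAM: P_b ~ {qam_bit_error_probability(0.05, m):.3e}")
```

For a fixed EVM, the approximate P_b rises steeply with the modulation order, in line with the trend reported above for 16, 64 and 256 QAM.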

VIII. CONCLUSION AND FUTURE WORK
We have presented an SDR+SoC platform for carrying out a wide range of sub 6 GHz wireless experiments under real world conditions. The platform's baseband processing takes into account synchronization errors and channel effects, and measures have been taken to mitigate them. A robust channel estimation technique accounting for residual phase offsets has been implemented on the baseband side to improve the BER performance. This algorithm has been shown to meet the stringent EVM requirements of the existing standards for higher order QAM schemes. The testing platform has leveraged the processing capability of FPGAs to perform receiver baseband functions in real time. Making use of high level modeling and HDL code auto-generation allows researchers to prototype wireless systems in a relatively short period of time, and to test and evaluate their performance. We have also shown the effect of PA nonlinearity in terms of the received EVM when the transmitter gain is increased. Note that the high PAPR of OFDM, particularly at higher orders of modulation, is a major bottleneck that needs to be addressed in order to improve the BER performance. We have also demonstrated a coexistence scenario in which two transmitters operate on adjacent frequency channels and the receiver is tuned to capture the preferred channel by configuring its center frequency. Interference mitigation using low pass filtering on the receiver side helps improve the EVM, and implementation complexity can be reduced by using a half band filter.
In the future we plan to extend our research work in two directions. First, we plan to implement different DPD based PA linearization algorithms to improve the EVM. The targeting workflow of such techniques is similar to what we used in this paper with the exception that DPD is implemented on the transmitter side as opposed to the receiver side. The other topic we will focus on is implementation of spectrally efficient filtering and windowing techniques for MTC applications. In this research, we plan to implement different variants of OFDM spectral enhancement and interference mitigation techniques at transmit and receive sides and provide a real time demonstration of an asynchronous communication system. We expect that the platform presented in this paper can be scaled up to carry out such experiments.