A 1.6-GHz Sub-Nyquist-Sampled Wideband Beamformer on an RFSoC

This paper shows how to simultaneously achieve elemental sub-Nyquist sampling and true-time-delay (TTD) beamforming using a contemporary RF system-on-a-chip (RFSoC) by outlining the development of a 1.6 GHz S-band phased array system implemented using a Xilinx 8-channel 4 GSPS RFSoC. RFSoCs integrate high-speed analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) with a Field Programmable Gate Array (FPGA) and system-on-a-chip (SoC) architecture on a single device, enabling direct sampling of RF signals. Thus, the RFSoC is the only hardware in this beamformer apart from the antenna aperture. This enabling technology facilitates the development of compact all-digital arrays, which greatly increase the available degrees of freedom in system control, enabling a paradigm shift in industry and engineering communities. The efficacy of our modular approach is confirmed via our research testbed.


I. INTRODUCTION
THE RF system-on-a-chip (RFSoC) is a contemporary device which integrates multiple analog-to-digital converter (ADC) and digital-to-analog converter (DAC) channels into a Field Programmable Gate Array (FPGA) and System-on-a-Chip (SoC) Integrated Circuit (IC), providing potential footprint and power reductions of 50% and 75% [1], respectively. Researchers have been exploring the role of the RFSoC in next-generation phased array applications including bistatic radar [2], synthetic aperture radar [3], near-field calibration [4], real-time signal generation [5], and fully-digital radar systems [6]. Several examples of RFSoC-based phased array systems can be found in current literature [7], [8], [9], [10], [11], [12], demonstrating sample rates of 1 to 4 GSPS in various system architectures. In [7], a minimum variance distortionless response (MVDR) beamformer is digitally implemented on an RFSoC with ADCs operating at 125 MHz. Its system architecture incorporates an RF downconverter with an 8 MHz passband. The authors in [8] demonstrate a multi-beam digital beamformer which employs direct sampling of 1 GHz, 100 MHz bandwidth quadrature amplitude modulation (QAM) signals at a sample rate of 4 GSPS.
A 16-element phased array with ADCs operating at 2 GSPS is shown in [9] which supports an 800 MHz bandwidth. However, at a carrier frequency of 28 GHz, narrowband beamforming was sufficient. Beamforming measurements summarized in [10] utilize the Integrated Multi-use Phased Array Common Tile (IMPACT) [11] which supports an instantaneous bandwidth (IBW) of about 500 MHz. Beamforming using a tightly coupled dipole array (TCDA) is demonstrated at 3.5 GHz, 4.9 GHz, and 9.5 GHz, without mention of test signal bandwidth. The authors in [12] provide narrowband measurements and discuss a wideband beamforming engine with equalization.
In this paper, we present an 8-element fully-digital sub-Nyquist-sampled wideband receive array utilizing an RFSoC. This system matches or exceeds key performance metrics of the aforementioned literature such as digital bandwidth, absolute bandwidth, sampling frequency, and fractional bandwidth. The system is demonstrated with a linear chirp signal, a commonly used radar waveform, with a 1.6 GHz bandwidth centered at 3 GHz. Array elements are directly sampled by 4 GSPS ADCs such that the chirp waveform is centered in the second Nyquist zone and, upon sampling, folds into the first Nyquist zone. Signal compensation supporting the entire digital bandwidth is applied at complex baseband following the digital downconverter (DDC). Sub-Nyquist-sampled beamforming enables direct sampling below the Nyquist frequency while maintaining waveform bandwidth. This simplifies system design by mitigating the need for traditional analog downconversion circuitry while allowing for lower sample rates, and thus lower power consumption, than would be required to support Nyquist sampling at the carrier frequency. Sub-Nyquist sampling has been utilized in wideband spectrum sensing [13] and direction-of-arrival (DoA) estimation [14]. Additionally, researchers in the ultrasound community [15], [16] and optical tomography community [17] have leveraged the concept of sub-Nyquist sampling to reduce power and layout requirements.
This paper is organized as follows. Relevant theory is presented in Section II, including a comparison of narrowband and wideband beamforming, implications of sub-Nyquist-sampled beamforming, and the design of finite impulse response (FIR) fractional-sample delay filters for an embedded system. An overview of the wideband beamforming testbed is provided in Section III, including the hardware, firmware, and software. Section IV presents the results of bench-top and over-the-air (OTA) far-field anechoic chamber measurements. Comparisons between narrowband and wideband beamsteering measurements are presented as well as measured versus simulated wideband beamsteering performance. A summary is provided in Section V.

II. PROBLEM FORMULATION AND RELEVANT THEORY
This section provides the theoretical basis for the demonstrated results. Section II-A provides an overview of phased array beamforming for both narrowband and wideband signals. Section II-B discusses digital true time delay (TTD) units and their application to wideband beamforming. A procedure for designing a digital fractional-sample delay filter bank is also provided. Lastly, Section II-C discusses compensation implications of sub-Nyquist-sampled beamsteering.

A. Beamforming
The fundamental intent in phased array beamforming, also known as classical beamforming [18], is to determine the sensor signal compensation that causes signals to combine coherently for a given steering direction. Other beamsteering techniques may seek to broaden the beam to expand spatial coverage, or to mitigate the effects of directional interferers through directional nulling or adaptive beamsteering that maximizes the signal to interference and noise ratio (SINR) [19]. However, the focus herein is on the coherent summation of signals for a given direction. It is well known [20], [21], [22] that the time delay for a signal phase front to propagate from the first element to element $n$ of a uniform linear array (ULA) is given by

$$t_n = \frac{(n-1)\, d \sin\theta}{c}, \qquad (1)$$

as shown in Figure 1, where $d$ is the element spacing, $\theta$ is the angle of the incident phase front, and $c$ is the propagation speed. Classical beamforming seeks to compensate the inter-element time delay, either through TTD devices or by approximating the time delay via phase shifters. As phase shifters apply a frequency-invariant shift in phase, element compensation is defined by

$$\phi_n = 2\pi f_c\, t_{n,\theta_{st}}, \qquad (2)$$

where $f_c$ is the waveform's center frequency and $t_{n,\theta_{st}}$ is the compensation time delay for element $n$ in the direction $\theta_{st}$. In contrast, TTD units apply a direct time delay, resulting in the frequency-variant phase shift described by

$$\phi_n(f) = 2\pi f\, t_{n,\theta_{st}}.$$
Bandwidth limitations due to the use of phase shifters, such as beam squint and pulse distortion, are well documented [23], [24], [25]. Beam squint occurs when the ULA phase gradient is defined by Eq. (2), yielding progressively degrading beam accuracy with increasing $\Delta f = |f - f_c|$. This phenomenon becomes more pronounced with increasing steering angle. The frequency range over which the main beam response is within 3 dB of the peak response for the worst-case steering direction provides an estimate for system bandwidth [22]. Pulse dispersion arises when phase shifters are unable to provide more than one cycle of delay. Thus, when the propagation time across the aperture, known as the aperture fill time [26] $t_{fill}$, is greater than one cycle, compensation is wrapped to within 0° to 360°, resulting in phase-coherent signals that lack time alignment. The aperture fill time provides an additional estimate of system bandwidth [26] for a given steering angle, as described by $B = 1/t_{fill}$, although it is less conservative than the beam squint bandwidth constraint [24]. In order to increase the bandwidth of larger arrays, which have smaller beamwidths and large aperture fill times [21], TTD units may be incorporated at the sub-array level when elemental TTD units are impractical.
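As an illustration of the relationships in this subsection, the following minimal Python sketch evaluates the inter-element delay $t_n = (n-1) d \sin\theta / c$, the direction in which a phase-shifter-steered ULA peaks away from $f_c$, and the aperture-fill-time bandwidth estimate $B = 1/t_{fill}$. The function names are hypothetical, free-space propagation is assumed, and the squint relation $f \sin\theta = f_c \sin\theta_{st}$ is the standard consequence of holding the phase gradient fixed at its $f_c$ value; this is not code from the testbed.

```python
import math

C = 299_792_458.0  # propagation speed in m/s (free space assumed)


def element_delays(num_elements, pitch_m, theta_deg):
    """Delay of element n relative to element 1: t_n = (n - 1) d sin(theta) / c."""
    s = math.sin(math.radians(theta_deg))
    return [(n - 1) * pitch_m * s / C for n in range(1, num_elements + 1)]


def squinted_angle_deg(theta_st_deg, f_hz, fc_hz):
    """Beam peak direction at f when phase shifters are set for theta_st at fc.

    A fixed phase gradient 2*pi*fc*t_n(theta_st) matches a true delay only
    where f*sin(theta) = fc*sin(theta_st). Raises ValueError from asin if
    no real solution exists for the requested frequency.
    """
    return math.degrees(
        math.asin(math.sin(math.radians(theta_st_deg)) * fc_hz / f_hz)
    )


def fill_time_bandwidth_hz(num_elements, pitch_m, theta_deg):
    """Bandwidth estimate B = 1 / t_fill (undefined at broadside, t_fill = 0)."""
    return 1.0 / element_delays(num_elements, pitch_m, theta_deg)[-1]


if __name__ == "__main__":
    # 8 elements at 50 mm pitch, steered to 30 degrees at fc = 3 GHz
    for f_hz in (2.2e9, 3.0e9, 3.8e9):
        angle = squinted_angle_deg(30.0, f_hz, 3.0e9)
        print(f"{f_hz / 1e9:.1f} GHz -> beam peak near {angle:.1f} deg")
    print("B at 45 deg:", fill_time_bandwidth_hz(8, 0.050, 45.0) / 1e9, "GHz")
```

With TTD compensation the delay itself is applied, so the peak direction stays at $\theta_{st}$ for every frequency in the band.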

B. True Time Delay Units
The RFSoC facilitates elemental digital implementation of TTD units through a combination of integer-sample and fractional-sample delays. While integer-sample delays are trivially implemented by shifting digital signal samples, fractional-sample delays require a digital filter. This section provides a design method for fractional-sample delay FIR filter synthesis as well as considerations for embedded system applications.
1) Ideal Fractional-Sample Delay Filter: The ideal fractional-sample delay filter has a frequency response with unity gain and linear phase [27], as prescribed by the time shift property of the Fourier transform. Thus, the ideal frequency response for a discrete FIR filter with delay $t_a$ is given by

$$H_{id}(F) = e^{-j 2\pi F t_a},$$

which is periodic over the interval $F_s$. The magnitude and phase responses are given by

$$|H_{id}(F)| = 1, \qquad \angle H_{id}(F) = -2\pi F t_a.$$

As the resulting sinc response [28] is both infinite and noncausal, it is not possible to implement the ideal fractional-sample delay filter.
2) Approximate Fractional-Sample Delay Filter: In order to design a finite, causal approximation, the ideal response must be appropriately truncated and shifted [28]. In general, merely truncating the response to some finite length $l_{len}$ produces undesirable ripple in the frequency domain. A symmetric window function improves the ripple response appreciably at the acceptable cost of a small reduction in magnitude. Although many window functions exist, a Blackman window was utilized for the fractional-sample delay filter bank in this implementation. It is recommended to select an odd filter length given that the majority of the sinc function energy is concentrated near the center of the response. Additionally, limiting the fractional-sample delay to $-0.5 \le t_a / T_s < 0.5$ minimizes filter asymmetry. To achieve causality, the finite windowed sinc response is shifted to the right by $(l_{len} - 1)/2$, assuming an odd length $l_{len}$. Thus, the final taps for a $t_a$-delay filter with odd length $l_{len}$, designed to operate on a signal sampled at frequency $F_s = 1/T_s$, are given by

$$h[n] = w[n]\, \mathrm{sinc}\!\left(n - \frac{l_{len} - 1}{2} - \frac{t_a}{T_s}\right), \qquad \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x},$$

where $w[n]$ is some window function, in this case a Blackman window, of length $l_{len}$ and $n$ is subject to the constraint $0 \le n \le l_{len} - 1$.
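The design procedure above (truncate the shifted sinc to an odd length, apply a symmetric Blackman window, and shift right by half the filter length for causality) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the firmware implementation: the function names are hypothetical, the delay is expressed in samples ($t_a / T_s$), and the Blackman window uses the common 0.42/0.5/0.08 coefficients.

```python
import cmath
import math


def blackman(n, length):
    """Symmetric Blackman window sample (0.42 / 0.5 / 0.08 coefficients)."""
    x = 2.0 * math.pi * n / (length - 1)
    return 0.42 - 0.5 * math.cos(x) + 0.08 * math.cos(2.0 * x)


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def frac_delay_taps(delay_samples, length=21):
    """Windowed-sinc taps for a fractional delay of delay_samples (= t_a / T_s)."""
    if length % 2 == 0:
        raise ValueError("an odd filter length is assumed")
    if not -0.5 <= delay_samples < 0.5:
        raise ValueError("fractional delay must satisfy -0.5 <= t_a/T_s < 0.5")
    center = (length - 1) // 2  # causal shift of (l_len - 1) / 2 samples
    return [blackman(n, length) * sinc(n - center - delay_samples)
            for n in range(length)]


def freq_response(taps, f_norm):
    """Frequency response at normalized frequency f_norm (cycles per sample)."""
    return sum(h * cmath.exp(-2j * math.pi * f_norm * n)
               for n, h in enumerate(taps))


def group_delay(taps, f_norm, df=0.01):
    """Finite-difference group delay in samples around f_norm."""
    h1 = freq_response(taps, f_norm - df / 2)
    h2 = freq_response(taps, f_norm + df / 2)
    return -cmath.phase(h2 * h1.conjugate()) / (2.0 * math.pi * df)


if __name__ == "__main__":
    taps = frac_delay_taps(0.25, length=21)
    # Total delay is the causal shift (10 samples) plus the fractional part.
    print("group delay near f = 0.05:", group_delay(taps, 0.05))
```

A zero-delay request reduces to a unit impulse at the center tap, which is why a zero-fractional-sample filter can serve as a group delay reference.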
3) Filter Length: Considerations must be made for embedded system implementations given limited system resources. Figure 2 provides a comparison of the magnitude response and group delay for 1/2-sample delay filters of various lengths. As longer filters provide lower loss and increased bandwidth, one must ensure sufficient hardware resources to maintain accuracy in high digital bandwidth applications.
4) Fixed-Point Quantization: FIR filters are typically implemented using fixed-point numerical representation rather than floating point due to computational efficiency. Figure 3 shows degradation in the frequency response due to different fixed-point precisions for a 21-tap 1/2-sample delay filter. Black traces show the floating point response. Subsequent traces show the 16-bit, 12-bit, and 10-bit responses, respectively. Reduced coefficient precision causes increased ripple. Although the magnitude ripple is quite small, and likely inconsequential for most applications, group delay variations, particularly for 10-bit fixed-point coefficients, may be prohibitively large.
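To make the coefficient quantization step concrete, the sketch below rounds filter taps to a signed fixed-point grid. The number format is an assumption for illustration only (two's complement with one sign bit and bits - 1 fractional bits, saturating at full scale); the format used in the firmware is not stated here, and the function names are hypothetical.

```python
import math


def quantize_coefficients(taps, bits):
    """Round taps to signed fixed point with (bits - 1) fractional bits.

    Assumed format: two's complement, values in [-1, 1 - 2**-(bits - 1)],
    with saturation at the limits.
    """
    scale = 1 << (bits - 1)
    lo, hi = -scale, scale - 1
    return [max(lo, min(hi, math.floor(t * scale + 0.5))) / scale for t in taps]


def max_quantization_error(taps, bits):
    """Largest absolute difference between the ideal and quantized taps."""
    quantized = quantize_coefficients(taps, bits)
    return max(abs(a - b) for a, b in zip(taps, quantized))


if __name__ == "__main__":
    example = [0.1, -0.3, 0.7, -0.05]  # arbitrary illustrative values
    for bits in (16, 12, 10):
        print(bits, "bits -> max error", max_quantization_error(example, bits))
```

For taps that do not saturate, the rounding error is bounded by half a least significant bit, which is the source of the added ripple at lower precisions.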
5) Fractional-Sample Delay Resolution: In real-time applications, filter coefficients are typically pre-computed for a finite set of prescribed delays rather than computed in real time. Figure 4 shows the magnitude response and relative group delay of an example filter bank with a 1/32-sample resolution. The relative group delay ignores the integer-sample delay associated with the FIR filter. The filter bank consists of thirty-two 21-tap filters with 16-bit fixed-point coefficients. A zero-fractional-sample delay filter is included, which provides the group delay reference for the rest of the filter bank. This ensures that the fractional-sample filter bank group delay is applied regardless of whether a nonzero fractional-sample delay is required for a given steering operation. Due to symmetry about the group delay reference, the magnitude responses of the negative relative shifts match those of the positive relative shifts. As quantization lobes due to finite fractional-sample delay resolution degrade system SNR [29], one must ensure sufficient resolution to support system requirements. In this example, the bank of 32 FIR filters has a resolution of 15.625 ps at a baseband sample rate of 2 GSPS. This gives a worst-case phase resolution of about 21.4° at $F_{high} = 3.8$ GHz, slightly better than a 4-bit phase shifter.
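The resolution figures quoted above follow from simple arithmetic, reproduced in the short Python check below. The helper names are hypothetical; the inputs (2 GSPS baseband rate, 32 steps per sample, 3.8 GHz upper band edge) are taken from the text.

```python
def delay_resolution_s(sample_rate_hz, steps_per_sample):
    """Delay step of a filter bank with steps_per_sample filters per sample."""
    return 1.0 / (sample_rate_hz * steps_per_sample)


def phase_step_deg(delay_step_s, freq_hz):
    """Phase change produced by one delay step at freq_hz."""
    return 360.0 * freq_hz * delay_step_s


if __name__ == "__main__":
    step = delay_resolution_s(2.0e9, 32)       # 500 ps / 32
    worst = phase_step_deg(step, 3.8e9)        # largest at the top of the band
    four_bit = 360.0 / 2 ** 4                  # 4-bit phase shifter step
    print(f"delay step: {step * 1e12:.3f} ps")
    print(f"phase step at 3.8 GHz: {worst:.3f} deg (4-bit shifter: {four_bit} deg)")
```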

C. Sub-Nyquist-Sampled Beamforming
Although bandpass sampling is theoretically valid for arbitrarily high Nyquist zones, the 3 dB cutoff of the RFSoC ADCs used in this testbed constrains the input spectrum to 4 GHz. As outlined in Section I, the beamformer testbed is designed to operate at a sample frequency of 4 GHz and support a 1.6 GHz bandwidth centered in the second Nyquist zone at 3 GHz. Spectrum aliasing due to sub-Nyquist sampling affects the relative phase of received signals, impacting beamformer compensation. This phenomenon is also present when beamforming at complex baseband due to frequency shifting [8]. To determine the necessary phase shift, we consider a monochromatic plane wave $x(t) = \cos(2\pi F t)$ incident on the two-element ULA in Figure 5. The elemental signals are given by $x_n(t) = \cos(2\pi F (t - t_n))$, where $t_1 = 0$ and $t_2 = d \sin\theta / c$, as described in Eq. (1). The sampled signals are given by

$$x_n[m] = \cos\!\left(2\pi f m - 2\pi \frac{f}{T_s} t_n\right), \qquad (6)$$

where $t = m T_s$, $F = f F_s = f / T_s$, and the sample period and sample frequency are given by $T_s$ and $F_s$, respectively. Because sinusoids are $2\pi$ periodic, when the magnitude of the normalized discrete frequency $f$ exceeds 0.5, the time-varying component of the sinusoidal argument aliases to a new normalized frequency $f'$ such that $-0.5 \le f' < 0.5$, as given by

$$x_{a,n}[m] = \cos\!\left(2\pi f' m - 2\pi \frac{f}{T_s} t_n\right). \qquad (7)$$

The constant phase term in Eq. (7) contains the original discrete frequency $f$ rather than the aliased frequency $f'$. To align the two receive tones, the first sensor signal must be digitally delayed by $\Delta t = t_2$. A delay sample index can be defined as $m_d = m - m_0$, where the delay sample offset $m_0 = \Delta t / T_s$ is generally not an integer. Substituting $m_d$ into $x_1[m]$ of Eq. (6) yields

$$x_1[m_d] = \cos\!\left(2\pi f m - 2\pi \frac{f}{T_s} \Delta t\right), \qquad (8)$$

which is coherent with the delayed tone $x_2[m]$ given in Eq. (6). By substituting $m_d$ into the aliased signal $x_{a,1}[m]$ in Eq. (7), as given by

$$x_{a,1}[m_d] = \cos\!\left(2\pi f' m - 2\pi \frac{f'}{T_s} \Delta t\right), \qquad (9)$$

we note that the resulting phase offset differs from that in $x_{a,2}[m]$ in Eq. (7), namely $-2\pi \frac{f}{T_s} t_n$. A correction phase is defined as the difference in phase offsets between Eqs. (7) and (9), as given by

$$\phi_{corr} = -2\pi \frac{f - f'}{T_s} \Delta t = -2\pi\, \Delta F\, \Delta t. \qquad (10)$$

This correction phase depends on the frequency difference $\Delta F = (f - f') F_s$ between the original and aliased frequencies and the propagation delay $\Delta t$ between the current and reference elements. Thus sub-Nyquist-sampled digital beamforming can be implemented via TTD filters and phase shifters. Note that pulse distortion should be considered if this correction phase grows beyond a single cycle.
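The two-element argument above can be checked numerically. The following Python sketch, with hypothetical function names and delays expressed in samples ($m_0 = \Delta t / T_s$), samples a second-Nyquist-zone tone at element 2, delays the aliased element-1 tone by $m_0$, and confirms that the two only align once the correction phase $-2\pi (f - f') m_0$ is added. It assumes an ideal (exact) fractional delay in place of an FIR approximation.

```python
import math


def alias(f_norm):
    """Fold a normalized frequency (cycles/sample) into [-0.5, 0.5)."""
    return f_norm - math.floor(f_norm + 0.5)


def correction_phase(f_norm, delay_samples):
    """Phase offset difference between the original and aliased delayed tones."""
    return -2.0 * math.pi * (f_norm - alias(f_norm)) * delay_samples


def element2_samples(f_norm, delay_samples, count):
    """Samples of the tone received at element 2 (propagation delay t_2)."""
    return [math.cos(2 * math.pi * f_norm * (m - delay_samples))
            for m in range(count)]


def element1_delayed(f_norm, delay_samples, count, corrected=True):
    """Aliased element-1 tone after an ideal digital delay of delay_samples."""
    f_alias = alias(f_norm)
    phi = correction_phase(f_norm, delay_samples) if corrected else 0.0
    return [math.cos(2 * math.pi * f_alias * (m - delay_samples) + phi)
            for m in range(count)]


if __name__ == "__main__":
    f = 3.0e9 / 4.0e9                      # 3 GHz tone sampled at 4 GSPS
    t2 = 0.050 * math.sin(math.radians(30.0)) / 299_792_458.0
    m0 = t2 * 4.0e9                        # delay in samples (non-integer)
    ref = element2_samples(f, m0, 16)
    for flag in (False, True):
        test = element1_delayed(f, m0, 16, corrected=flag)
        err = max(abs(a - b) for a, b in zip(ref, test))
        print("corrected" if flag else "uncorrected", "max error:", err)
```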

III. RESEARCH TESTBED
The wideband beamforming testbed is shown in Figure 6 mounted in the University of Oklahoma's (OU) far-field anechoic chamber. A Xilinx RFSoC is housed within the Pentek 3U virtual path cross-connect (VPX) chassis held at the base of the black high density polyethylene (HDPE) frame that was designed for this project. Eight MMCX-to-SMA cables provide the RF interfaces between the antenna elements and the RFSoC ADC channels for digital beamforming, which is controlled via a Secure Shell (SSH) interface to the PetaLinux operating system (OS) running on the embedded real-time processor. To utilize the network analyzer for chamber measurements, the beamformer output is sourced out of the channel 1 DAC following digital upconversion. The upper chassis frame supports the wideband Vivaldi antenna aperture, shown in further detail in Figure 7. Specifically designed and fabricated at OU for this endeavor, the 8-element horizontally polarized linear array supports better than 2 GHz of bandwidth centered at 3 GHz. A 50 mm pitch yields half-wavelength spacing at the center of the band. The aperture's 66% fractional bandwidth supports the full digital bandwidth of the RFSoC, which utilizes 80% of the Nyquist zone, providing a 1.6 GHz bandwidth at a sample frequency of 4 GHz. To facilitate analog baseline measurements to which the digital RFSoC results may be compared, a narrowband phase-amplitude control (PAC) board is mounted below the aperture. Used for beamsteering experiments throughout OU's Advanced Radar Research Center (ARRC) [30], the PAC board provides the analog narrowband beamsteering measurements given in Section IV. It consists of eight analog channels, each providing 6-bit amplitude and 6-bit phase shifter control, with 0.5 dB and 5.625° resolution, respectively. A USB battery box and Raspberry Pi provide power and control to the PAC board.
The Pentek Quartz Model 5950 is a VPX board that employs our Xilinx Zynq UltraScale+ RFSoC. MMCX RF connectors provide transformer-coupled RF interfaces to the eight 4 GSPS ADCs and eight 6.4 GSPS DACs that are resident on the RFSoC. The full-scale inputs have a maximum of 8 dBm into 50 ohms. The input RF chains were designed to have a 3 dB passband of 10 MHz to 3700 MHz. The board also houses additional resources, such as DDR4 Random Access Memory (RAM), power management, and interleaved ADC calibration circuitry. Interleave calibration is periodically carried out whenever there is no signal present at the input. The Model 5950 board is housed in the Pentek VPX chassis shown in Figure 6, which provides power conversion, cooling, and interface access via the Pentek Rear Transition Module (RTM). The FPGA fabric within the Xilinx RFSoC facilitates extensive customization. The Pentek FPGA Design Kit (FDK) enables convenient set-up and initial operation, incorporating the Xilinx real-time processor, RF data converter IP cores, and the necessary logic for various interfaces to the Pentek hardware including 100 GigE UDP, PCI Express, and DDR4 RAM. Much of this interface logic was subsequently removed as beamformer control was provided via an SSH interface and the beamformer output was measured via a network analyzer out of the channel 1 DAC. Internal channel-channel synchronization was provided by the RF data converter IP core.
The functional block diagram for the custom FPGA image is provided in Figure 8. As discussed in Sections II-B and II-C, sub-Nyquist-sampled TTD beamforming requires an integer-sample delay, fractional-sample delay, and phase shifter for channel compensation. Xilinx's RF data converter IP core provides user control of the ADC and DAC, and subsequently DDC and DUC, resources. The 4 GSPS real-valued data generated by each of the eight ADCs is frequency shifted to baseband and decimated by a factor of two in the corresponding DDC. Because the spectrum of a real-valued signal is conjugate symmetric, decimation by two fully retains the digital bandwidth supported by the 4 GSPS system. The DDC provides an integer-sample delay of up to seven samples as well as phase offset control of the digital mixer's numerically controlled oscillator (NCO). This allows the integer-sample delay and relative phase shift to be implemented through software control of the DDC for single beam applications rather than through custom FPGA firmware. The gain buffer enables the application of various tapers as well as narrowband channel-channel leveling, if desired. Fractional-sample delays are implemented via instantiation of FIR filters. The conjugate-symmetric fractional-sample delay filter is implemented via real-valued coefficients, allowing for independent filtering of the I and Q data streams. Finally, the compensated channel signals are summed in the adder and passed to the DUC, which interpolates by a factor of 2 and frequency shifts the beamformer output to the carrier frequency for transmission via the DAC interface.
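The channel compensation described above (integer-sample delay, fractional-sample delay from a pre-computed bank, and a phase offset) can be illustrated with a short Python sketch that splits a requested delay into those pieces. Everything here is a hedged illustration rather than the testbed's control software: the function names are hypothetical, the bank is assumed to hold 32 filters covering fractional delays from -16/32 to +15/32 sample at 2 GSPS, and the phase term simply evaluates $-2\pi\, \Delta F\, \Delta t$ wrapped to one cycle, with $\Delta F$ supplied by the caller.

```python
import math


def split_delay(delay_s, sample_rate_hz=2.0e9, steps_per_sample=32):
    """Split a delay into (integer samples, fractional bank index).

    The realized delay is (integer + index / steps_per_sample) / sample_rate.
    Assumes an even steps_per_sample and bank indices in [-steps/2, steps/2).
    """
    total = delay_s * sample_rate_hz
    integer = math.floor(total + 0.5)                  # nearest whole sample
    index = math.floor((total - integer) * steps_per_sample + 0.5)
    if index == steps_per_sample // 2:                 # +1/2 sample not in bank
        integer += 1
        index = -(steps_per_sample // 2)
    return integer, index


def wrapped_correction_phase(delay_s, delta_f_hz):
    """Correction phase -2*pi*dF*dt wrapped into [0, 2*pi)."""
    return (-2.0 * math.pi * delta_f_hz * delay_s) % (2.0 * math.pi)


if __name__ == "__main__":
    delay = 0.65e-9                                    # illustrative delay
    whole, idx = split_delay(delay)
    realized = (whole + idx / 32) / 2.0e9
    print("integer samples:", whole, "bank index:", idx)
    print("residual error (ps):", (realized - delay) * 1e12)
    print("phase (rad):", wrapped_correction_phase(delay, 3.0e9))
```

The residual delay error is bounded by half a bank step, which is the quantity that sets the quantization lobe level discussed in Section II-B.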
In order to utilize the full digital bandwidth, the fractional-sample filter bank was designed to process multiple samples in parallel. While the complex baseband spectrum is represented by complex-valued samples at 2 GSPS, the fractional-sample filter bank is designed to operate at the FPGA fabric frequency of 250 MHz. This requires parallel processing of eight samples for both the I and Q data streams for each of the eight channels. Figure 9 shows an interleaved real-valued FIR filter implementation which processes eight samples per clock cycle. To maintain data throughput without further decimation, the filter must compute an output sample for each possible buffer offset, necessitating resources for eight simultaneous filter instantiations. For each clock cycle, data in a given row is shifted eight samples to the right to make room for the next input sample set. The greyed-out samples represent registers which hold current input samples not required for the corresponding output sample but which must be stored for the upcoming clock cycle. This architecture requires eight multipliers per filter tap to produce eight output samples per clock cycle. Two interleaved filter instantiations are required for each of the eight channels, resulting in 128 multipliers per fractional-sample delay filter tap. The fractional-sample delay filter length was set to 17 taps, consuming 2176 DSP units for the fractional-sample delay filter banks. Filter coefficients utilized 16-bit fixed-point precision and were computed for a 1/32-sample resolution. Device utilization is provided in Table I and the design layout is given in Figure 10. The primary resource of interest is DSP utilization, requiring 2176 DSPs for the fractional-sample delay filter banks and 128 DSPs for the gain buffers, enabling efficient taper application. A custom-designed IP core, shown in Figure 11, was developed in C++ using Vivado HLS and contains the fractional-sample delay filter bank and summation node.
Vivado HLS provides an environment which facilitates rapid development of register transfer level (RTL) designs using higher level programming languages such as C and C++. HLS enables the user to control how arrays are managed in memory or registers and dictate how hardware resources are allocated when implementing functional operations, such as parallelizing loop iterations and pipelining data flow. As discussed above, parallel resource allocation is necessary to take advantage of the full digital bandwidth. A custom C program executes within the embedded processor which memory maps the beamformer IP core, computes the necessary steering commands for the prescribed direction, and provides control data to the DDC, gain buffers, and fractional-sample delay filter bank.
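As a behavioral illustration of the eight-sample-per-clock FIR structure and its multiplier budget, the Python sketch below processes the input in blocks of eight while retaining the previous samples as history, and checks that the result equals an ordinary sample-by-sample FIR. It is a software model under stated assumptions, not the HLS source: the function names are hypothetical, and the multiplier count assumes one multiplier per tap per parallel lane, two real streams (I and Q) per channel, and no exploitation of coefficient symmetry.

```python
def fir_direct(x, h):
    """Reference FIR: y[m] = sum_k h[k] x[m - k], zero initial state."""
    return [sum(h[k] * x[m - k] for k in range(len(h)) if m - k >= 0)
            for m in range(len(x))]


def fir_block_parallel(x, h, lanes=8):
    """FIR computed one block of `lanes` samples per simulated clock cycle."""
    history = [0.0] * (len(h) - 1)        # samples kept from earlier cycles
    out = []
    for start in range(0, len(x), lanes):
        block = x[start:start + lanes]
        window = history + block
        # In hardware, each lane is its own multiply-accumulate chain.
        for lane in range(len(block)):
            newest = len(history) + lane
            out.append(sum(h[k] * window[newest - k] for k in range(len(h))))
        history = window[len(window) - (len(h) - 1):]
    return out


def multiplier_count(taps, lanes=8, streams_per_channel=2, channels=8):
    """Multipliers needed when every tap of every lane has its own multiplier."""
    return taps * lanes * streams_per_channel * channels


if __name__ == "__main__":
    taps = [0.1 * (k + 1) for k in range(17)]         # arbitrary 17-tap filter
    data = [float((7 * m) % 5 - 2) for m in range(40)]
    a = fir_direct(data, taps)
    b = fir_block_parallel(data, taps)
    print("max difference:", max(abs(p - q) for p, q in zip(a, b)))
    print("multipliers:", multiplier_count(17))
```

The count returned for 17 taps matches the DSP figure quoted for the fractional-sample delay filter banks.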

IV. RESULTS
In this section, we present simulated and measured results of beamformer performance. Section IV-A shows simulation results for ideal and measured channel waveforms. Following the completion of bench testing, far-field chamber OTA measurements were captured for several steering locations for both narrowband and wideband operating modes. These are presented and discussed in Section IV-B.

A. Bench-Top Loopback Tests
System modeling allowed comparisons of various performance parameters, such as fractional-sample filter lengths, fixed-point resolutions, and fractional-sample resolutions. Bit accuracy was incorporated to aid in the HLS development of beamformer firmware, allowing for the generation of test vectors for use with the HLS testbench. A chirp waveform, commonly used in radar systems, spanning 80% of the Nyquist zone was externally looped back over-the-wire to each ADC channel through an 8-to-1 splitter, allowing for simultaneous channel frequency response characterization. This captured data was fed into the system model to estimate the measured performance of the RFSoC wideband beamformer. Figure 12 shows a comparison of simulation results when sourced with ideal channel data and measured channel data. As the measured channel data lacks equalization, namely wideband magnitude and phase alignment, the measured channel data provides an expectation of the uncalibrated beamformer response.

B. Over-the-Air Results
OTA results were measured in OU's far-field anechoic chamber, shown previously in Figure 6. Wideband measurements were captured for three test cases: narrowband beamsteering using analog phase shifters on OU's PAC board, narrowband beamsteering using digital phase shifters on the RFSoC, and wideband beamsteering using digital TTD and phase shifters on the RFSoC. For each operating mode, the main beam was steered to 15°, 30°, and 45° at the center frequency of 3 GHz. Given the $\lambda/2$ aperture spacing at 3 GHz, a grating lobe can be seen in the wideband beamsteering results at the high end of the 2 to 4 GHz measurement bandwidth. Measurements were captured for $-60° \le \theta \le 60°$. Note that the following figures employ the same intensity scale as is given in Figure 12.
Analog narrowband beamsteering results acquired using OU's PAC board, shown in Figure 13, provide a baseline for digital results measured on the RFSoC. Immediately apparent is the beam squint due to the large fractional bandwidth of the incident signal, which becomes more pronounced with steering angle. The 3 dB beamwidth ranges from 17° to 10° over the signal bandwidth while the sidelobe level is about 11 dB. The RFSoC digital beamformer implementation supports narrowband beamsteering via phase control of the digital mixer's NCO. These results are shown in Figure 14 with a 3 dB beamwidth from 17° to 11° over the signal bandwidth and an average sidelobe level around 9 dB. The main beam position and shape show strong alignment between the analog and digital narrowband beamsteering results, although the less definitive nulls and degraded sidelobe levels shown in Figure 14 indicate stronger channel-channel magnitude and phase alignment in the PAC board. As wideband equalization on the RFSoC is not part of this paper, frequency-variant channel-channel magnitude and phase errors degrade sidelobe performance in the digital beamforming results.
Wideband digital beamsteering results are provided in Figure 15. Mainbeam angle accuracy maintained throughout the signal bandwidth indicates effective implementation of the digital TTD and phase shifters required to mitigate beam squint over the large fractional bandwidth at complex baseband. The 3 dB beamwidth spans from 16° to 10° over the signal bandwidth while the average sidelobe level is about 9 dB. Bench-top measured channel data incorporated into system simulations, provided in Figure 16, show reasonable agreement with Figure 15. Similar sidelobe patterns indicate that effective wideband equalization will improve sidelobe performance. We have described an approach to equalization for wideband systems in a recent paper [31], while the paper at hand concentrates on the wideband digital beamforming method.

V. CONCLUSION
The RFSoC is a recent, state-of-the-art, highly integrated device that incorporates an FPGA, an SoC architecture, and high-speed ADCs and DACs operating at gigahertz speeds onto a monolithic device. In this paper, we have demonstrated the operation of an 8-element fully-digital sub-Nyquist-sampled array utilizing an RFSoC that implements digital TTD to achieve wideband digital beamforming. Array elements directly sample, at 4 GSPS, a 1.6 GHz linear chirp signal centered at 3 GHz, which folds into the first Nyquist zone upon sampling. As our theoretical equations show, signal compensation supporting the entire digital bandwidth is then applied at complex baseband following the digital downconverter. It is worth noting that the substantial resource requirements for the fractional-sample delay filter bank are dependent on several parameters: IBW, filter length, fractional-sample delay resolution, and fixed-point resolution. The authors sought to implement a beamformer which took full advantage of the available IBW while achieving a fractional-sample delay resolution at least equivalent to that of a 4-bit phase shifter. Sub-Nyquist-sampled beamforming enables direct sampling below the Nyquist frequency while maintaining waveform bandwidth. This simplifies system design by mitigating the need for traditional analog downconversion circuitry while allowing for lower sample rates, and thus lower power consumption. We acknowledge that for practical implementations, a multichannel matching circuit, front-end amplification, and analog filtering would be best suited between the antenna and ADCs. Most importantly, we have confirmed that beam squint has been avoided. Given these concepts, this paper contributes the development and demonstration of a wideband digital beamformer on an RFSoC. A prototype system and chamber measurements confirm the efficacy of our approach.