Jammer Mitigation via Beam-Slicing for Low-Resolution mmWave Massive MU-MIMO

Millimeter-wave (mmWave) massive multi-user multiple-input multiple-output (MU-MIMO) promises unprecedented data rates for next-generation wireless systems. To be practically viable, mmWave massive MU-MIMO basestations (BSs) must rely on low-resolution data converters, which leaves them vulnerable to jammer interference. This paper proposes beam-slicing, a method that mitigates the impact of a permanently transmitting jammer during uplink transmission for BSs equipped with low-resolution analog-to-digital converters (ADCs). Beam-slicing is a localized analog spatial transform that focuses the jammer energy onto a few ADCs, so that the transmitted data can be recovered based on the outputs of the interference-free ADCs. We demonstrate the efficacy of beam-slicing in combination with two digital jammer-mitigating data detectors: SNIPS and CHOPS. Soft-Nulling of Interferers with Partitions in Space (SNIPS) combines beam-slicing with a soft-nulling data detector that exploits knowledge of the ADC contamination; projeCtion onto ortHOgonal complement with Partitions in Space (CHOPS) combines beam-slicing with a linear projection that removes all signal components co-linear to an estimate of the jammer channel. Our results show that beam-slicing enables SNIPS and CHOPS to successfully serve 65% of the user equipments (UEs) for scenarios in which their antenna-domain counterparts that lack beam-slicing are only able to serve 2% of the UEs.


I. INTRODUCTION
Next-generation wireless communication systems are expected to rely on the vast, unused bandwidth available at millimeter-wave (mmWave) frequencies in order to meet the ever-growing demand for higher data rates. Communication at mmWave frequencies is characterized by a high path loss that can be compensated for with massive multiple-input multiple-output (MIMO) technology [2]. Besides providing the basestation (BS) with a high array gain, massive MIMO also enables multi-user (MU) communication [3].
The deployment of a BS equipped with a large number of antennas and corresponding radio-frequency (RF) chains poses implementation challenges in terms of system costs, power consumption, and circuit complexity. A potential solution that addresses these challenges is to use low-resolution data converters that (i) reduce power consumption of data conversion and (ii) relax the linearity and noise requirements of the RF chains, which, in turn, also translates into power consumption and circuit complexity savings [4], [5].
Unfortunately, the use of low-resolution analog-to-digital converters (ADCs) leaves the BS vulnerable to jammers that could be introduced, for example, by a rogue user equipment (UE) or a malicious transmitter. Previous works [6]-[15] have analyzed the impact of different types of jamming attacks on massive MU-MIMO systems and proposed mitigation methods based on digital equalization. However, none of these works takes into consideration the compounding challenge of low-resolution data conversion: A jammer can either saturate low-resolution ADCs or (if gain control is used) widen their quantization range, which inevitably drowns the useful signals in quantization noise. Both of these effects introduce distortions that are difficult to remove with subsequent digital processing.

A conference version of this paper introducing beam-slicing and SNIPS has been presented at the IEEE International Workshop on Signal Processing Systems (SiPS) 2021 [1]. The present manuscript extends our work in [1] by proposing CHOPS and comparing it to SNIPS, analyzing the performance of SNIPS and CHOPS for non-line-of-sight channels, and performing ablation studies that explain design choices when deploying beam-slicing.

* GM and OC contributed equally to this work. GM, OC, and CS are with the Department of Information Technology and Electrical Engineering, ETH Zürich, Switzerland; e-mail: gimarti@ethz.ch, caoscar@ethz.ch, and studer@ethz.ch. The work of OC and CS was supported in part by ComSenTer, one of six centers in JUMP, an SRC program sponsored by DARPA. The work of CS was also supported by an ETH Research Grant and by the US National Science Foundation (NSF) under grants CNS-1717559 and ECCS-1824379.
For illustration, we consider the efficacy of two classical methods for jammer mitigation: projection onto the orthogonal subspace (POS) [6], [16], and linear minimum mean-square error (LMMSE) equalization that treats jammer interference as noise (IAN) [17]. We consider the uncoded bit error rate (BER) of these mitigation methods both for infinite-resolution ADCs and for 4-bit ADCs. Both of these settings consider 32 single-antenna UEs transmitting 16-QAM symbols to a BS equipped with 256 antennas under line-of-sight (LoS) conditions, and we assume a permanently transmitting jammer with a power 25 dB stronger than that of the average UE.

In Figure 1(a), we consider infinite-resolution ADCs, where we compare POS and IAN with unmitigated communication, as well as with a baseline case in which no jammer is present. All of these methods estimate the channel with least squares (LS) from an orthogonal pilot sequence and then perform LMMSE equalization. In order to mitigate the effects of the jammer, POS projects both the estimated channel matrix and the receive symbol vector onto the (B−1)-dimensional subspace orthogonal to the (genie-provided) jammer channel before performing LMMSE equalization. In contrast, IAN uses the (genie-provided) interference covariance matrix to equalize (or "soft-null") the jammer directly in the LMMSE step, in which the jammer interference is treated as spatially correlated noise. We see that, at least when the jammer channel is perfectly known, both POS and IAN achieve almost perfect jammer removal, and the BER performance is virtually identical to the case without jammer.

In Figure 1(b), we instead consider the behavior when taking into account the quantization artifacts introduced by gain-controlled finite-resolution (4-bit) ADCs. In this setting, both POS and IAN suffer an error floor as high as 2% BER, even when furnished with perfect knowledge of the jammer channel. Since such knowledge would not be obtainable with low-resolution ADCs, the true performance of these jammer mitigation methods is likely worse. The reason for this performance deterioration is that the ADC inputs are dominated by jammer interference, causing their quantization range to widen and drown the UE signals in quantization noise, which cannot be removed with linear digital processing.

arXiv:2109.02502v1 [cs.IT] 6 Sep 2021

A. Contributions
In this work, we develop a practical method that mitigates strong jamming attacks on mmWave massive MU-MIMO systems with low-resolution ADCs at the BS. We show that effective jammer mitigation with digital linear equalization is possible only when the ADC resolution is sufficiently high (e.g., 8 bits for a 25 dB jammer), even when taking into account the nonlinear distortions caused by the ADCs. However, practical deployments of massive MU-MIMO are likely to rely on low-resolution ADCs, in which case the jammer will widen the ADCs' quantization range and thereby drown the UE signals in quantization noise. In order to enable jammer-robust communication also with low-resolution ADCs, we propose a novel technique that we call beam-slicing: a non-adaptive, localized spatial transform which precedes data conversion. We then propose two different beam-slicing-based and quantization-aware methods for jammer mitigation, Soft-Nulling of Interferers with Partitions in Space (SNIPS) and projeCtion onto ortHOgonal complement with Partitions in Space (CHOPS), which are essentially the beam-slicing counterparts of IAN and POS from Figure 1. We use simulation results for realistic LoS and non-LoS mmWave channels to demonstrate that beam-slicing enables SNIPS and CHOPS to mitigate the adversarial impact of strong jammers on low-resolution mmWave MU-MIMO systems far more effectively than conventional jammer mitigation schemes that directly operate in antenna space. Finally, we justify the design choices in our construction of beam-slicing through ablation studies.

B. Related Prior Work
Several works have studied means to improve the resiliency of MIMO systems against jamming attacks. These works have considered different attacks, such as constant jamming attacks [6], [7], in which the jammer is permanently transmitting, as well as other types of attacks in which the jammer transmits only at specific time instances, e.g., when the UEs transmit [6], or during pilot transmission [8]. Moreover, given the complexity of the jammer problem, some works have devoted themselves only to detecting the presence of a jammer [8], [9], while other works have proposed methods to suppress the impact of the detected jammer [6], [7], [10]-[15].
In this work, we focus on mitigating the interference of a permanently transmitting jammer. We now describe existing approaches that deal with such jammer interference. Reference [6] proposes a method for small-scale MIMO systems that uses the angle-of-arrival of the jammer interference to project the receive vector onto its orthogonal subspace. Also for small-scale MIMO, reference [7] proposes a method that uses differential encoding and exploits the ratio between channel coefficients. In the context of massive MIMO, reference [10] uses random matrix theory to estimate the UEs' eigensubspace to then project the received signals onto that subspace, while [11] proposes methods which require perfect channel state information and cooperation between the UEs and the BS. References [12], [13] use an estimate of the jammer channel to implement different versions of a jammer-robust zero-forcing detector.
Similarly to our work, references [14], [15] propose to exploit spatially correlated channels to suppress jammer interference. In particular, the work in [15] applies a beamspace transform [18], i.e., a spatial discrete Fourier transform (DFT), to the BS receive signal in order to separate the jammer from the UEs in the angular domain. The beamspace transform is closely related to the concept of beam-slicing proposed in this work. In fact, the beamspace transform (and even the absence of any spatial transform, i.e., the antenna domain) can be formulated as a special case of beam-slicing, and one can think of beam-slicing as a more general beamspace transform with adjustable angular resolution. The key advantage of beam-slicing over the beamspace transform is that beam-slicing is composed of localized transforms that only take inputs from a few adjacent antennas, making it more amenable to analog circuit implementation [19].
The difficulty of implementing large analog spatial transforms (specifically, large DFTs) can be illustrated with the example of a Butler matrix. A Butler matrix is a passive, bidirectional beamforming network consisting of hybrid couplers and fixed phase shifters, and is capable of simultaneously generating multiple beams for antenna arrays [20], [21]. To support multibeamforming, the Butler matrix implements a DFT in a structure analogous to that of the fast Fourier transform (FFT) [22], [23]. This efficient topology as well as the reduced component count (compared to the alternative Blass [24] and Nolen [25] matrices) makes the Butler matrix one of the most prominent circuit-based, analog DFT implementations [23], [26]. Nevertheless, the implementation of large Butler matrices remains impractical. Specifically, their implementation is hindered by the more complex routing [23], [26] and the need for tighter manufacturing tolerances [27]. Another issue in implementing large Butler matrices is an increase in insertion loss [23], [26], [28]. These obstacles are reflected by the fact that, to the best of our knowledge, the largest Butler matrices reported in the open literature are for 16 antenna elements [29]-[31]. In contrast, our proposed beam-slicing methods can be implemented with small Butler matrices that transform signals from 4 or 8 antennas.
Finally, we note that most existing works on jammer mitigation have not considered the effects of hardware impairments, with the exception of [14]. Reference [14] models the effects of hardware impairments (including quantization errors) as additive Gaussian noise, hence failing to model the signal- and jammer-dependent distortions introduced by coarse analog-to-digital (A/D) conversion. In stark contrast, our system simulations explicitly model the effects of low-resolution quantization, whilst our beam-slicing methods, SNIPS and CHOPS, take into consideration such coarse quantization by using Bussgang's decomposition [32], [33].

C. Notation
Matrices and column vectors are represented by boldface uppercase and lowercase letters, respectively. For a matrix $\mathbf{A}$, the conjugate transpose is $\mathbf{A}^H$, the $k$th column is $\mathbf{a}_k$, the Frobenius norm is $\|\mathbf{A}\|_F$, and the trace is $\mathrm{tr}(\mathbf{A})$. The $N \times N$ identity and DFT matrices are $\mathbf{I}_N$ and $\mathbf{F}_N$, respectively, where $\mathbf{F}_N^H \mathbf{F}_N = \mathbf{I}_N$. For a vector $\mathbf{a}$, the $k$th entry is $a_k$, the $\ell_2$-norm is $\|\mathbf{a}\|_2$, the real part is $\Re\{\mathbf{a}\}$, and the imaginary part is $\Im\{\mathbf{a}\}$. The all-zeros matrix is denoted by $\mathbf{0}$. Moreover, $\mathrm{diag}(\mathbf{a})$ is a diagonal matrix whose diagonal is formed by the entries of $\mathbf{a}$. Expectation with respect to the random vector $\mathbf{x}$ is denoted by $\mathbb{E}_{\mathbf{x}}[\cdot]$. The floor function $\lfloor x \rfloor$ returns the greatest integer less than or equal to $x$. We define $i^2 = -1$.

II. PROPAGATION MODEL
We consider the uplink of a mmWave massive MU-MIMO system in which $U$ single-antenna UEs transmit data to a $B$-antenna BS, while a permanently transmitting, single-antenna jammer interferes with the BS receive signal. For this scenario, we consider the following frequency-flat input-output relation:

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{j}w + \mathbf{n}. \tag{1}$$

Here, $\mathbf{y} \in \mathbb{C}^B$ is the (unquantized) vector received by the BS antennas, $\mathbf{H} \in \mathbb{C}^{B \times U}$ models the MIMO uplink channel matrix, $\mathbf{s} \in \mathcal{S}^U$ is the transmit vector whose (independent) entries correspond to the per-UE transmit symbols, which take values in a constellation set $\mathcal{S}$ (e.g., 16-QAM), $\mathbf{j} \in \mathbb{C}^B$ is the channel vector from the jammer to the BS, $w \in \mathbb{C}$ is the jamming signal, and $\mathbf{n} \in \mathbb{C}^B$ is i.i.d. circularly-symmetric complex Gaussian noise. We assume that the UE transmit symbols $s_u$, $u = 1, \dots, U$, are independent and circularly-symmetric with variance $E_s$, so that $\mathbb{E}_{\mathbf{s}}[\mathbf{s}\mathbf{s}^H] = E_s \mathbf{I}_U$. We model the jamming signal $w$ as circularly-symmetric complex Gaussian with variance $E_w$. All probabilistic quantities are assumed to be mutually independent.
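To make the input-output relation concrete, the following minimal NumPy sketch draws one realization of the receive vector. The helper names (`qam16`, `cscg`, `simulate_uplink`), the default values, and the symbol `N0` for the per-entry noise variance are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def qam16(Es=1.0):
    """16-QAM alphabet scaled so that the average symbol energy equals Es."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points * np.sqrt(Es / np.mean(np.abs(points) ** 2))

def cscg(shape, variance, rng):
    """Circularly-symmetric complex Gaussian samples with the given variance."""
    return np.sqrt(variance / 2) * (rng.standard_normal(shape)
                                    + 1j * rng.standard_normal(shape))

def simulate_uplink(H, j, Es=1.0, Ew=1.0, N0=0.1, rng=None):
    """Draw one realization of y = H s + j w + n (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    B, U = H.shape
    s = rng.choice(qam16(Es), size=U)  # independent per-UE symbols
    w = cscg((), Ew, rng)              # scalar jamming symbol with variance Ew
    n = cscg(B, N0, rng)               # noise; the variance N0 is an assumed symbol
    return H @ s + j * w + n, s, w, n
```

The channel matrix `H` and jammer channel `j` are passed in, so any channel model can be plugged in without changing the sketch.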

III. INTERFERENCE-REMOVAL WITH PARTITIONS IN SPACE
Our approach aims to protect most of the ADCs from the jammer by exploiting the strong spatial directivity of mmWave signals. Prior to A/D-conversion, we apply a spatial transform that resolves the incident waves so that only a few ADCs are strongly affected by the jammer. One can then discount the outputs of these jammer-distorted ADCs during equalization and detect the data symbols mainly based on the outputs of the distortion-free ADCs.
A naïve approach would be to transform the (unquantized) receive vector $\mathbf{y}$ to the beamspace (or angular) domain [18] using a DFT according to $\mathbf{y}_B = \mathbf{F}_B \mathbf{y}$, and to set the entries of the beamspace vector $\mathbf{y}_B$ dominated by jammer interference to zero. Accordingly, the resulting vector $\mathbf{y}_{B,\text{mask}}$ will have entries equal to zero if they belong to jammer-contaminated beams, and otherwise equal to the corresponding entry in $\mathbf{y}_B$. One could then equalize $\mathbf{y}_{B,\text{mask}}$ as if neither interference nor interference cancellation had occurred, for instance with LMMSE estimation. While such an approach would be effective in suppressing jammer interference, implementing large analog spatial transforms, such as the DFT, is nontrivial (cf. Section I-B), especially when considering hundreds of BS antennas [19], [23], [26]-[28].
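The naïve beamspace masking idea can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function name `beamspace_mask` and the parameter `num_masked` are hypothetical, and the jammer channel `j` is assumed to be known (genie-aided) in order to pick the contaminated beams.

```python
import numpy as np

def beamspace_mask(y, j, num_masked=2):
    """Transform y to beamspace and zero the beams that carry the most jammer energy."""
    B = len(y)
    k = np.arange(B)
    F = np.exp(-2j * np.pi * np.outer(k, k) / B) / np.sqrt(B)  # unitary B-point DFT
    y_B, j_B = F @ y, F @ j
    masked = np.argsort(np.abs(j_B))[-num_masked:]             # strongest jammer beams
    y_mask = y_B.copy()
    y_mask[masked] = 0
    return y_mask, masked
```

For a jammer whose array response coincides with a DFT bin, a single masked beam removes all of its energy; off-grid jammers leak into neighboring beams, so more beams would have to be masked.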
As a practical alternative, we propose beam-slicing, a distributed, localized, and hence small analog spatial transform that can be implemented in practice. We also propose two jammer mitigation methods based on beam-slicing, SNIPS and CHOPS. SNIPS and CHOPS do not discard the outputs of jammer-affected ADCs completely, but instead take into account each ADC-output's fidelity by estimating the amount of jammer interference at the individual ADCs. Moreover, SNIPS and CHOPS utilize Bussgang's decomposition in order to take into account the effects of practical, low-resolution ADCs. The only difference between SNIPS and CHOPS lies in how they mitigate jammer interference: SNIPS performs LMMSE equalization where the jammer interference is treated as noise; CHOPS projects the receive signal (and channel estimate) onto the subspace orthogonal to the jammer channel before applying a conventional LMMSE equalizer.
In the following discussion of these two methods, SNIPS and CHOPS differ only in Section III-F. The rest of their jammer-mitigation pipeline, illustrated by Figure 2, is identical.

A. Beam-Slicing
Beam-slicing transforms partitions of the (unquantized) receive vector $\mathbf{y}$ into a "rotated" angular domain. Beam-slicing is fully analog, non-adaptive, and operates in a decentralized fashion. Specifically, we partition the BS antenna array into $C$ clusters of equal size, each consisting of $S = B/C$ adjacent BS antennas. The corresponding partition of the receive vector is denoted $\mathbf{y} = [\mathbf{y}_1^T, \dots, \mathbf{y}_C^T]^T$, where $\mathbf{y}_c \in \mathbb{C}^S$, $c = 1, \dots, C$. Beam-slicing then transforms the receive clusters into what we call the beam-slice domain as follows:

$$\hat{\mathbf{y}}_c = \mathbf{V}_c \mathbf{y}_c, \quad c = 1, \dots, C.$$

Here, the $c$th cluster matrix $\mathbf{V}_c$ is a progressively phase-shifted $S$-point DFT matrix $\mathbf{F}_S$, so that the different clusters transform to successively "rotated" angular domains. Such phase-rotated DFTs are used to increase the "angular diversity" of beam-slicing to better capture the possible directions of jammers; see Figure 3 for a graphical explanation, as well as Section IV-D for an empirical evaluation. The action of the beam-slicer is summarized as

$$\hat{\mathbf{y}} = \mathbf{V}\mathbf{y},$$

where $\mathbf{V} = \mathrm{diag}(\mathbf{V}_1, \dots, \mathbf{V}_C)$ and $\mathbf{V}^H \mathbf{V} = \mathbf{I}_B$. We also point out that for an (impractical) cluster size $S = B$, beam-slicing corresponds to performing a conventional beamspace transform. In what follows, it will be convenient to define the beam-sliced channel matrix $\hat{\mathbf{H}} = \mathbf{V}\mathbf{H}$ and the beam-sliced jammer channel $\hat{\mathbf{j}} = \mathbf{V}\mathbf{j}$, which allows us to rewrite (1) as $\hat{\mathbf{y}} = \hat{\mathbf{H}}\mathbf{s} + \hat{\mathbf{j}}w + \hat{\mathbf{n}}$, where $\hat{\mathbf{n}} = \mathbf{V}\mathbf{n}$ has the same distribution as $\mathbf{n}$.
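The block-diagonal structure of the beam-slicer can be sketched as follows. The exact per-cluster phase rotation is not reproduced here, so the sketch exposes it as a hypothetical parameter `phase_step` (with `phase_step=0`, every cluster uses a plain unitary DFT); the function name is likewise an assumption.

```python
import numpy as np

def beam_slicing_matrix(B, S, phase_step=0.0):
    """Block-diagonal V = diag(V_1, ..., V_C) built from unitary S-point DFTs.

    phase_step is a hypothetical knob: cluster c (0-based) applies the phase
    ramp exp(1j * c * phase_step * k), k = 0..S-1, before the DFT. The actual
    rotation used for beam-slicing may differ.
    """
    if B % S != 0:
        raise ValueError("cluster size S must divide the number of antennas B")
    C = B // S
    k = np.arange(S)
    F = np.exp(-2j * np.pi * np.outer(k, k) / S) / np.sqrt(S)  # unitary S-point DFT
    V = np.zeros((B, B), dtype=complex)
    for c in range(C):
        ramp = np.exp(1j * c * phase_step * k)
        V[c * S:(c + 1) * S, c * S:(c + 1) * S] = F * ramp[None, :]  # F_S diag(ramp)
    return V
```

Because each block is a product of unitary matrices, `V` is unitary for any `phase_step`. With `S = 1` the sketch returns the identity (antenna domain), and with `S = B` and `phase_step = 0` it returns the full DFT (beamspace transform).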

B. Data Conversion
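The beam-sliced signal $\hat{\mathbf{y}}$ is then converted into the digital domain. To take into account the quantization errors of low-resolution ADCs, we assume that the beam-sliced vector $\hat{\mathbf{y}}$ is quantized as follows:

$$\mathbf{r} = \mathcal{Q}(\Re\{\mathbf{G}\hat{\mathbf{y}}\}) + i\,\mathcal{Q}(\Im\{\mathbf{G}\hat{\mathbf{y}}\}). \tag{6}$$

Here, $\mathbf{G} = \mathrm{diag}(g_1, \dots, g_B)$ is a diagonal matrix that represents beam-wise gain control. The quantization function $\mathcal{Q}(\cdot)$ is applied entry-wise to its input and represents a $q$-bit uniform midrise quantizer with step size $\Delta$ defined as

$$\mathcal{Q}(x) = \begin{cases} \Delta \lfloor x/\Delta \rfloor + \frac{\Delta}{2}, & |x| < \Delta 2^{q-1}, \\ \frac{\Delta}{2}(2^q - 1)\,\frac{x}{|x|}, & \text{otherwise}. \end{cases} \tag{7}$$

For the quantizer's step size $\Delta$, we use the value which minimizes the mean-square error (MSE) between the quantizer's output $\mathcal{Q}(x)$ and its input $x$ under the assumption that $x$ is Gaussian with zero mean and unit variance [34]. For convenience, we will denote (6) as

$$\mathbf{r} = \mathcal{Q}(\mathbf{G}\hat{\mathbf{y}}). \tag{8}$$

The per-beam gains aim to ensure that the values entering the quantizers have unit variance per real dimension and are obtained from a set of $T$ training vectors $\tilde{\mathbf{Y}} = [\tilde{\mathbf{y}}_1, \dots, \tilde{\mathbf{y}}_T]$ as

$$g_b = \frac{\sqrt{2T}}{\|\tilde{\mathbf{y}}_{(b)}\|_2}, \quad b = 1, \dots, B, \tag{9}$$

where $\tilde{\mathbf{y}}_{(b)}$ is the $b$th row of $\tilde{\mathbf{Y}}$.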
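The data-conversion step can be sketched as below, assuming a standard clipped midrise quantizer and a gain rule that normalizes each beam to unit variance per real dimension. The function names are hypothetical, and the step size `delta` is left as an argument because the MSE-optimal value from [34] is not reproduced here.

```python
import numpy as np

def midrise_quantize(x, q, delta):
    """q-bit uniform midrise quantizer for real inputs, with clipping."""
    idx = np.floor(np.asarray(x, dtype=float) / delta)
    idx = np.clip(idx, -2 ** (q - 1), 2 ** (q - 1) - 1)  # keep 2^q output levels
    return delta * idx + delta / 2

def quantize_complex(z, q, delta):
    """Quantize the real and imaginary parts separately."""
    z = np.asarray(z)
    return midrise_quantize(z.real, q, delta) + 1j * midrise_quantize(z.imag, q, delta)

def gain_control(Y_train):
    """Per-beam gains from a B x T training matrix (unit variance per real dimension)."""
    T = Y_train.shape[1]
    return np.sqrt(2 * T) / np.linalg.norm(Y_train, axis=1)
```

A quantized observation would then be obtained as `quantize_complex(g[:, None] * Y_hat, q, delta)` for a beam-sliced receive matrix `Y_hat`. A beam with strong jammer energy receives a small gain, which is exactly the effect that widens the effective quantization range.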

C. Jammer Interference Estimation
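Our interference-mitigating data detection schemes (see below) rely on knowledge of the statistics of the jammer's interference. Specifically, these methods need to know the covariance matrix $\mathbf{C}_j = \mathbb{E}_w[\hat{\mathbf{j}}w(\hat{\mathbf{j}}w)^H]$. We suggest estimating $\mathbf{C}_j$ from a number of channel uses during which the UEs are not transmitting and where the jammer transmits i.i.d. jamming symbols $\mathbf{w}^T = [w_1, \dots, w_N]$, so the beam-sliced receive matrix and corresponding quantization output can be modeled as

$$\hat{\mathbf{Y}}_J = \hat{\mathbf{j}}\mathbf{w}^T + \hat{\mathbf{N}}, \qquad \mathbf{R}_J = \mathcal{Q}(\mathbf{G}\hat{\mathbf{Y}}_J),$$

where $\hat{\mathbf{N}}$ denotes the beam-sliced noise. In order to learn the jammer channel, we propose to estimate the gain matrix $\mathbf{G}$ with (9) directly from the received signals, $\mathbf{G} = \mathbf{G}(\hat{\mathbf{Y}}_J)$. From $\mathbf{R}_J$ and $\mathbf{G}$, we then compute our estimate $\boldsymbol{\Lambda}$ of the covariance matrix $\mathbf{C}_j$, which we refer to as (12).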
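One plausible way to form such an estimate is the sample covariance of the gain-compensated jammer-only outputs. The sketch below is an assumption for illustration, not necessarily the estimator used in the paper; the function name is hypothetical, and quantization distortion is ignored, as in the text.

```python
import numpy as np

def estimate_jammer_covariance(R_J, g):
    """Sample covariance of the gain-compensated jammer-only outputs (B x N)."""
    Z = R_J / g[:, None]                  # undo gain control row-wise: G^{-1} R_J
    return Z @ Z.conj().T / R_J.shape[1]  # average outer product over N channel uses
```

With a strong jammer, this matrix is dominated by a rank-one term aligned with the beam-sliced jammer channel, which is what the detectors exploit.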

D. Channel Estimation
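We estimate the UEs' channel matrix using a pilot-based LS estimator from $U$ orthogonal pilot sequences $\mathbf{S}_P = [\mathbf{s}_1, \dots, \mathbf{s}_U]$. The channel estimation pipeline passes through the beam-slicer and the quantizer. The beam-sliced receive matrix and corresponding quantization are modeled as

$$\hat{\mathbf{Y}}_P = \hat{\mathbf{H}}\mathbf{S}_P + \hat{\mathbf{j}}\mathbf{w}_P^T + \hat{\mathbf{N}}_P, \qquad \mathbf{R}_P = \mathcal{Q}(\mathbf{G}\hat{\mathbf{Y}}_P),$$

where $\mathbf{w}_P$ and $\hat{\mathbf{N}}_P$ collect the jamming symbols and the beam-sliced noise during the pilot phase, and where we estimate the gain matrix $\mathbf{G}$ with (9) from the pilot sequence itself, $\tilde{\mathbf{Y}} = \hat{\mathbf{Y}}_P$. (We fix this choice of $\mathbf{G} = \mathbf{G}(\hat{\mathbf{Y}}_P)$ also for the data detection phase described below.) We then estimate the beam-sliced channel matrix with the LS estimate (14), denoted $\hat{\mathbf{H}}_{\text{est}}$, whose expression simplifies because the pilot sequence is orthogonal.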
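The LS step with orthogonal pilots can be sketched as follows. The scaled-DFT pilot matrix, its normalization, and the function names are assumptions chosen for illustration; quantization distortion is ignored, as in the text.

```python
import numpy as np

def orthogonal_pilots(U, Es=1.0):
    """U x U pilot matrix with S_P S_P^H = U * Es * I (scaled DFT, an assumed choice)."""
    k = np.arange(U)
    return np.sqrt(Es) * np.exp(-2j * np.pi * np.outer(k, k) / U)

def ls_channel_estimate(R_P, g, S_P):
    """LS channel estimate from gain-compensated pilot outputs."""
    Z = R_P / g[:, None]                   # G^{-1} R_P
    scale = np.real((S_P @ S_P.conj().T)[0, 0])
    # For orthogonal pilots, (S_P S_P^H)^{-1} reduces to the scalar 1 / scale.
    return Z @ S_P.conj().T / scale
```

In the noiseless, jammer-free, unquantized case the estimate recovers the channel exactly; a jammer that transmits during the pilot phase contaminates it.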

E. Bussgang Analysis
So far (i.e., for jammer covariance estimation and channel estimation), we have neglected the distortion introduced by the quantization step in (6)-(8). We do not, however, neglect this distortion during the data detection step. The quantization step introduces distortions which are correlated with the quantizer inputs. We assume that the components of the quantizer inputs are real-valued Gaussian with zero mean and unit variance. This assumption allows us to perform a component-wise Bussgang decomposition [32] of the quantized signal as follows:

$$\mathcal{Q}(x) = \gamma x + d.$$

Here, $\gamma$ is the quantizer's Bussgang gain, and the distortion $d$ has zero mean and is uncorrelated with $x$. The Bussgang gain is $\gamma = \mathbb{E}[x\,\mathcal{Q}(x)]$, for which a closed-form expression is given in [33, Eq. (9)], and the variance of the distortion $d$ is $\mathbb{E}[\mathcal{Q}(x)^2] - \gamma^2$. Bussgang's decomposition allows us to rewrite (8) as

$$\mathbf{r} = \gamma\mathbf{G}\hat{\mathbf{y}} + \mathbf{d}, \tag{24}$$

where we define $\mathbf{d} = \mathbf{d}_r + i\mathbf{d}_i$. Based on (19), we make the idealized assumption that the covariance matrix of $\mathbf{d}$ is diagonal; we refer to this as the diagonal approximation (25). We now combine (24) with (5), which results in the following input-output relation:

$$\mathbf{G}^{-1}\mathbf{r} = \gamma\hat{\mathbf{H}}\mathbf{s} + \gamma\hat{\mathbf{j}}w + \gamma\hat{\mathbf{n}} + \mathbf{G}^{-1}\mathbf{d}. \tag{26}$$

The expression in (26) also sheds light on how low-resolution ADCs exacerbate the jammer interference: If there is significant jamming energy at the ADC inputs, then the elements $g_b$ in (9) of the (diagonal) gain control matrix $\mathbf{G}$ are close to zero. Therefore, the corresponding entries of $\mathbf{G}^{-1}$ are very large, and we see from the last term in (26) how this leads to amplification of the quantization noise $\mathbf{d}$. Beam-slicing aims to ensure that only a small set of the entries of $\mathbf{G}^{-1}$ are large, so that the signal can be detected from the remaining components of $\mathbf{r}$.

1) SNIPS: The first method uses an LMMSE-like detector that treats the jammer interference as noise, resulting in SNIPS. In deriving the SNIPS data detector, we make certain idealizing assumptions. The SNIPS detector (27) computes its symbol estimates by applying a matrix $\mathbf{W}_{\text{SNIPS}}$ to the quantized receive vector $\mathbf{r}$. The matrix $\mathbf{W}_{\text{SNIPS}}$ depends on the channel estimate $\hat{\mathbf{H}}_{\text{est}}$, the jammer covariance estimate $\boldsymbol{\Lambda}$, and the gain control matrix; here, we use the gain control matrix acquired during the pilot phase, $\mathbf{G} = \mathbf{G}(\hat{\mathbf{Y}}_P)$. If the diagonal approximation (25) and the approximations $\hat{\mathbf{H}}_{\text{est}} \approx \hat{\mathbf{H}}$ and $\boldsymbol{\Lambda} \approx \mathbf{C}_j$ were exact, equation (27) would implement the LMMSE estimator when the interference is regarded as noise.
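The Bussgang quantities and a soft-nulling equalizer of this kind can be sketched numerically. The sketch below is an illustration under stated assumptions, not the paper's exact expressions: the Bussgang gain and distortion variance are evaluated by numerical integration rather than in closed form, the distortion covariance is taken as `2 * var_d` times the identity, `N0` is an assumed noise variance, and the equalizer is the textbook LMMSE solution for the gain-compensated Bussgang model with the jammer covariance added to the noise. All function names are hypothetical.

```python
import numpy as np

def midrise_quantize(x, q, delta):
    """q-bit uniform midrise quantizer for real inputs, with clipping."""
    idx = np.clip(np.floor(x / delta), -2 ** (q - 1), 2 ** (q - 1) - 1)
    return delta * idx + delta / 2

def bussgang_stats(q, delta, num=200001, span=10.0):
    """Numerically evaluate gamma = E[x Q(x)] and Var(d) for x ~ N(0, 1)."""
    x = np.linspace(-span, span, num)
    dx = x[1] - x[0]
    pdf = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
    Qx = midrise_quantize(x, q, delta)
    gamma = np.sum(x * Qx * pdf) * dx
    var_d = np.sum(Qx ** 2 * pdf) * dx - gamma ** 2
    return gamma, var_d

def soft_nulling_equalizer(H_est, Lam, g, gamma, var_d, Es=1.0, N0=0.1):
    """LMMSE-like matrix W such that s_hat = W r, with jammer treated as noise."""
    B = H_est.shape[0]
    G_inv = np.diag(1.0 / g)
    # Covariance of G^{-1} r under the Bussgang model with diagonal distortion.
    C_z = gamma ** 2 * (Es * H_est @ H_est.conj().T + Lam + N0 * np.eye(B)) \
        + 2 * var_d * G_inv @ G_inv
    return Es * gamma * H_est.conj().T @ np.linalg.solve(C_z, G_inv)
```

In this sketch, beams with small gains (large entries of `G_inv`) receive a large distortion term in `C_z` and are therefore automatically discounted, while a large jammer covariance `Lam` steers the equalizer away from the jammer direction.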
2) CHOPS: The second method projects the received signals onto the $(B-1)$-dimensional subspace orthogonal to the beam-sliced jammer channel and performs LMMSE-like data detection in this projection space. The matrix for this projection would be $\mathbf{P} = \mathbf{I}_B - \hat{\mathbf{j}}\hat{\mathbf{j}}^H / \|\hat{\mathbf{j}}\|_2^2$ [35, Sec. 2.6.1], which can be approximated based on the covariance estimate $\boldsymbol{\Lambda}$ in (12) by a matrix $\mathbf{P}_{\text{est}}$; see Appendix A for a derivation. To obtain a consistent model of the wireless channel, the projection is also applied during channel estimation. So, for CHOPS, we replace (14) with (31), in which the projection $\mathbf{P}_{\text{est}}$ is applied to the quantized pilot receive matrix, while the rest of the channel estimation phase remains identical to Section III-D. We denote the obtained estimate of the projected, beam-sliced channel matrix by $\hat{\mathbf{H}}_{\text{proj}}$. (Mathematically, instead of projecting the quantized pilot receive matrix onto the orthogonal subspace as in (31), we could equivalently have projected the estimate of the beam-sliced channel matrix (14), as $\hat{\mathbf{H}}_{\text{proj}} = \mathbf{P}_{\text{est}}\hat{\mathbf{H}}_{\text{est}}$.⁵) For data detection, the receive vector $\mathbf{r}$ from (26) is projected accordingly with $\mathbf{P}_{\text{est}}$. We then perform LMMSE-like data detection on the resulting quantities, which would implement LMMSE detection in the projection space if the diagonal approximation (25) as well as the approximations $\hat{\mathbf{H}}_{\text{est}} \approx \hat{\mathbf{H}}$ and $\boldsymbol{\Lambda} \approx \mathbf{C}_j$ were exact.
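The projection step can be sketched as follows. The first function implements the exact projector for a known jammer channel; the second approximates it from a covariance estimate by using its dominant eigenvector, which is an assumption made for illustration and not necessarily the construction derived in Appendix A. Both function names are hypothetical.

```python
import numpy as np

def orthogonal_projector(j_hat):
    """P = I - j j^H / ||j||^2, the projector onto the complement of span{j}."""
    j_hat = np.asarray(j_hat).reshape(-1, 1)
    return np.eye(len(j_hat)) - j_hat @ j_hat.conj().T / np.vdot(j_hat, j_hat).real

def projector_from_covariance(Lam):
    """Approximate projector from the dominant eigenvector of a Hermitian estimate."""
    _, eigvecs = np.linalg.eigh(Lam)  # eigenvalues are returned in ascending order
    v = eigvecs[:, -1:]               # unit-norm dominant eigenvector
    return np.eye(Lam.shape[0]) - v @ v.conj().T
```

Applying the same projector to both the channel estimate and the receive vector keeps the two consistent, which mirrors the reasoning given above for projecting during channel estimation.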

IV. RESULTS
We now demonstrate the efficacy of beam-slicing by comparing SNIPS and CHOPS with two baselines that differ from SNIPS and CHOPS only in lacking analog beam-slicing. We note that the operations of these baselines correspond to SNIPS and CHOPS with cluster size $S = 1$, which implies $\mathbf{V} = \mathbf{I}_B$: The baselines perform A/D conversion and soft-nulling or orthogonal projection directly in the antenna domain.⁶ We will show that in the presence of a strong jammer, beam-slicing with a two-antenna cluster size ($S = 2$) already yields significant improvements over these baselines. We will also identify situations that benefit from beam-slicing considering different ADC resolutions and levels of jamming power. Finally, we will empirically justify two of our choices when designing beam-slicing, namely the use of the DFT as the base spatial transform and the use of "rotated" DFTs for each cluster.

A. Simulation Setup and Performance Metrics
We simulate a mmWave massive MIMO system in which $U = 32$ single-antenna UEs communicate with a $B = 256$ antenna BS both under LoS and non-LoS conditions. The UE and jammer channels are generated using the QuaDRiGa mmMAGIC urban microcellular (UMi) model [36] for a carrier frequency of 60 GHz and a uniform linear array (ULA) with half-wavelength spacing. We let the $U$ UEs and the jammer be randomly placed at distances from 10 m to 100 m within a 120° angular sector in front of the BS. The minimum angular separation between two UEs, as well as between the jammer and any UE, is 1°. We assume ±3 dB per-UE power control, so that the ratio between maximum and minimum per-UE receive power is 4. The transmit constellation is 16-QAM. Throughout our simulations, SNR refers to the average receive signal-to-noise ratio. To quantify the jammer's power in comparison to a single UE, we use the relative jammer power $\rho$, which expresses the jammer's power relative to that of the average UE. We will consider two performance metrics: uncoded BER and the per-UE root mean-square symbol error (RMSSE) [37]. The RMSSE for the $u$th UE, denoted $\text{RMSSE}_u$, is computed over $n$ data symbol slots from $s_{u,k}$ and $\hat{s}_{u,k}$, the transmitted and estimated data symbols of the $u$th UE at time slot $k$, respectively. To understand the relevance of $\text{RMSSE}_u$ as a performance metric, it is helpful to compare it to the error vector magnitude (EVM) requirements in the 3GPP 5G NR technical specification [38]. The EVM is, loosely speaking, the square root of the sum of the squared $\text{RMSSE}_u$ over all $U$ UEs. Conversely, the $\text{RMSSE}_u$ is, loosely speaking, a single-UE proxy for the EVM. We will therefore interpret $\text{RMSSE}_u$ as a random variable and analyze its distribution by means of Monte-Carlo simulations. For 16-QAM transmission, the 3GPP 5G NR technical specification requires an EVM below 12.5% [38, Tbl. 6.5.2.2-1]. As a performance metric, we therefore consider the fraction of UEs (averaged over many UE placements/channel realizations, noise and jammer realizations, and data transmissions) for which the $\text{RMSSE}_u$ is below 12.5%.

⁵ Note that we did not explicitly address the jammer contamination of the channel estimate in SNIPS. The reason is that we observe empirically that the malicious influence of this contamination is negated by the $\boldsymbol{\Lambda}$-term in the inverse of (29), which makes such a projection unnecessary. To illustrate this, we point out that the IAN curve in Figure 1(a) is also based on a detector whose channel estimate suffers from strong jammer contamination: Its performance nevertheless matches the performance of the case without jammer (where the channel estimate is uncontaminated), showing that this channel contamination ultimately has no consequences for the error-rate performance.

⁶ We note that the baselines consisting of SNIPS and CHOPS in the antenna domain correspond to IAN and POS from Figure 1, respectively. However, for simplicity, we refrain from using the terms IAN and POS in the remainder.
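The served-UE criterion can be sketched as below. The normalization of the RMSSE by the average symbol energy `Es` is an assumption made so that the metric is comparable to a percentage threshold; the exact definition in [37] may differ, and the function names are hypothetical.

```python
import numpy as np

def rmsse_per_ue(S_true, S_est, Es=1.0):
    """Per-UE RMSSE over n slots; rows index UEs, columns index time slots."""
    err = np.abs(S_est - S_true) ** 2
    return np.sqrt(np.mean(err, axis=1) / Es)  # normalization by Es is assumed

def fraction_served(S_true, S_est, Es=1.0, threshold=0.125):
    """Fraction of UEs whose RMSSE is below the threshold (12.5% by default)."""
    return np.mean(rmsse_per_ue(S_true, S_est, Es) < threshold)
```

Averaging `fraction_served` over many channel, noise, and jammer realizations would give a Monte-Carlo estimate of the kind of metric reported in the figures.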

B. The Efficacy of Beam-Slicing
In Figures 4 and 5, we evaluate the performance of SNIPS (solid) and CHOPS (dashed) for different antenna cluster sizes $S$. We compare the baselines, which perform soft-nulling or orthogonal projection in the antenna domain (ANT), against SNIPS and CHOPS with cluster sizes $S \in \{2, 4, 8, 16, 32, 64, 256\}$, hence considering analog beam-slicing that ranges from operating on a pair of adjacent antennas up to a single cluster consisting of the whole antenna array. We note that with a cluster size $S = B = 256$, beam-slicing corresponds to performing a full beamspace transform. For these experiments, we consider a strong relative jammer power $\rho = 25$ dB and a BS with $q = 4$ bit ADCs (per real dimension).

Figure 4(a) shows uncoded BER results under LoS conditions. We see that beam-slicing with two-antenna clusters ($S = 2$) already yields noticeable BER improvements over the antenna-domain baselines. SNIPS and CHOPS are virtually identical for small clusters, but SNIPS slightly outperforms CHOPS for large clusters. Figure 4(a) also shows that large clusters outperform small ones. However, the performance of a full beamspace transform ($S = 256$) is inferior to $S = 64$, which exhibits the best performance for both SNIPS and CHOPS. The reason for this has nothing to do with how the beamspace transform distributes the jammer interference to the ADCs. Instead, the performance decrease can be attributed to the fact that, after a full beamspace transform, the UE signals are concentrated on only a few ADCs, so that low-resolution ($q \le 4$) ADCs can no longer represent them accurately. This observation suggests that the fully centralized beamspace transform, which in any case is impractical, may not necessarily be optimal for achieving the full potential of beam-slicing.

Figure 4(b) shows results for non-LoS conditions, where the channels are less sparse (in the beamspace domain) than under LoS conditions. As a consequence, the gains obtained by beam-slicing over the antenna-domain baselines are not as pronounced as for LoS conditions. Nonetheless, beam-slicing with $S = 8$ still offers a gain of 4 dB at a BER of 1% compared to the baselines. Due to the decrease in channel sparsity, there is now a strict improvement of performance for larger cluster sizes: Even for a full beamspace transform, the UE signals are sufficiently distributed over the different low-resolution ADCs to be appropriately represented.
The behavior of the fraction of UEs whose $\text{RMSSE}_u$ is below 12.5% at a given SNR is shown in Figure 5, where Figure 5(a) shows results under LoS conditions and Figure 5(b) shows results under non-LoS conditions. In terms of this criterion, the antenna-domain baselines are not able to successfully serve a significant percentage of UEs under LoS conditions, and less than 20% under non-LoS conditions, regardless of SNR. In contrast, beam-slicing with four antennas per cluster ($S = 4$) at high SNR can serve more than 40% of UEs both under LoS and non-LoS conditions. This fraction of successfully served UEs increases with the cluster size $S$ (though again not strictly monotonically under LoS conditions), with SNIPS and CHOPS being able to serve more than 80% of UEs at high SNR for $S \in \{32, 64\}$ under LoS conditions, and more than 65% of UEs under non-LoS conditions. The general picture that emerges from Figures 4 and 5 is that beam-slicing significantly improves jammer mitigation already for modestly sized (and hence practical) antenna clusters. The strongly similar performance of SNIPS and CHOPS suggests that the advantage afforded by beam-slicing is largely independent of the digital jammer mitigation method used, so that beam-slicing may successfully be combined with and enhance a variety of digital jammer mitigation methods in systems that rely on low-resolution ADCs.

C. When is Beam-Slicing Needed?
The experiments in Section IV-B indicate that, for a strong jammer, SNIPS and CHOPS outperform their antenna-domain baselines and that the best performance is achieved with a large cluster size $S$. However, as discussed in Section I-B, analog spatial transforms spanning a large number of antennas are difficult to implement in practice [19], [23], [26]-[28], with a full beamspace transform being probably infeasible for massive MU-MIMO systems. We therefore consider SNIPS and CHOPS with a moderately sized antenna cluster of $S = 8$ for our subsequent evaluations.
In Figure 6, we analyze the impact of ADC resolution when the relative jammer power is ρ = 25 dB for CHOPS under LoS conditions (Figure 6(a)), as well as for SNIPS under LoS (Figure 6(b)) and non-LoS (Figure 6(c)) conditions.⁷ We consider the fraction of UEs successfully served in terms of the criterion RMSSE_u < 12.5%. For infinite- or high-resolution (q = 8) ADCs, the beam-slicing methods have identical performance as their antenna-domain baselines in all the setups. However, already for 6-bit ADCs, SNIPS and CHOPS outperform their antenna-domain counterparts. For low-resolution ADCs with q ≤ 4, the antenna-domain methods are unable to serve a significant fraction of UEs (less than 2%) under LoS conditions. In contrast, SNIPS and CHOPS can at least serve some UEs at high SNR even for q = 3, and they can serve up to 65% of UEs for q = 4 under LoS conditions. Under non-LoS conditions, the beam-slicing gains are less pronounced but still significant.
⁷ Because of the virtually identical performance of CHOPS and SNIPS under LoS conditions shown by Figures 6(a) and 6(b), we have omitted a plot for CHOPS under non-LoS conditions. The same applies for Figure 7.
In Figure 7, we consider the impact of the relative jammer power ρ for 4-bit ADCs. We again compare CHOPS (LoS conditions; Figure 7(a)) and SNIPS (LoS and non-LoS conditions; Figures 7(b) and 7(c)) against their antenna-domain counterparts. We see that beam-slicing does not offer a performance gain with weak jammers that are only as strong as the average UE (ρ = 0 dB). However, for a jammer with ρ = 15 dB, SNIPS and CHOPS already significantly outperform their antenna-domain baselines in terms of successfully served UEs, and this gap continues to widen as jamming power increases further.
Together, these experiments confirm that strong jammers pose a serious problem for classical all-digital jamming suppression methods when combined with low-resolution ADCs. Our results also show that beam-slicing can successfully mitigate this problem in a practical manner.
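The interplay of jammer power and ADC resolution can be illustrated with a back-of-the-envelope calculation. The following sketch is not the ADC model used in our simulations: it assumes a real-valued uniform quantizer whose clipping range just covers a jammer that is ρ dB stronger than a unit-amplitude UE signal, and the function name lsb_relative_to_ue is hypothetical. It computes the quantizer step size relative to the UE amplitude.

```python
def lsb_relative_to_ue(q_bits, rho_db):
    """Step size of a q-bit uniform quantizer, in units of the UE amplitude.

    Simplifying assumptions (not the paper's ADC model): the UE amplitude is
    normalized to 1, and the quantizer range [-A, A] is set to the jammer
    amplitude A = 10^(rho/20).
    """
    jammer_amplitude = 10 ** (rho_db / 20)
    clip = jammer_amplitude              # assumed clipping level
    return 2 * clip / (2 ** q_bits)      # width of one quantization bin


for q in (3, 4, 6, 8):
    # A value above 1 means the UE signal is smaller than a single bin.
    print(q, lsb_relative_to_ue(q, rho_db=25.0))
```

Under these assumptions, a 4-bit quantizer facing a jammer with ρ = 25 dB has bins wider than the UE amplitude, whereas an 8-bit quantizer still resolves the UE signal. This is in line with the trend observed in Figures 6 and 7.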

D. Ablation Studies
After showing the general efficacy of beam-slicing and analyzing the conditions under which it leads to performance improvements, we now justify some of the choices in our beam-slicing design empirically. Specifically, we compare the choice of the DFT against other spatial transforms; we show the utility of the increased angular diversity provided by rotating the DFTs as in (3); and we provide evidence that using steadily progressing rotation angles as in (3) is close to optimal. For this, we generalize the per-cluster beam-slicing transform (3) to an arbitrary unitary transform T ∈ C^{S×S} with a per-cluster rotation angle φ_c, resulting in the overall beam-slicing matrix V(T, φ), where φ = [φ_1, . . . , φ_C] are the per-cluster rotation angles. We restrict ourselves to unitary transforms T since this ensures the unitarity of V(T, φ), which in turn ensures that the beam-sliced noise vector V(T, φ)n has the same distribution as n.
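The unitarity argument can be checked numerically. The sketch below assumes one plausible form of the generalized transform: each cluster applies T after a diagonal phase ramp diag(1, e^{−iφ_c}, . . . , e^{−i(S−1)φ_c}), and V(T, φ) is the block-diagonal arrangement of these per-cluster matrices. The exact sign and ordering conventions of (3) may differ, and all function names are hypothetical.

```python
import cmath
import math


def dft(S):
    """Unitary S x S DFT matrix."""
    return [[cmath.exp(-2j * math.pi * k * s / S) / math.sqrt(S)
             for s in range(S)] for k in range(S)]


def cluster_transform(T, phi_c):
    """T times a diagonal phase ramp with angle phi_c (assumed form)."""
    S = len(T)
    return [[T[k][s] * cmath.exp(-1j * phi_c * s) for s in range(S)]
            for k in range(S)]


def beam_slicing_matrix(T, phis):
    """Block-diagonal matrix with one rotated copy of T per cluster."""
    S, C = len(T), len(phis)
    V = [[0j] * (S * C) for _ in range(S * C)]
    for c, phi_c in enumerate(phis):
        Vc = cluster_transform(T, phi_c)
        for k in range(S):
            for s in range(S):
                V[c * S + k][c * S + s] = Vc[k][s]
    return V


def unitarity_error(V):
    """Largest absolute entry of V V^H - I."""
    B = len(V)
    err = 0.0
    for a in range(B):
        for b in range(B):
            g = sum(V[a][m] * V[b][m].conjugate() for m in range(B))
            err = max(err, abs(g - (1.0 if a == b else 0.0)))
    return err


S, C = 4, 3                      # small toy sizes, B = S * C
B = S * C
phis = [2 * math.pi * c / B for c in range(C)]   # uniformly-strided angles
V = beam_slicing_matrix(dft(S), phis)
print(unitarity_error(V))        # numerically zero for any unitary T
```

Since a unitary T multiplied by a unit-modulus diagonal matrix remains unitary, the same check passes for arbitrary rotation angles, not only for uniformly-strided ones.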
Our previous results have shown the similar-to-identical performance of SNIPS and CHOPS over a wide range of parameters. For simplicity, we therefore restrict our analysis from here on to SNIPS. We consider SNIPS with 4-bit ADCs, under a relative jammer power ρ = 25 dB in LoS transmission.
In Figure 8, we compare the performance for different choices of the transform T and uniformly-strided cluster rotations φ = (2π/B) × [0, 1, . . . , C − 1] as in (3). The cluster size (i.e., the size of the transforms) is S = 8. In addition to the proposed DFT, we consider the Haar transform [39], the Hadamard transform [40], the discrete Hartley transform [41], [42], the discrete cosine transform (DCT) [43], and the Noiselet transform [44] as candidates for T. As a baseline, we also include the performance without beam-slicing, i.e., when operating directly in the antenna domain. We see that even the worst-performing transform, the Haar transform, significantly outperforms the antenna-domain baseline on both performance metrics. In terms of the uncoded BER shown in Figure 8(a), all the other considered transforms T yield similar performance, with a slight advantage for the DFT. However, in terms of the fraction of served UEs shown in Figure 8(b), the performance gap between the DFT and the other transforms is larger. Together, these results support the choice of the DFT in our beam-slicing design when merely considering system performance.
Despite the foregoing analysis, the choice of the beam-slicing transform may be guided by different factors in practice. For example, while large DFTs are hard to implement using analog circuitry (e.g., S = 16 is the largest Butler matrix DFT reported in the open literature; cf. Section I-B), high-dimensional Hadamard transforms can be implemented with particular ease; see, e.g., the recent work in [45] with an implementation for S = 128. Therefore, one could imagine preferring a subpar transform, such as the discrete Hadamard transform, but with a large cluster size S, over the DFT with a smaller cluster size when implementing beam-slicing. A detailed study of such trade-offs is, however, left for future work.
In Figure 9, we fix T to be the DFT, and analyze the performance gain of uniformly-strided cluster rotations φ = (2π/B) × [0, 1, . . . , C − 1] as in (3), over fixed (unrotated) clusters, which corresponds to φ = 0 in (38). In our analysis, we consider different cluster sizes S. Note that for the antenna domain (S = 1) and beamspace transform (S = 256), both cases (with and without rotations) are mathematically equivalent, so the BER curves coincide. For all other choices, the with φ = (2π/B)(c − 1), we obtain, across all clusters, signals whose angular frequencies increase in steps of 2π/B, as they would for the full beamspace transform F B . We thereby achieve the same angular diversity as the beamspace transform (though not the same sharpness in spatial resolution).
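The angular-diversity argument can be verified by simple counting. The sketch below assumes that output k of cluster c (both zero-based) responds to the angular frequency 2πk/S + φ_c, which may differ from the conventions of (3) in sign or ordering; the function names are hypothetical. Frequencies are expressed as integer multiples of 2π/B with B = SC.

```python
def rotated_frequencies(S, C):
    """Frequencies (in units of 2*pi/B) with phi_c = (2*pi/B) * c."""
    B = S * C
    # 2*pi*k/S equals (k*C) * 2*pi/B, and phi_c adds c * 2*pi/B.
    return sorted((k * C + c) % B for c in range(C) for k in range(S))


def unrotated_frequencies(S, C):
    """Frequencies (in units of 2*pi/B) with phi_c = 0 for every cluster."""
    B = S * C
    return sorted((k * C) % B for c in range(C) for k in range(S))


S, C = 8, 32                                   # B = 256 antennas
print(len(set(rotated_frequencies(S, C))))     # distinct frequencies, rotated
print(len(set(unrotated_frequencies(S, C))))   # distinct frequencies, unrotated
```

With rotations, the B outputs cover all B multiples of 2π/B exactly once, as the full beamspace transform would. Without rotations, all clusters share the same S frequencies.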
While the choice of uniformly-strided cluster rotations φ = (2π/B) × [0, 1, . . . , C − 1] comes naturally for the DFT, this does not yet guarantee their optimality. To gain some insight into the question of optimality, we used a data-driven approach for learning the per-cluster rotations φ. Specifically, we used a coordinate descent algorithm to determine better rotation angles that proceeds as follows: For cluster c, we fix all per-cluster rotations in φ except for φ_c. The angle φ_c is then swept over a grid of 148 possible rotation angles between 0 and (2π/B)C, evaluating for every grid point the uncoded BER of SNIPS (with these rotation angles φ) at an SNR of 20 dB and a relative jammer power ρ = 25 dB over a training set consisting of 10^3 LoS channels (one transmit symbol per UE per channel). We then fix φ_c to the rotation angle with the lowest uncoded BER, and the procedure is repeated analogously for the next cluster rotation angle φ_{c+1}. This coordinate descent process is repeated (for each rotation angle φ_c, c = 1, . . . , C in φ) for a total of 50 iterations.
Figure 10 shows uncoded BER simulation results for the rotation angles learned with this coordinate-descent learning algorithm. In Figure 10(a), we compare the BER performance of the learned rotation angles with the performance of uniformly-strided rotations φ = (2π/B) × [0, 1, . . . , C − 1] (and no rotations), considering only the channels contained in the training set. We see that the learned rotations offer virtually no improvement over uniform rotations. Since learned rotations do not outperform uniform rotations even on the training set, it would seem unlikely that they offer improvements when evaluated on new channels. This is confirmed in Figure 10(b), where we compare the performance for 10^4 new LoS channels.
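The structure of this grid-based coordinate descent can be sketched as follows. The sketch is not our simulation code: the objective is a hypothetical stand-in for the uncoded BER of SNIPS on the training set, the grid is assumed to include both endpoints, and the toy sizes and function names are chosen for illustration only.

```python
import math


def coordinate_descent(objective, phi_init, grid, iterations):
    """Minimize objective(phi) by sweeping one angle at a time over a grid."""
    phi = list(phi_init)
    for _ in range(iterations):
        for c in range(len(phi)):
            # Keep all angles fixed except phi[c] and evaluate every grid point.
            def cost(angle, c=c):
                return objective(phi[:c] + [angle] + phi[c + 1:])
            # Fix phi[c] to the grid point with the lowest objective value.
            phi[c] = min(grid, key=cost)
    return phi


# Toy setup: C clusters, B antennas, 148 grid points in [0, (2*pi/B)*C].
C, B, G = 4, 32, 148
upper = 2 * math.pi / B * C
grid = [i * upper / (G - 1) for i in range(G)]

# Hypothetical objective standing in for the BER: squared distance to
# hidden target angles that lie on the grid.
targets = [grid[10], grid[50], grid[90], grid[130]]


def toy_objective(phi):
    return sum((p - t) ** 2 for p, t in zip(phi, targets))


learned = coordinate_descent(toy_objective, [0.0] * C, grid, iterations=2)
print(learned == targets)
```

In the actual experiment, each objective evaluation requires a BER simulation over the whole training set, so the cost of one iteration scales with the number of clusters times the number of grid points.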
V. CONCLUSIONS
We have proposed a novel method to mitigate strong jamming attacks in mmWave massive MU-MIMO BSs relying on low-resolution ADCs. Concretely, we have shown that strong jammers force the quantization range of low-resolution ADCs to drown the UE signals in quantization noise, thereby introducing distortions which are difficult to remove with digital signal processing. In order to enable effective and practical jammer mitigation in systems that rely on low-resolution ADCs, we have proposed beam-slicing, a non-adaptive, distributed analog spatial transform. Beam-slicing exploits the strong directionality of mmWave signals to focus the jammer's energy on a subset of ADCs. An estimate of each ADC-output's fidelity can then be exploited to estimate the UE signals on the basis of the interference-free ADC outputs. We have proposed two such estimation methods, SNIPS and CHOPS, which differ in how they cancel the jamming signal. SNIPS performs LMMSE equalization on the beam-sliced signal to soft-null the jammer; CHOPS projects the beam-sliced signal on the orthogonal subspace to remove the jammer completely. Both of these methods leverage Bussgang's decomposition to model quantization artifacts. We have shown using simulations that beam-slicing significantly improves jammer mitigation already for small transform clusters, with SNIPS and CHOPS clearly outperforming their antenna-domain counterparts. This suggests that beam-slicing is a practical method that may be combined with and enhance a variety of digital jammer mitigation methods to provide increased robustness of low-resolution mmWave massive MU-MIMO BSs to jamming attacks.

APPENDIX A ESTIMATING THE PROJECTION MATRIX
We offer a brief exposition on why (30) is a sensible estimate for the projection matrix P = I_B − ĵĵ^H/‖ĵ‖^2. In doing so, we ignore the thermal noise and the quantization noise. Under these simplifications, the jammer sequence receive matrix (11) can be rewritten as
Lemma 1. If the jammer receive sequence is given as (40) with R_J ≠ 0, and if Λ = R_J R_J^H, then the orthogonal projection onto the orthogonal complement of the subspace spanned by ĵ is with
Proof. We have and tr(Λ) = ‖w‖^2 ‖ĵ‖^2. From this, the claim follows by recalling [35, Sec. 2.6.1].
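The trace identity in the proof can be checked numerically. The sketch below assumes that, without noise, the receive matrix has the rank-one form R_J = ĵ w^T for some nonzero jammer transmit sequence w; this form is an assumption that is consistent with tr(Λ) = ‖w‖^2 ‖ĵ‖^2. Under it, Λ = ‖w‖^2 ĵĵ^H, so that I_B − Λ/tr(Λ) coincides with P. The expression I_B − Λ/tr(Λ) is used here for illustration and need not match the exact form of (30); all function names are hypothetical.

```python
import random


def projector_from_sequence(R):
    """I - Lambda / tr(Lambda), where Lambda = R R^H."""
    B, N = len(R), len(R[0])
    Lam = [[sum(R[a][n] * R[b][n].conjugate() for n in range(N))
            for b in range(B)] for a in range(B)]
    trace = sum(Lam[a][a] for a in range(B)).real
    return [[(1.0 if a == b else 0.0) - Lam[a][b] / trace
             for b in range(B)] for a in range(B)]


def projector_from_channel(j):
    """P = I - j j^H / ||j||^2."""
    B = len(j)
    norm2 = sum(abs(x) ** 2 for x in j)
    return [[(1.0 if a == b else 0.0) - j[a] * j[b].conjugate() / norm2
             for b in range(B)] for a in range(B)]


rng = random.Random(0)
B, N = 6, 10                       # toy sizes: antennas, sequence length
j = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(B)]
w = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
R = [[x * y for y in w] for x in j]          # assumed form R_J = j w^T

P1 = projector_from_sequence(R)
P2 = projector_from_channel(j)
max_diff = max(abs(P1[a][b] - P2[a][b]) for a in range(B) for b in range(B))
residual = max(abs(sum(P1[a][b] * j[b] for b in range(B))) for a in range(B))
print(max_diff, residual)          # both numerically zero
```

The second printed quantity confirms that the resulting matrix nulls ĵ, as expected of a projection onto the orthogonal complement of the subspace spanned by ĵ.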