A Low-Complexity Machine Learning Design for mmWave Beam Prediction

Machine learning (ML) for the fifth generation (5G)-Advanced air interface is currently being studied by the 3rd Generation Partnership Project (3GPP), where millimeter-wave (mmWave) beam prediction is an important use case. The targets are to reduce the reference signal (RS) overhead, latency, and power consumption that frequent beam measurements currently entail. To this end, a low-complexity ML design is presented that exploits the spatial correlation between beam qualities to expedite spatial-domain beam prediction. Evaluation results show that the proposal achieves a beam prediction accuracy of 96% with a 75% reduction in RS overhead and lower computational complexity than the state of the art. Further, to demonstrate the practicality of the proposed design, we analyze its generalization behavior across different communication scenarios.


I. INTRODUCTION
The abundant bandwidth available at millimeter-wave (mmWave) bands makes them a requisite for higher throughput. However, to achieve an adequate link margin, beamforming via large antenna arrays is essential [1]. Consequently, the evaluation of beam qualities through frequent beam measurements and beam quality reporting is imperative to help the base station (BS) and the user equipment (UE) decide the optimal beam pair for link establishment. Within the 3rd Generation Partnership Project (3GPP), this is referred to as the beam management procedure.
In order to enable the UE to measure the beam qualities, beamformed reference signals (synchronization signal blocks (SSBs)) are sequentially transmitted from the BS in the form of an SSB burst. This allows the UE to measure the qualities of all the BS transmit beams in terms of their reference signal received powers (RSRPs) through one of its receive beams. Further, to measure the qualities of all possible transmit-receive beam pairs, several SSB bursts are transmitted. This beam quality measurement procedure is known as exhaustive beam scan (EBS), which suffers from large beam measurement overhead, increased latency, and higher power consumption [2], [3]. To overcome this, a two-level hierarchical beam scan (HBS) consisting of parent (wide) and child (narrow) beams can be used.

This work was funded by the European Union's SEMANTIC ITN project under the Marie Skłodowska-Curie grant agreement No. 861165.
Recently, machine learning (ML) methods have been extensively applied to wireless communications to solve non-linear problems that are burdensome for conventional signal processing techniques. Consequently, several studies propose the use of ML for beam prediction and selection [2]. A straightforward approach to reduce the beam measurement overhead is to utilize the UE location information [5] to train an ML model for beam prediction. However, transmitting the UE location information, which may not always be available to the BS, incurs additional feedback overhead. To avoid this issue, the study in [6] fuses the concept of HBS with a supervised ML model and exploits the spatial correlation among the parent and the child beam qualities to predict the optimal child beam. A similar approach in [7] uses the received signal vector of the parent beams as input to a convolutional neural network (CNN). Another approach in [8] reduces the beam measurement overhead by transmitting a subset of child beams and then uses a CNN that predicts the optimal beam by learning the spatial correlation among child beams.
Starting from 2022, the study of ML for the fifth generation (5G)-Advanced New Radio (NR) air interface has been an important project at 3GPP. Here, the focus is to explore the benefits of augmenting the NR air interface with ML models for enhanced performance and/or reduced overhead and complexity [9]. An important study item in this project is the evaluation of ML for beam management, where spatial- and temporal-domain beam prediction are the sub use cases [10]. Following 3GPP guidelines, companies report their proposed evaluation methodology and results on ML-based beam prediction [11]. A recent proposal for spatial-domain beam prediction is presented in [12], where, based on the received power of a subset of the transmit beams, a CNN is trained to predict the RSRPs of the non-transmitted beams, resulting in reduced overhead.
Though most of the discussed ML solutions reduce the beam measurement overhead while achieving performance close to EBS, little attention has been paid to the computational complexity of the model, its training time, and its generalization capabilities. To bridge this research gap, this letter presents a low-complexity ML beam prediction approach that achieves performance close to the optimal EBS but with lower computational complexity than other ML approaches, resulting in faster beam prediction. Additionally, to investigate the generalization capabilities of our model, we evaluate its performance over 3GPP-specified scenarios.

A. Channel Model
We consider a downlink mmWave multiple-input multiple-output (MIMO) communication system, where the BS and the UE are equipped with $N_T$ and $N_R$ antenna elements, respectively. Using the clustered channel model, the channel is assumed to be the sum of the line-of-sight (LOS) path and $C$ non-line-of-sight (NLOS) clusters with $L$ paths per cluster. The channel matrix $\mathbf{H} \in \mathbb{C}^{N_R \times N_T}$ then follows the form given in [13]. Here, the $l$-th path of the $c$-th cluster has azimuth (elevation) angle-of-arrival (AoA) $\phi^R_{c,l}$ ($\theta^R_{c,l}$) and azimuth (elevation) angle-of-departure (AoD) $\phi^T_{c,l}$ ($\theta^T_{c,l}$), while $\alpha_{c,l}$ is the complex path gain. The same variables are analogously defined for the LOS path and are indicated by the LOS index. Furthermore, $\mathbf{a}_R(\cdot) \in \mathbb{C}^{N_R \times 1}$ and $\mathbf{a}_T(\cdot) \in \mathbb{C}^{N_T \times 1}$ denote the UE and the BS array response, respectively, $(\cdot)^H$ denotes the conjugate transpose, $K$ is the Ricean factor, and $\Lambda$ indicates the pathloss.
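For orientation, a Ricean clustered channel that is consistent with the symbols defined above can be sketched as follows. The power normalization, i.e., the $\sqrt{1/(CL)}$ factor and the way $\Lambda$ enters as a linear attenuation, is an assumption made here for illustration and may differ from the exact expression in [13].

```latex
\mathbf{H} = \sqrt{\frac{1}{\Lambda}} \Bigg(
  \sqrt{\frac{K}{K+1}}\; \alpha_{\mathrm{LOS}}\,
  \mathbf{a}_R\big(\phi^R_{\mathrm{LOS}}, \theta^R_{\mathrm{LOS}}\big)\,
  \mathbf{a}_T^H\big(\phi^T_{\mathrm{LOS}}, \theta^T_{\mathrm{LOS}}\big)
  + \sqrt{\frac{1}{K+1}}\, \sqrt{\frac{1}{CL}}
  \sum_{c=1}^{C} \sum_{l=1}^{L} \alpha_{c,l}\,
  \mathbf{a}_R\big(\phi^R_{c,l}, \theta^R_{c,l}\big)\,
  \mathbf{a}_T^H\big(\phi^T_{c,l}, \theta^T_{c,l}\big)
\Bigg)
```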
We assume a uniform planar array (UPA) in the y-z plane at the BS and the UE with $N_y$ and $N_z$ antenna elements ($N_y N_z = N$) on the y and z axes, respectively. Here, for ease of notation, we drop the subscript for the BS and UE. The array response vector $\mathbf{a}(\phi, \theta)$ of the UPA then depends on the wavelength $\lambda$ and the antenna element spacing $d = \lambda/2$.
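A minimal sketch of such a UPA array response is given below. It uses the widely used phase term $\frac{2\pi d}{\lambda}(n_y \sin\phi \sin\theta + n_z \cos\theta)$ with unit-norm scaling; this angle convention (elevation measured from the z axis) and the 8 x 8 example size are assumptions for illustration, not values taken from the text.

```python
import cmath
import math

def upa_response(phi, theta, n_y, n_z, spacing_over_wavelength=0.5):
    """Unit-norm array response of an n_y x n_z UPA in the y-z plane.

    Assumed convention: phi is azimuth, theta is elevation from the z axis.
    """
    n = n_y * n_z
    response = []
    for iy in range(n_y):
        for iz in range(n_z):
            phase = 2 * math.pi * spacing_over_wavelength * (
                iy * math.sin(phi) * math.sin(theta) + iz * math.cos(theta)
            )
            response.append(cmath.exp(1j * phase) / math.sqrt(n))
    return response

# Hypothetical example: 8 x 8 array steered to 20 degrees azimuth, broadside elevation.
a = upa_response(phi=math.radians(20), theta=math.radians(90), n_y=8, n_z=8)
norm_sq = sum(abs(x) ** 2 for x in a)
print(len(a), round(norm_sq, 6))
```

Half-wavelength spacing ($d = \lambda/2$) is the default, so only the ratio $d/\lambda$ enters the computation.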

B. Beam Steering Model
We consider phase-shifter-based analog beamforming with one radio frequency (RF) chain. At the BS, the transmit signal is beamformed by a beamforming vector $\mathbf{f} = [f_1, f_2, \cdots, f_{N_T}]^T$, and at the UE the received signals are combined with a receive combining vector $\mathbf{w} = [w_1, w_2, \cdots, w_{N_R}]^T$. Here, $f_i$ and $w_j$ denote the complex weight on the $i$-th transmit and $j$-th receive antenna element, respectively. The transmit and receive beams are selected from the predefined codebooks $\mathcal{F}$ and $\mathcal{W}$, consisting of $F$ and $W$ candidate beams, respectively. The codebooks are designed on the following beam steering scheme.
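$$\mathbf{f}_m = \mathbf{a}_T\big(\bar{\phi}^T_m, \bar{\theta}^T_m\big), \qquad \mathbf{w}_n = \mathbf{a}_R\big(\bar{\phi}^R_n, \bar{\theta}^R_n\big).$$

Here, $\bar{\phi}^T_m$ ($\bar{\theta}^T_m$) for the $m$-th transmit beam $\mathbf{f}_m$, $m \in \{1, 2, \cdots, F\}$, and $\bar{\phi}^R_n$ ($\bar{\theta}^R_n$) for the $n$-th receive beam $\mathbf{w}_n$, $n \in \{1, 2, \cdots, W\}$, are the quantized azimuth (elevation) AoD and AoA, respectively. Given the channel matrix $\mathbf{H}$, the transmit signal $x$, the $m$-th transmit beam $\mathbf{f}_m$, and the $n$-th receive beam $\mathbf{w}_n$, the received signal $y_{m,n}$ is

$$y_{m,n} = \sqrt{P}\, \mathbf{w}_n^H \mathbf{H} \mathbf{f}_m x + \mathbf{w}_n^H \boldsymbol{\eta},$$

where $P$ is the transmit power and $\boldsymbol{\eta} \in \mathbb{C}^{N_R \times 1}$ is the additive white Gaussian noise (AWGN).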
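The received-signal model can be checked numerically with a small sketch. The `steering` helper below is a hypothetical half-wavelength linear-array stand-in for $\mathbf{a}_T$ and $\mathbf{a}_R$, and the rank-one channel is a toy assumption; with perfectly aligned beams and no noise the combined gain is one.

```python
import cmath
import math

def hermitian_dot(u, v):
    """Return u^H v for two equal-length complex vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(h * x for h, x in zip(row, vec)) for row in matrix]

def received_signal(H, f, w, x=1 + 0j, power=1.0, noise=None):
    """y = sqrt(P) * w^H H f x + w^H eta (eta defaults to zero noise)."""
    noise = noise if noise is not None else [0j] * len(w)
    signal = math.sqrt(power) * hermitian_dot(w, mat_vec(H, f)) * x
    return signal + hermitian_dot(w, noise)

def steering(n, angle):
    """Unit-norm linear-array steering vector (toy stand-in, an assumption)."""
    return [cmath.exp(1j * math.pi * k * math.sin(angle)) / math.sqrt(n)
            for k in range(n)]

a_t, a_r = steering(4, 0.3), steering(2, -0.5)
H = [[r * t.conjugate() for t in a_t] for r in a_r]  # rank-one channel a_R a_T^H
rsrp_aligned = abs(received_signal(H, f=a_t, w=a_r)) ** 2
rsrp_misaligned = abs(received_signal(H, f=steering(4, -1.0), w=a_r)) ** 2
print(rsrp_aligned, rsrp_misaligned)
```

The misaligned transmit beam yields a visibly smaller RSRP, which is the quantity the beam scans below maximize.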

C. Beam Management in 5G NR
The 3GPP beam management procedure is based on the EBS and aims to find the optimal beam pair $\{\mathbf{f}_{m^*}, \mathbf{w}_{n^*}\}$ that maximizes the RSRP, given as $\mathrm{RSRP}_{m,n} = |y_{m,n}|^2$. The optimization problem can be formulated as

$$\{m^*, n^*\} = \arg\max_{m \in \{1, \cdots, F\},\; n \in \{1, \cdots, W\}} \mathrm{RSRP}_{m,n}.$$

EBS solves this optimization problem by exhaustively searching over all possible beamforming and combining vectors, leading to an excessive beam training overhead of $F \cdot W$ beam measurements.
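As a minimal sketch, the exhaustive search amounts to an argmax over an $F \times W$ table of measured RSRPs. The table values below are hypothetical, and indices are zero-based in the code.

```python
def exhaustive_beam_scan(rsrp):
    """Return (m*, n*, measurements) for an F x W table of RSRP values."""
    num_tx, num_rx = len(rsrp), len(rsrp[0])
    pairs = ((m, n) for m in range(num_tx) for n in range(num_rx))
    m_star, n_star = max(pairs, key=lambda mn: rsrp[mn[0]][mn[1]])
    return m_star, n_star, num_tx * num_rx

# Toy 4 x 2 RSRP table in linear scale (hypothetical values).
toy_rsrp = [[0.1, 0.3], [0.2, 0.9], [0.4, 0.5], [0.05, 0.6]]
best_tx, best_rx, cost = exhaustive_beam_scan(toy_rsrp)
print(best_tx, best_rx, cost)  # 1 1 8
```

Every one of the $F \cdot W$ entries has to be measured before the maximum is known, which is the overhead the following schemes try to avoid.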
To reduce this beam measurement overhead, HBS utilizes a multi-resolution codebook and divides the beam selection problem into two levels. The first-level search identifies the best parent beam pair by maximizing the RSRP over all $F_p \cdot W_p$ parent beam pairs. Here, $F_p = F/s_T$ and $W_p = W/s_R$ indicate the number of parent beams at the BS and UE, respectively. Further, $s_T$ and $s_R$ define the number of child beams within each parent beam at the BS and UE, respectively. After identifying the best parent beam pair, the second-level search confirms the optimal child beam pair within the range of the selected parent beam pair by maximizing the RSRP over the $s_T \cdot s_R$ child beam pairs it contains. Notably, the first- and second-level searches require $F_p \cdot W_p$ and $s_R \cdot s_T$ beam measurements, respectively, resulting in reduced beam measurement overhead. However, the multilevel search incurs increased latency.
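The two-level search can be sketched as follows. The layout in which parent beam $p$ covers child beams $p \cdot s$ to $p \cdot s + s - 1$, and the toy rule that a parent's RSRP equals the largest RSRP of its children, are assumptions made only for this illustration.

```python
def argmax_pair(table, rows, cols):
    """Return the (row, col) with the largest table entry over the given ranges."""
    return max(((m, n) for m in rows for n in cols),
               key=lambda mn: table[mn[0]][mn[1]])

def hierarchical_beam_scan(parent_rsrp, child_rsrp, s_t, s_r):
    """Two-level search; returns (m*, n*, number of beam measurements)."""
    f_p, w_p = len(parent_rsrp), len(parent_rsrp[0])
    # Level 1: best parent beam pair over all F_p * W_p pairs.
    mp, np_ = argmax_pair(parent_rsrp, range(f_p), range(w_p))
    # Level 2: best child pair inside the selected parent pair.
    m_star, n_star = argmax_pair(
        child_rsrp,
        range(mp * s_t, (mp + 1) * s_t),
        range(np_ * s_r, (np_ + 1) * s_r),
    )
    return m_star, n_star, f_p * w_p + s_t * s_r

# Toy data: F = 64 child beams, one receive beam, RSRP peaking at child beam 37.
child = [[1.0 / (1 + abs(m - 37))] for m in range(64)]
parent = [[max(child[p * 4 + i][0] for i in range(4))] for p in range(16)]
result = hierarchical_beam_scan(parent, child, s_t=4, s_r=1)
print(result)  # (37, 0, 20)
```

With a single receive beam, the count of 16 parent plus 4 child measurements replaces the 64 measurements an exhaustive scan would need.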

III. LOW-COMPLEXITY MACHINE LEARNING DESIGN FOR MMWAVE BEAM PREDICTION
In this section, we leverage the angular-domain spatial correlation to propose a low-complexity beam prediction model for fast beam training. Motivated by the fact that very large antenna arrays can only be employed at the BS due to size constraints, in the following sections we limit our discussion to the identification of the optimal transmit beam, i.e., we assume that the optimal receive beam is known.

A. Algorithm Framework
Motivated by the two-level beam search, we propose to cover the whole angular region with the first-level parent beams. By doing so, we observe that there exists a strong angular spatial correlation among parent and child beams in a given environment. As an example, Fig. 1 shows the angular spatial correlation between the RSRPs of the parent and the child beams, where each parent beam contains four child beams. Here, it can be observed that each parent beam has a stronger correlation with a limited number of child beams. Consequently, we assume that the child-beam RSRPs $\mathrm{RSRP}^c$ are a function $f_1(\cdot)$ of the parent RSRP values, i.e.,

$$\mathrm{RSRP}^c = f_1\big(\mathrm{RSRP}^p\big).$$

In particular, we aim at probing the parent beams and obtaining their corresponding RSRPs from the received signal vector $\mathbf{y}^p = [y^p_1, y^p_2, \cdots, y^p_{F_p}]^T$. By intelligently combining these parent RSRPs with the strong correlation among parent and child beams, we can predict the optimal child beam index $m^*$. Due to the discrete number of candidate beams, the beam prediction problem can be formulated as a multiclass classification problem and written as

$$m^* = f_2\big(\mathrm{RSRP}^p\big),$$

where $f_2(\cdot)$ is the function that learns the correlation between parent and child RSRPs for optimal beam index prediction. Further, due to the highly non-linear relationship between RSRPs and channel directivity, the prediction is difficult to obtain with conventional signal processing methods. With this background, we propose a low-complexity ML design for beam prediction in the following section.

B. Model Design
In this section, we introduce our ML model and its corresponding inputs and outputs as shown in Fig. 2.
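In essence, the model described in this subsection is a single FC layer from $F_p$ parent RSRPs to $F$ child-beam scores, followed by a softmax. A minimal forward-pass sketch is shown below; the random weights stand in for trained parameters and the input values are hypothetical, so the predicted index carries no meaning beyond illustrating the data flow.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_beam(rsrp_p, weights, bias):
    """Compute softmax(A^T rsrp_p + b); weights is F_p x F, bias has F entries."""
    num_child = len(bias)
    scores = [
        sum(weights[i][m] * rsrp_p[i] for i in range(len(rsrp_p))) + bias[m]
        for m in range(num_child)
    ]
    probs = softmax(scores)
    m_star = max(range(num_child), key=probs.__getitem__)
    return probs, m_star

random.seed(0)
F_P, F = 16, 64  # parent and child beam counts used as the running example
weights = [[random.gauss(0, 0.1) for _ in range(F)] for _ in range(F_P)]  # untrained
bias = [0.0] * F
rsrp_p = [random.random() for _ in range(F_P)]  # hypothetical normalized RSRPs
probs, m_star = predict_beam(rsrp_p, weights, bias)
print(len(probs), round(sum(probs), 6), m_star)
```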
1) Input Layer: Based on our previous discussions, the parent-beam RSRPs $\mathrm{RSRP}^p$ obtained via the first level of traditional HBS are provided as input to the model. The input layer therefore consists of $F_p$ nodes. As an example, considering $F = 64$ beams and selecting $s_T = 4$ results in $F_p = 16$ parent beams, which means that a beam measurement overhead reduction of $1 - \frac{16}{64} = 75\%$ is achieved as compared to EBS.

2) Output Layer: For the prediction of the optimal child beam from all the candidate child beams, a fully-connected (FC) layer consisting of $F$ nodes is introduced, which learns the spatial correlation between $\mathrm{RSRP}^p$ and $\mathrm{RSRP}^c$ and maps it to the candidate child beams. Finally, a non-linear softmax activation layer is introduced that returns the probabilities of all the child beams. The output of the proposed low-complexity neural network (NN) can be written as

$$\mathbf{P} = \mathrm{softmax}\big(\mathbf{A}^T\, \mathrm{RSRP}^p + \mathbf{b}\big).$$

Here, $\mathbf{P} \in \mathbb{C}^{F \times 1}$ is the predicted output probability vector of all the child beams, while $\mathbf{A} \in \mathbb{C}^{F_p \times F}$ and $\mathbf{b} \in \mathbb{C}^{F \times 1}$ are the weights and the biases, respectively. Finally, the child beam with maximum probability $P_m$ is selected, i.e.,

$$m^* = \arg\max_{m \in \{1, \cdots, F\}} P_m.$$

IV. PERFORMANCE EVALUATION
This section details dataset generation, model training, complexity analysis, and performance evaluation over the specified key performance indicators (KPIs). For reproducibility of results, our simulation dataset and source code are publicly available [14].

A. Dataset Generation and Model Training
For dataset collection, we utilize the EBS approach in combination with HBS. Our dataset consists of parent RSRP measurements, i.e., $\mathrm{RSRP}^p$, which are obtained via the traditional HBS and provided as input features to the ML model. In addition, the offline training labels, i.e., the optimal beam indices, are obtained via the traditional EBS [15]. Table I lists the default simulation parameters. The location of the UE is drawn from a uniform spatial distribution over the cell coverage area. The noise power $\sigma^2$ is computed as $(-174 + 10\log_{10} B + NF)$ dBm and the path loss is given as $(20\log_{10} d + 20\log_{10} f_c - 147.56)$ dB, where $d$ indicates the distance. Finally, the channel is modeled as a clustered delay line (CDL) model [13]. Further, to investigate the generalization capabilities of our ML model, we consider the following scenarios with different combinations of channel profiles [15].
• Scenario 1: The ML model is trained on a dataset constructed using the CDL-D channel profile and performs inference on a UE with the same channel profile but with unknown location.

Our dataset consists of 25,000 samples, where the training, validation, and testing split is 70%, 10%, and 20%, respectively. Further, the ML model is trained for $n_e = 100$ epochs, and the model parameters are optimized by the Adam optimizer [16] with the mean square error as the loss function.
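The noise power and path loss expressions used for dataset generation can be evaluated directly, as in the sketch below. The constant $-147.56$ matches the free-space form with distance in meters and frequencies in hertz, so those units are assumed here; the numerical inputs (100 MHz bandwidth, 7 dB noise figure, 100 m, 28 GHz) are hypothetical and are not the Table I values.

```python
import math

def noise_power_dbm(bandwidth_hz, noise_figure_db):
    """Noise power in dBm: -174 + 10 log10(B) + NF."""
    return -174 + 10 * math.log10(bandwidth_hz) + noise_figure_db

def path_loss_db(distance_m, carrier_hz):
    """Path loss in dB: 20 log10(d) + 20 log10(f_c) - 147.56."""
    return 20 * math.log10(distance_m) + 20 * math.log10(carrier_hz) - 147.56

# Hypothetical link parameters, chosen only to exercise the formulas.
sigma2_dbm = noise_power_dbm(bandwidth_hz=100e6, noise_figure_db=7)
loss_db = path_loss_db(distance_m=100, carrier_hz=28e9)
print(round(sigma2_dbm, 2), round(loss_db, 2))
```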

B. Key Performance Indicators
For performance evaluation in terms of beam measurement overhead, the KPI is the reference signaling overhead reduction (%), $1 - \frac{N}{M}$, where $N$ is the number of beams (SSBs) required as input by the ML model and $M$ is the total number of beams to be predicted [15]. For beam prediction accuracy, the KPI Top-K (%) is defined as the percentage of cases in which the truly optimal genie-aided transmit beam is among the $K$ best beams predicted by the ML model, and the beam prediction error (%) is calculated as 1 − beam prediction accuracy. Here, the Top-1 genie-aided transmit beam is obtained via EBS [15]. Further, the beam prediction accuracy is also evaluated in terms of the achieved average RSRP. Finally, for complexity analysis, we compare the model complexity in terms of the number of trainable parameters and the number of floating-point operations (FLOPs).
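Both KPIs are straightforward to compute from model outputs. The sketch below is one possible implementation; the probability rows and labels are hypothetical toy data, not results from the evaluation.

```python
def overhead_reduction_percent(n_measured, m_total):
    """Reference signaling overhead reduction: 100 * (1 - N / M)."""
    return 100.0 * (1 - n_measured / m_total)

def top_k_accuracy_percent(prob_rows, true_beams, k):
    """Share of samples whose true best beam is among the K most probable beams."""
    hits = 0
    for probs, truth in zip(prob_rows, true_beams):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        hits += truth in ranked[:k]
    return 100.0 * hits / len(true_beams)

# Toy predictions over three candidate beams (hypothetical values).
prob_rows = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]
true_beams = [1, 2, 0]
top1 = top_k_accuracy_percent(prob_rows, true_beams, k=1)
top2 = top_k_accuracy_percent(prob_rows, true_beams, k=2)
print(overhead_reduction_percent(16, 64), round(top1, 2), round(top2, 2))
```

The beam prediction error follows as `100 - top1` for the Top-1 case.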

C. Complexity Analysis
An important measure of ML model complexity is the number of trainable parameters ($n_l$), which for an FC-NN layer with $n_i$ inputs and $n_o$ outputs can be computed as $n_l = (n_i + 1) n_o$. Consequently, for the proposed model the number of parameters is $n_l = (F_p + 1)F = 1088$. Further, the number of trainable parameters for a convolutional layer can be obtained as $n_l = n_f (f_h f_w f_d + 1)$, where $n_f$, $f_h$, $f_w$, and $f_d$ indicate the number of filters, filter height, width, and depth, respectively. We evaluate the complexity in terms of model size with 32-bit precision. Table II indicates that, due to a smaller number of trainable parameters, the proposed model has the smallest size as compared to other models.
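The parameter counts above can be reproduced with a few lines. In this sketch, the conversion of 1 Mbit to $10^6$ bits is an assumption (it is roughly in line with the sizes listed in Table II), and the convolutional example uses hypothetical filter dimensions.

```python
def fc_params(n_in, n_out):
    """Trainable parameters of an FC layer: (n_i + 1) * n_o."""
    return (n_in + 1) * n_out

def conv_params(n_filters, f_h, f_w, f_d):
    """Trainable parameters of a convolutional layer: n_f * (f_h * f_w * f_d + 1)."""
    return n_filters * (f_h * f_w * f_d + 1)

def model_size_mbits(n_params, bits_per_param=32):
    """Model size in Mbit, assuming 1 Mbit = 1e6 bits."""
    return n_params * bits_per_param / 1e6

proposed = fc_params(16, 64)  # F_p = 16 inputs, F = 64 outputs
print(proposed, model_size_mbits(proposed))  # 1088 0.034816
print(conv_params(n_filters=8, f_h=3, f_w=3, f_d=1))  # 80 (hypothetical layer)
```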

TABLE II

| Model | No. of Trainable Parameters | Model Size (Mbits) | No. of FLOPs |
|---|---|---|---|
| FC-NN in [5] | 17,728 | 0.5 | 1.77 × 10^4 |
| CNN in [6] | 352,034 | 11.2 | 1.37 × 10^6 |
| CNN in [7] | 67,008 | 2.1 | 3.32 × 10^5 |
| CNN in [12] | 739… | | |

Here, in addition to the parameters defined above, $i_h$ and $i_w$ indicate the input height and width, respectively. Further, the inference time complexity is obtained analogously, counting only the forward pass. Table II summarizes the complexity comparison with the state of the art. For a fair comparison, the estimated number of FLOPs is given for one epoch and one data sample, i.e., $n_e = n_d = 1$. It can be seen that the proposed ML model achieves significantly lower computational complexity and therefore benefits from lower power consumption. Further, executing the proposed ML model on an Intel i7-1185G7 processor shows a training time of 9 µs per epoch and per data sample, which allows efficient and less time-consuming model retraining. Moreover, the execution time for each prediction is around 2 µs, allowing faster beam prediction.

D. Simulation Results
For performance evaluation, in addition to the two-level HBS, the CNN from [6], and the FC-NN from [5], the EBS-based beam selection is used as a baseline for comparison [15]. During inference, the inputs to all ML models are the $\mathrm{RSRP}^p$ measurements of the parent beams, and the outputs are the predicted probabilities of each child beam being the best.
In terms of beam measurement overhead, the baseline EBS requires 64 beam measurements, resulting in 100% beam measurement overhead. HBS requires 16 parent and 4 child beam measurements, resulting in a beam measurement overhead of 32%. For the ML models, during inference, the measurement overhead depends on the value of $K \in \{4, 2, 1\}$, reflecting the necessity of probing the remaining $K$ beams for final selection, resulting in a beam measurement overhead of around 32%, 28%, and 25%, respectively, as shown in Fig. 3. In terms of beam prediction error for $K = 1$, our proposed approach reduces the error by around 2, 1.4, and 1 percentage points as compared to HBS, [5], and [6], respectively. Similar observations can be made from Fig. 4, where the performance is compared in terms of the average RSRP. Here, it can be noticed that the mean RSRP achieved by all ML approaches is well within a 0.15 dB margin of the genie-aided (EBS) transmit beam. However, it is worth mentioning that HBS achieves similar performance at the cost of increased latency.

Fig. 5 showcases the generalization capabilities of our proposed model over three different scenarios as discussed in Section IV-A. We observe that for ML Top-1 the prediction error of the model increases by around 5 percentage points for scenario 2, due to the different channel profiles used in training and testing. Further, the error can be reduced when the model is trained on a mixed dataset from different channel profiles, i.e., scenario 3. However, the error in scenario 3 is still around 1.2 percentage points higher than in scenario 1. An important observation here is that training a model for a large number of scenarios results in reduced inference performance for a specific scenario. Thus, there exists a trade-off between ML model accuracy and its generalization capabilities.
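The overhead percentages quoted for the ML models at the start of this subsection appear to follow from counting the 16 parent measurements plus $K$ additional probes whenever $K > 1$. This counting rule is an assumption inferred from the quoted figures rather than something stated explicitly; under it, the values come out close to the approximate percentages given in the text.

```python
def measurement_overhead_percent(num_beams, num_parent, k):
    """Overhead relative to EBS, assuming K extra probes are needed only when K > 1."""
    extra_probes = k if k > 1 else 0
    return 100.0 * (num_parent + extra_probes) / num_beams

overheads = {k: measurement_overhead_percent(64, 16, k) for k in (4, 2, 1)}
print(overheads)  # {4: 31.25, 2: 28.125, 1: 25.0}
```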

V. CONCLUSION
This letter proposes an ML-based beam prediction design that reduces the reference signaling overhead and predicts the transmit beam with higher accuracy and much lower computational complexity than the state of the art. Specifically, we formulated the beam prediction problem as a multiclass classification task and proposed a low-complexity ML design that learns the spatial angular correlation between parent and child beams to predict the optimal beam. Due to its lower computational complexity, the proposed model reduces the power consumption at the UE and the beam prediction time, making it suitable for faster beam prediction. Further, through simulation results, we showed that there exists a trade-off between ML model performance and its generalization capabilities. These 3GPP-compliant evaluation results indicate the feasibility of ML-based mmWave beam prediction for 5G-Advanced NR and beyond-5G communication networks.
At the UE, the received signals are combined with a receive combining vector $\mathbf{w}$.

TABLE I: LIST OF SIMULATION PARAMETERS.

• Scenario 2: The ML model is trained on a dataset constructed using the CDL-D channel profile and performs inference on a UE with the CDL-E channel profile and with unknown location.
• Scenario 3: The ML model is trained on the mixed dataset from the above scenarios and performs inference on UEs of both channel profiles but with unknown location.

Owing to its smaller number of trainable parameters, the proposed model has the smallest size as compared to other models. The time complexity of our proposed ML model is compared in terms of the number of required FLOPs using Big-O notation. During training, the ML model performs a forward and a backward pass, so it is useful to analyze the training and inference time complexity separately. In both the forward and the backward pass, the trainable parameters of a layer with $w$ nodes enter a matrix-vector multiplication, resulting in a time complexity of $O(w^2)$ FLOPs. Furthermore, training an NN with $l$ layers and $w$ nodes per layer on $n_d$ data samples for $n_e$ epochs requires $O(n_e n_d l w^2)$ FLOPs, while inference requires only $O(n_d l w^2)$ FLOPs, as only the forward pass is performed during inference. Similarly, the training time complexity of a CNN with $l_c$ convolutional and $l_{FC}$ FC layers can be expressed in Big-O notation.
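One plausible Big-O form for the CNN case, written with the symbols used in the complexity analysis ($n_f$, $f_h$, $f_w$, $f_d$ for the filters and $i_h$, $i_w$ for the input height and width), is sketched below. It assumes that every convolutional layer has the same shape and produces an $i_h \times i_w$ output (stride one with padding); these are simplifying assumptions, and the expression is a reconstruction consistent with the FC case rather than a quoted formula.

```latex
% Training: forward and backward pass over n_d samples for n_e epochs
O\Big( n_e\, n_d \big( l_c\, i_h\, i_w\, f_h\, f_w\, f_d\, n_f + l_{FC}\, w^2 \big) \Big)

% Inference: forward pass only
O\Big( n_d \big( l_c\, i_h\, i_w\, f_h\, f_w\, f_d\, n_f + l_{FC}\, w^2 \big) \Big)
```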