Deep Unfolded Hybrid Beamforming in Reconfigurable Intelligent Surface Aided mmWave MIMO-OFDM Systems

This letter considers a millimeter-wave (mmWave) multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) transceiver system assisted by a reconfigurable intelligent surface (RIS). The goal is to jointly design transceiver hybrid beamforming and RIS phase shifts to maximize spectral efficiency (SE). We adapt the weighted minimum mean square error manifold optimization (WMMSE-MO) algorithm to the RIS-assisted system, and further deep-unfold it with neural networks to alleviate the algorithm’s computational complexity and expedite its convergence. The proposed deep-unfolded WMMSE-MO algorithm demonstrates superior SE performance, convergence speed, and computational efficiency compared to both its counterpart without deep unfolding and previous methods.

sparse nature of the mmWave channel was exploited to jointly design hybrid beamforming and RIS to maximize SE. In [5], a geometric mean decomposition (GMD) and principal component analysis (PCA) based approach was proposed for designing the RIS and transceivers, which reduces the BER. In [6], majorization-minimization (MM) and manifold optimization (MO) algorithms were compared for solving the RIS optimization problem, showing the advantages of MO. These optimization-based algorithms, however, require multiple iterations, which increases computation time.
To address this challenge, deep learning (DL) has emerged as a viable solution. DL-based methods have been employed in optimizing RIS-aided transmission [7], [8], [9]. In [7], a convolutional neural network (CNN) was used to predict RIS phase shifts and SE based on user locations. In [8], an unsupervised learning approach was used to design the RIS with reduced complexity. In [9], a two-stage neural network based on unsupervised learning was employed, with RIS design and beamforming design handled in separate stages. However, end-to-end DL models can face computational and memory challenges due to the growing number of trainable parameters. To address this, the deep unfolding technique [10], a model-driven DL approach, employs a compact set of trainable parameters by unfolding iterative algorithms into multi-layered neural networks. This technique has demonstrated its effectiveness in fully-digital beamforming design [11] and channel estimation [12] in RIS-aided systems, resulting in reduced complexity and enhanced performance. However, the joint design of hybrid beamforming in RIS-aided mmWave MIMO-OFDM systems remains unexplored by the aforementioned algorithms.
In this letter, we address the joint design of partially-connected hybrid beamforming in RIS-aided mmWave MIMO-OFDM systems to maximize SE. We introduce a sequential optimization framework based on the weighted minimum mean square error MO (WMMSE-MO) algorithm for hybrid beamforming and RIS design, and we employ deep unfolding to expedite convergence by replacing the Armijo backtracking line search in the WMMSE-MO algorithm with a neural network. Our main contributions are:

• To the best of our knowledge, this letter presents the first exploration of the WMMSE-MO algorithm and its deep unfolding in RIS-aided mmWave MIMO-OFDM systems. The integration of deep unfolding into hybrid beamforming addresses challenges such as the complexity posed by iterative MO algorithms in RIS-aided systems.

is the number of RF chains at the UE, the decoded signal for the $k$th subcarrier is expressed as

where $D$ is the length of the cyclic prefix. The $d$th delay taps $\mathbf{H}_{1,d}$ and $\mathbf{H}_{2,d}$ each comprise a line-of-sight (LoS) path and $L$ non-line-of-sight (NLoS) paths [5], as described by

where $p(\tau)$ is the rectangular pulse shaping filter for $T_s$-spaced signaling; $\tau_l$ and $\tilde{\tau}_l$ (with $\tau_0 = \tilde{\tau}_0 = 0$) denote the relative time delays of the $l$th path; $\kappa_l$ and $\upsilon_l$ are complex channel gains, with $\kappa_0$ and $\upsilon_0$ of the LoS component modeled by $\mathcal{CN}(0, 1)$ and $\{\kappa_l\}_{l=1}^{L}$ and $\{\upsilon_l\}_{l=1}^{L}$ of the NLoS components modeled by $\mathcal{CN}(0, 10^{-0.1\mu})$ with a Rician factor $\mu$; $\theta_l^t$ and $\theta_l^r$ denote the angle of departure (AoD) at the BS and the angle of arrival (AoA) at the UE for the $l$th path, respectively; $\gamma_l^t$ ($\gamma_l^r$) and $\eta_l^t$ ($\eta_l^r$) are the azimuth and elevation AoD (AoA) at the RIS, respectively; and $\mathbf{a}_{\mathrm{BS}}$, $\mathbf{a}_{\mathrm{RIS}}$, and $\mathbf{a}_{\mathrm{UE}}$ are the steering vectors at the BS, RIS, and UE, respectively. Considering the uniform linear array (ULA) structure at the BS and UE, and the uniform planar array (UPA) structure at the RIS, the steering vectors are expressed as

where $d_a$ is the antenna spacing, which is set to half the wavelength, $\lambda/2$; $N$ represents $N_t$ and $N_r$ for the BS and UE, respectively; and $M_y$ and $M_z$ represent the numbers of horizontal and vertical elements of the RIS, with $M = M_y \times M_z$.
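As a minimal sketch of the array model described above, the following NumPy code builds half-wavelength ULA and UPA steering vectors and a cascaded channel of the form $\mathbf{H}_2 \mathbf{\Phi} \mathbf{H}_1$. The function names, the angle conventions, the unit-norm scaling, and the single-path toy channels are assumptions made for illustration and may differ from the exact expressions used in this letter.

```python
import numpy as np

def ula_steering(n, angle, spacing=0.5):
    """Unit-norm ULA steering vector; spacing is in wavelengths (0.5 = lambda/2)."""
    phase = 2 * np.pi * spacing * np.arange(n) * np.sin(angle)
    return np.exp(1j * phase) / np.sqrt(n)

def upa_steering(m_y, m_z, azimuth, elevation, spacing=0.5):
    """Unit-norm UPA steering vector as a Kronecker product (assumed convention)."""
    phase_y = 2 * np.pi * spacing * np.arange(m_y) * np.sin(azimuth) * np.sin(elevation)
    phase_z = 2 * np.pi * spacing * np.arange(m_z) * np.cos(elevation)
    return np.kron(np.exp(1j * phase_y), np.exp(1j * phase_z)) / np.sqrt(m_y * m_z)

def effective_channel(h2, ris_phases, h1):
    """Cascaded channel H_eff = H2 @ diag(exp(j * ris_phases)) @ H1."""
    return h2 @ np.diag(np.exp(1j * ris_phases)) @ h1

# Toy single-path example (all values hypothetical): N_t = 8, N_r = 4, 4 x 4 RIS.
n_t, n_r, m_y, m_z = 8, 4, 4, 4
a_bs = ula_steering(n_t, 0.3)
a_ue = ula_steering(n_r, -0.5)
a_ris_in = upa_steering(m_y, m_z, 0.2, 1.0)
a_ris_out = upa_steering(m_y, m_z, -0.4, 1.2)
h1 = np.outer(a_ris_in, a_bs.conj())   # shape (M, N_t)
h2 = np.outer(a_ue, a_ris_out.conj())  # shape (N_r, M)
h_eff = effective_channel(h2, np.zeros(m_y * m_z), h1)
```

With all RIS phases set to zero, the RIS matrix is the identity and `h_eff` reduces to `h2 @ h1`, which gives a quick sanity check of the dimensions.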
Our objective is to maximize the average SE of the system through the design of hybrid beamforming and RIS beamforming. The SE at the $k$th subcarrier is given by

Then, the problem is formulated as

) elements [13]. Problem (8) is nonconvex due to the transmit power constraint (8b) and the constant modulus constraints (8c)-(8e). To tackle this challenging problem, we consider manifold optimization (MO)-based algorithms used in hybrid beamforming scenarios, both with RIS [14] and without RIS [13]. We address the inherent high complexity resulting from the iterative nature of the MO algorithm by applying deep unfolding. Deep unfolding is especially valuable in scenarios involving an RIS, as the complexity of the iterative algorithm increases with the presence of the RIS.
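To make the SE objective concrete, the sketch below evaluates a commonly used SE expression for hybrid combining, $\log_2 \det\big(\mathbf{I} + \sigma^{-2} (\mathbf{W}^H \mathbf{W})^{-1} \mathbf{W}^H \mathbf{H} \mathbf{F} \mathbf{F}^H \mathbf{H}^H \mathbf{W}\big)$, and averages it over subcarriers. This expression is an assumption for illustration and may differ in detail from the formula used in this letter; the function names are hypothetical.

```python
import numpy as np

def spectral_efficiency(h_eff, f, w, noise_power):
    """SE (bits/s/Hz) of one subcarrier for precoder f and combiner w."""
    n_s = f.shape[1]
    signal = w.conj().T @ h_eff @ f             # effective N_s x N_s channel
    noise_cov = noise_power * (w.conj().T @ w)  # noise covariance after combining
    ratio = np.linalg.solve(noise_cov, signal @ signal.conj().T)
    _, logdet = np.linalg.slogdet(np.eye(n_s) + ratio)
    return logdet / np.log(2)

def average_spectral_efficiency(h_effs, fs, ws, noise_power):
    """Average SE over subcarriers, given per-subcarrier matrices."""
    values = [spectral_efficiency(h, f, w, noise_power)
              for h, f, w in zip(h_effs, fs, ws)]
    return float(np.mean(values))

# Identity channel, precoder, and combiner with unit noise power.
eye = np.eye(2, dtype=complex)
se_example = spectral_efficiency(eye, eye, eye, noise_power=1.0)
```

In the identity example the argument of the determinant is $2\mathbf{I}_2$, so the SE equals $\log_2 4 = 2$ bits/s/Hz.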

III. DEEP-UNFOLDED JOINT HYBRID BEAMFORMING AND RIS OPTIMIZATION
The original problem (8) is transformed into an equivalent weighted minimum mean square error (WMMSE) problem, expressed as problem (9), through manipulations that include deriving the optimal weight matrices and substituting variables, which ensures identical global optimal solutions [13]:

(8b), (8c), (8d), (8e),

where $\mathbf{\Lambda}_k$ and $\mathbf{E}_k$ represent the weight matrix and the modified MSE matrix for the $k$th subcarrier, respectively, and $\xi_k$ denotes the scaling factor. The modified MSE is given by [13]

We propose a novel algorithm, based on the WMMSE-MO algorithm originally designed for scenarios without an RIS [13], to solve problem (9). Our proposed algorithm adopts a nested iterations framework that incorporates both inner and outer iterations, as illustrated in Fig. 1. The $I$ stacked rectangles represent $I$ outer layers, and within each outer iteration $i$, the algorithm sequentially updates the RF beamformers, the RIS matrix, and the baseband matrices such as $\mathbf{W}_{\mathrm{BB},k}$, as shown in Fig. 1(b). Specifically, $\mathbf{F}_{\mathrm{RF}}^{(i)}$, $\mathbf{\Phi}^{(i)}$, and $\mathbf{W}_{\mathrm{RF}}^{(i)}$ are updated via the MO procedure, while the baseband matrices are updated directly using the WMMSE algorithm. The details are described below.
1) Updates of $\mathbf{F}_{\mathrm{RF}}^{(i)}$: $\mathbf{F}_{\mathrm{RF}}^{(i)}$ is updated via the MO procedure [13]:

(11)

where $j$ denotes the $j$th inner iteration of the MO algorithm; $\rho_f^{(i,j)}$ is the step size; and $\nabla_{\mathcal{M}} f(\mathbf{F}_{\mathrm{RF}}^{(i,j-1)})$ is the Riemannian gradient associated with the objective of problem (9). The MO procedure is deep unfolded through a neural network, as depicted in Fig. 1(a). In particular, the step size $\rho_f^{(i,j)}$ is determined by a learnable neural network followed by a normalization, as given in (12) and (13). In (12), $V_{\mathrm{ext}}(\cdot)$ extracts the real and imaginary parts of the nonzero terms in $\mathbf{F}_{\mathrm{RF}}^{(i,j-1)}$ and vectorizes them, the real-valued weights and bias are trainable, and LeakyReLU denotes the activation function. In (13), a normalization based on the gradient's amplitude is performed. Compared to the step size obtained through the Armijo backtracking line search proposed in [13], the learned step size can accelerate the optimization process.
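The MO update with a learned step size can be sketched as follows for a vector of unit-modulus entries, such as the nonzero terms of an analog beamformer. This is an illustrative sketch under stated assumptions, not the exact form of (11)-(13): a single linear layer with LeakyReLU, normalization by the norm of the Riemannian gradient, the standard tangent-space projection and element-wise retraction on the unit-modulus manifold, and a toy objective $\|\mathbf{x} - \mathbf{t}\|^2$. All function and variable names are hypothetical.

```python
import numpy as np

def leaky_relu(v, slope=0.01):
    return np.where(v > 0, v, slope * v)

def learned_step_size(x, riem_grad, weights, bias):
    """Step size from a one-layer network, normalized by the gradient norm."""
    features = np.concatenate([x.real, x.imag])  # real-valued input features
    raw = leaky_relu(weights @ features + bias)
    return float(raw) / (np.linalg.norm(riem_grad) + 1e-12)

def riemannian_gradient(x, euclid_grad):
    """Project the Euclidean gradient onto the tangent space at x (|x_n| = 1)."""
    return euclid_grad - np.real(euclid_grad * np.conj(x)) * x

def retract(x):
    """Map back to the unit-modulus manifold element-wise."""
    return x / np.abs(x)

def mo_step(x, euclid_grad, weights, bias):
    grad = riemannian_gradient(x, euclid_grad)
    rho = learned_step_size(x, grad, weights, bias)
    return retract(x - rho * grad)

# Toy objective f(x) = ||x - target||^2 with Euclidean gradient x - target.
x = np.exp(1j * np.array([0.1, 1.0, -2.0]))
target = np.exp(1j * np.array([0.5, 0.5, 0.5]))
weights, bias = np.zeros(6), 0.1  # fixed here; trainable in a deep-unfolded layer
x_new = mo_step(x, x - target, weights, bias)
```

In a deep-unfolded network, each inner iteration would carry its own `weights` and `bias`, trained end to end instead of being fixed as in this toy example.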
The Euclidean gradient $\nabla g(\mathbf{W}_{\mathrm{RF}}^{(i,j-1)})$ in (15) has been previously derived in [13], and it is conceptually similar to $\nabla_{\mathcal{M}} f(\mathbf{F}_{\mathrm{RF}}^{(i,j-1)})$. The Euclidean gradient $\nabla u(\mathbf{\Phi}^{(i,j-1)})$, used to calculate $\nabla_{\mathcal{M}} u(\mathbf{\Phi}^{(i,j-1)})$ in (14), is derived as follows. For brevity, we omit the superscripts denoting the inner and outer iterations.
We first define $u(\mathbf{\Phi})$. It can be shown that the objective function in (9a) can be reformulated as $u(\mathbf{\Phi})$. Then, we derive $\nabla u(\mathbf{\Phi})$ by differentiating $u(\mathbf{\Phi})$, i.e.,
where in (a), we have utilized the matrix differentiation property $d(\mathbf{X}\mathbf{Q}\mathbf{Y}) = \mathbf{X}\,d(\mathbf{Q})\,\mathbf{Y}$, with $\mathbf{X}$ and $\mathbf{Y}$ being constant matrices independent of $\mathbf{Q}$, and the trace property $\mathrm{tr}(\mathbf{A}\mathbf{B}) = \mathrm{tr}(\mathbf{B}\mathbf{A})$. Next, we use the properties

, with the incorporation of the mask function $\mathbf{I}_M$ to comply with the diagonal matrix, where

in the $i$th outer iteration.

The proposed deep-unfolded WMMSE-MO algorithm employs a fixed number of inner and outer iterations. In the next outer iteration $i + 1$, the update steps described above are repeated. The iterations cease when either the specified number of iterations is reached or the relative difference between consecutive MSE values falls below a threshold [15].
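The stopping rule described above can be illustrated with the following sketch, which runs a generic outer-iteration update until either a fixed iteration budget is exhausted or the relative change in the MSE falls below a threshold. The function `run_outer_iterations`, the `update_fn` interface, and the toy MSE sequence are hypothetical and serve only to show the control flow.

```python
def run_outer_iterations(update_fn, state, max_iters, tol=None):
    """Repeat update_fn until max_iters or a small relative MSE change.

    update_fn maps a state to (new_state, mse). With tol=None the loop
    always runs for the fixed number of iterations.
    """
    prev_mse = None
    for i in range(1, max_iters + 1):
        state, mse = update_fn(state)
        if tol is not None and prev_mse is not None:
            if abs(prev_mse - mse) / max(abs(prev_mse), 1e-12) < tol:
                return state, i
        prev_mse = mse
    return state, max_iters

def toy_update(k):
    """Toy update whose MSE sequence is 1.1, 1.01, 1.001, ..."""
    k += 1
    return k, 1.0 + 0.1 ** k

_, iters_fixed = run_outer_iterations(toy_update, 0, max_iters=6)
_, iters_tol = run_outer_iterations(toy_update, 0, max_iters=10, tol=1e-3)
```

Here `iters_fixed` equals the budget of 6, whereas the threshold-based run stops after 4 iterations, once the relative change drops below $10^{-3}$.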
The proposed deep-unfolded WMMSE-MO algorithm is trained in an unsupervised fashion, with the loss function chosen to align with the objective function defined in (8a):

where the trainable parameters are the weights and biases of the step-size networks, $N_B$ is the batch size, and $R_{n,k}$ represents the SE for the $k$th subcarrier and $n$th training sample in a mini-batch.
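A minimal sketch of such an unsupervised loss is the negative SE averaged over the mini-batch and the subcarriers, so that minimizing the loss maximizes the average SE. The normalization by both $N_B$ and the number of subcarriers is an assumption, and `unsupervised_loss` is a hypothetical name.

```python
import numpy as np

def unsupervised_loss(se_values):
    """Negative average SE; se_values has shape (N_B, K)."""
    se_values = np.asarray(se_values, dtype=float)
    return -float(se_values.mean())

# Two training samples, two subcarriers (toy SE values in bits/s/Hz).
loss_example = unsupervised_loss([[1.0, 2.0], [3.0, 4.0]])
```

For the toy values above, the average SE is 2.5 bits/s/Hz, so the loss is -2.5.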

IV. SIMULATION RESULTS
In this section, we present a performance comparison of the following schemes:
1) WMMSE-MO: the original WMMSE-MO algorithm [13] adapted for the RIS scenario, following the nested iterations framework in Fig. 1 but without deep unfolding;
2) DU-WMMSE-MO: the proposed deep-unfolded WMMSE-MO algorithm;
3) T-SVD [14]: an algorithm that optimizes the RIS by approximating the mmWave channel capacity and employs a matrix factorization-based hybrid beamforming algorithm. T-SVD offers versions for both fully-connected (FC) and partially-connected (PC) structures, denoted as 'T-SVD (FC)' and 'T-SVD (PC)', respectively;
4) T-WMMSE-MO: an algorithm that adopts the same approach for designing the RIS as in T-SVD and subsequently designs hybrid beamforming using the original WMMSE-MO algorithm; and
5) GMD-PCA [5]: an algorithm that designs the RIS using angles of arrival and departure, plus the GMD-based baseband precoder/combiner and PCA-based analog

in performance. The training-based step size adopted in DU-WMMSE-MO contributes to the improvement. Moreover, the proposed DU-WMMSE-MO(2,1) outperforms T-SVD (PC) and GMD-PCA, since DU-WMMSE-MO iteratively updates the beamforming and RIS matrices to achieve a better joint design, while the others first optimize the RIS and then the beamforming matrices, which could result in performance degradation. Furthermore, when the number of outer iterations reaches 6, DU-WMMSE-MO(6,1) exceeds the performance of T-SVD (FC). This shows that the proposed method can achieve superior performance with simplified hardware connections and reduced hardware costs.

Finally, we compare the average runtime of all schemes in Table I. T-WMMSE-MO exhibits a lower runtime than DU-WMMSE-MO and WMMSE-MO due to its one-pass RIS design, in contrast to the multiple outer iterations involved in the other two schemes. However, this reduction in runtime comes at the cost of significantly worse SE performance. To assess the extent of acceleration achieved by the deep unfolding technique, we observe that DU-WMMSE-MO(2,4) achieves a substantial runtime reduction of 34.93% compared to its WMMSE-MO counterpart, approaching the computation time of T-SVD (PC) even though DU-WMMSE-MO(2,4) significantly outperforms T-SVD (PC) in SE performance. Furthermore, DU-WMMSE-MO(2,1) and GMD-PCA have comparable complexity; however, the average SE achieved by the former is 1.3 bps/Hz higher than that of the latter at SNR = 10 dB. The complexity reduction of DU-WMMSE-MO compared to WMMSE-MO is even more pronounced for the (6,4) setting, reaching 62.53%. Overall, the results demonstrate the effectiveness of the proposed deep unfolding approach and the favorable balance between performance and complexity achieved by the DU-WMMSE-MO scheme.
V. CONCLUSION
In this letter, we have applied the WMMSE-MO algorithm to the joint hybrid beamforming and RIS design problem in RIS-aided mmWave MIMO-OFDM systems. We incorporated the deep unfolding technique to optimize the iterative algorithm's step size, leading to improved convergence speed, performance, and computational efficiency. Our proposed DU-WMMSE-MO scheme, which employs a partially-connected structure, has the potential to outperform T-SVD (FC), which uses a fully-connected structure, highlighting the cost-effectiveness of our method.
respectively. As mentioned earlier, the weights and biases result from the deep unfolding technique, which involves a neural network that requires training. Fig. 1 illustrates that the three MO blocks undergo individual training for their respective weights and biases during each inner and outer iteration.
Comparing Figs. 3(a) and 3(b), the average SE improves as the number of inner iterations increases for DU-WMMSE-MO, WMMSE-MO, and T-WMMSE-MO.

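The effective channel $\mathbf{H}_{\mathrm{eff},k}$ can be expressed as $\mathbf{H}_{2,k} \mathbf{\Phi} \mathbf{H}_{1,k}$, where $\mathbf{H}_{1,k} \in \mathbb{C}^{M \times N_t}$ and $\mathbf{H}_{2,k} \in \mathbb{C}^{N_r \times M}$ are the
$\times N_s$ and $\mathbf{W}_k = \mathbf{W}_{\mathrm{RF}} \mathbf{W}_{\mathrm{BB},k} \in \mathbb{C}^{N_r \times N_s}$, $\mathbf{H}_{\mathrm{eff},k}$ is the RIS-assisted effective channel at the $k$th subcarrier, and $\mathbf{n}_k \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_s})$ is the additive white Gaussian noise, with $\sigma^2$ being the noise power.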