Graph Neural Network-Based Joint Beamforming for Hybrid Relay and Reconfigurable Intelligent Surface Aided Multiuser Systems

This letter examines a downlink multiple-input single-output (MISO) system, where a base station (BS) with multiple antennas sends data to multiple single-antenna users with the help of a reconfigurable intelligent surface (RIS) and a half-duplex decode-and-forward (DF) relay. The system’s sum rate is maximized through joint optimization of active beamforming at the BS and DF relay and passive beamforming at the RIS. The conventional alternating optimization algorithm for handling this complex design problem is suboptimal and computationally intensive. To overcome these challenges, this letter proposes a two-phase graph neural network (GNN) model that learns the joint beamforming strategy by exchanging and updating relevant relational information embedded in the graph representation of the transmission system. The proposed method demonstrates superior performance compared to existing approaches, robustness against channel imperfections and variations, generalizability across varying user numbers, and notable complexity advantages.

Index Terms: Beamforming, relaying, reconfigurable intelligent surface, graph neural network, unsupervised learning.

I. INTRODUCTION
Reconfigurable intelligent surfaces (RISs) have shown significant potential to help increase throughput and expand coverage by manipulating the radio environment in future wireless communication systems. To achieve this, joint optimization of active transmit beamforming and passive RIS phase shifts has been studied to maximize the sum rate in RIS-assisted communication systems. However, traditional model-based methods (e.g., [1], [2], [3]) typically rely on an alternating optimization (AO) procedure to address the complex nonconvex problem and variable coupling in multiuser RIS systems. Despite efforts (such as those in [2], [3]) to reduce the complexity of the beamforming design problem, the algorithm runtime remains excessively high for practical applications due to the multivariable and coupled nature of the problem.
In recent years, machine learning-based methods have been adopted to design joint transmit and RIS beamforming with lower complexity compared to model-based methods. Supervised learning [4], [5] and deep reinforcement learning (DRL) [6], [7] based models were proposed. Huang et al. [4] proposed a deep neural network (DNN) model that uses supervised learning to design joint beamforming in a RIS-aided single-user system. Sheen et al. [5] applied supervised learning to map between RIS configurations and achievable rates based on user locations. These models require substantial information about user locations to train effectively. Huang et al. [6] utilized the deep deterministic policy gradient (DDPG) algorithm to jointly design active and passive beamforming without knowledge of the channels and mobility patterns. Zhu et al. [7] employed a soft actor-critic algorithm to enhance the selection strategies produced by the DDPG algorithm. Despite their effectiveness, supervised models generally require substantial labeled data generated by model-based solutions and have high training overhead. On the other hand, DRL methods may encounter convergence issues when dealing with fast- and independently-fading channels.
Unsupervised learning schemes that directly use channel state information (CSI) as model inputs have been proposed to overcome these challenges [8], [9]. An unsupervised learning approach to designing passive beamforming was proposed in [8], which does not require labeled data and has significant runtime advantages over conventional optimization algorithms. This idea was extended to the multiuser scenario by jointly designing active and passive beamforming with a two-stage neural network [9]. A similar approach using CSI as model inputs was applied to various scenarios with different model designs [10], [11], [12]. However, the effectiveness of these methods depends on the availability of CSI and the quality of channel estimation [13]. To address these issues, a graph neural network (GNN) architecture was proposed in [14] to directly learn the mapping from received pilots to active and passive beamformers without explicit channel estimation.
In this letter, we propose an unsupervised learning-based joint beamforming scheme for hybrid relay and RIS assisted multiuser systems using a GNN. A hybrid system offers technology interoperability and combines the benefits of both RISs and relays, e.g., the greater service ranges and signal processing capabilities of active relays, and the energy efficiency and cost-effectiveness of RISs. Previous studies on hybrid relay and RIS systems [15], [16], [17], [18] have primarily focused on single-user single-input single-output (SISO) systems using model-based methods, with only one study [18] proposing a DRL-based joint relay selection and RIS beamforming scheme. Moreover, existing studies considering multiuser multiple-input single-output (MISO) systems [19], [20] use AO procedures that have high complexity. As hybrid relay and RIS systems pose even greater complexity and variable coupling challenges than RIS-only systems, there is a need to develop low-complexity learning-based methods. Our proposed two-phase GNN model is designed to exchange and update relational information embedded in the graph representation of the hybrid relay and RIS assisted multiuser MISO system, thereby learning an effective joint beamforming strategy. Simulation results demonstrate the superior performance and lower complexity of the proposed method.
Notation: $(\cdot)^H$, $(\cdot)^T$, and $\mathrm{tr}(\cdot)$ denote the Hermitian, transpose, and trace operators, respectively, $\Re(\cdot)$ and $\Im(\cdot)$ are the real and imaginary parts of the argument, $\mathbb{R}$ is the set of real numbers, $\mathrm{diag}(\mathbf{a})$ is a diagonal matrix with vector $\mathbf{a}$ on the main diagonal, and $\mathrm{vec}(\mathbf{A})$ is the vectorization of matrix $\mathbf{A}$.

II. SYSTEM MODEL AND PROBLEM DESCRIPTION
We consider a hybrid relay and RIS assisted downlink multiuser MISO system. A base station (BS) with $M$ antennas transmits signals to $K$ single-antenna users, aided by a half-duplex decode-and-forward (DF) relay with $L$ antennas and a RIS with $N$ elements positioned in between. Let $\mathbf{H}_{x,y}$ or $\mathbf{h}_{x,y}$ (depending on the dimensions) be the channel between node $x$ and node $y$, where a node can be the BS, RIS, relay (R), or user $k$. The transmission takes place in two phases.

In the first phase, the BS transmits to the users, and the received signal at user $k$ is given by (1), where the transmit signal at the BS is composed of the source signals $s_k$ and the BS beamforming vectors $\mathbf{g}_k$ of all users, with $\mathbf{G} = [\mathbf{g}_1, \ldots, \mathbf{g}_K]$ and $\mathbf{s} = [s_1, \ldots, s_K]^T$, and the received signal is corrupted by additive noise. The SINR for user $k$ in the first phase is denoted by $\gamma^{(\mathrm{I})}_k$.

In the second phase, the relay transmits its received-and-then-decoded signal to the users. It is assumed that the relay perfectly decodes the signals of all users from its received signal in the first phase, given by (2), where $\mathbf{w}_R \sim \mathcal{CN}(\mathbf{0}, \sigma_R^2 \mathbf{I})$, provided that the SINR corresponding to user $k$ after applying matched filter combining at the relay is higher than a threshold $\gamma^{\mathrm{th}}_R$. The combining filter for user $k$ at the relay is formed from the first-phase channel to the relay and $\mathbf{g}_k$. Thus, the relay transmits the decoded signals with relay beamforming, the RIS applies its second-phase beamforming, and the received signal at user $k$ in the second phase, given by (4), is again corrupted by additive noise. The SINR for user $k$ in the second phase is denoted by $\gamma^{(\mathrm{II})}_k$.

After the two-phase transmissions, the received signals at each user $k$ are combined via maximal ratio combining (MRC), leading to the total SINR $\gamma_k = \gamma^{(\mathrm{I})}_k + \gamma^{(\mathrm{II})}_k$. We aim to maximize the achievable sum rate of all users by jointly optimizing BS beamforming, relay beamforming, and RIS beamforming. The problem is formulated as (5), where (5a) is the sum-rate objective, (5b) and (5c) represent the BS and relay transmit power constraints, respectively, (5d) specifies the finite-resolution ($B$-bit) RIS phase constraint over the discrete phase set $\Theta$, and (5e) ensures perfect relay decoding. The pre-log factor of 1/2 is omitted in (5a) but considered in all numerical sum-rate values.

Problem (5) involves coupled beamforming variables and is nonconvex. Traditional AO-based algorithms for decoupling the variables typically entail many iterations, approximations, and/or relaxations, incurring high complexity and suboptimal results. We are therefore motivated to develop a data-driven approach that has favorable performance and complexity properties, as described next.
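As a rough numerical illustration of the objective, the sketch below computes a per-phase SINR from effective channel vectors and beamformers, adds the two phases as MRC does, and evaluates the sum rate with the 1/2 pre-log factor. The function names, the inner-product convention $\mathbf{h}^H \mathbf{g}$, and the default unit noise power are assumptions made for illustration; they are not taken from the letter's signal model.

```python
import math

def inner(h, g):
    """Return h^H g for two equal-length complex vectors given as lists."""
    return sum(complex(hi).conjugate() * gi for hi, gi in zip(h, g))

def sinr(k, channels, beams, noise_power=1.0):
    """SINR of user k: desired power over interference plus noise.

    channels[k] is the (assumed) effective channel seen by user k and
    beams[j] is the beamformer intended for user j.
    """
    signal = abs(inner(channels[k], beams[k])) ** 2
    interference = sum(
        abs(inner(channels[k], beams[j])) ** 2
        for j in range(len(beams)) if j != k
    )
    return signal / (interference + noise_power)

def sum_rate(ch1, beams1, ch2, beams2, noise_power=1.0):
    """Sum rate over users with MRC across the two phases.

    The total SINR is the sum of the per-phase SINRs, and the 1/2
    pre-log factor accounts for the half-duplex two-phase protocol.
    """
    total = 0.0
    for k in range(len(ch1)):
        gamma = (sinr(k, ch1, beams1, noise_power)
                 + sinr(k, ch2, beams2, noise_power))
        total += 0.5 * math.log2(1.0 + gamma)
    return total
```

With two users, orthogonal unit channels, and matched unit beamformers in both phases, each user sees an SINR of 1 per phase, so the function returns $\log_2 3$.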

III. PROPOSED GNN-BASED JOINT BEAMFORMING
The proposed method is based on deep learning using a graph representation of the transmission system. Graph nodes represent network nodes and are encoded with corresponding node features that contain useful relational information about these nodes (e.g., channel information). The information is updated and exchanged among nodes through GNN layers, enabling collective learning of beamformers that produce desirable results. We propose a two-phase GNN model to accommodate the two-phase transmission in the hybrid relay and RIS system. The two phases of the GNN model share a similar architecture but have different inputs and graph interpretations, as illustrated in Fig. 1 and described as follows.

A. The First Phase
In the first phase, a fully-connected unweighted undirected graph of $K + 1$ nodes is constructed, including one RIS node that learns the RIS beamformer $\boldsymbol{\theta}_1$, and $K$ user nodes that learn the BS beamformer for each user, i.e., $\mathbf{g}_k$, $k = 1, \ldots, K$.
1) Initial Layer: The inputs of the initial layer contain channel information associated with first-phase transmissions from the BS to the users and to the relay, as expressed in (1) and (2), respectively. Specifically, the inputs are $\mathbf{H}^{(\mathrm{I})}_k$, $k = 1, \ldots, K$, and $\mathbf{H}_R$, where $\boldsymbol{\Phi} = [1, e^{-j2\pi/N}, \ldots, e^{-j2\pi(N-1)/N}]^T$ is a column of the DFT matrix adopted as the initial value of $\boldsymbol{\theta}_1$.
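The DFT-column initialization can be written out directly. The following minimal sketch builds the vector $[1, e^{-j2\pi/N}, \ldots, e^{-j2\pi(N-1)/N}]^T$; the function name is hypothetical.

```python
import cmath
import math

def dft_initial_phases(n_elements):
    """Return [exp(-j*2*pi*n/N) for n = 0..N-1], a column of the DFT matrix."""
    return [cmath.exp(-2j * math.pi * n / n_elements)
            for n in range(n_elements)]
```

Every entry has unit modulus, so the vector is a valid continuous-phase RIS configuration before any learning takes place.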
The initial features of the RIS node, $\mathbf{r}^{(0)}$, utilize the aggregated channel information about the users and the relay contained in $\mathbf{H}^{(\mathrm{I})}_k$, $k = 1, \ldots, K$, and $\mathbf{H}_R$, respectively. Define feature extractors, including $f^{(0)}: \mathbb{R}^{2N \times 1} \to \mathbb{R}^{q/2 \times 1}$, which is used to extract the information in $\mathbf{H}^{(\mathrm{I})}_k$, with $q$ being an adjustable parameter. Feature extraction is followed by element-wise mean operations over the indices $k$ and $m$, denoted by $\phi_{\mathrm{mean}}(\cdot)$, to retain the permutation invariance property of the GNN. Then, the initial features of the RIS node are obtained by concatenating the extracted features of $\mathbf{H}^{(\mathrm{I})}_k$, $k = 1, \ldots, K$, and $\mathbf{H}_R$. The initial features of the $k$th user node are denoted by $\mathbf{u}^{(0)}_k$.

2) Node Update Layers: Here, the node features are updated by exchanging and aggregating the information from the other nodes. The number of node update layers, denoted by $D$, is a design parameter. At the $d$th layer ($d = 1, \ldots, D$), the features of the RIS node are updated according to (10), giving $\mathbf{r}^{(d)} \in \mathbb{R}^{q(d+1) \times 1}$, where $f^{(d)}: \mathbb{R}^{2qd \times 1} \to \mathbb{R}^{q \times 1}$ is a node update function at the $d$th layer. By applying the element-wise mean function, the RIS node obtains an equal amount of information from each user node. In addition, the features from the previous layer are concatenated to retain the previous information.
Likewise, the features of the $k$th user node are updated according to (11), where $f^{(d)}_u: \mathbb{R}^{3qd \times 1} \to \mathbb{R}^{q \times 1}$ is a node update function at the $d$th layer and $\phi_{\mathrm{max}}(\cdot)$ is an element-wise max function. By applying the element-wise max function, permutation invariance is maintained and each user node can identify the dominant interference from the other users. The features from the previous layer are also concatenated.
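The aggregation pattern described here can be sketched in a few lines. The code below is a toy version under stated assumptions: `toy_update` is a hypothetical stand-in for the learned update functions, and the ordering of the concatenated inputs is a guess that is merely consistent with the stated input dimensions ($2qd$ for the RIS node, $3qd$ for a user node). It is meant only to show how mean and max aggregation keep the updates independent of user ordering.

```python
import math

def phi_mean(vectors):
    """Element-wise mean over a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def phi_max(vectors):
    """Element-wise max over a list of equal-length feature vectors."""
    return [max(col) for col in zip(*vectors)]

def toy_update(inputs, out_dim):
    """Hypothetical stand-in for a learned update function (e.g., an MLP)."""
    total = sum(inputs)
    return [math.tanh(total * (i + 1) / len(inputs)) for i in range(out_dim)]

def update_ris(r_prev, user_feats, q):
    """RIS node: aggregate all user features by mean, then concatenate
    the previous features with the q new ones."""
    agg = phi_mean(user_feats)
    return r_prev + toy_update(r_prev + agg, q)

def update_user(k, user_feats, r_prev, q):
    """User node k: aggregate the other users by max (needs K >= 2),
    also use the RIS features, then concatenate with the previous features."""
    others = [u for j, u in enumerate(user_feats) if j != k]
    agg = phi_max(others)
    return user_feats[k] + toy_update(user_feats[k] + agg + r_prev, q)
```

Reordering the user nodes leaves both updates unchanged, which is the property that later allows the same trained model to be applied to a different number of users.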
3) Readout Layer: After $D$ node update layers, the final node features of the RIS node and of the $K$ user nodes pass through a readout layer to generate the RIS beamformer $\boldsymbol{\theta}_1$ and the BS beamformers $\mathbf{g}_k$, $k = 1, \ldots, K$, respectively. For notational convenience, we use layer $D + 1$ to denote the readout layer. For the RIS node, the output of the readout layer, which performs a linear function $f^{(D+1)}: \mathbb{R}^{q(D+1) \times 1} \to \mathbb{R}^{2N \times 1}$, is given by $\mathbf{r}^{(D+1)} = f^{(D+1)}(\mathbf{r}^{(D)}) \in \mathbb{R}^{2N \times 1}$. The RIS beamformer $\boldsymbol{\theta}_1 = [\theta_{1,n}]_{n=1,\ldots,N}$ is obtained by first deriving the continuous-phase $\theta_{1,n}$ from $\mathbf{r}^{(D+1)}$ as in (13), and then quantizing $\theta_{1,n}$ to the nearest discrete phase in $\Theta$.
Note that the continuous phase is used in the training to allow for backpropagation.
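One way to picture the readout-to-phase step is sketched below. It assumes that entries $n$ and $N+n$ of the $2N$-dimensional readout output act as the real and imaginary parts of the $n$th element, and that $\Theta$ is the uniform $B$-bit set of phases $2\pi i / 2^B$, $i = 0, \ldots, 2^B - 1$. Both are assumptions made for illustration, since the exact mapping and phase set are not reproduced here, and the function names are hypothetical.

```python
import cmath
import math

def continuous_phases(readout, n_elements):
    """Pair entries n and N+n into unit-modulus complex coefficients.

    Assumes each pair is nonzero so that the normalization is defined.
    """
    thetas = []
    for n in range(n_elements):
        z = complex(readout[n], readout[n_elements + n])
        thetas.append(z / abs(z))
    return thetas

def quantize_phases(thetas, bits):
    """Map each unit-modulus coefficient to the nearest of 2**bits
    uniformly spaced phases (assumed form of the discrete set)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    quantized = []
    for t in thetas:
        # cmath.phase returns a value in (-pi, pi]; the modulo wraps
        # negative indices back into 0..levels-1.
        idx = round(cmath.phase(t) / step) % levels
        quantized.append(cmath.exp(1j * idx * step))
    return quantized
```

In training, only `continuous_phases` would be used so that gradients can flow; `quantize_phases` would be applied at inference time.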
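Similarly, the final user node features pass through the readout layer $f^{(D+1)}_u$. The BS beamformer for user $k$, $\mathbf{g}_k$, is formed by pairing the $m$th and $(M+m)$th entries of the readout output into complex entries, $m = 1, \ldots, M$, subject to normalization to meet the power constraint in (5b).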
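The normalization step can be illustrated as follows. This sketch assumes that the power constraint bounds the total power $\mathrm{tr}(\mathbf{G}\mathbf{G}^H)$ and that the beamformers are rescaled to meet the budget with equality; the function name and the `max_power` parameter are hypothetical.

```python
import math

def normalize_beamformers(beams, max_power):
    """Scale all beamforming vectors by a common factor so that the
    total power (sum of squared magnitudes of all entries) equals
    max_power. Assumes at least one entry is nonzero."""
    total = sum(abs(x) ** 2 for g in beams for x in g)
    scale = math.sqrt(max_power / total)
    return [[x * scale for x in g] for g in beams]
```

A common scaling factor preserves the directions of all beamformers and their relative power split across users, so only the overall power level is changed.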

B. The Second Phase
The second phase of the GNN involves a similar procedure as in the first phase. Here, the RIS node learns the RIS beamformer $\boldsymbol{\theta}_2$ and the $K$ user nodes learn the relay beamformers $\mathbf{f}_k$, $k = 1, \ldots, K$. The final node features from the first phase of the GNN are exploited in the second phase.
1) Initial Layer: The inputs of the initial layer contain the channel information, for $k = 1, \ldots, K$, associated with second-phase transmissions as shown in (4), as well as the final node features of the RIS node from the first phase of the GNN, $\mathbf{r}^{(D)}$, which enter the initial features of the RIS node.
2) Node Update Layers: The features of the RIS node and the user nodes at the $d$th layer ($d = 1, \ldots, D$) are updated in the same manner as in the first phase, with the RIS node aggregating the user node features through $\phi_{\mathrm{mean}}(\cdot)$, where $g^{(d)}: \mathbb{R}^{2q(D+d+1) \times 1} \to \mathbb{R}^{q \times 1}$ is the node update function of the RIS node at the $d$th layer.

3) Readout Layer: Similar to the readout layer operation in the first phase of the GNN, the RIS beamformer $\boldsymbol{\theta}_2$ is obtained from $\mathbf{s}^{(D+1)} = g^{(D+1)}(\mathbf{s}^{(D)}) \in \mathbb{R}^{2N \times 1}$, and the relay beamformer $\mathbf{f}_k$ is obtained from the final features of the $k$th user node.

The proposed two-phase GNN model is trained offline in an unsupervised fashion using different channel realizations as training samples. The loss function is chosen to align with the design objective (5a), with an added penalty term weighted by a factor $\beta$ to reflect constraint (5e), where $\beta$ is empirically determined. Note that constraints (5b)-(5d) are met by normalization and quantization operations, as described previously.
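A loss of this kind might look like the sketch below: the negative sum rate plus a weighted penalty on violations of the relay decoding threshold. The hinge form of the penalty and all names are assumptions made for illustration; the exact penalty expression is not reproduced here.

```python
import math

def unsupervised_loss(total_sinrs, relay_sinrs, relay_threshold, beta):
    """Negative sum rate plus beta times the total shortfall of the
    relay SINRs below the decoding threshold (assumed hinge penalty).

    The 1/2 pre-log factor is left out, as in the design objective.
    """
    rate = sum(math.log2(1.0 + g) for g in total_sinrs)
    violation = sum(max(0.0, relay_threshold - g) for g in relay_sinrs)
    return -rate + beta * violation
```

Because the loss is computed directly from channel realizations and the model outputs, no labeled beamformers from a model-based solver are required. A larger `beta` pushes the model harder toward satisfying the decoding constraint at the expense of sum rate.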
Fig. 2 shows the sum-rate performance comparison. For the hybrid scenario, the proposed GNN ($\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$) scheme outperforms its two simplified variants, and all GNN-based schemes outperform model-based methods that rely on problem decoupling and various approximations and relaxations, which yield suboptimal results. GNN ($\boldsymbol{\theta}_2$ only) and GNN ($\boldsymbol{\theta}_1$ only) suffer slight performance degradation because GNN ($\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$) has greater flexibility in designing the two phases of RIS beamforming. Table I shows that GNN ($\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$) achieves the highest SINR in both phases, while training the RIS beamformer using solely the first-phase channel information, as done in GNN ($\boldsymbol{\theta}_1$ only), results in higher phase-one SINR but lower phase-two SINR compared to GNN ($\boldsymbol{\theta}_2$ only). GNN ($\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$) outperforms DNN ($\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$) due to its architecture design and enhanced feature extraction capabilities.

Comparing different deployments, we observe that the RIS-only scenario is advantageous when node distances are smaller and $N$ is larger, yielding moderate RIS double attenuation (as in Topology 1). However, deployments involving relays, especially hybrid systems, offer benefits when node distances are larger, where active relays help mitigate the impact of significant RIS attenuation and reduce the need for a large number of RIS elements (as in Topology 2).
Table II evaluates the generalization ability of the proposed GNN model across different numbers of users, showing that it can be trained with a smaller $K$ value and still achieve comparable performance when tested with a larger $K$ value, due to the model's feature extractors being independent of the number of users and its permutation invariance properties. The performance typically declines as the difference between the $K$ values used in training and testing increases. However, the model does not generalize to different numbers of RIS elements and BS/relay antennas due to dependencies of the feature extractors on these quantities, as described in Section III.

Table III compares the online runtime complexity. GNN has a slight advantage over DNN while achieving significantly reduced complexity compared to the AO-based algorithm.

V. CONCLUSION
We have proposed a two-phase GNN model for joint beamforming in hybrid relay and RIS aided multiuser MISO systems. The proposed GNN model showed promise in learning effective beamformers directly from the input CSI, in an unsupervised manner and without the need for iterative processes. Results showed superior performance and complexity over conventional techniques, generalization to varying numbers of users, and robustness against channel variations.

Fig. 1. Schematic of the proposed two-phase GNN model for joint BS, relay, and RIS beamforming in hybrid relay and RIS systems.
Bing-Jia Chen, Ronald Y. Chang, Senior Member, IEEE, Feng-Tsun Chien, Senior Member, IEEE, and H. Vincent Poor, Life Fellow, IEEE

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

TABLE I. USERS' PHASE I AND PHASE II SINR, WITH (M, N, L, K) = (8, 50, 4, 4)

TABLE II. SUM RATE FOR VARIOUS K'S USED IN TRAINING (ROW) AND TESTING (COLUMN), WITH (M, N, L) = (8, 50, 4), WHERE THE PERCENTAGE VALUE INDICATES THE ACHIEVED SUM-RATE RATIO COMPARED TO THE MODEL TRAINED AND TESTED WITH THE SAME K VALUE