Learning Resource Allocation Policy: Vertex-GNN or Edge-GNN?

Graph neural networks (GNNs) update the hidden representations of vertices (called Vertex-GNNs) or of edges (called Edge-GNNs) by processing, pooling, and combining the information of neighboring vertices and edges, thereby exploiting topology information. When learning resource allocation policies, GNNs cannot perform well if their expressive power is weak, i.e., if they cannot differentiate all input features such as channel matrices. In this paper, we analyze the expressive power of Vertex-GNNs and Edge-GNNs for learning three representative wireless policies: link scheduling, power control, and precoding. We find that the expressive power of the GNNs depends on the linearity and output dimensions of the processing and combination functions. When linear processors are used, the Vertex-GNNs cannot differentiate all channel matrices due to the loss of channel information, while the Edge-GNNs can. When learning the precoding policy, even Vertex-GNNs with non-linear processors may lack strong expressive power due to dimension compression. We proceed to provide necessary conditions for the GNNs to learn the precoding policy well. Simulation results validate the analyses and show that the Edge-GNNs can achieve the same performance as the Vertex-GNNs with much lower training and inference time.


I. INTRODUCTION
Optimizing resource allocation such as link scheduling, power control, and precoding is important for improving the spectral efficiency of wireless communications. Various numerical algorithms have been proposed to solve these problems, such as the weighted minimum mean-square error (WMMSE) and fractional programming algorithms [1], [2], which however incur high computational complexity. To facilitate real-time implementation, fully-connected neural networks (FNNs) have been introduced to learn resource allocation policies, which are the mappings from environmental parameters (e.g., channels) to the optimized variables [3]. While significant research efforts have been devoted to intelligent communications, most existing works optimize resource allocation with FNNs or convolutional neural networks (CNNs). These DNNs have high training complexity and are neither scalable nor generalizable to problem size (say, the number of users). This hinders their practical use in dynamic wireless environments.
Encouraged by the potential for achieving good performance, reducing sample complexity and space complexity, and supporting scalability and size generalizability, graph neural networks (GNNs) have been introduced for learning resource allocation policies [4]-[9]. These benefits of GNNs originate from exploiting the topology information of graphs and the permutation properties of wireless policies. To embed the topology information, the hidden representations in each layer of a GNN are updated by first aggregating the information of neighboring vertices and edges and then combining with the hidden representations in the previous layer, where the aggregation operation consists of processing and pooling. A policy can be learned either by a Vertex-GNN that updates the hidden representations of vertices or by an Edge-GNN that updates the hidden representations of edges, no matter whether the optimization variables are defined on vertices or edges. To harness the permutation property, parameter-sharing should be introduced in each layer of a GNN. It has been shown in [9] that a GNN designed for learning a wireless policy will not perform well if the permutation property of the functions learnable by the GNN is mismatched with the policy. However, even after satisfying the permutation property of a policy, a GNN may still not perform well due to insufficient expressive power for learning the policy. The expressive power of a GNN is weak if the GNN cannot distinguish some pairs of graphs [10]. When learning a wireless policy, the graphs (i.e., the samples) that a GNN learns over have different features. If a GNN maps different inputs (i.e., features) into the same output (called an action), then the policy learned by the GNN may not achieve fairly good performance. Nonetheless, such a cause of performance degradation has never been noticed until now.
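As a concrete illustration (not from the cited works), the aggregate-then-combine update described above can be sketched for a homogeneous graph with sum-pooling and linear processing; the function and weight names here are hypothetical.

```python
import numpy as np

def gnn_layer(d, adj, W_proc, W_self, act=np.tanh):
    """One vertex-update layer: process neighbor representations, sum-pool
    them along the graph, then combine with each vertex's own representation."""
    pooled = adj @ (d @ W_proc)        # sum-pooling of processed neighbor reps
    return act(d @ W_self + pooled)    # combine with own previous-layer rep

rng = np.random.default_rng(0)
V, F = 5, 4                            # number of vertices, hidden dimension
adj = (rng.random((V, V)) < 0.5).astype(float)
np.fill_diagonal(adj, 0.0)
d0 = rng.standard_normal((V, F))
W_proc, W_self = rng.standard_normal((F, F)), rng.standard_normal((F, F))
d1 = gnn_layer(d0, adj, W_proc, W_self)

# Because the weights are shared across vertices, permuting the vertices
# permutes the output the same way -- the permutation property of GNNs.
P = np.eye(V)[rng.permutation(V)]
assert np.allclose(gnn_layer(P @ d0, P @ adj @ P.T, W_proc, W_self), P @ d1)
```

The final assertion checks the parameter-sharing argument: relabeling the vertices of the graph relabels the output representations identically.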
A. Related Works

1) Vertex-GNNs and Edge-GNNs: GNNs can either update the hidden representations of vertices (i.e., Vertex-GNNs) or of edges (i.e., Edge-GNNs). In the machine learning literature, Vertex-GNNs were proposed for "vertex-level tasks" (say, vertex and graph classification) whose actions are defined on vertices [11]. For "edge-level tasks" (say, edge classification and link prediction) whose actions are defined on edges, both Vertex-GNNs and Edge-GNNs have been designed. In early works summarized in [11], edge-level tasks were learned by Vertex-GNNs with a read-out layer, which was used to map the representations of vertices to the actions on edges. In [12], [13], the edges of the original graph were transformed into vertices of a line-graph or hyper-graph. Then, the edge-level task on the original graph is equivalent to a vertex-level task on the line-graph or hyper-graph, which can be learned by Vertex-GNNs. In [14], an Edge-GNN was designed for learning an edge-level task.
In the literature of intelligent communications, GNNs have been designed to learn the link scheduling [4], [5], [15], [16], power control [6], [17], and power allocation [9] policies in device-to-device (D2D) communications and interference networks, the precoding policies in multi-user multi-input-multi-output (MIMO) systems [18]-[21], as well as the access point selection policy in cell-free MIMO systems [22]. Except for [19], [21], where the optimized precoding matrices were defined as the actions on edges and hence Edge-GNNs were designed, the actions in all these works were defined on vertices and thereby Vertex-GNNs were designed.
2) Structures of GNNs: When using GNNs for a learning task, graphs need to be constructed and the structures of the GNNs need to be designed or selected. For a resource allocation task, more than one graph can be constructed, say a homogeneous graph or a heterogeneous graph, and the action can be defined either on vertices or on edges. The structure of a GNN is captured by its update equation, which can either be a vertex-representation update or an edge-representation update, with a variety of choices for the processing, pooling, and combination functions. The commonly used pooling functions are the sum, mean, and max functions, and the processing and combination functions can be either linear or non-linear (say, using FNNs). The three functions used in the GNNs for wireless policies are provided in Table I, which were designed empirically without explaining the rationale.

Processing Function | Pooling Function | Combination Function | Literature
Linear function | sum | FNN without hidden layer | [5], [9], [19], […]

GNNs can be designed to satisfy permutation properties, which widely exist in wireless policies [21], [23]. Depending on the problem, a policy can have different properties such as one-dimension (1D)-permutation equivariance (PE), two-dimension (2D)-PE, and joint-PE [9]. A GNN learning over a homogeneous graph can automatically learn the policies with the 1D-PE property if the read-out layer (if needed) also satisfies the 1D-PE property, since permuting the vertices in the graph does not affect the output of the GNN. If a GNN is designed for learning a policy with the 2D-PE, joint-PE, or a more complicated PE property, then the parameter-sharing in the update equation of the GNN should be judiciously designed, as detailed in [21]. In [4], [6], [9], [19], [21], the PE properties of the considered wireless tasks were analyzed, and the GNNs were designed to satisfy the same properties. A GNN whose PE property matches a policy can be sample efficient, scalable, and size generalizable, but it may still not perform well due to insufficient expressive power for the policy.
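To make the joint-PE property concrete, consider a toy map (an illustrative sketch, not taken from the cited works) whose ith output combines the direct gain α_ii with the ith row and column sums of α; permuting the transceiver pairs then permutes the output accordingly:

```python
import numpy as np

def toy_policy(alpha):
    # ith output depends on the direct gain α_ii, the interference the ith
    # pair causes (row sum), and the interference it suffers (column sum)
    diag = np.diag(alpha)
    return np.tanh(diag + 0.5 * (alpha.sum(axis=1) - diag)
                        + 0.25 * (alpha.sum(axis=0) - diag))

rng = np.random.default_rng(0)
K = 5
alpha = rng.uniform(size=(K, K))
P = np.eye(K)[rng.permutation(K)]       # a permutation matrix

# joint-PE: jointly permuting rows and columns of α permutes the output
assert np.allclose(toy_policy(P @ alpha @ P.T), P @ toy_policy(alpha))
```

Any GNN whose learnable functions satisfy this relation automatically respects the joint-PE property of the link scheduling and power control policies.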
3) Expressive power of GNNs: The parameter-sharing in GNNs restricts their expressive power [24]. In the machine learning literature, the expressive power of Vertex-GNNs has been investigated for vertex classification, link prediction, and graph classification tasks, where the samples for training and testing a GNN are graphs with different topologies [10], [24]-[26].
The expressive power is characterized by the capability to distinguish non-isomorphic graphs. The 1-dimensional Weisfeiler-Lehman (1-WL) test is a widely used algorithm to distinguish non-isomorphic graphs, which consists of an injective aggregation process. Both the 1-WL test and Vertex-GNNs iteratively update the representation of a vertex by aggregating the representations of its neighboring vertices. Inspired by this finding, the expressive power of Vertex-GNNs was characterized in [10] by whether their aggregating functions are injective. To build a GNN that is as powerful as the 1-WL test, a graph isomorphism network (GIN) was developed in [10], which is a Vertex-GNN whose processing and combination functions are FNNs and whose pooling function is the sum function. Under the assumption that the features of vertices are from a countable set, e.g., all vertices have identical features, the aggregating functions of the GIN were proved to be injective. When replacing the processing functions in the GIN with linear functions or replacing the pooling functions with the mean or max function, the aggregating functions were proved to be non-injective, hence the resulting GNN has weaker expressive power than the GIN. It was empirically shown that the less powerful Vertex-GNNs perform worse than the GIN on a number of graph classification tasks.
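The role of the pooling function can be seen on two multisets of neighbor features (a standard example in the spirit of [10], not from this paper): sum-pooling separates them, while mean- and max-pooling map both to the same value and are hence non-injective.

```python
import numpy as np

s1 = np.array([1.0, 1.0, 2.0, 2.0])   # neighbor features of one vertex
s2 = np.array([1.0, 2.0])             # neighbor features of another vertex

assert s1.sum() != s2.sum()           # sum-pooling distinguishes the multisets
assert s1.mean() == s2.mean()         # mean-pooling collapses them (both 1.5)
assert s1.max() == s2.max()           # max-pooling collapses them too (both 2.0)
```

Since the downstream combination function only sees the pooled value, no choice of subsequent layers can recover a distinction that pooling has already destroyed.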
In [24]-[26], the expressive power of Vertex-GNNs and the techniques to improve it were reviewed. According to the analyses in [10], a GNN as powerful as the 1-WL test has the strongest expressive power among all Vertex-GNNs. Since the 1-WL test cannot distinguish some non-isomorphic graphs, e.g., k-regular graphs with the same size and same vertex features, the Vertex-GNNs also cannot distinguish them. To design Vertex-GNNs with stronger expressive power for these graphs, one can provide each vertex a unique feature to make the vertices more distinguishable [24], [26], or use a high-order GNN that updates the representation of a k-tuple of vertices to maintain more structural information of the graph [24]-[26]. However, these techniques incur extra computational costs.
Due to the tasks considered in [10], [24]-[26] and the references therein, these works did not consider the features of edges. When learning wireless policies, the edges of the constructed graphs may have features, and graphs with the same topology but different features usually correspond to different actions. The 1-WL test was proposed to distinguish non-isomorphic graphs without edge features, which differ from the graphs for wireless problems; hence, the analyses of the expressive power of GNNs in [10], [24]-[26] are not applicable to most wireless problems. The expressive power of a GNN for learning wireless policies is captured by its capability to distinguish input features (e.g., channel matrices), which has never been investigated so far.

B. Motivation and Major Contributions
In this paper, we strive to analyze the impact of the structure of a GNN for optimizing resource allocation or precoding on its expressive power, aiming to provide useful insight into the design of GNNs for learning wireless policies.
We take the link scheduling and power control problems in D2D communications and the precoding problem in multi-user MIMO system as examples, each representing a class of policies.
The link scheduling and power control policies are the mappings from the channel matrices to the optimized vectors in a lower dimensional space, where the optimization variables are respectively discrete and continuous scalars.The precoding policy is the mapping from the channel matrix to the optimized precoding matrix, where the optimization variables are vectors.To demonstrate that the graph constructed for a wireless policy is not unique and the policy can be learned by both Vertex-GNN and Edge-GNN, we consider different graphs and GNNs for each policy.
To the best of our knowledge, this is the first attempt to analyze the expressive power of the GNNs for learning wireless policies.The major contributions are summarized as follows.
• We find that the update equations of the Vertex-GNNs incur information loss when using linear processors, such that the Vertex-GNNs cannot differentiate all channel matrices. By contrast, the update equations of Edge-GNNs with linear processors do not incur the information loss.
• We find that the output dimensions of the processing and combination functions also restrict the expressive power of the Vertex-GNN for learning the precoding policy, in addition to the linearity of the processors. We provide a lower bound on the dimensions for a Vertex-GNN or Edge-GNN to avoid the dimension compression.
• We validate the analyses and compare the performance of Vertex-GNNs and Edge-GNNs for learning the three policies via simulations. Our results show that both the training time and inference time of the Edge-GNNs with linear processors are much lower than those of the Vertex-GNNs with FNN-processors to achieve the same performance.
The rest of this paper is organized as follows. In Section II, we introduce three resource allocation policies. In Section III, we present Vertex-GNNs and Edge-GNNs to learn the policies over the directed homogeneous or undirected heterogeneous graph, and analyze the expressive power of the GNNs with linear processors. In Section IV, we analyze the impacts of the linearity and output dimensions of the processing and combination functions on the expressive power of the GNNs. In Sections V and VI, we provide simulation results and conclusions.

Notations: X = [x_ij]_{m×n} denotes a matrix whose element in the ith row and jth column is x_ij, and |X|_max ≜ max_{i=1,…,m} max_{j=1,…,n} |x_ij|. Π, Π_1, and Π_2 denote permutation matrices. R, C, and I denote the sets of real, complex, and integer numbers, respectively. R^n denotes n-dimensional vector space.

II. RESOURCE ALLOCATION PROBLEMS AND POLICIES
In this section, we present three representative resource allocation problems.
1) Link scheduling: Consider a D2D communication system with K pairs of transceivers.
Every transmitter sends data to a receiver, and all the transmissions share the same spectrum.
Hence, there exists interference among the transceiver pairs, as illustrated in Fig. 1(a). To coordinate the interference, not all the D2D links are activated. A link scheduling problem that maximizes the sum rate of active links is [2], [5], [15]

max_{x_1,…,x_K} Σ_{k=1}^K log2( 1 + x_k α_kk p_k / ( Σ_{j=1,j≠k}^K x_j α_jk p_j + σ_0^2 ) )   s.t. x_k ∈ {0, 1},   (1)

where x_k is the active state of the kth D2D link, with x_k = 1 when the kth D2D link is active and x_k = 0 otherwise, α_jk is the composite large- and small-scale channel gain from the jth transmitter to the kth receiver, p_k is the power of the kth transmitter, and σ_0^2 is the noise power. The link scheduling policy is denoted as x* = F_ls(α), where x* = [x_1*, …, x_K*]^T is the optimized solution of the problem in (1) for a given channel matrix α = [α_ij]_{K×K}, and F_ls(·) is a function that maps α ∈ R^{K×K} into x* ∈ I^{K×1}. This policy is joint-PE to α [9], i.e., Π^T x* = F_ls(Π^T α Π).
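For reference, the objective of the scheduling problem and a brute-force baseline can be written as follows; this is a sketch under the stated notation, and `sum_rate` and `exhaustive_schedule` are hypothetical helper names.

```python
import numpy as np
from itertools import product

def sum_rate(x, alpha, p, sigma2=1.0):
    """Sum rate of the active links: alpha[j, k] is the gain from
    transmitter j to receiver k, matching the α_jk convention above."""
    K = len(x)
    rate = 0.0
    for k in range(K):
        signal = x[k] * alpha[k, k] * p[k]
        interf = sum(x[j] * alpha[j, k] * p[j] for j in range(K) if j != k)
        rate += np.log2(1.0 + signal / (interf + sigma2))
    return rate

def exhaustive_schedule(alpha, p, sigma2=1.0):
    """Optimal x* by enumerating all 2^K on/off patterns (small K only)."""
    K = alpha.shape[0]
    return max(product((0, 1), repeat=K),
               key=lambda x: sum_rate(x, alpha, p, sigma2))
```

The enumeration has 2^K candidates, which is why iterative algorithms [2] or learned policies are needed at scale.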
2) Power control: The interference in the D2D system can also be coordinated by adjusting the transmit power of every transmitter. The power control problem that maximizes the sum rate under a power constraint is [3], [6], [17]

max_{p_1,…,p_K} Σ_{k=1}^K log2( 1 + α_kk p_k / ( Σ_{j=1,j≠k}^K α_jk p_j + σ_0^2 ) )   s.t. 0 ≤ p_k ≤ P_max,   (2)

where P_max is the maximal transmit power.
The power control policy is denoted as p* = F_pc(α), where p* = [p_1*, …, p_K*]^T is the optimized solution of the problem in (2) for a given channel matrix α, and F_pc(·) is a function that maps α ∈ R^{K×K} into p* ∈ R^{K×1}. This policy is also joint-PE to α [6], i.e., Π^T p* = F_pc(Π^T α Π).

3) Precoding: Consider a multi-user multi-antenna system, where a base station (BS) equipped with N antennas transmits to K users each with a single antenna, as shown in Fig. 1(b). The precoding problem that maximizes the sum rate of users subject to a power constraint is [19], [20]

max_V Σ_{k=1}^K log2( 1 + |h_k^H v_k|^2 / ( Σ_{j=1,j≠k}^K |h_k^H v_j|^2 + σ_0^2 ) )   s.t. tr(V V^H) ≤ P_max,   (3)

where v_k is the precoding vector for the kth user, V = [v_1, …, v_K], and h_k is the channel vector from the BS to the kth user.
Denote the precoding policy as V* = F_p(H), where V* is the optimized solution of the problem in (3) for a given channel matrix H = [h_1, …, h_K]. The link scheduling problem is a combinatorial optimization problem, which can be solved by iterative algorithms [2] or exhaustive search. Both the problems in (2) and (3) are non-convex and can be solved by numerical algorithms such as the WMMSE algorithm [1].
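The sum-rate objective of the precoding problem and the maximal-ratio baseline can be sketched as follows; the helper names are illustrative, and the MRT scaling is one simple way to meet the power constraint with equality.

```python
import numpy as np

def precoding_sum_rate(H, V, sigma2=1.0):
    """Sum rate for precoder V: H and V are N x K with columns h_k and v_k."""
    K = H.shape[1]
    rate = 0.0
    for k in range(K):
        gains = np.abs(H[:, k].conj() @ V) ** 2    # |h_k^H v_j|^2 for all j
        rate += np.log2(1.0 + gains[k] / (gains.sum() - gains[k] + sigma2))
    return rate

def mrt_precoder(H, p_max=1.0):
    """Maximal-ratio transmission: v_k ∝ h_k, scaled so tr(V V^H) = p_max."""
    return np.sqrt(p_max) * H / np.linalg.norm(H)

rng = np.random.default_rng(0)
N, K = 4, 2
H = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
V = mrt_precoder(H)
assert np.isclose(np.trace(V @ V.conj().T).real, 1.0)   # power constraint met
assert precoding_sum_rate(H, V) > 0.0
```

MRT ignores inter-user interference and is only near-optimal at low SNR; solving (3) in general requires algorithms such as WMMSE [1].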

III. GNNS FOR LEARNING THE POLICIES
In this section, we introduce several GNNs to learn the three policies. Since constructing appropriate graphical models is the premise of applying GNNs, we first introduce graphs for each policy. Then, we introduce the Vertex-GNN and Edge-GNN. Finally, we analyze the expressive power of the GNNs with simple pooling, processing, and combination functions.

A. Homogeneous and Heterogeneous Graphs
A graph consists of vertices and edges. Each vertex or edge may be associated with features and actions. Learning a resource allocation policy over a graph is to learn the actions defined on vertices or edges based on their features. The inputs and outputs of a GNN are the features and actions of a graph, respectively.
A graph may consist of vertices and edges that belong to different types. If a graph consists of vertices or edges with more than one type, then it is a heterogeneous graph. Otherwise, it is a homogeneous graph, which can be regarded as a special heterogeneous graph.
More than one graph can be constructed for a resource allocation problem.
1) Link scheduling/Power control: For learning the link scheduling and power control policies, the topologies and features of the graphs and the structures of the GNNs are the same, and only the actions of the GNNs and the loss functions for training the GNNs are different. Hence, we focus on the GNNs for learning the link scheduling policy in the sequel.
In [5], [15], a directed homogeneous graph G^hom_ls, as shown in Fig. 2(a), was constructed for learning the link scheduling policy. In G^hom_ls, each D2D pair is a vertex, and the interference links among the D2D pairs are directed edges. Denote the ith vertex as D_i, and the edge from D_i to D_j as edge (i, j). The feature of vertex D_i is α_ii, the feature of edge (i, j) is α_ij, and the features of all vertices and edges can be represented as α. The action of vertex D_i is the active state of the ith link, x_i. In [6], the graph constructed for optimizing power control has only one difference from G^hom_ls: the action of vertex D_i is p_i. We can also construct an undirected heterogeneous graph G^het_ls, as shown in Fig. 2(b), for learning the link scheduling policy. In G^het_ls, there are two types of vertices and two types of edges. Each transmitter and each receiver are respectively defined as a transmitter vertex and a receiver vertex (called tx vertex and rx vertex for short), and the link between them is an undirected edge. Denote the ith tx vertex and the ith rx vertex as T_i and R_i, respectively, and the edge between T_i and R_j as edge (i, j). Edge (i, i) is referred to as a signal edge, and edge (i, j) (i ≠ j) is referred to as an interference edge (called sig edge and int edge for short). The vertices have no features. The feature of edge (i, j) is α_ij, and the features of all the edges can be represented as α. The active state x_i can either be defined as the action of vertex T_i or the action of sig edge (i, i).

2) Precoding: In [19], the precoding policy was learned over a heterogeneous graph G^het_p as shown in Fig.
2(c). In G^het_p, there are two types of vertices and one type of edges. Each antenna at the BS and each user are respectively defined as an antenna vertex and a user vertex, and the link between them is an undirected edge. Denote the ith antenna vertex and the jth user vertex as A_i and U_j, respectively, and the edge between A_i and U_j as edge (i, j). The vertices have no features. The feature of edge (i, j) is h_ij, and the features of all the edges can be represented as H. The action v_ij is defined on edge (i, j).
One can also construct a homogeneous graph for learning the precoding policy in a similar way as in [18]. In particular, each user (say the kth user) and all antennas at the BS are defined as a vertex, and h_k and v_k are the feature and action of the vertex, respectively. Every two vertices are connected by an edge, which has no feature and action. However, a GNN learning over such a graph is only permutation equivariant to users, losing the permutation equivariance to antennas and hence incurring higher training complexity.

B. Vertex-GNN and Edge-GNN
GNNs can be classified into Vertex-GNNs and Edge-GNNs, which respectively update the hidden representations of vertices and edges. Each class of GNNs can learn over either homogeneous or heterogeneous graphs. A directed homogeneous graph can be transformed into an undirected heterogeneous graph by regarding the edges of different directions as two types of undirected edges and the two vertices connected by a directed edge as two types of vertices. Hence, we focus on undirected heterogeneous graphs in the following (called heterogeneous graphs for short).
For conciseness, we take the GNNs for learning the link scheduling policy over the heterogeneous graph G^het_ls in Fig. 2(b) as an example. We discuss the GNN for learning the link scheduling policy over the heterogeneous graph converted from the directed homogeneous graph and the GNN for learning the precoding policy in remarks.

1) Vertex-GNN:
In a Vertex-GNN, the hidden representation of each vertex is updated in each layer by first aggregating information from its neighboring vertices and edges, and then combining the aggregated information with its own information in the previous layer. For each vertex, its neighboring vertices are the vertices connected to it by edges, and its neighboring edges are the edges connected to it. As illustrated in Fig. 3(a) and (b), for T_1, R_1 ∼ R_4 are its neighboring vertices and edge (1,1) ∼ edge (1,4) are its neighboring edges, while for R_1, T_1 ∼ T_4 are its neighboring vertices and edge (1,1) ∼ edge (4,1) are its neighboring edges.
For the Vertex-GNN learning over a heterogeneous graph, the hidden representation of the ith vertex with the τ_i th type in the lth layer, d^(l)_{i,τ_i}, is updated as follows [27],

Aggregate: a^(l)_{i,τ_i} = PL_{j∈N(i)} ( q( d^(l-1)_{j,τ_j}, e_ij; W^(l)_{τ_i,τ_ij} ) ),
Combine: d^(l)_{i,τ_i} = CB( d^(l-1)_{i,τ_i}, a^(l)_{i,τ_i}; U^(l)_{τ_i} ),   (4)

where a^(l)_{i,τ_i} is the aggregated output at the ith vertex, τ_i and τ_ij are respectively the types of the ith vertex and edge (i, j), N(i) is the set of neighboring vertices of the ith vertex, e_ij is the feature of edge (i, j), q(·), PL(·), and CB(·) are respectively the processing, pooling, and combination functions, and W^(l)_{τ_i,τ_ij} and U^(l)_{τ_i} are the weight matrices of the lth layer.

[Fig. 3: (a) neighboring vertices and edges of T_1; (b) neighboring vertices and edges of R_1; (c) neighboring edges of edge (1, 1); (d) neighboring edges of edge (1, 2).]

The choices for the processing and combination functions are flexible [11]. To ensure that a GNN satisfies the PE property of a learned policy, the pooling function should satisfy the commutative law, e.g., sum-pooling Σ_{k=1}^K(·), mean-pooling (1/K) Σ_{k=1}^K(·), and max-pooling max_{k=1,…,K}(·).
Remark 1. We consider simple pooling, processing, and combination functions as in [5], [9], [19], [21] for easy analysis, where PL(·) is the sum-pooling function, q(·) is a linear function, and CB(·) is an FNN without hidden layers (i.e., a linear function cascaded with an activation function), unless otherwise specified. The GNNs with these three functions are called vanilla GNNs.
The vanilla GNN updating vertex representations for learning the link scheduling policy over G^het_ls is referred to as the vanilla Vertex-GNN^het_ls, where the active state x_i is defined as the action of vertex T_i. Since there are two types of vertices in G^het_ls, the GNN updates the hidden representations of the vertices of each type separately in each layer.
From (4), the hidden representations of T_i and R_i in the lth layer of the vanilla Vertex-GNN^het_ls, d^(l)_{i,T} and d^(l)_{i,R}, are updated as follows,

Update hidden representations of tx vertices:
a^(l)_{i,T} = P^(l)_{RS} ( d^(l-1)_{i,R} + α_ii ) + P^(l)_{RI} Σ_{j=1,j≠i}^K ( d^(l-1)_{j,R} + α_ij ),
d^(l)_{i,T} = σ( U^(l)_T d^(l-1)_{i,T} + a^(l)_{i,T} ),   (5a)

Update hidden representations of rx vertices:
a^(l)_{i,R} = P^(l)_{TS} ( d^(l-1)_{i,T} + α_ii ) + P^(l)_{TI} Σ_{j=1,j≠i}^K ( d^(l-1)_{j,T} + α_ji ),
d^(l)_{i,R} = σ( U^(l)_R d^(l-1)_{i,R} + a^(l)_{i,R} ),   (5b)

where a^(l)_{i,T} and a^(l)_{i,R} are respectively the aggregated outputs at T_i and R_i, P^(l)_{RS} and P^(l)_{RI} are respectively the weight matrices for processing the information of R_i and edge (i, i), and of R_j and edges (i, j), j ≠ i, U^(l)_T is the weight matrix for combining when updating T_i, and P^(l)_{TS}, P^(l)_{TI}, and U^(l)_R in (5b) are respectively the weight matrices used for processing and combining when updating R_i.
In the input layer (i.e., l = 0), we set d^(0)_{i,T} = d^(0)_{i,R} = 0. The output of the GNN is [d^(L)_{1,T}, …, d^(L)_{K,T}]^T ≜ x̂, which is composed of the learned actions taken on all the tx vertices, where L is the number of layers of the GNN.
Denote the policy learned by the vanilla Vertex-GNN^het_ls as x̂ = G^het_ls,v(α). It is not hard to show that the learned policy is joint-PE to α.
Remark 2. In [5], [6], [15], Vertex-GNNs were used to learn the link scheduling and power control policies over directed homogeneous graphs with the same topology as G^hom_ls in Fig. 2(a).
To apply the update equation in (4), one can convert G^hom_ls into a heterogeneous graph, denoted as G^undir_ls,v. When updating the representation of vertex D_i, its neighboring edges (i, j) and (j, i) (j ≠ i) are two types of edges, and its neighboring vertices are of two different types. Then, the representation of D_i in the lth layer, d^(l)_{i,V}, can be obtained from (4), where d^(0)_{i,V} = α_ii. The input of the GNN is α, which is composed of the features of all vertices and edges. The output of the GNN is composed of the learned actions on all vertices.

For learning the precoding policy over G^het_p with a Vertex-GNN, the input and output of the GNN are respectively H and [v̂_ij,V]_{N×K} ≜ V̂_V, which are composed of the features and the learned actions of all edges. When the three functions in Remark 1 are used, the GNN is referred to as the vanilla Vertex-GNN^het_p. It is not hard to show that the learned policy is 2D-PE to H.
2) Edge-GNN: In an Edge-GNN, the hidden representation of each edge is updated in each layer by first aggregating information from its neighboring edges and neighboring vertices, and then combining with its own hidden representation in the previous layer. For edge (i, j), the ith and jth vertices are its neighboring vertices, and the edges connected to the ith and jth vertices are its neighboring edges.
The update equation of an Edge-GNN can be obtained from the update equation of a Vertex-GNN simply by switching the roles of the edges and vertices [28]. For the Edge-GNN learning over a heterogeneous graph, the hidden representation of edge (i, j) with the τ_ij th type in the lth layer, denoted as d^(l)_{ij,τ_ij}, is updated as follows,

Aggregate: a^(l)_{ij,τ_ij} = PL_{k∈N(i)/j} ( q( d^(l-1)_{ik,τ_ik}, e^v_i; W^(l)_{1,τ_ij} ) ) + PL_{k∈N(j)/i} ( q( d^(l-1)_{kj,τ_kj}, e^v_j; W^(l)_{2,τ_ij} ) ),
Combine: d^(l)_{ij,τ_ij} = CB( d^(l-1)_{ij,τ_ij}, a^(l)_{ij,τ_ij}; U^(l)_{τ_ij} ),   (6)

where a^(l)_{ij,τ_ij} is the aggregated output at edge (i, j), the first and second processors are respectively used to process the information from the ith vertex and its connected edges and from the jth vertex and its connected edges, N(i)/j is the set of neighboring vertices of the ith vertex except the jth vertex, e^v_j denotes the feature of the jth vertex, and W^(l)_{1,τ_ij}, W^(l)_{2,τ_ij}, and U^(l)_{τ_ij} are the weight matrices.
When an Edge-GNN is used for learning the link scheduling policy over G^het_ls in Fig. 2(b), the actions are defined on the sig edges. When the pooling, processing, and combination functions in Remark 1 are used, the Edge-GNN is referred to as the vanilla Edge-GNN^het_ls.
Since there are two types of edges in G^het_ls, the GNN updates the hidden representations of the edges of each type separately in each layer. Since the vertices have no features in G^het_ls, when updating the representation of each edge, only the information of its neighboring edges is aggregated. For edge (i, i) (say, edge (1, 1) in Fig. 3(c)), edge (i, j) and edge (j, i) (j ≠ i), respectively connected by T_i and R_i, are its neighboring edges. For edge (i, j) (say, edge (1, 2) in Fig. 3(d)), edge (i, i) and edge (i, k) (k ∉ {i, j}) connected by T_i are respectively its neighboring sig and int edges, and edge (j, j) and edge (k, j) connected by R_j are respectively its neighboring sig and int edges.
From (6), the hidden representations of edge (i, i) and edge (i, j) in the lth layer of the vanilla Edge-GNN^het_ls, d^(l)_{i,S} and d^(l)_{ij,I}, are updated as follows,

Update hidden representations of sig edges:
a^(l)_{i,S} = Q^(l)_{ST} Σ_{j=1,j≠i}^K d^(l-1)_{ij,I} + Q^(l)_{SR} Σ_{j=1,j≠i}^K d^(l-1)_{ji,I},
d^(l)_{i,S} = σ( U^(l)_S d^(l-1)_{i,S} + a^(l)_{i,S} ),   (7a)

Update hidden representations of int edges:
a^(l)_{ij,I} = Q^(l)_{IT,S} d^(l-1)_{i,S} + Q^(l)_{IT,I} Σ_{k∉{i,j}} d^(l-1)_{ik,I} + Q^(l)_{IR,S} d^(l-1)_{j,S} + Q^(l)_{IR,I} Σ_{k∉{i,j}} d^(l-1)_{kj,I},
d^(l)_{ij,I} = σ( U^(l)_I d^(l-1)_{ij,I} + a^(l)_{ij,I} ),   (7b)

where a^(l)_{i,S} and a^(l)_{ij,I} are respectively the aggregated outputs at edge (i, i) and edge (i, j), Q^(l)_{ST} and Q^(l)_{SR} are respectively the weight matrices for processing the information of the neighboring int edges of the sig edge (i, i) connected by T_i and by R_i, Q^(l)_{IT,S} and Q^(l)_{IT,I} are respectively the weight matrices for processing the information of the neighboring sig and int edges of the int edge (i, j) connected by T_i, Q^(l)_{IR,S} and Q^(l)_{IR,I} are respectively used for processing the information from the neighboring sig and int edges of the int edge (i, j) connected by R_j, and U^(l)_S and U^(l)_I are the weight matrices for combining.

An Edge-GNN can also learn the link scheduling policy over the heterogeneous graph converted from G^hom_ls, denoted as G^undir_ls,e. For the directed edge (i, j), its neighboring vertices D_i and D_j are two types of vertices, its neighboring edges (i, k) and (j, k) are one type of edges while edges (k, i) and (k, j) are another type of edges. Then, the hidden representation of edge (i, j) in the lth layer, d^(l)_{ij,ls}, can be obtained from (6). Since the actions are defined on vertices in G^hom_ls, a read-out layer is required to map the representations of edges in the output layer to the action on each vertex. For example, an FNN layer shared among the vertices can be used, whose input for the ith vertex consists of the representations d^(L)_{ik,ls} and d^(L)_{ki,ls}. The input of the GNN is α, which is composed of the features of all the vertices and edges. The output of the GNN consists of the learned actions on all vertices. When using the processing, pooling, and combination functions in Remark 1, the GNN is referred to as the vanilla Edge-GNN^undir_ls. It is not hard to show that the learned policy is joint-PE to α.
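To illustrate why edge-wise updates avoid the compression discussed later, the sketch below runs two layers of a scalar-state edge update in the spirit of the sig/int-edge updates above (the fixed weights and the tanh activation are illustrative choices, not the paper's) on two channel matrices that share all diagonal entries and all off-diagonal row and column sums; the sig-edge representations nevertheless differ.

```python
import numpy as np

def edge_layer(dS, dI, w):
    """Scalar-state sketch of the edge updates: dS holds the K sig-edge
    representations, dI the K x K int-edge representations (diagonal unused)."""
    off = dI - np.diag(np.diag(dI))
    row, col = off.sum(axis=1), off.sum(axis=0)
    aS = w['qST'] * row + w['qSR'] * col                 # sig-edge aggregation
    new_dS = np.tanh(w['uS'] * dS + aS)
    aI = (w['qITS'] * dS[:, None] + w['qITI'] * (row[:, None] - off)
          + w['qIRS'] * dS[None, :] + w['qIRI'] * (col[None, :] - off))
    new_dI = np.tanh(w['uI'] * dI + aI)                  # int-edge update
    return new_dS, new_dI

rng = np.random.default_rng(0)
K = 3
a1 = rng.uniform(0.3, 1.0, (K, K))
a2 = a1.copy()                            # cyclic ±0.2 perturbation keeps the
for i, j in [(0, 1), (1, 2), (2, 0)]:     # diagonals and all off-diagonal
    a2[i, j] += 0.2                       # row/column sums unchanged
for i, j in [(0, 2), (1, 0), (2, 1)]:
    a2[i, j] -= 0.2

w = dict(qST=0.3, qSR=0.4, uS=0.8, qITS=0.4, qITI=0.2,
         qIRS=0.3, qIRI=0.3, uI=1.0)      # made-up weights

out = []
for a in (a1, a2):
    dS, dI = np.diag(a).copy(), a.copy()  # input layer: d0_S=α_ii, d0_I=α_ij
    for _ in range(2):
        dS, dI = edge_layer(dS, dI, w)
    out.append(dS)
assert not np.allclose(out[0], out[1])    # individual gains are retained
```

Because each int-edge keeps its own representation, the non-linearity acts on every α_ij before any pooling, so the pooled sums in the next layer no longer coincide for the two inputs.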
Remark 5. In [19], the precoding policy was learned with a vanilla Edge-GNN over G^het_p, where the hidden representation of edge (i, j) in the lth layer, d^(l)_{ij}, is updated by aggregating the information of the neighboring edges connected by A_i and U_j. The above-mentioned GNNs and the features and actions of the graphs for each GNN are summarized in Table II.

GNNs | Policies | Graphs

C. Expressive Power of the Vanilla GNNs
In what follows, we analyze the expressive power of the vanilla Vertex-GNNs and vanilla Edge-GNNs by observing whether or not a GNN can output different actions when inputting different channel matrices. Without loss of generality, we assume that L ≥ 3.
We first define several notations to be used in the sequel.
α_{R/i} ≜ Σ_{j=1,j≠i}^K α_ij and α_{C/i} ≜ Σ_{j=1,j≠i}^K α_ji, which are respectively the sums of the ith row of the channel matrix α without α_ii and of the ith column of α without α_ii.
H_{R,i} ≜ Σ_{j=1}^K h_ij and H_{C,i} ≜ Σ_{j=1}^N h_ji, which are respectively the sums of the ith row and the ith column of the channel matrix H.
A_dia ≜ {α_ii, i = 1, …, K}, which is the set composed of all the diagonal elements in α.
A_ind ≜ {α_ij, i, j = 1, …, K}, which is the set composed of all the elements in α.

1) Link Scheduling and Power Control Policies: By substituting the sum-pooling function, linear processors, and the linear combiner into a^(l)_{i,T} and a^(l)_{i,R} in (5), the hidden representations of T_i and R_i in the lth layer can be respectively updated as,

d^(l)_{i,T} = σ( U^(l)_T d^(l-1)_{i,T} + P^(l)_{RS} ( d^(l-1)_{i,R} + α_ii ) + P^(l)_{RI} ( Σ_{j≠i} d^(l-1)_{j,R} + α_{R/i} ) ),
d^(l)_{i,R} = σ( U^(l)_R d^(l-1)_{i,R} + P^(l)_{TS} ( d^(l-1)_{i,T} + α_ii ) + P^(l)_{TI} ( Σ_{j≠i} d^(l-1)_{j,T} + α_{C/i} ) ).   (8)

Since d^(0)_{i,T} = d^(0)_{i,R} = 0, when l = 1, from (8) we have

d^(1)_{i,T} = σ( P^(1)_{RS} α_ii + P^(1)_{RI} α_{R/i} ),   d^(1)_{i,R} = σ( P^(1)_{TS} α_ii + P^(1)_{TI} α_{C/i} ).   (9)

When l = 2, by substituting d^(1)_{i,T} and d^(1)_{i,R} in (9) into (8), it is not hard to derive that d^(2)_{i,T} and d^(2)_{i,R} depend on α only through α_ii, α_{R/i}, and α_{C/i}, i = 1, …, K. (10) Similarly, the action taken over T_i can be derived as a function of these aggregated terms only. (11)

This shows that the information of the interference channel gains α_ij, i ≠ j, is lost after the aggregation in the update equation of the vanilla Vertex-GNN^het_ls. Analogously, for the vanilla Vertex-GNN^undir_ls, the vanilla Edge-GNN^het_ls, and the vanilla Edge-GNN^undir_ls, the actions taken over the ith vertex or the ith sig edge can respectively be expressed as in (12).

Denote the outputs of the vanilla Vertex-GNN_ls (i.e., Vertex-GNN^het_ls or Vertex-GNN^undir_ls) as x̂^[1] and x̂^[2] with two different inputs α^[1] and α^[2]. From (11) and (12), we can obtain the following observation.
Observation 1: x̂^[1] = x̂^[2] if the elements in α^[1] and α^[2] satisfy the following conditions: (1) α^[1]_ii = α^[2]_ii, (2) α^[1]_{R/i} = α^[2]_{R/i}, and (3) α^[1]_{C/i} = α^[2]_{C/i}, i = 1, …, K. These conditions can be rewritten as a group of linear equations consisting of 3K equations in 2K^2 variables. When K ≥ 3, the number of variables is larger than the number of equations, and hence there are infinitely many solutions to these equations, i.e., there are infinitely many pairs of α^[1] and α^[2] satisfying the conditions.
The observation indicates that the vanilla Vertex-GNN_ls for learning the link scheduling policy x* = F_ls(α) is unable to differentiate all channel matrices. When the pooling function in a vanilla Vertex-GNN_ls is replaced by mean-pooling or max-pooling, we can also find channel matrices that the GNN cannot differentiate.
Recalling that α ∈ R^{K×K} and x* ∈ I^{K×1}, the scheduling policy x* = F_ls(α) is a many-to-one mapping under which the channel matrix is compressed. However, the mapping learned by a vanilla Vertex-GNN_ls may not be the same as the scheduling policy, because F_ls(α^{[1]}) = F_ls(α^{[2]}) does not necessarily hold when α^{[1]} and α^{[2]} satisfy the three conditions, as will be shown via simulations later. As a consequence, the vanilla Vertex-GNN_ls cannot well learn the link scheduling policy due to the information loss.
By contrast, the vanilla Edge-GNN_ls (i.e., Edge-GNN^het_ls or Edge-GNN^undir_ls) does not incur this information loss, since it can differentiate channel matrices that result in different optimization variables. This can be seen from (12), where the outputs of the vanilla Edge-GNN_ls (i.e., x̂_{i,S} and x̂_{i,E}) depend on each individual channel gain (say α_{ij}) in α.
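To see why edge updates avoid this compression, consider a minimal one-layer edge update (illustrative weights, not the paper's exact Edge-GNN): each edge representation keeps its own gain α_{ij} in addition to the pooled row and column information, so two matrices that agree only in their diagonals and row/column sums still map to different edge representations.

```python
import numpy as np

def edge_layer(alpha, w_self=1.0, w_row=0.3, w_col=0.3):
    """One Edge-GNN-style update: edge (i, j) combines its own feature
    alpha_ij with sum-pooled features of the other edges in row i and
    column j (weights are illustrative)."""
    row = alpha.sum(axis=1, keepdims=True) - alpha   # row pooling, self excluded
    col = alpha.sum(axis=0, keepdims=True) - alpha   # column pooling, self excluded
    return np.maximum(0.0, w_self * alpha + w_row * row + w_col * col)

# Pair with equal diagonals and equal off-diagonal row/column sums
rng = np.random.default_rng(0)
K = 3
a1 = rng.uniform(0.1, 1.0, (K, K))
P1 = np.zeros((K, K)); P1[[0, 1, 2], [1, 2, 0]] = 1.0
P2 = np.zeros((K, K)); P2[[0, 1, 2], [2, 0, 1]] = 1.0
a2 = a1 + 0.05 * (P1 - P2)
```

Because α_{ij} enters its own update individually, `edge_layer(a1)` and `edge_layer(a2)` differ even though the pooled terms are identical.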
2) Precoding Policy: With similar derivations, we can show that H is compressed into the row and column sums H_{R,i} and H_{C,i}, where the information in the individual channel coefficients is lost. As a result, the GNN is unable to differentiate channel matrices satisfying the following conditions: (1) H^{[1]}_{R,i} = H^{[2]}_{R,i}, i = 1, …, N, and (2) H^{[1]}_{C,j} = H^{[2]}_{C,j}, j = 1, …, K, which can be expressed as a group of linear equations. In other words, when inputting H^{[1]} and H^{[2]}, the outputs of the vanilla Vertex-GNN^het_p are identical. When the pooling function is mean-pooling or max-pooling, we can also find channel matrices that the vanilla Vertex-GNN^het_p cannot differentiate. However, the precoding matrix depends on every channel coefficient h_{ij}. For example, when the signal-to-noise ratio (SNR) is very low, the optimal precoding matrix degenerates into K vectors, each for maximal-ratio transmission [19]. As a result, the vanilla Vertex-GNN^het_p is unable to learn the optimal precoding policy. By contrast, the vanilla Edge-GNN^het_p can differentiate all channel matrices. In the update equation of the vanilla Edge-GNN^het_p designed in [19], E is the weight matrix for the combination, and Q_A and Q_U are respectively the weight matrices for processing the information of the neighboring edges connected by A_i and U_j. In the input layer, d^{(0)}_{ij,E} = h_{ij} is combined individually when updating the edge representation with l = 1.

Remark 6. When considering other typical constraints, we can use the same way to analyze the expressive power of GNNs, but the input samples that the GNNs cannot distinguish may differ.

IV. IMPACT OF PROCESSING AND COMBINATION FUNCTIONS ON EXPRESSIVE POWER
In this section, we analyze the impact of processing and combination functions on the expressive power of the Vertex-GNNs and Edge-GNNs for learning the policies.
We first analyze the impact of the linearity of the processing and combination functions, and then analyze the impact of the output dimensions of the two functions on the Vertex-GNNs and Edge-GNNs.
Without loss of generality, we assume that L ≥ 3.

A. Impact of Linearity
As analyzed in Section III-C, the vanilla Vertex-GNNs (i.e., the vanilla Vertex-GNN^het_ls, Vertex-GNN^undir_ls, and Vertex-GNN^het_p) cannot, while the vanilla Edge-GNNs (i.e., the vanilla Edge-GNN^het_ls, Edge-GNN^undir_ls, and Edge-GNN^het_p) can, differentiate the channel matrices resulting in different optimization variables, where both classes of vanilla GNNs have linear processors and non-linear combiners. In the sequel, we show that the expressive power of the Vertex-GNNs can be enhanced by using non-linear processors, and that the strong expressive power of the Edge-GNNs comes from their non-linear combiners. We take the GNNs for learning the link scheduling policy as an example; the impact is the same on the GNNs for learning the power control and precoding policies.
1) Vertex-GNNs: We start by analyzing the expressive power of a degenerated vanilla Vertex-GNN^het_ls whose combination function is linear.

Linear processor and linear combiner: For the Vertex-GNN^het_ls, d^{(0)}_{i,T} = d^{(0)}_{i,R} = 0. Then, after omitting the activation functions in (8), the hidden representations of T_i and R_i become linear functions of the aggregated inputs. By substituting d^{(1)}_{i,T} and d^{(1)}_{i,R} into (8) and again omitting the activation functions, we obtain (17a) and (17b), where (a) in both equations follows from exchanging the operation order of the linear processing functions and the summation in the pooling, i.e., g^{(1)}_T(∑(·)) and g^{(1)}_R(∑(·)). With similar derivations, we can obtain that the action of vertex T_i is a linear function of the input features α_{ii}, α_{R/i}, and α_{C/i}, as in (18), which does not depend on the individual α_{ij} (i ≠ j). This indicates that the degenerated vanilla Vertex-GNN^het_ls cannot distinguish the individual interference channel gains in α. As a result, the GNN may yield the same action for different input features α^{[1]} and α^{[2]}.
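This collapse can be reproduced with a toy sum-pooling Vertex-GNN whose processors and combiners are all linear (the scalar weights are arbitrary and for illustration only): its output is a function of α_{ii}, α_{R/i}, and α_{C/i} alone, so the matrix pair constructed for Observation 1 is indistinguishable.

```python
import numpy as np

def linear_vertex_gnn(alpha, L=3):
    """Toy Vertex-GNN: tx vertex i pools its row, rx vertex i pools its
    column; all processing and combination is linear (illustrative weights)."""
    K = alpha.shape[0]
    dT, dR = np.zeros(K), np.zeros(K)
    diag = np.diag(alpha)
    offR = alpha.sum(1) - diag           # alpha_{R/i}
    offC = alpha.sum(0) - diag           # alpha_{C/i}
    for _ in range(L):
        aT = 0.7 * (dR + diag) + 0.3 * ((dR.sum() - dR) + offR)
        aR = 0.6 * (dT + diag) + 0.2 * ((dT.sum() - dT) + offC)
        dT, dR = 0.5 * dT + aT, 0.5 * dR + aR   # linear combiners
    return dT

# Observation-1 pair: equal diagonals and off-diagonal row/column sums
rng = np.random.default_rng(0)
K = 3
a1 = rng.uniform(0.1, 1.0, (K, K))
P1 = np.zeros((K, K)); P1[[0, 1, 2], [1, 2, 0]] = 1.0
P2 = np.zeros((K, K)); P2[[0, 1, 2], [2, 0, 1]] = 1.0
a2 = a1 + 0.05 * (P1 - P2)
```

Although a1 ≠ a2, `linear_vertex_gnn` returns the same output for both.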
Remark 7. Analogously, when the combination functions are linear, the action of vertex D_i of the degenerated vanilla Vertex-GNN^undir_ls (i.e., x̂_{i,V}), the action of vertex D_i of the degenerated vanilla Edge-GNN^undir_ls (i.e., x̂_{i,E}), and the action of sig edge (i, i) of the degenerated vanilla Edge-GNN^het_ls (i.e., x̂_{i,S}) can also be expressed in the same form as (18).
Linear processor and non-linear combiner: From (4) and (5), we can see that four linear processing functions with different weight matrices are required in the vanilla Vertex-GNN^het_ls. In particular, they are respectively used for (a) a tx vertex aggregating information from an rx vertex and a sig edge, (b) a tx vertex aggregating information from an rx vertex and an inf edge, (c) an rx vertex aggregating information from a tx vertex and a sig edge, and (d) an rx vertex aggregating information from a tx vertex and an inf edge. Then, the aggregated outputs of T_i and R_i in (5) can be respectively rewritten such that a^{(1)}_{i,T} depends on α_{ii} and α_{R/i}, and a^{(1)}_{i,R} depends on α_{ii} and α_{C/i}. With similar derivations, it can be shown that the outputs of the GNN depend on A_diag and A_CR, which are respectively composed of α_{ii} and of α_{R/i} and α_{C/i}. If the linear function in the combination function is replaced by a FNN, then it is not hard to show that x̂_{i,T} has the same form as in (11). This means that the information of the interference channel gains α_{ij}, i ≠ j, is lost after the linear processing. As a consequence, the GNN cannot distinguish α^{[1]} and α^{[2]}.
Non-linear processor and linear/non-linear combiner: When the processors in the vanilla Vertex-GNN^het_ls are replaced by non-linear functions (say FNNs), after passing through the sum-pooling, the aggregated output a^{(1)}_{i,T} depends on A_{i*} ≜ {α_{i1}, …, α_{iK}} and a^{(1)}_{i,R} depends on A_{*i} ≜ {α_{1i}, …, α_{Ki}}. After the combiner (no matter linear or non-linear), d^{(1)}_{i,T} and d^{(1)}_{i,R} respectively depend on A_{i*} and A_{*i}. It can be shown with similar derivations that the outputs of the GNN depend on A_ind, the set of all the elements in α. In other words, the outputs of the GNN depend on every α_{ij}, and hence the GNN can distinguish α^{[1]} and α^{[2]}.
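Applying any non-linearity per neighbor before pooling restores the dependence on the individual gains. A sketch, with tanh as a stand-in for an FNN processor:

```python
import numpy as np

def nonlinear_vertex_gnn(alpha, L=2):
    """Toy Vertex-GNN with a non-linear processor: each (vertex, edge)
    pair is processed by tanh *before* sum-pooling, so the pooled value
    depends on every alpha_ij rather than only on row/column sums."""
    K = alpha.shape[0]
    dT, dR = np.zeros(K), np.zeros(K)
    for _ in range(L):
        aT = np.tanh(dR[None, :] + alpha).sum(axis=1)  # tx i pools row i
        aR = np.tanh(dT[:, None] + alpha).sum(axis=0)  # rx j pools column j
        dT, dR = np.tanh(0.5 * dT + aT), np.tanh(0.5 * dR + aR)
    return dT

# Same Observation-1 pair as before: equal diagonals and row/column sums
rng = np.random.default_rng(0)
K = 3
a1 = rng.uniform(0.1, 1.0, (K, K))
P1 = np.zeros((K, K)); P1[[0, 1, 2], [1, 2, 0]] = 1.0
P2 = np.zeros((K, K)); P2[[0, 1, 2], [2, 0, 1]] = 1.0
a2 = a1 + 0.05 * (P1 - P2)
```

Because ∑_j tanh(α_{ij}) is not determined by ∑_j α_{ij}, the two matrices now yield different outputs.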
Similarly, we can show that the expressive power of the Vertex-GNN^undir_ls can be improved by using a FNN for processing, but cannot be improved by using a FNN for combining.
2) Edge-GNNs: Since the outputs of all the GNNs for link scheduling depend on α_{ii}, i = 1, …, K, when the processing and combination functions are linear as in Remark 7, we only analyze whether they depend on the individual interference channel gains in the following. According to Remark 7 and the analysis in Section III-C, it is the non-linear combination functions that help the vanilla Edge-GNN_ls distinguish individual interference channels. Since there are two types of combination functions in each layer of the vanilla Edge-GNN^het_ls as shown in (7), in the following we analyze which combiner helps the vanilla Edge-GNN^het_ls distinguish α_{ij}, i ≠ j. Since d^{(0)}_{i,S} = α_{ii} and d^{(0)}_{ij,I} = α_{ij}, the combination functions of the vanilla Edge-GNN^het_ls for updating the hidden representations of sig edge (i, i) and int edge (i, j) in the first layer can be obtained from (7a) and (7b), as in (22a) and (22b). If σ_I(·) is omitted, CB_I(·) and hence J_{ij}(·) in (22b) become linear functions. Then, when l = 2, the first term in (7a), ∑_{k=1, k≠i}^{K} Q^{(2)} d^{(1)}_{ik,I} = Q^{(2)} ∑_{k=1, k≠i}^{K} J_{ik}(α_{ik}), depends on α_{R/i}, and the second term in (7a), ∑_{k=1, k≠i}^{K} Q^{(2)} d^{(1)}_{ki,I}, depends on α_{C/i}. Hence, d^{(2)}_{i,S} depends on α_{R/i} and α_{C/i}. With similar analysis, it can be shown that the action on sig edge (i, i) (i.e., x̂_{i,S}) still depends only on A_CR, i.e., the corresponding Edge-GNN cannot distinguish α_{ij}, i ≠ j.
If σ_I(·) is retained, J_{ik}(α_{ik}) is non-linear, so ∑_{k=1, k≠i}^{K} Q^{(2)} d^{(1)}_{ik,I} depends on the individual α_{ik}, k ≠ i, rather than on α_{R/i}, and ∑_{k=1, k≠i}^{K} Q^{(2)} d^{(1)}_{ki,I} depends on the individual α_{ki} rather than on α_{C/i}. Hence, d^{(2)}_{i,S} depends on A_{i*} and A_{*i} no matter whether σ_S(·) is omitted. Similarly, we can show that the action of sig edge (i, i) depends on A_ind, i.e., the corresponding GNN can distinguish α_{ij}, i ≠ j.
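The role of the two combiners can be probed with a toy two-representation Edge-GNN in which σ_I(·) can be switched on or off (the weights and tanh activations are illustrative assumptions, not the paper's exact model):

```python
import numpy as np

def edge_gnn_sig(alpha, nonlinear_int=True, L=2):
    """Toy het Edge-GNN for link scheduling: sig edges (i, i) and int
    edges (i, j) keep separate representations. `nonlinear_int` toggles
    the activation sigma_I on the int-edge combiner."""
    dS = np.diag(alpha).copy()                     # d_{i,S}^{(0)} = alpha_ii
    dI = alpha - np.diag(np.diag(alpha))           # d_{ij,I}^{(0)} = alpha_ij
    sigma_I = np.tanh if nonlinear_int else (lambda x: x)
    for _ in range(L):
        aS = dI.sum(axis=1) + dI.sum(axis=0)       # sig edge pools its int edges
        aI = dS[:, None] + dS[None, :]             # int edge pools its sig edges
        dS = np.tanh(0.5 * dS + 0.3 * aS)          # sigma_S always non-linear
        dI = sigma_I(0.5 * dI + 0.3 * aI)
        dI -= np.diag(np.diag(dI))                 # int edges stay off-diagonal
    return dS

# Observation-1 pair: equal diagonals and off-diagonal row/column sums
rng = np.random.default_rng(0)
K = 3
a1 = rng.uniform(0.1, 1.0, (K, K))
P1 = np.zeros((K, K)); P1[[0, 1, 2], [1, 2, 0]] = 1.0
P2 = np.zeros((K, K)); P2[[0, 1, 2], [2, 0, 1]] = 1.0
a2 = a1 + 0.05 * (P1 - P2)
```

With the int-edge combiner linear, the int edges are summed away and the two matrices collide; with σ_I(·) non-linear, the sums of processed int edges differ and the sig-edge actions separate.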
With similar analysis, we can show that it is σ_I(·) (instead of σ_S(·)) in the Edge-GNN^het_ls that enables the GNN to distinguish α_{ij}, i ≠ j, and enhances its expressive power.
In a nutshell, non-linear processors help improve the expressive power of the vanilla Vertex-GNNs (i.e., Vertex-GNN^het_ls, Vertex-GNN^undir_ls, and Vertex-GNN^het_p). Non-linear combiners for updating the int edges (say σ_I(·)) help the vanilla Edge-GNN^het_ls to distinguish α_{ij}, i ≠ j. Since there is only one type of combination function in the vanilla Edge-GNN^undir_ls and Edge-GNN^het_p, the non-linearity of all combination functions helps improve the expressive power of these two Edge-GNNs. The expressive power of these GNNs is summarized in Table IV.

V. SIMULATION RESULTS

We consider three optimization problems, i.e., the link scheduling problem in (1), the power control problem in (2), and the precoding problem in (3). For the link scheduling and power control problems, all the D2D pairs are randomly located in a 500 m × 500 m square area.
The wireless network parameters are provided in Table V. The composite channel consists of the path loss generated with the model in [5], log-normal shadowing with a standard deviation of 8 dB, and Rayleigh fading. For the precoding problem, we set P_max = 1 W and change σ²_0 in (3) to adjust the SNR. These simulation setups are considered unless otherwise specified. In the loss function, N_s is the number of training samples, r^n_k is the data rate of the k-th user and y^n_k is the activation probability of the k-th D2D link in the n-th sample, and w_1 and w_2 are weights that need to be tuned. The second and third terms in the loss function are respectively penalties for preventing all the links from being closed and from being activated. For the link scheduling problem, we set w_1 = 10^{-1} and w_2 = 10^{-4}. For the power control and precoding problems, w_1 = w_2 = 0.
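The exact penalty expressions are not fully recoverable from the text above; the following is a minimal sketch of an unsupervised loss with this structure, where the two penalty forms (`p_closed`, `p_all_on`) are assumptions for illustration.

```python
import numpy as np

def unsupervised_loss(rates, probs, w1=1e-1, w2=1e-4):
    """Negative average sum rate plus two penalty terms.
    rates, probs: (N_s, K) arrays of per-link rates r_k^n and activation
    probabilities y_k^n. The penalty forms below are assumed, not the
    paper's exact expressions: one discourages closing every link, the
    other discourages activating every link."""
    avg_sum_rate = rates.sum(axis=1).mean()
    p_closed = np.maximum(0.0, 1.0 - probs.sum(axis=1)).mean()   # assumed form
    p_all_on = probs.sum(axis=1).mean()                          # assumed form
    return -avg_sum_rate + w1 * p_closed + w2 * p_all_on
```

With w1 = w2 = 0, the loss reduces to the negative average sum rate used for the power control and precoding problems.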
We use the sum rate ratio as the performance metric. It is the ratio of the sum rate achieved by the learned policy to the sum rate achieved by a numerical algorithm (FPLinQ [2] for link scheduling, and WMMSE for power control and precoding). We train each GNN five times.
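Computing this metric can be sketched as follows. The convention that alpha[j, i] is the gain from transmitter j to receiver i is an assumption, and `learned`/`baseline` stand for any two policies mapping a channel matrix to a power vector.

```python
import numpy as np

def sum_rate(alpha, p, noise=1.0):
    """Sum rate of K D2D links, assuming alpha[j, i] is the channel gain
    from transmitter j to receiver i and p is the power vector."""
    sig = np.diag(alpha) * p
    interf = alpha.T @ p - sig           # total received power minus signal
    return np.log2(1.0 + sig / (noise + interf)).sum()

def sum_rate_ratio(alpha_batch, learned, baseline, noise=1.0):
    """Average, over test samples, of learned-policy sum rate divided by
    the sum rate of a baseline numerical algorithm."""
    ratios = [sum_rate(a, learned(a), noise) / sum_rate(a, baseline(a), noise)
              for a in alpha_batch]
    return float(np.mean(ratios))
```

A policy identical to the baseline gives a ratio of exactly 1; a ratio above 1 means the learned policy outperforms the (possibly suboptimal) baseline.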
The results are obtained by averaging the sum rate ratios achieved by the learned policies with the five trained GNNs over all test samples. For the link scheduling and power control problems, we only provide the performance of the Vertex- and Edge-GNN^het_ls in the sequel, since the Vertex- and Edge-GNN^undir_ls perform very close to them.
All the simulation results are obtained on a computer with a 28-core Intel i9-10904X CPU, an NVIDIA RTX 3080Ti GPU, and 64 GB memory.
All the GNNs in the following use the mean-pooling function.

A. Impact of the Non-distinguishable Channel Matrices
We first validate that the vanilla Vertex-GNN het ls cannot well learn the link scheduling and power control policies, and the vanilla Vertex-GNN het p cannot well learn the precoding policy, due to their weak expressive power.
In Fig. 5(a) and Fig. 5(b), we show the probability of F_ls(α^{[1]}) = F_ls(α^{[2]}) and F_pc(α^{[1]}) = F_pc(α^{[2]}) simulated with different values of K and transmit power, where p_k = P for the link scheduling problem and P_max = P for the power control problem. α^{[1]} and α^{[2]} are generated by solving the linear equations in (13). Since the computational complexity of solving (13) is high when K is large, we take K ∈ {3, 4, 5, 6} as examples. For the link scheduling problem, the optimal solutions are obtained by exhaustive search. For the power control problem, the sub-optimal solutions are obtained by the WMMSE algorithm. Since the optimized powers are continuous, we regard the frequency of |F_pc(α^{[1]}) − F_pc(α^{[2]})|_max < 10^{-3} as the probability. We can see that the probability decreases with K and P. This can be explained as follows. Since the solution spaces of both problems grow with K, the probability that the optimal solutions for α^{[1]} and α^{[2]} are identical decreases with K. When P is low such that the noise power dominates, the optimization for the K D2D pairs in problems (1) and (2) is decoupled. In this case, the optimal scheduling policy is to activate all the links, and the optimal power control policy is to transmit with the maximal power to all the receivers. Hence, given any two channel gain matrices, the two optimal solutions are always identical.
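The exhaustive search over binary link activations mentioned above can be sketched as follows (under the same assumed convention that alpha[j, i] is the gain from transmitter j to receiver i); its 2^K cost is why only small K is considered.

```python
from itertools import product
import numpy as np

def best_schedule(alpha, P=1.0, noise=1.0):
    """Exhaustive search over x in {0, 1}^K maximizing the sum rate,
    with active links transmitting at power P (tractable only for small K)."""
    K = alpha.shape[0]
    best_x, best_rate = None, -np.inf
    for bits in product([0, 1], repeat=K):
        p = P * np.array(bits, dtype=float)
        sig = np.diag(alpha) * p
        interf = alpha.T @ p - sig
        rate = np.log2(1.0 + sig / (noise + interf)).sum()
        if rate > best_rate:
            best_x, best_rate = bits, rate
    return best_x, best_rate
```

For interference-free links, activating everything is optimal; with very strong mutual interference, the search instead keeps a single link active.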
In Fig. 5(c), we show the probability of F_p(H^{[1]}) = F_p(H^{[2]}) simulated with N = 2 or 4 and K = 2, where H^{[1]} and H^{[2]} are generated by solving the linear equations in (14). The suboptimal solutions are also obtained by the WMMSE algorithm. Since the precoding variables are continuous, we regard the frequency of |F_p(H^{[1]}) − F_p(H^{[2]})|_max < ϵ as the probability, where ϵ is a parameter. We can see that the probability is low under different values of ϵ and SNR.
When ϵ is smaller than 0.05, or the values of N and K are larger, the probability is almost zero, i.e., the optimized precoding matrices for H [1] and H [2] are different with very high probability.
Fig. 5. Probability of (a) F_ls(α^{[1]}) = F_ls(α^{[2]}), (b) F_pc(α^{[1]}) = F_pc(α^{[2]}), and (c) F_p(H^{[1]}) = F_p(H^{[2]}), where α^{[1]} and α^{[2]} satisfy the conditions in Observation 1, and H^{[1]} and H^{[2]} satisfy the conditions in Section III-C2.

In Fig. 6(a) and Fig. 6(b), we show the performance of the link scheduling and power control policies learned by the vanilla GNNs versus P; the fine-tuned hyper-parameters are shown in Table VI. We can see that the performance of the vanilla Vertex-GNN^het_ls degrades rapidly with P, while the performance of the vanilla Edge-GNN^het_ls changes little with P. This is because when the value of P is higher, the probability of F_ls(α^{[1]}) = F_ls(α^{[2]}) or F_pc(α^{[1]}) = F_pc(α^{[2]}) is lower according to Fig. 5, yet the vanilla Vertex-GNN^het_ls yields the same output for the two channel matrices α^{[1]} and α^{[2]}. Recall that the vanilla Vertex-GNN^het_ls uses the mean-pooling function. If max-pooling is used as in [6], the Vertex-GNN performs much better when learning the link scheduling or power control policy, but is still inferior to the vanilla Edge-GNN^het_ls. It is noteworthy that the sum rate ratio achieved by the link scheduling policy learned by the vanilla Edge-GNN exceeds 100% in Fig. 6(a). This is because the sum rate ratio is the ratio of the sum rate achieved by the learned policy to that achieved by the FPLinQ algorithm, which can only find suboptimal solutions.
In Fig. 6(c), we show the performance of the precoding policies learned by the vanilla GNNs versus the SNR. The GNNs are trained with 10^5 samples, and the fine-tuned hyper-parameters are shown in Table VI. We can see that the vanilla Vertex-GNN^het_p performs poorly under different SNRs. This is because the probability of F_p(H^{[1]}) = F_p(H^{[2]}) is low under different SNRs as shown in Fig. 5(c), yet the vanilla Vertex-GNN^het_p yields the same output for H^{[1]} and H^{[2]}. As expected, the vanilla Edge-GNN^het_p performs very well.

B. Impact of the Linearity of Processing and Combination Functions
To validate the analysis in Section IV-A, we compare the performance of Vertex-GNNs and Edge-GNNs with vanilla Vertex-GNNs and vanilla Edge-GNNs.
The hyper-parameters for the vanilla GNNs are the same as in Table VI. The fine-tuned hyper-parameters of the Vertex-GNNs with a FNN as the processor or combiner are shown in Table VII.
The GNNs for learning the link scheduling policy and the power control policy are trained with 1000 samples. The GNNs for learning the precoding policy with N = 4, K = 2, and SNR = 10 dB are trained with 10^5 samples. In Table VIII, we provide the simulation results, where the performance of the vanilla GNNs is marked in bold font. When only the combination function is a FNN, the Vertex-GNN performs closely to the vanilla Vertex-GNN, because both of them cannot differentiate α^{[1]} and α^{[2]} or H^{[1]} and H^{[2]}.
For the Edge-GNNs learning the link scheduling and power control policies, when CB_S(·) is non-linear and CB_I(·) is linear, the Edge-GNN^het_ls is inferior to the vanilla Edge-GNN^het_ls. When CB_I(·) is non-linear and CB_S(·) is linear, the Edge-GNN^het_ls performs closely to the vanilla Edge-GNN^het_ls, since both of them can differentiate α^{[1]} and α^{[2]}. When learning the precoding policy, the Edge-GNN with linear combination functions is inferior to the vanilla Edge-GNN, since it cannot differentiate H^{[1]} and H^{[2]}.
Next, we show the impact of different activation functions. For the Vertex-GNNs, when the processing function is a non-linear function such as a FNN, they can perform well even if a linear combination function is applied. Hence, non-linear activation functions in the combiner have little impact on the performance of the Vertex-GNNs. For the vanilla GNNs, only the combination functions contain non-linear activation functions. Therefore, we only compare the performance of the vanilla GNNs with several non-linear activation functions. The simulation results are provided in Table IX, which shows that the vanilla GNNs with different activation functions achieve similar performance. In Table X, we show the space and time complexities of the GNNs. The training and inference time of the vanilla Edge-GNNs are shorter than those of the Vertex-GNNs, because using a FNN as the processor has higher computational complexity than using a linear processor.
The space complexities of the vanilla Edge-GNNs are also lower than those of the Vertex-GNNs.

TABLE X SPACE AND TIME COMPLEXITY OF GNNS
VI. CONCLUSIONS

In this paper, we analyzed the expressive power of the Vertex-GNNs and Edge-GNNs for learning wireless policies and provided guidelines for designing efficient and well-performing GNNs. While we focused on two resource allocation policies and a precoding policy, the conclusions are also applicable to other wireless policies such as signal detection, channel estimation, other resource allocation, and other precoding problems, whenever the edges of the constructed graphs have features. If both the features and actions of a graph are defined on vertices, then a Vertex-GNN has the same expressive power as an Edge-GNN and may be more sample efficient.
Notations: (·)^T and (·)^H denote transpose and Hermitian transpose, respectively. |·| denotes the absolute value of a real number or the magnitude of a complex number. Tr(·) denotes the trace of a matrix. X = [x_{ij}]_{m×n} denotes a matrix with m rows and n columns, where x_{ij} is the element in the i-th row and the j-th column. ∥X∥ ≜ (∑_{i=1}^{m} ∑_{j=1}^{n} |x_{ij}|²)^{1/2} denotes the Frobenius norm of X.
Fig. 3. Vertex-GNNs ((a) and (b)): when updating the representation of a vertex in blue color, the information of the vertices and edges with the same color is aggregated with the same weight. Edge-GNNs ((c) and (d)): when updating the representation of an edge with dashed lines, the information of the edges with the same color is aggregated with the same weight.

The hidden representations of tx vertex T_i and rx vertex R_i in the l-th layer, d^{(l)}_{i,T} and d^{(l)}_{i,R}, are updated as in (4), where d^{(0)}_{i,T} = 0 and d^{(0)}_{i,R} = 0 because the vertices in G^het_ls have no features. The input of the GNN is α, which is composed of the features of all edges.
which includes the learned actions on all vertices. When the three functions in Remark 1 are used, the Vertex-GNN learning over G^undir_ls,v is referred to as the vanilla Vertex-GNN^undir_ls. It is not hard to show that the learned policy is joint-PE to α.

Remark 3. When using a Vertex-GNN to learn the precoding policy over G^het_p, the hidden representations of antenna vertex A_i and user vertex U_j in the l-th layer, d^{(l)}_{i,A} and d^{(l)}_{j,U}, can be obtained from (4), where d^{(0)}_{i,A} = d^{(0)}_{j,U} = 0 because the vertices have no features. Since the actions are defined on edges, a read-out layer is required to map the vertex representations in the output layer into the actions. In particular, to map d^{(L)}_{i,A} and d^{(L)}_{j,U} into the action on edge (i, j), a FNN can be designed as v̂_{ij,V} = FNN^het_read(d^{(L)}_{i,A}, d^{(L)}_{j,U}).

December 27, 2023 DRAFT

Remark 4.
the weight matrices for combining the information of the sig and int edges, and σ_S(·) and σ_I(·) are activation functions. In the input layer, d^{(0)}_{i,S} = α_{ii} and d^{(0)}_{ij,I} = α_{ij}. The input of the GNN is α, which consists of the features of all edges. The output of the GNN is [d^{(L)}_{1,S}, …, d^{(L)}_{K,S}]^T ≜ [x̂_{1,S}, …, x̂_{K,S}]^T ≜ x̂_S, which consists of the learned actions on all the sig edges. The policy learned by the vanilla Edge-GNN^het_ls is denoted as x̂_S = G^het_ls,e(α), which is easily shown to be joint-PE to α. When using an Edge-GNN to learn the link scheduling policy over G^hom_ls with the update equations in (6), G^hom_ls needs to be converted into a heterogeneous graph, denoted as …, and the hidden representations can be obtained from (6) with the three functions in Remark 1. This GNN is referred to as the vanilla Edge-GNN^het_p. The input and output of the GNN are respectively H and [d^{(L)}_{ij,E}]_{N×K}, which are composed of the features and the learned actions of all edges. It was shown that the learned policy is 2D-PE to H.
In the input layer, d^{(0)}_{ij,E} = h_{ij}. The inputs of the GNN are the features of all edges, i.e., [d^{(0)}_{ij,E}]_{N×K} = [h_{ij}]_{N×K} = H. The outputs of the GNN are the learned actions taken on all the edges. In (22a) and (22b), (a) comes from substituting d^{(0)}_{i,S} and d^{(0)}_{ij,I} into a^{(1)}_{i,S} and a^{(1)}_{ij,I}, and (b) is obtained by exchanging the order of the matrix multiplication and summation operations. J_{ij}(·) is expressed as a function of only α_{ij}, since we are concerned with whether or not the output of the GNN depends on the individual interference channel gains. If σ_I(·) is omitted, then CB_I(·) becomes a linear function.

Table V residue: 1.5 m, 2.5 dBi, transmit power of activated link 40 dBm.

While the GNNs can be trained in a supervised or unsupervised manner, we train the GNNs in an unsupervised manner to avoid the time-consuming generation of labels. Each sample then only contains a channel matrix α = [α_{ij}]_{K×K}, generated according to the channel model with randomly located D2D pairs, or H = [h_{nk}]_{N×K}, where each element follows a Rayleigh distribution. We generate 5 × 10^5 samples as the training set (the number of samples actually used for training may be much smaller) and 10^3 samples as the test set. Adam is used as the optimizer. The loss function is designed as the negative average sum rate over the training samples plus two penalty terms.
Besides, M_d = 4 for the Vertex-GNN and M_d = 32 for the Edge-GNN when N = 4 and K = 2, while M_d = 64 for the Vertex-GNN and M_d = 128 for the Edge-GNN when N = 8 and K = 4.

TABLE I: PROCESSING, POOLING AND COMBINATION FUNCTIONS OF GNNS FOR RESOURCE ALLOCATION.
and Table III, respectively.

TABLE II: VERTEX- AND EDGE-GNNS LEARNED OVER THE GRAPHS IN FIG. 2 (the referenced set is composed of all the elements in H; f(·) with different super- and sub-scripts are non-linear functions).

TABLE VII: FINE-TUNED HYPER-PARAMETERS FOR THE VERTEX-GNNS WITH FNN AS PROCESSOR OR COMBINER.

It is shown that for the Vertex-GNNs learning the same policy, when the processing function is a FNN (no matter whether the combiner is linear or non-linear; the results for the linear combiner are not shown because the combiner is usually non-linear), the performance is much better than that of the vanilla Vertex-GNN.

TABLE VIII: PERFORMANCE OF GNNS WITH DIFFERENT PROCESSING AND COMBINATION FUNCTIONS. σ(·) means that the combination function is a linear function cascaded with an activation function.

TABLE IX: PERFORMANCE OF VANILLA GNNS UNDER DIFFERENT NON-LINEAR ACTIVATION FUNCTIONS.

Since the Vertex-GNN^het_p+FNN (with H) outperforms the Vertex-GNN^het_p+FNN (w/o H), we only consider the Vertex-GNN^het_p+FNN (with H). The Vertex-GNN^het_ls+FNN and the vanilla Edge-GNN^het_ls use the hyper-parameters in Table VII and Table VI, respectively. The Vertex-GNN^het_p+FNN (with H) and the Edge-GNN^het_p use the hyper-parameters in Section V-C.