Data-Driven Based Relay Selection and Cooperative Beamforming for Non-Regenerative Multi-Antenna Relay Networks

In this paper, we exploit data-driven methods to solve the joint relay selection and beamforming problem for non-regenerative relay networks. The joint optimization problem, which aims to maximize the receiver's achievable rate under a relay transmit power constraint, is intrinsically hard because it mixes discrete and continuous variables. Directly mapping channel state information to a selected relay together with its optimized beamforming weights via data-driven methods often fails to yield good results. To overcome this difficulty, we propose a two-stage data-driven algorithm. First, we convert relay selection into a multi-class classification problem and use a support vector machine (SVM) based data-driven scheme to determine the best relay. After the relay is selected, we use a closed-form solution to obtain the corresponding relay beamforming weights. Since the number of relays is often more than two, the classification problem suffers from sample imbalance, which the proposed data-driven scheme takes into account. The core idea of the SVM-based classification method is to train the parameters of the SVM classifier on a large number of offline samples, so that the computation of relay selection is transferred to offline SVM training. Simulation results demonstrate that the performance of the proposed method is close to that of the globally optimal relay selection scheme. Moreover, the proposed scheme has much lower complexity than the globally optimal relay selection scheme, especially when the number of relays is large.


I. INTRODUCTION
Machine learning (ML) has been broadly accepted as an effective artificial intelligence (AI) technique for supporting a variety of tasks, such as classification and decision-making. Recently, machine learning in wireless communication has also attracted attention [1]-[3]. The work in [1], which uses the K-nearest neighbor (K-NN) and support vector machine (SVM) algorithms to solve the antenna selection problem, is the first attempt to apply machine learning to communications. Data-driven antenna selection schemes have been investigated in point-to-point systems [4]. A data-driven joint beamforming and antenna selection scheme is studied in [5], in which a favorable tradeoff between system performance and computational complexity is observed. For relay systems, transmit antenna selection (TAS) schemes based on data-driven algorithms are investigated for untrusted relay networks [6], [7]. Some studies have applied Q-learning to the relay selection problem. For example, [8] proposed a Q-learning-based relay selection algorithm for cooperative wireless networks, which adopts Markov decision process modeling and uses an iterative method to approximate the optimal solution. Source nodes with learning ability can determine the best relay to participate in cooperative communication based on previously observed system performance states and quality functions describing rewards. However, the Q-learning-based relay selection scheme can only handle problems with a small state space, and it assumes that all nodes are equipped with a single antenna.
Wireless sensor networks (WSNs) have been extensively used in military, political, and medical fields [9]. However, in some applications, because the deployed sensor nodes are numerous and small, the system throughput of the nodes is insufficient and their energy is limited [10]-[14]. Cooperative communication is seen as a key technology to prolong the network life cycle and improve the performance of sensor networks [15], [16]. By deploying relay nodes in wireless sensor networks, it exploits the broadcast nature of wireless systems to improve the data rate of the system and expand the coverage of nodes. Cooperative relaying is a key technology in current wireless systems and applications, such as Long-Term Evolution (LTE) and LTE-Advanced cellular systems [17], [18] and wireless local area networks (WLANs), where it improves quality of service, coverage, and resource utilization efficiency. In multi-relay networks, relay selection simplifies signaling and saves energy while retaining most of the system capacity improvement [19]-[24]. However, relay selection is an NP-hard problem, and this hardness carries over to the joint relay selection and relay beamforming problem [25]-[28]. In [27], a locally optimal design of joint cooperative beamforming and relay selection is considered: the binary constraints are approximated by continuous box constraints, and iterative d.c. (difference of two convex functions/sets) programming is employed to find approximate solutions. In [28], a globally optimal design for the joint optimization of relay selection and cooperative beamforming is investigated.
The globally optimal design is built on exhaustive search; in each step of the search, a semidefinite programming (SDP) problem is solved. Both the iterative algorithm and the exhaustive-search-based algorithm involve high computational complexity, especially when the number of relays is large. In wireless networks where the computation ability and energy storage of nodes are bounded, e.g., wireless sensor networks and Internet of Things (IoT) networks, algorithms with high computational complexity are prohibitive. At present, there have been studies on low-complexity multiple relay selection schemes based on various methods [29], [30]. However, as far as we know, no existing work investigates data-driven relay selection and beamforming in multi-antenna multi-relay networks, which offers the advantages of low complexity and high performance.
This paper uses a data-driven method to design a joint relay selection and beamforming scheme that maximizes the achievable rate under the individual relay power constraints and the relay selection constraints. The scheme is intended to reduce the online computation burden of multi-antenna multi-relay networks while maintaining high performance. We propose a two-stage scheme based on the data-driven method. First, we convert relay selection into a multi-class classification problem and use an SVM-based data-driven scheme to determine the best relay. After the relay is selected, we use a closed-form solution to obtain the corresponding relay beamforming weights. Since the number of relays is often more than two, the classification problem suffers from sample imbalance, which is considered in our proposed data-driven classification scheme. To implement the data-driven method, the multi-class classification training systems are constructed accordingly. By feeding massive sample data into the training system, the parameters of the multi-class classifier are optimized. Once the multi-class classifiers are tuned in the training stage, only simple computations are necessary for real-time implementation.
The remainder of this paper is organized as follows. In Section II, we introduce the system model. In Section III, we propose a data-driven joint relay selection and cooperative beamforming scheme, which comprises an improved support vector machine algorithm that handles sample imbalance and a low-complexity cooperative beamforming matrix. In Section IV, the performance is evaluated, including simulation results and complexity analysis. Finally, the paper is summarized in Section V.
Notations: Lower and upper case boldface letters represent a column vector $\mathbf{a}$ and a matrix $\mathbf{A}$, respectively. $\mathbf{A}^T$, $\mathbf{A}^*$, $\mathbf{A}^\dagger$, $\|\mathbf{A}\|$, and $\mathrm{tr}(\mathbf{A})$ denote the transpose, conjugate, conjugate transpose, Frobenius norm, and trace of the matrix $\mathbf{A}$, respectively. $\otimes$ denotes the Kronecker product, $\mathrm{vec}(\mathbf{A})$ stacks the columns of a matrix $\mathbf{A}$ into a single vector $\mathbf{a}$, $\mathbf{A} \succeq 0$ means that $\mathbf{A}$ is positive semidefinite, $\mathrm{diag}(\mathbf{A})$ denotes the diagonal elements of the matrix $\mathbf{A}$, and $|\mathbf{a}|$ denotes the absolute value of the vector $\mathbf{a}$.

II. SYSTEM MODEL
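Consider a multi-antenna multi-relay network, as shown in Fig. 1, which consists of one source, one destination, and $K$ relays. The source and the destination are each equipped with a single antenna, and each relay is equipped with $M$ antennas. We assume that the direct link between the source and the destination is sufficiently weak to be ignored, which occurs when the direct link is blocked by obstacles or suffers from long-distance path loss. Let $\mathbf{h}_k \in \mathbb{C}^{M \times 1}$ and $\mathbf{g}_k^\dagger \in \mathbb{C}^{1 \times M}$ denote the channel vectors from the source to the $k$th relay and from the $k$th relay to the destination, respectively, where $k \in \mathcal{K} = \{1, 2, ..., K\}$. The network operates in the time-division duplex mode, and the transmission of information is divided into two phases. In the first phase, the source transmits the symbol $s \in \mathbb{C}^{1 \times 1}$ to the relays. Thus, the received signal at the $k$th relay, $k \in \mathcal{K}$, is expressed as
$$\mathbf{y}_k = \sqrt{P}\,\mathbf{h}_k s + \mathbf{n}_k, \tag{1}$$
where $P$ denotes the transmit power of the source and $\mathbf{n}_k \sim \mathcal{CN}(\mathbf{0}, \sigma_k^2 \mathbf{I})$ denotes the additive Gaussian noise vector at the $k$th relay. In the second phase, the selected relay multiplies its received signal by a suitably designed beamforming matrix and forwards the product to the destination. The transmitted signal from the $k$th relay, $k \in \mathcal{K}$, is
$$\mathbf{x}_k = \Delta_k \mathbf{W}_k \mathbf{y}_k, \tag{2}$$
where $\mathbf{W}_k \in \mathbb{C}^{M \times M}$ denotes the beamforming matrix at the $k$th relay and $\Delta_k \in \{0, 1\}$ denotes the relay selection indicator. Specifically, $\Delta_k = 1$ means that the $k$th relay is selected to forward the signal, and $\Delta_k = 0$ means that the $k$th relay is not selected. The number of selected relays is constrained to be $\kappa$, $0 \le \kappa \le K$, in this work, i.e.,
$$\sum_{k=1}^{K} \Delta_k = \kappa. \tag{3}$$
For simplicity, we assume $\kappa = 1$ in the following. The received signal at the destination, denoted as $y$, is given by
$$y = \sum_{k=1}^{K} \Delta_k \mathbf{g}_k^\dagger \mathbf{W}_k \left(\sqrt{P}\,\mathbf{h}_k s + \mathbf{n}_k\right) + n_d, \tag{4}$$
where $n_d \sim \mathcal{CN}(0, \sigma_d^2)$ denotes the additive Gaussian noise at the destination. Thus, with $\mathbb{E}\{|s|^2\} = 1$, the signal-to-noise ratio (SNR) at the destination is
$$\mathrm{SNR} = \frac{P \left| \sum_{k=1}^{K} \Delta_k \mathbf{g}_k^\dagger \mathbf{W}_k \mathbf{h}_k \right|^2}{\sum_{k=1}^{K} \Delta_k \sigma_k^2 \left\| \mathbf{g}_k^\dagger \mathbf{W}_k \right\|^2 + \sigma_d^2}. \tag{5}$$
We aim to maximize the achievable rate of the multi-relay network subject to the transmit power constraint and the relay selection constraint. The optimization problem is formulated as follows:
$$\max_{\{\mathbf{W}_k, \Delta_k\}} \ \log_2(1 + \mathrm{SNR}) \quad \text{s.t.} \ \sum_{k=1}^{K} \Delta_k \left( P \left\| \mathbf{W}_k \mathbf{h}_k \right\|^2 + \sigma_k^2 \left\| \mathbf{W}_k \right\|^2 \right) \le P_r, \ \ \sum_{k=1}^{K} \Delta_k = 1, \ \ \Delta_k \in \{0, 1\}, \ k \in \mathcal{K}, \tag{6}$$
where $P_r$ denotes the total transmit power constraint at all the relays¹.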
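As a rough numerical illustration of the model in this section, the sketch below evaluates the destination SNR, the achievable rate, and the relay transmit power for one selected relay. It assumes the standard amplify-and-forward expressions $\mathrm{SNR} = P|\mathbf{g}^\dagger \mathbf{W} \mathbf{h}|^2 / (\sigma_k^2 \|\mathbf{g}^\dagger \mathbf{W}\|^2 + \sigma_d^2)$ and relay power $P\|\mathbf{W}\mathbf{h}\|^2 + \sigma_k^2\|\mathbf{W}\|^2$, a unit-power symbol, and a rate of $\log_2(1+\mathrm{SNR})$ without a pre-log factor. All function names are hypothetical and chosen only for this example.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def destination_snr(h, g, W, P, sigma2_k, sigma2_d):
    """SNR at the destination when a single relay forwards W @ y_k.

    h: source-to-relay channel, g: relay-to-destination channel (the
    destination observes g^H x), W: beamforming matrix, P: source power.
    Assumed form: P*|g^H W h|^2 / (sigma_k^2*||g^H W||^2 + sigma_d^2).
    """
    M = len(g)
    Wh = matvec(W, h)
    # Effective end-to-end signal gain g^H W h.
    gain = sum(gi.conjugate() * x for gi, x in zip(g, Wh))
    # Row vector g^H W, which shapes the noise forwarded by the relay.
    gW = [sum(g[i].conjugate() * W[i][j] for i in range(M)) for j in range(M)]
    forwarded_noise = sigma2_k * sum(abs(x) ** 2 for x in gW)
    return P * abs(gain) ** 2 / (forwarded_noise + sigma2_d)

def relay_power(h, W, P, sigma2_k):
    """Average relay transmit power: P*||W h||^2 + sigma_k^2*||W||_F^2."""
    Wh = matvec(W, h)
    frob2 = sum(abs(w) ** 2 for row in W for w in row)
    return P * sum(abs(x) ** 2 for x in Wh) + sigma2_k * frob2

def achievable_rate(snr):
    """Achievable rate in bit/s/Hz (no pre-log factor assumed)."""
    return math.log2(1.0 + snr)
```

A candidate beamforming matrix is feasible in this sketch when `relay_power(...)` does not exceed the relay power budget; among feasible matrices, the one with the largest `destination_snr(...)` also has the largest rate because the logarithm is monotonic.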

III. DATA-DRIVEN-BASED JOINT RELAY SELECTION AND COOPERATIVE BEAMFORMING
In this section, a two-stage approach is proposed for efficiently selecting the optimal relay and deriving the corresponding beamforming matrix that maximizes the achievable rate. An SVM-based relay selection method is designed to map the channel vectors to the optimal relay. Since the number of relays is often more than two, the sample imbalance problem exists in the multi-class classification problem. Sample imbalance often degrades the expected classification performance, and we propose an improved SVM-based classification algorithm to solve this problem. Once the optimal relay is selected, a generalized Rayleigh quotient-based algorithm is used to derive the closed-form optimal cooperative beamforming weights for the selected relay. The aim is to exploit the computational efficiency of the data-driven method in order to obtain a fast real-time joint relay selection and beamforming scheme.

A. DATA-DRIVEN RELAY SELECTION
In this section, we detail our proposed data-driven relay selection scheme. Since there are multiple relays in the relay network, we first set up a multi-class classification model. By extracting features from the channel state information (CSI), we apply the SVM approach to construct the classification model and predict the class label to which the current channel belongs. The predicted class is the index of the relay that is expected to maximize the achievable rate for the current channel. The construction of the classification model needs a sufficiently large training data set and can be completed offline.

1) Training Set Preparation
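We first generate $L$ channel samples for training. Each sample can be represented as $\{\mathbf{h}_k, \mathbf{g}_k\}_{k \in \mathcal{K}}$. The channel realizations $\mathbf{h}_k$ and $\mathbf{g}_k$ are randomly generated according to Rayleigh fading. Then, the covariance matrices are computed as $\mathbf{A}_{h_k} = \mathbf{h}_k \mathbf{h}_k^\dagger$ and $\mathbf{A}_{g_k} = \mathbf{g}_k \mathbf{g}_k^\dagger$, respectively. The diagonal elements of the covariance matrices $\mathbf{A}_{h_k}$, $k \in \mathcal{K}$, and $\mathbf{A}_{g_k}$, $k \in \mathcal{K}$, are extracted to form the feature vectors². Specifically, we generate a $1 \times 2KM$ real-valued feature vector $\mathbf{x}$. Furthermore, since features with large values would introduce bias, we normalize each feature as
$$t_l^{(n)} = \frac{x_l^{(n)} - E\big(x^{(n)}\big)}{\max\big(x^{(n)}\big) - \min\big(x^{(n)}\big)}, \tag{7}$$
where $x_l^{(n)}$ and $t_l^{(n)}$ denote the $n$th feature and the $n$th normalized feature of sample $l$, respectively, $E(x^{(n)})$ denotes the mean of the $n$th feature over the $L$ samples, $\max(x^{(n)})$ represents the maximum value of the $n$th feature among the $L$ samples, and $\min(x^{(n)})$ represents the minimum value of the $n$th feature among the $L$ samples. Then each channel sample can be represented as a normalized feature vector $\mathbf{t} \in \mathbb{R}^{1 \times 2KM}$.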
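The feature construction described above can be sketched in a few lines. The example below is only illustrative: it assumes unit-variance Rayleigh fading, uses the squared magnitudes of the channel entries as the diagonal elements of the covariance matrices, and stacks the source-side features before the destination-side ones (the ordering inside the feature vector is not specified in the text). The function names are hypothetical.

```python
import random

def rayleigh_channel(M, rng):
    """Length-M vector of i.i.d. CN(0, 1) entries (Rayleigh-fading amplitudes)."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(M)]

def feature_vector(h_list, g_list):
    """Stack diag(h_k h_k^H) and diag(g_k g_k^H) over all K relays (length 2KM)."""
    feats = []
    for h in h_list:
        feats.extend(abs(x) ** 2 for x in h)
    for g in g_list:
        feats.extend(abs(x) ** 2 for x in g)
    return feats

def mean_normalize(X):
    """Normalize each column of X as (x - mean) / (max - min)."""
    L = len(X)
    cols = list(zip(*X))
    means = [sum(c) / L for c in cols]
    spans = [max(c) - min(c) for c in cols]
    # A constant feature has zero span; map it to 0 to avoid dividing by zero.
    return [[(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, spans)]
            for row in X]

# Small illustrative data set: K relays, M antennas, L samples.
rng = random.Random(0)
K, M, L = 3, 2, 50
X = [feature_vector([rayleigh_channel(M, rng) for _ in range(K)],
                    [rayleigh_channel(M, rng) for _ in range(K)])
     for _ in range(L)]
T_norm = mean_normalize(X)
```

After normalization every feature has zero mean and unit range across the data set, so no single channel gain dominates the kernel distances used by the classifier.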
There are $K$ relays in the network for relay selection. These can be interpreted as $K$ classes, with the index of the relay serving as the class label. The labels of all training samples can be collected in a vector $\mathbf{r} \in \mathbb{C}^{1 \times L}$ consisting of the indices of the optimal relays for the $L$ training samples. For each channel sample, the label is generated by the exhaustive search of the globally optimal design for the joint optimization of relay selection and cooperative beamforming in [28]. Specifically, the label is the index of the relay that achieves the maximum achievable rate under the transmit power constraint in (6). This process is repeated to generate the whole training data set.
Note that the proposed training strategy only maps each channel realization to the optimal relay index. Compared to mapping the channel realization to $\mathbf{W}_k$, the beamforming matrix of the selected relay (e.g., the $k$th relay), the proposed training method reduces the "learning burden" of the SVM classifier, makes the SVM classifier easier to train, and achieves better results in practice.

2) SVM Classifier
The SVM classifier tries to find an optimal hyperplane that separates two classes of samples with a large margin [31]. Let the training dataset of $L$ samples be $(\mathbf{t}_i, y_i)$, $i = 1, 2, \cdots, L$, where $\mathbf{t}_i \in \mathbb{R}^d$ is a feature vector and $y_i \in \{+1, -1\}$ is the corresponding label. The optimal hyperplane satisfies $y_i(\mathbf{w}^T \mathbf{t}_i + b) \ge 1$, $i = 1, 2, \cdots, L$, where $\mathbf{w}$ is the normal vector, which determines the direction of the hyperplane, and $b$ is the displacement, which determines the distance between the hyperplane and the origin. In a regular SVM algorithm, the optimization problem of training the separating hyperplane of the $n$th classifier can be formulated as
$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{L} \xi_i \quad \text{s.t.} \ y_i\big(\mathbf{w}^T \phi(\mathbf{t}_i) + b\big) \ge 1 - \xi_i, \ \ \xi_i \ge 0, \ i = 1, 2, \cdots, L, \tag{8}$$
where $\phi$ is the mapping function, $b$ is the threshold, $C$ is the penalty constant, and $\xi_i$ is the error caused by misclassification of sample $\mathbf{t}_i$.
² We take the maximum achievable rate in (6) as the KPI of the classifiers, and the power gains of the channels $\mathbf{h}_k$ and $\mathbf{g}_k$ are the major factors that determine the KPI. The diagonal elements of $\mathbf{A}_{h_k}$ and $\mathbf{A}_{g_k}$, which are the squared amplitudes of the elements of $\mathbf{h}_k$ and $\mathbf{g}_k$, are exactly these power gains. Therefore, we take the diagonal elements of $\mathbf{A}_{h_k}$ and $\mathbf{A}_{g_k}$ as the features. Compared to taking the amplitudes of the elements of $\mathbf{h}_k$ and $\mathbf{g}_k$ as features, our selection of features is more directly connected to the KPI and hence more effective.
Considering sample imbalance in our classification problem, we choose $C_+$ as the penalty constant for positive samples and $C_-$ as the penalty constant for negative samples. Problem (8) can be reformulated as
$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^2 + C_+ \sum_{i: y_i = +1} \xi_i + C_- \sum_{i: y_i = -1} \xi_i \quad \text{s.t.} \ y_i\big(\mathbf{w}^T \phi(\mathbf{t}_i) + b\big) \ge 1 - \xi_i, \ \ \xi_i \ge 0, \ i = 1, 2, \cdots, L. \tag{9}$$
In order to ensure the accuracy of hyperplane separation in imbalanced datasets, we choose a larger penalty constant for positive samples and a smaller penalty constant for negative samples. Specifically, we use the reciprocals of the numbers of positive and negative samples as $C_+$ and $C_-$, respectively. To solve problem (9), we construct its dual problem as
$$\max_{\{a_i\}} \ \sum_{i=1}^{L} a_i - \frac{1}{2} \sum_{i=1}^{L} \sum_{j=1}^{L} a_i a_j y_i y_j K(\mathbf{t}_i, \mathbf{t}_j) \quad \text{s.t.} \ \sum_{i=1}^{L} a_i y_i = 0, \ \ 0 \le a_i \le C_+ \ (y_i = +1), \ \ 0 \le a_i \le C_- \ (y_i = -1), \tag{10}$$
where $a_i$ and $a_j$ are Lagrange multipliers and $K(\mathbf{t}_i, \mathbf{t}_j)$ is the kernel function. The multipliers $a_i$ and $a_j$ can be obtained by the Sequential Minimal Optimization (SMO) algorithm, which converges quickly and reliably, as in [2]. $K(\mathbf{t}_i, \mathbf{t}_j)$ is chosen to be the radial basis kernel function according to Table I, which compares the accuracy of several kernel functions in a classification test on the training samples. The radial basis kernel function is $K(\mathbf{t}_i, \mathbf{t}_j) = e^{-\|\mathbf{t}_i - \mathbf{t}_j\|^2 / (2\sigma^2)}$, with $\sigma$ being the design parameter. Cross-validation is a common method to decide the value of $\sigma$, but it needs to train the model many times, which leads to high complexity. In this paper, we propose a new method to optimize the parameter $\sigma$ according to the sample distance. Problem (10) can be transformed into problem (11), which is expressed in terms of $D_{ij} = \|\mathbf{t}_i - \mathbf{t}_j\|^2$, $\alpha = -1/(2\sigma^2)$, and $y_i y_j \in \{+1, -1\}$. By using the Maclaurin expansion, problem (11) can be represented as problem (12), from which we obtain the optimal value $\alpha^* = -1/D_{ij}$. When $\{a_i\}$, $\mathbf{w}$, $b$, and the design parameter $\sigma$ are determined, the $n$th classifier can be presented as
$$f_n(\mathbf{t}_j) = \sum_{i \in S} a_i y_i K(\mathbf{t}_i, \mathbf{t}_j) + b, \tag{13}$$
where $\mathbf{t}_j$ is a new feature vector to be classified, $\mathbf{t}_i$ is a support vector, and $S$ is the set of support vectors. If $f_n(\mathbf{t}_j) > 0$, the optimal relay is the $n$th relay.
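To make the imbalance handling and the decision rule concrete, the following sketch computes the reciprocal-count penalties for one one-vs-rest classifier and evaluates a kernel decision function of the form $f_n(\mathbf{t}) = \sum_{i \in S} a_i y_i K(\mathbf{t}_i, \mathbf{t}) + b$. It does not implement SMO training: the multipliers, bias, and kernel width are taken as given inputs. Picking the relay with the largest decision value is an assumption made here to resolve cases in which several (or no) classifiers return a positive value. All names are hypothetical.

```python
import math

def class_penalties(labels, positive_class):
    """Penalties (C_plus, C_minus) for the one-vs-rest classifier of one relay.

    Each penalty is the reciprocal of the corresponding sample count, so the
    minority class receives the larger penalty.
    """
    n_pos = sum(1 for r in labels if r == positive_class)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in the training labels")
    return 1.0 / n_pos, 1.0 / n_neg

def rbf_kernel(x, y, sigma):
    """Radial basis kernel exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def decision_value(t_new, support_vectors, multipliers, signs, b, sigma):
    """Evaluate f_n(t) = sum_i a_i * y_i * K(t_i, t) + b over the support vectors."""
    return b + sum(a * y * rbf_kernel(t_i, t_new, sigma)
                   for t_i, a, y in zip(support_vectors, multipliers, signs))

def select_relay(t_new, classifiers):
    """Return the relay index whose classifier gives the largest decision value.

    classifiers maps a relay index to a tuple
    (support_vectors, multipliers, signs, b, sigma) from an already trained model.
    """
    scores = {n: decision_value(t_new, *params) for n, params in classifiers.items()}
    return max(scores, key=scores.get)
```

With $K$ relays and roughly balanced labels, each one-vs-rest problem has about $K - 1$ times as many negative samples as positive ones, which is why the positive-class penalty ends up larger.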

3) Training Stage
For each normalized feature vector $\mathbf{t}_i$, $i \in \{1, 2, ..., L\}$, we have its corresponding class label $r_i$, $i \in \{1, 2, ..., L\}$. Using the $L$ tuples $\{\mathbf{t}_i, r_i\}$ as input, the SVM classifier finds the optimal values of the hyperplane parameters $\mathbf{w}$, $b$, the design parameter $\sigma$, and the multipliers $\{a_i\}$. Once the SVM classifier is trained, we can predict the relay selection for a new input of channel realizations.³

4) Testing Stage
In the testing stage, the channel realizations are generated following the same distribution as in the training stage. Then, the diagonal elements of the covariance matrices $\mathbf{A}_{h_k}$ and $\mathbf{A}_{g_k}$, $k \in \mathcal{K}$, are obtained as input data and fed to the SVM classifier. The label of the selected relay is predicted by the trained support vector machines. Afterwards, the corresponding relay beamforming matrix that attains the maximum achievable rate is derived, as detailed in the following subsection.
Algorithm 1 Data-driven relay selection algorithm.
Phase 1. Prepare training data
1: Estimate all $\mathbf{h}_k$, $\mathbf{g}_k$, $\mathbf{A}_{h_k}$ and $\mathbf{A}_{g_k}$, $k \in \mathcal{K} = \{1, 2, ..., K\}$. Calculate $\mathbf{t}_i$, $i \in \{1, 2, ..., L\}$ using (7);
2: Solve the problem (19) using $\mathbf{h}_k$, $\mathbf{g}_k$, $\mathbf{A}_{h_k}$ and $\mathbf{A}_{g_k}$, calculate $k^*$ and label it with $\mathbf{r} \in \mathbb{C}^{1 \times L}$;
3: Repeat 1-2 for $K - 1$ times;
Phase 2. Predict relay selection
1: Input the training samples $\{\mathbf{t}_i, r_i\}$, find the optimal values of the hyperplane parameters $\mathbf{w}$, $b$, the design parameter $\sigma$, and $\{a_i\}$. Build the SVM classifier;
2: Input the new feature vector $\mathbf{t}_j$ to the established SVM classifier and output the predicted relay;
3: Assign the relay label to $k^*$;
4: return $k^*$;

B. LOW COMPLEXITY COOPERATIVE BEAMFORMING
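First, we assume that the $k$th relay, $k \in \mathcal{K}$, is selected for the $l$th testing channel realization. Then the optimization problem (6) can be rewritten as
$$\max_{\mathbf{W}_k} \ \log_2\left(1 + \frac{P \left| \mathbf{g}_k^\dagger \mathbf{W}_k \mathbf{h}_k \right|^2}{\sigma_k^2 \left\| \mathbf{g}_k^\dagger \mathbf{W}_k \right\|^2 + \sigma_d^2}\right) \quad \text{s.t.} \ P \left\| \mathbf{W}_k \mathbf{h}_k \right\|^2 + \sigma_k^2 \left\| \mathbf{W}_k \right\|^2 \le P_r. \tag{14}$$
The optimization problem (14) is still non-convex and difficult to solve. Using the standard identities that relate the trace, the Kronecker product, and the $\mathrm{vec}(\cdot)$ operator for arbitrary matrices $\mathbf{A}_1$, $\mathbf{A}_2$, and $\mathbf{A}_3$ with compatible dimensions [35], [36], problem (14) can be equivalently reformulated as
$$\max_{\mathbf{z}_k} \ \log_2\left(1 + \frac{P \left| \mathbf{a}^\dagger \mathbf{z}_k \right|^2}{\sigma_k^2 \left\| \mathbf{B}^\dagger \mathbf{z}_k \right\|^2 + \sigma_d^2}\right) \quad \text{s.t.} \ P \left\| \mathbf{C}^\dagger \mathbf{z}_k \right\|^2 + \sigma_k^2 \left\| \mathbf{z}_k \right\|^2 \le P_r, \tag{15}$$
where $\mathbf{z}_k = \mathrm{vec}(\mathbf{W}_k)$, $\mathbf{a} = \mathbf{h}_k^* \otimes \mathbf{g}_k$, $\mathbf{B} = \mathbf{I} \otimes \mathbf{g}_k$, and $\mathbf{C} = \mathbf{h}_k^* \otimes \mathbf{I}$.
Defining $\mathbf{D}_1 = P \mathbf{a} \mathbf{a}^\dagger$, $\mathbf{D}_2 = \sigma_k^2 \mathbf{B} \mathbf{B}^\dagger$, and $\mathbf{D}_3 = P \mathbf{C} \mathbf{C}^\dagger + \sigma_k^2 \mathbf{I}$, problem (15) can be further rewritten as
$$\max_{\mathbf{z}_k} \ \log_2\left(1 + \frac{\mathbf{z}_k^\dagger \mathbf{D}_1 \mathbf{z}_k}{\mathbf{z}_k^\dagger \mathbf{D}_2 \mathbf{z}_k + \sigma_d^2}\right) \quad \text{s.t.} \ \mathbf{z}_k^\dagger \mathbf{D}_3 \mathbf{z}_k \le P_r. \tag{16}$$
For problem (16), the optimal relay beamforming vector $\mathbf{z}_k$ makes the transmit power constraint active [32], i.e., $\mathbf{z}_k^\dagger \mathbf{D}_3 \mathbf{z}_k = P_r$. Under this condition, problem (16) can be equivalently transformed into
$$\max_{\mathbf{z}_k} \ \frac{\mathbf{z}_k^\dagger \mathbf{D}_1 \mathbf{z}_k}{\mathbf{z}_k^\dagger \left(\mathbf{D}_2 + \sigma_d^2 P_r^{-1} \mathbf{D}_3\right) \mathbf{z}_k}. \tag{17}$$
Note that the function $\log_2(\cdot)$ is omitted in (17): since the logarithm is monotonically increasing, omitting it does not change the optimal solution.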
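The ratio of quadratic forms in (17) can be checked numerically on a small example. The sketch below assumes $\mathbf{D}_1 = P\mathbf{a}\mathbf{a}^\dagger$, $\mathbf{D}_2 = \sigma_k^2 \mathbf{B}\mathbf{B}^\dagger$, and $\mathbf{D}_3 = P\mathbf{C}\mathbf{C}^\dagger + \sigma_k^2\mathbf{I}$, obtained by matching the vectorized problem to the system model. Because the numerator matrix has rank one, the maximizer of the quotient is proportional to $(\mathbf{D}_2 + \sigma_d^2 P_r^{-1}\mathbf{D}_3)^{-1}\mathbf{a}$, and the scale is fixed by the active power constraint. The helper names are hypothetical, and plain Python lists are used so that the example has no dependencies.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def outer(u, v):
    """Outer product u v^H."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def lin_comb(c1, A, c2, B):
    """Entrywise c1*A + c2*B."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def quad(z, A):
    """Real part of the quadratic form z^H A z."""
    Az = [sum(a * x for a, x in zip(row, z)) for row in A]
    return sum(zi.conjugate() * y for zi, y in zip(z, Az)).real

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def optimal_beamformer(h, g, P, P_r, s2k, s2d):
    """Return (z, snr, relay_power) for the selected relay, with z = vec(W)."""
    M = len(h)
    hc = [x.conjugate() for x in h]
    a = [x * y for x in hc for y in g]                      # a = h* (x) g
    D1 = [[P * x for x in row] for row in outer(a, a)]      # P a a^H
    D2 = [[s2k * x for x in row] for row in kron(eye(M), outer(g, g))]
    D3 = lin_comb(P, kron(outer(hc, hc), eye(M)), s2k, eye(M * M))
    T = lin_comb(1.0, D2, s2d / P_r, D3)                    # denominator matrix
    x = solve(T, a)                                         # direction T^{-1} a
    beta = math.sqrt(P_r / quad(x, D3))                     # makes z^H D3 z = P_r
    z = [beta * xi for xi in x]
    snr = quad(z, D1) / (quad(z, D2) + s2d)
    return z, snr, quad(z, D3)
```

The returned relay power should equal the budget `P_r` up to rounding, which confirms that the constraint is active at the computed solution. For larger antenna counts a linear-algebra library would be the natural choice; the explicit elimination here only keeps the sketch self-contained.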
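Theorem 1: The optimal solution, denoted as $\mathbf{z}_k^o$, is
$$\mathbf{z}_k^o = \beta \mathbf{T}^{-1} \mathbf{a}, \tag{18}$$
where $\mathbf{T} = \mathbf{D}_2 + \sigma_d^2 P_r^{-1} \mathbf{D}_3$ and $\beta = \sqrt{P_r / \left(\mathbf{a}^\dagger \mathbf{T}^{-1} \mathbf{D}_3 \mathbf{T}^{-1} \mathbf{a}\right)}$. Thus, the optimal value of the objective function of (17) is $R_k = \lambda_{\max}(\mathbf{T}^{-1} \mathbf{D}_1)$. In order to evaluate the performance of our system, we adopt the maximum achievable rate as the performance metric. Following the conventional relay selection rule, the index of the selected relay is expressed as
$$k^* = \arg\max_{k \in \mathcal{K}} R_k. \tag{19}$$
Proof: Problem (17) is the maximization of a generalized Rayleigh quotient [35], whose optimal solution is a scaled version of $\mathbf{q}$, where $\mathbf{q}$ is the unit-norm eigenvector of the matrix $\mathbf{T}^{-1} \mathbf{D}_1$⁴ corresponding to its largest eigenvalue, and the scaling is fixed by the active power constraint. Since $\mathbf{D}_1 = P \mathbf{a} \mathbf{a}^\dagger$ has rank one, $\mathbf{q}$ is proportional to $\mathbf{T}^{-1} \mathbf{a}$, so that $\mathbf{z}_k = \beta \mathbf{T}^{-1} \mathbf{a}$. Substituting this solution into $\mathbf{z}_k^\dagger \mathbf{D}_3 \mathbf{z}_k = P_r$, we have $\beta = \sqrt{P_r / \left(\mathbf{a}^\dagger \mathbf{T}^{-1} \mathbf{D}_3 \mathbf{T}^{-1} \mathbf{a}\right)}$.
⁴ Since $\mathbf{D}_3$ has full rank, the matrix $\mathbf{D}_2 + \sigma_d^2 P_r^{-1} \mathbf{D}_3$ is invertible.

IV. PERFORMANCE EVALUATION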

A. SIMULATION RESULTS
In this section, we evaluate the achievable rate of the proposed SVM scheme (denoted as "Data-driven method with SVM (improved method)") [37]. For comparison, we also evaluate the performance of the classical SVM scheme (denoted as "Data-driven method with SVM (original method)"), the Artificial Neural Network (ANN) scheme [33] (denoted as "Data-driven method with ANN"), the "Exhaustive Search" scheme, and the "Random Selection" scheme. In the "Data-driven method with SVM (original method)" scheme, we replace the proposed SVM algorithm with the classical SVM algorithm, which does not consider sample imbalance. In the ANN scheme, we replace the proposed SVM algorithm with the ANN algorithm. In the "Random Selection" scheme, we randomly select 1 out of $K$ relays to participate in information exchange. In the "Exhaustive Search" scheme, we exhaustively search for the optimal relay over all possible relays, combined with the optimal beamforming design, which gives the upper bound of the performance. The training data samples and the test data samples are composed of the covariance matrices $\{\mathbf{A}_{h_k}\}_{k=1}^{K}$ and $\{\mathbf{A}_{g_k}\}_{k=1}^{K}$, with $4.8 \times 10^3$ channel realizations and $1.2 \times 10^3$ channel realizations, respectively. The elements of $\mathbf{h}_k$ and $\mathbf{g}_k$, $k \in \{1, 2, ..., K\}$, are independent and identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance. Hence, the training data and the testing data are completely different. For both the training samples and the test samples, we assume $M = 4$, $K = 8$, $\sigma_d^2 = \sigma_k^2 = \sigma^2 = 1$, $k \in \{1, 2, ..., K\}$, the transmit power at the relays takes the values $\{0, 5, 10, 15, 20, 25\}$ dB, and the transmit power at the source is 10 dB.
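The two non-learning baselines in this setup can be approximated with a short Monte Carlo sketch. It is not the simulation code behind the figures. It assumes that the per-relay optimum with optimal beamforming takes the standard amplify-and-forward form $\gamma_1 \gamma_2 / (\gamma_1 + \gamma_2 + 1)$ with $\gamma_1 = P\|\mathbf{h}_k\|^2/\sigma_k^2$ and $\gamma_2 = P_r\|\mathbf{g}_k\|^2/\sigma_d^2$, that powers given in dB are relative to the unit noise power, and that the rate is $\log_2(1 + \mathrm{SNR})$. Function names are hypothetical.

```python
import math
import random

def rayleigh_channel(M, rng):
    """Length-M vector of i.i.d. CN(0, 1) entries."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(M)]

def best_snr(h, g, P, P_r, s2k=1.0, s2d=1.0):
    """Assumed per-relay optimum: gamma1*gamma2 / (gamma1 + gamma2 + 1)."""
    gamma1 = P * sum(abs(x) ** 2 for x in h) / s2k      # first-hop SNR
    gamma2 = P_r * sum(abs(x) ** 2 for x in g) / s2d    # second-hop SNR
    return gamma1 * gamma2 / (gamma1 + gamma2 + 1.0)

def average_rates(K, M, P_dB, Pr_dB, trials, seed=0):
    """Average rate of exhaustive search and of random selection."""
    rng = random.Random(seed)
    P, P_r = 10 ** (P_dB / 10), 10 ** (Pr_dB / 10)
    exhaustive_sum = random_sum = 0.0
    for _ in range(trials):
        snrs = [best_snr(rayleigh_channel(M, rng), rayleigh_channel(M, rng), P, P_r)
                for _ in range(K)]
        exhaustive_sum += math.log2(1.0 + max(snrs))          # best relay
        random_sum += math.log2(1.0 + rng.choice(snrs))       # random relay
    return exhaustive_sum / trials, random_sum / trials

if __name__ == "__main__":
    for Pr_dB in (0, 5, 10, 15, 20, 25):
        ex, rd = average_rates(K=8, M=4, P_dB=10, Pr_dB=Pr_dB, trials=1200)
        print(f"P_r = {Pr_dB:2d} dB: exhaustive {ex:.3f}, random {rd:.3f} bit/s/Hz")
```

Under these assumptions, the index of the maximum in `snrs` is also the label that the exhaustive search would assign to a training sample, so the same loop could be reused to build a labeled data set.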
In Fig. 2, we present the average achievable rate comparison of the "Data-driven method with SVM (improved method)" scheme, the "Data-driven method with SVM (original method)" scheme, the "Exhaustive Search" scheme, and the "Random Selection" scheme. It is observed that the "Exhaustive Search" scheme obtains the highest average achievable rate. The "Data-driven method with SVM (improved method)" scheme performs clearly better than the "Data-driven method with SVM (original method)" scheme under different values of transmit power at the relays, which verifies that the handling of sample imbalance is effective. Moreover, the performance gap between the "Data-driven method with SVM (improved method)" scheme and the "Exhaustive Search" scheme is narrow.
In Fig. 3, we present the average achievable rate comparison of the "Data-driven method with SVM (original method)" scheme, the "Data-driven method with ANN" scheme, the "Exhaustive Search" scheme, and the "Random Selection" scheme. As seen from Fig. 3, the "Data-driven method with SVM (original method)" scheme performs better than the "Data-driven method with ANN" scheme.

B. COMPLEXITY ANALYSIS
In this section, we analyze the complexity of the exhaustive search scheme, the random selection scheme, the proposed improved SVM scheme, the original SVM scheme, and the ANN scheme. The computational complexity of the SVM schemes and the ANN scheme is evaluated in the test stage. The complexity of the SVM algorithm [38] is $O(p\,n_{sv})$, and the complexity of the ANN algorithm is $O(p\,n_{l1} + n_{l1} n_{l2} + \ldots)$, where $p$ denotes the dimension of the feature, $n_{sv}$ denotes the number of support vectors, and $n_{li}$ denotes the number of neurons in layer $i$. In our simulation, a three-layer neural network is designed, and the number of neurons in the second (hidden) layer is $n_{l2}$. According to (18), the complexity of the optimal beamforming consists of two parts: the first is the inversion of the matrix $\mathbf{T}$, and the second is the multiplication of the three factors $\beta$, $\mathbf{T}^{-1}$, and $\mathbf{a}$. It can be observed from Table II that the exhaustive search scheme has the highest complexity, and the complexity of the improved SVM scheme is slightly lower than that of the ANN scheme. Since the support vectors and categories of the SVM classifiers have not changed, the complexity of our proposed improved SVM scheme is the same as that of the original SVM scheme according to the calculation rules.

V. CONCLUSION
In this paper, we propose a joint relay selection and beamforming scheme for multi-antenna multi-relay networks based on the data-driven method. Considering the sample imbalance problem in classification, we propose an improved SVM scheme for relay selection and derive the closed-form beamforming matrix for the selected relay. Simulation results demonstrate that the performance of our proposed SVM scheme approaches that of the globally optimal relay selection scheme. Moreover, the proposed SVM scheme has much lower computational complexity than the globally optimal relay selection scheme, especially when the number of relays is large. In future work, we will continue to study multi-relay selection schemes in multi-relay networks.