Learning-Aided Deep Path Prediction for Sphere Decoding in Large MIMO Systems

In this paper, we propose a novel learning-aided sphere decoding (SD) scheme for large multiple-input multiple-output (MIMO) systems, namely, deep path prediction-based sphere decoding (DPP-SD). In this scheme, a neural network (NN) predicts the minimum metrics of the “deep” paths in the sub-trees before the tree search in SD commences. To reduce the complexity of the NN, we use an input vector of reduced dimension rather than the original received signals and the full channel matrix. The outputs of the NN, i.e., the predicted minimum path metrics, are exploited to determine the search order among the sub-trees and to optimize the initial search radius, which can reduce the computational complexity of SD. For further complexity reduction, an early termination scheme based on the predicted minimum path metrics is also proposed. Our simulation results show that the proposed DPP-SD scheme provides a significant reduction in computational complexity compared with the conventional SD algorithm while achieving near-optimal performance.


I. INTRODUCTION
In multiple-input multiple-output (MIMO) systems, the sphere decoding (SD) algorithm is known as an efficient signal-detection scheme, which performs close to the maximum-likelihood detection (MLD) receiver [1]. Recently, to satisfy the increasing demand for ultra-high data rates in mobile communication systems, large MIMO systems, in which a large number of antennas are employed at a base station for data transmission and reception, have been of great research interest [2]. To maximize the achievable data rates in large MIMO systems, the base station needs to receive as many symbols as possible simultaneously from multiple terminals, which leads to enhanced multiplexing gains. In this circumstance, a near-optimal receiver like SD plays an important role in approaching the channel capacity. However, the complexity of SD significantly increases with the number of antennas [3], which makes it difficult to apply to large MIMO systems.
Recently, deep-learning (DL) techniques have been applied in various fields, exhibiting excellent performance. Motivated by the performance of DL technologies in other fields, there have been attempts to apply DL to MIMO detection [4]-[8]. In particular, the DL-based sphere decoding (DL-SD) algorithm is derived to choose the optimal hypersphere radius [4]. In addition, a deep network architecture, called DetNet, is proposed to estimate the solution of MIMO detection [5].
Furthermore, the sparsely connected neural network (ScNet) is developed to simplify the structure of DetNet for massive MIMO systems [6]. The application of a deep neural network to reduce the computational complexity of the conventional belief propagation detector is proposed in [7], and the orthogonal approximate message-passing network (OAMP-Net) architecture is proposed to improve the performance of the iterative detection algorithm with trainable variables [8].
In this paper, a novel learning-aided SD algorithm is proposed. The main idea of the proposed algorithm is to predict the minimum path metric among the "deep" paths of each sub-tree in a large tree structure by using a neural network (NN). In large MIMO systems, the information required to estimate the path metrics can have a large dimension, which can significantly increase the complexity of the NN. To resolve this problem, the size of the input vector to the NN is optimized based on the properties of large MIMO channels. The predicted minimum path metrics of the sub-trees, which are generated by the designed NN architecture, are then used to determine the search order of SD. Furthermore, they are also used for early termination and optimization of the initial radius of SD, which can potentially reduce the overall complexity.
The remainder of this paper is organized as follows. In Section II, the system model and the conventional Schnorr-Euchner SD (SE-SD) algorithm are explained, whereas Section III describes the proposed learning-aided SD algorithm, which employs the NN. Section IV presents the simulation results to compare the bit-error rate (BER) performance and computational complexity, and is followed by the conclusion in Section V.
Notations: Scalars, vectors, and matrices are denoted by lowercase, bold-face lowercase, and bold-face uppercase letters, respectively. The $(i, j)$th element of a matrix $\mathbf{A}$ is denoted by $a_{i,j}$, whereas the $i$th element of a vector $\mathbf{a}$ is denoted by $a_i$. $(\cdot)^T$ and $(\cdot)^H$ represent the transpose and conjugate transpose of a matrix, respectively, whereas $\mathbf{I}$ and $\mathbf{0}$ indicate an identity matrix and an all-zero matrix of appropriate size, respectively. $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts of a complex matrix, respectively.

II. SYSTEM MODEL
We consider a MIMO system with $N_t$ transmit antennas and $N_r$ receive antennas. The received signal vector can be expressed as
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{v}, \tag{1}$$
where $\mathbf{y}$ represents an $N_r \times 1$ complex received signal vector, $\mathbf{H}$ is an $N_r \times N_t$ complex channel matrix, $\mathbf{v}$ is an $N_r \times 1$ complex additive white Gaussian noise vector with zero mean and covariance matrix $\sigma_v^2 \mathbf{I}$, and $\mathbf{x}$ is an $N_t \times 1$ complex transmitted signal vector drawn from a QAM constellation $\mathcal{S}$.
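The optimal ML detector searches for the lattice point $\hat{\mathbf{x}}_{\mathrm{ML}}$ that has the smallest Euclidean distance to the received signal vector $\mathbf{y}$ over the entire space $\mathcal{S}^{N_t}$ of the transmitted signal vector $\mathbf{x}$, yielding
$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{S}^{N_t}} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2. \tag{2}$$
In ML detection, all candidate vectors in $\mathcal{S}^{N_t}$ need to be examined, which requires high computational complexity, especially when $N_t$ is large. To resolve this problem, the SD algorithm can be employed. To achieve near-ML performance with lower complexity, the SD algorithm limits the search space of the tree search. Specifically, for a search radius $d$, the SD solution can be expressed as
$$\hat{\mathbf{x}}_{\mathrm{SD}} = \arg\min_{\mathbf{x} \in \mathcal{S}^{N_t},\; \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 \le d^2} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2. \tag{3}$$
To improve the efficiency of the tree search procedure in SD, the QR decomposition (QRD) of the channel matrix $\mathbf{H}$ is performed, which yields
$$\mathbf{H} = \mathbf{Q} \begin{bmatrix} \mathbf{R} \\ \mathbf{0} \end{bmatrix} = \begin{bmatrix} \mathbf{Q}_1 & \mathbf{Q}_2 \end{bmatrix} \begin{bmatrix} \mathbf{R} \\ \mathbf{0} \end{bmatrix}, \tag{4}$$
where $\mathbf{R}$ is an $N_t \times N_t$ upper-triangular matrix, $\mathbf{Q}$ is an $N_r \times N_r$ unitary matrix, and $\mathbf{Q}_1$ and $\mathbf{Q}_2$ consist of the first $N_t$ columns and the last $(N_r - N_t)$ columns of $\mathbf{Q}$, respectively.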
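Then, the constraint on the search space in (3) can be rewritten as
$$\|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2 \le \tilde{d}^2, \tag{5}$$
where we have $\mathbf{z} = \mathbf{Q}_1^H \mathbf{y}$ and $\tilde{d}^2 = d^2 - \|\mathbf{Q}_2^H \mathbf{y}\|^2$. Based on (5), the SD solution can be reformulated as
$$\hat{\mathbf{x}}_{\mathrm{SD}} = \arg\min_{\mathbf{x} \in \mathcal{S}^{N_t},\; \|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2 \le \tilde{d}^2} \|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2. \tag{6}$$
The SE-SD scheme, known as an improved SD search strategy, determines the search order for the nodes at each layer based on the branch metrics [9]. For a candidate symbol vector $\mathbf{x} = [x_1, x_2, \ldots, x_{N_t}]^T$, the branch metric at layer $l$ is written as
$$\left| z_l - \sum_{j=l}^{N_t} r_{l,j} x_j \right|^2. \tag{7}$$
In the SE-SD scheme, the candidate symbols are examined in ascending order of their branch metrics, which can be achieved at layer $l$ by following a zigzag search order of candidate symbols, starting from the initial point
$$\left\lfloor \frac{z_l - \sum_{j=l+1}^{N_t} r_{l,j} x_j}{r_{l,l}} \right\rceil, \tag{8}$$
where $\lfloor \cdot \rceil$ rounds its argument to the nearest point in the constellation $\mathcal{S}$.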
As mentioned above, at each layer of SE-SD, the search order is determined by the branch metric in (7), which only considers the metric at the corresponding layer. However, if we can use the full path metric, i.e., $\|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2$, which accumulates the branch metrics from the root node to the leaf node, for ordering, the search can be more efficient. Furthermore, this full path metric can also be exploited for early termination and the optimization of the initial radius. This motivates us to develop a novel NN-based SD scheme, which utilizes the predicted path metric for the operation of SD.
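As an illustration of the tree search reviewed in this section, the following is a minimal sketch of a depth-first sphere decoder operating on $\mathbf{z}$ and $\mathbf{R}$. It is not the authors' implementation: the function name `se_sd`, the recursive structure, and the unit-energy QPSK alphabet are assumptions made for illustration, and the zigzag enumeration of SE-SD is replaced by an explicit sort of the branch metrics, which visits the children in the same ascending order for a small constellation.

```python
import numpy as np

# Unit-energy QPSK alphabet (an assumption for this sketch).
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def se_sd(z, R, radius, symbols=QPSK):
    """Depth-first SD with children visited in ascending branch metric.

    Returns (x_hat, distance); x_hat is None if no lattice point lies
    inside the initial sphere.
    """
    n = R.shape[1]
    x = np.zeros(n, dtype=complex)
    best = {"x": None, "d2": radius ** 2}

    def visit(layer, partial):
        # Remove the contribution of the symbols already fixed at upper layers.
        resid = z[layer] - R[layer, layer + 1:] @ x[layer + 1:]
        branch = np.abs(resid - R[layer, layer] * symbols) ** 2
        for k in np.argsort(branch):       # ascending branch metric
            total = partial + branch[k]
            if total > best["d2"]:         # outside the sphere: prune the rest
                break
            x[layer] = symbols[k]
            if layer == 0:                 # leaf node: shrink the radius
                best["x"], best["d2"] = x.copy(), total
            else:
                visit(layer - 1, total)

    visit(n - 1, 0.0)                      # index n - 1 corresponds to layer N_t
    return best["x"], np.sqrt(best["d2"])
```

With `Q1, R = np.linalg.qr(H)` and `z = Q1.conj().T @ y`, calling `se_sd(z, R, np.inf)` returns the same vector as an exhaustive ML search, because the term $\|\mathbf{Q}_2^H \mathbf{y}\|^2$ does not depend on $\mathbf{x}$.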

III. DEEP PATH PREDICTION FOR SPHERE DECODING
In this section, the proposed DPP-SD is presented. As discussed in the last paragraph of Section II, knowledge of the full path metric can improve the efficiency of search ordering, which can potentially reduce the computational complexity. Hence, we consider the prediction of the path metric based on the NN. However, the prediction of the full path metric for every candidate vector requires the same order of complexity as ML detection, which implies that it is not an efficient way to exploit the NN to reduce the complexity of SD. Instead, in the proposed DPP-SD scheme, the NN is designed to predict the minimum path metric of the sub-tree rooted at each node at layer $N_t$. In other words, before beginning the tree search, the DPP-SD scheme predicts the minimum path metric of the "deep path" ranging from each child node of the root to a leaf node in each sub-tree, and we use it for sub-tree ordering, early termination, and radius determination.

A. Design of the NN for path metric prediction
Let $\mathbf{g} = [g_1, g_2, \ldots, g_{|\mathcal{S}|}]^T$ be the target vector of the proposed NN, where $g_q^2$ denotes the minimum path metric of the sub-tree rooted at the $q$th node at layer $N_t$, which can be formulated as
$$g_q^2 = \min_{\mathbf{x}_{1:N_t-1} \in \mathcal{S}^{N_t-1}} \left\| \mathbf{z} - \mathbf{R} \begin{bmatrix} \mathbf{x}_{1:N_t-1} \\ s_q \end{bmatrix} \right\|^2, \quad q = 1, \ldots, |\mathcal{S}|. \tag{9}$$
Here, $\{s_1, \ldots, s_{|\mathcal{S}|}\}$ are the candidate symbols at layer $N_t$, and $\mathbf{x}_{1:N_t-1}$ represents the vector consisting of the first $N_t - 1$ symbols in $\mathbf{x}$, i.e., $\mathbf{x}_{1:N_t-1} = [x_1, \ldots, x_{N_t-1}]^T$. Because the path metric depends on the received signal vector and the channel matrix, the input to the NN for path metric prediction should include information on the received signals and channel coefficients. However, if all components of the received signals and channel state information, i.e., $\{\mathbf{y}, \mathbf{H}\}$ or $\{\mathbf{z}, \mathbf{R}\}$, are employed for the input, the complexity of the NN can be significantly large in large MIMO systems. Furthermore, a large number of input elements can require a large training set and lower the prediction accuracy.
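To reduce the number of input elements, we rewrite the path metric as
$$\|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2 = \mathbf{z}^H \mathbf{z} - 2\Re\left(\mathbf{x}^H \mathbf{R}^H \mathbf{z}\right) + \mathbf{x}^H \mathbf{R}^H \mathbf{R} \mathbf{x}, \tag{10}$$
which implies that the path metric can be computed based on knowledge of $\mathbf{z}^H \mathbf{z}$, $\mathbf{R}^H \mathbf{z}$, and $\mathbf{R}^H \mathbf{R}$. It also means that they can be used as the inputs to the NN to predict the path metrics, instead of the received signals and channel matrix. However, $\mathbf{R}^H \mathbf{R}$ contains many more elements than $\mathbf{z}^H \mathbf{z}$ and $\mathbf{R}^H \mathbf{z}$. At the same time, it can be approximated as a diagonal matrix, $\mathbf{R}^H \mathbf{R} \approx N_r \sigma_h^2 \mathbf{I}$, in large MIMO systems owing to the asymptotically favorable propagation and channel-hardening effect [10], where $\sigma_h^2 = E(|h_{i,j}|^2)$. By assuming $\sigma_h^2 = 1$, $\mathbf{R}^H \mathbf{R}$ can be approximated as a fixed matrix, which is independent of both the received signal and the channel realization. In practical systems, $\sigma_h^2 = 1$ can be achieved by properly normalizing the received signal and channel matrix in (1). Hence, we exclude $\mathbf{R}^H \mathbf{R}$ from the set of inputs to the NN. We note that the distribution of the path metric depends on the noise variance $\sigma_v^2$, which implies that the noise variance can help improve the accuracy of estimating the minimum path metric in the NN. Considering these aspects, we set the input vector in the form of
$$\mathbf{e} = \left[ \mathbf{z}^H \mathbf{z},\; \Re\left(\mathbf{R}^H \mathbf{z}\right)^T,\; \Im\left(\mathbf{R}^H \mathbf{z}\right)^T,\; \sigma_v^2 \right]^T, \tag{11}$$
where the size of $\mathbf{e}$ becomes $2N_t + 2$.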
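The construction of the reduced input vector can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `nn_input` and `path_metric_from_stats` are hypothetical, and the ordering of the entries inside the vector is our choice, since only the set of quantities and the size $2N_t + 2$ follow from the text. The second function evaluates the expansion of $\|\mathbf{z} - \mathbf{R}\mathbf{x}\|^2$ into $\mathbf{z}^H \mathbf{z}$, $\mathbf{R}^H \mathbf{z}$, and $\mathbf{R}^H \mathbf{R}$ so that the identity can be checked numerically.

```python
import numpy as np

def nn_input(z, R, noise_var):
    """Reduced NN input [z^H z, Re(R^H z), Im(R^H z), sigma_v^2] of size 2*N_t + 2."""
    energy = np.vdot(z, z).real        # z^H z, a real scalar
    corr = R.conj().T @ z              # R^H z, N_t complex entries
    return np.concatenate(([energy], corr.real, corr.imag, [noise_var]))

def path_metric_from_stats(x, zHz, RHz, RHR):
    """Evaluate z^H z - 2 Re(x^H R^H z) + x^H R^H R x."""
    return zHz - 2.0 * np.vdot(x, RHz).real + np.vdot(x, RHR @ x).real
```

For $N_t = 24$, `nn_input` returns 50 real values, whereas stacking the real and imaginary parts of $\mathbf{y}$ and $\mathbf{H}$ for a 24×24 system would give $2 \cdot 24 + 2 \cdot 24^2 = 1200$ values.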
For the NN to predict the minimum path metrics, we employ a Gaussian radial basis function network (G-RBFN) [11], [12] consisting of one hidden layer, as depicted in Fig. 2. The radial basis function is a Gaussian function characterized by its center $\mu$ and width $\omega$. In the G-RBFN structure employed for the path metric prediction, the Gaussian function with center $\mu = 0$ and width $\omega = 1$ is used as the activation function in each node of the hidden layer. The number of nodes in the hidden layer is set to $2N_t + 2|\mathcal{S}|$.

Fig. 2. The proposed NN architecture (input layer, hidden layer, and output layer).
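To optimize the parameter vector $\boldsymbol{\theta}$ of the NN, which consists of the weights and biases between the input and hidden layers and between the hidden and output layers, the mean squared error (MSE) loss function is used as follows:
$$\mathcal{L}(\boldsymbol{\theta}) = \frac{1}{M} \sum_{m=1}^{M} \left\| \mathbf{g}^{(m)} - \hat{\mathbf{g}}^{(m)}(\mathbf{e}; \boldsymbol{\theta}) \right\|^2, \tag{13}$$
where $M$ indicates the number of training examples, whereas $\mathbf{g}^{(m)}$ and $\hat{\mathbf{g}}^{(m)}(\mathbf{e}; \boldsymbol{\theta}) = [\hat{g}_1^{(m)}, \ldots, \hat{g}_{|\mathcal{S}|}^{(m)}]^T$ are the desired target vector and the output vector for the $m$th example, respectively. For the optimization algorithm in training, the scaled conjugate gradient (SCG) method [13] is employed and the learning rate is set to 0.0001.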
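The forward pass and the loss of such a network can be sketched as below. Several details are assumptions rather than the authors' design: the Gaussian activation is taken as $\exp(-(a-\mu)^2/\omega^2)$ (the exact scaling of the width may differ), the loss is averaged over examples only, the parameter names `W1`, `b1`, `W2`, `b2` and the random initialization are hypothetical, and the SCG training loop is omitted.

```python
import numpy as np

def gaussian_rbf(a, mu=0.0, omega=1.0):
    # Assumed form exp(-(a - mu)^2 / omega^2); the paper's scaling may differ.
    return np.exp(-((a - mu) ** 2) / omega ** 2)

def init_params(n_in, n_hidden, n_out, rng):
    # Small random weights and zero biases (an arbitrary choice for this sketch).
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def predict(params, E):
    """Map a batch E of shape (M, n_in) to predicted metrics of shape (M, n_out)."""
    hidden = gaussian_rbf(E @ params["W1"].T + params["b1"])
    return hidden @ params["W2"].T + params["b2"]

def mse_loss(params, E, G):
    """Squared error summed over the outputs and averaged over the M examples."""
    err = predict(params, E) - G
    return np.mean(np.sum(err ** 2, axis=1))
```

For a 16×16 system with QPSK, the sizes in the text correspond to `init_params(34, 40, 4, rng)`, i.e., $2N_t + 2 = 34$ inputs, $2N_t + 2|\mathcal{S}| = 40$ hidden nodes, and $|\mathcal{S}| = 4$ outputs.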

B. NN-aided optimization of the initial radius
In the SD algorithm, the initial radius is typically determined based on the noise variance $\sigma_v^2$ to meet the constraint on the probability of the true solution existing inside the sphere [1]. Specifically, the conventional initial radius $f_{1-\epsilon}$ is computed from $\sigma_v^2$ by using the inverse incomplete gamma function $\Xi$, where $1 - \epsilon$ is the probability of the true solution existing inside the sphere. In this work, we assume $1 - \epsilon = 0.999$. If the initial radius is small, the complexity of SD is reduced; however, its BER performance can be degraded because the probability of the true solution being outside the sphere increases. In contrast, if the initial radius is large, the BER performance is improved, but the complexity can increase. Therefore, a better trade-off between the performance and complexity can be achieved if we reduce the initial radius while preserving the near-optimal BER performance.
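One common way to obtain such a probability-based radius is sketched below; it is an assumption about the construction, not necessarily the exact expression used in this paper. For circularly symmetric complex Gaussian noise with variance $\sigma_v^2$ per entry, the normalized squared norm of an $n$-dimensional noise vector follows a Gamma$(n, 1)$ distribution, whose distribution function has a closed form for integer $n$. The hypothetical function `initial_radius` inverts it by bisection, which plays the role of the inverse incomplete gamma function.

```python
import math

def erlang_cdf(x, n):
    """P(Gamma(n, 1) <= x) for integer n: 1 - exp(-x) * sum_{k<n} x^k / k!."""
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total

def initial_radius(noise_var, n_dim, prob=0.999):
    """Radius f such that an n_dim-dimensional noise vector has norm <= f with probability prob."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n_dim) < prob:   # bracket the quantile
        hi *= 2.0
    for _ in range(200):                  # bisection on the normalized squared radius
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n_dim) < prob:
            lo = mid
        else:
            hi = mid
    return math.sqrt(noise_var * hi)
```

Whether `n_dim` should be $N_r$ (for $\|\mathbf{y} - \mathbf{H}\mathbf{x}\|$) or $N_t$ (for $\|\mathbf{z} - \mathbf{R}\mathbf{x}\|$) depends on which distance the radius constrains; the two coincide for the square systems considered in the simulations.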
We note that the NN presented in the prior subsection generates $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta})$, whose elements are the predicted smallest path metrics of all the sub-trees originating from layer $N_t$. Hence, we can consider $\tilde{g}_1 = \min\{\hat{g}_1, \hat{g}_2, \ldots, \hat{g}_{|\mathcal{S}|}\}$ as the estimate of the smallest path metric over all the possible paths in the tree. Inspired by this, in the proposed DPP-SD scheme, we set the initial radius $f$ to
$$f = \min(\lambda_1 \tilde{g}_1, f_{0.999}), \tag{15}$$
where $\lambda_1$ is a design parameter. In the ideal case, where $\tilde{g}_1$ is accurately estimated and we have $\tilde{g}_1 \le f_{0.999}$, the initial radius $f$ with $\lambda_1 = 1$ guarantees that there is at least one lattice point inside the sphere, which implies that the proposed DPP-SD achieves the optimal ML performance. However, in practice, $\tilde{g}_1$ contains a prediction error, and hence $\lambda_1 > 1$ is empirically chosen to provide near-optimal performance. From (15), we have $f \le f_{0.999}$, and hence we can expect that by setting the initial radius to $f$ instead of $f_{0.999}$, the computational complexity of the tree search can be reduced, as will be numerically verified in Section IV.

C. NN-aided ordering
The output of the proposed NN can also be exploited for ordering in the tree search. Specifically, the proposed NN-aided ordering scheme rearranges the predicted minimum path metrics in the output vector $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta})$ in ascending order to generate $\tilde{\mathbf{g}} = [\tilde{g}_1, \tilde{g}_2, \ldots, \tilde{g}_{|\mathcal{S}|}]^T$, where we have $\tilde{g}_1 \le \tilde{g}_2 \le \cdots \le \tilde{g}_{|\mathcal{S}|}$. A smaller predicted minimum path metric of a sub-tree implies that this sub-tree is more likely to contain the final solution of SD. Therefore, it can be computationally efficient to search over the sub-trees with smaller predicted path metrics first.
In the depth-first search of the proposed DPP-SD, we visit the nodes at layer $N_t$ in ascending order of their predicted minimum path metrics, i.e., in the order given by $\tilde{\mathbf{g}}$. In other words, the sub-tree corresponding to $\tilde{g}_1$ is first visited by a depth-first search, and then the sub-tree corresponding to $\tilde{g}_2$ is searched. In this manner, the search is continued until any termination condition for SD is satisfied.
This NN-aided ordering strategy for layer $N_t$ can be extended to the remaining layers, which can potentially improve the search efficiency. However, it requires additional outputs of the NN. We note that the number of outputs of the NN increases exponentially with the number of layers employing the NN-aided ordering scheme, which significantly increases the overall complexity of DPP-SD. For example, to apply the NN-aided ordering to layer $N_t - 1$, the NN should be designed to predict the minimum path metrics of the $|\mathcal{S}|^2$ sub-trees rooted at the nodes at layer $N_t - 1$. Considering this aspect, we only apply the NN-aided ordering scheme to layer $N_t$, whereas the ordering scheme of SE-SD is employed for the remaining layers, as illustrated in Fig. 1.

D. NN-aided early termination
To further reduce the complexity, early termination can be performed based on the predicted minimum path metrics. When a solution is found through the search of the sub-tree rooted at the $p$th node at layer $N_t$, the radius $f$ is updated to the distance of the found solution. After the search in the $p$th sub-tree is completed, the radius, which is determined by the current best solution, is compared with the predicted minimum path metric of the next sub-tree. In particular, we check the termination condition
$$\lambda_2 f \le \tilde{g}_{p+1}, \tag{16}$$
where $\lambda_2$ is a design parameter. If the condition (16) is satisfied, the most recently found solution is adopted as the final solution and the algorithm is terminated, which results in a complexity reduction of the SD. If there is no prediction error in $\tilde{g}_{p+1}$, we can choose $\lambda_2 = 1$ in (16) without any performance loss of the SD receiver, because no remaining sub-tree can then contain a point closer than the current solution. However, considering the potential prediction error in $\tilde{g}_{p+1}$, $\lambda_2$ should be set sufficiently large to guarantee the near-optimal performance of the SD.
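The interplay of the NN-aided initial radius, sub-tree ordering, and early termination can be sketched as follows. This is an illustrative reading of Sections III-B to III-D rather than the authors' code: the names `search_subtree` and `dpp_sd` are hypothetical, the predicted metrics `g_hat` are passed in as a plain array so that any predictor can be plugged in, the termination test is written as $\lambda_2 f \le \tilde{g}_{p+1}$ (an assumed form consistent with a larger $\lambda_2$ making termination more conservative), and `None` is returned where a fallback solution would otherwise be used.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def search_subtree(z, R, top, radius, symbols=QPSK):
    """Best point within `radius` whose layer-N_t symbol is fixed to `top`."""
    n = R.shape[1]
    x = np.zeros(n, dtype=complex)
    x[-1] = top
    best = {"x": None, "d2": radius ** 2}

    def visit(layer, partial):
        resid = z[layer] - R[layer, layer + 1:] @ x[layer + 1:]
        branch = np.abs(resid - R[layer, layer] * symbols) ** 2
        for k in np.argsort(branch):           # SE-style order below layer N_t
            total = partial + branch[k]
            if total > best["d2"]:
                break
            x[layer] = symbols[k]
            if layer == 0:
                best["x"], best["d2"] = x.copy(), total
            else:
                visit(layer - 1, total)

    root = np.abs(z[-1] - R[-1, -1] * top) ** 2
    if root > best["d2"]:
        return None, radius
    if n == 1:
        return x.copy(), np.sqrt(root)
    visit(n - 2, root)
    return best["x"], np.sqrt(best["d2"])

def dpp_sd(z, R, g_hat, f_conv, lam1, lam2, symbols=QPSK):
    order = np.argsort(g_hat)                  # NN-aided sub-tree ordering
    g_sorted = np.asarray(g_hat)[order]
    f = min(lam1 * g_sorted[0], f_conv)        # NN-aided initial radius
    x_hat = None
    for p, q in enumerate(order):
        cand, dist = search_subtree(z, R, symbols[q], f, symbols)
        if cand is not None:
            x_hat, f = cand, dist              # radius shrinks to the best distance
        last = p + 1 == len(order)
        if x_hat is not None and not last and lam2 * f <= g_sorted[p + 1]:
            break                              # early termination
    return x_hat
```

If `g_hat` holds the exact minimum distances of the sub-trees, then `lam1` slightly above 1 and `lam2 = 1` return the ML solution after searching a single sub-tree; with an imperfect predictor, larger values of both parameters trade complexity for reliability.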
The proposed DPP-SD scheme is summarized in Algorithm 1. In step 1, we obtain the predicted path metrics $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta})$ by using the trained NN. In steps 2 and 3, the sub-trees rooted at the nodes at layer $N_t$ are rearranged in ascending order of the elements in $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta})$. Then, the NN-aided initial radius is determined based on the estimated smallest path metric $\tilde{g}_1$ in step 4. In steps 5-16, the tree search procedure is performed for each sub-tree. Specifically, in steps 6-10, the optimal solution in the $p$th sub-tree is determined. In steps 12-14, it is determined whether early termination is performed or not.
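Algorithm 1 The proposed DPP-SD algorithm
1: Obtain $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta}) = [\hat{g}_1, \ldots, \hat{g}_{|\mathcal{S}|}]^T$ by using the trained NN.
2: Sort the elements of $\hat{\mathbf{g}}(\mathbf{e}; \boldsymbol{\theta})$ in ascending order to obtain $\tilde{\mathbf{g}}$.
3: Rearrange the sub-trees according to $\tilde{\mathbf{g}}$.
4: Set the initial radius $f = \min(\lambda_1 \tilde{g}_1, f_{0.999})$.
5: for $p = 1, \ldots, |\mathcal{S}|$ do
6:   Search the $p$th sub-tree within the radius $f$.
7:   if a new solution is found then
8:     Set the new solution to $\hat{\mathbf{x}}$.
9:     Update the radius: $f = \|\mathbf{z} - \mathbf{R}\hat{\mathbf{x}}\|$.
10:     Go to Step 6 to continue the search for the $p$th sub-tree.
11:   end if
12:   if the termination condition (16) is satisfied then
13:     Terminate the search and adopt $\hat{\mathbf{x}}$ as the final solution.
14:   end if
15:   ...
16: end for

IV. SIMULATION RESULTS
In this section, the simulation results are presented to evaluate the performance and complexity of the proposed DPP-SD scheme. For simulations, we consider 16×16 and 24×24 MIMO systems with QPSK modulation. For each MIMO system, 100,000 randomly generated data samples are used to train the designed NN. To generate the random data samples, the channel coefficients are set to independent and identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance, whereas the signal-to-noise ratio (SNR) is set to a uniform random variable in the range of [4, 14] dB. We define the SNR as $E_s N_t / \sigma_v^2$, where $E_s$ is the average symbol energy. The design parameters $\lambda_1$ and $\lambda_2$ of the proposed DPP-SD algorithm are optimized by simulations. As $\lambda_1$ and $\lambda_2$ increase, the performance improves; however, the complexity also increases. Therefore, the values of $\lambda_1$ and $\lambda_2$ are optimized such that the complexity of the DPP-SD is minimized while its performance remains nearly the same as that of the SE-SD. The optimized values of $\lambda_1$ and $\lambda_2$ for each MIMO configuration and SNR are shown in Table I.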
As presented in Section III, the DPP-SD scheme is composed of three sub-schemes: NN-aided initial radius, NN-aided sub-tree ordering, and early termination. To test these sub-schemes separately, we evaluate 1) DPP-SD with only the NN-aided initial radius, 2) DPP-SD with only NN-aided sub-tree ordering, and 3) DPP-SD with only NN-aided sub-tree ordering and early termination, which are referred to as "DPP-SD w/ NN-radius," "DPP-SD w/ NN-ordering," and "DPP-SD w/ NN-ordering & ET," respectively, in the figures showing the simulation results.
We note that early termination can be employed only in combination with sub-tree ordering. As the conventional SD scheme for comparison, the SE-SD algorithm is considered, whereas the DL-SD in [4] is excluded from the comparison. In the DL-SD algorithm, the original channel matrix and received signals are used as the input to the NN, which can have a large dimension in large MIMO systems. For example, in a 24×24 MIMO system, the input contains more than 1,000 elements, which requires a significantly high complexity of the NN as well as a large training set. We note that in a 24×24 MIMO system, the input vector of the proposed DPP-SD scheme only contains 50 elements. In the considered algorithms, if no solution is found through the search process, the final solution is chosen to be a zero-forcing (ZF) solution, which is given by $\hat{\mathbf{x}}_{\mathrm{ZF}} = (\mathbf{H}^H \mathbf{H})^{-1} \mathbf{H}^H \mathbf{y}$.
Fig. 3 shows the BER performance comparison of the proposed DPP-SD and the conventional SE-SD in the assumed MIMO configurations. In Fig. 3, it is observed that the proposed DPP-SD schemes achieve almost the same performance as the SE-SD.
In Figs. 4 and 5, we show the ratio of the computational complexity of the proposed DPP-SD to that of the conventional SE-SD. To evaluate the computational complexity, the average numbers of complex multiplications and additions are counted. In Figs. 4 and 5, it is observed that the proposed DPP-SD requires significantly lower complexity than SE-SD when SNR ≤ 11 dB. In particular, the complexity-reduction ratio of the DPP-SD with all three sub-schemes is 36.7-43.2% and 50-59.2% at SNR ≤ 11 dB for the 16×16 and 24×24 MIMO systems, respectively.

V. CONCLUSION
In this paper, we have presented the DPP-SD scheme, a novel learning-aided SD algorithm for large MIMO systems. To solve the high-complexity problem of the conventional SD algorithm in large MIMO systems, we design an NN to predict the minimum metric among the paths in the sub-trees. To limit the complexity of the NN, we reduce the size of the input vector based on the properties of large MIMO channels. The DPP-SD algorithm exploits the output of the NN, i.e., the predicted minimum path metrics, to optimize the initial radius, sub-tree ordering, and early termination. The simulation results show that the proposed DPP-SD algorithm performs close to the conventional SE-SD scheme while requiring up to approximately 60% lower complexity.

Fig. 1 illustrates a tree structure for SD when quadrature phase shift keying (QPSK) modulation is employed, where we have $|\mathcal{S}| = 4$ sub-trees, each of which is rooted at one of the $|\mathcal{S}|$ nodes at layer $N_t$. Section III-A describes the NN that predicts the minimum path metrics of the sub-trees.

Fig. 4. Complexity comparison between the DPP-SD and SE-SD for a 16×16 MIMO system with QPSK.

Fig. 5. Complexity comparison between the DPP-SD and SE-SD for a 24×24 MIMO system with QPSK.

TABLE I. OPTIMIZED VALUES OF $\lambda_1$ AND $\lambda_2$.