Machine Learning Detectors for MU-MIMO Systems With One-Bit ADCs


I. INTRODUCTION
One of the promising technologies for beyond-5G cellular systems is massive multiple-input multiple-output (MIMO), in which a large number of antennas at the base station (BS) improve capacity and energy efficiency [2]. On the other hand, massive MIMO can significantly increase hardware cost and radio-frequency (RF) circuit power consumption [3]. In particular, the high-resolution analog-to-digital converter (ADC) is a major bottleneck, since the power consumption of an ADC increases exponentially with the number of quantization bits and linearly with the baseband bandwidth [4]. To overcome these challenges, the use of low-resolution ADCs (e.g., 1∼3 bits) in massive MIMO systems has been extensively studied. One-bit ADCs are particularly appealing because they do not require automatic gain controllers, which notably reduces hardware complexity [5]. In this case, simple zero-threshold comparators separately quantize the in-phase and quadrature components of the continuous-valued received signal. Although low-resolution ADCs provide these benefits, they cause considerable technical difficulties in channel estimation and MIMO detection.
The associate editor coordinating the review of this manuscript and approving it for publication was Guan Gui.
In uplink MU-MIMO systems with one-bit ADCs, the optimal maximum-likelihood (ML) detector was developed in [6], and low-complexity methods were proposed in [7], [8]. Inspired by coding theory, [9] proposed weighted minimum distance (wMD) decoding by viewing MIMO detection as a decoding problem over parallel binary discrete memoryless channels (B-DMCs). More recently, supervised-learning (SL) detectors were presented in [10], [11], which model the non-linear MIMO channel with a parameterized probabilistic model: a Gaussian-mixture (GM) model in [10] and a Bernoulli-mixture (BM) model in [11]. It was shown in [11] that the SL detector based on the BM model can outperform the other methods. However, estimating the model parameters requires a large pilot overhead. To apply the SL detector to practical systems, it is essential to reduce this pilot overhead, which is the major goal of this paper.
We consider an uplink MU-MIMO system with one-bit ADCs at the receive antennas, in which $K$ single-antenna users communicate with a BS equipped with $N_r$ receive antennas. As in practical communication models, we assume that the BS has no channel state information (CSI); the channel must therefore be learned from pilot signals during the training phase (see Fig. 1). We first assume a block-fading channel, which remains static during the coherence time $T_c$ and changes independently from block to block. The first $T_t < T_c$ time slots are assigned to the channel training phase, and the remaining $T_d = T_c - T_t$ time slots are devoted to the data transmission phase, as shown in Fig. 1. In this system, our main contribution is a semi-supervised learning (SSL) detector, motivated by semi-supervised learning [12], that alleviates the pilot overhead of the existing SL detector in [11]. The key idea of the proposed SSL detector is to estimate the parameters of the underlying BM model with an efficient expectation-maximization (EM) algorithm that uses both the pilot signals (i.e., labeled data) and a portion of the data signals (i.e., unlabeled data). Beyond the block-fading channel (i.e., a channel that is static during $T_c$ time slots), we propose an online-learning (OL) detector for time-varying channels; the main idea is to convert the conventional EM algorithm into an online EM algorithm and to assign a decaying weight to out-of-date information. Simulation results show that the proposed SSL detector achieves performance comparable to that of the corresponding SL detector with a considerably reduced pilot overhead (e.g., a 75% overhead reduction), and that the proposed OL detector is more robust to channel variations.
This paper is organized as follows. Section II describes an uplink MU-MIMO system with one-bit quantized signals at the receive antennas and briefly reviews the SL detector proposed in [11]. Section III proposes a novel SSL detector that achieves performance comparable to that of the SL detector with a considerably reduced pilot overhead. Section IV proposes the OL detector for time-varying channels by leveraging the idea of online learning. Section V presents simulation results that demonstrate the superiority of the proposed SSL detector. Finally, Section VI concludes the paper.
Notation: Column vectors and matrices are denoted by lowercase and uppercase boldface letters, respectively. Let $[x : y] = \{x, x+1, \ldots, y\}$ for any integers $x$ and $y > x$; when $x = 1$, this is abbreviated as $[y]$. For any $k \in [0 : m^K - 1]$, we let $g(k) = [y_0, y_1, \ldots, y_{K-1}]^T$ denote the $m$-ary expansion of $k$, where $k = y_0 m^0 + \cdots + y_{K-1} m^{K-1}$ with $y_i \in [0 : m-1]$, and $g^{-1}(\cdot)$ denotes its inverse function. In the vector case, $g(\cdot)$ is applied element-wise; likewise, any scalar function applied to a vector acts element-wise. $\mathrm{Re}(\mathbf{x})$ and $\mathrm{Im}(\mathbf{x})$ denote the real and imaginary parts of a complex vector $\mathbf{x}$, respectively.
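The $m$-ary expansion and its inverse can be illustrated with a short Python sketch. The function names `g` and `g_inv` follow the notation above; the explicit arguments `m` and `K` and the range check are assumptions made for this illustration.

```python
def g(k, m, K):
    """Return the m-ary expansion [y_0, ..., y_{K-1}] of k, least significant digit first."""
    if not 0 <= k < m ** K:
        raise ValueError("k must lie in [0 : m^K - 1]")
    digits = []
    for _ in range(K):
        digits.append(k % m)  # y_i is the coefficient of m^i
        k //= m
    return digits


def g_inv(y, m):
    """Inverse mapping: recover k = y_0 m^0 + ... + y_{K-1} m^{K-1}."""
    return sum(d * m ** i for i, d in enumerate(y))
```

For example, with $m = 2$ and $K = 3$, the index $k = 6$ maps to the digit vector $[0, 1, 1]$, since $6 = 0 \cdot 1 + 1 \cdot 2 + 1 \cdot 4$.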

II. PRELIMINARIES
This section introduces the considered system model and briefly reviews the supervised-learning (SL) detector proposed in [11].
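
A. SYSTEM MODEL
At time slot $t$, the received signal at the BS before quantization is

$$\tilde{\mathbf{y}}[t] = \tilde{\mathbf{H}}\tilde{\mathbf{x}}[t] + \tilde{\mathbf{z}}[t], \tag{1}$$

where $\tilde{\mathbf{x}}[t] \in \mathbb{C}^{K}$ is the vector of symbols transmitted by the $K$ users and $\tilde{\mathbf{H}} \in \mathbb{C}^{N_r \times K}$ is the channel matrix between the BS and the $K$ users. For instance, the $i$-th row of $\tilde{\mathbf{H}}$ is the channel vector between the $K$ users and the $i$-th receive antenna at the BS. Also, $\tilde{\mathbf{z}}[t] = [\tilde{z}_1[t], \ldots, \tilde{z}_{N_r}[t]]^T \in \mathbb{C}^{N_r}$ denotes the noise vector, whose elements are circularly symmetric complex Gaussian random variables with zero mean and variance $\sigma^2$, i.e., $\tilde{z}_i[t] \sim \mathcal{CN}(0, \sigma^2)$. The signal-to-noise ratio (SNR) is defined as [...].
In the MIMO system with one-bit ADCs, each receive antenna of the BS is equipped with an RF chain followed by two one-bit ADCs, one applied to the real part and one to the imaginary part. We define $\mathrm{sign}(\cdot) : \mathbb{R} \to \{-1, 1\}$ as the one-bit quantizer, with $\mathrm{sign}(u) = 1$ if $u \geq 0$ and $\mathrm{sign}(u) = -1$ otherwise. The quantized output is then

$$\tilde{\mathbf{r}}[t] = \mathrm{sign}(\mathrm{Re}(\tilde{\mathbf{y}}[t])) + j\,\mathrm{sign}(\mathrm{Im}(\tilde{\mathbf{y}}[t])). \tag{2}$$

We rewrite (2) in the equivalent real representation as

$$\mathbf{r}[t] = \mathrm{sign}(\mathbf{H}\mathbf{x}[t] + \mathbf{z}[t]), \tag{4}$$

where

$$\mathbf{r}[t] = \begin{bmatrix} \mathrm{Re}(\tilde{\mathbf{r}}[t]) \\ \mathrm{Im}(\tilde{\mathbf{r}}[t]) \end{bmatrix}, \quad \mathbf{x}[t] = \begin{bmatrix} \mathrm{Re}(\tilde{\mathbf{x}}[t]) \\ \mathrm{Im}(\tilde{\mathbf{x}}[t]) \end{bmatrix}, \quad \mathbf{z}[t] = \begin{bmatrix} \mathrm{Re}(\tilde{\mathbf{z}}[t]) \\ \mathrm{Im}(\tilde{\mathbf{z}}[t]) \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} \mathrm{Re}(\tilde{\mathbf{H}}) & -\mathrm{Im}(\tilde{\mathbf{H}}) \\ \mathrm{Im}(\tilde{\mathbf{H}}) & \mathrm{Re}(\tilde{\mathbf{H}}) \end{bmatrix} \in \mathbb{R}^{N \times 2K},$$

and $N = 2N_r$. This real system representation is used in the sequel.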
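As a minimal illustration of the real-valued representation, the following Python sketch stacks a complex channel matrix into the standard real form $[\mathrm{Re}, -\mathrm{Im}; \mathrm{Im}, \mathrm{Re}]$ and applies the zero-threshold quantizer. The helper names (`real_channel`, `real_vector`, `one_bit_receive`) are hypothetical, and the convention $\mathrm{sign}(0) = +1$ is an assumption.

```python
def sign(u):
    """One-bit quantizer; the value at u == 0 is an assumed convention."""
    return 1 if u >= 0 else -1


def real_channel(H_c):
    """Stack a complex N_r x K matrix into the real 2N_r x 2K form [[Re, -Im], [Im, Re]]."""
    top = [[h.real for h in row] + [-h.imag for h in row] for row in H_c]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in H_c]
    return top + bottom


def real_vector(v_c):
    """Stack a complex vector as [Re(v); Im(v)]."""
    return [v.real for v in v_c] + [v.imag for v in v_c]


def one_bit_receive(H_c, x_c, z_c):
    """Return r = sign(H x + z) in the real representation, with N = 2 N_r entries."""
    H, x, z = real_channel(H_c), real_vector(x_c), real_vector(z_c)
    return [sign(sum(h * xi for h, xi in zip(row, x)) + zn) for row, zn in zip(H, z)]
```

The first $N_r$ entries of the output correspond to the in-phase comparators and the last $N_r$ entries to the quadrature comparators.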

B. EQUIVALENT N PARALLEL B-DMCs
In [9], it was shown that the real system representation (4) can be transformed into an equivalent set of $N$ parallel binary discrete memoryless channels (B-DMCs) from a coding-theoretic viewpoint. The channel input/output and the channel transition probabilities of these $N$ parallel B-DMCs are characterized as follows.

1) AUTO-ENCODING FUNCTION
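Given the channel matrix $\mathbf{H}$, we can construct a spatial-domain code $\mathcal{C} = [\mathbf{c}_0, \ldots, \mathbf{c}_{m^K-1}]$, whose $j$-th codeword is given by

$$\mathbf{c}_j = [\mathrm{sign}(\mathbf{h}_1^T \mathbf{x}_j), \ldots, \mathrm{sign}(\mathbf{h}_N^T \mathbf{x}_j)]^T, \tag{5}$$

where $\mathbf{h}_i^T$ denotes the $i$-th row of $\mathbf{H}$, i.e., the channel vector between the $K$ users and the $i$-th receive antenna, $\mathbf{x}_j$ denotes the real-valued transmit vector corresponding to the users' messages $g(j)$, and $g(\cdot)$ is defined in the notation in Section I. Note that each codeword of $\mathcal{C}$ is the output of the noiseless channel in (4). The channel input $\mathbf{q}$ of the equivalent channel is determined by the auto-encoding function $f(\cdot)$ as

$$\mathbf{q} = f(\mathbf{w}) = \mathbf{c}_{g^{-1}(\mathbf{w})}. \tag{6}$$

2) EFFECTIVE CHANNEL
Reference [9] showed that the effective channel consists of $N$ parallel binary symmetric channels (BSCs) with channel input $\mathbf{q}$ and channel output $\mathbf{r}$. For the $n$-th BSC, given the users' message $\mathbf{w} = g(j)$ and the corresponding codeword $\mathbf{c}_j$, the transition probability is

$$P(r_n \mid q_n = c_{j,n}) = \begin{cases} \epsilon_{j,n}, & r_n \neq c_{j,n}, \\ 1 - \epsilon_{j,n}, & r_n = c_{j,n}, \end{cases} \tag{7}$$

where the error probability of the $n$-th BSC is computed as

$$\epsilon_{j,n} = Q\left(\frac{|\mathbf{h}_n^T \mathbf{x}_j|}{\sqrt{\sigma^2/2}}\right), \tag{8}$$

and $Q(u) = \frac{1}{\sqrt{2\pi}} \int_u^{\infty} e^{-v^2/2}\, dv$. Leveraging the equivalent channel, an optimal weighted Hamming distance decoding was proposed in [9] under the assumption of full knowledge of the channel matrix $\mathbf{H}$. A more practical SL detector, which requires no a priori knowledge of the channel matrix, was proposed in [11] (see Section II-C).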
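The construction of the spatial-domain code and of the per-antenna crossover probabilities can be sketched in Python as follows. This sketch assumes the real-valued model $\mathbf{r} = \mathrm{sign}(\mathbf{H}\mathbf{x} + \mathbf{z})$ with noise variance $\sigma^2/2$ per real dimension, so that the crossover probability is $Q(|\mathbf{h}_n^T \mathbf{x}| / \sqrt{\sigma^2/2})$. The function name `codebook_and_errors` and the argument `X` (a list of candidate real-valued transmit vectors, one per message index) are hypothetical.

```python
import math


def q_func(u):
    """Gaussian tail probability Q(u), expressed through the complementary error function."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def sign(u):
    return 1 if u >= 0 else -1


def codebook_and_errors(H, X, sigma2):
    """For each candidate x in X, return its codeword sign(Hx) and the
    crossover probabilities Q(|h_n^T x| / sqrt(sigma2 / 2)) of the N parallel BSCs."""
    codewords, eps = [], []
    for x in X:
        noiseless = [sum(h * xi for h, xi in zip(row, x)) for row in H]
        codewords.append([sign(v) for v in noiseless])
        eps.append([q_func(abs(v) / math.sqrt(sigma2 / 2.0)) for v in noiseless])
    return codewords, eps
```

Antennas whose noiseless output lies far from the zero threshold receive a small crossover probability, which is why they carry more weight in the weighted distance decoding of [9].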

C. OVERVIEW OF SL DETECTOR
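In this section, we review the SL detector proposed in [11], which is based on parameterized supervised learning. The generative model for $\mathbf{r}[t]$ follows the equivalent channel in Section II-B and is fully described by the conditional distribution

$$p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j) = \prod_{n=1}^{N} \epsilon_{j,n}^{\mathbb{1}\{r_n[t] \neq c_{j,n}\}} (1 - \epsilon_{j,n})^{\mathbb{1}\{r_n[t] = c_{j,n}\}}, \tag{9}$$

where $\mathbb{1}\{\cdot\}$ is the indicator function. This model is referred to as the Bernoulli-mixture model. Each class $j$ has its own probability distribution parameterized by $\boldsymbol{\theta}_j = [\mathbf{c}_j, \boldsymbol{\epsilon}_j]$. The SL detector in [11] operates in two phases during each coherence time $T_c$.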

1) PARAMETER ESTIMATION PHASE
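The parameter vector $\boldsymbol{\theta}$ is estimated using $T_t$ pilot signals. We first gather the labeled data

$$\mathcal{L} = \{(\mathbf{r}[t], j_t) : t \in [T_t]\}, \tag{10}$$

where $\mathbf{r}[t]$ is the received pilot signal corresponding to the label $j_t$. Since the pilot signal for each codeword (i.e., each message vector) is transmitted $T$ times, the total pilot overhead is $T_t = T \cdot m^K$. For $t \in [T_t]$, the labels are determined as

$$j_t = \lfloor (t-1)/T \rfloor, \tag{11}$$

where $\lfloor \cdot \rfloor$ denotes the floor function. According to [11], the parameter vector $\boldsymbol{\theta}$ is estimated from the labeled data $\mathcal{L}$ by maximum-likelihood (ML) estimation as

$$\hat{c}_{j,n} = \mathrm{sign}\Big(\sum_{t : j_t = j} r_n[t]\Big), \tag{12}$$

$$\hat{\epsilon}_{j,n} = \frac{1}{T} \sum_{t : j_t = j} \mathbb{1}\{r_n[t] \neq \hat{c}_{j,n}\}, \tag{13}$$

where $\mathbb{1}\{\cdot\}$ is the indicator function. When $T$ is small, the estimate $\hat{\epsilon}_{j,n}$ in (13) is likely to be zero, which does not match the true value $\epsilon_{j,n}$. Furthermore, this causes detection errors, since the probability in (9) is then forced to zero. Thus, we use the empirical estimation rule based on Laplace's rule of succession:

$$\hat{\epsilon}_{j,n} = \frac{\sum_{t : j_t = j} \mathbb{1}\{r_n[t] \neq \hat{c}_{j,n}\} + 1}{T + 2}. \tag{14}$$

2) DATA DETECTION PHASE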
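Under the Bernoulli-mixture model with the parameters estimated in (12) and (14), ML detection is applied to each data signal as

$$\hat{j}_t = \arg\max_{j \in [0 : m^K - 1]} p(\mathbf{r}[t] \mid j, \hat{\boldsymbol{\theta}}_j), \tag{15}$$

and the detected messages are recovered as $g(\hat{j}_t)$.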
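The two phases of the SL detector can be sketched in Python as follows. This is a minimal illustration, not the implementation of [11]: it assumes a per-antenna majority vote for the codeword bits, Laplace's rule of succession in the form $(s + 1)/(T + 2)$ for the error probabilities, and ML detection in the log domain. The function names `train_sl`, `log_likelihood`, and `detect` are hypothetical.

```python
import math


def train_sl(pilots, labels, num_classes):
    """Estimate (c_j, eps_j) for every class from labeled one-bit pilot observations."""
    N = len(pilots[0])
    sums = [[0] * N for _ in range(num_classes)]
    counts = [0] * num_classes
    for r, j in zip(pilots, labels):
        counts[j] += 1
        for n in range(N):
            sums[j][n] += r[n]
    # Majority vote per antenna; a zero sum is resolved to +1 (assumed convention).
    c = [[1 if s >= 0 else -1 for s in row] for row in sums]
    miss = [[0] * N for _ in range(num_classes)]
    for r, j in zip(pilots, labels):
        for n in range(N):
            if r[n] != c[j][n]:
                miss[j][n] += 1
    # Laplace's rule of succession keeps every estimate strictly inside (0, 1).
    eps = [[(miss[j][n] + 1) / (counts[j] + 2) for n in range(N)]
           for j in range(num_classes)]
    return c, eps


def log_likelihood(r, c_j, eps_j):
    """Log of the Bernoulli-mixture component probability of r under class j."""
    return sum(math.log(e if rn != cn else 1.0 - e)
               for rn, cn, e in zip(r, c_j, eps_j))


def detect(r, c, eps):
    """ML detection: return the class index with the largest log-likelihood."""
    return max(range(len(c)), key=lambda j: log_likelihood(r, c[j], eps[j]))
```

Working with log-likelihoods avoids numerical underflow of the product over $N$ antennas when $N$ is large.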

III. THE PROPOSED SSL DETECTOR
Despite its superior performance, the SL detector introduced in [11] is not practical, because it requires a large number of pilot signals for the empirical transition probability in (13) to approach the true probability in (8). This problem becomes more severe as $K$ increases, since the number of parameters grows exponentially with $K$ (see (12) and (13)). We address this problem by proposing a semi-supervised learning (SSL) detector, in which the parameter vector $\boldsymbol{\theta}$ is updated using both the data signals (i.e., unlabeled data $\mathcal{U}$) and the pilot signals (i.e., labeled data $\mathcal{L}$).
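Here, the unlabeled data $\mathcal{U}$ is collected during $T_u$ time slots (see Fig. 1) as

$$\mathcal{U} = \{\mathbf{r}[t] : t \in [T_t + 1 : T_t + T_u]\}. \tag{16}$$

Letting $\mathcal{D} = \mathcal{L} \cup \mathcal{U}$, the proposed SSL detector operates as follows.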

A. PARAMETER ESTIMATION PHASE
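In this phase, the parameter vector $\boldsymbol{\theta} = [\boldsymbol{\theta}_0, \ldots, \boldsymbol{\theta}_{m^K-1}]$ is updated using the given data $\mathcal{D}$ such that the likelihood of the observations (i.e., the received binary signals) is maximized. This ML estimation is formulated as

$$\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \log P(\mathcal{D} \mid \boldsymbol{\theta}). \tag{17}$$

Note that under the Bernoulli-mixture model, the probability distribution $p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j)$ in (9) is known for a given parameter $\boldsymbol{\theta}_j$, which is used below. Also, the labels of the labeled data are given as $\{j_t = \lfloor (t-1)/T \rfloor : t \in [T_t]\}$ in (11). For any fixed parameter $\boldsymbol{\theta}$, the objective function in (17) is written as

$$\log P(\mathcal{D} \mid \boldsymbol{\theta}) = \sum_{t=1}^{T_t} \log\big(P(j_t \mid \boldsymbol{\theta}_{j_t})\, p(\mathbf{r}[t] \mid j_t, \boldsymbol{\theta}_{j_t})\big) + \sum_{t=T_t+1}^{T_t+T_u} \log \sum_{j=0}^{m^K-1} P(j \mid \boldsymbol{\theta}_j)\, p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j), \tag{18}$$

where $p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j)$ is defined in (9) and $P(j \mid \boldsymbol{\theta}_j) = 1/m^K$, since the users' messages are assumed to be generated uniformly at random. The objective function is non-concave, in particular because of the second term with the unlabeled data, and thus (17) is difficult to solve directly. We therefore apply the expectation-maximization (EM) algorithm [13]. The EM algorithm consists of an expectation step (E-step) and a maximization step (M-step): given the current estimate $\boldsymbol{\theta}^i$, it produces the updated estimate $\boldsymbol{\theta}^{i+1}$ through the following steps.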
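E-step: This step computes the posterior probability of each class using the parameter vector $\boldsymbol{\theta}^i$:

$$\gamma_j[t] = P(j \mid \mathbf{r}[t], \boldsymbol{\theta}^i). \tag{19}$$

This is specified separately for the labeled and unlabeled data as follows:
• (Labeled Data) For $t \in [T_t]$ and $j \in [0 : m^K - 1]$,

$$\gamma_j[t] = \begin{cases} 1, & j = j_t, \\ 0, & \text{otherwise}. \end{cases} \tag{20}$$

• (Unlabeled Data) For $t \in [T_t + 1 : T_t + T_u]$ and $j \in [0 : m^K - 1]$,

$$\gamma_j[t] = \frac{p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j^i)}{\sum_{j'=0}^{m^K-1} p(\mathbf{r}[t] \mid j', \boldsymbol{\theta}_{j'}^i)}. \tag{21}$$

M-step: This step computes the updated parameter vector $\boldsymbol{\theta}^{i+1}$ using $\gamma_j[t]$ in (21):

$$\boldsymbol{\theta}^{i+1} = \arg\max_{\boldsymbol{\theta}} \psi(\boldsymbol{\theta}), \tag{22}$$

where the objective function $\psi$ is

$$\psi(\boldsymbol{\theta}) = \sum_{t=1}^{T_t+T_u} \sum_{j=0}^{m^K-1} \gamma_j[t] \log P(\mathbf{r}[t], j \mid \boldsymbol{\theta}_j) = \sum_{t=1}^{T_t+T_u} \sum_{j=0}^{m^K-1} \gamma_j[t] \log\big(P(j \mid \boldsymbol{\theta}_j)\, p(\mathbf{r}[t] \mid j, \boldsymbol{\theta}_j)\big), \tag{23}$$

where the second equality follows from Bayes' rule and (9). Note that $\gamma_j[t]$ is constant with respect to $\boldsymbol{\theta}_j$. From the Bernoulli-mixture model in (9), the function $\psi$ in (23) can be written as

$$\psi(\boldsymbol{\theta}) = \sum_{t} \sum_{j} \gamma_j[t] \log \frac{1}{m^K} + \sum_{t} \sum_{j} \sum_{n=1}^{N} \gamma_j[t] \Big( \mathbb{1}\{r_n[t] \neq c_{j,n}\} \log \epsilon_{j,n} + \mathbb{1}\{r_n[t] = c_{j,n}\} \log(1 - \epsilon_{j,n}) \Big), \tag{24}$$

where $\mathbb{1}\{\cdot\}$ is the indicator function. Since the first term is constant with respect to $\boldsymbol{\theta}$, $\psi$ is maximized by maximizing only the second term. Moreover, each summand of the second term depends on a single pair $(c_{j,n}, \epsilon_{j,n})$, so maximizing (24) amounts to maximizing each term separately: for fixed $j$ and $n$, we have

$$(\hat{c}_{j,n}, \hat{\epsilon}_{j,n}) = \arg\max_{c_{j,n},\, \epsilon_{j,n}} \sum_{t} \gamma_j[t] \Big( \mathbb{1}\{r_n[t] \neq c_{j,n}\} \log \epsilon_{j,n} + \mathbb{1}\{r_n[t] = c_{j,n}\} \log(1 - \epsilon_{j,n}) \Big). \tag{25}$$

To estimate the parameters $\hat{\epsilon}_{j,n}$ and $\hat{c}_{j,n}$ in (25), we use the following lemma.
Lemma 1: Suppose $a_\ell \geq 0$ for $1 \leq \ell \leq n$. Then $\sum_{\ell=1}^{n} a_\ell \log p_\ell$ is maximized over all probability vectors $\mathbf{p} = (p_1, \ldots, p_n)$ by $p_\ell = a_\ell / \sum_{i=1}^{n} a_i$.
Proof: Note that $\sum_{\ell=1}^{n} a_\ell \log p_\ell$ is a concave function of $\mathbf{p}$ over a region with linear constraints. We introduce a Lagrange multiplier $\lambda$ for the constraint $\sum_{\ell=1}^{n} p_\ell = 1$ and find the stationary point of the Lagrangian

$$\Lambda(\mathbf{p}, \lambda) = \sum_{\ell=1}^{n} a_\ell \log p_\ell - \lambda \Big( \sum_{\ell=1}^{n} p_\ell - 1 \Big). \tag{26}$$

At the stationary point, the partial derivatives with respect to all $p_\ell$ are zero:

$$\frac{\partial \Lambda}{\partial p_\ell} = \frac{a_\ell}{p_\ell} - \lambda = 0, \tag{27}$$

which yields $p_\ell = a_\ell / \lambda$ for all $\ell$. To satisfy the linear constraint $\sum_{\ell=1}^{n} p_\ell = 1$, $\lambda$ must equal $\sum_{\ell=1}^{n} a_\ell$. This completes the proof.
First, we observe that for any $\epsilon_{j,n} < 0.5$, the optimal $c_{j,n}$ should satisfy the constraint

$$\sum_{t} \gamma_j[t]\, \mathbb{1}\{r_n[t] = c_{j,n}\} \geq \sum_{t} \gamma_j[t]\, \mathbb{1}\{r_n[t] \neq c_{j,n}\}. \tag{28}$$

This constraint is satisfied by assigning

$$\hat{c}_{j,n}^{\,i+1} = \mathrm{sign}\Big( \sum_{t} \gamma_j[t]\, r_n[t] \Big). \tag{29}$$

Next, applying Lemma 1 to (25), the error probability $\epsilon_{j,n}^{i+1}$ is optimized as

$$\hat{\epsilon}_{j,n}^{\,i+1} = \frac{\sum_{t} \gamma_j[t]\, \mathbb{1}\{r_n[t] \neq \hat{c}_{j,n}^{\,i+1}\}}{\sum_{t} \gamma_j[t]}. \tag{30}$$

Finally, we compute the log-likelihood $\log P(\mathcal{D} \mid \boldsymbol{\theta}^{i+1})$ in (31) by evaluating (18) at the updated parameter vector $\boldsymbol{\theta}^{i+1}$, from which we check convergence. The overall procedure is described in Fig. 2 and Algorithm 1, where $\varepsilon \geq 0$ denotes the predetermined threshold for the stopping criterion.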
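One EM iteration of the kind described above can be sketched in Python as follows. The sketch assumes a uniform prior over classes, hard responsibilities for pilots, and a weighted majority vote followed by a weighted mismatch ratio in the M-step. The lower clip `floor` on the error probabilities is an added safeguard against zero likelihoods, not part of the derivation, and the names `e_step` and `m_step` are hypothetical.

```python
def bernoulli_likelihood(r, c_j, eps_j):
    """Probability of the one-bit vector r under class j of the Bernoulli-mixture model."""
    p = 1.0
    for rn, cn, e in zip(r, c_j, eps_j):
        p *= e if rn != cn else 1.0 - e
    return p


def e_step(signals, labels, c, eps):
    """Responsibilities gamma[t][j]; labels[t] is a class index for pilots, None for data."""
    M = len(c)
    gamma = []
    for r, j_t in zip(signals, labels):
        if j_t is not None:
            # Labeled data: the class is known.
            gamma.append([1.0 if j == j_t else 0.0 for j in range(M)])
        else:
            # Unlabeled data: posterior under a uniform prior.
            lik = [bernoulli_likelihood(r, c[j], eps[j]) for j in range(M)]
            total = sum(lik)
            gamma.append([l / total for l in lik] if total > 0 else [1.0 / M] * M)
    return gamma


def m_step(signals, gamma, floor=1e-3):
    """Weighted updates of the codeword bits and error probabilities."""
    M, N = len(gamma[0]), len(signals[0])
    c, eps = [], []
    for j in range(M):
        weight = sum(g[j] for g in gamma)
        c_j, eps_j = [], []
        for n in range(N):
            corr = sum(g[j] * r[n] for g, r in zip(gamma, signals))
            cn = 1 if corr >= 0 else -1
            miss = sum(g[j] for g, r in zip(gamma, signals) if r[n] != cn)
            e = miss / weight if weight > 0 else 0.5
            c_j.append(cn)
            eps_j.append(max(e, floor))  # assumed safeguard, keeps eps > 0
        c.append(c_j)
        eps.append(eps_j)
    return c, eps
```

Alternating `e_step` and `m_step` until the log-likelihood stops improving gives the batch procedure; because the bit estimate follows the weighted majority, the mismatch ratio never exceeds 0.5.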
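Algorithm 1 Parameter update of the proposed SSL detector
Input: Estimate $\boldsymbol{\theta}^0$ from $\mathcal{L}$ using (12) and (13); calculate the log-likelihood $\log P(\mathcal{D} \mid \boldsymbol{\theta}^0)$ from (31); set $i = 0$
repeat
    E-step: Compute $\gamma_j[t]$ by (19)
    M-step: Update $\boldsymbol{\theta}_j^{i+1}$ by (29) and (30)
    Calculate $\log P(\mathcal{D} \mid \boldsymbol{\theta}^{i+1})$ from (31); set $i \leftarrow i + 1$
until $\log P(\mathcal{D} \mid \boldsymbol{\theta}^{i}) - \log P(\mathcal{D} \mid \boldsymbol{\theta}^{i-1}) < \varepsilon$

For $t \in [T_t + T_u + 1 : T_c]$, the detection of the SSL detector follows the same process as (15) of the SL detector in Section II-C. We remark that the performance-complexity tradeoff of the proposed SSL detector is controlled by the choice of $T_u$.

Algorithm 2 Parameter update of the conventional OL detector
Input: Estimate $\boldsymbol{\theta}^0$ from $\mathcal{L}$ using (12) and (13)
for each time slot $t$ in the data transmission phase do
    E-step: Compute $\gamma_j[t]$ by (19)
    M-step: Update $\boldsymbol{\theta}_j^{i+1}$ by (33) and (30)
end for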
Remark 1: We describe the complexity of the parameter update in the SSL detector. As seen in Algorithm 1 and Algorithm 2, the update consists of two parts, the expectation step and the maximization step. The expectation step requires $O((T_t + T_u) m^K \cdot i)$ operations, where $i$ denotes the number of EM iterations, and the maximization step requires $O(N (T_t + T_u) m^K \cdot i)$ operations, without taking into account the complexity of evaluating (9). In practice, $i$ is fixed in advance, as in the simulations in Section V. The number of iterations also affects performance: too many iterations can degrade performance, because the parameters are likely to overfit the data signals collected during the parameter update phase $T_u$. Finally, the complexity, which is exponential in $K$, could be reduced by applying the one-bit sphere decoding methods [14], [15] to the detection and training phases.

IV. THE PROPOSED OL DETECTOR FOR TIME-VARYING CHANNELS
In this section, we develop an online-learning (OL) detector by extending the proposed SSL detector to time-varying channels. Unlike in the static channel, the channel state can change slowly during the data transmission phase, with some correlation to the previous CSI (see Fig. 3). The proposed OL detector is constructed by transforming the update rules in (29) and (30) into online versions, as in Algorithm 3. The main traits of the OL detector are as follows: i) the parameters are updated at every time slot during the data transmission phase; ii) the OL detector updates the parameters using all the data signals in the data transmission phase, whereas the SSL detector uses only the data signals in the parameter update phase, which is a part of the data transmission phase.
First, we rewrite (29) in an incremental form so that the parameter update is available at every time slot:

$$\bar{r}_{j,n}[t] = \bar{r}_{j,n}[t-1] + \gamma_j[t]\, r_n[t], \qquad \hat{c}_{j,n}[t] = \mathrm{sign}(\bar{r}_{j,n}[t]), \tag{33}$$

with the initial value of $\bar{r}_{j,n}$ computed from the labeled data $\mathcal{L}$. Accordingly, (30) can be changed. However, these updates should be further improved in two respects. First, (30) requires a large amount of memory and computation as time goes on, since all $\gamma_j[t]$ and $\mathbf{r}[t]$ for $t \in [1 : T_t + T_u]$ must be stored. Second, a decaying weight should be applied to the parameters that correspond to old information: because the channel state changes from one time slot to the next with some correlation, older observations become progressively less informative about the current channel. Since the incremental form of (29) is provided in (33), we next focus on the incremental form of (30). To this end, we define two parameters, denoted by $N^s[t]$ and $N^d[t]$, from which the incremental form (35) of (30) is obtained. Finally, taking the old and new information into account, we decay the old information by introducing a decaying weight factor $\delta$ into (33) and (35). Notably, the resulting update rules can be viewed as the online EM algorithm proposed in [16], [17].
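A per-slot update with a decaying weight can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the update rule of the paper: it keeps two decayed responsibility totals per class and antenna (observations equal to $+1$ and to $-1$), derives the codeword bit from their difference and the error probability from the minority share, and multiplies the stored totals by an assumed factor `delta` in $(0, 1]$ before adding each new observation. The names `init_state` and `online_update` are hypothetical.

```python
def init_state(num_classes, N):
    """Decayed responsibility totals for observations equal to +1 and -1."""
    return {
        "plus": [[0.0] * N for _ in range(num_classes)],
        "minus": [[0.0] * N for _ in range(num_classes)],
    }


def online_update(state, r, gamma, delta):
    """Fold one observation r (entries +1/-1) with responsibilities gamma into the state.

    Old totals are multiplied by delta before the new observation is added,
    so delta = 1 reproduces plain running sums and delta < 1 forgets old data.
    """
    c, eps = [], []
    for j, g in enumerate(gamma):
        c_j, eps_j = [], []
        for n, rn in enumerate(r):
            state["plus"][j][n] *= delta
            state["minus"][j][n] *= delta
            state["plus" if rn == 1 else "minus"][j][n] += g
            plus, minus = state["plus"][j][n], state["minus"][j][n]
            cn = 1 if plus >= minus else -1          # sign of the decayed weighted sum
            total = plus + minus
            miss = minus if cn == 1 else plus        # weight disagreeing with the bit
            c_j.append(cn)
            eps_j.append(miss / total if total > 0 else 0.5)
        c.append(c_j)
        eps.append(eps_j)
    return c, eps
```

Storing the two totals instead of the full history keeps memory constant in time, and tracking both signs lets the estimate remain consistent when a codeword bit flips as the channel drifts.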

A. DATA DETECTION
Unlike the data detection of the SSL detector in (32), the OL detector performs detection as in (15) with the most recently updated parameters, because it updates the parameters at every time slot during the data transmission phase.

B. GENERALIZATION
In fact, the OL detector can be employed under both time-invariant and time-varying channel conditions. For a time-invariant channel, the OL detector can update the parameters using a single data signal at a time, which is similar to stochastic gradient descent in optimization. The OL detector belongs to the family of SSL detectors, in that it is constructed by converting the conventional batch EM algorithm into an online EM algorithm [16], [17]. The objective of both detectors is to improve performance continuously by exploiting the data signals under the probabilistic generative model.

V. SIMULATION RESULTS
We evaluate the average bit-error rate (BER) performance of the conventional SL detector, the proposed SSL detector, and the proposed OL detector. In the simulations, we consider a Rayleigh fading channel in which each element of the channel matrix $\tilde{\mathbf{H}}$ is an independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian random variable with zero mean and unit variance. QPSK modulation is applied, and each user is assumed to send binary data (i.e., $m = 2$). In the first and second simulations, we consider a block-fading channel with $T_d = 512$, $T_u = 10 \cdot T_t$, and $T_t = T \cdot m^K$. In the last simulation, which uses a time-varying channel, the initial parameters are estimated during a static training phase of length $T_t$, after which the dynamic data transmission phase is set to $T_d = 2048$.
Fig. 4 shows the BER performance of the SSL detector, the SL detector, and maximum-likelihood detection (MLD) with channel state information at the receiver (CSIR) for various training durations. The proposed SSL detector outperforms the conventional SL detector over the entire SNR range under the same pilot overhead. In particular, with $T = 1$, the SSL detector nearly achieves the performance of the SL detector trained with $T = 4$. This indicates that the SSL detector can considerably shorten the training duration ($T_t$) while maintaining performance, by exploiting the information in the data signals through the generative model. In addition, the comparison with MLD under CSIR shows that the proposed method enables the empirical conditional probability to converge to the true conditional probability without increasing the pilot overhead.
Fig. 5 shows the BER performance of the SL detector and the proposed SSL detector under the Bernoulli-mixture model in a different setting from Fig. 4: $T = 2$ is used for the SSL detector, and $T = 4$, $8$, and $16$ are used for the SL detector.
We observe that the SSL detector outperforms the SL detectors, so that the SSL detector can reduce the pilot overhead of the SL detector to $1/8$ or less.
Fig. 6 shows the BER performance of the OL detectors and the SL detector in a dynamic channel environment. To construct the time-varying channel, we apply a first-order auto-regressive process,

$$\tilde{\mathbf{H}}[t] = \eta\, \tilde{\mathbf{H}}[t-1] + \mathbf{W}[t],$$

where $\eta$ is the temporal correlation coefficient of the channel fading and $\mathbf{W}[t]$ is a process noise matrix whose $(i, j)$-th element has a complex Gaussian distribution, i.e., $W_{i,j} \sim \mathcal{CN}(0, 1 - \eta^2)$. According to Jakes' model, the temporal correlation coefficient is characterized as $\eta = J_0(2\pi f_d T_s)$, where $J_0(\cdot)$ denotes the zeroth-order Bessel function of the first kind, $f_d$ is the maximum Doppler frequency, and $T_s$ is the sampling time. In our simulation, we choose $f_d$ and $T_s$ such that $f_d T_s = 0.005$. We assume that the channel varies only during the data transmission phase, whose length is set to $T_d = 2048$. Under this condition, Fig. 6 shows that the proposed dOL detector (i.e., the OL detector with the decaying weight) outperforms the other machine-learning detectors, namely the SL detector and the conventional OL detector. This suggests that the proposed decaying rule makes detection more accurate by reducing the weight on out-of-date information.
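A time-varying channel of this kind can be generated with the following Python sketch. It assumes the first-order recursion $\tilde{\mathbf{H}}[t] = \eta \tilde{\mathbf{H}}[t-1] + \mathbf{W}[t]$ with process-noise variance $1 - \eta^2$, which keeps every entry at unit variance, and it evaluates $J_0$ by its power series, which is adequate for the small arguments used here. The function names (`bessel_j0`, `complex_gaussian`, `ar1_channel`) and the `seed` argument are hypothetical.

```python
import math
import random


def bessel_j0(x, terms=20):
    """Zeroth-order Bessel function of the first kind via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def complex_gaussian(variance, rng):
    """Draw one circularly symmetric complex Gaussian sample with the given variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def ar1_channel(n_rx, n_users, num_slots, fd_ts, seed=0):
    """Return eta and a list of channel matrices following the AR(1) recursion."""
    rng = random.Random(seed)
    eta = bessel_j0(2.0 * math.pi * fd_ts)
    H = [[complex_gaussian(1.0, rng) for _ in range(n_users)] for _ in range(n_rx)]
    channels = [H]
    for _ in range(num_slots - 1):
        # Each entry keeps unit variance: eta^2 * 1 + (1 - eta^2) = 1.
        H = [[eta * h + complex_gaussian(1.0 - eta ** 2, rng) for h in row] for row in H]
        channels.append(H)
    return eta, channels
```

For $f_d T_s = 0.005$, the correlation coefficient is approximately $0.99975$, so the channel decorrelates only gradually over the $T_d = 2048$ data slots.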

VI. CONCLUSION
In this paper, we proposed two novel machine-learning-based detectors for uplink MU-MIMO systems with one-bit ADCs. The first, the semi-supervised learning (SSL) detector, addresses the major drawback of the existing SL detector by using a portion of the data signals to estimate the model parameters via the expectation-maximization (EM) algorithm. The second, the online-learning (OL) detector, further improves the robustness of the proposed SSL detector to channel variations. Via simulations, we demonstrated that the proposed SSL detector can significantly outperform the existing SL detector while using a lower pilot overhead. It was also verified that the proposed OL detector yields attractive performance for time-varying channels. As machine learning has succeeded in other communication areas [18]-[22], it is an attractive approach for the construction of future MIMO detectors.