Semi-Blind Channel-and-Signal Estimation for Uplink Massive MIMO With Channel Sparsity

This paper considers the transceiver design for uplink massive multiple input multiple output (MIMO) systems with channel sparsity in the angular domain. Recent progress has shown that sparsity learning-based blind signal detection is able to retrieve the channel and data by using message-passing based noisy matrix factorization. We propose a semi-blind signal detection scheme in which a short pilot sequence is inserted into each user packet and the knowledge of pilots is integrated into the message passing algorithm for noisy matrix factorization. We derive a semi-blind channel and signal estimation (SCSE) algorithm based on the message-passing principles. The SCSE algorithm involves enumeration over all possible user permutations, and so is time-consuming when the number of users is relatively large. To reduce complexity, we further develop a simplified SCSE (S-SCSE) to accommodate systems with a large number of users. We show that our semi-blind signal detection scheme substantially outperforms the state-of-the-art blind detection and training-based schemes in the short-pilot regime.


I. INTRODUCTION
Massive multiple input multiple output (MIMO) systems [1]-[4] achieve significant performance improvement over traditional communication systems in many aspects, such as increasing channel capacity, suppressing channel fading, and enhancing energy efficiency. In a massive MIMO scenario, a base station (BS) typically equipped with an array of a few hundred antennas simultaneously serves many tens of terminals in a single time-frequency resource slot [5]. As the scale of the terminals or the array increases, the acquisition of channel state information (CSI) becomes one of the key obstacles for the utilization of massive MIMO [6].

W. Yan and X. Yuan are with the Center for Intelligent Networking and Communications, the National Laboratory of Science and Technology on Communications, the University of Electronic Science and Technology of China, Chengdu 611731, China (email: xjyuan@uestc.edu.cn). The work has been partially submitted for potential presentation in IEEE International Conference on Communications (ICC) 2019 [23].
Many studies have been devoted to the design of efficient and reliable techniques for channel acquisition. We focus on the uplink case, where users transmit signals to a BS [7]. In a training-based approach, each transmission frame is divided into two phases, namely, a training phase and a data transmission phase [7], [8]. In the training phase, pilots are transmitted to facilitate the estimation of the channel coefficients at the receiver side; in the data transmission phase, data are transmitted and the receiver performs detection based on the estimated channel.
Compared to separate signal processing for the two phases, joint channel-and-data estimation is able to improve the system performance since partially detected data can be used as soft pilots to enhance the channel estimation accuracy in an iterative fashion [9]. However, no matter whether separate or joint signal processing is employed, it is required in the training-based approach that the number of pilot symbols is no less than that of users, so as to ensure a vanishing channel estimation error [8], [10]. As such, channel acquisition generally consumes a substantial portion of the system resource.
To reduce the channel acquisition overhead, another line of research follows the blind detection approach, in which the channel and data are estimated with little prior information of the signals from the transmitter side [11]-[13]. In particular, recent evidence shows that a massive MIMO system exhibits channel sparsity in the angular domain, since signals usually impinge upon a massive antenna array from a limited range of angles [14]-[17]. Based on the channel sparsity, a blind iterative detection technique [18] has been developed to avoid the use of pilots in channel acquisition. Approximate message passing algorithms [19], [20] are used to simultaneously estimate the channel and data by factorizing the received noisy observation matrix.
Sparsity-learning based blind detection in [18], however, can be improved in a number of aspects. For example, the blind detection scheme in [18] imposes a relatively stringent requirement on the channel sparsity level and the signal-to-noise ratio (SNR) of the system to achieve a satisfactory error performance.
More importantly, blind detection suffers from the so-called phase and permutation ambiguities inherent in matrix factorization. In [18], a reference symbol and a user label are inserted in each user packet to eliminate the phase and permutation ambiguities after matrix factorization. Yet, as the reference symbols and the user labels (similar to pilots) are a priori known by the receiver, such knowledge can be integrated into the iterative process of matrix factorization to enhance the detection performance, rather than being used only for compensation afterwards.
To address the above issues, we propose a semi-blind detection scheme to jointly estimate the channel and the user signals in a sparse massive MIMO system. We focus on the scenario that a short pilot sequence is inserted into each user packet to assist the matrix factorization for joint channel and signal estimation. Here "short pilot" means that the pilot sequence is not long enough to generate a relatively accurate initial channel estimate, and so the existing training-based approaches [7] [9] are unable to achieve a good performance. We show that, to efficiently exploit the short pilots, the phase and permutation ambiguities need to be skilfully estimated in the iterative process of the matrix factorization. As such, a message-passing based semi-blind channel and signal estimation (SCSE) algorithm is developed, building upon the framework of approximate message passing algorithms. The main contributions of this paper are summarized as follows.
• SCSE algorithm for massive MIMO: We propose a novel semi-blind detection scheme for massive MIMO systems to jointly estimate the channel and the signals. The proposed semi-blind detection scheme is able to efficiently exploit the information of short pilots in the iterative process of sparse matrix factorization.
• Simplified SCSE algorithm for complexity reduction: The SCSE algorithm involves an exhaustive enumeration of all possible user permutations, which is computationally infeasible when the number of users is relatively large. As such, we develop a simplified SCSE (S-SCSE) algorithm to avoid the burden of permutation enumeration in the algorithm.
• Simulations are conducted to verify the performance of the proposed SCSE and S-SCSE algorithms.
We show that our proposed semi-blind detection scheme is able to substantially outperform the state-of-the-art training-based and blind detection approaches [9], [18] in the short-pilot regime.
Furthermore, we show that, compared to the SCSE algorithm, the S-SCSE algorithm significantly reduces the computational complexity while maintaining a similar performance, thereby striking an attractive balance between complexity and performance.
The remainder of this paper is organized as follows: Section II describes the sparse channel model and the system model for uplink MIMO systems. In Section III, we formulate the joint channel and signal inference problem by including the estimation of the phase and permutation ambiguities inherent in sparse matrix factorization. Based on that, the SCSE and S-SCSE algorithms are derived based on the message-passing principles, and the selection metric for random initializations is described. Numerical results are presented in Section IV to verify the effectiveness of our proposed algorithms. Finally, we conclude the paper in Section V.
Notations: Capital bold letters, lowercase bold letters, and regular letters represent matrices, vectors, and scalars, respectively. For any matrix $\mathbf{A}$, $\mathbf{a}_i$ refers to the $i$th column of $\mathbf{A}$, and $a_{i,j}$ refers to the $(i, j)$th entry of $\mathbf{A}$. $\mathbb{C}$ denotes the complex field; $\mathcal{S}$ denotes a set; $\mathbf{P}$ denotes an arbitrary permutation matrix. For any set $\mathcal{S}$, $|\mathcal{S}|$ represents the cardinality of $\mathcal{S}$; $\mathbf{e}_i = [0, \ldots, 0, 1, 0, \ldots, 0]^T$ with the only non-zero element being at the $i$th position; for any scalar $x$, $|x|$ represents the absolute value of $x$; $\|\cdot\|_2$ represents the $\ell_2$-norm, and $\|\cdot\|_F$ represents the Frobenius norm. The superscripts $(\cdot)^T$, $(\cdot)^H$, and $(\cdot)^{-1}$ represent the transpose, the conjugate transpose, and the inverse of a matrix, respectively; $\mathbb{E}(\cdot)$, $\delta(\cdot)$, and $e^{(\cdot)}$ represent the expectation, the Dirac delta function, and the exponential function, respectively; $\mathrm{diag}\{\mathbf{a}\}$ represents the diagonal matrix with the diagonal specified by $\mathbf{a}$; $\lceil a \rceil$ represents the smallest integer no less than $a$.
For any integer $N$, $\mathcal{I}_N$ denotes the set of integers from 1 to $N$. $\mathcal{CN}(\cdot; \mu, \nu)$ represents a circularly symmetric complex Gaussian distribution with mean $\mu$ and variance $\nu$.

A. Sparse Channel Model
Consider an uplink massive MIMO system with $K$ single-antenna transmitters and a single receiver equipped with $N$ antennas deployed as a uniform linear array (ULA), where $N \gg K \gg 1$. Denote by $\theta_{\ell,k}$ the angle of arrival (AoA) of the $\ell$th path from transmitter $k$. The array steering vector for receiving a signal from angle $\theta_{\ell,k}$ can be written as
$$\mathbf{a}_r(\theta_{\ell,k}) = \left[1, e^{-j2\pi\frac{d}{\lambda}\cos\theta_{\ell,k}}, \ldots, e^{-j2\pi(N-1)\frac{d}{\lambda}\cos\theta_{\ell,k}}\right]^T \tag{1}$$
where $d$ is the interval between any two adjacent receive antennas, and $\lambda$ is the wavelength of propagation.
We use the virtual representation method in [14] to divide the signal AoAs into $N$ resolution bins, with the $\ell$th bin represented by $\theta_\ell = \arccos\left(\frac{\ell\lambda}{dN}\right)$, $\ell \in \{0, 1, \ldots, N-1\}$. Then, the physical channel of a massive MIMO system can be modeled as
$$\bar{\mathbf{H}} = \mathbf{A}_r \mathbf{H} \tag{2}$$
where $\mathbf{A}_r = \frac{1}{\sqrt{N}}\left[\mathbf{1}, \mathbf{a}_r\left(\arccos\frac{\lambda}{dN}\right), \ldots, \mathbf{a}_r\left(\arccos\frac{(N-1)\lambda}{dN}\right)\right] \in \mathbb{C}^{N\times N}$ is the discrete Fourier transform (DFT) matrix, $\mathbf{1}$ is an $N$-dimensional all-one vector, and $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_K]$ is the projection of the physical channel in the angular domain. $\mathbf{h}_k = [h_{1,k}, h_{2,k}, \ldots, h_{N,k}]^T$ is the complex channel gain of user $k$, where $h_{n,k}$ is the aggregate gain of user $k$ in the $n$th resolution bin.
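As a quick numerical check of the virtual representation, the sketch below builds the angular basis from sampled steering vectors and verifies that it is unitary, as a scaled DFT matrix should be. It is only an illustration under stated assumptions: the negative phase sign in the steering vector, half-wavelength antenna spacing, and the helper names `steering_vector`, `angular_basis`, and `max_unitarity_error` are choices made here rather than details taken from the paper. The sampled value ℓλ/(dN) is passed directly as the cosine of the AoA, so no `arccos` call is needed.

```python
import cmath
import math


def steering_vector(cos_theta, n_antennas, d_over_lambda=0.5):
    """ULA response for a path whose AoA has cosine `cos_theta` (assumed sign convention)."""
    return [cmath.exp(-2j * math.pi * n * d_over_lambda * cos_theta)
            for n in range(n_antennas)]


def angular_basis(n_antennas, d_over_lambda=0.5):
    """N-by-N matrix whose l-th column is the steering vector sampled at
    cos(theta_l) = l * lambda / (d * N), scaled by 1 / sqrt(N)."""
    scale = 1.0 / math.sqrt(n_antennas)
    columns = [steering_vector(l / (d_over_lambda * n_antennas), n_antennas, d_over_lambda)
               for l in range(n_antennas)]
    return [[scale * columns[l][n] for l in range(n_antennas)]
            for n in range(n_antennas)]


def max_unitarity_error(a):
    """Largest entry-wise deviation of A^H A from the identity matrix."""
    size = len(a)
    worst = 0.0
    for i in range(size):
        for j in range(size):
            g = sum(a[r][i].conjugate() * a[r][j] for r in range(size))
            worst = max(worst, abs(g - (1.0 if i == j else 0.0)))
    return worst


A_r = angular_basis(8)
print(max_unitarity_error(A_r))  # tiny value: the columns are orthonormal
```

The phase of entry (n, ℓ) reduces to 2πnℓ/N regardless of the spacing d, which is why the basis coincides with a DFT matrix.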
The physical channel of a massive MIMO system exhibits a sparse structure in the angular domain, since only a small portion of resolution bins receive electromagnetic waves from the transmitters. Therefore, the angular-domain channel representation $\mathbf{H}$ is a sparse matrix with a large portion of the elements being zero or close to zero [14]. Define the sparsity level of the massive MIMO channel by
$$\rho = \frac{|\mathcal{S}|}{NK} \tag{3}$$
where $\mathcal{S}$ is the support of the non-zero elements of $\mathbf{H}$, and $|\mathcal{S}|$ represents the cardinality of set $\mathcal{S}$.
Following [18], [21], we assume that the entries of $\mathbf{H}$ are independently drawn from a Bernoulli circularly symmetric complex Gaussian distribution, i.e.,
$$p_{h_{n,k}}(h_{n,k}) = (1-\rho)\,\delta(h_{n,k}) + \rho\,\mathcal{CN}(h_{n,k}; 0, \sigma_h^2) \tag{4}$$
where $\delta(\cdot)$ is the Dirac delta function, $h_{n,k}$ is the entry of $\mathbf{H}$ in the position of $(n, k)$, and $\sigma_h^2$ is the variance of a non-zero channel coefficient.
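The Bernoulli-Gaussian prior can be illustrated by drawing a channel matrix entry by entry and measuring its empirical sparsity level. This is a minimal sketch, not the authors' simulation code; the function names `sample_angular_channel` and `sparsity_level`, the matrix size, and the seed are assumptions introduced here.

```python
import math
import random


def sample_angular_channel(n_antennas, n_users, rho, sigma2_h, rng):
    """Draw H entry-wise: zero with probability 1 - rho, CN(0, sigma2_h) with probability rho."""
    std = math.sqrt(sigma2_h / 2.0)  # standard deviation per real dimension
    return [[complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
             if rng.random() < rho else 0j
             for _ in range(n_users)]
            for _ in range(n_antennas)]


def sparsity_level(h):
    """Empirical |S| / (N K): fraction of non-zero entries."""
    support = sum(1 for row in h for entry in row if entry != 0)
    return support / (len(h) * len(h[0]))


rng = random.Random(0)
H = sample_angular_channel(n_antennas=200, n_users=20, rho=0.2, sigma2_h=1.0, rng=rng)
print(sparsity_level(H))  # close to rho = 0.2 for a matrix of this size
```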

B. System Model
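Assume that the channel is block-fading with coherence time $T$. The massive MIMO system in the angular domain over $T$ time slots can be modeled as
$$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{W} \tag{5}$$
where $\mathbf{Y} \in \mathbb{C}^{N\times T}$ is the transformed observation matrix in the angular domain, $\mathbf{W} \in \mathbb{C}^{N\times T}$ is additive white Gaussian noise (AWGN) with each entry independently drawn from $\mathcal{CN}(w_{n,t}; 0, N_0)$, $\mathbf{H} \in \mathbb{C}^{N\times K}$ is the angular-domain channel matrix as aforementioned, and $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_K]^T \in \mathbb{C}^{K\times T}$ is the transmitted signal matrix, with $\mathbf{x}_k \in \mathbb{C}^{T\times 1}$ being the signal of user $k$. Each entry of $\mathbf{X}$ is modulated by using a constellation $\mathcal{C} = \{c_1, c_2, \ldots, c_{|\mathcal{C}|}\}$, where $|\mathcal{C}|$ is the cardinality of $\mathcal{C}$. That is, $x_{k,t}$ is uniformly drawn from $\mathcal{C}$ for all $k, t$, where $x_{k,t}$ is the $t$th entry of $\mathbf{x}_k$. Assume that $\mathcal{C}$ is rotationally invariant for any rotation angle $\omega \in \Omega$, where $\Omega = \{\omega_1, \omega_2, \ldots, \omega_{|\Omega|}\}$ is a set of angles. For example, for quadrature amplitude modulation (QAM), $\Omega = \{0, \pi/2, \pi, 3\pi/2\}$.

For each user $k$, the first $T_P$ symbols of $\mathbf{x}_k$ (denoted by $\mathbf{x}_{P,k} \in \mathbb{C}^{T_P\times 1}$) are assigned as pilots, and the remaining $T - T_P$ symbols (denoted by $\mathbf{x}_{D,k}$) are data symbols. We use $\mathcal{T}_P$ to represent the set $\{1, 2, \ldots, T_P\}$, and $\mathcal{T}_D$ to represent the set $\{T_P + 1, T_P + 2, \ldots, T\}$. Let $\mathbf{X}_P = [\mathbf{x}_{P,1}, \mathbf{x}_{P,2}, \ldots, \mathbf{x}_{P,K}]^T$ denote the pilot matrix occupying the first $T_P$ columns of $\mathbf{X}$, and let $\mathbf{X}_D = [\mathbf{x}_{D,1}, \mathbf{x}_{D,2}, \ldots, \mathbf{x}_{D,K}]^T$ denote the data matrix occupying the remaining $T - T_P$ columns. Correspondingly, $\mathbf{Y}$ is partitioned as $\mathbf{Y} = [\mathbf{Y}_P, \mathbf{Y}_D]$, where $\mathbf{Y}_P$ and $\mathbf{Y}_D$ correspond to $\mathbf{X}_P$ and $\mathbf{X}_D$, respectively. Assume that the entries of $\mathbf{X}$ are independent of each other, i.e.,
$$p_{\mathbf{X}}(\mathbf{X}) = \prod_{k=1}^{K}\prod_{t=1}^{T} p_{x_{k,t}}(x_{k,t}). \tag{6}$$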
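Denote by $P$ the total power budget of the transmitters and by $\alpha_k P$ the average transmission power of the $k$th transmitter. Then, each transmitter is power-constrained as
$$\frac{1}{T}\,\mathbb{E}\left(\|\mathbf{x}_k\|_2^2\right) \le \alpha_k P \tag{7}$$
where $\alpha_k \ge 0$ for $k \in \mathcal{I}_K$ with $\sum_{k=1}^{K} \alpha_k = 1$.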
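The frame structure described above (pilots first, data afterwards, observations split accordingly) can be sketched as follows. This is a toy illustration under assumptions made here: unit-energy QPSK so that every user has unit average power, a hand-written 3-by-2 channel, and the helper names `matmul` and `simulate_frame`, none of which come from the paper.

```python
import cmath
import math
import random

# Unit-energy QPSK constellation (assumed), so each user transmits with unit average power
QPSK = [cmath.exp(1j * (math.pi / 4 + m * math.pi / 2)) for m in range(4)]


def matmul(a, b):
    """Plain-Python product of an (N x K) and a (K x T) complex matrix."""
    return [[sum(a[n][k] * b[k][t] for k in range(len(b)))
             for t in range(len(b[0]))]
            for n in range(len(a))]


def simulate_frame(h, pilots, n_data, noise_var, rng):
    """Return (X, Y) with X = [X_P, X_D] and Y = H X + W, W having i.i.d. CN(0, noise_var) entries."""
    x = [list(p) + [rng.choice(QPSK) for _ in range(n_data)] for p in pilots]
    std = math.sqrt(noise_var / 2.0)
    y = [[z + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for z in row]
         for row in matmul(h, x)]
    return x, y


rng = random.Random(1)
H = [[1 + 0j, 0j], [0j, 0.5j], [0j, 0j]]           # toy sparse channel, N = 3, K = 2
pilots = [[QPSK[0], QPSK[0]], [QPSK[0], QPSK[1]]]  # T_P = 2 pilot symbols per user
X, Y = simulate_frame(H, pilots, n_data=4, noise_var=0.01, rng=rng)
Y_P = [row[:2] for row in Y]  # observations aligned with X_P
Y_D = [row[2:] for row in Y]  # observations aligned with X_D
```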

A. Problem Description
In this paper, our goal is to retrieve both the channel matrix $\mathbf{H}$ and the symbol matrix $\mathbf{X}_D$ from the observed data matrix $\mathbf{Y}$ and the pilot matrix $\mathbf{X}_P$. This problem can be formulated by using the maximum a posteriori (MAP) principle as
$$(\hat{\mathbf{H}}, \hat{\mathbf{X}}_D) = \arg\max_{\mathbf{H}, \mathbf{X}_D}\; p_{\mathbf{H},\mathbf{X}_D|\mathbf{Y},\mathbf{X}_P}(\mathbf{H}, \mathbf{X}_D \mid \mathbf{Y}, \mathbf{X}_P) \tag{8}$$
where $\hat{\mathbf{H}}$ and $\hat{\mathbf{X}}_D$ are the estimates of the channel matrix $\mathbf{H}$ and the signal matrix $\mathbf{X}_D$, respectively.
A straightforward approach to solve (8) is first to estimate the channel $\mathbf{H}$ based on $\mathbf{Y}_P$ (with known $\mathbf{X}_P$) and then to estimate the data matrix $\mathbf{X}_D$ based on $\mathbf{Y}_D$ and the channel estimate $\hat{\mathbf{H}}$. In principle, the estimated data can be used to further refine the channel estimate and hence improve the system performance. To this end, the authors in [9] proposed a joint channel and signal estimation method which involves approximate message passing over the factor graph obtained by factorizing the probability distribution $p_{\mathbf{H},\mathbf{X}_D|\mathbf{Y},\mathbf{X}_P}$ in (8).
In this paper, we focus on the "semi-blind" scenario where the pilot length T P is not large enough to provide a relatively accurate initial channel estimate for data detection. In this scenario, both the separate and joint estimation approaches discussed above cannot work well. The main reason is that when T P is small, the scheme in [9] is close to blind channel-and-signal estimation in which the issue of phase and permutation ambiguities arises [18]. For a small T P , the knowledge of X P is not "strong" enough to correct the phase and permutation ambiguities in the iterative estimation process. As a result, the inclusion of the knowledge of X P in the joint channel and signal estimation (following the method in [9]) may lead to a system performance even worse than the blind estimation method in [18].
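To efficiently exploit the knowledge of $\mathbf{X}_P$, we need to estimate the phase and permutation ambiguities in the iterative process of message passing. Denote by $\boldsymbol{\Pi} = [\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T \in \mathbb{C}^{K\times K}$ an arbitrary permutation matrix, and by
$$\boldsymbol{\Sigma} = \mathrm{diag}\{\sigma_1, \sigma_2, \ldots, \sigma_K\}, \quad \sigma_k = e^{j\omega_k},$$
a diagonal phase-shift matrix with each $\omega_k$ independently and uniformly taken from $\Omega$. The phase and permutation ambiguities mean that $(\mathbf{H}\boldsymbol{\Pi}^{-1}\boldsymbol{\Sigma}^{-1}, \boldsymbol{\Sigma}\boldsymbol{\Pi}\mathbf{X})$ is also a solution of (5). To resolve these ambiguities, we define auxiliary variables
$$\tilde{\mathbf{H}} = \mathbf{H}\boldsymbol{\Pi}^{-1}\boldsymbol{\Sigma}^{-1}, \qquad \tilde{\mathbf{X}} = \boldsymbol{\Sigma}\boldsymbol{\Pi}\mathbf{X}. \tag{9}$$
Note that, for any given $\boldsymbol{\Pi}$ and $\boldsymbol{\Sigma}$, $\tilde{\mathbf{H}}$ has the same distribution as $\mathbf{H}$ does, and so $\tilde{\mathbf{H}}$ is independent of $\boldsymbol{\Pi}$ and $\boldsymbol{\Sigma}$. Similarly, $\tilde{\mathbf{X}}$ is independent of $\boldsymbol{\Pi}$ and $\boldsymbol{\Sigma}$. Since $\mathbf{H}$ and $\mathbf{X}$ are independent, we further obtain that $\tilde{\mathbf{H}}$, $\tilde{\mathbf{X}}$, $\boldsymbol{\Pi}$, and $\boldsymbol{\Sigma}$ are independent of each other, i.e.,
$$p_{\tilde{\mathbf{H}},\tilde{\mathbf{X}},\boldsymbol{\Sigma},\boldsymbol{\Pi}}(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi}) = p_{\tilde{\mathbf{H}}}(\tilde{\mathbf{H}})\, p_{\tilde{\mathbf{X}}}(\tilde{\mathbf{X}})\, p_{\boldsymbol{\Sigma}}(\boldsymbol{\Sigma})\, p_{\boldsymbol{\Pi}}(\boldsymbol{\Pi}). \tag{10}$$
Then we recast the problem in (8) as
$$(\hat{\tilde{\mathbf{H}}}, \hat{\tilde{\mathbf{X}}}, \hat{\boldsymbol{\Sigma}}, \hat{\boldsymbol{\Pi}}) = \arg\max_{\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi}}\; p_{\tilde{\mathbf{H}},\tilde{\mathbf{X}},\boldsymbol{\Sigma},\boldsymbol{\Pi}|\mathbf{Y},\mathbf{X}_P}(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi} \mid \mathbf{Y}, \mathbf{X}_P). \tag{11}$$
From the Bayes' rule, we obtain
$$\begin{aligned}
&p_{\tilde{\mathbf{H}},\tilde{\mathbf{X}},\boldsymbol{\Sigma},\boldsymbol{\Pi}|\mathbf{Y},\mathbf{X}_P}(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi} \mid \mathbf{Y}, \mathbf{X}_P) \\
&\quad = \frac{p(\mathbf{Y}, \mathbf{X}_P \mid \tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi})\; p(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi})}{p(\mathbf{Y}, \mathbf{X}_P)} && \text{(12a)} \\
&\quad \propto p(\mathbf{Y}, \mathbf{X}_P \mid \tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi})\; p(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi}) && \text{(12b)} \\
&\quad = p(\mathbf{Y} \mid \tilde{\mathbf{H}}, \tilde{\mathbf{X}})\; p(\mathbf{X}_P \mid \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi})\; p_{\tilde{\mathbf{H}}}(\tilde{\mathbf{H}})\, p_{\tilde{\mathbf{X}}}(\tilde{\mathbf{X}})\, p_{\boldsymbol{\Sigma}}(\boldsymbol{\Sigma})\, p_{\boldsymbol{\Pi}}(\boldsymbol{\Pi}) && \text{(12c)} \\
&\quad = \prod_{n=1}^{N}\prod_{t=1}^{T} p_{y_{n,t}|z_{n,t}}(y_{n,t} \mid z_{n,t}) \prod_{k=1}^{K} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P) \prod_{n=1}^{N}\prod_{k=1}^{K} p_{\tilde{h}_{n,k}}(\tilde{h}_{n,k}) \prod_{k=1}^{K}\prod_{t=1}^{T} p_{\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \prod_{k=1}^{K} p_{\sigma_k}(\sigma_k)\; p_{\boldsymbol{\Pi}}(\boldsymbol{\Pi}) && \text{(12d)}
\end{aligned}$$
where $z_{n,t}$ is the $(n,t)$th entry of $\mathbf{Z} = \tilde{\mathbf{H}}\tilde{\mathbf{X}}$; (12a) follows from the Bayes' rule; the notation $\propto$ in step (12b) means equality up to a constant scaling factor; (12c) is from the facts that (i) $(\mathbf{X}_P, \boldsymbol{\Sigma}, \boldsymbol{\Pi}) \to (\tilde{\mathbf{H}}, \tilde{\mathbf{X}}) \to \mathbf{Y}$ forms a Markov chain, (ii) $\mathbf{X}_P$ is independent of $\tilde{\mathbf{H}}$ for any given $\tilde{\mathbf{X}}$, $\boldsymbol{\Sigma}$, and $\boldsymbol{\Pi}$, and (iii) $\tilde{\mathbf{H}}$, $\tilde{\mathbf{X}}$, $\boldsymbol{\Sigma}$, $\boldsymbol{\Pi}$ are independent of each other by (10); $p_{\sigma_k}(\sigma_k)$ in (12d) is the probability density of the phase shift of user $k$. Recall that $\sigma_k$ is uniformly distributed, i.e., $p_{\sigma_k}(\sigma_k) = 1/|\Omega|$ for $\sigma_k \in \{e^{j\omega} : \omega \in \Omega\}$, where $|\Omega|$ is the cardinality of $\Omega$. The factorization in (12) will be used in the development of message passing algorithms in the subsequent subsections.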
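The phase and permutation ambiguities can be checked numerically: permuting and rotating the rows of the signal matrix while applying the inverse operations to the columns of the channel leaves the noiseless product unchanged. The sketch below assumes the auxiliary variables take the form X̃ = ΣΠX and H̃ = HΠ⁻¹Σ⁻¹, consistent with the constraint X̃_P = ΣΠX_P used in the factor graph; the helper `apply_ambiguity` and the toy matrices are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Plain-Python product of an (N x K) and a (K x T) complex matrix."""
    return [[sum(a[n][k] * b[k][t] for k in range(len(b)))
             for t in range(len(b[0]))]
            for n in range(len(a))]


def apply_ambiguity(h, x, perm, omegas):
    """Return (H~, X~) with X~ = Sigma Pi X and H~ = H Pi^{-1} Sigma^{-1}.

    Row k of Pi is e_{perm[k]} and sigma_k = exp(j * omegas[k]).
    """
    sigma = [cmath.exp(1j * w) for w in omegas]
    # Row k of X~ is row perm[k] of X rotated by sigma_k
    x_tilde = [[sigma[k] * v for v in x[perm[k]]] for k in range(len(perm))]
    # Column k of H~ is column perm[k] of H rotated by 1 / sigma_k
    h_tilde = [[row[perm[k]] / sigma[k] for k in range(len(perm))] for row in h]
    return h_tilde, x_tilde


H = [[1 + 1j, 0j, 2 + 0j], [0j, 0.5j, 0j]]        # N = 2, K = 3
X = [[1 + 0j, 1j], [-1 + 0j, 1j], [-1j, 1 + 0j]]  # K = 3, T = 2
H_t, X_t = apply_ambiguity(H, X, perm=[2, 0, 1], omegas=[math.pi / 2, math.pi, 0.0])

Z, Z_t = matmul(H, X), matmul(H_t, X_t)
gap = max(abs(Z[n][t] - Z_t[n][t]) for n in range(2) for t in range(2))
print(gap)  # numerically zero: both factor pairs explain the same noiseless product
```

Because the observation alone cannot separate the two factor pairs, the pilots are the only source of information about Π and Σ.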

B. Semi-Blind Channel-and-Signal Estimation Algorithm
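To describe the message passing process more clearly, we introduce an auxiliary variable $\mathbf{P} \in \mathbb{C}^{K\times K}$ to denote a random permutation matrix, where $\mathbf{P}$ takes values in $\mathcal{P} \triangleq \{\mathbf{P}_1, \mathbf{P}_2, \cdots, \mathbf{P}_{K!}\}$ with equal probability, with $\mathcal{P}$ being the set of all permutations.¹ Then, the fact that $\boldsymbol{\Pi}$ is a random permutation can be represented by the following joint distribution:
$$p_{\boldsymbol{\Pi},\mathbf{P}}(\boldsymbol{\Pi}, \mathbf{P}) = p_{\mathbf{P}}(\mathbf{P})\,\delta([\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T - \mathbf{P}) \tag{13}$$
where $p_{\mathbf{P}}(\mathbf{P}) = 1/K!$ for $\mathbf{P} \in \mathcal{P}$, and each $\boldsymbol{\pi}_k$ takes values in $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_K\}$, with $\mathbf{e}_i = [0, \ldots, 0, 1, 0, \ldots, 0]^T$ being the $i$th column of the $K$-by-$K$ identity matrix. With the inclusion of the auxiliary variable $\mathbf{P}$, the factorization in (12) converts to
$$\begin{aligned}
p(\tilde{\mathbf{H}}, \tilde{\mathbf{X}}, \boldsymbol{\Sigma}, \boldsymbol{\Pi}, \mathbf{P} \mid \mathbf{Y}, \mathbf{X}_P) \propto{} & \prod_{n=1}^{N}\prod_{t=1}^{T} p_{y_{n,t}|z_{n,t}}(y_{n,t} \mid z_{n,t}) \prod_{k=1}^{K} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P) \prod_{n=1}^{N}\prod_{k=1}^{K} p_{\tilde{h}_{n,k}}(\tilde{h}_{n,k}) \\
& \times \prod_{k=1}^{K}\prod_{t=1}^{T} p_{\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \prod_{k=1}^{K} p_{\sigma_k}(\sigma_k)\; p_{\mathbf{P}}(\mathbf{P})\,\delta([\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T - \mathbf{P}).
\end{aligned} \tag{14}$$
The factorized posterior distribution in (14) can be represented by a factor graph, as depicted in Fig. 1.
In Fig. 1, we use $\delta_k$ as shorthand for $\delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)$, $k \in \mathcal{I}_K$, and $\delta_{\boldsymbol{\Pi}}$ as shorthand for $\delta([\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T - \mathbf{P})$. Each hollow circle in Fig. 1 represents a "variable node" corresponding to a random variable involved in (14), and each black solid square represents a "factor node" corresponding to a factor function in (14). A variable node is connected to a factor node if the variable appears in the factor function. In Fig. 1, we divide the whole factor graph into two parts. In part I, we estimate the channel matrix $\tilde{\mathbf{H}}$ and signal matrix $\tilde{\mathbf{X}}$ based on $\mathbf{Y}$ and the knowledge that $\tilde{\mathbf{H}}$ is sparse; in part II, we use the knowledge of $\mathbf{X}_P$ to improve the estimation of $\tilde{\mathbf{X}}_P$ and estimate $\boldsymbol{\Pi}$ and $\boldsymbol{\Sigma}$ based on the constraint $\tilde{\mathbf{X}}_P = \boldsymbol{\Sigma}\boldsymbol{\Pi}\mathbf{X}_P$.
We derive the semi-blind detection algorithm based on the message passing principles over the factor graph in Fig. 1. Note that the constraints in part I are related to factorizing the matrix product $\tilde{\mathbf{H}}\tilde{\mathbf{X}}$. This part can be realized by following the BiG-AMP algorithm in [20]. Therefore, in what follows, we focus on the derivation of the message passing algorithm for part II.

¹ It is possible to design message passing directly based on the factorization in (12). However, the introduction of $\mathbf{P}$ yields a unified view of the derivations of the SCSE and S-SCSE algorithms. In particular, we show in Section III-C that the S-SCSE algorithm is derived simply by deleting the constraint $\delta_{\boldsymbol{\Pi}}$ in Fig. 1.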
Denote by ∆ a→b (·) the message from node a to node b and by ∆ c (·) the marginal posterior of variable c. Then the messages in part II are sketched as follows.
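1) The message from $\tilde{x}_{k,t}$ to $\delta_k$ is given by
$$\Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t}) \propto p_{\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \prod_{n=1}^{N} \Delta_{p_{y_{n,t}|z_{n,t}}\to\tilde{x}_{k,t}}(\tilde{x}_{k,t}), \quad t \in \mathcal{T}_P. \tag{15}$$
2) The message from $\sigma_k$ to $\delta_k$ is given by
$$\Delta_{\sigma_k\to\delta_k}(\sigma_k) = p_{\sigma_k}(\sigma_k). \tag{16}$$
3) The message from $\delta_k$ to $\boldsymbol{\pi}_k$ is given by
$$\Delta_{\delta_k\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k) \propto \int_{\sigma_k,\,\tilde{\mathbf{x}}_{P,k}} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)\, \Delta_{\sigma_k\to\delta_k}(\sigma_k) \prod_{t\in\mathcal{T}_P} \Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t}). \tag{17}$$
Since $\tilde{x}_{k,t}$, $\boldsymbol{\pi}_k$, and $\sigma_k$ are discrete variables, we write the message in (17) in its discrete form as
$$P_{\delta_k\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k = \mathbf{e}_i) = C \sum_{\omega\in\Omega} \prod_{t\in\mathcal{T}_P} P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = e^{j\omega} x_{i,t}) \tag{18}$$
where $P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = e^{j\omega} x_{i,t})$ denotes the probability of $\tilde{x}_{k,t} = e^{j\omega} x_{i,t}$ specified in the message $\Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t})$, and $C$ is a generic normalization factor. Clearly, the message from $\boldsymbol{\pi}_k$ to $\delta_{\boldsymbol{\Pi}}$ is
$$\Delta_{\boldsymbol{\pi}_k\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_k) = \Delta_{\delta_k\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k). \tag{19}$$
4) The message from $\mathbf{P}$ to $\delta_{\boldsymbol{\Pi}}$ is given by
$$\Delta_{\mathbf{P}\to\delta_{\boldsymbol{\Pi}}}(\mathbf{P}) = p_{\mathbf{P}}(\mathbf{P}). \tag{20}$$
5) Combining the message from $\mathbf{P}$ to $\delta_{\boldsymbol{\Pi}}$ and the messages from $\{\boldsymbol{\pi}_{k'}\}_{k'=1, k'\neq k}^{K}$ to $\delta_{\boldsymbol{\Pi}}$, we obtain
$$\Delta_{\delta_{\boldsymbol{\Pi}}\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k) \propto \int_{\mathbf{P},\,\{\boldsymbol{\pi}_{k'}\}_{k'\neq k}} \delta([\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T - \mathbf{P})\, \Delta_{\mathbf{P}\to\delta_{\boldsymbol{\Pi}}}(\mathbf{P}) \prod_{k'=1,\,k'\neq k}^{K} \Delta_{\boldsymbol{\pi}_{k'}\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_{k'}). \tag{21}$$
Denote by $\mathbf{p}_{\ell,k}$ the transpose of the $k$th row of $\mathbf{P}_\ell$. The discrete form of the above message can be written as
$$P_{\delta_{\boldsymbol{\Pi}}\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k = \mathbf{e}_i) = C \sum_{\ell:\,\mathbf{p}_{\ell,k} = \mathbf{e}_i}\; \prod_{k'=1,\,k'\neq k}^{K} P_{\boldsymbol{\pi}_{k'}\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_{k'} = \mathbf{p}_{\ell,k'}) \tag{22}$$
where $P_{\boldsymbol{\pi}_{k'}\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_{k'} = \mathbf{p}_{\ell,k'})$ denotes the probability of $\boldsymbol{\pi}_{k'} = \mathbf{p}_{\ell,k'}$ specified by the message $\Delta_{\boldsymbol{\pi}_{k'}\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_{k'})$.
Similar to (19), the message from $\boldsymbol{\pi}_k$ to $\delta_k$ is
$$\Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) = \Delta_{\delta_{\boldsymbol{\Pi}}\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k). \tag{23}$$
6) The message from $\delta_k$ to $\tilde{x}_{k,t}$ is given by
$$\Delta_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \propto \int_{\sigma_k,\,\boldsymbol{\pi}_k,\,\{\tilde{x}_{k,t'}\}_{t'\neq t}} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)\, \Delta_{\sigma_k\to\delta_k}(\sigma_k)\, \Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) \prod_{t'\in\mathcal{T}_P,\,t'\neq t} \Delta_{\tilde{x}_{k,t'}\to\delta_k}(\tilde{x}_{k,t'}). \tag{24}$$
We write the above message in its discrete form as
$$P_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t} = c) = C \sum_{(i,\omega):\, e^{j\omega} x_{i,t} = c} P_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k = \mathbf{e}_i) \prod_{t'\in\mathcal{T}_P,\,t'\neq t} P_{\tilde{x}_{k,t'}\to\delta_k}(\tilde{x}_{k,t'} = e^{j\omega} x_{i,t'}). \tag{25}$$
7) The message from $\tilde{x}_{k,t}$ to $p_{y_{n,t}|z_{n,t}}$ is given by
$$\Delta_{\tilde{x}_{k,t}\to p_{y_{n,t}|z_{n,t}}}(\tilde{x}_{k,t}) \propto p_{\tilde{x}_{k,t}}(\tilde{x}_{k,t})\, \Delta_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \prod_{n'=1,\,n'\neq n}^{N} \Delta_{p_{y_{n',t}|z_{n',t}}\to\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \tag{26}$$
where $p_{\tilde{x}_{k,t}}(\tilde{x}_{k,t}) = p_{x_{k,t}}(\tilde{x}_{k,t})$ is the prior distribution of $\tilde{x}_{k,t}$ determined by the modulation of the data, and $\Delta_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t})$ is the message provided by the prior knowledge of $\mathbf{X}_P$. The message passing process of part I can be realized by the BiG-AMP algorithm in [20]. Note that the marginal posterior $\Delta_{\tilde{x}_{k,t}}(\tilde{x}_{k,t}) = \Delta_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t})\,\Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t})$, rather than $\Delta_{\tilde{x}_{k,t}\to p_{y_{n,t}|z_{n,t}}}(\tilde{x}_{k,t})$, is needed in the BiG-AMP algorithm for complexity reduction. The discrete form of $\Delta_{\tilde{x}_{k,t}}(\tilde{x}_{k,t})$ is given by
$$P_{\tilde{x}_{k,t}}(\tilde{x}_{k,t} = c) = C\, P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = c) \sum_{(i,\omega):\, e^{j\omega} x_{i,t} = c} P_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k = \mathbf{e}_i) \prod_{t'\in\mathcal{T}_P,\,t'\neq t} P_{\tilde{x}_{k,t'}\to\delta_k}(\tilde{x}_{k,t'} = e^{j\omega} x_{i,t'}). \tag{27}$$
The estimates of $\tilde{\mathbf{H}}$ and $\tilde{\mathbf{X}}$ from the message passing iteration contain phase and permutation ambiguities. To eliminate these ambiguities, we need to estimate the phase and permutation ambiguities.
Specifically, the marginal posterior of $\sigma_k$ can be written as
$$\Delta_{\sigma_k}(\sigma_k) \propto p_{\sigma_k}(\sigma_k) \int_{\boldsymbol{\pi}_k,\,\tilde{\mathbf{x}}_{P,k}} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)\, \Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) \prod_{t\in\mathcal{T}_P} \Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t}). \tag{28}$$
The discrete form of the message above can be written as
$$P_{\sigma_k}(e^{j\omega}) = C \sum_{i=1}^{K} P_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k = \mathbf{e}_i) \prod_{t\in\mathcal{T}_P} P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = e^{j\omega} x_{i,t}). \tag{29}$$
Then, an estimate of $\boldsymbol{\Sigma}$ is given by $\hat{\boldsymbol{\Sigma}} = \mathrm{diag}\{\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_K\}$, where $\hat{\sigma}_k = e^{j\hat{\omega}_k}$ with $\hat{\omega}_k = \arg\max_{\omega\in\Omega} P_{\sigma_k}(e^{j\omega})$. The marginal posterior of $\mathbf{P}$ can be written as
$$\Delta_{\mathbf{P}}(\mathbf{P}) \propto p_{\mathbf{P}}(\mathbf{P}) \int_{\boldsymbol{\Pi}} \delta([\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T - \mathbf{P}) \prod_{k=1}^{K} \Delta_{\boldsymbol{\pi}_k\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_k). \tag{30}$$
We write (30) in its discrete form as
$$P_{\mathbf{P}}(\mathbf{P}_\ell) = C \prod_{k=1}^{K} P_{\boldsymbol{\pi}_k\to\delta_{\boldsymbol{\Pi}}}(\boldsymbol{\pi}_k = \mathbf{p}_{\ell,k}). \tag{31}$$
Then, we obtain an estimate of $\boldsymbol{\Pi}$ by $\hat{\boldsymbol{\Pi}} = \arg\max_{\mathbf{P}_\ell\in\mathcal{P}} P_{\mathbf{P}}(\mathbf{P}_\ell)$.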

C. Simplified Semi-Blind Channel-and-Signal Estimation Algorithm
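The SCSE algorithm is computationally infeasible for a relatively large $K$, since it involves enumeration over all length-$K$ permutations in (A19). To reduce the complexity, we relax the constraint that $\boldsymbol{\Pi}$ is a permutation to the one that each row $k$ of $\boldsymbol{\Pi}$ (denoted by $\boldsymbol{\pi}_k^T$) is independently taken from the set $\{\mathbf{e}_\ell^T\}_{\ell=1}^{K}$. That is,
$$p_{\boldsymbol{\Pi}}(\boldsymbol{\Pi}) = \prod_{k=1}^{K} p_{\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k) \tag{32}$$
where $p_{\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k) = 1/K$ for $\boldsymbol{\pi}_k \in \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_K\}$. The corresponding factor graph of part II in Fig. 1 is given in Fig. 2. Compared with part II in Fig. 1, the factor graph in Fig. 2 is almost the same, except that the nodes $\{p_{\mathbf{P}}, \mathbf{P}, \delta_{\boldsymbol{\Pi}}\}$ are replaced by $\{p_{\boldsymbol{\pi}_k}\}$.
We now describe message passing over the factor graph in Fig. 2. Again, we focus on part II. The messages from $\tilde{x}_{k,t}$ to $\delta_k$ and from $\sigma_k$ to $\delta_k$ have the same form as in (15) and (16). The message from $\boldsymbol{\pi}_k$ to $\delta_k$ is given by
$$\Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) = p_{\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k). \tag{33}$$
Then, the message from $\delta_k$ to $\tilde{x}_{k,t}$ is given by
$$\Delta_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t}) \propto \int_{\sigma_k,\,\boldsymbol{\pi}_k,\,\{\tilde{x}_{k,t'}\}_{t'\neq t}} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)\, \Delta_{\sigma_k\to\delta_k}(\sigma_k)\, \Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) \prod_{t'\in\mathcal{T}_P,\,t'\neq t} \Delta_{\tilde{x}_{k,t'}\to\delta_k}(\tilde{x}_{k,t'}). \tag{34}$$
Since $\tilde{x}_{k,t}$, $\boldsymbol{\pi}_k$, and $\sigma_k$ are discrete variables, we can write the above message in its discrete form as
$$P_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t} = c) = C \sum_{(i,\omega):\, e^{j\omega} x_{i,t} = c}\; \prod_{t'\in\mathcal{T}_P,\,t'\neq t} P_{\tilde{x}_{k,t'}\to\delta_k}(\tilde{x}_{k,t'} = e^{j\omega} x_{i,t'}). \tag{35}$$
Then, for any $c \in \mathcal{C}$, the marginal posterior of $\tilde{x}_{k,t}$ can be updated as
$$P_{\tilde{x}_{k,t}}(\tilde{x}_{k,t} = c) = C\, P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = c)\, P_{\delta_k\to\tilde{x}_{k,t}}(\tilde{x}_{k,t} = c). \tag{36}$$
The other messages are calculated by following the SCSE algorithm. Compared to SCSE, S-SCSE omits the calculation of $P_{\delta_{\boldsymbol{\Pi}}\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k = \mathbf{e}_i)$ in (22), which significantly reduces the computational complexity.
The phase and permutation ambiguities are estimated as follows. The marginal posterior of $\sigma_k$ can be written as
$$\Delta_{\sigma_k}(\sigma_k) \propto p_{\sigma_k}(\sigma_k) \int_{\boldsymbol{\pi}_k,\,\tilde{\mathbf{x}}_{P,k}} \delta(\tilde{\mathbf{x}}_{P,k} - \sigma_k \boldsymbol{\pi}_k^T \mathbf{X}_P)\, \Delta_{\boldsymbol{\pi}_k\to\delta_k}(\boldsymbol{\pi}_k) \prod_{t\in\mathcal{T}_P} \Delta_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t}). \tag{37}$$
The discrete form of the above message can be written as
$$P_{\sigma_k}(e^{j\omega}) = C \sum_{i=1}^{K} \prod_{t\in\mathcal{T}_P} P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = e^{j\omega} x_{i,t}). \tag{38}$$
Then, an estimate of $\boldsymbol{\Sigma}$ is given by $\hat{\boldsymbol{\Sigma}} = \mathrm{diag}\{\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_K\}$, where $\hat{\sigma}_k = e^{j\hat{\omega}_k}$ with $\hat{\omega}_k = \arg\max_{\omega\in\Omega} P_{\sigma_k}(e^{j\omega})$. The marginal posterior of $\boldsymbol{\pi}_k$ can be written as
$$\Delta_{\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k) \propto p_{\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k)\, \Delta_{\delta_k\to\boldsymbol{\pi}_k}(\boldsymbol{\pi}_k). \tag{39}$$
Similarly, the discrete form of the above message can be written as
$$P_{\boldsymbol{\pi}_k}(\mathbf{e}_\ell) = C \sum_{\omega\in\Omega} \prod_{t\in\mathcal{T}_P} P_{\tilde{x}_{k,t}\to\delta_k}(\tilde{x}_{k,t} = e^{j\omega} x_{\ell,t}). \tag{40}$$
Then, an estimate of $\boldsymbol{\Pi}$ is given by $\hat{\boldsymbol{\Pi}} = [\hat{\boldsymbol{\pi}}_1, \hat{\boldsymbol{\pi}}_2, \ldots, \hat{\boldsymbol{\pi}}_K]^T$, where $\hat{\boldsymbol{\pi}}_k = \mathbf{e}_{\hat{\ell}_k}$ with $\hat{\ell}_k = \arg\max_{\ell\in\mathcal{I}_K} P_{\boldsymbol{\pi}_k}(\mathbf{e}_\ell)$.

Table I: The SCSE algorithm.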
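The label and phase estimates in S-SCSE amount to matching the current beliefs on the pilot positions of one extracted row against every pilot sequence under every admissible rotation. The following sketch illustrates that matching for QPSK with uniform priors on the label and the phase. It is a simplified stand-in written for this explanation, not the algorithm of Table II: symbols are stored as indices 0 to 3 (an encoding assumed here so that a rotation by qπ/2 becomes an index shift), and the function names and toy beliefs are hypothetical.

```python
def pilot_match_scores(beliefs, pilots):
    """scores[i][q] = prod_t P(x~_{k,t} = e^{j q pi/2} x_{i,t}) for one extracted row k.

    QPSK symbols are stored as indices m in {0, 1, 2, 3} for exp(j(pi/4 + m pi/2)),
    so a rotation by q*pi/2 maps index m to (m + q) % 4.
    beliefs[t][m] is the current probability that pilot position t carries symbol m.
    """
    scores = []
    for pilot in pilots:
        per_rotation = []
        for q in range(4):
            prob = 1.0
            for t, symbol in enumerate(pilot):
                prob *= beliefs[t][(symbol + q) % 4]
            per_rotation.append(prob)
        scores.append(per_rotation)
    return scores


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def label_and_phase_posteriors(beliefs, pilots):
    """Marginal posteriors of the user label and of the phase, assuming uniform priors on both."""
    scores = pilot_match_scores(beliefs, pilots)
    label_post = normalize([sum(row) for row in scores])                        # sum over rotations
    phase_post = normalize([sum(row[q] for row in scores) for q in range(4)])   # sum over labels
    return label_post, phase_post


pilots = [[0, 0, 1], [0, 1, 2], [0, 2, 3]]  # K = 3 users, T_P = 3, distinct up to rotation
observed = [3, 0, 1]                         # pilot of user 1 rotated by q = 3
beliefs = [[0.85 if m == s else 0.05 for m in range(4)] for s in observed]
label_post, phase_post = label_and_phase_posteriors(beliefs, pilots)
user_hat = max(range(len(pilots)), key=lambda i: label_post[i])
phase_hat = max(range(4), key=lambda q: phase_post[q])
```

Each row is handled independently, so the cost grows with K|Ω|T_P per row instead of with the K! permutations enumerated by SCSE.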
Table II: The S-SCSE algorithm, steps (S7) and (S8):
(S7) $\forall k$: $\hat{\sigma}_k = \arg\max_{\omega\in\Omega} P_{\sigma_k}(e^{j\omega})$, $\hat{\boldsymbol{\Sigma}} = \{\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_K\}$
(S8) $\forall k$: $\hat{\boldsymbol{\pi}}_k = \arg\max_{\ell\in\mathcal{I}_K} P_{\boldsymbol{\pi}_k}(\mathbf{e}_\ell)$, $\hat{\boldsymbol{\Pi}} = \{\hat{\boldsymbol{\pi}}_1, \hat{\boldsymbol{\pi}}_2, \ldots, \hat{\boldsymbol{\pi}}_K\}$

The S-SCSE algorithm is presented in Table II. In the SCSE and S-SCSE algorithms in Table I and Table II, we assume that the model parameters, such as $\rho$, $\sigma_h^2$, and $N_0$, are a priori known by the receiver, so that the distributions $p_{\tilde{\mathbf{H}}}(\tilde{\mathbf{H}})$, $p_{\tilde{\mathbf{X}}}(\tilde{\mathbf{X}})$, and $p_{\mathbf{Y}|\mathbf{Z}}(\mathbf{Y}|\mathbf{Z})$ can be initialized. However, in practice, these model parameters need to be estimated as well. In this paper, we use the expectation-maximization (EM) algorithm to infer these model parameters, and the EM update is performed in each outer iteration. In addition, the damping technique is used in the algorithm to improve convergence in simulation. We refer readers to [20] and [21] for more details.

2 To distinguish users uniquely, $T_P$ is required to be large enough to ensure that $\mathbf{x}_{P,k} \neq e^{j\omega}\mathbf{x}_{P,k'}$ for any $k \neq k'$ and any $\omega \in \Omega$. This implies that, when the data are modulated by quaternary phase shift keying (QPSK), $T_P$ should be no less than $1 + \lceil \frac{1}{2}\log_2 K \rceil$, where one symbol is used to correct the phase shift of each user $k$, and $\lceil \frac{1}{2}\log_2 K \rceil$ symbols are used to guarantee that the pilot sequences of the $K$ users are different from each other.
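The pilot-length requirement in the footnote can be computed with integer arithmetic: one reference symbol fixes the phase, and the label symbols must offer at least K distinct patterns. The sketch below generalizes the QPSK expression to an arbitrary constellation size, which is an assumption made here; it agrees with the QPSK bound above and with the 16-QAM figure quoted in the numerical results. The function name `min_pilot_length` is hypothetical.

```python
def min_pilot_length(num_users, constellation_size=4):
    """One reference symbol plus the fewest label symbols m with constellation_size**m >= num_users."""
    if num_users < 1 or constellation_size < 2:
        raise ValueError("need num_users >= 1 and constellation_size >= 2")
    label_symbols = 0
    capacity = 1
    while capacity < num_users:  # integer arithmetic avoids floating-point log round-off
        capacity *= constellation_size
        label_symbols += 1
    return 1 + label_symbols


for k in (4, 16, 17, 64):
    print(k, min_pilot_length(k))
```

For QPSK each label symbol distinguishes four users, which is where the factor 1/2 in front of the base-2 logarithm comes from.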

D. Metric for Random Initializations
The semi-blind detection problem in (11) is non-convex, and the SCSE and S-SCSE algorithms are prone to be stuck at local optima. To alleviate this issue, multiple random initializations and multiple re-initializations are conducted.
We next describe how to choose a desirable result among multiple random initializations. In a practical receiver, the metrics such as the mean-square error of the channel and the symbol error rate of the signal are not useful in evaluating the performance of random initializations since the ground truth is not available to the receiver. In this regard, we propose to use the following heuristic metric for evaluating random initializations: where τ is the index of random initializations. We choose the initialization with the minimum value of J(τ ).

IV. NUMERICAL RESULTS
In simulation, QPSK and 16-QAM modulations with Gray mapping are employed. We set $\alpha_k = 1/K$, $P = K$, and $\sigma_h^2 = 1$. The SNR is defined by $K/N_0$. For the simulated algorithms, the maximum number of inner iterations $L_{\max}$ is set to 100, and the maximum number of outer iterations $M_{\max}$ is set to 10.
The simulation results presented in this paper are obtained by taking average over at least 500 random realizations. We compare the numerical results of different approaches, as listed below.
• Training-based: The training-based scheme is performed by using the generalized approximate message-passing (GAMP)-based joint channel-and-data (JCD) algorithm in [9].
• Blind detection: The blind detection scheme is performed with the BiG-AMP algorithm in [20].
• SCSE: The SCSE algorithm is proposed in this paper.
• S-SCSE: The S-SCSE algorithm is proposed in this paper.

... and $T = 50$. The number of random initializations $J$ is set to 5. We can see that for a relatively large $T_P$ (say, $T_P = 5$ for the configuration in Fig. 4), S-SCSE is able to perform close to SCSE. Note that, due to the high computational complexity of SCSE, we henceforth only present the simulation results of S-SCSE.

... from QPSK (in Fig. 5(a)) to 16-QAM (in Fig. 5(b)), a larger number of random initializations is required to achieve reliable semi-blind detection. Also note that the blind detection system needs one reference symbol and a user label. For the simulation settings considered here, this amounts to a cost of $1 + \lceil \frac{1}{2}\log_2 K \rceil = 4$ symbols for QPSK modulation, and $1 + \lceil \frac{1}{2}\log_4 K \rceil = 3$ symbols for 16-QAM modulation.
In Fig. 5(a), we see that for T P = 4 and 8, S-SCSE significantly outperforms the training-based scheme.
We also see that for $T_P = 4$, S-SCSE slightly outperforms the blind detection scheme, while for $T_P = 8$, S-SCSE outperforms the blind detection scheme by about 4 dB at BER $= 10^{-5}$. For $T_P = 12$, the S-SCSE and training-based schemes perform close to each other. The reason is that in this case $T_P$ is large enough to provide a relatively accurate initial channel estimate, and so the training-based scheme can work well.
In Fig. 5(b), S-SCSE outperforms the blind detection scheme by about 6 dB at BER $= 10^{-3}$ for $T_P = 3$, and by about 10 dB at BER $= 10^{-4}$ for $T_P = 9$. Similarly, S-SCSE significantly outperforms the training-based scheme for $T_P = 3$ and 6. The S-SCSE and training-based schemes perform close to each other for $T_P = 12$.
Fig. 6 shows the throughput for the training-based, blind detection, and S-SCSE schemes with 16-QAM and Gray mapping against SNR. We say that a system performs successful recovery when BER $< 10^{-3}$.
For the S-SCSE and training-based schemes, for each given SNR, we increase the number of pilots $T_P$ until the system performs successful recovery. For blind detection, $T_P$ is fixed at 3. Then, the throughput is calculated by $4K(1 - T_P/T)$ bits per channel use. From Fig. 6, we see that S-SCSE considerably outperforms the training-based scheme for $\rho = 0.1$, 0.2, and 0.3. For $\rho = 0.4$, the sparsity level is too high, so that the sparse matrix factorization itself cannot provide much useful information. Both the semi-blind and training-based schemes then rely on the knowledge of pilots for channel and signal estimation; thus, the two schemes perform closely in Fig. 6(d). For comparison, we also include the SNR threshold beyond which the blind detection scheme is able to perform successful recovery. We see that the blind detection scheme works well only when the SNR is sufficiently high. We also see that the threshold is not included in Fig. 6(d) since for $\rho = 0.4$ the blind detection scheme does not work in the SNR range of interest. This demonstrates the advantage of semi-blind detection.
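The throughput expression used for Fig. 6 is simple enough to evaluate directly. The sketch below is illustrative only: the function name `throughput_bits_per_channel_use` is hypothetical, and the value K = 20 in the usage loop is a placeholder rather than a setting quoted from the paper.

```python
def throughput_bits_per_channel_use(num_users, pilot_len, frame_len, bits_per_symbol=4):
    """bits_per_symbol * K * (1 - T_P / T); 16-QAM carries 4 bits per symbol."""
    if not 0 <= pilot_len <= frame_len:
        raise ValueError("pilot_len must lie between 0 and frame_len")
    return bits_per_symbol * num_users * (frame_len - pilot_len) / frame_len


# Illustrative numbers only (K = 20 is a placeholder, T = 50 as in the simulations)
for t_p in (3, 6, 12):
    print(t_p, throughput_bits_per_channel_use(num_users=20, pilot_len=t_p, frame_len=50))
```

Every pilot symbol removed from the frame returns 4K/T bits per channel use, which is why a scheme that succeeds with a shorter pilot sequence at a given SNR achieves a higher throughput.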

V. CONCLUSIONS
In this paper, we proposed a semi-blind signal detection scheme for uplink massive MIMO in which short pilot sequences are inserted into user packets and the knowledge of pilots is integrated into the message passing algorithm for noisy matrix factorization. We derived two semi-blind estimation algorithms, namely SCSE and S-SCSE, based on the message-passing principles. Specifically, the S-SCSE algorithm, as a simplified version of the SCSE algorithm, achieves almost the same performance with a much lower computational complexity. We showed that our proposed semi-blind scheme substantially outperforms the existing blind detection and training-based schemes in the short-pilot regime.