Iterative Receivers for Large-Scale MIMO Systems With Finite-Alphabet Simplicity-Based Detection

In this paper, we consider large-scale MIMO systems and define iterative receivers that use the simplicity-based detection algorithm referred to as the Finite Alphabet Simplicity (FAS) algorithm. First, we focus on uncoded systems: we propose a novel successive interference cancellation algorithm with iterative processing based on the shadow area principle, and we optimize its parameters by exploiting the theoretical analysis of the detector output. Second, we assume FEC-encoded systems and propose an iterative receiver based on a maximum likelihood-like detection with a restricted candidate subset defined by the FAS algorithm output. We also introduce another receiver based on FAS detection whose criterion is penalized with the mean absolute error function. Simulation results show the efficiency of all proposed iterative receivers compared to state-of-the-art methods.


I. INTRODUCTION
The expected exponential growth in the number of connected mobile machines and in data traffic is motivating 5G designers to look for new technologies and approaches to meet the growing demand. It has been theoretically demonstrated that conventional schemes cannot achieve the overall throughput capacity of multi-user wireless systems and that the maximum number of supported users is limited by the total number of orthogonal resources [1]. To overcome this problem and to support the massive connectivity of users and devices, improved technologies are needed. Large-scale Multiple-Input Multiple-Output (MIMO) is considered a potential candidate to meet the challenges of 5G. To better exploit spatial diversity, the idea is to implement a large number of antennas in order to provide higher bandwidth within the spectrum limitation. Massive MIMO is a special case of large-scale MIMO systems that involves a higher number of antennas.
In this paper, we are interested in the detection problem in large-scale MIMO systems.
The usual detection algorithms cannot be applied in our context. Optimal maximum likelihood (ML) detection meets the requirement of diversity, but its complexity is too high. The ML-like sphere decoding technique [2] involves an exhaustive search in a hypersphere whose dimension remains high in the case of large-scale MIMO, which makes detection intractable from a complexity point of view. Linear solutions such as Minimum Mean Square Error (MMSE) [3] and Zero Forcing (ZF) have a low computational complexity at the expense of a high performance loss. Successive interference cancellation (SIC) schemes [4], [5], such as MMSE-SIC in [6], have been proposed to improve linear detector performance at the expense of higher complexity. A further decrease in the error rate was obtained by combining SIC and lattice reduction (LR) schemes, as was done, for example, in the MMSE-SIC-LR studied in [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Jiayi Zhang.
An interesting phenomenon in large-scale MIMO can be exploited, namely channel hardening [8]. When an overdetermined MIMO system is considered (the number of receive antennas is much higher than the number of transmit antennas), linear detectors such as ZF and MMSE perform close to the optimum thanks to channel hardening and become attractive from an implementation point of view. However, in this case the spectral efficiency is limited by the number of transmit antennas, which must be low. Hence, we lose one of the most important benefits of large MIMO systems. To get high spectral efficiency, both MIMO system dimensions should be large, and the performance of linear detectors degrades in that case. To overcome this problem, local search-based algorithms well suited for large-scale MIMO systems, like Likelihood Ascent Search (LAS) [9] and Reactive Tabu Search (RTS) [10], achieve near-optimal performance while keeping the same order of complexity as linear detectors. The idea is to start from a first solution delivered by linear detection and to improve it by searching the neighbors of this solution for a better local minimum.
Compressed sensing (CS) techniques have attracted considerable attention. They suggest that it may be possible to go beyond the traditional limits of sampling theory. Thanks to a sparse transformation of the received signal and the use of source separation techniques such as basis pursuit (BP), which searches for a sparse solution vector, it becomes possible to successfully recover the desired signal [11]-[13]. In [14], exploiting the simplicity of finite-alphabet signals [15] (all their elements are bounded), we proposed a low-complexity detector suitable for large-scale MIMO systems with the ability to handle the underdetermined case. Unlike [11]-[13], no signal transformation is required apart from the real-valued formulation of the problem. The detection is based on a quadratic criterion with bound constraints to recover the received signal. We have shown that this algorithm performs as well as [11]-[13] with less complexity, and better than MMSE at a computational cost of the same order.
Forward error correction (FEC) is usually applied before modulation. Turbo-like receivers [16], based on iterative exchanges of information between their components (detection, synchronization, decoding, channel estimation, ...), have proven to be efficient in achieving almost optimal performance [17]. To this end, the authors proposed in [18] to combine CS-based detection with a soft-decision decoder as part of an iterative process based on a regularized detection criterion associated with a judicious sparse formulation of the detector output. However, the regularization parameter is set empirically and the output of the FEC decoder is preprocessed, which can degrade the conveyed information.
In this paper, we consider large-scale MIMO systems with an independent and identically distributed (i.i.d.) channel model. This model corresponds to a spatially white MIMO channel, which in practice can occur in rich scattering environments with multipath components uniformly distributed in all directions. We define iterative receivers that use the simplicity-based detection algorithm proposed in [14], called the Finite Alphabet Simplicity (FAS) algorithm in the rest of the paper.
In the first part, we focus on large-scale uncoded MIMO systems. We propose a novel successive interference cancellation algorithm with iterative processing based on the shadow area principle [19], which exploits symbol reliability. We analytically optimize its parameters by taking advantage of the theoretical analysis of the detector output for a two-iteration process. For a higher number of iterations, the optimization is carried out empirically from simulations. The idea is to consider a set of reliable symbols at each iteration, contrary to the classical interference cancellation procedure, which considers only one symbol as reliable, and then to eliminate their contribution to the large-scale MIMO system in order to improve the detection of the unreliable symbols in the next stages. The goal is to reduce the number of necessary iterations compared to usual interference cancellation schemes, which need as many iterations as there are symbols. The proposed algorithms are also extended to frequency-selective channels, and we investigate the effect of spatial channel correlation on the algorithm performance.
In the second part, we assume that large-scale MIMO systems are FEC-encoded and our goal is to define an iterative turbo-like receiver. We propose an iterative receiver based on an ML-like detection whose restricted candidate subset is defined by the FAS detection output. We also present another receiver based on the FAS algorithm whose criterion is penalized by the mean absolute error function.
Our contributions are as follows. For uncoded large-scale MIMO systems: (i) an iterative FAS algorithm that uses shadow-area constraints, and (ii) an analytical expression of the parameter defining the shadow area. For FEC-encoded large-scale MIMO systems: (iii) a reduction of the complexity of ML detection by restricting the candidate subset using the FAS algorithm output, (iv) to further reduce the receiver complexity, a second iterative receiver whose detection is based on a regularization of the FAS criterion without preprocessing of the FEC decoder output, and (v) an analytical setting of its regularization parameter.
The paper is organized as follows. Section II describes the large-scale MIMO system model and recalls the FAS detection scheme proposed in [14] together with the main theoretical results. Section III deals with the iterative detection problem in the uncoded case, solved by the shadow-area principle applied with the FAS algorithm. Section IV focuses on the design of turbo-like iterative receivers based on FAS detection. Finally, Section V concludes the paper.
Notations: Boldface lower case letters and boldface upper case letters denote vectors and matrices, respectively. The notations $(\cdot)^T$, $(\cdot)^H$ and $(\cdot)^*$ are used for the transpose, conjugate transpose and conjugate operations, respectively. The Kronecker product is denoted by $\otimes$. $\mathbf{I}_k$ is the $k \times k$ identity matrix and $\mathbf{1}_k$ is the all-one vector of size $k$. The complex-valued vector $\mathbf{z} \in \mathbb{C}^k$ can be transformed to a real-valued vector in $\mathbb{R}^{2k}$ defined by $[\mathrm{Re}(\mathbf{z})^T \; \mathrm{Im}(\mathbf{z})^T]^T$. The complex-valued matrix $\mathbf{H} \in \mathbb{C}^{n \times N}$ is transformed also to a real-valued matrix H ∈

II. SYSTEM MODEL AND OVERVIEW
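We consider a noisy mixing model which can be described by the following linear equation:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\zeta}, \qquad (1)$$
where $\mathbf{y} \in \mathbb{C}^n$ is the complex-valued observation vector, $\mathbf{x} \in \mathbb{C}^N$ is the complex-valued source vector, $\mathbf{H} \in \mathbb{C}^{n \times N}$ is a complex-valued random matrix and $\boldsymbol{\zeta} \in \mathbb{C}^n$ is the additive noise vector. The components of $\mathbf{H}$ are assumed to be independent and circularly symmetric Gaussian with zero mean and unit variance. The vector $\mathbf{x}$ belongs to a complex finite alphabet. It can be written as $\mathbf{x} = \mathbf{a} + j\mathbf{b}$ where $(\mathbf{a}, \mathbf{b}) \in F^N \times F^N$ and $F = \{\alpha_1, \alpha_2, \ldots, \alpha_p\}$ is the real-valued alphabet. We denote by $M = p^2$ the complex alphabet size. The equivalent real-valued system can then be written as:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\zeta}, \qquad (2)$$
where $\mathbf{y} \in \mathbb{R}^{2n}$, $\mathbf{H} \in \mathbb{R}^{2n \times 2N}$, $\mathbf{x} \in F^{2N}$ and $\boldsymbol{\zeta} \in \mathbb{R}^{2n}$ now denote the real-valued counterparts of the quantities in (1). The elements of $F$ are assumed to be equiprobable under the realization of $\mathbf{x}$. Our problem is, given $\mathbf{H}$ and $F$, to recover $\mathbf{x}$ from $\mathbf{y}$.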
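As an illustration of the real-valued equivalent system, the following minimal sketch stacks real and imaginary parts with NumPy. The block layout of the real-valued channel matrix, the helper name `to_real`, the noise level and the system sizes are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def to_real(H, y):
    """Real-valued equivalent of a complex linear model (assumed stacking:
    real parts first, imaginary parts second)."""
    H_r = np.block([[H.real, -H.imag],
                    [H.imag,  H.real]])          # shape (2n, 2N)
    y_r = np.concatenate([y.real, y.imag])       # shape (2n,)
    return H_r, y_r

rng = np.random.default_rng(0)
n, N = 6, 4                                      # illustrative sizes only
F = np.array([-1.0, 1.0])                        # real alphabet of QPSK (p = 2)

# i.i.d. circularly symmetric Gaussian entries with unit variance
H = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
x = rng.choice(F, N) + 1j * rng.choice(F, N)     # x = a + jb with a, b in F^N
noise = 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = H @ x + noise

H_r, y_r = to_real(H, y)
x_r = np.concatenate([x.real, x.imag])
noise_r = np.concatenate([noise.real, noise.imag])
```

With this stacking, `y_r` equals `H_r @ x_r + noise_r`, so any real-valued detector can be applied to `(H_r, y_r)` directly.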
In order to solve the above problem, we briefly describe the detection technique proposed previously in [14]. For that purpose, we introduce the following definition: Definition 1 (Simplicity [20]): The simplicity property of $F$ is exploited to propose an optimization problem whose complexity is independent of the constellation size and is lower than that of [13], while achieving the same error rate. The vector $\mathbf{x}$ is simple: its components are bounded below by $\alpha_1$ and above by $\alpha_p$. It can be decomposed as $\mathbf{x} = \mathbf{B}_\alpha \mathbf{r}$ where $\mathbf{B}_\alpha = \mathbf{I}_{2N} \otimes [\alpha_1, \alpha_p]$ and $\mathbf{r} \in [0, 1]^{4N}$. We use this decomposition to define the simplicity-based optimization problem given by [14]
$$\arg\min_{\mathbf{r}} \|\mathbf{y} - \mathbf{H}\mathbf{B}_\alpha \mathbf{r}\|_2 \quad \text{subject to} \quad \mathbf{B}_1 \mathbf{r} = \mathbf{1}_{2N} \ \text{and} \ \mathbf{r} \geq 0, \qquad (3)$$
where $\mathbf{B}_1 = \mathbf{I}_{2N} \otimes \mathbf{1}_2^T$. The optimization problem defined by (3) is a quadratic programming model. The linear equality constraint combined with the positivity constraint ensures that the detected vector is bounded below by $\alpha_1$ and above by $\alpha_p$.
The criterion (3) can be optimized by interior point methods [21] or by the simplex method [22]. In this paper, algorithms based on interior point methods are considered. These algorithms start by finding an interior point of the polytope satisfying the constraints and then move inside the polytope to converge to the optimal solution. The resulting detector is referred to as the Finite Alphabet Simplicity (FAS) detector in the remainder of the paper.
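To make the bounded quadratic criterion concrete, the sketch below solves a box-constrained least squares problem on the real-valued model. It assumes that the equality and positivity constraints on r reduce to the bounds α_1 ≤ x_k ≤ α_p on each component of x, as stated above, and it deliberately swaps the interior point method of the paper for plain projected gradient descent to stay dependency-free. The function names `fas_detect` and `hard_decision`, the iteration count and the test dimensions are illustrative choices.

```python
import numpy as np

def fas_detect(H, y, alpha_min, alpha_max, n_iter=500):
    """Minimise ||y - H x||^2 subject to alpha_min <= x <= alpha_max
    by projected gradient descent (NOT the interior point method of the paper)."""
    step = 1.0 / np.linalg.norm(H, 2) ** 2           # 1 / (largest singular value)^2
    x = np.full(H.shape[1], 0.5 * (alpha_min + alpha_max))  # start inside the box
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y)                     # gradient of 0.5 * ||y - H x||^2
        x = np.clip(x - step * grad, alpha_min, alpha_max)  # project onto the box
    return x

def hard_decision(x_hat, F):
    """Map each soft output to the nearest element of the real alphabet F."""
    F = np.asarray(F, dtype=float)
    return F[np.argmin(np.abs(x_hat[:, None] - F[None, :]), axis=1)]

# Noiseless toy example with a real-valued channel (illustrative sizes)
rng = np.random.default_rng(1)
H = rng.standard_normal((16, 8))
x_true = rng.choice([-1.0, 1.0], size=8)
x_soft = fas_detect(H, H @ x_true, -1.0, 1.0)
x_hard = hard_decision(x_soft, [-1.0, 1.0])
```

The soft output `x_soft` always stays inside the interval delimited by the extreme alphabet values, which is the property the later reliability analysis relies on.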
Theorem 3 in [14] demonstrates that the system condition $\frac{n}{N} > \frac{p-1}{p}$ is a necessary condition for the uniqueness of the FAS detection solution in the noiseless case.
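As a quick worked check of this condition, the following sketch evaluates it for a few system sizes. The helper name is hypothetical, and the mapping from modulation to p follows M = p^2 (QPSK gives p = 2, 16-QAM gives p = 4, 64-QAM gives p = 8).

```python
def fas_uniqueness_condition(n, N, p):
    """True when n/N > (p-1)/p, the necessary condition quoted from [14].
    Integer cross-multiplication avoids floating-point comparisons."""
    return n * p > N * (p - 1)

# Illustrative configurations: (n, N, p)
checks = {
    "30x35 QPSK": fas_uniqueness_condition(30, 35, 2),    # 30/35 > 1/2
    "64x64 64-QAM": fas_uniqueness_condition(64, 64, 8),  # 1 > 7/8
    "30x64 QPSK": fas_uniqueness_condition(30, 64, 2),    # 30/64 < 1/2
}
```

The last configuration violates the condition, so uniqueness cannot be guaranteed for it even without noise.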
Our purpose is to include the FAS algorithm within an iterative detection scheme for either uncoded or FEC-encoded large-scale MIMO systems. Both cases require knowledge of the analytical distribution of the detector output, which was established in [14]. Let $\hat{\mathbf{r}}$ denote the solution of (3). Theorem 2 (Statistical Distribution of the Detection Output): The components of the FAS detector output $\hat{\mathbf{x}} = \mathbf{B}_\alpha \hat{\mathbf{r}}$ follow a censored normal distribution whose probability density function (4) is given in [14]. Simulations were carried out in [14] to support the theoretical study, and they showed that the analytical distribution coincides with the simulated histogram for different system dimensions and different SNR values. In practice, channel state information needs to be estimated. FAS detection was shown to be robust to channel estimation inaccuracy in terms of error rate [14].

III. ITERATIVE DETECTION BASED ON THE SHADOW AREA PRINCIPLE
In this section, our purpose is to improve the FAS detection performance by including it within an iterative detection procedure. For that purpose, we consider the shadow area constraints (SAC) used in [23] to limit error propagation [24], [25] in successive interference cancellation (SIC) schemes. Contrary to usual SIC, multiple feedback SIC with shadow area constraints (MF-SIC-SAC) feeds back more than one constellation point to the interference cancellation (IC) stage. Symbols are selected according to whether or not they fall in a shadow area. In [23], the parameter that defines the shadow areas is fixed empirically.
Herein, we propose to apply a similar approach to FAS detection and to exploit the theoretical distribution of its output (4) to fix the optimal parameter that limits the shadow areas.

A. SHADOW AREA AND DETECTION RELIABILITY
In the detection method described in the previous section, all sources are detected at once and some decisions may be less reliable than others. In this section, we propose a reliability measure based on the shadow area principle [19], [26], [27] that exploits the output statistics recalled in Section II. We first define the centers as the elements of $F$. The principle is to take decisions on the components $x_k$ whose estimate $\hat{x}_k$ is close enough to one center and to cancel their contribution from the observation $\mathbf{y}$ before proceeding to a new detection iteration. To do so, we take into account the reliability of the outputs $\hat{x}_k$. According to (4), the distribution of $\hat{x}_k$ given $x_k = \alpha_i$ has a Gaussian shape centered on $\alpha_i$, and moving away from the center makes the symbol less reliable. From this observation, we define shadow areas as intervals whose midpoint is not a center and whose width depends on a threshold to be fixed hereinafter. $\hat{x}_k$ is considered unreliable when it falls in a shadow area, and reliable otherwise. We take decisions on the reliable $\hat{x}_k$, cancel their contribution from $\mathbf{y}$ and proceed to another detection. Adjacent to the shadow areas, the high-reliability intervals have length $2\eta$ and are centered on the symbols of $F$. Let $A$ denote the set of indices $k$ such that $\hat{x}_k$ is considered reliable. The decision on $\hat{x}_k$, $k \in A$, is taken as the nearest symbol value in $F$. We denote by $\hat{\mathbf{x}}_A$ the resulting decision vector. The equivalent notations for unreliable elements (falling in shadow areas) are $\bar{A}$ for the set of indices and $v_N$ for its cardinality. The observation after interference cancellation is denoted by $\tilde{\mathbf{y}}$ and equals
$$\tilde{\mathbf{y}} = \mathbf{y} - \mathbf{H}_A \hat{\mathbf{x}}_A,$$
where $\mathbf{H}_A$ gathers the columns of $\mathbf{H}$ indexed by $A$. The task is to estimate the vector $\mathbf{x}_{\bar{A}}$, which can be recovered by solving the FAS problem (3) with $\tilde{\mathbf{y}}$ and $\mathbf{H}_{\bar{A}}$ in place of $\mathbf{y}$ and $\mathbf{H}$. The shadow area constrained (SAC)-FAS detection procedure is detailed in Algorithm 1. The performance of the proposed iterative procedure highly depends on the choice of the parameter $\eta$, which needs optimization.
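The reliability test and the cancellation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Algorithm 1 itself: an output is declared reliable when it lies within η of its nearest alphabet element (high-reliability intervals of length 2η), and the function names are hypothetical.

```python
import numpy as np

def split_reliable(x_hat, F, eta):
    """Return a boolean mask of reliable outputs and the nearest alphabet
    element of every output. Outputs farther than eta from every element
    of F fall in a shadow area and are flagged unreliable."""
    F = np.asarray(F, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    nearest = F[np.argmin(np.abs(x_hat[:, None] - F[None, :]), axis=1)]
    reliable = np.abs(x_hat - nearest) <= eta
    return reliable, nearest

def cancel_reliable(H, y, x_hat, F, eta):
    """Subtract the contribution of hard decisions taken on reliable outputs.
    Returns the cleaned observation, the channel columns of the unreliable
    symbols, and the reliability mask."""
    reliable, nearest = split_reliable(x_hat, F, eta)
    y_tilde = y - H[:, reliable] @ nearest[reliable]
    return y_tilde, H[:, ~reliable], reliable

# Toy example: two of the four outputs are close enough to a center
H = np.arange(20, dtype=float).reshape(5, 4)
x_true = np.array([1.0, -1.0, -1.0, 1.0])
x_hat = np.array([0.95, -0.2, -1.0, 0.6])
y_tilde, H_rest, reliable = cancel_reliable(H, H @ x_true, x_hat, [-1.0, 1.0], eta=0.3)
```

A second detection pass would then be run on `(H_rest, y_tilde)` to re-estimate only the symbols flagged as unreliable.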

1) ITERATIVE FAS-SAC FOR k = 2
We chose to use the error probability as the optimization criterion. The error probability is a monotonically increasing function of the variance of the components of the detector output, which can be calculated for the second iteration output only. We therefore propose to optimize the parameter $\eta$ so as to minimize the variance $\sigma_x^2$. Theorem 3 provides an approximation of $\sigma_x^2$.

Theorem 3 (Variance of the Iterative Detector-Output):
Let x be the output of Algorithm 1 for a given η ∈ R + .Then, the variance of its components can be approximated by: where Z η is the probability that xk is reliable and is equal to Y η is the variance of the components of the vector xA given by and σ 2 ζ (η) is the variance of the components of the vector ζ : Proof 1 (Proof of Theorem 3): We prove hereinafter equation (9).The proof of equations ( 10) and ( 11) is provided in the Appendices.Let x the output of Algorithm 1 for a given η ∈ R + .Then, the variance of its components is given by: Using the definitions of Z η and Y η , we can write: As mentioned earlier, their expression is computed in the Appendix.To compute the variance var(x k | k ∈ A), let us study the covariance matrix xA by exploiting the fact that the number of elements of A denoted by v N is a random variable independent from x.Therefore, xA is defined by: By using the conditional expectation on the random variable v N , we can rewrite equation ( 15) as follows: By assuming that the vector xA can be estimated by: and substituting equation (16) in equation ( 15), we can compute the covariance matrix as By using the conditional expectation on the random matrix H A , the second expectation of previous equation can be rewritten as follows: η) I n by using the independence between H A and H A .Therefore, the covariance matrix can be rewritten as where we have exploited that, for a given v N , the matrix (H T A H A ) −1 follows an inverse Wishart distribution and then [28]).Let us mention that xA is proportional to the identity matrix.The distribution of v N is provided by following Proposition 4.
From Proposition 4, as the probability of the event {v N ≥ 2n − 1} is not significant, this event will be neglected in the calculation of xA .The diagonal elements of xA are given by var Applying the same reasoning as in [14], we can obtain the following approximation for the variance var(x k | k ∈ A).

Lemma 5 (Variance Approximation): The variance of the components of the vector xA can be approximated for n
The proposed iterative process can be extended beyond two iterations. However, as previously mentioned, the choice of the limitation parameter $\eta$ becomes more complicated. The analytical calculation of the variance of the output at iteration $k > 2$ is not possible because of the hard decisions taken on reliable symbols at each iteration. To overcome this difficulty, we apply an empirical selection strategy, which is illustrated in Fig. 1. At the end of each iteration, we compute the output histogram and we choose the largest possible parameter that minimizes the error probability of hard decisions on the elements selected as reliable. The parameter $\eta$ is chosen by considering the place where the tail of the conditional histogram of the adjacent symbol vanishes. The number of necessary iterations highly depends on the SNR value.
In practical systems, the values of the parameter η over iterations can be empirically fixed and stored, by running simulations and by applying the above strategy for different SNR values and for different system dimensions.
In the simulation results section, the proposed schemes will be mentioned as FAS and FAS-SAC for the original and iterative algorithms respectively.

B. SIMULATION RESULTS
In this section, we evaluate the performance of the proposed FAS-SAC detection for a QAM constellation with different modulation orders and different number of iterations.

1) ITERATIVE FAS-SAC FOR k = 2
Herein, we consider FAS-SAC with two iterations. We check the validity of the theoretical analysis and we optimize the parameters through simulations. In Fig. 2, the variance of the detector output is plotted as a function of $\eta$ for different SNR values, with $N = 35$, $n = 30$ and QPSK. The parameter $\eta$ should be chosen so as to minimize the variance.
In Fig. 3, we plot the BER after the first (FAS) and second (FAS-SAC) iterations of the proposed detection, compared to the RTS and LAS algorithms, for $N = 64$, $n = 64$ and different $M$-QAM constellations ($M = p^2 = 4$, 16 and 64). We observe that the proposed FAS-SAC detection improves the performance of the FAS algorithm at all SNR values and for all QAM modulations. For instance, FAS-SAC detection achieves a gain of around 2 dB to 3 dB at a BER of $10^{-3}$. We also show that FAS-SAC better exploits the receive diversity than the LAS detector and achieves a gain that gets higher as the BER decreases or $M$ increases. For QPSK, FAS-SAC outperforms LAS by 2.2 dB at a BER of $10^{-3}$; the gain increases with $M$ and reaches 7 dB at a BER of $10^{-2}$ for 64-QAM.
As for the RTS algorithm, we observe that it outperforms FAS by 1 dB at a BER of $10^{-2}$ for QPSK. As the modulation order increases, FAS becomes better than RTS below a given BER value ($5 \times 10^{-4}$ for 16-QAM, $4 \times 10^{-3}$ for 64-QAM), with a flattening of the RTS performance curve. FAS-SAC performs close to RTS for QPSK and becomes better than RTS below a BER of $3 \times 10^{-3}$ for 16-QAM and $2 \times 10^{-2}$ for 64-QAM.
The comparison of the FAS and FAS-SAC detectors with the MMSE-SIC detector is considered for a $64 \times 64$ determined MIMO system with QPSK in Fig. 4 and with 16-QAM in Fig. 5. The FAS-SAC detector better exploits the receive diversity than the MMSE-SIC detector and achieves a gain that gets higher as the BER decreases or $M$ increases. For QPSK, FAS-SAC outperforms MMSE-SIC by 2.4 dB at a BER of $10^{-2}$ and by more than 4 dB at a BER of $10^{-4}$. For 16-QAM, the gain equals 4.3 dB at a BER of $10^{-2}$ and 6 dB at a BER of $10^{-4}$.
In Fig. 6 and 7, we consider underdetermined systems with QPSK and 16-QAM respectively. We observe that the proposed FAS-SAC algorithm performs remarkably well even in underdetermined configurations. For instance, at a BER of $10^{-4}$, the gains of FAS-SAC over FAS vary between 1 dB and 2 dB for QPSK and between 2 dB and 3.8 dB for 16-QAM.

2) ITERATIVE FAS-SAC FOR k > 2
Let us now consider the case of a higher number of iterations $k$. We fixed the maximum number of iterations to 8 and we show the BER performance for different modulation orders. Note that the maximum number of iterations can be reduced for low SNR values, as there is no improvement after a given low number of iterations. The same observation holds for high SNR values: a few iterations are sufficient to reach convergence and perform close to the lower bound.
In Fig. 8, we consider QPSK modulation and $k = 8$. We plot the BER performance of the system with an AWGN channel (no interference) as a lower bound. We observe that the proposed FAS-SAC performs remarkably well. For instance, at a BER of $10^{-4}$, the gains of FAS-SAC with $k = 8$ over FAS and over FAS-SAC with $k = 2$ are about 3.3 dB and 1.2 dB respectively. Furthermore, FAS-SAC with $k = 8$ performs very close to the lower bound, with a gap of 0.4 dB at a BER of $10^{-4}$. In Fig. 9, we consider 16-QAM modulation. FAS-SAC with $k = 8$ always performs better than FAS and FAS-SAC with $k = 2$. For instance, at a BER of $10^{-4}$, the gains of FAS-SAC with $k = 8$ over FAS and over FAS-SAC with $k = 2$ are about 6 dB and 2.2 dB respectively.
In Fig. 10, we show the impact of high dimensions on the performance of the FAS-SAC algorithm. When the system dimensions increase, FAS-SAC becomes even more efficient, and beyond a given SNR value its performance converges towards the AWGN channel lower bound. For instance, in the $128 \times 128$ case, FAS-SAC with $k = 8$ reaches the lower bound from SNR = 10 dB. However, for $64 \times 64$, we observe a minimum gap of 0.3 dB compared to the lower bound. These results can be explained by the channel hardening phenomenon.

3) FREQUENCY SELECTIVE LARGE-SCALE MIMO CHANNEL
In this section, we consider the frequency-selective large-scale MIMO channel where $L$ multipaths interfere at every channel use. The output of the frequency selective channel at time $t$ is then written as: where $\mathbf{H}_l$ represents the realization of the frequency selective channel of the $l$-th path. In order to decode the symbol vector, we consider the whole frame and we propose a joint detection problem. The received vector is formulated as:
In Fig. 11, we consider a $64 \times 64$ MIMO system using QPSK with $T = 15$. We compare the FAS-SAC algorithm performance for different multipath frequency-selective channels. We observe a performance loss when the number of multipaths increases. However, this degradation can be limited by considering a coded case and exploiting the diversity of the frequency-selective channel.

4) EFFECT OF SPATIAL CORRELATION IN LARGE-SCALE MIMO SYSTEMS
In practice, some spatial correlation exists due to the antenna array geometry and the propagation conditions, which makes the i.i.d. model inadequate. This spatial correlation can affect the rank of the large-scale MIMO channel matrix, resulting in degraded channel capacity.
In this section, we consider a spatially-correlated MIMO channel and we investigate the performance of the FAS-SAC detector. We adopt the Kronecker product model [29], where the complex large-scale MIMO channel matrix can be written as:
$$\mathbf{H} = \mathbf{R}_r^{1/2}\, \mathbf{H}_0\, \mathbf{R}_t^{1/2},$$
where $\mathbf{R}_r$ and $\mathbf{R}_t$ are the receive and transmit antenna correlation matrices respectively, defined as in [30], [31] by $\mathbf{R}_t = \mathbf{R}_r = (\rho_{i,j})$ and $\rho_{i,j} = J_0\!\left(\frac{2\pi}{\lambda} d(i, j)\right)$, in which $d(i, j)$ is the distance between antennas $i$ and $j$, is the angle spread, $\lambda$ is the wavelength, and $J_0(x)$ is the zeroth-order Bessel function of the first kind. $\mathbf{H}_0$ represents an i.i.d. Rayleigh fading channel matrix. In this model, the fading statistics of the receive and transmit sides are assumed to be independent. Let us mention that this model does not take into account the scattering environment between the transmit and receive sides.
VOLUME 8, 2020
In Fig. 12, we show the effect of channel correlation on the FAS-SAC algorithm. We consider a $128 \times 128$ MIMO system using QPSK modulation, with the distance between antenna elements fixed to $0.4\lambda$. We assess the performance under i.i.d. fading and under spatially correlated fading. When the MIMO channel is correlated, we observe a degradation of the error rate due to the rank reduction of the MIMO channel matrix. However, the slope of the curve remains as steep as in the i.i.d. case. To limit the impact of correlation, we can increase the number of receive antennas or introduce a forward error correcting code.
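A correlated channel realization of this kind can be sketched as below. The sketch assumes the common form of the Kronecker model in which the i.i.d. matrix is multiplied on both sides by matrix square roots of the correlation matrices, a uniform linear array, and the correlation coefficient exactly as printed above (without the angle spread). The Bessel function is evaluated by numerical integration to avoid an extra dependency; all function names are hypothetical.

```python
import numpy as np

def bessel_j0(x, m=4096):
    """Zeroth-order Bessel function of the first kind, from the integral
    J0(x) = (1/pi) * int_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    theta = (np.arange(m) + 0.5) * np.pi / m
    x = np.asarray(x, dtype=float)
    return np.mean(np.cos(x[..., None] * np.sin(theta)), axis=-1)

def correlation_matrix(n_ant, spacing_in_wavelengths):
    """rho[i, j] = J0(2*pi*d(i, j)/lambda) for a uniform linear array (assumed geometry)."""
    idx = np.arange(n_ant)
    d = spacing_in_wavelengths * np.abs(idx[:, None] - idx[None, :])
    return bessel_j0(2.0 * np.pi * d)

def psd_sqrt(R):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(R)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def correlated_channel(n, N, spacing, rng):
    """One channel draw: sqrt(R_r) @ H0 @ sqrt(R_t), with H0 i.i.d. Rayleigh."""
    H0 = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    return psd_sqrt(correlation_matrix(n, spacing)) @ H0 @ psd_sqrt(correlation_matrix(N, spacing))

R = correlation_matrix(4, 0.4)                     # 0.4-wavelength spacing
H_corr = correlated_channel(4, 4, 0.4, np.random.default_rng(0))
```

For large arrays the argument of `bessel_j0` grows with the aperture, so the number of integration points `m` should be kept well above the largest argument.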

C. COMPLEXITY ANALYSIS
In this section, we compare the complexity of the proposed iterative algorithms with that of FAS and MMSE-SIC. Inner iteration refers to the iterations involved in the interior point method. Table 1 summarizes the complexity order of the different algorithms. We observe that FAS-SAC with $k = 2$ and $k > 2$ incurs an additional complexity over the original algorithm due to the number of iterations, which is fixed and independent of the system dimensions. The whole complexity is dominated by the complexity of the first iteration. Nevertheless, we get the same order of complexity $O(N^3)$ for FAS and its iterative FAS-SAC versions. The complexity of the LAS algorithm is dominated by three operations [9]. The first one is the computation of the initial solution, which can be found by the MMSE algorithm and induces a complexity order of $O(N^3)$ due to the matrix inversion. The second one is the calculation of $\mathbf{H}^T\mathbf{H}$, which also has a complexity order of $O(N^3)$. The final one, the search operation, requires a complexity order of about $O(N^2)$. Therefore, the total complexity is about $O(N^3) + O(N^2)$, dominated by the first two steps. RTS has the same order of complexity as the LAS algorithm, with an extra complexity due to the implemented escape strategy [10]. To conclude, LAS and RTS have the same order of complexity as the proposed FAS algorithm and its iterative FAS-SAC versions, with an extra complexity due to their iterative process.

IV. PROPOSED TURBO DETECTION SCHEME
In this section, we focus on FEC-coded large-scale MIMO systems. Our goal is the design of an iterative receiver consisting of a detector based on the FAS algorithm and a soft-input soft-output FEC decoder. The best iterative receiver of the state of the art includes a soft-input soft-output maximum-likelihood detection. It provides the FEC decoder with log-likelihood ratios (LLR) whose computation involves all possible transmitted sequences, which limits its practical use to low-order modulations and low-dimensional systems. In this section, we first propose to use FAS detection to reduce the set involved in the computation of the log-likelihood ratios. Although decreased, the computation cost of the resulting receiver remains high in the case of high-order modulations. We then design a second iterative receiver, whose detection uses an optimized regularization of the FAS criterion. Compared to the first proposed scheme, the complexity of the second one is significantly lower at the cost of a limited performance loss.

A. ITERATIVE RECEIVER PRINCIPLE AND NOTATIONS
Let us first mention the assumptions regarding the transmitter.The binary stream is considered to be FEC-encoded, then randomly interleaved before being converted into QAM symbols and passed through a serial-to-parallel converter.
Let $m = \log_2(p)$ and let $\mathbf{c}$ be the coded and interleaved binary information sequence of length $L$. Let also $\psi$ be the binary-to-symbol conversion defined by: and $c^{(j)} = \psi^{-1}(\alpha_j)$. The receiver structure is depicted in Fig. 13. $\mathrm{dec}_{in}$ and $\mathrm{dec}_{out}$ stand for the soft FEC input and output LLR respectively. The two proposed iterative schemes differ in the definition of the detection block. We denote by $\mathrm{det}_{in}$ and $\mathrm{det}_{out}$ the detection input and output respectively. $\mathrm{det}_{in}$ is defined as the interleaving of the difference between $\mathrm{dec}_{out}$ and $\mathrm{dec}_{in}$ (extrinsic information).
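The construction of the detector input from the decoder LLRs can be sketched as follows. The pseudo-random permutation, the seed and the function names are assumptions made for the example; the paper only states that the interleaver is random.

```python
import random

def make_interleaver(length, seed=0):
    """Hypothetical random interleaver: a fixed pseudo-random permutation."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)
    return perm

def interleave(values, perm):
    """Output position pos carries the input value at index perm[pos]."""
    return [values[i] for i in perm]

def deinterleave(values, perm):
    """Inverse of interleave for the same permutation."""
    out = [0.0] * len(perm)
    for pos, i in enumerate(perm):
        out[i] = values[pos]
    return out

def detector_input(dec_out, dec_in, perm):
    """Extrinsic information (dec_out - dec_in), interleaved for the detector."""
    extrinsic = [o - i for o, i in zip(dec_out, dec_in)]
    return interleave(extrinsic, perm)

perm = make_interleaver(5)
det_in = detector_input([1.5, -2.0, 0.5, 3.0, -1.0], [0.5, -1.0, 0.5, 1.0, 0.0], perm)
```

Subtracting the decoder input before feeding the result back is what keeps the exchanged quantity extrinsic, so the detector does not reuse information it produced itself.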

B. FAS MAXIMUM LIKELIHOOD LIKE ITERATIVE RECEIVER (FAS-ML)
Usual turbo-detection schemes are based on a ML detection followed by a decoder [32].In such a scheme the detection output det out is defined as follows: where c = 2c − 1 and X k, corresponds to the set of sequences x such that c k = .
The complexity of such a detection increases exponentially with $M$ and $N$. Therefore, we propose to reduce the complexity by replacing the full set of sequences with a limited-size candidate subset. For that purpose, we first run the FAS detection once and make a hard decision on its output. We denote by $\mathbf{x}_{\mathrm{det\,out}}$ this hard decision output. Then we define the subset such that it includes $\mathbf{x}_{\mathrm{det\,out}}$ and the sequences $\mathbf{x}$ that differ from $\mathbf{x}_{\mathrm{det\,out}}$ in one element. To limit the size of this subset, we only take neighbors of $\mathbf{x}_{\mathrm{det\,out}}$. More precisely, if $\mathbf{x}_{\mathrm{det\,out}}$ and $\mathbf{x}$ differ in their $i$-th element, then $x_i$ is an adjacent symbol of the $i$-th element of $\mathbf{x}_{\mathrm{det\,out}}$ in $F$. After the initialization step during which the FAS detection is carried out, an iterative process is applied, alternating between an ML-like detection and a FEC decoder. The ML-like detection computes $\mathrm{det}_{out}$ from $\mathrm{det}_{in}$ and $\mathbf{y}$ as follows: In the remainder of the paper, we refer to the resulting iterative receiver as FAS-ML. In the case of uniform square constellations and except for $M = 4$, each symbol has at most three neighbors, and thus the complexity of FAS-ML only depends on the length of $\mathbf{c}$.
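The candidate subset can be enumerated as in the sketch below. It assumes that adjacency is taken in the sorted real-valued alphabet F, so that each component has at most two neighbors; the function name is hypothetical and the LLR computation over the subset is not shown.

```python
def candidate_subset(x_hard, F):
    """Hard-decision vector plus every vector obtained by moving exactly one
    component to an adjacent level of the sorted real alphabet F (assumption)."""
    levels = sorted(F)
    candidates = [tuple(x_hard)]
    for i, symbol in enumerate(x_hard):
        j = levels.index(symbol)
        for nb in (j - 1, j + 1):
            if 0 <= nb < len(levels):        # skip neighbors outside the alphabet
                cand = list(x_hard)
                cand[i] = levels[nb]
                candidates.append(tuple(cand))
    return candidates

F = [-3, -1, 1, 3]                           # real alphabet of 16-QAM (p = 4)
subset = candidate_subset([-3, 1], F)
```

The subset size grows linearly with the vector length (at most one plus twice the number of components here), instead of exponentially as for the full set of sequences.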

C. FAS MEAN ABSOLUTE ERROR-BASED ITERATIVE RECEIVER (FAS-MAE)
To further reduce the receiver complexity, we propose a second receiver whose detection is based on a regularization of the FAS criterion.The receiver structure is detailed in Fig. 14.
Compared to [33], two major differences can be highlighted.First, the FEC output is directly exploited without any preprocessing in order to preserve the information.Secondly, the regularization parameter is optimized and an analytical expression is given.

1) NEW DETECTION CRITERION DESIGN
The first modification compared to [33] is the use of the Mean Absolute Error (MAE) computed from the conditional probabilities $\Pr(x_k = \alpha_j \mid \mathrm{det}_{in})$. We denote this error by ε(x, x| det in ) and we define it by where in provided by the FEC decoder, we compute Pr x k = α j | det in as follows: Pr . The MAE is introduced as a regularization term in the FAS criterion to define the following optimization problem: arg min where $\gamma$ is a positive weight less than 1. In the remainder of the paper, the resulting iterative receiver is referred to as FAS-MAE. On the one hand, the regularization term is integrated to penalize the minimization criterion in order to ensure that the detection output remains near the decoder output. On the other hand, $\gamma$ makes it possible to regulate the contribution of the extrinsic information provided by the FEC decoder and to question the reliability of the FEC decision if necessary. The goal is also to ensure that the resulting vector $\mathbf{r}$ is sparse. We mention that its sparsity must be imposed to take into account the probabilities delivered by the decoder.
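The role of the MAE term can be illustrated with the following sketch, which is an assumption-laden reading rather than the definition of the paper: the LLR convention log(P(c=1)/P(c=0)), the independence of the bits of a symbol, the normalization by the vector length and all function names are hypothetical choices made for the example.

```python
import math

def bit_prob_one(llr):
    """P(c = 1) under the assumed convention llr = log(P(c=1) / P(c=0))."""
    return 0.5 * (1.0 + math.tanh(0.5 * llr))

def symbol_probs(llrs, labels):
    """Pr(x_k = alpha_j | det_in) for each alphabet element, where labels[j]
    is the bit tuple mapped to alpha_j and bits are treated as independent."""
    probs = []
    for bits in labels:
        p = 1.0
        for b, llr in zip(bits, llrs):
            p1 = bit_prob_one(llr)
            p *= p1 if b == 1 else 1.0 - p1
        probs.append(p)
    return probs

def mae_penalty(x, F, probs_per_component):
    """Average over components of sum_j Pr(x_k = alpha_j) * |x_k - alpha_j|."""
    total = 0.0
    for x_k, probs in zip(x, probs_per_component):
        total += sum(p * abs(x_k - a) for a, p in zip(F, probs))
    return total / len(x)

F = [-1.0, 1.0]
uniform = symbol_probs([0.0], [(0,), (1,)])          # uninformative decoder
penalty_uniform = mae_penalty([0.0], F, [uniform])
penalty_certain = mae_penalty([1.0], F, [[0.0, 1.0]])
```

When the decoder is certain and the detector agrees, the penalty vanishes; an uninformative decoder penalizes every candidate value by a comparable amount, so the data-fidelity term dominates the decision.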

2) OPTIMIZATION OF THE REGULARIZATION PARAMETER
The performance of the proposed FAS-MAE detector depends strongly on the choice of the regularization parameter. However, its optimization is difficult: it depends on many parameters, among which the SNR value and the level of the probabilities Pr(x_k = α_j | det in) (whether or not they are close to their bounds 0 and 1).
According to the proposed optimization criterion, the algorithm convergence is optimal when the cost function tends to 0, that is to say when condition (26) is satisfied. The analytical determination of γ from (26) is not possible, as it requires the analytical distribution of the FEC output, which is not available. We propose two ways to optimize γ. The first one is empirical and uses pilot symbols. The second one gives an analytical expression for γ. In the simulations, the first one is used as a benchmark for the second one and is referred to as FAS-MAE (genie).
The first optimization requires a pilot sequence xpilot. Pilot symbols are usually inserted within the data frame to help synchronization and parameter estimation. Their positions and values are perfectly known at the receiver. Assuming the transmission of the pilot sequence, we perform only one iteration (both detection and decoding) and we compute ||y − HB α r|| 2 by considering the true values of r and the values of P_j delivered by the decoder. We then fix γ to the corresponding ratio.
This first optimization method suffers from two drawbacks. First, it requires pilots, which reduces the spectral efficiency. Secondly, a detection step followed by a decoding step must be carried out. The second method overcomes both drawbacks by providing an analytical expression for γ. The criterion in (25) defines an ℓ1-norm penalized least-squares estimator similar to the one studied in [34]. The regularization term in (25) can then be seen as a weighted ℓ1 term, and we propose to fix γ as developed in [34]. The resulting expression (27) depends on the noise variance and on the system dimensions.

3) DEFINITION OF THE DECODER INPUT
In this part, we focus on the information exchange from the detector to the decoder. Contrary to FAS-ML, we use the statistical distribution of the FAS detection output established in [14].
Using the detector output $x^{\mathrm{det}}_{\mathrm{out}}$, the symbol-to-binary converter (SBC) computes the log-likelihood ratio (LLR) of the i-th bit associated with the k-th symbol (det out (km + i)). We note that an empirical study showed that the expression of σ_x given in (6) remains valid throughout the iterative process.
f xk |x k =α j is given by (4). In [33], we used a Gaussian approximation combined with the LogSumExp approximation [35] to avoid floating-point saturation problems, especially at high SNR and after some iterations. Doing so degrades the information available for the symbol decisions that equal the alphabet bounds. In this paper, we overcome the problem by proposing a new LLR approximation that takes into account the hard decisions available at the FAS-MAE output, previously denoted $x^{\mathrm{det}}_{\mathrm{out},k}$. Performance is significantly improved thanks to this new approximation, as shown in the simulations of Section IV-D.
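The floating-point saturation issue mentioned for the approach of [33] can be illustrated with a short Python sketch. It assumes a Gaussian likelihood around each alphabet point with standard deviation `sigma`, a hypothetical Gray labelling, and the sign convention LLR = log(P(b = 0)/P(b = 1)). It shows the classical log-sum-exp stabilization only; it is not the new LLR approximation proposed in this paper.

```python
import numpy as np


def logsumexp(v):
    """Stable log(sum(exp(v))): factor out the largest exponent first."""
    v = np.asarray(v, dtype=float)
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))


def bit_llrs(x_hat, sigma, alphabet, labels):
    """Bit LLRs of one detected real symbol under a Gaussian model (assumed).

    alphabet : the p real levels alpha_j
    labels   : shape (p, m), bit label of each level (hypothetical mapping)
    """
    alphabet = np.asarray(alphabet, dtype=float)
    labels = np.asarray(labels)
    # Log-likelihood of x_hat given each level, up to a common constant
    log_lik = -((x_hat - alphabet) ** 2) / (2.0 * sigma ** 2)
    return np.array([
        logsumexp(log_lik[labels[:, i] == 0])
        - logsumexp(log_lik[labels[:, i] == 1])
        for i in range(labels.shape[1])
    ])


F = [-3.0, -1.0, 1.0, 3.0]
gray = [[0, 0], [0, 1], [1, 1], [1, 0]]
# Very small sigma (high SNR): a direct ratio of exponentials would give 1/0.
llr = bit_llrs(-3.0, 0.01, F, gray)
```

Computing the ratio of sums directly would underflow to zero in the denominator for such a small `sigma`, whereas the log-domain form stays finite.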

D. SIMULATION RESULTS
In this section, we study the performance of the proposed FAS-ML and FAS-MAE iterative schemes. We also compare them to the Turbo-MMSE detector and to the iterative receiver introduced in [33].
We will observe that, as established in [14], FAS detection is well suited to underdetermined systems provided the recovery success condition is satisfied.

1) COMPARISON OF FAS-MAE TO FAS-SSE
FAS-MAE is an enhanced version of the receiver proposed in [33], which is referred to as FAS-SSE (for soft symbol error) in the simulations. In [33], the detection criterion uses a soft symbol decision $x^{\mathrm{det}}_{\mathrm{in}}$, and γ is chosen empirically. The performance of FAS-MAE with γ 2 and of FAS-SSE with an empirically optimized γ is compared in Fig. 15 and Fig. 16 for N = 64, n = 64 and 50, and 16-QAM. We observe the efficiency of both the new criterion and the optimization of γ, as FAS-MAE outperforms FAS-SSE (gains of roughly 1.0 dB and 0.8 dB at BER = $10^{-3}$ for n = 50 and n = 64 respectively). The gain slowly increases with the SNR.
We now consider MIMO systems with N = 64, L = 256, QPSK modulation and n = 64, 50 and 40 in Fig. 17, 18 and 19 respectively. We have plotted the BER measured at the FEC decoder output after 6 iterations for the FAS-ML, FAS-MAE and Turbo-MMSE receivers.

2) OPTIMIZATION OF THE REGULARIZATION PARAMETER γ (FIG. 17, 18 AND 19)
We observe that the proposed FAS-MAE (genie) and FAS-MAE perform the same, which supports the choice of the analytical expression (27) used to fix the penalization parameter.

3) COMPARISON OF FAS-MAE TO FAS-ML
To compare FAS-MAE with FAS-ML, let us study the influence of the modulation order. Recall that, to reduce the candidate subset at a given position in x, FAS-ML considers all candidates in the case of QPSK, while it restricts itself to adjacent neighbours for higher-order modulations. As a consequence, FAS-ML outperforms FAS-MAE for QPSK but achieves lower performance for 16-QAM. Whereas the gain of FAS-ML over FAS-MAE varies between 0.2 and 0.5 dB at BER = $10^{-4}$ depending on n for QPSK, we observe for 16-QAM a degradation of FAS-ML relative to FAS-MAE of about 2 dB and 2.6 dB at BER = $10^{-4}$ for n = 64 and n = 50 respectively.

4) COMPARISON OF FAS-MAE AND TURBO-MMSE (FIG. 17, 18 AND 19)
In all cases, FAS-ML and FAS-MAE outperform the Turbo-MMSE detection. The more underdetermined the system, the higher the gain: the gain of FAS-MAE over Turbo-MMSE is about 1.25 dB for n = 64, 1.5 dB for n = 50 and 2 dB for n = 40 at BER = $10^{-4}$. Fig. 20 gathers all configurations for the three receivers (FAS-MAE, FAS-ML, Turbo-MMSE). We observe that they achieve similar diversity orders and differ in coding gain.

E. COMPLEXITY ANALYSIS
In this section, we compare the complexity of the proposed iterative algorithms FAS-ML and FAS-MAE, where K denotes the number of iterations. FAS-ML is an iterative algorithm based on the ML criterion, whose complexity increases exponentially with the modulation order M and the system dimension N, yielding a complexity order of $O(K N^{\log M})$. In contrast, FAS-MAE relies on a quadratic criterion with a penalization function, with a complexity order of $O(K N^{3})$.

V. CONCLUSION
This paper focused on finite-alphabet iterative source recovery for large-scale MIMO systems, either uncoded or coded. For the uncoded case, we developed an iterative FAS algorithm that uses shadow area constraints with an optimized shadow-area-defining parameter. The simulation results showed that the proposed FAS-SAC algorithm significantly outperforms the standard FAS and MMSE-SIC algorithms with the same order of computational complexity. Then, for the FEC-encoded case, we introduced the FAS-ML receiver, which reduces the complexity of ML detection by restricting the candidate subset based on the FAS algorithm output. To further reduce the receiver complexity, we proposed the FAS-MAE receiver, whose detection is based on a regularization of the FAS criterion without any preprocessing of the FEC decoder output and whose regularization parameter is fixed analytically. Simulations showed that both receivers outperform Turbo-MMSE in all cases and that FAS-MAE achieves better results (lower error rate and lower complexity) than FAS-ML for M-QAM with M > 4.
As we can define α − α i = ( − i) , we have, that is to say Finally, after simplifications, we obtain

B. PROOF OF THE EXPRESSION OF Y η
Let us now compute the variance of the elements of A, denoted by Y η and defined as: Focusing on the first term of Eq. (32), we get: As the distribution of x is an even function and the real constellation F = {α_1, α_2, ..., α_p} is symmetric with respect to the origin, we get E[xk | k ∈ A] = 0. The second term of Eq. (32) is computed as: Following the same approach as for Z η, we finally get: 2 dt.
The Dirac delta function and the indicator function of a subset A are denoted by δ(•) and 1_A(•) respectively.

Proposition 4: The cardinality of the set A follows the binomial distribution with parameters 2N and $(1 - Z_\eta)$.
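As a small numerical illustration of Proposition 4, a binomial distribution with parameters 2N and (1 − Z_η) has probability mass function Pr(|A| = a) = C(2N, a) (1 − Z_η)^a Z_η^(2N − a) and mean 2N(1 − Z_η). The Python sketch below evaluates both; the function names and the value chosen for Z_η are hypothetical and serve only as an example.

```python
from math import comb


def cardinality_pmf(a, N, z_eta):
    """P(|A| = a) for a binomial law with parameters 2N and (1 - z_eta)."""
    n = 2 * N
    p = 1.0 - z_eta
    return comb(n, a) * p ** a * (1.0 - p) ** (n - a)


def expected_cardinality(N, z_eta):
    """Mean of the binomial law: 2N (1 - z_eta)."""
    return 2 * N * (1.0 - z_eta)


# Illustrative values only: N = 8 antennas, Z_eta = 0.3 (arbitrary).
pmf = [cardinality_pmf(a, 8, 0.3) for a in range(2 * 8 + 1)]
mean = expected_cardinality(8, 0.3)
```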

FIGURE 1. Empirical choice of the parameter η_i for iteration i.

FIGURE 2. FAS output variance as a function of the parameter η for SNR = 15 to 30 dB (top to bottom) and 16-QAM.

FIGURE 4. BER performance of FAS-SAC detection compared to MMSE-Successive Interference Cancellation (SIC) for N = n = 64 and QPSK.

FIGURE 11. BER performance of FAS-SAC detection for N = n = 64 and QPSK. Frequency-selective channel.

TABLE 1. Computational cost with the interior point method.

FIGURE 12. BER performance of FAS-SAC detection for N = n = 128 and QPSK. Spatially-correlated channel.