Uplink Signal Detection for Scalable Cell-Free Massive MIMO Systems With Robustness to Rate-Limited Fronthaul

We consider the problem of uplink signal detection in scalable cell-free mMIMO (CF-mMIMO) systems subject to limited fronthaul link capacity and highly correlated channel conditions. Unlike centralized MIMO systems, in which all receive antennas are placed at a central access point (CAP), in the CF-mMIMO architecture the CAP serving a given area also uses information (i.e., channel estimates and received signals) collected by a set of surrounding access points (APs). For such a scenario, two new robust receivers are designed, which can combat the effects of limited fronthaul capacity by leveraging knowledge of the heteroscedastic covariance of the resulting effective noise. The first receiver, which has a higher complexity but yields the best performance, is based on an expectation propagation (EP) approach, while the second employs the effective noise heteroscedastic covariance in a generalized least squares (GLS) variation of the maximum likelihood (ML) detection problem. Simulation results confirm the efficacy of both proposed receivers, which are further employed to empirically study the optimum distribution of antennas among the CAP and APs.


I. INTRODUCTION
With the ever-growing demand for higher data rates, lower latency and greater coverage, multiple-input multiple-output (MIMO) systems will continue to develop as a key technology to meet the heterogeneous requirements of the fifth generation (5G) and sixth generation (6G) networks [1], incorporating higher frequency bands and a larger number of antennas. In particular, massive MIMO (mMIMO) technology, where base stations (BSs) equipped with massively large antenna arrays simultaneously serve multiple users, has attracted attention over the last decade.
To offer a brief chronology of the subject, the effectiveness of low-complexity maximum ratio combining (MRC) receivers for mMIMO systems under the assumption of uncorrelated channels was first demonstrated in [2]. Soon after, however, it was argued that employing such a large number of antennas at a single BS leads in practice to channel correlation, limiting the system performance [3]. Aiming at ironing out this limitation, the concept of CF-mMIMO, where antenna elements are distributed to multiple APs, was subsequently proposed and gained significant attention [4]-[7].
The associate editor coordinating the review of this manuscript and approving it for publication was Stefan Schwarz.
The main idea of CF-mMIMO is to virtually configure a mMIMO system from spatially distributed APs, connected through wired fronthaul links to a common high-performance central processing unit (CPU). While the CF-mMIMO approach helps avoid correlation, it suffers from its own practical limitations, in particular: a) the scalability of the system, and b) the limited capacity of wired fronthaul links, neither of which is negligible as the numbers of APs, of users, and/or of antennas increase.
The scalability problem of CF-mMIMO systems has been addressed by the dynamic cooperation clustering (DCC) method [8], in which the CPU-AP connections change dynamically so as to form local clusters [7]- [10], composed of local CPUs assisted by a number of distributed APs that gather observations for actual multiuser symbol detection.
This architecture is proposed in various related articles [11]-[17] and aims to add feasibility to CF-mMIMO systems, since connecting a massive number of APs directly to a single CPU via perfect fronthaul links is infeasible due to the associated equipment and maintenance costs. For the same reason of lending feasibility and scalability to the CF-mMIMO architecture, the local CPUs (hereafter referred to as CAPs) are also assumed to be connected to the various APs in their clusters via fronthaul links of limited capacity, which may also be subjected to channel correlation.
To the best of our knowledge, a robust multiuser detection mechanism for scalable CF-mMIMO systems with highly correlated channels, which takes into account the heteroscedasticity of the effective noise resulting from the limited capacity of fronthaul links to the CAP, has not yet been well investigated. In this article, we therefore contribute to the solution of this standing problem by proposing a new detection scheme aimed at improving the detection performance of scalable CF-mMIMO systems subject to the practical constraints of limited fronthaul links.
The remainder of the article is organized as follows. In Section II we describe the scalable CF-mMIMO system model and introduce the optimization problem associated with the corresponding multiuser MIMO detection. In Section III we first obtain a benchmark solver for the latter problem based on the expectation propagation (EP) method, and then cast the scalable CF-mMIMO multiuser receiver problem into a discreteness-aware (DA) setting, proposing another solver based on a proximal gradient (PG) mechanism, which results in a discreteness-aware variation of the zero-forcing (ZF) receiver accelerated via a greedy-type approach. In Section IV we first evaluate the performance of the new receivers via software simulations, and subsequently employ the scheme to investigate the impact of different antenna configurations. Brief concluding remarks are finally offered in Section V.
The following notation is used throughout the article. Scalars are denoted by italic letters, with the letter $i$ reserved for the imaginary unit. Real-valued column vectors and matrices are denoted by upright bold lowercase and uppercase letters, as in $\mathbf{v}$ and $\mathbf{V}$, respectively, while complex-valued column vectors and matrices are represented by the corresponding italic letters, as in $\boldsymbol{v}$ and $\boldsymbol{V}$, respectively. The Frobenius and $\ell_2$ norms are denoted by $\|\cdot\|_F$ and $\|\cdot\|_2$, respectively. The transpose, conjugate transpose and real part operators are denoted by $(\cdot)^T$, $(\cdot)^H$ and $\Re[\cdot]$, respectively. The $N \times N$ identity matrix, the $N \times 1$ all-one column vector and the Kronecker product are denoted by $\mathbf{I}_N$, $\mathbf{1}_N$ and $\otimes$, respectively. The notation $n \sim \mathcal{N}(\mu_n, \sigma_n^2)$ or $n \sim \mathcal{CN}(\mu_n, \sigma_n^2)$ is used to indicate that a random variable $n$ follows a real or complex Gaussian distribution, respectively, with mean $\mu_n$ and variance $\sigma_n^2$.

A. CONTRIBUTIONS
The following is a summary of the article's contributions.
• New EP-based Robust CF-mMIMO Receiver: A new receiver leveraging the EP algorithm is developed, which is shown to be highly effective in taking advantage of the information collected from distributed APs in a correlated CF-mMIMO system subjected to severe fronthaul capacity limitation. Despite its relatively large computational complexity (cubic in the dimension of the channel), the scheme is feasible for reasonably-sized systems and serves as a performance benchmark for CF-mMIMO systems with limited CAP-to-AP links.
• New Low-Complexity Robust CF-mMIMO Receiver: Another robust receiver for CF-mMIMO systems is proposed, based on a discreteness-aware GLS formulation of the CF-mMIMO detection problem, solved via a proximal gradient approach, which is found to come close to the performance of the aforementioned benchmark EP-based scheme, despite having computational complexity that is only quadratic in the number of users.

• Insights on Optimal Antenna Concentration Ratio:
Making use of the proposed robust receivers, a simulation-based study of the performances achieved under different CF-mMIMO configurations is performed, which shows that depending on the capacity limitations of fronthaul links, different ratios between the number of antennas concentrated at the CAP versus those distributed to APs should be selected in order to optimize the system performance.

II. SYSTEM MODEL
Consider the uplink signal detection problem in a scalable CF-mMIMO system employing the user-centric DCC architecture described in [7], [8], whereby the CPU utilizes knowledge of the large-scale fading coefficients of user equipments (UEs) to cluster APs for scalability and mitigation of inter-cluster interference. After such a procedure, each local cluster can be considered individually; each cluster possesses a CAP that controls it and detects transmitted symbols with locally estimated channel state information (CSI). In light of the latter, consider the model of a local cluster of a scalable CF-mMIMO system as shown in Figure 1, in which the receivers are indexed by $\ell \in \mathcal{L}$, with $0$ indicating the CAP and the other integers corresponding to the remaining APs clockwise. As for the UEs served by the system, it is assumed that $K$ single-antenna UEs are distributed randomly following a Poisson Point Process (PPP) with intensity $\mu$ within a circle of radius $D_U$ [m] encircling the cluster, as shown in Figure 1. Following [7], [18], the channel between each UE and a receiver (i.e., CAP or AP) is correlated, with mean power $\delta$ determined by the urban micro cell path loss model [19], namely,
$$\delta(d_{\ell,k}) = 22.7 + 36.7\log_{10}(d_{\ell,k}) + 26\log_{10}(f_c), \qquad (1)$$
where $f_c$ denotes the carrier frequency and $d_{\ell,k}$ is the distance between the $\ell$-th receiver and the $k$-th UE, with $\ell \in \mathcal{L}$ and $k \in \{1, 2, \ldots, K\}$. Given the latter large-scale factor, the associated correlated channel vector can be expressed as [20]
$$\boldsymbol{h}_{\ell,k} \triangleq \sqrt{\rho\,10^{-\delta(d_{\ell,k})/10}}\;\boldsymbol{R}_{\ell,k}^{1/2}\,\check{\boldsymbol{h}}_{\ell,k},$$
where $\rho$ is the transmit (TX) power of each UE, $\check{\boldsymbol{h}}_{\ell,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N_\ell}) \in \mathbb{C}^{N_\ell \times 1}$, and $\boldsymbol{R}_{\ell,k}$ is the spatial correlation matrix, with $N_\ell$ and $\theta_{\ell,k}$ denoting the number of antennas and the angle of arrival at the $\ell$-th receiver, respectively. Thanks to the DCC architecture [7], [8], the corresponding complex baseband signal $\boldsymbol{y}_\ell \in \mathbb{C}^{N_\ell \times 1}$ at the $\ell$-th receiver can be written independently of the other clusters as
$$\boldsymbol{y}_\ell = \boldsymbol{H}_\ell\,\boldsymbol{s} + \boldsymbol{n}_\ell,$$
where $\boldsymbol{H}_\ell \triangleq [\boldsymbol{h}_{\ell,1}, \ldots, \boldsymbol{h}_{\ell,K}] \in \mathbb{C}^{N_\ell \times K}$ is a matrix concatenating the channel vectors associated with the $K$ single-antenna UEs, $\boldsymbol{s} \in \mathbb{C}^{K}$ denotes an information symbol vector stacking the scalar symbols from the UEs, and $\boldsymbol{n}_\ell \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I}_{N_\ell})$ describes the additive white Gaussian noise (AWGN) vector at the $\ell$-th receiver.
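As a rough illustration of the channel model just described, the following Python sketch evaluates the path loss of equation (1) and draws one correlated channel vector. It is only a sketch under stated assumptions: the distance is taken in metres and the carrier frequency in GHz, the channel is formed as the square root of the linear gain times a square root of the correlation matrix times an i.i.d. complex Gaussian vector, and the exponential correlation matrix `exp_correlation` is a hypothetical stand-in, since the article's own correlation matrix is not reproduced here.

```python
import numpy as np

def path_loss_db(d_m, fc_ghz):
    """Path loss of equation (1) in dB (assumed units: metres and GHz)."""
    return 22.7 + 36.7 * np.log10(d_m) + 26.0 * np.log10(fc_ghz)

def exp_correlation(n, r):
    """Hypothetical stand-in correlation matrix with entries r**|i - j|."""
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

def correlated_channel(d_m, fc_ghz, rho, R, rng):
    """Draw h = sqrt(rho * 10**(-PL/10)) * R^(1/2) * g, with g ~ CN(0, I)."""
    n = R.shape[0]
    g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    # Hermitian square root of R via its eigendecomposition
    w, V = np.linalg.eigh(R)
    R_half = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T
    gain = rho * 10.0 ** (-path_loss_db(d_m, fc_ghz) / 10.0)
    return np.sqrt(gain) * (R_half @ g)

rng = np.random.default_rng(0)
R = exp_correlation(10, 0.9)          # 10 antennas, assumed correlation 0.9
h = correlated_channel(100.0, 2.0, 1.0, R, rng)
pl = path_loss_db(100.0, 2.0)         # path loss at 100 m and 2 GHz
```

Stacking `K` such vectors column-wise gives the per-receiver channel matrix used in the received-signal model.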
It is considered that the CAP aims at detecting the intended symbol vector $\boldsymbol{s}$ by taking advantage of not only its own received signal vector $\boldsymbol{y}_0$ but also those obtained from the APs, which are subjected to distortion due to the limited fronthaul link capacity $C_f$ [11].
Following related literature [21], we further assume that perfect channel estimation is performed individually at each AP so as to distribute the computational tasks and reduce the amount of information flowing through the fronthaul links.
In order to minimize the distortion due to the imperfect fronthaul links, the raw received signals $\boldsymbol{y}_\ell$ as well as the channel estimates are compressed via the singular value decomposition (SVD)-based latent semantic analysis (LSA) technique of [22], such that the compression yields the compressed channel matrix $\tilde{\boldsymbol{H}}_\ell$. As shown in [11], the distortion resulting from the limited wired fronthaul links and the corresponding compression strategy described above can be modeled as AWGN, such that the collected CSI $\hat{\boldsymbol{H}}_\ell$ and received signals $\hat{\boldsymbol{y}}_\ell$ can be written as their compressed counterparts corrupted by the additive terms $\boldsymbol{W}_\ell$ and $\boldsymbol{w}_\ell$, respectively, where $\mathrm{vec}(\boldsymbol{W}_\ell) \sim \mathcal{CN}(\mathbf{0}, \sigma_H^2\mathbf{I})$ and $\boldsymbol{w}_\ell \sim \mathcal{CN}(\mathbf{0}, \sigma_y^2\mathbf{I})$ denote the AWGN induced onto the channel estimates and the received signals, respectively, by the rate-limited fronthaul links.
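To make the compress-then-distort model above more concrete, the sketch below applies a rank-$r$ truncated SVD to a channel matrix and then adds complex Gaussian noise. This is an illustration only: the rank `r`, the noise variance and the helper names are assumptions chosen for the example, and the rule that ties them to the fronthaul capacity in [11] and [22] is not reproduced.

```python
import numpy as np

def truncated_svd(H, r):
    """Best rank-r approximation of H (Eckart-Young) and all singular values."""
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vh[:r, :], s

def add_fronthaul_noise(X, var, rng):
    """Add circularly symmetric complex Gaussian noise of variance `var` per entry."""
    W = rng.standard_normal(X.shape) + 1j * rng.standard_normal(X.shape)
    return X + np.sqrt(var / 2.0) * W

rng = np.random.default_rng(1)
H = rng.standard_normal((10, 6)) + 1j * rng.standard_normal((10, 6))
H_r, sv = truncated_svd(H, 4)                 # assumed rank r = 4
H_hat = add_fronthaul_noise(H_r, 1e-2, rng)   # assumed distortion variance
err = np.linalg.norm(H - H_r, "fro")          # energy lost by the truncation
```

The truncation error equals the root of the sum of the squared discarded singular values, which is what makes the SVD a natural tool for shrinking the data sent over a constrained link.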

III. PROPOSED RECEIVER DESIGN

A. PROBLEM FORMULATION
It follows from equations (4) and (5) that the effective received signal vector at the CAP after gathering information from the APs can be written compactly as
$$\boldsymbol{y} = \boldsymbol{H}\boldsymbol{s} + \boldsymbol{n}, \qquad (6)$$
where $\boldsymbol{y} \in \mathbb{C}^{L'}$, $\boldsymbol{n} \in \mathbb{C}^{L'}$ and $\boldsymbol{H} \in \mathbb{C}^{L' \times K}$ are implicitly defined, with $L' \triangleq \sum_{\ell \in \mathcal{L}} N'_\ell$ denoting the effective receive antenna dimension at the CAP.
Given the distorted received signal vector modeled in equation (6), the optimal ML estimate $\hat{\boldsymbol{s}}_{\mathrm{ML}}$ is a solution of the following least squares (LS) minimization
$$\hat{\boldsymbol{s}}_{\mathrm{ML}} = \arg\min_{\boldsymbol{s} \in \mathcal{S}^K} \|\boldsymbol{y} - \boldsymbol{H}\boldsymbol{s}\|_2^2, \qquad (7)$$
where $\mathcal{S}$ denotes the constellation set of the digital modulation employed. We remark, however, that the optimization problem described by equation (7) is non-deterministic polynomial-time (NP) hard, such that finding a solution requires reformulated approaches which, while feasible, are still capable of fully leveraging the system architecture (e.g., the discreteness of symbols and statistical knowledge of the distortion from the limited fronthaul links) so as to overcome the imperfection caused by rate-limited fronthaul links [23]. Two distinct examples of such feasible solutions of the scalable CF-mMIMO problem described above are offered in the subsequent sections: the first has a higher complexity but better performance, and is therefore considered as a benchmark solution, while the second trades a slight performance degradation for a significant reduction of computational complexity.
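To see why the problem of equation (7) does not scale, it can help to write the exhaustive search out. The sketch below enumerates all $P^K$ candidate vectors for a small QPSK example; the dimensions, the random channel and the noise-free observation are assumptions made purely for illustration.

```python
import itertools
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def ml_detect(y, H, constellation=QPSK):
    """Exhaustive LS search over all P**K symbol vectors (exponential in K)."""
    K = H.shape[1]
    best, best_cost = None, np.inf
    for cand in itertools.product(constellation, repeat=K):
        s = np.array(cand)
        cost = np.linalg.norm(y - H @ s) ** 2
        if cost < best_cost:
            best, best_cost = s, cost
    return best

rng = np.random.default_rng(2)
K, L_eff = 3, 8                       # toy sizes: 4**3 = 64 candidates
H = (rng.standard_normal((L_eff, K)) + 1j * rng.standard_normal((L_eff, K))) / np.sqrt(2)
s_true = rng.choice(QPSK, size=K)
y = H @ s_true                        # noise-free observation for a sanity check
s_hat = ml_detect(y, H)
```

With $K = 10$ users and QPSK the same loop would already visit $4^{10}$ (about one million) candidates per detection, which motivates the reformulations that follow.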

B. BENCHMARK: EXPECTATION PROPAGATION-BASED ROBUST RECEIVER
In this subsection, we propose an EP-based receiver for scalable CF-mMIMO systems subject to heterogeneous noise variances due to limited fronthaul links, as described above.
It is well known (see e.g., [24]-[26]) that EP algorithms consist of two modules, hereafter referred to as Modules A and B, respectively.
In Module A, a replica of the received signal $\boldsymbol{y}$ is constructed on the basis of the tentative symbol estimates $\hat{s}_k$ processed by soft interference cancellation, yielding the signal $\tilde{\boldsymbol{y}}_k$ of equation (8). Assuming that the residual component of equation (8) can be approximated by a multivariate complex Gaussian random variable, in conformity with the central limit theorem and the Gaussianity of $\boldsymbol{n}$, the conditional probability density function (PDF) of $\tilde{\boldsymbol{y}}_k$ for a given $s_k$ can be written as in equation (9), where the estimated covariance matrix of the residual component is given by equation (10), in which $\varphi_k$ is the error variance associated with the estimate of $s_k$. With the aim of leveraging Bayesian statistics for symbol detection, we then compute the log likelihood ratio (LLR) of the associated posterior extrinsic belief, which for the sake of simplicity is done here under the assumption that quadrature phase shift keying (QPSK) modulation with unit power is employed. (Other types of modulation, such as quadrature amplitude modulation (QAM), can also be considered, leading only to slightly more complicated expressions; see e.g., [25].) The soft value consisting of the LLRs for QPSK signaling can then be described mathematically as in equation (13), with the normalization factor $\nu_k \triangleq 1 - \zeta_k\varphi_k$. In turn, in Module B the soft estimates of the mean and variance of the tentative symbol estimate $\hat{s}_k$ are obtained based on the posterior belief $\beta_k$ in the LLR domain. In particular, owing to Bayes' rule, the conditional expectation of $s_k$ for a given $\beta_k$ can be written as in equation (15a) [25], such that its associated error variance is given by equation (15b). To suppress the harmful effects of self-feedback of estimates over iterations, moment matching is further applied to equation (15a), such that the tentative mean estimate and its error variance are adjusted as in [25].
VOLUME 9, 2021
In summary, the algorithm outlined above is such that the heteroscedasticity of the effective noise $\boldsymbol{n}$ in equation (6) is incorporated in the residual noise covariance computed in equation (10), and utilized both to calculate the extrinsic belief in equation (13) and to update the mean and variance of the soft estimates for each symbol, which in turn are iteratively improved over multiple message passing iterations. The procedure is summarized as pseudo code in Algorithm 1.
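The Module B step described above, in which an LLR-domain belief is turned into a soft symbol and an error variance, can be sketched as follows for unit-power QPSK. This is a sketch under an assumed convention, not the article's exact expressions: the real and imaginary parts of `beta` are taken to be the natural-log likelihood ratios of the in-phase and quadrature bits, so the scaling may differ from that of equations (15a) and (15b).

```python
import numpy as np

def qpsk_soft_estimate(beta):
    """Conditional mean and variance of unit-power QPSK symbols given LLRs.

    Assumed convention: Re(beta) and Im(beta) are ln P(bit=+1)/P(bit=-1)
    for the in-phase and quadrature components, respectively.
    """
    beta = np.asarray(beta, dtype=complex)
    mean = (np.tanh(beta.real / 2.0) + 1j * np.tanh(beta.imag / 2.0)) / np.sqrt(2)
    var = 1.0 - np.abs(mean) ** 2      # E|s|^2 = 1 for unit-power QPSK
    return mean, var

# Zero belief gives no information; a strong belief gives a hard symbol.
mean, var = qpsk_soft_estimate(np.array([0.0, 40.0 + 40.0j]))
```

The variance shrinking towards zero as the beliefs grow is what lets the soft interference cancellation of Module A sharpen over the iterations.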

C. LOW-COMPLEXITY SOLUTION: GLS-BASED ROBUST DISCRETE-AWARE ZF RECEIVER
The benchmark solution offered in the above subsection is, albeit very elegant, of complexity that is cubic in the effective dimension $L'$ of the channel, such that an alternative solution with lower complexity is also desired. To this end, let $\boldsymbol{p} \in \mathbb{C}^{P}$ denote a complex-valued vector containing all distinct symbols of the employed constellation $\mathcal{S}$ of cardinality $P \triangleq |\mathcal{S}|$, such that $\boldsymbol{M}_p \triangleq \mathbf{I}_K \otimes \boldsymbol{p}^T$ describes the corresponding code book giving all possible outcomes.
From the above, any symbol vector $\boldsymbol{s}$ can be represented unequivocally by introducing a binary vector $\mathbf{x}$ satisfying
$$\boldsymbol{s} = \boldsymbol{M}_p\,\mathbf{x}$$
under the constraints $\mathbf{1}_K = \mathbf{M}_1\mathbf{x}$ and $\mathbf{x} \in \{0, 1\}^{KP}$, with the mapping matrix $\mathbf{M}_1 \triangleq \mathbf{I}_K \otimes \mathbf{1}_P^T$. Notice that the mapping above is indeed unequivocal (i.e., bijective), since $\mathbf{1}_K = \mathbf{M}_1\mathbf{x}$ ensures that each of the $K$ adjacent length-$P$ partitions of $\mathbf{x}$ sums to one, which by force of $\mathbf{x} \in \{0, 1\}^{KP}$ means that each partition has a single non-zero element, exactly equal to 1. Consequently, the ML detection problem described by equation (7) can be rewritten without loss of optimality as
$$\min_{\mathbf{x} \in \{0,1\}^{KP}} \|\boldsymbol{y} - \boldsymbol{H}\boldsymbol{M}_p\mathbf{x}\|_2^2 \quad \text{s.t.} \quad \mathbf{1}_K = \mathbf{M}_1\mathbf{x}.$$

Algorithm 2: RPDAZF receiver
Internal Parameters: covariance matrix of the effective noise, constellation vector $\boldsymbol{p}$, minimum update distance $\Delta x_{\min}$, maximum number of iterations $J_{\max}$, scaling factor $q > 1$, and shrinkage parameter $0 < \xi < 1$.
Compute the Lipschitz constant $L_c$ by (28);
Construct the mapping matrix $\boldsymbol{M}_p \triangleq \mathbf{I}_K \otimes \boldsymbol{p}^T$;
Initialize parameters $j = 0$, $\Delta x = 1$ and $\gamma \triangleq 1/L_c$;
Compute $r^{(0)} = q\gamma$ and $\mathbf{x}^{(0)}$ by (29);
while $j \leq J_{\max}$ and $\Delta x \geq \Delta x_{\min}$ do
    Update $\mathbf{t}^{(j)}$ as per equation (26b);
    ...
        Reset momentum as $\mathbf{t}^{(j)} = \mathbf{x}^{(j)}$;
    end if
    ...
        Shrink step size by $r^{(j)} = \max(\xi r^{(j-1)}, \gamma)$;
    end if
    ...
    Increment $j = j + 1$;
end while
output $\hat{\boldsymbol{s}} = \boldsymbol{M}_p\mathbf{x}^{(j)}$

Notice that the receiver formulated in equation (19) does not require knowledge of the interference-plus-noise power at the distributed receivers, and in that sense bears a similarity to the classic ZF receiver. On the other hand, it can also be readily inferred from equation (19) that, unlike the classic ZF receiver, a detector based on the solution of equation (19) offers better bit error rate (BER) performance thanks to the built-in enforcement that the solution belongs to the constellation. This feature, referred to as discreteness-awareness, will be demonstrated in the sequel via simulation results. Due to these properties, we refer to this receiver as the PDAZF.
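The binary representation of the symbol vector used above can be checked numerically with a few lines of Python. The sketch builds the code book matrix and the all-ones mapping matrix via Kronecker products for QPSK; the helper `symbols_to_binary` and the chosen indices are hypothetical and serve only to illustrate the mapping.

```python
import numpy as np

p = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)   # QPSK, P = 4
K, P = 3, p.size
M_p = np.kron(np.eye(K), p[None, :])         # K x KP code book matrix
M_1 = np.kron(np.eye(K), np.ones((1, P)))    # K x KP block-sum matrix

def symbols_to_binary(idx, P):
    """One-hot encode the constellation index of each user into a KP vector."""
    idx = np.asarray(idx)
    x = np.zeros(idx.size * P)
    x[np.arange(idx.size) * P + idx] = 1.0
    return x

idx = [2, 0, 3]                  # constellation index chosen by each user
x = symbols_to_binary(idx, P)
s = M_p @ x                      # recovers the symbol vector
block_sums = M_1 @ x             # equals the all-ones vector
```

Each length-$P$ block of `x` contains exactly one 1, so the pair of constraints selects exactly one constellation point per user.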
The bottleneck of PDAZF is, however, that no statistical knowledge of the distortion due to the rate-constrained fronthaul links is incorporated in the detection design, which may result in severe performance degradation when the fronthaul link capacity $C_f$ is low. Recalling equation (6), the difficulty in detecting symbols given the distorted received signal $\boldsymbol{y}$ lies in its block-wise heteroscedasticity, that is, the fact that each sub-vector in $\boldsymbol{y}$ possesses a different noise variance.
The core idea of the proposed low-complexity detector to mitigate this additional challenge is to leverage the GLS approach, whereby the heteroscedastic covariance is utilized to unbias the LS term so as to minimize the Mahalanobis distance, yielding the problem of equation (20). The latter minimization problem is a convex quadratic program, which can be cast into the proximal form of equation (21), with the utilities defined in equations (22). Since the loss function $f(\mathbf{x})$ implicitly defined in equation (20a) is convex and the constraint sets in equations (22) are both convex, the minimization problem expressed as equation (21) can be efficiently solved by PG methods.
To this end, let us start by evaluating the gradient of $f(\mathbf{x})$, given by equation (23), and defining the projection operators for the utilities $g_1(\mathbf{x})$ and $g_2(\mathbf{x})$ as in equation (24), where $\max(\cdot, \cdot)$ and $\min(\cdot, \cdot)$ denote the element-wise max and min operators, respectively. Given that PG methods can be accelerated by various efficient techniques, we here adopt the greedy acceleration method of [28], by which the solution of the optimization problem given in equation (21) is iteratively obtained from the updates of equation (26). The step size $r^{(j)}$ used in each iteration is initialized and then updated based on the hyper-parameters $q$ and $\xi$, with the standard step size (a.k.a. learning rate) optimally determined [29] as the reciprocal of the Lipschitz constant $L_c$ of equation (28), i.e., $\gamma \triangleq 1/L_c$, while the initial point $\mathbf{x}^{(0)}$ is computed via equation (29), with $c_{k,p} = |p_p - \check{s}_k|^{-1}$ and $\check{\boldsymbol{s}} \triangleq \boldsymbol{H}^{-1}\boldsymbol{y} \in \mathbb{C}^{K \times 1}$.
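The GLS idea described above can be illustrated with a simplified solver. The sketch below is not the article's Algorithm 2: it assumes a diagonal noise covariance given by a vector `noise_var`, replaces the two separate projection operators with a single per-block projection onto the probability simplex (which enforces both the sum-to-one and the box constraint at once), and uses plain projected gradient steps of size $1/L_c$ without the greedy acceleration of [28].

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def project_simplex(v):
    """Euclidean projection of v onto {x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def gls_projected_gradient(y, H, noise_var, p=QPSK, n_iter=1000):
    """Minimize (y - H M_p x)^H Sigma^{-1} (y - H M_p x) over block simplices."""
    K, P = H.shape[1], p.size
    M_p = np.kron(np.eye(K), p[None, :])
    A = H @ M_p                                   # L' x KP effective matrix
    Aw = A / noise_var[:, None]                   # Sigma^{-1} A (diagonal Sigma)
    G = 2.0 * np.real(A.conj().T @ Aw)            # Hessian with respect to real x
    b = 2.0 * np.real(Aw.conj().T @ y)
    L_c = np.linalg.eigvalsh(G)[-1]               # Lipschitz constant of the gradient
    x = np.full(K * P, 1.0 / P)                   # uninformative starting point
    for _ in range(n_iter):
        x = x - (G @ x - b) / L_c                 # gradient step of size 1/L_c
        x = np.concatenate([project_simplex(blk) for blk in x.reshape(K, P)])
    s_soft = M_p @ x
    # Hard decision: nearest constellation point per user
    return p[np.argmin(np.abs(s_soft[:, None] - p[None, :]), axis=1)]

rng = np.random.default_rng(3)
K, L_eff = 3, 16
H = (rng.standard_normal((L_eff, K)) + 1j * rng.standard_normal((L_eff, K))) / np.sqrt(2)
noise_var = np.concatenate([np.full(8, 0.5), np.full(8, 1.0)])   # two noise levels
s_true = rng.choice(QPSK, size=K)
y = H @ s_true                                    # noise-free sanity check
s_hat = gls_projected_gradient(y, H, noise_var)
```

Dividing each row by its noise variance is what distinguishes the GLS objective from plain LS: observations that arrive over more heavily distorted links contribute less to the gradient.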

IV. PERFORMANCE ASSESSMENT

A. QUALITATIVE COMPLEXITY ASSESSMENT
Before we proceed to assess the performance of the two receivers described above via computer simulation, let us briefly analyze their computational complexities qualitatively, so as to put the comparisons that follow into context. Regarding the complexity of the robust PDAZF (RPDAZF) receiver summarized in Algorithm 2, we remark that the most expensive operations among the various expressions listed above are the multiplications in equations (23) and (24), because the matrix inversions in both expressions can be calculated offline, since the matrix $\mathbf{M}_1$ and the covariance matrix are both constant and deterministic. With that in mind, since $\mathbf{x}$ is of dimension $KP$, it follows that the effective cost per iteration of Algorithm 2 is only of order $O((KP)^2)$.
As a reference, this complexity can be compared against that of the conventional ZF receiver, which can be obtained on the basis of equation (6), namely by evaluating the pseudo-inverse of the stacked channel matrix $\boldsymbol{H}$ and multiplying it onto the stacked received signal vector $\boldsymbol{y}$. In such a case, the most expensive operations are the multiplication of $\boldsymbol{H}$ by its conjugate transpose, which is of cost $O(L'K^2)$, and the evaluation of the inverse of the latter result, which is of cost $O(K^3)$, yielding altogether a complexity order of $O(K^3 + L'K^2)$ [30], [31]. In other words, as long as the cardinality $P$ of the constellation $\mathcal{S}$ is sufficiently smaller than the number of served users $K$ and/or the aggregate rank of the limited fronthaul channels $L'$ (that is, $P^2 < \max\{K, L'\}$), the solution offered by the RPDAZF has lower complexity than the conventional ZF receiver. In turn, the most expensive operation of the EP-based benchmark receiver is the repeated inversion of the residual covariance matrix, required to update the beliefs $\beta_k$ via equation (13).
Referring to equation (14b) and ignoring the complexity of the multiplication by the diagonal matrix, it is found that the complexity of the EP-based robust receiver is dominated by the multiplication between $\boldsymbol{H}$ and $\boldsymbol{H}^H$, which is of complexity order $O(KL'^2)$ per iteration, and the inversion of the resulting matrix, which is of complexity order $O(L'^3)$ per iteration. It follows, therefore, that the complexity order of the EP-based benchmark robust receiver is $O(L'^3 + KL'^2)$. Finally, in addition to the conventional ZF, in the comparisons to follow we will also consider the PDAZF scheme obtained by solving equation (19) as a State-of-the-Art (SotA) method, so that the gain in performance resulting from the subsequently designed method, which is based on the proximal gradient method and incorporates knowledge of the compression distortion noise covariance matrix, can be measured. It is therefore also relevant to highlight that the complexity of the PDAZF scheme is of order $O((KP)^2)$, due to its similarity to the ZF receiver.

B. BER PERFORMANCE COMPARISONS
With these remarks made, we proceed to compare the BER performances of all aforementioned receivers, later applying the proposed EP-based and RPDAZF methods to study the impact of antenna distribution among CAP and APs.
The simulation setup follows related literature [4], [7], [18], [32] and is summarized as follows. The propagation model is set according to the urban micro cell setting [33], with the height of the CAP and the APs set to $h = 10$ [m], the maximum distance between the CAP and UEs set to $D_U = 200$ [m] [33], the distance between the CAP and the APs set to $D_R = 100$ [m], and the distribution of users following the DCC method, with the positions of UEs set randomly following a PPP with intensity $\mu \in \{10, 20\}$.
The carrier frequency, bandwidth, and noise variance at the CAP and APs are respectively assumed to be $f_c = 2$ [GHz], $B = 20$ [MHz], and $\sigma^2 = -96$ [dBm] [29], and Gray-coded QPSK-modulated ($P = 4$) signals are transmitted over the resulting wireless channel.
The total number of receive antennas in the simulation setup is $L = \sum_{\ell \in \mathcal{L}} N_\ell$, with $N_0$ receive antennas equipping the CAP and $(L - N_0)/M$ receive antennas equipping each AP. Finally, we define for later convenience the antenna centralized rate (ACR) as the ratio between the number of antennas $N_0$ allocated to the CAP and the total number of receive antennas $L$, that is, $\eta \triangleq N_0/L$. We also concisely collect the aforementioned and remaining simulation parameters in Table 1, offered above.
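The antenna allocation implied by the ACR can be written as a small helper. The function name `antenna_split` and its rounding and divisibility checks are assumptions made for illustration; only the relations between the ACR, the number of CAP antennas and the per-AP antenna count come from the text.

```python
def antenna_split(L, M, eta):
    """Return (N_0, antennas per AP) for L total antennas, M APs and ACR eta."""
    n_cap = round(eta * L)                 # antennas concentrated at the CAP
    remaining = L - n_cap
    if remaining % M != 0:
        raise ValueError("remaining antennas do not divide evenly among the APs")
    return n_cap, remaining // M

n0, n_ap = antenna_split(L=50, M=4, eta=0.2)   # evenly spread configuration
```

For example, with $L = 50$ antennas and $M = 4$ APs, an ACR of 0.2 gives the CAP and every AP the same number of antennas, while an ACR of 1 places all of them at the CAP.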
Our first comparisons are offered in Figures 2 to 5, which show the BER performances of our proposed receivers in comparison with those of SotA receivers under two distinct scenarios of fronthaul link capacity limit, as a function of the signal-to-noise ratio (SNR). In order to enrich the comparisons, in these particular figures we augment the set of SotA methods by showing, in addition to the conventional ZF and the PDAZF receiver corresponding to equation (19), results obtained with a couple of additional conventional receivers, motivated by the following arguments. First, the minimum mean square error (MMSE) receiver with a fully centralized setup, i.e., in which all $L = 50$ receive antennas are concentrated in the CAP, is also evaluated so as to capture the performance of a conventional centralized massive MIMO system. Secondly, a generic EP-based receiver in which a diagonal matrix with constant noise variance is used instead of equation (10) is also considered, so as to illustrate the gain resulting from the mitigation of the effect of the heteroscedasticity of the effective noise, as proposed here.
Furthermore, we also include a slight and trivial improvement of the proposed EP-based benchmark scheme of Algorithm 1, which is obtained by the incorporation of a matched filter bound (MFB), such as described in [35]. In our simulation setting, the MFB is achieved only if the inter-user interference is perfectly removed; thus, the EP-based detector achieves the MFB only if the symbol estimates used for the soft interference cancellation performed as per equation (8) at the final iteration step are equal to the transmitted symbols, as a result of a successful iterative process.
This serves, therefore, as an absolute lower bound of the BER performance of the proposed algorithms. Other than these additions and variations, in all figures the BER performances of CF-mMIMO systems with L = 50 antennas distributed among the CAP and either M = 4 or M = 8 APs, according to an ACR of η = 0.2 and serving a random number of UEs with an average of µ = 10, are compared.
Notice that the key distinction between Figures 2 and 3 is the capacity of the corresponding fronthaul links, which is set to $C_f = 12$ and $C_f = 16$, respectively, which in turn results in corresponding shrinkages of spatial resources from $L = 50$ to $L' = 34$ and $L' = 42$, respectively. It can be readily seen from the results that the proposed receivers, namely the robust EP-based receiver of Algorithm 1 and the RPDAZF receiver of Algorithm 2, significantly outperform all SotA methods, extracting large gains over the latter at moderate to large SNRs.
We remark that the large-SNR regime is precisely where the heteroscedasticity due to the limited fronthaul links is most impactful, as evidenced by the error floors induced in the performances of the SotA methods, which explains the results. Let us now shift our focus to the BER performance with $M = 8$ in Figures 4 and 5, where the other settings are the same as in Figures 2 and 3. As can be inferred from the above comparison with the fully centralized MMSE detector, increasing the number of APs $M$ can fundamentally improve the detection capability by reducing the channel correlation. In addition, the gain obtained by considering the heteroscedasticity of the effective noise is also enhanced. In the following analysis, we consider the case of $M = 4$ as a more severe channel condition for multiuser detection.
Another important conclusion taken from Figures 2 to 5 is that the RPDAZF receiver in fact achieves a performance very close to that of the EP-based approach, in spite of its lower complexity (namely, quadratic in $K$ instead of cubic in $L'$), which indicates that much of the gain achieved over SotA methods is credited to handling the heteroscedasticity of the effective noise, in this case by means of the incorporation of the covariance matrix in the receiver design.
For a more conclusive evaluation of which of the two proposed approaches exhibits the lowest complexity, it is, however, also important to consider the convergence of both schemes, which motivates the additional comparison between Algorithms 1 and 2 shown in Figure 6.
It can again be confirmed from the figure that, despite its slower convergence, the RPDAZF has indeed a lower overall complexity, since the order of magnitude of the raw (per iteration) complexity of the EP scheme outweighs the slower convergence of the RPDAZF scheme. In particular, the results shown in Figure 6 indicate that the EP scheme converges after about 20 iterations, while the RPDAZF scheme requires about 200 iterations for $C_f = 12$ and 300 iterations for $C_f = 16$. Put in numbers, that means that the total average complexity of the EP scheme is of order $20 \times O(L'^3 + KL'^2) = 20 \times O(34^3 + 10 \cdot 34^2)$, i.e., 1017280 flops, for the case when $C_f = 12$, and $20 \times O(42^3 + 10 \cdot 42^2)$, i.e., 1834560 flops, for the case when $C_f = 16$. In comparison, the complexity of the proposed RPDAZF scheme is of order $200 \times O(K^2 \cdot P^2) = 200 \times O(40^2)$, i.e., 320000 flops, with $C_f = 12$, and $300 \times O(40^2)$, i.e., 480000 flops, with $C_f = 16$. All in all, the RPDAZF method is found to be 3 to 4 times more efficient than the EP-based method in this particular comparison.
Obviously, this reduction of complexity is only an example, and can be much larger in massive MIMO systems, in which the number of receive (RX) antennas may reach several hundred, which greatly affects the complexity of the EP-based method, but not that of the RPDAZF method.
Having established both the superiority of the proposed robust receivers over the SotA alternatives, and the scalability of CF-mMIMO with robustness to rate-limited fronthaul thanks to the fact that the proposed RPDAZF has a complexity independent of the dimension of the system ($L$ and $L'$), being only quadratic in the average number $K$ of users during the iterations, we move on to study the impact of allocating antennas among the CAP and APs, in search of insights on how to optimize the CF-mMIMO system with respect to this design parameter.
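The operation counts quoted above can be reproduced directly from the complexity orders and the iteration counts read from Figure 6. The sketch below treats the order expressions as exact flop counts, which is the same simplification made in the text; the function names are hypothetical.

```python
def ep_cost(L_eff, K, iters):
    """Total cost of the EP receiver: iters * (L'^3 + K * L'^2)."""
    return iters * (L_eff ** 3 + K * L_eff ** 2)

def rpdazf_cost(K, P, iters):
    """Total cost of the RPDAZF receiver: iters * (K * P)^2."""
    return iters * (K * P) ** 2

K, P = 10, 4                                   # average load and QPSK cardinality
ep_12, ep_16 = ep_cost(34, K, 20), ep_cost(42, K, 20)
pg_12, pg_16 = rpdazf_cost(K, P, 200), rpdazf_cost(K, P, 300)
ratio_12, ratio_16 = ep_12 / pg_12, ep_16 / pg_16   # roughly 3.2 and 3.8
```

Because the RPDAZF count does not depend on the effective receive dimension, the ratio grows quickly as more receive antennas are added.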

C. PERFORMANCES WITH DIFFERENT ANTENNA CONCENTRATION RATIOS
To that end, consider Figures 7 and 8, which compare the BER performances of the proposed EP-based and RPDAZF receivers in systems with different ACRs, as a function of the fronthaul link capacity $C_f$. Recall that $\eta \to 1$ indicates a more centralized (cellular-like) system, while a smaller $\eta$ indicates a more distributed system. Furthermore, notice that in the context of the experiment considered here, in which there is one CAP and $M = 4$ APs, an ACR of $\eta = 0.2$ indicates a fully distributed system, with $N_\ell = 10, \forall \ell$.
It is seen that, depending on the fronthaul capacity, different system configurations yield better results. In particular, based on which configuration achieves the lowest BER, each figure is split into three areas, respectively labeled ''Centralized'', ''Cooperative'', and ''Distributed'', which depict the corresponding optimal system architecture. As could be expected, it is found for instance that antennas need to be centralized at the CAP when the fronthaul capacity is severely limited, although that might come at the sacrifice of Degrees of Freedom (DoF) and possibly better channels in the UE-to-AP links. In turn, if sufficiently high fronthaul capacity is available, distributing more antennas to the APs results in orders-of-magnitude improvements.
The results confirm the advantage of the cell-free architecture over traditional cellular systems, due to its potential to achieve better performance, but also highlight the importance of the transceivers used in fronthaul links, which prove essential to reap the full potential of CF-mMIMO systems.
To conclude our simulated study, we finally look at the impact of load onto the CF-mMIMO architecture, in terms of the ACR that yields the best BER performances. To that end, we compare in Figures 9 and 10 the BER performances of the proposed EP-based and RPDAZF receivers in systems with different average numbers of users µ and fronthaul capacities C f , as a function of the ACR η.
Interestingly, the figures demonstrate that the optimal antenna configuration is not strongly dependent on the load. For instance, it can be seen in Figure 9 that an ACR of η ≈ 0.7 would serve a system employing the RPDAZF receiver well in a scenario with C f = 16, both under a load of µ = 10 and under double that load, at µ = 20.
These results are somewhat counter-intuitive, since one might expect the optimal configuration to change with the load, but they can be explained as follows. When the fronthaul capacity is already low, not only is the centralized configuration intuitively better, but the performance of the system is also generally worse, as can be seen from the flat shape of the curves. If, however, the fronthaul links have sufficient capacity, the cell-free architecture is again the optimal setup for any load, as long as that load does not render the fronthaul capacity insufficient, which would contradict the assumption. In other words, given adequate fronthaul capacity, CF-mMIMO is the right choice to provide higher rates and better coverage in future wireless systems.
As a final remark, we highlight the fact that when sufficiently high-capacity fronthaul links are available, the performance degradation suffered by doubling the load is rather small if the proposed EP-based receiver is employed.
This motivates, once again, future work on the design of TX/RX schemes that can better cope with finite fronthaul capacities. A suitable strategy may be to take better advantage of the coherence time of the channel, designing schemes that treat the AP-to-CAP signals in blocks, with the compression performed at each time slot designed to facilitate the recovery of the true information while remaining subject to the same capacity constraint.

D. EFFECTIVE THROUGHPUT ASSESSMENT
To conclude our assessment, we further evaluate our proposed and SotA receivers in terms of their effective throughput, as a measure of the spectral efficiency achieved by the corresponding CF-mMIMO systems. To this end, we consider the effective throughput, defined in [36] as (1 − P b ) µ log 2 (P), where P b , µ, and P denote the BER of each receiver, the average number of UEs in the cluster, and the cardinality of the modulation scheme employed, respectively. Figures 11 to 14 show plots of the effective throughput as a function of SNR, for different fronthaul conditions, namely with C f = 12 or C f = 16, and for systems with different numbers of APs, i.e., with M = 4 or M = 8. Since the effective throughput can be regarded as the average total number of bits received without errors in each snapshot, we include in all figures the average total number of transmitted bits µ log 2 (P) = 20 as an absolute upper bound. Another upper bound, based on the incorporation of the MFB [35] into Algorithm 1, is also shown.
The results corroborate that the effective throughput achieved by CF-mMIMO systems employing our proposed receivers asymptotically approaches the absolute upper bound at high SNRs, while also remaining close to the MFB-based bound over the entire SNR range.
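To make the relation between BER and effective throughput concrete, the sketch below assumes that the effective throughput equals the average number of transmitted bits per snapshot, µ log 2 (P), scaled by the fraction of bits received correctly, 1 − P b . This reading follows from the description above, but the exact expression in [36] should be consulted. The values µ = 10 and P = 4 are illustrative choices that reproduce the quoted bound of 20 bits, and the BER values are hypothetical rather than simulation results.

```python
import math


def effective_throughput(ber, mu, cardinality):
    """Average number of error-free bits per snapshot (assumed form)."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("BER must lie in [0, 1]")
    return (1.0 - ber) * mu * math.log2(cardinality)


MU = 10  # average number of UEs (illustrative)
P = 4    # modulation cardinality (illustrative)

# With zero BER the throughput equals the total transmitted bits.
upper_bound = effective_throughput(0.0, MU, P)

for ber in (1e-1, 1e-2, 1e-4):  # hypothetical BER values
    r = effective_throughput(ber, MU, P)
    print(f"BER={ber:.0e}  throughput={r:.3f} bits  gap={upper_bound - r:.3f}")
```

Under this assumed form, the gap to the absolute upper bound is proportional to the BER, which is why the throughput curves flatten against the bound once the BER becomes small at high SNR.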

V. CONCLUSION
We presented two new robust receivers for CF-mMIMO which exploit knowledge of the covariance of the compression noise resulting from passing information through fronthaul links with limited capacities, in order to mitigate the effect of the associated noise heteroscedasticity. The first receiver, which has complexity order O(L 3 + KL 2 ), with L describing the effective dimension of the set of fronthaul links, is based on expectation propagation and yields the best performance among all alternatives investigated, serving therefore as a benchmark method. The second receiver, dubbed the Robust PDAZF (RPDAZF), is found, however, to closely approach the performance of the benchmark method in all scenarios, despite its significantly lower complexity of order O(K 2 P 2 ), which does not increase with the number of antennas L employed by the set of CAP and APs. As a result, it can be claimed that the proposed RPDAZF receiver for CF-mMIMO subject to capacity-limited fronthaul links is both scalable and efficient.
As a bonus, we employed the proposed receivers to study the optimum distribution of antennas among the CAP and APs. Among other findings, an interesting outcome of this study is the insight that, given a constraint on the capacity of the fronthaul links, the optimal distribution of antennas between the CAP and APs is resilient to load variations, a phenomenon that will be investigated in more detail in a subsequent contribution. In addition, the performance assessments suggest an interesting direction for future work, in which compressed data and channel coefficients are jointly estimated to fully combat the harmful impact of limited fronthaul capacity.
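The scaling contrast between the two stated complexity orders can be illustrated with a rough count that ignores all constant factors. In this sketch, K and P are simply the symbols appearing in the stated orders, the function names are illustrative, and all numerical values are hypothetical.

```python
def ep_cost(L, K):
    """Leading-order term count for O(L^3 + K L^2), constants ignored."""
    return L**3 + K * L**2


def rpdazf_cost(K, P):
    """Leading-order term count for O(K^2 P^2), constants ignored."""
    return K**2 * P**2


K, P = 10, 4              # hypothetical values
for L in (50, 100, 200):  # hypothetical antenna counts
    print(L, ep_cost(L, K), rpdazf_cost(K, P))
```

The first count grows roughly cubically with L while the second stays fixed, which is the sense in which the lower-complexity receiver is described as scalable. The counts are not operation totals and should not be read as measured run times.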