Reconfigurable Intelligent Surface-Assisted Cell-Free Massive MIMO Systems Over Spatially-Correlated Channels

Cell-Free Massive multiple-input multiple-output (MIMO) and reconfigurable intelligent surface (RIS) are two promising technologies for application to beyond-5G networks. This paper considers Cell-Free Massive MIMO systems with the assistance of an RIS for enhancing the system performance under the presence of spatial correlation among the engineered scattering elements of the RIS. Distributed maximum-ratio processing is considered at the access points (APs). We introduce an aggregated channel estimation approach that provides sufficient information for data processing with the main benefit of reducing the overhead required for channel estimation. The considered system is studied by using asymptotic analysis which lets the number of APs and/or the number of RIS elements grow large. A lower bound for the channel capacity is obtained for a finite number of APs and engineered scattering elements of the RIS, and closed-form expressions for the uplink and downlink ergodic net throughput are formulated in terms of only the channel statistics. Based on the obtained analytical frameworks, we unveil the impact of channel correlation, the number of RIS elements, and the pilot contamination on the net throughput of each user. In addition, a simple control scheme for optimizing the configuration of the engineered scattering elements of the RIS is proposed, which is shown to increase the channel estimation quality, and, hence, the system performance. Numerical results demonstrate the effectiveness of the proposed system design and performance analysis. In particular, the performance benefits of using RISs in Cell-Free Massive MIMO systems are confirmed, especially if the direct links between the APs and the users are of insufficient quality with high probability.


I. INTRODUCTION
In the last few decades, we have witnessed an exponential growth of the demand for wireless communication systems that provide reliable communications and ensure ubiquitous coverage, high spectral efficiency and low latency [2]. To meet The work of T. V. Chien, S. Chatzinotas, and B. Ottersten was supported by RISOTTI -Reconfigurable Intelligent Surfaces for Smart Cities under project FNR/C20/IS/14773976/RISOTTI. The work of H. Q. Ngo was supported by the UK Research and Innovation Future Leaders Fellowships under Grant MR/S017666/1. The work of M. Di Renzo was supported in part by the European Commission through the H2020 ARIADNE project under grant agreement number 871464 and through the H2020 RISE-6G project under grant agreement number 101017011. Parts of this paper were presented at IEEE GLOBECOM 2021 [1]. The associate editor coordinating the review of this paper and approving it for publication was Z. Zhang these requirements, several new technologies have been incorporated in 5G communication standards, which include Massive multiple-input multiple-output (MIMO) [3], millimeterwave communications [4], and network densification [5]. Among them, Massive MIMO has gained significant attention since it can offer a good service to many users in the network. Moreover, the net throughput offered by a Massive MIMO system is close to the Shannon capacity, in many scenarios, by only employing simple linear processing techniques, such as maximum ratio (MR) or zero forcing (ZF) processing. Since the net throughput can be computed in a closedform expression that only depends on the channel statistics, the optimized designs are applicable for a long period of time [6]. The colocated Massive MIMO architecture has the advantage of low backhaul requirements since the base station antennas are installed in a compact array. Conventional cellular networks, however, are impaired by intercell interference. In particular, the users at the cell boundaries are impaired by high intercell interference and path loss, and hence, they may experience insufficient performance. More advanced signal processing methods are necessary to overcome the inherent intercell interference that characterizes conventional cellular network deployments.
Cell-Free Massive MIMO has recently been introduced to reduce the intercell interference that characterizes colocated Massive MIMO architectures. Cell-Free Massive MIMO is a network deployment where a large number of access points (APs) are located in a given coverage area to serve a small number of users [7]- [10]. All APs collaborate with each other via a backhaul network and serve all the users in the absence of cell boundaries. The system performance is enhanced in Cell-Free Massive MIMO systems because they inherit the benefits of the distributed MIMO and network MIMO architectures, but the users are also close to the APs. When each AP is equipped with a single antenna, MR processing results in a good net throughput for every user, while ensuring a low computational complexity and offering a distributed implementation that is convenient for scalability purposes [11]. However, Cell-Free Massive MIMO cannot guarantee a good quality of service under harsh propagation conditions, such as in the presence of poor scattering environments or high attenuation due to the presence of large obstacles.
Reconfigurable intelligent surface (RIS) is an emerging technology that is capable of shaping the radio waves at the electromagnetic level without applying digital signal processing methods and without requiring power amplifiers [12]- [14]. Each element of the RIS scatters (e.g., reflects) the incident signal without using radio frequency chains and power amplification [15]. Integrating an RIS into wireless networks introduces digitally controllable links that scale up with the number of engineered scattering elements of the RIS, whose estimation is, however, challenged by the lack of digital signal processing units at the RIS [16]- [20]. For simplicity, the main attention has so far been concentrated on designing the phase shifts under the assumption of perfect channel state information (CSI) [16], [21] and the references therein. In [20], the authors have recently discussed the fundamental issues of performing channel estimation in RIS-assisted wireless systems. The impact of the channel estimation overhead and reporting on the spectral efficiency, energy efficiency, and their tradeoff has recently been investigated in [17]. In [16] and [18], to reduce the impact of the channel estimation overhead, the authors have investigated the design of RISassisted communications in the presence of statistical CSI. As far as the integration of Cell-Free Massive MIMO and RIS is concerned, recent works have formulated and solved optimization problems with different communication objectives under the assumption of perfect (and instantaneous) CSI [22]- [26]. Recent results in the context of single-input single-output (SISO) and multi-user MIMO systems have, however, shown that designs for the engineered scattering elements of the RIS that are based on statistical CSI may be of practical interest and provide good performance [18], [27]- [29].
In the depicted context, no prior work has analyzed the performance of RIS-assisted Cell-Free Massive MIMO systems in the presence of spatially-correlated channels. In this work, motivated by these considerations, we introduce an analytical framework for analyzing and optimizing the uplink and downlink transmissions of RIS-assisted Cell-Free Massive MIMO systems under spatially correlated channels and in the presence of direct links subject to the presence of blockages. In particular, the main contributions made by this paper can be summarized as follows: • We consider an RIS-assisted Cell-Free Massive MIMO under spatially correlated channels. All APs estimate the instantaneous channels in the uplink pilot training phase. We exploit a channel estimation scheme that estimates the aggregated channels including both the direct and indirect links, instead of every individual channel coefficient as in previous works [20], [24]. For generality, the pilot contamination is assumed to originate from an arbitrary pilot reuse pattern. •  The rest of this paper is organized as follows: Section II presents the system model, the channel model, and the channel estimation protocol. The uplink data transmission protocol and the asymptotic analysis by assuming a very large number of APs and engineered scattering elements at the RIS are discussed in Section III. A similar analysis for the downlink data transmission is reported in Section IV. Finally, Section V illustrates several numerical results, while the main conclusions are drawn in Section VI. Notation: Upper and lower bold letters are used to denote matrices and vectors, respectively. The identity matrix of size × is denoted by I . The imaginary unit of a complex number is denoted by with √ = −1. The superscripts (·) * , (·) , and (·) denote the complex conjugate, transpose, and Hermitian transpose, respectively. E{·} and Var{·} denote the expectation and variance of a random variable. The circularly symmetric Gaussian distribution is denoted by CN (·, ·) and diag(x) is the diagonal matrix whose main diagonal is given by x. tr(·) is the trace operator. The Euclidean norm of vector x is x , and X 2 is the spectral norm of matrix X. Finally, mod(·, ·) is the modulus operation and · denotes the truncated argument.

II. SYSTEM MODEL, CHANNEL ESTIMATION, AND RIS PHASE SHIFT CONTROL
We consider an RIS-assisted Cell-Free Massive MIMO system, where APs connected to a central processing unit (CPU) serve users on the same time and frequency resource, as schematically illustrated in Fig. 1. All APs and users are equipped with a single antenna and they are randomly located in the coverage area. Since the considered users are far away from the APs, the communication is assisted by an RIS that comprises engineered scattering elements that can modify the phases of the incident signals. 1 The matrix of phase shifts of the RIS is denoted by is the phase shift applied by the -th element of the RIS. The phase shifts are adjusted by a controller which exchanges information with the APs via a backhaul link (see Fig. 1). As a canonical form of Cell-Free Massive MIMO systems, we assume that the system operates in timedivision duplexing (TDD) mode. Thus, we assume that channel reciprocity holds in the consisted system model.

A. Channel Model
We assume a quasi-static block fading model where the channels are static and frequency flat in each coherence interval comprising symbols. We assume that the APs estimate the channel during the uplink pilot training phase. Thus, symbols ( < ) in each coherence interval are dedicated to the channel estimation phase and the remaining ( − ) symbols are utilized for the uplink and downlink data transmission phases.
The following notation is used: is the channel between the user and the AP , which is the direct link [12]; h ∈ C is the channel between the AP and the RIS; and z ∈ C is the channel between the RIS and the user . The pair h and z , which constitutes the cascaded channel, results in an indirect link (virtual line-of-sight link), which enhances the communication reliability between the AP and the user [30]. The majority of existing works assume that the wireless channels undergo uncorrelated Rayleigh fading. In this paper, we consider a more realistic channel model by taking into account the spatial correlation among the engineered scattering elements of the RIS, which is due to their sub-wavelength size, sub-wavelength inter-distance, and geometric layout. In an isotropic propagation environment, in particular, , h , and z can be modeled as follows where is the large-scale fading coefficient; R ∈ C × and R ∈ C × are the covariance matrices that characterize the spatial correlation among the channels of the RIS elements. The covariance matrices in (1) correspond to a general model, which can be further particularized for application to typical RIS designs and propagation environments. For example, a simple exponential model was used to describe the spatial correlation among the engineered scattering elements of the RIS in [31]. Another recent model that is applicable to isotropic scattering with uniformly distributed multipath components in the half-space in front of the RIS was recently reported in [32], whose covariance matrices are 2 1 In general, a completed system should include many users randomly located in the coverage area. There are some favorable users where their links to some APs are strong. But there are also some unfavorable users where the their links to the APs are weak. This may come from the large path loss (long distances) or heavy shadowing. In our work, we consider the cases where RISs are deployed to improve the performance of these unfavorable users, and hence, the coverage of the whole system can be increased. 2 This paper considers a spatial correlation model between engineered scattering elements in the far-field scenarios that is applicable and when the covariance matrices among the users differ only in terms of large-scale channel coefficients. Other scenarios where the users have covariance matrices with different phases [33] are of interest and are left for future work.
where ,˜∈ C are the large-scale channel coefficients, which, for example, model the signal attenuation due to large objects and due to the transmission distance. The matrices in (2) assume that the size of each element of the RIS is × , with being the horizontal width and being the vertical height of each RIS element. In particular, the ( , )−th element of the spatial correlation matrix R ∈ C × in (2) is [R] = sinc(2 u − u / ), where is the wavelength and sinc( ) = sin( )/( ) is the sinc function. The vector u , ∈ { , } is given by u = [0, mod ( − 1, ) , ( − 1)/ ] , where and denote the total number of RIS elements in each row and column, respectively. The channel model in (1) is significantly distinct from related works since the small-scale fading and the spatial correlation matrices are included in both links of the virtual line-of-sight link that comprises the RIS. In [31], by contrast, the channels between the transmitters and the RIS are assumed to be deterministic, for analytical tractability.

B. Uplink Pilot Training Phase
The channels are independently estimated from the pilot sequences transmitted by the users. All the users share the same pilot sequences. In particular, ∈ C with 2 = 1 is defined as the pilot sequence allocated to the user . We denote by P the set of indices of the users (including the user ) that share the same pilot sequence as the user . The pilot sequences are assumed to be mutually orthogonal such that the pilot reuse pattern is During the pilot training phase, all the users transmit the pilot sequences to the APs simultaneously. In particular, the user transmits the pilot sequence √ . The received training signal at the AP , y ∈ C , can be written as where is the normalized signal-to-noise ratio (SNR) of each pilot symbol, and w ∈ C is the additive noise at the AP , which is distributed as w ∼ CN (0, I ). In order for the AP to estimate the desired channels from the user , the received training signal in (4) is projected on as where = w ∼ CN (0, 1). We emphasize that the co-existence of the direct and indirect channels due to the presence of the RIS results in a complicated channel estimation process. In particular, the cascaded channel in (5) results in a nontrivial procedure for applying the minimum mean-square error (MMSE) estimation method, as reported in previous works, for processing the projected signals [8], [9]. Based on the specific signal structure in (5), we denote the channel between the AP and the user through the RIS as which is referred to as the aggregated channel that comprises the direct and indirect link between the user and the AP . In contrast to previous works, where the matrix Φ Φ Φ of the RIS phase shifts is optimized based on instantaneous CSI, in this paper, Φ Φ Φ is optimized based on statistical CSI. This is detailed in Section II-C. By capitalizing on the definition of the aggregated channel in (6), the required channels can be estimated in an effective manner even in the presence of the RIS. In particular, the aggregated channel in (6) is given by the product of weighted complex Gaussian and spatially correlated random variables, as given in (1). Despite the complex analytical form, the following lemma gives information on the statistics of the aggregated channels.
Lemma 1. The second and fourth moments of the aggregated channel can be formulated as follows E{| where . Moreover, the aggregated channels are mutually independent for ≠ and ≠ , i.e., In addition, the aggregated channels and , ∀ ≠ , and the aggregated channels and , ∀ ≠ , are mutually uncorrelated, i.e.,
Besides, the aggregated channels , , , and , fulfill the following conditions Proof. See Appendix B.
The moments in Lemma 1 are employed next for analyzing the channel estimation and the net throughput performance. We note, in addition, that the odd moments of , e.g., the first and third moments, are equal to zero. Conditioned on the phase shifts, we employ the linear MMSE method for estimating at the AP. In spite of the complex structure of the RIS-assisted cascaded channel, Lemma 2 provides analytical expressions of the estimated channels.
Lemma 2. By assuming that the AP employs the linear MMSE estimation method based on the signal observation in (5), the estimate of the aggregated channel can be formulated aŝ where = E{ * }/E{| | 2 } has the following closed-form expression The estimated channel in (15) has zero mean and variance equal to Also, the channel estimation error = −ˆand the channel estimateˆare uncorrelated. The channel estimation error has zero mean and variance equal to Proof. It is similar to the proof in [34], and is obtained by applying similar analytical steps to the received signal in (5) and by taking into account the structure of the RIS-assisted cascaded channel and the spatial correlation matrices in (1).
Lemma 2 shows that, by assuming Φ Φ Φ fixed, the aggregated channel in (6) can be estimated without increasing the pilot training overhead, i.e., only symbols in each coherence interval are needed for channel estimation, which is the same as for conventional Cell-Free Massive MIMO systems. If the user uses the same pilot sequence as the user does, then = from (15). Consequently, we obtain the relationˆ = ˆ, which implies that, because of pilot contamination, it may be difficult to distinguish the signals of two generic users. In that regard it is worth noting that, to get rid of pilot contamination, one can assign mutually orthogonal pilot signals to all the users in the network (if the coherence time is long enough so that ≥ ). Under mutually orthogonal pilot sequences, and simplify to and , respectively, as follows This implies that, in the absence of pilot contamination, we have → as → ∞, i.e., the variance of the channel estimation error in (17) is equal to zero. The channel estimates given in Lemma 2 can be applied to an arbitrary set of phase shifts and covariance matrices. To facilitate the performance analysis presented next, we introduce the following corollary that characterizes the correlation between the aggregated channels and their estimates. Corollary 1. Let us consider the two aggregated channels and with ≠ , and let us denote where and are two non-negative deterministic scalars. Then, the following holds Proof. See Appendix C.
Both Lemma 1 and Corollary 1 indicate that the presence of an RIS makes the channel statistics more complex, compared to a conventional Cell-Free Massive MIMO system, due to the correlation among the propagation channels. In the next sections, the analytical expression of the channel estimates in Lemma 2 and the properties in Corollary 1 and Lemma 1 are employed for signal detection in the uplink and for beamforming in the downlink. Also, the same results are used to optimize the phase shifts of the RIS in order to minimize the channel estimation error and to evaluate the corresponding ergodic net throughput.

C. RIS Phase Shift Control and Optimization
Channel estimation is a critical aspect in Cell-Free Massive MIMO. As discussed in previous text, in many scenarios, non-orthogonal pilots have to be used. This causes pilot contamination, which may significantly reduce the system performance. In this section, we design an RIS-assisted phase shift control scheme that is aimed to improve the quality of channel estimation. To this end, we introduce the normalized mean square error (NMSE) of the channel estimate of the user at the AP , as follows where the last equality is obtained from (7) and (18). The NMSE is a suitable metric to evaluate the channel estimation quality and to measure the relative channel estimation error per AP. By definition, the NMSE lies in the range [0, 1]. In particular, the NMSE tends to zero if orthogonal pilot signals are used for every user and the pilot power is sufficiently large. In general, however, the NMSE is greater than zero if the users reuse the pilot signals, i.e., < , since NMSE → 1− /( ∈ P ) as → ∞. We propose to optimize the phase shift matrix Φ Φ Φ of the RIS so as to minimize the total NMSE obtained from all the users and all the APs, as follows We emphasize that the optimal phase shifts obtained by solving the problem in (24) are independent of the instantaneous CSI and depend only on the statistical CSI, i.e., the largescale fading coefficients and the channel covariance matrices. Problem (24) is a fractional program, whose globally-optimal solution is not simple to be obtained for an RIS with a large number of independently tunable elements. Nonetheless, in the special network setup where the direct links from the APs to the users are weak enough to be negligible with respect to the RIS-assisted links, the optimal solution to problem (24) is available in a closed-form expression as in Corollary 2.
Corollary 2. If the direct links are weak enough to be negligible and the RIS-assisted channels are spatially correlated as formulated in (2), the optimal minimizer of the optimization problem in (24) is 1 = . . . = , i.e., the equal phase shift design is optimal.
Proof. See Appendix D.
If the direct links are completely blocked and the spatial correlation model in (2) holds, Corollary 2 provides a simple but effective option to design the phase shifts of the RIS while ensuring the optimal estimation of the aggregated channels according to the sum-NMSE minimization criterion. Therefore, an efficient channel estimation protocol can be designed even in the presence of an RIS equipped with a large number of engineered scattering elements. The numerical results in Section V show that the phase shifts design obtained in Corollary 2 offers good gains in terms of net throughput even if the direct links cannot be completely ignored.
Remark 1. The proposed optimization method of the phase shifts of the RIS is based on the minimization of the sum-NMSE, and it is, therefore, based on improving the channel estimation quality. This is a critical objective in Massive MIMO systems, since improving the accuracy of channel estimation results in a noticeable enhancement of the uplink and downlink net throughput [35], [36]. If the direct links are not weak enough, the equal phase shift design is not optimal anymore, and the optimal solution to problem (24) may be obtained numerically. For example, one can compute the firstorder derivative of the sum-NMSE in (24) with respect to each reflecting element, and the gradient descent algorithm may be utilized to obtain a locally-optimal solution of (24) in an iterative manner. Another option would be to optimize the phase shifts of the RIS based on the maximization of the uplink or downlink ergodic net throughput. The solution of the corresponding optimization problem is, however, challenging and depends on whether the uplink or the downlink transmission phases are considered. Due to space limitations, therefore, we postpone this latter criterion for optimizing the phase shifts of the RIS to a future research work.

III. UPLINK DATA TRANSMISSION AND PERFORMANCE
ANALYSIS WITH MR COMBINING In this section, we first introduce a procedure to detect the uplink transmitted signals by capitalizing on the channel estimation method introduced in the previous section. Then, we derive an asymptotic closed-form expression of the ergodic net throughput.

A. Uplink Data Transmission Phase
In the uplink, all the users transmit their data to the APs simultaneously. Specifically, the user transmits a modulated symbol with E{| | 2 } = 1. This symbol is weighted by a power control factor √ , 0 ≤ ≤ 1, which enhances the spectral efficiency by, for example, compensating the near-far effects and mitigating the mutual interference among the users. Then, the received baseband signal, ∈ C, at the AP is where is the normalized uplink SNR of each data symbol, which is defined as the maximum transmit power divided by the noise variance, and is the corresponding SNR of the user ; and ∼ CN (0, 1) is the normalized additive noise. For data detection, the MR combining method is used at the CPU, i.e.,ˆ, ∀ , , in (15) is employed to detect the data transmitted by the user . In mathematical terms, the corresponding decision statistic is Based on the observation , the uplink ergodic net throughput of the user is analyzed in the next subsection.

B. Asymptotic Analysis of the Uplink Received Signal
In the considered system model, the number of APs, , and the number of tunable elements of the RIS, , can be large. Therefore, we analyze the performance of two case studies: ( ) is fixed and is large; and ( ) both and are large. The asymptotic analysis is conditioned upon a given setup of the CSI, which includes the large-scale fading coefficients, the covariance matrices, and the power utilized for the pilot and data transmission phases. To this end, the uplink weighted signal in (26) is split into three terms based on the pilot reuse set P , as follows where T 1 accounts for the signals received from all the users in P , and T 2 accounts for the mutual interference from the users that are assigned orthogonal pilot sequences. The impact of the additive noise obtained after applying MR combining is given by T 3 . From (5), (6), and (15), we obtain the following identity 1) Case I: is fixed and is large, i.e., → ∞. In this case, we divide both sides of (28) by and exploits Tchebyshev's theorem [37] 3 and (7) to obtain where − → denotes the convergence in probability. 4 Specifically, the second and third terms in (28) converge to zero due to the so-called favorable propagation conditions and since the aggregated channel and the noise are mutually independent [38]. By inserting (29) into the decision variable in (27), we obtain the following deterministic value because T 2 / → 0 and T 3 / → 0 as → ∞. The result in (30) unveils that, for a fixed , the channels become asymptotically orthogonal. In particular, the small-scale fading, the non-coherent interference, and the additive noise vanish. The only residual impairment is the pilot contamination caused by the users that employ the same pilot sequence. This result is the evidence that, due to pilot contamination, the system performance cannot be improved by adding more APs if MR combining is used. The contributions of both the direct and RIS-assisted indirect channels explicitly appear in (30) through the terms and tr(Θ Θ Θ ), respectively. 2) Case II: Both and are large, i.e., → ∞ and → ∞. To analyze this case study, we need some assumptions on the covariance matrices R and R , as summarized as follows.
Assumption 1. For = 1, . . . , and = 1, . . . , , the covariance matrices R and R are assumed to fulfill the following properties The assumptions in (31) and (32) imply that the largest singular value and the sum of the eigenvalues (counted with their mutiplicity) of the × covariance matrices that characterize the spatial correlation among the channels of the RIS elements are finite and positive. Dividing both sides of (28) by and then applying Tchebyshev's theorem and (7), we obtain ces have the same eigenvalues, it follows that tr(Θ Θ Θ ) > 0. Based on Assumption 1, we obtain the following inequalities where ( ) is obtained from Lemma 3 in Appendix A; ( ) follows because Φ Φ Φ 2 = 1; and ( ) is obtained by applying again Lemma 3. Based on Assumption 1, the last inequality in (34) is bounded by a positive constant. From (33), therefore, the decision variable in (27) can be formulated as which is bounded from above thanks to (34). The expression obtained in (35) reveals that, as , → ∞, the post-processed signal at the CPU consists of the desired signal of the intended user and the interference from the other users in P . Compared with (30), we observe that (35) is independent of the direct links and depends only on the RIS-assisted indirect links. This highlights the potentially promising contribution of the RIS, in the limiting regime , → ∞, for enhancing the system performance.

C. Uplink Ergodic Net Throughput with a Finite Number of APs and RIS Elements
In this section, we focus our attention on the practical setup in which and are both finite. By utilizing the user-andthen forget channel capacity bounding method [39], the uplink ergodic net throughput of the user can be written as follows where is the system bandwidth measured in MHz and 0 ≤ ≤ 1 is the portion of each coherence time interval that is dedicated to the uplink data transmission phase. The effective uplink signal-to-noise-plus-interference ratio (SINR), which is denoted by SINR , is defined as follows where the following definitions hold In particular, DS denotes the (average) strength of the desired signal, BU denotes the beamforming uncertainty, which represents the randomness of the effective channel gain for a given beamforming method, UI denotes the interference caused by the user to the user , and NO denotes the additive noise. We emphasize that the net throughput in (36) is achievable since it is a lower bound of the channel capacity. A closed-form expression for (36) is given in Theorem 1.
Theorem 1. If the CPU utilizes the MR combining method, a lower-bound closed-form expression for the uplink net throughput of the user is given by (36), where the SINR is where MI is the mutual interference and NO is the noise, which are formulated as follows given in (16), and given in (17).

Proof. See Appendix E.
By direct inspection of the SINR in (42), we observe that the numerator increases with the square of the sum of the variances of the channel estimates, , ∀ thanks to the joint signal processing, i.e., the received signals form the APs are sent to the CPU for centralized data detection. On the other hand, the first term in (43) represents the power of the mutual interference. The use of an RIS to support multiple users introduce the extra interference shown in the second and third terms in (43). Due to the limited and finite number of orthogonal pilot sequences being used, the fourth term in (43) dominates the impact of pilot contamination. The second term in the denominator in (42) is the additive noise. If the coherence time is sufficiently large that every user can utilize its own orthogonal pilot sequence, the uplink net throughput of the user can still be obtained from (36), but the effective SINR simplifies to (45). The SINR in (42) is a multivariate function of the matrix of phase shifts of the RIS and of the channel statistics, i.e., the channel covariance matrices. Table I gives a comparison of the obtained uplink SINR of the user with and without the presence of the RIS. By direct inspection of |DS | 2 , we evince that the strength of the desired signal increases thanks to the assistance of the RIS. However, the mutual interference becomes more severe as well, due to the need of estimating both the direct and indirect links in the presence of the RIS. By assigning orthogonal pilot signals to all the users, the coherent interference can be completely suppressed. In Section V, the performance of Cell-Free Massive MIMO and RIS-assisted Cell-Free Massive MIMO is compared with the aid of numerical simulations. IV. DOWNLINK DATA TRANSMISSION AND PERFORMANCE ANALYSIS WITH MR PRECODING In this section, we consider the downlink data transmission phase and analyze the received signal at the users when the number of APs is large or when the numbers of RIS elements and APs are both large. A closed-form expression of the downlink ergodic net throughput that is attainable with MR precoding and for an arbitrary phase shift matrix of the RIS elements is provided.

A. Downlink Data Transmission Phase
By exploiting channel reciprocity, the AP treats the channel estimates obtained in the uplink as the true channels in order to construct the beamforming coefficients. Accordingly, the downlink signal transmitted from the AP is 6 where is the normalized SNR in the downlink; is the complex data symbol that is to be sent (cooperatively by virtue of the considered coherent joint transmission scheme) by all the APs to the user , with E{| | 2 } = 1; and is the power control coefficient of the AP , which satisfies the power budget constraint as follows The cooperation among the APs for jointly transmitting the same data symbol to a particular user creates the major distinction between the downlink and uplink data transmission phases. Based on (46), the received signal at the user is the superposition of the signals transmitted by the APs as where ∼ CN (0, 1) is the additive noise at the user . The user decodes the desired data symbol based on the observation in (48).

B. Asymptotic Analysis of the Downlink Received Signal
In contrast with the uplink data processing where the CPU needs only the channel estimateˆfor detecting the data of the user , as displayed in (27), the received signal in (48) depends on the channel estimates of the users in the network, since the channel estimates from the users are used for MR precoding. Therefore, the analysis of the uplink and downlink data transmission phases are different. First, we split (48) into three terms, by virtue of the pilot reuse pattern P , as follows Then, we investigate the two asymptotic regimes for → ∞ and , → ∞. In particular, the first term in (49) can be rewritten as where ( ) is obtained by utilizing the channel estimates in (15) and ( ) is obtained by extracting the aggregated channel of the user from the summation. By letting and/or be large, similar to the uplink data transmission phase, we obtain the following asymptotic results which are bounded from above based on Assumption 1. Consequently, the received signal at the user converges to a deterministic equivalent as the number of APs is large, i.e., → ∞, and as the number of APs and RIS elements are large, i.e., , → ∞. More precisely, the received signal converges (asymptotically) to which indicates the inherent coexistence of the users in P . The deterministic equivalents in (53) and (54) unveil that the impact of the channel estimation accuracy and the channel statistics is different between the uplink and the downlink. In particular, the asymptotic received signal in the uplink only depends on the channel estimation quality of each individual user, which is manifested by the coefficient . The asymptotic received signal in the downlink depends, on the other hand, on the channel estimation quality of all the users that share the same orthogonal pilot sequences, i.e., , , ∀ ∈ P .

C. Downlink Ergodic Net Throughput with a Finite Number of APs and RIS Elements
By utilizing the channel capacity bounding technique [39], similar to the analysis of the uplink data transmission phase, the downlink ergodic net throughput of the user can be written as follows where 0 ≤ ≤ 1 is the portion of each coherence time interval dedicated to the downlink data transmission phase, with + = 1, and the effective downlink SINR is defined as where the following definitions hold In particular, DS denotes the (average) strength of the desired signal received by the user , BU denotes the beamforming uncertainty, UI denotes the interference caused to the user by the signal intended to the user . The downlink ergodic net throughput in (55) is achievable since it is a lower bound of the channel capacity, similar to the uplink data transmission phase. In contrast to the uplink ergodic net throughput, which only depends on the combining coefficients of each individual user, the downlink net throughput of the user depends on the precoding coefficients of all the users. A closed-form expression for (55) is given in Theorem 2.
Theorem 2. If the CPU utilizes the MR precoding method, a lower-bound closed-form expression for the downlink net throughput of the user is given by (55), where the SINR is where MI is the mutual interference, which is defined as follows Proof. The main steps of the proof are similar to those of the proof of Theorem 1. However, there are also major differences that are due to the coherent joint data transmission scheme among the APs. The details of the proof are available in Appendix F.
From Theorem 2, we observe that the effective downlink SINR has some similarities and differences as compared with its uplink counterpart in Theorem 1. Similar to the uplink, the numerator of (60) is a quadratic function that depends on the channel estimation quality and the coherent joint transmission processing. Differently from the uplink, the transmit power coefficients appear explicitly in the numerator of (60) as a result of the cooperation among the APs. The impact of pilot contamination in (61), in contrast to the uplink, depends on all the transmit power coefficients. In addition, we observe that the impact of pilot contamination scales up with the number of APs and with the number of elements of the RIS. When the users employ orthogonal pilot sequences, the downlink SINR expression of the user simplifies to (62).
A comparison of the obtained analytical expressions of the downlink SINR for Cell-Free Massive MIMO and RIS-assisted Cell-Free Massive MIMO systems is given in Table II. By  comparing Table I and Table II, the difference and similarities between the uplink and downlink transmission phases can be identified as well. With the aid of numerical results, in Section V, we will illustrate the advantages of RIS-assisted Cell-Free Massive MIMO especially if the direct links are not sufficiently reliable (e.g., they are blocked) with high probability.
Remark 2. We observe that the ergodic net throughput in the uplink (Theorem 1) and downlink (Theorem 2) data transmission phases depend only on the large-scale fading statistics and on the channel covariance matrices, while they are independent of the instantaneous CSI. This simplifies the deployment and optimization of RIS-assisted Cell-Free Massive MIMO systems. As anticipated in Remark 1, in fact, the phase shifts of the RIS can be optimized based on the (longterm) analytical expressions of the ergodic net throughputs in Theorem 1 and Theorem 2, which are independent of the instantaneous CSI. In this paper, we have opted for optimizing the phase shifts of the RIS in order to minimize the channel estimation error, which determines the performance of both the uplink and downlink transmission phases. The optimization of the phase shifts of the RISs based on the closed-form expressions in Theorem 1 and Theorem 2 is postponed to a future research work.

V. NUMERICAL RESULTS
In this section, we report some numerical results in order to illustrate the performance of the RIS-assisted Cell-Free Massive MIMO system introduced in the previous sections. We consider a geographic area of size 1.
where¯accounts for the path loss due to the transmission distance and the shadow fading according to the three-slope propagation model in [8]. The binary variables accounts for the probability that the direct links are unblocked, and it is defined as  through the RIS. This setup is denoted by "RIS-CellFree-NoLOS". In Fig. 2, we illustrate the cumulative distribution function (CDF) of the net throughput by using Monte Carlo simulations and the proposed analytical framework. The CDF is computed with respect to the locations of the APs and users in the considered area. The Monte Carlo simulation results are obtained by using (36) and (55) with the SINRs given in (37) and (56), while the analytical results are obtained by using Theorem 1 and Theorem 2. We observe a very good overlap between the numerical simulations and the obtained analytical expressions. From Fig. 2, we evince that the downlink net throughput per user is about 2.6× better than the uplink net throughput. This is due to the higher transmission power of the APs and the gain of the joint processing of the APs. Since the Monte Carlo simulations are not simple to obtain for larger values of the simulation parameters, the rest of the figures are obtained by using the closed-form expressions of the net throughput derived in Theorem 1 and Theorem 2.
In Fig. 3, we illustrate the average sum net throughput as a function of the probability˜in (64). In particular, the average uplink sum net throughput is defined as =1 E{ } and the downlink sum net throughput is defined as =1 E{ }, where the uplink and downlink SINRs are obtained by using Theorem 1 and Theorem 2. In particular, the expectation is computed with respect to the locations of the APs and users in the considered area. From the obtained results, we evince that Cell-Free Massive MIMO provides the worst performance if the blocking probability is large (˜is small). As expected, the average net throughput offered by Cell-Free Massive MIMO tends to zero if˜→ 0 (the direct links are unreliable). For example, at˜= 0.1, the average sum net throughput of Cell-Free Massive MIMO is approximately 2.0× and 1.4× smaller, in the uplink and downlink, respectively, than the average sum net throughput of the worst-case RIS-assisted Cell-Free Massive MIMO setup (i.e., RIS-CellFree-NoLOS). In the considered case study, in addition, we note that the    proposed RIS-assisted Cell-Free Massive MIMO setup offers the best average net throughput, since it can overcome the unreliability of the direct links thanks to the presence of the RIS. The presence of the RIS is particularly useful if˜is small, i.e.,˜< 0.2 in Fig. 3, since the direct links are not able to support a high throughput. In this case, the combination of Cell-Free Massive MIMO and RIS is capable of providing a high throughput and signal reliability.
In Fig. 4, we compare the three considered systems in terms of average sum net throughput when˜= 0.2. We observe the net advantage of the proposed RIS-assisted Cell-Free Massive MIMO system, especially in the downlink. In the uplink, in addition, even the worst-case RIS-assisted Cell-Free Massive MIMO system setup (i.e.,˜= 0) outperforms the Cell-Free Massive MIMO setup in the absence of an RIS. In Figs. 5-7, we show the average sum net throughput as a function of the number of APs, the number of users, and the number of orthonormal pilot signals, respectively. We observe that the average sum net throughput increases with the number of APs, with the number or users, and with the number of pilot sequences, and that the RIS-assisted Cell-Free Massive MIMO setup outperforms, especially in the downlink, the other benchmark schemes. Gains of the order of 1.7× and 2.6× are obtained in the considered setups.
In the following figures, we focus our attention only on the RIS-assisted Cell-Free Massive MIMO setup, since it provides the best performance in the analyzed setups. In Figs. 8 and 9, we report the uplink and downlink average sum net throughput as a function of number of engineered scattering elements of the RIS. In particular, we compare the average sum net throughput when the phase shifts of the RIS are randomly chosen and are optimized according to Corollary 2 over spatially-independent and spatially-corrrelated fading channels according to (2). In the presence of spatial correlation, the channel correlation matrices are R = I and R = I , ∀ , . We see different performance trends over spatially-independent and spatially-correlated fading channels. If the spatial correlation is not considered, we observe that there is no significant difference between the random and uniform phase shifts setup. In the presence of spatial uncorrelation, on the other hand, the uniform phase shift design obtained from Corollary 2 provides a much higher average throughput. This result highlights the relevance of using even simple optimization designs for RIS-assisted communications over spatially-correlated fading channels.
In Fig. 10, we analyze the impact of the size of the engineered scattering elements of the RISs on the uplink and downlink average net throughput, while keeping the total number of RIS elements fixed. The size of the considered RIS, which is a compact surface, is , which implies    that it increases as the size of each element of the RIS increases. In this setup, we observe that the average net throughput increases as the physical size of each element of the RIS increases. In Fig. 11, on the other hand, we analyze a setup in which the total size of the RIS is kept constant and equal to = 10 × 10 while the triplet ( , = = ) is changed accordingly. With the considered fading spatial correlation model and for a size of the RIS elements no smaller than /3, we do not observe a significant difference on the average net throughput. Further studies are, however, necessary for deep sub-wavelength RIS structures, for different optimization criteria of the phase shifts of the RIS, and in the presence of mutual coupling in addition to the fading spatial correlation [40], [41].
In Fig. 12, we plot the ergodic net throughput per user by utilizing different channel capacity bounding techniques, by assuming = 200 symbols and = 5000 symbols. The main benefit of the use-and-then-forget bound in (36) and the hardening bound in (55) is the possibility of obtaining a closed-form expression for the net throughput by capitalizing on the fundamentals properties of Massive MIMO communications, as demonstrated in Theorems 1 and 2. However, the channel hardening capability reduces in the presence of an RIS [38]. Thus, other channel capacity bounding techniques may result in a better estimate of the net throughput as compared to the actual channel capacity. The statistical bound in [42], [43] was originally derived for application to the downlink data transmission when no instantaneous CSI is available at the users and the channel vectors may be less hardened [10]. In order to apply the same bound to the uplink data transmission, we assume that only the channel statistics are available at the CPU. Even though the statistical bound results in a better ergodic net throughput per user, some realizations of user locations and shadow fading may lead to negative values of the ergodic net throughput. This may occur in high mobility scenarios, which correspond to small values of , as illustrated in Fig. 12( ). The statistical bound provides consistent values of the ergodic net throughput in low mobility scenarios when is large, as displayed in Fig. 12( ). The numerical results unveils the need of developing different and more accurate channel bounding methods for evaluating the net throughput of RIS-assisted Cell-Free Massive MIMO systems.
VI. CONCLUSION Cell-Free Massive MIMO and RIS are two disruptive technologies for boosting the system performance of future wireless networks. These two technologies are not competing with each other, but have complementary features that can be integrated and leveraged for enhancing the system performance in harsh communication environments. Therefore,  we have considered an RIS-assisted Cell-Free Massive MIMO system that operates according to the TDD mode. An efficient channel estimation scheme has been introduced to overcome the high overhead that may be associated with the estimation of the individual channels of the RIS elements. Based on the proposed channel estimation scheme, an optimal design for the phase shifts of the RIS that minimizes the channel estimation error has been devised and has been used for system analysis. Based on the proposed channel estimation method, closedform expressions of the ergodic net throughput for the uplink and downlink data transmission phases have been proposed. Based on them, the performance of RIS-assisted Cell-Free Massive MIMO has been analyzed as a function of the fading spatial correlation and the blocking probability of the direct AP-user links. The numerical results have shown that the presence of an RIS is particularly useful if the AP-user links are unreliable with high probability.
Possible generalizations of the results illustrated in this paper include the optimization of the phase shifts of the RIS that maximize the uplink or downlink throughput, the analysis of the impact of the fading spatial correlation for non-compact and deep sub-wavelength RIS structures, and the analysis and optimization of RIS-assisted systems in the presence of mutual coupling.

A. Useful Lemmas
This section reports three useful lemmas that are utilized for asymptotic analysis.  respectively. Then, the expectation on the left-hand side of (65) is where˜is the −th element of vectorx. By noting that E{|˜| 4 } = 2 from Lemma 4 and E{|˜| 2 |˜| 2 } = 1 if ≠ , we obtain the following Consequently, (66) can be further simplified as which, with the aid of some algebraic manipulations, coincides with (65).

B. Proof of Lemma 1
We first compute the second moment of the aggregated channel by capitalizing on the statistical independence of the direct and indirect channels, as follows where ( ) follows by applying the trace of product property tr(XY) = tr(YX) for some given size-matched matrices X and Y; and ( ) is obtained thanks to the independence of the cascaded channels h and z . The fourth moment of the aggregated channel can be written, from (6), as follows can be equivalently written as follows (71) By applying Lemma 4 with ∼ CN (0, ), the first expectation on the right-hand side of (71) is equal to By exploiting the independence between the direct and RISassisted links, the next three expectations on the right-hand side of (71) are equal to By introducing the normalized variable˜= h Φ Φ Φz / R 1/2 Φ Φ Φz with˜∼ CN (0, 1), the last expectation on the right-hand side of (71) is equal to where ( ) follows because˜is independent of the remaining random variables; and ( ) is obtained by virtue of Lemma 4. Inserting (72)-(74) into (71), the proof follows with the aid of some algebraic manipulations. Also, by exploiting the second moment in (7), the expectation of the two independent aggregated channels can be formulated as shown in (11). The correlation between the two aggregated channels and , ≠ , is, by definition, as follows where ( ) follows because the propagation channels are independent.
The expectation in (13) can be written as follows where ( ) is obtained by utilizing Lemma 5. Similar steps can be applied to the two aggregated channels and with ≠ . The expectation in (12) can be written as follows where ( ) follows from the independence of the direct links.

D. Proof of Corollary 2
We introduce the shorthand notation =˜2 2 and = 2 2 ∈ P˜ . When the direct links are weak enough to be negligible, the NMSE of the channel estimate of the user at the AP can be reformulated as follows Let us denote by tr Φ Φ Φ RΦ Φ ΦR = =1 =1 NMSE the objective function of the problem in (24), the first-order derivative of NMSE with respect to tr Φ Φ Φ RΦ Φ ΦR is which implies that the objective function is a monotonically decreasing function of tr Φ Φ Φ RΦ Φ ΦR since ≥ 0, ∀ , . Moreover, Φ Φ Φ RΦ Φ ΦR is similar to R 1/2 Φ Φ Φ RΦ Φ ΦR 1/2 , which is a positive semidefinite matrix. Thus, we obtain With the aid of Cauchy-Schwarz's inequality, we obtain the following upper bound for (83) which holds with equality if and only if the two vectors a and b in (84) and (85) are parallel. This implies = , ∀ , . By combining (82) and (86), the proof is concluded.

E. Proof of Theorem 1
To obtain the closed-form expression of the uplink SINR in (37), we first compute |DS | 2 by using the definition of in (6) as where ( ) is obtained by expressing the original channel into the summation of its channel estimate and its estimation error as stated in Lemma 2; and ( ) follows because the channel estimate and the channel estimation error are uncorrelated.
Since the aggregated channels sharing the same AP index are correlated, the first expectation in the denominator of (37) is equal to where =ˆ * − E{ˆ * } with = 1. The closed-form expression of¯0 defined in (88) is as follows thanks to (22) in Corollary 1. The expectation of¯1 defined in (88) can be rewritten as follows where ( ) is obtained by using the identity E{| − E{ }| 2 } = E{| | 2 } − |E{ }| 2 ; and ( ) is obtained from * =ˆ * + * , by taking into account that the channel estimate and the channel estimation error are uncorrelated random variables as stated in Lemma 2. By replacingˆ * = * in the first expectation of (88), we obtain Let us analyze the three terms, 11 , 12 , and 13 , in the last equality of (91). The first term can be computed by exploiting the forth moment given in (8), as follows where ( ) is obtained by using the variance of the channel estimate in (17), with and that are defined in the statement of the theorem. The second term can be computed by exploiting the uncorrelation of the two cascaded channels, as follows The last term can be computed by exploiting the independence of the channel and noise, as follows By inserting (92)-(94) into (91) and with the aid of some algebraic steps, we obtain where the final identity is obtained by using the relationship between and in (17). Combining (88) and (95), the first term in the denominator of (37) simplifies to The second term in the denominator of (37) can be split into the two terms based on the pilot reuse pattern defined in P , as follows The first term on the right-hand side of (97) is the non-coherent interference, which is equal to We compute each expectation by utilizing the channel estimate in (15) with the aid of some algebraic manipulations, as follows where the following definitions hold = E{ * * }, The closed-form expression of¯ is obtained by utilizing the uncorrelated property of the quadruple aggregated channels in (12), as follows = tr(Θ Θ Θ Θ Θ Θ ).
In addition, the closed-form expression of¯ can be computed by utilizing the results in Lemma 1, as follows  The second term on the right-hand side of (97) is the coherent interference. By using (5) and (15), it simplifies as follows which is obtained by using the identity E{| + | 2 } = E{| | 2 }+ E{| | 2 } that holds true for zero-mean and uncorrelated random variables. The expectation 21 can be simplified as follows Similarly, the expectation 22 in (105) can be simplified as follows tr(Θ Θ Θ Θ Θ Θ ). (107) By inserting (106) and (107) into (105), and with the aid of some algebraic manipulations, we obtain where ( ) is obtained by using (16) and ( ) by using (17). Therefore, the total mutual interference between the user and the user who share the same pilot sequence can be written as follows Finally, the expectation of the additive noise after MR processing can be written as follows thanks to the independence between the channel estimate and the noise. The proof follows by inserting (87), (91), (109), and (111) into the SINR in (37) with the aid of some algebraic manipulations.

F. Proof of Theorem 2
Consider the downlink SINR in (56). Thanks to the uncorrelation between the channel estimate and the channel estimation error, the numerator simplifies to (112) The beamforming uncertainty term in the denominator of (56) can be simplified by using (15), as follows by utilizing (22) in Corollary 1. The expectation¯1 in (114) can be rewritten as follows where we have used the identities E{| −E{ }| 2 } = E{| | 2 }− |E{ }| 2 and E{| + | 2 } = E{| | 2 } + E{ | 2 } for zero-mean uncorrelated random variables. By using the identities in (91) and ( by using (5). Inserting (114) and (116) into (113), we obtain The mutual interference term in the denominator of (56) can be rewritten as follows where the first term on the right-hand side of (118) is obtained by exploiting the orthogonality of the pilot sequences. In the second summation of (118), = E{|UI | 2 } can be rewritten as follows which is obtained by taking into account that the noise is circularly symmetric. The term 21 depends on the non-coherent interference and can be simplified as follows by using the second moment in (7) and the uncorrelation among the aggregated channels. The last term 22 in (119) can be simplified as follows