Achievable Rate Analysis and Max-Min SINR Optimization in Intelligent Reflecting Surface Assisted Cell-Free MIMO Uplink

In this paper, we study the uplink transmission in an intelligent reflecting surface (IRS) assisted cell-free multiple-input multiple-output (MIMO) system where the central processing unit (CPU) only has statistical channel state information (CSI) to detect symbols, and to design the receiver filter coefficients, the power allocations, and the IRS phase shifts. The access points (APs) estimate only their local end-to-end channels with the users using minimum mean squared error (MMSE) estimation to implement matched filtering, thereby avoiding the large overhead associated with estimating individual IRS-assisted channels. Under this framework, we derive a closed-form expression for the achievable uplink net rate that only depends on the channel statistics. Using this expression, we formulate the problem of maximizing the minimum (max-min) signal-to-interference-plus-noise ratio (SINR) to design the receiver filter coefficients at the CPU, the power allocations at the users, and the phase shifts at the IRS, subject to per-user power constraints as well as IRS phase shift resolution constraints. The resulting problem is jointly non-convex in the three design variables and is solved using an alternating optimization algorithm. In particular, the receiver filter design is formulated as a generalized eigenvalue problem leading to a closed-form solution, the power allocation problem is solved using a geometric programming (GP) approach, and the IRS phase shifts are designed using an alternating maximization algorithm. For comparison, we also formulate and solve the max-min SINR problem for the scenario where the instantaneous imperfect CSI of all individual direct and IRS-assisted channels is available at the CPU. Numerical results show that the scheme designed using statistical CSI has the potential to outperform the scheme based on instantaneous CSI for a moderate to large number of IRS elements, due to savings in the channel estimation overhead.

In distributed massive MIMO systems, the transmit antennas are spread out over a large area and can provide much higher coverage by efficiently exploiting diversity against shadow fading, at the cost of increased backhaul requirements. These conventional cellular networks, where the centralized or distributed antenna array in each cell serves the users within the cell boundaries, are impaired by inter-cell interference that needs to be mitigated using advanced signal processing methods.
In recent years, cell-free massive MIMO has emerged as a promising technology to reduce the effect of inter-cell interference through its cell-free architecture and centralized processing. Cell-free massive MIMO is essentially a distributed massive MIMO network where a large number of access points (APs) distributed in a given area serve a smaller number of users in that area [3], [4]. All APs collaborate via a backhaul network, and serve all the users in the same time-frequency resource without any cells or cell boundaries. The system performance is significantly enhanced because it combines the benefits of distributed MIMO and massive MIMO, and also because the users are now closer to APs. Specifically, in the classical operating regime of cell-free massive MIMO, where a large number of distributed low-cost single antenna APs serve several single antenna users [3]- [7], conjugate beamforming/matched filtering techniques yield excellent net throughput in the downlink and uplink respectively. These techniques are computationally simple and can be implemented in a distributed manner with most processing done locally at the APs. In light of its several advantages, cell-free massive MIMO has been the subject of significant research attention in the last few years.

B. NEED FOR IRS-ASSISTED CELL-FREE MIMO
While cell-free massive MIMO networks have shown promising performance gains compared to conventional cellular networks, they can lead to high deployment cost and power consumption due to the high costs of both hardware and power sources required by a large-scale deployment of APs [8]- [11]. Even if single antenna APs are considered, each AP will have an associated standard radio frequency (RF) implementation requiring highly linear amplifiers that are operated with considerable power backoff, which severely limits the overall energy efficiency of the system. Moreover, in the actual deployment of a cell-free MIMO system, the layout of urban buildings is uneven, and it is difficult to achieve uniform coverage in the deployment of APs. For areas in the shelter of tall buildings or located far from all APs, the communication between the users and APs can be blocked, resulting in "shadow areas" [9]. One simple way to improve these users' quality of service is to deploy more APs to achieve high-density coverage, which again increases the deployment cost and energy consumption of the network thereby limiting its efficiency.
These problems can potentially be avoided if the network operator can support the users, especially those in the shadow areas, by improving the propagation environment without increasing the number of APs and the transmit power. A promising new technology that allows the network operator to have some degree of control over the propagation of radio signals is the intelligent reflecting surface (IRS) [12], [13], also known as the reconfigurable intelligent surface [14], [15] in existing literature. These surfaces can increase the performance of conventional communication systems by realizing reconfigurable propagation channels between the APs and the users with very low power requirements [12].
An IRS is envisioned as an array of a large number of passive reflecting elements that can introduce adjustable phase shifts onto the incident electromagnetic waves as directed by a smart controller connected to the IRS. The software-controlled reflections of the incoming signals can tailor the propagation environment in beneficial ways that can increase the coverage, improve the achievable rates, and increase the energy efficiency [13], [14], [16]-[18], without generating additional radio signals and thereby without consuming any noticeable additional power. The deployment costs of the IRS are also low and they can be deployed on urban building walls and other structures in the propagation environment [12]-[15]. It has been shown in the literature that an IRS-assisted MIMO system can achieve a performance similar to that of a massive MIMO system with a reduced number of antennas and reduced transmit power [16], [19]. Motivated by these observations, the integration of IRS into cell-free MIMO systems is of practical interest to reduce [...] highly unlikely to hold in practice for an IRS-assisted system because the IRS has no radio resources of its own to send, receive or process the pilot symbols. More recently, several works have developed channel estimation protocols for IRS-assisted systems. The earliest papers on this subject proposed an ON/OFF based channel estimation strategy, where only one IRS element is switched on in each channel estimation sub-phase [19]. Later, the works in [24], [25] proposed a novel discrete Fourier transform (DFT) based channel estimation strategy, in which all the IRS elements are on in each training sub-phase, and their reflection coefficients are determined by the DFT matrix. Under this strategy, the mean squared error in the channel estimates is significantly reduced because the IRS elements are always on to reflect more signal power to the AP.
These channel estimation protocols require an uplink training phase of length K + KN symbols to estimate all channels, where K is the number of users in the system and N is the number of IRS elements [19], [24]-[26]. This large training overhead compromises a large proportion of the performance gains expected from deploying an IRS. Some works develop lower-overhead channel estimation protocols based on the idea of grouping IRS elements or by assuming specific channel properties. For example, the authors in [21] and [27] introduce the idea of grouping adjacent IRS elements into sub-surfaces, which decreases the training overhead but also reduces the beamforming gains. Other solutions exploit the channel sparsity that can potentially exist in IRS-assisted massive MIMO channels to develop lower-overhead channel estimation algorithms [28], [29].
More recently, the authors in [30] and [31] propose three-phase and two-phase channel estimation frameworks respectively, in which the direct channels and the IRS-assisted channels of a typical user are estimated in the first phase, while the channels of the other users are estimated with lower overhead in the next phase. The reduction in training overhead is achieved by noting that the IRS-assisted channels of the other users are scaled versions of that of the typical user and only the N scaling factors, rather than the N channel vectors of dimension L, where L is the number of antennas at the AP, need to be estimated. As a result, the required length of the uplink training period is reduced from K + KN to K + N + (K−1)N/L symbols. However, the protocols in [30] and [31] are as costly as the ones in [19], [24], [25] for an IRS-assisted cell-free system with single-antenna APs, since the required length of the uplink training phase will still be K + KN symbols when L = 1. Moreover, even if we overlook this training overhead and assume perfect channel estimation, optimizing the IRS phase shifts based on instantaneous CSI at the pace of a fast-fading channel significantly increases the system complexity.
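The overhead comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, not taken from the cited works: the function names are hypothetical, and rounding (K−1)N/L up to an integer number of symbols is an assumption made here so that the result is a whole pilot length.

```python
import math


def full_training_length(K: int, N: int) -> int:
    """Pilot symbols needed to estimate all channels individually: K + K*N."""
    return K + K * N


def reduced_training_length(K: int, N: int, L: int) -> int:
    """Pilot symbols for the reduced-overhead protocols: K + N + (K-1)*N/L.

    Rounding the last term up to an integer is an assumption of this sketch.
    """
    return K + N + math.ceil((K - 1) * N / L)


if __name__ == "__main__":
    K, N = 4, 64  # hypothetical numbers of users and IRS elements
    for L in (1, 8, 64):  # hypothetical numbers of AP antennas
        print(L, full_training_length(K, N), reduced_training_length(K, N, L))
```

With L = 1 the two lengths coincide for any K and N, which is the reason the reduced-overhead protocols bring no saving for single-antenna APs.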
To mitigate the challenges associated with instantaneous CSI acquisition and IRS optimization, the authors in [32]-[38] design the IRS parameters using only statistical CSI, i.e., the IRS phase shifts are designed using the slowly-varying large-scale channel statistics such as path loss and correlation matrices, without requiring the instantaneous CSI of individual AP-IRS and IRS-users channels. In particular, the authors in [34] maximize the achievable average sum-rate of an IRS-assisted multi-user multiple-input single-output (MISO) system using a two-timescale transmission protocol, in which the IRS phase shifts are optimized based on the statistical CSI, while the precoding at the BS is designed using the instantaneous CSI of the users' aggregate end-to-end channels with the optimized IRS phase shifts. The authors in [35] consider sum-rate maximization and minimum user rate maximization problems in an uplink IRS-assisted MIMO system, where the IRS phase shifts are designed based on statistical CSI using the derived achievable rate expression. Other solutions like the random rotations scheme at the IRS and opportunistic beamforming using an IRS have been proposed in [39], [40], in which the IRS phase shifts are randomly drawn from a given distribution without requiring instantaneous CSI of the IRS-assisted channels. The only instantaneous CSI then needed is that of the overall end-to-end channel (i.e., the aggregate of the direct link and the IRS-assisted link) to implement the beamforming transmission scheme at the AP. These works significantly reduce the channel training overhead and IRS beamforming design complexity over the existing schemes that are based on the instantaneous CSI of all channels.

2) IRS-ASSISTED CELL-FREE MIMO SYSTEMS
The current literature on IRS-assisted cell-free MIMO systems is limited. The authors in [8] consider an IRS-assisted cell-free MIMO system where multiple IRSs are deployed around APs and users to create favorable propagation conditions. They optimize the digital beamforming at the APs and the analogue beamforming at the IRS to maximize the energy efficiency. An aerial IRS (AIRS)-aided cell-free massive MIMO system is studied in [9], with the goal of optimizing the power allocation and beamforming of each AP, and the placement and phase shift parameters of the AIRS to maximize the achievable downlink rate of a user that is in a shadow area caused by the shelter of a tall building. The authors in [10], [11] study an IRS-aided cell-free network in a wideband scenario and formulate the precoding and beamforming design problem at the APs and IRSs respectively to maximize the network capacity. All these works focus on formulating and solving optimization problems for IRS-assisted cell-free MIMO systems with different communication objectives under the assumption of the availability of perfect instantaneous CSI of all links.
In contrast to these works, the authors in [41] consider the uplink of an IRS-assisted cell-free MIMO system and optimize the IRS phase shifts under imperfect CSI obtained using the ON/OFF channel estimation protocol from [19], which not only incurs a large training overhead but also results in a lower channel estimation quality as compared to other available channel estimation protocols. Additionally, the proposed scheme requires the IRS phase shifts to be designed instantaneously with each channel realization, which increases the computational complexity at the CPU as well as the signal exchange overhead between the CPU and the IRS controller. The authors in [7] develop closed-form expressions for the ergodic net throughput in the uplink and downlink of an IRS-assisted cell-free MIMO system, which depend only on the channel statistics. However, system parameters like the power allocations at the users and the phase shifts of the IRS are not optimized for the derived throughputs and are considered to be given fixed quantities.
Based on these works, we have two extremes for the operation of IRS-assisted cell-free MIMO systems. On one extreme, we can estimate all channels at the APs and transmit this full instantaneous CSI to the CPU to optimize performance at the expense of high complexity, fronthaul traffic, and channel estimation overhead. On the opposite extreme, we can assume CSI is not available at the APs and the CPU, which will reduce performance due to the lack of optimized matched filtering at the APs and optimized beamforming at the IRS, as well as detection without CSI at the CPU. It is thus of interest to strike a compromise between these two extremes which is practical and achieves good performance.

D. MOTIVATION AND CONTRIBUTIONS OF THE CURRENT WORK
In light of the discussion so far and different from [8]- [11], [41], we study an IRS-assisted cell-free MIMO system in which the system parameters are designed based on statistical CSI instead of instantaneous CSI. By considering instantaneous imperfect CSI of only the aggregate AP-users channels at each AP for matched filtering and statistical CSI at the CPU for information decoding and parameter optimization, we avoid the need for estimating all individual IRS-assisted channels as well as the need for designing IRS phase shifts instantaneously with each channel realization. For the considered IRS-assisted cell-free MIMO system, we develop an analytical framework for studying the achievable uplink rate as well as optimizing the minimum achievable rate of the system.
Specifically, we consider the uplink of an IRS-assisted cell-free MIMO system where multiple single-antenna users transmit their data to multiple single-antenna APs in the presence of direct as well as IRS-assisted links. First, we outline the channel estimation scheme in which each AP estimates the overall end-to-end channel with respect to each user, which is the aggregate of the direct link and the IRS-assisted link between that AP and user. Taking into account the effect of pilot contamination, we apply the minimum mean squared error (MMSE) estimation technique to obtain these estimates. The APs employ matched filtering on the received signals using the derived estimates of the aggregate channel, and forward them to the CPU for information decoding. To improve the performance of the system, the CPU uses a receiver filter before decoding [5]. For this transmission scheme, we express the achievable uplink net rate of each user assuming that the CPU only exploits the knowledge of channel statistics between the users and APs to decode the transmitted message. Closed-form expressions of the achievable uplink net rate and SINR of each user are derived by applying statistical tools. The resulting expressions only depend on the channel statistics and the design variables, i.e., the receiver filter coefficients of the CPU, the power allocations across the users, and the phase shifts at the IRS.
Different from [8]-[11], [41] that use instantaneous rate expressions to design the IRS-assisted cell-free MIMO system, we formulate the max-min SINR optimization problem subject to per-user power constraints as well as constraints on the IRS phase shift resolution, using the derived achievable net rate expression that only depends on statistical CSI. The non-convex problem is solved by decoupling it into three sub-problems, and solving them using an alternating optimization algorithm. The receiver filter coefficient design sub-problem is formulated as a generalized eigenvalue problem for which an optimal closed-form solution is obtained. The power allocation sub-problem is formulated as a standard geometric program which can be optimally solved. The IRS phase shift design sub-problem is solved using an alternating maximization approach to sequentially optimize the IRS phase shifts. As a benchmark scheme, we also formulate and solve the max-min SINR problem for the scenario where the instantaneous imperfect CSI of all individual direct and IRS-assisted channels is available at the CPU for decoding and parameter optimization. We use the MMSE-discrete Fourier transform (DFT) protocol from [24], [25] to estimate all channels and derive an expression for the achievable uplink net rate. The max-min SINR problem is then formulated and solved using an alternating optimization algorithm, with the IRS phase shifts optimized using Dinkelbach's algorithm. Finally, we present numerical results to validate the theoretical analysis and yield insights into the performance of the proposed scheme.
The work described above results in the following contributions:
• An explicit expression of the MMSE estimate of each aggregate AP-user channel presented in Lemma 1.
• Closed-form expressions of the achievable uplink net rate and SINR of each user presented in Theorem 1, for the scenario where the CPU only has statistical CSI.
• An alternating optimization algorithm outlined in Algorithm 2 to solve the max-min SINR optimization problem, with the IRS phase shifts designed using the alternating maximization algorithm outlined in Algorithm 1. An attractive feature of the proposed algorithms, both of which depend only on statistical CSI, is that the parameters do not need to be optimized instantaneously, which reduces complexity.
• Closed-form expressions of the achievable uplink net rate and SINR of each user presented in Theorem 2, and an alternating optimization algorithm outlined in Algorithm 4 to solve the max-min SINR optimization problem, for the scenario where the instantaneous imperfect CSI of all channels is available at the CPU.
• Simulation results that reveal: (1) the proposed max-min SINR scheme designed using statistical CSI can outperform the scheme designed using instantaneous CSI for a moderate to large number of IRS elements, (2) deploying an IRS can reduce the required number of APs to achieve a certain minimum rate performance, and (3) an IRS implemented using 2-bit phase shifters yields a performance very close to that of an IRS implemented using higher-resolution phase shifters.
To the best of the authors' knowledge, this is the first work to solve the max-min SINR problem for an IRS-assisted cell-free MIMO system using statistical CSI. This is also the first work to compare the performance of max-min SINR schemes in any IRS-assisted communication setting under statistical and instantaneous CSI.
The paper is organized as follows. In Section II, the IRS-assisted cell-free MIMO system model is outlined. In Section III, the channel estimates and achievable uplink rates are derived. In Section IV we solve the max-min SINR problem based on statistical CSI to optimize the design variables. In Section V the same max-min SINR problem is solved considering instantaneous CSI of all channels. Simulation results and conclusions are provided in Section VI and Section VII, respectively.

II. SYSTEM MODEL AND PROBLEM FORMULATION
In this section, we present the uplink transmission model for the considered IRS-assisted cell-free MIMO system and describe the max-min rate problem formulation.

A. UPLINK TRANSMISSION MODEL
We consider the uplink transmission in a cell-free MIMO system, where K single-antenna users communicate with a CPU via M single-antenna APs, as shown in Fig. 1. The APs are connected to the CPU via perfect backhaul links (footnote 1). All APs and users are randomly located in the coverage area and communicate in the same time-frequency resource under the time-division duplexing (TDD) protocol. This communication is assisted by an IRS comprising N reflecting elements that introduce phase shifts onto the impinging electromagnetic waves. The phase shifts are adjusted by an IRS controller that exchanges information with the CPU via a control link.
In the uplink data transmission phase, user $k$ wants to send a message $m_k$ with rate $R_k$ to the CPU. This message is encoded into a codeword with symbols $s_k \sim \mathcal{CN}(0, 1)$ (where $\mathcal{CN}(0, 1)$ represents the circularly symmetric complex normal distribution with zero mean and unit variance), which are sent to all APs. The transmitted signal from user $k$ is given as

$$x_k = \sqrt{q_k}\, s_k, \tag{1}$$

where $q_k$ is the transmit power. The received signal at AP $m$ is given by

$$y_m = \sqrt{\rho_u} \sum_{k=1}^{K} g_{mk} x_k + w_m, \tag{2}$$

where $\rho_u$ is the normalized uplink SNR defined as one divided by the noise variance ($\rho_u q_k$ is the corresponding SNR of user $k$), $g_{mk}$ is the channel between AP $m$ and user $k$, and $w_m \sim \mathcal{CN}(0, 1)$ is the additive noise at AP $m$.

1. We consider infinite capacity/noiseless backhaul links. Such perfect backhaul links can be established through fiber links between the APs and the CPU [5]. In [42], [43], the authors show that even with wireless microwave backhaul links, the performance of a limited-backhaul cell-free system closely approaches the performance of a cell-free system with perfect backhaul.
To help the CPU decode the message from user $k$, AP $m$ implements matched filtering, i.e., it multiplies the received signal $y_m$ in (2) with the conjugate of its (locally obtained) channel estimate $\hat{g}_{mk}$, $\forall k$ (footnote 2). Then the obtained quantities $\hat{g}^*_{mk} y_m$, $\forall k$, are sent to the CPU via a backhaul link. In order to improve the achievable rate, the forwarded signals are further multiplied by receiver filter coefficients $u_{mk}$ at the CPU as proposed in [5]. Specifically, the CPU multiplies the signal $\hat{g}^*_{mk} y_m$ received from AP $m$ with $u_{mk}$, for all $m$, before decoding user $k$'s message $m_k$, which is encoded into symbols $s_k$ (footnote 3). The aggregated received signal at the CPU from all APs can be written as

$$r_k = \sum_{m=1}^{M} u_{mk}\, \hat{g}^*_{mk}\, y_m \tag{4}$$

for $k = 1, \dots, K$. Let us define $\mathbf{u}_k = [u_{1,k}, \dots, u_{M,k}]^H$ as the vector of receiver filter coefficients applied by the CPU on the received signals from all APs for the decoding of $m_k$. This vector satisfies $\|\mathbf{u}_k\| = 1$ without loss of generality.

2. The channel estimates $\hat{g}_{mk}$ are derived in Section III-A.

3. These coefficients are later optimized at the CPU to maximize the minimum uplink rate. The motivation behind introducing these coefficients is that the CPU can design them to minimize the inter-user interference. Note that the APs use matched filtering, which maximizes the desired signal but is not optimal in terms of inter-user interference. Significant gains in the max-min rate were observed in [5] with the introduction of these coefficients.
The CPU wants to decode the message m k sent by user k from the aggregated received signal r k in (4). The expression of the rate R k that can be achieved for each user will be studied in Section III and Section V for two different transmission schemes. Next we outline the model for the channel g mk between AP m and user k in (2).
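The processing chain described in this subsection (superposition of the user signals at each AP, matched filtering with the conjugate channel estimate, and weighting by the receiver filter coefficients at the CPU) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the received signal is taken to be $\sqrt{\rho_u} \sum_k g_{mk} \sqrt{q_k} s_k + w_m$, and the numbers in the usage example are arbitrary.

```python
import math


def ap_received_signal(g_m, q, s, rho_u, w_m=0j):
    """Signal at one AP: sqrt(rho_u) * sum_k g_mk * sqrt(q_k) * s_k + w_m.

    g_m: channels from all K users to this AP, q: transmit powers,
    s: transmitted symbols, w_m: additive noise sample (zero by default).
    """
    mixture = sum(g * math.sqrt(qk) * sk for g, qk, sk in zip(g_m, q, s))
    return math.sqrt(rho_u) * mixture + w_m


def cpu_combined_signal(u_k, g_hat_k, y):
    """Aggregate signal for user k: sum_m u_mk * conj(g_hat_mk) * y_m.

    u_k: receiver filter coefficients, g_hat_k: channel estimates of user k
    at all M APs, y: received signals at all M APs.
    """
    return sum(u * gh.conjugate() * ym for u, gh, ym in zip(u_k, g_hat_k, y))


if __name__ == "__main__":
    # Hypothetical noise-free example with M = 2 APs and K = 1 user.
    g = [[1 + 1j], [2 + 0j]]          # g[m][k]
    y = [ap_received_signal(g[m], [4.0], [1 + 0j], rho_u=1.0) for m in range(2)]
    u = [0.6, 0.8]                    # unit-norm receiver filter
    print(cpu_combined_signal(u, [g[0][0], g[1][0]], y))
```

In this toy case the estimates equal the true channels, so each matched-filter output is real and the CPU weights simply add the two branches coherently.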

B. CHANNEL MODEL
We assume a quasi-static block fading model where the channels stay constant in each coherence interval of length $\tau_c$ symbols. The channel between AP $m$ and user $k$ is given as

$$g_{mk} = \mathbf{h}^{[1]H}_m \boldsymbol{\Theta}\, \mathbf{h}^{[2]}_k + h^{[d]}_{mk}, \tag{5}$$

where $\mathbf{h}^{[1]}_m \in \mathbb{C}^{N \times 1}$ is the channel between AP $m$ and the IRS, $\mathbf{h}^{[2]}_k \in \mathbb{C}^{N \times 1}$ is the channel between user $k$ and the IRS, $h^{[d]}_{mk} \in \mathbb{C}^{1 \times 1}$ is the direct channel between user $k$ and AP $m$, and $\boldsymbol{\Theta} = \alpha\, \mathrm{diag}(\exp(j\theta_1), \dots, \exp(j\theta_N)) \in \mathbb{C}^{N \times N}$ is a diagonal matrix accounting for the IRS response. Here $\theta_n \in [0, 2\pi)$ is the phase shift applied by element $n$ and $\alpha \in [0, 1]$ is the amplitude reflection coefficient, which depends on the IRS construction. Similar to the majority of the works on this subject (for example, [13], [14], [16]-[18], [20]), we assume $\alpha = 1$, motivated by the significant advancements made in the development of lossless metasurfaces. To keep the IRS model practical, we consider each IRS element to apply finite-resolution phase shifts that can only take a finite number of values equally spaced in the interval $[0, 2\pi)$. We denote by $b$ the number of bits used to represent each phase shift. Then the set of possible phase shifts at each IRS element is given by $\mathcal{Q} = \{0, \Delta\theta, \dots, \Delta\theta(Q - 1)\}$, where $\Delta\theta = 2\pi/Q$ and $Q = 2^b$. The $q$th element of $\mathcal{Q}$ is denoted by $\mathcal{Q}_q$.
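The finite-resolution phase shift set can be made concrete with a short sketch. The helper names below are hypothetical, and snapping an arbitrary phase to the nearest level is an illustration added here, not a step taken from the paper's algorithms.

```python
import cmath
import math


def phase_shift_set(b: int):
    """Return the 2**b phase levels {0, step, ..., step*(Q-1)}, step = 2*pi/Q."""
    Q = 2 ** b
    step = 2 * math.pi / Q
    return [q * step for q in range(Q)]


def quantize_phase(theta: float, b: int) -> float:
    """Map any phase (radians) to the nearest level, wrapping around 2*pi."""
    Q = 2 ** b
    step = 2 * math.pi / Q
    index = round((theta % (2 * math.pi)) / step) % Q
    return index * step


def irs_response_diagonal(thetas, alpha: float = 1.0):
    """Diagonal entries alpha * exp(j*theta_n) of the IRS response matrix."""
    return [alpha * cmath.exp(1j * t) for t in thetas]


if __name__ == "__main__":
    print(phase_shift_set(2))        # four levels for 2-bit phase shifters
    print(quantize_phase(6.2, 2))    # close to 2*pi, so it wraps to level 0
```

With alpha = 1 every diagonal entry has unit modulus, so the IRS only rotates the phase of the reflected signal.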
The channel between each AP and IRS is considered to be LoS dominated [16], [17], [25], [28], [39], [44]- [46] as motivated by two observations. First, the IRS can be deployed where a LoS path between the AP and IRS (both of which have fixed positions) is guaranteed. Second, the path loss in NLoS paths is much larger than that in the LoS path in the next generation communication systems. In fact it is noted that in mmWave systems, the typical value of the Rician factor (ratio of energy in the LoS component to that in the NLoS component) is between 20dB and 40dB [28], which is sufficiently large to neglect any NLoS channel components in h [1] m as compared to the LoS component. Under these considerations, we assume that the AP-IRS channel is LoS dominated. The elements of h [1] m are computed as [16], [47] [h [1] m ] n = β [1] m exp(−j2π (n − 1)d IRS sin ϑ m ), for m = 1, . . . , M, n = 1, . . . , N, where ϑ m is the LoS angle of arrival at the IRS from AP m, d IRS is the inter-element separation, and β [1] m is the channel attenuation factor that captures path loss and shadow fading.
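The LoS expression for the AP-IRS channel can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, the angle is taken in radians, and the inter-element separation is assumed to be expressed in wavelengths so that the exponent is dimensionless.

```python
import cmath
import math


def los_ap_irs_channel(N: int, beta: float, d_irs: float, angle: float):
    """Entries beta * exp(-j*2*pi*(n-1)*d_irs*sin(angle)) for n = 1..N.

    beta: channel attenuation factor, d_irs: inter-element separation
    (assumed to be in wavelengths), angle: LoS angle of arrival in radians.
    """
    return [
        beta * cmath.exp(-2j * math.pi * (n - 1) * d_irs * math.sin(angle))
        for n in range(1, N + 1)
    ]


if __name__ == "__main__":
    # Hypothetical values: 4 elements, half-wavelength spacing, 30 degrees.
    for entry in los_ap_irs_channel(4, 0.5, 0.5, math.pi / 6):
        print(entry)
```

All entries share the same magnitude and differ only by a linear phase progression across the elements, which is what makes this channel deterministic once the geometry and attenuation are known.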
Most existing works assume the IRS-users channels to undergo independent Rayleigh fading. In this paper, we consider a more realistic channel model by taking into account the spatial correlation among the scattering elements of the IRS and model the channels $\mathbf{h}^{[2]}_k$ as $\mathbf{h}^{[2]}_k \sim \mathcal{CN}(\mathbf{0}, \beta^{[2]}_k \mathbf{R}^{[2]}_k)$, for $k = 1, \dots, K$, where $\mathbf{R}^{[2]}_k$ is the covariance matrix that characterizes the spatial correlation between the channels between the IRS elements and user $k$. Moreover, the direct channel between AP $m$ and user $k$ is modeled as $h^{[d]}_{mk} \sim \mathcal{CN}(0, \beta^{[d]}_{mk})$. Here $\beta^{[2]}_k$ is the channel attenuation factor for the channel between the IRS and user $k$, whereas $\beta^{[d]}_{mk}$ is the channel attenuation factor for the direct link.
For the considered transmission model in Section II-A and the channel models in Section II-B, we are interested in maximizing the minimum uplink rate R k , by tuning the parameters of the system. This problem formulation is provided next.

C. PROBLEM FORMULATION
In this work, our objectives are to characterize the performance in terms of achievable rate of an IRS-assisted cell-free MIMO system, to evaluate how much gain an IRS brings in terms of minimum user rate in a cell-free MIMO setting, and to investigate how much reduction an IRS brings in the number of APs required to achieve a certain minimum user rate performance. To this end, we need to design the receiver filter coefficients at the CPU, the power coefficients at the users, and the phase shifts applied by the IRS elements to maximize the minimum uplink rate of the system. The max-min rate objective provides a good balance between system throughput and user fairness, and is widely used as a performance metric in wireless communication systems. For the IRS-assisted cell-free MIMO transmission model outlined earlier in this section, we will aim to maximize the minimum uplink rate while satisfying the per-user power constraints, the unit-norm constraints on the CPU receiver filter vectors, as well as the IRS phase shifts resolution constraints. This max-min rate problem is formulated below.
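$$
\begin{aligned}
\text{(P1)} \quad \max_{\{\mathbf{u}_k\},\, \{q_k\},\, \{\theta_n\}} \quad & \min_{k = 1, \dots, K} R_k \\
\text{subject to} \quad & \|\mathbf{u}_k\| = 1, \quad k = 1, \dots, K, \\
& 0 \le q_k \le p^{\max}_k, \quad k = 1, \dots, K, \\
& \theta_n \in \mathcal{Q}, \quad n = 1, \dots, N,
\end{aligned}
$$

where $p^{\max}_k$ is the maximum transmit power available at user $k$, and $\mathcal{Q}$ is the discrete phase shift set defined in Section II-B.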
In the sequel, we will derive the expressions of the achievable uplink rate R k and solve (P1) using these expressions.
In Sections III and IV, we will focus on the scenario where the CPU only has statistical CSI to solve Problem (P1) and design the system parameters. In Section V, we will solve this problem where the CPU has instantaneous imperfect CSI of all channels to design the system parameters, which serves as a benchmark for comparison.

III. CHANNEL ESTIMATION AND ACHIEVABLE RATES UNDER STATISTICAL CSI AT THE CPU
In this section, we first outline the channel estimation method used at AP $m$ to obtain the estimate $\hat{g}_{mk}$ of the end-to-end channel $g_{mk}$ with respect to each user $k$. Next we derive the expression of the achievable net rate $R_k$, $\forall k$, for the scenario where the CPU only has statistical CSI. This expression is later used to solve problem (P1).

A. CHANNEL ESTIMATION
There are two ways to estimate the channel and design the IRS phase shifts in IRS-assisted systems. One way is to estimate the individual IRS-assisted channels, i.e., the AP-IRS, the IRS-user, and the direct AP-user channels, and use these estimates to optimize the IRS phase shifts in $\boldsymbol{\Theta}$ to obtain favorable instantaneous channels that meet the desired performance objective. However, as discussed in the introduction, many existing channel estimation protocols to acquire this CSI require an uplink training time that grows proportionally with the number of IRS elements. Moreover, optimizing the IRS phase shifts based on instantaneous CSI at the pace of a fast-fading channel significantly increases the system complexity. The second way is to estimate the overall end-to-end channel $g_{mk}$ for a given $\boldsymbol{\Theta}$ and use it for beamforming at the AP, and then optimize $\boldsymbol{\Theta}$ using statistical CSI to obtain favorable channel statistics. We focus on the latter method, i.e., estimating the overall channel $g_{mk}$ for a given $\boldsymbol{\Theta}$ instead of estimating the individual channels $\mathbf{h}^{[2]}_k$ and $h^{[d]}_{mk}$. The motivation for doing so is two-fold. First, we save the prohibitive training overhead associated with estimating all individual IRS-assisted channels. Second, this suffices for optimizing the IRS phase shifts based on statistical CSI, i.e., based on the large-scale fading parameters like path loss factors and correlation matrices (footnote 4).

In order to estimate the channel coefficients $g_{mk}$ in the uplink training phase of length $\tau_p$ symbols, the APs employ the minimum mean-square error (MMSE) estimator. During the training phase, all K users simultaneously transmit their pilot sequences of length $\tau_p$ symbols to the APs. In cell-free MIMO systems, the number of orthogonal pilot sequences available is usually less than the number of users, resulting in the re-use of pilot sequences across the users. This leads to pilot contamination, which increases the channel estimation error. In this work we will consider pilot contamination during channel estimation.

4. Later in Section V we will devise a benchmark scheme in which all individual IRS-assisted channels and direct channels are estimated and the IRS phase shifts (and other design variables) are designed based on this full instantaneous CSI. The performance of the proposed statistical CSI based scheme will be compared to that of this benchmark scheme.
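Specifically, we consider $\tau_p$ orthogonal pilot sequences and $K$ users in our setup. When $\tau_p < K$, pilot contamination arises due to the re-use of pilot sequences across some users, while when $\tau_p \geq K$ there is no pilot contamination as all users are assigned unique orthogonal pilots. Let $\sqrt{\tau_p}\,\boldsymbol{\phi}_k \in \mathbb{C}^{\tau_p \times 1}$ be the pilot sequence used by user $k$, where $\|\boldsymbol{\phi}_k\|^2 = 1$. Then the $\tau_p \times 1$ received signal at AP $m$ is given as
$$\mathbf{y}^p_m = \sqrt{\tau_p \rho_p} \sum_{k=1}^{K} g_{mk} \boldsymbol{\phi}_k + \mathbf{w}^p_m,$$
where $\rho_p$ is the uplink SNR of each pilot symbol, also referred to as the training SNR, and the vector $\mathbf{w}^p_m \in \mathbb{C}^{\tau_p \times 1}$ represents the noise, whose elements are i.i.d. $\mathcal{CN}(0, 1)$ variables. AP $m$ projects the received training signal $\mathbf{y}^p_m$ on $\boldsymbol{\phi}_k$ to obtain the observation $\tilde{y}^p_{mk}$, given as
$$\tilde{y}^p_{mk} = \boldsymbol{\phi}_k^H \mathbf{y}^p_m = \sqrt{\tau_p \rho_p}\, g_{mk} + \underbrace{\sqrt{\tau_p \rho_p} \sum_{k' \neq k}^{K} g_{mk'} \boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'}}_{\text{Error due to pilot contamination}} + \boldsymbol{\phi}_k^H \mathbf{w}^p_m.$$
Note that the observation $\tilde{y}^p_{mk}$ (where $\frac{\tilde{y}^p_{mk}}{\sqrt{\tau_p \rho_p}}$ is also the least squares estimate of $g_{mk}$) is the sum of the desired channel and an error that arises due to the received noise in the uplink training phase as well as due to pilot contamination resulting from the re-use of pilot sequences across the users. If orthogonal pilots are assigned to all the users in the system, then $\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'} = 0$ for all $k' \neq k$, and the observation at AP $m$ for the estimation of $g_{mk}$ reduces to $\tilde{y}^p_{mk} = \sqrt{\tau_p \rho_p}\, g_{mk} + \boldsymbol{\phi}_k^H \mathbf{w}^p_m$. AP $m$ uses (11) to compute the MMSE estimate of the overall channel $g_{mk}$, which is presented in the following lemma.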
Lemma 1: Given AP m employs MMSE estimation using y p mk , the estimate of g mk is given aŝ where where η mk is the variance of channel estimate given as Moreover, the channel estimation error e mk = g mk −ĝ mk is uncorrelated (and also independent) ofĝ mk and is distributed as e mk ∼ CN (0, δ mk − η mk ).
Proof: The proof follows by applying the standard definition of the MMSE estimate [48] and using the expressions of the received training signal in (11) and the channel in (5) to compute it.
We can characterize the normalized mean squared error (NMSE) in the channel estimate $\hat{g}_{mk}$ as
$$\mathrm{NMSE}_{mk} = \frac{\mathbb{E}[|e_{mk}|^2]}{\mathbb{E}[|g_{mk}|^2]} = \frac{\delta_{mk} - \eta_{mk}}{\delta_{mk}}.$$
The NMSE is a suitable metric to evaluate the channel estimation error, and by definition, it lies in the range $[0, 1]$.
In particular, the NMSE tends to zero if orthogonal pilot sequences are used for every user, such that the error due to pilot contamination is zero (i.e., $\sum_{k' \neq k}^{K} \delta_{mk'} |\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'}|^2 = 0$), and the uplink training SNR is sufficiently large such that $\frac{1}{\tau_p \rho_p}$ goes to zero. If the training SNR is sufficiently large but the pilot sequences are re-used across the users, the NMSE remains non-zero and is determined by the pilot contamination term alone. Therefore the error in the channel estimates is determined by both pilot contamination and the uplink training SNR. The impact of pilot contamination on the system's performance will also be studied in the simulations.
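The two limiting cases above can be checked numerically. The sketch below assumes the textbook scalar MMSE forms $c_{mk} = \sqrt{\tau_p\rho_p}\,\delta_{mk} / (\tau_p\rho_p \sum_{k'} \delta_{mk'}|\boldsymbol{\phi}_k^H\boldsymbol{\phi}_{k'}|^2 + 1)$ and $\eta_{mk} = \sqrt{\tau_p\rho_p}\,\delta_{mk} c_{mk}$. These forms are an assumption that is consistent with $e_{mk} \sim \mathcal{CN}(0, \delta_{mk} - \eta_{mk})$, not a transcription of Lemma 1, and the function name and example values are hypothetical.

```python
import numpy as np

def mmse_statistics(delta_m, Phi, k, tau_p, rho_p):
    """Return (c_mk, eta_mk, nmse_mk) for one AP m and user k.

    delta_m : (K,) variances delta_{mk'} of the end-to-end channels at AP m
    Phi     : (tau_p, K) matrix whose columns are the unit-norm pilots phi_{k'}
    ASSUMPTION: textbook scalar MMSE expressions, not copied from the paper.
    """
    delta_m = np.asarray(delta_m, dtype=float)
    # |phi_k^H phi_{k'}|^2 for every user k' (equals 1 for k' = k)
    overlap = np.abs(Phi.conj().T @ Phi[:, k]) ** 2
    denom = tau_p * rho_p * np.sum(delta_m * overlap) + 1.0
    c = np.sqrt(tau_p * rho_p) * delta_m[k] / denom
    eta = np.sqrt(tau_p * rho_p) * delta_m[k] * c
    nmse = 1.0 - eta / delta_m[k]
    return c, eta, nmse

delta = [1.0, 0.5, 2.0]                       # hypothetical channel variances
orth = np.eye(3)                              # tau_p = K = 3, orthogonal pilots
reuse = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])           # tau_p = 2, users 0 and 2 share a pilot

_, _, nmse_orth = mmse_statistics(delta, orth, k=0, tau_p=3, rho_p=1e6)
_, _, nmse_reuse = mmse_statistics(delta, reuse, k=0, tau_p=2, rho_p=1e9)
```

With orthogonal pilots and a high training SNR the NMSE is close to zero, while under pilot re-use it settles at the ratio of the contaminating variance to the total variance on the shared pilot (here 2/3 for user 0).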
The channel estimates in (12) are used at the APs to implement matched filtering before forwarding the received signals to the CPU for decoding as described in (4).

B. ACHIEVABLE RATES
In this section, we derive the closed-form expression of the achievable uplink net rate for all users under the considered transmission model in which the CPU decodes $m_k$ from $r_k$ in (4). In deriving the achievable uplink rate of each user, we assume that the CPU exploits only the knowledge of channel statistics between the users and APs in decoding the desired message from the received signal.$^5$ Without loss of generality, the aggregated received signal in (4) can be written as
$$r_k = D_k s_k + U_k s_k + \sum_{k' \neq k}^{K} I_{kk'} s_{k'} + N_k, \quad (17)$$
where $D_k$ denotes the (average) strength of the desired signal of user $k$, $U_k$ represents the beamforming uncertainty for user $k$, $I_{kk'}$ represents the inter-user interference caused by user $k'$ to user $k$, and $N_k$ accounts for the noise following the matched filtering. We treat the sum of the second, third, and fourth terms in (17) as effective noise, and show that this effective noise is uncorrelated with the desired signal. Since $s_k$ is independent of $D_k$ and $U_k$, and noting that the beamforming uncertainty $U_k$ has zero mean by construction, the cross-correlation between the first and the second terms of (17) vanishes. Thus, the first and the second terms of (17) are uncorrelated.
Footnote 5: Note that for large $M$, which is the case in cell-free MIMO, detection using only the channel statistics is nearly optimal due to the channel hardening property [3], [5]. Each AP uses its local channel estimates to implement matched filtering to facilitate detection, while the CPU uses statistics of the received quantities $\hat{g}^*_{mk} y_m$ from the APs for detection. This ensures low computational complexity and offers a distributed implementation [5].
A similar calculation shows that the correlation between the first and the third terms of (17) is zero because $s_k$ and $s_{k'}$, $k' \neq k$, are independent. Therefore the third term of (17) is uncorrelated with the first. Similarly, the fourth term of (17) is uncorrelated with the first term since $w_m$ and $s_k$ are independent. Therefore, the effective noise and the desired signal are uncorrelated. While the effective noise is not Gaussian, it can be treated as uncorrelated Gaussian noise as a worst case to obtain the achievable uplink net rate $R_k$ of user $k$ in (20) [3], [5], [49], where $B$ is the system bandwidth and the corresponding uplink SINR $\gamma_k$ is given in (21). Based on the SINR definition in (21), we provide an exact closed-form expression for the achievable net rate in (20) under the IRS-assisted channel model in (5) and the channel estimate in (12). The result is presented in the following theorem.
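To make the notion of a "net" rate concrete, the following minimal sketch evaluates a rate of the form $B\,(1 - \tau/\tau_c)\log_2(1 + \gamma_k)$. The pre-log factor $1 - \tau/\tau_c$, which discounts the $\tau$ training symbols in a coherence interval of $\tau_c$ symbols, is an assumption about the form of (20) and is not reproduced from the paper. The function names are hypothetical.

```python
import math

def net_rate(sinr, bandwidth_hz, tau_c, tau_train):
    """Net rate in bit/s for one user.

    ASSUMPTION: the training overhead enters as the pre-log factor
    (1 - tau_train / tau_c); the exact form of (20) is not shown here.
    """
    prelog = 1.0 - tau_train / tau_c
    if prelog <= 0.0:
        return 0.0  # training consumes the whole coherence interval
    return bandwidth_hz * prelog * math.log2(1.0 + sinr)

def min_net_rate(sinrs, bandwidth_hz, tau_c, tau_train):
    """Max-min designs are scored by the worst user's net rate."""
    return min(net_rate(g, bandwidth_hz, tau_c, tau_train) for g in sinrs)

# Hypothetical values: B = 20 MHz, tau_c = 200 symbols.
rate_no_training = net_rate(1.0, 20e6, 200, 0)
rate_half_training = net_rate(1.0, 20e6, 200, 100)
worst_user = min_net_rate([1.0, 3.0, 7.0], 20e6, 200, 100)
```

Under this assumed form, a scheme with a higher SINR can still deliver a lower net rate if its training phase occupies a larger share of the coherence interval.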
Theorem 1: The achievable uplink net rate of user $k$ in the IRS-assisted cell-free MIMO system described in Section II-A under statistical CSI at the CPU is given by (20) with the uplink SINR $\gamma_k$ in (22), where $\delta_{mk}$, $c_{mk}$ and $\eta_{mk}$ are as defined in Lemma 1.
Proof: The proof is provided in the Appendix.
Note that the SINR in (22), and therefore the achievable rate in (20), are functions of only the large-scale fading statistics of the channels, like the path loss factors and correlation matrices, which change much more slowly than the actual fast-fading channels. Hence, the receiver filter coefficients $\mathbf{u}_k$'s, the power coefficients $q_k$'s and the IRS phase shifts, which will be optimized using the achievable rate expression, only need to be re-calculated when the large-scale fading statistics change.
We also remark here that the closed-form expression of the achievable uplink net rate resulting from the SINR expression derived in (22) is exact. The derivation of this expression utilizes results on the moments of the channel $g_{mk}$, the moments of the channel estimate $\hat{g}_{mk}$, and the correlation between $g_{mk}$ and $\hat{g}_{m'k'}$ for $m, m' = 1, \dots, M$ and $k, k' = 1, \dots, K$. These moments and correlations, as well as the final expression of the SINR, are derived exactly in the Appendix for the channel models considered in this paper, i.e., where the IRS-user channels follow correlated Rayleigh fading (which is a more realistic model than the independent Rayleigh fading considered in many other works), and the AP-IRS channels are LoS dominated, similar to the works in [16], [17], [25], [28], [39], [44]-[46] (motivated by the fixed location of the IRS, which can be in the LoS of the APs). The extension of the result in Theorem 1 to other channel models, such as Rician fading for all links or correlated Rayleigh fading for AP-IRS links, can be done following similar procedures as in the Appendix.
Before proceeding to optimizing the design variables, we simplify the result in Theorem 1 for different operating scenarios. First we consider the scenario where perfect CSI of the overall end-to-end channels g mk 's is available at the APs to implement matched filtering. The resulting achievable uplink rate is presented in the following corollary.
Corollary 1: Under the setting of Theorem 1, if perfect CSI is available at the APs, then the achievable uplink net rate of user k is given by (20) with an uplink SINR of where and mk and δ m,k are as defined in Theorem 1.
Proof: The proof follows by letting ρ p → ∞ in Theorem 1.
Next we simplify the result in Theorem 1 for the conventional cell-free MIMO setting with no IRS. The resulting achievable uplink rate is presented in the following corollary.
Corollary 2: The achievable uplink net rate of user $k$ in a cell-free MIMO system with $K$ single-antenna users and $M$ single-antenna APs that employ matched filtering is given by (20) with the uplink SINR in (32).
Proof: The proof follows by letting $\beta^{[1]}_m$ and $\beta^{[2]}_k$ equal zero in Theorem 1.
Note that Corollary 2 retrieves the result in [5, Th. 1] that deals with the conventional cell-free MIMO system.
By direct inspection of the desired signal in the numerators of (22) and (32), we can see that the strength of the desired signal increases in an IRS-assisted system. This is captured by the additional IRS-assisted term in $\delta_{mk}$, which scales with $\beta^{[2]}_k$ and depends on $\mathbf{h}^{[1]}_m$, $\mathbf{R}^{[2]}_k$ and the IRS phase shifts, and which appears in the $\eta_{mk}$'s that constitute $\eta_k$ in the numerator. However, the inter-user interference in an IRS-assisted system also increases, due to the increase in the $\delta_{mk}$'s as well as due to the addition of the term $\mathbf{u}_k^H \left( \sum_{k'=1}^{K} q_{k'} \mathbf{N}_{kk'} \right) \mathbf{u}_k$ in the denominator of (22). Moreover, the performance is affected by the choice of the $q_k$'s and $\mathbf{u}_k$'s. Therefore it is important to design the IRS phase shifts as well as the power allocations $q_k$'s and the CPU combining vectors $\mathbf{u}_k$'s to maximize the minimum achievable rate, in order to ensure gains over the conventional cell-free MIMO system. In the next section, we optimize the design variables to maximize the minimum achievable uplink net rate of the system.

IV. MAX-MIN SINR OPTIMIZATION UNDER STATISTICAL CSI AT THE CPU
In this section, we solve Problem (P1) described in Section II-C to design the receiver filter coefficients at the CPU, the power coefficients at the users, and the phase shifts applied by the IRS, using the achievable rate expression in Theorem 1. Since the logarithm is a monotonically increasing function of its argument, the max-min rate problem in (P1) is equivalent to the max-min SINR problem (P2). From the SINR expression in (22), we can see that all the optimization variables are coupled with each other and the objective function is not jointly convex in the optimization variables. Also, the discrete phase shift variables further complicate the problem and prevent the use of standard gradient search methods. To tackle this non-convexity and discreteness, we decouple (P2) into three sub-problems: a receiver filter design sub-problem, a power allocation sub-problem, and an IRS phase shifts design sub-problem. Then, we solve these sub-problems iteratively using an alternating optimization algorithm, i.e., we first optimize $\mathbf{u}_k$ $\forall k$ for given $q_k$'s and $\theta_n$'s, then optimize $q_k$ $\forall k$ for given $\mathbf{u}_k$'s and $\theta_n$'s, and finally optimize $\theta_n$ $\forall n$ for given $\mathbf{u}_k$'s and $q_k$'s, and we repeat this process until the algorithm converges in the value of the objective given by (39a). The overall algorithm to solve the joint optimization problem in (P2) is outlined towards the end of this section.
We remark here that all the variables, including the CPU receiver filter coefficients, the users' power allocations and the IRS phase shifts, significantly affect the minimum achievable net rate of the system and should therefore be jointly designed. We will outline the computational complexity of the alternating optimization algorithm that jointly optimizes these variables towards the end of this section, to show that this method is affordable and practical. In the simulation results, we will show the improvement in the minimum achievable net rate that comes with the inclusion of different optimization variables in the joint optimization problem (P2), which will also confirm the importance of jointly optimizing all design variables. Another factor that contributes to the affordability of this approach in practice is that (P2) only depends on statistical CSI, which means that the solution of (P2) only needs to be recomputed once the channel statistics change. This makes the proposed design practical for IRS-assisted systems, for which instantaneous CSI acquisition is considered a major challenge for practical implementation.

A. RECEIVER FILTER COEFFICIENTS DESIGN
In this subsection, we solve the receiver filter design sub-problem to maximize the minimum uplink SINR for a given set of transmit power allocations at the users and phase shifts at the IRS. The receiver filter coefficients at the CPU for all users (i.e., $\mathbf{u}_k$, $\forall k$) that maximize the minimum SINR can be obtained by independently maximizing the SINR of each user $k$ with respect to $\mathbf{u}_k$, as formulated in Problem (P3). Problem (P3) is a generalized eigenvalue problem [5], [50], for which the optimal solution can be obtained by determining the generalized eigenvector corresponding to the maximum generalized eigenvalue. The optimal solution for each $\mathbf{u}_k$ is given as
$$\mathbf{u}_k = \mathbf{v}_{\lambda_{\max}}(\mathbf{C}_k^{-1} \mathbf{A}_k), \quad (41)$$
where $\mathbf{v}_{\lambda_{\max}}(\mathbf{C}_k^{-1} \mathbf{A}_k)$ is the eigenvector corresponding to the maximum eigenvalue $\lambda_{\max}$ of $\mathbf{C}_k^{-1} \mathbf{A}_k$. Since $\mathrm{rank}(\mathbf{A}_k) = 1$, the optimal solution in (41) admits the closed form in (42) [50], where the normalization is done to satisfy the unit norm constraint in (40b).
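The rank-one shortcut can be illustrated numerically. The sketch below assumes that $\mathbf{A}_k$ factors as $\mathbf{a}\mathbf{a}^H$ for some vector $\mathbf{a}$ (a hypothetical name; the paper's own factor is not shown here) and that $\mathbf{C}_k$ is Hermitian positive definite. Under these assumptions the maximizer of $\mathbf{u}^H\mathbf{A}\mathbf{u} / \mathbf{u}^H\mathbf{C}\mathbf{u}$ is proportional to $\mathbf{C}^{-1}\mathbf{a}$, and no eigenvalue solver is needed.

```python
import numpy as np

def rank_one_receive_filter(C, a):
    """Unit-norm maximizer of (u^H a a^H u) / (u^H C u), i.e. C^{-1} a normalized."""
    w = np.linalg.solve(C, a)
    return w / np.linalg.norm(w)

def sinr_ratio(u, A, C):
    """Generalized Rayleigh quotient u^H A u / u^H C u."""
    return float(np.real(np.vdot(u, A @ u)) / np.real(np.vdot(u, C @ u)))

# Hypothetical random instance with M = 4 APs.
rng = np.random.default_rng(1)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
C = G @ G.conj().T + np.eye(M)          # Hermitian positive definite
A = np.outer(a, a.conj())               # rank-one numerator matrix

u_opt = rank_one_receive_filter(C, a)
# Reference value: largest eigenvalue of C^{-1} A from a general eigen-solver.
lam_max = float(np.max(np.linalg.eigvals(np.linalg.solve(C, A)).real))
```

The quotient achieved by `u_opt` matches `lam_max`, which equals $\mathbf{a}^H\mathbf{C}^{-1}\mathbf{a}$ in the rank-one case.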

B. POWER ALLOCATION DESIGN
Next we solve the power allocation problem for a given set of receiver filter coefficients $\mathbf{u}_k$'s and IRS phase shifts $\theta_n$'s. In this case, the uplink rates of different users are coupled, since the power allocated to user $k$ affects the rate $R_k$ as well as the rates $R_{k'}$ for $k' \neq k$ through interference. The power allocation problem can thus be formulated as Problem (P4). Without loss of generality, Problem (P4) can be reformulated by introducing a new auxiliary variable $t$, which yields Problem (P5).
Using (22), the constraint in (44c) can be rewritten as a posynomial inequality in $[t, q_1, \dots, q_K]$ with coefficients $a_{kk'}$ determined by the terms of (22), and Problem (P5) can therefore be rewritten as Problem (P6). The transformed problem in (P6) has an objective which is a monomial function of $t$ and constraints which are posynomial functions in $[t, q_1, \dots, q_K]$. Therefore Problem (P6) is a standard geometric program, which has the form [51]
$$\min_{\mathbf{x}} \; f_0(\mathbf{x}) \quad \text{subject to} \quad f_i(\mathbf{x}) \leq 1, \; i = 1, \dots, m, \quad g_i(\mathbf{x}) = 1, \; i = 1, \dots, p,$$
where $f_0$ and $f_i$ are posynomial and $g_i$ are monomial functions in the optimization variable $\mathbf{x} = [x_1, \dots, x_n]$. A standard geometric program can be easily turned into a geometric program in convex form using a logarithmic change of variables. CVX [52] does this conversion internally to yield the optimal solution of the standard geometric program and is used to solve (P6) optimally.
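For readers without a GP solver, the structure of the max-min power allocation can be sketched with a different technique: bisection over the target $t$, combined with a fixed-point feasibility check. This is not the CVX-based geometric programming route used in the paper. The sketch assumes a generic SINR of the form $\gamma_k = a_k q_k / (\sum_{k'} B_{kk'} q_{k'} + c_k)$ with non-negative $B$ and positive $a$, $c$. The names `a`, `B`, `c` and `p_max` are hypothetical placeholders for the coefficients that would be read off (22).

```python
import numpy as np

def min_sinr(q, a, B, c):
    """Smallest SINR over all users for power vector q."""
    return float(np.min(a * q / (B @ q + c)))

def powers_for_target(t, a, B, c, p_max, iters=2000):
    """Least power vector giving every user SINR t, or None if it exceeds p_max.

    Iterates q <- t (B q + c) / a from q = 0, which increases monotonically.
    """
    q = np.zeros_like(a)
    for _ in range(iters):
        q = t * (B @ q + c) / a
        if np.any(q > p_max):
            return None
    return q

def max_min_power(a, B, c, p_max, steps=60):
    """Bisection on the common SINR target t (assumed model, see lead-in)."""
    a, B, c = (np.asarray(x, dtype=float) for x in (a, B, c))
    lo, hi = 0.0, float(np.min(a * p_max / c))   # interference-free upper bound
    q_best = np.zeros_like(a)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        q = powers_for_target(mid, a, B, c, p_max)
        if q is None:
            hi = mid
        else:
            lo, q_best = mid, q
    return q_best, min_sinr(q_best, a, B, c)

# Hypothetical symmetric two-user example with unit power budget.
q_star, t_star = max_min_power(a=[1.0, 1.0],
                               B=[[0.0, 0.5], [0.5, 0.0]],
                               c=[1.0, 1.0],
                               p_max=1.0)
```

In this symmetric example both users end up at the power budget and the balanced SINR is $1/(0.5 + 1) = 2/3$. A fixed iteration count is used for the feasibility check, so targets very close to the feasibility boundary are only resolved approximately.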

C. IRS PHASE SHIFTS DESIGN
To optimize the IRS phase shifts for a given set of receiver filter coefficients and power allocation coefficients, we need to solve Problem (P7), i.e., maximize $\min_k \gamma_k$ over the phase shifts $\theta_n$, $n = 1, \dots, N$, subject to each $\theta_n$ belonging to the discrete set $\mathcal{Q}$ as required by (49b). The problem with respect to the IRS phase shifts is extremely involved. To highlight this, we express the SINR in (22) explicitly in terms of the IRS reflect beamforming vector $\bar{\mathbf{v}}$, as given in (50). Using this expression for $\gamma_k$, we can see that the objective of Problem (P7) takes the form of a ratio of sums of ratios of products of quadratic forms in $\bar{\mathbf{v}}$, which cannot be solved optimally without exhaustive search. To reduce the complexity of solving this non-convex problem, we use an alternating maximization approach in which the IRS phase shifts are optimized sequentially, one after the other. This way we have a sequence of scalar problems with respect to each IRS phase shift $\theta_n$, each of which can be solved by a line search in $[0, 2\pi)$. Given that each phase shift is constrained to belong to $\mathcal{Q}$ by (49b), we apply a direct search on this set for each $\theta_n$ while the other phase shifts are fixed, to maximize the minimum SINR. This process is repeated for each phase shift until the algorithm converges in the value of the objective. Formally, denoting by $F(\boldsymbol{\theta})$ the objective function of (P7) (i.e., $\min_k \gamma_k$), where $\boldsymbol{\theta} = [\theta_1, \dots, \theta_N]$, an alternating maximization algorithm to obtain the phase shifts can be stated as given in Algorithm 1.
Remark 1: Note that Algorithm 1 requires a proper choice of the initial discrete phase shifts. One way to obtain an initial feasible set of phase shifts is to generate $L$ different discrete phase shift vectors $\boldsymbol{\theta}_l = [\theta_{l1}, \dots, \theta_{lN}]$, $l = 1, \dots, L$, where each $\theta_{ln}$ is chosen randomly from $\mathcal{Q}$, and to compute the objective of (P7) for these $L$ vectors. The initial solution is then chosen as the vector that yields the largest objective value.
Since the minimum rate cannot grow to infinity, Algorithm 1 must converge in the value of the objective. Denoting by $I_1$ the number of iterations until convergence, Algorithm 1 requires $I_1 N Q$ evaluations and comparisons of the objective of Problem (P7). While the number of iterations $I_1$ until convergence is not known in advance, it is typically quite small ($< 10$), as will be seen in simulations. Therefore the complexity of the proposed method is significantly less than that of exhaustive search, while achieving global optimality with respect to each individual phase shift given the remaining ones. Note that the complexity of exhaustive search is exponential in $N$, as given by $O(Q^N)$.
Algorithm 2, iteration $i$: Obtain $\mathbf{u}^{(i)}_k$ as in (42) for $k = 1, \dots, K$ while using $\mathbf{q} = \mathbf{q}^{(i-1)}$ and $\boldsymbol{\theta} = \boldsymbol{\theta}^{(i-1)}$; Obtain $\mathbf{q}^{(i)}$ by solving Problem (P6) while using $\mathbf{U} = \mathbf{U}^{(i)}$ and $\boldsymbol{\theta} = \boldsymbol{\theta}^{(i-1)}$; Obtain $\boldsymbol{\theta}^{(i)}$ using Algorithm 1 while using $\mathbf{U} = \mathbf{U}^{(i)}$ and $\mathbf{q} = \mathbf{q}^{(i)}$;
Remark 2: When the IRS phase shifts are allowed to take any value from the continuous range [0, 2π ), we can still use the alternating maximization algorithm in Algorithm 1, with each IRS phase shift found using a line search in [0, 2π ) while the others are fixed.
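The element-by-element direct search described for Algorithm 1 can be sketched as follows. The listing of Algorithm 1 itself is not reproduced in this text, so the loop structure, the stopping rule and the toy objective (a minimum over hypothetical users of $|d_k + \mathbf{h}_k^T e^{j\boldsymbol{\theta}}|^2$) are assumptions made for illustration only. In the paper the objective would be $\min_k \gamma_k$ from (22).

```python
import numpy as np

def alternating_maximization(objective, theta0, levels, max_sweeps=50, tol=1e-9):
    """Coordinate-wise direct search over a discrete phase set.

    Each sweep visits every element n and keeps the level that gives the
    largest objective while the other elements stay fixed.
    """
    theta = np.array(theta0, dtype=float)
    best = objective(theta)
    sweeps = 0
    for sweeps in range(1, max_sweeps + 1):
        previous = best
        for n in range(theta.size):
            for candidate in levels:
                trial = theta.copy()
                trial[n] = candidate
                value = objective(trial)
                if value > best:          # accept only strict improvements
                    theta, best = trial, value
        if best - previous <= tol:        # converged in objective value
            break
    return theta, best, sweeps

# Toy instance (hypothetical): N = 8 elements, K = 3 users, 2-bit phases.
rng = np.random.default_rng(0)
N, K = 8, 3
levels = 2 * np.pi * np.arange(4) / 4
d = rng.standard_normal(K) + 1j * rng.standard_normal(K)
H = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))

def toy_objective(theta):
    return float(np.min(np.abs(d + H @ np.exp(1j * theta)) ** 2))

theta_init = rng.choice(levels, size=N)
theta_opt, f_opt, sweeps_used = alternating_maximization(toy_objective, theta_init, levels)
```

Each sweep costs $N$ times the number of levels objective evaluations, which matches the $I_1 N Q$ count quoted above. Because only strict improvements are accepted, the objective never decreases across sweeps.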

D. OVERALL ALGORITHM
Based on these three sub-problems, an alternating optimization algorithm is developed to solve Problem (P2) by alternately solving each sub-problem in each iteration. Denoting by $F_u(\mathbf{U}, \mathbf{q}, \boldsymbol{\theta})$ the objective of Problem (P2), where $\mathbf{U} = [\mathbf{u}_1, \dots, \mathbf{u}_K]$ and $\mathbf{q} = [q_1, \dots, q_K]$, the proposed algorithm is summarized in Algorithm 2.
Next we discuss the convergence of Algorithm 2, in which three sub-problems are alternately solved to determine the solution of Problem (P2). At the $i$-th iteration, an optimal solution is obtained for the receiver filter coefficients $\mathbf{u}^{(i)}_k$, $k = 1, \dots, K$, for given $\mathbf{q}^{(i-1)}$ and $\boldsymbol{\theta}^{(i-1)}$, resulting in $F_u(\mathbf{U}^{(i)}, \mathbf{q}^{(i-1)}, \boldsymbol{\theta}^{(i-1)}) \geq F_u(\mathbf{U}^{(i-1)}, \mathbf{q}^{(i-1)}, \boldsymbol{\theta}^{(i-1)})$. Next, an optimal solution is obtained for $\mathbf{q}^{(i)}$ for given $\mathbf{U}^{(i)}$ and $\boldsymbol{\theta}^{(i-1)}$, resulting in $F_u(\mathbf{U}^{(i)}, \mathbf{q}^{(i)}, \boldsymbol{\theta}^{(i-1)}) \geq F_u(\mathbf{U}^{(i)}, \mathbf{q}^{(i-1)}, \boldsymbol{\theta}^{(i-1)})$. Similarly, when $\boldsymbol{\theta}^{(i)}$ is obtained using Algorithm 1 for given $\mathbf{U}^{(i)}$ and $\mathbf{q}^{(i)}$, we can see based on the discussion on the convergence of Algorithm 1 in Section IV-C that $F_u(\mathbf{U}^{(i)}, \mathbf{q}^{(i)}, \boldsymbol{\theta}^{(i)}) \geq F_u(\mathbf{U}^{(i)}, \mathbf{q}^{(i)}, \boldsymbol{\theta}^{(i-1)})$. This shows that the minimum net rate (or SINR) is non-decreasing over the iterations. As the achievable max-min net rate is upper bounded under the given constraints, the proposed algorithm converges in the value of the objective.
The computational complexity of Algorithm 2 depends on the complexity of solving the generalized eigenvalue problem (P3), the geometric program (P6), and the alternating maximization algorithm to solve (P7). For the receiver filter design in (P3), an eigenvalue solver has a computational complexity equivalent to $O(KM^3)$ [53]. For the power allocation design, a standard geometric program in Problem (P6) can be solved with a complexity equivalent to $O(K^{7/2})$ [54, Ch. 10]. For the IRS phase shifts design in (P7), the alternating maximization algorithm has a complexity of $O(I_1 N Q)$, where $I_1$ is the number of iterations until the convergence of Algorithm 1. Therefore the overall complexity of Algorithm 2 can be written as $O(I_2 (KM^3 + K^{7/2} + I_1 N Q))$, where $I_2$ is the number of iterations until the convergence of Algorithm 2. The proposed algorithm therefore has only polynomial complexity in the system parameters $M$, $N$ and $K$, making it affordable. Note that a theoretical analysis of the number of iterations $I_1$ and $I_2$ to reach convergence is very challenging. In fact, no theoretical results are available even for very simple scenarios [55]. However, we show numerically in the simulation results that the number of iterations until the algorithm converges is usually very small.
We also remark here that the global solution of Problem (P2) is unknown and alternating optimization is a practical way to solve this non-convex problem, as is the case with most radio resource allocation problems in wireless communication systems. In fact, most of the works that solve different joint optimization problems for IRS-assisted communication systems also utilize the alternating optimization method to obtain solutions with affordable complexity [10], [11], [13], [14], [17], [18], [22], [23], [40], [41]. However, this is the first work that uses this approach to optimize the receiver filter coefficients at the CPU, the power allocations across the users, and the phase shifts at the IRS, to maximize the minimum rate in an IRS-assisted cell-free MIMO system. Moreover, the solutions for CPU receiver filter coefficients and the users' power allocations are optimal given the other design variables. Only the IRS phase shifts optimization is based on a sub-optimal numerical alternating maximization algorithm. The non-convexity of the objective function as well as the discrete phase shifts constraints prevent us from using standard gradient search methods to solve (P7). In order to get a closed-form solution for the IRS phase shifts, one needs to oversimplify the system model, which would become impractical. Therefore, alternating maximization algorithm is a practical and low-complexity approach to design these discrete phase shifts [40], [56].
All the proposed solutions in this section not only have affordable complexity as discussed above, but also require only statistical CSI for implementation, making them practical for IRS-assisted systems for which instantaneous CSI acquisition is a major challenge. Moreover, the optimization problem is solved under practical constraints like discrete phase shift resolution constraints at the IRS, which makes the solution practically relevant.
Despite the advantages of this approach, it is important to study how this solution compares to that designed using full instantaneous CSI of all channels. In the next section, we will solve the max-min uplink net rate problem (P1) for the scenario where instantaneous CSI of all direct and IRS-assisted channels is available at the CPU.

V. BENCHMARK SCHEME: INSTANTANEOUS CSI AT THE CPU
The scheme proposed in this work requires each AP $m$ to only estimate the overall end-to-end channel $g_{mk}$, $\forall k$, for a given IRS phase shifts matrix and use these estimates to implement matched filtering. The CPU uses only the statistical knowledge of the channels when performing the signal detection as well as when optimizing the design variables, i.e., the $\mathbf{u}_k$'s, the $q_k$'s and the IRS phase shifts.
In this section, we outline a scheme in which each AP $m$ estimates all individual IRS-assisted and direct channels, i.e., the AP-IRS channel ($\mathbf{h}^{[1]}_m$), the IRS-user channels ($\mathbf{h}^{[2]}_k$, $\forall k$), and the direct AP-user channels ($h^{[d]}_{mk}$, $\forall k$), that constitute the end-to-end channel $g_{mk}$ in (5). These estimates are forwarded to the CPU, which implements matched filtering before signal detection and uses this full instantaneous CSI to optimize the design variables, i.e., the $\mathbf{u}_k$'s, the $q_k$'s and the IRS phase shifts. The availability of the instantaneous CSI of all individual channels allows the CPU to optimize these variables for each new channel realization, resulting in a better minimum SINR as compared to the statistical CSI based scheme, where the variables are optimized only when the channel statistics change. However, the acquisition of full instantaneous CSI requires a large training time, which can compromise the achievable "net" rate. It is therefore useful to compare the achievable net rate performance of the max-min SINR scheme designed using full instantaneous CSI of all channels with that of the proposed max-min SINR scheme designed using statistical CSI.

A. CHANNEL ESTIMATION
We will briefly outline the MMSE-DFT channel estimation protocol from [24], [25] that is used to obtain the channel estimates. Each AP m computes the MMSE estimates of the direct and IRS-assisted channels constituting g mk ∀k, based on the received pilot sequences from the users over multiple sub-phases, where in each sub-phase the IRS applies an optimal reflect beamforming vector v tr ∈ C N×1 . This protocol requires N + 1 sub-phases to estimate all channels resulting in a training overhead of K(N + 1) symbols.
As discussed in the introduction, there are some works that develop lower overhead channel estimation protocols based on the idea of IRS elements grouping [21], [27] or by considering specific channel properties like channel sparsity [28], [29]. A more recent approach to estimate these channels appeared in [30] and [31], which proposed three-phase and two-phase channel estimation frameworks, respectively. To reduce the training overhead, these frameworks first estimate the direct and IRS-assisted channels of a typical user and then exploit the strong correlation between the IRS-assisted channels of different users, arising from the common AP-IRS channel in these links, to estimate the remaining IRS-assisted channels. When an AP has multiple antennas, say $L$, then instead of estimating $N$ IRS-assisted channels of dimension $L \times 1$ for each remaining user, it only needs to estimate $N$ scaling factors, since each $L \times 1$ IRS-assisted channel of each remaining user is just a scaled version of that of the typical user. The resulting training overhead is reduced to $K + N + \frac{(K-1)N}{L}$ symbols. However, when each AP only has a single antenna, as often considered in classical cell-free MIMO systems [3]-[5], [7], the total training overhead of these schemes is still $K(N + 1)$ symbols. This overhead is the same as that of the MMSE-DFT protocol in [25]. In fact, the NMSE in the MMSE estimates derived under the MMSE-DFT protocol is less than that in the MMSE estimates derived under the three-phase scheme in [30], as well as less than that in the least squares estimates derived under the two-phase scheme in [31].
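The overhead comparison above is simple arithmetic and can be checked directly. In the sketch below the function names are hypothetical, and rounding $(K-1)N/L$ up to a whole number of symbols is an added assumption, since a fractional symbol count is not meaningful in practice.

```python
import math

def overhead_mmse_dft(K, N):
    """Training symbols for the MMSE-DFT protocol: K(N + 1)."""
    return K * (N + 1)

def overhead_correlation_based(K, N, L):
    """Training symbols K + N + (K - 1)N / L for L antennas per AP.

    ASSUMPTION: the last term is rounded up to an integer symbol count.
    """
    return K + N + math.ceil((K - 1) * N / L)

# Hypothetical sizes: K = 4 users, N = 64 IRS elements.
single_antenna = overhead_correlation_based(4, 64, L=1)
eight_antennas = overhead_correlation_based(4, 64, L=8)
dft_protocol = overhead_mmse_dft(4, 64)
```

For single-antenna APs the two counts coincide, since $K + N + (K-1)N = K(N+1)$, which is the point made in the paragraph above.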
In light of this discussion, we utilize the MMSE-DFT protocol from [24], [25] since it does not make any assumptions on the channel properties and does not compromise on IRS beamforming gains. Moreover, it is also optimal in terms of the NMSE in the channel estimates. We also remark here that the achievable net-rate expression and optimization algorithm developed in this section can be easily extended to any other channel estimation protocol. The MMSE-DFT protocol is summarized next.
The total channel estimation period consists of S subphases, where similar to Section III-A, user k transmits the pilot sequence √ τ p φ k ∈ C τ p ×1 in each training sub-phase of length τ p symbols. 6 The IRS applies the reflect beamforming vector v tr s = [v s,1 , . . . , v s,N ] T ∈ C N×1 in sub-phase s, where v s,n = exp(jθ s,n ) and θ s,n is the phase-shift applied by IRS element n in training sub-phase s. The received training signal at AP m in sub-phase s is given as where where . For a given design of V tr and provided S ≥ N + 1, the received vector in (55) can be processed at AP m as It was proved in [24], [25] that the optimal design for V tr that minimizes the noise variance and guarantees uncorrelated noise across the sub-phases is the N + 1 leading columns of a S × S DFT matrix given as as the estimation error variance. The received signal at AP m from all users is given by for m = 1, . . . , M, whereḡ mkn is the n th element ofḡ mk . In the representation in (60), θ 1 = 0 since it is multiplied with mk , while θ 2 , . . . , θ N+1 are the phase shifts applied by the N IRS elements.
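The DFT-based training design can be sketched numerically. The construction below takes the $N+1$ leading columns of an $S \times S$ DFT matrix, as stated above. The sign convention of the exponent and the layout (sub-phases along rows, the direct path plus $N$ IRS elements along columns) are assumptions, and the function name is hypothetical.

```python
import numpy as np

def dft_training_matrix(N, S):
    """N + 1 leading columns of the S x S DFT matrix (requires S >= N + 1).

    Row s holds the reflect beamforming entries used in sub-phase s; the
    first column is all ones (ASSUMED to multiply the direct channel).
    """
    if S < N + 1:
        raise ValueError("need at least N + 1 sub-phases")
    s = np.arange(S)[:, None]
    n = np.arange(N + 1)[None, :]
    return np.exp(-2j * np.pi * s * n / S)

N_irs, S_sub = 6, 8                      # hypothetical sizes
V_tr = dft_training_matrix(N_irs, S_sub)
gram = V_tr.conj().T @ V_tr              # equals S * I for DFT columns
```

The Gram matrix is $S\,\mathbf{I}_{N+1}$, so inverting the training design does not color the noise across sub-phases, and every entry has unit modulus as required of a passive IRS.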
The aggregated received signal from all APs at the CPU after matched filtering and applying the receiver filter coefficients is given by for k = 1, . . . , K, whereĝ * mkn is the n th element ofĝ mk . Given the CPU only knows the channel estimates and not the true channels, we can write (61) as whereē mk n is the n th element ofē mk . Treating the last two terms as uncorrelated Gaussian noise, the achievable uplink net rate of user k is stated in Theorem 2. Theorem 2 The achievable net rate of user k in an IRSassisted cell-free MIMO system with full instantaneous CSI at the CPU is given by where S is the number of sub-phases in the MMSE-DFT protocol. The corresponding uplink SINR is given as where inst andγ mk n is the n th element ofγ mk .

C. OPTIMIZATION
The max-min SINR optimization problem under full instantaneous CSI is formulated as Problem (P8). Since the problem is jointly non-convex in the optimization variables, we obtain a solution for (P8) by iteratively solving the three sub-problems using an alternating optimization algorithm, as done in Section IV. The CPU receiver filter design sub-problem for a given set of transmit power allocations at the users and phase shifts at the IRS is a generalized eigenvalue problem similar to (P3), for which the optimal solution for each $\mathbf{u}_k$ is given by (71). Next, the power allocation problem can be formulated using steps similar to those in Section IV-B, resulting in Problem (P9). The transformed problem in (P9) has an objective which is a monomial function in $t$ and constraints which are posynomial functions in $[t, q_1, \dots, q_K]$. Therefore Problem (P9) is a standard geometric program, which can be optimally solved using CVX.
To optimize the IRS phase shifts for a given set of receiver filter coefficients and power allocations, we will utilize two methods. First is the alternating maximization approach proposed in Algorithm 1, where F(θ ) in this case is the objective function of (P8).
As opposed to (50), the SINR expression in (64) yields a convenient form in terms of $\bar{\mathbf{v}} = [1, \mathbf{v}^T] \in \mathbb{C}^{1 \times (N+1)}$, where $\mathbf{v} = [\exp(j\theta_1), \dots, \exp(j\theta_N)]^T$ represents the phase shifts applied by the IRS elements. Therefore, we also outline an algorithm based on semi-definite relaxation (SDR) and Dinkelbach's algorithm to optimize the IRS phase shifts. We will later see in the simulations that the alternating maximization approach yields a performance very close to that of Dinkelbach's algorithm. To this end, $\gamma^{\mathrm{inst}}_k$ in (64) is written in terms of $\bar{\mathbf{v}}$ as in (73). The IRS phase shifts design problem using the expression in (73) is first formulated for the case where the IRS phase shifts can take any continuous value in the interval $[0, 2\pi)$. The obtained continuous phase shifts are later rounded to the closest points in the discrete set $\mathcal{Q}$ using the closest point projection method [57], [58]. The IRS phase shifts design sub-problem is therefore formulated as Problem (P10), where $\bar{v}_n$ is the $n$-th element of $\bar{\mathbf{v}}$. Note that $\bar{\mathbf{v}} \mathbf{I}_{kk'} \bar{\mathbf{v}}^H = \mathrm{tr}(\mathbf{I}_{kk'} \bar{\mathbf{v}}^H \bar{\mathbf{v}})$. We can reformulate (P10) by defining $\mathbf{V} = \bar{\mathbf{v}}^H \bar{\mathbf{v}}$, which needs to satisfy $\mathbf{V} \succeq 0$ and $\mathrm{rank}(\mathbf{V}) = 1$. Since the rank-one constraint is non-convex, we apply semi-definite relaxation to relax this constraint by letting $\mathbf{V}$ be a positive semi-definite matrix of arbitrary rank. The semi-definite relaxed problem (P11) maximizes the objective in (76a) subject to $\mathbf{V} \succeq 0$, $V_{n,n} = 1$, $n = 1, \dots, N + 1$ (76b). Problem (P11) is a max-min fractional problem which can be optimally solved with limited complexity using the generalized Dinkelbach's algorithm, which provides an efficient way to maximize the minimum of a ratio in which the numerator is a concave function, the denominator is a convex function, and the constraint set is convex [55]. The objective function in (76a) considers a set of ratios of two functions, where we denote the numerator by $n_k(\mathbf{V})$ and the denominator by $d_k(\mathbf{V})$, $k = 1, \dots, K$. By exploiting the fact that $\mathrm{tr}(\mathbf{A}\mathbf{B}) = \mathrm{vec}(\mathbf{A}^T)^T \mathrm{vec}(\mathbf{B})$, we write $n_k(\mathbf{V})$ and $d_k(\mathbf{V})$ as in (77) and (78). It can be seen from (77) and (78) that $n_k(\mathbf{V})$ and $d_k(\mathbf{V})$ are linear functions of $\mathbf{V}$. Problem (P11) therefore considers a set of ratios where each ratio has an affine numerator $n_k(\mathbf{V})$, an affine denominator $d_k(\mathbf{V})$ and convex constraints, and can therefore be solved optimally using the generalized Dinkelbach's algorithm [55].

Algorithm 3 Dinkelbach's Algorithm for IRS Phase Shifts Design
... where $n_k(\mathbf{V})$ and $d_k(\mathbf{V})$ are given by (77) and (78) respectively, subject to $\mathbf{V} \succeq 0$ and $V_{n,n} = 1$, $\forall n$; ...

Once the optimal $\mathbf{V}^*$ is obtained, the corresponding vector $\bar{\mathbf{v}}$ needs to be extracted. If the resulting matrix $\mathbf{V}^*$ turns out to have rank one, the optimal solution $\bar{\mathbf{v}}^*$ can be obtained from the principal eigenpair of $\mathbf{V}^*$, where $\mathbf{u}_{\max}(\mathbf{A})$ denotes the eigenvector corresponding to the maximum eigenvalue of $\mathbf{A}$. If the rank turns out to be greater than one, then Gaussian randomization can be applied to find $\bar{\mathbf{v}}^*$ by using the eigenvalue decomposition $\mathbf{V}^* = \mathbf{U} \boldsymbol{\Sigma} \mathbf{U}^H$, where $\boldsymbol{\Sigma}$ is the diagonal matrix of eigenvalues, and computing $\bar{\mathbf{v}}_l = \mathbf{U} \boldsymbol{\Sigma}^{1/2} \mathbf{r}_l$, where $\mathbf{r}_l \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N+1})$ for $l = 1, \dots, L$. The solution $\bar{\mathbf{v}}^*$ is then chosen as the candidate that achieves the largest objective value. With a sufficiently large number of randomizations $L$, Gaussian randomization has been shown to guarantee a tight approximation of the optimal objective value [13].
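The recovery steps that follow the SDR solution (Gaussian randomization, enforcing a unit first entry and unit-modulus phases, then projecting onto the discrete phase set) can be sketched as below. The SDP itself is not solved here, so a matrix `V` is assumed to be given. The objective passed in is a placeholder, the vectors are handled as 1-D arrays rather than row vectors, and the uniform $b$-bit phase grid in `quantize_phases` is an assumption about the set $\mathcal{Q}$.

```python
import numpy as np

def gaussian_randomization(V, objective, num_samples=200, seed=0):
    """Pick the best unit-modulus candidate drawn from CN(0, V)."""
    rng = np.random.default_rng(seed)
    eigvals, U = np.linalg.eigh(V)               # V assumed Hermitian PSD
    root = U * np.sqrt(np.clip(eigvals, 0.0, None))   # U Sigma^{1/2}
    n = V.shape[0]
    best_v, best_val = None, -np.inf
    for _ in range(num_samples):
        r = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        cand = root @ r
        # Rotate so the first entry has zero phase, then keep phases only.
        cand = np.exp(1j * (np.angle(cand) - np.angle(cand[0])))
        val = objective(cand)
        if val > best_val:
            best_v, best_val = cand, val
    return best_v, best_val

def quantize_phases(theta, bits):
    """Closest point on a uniform 2^bits grid in [0, 2*pi) (ASSUMED form of Q)."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    return (np.round(np.mod(theta, 2 * np.pi) / step) % levels) * step

# Rank-one check with a hypothetical target vector (first entry equal to one).
v_true = np.exp(1j * np.array([0.0, 0.7, 2.1, 4.0]))
V_rank1 = np.outer(v_true, v_true.conj())
v_hat, val = gaussian_randomization(V_rank1, lambda v: abs(np.vdot(v_true, v)), num_samples=20)
theta_q = quantize_phases(np.array([0.1, 3.0, 6.2]), bits=1)
```

When `V` is rank one, every draw collapses to the same vector up to a phase, so the procedure returns the generating vector. For higher-rank `V` the quality of the result depends on the number of samples.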

Finally, the solution to (P10) can be recovered by accounting for the constraint that the first element of $\bar{\mathbf{v}}$ should equal one and the next $N$ elements of $\bar{\mathbf{v}}$, which represent $\mathbf{v}$, need to satisfy the constraint $|v_n| = 1$. The resulting solution $\mathbf{v}^*$ is therefore obtained by normalizing $\bar{\mathbf{v}}^*$ by its first element and retaining only the phases of its last $N$ elements.

Algorithm 4 Alternating Optimization Algorithm for Problem (P8)
In iteration $i$:
- Obtain $\mathbf{u}^{(i)}_k$ as in (71) for $k = 1, \dots, K$ while using $\mathbf{q} = \mathbf{q}^{(i-1)}$ and $\boldsymbol{\theta} = \boldsymbol{\theta}^{(i-1)}$;
- Obtain $\mathbf{q}^{(i)}$ by solving Problem (P9) while using $\mathbf{U} = \mathbf{U}^{(i)}$ and $\boldsymbol{\theta} = \boldsymbol{\theta}^{(i-1)}$;
- Obtain $\boldsymbol{\theta}^{c(i)} = \mathrm{phase}(\mathbf{v}^*)$ using Algorithm 3 while using $\mathbf{U} = \mathbf{U}^{(i)}$ and $\mathbf{q} = \mathbf{q}^{(i)}$;

The computational complexity of Algorithm 4 depends on the complexity of solving the generalized eigenvalue problem to optimize the $\mathbf{u}_k$'s, the complexity of solving the geometric program in (P9) to compute the power allocations, and the complexity of Dinkelbach's algorithm to optimize the IRS phase shifts. The complexity of finding the $\mathbf{u}_k$'s and $q_k$'s is $O(KM^3)$ and $O(K^{7/2})$, respectively, as discussed in Section IV-D. It is well known that Dinkelbach's algorithm converges at a super-linear rate in, say, $I_3$ iterations. The sub-problem in each iteration of Dinkelbach's algorithm is convex and can be solved with a computational complexity that is polynomial in the number of variables and constraints. Therefore, the computational complexity of Algorithm 4 follows from combining these three components.

We conclude this section by remarking that the solutions in this section depend on instantaneous CSI. Therefore, Algorithm 4 will be implemented at the CPU to find the optimal values of the design variables for each new channel realization. In contrast, Algorithm 2 will only be implemented at the CPU when the channel statistics change, which can be after several channel realizations.

The correlation matrices at the IRS with respect to all users, i.e., the $\mathbf{R}^{[2]}_k$'s, are simulated using the model in [16, eq. (34)]. The LoS AP-IRS channel vectors $\mathbf{h}^{[1]}_m$ are generated using (6) with $\vartheta_m$ computed using the $(x, y)$ coordinates of each AP and the IRS.
The channel attenuation coefficients are modeled as $\beta = 10^{-C/10} / d^{\alpha}$, where $C$ is the path loss in dB at the reference distance of 1 meter, $d$ is the length of the considered link, and $\alpha$ is the path loss exponent. The parameters are set as $\alpha^{[1]}_{mk} = 3.5$ and $C = 30$ dB [11], [13]. We consider 5 dBi elements at the IRS and APs and a penetration loss of 20 dB for the direct AP-user links. Moreover, we set $B = 20$ MHz, the maximum transmit power at each user for both data and pilot transmission as 5 W, and the noise variance as $\sigma^2 = -174 + 10 \log_{10} B \approx -100$ dBm, which is used to compute $\rho_u$ and $\rho_p$. Each coherence interval comprises $\tau_c = 200$ symbols. The values of $M$, $K$ and $N$ and the length of the training phase $\tau_p$ will be specified under each simulation figure. The performance metric considered in all results is the minimum achievable uplink net rate, i.e., $\min_k R_k$, where $R_k$ is defined in (20) and (63) for the statistical and instantaneous CSI based schemes, respectively.
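The large-scale quantities in this setup can be evaluated with a few lines of code. The sketch below assumes the common distance-based model $\beta = 10^{-C/10}/d^{\alpha}$, which matches the description of $C$, $d$ and $\alpha$ above, and a thermal noise density of $-174$ dBm/Hz. The function names and the 100 m example distance are hypothetical, and antenna gains and penetration loss are left out.

```python
import math


def path_gain(c_db: float, distance_m: float, exponent: float) -> float:
    """Linear large-scale gain 10^(-C/10) / d^alpha (assumed model)."""
    return 10 ** (-c_db / 10) / distance_m ** exponent


def noise_power_dbm(bandwidth_hz: float, density_dbm_per_hz: float = -174.0) -> float:
    """Noise power in dBm over the given bandwidth."""
    return density_dbm_per_hz + 10 * math.log10(bandwidth_hz)


def dbm_to_watt(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30) / 10)


# Example: a link of hypothetical length 100 m with C = 30 dB and alpha = 3.5.
gain_db = 10 * math.log10(path_gain(30.0, 100.0, 3.5))
sigma2_dbm = noise_power_dbm(20e6)
print(f"path gain: {gain_db:.1f} dB, noise power: {sigma2_dbm:.2f} dBm")
```

For $B = 20$ MHz the noise expression evaluates to roughly $-101$ dBm, which is close to the $-100$ dBm figure quoted in the text.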
We organize the simulation results into two subsections. The first subsection studies the performance of the proposed statistical CSI based max-min SINR scheme in Section IV, and the second subsection compares the minimum achievable net rate performance of the proposed scheme to that of the instantaneous CSI based max-min SINR scheme outlined in Section V.

A. PERFORMANCE OF THE PROPOSED STATISTICAL CSI BASED MAX-MIN SINR SCHEME
The simulation results in this section will consider the following system configurations to draw useful comparisons.

1) Optimized IRS-assisted cell-free MIMO system: This is the proposed framework under imperfect CSI of the overall end-to-end channels at the APs and statistical CSI at the CPU. The design variables are optimized using Algorithm 2 with the IRS phase shifts optimized using Algorithm 1. The system is simulated for different values of $b$ (i.e., the number of bits used to represent the discrete phase shift levels) and is denoted by "IRS-b-bit" with the achievable rate given by Theorem 1.

2) IRS-assisted cell-free MIMO system with equal IRS phase shift solution from [7]: In this framework, Algorithm 2 is used to optimize the CPU receiver filter coefficients and the users' power allocations, while the equal phase shifts solution from [7, Corollary 2] is used to set the IRS phase shifts instead of Algorithm 1. The equal phase shifts solution was shown to minimize the total NMSE in the AP-user end-to-end channel estimates [7]. This system is denoted as "IRS-b-bit-Equal" with the achievable rate computed using Theorem 1. Note that [7] neither incorporates the CPU receiver filter coefficients into its model nor optimizes the power allocations. However, to allow a fair comparison with our scheme, we use its equal phase shifts solution with the optimized CPU receiver filter coefficients and power allocations from our work.

3) Unoptimized IRS-assisted cell-free MIMO system: In this framework, the design variables are not optimized. Rather, the users' transmit powers $q_k$ are set as $p^{\max}_k$, the CPU receiver filter vectors are set as $\mathbf{u}_k = \frac{1}{\sqrt{M}} \mathbf{1}_M$, and the IRS phase shifts are randomly chosen from the discrete phase shift set $\mathcal{Q}$. This system is denoted as "IRS-b-bit-Not-Opt" and the achievable rate is computed using Theorem 1.

4) Conventional cell-free MIMO system without IRS: In this framework, we consider the cell-free MIMO system without any IRS. We optimize only the receiver filter coefficients and the power allocations across the users using Algorithm 2. This system is denoted by "No IRS" and the achievable rate is given in Corollary 2. We also consider the conventional cell-free MIMO system with the receiver filter coefficients and power allocations not optimized, and denote it as "No IRS-Not-Opt".
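The discrete phase set behind the "IRS-b-bit" labels and the unoptimized baseline of configuration 3 can be sketched as follows. The code assumes that $\mathcal{Q}$ contains $2^b$ uniformly spaced phases in $[0, 2\pi)$, which is a common convention but is not stated in this section; the function names are hypothetical.

```python
import math
import random


def phase_set(bits: int) -> list:
    """Assumed discrete set Q: 2^b uniformly spaced phases in [0, 2*pi)."""
    levels = 2 ** bits
    return [2 * math.pi * m / levels for m in range(levels)]


def project_to_set(theta: float, bits: int) -> float:
    """Closest point of Q to theta, measured on the circle (wrap-around aware)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    index = round((theta % (2 * math.pi)) / step) % levels
    return index * step


def unoptimized_baseline(M: int, K: int, N: int, bits: int,
                         p_max: float, rng: random.Random):
    """Configuration 3: equal-weight filters, full power, random discrete phases."""
    u = [[1 / math.sqrt(M)] * M for _ in range(K)]  # u_k = (1/sqrt(M)) * 1_M
    q = [p_max] * K                                  # q_k = p_max for every user
    levels = phase_set(bits)
    theta = [rng.choice(levels) for _ in range(N)]   # random draws from Q
    return u, q, theta


u, q, theta = unoptimized_baseline(M=8, K=5, N=128, bits=2, p_max=5.0,
                                   rng=random.Random(0))
```

The same `project_to_set` helper is one way to round continuous phase shifts to $\mathcal{Q}$ after an optimization carried out over $[0, 2\pi)$.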
In Fig. 2, we plot the cumulative distribution function (CDF) of the minimum uplink net rate using the Monte-Carlo simulated SINR and the theoretical SINR for the IRS-assisted cell-free MIMO system and the conventional cell-free MIMO system with no IRS. For the IRS-assisted systems, the CDF curves are plotted considering 2-bit phase shifters at the IRS, under the proposed solution in Algorithm 2 with IRS phase shifts optimized using Algorithm 1, under the equal IRS phase shifts solution from [7] with the other variables optimized using Algorithm 2, and under the unoptimized solution where all parameters take given values (configurations 1, 2 and 3 described above). For the conventional system, the CDF curves are plotted for the case where the CPU receiver filter coefficients and power allocations are optimized using Algorithm 2, as well as for the case where these variables are not optimized (configuration 4 described above). The CDF is computed over 200 realizations of the locations of APs and users. The Monte Carlo results are obtained by simulating the achievable net rate expression in (20) with the SINR computed using (21) under the considered settings. The theoretical results are plotted using (20) with the SINRs given by Theorem 1 (22) for all IRS-assisted scenarios, and Corollary 2 (32) for the scenario without IRS. We observe an excellent overlap between the CDF yielded by Monte-Carlo simulated values and the theoretical expressions, thereby corroborating the accuracy of the analysis.
The CDF curves show a significant improvement in the minimum net rate performance with the introduction of an IRS with only 128 elements. Comparing the IRS-2-bit and No IRS curves, which are plotted under their respective optimized variables, we see that the median value of the minimum net rate increases from 3.68 to 4.31 Mbps and the 90 th percentile increases from 5.52 to 6.71 Mbps. We also see that using Algorithm 1 to optimize the IRS phase shifts instead of using the equal phase shifts solution from [7] brings a 0.6 Mbps increase in the median minimum net rate, and this gap will increase with N as we will see later. Moreover, using the equal phase shifts solution does not yield any noticeable improvement in the minimum net rate compared to the case without IRS. Therefore, optimizing the IRS phase shifts is crucial in observing gains from the deployment of the IRS. Finally, we observe that optimizing the CPU receiver filter coefficients and users' power allocations jointly while the IRS phase shifts are fixed brings about a 2.6 Mbps improvement in the median minimum net rate, and optimizing all three design variables jointly, i.e., CPU receiver filter coefficients, users' power allocations and IRS phase shifts, brings about a 3.3 Mbps improvement in the median minimum net rate, as compared to the case where these three design variables are not optimized. Therefore the joint optimization of all design variables is important to realize the true potential of IRS-assisted cell-free MIMO systems.
In Fig. 3, we plot the CDF of the minimum uplink net rate using the Monte-Carlo simulated SINR and the theoretical SINR for the IRS-assisted cell-free MIMO system and the conventional cell-free MIMO system under pilot contamination (i.e., when $\tau_p < K$) and without pilot contamination (i.e., when $\tau_p \geq K$). When $\tau_p < K$, each user is randomly assigned a pilot sequence from the set of $\tau_p$ orthogonal sequences. The performance of both systems improves with the length of the pilot sequences, since fewer pilot sequences are reused across the users, which reduces the interference due to pilot contamination. When all the users are assigned orthogonal pilot sequences (i.e., when $\tau_p = K = 5$), the interference due to pilot contamination is zero, resulting in a much better minimum uplink net rate performance as compared to the case where users are randomly assigned pilots from $\tau_p < K$ available orthogonal pilot sequences. Analytically, we can see from (22) that when orthogonal pilot sequences are assigned to all users, we have $\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'} = 0$ for $k' \neq k$, and therefore the pilot contamination terms drop out of the denominator of the SINR expression.

Since the accuracy of the closed-form expressions compared with the Monte Carlo simulations is demonstrated above, the rest of the figures are obtained by using the closed-form expressions of the SINR to compute the minimum uplink net rate.

Fig. 4 plots the minimum uplink net rate against $N$ for the proposed and the benchmark schemes described earlier. We see a significant gain in the performance of the IRS-assisted cell-free MIMO system with an increase in the number of reflecting elements. Moreover, the performance is significantly better than that of the conventional cell-free MIMO system without IRS. The minimum net rate increases from 5.3 to 7.7 Mbps as $N$ increases from 16 to 512, with the performance gap between the IRS-assisted system and the conventional system (both having 8 APs) increasing to 2.5 Mbps.
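The pilot assignment rule used for Fig. 3 can be made concrete with a short sketch. It assumes unit-norm, mutually orthogonal pilot sequences, so that $|\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'}|$ is 1 when two users share a pilot and 0 otherwise; the paper's normalization may differ, and the function names are hypothetical.

```python
import random


def assign_pilots(num_users: int, num_pilots: int, rng: random.Random) -> list:
    """Return the pilot index used by each user.

    With at least as many pilots as users, every user gets its own pilot.
    Otherwise each user draws one of the available pilots at random.
    """
    if num_pilots >= num_users:
        return list(range(num_users))
    return [rng.randrange(num_pilots) for _ in range(num_users)]


def pilot_inner_products(assignment: list) -> list:
    """Matrix of |phi_k^H phi_k'| under the unit-norm orthogonal-pilot assumption."""
    return [[1 if a == b else 0 for b in assignment] for a in assignment]


def contaminated_pairs(assignment: list) -> list:
    """User pairs (k, k') with k < k' that share a pilot sequence."""
    users = range(len(assignment))
    return [(k, j) for k in users for j in users
            if k < j and assignment[k] == assignment[j]]


rng = random.Random(1)
orthogonal = assign_pilots(5, 5, rng)  # tau_p = K = 5: no pilot reuse
reused = assign_pilots(5, 3, rng)      # tau_p = 3 < K: some users must share
print(contaminated_pairs(orthogonal), contaminated_pairs(reused))
```

With $\tau_p = K$ the list of contaminated pairs is empty, which is the case in which the cross terms vanish; with $\tau_p < K$ at least one pair is unavoidable.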
We also observe that the performance jump when we switch from 1-bit to 2-bit IRS phase shifters is notable due to the promising gains brought about by carefully selecting the IRS phase shifts. However the performance gain decreases when switching to 3-bit IRS phase shifters. Therefore, we do not need costly high resolution phase shifters to implement IRSs to yield significant performance improvement over conventional cell-free MIMO systems. We also plot on this figure the minimum uplink net rate under the benchmark scheme 2, in which instead of using Algorithm 1 to optimize the IRS phase shifts, the IRS phase shifts are set as θ n = exp(jπ/4) ∀n according to the equal phase shift solution proposed in [7]. The CPU receiver filter coefficients and power allocations are still optimized using our proposed solutions. The performance improvement with optimizing the IRS phase shifts using Algorithm 1 is quite significant, especially for large N.
Next we study the benefit of integrating an IRS in a cell-free MIMO system. Since it is difficult to achieve uniform coverage in the deployment of APs in urban areas, there are often users in shadow areas that have weak links with the existing APs. Our goal in this work has been to improve the quality of service to these weak users, since our performance metric is the minimum achievable rate. A common solution to improve the minimum achievable rate is to deploy more APs in the cell-free MIMO system. However, a large-scale deployment of APs can result in large deployment costs and power consumption. An IRS can yield some portion of the rate gains that additional APs are expected to yield. This can help avoid the need for the deployment of additional APs especially in shadow areas, and therefore cut the cost and power consumption of the cell-free MIMO system. To highlight this, we plot in dashed lines on Fig. 4, the performance yielded by the conventional cell-free MIMO system with M = 12 and M = 16 APs and no IRS. Interestingly an IRS-assisted cell-free MIMO system with 8 APs and an IRS with 310 2-bit reflecting elements can achieve the same performance as a conventional cell-free MIMO system with 12 APs. Thus we can cut the cost of deploying 4 extra APs in the system by using a passive IRS. Furthermore, an IRS-assisted cell-free MIMO system with 8 antennas and a little more than 512 reflecting elements can approach the performance of a conventional cell-free MIMO system with 16 antennas. Therefore, for the considered setting, we can reduce the number of APs by one-half and get the same minimum net rate by using an IRS with a large number of passive reflecting elements. Fig. 5 plots the minimum uplink net rate against M for the proposed scheme and the benchmark schemes. 
We observe that the minimum net rate increases with the number of APs, and the performance gap between the IRS-assisted cell-free MIMO system and the conventional cell-free MIMO system is around 1 Mbps for $M = 8$ and around 1.3 Mbps for $M = 64$ with only 128 elements at the IRS. In Fig. 6, we plot against the number of APs $M$ the ratio of the minimum achievable net rate of the IRS-assisted cell-free MIMO system with 3-bit phase shifters to the minimum achievable net rate of the conventional cell-free MIMO system. We see that the gain of the IRS-assisted system over the conventional system decreases with $M$. The gain is around 2.14× for $M = 8$ and it decreases to 1.35× for $M = 64$. This is because as the number of APs increases, there is a higher probability of an AP (or multiple APs) being closer to each user in the system, which results in stronger direct AP-user links. As a result, the gain from deploying an IRS becomes less significant, since most of the performance gain comes from using a higher number of active APs in the system instead of using a passive IRS. However, as discussed earlier, a large-scale deployment of APs results in high deployment cost and power consumption, which can make the conventional cell-free MIMO system less efficient than an IRS-assisted cell-free MIMO system that can achieve a similar performance with a reduced number of APs.

Fig. 7 investigates the convergence of the proposed algorithms for the IRS-assisted cell-free MIMO system with $M = 8$, $K = 5$ and $N = 128$ 2-bit reflecting elements. The objective value, i.e., the minimum uplink net rate, is plotted against the number of iterations for a single realization.

The path loss model for IRS-assisted links, given as $\beta^{[1]}_m \times \beta^{[2]}_k$, introduces a double path loss effect since two fixed losses of $C$ dB are incurred, and $d^{[1]}_m$ and $d^{[2]}_k$ appear within multiplicative factors.
As a result, the IRS-assisted link is often weaker than the direct AP-user link for small $N$, and the gain from deploying an IRS is not significant unless $N$ is large or the direct AP-user links suffer from high penetration losses. The latter is likely to be true in mmWave and sub-mmWave communication systems, which are known to suffer from high penetration losses resulting in signal blockages [14], [16]. Since the position at which the IRS is deployed is under the control of the network operator, it is expected that the IRS-assisted links will avoid incurring these penetration losses and will provide alternate communication paths when the direct AP-user links are weak or blocked. In Fig. 8, we highlight the gain of the IRS-assisted cell-free MIMO system over the conventional cell-free MIMO system for different values of penetration loss in the AP-user links. On the left y-axis, we plot the minimum net rate of an IRS-assisted cell-free MIMO system and a conventional cell-free MIMO system, and on the right y-axis we plot the ratio of the minimum net rates of these two systems, against the penetration loss in the direct links. We see that when there is negligible penetration loss in the direct links, the IRS-assisted links do not provide any additional performance gain due to the high path loss (note that the ratio of the minimum rates of the two systems is 1). However, for penetration losses of 20 dB (considered in this work) and 25 dB, the IRS-assisted system yields a 1.5× and 3.5× performance gain, respectively. The minimum net rate gain increases to around 16× as the penetration loss in the direct links increases to 30 dB, highlighting the potential of IRSs to effectively overcome the unreliability of the direct links. Therefore, under the double path loss model, the IRS shows more notable gains when the AP-user links are weak or when $N$ is large.
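The double path loss effect can be checked numerically in the dB domain. The sketch below assumes the per-hop loss $C + 10\alpha \log_{10} d$ implied by the description of the path loss model; the distances and the IRS-link exponents in the example are hypothetical, and the 5 dBi element gains and the beamforming gain over $N$ elements are deliberately left out.

```python
import math


def hop_loss_db(c_db: float, distance_m: float, exponent: float) -> float:
    """Loss of a single hop in dB: C + 10 * alpha * log10(d)."""
    return c_db + 10 * exponent * math.log10(distance_m)


def direct_loss_db(c_db: float, distance_m: float, exponent: float,
                   penetration_db: float) -> float:
    """Direct AP-user link: one hop plus the penetration loss."""
    return hop_loss_db(c_db, distance_m, exponent) + penetration_db


def cascaded_loss_db(c_db: float, d1_m: float, alpha1: float,
                     d2_m: float, alpha2: float) -> float:
    """AP-IRS-user link through one element: the fixed loss C is paid twice."""
    return hop_loss_db(c_db, d1_m, alpha1) + hop_loss_db(c_db, d2_m, alpha2)


# Hypothetical geometry: 100 m direct link, 50 m AP-IRS hop, 20 m IRS-user hop.
direct = direct_loss_db(30.0, 100.0, 3.5, penetration_db=20.0)
cascaded = cascaded_loss_db(30.0, 50.0, 2.2, 20.0, 2.2)
print(f"direct: {direct:.1f} dB, cascaded (single element): {cascaded:.1f} dB")
```

Raising `penetration_db` increases only the direct-link loss, which is the mechanism behind the growing IRS gain reported for Fig. 8.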

B. PERFORMANCE COMPARISON WITH THE INSTANTANEOUS CSI BASED MAX-MIN SINR SCHEME
In this subsection, we compare the performance of the statistical CSI based max-min SINR scheme in Section IV with that of the instantaneous CSI based max-min SINR scheme in Section V. In the former, the APs only need the estimates of the overall end-to-end channels $g_{mk}$ to implement matched filtering, while the CPU uses statistical CSI to design the receiver filter coefficients, the users' power allocations and the IRS phase shifts. In the latter scheme, the APs estimate all individual channel coefficients, i.e., the $h^{[d]}_{mk}$'s and $h^{[I]}_{mkn}$'s, $n = 1, \dots, N$, and share this full CSI with the CPU, which uses it to optimize the receiver filter coefficients, the power allocations and the IRS phase shifts.
The instantaneous CSI based scheme optimizes all parameters for every channel realization using full knowledge of all channel coefficients, resulting in a high minimum instantaneous SINR, whereas in the statistical CSI based scheme the parameters are re-optimized only when the large-scale channel statistics change, which may be after several coherence periods. Therefore, while the parameters are designed to achieve favorable channel statistics and maximize the minimum ergodic SINR, they will not be optimal for every channel realization and will not maximize the minimum instantaneous SINR. However, the instantaneous CSI based schemes incur large training overheads associated with estimating all individual IRS-assisted channels, with most channel estimation protocols [19], [24], [25], [30], [31] requiring at least $K + KN$ symbols in the channel estimation phase. This compromises the achievable net rate as compared to the statistical CSI based scheme, which requires only at least $K$ symbols to estimate the required channels. Moreover, the instantaneous CSI based schemes require the parameters to be optimized with each channel realization, which increases the system complexity.
In light of this discussion, we will compare the minimum net rate performance of the instantaneous CSI and statistical CSI based max-min SINR schemes. The following configurations will be studied.
1) IRS-assisted cell-free MIMO system with statistical CSI: This is the proposed framework under imperfect CSI of the overall end-to-end channels at the APs and statistical CSI at the CPU. The design variables are optimized using Algorithm 2 with the IRS phase shifts optimized using Algorithm 1.

The next result plots the minimum SINR together with the minimum net rate, which is given by (63) and shown on the right y-axis, for the IRS-assisted cell-free MIMO system in which the CPU has instantaneous CSI. The results are plotted against the number of symbols used for channel estimation, which represents the training overhead. The channel estimates are obtained using the MMSE-DFT channel estimation protocol [24], [25], which incurs a training overhead of $S\tau_p = (N + 1)\tau_p$ symbols, since the required number of estimation sub-phases is $S = N + 1$ and $\tau_p \geq K$. In this figure, the training overhead is determined by the number of reflecting elements $N$, which is varied from $N = 4$ to $N = 64$, while $\tau_p$ is fixed. Note that the training overhead is captured by the training loss factor $(1 - S\tau_p/\tau_c)$ in the achievable net rate expression in (63). The minimum SINR and the minimum net rate are plotted under Algorithm 4 for the case where the IRS phase shifts are optimized using Dinkelbach's algorithm outlined in Algorithm 3 (configuration 3 described above), and also for the case where they are optimized using the alternating maximization algorithm in Algorithm 1 (configuration 4 described above). We see that the performance under both algorithms for IRS optimization is quite close. The minimum SINR $\gamma^{\text{inst}}_k$ grows with $N$ since we can realize larger reflect beamforming gains through the optimization of more IRS phase shifts. On the other hand, the minimum net rate decreases with $N$. This is because the increase in training loss (i.e., the decrease in the factor $1 - S\tau_p/\tau_c$) dominates the increase in the minimum SINR $\gamma^{\text{inst}}_k$, causing the minimum net rate to decrease as $N$ increases. Therefore, there is a tradeoff between the channel estimation overhead and the SINR gain, which is reflected in the net rate.
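The tradeoff between training overhead and SINR gain can be reproduced with a small calculation. The sketch below evaluates the pre-log factor $1 - S\tau_p/\tau_c$ for the instantaneous CSI based scheme ($S = N + 1$) and for the statistical CSI based scheme (a single sub-phase, $S = 1$). The value $\tau_p = 3$, the clamping at zero and the function names are assumptions made for illustration, and the rate is expressed in bit/s/Hz, so it would still need to be scaled by the bandwidth.

```python
import math


def training_loss_factor(num_subphases: int, tau_p: int, tau_c: int) -> float:
    """Pre-log factor 1 - S * tau_p / tau_c, clamped at zero (assumption)."""
    return max(0.0, 1.0 - num_subphases * tau_p / tau_c)


def net_rate(sinr: float, num_subphases: int, tau_p: int, tau_c: int) -> float:
    """Training loss factor times log2(1 + SINR), in bit/s/Hz."""
    return training_loss_factor(num_subphases, tau_p, tau_c) * math.log2(1 + sinr)


def max_irs_elements(tau_p: int, tau_c: int) -> int:
    """Largest N for which (N + 1) * tau_p < tau_c, i.e. some symbols remain for data."""
    return (tau_c - 1) // tau_p - 1


tau_p, tau_c = 3, 200  # tau_p is a hypothetical value; tau_c follows the setup
for n_elements in (4, 16, 64):
    inst = training_loss_factor(n_elements + 1, tau_p, tau_c)  # S = N + 1
    stat = training_loss_factor(1, tau_p, tau_c)               # S = 1
    print(n_elements, round(inst, 3), round(stat, 3))
```

The factor of the statistical CSI based scheme does not depend on $N$, whereas the factor of the instantaneous CSI based scheme shrinks linearly in $N$ and reaches zero once $(N + 1)\tau_p$ equals $\tau_c$.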
In Fig. 10, we plot the average minimum achievable net rate $\mathbb{E}[\min_k R^{\text{inst}}_k]$ of the IRS-assisted cell-free MIMO system and the conventional cell-free MIMO system under instantaneous CSI at the CPU, where $R^{\text{inst}}_k$ is given in Theorem 2. The minimum achievable net rate of the statistical CSI based scheme given in Theorem 1 is also plotted under Algorithm 2 and shows a similar trend as the previous figures, with the minimum net rate increasing from 11.9 to 12.9 Mbps as $N$ increases from 4 to 64. The minimum net rate of the instantaneous CSI based scheme decreases with $N$ since the increase in the training loss dominates the increase in the minimum SINR $\gamma^{\text{inst}}_k$, causing the minimum net rate in (63) to decrease. We see that for $N > 16$ the statistical CSI based max-min SINR scheme actually performs better than the full instantaneous CSI based max-min SINR scheme, since the training overhead of the statistical CSI based scheme stays constant at $\tau_p$ symbols as $N$ increases, which is much less than the training overhead of the instantaneous CSI based scheme. Another observation is that the instantaneous CSI based scheme achieves a much higher minimum net rate for very small values of $N$. This is because for such small values of $N$ the training loss is not dominant and the scheme benefits from higher minimum instantaneous SINR values, as well as from better channel estimation quality, since all the individual channels are estimated with an optimal IRS phase shifts solution based on the DFT matrix during a longer training phase. On the other hand, in the statistical CSI based scheme we only estimate the overall channel in a single sub-phase using an IRS phase shifts solution that does not minimize the estimation error.
We also observe that the conventional cell-free MIMO system designed using instantaneous CSI performs better than the system designed using statistical CSI, because the instantaneous CSI based scheme yields better minimum SINR values while the training overhead is $\tau_p$ symbols for both systems. We further note that the IRS-assisted cell-free MIMO system that relies on statistical CSI at the CPU needs 32 reflecting elements to outperform the conventional cell-free MIMO system that relies on instantaneous CSI. Therefore, an IRS-assisted cell-free MIMO system with statistical CSI at the CPU has the potential to outperform both IRS-assisted and conventional cell-free MIMO systems that have instantaneous CSI at the CPU.
For a longer coherence interval of $\tau_c = 500$ symbols, for which the results are plotted in Fig. 11, the instantaneous CSI based scheme yields a better performance than the statistical CSI based scheme for a larger range of $N$. The minimum net rate of the instantaneous CSI based scheme increases until $N = 16$ since the increase in the minimum SINR $\gamma^{\text{inst}}_k$ dominates the increase in the training loss (i.e., the decrease in the factor $1 - S\tau_p/\tau_c$). Beyond that point, the minimum net rate starts to decrease because the increase in the training loss dominates the increase in the SINR. For $N > 26$, we see that the statistical CSI based scheme outperforms the instantaneous CSI based scheme. Therefore, an important insight we draw here is that the performance gain of the instantaneous CSI based scheme is more noticeable in low mobility scenarios, which have longer coherence intervals over which the channels stay constant. However, in high mobility scenarios, i.e., for small $\tau_c$, the minimum net rate of the instantaneous CSI based scheme quickly deteriorates and it becomes infeasible to acquire the instantaneous CSI of all IRS-assisted links for large $N$.
We conclude by summarizing our findings from these results. First, the instantaneous CSI based scheme for IRS-assisted systems yields a better minimum SINR, since the parameters are optimized based on the actual channel realizations and not just the channel statistics. However, unless we rely on quicker estimation schemes$^{7}$, the minimum net rate will quickly deteriorate for large values of $N$ due to the large training overhead. Second, even if we do rely on estimation schemes that require smaller training times, optimizing the IRS phase shifts at the CPU and relaying this information to the IRS controller, which then reconfigures the IRS at the pace of the fast fading channel, significantly increases the system complexity. On the other hand, the statistical CSI based scheme can yield performance close to, and even better than, that of the instantaneous CSI based scheme for a moderate to large number of IRS elements by saving the training overhead associated with estimating all links. Moreover, the design variables only need to be optimized once for several coherence intervals, which makes the design simple.

$^{7}$ Some examples of works that develop low overhead channel estimation protocols for IRS-assisted systems were given in Section V-A, but most of them either make assumptions on channel conditions or compromise on IRS beamforming gains.

VII. CONCLUSION
In this paper, we considered the uplink of an IRS-assisted cell-free MIMO system in which the CPU only has statistical CSI to detect symbols and design the receiver filter coefficients, users' power allocations and IRS phase shifts. The APs only estimate the end-to-end channel with respect to each user to implement matched filtering on the received signal. This way, we avoid increasing the traffic on fronthaul links due to the exchange of CSI between the APs and CPU, and we avoid the large training overhead associated with the estimation of individual IRS-assisted links. For this transmission scheme, we derived a closed-form expression for the achievable uplink net rate of each user at the CPU that only depends on the channel statistics and the design variables. Using this expression, we formulated the max-min SINR problem to design the optimal receiver filter coefficients at the CPU, the power allocations at the users and the phase shifts at the IRS, subject to per user power constraints as well as IRS phase resolution constraints. The resulting non-convex problem was solved using an alternating optimization algorithm. In each iteration, the optimal receiver filter coefficients are expressed in closed form, the optimal power allocations at the users are obtained by solving a geometric program, and the IRS phase shifts are designed using an alternating maximization algorithm. For performance comparison, we also devised a max-min SINR scheme in which the receiver filter coefficients, the power allocations and the IRS phase shifts are designed based on full instantaneous CSI of all direct and IRS-assisted links at the CPU. Numerical results showed that the max-min SINR scheme designed using statistical CSI can outperform the scheme designed using instantaneous CSI for a moderate to large number of IRS elements, while for a smaller number of IRS elements the scheme based on instantaneous CSI yields a better performance.
For future work, it would be interesting to extend the achievable rate analysis to other channel models, such as Rician fading for all links and correlated Rayleigh fading for the AP-IRS links. The analysis can also be extended to multi-antenna systems, where each AP is equipped with more than one antenna, and to multi-IRS assisted cell-free MIMO systems, where several distributed IRSs assist the communication between the APs and users. Another important research direction is to compare the performance of both statistical CSI and instantaneous CSI based schemes with that achieved by a deep learning framework that adapts the design variables based on the learnt channels with minimal beam training overhead.

APPENDIX
We prove Theorem 1 in this Appendix. Before we present the proof, we present a useful lemma that characterizes the statistics of the IRS-assisted channels.
Lemma 2: The overall IRS-assisted channel in (5) satisfies the following expressions.