Intelligent Reflecting Surface Assisted Multi-User MISO Communication: Channel Estimation and Beamforming Design

The concept of reconfiguring wireless propagation environments using intelligent reflecting surfaces (IRSs) has recently emerged, where an IRS comprises a large number of passive reflecting elements that can smartly reflect the impinging electromagnetic waves for performance enhancement. Previous works have shown promising gains assuming the availability of perfect channel state information (CSI) at the base station (BS) and the IRS, which is impractical due to the passive nature of the reflecting elements. This paper makes one of the preliminary contributions of studying an IRS-assisted multi-user multiple-input single-output (MISO) communication system under imperfect CSI. Different from the few recent works that develop least-squares (LS) estimates of the IRS-assisted channel vectors, we exploit the prior knowledge of the large-scale fading statistics at the BS to derive the Bayesian minimum mean squared error (MMSE) channel estimates under a protocol in which the IRS applies a set of optimal phase-shift vectors over multiple channel estimation sub-phases. The resulting mean squared error (MSE) is both analytically and numerically shown to be lower than that achieved by the LS estimates. Joint designs for the precoding and power allocation at the BS and reflect beamforming at the IRS are proposed to maximize the minimum user signal-to-interference-plus-noise ratio (SINR) subject to a transmit power constraint. Performance evaluation results illustrate the efficiency of the proposed system and study its susceptibility to channel estimation errors.


I. INTRODUCTION
Massive multiple-input multiple-output (MIMO) communication, millimeter wave (mmWave) communication, and network densification are some of the main technological advancements that are leading the emergence of Fifth Generation (5G) mobile communication systems. However, these technologies face two main practical limitations. First, they consume a lot of power, which is a critical issue for practical implementation, and second, they struggle to provide the users with uninterrupted connectivity and a guaranteed quality of service (QoS) in harsh propagation environments, due to the lack of control over the wireless propagation channel. For example, the network's total energy consumption scales linearly as more base stations (BSs) are added to densify the network, while each active antenna element in a massive MIMO array is connected to a radio frequency (RF) chain comprising several active components, rendering the total cost and energy consumption very high. Moreover, massive MIMO performance is known to suffer when the propagation environment exhibits poor scattering conditions [1], whereas communication at mmWave frequencies suffers from high path and penetration losses. These two limitations have resulted in the need for green and sustainable future cellular networks, where the network operator has some control over the propagation environment.
Q.-U.-A. Nadeem, H. Alwazani, and A. Chaaban are with the School of Engineering, The University of British Columbia, Kelowna, Canada (e-mail: {qurrat.nadeem, hibat97, anas.chaaban}@ubc.ca). A. Kammoun and M.-S. Alouini are with the Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia 23955-6900 (e-mail: {abla.kammoun, slim.alouini}@kaust.edu.sa). M. Debbah is jointly with Université Paris-Saclay, CNRS, CentraleSupélec, 91190 Gif-sur-Yvette, France and The Huawei Mathematical and Algorithmic Sciences Lab, 92100 Boulogne-Billancourt, France (e-mail: merouane.debbah@centralesupelec.fr).
An emerging concept that addresses this need is that of a smart radio environment, where the wireless propagation environment is turned into an intelligent reconfigurable space that plays an active role in transferring radio signals from the transmitter to the receiver [2]-[5]. This concept is enabled by the use of intelligent reflecting surfaces (IRSs) in the environment, which shape the impinging electromagnetic (EM) waves in desired ways in a passive manner, without generating new radio signals and thereby without incurring any additional power consumption. Several current research activities focus on developing different converging solutions to implement these IRSs, including fabricating new meta-surfaces and reflect arrays, making them reconfigurable, implementing testbeds and generating experimental results [2], [6]-[12].
Very recently, works approaching this subject from the wireless communication design and analysis perspective have appeared, which view the IRS as a planar array of a large number of passive reflecting elements, each of which can independently induce a phase shift onto the incident EM waves and reflect them passively. By adaptively and intelligently adjusting the phase shifts of all the IRS elements, referred to as passive beamforming or reflect beamforming [13], [14], desired communication objectives can be realized. In the last year, several joint designs for the precoding at the BS and the phase shifts matrix at the IRS have been proposed to achieve different communication goals, for example: maximize the system's energy efficiency subject to the individual signal-to-interference-plus-noise ratio (SINR) constraints at the users in [15], maximize the minimum user rate subject to a transmit power constraint in the asymptotic regime in [16], minimize the transmit power at the BS subject to users' individual SINR constraints in [14], and maximize the sum-rate subject to a transmit power constraint in [17], [18]. Moreover, the use of an IRS to maximize the minimum secrecy rate for physical layer security has been studied in [19], and its use to assist simultaneous wireless information and power transfer has been studied in [20]. IRSs have also found applications in wide-band orthogonal frequency division multiplexing (OFDM) systems in [21] and non-orthogonal multiple-access systems in [22].
A vast majority of the existing works assume the availability of perfect channel state information (CSI) to design the precoding vectors at the BS and the phase shifts matrix at the IRS. However, this assumption is highly unlikely to hold in practice for an IRS-assisted system. This is because, as opposed to conventional multi-antenna and relay-assisted communication systems, where channels can be estimated by actively sending, receiving and processing pilot symbols, the IRS has no radio resources of its own to send and receive pilot symbols and no signal processing capability to estimate the channels. Therefore, it is critical to re-evaluate the promising gains shown by IRS-assisted communication systems under an imperfect CSI model.
Recently, [23] and [24] have proposed channel estimation protocols for an IRS-assisted single-user MISO system based on the least squares (LS) estimation criterion, where the former paper estimates the IRS-assisted channels one-by-one by keeping one IRS element active and the other elements off in each sub-phase of the channel estimation period, while the latter improves this protocol by keeping all the IRS elements active and reflecting throughout the channel estimation phase, under an optimal solution for the IRS phase shifts matrix. The method in [23], [24] is extended in [25], [26], where the authors derive LS estimates for a single-user system assuming that the surface can be divided into multiple sub-surfaces of adjacent strongly correlated reflecting elements that apply the same reflection coefficient. The work is also extended in [27], which aims to reduce the channel training time by developing a three-stage channel estimation protocol that exploits the strong correlation in the IRS-assisted channels due to the common BS-to-IRS channel. However, the protocol assumes an ideal environment where there is no received noise at the BS in the channel estimation phase, which does not hold in any practical setting. Channel estimation using compressive sensing and deep learning techniques has been proposed in [28] for a single-user system by requiring a few elements of the IRS to be active. The authors in [29] focus on an IRS-assisted multi-user MISO system and leverage the sparsity of the cascaded channel, which consists of the BS-IRS channel and the IRS-user channel, to formulate the channel estimation problem as a sparse channel matrix recovery problem using compressive sensing techniques. The problem is solved using a multi-user joint channel estimator based on a two-step procedure. The authors in [30] exploit the rank-deficient structure of the massive MIMO channel to formulate and solve the cascaded channel estimation problem. To induce sparsity, some randomly selected IRS elements are switched off at each time.
With the exception of [23]-[26], which derive LS channel estimates for a single-user IRS-assisted system, the proposed protocols are based on approximate algorithms that do not yield analytical expressions for the channel estimates which could facilitate future theoretical analysis of IRS-assisted systems. Moreover, the contributions of most of these works are limited to developing channel estimation protocols and numerically evaluating them in terms of the mean squared error (MSE). They do not utilize the estimates to develop joint precoding and reflect beamforming designs for different downlink communication scenarios of interest, where the downlink rate loss caused by channel training is an important issue, especially in IRS-assisted systems. The most notable work that proposes a beamforming design under imperfect CSI is [18], which deals with the sum-rate maximization problem under a transmit power constraint by modeling the true channel coefficients as realizations from the sample space that is dominated by the knowledge of the imperfect CSI. However, the authors do not exploit any practical channel estimation protocol but rather assume a distribution for the channel estimation noise in the development of their algorithms.
Motivated by these gaps in research, we study the channel estimation and beamforming design problem for an IRS-assisted multi-user MISO communication system. We first outline the IRS-assisted system model, considering correlated Rayleigh fading channels between the IRS and the users, which are practically more relevant than the independent Rayleigh fading channels considered in most existing works. We then propose an optimal minimum mean squared error (MMSE) based channel estimation protocol to estimate the direct BS-to-users channel vectors as well as the cascaded channel vectors consisting of the BS-to-IRS link and the IRS-to-users links. The proposed protocol divides the channel estimation phase into multiple sub-phases, where in each sub-phase an optimal reflect beamforming vector is applied across the IRS elements. It turns out that the optimal IRS configuration in the training phase is to choose the reflect beamforming vectors as columns of the discrete Fourier transform (DFT) matrix. The proposed MMSE-DFT estimation protocol utilizes prior information on the large-scale fading statistics, which change very slowly as compared to the fast-fading process and can be easily tracked at the BS [1], [31], [32], to derive closed-form expressions of the MMSE estimates of the direct channel and the IRS-assisted channels. A detailed analytical comparison in terms of the normalized MSE confirms the superiority of the MMSE-DFT protocol over the LS-DFT protocol in [24] and the LS-ON/OFF protocol in [23].
To study the performance of the IRS-assisted communication system, we focus on solving the maximization of the minimum SINR (max-min SINR) problem by jointly designing the precoding vectors and power allocation at the BS and the phase shifts vector at the IRS, subject to a transmit power constraint at the BS and non-convex unit-modulus constraints on the IRS elements. The objective function is also non-convex, since the precoding vectors, allocated powers and phase shifts are coupled in it, and no optimal design is known. We tackle the problem using alternating optimization (AO), where the precoding vectors and allocated powers at the BS are optimized iteratively with the phase shifts at the IRS until convergence is achieved. For a fixed IRS phase shifts vector, the optimal solution to the max-min SINR sub-problem with respect to the precoding vectors and allocated powers is given by the optimal linear precoder (OLP) [32], while for fixed precoding and power allocation, the solution to the max-min SINR sub-problem with respect to the IRS phase shifts is obtained by applying semi-definite relaxation and solving the resulting fractional optimization problem optimally using the generalized Dinkelbach's algorithm. The proposed AO algorithm is proved to converge. We then extend the AO algorithm to the imperfect CSI scenario, where the MMSE estimates are utilized to design the precoder and the IRS phase shifts vector. The max-min SINR problem has only been dealt with in the context of IRS-assisted systems in [16], where the authors approximate and solve this problem in the asymptotic regime under perfect CSI using projected gradient ascent. Our work accounts for CSI errors and focuses on the exact problem. Simulation results provided towards the end of the work show the IRS-assisted system to be highly efficient but also sensitive to CSI errors as compared to the conventional MISO communication system.
The paper is organized as follows. The communication model for an IRS-assisted MISO system is introduced in Section II. We propose and analyze the MMSE-DFT channel estimation protocol in Section III. Joint designs for the precoding vectors and power allocation at the BS and the phase shifts vector at the IRS are developed to solve the max-min SINR problem in Section IV. Simulation results are provided in Section V and conclusions are presented in Section VI.
Notation: The following notation is used throughout this work. The notation $x \in [a, b]$ implies that the scalar $x$ lies in the closed interval between $a$ and $b$, i.e. $a \le x \le b$. Boldface lower-case and upper-case characters denote vectors and matrices respectively. The notations $\mathbf{x} \in \mathbb{C}^{N \times 1}$ and $\mathbf{X} \in \mathbb{C}^{N \times N}$ represent a vector of dimension $N$ and a matrix of dimension $N \times N$ respectively with complex entries. The superscripts $(\cdot)^T$ and $(\cdot)^H$ represent the transpose and conjugate transpose respectively, $\mathbb{E}[\cdot]$ represents the expectation and $\log(\cdot)$ represents the logarithm. The operators $\mathrm{tr}(\mathbf{X})$ and $\|\mathbf{X}\|$ denote the trace and the spectral norm respectively of the matrix $\mathbf{X}$. Also $\mathbf{X}^{-1}$ denotes the inverse of a non-singular matrix $\mathbf{X}$. The $N \times N$ identity matrix is denoted by $\mathbf{I}_N$ and the $N \times N$ diagonal matrix of entries $\{x_n\}$ is denoted by $\mathbf{X} = \mathrm{diag}(x_1, x_2, \dots, x_N)$. A random vector $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{\Phi})$ is complex Gaussian distributed with mean vector $\mathbf{m}$ and covariance matrix $\mathbf{\Phi}$. The Kronecker product of two matrices $\mathbf{X}$ and $\mathbf{Y}$ is denoted as $\mathbf{X} \otimes \mathbf{Y}$.

II. COMMUNICATION MODEL
In this section, we outline the transmission model and channel model utilized to study the IRS-assisted system. To improve the clarity of mathematical exposition, the important symbols used in this section are listed in Table I.

A. Transmission Model
The proposed IRS-assisted multi-user MISO system is illustrated in Fig. 1, which consists of a BS equipped with $M$ antennas serving $K$ single-antenna users. This communication is assisted by an IRS, comprising $N$ nearly passive reflecting elements which introduce phase shifts onto the incoming signal waves. The IRS is attached to the facade of a building located in the line-of-sight (LoS) of the BS. The reflection configuration of the IRS, which governs the phase shifts applied by individual IRS elements, is controlled by a micro-controller, which gets this information from the BS over a backhaul link.

TABLE I: Important symbols (surviving entries): $\mathbf{h}_{d,k}$, direct BS-to-user-$k$ channel; $\mathbf{v}$, IRS reflect beamforming vector.

Fig. 1: IRS-assisted multi-user MISO system. Red dashed lines represent the uplink channel vectors estimated in the proposed protocol.

The BS employs Gaussian codebooks and linear precoding, where $p_k$, $\mathbf{g}_k \in \mathbb{C}^{M \times 1}$ and $s_k \sim \mathcal{CN}(0, 1)$ are the allocated power, digital precoding vector and data symbol of user $k$ respectively. Based on these definitions, the transmit signal vector $\mathbf{x} \in \mathbb{C}^{M \times 1}$ is given as

$$\mathbf{x} = \sum_{k=1}^{K} \sqrt{p_k}\, \mathbf{g}_k s_k.$$

Given that the $s_k$'s are independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ variables, $\mathbf{x}$ has to satisfy the average transmit (Tx) power per user constraint as

$$\frac{1}{K}\mathbb{E}[\|\mathbf{x}\|^2] = \frac{1}{K}\mathrm{tr}(\mathbf{P}\mathbf{G}^H\mathbf{G}) \le P_{max},$$

where $P_{max} > 0$ is the Tx power constraint at the BS, $\mathbf{P} = \mathrm{diag}(p_1, \dots, p_K) \in \mathbb{C}^{K \times K}$ is the power allocation matrix, $\mathbf{G} = [\mathbf{g}_1, \dots, \mathbf{g}_K] \in \mathbb{C}^{M \times K}$ is the precoding matrix, and $\mathbf{s} = [s_1, \dots, s_K]^T$ is the vector of users' data symbols. We consider the block-fading model for the channels, which stay constant over the coherence interval of length $T$ symbols. The received complex baseband signal $y_k(t) \in \mathbb{C}$ at user $k$ in time-slot $t$ is given as

$$y_k(t) = (\mathbf{h}_{d,k}^H + \mathbf{h}_{2,k}^H \mathbf{\Theta}^H \mathbf{H}_1^H)\, \mathbf{x}(t) + n_k(t), \tag{3}$$

where $\mathbf{H}_1 \in \mathbb{C}^{M \times N}$ is the LoS channel between the BS and the IRS, $\mathbf{h}_{2,k} \in \mathbb{C}^{N \times 1}$ is the channel between the IRS and user $k$, $\mathbf{h}_{d,k} \in \mathbb{C}^{M \times 1}$ is the direct channel between the BS and user $k$ and $n_k(t) \sim \mathcal{CN}(0, \sigma_n^2)$ is the noise at the user. The IRS is represented by the diagonal matrix $\mathbf{\Theta} = \mathrm{diag}(\alpha_1 \exp(j\theta_1), \dots, \alpha_N \exp(j\theta_N))$, where $\theta_n \in [0, 2\pi]$ and $\alpha_n \in [0, 1]$ represent the phase-shift and the amplitude coefficient for element $n$ respectively. Note that $\mathbf{\Theta}$ is not updated on a symbol-duration level, but rather on a coherence-time level, i.e. after every $T$ symbols.
The uplink channel through the IRS given by $\mathbf{H}_1 \mathbf{\Theta} \mathbf{h}_{2,k}$ can be equivalently expressed as $\mathbf{H}_{0,k} \mathbf{v}$, where $\mathbf{v} = [\alpha_1 \exp(j\theta_1), \dots, \alpha_N \exp(j\theta_N)]^T \in \mathbb{C}^{N \times 1}$ is the reflect beamforming vector of the IRS and $\mathbf{H}_{0,k} = \mathbf{H}_1 \mathrm{diag}(\mathbf{h}_{2,k}^T) \in \mathbb{C}^{M \times N}$ is the cascaded channel matrix. The cascaded matrix $\mathbf{H}_{0,k}$ has $N$ column vectors of dimension $M$, where each column vector $\mathbf{h}_{0,n,k}$, $n = 1, \dots, N$, can be written as $\mathbf{h}_{0,n,k} = h_{2,k,n} \mathbf{h}_{1,n}$, with $\mathbf{h}_{1,n}$ denoting the $n$-th column of $\mathbf{H}_1$ and $h_{2,k,n}$ the $n$-th element of $\mathbf{h}_{2,k}$. This formulation in (3) enables the separation of the response of the IRS in $\mathbf{v}$ from the cascaded channel outside the IRS control in $\mathbf{H}_{0,k}$, and will assist us in the design of the channel estimation protocol.
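The equivalence between the two forms of the IRS-assisted uplink channel can be checked numerically. The following is a minimal sketch using only the Python standard library; the sizes `M` and `N`, the random draws and the helper names `matvec` and `cascaded_matrix` are illustrative assumptions, not part of the paper.

```python
import cmath
import random

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def cascaded_matrix(H1, h2):
    """H0 = H1 diag(h2^T): column n of H1 scaled by the n-th entry of h2."""
    return [[H1[m][n] * h2[n] for n in range(len(h2))] for m in range(len(H1))]

random.seed(0)
M, N = 3, 4  # illustrative sizes: BS antennas, IRS elements

def crandn():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

H1 = [[crandn() for _ in range(N)] for _ in range(M)]
h2 = [crandn() for _ in range(N)]
# Reflect beamforming vector v with alpha_n = 1 and random phases (assumed values)
v = [cmath.exp(1j * random.uniform(0, 2 * cmath.pi)) for _ in range(N)]

# Left side: H1 * Theta * h2 with Theta = diag(v)
lhs = matvec(H1, [v[n] * h2[n] for n in range(N)])
# Right side: H0 * v with H0 = H1 diag(h2^T)
rhs = matvec(cascaded_matrix(H1, h2), v)

max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)
```

The two sides agree up to floating-point rounding, which is why the IRS response can be isolated in v while everything outside the IRS control is collected in the cascaded matrix.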
In terms of CSI acquisition, the IRS-assisted system is different from existing popular communication systems, like the conventional MISO system and relay-assisted MISO system, since unlike the BS and a relay, the IRS has no radio resources of its own to send pilot symbols to help the BS estimate $\mathbf{H}_1$, nor can it receive and process pilot symbols from the users to estimate the $\mathbf{h}_{2,k}$'s. This is one of the biggest challenges in the practical design of IRS-assisted systems. In terms of precoding/beamforming design, the IRS-assisted system model is much more difficult to analyze than existing models, due to the constant-modulus constraints on the elements of the reflect beamforming vector $\mathbf{v}$. Although beamforming optimization under unit-modulus constraints has been studied in the context of hybrid digital/analog mmWave architectures [33], [34], such designs are mainly restricted to the BS side, and are not directly applicable to the joint design of the precoding at the BS and reflect beamforming at the IRS.

B. Channel Model
The design of IRS-assisted systems also requires the correct modeling of $\mathbf{h}_{2,k}$ and $\mathbf{H}_1$. Existing works (e.g. [13]-[15], [17]-[20]) utilize the independent Rayleigh and Rician models to analyze the system performance, which are only practical if the IRS elements are spaced far enough apart and the environment has rich scattering. In most practical settings, the channels with respect to the IRS elements will be spatially correlated, which will impact the performance. In this work, we will evaluate the performance of the IRS-assisted system under the correlated Rayleigh channel model for $\mathbf{h}_{2,k}$ and $\mathbf{h}_{d,k}$ given as

$$\mathbf{h}_{2,k} = \sqrt{\beta_{2,k}}\, \mathbf{R}_{IRS_k}^{1/2} \mathbf{z}_k,$$

$$\mathbf{h}_{d,k} = \sqrt{\beta_{d,k}}\, \mathbf{R}_{BS_k}^{1/2} \mathbf{z}_{d,k},$$

where $\mathbf{R}_{IRS_k} \in \mathbb{C}^{N \times N}$ and $\mathbf{R}_{BS_k} \in \mathbb{C}^{M \times M}$ are the correlation matrices at the IRS and the BS respectively with respect to (w.r.t.) user $k$, with $\mathrm{tr}(\mathbf{R}_{IRS_k}) = N$ and $\mathrm{tr}(\mathbf{R}_{BS_k}) = M$. Moreover, $\mathbf{z}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_N)$ and $\mathbf{z}_{d,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ are the fast fading vectors for the IRS-to-user $k$ link and BS-to-user $k$ link respectively, and $\beta_{2,k}$ and $\beta_{d,k}$ are the path loss factors for the IRS-to-user $k$ link and BS-to-user $k$ link respectively. We will adopt the correlation model developed for arrays of discrete antennas in [35], [36], assuming that the underlying IRS technology is a reflective antenna array or a reflectarray. It is important to note that the conventional statistical correlation models for arrays of discrete antennas are not directly applicable if the IRS is realized using a reconfigurable meta-surface. The correct modeling of the spatial correlation for this implementation still requires significant attention from researchers who are conversant in both communication and electromagnetic theory.
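A correlated Rayleigh channel of this kind can be drawn as follows. This is only a sketch under stated assumptions: the exponential correlation matrix with coefficient `rho` is a stand-in for the model of [35], [36], and a Cholesky factor L with L L^T = R is used in place of the matrix square root, which yields the same distribution CN(0, βR). All function names and numerical values are hypothetical.

```python
import math
import random

def exp_corr(n, rho):
    """Exponential correlation matrix R[i][j] = rho^|i-j| (assumed model), tr(R) = n."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def cholesky(R):
    """Lower-triangular L with L L^T = R for a real symmetric positive-definite R."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L

def correlated_rayleigh(L, beta, rng):
    """Draw h = sqrt(beta) * L * z with z ~ CN(0, I)."""
    n = len(L)
    std = math.sqrt(0.5)  # real and imaginary parts each carry half the variance
    z = [complex(rng.gauss(0, std), rng.gauss(0, std)) for _ in range(n)]
    return [math.sqrt(beta) * sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]

rng = random.Random(1)
N, rho, beta_2k = 4, 0.7, 1e-3  # illustrative values
R_irs = exp_corr(N, rho)
L = cholesky(R_irs)
h_2k = correlated_rayleigh(L, beta_2k, rng)
trace_R = sum(R_irs[i][i] for i in range(N))
```

Setting `rho = 0` recovers the independent Rayleigh fading model used in most earlier works.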
The IRS is envisioned to be installed on a high-rise building close to the BS, which will result in a LoS channel between the BS and the IRS [13], [16]. Since the BS and the IRS each have co-located elements, the channel matrix $\mathbf{H}_1$ is likely to have rank one, i.e. it is given by the outer product of the array responses at the BS and the IRS defined in [16]. Under such a setting, the degrees of freedom offered by the overall IRS-assisted link $\mathbf{H}_{0,k}$ will be one and the IRS will only yield performance gains when $K = 1$ [16]. To benefit from the IRS in a multi-user setting, we must have $\mathrm{rank}(\mathbf{H}_1) \ge K$. One way to introduce this rank is to have deterministic scattering between the BS and the IRS, or to place the IRS close to the BS such that the LoS channel could be made of high rank. The high-rank LoS BS-to-IRS channel matrix $\mathbf{H}_1$ for a multi-user setting can be generated as described in [16].
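The loss of degrees of freedom under a rank-one BS-to-IRS channel can be illustrated numerically. The sketch below assumes half-wavelength uniform linear arrays and arbitrary angles for the array responses `a` and `b` (both hypothetical choices, not the definitions of [16]); it shows that the IRS-assisted channels of two different users become collinear whatever the IRS phases are.

```python
import cmath
import math
import random

def ula_response(n, angle):
    """Half-wavelength uniform linear array response (an assumed geometry)."""
    return [cmath.exp(1j * math.pi * i * math.sin(angle)) for i in range(n)]

rng = random.Random(2)
M, N = 4, 6
a = ula_response(M, 0.3)   # BS array response (assumed angle)
b = ula_response(N, -0.5)  # IRS array response (assumed angle)

# Rank-one LoS channel H1 = a b^H
H1 = [[a[m] * b[n].conjugate() for n in range(N)] for m in range(M)]
v = [cmath.exp(1j * rng.uniform(0, 2 * math.pi)) for _ in range(N)]

def effective_channel(h2):
    """IRS-assisted uplink channel H1 * diag(v) * h2 of one user."""
    return [sum(H1[m][n] * v[n] * h2[n] for n in range(N)) for m in range(M)]

users = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)] for _ in range(2)]
g1, g2 = (effective_channel(h2) for h2 in users)

# All 2x2 minors of [g1 g2] vanish exactly when g1 and g2 are collinear
max_minor = max(abs(g1[i] * g2[j] - g1[j] * g2[i]) for i in range(M) for j in range(M))
print(max_minor)
```

Both users see a scaled copy of the same vector `a`, so the IRS-assisted link cannot separate them spatially, which is the reason a BS-to-IRS channel of rank at least K is needed in the multi-user setting.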

III. CHANNEL ESTIMATION PROTOCOL
Channel estimation is necessary to compute the precoding vectors at the BS and the reflect beamforming vector $\mathbf{v}$ at the IRS. The real difficulty is in the estimation of $\mathbf{H}_1$ and the $\mathbf{h}_{2,k}$'s, as the IRS has no radio resources and signal processing capability to send pilot symbols to the BS to enable the estimation of $\mathbf{H}_1$, or to receive pilot symbols from the users and estimate $\mathbf{h}_{2,k}$. Recently, a few papers have proposed LS estimates for the IRS-assisted channels assuming a single-user IRS-assisted MISO system in [23] and [24]. More specifically, [23] proposes an ON/OFF channel estimation protocol, where first the direct channel is estimated by keeping all IRS elements OFF and then the IRS-assisted channels $\mathbf{h}_{0,n,k}$, $n = 1, \dots, N$, are estimated one-by-one by switching one element of the IRS ON at a time. In [24], LS channel estimates are derived keeping all the IRS elements active throughout the channel estimation phase, with an optimal IRS phase shift matrix given as the DFT matrix. The idea was extended in [25] to an OFDM system and in [26] to an IRS-assisted system with discrete phase shifts, while focusing on a single-user scenario. In parallel to these works, a few channel estimation algorithms exploiting the sparsity of the cascaded channel matrix $\mathbf{H}_{0,k}$ have also been recently proposed, as discussed in the introduction.
In this section, we will outline our channel estimation protocol, where the BS computes the MMSE estimates of the IRS-assisted channel vectors based on the received pilot sequences from users over multiple sub-phases, where in each sub-phase the IRS applies an optimal reflect beamforming vector $\mathbf{v}$. The MMSE estimator significantly outperforms the LS estimator since it is based on the Bayesian estimation technique, which achieves the minimum MSE between the true and estimated channel by exploiting prior knowledge of the channel's large-scale fading statistics [37]. These statistics stay constant over several coherence intervals and can be accurately learned and tracked at the BS, as discussed later in this section. After deriving the MMSE estimates, we will analytically compare the normalized MSE of both the LS and MMSE estimates. Simulation results are also provided to compare the MSE and bit error rate (BER) performance of the proposed protocol with existing methods. The important symbols used in this section are summarized in Table II for readers' convenience.

TABLE II: Important symbols (surviving entries): $\sigma^2$, received noise variance at BS; $\mathbf{x}_{p,k}$, pilot sequence of user $k$; $\mathbf{Y}_s^{tr}$, received training signal at the BS in sub-phase $s$; $\mathbf{r}_{s,k}^{tr}$, observation vector for user $k$ in sub-phase $s$; $\mathrm{NMSE}(\hat{\mathbf{h}})$, normalized mean squared error in estimate $\hat{\mathbf{h}}$; $c$, constant defined as $c = \frac{\sigma^2}{S P_C \tau_S}$.

A. Proposed MMSE-DFT Channel Estimation Protocol
Given the passive nature of the IRS, we exploit channel reciprocity under the TDD protocol in estimating the downlink channels using the received uplink pilot signals from the users. For this purpose, we divide the channel coherence period of $\tau$ seconds (sec) into an uplink training phase of $\tau_C$ sec and a downlink transmission phase of $\tau_D$ sec. Throughout the uplink training phase, the users transmit mutually orthogonal pilot symbols. Since the IRS has no radio resources to send or receive and process pilot symbols, the BS has to estimate all the channels. To this end, note that $\mathbf{H}_1$ and $\mathbf{h}_{2,k}$ have been cascaded as $\mathbf{H}_{0,k} = \mathbf{H}_1 \mathrm{diag}(\mathbf{h}_{2,k}^T)$. Since the estimation of $\mathbf{h}_{2,k}$ separately is extremely difficult due to the passive nature of the IRS elements, we will focus on the MMSE estimation of the cascaded IRS-assisted channels $\mathbf{h}_{0,n,k}$, $n = 1, \dots, N$, and the direct channel $\mathbf{h}_{d,k}$ for all $k = 1, \dots, K$ users at the BS.
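In the considered channel estimation protocol, the total channel estimation period of $\tau_C$ sec is divided into $S$ sub-phases,¹ each of length $\tau_S = \frac{\tau_C}{S}$ sec. The IRS applies the reflect beamforming vector $\mathbf{v}_s = [v_{s,1}, \dots, v_{s,N}]^T \in \mathbb{C}^{N \times 1}$ throughout sub-phase $s$, $s = 1, \dots, S$, where $v_{s,n} = \alpha_{s,n} \exp(j\theta_{s,n})$. In each sub-phase, the users transmit $T_S = \frac{\tau_S}{\tau}$ pilot symbols, where $\tau$ is the duration of each symbol. Users transmit $S$ copies of orthogonal pilot sequences across the $S$ sub-phases, where the pilot sequence of user $k$ is denoted as $\mathbf{x}_{p,k} \in \mathbb{C}^{T_S \times 1}$, such that $\mathbf{x}_{p,k}^H \mathbf{x}_{p,l} = 0$ for $k \ne l$, $k, l = 1, \dots, K$, and $\mathbf{x}_{p,k}^H \mathbf{x}_{p,k} = P_C T_S \tau = P_C \tau_S$ Joules, where $P_C$ is the transmit power of each user. The received training signal $\mathbf{Y}_s^{tr} \in \mathbb{C}^{M \times T_S}$ in sub-phase $s$ is given as

$$\mathbf{Y}_s^{tr} = \sum_{k=1}^{K} (\mathbf{h}_{d,k} + \mathbf{H}_{0,k} \mathbf{v}_s)\, \mathbf{x}_{p,k}^H + \mathbf{N}_s^{tr},$$

where $\mathbf{N}_s^{tr} \in \mathbb{C}^{M \times T_S}$ is the matrix of noise vectors at the BS, with each column distributed independently as $\mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_M)$. To get the observation vector with respect to each user, the BS correlates the received training signal with the pilot sequence of user $k$ to obtain the observation vector $\mathbf{r}_{s,k}^{tr} \in \mathbb{C}^{M \times 1}$ for user $k$ in sub-phase $s$ as

$$\mathbf{r}_{s,k}^{tr} = \frac{\mathbf{Y}_s^{tr} \mathbf{x}_{p,k}}{P_C \tau_S} = \mathbf{h}_{d,k} + \mathbf{H}_{0,k} \mathbf{v}_s + \mathbf{n}_{s,k}^{tr}, \tag{8}$$

where $\mathbf{n}_{s,k}^{tr} = \frac{\mathbf{N}_s^{tr} \mathbf{x}_{p,k}}{P_C \tau_S} \sim \mathcal{CN}(\mathbf{0}, \frac{\sigma^2}{P_C \tau_S} \mathbf{I}_M)$. Collecting the observation vectors in (8) across the $S$ training sub-phases, we obtain

$$\mathbf{r}_k^{tr} = (\mathbf{V}^{tr} \otimes \mathbf{I}_M)\, \bar{\mathbf{h}}_k + \mathbf{n}_k^{tr}, \tag{9}$$

where $\mathbf{r}_k^{tr} = [\mathbf{r}_{1,k}^{tr\,T}, \dots, \mathbf{r}_{S,k}^{tr\,T}]^T \in \mathbb{C}^{MS \times 1}$, $\mathbf{n}_k^{tr} = [\mathbf{n}_{1,k}^{tr\,T}, \dots, \mathbf{n}_{S,k}^{tr\,T}]^T \in \mathbb{C}^{MS \times 1}$, $\bar{\mathbf{h}}_k = [\mathbf{h}_{d,k}^T, \mathbf{h}_{0,1,k}^T, \dots, \mathbf{h}_{0,N,k}^T]^T \in \mathbb{C}^{M(N+1) \times 1}$ and

$$\mathbf{V}^{tr} = \begin{bmatrix} 1 & \mathbf{v}_1^T \\ \vdots & \vdots \\ 1 & \mathbf{v}_S^T \end{bmatrix} \in \mathbb{C}^{S \times (N+1)}. \tag{10}$$

The received observation vector in (9) is processed at the BS with the left pseudo-inverse of $\bar{\mathbf{V}}^{tr} = \mathbf{V}^{tr} \otimes \mathbf{I}_M \in \mathbb{C}^{MS \times M(N+1)}$, provided that $S \ge N + 1$,² as

$$\tilde{\mathbf{r}}_k^{tr} = (\bar{\mathbf{V}}^{tr\,H} \bar{\mathbf{V}}^{tr})^{-1} \bar{\mathbf{V}}^{tr\,H} \mathbf{r}_k^{tr}. \tag{11}$$

Performing the pseudo-inverse operation in (11) will result in

$$\tilde{\mathbf{r}}_k^{tr} = \bar{\mathbf{h}}_k + \tilde{\mathbf{n}}_k^{tr}, \tag{12}$$

which is a function of the true channel vectors $\mathbf{h}_{d,k}$ and $\mathbf{h}_{0,n,k}$, $n = 1, \dots, N$, collected in $\bar{\mathbf{h}}_k$, and the noise $\tilde{\mathbf{n}}_k^{tr}$ in the received observation vector. The remaining task before proceeding to the derivation of the MMSE estimates is to design $\mathbf{V}^{tr}$. The appropriate design criterion is to minimize the variances of the elements of the noise vector $\tilde{\mathbf{n}}_k^{tr}$, while keeping the noise across the estimation of different channel vectors uncorrelated. The covariance matrix of the noise $\tilde{\mathbf{n}}_k^{tr} = (\bar{\mathbf{V}}^{tr\,H} \bar{\mathbf{V}}^{tr})^{-1} \bar{\mathbf{V}}^{tr\,H} \mathbf{n}_k^{tr}$ is given as

$$\mathbf{C}_{\tilde{\mathbf{n}}_k^{tr}} = \frac{\sigma^2}{P_C \tau_S} (\mathbf{V}^{tr\,H} \mathbf{V}^{tr})^{-1} \otimes \mathbf{I}_M. \tag{14}$$

To ensure uncorrelated noise across the estimated channels, $\mathbf{C}_{\tilde{\mathbf{n}}_k^{tr}}$ should be a scaled identity matrix and therefore $\mathbf{V}^{tr}$ should have orthogonal columns. Furthermore, we will aim to achieve the same noise variance in the estimation of all channels, which will require equally scaled orthogonal columns of $\mathbf{V}^{tr}$, i.e. $(\mathbf{V}^{tr\,H} \mathbf{V}^{tr})^{-1} = \zeta \mathbf{I}_{N+1}$. Minimizing the variance of the noise is then equivalent to minimizing $\zeta$ with the constraints that 1) $\mathbf{V}^{tr}$ has the structure in (10), and 2) $|v_{s,n}| \le 1$ for all $s$ and $n$. Since each diagonal entry of $\mathbf{V}^{tr\,H} \mathbf{V}^{tr}$ is then at most $S$, we have

$$\zeta \ge \frac{1}{S}. \tag{15}$$

Under the outlined constraints on $\mathbf{V}^{tr}$, a possible optimal design that attains the lower bound in (15) is the $N + 1$ leading columns of an $S \times S$ DFT matrix, given as [24]

$$[\mathbf{V}^{tr}]_{s,n} = e^{-j 2\pi (n-1)(s-1)/S}, \tag{16}$$

$s = 1, \dots, S$, $n = 1, \dots, N + 1$. Under the DFT design, we have $(\mathbf{V}^{tr\,H} \mathbf{V}^{tr})^{-1} = \frac{1}{S} \mathbf{I}_{N+1}$ and therefore $\zeta = \frac{1}{S}$. This choice for $\mathbf{V}^{tr}$ indeed attains the lower bound in (15) while meeting all constraints.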
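The properties claimed for the DFT training design can be verified directly. The sketch below, with illustrative sizes `S` and `N` and hypothetical helper names, builds the N + 1 leading columns of the S × S DFT matrix and checks that the Gram matrix equals S times the identity, so that ζ = 1/S.

```python
import cmath

def dft_training_matrix(S, N):
    """First N+1 columns of the S x S DFT matrix; row s is [1, v_s^T]."""
    return [[cmath.exp(-2j * cmath.pi * s * n / S) for n in range(N + 1)]
            for s in range(S)]

def gram(V):
    """Compute V^H V."""
    rows, cols = len(V), len(V[0])
    return [[sum(V[s][i].conjugate() * V[s][j] for s in range(rows))
             for j in range(cols)] for i in range(cols)]

S, N = 8, 5  # illustrative sizes satisfying S >= N + 1
V = dft_training_matrix(S, N)
G = gram(V)

# Orthogonal, equally scaled columns: V^H V = S * I
off_diag = max(abs(G[i][j]) for i in range(N + 1) for j in range(N + 1) if i != j)
diag_err = max(abs(G[i][i] - S) for i in range(N + 1))
zeta = 1 / G[0][0].real
print(off_diag, diag_err, zeta)
```

The first column is all ones, as required by the structure of the training matrix (it multiplies the direct channel), and every entry has unit modulus, so the design is realizable with phase shifts alone.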
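We now derive the MMSE estimates based on the received observation vector $\tilde{\mathbf{r}}_k^{tr}$ in (12), which can be simplified under the DFT design in (16) as

$$\tilde{\mathbf{r}}_k^{tr} = \frac{1}{S} (\mathbf{V}^{tr} \otimes \mathbf{I}_M)^H \mathbf{r}_k^{tr} = \bar{\mathbf{h}}_k + \frac{1}{S} (\mathbf{V}^{tr} \otimes \mathbf{I}_M)^H \mathbf{n}_k^{tr}. \tag{17}$$

We can write (17) as $\tilde{\mathbf{r}}_k^{tr} = [\tilde{\mathbf{r}}_{1,k}^{tr\,T}, \dots, \tilde{\mathbf{r}}_{N+1,k}^{tr\,T}]^T$. To derive the MMSE-DFT estimate of $\mathbf{h}_{d,k}$, we exploit the relationship between $\tilde{\mathbf{r}}_{1,k}^{tr}$ and $\mathbf{h}_{d,k}$ given as

$$\tilde{\mathbf{r}}_{1,k}^{tr} = \frac{1}{S} (\mathbf{v}_1^{tr} \otimes \mathbf{I}_M)^H \mathbf{r}_k^{tr} = \mathbf{h}_{d,k} + \frac{1}{S} (\mathbf{v}_1^{tr} \otimes \mathbf{I}_M)^H \mathbf{n}_k^{tr}, \tag{18}$$

where $\mathbf{v}_1^{tr}$ is the first $S \times 1$ column of $\mathbf{V}^{tr}$. Based on the observation vector in (18), the BS can compute the estimate of $\mathbf{h}_{d,k}$, and the result is stated in the following lemma.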
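The processing chain described above, stacking the per-sub-phase observations and applying the scaled conjugate transpose of the DFT training matrix, can be sketched in a few lines. The example below is noise-free so that exact recovery of the direct and cascaded channel blocks can be observed; the sizes and random channel draws are assumptions for illustration only.

```python
import cmath
import random

rng = random.Random(3)
M, N, S = 2, 3, 4  # illustrative sizes with S >= N + 1

def crandn():
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

# Channel blocks [h_d, h_{0,1}, ..., h_{0,N}], each of length M
h_blocks = [[crandn() for _ in range(M)] for _ in range(N + 1)]

# DFT training matrix: row s is [1, v_s^T]
V = [[cmath.exp(-2j * cmath.pi * s * n / S) for n in range(N + 1)] for s in range(S)]

# Noise-free observation in sub-phase s: r_s = h_d + sum_n v_{s,n} h_{0,n}
r = [[sum(V[s][n] * h_blocks[n][m] for n in range(N + 1)) for m in range(M)]
     for s in range(S)]

# Apply (1/S) (V kron I_M)^H: block n is (1/S) * sum_s conj(V[s][n]) * r_s
r_tilde = [[sum(V[s][n].conjugate() * r[s][m] for s in range(S)) / S for m in range(M)]
           for n in range(N + 1)]

max_err = max(abs(r_tilde[n][m] - h_blocks[n][m])
              for n in range(N + 1) for m in range(M))
print(max_err)
```

With noise present, each recovered block equals the true channel block plus an independent noise term, which is the starting point for both the LS and the MMSE estimates.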
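Lemma 1: The MMSE estimate $\hat{\mathbf{h}}_{d,k}$ of $\mathbf{h}_{d,k}$ is given as

$$\hat{\mathbf{h}}_{d,k} = \beta_{d,k} \mathbf{R}_{BS_k} \mathbf{Q}_{d,k}\, \tilde{\mathbf{r}}_{1,k}^{tr},$$

which is distributed as $\hat{\mathbf{h}}_{d,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{\Psi}_{d,k})$, where

$$\mathbf{\Psi}_{d,k} = \beta_{d,k}^2 \mathbf{R}_{BS_k} \mathbf{Q}_{d,k} \mathbf{R}_{BS_k}$$

and

$$\mathbf{Q}_{d,k} = \left( \beta_{d,k} \mathbf{R}_{BS_k} + \frac{\sigma^2}{S P_C \tau_S} \mathbf{I}_M \right)^{-1}.$$

Invoking the orthogonality property of the MMSE estimate [38], we can decompose the channel $\mathbf{h}_{d,k}$ as $\mathbf{h}_{d,k} = \hat{\mathbf{h}}_{d,k} + \tilde{\mathbf{h}}_{d,k}$, where $\tilde{\mathbf{h}}_{d,k} \sim \mathcal{CN}(\mathbf{0}, \tilde{\mathbf{\Psi}}_{d,k})$ is the uncorrelated estimation error (which is also statistically independent of $\hat{\mathbf{h}}_{d,k}$ due to the joint Gaussianity of both vectors) and $\tilde{\mathbf{\Psi}}_{d,k} = \beta_{d,k} \mathbf{R}_{BS_k} - \mathbf{\Psi}_{d,k}$.
We now find the MMSE-DFT estimate of $\mathbf{h}_{0,n,k}$, $n = 1, \dots, N$, using the received observation vector $\tilde{\mathbf{r}}_{n+1,k}^{tr}$, which is given using (17) as

$$\tilde{\mathbf{r}}_{n+1,k}^{tr} = \frac{1}{S} (\mathbf{v}_{n+1}^{tr} \otimes \mathbf{I}_M)^H \mathbf{r}_k^{tr} = \mathbf{h}_{0,n,k} + \frac{1}{S} (\mathbf{v}_{n+1}^{tr} \otimes \mathbf{I}_M)^H \mathbf{n}_k^{tr},$$

where $\mathbf{v}_{n+1}^{tr}$ is the $(n + 1)$-th column vector of $\mathbf{V}^{tr}$. Based on this observation vector, the BS can compute the estimate of $\mathbf{h}_{0,n,k}$, and the result is stated in the following lemma.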
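For intuition, the direct-channel MMSE estimate takes a particularly simple form in the special case of uncorrelated BS antennas. The sketch below assumes a BS correlation matrix equal to the identity (an assumption made here for illustration, not the general case of the lemma), in which case the estimator reduces to scaling the observation by β/(β + c), where c denotes the per-entry noise variance of the processed observation. The function names are hypothetical.

```python
def mmse_direct_iid(r_tilde, beta_d, c):
    """MMSE estimate of h_d from r_tilde = h_d + noise, assuming R_BS = I_M.

    beta_d: path loss factor of the direct link.
    c: per-entry noise variance of r_tilde, i.e. sigma^2 / (S * P_C * tau_S).
    """
    gain = beta_d / (beta_d + c)  # shrinks the observation toward the prior mean 0
    return [gain * x for x in r_tilde]

def error_variance(beta_d, c):
    """Per-entry variance of the estimation error, beta_d - beta_d^2 / (beta_d + c)."""
    return beta_d - beta_d ** 2 / (beta_d + c)

# Example with assumed values: equal channel and noise power halves the observation
estimate = mmse_direct_iid([1 + 1j, -2j], beta_d=1.0, c=1.0)
print(estimate, error_variance(1.0, 1.0))
```

As c tends to zero the gain tends to one and the estimate coincides with the LS estimate, while for very large c the estimate shrinks to zero and the error variance approaches the channel variance itself.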
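Lemma 2: The MMSE estimate $\hat{\mathbf{h}}_{0,n,k}$ of $\mathbf{h}_{0,n,k}$ is given as

$$\hat{\mathbf{h}}_{0,n,k} = \beta_{2,k} r_{n,k} \mathbf{h}_{1,n} \mathbf{h}_{1,n}^H \mathbf{Q}_{n,k}\, \tilde{\mathbf{r}}_{n+1,k}^{tr},$$

for $n = 1, \dots, N$, $k = 1, \dots, K$, which is distributed as $\hat{\mathbf{h}}_{0,n,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{\Psi}_{n,k})$, where

$$\mathbf{\Psi}_{n,k} = (\beta_{2,k} r_{n,k})^2\, \mathbf{h}_{1,n} \mathbf{h}_{1,n}^H \mathbf{Q}_{n,k} \mathbf{h}_{1,n} \mathbf{h}_{1,n}^H$$

and

$$\mathbf{Q}_{n,k} = \left( \beta_{2,k} r_{n,k} \mathbf{h}_{1,n} \mathbf{h}_{1,n}^H + \frac{\sigma^2}{S P_C \tau_S} \mathbf{I}_M \right)^{-1}.$$

Proof: The proof is provided in Appendix B.
Invoking the orthogonality property of the MMSE estimate, we can decompose $\mathbf{h}_{0,n,k}$ as $\mathbf{h}_{0,n,k} = \hat{\mathbf{h}}_{0,n,k} + \tilde{\mathbf{h}}_{0,n,k}$, where $\tilde{\mathbf{h}}_{0,n,k} \sim \mathcal{CN}(\mathbf{0}, \tilde{\mathbf{\Psi}}_{n,k})$ is the uncorrelated estimation error, with $\tilde{\mathbf{\Psi}}_{n,k} = \beta_{2,k} r_{n,k} \mathbf{h}_{1,n} \mathbf{h}_{1,n}^H - \mathbf{\Psi}_{n,k}$. Under the proposed design in (16), the MMSE estimates do not depend on the cross-correlation between IRS elements, so knowledge of $\mathbf{R}_{IRS}$ is not required at the BS.³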
To calculate the MMSE estimates, the BS will require knowledge of the correlation matrices $\mathbf{R}_{BS_k}$, $k = 1, \dots, K$, and the LoS BS-to-IRS channel vectors $\mathbf{h}_{1,n}$, $n = 1, \dots, N$. The LoS channel vectors are deterministic and depend only on the LoS angles between the BS and the IRS. These angles need to be calculated only once at the BS using knowledge of the IRS location, which is fixed. The correlation matrices vary very slowly as compared to the fast fading process and stay constant over many coherence intervals. As discussed in several works, they can be calculated based on knowledge of only the users' angles of arrival (AoAs), which depend on their locations, and the angular spread in the environment, both of which can be accurately learned and tracked at the BS [1], [31].⁴ In fact, second-order channel statistics are generally assumed to be perfectly known at the BS in the massive MIMO literature [39].
Unlike the LS estimates, the MMSE estimates depend on the distribution of $\mathbf{H}_1$, $\mathbf{h}_{2,k}$ and $\mathbf{h}_{d,k}$. The derived results can be easily generalized to other channel fading models. For example, the MMSE estimates under independent Rayleigh fading $\mathbf{h}_{2,k}$'s and $\mathbf{h}_{d,k}$'s can be obtained by setting $\mathbf{R}_{BS} = \mathbf{I}_M$. The estimates when $\mathbf{H}_1$ is not fixed but rather follows a fading model can be similarly developed. After obtaining the MMSE estimates, the BS uses them to design the precoder $\mathbf{G}^*$, the power allocation matrix $\mathbf{P}^*$ as well as the reflect beamforming vector $\mathbf{v}^*$ in (3) based on the performance criterion of interest.
The BS then provides information on the required IRS phase shifts vector $\mathbf{v}^*$ for downlink transmission to the IRS micro-controller. Wireless backhaul links in the mmWave and THz bands are suitable candidates for the BS to communicate with the IRS controller under strict latency requirements [2].

B. NMSE Comparison with Least Squares Estimation
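The LS estimates are obtained by correlating the received training signal $\mathbf{Y}_s^{tr}$ with the pilot sequence of user $k$ as shown in (8) and applying the pseudo-inverse of $\bar{\mathbf{V}}^{tr}$ on the resulting observation vector as done in (11) [24]. Under the DFT design for $\mathbf{V}^{tr}$ in (16), the LS estimates are given as

$$\hat{\mathbf{h}}_{d,k}^{LS} = \frac{1}{S} (\mathbf{v}_1^{tr} \otimes \mathbf{I}_M)^H \mathbf{r}_k^{tr}, \qquad \hat{\mathbf{h}}_{0,n,k}^{LS} = \frac{1}{S} (\mathbf{v}_{n+1}^{tr} \otimes \mathbf{I}_M)^H \mathbf{r}_k^{tr},$$

where $\mathbf{v}_{n+1}^{tr}$ is the $(n + 1)$-th column of $\mathbf{V}^{tr}$. We develop analytical expressions for the normalized MSE (NMSE) in the LS and MMSE estimates of the direct and IRS-assisted channel vectors. The NMSE is defined as

$$\mathrm{NMSE}(\hat{\mathbf{h}}) = \frac{\mathrm{tr}\left( \mathbb{E}[(\hat{\mathbf{h}} - \mathbf{h})(\hat{\mathbf{h}} - \mathbf{h})^H] \right)}{\mathrm{tr}\left( \mathbb{E}[\mathbf{h} \mathbf{h}^H] \right)}.$$

To enable an analytical comparison, we set $\mathbf{R}_{BS_k} = \mathbf{I}_M$ and $\mathbf{R}_{IRS_k} = \mathbf{I}_N$. The NMSE in the LS-DFT estimate of $\mathbf{h}_{d,k}$ is then obtained as

$$\mathrm{NMSE}(\hat{\mathbf{h}}_{d,k}^{LS}) = \frac{\sigma^2}{S P_C \tau_S \beta_{d,k}}. \tag{30}$$

The result follows from using $\mathbb{E}[\mathbf{n}_k^{tr} \mathbf{n}_k^{tr\,H}] = \frac{\sigma^2}{P_C \tau_S} \mathbf{I}_{MS}$ as proved in (65) and that $\mathrm{tr}((\mathbf{v}_1^{tr} \otimes \mathbf{I}_M)^H (\mathbf{v}_1^{tr} \otimes \mathbf{I}_M)) = SM$. The expression reveals that the NMSE in the LS estimate increases linearly as $\sigma^2$ grows large, and grows without bound as $\beta_{d,k}$, $P_C$ or $\tau_S$ become small. This result can also be derived directly as the trace of the first $M \times M$ diagonal block of $\mathbf{C}_{\tilde{\mathbf{n}}_k^{tr}}$ in (14), normalized by $\mathrm{tr}(\mathbb{E}[\mathbf{h}_{d,k} \mathbf{h}_{d,k}^H]) = \beta_{d,k} M$. The NMSE in the MMSE-DFT estimate of $\mathbf{h}_{d,k}$ in Lemma 1 can be computed as

$$\mathrm{NMSE}(\hat{\mathbf{h}}_{d,k}) = \frac{\mathrm{tr}(\tilde{\mathbf{\Psi}}_{d,k})}{\mathrm{tr}(\beta_{d,k} \mathbf{R}_{BS_k})} = \frac{\frac{\sigma^2}{S P_C \tau_S}}{\beta_{d,k} + \frac{\sigma^2}{S P_C \tau_S}}. \tag{32}$$

We observe that the NMSE in the MMSE estimate approaches 1 as $\sigma^2$ grows large or as $\beta_{d,k}$, $P_C$, $\tau_S$ become small. The NMSE value of 1 signifies that the error in the channel estimate has the same power as the true channel itself. Any beamforming transmission under estimates having NMSE values of 1 or beyond will correspond to isotropic transmission, i.e. as if the BS and IRS beamform with no CSI at all [23]. However, as compared to the LS estimate, the NMSE in the MMSE-DFT estimate will increase to 1 much more slowly (i.e. only when the noise becomes very high or $\beta_{d,k}$ becomes very small), as can be seen by comparing (30) and (32). This implies that the MMSE-DFT estimates will be more accurate even at low values of the training signal-to-noise ratio (SNR). Finally, denoting $c = \frac{\sigma^2}{S P_C \tau_S}$, we have

$$\mathrm{NMSE}(\hat{\mathbf{h}}_{d,k}^{LS}) - \mathrm{NMSE}(\hat{\mathbf{h}}_{d,k}) = \frac{c}{\beta_{d,k}} - \frac{c}{c + \beta_{d,k}} = \frac{c^2}{\beta_{d,k} (c + \beta_{d,k})} \ge 0,$$

since $c$ and $\beta_{d,k}$ are non-negative. Therefore, the MMSE-DFT estimate of the direct channel will always outperform the LS-DFT estimate for any value of $\sigma^2$, $P_C$, $S$, $\tau_S$ and $\beta_{d,k}$.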
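The behaviour of the two direct-channel NMSE expressions can be tabulated with a few lines of code. The sketch below assumes an identity BS correlation matrix, under which the LS NMSE is c/β and the MMSE NMSE is c/(c + β), with c the noise variance divided by S, the user transmit power and the sub-phase duration. The function names and all numerical values are illustrative assumptions.

```python
def noise_constant(sigma2, S, P_C, tau_S):
    """c = sigma^2 / (S * P_C * tau_S)."""
    return sigma2 / (S * P_C * tau_S)

def nmse_ls(c, beta):
    """NMSE of the LS-DFT estimate (unbounded as the noise grows)."""
    return c / beta

def nmse_mmse(c, beta):
    """NMSE of the MMSE-DFT estimate assuming R_BS = I_M (saturates at 1)."""
    return c / (c + beta)

beta_d = 1e-2                       # assumed path loss factor
S, P_C, tau_S = 16, 1.0, 1e-3       # assumed training parameters
rows = []
for sigma2 in (1e-6, 1e-4, 1e-2):
    c = noise_constant(sigma2, S, P_C, tau_S)
    rows.append((sigma2, nmse_ls(c, beta_d), nmse_mmse(c, beta_d)))

for sigma2, ls, mmse in rows:
    print(f"sigma2={sigma2:.0e}  LS={ls:.4f}  MMSE={mmse:.4f}")
```

The gap between the two values equals c²/(β(c + β)), which is negligible at low noise and grows as the training SNR drops, while the MMSE value never exceeds 1.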
Next we compute the NMSE in the LS-DFT estimates of $\mathbf{h}_{0,n,k}$ in a similar manner as (30), which yields (35). The NMSE in the LS estimation of each $\mathbf{h}_{0,n,k}$ is the same as the NMSE in the LS estimation of the direct channel in (30).
The NMSE in the MMSE-DFT estimates of $\mathbf{h}_{0,n,k}$ in Lemma 2 is computed in (36) and (37), where (36) follows from applying the Sherman-Morrison formula on the inverse term. Denoting $c = \sigma^2/(S P_C \tau_S)$ and using straightforward calculation, we can show that $\text{NMSE}(\hat{\mathbf{h}}_{0,n,k}) \leq \text{NMSE}(\hat{\mathbf{h}}^{LS}_{0,n,k})$, since $c \geq 0$, $\beta_k \geq 0$ and $M \geq 1$. Therefore, the NMSE in the MMSE-DFT estimate of $\mathbf{h}_{0,n,k}$ will always be lower than the NMSE in the LS-DFT estimate for any value of noise, power, sub-phase duration and path loss factor. Also, $\text{NMSE}(\hat{\mathbf{h}}_{0,n,k})$ approaches 1 as $c$ grows large or $\beta_k$ grows small.

C. Performance Evaluation of the Proposed Protocol
The NMSE in the LS-DFT and the MMSE-DFT estimates of the direct and IRS-assisted channels are compared in Fig. 2 versus the noise variance $\sigma^2$. Fig. 2a shows the Monte-Carlo simulated $\text{NMSE}(\hat{\mathbf{h}}_{d,k})$ as well as the theoretical (Th.) expressions in (30) and (32) for the LS-DFT and MMSE-DFT estimates respectively. Fig. 2b shows the simulated NMSE in the estimates of the IRS-assisted channels $\mathbf{h}_{0,n,k}$ as well as the theoretical expressions in (35) and (37). We observe that the NMSE in the MMSE-DFT and LS-DFT estimates of $\mathbf{h}_{d,k}$ becomes the same for very low values of noise, while the NMSE in the MMSE-DFT estimates of the $\mathbf{h}_{0,n,k}$'s is always lower than that in the LS-DFT estimates. The NMSE in the MMSE estimates approaches 1 for both the direct channel and the IRS-assisted channels as the noise variance increases, while the NMSE in the LS estimates grows even beyond 1. As discussed earlier, an NMSE value of 1 implies that the estimation error has the same power as the actual channel being estimated. For NMSE values of 1 and beyond under any estimation protocol, the performance of the IRS-assisted system will correspond to isotropic transmission, i.e. transmission without any CSI, which is the worst-case performance under estimation errors [23]. The NMSE under the LS-DFT protocol grows to 1 much more quickly than under the MMSE-DFT protocol, making LS-DFT more prone to estimation errors.
We also plot the NMSE for the correlated (Corr.) scenario, where $\eta$ is set as 0.95. The NMSE in the LS-DFT estimates is unaffected by the structure of correlation, as is the NMSE in the MMSE-DFT estimates of the $\mathbf{h}_{0,n,k}$'s. The NMSE in the MMSE-DFT estimate of the direct channel $\mathbf{h}_{d,k}$ actually reduces with the introduction of correlation.
We also compare the results against the LS-ON/OFF protocol in [23], which sets $S = N + 1$ and uses an ON/OFF design for $\mathbf{V}^{tr}$. The drawbacks of this approach are that the cascaded channels are only estimated one-by-one, such that the noise variance in each element of the received observation vector given in (14) is $\sigma^2/(P_C \tau_S)$ instead of $\sigma^2/(S P_C \tau_S)$, and that the error in the estimation of $\mathbf{h}_{d,k}$ is propagated to the estimation of the $\mathbf{h}_{0,n,k}$'s. The NMSE in the LS estimates of $\mathbf{h}_{d,k}$ and $\mathbf{h}_{0,n,k}$ under the ON/OFF protocol can be straightforwardly calculated to be $\sigma^2/(\beta_{d,k} P_C \tau_S)$ and $2\sigma^2/(\beta_{d,k} P_C \tau_S)$ respectively. Compared to (30) and (35), we see a factor of $S$ and $2S$ increase respectively in the NMSE in $\hat{\mathbf{h}}^{LS}_{d,k}$ and $\hat{\mathbf{h}}^{LS}_{0,n,k}$ under the ON/OFF protocol, which can also be observed by comparing the LS-DFT and LS-ON/OFF curves in Fig. 2.
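The factor-of-$S$ noise reduction under the DFT design comes from the orthogonality of the columns of $\mathbf{V}^{tr}$. The sketch below illustrates this numerically. It assumes an $S \times (N+1)$ training matrix with entries $e^{-j2\pi s n/S}$; this indexing convention, the function names, and the sizes are assumptions for illustration and are not taken from (16).

```python
import cmath


def dft_training_matrix(S, N):
    """Assumed S x (N+1) DFT training matrix with entries exp(-j*2*pi*s*n/S)."""
    return [[cmath.exp(-2j * cmath.pi * s * n / S) for n in range(N + 1)]
            for s in range(S)]


def gram(V):
    """Return V^H V for a matrix stored as a list of rows."""
    rows, cols = len(V), len(V[0])
    return [[sum(V[s][a].conjugate() * V[s][b] for s in range(rows))
             for b in range(cols)]
            for a in range(cols)]


def max_deviation_from_scaled_identity(G, scale):
    """Largest entry-wise distance between G and scale * I."""
    return max(abs(G[a][b] - (scale if a == b else 0.0))
               for a in range(len(G)) for b in range(len(G)))


S, N = 6, 3  # hypothetical sizes with S >= N + 1
G = gram(dft_training_matrix(S, N))
print(max_deviation_from_scaled_identity(G, S))  # numerically close to zero
```

Because $\mathbf{V}^{trH}\mathbf{V}^{tr} = S\mathbf{I}_{N+1}$ under this construction, combining the $S$ sub-phase observations averages the noise over $S$ samples, whereas an ON/OFF pattern observes each cascaded channel in a single sub-phase only.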
Furthermore, the MMSE estimates under the ON/OFF protocol can be derived in a similar manner as done in this work (details are skipped for brevity), and the NMSE in the MMSE-ON/OFF estimates follows accordingly. Fig. 3 plots the NMSE against the path loss factors $\beta_{d,k}$ and $\beta_k$ under both the MMSE-DFT and LS-DFT protocols. The value of $\sigma^2$ is set as $5 \times 10^{-4}$ J in these results. The theoretical expressions of the NMSE derived in this section match the simulated values. The NMSE in the MMSE-DFT estimates is always lower than that in the LS-DFT estimates. We also show the effect of increasing the number of sub-phases $S$ beyond $N + 1$. As evident from (30) and (35), there is a factor of $S$ decrease in the NMSE in the LS estimates over the entire range of $\beta_{d,k}$ and $\beta_k$. The NMSE in the MMSE estimates decreases by a factor of less than $S$ in the low path loss (high SNR) regime, while it approaches 1 in the high path loss regime irrespective of the value of $S$. Nevertheless, the MMSE-DFT estimates are seen to outperform the LS-DFT estimates for the considered values of $S$, with the performance gap becoming smaller as $S$ increases. It is important to note that although we see a significant NMSE improvement by increasing the number of sub-phases $S$, there will also be a rate loss due to channel training as $S$ increases. This is because the time left for downlink transmission reduces with $S$ under the relation $\tau_D = \tau - S\tau_S$. Therefore, the system will suffer a rate loss factor of $1 - S\tau_S/\tau$ during downlink transmission, rendering the IRS-assisted system performance sensitive to the value of $S$ and the quality of the estimates. This trade-off will be studied in the simulation results in Sec. V.
To gain further insights into how these NMSE values are related to the system performance, we numerically study the bit error rate (BER) achieved by an IRS-assisted system with $M = 4$ antennas and $N = 10$ reflecting elements serving a single-antenna user. For a single user, it is well known that the optimal precoding strategy at the BS is maximum ratio transmission (MRT), i.e. the precoding vector is set as $\mathbf{g}_k = \hat{\mathbf{h}}_k / \|\hat{\mathbf{h}}_k\|$, where $\hat{\mathbf{h}}_k = \hat{\mathbf{h}}_{d,k} + \hat{\mathbf{H}}_{0,k}\mathbf{v}$. The estimates $\hat{\mathbf{h}}_{d,k}$ and $\hat{\mathbf{h}}_{0,n,k}$, $n = 1, \dots, N$, are given by (19) and (22) respectively under the MMSE-DFT protocol, while under the LS-DFT protocol they are given by (25) and (26) respectively. A close-to-optimal design for $\mathbf{v}$ that maximizes the received signal power at the user is proposed in [23] as $\mathbf{v} = \exp(j\angle(\hat{\mathbf{H}}^H_{0,k}\hat{\mathbf{h}}_{d,k}))$. Under these designs for precoding at the BS and reflect beamforming at the IRS, we plot in Fig. 4 the BER achieved by the IRS-assisted system under binary phase-shift keying (BPSK) signaling against the SNR, defined as the ratio of the transmit power to the noise variance. The BER curves under perfect CSI and under imperfect CSI with MMSE-DFT as well as LS-DFT estimation are shown. We also plot the BER achieved by a conventional MISO system having 4 antennas at the BS and no IRS. As expected, the BER decreases with increasing SNR, while it approaches the maximum value of 0.5 for very low values of SNR. We observe that the IRS-assisted system achieves a significantly better BER performance than the conventional system without IRS, with the BER for the former decreasing to $10^{-6}$ at an SNR level of near 0 dB, similar to the observation made in [5]. In fact, the SNR gap between the IRS-assisted system and the conventional system to achieve a BER of $10^{-6}$ is around 17 dB, which shows that the IRS is capable of improving the reliability of the underlying communication channel by manipulating the propagation of radio waves in the environment. This superior BER performance is explained in [5] using the analytical result that the received signal power at the user scales quadratically, as $N^2$, with the number of IRS elements $N$, whereas in the conventional MISO system it scales linearly with the number of BS antennas $M$. As a result, the IRS provides approximately a factor of $N^2$ improvement in the received signal power$^6$, because of which the BER achieved by the IRS-assisted system is quite low even when the SNR is relatively low.
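The single-user design described above (phase alignment at the IRS followed by MRT at the BS) can be sketched in plain Python as follows. The channel estimates below are made-up numbers for $M = 2$ and $N = 3$, and the helper names are hypothetical; the sketch only illustrates the two formulas $\mathbf{v} = \exp(j\angle(\hat{\mathbf{H}}^H_{0,k}\hat{\mathbf{h}}_{d,k}))$ and $\mathbf{g}_k = \hat{\mathbf{h}}_k/\|\hat{\mathbf{h}}_k\|$.

```python
import cmath
import math


def hermitian_matvec(H, x):
    """Return H^H x for an M x N matrix H (list of rows) and a length-M vector x."""
    M, N = len(H), len(H[0])
    return [sum(H[m][n].conjugate() * x[m] for m in range(M)) for n in range(N)]


def matvec(H, v):
    """Return H v for an M x N matrix H (list of rows) and a length-N vector v."""
    return [sum(row[n] * v[n] for n in range(len(v))) for row in H]


def irs_phase_shifts(H0_hat, hd_hat):
    """Unit-modulus reflect beamforming vector v = exp(j * angle(H0^H h_d))."""
    return [cmath.exp(1j * cmath.phase(a)) for a in hermitian_matvec(H0_hat, hd_hat)]


def mrt_precoder(h_hat):
    """Maximum ratio transmission: g = h / ||h||."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h_hat))
    return [x / norm for x in h_hat]


# Hypothetical channel estimates for M = 2 BS antennas and N = 3 IRS elements.
hd_hat = [0.3 + 0.1j, -0.2 + 0.4j]
H0_hat = [[0.5 - 0.2j, 0.1 + 0.3j, -0.4 + 0.0j],
          [0.2 + 0.6j, -0.3 - 0.1j, 0.1 + 0.2j]]

v = irs_phase_shifts(H0_hat, hd_hat)
h_hat = [d + r for d, r in zip(hd_hat, matvec(H0_hat, v))]
g = mrt_precoder(h_hat)
```

With this choice of $\mathbf{v}$, the cross term $\hat{\mathbf{h}}^H_{d,k}\hat{\mathbf{H}}_{0,k}\mathbf{v}$ is real and equal to the sum of the magnitudes of the entries of $\hat{\mathbf{H}}^H_{0,k}\hat{\mathbf{h}}_{d,k}$, i.e. the reflected paths add in phase with the direct path.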
Under channel estimation errors in an IRS-assisted system, the BER performance of the MMSE-DFT protocol is significantly better than that of the LS-DFT protocol, with an SNR gap of almost 8 dB to achieve a BER of $10^{-6}$. This is in accordance with the insights drawn earlier from the NMSE analysis, where we showed the MMSE-DFT estimates to always achieve a lower NMSE. Further, we note that the BER under the LS-DFT protocol approaches the maximum value at an SNR level of $-15$ dB, whereas under the MMSE-DFT protocol it reaches the maximum BER more slowly (in fact, it does not reach the maximum value for the SNR range considered in the figure). This can also be confirmed from Fig. 2a and Fig. 2b, where we see that the NMSE values in the MMSE-DFT estimates approach 1 much more slowly (at higher values of noise) than those in the LS-DFT estimates. Finally, we see that the BER decreases with an increasing number of sub-phases $S$ for both protocols. This is due to the decrease in NMSE with increasing $S$ observed earlier in Fig. 3a and Fig. 3b.
It is important to remark here that both the ON/OFF and DFT protocols require long channel training times when $N$ is very large, since the number of sub-phases $S$ has to be at least $N + 1$. As an extension, the scenario where IRS elements that experience strong correlation, and therefore similar channels, are grouped together can be studied. The number of sub-phases needed can then be reduced to the number of groups instead of the number of IRS elements. However, this will also reduce the degrees of freedom offered by the IRS for performance improvement, since elements in the same group will apply the same reflection coefficient. We stress that MMSE estimates yield convenient analytical expressions, unlike the algorithms in [29], [30], and can be extended to future channel estimation protocols that reduce training overhead.

IV. JOINT ACTIVE AND PASSIVE BEAMFORMING DESIGN
In this section, we design the precoding vectors and power allocation at the BS and the phase shifts vector at the IRS.
The amplitude reflection coefficients $\alpha_n$, $\forall n$, are assumed to be unity as done in almost all existing works, motivated by the recent advances in the design and development of lossless metasurfaces [40], [41]. Similar to channel estimation, we assume that all the design computations take place at the BS since the IRS has no signal processing capability. The BS then informs the IRS controller about the required optimal reflect beamforming vector $\mathbf{v}^*$ through a backhaul link, and the controller triggers the elements of the IRS to apply the required phase shifts.
The performance metric employed is the max-min rate, which provides a good balance between system throughput and user fairness. The rate of user $k$ is defined as $R_k = \log_2(1 + \gamma_k)$, where $\gamma_k$ is the SINR of user $k$ given in (39), with $\mathbf{h}_k = \mathbf{h}_{d,k} + \mathbf{H}_{0,k}\mathbf{v}$ being the overall channel from the BS to user $k$ as defined in (3). Since the logarithm is a monotonically increasing function, the max-min rate problem is equivalent to the max-min SINR problem.
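The equivalence between the max-min rate and the max-min SINR criteria can be illustrated with a short sketch. The SINR expression used below is a generic downlink form, $\gamma_k = p_k|\mathbf{h}_k^H\mathbf{g}_k|^2 / (\sum_{i \neq k} p_i|\mathbf{h}_k^H\mathbf{g}_i|^2 + \sigma^2)$; this is an assumption for illustration, and the exact power normalization in (39) may differ. The channels, precoders and powers are toy values.

```python
import math


def inner(h, g):
    """Return h^H g for two equal-length complex vectors."""
    return sum(a.conjugate() * b for a, b in zip(h, g))


def sinr(k, H, G, p, sigma2):
    """Assumed generic SINR of user k; H[k] is h_k, G[i] is g_i, p[i] is the power of user i."""
    signal = p[k] * abs(inner(H[k], G[k])) ** 2
    interference = sum(p[i] * abs(inner(H[k], G[i])) ** 2
                       for i in range(len(G)) if i != k)
    return signal / (interference + sigma2)


def rate(gamma):
    """R_k = log2(1 + gamma_k)."""
    return math.log2(1.0 + gamma)


# Toy two-user example with orthogonal channels and matched precoders.
H = [[1 + 0j, 0j], [0j, 1 + 0j]]
G = [[1 + 0j, 0j], [0j, 1 + 0j]]
p = [1.0, 2.0]
sinrs = [sinr(k, H, G, p, sigma2=1.0) for k in range(2)]
rates = [rate(g) for g in sinrs]

# The user with the smallest SINR is also the user with the smallest rate.
worst_by_sinr = min(range(2), key=lambda k: sinrs[k])
worst_by_rate = min(range(2), key=lambda k: rates[k])
```

Since $\log_2(1 + \cdot)$ preserves ordering, any design that raises the smallest SINR also raises the smallest rate.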

A. Problem Formulation
The BS utilizes the information it has on the direct and the IRS-assisted channels to find the optimal precoding vectors $\mathbf{G}^* = [\mathbf{g}_1, \dots, \mathbf{g}_K]$, allocated powers $\mathbf{p}^* = [p_1, \dots, p_K]^T$, and the IRS reflect beamforming vector $\mathbf{v}^*$ as the solution of the max-min SINR problem (P1) in (40), where $v_n = \exp(j\theta_n)$ is the $n$th element of $\mathbf{v}$. Note that the constraints in (40b) and (40c) meet the constraint in (2). We would like to highlight that, with the exception of [16], the max-min SINR problem has not been dealt with in the context of IRS-assisted systems. In contrast to [16], which focuses on the problem formulation and solution under perfect CSI in the asymptotic regime where $M$, $N$ and $K$ grow infinitely large, we focus on the exact problem in (P1) and deal with both perfect and imperfect CSI.
Due to the non-convex nature of the problem, in which the precoding vectors, allocated powers and phase shifts are coupled, we adopt an alternating optimization (AO) technique, where the precoding vectors and power allocation at the BS are optimized iteratively with the phase shifts at the IRS until convergence is achieved. For fixed $\mathbf{v}$, we obtain the sub-problem (P2). It was shown in [16] that the optimal linear precoder (OLP) that solves (P2) optimally with respect to $\mathbf{G}$ and $\mathbf{p}$ takes the form in (42), where the $q_k^*$'s are obtained as the unique positive solution of a set of fixed-point equations, and the allocated powers are given by (44). On the other hand, for fixed $\mathbf{g}_k$'s and $p_k$'s, (P1) reduces to the sub-problem (P3). We propose a solution for (P3) in the next subsection. The proposed AO algorithm then solves problem (P1) by solving problems (P2) and (P3) alternately. The extension to imperfect CSI is summarized in Section IV-C. The AO technique has been utilized in [13] to solve the transmit power minimization problem and in [15] for the energy efficiency maximization problem. However, the sub-problems constituting the AO algorithm in this work are different.

B. Problem Solution
The optimal solutions for the precoding vectors and allocated powers in (P2) are already provided in (42) and (44) respectively. Here, we develop a solution for the design of the reflect beamforming vector in (P3), which is a non-convex problem. However, we observe that the numerator and denominator of $\gamma_k$ in (39), which is the objective function in (45a), can be transformed into quadratic forms. To see this, note that the terms $|\mathbf{h}_k^H\mathbf{g}_i|^2$ in (39) can be written as $|\mathbf{h}_k^H\mathbf{g}_i|^2 = |\mathbf{v}^H\mathbf{a}_{k,i} + b_{k,i}|^2$, where $\mathbf{a}_{k,i} = \mathbf{H}^H_{0,k}\mathbf{g}_i$ and $b_{k,i} = \mathbf{h}^H_{d,k}\mathbf{g}_i$. By introducing an auxiliary variable $t$, (P3) can be reformulated in terms of quadratic forms in $\bar{\mathbf{v}} = [\mathbf{v}^T, t]^T$ as problem (P4), with the matrices $\mathbf{R}_{k,i}$ constructed from $\mathbf{a}_{k,i}$ and $b_{k,i}$. However, problem (P4) is NP-hard in general [42]. Note that $\bar{\mathbf{v}}^H\mathbf{R}_{k,i}\bar{\mathbf{v}} = \text{tr}(\mathbf{R}_{k,i}\bar{\mathbf{v}}\bar{\mathbf{v}}^H)$. Therefore, we can reformulate (P4) by defining $\bar{\mathbf{V}} = \bar{\mathbf{v}}\bar{\mathbf{v}}^H$, which needs to satisfy $\bar{\mathbf{V}} \succeq 0$ and $\text{rank}(\bar{\mathbf{V}}) = 1$. Since the rank-one constraint is non-convex, we apply semi-definite relaxation to relax this constraint by letting $\bar{\mathbf{V}}$ be a positive semi-definite matrix of arbitrary rank. The semi-definite relaxed problem is denoted as (P5). Problem (P5) is efficiently solved using fractional programming, which provides tools to maximize the minimum of ratios in which the numerator is a concave function, the denominator is a convex function, and the constraint set is convex [43], [44]. An efficient method to do so is the generalized Dinkelbach's algorithm, outlined in Appendix A of [44], which is guaranteed to converge to the global solution of the max-min fractional problem with limited complexity. The objective function in (48a) considers a set of ratios of two functions, where we denote the numerator by $n_k(\bar{\mathbf{V}})$ and the denominator by $d_k(\bar{\mathbf{V}})$, $k = 1, \dots, K$. By exploiting the fact that $\text{tr}(\mathbf{A}\mathbf{B}) = \text{vec}(\mathbf{A}^T)^T\text{vec}(\mathbf{B})$, we write $n_k(\bar{\mathbf{V}})$ and $d_k(\bar{\mathbf{V}})$ as in (49) and (50). It can be seen from (49) and (50) that $n_k(\bar{\mathbf{V}})$ and $d_k(\bar{\mathbf{V}})$ are linear functions of $\bar{\mathbf{V}}$.
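The structure of the generalized Dinkelbach iteration can be illustrated on a toy max-min ratio problem. In the sketch below the semi-definite program is replaced by a search over a finite grid of scalar candidates, purely so that the example stays self-contained; the two affine numerators and denominators, the grid, and the function names are all hypothetical and unrelated to (49) and (50).

```python
def min_ratio(x, numerators, denominators):
    """Smallest ratio n_k(x) / d_k(x) over all k."""
    return min(n(x) / d(x) for n, d in zip(numerators, denominators))


def generalized_dinkelbach(candidates, numerators, denominators, eps=1e-9, max_iter=100):
    """Maximize min_k n_k(x)/d_k(x) over a finite candidate set (toy stand-in for an SDP)."""
    lam = 0.0
    x_best = candidates[0]
    for _ in range(max_iter):
        # Auxiliary (non-fractional) problem for the current value of lambda.
        def aux(x):
            return min(n(x) - lam * d(x) for n, d in zip(numerators, denominators))
        x_best = max(candidates, key=aux)
        F = aux(x_best)
        if F < eps:  # lambda has reached the best achievable minimum ratio
            break
        lam = min_ratio(x_best, numerators, denominators)
    return x_best, min_ratio(x_best, numerators, denominators)


# Toy problem: two ratios with affine numerators and positive affine denominators.
numerators = [lambda x: 1.0 + 3.0 * x, lambda x: 4.0 - 2.0 * x]
denominators = [lambda x: 2.0 + x, lambda x: 1.0 + x]
grid = [i / 100.0 for i in range(101)]

x_star, value = generalized_dinkelbach(grid, numerators, denominators)
brute_force = max(min_ratio(x, numerators, denominators) for x in grid)
```

Each iteration solves a problem without fractions and then updates $\lambda$ to the smallest ratio at the new point; the loop stops once the auxiliary optimum $F$ falls below the tolerance, mirroring the "until $F < \epsilon_1$" criterion of the inner loop.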
Problem (P5) therefore considers a set of ratios $n_k(\bar{\mathbf{V}})/d_k(\bar{\mathbf{V}})$, where each ratio has an affine numerator $n_k(\bar{\mathbf{V}})$ and an affine denominator $d_k(\bar{\mathbf{V}})$ and the constraints are convex, and can therefore be solved optimally using the generalized Dinkelbach's algorithm [44]. Once the optimal $\bar{\mathbf{V}}^*$ is obtained, the corresponding vector $\bar{\mathbf{v}}$ that solves (P4) needs to be extracted. If the resulting matrix $\bar{\mathbf{V}}^*$ turns out to have rank one, the optimal solution $\bar{\mathbf{v}}^*$ can be obtained as in (51), where $\mathbf{u}_{max}(\mathbf{A})$ is the eigenvector corresponding to the maximum eigenvalue of $\mathbf{A}$. If the rank turns out to be greater than one, then Gaussian randomization can be applied to find $\bar{\mathbf{v}}^*$ by using the eigenvalue decomposition $\bar{\mathbf{V}}^* = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^H$ and computing $\bar{\mathbf{v}}_l = \mathbf{U}\mathbf{\Lambda}^{1/2}\mathbf{r}_l$, where $\mathbf{r}_l \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{N+1})$ for $l = 1, \dots, L$. The solution $\bar{\mathbf{v}}^*$ can then be found as in (53). With a sufficiently large number of randomizations $L$, we can guarantee a very accurate approximation of the optimal objective value of (P4) [14], [42]. In our extensive simulations, we have always observed the optimal solution of problem (P5) to have rank one, and therefore $\bar{\mathbf{v}}^*$ in (51) is indeed optimal for (P4). The same observation was reported in some other works including [45], [46]. Finally, the solution to (P3) can be recovered by accounting for the constraint that the last element of $\bar{\mathbf{v}}^*$ (which is $t$) should equal one and that the first $N$ elements of $\bar{\mathbf{v}}^*$ need to satisfy the constraint (45b). The resulting solution, as outlined in [13], [14], is $\mathbf{v}^* = \exp(j\angle([\bar{\mathbf{v}}^*/\bar{v}^*_{N+1}]_{(1:N)}))$, where $[\mathbf{x}]_{(1:N)}$ denotes the vector of the first $N$ elements of $\mathbf{x}$ and $\bar{v}^*_{N+1}$ is the last entry of $\bar{\mathbf{v}}^*$. The Dinkelbach's procedure to solve (P3) as well as the overall AO algorithm to solve (P1) are outlined in Algorithm 1.
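The final recovery step, which maps the lifted vector back to unit-modulus phase shifts, can be sketched as below. The sketch assumes the common recovery rule from the semi-definite relaxation literature, i.e. normalizing by the last entry, keeping the first $N$ entries, and projecting each onto the unit circle; the function name and the numerical values are hypothetical.

```python
import cmath


def recover_phase_shifts(v_bar):
    """Map a length-(N+1) lifted vector to N unit-modulus reflection coefficients.

    Assumed rule: divide by the last entry (the auxiliary variable t),
    keep the first N entries, and retain only their phases.
    """
    last = v_bar[-1]
    return [cmath.exp(1j * cmath.phase(x / last)) for x in v_bar[:-1]]


# Hypothetical lifted solution for N = 2 (entries need not have unit modulus).
v_bar = [0.8 * cmath.exp(0.3j), 1.3 * cmath.exp(-1.1j), 0.9 * cmath.exp(0.5j)]
v = recover_phase_shifts(v_bar)

# A common phase rotation of the lifted vector leaves the recovered phases unchanged.
v_rotated = recover_phase_shifts([x * cmath.exp(0.7j) for x in v_bar])
```

Dividing by the last entry removes the phase ambiguity of the rank-one factorization, since $\bar{\mathbf{v}}$ and $e^{j\phi}\bar{\mathbf{v}}$ produce the same matrix $\bar{\mathbf{v}}\bar{\mathbf{v}}^H$.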
The convergence of Algorithm 1 is ensured by noting that the objective value of (P1), i.e. $\min_k \gamma_k$, is upper-bounded due to the constraint set in (P1) and is non-decreasing over the iterations of Algorithm 1.
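To see this, denote the objective value of (P1) for a solution $(\mathbf{G}, \mathbf{p}, \mathbf{v})$ by $f(\mathbf{G}, \mathbf{p}, \mathbf{v})$, and let $(\mathbf{G}^{r*}, \mathbf{p}^{r*}, \mathbf{v}^{r*})$ and $(\mathbf{G}^{r+1*}, \mathbf{p}^{r+1*}, \mathbf{v}^{r+1*})$ be the solutions obtained in the $r$th and $(r+1)$th iterations of the algorithm, respectively. It then follows that $f(\mathbf{G}^{r+1*}, \mathbf{p}^{r+1*}, \mathbf{v}^{r+1*}) \geq f(\mathbf{G}^{r*}, \mathbf{p}^{r*}, \mathbf{v}^{r+1*}) \geq f(\mathbf{G}^{r*}, \mathbf{p}^{r*}, \mathbf{v}^{r*})$, where the first inequality holds since, for given $\mathbf{v}^{r+1*}$ in step 5 of Algorithm 1, $(\mathbf{G}^{r+1*}, \mathbf{p}^{r+1*})$ is the optimal solution to problem (P2), and the second inequality holds because $\mathbf{v}^{r+1*}$ increases the objective value of (P3) for given $(\mathbf{G}^{r*}, \mathbf{p}^{r*})$ in step 14. However, no global optimality claim can be made since (P1) is not jointly convex with respect to $\mathbf{G}$, $\mathbf{p}$ and $\mathbf{v}$.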
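The monotonicity argument applies to any alternating scheme in which each block update is optimal for its own sub-problem. The sketch below demonstrates this on a toy two-variable concave objective, not on (P1) itself: the function, the closed-form block updates, the starting point and the threshold are all hypothetical, and the stopping rule mimics the fractional-increase criterion used in Algorithm 1.

```python
def objective(x, y):
    """Toy bounded objective standing in for the minimum SINR (hypothetical)."""
    return 10.0 - (x - y) ** 2 - (x - 1.0) ** 2 - (y - 3.0) ** 2


def alternating_optimization(x=0.0, y=1.0, eps=1e-10, max_iter=100):
    """Alternate exact block maximizations until the fractional increase drops below eps."""
    history = [objective(x, y)]
    for _ in range(max_iter):
        x = (y + 1.0) / 2.0  # maximizer over x for fixed y
        y = (x + 3.0) / 2.0  # maximizer over y for fixed x
        history.append(objective(x, y))
        if (history[-1] - history[-2]) / abs(history[-2]) < eps:
            break
    return x, y, history


x_opt, y_opt, history = alternating_optimization()
```

Because every block update can only keep or increase the objective, the recorded sequence is non-decreasing and, being bounded above, converges; as in the paper's discussion, this alone says nothing about global optimality for a non-convex problem.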

C. Imperfect CSI Scenario
When only imperfect CSI is available at the BS, the BS can implement the AO algorithm by using $\max_{\mathbf{p},\mathbf{G},\mathbf{v}} \min_k \hat{\gamma}_k$ as the objective function in (P1), where $\hat{\gamma}_k$ is the estimated SINR defined in (54), with $\hat{\mathbf{h}}_k = \hat{\mathbf{h}}_{d,k} + \hat{\mathbf{H}}_{0,k}\mathbf{v}$ and with $\hat{\mathbf{h}}_{d,k}$ and $\hat{\mathbf{H}}_{0,k}$ being the MMSE estimates defined in (19) and (22) respectively. The BS cannot compute the true SINR values in (39) since it only has the estimates of the $\mathbf{h}_k$'s available. As a consequence, the solutions for (P2) and (P3) will be optimal in terms of the estimated minimum SINR in (54) instead of the true minimum SINR in (39). Finding the optimal solution to (P1) under imperfect CSI using the true minimum SINR as an objective function is extremely difficult, with no optimal solution in the literature. Therefore, replacing the $\mathbf{h}_{d,k}$'s and $\mathbf{H}_{0,k}$'s with their estimates is a reasonable approach to tackle this problem and is similar to what is done in [23], [47], which deal with the design of IRS-assisted systems under CSI errors$^7$.

Algorithm 1 Alternating Optimization Algorithm (partial listing)
5: Compute $\mathbf{g}^{r*}_k$ and $p^{r*}_k$, $\forall k$, as the solution to (42) and (44);
Dinkelbach step: compute $\bar{\mathbf{V}}^*$ as the maximizer of the auxiliary problem, where $n_k(\bar{\mathbf{V}})$ and $d_k(\bar{\mathbf{V}})$ are given by (49) and (50) respectively, subject to $\bar{\mathbf{V}} \succeq 0$ and $\bar{\mathbf{V}}_{n,n} = 1$, $n = 1, \dots, N + 1$;
10: until $F < \epsilon_1$.
13: $\bar{\mathbf{v}}^*$ computed using (51) or (53);
Solving (P2) with $\max \min_k \hat{\gamma}_k$ as the objective function for fixed $\mathbf{v}$ results in the precoder in (55), where the $q_k^*$'s are obtained as the unique positive solution to the corresponding fixed-point equations. The allocated powers $p_k^*$ are given by (56). The optimization with respect to $\mathbf{v}$ in (P3) using $\max \min_k \hat{\gamma}_k$ as the objective function can be performed by expressing the numerator and denominator of (54) in terms of quadratic forms, with the difference being that the $\mathbf{h}_{d,k}$'s and $\mathbf{H}_{0,k}$'s are replaced with their estimates in the definitions of $\mathbf{a}_{k,i}$ and $b_{k,i}$ in (47a). The resulting problem can be relaxed using semi-definite relaxation and then solved using the Dinkelbach's algorithm.
The overall AO algorithm will be the same as Algorithm 1, with the difference being that the input channel vectors $\mathbf{h}_{d,k}$ and $\mathbf{h}_{0,n,k}$'s in step 1 are replaced by their estimates $\hat{\mathbf{h}}_{d,k}$ and $\hat{\mathbf{h}}_{0,n,k}$'s in (19) and (22) respectively, and the stopping criterion in step 16 is applied on $\min_k \hat{\gamma}_k$, where $\hat{\gamma}_k$ is defined in (54). The algorithm will therefore alternate between the computation of the $\mathbf{g}_k^*$'s and $p_k^*$'s in (55) and (56) respectively for fixed $\mathbf{v}$ and the computation of $\mathbf{v}^*$ for fixed $\mathbf{g}_k$'s and $p_k$'s, until convergence is reached, which happens when the fractional increase in $\min_k \hat{\gamma}_k$ is below a threshold value. We stress that the performance of the proposed design is shown in terms of the true minimum SINR in the simulation results and not the estimated minimum SINR.

V. SIMULATION RESULTS
We utilize the parameter values described in Table III in generating the simulation results. The path loss parameters are computed at 2.5 GHz operating frequency for the 3GPP Urban Micro (UMi) scenario from TR 36.814 (detailed in Section V of [16]). We use the LoS version to generate the path loss for $\mathbf{H}_1$ and the non-LoS (NLoS) version to generate the path losses for $\mathbf{h}_{2,k}$ and $\mathbf{h}_{d,k}$. Moreover, 5 dBi antennas are considered at the BS and IRS. Note that the IRS is deployed much higher than the BS to avoid the penetration losses and blockages caused by ground structures like buildings. Therefore, we assume a penetration loss of 15 dB in each BS-to-user link, whereas we assume negligible penetration loss in the IRS-to-user links. We first focus on the single-user IRS-assisted system shown in Fig. 5 and plot in Fig. 6 the rate achieved by the user for varying values of $d_u$. Note that for a single-user system, the SINR in (39) simplifies to the SNR, and the user rate is related to the SNR as $R = (1 - \tau_C/\tau)\log_2(1 + \text{SNR})$, where the factor $1 - \tau_C/\tau$ accounts for the rate loss due to channel training. The results are plotted under the optimized precoding vector $\mathbf{g}_k^*$ and phase-shifts vector $\mathbf{v}^*$ $^8$. For the imperfect CSI case, we plot the results under both the LS-DFT and MMSE-DFT estimates derived in Section III. We observe that in an IRS-assisted system, a user farther away from the BS can still be closer to the IRS and receive stronger reflected signals from it, resulting in an improvement in the performance as observed for $d_u > 30$. Consequently, the IRS-assisted system is able to provide a higher QoS to a larger region. For example, under perfect CSI it covers 120 m with a rate of at least 2.3 bps/Hz, whereas the system without the IRS can cover only about 95 m at the same rate. Moreover, users placed close to the IRS, e.g. located in the range $42 < d_u < 70$, see gains ranging from 2 to 4 bps/Hz. Although the rate decreases due to increasing signal attenuation when $d_u > 50$, it is still better than what would have been achieved without the IRS, unless the user is so far away that the path loss becomes dominant over the gain provided by the IRS.

Fig. 5: IRS-assisted single-user MISO system. The BS and IRS are marked with their (x, y) coordinates.
Doubling $N$ at the IRS to 80 increases the achieved rate by about 2 bps/Hz for users close to the IRS, which implies that the SNR scales by around 6 dB. This corresponds to the scaling of the SNR in the order of $N^2$, comprising an array gain of $N$ and a reflect beamforming gain of $N$, as analytically proved in [14]. However, the gain is negligible for $10 < d_u < 25$ because the BS-to-user direct channel is much stronger than the channel through the IRS. Moreover, higher coverage is possible with a larger number of reflecting elements, as shown by the higher values of the achieved rate for $N = 80$ under perfect CSI.
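The arithmetic behind the "6 dB, about 2 bps/Hz" observation can be reproduced directly. The sketch below assumes that the SNR scales exactly as $N^2$ and, for the rate gain, that the system operates at high SNR so that $\log_2(1 + \text{SNR}) \approx \log_2(\text{SNR})$; the function names are hypothetical.

```python
import math


def snr_gain_db(N_old, N_new):
    """SNR gain in dB when the SNR scales as N^2 (assumed scaling law)."""
    return 10.0 * math.log10((N_new / N_old) ** 2)


def rate_gain_high_snr(N_old, N_new):
    """Rate gain in bps/Hz under the high-SNR approximation log2(1 + SNR) ~ log2(SNR)."""
    return math.log2((N_new / N_old) ** 2)


def rate_gain_exact(snr_old, N_old, N_new):
    """Exact rate gain at a finite SNR; always below the high-SNR value."""
    snr_new = snr_old * (N_new / N_old) ** 2
    return math.log2(1.0 + snr_new) - math.log2(1.0 + snr_old)


print(snr_gain_db(40, 80), rate_gain_high_snr(40, 80), rate_gain_exact(10.0, 40, 80))
```

Doubling $N$ quadruples the SNR under this scaling, which is $10\log_{10}4 \approx 6$ dB and at most $\log_2 4 = 2$ bps/Hz.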
The curves under imperfect CSI show that the IRS-assisted system is more sensitive to channel estimation errors than the conventional MISO system (without IRS). This is because the IRS-assisted system has to estimate $N + 1 = 41$ channel vectors, whereas the direct system only needs to estimate one channel vector. Moreover, the error becomes more significant as the user moves farther away, because the channel vectors become weaker and more difficult to estimate. The IRS-assisted system designed using MMSE-DFT estimates outperforms the system that relies on LS-DFT estimates, especially for higher channel estimation noise, as also discussed for Fig. 2.
Next we study the minimum user rate performance of a multi-user system under imperfect CSI, with the BS placed at (0, 0), the IRS placed at (0, 100) and the users distributed uniformly in the square $(x, y) \in [-30, 30] \times [70, 130]$. Accounting for the rate loss due to channel training, the net achievable rate of user $k$ is given as $R_k = (1 - S\tau_S/\tau)\log_2(1 + \gamma_k)$ (57), where $\gamma_k$ is defined in (39). Note that the total channel estimation time $\tau_C$ sec is related to the number of estimation sub-phases $S$ and the duration of each sub-phase $\tau_S$ sec as $\tau_C = S\tau_S$. In Sec. III we saw that increasing $S$ improves the quality of the channel estimates by reducing the NMSE by a factor of approximately $S$. Moreover, under the proposed channel estimation protocol the minimum number of required sub-phases $S$ is $N + 1$, to ensure that the left pseudo-inverse of $\mathbf{V}^{tr}$ in (11) exists. At the same time, the total channel estimation time $\tau_C$ increases linearly with $S$, which reduces the time left for downlink transmission, causing the rate loss factor of $1 - S\tau_S/\tau$ that we see in (57). Therefore, increasing $S$ has the positive effect of improving the quality of the channel estimates and the adverse effect of increasing the total channel estimation time, so $S$ should be selected carefully to strike a balance. The next figure studies this trade-off.
In Fig. 7 we plot the net achievable minimum rate against $S$ for an IRS-assisted system serving 4 users with $M = 8$ antennas at the BS, while optimizing the precoding vectors, power allocation and IRS phase-shifts vector using Algorithm 1 with the MMSE channel estimates as the input. For the two considered IRS-assisted MISO systems, we find that $S \approx N + 1$ is the optimal number of sub-phases that maximizes the achieved minimum user rate, i.e. $S \approx 9$ is optimal for the system with $N = 8$ reflecting elements, while $S \approx 17$ is optimal for the system with $N = 16$ reflecting elements. For $S < N + 1$, the NMSE in the channel estimates becomes very high since the left pseudo-inverse of $\mathbf{V}^{tr}$ utilized in (11) does not exist, as $\mathbf{V}^{tr}$ does not have full column rank$^9$. As a result, the rate obtained for $S = N$ is lower than that for $S = N + 1$, since the computed pseudo-inverse for $S = N$ is inaccurate.
Increasing $S$ above $N + 1$ has the positive effect of reduced channel estimation error, as shown earlier in Fig. 3a and Fig. 3b. However, increasing $S$ also increases the channel training time, causing a rate loss factor of $1 - S\tau_S/\tau$ since the total time left for downlink transmission decreases as $\tau - S\tau_S$. The decrease in downlink transmission time is linear in $S$, as can be seen from (57), whereas the impact of the improvement in estimation quality is only logarithmic in $S$ since the SINR $\gamma_k$ appears inside the log function in (57). The negative effect of the decrease in downlink transmission time dominates over the positive effect of the improvement in the quality of the channel estimates as $S$ increases. Therefore, $S \approx N + 1$ is the optimal number of sub-phases for both considered settings.
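The interplay between the linear pre-log loss and the logarithmic SINR gain can be visualized with a toy model. In the sketch below, `net_rate` implements the net rate form $(1 - S\tau_S/\tau)\log_2(1 + \gamma)$, while `toy_sinr` is a purely hypothetical stand-in for how the effective SINR might improve as the estimation error shrinks roughly as $1/S$; the constants and the ratio $\tau_S/\tau = 0.01$ are assumptions and do not come from the paper's simulations.

```python
import math


def net_rate(sinr, S, tau_S, tau):
    """Net rate (1 - S*tau_S/tau) * log2(1 + SINR), accounting for training overhead."""
    return (1.0 - S * tau_S / tau) * math.log2(1.0 + sinr)


def toy_sinr(S, sinr_perfect=100.0, err0=5.0):
    """Hypothetical model: the SINR penalty from estimation error decays as 1/S."""
    return sinr_perfect / (1.0 + err0 / S)


def best_num_subphases(N, S_max, tau_S, tau):
    """Search S in {N+1, ..., S_max} for the largest net rate under the toy model."""
    candidates = range(N + 1, S_max + 1)
    return max(candidates, key=lambda S: net_rate(toy_sinr(S), S, tau_S, tau))


best_S = best_num_subphases(N=16, S_max=60, tau_S=1.0, tau=100.0)
print(best_S)
```

In this toy setting the pre-log factor shrinks by a fixed amount per additional sub-phase while the logarithm grows ever more slowly, so the search returns the smallest admissible value $S = N + 1$, in line with the qualitative conclusion above.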
Fig. 8 plots the minimum user rate against $N$ for a varying number of antennas at the BS in an IRS-assisted system, where the precoding vectors, allocated powers and IRS phases are optimized under Algorithm 1 for both the perfect CSI and imperfect CSI cases (where for the latter we use the channel estimates as input in step 1 of the algorithm). The number of sub-phases is set as $S = N + 1$ under the MMSE-DFT channel estimation protocol. The performance is compared to that of a conventional large MISO system having 20 antennas at the BS and no IRS. We show that by appropriately selecting the number of reflecting elements $N$ at the IRS, the IRS-assisted system can perform as well as the large MISO system with a reduced number of antennas at the BS. Under perfect CSI, the IRS-assisted MISO system with 28 passive reflecting elements at the IRS and only 12 active antennas at the BS can achieve the same performance as the considered large MISO system of 20 antennas. The same performance can also be achieved with $M = 15$ antennas using $N = 19$ reflecting elements at the IRS. We also notice that under channel estimation errors, larger array sizes are needed at the IRS to achieve the same performance as the conventional large MISO system. For example, under imperfect CSI an IRS-assisted system with $M = 12$ antennas at the BS can achieve nearly the same performance using $N = 48$ instead of $N = 28$ reflecting elements. Moreover, as the value of $N$ increases, the performance gap between the perfect and imperfect CSI curves for the IRS-assisted system significantly increases, since the minimum number of required sub-phases $S$ increases linearly in $N$. This causes a rate loss due to the time spent in channel training. Therefore, accurate and quick CSI acquisition is a critical issue in IRS-assisted communication systems that needs to be addressed to reap the full potential of this technology. Nevertheless, IRS-assisted communication also has the potential to be an energy-efficient alternative to technologies like massive MISO and network densification by reducing the number of active antennas and RF chains needed at the BS.

Fig. 8: Performance of an IRS-assisted multi-user system against $N$ under perfect (per.) and imperfect (imper.) CSI.
To test the performance of the proposed Algorithm 1, we consider the benchmark Centre of Means (CoM) scheme from [16], where the IRS phase shifts are set as the mean of the LoS angles of all users$^{10}$. The proposed algorithm is shown to outperform the benchmark scheme considerably.
Finally, we show the convergence behaviour of Algorithm 1 in Fig. 9 by setting $M = 8$, $N = 16$, $K = 4$ and $\epsilon = \epsilon_1 = 10^{-4}$. The phase shifts are initialized using the CoM scheme. The minimum user rate, computed using the SINR defined in (39), is plotted against the number of iterations. It is observed that the minimum rate yielded by the proposed algorithm under both perfect and imperfect CSI increases quickly with the number of iterations and that the algorithm converges in fewer than 15 iterations.

VI. CONCLUSION
In this paper, IRS-assisted wireless communication is envisioned to be an important energy-efficient paradigm for beyond-5G networks, achieving massive MISO-like gains with a smaller number of active antennas at the BS. The passive elements constituting the IRS smartly reconfigure the signal propagation by introducing phase shifts onto the impinging electromagnetic waves. This paper proposed the MMSE-DFT channel estimation protocol to estimate the direct and IRS-assisted links and compared it with the existing LS-based channel estimation protocols. The MMSE estimates were both analytically and numerically shown to achieve a much lower NMSE than the LS estimates. We then proposed an AO algorithm to maximize the minimum SINR, subject to a transmit power constraint and unit-modulus constraints on the IRS elements. The AO algorithm was proved to converge and was shown to yield excellent performance gains in the simulation results, which compared the performance of the proposed IRS-assisted system to that of the conventional MISO system under imperfect CSI. However, the results also highlighted the high sensitivity of IRS-assisted systems to the quality of the estimates and the rate loss due to channel training.
For future research, it is important to develop low-overhead channel estimation protocols where the number of required sub-phases can be reduced to avoid long channel training times. It is also important to make the channel estimation protocols robust in high-speed environments. Another important direction is to study the impact of discrete phase shifts on the performance of IRS-assisted systems under imperfect CSI. The work can also be extended to communication systems assisted by multiple IRSs as well as to IRS-assisted multi-cell systems, where pilot contamination will play a detrimental role in channel estimation.

APPENDIX
Moreover, it is clear that $\hat{\mathbf{h}}_{d,k}$ is a complex Gaussian vector, the covariance matrix of which can be straightforwardly computed. This completes the proof of Lemma 1.

B. Proof of Lemma 2
Here, $E[\mathbf{n}^{tr}_k\mathbf{n}^{trH}_k]$ is computed using similar steps as done in (65). The expression in (73) then follows from realizing that $\mathbf{v}^{trH}_{n+1}\mathbf{v}^{tr}_{n+1} = S$ under the proposed DFT design for $\mathbf{V}^{tr}$. Using (71) and (73) in (69), we obtain the estimate stated in Lemma 2. Moreover, it is clear that $\hat{\mathbf{h}}_{0,n,k}$ is a complex Gaussian vector, the covariance matrix $\mathbf{\Psi}_{n,k} = E[\hat{\mathbf{h}}_{0,n,k}\hat{\mathbf{h}}^H_{0,n,k}]$ of which can be straightforwardly computed.
This completes the proof of Lemma 2.
Fig. 2b shows the simulated NMSE in the estimates $\hat{\mathbf{h}}_{0,n,k}$ as well as the theoretical expressions in (35) and (37) for the LS-DFT and MMSE-DFT estimates respectively. The parameter values are set as $M = 4$, $N = 10$, $P_C = 1$, $T_S = K = 1$, $\tau = 50\,\mu$s, $\tau_S = T_S\tau$ and $S = N + 1$. The simulated NMSE matches the theoretical expressions perfectly. Moreover, the MMSE-DFT estimates achieve a lower NMSE than the LS-DFT estimates, especially for moderate to high values of $\sigma^2$ (i.e. in the low SNR regime).

16: until the fractional increase in $\min_k \gamma_k$ is below $\epsilon$.

Fig. 7: Number of sub-phases S that maximizes the minimum user rate achieved by the IRS-assisted multi-user MISO system under MMSE-DFT protocol.

TABLE I: Important symbols defining the communication model.
Product of $\beta_1$ and $\beta_{2,k}$.
$\mathbf{R}_{BS_k} \in \mathbb{C}^{M \times M}$: Correlation matrix at BS w.r.t. user $k$.
$\mathbf{R}_{IRS_k} \in \mathbb{C}^{N \times N}$: Correlation matrix at IRS w.r.t. user $k$.
$N$, where $\lambda$ is the carrier wavelength, $\theta_{LoS_1,n}$ and $\phi_{LoS_1,n}$ represent the elevation and azimuth LoS angles of departure (AoD) respectively at the BS w.r.t. IRS element $n$, and $\theta_{LoS_2,m}$ and $\phi_{LoS_2,m}$ represent the elevation and azimuth LoS angles of arrival (AoA) respectively at the IRS. Moreover, $\beta_1$ is the path loss factor for the BS-to-IRS link, $d_{BS}$ is the inter-antenna separation at the BS and $d_{IRS}$ is the inter-element separation at the IRS.

TABLE II: Important symbols defining the channel estimation protocol.

TABLE III: Simulation parameters.
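A. Proof of Lemma 1
Since $\tilde{\mathbf{r}}^{tr}_{1,k}$ and $\mathbf{h}_{d,k}$ are jointly Gaussian, the MMSE estimator is linear. Given the observed training signal $\tilde{\mathbf{r}}^{tr}_{1,k}$ in (18), the MMSE estimate of $\mathbf{h}_{d,k}$ is given as $\hat{\mathbf{h}}_{d,k} = \mathbf{W}\tilde{\mathbf{r}}^{tr}_{1,k}$ (58), where $\mathbf{W}$ is found as the solution to $\min_{\mathbf{W}} \text{tr}(E[(\hat{\mathbf{h}}_{d,k} - \mathbf{h}_{d,k})(\hat{\mathbf{h}}_{d,k} - \mathbf{h}_{d,k})^H])$. The covariance of the observation is computed as $E[\tilde{\mathbf{r}}^{tr}_{1,k}\tilde{\mathbf{r}}^{trH}_{1,k}] = \beta_{d,k}\mathbf{R}_{BS_k} + \frac{1}{S^2}\frac{\sigma^2 P_C\tau_S}{(P_C\tau_S)^2}(\mathbf{v}^{tr}_1 \otimes \mathbf{I}_M)^H\mathbf{I}_{MS}(\mathbf{v}^{tr}_1 \otimes \mathbf{I}_M)$ (61), where the noise covariance is obtained from $E[\mathbf{n}^{tr}_{s,k}\mathbf{n}^{trH}_{s,k}] = E[\mathbf{N}^{tr}_s\mathbf{x}_{p,k}\mathbf{x}^H_{p,k}\mathbf{N}^{trH}_s]$.

For Lemma 2, given the observed training signal $\tilde{\mathbf{r}}^{tr}_{n+1,k}$ in (21), we can write the MMSE estimate of $\mathbf{h}_{0,n,k}$ as $\hat{\mathbf{h}}_{0,n,k} = E[\mathbf{h}_{0,n,k}\tilde{\mathbf{r}}^{trH}_{n+1,k}](E[\tilde{\mathbf{r}}^{tr}_{n+1,k}\tilde{\mathbf{r}}^{trH}_{n+1,k}])^{-1}\tilde{\mathbf{r}}^{tr}_{n+1,k}$. Noting that $\mathbf{n}^{tr}_k$ and $\mathbf{h}_{0,n,k}$ are independent random vectors, we obtain $E[\tilde{\mathbf{r}}^{tr}_{n+1,k}\mathbf{h}^H_{0,n,k}] = E\left[\left(\mathbf{h}_{0,n,k} + \frac{(\mathbf{v}^{tr}_{n+1} \otimes \mathbf{I}_M)^H\mathbf{n}^{tr}_k}{S P_C\tau_S}\right)\mathbf{h}^H_{0,n,k}\right] = r_{n,k}\beta_{2,k}\mathbf{h}_{1,n}\mathbf{h}^H_{1,n}$, where $\mathbf{h}_{1,n}$ is the $n$th column of $\mathbf{H}_1$ and $r_{n,k}$ is element $(n, n)$ of $\mathbf{R}_{IRS_k}$. Next we obtain the expression $E[\tilde{\mathbf{r}}^{tr}_{n+1,k}\tilde{\mathbf{r}}^{trH}_{n+1,k}] = r_{n,k}\beta_{2,k}\mathbf{h}_{1,n}\mathbf{h}^H_{1,n} + \frac{\sigma^2}{S P_C\tau_S}\mathbf{I}_M$.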