Sum Rate Maximization Versus MSE Minimization in FDD Massive MIMO Systems with Short Coherence Time

The increasing demand for higher data rates motivates the exploration of advanced techniques for future wireless networks. To this end, massive multiple-input multiple-output (mMIMO) is envisioned as the most essential technique to meet this demand. However, the growing number of antennas in mMIMO systems with short coherence time makes the downlink channel estimation (DCE) overhead potentially overwhelming. As such, the number of training sequences (TS) needs to be significantly reduced. However, reducing the number of TS significantly degrades the mean-squared error (MSE) accuracy, and to date it is not clear to what extent this TS reduction affects the achievable sum rate performance. Therefore, this paper develops a low-complexity and tractable TS solution for DCE and establishes an analytical framework for the optimum TS. Furthermore, the tradeoff between the achievable sum rate maximization criterion and the MSE minimization criterion is investigated. This investigation is essential to characterize the optimum TS length and the actual performance of mMIMO systems when the channel exhibits a limited coherence time. To this end, the statistical structure of mMIMO channels is exploited. In addition, this paper utilizes a random matrix theory (RMT) method to characterize the downlink achievable sum rate and the MSE in closed form. This paper shows that maximizing the downlink sum rate criterion is more important than minimizing the MSE of the SINR only, which is typically considered in conventional MIMO systems and/or in time division duplex (TDD) mMIMO systems. The results demonstrate that a feasible downlink achievable sum rate can be achieved in a frequency division duplex (FDD) mMIMO system. This finding is necessary to extend the benefit of mMIMO systems to high frequency bands such as millimeter-wave (mmWave) and Terahertz (THz) communications.


I. INTRODUCTION
Future wireless networks aim to maximize the data rate to support the rapidly increasing demand for data traffic and to meet the envisioned needs of Internet of Things (IoT) and artificial intelligence (AI) applications [1], [2]. The massive multiple-input multiple-output (mMIMO) communication system has been introduced as a key technology to address the challenge of the data traffic explosion [3], [4]. In mMIMO systems, hundreds of antennas grouped together at the base station (BS) serve several users simultaneously over the same time and frequency resources [5]-[8]. In particular, mMIMO systems can increase the degrees of freedom in the propagation channel, focus the energy into spatial directions, and improve both data rate and communication reliability [3], [9], [10]. Furthermore, mMIMO systems allow the use of low-complexity combining and precoding techniques [11]-[13].
However, unlike conventional MIMO systems, which use a small number of BS antennas $N$, in mMIMO systems the training sequence (TS) required for downlink channel estimation (DCE) is potentially overwhelming [14]. In particular, the limited coherence time is to date considered one of the major technical challenges in mMIMO systems, since it affects the TS length that can be spent on accurate DCE [9], [14]. Therefore, obtaining a feasible solution for DCE based on a downlink (DL) TS design with limited TS length is essential when the coherence time is limited. In this context, the vast majority of research studies on mMIMO systems have focused on the time division duplex (TDD) operation mode. In TDD based systems, the uplink (UL) and DL baseband channels are assumed to be reciprocal. This reciprocity assumption allows the UL channel estimate to be used at the BS in the DL precoder design. However, regardless of the optimistic results of channel estimation in the TDD transmission mode, most of the currently deployed cellular systems use frequency division duplex (FDD) operation. In addition, TDD systems rely on an idealistic assumption of UL and DL channel reciprocity. In practice, however, transceiver hardware impairments and calibration errors are a major restriction of the TDD mode of operation [15]-[18]. Therefore, there is a substantial commercial interest in enabling the FDD operation mode, thus making mMIMO systems compatible with currently deployed cellular networks [19], [20].
As such, this paper focuses on the FDD operation mode, where the DL channel is estimated using a dedicated DL TS. In FDD based systems, estimating the DL channels from a UL TS, as done in TDD systems, is not possible. Instead, to complete the precoder design, the DL channels of each of the $N$ BS antennas would need to be estimated by the users using a DL TS. This is considered infeasible for large $N$, since the available coherence time would be used for DCE only, leaving no time for sending useful information to the users. Previous research has demonstrated that the DL TS length needs to scale linearly with $N$ [9], [14], [21]. This finding arises mainly from earlier point-to-point MIMO studies [22], where the channel is assumed to be uncorrelated and the DL TS is designed under the criterion of mean-squared-error (MSE) minimization of the DCE. Applying this design principle to FDD based mMIMO indeed leads to DL TS lengths of $N$, or close to $N$. This supports the above conclusion that a DL TS is infeasible in mMIMO systems with large $N$, which would render mMIMO unsuitable for FDD operation. Several studies have therefore investigated the DCE performance of DL TS designs, see e.g., [23]-[30]. These works have found that a considerable enhancement in the DCE can be obtained by exploiting the Bayesian filter [31], i.e., by making use of the minimum-mean-square-error (MMSE) estimator. The authors in [32], [33] have investigated FDD operation in mMIMO systems using a two-stage precoding approach, in which the channel dimensions are reduced by separating the users into groups based on their correlation similarities. The DCE in FDD mMIMO systems has also been investigated in [34]-[36], where spatial and temporal correlations along with the Kalman filter (KF) are used to minimize the MSE of the DCE.
Another line of research has investigated the minimization of the MSE of the DCE by using compressed sensing (CS) based approaches [19], [37]-[39]. Other research directions have investigated the DL TS design for DCE in FDD mMIMO systems by utilizing the low-rank nature of the channel covariance matrix [40]-[45]. These studies try to overcome the pessimistic view that DCE is infeasible by reducing the TS length. However, it is to date not clear to what extent such TS reduction affects the DL sum rate of FDD operation in mMIMO systems with single-stage precoding when the coherence time is short. Specifically, following the MSE minimization criterion leads to unnecessarily pessimistic predictions of the FDD performance in mMIMO systems. On the one hand, using a shorter TS reduces the accuracy of the DCE significantly, which could reduce the DL achievable sum rate. On the other hand, increasing the TS length improves the DCE accuracy, but at the expense of the time left for data transmission, which also reduces the DL achievable sum rate significantly. Therefore, characterizing the tradeoff between achievable sum rate maximization and MSE minimization in FDD mMIMO systems with limited coherence time is crucial. To the best of our knowledge, this tradeoff has not been investigated in the literature. This motivates us to investigate the tradeoff between achievable sum rate maximization and MSE minimization in an FDD mMIMO system using a feasible TS design.

A. PAPER CONTRIBUTIONS AND FINDINGS
This paper proposes a low-complexity and tractable TS solution for DCE and establishes an analytical framework for the optimum TS length. Furthermore, the tradeoff between the achievable sum rate maximization criterion and the MSE minimization criterion is investigated under a large number of antenna elements and a short, finite coherence time. To design a feasible TS for a DL FDD mMIMO system, the second-order channel statistics are exploited and an objective function based on maximizing the achievable sum rate is used instead of minimizing the MSE of the DCE. The proposed TS design can be considered a special case of beam-domain or angular-domain channel estimation [46]-[49], because the second-order channel statistics, represented by the channel covariance matrix, are used in the design of the DL training sequence. Our proposed design paradigm leads to feasible DL FDD mMIMO performance in which the TS length can be made much shorter than $N$, thus supporting a feasible rate as $N$ increases even though the coherence time remains limited. In this paper, we explicitly characterize the optimal TS length required for DL sum rate maximization with limited coherence time. Since evaluating the system performance and optimizing the rate using extensive Monte-Carlo simulations are computationally demanding, a random matrix theory (RMT) method [12], [50] is used. This allows the signal-to-interference-plus-noise ratio (SINR) and the DL achievable sum rate for eigenbeamforming (BF) and regularized-zero-forcing (RZF) precoding in FDD systems to be expressed in analytical form. Thus, a straightforward design methodology is obtained without resorting to a computationally demanding exhaustive search. Comparisons between the DL sum rates of BF and RZF precoding are conducted under different spatial correlation models. In addition, we examine the effect of increasing the number of BS antennas $N$ beyond the available coherence time.
Comparisons between the achievable sum rate maximization criterion and the MSE minimization criterion are provided. This paper shows that improving the MSE of the DCE increases the DL achievable rate only logarithmically, whereas reducing the time required for channel training as a proportion of the DL transmission time increases the achievable sum rate linearly. This implies that the loss in DCE performance caused by a shorter TS is minor in comparison to the gain from having more data symbols, which maximizes the DL achievable sum rate. The numerical results indicate that, by using the proposed DL achievable sum rate maximization criterion, a feasible DL sum rate can be obtained in FDD operation of mMIMO systems. The results also show that the RZF precoder under correlated channels achieves a significant gain in the DL sum rate in comparison to the BF precoder. These findings create a pathway for realizing FDD mMIMO systems in high frequency bands such as millimeter-wave (mmWave) and Terahertz (THz) communications with fully digital precoding. Finally, the results demonstrate that the analytical solution obtained with the RMT method agrees tightly with the simulation, which underpins the contributions of this paper.
Paper organization: The organization of this paper is presented as follows. In Section II, we present the system model and characterize the SINR and the achievable sum rate for the BF and RZF precoders. In Section III, we introduce the DCE process based on the DL TS design together with the problem formulations of MSE minimization and sum rate maximization criterion. In Section IV, an asymptotic RMT method is exploited to develop an asymptotic analysis for the DL sum rate of the BF and RZF precoders. In Section V, analytical and simulation results for the sum rate and MSE of BF and RZF precoding are provided in order to characterize the FDD operation in mMIMO system under different channel correlation models. Finally, Section VI concludes the paper.
Notation: This paper uses an uppercase boldface symbol to denote a matrix and a lowercase boldface symbol to denote a vector. The circularly symmetric complex Gaussian (CSCG) distribution is denoted by $\mathcal{CN}(\mathbf{0}, \mathbf{G})$, which implies a zero mean and a covariance matrix $\mathbf{G}$. The term $\mathbb{E}[\cdot]$ denotes the expectation operator. The trace, transpose, Hermitian transpose and inverse are denoted by $\mathrm{tr}(\cdot)$, $(\cdot)^T$, $(\cdot)^H$, and $(\cdot)^{-1}$, respectively.

II. SYSTEM MODEL
This paper considers a single-cell wireless communication system in which the BS is equipped with an array of $N$ antennas and communicates in the DL with $K$ single-antenna uncorrelated users. The single-antenna assumption for the users allows inexpensive and simple hardware with efficient power usage [51]. The DL transmission to all users is carried out simultaneously over the same time-frequency resources. In mMIMO systems, a large $N$ can be used while the number of users $K$ remains limited, i.e., $N \gg K$ [5]. We consider an FDD transmission mode in the DL with Rayleigh flat-fading channels and a single frequency band. A block fading structure is considered, where each channel is static over a coherence block of $\tau_c \in \mathbb{Z}^+$ symbols. The random channel realizations are generated independently across coherence blocks.
The available energy in each coherence block can be freely distributed between the training transmission and the data transmission, as typically considered in currently deployed mobile networks, i.e., advanced long-term evolution (A-LTE) [52], [53]. To estimate the DL channel in the training phase, the BS sends a TS of length $\tau_p$, enumerated in symbols, per coherence block. The remaining time is allocated to data, so that the time available for useful data, in symbols, is $\tau_d = \tau_c - \tau_p$. Fig. 1 presents a schematic diagram of the mMIMO system with single-stage digital precoding at the BS. This paper concentrates entirely on the problem of DCE using a feasible TS design and on the effect of reducing the TS length on the MSE and the DL achievable sum rate. Investigating the effect of channel feedback in the UL and reducing its overhead [54]-[57], or using signal compression schemes as in [58]-[60], can be addressed in future work. To this end, the received DL data transmission signal $r_k$ at the $k$-th user is given as

$$r_k = \sqrt{P_d}\,\mathbf{g}_k^H \mathbf{v} + z_k, \tag{1}$$

where $P_d$ denotes the per-user transmit signal power of useful data at the BS during the data phase. The vector $\mathbf{v}$ is the DL transmit vector, which contains the precoding matrix and the transmitted data symbols at the BS and is given in (4). The term $z_k$ denotes the additive noise, which is modeled as a zero mean, unit variance CSCG random variable. The term $\mathbf{g}_k \in \mathbb{C}^N$ denotes the DL channel vector between the BS and the $k$-th user and can be modeled as

$$\mathbf{g}_k = \mathbf{G}^{1/2}\,\breve{\mathbf{g}}_k, \tag{2}$$

where $\mathbf{G} = \mathbb{E}[\mathbf{g}_k \mathbf{g}_k^H] \in \mathbb{C}^{N \times N}$ is the channel covariance matrix, which reflects the spatial correlation, and $\breve{\mathbf{g}}_k$, $k = 1, \ldots, K$, are the instantaneous DL channel realizations, which are modeled as CSCG distributed vectors. The spatial channel covariance matrix has the eigenvalue decomposition (EVD)

$$\mathbf{G} = \mathbf{U} \mathbf{\Lambda} \mathbf{U}^H, \tag{3}$$

where the columns of $\mathbf{U} = [\mathbf{u}_1, \ldots, \mathbf{u}_N] \in \mathbb{C}^{N \times N}$ are the eigenvectors of $\mathbf{G}$ and $\mathbf{\Lambda}$ is the diagonal matrix of the eigenvalues of $\mathbf{G}$, which are arranged as $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N$. The spatial covariance matrix depends on large-scale statistics, i.e., angles of arrival and departure or spatial/temporal correlation, which are considered to be frequency-invariant and thus can be efficiently known at both the user side and the BS side in both FDD and TDD systems [61]. The term $\mathbf{v} \in \mathbb{C}^N$ in (1) represents the DL transmit vector at the BS, which is given as

$$\mathbf{v} = \sqrt{\varrho}\,\mathbf{F}\mathbf{d}, \tag{4}$$

where $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_K] \in \mathbb{C}^{N \times K}$ is the precoding matrix employed by the BS, which depends on the accuracy of the DCE. The transmitted data symbols are collected in $\mathbf{d} = [d_1, \ldots, d_K]^T \in \mathbb{C}^K$, which is modeled as a zero mean CSCG vector satisfying $\mathbb{E}[\mathbf{d}\mathbf{d}^H] = \mathbf{I}_K$. The term $\varrho$ in (4) is the normalization constant, which is defined as [12]

$$\varrho = \frac{K}{\mathbb{E}\left[\mathrm{tr}\left(\mathbf{F}\mathbf{F}^H\right)\right]} \tag{5}$$

to ensure that $\mathbb{E}[\|\mathbf{v}\|^2] = K$, so that the power transmitted by the BS during the DL data transmission is set by $P_d$. Following [11], we assume that the user does not know the exact channel vector and precoding matrix, but instead knows their average effect through $\sqrt{\varrho}\,\mathbb{E}[\mathbf{g}_k^H \mathbf{f}_k]$. To this end, the DL received signal model (1) is decomposed as

$$r_k = \sqrt{P_d \varrho}\,\mathbb{E}[\mathbf{g}_k^H \mathbf{f}_k]\, d_k + \sqrt{P_d \varrho}\left(\mathbf{g}_k^H \mathbf{f}_k - \mathbb{E}[\mathbf{g}_k^H \mathbf{f}_k]\right) d_k + \sqrt{P_d \varrho} \sum_{j \neq k} \mathbf{g}_k^H \mathbf{f}_j d_j + z_k, \tag{6}$$

where the first term in (6) is the desired information signal intended for the $k$-th user, the second term is the DCE error caused by the imperfect channel knowledge at the $k$-th user, the third term is the multiuser interference after precoding, and $z_k$ is the additive noise. Although the interference and the estimation error terms are neither Gaussian nor independent of the signal of interest, an ergodic sum rate lower bound is obtained by treating both terms as CSCG and independent of the signal of interest [11], [22].
Accordingly, a DL achievable sum rate, denoted as $C$, for the mMIMO system is written as

$$C = \left(1 - \frac{\tau_p}{\tau_c}\right) \sum_{k=1}^{K} \log_2\left(1 + \gamma_k\right), \tag{7}$$

where the term $\gamma_k$ is the SINR at the $k$-th user, which can be written as

$$\gamma_k = \frac{P_d \varrho \left|\mathbb{E}[\mathbf{g}_k^H \mathbf{f}_k]\right|^2}{1 + P_d \varrho\, \mathrm{var}\left[\mathbf{g}_k^H \mathbf{f}_k\right] + P_d \varrho \sum_{j \neq k} \mathbb{E}\left[\left|\mathbf{g}_k^H \mathbf{f}_j\right|^2\right]}. \tag{8}$$

It should be pointed out that the SINR given in (8) is a deterministic approximation of the true SINR that gives a lower bound in (7), but it does not directly correspond to a measurable physical quantity. In particular, the SINR is used as an auxiliary variable to calculate the ergodic DL sum rate lower bound defined in (7). In addition, the expectations in (8) are taken over different channel realizations and are evaluated separately using extensive Monte-Carlo simulations. This is, in general, a computationally demanding process, since the SINR and the sum rate need to be evaluated for different values of $N$ with $N \gg 1$. However, computationally feasible solutions for the SINR and the sum rate are obtained by using the RMT method.
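The structure of the sum rate bound can be illustrated with a minimal sketch. It assumes that (7) has the form of a pre-log factor $(\tau_c - \tau_p)/\tau_c$ multiplying $\sum_k \log_2(1 + \gamma_k)$, as described in Remark 1. The function name `achievable_sum_rate` and all numerical values are hypothetical and chosen for illustration only.

```python
import math


def achievable_sum_rate(sinrs, tau_p, tau_c):
    """Sum-rate lower bound: pre-log factor times the sum of log2(1 + SINR_k).

    Assumed form of (7); sinrs holds one (linear-scale) SINR per user.
    """
    if not 0 <= tau_p <= tau_c:
        raise ValueError("tau_p must lie in [0, tau_c]")
    prelog = (tau_c - tau_p) / tau_c  # fraction of the block left for data
    return prelog * sum(math.log2(1.0 + g) for g in sinrs)


# Hypothetical example: K = 4 users with equal SINR, 100-symbol coherence block.
sinrs = [15.0] * 4
rate_short_ts = achievable_sum_rate(sinrs, tau_p=10, tau_c=100)
rate_long_ts = achievable_sum_rate(sinrs, tau_p=50, tau_c=100)
print(rate_short_ts, rate_long_ts)
```

For a fixed SINR, the rate shrinks in direct proportion to the training overhead, which is why the choice of $\tau_p$ cannot be judged from the estimation accuracy alone.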
The achievable sum rate in (7) relies on the statistics of channel, which is denoted by the covariance matrix, the knowledge of the DCE at the BS, and the precoding technique. In this paper, two different types of precoding are considered, BF precoding and RZF precoding, which are given in (9) [12].
The term $\hat{\mathbf{F}} \in \mathbb{C}^{K \times N}$ is the estimate of the DL channel matrix, i.e., $\hat{\mathbf{F}} = [\hat{\mathbf{g}}_1, \hat{\mathbf{g}}_2, \ldots, \hat{\mathbf{g}}_K]^H \in \mathbb{C}^{K \times N}$, and $\varsigma$ denotes the regularization coefficient, which is set to $1/P_d$ based on the discussions in [12]; optimization with respect to $\varsigma$ could be considered in future work. The following section explains the TS based DCE process in FDD mMIMO systems.

III. DOWNLINK CHANNEL ESTIMATION AND PROBLEM FORMULATIONS
As depicted in equations (8) and (9), the achievable sum rate performance relies on the channel statistics and the knowledge of the DCE at the BS. Therefore, the following subsections address the DCE using a DL TS in an FDD mMIMO system. It is worth pointing out that the DCE is different from the UL channel estimation, which is considered for idealistic reciprocal channels, such as in TDD systems.

A. DOWNLINK CHANNEL ESTIMATION PROCESS
As explained earlier, in the DL, the BS needs to serve different users simultaneously and to precode the data towards a specific user direction, using the estimated channel. To estimate the DL channel in FDD systems, the BS sends a predefined TS of length $\tau_p$ symbols to the users during the training transmission phase. The TS length $\tau_p$ is defined as the number of training slots in the time-frequency resource grid of a coherence block, over which the channel remains constant. An orthogonal TS with uniform power allocation is considered in this paper. Accordingly, the received training signal, $\mathbf{r}_k \in \mathbb{C}^{\tau_p}$, at the $k$-th user is given by

$$\mathbf{r}_k = \mathbf{X}_p^H \mathbf{g}_k + \mathbf{z}_k, \tag{10}$$

which contains the inner products of the $k$-th user DL channel vector and the TS, plus the receiver noise $\mathbf{z}_k$. The receiver noise is assumed to be CSCG distributed as $\mathcal{CN}(\mathbf{0}, \mathbf{I}_{\tau_p})$. The TS matrix should satisfy the energy constraint $\mathrm{tr}(\mathbf{X}_p^H \mathbf{X}_p) = P_p \tau_p$, where $P_p$ is the average transmitted power during the training transmission phase and $\tau_p$ is the pilot sequence length. Since the channel vector $\mathbf{g}_k$ is CSCG distributed with known statistics, the MMSE estimate can be obtained by conventional linear processing [31]. In particular, user $k$ applies the standard linear MMSE filter to the received pilot signal $\mathbf{r}_k$, and the resulting DCE is obtained as

$$\hat{\mathbf{g}}_k = \mathbf{G}\mathbf{X}_p \left(\mathbf{X}_p^H \mathbf{G} \mathbf{X}_p + \mathbf{I}_{\tau_p}\right)^{-1} \mathbf{r}_k, \tag{11}$$

where $\mathbf{r}_k \in \mathbb{C}^{\tau_p}$ is the received training signal given in (10). The channel estimation error vector can then be expressed as

$$\tilde{\mathbf{g}}_k = \mathbf{g}_k - \hat{\mathbf{g}}_k. \tag{12}$$

By the orthogonality principle, the vectors $\hat{\mathbf{g}}_k$ and $\tilde{\mathbf{g}}_k$ are uncorrelated. The MMSE estimator makes use of the second-order channel statistics.
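Let $\mathbf{\Theta}$ be defined as the covariance matrix of the MMSE channel estimate, which is expressed as

$$\mathbf{\Theta} = \mathbb{E}\left[\hat{\mathbf{g}}_k \hat{\mathbf{g}}_k^H\right]. \tag{13}$$

To this end, the covariance matrix of the MMSE channel estimate can be written as

$$\mathbf{\Theta} = \mathbf{G}\mathbf{X}_p \left(\mathbf{X}_p^H \mathbf{G} \mathbf{X}_p + \mathbf{I}_{\tau_p}\right)^{-1} \mathbf{X}_p^H \mathbf{G}. \tag{14}$$

Given the channel estimation error vector in (12), the error covariance matrix $\mathbf{E} \in \mathbb{C}^{N \times N}$ can be expressed as

$$\mathbf{E} = \mathbb{E}\left[\tilde{\mathbf{g}}_k \tilde{\mathbf{g}}_k^H\right] = \mathbf{G} - \mathbf{\Theta}, \tag{15}$$

which yields the error covariance matrix as

$$\mathbf{E} = \mathbf{G} - \mathbf{G}\mathbf{X}_p \left(\mathbf{X}_p^H \mathbf{G} \mathbf{X}_p + \mathbf{I}_{\tau_p}\right)^{-1} \mathbf{X}_p^H \mathbf{G}, \tag{16}$$

where the expression in (16) is minimized by maximizing the subtracted term on the right-hand side of (16).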
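Lemma 1. For conformable matrices $\mathbf{A}$, $\mathbf{U}$, $\mathbf{C}$, $\mathbf{V}$ with $\mathbf{A}$ and $\mathbf{C}$ invertible, the Woodbury matrix identity gives

$$\left(\mathbf{A} + \mathbf{U}\mathbf{C}\mathbf{V}\right)^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{U}\left(\mathbf{C}^{-1} + \mathbf{V}\mathbf{A}^{-1}\mathbf{U}\right)^{-1}\mathbf{V}\mathbf{A}^{-1}. \tag{17}$$

Using Lemma 1, the expression of the error covariance matrix in (16) can be reformulated as

$$\mathbf{E} = \left(\mathbf{G}^{-1} + \mathbf{X}_p \mathbf{X}_p^H\right)^{-1}, \tag{18}$$

where the training sequence matrix should satisfy the energy constraint $\mathrm{tr}(\mathbf{X}_p^H \mathbf{X}_p) = P_p \tau_p$. Accordingly, the cost function $\mathrm{MSE} \in \mathbb{R}$ with the MMSE channel estimate can be expressed as

$$\mathrm{MSE} = \mathrm{tr}(\mathbf{E}) = \mathrm{tr}\left(\left(\mathbf{G}^{-1} + \mathbf{X}_p \mathbf{X}_p^H\right)^{-1}\right). \tag{19}$$

The expression in (19) depends on the eigenvalue distribution of $\mathbf{G}^{-1} + \mathbf{X}_p \mathbf{X}_p^H$. Specifically, the MSE is the sum of the diagonal entries of the error covariance matrix.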
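The effect of the Woodbury identity on the error covariance can be checked in the scalar case. The sketch below assumes the standard MMSE forms $G - GX(X^H G X + I)^{-1} X^H G$ and $(G^{-1} + X X^H)^{-1}$ for the error covariance before and after applying Lemma 1, reduced to $N = \tau_p = 1$ with real values. The function names and the numbers are hypothetical.

```python
def error_var_subtractive(g, x):
    """Scalar form of G - G X (X^H G X + I)^{-1} X^H G (assumed standard MMSE form)."""
    return g - g * x * (1.0 / (x * g * x + 1.0)) * x * g


def error_var_woodbury(g, x):
    """Scalar form of (G^{-1} + X X^H)^{-1}, i.e. the form after the Woodbury identity."""
    return 1.0 / (1.0 / g + x * x)


# Hypothetical channel variance g and real pilot amplitude x.
g, x = 2.0, 1.5
a = error_var_subtractive(g, x)
b = error_var_woodbury(g, x)
print(a, b)  # the two forms agree up to floating-point rounding
```

Both forms shrink towards zero as the pilot amplitude grows, which is the scalar counterpart of the MSE decreasing with the training energy.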

B. FORMULATION OF THE MSE MINIMIZATION PROBLEM
Minimizing the MSE over the DL TS matrix X p for a given TS duration τ p and training power P p in the DL FDD mMIMO systems equates to the optimization problem defined in (20) under the transmit energy constraint.
The MSE is a function of the subspace determined by the structure of the channel covariance matrix $\mathbf{G}$ and the TS matrix $\mathbf{X}_p$, which in turn depends on the pilot energy allocated during the training transmission phase, i.e., the training time and the training power. The next subsection discusses in detail the optimum TS structure for the FDD mMIMO scenario under consideration.

C. DOWNLINK TRAINING SEQUENCE DESIGN
The expression in (13) characterizes the output of a channel estimator that minimizes the MSE, which depends on the structure of the channel covariance matrix. This motivates the use of the statistical structure of the channel covariance matrix $\mathbf{G}$ to optimize the TS design of an FDD mMIMO system. To this end, the TS matrix $\mathbf{X}_p \in \mathbb{C}^{N \times \tau_p}$ is constructed from the first $\tau_p$ eigenvectors of $\mathbf{G}$, which correspond to the largest eigenvalues, as

$$\mathbf{X}_p = \sqrt{P_p}\,\mathbf{U}_{\tau_p} = \sqrt{P_p}\,[\mathbf{u}_1, \ldots, \mathbf{u}_{\tau_p}] \tag{21}$$

for $\tau_p \in \{1, 2, \ldots, \min(N, \tau_c)\}$, which implies that the TS length in our approach does not exceed the coherence time even when the channels are uncorrelated.
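The eigenvector based TS construction can be sketched as follows. The sketch assumes a uniform-power design $\mathbf{X}_p = \sqrt{P_p}\,[\mathbf{u}_1, \ldots, \mathbf{u}_{\tau_p}]$ and, purely for illustration, a circulant covariance matrix, whose eigenvectors are the columns of the unitary DFT matrix. The circulant model, the eigenvalue list and all function names are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft_eigenvectors(n):
    """Unitary DFT matrix U[row][col]; its columns diagonalize any circulant matrix."""
    return [[cmath.exp(-2j * math.pi * r * c / n) / math.sqrt(n)
             for c in range(n)] for r in range(n)]


def design_training(U, eigvals, tau_p, P_p):
    """Keep the tau_p eigenvectors with the largest eigenvalues, scaled by sqrt(P_p)."""
    order = sorted(range(len(eigvals)), key=lambda i: eigvals[i], reverse=True)[:tau_p]
    return [[math.sqrt(P_p) * U[r][c] for c in order] for r in range(len(U))]


def training_energy(X):
    """tr(X^H X), computed as the sum of squared magnitudes of all entries."""
    return sum(abs(v) ** 2 for row in X for v in row)


# Hypothetical setup: N = 8 antennas, unsorted eigenvalues with tr(G) = N.
N, tau_p, P_p = 8, 3, 2.0
eigvals = [0.1, 3.0, 0.5, 2.0, 0.2, 1.0, 0.7, 0.5]
X = design_training(dft_eigenvectors(N), eigvals, tau_p, P_p)
energy = training_energy(X)  # should match P_p * tau_p
print(energy)
```

Because the selected columns are orthonormal before scaling, the energy constraint $\mathrm{tr}(\mathbf{X}_p^H \mathbf{X}_p) = P_p \tau_p$ holds by construction.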
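While the TS design given in (21) implies a uniform power allocation across the TS, non-uniform power allocations across the TS and optimization with respect to $P_p$ and $P_d$ could be investigated in the future. Substituting (21) into (13) yields, after some straightforward algebra, a simplified closed form for the covariance of the MMSE channel estimate as

$$\mathbf{\Theta} = P_p\,\mathbf{U}_{\tau_p} \mathbf{\Lambda}_{\tau_p}^2 \left(P_p \mathbf{\Lambda}_{\tau_p} + \mathbf{I}_{\tau_p}\right)^{-1} \mathbf{U}_{\tau_p}^H, \tag{22}$$

where $\mathbf{U}_{\tau_p} = [\mathbf{u}_1, \ldots, \mathbf{u}_{\tau_p}]$ and $\mathbf{\Lambda}_{\tau_p} \in \mathbb{R}^{\tau_p \times \tau_p}$ is a diagonal matrix that contains the eigenvalues of $\mathbf{G}$ with $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{\tau_p}$. To keep the TS overhead limited, the energy in the channel that is associated with the eigenvectors $\mathbf{u}_{\tau_p + 1}, \ldots, \mathbf{u}_N$ of $\mathbf{G}$ is not used in the TS construction, and hence is not used in precoding. The trace is a linear operation that is invariant under cyclic permutations, i.e., $\mathrm{tr}(\mathbf{A}\mathbf{B}\mathbf{C}) = \mathrm{tr}(\mathbf{C}\mathbf{A}\mathbf{B})$, and therefore invariant to unitary rotations. Using (22) and this trace property, a simplified expression for $\mathrm{tr}(\mathbf{\Theta})$ is obtained as

$$\mathrm{tr}(\mathbf{\Theta}) = \sum_{i=1}^{\tau_p} \frac{P_p \lambda_i^2}{1 + P_p \lambda_i}, \tag{23}$$

where the eigenvalues are arranged as $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{\tau_p}$ with $\tau_p \leq N$, i.e., they are the $\tau_p$ largest eigenvalues of $\mathbf{G}$.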
Using the EVD of $\mathbf{G}$ in (3) and the simplified form of $\mathrm{tr}(\mathbf{\Theta})$ in (23), together with the trace property and straightforward algebra, the simplified analytical solution for the MSE of the MMSE estimate is given by

$$\mathrm{MSE} = \mathrm{tr}(\mathbf{G}) - \mathrm{tr}(\mathbf{\Theta}) = \sum_{i=1}^{N} \lambda_i - \sum_{i=1}^{\tau_p} \frac{P_p \lambda_i^2}{1 + P_p \lambda_i}. \tag{24}$$

The expression in (24) is valid for any channel correlation matrix. It shows explicitly how the MSE performance depends on the eigenstructure of the channel correlation matrix, which takes advantage of the strong eigendirections to improve the DCE.
Overall, the MSE closed-form solution in (24) is analytically tractable and, more importantly, straightforward to evaluate. The MSE expression relies on the TS length and the training power. In particular, the MSE expression in (24) indicates that increasing $\tau_p$ allows more training signal energy to be received and thus reduces the MSE of the DCE. However, with a limited coherence time, increasing $\tau_p$ would reduce the DL sum rate, because it leaves a shorter remaining time for transmitting data symbols, as will be explained later in Subsection III-D. Furthermore, unless the energy allocated to the training transmission is increased accordingly, the DCE suffers from a more severe noise contribution, reducing the SINR at the receiver.
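The dependence of the MSE on the TS length can be sketched numerically. The sketch assumes that the TS consists of the $\tau_p$ dominant eigenvectors of $\mathbf{G}$ with uniform power $P_p$, in which case the MSE equals $\mathrm{tr}(\mathbf{G}) - \mathrm{tr}(\mathbf{\Theta}) = \sum_{i \leq \tau_p} \lambda_i/(1 + P_p \lambda_i) + \sum_{i > \tau_p} \lambda_i$; the way (24) is written in the paper may differ. The geometric eigenvalue profile and the function name are hypothetical.

```python
def mse_closed_form(eigvals, tau_p, P_p):
    """MSE for a TS built from the tau_p dominant eigenvectors with power P_p.

    Trained eigenmodes contribute lambda / (1 + P_p * lambda);
    untrained eigenmodes contribute their full eigenvalue.
    """
    lam = sorted(eigvals, reverse=True)
    trained = sum(l / (1.0 + P_p * l) for l in lam[:tau_p])
    untrained = sum(lam[tau_p:])
    return trained + untrained


# Hypothetical correlated channel: geometric eigenvalue decay, normalized to tr(G) = N.
N, P_p = 16, 10.0
raw = [0.7 ** i for i in range(N)]
lam = [N * r / sum(raw) for r in raw]

mse_values = [mse_closed_form(lam, t, P_p) for t in range(N + 1)]
for t in (0, 2, 4, 8, 16):
    print(t, round(mse_values[t], 4))
```

With a strongly decaying eigenvalue profile, most of the MSE reduction comes from the first few training symbols, and further symbols yield diminishing returns.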
Remark 1. As discussed earlier in Section I, the vast majority of the previous studies on MIMO DCE have focused on finding the $\mathbf{X}_p$ and $\tau_p$ that minimize the MSE for a given training power $P_p$. This is conventionally deemed feasible since $N \ll \tau_c$ in conventional MIMO systems, so that choosing the TS length as $\tau_p \in \{1, 2, \ldots, \min(N, \tau_c)\}$ is not overwhelming. However, in mMIMO systems where a large number of antennas is used at the BS, $N$ could be much larger than $\tau_c$, which makes DCE problematic in FDD systems. Specifically, in view of (7), the MSE minimization approach with $\tau_p$ close to $N$ maximizes the term $\log_2(1 + \mathrm{SINR}_k)$. However, this comes at the expense of the pre-log factor $(\tau_c - \tau_p)/\tau_c$. As demonstrated by the numerical results in Section V, an optimization based on this minimum MSE criterion could lead in general to the choice $\tau_p = N$ and a vanishing achievable sum rate when $\tau_c$ is smaller than $N$. To maximize the DL achievable sum rate for different values of $N$ and $\tau_c$, both terms must be considered in the evaluation. To the best of our knowledge, the tradeoff between the MSE of the DCE and the achievable sum rate in FDD mMIMO systems with limited coherence time has not been considered in the literature. One of the essential contributions of the present paper is the investigation of the tradeoff between the MSE and the achievable sum rate with short coherence time, together with the observation that, for a given training power $P_p$ and structure of $\mathbf{X}_p$, maximizing the sum rate over the training length $\tau_p$ while accounting for both terms results in a feasible solution for DCE based mMIMO and provides the actual performance of this system. In particular, the achievable sum rate does not vanish when $N > \tau_c$, but instead can benefit from large antenna numbers. In summary, unlike state-of-the-art research where the performance is optimized by minimizing the MSE of the channel estimate, we focus on maximizing the achievable DL sum rate directly.
Our approach relaxes the MSE requirement, resulting in a feasible FDD mMIMO performance during DCE even when the coherence time is short. The following subsection discusses the maximization problem of the DL sum rate.

D. FORMULATION OF THE ACHIEVABLE SUM RATE MAXIMIZATION PROBLEM
Maximizing the DL achievable sum rate over imperfect DCE using a DL TS in an FDD mMIMO system equates to the optimization problem given in (25).
As mentioned previously, future wireless networks, e.g., the sixth generation (6G) [4], [62], aim to maximize the achievable data rate in order to meet the demand for high data traffic. As such, the achievable data rate is considered one of the most essential performance indicators for 6G networks [4], [62]. Therefore, this paper focuses on optimizing the downlink achievable data rate to fulfill the increasing demand for data traffic. The expression in (25) shows that maximizing the achievable sum rate depends essentially on the training sequence length and the channel coherence time. The SINR affects the achievable rate logarithmically, whereas the training sequence length and the channel coherence time affect the achievable sum rate linearly. The problem formulation in (25) is computationally demanding, because the expectations in (8) need to be evaluated using extensive Monte-Carlo simulations for different correlation models, different values of $N$, different precoders, and different TS lengths. To overcome this issue, a computationally efficient solution is obtained by using an asymptotic RMT method. In particular, the RMT method is used in this paper to obtain an asymptotic approximation of (8), and hence of the achievable sum rate, of an FDD mMIMO system as $N \to \infty$. As such, a simplified, low-complexity numerical evaluation of (25) is achieved. In the following section, we provide the analysis of the SINR using the RMT method.
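The interplay between the linear pre-log factor and the logarithmic SINR term can be illustrated with a small search over $\tau_p$. The sketch below is not the RMT analysis of this paper: it assumes a single user ($K = 1$), a TS built from the dominant eigenvectors of $\mathbf{G}$, and a toy matched-filter SINR proxy $P_d\,\mathrm{tr}(\mathbf{\Theta}) / (1 + P_d\,\mathrm{tr}(\mathbf{G}\mathbf{\Theta})/\mathrm{tr}(\mathbf{\Theta}))$. The eigenvalue profile, the powers and all function names are hypothetical.

```python
import math


def theta_eigs(lam, tau_p, P_p):
    """Eigenvalues of the estimate covariance for the tau_p trained eigenmodes."""
    return [P_p * l * l / (1.0 + P_p * l) for l in lam[:tau_p]]


def proxy_sinr(lam, tau_p, P_p, P_d):
    """Toy single-user matched-filter SINR proxy (an assumption, not the paper's result)."""
    th = theta_eigs(lam, tau_p, P_p)
    tr_theta = sum(th)
    if tr_theta == 0.0:
        return 0.0
    tr_g_theta = sum(l * t for l, t in zip(lam, th))  # zip stops at the trained modes
    return P_d * tr_theta / (1.0 + P_d * tr_g_theta / tr_theta)


def rate(lam, tau_p, tau_c, P_p, P_d):
    """Pre-log factor times log2(1 + SINR) for the toy proxy."""
    return (tau_c - tau_p) / tau_c * math.log2(1.0 + proxy_sinr(lam, tau_p, P_p, P_d))


# Hypothetical setup with more antennas than coherence symbols (N > tau_c).
N, tau_c, P_p, P_d = 64, 40, 10.0, 10.0
raw = [0.9 ** i for i in range(N)]
lam = [N * r / sum(raw) for r in raw]  # descending, tr(G) = N

candidates = range(1, min(N, tau_c) + 1)
best_tau = max(candidates, key=lambda t: rate(lam, t, tau_c, P_p, P_d))
print(best_tau, rate(lam, best_tau, tau_c, P_p, P_d), rate(lam, tau_c, tau_c, P_p, P_d))
```

In this toy setting, the rate-maximizing TS length is strictly shorter than the coherence block, whereas training over the whole block, which is where MSE minimization leads, leaves no symbols for data and yields zero rate.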

IV. SINR AND SUM RATE ANALYSIS USING RMT METHOD
This section presents analytical expressions for the SINR and the DL sum rate of an FDD mMIMO system based on the RMT method in [12], [50], [63]. In particular, a deterministic analytical approximation of the average SINR, expressed as $\gamma$, is determined as given in (26). Although the asymptotic expressions of the SINR for BF and RZF precoding are obtained under the assumption that $N \to \infty$, consistent with previous research on the large system limit [12], [63]-[66], our numerical results demonstrate that these analyses accurately approximate the SINR and the sum rate for finite values of $N$. The RMT method allows the SINR term $\gamma_k$ in (8) to be replaced with the asymptotic approximation given in this section, so that a straightforward system evaluation is achieved and the results can be readily reproduced. The following asymptotic analysis using the RMT method corresponds to an average over a $K$-user mMIMO system where the users have similar statistics at the BS and, as a result, similar average ergodic achievable rates.

Proposition 1. Let $\gamma_{\mathrm{BF}}$ denote the SINR for BF precoding. An asymptotic approximation of the SINR with BF precoding of an FDD mMIMO system with imperfect DCE is given in (27), where $\mathbf{\Theta}$ is the covariance matrix of the MMSE channel estimate, which is given in (22).
Unlike the SINR expression for BF precoding, the deterministic equivalent of the SINR for RZF precoding depends on several auxiliary parameters. These parameters arise from the use of the RMT method and must be solved before the SINR expression can be evaluated. The following proposition gives the analytical result for RZF precoding.

Proposition 2. Let $\gamma_{\mathrm{RZF}}$ denote the SINR of RZF precoding. A deterministic approximation of the SINR with the RZF precoder of an FDD mMIMO system with imperfect DCE is given in (28), where the variable $\bar{\varrho} \in \mathbb{R}$ denotes the analytical equivalent for RZF precoding and the parameter $\delta \in \mathbb{R}$ is determined using a fixed-point algorithm. To this end, define an integer $t$, with $t = 1, 2, \ldots,$

VOLUME 4, 2016
where the starting value of the parameter $\delta$ is given as $\delta^{(0)} = 1/\varsigma$, and the variable $\delta \in \mathbb{R}$ is obtained after the convergence of the fixed point in (30) is achieved.
After we obtain the solution of $\delta$ in (29) and (30), we substitute the solution into (31) to determine $\mathbf{T} \in \mathbb{C}^{N \times N}$. To this end, the matrix $\bar{\mathbf{T}} \in \mathbb{C}^{N \times N}$ is determined as in (32), with the variable $\bar{\delta} \in \mathbb{R}$ determined as in (33). The parameter $\bar{\varrho} \in \mathbb{R}$ denotes the precoding normalization of RZF precoding, which is determined by substituting the auxiliary matrices $\mathbf{T}$ and $\bar{\mathbf{T}}$ into (34). Finally, the parameter $\bar{\mu} \in \mathbb{R}$ is determined by utilizing the dominated convergence theorem and the continuous mapping theorem, and is obtained in (35), (36) and (37).
The SINR expression provided in (28) is the asymptotic approximation of the SINR, which tightly approximates the true SINR with RZF precoding. The analysis of RZF precoding is valid for any channel correlation model. It is worth pointing out that Proposition 1 and Proposition 2 are modified versions of the precoding theorems provided in [12]. In particular, Proposition 1 and Proposition 2 are obtained for the DCE based on the FDD operation mode in mMIMO systems and not for TDD systems. Also, these propositions are obtained for a single-cell scenario. The SINR approximations allow the numerical results to be directly reproduced and easily validated. In the following section, we present the analytical and simulated results of the BF and RZF precoding techniques based on different physical channel correlation models.
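The fixed-point step used for δ in Proposition 2 can be sketched numerically. The following is a minimal illustration only, assuming the common-covariance form of the deterministic equivalent in [12], namely δ = (1/N) tr(Θ T) with T = ((K/N) Θ/(1 + δ) + ς I_N)^{-1}. This update rule, the function name `solve_delta`, and the stopping tolerance are assumptions made for illustration and may differ from (29) and (30); only the starting value δ^(0) = 1/ς is taken from the text.

```python
import numpy as np

def resolvent(theta, num_users, reg, delta):
    """Assumed auxiliary matrix T(delta) = ((K/N) * Theta / (1 + delta) + reg * I)^-1."""
    n = theta.shape[0]
    return np.linalg.inv((num_users / n) * theta / (1.0 + delta) + reg * np.eye(n))

def solve_delta(theta, num_users, reg, tol=1e-12, max_iter=1000):
    """Iterate delta <- (1/N) tr(Theta T(delta)) from delta^(0) = 1/reg until it settles."""
    n = theta.shape[0]
    delta = 1.0 / reg  # starting value stated in the text
    for _ in range(max_iter):
        t_mat = resolvent(theta, num_users, reg, delta)
        new_delta = float(np.real(np.trace(theta @ t_mat))) / n
        if abs(new_delta - delta) < tol:
            # Recompute T at the converged value so both outputs are consistent.
            return new_delta, resolvent(theta, num_users, reg, new_delta)
        delta = new_delta
    raise RuntimeError("fixed-point iteration did not converge")
```

Under these assumptions, the case Θ = I_N with K/N = 0.5 and ς = 0.5 reduces the fixed-point equation to δ² = 2, so the iteration should settle at δ = √2, which gives a quick sanity check before using a correlated Θ.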

V. NUMERICAL RESULTS WITH DIFFERENT PHYSICAL CHANNEL CORRELATION MODELS
The Rayleigh fading model is considered a typical approach for modelling the covariance matrix [10], [67]. In this fading model, the channel coefficients are considered to be uncorrelated. However, the condition for the channel coefficients to be uncorrelated, which holds only in a rich scattering environment, is very strict. Furthermore, due to the insufficient spacing of the antenna elements and their spatially dependent radiation patterns, MIMO channel coefficients become subject to strong spatial correlation. In addition, field measurements of propagation environments have revealed that the MIMO channel coefficients are correlated [68]-[70]. Therefore, to obtain a realistic performance assessment of an FDD mMIMO system and to characterize the impact of spatial correlation on the DL achievable sum rate and MSE, three different correlation models are considered in this paper: the P-degrees of freedom (P-DoF) model, the exponential correlation model, and the one ring (OR) channel model. Specifically, the channel covariance matrix G is not assumed to be a scaled identity matrix. Instead, G describes the spatial propagation environment of the MIMO channels more realistically. It is worth noting that the P-DoF and OR models incorporate stochastic rank deficiency [71], while the exponential correlation model has a full-rank covariance matrix. In the following subsections, we present analytical and simulation results, which characterize FDD operation in mMIMO systems in terms of the MSE and the achievable sum rate for both BF and RZF precoding. A summary of the simulation parameters is provided in Table 1.

A. SUM RATE MAXIMIZATION AND MSE MINIMIZATION EVALUATION FOR THE P -DOF MODEL
Typically, channel correlation depends on the number of degrees of freedom in the channel, which can be much smaller than N. Hence, the channel can be decomposed into a small number of dimensions (P-dimensional), where P is the number of spatial directions, which can be relatively small compared to N. This subsection presents the P-DoF physical model as defined in [12], [72]-[74]. To this end, the covariance matrix G using the P-DoF model can be expressed as in (38), where P/N = c ∈ (0, 1]. The elements of the matrix V ∈ C^{K×P} are independent and identically distributed as CN(0, 1). Finally, the matrix A ∈ C^{N×P} in (38) can be constructed from P ≤ N columns of an N × N unitary matrix, so that A^H A = I_P. Clearly, G has rank P. If P < N, then the channel is stochastically rank deficient. Thus, the correlation parameter c controls the DoF in the channel. Specifically, c denotes the extent of correlation in the channel [12], [72]-[74]. Using the EVD of G defined in (3), the eigenvalue matrix Λ of the covariance matrix for the channel defined in (38) has its eigenvalues arranged as λ_1 = λ_2 = · · · = λ_P = 1/c and λ_{P+1} = · · · = λ_N = 0. Thus, the smaller the normalized degrees of freedom c, the more concentrated the channel energy is. Substituting the EVD given in (3) for the P-DoF model into (22) allows the MMSE to be expressed as in (39), where the matrix U_m ∈ C^{N×m} is formed from m eigenvectors of G. Under the P-DoF model, the channel has P ≤ N DoF and the training energy is P_p τ_p. Choosing τ_p > P while keeping the power P_p constant would lead to the same DCE as τ_p = P but would unnecessarily use more energy in the TS transmission. This is a special case in the P-DoF model.
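The eigenvalue structure described above can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the covariance to be G = (1/c) A A^H, which is the form implied by A^H A = I_P and by the eigenvalues λ_1 = · · · = λ_P = 1/c, and it draws A from the QR factorization of a random complex matrix. The function name `pdof_covariance` and the construction of A are hypothetical choices, not taken from the paper.

```python
import numpy as np

def pdof_covariance(n, p, rng):
    """Build an assumed P-DoF covariance G = (1/c) A A^H with c = P/N."""
    # Random N x N unitary matrix via QR; keep P of its columns as A.
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    a = q[:, :p]                      # satisfies A^H A = I_P
    c = p / n                         # normalized degrees of freedom
    g = (a @ a.conj().T) / c
    return g, a

# Example: N = 20 antennas and P = 4 spatial directions, so c = 0.2.
G, A = pdof_covariance(20, 4, np.random.default_rng(0))
eigenvalues = np.linalg.eigvalsh(G)   # ascending order, real for Hermitian G
```

With this construction, P eigenvalues equal 1/c and the remaining N − P are zero up to rounding, so tr(G) = N regardless of c: lowering c concentrates the same total energy in fewer dimensions.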
Substituting the DCE of the MMSE in (39) with the P-DoF model into (24) yields, after some straightforward algebraic manipulations, a novel analytical solution for the MSE in FDD massive MIMO systems, which is given in (40). The expression in (40) implies that the MSE in the P-DoF model depends on P, while the constraint in the achievable sum rate optimization problem is τ_p ≤ min(P, τ_c), which limits the TS length to the DoF in the channel instead of N. Clearly, the choice τ_p = P minimizes the MSE in (40) and yields exactly the same MSE of the channel estimates as UL channel estimation in TDD systems. With this choice, however, the loss in achievable rate due to channel training, captured by the factor (τ_c − τ_p)/τ_c, outweighs the SINR gain obtained from MSE minimization when P becomes sufficiently large. In what follows, the results are presented based on the P-DoF model, which is used to evaluate the achievable sum rate and the MSE of the DCE with BF and RZF precoding.
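The tension between the training loss factor (τ_c − τ_p)/τ_c and the SINR gain from longer training can be illustrated with a brute-force search over the TS length under the constraint τ_p ≤ min(P, τ_c). This is a sketch only: `toy_sinr` is a hypothetical saturating placeholder and is not the SINR in (27) or (28), the users are assumed statistically identical, and all function names are illustrative.

```python
import numpy as np

def toy_sinr(tau_p, gamma_max=20.0, kappa=10.0):
    """Hypothetical SINR that improves with training length and then saturates."""
    return gamma_max * tau_p / (tau_p + kappa)

def net_sum_rate(sinr_fn, tau_p, tau_c, num_users):
    """(tau_c - tau_p)/tau_c * K * log2(1 + SINR) for identical users."""
    prelog = (tau_c - tau_p) / tau_c      # fraction of the coherence block left for data
    return prelog * num_users * np.log2(1.0 + sinr_fn(tau_p))

def best_ts_length(sinr_fn, tau_c, dof, num_users):
    """Search integer TS lengths 1..min(dof, tau_c) for the largest net sum rate."""
    candidates = range(1, min(dof, tau_c) + 1)
    rates = [net_sum_rate(sinr_fn, t, tau_c, num_users) for t in candidates]
    best = int(np.argmax(rates))
    return candidates[best], rates[best]
```

Replacing `toy_sinr` with a function that evaluates the deterministic SINR for a given τ_p would turn the same search into a direct evaluation of the sum rate maximization criterion. Note that any SINR model gives zero net rate at τ_p = τ_c, which is the failure mode of spending the entire coherence block on training.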
[Figure caption, truncated: SNR = 10 dB and K = 10 users.]

The simulated plots for both the BF and RZF precoders are obtained using (8) and (9), respectively. In Fig. 2, the DL sum rate of the BF precoder increases gradually and saturates at ∼20 bit/s/Hz for N > 250. In contrast, the sum rate of the RZF precoder increases more steeply, reaching a peak of ∼60 bit/s/Hz at N = 200. For N > 200, the DL sum rate decreases until it reaches ∼52 bit/s/Hz at N = 500. The results show that the RZF precoder provides a considerable enhancement in the DL sum rate when the DL channels are spatially correlated. In addition, the results show accurate agreement between the analytical and simulated results.
To further investigate the agreement between the analytical and simulated results, Fig. 3 and Fig. 4 are provided. In particular, Fig. 3 and Fig. 4 show plots of the DL achievable sum rate versus N, comparing BF and RZF precoding in spatially correlated channels when the P-DoF model is used with c = 0.1 and K = 10 users, for τ_c = 100 symbols with P_p = P_d = 5 dB and for τ_c = 150 symbols with P_p = P_d = 15 dB, respectively. The analytical plots for the BF and RZF precoders are obtained based on (27) and (28), respectively. The simulated plots for the BF and RZF precoders are obtained using (8) and (9), respectively. The results show accurate agreement between the analytical and simulated results. The results also show that increasing the coherence time and the power results in a significant improvement in the achievable sum rate, as expected. The following evaluations are carried out based on the RMT method.

In what follows, we compare the downlink achievable sum rate, optimum TS length, and MSE when the proposed sum rate maximization and the conventional MSE minimization criteria are applied to an FDD mMIMO system using DCE. The normalized MSE (NMSE) per symbol is obtained by dividing the analytical expression in (24) by N. In Fig. 5, which depicts the achievable sum rate, the dashed curves for both types of precoders clearly exhibit the limitation of the MMSE criterion, which results in zero throughput when N ≥ τ_c = 100 symbols. That is, the whole coherence time is spent estimating the DL channel, which leaves no resources to transmit the payload data. For N < 100, the DL sum rates exhibit a maximum value and, as expected, RZF precoding achieves greater sum rates than BF precoding. In stark contrast, the solid curves for both types of precoders do not exhibit this limitation when the sum rate maximization criterion is used.
Instead, the DL sum rate of BF precoding saturates at ∼14 bit/s/Hz for N > 100, as expected, whereas RZF precoding achieves a peak sum rate of ∼47 bit/s/Hz at N = 98, which slowly decreases to ∼31 bit/s/Hz at N = 500. The sum rate results in Fig. 5 are corroborated by the optimum TS length results in Fig. 6. The dashed lines, which correspond to the MMSE criterion, clearly show that, for both precoders, τ*_p increases linearly with N up to τ*_p = τ_c = 100 symbols, remaining fixed at 100 thereafter. In contrast, the solid lines, which correspond to the maximum sum rate criterion, show that τ*_p increases linearly up to τ*_p [...] BF and RZF precoding under both optimization criteria for the same system parameters. Again, the dashed curves represent the MMSE criterion whereas the solid curves represent the maximum sum rate criterion. As expected, the normalized MSE for the MMSE criterion is always less than that for the maximum sum rate criterion. It is also worth noting that, for each criterion, the BF and RZF precoders exhibit essentially the same normalized MSE. This is because the optimum pilot lengths for the two precoders are closely related. The normalized MSE curves for the MMSE criterion can be divided into three distinct regions corresponding to N < 100, 100 < N < 335, and N > 335. For N < 100, the training power increases linearly with N and the powers of the K pilot sequences are quite unequal and, hence, appropriately matched to the correlated DL channel, resulting in a constant normalized MSE level. For 100 < N < 335, the normalized MSE increases slowly, which corresponds to a region where the training power has saturated and the powers of the K pilot sequences are becoming more equal. For N > 335, when the pilot powers are more or less equal, the normalized MSE grows rapidly, which suggests that the DCE process has become interference limited.
The normalized MSE curves for the maximum sum rate criterion exhibit only two distinct regions corresponding to N < 100, and 100 < N < 500. For N < 100, when the training power increases linearly with P , the normalized MSE level is constant albeit markedly higher than the normalized MSE obtained with the MMSE criterion. However, for N > 100, when both the pilot length and total power have saturated, the normalized MSE increases rapidly, which again suggests that the DCE process has become interference limited in this region. However, despite the rather poor MSE performance in this region, the overall sum rate remains viable, as the optimum pilot sequence length remains more or less constant at ∼34 symbols. The results in this subsection show that maximizing the DL sum rate, rather than minimizing the MSE of the SINR, leads to a feasible FDD mMIMO system performance when downlink channel state information with short coherence time is used.

B. SYSTEM PERFORMANCE FOR THE EXPONENTIAL CORRELATION MODEL
In this subsection, the covariance matrix is modeled based on the exponential correlation model, which provides a full-rank covariance matrix. To this end, the (m, n)th element of the channel covariance matrix G under the exponential model is given in Hermitian Toeplitz form [75], [76], where α is the correlation coefficient between adjacent antennas at the BS, with 0 ≤ |α| ≤ 1. The correlation factor α determines the eigenvalue spread of the channel covariance matrix, and increasing α leads to higher spatial correlation. The correlation coefficient is set to α = 0.9, which implies relatively strong channel correlation. Fig. 8 and Fig. 9 show plots of the DL sum rate using the MSE minimization criterion and of the normalized MSE versus TS length, respectively, comparing the performance of the BF and RZF precoders in correlated channels when the exponential model is used with α = 0.9, K = 10 users, P_p = P_d = 10 dB, and τ_c = 100 symbols. The results demonstrate that the DL achievable sum rates of the BF and RZF precoders are maximized with pilot lengths of 16 [...] performance. As such, choosing τ_p = 100 leads to the minimum MSE, as shown in Fig. 9. However, this spends the whole coherence time on downlink channel state information estimation, which leaves no time to transmit useful information symbols to the users. We aim to maximize the DL achievable sum rate by taking N and τ_c into account. To achieve this aim, we relax the MSE requirement in the following evaluations. In particular, Fig. 10, Fig. 11 and Fig. 12 show plots of the DL sum rate, the optimum TS length that maximizes the sum rate, and the normalized MSE versus N for both BF and RZF precoding when the achievable sum rate maximization criterion is used. All the system parameters remain unchanged from Fig. 8 and Fig. 9.
The results in this section demonstrate that a feasible DL sum rate can be obtained in FDD mMIMO systems with an optimum TS length of less than τ_c/2, even when the covariance matrix is not rank-deficient and the coherence time is short.

FIGURE 11. Optimum TS length τ*_p versus N using the exponential model with α = 0.9, τ_c = 100 symbols, P_p = P_d = 10 dB and K = 10 users (curves: BF, sum rate maximization; RZF, sum rate maximization).
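For readers who wish to reproduce the exponential correlation model of this subsection, the sketch below builds the covariance matrix under the commonly used Hermitian Toeplitz form, [G]_{m,n} = α^{n−m} for m ≤ n and its complex conjugate otherwise. This form is an assumption based on the standard exponential model and should be checked against the expression in [75], [76]; the function name `exponential_covariance` is illustrative.

```python
import numpy as np

def exponential_covariance(n, alpha):
    """Assumed exponential model: G[m, n] = alpha^(n-m) if m <= n, else conj(alpha^(m-n))."""
    idx = np.arange(n)
    diff = idx[None, :] - idx[:, None]        # column index minus row index
    powers = alpha ** np.abs(diff)            # alpha^|m - n|
    return np.where(diff >= 0, powers, np.conj(powers))

# Strong correlation, as used in this subsection (alpha = 0.9).
G_strong = exponential_covariance(8, 0.9)
# Weak correlation for comparison (alpha = 0.1 is an arbitrary illustrative value).
G_weak = exponential_covariance(8, 0.1)
spread_strong = np.linalg.cond(G_strong)      # ratio of largest to smallest eigenvalue
spread_weak = np.linalg.cond(G_weak)
```

For |α| < 1 the matrix is full rank, in contrast to the rank-deficient P-DoF covariance, and the eigenvalue spread grows with |α|.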

C. FDD OPERATION IN MMIMO SYSTEMS USING THE ONE RING MODEL
So far, the results have been obtained using the P-DoF and exponential models. In this subsection, we consider a more realistic physical model, the one ring model [77], which is used in the performance evaluation of the FDD mMIMO systems under consideration. This model is also used to corroborate the paper's findings in a more realistic setting. The scatterers in the one ring model are located on a ring around the user, as shown in Fig. 13. The covariance matrix G in the one ring model is determined following [77]. When the BS antenna elements are closely spaced, i.e., at half wavelength, and the amount of scattering around the user is limited, some of the eigenvalues of the channel covariance matrix become close to zero, which makes G rank deficient. Fig. 14 and Fig. 15 plot the achievable sum rate and the optimum TS length versus N, comparing correlated and relatively uncorrelated mMIMO channels in the one ring model. The parameter values θ = π/6, ω = 5° and D = 1/2 are chosen to represent relatively strong correlation, while ω = 25° and D = 1 correspond to relatively weak channel correlation. Results are provided for BF and RZF precoding with τ_c = 100 symbols, K = 10 users, and P_p = P_d = 10 dB. The solid and dotted lines represent BF and RZF precoding when channels are uncorrelated while the solid lines denote the performance in the correlated channels. The results show that both the BF and RZF precoders provide a marked enhancement in the DL sum rate of FDD systems when the channel is correlated. This implies that DCE for an FDD mMIMO system can be more effectively exploited in correlated channels, which could allow for compact antenna form factors in an FDD mMIMO system, especially one operating at millimetre wave frequencies. Fig. 14 and Fig. 15 demonstrate the efficacy of the one ring model, which can be used to predict the performance of FDD operation in mMIMO systems in a more realistic setting.
It is worth noting that the stepped curves occur because the optimal pilot sequence length τ*_p is quantized and does not increase linearly as a function of N.
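The rank deficiency discussed in this subsection can be illustrated with a numerical sketch of a one ring covariance matrix. The code assumes the widely used uniform linear array form [G]_{m,n} = (1/(2ω)) ∫_{−ω}^{ω} exp(j2πD(m − n) sin(θ + a)) da, evaluated with a midpoint rule; this expression, its sign convention, and the helper names are assumptions and may differ from the definition in [77]. Here θ is the angle of arrival, ω the angular spread, and D the antenna spacing in wavelengths.

```python
import numpy as np

def one_ring_covariance(n, theta, omega, spacing, num_angles=2000):
    """Average of steering-vector outer products over angles in [theta - omega, theta + omega]."""
    # Midpoint grid of angular offsets in (-omega, omega).
    offsets = (np.arange(num_angles) + 0.5) / num_angles * 2.0 * omega - omega
    phase = 2j * np.pi * spacing * np.outer(np.arange(n), np.sin(theta + offsets))
    steering = np.exp(phase)                       # shape (n, num_angles)
    return steering @ steering.conj().T / num_angles

def effective_rank(g, energy=0.99):
    """Number of largest eigenvalues needed to capture the given fraction of tr(G)."""
    eig = np.clip(np.sort(np.linalg.eigvalsh(g))[::-1], 0.0, None)
    cumulative = np.cumsum(eig) / np.sum(eig)
    return int(np.searchsorted(cumulative, energy) + 1)

# Parameter values quoted in this subsection, with N = 64 chosen for illustration.
G_corr = one_ring_covariance(64, np.pi / 6, np.deg2rad(5.0), 0.5)    # strong correlation
G_weak = one_ring_covariance(64, np.pi / 6, np.deg2rad(25.0), 1.0)   # weak correlation
rank_corr = effective_rank(G_corr)
rank_weak = effective_rank(G_weak)
```

Because the matrix is built as an average of outer products, it is Hermitian and positive semidefinite by construction, with unit diagonal. Under these assumptions the narrow-spread, half-wavelength case concentrates its energy in far fewer eigenvalues than the wide-spread case.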

VI. CONCLUSIONS
This paper investigated the tradeoff between sum rate maximization and MSE minimization in downlink FDD mMIMO systems. A feasible downlink training sequence design has been obtained, which can be used to maximize the downlink sum rate of FDD mMIMO systems. This paper also characterized the FDD mMIMO performance with BF and RZF precoding under different correlation models. The findings of this paper are supported by theoretical analyses, which agree closely with the simulated results. These analyses are used to provide a straightforward system design methodology and to underpin our contributions. The results showed that a feasible downlink sum rate can be obtained in FDD mMIMO systems when a finite training sequence length is used in the downlink channel state information estimation. In particular, this paper showed explicitly that the performance loss caused by the higher MSE of a shorter training sequence is negligible in comparison to the gain from having more data transmission symbols. Our finding is based on maximizing the total sum rate of the system instead of minimizing the MSE of the SINR only, which is typically considered in conventional MIMO systems or in the TDD mode of operation. This paper demonstrated that the downlink sum rate, defined in (25), depends on two opposing quantities, log_2(1 + γ_k) and (τ_c − τ_p)/τ_c. The actual FDD mMIMO performance can only be predicted by maximizing ((τ_c − τ_p)/τ_c) Σ_{k=1}^{K} log_2(1 + γ_k). The results showed that an optimum training sequence length of τ*_p < τ_c/2 can be obtained even when the channels are relatively uncorrelated. The proposed design paradigm provides a realistic characterization of FDD mMIMO performance with single-stage digital precoding. This finding enables an FDD mMIMO system to be implemented in high frequency bands, such as mmWave and THz communications, with short coherence time.