On the Energy Efficiency of Multi-Cell Massive MIMO With Beamforming Training

This paper is concerned with a multi-cell downlink (DL) massive multiple-input multiple-output (MIMO) system operating over spatially correlated Rician fading channels. We not only consider estimating channel state information (CSI) at the base station (BS), but also adopt beamforming training (BT) to obtain CSI at the users. With maximum-ratio transmission (MRT) or zero-forcing (ZF) precoding employed at the BS, we derive closed-form expressions of the sum spectral efficiency for both cases: with and without BT. Based on these expressions, we investigate the effect of the DL pilot length on the system performance for MRT and ZF precoding, with and without BT, and determine the ranges of the DL pilot length in which using BT leads to better system performance. To address the energy efficiency (EE) maximization problem under the constraints of a given sum spectral efficiency and a maximum total DL transmit power, we transform the problem into a geometric program (GP), which can be solved efficiently. In particular, we develop an iterative power allocation algorithm for the system with the BT scheme. Simulation and numerical results demonstrate that the proposed power allocation algorithm improves the system EE. Numerical results also show that, when the BS uses MRT precoding, BT considerably improves the sum spectral efficiency in the high signal-to-noise ratio (SNR) region.


I. INTRODUCTION
Multiple-input multiple-output (MIMO) systems equipped with a very large number of antennas, commonly referred to as massive MIMO or large-scale MIMO, have been recognized as an important technology for current and future generations of wireless communications. The use of a large number of antennas enables energy concentration in a much narrower beam, which leads to much improved received signal strength as well as other advantages, such as improved energy and spectrum efficiencies, and the elimination of thermal noise and fading effects [1]-[5]. In massive MIMO, the channel vectors are nearly pairwise orthogonal, and hence, linear signal processing schemes, such as maximum ratio combining/maximum ratio transmission (MRC/MRT), equal-gain combining/equal-gain transmission (EGC/EGT), zero-forcing (ZF), and minimum mean-square error (MMSE), are nearly optimal [1]-[9].
The associate editor coordinating the review of this manuscript and approving it for publication was Liang Yang.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
With respect to the uplink (UL) and downlink (DL) signal processing stages of a massive MIMO system, channel state information (CSI) at the base station (BS) and user terminals usually plays a key role in maximizing the network throughput [10]. As an alternative to frequency-division duplex (FDD) mode, time-division duplex (TDD) mode is more suitable for large-scale MIMO systems. An important and distinctive feature of a TDD system is channel reciprocity, where the reverse channel can be used as an estimate of the forward channel. For a DL massive MIMO system, the BS usually performs DL precoding based on the CSI obtained by estimating the UL channel [1], [2], [10]. However, channel reciprocity only provides CSI at the BS. Acquiring CSI at the users through conventional DL training is impractical in a massive MIMO system, since the required DL pilot length is proportional to the number of antennas at the BS. Fortunately, references [11]-[19] propose a practical solution, called beamforming training (BT), that allows users to obtain DL CSI in a massive MIMO system. In particular, with BT, the length of the DL pilots is determined by the number of users and is independent of the number of antennas at the BS. In [11], [12], the rate performance of a single-cell multiuser massive MIMO system with the BT scheme is studied, and [11] proposes a resource allocation strategy to maximize the spectral efficiency for the MRT or ZF precoder. In [13], [14], the authors study a multicell multiuser massive MIMO system under Rayleigh fading using BT and derive achievable rate expressions when the BS uses MRT and ZF processing. In addition, [14] designs a BT and pilot contamination precoding (BT-PCP) scheme to alleviate the effect of pilot contamination.
Simulation results also show that, compared with the conventional PCP method, the proposed BT-PCP scheme improves the spectral efficiency to the extent that it closely approaches the spectral efficiency obtained with perfect CSI. Considering that Rician fading is a more general fading model than Rayleigh fading, the performance of a multicell multiuser massive MIMO system over Rician fading using BT has been obtained in [15]. For a distributed large-scale MIMO system operating over Rayleigh fading channels, references [16], [17] investigate the benefits of the BT scheme and study how the number of antennas and the length of the coherence interval affect the spectral efficiency of this system. Based on analytical results, the authors also find that the BT scheme works much more effectively for a distributed massive MIMO system than for a co-located one. In [18], [19], the authors mainly focus on the spectral efficiency of a cell-free massive MIMO system with the BT scheme, and propose max-min fairness power control and greedy pilot assignment methods to improve the spectral efficiency of such a system.
Previous research works on massive MIMO systems consider uncorrelated Rician fading channels [20], [21], which do not take into account the effect of channel correlation commonly present in practical wireless propagation environments. Under spatially uncorrelated Rician fading channels, the rate performance in single-cell and multi-cell UL massive MIMO systems with MRC processing is studied in [20] and [21], respectively. However, in reality, a MIMO channel is generally correlated because the antennas are not sufficiently well separated or the propagation environment does not offer rich enough scattering. Considering the finite number of scattering clusters [22], recent papers study massive MIMO under a more realistic assumption of spatially correlated Rician fading [23]-[26]. Specifically, references [23] and [24] study the performance of multi-cell massive MIMO systems and derive the statistical properties of the MMSE, element-wise MMSE (EM-MMSE), and least-square (LS) channel estimates when the BS uses MRC and MRT to perform UL detection and DL precoding, respectively. For other linear processing methods, such as multi-cell MMSE (MMMSE) and linear MMSE (LMMSE) processing, references [25] and [26] derive closed-form asymptotic approximations of the UL spectral efficiency and show how the line-of-sight (LOS) components and pilot contamination affect the overall system performance in multicell UL massive MIMO systems.
The above works are mainly concerned with the rate performance of a massive MIMO system. On the other hand, given the need to reduce power consumption and the growing environmental concerns, energy efficiency (EE) has become a critical design metric for green communication systems. EE optimization is therefore a highly relevant problem for massive MIMO systems, as evidenced by a variety of works on the topic [27]-[31]. For a massive MIMO full-duplex relay system, the authors in [27] propose a power allocation algorithm between the source and relay to maximize the EE, subject to a desired sum spectral efficiency and peak power constraints. For a MIMO two-way relay network with simultaneous wireless information and power transfer, the authors in [28] present two optimization algorithms to maximize the worst-case EE: an iterative algorithm based on the weighted minimum mean-square error method, and a channel diagonalization algorithm based on the generalized singular value decomposition. For a multicell massive MIMO network, an algorithm is proposed in [29] to optimize the number of activated antennas together with power allocation and pilot assignment. Numerical results in [29] show that, compared to the conventional pilot assignment scheme, the proposed algorithm significantly improves both the sum spectral efficiency and EE.
Different from the majority of existing works, [13]-[15] consider the application of BT in a multi-cell massive MIMO system over Rayleigh or Rician fading channels. On the other hand, [23]-[26] examine the rate performance of a multicell massive MIMO system under spatially correlated Rician fading channels, but without BT. Given that only a limited number of works address the application of BT to multicell massive MIMO systems over spatially correlated Rician fading channels, this paper carries out an extensive study on whether BT is beneficial to such systems. While the works in [13]-[15] and [23]-[26] mainly focus on the system's spectral efficiency and do not consider EE, our paper also investigates the EE optimization problem for a multicell massive MIMO system over spatially correlated Rician fading channels with the BT scheme. In summary, the main contributions of this paper are as follows:
• For the multi-cell DL massive MIMO system operating over spatially correlated Rician fading channels, when the BS uses MRT or ZF precoding to process the DL transmission of data and pilots, we derive closed-form expressions of the sum spectral efficiency for both cases, with and without BT.
• Making use of Bernoulli's inequality and based on the obtained expressions of the sum spectral efficiency, we establish the relationship between the length of the DL pilot and the system's performance, with and without BT, for MRT and ZF precoding. Whether BT is beneficial is shown to depend on the length of the DL pilot.
• For the multi-cell DL massive MIMO system with the BT scheme, we develop a power allocation algorithm for assigning the DL transmit power in order to maximize the EE for a desired sum spectral efficiency and a given maximum total DL transmit power. With this algorithm, the original challenging optimization problem is transformed into a sequence of geometric programs (GPs), each of which can be reformulated as a convex problem and solved using standard convex optimization tools. Numerical results demonstrate that, compared to the conventional uniform power allocation, the proposed power allocation algorithm significantly improves the EE of a massive MIMO system.
Notation: Matrices and vectors are represented by boldface upper-case and lower-case letters, respectively. We use $\mathrm{trace}(\cdot)$, $(\cdot)^*$, $(\cdot)^T$ and $(\cdot)^H$ to denote the trace, conjugate, transpose and conjugate-transpose operations of matrices, respectively. $\mathbf{I}_N$ denotes an $N \times N$ identity matrix and $[\mathbf{A}]_{mn}$ gives the $(m, n)$th entry of $\mathbf{A}$. The smallest element, largest element, real part, expectation, variance, and covariance operators are denoted by $\min\{\cdot\}$, $\max\{\cdot\}$, $\mathrm{Re}\{\cdot\}$, $\mathbb{E}\{\cdot\}$, $\mathrm{Var}(\cdot)$ and $\mathrm{Cov}\{X, Y\}$, respectively. In addition, $\delta(n, j)$ is the delta function, defined as $\delta(n, j) = 1$ if $n = j$, and $\delta(n, j) = 0$ if $n \neq j$. Finally, we use $\mathbf{z} \sim \mathcal{CN}(\mathbf{a}, \mathbf{A})$ to denote a complex Gaussian vector $\mathbf{z}$ with mean vector $\mathbf{a}$ and covariance matrix $\mathbf{A}$.

A. CHANNEL MODEL
We consider a massive MIMO system with $L$ cells, where each cell consists of one BS equipped with $M$ equally spaced, omnidirectional antennas that serves $N$ randomly located user equipments. User equipments (e.g., micro mobile devices, mobile phones and tablets) can usually accommodate only a few antennas because of their limited size. As in [32], [33], we therefore assume that each user equipment has a single antenna. The matrix $\mathbf{H}_{il} = [\mathbf{h}_{il1}, \cdots, \mathbf{h}_{ilN}]$ is the MIMO channel matrix between the BS in the $i$th cell and the $N$ users in the $l$th cell, where $\mathbf{h}_{iln}$ denotes the $n$th column vector of $\mathbf{H}_{il}$. The correlated Rician fading channel from the $n$th user equipment in the $l$th cell to the BS in the $i$th cell is modeled as [23]-[26]

$$\mathbf{h}_{iln} = \sqrt{\beta_{iln}} \left( \sqrt{\frac{1}{1+k_{iln}}}\, \boldsymbol{\Theta}_{iln}^{1/2} \mathbf{g}_{iln} + \sqrt{\frac{k_{iln}}{1+k_{iln}}}\, \bar{\mathbf{g}}_{iln} \right) \quad (1)$$

where $\beta_{iln}$ accounts for the large-scale channel fading effect from the $n$th user in the $l$th cell to the BS in the $i$th cell, whereas the term in parentheses represents the small-scale channel fading effect. In essence, $\beta_{iln}$ models the geometric attenuation and shadow fading, which is assumed constant over many coherence time intervals and known a priori [1]. The small-scale fading term consists of a random component $\mathbf{g}_{iln} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ that represents the scattered (non-LOS) signal and a deterministic component $\bar{\mathbf{g}}_{iln}$ that captures the specular (LOS) signal. The quantity $k_{iln} \geq 0$ is the Rician $K$-factor, i.e., the power ratio of the LOS and non-LOS components from the $n$th user in the $l$th cell to the BS in the $i$th cell. Furthermore, the channel correlation matrix from the $n$th user in the $l$th cell to the BS in the $i$th cell is denoted by $\boldsymbol{\Theta}_{iln}$. The movement of users may cause a phase shift of the LOS component. However, this phase shift is identical for all BS antennas and can therefore be tracked accurately in practice. Hence, we follow [23]-[26] and build the channel model between the BS and the user without considering the phase shift (i.e., a LOS path with static phase). Finally, for notational convenience, we define

$$\bar{\mathbf{h}}_{iln} = \sqrt{\frac{\beta_{iln} k_{iln}}{1+k_{iln}}}\, \bar{\mathbf{g}}_{iln}, \quad (2)$$

$$\mathbf{R}_{iln} = \frac{\beta_{iln}}{1+k_{iln}}\, \boldsymbol{\Theta}_{iln}. \quad (3)$$

Therefore, we have $\mathbf{h}_{iln} \sim \mathcal{CN}(\bar{\mathbf{h}}_{iln}, \mathbf{R}_{iln})$, where $\bar{\mathbf{h}}_{iln}$ describes the LOS component and $\mathbf{R}_{iln}$ is a positive semidefinite covariance matrix representing the spatial correlation characteristics of the NLOS component.
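As an illustration of the channel model in this subsection, the following NumPy sketch draws one channel realization as a large-scale gain times the sum of a deterministic LOS part and a correlated scattered part, and returns the corresponding mean and covariance. The function names, the argument `theta` for the correlation matrix, and the use of a Hermitian matrix square root are assumptions made for this sketch, not the authors' simulation code.

```python
import numpy as np

def sample_rician_channel(beta, k, theta, g_bar, rng):
    """Draw one correlated Rician channel vector (illustrative sketch).

    beta  : large-scale fading coefficient (scalar)
    k     : Rician K-factor (scalar, k >= 0)
    theta : M x M positive semidefinite correlation matrix (assumed name)
    g_bar : length-M deterministic LOS vector
    rng   : numpy.random.Generator
    """
    M = g_bar.shape[0]
    # Hermitian square root of the correlation matrix via eigendecomposition.
    w, V = np.linalg.eigh(theta)
    theta_sqrt = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T
    # Scattered component with i.i.d. CN(0, 1) entries.
    g = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2.0)
    los = np.sqrt(k / (1.0 + k)) * g_bar
    nlos = np.sqrt(1.0 / (1.0 + k)) * (theta_sqrt @ g)
    return np.sqrt(beta) * (los + nlos)

def channel_statistics(beta, k, theta, g_bar):
    """Mean (LOS part) and covariance (NLOS part) implied by the model."""
    h_bar = np.sqrt(beta * k / (1.0 + k)) * g_bar
    R = beta / (1.0 + k) * theta
    return h_bar, R
```

Averaging many draws of `sample_rician_channel` should reproduce the pair returned by `channel_statistics`, which is a quick sanity check on any implementation of the model.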

B. MMSE CHANNEL ESTIMATION
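Each BS needs to acquire CSI for signal processing. Therefore, $\tau_p$ samples are reserved for UL pilot-based channel estimation in each coherence block. During the channel estimation phase, the users of different cells simultaneously transmit the same set of orthogonal UL pilot sequences of $\tau_p \geq N$ symbols, which can be stacked in an $N \times \tau_p$ matrix $\sqrt{\tau_p p_p}\,\mathbf{C}$ that satisfies $\mathbf{C}\mathbf{C}^H = \mathbf{I}_N$. The received pilot signal $\mathbf{Y}_i^p \in \mathbb{C}^{M \times \tau_p}$ at the BS of the $i$th cell is [23]-[25]

$$\mathbf{Y}_i^p = \sqrt{\tau_p p_p} \sum_{l=1}^{L} \sum_{j=1}^{N} \mathbf{h}_{ilj}\, \mathbf{c}_j + \mathbf{N}_i^p \quad (4)$$

where $p_p$ is the transmit power of the UL pilot sequences, $\mathbf{c}_j$ is the $j$th row vector of the UL orthogonal pilot matrix $\mathbf{C}$, and the entries of the receive noise matrix $\mathbf{N}_i^p \in \mathbb{C}^{M \times \tau_p}$ are independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ random variables. To estimate the channel $\mathbf{h}_{iln}$, the received pilot signal $\mathbf{Y}_i^p$ is correlated with the pilot sequence of the $n$th user and normalized, which gives

$$\mathbf{y}_{in}^p = \frac{1}{\sqrt{\tau_p p_p}}\, \mathbf{Y}_i^p \mathbf{c}_n^H = \sum_{l=1}^{L} \mathbf{h}_{iln} + \frac{1}{\sqrt{\tau_p p_p}}\, \mathbf{N}_i^p \mathbf{c}_n^H. \quad (5)$$

Since the LOS component changes slowly, the BS can estimate the LOS component very accurately with negligible signaling overhead. Therefore, $\bar{\mathbf{h}}_{iln}$ is assumed to be known at both the BS and the users. Similarly, $\mathbf{R}_{iln}$ varies slowly compared to the channel coherence time and is also assumed to be perfectly known to the BS and the users. Based on the processed received pilot signal in (5), the BS can apply the MMSE estimation method to obtain an estimate of $\mathbf{h}_{iln}$ as follows [23]-[25]:

$$\hat{\mathbf{h}}_{iln} = \bar{\mathbf{h}}_{iln} + \mathbf{R}_{iln} \boldsymbol{\Psi}_{in} \left( \mathbf{y}_{in}^p - \bar{\mathbf{y}}_{in}^p \right) \quad (6)$$

where $\bar{\mathbf{y}}_{in}^p = \sum_{l=1}^{L} \bar{\mathbf{h}}_{iln}$ and $\boldsymbol{\Psi}_{in} = \left( \sum_{l=1}^{L} \mathbf{R}_{iln} + \frac{1}{\tau_p p_p} \mathbf{I}_M \right)^{-1}$. The true and estimated channel vectors are related as $\mathbf{h}_{iln} = \hat{\mathbf{h}}_{iln} + \boldsymbol{\varepsilon}_{iln}$, where $\boldsymbol{\varepsilon}_{iln}$ is the estimation error vector. With MMSE estimation, the estimate $\hat{\mathbf{h}}_{iln}$ is statistically independent of the estimation error $\boldsymbol{\varepsilon}_{iln}$, and their distributions are given as [23]-[25] $\hat{\mathbf{h}}_{iln} \sim \mathcal{CN}(\bar{\mathbf{h}}_{iln}, \boldsymbol{\Phi}_{iln})$ and $\boldsymbol{\varepsilon}_{iln} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{iln} - \boldsymbol{\Phi}_{iln})$, where $\boldsymbol{\Phi}_{iln} = \mathbf{R}_{iln} \boldsymbol{\Psi}_{in} \mathbf{R}_{iln}$, $\forall i, l, n$. For notational convenience, we also define

C. DOWNLINK SPECTRAL EFFICIENCY WITH CONVENTIONAL SCHEME (WITHOUT BEAMFORMING TRAINING)
Section II.B explained how the BS can obtain the UL CSI by processing the pilot sequences from the users. Based on the channel reciprocity of the TDD mode, the BS can use the estimated CSI to precode the DL transmission signal as follows. First, the received signal at the $n$th user of the $i$th cell is given as [14] where the diagonal matrix $\mathbf{P}_l = \mathrm{diag}\{p_{l1}, \cdots, p_{lN}\} \in \mathbb{C}^{N \times N}$ consists of the DL data transmit powers for all users in the $l$th cell, $\mathbf{x}_l = [x_{l1}, \cdots, x_{lN}]^T$ is the vector of transmit data symbols for the users in the $l$th cell with $\mathbb{E}\{\mathbf{x}_l\} = \mathbf{0}$ and $\mathbb{E}\{\mathbf{x}_l \mathbf{x}_l^H\} = \mathbf{I}_N$, $n_{in} \sim \mathcal{CN}(0, 1)$ represents the receiver noise at the $n$th user of the $i$th cell, $\mathbf{h}_{lin}$ denotes the $n$th column vector of $\mathbf{H}_{li}$, and $\mathbf{F}_l \in \mathbb{C}^{M \times N}$ is the precoding matrix of the $l$th cell.
Next, we consider two common linear precoding methods for massive MIMO systems, namely MRT processing $\mathbf{F}_l = \mathbf{F}_l^{\mathrm{MRT}}$ and ZF processing $\mathbf{F}_l = \mathbf{F}_l^{\mathrm{ZF}}$. They are defined, respectively, as [15] where $\hat{\mathbf{H}}_{ll} = [\hat{\mathbf{h}}_{ll1}, \cdots, \hat{\mathbf{h}}_{llN}]$, and the values of $\alpha_{l,\mathrm{MRT}}$ and $\alpha_{l,\mathrm{ZF}}$ are chosen to satisfy the power constraint at the BS, i.e., $\mathbb{E}\{\mathrm{trace}(\mathbf{F}_l \mathbf{F}_l^H)\} = 1$ [15]. Therefore, we have Therefore, the expression in (13) can be expanded as where For MRT or ZF precoding, when each user detects the received signal based only on the statistical CSI, a lower bound on the achievable rate (without BT) for the $n$th user of the $i$th cell is given in (21). Substituting (21) into (22), the DL sum spectral efficiency for MRT and ZF precoding without BT is given as [10] where $\diamond \in \{\mathrm{MRT}, \mathrm{ZF}\}$ corresponds to MRT or ZF precoding, and $T$ is the coherence time of the channel.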
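The two precoders discussed in this subsection can be sketched as follows. The code assumes the standard TDD forms, conjugate beamforming for MRT and a channel-inverting precoder for ZF, with the DL channel taken as the transpose of the estimated UL channel. It also normalizes each realization so that trace(F F^H) = 1, whereas the paper imposes the power constraint in expectation. These forms and the function names are assumptions for illustration.

```python
import numpy as np

def mrt_precoder(H_hat):
    """MRT sketch: F proportional to conj(H_hat), with H_hat of size M x N."""
    F = H_hat.conj()
    # Per-realization normalization (the paper normalizes in expectation).
    return F / np.sqrt(np.trace(F @ F.conj().T).real)

def zf_precoder(H_hat):
    """ZF sketch: F proportional to conj(H_hat) (H_hat^T conj(H_hat))^{-1}."""
    Hc = H_hat.conj()
    F = Hc @ np.linalg.inv(H_hat.T @ Hc)
    return F / np.sqrt(np.trace(F @ F.conj().T).real)

def effective_channel(H_hat, F):
    """N x N effective DL channel seen through the precoder, H_hat^T F."""
    return H_hat.T @ F
```

With ZF, `effective_channel` is a scaled identity matrix, so the interference among the users of the same cell vanishes with respect to the estimated channel. With MRT, the off-diagonal entries remain and shrink only relative to the diagonal as M grows.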

D. DOWNLINK SPECTRAL EFFICIENCY WITH BEAMFORMING TRAINING
For a massive MIMO system operating in the FDD mode, references [34]-[36] investigate the system's performance based on a beam-domain (BD) DL training approach. Different from [34]-[36], we focus on the performance of a massive MIMO system operating in the TDD mode. For a conventional DL massive MIMO system operating in the TDD mode, the transmission is usually divided into two phases: the UL pilot training phase and the DL data transmission phase. In the UL pilot training phase, the users transmit the UL pilot sequences to the BS, and the BS estimates the UL CSI by processing the received pilot signal. By exploiting the channel reciprocity of the TDD setting, the BS first uses the estimated UL CSI to precode the DL data symbols and then transmits the precoded signals to the users in the DL data transmission phase. However, the conventional DL transmission has the disadvantage that the users cannot obtain the instantaneous DL CSI, which degrades the detection of the desired signal at the users. To overcome this problem, we consider the BT scheme, in which a third phase is allocated for the BS to transmit a DL pilot matrix so that the users can obtain the instantaneous DL CSI. Specifically, all BSs of different cells synchronously transmit the same set of DL orthogonal pilot sequences during the BT phase (DL pilot training phase), and each user estimates the effective CSI $s_{ilnj}$ based on the received DL pilot matrix. We define $\boldsymbol{\Omega} \in \mathbb{C}^{N \times \tau_d}$, $\tau_d \geq N$, as the DL orthogonal pilot sequences, whose rows are pairwise orthogonal, i.e., $\boldsymbol{\Omega} \boldsymbol{\Omega}^H = \mathbf{I}_N$. It follows that the received DL pilot matrix at the users of the $i$th cell can be expressed as [14] where $\mathbf{N}_i \in \mathbb{C}^{N \times \tau_d}$ is the receiver noise matrix of the $i$th cell and the entries of $\mathbf{N}_i$ are i.i.d. $\mathcal{CN}(0, 1)$ random variables.
Multiplying the received DL pilot signal at the users of the $i$th cell, $\mathbf{Y}_i$, by $\boldsymbol{\Omega}^H$ gives [37] where $\tilde{\mathbf{N}}_i = \mathbf{N}_i \boldsymbol{\Omega}^H$ and the entries of $\tilde{\mathbf{N}}_i$ are i.i.d. $\mathcal{CN}(0, 1)$ random variables due to the pairwise orthogonality of the DL pilot matrix $\boldsymbol{\Omega}$.
Based on (24), the DL pilot vector at the $n$th user of the $i$th cell can be rewritten as where $\mathbf{s}_{iln} = [s_{iln1}, \cdots, s_{ilnN}]$ and $\tilde{\mathbf{n}}_{in}$ is the $n$th row of $\tilde{\mathbf{N}}_i$. According to [38], the entries of $\mathbf{s}_{iln}$ can be estimated independently, and the linear MMSE estimate of $s_{ilnj}$ is given by where $\tilde{y}_{in,j}$ is given by and $\tilde{n}_{in,j}$ is the $j$th element of $\tilde{\mathbf{n}}_{in}$.
From (26), we also obtain the estimation error $\mu_{ilnj} = s_{ilnj} - \hat{s}_{ilnj}$. By the properties of linear MMSE estimation, $\hat{s}_{ilnj}$ and $\mu_{ilnj}$ are uncorrelated, but not independent. Since $s_{ilnj} = \hat{s}_{ilnj} + \mu_{ilnj}$, (19) can be rewritten as (28). Based on (28), when the BS uses MRT or ZF precoding to process the transmitted signal, a lower bound on the DL achievable rate at the $n$th user of the $i$th cell with BT is given in (29). In (29), $\mathbb{E}\{|\mu_{ilnj}|^2\}$, $\forall i, l, n, j$, is given by (92) and (107) for MRT and ZF precoding, respectively. The lower bound in (29) is obtained by assuming Gaussian inputs and treating the interference and noise as worst-case uncorrelated additive noise. A similar methodology has been used in [12], [14] for massive MIMO systems.
Substituting (29) into (30), the DL sum spectral efficiency for MRT and ZF precoding with BT can be given as [14] where ♦ ∈ {MRT, ZF} corresponds to MRT or ZF precoding, and T is the coherence time of the channel.
Since (29) is not in closed form, it is difficult to use in the theoretical analysis. Therefore, using the technique of Lemma 2 in [14], (29) is approximated by (31). The approximation of Lemma 2 in [14] becomes more accurate as the number of antennas at the BS grows. Substituting (31) into (32), the sum spectral efficiency for MRT and ZF precoding with BT can be approximated as [14]:

III. DOWNLINK SPECTRAL EFFICIENCY OF MRT PRECODING: WITH AND WITHOUT BT
By calculating the expectation of each term in (31) and (21), and plugging the obtained results into (32) and (22), the expressions of the sum spectral efficiency for MRT processing, with and without BT, are given in Theorem 1 and Theorem 2, respectively. Based on Theorem 1 and Theorem 2, we obtain three corollaries concerning the relationship between the length of the DL pilot and the rate performance of a massive MIMO system with and without BT.
Theorem 1: For MRT precoding, the sum spectral efficiency with BT is given as with $a_{ilnj}$ and $\alpha_{l,\mathrm{MRT}}$ being given in (16) and (17), respectively, and
Following similar steps as in the derivation of $\bar{C}_{\mathrm{MRT}}$, we can derive an expression for the sum spectral efficiency under MRT precoding without BT. Omitting the derivation steps, the result is stated in Theorem 2 and is compared with the performance under MRT precoding with BT.
Theorem 2: Under MRT precoding and without BT, the sum spectral efficiency of a massive MIMO system is given as [24]
By comparing Theorem 1 and Theorem 2, we can see that the term $\Delta$ represents the performance improvement brought by BT. It should be pointed out, however, that BT incurs a higher pilot overhead and leaves fewer symbols for DL data transmission. Therefore, when the BS uses MRT precoding, it is of interest to quantify the performance gap between the two cases, with and without BT. Compared with the conventional scheme without BT, the key difference is that the scheme with BT acquires the DL CSI at the user terminals by sending DL pilots. The main performance difference between the two schemes is therefore related to the DL pilot length. Below we give three corollaries about the relationship between the DL pilot length and the performance of the two schemes (with and without BT). Corollary 1 and Corollary 2 establish sufficient (but not necessary) conditions on the length of the DL pilot sequences under which $\bar{C}_{\mathrm{MRT}} \geq C_{\mathrm{MRT}}$ and $C_{\mathrm{MRT}} \geq \bar{C}_{\mathrm{MRT}}$, respectively. These corollaries are obtained using Bernoulli's inequality.
Corollary 1: For MRT precoding, the inequality $\bar{C}_{\mathrm{MRT}} \geq C_{\mathrm{MRT}}$ holds if $\tau_d$ satisfies where Proof: See Appendix B.
Thus, Corollary 1 gives the range of $\tau_d$ in which BT should be adopted in the DL transmission instead of the conventional scheme without BT.
Corollary 2: For MRT precoding, the inequality $C_{\mathrm{MRT}} \geq \bar{C}_{\mathrm{MRT}}$ holds if $\tau_d$ satisfies where Proof: See Appendix C.
Similarly, Corollary 2 gives the range of $\tau_d$ in which the conventional scheme without BT should be chosen for the DL transmission. From Corollary 1 and Corollary 2, we can also find the range of $\tau_d$ that contains the point at which $\bar{C}_{\mathrm{MRT}} = C_{\mathrm{MRT}}$ in some specific cases, which is stated in Corollary 3 below.

IV. DOWNLINK SPECTRAL EFFICIENCY OF ZF PRECODING: WITH AND WITHOUT BT
This section provides the analysis for ZF precoding with and without BT, which parallels the analysis in Section III for MRT precoding. As in the case of MRT, based on (22) and (32), we obtain expressions for the sum spectral efficiency when ZF precoding is employed with and without BT. The results are stated in Theorem 3 and Theorem 4.
Theorem 3: For ZF precoding, the sum spectral efficiency with BT is given as where with $a_{ilnj}$, $b_{ilnj}$ and $\alpha_{l,\mathrm{ZF}}$ being given in (16), (38) and (18), respectively, and Proof: See Appendix D.
Following similar steps as in Appendix D, we can also derive the sum spectral efficiency for ZF precoding without BT, which is stated in Theorem 4. Due to space limitations, the proof is omitted.
Theorem 4: For ZF precoding, the sum spectral efficiency without BT is given as
Based on (47) and (56), and following derivations similar to those in Appendices B and C for MRT precoding, we obtain the following corollaries regarding the relationship between the system performance and the length of the DL pilot sequences for ZF precoding with and without BT. where where $\tau^{\mathrm{upper}}_{in,\mathrm{ZF}} =$
For ZF precoding, Corollary 4 and Corollary 5 show that BT should be used under the condition of (57) and should not be used under the condition of (59). Based on Corollary 4 and Corollary 5, when $\tau_d$ lies in a certain range, a value of $\tau_d$ can be found such that $\bar{C}_{\mathrm{ZF}} = C_{\mathrm{ZF}}$, which is stated in Corollary 6.

Remark 1: When the channel correlation matrix $\boldsymbol{\Theta}_{iln} = \mathbf{I}_M$ for $\forall i, l, n$, the expressions in (33), (39), (47) and (56) reduce to the rate expressions for uncorrelated Rician fading channels in [15]. Therefore, the rate expressions derived in this paper are more general.

V. POWER ALLOCATION
In Sections III and IV, our main focus was on the achievable rate performance (spectral efficiency). Given the need to conserve power and the growing environmental concerns, EE is also an important metric for green communication systems. Therefore, how to improve the EE through power allocation while guaranteeing the quality of service is a pressing problem. In this section, we design a power allocation algorithm that optimizes the EE of the system with the BT scheme while guaranteeing a given sum spectral efficiency.

A. POWER ALLOCATION WITH BT
We consider that the DL transmit powers of different users in each cell are different and assume that the design for UL pilot training phase is done in advance, i.e., the transmit power of the UL pilot p p and the length of the UL pilot τ p are determined. Given the sum spectral efficiency and the constraint on the maximum DL transmit power, we want to find a power allocation algorithm for determining different DL transmit powers of different users in each cell so that the EE is maximized.
The EE (in bits/J) is defined as the sum spectral efficiency divided by the total power consumption in the network. Based on the linear BS power consumption model, the EE of the system for MRT precoding and ZF precoding with BT is given by [29] where $\diamond \in \{\mathrm{MRT}, \mathrm{ZF}\}$ corresponds to MRT or ZF precoding, and $\bar{C}_{\mathrm{MRT}}$ and $\bar{C}_{\mathrm{ZF}}$ are given in (33) and (47) for MRT precoding and ZF precoding with BT, respectively. The quantity $LMp_c$ represents the total circuit power consumption of all antennas at the BSs (which is independent of the DL transmit power), where $p_c$ is the power consumed by the circuits of each antenna, including the power dissipation of the filter, mixer, frequency synthesizer, and digital-to-analog converter; $p_0$ is the basic power consumed at the BS, which is independent of the number of antennas; and $v \geq 1$ is the inefficiency of the power amplifier. Without loss of generality, we assume $v = 1$ in this paper. The numerical results of Section VI show that the theoretical results of (33) and (47) are close to the simulation results of (30). Therefore, we use the expressions in (33) and (47) instead of (30) to represent the sum spectral efficiency with BT for MRT precoding and ZF precoding in (64). Given the sum spectral efficiency and the constraint on the maximum DL transmit power, the optimization problem of maximizing the EE can be expressed as [27] Maximize over $p_{in}$ where $\bar{C}_{\diamond}$ denotes the given sum spectral efficiency and $LNp$ is the peak power constraint on the total DL transmit power. Take the system using MRT precoding and BT as an example. Since $LMp_c$ and $p_0$ are both fixed values, the optimal power allocation problem in (65) can be transformed into (66). By expanding the inequality in (66), (66) can be rewritten as Problem (68), in which the expanded inequality constraint is given in (67). The objective function in Problem (68) is a posynomial. However, the equality constraint in (68) is not a monomial and the inequality constraint in (67) is not a posynomial either. In fact, Problem (68) is a signomial program [39]. If the equality constraint and the inequality constraint can be transformed into a monomial and a posynomial, respectively, then Problem (68) becomes a GP, which can be reformulated as a convex problem.
In Appendix E, we give an explicit transformation method to approximate the equality constraint and the inequality constraint in the forms of monomial function and posynomial function, respectively. By using a similar technique as in [27] and the approximation presented in Appendix E, we formulate Algorithm 1, a novel successive approximation algorithm, to solve Problem (68) for the case of MRT precoding and BT. Specifically, by using this new iterative algorithm (Algorithm 1), we can successfully convert Problem (68) to a GP and solve it with standard convex optimization tools, such as CVX. Using similar approximations in Appendix E, we can also develop a successive approximation algorithm to find the optimized DL transmit power for the system with ZF precoding and BT, which is also included in Algorithm 1.
In Algorithm 1, we choose the parameters $\omega = 1.5$ and $\epsilon = 0.01$ to control the approximation accuracy, because these values maintain a good balance between approximation accuracy and convergence speed [40]. Due to space constraints, the expressions used in Algorithm 1 are collected in (69) to (82).
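To give a feel for the successive approximation step, the sketch below shows one common way to replace the non-monomial term 1 + γ by a monomial κ γ^η around a current point γ̂, which is the kind of step that turns a signomial program into a GP. Appendix E is not reproduced here, so whether the paper uses exactly this form is an assumption, and the function names are hypothetical. The trust region [γ̂/ω, ω γ̂] mirrors the role of the parameter ω in Algorithm 1.

```python
import numpy as np

def monomial_fit(gamma_hat):
    """Monomial kappa * gamma**eta matching 1 + gamma in value and
    log-log slope at gamma = gamma_hat (assumed approximation form)."""
    eta = gamma_hat / (1.0 + gamma_hat)
    kappa = (1.0 + gamma_hat) * gamma_hat ** (-eta)
    return kappa, eta

def max_relative_error(gamma_hat, omega=1.5, num=201):
    """Largest relative gap between 1 + gamma and its monomial fit
    over the trust region [gamma_hat / omega, omega * gamma_hat]."""
    kappa, eta = monomial_fit(gamma_hat)
    gamma = np.linspace(gamma_hat / omega, omega * gamma_hat, num)
    exact = 1.0 + gamma
    approx = kappa * gamma ** eta
    return float(np.max(np.abs(exact - approx) / exact))
```

By the arithmetic-geometric mean inequality, the monomial never exceeds 1 + γ and touches it at γ̂, so a smaller ω tightens the approximation at the cost of more iterations. For γ̂ = 2 and ω = 1.5 the relative gap stays within a few percent.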

VI. NUMERICAL RESULTS
In this section, we consider a multicell massive MIMO system over correlated Rician fading channels with $L = 4$ cells. Each cell has one BS equipped with $M = 50$ antennas, and each BS serves $N = 5$ users in its cell. We assume a coherence interval of $T = 1000$, a UL pilot length of $\tau_p = N$, and a UL pilot power of $p_p = 0$ dB. For a massive MIMO system over correlated Rician fading channels, the large-scale channel fading coefficient is mainly related to the distance between the BS and the users. Without loss of generality, we set all direct gains to 1 and all cross gains to 0.1 in the simulations, i.e., for $i, l = 1, \cdots, L$ and $n = 1, \cdots, N$, the large-scale channel fading coefficient is $\beta_{iln} = 1$ if $i = l$ and $\beta_{iln} = 0.1$ otherwise. Moreover, to ensure distinct Rician $K$-factors between the BS and different users, we assume throughout all the simulations that $k_{iln} \in [0, 1]$, $\forall i, l, n$. We consider a uniform linear array (ULA) with omnidirectional antennas at all BSs, so that the LOS component $\bar{\mathbf{g}}_{iln}$ can be expressed as $[\bar{\mathbf{g}}_{iln}]_m = e^{-j(m-1)(2\pi d/\lambda)\sin(\theta_{iln})}$, where $\lambda$ is the carrier wavelength, $\theta_{iln} \in [-\pi, \pi]$ is the arrival angle of the $n$th user in the $l$th cell seen from the BS of the $i$th cell, and $d = \lambda/2$ is the antenna spacing [41]. In addition, the correlation matrix of the NLOS component is constructed as in [42], where $r_{iln}$ is the coefficient related to the antenna correlation between the $n$th user equipment in the $l$th cell and the BS in the $i$th cell. For convenience, we also set $r_{iln} = r$ for $\forall i, l, n$ in this section.
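The simulation ingredients described above can be sketched as follows. The ULA response follows the expression for the LOS component given in the text. The correlation matrix uses the exponential model with entries r^|m-n|, which is an assumption here: it is a common single-coefficient model and is consistent with the statement that r = 0 yields the identity matrix, but the exact construction of [42] is not reproduced in this text. Function names are hypothetical.

```python
import numpy as np

def ula_los_vector(M, theta, d_over_lambda=0.5):
    """LOS response of an M-antenna ULA for arrival angle theta (radians).

    Entry m (m = 1, ..., M) equals exp(-j (m-1) 2 pi (d / lambda) sin(theta)).
    """
    m = np.arange(M)
    return np.exp(-1j * m * 2.0 * np.pi * d_over_lambda * np.sin(theta))

def exponential_correlation(M, r):
    """Assumed exponential correlation model with real r in [0, 1]."""
    idx = np.arange(M)
    return r ** np.abs(idx[:, None] - idx[None, :])

def large_scale_gains(L, N, direct=1.0, cross=0.1):
    """beta[i, l, n]: direct gain if i == l, cross gain otherwise."""
    same_cell = np.eye(L, dtype=bool)[:, :, None]
    return np.where(same_cell, direct, cross) * np.ones((L, L, N))
```

For example, `exponential_correlation(50, 0.7)` and `ula_los_vector(50, theta)` give the correlation matrix and LOS vector for the setting M = 50, r = 0.7 used in the figures, under the stated modeling assumption.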
In Fig. 1, we first illustrate the effect of the DL signal-to-noise ratio (SNR), i.e., $p_{in}$, $\forall i, n$, on the sum spectral efficiency when using MRT and ZF precoding for the systems with and without BT, thereby validating the conclusions of Theorems 1, 2, 3 and 4. For the system with BT, the simulation results of the spectral efficiency are obtained from (30), whereas the theoretical results are obtained from (33) and (47) for MRT and ZF precoding, respectively. For the system without BT, the simulation results are obtained from (22), and the theoretical results from (39) and (56). We set the channel correlation coefficient $r = 0.7$ and the DL pilot length $\tau_d = N$. As expected, our closed-form expressions are almost identical to the simulation results over the entire SNR range. The figure shows that, for both systems (with and without BT), the sum spectral efficiency increases with the DL SNR and gradually saturates for both MRT and ZF precoding. The sum spectral efficiency obtained with ZF precoding is higher than that obtained with MRT precoding. It is also seen that, as the SNR grows, the performance with BT gradually exceeds the performance without BT for both MRT and ZF precoding, and the improvement from BT is more pronounced with MRT precoding than with ZF precoding. Overall, using BT together with MRT precoding is well suited to the multi-cell DL massive MIMO system operating over spatially correlated Rician fading channels. Given the accuracy of the obtained closed-form expressions, only theoretical results are presented in the remainder of this section.

Algorithm 1 (MRT and ZF Precoding With BT)
1: Initialization: Set the iteration index $t = 1$, the tolerance value to 0.01, the maximum number of iterations $t_{\max} = 10$ and the parameter $\omega = 1.5$. For all $i, n$, choose initial values $p_{in,1}$ and $\gamma_{in,1}$ for MRT or ZF precoding.
2: Iteration $t$: Compute (69) to (82). Based on these expressions, solve the resulting GP, using (81) for MRT or (82) for ZF, $\forall i, n$. For all $i, n$, let $p_{in}^{*}$ and $\gamma_{in}^{*}$ denote the solutions of this iteration.
3: If the stopping conditions are both satisfied or $t = t_{\max}$, stop the iteration. Otherwise, go to Step 4.
4: Set $t = t + 1$, $p_{in,t} = p_{in}^{*}$ and $\gamma_{in,t} = \gamma_{in}^{*}$ for $i = 1, \cdots, L$ and $n = 1, \cdots, N$, and go to Step 2.

Fig. 2 and Fig. 3 depict the sum spectral efficiency versus the DL pilot length for MRT and ZF precoding, respectively. Here, we assume $r = 0.7$ and $p_{in} = 10$ dB, $\forall i, n$. It can be seen that the sum spectral efficiency with BT is better than that without BT if $\tau_d \le \tau^{\mathrm{lower}}_{♦}$, whereas the opposite is true if $\tau_d \ge \tau^{\mathrm{upper}}_{♦}$. The performance cross-over point lies between $\tau^{\mathrm{lower}}_{♦}$ and $\tau^{\mathrm{upper}}_{♦}$, where ♦ ∈ {MRT, ZF} corresponds to MRT or ZF precoding. These numerical results verify the correctness of Corollary 1 to Corollary 6 for MRT and ZF precoding. Finally, compared with MRT precoding, ZF precoding reaches the performance cross-over point at a shorter DL pilot length.

Fig. 4 shows the sum spectral efficiency versus the channel correlation coefficient for MRT and ZF precoding with the BT scheme. In Fig. 4, we assume $\tau_d = N$ and $p_{in} = 10$ dB, $\forall i, n$. As expected, the sum spectral efficiency gradually decreases as the channel correlation coefficient increases. When $r = 0$ (i.e., the correlation matrix reduces to $I_M$, $\forall i, l, n$), the system is equivalent to one operating over uncorrelated Rician fading and the sum spectral efficiency reaches its maximum. When $r$ is below 0.6, the loss in spectral efficiency is small, but beyond 0.6 the sum spectral efficiency decreases rapidly. These observations reflect the impact of channel correlation on spectral efficiency.
While Figs. 1 to 4 focus on the system's performance in terms of the achievable rate, the next two figures present the system's performance in terms of EE. Specifically, Figs. 5 and 6 plot the EE versus the sum spectral efficiency for MRT and ZF precoding, respectively, with the BT scheme under both uniform and optimal power allocations. The uniform power allocation corresponds to the scenario in which all DL transmit and pilot powers are set to their maximum, i.e., $p_{in} = p$ for $i = 1, \cdots, L$ and $n = 1, \cdots, N$, where $p$ is the maximum power constraint of each user. For both MRT and ZF precoding with BT, the optimal power allocations are obtained by Algorithm 1. In Algorithm 1, the initial values of $p_{in,1}$ are chosen as $p_{in,1} = p$, $\forall i, n$. We then substitute $p_{in} = p$, $\forall i, n$, into the expression for $\gamma_{in}$ to obtain the initial values $\gamma_{in,1}$, $\forall i, n$, for MRT or ZF precoding with BT. For both examples, we set $p_c = 10$ dBm, $p_0 = 10$ dBm, $r = 0.7$ and $\tau_d = N$. From Figs. 5 and 6, we can see that when the BS implements the optimal power allocation, the EE is improved significantly compared with the uniform power allocation. These results therefore demonstrate the benefit of the proposed power optimization algorithm.

VII. CONCLUSION
In this paper, we have studied and analyzed the sum spectral efficiency and EE of a multi-cell DL massive MIMO system operating over spatially correlated Rician fading channels with imperfect CSI. Because BT makes the DL pilot overhead depend only on the number of users and not on the number of BS antennas, the paper examined both cases of using and not using BT to obtain CSI at the users. We derived closed-form expressions of the sum spectral efficiency achieved by MRT and ZF precoding and utilized them to determine the range of the performance cross-over point between the systems with and without BT. Numerical results were presented to corroborate the analysis. Furthermore, the obtained expressions enabled us to develop a novel iterative power allocation algorithm that improves the system's EE with the BT scheme. Finally, numerical results showed that the sum spectral efficiency decreases as the channel correlation coefficient increases, and that it drops sharply once the correlation coefficient is large.

APPENDIX B
Consider the inequality stating that, for MRT precoding, the sum spectral efficiency with BT is no smaller than that without BT. Applying the exponential operation and then simple algebraic operations to both sides transforms this inequality into (93), where the terms with indices 1, 4 and 5 are defined in (34), (37) and (44), respectively. When the precondition $N \le \tau_d \le T - \tau_p$ is satisfied, we can use Bernoulli's inequality, i.e., $(1 + x)^r \ge 1 + rx$ for $x > -1$ and $r \ge 1$, to transform (93) into (94) [15]. From (94), whenever the inequality in (95) is satisfied, the original inequality must also hold. Hence, (95) is a sufficient (but not necessary) condition for the original inequality. Considering the precondition $N \le \tau_d \le T - \tau_p$, simple algebraic manipulation of (95) shows that the original inequality holds if $\tau_d$ lies between $N$ and the upper limit given in (41).
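Bernoulli's inequality, on which this appendix relies, can be sanity-checked numerically over the stated range $x > -1$, $r \ge 1$. The snippet below is only an illustrative sketch; the function name `bernoulli_gap` is a hypothetical helper and is not part of the derivation.

```python
def bernoulli_gap(x, r):
    """Return (1 + x)**r - (1 + r*x), which Bernoulli's inequality
    states is non-negative for x > -1 and r >= 1.
    The function name is an illustrative assumption.
    """
    if x <= -1.0 or r < 1.0:
        raise ValueError("requires x > -1 and r >= 1")
    return (1.0 + x) ** r - (1.0 + r * x)


if __name__ == "__main__":
    for x in (-0.9, -0.5, 0.0, 0.5, 3.0):
        for r in (1.0, 1.5, 2.0, 10.0):
            print(f"x={x:5.2f}  r={r:5.2f}  gap={bernoulli_gap(x, r):.6f}")
```

The gap is zero when $x = 0$ or $r = 1$ and positive otherwise on this grid, which is consistent with the bound yielding a sufficient, but not necessary, condition.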

APPENDIX C
Consider now the reverse inequality, i.e., for MRT precoding the sum spectral efficiency without BT is no smaller than that with BT. Performing the exponential and algebraic operations on both sides of this inequality, we obtain an expression in which the terms with indices 1, 4 and 5 are defined in (34), (37) and (44), respectively. Applying Bernoulli's inequality as in Appendix B yields (97). Based on (97), the inequality in (98), which must hold $\forall i, n$, is a sufficient (but not necessary) condition for the reverse inequality. When (98) and $N \le \tau_d \le T - \tau_p$ are simultaneously satisfied, combining the two conditions and performing simple algebraic operations shows that the reverse inequality holds for the range of $\tau_d$ given in (43).

APPENDIX D
When the BS uses ZF precoding to process the transmit signal, we define the effective CSI as $s_{ilnj} = h_{lin}^{T} f_{lj}^{\mathrm{ZF}}$, for $i, l = 1, \cdots, L$ and $n, j = 1, \cdots, N$, and analyze the expectations $E\{|\hat{s}_{ilnj}|^2\}$ and $E\{|\mu_{ilnj}|^2\}$ under the following two cases: Case I: $l = i$ and Case II: $l \ne i$. Before giving the expectation of each term under Case I and Case II, we first use the law of large numbers to obtain the approximation in (99) [1], [2], where $E_{ll}$ is an $N \times N$ diagonal matrix whose $(j, j)$th element is $[E_{ll}]_{jj} = a_{lljj}^{-1}$, and $a_{lljj}$ is given in (16). From the approximation in (99), we obtain (100) [24]. Based on (100), one has $E\{\tilde{y}_{in,j}\} = \sum_{l=1}^{L} \sqrt{\tau_d p_{lj}}\, \upsilon_{ilnj}$.

APPENDIX E
In order to transform Problem (68) into a GP, we first convert the equality constraint in (68) from a posynomial function to a monomial function. More precisely, by using the technique in (40) of reference [39], we approximate $1 + \gamma_{in}$ near a point $\hat{\gamma}_{in}$ by a monomial in $\gamma_{in}$ with exponent $\vartheta_{in} = \hat{\gamma}_{in}(1 + \hat{\gamma}_{in})^{-1}$ and coefficient $\hat{\gamma}_{in}^{-\vartheta_{in}}(1 + \hat{\gamma}_{in})$. Therefore, near the point $\hat{\gamma}_{in}$, the left-hand side of the equality constraint in (68) can be approximated as in (108), and the equality constraint in Problem (68) becomes a monomial function. For the inequality constraint in (67), the left- and right-hand sides are both posynomial functions. Thus, we convert the right-hand side of the inequality to a monomial function by using the approximation method in (40) of reference [39], which gives (109). The approximation in (109) holds when $p_{in}$ and $\gamma_{in}$ are close to $\hat{p}_{in}$ and $\hat{\gamma}_{in}$, respectively, for $i = 1, \cdots, L$ and $n = 1, \cdots, N$. By dividing the left-hand side of the inequality in (67) by the approximate monomial function in (109), we convert (67) to the posynomial constraint in (81). Through the two approximations in (108) and (109)