Uplink Power Control in Massive MIMO with Double Scattering Channels

Massive multiple-input multiple-output (MIMO) is a key technology for improving the spectral and energy efficiency in 5G-and-beyond wireless networks. For a tractable analysis, most of the previous works on Massive MIMO have been focused on the system performance with complex Gaussian channel impulse responses under rich-scattering environments. In contrast, this paper investigates the uplink ergodic spectral efficiency (SE) of each user under the double scattering channel model. We derive a closed-form expression of the uplink ergodic SE by exploiting the maximum ratio (MR) combining technique based on imperfect channel state information. We further study the asymptotic SE behaviors as a function of the number of antennas at each base station (BS) and the number of scatterers available at each radio channel. We then formulate and solve a total energy optimization problem for the uplink data transmission that aims at simultaneously satisfying the required SEs from all the users with limited data power resource. Notably, our proposed algorithms can cope with the congestion issue appearing when at least one user is served by lower SE than requested. Numerical results illustrate the effectiveness of the closed-form ergodic SE over Monte-Carlo simulations. Besides, the system can still provide the required SEs to many users even under congestion.


I. INTRODUCTION
Wireless communications has sustained exponential growth in the demand for data throughput and reliability over the last decades [2], [3]. The cellular network topology, assisted by MIMO technology, has evolved over time to meet the growing demand. However, mobile traffic is foreseen to keep increasing in the short term, with 12.3 billion wireless access devices by 2022 [4]. To handle this issue, Massive MIMO, a disruptive technology with commercial deployments started in 2018 [5], not only inherits all the multiplexing gain and spatial diversity of conventional MIMO but also offers extra degrees of freedom as a consequence of equipping base stations (BSs) with many antennas [6]. Massive MIMO therefore provides unprecedented spectral and energy efficiency gains in modern wireless networks while only utilizing the contemporary time and frequency resources. Each Massive MIMO BS only exploits a low-cost linear processing technique such as maximum ratio (MR) or zero-forcing (ZF) combining to detect the transmit signals, and obtains performance close to the optimum thanks to the use of many more antennas than users [7]. In the uplink transmission, combining vectors for data detection are constructed from channel estimates; therefore, the overhead of sending uplink pilot signals is proportional only to the number of users.
In Massive MIMO, the closed-form expression of the ergodic SE can be obtained in certain scenarios. For rich scattering environments in which propagation channels ideally follow uncorrelated Rayleigh fading, the uplink and downlink SEs were obtained as functions of the large-scale fading coefficients when each BS exploits MR or ZF combining, as in [8], [9] and references therein. As such, many effects such as array gains and channel estimation quality are explicitly observed in those ergodic rates, and the power scaling laws are obtained. However, practical channels usually involve spatial correlation, which is modeled, for example, by correlated Rayleigh fading in an isotropic scattering environment, where the gathered energy at an antenna array comes from many directions, leading to full-rank covariance matrices with an overwhelming probability [7], [10], [11]. For rank deficiency occurring in poor scattering conditions, the Kronecker channel model is popularly used to describe the spatial correlations at the transmitter and receiver [12], [13]. The authors in [14] proposed the double scattering channel and demonstrated that the channel capacity is also characterized by the structure of scattering in the propagation environment, and not only by the spatial correlations around the transceiver.
A few works have studied the effects of low-rank channels in Massive MIMO communications. For keyhole channels (uncorrelated and rank-deficient), channel hardening and favorable propagation were investigated in [15], showing a significant reduction of the ergodic SE compared with that of uncorrelated Rayleigh fading. An extension of this work to scenarios where users have multiple antennas has recently been reported in [16]. The first work numerically studying the uplink ergodic SE of cellular Massive MIMO systems with double scattering channels (spatially correlated and rank-deficient) is [17]. For theoretical analysis, the authors in [18], [19] computed the asymptotic ergodic SE of a single-cell Massive MIMO system with different linear precoding techniques when the numbers of BS antennas, scatterers, and users grow large at the same rate. It is worth emphasizing that these works assumed that each user utilizes an orthogonal pilot signal in a single-cell system and that their formulations are established asymptotically. To the best of our knowledge, no prior work analyzes the performance of cellular Massive MIMO systems with a finite number of BS antennas, users, and scatterers, where the ergodic SE might have different features than in the asymptotic regime.
Many resource allocation tasks in Massive MIMO communications can be implemented on the large-scale fading time scale instead of the small-scale fading one by virtue of channel hardening [20]. This makes resource allocation feasible to implement in practice. Various optimization problems with different utility functions have been formulated and solved in the Massive MIMO literature [21]-[23]. A key feature of Massive MIMO communications is that many users can access and share the radio resource at the same time with high quality of service. The max-min fairness optimization is therefore promising for providing uniform service to all the users in the coverage area [24]. However, for large-scale networks with many base stations and users, the fairness level will approach a zero rate [25]. In contrast, one can include separate SE constraints in the optimization problems to simultaneously maintain the quality of service for all the users [26], [27]. However, since the users are randomly distributed, user locations with poor channel conditions often make the optimization problems infeasible. The preliminary work in [28], which analyzed uncorrelated Rayleigh fading only, has indicated that many users are still served with the required SEs if we can detect and relax the constraints of unsatisfied users when solving the problems.
By exploiting the double scattering channel model, this paper considers a Massive MIMO system in which a set of orthogonal pilot signals is reused by all the users such that the BSs can estimate channels in the pilot training phase. We then compute the uplink ergodic SE of each user in relation to the channel structure and propagation environment. The ergodic rate is then used to formulate and solve the total energy optimization problem for the uplink data transmission when each BS uses MR combining to detect the desired signals.
Our main contributions are summarized as follows:
• A new ergodic SE expression is derived in closed form for a finite number of antennas at each BS, where the number of scatterers observed by each user and BS may differ. This closed-form SE expression explicitly demonstrates the influence of pilot contamination, channel estimation errors, and limited scatterers. In line with the literature, we also analyze the asymptotic closed-form SE expression when the number of antennas and/or scatterers grows large. We analytically verify the existence of a saturation point in most scenarios, although the system can still offer an unbounded capacity under a certain condition.
• We formulate a total uplink data energy minimization problem subject to the required SE of every user and the power constraints. This problem may have an empty feasible domain because of the difficulty of simultaneously serving many users. For user locations and shadow fading realizations where our optimization problem is feasible, the global optimum can be obtained in polynomial time owing to its convexity.
• We propose two low-complexity iterative algorithms that tackle the infeasible optimization problem by relaxing the SE constraints of unsatisfied users. At each iteration, the first algorithm allows users to transmit with full data power whenever the required SE constraints are not satisfied. In contrast, the second algorithm gives a procedure to scale down the data power assigned to users whose SEs are lower than requested.
• Numerical results show that the closed-form SE expression matches Monte-Carlo simulations in all the system parameter settings. The effectiveness of the proposed data power control algorithms is compared with that of interior-point methods. For given user locations and shadow fading realizations that form infeasible problems, the system can still provide satisfactory service to many users after relaxing one or a few of the required SE constraints.
This paper is organized as follows: Section II presents the considered Massive MIMO system under the double scattering channels and derives the closed-form expression of the uplink SE for the case where each BS utilizes MR combining to decode the transmitted signals. We also compute the asymptotic SE as different factors grow large. Section III formulates the total data energy minimization problem and characterizes its canonical form and feasible domain. The two algorithms that obtain a solution to this problem and handle the congestion issue are proposed in Section IV. Finally, Section V shows extensive numerical results and the main conclusions are drawn in Section VI.
Notation: Upper-case bold face letters are used to denote matrices and lower-case bold face ones to denote vectors. $\mathbf{I}_M$ is the identity matrix of size $M \times M$. The operations $\mathbb{E}\{\cdot\}$ and $\mathrm{Var}\{\cdot\}$ denote the expectation and variance of a random variable, respectively. The notation $\|\cdot\|$ is the Euclidean norm of a vector and $\|\cdot\|_2$ is the spectral norm of a matrix. Moreover, $\mathrm{tr}(\cdot)$ is the trace of a matrix. The regular and Hermitian transposes are denoted by $(\cdot)^T$ and $(\cdot)^H$, respectively. Finally, $\mathcal{CN}(\cdot, \cdot)$ denotes the circularly symmetric complex Gaussian distribution.

II. MASSIVE MIMO SYSTEM WITH DOUBLE SCATTERING CHANNELS
We consider an uplink Massive MIMO system comprising $L$ cells, where cell $l$ has one BS equipped with $M$ antennas and serving $K$ single-antenna users. Even though the propagation channels change over time and frequency, we use a quasi-static channel model where the time-frequency plane is divided into coherence blocks. Each coherence block comprises $\tau_c$ symbols such that the channel between an arbitrary user and the BS is static and frequency flat. This paper assumes that instantaneous channels are not known at the BSs. Therefore, in each coherence block, $\tau_p$ symbols are dedicated to the pilot training phase and the remaining $\tau_c - \tau_p$ symbols are used for the uplink data transmission. The channel between user $k$ in cell $l'$ and BS $l$ is modeled by the double scattering channel model [13], [17], which is
$$\mathbf{h}_{l',k}^{l} = \sqrt{\frac{\beta_{l',k}^{l}}{S_{l',k}^{l}}} \big(\mathbf{R}_{l',k}^{l}\big)^{1/2} \mathbf{G}_{l',k}^{l} \big(\tilde{\mathbf{R}}_{l',k}^{l}\big)^{1/2} \mathbf{g}_{l',k}^{l}, \quad (1)$$
where $\beta_{l',k}^{l}$ is the large-scale fading coefficient, which models the effects of the pathloss due to long distance and of the shadow fading due to obstacles. The integer parameter $S_{l',k}^{l}$ is the number of scatterers generating the channel between BS $l$ and user $k$ in cell $l'$. The matrix $\mathbf{R}_{l',k}^{l} \in \mathbb{C}^{M \times M}$ represents the correlation between the BS antennas and its scatterers; $\mathbf{G}_{l',k}^{l} \in \mathbb{C}^{M \times S_{l',k}^{l}}$ includes the small-scale fading coefficients between BS $l$ and its scattering cluster. The matrix $\tilde{\mathbf{R}}_{l',k}^{l} \in \mathbb{C}^{S_{l',k}^{l} \times S_{l',k}^{l}}$ stands for the correlation between the transmit and receive scatterers, and $\mathbf{g}_{l',k}^{l} \in \mathbb{C}^{S_{l',k}^{l}}$ represents the small-scale fading between the user and its scattering cluster. The elements of both $\mathbf{G}_{l',k}^{l}$ and $\mathbf{g}_{l',k}^{l}$ are independent and identically distributed as $\mathcal{CN}(0, 1)$, subject to constraints on the trace of the covariance matrices.
Remark 1. The double scattering channel model in (1) reflects three important aspects of Massive MIMO channel propagation, namely the rank deficiency at the transceiver, the spatial fading correlation, and the signal attenuation, by controlling multiple factors such as the number of scatterers in the environment, the correlation matrices, and the large-scale fading coefficients.
It is a more involved channel model than previous non-line-of-sight models, and it describes the sensitivity of the actual channel capacity to both the fading correlation and the scattering structure in real propagation environments [14], [17]. This model spans scenarios from uncorrelated Rayleigh to single-keyhole channels. In practical systems, the covariance matrices can be estimated by averaging over many realizations of instantaneous channels, while the number of scatterers can be obtained by formulating and solving, for example, an ℓ-norm optimization problem, which matches the double scattering channel model with measurement data [29].
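As an illustration of how one realization of such a channel could be drawn, the following minimal Python sketch assumes the common double scattering form h = sqrt(β/S) R^{1/2} G R̃^{1/2} g with independent CN(0, 1) entries in G and g. The function names, the argument layout, and the convention that the caller passes the matrix square roots directly are assumptions made for illustration only.

```python
import math
import random


def crandn(rng):
    """Draw one sample from CN(0, 1) (unit total variance)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def double_scattering_channel(beta, R_sqrt, Rt_sqrt, rng):
    """Return one channel realization h of length M.

    beta    : large-scale fading coefficient
    R_sqrt  : M x M square root of the BS-side correlation matrix (assumed given)
    Rt_sqrt : S x S square root of the scatterer correlation matrix (assumed given)
    """
    M, S = len(R_sqrt), len(Rt_sqrt)
    g = [crandn(rng) for _ in range(S)]                       # user to scatterers
    G = [[crandn(rng) for _ in range(S)] for _ in range(M)]   # scatterers to BS
    x = matvec(Rt_sqrt, g)
    y = matvec(G, x)
    z = matvec(R_sqrt, y)
    scale = math.sqrt(beta / S)
    return [scale * v for v in z]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Single-keyhole example: S = 1 scatterer, M = 4 uncorrelated antennas.
h_keyhole = double_scattering_channel(1.0, identity(4), identity(1), random.Random(1))
```

Under these assumptions and with identity correlation matrices, the average channel gain E{||h||^2} equals βM, while a small S makes the gain fluctuate much more strongly than under Rayleigh fading.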
Further statistical information on the double scattering channels, which is later useful for computing the uplink ergodic SE expression in closed form, is presented in Lemma 1. Proof. The proof computes the moments of non-Gaussian random variables and is available in Appendix A.
In (2), the second moment obtained for the inner product of two different channel vectors is a deterministic value, which depends on their covariance matrices and scales up with the number of antennas installed at BS $l$, i.e., $M$. Meanwhile, the weighted fourth moment in (3) indicates a scaling factor of $M^2$. This moment is also inversely proportional to the number of scatterers. The moments of channels in Lemma 1 are utilized to compute the closed-form expression of the uplink ergodic rate of an arbitrary user.

A. Uplink Pilot Training
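In each coherence block, each BS needs instantaneous channel state information for the uplink data detection. The $\tau_p$ symbols dedicated to the uplink pilot training can create $\tau_p$ mutually orthogonal pilot signals. User $k$ in cell $l$ uses the deterministic pilot signal $\boldsymbol{\phi}_{lk} \in \mathbb{C}^{\tau_p}$ with $\|\boldsymbol{\phi}_{lk}\|^2 = \tau_p$. This pilot signal is also reused by other users in multiple cells, and we define the pilot reuse set as
$$\mathcal{P}_{lk} = \{(l',k') : \boldsymbol{\phi}_{l'k'} = \boldsymbol{\phi}_{lk},\ l' = 1, \ldots, L,\ k' = 1, \ldots, K\}, \quad (4)$$
which contains the indices of all users sharing the same pilot signal as user $k$ in cell $l$, including $(l,k)$. Mathematically, $\boldsymbol{\phi}_{l'k'}^{H} \boldsymbol{\phi}_{lk} = \tau_p$ if $(l',k') \in \mathcal{P}_{lk}$ and $0$ otherwise. At BS $l$, the received pilot signal $\mathbf{Y}_{l}^{p} \in \mathbb{C}^{M \times \tau_p}$, with the superscript $p$ standing for the pilot training phase, contains the pilot transmissions of all the users plus the additive noise $\mathbf{N}_{l}^{p} \in \mathbb{C}^{M \times \tau_p}$, whose elements are independent and identically distributed as $\mathcal{CN}(0, \sigma^2)$. Lemma 2 gives the linear minimum mean square error (LMMSE) estimate $\hat{\mathbf{h}}_{l',k}^{l}$ of each channel, obtained from the received pilot signal, together with the covariance matrix of the channel estimate. Proof. The proof is based on the LMMSE estimation of non-Gaussian random variables [30], adapted to our framework with the channel vector in (1) and the pilot reuse in (4). The detailed proof is available in Appendix B.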
Lemma 2 gives the concrete expression of the channel estimate of each user together with its statistical information, which are used to formulate the combining vectors. Note that our channel estimation accounts for the coherent interference caused by pilot contamination in multi-cell Massive MIMO scenarios, which generalizes the previous results in [18], [19] that assumed orthogonal pilot signals for all the users in a single cell. Along with the statistical information in Lemma 1, the channel estimates and estimation errors in Lemma 2 are utilized to compute the closed-form uplink SE expression hereafter.
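The pilot structure described above can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the pilots are the rows of a DFT matrix (one common way to obtain mutually orthogonal sequences with squared norm τ_p) and that the pilot assignment is given as a table of pilot indices; both choices, the function names, and the zero-based indexing are hypothetical and not taken from the paper.

```python
import cmath


def dft_pilots(tau_p):
    """Return tau_p mutually orthogonal pilots, each with squared norm tau_p."""
    return [
        [cmath.exp(-2j * cmath.pi * i * t / tau_p) for t in range(tau_p)]
        for i in range(tau_p)
    ]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def pilot_reuse_set(assignment, l, k):
    """Indices (l', k') of all users sharing the pilot of user k in cell l.

    assignment[l][k] is the (hypothetical) pilot index of user k in cell l.
    The returned set always contains (l, k) itself.
    """
    target = assignment[l][k]
    return {
        (lp, kp)
        for lp, row in enumerate(assignment)
        for kp, idx in enumerate(row)
        if idx == target
    }


# Example: L = 3 cells, K = 2 users per cell, tau_p = 2 pilots reused everywhere.
assignment = [[0, 1], [0, 1], [1, 0]]
reuse = pilot_reuse_set(assignment, 0, 0)
```

Users in the same reuse set produce pilot contamination for each other, since their pilot inner product equals τ_p instead of zero.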

B. Uplink Data Transmission
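During the uplink data transmission, user $k$ in cell $l$ sends a data symbol $s_{lk}$ with $\mathbb{E}\{|s_{lk}|^2\} = 1$, and the received data signal $\mathbf{y}_l \in \mathbb{C}^{M}$ at BS $l$ is a superposition of the transmitted signals from all the users as
$$\mathbf{y}_l = \sum_{l'=1}^{L} \sum_{k'=1}^{K} \sqrt{p_{l'k'}}\, \mathbf{h}_{l',k'}^{l} s_{l'k'} + \mathbf{n}_l,$$
where $p_{l'k'}$ is the transmit power of user $k'$ in cell $l'$ assigned to each data symbol and $\mathbf{n}_l$ is additive noise distributed as $\mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_M)$. By utilizing a combining vector $\mathbf{v}_{lk} \in \mathbb{C}^{M}$ based on the channel estimates, BS $l$ decodes the desired signal from user $k$ in cell $l$ as
$$\mathbf{v}_{lk}^{H} \mathbf{y}_l = \sqrt{p_{lk}}\, \mathbb{E}\{\mathbf{v}_{lk}^{H} \mathbf{h}_{l,k}^{l}\} s_{lk} + \sqrt{p_{lk}} \big( \mathbf{v}_{lk}^{H} \mathbf{h}_{l,k}^{l} - \mathbb{E}\{\mathbf{v}_{lk}^{H} \mathbf{h}_{l,k}^{l}\} \big) s_{lk} + \sum_{(l',k') \neq (l,k)} \sqrt{p_{l'k'}}\, \mathbf{v}_{lk}^{H} \mathbf{h}_{l',k'}^{l} s_{l'k'} + \mathbf{v}_{lk}^{H} \mathbf{n}_l,$$
where the first term contains the desired signal by virtue of the channel hardening [31]. The second term describes the beamforming uncertainty effects, while the remaining terms are mutual interference and noise. As shown in [20], the uplink ergodic SE is obtained by the use-and-then-forget channel capacity bounding technique as
$$R_{lk} = \Big(1 - \frac{\tau_p}{\tau_c}\Big) \log_2 (1 + \mathrm{SINR}_{lk}), \quad (13)$$
where the effective signal-to-interference-and-noise ratio (SINR) value is computed as
$$\mathrm{SINR}_{lk} = \frac{p_{lk} |\mathbb{E}\{\mathbf{v}_{lk}^{H} \mathbf{h}_{l,k}^{l}\}|^2}{\sum_{l'=1}^{L} \sum_{k'=1}^{K} p_{l'k'} \mathbb{E}\{|\mathbf{v}_{lk}^{H} \mathbf{h}_{l',k'}^{l}|^2\} - p_{lk} |\mathbb{E}\{\mathbf{v}_{lk}^{H} \mathbf{h}_{l,k}^{l}\}|^2 + \sigma^2 \mathbb{E}\{\|\mathbf{v}_{lk}\|^2\}}. \quad (14)$$
The expectations in (14) are taken over all the sources of randomness, and (13) is an achievable rate since it is a lower bound on the channel capacity. Furthermore, this achievable rate can be computed numerically for any combining scheme. The main drawback of (13) is its high computational complexity, since many instantaneous channel realizations need to be gathered such that the expectations can be numerically estimated.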
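The numerical evaluation mentioned above can be sketched as follows. This Python snippet estimates the standard use-and-then-forget SINR from stored realizations of the combining vector and the channels, and then applies the pre-log factor (1 - τ_p/τ_c). The data layout (all users of all cells flattened into a single index) and the function names are assumptions for illustration; the expectations are simply replaced by sample averages.

```python
import math


def vdot(a, b):
    """Hermitian inner product a^H b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def uatf_sinr(v_samples, h_samples, powers, sigma2, k):
    """Sample-average estimate of the use-and-then-forget SINR of user k.

    v_samples[n]    : combining vector of user k at realization n
    h_samples[n][i] : channel of user i (flattened index) at realization n
    powers[i]       : data power of user i
    """
    n = len(v_samples)
    mean_vh = sum(vdot(v, h[k]) for v, h in zip(v_samples, h_samples)) / n
    signal = powers[k] * abs(mean_vh) ** 2
    total = sum(
        powers[i] * sum(abs(vdot(v, h[i])) ** 2 for v, h in zip(v_samples, h_samples)) / n
        for i in range(len(powers))
    )
    noise = sigma2 * sum(vdot(v, v).real for v in v_samples) / n
    return signal / (total - signal + noise)


def uatf_se(sinr, tau_c, tau_p):
    """Achievable SE in b/s/Hz with the pilot overhead taken into account."""
    return (1.0 - tau_p / tau_c) * math.log2(1.0 + sinr)
```

The cost of this approach grows with the number of realizations and users, which is the motivation for the closed-form expression derived next.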

C. Uplink Spectral Efficiency Analysis
If MR combining is used by each BS, i.e., $\mathbf{v}_{lk} = \hat{\mathbf{h}}_{l,k}^{l}$, $\forall l, k$, we obtain the closed-form expression for the uplink SE in (13) as shown by Theorem 1.
Theorem 1. When BS $l$ uses the MR combining vector to decode the desired signal from user $k$ in cell $l$, the achievable uplink SE is obtained as in (13) with the SINR value computed in closed form as in (15), where $\mathrm{NI}_{lk}$, $\mathrm{CI}_{lk}$, and $\mathrm{NO}_{lk}$ are respectively the non-coherent interference, coherent interference, and noise, each of which is computed in closed form. Proof. The proof is obtained by computing the expectations of non-Gaussian random variables in (14). The detailed proof is available in Appendix C.
The SINR expression (15) is explicitly influenced by many factors such as the channel covariance matrices, the number of scatterers, pilot reuse, and channel estimation quality, which are hidden in the general formulation (14). Specifically, the numerator of (15) shows the contribution of both the channel estimation quality and the covariance matrix of user $k$ in cell $l$. Moreover, the effectiveness of the array gain is verified since the numerator scales up with the number of antennas thanks to the spatial covariance property in (23). The first part of the denominator of (15) demonstrates the degradation of the received signal quality due to non-coherent interference. The second part presents the contribution of coherent interference caused by reusing the pilot signals among the users, as defined by the pilot reuse set $\mathcal{P}_{lk}$. Unlike previous works with many scatterers [10], this part also shows that a small number of scatterers contributes significantly to increasing the non-coherent interference. If the coherence blocks are large enough such that the pilot sequences allocated to all users are pairwise orthogonal, i.e., $\tau_p \geq LK$, the SINR value of user $k$ in cell $l$ is still computed as in (15), but with reformulated parameters that demonstrate the influence of a finite number of scatterers on the uplink SE. Finally, the last part of the denominator of (15) presents the additive noise effects.
Remark 2. We consider the MR combining technique due to its low computational complexity. This linear combining technique allows the SE analysis to be carried out in closed form with a finite number of BS antennas, users, and scatterers. In addition, it only requires local channel state information and is therefore easy to implement in a distributed manner.

D. Asymptotic Analysis
In order to observe the uplink SE in an asymptotic regime and to compare with previous works, we now investigate the asymptotic uplink SE of each user when $M \to \infty$ and $S_{l',k'}^{l} \to \infty$, $\forall l, l', k'$. In line with previous works [32], the general preliminary settings on the covariance matrices are given in Assumption 1.
Assumption 1. For $l, l' = 1, \ldots, L$ and $k = 1, \ldots, K$, the spatial covariance matrices $\mathbf{R}_{l',k}^{l}$ and $\tilde{\mathbf{R}}_{l',k}^{l}$ satisfy the stated lim sup conditions.
Assumption 1 is established based on the fact that a double scattering channel has two covariance matrices by definition. This assumption extends the standard form used in the asymptotic analysis of Massive MIMO communications with a single covariance matrix [7]. Physically, the gathered signal energy at a BS originates from many spatial directions and is proportional to the number of antennas. We also utilize the spatial orthogonality between two covariance matrices to seek a convergence point in the asymptotic regime, as shown in Definition 1.
As pointed out in previous works [33], [34], the condition (25) indicates that the two users have orthogonal correlation eigenspaces. This holds for a network where each BS is equipped with antennas in a uniform linear array and the supports of the multi-path angular distributions of the two users are strictly non-overlapping. The convergence of the uplink SE for each user is stated in Theorem 2.
Theorem 2. Under Assumption 1, the asymptotic uplink SE of user $k$ in cell $l$ is characterized by the following cases: a) As $M \to \infty$ with a given finite set of scatterers, the achievable rate of user $k$ in cell $l$ converges to the limit in (26). b) As $M \to \infty$ with a limited number of scatterers, and when the two covariance matrices $\mathbf{R}_{l',k'}^{l}$ and $\mathbf{R}_{l,k}^{l}$ are asymptotically orthogonal for all $(l',k') \in \mathcal{P}_{lk} \setminus (l,k)$, the achievable rate of user $k$ in cell $l$ converges to the limit in (27). c) As $M \to \infty$ and $S_{l',k'}^{l} \to \infty$, $\forall (l',k') \in \mathcal{P}_{lk}$, the achievable rate of user $k$ in cell $l$ converges to the limit in (28). Proof. The proof computes the asymptotic SE of each user in the network under Assumption 1 and Definition 1 when the number of antennas and/or scatterers increases. The detailed proof is available in Appendix D.
Theorem 2 reveals that the uplink SE in the asymptotic regime depends on both the number of antennas at each BS and the number of scatterers in the propagation environment. For a limited number of scatterers on each communication link, the uplink SE of user $k$ in cell $l$ is bounded when the number of antennas increases due to the pilot contamination effects. Different from [35], the SE converges to a finite point as shown in (27) even when the asymptotic orthogonality among covariance matrices holds, because of the lack of scatterers. For a rich scattering environment, the limitation mainly comes from reusing the pilot signals among users, which causes coherent interference that is dominant in the asymptotic regime. The fundamental difference of the double scattering channels compared with other spatial fading models, such as correlated Rayleigh fading or local scattering fading, is that an unbounded channel capacity is obtained only when the covariance matrices are asymptotically orthogonal and both the number of antennas at each BS and the number of scatterers grow without bound.

III. UPLINK TOTAL DATA ENERGY CONSUMPTION MINIMIZATION
This section formulates an uplink energy consumption minimization problem by assuming that user $k$ in cell $l$ requests an SE $\hat{\xi}_{lk} > 0$, $\forall l, k$, and has a maximum power $P_{\max,lk} > 0$. Investigating this optimization problem, we show that it is feasible for user locations where all the users can be served with the requested SEs under the limited power budget. In contrast, it is infeasible for certain user locations, where some users may be served with an SE lower than requested.

A. Problem Formulation
The main goal of 5G-and-beyond systems is to provide high SEs to all users with minimal power consumption.
In this paper, we formulate a total data energy optimization problem for the uplink data transmission as follows
$$\underset{\{p_{lk} \geq 0\}}{\text{minimize}}\ (\tau_c - \tau_p) \sum_{l=1}^{L} \sum_{k=1}^{K} p_{lk} \quad \text{subject to}\ R_{lk} \geq \hat{\xi}_{lk},\ p_{lk} \leq P_{\max,lk},\ \forall l, k, \quad (30)$$
where $P_{\max,lk}$ is the maximum power level that user $k$ in cell $l$ can allocate to each data symbol. Problem (30) is constrained by the rate requirement and the limited power budget of each user. The per-user power constraints implicitly indicate that the total transmit power in the network is upper bounded. In addition, the objective function of problem (30) ensures minimal network power consumption. Therefore, our proposed optimization problem is able to reduce the mutual interference to other networks.
Remark 3. Note that, in (30), we consider the per-user power constraints. It is also interesting to additionally consider a network power constraint so that the mutual interference to other networks can be controlled more effectively. For this case, the feasibility of our optimization problem is a main issue. We may first check whether the network power constraint would be active at the selected point, i.e., whether the network power constraint is satisfied under the optimized individual constraints. If it is inactive, the solution remains unaffected. If it is active, a heuristic approach would be to reduce the number of users, increase the number of antennas, or relax the per-user SE requirements. This potential extension is left for future work. In this paper, we assume that the network power constraint is always satisfied and only handle the scenario in which the per-user powers are constrained.
By setting $\xi_{lk} = 2^{\hat{\xi}_{lk} \tau_c / (\tau_c - \tau_p)} - 1$ and removing the constant $\tau_c - \tau_p$ from the objective function, problem (30) is converted from the SE constraints into the equivalent SINR constraints as
$$\underset{\{p_{lk} \geq 0\}}{\text{minimize}}\ \sum_{l=1}^{L} \sum_{k=1}^{K} p_{lk} \quad \text{subject to}\ \mathrm{SINR}_{lk} \geq \xi_{lk},\ p_{lk} \leq P_{\max,lk},\ \forall l, k. \quad (31)$$
Instead of optimizing the energy consumption as in (30), problem (31) minimizes the total transmit power that all users consume for the uplink data transmission. Since it holds for any SINR expressions $\{\mathrm{SINR}_{lk}\}$, problem (31) is in a general form that applies to any combining technique. We now focus on the MR combining technique, as the corresponding SINRs have been derived in closed form in Theorem 1. The concrete optimization problem (32) is obtained by inserting the SINR expression (15) into (31). We stress that problem (32) jointly optimizes the powers to satisfy the requested SINRs of all the users. The required SINR levels $\xi_{lk}$, $\forall l, k$, are distinct from each other in practice, and the global optimum is only found when all the users are simultaneously served with the required SEs. This problem can be either feasible or infeasible for a given set of user locations and shadow fading realizations, as presented hereafter.
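The conversion between a requested SE and its SINR target can be written as a pair of small helper functions. The sketch below follows the relation stated in the text and assumes the matching pre-log factor (1 - τ_p/τ_c) for the inverse mapping; the function names are hypothetical.

```python
import math


def se_to_sinr_target(se_req, tau_c, tau_p):
    """SINR target equivalent to a requested SE (b/s/Hz)."""
    return 2.0 ** (se_req * tau_c / (tau_c - tau_p)) - 1.0


def sinr_to_se(sinr, tau_c, tau_p):
    """SE (b/s/Hz) achieved at a given effective SINR."""
    return (1.0 - tau_p / tau_c) * math.log2(1.0 + sinr)


# Example: a coherence block of 200 symbols with 100 pilot symbols doubles
# the exponent, so a requested SE of 1 b/s/Hz maps to an SINR target of 3.
target = se_to_sinr_target(1.0, 200, 100)
```

A longer pilot phase therefore raises the SINR that is needed for the same requested SE, because fewer symbols per coherence block carry data.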

B. Feasible and Infeasible Problems
When problem (32) has a non-empty feasible set, meaning that the network is able to simultaneously provide the required SEs to all the users under the power constraints, we can find the globally optimal solution to problem (32). Indeed, the objective function is a linear combination of all the power variables $\{p_{lk}\}$, $\forall l, k$. In addition, the power budget constraint functions are affine, while the SINR constraints, $\forall l, k$, are reformulated as in (33), which are also affine functions. Consequently, (32) is a linear program in standard form [36]. We can hence solve (32) to global optimality in polynomial time, for instance by utilizing a general interior-point optimization toolbox such as CVX [37]. Problem (32) includes $LK$ optimization variables and $2LK$ constraints, and as such it has a computational complexity of the order $\mathcal{O}(2 N_1 L^3 K^3)$, where $N_1$ is the number of Newton iterations needed to obtain a predetermined precision, typically in the order of tens [36, Chapter 11]. It should be noticed that all the $LK$ users will spend non-zero data powers at the global optimum when problem (32) is feasible, owing to the non-zero SE requirements. For a specific realization of user locations and power budgets, there may be a situation in which the users cannot all be simultaneously served with their SE requirements. We emphasize that a single unfortunate user served with a lower SE suffices to create an empty feasible domain for the total transmit power optimization problem. In other words, problem (32) then lacks a feasible solution [36, Section 4.1]. The unsatisfied SE is caused by high mutual interference in cellular networks and/or by extreme locations such as the cell edge, which leave some users with weak channels. Moreover, a user may require an SE so high that the system cannot provide it even when spending the maximum data power. Fortunately, a feasible solution of the data powers might still exist for most of the users with the required SEs, while only one or a few users are unsatisfied. Consequently, it may be sufficient to remove or reduce the required SEs of those unsatisfied users to convert an infeasible problem into a feasible one. However, it is not trivial to identify which users are unsatisfied and should be removed while solving problem (32). As one of the main contributions, this paper develops power allocation strategies to handle such infeasible instances by allowing the corresponding SINR constraints to be violated.
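To make the feasible and infeasible cases concrete, the sketch below checks feasibility for a generic affine SINR model, SINR_k = gain_k p_k / (sum_j B_kj p_j + noise_k), where `gain`, `B`, and `noise` are hypothetical placeholders for the closed-form signal, interference, and noise coefficients. Instead of an interior-point solver, it uses a standard property of this problem class as a swapped-in technique: if the problem is feasible, the minimum-power solution meets every SINR target with equality, so it solves a linear system and is feasible only if the solution is positive and within the power budgets.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Returns None if the matrix is (numerically) singular.
    """
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def min_power_solution(gain, B, noise, xi, p_max):
    """Return the minimum-power vector, or None if the targets are infeasible."""
    n = len(gain)
    # SINR_k = xi_k  <=>  gain_k p_k - xi_k sum_j B_kj p_j = xi_k noise_k
    A = [
        [(gain[k] if j == k else 0.0) - xi[k] * B[k][j] for j in range(n)]
        for k in range(n)
    ]
    rhs = [xi[k] * noise[k] for k in range(n)]
    p = solve_linear(A, rhs)
    if p is None:
        return None
    feasible = all(0.0 < p[k] <= p_max[k] + 1e-12 for k in range(n))
    return p if feasible else None


# Two users with weak mutual interference: both SINR targets can be met.
p_opt = min_power_solution([1.0, 1.0], [[0.0, 0.1], [0.1, 0.0]],
                           [0.1, 0.1], [1.0, 1.0], [1.0, 1.0])
```

With strong mutual interference or very high targets, the linear system yields a non-positive or over-budget solution, which corresponds to the empty feasible set discussed above.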

OPTIMIZATION
This section proposes two algorithms that attain a fixed-point solution to problem (32) with either an empty or a non-empty feasible set. When the feasible set is empty, the SINR constraints of the users that potentially cause the congestion are relaxed. The first approach spends the maximum power on unsatisfied users. In contrast, the second approach reduces the data power of those unsatisfied users. We now introduce important notation, which will be widely utilized in this paper to construct the proposed algorithms, as shown in Definition 2.

A. Spending Maximum Transmit Power on Unsatisfied Users
For ease of exposition, problem (32) with a non-empty feasible domain is considered first. We stack all the data powers into a vector $\mathbf{p} = [p_{11}, \ldots, p_{LK}]^T \in \mathbb{R}_{+}^{LK}$; then the SINR constraint of user $k$ in cell $l$ is reformulated as $p_{lk} \geq I_{lk}(\mathbf{p})$, (34) where $I_{lk}(\mathbf{p})$ is a so-called standard interference function, which is given in (35). In (35), the detailed expressions of $\mathrm{NI}_{lk}(\mathbf{p})$ and $\mathrm{CI}_{lk}(\mathbf{p})$ have already been given in (16) and (17), but here we emphasize that they are functions of the data power variables stacked in $\mathbf{p}$. We now introduce the definition of a standard interference function, for which a low-complexity algorithm to obtain a fixed-point solution is proposed.

Definition 3 (Standard interference function). A function $I(\mathbf{z})$ is a standard interference function for all $\mathbf{z} \succeq \mathbf{0}$ if the following properties hold: 1) Positivity: $I(\mathbf{z}) > 0$. 2) Monotonicity: $I(\mathbf{z}) \geq I(\mathbf{z}')$ if $\mathbf{z} \succeq \mathbf{z}'$. 3) Scalability: $\alpha I(\mathbf{z}) > I(\alpha \mathbf{z})$ for all $\alpha > 1$.
The positivity property is because of the inherent mutual interference and thermal noise in the system, which implies a non-zero value. This means that the transmit data powers are always larger than zero when users request non-zero SEs. The monotonicity property ensures that we can scale (35) up or down by adjusting the data powers. Finally, the scalability property suggests a method to uniformly scale down the data power coefficient of user $k$ in cell $l$ at each iteration by utilizing a positive constant $\alpha$. We now construct a policy to update the data power of user $k$ in cell $l$ for the given initial values $p_{lk}(0)$, $\forall l, k$, as in Theorem 3.
Theorem 3. Assume that the feasible domain is non-empty and that $0 \leq I_{lk}(\mathbf{p}) \leq P_{\max,lk}^2$ always holds for all $\mathbf{p}$ in the feasible domain. For the initial values of data powers $p_{lk}(0) = P_{\max,lk}$, $\forall l, k$, there exist data powers for which each interference function $I_{lk}(\mathbf{p})$ is non-increasing along iterations and converges to a fixed point. In particular, the data power of user $k$ in cell $l$, denoted by $p_{lk}(n)$, can be updated at iteration $n$ as $p_{lk}(n) = I_{lk}(\mathbf{p}(n-1))$. (36) Proof. The proof verifies that every function $I_{lk}(\mathbf{p})$ defined in (35) is a standard interference function, and hence the power update policy in (36) ensures that this iterative approach converges to a fixed point. The detailed proof is available in Appendix E.
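To see why an interference function of this affine type is standard, consider the generic form below, where $a_{lk} > 0$, $b_{lk}^{l'k'} \geq 0$, and $c_{lk} > 0$ are placeholder coefficients (assumptions for illustration) standing in for the closed-form signal, interference, and noise terms of the SINR, and $\xi_{lk} > 0$ is the SINR target. The scalability property, $\alpha I(\mathbf{p}) > I(\alpha \mathbf{p})$ for $\alpha > 1$, then follows in one line:

```latex
I_{lk}(\mathbf{p}) = \frac{\xi_{lk}}{a_{lk}} \Big( \sum_{l',k'} b_{lk}^{l'k'} p_{l'k'} + c_{lk} \Big),
\qquad
\alpha I_{lk}(\mathbf{p}) - I_{lk}(\alpha \mathbf{p})
= (\alpha - 1) \frac{\xi_{lk} c_{lk}}{a_{lk}} > 0
\quad \text{for all } \alpha > 1 .
```

Positivity follows from $c_{lk} > 0$, and monotonicity from the non-negativity of the coefficients $b_{lk}^{l'k'}$, so only the strictly positive noise term is needed for the function to be standard.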
Every user in the network has its own standard interference function, which satisfies the three fundamental properties in Definition 3 and is used to update the data power as in (36). The analysis in Theorem 3 is based on the assumption that problem (32) has a global optimum for which all users are served with their required SEs. The power constraints in (32) ($p_{lk} \leq P_{\max,lk}$, $\forall l, k$) are handled as follows: if the power update at iteration $n$ would exceed $P_{\max,lk}$, then the congestion issue appears and leads to the obvious selection $p_{lk}(n) = P_{\max,lk}$. We therefore define the constrained standard interference function used at iteration $n-1$ as $\hat{I}_{lk}(\mathbf{p}(n-1)) = \min\big(I_{lk}(\mathbf{p}(n-1)), P_{\max,lk}\big)$. (37)
For a cellular Massive MIMO system with the power budget constraints and the initial data power vector $\mathbf{p}(0)$ with the entries $p_{l,k}(0) = P_{\max,l,k}, \forall l,k$, iteration $n$ updates the data power of user $k$ in cell $l$ as given in (38). Combining (37) and (38), we observe that if $\hat{I}_{l,k}(\mathbf{p}(n-1)) = P_{\max,l,k}$, the update $p_{l,k}(n) = P_{\max,l,k}$ maintains the non-increasing objective function of problem (32). Otherwise, it holds that $\hat{I}_{l,k}(\mathbf{p}(n-1)) = I_{l,k}(\mathbf{p}(n-1))$, and hence user $k$ in cell $l$ consumes less power than the maximum. This procedure is applied to all the users, which results in the alternating approach summarized in Algorithm 1. Since the convergence of the update $p_{l,k}(n) = P_{\max,l,k}$ is trivial, the proof that the proposed algorithm converges to a fixed point follows a similar methodology as [38, Theorem 7]. By assuming that the channel statistics are computed in advance and available in the network, we can compute the total number of operations that dominate the computational complexity of this algorithm as O    2  2 + 3  |P  |  , where   is the number of iterations needed to reach the fixed point in polynomial time. Notice that, in Algorithm 1, users that cannot be served with the required SEs are still allowed to utilize the maximum power. This policy aims at maximizing the SE of a particular user, but it produces more mutual interference to the other users.
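The capped update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_interference` is a hypothetical stand-in (a classical target-SINR interference function built from a gain matrix) and is not the closed-form function in (35); the names `gains`, `noise`, `sinr_targets`, and `capped_power_control` are likewise introduced here for illustration. The update itself is the usual fixed-point iteration on the constrained function, started from the maximum power.

```python
import numpy as np


def toy_interference(p, gains, noise, sinr_targets):
    """Toy standard interference function; a stand-in for (35), not the paper's formula."""
    direct = np.diag(gains)              # gain of each user's own link
    cross = gains - np.diag(direct)      # off-diagonal gains: mutual interference
    return sinr_targets * (cross @ p + noise) / direct


def capped_power_control(gains, noise, sinr_targets, p_max, tol=1e-6, max_iter=1000):
    """Iterate p(n) = min(I(p(n-1)), p_max), starting from p(0) = p_max."""
    p = np.asarray(p_max, dtype=float).copy()
    for n in range(1, max_iter + 1):
        p_new = np.minimum(toy_interference(p, gains, noise, sinr_targets), p_max)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new, n
        p = p_new
    return p, max_iter


if __name__ == "__main__":
    gains = np.array([[1.0, 0.1], [0.1, 1.0]])
    p_max = np.array([10.0, 10.0])
    # Feasible targets: both users settle well below the power budget.
    print(capped_power_control(gains, 0.1, np.array([1.0, 1.0]), p_max))
    # Infeasible targets: both users stay at the maximum power (congestion).
    print(capped_power_control(gains, 0.1, np.array([20.0, 20.0]), p_max))
```

In the infeasible case the iterate never leaves the cap, which mirrors the remark that congested users keep transmitting at full power and keep interfering with the others.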

B. Softly Removing Unsatisfied Users
Instead of allowing potentially unsatisfied users to spend full data power, one can reduce their power with the goal of reducing the mutual interference to the others. This policy might increase the number of satisfied users in the entire network. In detail, the idea is as follows: At first, every user improves the transmission quality by spending more power on each data symbol. This target can be achieved by, for example, simply constructing the standard interference functions as in the previous subsection. If the required SE cannot be achieved within the limited power budget, the unsatisfied users reduce their data power. We then suggest an update of the data powers along iterations as in Theorem 4.
Theorem 4. From the initial values $p_{l,k}(0) = P_{\max,l,k}, \forall l,k$, if the data power of user $k$ in cell $l$ is updated at iteration $n$ as in (39), then the iterative approach converges to a fixed point.
Proof. The proof first confirms that the power update policy in (39) is a so-called two-sided scalable function, and the convergence is then established. The detailed proof is available in Appendix F.
This theorem provides a procedure to minimize the total transmit power in the network and to cope with the congestion issue based on the standard interference function defined for each user in (35). If $I_{l,k}(\mathbf{p}(n-1))$ is less than the maximum power $P_{\max,l,k}$, then the data power of user $k$ in cell $l$ is updated based on (36), the same as in Algorithm 1. The main distinction is to prevent any unsatisfied user from transmitting at full power whenever the congestion issue happens, i.e., $I_{l,k}(\mathbf{p}(n-1)) > P_{\max,l,k}$. In particular, the data power of an unsatisfied user scales down with the total mutual interference and noise level, which is contained in $I_{l,k}(\mathbf{p}(n-1))$. By doing this, the mutual interference from this unsatisfied user to the others should be reduced.

Algorithm 2: Data power allocation to problem (32)

Remark 4. The proposed algorithms work in both feasible and infeasible domains, such that a fixed point to problem (32) can be obtained. For realizations of user locations that result in feasible domains, the fixed point obtained by those algorithms is unique and is the global optimum. The main difference between the two algorithms lies in the policy used to assign data powers whenever the congestion issue appears. While Algorithm 1 allocates the maximum data power to users whose SINR constraints are not satisfied, Algorithm 2 reduces their data power. As a consequence, for an infeasible domain of problem (32), the two algorithms may converge to different fixed points. We notice that it is straightforward to extend the proposed algorithms to the total downlink energy consumption optimization problem with per-user power constraints. The extension is not trivial if one considers per-BS total power budgets; in that case, a primal-dual decomposition approach might be utilized to allocate the downlink power coefficients based on the standard interference functions.
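A minimal Python sketch of the soft-removal idea is given below. Two assumptions should be read loudly: first, `toy_interference` is a hypothetical target-SINR interference function, not the closed form in (35); second, the back-off rule `p_max**2 / demand` is the classical soft-removal rule from the two-sided scalable power control literature and is used here only as a plausible instance of a power that scales down with interference and noise, so it may differ from the exact update in (39).

```python
import numpy as np


def toy_interference(p, gains, noise, sinr_targets):
    """Toy standard interference function; a stand-in for (35), not the paper's formula."""
    direct = np.diag(gains)              # gain of each user's own link
    cross = gains - np.diag(direct)      # off-diagonal gains: mutual interference
    return sinr_targets * (cross @ p + noise) / direct


def soft_removal_power_control(gains, noise, sinr_targets, p_max, tol=1e-6, max_iter=5000):
    """Start from p(0) = p_max; unsatisfied users back off instead of staying at p_max."""
    p = np.asarray(p_max, dtype=float).copy()
    for n in range(1, max_iter + 1):
        demand = toy_interference(p, gains, noise, sinr_targets)
        # Assumed rule: follow the demand if affordable, otherwise use p_max^2 / demand.
        p_new = np.where(demand <= p_max, demand, p_max**2 / demand)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new, n
        p = p_new
    return p, max_iter


if __name__ == "__main__":
    gains = np.array([[1.0, 0.1], [0.1, 1.0]])
    p_max = np.array([10.0, 10.0])
    # Feasible targets: same fixed point as the capped iteration.
    print(soft_removal_power_control(gains, 0.1, np.array([1.0, 1.0]), p_max))
    # Infeasible targets: users settle strictly below the maximum power.
    print(soft_removal_power_control(gains, 0.1, np.array([20.0, 20.0]), p_max))
```

In this toy setting the feasible case behaves exactly like the capped iteration, while in the infeasible case the users settle below the power budget, which is the qualitative difference between the two policies discussed in Remark 4.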

V. NUMERICAL RESULTS
We consider a Massive MIMO system with $L = 4$ square cells in a 1 km$^2$ area, each serving $K = 5$ users. All the users are uniformly distributed within their cells at a distance of no less than 35 m from the BS. Each coherence block has $\tau_c = 200$ symbols and there are $\tau_p = 5$ orthogonal pilot signals with the power $\hat{p}_{l,k} = P_{\max,l,k} = 200$ mW, $\forall l,k$. Without loss of generality, the users with the same index in all cells share the same orthogonal pilot signal. The system bandwidth is 20 MHz and the noise variance is −96 dBm with a noise figure of 5 dB. The large-scale fading coefficients are modeled as in (40), where $d_{l,k}^{l'} > 35$ m is the distance between user $k$ in cell $l$ and BS $l'$, and $z_{l,k}^{l'}$ is the shadow fading coefficient, which follows a Gaussian distribution with zero mean and standard deviation 7 dB. The covariance matrices are computed by using [17, (13) and (16)]. In the proposed algorithms (Algorithms 1 and 2), we set the accuracy parameter to 0.001, except in Fig. 5, which visualizes the convergence property. For feasible systems, the global optimum obtained by utilizing interior-point methods from previous works like [40], [41] is included for comparison.

Figure 2 plots the CDF of the SE per user [b/s/Hz] with different numbers of scatterers. Each BS is equipped with 100 antennas. The Monte-Carlo simulations produce the same SE as the closed-form expression, which verifies the correctness of Theorem 1 when the number of scatterers varies. Clearly, the SE per user improves in rich scattering environments. On average, a notable gain of 1.25× in SE is obtained if each channel has 21 scatterers instead of 11 scatterers. However, the additional gain is small, e.g., only 6.6%, if the propagation environment has 31 scatterers. Therefore, Fig. 2 unveils a slow growth of the SE as a function of the number of scatterers. At the 95%-likely point, the three considered scenarios provide the same SE of 0.16 [b/s/Hz] without data power control. Consequently, it seems that poor scattering environments affect the worst SE only slightly.
Figure 3 shows the CDF of the SE per user [b/s/Hz] for a system with either the MR or ZF combining technique and a small number of scatterers per propagation channel. The transmit power per symbol is 50 mW and the large-scale fading coefficients are computed similarly to (40) but with a penetration loss of 20 dB. ZF generally provides better performance than MR since it cancels out mutual interference more effectively [17]. On average, a system with MR combining is still the baseline and offers a lower SE than a system with ZF combining. Nonetheless, Fig. 3 demonstrates the sensitivity of ZF when the propagation environment lacks scatterers: many user locations and shadow fading realizations result in low-rank channels. Consequently, MR outperforms ZF by about 45.5% at the median SE.

Figure 4 presents the CDF of the SE per user for different spatially correlated channel models. There are 21 scatterers for each propagation link with the double scattering channel model. The exponential correlation model is defined as in [10] with the correlation magnitude 0.9, while the local scattering channel model is defined in [20].

Figure 5 illustrates the convergence of Algorithms 1 and 2 for two different required SEs. They converge fast to a fixed point after a few tens of iterations. If each user requests an SE of 1 [b/s/Hz], the proposed algorithms need fewer than 10 iterations to converge, and they reach the same fixed point. This fixed point is the global optimum since the optimization problem is always feasible for the user locations and shadow fading realizations that have been generated. When the required SEs increase to 2 [b/s/Hz], the proposed algorithms require around 40 iterations to converge. The convergence rate is therefore slower when the SE requirements grow. This SE setting also manifests the benefits of Algorithm 2, which yields 20% less total transmit power than Algorithm 1. On the other hand, the two algorithms converge to different fixed points.
We show the CDF of the data power [mW] consumed by each user in Fig. 6 for feasible systems with two different required SEs. In line with the claim in Remark 4 for feasible systems, the proposed algorithms provide a unique fixed point that is the global optimum, the same as the one obtained by the interior-point methods. Additionally, the data power escalates when users require higher SEs. With the required SE of 1.5 [b/s/Hz], each user only spends 5.2 mW for each data symbol on average. However, this grows drastically to 11.4 mW (corresponding to 2.2× more power) with the required SE of 1.75 [b/s/Hz]. Both of the considered SE settings illustrate significant reductions of transmit power compared to the scenario dedicating full power to the data symbols. In particular, the users consume on average 38.5× and 17.5× less power than the full power transmission with the two considered SEs, respectively.
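The ratios quoted above follow directly from the reported averages and the 200 mW power budget of the simulation setup. The short check below reproduces them; the variable names are chosen here for illustration only.

```python
full_power_mw = 200.0                    # maximum data power per symbol in the setup
avg_power_mw = {1.5: 5.2, 1.75: 11.4}    # required SE [b/s/Hz] -> average data power [mW]

# Reduction factor relative to full-power transmission for each required SE.
reduction = {se: full_power_mw / p for se, p in avg_power_mw.items()}
# Growth of the average power when the required SE goes from 1.5 to 1.75 b/s/Hz.
growth = avg_power_mw[1.75] / avg_power_mw[1.5]

for se, factor in reduction.items():
    print(f"SE {se} b/s/Hz: {factor:.1f}x less than full power")
print(f"Power growth from SE 1.5 to 1.75: {growth:.1f}x")
```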
Figure 7 displays the CDF of the data power consumption [mW] per user for infeasible systems. This case is the main interest of this paper when working with multiple access in Massive MIMO communications, since there is no global optimum to obtain or compare against. All the users consume non-zero powers at the fixed points identified by Algorithms 1 and 2. The trend that more data power is needed when the users require higher SEs still holds. In more detail, the data power obtained by Algorithm 1 grows 1.6× from 16.6 mW to 27.0 mW when the required SE increases from 1.5 [b/s/Hz] to 1.75 [b/s/Hz]. The data power increases 1.7× from 14.5 mW to 24.1 mW if Algorithm 2 is exploited. Moreover, the data power consumption per user obtained by Algorithm 1 is 12.3% and 15.1% higher than by Algorithm 2.

Figure 8 plots the satisfied SE probability, defined as the fraction of random user locations and shadow fading realizations in which the users can be served with the required SEs. If each user requires an SE of 1.5 [b/s/Hz], all the benchmarks provide a very high satisfied SE probability. For instance, the interior-point methods satisfy the required SEs in 96.7% of the user locations and shadow fading realizations. Meanwhile, the proposed algorithms offer a satisfied SE probability of 99.8%. However, the interior-point methods perform worse with higher SE requirements, since a single unsatisfied user is sufficient to create an empty feasible set, as mentioned in Section III-B. In particular, only 6.3% of the users are satisfied with the required SE of 2 [b/s/Hz]. In contrast, the proposed algorithms still offer a satisfied SE probability of more than 75%. Furthermore, Algorithm 2 performs slightly better than Algorithm 1 in those required SE settings.
Figure 9 provides the served SE per user [b/s/Hz] when the users have different required SEs, which are uniformly distributed in the range [1, 3] [b/s/Hz] over many user locations and shadow fading realizations. The interior-point methods are not included since the optimization problem always has an empty feasible domain in this complicated scenario. Interestingly, Algorithm 1 performs somewhat better than Algorithm 2, since the former satisfies the required SEs of 86.5% of the users, while the latter satisfies only 82.5%. However, Fig. 10 indicates that Algorithm 2 produces a fixed point that has much lower power consumption than Algorithm 1. The power saving is about 54.7% on average thanks to the data power reduction policy in (39) whenever the congestion issue appears.

VI. CONCLUSION
This paper has analyzed the performance of Massive MIMO systems with an arbitrary number of BS antennas, users, and scatterers by utilizing the double scattering channel model, rather than only the asymptotic regime as in previous works. The closed-form expression of the uplink SE per user was first computed, and then the asymptotic performance was obtained. We further formulated and solved a total transmit power minimization problem with the required SE constraints and a limited power budget. We proposed two algorithms to effectively handle the congestion issue, which often happens since multiple users simultaneously connect to the network and share the same time and frequency resources. The solutions obtained by those algorithms are quite similar to each other if the required SEs can almost be satisfied with the given power budget. In contrast, Algorithm 2 outperforms Algorithm 1 in scenarios where the SE requirements are vastly different and many users cannot be served with their required SEs.

A. Proof of Lemma
For matrix $\mathbf{B}$, we first compute the statistics of two channels with different index pairs by averaging over the realizations of the small-scale fading coefficients, as in (41). The first expectation on the right-hand side of (41) is computed by plugging in the double scattering channel model in (1), as in (42), where the last equality of (42) is obtained by utilizing [32, Lemma 8] to compute the covariance matrix of the circularly symmetric complex Gaussian matrix $\mathbf{G}$ for a given deterministic matrix $\mathbf{R}$. In a similar manner, the second expectation on the right-hand side of (41) is computed in closed form as in (43). Plugging (42) and (43) into (41), we obtain the result shown in (2). For a given deterministic matrix $\mathbf{B}$, the statistical information of the channel $\mathbf{h}$ is computed as in (44), where the last equality of (44) is obtained by utilizing the normalization term $\mathbf{R}^{1/2}\mathbf{g}$. Let us introduce the new variable $\mathbf{z}$; it is then straightforward to prove that $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ and that it is independent of $\mathbf{g}$. Thus, (44) can be rewritten in an equivalent form and, after doing some algebra, we obtain the expression of the channel estimate $\hat{\mathbf{h}}$ as shown in the lemma.

C. Proof of Theorem 1
We compute the expectation in the numerator of (14), noting that the MR combining vector $\mathbf{v}$ equals the channel estimate $\hat{\mathbf{h}}$, as in (51), where the last equality in (51) is obtained by using the covariance property in (10). The first part of the denominator of (14) is decomposed into the coherent and non-coherent interference based on the pilot reuse pattern, as in (52). The first expectation on the right-hand side of (52) is the non-coherent interference and is computed in closed form, as in (53), by the independence of the two random variables $\mathbf{v}$ and $\mathbf{h}$. The second expectation on the right-hand side of (52) is the coherent interference and is computed by utilizing the channel estimate in (8), as in (54). In order to obtain the result in (55), we have borrowed (2) in Corollary 1. The second expectation of (54) is computed by exploiting (3), as in (56). Thanks to the independence between the channel and noise, the last expectation of (54) is computed as in (57). Plugging (55)-(57) into (54) and doing some algebra, the coherent interference term (54) is obtained in closed form as in (58). Combining (52), (53), and (58), the first part of the denominator of (14) is computed in closed form as in (59). Substituting (51) and (59) into (14) and doing some algebra, we obtain the closed-form SINR expression as in the theorem.

D. Proof of Theorem 2
We begin by dividing the numerator and denominator of the SINR expression (15) and bounding the first part of the denominator as in (60), where (a) is obtained by the upper bound on the trace of a matrix product [33, Lemma B.7]. By applying Assumption 1 to the last result in (60), we observe that this part converges to zero as either the number of BS antennas or the number of scatterers grows without bound. It is also straightforward to prove that the last part in the denominator of the SINR expression (15) converges to zero in the same two regimes, as in (61). Combining (60) and (61), the denominator of (15) reduces to the coherent interference term $\mathsf{CI}_{l,k}$, and therefore we obtain the asymptotic SINR expression, as the number of BS antennas grows without bound for a given finite set of scatterers and covariance matrices, shown in (26). When the covariance matrix of user $k$ in cell $l$ is asymptotically orthogonal to all the other covariance matrices of the users sharing the same pilot signal as user $k$ in cell $l$, the second part in the denominator of (15) is treated similarly, where (a) is obtained by [33, Lemma B.7] and (b) follows from our assumptions on the covariance matrices. Consequently, the asymptotic uplink SE of user $k$ in cell $l$ is obtained as in (27).
When both the number of antennas at each BS and the number of scatterers grow without bound while the covariance matrices are non-orthogonal, the first and last parts in the denominator of (15) go to zero, while the second part converges to a limit, and hence we obtain the asymptotic SE expression shown in (28). The last case in (29) is obtained since the denominator of (15) goes to zero, while the numerator goes to a constant.

E. Proof of Theorem 3
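We first prove that every $I_{l,k}(\mathbf{p})$ is a standard interference function as given in Definition 3. Indeed, the positivity property is true since it holds for all $\mathbf{p} \succeq \mathbf{0}$ that $I_{l,k}(\mathbf{p}) \ge I_{l,k}(\mathbf{0}) > 0$, as shown in (64). A comparison of two ordered power vectors then confirms that $I_{l,k}(\mathbf{p})$ satisfies the monotonicity property, and the scalability property follows because the noise term is independent of the data powers. Since every $I_{l,k}(\mathbf{p})$ is a standard interference function, the update procedure in (36) guarantees the following. First, beginning with the initial data power values $p_{l,k}(0) = P_{\max,l,k}, \forall l,k$, all the updated power coefficients at iteration $n$ are in the feasible domain. Indeed, we can prove this statement by mathematical induction following similar steps as [28, Lemma 3]. Second, the update in (36) ensures a reduction of the objective function along iterations.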
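The three properties can also be sanity-checked numerically before attempting a formal proof. The sketch below does this by random sampling for a hypothetical target-SINR interference function (`toy_interference`, which is not the function in (35)); the helper name `is_standard` and all parameter values are assumptions made for illustration. A passing check is of course not a proof, only evidence that no counterexample was found among the samples.

```python
import numpy as np


def toy_interference(p, gains, noise, sinr_targets):
    """Toy standard interference function; a stand-in for (35), not the paper's formula."""
    direct = np.diag(gains)              # gain of each user's own link
    cross = gains - np.diag(direct)      # off-diagonal gains: mutual interference
    return sinr_targets * (cross @ p + noise) / direct


def is_standard(f, dim, trials=200, seed=0):
    """Randomly test positivity, monotonicity, and scalability of f on [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        p = rng.uniform(0.0, 1.0, dim)
        p_big = p + rng.uniform(0.0, 1.0, dim)   # p_big >= p elementwise
        alpha = 1.0 + rng.uniform(0.01, 5.0)     # a scaling constant larger than one
        positive = np.all(f(p) > 0)
        monotone = np.all(f(p_big) >= f(p))
        scalable = np.all(alpha * f(p) > f(alpha * p))
        if not (positive and monotone and scalable):
            return False
    return True


if __name__ == "__main__":
    gains = np.array([[1.0, 0.2, 0.1], [0.3, 1.5, 0.2], [0.1, 0.4, 2.0]])
    targets = np.array([1.0, 2.0, 0.5])
    print(is_standard(lambda p: toy_interference(p, gains, 0.1, targets), 3))
    # A quadratic map violates scalability, so the check rejects it.
    print(is_standard(lambda p: p**2, 3))
```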

F. Proof of Theorem 4
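Before going into the proof, we recall the so-called two-sided scalable function [42]. Specifically, a function $f(\mathbf{z})$ is two-sided scalable if, for all $\alpha > 1$, the condition $\frac{1}{\alpha}\mathbf{z} \preceq \hat{\mathbf{z}} \preceq \alpha \mathbf{z}$ implies the two-sided inequality $\frac{1}{\alpha} f(\mathbf{z}) < f(\hat{\mathbf{z}}) < \alpha f(\mathbf{z})$.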
We stress that the authors in [43] gave a toy example of a two-sided scalable function to update the data transmit power for a communication system under perfect channel state information. Unlike the previous works, all the functions $I_{l,k}(\mathbf{p}(n-1))$ involve complicated expressions capturing the effects of channel estimation, pilot contamination, non-coherent interference, and noise. We now prove that the update in (39) is a two-sided scalable function. If $I_{l,k}(\mathbf{p}(n-1)) \le P_{\max,l,k}$, then it is sufficient to prove that $I_{l,k}(\mathbf{p}(n-1))$ is a two-sided scalable function. Indeed, we have shown in Theorem 3 that $I_{l,k}(\mathbf{p}(n-1))$ is a standard interference function. Therefore, for $\frac{1}{\alpha}\mathbf{p}'(n-1) \preceq \mathbf{p}(n-1) \preceq \alpha \mathbf{p}'(n-1)$, we obtain the chain of inequalities ending in (74), which completes the proof that $I_{l,k}(\mathbf{p}(n-1))$ is a two-sided scalable function. From the initial values $p_{l,k}(0) = P_{\max,l,k}, \forall l,k$, the update in (39) ensures that the iterative algorithm converges to a fixed point.
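The step from a standard interference function to a two-sided scalable one can be spelled out in two lines. The block below is a standard argument written under assumed notation ($I$ for the interference function with indices dropped, $\alpha > 1$, and two power vectors $\mathbf{p}$, $\mathbf{p}'$); it uses only monotonicity and scalability and may differ in detail from the derivation that leads to (74).

```latex
% Assumption: I(.) is a standard interference function (monotone and scalable).
% Let \alpha > 1 and (1/\alpha) p \preceq p' \preceq \alpha p.
\begin{align}
  % Upper bound: p' \preceq \alpha p, then monotonicity followed by scalability.
  I(\mathbf{p}') &\le I(\alpha \mathbf{p}) < \alpha I(\mathbf{p}), \\
  % Lower bound: p \preceq \alpha p', then the same two properties applied to p'.
  I(\mathbf{p}) &\le I(\alpha \mathbf{p}') < \alpha I(\mathbf{p}')
  \;\Longrightarrow\; \tfrac{1}{\alpha} I(\mathbf{p}) < I(\mathbf{p}').
\end{align}
% Together: (1/\alpha) I(p) < I(p') < \alpha I(p), i.e., I is two-sided scalable.
```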

Definition 1. The two covariance matrices R    and R   , ∀ , , ,  are asymptotically spatially orthogonal if 1
Figure 1 shows the cumulative distribution function (CDF) of the SE per user [b/s/Hz] to verify the correctness of the closed-form expression of the uplink SE for each user obtained in Theorem 1. There are 21 scatterers per communication link and all users spend full power for the data transmission. In particular, the closed-form expression matches the Monte-Carlo simulation results very well for all the considered numbers of BS antennas. This figure also illustrates that the SE per user improves when each BS is equipped with more antennas. On average, each user can be served with a data rate increasing from 1.3 [b/s/Hz] to 1.8 [b/s/Hz] if the number of BS antennas increases from 50 to 150, which is a 38.5% data rate improvement. With the same number of added antennas, the median SE improves significantly, by 60%, as a consequence of the SE per user increasing from 1.25 [b/s/Hz] to 2 [b/s/Hz].

Fig. 2. The CDF of the uplink SE per user [b/s/Hz] with Monte-Carlo simulation and closed-form expression with $M = 100$.

Fig. 4. The CDF of the SE per user with the different spatially correlated channel models, $M = 100$.
The local scattering channel model [20] is configured with 6 scattering clusters, an angular standard deviation of 5°, and half-wavelength antenna spacing. By assuming that the scattering clusters are in the half-space in front of the BSs, the local scattering channel model offers the highest SE per user, up to 2.1 [b/s/Hz] on average. The exponential correlation model provides an SE of about 1.8 [b/s/Hz] per user. Meanwhile, the double scattering model yields the lowest SE, only 1.6 [b/s/Hz], since it takes both the local scattering property and the rank deficiency into account.

Fig. 6. The CDF of the power consumption per user [mW] for feasible systems with the different required SEs at the users, $M = 100$, and $S_{l,k}^{l'} = 21, \forall l,k,l'$.

Fig. 7. The CDF of the power consumption per user [mW] for infeasible systems with the different required SEs at the users, $M = 100$, and $S_{l,k}^{l'} = 21, \forall l,k,l'$.

Fig. 8. The satisfied SE probability versus the different required SE per user for a system with $M = 100$ and $S_{l,k}^{l'} = 21, \forall l,k,l'$.
Figure 11 shows the percentage of interference suppression obtained by Algorithm 2 in comparison to Algorithm 1 for different required SEs per user. Softly removing unsatisfied users generates less mutual interference than spending the maximum transmit power on those users, especially when the SE requirements are high. For instance, the mutual interference from Algorithm 2 is only 1.3% less than that of Algorithm 1 if the required SE per user is 1.5 [b/s/Hz]. However, the mutual interference suppression reaches 17.2% with the SE requirement of 2 [b/s/Hz]. In particular, Algorithm 2 suppresses mutual interference significantly, by about 35.4%, when each user has its own SE requirement in the range from 1 [b/s/Hz] to 3 [b/s/Hz]. We therefore conclude that the second algorithm is effective compared with the first one.
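dB. The large-scale fading coefficient [dB] of user $k$ in cell $l$ and BS $l'$ is modeled based on the 3GPP LTE specifications [39] as $\beta_{l,k}^{l'} = -128.1 - 37.6 \log_{10}\left( d_{l,k}^{l'} / 1\,\text{km} \right) + z_{l,k}^{l'}$,

to construct the combining vector, and the last equality in (54) is decomposed based on the independence among the channels and the uncorrelation between the channels and noise. In the last equation of (54), the first expectation is computed by using the independence of the two random channel vectors.

Let us denote two vectors $\mathbf{p}$ and $\mathbf{p}'$ having $p_{l,k} \ge p'_{l,k}, \forall l,k$. The difference $I_{l,k}(\mathbf{p}) - I_{l,k}(\mathbf{p}')$ then consists of the differences of the non-coherent interference terms $\mathsf{NI}_{l,k}$ and of the coherent interference terms $\mathsf{CI}_{l,k}$ evaluated at the two power vectors, so that $I_{l,k}(\mathbf{p}) \ge I_{l,k}(\mathbf{p}')$, which confirms the monotonicity. For the scalability, we expand $I_{l,k}(\mathbf{p})$ into its $\mathsf{NI}_{l,k}(\mathbf{p})$, $\mathsf{CI}_{l,k}(\mathbf{p})$, and $\mathsf{NO}_{l,k}$ parts, where (a) is obtained since $\mathsf{NO}_{l,k}$ is independent of the data powers and (b) is obtained after doing some algebra. The positivity bound in (64) is strictly larger than zero.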