Spectral Efficiency Maximization for Massive MIMO Uplink With Intra-Cell Pilot Reuse

Massive multiple-input multiple-output (mMIMO) and spatial multiplexing of devices are crucial technologies to support massive connectivity. As the number of users increases, it becomes impossible to assign unique orthogonal pilot sequences to each device, making the necessary channel state estimation challenging. Hence, intra-cell pilot reuse becomes necessary. In this letter, we propose a novel method for determining the optimal pilot length that maximizes spectral efficiency (SE) in mMIMO networks with pilot reuse. To that end, we develop an approximated expression for the signal-to-interference-plus-noise ratio, which is a function of the pilot length. The proposed method achieves accurate SE for different degrees of channel correlation and block length. This allows for efficient evaluation of the SE for different pilot lengths, without having to resort to lengthy network simulations.


Lucas Ribeiro and Markku Juntti, Fellow, IEEE

I. INTRODUCTION
One key challenge for the massive Internet of Things (IoT) is the design of multiple access techniques that support the huge number of connected devices. Thanks to its high capability for spatial multiplexing, massive multiple-input multiple-output (mMIMO) is a crucial technology to support massive connectivity. Accurate channel state information (CSI) is required to fully benefit from spatial multiplexing. In conventional cellular systems, each user equipment (UE) within the cell is assigned a unique orthogonal pilot sequence. However, when it comes to massive connectivity, assigning unique sequences to all the UEs is not feasible. Therefore, intra-cell pilot reuse (PR) is unavoidable, extending the pilot contamination [1] to connectivity within one cell.
Reducing pilot contamination is critical, as it degrades channel estimation accuracy and communication quality. Although the problem can be alleviated by an intelligent assignment of the pilots [2], the performance is limited by the overlap in the directions of the UEs' incoming signals at the base station (BS) [3]. Thus, the number of sequences, i.e., their length, directly affects the channel estimation accuracy and the spectral efficiency (SE). Longer pilot sequences improve the accuracy of the channel estimates, but increase the pilot overhead. Conversely, shorter sequences lead to less accurate channel estimates, reducing the signal-to-interference-plus-noise ratio (SINR). The optimal pilot length in systems with PR has no closed-form solution [4].
Massive MIMO and PR for addressing massive connectivity have been studied in [4], [5], [6], [7]. Björnson et al. [4] optimized the number of UEs to maximize the uplink (UL) SE. In the case of uncorrelated channels and a multi-cell system with inter-cell PR, their findings suggest that dedicating half of the time to channel training is optimal. For a similar multi-cell environment, Zhang et al. [5] derived approximate expressions for finding the optimal pilot length that maximizes the sum rate. However, their SINR approximation does not consider the spatial correlation of the channels, only the average large-scale attenuation. Massive access within single-cell mMIMO systems is explored in [6] and [7], where SINR approximations are derived for uncorrelated channels. In [6], de Carvalho et al. considered that only a subset of UEs is active at each time and that the set of active UEs performs pilot hopping. In [7], Yan and Yang proposed the use of non-orthogonal (unique) pilot sequences to tackle the massive access. Other solutions for mitigating the effects of pilot contamination in massive MIMO systems are presented in [8], [9], [10]. In [8] and [9], novel rate-splitting multiple access transmission strategies are proposed. In [10], a reverse time-division-duplexing strategy was introduced for systems with underlay spectrum sharing.
In this letter, we study the extensive reuse of pilot sequences to address the massive connectivity problem. We propose a method to find the optimal pilot length that maximizes the SE in an mMIMO network with intra-cell reuse of orthogonal pilot sequences. To the authors' knowledge, this problem has not yet been addressed in the literature. To this end, an approximated expression for the SINR is derived. We consider the spatial correlation of the UL channels to model the average pilot contamination. Specifically, we model the SINR as a random variable (RV) with a Gamma distribution whose parameters depend on the pilot length τ. This allows us to efficiently evaluate the SE for distinct pilot lengths without having to resort to excessively long computer simulations. We show that our solution achieves accurate SE for distinct degrees of channel correlation and block lengths.
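Notations: Boldface lowercase letters, $\mathbf{x}$, denote column vectors and boldface uppercase letters, $\mathbf{X}$, denote matrices. The superscripts $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$ denote transpose, conjugate, and conjugate transpose, respectively. The multivariate circularly symmetric complex Gaussian distribution with covariance matrix $\mathbf{R}$ and mean $\mathbf{x}$ is denoted as $\mathcal{CN}(\mathbf{x}, \mathbf{R})$. The $n \times n$ identity matrix is $\mathbf{I}_n$, and $\mathrm{diag}(\mathbf{x})$ denotes the diagonal matrix with the elements of $\mathbf{x}$ on its main diagonal. The expected value of $x$ is denoted as $\mathbb{E}[x]$. The trace of $\mathbf{X}$ is denoted as $\mathrm{tr}(\mathbf{X})$.

II. SYSTEM MODEL
We consider an UL communications scenario with a set $\mathcal{K} = \{1, \ldots, K\}$ of K single-antenna UEs communicating with a BS that is equipped with an M-element uniform linear array (ULA). The received signal, $\mathbf{y} \in \mathbb{C}^M$, is given by (1), where $\mathbf{x} \in \mathbb{C}^K$ is the transmitted symbol vector, and $\mathbf{n} \in \mathbb{C}^M$ is the noise vector at the M receive antennas, modelled as $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_M)$, where $\sigma_n^2$ is the noise power. The channel $\mathbf{h}_k \in \mathbb{C}^M$ between the BS and the k-th UE is modelled as an independent correlated Rayleigh fading realization, $\mathbf{h}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_k)$, with covariance matrix $\mathbf{R}_k \in \mathbb{C}^{M \times M}$. We adopt the one-ring channel model, as described in [1]. This model assumes that the multi-path components are concentrated in a circle around the UEs, resulting in the (l,m)-th element of $\mathbf{R}_k$ given by
$$[\mathbf{R}_k]_{l,m} = \beta_k \int_{\mathcal{A}_k} e^{j 2\pi \Delta_r (l-m) \sin(\theta_k)} f(\theta_k)\, d\theta_k, \qquad (2)$$
where $\beta_k$ is the average channel gain for UE k, $\mathcal{A}_k$ is the angular support for the possible incoming multi-path components from UE k, $\Delta_r$ is the antenna spacing normalized by the wavelength, $\theta_k$ is the angle-of-arrival (AoA), and $f(\theta_k)$ is the probability density function of the scatterers.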
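As an illustration, the one-ring covariance in (2) can be evaluated numerically. The following is a minimal sketch, assuming scatterers uniformly distributed over a symmetric angular interval around a nominal AoA and approximating the integral by an average over an angle grid. The function name `one_ring_covariance` and the parameters `theta_center`, `half_width`, and `n_grid` are illustrative assumptions, not quantities defined in this letter.

```python
import numpy as np

def one_ring_covariance(M, beta, theta_center, half_width, spacing=0.5, n_grid=2001):
    """Approximate the one-ring covariance of an M-element ULA.

    Assumption: scatterers are uniform on
    [theta_center - half_width, theta_center + half_width] (radians),
    so the integral over the angular support becomes a grid average.
    """
    thetas = np.linspace(theta_center - half_width, theta_center + half_width, n_grid)
    antenna = np.arange(M)[:, None]
    # Column i holds the ULA response for the i-th grid angle.
    steering = np.exp(1j * 2 * np.pi * spacing * antenna * np.sin(thetas)[None, :])
    # Entry (l, m) is beta times the average of exp(j*2*pi*spacing*(l-m)*sin(theta)).
    return beta * (steering @ steering.conj().T) / n_grid

R = one_ring_covariance(M=32, beta=1.0,
                        theta_center=np.deg2rad(20.0),
                        half_width=np.deg2rad(15.0))
print(R.shape, np.trace(R).real)
```

By construction the diagonal entries equal the average channel gain, so the trace equals M times `beta`.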
A well-known property of the channel covariance matrix $\mathbf{R}_k$ in (2) arises when the number of antennas grows towards infinity: its eigenvector matrix, $\mathbf{F} \in \mathbb{C}^{M \times M}$, can be approximated by the unitary discrete Fourier transform (DFT) matrix as in [11], see (3), where each column represents a multi-path component arriving from one of the M angular directions. As shown in [2], the associated eigenvalues, $\lambda_{k,m}$, $\forall k \in \mathcal{K}$ and $m = 1, \ldots, M$, are tightly related to the channel power angular spread (PAS), i.e., to the angular position of the UEs. Therefore, when operating in the large antenna array regime, we can write
$$\mathbf{R}_k \approx \mathbf{F} \boldsymbol{\Lambda}_k \mathbf{F}^H, \qquad (4)$$
where $\boldsymbol{\Lambda}_k \in \mathbb{R}^{M \times M}$ is the diagonal matrix containing the eigenvalues of the channel covariance matrix for UE k.
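The quality of the DFT approximation can be inspected numerically. The sketch below, under the same uniform-scatterer assumption as before, measures the fraction of the covariance energy that falls on the diagonal after rotating $\mathbf{R}_k$ by a unitary DFT matrix. A fraction close to one would indicate that the DFT basis nearly diagonalizes the covariance. The helper names and the chosen angles are hypothetical.

```python
import numpy as np

def one_ring_covariance(M, beta, theta_center, half_width, spacing=0.5, n_grid=2001):
    """Grid approximation of the one-ring covariance (uniform scatterers assumed)."""
    thetas = np.linspace(theta_center - half_width, theta_center + half_width, n_grid)
    antenna = np.arange(M)[:, None]
    steering = np.exp(1j * 2 * np.pi * spacing * antenna * np.sin(thetas)[None, :])
    return beta * (steering @ steering.conj().T) / n_grid

def dft_diagonal_energy(R):
    """Fraction of the squared Frobenius norm of F^H R F located on its diagonal."""
    M = R.shape[0]
    F = np.fft.fft(np.eye(M)) / np.sqrt(M)  # unitary DFT matrix
    D = F.conj().T @ R @ F
    return np.sum(np.abs(np.diag(D)) ** 2) / np.sum(np.abs(D) ** 2)

fractions = {}
for M in (16, 64, 256):
    R = one_ring_covariance(M, 1.0, np.deg2rad(20.0), np.deg2rad(15.0))
    fractions[M] = dft_diagonal_energy(R)
    print(M, fractions[M])
```

Since the DFT matrix is unitary, the reported fraction always lies between zero and one; how quickly it approaches one with M depends on the angular support.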

III. CHANNEL ESTIMATION AND PROBLEM FORMULATION
We consider time-invariant and flat-fading channels during one coherence block. We assume that a time-division duplex (TDD) protocol is employed, such that the UL and downlink (DL) channels remain the same within one coherence interval. In each coherence block, each of the K UEs transmits τ known pilot symbols of equal power $p_u$ to enable channel estimation. Once the pilots have been transmitted, the UEs transmit their data to the BS. The remaining duration of the coherence block is then used for DL communications.
Because of the large number of UEs in mMIMO networks, the coherence block length $\tau_c$ (in symbols) is often smaller than the number of UEs, i.e., $\tau_c < K$. Moreover, the pilot length must be smaller than the coherence block length ($\tau < \tau_c$); otherwise, there will be no time left for data transmission. Consequently, in the channel estimation phase, the same pilot is shared by $\eta = K/\tau$ UEs on average. Let $\mathcal{T} = \{1, \ldots, \tau\}$ be the set of indices of available pilot sequences. UE $k \in \mathcal{K} = \{1, \ldots, K\}$ transmits a pilot signal $\boldsymbol{\psi}_k \in \mathbb{C}^{\tau}$. We denote by $\mathcal{I}_k$ the set of UEs interfering with UE k, i.e., the set of UEs sharing the same pilot sequence as UE k. We assume that random pilot allocation is used.¹

The received pilot signal for channel estimation at the BS, $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_\tau] \in \mathbb{C}^{M \times \tau}$, can be written as in (5), where $\mathbf{N} = [\mathbf{n}_1, \ldots, \mathbf{n}_\tau] \in \mathbb{C}^{M \times \tau}$ is the noise matrix and $\boldsymbol{\Psi} = [\boldsymbol{\psi}_1, \ldots, \boldsymbol{\psi}_K]^T \in \mathbb{C}^{K \times \tau}$ is the pilot signal matrix. Given $\mathbf{Y}$ in (5), the correlated received signal for the pilot sequence assigned to user k, $\mathbf{y}_k^p \in \mathbb{C}^M$, is given by (6), where $\rho = p_u/\sigma_n^2$ denotes the ratio between the transmit and the noise powers.²
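The random pilot allocation and the resulting sets of pilot-sharing UEs can be sketched as follows. This is an illustrative sketch only: the function names and the values K = 100 and τ = 20 are assumptions chosen for the example, and each UE is assumed to draw its pilot index independently and uniformly.

```python
import numpy as np

def random_pilot_allocation(K, tau, rng):
    """Assign each of the K UEs one of tau orthogonal pilots uniformly at random."""
    return rng.integers(0, tau, size=K)

def interfering_sets(pilot_index):
    """For each UE k, return the indices of the other UEs that share its pilot."""
    K = len(pilot_index)
    ue = np.arange(K)
    return [np.flatnonzero((pilot_index == pilot_index[k]) & (ue != k)) for k in range(K)]

rng = np.random.default_rng(0)
K, tau = 100, 20
pilots = random_pilot_allocation(K, tau, rng)
sets = interfering_sets(pilots)

# Number of UEs per pilot; its mean over all tau pilots is exactly K / tau.
group_sizes = np.bincount(pilots, minlength=tau)
print("reuse factor K/tau:", K / tau, "mean UEs per pilot:", group_sizes.mean())
```

Note that individual pilots may be shared by more or fewer than K/τ UEs under random allocation; only the average over pilots equals the reuse factor.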
The linear minimum mean square error (LMMSE) estimate, $\hat{\mathbf{h}}_k$, of the channel between user k and the BS is given in (7) [2], which involves the correlation matrix of the received pilot signal for UE k. The respective channel estimation error, $\tilde{\mathbf{h}}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{\tilde{h}_k})$, is defined as $\tilde{\mathbf{h}}_k = \mathbf{h}_k - \hat{\mathbf{h}}_k$, and the error covariance matrix for user k is given in (8).

The achievable rate for UE k is lower bounded as in (9) [1], where the expectation is with respect to the channel realizations, and $\gamma_k$ is the effective SINR of UE k. When the LMMSE receive combiner in (10) is used at the BS [1], the SINR $\gamma_k$ in (9) becomes $\gamma_k^{\mathrm{LMMSE}}$, given in (11). By treating $\gamma_k^{\mathrm{LMMSE}}$ as an RV and applying Jensen's inequality, we get the lower bound (12) for (9) [12]. Therefore, the UL sum-rate maximization problem (13) can be formulated as finding the pilot length that, for a fixed $\tau_c$, maximizes the sum SE for the LMMSE receive combiner.

¹ Ideally, jointly optimizing the pilot length and the pilot allocation scheme is desirable. However, this falls beyond the scope of this letter.
² Here, ρ has the interpretation of "transmit" SNR.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
Next, we propose an approximate expression for the SINR, $\gamma_k^{\mathrm{LMMSE}}$ in (11), to solve (13) in closed form without having to resort to extensive network simulations.
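To make the estimation step concrete, the following sketch implements a textbook LMMSE estimator under pilot reuse. It is not a transcription of (7) and (8): it assumes a despread pilot observation equal to the sum of the channels of all UEs sharing the pilot plus white noise of variance `noise_var`, where the noise normalization and the names `lmmse_estimate` and `random_covariance` are hypothetical.

```python
import numpy as np

def lmmse_estimate(y_p, R_k, R_sharing, noise_var):
    """LMMSE estimate of h_k from y_p = sum_j h_j + n, with n ~ CN(0, noise_var * I).

    R_sharing lists the covariances of all UEs sharing the pilot (including UE k).
    Returns the estimate and the estimation error covariance.
    """
    M = R_k.shape[0]
    Q = sum(R_sharing) + noise_var * np.eye(M)  # covariance of the observation
    W = R_k @ np.linalg.inv(Q)                  # LMMSE filter
    h_hat = W @ y_p
    C = R_k - W @ R_k                           # error covariance R_k - R_k Q^{-1} R_k
    return h_hat, C

def random_covariance(M, rng):
    """Random Hermitian positive semidefinite matrix and one of its factors."""
    A = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2 * M)
    return A @ A.conj().T, A

rng = np.random.default_rng(4)
M, noise_var = 8, 0.1
covs, factors = zip(*(random_covariance(M, rng) for _ in range(3)))

# Draw one correlated channel per pilot-sharing UE and form the observation.
white = [(rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2) for _ in covs]
channels = [A @ w for A, w in zip(factors, white)]
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
y_p = sum(channels) + noise

h_hat, C = lmmse_estimate(y_p, covs[0], list(covs), noise_var)
print("error trace:", np.trace(C).real, "prior trace:", np.trace(covs[0]).real)
```

The error covariance shrinks relative to the prior covariance, and the more UEs share the pilot, the larger the observation covariance and the smaller this reduction.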

IV. SINR APPROXIMATION AND SE MAXIMIZATION
A. SINR Approximation
The SINR expression for UE k in (11) can be rewritten as in (14), where $\mathbf{A} \triangleq \sum_{j=1}^{K} \mathbf{R}_{\tilde{h}_j} + \frac{1}{\rho}\mathbf{I}_M$. The Gamma distribution is a good model for non-negative continuous RVs, taking a wide variety of shapes depending on its parameters [13]. Thus, as in [14] and [12], for UEs with equal transmit power, we approximate the SINR of UE k by an RV with Gamma distribution, $\gamma_k \sim \Gamma(\alpha_k, \theta_k)$, with the parameters $\alpha_k$ and $\theta_k$ given in (15)-(17). Note that this approximation is valid for $\alpha_k > 2$, $1/\theta_k > 0$, and $\eta > 1$, conditions that are satisfied in our system model.

Applying the eigenvalue decomposition to $\mathbf{A}$ in (14), we get (18), where $\lambda_i$, $i = 1, \ldots, M$, are the eigenvalues of $\sum_{k=1}^{K} \mathbf{R}_{\tilde{h}_k}$. From (4) and (7), we have (19). Substituting (4) and (19) in (8), we obtain (20) with (21). Since $\mathcal{I}_k$ is the same regardless of the direction, all the M eigenvalues of $\mathbf{R}_{\tilde{h}_k}$ in (21) follow the same distribution, whereas their mean value depends on the AoA of each UE.
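The practical benefit of a Gamma model is that the Jensen-type bound becomes available in closed form. The sketch below assumes a shape/scale parameterization of $\Gamma(\alpha, \theta)$ and a bound of the form $\log_2(1 + 1/\mathbb{E}[1/\gamma])$; both are assumptions, since the exact forms of (12) and (15)-(17) are not reproduced here. For a Gamma RV with shape α > 1 and scale θ, $\mathbb{E}[1/\gamma] = 1/((\alpha - 1)\theta)$, so the bound reduces to $\log_2(1 + (\alpha - 1)\theta)$. The values α = 5 and θ = 2 are arbitrary.

```python
import numpy as np

def gamma_rate_bound(alpha, theta):
    """log2(1 + 1/E[1/g]) for g ~ Gamma(shape=alpha, scale=theta), alpha > 1."""
    return np.log2(1.0 + (alpha - 1.0) * theta)

rng = np.random.default_rng(1)
alpha, theta = 5.0, 2.0  # arbitrary illustrative parameters
samples = rng.gamma(shape=alpha, scale=theta, size=200_000)

mc_inverse_mean = np.mean(1.0 / samples)      # Monte Carlo estimate of E[1/g]
ergodic_rate = np.mean(np.log2(1.0 + samples))  # Monte Carlo estimate of E[log2(1+g)]
closed_form = gamma_rate_bound(alpha, theta)

print("E[1/g] (MC):", mc_inverse_mean, "closed form:", 1.0 / ((alpha - 1.0) * theta))
print("bound:", closed_form, "ergodic rate (MC):", ergodic_rate)
```

The Monte Carlo average of the rate stays above the closed-form value, as expected from Jensen's inequality, while evaluating the bound itself requires no simulation.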
Finally, we can write the sum error covariance matrix as in (22), where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_m, \ldots, \lambda_M) \in \mathbb{R}^{M \times M}$, and its m-th eigenvalue is given in (23). Assuming that the UEs are uniformly distributed, the eigenvalues of their channel covariance matrices are also uniformly distributed in all directions. Thus, for a large number of UEs, $K\,\mathrm{rank}(\mathbf{R}_k) > M$, $\forall k \in \mathcal{K}$, the summation in (23) tends to the same value regardless of the direction m. In other words, the distribution of the eigenvalues becomes invariant with respect to the spatial directions. Therefore, we can approximate $\lambda_1 \approx \lambda_2 \approx \ldots \approx \lambda_M \approx \lambda_{\mathrm{avg}}$ and rewrite (18) as (24). We model the average channel estimation error, $\lambda_{\mathrm{avg}}$, as in (25), where $\eta - 1$ is the number of potentially interfering UEs, $p[\mathcal{A}_k \cap \mathcal{A}_j]$ is the probability of overlapping (interference) of the incoming signals of UEs k and j at the BS, and $\delta_{jk}$ accounts for the average normalized interference that a potentially interfering UE j causes to UE k.

1) Probability of Angular Overlapping: For a bounded angular support of the incoming signals, as in the one-ring channel model, the probability of angular overlapping is given in (26), where the superscripts max and min stand for the maximum and minimum values the variables can take. For unbounded $\mathcal{A}_k$, $\forall k = 1, \ldots, K$, the probability of interference is one, i.e., $p[\mathcal{A}_k \cap \mathcal{A}_j] = 1$.
2) Average Normalized Interference: Let $\mathcal{A}_{k \cap j} = \mathcal{A}_k \cap \mathcal{A}_j$ be the set of overlapping angles, with $\mathcal{A}_{k \cap j} = [\theta^{\min}_{k \cap j}, \theta^{\max}_{k \cap j}]$. We compute the average interference of UE j into UE k, $\delta_{jk}$, as the ratio between the angular overlap of UEs k and j and the angular spread of UE k, as given in (27). For uniformly distributed scatterers, the average interference is equal to one half, i.e., $\delta_{jk} = 1/2$. Note that the effect of distinct pilot allocation schemes can be incorporated through the average normalized interference and the probability of angular overlapping. Similarly, by reformulating (27), distinct channel models could be incorporated into the SINR model, provided that spatial correlation is still present.
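The two quantities above can be illustrated with a small Monte Carlo experiment. The sketch is not a transcription of (26) or (27): it assumes that the nominal AoAs of two UEs are drawn independently and uniformly over a sector, that both angular supports have the same width, and that the normalized interference is the overlap length divided by that width. The function name and the sector and width values are hypothetical.

```python
import numpy as np

def overlap_statistics(n_pairs, theta_min, theta_max, width, rng):
    """Estimate the overlap probability and the mean normalized overlap.

    Assumption: each UE has an angular support of length `width` centred on a
    nominal AoA drawn uniformly from [theta_min, theta_max] (degrees).
    """
    centers_k = rng.uniform(theta_min, theta_max, n_pairs)
    centers_j = rng.uniform(theta_min, theta_max, n_pairs)
    # Two equal-width intervals overlap by width - |distance| when that is positive.
    overlap = np.clip(width - np.abs(centers_k - centers_j), 0.0, None)
    hit = overlap > 0
    p_overlap = hit.mean()
    delta = (overlap[hit] / width).mean()  # average over overlapping pairs only
    return p_overlap, delta

rng = np.random.default_rng(2)
p_overlap, delta = overlap_statistics(200_000, -60.0, 60.0, 20.0, rng)
print("overlap probability:", p_overlap, "average normalized overlap:", delta)
```

Under these assumptions, the average normalized overlap of overlapping pairs comes out close to one half, in line with the value stated for uniformly distributed scatterers, and the overlap probability is close to $1 - (1 - w/L)^2$ for support width w and sector length L.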

B. SE Maximization
Using the proposed approximation for the SINR, we can rewrite (12) as (28). Then, replacing the exact rate for UE k by its approximate expression in (28), the SE maximization problem can be reformulated in the epigraph form as (29). Since this is an integer problem with a finite feasible domain, we can solve it by evaluating (29) for every feasible τ.
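The exhaustive evaluation over the feasible pilot lengths can be sketched as below. Two elements are assumptions made for the example: the pre-log factor $(1 - \tau/\tau_c)$, which charges only the pilot overhead, and the toy sum-rate function, which merely stands in for the approximate sum rate built from (28).

```python
import math

def best_pilot_length(tau_c, sum_rate_fn):
    """Evaluate the SE for every feasible pilot length and return the best one.

    Assumption: SE(tau) = (1 - tau / tau_c) * sum_rate_fn(tau), for 1 <= tau < tau_c.
    """
    best_tau, best_se = None, -math.inf
    for tau in range(1, tau_c):
        se = (1.0 - tau / tau_c) * sum_rate_fn(tau)
        if se > best_se:
            best_tau, best_se = tau, se
    return best_tau, best_se

def toy_sum_rate(tau):
    """Hypothetical stand-in: the rate grows with tau with diminishing returns."""
    return math.log2(1 + tau)

tau_star, se_star = best_pilot_length(10, toy_sum_rate)
print(tau_star, se_star)
```

Because each evaluation is a closed-form expression, the search costs only $\tau_c - 1$ function evaluations, which is what makes scanning every feasible τ practical.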
V. NUMERICAL RESULTS
We consider a circular cell with a radius of 1000 meters and K = 100 UEs uniformly distributed at random within the cell. The BS is equipped with a critically spaced ULA ($\Delta_r = 0.5$) with M = 200 elements, and the minimum distance between a UE and the BS is 100 meters. Similarly to [12], we model the large-scale fading as $\beta_k = z_k/(r_k/100)^{\nu}$, where $z_k$ is a lognormal RV with standard deviation $\sigma_{\mathrm{shadow}} = 8$ dB, $r_k$ is the distance between UE k and the BS, and $\nu = 3.8$ is the pathloss exponent. Also, we scale the transmit power per UE by $\sqrt{M}$, such that $p_u = E_u/\sqrt{M}$. For these results, we consider $E_u = 20$ dB. Furthermore, we assume that the scatterers follow a uniform distribution, where $\sigma_\theta$ stands for the angular standard deviation (ASD).
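A UE drop with the stated geometry and large-scale fading model can be generated as in the sketch below. The interpretation of the lognormal shadowing as $10^{\sigma_{\mathrm{shadow}} g/10}$ with standard normal g, the azimuth range, and the function names are assumptions made for this example.

```python
import numpy as np

def drop_ues(K, r_min, r_max, rng):
    """Drop K UEs uniformly over the annulus r_min <= r <= r_max around the BS."""
    # Uniform area density requires the squared radius to be uniform.
    r = np.sqrt(rng.uniform(r_min ** 2, r_max ** 2, K))
    phi = rng.uniform(-np.pi, np.pi, K)  # azimuth of each UE (assumed full circle)
    return r, phi

def large_scale_fading(r, nu, sigma_shadow_db, rng):
    """beta_k = z_k / (r_k / 100)^nu with lognormal shadowing z_k (assumed form)."""
    z = 10.0 ** (sigma_shadow_db * rng.standard_normal(r.shape) / 10.0)
    return z / (r / 100.0) ** nu

rng = np.random.default_rng(5)
r, phi = drop_ues(K=100, r_min=100.0, r_max=1000.0, rng=rng)
beta = large_scale_fading(r, nu=3.8, sigma_shadow_db=8.0, rng=rng)
print("median distance [m]:", np.median(r))
print("median large-scale gain [dB]:", 10 * np.log10(np.median(beta)))
```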
We compare the proposed SINR approximation against the simulated SINR, denoted as "Simulation", in terms of SE. As a baseline, we plot the SE for imperfect CSI and Rayleigh fading without PR, denoted as "W/o PR", derived in [12].

In Fig. 1, we check the capability of the proposed method to adjust to different coherence intervals. In this example, there are K = 80 UEs, the BS has M = 150 antennas, and $\sigma_\theta = 10°$. As one can see, the SE using the approximated SINR closely follows the simulated one for $\tau_c = 40$, 80, and 196 symbols. Furthermore, one may also observe that, at some point, the loss caused by increasing the training period τ outweighs the SE gain, and hence the sum SE decreases. This phenomenon can be seen whenever the pilot length is comparable in size to the block length. As a result, smaller coherence blocks suffer larger SE losses than larger ones because, proportionally, the amount of time they spend on channel training is larger.

In Fig. 2, we fix $\tau_c = 100$ and evaluate the performance of the proposed approach for distinct channel angular spreads by computing the SE for ASD values of $\sigma_\theta = 5°$, 10°, and 15°. The proposed approach provides a good approximation of the SE in all of these scenarios. One of the important benefits of this approach arises from the parameterization of the SINR by the ASD, which easily accommodates small changes in the channel structure. Here, one can note that as the AoA spread widens, the correlation of the channels decreases. This, in turn, makes it harder to distinguish the signal subspaces of distinct UEs, leading to stronger interference and degradation in the system's performance. For a fixed K, optimizing the pilot length implies indirectly optimizing the pilot reuse factor $\eta = K/\tau$. The top axis in Fig. 2 illustrates the relationship between the pilot length and the pilot reuse factor.
Lastly, we compare the solution of (29) to the optimum one obtained by simulation, $\tau^*_{\mathrm{sim}}$. The results for $\sigma_\theta = 10°$ are summarized in Table I.
The proposed approximation yields a sum SE very close to that of the optimum solution. Even when the estimated pilot length is not optimal, it provides a similar SE. It is also worth noting that, beyond 50 pilot symbols, the gains of increasing the pilot length become limited. This is likely because most of the interference has already been suppressed, so that further gains come only from a better capability to cope with noise.

VI. CONCLUSION
In this letter, we derived system design tools for mMIMO network design and optimization. We proposed a new method to determine the optimal pilot length that maximizes the SE in an mMIMO network with intra-cell pilot reuse. Our approach involves modeling the average UE interference and developing an approximated expression for the SINR. This enables efficient evaluation of the SE for various pilot lengths without lengthy system simulations. The numerical results demonstrated the effectiveness of the proposed method across different channel correlations and block lengths, achieving good accuracy in SE estimation. We also showed that by carefully selecting the optimal pilot length, we can maximize the SE while minimizing the associated overhead.

Fig. 1. SE versus the pilot length τ for distinct coherence intervals. In this example, K = 80 UEs are served by one BS with M = 150 antennas and $\sigma_\theta = 10°$.

APPENDIX
A. UE k SINR
Let $\boldsymbol{\Lambda}_k \triangleq \hat{\mathbf{H}}\hat{\mathbf{H}}^H - \hat{\mathbf{h}}_k\hat{\mathbf{h}}_k^H + \sum_{j=1}^{K}\mathbf{R}_{\tilde{h}_j} + \frac{1}{\rho}\mathbf{I}_M$. We can then rewrite (10), the LMMSE receiver vector, as
$$\mathbf{v}_k^{\mathrm{LMMSE}} = \left(\hat{\mathbf{h}}_k\hat{\mathbf{h}}_k^H + \boldsymbol{\Lambda}_k\right)^{-1}\hat{\mathbf{h}}_k. \qquad (30)$$
Applying the Sherman-Morrison matrix inversion lemma to (30), letting $\mathbf{A} = \boldsymbol{\Lambda}_k$ and $\mathbf{b} = \mathbf{c} = \hat{\mathbf{h}}_k$ in [15, eq. (160)], we can rewrite it as
$$\mathbf{v}_k^{\mathrm{LMMSE}} = \left(\boldsymbol{\Lambda}_k^{-1} - \frac{\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k\hat{\mathbf{h}}_k^H\boldsymbol{\Lambda}_k^{-1}}{1 + \hat{\mathbf{h}}_k^H\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k}\right)\hat{\mathbf{h}}_k = \frac{\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k}{1 + \hat{\mathbf{h}}_k^H\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k}. \qquad (31)$$
Since $\boldsymbol{\Lambda}_k$ is Hermitian, $(\mathbf{v}_k^{\mathrm{LMMSE}})^H\hat{\mathbf{h}}_k$ is given by
$$(\mathbf{v}_k^{\mathrm{LMMSE}})^H\hat{\mathbf{h}}_k = \frac{\hat{\mathbf{h}}_k^H\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k}{1 + \hat{\mathbf{h}}_k^H\boldsymbol{\Lambda}_k^{-1}\hat{\mathbf{h}}_k}. \qquad (32)$$
Rearranging (32), we obtain (33). Now, letting $\mathbf{A} \triangleq \sum_{j=1}^{K}\mathbf{R}_{\tilde{h}_j} + \frac{1}{\rho}\mathbf{I}_M$, we can rewrite (33) as (34). After some manipulations and applying the matrix inversion lemma [15, eq. (162)] to the expression between the brackets in (34), we get (35). Therefore, (34) can be rewritten as (36). From (36), we obtain the instantaneous SINR for user k.

© 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
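The Sherman-Morrison step of the appendix can be checked numerically. The sketch below uses a random Hermitian positive definite matrix as a stand-in for $\boldsymbol{\Lambda}_k$ and a random vector as a stand-in for $\hat{\mathbf{h}}_k$ (both hypothetical), and compares the direct inverse in (30) with the rank-one update form.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 6

# Random Hermitian positive definite stand-in for Lambda_k (assumption).
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Lam = B @ B.conj().T + np.eye(M)
# Random stand-in for the channel estimate of UE k (assumption).
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Direct form: v = (h h^H + Lambda)^{-1} h.
v_direct = np.linalg.solve(np.outer(h, h.conj()) + Lam, h)

# Sherman-Morrison form: v = Lambda^{-1} h / (1 + h^H Lambda^{-1} h).
Lam_inv_h = np.linalg.solve(Lam, h)
q = np.vdot(h, Lam_inv_h).real  # quadratic form h^H Lambda^{-1} h
v_lemma = Lam_inv_h / (1.0 + q)

gain = np.vdot(v_direct, h)     # v^H h, expected to equal q / (1 + q)
print(np.allclose(v_direct, v_lemma), gain, q / (1.0 + q))
```

The rank-one form needs only one linear solve with $\boldsymbol{\Lambda}_k$, and the quadratic form q can be recovered from the combiner gain as gain / (1 - gain).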