Spectral Efficiency of Time-Variant Massive MIMO Using Wiener Prediction

Massive multiple-input multiple-output (MIMO) serves as a technological pillar in current 5G deployments. The spectral efficiency (SE) reduction due to the time-variance of the wireless channel, and the respective mitigation strategies, are still not sufficiently understood. In this letter, we propose a multi-step Wiener predictor with a general temporal covariance function to quantify the impact of time-variance on the SE for different numbers of consecutive pilot symbols and prediction horizons. We provide asymptotic expressions of the SE for maximum ratio combining (MRC) beam-forming that show excellent agreement with numerical simulations. We investigate the channel hardening capability of massive MIMO systems in time-variant propagation channels and show that it is largely independent of the prediction horizon. The numerical evaluations show that channel prediction doubles the SE when four uplink pilot symbols are used instead of one, for a prediction horizon of $0.3\,\lambda$ and for both MRC and regularized zero-forcing (RZF) beam-forming.


I. INTRODUCTION
MASSIVE MIMO systems are an important enabling technology in current 5G deployments, providing improved SE and reliability. However, they still struggle to unlock their potential in high-mobility scenarios, e.g., in automotive and industrial use cases.
The fundamental cause for this is the system's dependency on high-quality and timely channel state information (CSI) for the communication channel from all users to all base station (BS) antenna elements for accurate beam-forming [1]. Currently, the only viable and scalable solution is to apply a time division duplex (TDD) method and rely on channel reciprocity. In high-mobility scenarios, this approach inherently causes service degradation due to outdated CSI, also termed channel aging [2]. Solely increasing pilot overhead to overcome channel aging quickly becomes prohibitively expensive. Therefore, channel prediction is widely considered the method of choice to combat outdated CSI [3], [4], [5], [6], [7].
In [3], the effects of channel aging are thoroughly derived, but the utilized autoregressive temporal correlation model of order one (AR(1)) and the consideration of only one-step prediction limit the applicability of the results to bandlimited real-world fading processes. In [4], the authors model achievable sum-rates of MIMO systems in the presence of channel aging and derive prediction strategies. However, again a simplified AR(1) model is utilized for the temporal correlation. A more suitable correlation is found in [8], where Clarke's model is employed for modeling channel aging. However, no prediction algorithms are introduced or analyzed in [8]. Recent studies on high-mobility massive MIMO utilize advanced prediction methods, such as Kalman filtering [5], [6] and machine learning (ML) [6], [7]. While providing state-of-the-art channel prediction quality, the work in [5], [6], and [7] is not easily accessible to analytic and asymptotic considerations, especially concerning the variance of the achieved received signal power around its mean (which becomes negligible in case of channel hardening).
In this letter, we analyze the uplink SE of a massive MIMO system employing channel prediction to quantify its degradation due to user mobility. Further, we investigate the variance of the received signal power around its mean for different prediction horizon lengths, i.e., how channel prediction influences channel hardening. This facilitates the reliability analysis of a massive MIMO system in terms of received signal power fluctuation.
Scientific Contribution:
• We derive the instantaneous and asymptotic signal-to-interference-and-noise ratio (SINR) and the SE of a TDD massive MIMO system for a time-correlated fading process utilizing multi-step minimum mean square error (MMSE) (Wiener) prediction. This result generalizes the work of [3] for single-step AR(1) process prediction.
• We prove and show by numerical simulation that the capability of massive MIMO to reduce the variance of the received signal power is independent of the CSI age, i.e., channel hardening is independent of channel aging. The CSI age mainly affects the SINR and thereby the SE.
Notation: Throughout this work, letters in italic font ($a$) denote scalars. Bold lower case letters ($\mathbf{a}$) denote vectors, while bold upper case letters ($\mathbf{A}$) denote matrices. $(\cdot)^{T}$ marks the transpose and $(\cdot)^{H}$ marks the Hermitian transpose. The Kronecker product of two matrices $\mathbf{A}$, $\mathbf{B}$ is denoted by $\mathbf{A} \otimes \mathbf{B}$. The element-wise expectation of a random vector or matrix $\mathbf{a}$ is denoted as $\mathbb{E}\{\mathbf{a}\}$ and its variance is written as $\mathrm{Var}\{\mathbf{a}\}$.

II. SIGNAL MODEL
We consider an uplink massive MIMO system, where $K$ users with single-antenna terminals $k \in \{1, \ldots, K\}$ send pilots and data to a BS deploying $A$ antenna elements, without neighboring cells. The notation we introduce for this system is largely based on [3], with the exception that only one cell is considered, as our focus lies mainly on the time variance of the bandlimited fading process and its impact on the SINR and SE.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Due to space constraints, we discuss the uplink (UL) only. However, all results on the SE also apply to the downlink (DL) through the UL-DL duality [9, Theorem 4.8].
The vector collecting the received symbols from all $K$ users at time index $m$ is given in (1), with $P$ the average transmit power of each user and $\mathbf{x}_m \in \mathbb{C}^{K \times 1}$ the vector collecting the transmit symbols of all users. The channel matrix in (2) contains the individual channel vectors of all users. Similarly, the beam-forming matrix $\mathbf{W}_m$ in (3) contains the individual beam-forming vectors $\mathbf{w}_{k,m}$ for user $k$ and all $A$ BS antenna elements. The spatial correlation matrix of the channel vector for user $k$ is defined in (4). Further, we assume local wide-sense stationarity of the fading process and a joint spatial and temporal covariance matrix as in (5), with $r_{h,\ell}$ being the temporal correlation coefficient of the channel vector with time delay $\ell$. Note that the scalar approach in (5) intrinsically assumes that the temporal correlation is the same for all antenna pairs $a, a'$, which is valid if the BS antenna elements are collocated.
The received symbol estimate from one single user $k$ at the BS is given in (6), where the first term is the scaled signal, the second term is filtered and scaled Gaussian noise with $\frac{1}{\sqrt{P}} z_{k,m} \sim \mathcal{CN}(0, \sigma_z^2 / P)$, and the third term is interference from other users $k' \neq k$.

A. Channel Estimation
For channel estimation, a single-cell approach similar to [3] and [9] is used to obtain noisy observations of the channel vector, as given in (7). They are filtered by the MMSE estimator [3] to obtain the channel vector estimate $\hat{\mathbf{h}}_{k,m}$ and the corresponding covariance matrix.

B. Multi-Step Channel Prediction
Given a set of auxiliary matrices defined to ease notation, among them (10), (11), and the covariance matrix $\boldsymbol{\Theta}_{k,N,\ell}$ in (13), we find the linear Wiener multi-step predictor by considering the framework provided in [3] and the joint spatial and temporal correlation defined in (5).
Theorem 1 (Wiener multi-step predictor): Given $N$ consecutive pilot symbols, the optimal linear Wiener predictor $\mathbf{V}_{k,\ell}$ with prediction horizon $\ell$ is given in (14), and the predicted channel vector $\bar{\mathbf{h}}_{k,m+1}$ is given in (15), with (16) stacking $N$ received pilot symbols delayed by $\ell$.
Proof: We follow the approach in [3], but extend it to a general temporal fading model and multi-step prediction. We assume that $N$ consecutive pilot symbols are used for prediction.
Suppose that the pilot symbols are outdated, e.g., only pilot symbols $\tilde{\mathbf{y}}_{k,m-\ell}$ with CSI acquisition delay $\ell > 0$ are available to predict $\bar{\mathbf{h}}_{k,m+1}$. The optimal predictor [3] extended to multi-step prediction is then found by solving (17) or, after reformulating, (18). By using the channel estimation signal model in (7), the joint spatial and temporal correlation (5), and assuming that uncorrelated noise terms cancel for $\ell > 0$, we find the covariance matrix of two received pilot symbols with CSI acquisition delay $\ell$ to be (19). The covariance matrix of the stacked pilot symbols is then calculated with (16) and (19) as (20). The first line of (20) holds since wide-sense stationarity is assumed, i.e., the statistics are independent of an arbitrary time shift $\ell$ for the considered frame duration and prediction horizon [10]. With (7), (5), and (16), the cross-correlation between the true channel and the stacked pilot symbols is (21). Substituting (20) and (21) into (18) gives (14). The predicted channel vector is thus (22), and its distribution $\bar{\mathbf{h}}_{k,m+1} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Theta}_{k,N,\ell+1})$ depends on the prediction horizon $\ell$ [3]. □
Note that in the case of $N = 1$, i.e., only one pilot symbol is used, (10) and (11) each reduce to the scalar temporal correlation coefficient and the predicted channel becomes $\bar{\mathbf{h}}_{k,m+1} = \mathbf{V}_{k,\ell+1} \tilde{\mathbf{y}}_{k,m-\ell} = r_{h,\ell+1} \hat{\mathbf{h}}_{k,m-\ell}$. Simply put, with only one pilot at hand, the evolution of the channel cannot be tracked and the best prediction is the aged channel estimate weighted by the temporal correlation coefficient.
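The structure of the multi-step predictor can be illustrated with a minimal sketch for a single antenna element with unit channel variance. This is not the matrix-valued predictor of Theorem 1: the scalar setting, the function names, the pilot spacing of one symbol, and the default parameters (Clarke's correlation with normalized Doppler 0.04, noise variance corresponding to 12 dB SNR) are assumptions chosen for illustration.

```python
import math


def bessel_j0(x, steps=2000):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps)) * h / math.pi


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def wiener_predictor(num_pilots, horizon, fd_ts=0.04, noise_var=10 ** (-12 / 10)):
    """Scalar Wiener weights and MSE for predicting h[m+1] from noisy pilots
    observed at times m - horizon - n, n = 0, ..., num_pilots - 1 (assumed spacing)."""
    def r(lag):
        # Clarke's temporal correlation (assumed model for this sketch)
        return bessel_j0(2 * math.pi * fd_ts * abs(lag))

    # Covariance of the stacked noisy pilots: channel correlation plus white noise.
    cov = [[r(i - j) + (noise_var if i == j else 0.0) for j in range(num_pilots)]
           for i in range(num_pilots)]
    # Cross-correlation between the channel at m + 1 and each pilot.
    cross = [r(horizon + 1 + n) for n in range(num_pilots)]
    weights = solve(cov, cross)
    mse = 1.0 - sum(w * c for w, c in zip(weights, cross))
    return weights, mse
```

For `num_pilots=1` the single weight is the temporal correlation coefficient divided by one plus the noise variance, i.e., the scaled aged estimate discussed above. Calling `wiener_predictor(4, 7)` returns a lower mean square error than `wiener_predictor(1, 7)`, since additional pilots allow the predictor to track the evolution of the channel.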

III. SPECTRAL EFFICIENCY AND CHANNEL HARDENING
By introducing the predicted channel $\bar{\mathbf{h}}_{k,m+1}$ in (6), we obtain the estimated received symbol from one single user $k$ at the BS after receive combining as (23). The first term is the intended signal multiplied with the effective channel gain $\gamma_{k,m+1}$ in (24), the second term is additive noise, the third term is the channel prediction error considered as noise, and the fourth term is interference from other users $k' \neq k$. The choice of the beam-forming vectors in $\mathbf{W}_m$ greatly affects the capability of a massive MIMO system to suppress the interference from other users. In this work, we use MRC and RZF as common choices of beam-forming vectors [1], [4], as given in (25), where the columns of the predicted channel matrix $\bar{\mathbf{H}}_m$ are composed of the predicted channel vectors, similar to (2).

A. Instantaneous and Asymptotic SINR
Considering the signal model (23) and assuming channel prediction with prediction horizon $\ell$ at the BS, the instantaneous signal power is given as the squared absolute value of the effective channel gain $\gamma_{k,m+1}$ and yields (26). The instantaneous interference and noise power is defined as the sum of all interference and noise contributions in (23), as given in (27). The instantaneous SINR $\eta_{k,\ell}$ is defined as the ratio of instantaneous signal power to interference and noise power, as given in (28).
Theorem 2: The asymptotic deterministic SINR $\bar{\eta}_{k,\ell} = \lim_{A \to \infty} \eta_{k,\ell}$ with MRC is calculated as in (29).
Proof: Since MRC is assumed, the beam-forming vector is computed as $\mathbf{w}_{k,m+1} = \bar{\mathbf{h}}_{k,m+1}$. For $A \to \infty$, the instantaneous signal power (26) converges towards its asymptotic deterministic equivalent $\bar{S}_{k,\ell}$ [3], given in (30), with the covariance matrix $\boldsymbol{\Theta}_{k,N,\ell+1}$ defined in (13). Similarly, the instantaneous interference and noise power (27) converges towards its asymptotic deterministic equivalent $\bar{I}_{k,\ell}$ [3], given in (31), as the number of BS antenna elements grows.
The asymptotic deterministic SINR is therefore found as the ratio $\bar{S}_{k,\ell} / \bar{I}_{k,\ell}$, as specified in (29). □
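To make the structure of the instantaneous SINR concrete, the following sketch computes it for MRC under a deliberately simplified model. It assumes perfect channel knowledge, so the prediction error term of (23) is omitted, and the noise power is expressed relative to the transmit power. The function names and the parameter `noise_over_power` are hypothetical and do not appear in the derivation above.

```python
def inner(a, b):
    """Hermitian inner product a^H b for two equally long lists of numbers."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def mrc_sinr(channels, user, noise_over_power):
    """Instantaneous SINR of one user with MRC (simplified: no prediction error).

    channels: list of per-user channel vectors, each a list of complex numbers.
    noise_over_power: noise variance divided by transmit power (assumed scaling).
    """
    w = channels[user]  # MRC: the beam-forming vector equals the channel vector
    signal = abs(inner(w, channels[user])) ** 2
    noise = noise_over_power * inner(w, w).real
    interference = sum(abs(inner(w, h)) ** 2
                       for k, h in enumerate(channels) if k != user)
    return signal / (noise + interference)
```

With orthogonal user channels the interference term vanishes and the SINR reduces to the squared channel norm divided by `noise_over_power`. Any overlap between user channels lowers the SINR, which is the effect that RZF is meant to mitigate.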

B. Ergodic and Asymptotic Spectral Efficiency
Similar to [3], we define the uplink ergodic achievable SE of user $k$ with prediction horizon $\ell$ as
$R_{k,\ell} = \mathbb{E}\left\{ \log_2 \left( 1 + \eta_{k,\ell} \right) \right\}$, (32)
where the expectation is over channel realizations. In the asymptotic case with the number of BS antenna elements $A \to \infty$, the SINR is deterministic and the uplink asymptotic achievable SE of user $k$ with prediction horizon $\ell$ is [3]
$\bar{R}_{k,\ell} = \log_2 \left( 1 + \bar{\eta}_{k,\ell} \right)$. (33)
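The relation between SINR and SE can be illustrated with a short sketch. The function names are hypothetical, and the ergodic SE is approximated by a plain sample mean over given SINR realizations, which is an assumption of this example rather than the simulation setup of the letter.

```python
import math


def spectral_efficiency(sinr):
    """SE in bit/s/Hz for a deterministic (linear-scale) SINR."""
    return math.log2(1.0 + sinr)


def ergodic_se(sinr_samples):
    """Sample-mean estimate of the ergodic SE over SINR realizations."""
    return sum(spectral_efficiency(s) for s in sinr_samples) / len(sinr_samples)


def sinr_for_se(se):
    """Linear-scale SINR required to reach a given SE (inverse of the log relation)."""
    return 2.0 ** se - 1.0


def sinr_gain_db(se_low, se_high):
    """SINR increase in dB needed to move from se_low to se_high."""
    return 10.0 * math.log10(sinr_for_se(se_high) / sinr_for_se(se_low))
```

Because the logarithm is concave, the sample-mean SE never exceeds the SE evaluated at the mean SINR. As a usage example, `sinr_gain_db(2.2, 4.3)` evaluates to roughly 7.2 dB, which indicates how much SINR improvement a doubling of the SE in this range corresponds to.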

C. Channel Hardening
Channel hardening refers to the phenomenon of the effective channel gain $\gamma_{k,m+1} = \mathbf{w}_{k,m+1}^{H} \bar{\mathbf{h}}_{k,m+1}$ in (24) fluctuating only marginally around its mean [11], as expressed in (34). To assess whether channel hardening occurs, we investigate the ratio of the effective gain's variance to its squared mean, denoted $\beta_{k,\ell}$ in (35), similar to [11], [12], and [13]. Low values of $\beta_{k,\ell}$ indicate a low probability of the channel gain $\gamma_{k,m+1}$ deviating significantly from its mean.
Theorem 3 (Channel hardening): Let $N = 1$ and $\mathbf{w}_{k,m+1} = r_{h,\ell+1} \hat{\mathbf{h}}_{k,m-\ell}$ (i.e., MRC is applied); then the channel hardening metric $\beta_{k,\ell}$ in (35) is independent of the prediction horizon $\ell$.
Proof: We note that, for $N = 1$, the predicted channel vector $\bar{\mathbf{h}}_{k,m+1}$ is the outdated channel estimate scaled by its temporal correlation coefficient, i.e., $\bar{\mathbf{h}}_{k,m+1} = r_{h,\ell+1} \hat{\mathbf{h}}_{k,m-\ell}$. Further, by applying MRC, the beam-forming vector becomes $\mathbf{w}_{k,m+1} = r_{h,\ell+1} \hat{\mathbf{h}}_{k,m-\ell}$.
We then calculate the mean of the effective channel gain in (36), where the third line follows from (8), and its variance in (37). We plug (36) and (37) into (35) to get (38), which is independent of the prediction horizon $\ell$. □
While $\beta_{k,\ell}$ is time-dependent in general when employing RZF, we show empirically in the next section that its variation is negligible compared to the impact of the number of BS antenna elements on channel hardening.

Fig. 1. Ergodic and asymptotic achievable SE $R_{k,\ell}$ over prediction horizon $\ell$ for a normalized Doppler $f_D T_s = 0.04$, a signal-to-noise ratio (SNR) $P/\sigma_z^2 = 12$ dB, MRC beam-forming, and $N \in \{1, 2, 4, 7\}$ pilots. The distance in terms of the wavelength $\lambda$ corresponding to a given prediction horizon is given as a second x-axis.

IV. RESULTS
We use a Monte-Carlo simulation to verify the uplink ergodic and asymptotic SE and channel hardening results under the influence of aged CSI and multi-step Wiener prediction. The expectation and variance in (32) and (35) are numerically estimated as the mean over random channel realizations. The random variable realizations are drawn according to the distributions specified in Sec. II. The number of runs for the simulation is 20000.
Channel vector realizations with proper statistics in (6) are calculated using the approach outlined in [14] and [15] with 400 paths and one tap. We consider Clarke's model $r_{h,\ell} = J_0(2\pi f_D T_s \ell)$ for the temporal correlation coefficient, where $J_0$ denotes the zeroth-order Bessel function of the first kind and $f_D T_s$ denotes the maximum Doppler shift $f_D$ normalized by the symbol duration $T_s$. This is more suitable for modeling a band-limited fading process than the unbounded Doppler spectrum of the AR(1) model used in [3]. It also provides a direct relation between correlation in time and maximum Doppler shift. In the simulations, we set the normalized Doppler shift to $f_D T_s = 0.04$, which corresponds to a velocity of 172 km/h at a carrier frequency of 3.5 GHz (wavelength $\lambda = 8.6$ cm) and a symbol duration $T_s = 71.4$ µs. The number of BS antenna elements is $A = 64$ unless stated otherwise.
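The simulation parameters above can be cross-checked with a few lines of arithmetic. The sketch below uses the relation between maximum Doppler shift, velocity, and wavelength; the function names are hypothetical and the speed of light is taken as the exact SI value.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength_m(carrier_hz):
    """Free-space wavelength for a given carrier frequency."""
    return SPEED_OF_LIGHT / carrier_hz


def velocity_kmh(fd_ts, symbol_duration_s, carrier_hz):
    """Velocity implied by a normalized Doppler f_D * T_s, using v = f_D * lambda."""
    doppler_hz = fd_ts / symbol_duration_s
    return doppler_hz * wavelength_m(carrier_hz) * 3.6


def distance_in_wavelengths(fd_ts, horizon):
    """Distance travelled within `horizon` symbols, in units of the wavelength.

    v * T_s * horizon / lambda simplifies to f_D * T_s * horizon.
    """
    return fd_ts * horizon
```

With a normalized Doppler of 0.04, a symbol duration of 71.4 µs, and a carrier frequency of 3.5 GHz, `velocity_kmh` gives a value between 172 and 173 km/h, and `distance_in_wavelengths(0.04, 7)` gives 0.28, i.e., a prediction horizon of seven symbols corresponds to a movement of 0.28 wavelengths.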
We note an excellent match between the SE in (32), simulated with $A = 64$ BS antenna elements, and the asymptotic SE in (33) for MRC beam-forming, shown in Fig. 1. Channel prediction with a higher number of pilots $N > 1$ increases the SE considerably and prevents sudden drops (caused by the temporal correlation function $r_{h,\ell}$ approaching zero). For a prediction horizon $\ell = 7$ (equivalent to a movement of the user by $0.28\,\lambda$ since CSI acquisition), utilizing four instead of one pilot symbol doubles the achievable SE from 2.2 bit/s/Hz to 4.3 bit/s/Hz. Figure 2 shows significant achievable SE improvements in the most relevant region $\ell \leq 10$ when choosing RZF over MRC. For a prediction horizon $\ell = 7$, utilizing four instead of one pilot symbol increases the achievable SE from 2.5 bit/s/Hz to 5.2 bit/s/Hz. We also note that for $\ell \geq 5$, the SE with an SNR of 6 dB and $N = 4$ is superior to the SE with an SNR of 12 dB and $N = 1$, highlighting the importance of channel prediction.

Fig. 2. SE $R_{k,\ell}$ over prediction horizon $\ell$ for a normalized Doppler $f_D T_s = 0.04$, RZF beam-forming, and $N \in \{1, 2, 4, 7\}$ pilots. The distance in terms of the wavelength $\lambda$ corresponding to a given prediction horizon is given as a second x-axis.

Fig. 3. Boxplot of the $\beta_{k,\ell}$ distribution over prediction horizon $\ell \in \{0, \ldots, 30\}$ for different numbers of BS antennas, $A \in [8, 128]$. The normalized Doppler is $f_D T_s = 0.04$, the SNR is $P/\sigma_z^2 = 12$ dB, the number of pilots is $N = 4$, and the beam-forming method is RZF. The small variance of $\beta_{k,\ell}$ demonstrates that channel hardening is basically independent of the channel age.

Figure 3 shows a boxplot of the distribution of $\beta_{k,\ell}$ over the prediction horizon $\ell$ for different numbers of BS antenna elements $A \in [8, 128]$. We verify that channel hardening is indeed largely independent of the prediction horizon $\ell$ and is mostly determined by the number of BS antenna elements.
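The independence of channel hardening from the prediction horizon for $N = 1$ and MRC can be reproduced with a small Monte-Carlo sketch. It deviates from the simulation setup described above in several assumed simplifications: the channel estimate is drawn as an i.i.d. complex Gaussian vector without spatial correlation, the temporal correlation coefficient enters only as the real factor `scale`, and the function name and defaults are hypothetical.

```python
import math
import random


def hardening_metric(num_antennas, scale, runs=2000, seed=1):
    """Estimate Var{gamma} / E{gamma}^2 for gamma = w^H h with w = h = scale * h_hat.

    h_hat is drawn as CN(0, I) with num_antennas entries (assumed i.i.d. model);
    scale plays the role of the temporal correlation coefficient.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # standard deviation per real and imaginary part
    gains = []
    for _ in range(runs):
        # Squared norm of h_hat: sum over real and imaginary parts of all entries.
        norm_sq = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(2 * num_antennas))
        gains.append(scale ** 2 * norm_sq)
    mean = sum(gains) / runs
    var = sum((g - mean) ** 2 for g in gains) / runs
    return var / mean ** 2
```

Since `scale` multiplies every realization of the gain by the same factor, it cancels in the ratio of variance to squared mean, so the metric is unchanged for any nonzero `scale`. Under the assumed i.i.d. model the metric is close to the inverse of the number of antennas, which mirrors the observation that channel hardening is mostly determined by the number of BS antenna elements.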

V. CONCLUSION
In this letter, we addressed the SE of time-variant massive MIMO using Wiener prediction. We derived asymptotic expressions of the achievable SE in an uplink massive MIMO system for an arbitrary prediction horizon $\ell \geq 1$, assuming a general temporal covariance matrix. Our results are also directly applicable to the downlink, due to UL-DL duality.
Given a prediction horizon of 0.3 λ, utilizing four instead of one pilot symbol is shown to double the SE, both for MRC and RZF. Utilizing more than four pilots shows no clear advantages.
Further, channel hardening, i.e., the capability of a massive MIMO system to eliminate small-scale fading and create quasi-deterministic effective channel gains, is shown to be (largely) independent of channel aging.
Our work allows assessing the effects of channel aging due to mobility, and of channel prediction as a mitigation strategy, for arbitrary time delays.