Joint Channel Estimation and Equalization in Massive MIMO Using a Single Pilot Subcarrier

The focus of this letter is on the reduction of the large pilot overhead in orthogonal frequency division multiplexing (OFDM) based massive multiple-input multiple-output (MIMO) systems. We propose a novel joint channel estimation and equalization technique that requires only one pilot subcarrier, reducing the pilot overhead by orders of magnitude. We take advantage of the coherence bandwidth spanning over multiple subcarrier bands. This allows for a band of subcarriers to be equalized with the channel frequency response (CFR) at a single subcarrier. Subsequently, the detected data symbols are considered as virtual pilots, and their CFRs are updated without additional pilot overhead. Thereafter, the remaining channel estimation and equalization can be performed in a sliding manner. With this approach, we use multiple channel estimates to equalize the data at each subcarrier. This allows us to take advantage of frequency diversity and improve the detection performance. Finally, we corroborate the above claims through extensive numerical analysis, showing the superior performance of our proposed technique compared to conventional methods.


I. INTRODUCTION
Massive multiple-input multiple-output (MIMO) has brought substantial improvements to 5G systems and continues to be a dominant technology in next-generation networks. In particular, the spatial diversity and beamforming gains provided by massive MIMO have led to significant improvements in capacity, spectral efficiency, and energy efficiency compared to 4G networks [1]. In massive MIMO, multiple users reuse the same time-frequency resources, as their signals can be distinguished based on their channel responses. Therefore, obtaining accurate channel state information (CSI) at each base station (BS) antenna is crucial. Conventional channel estimation methods, e.g., least squares (LS), rely on the transmission of pilot sequences from each user. These methods are favorable due to their low complexity. However, this comes at the expense of spectral efficiency (SE) loss.
Hence, extensive research has been conducted to reduce the pilot overhead; see [2]-[4] and the references therein. For instance, (semi-)blind techniques focus on reducing the pilot length and estimate the channel by solving underdetermined systems of equations using iterative algorithms [2]. Moreover, superimposed pilots share the same time-frequency resources for both data and pilot transmission. In these techniques, the channel is estimated by treating data as noise. This can completely remove the pilot overhead, however, at the cost of interference and extra computational load [3], [4]. Most works on massive MIMO consider a narrowband channel. Therefore, utilizing these channel estimation techniques for orthogonal frequency division multiplexing (OFDM) would require pilots on each subcarrier. Meanwhile, it is shown in [5] that the number of pilot subcarriers needs to be at least equal to the channel length. However, this still leads to a large overhead, as the channel length varies from 7% to 25% of the OFDM symbol duration [6]. To alleviate this issue, the authors in [5] proposed a technique that can reduce the required pilot subcarriers by 80%; however, the technique is limited to sparse channels.
Danilo Lelin Li and Arman Farhang are with the Department of Electronic and Electrical Engineering, Trinity College Dublin, Dublin 2, Ireland (Email: {lelinlid,arman.farhang}@tcd.ie).
To address the aforementioned limitations of the existing literature, in this letter, we show that only one reference subcarrier is sufficient for both channel estimation and data detection in massive MIMO. It is worth noting that our proposed technique is independent of channel sparsity and length. We take advantage of multiple adjacent subcarriers being within the channel coherence bandwidth. Thus, the data transmitted over a band of subcarriers can be detected using only one reference subcarrier's channel estimate. Thereafter, the detected data symbols are considered as virtual pilots to update their corresponding channel estimates. Each updated channel estimate is then used to equalize a new band of subcarriers within the coherence bandwidth. This way, we can obtain multiple estimates of the transmitted data symbols at each subcarrier and average them. With this approach, we take advantage of frequency diversity in addition to spatial diversity to achieve improved performance. By repeating this process in a sliding manner, all the data symbols and the CFRs can be estimated, starting from one reference pilot.
Furthermore, we prove that in the large antenna regime, linear combining can be performed using the channel estimate at any subcarrier. The combined signal at a subcarrier will be scaled by a coefficient, which depends on the channel power delay profile (PDP) and the frequency spacing between the subcarrier and the CFR used for combining. However, this coefficient can take very small values, leading to noise enhancement. To avoid this issue, we define a depth factor in our proposed sliding technique that determines the number of subcarriers within each band to be equalized using the same CFR. This ensures that only large values of the aforementioned coefficient are considered. To evaluate the effectiveness of our proposed technique, we compare its bit error rate (BER) and signal-to-interference-plus-noise ratio (SINR) performance with the conventional techniques. For 15 MHz transmission bandwidth, the existing techniques require over 70 pilot subcarriers, whereas our proposed technique reduces this overhead to only one pilot subcarrier. Moreover, our proposed technique outperforms the linear equalizers by around 2 dB, at high signal-to-noise ratios (SNRs), in terms of the BER performance.
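Notations: Matrices, vectors, and scalar quantities are denoted by boldface uppercase, boldface lowercase, and normal letters, respectively. The elements of the normalized discrete Fourier transform (DFT) matrix $\mathbf{F}_M$ are indexed by $m, n = 0, \ldots, M-1$, and $\mathbf{f}_{M,m}$ is the $m$-th column of $\mathbf{F}_M$.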

II. SYSTEM MODEL
We consider the uplink (UL) of a single-cell massive MIMO system operating in the time-division duplex (TDD) mode. OFDM with $M$ subcarriers is deployed as the modulation format, with a cyclic prefix (CP) of length $M_{\mathrm{CP}}$. $M_{\mathrm{CP}}$ is chosen to be larger than the channel length to avoid intersymbol interference. The BS is equipped with $Q$ antennas and serves $K$ single-antenna users. The duration of each frame is assumed to be within the channel coherence time, and each frame includes $N$ OFDM symbols in the UL. Hence, the channel remains time-invariant within each frame.
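The UL transmission is divided into two phases, training/pilot and data transmission. $N_p$ and $N_d$ time slots are allocated to pilot and data transmission, respectively. Thus, a given user $k$ transmits the time-frequency symbols
$$\mathbf{X}_k = [\mathbf{P}_k, \mathbf{D}_k],$$
where $\mathbf{P}_k \in \mathbb{C}^{M \times N_p}$ represents the pilot and $\mathbf{D}_k \in \mathbb{C}^{M \times N_d}$ the data symbols. Hence, the OFDM transmit signal of user $k$ is obtained as $\mathbf{A}_{\mathrm{CP}} \mathbf{F}_M^{\mathrm{H}} \mathbf{X}_k$, where $\mathbf{A}_{\mathrm{CP}} = [\mathbf{G}_{\mathrm{CP}}^{\mathrm{T}}, \mathbf{I}_M]^{\mathrm{T}}$ is the CP addition matrix and the $M_{\mathrm{CP}} \times M$ matrix $\mathbf{G}_{\mathrm{CP}}$ includes the last $M_{\mathrm{CP}}$ rows of $\mathbf{I}_M$.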
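The signal is transmitted through the channel and undergoes OFDM demodulation. Thus, the received signal from all the users at a given BS antenna $q$ is obtained as
$$\mathbf{Y}_q = \sum_{k=0}^{K-1} \mathbf{F}_M \mathbf{R}_{\mathrm{CP}} \mathbf{H}_{q,k} \mathbf{A}_{\mathrm{CP}} \mathbf{F}_M^{\mathrm{H}} \mathbf{X}_k + \mathbf{W}_q,$$
where $\mathbf{R}_{\mathrm{CP}} = [\mathbf{0}_{M \times M_{\mathrm{CP}}}, \mathbf{I}_M]$ is the CP removal matrix, $\mathbf{H}_{q,k}$ denotes the Toeplitz channel matrix realizing the linear convolution, which is formed by the channel impulse response (CIR) between user $k$ and antenna $q$, i.e., $\mathbf{h}_{q,k} = [h_{q,k}[0], \ldots, h_{q,k}[L-1]]^{\mathrm{T}}$, and $\mathbf{W}_q$ includes the complex additive white Gaussian noise (AWGN) in the frequency domain, with the variance $\sigma_w^2$, i.e., $[\mathbf{W}_q]_{m,n} \sim \mathcal{CN}(0, \sigma_w^2)$. We assume the CIRs between the users and the BS antennas to be independent and identically distributed (i.i.d.) complex random vectors of length $L$, $\mathbf{h}_{q,k} \sim \mathcal{CN}(\mathbf{0}_{L \times 1}, \mathbf{\Sigma}_k)$ for $q = 0, \ldots, Q-1$ and $k = 0, \ldots, K-1$. $\mathbf{\Sigma}_k$ is a diagonal matrix with the diagonal elements formed by the PDP of the channel.

The matrix $\mathbf{R}_{\mathrm{CP}} \mathbf{H}_{q,k} \mathbf{A}_{\mathrm{CP}}$ is a circulant matrix, with the first column formed by the zero-padded CIR, i.e., $\bar{\mathbf{h}}_{q,k} = [\mathbf{h}_{q,k}^{\mathrm{T}}, \mathbf{0}_{(M-L) \times 1}^{\mathrm{T}}]^{\mathrm{T}}$. Hence, it can be diagonalized by the DFT and inverse DFT matrices as $\mathbf{F}_M \mathbf{R}_{\mathrm{CP}} \mathbf{H}_{q,k} \mathbf{A}_{\mathrm{CP}} \mathbf{F}_M^{\mathrm{H}} = \mathrm{diag}(\boldsymbol{\lambda}_{q,k})$, where $\boldsymbol{\lambda}_{q,k} = [\lambda_{q,k}[0], \ldots, \lambda_{q,k}[M-1]]^{\mathrm{T}}$ represents the CFR with the elements obtained as
$$\lambda_{q,k}[m] = \sqrt{M}\, \mathbf{f}_{M,m}^{\mathrm{T}} \bar{\mathbf{h}}_{q,k} = \sum_{\ell=0}^{L-1} h_{q,k}[\ell]\, e^{-j 2\pi \ell m / M}. \tag{2}$$

To pave the way toward the derivations in the following sections, we rearrange the received signals into the space-time representation. By stacking the received samples at a given subcarrier $m$, i.e., the $m$-th row of $\mathbf{Y}_q$, from all the $Q$ receive antennas, we obtain the $Q \times N$ space-time matrix $\mathbf{Y}_m$. Thus, the input-output relationship for a given subcarrier $m$ across all the antennas can be represented as
$$\mathbf{Y}_m = \mathbf{\Lambda}_m \mathbf{X}_m + \mathbf{W}_m, \tag{3}$$
where the elements of the matrices are $[\mathbf{\Lambda}_m]_{q,k} = \lambda_{q,k}[m]$, $[\mathbf{X}_m]_{k,n} = [\mathbf{X}_k]_{m,n}$, and $[\mathbf{W}_m]_{q,n} = [\mathbf{W}_q]_{m,n}$. Accordingly, $\mathbf{X}_m = [\mathbf{P}_m, \mathbf{D}_m]$, where $\mathbf{P}_m$ and $\mathbf{D}_m$ represent the transmitted pilot and data symbols, respectively.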
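As a small numerical illustration of the per-subcarrier model, the following sketch checks that CP-OFDM over a multipath channel reduces to a scalar multiplication by the CFR on every subcarrier. It is only a sketch under stated assumptions: a single user, a single antenna, no noise, and a toy CIR; the helper names (`cfr_from_cir`, `circular_channel`, `dft`, `idft`) are hypothetical and not taken from the letter.

```python
import cmath

def cfr_from_cir(h, M):
    """CFR samples: lambda[m] = sum_l h[l] * exp(-j*2*pi*l*m/M)."""
    return [sum(h_l * cmath.exp(-2j * cmath.pi * l * m / M) for l, h_l in enumerate(h))
            for m in range(M)]

def circular_channel(h, x):
    """Net effect of CP addition, linear convolution and CP removal: circular convolution."""
    M = len(x)
    return [sum(h[l] * x[(n - l) % M] for l in range(len(h))) for n in range(M)]

def dft(x):
    """Normalized M-point DFT."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * m * n / M) for n in range(M)) / M ** 0.5
            for m in range(M)]

def idft(X):
    """Normalized M-point inverse DFT (OFDM modulation)."""
    M = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * n / M) for m in range(M)) / M ** 0.5
            for n in range(M)]

M = 8
h = [0.8, 0.5 - 0.2j, 0.1j]                   # toy CIR (assumption), L = 3 taps
symbols = [1, -1, 1j, -1j, 1, 1, -1, 1j]      # one OFDM symbol of QPSK-like data
lam = cfr_from_cir(h, M)
received = dft(circular_channel(h, idft(symbols)))

# Largest deviation from the flat per-subcarrier model Y[m] = lambda[m] * X[m]
max_err = max(abs(received[m] - lam[m] * symbols[m]) for m in range(M))
```

Here `max_err` stays at floating-point rounding level, which is what the diagonalization by the DFT and inverse DFT matrices predicts.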

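III. CONVENTIONAL CHANNEL ESTIMATION AND EQUALIZATION
In this section, the widely used linear pilot-based channel estimation and equalization techniques are described. In particular, we consider LS-based channel estimation and two linear combining techniques, namely maximum ratio combining (MRC) and minimum mean square error (MMSE) equalization. At the channel sounding stage, each user transmits a pilot sequence of length $N_p \geq K$ over different time slots on a given subcarrier allocated to pilot symbols [1]. The pilot sequence of length $N_p$ for a given user $k$ at a given subcarrier $m$, $\mathbf{p}_m^k$, is chosen from a pilot book $\mathbf{P}_m = [\mathbf{p}_m^0, \ldots, \mathbf{p}_m^{K-1}]^{\mathrm{T}}$ that satisfies the orthogonality property $\mathbf{P}_m \mathbf{P}_m^{\mathrm{H}} = N_p \mathbf{I}_K$ [7]. We consider a pilot book where each $\mathbf{p}_m^k$ is obtained from $k$ cyclic shifts of the Zadoff-Chu (ZC) root sequence of length $N_p$. Using (3), the received pilots from all the users at subcarrier $m$ can be represented as
$$\mathbf{Y}_m^{\mathrm{p}} = \mathbf{\Lambda}_m \mathbf{P}_m + \mathbf{W}_m^{\mathrm{p}},$$
where $\mathbf{Y}_m^{\mathrm{p}}$ and $\mathbf{W}_m^{\mathrm{p}}$ denote the $N_p$ columns of $\mathbf{Y}_m$ and $\mathbf{W}_m$ that correspond to the pilot slots. Consequently, the channel response for all the users at a given subcarrier $m$ can be estimated as
$$\hat{\mathbf{\Lambda}}_m = \frac{1}{N_p} \mathbf{Y}_m^{\mathrm{p}} \mathbf{P}_m^{\mathrm{H}} = \mathbf{\Lambda}_m + \tilde{\mathbf{W}}_m, \tag{4}$$
where $\tilde{\mathbf{W}}_m = \frac{1}{N_p} \mathbf{W}_m^{\mathrm{p}} \mathbf{P}_m^{\mathrm{H}}$ is the channel estimation noise.

To estimate the whole CFR, it is sufficient to allocate $L$ subcarriers as pilots [5]. Let $\mathcal{I}$ represent the set of subcarrier indices allocated to the pilots, and $\boldsymbol{\lambda}_{q,k}^{\mathcal{I}}$ the vector formed by selecting the $L$ entries of $\boldsymbol{\lambda}_{q,k}$ at the pilot subcarriers. Each element of $\boldsymbol{\lambda}_{q,k}^{\mathcal{I}}$ can be estimated from (4) as $\hat{\lambda}_{q,k}[m] = [\hat{\mathbf{\Lambda}}_m]_{q,k}$. From (2), the following relation between the CFR and the CIR holds:
$$\boldsymbol{\lambda}_{q,k}^{\mathcal{I}} = \sqrt{M}\, \mathbf{F}_M^{\mathcal{I}} \mathbf{h}_{q,k}, \tag{5}$$
where $\mathbf{F}_M^{\mathcal{I}}$ is an $L \times L$ matrix formed by selecting the first $L$ columns of $\mathbf{F}_M$ with the rows indexed by $\mathcal{I}$. Hence, the CIR of user $k$ at a given BS antenna $q$ can be estimated by solving (5) for $\mathbf{h}_{q,k}$, i.e., $\hat{\mathbf{h}}_{q,k} = \frac{1}{\sqrt{M}} (\mathbf{F}_M^{\mathcal{I}})^{-1} \hat{\boldsymbol{\lambda}}_{q,k}^{\mathcal{I}}$. Finally, after obtaining the CIR, $\hat{\mathbf{\Lambda}}_m$ at each subcarrier can be reconstructed with (2). Therefore, the channel estimation process, (4) and (5), requires a minimum of $LK$ pilots.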
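The pilot book and the LS estimate described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes an odd pilot length $N_p = 7$, ZC root index $u = 1$, $K = 3$ users, $Q = 2$ antennas, and a noise-free received pilot block, and the function names are hypothetical.

```python
import cmath

def zadoff_chu(u, n_zc):
    """Root-u Zadoff-Chu sequence of odd length n_zc (u coprime with n_zc)."""
    return [cmath.exp(-1j * cmath.pi * u * n * (n + 1) / n_zc) for n in range(n_zc)]

def pilot_book(K, n_p, u=1):
    """K x n_p pilot book: row k is the root sequence cyclically shifted by k samples."""
    root = zadoff_chu(u, n_p)
    return [[root[(n + k) % n_p] for n in range(n_p)] for k in range(K)]

def ls_estimate(Y_p, P):
    """LS channel estimate (1/N_p) * Y_p * P^H for a Q x N_p received pilot block."""
    n_p = len(P[0])
    return [[sum(row[n] * P[k][n].conjugate() for n in range(n_p)) / n_p
             for k in range(len(P))] for row in Y_p]

K, N_P, Q = 3, 7, 2
P = pilot_book(K, N_P)

# Gram matrix P * P^H, which should equal N_p * I_K for an orthogonal pilot book
gram = [[sum(P[a][n] * P[b][n].conjugate() for n in range(N_P)) for b in range(K)]
        for a in range(K)]

# Toy channel matrix at one subcarrier (Q x K) and the noise-free received pilots
Lam = [[1 + 2j, -0.5j, 0.3], [0.2 - 1j, 1.5, -1 + 1j]]
Y_p = [[sum(Lam[q][k] * P[k][n] for k in range(K)) for n in range(N_P)] for q in range(Q)]
Lam_hat = ls_estimate(Y_p, P)
```

Without noise, `Lam_hat` reproduces `Lam` up to rounding, since the cyclic shifts of a ZC sequence are mutually orthogonal; with noise, each entry is perturbed by the estimation noise term in (4).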
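Using the channel estimates $\hat{\mathbf{\Lambda}}_m$ and deploying a linear combining technique such as MRC, the transmitted data symbols of all the users can be estimated as
$$\hat{\mathbf{X}}_m^{\mathrm{MRC}} = \mathbf{\Gamma}_m^{-1} \hat{\mathbf{\Lambda}}_m^{\mathrm{H}} \mathbf{Y}_m, \tag{6}$$
where $\mathbf{\Gamma}_m$ is a $K \times K$ diagonal matrix considered solely to normalize the amplitude of the equalizer output. In the literature, the effect of channel estimation noise is often neglected and the diagonal elements of $\mathbf{\Gamma}_m$ are formed from the diagonal elements of $\mathbf{\Lambda}_m^{\mathrm{H}} \mathbf{\Lambda}_m$, i.e., the squared norm of the channel vector of each user at a given subcarrier $m$. However, in the presence of channel estimation errors, using (4), the normalization factors are obtained as the diagonal elements of
$$\hat{\mathbf{\Lambda}}_m^{\mathrm{H}} \hat{\mathbf{\Lambda}}_m = \mathbf{\Lambda}_m^{\mathrm{H}} \mathbf{\Lambda}_m + \mathbf{\Lambda}_m^{\mathrm{H}} \tilde{\mathbf{W}}_m + \tilde{\mathbf{W}}_m^{\mathrm{H}} \mathbf{\Lambda}_m + \tilde{\mathbf{W}}_m^{\mathrm{H}} \tilde{\mathbf{W}}_m, \tag{7}$$
where $\tilde{\mathbf{W}}_m = \hat{\mathbf{\Lambda}}_m - \mathbf{\Lambda}_m$ denotes the channel estimation noise in (4). The term $\frac{1}{Q} \mathbf{\Lambda}_m^{\mathrm{H}} \tilde{\mathbf{W}}_m$ tends to zero in the asymptotic regime, as $Q \to \infty$, since the channel gain and noise are independent. However, the same is not true for $\tilde{\mathbf{W}}_m^{\mathrm{H}} \tilde{\mathbf{W}}_m$. Since $\tilde{\mathbf{W}}_m$ represents the AWGN, in the asymptotic regime, we have
$$\frac{1}{Q} \tilde{\mathbf{W}}_m^{\mathrm{H}} \tilde{\mathbf{W}}_m \to \frac{\sigma_w^2}{N_p} \mathbf{I}_K. \tag{8}$$
Hence, taking a similar approach to [8], the channel estimation noise can be mitigated by subtracting the noise term in (8) from the normalization factor, i.e., $\mathbf{\Gamma}_m$ is formed from the diagonal elements of $\hat{\mathbf{\Lambda}}_m^{\mathrm{H}} \hat{\mathbf{\Lambda}}_m - \frac{Q \sigma_w^2}{N_p} \mathbf{I}_K$. Due to the channel hardening effect [1], in the asymptotic regime, as $Q \to \infty$, we have
$$\frac{1}{Q} \boldsymbol{\lambda}_{m,k}^{\mathrm{H}} \boldsymbol{\lambda}_{m,k'} \to \mathbb{E}\{\lambda_{q,k}^{*}[m]\, \lambda_{q,k'}[m]\}, \tag{9}$$
which is zero for $k \neq k'$, where $\boldsymbol{\lambda}_{m,k}$, of length $Q$, represents the $k$-th column of $\mathbf{\Lambda}_m$.

In practical systems, the number of BS antennas is limited, and the off-diagonal elements of $\hat{\mathbf{\Lambda}}_m^{\mathrm{H}} \mathbf{\Lambda}_m$, in (6), lead to a significant amount of interference. Thus, MMSE combining, $\hat{\mathbf{X}}_m^{\mathrm{MMSE}} = \mathbf{\Phi}_m^{\mathrm{MMSE}} \mathbf{Y}_m$, where $\mathbf{\Phi}_m^{\mathrm{MMSE}} = (\hat{\mathbf{\Lambda}}_m^{\mathrm{H}} \hat{\mathbf{\Lambda}}_m + \sigma_w^2 \mathbf{I}_K)^{-1} \hat{\mathbf{\Lambda}}_m^{\mathrm{H}}$, provides an improved performance compared to MRC.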
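The effect of channel estimation noise on the MRC normalization factor can be illustrated with a short Monte Carlo sketch. The setup is an assumption made for illustration only: a single user, unit-power i.i.d. channel gains across $Q = 4000$ antennas, and LS estimation noise of variance $\sigma_w^2 / N_p$ per antenna, which is what orthogonal pilots of length $N_p$ would give. The helper `complex_gaussian` is hypothetical.

```python
import random

def complex_gaussian(rng, variance):
    """One circularly symmetric complex Gaussian sample with the given variance."""
    s = (variance / 2) ** 0.5
    return complex(rng.gauss(0, s), rng.gauss(0, s))

rng = random.Random(1)
Q, N_P, SIGMA2 = 4000, 2, 1.0

# True channel gains at one subcarrier and their noisy LS estimates
lam = [complex_gaussian(rng, 1.0) for _ in range(Q)]
lam_hat = [l + complex_gaussian(rng, SIGMA2 / N_P) for l in lam]

true_gain = sum(abs(l) ** 2 for l in lam) / Q        # (1/Q) * ||lambda||^2
naive_gain = sum(abs(l) ** 2 for l in lam_hat) / Q   # biased upward by the noise power
corrected_gain = naive_gain - SIGMA2 / N_P           # noise power removed
```

In this sketch `naive_gain` overshoots `true_gain` by roughly $\sigma_w^2 / N_p$, while `corrected_gain` lands close to it, which is the motivation for subtracting the noise term from the normalization factor.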

IV. PROPOSED CHANNEL ESTIMATION AND COMBINING
The linear combining techniques presented in the previous section take advantage of the channel hardening effect and spatial diversity gains in massive MIMO. However, they require knowledge of the whole CFR, and therefore, a minimum of $LK$ time-frequency slots are allocated to the pilots for channel estimation. In practical systems, $L$ can take large values, especially as the bandwidth increases. Hence, in this section, we study the possibility of reducing the pilot overhead to a single pilot subcarrier. In that case, only $K$ time-frequency slots would be necessary for pilot allocation, reducing the training overhead by a factor of $L$.
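The overhead comparison amounts to simple arithmetic, sketched below. The numbers are assumptions for illustration: $M = 1024$ subcarriers as in the simulations, a channel length equal to 7% of the OFDM symbol (the lower end of the range quoted in the introduction), and a hypothetical $K = 8$ users, a value not taken from the letter.

```python
import math

def conventional_pilot_slots(L, K):
    """Minimum pilot slots when L pilot subcarriers are needed, each carrying K pilots."""
    return L * K

def single_subcarrier_pilot_slots(K):
    """Pilot slots when only one reference subcarrier carries pilots."""
    return K

M = 1024                          # number of subcarriers
K = 8                             # number of users (hypothetical value)
L = math.ceil(0.07 * M)           # channel length at 7% of the OFDM symbol

conventional = conventional_pilot_slots(L, K)
proposed = single_subcarrier_pilot_slots(K)
reduction_factor = conventional // proposed
```

With these assumed values, $L = 72$ pilot subcarriers are replaced by one, so the reduction factor equals $L$ regardless of the number of users.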
In current standards, the subcarrier spacing $\Delta f$ is chosen to be much smaller than the coherence bandwidth [9]. Hence, the channel can be considered flat across the frequency band of multiple adjacent subcarriers. This suggests that the channel at a single subcarrier can be considered for the equalization of its adjacent subcarriers. Moreover, the correlation $\alpha_{\Delta m,k}$ between the CFRs of subcarriers that are $\Delta m$ apart can be obtained from the approximation (11) of [10], in which $1/\ell_\tau$ is the coherence bandwidth and $\ell_\tau$ is the maximum delay spread of the channel. Taking this into account, we consider a reference pilot in one subcarrier and utilize its channel estimate for equalization of the neighboring subcarriers. The detected data symbols will be considered as virtual pilots, updating their channel estimates and allowing for the equalization of data at their adjacent subcarriers. We repeat this process, in a sliding manner, until all the data and the CFR in the UL frame are estimated. Hence, the data $\mathbf{X}_m$, at any given subcarrier $m$, can be equalized using (10) and (11) with the CFR at subcarrier $m-1$, as expressed in (12). In (12), the CFR at subcarrier $m$ is approximated using the CFR at subcarrier $m-1$, and the phase of $\alpha_{1,k}$ is considered to be negligible. After hard decision of the QAM symbols in $\hat{\mathbf{X}}_m^{\mathrm{MMSE}}$ to obtain $\hat{\mathbf{X}}_m^{\mathrm{HD}}$, the CFR at subcarrier $m$ is updated as
$$\hat{\mathbf{\Lambda}}_m = \mathbf{Y}_m (\hat{\mathbf{X}}_m^{\mathrm{HD}})^{\dagger}. \tag{13}$$
This channel estimate is then used for equalization of the following subcarrier, $m+1$, using (12). In particular, since we now consider $\hat{\mathbf{X}}_m^{\mathrm{HD}}$ as the pilot, the noise mitigation term in (8) should be corrected accordingly. That is, the term $\mathbf{P}_m^{\mathrm{H}}$ in $\mathbf{\Gamma}_m$ and $\mathbf{\Phi}_m^{\mathrm{MMSE}}$ should be substituted by $(\hat{\mathbf{X}}_m^{\mathrm{HD}})^{\dagger}$. Then the procedure is repeated in a sliding manner until the whole frame is equalized. If the detected symbols $\hat{\mathbf{X}}_m^{\mathrm{HD}}$ are erroneous, (13) provides imperfect channel estimates, and hence, the sliding technique propagates the error to further subcarriers. Therefore, as we will show in the later part of this section, we propose the concept of depth, which significantly improves the accuracy of $\hat{\mathbf{X}}_m^{\mathrm{HD}}$, alleviating the error propagation issue. It is worth noting that $\mathbf{X}_m$ can be a rank-deficient matrix. Specifically, if the rank of $\mathbf{X}_m$ is lower than $K$, (13) fails to retrieve $\mathbf{\Lambda}_m$. To solve this issue, we calculate $\alpha_{\Delta m,k}$ for any value of $\Delta m$. This way, $\hat{\mathbf{X}}_m^{\mathrm{MMSE}}$ at any subcarrier can be detected with the same reference pilot using (12).
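The sliding idea, detect with the previous subcarrier's CFR, then treat the hard decisions as virtual pilots to update the CFR, can be sketched in a heavily simplified setting. The assumptions are substantial and deliberate: a single user ($K = 1$), MRC instead of MMSE combining, one QPSK data symbol per subcarrier, no noise, a perfectly known CFR at the reference subcarrier, a single sliding direction, and no depth averaging. All function names are hypothetical.

```python
import cmath
import random

def cfr(h, m, M):
    """CFR of the CIR h at subcarrier m: sum_l h[l] * exp(-j*2*pi*l*m/M)."""
    return sum(h_l * cmath.exp(-2j * cmath.pi * l * m / M) for l, h_l in enumerate(h))

def hard_qpsk(z):
    """Hard decision onto the unit-energy QPSK constellation."""
    return complex(1 if z.real >= 0 else -1, 1 if z.imag >= 0 else -1) / 2 ** 0.5

def slide_up(Y, lam_ref, ref):
    """Detect subcarriers ref+1, ref+2, ... starting from the CFR at subcarrier ref."""
    lam_hat, detected = list(lam_ref), {}
    for m in range(ref + 1, len(Y)):
        # MRC with the CFR estimate of the previous subcarrier
        num = sum(l.conjugate() * y for l, y in zip(lam_hat, Y[m]))
        den = sum(abs(l) ** 2 for l in lam_hat)
        x_hd = hard_qpsk(num / den)
        detected[m] = x_hd
        # Virtual-pilot update: y times the pseudo-inverse of the detected symbol
        lam_hat = [y * x_hd.conjugate() / abs(x_hd) ** 2 for y in Y[m]]
    return detected

rng = random.Random(0)
M, Q, L = 64, 64, 4
qpsk = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]
# i.i.d. CIRs with a uniform PDP (assumption), one per antenna
h = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) / (2 * L) ** 0.5 for _ in range(L)]
     for _ in range(Q)]
x = [rng.choice(qpsk) for _ in range(M)]
Y = [[cfr(h[q], m, M) * x[m] for q in range(Q)] for m in range(M)]   # noise-free
lam_ref = [cfr(h[q], 0, M) for q in range(Q)]                         # pilot at subcarrier 0

detected = slide_up(Y, lam_ref, ref=0)
errors = sum(1 for m, x_hd in detected.items() if abs(x_hd - x[m]) > 1e-9)
```

Because the channel changes little between adjacent subcarriers in this setup, every decision is correct and each update returns the exact CFR, so `errors` is zero. With noise, a wrong decision would corrupt the next CFR estimate, which is the error propagation that the depth concept is meant to alleviate.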
Proposition 1: In the asymptotic regime, it is possible to perform MRC with the channel estimates on a single subcarrier. In other words, the data at any given subcarrier $m$ can be equalized with the CFR at any other subcarrier $m'$.
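Proof: We detect the data $\mathbf{X}_m$, at any given subcarrier $m$, by performing MRC with the channel estimates at subcarrier $m'$, i.e.,
$$\hat{\mathbf{X}}_m^{\mathrm{MRC}} = \mathbf{\Gamma}_{m'}^{-1} \hat{\mathbf{\Lambda}}_{m'}^{\mathrm{H}} \mathbf{Y}_m. \tag{14}$$
Considering (9) in the asymptotic regime, as $Q \to \infty$, we have
$$\frac{1}{Q} \boldsymbol{\lambda}_{m',k}^{\mathrm{H}} \boldsymbol{\lambda}_{m,k} \to \mathbb{E}\{\lambda_{q,k}^{*}[m']\, \lambda_{q,k}[m]\} = \alpha_{\Delta m,k}\, \mathbb{E}\{|\lambda_{q,k}[m']|^2\}, \tag{15}$$
where $\Delta m = m - m'$ and $\boldsymbol{\lambda}_{m,k}$ denotes the $k$-th column of $\mathbf{\Lambda}_m$. Consequently, in (14), $\mathbf{\Gamma}_{m'}^{-1} \hat{\mathbf{\Lambda}}_{m'}^{\mathrm{H}} \mathbf{\Lambda}_m \to \mathbf{\Psi}_{\Delta m}$, where $\mathbf{\Psi}_{\Delta m}$ is a diagonal matrix with the diagonal elements formed by the vector $[\alpha_{\Delta m,0}, \ldots, \alpha_{\Delta m,K-1}]$. To obtain the exact value of $\alpha_{\Delta m,k}$, we substitute the CFRs with the expression given in (2). This yields a quadratic form, with the expectation defined as [11]
$$\mathbb{E}\{\bar{\mathbf{h}}_{q,k}^{\mathrm{H}} \mathbf{A} \bar{\mathbf{h}}_{q,k}\} = \mathrm{tr}\{\mathbf{A} \mathbf{\Sigma}_k\} + \boldsymbol{\mu}^{\mathrm{H}} \mathbf{A} \boldsymbol{\mu}, \tag{16}$$
where $\boldsymbol{\mu}$ and $\mathbf{\Sigma}_k$ represent the mean and covariance matrix of the zero-padded CIR $\bar{\mathbf{h}}_{q,k}$, respectively, and $\mathbf{A} = M\, \mathbf{f}_{M,m'}^{*} \mathbf{f}_{M,m}^{\mathrm{T}}$. With the parameters presented in Section II, the channel is zero mean, i.e., $\boldsymbol{\mu} = \mathbf{0}_{M \times 1}$, and $\mathbf{\Sigma}_k$ is a diagonal matrix with the diagonal elements formed by the zero-padded PDP. Hence, (16) reduces to $\sum_{\ell=0}^{L-1} [\mathbf{\Sigma}_k]_{\ell,\ell}\, e^{-j 2\pi \ell \Delta m / M}$, so that
$$\alpha_{\Delta m,k} = \frac{1}{\mathrm{tr}\{\mathbf{\Sigma}_k\}} \sum_{\ell=0}^{L-1} [\mathbf{\Sigma}_k]_{\ell,\ell}\, e^{-j 2\pi \ell \Delta m / M}. \tag{17}$$
Based on this result, as long as the PDP is known, equalization can be performed using (14) followed by scaling the MRC output for each subcarrier by $1/\alpha_{\Delta m,k}$.

While Proposition 1 is sufficient in the asymptotic regime, in practical systems, the aforementioned scaling by $1/\alpha_{\Delta m,k}$ may lead to noise enhancement and, consequently, a performance loss. For this reason, we rely on our proposed sliding equalization technique, as $|\alpha_{\Delta m,k}|$ is larger for the smaller values of $|\Delta m|$. When $\mathbf{X}_m$ is rank-deficient, $\mathbf{\Lambda}_m$ cannot be estimated. Thus, we resort to utilizing the previously estimated channel on the closest subcarrier. Furthermore, if the PDP is known at the receiver, the scaling term $\alpha_{\Delta m,k}$ can be calculated from (17), instead of using the approximation from (11).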
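The dependence of the scaling coefficient on the PDP and on the subcarrier spacing can be sketched numerically. The sketch assumes that $\alpha_{\Delta m,k}$ is the normalized frequency correlation $\mathbb{E}\{\lambda^{*}[m']\lambda[m]\} / \mathbb{E}\{|\lambda[m']|^2\}$ of a zero-mean channel with uncorrelated taps, and it uses a uniform PDP with $L = 16$ taps and $M = 256$ subcarriers, values chosen only for illustration. The function name `alpha` is hypothetical.

```python
import cmath

def alpha(pdp, delta_m, M):
    """Normalized frequency correlation at subcarrier offset delta_m for a given PDP."""
    total = sum(pdp)
    return sum(p * cmath.exp(-2j * cmath.pi * l * delta_m / M)
               for l, p in enumerate(pdp)) / total

M, L = 256, 16
pdp = [1.0 / L] * L   # uniform PDP (assumption)

# Magnitude of the coefficient at a few subcarrier offsets
mags = {d: abs(alpha(pdp, d, M)) for d in (0, 1, 8, 16)}
```

For this PDP the magnitude is 1 at offset 0, about 0.994 at offset 1, about 0.64 at offset 8, and essentially zero at offset 16. Dividing the combiner output by such a small coefficient would amplify noise considerably, which illustrates why only nearby subcarriers are useful for combining.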
When the coherence bandwidth is much larger than the subcarrier spacing, the correlation $\alpha_{\Delta m,k}$ takes large values for multiple values of $\Delta m$. This sparks the idea of taking advantage of the frequency diversity in addition to the spatial diversity. To this end, improved SINR performance can be achieved by performing equalization for each subcarrier multiple times, each with the channel of a different subcarrier within the coherence bandwidth. This improvement substantially reduces the error propagation issue from (13).
While the channels from all subcarriers could be considered for the averaging, the channels with a small correlation would bring minimal improvement to the output. Since the value of $|\alpha_{\Delta m,k}|$ decreases as we increase the value of $|\Delta m|$, only the channels of subcarriers within a limited range are considered. We call the maximum range considered the depth, $D$.
Ideally, the depth would consider all adjacent subcarriers, at higher and lower indices. However, since the channels are updated in a sliding manner, only the channel on one side of the subcarriers is available at the time of equalization. Therefore, the final output of our proposed technique is obtained in two steps. We perform the first step of the proposed technique by sliding from lower to upper subcarrier indices until the whole frame is equalized. For the second step, the procedure is realized again, however, now sliding from higher to lower subcarrier indices. It is worth noting that both steps can be realized in parallel as the steps are independent of each other. The outputs of both steps are then averaged to obtain the final output of our proposed technique. We distinguish each step with the direction variable $\xi \in \{-1, 1\}$, representing steps one and two, respectively. The output of the step with direction $\xi$ at subcarrier $m$ is denoted by $\hat{\mathbf{X}}_{m,\xi}$. The final output is then obtained as $\hat{\mathbf{X}}_m = \frac{1}{2} (\hat{\mathbf{X}}_{m,-1} + \hat{\mathbf{X}}_{m,1})$. Our proposed technique is summarized in Algorithm 1.

Algorithm 1 Proposed sliding technique with reference pilot at subcarrier index $i$.
1: Initialize: Estimate the channel at the reference subcarrier (pilot), obtaining $\hat{\mathbf{\Lambda}}_i$ using (4).
...
27: Obtain $\hat{\mathbf{X}}_m$ by averaging the two sets of results.
In the following section, we numerically evaluate the performance of our proposed technique. We compare our solution with the conventional LS-based channel estimation and MMSE combining. We show that our proposed channel estimation and equalization method, using only $K$ pilots, outperforms the conventional method, which requires $LK$ pilots.

V. NUMERICAL RESULTS
In this section, we evaluate the efficacy of our proposed technique and compare it with the conventional LS-based channel estimation and MMSE equalization. In our simulations, we consider the UL transmission of a single frame with 16-QAM and OFDM modulation with $M = 1024$ subcarriers.
In Fig. 1, we evaluate the SINR performance of our proposed technique versus the number of BS antennas for $D = 0, 1, 2$ and $3$ and the input SNR of 0 dB. When $D = 0$, we only consider one sliding direction where one estimate of the data symbol at each subcarrier is obtained. The results in Fig. 1 show that our proposed technique can effectively average out noise and multiuser interference as the number of BS antennas grows large. Furthermore, when $D = 0$, our proposed technique can achieve very close performance to that of the conventional MMSE combiner. Meanwhile, for $D = 1, 2$ and $3$, it leads to an SINR performance gain of around 2 dB compared to the conventional MMSE combining. The SINR performance of our proposed technique becomes linear only after deploying a minimum number of BS antennas. This is due to error propagation at the channel estimation stage in (13). This highlights the benefit of increasing depth as it greatly improves the SINR performance, especially when considering fewer BS antennas or lower input SNR. Furthermore, the frequency diversity predominantly averages out the channel estimation noise in (4). However, as shown in Fig. 2, in the absence of noise, increasing depth adversely affects the signal-to-interference ratio (SIR). This is due to the approximation in (12). The conventional LS channel estimation and MMSE combining lead to infinite SIR in the absence of noise and thus cannot be included in the results of Fig. 2.
Finally, in Fig. 3, we analyze the BER performance of our proposed technique when $Q = 200$ BS antennas are deployed. The proposed technique with $D = 3$ leads to a 2 dB performance gain at large $E_b/N_0$ values compared to the conventional method. These results also show that increasing $D$ from 1 to 3 provides almost 4 dB performance gain.

VI. CONCLUSION
In this letter, we proposed a sliding joint channel estimation and equalization technique that significantly reduces the pilot overhead in massive MIMO. With the CFR estimates on a single reference subcarrier, we showed that the data symbols on the adjacent subcarriers can be detected with linear combining. The detected data symbols are then utilized as virtual pilots to update their corresponding channel estimates. The updated CFRs are then used for equalization of the remaining data symbols, in a sliding manner. With this approach, we obtain multiple estimates of the transmitted data symbols at each subcarrier that are averaged to provide additional frequency diversity to the spatial diversity gains of massive MIMO. Through extensive numerical analysis, we showed that our proposed technique achieves improved SINR and BER performance compared to the conventional linear combiners.
$[\mathbf{A}]_{m,n}$ represents the element on row $m$ and column $n$ of $\mathbf{A}$. $\mathrm{tr}\{\mathbf{A}\}$ represents the trace of $\mathbf{A}$. $\mathbf{I}_M$ and $\mathbf{0}_{M \times N}$ are the $M \times M$ identity and $M \times N$ zero matrices, respectively. The superscripts $(\cdot)^{\dagger}$, $(\cdot)^{\mathrm{H}}$, $(\cdot)^{\mathrm{T}}$ and $(\cdot)^{*}$ indicate pseudo-inverse, Hermitian, transpose and conjugate operations, respectively. $|\cdot|$, $((\cdot))_M$ and $\mathbb{E}\{\cdot\}$ are the absolute value, modulo-$M$, and expected value operators, respectively. Finally, $\mathbf{F}_M$ is the normalized $M$-point discrete Fourier transform (DFT) matrix with the elements $[\mathbf{F}_M]_{m,n} = \frac{1}{\sqrt{M}} e^{-j 2\pi m n / M}$.

Fig. 1. SINR performance as a function of the number of BS antennas.

Fig. 2. SIR performance as a function of the number of BS antennas.

Fig. 3. BER performance for our proposed technique and MMSE combining.