Data Aided Channel Estimation for MIMO-OFDM Wireless Systems Using Reliable Carriers

Wireless channel estimation is one of the challenging problems in multiple input multiple output orthogonal frequency division multiplexing (MIMO-OFDM) wireless systems. The MIMO-OFDM exploits the spatial resources and increases the reliability and capacity of wireless systems. However, the performance of these systems depends on accurate channel estimation since the receivers require perfect channel state information (CSI) for coherent signal detection. Thus, wireless channel estimation is a necessary component of OFDM systems. In this article, we have proposed an algorithm for MIMO-OFDM systems that combines pilot symbols with reliable data symbols for channel estimation. The reliable data symbols serve as virtual pilots and enhance the spectral efficiency of the system. The proposed data aided channel estimation (DACE) algorithm eliminates the requirement of any additional resources such as excessive number of training sequences to attain the desired performance. Also, it outperforms the traditional least square (LS) and linear minimum mean square error (LMMSE) methods for channel estimation in terms of mean square error (MSE) and bit error rate (BER) performance of the system.


I. INTRODUCTION
Wireless communication is one of the most active areas of technological growth, and rapid developments have been made in recent times to improve the performance of wireless systems. Several generations of mobile wireless technologies have been introduced in the past few decades, as wireless technology has enabled services ranging from voice to data and multimedia transmission. Further advancements with a vast range of applications are now targeted for future wireless systems [1], [2], [3], [4], [5]. In this regard, MIMO-OFDM systems meet the requirements of greater bandwidth and capacity, resist noise and multipath effects, and offer high data transmission rates. In wireless communication, CSI is a fundamental concept that makes it possible to adapt the transmission to the current channel condition, which is essential to achieve reliable communication in MIMO-OFDM systems. However, it is difficult to obtain accurate CSI due to the time varying nature of the wireless channel [6], [7], [8].
The associate editor coordinating the review of this manuscript and approving it for publication was Usama Mir.
In recent times, different modulation techniques have been devised, but OFDM presents its own advantages. OFDM is a multicarrier modulation (MCM) technique where the whole transmission bandwidth is divided into N number of orthogonal subcarriers and the parallel data streams are multiplexed to offer high data rates. It converts the frequency selective channel into the frequency flat channels and avoids inter symbol interference (ISI) using cyclic prefix. Furthermore, it is easier to implement OFDM as compared to some other techniques like filter bank multicarrier (FBMC) that does not need cyclic prefix but has greater computational complexity [9], [10], [11], [12], [13], [14].
Channel estimation is one of the demanding challenges in OFDM communication systems [15], [16]. Different channel estimation methods have been discussed in the literature, including LS, minimum mean square error (MMSE), LMMSE, and singular value decomposition (SVD)-LMMSE, but LS and LMMSE are the most common because of their simplicity and good performance. Therefore, several channel estimation techniques have been developed around LS and LMMSE estimators. The LS estimator is simpler than LMMSE because it does not require a priori knowledge of the channel statistics, and it offers comparable performance at high signal-to-noise ratio (SNR). The LMMSE estimator, in contrast, uses second-order channel statistics to minimize the MSE. It is computationally more complex because the multiplication and inversion of high dimensional matrices are required for each estimate. However, it is robust and offers better performance over fast fading wireless channels. Generally, increasing the number of training sequences (pilots) improves the performance of both LS and LMMSE estimators, but it reduces the spectral efficiency of the wireless system [17], [18], [19], [20], [21], [22], [23].
One of the important questions that has been addressed in this research is how to improve the MSE and BER performance of MIMO-OFDM systems without increasing the number of pilots? In this regard, different channel estimation techniques have been considered and it is realized that DACE techniques offer better performance using reliable data symbols. The reliable data symbols are those symbols which are decoded correctly by the receiver and can be used as virtual pilots. These virtual pilots are merged with original pilots to update the solutions for conventional channel estimation methods. This improves the quality of channel estimation without compromising the spectral efficiency of the system. However, another key challenge is how to develop an algorithm to select the reliable data symbols out of all possible data symbols transmitted through the wireless channel?
This problem has captured a lot of attention from the research community, but only a few techniques/algorithms have been proposed to deal with it. One of the most widely used methods to select reliable symbols is maximum likelihood (ML). Although it is optimal at high SNR in the additive white Gaussian noise (AWGN) scenario, it is not preferable when several received symbols are equidistant from a certain constellation point. Also, ML performance deteriorates because it ignores the channel estimation error [24], [25]. In this work, the reliable data symbols are selected using a novel algorithm for OFDM systems. The proposed method calculates the reliability of each observation independently, and judiciously selects the most reliable symbols when they are equidistant from a certain constellation point. It considers both the noise error and the channel estimation error for the detection. Furthermore, the proposed algorithm offers promising performance for both single input single output (SISO) and MIMO-OFDM systems and achieves higher accuracy than the existing methods for channel estimation.
Several channel estimation techniques have been proposed and investigated, but each of them has its own limitations. The ultimate goal is to accurately estimate the wireless channel, in order to compensate for its detrimental effects on transmitted signals, and for perfect signal demodulation. The existing channel estimation techniques are discussed below and shown in Fig. 1.

A. BLIND CHANNEL ESTIMATION (BCE)
The statistical characteristics of the received signal are exploited in BCE techniques, which require a large amount of data. These techniques have been investigated to overcome the wastage of bandwidth in pilot assisted channel estimation (PACE) techniques. The BCE techniques are divided into statistical and deterministic methods. In deterministic methods, both the received signals and the channel coefficients are treated as deterministic quantities. A comparison of the two indicates that the deterministic methods converge faster than the statistical methods. However, the deterministic methods have greater computational complexity, which increases further as the constellation order increases. On the other hand, the performance of statistical methods deteriorates when dealing with exceptionally short sampling sequences. As no training sequences are required, BCE techniques seem more attractive in terms of bandwidth utilization [26], [27]. However, these techniques have greater computational complexity and a low convergence rate, which cause performance degradation. Also, BCE techniques require time invariant channels for good performance, so they are restricted to slow fading channels and are not favorable for fast fading channels.

B. NON-BLIND CHANNEL ESTIMATION (NBCE)
In NBCE techniques, information of the previous channel estimates or some portion of the transmitted signal is available to the receiver to be used for channel estimation. These techniques can be divided into the following two types.

1) PILOT ASSISTED CHANNEL ESTIMATION (PACE)
In PACE, also known as training based channel estimation, data known to the receiver is multiplexed with the transmitted data symbols at pre-determined locations before transmission. Using optimized spacing between pilot and data symbols, the PACE techniques offer better performance in terms of the spectral efficiency of the system. However, despite the several improvements that have been made, the major drawback of PACE techniques is the wastage of bandwidth due to the large number of pilots required for channel estimation. Another drawback is that the channel estimates depend only on the pilot tones, so interpolation is used to estimate the channel at the data tones, and interpolation is not always an accurate estimator [28], [29], [30], [31], [32].

2) DECISION DIRECTED CHANNEL ESTIMATION (DDCE)
The DDCE technique exploits the information of non-pilot symbols to improve the performance of wireless systems. It offers higher reliability than PACE techniques. However, error propagation can occur in successive decisions of DDCE because it utilizes previous channel estimates to decode the current OFDM symbols. The newly estimated symbol information is then used to estimate the channel corresponding to the current OFDM symbols. Generally, the channel estimator utilizes soft symbol information or hard symbol information in the iterative form of DDCE techniques. This information is filtered iteratively and estimated by the detector. It is then sent back to the channel estimator to improve the channel estimates as the number of iterations increases. In this way, the detector takes advantage of the better channel estimates and produces updated symbol information. Hence, the detector works in an iterative manner with the channel estimator. The channel estimator uses the hard decision output of the detector in hard iterative channel estimation methods. In contrast, if soft symbol information is employed, the log-likelihood ratios (LLR) estimated by the detector are used by the channel estimator. Such techniques are known as soft iterative channel estimation methods [33], [34], [35], [36], [37].

C. DATA AIDED CHANNEL ESTIMATION (DACE)
The DACE technique improves the quality of channel estimation using both pilot and decoded data symbols. This is in sharp contrast to the PACE techniques, which rely only on pilot symbols. That is why DACE outperforms the PACE techniques, reduces the pilot overhead, and enhances the spectral efficiency of wireless systems [38], [39], [40], [41], [42], [43]. Different channel estimation techniques have been analyzed from the perspective of selecting reliable data symbols to improve the MSE and BER performance of MIMO-OFDM systems, and it turns out that DACE techniques can give better performance in this regard. MIMO-OFDM increases the range, offers superior performance using multipath propagation, and improves the data rate by combining multiple data streams. Thus, it provides both diversity gain and multiplexing gain. The diversity gain increases the reliability of the system, as multiple received signals from different channels are combined to estimate the transmitted signal, whereas the multiplexing gain increases the capacity, as different signals are sent through different paths [44], [45], [46], [47], [48], [49].

D. PERFORMANCE MEASURE FOR DACE
The performance measure for DACE depends upon various factors including pilot patterns, estimation methods, and detection methods.

1) PILOT PATTERNS
The selection of pilot spacing requires extreme care because a greater number of pilots improves the quality of channel estimation but reduces the spectral efficiency of the system, and vice versa. A pilot pattern that is optimum for one channel may not be optimum for another channel because of different fading processes. Another significant feature is the power allocated to the pilot tones in comparison with the data symbols. In numerous cases, equal power is assigned to both pilot and data symbols. The accuracy of channel estimation can be enhanced by assigning more power to the pilot tones, but, for a fixed transmit power, this reduces the SNR of the data symbols.
Pilots can be used in different patterns including block type and comb type pilot patterns as shown in Fig. 2. The block type pilot configuration has pilot tones for all subcarriers in first OFDM block and it is preferable for slow fading channels. Whereas comb type pilot configuration has uniform distribution of pilot tones in all OFDM blocks and it is suitable for fast fading channels [30], [50].
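As a minimal illustration of the two arrangements, the following Python sketch builds boolean pilot masks over a subcarrier-by-symbol grid. The function names, the grid of 8 OFDM symbols, and the `period` and `spacing` parameters are assumptions introduced for this example only; the values N = 256 and 16 pilots per OFDM symbol are taken from the simulation settings of this article.

```python
import numpy as np

def block_pilot_mask(n_sub, n_sym, period):
    """Block type: all subcarriers of every `period`-th OFDM symbol carry pilots."""
    mask = np.zeros((n_sub, n_sym), dtype=bool)
    mask[:, ::period] = True
    return mask

def comb_pilot_mask(n_sub, n_sym, spacing):
    """Comb type: every `spacing`-th subcarrier carries a pilot in all OFDM symbols."""
    mask = np.zeros((n_sub, n_sym), dtype=bool)
    mask[::spacing, :] = True
    return mask

# Hypothetical frame of 8 OFDM symbols with N = 256 subcarriers.
block = block_pilot_mask(n_sub=256, n_sym=8, period=8)    # pilots only in the first symbol
comb = comb_pilot_mask(n_sub=256, n_sym=8, spacing=16)    # 16 pilots in every symbol

print("block pilots per symbol:", block.sum(axis=0))
print("comb pilots per symbol: ", comb.sum(axis=0))
```

In the block arrangement the channel is only observed once per frame, which is why it suits slowly varying channels; the comb arrangement observes the channel in every symbol at the cost of interpolation across subcarriers.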

2) ESTIMATION METHODS
There are many channel estimation methods, such as LS, MMSE, LMMSE, and SVD-LMMSE, but LS and LMMSE are the most widely used. Every method has its own advantages and disadvantages. For instance, LS is a simple and direct method, and its performance approaches that of MMSE at high SNR. It does not require channel and noise statistics, but because it ignores the influence of noise completely, it has greater estimation error, particularly at low SNR. Also, LS offers no advantage if the channel length or the channel impulse response (CIR) is unknown, whereas if the channel is known to have L taps, improved performance can be obtained because of the noise reduction.
In contrast to the LS, the MMSE considers the noise influence and gives better performance at low SNR. However, its computational complexity is high due to the extra information incorporated in the estimation technique. Similarly, LMMSE is extensively utilized for the channel estimation of OFDM based systems as it is an optimum method to reduce the MSE while AWGN is present. It requires prior knowledge of pilot symbols or channel statistics for the channel estimation. Therefore, it has greater computational complexity that can be reduced by the appropriate insertion of pilot tones across the OFDM subcarriers [51], [52], [53]. Similarly, SVD also reduces the computational complexity of LMMSE. However, SVD estimation involves complex nonlinear optimization methods, and its estimation is prone to the error of estimated channel matrix [54], [55].
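The difference between the two estimators can be made concrete with a small Python sketch for a single-antenna link with an L-tap channel observed on comb-type pilots. This is an illustration under stated assumptions, not the authors' simulation code: the tap count L = 4, the SNR of 20 dB, the i.i.d. Gaussian taps, the prior covariance R_h = I/L, and all variable names are assumed for the example, and the LMMSE expression used is one standard textbook form.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, Np, snr_db = 256, 4, 16, 20.0   # L and the SNR are assumed values

# Partial FFT matrix F: first L columns of the unitary N x N FFT matrix.
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N) / np.sqrt(N)

# Assumed channel: L i.i.d. complex Gaussian taps with total power 1.
h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2 * L)
H = np.sqrt(N) * (F @ h)              # channel frequency response

# Unit-power 4-QAM symbols and AWGN with variance sigma2.
X = (rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)) / np.sqrt(2)
sigma2 = 10 ** (-snr_db / 10)
Z = np.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
Y = H * X + Z

# Comb-type pilots: the receiver knows X only at the indices in P.
P = np.arange(0, N, N // Np)
A_P = np.sqrt(N) * X[P, None] * F[P, :]   # pilot rows of A = sqrt(N) diag(X) F

# LS: least-squares solution of Y_P = A_P h + Z_P (no channel statistics needed).
h_ls = np.linalg.lstsq(A_P, Y[P], rcond=None)[0]

# LMMSE (one standard form) with the assumed prior covariance R_h = I / L.
R_h_inv = L * np.eye(L)
h_lmmse = np.linalg.solve(sigma2 * R_h_inv + A_P.conj().T @ A_P,
                          A_P.conj().T @ Y[P])

mse_ls = np.mean(np.abs(h_ls - h) ** 2)
mse_lmmse = np.mean(np.abs(h_lmmse - h) ** 2)
print(f"LS MSE: {mse_ls:.2e}, LMMSE MSE: {mse_lmmse:.2e}")
```

The only extra inputs of the LMMSE branch are the noise variance and the channel covariance, which is exactly the prior knowledge that the LS branch does without.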

3) DETECTION METHODS
The detection of reliable data symbols is an integral part of DACE techniques, so the detection method plays a vital role in this regard. The most common detection method is ML that determines reliable symbols based on minimum distance from a certain constellation point. However, it is not preferable in equidistant scenario for multiple symbols. Also, intelligent ML decoder needs to update its decisions iteratively in frequency domain based on the resulting waveforms in time domain [56]. So, one faulty decision in frequency domain can generate error propagation on subsequent decisions.
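The limitation in the equidistant scenario can be seen in a short Python sketch of a minimum-distance decision. The unit-power 4-QAM constellation, the radius r = 0.4, and the function name `min_distance_decision` are assumptions made for this illustration.

```python
import numpy as np

# Unit-power 4-QAM constellation (assumed for the example).
QAM4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def min_distance_decision(x_hat, points=QAM4):
    """Return the nearest constellation point and the distance to it."""
    x_hat = np.atleast_1d(np.asarray(x_hat, dtype=complex))
    dist = np.abs(x_hat[:, None] - points[None, :])
    nearest = np.argmin(dist, axis=1)
    return points[nearest], dist[np.arange(len(x_hat)), nearest]

s = QAM4[0]
r = 0.4
x1 = s + r * np.exp(1j * np.pi)       # displaced toward a neighbouring point
x2 = s + r * np.exp(1j * np.pi / 4)   # displaced away from all other points

decided, distance = min_distance_decision([x1, x2])
print(decided)    # both symbols are decided as s
print(distance)   # and both lie at the same distance r from s
```

Both observations receive the same decision and the same distance metric, so a criterion based only on the distance to the nearest point cannot tell that the second observation is the safer one.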
Some other detection methods have also been proposed in the literature including symbol selection algorithm (SSA) in which SNR based reliability coefficient (RC) is associated with each received symbol. The symbols with highest RC values are considered reliable, and the rest are discarded [57]. However, SSA is also not an optimal choice for the accurate detection of reliable symbols. Thus, it is key to introduce such an algorithm that detects the reliable symbols with high precision and improves the quality of channel estimation for MIMO-OFDM systems.

E. MACHINE LEARNING BASED METHODS FOR CHANNEL ESTIMATION
Apart from the above-mentioned basic techniques, machine learning based channel estimation methods have also been developed in recent years. Existing OFDM receivers first estimate the CSI explicitly and then detect/recover the transmitted symbols. In contrast, with the deep learning approach, the CSI is estimated implicitly and the transmitted symbols are recovered directly. The use of deep neural networks (DNN) for channel estimation improves the performance because the various characteristics of the wireless channel are learnt and analyzed by the DNN. DNNs are essentially deeper versions of artificial neural networks (ANN) with a greater number of hidden layers. In this regard, the deep learning approach is more effective when the wireless channel has greater interference and distortion [58], [59]. However, it poses various challenges in terms of the availability of datasets and training, because few datasets are available in wireless communication. Also, privacy and data security, fluctuations in the wireless propagation environment, and interference are constraints on obtaining these datasets. Training on these datasets is another crucial problem in the deep learning approach, because the time complexity becomes too high when all layers are trained simultaneously, and the deviation can propagate from one layer to the next if one layer is trained at a time. Furthermore, the machine learning techniques have considered only simplified system models, whereas radio environments with high complexity have not been investigated comprehensively. So, it is important to investigate all the aforesaid challenges associated with the deep learning approach for wireless channel estimation.

F. RESEARCH PROBLEMS
Looking at different channel estimation techniques, recent trends indicate that performance improvement for channel estimation in MIMO-OFDM systems comes at a cost in many ways. Some of the important research gaps addressed in this work are shown in Table 1. The indicated shortcomings are taken as benchmarks to improve the performance of channel estimation. Knowing that data aided techniques can give better performance and higher spectral efficiency, we propose a novel algorithm for the channel estimation of MIMO-OFDM systems. The proposed algorithm outperforms the existing techniques and yields high quality channel prediction.

II. MIMO-OFDM SYSTEM MODEL
The MIMO-OFDM system model is presented in this section. It offers robust and efficient data communication as demodulation of subcarriers can be performed individually. The inverse fast Fourier transform/fast Fourier transform (IFFT/FFT) implementation of MIMO-OFDM system with DACE is shown in Fig. 3.
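The baseband signal for the OFDM system is given as:
$$x(t) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k \Delta f t} \tag{1}$$
where N represents the number of subcarriers, X(k) is the complex modulation symbol transmitted on the kth subcarrier, and $\Delta f$ is the subcarrier spacing. The time domain samples at the output of the IFFT are given by:
$$\mathbf{x} = \mathbf{W}^{H}\mathbf{X} \tag{2}$$
where $\mathbf{W}$ is the unitary FFT matrix of order N × N and its elements are given as:
$$[\mathbf{W}]_{n,k} = \frac{1}{\sqrt{N}}\, e^{-j 2\pi nk/N}, \quad n, k = 0, 1, \ldots, N-1 \tag{3}$$
The cyclic prefix of length G is added to avoid ISI such as:
$$\mathbf{x}_{cp} = \mathbf{C}_T\, \mathbf{x} \tag{4}$$
where $\mathbf{C}_T$ is a matrix of order (N + G) × N given as:
$$\mathbf{C}_T = \begin{bmatrix} \begin{matrix} \mathbf{0}_{G \times (N-G)} & \mathbf{I}_{G} \end{matrix} \\ \mathbf{I}_{N} \end{bmatrix} \tag{5}$$
The frequency selective channel is modelled using an L-tap finite impulse response (FIR) filter with taps
$$\mathbf{h} = [h(0), h(1), \ldots, h(L-1)]^{T} \tag{6}$$
The time domain received signal is given by:
$$y(n) = \sum_{l=0}^{L-1} h(l)\, x_{cp}(n-l) + z(n) \tag{7}$$
where $z(n) \sim \mathcal{N}(0, \sigma_z^2)$ is AWGN. The convolution is converted into a matrix multiplication using the following Toeplitz like channel matrix $\bar{\mathbf{H}}$ of order (N + G) × (N + G):
$$\bar{\mathbf{H}} = \begin{bmatrix} h(0) & 0 & \cdots & \cdots & 0 \\ \vdots & h(0) & \ddots & & \vdots \\ h(L-1) & \vdots & \ddots & \ddots & \vdots \\ \vdots & \ddots & & \ddots & 0 \\ 0 & \cdots & h(L-1) & \cdots & h(0) \end{bmatrix} \tag{8}$$
So,
$$\mathbf{y} = \bar{\mathbf{H}}\, \mathbf{C}_T \mathbf{W}^{H} \mathbf{X} + \mathbf{z} \tag{9}$$
At the receiver, the removal of the cyclic prefix takes place, which can be expressed as multiplication of the received signal with the matrix $\mathbf{C}_R$ given as:
$$\mathbf{C}_R = \begin{bmatrix} \mathbf{0}_{N \times G} & \mathbf{I}_{N} \end{bmatrix} \tag{10}$$
The multiplication converts the Toeplitz like channel matrix into a circulant matrix.
The circulant matrix for 4 taps is given as:
$$\mathbf{C}_R \bar{\mathbf{H}} \mathbf{C}_T = \begin{bmatrix} h(0) & 0 & \cdots & 0 & h(3) & h(2) & h(1) \\ h(1) & h(0) & 0 & \cdots & 0 & h(3) & h(2) \\ h(2) & h(1) & h(0) & 0 & \cdots & 0 & h(3) \\ h(3) & h(2) & h(1) & h(0) & 0 & \cdots & 0 \\ 0 & h(3) & h(2) & h(1) & h(0) & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & h(3) & h(2) & h(1) & h(0) \end{bmatrix} \tag{11}$$
Now, the equation for the received signal can be written as:
$$\mathbf{Y} = \mathbf{W}\mathbf{C}_R \bar{\mathbf{H}} \mathbf{C}_T \mathbf{W}^{H} \mathbf{X} + \mathbf{W}\mathbf{C}_R \mathbf{z} = \boldsymbol{\mathcal{H}}\, \mathbf{X} + \mathbf{Z} \tag{12}$$
where $\mathbf{Z} = \mathbf{W}\mathbf{C}_R \mathbf{z}$ and $\boldsymbol{\mathcal{H}} = \mathbf{W}\mathbf{C}_R \bar{\mathbf{H}} \mathbf{C}_T \mathbf{W}^{H}$ is a diagonal matrix. Thus, the frequency domain received signal for the OFDM system is given as:
$$\mathbf{Y} = \mathrm{diag}(\mathbf{X})\, \mathbf{H} + \mathbf{Z} \tag{13}$$
where $\mathbf{H}$ represents the channel frequency response (CFR), $\mathbf{X}$ contains the transmitted symbols and $\mathbf{Z}$ is AWGN with zero mean and covariance matrix $\mathbf{R}_Z = \sigma_Z^2 \mathbf{I}_N$ [60]. The matrix vector form of the received signal model is given as:
$$\mathbf{Y} = \mathbf{A}\mathbf{h} + \mathbf{Z} \tag{14}$$
where $\mathbf{A} = \sqrt{N}\, \mathrm{diag}(\mathbf{X})\, \mathbf{F}$ and $\mathbf{F}$ represents the partial FFT matrix comprising the first L columns of the full FFT matrix.
Also, the CFR is given as:
$$\mathbf{H} = \sqrt{N}\, \mathbf{F}\mathbf{h} \tag{15}$$
Using $N_t$ transmit antennas and $N_r$ receive antennas for MIMO-OFDM systems, the intention is to estimate $\mathbf{h}$ given the received symbol $\mathbf{Y}$ with the minimum number of pilots.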
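The chain of matrix operations in this section can be checked numerically. The Python sketch below uses small assumed sizes (N = 8, G = 3, L = 4) and hypothetical variable names, builds the cyclic prefix insertion and removal matrices together with the Toeplitz like convolution matrix, and verifies that the combined operator in the frequency domain is diagonal with the FFT of the channel taps on its diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
N, G, L = 8, 3, 4                      # small assumed sizes, with G >= L - 1
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Unitary FFT matrix, cyclic prefix insertion (C_T) and removal (C_R).
W = np.fft.fft(np.eye(N)) / np.sqrt(N)
C_T = np.vstack([np.hstack([np.zeros((G, N - G)), np.eye(G)]), np.eye(N)])
C_R = np.hstack([np.zeros((N, G)), np.eye(N)])

# Toeplitz-like (N+G) x (N+G) convolution matrix:
# entry (i, j) equals h(i - j) when 0 <= i - j < L, and 0 otherwise.
i, j = np.indices((N + G, N + G))
lag = i - j
H_time = np.where((lag >= 0) & (lag < L), h[np.clip(lag, 0, L - 1)], 0)

# Removing the prefix after the channel turns the convolution into a circulant matrix.
circulant = C_R @ H_time @ C_T

# The unitary FFT matrix diagonalizes the circulant matrix.
H_freq = W @ circulant @ W.conj().T

# Noiseless end-to-end check: each subcarrier sees a flat (scalar) channel.
X = (rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)) / np.sqrt(2)
Y = W @ C_R @ (H_time @ C_T @ W.conj().T @ X)

print(np.allclose(H_freq, np.diag(np.fft.fft(h, N))))
print(np.allclose(Y, np.fft.fft(h, N) * X))
```

The condition G >= L - 1 is what makes the prefix long enough to absorb the channel memory; with a shorter prefix the product of the three matrices is no longer circulant.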

III. PROPOSED METHOD
Herein, an algorithm is devised to judiciously pick the reliable data symbols and overcome the inadequacies of DACE techniques.
Starting from equation (13), the received signal over the kth subcarrier is:
$$Y(k) = H(k)X(k) + Z(k) \tag{16}$$
Finding $\hat{H}(k)$ using the LS/LMMSE estimation methods, the estimate of X(k) is given by:
$$\hat{X}(k) = \frac{Y(k)}{\hat{H}(k)} = X(k) + D(k) \tag{17}$$
where D(k) indicates the distortion and it is defined as:
$$D(k) = \hat{X}(k) - X(k) \tag{18}$$
Now, the reliability values are determined by the following expression:
$$R(k) = \frac{p_D\big(\hat{X}(k) - \langle X(k) \rangle\big)}{\sum_{X_i(k) \neq \langle X(k) \rangle} p_D\big(\hat{X}(k) - X_i(k)\big)} \tag{19}$$
where $\langle X(k) \rangle$ is the constellation point nearest to $\hat{X}(k)$ and $X_i(k)$ denotes the other constellation points.
It is a likelihood ratio that compares the posterior probability of D(k) equating $\hat{X}(k) - \langle X(k) \rangle$ with the probability of some other vector $\hat{X}(k) - X_i(k)$. Intuitively, $\hat{X}(k)$ is decoded relative to both the nearest constellation point and the nearest neighbours. Here, $p_D(\cdot)$ is the probability density function (PDF) of the distortion D(k). As it is highly complex to determine the exact PDF of the distortion, we consider it circularly symmetric Gaussian with zero mean and variance $\sigma^2_{D(k)}$. Consequently, the following PDF expression is used to evaluate the reliability values in (19):
$$p_D(d) = \frac{1}{\pi \sigma^2_{D(k)}} \exp\left(-\frac{|d|^2}{\sigma^2_{D(k)}}\right) \tag{21}$$
where the variance of the distortion is computed using (16) in (17) as follows:
$$\hat{X}(k) = \frac{H(k)X(k) + Z(k)}{\hat{H}(k)} \tag{22}$$
$$\hat{X}(k) \approx X(k) + \frac{Z(k)}{\hat{H}(k)} \tag{23}$$
Comparing (17) with (23), we get $D(k) = Z(k)/\hat{H}(k)$. Now, the variance of the distortion is given as:
$$\sigma^2_{D(k)} = \frac{\sigma_Z^2}{|\hat{H}(k)|^2} \tag{24}$$
Hence, the distortion depends on both the AWGN and the channel estimates. Now, the important question is how to pick the reliable data carriers from among all $2^{N}$ possible selections of data carriers. The reliability value for each data carrier is obtained using equation (19), which results in a non-negative number. Therefore, a threshold value is chosen and all data carriers having reliability values greater than the threshold are considered reliable data carriers. The threshold value is estimated empirically, and a set of m reliable data tones is selected out of all available data tones. In this regard, the proposed reliability function associates a reliability estimate with each tone and determines the m tones having the highest reliability to form this set. By incorporating the reliable data carriers with the pilots, the wireless channel is re-estimated using the LS and LMMSE methods.
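A minimal Python sketch of the reliability computation and the thresholding step is given below. It assumes that the reliability is the ratio of the Gaussian density evaluated at the nearest constellation point to the sum of the densities evaluated at all other constellation points, and it works with the logarithm of that ratio for numerical stability, so the threshold is applied in the log domain. The 4-QAM constellation, the noise variance, the threshold value, and the names `log_reliability` and `log_threshold` are assumptions for this example.

```python
import numpy as np

# Unit-power 4-QAM constellation (assumed for the example).
QAM4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def log_reliability(x_hat, H_hat, sigma2_z, points=QAM4):
    """Log of the reliability ratio per subcarrier (Gaussian distortion model)."""
    x_hat = np.atleast_1d(np.asarray(x_hat, dtype=complex))
    H_hat = np.atleast_1d(np.asarray(H_hat, dtype=complex))
    var_d = sigma2_z / np.abs(H_hat) ** 2            # distortion variance per subcarrier
    # Exponents of the Gaussian PDF; the 1/(pi var_d) factor cancels in the ratio.
    expo = -np.abs(x_hat[:, None] - points[None, :]) ** 2 / var_d[:, None]
    rows = np.arange(len(x_hat))
    nearest = np.argmax(expo, axis=1)                # nearest constellation point
    numerator = expo[rows, nearest]
    expo[rows, nearest] = -np.inf                    # exclude it from the denominator
    m = expo.max(axis=1)
    log_denominator = m + np.log(np.exp(expo - m[:, None]).sum(axis=1))
    return numerator - log_denominator

# Three observations at the same distance r from the decided point s.
s, r = QAM4[0], 0.4
x_hat = np.array([s + r * np.exp(1j * np.pi),        # toward one nearest neighbour
                  s + r * np.exp(1j * np.pi / 4),    # away from all other points
                  s + r * np.exp(-1j * np.pi / 2)])  # toward the other nearest neighbour
log_R = log_reliability(x_hat, H_hat=np.ones(3), sigma2_z=0.1)

log_threshold = 10.0                                 # empirical value, assumed here
reliable = np.flatnonzero(log_R > log_threshold)
print(np.round(log_R, 2), reliable)
```

Although the three observations are equally far from the decided point, only the one displaced away from the neighbouring points passes the threshold, which is the behaviour a pure minimum-distance rule cannot provide.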
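Considering P as the pilot indices and R as the reliable indices, equation (14) is updated as:
$$\mathbf{Y}_{PR} = \mathbf{A}_{PR}\, \mathbf{h} + \mathbf{Z}_{PR} \tag{25}$$
where
$$\mathbf{Y}_{PR} = \begin{bmatrix} \mathbf{Y}_{P} \\ \mathbf{Y}_{R} \end{bmatrix} \tag{26}$$
and
$$\mathbf{A}_{PR} = \begin{bmatrix} \mathbf{A}_{P} \\ \mathbf{A}_{R} \end{bmatrix} \tag{27}$$
Finally, the LS and LMMSE channel estimates for equation (25) are given by:
$$\hat{\mathbf{h}}_{LS} = \left(\mathbf{A}_{PR}^{H} \mathbf{A}_{PR}\right)^{-1} \mathbf{A}_{PR}^{H}\, \mathbf{Y}_{PR} \tag{28}$$
$$\hat{\mathbf{h}}_{LMMSE} = \left(\sigma_Z^2 \mathbf{R}_h^{-1} + \mathbf{A}_{PR}^{H} \mathbf{A}_{PR}\right)^{-1} \mathbf{A}_{PR}^{H}\, \mathbf{Y}_{PR} \tag{29}$$
where $\mathbf{R}_h$ is the covariance matrix of the channel taps.
Summary of the proposed algorithm
1) Get the initial channel estimate $\hat{\mathbf{h}}$ by using pilots/training symbols (e.g., using LS, LMMSE or any other method).
2) Obtain $\hat{\mathbf{H}}$ by frequency transformation of $\hat{\mathbf{h}}$ and then equalize the data using (17) to obtain $\hat{\mathbf{X}}$.
3) Compute the variance $\sigma^2_{D(k)}$ and the reliability R(k) for each subcarrier, k = 1, 2, . . . , N, using (24) and (19) respectively.
4) Select the index set R of the most reliable data carriers by thresholding, i.e., for k = 1, 2, . . . , N, k ∈ R iff R(k) > threshold.
5) Re-estimate the channel using both pilots and reliable data carriers (e.g., using LS, LMMSE or any other method).
The graphical illustration of the reliability variations of observation for the proposed algorithm is shown in Fig. 4. Consider $\hat{X}_1(k)$ and $\hat{X}_2(k)$ such that:
$$|\hat{X}_1(k) - \langle X(k) \rangle| = |\hat{X}_2(k) - \langle X(k) \rangle| = r \tag{30}$$
It is obvious from Fig. 4 that $X_a$ and $X_c$ are the nearest neighbours and $X_b$ is the next nearest neighbour of $\langle X(k) \rangle$. The received symbols $\hat{X}_1(k)$, $\hat{X}_2(k)$ and $\hat{X}_3(k)$ are equidistant from $\langle X(k) \rangle$ and lie on the circumference of the circle of radius r. In this scenario, ML is not an optimum method for detection because the received symbols are equidistant from $\langle X(k) \rangle$. Now, consider $X_a$ as the nearest neighbour to compare the reliability of $\hat{X}_1(k)$ and $\hat{X}_2(k)$. It can be observed that $\hat{X}_2(k)$ is farther from $X_a$ than $\hat{X}_1(k)$, so $\hat{X}_2(k)$ is more reliable than $\hat{X}_1(k)$. In other words, the reliability of deciding $\langle X(k) \rangle$ from $\hat{X}_2(k)$ is higher than that of deciding it from $\hat{X}_1(k)$. Similarly, if we consider $X_c$ as the nearest neighbour for the comparison of $\hat{X}_2(k)$ and $\hat{X}_3(k)$, the distance between $\hat{X}_2(k)$ and $X_c$ is greater than the distance between $\hat{X}_3(k)$ and $X_c$. So, the reliability of deciding $\langle X(k) \rangle$ from $\hat{X}_2(k)$ is also higher than that of deciding it from $\hat{X}_3(k)$. Thus, $\hat{X}_2(k)$ is more reliable than $\hat{X}_1(k)$ and $\hat{X}_3(k)$.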
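The five steps of the summary can be sketched end to end for a single-antenna link as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: it uses LS for both the initial and the final estimate, 4-QAM symbols, an assumed 4-tap channel at 20 dB SNR, a reliability ratio whose denominator sums over all other constellation points, and an assumed log-domain threshold. The helper names `ls_estimate`, `log_reliability` and `dace_trial` are hypothetical.

```python
import numpy as np

QAM4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def ls_estimate(Y, X, F, idx):
    """LS solution of Y[idx] = A h + Z with A = sqrt(N) diag(X[idx]) F[idx]."""
    A = np.sqrt(F.shape[0]) * X[idx, None] * F[idx, :]
    return np.linalg.lstsq(A, Y[idx], rcond=None)[0]

def log_reliability(x_hat, var_d, points=QAM4):
    """Log reliability ratio and index of the nearest constellation point."""
    expo = -np.abs(x_hat[:, None] - points[None, :]) ** 2 / var_d[:, None]
    rows = np.arange(len(x_hat))
    nearest = np.argmax(expo, axis=1)
    numerator = expo[rows, nearest]
    expo[rows, nearest] = -np.inf
    m = expo.max(axis=1)
    return numerator - (m + np.log(np.exp(expo - m[:, None]).sum(axis=1))), nearest

def dace_trial(rng, N=256, L=4, Np=16, snr_db=20.0, log_threshold=10.0):
    """One random trial; returns (MSE with pilots only, MSE with DACE, #reliable)."""
    F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N) / np.sqrt(N)
    h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2 * L)
    X = QAM4[rng.integers(0, 4, N)]
    sigma2 = 10 ** (-snr_db / 10)
    Z = np.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    Y = np.sqrt(N) * (F @ h) * X + Z
    P = np.arange(0, N, N // Np)                      # comb-type pilot indices
    data = np.setdiff1d(np.arange(N), P)

    h0 = ls_estimate(Y, X, F, P)                      # step 1: initial estimate
    H0 = np.sqrt(N) * (F @ h0)                        # step 2: CFR and equalization
    X_eq = Y[data] / H0[data]
    var_d = sigma2 / np.abs(H0[data]) ** 2            # step 3: variance and reliability
    log_R, nearest = log_reliability(X_eq, var_d)
    R = data[log_R > log_threshold]                   # step 4: thresholding

    X_used = np.zeros(N, dtype=complex)               # pilots plus decided data symbols
    X_used[P] = X[P]
    X_used[data] = QAM4[nearest]
    h1 = ls_estimate(Y, X_used, F, np.concatenate([P, R]))   # step 5: re-estimation

    mse = lambda est: float(np.mean(np.abs(est - h) ** 2))
    return mse(h0), mse(h1), len(R)

rng = np.random.default_rng(1)
results = np.array([dace_trial(rng) for _ in range(20)])
print("mean MSE, pilots only:", results[:, 0].mean())
print("mean MSE, with DACE:  ", results[:, 1].mean())
print("mean number of reliable carriers:", results[:, 2].mean())
```

The design choice worth noting is that the decided symbols at the reliable carriers are treated exactly like pilots in the second LS solve, so no extra training resources are spent to obtain the larger observation set.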

IV. RESULTS AND DISCUSSION
In this section, the simulated results are achieved using N = 256 OFDM subcarriers and N_p = 16 pilots. [...] LMMSE methods using virtual pilots outperform the conventional LS and LMMSE methods for channel estimation. The LS curves reach the LMMSE curves at high SNR and give comparable performance. The performance improvement is larger at high SNR, as data equalization/channel estimation has less distortion at high SNR, and vice versa.
The BER curves for different MIMO configurations, including 2×4, 2×8, and 4×8, are shown in Figs. 7 to 12. These curves indicate that, unlike many other methods/algorithms applied for channel estimation, the BER performance of the proposed algorithm improves for different MIMO configurations across the whole SNR range, i.e., from 0 dB to 30 dB. It is clear from Figs. 7 and 8 that BPSK attains better BER performance for a greater number of receive antennas due to the diversity gain. Similarly, Figs. 9 and 10 show that the 4-QAM constellation also offers superior performance as the number of receive antennas increases. However, Figs. 11 and 12 imply that the BER performance deteriorates with an increasing number of transmit antennas for BPSK and 4-QAM constellations, respectively. The reason is that it is challenging to pick reliable symbols as the size of the data increases. Furthermore, the MSE and BER performance for BPSK appears slightly better than that of 4-QAM because it is difficult to detect reliable symbols as the constellation order increases. Finally, we investigate how much pilot overhead reduction/spectral efficiency improvement is achieved using the proposed DACE algorithm. To ascertain this, the performance curves are obtained for both LS and LMMSE channel estimation methods at a fixed SNR. The targeted MSE value is chosen as 0.02. The success rate is defined as the ratio of the number of symbols for which the MSE is less than the targeted MSE to the total number of transmitted symbols. The SNR value is fixed at 20 dB for the 4-QAM modulation scheme. The success rate is plotted against the number of pilots, varying from L to N, as shown in Fig. 13. It is observed that the traditional LS and LMMSE channel estimation methods require 128 pilots to achieve a 100% success rate. However, using the proposed DACE algorithm, the LS and LMMSE methods reach a 100% success rate with only 52 pilots.
Thus, the proposed method reduces the pilot overhead by approximately 60% compared with the conventional methods for channel estimation, which is a significant improvement in terms of the spectral efficiency of the system.
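The quoted reduction follows directly from the two pilot counts reported for Fig. 13:

```latex
\[
\frac{128 - 52}{128} = \frac{76}{128} \approx 0.594
\]
```

That is, about 59.4%, consistent with the approximately 60% stated. Under the additional assumption, not stated explicitly in the text, that both counts refer to pilots per OFDM symbol of N = 256 subcarriers, the fraction of subcarriers left for data rises from 128/256 = 50% to 204/256 ≈ 79.7%.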

V. CONCLUSION
In this work, we have proposed a data aided channel estimation method for both SISO and MIMO-OFDM systems. A novel algorithm for the selection of reliable data symbols/virtual pilots is developed for optimal channel estimation. The reliable data symbols enable us to attain accurate channel estimates with minimum number of pilots. Alternatively, the reduction of pilot overhead allows us to enhance the spectral efficiency of the system. The simulation results endorse our theoretical analysis and performance comparison against existing methods for channel estimation. The proposed algorithm is simple, precise, and efficient. Also, it selects the reliable data symbols intelligently and improves the MSE and BER performance of the system.