Channel Estimation for Large-Scale Multiple-Antenna Systems Using 1-Bit ADCs and Oversampling


hard thresholding (IHT) and recursive least squares (RLS) adaptive channel estimators, respectively. In the context of the signal detection used in uplink 1-bit massive MU-MIMO systems, the work in [14] proposes the iterative detection and decoding (IDD) technique together with regular LDPC codes and [15] presents a low-complexity near maximum-likelihood-detection (near-MLD) algorithm called 1-bit sphere decoding.
Moreover, some prior works have investigated 1-bit ADCs in wideband communication systems. The works in [16]-[19] have studied massive MU-MIMO systems with coarsely quantized signals that deploy orthogonal frequency-division multiplexing (OFDM) for wideband communications. Their results show that 1-bit ADCs perform satisfactorily in wideband massive MU-MIMO-OFDM systems. Furthermore, the studies in [20]-[22] have discussed some key transceiver design challenges, including channel estimation, signal detection, achievable rates and precoding techniques, in millimeter-wave (mmWave) massive MIMO systems, which are promising candidates for 5G cellular systems.
The previous works have considered quantized systems with sampling at the Nyquist rate. However, oversampling at the receiver can partially compensate for the information loss caused by coarse quantization [23]. The work in [24] has proposed faster-than-symbol-rate (FTSR) sampling in an uplink massive MIMO system with coarsely quantized signals. It shows that FTSR sampling provides about 5 dB signal-to-noise ratio (SNR) advantage in terms of symbol error rate (SER) and achievable rate with a linear zero-forcing receiver. The work in [25] has analyzed the achievable rate for 1-bit oversampled systems over band-limited channels. To reduce the computational cost caused by the large number of samples due to oversampling, sliding-window based linear detection has been proposed in [26]. In addition to the conventional system models based on matched filtering and correlated noise samples, alternative receiver assumptions exist in the literature, such as in [27], where the authors consider a wideband receiver whose bandwidth scales proportionally with the oversampling factor, which has the drawback of additional received noise and interference from neighboring frequency bands.
From the channel estimation point of view, the works in [11], [12] have proposed different channel estimation techniques for systems operating at the Nyquist rate. However, only a few works have considered channel estimation in oversampled systems. The study in [28] considers time-of-arrival estimation for systems with 1-bit quantization and oversampling and proposes corresponding performance bounds. The study in [29] has proposed carrier phase estimation and given lower bounds on complex channel parameter estimation for 1-bit oversampled systems based on [30]. In [24], the BLMMSE channel estimator is applied to the MIMO channel with 1-bit quantization and oversampling under the simplifying assumption of uncorrelated noise samples, which yields performance degradation, especially at low SNR and high oversampling factors.
In this work, low-resolution aware (LRA) channel estimators based on the Bussgang decomposition are developed for 1-bit oversampled large-scale MIMO systems in the uplink. Although the received signals are quantized to 1 bit, the computations after the 1-bit ADCs are performed at a higher resolution (8 bits or higher) for all algorithms compared. The application of oversampling at the receiver can lead to significantly better performance. Unlike prior works, we explicitly consider the correlation of the filtered noise, which is a main property of oversampled systems, and employ the Bussgang decomposition [31] to reformulate the nonlinear system into a statistically equivalent linear system. Based on this linear model, low-resolution aware least-squares (LS), linear minimum mean square error (LMMSE) and least-mean square (LMS) channel estimation algorithms are proposed for 1-bit oversampled systems, and their computational costs are evaluated. Moreover, an adaptive technique is devised to estimate the statistical quantities resulting from the Bussgang decomposition, which are required by the channel estimators. We also examine the fundamental estimation limits by deriving a Bayesian framework and bounds on channel estimation for both non-oversampled and oversampled systems. In addition to the Bayesian Cramér-Rao bounds (CRBs), general CRBs are proposed for biased estimators due to the correlation between the signal and its quantization error. In summary, our work has the following contributions:
• The LRA-LS, LRA-LMMSE and LRA-LMS channel estimation algorithms are presented for 1-bit large-scale MIMO systems in the uplink with oversampling.
• We obtain analytical expressions associated with the Bayesian CRBs for the oversampled systems and observe that the proposed bounds are very close to the results obtained from simulations at low SNR.
• An adaptive technique is proposed to estimate the autocorrelation of the channel vector, which is an essential part of the Bussgang decomposition in 1-bit systems.
Some preliminary results have been shown in [32] and [33]. However, as compared to [32], [33], this paper extends and refines the analysis of the correlation property of the filtered noise and proposes a more practical adaptive channel estimator with lower computational cost. In the section on numerical results, the performance of the proposed LRA-LMMSE estimator is compared with its simplified version in [24]. Furthermore, a comparison with the performance of systems using ADCs with more bits is also shown in this paper.
The rest of this paper is organized as follows: Section II illustrates the system model and gives some statistical properties of 1-bit quantization. Section III derives the proposed oversampling based channel estimators and analyzes the computational complexity of the estimators. Section IV gives the upper bounds of the Bayesian CRBs and the general CRBs for 1-bit non-oversampled and oversampled MIMO systems. Section V compares the normalized mean square error (MSE) and SER performance of the proposed and existing channel estimators. Section VI concludes the paper.
Notation: The following notation is used throughout the paper. Matrices are in bold capital letters and vectors in bold lowercase. $\mathbf{I}_n$ denotes the $n \times n$ identity matrix and $\mathbf{0}_n$ is the $n \times 1$ all-zero column vector. Additionally, $\mathrm{diag}(\mathbf{A})$ is a diagonal matrix containing only the diagonal elements of $\mathbf{A}$. The transpose, conjugate transpose and pseudoinverse of $\mathbf{A}$ are represented by $\mathbf{A}^T$, $\mathbf{A}^H$ and $\mathbf{A}^+$, respectively. $a^*$ denotes the complex conjugate of $a$ and $[\mathbf{a}]_k$ represents the $k$th element of vector $\mathbf{a}$. $(\cdot)^R$ and $(\cdot)^I$ denote the real and imaginary parts of the corresponding vector or matrix, respectively. $\otimes$ is the Kronecker product. Finally, $\mathrm{vec}(\mathbf{A})$ is the vectorized form of $\mathbf{A}$ obtained by stacking its columns and $\det(\mathbf{A})$ is the determinant of $\mathbf{A}$. $\mathbf{x} \sim \mathcal{CN}(\mathbf{a}, \mathbf{B})$ indicates that $\mathbf{x}$ is a complex Gaussian vector with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$. The expectation and covariance are denoted by $E\{\cdot\}$ and $\mathrm{Cov}\{\cdot\}$, respectively.

II. SYSTEM MODEL AND PROBLEM STATEMENT
In this paper, we consider a single-cell multi-user large-scale MIMO system with $N_t$ single-antenna terminals and a BS with $N_r$ receive antennas, where each receive antenna is equipped with two 1-bit ADCs (one for the in-phase component and the other for the quadrature-phase component) and $N_r \gg N_t$. The system model is depicted in Fig. 1. In the uplink, by assuming perfect synchronization, the received oversampled signal $\mathbf{y} \in \mathbb{C}^{MN_rN \times 1}$ can be expressed as
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{1}$$
where $\mathbf{x} \in \mathbb{C}^{NN_t \times 1}$ contains independent identically distributed (i.i.d.) transmitted symbols from the $N_t$ terminals, each with block length $N$. The entry $x_{i,j}$ of $\mathbf{x}$ corresponds to the transmitted symbol of terminal $j$ at time instant $i$. Each symbol has unit power so that $E[|x_{i,j}|^2] = 1$. The vector $\mathbf{n}$ represents the filtered oversampled noise, which is obtained by filtering $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}_{3MN_rN}, \sigma_n^2\mathbf{I}_{3MN_rN})$ with the receive filter matrix $\mathbf{G}$. Note that the noise samples are described such that each entry of $\mathbf{n}$ has the same statistical properties. Since in the digital domain the receive filter has a length of $2MN + 1$ samples, $3MN$ unfiltered noise samples in the noise vector $\mathbf{w}$ need to be considered to describe an interval of $MN$ samples of the filtered noise $\mathbf{n}$. The matrix $\mathbf{G} \in \mathbb{R}^{MN \times 3MN}$ is a Toeplitz matrix that contains the coefficients of the matched filter $m(t)$ (operated in the analog domain) at different time instants and is given in (4), where $T$ is the symbol period and $M$ denotes the oversampling rate. The equivalent channel matrix $\mathbf{H}$ is built from $\mathbf{H}'$, $\mathbf{u}$ and $\mathbf{Z}$, where $\mathbf{H}' \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix for non-oversampled systems and $\mathbf{u}$ is an oversampling vector with length $M$. The matrix $\mathbf{Z} \in \mathbb{R}^{MN \times MN}$ is a Toeplitz matrix that contains the coefficients of $z(t)$ at different time instants, where $z(t)$, given by (7), is the convolution of the pulse shaping filter $p(t)$ and the matched filter $m(t)$. In particular, $M = 1$ refers to the non-oversampling case.
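Let $Q(\cdot)$ represent the 1-bit quantization function. The resulting quantized signal $\mathbf{y}_Q$ is given by
$$\mathbf{y}_Q = Q(\mathbf{y}) = Q(\mathbf{y}^R) + jQ(\mathbf{y}^I). \tag{8}$$
The real and imaginary parts of $\mathbf{y}$ are quantized element-wise to $\{\pm\frac{1}{\sqrt{2}}\}$ based on their sign. The factor $\frac{1}{\sqrt{2}}$ normalizes the power of each quantized sample to one.
Since quantization strongly changes the properties of signals, some statistical properties of quantization for Gaussian input signals are reviewed here. For 1-bit quantization and Gaussian inputs, the cross-correlation $\mathbf{C}_{\mathbf{s}\mathbf{s}_Q} = E\{\mathbf{s}\mathbf{s}_Q^H\}$ between the unquantized signal $\mathbf{s}$ with covariance matrix $\mathbf{C}_{\mathbf{s}}$ and its 1-bit quantized signal $\mathbf{s}_Q$ is described by [31]
$$\mathbf{C}_{\mathbf{s}\mathbf{s}_Q} = \sqrt{\frac{2}{\pi}}\,\mathbf{C}_{\mathbf{s}}\mathbf{K}, \quad \mathbf{K} = \mathrm{diag}(\mathbf{C}_{\mathbf{s}})^{-\frac{1}{2}}. \tag{9}$$
Furthermore, the covariance matrix of the 1-bit quantized signal $\mathbf{s}_Q$ can be obtained through the arcsine law [34]
$$\mathbf{C}_{\mathbf{s}_Q} = \frac{2}{\pi}\left(\sin^{-1}\!\left(\mathbf{K}\mathbf{C}_{\mathbf{s}}^R\mathbf{K}\right) + j\sin^{-1}\!\left(\mathbf{K}\mathbf{C}_{\mathbf{s}}^I\mathbf{K}\right)\right), \tag{10}$$
where $\sin^{-1}(\cdot)$ is applied element-wise.
The problem we are interested in solving in this work is to cost-effectively estimate the channel parameters in $\mathbf{H}'$.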
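The quantizer and the two statistical properties above can be illustrated numerically. The following is a minimal sketch, assuming the standard Bussgang gain $\sqrt{2/\pi}\,\mathrm{diag}(\mathbf{C}_{\mathbf{s}})^{-1/2}$ and the arcsine law for unit-power 1-bit outputs; the function names `quantize_1bit`, `bussgang_gain` and `arcsine_covariance` are hypothetical and not taken from the paper.

```python
import numpy as np

def quantize_1bit(y):
    """Quantize real and imaginary parts element-wise to +-1/sqrt(2)."""
    y = np.asarray(y, dtype=complex)
    # ASSUMPTION: an exact zero is mapped to +1/sqrt(2) (arbitrary tie-break).
    re = np.where(y.real >= 0, 1.0, -1.0)
    im = np.where(y.imag >= 0, 1.0, -1.0)
    return (re + 1j * im) / np.sqrt(2)

def bussgang_gain(C_s):
    """Linear operator A = sqrt(2/pi) * diag(C_s)^(-1/2)."""
    d = np.real(np.diag(np.asarray(C_s)))
    return np.sqrt(2.0 / np.pi) * np.diag(1.0 / np.sqrt(d))

def arcsine_covariance(C_s):
    """Covariance of the 1-bit quantized signal via the arcsine law."""
    C_s = np.asarray(C_s, dtype=complex)
    k = 1.0 / np.sqrt(np.real(np.diag(C_s)))
    R = k[:, None] * C_s * k[None, :]          # K C_s K (normalized covariance)
    re = np.clip(R.real, -1.0, 1.0)            # guard against rounding outside [-1, 1]
    im = np.clip(R.imag, -1.0, 1.0)
    return (2.0 / np.pi) * (np.arcsin(re) + 1j * np.arcsin(im))

# Two unit-power samples with real correlation 0.5.
C_s = np.array([[1.0, 0.5], [0.5, 1.0]], dtype=complex)
A = bussgang_gain(C_s)
C_q = arcsine_covariance(C_s)
```

For this example the diagonal of `C_q` equals one, which reflects the unit-power normalization of the quantizer output, while the off-diagonal correlation is reduced from 0.5 to $(2/\pi)\sin^{-1}(0.5) = 1/3$.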

III. CHANNEL ESTIMATION FOR UPLINK 1-BIT OVERSAMPLED MIMO
In a standard uplink implementation, the channel state information (CSI) is estimated at the BS and then used to detect the data symbols transmitted from the $N_t$ users. Each transmission block is divided into two sub-blocks: one for pilots and another for the data symbols. Pilots are either located at the beginning of each block or spread according to a desired pattern [35]. During the training phase, each terminal simultaneously transmits $\tau$ pilot symbols to the BS, which yields (11). Vectorizing (11), we obtain the pilot model in terms of $\mathbf{h}' = \mathrm{vec}(\mathbf{H}')$ and the equivalent pilot matrix $\boldsymbol{\Phi}_p$. The vector $\mathbf{x}_p \in \mathbb{C}^{\tau N_t \times 1}$ contains the transmitted pilots and $\mathbf{e}_n \in \mathbb{R}^{\tau \times 1}$ represents a column vector with a one in the $n$th element and zeros elsewhere. After processing by the 1-bit ADCs, the quantized signal can be expressed as in (14). The vector $\mathbf{n}_q$ is the statistically equivalent quantization noise¹ with covariance matrix $\mathbf{C}_{\mathbf{n}_q} = \mathbf{C}_{\mathbf{y}_{Q_p}} - \mathbf{A}_p\mathbf{C}_{\mathbf{y}_p}\mathbf{A}_p^H$. The matrix $\mathbf{A}_p \in \mathbb{R}^{M\tau N_r \times M\tau N_r}$ is the Bussgang-based linear operator chosen independently from $\mathbf{y}_p$ and is given by (15), where $\mathbf{C}_{\mathbf{y}_p\mathbf{y}_{Q_p}}$ in (16) denotes the cross-correlation matrix between the received signal $\mathbf{y}_p$ and its quantized signal $\mathbf{y}_{Q_p}$. The formulas (15) and (16) involve the auto-correlation matrix $\mathbf{C}_{\mathbf{y}_p}$ in (17), which contains the noise covariance matrix $\mathbf{C}_{\mathbf{n}_p}$ calculated in (18). For non-oversampled systems ($M = 1$), (18) reduces to $\mathbf{C}_{\mathbf{n}_p} = \sigma_n^2\mathbf{I}_{\tau N_r}$. However, for oversampled systems ($M \geq 2$), (18) cannot be further simplified due to the correlation of the oversampled samples, and off-diagonal elements appear in the matrix $\mathbf{G}\mathbf{G}^H$. One example is shown in Fig. 2, where $m(t)$ is assumed to be a normalized root-raised cosine (RRC) filter with different roll-off factors, $M = 2$ and $\tau = 10$. It can be seen that the lower the roll-off factor, the more off-diagonal elements appear in $\mathbf{G}\mathbf{G}^H$, which means that for systems with low roll-off factors it is important to consider $\mathbf{C}_{\mathbf{n}_p}$ as a full matrix rather than a simplified diagonal matrix as assumed in [24].
1 In this paper, we assume the quantization noise n q is Gaussian distributed with zero mean and covariance C n q .

B. STANDARD LS CHANNEL ESTIMATOR
The work in [36] has proposed the standard LS estimator for 1-bit non-oversampled systems. Here, this estimator is extended to oversampled systems. Its advantage is that no a priori information is needed at the receiver. However, when applied with 1-bit quantization, the channel estimate $\hat{\mathbf{h}}'$ scales with the amplitude associated with the quantizer, which corresponds to a biased estimate.

C. LRA-LS CHANNEL ESTIMATOR
Based on the Bussgang decomposition, an LS estimate is proposed for the linear equivalent system model in (14). The LRA-LS channel estimator is obtained by solving the corresponding least-squares optimization problem. Compared to the standard LS channel estimator, the proposed estimator takes $\mathbf{R}_h$ into consideration in order to obtain the linear operator $\mathbf{A}_p$.

D. LRA-LMMSE CHANNEL ESTIMATOR
The LMMSE channel estimator has the advantage of superior MSE performance compared to the LS channel estimator. Based on the statistically equivalent linear model in (14), the oversampling based LRA-LMMSE channel estimator is proposed. The optimal filter and the resulting LRA-LMMSE channel estimator in (24) are derived in Appendix A. Note that when $M = 1$, (24) reduces to the BLMMSE channel estimator in [11].
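The non-oversampled special case ($M = 1$) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the estimator of this paper: it uses the Bussgang LMMSE form $\hat{\mathbf{h}}' = \mathbf{R}_h\tilde{\boldsymbol{\Phi}}_p^H\mathbf{C}_{\mathbf{y}_{Q_p}}^{-1}\mathbf{y}_{Q_p}$ with white noise, $\mathbf{R}_h = \mathbf{I}$ and DFT-based orthogonal pilots, all of which are choices made here for the example. The names `lra_lmmse_nyquist` and `demo` are hypothetical.

```python
import numpy as np

def lra_lmmse_nyquist(X_p, y_q, R_h, sigma2, n_r):
    """Bussgang LMMSE estimate of h' = vec(H') from 1-bit pilots, M = 1 only."""
    tau = X_p.shape[1]
    Phi = np.kron(X_p.T, np.eye(n_r))                 # vec(H' X_p) = Phi vec(H')
    # ASSUMPTION: white noise, valid only without oversampling.
    C_y = Phi @ R_h @ Phi.conj().T + sigma2 * np.eye(tau * n_r)
    k = 1.0 / np.sqrt(np.real(np.diag(C_y)))
    Phi_t = np.sqrt(2.0 / np.pi) * k[:, None] * Phi   # A_p Phi with the Bussgang gain
    R = k[:, None] * C_y * k[None, :]
    C_yq = (2.0 / np.pi) * (np.arcsin(np.clip(R.real, -1.0, 1.0))
                            + 1j * np.arcsin(np.clip(R.imag, -1.0, 1.0)))  # arcsine law
    return R_h @ Phi_t.conj().T @ np.linalg.solve(C_yq, y_q)

def demo(seed=0, n_t=2, n_r=8, tau=20, snr_db=0.0):
    """Return the normalized MSE of one channel realization."""
    rng = np.random.default_rng(seed)
    sigma2 = n_t / 10.0 ** (snr_db / 10.0)            # SNR = 10 log10(N_t / sigma_n^2)
    k = np.arange(n_t)[:, None]
    t = np.arange(tau)[None, :]
    X_p = np.exp(-2j * np.pi * k * t / tau)           # orthogonal DFT pilots (assumption)
    H = (rng.standard_normal((n_r, n_t))
         + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    W = np.sqrt(sigma2 / 2) * (rng.standard_normal((n_r, tau))
                               + 1j * rng.standard_normal((n_r, tau)))
    Y = H @ X_p + W
    Y_q = (np.where(Y.real >= 0, 1.0, -1.0)
           + 1j * np.where(Y.imag >= 0, 1.0, -1.0)) / np.sqrt(2)
    h = H.flatten(order="F")                          # column stacking, as in vec(.)
    h_hat = lra_lmmse_nyquist(X_p, Y_q.flatten(order="F"),
                              np.eye(n_r * n_t), sigma2, n_r)
    return float(np.sum(np.abs(h_hat - h) ** 2) / np.sum(np.abs(h) ** 2))

nmse = demo()
```

For $M \geq 2$ the term `sigma2 * np.eye(...)` would have to be replaced by the full filtered-noise covariance, which is the point emphasized in this work.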

E. LRA-LMS CHANNEL ESTIMATOR
LMS is the most widely used adaptive algorithm and has been adopted in various applications like system identification and channel equalization. In addition, LMS has robust performance and a low cost of implementation. Based on the linear equivalent model in (14), an LRA-LMS channel estimator for 1-bit oversampled systems is devised.
For large-scale MIMO with $N_r \gg N_t$, multiplications and divisions involving large matrices, whose dimensions scale with $N_r$, need to be avoided in order to reduce the computational complexity. For this reason, we concentrate on the channel from the $N_t$ users to only one receive antenna $n_r$, and the received quantized signal is modelled as in (25), where $\mathbf{y}^{n_r}_{Q_p} = [y^{n_r}_{Q_p}(1), y^{n_r}_{Q_p}(2), \ldots, y^{n_r}_{Q_p}(M\tau)]^T$ and $\mathbf{h}_{n_r} \in \mathbb{C}^{N_t \times 1}$ is the $n_r$th row of $\mathbf{H}'$. Different from $\tilde{\boldsymbol{\Phi}}_p$ in (14), $\boldsymbol{\Phi}^{n_r}_p \in \mathbb{C}^{M\tau \times N_t}$ is an equivalent pilot matrix to the $n_r$th receive antenna. The sliding window based technique [26] (shown in Fig. 3) is applied, which combines adjacent symbol-rate-sampled symbols to estimate the instantaneous channel parameters, since in oversampled systems the interference from adjacent symbol-rate-sampled symbols should be considered. The first window contains the first $Ml_{win}$ oversampled samples, the second contains the next $Ml_{win}$ samples, and so on until the last window. Note that consecutive windows are shifted by only one symbol-rate-sampled symbol (or $M$ oversampled samples).
Based on (25), the received signal at the $n$th window can be expressed as in (26), where $\mathbf{y}^{n_r}_{Q_p}(n) = [y^{n_r}_{Q_p}(M(n-1)+1), \ldots, y^{n_r}_{Q_p}(M(n-1)+Ml_{win})]^T$ and $\tilde{\boldsymbol{\Phi}}^{n_r}_p(n) = \mathbf{A}^{n_r}_p(n)\boldsymbol{\Phi}^{n_r}_p(n) \in \mathbb{C}^{Ml_{win} \times N_t}$ contains the transmit pilot sequences in the $n$th window. The optimization problem that leads to the proposed LRA-LMS channel estimation algorithm is stated in (27), where $\hat{\mathbf{h}}_{n_r}(n)$ is the instantaneous estimate of $\mathbf{h}_{n_r}$ in the $n$th window.
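Taking the partial derivative of the objective function in (27) with respect to $\hat{\mathbf{h}}_{n_r}(n)^H$, we obtain the gradient in (28). The recursion of the proposed LRA-LMS algorithm is
$$\hat{\mathbf{h}}_{n_r}(n+1) = \hat{\mathbf{h}}_{n_r}(n) + \mu\,\tilde{\boldsymbol{\Phi}}^{n_r}_p(n)^H\mathbf{e}^{n_r}(n), \quad n = 1, \ldots, \tau - l_{win} + 1, \tag{29}$$
where the constant step size $\mu$ fulfills the stability condition in (30), in which $\gamma_{max}$ is the largest eigenvalue of $\mathbf{C}_{\tilde{\boldsymbol{\Phi}}^{n_r}_p(n)} = E\{\tilde{\boldsymbol{\Phi}}^{n_r}_p(n)\tilde{\boldsymbol{\Phi}}^{n_r}_p(n)^H\}$.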
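Proof: See Appendix B.
The proposed adaptive channel estimator is summarized in Algorithm 1, where $\mathbf{x}_p(n) \in \mathbb{C}^{l_{win}N_t \times 1}$ contains the pilot symbols in the $n$th window. Both $\mathbf{e}_n \in \mathbb{R}^{l_{win} \times 1}$ and $\mathbf{e}_n \in \mathbb{R}^{N_r \times 1}$ represent all-zero column vectors except that the $n$th elements are ones. Fig. 4 shows the convergence performance of the proposed LRA-LMS channel estimator for each receive antenna. The proposed estimator achieves its steady state after $\tau = 40$.

7: $\mathbf{C}^{n_r}_{\mathbf{y}_p}(n) = \boldsymbol{\Phi}^{n_r}_p(n)\boldsymbol{\Phi}^{n_r}_p(n)^H + \sigma_n^2\mathbf{G}\mathbf{G}^H$;
8: $\mathbf{A}^{n_r}_p(n) = \sqrt{\frac{2}{\pi}}\,\mathrm{diag}(\mathbf{C}^{n_r}_{\mathbf{y}_p}(n))^{-\frac{1}{2}}$;
9: $\tilde{\boldsymbol{\Phi}}^{n_r}_p(n) = \mathbf{A}^{n_r}_p(n)\boldsymbol{\Phi}^{n_r}_p(n)$;
10: $\mathbf{e}^{n_r}(n) = \mathbf{y}^{n_r}_{Q_p}(n) - \tilde{\boldsymbol{\Phi}}^{n_r}_p(n)\hat{\mathbf{h}}_{n_r}(n)$;
11: $\hat{\mathbf{h}}_{n_r}(n+1) = \hat{\mathbf{h}}_{n_r}(n) + \mu\,\tilde{\boldsymbol{\Phi}}^{n_r}_p(n)^H\mathbf{e}^{n_r}(n)$;
12: end for
13: end for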
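The per-window steps 7 to 11 of Algorithm 1 can be sketched for a single receive antenna as follows. This is a minimal illustration under assumptions that are not part of the paper: the demonstration uses $M = 1$, so the noise covariance is taken as $\sigma_n^2\mathbf{I}$ instead of $\sigma_n^2\mathbf{G}\mathbf{G}^H$, the pilots are random QPSK symbols, the training length is longer than in the simulations of this paper, and the names `lra_lms` and `demo` are hypothetical.

```python
import numpy as np

def lra_lms(Phi_windows, yq_windows, C_noise, n_taps, mu):
    """LRA-LMS recursion for one receive antenna (steps 7 to 11 of Algorithm 1)."""
    h = np.zeros(n_taps, dtype=complex)
    for Phi, y_q in zip(Phi_windows, yq_windows):
        C_y = Phi @ Phi.conj().T + C_noise                       # step 7
        d = np.real(np.diag(C_y))
        A = np.sqrt(2.0 / np.pi) * np.diag(1.0 / np.sqrt(d))     # step 8
        Phi_t = A @ Phi                                          # step 9
        e = y_q - Phi_t @ h                                      # step 10
        h = h + mu * Phi_t.conj().T @ e                          # step 11
    return h

def demo(seed=0, tau=300, l_win=3, snr_db=0.0, mu=0.05):
    """Normalized MSE after the last window; M = 1 and white noise are assumed."""
    rng = np.random.default_rng(seed)
    h = np.array([0.8 + 0.3j, -0.5 + 0.9j])          # illustrative channel, N_t = 2
    n_t = h.size
    sigma2 = n_t / 10.0 ** (snr_db / 10.0)
    X = (rng.choice([-1.0, 1.0], size=(tau, n_t))
         + 1j * rng.choice([-1.0, 1.0], size=(tau, n_t))) / np.sqrt(2)  # QPSK pilots
    w = np.sqrt(sigma2 / 2) * (rng.standard_normal(tau) + 1j * rng.standard_normal(tau))
    y = X @ h + w
    y_q = (np.where(y.real >= 0, 1.0, -1.0)
           + 1j * np.where(y.imag >= 0, 1.0, -1.0)) / np.sqrt(2)
    n_win = tau - l_win + 1                           # windows shifted by one symbol
    Phi_w = [X[n:n + l_win] for n in range(n_win)]
    yq_w = [y_q[n:n + l_win] for n in range(n_win)]
    h_hat = lra_lms(Phi_w, yq_w, sigma2 * np.eye(l_win), n_t, mu)
    return float(np.sum(np.abs(h_hat - h) ** 2) / np.sum(np.abs(h) ** 2))

nmse = demo()
```

No matrix inversion appears in the loop; the only per-window products involve matrices of size $l_{win} \times N_t$ (or $Ml_{win} \times N_t$ with oversampling), which is the reason for the low cost discussed in the complexity analysis.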

F. COMPLEXITY ANALYSIS
The computational complexities of the proposed channel estimators are compared in this subsection. For the sake of simplification and a fair comparison among the estimators, we assume that $\mathbf{R}_h$ is an identity matrix. Table 1 shows the total number of complex additions/subtractions and multiplications/divisions required for obtaining the channel estimate $\hat{\mathbf{h}}'$. More intuitively, Fig. 5 shows the total number of complex operations, which is the sum of complex additions and multiplications, as a function of the number of receive antennas $N_r$. Compared to the other channel estimators, the LRA-LMS channel estimator has the lowest computational cost, since there are no matrix inversions or large matrix multiplications in the algorithm. The comparisons in terms of MSE performance are shown in the simulations section.

G. ESTIMATION OF R h
In practical environments, there is no prior information about $\mathbf{R}_h$ at the receiver. In this subsection, an adaptive technique is proposed to recursively estimate $\mathbf{R}_h$ as in (31), where $\lambda$ is the forgetting factor and $\hat{\mathbf{h}}'(n)$ is the channel estimate at the Nyquist time instant $n$. Consider the system model in (32), where $\mathbf{y}_Q(n)$ and $\mathbf{n}(n)$ are column vectors with size $MN_r \times 1$. Different from $\mathbf{x}_p(n)$ in Algorithm 1, here $\mathbf{x}_p(n) \in \mathbb{C}^{N_t \times 1}$ contains the pilot symbols from the $N_t$ terminals at time instant $n$. The $M \times M$ filter matrix used here is a simplified version of $\mathbf{Z}$ with $N = 1$. The instantaneous estimate of $\mathbf{h}'$ is calculated as in (33), where the initial guess $\hat{\mathbf{R}}_h(1)$ is an identity matrix, obtained by assuming that the channel parameters are uncorrelated and each has unit power.
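A recursion with a forgetting factor can be sketched as follows. The exact form of (31) is not reproduced here; the code assumes the common exponentially weighted update $\hat{\mathbf{R}}_h(n+1) = \lambda\hat{\mathbf{R}}_h(n) + (1-\lambda)\hat{\mathbf{h}}'(n)\hat{\mathbf{h}}'(n)^H$, which may differ from the recursion used in this paper. The function name `update_R_h` is hypothetical.

```python
import numpy as np

def update_R_h(R_prev, h_hat, lam=0.99):
    """One step of an exponentially weighted estimate of the channel autocorrelation.

    ASSUMPTION: R(n+1) = lam * R(n) + (1 - lam) * h_hat h_hat^H.
    """
    h_hat = np.asarray(h_hat, dtype=complex)
    return lam * R_prev + (1.0 - lam) * np.outer(h_hat, h_hat.conj())

# Initial guess: identity matrix (uncorrelated, unit-power channel parameters).
R = np.eye(2, dtype=complex)
R = update_R_h(R, [1.0, 1j], lam=0.9)
```

With this form, every update keeps the estimate Hermitian, and a value of $\lambda$ close to one, such as the 0.99 used in the simulations, gives a long averaging memory.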

IV. CRAMÉR-RAO BOUNDS
Unlike the works in [29], [30], which have proposed CRBs for unbiased estimators, the existing CRBs are extended here to biased estimators. Two different types of CRBs are proposed depending on whether the prior information $\mathbf{R}_h$ is known at the receiver, namely the Bayesian CRB with known $\mathbf{R}_h$ and the general CRB with estimated $\mathbf{R}_h$.

A. BAYESIAN CRAMÉR-RAO BOUNDS
Bayesian bounds on the fundamental limits of estimation are derived for non-oversampled and oversampled systems. Without loss of generality, we extend (12) by considering the whole system and not just the pilots, and rewrite the complex-valued model in real-valued form. Let $\tilde{\mathbf{h}} = [\mathbf{h}'^R; \mathbf{h}'^I]$ be the unknown parameter vector. Since the real and imaginary parts are independent, the Bayesian information matrix (BIM) [37] for the quantized signal, $\mathbf{J}_{\mathbf{y}_Q}(\tilde{\mathbf{h}})$, is defined through (38) and (39). To transform the real-valued $\mathbf{J}_{\mathbf{y}_Q}(\tilde{\mathbf{h}})$ back to the complex-domain $\mathbf{J}_{\mathbf{y}_Q}(\mathbf{h}')$, $\mathbf{J}_{\mathbf{y}_Q}(\tilde{\mathbf{h}})$ is partitioned as in (40) into the blocks $\mathbf{J}^{RR}_{\mathbf{y}_Q}(\mathbf{h}')$, $\mathbf{J}^{RI}_{\mathbf{y}_Q}(\mathbf{h}')$, $\mathbf{J}^{IR}_{\mathbf{y}_Q}(\mathbf{h}')$ and $\mathbf{J}^{II}_{\mathbf{y}_Q}(\mathbf{h}')$, which have the same dimensions $N_rN_t \times N_rN_t$, and the chain rule is applied. The variance of the estimator $\hat{\mathbf{h}}'(\mathbf{y}_Q)$ is lower bounded as in (42).

1) BIM FOR NON-OVERSAMPLED SYSTEMS
For non-oversampled systems, i.e., $M = 1$, the covariance matrix of the equivalent noise vector $\mathbf{n}$ is $\mathbf{C}_{\mathbf{n}} = \sigma_n^2\mathbf{I}_{NN_r}$. With the independence of the real and imaginary parts, the log-likelihood function can be written separately for the two parts. With the derivative of the $Q(x)$ function, the real part in (38) is obtained, and the derivation for the imaginary part follows in the same way, which leads to (46). By assuming that $\tilde{\mathbf{h}}$ is Gaussian distributed with zero mean and covariance matrix $\mathbf{C}_{\tilde{\mathbf{h}}}$, (48) is obtained. Substituting (48) into (39), we obtain (49). Finally, the resulting BIM is the sum of (46) and (49).

2) BIM FOR OVERSAMPLED SYSTEMS
When $M \geq 2$, the equivalent noise vector $\mathbf{n}$ consists of colored Gaussian noise samples. Computing $p(\mathbf{y}^{R/I}_Q|\mathbf{h}')$ requires the orthant probabilities, which are not available or too difficult to compute. The authors in [28], [30] have introduced a lower-bounding technique on the Fisher information for real-valued systems, and the work in [29] has extended this technique to complex-valued systems. The lower bound of $\mathbf{J}^D_{\mathbf{y}^{R/I}_Q}(\mathbf{h}')$ is calculated based on the first and second order moments. Since the lower-bounding technique is identical for the real and the imaginary parts, only the derivation for the real part is shown. It requires the partial derivative of (52) with respect to each channel parameter as well as the diagonal and off-diagonal elements of the covariance matrix, where the off-diagonal elements are calculated from a bi-variate Gaussian random vector $[z_k, z_n]^T$. The lower bound for the imaginary part is derived in the same way. With the calculations above, the lower bound of the BIM is obtained, where equality holds for $M = 1$, as shown in [30] for the real-valued CRB and in [29] for the complex-valued CRB. Based on (42), the inverse of this BIM lower bound results in an upper bound of the actual Bayesian CRB for oversampled systems.

B. GENERAL CRAMÉR-RAO BOUNDS
When $\mathbf{R}_h$ is unknown and needs to be estimated at the receiver, the Bayesian CRBs are not applicable. The general CRBs are therefore derived for the proposed channel estimators with estimated $\mathbf{R}_h$.
Lemma 1: The proposed LRA channel estimators combined with the estimated $\hat{\mathbf{R}}_h$ are biased channel estimators.
Proof: See Appendix C.
Since the proposed LRA channel estimators are biased, the CRBs have to account for the bias. They are given in (57) and (58), which involve the upper left and lower right parts of $\mathbf{J}^D_{\mathbf{y}_Q}(\tilde{\mathbf{h}})$ (similar to (40)), respectively.

V. NUMERICAL RESULTS
The simulation results presented here consider an uplink single-cell 1-bit large-scale MIMO system with $N_t = 8$ and $N_r = 64$. The modulation scheme is quadrature phase-shift keying (QPSK). The $m(t)$ and $p(t)$ filters are normalized RRC filters with a roll-off factor of 0.8. The channel is assumed to experience block fading and the pilots are column-wise orthogonal with length 20. The SNR is defined as $10\log_{10}(N_t/\sigma_n^2)$. The normalized MSE and SER performance plots are obtained by averaging over 300 channel matrices, noise and symbol vectors.
For the LRA-LMS channel estimator, the window length $l_{win}$ is chosen as three to ensure low computational complexity. The step size $\mu$ is optimized according to the oversampling factor and the SNR. In the simulation, $\mu$ varies between 0.05 and 0.3. To recover the transmitted symbols from the received quantized signal, the sliding-window based LMMSE detector [26] with window length equal to three ($l_{win} = 3$) is applied together with the channel estimate obtained by the proposed algorithms, which provides both high accuracy and low computational cost.
The performance of the channel estimators is evaluated based on the channel model simulated in [38]. The channel for user $n_t$ is assumed to be Rayleigh distributed with receive correlation matrix $\mathbf{R}_{r,n_t}$, where $\rho_{n_t}$ is the correlation index of neighboring antennas ($|\rho_{n_t}| = 0$ represents an uncorrelated scenario and $|\rho_{n_t}| = 1$ implies a fully correlated scenario). The elements of $\mathbf{h}_{w,n_t}$ are i.i.d. complex Gaussian random variables with zero mean and unit variance. All users are assumed to experience the same value of $|\rho_{n_t}| = |\rho|$ but different phases uniformly distributed over $2\pi$. The overall channel model and the resulting $\mathbf{R}_h$ are given in (63) and (64), respectively.

A. R h IS KNOWN AT THE RECEIVER
In this subsection, we evaluate the performance of the proposed LRA channel estimators with known $\mathbf{R}_h$ at the receiver. Fig. 6a and Fig. 6b compare the normalized MSE of the various channel estimators as a function of SNR in uncorrelated ($|\rho| = 0$) and correlated ($|\rho| = 0.75$) channels, respectively. For the LRA-LMMSE channel estimator, the oversampled systems achieve a 2 dB performance gain over the non-oversampled systems at low SNR and a much larger gain at high SNR. In both channels the LRA-LMMSE achieves the best MSE performance at the expense of high computational cost. In contrast, the LRA-LMS estimates the channel matrix $\mathbf{H}'$ row by row, which largely reduces the computational cost (shown in Fig. 5). Note that this separation into rows may overlook the correlation of the receive antennas. More specifically, the proposed LRA-LMS treats $\mathbf{R}_{r,n_t}$ as an identity matrix. As a correction, the resulting channel estimate $\hat{\mathbf{h}}'_{\text{LRA-LMS}}$ needs to be multiplied by the square root of the receive correlation matrix, $\mathbf{R}^{\frac{1}{2}}_{r,n_t}$, which can be derived from $\mathbf{R}_h$ in (64). From the results, it can be seen that in both channels the LRA-LMS approaches the performance of the LRA-LMMSE at low SNR ($\leq 5$ dB), whereas at high SNR the performance gap becomes large.
The Bayesian CRBs derived in Section IV-A are also depicted in Fig. 6. Note that for the oversampled systems ($M \geq 2$) the upper bounds of the Bayesian CRBs are higher than the actual Bayesian CRBs, since they are derived from the lower bounds of the Bayesian information. The black lines represent the standard LMMSE performance for systems with unquantized signals, which can be treated as lower bounds for systems with 1-bit quantized signals. The SER performance of the sliding-window based LMMSE detector with the LRA-LMMSE estimated channel matrix and with the perfect channel matrix is illustrated in Fig. 7, where the oversampled systems clearly outperform the non-oversampled systems. As described in III-A, Fig. 8 shows the MSE comparison between the LRA-LMMSE and the simplified LMMSE [24] channel estimators in a system with $\tau = 10$ and roll-off factor 0.1. We emphasize again that in our work the correlation of the filtered noise is taken into account, and hence $\mathbf{C}_{\mathbf{n}_p}$ is not a diagonal matrix in oversampled systems. It can be seen that at low SNR ($\leq 10$ dB) the simplified LMMSE [24] performs worse than the proposed LRA-LMMSE, although the two converge at high SNR ($> 10$ dB). Another observation is that at low SNR the simplified LMMSE estimator with $M = 3$ performs worse than that with $M = 2$, which shows that the assumption in [24] is inaccurate.

B. R h IS UNKNOWN AT THE RECEIVER
In practice, $\mathbf{R}_h$ is not known at the receiver. Fig. 9 shows the MSE performance of the LRA channel estimators when the proposed adaptive recursion is used to estimate $\mathbf{R}_h$, where $\lambda$ is set to 0.99. It can be seen that the performance remains almost the same as in Fig. 6a, which shows that the proposed estimation of $\mathbf{R}_h$ works well under uncorrelated channels.
While analyzing the general CRBs proposed in (57) and (58), instead of directly calculating the gradient of the expected value with respect to the channel vector, $\partial E\{\hat{\mathbf{h}}^{R/I}_{bias}\}/\partial\mathbf{h}^{R/I}$, this gradient is numerically evaluated, since the adaptive estimation technique inside the channel estimator makes the analytical calculation more difficult. As one example, Fig. 10 shows the normalized MSE performance of the LRA-LS channel estimator with estimated $\hat{\mathbf{R}}_h$ in (31) for estimating the first $N_r$ elements² of $\mathbf{h}^R$ and its corresponding numerically calculated general CRBs under uncorrelated channels ($|\rho| = 0$). More specifically, each element of the gradient vector $\partial E\{\hat{\mathbf{h}}^{R/I}_{bias}\}/\partial\mathbf{h}^{R/I}$ is calculated with the following steps:
² For the sake of simplicity, only the first $N_r$ elements are considered, since for large-scale MIMO there are $N_tN_r$ elements in $\mathbf{h}^R$, which makes the calculation of the general CRBs time-consuming.
In this subsection, the channel estimation performance of the 1-bit oversampled system is compared with that of $b$-bit non-oversampled systems. In Fig. 11 the LRA-LMMSE channel estimator for a system with 2 or 3 bits is based on the work in [6]. It can be seen that a system with 2 or 3 bits has better MSE performance than the 1-bit system, especially at high SNR. However, the advantage of 1-bit ADCs is that they do not require automatic gain control (AGC) and linear amplifiers, and hence the corresponding radio frequency chains can be implemented with very low cost and power consumption (a few milliwatts) [7], [11], [39]. As one example, Fig. 12 shows the total receiver power consumption as a function of the number of quantization bits $b$.
The calculation of the receiver power consumption is based on the work in [40]
$$P_{total} = P_{BB} + P_{LO} + N_r(P_{LNA} + P_H + 2P_M) + 2N_r(cP_{AGC} + P_{ADC}), \tag{65}$$
where $P_{BB}$, $P_{LO}$, $P_{LNA}$, $P_H$, $P_M$ and $P_{AGC}$ denote the power consumption in the baseband processor, local oscillator (LO), low noise amplifier (LNA), $\frac{\pi}{2}$ hybrid and LO buffer, mixer and AGC, respectively. $c$ is chosen as 0 for the 1-bit system and 1 for $b$-bit systems. The power consumption of the different hardware components is given as $P_{BB} = 200$ mW, $P_{LO} = 22.5$ mW, $P_{LNA} = 5.4$ mW, $P_H = 3$ mW, $P_{AGC} = 2$ mW and $P_M = 0.3$ mW. $P_{ADC}$ is calculated from the figure of merit $FOM_w$, which is 200 fJ/conversion-step at 50 MHz bandwidth, and $f_n$, which is 100 MHz. From the results, it can be seen that the 1-bit system consumes much less power than the 2-bit and 3-bit systems in both non-oversampled and oversampled systems. Indeed, the 1-bit oversampled systems largely improve the estimation performance and allow the estimator to approach the performance of the 2-bit system at low SNR.
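The power model in (65) can be evaluated directly with the component values listed above. The sketch below assumes the Walden-type relation $P_{ADC} = FOM_w \cdot f_n \cdot 2^b$ for the ADC power, which is an assumption made here because the ADC expression is not reproduced in this text; the function names `adc_power_w` and `total_power_mw` are hypothetical, and $N_r = 64$ is taken from the simulation setup.

```python
def adc_power_w(bits, fom_w=200e-15, f_n=100e6):
    """ADC power in watts.

    ASSUMPTION: Walden figure-of-merit model P_ADC = FOM_w * f_n * 2^b.
    """
    return fom_w * f_n * 2 ** bits

def total_power_mw(bits, n_r=64):
    """Total receiver power in mW according to (65)."""
    p_bb, p_lo, p_lna, p_h, p_agc, p_m = 200.0, 22.5, 5.4, 3.0, 2.0, 0.3  # mW
    c = 0 if bits == 1 else 1              # no AGC needed for 1-bit ADCs
    p_adc = adc_power_w(bits) * 1e3        # convert W to mW
    return (p_bb + p_lo + n_r * (p_lna + p_h + 2 * p_m)
            + 2 * n_r * (c * p_agc + p_adc))

power_by_bits = {b: total_power_mw(b) for b in (1, 2, 3)}
```

Under these assumptions, most of the saving of the 1-bit receiver comes from removing the AGC term $2N_rP_{AGC}$ rather than from the ADC power itself.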

VI. CONCLUSION
In this work, oversampling based low-resolution aware channel estimators have been proposed for uplink single-cell large-scale MIMO systems with 1-bit ADCs employed at the receiver. The Bussgang decomposition is used to derive linear channel estimators based on different criteria. With oversampling in such systems, a clear advantage over the non-oversampled system is observed in terms of the normalized MSE. Moreover, the LMS adaptive technique used for channel estimation can largely reduce the computational cost and has almost the same accuracy as the LRA-LMMSE channel estimator at low SNR, which is important for low computational complexity and for hardware implementation. In addition, we have derived Bayesian and general CRBs on the MSE, which give theoretical limits on the performance of the channel estimators. Furthermore, we have proposed an adaptive technique to estimate the auto-correlation of the channel vector, which is important for practical use. In general, 1-bit ADCs have the advantage of energy saving. Our proposed oversampling based channel estimation, especially the LRA-LMS estimator, increases the accuracy of estimation while maintaining low computational cost, which is important for future low-cost and low-latency wireless systems.
The matrix $\mathbf{A}_p(n)$ depends on $\mathbf{R}_h$ such that the expectation in (84) can differ from the identity matrix, especially for channels without normalization, which verifies that (33) has an unknown bias [37]. With the analysis above, it is concluded that the adaptive estimator $\hat{\mathbf{R}}_h$ is also biased, which shows that the estimation procedures together with the proposed LRA channel estimators are biased.