Adaptive Channel Estimation and Tracking for URA-Based Massive MIMO Systems

In this paper, an adaptive channel estimation and tracking method is proposed for the <italic>uniform rectangular array</italic> (URA) based <italic>massive multiple-input and multiple-output</italic> (M-MIMO) systems. To decrease the computational complexity while tracking the change on the <italic>channel state information</italic> (CSI) in time, a <italic>principal subspace analysis</italic> (PSA) based scheme is proposed to estimate and track the signal subspace based on the received signal covariance matrix. Based on the estimated signal subspace, the channel parameters including the <italic>direction of arrival</italic> (DoA) angle and the channel gain corresponding to each resolvable path of the channel, can be recovered via the classic <italic>multiple signal classification</italic> (MUSIC) algorithm. Then, the <italic>Cramer-Rao lower bound</italic> (CRLB) is derived to evaluate the performance of the proposed scheme. Simulation results are provided to verify the accuracy of the proposed scheme on the channel estimation.

ABSTRACT In this paper, an adaptive channel estimation and tracking method is proposed for the uniform rectangular array (URA) based massive multiple-input and multiple-output (M-MIMO) systems. To decrease the computational complexity while tracking the change on the channel state information (CSI) in time, a principal subspace analysis (PSA) based scheme is proposed to estimate and track the signal subspace based on the received signal covariance matrix. Based on the estimated signal subspace, the channel parameters including the direction of arrival (DoA) angle and the channel gain corresponding to each resolvable path of the channel, can be recovered via the classic multiple signal classification (MUSIC) algorithm. Then, the Cramer-Rao lower bound (CRLB) is derived to evaluate the performance of the proposed scheme. Simulation results are provided to verify the accuracy of the proposed scheme on the channel estimation.

I. INTRODUCTION
As one of the most important techniques to fulfill the future high data rate requirements in 5G and beyond 5G systems, massive MIMO (M-MIMO) has been investigated extensively for decades [1]. Different from the conventional MIMO systems, hundreds of antennas are packed into a limited area at the base station (BS). As a consequence, much more freedom in spatial domain can be leveraged to improve the spectrum efficiency (SE) and energy efficiency (EE) [2], [3].
After extensively study in both academia and industry, the fundamental theory and implementation issues on the M-MIMO techniques have been revealed [4], [5]. With perfect channel state information (CSI) on the downlink transmission, the channel capacity of the M-MIMO systems will be improved with the increase of antenna elements [6], and for both hybrid structure and full digital structure, the accurate CSI is the necessary condition to guarantee the high SE The associate editor coordinating the review of this manuscript and approving it for publication was Faisal Tariq . and EE achieved by M-MIMO techniques. On the other hand, without perfect CSI, i.e. brought by the pilot contamination [7], the channel capacity will be degraded significantly. Therefore, achievable accurate CSI at the BS is the key to release the power of M-MIMO technique. Unfortunately, achieving accurate CSI is not trivial due to the large number of entries in the channel matrix. In general, two main obstacles need to be conquered to obtain the accurate CSI, which are pilot contamination and high computational complexity [8].
The pilot contamination issue has been extensively studied and well addressed in past few years. For a good survey on this topic please refer [7]. In [9], downlink training and a scheduled UL training processes have been combined to eliminate the pilot contamination in multi-cell time division duplex (TDD) based orthogonal frequency division multiplexing (OFDM) systems. In [10], a consecutive pilot transmission phases has been proposed to cancel the pilot contamination, where each BS stays idle at one phase and repeatedly transmits pilot sequence in the remaining phases. The effect of pilot contamination has been quantified in VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ the presence of the location information of users in [11]. Then, a novel pilot assignment scheme has been proposed to minimize the pilot contamination. Another way to mitigate the pilot contamination is subspace-based, where the covariance of the channel has been exploited to estimate the CSI. In [12], the channel covariance matrix has been used to derive the channel power distribution in angular domain. Then, the pilot reuse pattern has been designed accordingly to avoid the pilot contamination. A subspace projection method with power-controlled hand-off has been proposed to estimate the CSI without using the pilot sequences in [13].
In [14], the angle-delay power spread function of users have been estimated, which is used to enable the pilot use while mitigating the contamination problem. Even though the pilot contamination can be relieved, the computational complexity is high due to the requirement on matrix operation, i.e. eigenvalue decomposition (EVD) or singular value decomposition (SVD).
To decrease the computational complexity, the sparsity on the M-MIMO channel in spatial domain has been exploited to estimate the CSI [15], [16]. Since the large scale of antenna arrays with half wavelength antenna spacing is deployed at the BS and located at the top of a high tower, there are quite few scattering between the user terminals (UTs) and the BS. Moreover, the M-MIMO is more likely used in the scenarios with high frequency, i.e. millimeter wave and THz bands. As a consequence, the number of resolvable paths from the UTs to the BS is very limited. Therefore, the received signals at the antennas are highly correlated. Based on this attribute, array signal processing technique has been applied to estimate the channel parameters [17], which include direction of arrival (DoA) angles and path gains on the resolvable paths from the UTs to the BS. In [18], the spatial spectrum of the received signals at the uniform rectangular array (URA) of antennas in angular domain has been derived. Then, the channel parameters can be estimated via seeking the peaks of the spectrum. By using the two dimensional discrete Fourier transformation (2D-DFT) and angular rotation methods, the DoA angles have been estimated for the URA based M-MIMO channel after obtaining the initial CSI estimation during the training process in [19]. Moreover, a compressive sensing based channel estimation scheme has been proposed in [20] via exploiting the sparsity on the M-MIMO channel in spatial domain.
Although many works have studied how to avoid or minimize the effect of pilot contamination while reducing the computational complexity of the channel estimation for M-MIMO systems, most of them are one-time estimation schemes, which can not track the change on the CSI. It is well known that the CSI may vary from time to time due to the mobility of UTs. Obviously, checking the change on the CSI has much lower computational complexity than estimating the CSI repeatedly. Therefore, another way to decrease the computational complexity is to design the CSI tracking schemes. In [21], the modified unscented Kalman filter has been applied to track the DoA information while the training is still required to estimate the path gain. A first-order Taylor expansion-based method has been proposed to estimate the M-MIMO channel in [22]. In [23], the UT motion model has been used to find a temporal variation law of the direction between the BS and UTs. Accordingly, the M-MIMO channel can be tracked with the initial CSI estimation. Even though some works on channel tracking have been developed, they need some extra information, i.e. motion model of UTs, which maybe be hard to achieve in practice.
Based on above investigation, a novel adaptive channel estimation and tracking scheme is developed for M-MIMO systems with URA antennas which has high accuracy and low complexity in this paper. Different from conventional methods, the received signal snapshots at the BS are used to estimate and track the CSI adaptively. Therefore, it named as adaptive channel estimation and tracking scheme. First, the spatial channel model associated to the DoA and path gains is built and explained. Accordingly, the principle subspace analysis (PSA) based iterative method is proposed to estimate and track the received signal subspace from the received signal matrix at the BS. Based on the estimated signal subspace, the multiple signal classification (MUSIC) algorithm is applied to estimate the eigenvectors of the channel. Then, the DoA and path gains can be recovered accordingly. The main contributions of this paper are summarized as follows: 1. The novel information criterion proposed in [24] is borrowed to formulate the objective function when estimating the signal subspace. Based on this criterion, the channel estimation accuracy can be improved and tracking speed can be accelerated. 2. A PSA based iterative method is proposed to search and track the signal subspace, which can reduce the computational complexity of the proposed scheme significantly. 3. With the estimated DoA angles, a UT grouping scheme is proposed to enable the pilot reuse, which may mitigate the pilot contamination. 4. The Cramer-Rao lower bound (CRLB) is derived to evaluate the accuracy on the channel estimation for URA based M-MIMO systems. Moreover, the simulation results are demonstrated to verify the effectiveness of the proposed scheme. The notations adopted in the paper are described as follows. We use C to denote the complex domain. The small boldface is used to represent vector and the big boldface is used to represent the matrix. Let (X) T , (X) * , (X) H denote the transpose, conjugate, and conjugate transpose of a matrix X, respectively. E{·} represents the expectation, and diag{a 1 , · · · , a N } denotes a diagonal matrix with a 1 , · · · , a N at the main diagonal. The Kronecker product of two matrices X and Y is denoted by X ⊗ Y. = is used for definition. vec{·} denotes the vector mapping operation. |t i,i | denotes the modulus of t i,i . || · || denotes the Euclidian norm. log(.) is defined as Natural logarithm of a symmetric positive definite matrix. det(.) denotes determinant.

II. SYSTEM MODEL
In the system, M × N antennas are uniformly deployed in the planar rectangular area at the BS with spacing d between adjacent antennas, as shown in Fig. 1. A three-dimension (3D) coordination plane is used to describe the location of antennas and a single antenna is deployed at each UT. Assuming that there are P resolvable paths between the BS and a UT and the ith path has a corresponding elevation DoA angle, θ i , and azimuth DoA angle, φ i . Then, the M × N received signal matrix at the BS is given by where the coefficients β i are independent and uniformly distributed, a i is the attenuation, i is the initial phase of path i, x ∈ C is the transmit signal of the UT, Y ∈ C M ×N is one snapshot received signal at the BS, and Z ∈ C M ×N is the AWGN matrix in which each entry has zero mean and variance σ 2 . Define According to the (5), the steering matrix associated with the DoA angles on path i is given by With the matrix A i defined in (6), the received signal can be rewritten as where . Therefore, the channel matrix between the UT and the BS can be written as From (7) and (8), we can observe that the massive MIMO channel is directly related with two crucial matrices. The first matrix is the steering matrix, A i determined by the DoA angles of the resolvable paths. The second matrix includes the channel attenuation and the initial phases of the signals from resolvable paths. Therefore, as long as these two matrices are estimated with high accuracy, the channel matrix can be reconstructed. Moreover, according to (7), the channel matrix is determined by parameters related to P resolvable sub-paths. As a result, instead of estimating all M × N entries in the channel matrix H, we only need to estimate θ i , φ i and β i for each resolvable path i. Since the number of resolvable paths is much less than the number of entries in H, the computational complexity can be significantly reduced when only estimating DoA angles and amplitude of each resolvable path. In the following sections, the PSA based method is proposed to estimate and track the DoA angles and path gains on the resolvable paths to recover matrices A i and β i , respectively.
To abstract the channel matrix from the received signals, the SVD operation can be applied to decompose the received signal matrix, which is written as where S is an orthonormal unitary matrix with M ×M dimension and its' columns lay in the geometrical subspace of Y, is a M × N diagonal matrix with the singular values, denoted as ρ i , ∀i, sorted in descending order, D is an orthonormal unitary matrix with N ×N dimension. Physically, the singular values in are related to the energy distribution of the signals received from different paths at the BS. Based on the characteristics on SVD operation, the columns of matrices S and D are the eigenvectors of the covariance matrices, R 1 = E{YY H } and R 2 = E{Y H Y}, respectively. Therefore, matrix S spans the row space of Y and matrix D spans the column space of Y, respectively. Then, abstracting first P columns from matrices S and D to construct two new matrices, S P = {s 1 , · · · , s P } and D P = {d 1 , · · · , d P }. Based on the array signal processing theory, matrices S P and D P are determined by the azimuth and elevation DoA angles of P resolvable paths since their corresponding singular values are much bigger than the remaining singular values. Even though the eigenvectors associated to the signal subspace can be estimated by the SVD decomposition, it has very high computational complexity and can not track the variety on the eigenvectors. Therefore, in next sections, a novel PSA based CSI estimation and tracking scheme is demonstrated.

III. ESTIMATION ON S P AND D P
Based on the analysis in Section II, the DoA angles can be estimated by recovering the matrices S P and D P . Therefore, VOLUME 8, 2020 in this section, the way to estimate and track S P and D P via estimating and tracking the eigenvectors of R 1 and R 2 are presented. First, the theoretic covariance matrix R 1 is given by where R 1 is a Hermitian matrix, ψ i = E{xx * }. Then, after the eigenvalue decomposition (EVD) operation on R 1 , the matrix R 1 can be written as where S P is an M × P matrix with eigenvectors corresponding to eigenvalues, λ 1 , . . . , λ P , s is a P × P diagonal matrix with entries from λ 1 to λ P , S z is an M × (M − P) matrix composed of eigenvectors corresponding to eigenval- Similarly, applying EVD operation to R 2 , we can obtain where D P is and N × P matrix with the eigenvectors associated to the signal subspace, d is a P × P diagonal matrix with the eigenvalues associated to the signal subspace, D z is an N × (N − P) matrix composed of eigenvectors and z 2 is a (N − P) × (N − P) diagonal matrix with the eigenvalues associated to the noise subspace. Based on the array signal processing knowledge, the eigenvectors in signal subspaces, S P and D P , are decided by the DoA angles of the resolvable paths and the eigenvalues are determined by the amplitudes of the resolvable paths. Therefore, the DoA angles can be estimated after recovering the S P and D P via the EVD operation. However, similar to the SVD decomposition, the EVD operation has very high computational complexity, which is O M 3 + N 3 , due to the high dimensions of the channel matrix. Secondly, it can not trace the change on the channel matrix caused by the mobility of UTs. Therefore, in the followings, an adaptive PSA based method is proposed to estimate and track S P and D P .

A. PSA BASED ESTIMATION AND TRACKING METHOD
First, an adaptive PSA based method is proposed to estimate and track the signal subspace. The novel information criterion (NIC) proposed in [24] for PSA is adopted to formulate the objective function defined as J (W 1 ), where W 1 is the estimation on the signal subspace. According to [24], the estimation scheme developed based on NIC has higher convergence speed than that with Oja's subspace estimation algorithm [25]. Accordingly, the objective function is written as Accordingly, the signal subspace estimation can be formulated as max where 0 is a P × P zero matrix. Via analyzing the problem (14), we can achieve the following Theorem. Theorem 1: All stationary points of J (W 1 ) are found in the form W 1 = S P TQ, where Q is an unitary matrix and T is a diagonal matrix with |t i,i | = 1.
Proof: First, the gradient of J (W 1 ) with respect to W 1 can be derived as Then, replacing W 1 with S P TQ in (15), we can achieve Then, multiplying W H 1 in both sides of (16), we have Therefore, W 1 = S P TQ is a stationary point of J (W 1 ). Based on Theorem 1, according to Theorem 3.2 in [24], J (W 1 ) has only one global maximum at W * 1 = S P Q. Therefore, the optimal solution for problem (14) is the estimation on the signal subspace.
To track the signal subspace which may change from time to time due to the mobility of UTs, the gradient ascent method is applied to estimate and upgrade W 1 , which is given by where the subscript k is the updating step index, µ is a updating step size. The initial W 1 can be any M × P matrix. The gradient of J (W 1 ) can be derived by (15) where the covariance matrix R 1 is required. Since it is difficult to achieve the accurate R 1 in practice, the following estimation on R 1 proposed in [26] is used in the paper.
where 0 < α < 1 is the forgetting factor, Y i is is the received signal at the ith time slot, k is the current time slot and update step. The initial covariance is equal to Y 1 Y H 1 . It is noteworthy that the forgetting factor is used to track the time varying covariance matrix brought by the change on the channel condition between the UTs and the BS.
Based on the estimation on the covariance matrix in (19), the updating rule in (18) can be rewritten as Based on the analysis in [24] and [27], the gradient ascent method presented in (19) will converge to the optimal solution of (14) with fast speed. However, the optimal W 1 is only the estimation of the signal subspace, but not the true eigenvectors, S P , due to the matrix TQ.
To recover S P , the EVD operation is applied to estimate matrix TQ as explained in the followings. First, define a matrix, L, as Then, input W 1 = S p TQ into (21), we can achieve Furthermore, by applying EVD operation on L, we have Based on (22) and (23), we can derive Then, multiplying Q 1 in (24) with W 1 , we can recover the eigenvectors, S P . Therefore, after the EVD operation on the matrix L, we can achieve the matrix Q H T H , which is composed of the eigenvectors of L. Then, the matrix S P can be derived accordingly. It is noteworthy that even though the EVD operation is used in (23), the computational complexity is much lower than that when using EVD operation directly to the covariance matrix R 1 . This is because that the dimension of matrix L, which is P×P, is much lower than the dimension of R 1 , which is M × M . By using the same way, the signal subspace, D P , can also be recovered. In the next section, the method to estimate the DoA angles via the signal subspaces is introduced.

IV. THE ESTIMATION ON CHANNEL PARAMETERS
In this section, first, a MUSIC based method is proposed to estimate the DoA angles based on the estimation of u i and v i . Then, a projection method is proposed to pair the DoA angles on elevation and azimuth directions. Afterwards, the way to estimate β i based on the recovered signal eigenvectors is introduced.

A. ESTIMATION OF RIGHT P
In order to apply the proposed channel estimation and tracking scheme, the number of distinguishable paths should be known. Assume that P is the initial value on the number of distinguishable paths. With P, the matrix S P can be recovered by the proposed scheme, which is composed of the P eigenvectors. Accordingly, we can calculate the eigenvalues corresponding to the eigenvectors, which are written as {λ 1 , · · · , λ P }. Then, sort the eigenvalues with ascending order. If there are only P distinguishable paths, the first P eigenvalues would be far bigger than the remaining eigenvalues. Based on this observation, the real number of distinguishable paths can be determined.

B. ESTIMATION OF u i AND v i
To estimate u i with the signal subspace S P , first multiplying (10) and (11) with S z , we have Then, subtracting (25) to (26), there is Since Based on matrix manipulation, from (28) we can obtain where s z,k is the kth column of matrix S z , s z,k,m is the mth entry of s z,k . According to (29), we can observe that finding u is equal to seek the roots for (29). To do so, we first compute the pseudo-spectrum with respect to u for the eigenvectors of noise subspace as where p(ζ ) = 1, ζ, . . . , ζ (M −1) T . According to (30), if ζ = e j2πu , the roots of the denominator would locate on the unit circle and be the estimation on u since the steering vector, a(u i ), 1 ≤ i ≤ P, is orthogonal to the noise sub-space, S z . Moreover, S z S H z can be derived based on the estimation on S P in Section II,which is given by Then, inserting (31) into (30), u i , ∀1 ≤ i ≤ P, can be estimated by finding the peaks of the pseuospectrum in (30). By using the same method, we can estimate v i , ∀1 ≤ i ≤ P with the recovered D P .

C. PAIRING ON u i AND v i
Since u i and v i are estimated separately and independently, they need to be matched to the corresponding path. After achieving the estimation on u i and v i , we need to pair u i with v i . To do so, we first define standard inner products for matrices as where J and K are two matrices. The inner products operation can be interpreted as the correlation operation, which can maximize the energy of the received signals of K lying in the ''direction'' of J. Accordingly, with u i and v i , the inner VOLUME 8, 2020 products operation of the received signal matrix, Y, on the steering matrix A i (u i , v j ) can be written as where i, j = 1, 2, 3 . . . P. Obviously, only when j = i, the value of P r will reach the maximum since the received signal at the BS are mainly transmitted via the P resolvable paths. In another word, the received signals from P paths have contributed the most on the received energy at the BS. Therefore, by finding the maximum value of P r in (33), the u i can be paired with the corresponding v i . After the estimation and paring on u i and v i , the corresponding DoA angles can be estimated by the inverse function of (5), which is given by where τ = d λ .

D. ESTIMATION OF β i
Based on (9), the received signals at the BS can be written as Then, (36) can be rewritten as where S i and D i are the ith column vectors of S P and D P , G i = S i D H i , respectively. The set of matrices, {G 1 , G 2 , · · · , G P }, constructs an orthonormal space since Therefore, after the SVD operation, the received data matrix is decomposed into P orthogonal subspaces. Consequently, the ith eigenvalue, ρ i , is equal to < G i |Y >, which can be interpreted as the proportion of the received energy laying in the direction on G i . With this in mind, the inner products operation of the received signals, Y, on G j , ∀ 1 ≤ j ≤ P, is given by Then, according to (40) and (41), the amplitude of path i can be calculated as Based on the description in Section IV. A, B, C and D, after deciding the number of distinguishable paths, the proposed channel estimation and tracking algorithm for M-MIMO systems can be concluded as Algorithm 1 listed in above Table. E

. CHANNEL ESTIMATION AND TRACKING FOR MULTIPLE USERS
In Section IV. A, B and C, we have described the estimation and tracking on the uplink channel parameters of each UT based on Algorithm 1. When multiple UTs are present in the system, the orthogonal training sequences need to be used to distinguish the signals from different UTs first. Assuming that there are K UTs in the system and I orthogonal training sequences, represented by {r 1 , · · · , r I }, where each item, r i , is a vector with 1 × L dimension and ||r i || 2 = L. After allocating the training sequence r i to UT i, the received signal at the BS is given byȲ whereH i is a M × N × 1 matrix which denotes the channel between the BS and the UT i, andZ ∈ C M ×N ×L is the additive white Gaussian noise tensor with Gaussian distribution. To distinguish the received signal from UT i, the BS multiplies (43) with r H i go get After separating the signals from different UTs, Algorithm 1 can be applied to estimate the initial DoA angles from UT i. If the number of orthogonal training sequence is less than the number of UTs, i.e. I < K , then the UTs are divided into K I , where · is the maximum integer operator. Then, each group uses I orthogonal training by time division multiple access (TDMA) manner to separate the received signals at the BS. Then, based on Algorithm 1 proposed in the, the initial DoAs of UTs can be estimated.
After achieving the initial DoA angles for UTs, the difference on DoA angles can be used to partition users into different groups which is similar to the grouping scheme in [19]. If the minimum difference on DoA angles of two users k 1 and k 2 are bigger than φ, they are put into the same group and they can share the same training sequence.
After initial CSI estimation and UT grouping at the BS, the BS receives the signals from multiple UTs within the same group and updates the covariance matrix bŷ where G is a group of UTs whose DoA angle difference is larger than a threshold Then, according to the proposed scheme, the matrices S P and D P , which include the eigenvectors of the signal subspace, can be recovered with the updated R 1,k . Since the difference on DoA angles from different UTs in the same group is larger than a threshold, u l,j , 1 ≤ l ≤ P j , and v i,j , 1 ≤ l ≤ P j , for UT j, j ∈ G, can be estimated, where P j is the number of distinguishable paths from UT j to the BS. Actually, for the BS, it is similar to estimate the DoA angles of one UT with j∈G P j distinguishable paths. Therefore, the UTs in the same UT group can use the same pilot without causing pilot contamination problem since their DoA angles are different enough to be separated at the BS.

V. CRLB OF 2D DoA ESTIMATION
The Cramer-Rao lower bound (CRLB) is the best low bound on the mean square error (MSE) of the estimation in theory, which is widely used to evaluate the estimation accuracy.
In this section, we analyze CRLB for the estimation on DoA angles of URA based M-MIMO systems. For simplicity, assuming the signal via one path impinges on the planar with the noise which has zero mean and the variance σ 2 . Then, the likelihood function of resource signal can be expressed as where y R m,n denotes the real part of received signal at the (m, n)th antenna and c m,n is the signal without noise where f 0 = d λ cos θ and f 1 = d λ sin θ cos φ. is defined as = [β, f 0 , f 1 ] T , according to [24], the joint CRLB of the elevation and azimuth angles can be found through the inverse of the Fisher information matrix which is defined as: and where I( ) is the 3 × 3 Fisher information matrix whose elements can be obtained according to equations (46) and (48), for i = 1, 2, 3 and var( i ) is the variance of i . Then, using the approximations given in [24], the Fisher matrix I( ) is given by Since the parameters we need to estimate are DoA angles, θ and φ, based on (34) and (35), define a vector VOLUME 8, 2020 Accordingly, we can obtain the CRLB lower bound on the elevation and azimuth angle of a M-MIMO system through the Jacobian matrix based on [29]: where To simplify the estimation accuracy upper bound on φ, define Then, based on (54), (53) is rewritten as where γ = β 2 /(2σ 2 ) is the received signal to noise ratio (SNR) at the BS. From above CRLB derivation for DoA estimation, we can observe that the MSEs of DoA angle estimation is reverse proportional to the SNR of the received signal. Furthermore, the larger is the number of antennas deployed as the BS, the smaller is the CRLB.

VI. PERFORMANCE SIMULATION
In this section, the performance of the proposed scheme is evaluated via the numerical simulations. In the simulated system, the hight of the BS is set to 35 meters while a target UE is at the height of 1.8 meters in the cell. An URA with 8 × 8 antennas is deployed at the BS and the antenna space is set to be half of the wavelength, d = λ/2, and the received SNR is 25dB at the BS. Assuming that the received signals go through three disguisable paths before they reach BS.

A. ESTIMATION ON u i AND v i
First, the estimation accuracy on u i and v i is examined with the proposed scheme. Assuming that three distinguishable paths have elevation angles 25 • , 45 • and 80 • , and the azimuth angles 20 • , 30 • and 70 • , respectively. Then, based on the MUSIC algorithm presented in Section IV.A, the pseudospectrum of the received signals can be derived, which is demonstrated in Fig. 2. From the figure, we can observe that there have three acute peaks on each graph which correspond to the three DoA angles on azimuth and elevation directions, respectively. Accordingly, the locations of the corresponding peaks are the estimations of u i and v i , which are listed in Table 1. The real values of u i and v i are also demonstrated in Table 1. We can observe that the estimations on both u i and v i are very close to the real values, which verifies the effectiveness of the proposed estimation scheme. It is noteworthy that the search accuracy on u i and v i depends on the searching step size, which is set to be 0.0001 in the manuscript.

B. PAIRING ON u i AND v i AND DoA ANGLES ESTIMATION
Even though both u i and v i can be achieved with high accuracy, the paring between the u i and v i is not considered in the last section. In this section the parings scheme introduced in Section IV. B is verified via the numerical simulations. In Table 2, the projection values of the steering vectors to the received signals are listed, where the steering vectors are generated by the combination of the estimated u i and v i in last section. From Table 2, the maximum projection values should be chosen as the reference to match u i with v i . Accordingly, u 1 , u 2 , and u 3 should be paired with v 1 , v 2 , and v 3 , respectively. After pairing, the azimuth and elevation DoA angles can be recovered based on (34) and (35), which is listed in Table 3. From Table 3, we can observe that both azimuth and elevation DoA angles are recovered with high accuracy. But, the estimation accuracy on the elevation DoA angles, θ i , is higher than that on the azimuth DoA angles, φ i , since θ i is only decided by u i , but φ i is related to both u i and v i . Therefore, the estimation error on both u i and v i would affect the estimation accuracy on φ i .

C. THE IMPACT OF NOISE POWER ON THE ESTIMATION AND PAIRING ACCURACY
The main idea on the pairing algorithm is to match u i with the corresponding v i based on correlation operation. Obviously, the received energy at the BS is mainly from the signals received via the distinguishable paths. Therefore, when the steering vector is built with right u i and v i , the its inner product with the received signals should be larger than the rest as denoted in (33).
However, the noise may impair the estimation and pairing accuracy. In the previous simulations, received SNR at the BS is set to be 25 dB. To examine the impairment of the noise, we change the received SNR to 10 dB. Then, the PseudoSpectrum with respect to u i and v i is demonstrated in Fig. 3. From the figure, we can observe that it also has three peaks. Then, the pairing algorithm is applied and the paring values are listed in Table 4. According to the pairing rule, u 1 , u 2 and u 3 should be paired with v 1 , v 3 and v 1 , respectively, based on the values provided in Table 4, which is not accord to the right one. Furthermore, when SNR decreases to 5dB, there are only two PseudoSpectrum peaks as demonstrated in Fig. (4). Therefore, u i and v i can not be estimated accurately in this scenario.   According to above investigation, we can conclude that the proposed channel estimation and tracking algorithm works when SNR is high enough.

D. COMPARISON ON COMPUTATIONAL COMPLEXITY
The computational complexity of the PSA based and the SVD based schemes is compared and demonstrated in Table 5 when M × N antennas are deployed at the BS. In the PSA based scheme, the computational complexity is mainly from the covariance matrix estimation on the received signals and tracking the signal eigenvector which is O(M 2 NT + N 2 MT ), where T is the number of received signal snapshots. From the Table 5, we can observe that the computational complexity of SVD based scheme is much higher than that of PSA based scheme.

E. EVALUATION ON THE CONVERGENCE SPEED
In this section, a direction cosine parameter, ε, is defined to evaluate the convergence speed of the proposed algorithm. Assuming that W 1 is the estimation on S P , it can be written as where s i is the ith principal eigenvector of matrix S P , w 1,i,k denotes the estimation of the ith principal eigenvector in S P at the kth iteration step. In order to make a comparison, Oja's algorithm in [25] is simulated with the same initial condition, W 1,0 ∈ C M ×P , which is a unit matrix. The updating step size µ in PSA based and Oja's algorithm is set to 0.11 and 0.0038, respectively. Similarly, the updating step size µ to achieve W 2 , which is the estimation on D P , is set to 0.07 and 0.007 in PSA based and Oja algorithms, respectively. From Fig. 5, we can observe that ε achieved by both PSA based and Oja algorithms converges to 1, which implies that both schemes can converge to the signal eigenvectors. However, the convergence speed of PSA based algorithm is faster than Oja's algorithm, which verifies the effectiveness of the proposed algorithm on the estimation of S P and D P .

F. EVALUATION OF CRLB FOR VARIOUS ANTENNA CONFIGURATIONS
Numerical evaluation on CRLB for various antenna configurations is shown in Fig. 6 when the received SNR ranges from 6 dB to 24 dB. It can be observed from the figure that the MSE on the estimation reduces as the received SNR increases. Moreover, the number of the antennas plays an important role on the estimation accuracy. A larger number of antennas, M × N , leads to a smaller MSE on the DoA angle estimation on both elevation and azimuth directions. In Fig. 7, the DoA angle estimation accuracy by PSA based scheme is compared with that of Oja scheme and the CRLB. We can observe that the estimation MSE by the proposed scheme gets close to CRLB as the increase on received SNR and always smaller than that of Oja's scheme. It is noteworthy that the estimation accuracy on the elevation DoA angle is higher than that on the azimuth DoA angle, which verifies the analysis in Section VI. B. Another observation is that the gap between the proposed scheme and the ideal CRLB on the estimation MSE is pretty high when the received SNR is lower. Therefore, the effectiveness of the proposed scheme can not be guaranteed with low SNR. As a result, the orthogonal training sequences are still required to estimate the CSI for each UT initially when multiple UTs are present in the system, as we proposed in Section IV. D, which is to avoid the strong interference. After the initial estimation, the UTs whose minimum difference on the DoA angles is larger than a threshold can share the same training sequence, which is also to avoid the strong interference and keep high received SNR at the BS for each UT.

G. TRACKING ON THE DoA ANGLES
To examine the tracking ability and adaptation of the proposed scheme, we set the variation on elevation and azimuth angles as 0.02 • /step and 0.01 • /step, respectively. Then, the elevation and azimuth angles change from 45 • to 61 • and 30 • to 22 • over 800 samples, respectively. We can observe that the PSA scheme can adaptively track the variation on the DoA angles in time with high accuracy after initial transient. The elevation angle's tracking curve is more smooth than the azimuth angle's tracking curve since the estimation of azimuth angle rely on the estimation of both u i and v i .

VII. CONCLUSION
In this paper, a PSA based adaptive method is proposed to estimate and track the CSI for the URA based M-MIMO systems. To reduce the computational complexity, the M-MIMO channel model in spatial domain is adopted and an iterative algorithm is designed to estimate the received signal eigenvectors at the BS. Then, the conventional MUSIC algorithm and a projection based method are employed to estimate the DoA angles and path gains. To evaluate the estimation accuracy and the influence of the antenna array configuration deployed at the BS to the estimation accuracy, the CRLB is derived in theory. Afterwards, the numerical results are demonstrated to verify the effectiveness of the proposed scheme. Even though the PSA based estimation and tracking scheme has advantages on low computational complexity and mitigating the pilot contamination, there are still some issues which need to be further investigated. The first one is that the high SNR on the received signals is required to guarantee the effectiveness of the proposed scheme. As a consequence, the orthogonal pilots are still necessary to achieve the accurate CSI estimation at the initial stage. The second one is that the iteration step size µ to estimate the signal eigenvectors is set to a constant. If above two issues can be address properly, the proposed scheme may estimate and track the CSI for M-MIMO systems with high accuracy and convergence speed. Therefore, they are our future work.