Channel Estimation for RIS-Aided mmWave MIMO Systems via Atomic Norm Minimization

A reconfigurable intelligent surface (RIS) can shape the radio propagation environment by redirecting the impinging electromagnetic waves towards any desired direction, thus departing from the classical Snell's law of reflection. However, optimal control of the RIS requires perfect channel state information (CSI) of the individual channels that link the base station (BS) and the mobile station (MS) via the RIS. Super-resolution channel (parameter) estimation therefore needs to be conducted efficiently at the BS or MS, with CSI feedback to the RIS controller. In this paper, we adopt a two-stage channel estimation scheme for RIS-aided millimeter wave (mmWave) MIMO systems without a direct BS-MS channel, using atomic norm minimization to sequentially estimate the channel parameters, i.e., angular parameters, angle differences, and products of propagation path gains. We evaluate the mean square error of the parameter estimates, the RIS gains, the average effective spectrum efficiency bound, and the average squared distance between the designed beamforming and combining vectors and the optimal ones. The results demonstrate that the proposed scheme achieves super-resolution estimation and outperforms the existing benchmark schemes, thus offering promising performance in the subsequent data transmission phase.


I. INTRODUCTION
Millimeter wave (mmWave) bands combined with multiple-input multiple-output (MIMO) transmission are a promising candidate for 5G and beyond-5G communication systems [1]. However, the transmission distance is limited by the high free-space path loss, which can be compensated for by introducing large antenna arrays at both ends of the link [2]-[4]. This in turn makes channel estimation (CE) more challenging than in small-scale MIMO systems, which have fewer unknown channel coefficients. Unlike in the sub-6 GHz bands, the wireless channels at mmWave frequencies have been verified to exhibit less scattering [1], so fewer resolvable paths exist between the base station (BS) and the mobile station (MS). Thus, the mmWave MIMO channel is typically inherently sparse (i.e., the number of distinguishable paths in the angular domain is much smaller than the number of transmit and receive antennas). Efficient yet effective compressive sensing (CS) techniques, which take advantage of this sparsity, have been widely applied to the channel (parameter) estimation of point-to-point (P2P) mmWave MIMO channels, e.g., in [5]-[8].
Due to the channel sparsity, mmWave communications typically require a line-of-sight (LoS) connection to maintain a sufficient receive power level. In practice, the direct channel between the BS and MS can be blocked by objects [9]. In order to maintain the connectivity under LoS blockage, the concept of a reconfigurable intelligent surface (RIS), also known as intelligent reflecting surface (IRS) [10] or large intelligent surface (LIS) [11], [12], has recently been proposed in [13]-[17] as a smart reflector. It can also be interpreted as a full-duplex (FD) relay [18], although it is in reality a passive element with no active transmit power amplifier, which is a core component of an actual relay station. Other potential benefits brought by introducing a RIS include enhanced spectrum efficiency (SE), energy efficiency (EE), and physical-layer security [19]. Additionally, the RIS has great potential to offer more accurate indoor or outdoor radio localization [17], [20]. In practice, the RIS can be made of an array of discrete phase shifters, which can passively steer beams towards dedicated terminals by controlling the phase of each RIS unit. This kind of RIS architecture is called the discrete RIS and does not have any baseband processing capability [14], [15], [17]. Therefore, extremely low power consumption is expected, since power is used only for the control of the RIS units. Another type of RIS, in contrast, is the continuous/contiguous RIS, which can be seen as an active transceiver with baseband processing capability [12] or as a passive reflector [21] like the aforementioned discrete RIS.
CE methods for RIS-aided MIMO systems have recently been studied in [22]-[26]. Taha et al. [22] considered a special setup with mixed active and passive elements at the RIS. CE was performed there using CS and deep learning (DL) methods at the RIS, based on the signals received at the active elements with pilots sent from the BS and MS. The introduction of active receive elements at the RIS increases the power consumption, complexity, and cost of the RIS.
In [23], sparse matrix factorization and matrix completion were exploited in a sequential manner to perform iterative CE. However, the full rate advantage of the RIS is not achieved during the training process due to the on/off state applied to the RIS elements. An optimal CE scheme was studied by following the criterion of minimum variance unbiased (MVU) estimation in [24]. In [25], CS was applied to estimate the cascade mmWave channel. However, a single antenna was assumed for the MS in both [24] and [25], which is suitable for wireless sensor network applications but is not practical for mmWave MIMO communications. In our recent work [26], we applied the iterative reweighted method of [7], [27] to estimate the channel parameters. However, both the BS-RIS and RIS-MS channels were assumed to have only a LoS path. Unlike all the aforementioned works, [28] leveraged a multi-level hierarchical codebook based scheme to jointly design the phase control matrix (reflection beam) at the RIS and the combining vector at the MS, instead of estimating the MIMO channel parameters as an intermediate step towards the joint design of the active combining vector at the MS and the passive beamforming (BF) at the RIS.
In this paper, we study the CE problem of passive RIS-aided mmWave MIMO systems, where the direct channel is obstructed and multiple paths exist for both the BS-RIS and RIS-MS channels. We resort to the parametric channel model for the individual channels [2], [29], based on angular parameters, i.e., angles of departure (AoDs) and angles of arrival (AoAs), and propagation path gains. Furthermore, no data sharing backhaul link is assumed between the BS and RIS; a low-rate control link is sufficient. We divide the CE problem into two subproblems and apply atomic norm minimization to sequentially find the estimates of the channel parameters, i.e., angular parameters, angle differences, and products of propagation path gains. Besides evaluating the mean square error (MSE) of the estimated channel parameters, we design the RIS phase control matrix, the BS BF vector, and the MS combining vector based on the estimates and evaluate the average effective SE bound and the RIS gains. The proposed CE scheme significantly outperforms an orthogonal matching pursuit (OMP) based two-stage counterpart [30]. Simulation results demonstrate that the average effective SE bound achieved by the proposed method approximates that with perfect channel state information (CSI) in the low signal-to-noise ratio (SNR) regime with limited training overhead. The contributions of the paper are summarized as follows:
• We propose an efficient super-resolution channel parameter estimation scheme for RIS-aided mmWave MIMO systems, based on atomic norm minimization [31], [32]. The proposed scheme can reduce the training overhead significantly by first estimating part of the channel parameters (i.e., the AoDs of the BS-RIS channel and the AoAs of the RIS-MS channel) and utilizing the estimates in the subsequent training period.
• Decoupled atomic norm minimization is applied in the first stage with a multiple measurement vectors (MMV) model, while atomic norm minimization is applied in the second stage with a single measurement vector (SMV) one.
• The design of the RIS phase control matrix is studied by following the criterion of maximizing the power of the effective channel. On the basis of the designed RIS phase control matrix, the joint design of the BS BF and MS combining vectors is considered based on the reconstructed composite channel matrix (using the estimated channel parameters).
The rest of the paper is organized as follows: Section II introduces the channel model for the RIS-aided mmWave MIMO system, followed by the sounding procedure in Section III.
Section IV provides the details of the proposed two-stage CE approach based on atomic norm minimization, followed by the RIS control as well as the beamforming and combining design in Section V. The performance evaluation is offered in Section VI. Section VII draws the conclusions and discusses potential directions for future investigation.

II. CHANNEL MODEL
The antenna array is assumed to be a uniform linear array (ULA) with consideration of the azimuth angle only; an extension to a uniform planar array (UPA) is possible. We further assume that the direct channel between the BS and MS is obstructed, which motivates the use of a RIS for maintaining the connectivity between the BS and MS. We assume the geometric channel model, which is based on the AoDs, the AoAs, and the propagation path gains of each link. The channel between the BS and the RIS, denoted as $\mathbf{H}_{B,R} \in \mathbb{C}^{N_R \times N_B}$, is

$$\mathbf{H}_{B,R} = \mathbf{A}(\boldsymbol{\phi}_{B,R}) \, \mathrm{diag}(\boldsymbol{\rho}_{B,R}) \, \mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{B,R}), \tag{1}$$

where $[\boldsymbol{\theta}_{B,R}]_l$ and $[\boldsymbol{\phi}_{B,R}]_l$ denote the $l$th AoD and AoA of the BS-RIS channel, respectively, $L_{B,R}$ denotes the number of resolvable paths, which is usually on the order of 2-8 in mmWave frequency bands [1], and $[\boldsymbol{\rho}_{B,R}]_l$ denotes the $l$th propagation path gain. Index $l = 1$ refers to the LoS path, and $l > 1$ refers to the non-line-of-sight (NLoS) paths, e.g., single-bounce or multi-bounce reflection paths. Usually, $|[\boldsymbol{\rho}_{B,R}]_1|^2 \gg |[\boldsymbol{\rho}_{B,R}]_l|^2$ for $l > 1$, and the difference is easily more than 20 dB [33].
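The array response matrices collect the array response vectors of the individual paths, e.g., $\mathbf{A}(\boldsymbol{\theta}_{B,R}) = [\boldsymbol{\alpha}([\boldsymbol{\theta}_{B,R}]_1), \cdots, \boldsymbol{\alpha}([\boldsymbol{\theta}_{B,R}]_{L_{B,R}})]$, and for an $N$-element ULA the array response vector takes the form $[\boldsymbol{\alpha}(\theta)]_k = \exp\!\left(j 2\pi \frac{d}{\lambda} (k-1) \sin(\theta)\right)$ for $k = 1, \cdots, N$, where $d$ is the antenna element spacing, $\lambda$ is the wavelength of the carrier frequency, and $j = \sqrt{-1}$. Similar to (1), the channel between the RIS and the MS, denoted as $\mathbf{H}_{R,M} \in \mathbb{C}^{N_M \times N_R}$, is

$$\mathbf{H}_{R,M} = \mathbf{A}(\boldsymbol{\phi}_{R,M}) \, \mathrm{diag}(\boldsymbol{\rho}_{R,M}) \, \mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{R,M}), \tag{4}$$

where the channel parameters $\boldsymbol{\phi}_{R,M}$, $\boldsymbol{\rho}_{R,M}$, $\boldsymbol{\theta}_{R,M}$, $\mathbf{A}(\boldsymbol{\phi}_{R,M})$, and $\mathbf{A}(\boldsymbol{\theta}_{R,M})$ are defined in the same manner as those in (1).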
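As an illustration of the geometric model in (1) and (4), the following minimal sketch builds a ULA array response vector and a multipath channel matrix as a sum of rank-1 path contributions. The function names, the half-wavelength spacing, the unnormalized array response, and all numerical values are assumptions made for this example only.

```python
import cmath
import math


def steering(n, theta, spacing=0.5):
    """Array response of an n-element ULA for angle theta (radians).

    spacing is d / lambda; half-wavelength spacing is assumed here.
    """
    return [cmath.exp(2j * math.pi * spacing * k * math.sin(theta)) for k in range(n)]


def geometric_channel(n_rx, n_tx, aoas, aods, gains):
    """H = sum_l gains[l] * a_rx(aoas[l]) * a_tx(aods[l])^H, i.e. A diag(rho) A^H."""
    h = [[0j] * n_tx for _ in range(n_rx)]
    for phi, theta, rho in zip(aoas, aods, gains):
        a_rx = steering(n_rx, phi)
        a_tx = steering(n_tx, theta)
        for r in range(n_rx):
            for c in range(n_tx):
                h[r][c] += rho * a_rx[r] * a_tx[c].conjugate()
    return h


# Hypothetical two-path BS-RIS channel with N_R = 4 and N_B = 3:
# a strong first path and a much weaker second path.
H_BR = geometric_channel(4, 3, aoas=[0.3, -0.8], aods=[0.1, 0.9], gains=[1.0, 0.1j])
```

Since the first entry of every array response vector equals one in this convention, the top-left entry of the channel matrix is simply the sum of the path gains.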
Using (1) and (4), the composite channel $\mathbf{H} \in \mathbb{C}^{N_M \times N_B}$ between the BS and MS, after taking into consideration the RIS, becomes

$$\mathbf{H} = \mathbf{H}_{R,M} \, \boldsymbol{\Omega} \, \mathbf{H}_{B,R}, \tag{5}$$

where $\boldsymbol{\Omega} \in \mathbb{C}^{N_R \times N_R}$ is the phase control matrix at the RIS. We assume that the RIS is composed of a series of discrete phase shifters. Therefore, $\boldsymbol{\Omega}$ is a diagonal matrix with a unit-modulus constraint on the diagonal entries, i.e., $[\boldsymbol{\Omega}]_{kk} = \exp(j\omega)$ with phase $\omega \in [0, 2\pi)$. In practice, the reflection of the RIS may not be perfect, so that a reflection coefficient $a \in [0, 1]$, as in $[\boldsymbol{\Omega}]_{kk} = a \exp(j\omega)$, describes the amplitude scaling and power loss [10]. We assume an ideal RIS with $a = 1$; for our focus on CE, this does not reduce the generality of the work as long as the value of $a$ is known. In this regard, the received power at the MS can be considered as a theoretical upper bound if the RIS phase control matrix is optimally designed.
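Let us define $\mathbf{G} \in \mathbb{C}^{L_{R,M} \times L_{B,R}}$ as the effective channel, taking into consideration the propagation path gains, the RIS phase control matrix, and the angular parameters associated with the RIS, i.e., $\boldsymbol{\theta}_{R,M}$ and $\boldsymbol{\phi}_{B,R}$:

$$\mathbf{G} = \mathrm{diag}(\boldsymbol{\rho}_{R,M}) \, \mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{R,M}) \, \boldsymbol{\Omega} \, \mathbf{A}(\boldsymbol{\phi}_{B,R}) \, \mathrm{diag}(\boldsymbol{\rho}_{B,R}). \tag{6}$$

Because $\mathbf{G}$ is a function of the RIS phase control matrix, the design of $\boldsymbol{\Omega}$ affects the effective channel, which in turn influences the achievable rate (i.e., capacity) of the composite channel. This underlines the significance of the RIS design and control for data communications, especially when the direct BS-MS channel is blocked. By following (6), the composite channel $\mathbf{H}$ in (5) can be further expressed as

$$\mathbf{H} = \mathbf{A}(\boldsymbol{\phi}_{R,M}) \, \mathbf{G} \, \mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{B,R}). \tag{7}$$

Remark 1.
The composite channel matrix $\mathbf{H}$ in (7) is similar to a P2P mmWave MIMO channel. However, a difference exists. For the P2P mmWave MIMO channel, $\mathbf{G}$ is a diagonal matrix, like $\mathrm{diag}(\boldsymbol{\rho}_{B,R})$ in (1) and $\mathrm{diag}(\boldsymbol{\rho}_{R,M})$ in (4), while for the RIS-aided MIMO channel, $\mathbf{G}$ is in general a full matrix. In addition, the effective channel matrix $\mathbf{G}$ needs to be optimized via controlling the RIS phase shifters in order to exploit the full potential of introducing the RIS.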
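The relation between the cascaded form (5) and the factored form (7) can be checked numerically. The sketch below, written under the same assumed ULA convention (half-wavelength spacing, unnormalized array response) and with hypothetical angles, gains, and RIS phases, builds both forms and compares them.

```python
import cmath
import math


def steering(n, theta, spacing=0.5):
    """ULA array response for angle theta (radians); spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * math.sin(theta)) for k in range(n)]


def array_matrix(n, angles):
    """n x L matrix whose l-th column is steering(n, angles[l])."""
    cols = [steering(n, a) for a in angles]
    return [[col[r] for col in cols] for r in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def herm(a):
    """Conjugate transpose."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def diag(v):
    return [[v[i] if i == j else 0j for j in range(len(v))] for i in range(len(v))]


def chain(*mats):
    """Product of several matrices, left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return out


# Hypothetical small setup: N_B = 3, N_R = 4, N_M = 2, L_BR = 3, L_RM = 2.
N_B, N_R, N_M = 3, 4, 2
theta_BR, phi_BR, rho_BR = [0.1, 0.9, -0.4], [0.3, -0.8, 0.6], [1.0, 0.2j, -0.1]
theta_RM, phi_RM, rho_RM = [0.5, -0.2], [-0.7, 0.4], [0.8, 0.1 + 0.1j]
Omega = diag([cmath.exp(1j * w) for w in (0.0, 1.0, 2.0, 3.0)])

# Cascaded form: H = H_RM * Omega * H_BR.
H_BR = chain(array_matrix(N_R, phi_BR), diag(rho_BR), herm(array_matrix(N_B, theta_BR)))
H_RM = chain(array_matrix(N_M, phi_RM), diag(rho_RM), herm(array_matrix(N_R, theta_RM)))
H_cascade = chain(H_RM, Omega, H_BR)

# Factored form: H = A(phi_RM) * G * A(theta_BR)^H with the effective channel G.
G = chain(diag(rho_RM), herm(array_matrix(N_R, theta_RM)), Omega,
          array_matrix(N_R, phi_BR), diag(rho_BR))
H_factored = chain(array_matrix(N_M, phi_RM), G, herm(array_matrix(N_B, theta_BR)))

max_err = max(abs(H_cascade[r][c] - H_factored[r][c])
              for r in range(N_M) for c in range(N_B))
```

In this example G has size 2 x 3 and is a full matrix, which is the difference with respect to the P2P case highlighted in Remark 1.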

Remark 2.
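In the first CE stage, we estimate $\boldsymbol{\phi}_{R,M}$ and $\boldsymbol{\theta}_{B,R}$ with randomly generated training sequences. In the second CE stage, we estimate the remaining channel parameters, i.e., $\boldsymbol{\rho}_{R,M}$, $\boldsymbol{\rho}_{B,R}$, $\boldsymbol{\theta}_{R,M}$, and $\boldsymbol{\phi}_{B,R}$, based on the estimates from the first stage. Due to the coupling effect in (6), these parameters cannot be estimated separately in the second stage, as detailed in Section IV.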

III. SOUNDING PROCEDURE
We also assume that the wireless channels undergo quasi-static block fading. That is, the channel parameters remain unchanged during a certain period of time, known as the coherence time. For the sounding process, one coherence time interval is divided into two subintervals, the first one for CE and the second for data transmission (DT), as depicted in Fig. 2. The CE subinterval is further divided into $T + 1$ blocks. In each block, a different $\boldsymbol{\Omega}$ is taken into consideration, i.e., $\boldsymbol{\Omega}_t$ in block $t$, for $t = 0, 1, \cdots, T$.

A. Stage 1 Sounding
In the first block of the CE subinterval, i.e., $t = 0$, the BS sends a (random) training matrix $\mathbf{X}_0 \in \mathbb{C}^{N_B \times N_0}$, and the MS applies a combining matrix $\mathbf{W}_0 \in \mathbb{C}^{N_M \times M_0}$. As in mmWave MIMO systems, the BS and MS are commonly assumed to possess a hybrid analog-digital precoding architecture with a limited number of radio frequency (RF) chains for the sake of reduced complexity, cost, and power consumption [2], [3], [29], [34]. We follow the same hybrid architecture in this paper. Therefore, at the MS, we can only access an at most $N_{\mathrm{RF}}$-dimensional signal vector per symbol time, with $N_{\mathrm{RF}}$ being the number of RF chains at the MS. In other words, the combining matrix at the MS can be as large as $N_M \times N_{\mathrm{RF}}$ per symbol duration. Meanwhile, at the BS, we can only explore one beam (i.e., one column vector of transmitted signals in $\mathbf{X}_0$) per symbol duration regardless of the number of RF chains at the BS [2], [34]. When $N_{\mathrm{RF}} < M_0$, each training beam from $\mathbf{X}_0$ needs to be sent $\lceil M_0 / N_{\mathrm{RF}} \rceil$ times. Thus, the training overhead in the first stage is $N_0 \lceil M_0 / N_{\mathrm{RF}} \rceil$.

B. Stage 2 Sounding
Based on the received signal $\mathbf{Y}_0$, we resort to atomic norm minimization to recover the angular parameters $\boldsymbol{\theta}_{B,R}$ and $\boldsymbol{\phi}_{R,M}$, which guide the design of the subsequent training matrices $\{\mathbf{X}_1, \cdots, \mathbf{X}_T\}$ and combining matrices $\{\mathbf{W}_1, \cdots, \mathbf{W}_T\}$. To simplify the design, we fix $\mathbf{X}_1 = \cdots = \mathbf{X}_T$ and $\mathbf{W}_1 = \cdots = \mathbf{W}_T$, and obtain the received signals as $\{\mathbf{Y}_1, \cdots, \mathbf{Y}_T\}$ (see Footnote 5). We intentionally choose $N_0 \gg L_{B,R}$ and $M_0 \gg L_{R,M}$ in order to provide a very accurate estimate in the first stage. Therefore, the training overhead of each block $t = 1, \cdots, T$ can be greatly reduced compared to that of the first block. The overall training overhead in the second stage is $T L_{B,R} \lceil L_{R,M} / N_{\mathrm{RF}} \rceil$. Based on $\{\mathbf{Y}_1, \cdots, \mathbf{Y}_T\}$, atomic norm minimization is further applied to estimate the remaining channel parameters, as detailed below.
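The training overhead of the two sounding stages can be tallied with a few lines of code. The sketch below assumes that the number of repetitions per training beam is the ceiling of the ratio between the number of combining vectors and the number of RF chains; the function names and the values of N_0, M_0, T, L_BR, and L_RM are hypothetical and chosen only for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def stage1_overhead(n0, m0, n_rf):
    """N_0 training beams, each repeated ceil(M_0 / N_RF) times."""
    return n0 * ceil_div(m0, n_rf)


def stage2_overhead(t_blocks, l_br, l_rm, n_rf):
    """T blocks, each with L_BR beams repeated ceil(L_RM / N_RF) times."""
    return t_blocks * l_br * ceil_div(l_rm, n_rf)


# Hypothetical values: N_0 = M_0 = 16, N_RF = 8, T = 10, L_BR = L_RM = 3.
first = stage1_overhead(n0=16, m0=16, n_rf=8)
second = stage2_overhead(t_blocks=10, l_br=3, l_rm=3, n_rf=8)
total = first + second
```

With these numbers the first block dominates the overhead per block, which reflects the design choice of spending more symbols on the first stage.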

C. Observation Model
The received signals for all the blocks are summarized as

$$\mathbf{Y}_t = \mathbf{W}_t^{\mathsf{H}} \, \mathbf{H}(\boldsymbol{\Omega}_t) \, \mathbf{X}_t + \mathbf{W}_t^{\mathsf{H}} \mathbf{Z}_t, \quad t = 0, 1, \cdots, T, \tag{8}$$

where we write $\mathbf{H}$ explicitly as a function of $\boldsymbol{\Omega}_t$, and each entry of the additive white Gaussian noise (AWGN) matrix $\mathbf{Z}_t$ follows $\mathcal{CN}(0, \sigma^2)$.

Footnote 5: In principle, we can refine the training and combining matrices at block $t$ based on the received signals up to block $t - 1$. However, this would increase the computational complexity of the proposed CE algorithm. Also, we intentionally use more time slots in the first block of the CE subinterval in order to obtain super-resolution estimates of the channel parameters in the first stage. Therefore, the room for gradual improvement is rather limited.

IV. TWO-STAGE CE APPROACH
Before moving to the details of the two-stage CE approach, we briefly review the atomic set, the atomic norm, and the atomic norm minimization.

A. Atomic Norm Minimization
Unlike the conventional greedy CS approaches, e.g., OMP, atomic norm minimization is based on an infinite set of atoms and is solved by resorting to convex optimization tools [31], [35]. Atomic norm minimization can therefore address the basis mismatch problem, which is well known in CS approaches based on finite-size dictionaries. Depending on the signals to be recovered, an atomic set is formed from atoms that have the same dimension as the desired signals [31], [35].
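The basis mismatch problem mentioned above can be made concrete with a small numerical sketch. It assumes a finite dictionary of array response vectors on a uniform grid of 2N directional sines (the grid in the sine domain, the half-wavelength spacing, and the function names are assumptions of this example) and measures how well the best grid atom matches an on-grid and an off-grid direction.

```python
import cmath
import math


def steering_from_sine(n, s, spacing=0.5):
    """ULA array response for directional sine s; spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * s) for k in range(n)]


def best_grid_correlation(n, s_true, grid_size):
    """Largest normalized correlation between alpha(s_true) and any grid atom."""
    target = steering_from_sine(n, s_true)
    best = 0.0
    for g in range(grid_size):
        s_grid = -1.0 + 2.0 * g / grid_size
        atom = steering_from_sine(n, s_grid)
        corr = abs(sum(a.conjugate() * t for a, t in zip(atom, target))) / n
        best = max(best, corr)
    return best


N = 16
# Direction exactly on grid point 5 of a 2N-point grid.
on_grid = best_grid_correlation(N, -1.0 + 2.0 * 5 / (2 * N), 2 * N)
# Direction halfway between grid points 5 and 6.
off_grid = best_grid_correlation(N, -1.0 + 2.0 * 5.5 / (2 * N), 2 * N)
```

The on-grid direction is matched perfectly, whereas the off-grid direction loses roughly ten percent of correlation with its closest atom. A grid-free formulation such as atomic norm minimization avoids this loss by construction.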
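1) 1D Signal: As in direction of arrival (DoA) estimation or line spectral estimation problems [31], [36], the one-dimensional (1D) signal to be recovered is in the form of $\boldsymbol{\alpha}(\theta) \in \mathbb{C}^{N_u \times 1}$. Therefore, the atomic set $\mathcal{A}$ in (9) collects the vectors $\boldsymbol{\alpha}(\theta)$ over the continuum of $\theta$, so that the cardinality of $\mathcal{A}$ is infinite, i.e., $\mathrm{card}(\mathcal{A}) = +\infty$. For any signal with the same dimension as the atoms, e.g., $\mathbf{u} \in \mathbb{C}^{N_u \times 1}$, its atomic norm with respect to $\mathcal{A}$ in (9) is defined as

$$\|\mathbf{u}\|_{\mathcal{A}} = \inf\{ t > 0 : \mathbf{u} \in t \, \mathrm{conv}(\mathcal{A}) \},$$

where $\mathrm{conv}(\mathcal{A})$ is the convex hull of $\mathcal{A}$, and $\mathbf{u} = \mathbf{A}_u \boldsymbol{\beta}$ falls into the SMV model with $\mathbf{A}_u = [\boldsymbol{\alpha}(\theta_{1,1}), \boldsymbol{\alpha}(\theta_{1,2}), \cdots]$ and $\boldsymbol{\beta} = [\beta_1, \beta_2, \cdots]^{\mathsf{T}}$.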
2) 2D Signal: For a two-dimensional signal, one valid matrix atomic set can be defined as [8]

$$\mathcal{A}_M = \{ \boldsymbol{\alpha}(\theta_1) \mathbf{c}^{\mathsf{T}} : \|\mathbf{c}\|_2 = 1 \}, \tag{12}$$

with $\theta_1$ ranging over the continuum of angles. We intentionally introduce such an atomic set, since it will be used in the first stage of the proposed two-stage CE scheme. Other types of matrix atomic sets also exist in the literature, depending on the structure of the original signal to be recovered. Each atom in the set $\mathcal{A}_M$ is a rank-1 matrix, and the size of the atomic set is also infinite due to the continuum of $\theta_1$.
For any matrix $\mathbf{U} \in \mathbb{C}^{N_U \times M_U}$ with the same dimension as $\boldsymbol{\alpha}(\theta_1) \mathbf{c}^{\mathsf{T}}$, its atomic norm with respect to $\mathcal{A}_M$ in (12) is defined through the convex hull of $\mathcal{A}_M$, in the same manner as in the 1D case. This atomic norm is equivalent to the solution of an SDP, as in [35]. Similar to other CS methods, the goal of atomic norm minimization is to find the sparsest representation of $\mathbf{u}$ or $\mathbf{U}$, i.e., the one with the least number of atoms from the predefined atomic set [35].

B. First Stage of Channel Estimation Algorithm
The CE problem in the first stage falls into the category of two decoupled 2D signal (with a MMV model) recovery subproblems.
1) Estimation of $\boldsymbol{\phi}_{R,M}$: The estimation of $\boldsymbol{\phi}_{R,M}$ based on $\mathbf{Y}_0$ in the first stage can be formulated as a regularized denoising problem, which can be further expressed as the SDP in (16), where $\mu$ is a regularization parameter controlling the trade-off between sparsity and data fitting, set as $\mu \propto \sigma^2 N_M \log(N_M)$ [32]. We assume that the number of (significant) paths is known as prior information. In practice, this number can be identified, for example, either by long-term site-specific measurements or by CS based support recovery algorithms. The recovery of $\boldsymbol{\phi}_{R,M}$ is then based on the solution $\mathrm{Toep}(\hat{\mathbf{u}}_1)$ from (16), using a root finding approach or other related approaches, e.g., the classical multiple signal classification (MUSIC) and estimation of signal parameters via rotational invariance techniques (ESPRIT) [37], [38].
2) Estimation of $\boldsymbol{\theta}_{B,R}$: Similarly, based on $\mathbf{Y}_0^{\mathsf{H}}$, we can recover $\boldsymbol{\theta}_{B,R}$ by addressing an analogous convex problem, in which $\eta$ is a regularization parameter controlling the trade-off between sparsity and data fitting, set as $\eta \propto \sigma^2 N_B \log(N_B)$ [32]. It can be further expressed as the SDP in (18). The recovery of $\boldsymbol{\theta}_{B,R}$ is likewise based on the solution $\mathrm{Toep}(\hat{\mathbf{u}}_1)$ from (18), using a root finding approach or other related approaches.

C. Second Stage of Channel Estimation Algorithm
In the second stage, we first design training and receive beams, which leads to a simplified approximate observation model. From this model, we can determine $L_{B,R} L_{R,M}$ separate observations and apply SMV atomic norm minimization to each of them. These steps are now detailed.

1) Training and Receive Beams:
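After the estimation of $\boldsymbol{\theta}_{B,R}$ and $\boldsymbol{\phi}_{R,M}$, we align the training beams at the BS and the receiving beams at the MS with these angles. Namely, we design $\mathbf{X}_t$ and $\mathbf{W}_t$, for $t = 1, \cdots, T$, from the array response matrices evaluated at the estimated angles, $\mathbf{A}(\hat{\boldsymbol{\theta}}_{B,R})$ and $\mathbf{A}(\hat{\boldsymbol{\phi}}_{R,M})$, respectively, as in (19).

2) Simplified Observation Model:
Assuming we have a very accurate estimate in the first stage, i.e., $\hat{\boldsymbol{\theta}}_{B,R} \approx \boldsymbol{\theta}_{B,R}$ and $\hat{\boldsymbol{\phi}}_{R,M} \approx \boldsymbol{\phi}_{R,M}$, the products $\mathbf{W}_t^{\mathsf{H}} \mathbf{A}(\boldsymbol{\phi}_{R,M})$ and $\mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{B,R}) \mathbf{X}_t$ are approximately scaled identity matrices, as in (20), under the condition of sufficient separation of the angles and a large number of antennas at both the BS and MS. In practice, the estimation performance depends on the SNR level, the number of training sequences used in the first stage, and the size of the combining matrix in the first stage. Super-resolution estimation is possible under appropriate SNR conditions and for appropriate values of the aforementioned parameters. In general, the estimation in the first stage loses the order information of the entries in $\boldsymbol{\theta}_{B,R}$ and $\boldsymbol{\phi}_{R,M}$. Therefore, the products may not be scaled identity matrices as in (20) but scaled elementary matrices. This does not affect the parameter estimation in the second stage, as explained in the sequel.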
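The approximation behind (20) rests on array response vectors of well-separated angles being nearly orthogonal for large arrays. A minimal numerical check, assuming half-wavelength spacing, an unnormalized array response, and hypothetical directional sines separated by more than 4/N, is sketched below.

```python
import cmath
import math


def steering_from_sine(n, s, spacing=0.5):
    """ULA array response for directional sine s; spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * s) for k in range(n)]


def normalized_gram(n, sines):
    """Matrix of inner products alpha(s_i)^H alpha(s_j) / n."""
    atoms = [steering_from_sine(n, s) for s in sines]
    return [[sum(x.conjugate() * y for x, y in zip(a, b)) / n for b in atoms]
            for a in atoms]


N = 16
sines = [-0.5, 0.0, 0.45]  # hypothetical directional sines, spaced by more than 4 / N
gram = normalized_gram(N, sines)
max_off_diagonal = max(abs(gram[i][j])
                       for i in range(3) for j in range(3) if i != j)
```

The diagonal entries of the normalized Gram matrix equal one, while the off-diagonal entries stay below 0.1 in magnitude for this configuration, so beams aligned with accurate angle estimates approximately isolate the individual paths.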
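Let us assume that the relationship in (20) holds. Then, the received signals in the second stage can be further approximated as a scaled version of $\mathbf{G}_t$ plus noise, as in (21).

3) Formulation of $L_{B,R} L_{R,M}$ Observations:
Since $\mathbf{G}_t = \mathrm{diag}(\boldsymbol{\rho}_{R,M}) \mathbf{A}^{\mathsf{H}}(\boldsymbol{\theta}_{R,M}) \boldsymbol{\Omega}_t \mathbf{A}(\boldsymbol{\phi}_{B,R}) \mathrm{diag}(\boldsymbol{\rho}_{B,R})$, the $(m, n)$th entry of $\mathbf{G}_t$ is in the form of

$$[\mathbf{G}_t]_{m,n} = [\boldsymbol{\rho}_{R,M}]_m [\boldsymbol{\rho}_{B,R}]_n \, \boldsymbol{\alpha}^{\mathsf{H}}([\boldsymbol{\theta}_{R,M}]_m) \, \boldsymbol{\Omega}_t \, \boldsymbol{\alpha}([\boldsymbol{\phi}_{B,R}]_n) = [\boldsymbol{\rho}_{R,M}]_m [\boldsymbol{\rho}_{B,R}]_n \, \boldsymbol{\omega}_t^{\mathsf{T}} \boldsymbol{\alpha}([\boldsymbol{\Delta}]_{m,n}) \tag{22}$$

for $m = 1, \cdots, L_{R,M}$, $n = 1, \cdots, L_{B,R}$, where

$$[\boldsymbol{\Delta}]_{m,n} = \arcsin\!\big(\sin([\boldsymbol{\phi}_{B,R}]_n) - \sin([\boldsymbol{\theta}_{R,M}]_m)\big) \tag{23}$$

is the angle difference matrix associated with the RIS and $\boldsymbol{\omega}_t \in \mathbb{C}^{N_R \times 1}$ is the vector composed of the diagonal elements of $\boldsymbol{\Omega}_t$, i.e., $\boldsymbol{\Omega}_t = \mathrm{diag}(\boldsymbol{\omega}_t)$.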
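By setting $\mathbf{g}_t = \mathrm{vec}(\mathbf{G}_t)$, the $i$th element of $\mathbf{g}_t$ is of the form

$$[\mathbf{g}_t]_i = \rho_i \, \boldsymbol{\omega}_t^{\mathsf{T}} \boldsymbol{\alpha}(\bar{\theta}_i), \tag{24}$$

where the indices of the corresponding entry of $\mathbf{G}_t$ are obtained from $i$ with the help of the modulo operation %. In other words, the product of propagation path gains $\rho_i$ is taken from the entries of the vector $\boldsymbol{\rho} = \boldsymbol{\rho}_{R,M} \otimes \boldsymbol{\rho}_{B,R}$, and $\bar{\theta}_i$ is taken from the set of angle differences in $\boldsymbol{\Delta}$. Stacking the observations over the $T$ blocks yields the model in (26), where $\bar{\boldsymbol{\Omega}} = [\boldsymbol{\omega}_1, \cdots, \boldsymbol{\omega}_T]^{\mathsf{T}}$ acts as the linear measurement matrix and $\mathbf{z}_i$ is the additive noise, given by the $i$th row of $[\mathrm{vec}(\mathbf{W}_1^{\mathsf{H}} \mathbf{Z}_1), \cdots, \mathrm{vec}(\mathbf{W}_T^{\mathsf{H}} \mathbf{Z}_T)]$.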
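The fact that each entry of the effective channel depends on the RIS-related angles only through a difference of directional sines can be verified numerically. The sketch below works directly in the sine domain, assumes half-wavelength spacing and a column-major vec operation, and uses hypothetical gains, angles, and RIS phases; the function names are illustrative.

```python
import cmath
import math


def steering_from_sine(n, s, spacing=0.5):
    """ULA array response for directional sine s; spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * s) for k in range(n)]


def entry_direct(rho_rm, rho_br, sin_theta_rm, sin_phi_br, omega):
    """rho_rm * rho_br * alpha(theta)^H diag(omega) alpha(phi)."""
    n = len(omega)
    a_theta = steering_from_sine(n, sin_theta_rm)
    a_phi = steering_from_sine(n, sin_phi_br)
    return rho_rm * rho_br * sum(a.conjugate() * w * b
                                 for a, w, b in zip(a_theta, omega, a_phi))


def entry_via_difference(rho_rm, rho_br, sin_theta_rm, sin_phi_br, omega):
    """Same entry written as (gain product) * omega^T alpha(sine difference)."""
    a_diff = steering_from_sine(len(omega), sin_phi_br - sin_theta_rm)
    return rho_rm * rho_br * sum(w * a for w, a in zip(omega, a_diff))


def vec_index_to_pair(i, l_rm):
    """Map the 1-based index i of vec(G) to 1-based (m, n), column-major vec assumed."""
    return (i - 1) % l_rm + 1, (i - 1) // l_rm + 1


omega = [cmath.exp(1j * 0.7 * k * k) for k in range(8)]  # arbitrary unit-modulus phases
direct = entry_direct(0.8, 0.2j, -0.3, 0.5, omega)
via_diff = entry_via_difference(0.8, 0.2j, -0.3, 0.5, omega)
```

Both expressions agree up to floating-point error, which is why only the product of path gains and the angle difference, rather than the individual parameters, can be identified in the second stage.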

4) SMV Atomic Norm Minimization:
According to the formulation in (26), this leads to $L_{B,R} L_{R,M}$ sparsity-1 signal recovery problems with $\bar{\boldsymbol{\Omega}}$ being the linear measurement matrix. We can estimate $\rho_i$ and $\bar{\theta}_i$ by resorting to atomic norm minimization on the SMV model. It should be noted that we cannot estimate $\boldsymbol{\rho}_{R,M}$ and $\boldsymbol{\rho}_{B,R}$ separately due to the coupling effect, and the same applies to $\boldsymbol{\phi}_{B,R}$ and $\boldsymbol{\theta}_{R,M}$, as seen in (22) and (24).
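In the second stage, $L_{B,R} L_{R,M}$ atomic norm minimization problems are formulated as in (27), where $\mathbf{h}_i = \rho_i \boldsymbol{\alpha}(\bar{\theta}_i)$ and the regularization parameter $\nu_i$ is set as $\nu_i \propto \sigma^2 N_R \log(N_R)$. The estimate of $\bar{\theta}_i$, denoted as $\hat{\theta}_i$, relies on $\mathrm{Toep}(\mathbf{v})$ by resorting to root finding methods. The estimate of $\rho_i$ is obtained by using least squares (LS) as

$$\hat{\rho}_i = \boldsymbol{\alpha}^{\dagger}(\hat{\theta}_i) \, \hat{\mathbf{h}}_i, \tag{28}$$

where $(\cdot)^{\dagger}$ denotes the Moore-Penrose pseudo-inverse and $\hat{\mathbf{h}}_i$ is the solution of (27) for $\mathbf{h}_i$.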
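For a single column vector, the pseudo-inverse in the LS step reduces to a normalized inner product. The following sketch illustrates this for a noiseless recovered vector; the function names, the half-wavelength spacing, the sine-domain parametrization, and the numerical values are assumptions of the example.

```python
import cmath
import math


def steering_from_sine(n, s, spacing=0.5):
    """ULA array response for directional sine s; spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * s) for k in range(n)]


def ls_gain(h_hat, sine_hat):
    """LS estimate of rho in h = rho * alpha: alpha^H h / (alpha^H alpha)."""
    a = steering_from_sine(len(h_hat), sine_hat)
    numerator = sum(x.conjugate() * y for x, y in zip(a, h_hat))
    denominator = sum(abs(x) ** 2 for x in a)
    return numerator / denominator


# Hypothetical noiseless example with N_R = 32 RIS elements.
rho_true = 0.6 - 0.3j
h = [rho_true * x for x in steering_from_sine(32, 0.25)]
rho_est = ls_gain(h, 0.25)
```

With an exact angle estimate and no noise the gain product is recovered exactly; with an imperfect angle estimate the same expression returns the projection of the recovered vector onto the estimated atom.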
The proposed two-stage CE approach is summarized in Fig. 3.

Remark 3.
There exists a one-to-one correspondence between $\{\rho_i, \bar{\theta}_i\}$ and $[\mathbf{Y}]_{i,:}$, as depicted in (26). Based on (27), we estimate the parameter pairs $\{\rho_i, \bar{\theta}_i\}$ one by one, each from one row of $\mathbf{Y}$. The loss of order information of the entries in $\boldsymbol{\theta}_{B,R}$ and $\boldsymbol{\phi}_{R,M}$ in the first CE stage only changes the row order of $\mathbf{Y}$ accordingly. This in turn only changes the order in which the parameter pairs are estimated and does not degrade the estimation accuracy.

D. Complexity Analysis and Training Overhead
The computational complexity in the first stage depends on the size of the positive semidefinite matrix in (16) and (18)

V. RIS CONTROL AND BEAMFORMING & COMBINING DESIGN
The ultimate motivation for estimating the channel parameters discussed above is to enable coherent demodulation and to design the phase control matrix at the RIS as well as the transmit and receive beamforming vectors in order to maximize the SE.

A. Design of Ω
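The optimization criterion used here is to maximize the power of $\mathbf{G}$, defined in (6), as a function of $\boldsymbol{\Omega}$, i.e., $\|\mathbf{G}\|_F^2$, in order to maximize the effective SNR at the receiver. The optimal design of $\boldsymbol{\Omega}$ is expressed as

$$\boldsymbol{\Omega}^{\star} = \arg\max_{\boldsymbol{\Omega}} \|\mathbf{G}\|_F^2,$$

subject to the unit-modulus constraint on the diagonal entries of $\boldsymbol{\Omega}$, where $\|\mathbf{G}\|_F^2$ can be expanded in terms of the entries of $\mathbf{G}$, with steps (a) and (b) obtained by following (22) and (23), respectively, and $\boldsymbol{\omega} = \mathrm{diag}(\boldsymbol{\Omega})$.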
Therefore, the optimal $\boldsymbol{\omega}$ (denoted by $\boldsymbol{\omega}^{\star}$) is obtained based on the estimates from the second stage, i.e., $\{\hat{\rho}_i, \hat{\theta}_i\}$.
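To give some intuition for the phase design, consider the special case in which a single term of the effective channel dominates. This is a simplification made for illustration and not the general multi-path design. In that case the magnitude of the term is maximized by choosing RIS phases that are conjugate to the array response at the estimated angle difference. The sketch below assumes half-wavelength spacing, works in the directional sine domain, and uses hypothetical names and values.

```python
import cmath
import math


def steering_from_sine(n, s, spacing=0.5):
    """ULA array response for directional sine s; spacing is d / lambda."""
    return [cmath.exp(2j * math.pi * spacing * k * s) for k in range(n)]


def aligned_phases(n_r, sine_diff):
    """Unit-modulus RIS coefficients conjugate to alpha(sine_diff)."""
    return [a.conjugate() for a in steering_from_sine(n_r, sine_diff)]


def reflection_gain(omega, sine_diff):
    """|omega^T alpha(sine_diff)| for RIS coefficients omega."""
    a = steering_from_sine(len(omega), sine_diff)
    return abs(sum(w * x for w, x in zip(omega, a)))


N_R = 32
aligned = reflection_gain(aligned_phases(N_R, 0.3), 0.3)
unaligned = reflection_gain([1 + 0j] * N_R, 0.3)
```

The aligned phases attain the maximum value N_R of the reflection gain, whereas identical phases on all elements yield only a small fraction of it. With several comparable paths the terms compete for the same phases, so the design has to balance them.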

B. Beamforming at BS and Combining at MS
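The BS BF and MS combining design is based on the estimate of the composite channel after setting $\boldsymbol{\Omega} = \mathrm{diag}(\boldsymbol{\omega}^{\star})$. Following (7), the reconstructed composite channel is formulated as $\hat{\mathbf{H}} = \mathbf{A}(\hat{\boldsymbol{\phi}}_{R,M}) \, \hat{\mathbf{G}} \, \mathbf{A}^{\mathsf{H}}(\hat{\boldsymbol{\theta}}_{B,R})$ with $\hat{\mathbf{G}} = \mathrm{vec2mat}(\hat{\mathbf{g}})$, where $\hat{\mathbf{g}}$ is constructed by using $\boldsymbol{\Omega}^{\star}$ and the estimates in the second stage, i.e., $\{\hat{\rho}_i, \hat{\theta}_i\}$, and $\mathrm{vec2mat}(\cdot)$ converts a vector to a matrix with a predefined size (see Footnote 7). The SVD is further applied to $\hat{\mathbf{H}}$ as $\hat{\mathbf{H}} = \breve{\mathbf{U}} \boldsymbol{\Sigma} \mathbf{V}^{\mathsf{H}}$, and the optimal BF and combining vectors at the BS and MS are $\mathbf{f} \approx [\mathbf{V}]_{:,1}$ and $\mathbf{w} \approx [\breve{\mathbf{U}}]_{:,1}$, respectively, after taking into consideration the constraints of the hybrid precoding architecture (see Footnote 8).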

VI. PERFORMANCE EVALUATION
In this section, we demonstrate the efficiency of the proposed CE approach. We present several benchmarks, detail the simulation scenario parameters as well as performance metrics, and provide an in-depth performance analysis and discussion.

A. Benchmarks
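For the benchmark scheme, we consider the OMP based two-stage approach. In the first stage, the vectorization of $\mathbf{Y}_0$ is in the form of

$$\mathrm{vec}(\mathbf{Y}_0) = (\mathbf{X}_0^{\mathsf{T}} \otimes \mathbf{W}_0^{\mathsf{H}}) \, \bar{\mathbf{A}} \, \mathbf{g}_0 + \mathrm{vec}(\mathbf{W}_0^{\mathsf{H}} \mathbf{Z}_0), \tag{35}$$

where $\bar{\mathbf{A}} = \mathbf{A}^{*}(\boldsymbol{\theta}_{B,R}) \otimes \mathbf{A}(\boldsymbol{\phi}_{R,M})$ and $\mathbf{g}_0 = \mathrm{vec}(\mathbf{G}_0)$. The term $\bar{\mathbf{A}} \mathbf{g}_0$ in (35) can be further expressed as $\bar{\mathbf{A}} \mathbf{g}_0 = \mathbf{A}_d \tilde{\mathbf{g}}_0$, where $\mathbf{A}_d$ is an overcomplete dictionary that contains the columns of $\bar{\mathbf{A}}$ and is constructed by quantizing the angular domains of the AoD of the BS-RIS channel and the AoA of the RIS-MS channel into $2N_B$ and $2N_M$ levels, respectively. Ideally, $\tilde{\mathbf{g}}_0$ is a vector with $L_{B,R} L_{R,M}$ elements equal to those of $\mathbf{g}_0$, while the remaining elements are all zeros. In other words, $\bar{\mathbf{A}} \mathbf{g}_0$ can be sparsely represented under a certain overcomplete dictionary. $\mathbf{X}_0^{\mathsf{T}} \otimes \mathbf{W}_0^{\mathsf{H}}$ is considered as the linear measurement matrix. Therefore, the recovery of $\bar{\mathbf{A}}$ (or equivalently $\boldsymbol{\theta}_{B,R}$ and $\boldsymbol{\phi}_{R,M}$) and $\mathbf{g}_0$ can be addressed by resorting to the OMP algorithm [30]. In the second stage, the dictionary is constructed by quantizing the angular domain into $2N_R$ levels, and each atom is in the form of an array response vector. The recovery of $\{\rho_i, \bar{\theta}_i\}$ is also conducted by using OMP on (26).

Footnote 7: Here, $\mathrm{vec2mat}(\cdot)$ is the inverse operation of $\mathrm{vec}(\cdot)$. For instance, we have $\hat{\mathbf{g}} = \mathrm{vec}(\hat{\mathbf{G}})$, and conversely $\hat{\mathbf{G}} = \mathrm{vec2mat}(\hat{\mathbf{g}})$ under the condition that the size of $\hat{\mathbf{G}}$ is known.

Footnote 8: We use $\approx$ here due to the inherent hardware constraints, which may cause some gap between $\mathbf{f}$ ($\mathbf{w}$) and $[\mathbf{V}]_{:,1}$ ($[\breve{\mathbf{U}}]_{:,1}$). If no constraints exist, as in fully digital precoding systems, $=$ is used instead.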
We also consider two benchmarks under perfect CSI: (i) CSI of the individual channels is perfectly known to evaluate the average SE. This perfect CSI may be obtained by knowing the exact location information of the BS, MS, and RIS and environmental information [39]; (ii) CSI of the LoS path is perfectly known, where we align the beams with the angles related to the LoS path and evaluate the average SE bound.

B. System Parameters and Performance Metrics
The simulation parameters are set as follows: $N_B = N_M = 16$, $N_R = 32$, and $N_{\mathrm{RF}} = 8$. The angle separation in terms of directional sine is assumed to be larger than $4/N_B$, $4/N_R$, and $4/N_M$ at the BS, RIS, and MS, respectively. We assume that the propagation path gains follow $\mathcal{CN}(0, 1)$ until Section VI-C2 and that each element of $\mathbf{Z}_t$ follows $\mathcal{CN}(0, \sigma^2)$. The SNR is defined as $1/\sigma^2$, and 2000 realizations are considered for averaging. We fix the channel coherence time as 0.5 ms and further assume that the system adopts OFDM with a bandwidth of 1 GHz, a 10% cyclic prefix overhead, and an FFT size of 1024, so that one OFDM symbol duration is approximately 1 µs, implying that one coherence time contains around 500 symbol durations, i.e., $T_c = 500$.
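The symbol budget per coherence interval follows from simple arithmetic, sketched below. The sketch assumes that the sampling rate equals the 1 GHz bandwidth and that the cyclic prefix adds 10% to the useful symbol duration; these assumptions and the function names are not taken from the text.

```python
def useful_symbol_duration(fft_size, bandwidth_hz):
    """Useful OFDM symbol duration in seconds, assuming sampling rate = bandwidth."""
    return fft_size / bandwidth_hz


def symbols_per_coherence(coherence_s, symbol_s):
    """Number of symbol durations per coherence interval, rounded to nearest."""
    return round(coherence_s / symbol_s)


useful = useful_symbol_duration(1024, 1e9)   # about 1.024 microseconds
with_cp = useful * 1.10                      # about 1.126 microseconds with the prefix
rounded_count = symbols_per_coherence(0.5e-3, 1e-6)
cp_count = symbols_per_coherence(0.5e-3, with_cp)
```

Rounding the symbol duration to 1 µs gives the value of 500 symbols used in the evaluation, while accounting for the cyclic prefix under the stated assumptions gives a somewhat smaller count of about 444.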
Performance is assessed by several metrics: (i) the MSE of the estimated parameters (angles in the first CE stage, angle differences and products of propagation path gains in the second CE stage); (ii) the average effective SE bound; (iii) the average squared distance (ASD) between the designed beamformer (combiner) in Section V-B and the optimal one obtained by assuming full CSI; and (iv) the RIS gain based on the estimated parameters. The MSEs are defined separately for the angular parameter estimates and for the estimates of the products of propagation path gains. The average effective SE bound is defined for a given channel realization, where the design of $\boldsymbol{\Omega}$ was discussed in Section V-A. The ASDs of the beamformer and combiner are defined with respect to $\mathbf{f}_o$ and $\mathbf{w}_o$, which denote the optimal beamformer and combiner at the BS and MS, respectively (assuming full CSI).
Finally, the RIS gain is defined as ...

... information on the number of paths is assumed to be known precisely. The MSE performance of parameter estimation related to the strong paths outperforms that related to the weak path(s).
We now study the ASD between the designed beamformer (combiner) in Section V-B and the optimal one, designed by assuming full CSI of the individual channels. We compare the performance with partial estimation, where in stage 2 sounding only beams towards the strongest path are formed (leading to a reduced $T_t$). We also compare with the OMP-based two-stage approach. The performance is shown in Fig. 9. From the figure, we observe that partial estimation offers performance comparable to that of full estimation in the inhomogeneous paths scenario, where only one path dominates in each individual channel. The proposed scheme significantly outperforms the OMP-based counterpart in terms of ASD.

Fig. 9. Average squared distance between the designed beamformer/combiner and the optimal one for partial estimation vs. full estimation.
Full estimation, which aims at estimating all the channel parameters, even has a negative effect on the average effective SE bound compared to partial estimation, as shown in Fig. 10. This may result from the poor estimation of the products of propagation path gains related to the weak paths, which in turn leads to a poor design of the RIS phase control matrix. An initial result for perfect CSI of the LoS path (assuming that the strongest path is the LoS path with path gain following $\mathcal{CN}(0, 1)$) is obtained by aligning the beams towards the corresponding angles. As shown in Fig. 10, knowing the LoS path (e.g., from accurate location information) even brings some gains compared to the proposed scheme in the scenario of inhomogeneous paths, and offers performance similar to that of the perfect full CSI case. This motivates the application of location information (in practice imperfect) to RIS-aided mmWave MIMO systems in order to boost the CE process and the BF design.

VII. CONCLUSIONS
In the inhomogeneous paths scenario, we have evaluated the parameter estimation from a path-by-path perspective, where better performance can be achieved for the parameters related to the strong paths. The benefits brought by the availability of location information in the inhomogeneous paths scenario have also been examined.
Future studies can include the optimization of the training and combining matrices during stage 1 sounding, and the optimization of the regularization parameter to obtain a better trade-off between data fitting (i.e., the effect of the noise term) and sparsity (i.e., the prior information). In addition, the transmit powers during the entire sounding process can be optimized to improve the estimation performance. The reliance on prior information about the number of paths should be removed to make the proposed scheme practical. Some preliminary results on the benefits brought by location information on the RIS and MS are provided, and they deserve to be explored in depth under more realistic assumptions on the location awareness.