Channel Estimation and User Localization for IRS-Assisted MIMO-OFDM Systems

We consider the channel estimation problem and the channel-based wireless applications in multiple-input multiple-output orthogonal frequency division multiplexing systems assisted by intelligent reconfigurable surfaces (IRSs). To obtain the necessary channel parameters, i.e., angles, delays and gains, for environment mapping and user localization, we propose a novel twin-IRS structure consisting of two IRS planes with a relative spatial rotation. We model the training signal from the user equipment to the base station via IRSs as a third-order canonical polyadic tensor with a maximal tensor rank equal to the number of IRS unit cells. We present four designs of IRS training coefficients, i.e., random, structured, grouping and sparse patterns, and analyze the corresponding uniqueness conditions of channel estimation. We extract the cascaded channel parameters by leveraging array signal processing and atomic norm denoising techniques. Based on the characteristics of the twin-IRS structures, we formulate a nonlinear equation system to exactly recover the multipath parameters by two efficient decoupling modes. We realize environment mapping and user localization based on the estimated channel parameters. Simulation results indicate that the proposed twin-IRS structure and estimation schemes can recover the channel state information with remarkable accuracy, thereby offering a centimeter-level resolution of user positioning.

spectrum, e.g., terahertz (0.1-10 THz), has also been regarded as one of the potential development directions of the future sixth generation wireless communications [2]. The short signal wavelength enables miniaturized implementation of massive multiple-input multiple-output (MIMO) arrays, providing considerable directional beamforming gains [3]. In order to address the limited coverage and line-of-sight (LoS) obstruction of high-frequency systems, intelligent reconfigurable surfaces (IRSs) have been recently investigated to artificially establish controllable non-line-of-sight (NLoS) links [4].
IRS technologies can contribute to the full coverage and broadband connectivity of the future wireless networks [5]. An IRS, typically a programmable metasurface, is composed of a massive number of unit cells independently interacting with incident waves [6]. The reflection coefficients of IRS elements can be predefined by digital controllers to perform real-time manipulation of electromagnetic responses [7]. IRSs can help realizing a smart wireless propagation environment, providing additional degrees of freedom for transceiver design and network optimization [8]. Some works have already integrated IRSs into wireless applications, e.g., MIMO detection, non-orthogonal multiple access, radio localization and mapping, etc. [9]- [11]. Nevertheless, these services require precise knowledge of the propagation channels and multipath parameters, thereby entailing fundamental challenges to channel estimation involving fully passive IRS modules.
We will now delineate the following relevant works: An efficient scheme that developed parallel and sequential training designs was proposed in [12]. A compressed sensing (CS) method that converted the channel estimation into a sparse recovery problem was proposed in [13]. A tensor-based approach that factorized the cascaded channel by iterative decompositions was presented in [14]. A message-passing algorithm that solved a dictionary learning problem was presented in [15]. A two-timescale training protocol that individually estimated the one-hop channels was developed in [16]. A matrix-calibration scheme that developed a joint calibration and estimation algorithm was developed in [17]. Most of these works achieve separated channels from the base station (BS) or user equipment (UE) to the IRS plane, which, however, induces inherent estimation ambiguities that hinder the exact recovery of multipath parameters. These ambiguities limit the integration of IRSs into environment-dependent applications [11], which can only be removed with strong or unrealistic assumptions, e.g., normalized power, channel This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ reciprocity, quasi-static states, a priori long-term information, etc [14]- [17].
In this paper, we consider the channel estimation of IRS-assisted MIMO orthogonal frequency division multiplexing (OFDM) systems. By leveraging the concept of multi-IRS networks [18], we propose a novel twin-IRS structure composed of two IRS planes with a relative spatial rotation. This three-dimensional (3-D) structure can provide channel information along three orthogonal spatial dimensions, which enables us to accurately extract the channel parameters in 3-D propagation spaces. By leveraging the geometric relationship of cascaded parameters obtained from the twin-IRS structures, we can precisely recover the channel parameters, i.e., angles of arrival (AoAs), angles of departure (AoDs) and time delays, by solving systems of nonlinear equations. Unlike some existing works that rely on strong assumptions, e.g., BS-IRS LoS components with a priori information [19], [20], our work is able to exactly decouple the cascaded parameters of common channels, supporting precise environment mapping and localization applications. The main contributions of this paper are summarized as follows: • We model the training signal as a third-order canonical polyadic (CP) tensor, transforming the channel estimation problem into a tensor factorization problem [21], [22]. By leveraging the assignment flexibility and distribution regularity of IRS unit cells, we apply the MIMO antenna designs and array processing techniques to the IRS configuration. We introduce four training designs, i.e., random, structured, grouping and sparse patterns, as well as, the corresponding uniqueness conditions of channel estimates. The structural information artificially attached to IRS coefficients can help to efficiently reduce the training overhead. • We combine the concept of sparse/grouping arrays and atomic norm denoising techniques with the tensor operations to realize the cascaded parameter recovery [23]- [26]. Moreover, by leveraging the physical relationship of the novel twin-IRS structures, i.e., the adjacent position and relative rotation, we introduce additional information of channel paths into the phases/amplitudes of the IRS array response vectors. We further formulate a nonlinear equation system of cascaded parameters and propose two efficient decoupling modes to exactly recover the channel parameters, i.e., AoA/AoDs and time delays. Note that in most prior works (e.g. [14]- [17], [19], [20]), these parameters cannot be obtained without strong assumptions or estimation ambiguities. • By leveraging the geometric characteristics of mmWave propagation, we implement a straightforward 3-D propagation environment mapping, including scatterer positioning of NLoS paths, user localization and orientation determination, based on the recovered channel path parameters. Dedicated wideband and narrowband training schemes with single and multiple twin-IRS structures are respectively proposed. Simulation results indicate that the proposed schemes can achieve considerable channel estimation accuracy with reduced training overhead for large-scale IRSs with massive The rest of the paper is organized as follows: Section II introduces the IRS-assisted MIMO-OFDM system, as well as, the proposed twin-IRS structure. Section III presents the training pattern designs of IRS coefficients and the tensor-based channel estimation schemes. Section IV presents the recovery and decoupling procedures of cascaded channel parameters. Section V discusses the application of user localization based on the estimated channel information. Section VI presents the numerical results of the channel estimation schemes and environment mapping applications. Section VI draws the most important conclusions.
Notations: a, A, A and A denote vectors, matrices, tensors and element sets, respectively; (·) T , (·) * , (·) H and (·) † denote the transpose, conjugate, Hermitian transpose and pseudoinverse, respectively; A (N ) , A [N ] denote the columns and rows of A indexed by N , respectively; a • b and a × b denote the dot and cross products, respectively; a, A F denote the 2-norm and Frobenius-norm, respectively; ⊗, and • denote the Kronecker, Khatri-Rao and tensor outer products, respectively; Diag(a) denotes the diagonal matrix formed by a; Tr(A) denotes the matrix trace; I(n) {1, . . . , n}; r(A), k(A) denote the general rank and Kruskal-rank, respectively; 0 n , 1 n and I n denote all-zeros, all-ones vectors and identity matrices, respectively; Matr(X ; [k 1 , . . . , Tens(X; [I 1 , . . . , I N ], [k 1 , . . . , k P ], [k P +1 , . . . , k N ]) tensorizes X to X ; X × n U denotes the mode-n tensor-matrix product [27]. The symbols of important variables and parameters are summarized in Table I. II. SYSTEM MODEL We consider an IRS-assisted MIMO-OFDM system as illustrated in Fig. 1(a), where one base station (BS) with N B antennas and M B radio frequency (RF) chains communicates with one user equipment (UE) with N U antennas and M U RF chains. The system occupies K subcarriers with a central carrier frequency f c and a bandwidth f s , where K tr subcarriers are allocated for training. Multiple two-dimensional (2-D) IRSs with N I elements are employed [5]- [7]. We propose to arrange every two IRS planes closely together, constructing R twin-IRS structures as shown in Fig. 1(b). Without loss of generality, we assume that the primary IRS is configured on the yz-plane (in the local coordinate system), while the secondary IRS is rotated with horizontal and vertical angles, i.e., δ h and δ v , relative to the primary one. Note that δ h , δ v can be viewed as the yaw and pitch angles of the rotation about the zand y-axes, respectively. Other definitions of rotation, e.g., Euler angles or quaternion [28], can also be applied.
We consider an uplink communication scenario, where the direct BS-UE channel is assumed to be obstructed. 1 The channel estimation procedure occupies P tr training frames, each containing T tr training time slots. The training signal within the tth time slot of the pth frame at the kth subcarrier reflected via the primary IRS of the rth twin-IRS structure can be represented as 2 y r,k,p,t = W H r,k H BI,r,k Diag(ψ r,k,p ) ×H IU,r,k F r,k,t x r,k,t + n r,k,p,t , (1) where x r,k,t ∈ C MU and n r,k,p,t ∈ C MB denote the transmitted pilots and additive noise, respectively; F r,k,t ∈ C NU×MU denotes the precoding beamformer within the tth time slot of each frame; W r,k ∈ C NB×Mtr denotes the combining beamformer with M tr ≤ M B data streams processed in parallel; H BI,r,k ∈ C NB×NI and H IU,r,k ∈ C NI×NU denote the one-hop IRS-BS and UE-IRS channels, respectively; 1 Actually, the BS-UE channel is expected to remain static with a large coherence time. It can be efficiently removed by considering two received signals with different IRS training coefficients and mutually subtracting them. The IRS training pattern designs presented in this work are still applicable. 2 The adjacent primary/secondary planes of each twin-IRS structure are expected to share identical propagation conditions, which can be equivalently regarded as a 2N I -element 3-D IRS. The proposed training designs can be directly applied to this 3-D topology. Moreover, multiple twin-IRS structures are distributed far apart from each other, whose propagation channels tend to be weakly correlated. By leveraging the spatial orthogonality with directional beamforming, or applying interference elimination methods with orthogonal reflection coefficients [29], one can identify the signals via different twin-IRS structures. For simplicity of notation and illustration, we here only represent the signal via the primary IRS of one twin-IRS structure. ψ r,k,p ∈ C NI denotes the dynamic reflection coefficient vector during the pth training frame, which is sensitive to signal frequencies [30].
We consider IRS-assisted systems working at mmWave frequencies, where the uplink spatially sparse channels H BI,r,k and H IU,r,k via the rth primary IRS can be expressed as where L BI,r , L IU,r denote the numbers of propagation paths; α BI,r, , α IU,r, denote the random complex gains; τ BI,r, , τ IU,r, denote the time delays; φ B(U),r, , θ B(U),r, denote the horizontal and vertical AoAs at the BS (AoDs at the UE), respectively; φ D(A),r, , θ D(A),r, denote the horizontal and vertical AoDs (AoAs) at the primary IRS of the rth twin-IRS structure, respectively; F (φ, θ) denotes the angle-dependent IRS power radiation pattern defined as [31] F where q ≥ 0 is the power radiation coefficient; q = 0 corresponds to omnidirectional IRS elements; A B,r ∈ C NB×LBI,r , A U,r ∈ C NU×LIU,r concatenate the antenna steering vectors formed by {φ B,r, , θ B,r, }, {φ U,r, , θ U,r, }, respectively; A D,r ∈ C NI×LBI,r , A A,r ∈ C NI×LIU,r concatenate the IRS response vectors formed by {φ D,r, , θ D,r, }, {φ A,r, , θ A,r, }, respectively; β BI,r,k ∈ C LBI,r , β IU,r,k ∈ C LIU,r concatenate the equivalent path gains {β BI,r,k, }, {β IU,r,k, }, respectively. Note that H BI(IU),r,k may contain both the LoS and NLoS components or only one kind of them, where the Rician K-factor is denoted by K BI(IU),r . In this paper, we consider only the first-order NLoS components scattered/reflected once by the environment objects, as illustrated in Fig. 1(a). 3 The reflectors of an IRS plane form an (N h × N v )-uniform planar array (UPA) topology with N I = N h N v . The array response vector of the primary IRS can be represented in a form of a Kronecker product a a v (φ, θ) 1, e j 2πdv λc cos θ , . . . , e j 2πdv with d h(v) and λ c denoting the azimuth (elevation) inter-element spacing of IRS arrays and central carrier wavelength, respectively. The BS/UE antennas can form either a UPA or a uniform linear array (ULA) topology, whose array steering vectors can be similarly defined as in (4). Recall now that the secondary IRS is placed close to the primary one with rotation angles δ h , δ v . In the far-field scenario, the radiation pattern of the secondary IRS can be expressed as Similarly, the array response elements of the secondary IRS can be expressed as where n h(v) ∈ I(N h(v) ). Now, the uplink channel matrices via the rth secondary IRS can be represented by (2) with the radiation patterns and steering vectors of (3), (4) being replaced by those of (5), (6), respectively. One can also generate (3), (4) as F (φ, θ; 0, 0) and a I (φ, θ; 0, 0) = a h (φ, θ; 0, 0) ⊗ a v (φ, θ; 0, 0), respectively. Note that (5), (6) can be derived by multiplying [sin θ cos φ, sin θ sin φ, cos θ] T with the yaw and pitch rotation matrices defined by δ h and δ v , respectively [28].
III. TENSOR-BASED CHANNEL ESTIMATION In this section, we develop channel estimation schemes to recover the channel matrices, i.e., H BI,r,k , H IU,r,k . The IRS plane adopts dynamic reflection coefficients {ψ r,k,p } Ptr p=1 across the training frames. We concatenate the received training signals of (1) across P tr T tr training time slots in total as Y r,k [y r,k,1,1 , y r,k,1,2 , . . . , y r,k,Ptr,Ttr ] ∈ C Mtr×PtrTtr . It can be equivalently regarded as a matricization of a third-order tensor as Y r,k = Matr(Y r,k ; 1, [3,2]), where Y r,k ∈ C Mtr×Ttr×Ptr fits the CP tensor model as [21], [22] where Ψ r,k [ψ r,k,1 , . . . , ψ r,k,Ptr ] T ∈ C Ptr×NI ; F r,k [F r,k,1 x r,k,1 , . . . , F r,k,Ttr x r,k,Ttr ] ∈ C NU×Ttr ; H BI,r,k W H r,k H BI,r,k ∈ C Mtr×NI and H IU,r,k F T r,k H T IU,r,k ∈ C Ttr×NI denote the combined and precoded equivalent channels, respectively; N r,k ∈ C Mtr×Ttr×Ptr is the equivalent noise tensor; I 3,NI ∈ {0, 1} NI×NI×NI denotes a third-order identity tensor. Performing the channel estimation is now equivalent to solving the factorization problem of (7) with a maximal tensor rank min r( H BI,r,k )r( H IU,r,k )r(Ψ r,k ), N I . Since the wideband training signal {Y r,k } Ktr k=1 cannot be modeled as a fourth-order CP tensor with frequency-sensitive {Ψ r,k } Ktr k=1 , we seek to subcarrier-wisely perform the factorization of Y r,k . One can average the estimation results of frequency-flat channel parameters along multiple subcarriers to enhance the recovery accuracy.
Obviously, the training IRS coefficients significantly affect the estimation strategy and performance. First of all, we consider a training pattern Ψ r,k with full column rank, which requires the number of measurements to exceed the number of signal components, i.e., P tr ≥ N I . A cascaded beamformed channel can be naturally derived by a least squares (LS) solution as follows: The nth column of (8) can be reshaped into a rank-1 product, i.e.,h BI,r,k,nh T IU,r,k,n ∈ C Mtr×Ttr , whereh BI,r,k,n H BI,r,k :,n ,h IU,r,k,n H IU,r,k :,n . By performing rank-1 truncated singular value decomposition (tSVD) λ 1 u 1 v H 1 = h BI,r,k,nh T IU,r,k,n with the dominant singular value λ 1 , we can take √ λ 1 u 1 ∈ C Mtr and √ λ 1 v * 1 ∈ C Ttr as the estimates of h BI,r,k,n andh IU,r,k,n , respectively.
For the case of P tr < N I or Ψ † r,k Ψ r,k = I NI , however, the LS solution in (8) is invalid, which undermines the integration of large-scale IRSs with a massive number of unit cells. To address this critical problem, we present four different IRS reflection pattern designs, as well as, the corresponding uniqueness condition analysis.

A. Random Pattern
One of the simplest designs is to assign random reflection coefficients to the IRS elements [30]. For instance, we employ a training pattern with unit moduli and random phases, i.e., [Ψ r,k ] p,n = 1, ∀p, n. A uniqueness condition of factorizing (7) is presented as follows: [36,Lemma 4.3]): Consider a third-order CP tensor Y r,k ∈ C Mtr×Ttr×Ptr with H BI,r,k , H IU,r,k , Ψ r,k being the factor matrices in (7). If then, r(Y r,k ) = N I , and the factorization of Y r,k is unique up to column scaling and permutation. Furthermore, if Ψ r,k is known with k Ψ r,k = P tr ≥ 2, and then, H BI,r,k , H IU,r,k can be found algebraically up to column scaling ambiguities. If F r,k , W r,k have full column ranks with M tr ≥ L BI,r , T tr ≥ L IU,r , the beamformed channels inherit the sparse nature, i.e., r H BI,r,k ≤ r H BI,r,k ≤ L BI,r , r H IU,r,k ≤ r H IU,r,k ≤ L IU,r . According to the property of k-rank: [37], in the generic case, (10) can be relaxed to min L BI,r , L IU,r +P tr ≥ N I +2. Hence, to estimate low-rank mmWave channels with usually 3-5 paths [38], the quantity of training frames should be of the same order of magnitude as the number of IRS elements. According to [36,Lemma 4.3], the factorization of (7) can be algebraically implemented by generalized eigenvalue decomposition. In practice, when P tr min(M tr , T tr ) ≥ N I , we can adopt the bilinear alternating least squares (BALS) method [14] to iteratively update H BI,r,k , H IU,r,k .

B. Structured Pattern
If additional structural information is embedded into the IRS coefficients, one can develop a more flexible solution with a more relaxed uniqueness condition. For instance, we can design the training coefficients as [Ψ r,k ] p,n = λ p−1 n with distinct {λ n } NI n=1 , which has a column-wise phase rotational-invariant characteristic. We note that this codebook may be valid only for a particular subcarrier due to the frequency-sensitive characteristics of IRS reflectors [30]. As such, this strategy is more suitable for single-carrier training. In practice, the IRS is usually implemented by finite-resolution phase shifters to reduce the hardware complexity. Hence, one can employ a discrete Fourier transform (DFT) codebook, i.e., λ n e −j2π(n−1)/NI . In this way, the training codewords multiplex N I distinct phases uniformly spaced within [0, 2π], which can be realized by b = log 2 N I -bit phase shifters. A corresponding uniqueness condition is presented as follows: then, r(Y r,k ) = N I , and H BI,r,k , H IU,r,k can be found algebraically up to column scaling ambiguities. According to [35,Lemma 1], A B has full column rank In the generic case, (11) can be relaxed to min (P 1 − 1)L BI,r , P 2 L IU,r ≥ N I . Compared with the random pattern case, the necessary number of training frames for estimating mmWave channels with 3-5 paths is now roughly halved to about N I /2.
The decomposition of (7) can now be algebraically solved by the structured CP decomposition (SCPD) method [40], where the original eigenvalue decomposition for the generator recovery is replaced by the eigenvector derivation with a priori eigenvalues. One can adopt a set of valid candidates {P 1 , P 2 } and average the obtained results to enhance the estimation performance.

C. Grouping Pattern
We can cluster the IRS reflectors into G N I groups, where the elements of the gth group are indexed by N g : |N g | = N g . The reflectors belonging to the same group adopt identical training coefficients, i.e., [Ψ r,k ] :,n ψ g r,k , ∀n ∈ N g . The training signal (7) can be modified as where We propose to divide the IRS reflector array into latticed Fig. 2(a). The horizontal and vertical indices of the IRS entries belonging to the gth group are defined as We call a factor matrix partially unique if its columns can be partitioned into disjoint subsets and each subset is identified up to its linear span [42,Definition 3.2]. Then, a partial uniqueness condition is presented as follows: Lemma 3: Consider a third-order tensor Y r,k ∈ C Mtr×Ttr×Ptr with H BI,r,k , H IU,r,k , Ψ r,k being the factor matrices in (7). If Ψ r,k = Ψ r,k Γ r with r(Ψ † r,k ) = G and [Γ r ] g,n = 1, ∀n ∈ N g , then, H BI,r,k and H IU,r,k are partially unique.
Proof: By removing the grouping codebook Ψ r,k from (12) via the LS method, one can obtain G sets of rank-N g products as where H (Ng ) BI,r,k ∈ C Mtr×Ng and H (Ng ) IU,r,k ∈ C Ttr×Ng contain the columns of H BI,r,k and H IU,r,k allocated to the gth group, respectively. By performing rank-N g tSVD on the gth slice of (14), H (Ng) BI,r,k and H (Ng) IU,r,k are factorized uniquely up to a nonsingular transformation.
The partial uniqueness condition is generally satisfied with P tr ≥ G training frames, which is generally independent of the channel ranks. The grouping pattern (13) does not directly lead to a complete estimate of H BI(IU),r,k . By leveraging the phase shift feature of grouped subarrays, one can formulate an equivalent (G h × G v )-array. As long as min(G h , G v ) ≥ 2, we are able to acquire the cascaded channel parameters, which in turn support a unique channel separation. The detailed derivations will be provided in the next section.

D. Sparse Pattern
It can be observed from the earlier designs that the required amount of training frames is generally proportional to the number of reflectors. By leveraging the concept of sparse arrays [43], we are able to obtain the parameterized channels with fewer IRS elements, leading to reduced training overhead. Specifically, N A N I reflectors are installed regularly in the (N h ×N v )-array aperture, leaving (N I −N A ) virtual unit cells with no circuits. We denote the index set of selected unit cells by N A : |N A | = N A , and modify (7) as where Ψ IU,r,k ∈ C Ttr×NA contain the columns indexed by N A of H BI,r,k and H IU,r,k , respectively.
One can leverage sparse array topologies for the training pattern. Since the nested array geometry admits a closed-form expression for the locations of the antenna elements and offers a consecutive difference coarray with large degrees of freedom, we consider a 2-D symmetric nested array [43], as illustrated in Fig. 2(b). The training pattern consists of an (N hd × N vd )-dense subarray and an (N hs × N vs )-sparse subarray with N hd , N hs being odd values [23]. The 2-D subarray coordinates are defined as 4 where N hd , N vd , N hs , N vs ∈ Z + satisfy the geometric constraint, i.e., N hd (N hs − 1) The total number of activated IRS elements is N A = N hd N vd + uniquely up to column scaling. The necessary number of training frames is hence P tr ≥ N A , which is independent of the channel ranks. Generally, the sparsity of IRS arrays enhances the channel estimation accuracy, but may yield reduced beamforming gains or system capacity. Similar to the grouping pattern (13), the sparse pattern (16) can only return a portion of the subchannel columns. Fortunately, we are able to recover the cascaded channel parameters and reconstruct the channel matrices. The detailed derivation will be presented in the next section. Remark 1: Note that all the proposed training patterns do not place any constraint on the specific values of reflection coefficients. They can be designed as directional beams to search across the cascaded angular parameters on the IRS plane, while the BS can select the received signals with the highest power to implement the channel estimation or dynamically adjust the searching directions of the subsequent frames through the controller [19]. Moreover, apart from the structured pattern, the others can be applied to multi-subcarriers without being interfered by the frequency-sensitive nature of reflectors. Based on the uniqueness conditions and other physical constraints, we summarize the recommended training settings for IRS pattern designs in Table II.

IV. MULTIPATH PARAMETER RECOVERY
We now recall that many wireless applications, e.g., environment mapping, user positioning, mobility tracking, etc., require precise information of the propagation paths, i.e., AoAs, AoDs, and time delays. Moreover, the channel estimates with missing or partially unique columns, derived from the sparse or grouping training pattern, can be well calibrated based on the propagation parameters. Specifically, by leveraging the characteristics of H BI(IU),r,k in (2), we can equivalently represent the cascaded beamformed channel H cas,r,k in (8) where H cas,r,k = W H r,k ⊗ F T r,k H cas,r,k with H cas,r,k H BI,r,k H T IU,r,k . With a little abuse of notation, The cascaded channel parameters via the primary IRS are defined as ⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩ π h,r,12 − sin θ D,r,1 sin φ D,r,1 + sin θ A,r,2 sin φ A,r,2 , π v,r,12 − cos θ D,r,1 + cos θ A,r,2 , ρ r,k,12 β BI,r,k,1 β IU,r,k,2 .
Based on (17), (18), H cas,r,k can be reconstructed by the channel parameters {φ(θ) B,1 , φ(θ) U,2 , π h(v),r,12 , ρ r,k,12 }, and be further separated into H BI,r,k and H IU,r,k by the tSVD. Note that in the noiseless case, the separated estimates of beamformed channels can be represented as H BI,r,k Λ r,k and H IU,r,k Λ −1 r,k with a diagonal matrix Λ r,k ∈ C NI×NI containing the scaling ambiguities. Clearly, an arbitrary Λ r,k does not change the values of Y r,k or H cas,r,k , which cannot be exactly determined. Also, the row-subspace information of H BI(IU),r,k has been compromised by those uncertainties. Generally, without a priori assumptions or conditions, one can only determine the cascaded parameters in (18) but not the exact path parameters via a single IRS plane, i.e., {φ D(A),r, , θ D(A),r, , τ BI(IU),r, }. It can be physically explained by that the passive IRS cannot sense the environment, while the receiver can only process signals propagated through the entire two-hop channel. To address these issues, we propose the channel parameter recovery schemes, as well as, the cascaded parameter decoupling strategy in this section.

A. Beamforming Design
We seek to adopt special beamforming to preserve the structural information of BS/UE antenna arrays. By leveraging the concept of beamspace ESPRIT method [44], [45], we adopt a Kronecker product-formed DFT combining strategy for the UPA-sized BS antennas as follows . It can be verified that the DFT beamforming preserves the rotational-invariant feature of IRS array phases since The precoded pilot design at the UE side can be similarly derived as in (20). As long as the search region formed by the beamspace codewords covers the channel path direction formed by {φ B(U), , θ B(U), }, we are able to receive the training signal with a desired power level [44]. Furthermore, before transmitting Y r,k in (7), one can perform a coarse sensing to find appropriate {ω h(v),m } as in [45]. Matr(Y r,k ; 1, [3,2]) and Y (2) r,k Matr(Y r,k ; 2, [3, 1]), respectively. After obtaining the angular parameters at the BS/UE ends, we can compute a smaller-sized tensor signal (in the noiseless case) as The rank-deficient factor matrices of (7) are transformed into full-row-rank ones. Performing channel factorization or parameter recovery of Z r,k ∈ C LBI,r ×LIU,r×Ptr can effectively reduce the computational complexity.

B. Cascaded Parameter Recovery
For the training design of random or structured IRS reflection pattern, one can directly compute which concatenates L BI,r L IU,r parallel response vectors of {ρ r,k,12 a I (π h,r,12 , π v,r,12 )} in (17). By leveraging the 2-D uniform array geometry of a I (π h , π v ), we can easily recover the cascaded channel parameters indexed by ( 1 , 2 ) in (18) from the (( 1 − 1)L IU,r + 2 ) row of G cas,r,k . We note that π h(v),r,12 ranges across [−2, 2], which may cause a spatial aliasing problem, i.e., the extracted phase of e j 2π λc d h(v) π h(v),r, 1 2 may exhibit an uncertain bias of 2iπ, i ∈ Z. In order to achieve a unique value of π h(v),r,12 , we propose to configure the inter-element spacing of IRS arrays as d h = d v = λ c /4, yielding a maximal phase range of [−π, π]. Furthermore, by exploiting the rotational-invariant phase feature of {ρ r,k,12 } Ktr k=1 along multiple training subcarriers as ρ −1 r,k1,12 ρ r,k2,12 = e j 2πfs K (k1−k2)(τ BI,r, 1 +τ IU,r, 2 ) , (23) with k 1 , k 2 ∈ I(K tr ), we can recover the cascaded time delays, i.e., τ BI,r,1 + τ IU,r,2 . Next, we develop two dedicated channel parameter recovery solutions for the grouping and sparse training patterns respectively, as well as, a generalized solution for an arbitrary reflection pattern.

1) Grouping Scheme:
Recall that the grouping pattern in Fig. 2(a)  With the regularly grouped lattice layout, the mode-3 (tube) fibers of G cas,r,k ∈ C LBI,r ×LIU,r×G have a phase rotational-invariant characteristic as are the selection matrices that select the first and last (G h(v) − 1) entries of the fibers, respectively. Therefore, [G cas,r,k ] 1,2,: ∈ C G can be equivalently regarded as the steering vector of a (G h × G v )-UPA topology, from which we are able to derive the cascaded parameters in (18).

2) Sparse Scheme:
Recall that the sparse pattern in Fig. 2(b) activates N A out of N I reflectors. One can derive a submatrix of (22) as G Then, the following steps are performed: (i) Copy and insert data at the overlapped origin of the sparse/dense subarrays, yielding G cas,r,k T and sort its rows to fit the 2-D nested array geometry; (iii) Remove the repetitive or discontinuous rows, yielding L BI,r L IU,r response vectors of an (N hd N hs × (2N vd N vs − 1))-difference coarray; (iv) Leverage rank-1 tSVD to recover the cascaded parameters in (18). Note that these steps leverage the principle of nested array processing, but do not require any statistical information of the signal correlation.

3) ANM-SDP/ADMM Scheme:
We leverage the atomic norm denoising technique to develop a general solution of (18) for arbitrary designs of IRS training patterns [25], [26]. The ( 1 , 2 )th mode-3 fiber of Z r,k , i.e., [Z r,k ] 1,2,: , is expressed as z r,k,12 = Ψ r,k a I (π h,r,12 , π v,r,12 )ρ r,k,12 . a I (π h , π v ) can be regarded as a 2-D atom, whilst the 2-D atomic norm of arbitrary vector x ∈ C NI with respect to the atomic set is then defined as where conv(A) is the convex hull of A; T 2 (u) ∈ C NI×NI denotes a two-level block Toeplitz matrix defined by u ∈ C (2N h −1)(2Nv−1) [25], [26]. We can now individually solve L BI,r L IU,r atomic norm minimization (ANM) subproblems of (26) as min u,x,t x H t 0, 1 ∈ I(L BI,r ), 2 ∈ I(L IU,r ), (27) where μ denotes the regularization parameter; x approximates ρ r,k,12 a I (π h,r,12 , π v,r,12 ). These optimization subproblems can be solved by the semidefinite programming method (SDP). In order to improve the computational efficiency of parameter recovery for large-scale IRSs, one can adopt the alternating direction method of multipliers (ADMM). The detailed derivations of the Lagrangian function and variable update are omitted due to space limitations, and the interested readers are referred to [25], [26] and references therein. Finally, one can exploit the rotational-invariant feature of T 2 (u) to recover the cascaded parameters.

C. Twin-IRS-Assisted Parameter Decoupling
Given arbitrary pairs of (φ D,r,1 , φ A,r,2 ), (θ D,r,1 , θ A,r,2 ), (β BI,r,k,1 , β IU,r,k,1 ) that satisfy the constraints of (18), one can recover the cascaded channel H cas,r,k as in (17). However, the information of (18) is not sufficient to support an exact decoupling of the cascaded parameters. Concretely, there are 2(L BI,r + L IU,r ) AoA/AoDs at the IRS to be estimated but only 2(L BI,r + L IU,r − 1) out of 2L BI,r L IU,r effective phase equations, whilst the amplitude equations cannot be directly used due to the unknown path gains. Therefore, with a single IRS plane, one can only formulate an underdetermined nonlinear equation system of {φ A(D),r, , θ A(D),r, }, which does not have a unique solution. In order to provide sufficient information for the parameter decoupling, we propose the twin-IRS structure, enabling the nonlinear solver to converge to the precise solution with a much higher probability. Based on (5), (6), we can represent the cascaded parameters via the secondary IRS as By combining the cascaded parameters in (18), (28) and leveraging the trigonometric function theory, extra phase equations of − sin θ D,r,1 cos φ D,r,1 + sin θ A,r,2 cos φ A,r,2 , and magnitude equations of F (φ D,r, 1 ,θ D,r, 1 ;δ h ,δv)F (φ A,r, 2 ,θ A,r, 2 ;δ h ,δv) F (φ D,r, 1 ,θ D,r, 1 ;0,0)F (φ A,r, 2 ,θ A,r, 2 ;0,0) can be introduced by the secondary IRS. These additional constraints make it possible to realize the parameter decoupling. Here, we present two decoupling modes of channel parameters against different cases of system settings and channel conditions. 1) "All" Decoupling Mode: If the IRS reflectors are omnidirectional, i.e., F (φ, θ; δ h , δ v ) ≡ 1, the magnitude equations of (28) become invalid. Hence, the nonlinear solver is able to converge to the exact azimuth/elevation AoA/AoDs only when the number of effective phase equations is no less than the number of angles to be estimated, i.e., We jointly solve a nonlinear system of 3(L BI,r + L IU,r − 1) phase equations in (18), (28) involving geometric information of sin θ cos φ, sin θ sin φ, cos θ along the x, y, z-axes, respectively, to acquire precise estimates of all the 2(L BI,r + L IU,r ) angular parameters {φ(θ) D,r,1 , φ(θ) A,r,2 } at once. One can directly employ existing nonlinear algorithms, e.g., Gauss-Newton method, Levenberg-Marquardt method, trust-region (dogleg) method, etc., to efficiently solve this problem.
2) "Pair" Decoupling Mode: If there exists only one propagation path in both the BS-IRS and IRS-MS channels, i.e., L BI,r + L IU,r = 2, one cannot just rely on three phase equations to obtain unique estimates of four angles. 5 We need to recover four angular parameters {φ D(A),1 , θ D(A),1 } by jointly solving four nonlinear magnitude/phase equations of (18), (28). Note that this approach can also be applied to the case (29), returning the angles of a pair of paths at a time. One needs to average the recovery results of each angle to obtain 2(L BI,r + L IU,r ) distinct parameters from in total 4L BI,r L IU,r estimates.
After obtaining the angles and the cascaded time delays, we can remove their contributions from the equivalent path gains in (18) or (28) to derive α BI,r,k α IU,r,k . Due to the randomly distributed complex gains including both the large-scale fading and the small-scale fading, it is usually difficult to decouple the path gains, as well as, the time delays solely based on the training signals. Fortunately, we can leverage the 3-D geometric relationship of communication devices to realize the environment mapping, determining the precise propagation distances. The detailed derivations are provided in the next section.

V. CHANNEL PARAMETER-BASED USER LOCALIZATION
After acquiring precise estimates of the channel parameters via the decoupling mode described in Section IV. C. 1) or 2), we can combine the information of propagation channels and device physical configurations to realize an application of user localization. The BS and UE are located at p B , p U ∈ R 3 , respectively, whose orientations, i.e., the normals to the antenna array planes, are defined as n B , n U ∈ R 3 , respectively. The adjacent primary/secondary IRSs of the rth twin-IRS structure are assumed to share identical coordinates p r ∈ R 3 , while the normal direction to the primary IRS is n r ∈ R 3 . The scatterers of the 1 th NLoS path of H BI,r,k and 5 If there exists a LoS path with a large Rician K-factor, the other NLoS components are much weaker and may be regarded as environmental noise. Then, the channel matrix H BI(IU),r,k has approximately rank-1 with L BI(IU),r ≈ 1. the 2 th NLoS path of H IU,r,k are located at s BI,r,1 ∈ R 3 and s IU,r,2 ∈ R 3 , respectively.

A. Preliminary Environment Mapping
The positioning performance depends on the environment mapping that determines the distribution of scatterers and propagation paths. Note that the AoA/AoDs at the IRS are defined relative to the primary IRS on the yz-plane with a default normal n 0 [1, 0, 0] T , as illustrated in Fig. 1(b). We parameterize the rotation of an IRS plane with a normal n r ∈ R 3 from the default orientation by the rotation axis and rotation angle, which are defined as 6 The actual direction of a signal wave parameterized by azimuth/elevation angles {φ, θ}, or equivalently a direction d 0 [sin θ cos φ, sin θ sin φ, cos θ] T , relative to the primary IRS can be derived by the Rodrigues' rotation formula as [46] d r = d 0 cos ξ r +(c r × d 0 ) sin ξ r +c r (c r • d 0 )(1−cos ξ r ).

(31)
Given the angular estimates and known device orientations, we can derive the actual path directions at the BS and IRSs, denoted by d B,r,1 , d D,r,1 , d A,r,2 ∈ R 3 , ∀ 1 ∈ I(L BI,r ), 2 ∈ I(L IU,r ), respectively. For a LoS path, d B,r,1 , d D,r,1 should be (approximately) collinear.
Recall that only the NLoS paths scattered or reflected once are considered. The scatterer located at s BI,r,1 ∈ R 3 is the intersection point of two spatial lines passing through p B and p r with directions d B,r,1 and d D,r,1 , respectively. In the noisy case, these estimated spatial lines may not exactly intersect. As illustrated in Fig. 3(a), the scatterer can be approximated by the median point on the common perpendicular of the two spatial lines as Note that if the BS antennas are ULA-shaped, which are commonly assumed to be parallel with the xy-plane, only the azimuth angle can be estimated (elevation angle is π/2 by default). In this case, as illustrated in Fig. 3(b), the scatterer can be derived as the intersection point of a spatial line passing through p r with a direction d D,r,1 and a spatial plane passing through p B with a normal n B,r,1 ⊥ d B,r,1 as [47] [s BI,r, where i ∈ {1, 2, 3}. Once the scatterer position is determined, the propagation distance, as well as, the path time delay from the IRS to the BS can be directly computed as where v c = 3 × 10 8 m/s is the light velocity. One can now easily derive the UE-IRS path distance by combining (34) and the estimated cascaded time delay, i.e., τ BI,r,1 + τ IU,r,2 , from (18), (28). Without the information of the user coordinates, it is difficult to realize a unique environment mapping between the IRS and UE. Fortunately, after acquiring the UE position and orientation based on the LoS component of H IU,r,k , we can finally determine the scattering details.

B. User Localization Implementation
We present two dedicated user localization schemes for the single-carrier and multi-carrier training strategies, respectively. When the UE-IRS channel contains a LoS path with a relatively large Rician K-factor, it can be detected and distinguished by its dominant complex gains. 7 1) Single-Carrier Strategy: In the single-carrier strategy, the system activates a single subcarrier to perform channel estimation. With only one estimate of β BI,r,k,1 β IU,r,k,2 , we cannot recover the cascaded time delays τ BI,r,1 + τ IU,r,2 , as well as, the LoS path distance of H IU,r,k . We propose to localize the UE by an AoA-based positioning scheme with R = 2 distributed twin-IRS structures. The user position can be derived as the intersection point of two spatial lines passing through p 1 and p 2 with directions d A,1,1 and d A,2,1 , respectively, which can be approximated by following (32) as 2) Multi-Carrier Strategy: In the multi-carrier strategy, we are able to precisely determine the LoS path distance of H IU,r,k by realizing the scatterer mapping. We propose to localize the UE by a hybrid AoA-delay positioning scheme with R = 1 twin-IRS structure. The user position can be Algorithm 1 Channel Estimation and User Localization Require: observation Y r,k , IRS pattern Ψ r,k , beamformers F r,k , W r,k , channel ranks L BI,r , L IU,r . 1: procedure CH-EST 2: Derive Y (1) r,k , Y (2) r,k by unfolding Y r,k as in Remark 2.
, α BI(IU),r, follows a normal distribution CN (0, 1). 8 The received signal-to-noise ratio (SNR) is defined as Y r,k − N r,k 2 F /N r,k 2 F . The simulation of channel estimation and parameter recovery considers a full-NLoS propagation; the simulation of environment mapping and user localization assumes that H IU,r,k contains a LoS path, while H BI,r,k consists of only NLoS components. Apart from the narrowband positioning that requires two twin-IRS structures, most simulations are realized with a single twin-IRS structure. The two twin-IRS pairs are located at p 1 = [0, 0, 0] T , p 2 = [8, 0, 0] T with orientations n 1 = [1, 0, 0] T , n 2 = [−1, 0, 0] T , respectively. We leverage the first one to perform the channel estimation and parameter recovery by default. The proposed channel estimation schemes, i.e., the sparse, grouping and ANM-SDP/ADMM schemes described in Section IV. B. 1), 2) and 3), respectively, are compared with the OMP [13], BALS [14] and SCPD [40] schemes. 9 Table III tabulates the estimation performance of BS/UE angular parameters and normalized signal power with random training pattern for different beamforming schemes, where N h = N v = 4 and q = 1.5. 10 It shows that the beamspace beamforming (19) achieves the best recovery accuracy and highest received signal strength when the search regions generally cover the AoA/AoDs at the BS/UE. The baseline scheme achieves slightly better performance than the uniform DFT scheme but yields much lower effective signal power than the optimized DFT scheme, leading to higher transmitted power for a desired received SNR. Hereinafter, we introduce a coarse estimation scheme with one extra frame, and use the obtained estimates to construct the beamformers F r,k , W r,k , as well as, the training signal tensor Y r,k . 8 One can leverage the principal component analysis or minimum length description to estimate the number of paths [40]. For simplifying the evaluation of the parameter recovery performance, we assume that L BI(IU),r is a priori known or perfectly estimated. 9 The maximum iterations of BALS is 20; the angular resolution of OMP is 2π 4N h × 2π 4Nv ; the maximum iterations and penalty parameter of ANM-ADMM are 20 and 0.1, respectively; the ANM-SDP is handled by the CVX toolbox [50]. 10 "acc" is defined as the rooted mean square error (RMSE) (degrees): ÈL BI(IU),r

=1
(φ B(U),r, − φ B(U),r, ) 2 /L BI(IU),r ¡ 1/2 ; "pow" is defined as the normalized signal power: Y r,k − N r,k F / √ MtrTtrPtr; the resolution of OMP is 2π/4N B(U) ; "Coarse" applies the coarse estimation scheme [45] within a single frame; "Baseline" applies the coarse estimation scheme [45] across (Ptr + 1) frames; "Uniform DFT" applies the DFT beamforming (19) across (Ptr +1) frames with uniformly-spaced {ω h(v),m }; "Optimized DFT" applies the DFT beamforming (19) across Ptr frames with {ω h(v),m } closest to the coarse estimates.   (18) and normalized mean square error (NMSE) of the cascaded channel with different configurations of the sparse and grouping patterns. 11 It shows that when the quantity of training frames is sufficient, the estimation accuracy is mainly determined by the degrees of freedom of the difference coarray and equivalent array of the sparse and grouping patterns, respectively. Hereinafter, we adopt the best pattern designs marked in bold in Table IV to evaluate the channel estimation performance.
Figs. 4 and 5 plot the RMSEs of cascaded phase/amplitude parameters (18) and normalized mean square error (NMSE) of the cascaded channel versus the received SNR with N I = 16 and 64, respectively. It shows that the OMP returns SNR-insensitive results due to its fixed angular resolution. The grouping scheme has relatively worse performance with the lowest complexity. The BALS and SCPD yield deteriorated performance when the quantity of training frames fails to keep pace with the increasing number of IRS unit cells. Moreover, the ANM-SPD achieves the best estimation 11 The RMSE of a parameter x 1 2 is defined as  accuracy with the highest complexity. The sparse scheme and ANM-ADMM can effectively acquire RMSE/NMSEs close to those of ANM-SDP with reduced computational burden, especially in the case of small-scale IRSs with sufficient training measurements. Figs. 6 and 7 plot the estimation performance curves of the cascaded parameters and channels versus the number of training frames with N I = 16 and 64, respectively. It shows that as P tr increases, the recovery accuracy of the BALS rapidly improves, while the SCPD is working when the uniqueness condition (10) holds. Moreover, the performance of the ANM-ADMM gradually approximates to that of ANM-SDP. By contrast, the results of the sparse, grouping and ANM-SDP schemes are relatively less sensitive to the quantity of measurements, which originates from the robustness of LS and SDP operations. One can observe from Figs. 4-7 that the sparse scheme and ANM-SDP can achieve better performance than other counterparts. This can be attributed to the fact that the former effectively reduces the number of subchannels to be decomposed and reconstructed, while the latter leverages the optimization approaches with dominant computational complexity. Hereinafter, we combine these two methods with the proposed twin-IRS structure to implement the decoupling of channel multipath parameters.
Figs. 8 and 9 plot the RMSEs of the decoupled angular parameters versus the relative rotation angle and the power radiation coefficient of twin-IRS structures, respectively. The ANM-SDP and sparse schemes work with N I = 16, P tr = 14 and N I = 64, P tr = 48, respectively. The figures indicate that the proposed twin-IRS structure enables us to precisely recover the angular parameters at the IRS end with a resolution less than 0.1 • . Concretely, it can be observed that as δ h , δ v increase from 5 • to 30 • , the RMSEs firstly decrease and then increase. This can be explained as follows: too small rotation angles cannot yield significant amplitude/phase variations along the IRS array aperture, whilst too large ones will lead to a disappearance of some weak paths with near-zero AoA/AoDs relative to the IRS planes due to the non-ideal power radiation pattern. Furthermore, as q increases from 0 to 3, the performance of the "All" decoupling mode gradually deteriorates, while that of the "Pair" counterpart first improves and then worsens. The former does not rely on amplitude equations derived by the power radiation pattern, while the latter depends on effective amplitude equations with  an appropriate value of q. In conclusion, the optimal values of δ h , δ v and q are shown to be about 15 • and 1.5, respectively.
Figs. 10 and 11 plot the RMSEs of the environment mapping and user localization with the sparse scheme, where the proposed multi-carrier (MC) and single-carrier (SC) strategies are implemented with the aid of one and two twin-IRS structures, respectively. 12 The simulation results show that even with no 12 The BS and UE are randomly located at p B = [2 ∼ 6,   LoS path between the BS and IRSs, the proposed twin-IRS structure and the corresponding channel estimation schemes can achieve centimeter-level and degree-level resolution of the scatterer/user locations and orientations, respectively. The MC strategy consistently outperforms the SC counterpart, which benefits from more training subcarriers and less error accumulation of twin-IRS structures. More specifically, the "All" decoupling mode returns more accurate results of scatterer mapping, whilst the "Pair" counterpart generally performs  better at user positioning. The former can more robustly recover the parameters of multiple NLoS paths, while the latter is more suitable for solving the AoA/AoDs of a specific pair of paths involving the LoS component. Furthermore, one can observe that an increasing Rician K-factor leads to enhanced performance of environment mapping and user localization except that of IRS-UE scatterer positioning. This is due to the fact that higher power of a LoS path can improve its probability of being correctly detected, but will dominate the NLoS components and make them more likely be treated as environmental noise.
Finally, we analyze the computational complexity of the involved channel estimation algorithms, as tabulated in Table V. Our analysis indicates that the complexity of the BALS and SCPD is proportional to the number of reflectors N I . The algebraic grouping and sparse schemes efficiently reduce the complexity by activating fewer reflection unit cells and formulating smaller-scale equivalent arrays, respectively. Moreover, the optimization-based ANM-SDP/ADMM can handle arbitrary designs of the IRS training coefficients at the expense of relatively higher complexity.

VII. CONCLUSION
We considered the channel estimation, as well as, the user localization problems of an IRS-assisted mmWave MIMO-OFDM system. We proposed a novel twin-IRS structure consisting of two IRS planes with a relative spatial rotation to extract the 3-D propagation channel. By leveraging the techniques of tensor factorization, sparse array processing and atomic norm denoising, we presented four IRS training pattern designs and the corresponding parameter recovery schemes. By concatenating the cascaded phase/amplitude parameters recovered from the twin-IRS structure and leveraging the geometric relationship of devices, we achieved the decoupling of the exact channel angular and temporal parameters. We also proposed wideband and narrowband training strategies with single and multiple twin-IRS structures for the scatterer mapping and user positioning. Numerical results showed that the proposed strategy can precisely extract the channel parameters, and, therefore, can support a centimeter-level positioning resolution. In our future work, we are going to exploit the potential of conformal array topologies for the IRS designs alongside the corresponding channel estimation strategies and parameter recovery schemes.
Yuxing Lin (Student Member, IEEE) received the B.S. degree in information engineering from Southeast University, Nanjing, China, in 2016, where he is currently pursuing the Ph.D. degree with the School of Information Science and Engineering. His research interests include massive MIMO, millimeter wave wireless communications, hybrid beamforming, channel estimation, beam training/tracking, tensor signal processing, and intelligent reconfigurable surface-assisted communications.