Establishing Multi-User MIMO Communications Automatically Using Retrodirective Arrays

Communications in the mmWave and THz bands will be a key technological pillar for next-generation wireless networks. However, the increase in frequency results in an increase in path loss, which must be compensated for by using large antenna arrays. This introduces challenging issues due to power consumption, signalling overhead for channel estimation, hardware complexity, and slow beamforming and beam alignment schemes, which are in contrast with the requirements of next-generation wireless networks. In this paper, we propose the adoption of a retro-directive antenna array (RAA) at the user equipment (UE) side, where the signal sent by the base station (BS) is reflected towards the source after being conjugated and phase-modulated according to the UE data. By making use of modified Power Methods for the computation of the eigenvectors of the resulting round-trip channel, it is shown that, in single and multi-user multiple-input multiple-output (MIMO) scenarios, ultra-low complexity UEs can establish parallel communication links automatically with the BS in a very short time. This is done in a blind way, also by tracking fast channel variations while communicating, without the need for ADC chains at the UE as well as without explicit channel estimation and time-consuming beamforming and beam alignment schemes.

(SVD) operations to determine the precoding and combining matrices. To minimize both computational effort and signalling, various sub-optimal techniques have been introduced, that require the knowledge of the channel matrix solely at the receiver side, such as V-BLAST [6], whereas blind methods have been proposed to reduce or avoid the need for explicit channel estimation at the expense of reduced performance with respect to SVD-based methods [7]. Other proposed solutions of particular interest for mmWave and THz massive MIMO are based on partial channel state information (CSI) and leverage user position and/or angular information [8], [9]. Explicit channel estimation can be avoided also considering beam scanning techniques, which are typically used in strong line-of-sight (LOS) conditions: in this case, the transmitter and the receiver test all possible beam orientations (beam alignment process) until a connection is established [10]. This procedure can be extremely demanding in terms of complexity and latency when large antenna arrays are adopted at both base station (BS) and user equipment (UE) sides.
In summary, state-of-the-art approaches either involve some computational effort on both sides, which affects hardware complexity and power consumption, or long training data/search time, resulting in increased latency. Clearly, these problems are exacerbated when the number of antennas is high, as expected in beyond 5G systems. In addition, at high frequencies, such as sub-THz/THz, technological constraints place several limitations on the flexibility in processing the signal sent/received by each individual antenna element, thus reducing the set of affordable methods and preventing their use in low-cost sensors.
Consequently, it is evident that the design of high-frequency MU-MIMO systems, incorporating large antenna arrays and involving simple and energy-efficient UEs (e.g., sensors), necessitates a paradigmatic and technological breakthrough. Such an advancement should allow for the elimination of ADC chains while preserving the inherent advantages offered by the MIMO channel and prioritizing low latency and extensive communication capabilities.
Having this objective in mind, this paper introduces a methodology for establishing uplink MIMO communications without imposing any computational burden on the UE side or necessitating signalling procedures. This is accomplished in a blind and nearly optimal fashion during communication, without the need to explicitly estimate the CSI or resort to time-consuming beam alignment schemes. The procedure we propose is based on three pillars:
• The availability of a retrodirective antenna at the UE side, which is capable of reflecting the impinging signal along the direction of arrival (retrodirective backscattering), that is, towards the BS;
• The capability of the UE to modulate the phase of the reflected signal, according to the user's data, directly at electromagnetic (EM) level, thus enabling a low-complexity uplink communication;
• The availability of an algorithm running on the BS capable of jointly demodulating the received data and deriving the optimal precoding vector, i.e., enabling joint communication and beamforming, without any channel estimation or prior knowledge, thanks to the retrodirectivity property of the UE's antenna. The algorithm must also be able to dynamically adjust the beam orientation in response to any change in the position of the UE during the ongoing communication, thereby exhibiting tracking capabilities.
Ultimately, the adoption of retrodirective antennas combined with the processing performed at the BS enables the design of MIMO-enabled UEs with significantly reduced complexity and power consumption, as no ADC chain is required.

A. RELATED STATE-OF-THE-ART
Historically, retrodirective antennas have been implemented using arrays, denoted as retro-directive antenna arrays (RAAs) [11]. This technology, which dates back several decades, gained significant interest over time due to its ability to transmit a signal back to the interrogator's position without requiring prior knowledge of the incoming angle or complex digital signal processing algorithms. This type of array has found applications in various contexts, such as radar [12], wireless power transfer [13], [14], collision avoidance systems [15], microwave imaging or detection [16], RFID systems [17], remote information retrieval from sensors [18], interference rejection [19], [20], and space communications [21], [22].
For several decades, two technological solutions have been available for implementing RAAs. The first solution involves the use of mixers that apply phase conjugation to the signals received by each antenna element [23]. When these signals are combined within the array, they form a main beam directed toward the source direction. More details on this scheme will be provided in Section II. The second solution, known as the Van Atta array [24], comprises a set of antennas interconnected with equal-length connections between pairs of antennas, all equidistant from the array's center.
More recently, the landscape of devices designed for electromagnetic wave emission has been enriched with the emergence of metasurfaces, which can be seen as an evolution of traditional antenna arrays. In contrast to their early days, when metasurfaces were fixed reflective structures, contemporary metasurfaces are endowed with the ability to reconfigure their behaviour dynamically [25], thus enabling the implementation of passive reconfigurable intelligent surfaces (RISs) [26], new-generation metasurface-based antennas [27], as well as the possibility to perform some signal processing functions directly at the EM level [28], [29].
So far, RISs have been studied mainly as reflectors deployed in the environment to create favourable propagation conditions for communications between the BS and the UEs [26], [30]. To cope with the large attenuation experienced by signals when reflected by passive RISs, the advantages of introducing amplifying (or active) RISs have recently been discussed [31], [32], [33].
Of particular relevance to this paper, notable advancements have been achieved in architectures where data is incorporated into the reflected signals by modifying the reflection characteristics of the RIS, enabling spectrum sharing with legacy systems, as well as in architectures supporting conventional backscatter communications [34], [35], [36], [37], [38]. For instance, [39] examines a scenario where an unmodulated carrier is emitted by a radiofrequency (RF) generator in close proximity to a RIS, which then reflects the carrier signal by modulating it based on the data to be transmitted to UEs.
Another advancement in the field of metasurfaces, which is closely related to the content of this article, focuses on the realization of retrodirective devices. Several studies, accompanied by prototypical implementations, have demonstrated that achieving this objective is feasible [40], [41], [42], [43]. For instance, the design of a metasurface is presented in [43] that exhibits a high level of retrodirectivity for 17 different angles of incidence. This result is achieved by realizing the metasurface according to a periodic grating, whose surface impedance is engineered such that the retro-reflective property is obtained. In the following, we will denote as self-conjugate metasurfaces (SCMs) those metasurfaces capable of retro-directing the impinging signal.
In this paper, we address the uplink scenario where each UE is equipped with a retrodirective device, either an RAA or an SCM, which is also capable of modulating the phase of the reflected signal according to the user's data, and we focus on the establishment of MIMO communications with no computational effort at the UE side and negligible latency for the link establishment. To the Authors' best knowledge, the use of retrodirectivity for efficient wireless communications has been little studied [11]. In particular, retrodirectivity has been considered for uplink MU-MIMO systems only recently in our previous paper [44], where it favours the initial estimate of the beamforming vectors in a grant-free multiple access scheme.
Nomenclature: Unless otherwise specified, in the following we will use the acronym RAA to encompass both traditional retrodirective antenna arrays and the more recent advancements where retrodirectivity is achieved through the utilization of SCMs. This intentional flexibility in terminology aims to enhance readability without compromising clarity or precision. Indeed, we are interested in the property of retrodirectivity itself, rather than in the way it is achieved from a technological standpoint.

B. OUR CONTRIBUTION
With respect to the above state-of-the-art, our paper provides the following novel contributions:
• We propose the adoption of modulating RAAs as a viable approach to achieve ultra-low complexity MU-MIMO uplink communications through retrodirective backscattering. It is worth remarking that in this case, the metasurface or antenna array is integrated into the UE itself, rather than being an additional element deployed in the environment, as in most of the RIS-based scenarios investigated in the literature.
• We start by considering the single-user scenario and by introducing an algorithm, executed at the BS, able to estimate the optimal beamforming vector, which corresponds to the top eigenvector of the BS-UE round-trip channel. Our approach entails iterative exchanges between the BS and the UE's RAA, while employing modified Power Methods [45] for the computation of eigenvectors. Power Methods are widely recognized as the most popular algorithms for iteratively estimating eigenvectors in problems involving eigenvalue decomposition (EVD) and SVD, which are typical in MIMO systems. In fact, some early works in the MIMO domain (e.g., [46], [47], [48]) utilized the basic Power Method for beamforming. Following several iterations, during which the transmitter and receiver engage in a ping-pong exchange of packets, the desired beamforming vector is obtained. Notably, our algorithm distinguishes itself from prior literature as it is specifically tailored to retrodirective UEs operating in backscatter mode. Furthermore, it is able to estimate the beamforming vector while simultaneously transmitting the UE's data and tracking rapid channel variations. With respect to our work [44], where an RAA-based scheme is proposed to estimate the beamforming vectors in a grant-free access scheme, here the main novelty is that we perform joint communication and beamforming with continuous tracking of channel variations.
• We analytically characterize the convergence behaviour of the signal-to-noise ratio (SNR) by taking into account the noise and data introduced by the RAA, BS, and environmental clutter (dynamic and static). Specifically, we derive the condition under which the static clutter effect can be made arbitrarily small by increasing the number of antennas at the BS. It is worth highlighting that, to the best of our knowledge, such SNR analysis is not present in the literature, which primarily addresses the impact of noise and data in Power Methods from the subspace approximation perspective and assuming bounded noise [49], [50], [51].
• To deal with multiple UEs, it is necessary to estimate and track multiple eigenvectors associated with the channels of each UE concurrently. In this regard, we propose an algorithm based on Block Power methods, whose novelty is the capability to converge to the top-L channel eigenvectors even in the presence of data transmission and noise. While Block Power methods have previously been investigated for estimating the top L left and right eigenvectors of MIMO channels [7], [51], [52], their application has primarily been envisioned for time-division duplexing (TDD)-based communications. In such systems, both the BS and the UE must possess substantial signal processing capabilities to implement signal decoding and QR-decomposition, as required by the Block Power method. Furthermore, these methods are designed for single-user systems and cannot be directly applied to backscatter-based systems, as elaborated later in this paper.
• To validate and characterize the performance of the proposed RAA-based system and algorithms, numerical results are presented considering both free space and realistic LOS/non-line-of-sight (NLOS) channel models. These results consider various factors such as dynamic and static clutter, as well as the impact of non-ideal retrodirectivity leading to channel non-reciprocity.
The findings demonstrate that MU-MIMO communications can be automatically established with low latency, even when employing large antenna arrays, without requiring channel estimation or processing at the UE side. Furthermore, the numerical results highlight the system's capability to effectively track rapid channel variations induced by user mobility. The rest of the paper is structured as follows: In Section II, a brief overview of RAAs is given and the strategy to exploit them to transmit data is proposed. The description and analysis of the RAA-based system in single-user scenario is given in Section III. Then, its extension to the multi-user case is offered in Section IV. Numerical results are illustrated in Section V, and a discussion concerning the pros/cons of the proposed solution, as well as its comparison with other MIMO-related implementations, is proposed in Section VI. Finally, our conclusions are drawn in Section VII.

C. NOTATIONS AND DEFINITIONS
Boldface lower-case letters are vectors (e.g., $\mathbf{x}$), whereas boldface capital letters are matrices (e.g., $\mathbf{H}$). $\mathbf{I}_N$ is the identity matrix of size $N$, $h_{n,m} = [\mathbf{H}]_{n,m}$ represents the $(n,m)$th element of matrix $\mathbf{H}$, $\|\mathbf{x}\|$ is the Euclidean norm of vector $\mathbf{x}$, and $\mathbf{x}^*$ is its conjugate. $\mathbf{H}^*$, $\mathbf{H}^T$ and $\mathbf{H}^\dagger$ indicate, respectively, the conjugate, transpose, and conjugate transpose operators applied to matrix $\mathbf{H}$. The Hadamard product between two matrices is indicated as $\mathbf{A} \odot \mathbf{B}$. $\mathrm{vec}(\mathbf{x}_1, \mathbf{x}_2, \ldots)$ returns a column vector stacking vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots$. The notation $x \sim \mathcal{CN}(m, \sigma^2)$ indicates a complex circularly symmetric Gaussian random variable (RV) with mean $m$ and variance $\sigma^2$, whereas $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{C})$ denotes a complex Gaussian random vector with mean $\mathbf{m}$ and covariance matrix $\mathbf{C}$. The real part of a complex number $z$ is $\Re(z)$.

II. RETRODIRECTIVE AND MODULATING ANTENNA ARRAYS
In the following, we consider a UE equipped with an antenna array or a metasurface capable of (i) reflecting the impinging signal along the direction of arrival, and (ii) modulating the reflected signal, thus incorporating the information intended for the BS into the retro-directed signal. As pointed out in the Introduction, these two characteristics are the first and second pillars of the methodology we propose, and they are the features that concern the UE. As for the retrodirectivity property, in Section II-A we illustrate how RAAs can be practically realized. The methodology to modulate the retro-directed signal is instead described in Section II-B.

A. RETRODIRECTIVE ANTENNA ARRAYS
A schematic representation of a RAA is reported in Fig. 1, where we consider M antennas organized, for simplicity of explanation, as a uniform linear array (ULA). From a technological point of view, the possibility of realizing ULAs capable of conjugating the phase of the impinging signal has been investigated for many years [11]. The main methods to achieve retrodirectivity are phase conjugating arrays, exploiting heterodyne mixing, and Van Atta arrays [23], [53], [54].
Phase conjugating arrays are based on the heterodyne mixing of the incoming wave, centered at frequency $f_0$, with a locally generated sinusoid oscillating at $2f_0$ [23], [55]. The principle behind this solution is sketched in Fig. 2(a), which refers to the $m$th phase-conjugating antenna of the array. For the sake of clarity, in Fig. 2(a) we have depicted two separate transmit and receive antennas, which however correspond to a single antenna of the ULA shown in Fig. 1.
Consider a plane wave impinging on the ULA with an angle $\alpha$ with respect to its normal direction, as shown in Fig. 1. At the $m$th antenna, the impinging wave accumulates a phase shift $\theta_m$, with respect to the first antenna, given by
$$\theta_m = \frac{2\pi}{\lambda}\, \Delta\, m \sin\alpha \tag{1}$$
for $m = 0, 1, \ldots, M-1$, where $\lambda$ is the wavelength and $\Delta$ is the inter-antenna separation.
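With reference to Fig. 2(a), we can easily derive the analytical expressions of the signals at each port of the $m$th antenna. Since we are illustrating the theoretical foundation of the phase conjugation process, in this subsection we do not consider the presence of noise, which will instead be taken into account in the remainder of the paper. In particular, at the output of the $m$th receiving antenna, the RF signal $z_m^{(\mathrm{RF})}(t)$ can be written as
$$z_m^{(\mathrm{RF})}(t) = x(t)\cos(2\pi f_0 t + \theta_m) - y(t)\sin(2\pi f_0 t + \theta_m) = \Re\left\{ z_m(t)\, e^{j 2\pi f_0 t} \right\} \tag{2}$$
where $x(t)$ and $y(t)$ denote the in-phase and quadrature components of the received signal, respectively, and
$$z_m(t) = \left[ x(t) + j\, y(t) \right] e^{j\theta_m} \tag{3}$$
represents its complex envelope. After the heterodyne mixing of $z_m^{(\mathrm{RF})}(t)$ with the local oscillator at $2f_0$ (taken here with amplitude 2 for convenience), we obtain
$$2\, z_m^{(\mathrm{RF})}(t)\cos(4\pi f_0 t) = x(t)\cos(2\pi f_0 t - \theta_m) + y(t)\sin(2\pi f_0 t - \theta_m) + x(t)\cos(6\pi f_0 t + \theta_m) - y(t)\sin(6\pi f_0 t + \theta_m). \tag{4}$$
The bandpass filter shown in Fig. 2(a) is designed to remove the spectral components centered at $3f_0$, so the signal entering the transmitting antenna is
$$r_m^{(\mathrm{RF})}(t) = g\left[ x(t)\cos(2\pi f_0 t - \theta_m) + y(t)\sin(2\pi f_0 t - \theta_m) \right] = \Re\left\{ r_m(t)\, e^{j 2\pi f_0 t} \right\} \tag{5}$$
with $g$ denoting the amplitude gain of the power amplifier and
$$r_m(t) = g\left[ x(t) - j\, y(t) \right] e^{-j\theta_m} = g\, z_m^*(t) \tag{6}$$
being the complex envelope of $r_m^{(\mathrm{RF})}(t)$. Comparing (3) with (6), it is evident that the final result of the processing carried out by the antenna is the phase conjugation of the signal, as expected. It is worth noting that, assuming that the modulated signal $z_m(t)$ is narrowband, so that $x(t)$ and $y(t)$ are the same at all the antennas, the conjugation turns the phase profile $\theta_m$ into $-\theta_m$, which is required to steer the array beam back to $\alpha$, that is, to have the impinging wave reflected in the same direction of arrival (retrodirectivity).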
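As a numerical illustration of the statement above, the following minimal sketch checks that conjugating the complex envelopes received by a ULA produces a re-radiated beam that peaks at the direction of arrival. The array size, the half-wavelength spacing, the incidence angle, and the helper names `ula_phases` and `reradiated_pattern` are assumptions introduced here for illustration only; noise and mutual coupling are ignored.

```python
import numpy as np

def ula_phases(num_antennas, spacing_wl, angle_rad):
    """theta_m = 2*pi*(spacing/lambda)*m*sin(angle), m = 0..M-1."""
    m = np.arange(num_antennas)
    return 2 * np.pi * spacing_wl * m * np.sin(angle_rad)

def reradiated_pattern(weights, spacing_wl, angles_rad):
    """|far field| radiated towards each angle by elements driven with `weights`."""
    m = np.arange(len(weights))
    steering = np.exp(1j * 2 * np.pi * spacing_wl * np.outer(np.sin(angles_rad), m))
    return np.abs(steering @ weights)

M, spacing_wl, g = 16, 0.5, 1.0            # assumed array size, spacing (wavelengths), gain
alpha = np.deg2rad(25.0)                   # assumed direction of arrival

z = np.exp(1j * ula_phases(M, spacing_wl, alpha))   # complex envelopes z_m, as in (3)
r = g * np.conj(z)                                   # phase conjugation, as in (6)

angles = np.deg2rad(np.linspace(-90.0, 90.0, 721))
pattern = reradiated_pattern(r, spacing_wl, angles)
peak_deg = float(np.rad2deg(angles[np.argmax(pattern)]))
print(f"incident from {np.rad2deg(alpha):.1f} deg, re-radiated peak at {peak_deg:.2f} deg")
```

The array never estimates $\alpha$: the beam direction follows from the conjugation alone, which is the property exploited in the rest of the paper.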
An example of the experimental effectiveness of this technique can be found in [55], where a testbed was realized consisting of an antenna array with eight slot antennas, each connected to a Schottky diode performing heterodyne mixing. The measurement of the bistatic radar cross-section (RCS) of the resulting meta-interface (as it is called in [55]) showed the retrodirectivity property, exhibiting an RCS almost coincident with the array factor for an 8-element array focusing the received signal toward the source. Another relatively simple solution is proposed in [56], in which active split-ring resonators loaded with varactor diodes are demonstrated to act as phase-conjugating elements when pumped with a signal at frequency $2f_0$.
While phase conjugating arrays require active components, Van Atta arrays realize passive retrodirectivity (in this case $g < 1$) [54]. As sketched in Fig. 2(b), a Van Atta array consists of an array of antenna elements arranged in symmetrical pairs and connected through transmission lines. Each element of a pair acts as both a receiving and a transmitting antenna. In particular, the signal received by an element is re-radiated by its paired element after it has travelled through the transmission line, whose length is designed so as not to introduce relative phase shifts. The elements are deployed in a mirror-symmetric manner to cause an equivalent phase-conjugation effect for the reflected wave compared to the incident wave.
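The equivalence between mirror-symmetric re-radiation and phase conjugation can be verified with a few lines of code. The sketch below assumes ideal, lossless, equal-length lines and an 8-element half-wavelength ULA (illustrative values, not taken from the cited works): swapping the signals between mirrored elements yields the conjugated phase profile up to a phase term common to all elements, which does not affect the beam direction.

```python
import numpy as np

M, spacing_wl = 8, 0.5                     # assumed array size and spacing (wavelengths)
alpha = np.deg2rad(-40.0)                  # assumed direction of arrival
m = np.arange(M)
theta = 2 * np.pi * spacing_wl * m * np.sin(alpha)
z = np.exp(1j * theta)                     # incident field at each element

van_atta = z[::-1]                         # element m re-radiates what element M-1-m received
conjugated = np.conj(z)                    # ideal phase-conjugating array

# The ratio is the same for every element: exp(j*theta_{M-1}).
common = van_atta / conjugated
print("common phase only:", np.allclose(common, common[0]))
```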
Recently, it has been shown that metasurfaces can also be designed to provide passive retrodirectivity. This property can be achieved by engineering the surface impedance according to a supercell design periodicity that is greater than the wavelength $\lambda$ [43]. The general principle is that, provided that electric or magnetic surface currents are induced at the reflecting boundary, the conventional reflection law does not hold when the surface properties (e.g., the surface impedance) smoothly vary within the wavelength scale [40]. As a result, by properly engineering the induced surface-current gradients, it is possible to reflect the incident wave in a direction other than specular. This principle is exploited, for instance, in [43] to realize a metasurface that exhibits a high level of retrodirectivity for almost 17 different angles of incidence. Another example of implementation at mmWave for Internet-of-Things (IoT) applications can be found in [57]. Clearly, the main advantage of the heterodyne approach over passive solutions is that by employing active mixing devices, amplitude gain can be achieved in addition to phase conjugation, thereby improving the transmission range at the expense of some power consumption.
One of the primary challenges associated with active RAAs is maintaining a high level of isolation between the transmit (TX) and receive (RX) signal chains during simultaneous transmission and reception. This is typically achieved through various methods, including bistatic configurations (spatial multiplexing), slightly different TX and RX frequencies (frequency multiplexing), and dual-polarized elements (polarization multiplexing). Recently, a co-polarized active RAA has been demonstrated experimentally in [58] with a gain $g$ of 23 dB. Unfortunately, due to the unavoidable asymmetry of the circuits, such techniques could lead to a loss of channel reciprocity, which is instead assumed in our proposed scheme. However, the numerical results demonstrate that our communication scheme exhibits strong robustness to channel non-reciprocity.
Another possible issue of heterodyne-based RAAs is the frequency offset $\Delta f$ between the local oscillator of the RAA and the carrier frequency of the BS, which arises because they are not synchronized. This generates an additional phase rotation $\psi(t) = 2\pi \Delta f\, t$ in the reflected signal. The effect of the frequency offset $\Delta f$ is equivalent to the Doppler shift that would be generated by a moving UE in a dynamic environment. In the following analysis, we assume that the phase variation within the transmission of a few symbols is negligible, i.e., $\Delta f \ll 1/T$, with $T$ being the symbol interval. However, the impact of the Doppler/frequency offset will be investigated in the numerical results.
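To give a feeling for the condition that the frequency offset be much smaller than $1/T$, the sketch below evaluates the phase rotation accumulated over one symbol interval. The offset and the symbol interval are purely illustrative assumptions, not values taken from the paper.

```python
import math

def phase_drift_per_symbol(delta_f_hz, symbol_time_s):
    """Phase rotation psi accumulated over one symbol: 2*pi*delta_f*T (radians)."""
    return 2 * math.pi * delta_f_hz * symbol_time_s

T = 1e-6          # assumed symbol interval: 1 microsecond
delta_f = 1e3     # assumed oscillator offset: 1 kHz, so delta_f * T = 1e-3

drift = phase_drift_per_symbol(delta_f, T)
print(f"drift per symbol: {math.degrees(drift):.3f} deg")
```

With these numbers the rotation per symbol is a fraction of a degree, so it is negligible over a few symbols, although it accumulates over longer observation windows.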

B. MODULATION OF THE RETRO-DIRECTED SIGNAL
Given the bandwidth $W$ of the narrowband signal in (2), we consider the sampled version of its complex low-pass signal in (3), and we collect the samples of the $M$ antennas in the $k$th symbol interval in the vector $\mathbf{z}[k] \in \mathbb{C}^{M \times 1}$. By introducing the noise generated by the RAA, the discrete-time signals at the input of the $M$ antennas in the $k$th symbol interval can be expressed by the vector
$$\mathbf{z}[k] + \boldsymbol{\eta}[k]$$
where $\boldsymbol{\eta}[k] \sim \mathcal{CN}(\mathbf{0}, \sigma_\eta^2 \mathbf{I}_M)$ and $\sigma_\eta^2 = \kappa T_0 F_{\mathrm{RAA}} W$, with $\kappa$ being the Boltzmann constant, $T_0 = 290$ K, and $F_{\mathrm{RAA}}$ the RAA's noise figure [32], [59]. Thanks to the self-conjugating property of the antenna array, we can write the vector $\mathbf{r}[k]$ of the signal reflected by the RAA in the same symbol interval as
$$\mathbf{r}[k] = g \left( \mathbf{z}[k] + \boldsymbol{\eta}[k] \right)^* \tag{9}$$
Suppose, now, that the RAA not only performs the conjugation and amplification¹ of the received signal, thus retro-directing the impinging signal as it appears in (9), but also introduces, in the $k$th symbol interval, a phase shift $\phi[k]$ (the same for all the antennas) that incorporates the information to be transmitted by the UE in that interval [44]. The vector of the transmitted signal therefore becomes
$$\mathbf{r}[k] = g\, e^{j\phi[k]} \left( \mathbf{z}[k] + \boldsymbol{\eta}[k] \right)^* \tag{10}$$
Equation (10) is thus representative of the proposed solution, in which the BS transmits a signal to the UE, which (possibly) amplifies the received signal and retransmits it along the direction of arrival (retrodirectivity), using it as a carrier to incorporate, through the phase $\phi[k]$, the information intended for the BS (see Fig. 1). Note that the phase $\phi[k]$ associated with the information affects all the antennas of the array, thus not compromising the retrodirectivity behaviour. Most importantly, the implementation of the modulated RAA does not require ADC chains, as the data directly modulates the phase sequence $\{\phi[k]\}$, thus allowing a low-cost, low-complexity, low-energy consumption multi-antenna device.

¹ If the RAA is passive, then $g \le 1$.
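The following sketch illustrates why a phase shift common to all antennas carries data without disturbing the retro-directed beam. It assumes a QPSK phase alphabet and random noise-free received samples (both are illustrative choices; the function name `modulated_reflection` is hypothetical): the element-to-element phase differences of the reflected vector are those of the conjugated signal, while the common phase can be read back by correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
M, g = 8, 1.0                                             # assumed array size and gain
z = rng.standard_normal(M) + 1j * rng.standard_normal(M)  # illustrative noise-free samples

def modulated_reflection(z, phi, g=1.0):
    """r[k] = g * exp(j*phi[k]) * conj(z[k]), with one phase for all M antennas."""
    return g * np.exp(1j * phi) * np.conj(z)

qpsk_phases = np.pi / 4 + (np.pi / 2) * np.arange(4)      # assumed QPSK alphabet
phi = qpsk_phases[2]
r = modulated_reflection(z, phi, g)

# Ratios between adjacent elements, which set the beam direction, are unchanged.
same_beam = np.allclose(r[1:] / r[:-1], np.conj(z[1:] / z[:-1]))
# np.vdot conjugates its first argument, so this equals g^2 * exp(j*phi) * ||z||^2.
phi_hat = np.angle(np.vdot(g * np.conj(z), r))
print(same_beam, np.isclose(np.exp(1j * phi_hat), np.exp(1j * phi)))
```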

III. ESTABLISHING MU-MIMO COMMUNICATIONS AUTOMATICALLY - SINGLE-USER SCENARIO
In addition to the retrodirectivity of UEs' antennas and their capability to perform phase modulation, the third pillar of our methodology concerns the BS, which must be able to decode the modulated data (extracting the phase φ[k]) and, jointly, to derive the optimal precoding vector without any channel estimation or prior knowledge, using an approach that exploits the UE capability to conjugate and retro-direct the received signal. An iterative algorithm to be executed by the BS, capable of automatically establishing uplink MU-MIMO communications with RAA-based UEs, is illustrated in this section for the single-user scenario (see Fig. 3), and in Section IV for the multi-user scenario. We consider a BS, equipped with an antenna array of N elements, capable of full-duplex transmissions. The design and characterization of the full-duplex front-end are beyond the scope of this paper. Interested readers can find more details, e.g., in [60]. Alternatively, two separated antennas for the transmitter and receiver sections can be considered depending on application constraints. We assume a narrowband transmission with bandwidth W. It might represent, for instance, a sub-carrier or a resource block in an orthogonal frequency division multiplexing (OFDM) system. The UE is realized according to the scheme described in Section II and its RAA is composed of M elements.

Let $\sqrt{P_T}\, \mathbf{x}[k] \in \mathbb{C}^{N \times 1}$ be the vector containing the signal transmitted by the $N$ elements of the BS's antenna array, where $P_T$ is the transmitted power and $\mathbf{x}[k]$ is a unit-norm beamforming vector, i.e., the precoding vector, at the generic time interval $k$. At the startup, i.e., at $k = 0$, the optimal beamforming vector for the link with the given UE is not known by the BS, which therefore randomly generates a unit-norm beamforming vector $\mathbf{x}[0]$. At the end of each time interval $k$, with $k \ge 1$, the beamforming vector $\mathbf{x}[k]$ will be iteratively updated, as described in detail later.
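We underline that the time interval $T$ between time instants $k-1$ and $k$ must be larger than twice the propagation delay $\tau_p$. With some abuse of notation, in order to keep it simple, we consider that the transmitted signal $\mathbf{x}[k-1]$ is received and retro-directed by the UE at time $k$, then collected by the BS² (see Fig. 3). Thus, the signal received by the UE is
$$\mathbf{z}[k] = \sqrt{P_T}\, \mathbf{H}\, \mathbf{x}[k-1] \tag{11}$$
where $\mathbf{H} \in \mathbb{C}^{M \times N}$ denotes the channel matrix, and the signal retro-directed by the UE, according to (10), is
$$\mathbf{r}[k] = g\, e^{j\phi[k]} \left( \sqrt{P_T}\, \mathbf{H}^* \mathbf{x}^*[k-1] + \boldsymbol{\eta}^*[k] \right) \tag{12}$$
with $\phi[k]$ being the phase shift carrying the data generated by the UE in the $k$th symbol time and $\boldsymbol{\eta}[k]$ the noise generated by the RAA. Assuming channel reciprocity, at the BS side the received signal at time instant $k$, consisting of the feedback of the signal transmitted in the last symbol time, is
$$\mathbf{y}[k] = \mathbf{H}^T \mathbf{r}[k] + \mathbf{w}[k] + \sqrt{P_T}\, \mathbf{C}[k]\, \mathbf{x}[k-1] \tag{13}$$
with $\mathbf{w}[k] \sim \mathcal{CN}(\mathbf{0}, \sigma_w^2 \mathbf{I}_N)$ being the AWGN at the receiver, $\sigma_w^2 = \kappa T_0 F_{\mathrm{AP}} W$, $F_{\mathrm{AP}}$ the BS's noise figure, and $\mathbf{C}[k] \in \mathbb{C}^{N \times N}$ the stochastic clutter transfer function determining the signal backscattered by the surrounding environment [61]. The characterization of the clutter strongly depends on the environment. In the radar literature, the clutter is often modelled as homogeneous and with uncorrelated spatial intrinsic reflectivity [61], so
$$\mathbb{E}\left\{ [\mathbf{C}[k]]_{n,m}\, [\mathbf{C}[k]]^*_{n',m'} \right\} = \sigma_c^2\, \delta_{n,n'}\, \delta_{m,m'} \tag{14}$$
where $\sigma_c^2$ is the clutter power and $\delta_{n,n'}$ is the Kronecker delta. Regarding the time variability of the clutter, we consider two extreme cases: fast clutter (dynamic clutter), for which the matrices $\mathbf{C}[k]$ are considered i.i.d., and static clutter, where $\mathbf{C}[k]$ is considered to be almost constant (but still random) during the convergence process of the algorithms proposed in this paper, i.e., $\mathbf{C}[k] \approx \mathbf{C}$. In the following, we first analyse the dynamic clutter scenario, then the static case will be addressed at the end of the section.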
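For orientation, the thermal noise power at the BS, given by the product of the Boltzmann constant, the reference temperature, the noise figure, and the bandwidth, can be evaluated numerically. The sketch below uses an assumed noise figure of 5 dB and an assumed bandwidth of 1 MHz; neither value comes from the paper, and the function name `noise_power_dbm` is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
T0 = 290.0                 # K, reference temperature used in the text

def noise_power_dbm(noise_figure_db, bandwidth_hz):
    """Noise power kappa * T0 * F * W, returned in dBm."""
    f_linear = 10 ** (noise_figure_db / 10)
    watts = BOLTZMANN * T0 * f_linear * bandwidth_hz
    return 10 * math.log10(watts / 1e-3)

# Assumed receiver parameters, for illustration only.
sigma_w2_dbm = noise_power_dbm(noise_figure_db=5.0, bandwidth_hz=1e6)
print(f"sigma_w^2 = {sigma_w2_dbm:.2f} dBm")
```

The same function applies to the RAA noise power by replacing the BS noise figure with that of the RAA.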
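In the presence of dynamic clutter, the last term in (13) results in a random vector $\mathbf{c}[k] = \sqrt{P_T}\, \mathbf{C}[k]\, \mathbf{x}[k-1] \sim \mathcal{CN}(\mathbf{0}, P_T \sigma_c^2 \mathbf{I}_N)$, which is uncorrelated with respect to the useful term. From (13), after conjugation, we have
$$\mathbf{y}^*[k] = e^{-j\phi[k]}\, \mathbf{A}\, \mathbf{x}[k-1] + \mathbf{n}[k] \tag{15}$$
where we have defined $\mathbf{A} = \sqrt{P_T}\, g\, \mathbf{H}^\dagger \mathbf{H}$ and $\mathbf{n}[k] = g\, e^{-j\phi[k]}\, \mathbf{H}^\dagger \boldsymbol{\eta}[k] + \mathbf{w}^*[k] + \mathbf{c}^*[k]$, which includes all the noise terms. It is now convenient to introduce the eigenvalue decomposition of matrix $\mathbf{A}$
$$\mathbf{A} = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^\dagger \tag{16}$$
where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$ collects the eigenvalues, sorted so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0$, and $\mathbf{v}_j$ is the $j$th eigenvector (direction) forming the $j$th column of matrix $\mathbf{V} \in \mathbb{C}^{N \times N}$. As a consequence, the generic vector $\mathbf{x}[k]$ at the $k$th iteration can be decomposed as
$$\mathbf{x}[k] = \sum_{j=1}^{N} x_j[k]\, \mathbf{v}_j, \qquad x_j[k] = \mathbf{v}_j^\dagger \mathbf{x}[k] \tag{17}$$
Analogously, the noise term can be expressed as
$$\mathbf{n}[k] = \sum_{j=1}^{N} n_j[k]\, \mathbf{v}_j \tag{18}$$
where $n_j \sim \mathcal{CN}(0, \sigma_j^2)$, with
$$\sigma_j^2 = \frac{g\, \sigma_\eta^2\, \lambda_j}{\sqrt{P_T}} + \sigma_w^2 + P_T\, \sigma_c^2 .$$
Based on the above, in the following we introduce the iterative Communication Scheme 1, which is capable of automatically establishing joint communication and beamforming between the UE and the BS.

² When describing the transmitted signal and the signal received/retrodirected by the UE, time instants $k$ should be intended as intervals. Sampling is operated at the BS after standard matched filter processing of the signal in the last symbol interval of duration $T$.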
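The decomposition along the eigen-directions can be reproduced numerically. The sketch below assumes that the round-trip matrix is $\mathbf{A} = \sqrt{P_T}\, g\, \mathbf{H}^\dagger \mathbf{H}$ and uses an i.i.d. Gaussian channel with illustrative sizes; both are assumptions made for this example, as are the noise values. It computes the eigenvalue decomposition, sorts the directions from strongest to weakest, projects a unit-norm beamforming vector onto them, and evaluates the per-direction noise variance that follows from these assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 4                                    # assumed BS and UE array sizes
P_T, g = 1.0, 1.0                              # assumed transmit power and RAA gain
sigma_eta2, sigma2 = 1e-3, 1e-2                # assumed RAA noise and BS noise-plus-clutter powers

H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

A = np.sqrt(P_T) * g * H.conj().T @ H          # Hermitian, positive semidefinite, rank <= M
lam, V = np.linalg.eigh(A)                     # eigh returns ascending eigenvalues
lam, V = lam[::-1], V[:, ::-1]                 # reorder: lambda_1 >= lambda_2 >= ...

x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x /= np.linalg.norm(x)                         # unit-norm beamforming vector
x_coeff = V.conj().T @ x                       # x_j = v_j^H x

# Per-direction noise variance under the stated assumptions.
sigma_j2 = g * sigma_eta2 * np.clip(lam, 0.0, None) / np.sqrt(P_T) + sigma2
print(np.round(lam, 3), np.round(np.abs(x_coeff) ** 2, 3))
```

Since $M < N$ in this example, only $M$ eigenvalues are nonzero: at most $M$ directions carry signal back to the BS.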
Communication Scheme 1. As appears in the pseudocode of this procedure, to which the reader is referred, the process starts (step 0 of the pseudocode) with the random generation of a guess unit-norm beamforming vector $\mathbf{x}[0]$. Obviously, any prior information (e.g., past transmission, UE position, etc.) can be exploited to speed up the process; therefore, here we are implicitly considering a worst-case scenario. At the $(k-1)$th iteration, with $k \ge 1$, the BS transmits the current version of the beamforming vector, $\mathbf{x}[k-1]$ (step 2), which is then received by the UE (step 3). The latter reflects the received signal along the direction of arrival (thanks to the conjugation operation), modulating its phase based on the data intended for the BS (step 4). At the BS, the retro-directed signal is collected and processed to update the beamforming vector, while the estimate of the data at the $k$th time instant is obtained by means of the function detection(·), according to the adopted modulation scheme (step 8). Here, we are assuming symbol-level synchronization between the UE and the BS.³ It is worth noticing that data demodulation is operated while the BS is transmitting its precoding vector, thanks to the full-duplex BS assumption.⁴ In the absence of noise and data, the processing operated in Communication Scheme 1 corresponds to the Power Method, or Von Mises iteration, which allows estimating the strongest eigenvector of a square matrix $\mathbf{A}$ and is described by the recurrence relation [45]
$$\mathbf{x}[k] = \frac{\mathbf{A}\, \mathbf{x}[k-1]}{\left\| \mathbf{A}\, \mathbf{x}[k-1] \right\|} \tag{19}$$
In the following, we first investigate the convergence of Communication Scheme 1 in the absence of noise, then we complete the investigation by introducing the noise and deriving the time evolution of the SNR.

³ Symbol-level synchronization is not required for the algorithm to converge, but it is necessary to decode the data. The analysis of synchronization schemes is outside the scope of this paper, and it will be the topic of future works.

⁴ While a half-duplex implementation is possible, it requires the TX to switch to the RX mode before the earliest reflected signal arrives at the BS, which depends on the (usually) unknown UE-BS distance. This necessitates a shorter TX time, reducing energy accumulation within a symbol time and resulting in a lower SNR compared to full-duplex. Thus, despite the complexity, the full-duplex implementation remains the more appealing choice.

A. CONVERGENCE IN THE ABSENCE OF NOISE
First, we show the convergence in the presence of data by neglecting the noise. In this case, the recursion takes the form (20). By unwrapping the recursion in (20) k − 1 times, we obtain (21) and (22). By substituting (22) in (21) and letting k grow, we obtain (23), which converges to the top eigenvector v_1 with a convergence rate related to λ_2/λ_1. It is worth noticing that the complex scalar multiplying v_1 in (23), being a multiplicative factor for all components of v_1, does not make the (asymptotic) direction of x[k] divert from that of v_1. In the numerical results, it will be shown that only a few iterations are needed to approach the asymptotic value, that is, to perform optimum beamforming, under common channel conditions.
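As a reminder of why the ratio λ_2/λ_1 governs the convergence speed, the standard Power Method argument can be sketched as follows. This is only an illustrative sketch and not a reproduction of (21)-(23): the expansion coefficients c_i, the residual x_⊥ and the accumulated phase θ[k] are notation introduced here, A is assumed Hermitian positive semidefinite with eigenpairs (λ_i, v_i), λ_1 > λ_2 ≥ ... ≥ λ_r > 0, and the data are assumed to enter the recursion only as a unit-modulus factor per iteration.

```latex
% Expansion of the initial guess on the eigenvectors of A (c_1 \neq 0 assumed)
x[0] = \sum_{i=1}^{r} c_i\, v_i + x_\perp , \qquad A\, x_\perp = 0
% Unit-modulus data factors commute with A and accumulate into a single phase
x[k] = e^{\mathrm{j}\theta[k]}\, \frac{A^{k} x[0]}{\left\| A^{k} x[0] \right\|}
% The component along v_1 dominates geometrically
A^{k} x[0] = \lambda_1^{k}\, c_1 \left( v_1 + \sum_{i=2}^{r} \frac{c_i}{c_1}
             \left( \frac{\lambda_i}{\lambda_1} \right)^{k} v_i \right)
```

Under these assumptions the residual components decay at least as fast as (λ_2/λ_1)^k, while the phase θ[k] only rotates the whole vector and leaves its direction unchanged.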
As far as the demodulation is concerned, from (23) it immediately follows that asymptotically (in practice, after a few iterations) the decision variable u[k] takes a form from which the kth phase information φ[k] can be retrieved.
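The noise-free behaviour discussed above can be illustrated with a short numerical sketch. Everything specific in it is an assumption made for illustration: the round-trip matrix is taken as A = H†H (transmit power and RAA gain omitted), the channel is a hypothetical 4 × 16 matrix with prescribed singular values, the data are QPSK phases, and the decision variable is computed as x[k − 1]ᵀ y[k], which is one possible choice and not necessarily the detection(·) function of the pseudo-code.

```python
import numpy as np

def random_unitary(n, rng):
    """Random n x n unitary matrix (QR factorization of a complex Gaussian matrix)."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(g)
    return q

def make_channel(m, n, singular_values, rng):
    """Hypothetical m x n channel with prescribed singular values (largest first).

    Returns the channel H and the top eigenvector v1 of H^H H.
    """
    u = random_unitary(m, rng)
    v = random_unitary(n, rng)
    h = u @ np.diag(singular_values) @ v[:, :m].conj().T
    return h, v[:, 0]

def retrodirective_power_iteration(h, phases, rng):
    """Noise-free sketch of the BS/RAA loop: one data phase per iteration."""
    n = h.shape[1]
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)                      # random unit-norm initial guess x[0]
    detected = []
    for phi in phases:
        z = h @ x                               # signal received by the UE
        r = np.exp(1j * phi) * np.conj(z)       # conjugation + phase modulation at the RAA
        y = h.T @ r                             # signal received by the BS (reciprocal channel)
        detected.append(np.angle(x @ y))        # assumed decision variable: x^T y
        x = np.conj(y) / np.linalg.norm(y)      # updated unit-norm beamforming vector
    return x, np.array(detected)

rng = np.random.default_rng(1)
H, v1 = make_channel(4, 16, [3.0, 1.5, 1.0, 0.5], rng)
phases = rng.choice(np.array([0.25, 0.75, -0.75, -0.25]) * np.pi, size=30)
x, detected = retrodirective_power_iteration(H, phases, rng)

alignment = abs(np.vdot(v1, x))                            # |v1^H x|, equals 1 when aligned
phase_error = np.angle(np.exp(1j * (detected - phases)))   # wrapped detection error
print(alignment, np.max(np.abs(phase_error)))
```

With this choice of decision variable, the quantity x[k − 1]ᵀ y[k] equals e^{jφ[k]} times the real positive number x†Ax, which is why the accumulated data phases carried by x[k − 1] do not disturb the detection in this sketch.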

B. SNR EVOLUTION
Now we reintroduce the noise and evaluate the SNR at the discrete time k (i.e., in the kth iteration) for the demodulation of the data associated with the phase. In the presence of noise, the recursion (20) acquires additional noise terms. Note that the term carrying the information (i.e., the phase φ[k]) also includes the noise from the previous iterations, which is contained in x[k − 1]. The decision variable at the kth symbol is proportional to the product (27), in which the first term is the useful one, as it contains the phase φ[k], and the second term represents the noise. In the following analysis, we consider the particular case in which the noise generated by the RAA and seen by the receiver is negligible compared to the other two noise components, i.e., σ_j² ≪ σ², with σ² = σ_w² + P_T σ_c², ∀j, which is reasonable considering that it is attenuated by the UE-BS channel. Therefore, considering that by construction ‖x[k]‖² = 1, the SNR at the input of the detector for the kth time instant, SNR^(dec)[k], is given by (28). Note that |x_j[k − 1]|²/‖x[k − 1]‖² = |x_j[k − 1]|² represents the fraction of the total power (useful plus noise) transmitted by the UE associated with direction v_j at the discrete time k − 1. Then, at the end of the kth time interval, the SNR (at the BS) along the direction v_j, SNR_j[k], is given by (29). Therefore, we can rewrite (28) as a function of SNR_j[k], obtaining (30), which involves the maximum SNR along the direction v_j, i.e., the SNR that would be obtained if all the power were concentrated in the direction v_j.
The goal is now to determine an iterative expression for SNR^(dec)[k], which drives the signal demodulation performance, and to evaluate the convergence condition for the proposed communication scheme. Considering (26), the fraction of the total power that is associated with direction v_j at the beginning of time interval k can be written as in (32). Then, by inverting (29) and plugging |x_j[k]|² into both the left-hand and right-hand sides of (32), we obtain the iterative formula (33) for SNR_j[k], valid for k ≥ 2. Denote by r = rank(A) the rank of matrix A, i.e., the rank of the channel. In the particular case of a channel that has rank r = 1 (e.g., a far-field LOS channel with negligible multipath), (30) and the SNR at the kth time instant along direction v_1 in (33) simplify, the latter becoming the recursive expression (36). The rank-1 case allows an easy evaluation of the convergence value. Following the approach proposed in [44], the solution at the equilibrium of the recursive expression in (36) can be found by solving the corresponding fixed-point equation with a = N and b equal to the maximum SNR, which takes the role of the asymptotic SNR corresponding to the optimum beamforming vector for a channel with r = 1.
Note that, if x[0] is randomly chosen, it is |x_j[0]|² ≈ 1/N, and hence at the first iteration (i.e., at the system bootstrap) the SNR along v_1, SNR_1[1], is about N times smaller than the maximum SNR; this value is referred to as the bootstrap SNR, SNR^(boot). It is worth noticing that the convergence value does not depend on the initial random guess x[0], but only on the bootstrap SNR.
It is interesting to investigate the implications of the above results in simple scenarios, such as in free space. In this case, the generic element of the channel matrix H, related to the channel between the ith element of the BS's antenna and the jth antenna of the RAA, is

$$h_{i,j} = \sqrt{G_{\mathrm{BS}}\, G_{\mathrm{RAA}}}\; \frac{\lambda}{4\pi d_{i,j}}\; e^{-\mathrm{j}\frac{2\pi}{\lambda} d_{i,j}},$$

where d_{i,j} is the distance, λ is the wavelength, and G_BS and G_RAA are the gains of the elements of the BS's antenna and the RAA's antenna, respectively. As an example, with reference to Fig. 1, in far field v_1 takes the form of (7), that is, the beam steering vector pointing towards the direction α, where α is the direction of arrival of the BS signal with respect to the UE. At the UE side, after conjugation, the retro-directed signal in (10) has direction of departure α, which means that the signal reflected by the RAA points back to the BS. More specifically, assuming the BS and the RAA are in paraxial configuration (α = 0), the first eigenvalue of A, λ_1, is given by (39) [6]. As a consequence, the maximum and bootstrap SNRs can be expressed in closed form as functions of N, M, and the BS-UE distance.
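The scaling of the strongest eigenvalue in the paraxial free-space case can be checked numerically with a minimal sketch. The array sizes (8 × 8 at the BS, 4 × 4 at the UE), the half-wavelength spacing, the unit element gains and the use of H†H without the transmit power and RAA gain factors are all assumptions of this example, not values taken from Table 1; the helper names are hypothetical. Under these assumptions the far-field channel is close to rank one, so the top eigenvalue of H†H should be close to N M G_BS G_RAA (λ/(4πd))².

```python
import numpy as np

def planar_array(n_side, spacing, z):
    """Positions of an n_side x n_side planar array in the xy-plane at height z."""
    idx = (np.arange(n_side) - (n_side - 1) / 2) * spacing
    xx, yy = np.meshgrid(idx, idx)
    return np.column_stack([xx.ravel(), yy.ravel(), np.full(xx.size, z)])

def free_space_channel(bs_pos, ue_pos, wavelength, g_bs=1.0, g_raa=1.0):
    """M x N free-space channel: row = RAA element, column = BS element."""
    d = np.linalg.norm(ue_pos[:, None, :] - bs_pos[None, :, :], axis=2)
    amplitude = np.sqrt(g_bs * g_raa) * wavelength / (4 * np.pi * d)
    return amplitude * np.exp(-2j * np.pi * d / wavelength)

wavelength = 3e8 / 28e9                 # 28 GHz carrier
distance = 10.0                         # UE at (0, 0, 10) m, paraxial configuration
bs = planar_array(8, wavelength / 2, 0.0)
ue = planar_array(4, wavelength / 2, distance)
N, M = bs.shape[0], ue.shape[0]

H = free_space_channel(bs, ue, wavelength)
lam1 = np.linalg.eigvalsh(H.conj().T @ H)[-1]            # largest eigenvalue of H^H H
closed_form = N * M * (wavelength / (4 * np.pi * distance)) ** 2
print(lam1 / closed_form)
```

Since the received power along v_1 is proportional to the square of this eigenvalue, the sketch is consistent with a round-trip path loss growing with the fourth power of the distance.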
Remarks: The last equations show that, due to the backscattering nature of the communication, the path loss increases with the fourth power of the distance, as happens with radio frequency identification (RFID) systems based on backscatter modulation [62]. On the other hand, such a large path loss can be compensated for by increasing the number of antenna elements N and M at the BS and UE, respectively. Increasing N and M is also beneficial for the bootstrap SNR, even though M has a higher impact than N. Another way to compensate for the path loss is to fix the areas of the BS and UE antennas and increase the frequency. In this way, since the area of the BS's antenna is equal to A_BS = Nλ²/4 and the area of the UE's antenna scales analogously with Mλ², the SNR can be rewritten in terms of the two areas, which highlights that, by keeping the area of the antenna arrays constant, the SNR increases with the fourth power of the frequency, and hence that our scheme is particularly suitable for high-frequency implementations.
The proposed system works independently of the channel characteristics (far-field, near-field, free-space, multipath, . . .) and of the number of antenna elements, and it is completely blind, as it does not require any explicit channel estimation. The result is the fast establishment of a single-layer MIMO communication link with an extremely simple UE which does not require any ADC chain. It also allows channel tracking provided that the channel dynamics are not faster than the convergence speed, which depends on the ratio λ_2²/λ_1² and the symbol time T, as will be investigated in the numerical results. The latter can be made very small, as it is lower bounded only by the propagation round-trip time (< 100 ns in indoor environments) and the BS computational capacity.
It is worth noticing that when the channel has rank r = 1, the resulting communication scheme is capacity-optimal. When r > 1, the scheme is not capacity-optimal because the RAA is intrinsically single-layer, so that only one out of the r potential data streams that could be established between the BS and the UE is exploited (lower spectral efficiency). However, it corresponds to the optimal single-layer beamforming scheme in the SNR maximization sense [63]. As a consequence, it contributes to maximizing the SNR at the decision variable and reducing its fluctuations, thus providing the maximum diversity gain. This is achieved with extremely low complexity with respect to classical implementations, which require CSI estimation and analog/digital beamforming, as will be discussed in Section VI.

C. ANALYSIS WITH STATIC CLUTTER
In the presence of static clutter, for which C[k] = C, the term (13) is no longer uncorrelated with respect to the first term and it cannot be considered as additive noise. As a consequence, the iterative Communication Scheme 1 will converge to the strongest eigenvector of matrix Ã = A + √P_T C instead of v_1, i.e., that of matrix A associated with the UE. In the presence of strong clutter caused, for instance, by scatterers located close to the BS, it might happen that the strongest eigenvector of Ã corresponds to a beam steered towards the scatterers instead of the UE. A general analysis of the problem appears prohibitive due to the presence of a random matrix. However, it is possible to derive a worst-case asymptotic condition on σ_c² for a large number N of antennas at the BS. Specifically, the worst-case scenario is that where the subspaces spanned by A and C are disjoint, meaning, for instance, that the UE and the scatterers are located far away from each other. On the contrary, if A and C had a common subspace, the strongest eigenvector of Ã would likely be partially steered toward the UE, thus still making the communication possible. Under the worst-case assumption, Communication Scheme 1 fails to converge to v_1 in favor of the strongest eigenvector of C if λ_1 < √P_T λ_1^(c), where λ_1 and λ_1^(c) are the largest eigenvalues of A and C, respectively. Since C is a random matrix, the previous inequality has to be intended as an event that requires a statistical characterization. From the Tracy-Widom law, it is known that the largest eigenvalue of a random matrix of size N with i.i.d. Gaussian entries having variance σ_c² converges almost surely to the value 2σ_c√N for increasing N [64]. Therefore, Communication Scheme 1 will converge to v_1 if the following condition holds

$$\sigma_c^2 < \sigma_t^2 = \frac{\lambda_1^2}{4\, P_T\, N}. \qquad (43)$$

In the non-worst-case scenario, condition (43) is still valid, even though conservative.
It is interesting to point out that, since λ_1² increases with N² (see, for instance, (39)), the threshold σ_t² increases with N, meaning that Communication Scheme 1 can be made more robust to static clutter by increasing the number of antennas at the BS.
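The role of the threshold can be explored numerically with a small sketch. It rests on several assumptions that are not part of the paper: the clutter matrix is modelled as a Hermitian random matrix whose entries have variance σ_c² (so that its largest eigenvalue is close to 2σ_c√N), A is taken as rank one with a random unit-norm v_1, the threshold is computed as σ_t² = λ_1²/(4 P_T N), and the strongest eigenvector of Ã is obtained directly with an eigendecomposition instead of running the iterative scheme. All function names are hypothetical.

```python
import numpy as np

def hermitian_clutter(n, sigma_c, rng):
    """Hermitian random matrix whose off-diagonal entries have variance sigma_c**2."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) * sigma_c / np.sqrt(2)
    return (g + g.conj().T) / np.sqrt(2)

def clutter_threshold(lam1, p_t, n):
    """Assumed threshold sigma_t^2 = lam1^2 / (4 P_T N)."""
    return lam1 ** 2 / (4.0 * p_t * n)

def overlap_with_user(n, lam1, p_t, eta_c, rng):
    """|v1^H w|, with w the strongest eigenvector of A + sqrt(P_T) C and eta_c = sigma_c^2 / sigma_t^2."""
    v1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v1 /= np.linalg.norm(v1)
    a = lam1 * np.outer(v1, v1.conj())                   # rank-one user matrix
    sigma_c = np.sqrt(eta_c * clutter_threshold(lam1, p_t, n))
    a_tilde = a + np.sqrt(p_t) * hermitian_clutter(n, sigma_c, rng)
    _, vecs = np.linalg.eigh(a_tilde)                    # eigenvalues in ascending order
    return abs(np.vdot(v1, vecs[:, -1]))

rng = np.random.default_rng(0)
n = 400
# Largest eigenvalue of the clutter matrix, normalized by 2 * sigma_c * sqrt(N)
edge_ratio = np.linalg.eigvalsh(hermitian_clutter(n, 1.0, rng))[-1] / (2.0 * np.sqrt(n))
weak = overlap_with_user(n, lam1=1.0, p_t=1.0, eta_c=0.1, rng=rng)
strong = overlap_with_user(n, lam1=1.0, p_t=1.0, eta_c=20.0, rng=rng)
print(edge_ratio, weak, strong)
```

In this toy model, a clutter variance well below the threshold leaves the strongest eigenvector essentially aligned with v_1, whereas a variance well above it makes the alignment collapse, which is the qualitative behaviour discussed above.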

IV. ESTABLISHING MU-MIMO COMMUNICATIONS AUTOMATICALLY -MULTI USER SCENARIO
Consider now the scenario shown in Fig. 4, in which there are U users equipped with the proposed RAAs with the aim of establishing up to L parallel MU-MIMO links. Denote by H^(1), H^(2), . . . , H^(U) ∈ C^{M×N} the channel matrices related to the U links between the BS and the users. For further convenience, we define the total channel matrix H of size MU × N as

$$H = \left[ H^{(1)T} \;\; H^{(2)T} \;\; \cdots \;\; H^{(U)T} \right]^{T}$$

and the matrix A ∈ C^{N×N} associated with H. Note that A = Σ_{u=1}^{U} A^(u), where A^(u) ∈ C^{N×N} is the corresponding matrix associated with the channel H^(u), for u = 1, 2, . . . , U. Define r^(u) = rank(A^(u)), and v_i^(u) the ith eigenvector of A^(u), with i = 1, 2, . . . , r^(u). Note that now P_T represents the total transmitted power.
We now make the assumption that the subspaces spanned by H^(u) are orthogonal, and so are those spanned by A^(u); that is, the eigenvectors v_i^(u) and v_n^(p) associated with different users u ≠ p are mutually orthogonal, for i = 1, 2, . . . , r^(u) and n = 1, 2, . . . , r^(p). As a consequence, the top eigenvectors {v_j} of A are given by the eigenvectors v_i^(u) of the matrices A^(u) (47), with u = 1, 2, . . . , U, i = 1, 2, . . . , r^(u), and Σ_{u=1}^{U} r^(u) ≤ N. This means that each user is associated with a dedicated subset of eigenvectors of A. For instance, in free-space and far-field conditions, each eigenvector will correspond to one user. In such a case, if the BS were capable of estimating the L ≤ U strongest eigenvectors, then the optimal MU-MIMO communication with L out of U users would be established. More generally, to establish a communication link with U UEs, L must be larger than U because the same data stream generated by one UE affects multiple directions v_i^(u) that might not be ordered in terms of the associated eigenvalues. In this case, it is supposed that the data streams are reordered by the higher layers of the protocol stack. Without loss of generality, in the rest of the paper we suppose that the top-L eigenvectors of A correspond to distinct UEs, i.e., v_j = v_1^(j), for j = 1, 2, . . . , L, and L ≤ U.
The above disjoint assumption might correspond to the situation where users are located at different positions and a large number of antennas are used on both sides in a rich scattering scenario, as in massive MIMO systems (thus exploiting favourable propagation conditions). In practical contexts, such an assumption might not be exactly satisfied thus generating interference between users. This phenomenon will be investigated in the numerical results.
In the following, we extend the approach considered in the previous subsection to estimate the L strongest eigenvectors of A.

A. BEAMFORMING VECTORS ESTIMATION WITH NO DATA TRANSMISSION
We first consider the case where no data are transmitted, and the purpose is to estimate the top-L eigenvectors of A forming the columns of matrix V_L, which contains the first L columns of V. This can be accomplished by customizing to our case the block version of the Power Method [45], also known as Orthogonal Iteration, to account for the operation performed by the RAA. The algorithm, presented as Communication Scheme 2 along with its accompanying pseudo-code, exemplifies how this adaptation is implemented. At the lth time interval of the kth iteration, the signal received by the uth UE is affected by the noise generated at the uth RAA (step 4). Denoting by r^(u) the signal retro-directed by the uth UE (step 5), the signal y_l[k] received by the BS at the lth time interval (step 6) is the superposition of the U retro-directed signals, each propagated through the respective backward channel. At each iteration, the L received vectors are gathered into the matrix Y[k] = {y_l[k]}, l = 1, . . . , L; then, according to the Block Power method [45], the receiver performs a QR decomposition to obtain an updated version of matrix X[k] that will contain a set of L orthogonal vectors (step 7). In the absence of noise, the Block Power method ensures that, for large k, the matrix X[k] (updated in step 8) converges towards V_L, which contains the L top eigenvectors of A [45]. Communication Scheme 2 could be used as part of a protocol in which, first, an unmodulated preamble of length K_p is sent to estimate the beamforming vectors V_L at the BS and, second, data is transmitted by keeping the estimated beamforming vectors fixed to X[K_p] throughout the data packet transmission time, assuming the channel remains static. The first step must be repeated periodically according to the rate of change of the channel, which implies some overhead due to the preamble.
We point out that, differently from Communication Scheme 1, Communication Scheme 2 cannot be used to transmit data and track channel variations simultaneously because, upon convergence, X[k] will contain the set of eigenvectors, which are orthogonal, and hence only one user per symbol time would be addressed, following a TDMA-like scheme (see Fig. 4). This would correspond to an increased complexity, as each user should modulate the RAA according to a slotted protocol, as well as to a reduction of the spectral efficiency by a factor L (the analytical motivation will be explained in the next subsection and in the Appendix). To overcome the above issues, in the next subsection we propose a novel ad-hoc algorithm, inspired by the Block Power method, suitable for continuous data transmission from users and channel tracking.
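The orthogonal iteration underlying Communication Scheme 2 can be sketched in a few lines for the noise-free, unmodulated case. The sketch assumes that the round-trip matrix is A = Σ_u H^(u)† H^(u) (power and RAA gain factors omitted) and uses a hypothetical total channel with prescribed singular values, split row-wise among three users with two antennas each; none of these values comes from the paper.

```python
import numpy as np

def random_unitary(n, rng):
    """Random n x n unitary matrix (QR factorization of a complex Gaussian matrix)."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(g)
    return q

def block_power_retrodirective(h_users, n_layers, n_iter, rng):
    """Noise-free sketch of the block Power Method with retro-directive UEs (no data)."""
    n = h_users[0].shape[1]
    g = rng.standard_normal((n, n_layers)) + 1j * rng.standard_normal((n, n_layers))
    x, _ = np.linalg.qr(g)                      # random orthonormal initial guess
    for _ in range(n_iter):
        y = np.zeros((n, n_layers), dtype=complex)
        for h in h_users:
            z = h @ x                           # column l: signal received by this UE in slot l
            y += h.T @ np.conj(z)               # conjugation at the RAA, then backward channel
        x, _ = np.linalg.qr(np.conj(y))         # QR step at the BS: conj(Y) = A X
    return x

rng = np.random.default_rng(2)
M, U, N, L = 2, 3, 16, 3
s = np.array([3.0, 2.4, 1.8, 1.2, 0.6, 0.3])    # hypothetical singular values of the total channel
V = random_unitary(N, rng)
H_total = random_unitary(M * U, rng) @ np.diag(s) @ V[:, :M * U].conj().T
H_users = [H_total[u * M:(u + 1) * M, :] for u in range(U)]

X_hat = block_power_retrodirective(H_users, L, n_iter=80, rng=rng)
V_L = V[:, :L]                                   # top-L eigenvectors of A by construction
column_alignment = np.abs(np.sum(V_L.conj() * X_hat, axis=0))
print(column_alignment)
```

Each column of the estimate matches the corresponding eigenvector only up to a unit-modulus factor, which is why the check uses the magnitude of the inner products.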

B. JOINT BEAMFORMING AND DATA TRANSMISSION
We now introduce Communication Scheme 3, which is capable of handling simultaneous UEs' transmissions, as well as tracking the channel.
Communication Scheme 3. The main idea underlying Communication Scheme 3 (refer to the accompanying pseudo-code) is to transmit, at each iteration, a full-rank linear combination of the estimated eigenvectors (scrambling) rather than transmitting them sequentially. This is achieved by multiplying the matrix X[k] by a full-rank unitary matrix P (scrambling matrix). By adopting this approach, all users can be simultaneously addressed, enabling continuous transmission without sacrificing spectral efficiency. Unfortunately, due to the scrambling process performed by the BS on the transmitted signal and the phase modulation {φ_l^(u)} applied by the UEs, the received signal cannot be used directly to update the estimate of the eigenvectors: the user data and the scrambling matrix P must first be de-embedded from it, as detailed in the Appendix. Once the "cleaned" matrix B has been obtained in step 13, its QR decomposition can be performed (step 14), leading to the derivation of the updated estimate of V_L (step 15). To enhance the scheme's resilience against interference among users when the orthogonality assumption on the channels is violated, a straightforward approach involves applying a time filtering operation to the updated eigenvector matrix V[k] using a forgetting factor 0 ≤ α_f ≤ 1 (step 16). The new full-rank unitary matrix P is then generated (step 17) and used to scramble V_L before transmission (step 18). In the Appendix, we demonstrate that, in the absence of noise, the proposed scheme operates effectively and achieves convergence. The general characterization of the convergence behaviour of this algorithm in the presence of noise appears prohibitive, so it will be investigated numerically in the next section. As will be shown in Section V, Communication Scheme 3 is capable of establishing automatically and simultaneously up to L data links without the need for channel estimation. In free-space and far-field conditions, it realizes an optimal MU-MIMO scheme which requires neither processing at the UE nor any ADC chain.
Importantly, since L parallel links are established, no reduction in the spectral efficiency is introduced when L users are simultaneously active. Concerning the computational complexity of Communication Scheme 3, it is mainly determined by the QR decomposition, so it is O(N³) and it involves only the BS.

V. NUMERICAL RESULTS
In this section, we provide some numerical examples to investigate the performance of the proposed RAA-based joint communication and beamforming scheme. The values of the system parameters we adopted for the analysis and the simulations are those reported in Table 1, if not otherwise specified. The BS, equipped with a planar array deployed along the xy-plane, is located at (0, 0, 0) [m]. Both free-space and more realistic 3GPP TR 38.901 channel models [65] have been considered in the simulations, the latter with a delay spread of 9 ns. In particular, among the NLOS channel models provided in [65] (CDL-A, CDL-B and CDL-C), we chose CDL-C because it features higher path delays than CDL-B and more evenly distributed over time than CDL-A. Concerning LOS channel models (CDL-D, CDL-E), we opted for CDL-E because it features higher path delays than CDL-D. In the numerical results, we focused more on CDL-C, as the NLOS condition is more challenging.

A. SINGLE-USER SCENARIO
We start our investigation by considering a single-user scenario and Communication Scheme 1 in the presence of dynamic clutter. The UE is located at (0, 0, 10) [m] with a RAA deployed along the xy-plane. In Fig. 5(a), the evolution of SNR^(dec)[k] is shown in static free-space conditions for two values of the transmitted power, P_T = −30 dBm and P_T = −45 dBm, corresponding to SNR^(boot) = 10.3 dB and SNR^(boot) = −4.7 dB, respectively. Simulation results are compared with the theoretical ones obtained by evaluating (30). From the plots, it can be clearly noticed that the scheme does not converge to the maximum SNR when SNR^(boot) < 0 dB, as predicted by the theoretical analysis reported in Section III. The SNRs that would be obtained with a single-antenna backscattering UE are also given as a reference (the bottommost flat curves) in order to emphasize the large gain introduced by the RAA and the antenna array at the BS, despite the large path loss experienced at the considered carrier frequency (28 GHz) and the two-way backscatter channel. In Fig. 5(b), the effect of static clutter is investigated for different values of the ratio η_c = σ_c²/σ_t² for one realization of the clutter matrix C. As predicted in Section III, as soon as η_c exceeds 1, the performance degrades rapidly until a complete lack of convergence, as can be seen for η_c = 5, indicating that the beam focuses on the strong clutter instead of on the user. The impact of channel non-reciprocity caused, for instance, by asymmetric RAAs, is reported in Fig. 5(c). Non-reciprocity has been simulated by substituting (H + R)† for H† in the backward channel, where R is modelled as a random matrix with zero-mean complex Gaussian elements and variance σ_asym². Plots for different ratios η_asym = σ_asym² N M/‖H‖_F² are shown, where ‖H‖_F is the Frobenius norm of matrix H. The results indicate that the proposed scheme is very robust to forward-backward channel asymmetries as long as η_asym remains below N (26 dB).
Results with more realistic, time-variant, multipath 3GPP channel models, namely the Clustered Delay Line (CDL) models [65], are shown in Fig. 6 for P_T = −10 dBm and P_T = −15 dBm. In particular, the LOS CDL-E and the NLOS CDL-C channel models [65] have been considered for different speeds from 3 km/h up to 300 km/h. The behaviour of the SNR assuming perfect estimation of the top eigenvector, i.e., the maximum SNR, is reported for comparison (see the theoretical curves plotted with a dashed style).5 The results demonstrate the capability of the communication scheme to track the fading evolution at very high speed as long as the channel dynamics are not faster than the convergence speed. Since f_d ≪ f_0, in our analysis we neglected the effect of the Doppler frequency shift in the RAA response characterization. It can also be observed that the range of fading variation during tracking is relatively narrow, approximately 10 dB. This limited range is a consequence of the diversity gain intrinsic in the communication associated with the strongest eigenvector. Moreover, it can be noticed that the initial transient to achieve convergence in NLOS conditions (CDL-C), characterized by significant multipath components, is slightly longer than that of the LOS condition (CDL-E). However, it falls below 8 time intervals (i.e., 36 μs), confirming that the proposed scheme achieves initial beamforming and link setup with extremely low latency. In any case, it is safe not to transmit data during the initial transient by setting to zero the phase φ[k] within the preamble, that is, for k = 1, 2, . . . , K_p, with K_p larger than the expected transient duration.

5. Since the channel is time-variant, the maximum SNR is time-variant as well.

B. MULTI-USER SCENARIO
A scenario with 4 users, located as shown in Fig. 7, is investigated when Communication Scheme 3 is adopted. A forgetting factor α_f = 0.3 and a preamble of K_p = 40 symbols (with scrambling matrix P = I_L) have been chosen as a trade-off between convergence speed and robustness to inter-user interference. Indeed, in a multi-user scenario, it is possible for channels to lack orthogonality, resulting in interference between users. Consequently, the key metric to be examined in this context is the signal-to-interference-plus-noise ratio (SINR).
In Fig. 7, the resulting equivalent array factor at the BS and the 4 UEs at the convergence is also reported in the case of free-space propagation. The equivalent array factor at UE side has been evaluated by computing (H V[k]) * at the convergence, whereas at the BS side, it has been obtained starting from the beamforming vector V[k]. As expected, for each column of V[k], the BS generates a beam pointing towards the corresponding user. On the UEs side, the backscatter signal is redirected towards the BS, as it can be noticed in Fig. 7.
An example of transient of the SINR is reported in Fig. 8 for a total power P T = −30 dBm in free-space conditions. Compared with the transient where only one user is considered, as in Fig. 5(a), a slower convergence and a noisier behaviour are observed, mainly due to the presence of interuser interference caused by the non-perfect orthogonality between the different user's channels. The SINR has been evaluated as the ratio between the power from the intended user associated with its strongest eigenvector and the power of the other users' components projected onto the same eigenvector (plus noise).
In Fig. 9(a), the empirical cumulative distribution function (CDF) of the SINR is reported by simulating 100 Monte Carlo iterations for P_T = −30 dBm and P_T = −45 dBm in free space. Clearly, the more the curves are shifted to the right, the better the conditions. While with P_T = −30 dBm all the 4 users present a significant SINR in most cases, with P_T = −45 dBm only user 4 has relatively good performance because it is the closest to the BS, while the other users experience a SINR which goes below the bootstrap SNR in most of the realizations. The results obtained in the ideal case where no inter-user interference is present are also reported for comparison in order to highlight the impact of non-perfect orthogonality between users that, in this scenario, is dominant and might generate important SINR fluctuations. Inter-user interference can be made negligible by significantly increasing the number of antennas to create the conditions for favourable propagation, as is well known from massive MIMO theory.
Finally, Fig. 9(b) shows the empirical CDF of the SINR for P_T = −10 dBm and P_T = −20 dBm by considering 100 different realizations of the CDL-C NLOS channel model. Also in this case, results indicate the effectiveness of the proposed multi-user scheme even in the presence of strong multipath and inter-user interference.

VI. DISCUSSION
Our scheme is designed to support MU-MIMO communications with low latency, while avoiding the need for expensive circuitry. In the following, we will delve into the advantages and disadvantages of our solution compared to alternative schemes.

A. COMPARISON WITH OTHER MIMO IMPLEMENTATIONS

1) FULL DIGITAL MIMO
The full digital MIMO scheme (refer to Fig. 10(a)) includes multiple antennas at both the BS and the UE, with each antenna being equipped with its own RF chain.
Pros with respect to our scheme: It offers multiplexing gain and a more favourable link budget, as the UE does not operate in backscatter mode.
Cons with respect to our scheme: It necessitates multiple RF/ADC chains at the UE along with a digital processing unit for computing precoding and combining vectors, as well as demodulating data. This process exhibits a complexity of approximately O(N³) to perform the SVD. Additionally, it involves a CSI estimation process, resulting in signalling overhead and increased latency, whose outcome must be available at both the UE and BS. Moreover, the UE is intrinsically power-hungry (active transceiver).

2) SINGLE-RF MIMO
A single-RF MIMO system (refer to Fig. 10(b)), such as the one proposed in [66], or a conventional analog-beamformer scheme, considers beamforming operated in the analog domain at the UE side (or a similar strategy leading to a single RF chain). Thus, it entails significantly lower complexity than a full digital MIMO.
Pros with respect to our scheme: Similarly to our approach, it performs beamforming resulting in a single-layer optimum communication (thus, without exploiting spatial multiplexing). However, since it employs active devices, it offers broader coverage.
Cons with respect to our scheme: It still necessitates a CSI estimation process, leading to signalling overhead, and the CSI result must be available at both the UE and BS. As for full digital MIMO, the UE is power-hungry (active transceiver).

3) RIS-BASED BACKSCATTER
RIS-based backscatter communication systems (see Fig. 10(c)) incorporate a RIS at the UE side, aiming to effectively reduce UE complexity and minimize power consumption [39], [67], [68].
Pros with respect to our scheme: Since a RIS is composed of multiple cells that in general can be controlled singularly, different data streams can be transmitted at the same time, thus exploiting spatial multiplexing.
Cons with respect to our scheme: In order to configure the RIS, CSI estimation is required in addition to a dedicated RIS control channel. Moreover, the RIS configuration speed might represent a bottleneck limiting the capability of the system to work in highly dynamic environments and provide low-latency beamforming.

4) SINGLE-ANTENNA (ACTIVE OR BACKSCATTER)
To avoid the CSI estimation process and obtain a UE with a complexity similar to that of the proposed scheme, one should resort to an active or backscatter-based single-antenna UE [62], [69] (see Fig. 10(d)).
Pros with respect to our scheme: Only in the case of active transmission the link budget is more favourable, at the expense of increased power consumption.
Cons with respect to our scheme: The single-antenna scheme, although being single-layer as the proposed one, presents a loss of MIMO SNR and diversity gain. This is a primary issue when considering a backscatter-based solution working at high frequency, such as mmWave, which has an unfavourable link budget.

B. REMARKS ON THE PROPOSED RAA-BASED SCHEME
Our schemes possess the remarkable capability of supporting optimal single-layer MU-MIMO without requiring any processing by the UE to perform beamforming, nor complicated ADC/RF circuits. The computational complexity at the BS is O(N) for the single-user scenario and O(N³) for the multi-user scenario, the latter being similar to that of other MIMO schemes, while the complexity at the UE is minimized. Since communication is unidirectional, as the data flow only goes from the UEs to the BS, this scheme is suitable for applications where traffic is strongly asymmetrical, with uplink communications predominating.
We also point out that the iterative nature of the proposed beamforming algorithm exploiting the backscatter characteristic of the RAA requires that the iteration time should be larger than the propagation round-trip time τ p . Since the symbol time T is equal to the iteration time, then T is lower bounded by τ p , and hence the bandwidth is roughly upper bounded by 1/τ p . This means that high data rates can be achieved only for short distances. This is obviously a limitation of our proposal whose application has to be intended mainly in short-range scenarios such as industrial IoT. Indeed, a typical application scenario is where a large number of low-complexity energy-efficient sensors have to transmit short packets to the BS in a highly dynamic industrial environment with low latency (e.g., sensors applied to a rotating part of a machine). Notably, our scheme also provides a solution to the deafness problem [70], which occurs when receiving devices are insensitive to connection requests from transmitters whose antenna beams are not aligned with theirs. This problem is particularly critical for future communication systems that are expected to rely on highly directional beamforming antennas to enable high-capacity and ultra-massive communications. Indeed, our scheme facilitates the automatic alignment of transmitter and receiver beams with exceptionally low latency, ensuring efficient and prompt beam synchronization.
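The bound on the symbol time can be made concrete with a small arithmetic sketch. It assumes a straight line-of-sight path of length d, so that τ_p = 2d/c, and it neglects any processing delay at the BS and at the RAA; the function names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum [m/s]

def round_trip_time(distance_m):
    """Propagation round-trip time tau_p = 2 d / c for a BS-UE distance d [m]."""
    return 2.0 * distance_m / C_LIGHT

def max_symbol_rate(distance_m):
    """Rough upper bound 1 / tau_p on the symbol rate (and hence on the bandwidth) [Hz]."""
    return 1.0 / round_trip_time(distance_m)

for d in (1.0, 10.0, 100.0):
    print(f"d = {d:6.1f} m   tau_p = {round_trip_time(d) * 1e9:8.2f} ns   "
          f"1/tau_p = {max_symbol_rate(d) / 1e6:8.2f} MHz")
```

For a distance of 10 m, as in the single-user simulations, the round-trip time is about 67 ns, in line with the sub-100 ns figure quoted for indoor environments.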

C. IMPLEMENTATION CHALLENGES
In order to implement the proposed communication system, several challenges have to be tackled. In particular, (i) the BS must provide full-duplex capabilities with high isolation between the TX and RX sections, in order to feed the UE with an updated precoding vector while demodulating data simultaneously; (ii) the RAA (or SCM) must provide high gain in order to compensate for the two-hop propagation loss; as discussed, this gain can be traded with an increased power level at the BS side or with the number of antenna elements.
Moreover, as far as RAA/SCM implementations are concerned, solutions capable of guaranteeing high symmetry between the forward and backward channels are needed to ensure high performance; (iii) low-complexity symbol-level synchronization must be guaranteed between the BS and the UEs, in order to transmit data in a timely manner with little additional circuitry at the UE. A possible drawback of a scenario employing RAA-based UEs is the interference they could cause to other sources, for instance, belonging to other wireless networks operating in the same frequency band. In fact, the RAA will always redirect any incoming RF signal back to the respective source. Nevertheless, this does not represent an issue for conventional half-duplex BSs/UEs because they do not receive while transmitting. Vice versa, full-duplex BSs might suffer from self-interference if adequate countermeasures are not implemented (e.g., echo cancellation).

VII. CONCLUSION
In this paper, we have proposed the adoption of modulating RAAs as a means to realize ultra-low complexity uplink MIMO communications based on retrodirective backscattering. Thanks to the iterative algorithms introduced in this paper, inspired by Power Methods, the BS is able to derive and track the optimal beamforming vectors for the BS-UE round-trip channel. It has been shown that, in MU-MIMO scenarios, UEs can establish parallel communication links automatically with the BS in a blind way while communicating without the need for ADC chains as well as without explicit channel estimation and time-consuming beamforming and beam alignment schemes. An analytical characterization of the SNR evolution during the iterations has been derived allowing the identification of the conditions needed to reach a robust SNR at the convergence, even in the presence of strong static clutter. Numerical results have put in evidence that MU-MIMO communications can be established automatically after a few iterations, with μs-level latency, even using very large arrays, and that very fast channel variations can be tracked in realistic dynamic scenarios characterized by multipath. The proposed system is particularly appealing in many applications envisioned for 6G requiring low-complexity MIMO solutions at high frequencies with extremely short connection-setup times, such as in IIoT environments and V2X communication.

APPENDIX
In this Appendix we prove that, assuming convergence has been reached, i.e., $\mathbf{V}[k] \approx \mathbf{V}_L$ for large $k$, and in the absence of noise, the algorithm in Communication Scheme 3 of Section IV correctly extracts the data and preserves convergence, meaning that the matrix $\mathbf{Q}$ returns the eigenvector matrix $\mathbf{V}_L$ at each iteration independently of the data and of the scrambling matrix $\mathbf{P}$. We consider a generic iteration $k$ and, to lighten the notation, drop the index $k$ in the following, focusing on the transmission and reception of a sequence (block) of $L$ vectors. Under the assumptions above, at convergence the transmitted signal is $\mathbf{X} = \mathbf{V}_L \mathbf{P}$, so that the vector transmitted at the $l$th time instant is $\mathbf{v}_1 p_{1,l} + \cdots + \mathbf{v}_L p_{L,l}$. Accounting for the signals reflected by the $U$ users equipped with the RAA, each phase-modulated according to its own data, the signal processed by the BS at the $l$th time instant of the sequence is

$$\mathbf{y}_l = \sum_{u=1}^{U} e^{-j\phi_l^{(u)}} \mathbf{A}^{(u)} \left( \mathbf{v}_1 p_{1,l} + \cdots + \mathbf{v}_L p_{L,l} \right). \tag{53}$$
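Assuming that the subspaces spanned by $\mathbf{A}^{(u)}$, for $u = 1, 2, \ldots, U$, are orthogonal and that the $L$ eigenvectors in $\mathbf{V}_L$ are mapped one-to-one to the first $L$ users, (53) becomes

$$\mathbf{y}_l = \sum_{u=1}^{L} \lambda_u \, e^{-j\phi_l^{(u)}} \, p_{u,l} \, \mathbf{v}_u, \tag{54}$$

where $\lambda_u$ is the eigenvalue of $\mathbf{A}^{(u)}$ corresponding to $\mathbf{v}_u$, and $\mathbf{A}^{(u)} \mathbf{v}_l = \mathbf{0}$ when $l \neq u$. Gathering the $L$ vectors $\mathbf{y}_l$ into the matrix $\mathbf{Y}$, it can be shown that

$$\mathbf{Y} = \mathbf{V}_L \boldsymbol{\Lambda} \left( \mathbf{P} \odot \mathbf{D}^* \right), \tag{55}$$

where $\odot$ denotes the element-wise (Hadamard) product, $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_L)$, and $[\mathbf{D}]_{u,l} = e^{j\phi_l^{(u)}}$ is the data matrix, with $u, l = 1, 2, \ldots, L$. According to Communication Scheme 3, to retrieve the data, the receiver computes the matrix

$$\mathbf{F} = \mathbf{V}_L^{\mathsf{H}} \mathbf{Y}, \tag{56}$$

$\mathbf{V}_L$ being a unitary matrix, i.e., $\mathbf{V}_L^{\mathsf{H}} \mathbf{V}_L = \mathbf{I}_L$. It is straightforward to see that

$$\mathbf{F} = \boldsymbol{\Lambda} \left( \mathbf{P} \odot \mathbf{D}^* \right), \qquad f_{u,l} = [\mathbf{F}]_{u,l} = \lambda_u \, p_{u,l} \, e^{-j\phi_l^{(u)}}. \tag{57}$$

Then, since $\lambda_u$ is real and positive, the phases carrying the information can be recovered, for every entry with $p_{u,l} \neq 0$, as

$$\phi_l^{(u)} = -\arg\left( f_{u,l} / p_{u,l} \right) \tag{58}$$

and hence the data matrix $\mathbf{D}$ can be recovered as well.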
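The data-recovery step can be checked numerically with a minimal NumPy sketch. It assumes that the noiseless received block has the form $\mathbf{Y} = \mathbf{V}_L \boldsymbol{\Lambda} (\mathbf{P} \odot \mathbf{D}^*)$, with $\odot$ the element-wise product. The array sizes, the eigenvalues, the random orthonormal $\mathbf{V}_L$, and the choice of a unitary DFT matrix for $\mathbf{P}$ are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def recover_phases(F, P):
    """Phase recovery as in (58): phi[u, l] = -arg(f[u, l] / p[u, l]).
    Requires every entry of P to be nonzero."""
    return -np.angle(F / P)

rng = np.random.default_rng(0)
N, L = 8, 3                      # hypothetical: N BS antennas, L users

# Orthonormal columns standing in for the eigenvectors V_L (assumption).
G = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
V, _ = np.linalg.qr(G)

Lam = np.diag([3.0, 2.0, 1.0])   # assumed real, positive eigenvalues

# Scrambling matrix: a unitary DFT matrix, chosen only because all of its
# entries are nonzero (assumption, not necessarily the paper's choice).
P = np.fft.fft(np.eye(L)) / np.sqrt(L)

phi = rng.uniform(-np.pi, np.pi, size=(L, L))   # phi[u, l]: user u, instant l
D = np.exp(1j * phi)                            # data matrix

Y = V @ Lam @ (P * np.conj(D))   # received block (noiseless)
F = V.conj().T @ Y               # projection onto the beamforming vectors
phi_hat = recover_phases(F, P)

# Compare phases modulo 2*pi.
wrapped_err = np.angle(np.exp(1j * (phi_hat - phi)))
print("max phase error:", np.max(np.abs(wrapped_err)))
```

In this noiseless setting the printed error should be at the level of floating-point round-off, since the positive eigenvalues do not affect the argument of $f_{u,l}/p_{u,l}$.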
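To obtain an updated estimate of the beamforming vectors $\mathbf{V}_L$, the BS first needs to de-embed the user data (contained in $\mathbf{D}$) and the scrambling matrix $\mathbf{P}$ from the received signal $\mathbf{Y}$. This task is accomplished by deriving the matrix $\mathbf{B}$, defined as

$$\mathbf{B} = \mathbf{Y} \mathbf{E}, \tag{59}$$

with $\mathbf{E} = \left( \mathbf{P} \odot \mathbf{D}^* \right)^{-1}$, where $\odot$ denotes the element-wise (Hadamard) product. Substituting (55) into (59) gives

$$\mathbf{B} = \mathbf{V}_L \boldsymbol{\Lambda} \left( \mathbf{P} \odot \mathbf{D}^* \right) \left( \mathbf{P} \odot \mathbf{D}^* \right)^{-1} = \mathbf{V}_L \boldsymbol{\Lambda}. \tag{60}$$

Since $\mathbf{B}$ is a weighted version of $\mathbf{V}_L$, i.e., its columns are those of $\mathbf{V}_L$ scaled by the corresponding eigenvalues, its QR decomposition returns $\mathbf{Q} = \mathbf{V}_L$. Hence, at steady state, the transmitted signal remains $\mathbf{X} = \mathbf{V}_L \mathbf{P}$, as initially assumed.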
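The steady-state argument can be illustrated with a short NumPy sketch, again assuming a noiseless block $\mathbf{Y} = \mathbf{V}_L \boldsymbol{\Lambda} (\mathbf{P} \odot \mathbf{D}^*)$ and an invertible $\mathbf{P} \odot \mathbf{D}^*$. All numerical values and the DFT scrambling matrix are hypothetical. Note that a numerical QR routine fixes $\mathbf{Q}$ only up to a unit-modulus factor per column, so the sketch normalizes the factorization to make the diagonal of $\mathbf{R}$ real and positive before comparing with $\mathbf{V}_L$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 8, 3                      # hypothetical sizes

# Orthonormal columns standing in for V_L (assumption).
G = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
V, _ = np.linalg.qr(G)

Lam = np.diag([3.0, 2.0, 1.0])   # assumed real, positive eigenvalues
P = np.fft.fft(np.eye(L)) / np.sqrt(L)      # assumed scrambling matrix
D = np.exp(1j * rng.uniform(-np.pi, np.pi, size=(L, L)))  # data matrix

M = P * np.conj(D)               # element-wise product P (.) D*
Y = V @ Lam @ M                  # received block (noiseless)

E = np.linalg.inv(M)             # de-embeds data and scrambling
B = Y @ E                        # should equal V @ Lam

Q, R = np.linalg.qr(B)
# Remove the per-column phase ambiguity so that diag(R) is real positive.
phase = np.diag(R) / np.abs(np.diag(R))
Q_fixed = Q * phase              # scales column j of Q by phase[j]

X_next = Q_fixed @ P             # signal transmitted in the next block
print("B == V Lam:", np.allclose(B, V @ Lam))
print("Q == V_L  :", np.allclose(Q_fixed, V))
```

With the normalization applied, the uniqueness of the thin QR factorization for a full-column-rank matrix implies that the recovered factor coincides with $\mathbf{V}_L$, so the next transmitted block is again $\mathbf{V}_L \mathbf{P}$.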
It is worth noting that, in the absence of the scrambling operation, i.e., $\mathbf{P} = \mathbf{I}_L$, only the data lying on the diagonal of matrix $\mathbf{D}$ can be recovered, thus forcing the users to follow a TDMA scheme with a consequent loss of spectral efficiency, as pointed out in Section IV.
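The effect of removing the scrambling can be seen in a small NumPy sketch, under the same assumed noiseless model $\mathbf{Y} = \mathbf{V}_L \boldsymbol{\Lambda} (\mathbf{P} \odot \mathbf{D}^*)$ with $\odot$ the element-wise product. With $\mathbf{P} = \mathbf{I}_L$ the off-diagonal entries of the projected matrix vanish, so only the diagonal phases survive. Sizes and eigenvalues are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 8, 3                      # hypothetical sizes

# Orthonormal columns standing in for V_L (assumption).
G = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
V, _ = np.linalg.qr(G)

lam = np.array([3.0, 2.0, 1.0])  # assumed real, positive eigenvalues
phi = rng.uniform(-np.pi, np.pi, size=(L, L))
D = np.exp(1j * phi)             # data matrix

P = np.eye(L)                    # no scrambling
Y = V @ np.diag(lam) @ (P * np.conj(D))
F = V.conj().T @ Y               # projection onto the beamforming vectors

# Off-diagonal entries carry no signal: those phases are lost.
off_diag = F - np.diag(np.diag(F))
print("max |off-diagonal|:", np.max(np.abs(off_diag)))

# Only user u's phase at instant l = u is recoverable.
phi_diag_hat = -np.angle(np.diag(F))
print("diagonal phases recovered:",
      np.allclose(np.exp(1j * phi_diag_hat), np.exp(1j * np.diag(phi))))
```

Each time instant then carries the data of a single user, which corresponds to the TDMA behaviour described above.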