Differential Data-Aided Beam Training for RIS-Empowered Multi-Antenna Communications

The Reconfigurable Intelligent Surface (RIS) is one of the prominent technologies for the next generation of wireless communications. It is envisioned to enhance signal coverage when the direct communication link is weak. Recently, beam training based on codebook selection has been proposed to obtain an optimized phase configuration of the RIS. After training, the data are transmitted and received using the classical coherent demodulation scheme (CDS). This training approach avoids the large overhead required by the channel sounding process, and it also circumvents complex optimization problems. However, beam training still requires the transmission of reference signals to test the different phase configurations of the codebook, and the best codeword is chosen according to the measured received energy of those reference signals. The overhead due to these reference signals reduces the spectral efficiency. In this paper, a zero-overhead beam training scheme for RIS is proposed, relying on data transmission and reception based on non-CDS (NCDS). At the base station (BS), the received differential data can also be used to determine the best beam for the RIS. Therefore, the efficiency of the system is significantly enhanced, since reference signals are fully avoided. After choosing the best codeword, NCDS remains more suitable than the classical CDS for transmitting information in high-mobility scenarios. Analytical expressions for the Signal-to-Interference-plus-Noise Ratio (SINR) of the non-coherent RIS-empowered system are presented. Moreover, a detailed comparison between NCDS and CDS in terms of efficiency and complexity is given. Extensive computer simulation results verify the accuracy of the presented analysis and show that the proposed system outperforms the existing solutions.

fully enhanced Mobile Broadband (eMBB) experience. The drawback is that the coverage in these bands will suffer from a strong attenuation loss. RIS-empowered links are an appealing solution for both improving and extending the signal transmitted by either the Base Station (BS) or the User Equipment (UE), without excessively increasing the overall cost of the wireless network.
RISs are lightweight and hardware-efficient artificial planar structures of nearly passive reflective elements [5] that enable desired dynamic transformations on the signal propagation environment in wireless communications [3]. They can support a wide variety of electromagnetic functionalities [11], [12], ranging from perfect and controllable absorption, beam and wavefront shaping to polarization control, broadband pulse delay, radio-coverage extension, and harmonic generation. The RIS technology is envisioned to coat objects in the wireless environment [3] (e.g., building facades and room walls), and can operate either as a reconfigurable beyond-Snell's-law reflector [1], or as an analog receiver [13], [14] or lens [15] when equipped with a single Radio-Frequency (RF) chain, or as a transceiver with multiple relevant RF chains [16].
The exploitation of RISs requires obtaining a proper phase configuration of their reconfigurable elements, capable of creating an alternative high-gain reflective channel between the BS and the UE [17]-[25]. For that purpose, a significant amount of reference signals is transmitted in the uplink to obtain the so-called cascaded channel estimation, which encompasses the joint effect of the signal propagation over the BS-RIS and RIS-UE links [17]-[19]. Note that this overhead not only becomes prohibitive as the numbers of UEs, RIS elements, and configurations increase [26], but it is also of no use when the cascaded channel is strongly attenuated due to the multiplicative fading effects [27]. Once the cascaded channel estimation is available at the BS, the best pair of precoder/combiner and phase configuration is computed and communicated to the RIS via a side control link. This processing task is not straightforward, because a non-convex design optimization needs to be solved, which increases the operational complexity. Several recent studies have focused on accelerating this optimization using alternative methods, at the expense of sacrificing performance with sub-optimal solutions [1], [4], [20], [21], [28], [29]. After the RIS is adequately configured, the traditional Coherent Demodulation Scheme (CDS) is used for the data transmission. It is generally assumed that the channel coherence time is long enough to encompass the estimation, optimization, and data transmission, a condition that may be difficult to satisfy, especially in mobile communication systems.
Recently, different alternatives have been proposed in order to alleviate the overhead incurred by the cascaded channel estimation and to reduce the delay induced by the optimization, including beam training [30]-[34] and the exploitation of the diversity gain of the RIS [35], [36]. The former approaches [30]-[34] use codebooks based on a reduced number of phase configurations, capable of producing several directive beams pointing towards some specific area(s) of interest. Similarly to the beam training proposed for the multi-antenna BS at mm-Wave [37], [38], the UE transmits some reference signals and the BS measures the received signal strength when each codeword is applied. The chosen codeword is the one associated with the highest received energy. This solution not only reduces the time required for channel estimation and optimization, but it also enables this technology to be used in high path-loss scenarios. However, it still requires a beam training period to test all the phase configurations of the codebook. In order to further reduce the number of phase configurations, [31] proposed the use of multiple beams at the expense of reducing their directivity. However, the exploitation of beam alignment at mm-Wave raises some security issues related to jamming attacks [39], [40]. A jammer may transmit high-power signals to intentionally induce a beam failure event, after which the UEs are unable to access the network or switch to a new, better beam. On the other hand, the strategies in [35], [36] assumed that the phase configurations at the RIS are randomly chosen over time, and hence, the received signal is enhanced due to the time and/or spatial diversity. However, [35] only provides some theoretical bounds in terms of outage probability and achievable rate, assuming that any modulation and coding scheme can be applied. In turn, [36] proposed the use of Non-CDS (NCDS) [41]-[44] in order to increase the rate, by avoiding the transmission of reference signals and only exploiting the spatial diversity produced by both the BS and the RIS. Although the average performance is scaled by the spatial diversity produced by both the antennas of the BS and the passive elements of the RIS, the received signal may suffer from strong fading for certain periods, since some random phase configurations may not be favorable for the UE of interest.
To the best knowledge of the authors, a zero-overhead beam training based on NCDS for RIS-aided communications (NCDS-RIS) has never been proposed. It would provide the advantages of both beam training [30]-[34] and NCDS [41]-[44]. During the beam training process, the effective BS-UE channel is strongly time-varying as a consequence of testing different phase configurations of the RIS profile codebook. However, data transmission can be efficiently performed during those training periods by using NCDS [41]-[44], which allows the BS both to measure the received energy for each chosen codeword and to demodulate the received signals without any channel estimate. Unlike CDS, NCDS is very robust to this time variability. Hence, the best codeword can be selected without sacrificing the data rate. Moreover, after the alternative link between the BS and the UE via the RIS has been established, NCDS is still a preferable choice in some circumstances as compared to the traditional CDS, as will be shown. For high-mobility scenarios, where the channel coherence time may be small, the NCDS-RIS is able to fully avoid the estimation of the resulting BS-UE channel, and hence, the data rate is increased. Also in this transmission stage, computing the precoder/combiner at the BS is not required, which reduces the complexity, delay, and energy consumption of the system. Additionally, our proposed scheme is compatible with the different anti-jamming techniques [39], [40], such as pseudorandom multi-beam patterns, randomized probing, etc.
Motivated by the above facts, in this paper we propose the novel NCDS-RIS, which consists of the combination of RIS beam training and NCDS in order to fully avoid the overhead, targeting moderate- to high-mobility scenarios for 5G Advanced and 6G systems. The main contributions of the paper are summarized as follows:
• An RIS-empowered Single-Input Multiple-Output (SIMO) Orthogonal Frequency-Division Multiplexing (OFDM) system with differential-data-aided beam training is presented, denoted as NCDS-RIS. This combination requires neither a lengthy channel estimation process nor solving a non-convex optimization problem for the RIS configuration. In order to maximize the throughput, the way to encode the non-coherent data based on differential Phase-Shift Keying (PSK) modulation in the time and frequency resources of the OFDM signal is proposed for the different stages of the communication. During the beam training stage, differential modulation is proposed to be exclusively performed in the frequency domain. This is the best solution because the channel will suffer from strong variations in consecutive OFDM symbols due to the beam training process. At this stage, the energy of the received data signal is measured to choose the best phase configuration for the RIS. After the training process is done and the best codeword is chosen to configure the RIS, the differential PSK modulation is proposed to be performed in both the frequency and time domains. Consequently, the proposed approach significantly enhances the data rate of the system in both stages by reducing the required reference signals.
• The Signal-to-Interference-plus-Noise Ratio (SINR) of the proposed RIS-empowered NCDS system, determining the power of the useful signal over the self-interference and thermal noise, is analytically characterized over a realistic geometric wideband channel model [45]-[47].
• The throughput and complexity of the proposed NCDS-RIS system are analyzed and compared to the existing solutions based on reference signals and CDS transmission. The throughput is computed taking into account the amount of overhead required and the bit error rate (BER) of each approach, while the number of complex product operations accounts for the complexity of all the chosen techniques.
• The simulation results verify the accuracy of the presented SINR analysis and highlight the superiority of the proposed NCDS-RIS system over the state of the art. The BER and throughput are also numerically assessed using the 5G numerology for scenarios with different degrees of mobility. The performance comparison shows the benefits of the proposed NCDS-RIS relative to the existing CDS-RIS.
The remainder of the paper is organized as follows. Section II introduces the system model and the geometric wideband channel model. Section III outlines the channel gain of the RIS-aided communication and the codebook design. Section IV details the implementation of the proposed differential PSK scheme. Section V presents the analytical expressions for the SINR. Section VI includes the comparison between the proposed NCDS-RIS and the existing solutions based on CDS in terms of throughput and complexity. Section VII discusses the performance assessment results. Finally, Section VIII concludes the paper.
Notation: Matrices, vectors, and scalar quantities are denoted by boldface uppercase, boldface lowercase, and normal letters, respectively. $[\mathbf{A}]_{m,n}$ denotes the element in the $m$-th row and $n$-th column of $\mathbf{A}$. $j$ is the imaginary unit, $|\cdot|$ represents the absolute value, and $\angle(\cdot)$ corresponds to the phase component of a complex number. $\|\cdot\|_F$ is the Frobenius norm. $\Re\{\cdot\}$ and $\Im\{\cdot\}$ represent the real and imaginary parts, respectively. $\otimes$ and $\odot$ denote the Kronecker and Hadamard products of two matrices, respectively. $\mathbb{E}\{\cdot\}$ represents the expected value of a random variable, $\mathrm{Var}\{\cdot\}$ denotes the variance, and $\mathcal{CN}(0, \sigma^2)$ represents the circularly-symmetric and zero-mean complex normal distribution with variance $\sigma^2$. $\mathrm{Exp}(\lambda)$ accounts for the exponential distribution with rate parameter $\lambda$.

II. SYSTEM AND CHANNEL MODELS
The considered mobile communications scenario comprises a BS, an RIS, and a single-antenna UE (see Fig. 1). The BS is equipped with a uniform rectangular array (URA) consisting of $B = B_H B_V$ antenna elements, where $B_H$ and $B_V$ denote the number of elements in the horizontal and vertical axes, respectively, and the distance between any two contiguous elements in their respective axes is given by $d^{BS}_H$ and $d^{BS}_V$. Analogously to the BS, the RIS is built from $M = M_H M_V$ fully passive reflecting elements, whose respective distances between elements are given by $d^{RIS}_H$ and $d^{RIS}_V$. Finally, we assume that the RIS is attached to a dedicated controller that manages its configuration and is synchronized with the BS.
Focusing on the uplink case, the UE transmits the data to the BS using both the direct BS-UE link and the reflected link via the BS-RIS and RIS-UE communication links. It is understood that other UEs may be multiplexed in different orthogonal (e.g., time or frequency) resources.

A. Channel Model
All the channel links (BS-UE, BS-RIS, and RIS-UE) are modeled by a geometric wideband channel model [45]-[47], made up of the superposition of several separate clusters, each of them with a different value of delay, gain, and angles of arrival and departure. The delays and geometrical positions of each cluster/ray are typically characterized by the Delay and Angular Spreads (DS and AS), respectively. The spatial-temporal information of these clusters and the array steering vectors are included to model the spatial correlation due to the array responses. Therefore, this channel model is able to account for the spatial correlation considering both the given antenna array responses of the BS and the RIS, as well as the geometrical positions of all clusters.
Moreover, it is assumed that the BS-RIS channel impulse response is time invariant and characterized by Rician fading, since both the BS and the RIS are fixed elements (e.g., placed at the top of a wall). In turn, the BS-UE and RIS-UE links are considered to remain quasi-static within the channel coherence time ($T_c$), due to the mobility of the UE. The direct BS-UE link is modeled as Rayleigh fading, assuming that there are obstacles preventing a line-of-sight (LoS) component. However, the RIS-UE link is modeled as Rician, since it is considered that the reflected link via the RIS is able to establish an alternative LoS link, capable of avoiding the mentioned obstacles present in the direct link. In the case where no phase configuration is capable of providing a better alternative link, the communication is still carried over the direct BS-UE link.
Hence, the channel links at the $k$-th subcarrier (with $k = 1, 2, \ldots, K$) can be described as in (1)-(3), where $L_D$, $L_G$, and $L_g$ are the large-scale gains of the BS-UE, BS-RIS, and RIS-UE links, respectively, and $K_G$ and $K_g$ correspond to the Rician factors of the BS-RIS and RIS-UE links, respectively. Note that the overall channel gain is dominated by the large-scale gains, since they are significantly lower than the Rician factors ($L_G \ll K_G$ and $L_g \ll K_g$). In addition, $\tilde{\mathbf{h}}_{d,k}$, $\tilde{\mathbf{G}}_{k}$, and $\tilde{\mathbf{g}}_{k}$ account for the non-LoS (NLoS) components of the channels, while $\bar{\mathbf{G}}_{k}$ and $\bar{\mathbf{g}}_{k}$ are the LoS ones.
The channel impulse response of the direct BS-UE link at the -th subcarrier can be described as h, where   is the number of clusters,  , accounts for the delay of the -th cluster measured in samples,  , is the channel coefficient for the -th cluster,  2 , is the average gain of the th cluster and  2  accounts for the total gain of the channel.In turn, a BS  , ,  , corresponds to the array steering vector at the BS, and its arguments are the azimuth and elevation angles of arrival, respectively, for the -th cluster.The steering vector for the BS is given by where 1 ≤  ≤  is the antenna index and  is the wavelength, as well as a  BS (, ) and a  BS (, ) are the steering vector for the horizontal and vertical arrays, respectively.
The channel impulse response of the BS-RIS link at -th subcarrier can be described as G, where   and  , are the channel coefficients,   and  , are the delay measured in samples.Similarly to the BS, a RIS (  ,   ), a RIS  , ,  , , a RIS (  ,   ) and a RIS  , ,  , denote the steering vectors for the RIS, and their arguments are the azimuth and elevation angles of arrival and departure, respectively.The expression for the steering vector is the same as described in ( 5)-( 7), replacing respectively (  ,   ,  BS  ,  BS  ) by (  ,   ,  RIS  ,  RIS  ).The channel impulse response of the RIS-UE link at -th subcarrier can be described as g, where   and  , are the channel coefficients,   and  , account for the delay measured in samples, and the arguments of the steering vectors correspond to the angles of arrival to the RIS.

B. Uplink Transmission
Given the channel coherence time ($T_c$), the UE transmits a frame of $N$ contiguous OFDM symbols of $K$ subcarriers each. In order to avoid the Inter-Symbol and Inter-Carrier Interferences (ISI and ICI), the length $L_{CP}$ of the cyclic prefix must be long enough to absorb the BS-UE direct and reflective paths.
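At the BS, the baseband representation of the received signal $\mathbf{y}_{k,n} \in \mathbb{C}^{B}$ at the $k$-th subcarrier in the $n$-th OFDM symbol is given by
$$\mathbf{y}_{k,n} = \mathbf{h}_{k,n} x_{k,n} + \mathbf{v}_{k,n}, \tag{12}$$
where $x_{k,n} \in \mathbb{C}$ denotes the complex symbol transmitted from the UE, whose power is $\mathbb{E}\{|x_{k,n}|^2\} = P_x$, $\mathbf{v}_{k,n} \in \mathbb{C}^{B}$ represents the Additive White Gaussian Noise (AWGN) vector, which is distributed as $\mathbf{v}_{k,n} \sim \mathcal{CN}(\mathbf{0}, \sigma_v^2 \mathbf{I}_B)$, and $\mathbf{h}_{k,n} \in \mathbb{C}^{B}$ is the effective channel frequency response between the BS and the UE, which can be decomposed as
$$\mathbf{h}_{k,n} = \mathbf{h}_{d,k} + \mathbf{h}_{r,k,n}, \qquad \mathbf{h}_{r,k,n} = \mathbf{H}_{k} \boldsymbol{\psi}_{n}, \tag{13}$$
where $\mathbf{h}_{d,k} \in \mathbb{C}^{B}$ is the direct BS-UE channel frequency response at the $k$-th subcarrier, and $\mathbf{h}_{r,k,n} \in \mathbb{C}^{B}$ corresponds to the reflective BS-UE channel frequency response through the RIS at the $k$-th subcarrier in the $n$-th OFDM symbol. The vector $\boldsymbol{\psi}_{n} \in \mathbb{C}^{M}$ accounts for the phase configuration applied to the RIS at the $n$-th OFDM symbol, which is defined as
$$\boldsymbol{\psi}_{n} = \left[ e^{j\theta_{1,n}}, \ldots, e^{j\theta_{M,n}} \right]^{T}, \tag{14}$$
with $\theta_{m,n}$ for $1 \le m \le M$ representing the $m$-th phase shift of the RIS. Moreover, $\mathbf{H}_{k} \in \mathbb{C}^{B \times M}$ denotes the cascaded channel frequency response at the $k$-th subcarrier, which is given by
$$\mathbf{H}_{k} = \mathbf{G}_{k}\, \mathrm{diag}(\mathbf{g}_{k}). \tag{15}$$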
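As an illustration of the uplink signal model in (12)-(15), the following minimal sketch builds the effective channel for one subcarrier and one OFDM symbol as the sum of the direct link and the RIS-reflected link. The function names, the list-based data layout, and the unit-amplitude phase shifts are assumptions made for this example only.

```python
import cmath
import math
import random


def effective_channel(h_d, G, g, theta):
    """Effective BS-UE channel for one subcarrier and one OFDM symbol.

    h_d   : list of B complex gains (direct BS-UE link)
    G     : B x M nested list (BS-RIS link)
    g     : list of M complex gains (RIS-UE link)
    theta : list of M RIS phase shifts in radians
    """
    # Unit-amplitude reflection coefficients of the passive elements.
    psi = [cmath.exp(1j * t) for t in theta]
    # Direct link plus cascaded link G * diag(g) * psi, per BS antenna.
    return [
        h_d[b] + sum(G[b][m] * g[m] * psi[m] for m in range(len(g)))
        for b in range(len(h_d))
    ]


def received_signal(h, x, noise_std, rng):
    """Received vector h * x + v with circularly-symmetric Gaussian noise."""
    s = noise_std / math.sqrt(2.0)  # standard deviation per real dimension
    return [hb * x + complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for hb in h]
```

With two RIS elements whose cascaded paths are in phase, setting both phase shifts to zero doubles the reflected amplitude, whereas shifting one element by $\pi$ cancels the reflected link entirely; this is the degree of freedom that the beam training exploits.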

III. BEAM TRAINING FOR RIS-AIDED COMMUNICATIONS
Beam training for RIS-aided communications [30]-[34] consists of transmitting some reference signals using different phase configurations of a predefined codebook, similarly to the scheme proposed for the multi-antenna BS at mm-Wave [37], [38]. Each phase configuration produces a (possibly directive) beam, and hence, the receiver is able to estimate the received energy for each phase configuration. Finally, the chosen beam corresponds to the phase configuration associated with the highest received energy. Note that the beam training procedure requires a significant amount of time to scan the whole three-dimensional space. Therefore, in order to accelerate this process, multi-beam training has been proposed, where the panel is split into several sub-panels and each of them is in charge of a portion of the entire space. However, this time reduction comes at the expense of reducing the directivity of the beams, and consequently, UEs far from the BS cannot be detected.
In the following subsection, the average channel gain of an RIS-aided communication system, based on a geometric wideband channel model, is characterized by using the best phase configuration at the RIS for the UE of interest. Then, given this upper bound and the considered channel model, a codebook of phase configurations is presented, with a similar approach to [30]-[34]. For the sake of simplicity and ease of notation, the described beam training procedure is based on a single beam. Note that the proposed beam training process can be easily extended to the multi-beam case.

A. Channel Gain Enhancement by RIS
Taking into account (12) and considering that the channel impulse responses, transmitted data symbols, and noise are independent random variables, the average received signal energy at the BS for a given RIS phase configuration ($\boldsymbol{\psi}_{n}$) is described by (16), where $1 \le n \le N$ and the expectation is performed over the subcarriers. Since the reflective BS-UE channel gain can be enhanced by manipulating $\boldsymbol{\psi}_{n}$, the optimization problem can be formulated as the maximization (17) of this energy over $\boldsymbol{\psi}_{n} \in D$, where D is the codebook of available phase configurations (see, e.g., [48] for realistic values of the number of RIS tunable elements). The cascaded channel ($\mathbf{H}_{k}$) given in (15) is an element-wise product of the BS-RIS and RIS-UE channel frequency responses in (2) and (3), respectively, so the RIS is capable of generating a pencil-like sharp beam and focusing the energy towards a specific angular direction. The phase configuration capable of maximizing (17) is given by (18), where $\boldsymbol{\psi}^{+}$ is the best phase configuration of the RIS. According to [25] and [30], it is capable of simultaneously pointing to the angular directions of the LoS components of the BS-RIS and RIS-UE links, given in (8) and (10), respectively, which are the strongest taps ($K_G, K_g > 1$). Hence, the best reflective channel can be rewritten as in (19). Note that $\mathbf{h}^{+}_{r,k}$ is an upper bound, since the amplitude gain of the LoS component is amplified by the number of passive elements of the panel ($M$) in (19).
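The codebook search described above can be sketched as an exhaustive evaluation of the reflected energy for every candidate phase configuration. This is only an illustrative sketch: it ignores the direct link and the noise, uses a single subcarrier, and the helper names `reflective_energy` and `best_codeword` are hypothetical.

```python
import cmath


def reflective_energy(H, psi):
    """Squared norm of H * psi for a B x M cascaded channel H."""
    return sum(
        abs(sum(h_bm * p for h_bm, p in zip(row, psi))) ** 2 for row in H
    )


def best_codeword(H, codebook):
    """Return (index, energy) of the codeword with the largest reflected energy."""
    energies = [reflective_energy(H, psi) for psi in codebook]
    idx = max(range(len(codebook)), key=energies.__getitem__)
    return idx, energies[idx]


# Toy example: one BS antenna, M = 4 elements with a linear phase progression.
M = 4
H = [[cmath.exp(1j * 0.3 * m) for m in range(M)]]
codebook = [
    [1 + 0j] * M,                                 # no phase shift
    [cmath.exp(-1j * 0.3 * m) for m in range(M)],  # phases aligned to H
    [(-1) ** m + 0j for m in range(M)],           # alternating phases
]
idx, energy = best_codeword(H, codebook)
```

In this toy case the aligned codeword adds the $M$ cascaded paths coherently, so its amplitude gain equals $M$ and its energy equals $M^2$, which mirrors the upper-bound argument made for the best phase configuration.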

B. Phase Configuration Codebook
Taking into account [30] and following the analysis given in (17)-(19), the codebook D can be built as in (20), where $N_{cb}$ is the number of entries of the codebook, and the $i$-th entry of D is computed as in (21), with $N_{\varphi}$ and $N_{\vartheta}$ being the number of codewords in the azimuth and elevation dimensions, respectively, and $N_{cb} = N_{\varphi} N_{\vartheta}$. Note that the angles of the BS-RIS link in (21) are considered to be known at the BS, since the channel of the BS-RIS link is time invariant. On the other hand, the azimuth and elevation angles of the codewords can be obtained as in (22)-(23), where $\Delta\varphi$ and $\Delta\vartheta$ represent the angle resolution in the azimuth and elevation dimensions.
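A possible construction of such a codebook is sketched below. It is not the exact design of (20)-(23): the uniform angle grid, the far-field URA steering convention, and the function names are assumptions chosen for illustration. Each codeword cancels the phase of the product of the (known) BS-side response and a candidate UE-side response of the RIS.

```python
import cmath
import math


def angle_grid(count, lo=-math.pi / 2, hi=math.pi / 2):
    """Midpoints of `count` equal-width angular bins spanning [lo, hi]."""
    step = (hi - lo) / count
    return [lo + (i + 0.5) * step for i in range(count)]


def ura_steering(m_h, m_v, d_h, d_v, az, el):
    """Far-field URA response; element spacings are given in wavelengths."""
    out = []
    for v in range(m_v):
        for h in range(m_h):
            phase = 2 * math.pi * (
                h * d_h * math.sin(az) * math.cos(el) + v * d_v * math.sin(el)
            )
            out.append(cmath.exp(1j * phase))
    return out


def build_codebook(m_h, m_v, d_h, d_v, n_az, n_el, az_bs, el_bs):
    """One unit-modulus codeword per (elevation, azimuth) grid point."""
    a_bs = ura_steering(m_h, m_v, d_h, d_v, az_bs, el_bs)
    codebook = []
    for el in angle_grid(n_el):
        for az in angle_grid(n_az):
            a_ue = ura_steering(m_h, m_v, d_h, d_v, az, el)
            # Phase shifts that align the cascaded LoS path for this direction.
            codebook.append(
                [cmath.exp(-1j * cmath.phase(b * u)) for b, u in zip(a_bs, a_ue)]
            )
    return codebook
```

For example, `build_codebook(4, 2, 0.5, 0.5, 3, 2, 0.3, 0.1)` returns $3 \times 2 = 6$ codewords of $M = 8$ phase shifts each. A finer grid improves the match to the true UE direction but lengthens the training stage, since every codeword must be tested.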

IV. PROPOSED NCDS-RIS
The beam training process [30]-[34], [37], [38] relies on the transmission of reference signals in order to allow the received energy to be measured. Furthermore, once the best beam of the codebook is chosen, the data symbols are transmitted and received by using the classical CDS, which also relies on reference signals. Consequently, in order to successfully deploy an RIS-empowered communication system, the transmission of a significant amount of overhead is required, which implies a reduction of the system efficiency, especially for high-mobility scenarios where the required amount of pilot symbols is much larger.
The proposed NCDS-RIS, which consists of exploiting NCDS based on differential modulation [41]-[44] in RIS-aided communications during and after the beam training process, is presented in this section (see Fig. 2). Note that the whole beam training procedure is centralized at the BS and is transparent to all UEs of the cell. During the beam training process, the effective BS-UE channel ($\mathbf{h}_{k,n}$) is extremely time-varying due to the continuous testing of different phase configurations of the codebook at the RIS. However, the differentially encoded data are robust to channel variations. They can be transmitted and demodulated without the need for any channel estimation, and at the same time, their received power can be measured. In the second stage, once the best codeword is chosen and configured at the RIS panel, we will show that the advantage of avoiding the transmission of additional reference signals in NCDS is still interesting for practical scenarios, especially when the channel coherence time ($T_c$) is not excessively large. Consequently, the proposed scheme is not only capable of avoiding the overhead produced by the undesirable reference signals, but it also does not require any network information.
The $N$ OFDM symbols transmitted during the channel coherence time ($T_c$) are split into two stages (see Fig. 2). In the first stage, $N_t$ OFDM symbols are transmitted from the UE to the BS through the direct link, and at the same time, the panel tests the different phase configurations given by the codebook. In the second stage, the remaining $N_h$ OFDM symbols are transmitted by mainly using the enhanced reflected link, thanks to the selection of the best phase configuration of the codebook for the RIS. Consequently, in the second stage, the communication link via the RIS provides a higher gain than the direct link, and higher-order modulations can be exploited, producing a better throughput.

A. Stage One: Simultaneous Data Transmission and Beam Training at the RIS
The beam training is performed over the first consecutive $N_t$ OFDM symbols, where each entry of the codebook is configured at the RIS for at least the duration of one OFDM symbol ($N_t \ge N_{cb}$). At this stage, even though all the channel links remain invariant during the channel coherence time ($T_c$), the effective BS-UE channel ($\mathbf{h}_{k,n}$), given in (13), changes from one OFDM symbol to the next, so that the effective coherence time can be as short as one OFDM symbol. This effect is produced by the reflective link ($\mathbf{h}_{r,k,n}$), which is constantly varying as a consequence of using different codewords ($\boldsymbol{\psi}_{n}$) at each OFDM symbol. In order to be able to transmit data during this training period, the differential modulation is performed between consecutive subcarriers. In particular, we deploy the Frequency-Domain Scheme (FDS) of [42].
At the UE, the data symbols are differentially encoded in the frequency domain before their transmission as in (24)-(25), where $s_{k,n}$ denotes the complex symbol to be transmitted at the $k$-th subcarrier and the $n$-th OFDM symbol, which belongs to a PSK constellation and has normalized power (i.e., $|s_{k,n}|^2 = 1$). The constellation size at this stage may be small, since the direct BS-UE channel link does not have a LoS component and the path loss may be high. In (24), $x_{1,n}$ and $x_{2,n}$ are two reference symbols. Before data transmission, the power of the differential symbols $x_{k,n}$ is scaled according to $P_x$. According to [42], [44], the FDS is chosen when the channel coherence time is as small as one OFDM symbol period ($T_c \approx (K + L_{CP})/(K \Delta f)$, with $\Delta f$ being the subcarrier spacing measured in Hz). Here, the data symbols are differentially encoded in the contiguous frequency resources of each OFDM symbol. However, this scheme requires two reference symbols ($x_{1,n}$ and $x_{2,n}$): the first one is used to estimate and compensate the residual differential phase component ($\phi_{n}$) produced by the multi-path channel at the $n$-th OFDM symbol of the first stage, and the second one is used for performing the differential demodulation. Nevertheless, this overhead can be neglected for broadband systems, where $K$ is very large.
Given the received signal (12), the BS performs the differential decoding in (26)-(27), where $I_1$ includes the useful symbol $s_{k,n}$, $I_2$ and $I_3$ represent the cross-interference terms produced by the noise and the received differential symbol in two time instants, while $I_4$ is exclusively produced by the product of the noise in two contiguous subcarriers. Moreover, the residual phase component at the $n$-th OFDM symbol can be estimated as in (28). While this transmission occurs, the beam training process is executed at the RIS, where a different phase configuration of the codebook is configured for each group of consecutive OFDM symbols, and the BS measures the received power in the $n$-th OFDM symbol as in (29). After testing all entries of the codebook, the BS chooses the phase configuration corresponding to the highest measured received power (29). Hence, the gain of the reflective channel is significantly enhanced by the RIS, as in (18)-(19). Besides, note that the differential data transmission has the benefit that it can be combined with any beam training procedure (single-beam, multi-beam, hierarchical, etc.) and any anti-jamming processing, since it is independent of the particular choice of codebook.
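The first-stage processing can be illustrated with the following sketch of frequency-domain differential PSK, in which the same received data are used for demodulation and for the energy measurement. It is a simplified reading of the scheme rather than the exact expressions (24)-(29): both reference symbols are assumed equal to one, the residual phase is assumed constant across subcarriers (as for a single delayed path), and the function names are hypothetical.

```python
import cmath
import math


def psk(symbol_index, order):
    """Unit-power PSK symbol for an integer index in [0, order)."""
    return cmath.exp(2j * math.pi * symbol_index / order)


def fds_encode(data_idx, order):
    """Two unit reference symbols, then each symbol is the previous one times the data."""
    x = [1 + 0j, 1 + 0j]
    for i in data_idx:
        x.append(x[-1] * psk(i, order))
    return x


def fds_decode(y, order):
    """y[b][k] is the sample on BS antenna b, subcarrier k of one OFDM symbol.

    Returns the decided PSK indices and the mean received energy per resource.
    """
    num_ant, num_sc = len(y), len(y[0])
    # Differential products between contiguous subcarriers, averaged over antennas.
    z = [
        sum(y[b][k - 1].conjugate() * y[b][k] for b in range(num_ant)) / num_ant
        for k in range(1, num_sc)
    ]
    # The first product only involves the two references: residual phase estimate.
    residual = cmath.phase(z[0])
    decided = []
    for zk in z[1:]:
        ang = cmath.phase(zk * cmath.exp(-1j * residual)) % (2 * math.pi)
        decided.append(round(ang * order / (2 * math.pi)) % order)
    energy = sum(abs(v) ** 2 for row in y for v in row) / (num_ant * num_sc)
    return decided, energy
```

In a beam sweep, `fds_decode` would be called once per OFDM symbol; the returned energies, one per tested codeword, can then be compared to pick the codeword with the largest value, while the decided indices carry the user data with no dedicated training symbols.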

B. Stage Two: Data Transmission via Reflective RIS Link
Once the best phase configuration is chosen in the previous stage, the effective BS-UE channel remains constant for $N_h$ OFDM symbols ($\mathbf{h}_{k,n} = \mathbf{h}_{k}$, $1 \le n \le N_h$). Consequently, the differential modulation can be performed by using the Mixed Domain Scheme (MDS) [44], where the differential data can be simultaneously encoded in both the time and frequency domains. The main benefit of this scheme is that only two reference signals are required for the transmission of a total of $N_h$ OFDM symbols ($K N_h$ resources), further reducing the overhead as compared to the FDS.
Firstly, the data symbols are differentially encoded as in (30), where $i$ denotes the resource index. Then, the differential symbols x are allocated to the two-dimensional resource grid as in (31), where $f(\cdot)$ is the resource mapping policy function. A possible mapping policy is given in [44], which mainly follows the FDS, except for the edge subcarriers of the block, which follow a Time-Domain Scheme (TDS). The latter consists of performing the differential encoding using resources of the same subcarrier at two consecutive OFDM symbols.
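One way to picture such a mapping policy is a serpentine path over the resource grid: the differential chain runs along the subcarriers of one OFDM symbol, steps to the next OFDM symbol at the edge subcarrier, and then runs back in the opposite frequency direction. This is an illustrative assumption and not necessarily the policy of [44]; the name `serpentine_mapping` is hypothetical.

```python
def serpentine_mapping(num_sc, num_sym):
    """Return the (subcarrier, symbol) position of every resource index.

    Consecutive entries are always adjacent in the grid, either in frequency
    (same OFDM symbol) or in time (same edge subcarrier).
    """
    path = []
    for n in range(num_sym):
        # Even symbols are traversed upwards in frequency, odd ones downwards.
        ks = range(num_sc) if n % 2 == 0 else range(num_sc - 1, -1, -1)
        path.extend((k, n) for k in ks)
    return path


# Example: 4 subcarriers and 3 OFDM symbols.
mapping = serpentine_mapping(4, 3)
time_steps = sum(1 for a, b in zip(mapping, mapping[1:]) if a[1] != b[1])
```

For this small grid only two of the eleven differential steps are taken in the time domain, and a single chain covers all resources, which is consistent with needing reference symbols only at the start of the block.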
Similarly to the previous stage, residual phase compensation is also required, except for those differential symbols following the TDS. The residual phase component for the second stage ($\phi_{h}$) can be estimated by using only the first two subcarriers of the first OFDM symbol, as in (32).

V. ANALYSIS OF THE SINR
In this section, the SINR performance of NCDS-RIS is analyzed. According to [42]-[44], the residual phase component can be considered perfectly estimated and compensated, since the degradation produced by the residual phase value is negligible after estimation and compensation.
Taking into account (26) and (27), the undesirable effects produced by the self-interference and noise terms can be characterized as in (33), where $\sigma_d^2$ and $\sigma_r^2$ are the channel gains of the direct and reflective channel links, respectively, which take into account both the large- and small-scale effects of the channel. The expectation is performed over the subcarriers and OFDM symbols. Note that the channel gain of the reflective BS-UE link is upper-bounded by using the best phase configuration given in (18), which is capable of perfectly matching the arrival and departure angles of the LoS components of the BS-RIS and RIS-UE channel links.
Following the derivations given in [41]-[44], the four terms given in (26) and (27) are statistically independent, since the channel frequency response, noise, and symbols are independent random variables, and the noise samples between two time instants are also independent. Hence, the last two terms in (33) can be simplified and, after performing some manipulations given in Appendix A, the different interference and noise terms are obtained. The SINR of the proposed NCDS-RIS assuming a geometric wideband channel is then given by (41), where the SINR can be enhanced by increasing the transmit power ($P_x$), the number of antennas at the BS ($B$), and/or the channel gain of the direct and reflective links ($\sigma_d^2$ and $\sigma_r^2$, respectively). However, the first term of the second fraction saturates the quality of the link due to the presence of the direct link ($\sigma_d^2$).
The SINR given in (41) will be particularized for two extreme but illustrative scenarios.The first scenario corresponds to the performance obtained by using exclusively the direct link between BS-UE, where the reflective link is absent.On the contrary, the second case only takes into account the reflective link, assuming that the direct link is negligible.
A. Direct Link Only (Lower-Bound)
Before choosing a proper phase configuration of the codebook for the RIS, the gain of the direct link is typically much higher than that of the link reflected by the RIS ( 2  >>  2  ). Given this situation, the data transmission is performed mainly over the direct link, and therefore, the SINR given in (41) can be simplified as

where the quality of the link is upper-bounded by the first term, pointing out that the performance of the proposed system under a geometric wideband channel with NLoS paths is limited.

B. Reflective Link Only (Asymptotic Case)
Assuming the hypothetical case that the RIS size is very large ( ↑), the direct channel can be neglected since the gain of the channel reflected by the RIS is significantly high ( 2   <<  2  ), and hence, the SINR given in (41) can be simplified as

Assuming that  4  <<  2  and making use of the best phase configuration given in (18), (43) can be approximated as

where it is noted that the SINR can be approximated by a linear function, and the performance scales directly with the reflective channel gain via the RIS, the transmit power, and the number of antennas at the BS. Additionally, note that the constant value in the denominator corresponds to the typical 3 dB performance loss of NCDS as compared to CDS. However, the classical CDS requires a significant amount of reference signals in order to track the variations of the channel, especially for high mobility scenarios, and hence, the overall throughput of CDS is significantly reduced compared with NCDS [49].

VI. THROUGHPUT AND COMPLEXITY COMPARISON
In this section, a comparison in terms of efficiency and complexity between the proposed NCDS-RIS and the baseline cases is given, in order to show the superiority of the NCDS-RIS.
For the first stage of beam training, two baseline approaches are taken into account for comparison purposes. The transmitted symbols can be either exclusively Reference Signals (RS) [31], or the traditional CDS (pilot and data symbols).
Therefore, the main difference between these two cases is that CDS is also able to transmit some information. The chosen combiner for this stage is Maximum Ratio Combining (MRC) since the channel gain of the direct link is not very high.
For the second stage, only the classical CDS is considered. According to [30], the first OFDM symbol of this stage is exclusively employed for channel estimation, while the remaining  ℎ − 1 OFDM symbols are used for data transmission. Unlike the first stage, the chosen combiner is Zero-Forcing (ZF) since the channel gain of the reflected link is enhanced by the RIS, and at high SINR, ZF is superior to MRC.

A. Throughput Comparison
For a typical packet-based transmission, the total throughput of the system can be defined as

where   and  ℎ refer to the throughput of the first and second stages, respectively. Similarly to [49], the throughput of each stage can be found as

where  , and  ,ℎ are the BER for the two stages,   and  ℎ are the constellation sizes of each stage,   denotes the number of bits in one packet, and   and  ℎ are the efficiency of the system for the two stages, taking into account the overhead produced by the transmission of the reference signals.
The efficiency of the first stage is given by

where   is the number of reference symbols at each OFDM symbol. For the CDS scenario, the number of pilot symbols is typically a portion of the total amount of subcarriers (  < ).
On the other hand, the pilot-based approach and the proposed NCDS-RIS are two particular cases of (48), where the number of reference signals is fixed to   =  and   = 2, respectively.
After the best phase configuration is chosen, the data transmission is enhanced by the RIS. The efficiency of this second stage for the CDS case is

where only one OFDM symbol is used for channel estimation.
On the other hand, the efficiency of the second stage for the NCDS is

where only two reference symbols placed at the first OFDM symbol are required out of  ℎ OFDM symbols. Taking into account (45)-(50), the proposed NCDS-RIS for both stages is capable of outperforming the classical CDS, especially for high mobility scenarios, since it does not require the transmission of a large amount of reference signals, and its efficiency is not degraded.
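
As a rough numerical illustration of the second-stage comparison, the following Python sketch evaluates the two efficiencies as they are described in the prose above: one full OFDM symbol of channel-estimation overhead for CDS, and two reference symbols in the whole stage for NCDS. The closed forms and all names (`eta_cds_second_stage`, `eta_ncds_second_stage`, `n_sym`, `n_sc`) are assumptions inferred from that description; they are not a reproduction of (49) and (50).

```python
def eta_cds_second_stage(n_sym: int) -> float:
    """Fraction of OFDM symbols carrying data when the first one is
    reserved for channel estimation (assumed form, inferred from the text)."""
    if n_sym < 1:
        raise ValueError("n_sym must be at least 1")
    return (n_sym - 1) / n_sym


def eta_ncds_second_stage(n_sym: int, n_sc: int) -> float:
    """Fraction of resource elements carrying data when only two reference
    symbols are sent in the whole stage (assumed form, inferred from the text)."""
    total = n_sym * n_sc
    if total < 2:
        raise ValueError("need at least two resource elements")
    return (total - 2) / total


if __name__ == "__main__":
    n_sc = 1024  # hypothetical number of subcarriers
    for n_sym in (2, 7, 14, 140):  # hypothetical second-stage lengths
        print(n_sym, eta_cds_second_stage(n_sym), eta_ncds_second_stage(n_sym, n_sc))
```

Under these assumptions the gap between the two schemes is largest for short second stages and shrinks as the number of OFDM symbols grows, which is the regime where the extra channel-estimation symbol of CDS matters least.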

Table I (fragment). Columns: RS, CDS, NCDS. Stage One, RS: 0.

B. Complexity Comparison
The complexity evaluation is performed accounting for the number of required complex product operations for each case, which are summarized in Table I. Note that the operations needed for measuring the energy in order to determine the best beam are not considered in stage one for this comparison, since they are the same for all approaches.
For the first stage, RS does not require any additional operations because it does not transmit any data symbols. On the contrary, CDS transmits both data and pilot symbols, and hence, it has to estimate the channel and perform an MRC equalization for each OFDM symbol. Note that the computation of the MRC combiner is neglected, since it is obtained by taking the complex conjugate of the estimated channel. Additionally, the channel estimates at the pilot symbol resources must be interpolated to obtain values for the data resources. The complexity of this interpolation is given by   , and this value depends on the chosen algorithm [50], [51]. In contrast, the proposed NCDS only performs the differential encoding and decoding  −1 times for each OFDM symbol.
For the second stage, the classical CDS performs the channel estimation and computation of the ZF combiner at the first OFDM symbol, which requires a matrix inversion. Then, the remaining OFDM data symbols are equalized. Again, the NCDS only requires the differential encoding and decoding operations.
The proposed NCDS-RIS has lower complexity in terms of complex products than the traditional CDS, since channel estimation, computation of the combiner, and the equalization operations are not required. Moreover, the classical CDS not only requires more computations, but it also needs more memory to store the channel estimation samples and the computed combiners. Consequently, the proposed NCDS-RIS system is cost-effective and/or able to reduce the delay of the communication system.
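
The differential encoding and decoding operations that NCDS relies on can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: symbols are differentially encoded across the subcarriers of one OFDM symbol, the channel is taken as identical on adjacent subcarriers (as assumed in Appendix A), noise is omitted, and the differential products are averaged over the BS antennas. The function names and all sizes are hypothetical.

```python
import numpy as np


def dpsk_encode(data_idx: np.ndarray, order: int) -> np.ndarray:
    """Differentially encode M-PSK indices along the subcarrier axis.

    Returns len(data_idx) + 1 symbols; the first one is the reference.
    """
    increments = np.exp(2j * np.pi * data_idx / order)
    # x[0] = 1 (reference), x[k] = x[k-1] * increments[k-1]
    return np.concatenate(([1.0 + 0.0j], np.cumprod(increments)))


def dpsk_decode(rx: np.ndarray, order: int) -> np.ndarray:
    """Recover M-PSK indices from rx of shape (n_antennas, n_subcarriers).

    No channel estimate is used: one complex product per data symbol and
    antenna, followed by averaging over the antennas.
    """
    z = np.mean(np.conj(rx[:, :-1]) * rx[:, 1:], axis=0)
    return np.round(np.angle(z) * order / (2 * np.pi)).astype(int) % order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    order, n_sc, n_ant = 4, 64, 8  # hypothetical sizes
    data = rng.integers(0, order, size=n_sc - 1)
    tx = dpsk_encode(data, order)
    # Frequency-flat channel, unknown to the receiver (noise-free for clarity).
    h = (rng.standard_normal((n_ant, 1)) + 1j * rng.standard_normal((n_ant, 1))) / np.sqrt(2)
    rx = h * tx
    print(np.array_equal(dpsk_decode(rx, order), data))
```

The decoder never forms a channel estimate, a combiner, or a matrix inverse, which is the source of the complexity saving discussed above.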

VII. PERFORMANCE EVALUATION RESULTS
In this section, several numerical results are provided in order to show the performance of the proposed NCDS-RIS, as compared to the considered baseline cases presented in the previous section, and the accuracy of the analytical results.
For a more realistic performance evaluation, all the links (BS-UE, BS-RIS and RIS-UE) are generated according to the 5G standard [10]. The power-delay profile follows an exponential distribution whose standard deviation is the DS; the azimuth angles of arrival/departure are modeled by a wrapped Gaussian distribution which is characterized by the Azimuth AS of Arrival and Departure (ASA and ASD). The number of codewords to be tested is set to the number of OFDM symbols at the first stage (   =   ). Additionally, to assess the quality of the phase configuration based on codeword selection, the performance provided by the best configuration given in (18) is also shown, which exposes the gap between the realistic codebook-based case and its upper bound.
A. Verification of the SINR Analysis for the Proposed NCDS-RIS
Figure 3 illustrates the SINR performance, as a function of the UE transmit power   in dBW, of the proposed NCDS-RIS for the direct BS-UE link and the effective BS-UE link made up of the direct link and the reflective link enhanced by the RIS. As clearly shown, the reflective link significantly outperforms the direct link by approximately 29-33 dB. On the other hand, the performance of the phase configuration based on codeword selection is slightly worse than the best one, and this difference can be reduced further as the number of entries of the codebook is increased (   ↑), at the expense of increasing the beam training period (  ). It is also shown in this figure that the SINR analysis given in (41) and (42) (plotted with black dotted lines) accurately characterizes the RIS-empowered system performance under geometric wideband channel models.

B. BER Comparison
The BER comparison between the classical CDS and the proposed NCDS-RIS signaling is depicted in Fig. 4. For stage one, the chosen modulation is 4-QAM and 4-PSK for the CDS and NCDS, respectively, and the selected equalization technique for CDS is MRC. For the second stage, the chosen modulation is 16-QAM and 16-PSK for the CDS and NCDS, respectively, and the selected equalization technique for CDS is ZF. Additionally, for CDS, the case of perfect channel estimation (PCE) is also plotted in order to show the lower bound. Even though the CDS is able to outperform the NCDS in terms of BER, this result does not account for the time/frequency and energy resources required for the transmission of reference signals (efficiency) and the complexity described in Section VI. Moreover, the degradation suffered by the CDS is higher than that of the NCDS for the particular case of codeword selection. This is because CDS requires accurate channel estimates, especially under the ZF criterion; otherwise, the computed equalizers will enhance the noise.

C. Throughput Comparison between CDS and NCDS
Finally, a throughput comparison is shown taking into account the configurations described in Section VI. The constellation sizes are   = 4 and  ℎ = 16, the packet length is set to   = 20, and for the particular case of CDS at stage one, the ratio of pilot symbols transmitted at each OFDM symbol is set to /  = 3. In Table III, the throughput is evaluated for both stages at different values of speed for the UE, which is equivalent to evaluating the performance at several values of channel coherence time (  ) and their corresponding lengths (), taking into account the OFDM numerology given in Table II. On the one hand, the NCDS always outperforms the CDS in the first stage, since it avoids the reference signals that would be needed to track the strongly time-varying channels produced by the RIS. Moreover, the throughput of the CDS is only slightly lower than that of the NCDS in the second stage, since the additional OFDM symbol spent for channel estimation becomes negligible. On the other hand, comparing the throughput produced by the two stages, the first stage of the proposed NCDS-RIS scheme contributes more to the total throughput () than that of the CDS, especially when the channel coherence time or the total number of OFDM symbols is not large. For example, for the case of 7.3 m/s ( =   ), the NCDS can double the throughput.
In Fig. 5, the total throughput () is shown for the different baseline cases described in the previous section. The two bottom curves represent the total throughput for the case of exclusively transmitting reference signals during the beam training stage, while both CDS and NCDS are exploited at the second stage. It can be noted that the NCDS slightly outperforms the CDS because the latter requires an additional OFDM symbol for transmitting the reference signals for channel estimation. Then, the proposed fully NCDS system has a significantly higher throughput than the fully CDS one, since the NCDS is capable of transmitting more data symbols at the first stage, where no reference signals are needed.

D. Throughput Comparison for Different RIS Sizes in Realistic Scenarios
In Fig. 6, a throughput comparison for different RIS sizes and training periods is plotted, where the total number of OFDM symbols to be transmitted corresponds to  = 4  . Note that the upper-bound is computed using (46) and (47) by setting  , = 0 and  ,ℎ = 0. Theoretically, a higher number of passive elements of the RIS will provide a more directive beam, and hence, UEs further from the BS can be discovered while closer UEs will see an improvement in their links. However, the simulation results show that having an extremely large RIS may even degrade the overall performance in realistic channel environments. This can be seen when comparing  = 8×8 and  = 16×16 in this figure. Certainly, a larger RIS will produce a narrower directive beam enabling a better spatial resolution. However, it requires a higher number of entries for the codebook in order to be able to sweep the whole space. Otherwise, a perfect alignment between the pencil-sharp beam and the strongest cluster of the channel cannot be achieved, and consequently, the resulting reflecting channel gain (BS-RIS-UE) for  = 16 × 16 is lower than for  = 8 × 8, since the latter is even able to point towards several clusters of the channel, as a consequence of having a wider beam. Then, when the number of entries in the codebook to be transmitted at the first stage is the same (   =   and  ℎ = 3  ), the larger RIS shows a slightly worse performance. Additionally, increasing the number of entries in the codebook to be transmitted at the first stage, as in the case (   = 2  and  ℎ = 2  ) in the figure, makes it possible to explore the whole space with a better spatial resolution, and therefore, the gain of the reflective link can be enhanced. Nevertheless, this improvement comes at the expense of reducing the number of OFDM symbols transmitted at the second stage ( ℎ ), degrading the overall performance of the system. This explains why in the latter case the performance of the larger RIS is remarkably worse, and highlights the importance of considering the effect of the training on the performance.
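
The interplay between beam width and codebook size can be illustrated with a deliberately simplified Python sketch. It is not the simulation setup of this section: it assumes a one-dimensional uniform linear array with half-wavelength spacing instead of a planar RIS, a single propagation path, and a hypothetical codebook whose beams are uniformly spaced in spatial frequency. The function names are illustrative only. The sketch computes the array gain, normalized to each array's own peak, at the largest possible misalignment between the target direction and the nearest codeword.

```python
import numpy as np


def normalized_gain(n_elements: int, offset: float) -> float:
    """Normalized power gain (1.0 = perfect alignment) of a uniform linear
    array whose beam is `offset` cycles of spatial frequency off target."""
    n = np.arange(n_elements)
    array_factor = np.sum(np.exp(2j * np.pi * n * offset)) / n_elements
    return float(np.abs(array_factor) ** 2)


def worst_case_gain(n_elements: int, n_codewords: int) -> float:
    """Gain at the largest possible misalignment when `n_codewords` beams
    are spaced uniformly over one period of spatial frequency."""
    return normalized_gain(n_elements, 0.5 / n_codewords)


if __name__ == "__main__":
    for n_elements, n_codewords in ((8, 8), (16, 8), (16, 16)):
        print(n_elements, n_codewords, worst_case_gain(n_elements, n_codewords))
```

Under these assumptions, keeping the codebook size fixed while doubling the array length places the worst-case direction in a null of the narrower beam, whereas doubling the codebook restores the worst-case gain at the cost of a longer training stage.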

VIII. CONCLUSIONS
This paper investigated NCDS based on differential modulation adequately combined with a codebook-based beam training of the RIS (NCDS-RIS) in order to provide a zero overhead training procedure. The proposed scheme is able to execute the beam training process at the RIS, and at the same time, the differential data symbols are transmitted using the direct and reflective links, where the latter is strongly time-varying as a consequence of the beam training. Therefore, this combination is able to efficiently obtain the best phase configuration for the RIS and increase the system data-rate, especially in scenarios with moderate to high mobility and/or highly attenuated direct BS-UE links.
Moreover, the proposed method is simple, yet effective, as compared to conventional RIS-empowered systems based on CDS. In contrast to CDS, the presented analysis of the efficiency and complexity revealed that the NCDS requires a substantially smaller amount of reference signals and complex products. Hence, these additional benefits will reduce the cost, energy consumption and latency of RIS-empowered communications.

APPENDIX A COMPUTATION OF THE AVERAGE INTERFERENCE AND NOISE POWER
This appendix presents the derivation of closed-form expressions for the interference and noise terms required in (41) for the considered geometric wideband channel model. It is assumed that the channel frequency responses of any two contiguous subcarriers of the OFDM signal have a similar value (h , = h −1, , 2 ≤ , ≤ , 1 ≤ , ≤ ). Note that this assumption is widely accepted in multi-carrier waveforms, since the number of subcarriers is typically designed to be much larger than the number of taps of the channel impulse response.
Taking into account the received signal given in (13), the first term after performing the differential decoding in (26) can be written as

where it can be easily shown that the three terms are independent of each other, and hence, the average power of (51) can be decomposed as

The second term of (52) corresponds to the reflected BS-UE channel via the RIS after choosing the phase configuration (h  , ). Assuming that the RIS is capable of pointing the narrow directive beams towards the LoS components of the BS-RIS and RIS-UE channels, and that the NLoS components are spatially filtered out, the reflective channel can be considered a deterministic value as

The second and third terms after performing the differential modulation given in (26) and (27) can be easily obtained since the transmitted symbol, the channel frequency response and noise are independent random variables. Hence, the average power of  2 and  3 can be computed as

Finally, the fourth term after performing the differential modulation given in (27) can be straightforwardly computed since the noise samples at two different subcarriers are independent random variables, therefore

Fig. 1. The RIS-empowered wireless communication link comprising a multi-antenna BS, a multi-element passive panel, and a single-antenna mobile UE.

Fig. 2. A frame of  OFDM symbols is transmitted in two stages within the channel coherence time (  ). In the first stage, beam training is executed, and at the same time, non-coherent transmission is performed over the BS-UE direct link. In the second stage, the best phase configuration is loaded to the RIS and non-coherent data symbols are transmitted over the enhanced reflective BS-UE link via RIS.

Fig. 4. BER comparison between CDS and NCDS for both stages.

Fig. 5. Throughput comparison between CDS and NCDS for a speed of 4 m/s, which corresponds to  = 2  .
TABLE I
COMPLEXITY COMPARISON BETWEEN THE PROPOSED NCDS AND THE CONSIDERED BASELINE RS AND CDS.
Moreover, a summary of the simulation parameters is provided in Table II; the location of each network node is given by the Cartesian coordinates (, , ) measured in meters, and   denotes the carrier frequency. The distance between any two contiguous elements of the BS and RIS is set to half wavelength.

Fig. 3. Analytical SINR verification for NCDS at both stages.

TABLE III
THROUGHPUT COMPARISON BETWEEN CDS AND NCDS AT EACH STAGE FOR   = −8 dB IN [10^6 PACKETS/S].