Enabling Large Intelligent Surfaces with Compressive Sensing and Deep Learning

Employing large intelligent surfaces (LISs) is a promising solution for improving the coverage and rate of future wireless systems. These surfaces comprise a massive number of nearly-passive elements that interact with the incident signals, for example by reflecting them, in a smart way that improves the wireless system performance. Prior work focused on the design of the LIS reflection matrices assuming full knowledge of the channels. Estimating these channels at the LIS, however, is a key challenging problem, and is associated with large training overhead given the massive number of LIS elements. This paper proposes efficient solutions for these problems by leveraging tools from compressive sensing and deep learning. First, a novel LIS architecture based on sparse channel sensors is proposed. In this architecture, all the LIS elements are passive except for a few elements that are active (connected to the baseband of the LIS controller). We then develop two solutions that design the LIS reflection matrices with negligible training overhead. In the first approach, we leverage compressive sensing tools to construct the channels at all the LIS elements from the channels seen only at the active elements. These full channels can then be used to design the LIS reflection matrices with no training overhead. In the second approach, we develop a deep learning based solution where the LIS learns how to optimally interact with the incident signal given the channels at the active elements, which represent the current state of the environment and transmitter/receiver locations. We show that the achievable rates of the proposed compressive sensing and deep learning solutions approach the upper bound, that assumes perfect channel knowledge, with negligible training overhead and with less than 1% of the elements being active.


I. INTRODUCTION
Large Intelligent Surfaces (LISs) are envisioned as intrinsic components of beyond-5G wireless systems [1]-[10]. In its core design concept, an LIS realizes a continuous electromagnetically-active surface by stacking a massive number of radiating/sensing elements. These elements interact with the incident signals, for example by reflecting them, in a way that improves the coverage and rate of the wireless systems [1], [2]. This concept is further motivated by the possible energy-efficient implementation using nearly passive elements such as analog phase shifters [7], [11], [12]. Prior work focused on developing efficient designs for the LIS reflection matrices and evaluating their coverage and rate gains assuming the existence of global channel knowledge. But how can these extremely large-dimensional channels be estimated if the LIS is implemented using only reflecting elements? Obtaining this channel knowledge may require huge, and possibly prohibitive, training overhead, which represents a main challenge for the operation of LIS systems. To overcome this challenge, this paper proposes a novel LIS hardware architecture jointly with compressive sensing and deep learning based solutions that design the LIS reflection matrices with negligible training overhead.

A. Prior Work
LIS-assisted wireless communication has attracted increasing interest in the last few years. In terms of circuit implementations, LISs can be realized using nearly passive elements with reconfigurable parameters [7]. Candidate designs include conventional reflect-arrays [11], [12] and software-defined metamaterials [8], [13], among others. Using these surfaces, several signal processing solutions have been proposed to optimize the design of the LIS reconfigurable parameters (mostly considering the LIS as a reflecting surface) [4], [7], [14]. In [7], an LIS-assisted downlink multi-user setup is considered with single-antenna users. The LIS elements in [7] are modeled as quantized phase shifters/reflectors, and computationally low-complexity algorithms were developed to design these LIS phase matrices. In [4], an LIS-assisted downlink scenario, similar to that in [7], was considered and the precoder matrix at the base station as well as the LIS reflection matrices were designed, focusing on the case where a line-of-sight (LOS) path exists between the base station and the LIS. In [14], a new transmission strategy combining LIS with index modulation was proposed to improve the system spectral efficiency. In terms of the overall system performance, [5] considered an uplink multi-user scenario and characterized the data rates with channel estimation errors.
The Critical Challenge: All the prior work in [4], [5], [7], [12], [14] assumed that knowledge of the channels between the LIS and the transmitters/receivers is available at the base station, either perfectly or with some error. Obtaining this channel knowledge, however, is one of the most critical challenges for LIS systems due to the massive number of antennas (LIS elements) and the hardware constraints on these elements. More specifically, if the LIS elements are implemented using phase shifters that just reflect the incident signals, then there are two main approaches for designing the LIS reflection matrix. The first approach is to estimate the LIS-assisted channels at the transmitter/receiver by training all the LIS elements, normally one by one, and then use the estimated channels to design the reflection matrix. This yields massive channel training overhead due to the very large number of elements at the LIS. Instead of the explicit channel estimation, the LIS reflection matrix can be selected from quantized codebooks via online beam/reflection training. This is similar to the common beam training techniques in mmWave systems that employ similar phase shifter architectures [15], [16]. To sufficiently quantize the space, however, the size of the reflection codebooks normally needs to be on the order of the number of antennas, which leads to huge training overhead. To avoid this training overhead, a trivial solution is to employ fully-digital or hybrid analog/digital architectures at the LIS, where every antenna element is connected to the baseband, where channel estimation strategies can be used to obtain the channels [17]-[19]. This solution, however, leads to high hardware complexity and power consumption given the massive number of LIS elements.

B. Contribution
In this paper, we consider an LIS-assisted wireless communication system and propose a novel LIS architecture as well as compressive sensing and deep learning based solutions that design the LIS reflection matrix with negligible training overhead. More specifically, the contributions of this paper can be summarized as follows.
• Novel LIS hardware architecture: We introduce a new LIS architecture where all the elements are passive except a few randomly distributed active channel sensors. Only those few active sensors are connected to the baseband of the LIS controller and are used to enable the efficient design of the LIS reflection matrices with low training overhead.
• Compressive sensing based LIS reflection matrix design: Given the new LIS architecture with randomly distributed active elements, we develop a compressive sensing based solution to recover the full channels between the LIS and the transmitters/receivers from the sampled channels sensed at the few active elements. Using the constructed channels, we then design the LIS reflection matrices with no training overhead. We show that the proposed solution can efficiently design the LIS reflection matrices when only a small fraction of the LIS elements are active, yielding a promising solution for LIS systems from both energy efficiency and training overhead perspectives.
• Deep learning based LIS reflection matrix design: Leveraging tools from machine/deep learning, we propose a solution that learns the direct mapping from the sampled channels seen at the active LIS elements to the optimal LIS reflection matrices that maximize the system achievable rate. Essentially, the proposed approach teaches the LIS system how to interact with the incident signal given the knowledge of the sampled channel vectors, which we call environment descriptors. In other words, the LIS learns that when it observes these environment descriptors, it should reflect the incident signal using this reflection matrix.
Unlike the compressive sensing solution, the deep learning approach leverages the prior observations at the LIS and does not require any knowledge of the array structure.
The proposed solutions are extensively evaluated using the accurate ray-tracing based DeepMIMO dataset [20]. The results show that the developed compressive sensing and deep learning solutions can approach the optimal upper bound, which assumes perfect channel knowledge, when only a few LIS elements are active and with almost no training overhead, highlighting potential solutions for LIS systems.
Notation: We use the following notation throughout this paper: $\mathbf{A}$ is a matrix, $\mathbf{a}$ is a vector, $a$ is a scalar, $\mathcal{A}$ is a set of scalars, and $\boldsymbol{\mathcal{A}}$ is a set of vectors. $\|\mathbf{a}\|_p$ is the $p$-norm of $\mathbf{a}$. $|\mathbf{A}|$ is the determinant of $\mathbf{A}$, whereas $\mathbf{A}^T$, $\mathbf{A}^H$, $\mathbf{A}^*$, $\mathbf{A}^{-1}$, $\mathbf{A}^\dagger$ are its transpose, Hermitian (conjugate transpose), conjugate, inverse, and pseudo-inverse, respectively.

II. SYSTEM AND CHANNEL MODELS
In this section, we describe the adopted system and channel models for the large intelligent surfaces (LISs).

A. System Model
Consider the system model shown in Fig. 1 where a transmitter is communicating with a receiver, and this communication is assisted by a large intelligent surface (LIS). For simplicity, we assume that the LIS has M antennas while both the transmitter and receiver are single-antenna. The proposed solutions and results in this paper, however, can be extended to multi-antenna transceivers. Note also that these transmitters/receivers can represent either base stations or user equipment.
Adopting an OFDM-based system of $K$ subcarriers, and defining $\mathbf{h}_{T,k}, \mathbf{h}_{R,k} \in \mathbb{C}^{M \times 1}$ as the $M \times 1$ uplink channels from the transmitter/receiver to the LIS at the $k$th subcarrier, $\mathbf{h}_{T,k}^T, \mathbf{h}_{R,k}^T$ as the downlink channels by reciprocity, and $h_{TR,k} \in \mathbb{C}$ as the direct channel between the transmitter and receiver, we can write the received signal at the receiver as
$$y_k = \mathbf{h}_{R,k}^T \boldsymbol{\Psi}_k \mathbf{h}_{T,k} s_k + h_{TR,k} s_k + n_k, \tag{1}$$
where $s_k$ denotes the transmitted signal over the $k$th subcarrier, and satisfies $\mathbb{E}[|s_k|^2] = \frac{P_T}{K}$, with $P_T$ representing the total transmit power, and $n_k \sim \mathcal{N}_{\mathbb{C}}(0, \sigma_n^2)$ is the receive noise. The $M \times M$ matrix $\boldsymbol{\Psi}_k$, which we call the LIS interaction matrix, represents the interaction of the LIS with the incident (impinging) signal from the transmitter. The overall objective of the LIS is then to interact with the incident signal (via adjusting $\boldsymbol{\Psi}_k$) in a way that optimizes a certain performance metric such as the system achievable rate or the network coverage. To simplify the design and analysis of the algorithms in this paper, we will focus on the case where the direct link does not exist. This represents the scenarios where the direct link is either blocked or has negligible receive power compared to that received through the LIS-assisted link. With this assumption, the receive signal can be expressed as
$$y_k = \mathbf{h}_{R,k}^T \boldsymbol{\Psi}_k \mathbf{h}_{T,k} s_k + n_k \tag{2}$$
$$\overset{(a)}{=} \left(\mathbf{h}_{T,k} \odot \mathbf{h}_{R,k}\right)^T \boldsymbol{\psi}_k s_k + n_k, \tag{3}$$
where $\odot$ denotes the element-wise (Hadamard) product, and (a) follows from noting that the interaction matrix $\boldsymbol{\Psi}_k$ has a diagonal structure, and denoting the diagonal vector as $\boldsymbol{\psi}_k$, i.e., $\boldsymbol{\Psi}_k = \mathrm{diag}(\boldsymbol{\psi}_k)$. This diagonal structure results from the operation of the LIS where every element $m$, $m = 1, 2, ..., M$, reflects the incident signal after multiplying it with an interaction factor $[\boldsymbol{\psi}_k]_m$. Now, we make two important notes on these interaction vectors. First, while the interaction factors, $[\boldsymbol{\psi}_k]_m, \forall m, k$, can generally have different magnitudes (amplifying/attenuation gains), it is more practical to assume that the LIS elements are implemented using only phase shifters.
Second, since the implementation of the phase shifters is done in the analog domain (using RF circuits), the same phase shift will be applied to the signals on all subcarriers, i.e., $\boldsymbol{\psi}_k = \boldsymbol{\psi}, \forall k$. Accounting for these practical considerations, we assume that every interaction factor is just a phase shifter, i.e., $[\boldsymbol{\psi}]_m = e^{j\phi_m}$. Further, we will call the interaction vector $\boldsymbol{\psi}$ in this case the reflection beamforming vector.
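The step from the diagonal interaction matrix to the interaction vector can be checked numerically. The following is a minimal sketch, assuming random placeholder channels and NumPy; the variable names and sizes are illustrative only and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 8  # number of LIS elements (small, for illustration only)

# Hypothetical random channels at one subcarrier (not from the paper's dataset).
h_T = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h_R = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Phase-shifter-only interaction vector: unit-modulus entries e^{j phi_m}.
phi = rng.uniform(0.0, 2.0 * np.pi, size=M)
psi = np.exp(1j * phi)

# Matrix form: h_R^T diag(psi) h_T (plain transpose, no conjugation).
gain_matrix_form = h_R @ np.diag(psi) @ h_T

# Vector form: (h_T ⊙ h_R)^T psi, using the element-wise product.
gain_vector_form = (h_T * h_R) @ psi

print(np.isclose(gain_matrix_form, gain_vector_form))
```

The vector form avoids building the M x M diagonal matrix, which matters when M is in the thousands.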

B. Channel Model
In this paper, we adopt a wideband geometric channel model for the channels $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$ between the transmitter/receiver and the LIS [21]. Consider a transmitter-LIS uplink channel $\mathbf{h}_{T,k}$ (and similarly for $\mathbf{h}_{R,k}$) consisting of $L$ clusters, where each cluster contributes one ray with a time delay $\tau_\ell \in \mathbb{R}$, a complex coefficient $\alpha_\ell \in \mathbb{C}$, and azimuth/elevation angles of arrival $\theta_\ell, \phi_\ell \in [0, 2\pi)$. Let $\rho_T$ denote the path loss between the transmitter and the LIS, and let $p(\tau)$ denote the pulse shaping function for $T_S$-spaced signaling evaluated at $\tau$ seconds.
The delay-$d$ channel vector, $\mathsf{h}_{T,d} \in \mathbb{C}^{M \times 1}$, between the transmitter and the LIS can then be defined as
$$\mathsf{h}_{T,d} = \sqrt{\frac{M}{\rho_T}} \sum_{\ell=1}^{L} \alpha_\ell \, p(dT_S - \tau_\ell) \, \mathbf{a}(\theta_\ell, \phi_\ell), \tag{4}$$
where $\mathbf{a}(\theta_\ell, \phi_\ell) \in \mathbb{C}^{M \times 1}$ denotes the array response vector of the LIS at the angles of arrival $\theta_\ell, \phi_\ell$. Given this delay-$d$ channel, the frequency domain channel vector at subcarrier $k$, $\mathbf{h}_{T,k}$, can be written as
$$\mathbf{h}_{T,k} = \sum_{d=0}^{D-1} \mathsf{h}_{T,d} \, e^{-j \frac{2\pi k}{K} d}, \tag{5}$$
where $D$ is the number of delay taps. Considering a block-fading channel model, $\mathbf{h}_{T,k}$ and $\mathbf{h}_{R,k}$ are assumed to stay constant over the channel coherence time, denoted $T_C$, which depends on the user mobility and the dynamics of the environment, among others. It is worth noting that the number of channel paths $L$ depends highly on the operational frequency band and the propagation environment. For example, mmWave channels normally consist of a small number of channel paths, ∼3-5 paths [22]-[24], while sub-6 GHz signal propagation generally experiences rich scattering, resulting in channels with more multi-path components.
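The channel model above can be sketched in a few lines. The following is a minimal illustration, assuming a half-wavelength-spaced UPA in the y-z plane, a unit-norm array response, and a sinc pulse shaping function; none of these choices, nor the function names and path values, are specified in the text above.

```python
import numpy as np

def upa_response(azimuth, elevation, m_y, m_z):
    """Array response of a half-wavelength-spaced UPA in the y-z plane.

    The geometry, spacing, and unit-norm scaling are assumptions for this sketch.
    """
    a_y = np.exp(1j * np.pi * np.arange(m_y) * np.sin(azimuth) * np.sin(elevation))
    a_z = np.exp(1j * np.pi * np.arange(m_z) * np.cos(elevation))
    return np.kron(a_z, a_y) / np.sqrt(m_y * m_z)

def delay_channel(d, paths, rho_T, T_S, m_y, m_z):
    """Delay-d channel: sqrt(M/rho_T) * sum_l alpha_l p(d T_S - tau_l) a(theta_l, phi_l)."""
    M = m_y * m_z
    h = np.zeros(M, dtype=complex)
    for alpha, tau, azimuth, elevation in paths:
        pulse = np.sinc((d * T_S - tau) / T_S)  # assumed sinc pulse p(.)
        h += alpha * pulse * upa_response(azimuth, elevation, m_y, m_z)
    return np.sqrt(M / rho_T) * h

def frequency_channel(k, K, D, paths, rho_T, T_S, m_y, m_z):
    """Subcarrier-k channel: sum over d of the delay-d channel times exp(-j 2 pi k d / K)."""
    h_k = np.zeros(m_y * m_z, dtype=complex)
    for d in range(D):
        h_k += delay_channel(d, paths, rho_T, T_S, m_y, m_z) * np.exp(-2j * np.pi * k * d / K)
    return h_k

# Three hypothetical paths: (alpha, tau [s], azimuth [rad], elevation [rad]).
T_S = 1e-8
paths = [(1.0 + 0.5j, 0.3 * T_S, 0.4, 1.2),
         (0.3 - 0.2j, 1.7 * T_S, 1.1, 1.5),
         (0.1 + 0.1j, 2.4 * T_S, 2.0, 1.0)]
h_T = np.stack([frequency_channel(k, K=16, D=4, paths=paths, rho_T=1e4,
                                  T_S=T_S, m_y=4, m_z=4) for k in range(16)])
print(h_T.shape)  # one row per subcarrier, one column per LIS element
```

With a single path at zero delay, the sinc pulse is non-zero only at the first tap, so every subcarrier sees the same scaled array response; this is a convenient sanity check for the implementation.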

III. PROBLEM FORMULATION
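The objective of this paper is to design the LIS interaction vector (reflection beamforming vector), $\boldsymbol{\psi}$, to maximize the achievable rate at the receiver. Given the system and channel models in Section II, this achievable rate can be written as
$$R = \frac{1}{K} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SNR} \left|\mathbf{h}_{R,k}^T \boldsymbol{\Psi} \mathbf{h}_{T,k}\right|^2\right) \tag{6}$$
$$= \frac{1}{K} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SNR} \left|\left(\mathbf{h}_{T,k} \odot \mathbf{h}_{R,k}\right)^T \boldsymbol{\psi}\right|^2\right), \tag{7}$$
where $\mathrm{SNR} = \frac{P_T}{K \sigma_n^2}$ denotes the signal-to-noise ratio. As mentioned in Section II-A, every element in the LIS reflection beamforming vector, $\boldsymbol{\psi}$, is implemented using an RF phase shifter. These phase shifters, however, normally have a quantized set of angles and cannot shift the signal with an arbitrary phase. To capture this constraint, we assume that the reflection beamforming vector $\boldsymbol{\psi}$ can only be picked from a pre-defined codebook $\mathcal{P}$. Every candidate reflection beamforming codeword in $\mathcal{P}$ is assumed to be implemented using quantized phase shifters. With this assumption, our objective is then to find the optimal reflection beamforming vector $\boldsymbol{\psi}^\star$ that solves
$$\boldsymbol{\psi}^\star = \arg\max_{\boldsymbol{\psi} \in \mathcal{P}} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SNR} \left|\left(\mathbf{h}_{T,k} \odot \mathbf{h}_{R,k}\right)^T \boldsymbol{\psi}\right|^2\right), \tag{8}$$
to result in the optimal rate $R^\star$ defined as
$$R^\star = \frac{1}{K} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SNR} \left|\left(\mathbf{h}_{T,k} \odot \mathbf{h}_{R,k}\right)^T \boldsymbol{\psi}^\star\right|^2\right). \tag{9}$$
Due to the quantized codebook constraint and the time-domain implementation of the reflection beamforming vector, i.e., using one interaction vector $\boldsymbol{\psi}$ for all subcarriers, there is no closed-form solution for the optimization problem in (8). Consequently, finding the optimal reflection beamforming vector $\boldsymbol{\psi}^\star$ for the LIS requires an exhaustive search over the codebook $\mathcal{P}$.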
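The rate expression and the exhaustive codebook search can be illustrated as follows. This is a minimal sketch under stated assumptions: the DFT codebook, the random channels, and the function names are hypothetical choices made here for illustration, and the text above does not fix a particular codebook.

```python
import numpy as np

def achievable_rate(h_T, h_R, psi, snr):
    """Rate averaged over K subcarriers; h_T and h_R have shape (K, M)."""
    gains = np.abs((h_T * h_R) @ psi) ** 2
    return float(np.mean(np.log2(1.0 + snr * gains)))

def exhaustive_search(h_T, h_R, codebook, snr):
    """Evaluate every codeword (one per row) and return the best index and all rates."""
    rates = np.array([achievable_rate(h_T, h_R, psi, snr) for psi in codebook])
    return int(np.argmax(rates)), rates

def dft_codebook(M):
    """Assumed codebook: M unit-modulus DFT codewords, one per row."""
    n = np.arange(M)
    return np.exp(2j * np.pi * np.outer(n, n) / M)

rng = np.random.default_rng(0)
K, M = 4, 16
h_T = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
h_R = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

best_index, rates = exhaustive_search(h_T, h_R, dft_codebook(M), snr=1.0)
print(best_index, rates[best_index])
```

The search cost grows linearly with the codebook size, which is why the online version of this search becomes expensive when the codebook has thousands of codewords.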
The main challenge: As characterized in (8), finding the optimal LIS interaction vector $\boldsymbol{\psi}^\star$ and achieving the optimal rate $R^\star$ requires an exhaustive search over the codebook $\mathcal{P}$. Note that the codebook size should normally be on the same order as the number of antennas to make use of these antennas. This means that a reasonable reflection beamforming codebook for LIS systems will probably have thousands of candidate codewords. With such huge codebooks, solving the exhaustive search in (8) is very challenging. More specifically, there are two main approaches for performing the search in (8).
• Full channel estimation with offline exhaustive search: In this approach, we need to estimate the full channels between the LIS and the transmitter/receiver, $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$, and use them to find the optimal reflection beamforming vector by the offline calculation of (8). Estimating these channel vectors, however, requires the LIS to employ a complex hardware architecture that connects all the antenna elements to a baseband processing unit through either a fully-digital or a hybrid analog/digital architecture [17], [18]. Given the massive number of antennas at large intelligent surfaces, this approach can yield prohibitive hardware complexity in terms of the routing and power consumption, among others. If the LIS is operated and controlled via a base station or an access point [7], then this channel estimation process can be done at these communication ends. This, however, assumes orthogonal training over the LIS antennas, for example by activating one LIS antenna at a time, which leads to prohibitive training overhead given the number of antennas at the LIS.
• Online exhaustive beam training: Instead of the explicit channel estimation, the best LIS beam reflection vector $\boldsymbol{\psi}^\star$ can be found through an over-the-air beam training process. This process essentially solves the exhaustive search in (8) by testing the candidate interaction vectors $\boldsymbol{\psi} \in \mathcal{P}$ one by one. This exhaustive beam training process, however, again incurs very large training overhead in LIS systems.
Our objective in this paper is to enable large intelligent surfaces by addressing this main challenge. More specifically, our objective is to enable LIS systems to approach the optimal achievable rate in (9) adopting low-complexity hardware architectures and requiring low training overhead. For this objective, we first propose a novel energy-efficient LIS transceiver architecture in Section IV. Then, we show in Sections V-VI how to employ this LIS architecture to achieve near-optimal achievable rates with negligible training overhead via leveraging tools from compressive sensing and deep learning.

IV. LARGE INTELLIGENT SURFACES WITH SPARSE SENSORS: A NOVEL ARCHITECTURE
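As discussed in Section III, a main challenge for the LIS system operation lies in the high hardware complexity and training overhead associated with designing the LIS interaction (reflection beamforming) vector, $\boldsymbol{\psi}$. In order to overcome this challenge and enable LIS systems in practice, we propose the new LIS architecture in Fig. 2. In this architecture, the LIS consists of (i) $M$ passive reflecting elements, each implemented using an RF phase shifter and not connected to the baseband unit, and (ii) a small number, $\overline{M} \ll M$, of active channel sensors that are connected to the baseband of the LIS controller. The active channel sensors have two modes of operation: (i) a channel sensing mode, where they work as receivers with full RF chains and baseband processing, and (ii) a reflection mode, where they act just like the rest of the passive elements that reflect the incident signal. It is important to note that while we describe the $M$ phase shifting elements as passive elements to differentiate them from the $\overline{M}$ active channel sensors, they are normally implemented using reconfigurable active RF circuits [11], [25]. Next, we define the channels from the transmitter/receiver to the active channel sensors of the LIS, and then discuss how to leverage this energy-efficient LIS architecture for designing the LIS interaction vector $\boldsymbol{\psi}$.
The $\overline{M} \times 1$ sampled channel vector from the transmitter to the active LIS elements at the $k$th subcarrier can be expressed as
$$\overline{\mathbf{h}}_{T,k} = \mathbf{G}_{LIS} \mathbf{h}_{T,k}, \tag{10}$$
where $\mathbf{G}_{LIS}$ is the $\overline{M} \times M$ selection matrix that selects the entries of $\mathbf{h}_{T,k}$ corresponding to the active LIS elements. The sampled channel vector from the receiver, $\overline{\mathbf{h}}_{R,k}$, is defined similarly. Finally, we define $\overline{\mathbf{h}}_k = \overline{\mathbf{h}}_{T,k} \odot \overline{\mathbf{h}}_{R,k}$ as the overall LIS sampled channel vector at the $k$th subcarrier.
Designing the LIS interaction vector: For the system model in Section II-A with the proposed LIS architecture in Fig. 2, estimating the sampled channel vectors $\overline{\mathbf{h}}_{T,k}, \overline{\mathbf{h}}_{R,k}$ can be easily done with a few pilot signals. For example, adopting an uplink training approach, the transmitter can send one pilot signal that will be simultaneously received and processed by all the active elements in the LIS to estimate $\overline{\mathbf{h}}_{T,k}$ (and similarly for $\overline{\mathbf{h}}_{R,k}$). Given these sampled channel vectors, however, how can the LIS find the optimal reflection beamforming vector $\boldsymbol{\psi}^\star$ that solves (9)? In the next two sections, we propose two approaches for addressing this problem leveraging tools from compressive sensing (in Section V) and deep learning (in Section VI).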
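The sampling performed by the active elements can be written as a row selection of the full channel. The sketch below assumes the selection matrix is built from rows of the identity at randomly chosen active indices; the sizes, random channels, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M, M_bar, K = 64, 4, 8  # hypothetical sizes: elements, active sensors, subcarriers

# Randomly chosen active-element indices (every element equally likely).
active = np.sort(rng.choice(M, size=M_bar, replace=False))

# Selection matrix: rows of the M x M identity at the active indices.
G_LIS = np.eye(M)[active, :]

# Hypothetical full channels, one row per subcarrier.
h_T = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
h_R = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

# Sampled channels: apply the selection matrix to every subcarrier.
h_T_bar = h_T @ G_LIS.T
h_R_bar = h_R @ G_LIS.T

# Overall sampled channel per subcarrier (element-wise product).
h_bar = h_T_bar * h_R_bar

print(h_bar.shape)  # K subcarriers by M_bar active elements
```

In practice the matrix product is unnecessary, since indexing with `active` gives the same result; the explicit matrix is shown only to mirror the notation.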

V. COMPRESSIVE SENSING BASED LIS INTERACTION DESIGN
As shown in Section III, finding the optimal LIS interaction (reflection beamforming) vector $\boldsymbol{\psi}^\star$ that maximizes the achievable rate with no beam training overhead requires the availability of the full channel vectors $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$. Estimating these channel vectors at the LIS, however, normally requires that every LIS antenna be connected to the baseband processing unit through a fully-digital or hybrid architecture [17], [19], [26]. This can massively increase the hardware complexity given the large number of antennas in LIS systems. In this section, adopting the low-complexity LIS architecture proposed in Section IV, we show that it is possible to recover the full channel vectors $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$ from the sampled channel vectors $\overline{\mathbf{h}}_{T,k}, \overline{\mathbf{h}}_{R,k}$ when the channels experience sparse scattering. This is typically the case in mmWave and LOS-dominant sub-6 GHz systems.
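A. Recovering Full Channels from Sampled Channels:
With the proposed LIS architecture in Fig. 2, the LIS can easily estimate the sampled channel vectors $\overline{\mathbf{h}}_{T,k}, \overline{\mathbf{h}}_{R,k}$ through uplink training from the transmitter and receiver to the LIS with a few pilots. Next, we explain how to use these sampled channel vectors to estimate the full channel vectors $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$. First, note that $\mathbf{h}_{T,k}$ in (4), (5) (and similarly for $\mathbf{h}_{R,k}$) can be written as
$$\mathbf{h}_{T,k} = \sqrt{\frac{M}{\rho_T}} \sum_{d=0}^{D-1} \sum_{\ell=1}^{L} \alpha_\ell \, p(dT_S - \tau_\ell) \, \mathbf{a}(\theta_\ell, \phi_\ell) \, e^{-j \frac{2\pi k}{K} d} = \sum_{\ell=1}^{L} \beta_\ell \, \mathbf{a}(\theta_\ell, \phi_\ell),$$
where the sum over $d$ runs over the $D$ delay taps and
$$\beta_\ell = \sqrt{\frac{M}{\rho_T}} \, \alpha_\ell \sum_{d=0}^{D-1} p(dT_S - \tau_\ell) \, e^{-j \frac{2\pi k}{K} d}.$$
By defining the array response matrix $\mathbf{A}$ and the $k$th subcarrier path gain vector $\boldsymbol{\beta}$ as
$$\mathbf{A} = \left[\mathbf{a}(\theta_1, \phi_1), \ldots, \mathbf{a}(\theta_L, \phi_L)\right], \qquad \boldsymbol{\beta} = \left[\beta_1, \ldots, \beta_L\right]^T,$$
we can write $\mathbf{h}_{T,k}$ in a more compact way as $\mathbf{h}_{T,k} = \mathbf{A} \boldsymbol{\beta}$. Now, we note that in several important scenarios, such as mmWave and LOS-dominant sub-6 GHz, the channel experiences sparse scattering, which results in a small number of paths $L$ [18], [23]. In order to leverage this sparsity, we follow [19] and define the dictionary of array response vectors $\mathbf{A}_D$, where every column is an array response vector in one quantized azimuth and elevation direction.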
For example, if the LIS adopts a uniform planar array (UPA) structure, then we can define $\mathbf{A}_D$ as
$$\mathbf{A}_D = \mathbf{A}_D^{Az} \otimes \mathbf{A}_D^{El},$$
with $\mathbf{A}_D^{Az}$ and $\mathbf{A}_D^{El}$ being the dictionaries of the azimuth and elevation array response vectors. Every column in $\mathbf{A}_D^{Az}$ (and similarly for $\mathbf{A}_D^{El}$) is an azimuth (elevation) array response in one quantized azimuth (elevation) direction. If the numbers of grid points in the azimuth and elevation dictionaries are $N_D^{Az}$ and $N_D^{El}$, respectively, and the UPA has $M$ elements in total across its horizontal and vertical dimensions, then $\mathbf{A}_D$ has dimensions $M \times N_D^{Az} N_D^{El}$. Now, assuming that the size of the grid is large enough such that the azimuth and elevation angles $\theta_\ell, \phi_\ell, \forall \ell$, match exactly $L$ points in this grid (which is a common assumption in the formulations of sparse channel estimation approaches [18], [19], [27]), we can rewrite $\mathbf{h}_{T,k}$ as
$$\mathbf{h}_{T,k} = \mathbf{A}_D \mathbf{x}_\beta, \tag{16}$$
where $\mathbf{x}_\beta$ is an $N_D^{Az} N_D^{El} \times 1$ sparse vector with $L \ll N_D^{Az} N_D^{El}$ non-zero entries equal to the elements of $\boldsymbol{\beta}$. Further, these non-zero entries are in the positions that correspond to the channel azimuth/elevation angles of arrival. Next, let $\widehat{\overline{\mathbf{h}}}_{T,k}$ denote the noisy sampled channel vector; then we can write
$$\widehat{\overline{\mathbf{h}}}_{T,k} = \mathbf{G}_{LIS} \mathbf{A}_D \mathbf{x}_\beta + \mathbf{v}_k = \boldsymbol{\Phi} \mathbf{x}_\beta + \mathbf{v}_k,$$
where $\mathbf{v}_k \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma_n^2 \mathbf{I})$ represents the receive noise vector at the LIS active channel sensors, $\mathbf{G}_{LIS}$ is the selection matrix defined in (10), and $\boldsymbol{\Phi} = \mathbf{G}_{LIS} \mathbf{A}_D$ is the equivalent sensing matrix. Now, given $\boldsymbol{\Phi}$ and the noisy sampled channel vector $\widehat{\overline{\mathbf{h}}}_{T,k}$, the objective is to estimate the sparse vector $\mathbf{x}_\beta$ that solves the non-convex combinatorial problem
$$\min \left\|\mathbf{x}_\beta\right\|_0 \quad \text{s.t.} \quad \left\|\widehat{\overline{\mathbf{h}}}_{T,k} - \boldsymbol{\Phi} \mathbf{x}_\beta\right\|_2 \leq \sigma, \tag{20}$$
where $\sigma$ is a tolerance on the residual. Given the sparse formulation in (20), several compressive sensing reconstruction algorithms, such as orthogonal matching pursuit (OMP) [28], [29], can be employed to obtain an approximate solution for $\mathbf{x}_\beta$. With this solution for $\mathbf{x}_\beta$, the full channel vector $\mathbf{h}_{T,k}$ can be constructed according to (16). Finally, the constructed full channel vector can be used to find the LIS reflection beamforming vector $\boldsymbol{\psi}$ via an offline search using (8).
Fig. 3. This figure plots the achievable rates (in bps/Hz) using the proposed compressive sensing based solution for two scenarios, namely a mmWave 28GHz scenario and a low-frequency 3.5GHz one. These achievable rates are compared to the optimal rate $R^\star$ in (9) that assumes perfect channel knowledge. The figure illustrates the potential of the proposed solutions that approach the upper bound while requiring only a small fraction of the total LIS elements to be active.
In this paper, we assume for simplicity that the $\overline{M}$ active channel sensors are randomly selected from the $M$ LIS elements, assuming that all the elements are equally likely to be selected. It is important, however, to note that the specific selection of the active elements determines the compressive sensing matrix $\boldsymbol{\Phi}$ and its properties. Therefore, it is interesting to explore the optimization of the active element selection, leveraging tools from nested arrays [30], coprime arrays [31], [32], incoherence frames [33], and difference sets [26], [34].
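The recovery procedure can be sketched with a basic orthogonal matching pursuit loop. This is a minimal illustration, assuming the number of paths is known, a half-wavelength uniform grid dictionary in each dimension, on-grid angles, and small hypothetical sizes; the helper names are made up here, and whether recovery succeeds depends on the random draw, the grid, and the number of active elements.

```python
import numpy as np

def steering_dictionary(n_elements, n_grid):
    """Columns: half-wavelength ULA responses on a uniform spatial-frequency grid (assumed)."""
    freqs = np.arange(n_grid) / n_grid
    return np.exp(2j * np.pi * np.outer(np.arange(n_elements), freqs)) / np.sqrt(n_elements)

def omp(Phi, y, n_paths):
    """Greedy sparse recovery: pick one dictionary column per iteration, then refit."""
    selected = np.zeros(Phi.shape[1], dtype=bool)
    residual = y.copy()
    for _ in range(n_paths):
        correlations = np.abs(Phi.conj().T @ residual)
        correlations[selected] = 0.0  # never pick a column twice
        selected[np.argmax(correlations)] = True
        # Least-squares refit on the current support, then update the residual.
        coeffs, *_ = np.linalg.lstsq(Phi[:, selected], y, rcond=None)
        residual = y - Phi[:, selected] @ coeffs
    x = np.zeros(Phi.shape[1], dtype=complex)
    x[selected] = coeffs
    return x

rng = np.random.default_rng(0)
M_h, M_v, n_grid, L, M_bar = 8, 8, 16, 3, 24  # hypothetical sizes

# UPA dictionary as a Kronecker product of two one-dimensional dictionaries.
A_D = np.kron(steering_dictionary(M_h, n_grid), steering_dictionary(M_v, n_grid))

# Sparse ground truth with L on-grid paths, and the corresponding full channel.
x_true = np.zeros(A_D.shape[1], dtype=complex)
x_true[rng.choice(A_D.shape[1], size=L, replace=False)] = (
    rng.standard_normal(L) + 1j * rng.standard_normal(L))
h_full = A_D @ x_true

# Random active elements; selecting rows of A_D equals G_LIS @ A_D.
active = np.sort(rng.choice(M_h * M_v, size=M_bar, replace=False))
Phi = A_D[active, :]
noise = 0.01 * (rng.standard_normal(M_bar) + 1j * rng.standard_normal(M_bar))

x_hat = omp(Phi, h_full[active] + noise, L)
h_rec = A_D @ x_hat
print(np.linalg.norm(h_rec - h_full) / np.linalg.norm(h_full))  # relative error
```

The printed relative error gives a quick feel for how the number of active elements and the grid resolution affect recovery; rerunning with a different `M_bar` is an easy experiment.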

B. Simulation Results and Discussion:
To evaluate the performance of the proposed compressive sensing based solution, we consider a simulation setup at two different carrier frequencies, namely 3.5GHz and 28GHz. The simulation setup consists of one large intelligent surface with a uniform planar array (UPA) in the y-z plane, which reflects the signal coming from one transmitter to another receiver, as depicted in Fig. 6.
This UPA consists of 16 × 16 antennas at 3.5GHz and 64 × 64 antennas at 28GHz. We generate the channels using the publicly-available ray-tracing based DeepMIMO dataset [20], with the 'O1' scenario that consists of a street and buildings on the sides of the street. Please refer to Section VII-A for a detailed description of the simulation setup and its parameters.
Given this setup, and adopting the novel LIS architecture in Fig. 2, we apply the proposed compressive sensing based solution described in Section V-A as follows: (i) We obtain the channel vectors $\mathbf{h}_{T,k}, \mathbf{h}_{R,k}$ using the ray-tracing based DeepMIMO dataset, and add noise with the noise parameters described in Section VII-A. (ii) Adopting the LIS architecture in [...]
[...] This is shown in Fig. 3, as the compressive sensing based solution requires a higher ratio of the LIS elements to be active to approach the upper bound in the 3.5GHz scenario, which has more scattering than the mmWave 28GHz case. Further, the compressive sensing solution does not [...]

VI. DEEP LEARNING BASED LIS INTERACTION DESIGN
[...] the system operation and the adopted deep learning model are diligently described. We refer the interested reader to [35] for a brief background on deep learning.

A. The Key Idea
The large intelligent surfaces are envisioned as key components of future networks [7].
These surfaces will interact with the incident signals, for example by reflecting them, in a way that improves the wireless communication performance. In order to decide on this interaction, however, the LIS systems or their operating base stations and access points need to acquire some knowledge about the channels between the LIS and the transmitter/receiver. As we explained in Section III, the massive number of antennas at these surfaces means that obtaining the required channel knowledge incurs either (i) prohibitive training overhead if all the LIS elements are passive or (ii) infeasible hardware complexity/power consumption in the case of fully-digital or hybrid based LIS architectures.
The channel vectors/matrices, however, are intuitively some functions of the various elements of the surrounding environment such as the geometry, scatterer materials, and the transmitter/receiver locations, among others. Unfortunately, the nature of this function, namely its dependency on the various components of the environment, makes its mathematical modeling very hard and infeasible in many cases. This dependence, though, means that the interesting role the LIS is playing could be enabled with some form of awareness about the surrounding environment. With this motivation, and adopting the proposed LIS architecture in Fig. 2, we propose to consider the sampled channels seen by the few active elements of the LIS as environment descriptors capturing a multi-path signature [21], [36], [37], as shown in Fig. 4. We then adopt deep learning models to learn the mapping function from these environment descriptors to the optimal LIS interaction (reflection beamforming) vectors. In other words, we are teaching the LIS system how to interact with the wireless signal given the knowledge of the environment descriptors (sampled channel vectors). It is worth emphasizing here that the sampled channel vectors can be obtained with negligible training overhead as explained in Section IV. In the ideal case, the learning model will be able to predict the optimal interaction vector given the environment descriptors. Achieving this means that the LIS system can approach the optimal rate in (9) with negligible training overhead (that is required to obtain the sampled channel vectors) and with low-complexity architectures (as only a few elements of the LIS are active).

B. Proposed System Operation
In this section, we describe the system operation of the proposed deep learning based LIS interaction approach. The proposed system operates in two phases, namely (I) the learning phase and (II) the prediction phase.
PHASE I: Learning phase
   LIS receives two pilots from the transmitter and receiver and estimates $\widehat{\overline{\mathbf{h}}}(s)$.
   for $n = 1$ to $|\mathcal{P}|$ do
2     LIS reflects using the $\boldsymbol{\psi}_n$ beam and receives the feedback $R_n(s)$.
   [...]
PHASE II: Prediction phase
   LIS receives two pilots from the transmitter and receiver, and estimates $\widehat{\overline{\mathbf{h}}}$.
8  LIS predicts the interaction (reflection) vector using the trained DL model.
9  LIS uses the interaction vector $\boldsymbol{\psi}_{n_{DL}}$, with $n_{DL} = \arg\max_n \widehat{R}_n$, for the data reflection.
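1) Sampled channel estimation: For every coherence block $s$, the active LIS elements estimate the transmitter-LIS sampled channel vector from the received pilot as
$$\widehat{\overline{\mathbf{h}}}_{T,k}(s) = \overline{\mathbf{h}}_{T,k}(s) + \mathbf{v}_k,$$
where $\mathbf{v}_k \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma_n^2 \mathbf{I})$ represents the receive noise vector at the LIS active channel sensors. The receiver-LIS sampled channel vector $\widehat{\overline{\mathbf{h}}}_{R,k}(s)$ will be similarly estimated. Finally, the vectors $\widehat{\overline{\mathbf{h}}}_k(s) = \widehat{\overline{\mathbf{h}}}_{T,k}(s) \odot \widehat{\overline{\mathbf{h}}}_{R,k}(s)$ and $\widehat{\overline{\mathbf{h}}}(s) = \mathrm{vec}\left(\left[\widehat{\overline{\mathbf{h}}}_1(s), \widehat{\overline{\mathbf{h}}}_2(s), \ldots, \widehat{\overline{\mathbf{h}}}_K(s)\right]\right)$ will be constructed.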
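Constructing the environment descriptor for one coherence block can be sketched as follows. This is an illustration under stated assumptions: the channels are random placeholders, the noise level and sizes are hypothetical, and the per-subcarrier vectors are stacked in subcarrier order to mimic the vec operation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, M_bar = 4, 32, 6  # hypothetical sizes: subcarriers, elements, active sensors
noise_std = 0.01        # hypothetical noise standard deviation

def complex_gaussian(shape, std):
    """Circularly symmetric complex Gaussian samples with the given standard deviation."""
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Placeholder full channels (one row per subcarrier) and random active indices.
h_T = complex_gaussian((K, M), 1.0)
h_R = complex_gaussian((K, M), 1.0)
active = np.sort(rng.choice(M, size=M_bar, replace=False))

# Noisy sampled channels seen at the active elements after one pilot each.
h_T_hat = h_T[:, active] + complex_gaussian((K, M_bar), noise_std)
h_R_hat = h_R[:, active] + complex_gaussian((K, M_bar), noise_std)

# Per-subcarrier descriptor (element-wise product), then stack over subcarriers.
h_k_hat = h_T_hat * h_R_hat
h_hat = h_k_hat.reshape(-1)

print(h_hat.shape)  # K * M_bar complex entries
```

Row-major flattening of the (K, M_bar) array places the subcarrier-1 vector first, then subcarrier 2, and so on, which matches stacking the columns of the matrix whose columns are the per-subcarrier vectors.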
2) Exhaustive search over reflection beamforming codewords (line 2): In this step, the LIS performs an exhaustive beam training using the interaction/reflection codebook $\mathcal{P}$. Particularly, the LIS attempts every candidate reflection beamforming vector, $\boldsymbol{\psi}_n$, $n = 1, ..., |\mathcal{P}|$, and receives a feedback from the receiver indicating the achievable rate using this interaction vector, $R_n(s)$, defined as
$$R_n(s) = \frac{1}{K} \sum_{k=1}^{K} \log_2\left(1 + \mathrm{SNR} \left|\left(\mathbf{h}_{T,k}(s) \odot \mathbf{h}_{R,k}(s)\right)^T \boldsymbol{\psi}_n\right|^2\right).$$
Note that, in practice, the computation and feedback of the achievable rate [...]
It is worth mentioning here that while we assume that the system will switch one time to PHASE II after the deep learning model is trained, the system will need to retrain and refine the model frequently to account for the changes in the environment.
2) Achievable rate prediction (line 8): In this step, the estimated sampled channel vector, $\widehat{\overline{\mathbf{h}}}$, is fed into the deep learning model to predict the achievable rate vector, $\widehat{\mathbf{r}}$.
3) Data transmission (line 9): In this step, the predicted deep learning reflection beamforming vector, $\boldsymbol{\psi}_{n_{DL}}$, which corresponds to the highest predicted achievable rate, is used for reflecting the transmitted data (signal). Note that instead of selecting only the interaction vector with the highest predicted achievable rate, the LIS can generally select the $k_B$ beams corresponding to the $k_B$ highest predicted achievable rates. It can then refine this set of beams online with the receiver to select the one with the highest achievable rate. We evaluate the performance gain when more than one reflection beam is selected in Section VII-E.
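The beam shortlist and online refinement described in the data transmission step can be sketched as below. The function names and the toy rate values are hypothetical; `measure_rate` stands in for the over-the-air feedback from the receiver.

```python
import numpy as np

def top_beams(predicted_rates, k_b):
    """Indices of the k_b codewords with the highest predicted rates, best first."""
    order = np.argsort(predicted_rates)[::-1]
    return order[:k_b]

def refine_online(candidates, measure_rate):
    """Try each shortlisted beam and keep the one with the best measured rate."""
    measured = np.array([measure_rate(int(n)) for n in candidates])
    return int(candidates[np.argmax(measured)])

# Toy example: the model's top prediction is not the truly best beam.
predicted = np.array([0.10, 0.90, 0.50, 0.80])
true_rates = np.array([0.20, 0.60, 0.10, 0.95])

shortlist = top_beams(predicted, k_b=2)
chosen = refine_online(shortlist, lambda n: true_rates[n])
print(shortlist, chosen)
```

In this toy case the shortlist of two beams contains the truly best beam even though the top-1 prediction misses it, which is the situation the refinement step is meant to handle; the extra cost is only $k_B$ over-the-air measurements instead of the full codebook.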

C. Deep Learning Model
Recent years have proven deep learning to be one of the most successful machine learning paradigms [38]. With this motivation, a deep neural network is chosen in this work to be the model with which the desired LIS interaction function is learned. In the following, the elements of this model are described.
Input Representation: A single input to the neural network model is defined as a stack of environment descriptors (sampled channel vectors $\widehat{\overline{\mathbf{h}}}_k$) obtained from a pair of transmitter and receiver at $K$ different sub-carrier frequencies. This sets the dimensionality of a single input vector to $K\overline{M}$. A common practice in machine learning is the normalization of the input data. This guarantees a stable and meaningful learning process [39]. The normalization method of choice here is a simple per-dataset scaling; all samples are multiplied by a factor that is the inverse of the maximum absolute value over the whole input data,
$$\widehat{\overline{\mathbf{h}}}_{\mathrm{norm}} = \frac{\widehat{\overline{\mathbf{h}}}}{\Delta},$$
where $\Delta$ is given by
$$\Delta = \max_{s,\, b} \left|\widehat{\overline{h}}_b(s)\right|,$$
and $\widehat{\overline{h}}_b(s)$ is the $b$th complex entry of $\widehat{\overline{\mathbf{h}}}(s)$. Besides helping the learning process, this normalization choice preserves distance information encoded in the environment descriptors. This way the model learns to become more aware of the surroundings, which is the bedrock for proposing a machine-learning-powered LIS.
The last pre-processing step of input data is to convert them into real-valued vectors without losing the imaginary-part information. This is done by splitting each complex entry into real and imaginary values, doubling the dimensionality of each input vector. The main reason behind this step is the modern implementations of DL models, which mainly use real-valued computations.
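The pre-processing described above (per-dataset scaling followed by the complex-to-real conversion) can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names `dataset_scale` and `to_real_input` are hypothetical, the toy values are made up, and stacking all real parts before all imaginary parts is an assumed ordering, since the text only says that each complex entry is split into its real and imaginary values.

```python
from typing import List, Sequence


def dataset_scale(samples: Sequence[Sequence[complex]]) -> float:
    """Largest absolute value of any complex entry over the whole dataset."""
    delta = max(abs(entry) for sample in samples for entry in sample)
    if delta == 0:
        raise ValueError("dataset contains only zeros")
    return delta


def to_real_input(sample: Sequence[complex], delta: float) -> List[float]:
    """Scale one sample by 1/delta, then stack real parts and imaginary parts."""
    scaled = [entry / delta for entry in sample]
    # Assumed ordering: all real parts first, then all imaginary parts.
    return [z.real for z in scaled] + [z.imag for z in scaled]


# Toy dataset: two samples, each with KM = 2 complex entries (hypothetical values).
dataset = [[3 + 4j, 1 - 1j], [0.5j, -2 + 0j]]
delta = dataset_scale(dataset)                      # largest magnitude is |3 + 4j|
inputs = [to_real_input(h, delta) for h in dataset]
```

Because a single factor is applied to the whole dataset, the relative magnitudes between samples are unchanged, which is the distance-preserving property mentioned in the text; each real-valued input has length 2KM.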
Target Representation: The learning approach used in this work is supervised learning. This means the model is trained with input data that are accompanied by their so-called target responses [35].
Linear Units (ReLUs) [35]. Each unit operates on a single input value outputting another single
Training Loss Function: The model training process aims at minimizing a loss function that measures the quality of the model predictions. Given the objective of predicting the best reflection beam vector, ψ n DL , having the highest achievable rate estimate, max n R n , the model is trained using a regression loss function. At every coherence block, the neural network is trained to make its output, the predicted rate vector r̂, as close as possible to the desired output, the vector of normalized achievable rates r̄. Specifically, the training is guided by minimizing the loss function
L(θ) = MSE(r̄, r̂),
where θ represents the set of all the neural network parameters and MSE(r̄, r̂) denotes the mean-squared error between r̄ and r̂.
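As a small worked example of the regression loss described above, the following sketch computes the mean-squared error between a target rate vector and a predicted rate vector, and then picks the codeword with the highest predicted rate. The function name `mse_loss` and the three-entry vectors are hypothetical; an actual training loop would minimize this quantity over the network parameters, which is not shown here.

```python
from typing import Sequence


def mse_loss(target: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean-squared error between the target and predicted rate vectors."""
    if len(target) != len(predicted) or len(target) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(target, predicted)) / len(target)


# Hypothetical normalized target rates and model outputs for a 3-codeword codebook.
target_rates = [0.2, 1.0, 0.6]
predicted_rates = [0.1, 0.8, 0.7]

loss = mse_loss(target_rates, predicted_rates)
# The beam used for reflection is the one with the highest predicted rate.
best_beam = max(range(len(predicted_rates)), key=lambda n: predicted_rates[n])
```

Note that the loss penalizes errors on every codeword, while the final decision only depends on which entry of the predicted vector is largest, so a small nonzero loss can still yield the correct beam.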

VII. SIMULATION RESULTS
In this section, we evaluate the performance of both the deep learning (DL) and the compressive sensing (CS) based reflection beamforming approaches. The flow of this section is as follows. First, we describe the adopted experimental setup and datasets. Then, we compare the performance of the deep learning and compressive sensing solutions at both mmWave and sub-6 GHz bands. After that, we investigate the impact of different system and machine learning parameters on the performance of the deep learning solution.
Fig. 6. This figure illustrates the adopted ray-tracing scenario where an LIS is reflecting the signal received from one fixed transmitter to a receiver. The receiver is selected from an x-y grid of candidate locations. This ray-tracing scenario is generated using Remcom Wireless InSite [41], and is publicly available on the DeepMIMO dataset [20].

A. Simulation Setup
Given the geometric channel model adopted in Section II and the nature of the reflection beamforming optimization problem, with its strong dependence on the environmental geometry, it is critical to evaluate the performance of the proposed solutions based on realistic channels. This motivates using channels generated by ray-tracing to capture the dependence on the key environmental factors such as the environment geometry and materials, the LIS and transmitter/receiver locations, and the operating frequency. To do that, we adopted the DeepMIMO dataset, described in detail in [20], to generate the channels based on the outdoor ray-tracing scenario 'O1' [41], as will be discussed shortly. DeepMIMO is a parameterized dataset published for deep learning applications in mmWave and massive MIMO systems. The machine learning simulations were executed using the Deep Learning Toolbox of MATLAB R2019a. Next, we explain in detail the key components of the simulation setup.
System model: We adopt the system model described in Section II-A with one large intelligent surface that reflects the signal received from a transmitter to a receiver. The transmitter is assumed to be fixed while the receiver can take any random position in a specified x-y grid, as illustrated in Fig. 6. We implemented this setup using the outdoor ray-tracing scenario 'O1' of the DeepMIMO dataset, which is publicly available at [20]. As shown in Fig. 6, we select BS 3 in the 'O1' scenario to be the LIS and the user in row R850 and column 90 to be the fixed transmitter. The uniform x-y grid of candidate receiver locations includes 54300 points from row R1000 to R1300 in Table I.
Channel generation: The channels between the LIS and the transmitter/receiver, h T,k , h R,k , for all the candidate receiver locations in the x-y grid, are constructed using the DeepMIMO dataset generation code [20] with the parameters in Table I. With these channels, and given the randomly selected active elements in the proposed LIS architecture, we construct the sampled channel vectors h T,k , h R,k . The noisy sampled channel vectors h T,k , h R,k are then generated by adding noise vectors to h T,k , h R,k according to (21), with the noise power calculated based on the bandwidth and other parameters in Table I
To evaluate the performance in sub-6 GHz systems, we plot the achievable rates of the proposed deep learning and compressive sensing solutions compared to the optimal rate R in Fig. 8. This figure adopts the simulation setup in Section VII-A at a 3.5 GHz band. The LIS is assumed to employ a UPA with 16 × 16 antennas and each channel incorporates the strongest L = 15 paths. Fig. 8 shows that the proposed deep learning and compressive sensing solutions are also to the compressive sensing approach in the sub-6 GHz systems, where the channels are less sparse than in mmWave systems. This gain, however, comes at the cost of collecting a dataset to train the deep learning model, which is not required in the compressive sensing approach.

C. How much training is needed for the deep learning model?
The data samples in the deep learning dataset are captured when the receiver is randomly sampling the x-y grid. In Fig. 9, we study the performance of the developed deep learning approach for designing the LIS interaction vectors for different dataset sizes. This illustrates the improvement in the prediction quality of the machine learning model as it sees more data samples. For Fig. 9, we adopt the simulation setup in Section VII-A with an LIS of 64 × 64 UPA and a number of active channel sensors M = 2, 4, and 8. The setup considers a mmWave 28 GHz scenario and the channels are constructed with only the strongest path, i.e., L = 1. Fig. 9 shows that with only

D. Impact of Important System and Channel Parameters
In this subsection, we evaluate the impact of the key system and channel parameters on the performance of the developed deep learning solution.
Number of LIS antennas: Fig. 10 examines the achievable rate performance of the developed solutions for designing the LIS interaction vectors when the LIS employs either a 32 × 32 or a 64 × 64 UPA. This figure adopts the same mmWave scenario considered in Fig. 9. As illustrated,
Transmit power: In Fig. 11, we study the impact of the transmit power (and receive SNR) on the achievable rate performance of the developed deep learning solution. This is important in order to evaluate the robustness of the learning and prediction quality, as we input the noisy sampled channel vectors to the deep learning model. In Fig. 11, we plot the achievable rates of the proposed deep learning solution as well as the upper bound in (9) for three values of the transmit power, P T = −5, 0, 5 dBW. These transmit powers map to receive SNR values of −3.8, 6.2, 16.2 dB, respectively, including the LIS beamforming gain of the 4096 antennas. The rest of the setup parameters are the same as those adopted in Fig. 9. Fig. 11 illustrates that the proposed deep learning solution can perform well even with relatively small transmit powers and low SNR regimes.
Number of channel paths: In Fig. 12, we investigate the impact of the number of channel paths on the performance of the developed deep learning solution. In other words, we examine the robustness of the proposed deep learning model with multi-path channels. For this figure, we adopt the same simulation setup of Fig. 9 with an LIS employing a 64 × 64 UPA. The channels are constructed considering the strongest L = 1, 2, or 5 channel paths. As illustrated in Fig. 12, as the number of channel paths increases, the rate achieved by the proposed deep learning solution converges more slowly to the upper bound. This shows that the proposed deep learning model can learn from multi-path channels if a large enough dataset is available.

E. Refining the deep learning prediction
In Fig. 7-Fig. 12, we considered the proposed deep learning solution where the deep learning model uses the sampled channel vectors to predict the best beam and this beam is directly used to reflect the transmitted data. Relying completely on the deep learning model to determine the reflection beamforming vector has the clear advantage of eliminating the beam training overhead and enabling highly-mobile applications. The achievable rates using this approach, however, may be sensitive to small changes in the environment. A candidate approach for enhancing the reliability of the system is to use the machine learning model to predict the most promising k B beams. These beams are then refined through beam training with the receiver to select the final beam reflection vector. Note that the most promising k B beams refer to the k B beams with the highest predicted rates from the deep learning model. To study the performance using this approach, we plot the achievable rate of the deep learning solution in Fig. 13, for different values of k B . As this figure shows, refining the most promising k B beams yields higher achievable rates compared to the case when the LIS relies completely on the deep learning model to predict the best beam, i.e., with k B = 1. The gain in Fig. 13 is expected to increase in more time-varying and dynamic environments, which is an interesting extension for future work.
As the number of channel paths increases, the rate achieved by the proposed DL solution converges more slowly to the upper bound. Hence, using more training data can help learn multi-path signatures.

VIII. CONCLUSION
In this paper, we considered LIS-assisted wireless communication systems and developed efficient solutions that design the LIS interaction (reflection) matrices with negligible training overhead. More specifically, we first introduced a novel LIS architecture where only a small number of the LIS elements are active (connected to the baseband). Then, we developed two solutions that design the LIS reflection matrices for this new architecture with almost no training overhead. The first solution leverages compressive sensing tools to construct the channels at all the antenna elements from the sampled channels seen only at the active elements. The second approach exploits deep learning tools to learn how to predict the optimal LIS reflection matrices directly from the sampled channel knowledge, which represents what we call environment descriptors. Extensive simulation results based on accurate ray-tracing showed that the two proposed solutions can achieve near-optimal data rates with negligible training overhead and with a small number of active elements. Compared to the compressive sensing solution, the deep learning approach requires a smaller number of active elements to approach the optimal rate, since it leverages its prior observations. Further, the deep learning approach does not require any knowledge of the LIS array geometry and does not assume sparse channels. To achieve these gains, however, the deep learning model needs a sufficiently large training dataset to be collected, which is not needed in the compressive sensing solution. For future work, it is interesting to investigate other machine learning models, such as reinforcement learning, that do not require an initial dataset collection phase. For the compressive sensing solution, there are several interesting extensions, including the optimization of the sparse distribution of the active sensors leveraging tools from nested and co-prime arrays.