Learning Indoor Environment for Effective LiFi Communications: Signal Detection and Resource Allocation

Seamless light fidelity (LiFi) communication in a realistic indoor environment faces a number of important challenges, including the loss of the line-of-sight (LOS) link due to the random orientation of mobile devices or blockage by users or furniture in the room. These effects make the frequency response of the LiFi channel highly environment-dependent. The dynamic nature of the channel depends not only on the geometrical configuration of the room but also on the user behavior. In this paper, we investigate the hypothesis that deep learning (DL) based LiFi communication techniques can effectively learn the distinct features of the indoor environment and user behavior to provide superior performance compared to conventional channel estimation techniques, particularly when access to real-time channel state information (CSI) is restricted. In particular, we apply DL methods to two different problems, namely signal detection and resource allocation for orthogonal frequency division multiplexing (OFDM) LiFi systems. To test this hypothesis, a realistic LiFi channel impaired by random device orientation and blockage is simulated considering different geometrical configurations and user behavior in an indoor environment, e.g., the presence or absence of furniture and a varying user distribution defined by a conditional hotspot model. The simulation results confirm that DL-based LiFi systems with partial CSI perform close to the optimal signal detection and resource allocation with perfect CSI. Moreover, the DL-based techniques demonstrate substantial gains over conventional benchmark schemes that employ channel estimation algorithms such as least squares (LS) or minimum mean square error (MMSE), and this gain increases in a more realistic or complex indoor environment, e.g., a room with furniture or a hotspot scenario.


I. INTRODUCTION
The increasing demand for wireless data has motivated both academia and industry to find alternative technologies that are capable of providing efficient, reliable and high-speed wireless data service. One of the currently rising technologies, namely optical wireless communication (OWC), has become a complementary solution alongside the radio frequency (RF) based standards [1]. Visible light communication (VLC) and light fidelity (LiFi) are among the most attractive and important representatives of future indoor OWC systems. They rely on the transmission of an encoded optical signal using off-the-shelf light emitting diodes (LEDs) as the optical sources and photodiodes (PDs) as detectors [2]. VLC and LiFi offer a number of advantages over RF systems, including massive and unregulated bandwidth, high potential data rates, high energy efficiency and better security [3]. Furthermore, orthogonal frequency division multiplexing (OFDM), which is a common modulation technique in RF, has also been implemented in LiFi systems for its high spectral efficiency. Since the transmitted signals in LiFi must be real and non-negative, DC-biased optical OFDM (DCO-OFDM) was introduced to satisfy the requirements of intensity modulation/direct detection (IM/DD) systems. In OFDM systems, pilot-aided channel estimation techniques such as least squares (LS) and minimum mean square error (MMSE) are typically used to estimate the channel and improve the recovery of the transmitted signal. Moreover, the frequency spectrum in OFDM systems can be divided into multiple subcarriers that can be shared among users. In order to ensure fairness between users, a resource allocation technique known as proportional fair (PF) scheduling is widely used [4]. It provides efficient bandwidth allocation to users but requires perfect channel knowledge, which is challenging to obtain in practical systems.
VOLUME 4, 2016

A. MOTIVATION
In an indoor environment, information about the channel condition (e.g., channel gain) can be exploited to improve communication metrics such as the signal-to-noise ratio (SNR), bit-error ratio (BER) and user throughput [5]. For simplicity, most of the existing works on LiFi systems assume general geometries and a deterministic channel to model an indoor VLC system [6]-[8]. However, in a practical scenario, there are many contributing factors, such as the specific geometrical configuration of the network and user behavior effects, which are strong enough to influence the system's performance and therefore cannot be ignored during the communication system design. The OWC channel depends strongly on the behavior of light (e.g., line of sight (LOS), non-line of sight (NLOS), link blockage), the geometry of the indoor environment (e.g., furniture) and the behavior of the user (e.g., random orientation of the UE and hotspots). These effects make the performance of LiFi communication and networking schemes such as OFDM signal detection and resource allocation highly dependent on the specific features of the indoor environment. Hence, it is worth investigating whether the design of LiFi systems can be adapted to the underlying indoor environment using learning techniques.

B. RELATED WORK
To optimize the performance of a LiFi OFDM system, accurate channel state information (CSI) is required. CSI can be obtained using traditional channel estimation techniques such as LS and MMSE, but with certain estimation errors. Nevertheless, these pilot-based channel estimation approaches are commonly used and work well for linear systems. A study in [9] compared the performance of the two methods for channel estimation in OFDM systems and showed that LS is simple to implement but gives insufficient performance because it does not use any prior channel statistics. Meanwhile, MMSE offers better channel estimation as it takes the influence of noise into account, but it has higher computational complexity due to the inversion of the channel matrix. Several works on channel estimation in OWC systems (e.g., [10]) have used a generalized channel model (e.g., based on Gaussian channel gain) that does not consider user behavior and the geometry of the indoor environment. A deep neural network (DNN) based channel estimation method was proposed in [11] that can increase the spectral efficiency of a point-to-point VLC system. In contrast to our paper, the authors in [11] considered only the LOS channel and ignored the effect of blockage and user behavior. Channel estimation for indoor VLC systems assuming Gaussian channel gains has been discussed in [12] and [13], taking into consideration the types of reflector materials and light sources. Note that the existing works focused on a uniform scenario of indoor LiFi systems in which the effects of NLOS, link blockage and user behavior were not considered.
Machine learning schemes such as deep learning (DL) can be used as an alternative approach to model a realistic LiFi channel. In contrast to the traditional approaches, DL offers a simple estimation process that can be performed indirectly in real time by viewing the channel as a black box. Useful channel information (e.g., channel gain, SNR, BER) which may not be easily or directly measured using conventional methods can be learned by the DL model from the environment and the user behavioral data, so that the specific underlying geometry is taken into account. Over the years, DL has achieved great success in designing different aspects of wireless communication systems, such as modulation, positioning and resource allocation. In [14], an automatic modulation recognition framework for the detection of radio signals was proposed using a convolutional neural network (CNN) and long short-term memory (LSTM). The authors in [15] proposed an iterative point-wise reinforcement learning scheme for highly accurate indoor visible light positioning. In [9], the authors proposed a DNN based channel estimation technique for learning the wireless channel in an OFDM system. The proposed technique was shown to be more robust than the traditional methods (e.g., LS and MMSE), especially when fewer pilots were used, the cyclic prefix was removed and nonlinear clipping noise was present. A reinforcement learning (RL) technique based on meta-learning was proposed in [16] for access point selection and user association in THz/VLC wireless virtual reality (VR) networks; it can accurately locate VR users using VLC and build THz links to transmit high-quality VR images while avoiding blockages caused by other users. A DL-based detection algorithm for molecular communication systems was developed in [17], where the model was trained in the absence of channel knowledge. Moreover, an LSTM-aided system was proposed in [18] to predict human mobility and improve the accuracy of handover management. It can be envisioned that DL offers a very promising solution to address the various channel characteristics and can therefore be applied in LiFi systems.
In a multiuser LiFi system, resource allocation and scheduling become a vital issue, as appropriate scheduling schemes are needed to ensure that all of the available resources are allocated fairly to the users. It is important that these resources are allocated taking the user behavior and the specific geometry of the room into consideration. PF scheduling has been considered an attractive bandwidth allocation criterion in wireless networks, capable of supporting high resource utilization while maintaining good fairness among network flows. Typically, it requires full channel knowledge, which is difficult to obtain in practice. The existing applications of PF have shown that it ensures a level of fairness to users [19], [20]; however, they do not account for the NLOS channel, user behavior and the specific geometry of the network. As the frequency response of the LiFi channel can be highly environment-dependent, it is important to consider these effects to ensure priority is given to the user with the poorest channel. DL has found success in solving resource allocation problems in LiFi systems under specific scenarios [21]-[23]. However, the literature mostly assumes generalized channel models and neglects the effect of furniture and user behavior.

C. MAIN CONTRIBUTIONS
In order to incorporate the distinct features of a realistic indoor environment in the design of LiFi systems, DL techniques are implemented in this paper. We show that DL can be very effective in learning specific characteristics of indoor LiFi channels as well as the corresponding non-random user behavior imposed by the specific indoor environment (e.g., based on the furniture configuration), outperforming the traditional channel estimation techniques. This paper is an extended version of our previous work [24].
To the best of our knowledge, this is the first time that DL-based communication schemes are investigated in the presence of a realistic LiFi channel in a typical room with furniture and different user behavior, while considering important effects such as the random orientation of the user device, blockage of LiFi links due to furniture and self-blockage by the users, and an infinite order of the NLOS channel components. We design two deep learning-based schemes for signal detection and resource allocation for LiFi communication in a realistic indoor environment, based on LSTM and feed-forward neural networks, respectively. We train the networks considering specific geometrical configurations of the indoor environment and conditional hotspot models based on varying user distributions to show the effectiveness of our schemes in adapting to a particular scenario compared to the conventional benchmark methods. We show that when partial CSI is available, these schemes are able to indirectly estimate the underlying channel characteristics and improve the performance of signal detection and user scheduling. In practical scenarios with limited instantaneous CSI, our method can perform close to the optimal performance obtained with full channel knowledge.
The remainder of this paper is organized as follows. Section II describes the system model. The deep learning implementation in LiFi is introduced in Section III. The DL-based signal detection and resource allocation schemes, together with their performance evaluation and discussions, are presented in Sections IV and V, respectively. Finally, the conclusions are drawn in Section VI.

II. SYSTEM MODEL
A. INDOOR LIFI CHANNEL
A typical single input/single output (SISO) configuration of a LiFi system consists of one access point (AP) and one photodiode (PD), which transmit and detect the signal, respectively. An LED is used as the optical source and a PD as the detector. Due to the incoherent characteristics of the LED, the most practical modulation and down-conversion technique for LiFi is intensity modulation/direct detection, where the signal is modulated onto the power of the carrier. At the AP, the transmitted optical signal, x(t), passes through the channel with channel impulse response (CIR) h(t) and produces the current, y(t), at the PD, described as:
$$y(t) = R\,x(t) \otimes h(t) + n(t), \tag{1}$$
where R is the PD responsivity, n(t) is the signal-independent noise at the receiver and "⊗" is the convolution operator. The LiFi system comprises an LED transmitter located at the ceiling, facing vertically downwards, and a PD receiver mounted on a user equipment (UE), initially assumed to be oriented vertically upwards. The LiFi channel relies upon the existence of LOS and NLOS components, where LOS refers to the case in which the link established between the AP and the UE is direct and uninterrupted. Meanwhile, the NLOS link relies upon the reflection of light from reflecting surfaces (e.g., walls and furniture) in the environment. It should be noted that an infinite order of reflections is necessary to achieve the most accurate representation of the NLOS channel. The optical channel in an indoor LiFi system can be described as:
$$H = H_{\mathrm{LOS}} + H_{\mathrm{NLOS}}, \tag{2}$$
where $H_{\mathrm{LOS}}$ and $H_{\mathrm{NLOS}}$ are the frequency responses of the LOS and NLOS links, respectively. The direct-current (DC) gain of the LOS path between the AP and the UE is given by:
$$H_{\mathrm{LOS}} = \frac{(m+1)A}{2\pi d^{2}} \cos^{m}(\phi)\,\cos(\psi)\,\mathrm{rect}\!\left(\frac{\psi}{\Psi}\right)(1 - v), \tag{3}$$
where A, ϕ and ψ are the detector area, transmitter radiance angle and receiver incidence angle, respectively. The Lambertian emission order is given as $m = -1/\log_{2}(\cos \Phi_{1/2})$, where $\Phi_{1/2}$ is the half-intensity angle, while d denotes the distance between the AP and the UE. The blockage status, denoted as v, is 1 if the LOS link is blocked and 0 otherwise. Furthermore, $\mathrm{rect}(\psi/\Psi) = 1$ for $0 \le \psi \le \Psi$ and 0 otherwise, where Ψ is the field of view (FOV).
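As a minimal sketch of the LOS gain described above, the following Python snippet evaluates the Lambertian order and the LOS DC gain. It assumes the standard Lambertian form $\frac{(m+1)A}{2\pi d^2}\cos^m(\phi)\cos(\psi)$ gated by the FOV and by a factor $(1 - v)$ for blockage; the function names and all numerical values (detector area, distance, angles) are illustrative assumptions, not parameters from this paper.

```python
import math

def lambertian_order(half_angle_deg):
    """Lambertian emission order m = -1 / log2(cos(Phi_1/2))."""
    return -1.0 / math.log2(math.cos(math.radians(half_angle_deg)))

def los_dc_gain(area, d, phi, psi, fov, m, blocked=False):
    """LOS DC gain for angles in radians; 0 if blocked or outside the FOV."""
    v = 1 if blocked else 0                      # blockage status
    rect = 1 if 0.0 <= psi <= fov else 0         # FOV gate rect(psi / Psi)
    return ((m + 1) * area / (2 * math.pi * d ** 2)
            * math.cos(phi) ** m * math.cos(psi) * rect * (1 - v))

# Illustrative values only (assumptions, not taken from the paper).
m = lambertian_order(60.0)          # a 60 degree half-intensity angle gives m = 1
gain = los_dc_gain(area=1e-4, d=2.0, phi=0.0, psi=0.0,
                   fov=math.radians(90.0), m=m)
blocked_gain = los_dc_gain(area=1e-4, d=2.0, phi=0.0, psi=0.0,
                           fov=math.radians(90.0), m=m, blocked=True)
```

With the UE directly below the AP and facing upwards, both cosine terms equal one, so the gain reduces to $(m+1)A/(2\pi d^2)$; setting the blockage flag removes the LOS contribution entirely.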
For the NLOS links, a high reflection order ensures accurate values of the diffuse channel components. The method described in [25] is used to consider an infinite order of reflections by calculating the channel gain in the frequency domain instead of the time domain. To this end, the environment is segmented into multiple small surface elements which act as reflectors. Thus, the NLOS channel gain, which includes an infinite order of reflections, can be expressed as:
$$H_{\mathrm{NLOS}} = \mathbf{r}^{T}(f)\,\mathbf{G}_{\rho}\left(\mathbf{I} - \mathbf{H}_{e}(f)\,\mathbf{G}_{\rho}\right)^{-1}\mathbf{t}(f), \tag{4}$$
where the transmitter transfer function vector is denoted as:
$$\mathbf{t}(f) = \left[H_{1,\mathrm{Tx}}(f), \ldots, H_{N,\mathrm{Tx}}(f)\right]^{T}. \tag{5}$$
In (5), $H_{k,\mathrm{Tx}}(f)$ is the channel gain from the transmitter to the kth reflector, while N denotes the total number of reflecting elements in the room. The frequency-dependent transfer matrix, $\mathbf{H}_{e}(f)$, of size N × N describes the LOS transfer function between the kth and the ith reflectors, which act as the transmitter and receiver surface elements, respectively. The receiver transfer function vector is expressed as:
$$\mathbf{r}(f) = \left[H_{\mathrm{Rx},1}(f), \ldots, H_{\mathrm{Rx},N}(f)\right]^{T}, \tag{6}$$
where $H_{\mathrm{Rx},i}(f)$ is the channel gain between the ith reflector and the receiver. $\mathbf{G}_{\rho} = \mathrm{diag}(\rho_{1}, \ldots, \rho_{N})$ is the reflectivity matrix of the reflecting elements, $\rho_{i}$ is the reflection coefficient of the ith reflector and $\mathbf{I}$ is the identity matrix of size N × N. In our analysis, we consider stochastic geometry to model the mobile nature of the users, while we use the random orientation model proposed in [26] to consider the effect of receiver tilting as well as other realistic channel characteristics. We assume that the indoor mobile network is quasi-static, where the user can take different locations and random orientation angles. Hence, each realization comprises a snapshot of the network. Based on [26] and [27], it has been reported that the coherence time of the LiFi channel is in the order of a few hundreds of milliseconds (i.e., 130 ms), which justifies the assumption of a quasi-static channel considering the typical data rates of LiFi systems. Mobility management issues such as handover are beyond the scope of this paper and will be considered in future work. Moreover, this paper mainly focuses on a single cell due to the small room dimensions. We note that for multiple cells, the user in a particular cell would be connected to one of the APs, while the signals from the remaining APs should be considered as interference. Hence, an additional scenario in which multiple LEDs are placed on the ceiling and the effect of interference from neighboring APs is considered is included, to show that the average effect of interference is not significant for the overall performance and the generality of the DL-based methods.
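The infinite-order reflection model above can be illustrated with a small sketch. It assumes the frequency-domain form $\mathbf{r}^{T}\mathbf{G}_{\rho}(\mathbf{I} - \mathbf{H}_{e}\mathbf{G}_{\rho})^{-1}\mathbf{t}$ at a single frequency and evaluates it as the equivalent sum over reflection orders, $\sum_{n \ge 0} \mathbf{r}^{T}\mathbf{G}_{\rho}(\mathbf{H}_{e}\mathbf{G}_{\rho})^{n}\mathbf{t}$, which converges when the reflectivities are below one and the inter-element gains are small. The helper names and the two-element toy room are hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for nested lists (entries may be complex)."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def nlos_gain(r, t, H_e, rho, tol=1e-12, max_order=10_000):
    """Sum r^T G_rho (H_e G_rho)^n t over reflection orders n = 0, 1, 2, ...

    When the series converges it equals r^T G_rho (I - H_e G_rho)^{-1} t.
    """
    incident = list(t)        # light reaching each element directly from the Tx
    total = 0j
    for _ in range(max_order):
        reflected = [rho[k] * incident[k] for k in range(len(incident))]  # G_rho
        total += sum(r[k] * reflected[k] for k in range(len(r)))          # to Rx
        incident = matvec(H_e, reflected)        # light after one more bounce
        if max(abs(x) for x in incident) < tol:
            break
    return total

# Toy two-element "room" with made-up gains (assumptions for illustration).
H_e = [[0.0, 0.2], [0.2, 0.0]]
rho = [0.5, 0.5]
h_nlos = nlos_gain(r=[1.0, 1.0], t=[1.0, 1.0], H_e=H_e, rho=rho)
```

In a full simulation the entries of the vectors and of the matrix would be complex and frequency dependent, and a direct linear solve would usually replace the explicit series; the series form is used here only because it makes the contribution of each reflection order visible.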

B. LINK BLOCKAGE AND USER BEHAVIOR MODEL
Link blockage is another important factor that may significantly affect the channel. Due to the nature of light, the link established between the AP and the UE may be blocked if there is an opaque barrier within the room, which in this case could be a human body or a piece of furniture, as shown in Figure 1. In this study, we implement the realistic LiFi channel simulator proposed by the author in [28], which considers blockage due to humans and furniture. We consider self-blockage (i.e., when the user who is using the UE blocks the link between the AP and the UE) and blockage due to furniture (e.g., cabinets, desks or chairs blocking the LOS link between the AP and the UE). We model the human body and the furniture in the room as rectangular prisms of length $L_b$, width $W_b$ and height $H_b$. The direction and location of the self-blocking user are obtained based on the direction and location of the UE. It is assumed that the user keeps the UE at a distance $d_p$ of 0.3 m away from themselves. We also assume that when the LOS link is blocked, the user is served using the NLOS channels. To determine the blockage status, we first denote the location of the source as $P_a = (x_a, y_a, z_a)$ and the location of the receiver as $P_u = (x_u, y_u, z_u)$. The user facing direction and the distance between the user and the UE are denoted as Ω and $d_p$, respectively. The center location of the blocker is denoted as $B_{\mathrm{loc}} = (x_b, y_b)$ and is calculated from the UE location, the distance $d_p$ and the direction of the blocker, $B_{\mathrm{direc}} = [\sin(\Omega), \cos(\Omega)]$. The height of the blocker is defined as $H_b = H_i + H_{\mathrm{UE}}$, where $H_{\mathrm{UE}}$ is the height of the receiver. $H_i$ depends on whether the user is standing or sitting, where $H_i = 1$ if the user is standing and $H_i = 0.3$ if the user is sitting. The edge planes of the blocker can be calculated based on the corner points of the blocker, which are functions of $B_{\mathrm{loc}}$, $B_{\mathrm{direc}}$, $L_b$, $W_b$ and $H_b$. Then, the LOS line between the source and the receiver is generated by obtaining its endpoints and its slope. Now that the blocker and the LOS line have been modelled, we can obtain the blockage status, v, of the user by determining whether the LOS line intersects the planes of the blocker. This can be done using the Cyrus-Beck algorithm [29], a well-known algorithm for clipping a line against a convex polyhedron. Since the blocker is modelled as a rectangular prism, there are six faces in total that need to be checked. For blockage due to furniture, the same algorithm can be applied by replacing $H_b$, $B_{\mathrm{direc}}$ and $B_{\mathrm{loc}}$ with the height, direction and location of the furniture, respectively. Figure 1 shows the geometry of an indoor LiFi system with two UEs, one of which is tilted, and an obstacle as an example to visualize the LOS link and the blocked link. In this study, we consider a 3D indoor environment as depicted in Figure 2, which represents the room configuration with different types of furniture (e.g., tables, chairs and cabinets) and the placement of the APs when we consider the multi-LED scenario in the later sections.
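The blockage test described above can be sketched as follows. This is a simplified illustration of Cyrus-Beck clipping, assuming the blocker is an axis-aligned box (the blocker in the text is oriented along the user direction, so a full implementation would first rotate the LOS segment into the blocker frame). The function name and the coordinates are hypothetical.

```python
def segment_hits_box(p0, p1, box_min, box_max):
    """Cyrus-Beck clipping of the segment p0 -> p1 against an axis-aligned box.

    Returns True if any part of the segment lies inside the box.
    """
    d = [b - a for a, b in zip(p0, p1)]          # direction of the LOS line
    t_enter, t_leave = 0.0, 1.0
    for axis in range(3):
        # Two faces per axis: outward normal is sign * e_axis.
        for sign, face in ((-1.0, box_min[axis]), (1.0, box_max[axis])):
            denom = sign * d[axis]               # n . D
            numer = sign * (p0[axis] - face)     # n . (P0 - point on face)
            if denom == 0.0:
                if numer > 0.0:                  # parallel to and outside this face
                    return False
                continue
            t = -numer / denom
            if denom < 0.0:
                t_enter = max(t_enter, t)        # segment enters the half-space
            else:
                t_leave = min(t_leave, t)        # segment leaves the half-space
            if t_enter > t_leave:
                return False
    return True

# Illustrative geometry (assumed values): AP on the ceiling, UE held at 1 m.
ap, ue = (0.0, 0.0, 3.0), (2.0, 0.0, 1.0)
v_standing = int(segment_hits_box(ap, ue, (1.5, -0.2, 0.0), (1.8, 0.2, 1.75)))
v_sitting = int(segment_hits_box(ap, ue, (1.5, -0.2, 0.0), (1.8, 0.2, 1.1)))
```

In this toy geometry the LOS line passes over the lower (sitting) blocker but intersects the taller (standing) one, so the blockage status changes with the blocker height alone.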
In LiFi, the channel gain depends strongly on the random orientation of the UE. Hence, it is important to consider this factor to obtain a channel model that is as accurate and realistic as possible. In [26], the authors proposed a novel and experimentally validated model for the random orientation of mobile devices, which we adopt in modelling the indoor channel of our system. The user behavior model is depicted in Figure 1. Based on this model, the cosine of the incidence angle in (3) is derived in [26] as a function of the AP and UE locations and the UE orientation angles. Here, θ is the elevation angle between the positive direction of the Z-axis and the UE normal vector. It was shown in [26] that the probability density functions (PDFs) of θ and Ω follow a Laplace and a uniform distribution, respectively. Then, we can directly include cos(ψ) in the channel gain analysis.
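To make the role of the orientation angles concrete, the sketch below computes cos(ψ) as the inner product between the UE normal vector and the unit vector pointing from the UE to the AP. It assumes the UE normal is $[\sin\theta\cos\omega, \sin\theta\sin\omega, \cos\theta]^{T}$, with θ measured from the Z-axis as in the text and ω the azimuth of the normal (its exact relation to the user direction Ω follows [26] and is not reproduced here). The Laplace parameters are placeholders, not the fitted values of [26].

```python
import math
import random

def cos_incidence(ap, ue, theta, omega):
    """cos(psi) = n_u . (P_a - P_u) / d for a UE normal with polar angle theta."""
    dx, dy, dz = (a - u for a, u in zip(ap, ue))
    d = math.sqrt(dx * dx + dy * dy + dz * dz)
    normal = (math.sin(theta) * math.cos(omega),
              math.sin(theta) * math.sin(omega),
              math.cos(theta))
    return (dx * normal[0] + dy * normal[1] + dz * normal[2]) / d

def sample_orientation(rng, theta_mean, theta_scale):
    """Draw theta ~ Laplace(theta_mean, theta_scale) and omega ~ U[0, 2*pi)."""
    # The difference of two unit exponentials is a standard Laplace variate.
    theta = theta_mean + theta_scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    omega = rng.uniform(0.0, 2.0 * math.pi)
    return theta, omega

rng = random.Random(0)
theta, omega = sample_orientation(rng, math.radians(40.0), math.radians(5.0))
c_random = cos_incidence((0.0, 0.0, 3.0), (1.0, 1.0, 0.8), theta, omega)

upright = cos_incidence((0.0, 0.0, 3.0), (0.0, 0.0, 1.0), 0.0, 0.0)
tilted = cos_incidence((0.0, 0.0, 3.0), (0.0, 0.0, 1.0), math.radians(60.0), 0.0)
```

A UE lying flat directly below the AP sees the light at normal incidence, while tilting the same UE by 60 degrees halves cos(ψ) and therefore halves the LOS gain.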

C. OPTICAL OFDM TRANSMISSION AND MULTIPLE ACCESS TECHNIQUE
In this paper, we consider OFDM, which is one of the most common modulation techniques in LiFi communication systems [30]. In contrast to single carrier modulation, OFDM makes efficient use of the frequency spectrum by dividing a single channel into multiple narrow-band channels at different subcarrier frequencies. For IM/DD systems, the conventional OFDM approach cannot be used directly, as it produces bipolar and complex signals. For the use of OFDM in LiFi, the transmitted signal must be real and non-negative, and therefore a number of OFDM based modulation schemes suitable for LiFi were introduced (e.g., DCO-OFDM and ACO-OFDM). DCO-OFDM is considered in this work since it uses all of the available subcarriers to carry the data symbols, making it more bandwidth efficient than ACO-OFDM, which uses only half of the subcarriers to carry the data symbols [31].
In DCO-OFDM, Hermitian symmetry is applied to generate real-valued signals in the time domain, while a DC bias is added to ensure that the signal is positive before it passes through the channel. In LiFi systems, multiuser access can be supported by means of a multiple access technique. In Section V, where we discuss the DL-based resource allocation approach in detail, we consider orthogonal frequency division multiple access (OFDMA) to allow users to share the frequency resources at different subcarriers. By allocating subsets of subcarriers to different users, multiple access can be achieved while the quality of service of different users can be controlled. Therefore, OFDMA is regarded as a practical choice for downlink transmission. As mentioned previously, the PF scheduling technique can be used to fairly allocate these subcarriers to different users. The decision on subcarrier allocation is based on the instantaneous CSI and the moving average data rate of the user. Hence, the user with the poorest channel gain and the lowest average data rate is prioritized for the next subcarrier allocation. The implementation of PF scheduling in this work is further described in Section V.
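The subcarrier-by-subcarrier allocation described above can be sketched as follows. Note the swap: this is the textbook proportional-fair rule, where the priority of a user is its achievable rate on the subcarrier divided by its average rate so far, and it may differ from the exact priority metric used in this paper. The function name, the Shannon-rate model and the initial average rate are assumptions for illustration.

```python
import math

def pf_allocate(snr, initial_avg=1.0):
    """Assign each subcarrier to the user with the highest PF priority.

    snr[j][k] is the linear SNR of user j on subcarrier k (assumed known).
    Returns a list with the index of the serving user for every subcarrier.
    """
    n_users, n_sub = len(snr), len(snr[0])
    avg_rate = [initial_avg] * n_users        # running average rate per user
    allocation = []
    for k in range(n_sub):
        rates = [math.log2(1.0 + snr[j][k]) for j in range(n_users)]
        priority = [rates[j] / avg_rate[j] for j in range(n_users)]
        winner = max(range(n_users), key=priority.__getitem__)
        avg_rate[winner] += rates[winner]     # the winner's average rate grows
        allocation.append(winner)
    return allocation

# Two users with flat channels: user 0 is strong, user 1 is weak.
allocation = pf_allocate([[15.0] * 4, [1.0] * 4])
```

Although user 0 has the better channel on every subcarrier, the growing average rate lowers its priority after each grant, so the weaker user is served in alternation rather than starved.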

D. SIGNAL DETECTION AND RESOURCE ALLOCATION
The signal detection problem follows the classical maximum likelihood (ML) optimization, where optimal detection is achieved by minimizing the distance between the received and transmitted signals. Hence, ML detection can be defined as [32], [33]:
$$\hat{X} = \arg\min_{X} \lVert Y - HX \rVert^{2},$$
where Y, X and H are the received signal, transmitted signal and the channel gain, respectively. The resource allocation problem is based on the PF resource allocation scheme, which begins with the calculation of the priority of each user at each subcarrier; the user with the maximum priority is then assigned the subcarrier. The priority of the jth UE for the kth subcarrier is calculated using the metric of [5], [34], which depends on $\bar{R}_{j}$, the average data rate of the jth UE before allocating the kth resource, and $R_{\mathrm{req}}$, the requested data rate of the jth UE. The algorithm then continues to assign the subcarriers to the user with the next maximum priority until all of the frequency subcarriers are allocated. In our work, we assume that the full CSI is not available, and therefore, prior to signal detection and resource allocation, the system must first estimate the channel based on pilot signals. For the channel estimation, we follow the classical methods, namely MMSE and LS. For MMSE, the goal is to obtain the estimates of H and X that jointly minimize the mean square error between the actual value of the received pilot symbols and the estimated value of the pilot symbols, as defined in [10], [35], [36]. The goal of the LS estimator is to minimize the squared distance between the received pilot signal and the original pilot signal. The important connection between the signal detection and resource allocation problems is that both are affected by the limited number of pilots and both require channel estimation. Therefore, in our work, we examine how imperfect CSI affects signal detection and resource allocation. It should be noted that these are classical problems, and the focus of this work is not to modify them but rather to study the effect of limited pilots, furniture, the indoor environment, etc., in order to demonstrate the potential of learning techniques in effectively enhancing signal detection and resource allocation in the absence of full CSI.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
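As an illustration of the conventional benchmark chain discussed above, the sketch below performs a per-subcarrier LS channel estimate from pilots, which for a single-tap channel reduces to dividing the received pilot by the known pilot, followed by ML detection that picks the QPSK point minimizing $|Y - HX|^{2}$. The channel values, pilot choice and the small fixed disturbance are made-up assumptions; MMSE estimation is omitted because it additionally requires channel and noise statistics.

```python
# Unit-energy QPSK constellation.
QPSK = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]

def ls_estimate(y_pilot, x_pilot):
    """Per-subcarrier LS estimate: argmin_H |Y_p - H X_p|^2 = Y_p / X_p."""
    return [y / x for y, x in zip(y_pilot, x_pilot)]

def ml_detect(y, h, constellation=QPSK):
    """Per subcarrier, pick the constellation point minimising |Y - H X|^2."""
    return [min(constellation, key=lambda s: abs(yk - hk * s) ** 2)
            for yk, hk in zip(y, h)]

# Toy two-subcarrier example with assumed channel gains.
h_true = [0.8 - 0.3j, 0.5 + 0.2j]
x_pilot = [QPSK[0], QPSK[3]]
y_pilot = [h * x for h, x in zip(h_true, x_pilot)]          # noise-free pilots
h_hat = ls_estimate(y_pilot, x_pilot)

x_data = [QPSK[1], QPSK[2]]
y_data = [h * x + 0.01 for h, x in zip(h_true, x_data)]     # small disturbance
detected = ml_detect(y_data, h_hat)
```

With noisy pilots the LS estimate inherits the pilot noise directly, which is why the detection quality of this chain degrades as the number of pilots is reduced.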

III. DEEP LEARNING METHOD
In this paper, we propose an effective learning approach for signal detection and resource allocation in DCO-OFDM LiFi networks with the aid of an LSTM neural network and a DNN, respectively. For the implementation of DL in LiFi, we first generate a sufficiently large amount of training data to be fed into the learning algorithm. The DL process consists of two stages: offline training and online implementation.
Using realistic channel models that describe the real channels well and considering an optimal solution for the problem of interest, the training data can be generated by simulation, which was conducted using MATLAB and Python. During training, the DL model learns a mapping function between the input and the output and minimizes the error between the output of the model and the actual value of the output signal. Once the network is trained, it can be employed in the real-time deployment stage to output an accurate estimate for the desired task, in which the system can be viewed as a black box. In this work, the DL models were trained on a computer with a 2.3 GHz quad-core Intel i5 CPU, taking between 0.6 and 1.5 hours to complete. It should be noted that one of the features of the indoor environment is that the furniture does not necessarily move every minute or every hour. Therefore, the indoor environment can remain unchanged for some period of time. In this case, training the deep learning methods on a large set of channel realizations which consider random device orientations, random user locations, etc., is enough to increase the generalization ability of the network. If the system changes over time (e.g., due to the placement of furniture at different locations), transfer learning can be used to update the deep learning network. For the system to adapt to new changes, we need to determine which parts of the system can be kept fixed and which parts need to be fine-tuned. This can be considered as a future research direction.
LSTM is a recurrent neural network capable of learning long-term dependencies in data sequences and has achieved success in sequence prediction problems [37], [38]. Its hidden layers consist of memory cells controlled by 'gates' which regulate the flow of information into and out of the memory cell. Hence, they decide what new information is input to the cell, what old information is discarded, and what updated information is output from the cell. Since LSTM has internal memory and makes its decision by considering the current input and the previous outputs, it can perform extremely well in processing sequences of data. This makes LSTM applicable to the signal detection problem, as sequences of transmitted and received symbols are used to train the neural network. Figure 3 shows the architecture of the LSTM that is considered in the signal detection problem in Section IV. The sequence of operations in LSTM at time step t can be found in [39]. LSTM uses a sigmoid gate activation function, $\sigma_{g}(\cdot)$, and a hyperbolic tangent activation function, $\tanh(\cdot)$, that act as squashing functions to ease the training of the network. The sigmoid function acts as a gating function that decides how much information passes through the cell gates.
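The gating mechanism described above can be sketched for a single time step. The snippet follows the commonly used LSTM formulation (input, forget and output gates with sigmoid activation, a tanh candidate, and an element-wise cell update); the exact variant in [39] may differ in detail, and the parameter layout and names here are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate(weights, x, h_prev, activation):
    """activation(W x + U h_prev + b), evaluated row by row."""
    W, U, b = weights
    return [activation(sum(w * xi for w, xi in zip(W[r], x))
                       + sum(u * hi for u, hi in zip(U[r], h_prev)) + b[r])
            for r in range(len(b))]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    i = gate(params["input"], x, h_prev, sigmoid)        # what to write
    f = gate(params["forget"], x, h_prev, sigmoid)       # what to keep
    o = gate(params["output"], x, h_prev, sigmoid)       # what to expose
    g = gate(params["candidate"], x, h_prev, math.tanh)  # candidate content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# One hidden unit with all-zero parameters: every gate outputs 0.5.
zero = ([[0.0]], [[0.0]], [0.0])
params = {name: zero for name in ("input", "forget", "output", "candidate")}
h, c = lstm_step([1.0], [0.0], [2.0], params)
```

With zero parameters the forget gate passes exactly half of the previous cell state and the candidate contributes nothing, which makes the effect of each gate on the memory cell easy to follow by hand.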
DNNs are among the most widely used DL models for solving nonlinear problems. They have found success in many applications, especially in classification problems. Figure 4 shows the structure of the DNN model implemented for the resource allocation problem in Section V. A DNN is well suited to the resource allocation problem, as the problem involves a deterministic mapping between input and output. The network needs to learn the relationship between the channel gains of the users and the user that will be prioritized for the next subcarrier allocation. As shown in Figure 4, a typical DNN consists of an input layer, multiple hidden layers and an output layer. Here, the inputs are the channel gains of the users waiting to be allocated the kth subcarrier, and the output is the allocated user index across all subcarriers. To ease the learning process and achieve better learning performance, a nonlinear function called the Rectified Linear Unit (ReLU) is used, which can be mathematically defined as:
$$f(z) = \max(0, z).$$
The forward propagation of the DNN can then be described as:
$$z_{i}^{(l)} = W_{i}^{(l)}\, y^{(l-1)} + b_{i}^{(l)}, \qquad y_{i}^{(l)} = f\!\left(z_{i}^{(l)}\right),$$
where $z_{i}^{(l)}$ is the output of the lth layer before the nonlinearity, $W_{i}^{(l)}$ is the weight coefficient, $y^{(l-1)}$ is the output of the previous layer, $b_{i}^{(l)}$ denotes the bias, and $y_{i}^{(l)}$ denotes the output of the lth layer after undergoing the nonlinear operation. During backpropagation, the output values are compared with the target values in order to calculate the error. The error is calculated by means of a loss function (e.g., mean square error (MSE)) and is then fed back through the network, where the algorithm adjusts the weights of each connection in order to reduce the error. Training against known target values in this way is called supervised learning.
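The forward pass and the loss described above can be written out for a tiny network. This is a minimal sketch assuming fully connected layers with ReLU applied after every layer, as in the forward propagation relation in the text (a practical classifier would typically use a different activation at the output). The weights and inputs are arbitrary illustrative numbers.

```python
def relu(z):
    """Rectified Linear Unit: max(0, z)."""
    return max(0.0, z)

def dense(W, b, y_prev):
    """y_i = relu(W_i . y_prev + b_i) for every neuron i of the layer."""
    return [relu(sum(w * y for w, y in zip(row, y_prev)) + bi)
            for row, bi in zip(W, b)]

def forward(layers, x):
    """Propagate the input through a list of (W, b) layers."""
    y = x
    for W, b in layers:
        y = dense(W, b, y)
    return y

def mse(pred, target):
    """Mean square error between the network output and the target."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Two inputs -> two hidden neurons -> one output (illustrative weights).
layers = [([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
          ([[1.0, 1.0]], [0.5])]
output = forward(layers, [2.0, 1.0])
loss = mse(output, [1.0])
```

The second hidden neuron receives a negative pre-activation and is switched off by ReLU, so only the first neuron contributes to the output; backpropagation would then adjust the weights to shrink the resulting loss.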

IV. DEEP LEARNING-BASED SIGNAL DETECTION
The architecture of our DL-based DCO-OFDM system is illustrated in Figure 5. As seen in the figure, the system is very similar to conventional OFDM. The main differences are that, firstly, the transmitted signal in DCO-OFDM must have Hermitian symmetry and, secondly, a DC bias must be applied before transmitting the signal through the channel. This fulfils the requirement that the signal transmitted in IM/DD systems must be both real and non-negative. During real-time operation, a sequence of symbols is randomly generated at the transmitter and then modulated. Then, pilot symbols are uniformly inserted into the frame, on which Hermitian symmetry is imposed; with κ denoting the total number of subcarriers, only κ/2 − 1 subcarriers carry information. Afterwards, in the time domain, the cyclic prefix (CP) is inserted and the DC bias is added to make the signal positive before it passes through the optical channel. After going through the channel, the received signal can be expressed as in (1). At the receiver side, after removing the DC bias and the CP, a fast Fourier transform (FFT) is performed to convert the signal from the time domain to the frequency domain. Hence, the received signal can be described as:
$$Y(f) = R\,X(f)\,H(f) + N(f),$$
where Y(f), X(f), H(f) and N(f) are the FFTs of y(t), x(t), h(t) and n(t), respectively. Finally, after demodulation, the estimated noisy version of the received symbols is fed as input to the trained LSTM model and the original transmitted data is recovered.
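The transmitter and receiver steps above can be sketched end to end for a very small frame. The snippet assumes the usual DCO-OFDM frame layout, with the DC and middle subcarriers set to zero and the upper half filled with the mirrored complex conjugates, and uses a direct DFT so that it needs no external library. The frame size, CP length and bias are illustrative, and pilot insertion and the channel are left out.

```python
import cmath

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dco_ofdm_modulate(symbols, cp_len, dc_bias):
    """Map kappa/2 - 1 symbols to a real, non-negative time signal with CP."""
    # Hermitian-symmetric frame: [0, X_1..X_M, 0, conj(X_M)..conj(X_1)].
    frame = [0j] + list(symbols) + [0j] + [s.conjugate() for s in reversed(symbols)]
    x = [v.real for v in idft(frame)]          # imaginary part is numerically ~0
    x = x[len(x) - cp_len:] + x                # prepend the cyclic prefix
    return [max(v + dc_bias, 0.0) for v in x]  # add bias, clip any residual negatives

def dco_ofdm_demodulate(samples, cp_len, dc_bias, n_data):
    """Remove CP and bias, then read the data subcarriers from the DFT."""
    x = [v - dc_bias for v in samples[cp_len:]]
    return dft(x)[1:n_data + 1]

# kappa = 8 subcarriers, so kappa/2 - 1 = 3 QPSK symbols carry information.
symbols = [(1 + 1j) / 2 ** 0.5, (-1 + 1j) / 2 ** 0.5, (1 - 1j) / 2 ** 0.5]
tx = dco_ofdm_modulate(symbols, cp_len=2, dc_bias=2.0)
rx = dco_ofdm_demodulate(tx, cp_len=2, dc_bias=2.0, n_data=3)
```

Over an ideal channel the data symbols are recovered exactly because the chosen bias is large enough to avoid clipping; with a smaller bias the clipping in the modulator would introduce the nonlinear distortion that motivates learning-based detection.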
For the training stage, the training data is generated by simulation using the channel models discussed in Section II. The training data is divided into a training set and a validation set. The latter is not used for training; instead, it is used to monitor the validation loss during the training process, which helps determine whether the network is overfitting the training data. In each simulation, the transmitted signal undergoes channel distortions caused by the diffuse channel components and different noise samples, which increases the generalization ability of the DL model during the online deployment stage. To collect the training data, a DCO-OFDM system with 127 subcarriers and quadrature phase shift keying (QPSK) modulation is considered. Overall, $1 \times 10^{8}$ data samples are produced for training through random realizations of different parameters (i.e., user location, device orientation, user direction, and noise) that are generated independently of each other. The transmitted and received symbols are collected as the training data. During the training stage, the DL model learns to minimize the error between the output of the neural network and the original transmitted data by re-adjusting the weights using an MSE loss function.

A. LEARNING ALGORITHM DESIGN
Choosing a suitable training data set size depends on the complexity of the system as well as the complexity of the learning algorithms. Using too little training data may cause poor performance since the model would be incapable of fully learning the diverse characteristics of the environment under study. Having too much training data may result in over-fitting, where the model corresponds too closely to the training data but fails to give good performance when new data is presented during the testing stage. Thus, we conduct several experiments to determine which training data size offers the best BER performance. Table 2 shows the effect of increasing the total number of training data for LSTM when the pilot ratio is 1/8 and 1/32, respectively. Considering a BER of $3.8 \times 10^{-3}$, which is the forward error correction (FEC) threshold [40], we can see that increasing the training data size noticeably reduces the SNR penalty against maximum likelihood (ML) detection at this BER. In order to decide on the number of layers to be used, several experiments were conducted by training LSTM architectures of different complexity, varying the number of LSTM hidden layers. Table 3 shows the performance of LSTM with a varying number of hidden layers. It is clear that, as the number of layers increases, LSTM works very well and performs close to ML. We can see that at BER = $3.8 \times 10^{-3}$, the SNR penalty for 5 hidden layers compared to 10 hidden layers is almost 1 dB, which is not a large difference. Therefore, to avoid the risk of overfitting, 5 LSTM layers were chosen, which gives good enough performance. The hyperparameters (e.g., number of hidden layers, number of neurons, learning rate, etc.) are determined based on several experiments conducted using various configurations of the LSTM network. 
We manually search for the best values by trial and error (e.g., increasing and decreasing the number of hidden layers, the number of neurons, etc.). The number of neurons in each layer is adjusted to the complexity of the LiFi network, with the number of neurons ranging from 5 to 100. The learning rate chosen for the Adam optimizer is 0.001, which gives a reasonable training time for the neural network. It is observed that the best performance is achieved when our DL model consists of one input and one output layer, five hidden LSTM layers with 100, 50, 50, 25 and 10 neurons respectively, one fully connected layer and one dropout layer. The LSTM layers process the whole input sequence using their feedback connections, while the fully connected layer maps the output of the last LSTM layer to the same size as the input layer. The dropout layer is added to help reduce overfitting, and the model is trained until the validation loss stops decreasing or becomes larger than the previous minimum value. The high computational complexity of the proposed method is incurred only during the training stage, which is conducted offline. During real-time implementation, the trained network is capable of offering a much faster solution.
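The stopping rule mentioned above (training ends once the validation loss stops decreasing or exceeds its previous minimum) can be sketched as follows. The function name and the `patience` parameter are hypothetical and are introduced only for illustration; the text does not state how many non-improving epochs were tolerated.

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    the index of the last epoch is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1

# Illustrative (made-up) validation losses
losses = [0.9, 0.5, 0.4, 0.45, 0.3]
stop_strict = early_stopping_epoch(losses, patience=1)
stop_tolerant = early_stopping_epoch(losses, patience=2)
```

With `patience=1` the run stops at the first epoch whose validation loss is not a new minimum, while a larger patience tolerates a temporary increase.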

B. EFFECT OF REDUCED PILOT NUMBERS
In order to show that our proposed method can give significant gains when partial CSI is considered, we compare the performance of our LSTM-based signal detection method, which for simplicity we call LSTM, with traditional LS and MMSE estimation in terms of BER for different SNRs. The simulation parameters are listed in Table 1. To test the accuracy of the DL network, we also implement ML detection, which assumes perfect CSI, as a benchmark. From the results depicted in Figure 6, it is clear that the proposed model always achieves better performance than both LS and MMSE and performs almost as well as detection with full CSI. Our DL model offers excellent performance since it can adapt to the specific geometrical configurations and user behavior effects, especially when fewer pilots are used. Focusing on the case where the pilot ratio is 1/32 and looking at BER = $3.8 \times 10^{-3}$, SNR gains of approximately 9 dB and 15 dB are obtained for the LSTM-based approach compared to MMSE and LS, respectively. This can be expected since, even when the number of pilots is reduced, the DL model is able to use the whole sequence of historical data to learn the channel characteristics and the user behavior effects and to make reliable predictions of the transmitted signal. In comparison, LS and MMSE give poor performance since the reduced number of pilots is not enough to accurately estimate the channel. Therefore, even when the system has only partial channel information, the proposed model can still give excellent detection performance, similar to that of detection with full CSI. Figure 7 shows the MSE of the different detection schemes when the CSI is limited. From the figure, it can be seen that the MSE declines gradually with increasing SNR for all of the detection methods. It is clear that our LSTM-based detection scheme yields the best MSE performance for both full CSI and partial CSI scenarios. 
Once again, we can see that with partial CSI, LS gives the worst performance compared to LSTM and MMSE. This is expected since LS does not take the channel statistics into account during channel estimation. Unlike LS, MMSE uses the first- and second-order channel statistics when performing channel estimation, which explains its better MSE performance compared to LS.

C. EFFECT OF FURNITURE
As previously mentioned in Section I, in order to make the LiFi channel more realistic, we consider a very specific indoor scenario by including some furniture in the room. To clearly see the effect it has on the performance of each method, we compare their performance under two different scenarios, i) with furniture and ii) without furniture, as shown in Figure 8. We calculated the SNR penalties against ML for each detection scheme for the two geometrical configurations (i.e., with and without furniture), which are shown in Table 4. We can see that without furniture, LSTM has a gain of 7 dB and 11 dB with respect to MMSE and LS, respectively. However, as also mentioned in the previous section, when furniture is included in the room, LSTM performs very close to the respective ML with a gain of 9 dB and 15 dB compared to MMSE and LS, respectively, at BER = $3.8 \times 10^{-3}$. This means that even when the geometrical configuration of the room changes, deep learning is able to capture those characteristics and outperform the traditional methods. By including furniture in the room, the blockage probability of the optical link increases, and the uniform scenario becomes more specific. Therefore, LS and MMSE perform worse in this case. These results support our expectation that deep learning learns the environment better than LS and MMSE when the geometry of the room becomes more complex by deviating from a symmetrical or uniform state. This conclusion is also confirmed by Table 4.
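The SNR penalties quoted in this section are read off the BER curves at the FEC threshold. One way such a reading could be automated is sketched below, assuming that each BER curve is sampled on a common SNR grid and that log10(BER) is interpolated linearly between neighbouring points. The two curves in the example are synthetic placeholders chosen so that the answer is obvious; they are not results from this paper.

```python
import math

def snr_at_target_ber(snr_db, ber, target=3.8e-3):
    """SNR (dB) at which a monotonically decreasing BER curve crosses `target`.

    Interpolates log10(BER) linearly between the two neighbouring samples.
    """
    log_target = math.log10(target)
    for i in range(len(snr_db) - 1):
        b0, b1 = ber[i], ber[i + 1]
        if b0 >= target >= b1 and b0 != b1:
            l0, l1 = math.log10(b0), math.log10(b1)
            frac = (log_target - l0) / (l1 - l0)
            return snr_db[i] + frac * (snr_db[i + 1] - snr_db[i])
    raise ValueError("the BER curve does not cross the target")

def snr_penalty_db(snr_db, ber_scheme, ber_reference, target=3.8e-3):
    """Extra SNR a scheme needs, relative to a reference, to reach `target`."""
    return (snr_at_target_ber(snr_db, ber_scheme, target)
            - snr_at_target_ber(snr_db, ber_reference, target))

# Synthetic curves: the scheme is the reference shifted by exactly 10 dB
snr_grid = [10, 20, 30, 40, 50]
ber_reference = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
ber_scheme = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
penalty = snr_penalty_db(snr_grid, ber_scheme, ber_reference)
```

Interpolating in the logarithmic domain is a design choice that matches how BER curves are usually plotted; linear interpolation of the raw BER values would bias the reading towards the higher-SNR sample.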

D. EFFECT OF CONDITIONAL HOTSPOT MODEL
We further simulate the signal detection problem using the proposed method and train it based on a hotspot model. The idea of using a hotspot model is to see whether the DL method achieves a higher gain when the behavior of the user deviates from a uniform distribution, thereby increasing the complexity of the environment. The hotspot model considers an area in the room where the probability of the user being within this area is higher than that of being elsewhere. We collect the training data for all of the possible user locations in the same manner as in the previous approach. We then consider three scenarios in which the probability of the user being inside the hotspot area is 100%, 80% or 50%, referred to as Hotspot A, Hotspot B and Hotspot C, respectively.
To realize this hotspot model, for each scenario, we train the LSTM network by using 100%, 80% and 50% of the total data collected from all possible locations of the user inside the hotspot area, respectively. For the second and third cases, the remaining 20% and 50% of the data are taken from random user positions located outside the hotspot area. We compare the performance of LSTM, MMSE and LS in the case where partial channel knowledge is considered. Figure 9 presents the average BER curves for the three indoor scenarios, where we omit the LS performance to keep the figure less crowded. As expected, the best performance, that is, the one closest to the full CSI case, is achieved by the learning-based method. Hotspot A can be regarded as a fixed case where we are looking at a very specific scenario. Here the user is always sitting around the table in the hotspot area, rather than being randomly distributed within the room. Therefore, with this knowledge, it is easier for LSTM to learn and operate close to ML even with partial CSI. The Hotspot B and Hotspot C cases can be viewed as random location scenarios, where the user can be anywhere in the room. For these random cases, the traditional methods benefit from more randomness and therefore give better performance compared to Hotspot A. This is because the other 20% and 50% of the data are taken randomly and symmetrically outside the hotspot area, where there are parts of the room that have very good and very poor channels. The averaging effect eventually leads to LS and MMSE having better BER performance. On the other hand, for Hotspot A, the channel gains are more strongly influenced by the NLOS links, which are a combination of many effects. This makes the frequency response of the system more complex and more difficult to estimate in the absence of full CSI, particularly for LS and MMSE. 
Table 5 shows the effect of the different user behavior models on the performance of LSTM, MMSE and LS in the presence of partial CSI, in terms of SNR penalty against ML with full CSI. It can be seen that LSTM outperforms the other techniques, with an SNR penalty of only 1 dB to 2 dB against ML at BER = $3.8 \times 10^{-3}$. Hence, from the results shown in Figure 9 and Table 5, we can confirm our earlier expectation that, when different user behaviors exist, the DL-based method is able to capture and adapt to those specific characteristics of the channel and to perform better than the traditional techniques.
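The way the three hotspot training sets are composed can be sketched as a simple sampling routine. The function name, the sample pools and the total of 1000 samples per scenario are hypothetical placeholders; in the simulations each sample would be a pair of transmitted and received symbols generated for a given user location.

```python
import random

def build_hotspot_training_set(inside, outside, p_inside, n_total, seed=0):
    """Draw a training set in which a fraction `p_inside` of the samples comes
    from user locations inside the hotspot and the rest from outside."""
    rng = random.Random(seed)
    n_in = round(p_inside * n_total)
    n_out = n_total - n_in
    samples = rng.sample(inside, n_in) + rng.sample(outside, n_out)
    rng.shuffle(samples)
    return samples

# Placeholder pools: each entry stands for one simulated training sample
inside_pool = [("inside", i) for i in range(1000)]
outside_pool = [("outside", i) for i in range(1000)]

scenarios = {"Hotspot A": 1.0, "Hotspot B": 0.8, "Hotspot C": 0.5}
training_sets = {
    name: build_hotspot_training_set(inside_pool, outside_pool, p, n_total=1000)
    for name, p in scenarios.items()
}
inside_counts = {
    name: sum(1 for label, _ in samples if label == "inside")
    for name, samples in training_sets.items()
}
```

Sampling without replacement keeps every drawn sample distinct, which assumes that each pool holds at least as many samples as are requested from it.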

E. EFFECT OF FIELD OF VIEW AND MULTIPLE LEDS
In the previous simulations, we considered the FOV to be 85° and assumed only a single LED. We now simulate two different cases in order to study the effect of the FOV and of multiple LEDs on the detection performance. Firstly, assuming partial CSI with furniture included in the room, we limit the FOV to 45°. Figure 10 shows the performance comparison of the different detection schemes for the two FOV values. It can be seen that when the FOV is reduced from 85° to 45°, the performance of all detection methods degrades. However, LSTM is more robust to the effect of changing the FOV and still gives better detection performance than LS and MMSE. This degradation relative to the case where the FOV is 85° is expected: when the FOV is set to 45°, the UE in some locations (e.g., users closer to the walls) may have no chance of accessing the AP, as it may not see the LOS channel and the contribution of the NLOS channel is smaller. Next, instead of focusing on a single LED, we investigate the effect of placing four LEDs on the ceiling as depicted in Figure 2. For this particular scenario, we also assume partial CSI, furniture is included, and the FOV is fixed at the original value of 85°. The LED half-intensity angle, $\Phi_{1/2}$, is set to 35°, which is chosen to fulfil the illumination requirements for an indoor environment [41]. Figure 11 depicts the performance of each detection scheme when the effect of interference from the nearby APs is considered. The result shows that the added interference leads to no significant degradation in the average performance of LSTM, MMSE or LS. Note that the average performance result shown does not consider any interference mitigation techniques such as fractional frequency reuse, which could be applied to reduce the interference effect further, particularly for the edge users.

F. COMPLEXITY ANALYSIS
It has been mentioned in previous works that the LSTM algorithm is very efficient and is local in space and time [37]. This means that the complexity of the network does not depend on the input sequence length and that, at each time step, the computational complexity of an LSTM layer per weight is $O(1)$. Hence, the total complexity of an LSTM at each time step depends only on the number of weights and is $O(w)$, where $w$ is the number of weights. Therefore, the time complexity of our model is $O\big(\sum_{l=1}^{d} w_l\big)$, where $l$ and $d$ are the index and the number of LSTM layers, respectively, while $w_l$ denotes the number of weights of the $l$-th layer. In our simulation, the LSTM model consists of 5 LSTM layers with 100, 50, 50, 25 and 10 hidden units, respectively. Note that the number of training data samples exceeds the number of neural network parameters, which reduces the risk that the neural network overfits the data.
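As a rough illustration of the quantity $\sum_{l} w_l$, the number of weights of the five stacked LSTM layers can be counted as below. The count uses the standard LSTM parameterization with four gates and one bias vector per gate (some libraries keep two bias vectors, which changes the count slightly), and it assumes an input feature size of 2, which is a hypothetical value since the input dimension is not stated here.

```python
def lstm_weight_count(n_in, n_hidden):
    """Weights of one LSTM layer: four gates, each with an input matrix
    (n_hidden x n_in), a recurrent matrix (n_hidden x n_hidden) and a bias."""
    return 4 * (n_hidden * (n_in + n_hidden) + n_hidden)

def stacked_lstm_weight_count(n_features, hidden_units):
    """Per-layer weight counts when each layer feeds the next one."""
    counts = []
    n_in = n_features
    for n_hidden in hidden_units:
        counts.append(lstm_weight_count(n_in, n_hidden))
        n_in = n_hidden
    return counts

hidden_units = [100, 50, 50, 25, 10]   # LSTM layer sizes given in the text
per_layer = stacked_lstm_weight_count(n_features=2, hidden_units=hidden_units)
total_weights = sum(per_layer)          # 100640 under the stated assumptions
```

Under these assumptions the LSTM layers hold about $10^{5}$ weights, which is several orders of magnitude below the $1 \times 10^{8}$ training samples mentioned earlier.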
It was mentioned earlier that channel estimation using LS has low complexity and can be obtained by a simple division of the received pilot symbols, which are affected by the optical channel, by the transmitted pilot symbols. However, channel estimation with LS gives inadequate performance compared to the DL techniques. LS channel estimation can be expressed as [42]:
$$\hat{H}_{LS}(p_n) = \frac{Y(p_n)}{X(p_n)},$$
where $p_n$ is the $n$-th pilot index of the signal, $X(p_n)$ is the transmitted pilot signal and $Y(p_n)$ is the received pilot signal. Interpolation is then used over the remaining data subcarriers to obtain the full channel estimate. Assuming $N$ is the total length of the pilot symbols, the complexity of LS can be described as $O(N)$. Meanwhile, MMSE estimates the channel by minimizing the mean squared error. Unlike LS, it can provide better channel estimates as it utilizes the second-order statistics of the channel. However, this results in an increase in complexity due to the inversion of the channel covariance matrix. The MMSE estimator follows [43] and relies on $R_{HH} = E[HH^{H}]$, the channel covariance matrix in the frequency domain, and $\sigma^2$, the noise variance. The channel covariance matrix is calculated based on the statistical CSI across all subcarriers. Thus, the complexity of MMSE can be described as $O(k^3)$, where $k$ is the number of frequency subcarriers. Finally, ML detection is carried out by assuming full knowledge of the CSI to provide a benchmark for optimal detection. Because full CSI is assumed to be available in this case, a complexity comparison would not be insightful.
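The LS procedure described above (pilot-wise division followed by interpolation over the data subcarriers) can be sketched as follows. Linear interpolation is an assumption, since the interpolation method is not specified here, and the 17-subcarrier toy channel with a pilot on every eighth subcarrier is a made-up example rather than the simulated LiFi channel.

```python
def ls_estimate(rx_pilots, tx_pilots):
    """LS channel estimate at the pilot positions: H(p_n) = Y(p_n) / X(p_n)."""
    return [y / x for y, x in zip(rx_pilots, tx_pilots)]

def interpolate_channel(pilot_idx, pilot_est, n_sub):
    """Linear interpolation between pilot estimates; the first and last
    estimates are held constant outside the pilot range."""
    est = []
    for k in range(n_sub):
        if k <= pilot_idx[0]:
            est.append(pilot_est[0])
        elif k >= pilot_idx[-1]:
            est.append(pilot_est[-1])
        else:
            for i in range(len(pilot_idx) - 1):
                lo, hi = pilot_idx[i], pilot_idx[i + 1]
                if lo <= k <= hi:
                    w = (k - lo) / (hi - lo)
                    est.append((1 - w) * pilot_est[i] + w * pilot_est[i + 1])
                    break
    return est

# Toy noiseless example with a channel that varies linearly across subcarriers
n_sub = 17
pilot_idx = [0, 8, 16]                      # one pilot every 8 subcarriers
true_h = [complex(1.0 - 0.05 * k, 0.02 * k) for k in range(n_sub)]
tx_pilots = [1 + 1j] * len(pilot_idx)
rx_pilots = [true_h[i] * x for i, x in zip(pilot_idx, tx_pilots)]

h_pilots = ls_estimate(rx_pilots, tx_pilots)
h_full = interpolate_channel(pilot_idx, h_pilots, n_sub)
max_err = max(abs(a - b) for a, b in zip(h_full, true_h))
```

In this noiseless, linearly varying toy channel the interpolated estimate is exact. With noise or a more frequency-selective response, the error grows as the pilot spacing increases, which is the regime in which the reduced-pilot results of this section were obtained.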

V. DEEP LEARNING-BASED RESOURCE ALLOCATION
The conventional resource allocation strategies in LiFi are usually iterative, with an implementation complexity that increases with the number of users. Most importantly, traditional resource allocation strategies require perfect CSI of all the users in the network, which is usually challenging to acquire in real time, especially when a high number of users is considered. In multiuser systems, the users normally compete for resources. One popular scheduling technique, proportional fair (PF), ensures efficient bandwidth allocation to users in order to support high utilization of resources while maintaining a level of fairness among the users [4]. However, the PF scheduling algorithm requires full knowledge of the channel, which may not be easily obtained in practice. Hence, motivated by the previous problem, where accurate signal detection was achieved by deep learning using only partial CSI, we propose a novel DNN-based resource allocation scheme for multiuser LiFi systems. We assume OFDMA based on DCO-OFDM in order to support multiple access between users. The signal transmissions for OFDMA are similar to what has been described in the detection problem. As the optimal benchmark, we simulate PF scheduling using full CSI. Let $H_j$ be the optical channel gain vector from the AP to UE $j$ based on the realistic channel model proposed in the previous section. During the first round of scheduling, the user that has the maximum channel gain is selected to connect to the AP. The PF scheduler then allocates a number of subcarriers to UE $j$ based on the user's requested data rate and its link quality. The scheduler allocates the $k$th resource to the $j$th UE following the metric defined in [4], [5], which depends on $\bar{R}_j$, the average data rate of the $j$th UE before allocating the $k$th resource, and $R_{req}$, the requested data rate of the UE. In this paper, we assume that the requested data rate is the same for all users. 
After all of the resources have been allocated to all users, the downlink rate of UE $j$ is obtained from the subcarrier allocation indicator $s_{j,k}$, where $s_{j,k} = 1$ if the $k$th subcarrier is allocated to UE $j$ and $s_{j,k} = 0$ otherwise. The SNR of UE $j$ on the $k$th subcarrier served by the AP, denoted as $\gamma_{d,j,k}$, is determined by the following quantities: $R_{PD}$ is the PD responsivity, $P_{d,opt}$ is the transmitted optical power, $H_{j,k}$ is the channel gain on the $k$th subcarrier, $K$ denotes the total number of subcarriers, and $\eta$ is the conversion factor, where we choose $\eta = 3$ to guarantee less than 1% of clipped signal. The noise power on the $k$th subcarrier of UE $j$, denoted as $\sigma_{j,k}$, is determined by the noise power spectral density $N_0$ and the downlink bandwidth $B_d$.
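The subcarrier-by-subcarrier scheduling idea can be sketched with a generic proportional fair rule. This is not the exact metric of [4], [5]: the sketch assumes the common PF form in which the user with the largest ratio of achievable rate on the current subcarrier to its rate accumulated so far wins that subcarrier, and it assumes a Shannon-type rate $\log_2(1 + \gamma)$ per subcarrier. The function name, the `eps` regularizer and the toy SNR values are likewise hypothetical.

```python
import math

def pf_allocate(snr, bandwidth_hz, eps=1e-9):
    """Generic proportional fair allocation (illustrative, not the paper's metric).

    snr[j][k] is the linear SNR of user j on subcarrier k.
    Returns the winning user index per subcarrier and the per-user rates.
    """
    n_users, n_sub = len(snr), len(snr[0])
    sub_bw = bandwidth_hz / n_sub
    rates = [0.0] * n_users
    allocation = []
    for k in range(n_sub):
        # Achievable rate of each user on subcarrier k (Shannon-type assumption)
        inst = [sub_bw * math.log2(1.0 + snr[j][k]) for j in range(n_users)]
        # PF metric: achievable rate relative to the rate accumulated so far
        metric = [inst[j] / (rates[j] + eps) for j in range(n_users)]
        winner = max(range(n_users), key=lambda j: metric[j])
        allocation.append(winner)
        rates[winner] += inst[winner]
    return allocation, rates

# Two users with identical SNR on four subcarriers: PF alternates between them
toy_snr = [[10.0, 10.0, 10.0, 10.0],
           [10.0, 10.0, 10.0, 10.0]]
allocation, rates = pf_allocate(toy_snr, bandwidth_hz=20e6)
```

The list `allocation` plays the role of the indicator $s_{j,k}$: subcarrier $k$ belongs to user `allocation[k]`. Pairs of channel gains and allocations of this kind are what a learning-based scheduler could be trained on.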

A. LEARNING ALGORITHM DESIGN AND COMPLEXITY ANALYSIS
The PF algorithm takes in the channel gain data and outputs the user index for each allocated subcarrier. After all of the subcarriers have been assigned to the users using the PF scheduling technique, the channel realizations and the corresponding user allocation at each subcarrier are collected to be used as input and output training data, respectively. The user scheduling can be seen as a classification problem, where the channel gain is fed as input to the network, and the network then outputs the index of the user that has been allocated a subcarrier. As previously mentioned in Section II, the channel is assumed to be quasi-static, and the resource allocation is performed over a single coherence time of the channel. Since the resource allocation problem does not exploit the temporal memory of the channel, implementing LSTM would add unnecessary degrees of freedom, making the design more complex than a regular feed-forward network. Hence, a feed-forward network was chosen due to the state-free mapping from the input to the output. For this work, we consider a feed-forward DNN model which consists of an input layer, an output layer, and 7 hidden fully connected layers with 100, 50, 25, 20, 15, 10 and 5 neurons, respectively. The architecture of the DNN used for this problem can be seen in Figure 4. The DNN is trained on the training data collected using the conventional PF user scheduling strategy. The trained network is later used in real-time implementation to efficiently conduct user scheduling based on the received input channel gains, with lower complexity. As in the detection problem, the training of the network for resource allocation is conducted offline.
To analyse the performance of the DNN with different network architectures, we trained five DNN models with different numbers of hidden layers and compared their performance in terms of average throughput versus SNR. To test the accuracy of the trained networks, PF with full CSI is used as a benchmark. Table 6 indicates that, by increasing the number of layers of the network, the DNN performs much better and even achieves performance very close to that of PF scheduling. However, apart from the increased computational complexity when the number of layers is increased, it is also important to note that if the number of layers of the DNN is too high, the network may overfit the training data, resulting in performance that appears extremely good on that data but may not generalize. As previously mentioned, overfitting occurs when the network corresponds too closely, or exactly, to a particular set of data. Therefore, in order to avoid this problem, we have chosen to use 7 hidden layers, which still gives good performance that is close enough to the performance of PF scheduling.
The online computational complexity of the DNN can be generally represented by the number of multiplications needed to compute the activation of all neurons in all of the network layers. The transition between the $l$th and $(l-1)$th layers requires $w_l \cdot w_{l-1}$ multiplications, where $w_l$ are the weights at the $l$th layer and $w_{l-1}$ are the weights at the previous layer. Therefore, the total complexity of the DNN is given by $O\big(\sum_{l=1}^{L} w_l\big)$, where $L$ is the total number of layers. For PF scheduling, the major operation blocks consist of determining the average data rate $\bar{R}_j$ and the metric $j$. Hence, its complexity can be described as $O(N_u k^2)$, where $N_u$ is the total number of users and $k$ is the number of subcarriers.
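The multiplication count for the chosen feed-forward architecture can be worked out directly from the per-transition cost $w_l \cdot w_{l-1}$ stated above, reading $w_l$ as the width of layer $l$. The hidden layer sizes are those given in the text; the input and output sizes of 4 (one value per user for $N_u = 4$) are assumptions made only for this illustration.

```python
def dnn_multiplications(layer_sizes):
    """Multiplications for one forward pass through fully connected layers:
    the sum over consecutive layers of (width of layer l) x (width of layer l-1)."""
    return sum(n_prev * n_next
               for n_prev, n_next in zip(layer_sizes, layer_sizes[1:]))

hidden = [100, 50, 25, 20, 15, 10, 5]   # hidden layer sizes from the text
n_inputs, n_outputs = 4, 4              # assumed: one input and one class per user
layer_sizes = [n_inputs] + hidden + [n_outputs]
multiplications = dnn_multiplications(layer_sizes)   # 7670 under these assumptions
```

Because the network is small, the count is dominated by the first two transitions (400 and 5000 multiplications), so the online cost is fixed by the architecture and does not grow with the number of subcarriers scheduled per inference.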

B. EFFECT OF REDUCED PILOT NUMBERS
Similar to the analysis in the OFDM signal detection problem, we compare the performance of our DNN-based resource allocation scheme, which for simplicity we call DNN, with the traditional PF scheduling algorithm in terms of average throughput for different SNRs. Using the same simulation parameters as listed in Table 1, we consider an OFDMA system based on DCO-OFDM with $N_u = 4$ users and a requested data rate of $R_{req} = 20$ Mbps. To analyse the performance of the DNN, we use the PF scheduling technique with perfect CSI as a benchmark. For the case of partial CSI, we apply PF scheduling that first estimates the channel using LS or MMSE and then runs the scheduling algorithm on the estimated channel. We also compare two DNN models, which are trained based on full CSI and partial CSI. Figure 12 shows that the DNN-based scheduling scheme for both scenarios (i.e., full and partial CSI) achieves almost the same performance as the optimal PF scheduling technique with full channel knowledge. For the partial CSI case, it is clear that the DL method always achieves better performance than PF scheduling using LS and MMSE. We once again show that LS- and MMSE-based channel estimation give poorer performance when a realistic environment is considered. In contrast to PF scheduling with partial CSI, our DNN model offers excellent performance since it is able to adapt specifically to the complex geometrical configurations and the user behavior effects. Hence, we can see a significant gain of the learning technique over the non-learning techniques, especially when fewer pilots are used. Figure 13 compares the MSE of the different scheduling schemes for the cases of full CSI and partial CSI. Similar to the detection problem, the MSE declines gradually with increasing SNR for all of the scheduling schemes. 
It can be seen that the proposed DNN-based scheduling scheme provides the best MSE performance for both full CSI and partial CSI scenarios. As previously mentioned, we can expect that PF-LS gives the worst performance compared to DNN and PF-MMSE when partial CSI is considered. This is due to LS not taking the channel statistics into account during channel estimation while MMSE uses the first and second order of the channel statistics.
Moreover, we compare the fairness among the users for the different scheduling algorithms to determine whether the users are receiving a fair share of the system resources. Many approaches to quantifying fairness have been proposed in the literature, the most commonly used being Jain's fairness index. The user fairness index can be described as:
$$F = \frac{\left(\sum_{j=1}^{N_u} R_j\right)^2}{N_u \sum_{j=1}^{N_u} R_j^2}, \qquad (20)$$
where $N_u$ is the total number of users and $R_j$ is the average throughput of user $j$. The fairness index takes its highest value of 1 when all users have the same throughput. The values of Jain's fairness index for the users under the different resource allocation schemes were calculated using (20) and are tabulated in Table 7. It is clear that most of the scheduling algorithms are able to achieve a fairness value of 1. For all of the PF schemes, the schedulers allocate the resources to the user who has the worst current channel realization relative to its own average. Hence, PF guarantees an equal amount of resources for all users. The DNN-based resource allocation scheme imitates the way PF allocates the subcarriers to users based on the given training data. Therefore, it is also able to achieve a fairness index of 1 when full channel knowledge is available, or very close to 1 with partial CSI.
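Jain's fairness index is straightforward to compute from the per-user average throughputs. The short sketch below uses made-up throughput values to illustrate the two extremes of the index; the function name is chosen for this example only.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum R_j)^2 / (N_u * sum R_j^2)."""
    n_users = len(throughputs)
    total = sum(throughputs)
    sum_sq = sum(r * r for r in throughputs)
    if sum_sq == 0:
        raise ValueError("at least one user must have non-zero throughput")
    return total * total / (n_users * sum_sq)

# Illustrative values in Mbps (not simulation results)
equal_share = jain_fairness([20.0, 20.0, 20.0, 20.0])   # all users served equally
single_user = jain_fairness([40.0, 0.0, 0.0, 0.0])      # one user takes everything
```

The index equals 1 when all users obtain the same throughput and falls to $1/N_u$ when a single user receives all of the resources, so for $N_u = 4$ it ranges between 0.25 and 1.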

C. EFFECT OF FURNITURE
In this subsection, we also compare the performance of the different scheduling schemes when the effect of furniture in the room is taken into account. Similar to the detection problem, the performance is compared between two scenarios, i) with furniture and ii) without furniture, and the result is shown in Figure 14. The SNR penalties against the respective PF scheduling algorithm for each scheduling scheme are also tabulated in Table 8. These results show the same trend as in the detection problem. Looking at Figure 14 and assuming a target throughput of 20 Mbps, there is no large difference for the DNN-based method when the furniture is included. There is only a 3 dB difference between DNN with furniture and DNN without furniture, and the DNN can still operate close to the optimal PF scheduling. However, the performance of PF based on LS and MMSE changes significantly when furniture is included in the room. Therefore, it is clear that when furniture is added, PF-LS and PF-MMSE degrade substantially, as was the case in the signal detection problem. Once again, our results show that a deep learning-based system is able to perform better than the traditional methods when we consider specific geometrical configurations of the room that add complexity to the channel frequency response.

D. EFFECT OF CONDITIONAL HOTSPOT MODEL
In this section, we apply the hotspot model to the proposed scheduling technique to see the effect of user behavior on the performance of our learning approach. As mentioned previously, the hotspot model considers an area within the room where users are more likely to be located than elsewhere. We again consider three different user behavior scenarios where the probability of the user being located within the hotspot area is 100%, 80% and 50%, referred to as Hotspot A, Hotspot B and Hotspot C, respectively. Focusing on the case with partial channel knowledge (i.e., with fewer pilots), the performance is compared between our proposed DNN technique and PF based on LS and MMSE. Figure 15 shows the average throughput versus SNR curves for DNN and PF-MMSE when different user behavior models are considered. These effects can also be seen in Table 9, where the performance is compared in terms of SNR penalty against PF with full CSI when we assume a target average throughput of 20 Mbps. Similar to the results in the detection problem, we can see that DNN achieves the best performance even when the knowledge of user behavior decreases. It can also be seen that the gain of the learning technique over PF with partial CSI increases when the dependence on user behavior increases. Similar to the signal detection problem, it is expected that PF based on MMSE and LS estimation works well in a more random scenario where the performance is averaged over different channel conditions, which is why, for Hotspot B and Hotspot C, PF-MMSE and PF-LS provide better performance than for Hotspot A. Meanwhile, for Hotspot A, we are looking at a very specific scenario in which the users are always around the table area. Since in this case MMSE and LS depend fully on the channels within the hotspot area, the channel is highly influenced by the NLOS links due to the nature of the light and the furniture within the room. 
Therefore, it may be challenging for MMSE and LS to estimate the channel accurately, which then leads to worse performance. Hence, this confirms the effectiveness and robustness of the learning method to adapt to different user behavior compared to the conventional techniques.

E. EFFECT OF FIELD OF VIEW
We study the effect of limiting the FOV from 85° to 45° on the performance of each resource allocation scheme. Here, again, one AP provides coverage for the whole room so that the assumption of multiple users within a cell, which underlies the resource allocation problem, is justified under the existing geometry of the room. As expected from the results of the signal detection problem, Figure 16 shows that when the FOV becomes smaller, the performance of all resource allocation schemes degrades, while DNN proves to be robust against the FOV reduction, achieving significantly better performance than PF-LS and PF-MMSE. As stated previously, due to the narrow FOV, the UE in some locations, such as users closer to the walls, may not see the LOS channel, and the contribution of the NLOS channel is smaller.

VI. CONCLUSIONS
In this paper, an indoor LiFi system with a realistic channel model was considered by including specific geometrical configurations and user behavior effects. With these channel models, two learning-based approaches were introduced for improving the performance of signal detection and resource allocation. We compared the performance of the proposed learning methods with that of the conventional algorithms and demonstrated that the learning schemes outperform the traditional methods, as they have the ability to adapt to specific changes in the environment and to different user behavior scenarios. Unlike the conventional techniques, the DL-based methods have been shown to give good performance even when there are irregularities in the system environment. By considering the channel as a black box, the proposed DL methods were able to indirectly estimate the channel and yield high gains in the performance of signal detection and resource allocation, especially with partial CSI and with furniture taken into account. Simulation results show that our DL models, with limited instantaneous knowledge of the channel, were able to perform almost as well as the optimal traditional techniques with perfect CSI. We also demonstrated the robustness of the learning-based schemes in adapting to different user behavior scenarios by implementing user hotspot models. The simulation results confirm our expectation that DL-based schemes operate better than traditional methods when specific indoor scenarios are considered. Future work will include the application of DL to a more complex multi-user system with multiple cells in the presence of user mobility, covering aspects such as handover, indoor positioning, and online learning of DL models when the indoor scenarios change.