A QoE-Based Framework for Video Streaming Over LTE-Unlicensed

To improve the system capacity and accommodate the ever-increasing demand for bandwidth by network users, LTE service providers have turned their attention to the unlicensed industrial, scientific, and medical (ISM) spectrum, which is currently heavily utilized by users of the IEEE 802.11 standards, or WiFi. Unfortunately, such an approach, referred to as LTE-U, causes coexistence problems, which necessitates the development of effective spectrum-sharing mechanisms to mitigate the interference with WiFi users. In this work, we propose a framework for video transmission over LTE-U to achieve harmonious coexistence with WiFi systems while taking into account the quality of experience (QoE) requirements of the user equipment (UE). In the proposed scheme, the channel allocation aims to enhance the peak signal-to-noise ratio (PSNR) and to reduce the end-to-end delay of video packets. We present analytical schemes for predicting the PSNR and delay of the received video sequence based on video parameters transmitted by the video server and periodic feedback from the UE, which provides the signal-to-noise ratio (SNR) of the channel. In addition, the probability of causing interference to WiFi users is used to formulate the channel allocation problem as a multi-objective optimization problem. Taking into account the received video quality and the achieved inter-frame delay for both LTE-U and WiFi users, the simulation results show that the proposed scheme outperforms a reference model that employs a channel access mechanism but randomly assigns frames to the available channels.


I. INTRODUCTION
The number of mobile devices is expected to exceed 12 billion by the year 2022, which will cause the monthly traffic to approach 80 exabytes, more than 50% of which will serve video applications [1]. This expected phenomenal increase in consumer demand for higher data rates requires an efficient use of the available spectrum. However, one major challenge faced by service providers in doing so is that the available licensed spectrum is already fully utilized. Therefore, any increase in the number of users or in the required bandwidth per user will inevitably lead to a reduced quality of experience (QoE) due to spectrum congestion. For video applications such as video streaming, compression techniques are widely employed to reduce the size of the transmitted data. However, the interdependencies [...] significant throughput gain can be achieved. LTE-U can be realized through the deployment of dense small-cell networks and the use of carrier aggregation, with the primary carrier frequency in the licensed spectrum aggregated with one or more secondary carrier frequencies in the unlicensed spectrum. The most suitable unlicensed bands for small-cell deployments are found to be in the 5 GHz region of the industrial, scientific, and medical (ISM) and the unlicensed national information infrastructure (U-NII) bands. LTE-U uses the supplemental downlink (SDL) mode, where downlink (DL) user data may be transmitted over unlicensed frequency channels, but all control information and uplink (UL) data are still transmitted over a licensed channel [6]-[8].
Clearly, the implementation of LTE-U needs to address the interference issues arising from the coexistence with wireless systems that are already operating in the ISM and U-NII bands. One such system, the IEEE 802.11 wireless LAN or WiFi, poses a unique challenge because WiFi channel access employs the contention-based carrier sense multiple access with collision avoidance (CSMA/CA) mechanism, whereas LTE employs a scheduling-based approach. In LTE, the scheduler continues to allocate resources to user equipment (UE) as long as they have data to transmit. This causes WiFi to constantly back off when LTE attempts to access the unlicensed channel. This back-off procedure, which is inherent to CSMA, severely degrades the QoE of WiFi users.
To address the coexistence issues, 3GPP has introduced a technique called licensed-assisted access (LAA) as a part of the LTE-Advanced (LTE-A) standards. LTE-A makes it mandatory for LTE to use listen-before-talk (LBT) when accessing the unlicensed channel [9]. LBT, in a similar fashion to CSMA/CA, mandates that the LTE eNodeB (eNB) senses the channel before any transmission. The use of LBT, however, does not provide any guarantees on the quality of service (QoS) desired by the LTE users, since their equipment can still experience delays and interruptions depending on the WiFi users' utilization of the channel.
To evaluate the WiFi and LTE coexistence problem, two simulation scenarios, indoor and outdoor, are defined by 3GPP TR 36.889 [10]. Some authors considered the indoor scenario as it is more challenging. The high frequency of the unlicensed band leads to greater wall and floor losses inside the building compared with the licensed spectrum. This results in different coverage regions and SINR values for the licensed and unlicensed spectrum. Therefore, for better spectral efficiency, efficient allocation of licensed and unlicensed spectrum for indoor users is required [11], [12]. Furthermore, it is more challenging to maintain fair coexistence due to the close proximity between the eNB and the APs [13], [14].
In addition to the coexistence issue with WiFi, wireless video streaming poses additional challenges on the LTE-U system design due to the stringent quality of experience (QoE) demands associated with video delivery. The time-sensitive nature of the data being transmitted places strict upper bounds on the delays allowed between two consecutive frames so as to maintain continuous playback at the receiver. In addition, the frames in the encoded video sequence vary in terms of their importance, and therefore the loss of the different types of video frames affects the decoding process differently.
In this article, a channel selection scheme for transmitting video over LTE-U is proposed. The proposed scheme takes into account the QoE requirements of UEs and jointly aims to achieve harmonious coexistence with WiFi. The scheme dynamically assigns UE video frames to channels based on three factors: predicted peak signal-to-noise ratio (PSNR), predicted delay, and interference caused to WiFi. The contributions of the paper can be summarized as follows. First, a framework for predicting the achieved PSNR of video frames transmitted over a flat fading channel is presented. Second, a framework for predicting the transmission delay of the video frames to a UE based on the historical average of the UE's throughput is presented. Finally, using these two factors in addition to the probability of causing interference to WiFi, the channel selection problem is formulated as a multi-objective optimization problem with three priority levels, whereby the eNB assigns UE frames to different channels with the aim of maximizing the average PSNR of the reconstructed videos at the UE while minimizing the average frame delay as well as the probability of causing interference to WiFi. Simulation results are presented to verify the effectiveness of our proposed channel selection scheme.
The rest of the paper is organized as follows. Related work is presented in Section II followed by the system model in Section III. The details of the proposed channel selection scheme are described in Section IV. Simulation results are presented in Section V before the paper is finally concluded in Section VI.

II. RELATED WORK
Several mechanisms have been proposed in the literature for enabling the harmonious coexistence of LTE and WiFi in the unlicensed spectrum. Some of these mechanisms build on the already standardized LBT, improving different aspects of it, while others propose different channel access schemes that replace LBT altogether.
The improvements to LBT mainly focus on two aspects: the back-off mechanism and the sensing procedure. For the back-off mechanism, many works proposed the adaptation of the contention window (CW) size, such as [15], where the CW size is adaptively changed by the LTE-U eNB based on the results of previous transmissions, while in [16] the CW size is adjusted based on the available bandwidth in the licensed spectrum and the WiFi traffic load in the unlicensed spectrum. Other works replace the random back-off altogether, such as [17], which proposed a fixed CW size, and [18], where the random back-off is replaced with continuous sensing until a free channel is found. For sensing schemes, the authors in [17] proposed the use of signal detection in addition to energy detection to improve the accuracy of the sensing results. The authors in [19] proposed two channel sensing schemes for LTE-U in which sensing takes either a fraction of a subframe or the whole subframe. The authors in [20] developed an adaptive p-persistent CSMA scheme for LTE-U where access to the channel is carried out in a probabilistic fashion based on the value of a Bernoulli random variable with a mean reflecting the interference level in the channel. The authors in [21] proposed to perform sensing only during the arbitration inter-frame spacing (AIFS) period between two WiFi transmissions in order to divide the channel airtime into two orthogonal airtimes for LTE-U and WiFi. The authors in [22], on the other hand, added a random back-off counter after the sensing period to avoid collision with other LTE-U nodes, and adaptively changed the sensing period duration to achieve different WiFi protection levels.
Other works in the literature replace LBT with other techniques including time division multiplexing (TDM), channel selection, and power control techniques. TDM techniques primarily focus on sharing the channel airtime between LTE-U and WiFi in a fair manner. For example, many works are based on duty cycling methods where LTE transmits for a period of time and is silent in another to provide transmission gaps for WiFi. This includes leaving variable-length coexistence gaps after LTE-U transmission [6], carrier-sensing adaptive transmission (CSAT) [5], allocating a number of silent subframes within the LTE radio frame [23]-[25], and adaptively adjusting the channel access probability and occupancy time of LTE-U [26]. In terms of channel selection techniques, the authors in [5] discussed the idea of dynamic channel selection where small cells periodically measure the interference in multiple channels in the band and choose the channel with the least interference to transmit in. The authors in [27] proposed another channel selection functionality for LTE-U based on Q-learning in which prior experience is used to decide on the best channel to transmit over. In terms of power control techniques, the authors in [28] suggested the use of uplink power control where LTE eNBs and UEs measure the interference in the channel to estimate the presence and proximity of WiFi nodes and adjust their transmission power accordingly. Finally, the authors in [29] proposed a spectrum etiquette protocol for LTE-U in which LTE-U regards WiFi as the primary user with higher priority to transmit and adjusts its transmission power based on the information obtained from decoding WiFi physical-layer (PHY) frames.
As mentioned earlier, wireless video streaming poses additional challenges on the LTE-U system design due to the time-sensitive nature of the data being transmitted. However, there has hardly been any work in the literature covering video transmission over LTE-U. Only the authors in [30] proposed a scheme for video transmission over LTE-U based on adaptively assigning video frames to a number of available channels with the aim of maintaining continuous video playback at the receiver. The work at hand will also tackle the problem of video streaming over LTE-U but using an alternative channel selection scheme that strives to achieve harmonious coexistence with WiFi while taking into account the stringent QoE requirements of the LTE-U UEs.

III. SYSTEM MODEL
The proposed system model is presented in Fig. 1. We consider K WiFi access points (APs) that have overlapping coverage with an LTE home eNB (HeNB) and share a common unlicensed spectrum. Each of the WiFi APs is assumed to operate in one of the $N_{CH}$ unlicensed channels and serve $N_{STA}$ WiFi stations (STAs). The HeNB serves $N_{UE}$ UEs and is assumed to operate over the licensed channel to which it originally has access, in addition to the $N_{CH}$ unlicensed channels used by the WiFi APs. All users are assumed to move randomly within the area in order to model the network dynamics caused by unpredictable user movement. We assume that all APs support the IEEE 802.11n protocol operating in the 5 GHz band. The SDL deployment mode is adopted, where the downlink data is split over the licensed and the $N_{CH}$ unlicensed channels whenever additional capacity is required for video streaming, while the transmission of control and signaling information as well as the uplink data of LTE is kept over the licensed spectrum [8], [31].
In the proposed system, a video server remotely located in the Internet is sending a video sequence to the users. All video files are compressed before transmission to reduce the file size. It is also assumed that the video sequences are compressed using the MPEG-4/JVT or H.26L standards. Therefore, the encoded video sequence contains three types of frames. The first type is intra-coded frames (I-frames), which are used as the source of prediction for the other frames. The second type is predicted frames (P-frames), which are predicted from previous I- or P-frames. The third type is bidirectionally predicted frames (B-frames), which are predicted from the previous and following I- or P-frames. The encoded video file is divided into a sequence of groups of pictures (GoPs), where each GoP begins with an I-frame followed by a number of P- and B-frames. Since I-frames are used as the main source of prediction, they are the most important ones for the decoding process. If an I-frame is lost, then the remaining P- and B-frames in the GoP cannot be correctly decoded, and the video quality typically remains degraded until the next I-frame is correctly received.
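The effect of a lost frame on the rest of a GoP can be illustrated with a minimal sketch. It assumes a simplified dependency model that is not taken from the paper: frames are listed in decoding order, a P-frame depends on the most recent reference (I or P) frame, and a B-frame depends on the two most recent reference frames. The function name `decodable_frames` is hypothetical.

```python
def decodable_frames(gop, lost):
    """Return indices of frames in one GoP that can still be decoded.

    gop  : string of frame types in decoding order, e.g. "IPBBPBB"
    lost : set of indices of frames lost in transmission
    Assumption (simplified model): P depends on the latest reference frame,
    B depends on the two latest reference frames.
    """
    refs = []        # decodability of the reference (I/P) frames seen so far
    ok_frames = []
    for idx, ftype in enumerate(gop):
        received = idx not in lost
        if ftype == "I":
            ok = received
            refs = [ok]                      # a new I-frame restarts prediction
        elif ftype == "P":
            ok = received and bool(refs) and refs[-1]
            refs.append(ok)
        else:                                # B-frame
            ok = received and bool(refs) and all(refs[-2:])
        if ok:
            ok_frames.append(idx)
    return ok_frames


if __name__ == "__main__":
    print(decodable_frames("IPBBPBB", {1}))  # loss of the first P-frame
    print(decodable_frames("IPBBPBB", {5}))  # loss of a single B-frame
```

Under this model, losing a B-frame affects only that frame, whereas losing the I-frame leaves nothing decodable until the next GoP.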

IV. PROPOSED APPROACH
When it comes to real-time communication services such as video transmission, focusing on the technical performance of the network captured by the QoS measures is not enough. Video streaming is an application in which the performance quality is greatly determined by the user perception, which is not necessarily reflected by technical performance metrics. Therefore, QoE is adopted as another quality metric in this study. QoE reflects the general acceptability of the service as subjectively perceived by the end-users [32]. It can be measured using subjective or objective approaches [33]. In this work, we focus on objective approaches since they are more suitable for our implementation. Specifically, we have chosen to focus on two main aspects, namely the average PSNR of the received video and the frame delay. We provide a framework for predicting the expected average PSNR of a received video sequence over a flat fading channel based on the UE's signal-to-noise ratio (SNR) measurements of that channel. We also provide another framework for predicting the delay of a video frame over each channel based on the UE's past experience using these channels. To minimize the interference caused to WiFi, the above two factors are coupled with another that indicates the probability of a channel being already occupied by a WiFi transmission. The three factors are combined into a multi-objective optimization problem, the solution of which assigns the different video frames of the UEs to the different channels. To model which channel a UE is assigned to, we define a binary variable $x_{ij}$ that indicates whether the i-th UE, $i \in \{1, 2, \ldots, N_{UE}\}$, is assigned to the j-th channel, $j \in \{1, 2, \ldots, N_{CH}\}$; in other words:

$$x_{ij} = \begin{cases} 1, & \text{if UE } i \text{ is assigned to channel } j, \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$

In the following subsections, we detail each of the previously mentioned frameworks and show how they are integrated to achieve our goal.

A. SNR-BASED PSNR PREDICTION FRAMEWORK
The factor that has the greatest impact on the user's viewing experience is the quality of the received frame. The most popular metric used to quantify this quality is the PSNR, defined as the ratio of the highest signal power to the corrupting noise power, which is quantified using the mean squared error (MSE) between the pixel values of the original and received video frames. The MSE for the i-th UE over the j-th channel, $\mathrm{MSE}_{ij}$, can be expressed as

$$\mathrm{MSE}_{ij} = \frac{1}{KL} \sum_{k=1}^{K} \sum_{l=1}^{L} \left[ X_{orig}(k,l) - X_{rec}(k,l) \right]^2 \tag{2}$$

where K and L are the dimensions of the video frames in pixels, $X_{orig}$ is the original pixel value, and $X_{rec}$ is the reconstructed pixel value. Since the maximum brightness value a pixel can take, assuming a YUV color encoding system, is 255, the highest possible signal power is simply $255^2$ and, consequently, the PSNR for the i-th UE over the j-th channel is given by

$$\mathrm{PSNR}_{ij} = 10 \log_{10} \left( \frac{255^2}{\mathrm{MSE}_{ij}} \right) \tag{3}$$

Now, the first objective function we formulate is one that aims to maximize the average PSNR of the received video sequence. Since the PSNR cannot be measured by the HeNB beforehand, it has to be predicted. Here, we outline a method for predicting the average PSNR of a received video sequence using the known SNR of the UE over the assigned channel. From (3), it is clear that in order to predict the PSNR, the MSE needs to be predicted first. The method adopted here to perform the needed prediction is based on the one presented in [34], which links the average MSE of a received video to the bit error rate (BER) experienced over a channel. That analysis, however, is carried out for H.263 coded videos transmitted over binary symmetric channels, which use linear forward error correction (FEC) schemes. It also does not take into account the effects of fading on the BER of the received signal. In what follows, we extend that framework to MPEG-4 coded videos transmitted over lossy flat fading channels in LTE systems.

VOLUME 8, 2020
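As a small worked illustration of the MSE and PSNR definitions, the following sketch computes both for a pair of tiny luminance frames stored as nested lists. The helper names `frame_mse` and `frame_psnr` and the pixel values are illustrative assumptions, not part of the original framework.

```python
import math


def frame_mse(orig, rec):
    """Mean squared error between two equally sized frames (lists of rows)."""
    rows, cols = len(orig), len(orig[0])
    total = sum((o - r) ** 2
                for row_o, row_r in zip(orig, rec)
                for o, r in zip(row_o, row_r))
    return total / (rows * cols)


def frame_psnr(mse_value, peak=255):
    """PSNR in dB for a given MSE; identical frames give infinite PSNR."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse_value)


if __name__ == "__main__":
    original = [[100, 100], [100, 100]]        # hypothetical 2x2 frame
    received = [[110, 100], [100, 90]]         # two pixels off by 10
    m = frame_mse(original, received)          # (100 + 0 + 0 + 100) / 4
    print(m, round(frame_psnr(m), 2))
```

In the paper's setting the HeNB cannot evaluate these quantities directly, because it never sees the reconstructed frame; this is why the MSE has to be predicted from channel measurements.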
According to [34], the MSE averaged over all frames can be modeled as a summation of two types of distortion. The first type is the distortion introduced by the encoder due to the loss of information caused by quantization. The second type is introduced by transmission errors, which also propagate through several frames at the decoder due to the temporal dependencies between the video frames. In what follows, these two types of distortion are denoted by $D_e$ and $D_v$, respectively. Hence, the MSE can be expressed as $\mathrm{MSE} = D_e + D_v$. The distortion introduced by the encoder is codec-specific and is independent of the state of the channel. Hence, it is treated as a constant in our analysis. The distortion introduced at the decoder, on the other hand, depends on the packet error rate, which varies with the state of the channel. $D_v$ can be expressed as [34]:

$$D_v = \sigma_{u_0}^2 \, P_e \sum_{t} \alpha(t) \tag{4}$$

where $\sigma_{u_0}^2$ is a constant representing the sensitivity of the decoder to an increase in the error rate, $P_e$ is the residual word error rate, $\alpha(t)$ is the power transfer factor that represents the decay of the error energy after t frames, and the sum runs over the frames through which an error propagates. This power transfer factor takes into account the effects of spatial loop filtering and the intra-coded frames. It is worth mentioning that the maximum number of frames an error can propagate over is the number of frames in one GoP. This is because, at the end of each GoP, a new I-frame is inserted as the beginning of the next GoP. Hence, the source of prediction is now the new I-frame and the propagated error is reset to zero.
In LTE, the data handed from the MAC layer to the PHY layer are divided into units called transport blocks (TBs). These are the units to which a cyclic redundancy check (CRC) is applied in order to detect transmission errors. Hence, in our analysis, we choose to represent the residual error rate, $P_e$, as the block error rate (BLER). The linear relationship between the error energy and the channel error rate, according to [34], is valid only for error rates less than 0.1, beyond which the frame quality is already considerably deteriorated. Therefore, for our implementation purposes, this linear relationship is sufficient. The MSE is thus fully expressed as

$$\mathrm{MSE} = D_e + \sigma_{u_0}^2 \, \mathrm{BLER} \sum_{t} \alpha(t) \tag{5}$$

The only missing factor needed to fully evaluate (5) is the BLER. Unfortunately, the BLER depends on many factors, including the state of the channel as well as the FEC scheme used. In LTE, the TBs are further divided into code blocks, which then undergo channel coding. Due to the use of adaptive modulation and coding, the code rate varies among these blocks, which makes it difficult to obtain a precise analytical expression for the probability of error in each block. Hence, we propose to use an upper bound on the BLER that represents a worst-case scenario. Assuming the bit errors in each TB are independent and identically distributed, the BLER can be simply estimated from the BER as follows:

$$\mathrm{BLER}_{ij} = 1 - \left( 1 - \mathrm{BER}_{ij} \right)^{TBS_i} \tag{6}$$

where $TBS_i$ is the size of the transport block in bits for UE i.
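The worst-case BLER estimate and the resulting MSE model can be sketched numerically as follows. This is a minimal illustration assuming independent bit errors within a transport block and a linear dependence of the transmission distortion on the BLER; the function names and all numeric values (encoder distortion, decoder sensitivity, power transfer factors) are hypothetical placeholders rather than values from the paper.

```python
def bler_from_ber(ber, tbs_bits):
    """Worst-case block error rate: a block fails if any of its bits is wrong."""
    return 1.0 - (1.0 - ber) ** tbs_bits


def predicted_mse(d_e, sigma2_u0, bler, alpha):
    """MSE = encoder distortion + sensitivity * BLER * sum of power transfer factors.

    alpha is a sequence of power transfer factors alpha(t), one per frame
    through which an error propagates (assumed given by the codec model).
    """
    return d_e + sigma2_u0 * bler * sum(alpha)


if __name__ == "__main__":
    bler = bler_from_ber(ber=1e-4, tbs_bits=1000)   # hypothetical TB of 1000 bits
    print(round(bler, 4))
    # Hypothetical codec constants and a short decaying error profile.
    print(predicted_mse(d_e=5.0, sigma2_u0=100.0, bler=0.05, alpha=[1.0, 0.5, 0.25]))
```

Note how quickly the bound grows with the block size: even a modest BER translates into a BLER close to the 0.1 limit of the linear model once a block contains about a thousand bits.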
To predict the BLER of a UE over a channel, the BER of that UE over the channel is thus needed, which is not available to the HeNB. The HeNB, however, has access to the UE's SNR measurements, which can be reported by the UE to the HeNB in each subframe. In what follows, we show how the BER achievable over a channel can be linked to the SNR measurements in that channel for LTE. The first observation is that the BER depends on the modulation scheme used and is generally expressed in terms of the energy per bit to noise power spectral density ratio, $E_b/N_0$. Using the SNR value, $E_b/N_0$ can be found as:

$$\frac{E_b}{N_0} = \mathrm{SNR} \cdot \frac{BW}{R_b} \tag{7}$$

where BW is the bandwidth allocated to the UE by the LTE scheduler and $R_b$ is the achieved data rate. Since LTE employs OFDMA in the PHY layer, the bandwidth can be calculated by multiplying the number of resource blocks (RBs) assigned to the UE by the bandwidth of one RB. Hence, $BW_{ij} = N_{ij}^{RB} \times 180$ kHz, where $N_{ij}^{RB}$ is the number of RBs assigned by the scheduler to UE i over channel j. The data rate $R_b$ varies based on the bandwidth, modulation, and coding schemes used. Since subcarriers from multiple RBs are aggregated together in each LTE subframe and each RB comprises 12 subcarriers, the data rate of UE i over channel j can be calculated as

$$R_b = 12 \, N_{ij}^{RB} \, \log_2(M) \, R_s \tag{8}$$

where M is the modulation level, which varies based on the modulation scheme used, and $R_s$ is the OFDM symbol rate. $E_b/N_0$ can then be used to calculate the expected BER depending on the modulation scheme used. The second observation is that all transmissions undergo fading, which greatly affects the probability of erroneous reception of a block. Therefore, we account for the effects of fading in our analysis as well. We consider a flat fading channel whose magnitude h is assumed to follow the Rayleigh distribution. The average received SNR, $\bar{\gamma}$, is thus defined as

$$\bar{\gamma} = E\left[h^2\right] \frac{E_b}{N_0} \tag{9}$$

where $E[h^2]$ is the mean instantaneous power of the fading channel.
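The conversion from a reported SNR to $E_b/N_0$ can be sketched as below. The sketch assumes that the bit rate is the product of 12 subcarriers per RB, the number of RBs, the bits per symbol $\log_2 M$, and the per-subcarrier OFDM symbol rate, and it ignores the channel code rate. The value of 14 000 symbols per second corresponds to 14 OFDM symbols per 1 ms subframe with normal cyclic prefix; the function names are hypothetical.

```python
import math

RB_BANDWIDTH_HZ = 180e3        # bandwidth of one LTE resource block
SUBCARRIERS_PER_RB = 12        # subcarriers in one resource block


def bit_rate(n_rb, m, symbol_rate):
    """Raw bit rate in bit/s (assumption: no channel-coding overhead)."""
    return SUBCARRIERS_PER_RB * n_rb * math.log2(m) * symbol_rate


def eb_n0(snr_db, n_rb, m, symbol_rate=14_000):
    """Linear Eb/N0 obtained from the SNR in dB, the bandwidth and the bit rate."""
    snr_linear = 10 ** (snr_db / 10)
    bandwidth = n_rb * RB_BANDWIDTH_HZ
    return snr_linear * bandwidth / bit_rate(n_rb, m, symbol_rate)


if __name__ == "__main__":
    # Hypothetical report: 10 dB SNR, 10 RBs, QPSK (M = 4).
    print(eb_n0(snr_db=10, n_rb=10, m=4))
```

Under these assumptions the number of RBs cancels out, so $E_b/N_0$ depends only on the SNR and the modulation order.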
LTE employs three variants of the M-QAM modulation scheme, namely QPSK, 16-QAM, and 64-QAM. The BER of the M-QAM modulation scheme in a Rayleigh fading channel is given by (10) [35]. Substituting (10) back into (6) yields the expression (11) for the BLER of a UE's M-QAM modulated signal in a Rayleigh fading channel. The MSE can then be found using (11) and (5). According to the above-mentioned framework, in order to estimate the PSNR achieved by transmitting a video frame over a certain channel, the HeNB needs the UE's SNR measurement over that channel. The other parameters, $\sigma_{u_0}^2$ and $D_e$, are codec-specific constants and can be provided by the server for each video sequence. The first objective function that we propose is thus expressed as

$$\max_{x} \; \sum_{i=1}^{N_{UE}} \sum_{j=1}^{N_{CH}} x_{ij} \, \mathrm{PSNR}_{ij} \tag{12}$$

where $x_{ij}$ is as defined in (1) and $\mathrm{PSNR}_{ij}$ is the average predicted PSNR of UE i's video over channel j. We are now ready to introduce the second objective function, which helps maintain timely delivery of frames at the UEs.
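For readers who want to experiment with the prediction chain, the sketch below uses a widely known closed-form approximation for the average BER of square M-QAM with Gray mapping over a Rayleigh fading channel, obtained by averaging the nearest-neighbor Q-function approximation over the exponential SNR distribution. This is an assumption chosen for illustration and is not necessarily the expression used in [35]; the function name `ber_mqam_rayleigh` is hypothetical.

```python
import math


def ber_mqam_rayleigh(avg_ebn0, m):
    """Approximate average BER of square M-QAM in Rayleigh fading.

    avg_ebn0 : average received Eb/N0 (linear scale, not dB)
    m        : constellation size (4 for QPSK, 16, 64)
    For m = 4 the expression reduces to the exact QPSK/BPSK result
    0.5 * (1 - sqrt(g / (1 + g))) with g = avg_ebn0.
    """
    k = math.log2(m)                       # bits per symbol
    g = 1.5 * k / (m - 1) * avg_ebn0       # effective SNR per nearest-neighbor term
    return (2.0 / k) * (1 - 1 / math.sqrt(m)) * (1 - math.sqrt(g / (1 + g)))


if __name__ == "__main__":
    for m in (4, 16, 64):
        print(m, ber_mqam_rayleigh(avg_ebn0=100.0, m=m))   # 20 dB average Eb/N0
```

The resulting BER can be fed into the BLER bound of (6) and then into the MSE model of (5) to obtain a predicted PSNR per channel.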

B. THROUGHPUT-BASED FRAME DELAY PREDICTION
In order to maintain continuous video playback at the receiver, each video frame must arrive before its deadline. In other words, the time that it takes the frame to arrive at the receiver must be less than the time remaining until the frame deadline. We estimate that time by dividing the frame size by the expected throughput achievable over the channel, specifically,

$$T_{ijk} = \frac{S_{ik}}{R_{ij}} \tag{13}$$

where $T_{ijk}$ is the time (in seconds) needed to deliver frame k of UE i over channel j, $S_{ik}$ is the size of frame k of UE i in bits, and $R_{ij}$ is the expected throughput of UE i using channel j. The expected throughput over a channel is calculated for each UE in a moving-average fashion similar to [27], where the achievable throughput is a combination of the UE's past experience using the channel and the throughput in the most recent usage of the channel, viz.,

$$R_{ij}(n) = (1 - \alpha) \, R_{ij}(n-1) + \alpha \, r_{ij}(n-1) \tag{14}$$

where $R_{ij}(n)$ is the expected throughput of UE i using channel j at time instant n, $R_{ij}(n-1)$ is the averaged throughput up to time instant n − 1, $r_{ij}(n-1)$ is the throughput achieved in the last usage of the channel, and $\alpha$ is the learning rate, a design parameter that determines how much the calculated value depends on the most recent usage of the channel versus the past experience. After each packet reception, the UE evaluates (14) and reports the value to the HeNB. The second objective function can now be formed by taking the difference between the frame delivery time defined in (13) and the frame deadline. This difference indicates by how much the frame falls short of meeting its deadline. The goal here is to minimize this period so as to guarantee as many timely frame arrivals as possible, viz., the objective function can be written as

$$\min_{x} \; \sum_{i=1}^{N_{UE}} \sum_{j=1}^{N_{CH}} x_{ij} \left( T_{ijk} - \tau_{ik} \right) \tag{15}$$

where $\tau_{ik}$ is the deadline of frame k of UE i.
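The delay prediction can be sketched in a few lines. The example assumes an exponentially weighted moving average of the throughput, with the learning rate weighting the most recent measurement, and estimates the lateness of a frame as its predicted delivery time minus its deadline. All function names and numbers are illustrative assumptions.

```python
def update_throughput(avg_prev, last_measured, learning_rate):
    """Moving-average throughput: blend past average with the latest measurement."""
    return (1 - learning_rate) * avg_prev + learning_rate * last_measured


def delivery_time(frame_bits, throughput_bps):
    """Predicted time in seconds to deliver one frame over a channel."""
    return frame_bits / throughput_bps


def lateness(frame_bits, throughput_bps, deadline_s):
    """Positive if the frame is predicted to miss its deadline, negative otherwise."""
    return delivery_time(frame_bits, throughput_bps) - deadline_s


if __name__ == "__main__":
    # Hypothetical UE: 4 Mbit/s historical average, 2 Mbit/s in the last usage.
    rate = update_throughput(avg_prev=4e6, last_measured=2e6, learning_rate=0.25)
    # Hypothetical 70 kbit frame with a deadline of one frame period at 30 fps.
    print(rate, delivery_time(70_000, rate), lateness(70_000, rate, 1 / 30))
```

A larger learning rate makes the prediction react faster to a sudden drop in throughput, for instance when WiFi activity on the channel increases, at the cost of a noisier estimate.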

C. COEXISTENCE WITH INCUMBENT WiFi
While it is very important to guarantee the quality and timely delivery of video frames to the LTE UEs, it is equally important to limit the interference caused to WiFi. This can be achieved by trying to select the channel that is least likely to be occupied by a WiFi transmission. Hence, we add another factor to our channel selection scheme that reflects the probability of causing interference to WiFi. This factor is obtained from the channel access scheme we propose to use in the unlicensed spectrum, which is the adaptive p-persistent CSMA scheme proposed in [20]. Conventional p-persistent CSMA uses a probabilistic approach when accessing the channel. Prior to transmission, the interference level I in the channel is sensed. If the detected interference is found to be below a predetermined threshold $I_{th}$, a Bernoulli random variable is generated with a probability of success p. If this random variable is equal to 1, the channel is accessed. Otherwise, transmission is deferred to the next time slot. The adaptive p-persistent CSMA proposed in [20] is a modified version of this scheme, in which the probability of success of the Bernoulli random variable is adaptively changed based on the sensed interference level in the unlicensed spectrum. This ensures that the probability of LTE-U accessing the channel is low when WiFi is most likely to be transmitting, thus avoiding collisions. The measured interference power in the channel is modeled as a Gaussian random variable with mean $I_m$ and variance $\sigma_I^2$. The probability of success p is then calculated as the tail probability of the Gaussian distribution, as follows:

$$p = \frac{1}{\sigma_I \sqrt{2\pi}} \int_{I}^{\infty} \exp\left( -\frac{(x - I_m)^2}{2 \sigma_I^2} \right) dx = Q\left( \frac{I - I_m}{\sigma_I} \right) \tag{16}$$

where I is the interference power in dBm measured by the HeNB in the unlicensed channel. The mean and standard deviation of the distribution in the above equation are design parameters used to control the shape of the distribution. By varying those two parameters, different WiFi protection levels can be achieved.
Generally, a narrow distribution whose mean equals the energy detection threshold is preferred, to ensure that the probability of LTE accessing the channel is high when the sensed interference is below the energy detection threshold and low when it is above that threshold. The probability of WiFi transmissions being present over channel j can now be calculated by taking the complement of $p_j$, the access probability evaluated for channel j. Hence, to reduce the interference caused to WiFi, the HeNB should select the channel with the lowest probability of WiFi presence. Unlike other sensing-based channel access schemes such as LBT, in which the channel is not accessed if the interference exceeds a certain threshold, this scheme provides more flexibility in channel access and higher accuracy in estimating the state of the channel. The third objective function can hence be formulated as

$$\min_{x} \; \sum_{i=1}^{N_{UE}} \sum_{j=1}^{N_{CH}} x_{ij} \left( 1 - p_j \right) \tag{17}$$

where $1 - p_j$ is the probability of channel j being occupied by a WiFi device.
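The access probability and the resulting WiFi presence estimate can be sketched with the standard library alone, using the identity $Q(z) = \tfrac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$ for the Gaussian tail. The mean of −62 dBm and the standard deviation of 2 dB used below are illustrative design values, and the function names are hypothetical.

```python
import math


def access_probability(interference_dbm, mean_dbm, sigma_db):
    """Gaussian tail probability Q((I - I_m) / sigma_I) used as the access probability."""
    z = (interference_dbm - mean_dbm) / sigma_db
    return 0.5 * math.erfc(z / math.sqrt(2))


def wifi_presence_probability(interference_dbm, mean_dbm, sigma_db):
    """Complement of the access probability: estimated chance the channel is busy."""
    return 1.0 - access_probability(interference_dbm, mean_dbm, sigma_db)


def least_occupied_channel(measurements_dbm, mean_dbm=-62.0, sigma_db=2.0):
    """Index of the channel with the lowest estimated WiFi presence."""
    return min(range(len(measurements_dbm)),
               key=lambda j: wifi_presence_probability(measurements_dbm[j],
                                                       mean_dbm, sigma_db))


if __name__ == "__main__":
    sensed = [-70.0, -60.0, -80.0]          # hypothetical per-channel readings in dBm
    for level in sensed:
        print(level, access_probability(level, -62.0, 2.0))
    print(least_occupied_channel(sensed))
```

A smaller standard deviation makes the access probability switch more sharply around the mean, which approaches the hard-threshold behavior of LBT.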

D. CHANNEL SELECTION OPTIMIZATION PROBLEM
So far, three factors have been involved in making the decision of which channel to transmit a UE's frame over. These factors are the average UE PSNR, the average UE delay, and the probability of WiFi being present in the channel as defined by the objectives (12), (15), and (17), respectively. These objectives are of conflicting nature. For example, a channel that provides the best PSNR for a UE may also be the channel that is most populated by WiFi. In addition, while a channel with better quality improves the throughput hence decreasing the delay, a channel with a larger number of WiFi users reduces the throughput hence increasing the delay. Due to the conflicting nature of these objectives, we propose to employ multi-objective optimization [37] to solve the channel selection problem.
Since no single solution exists that can simultaneously optimize all objectives, some sort of tradeoff must be managed. This tradeoff is achieved by assigning a different priority for each objective. Three priority levels are thus defined in our formulation for the above factors. Since WiFi performance is somewhat protected by employing the time-sharing channel access scheme discussed in Section IV-C, the WiFi protection factor is given the lowest priority in the optimization formulation. The other two LTE-related factors are given higher priorities. We experiment with two different priority combinations leading to two different formulations of the problem. Each formulation prioritizes a different QoE factor that affects the user's viewing experience. In the first formulation, the quality of the received video is prioritized by giving the PSNR factor the highest priority followed by the frame delay factor. We shall refer to this formulation as the quality-prioritizing formulation. In the second formulation the continuity of playback is prioritized by giving the frame delay factor the highest priority followed by the PSNR factor. We shall refer to this as the continuity-prioritizing formulation.
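In terms of constraints, each UE must be assigned to exactly one channel for the duration of a single frame transmission; hence, the first constraint is expressed as $\sum_{j} x_{ij} = 1$ for every UE i. LTE allows the grouping of the RBs in each channel into RB groups (RBGs). The number of RBs per RBG depends on the type of resource allocation employed by the MAC layer. Each UE is assigned one RBG by the scheduler. Hence, to ensure that the amount of resources assigned to UEs does not exceed the amount of resources available in each channel, we formulate the second constraint as $\sum_{i} x_{ij} \le N_j^{RBG}$ for every channel j, which limits the number of UEs assigned to each channel to the number of RBGs available in that channel, $N_j^{RBG}$. With P1, P2, and P3 denoting decreasing priority levels, the quality-prioritizing formulation is thus given by:

$$\begin{aligned}
\text{P1:}\quad & \max_{x} \; \sum_{i}\sum_{j} x_{ij}\,\mathrm{PSNR}_{ij} && \text{(18a)}\\
\text{P2:}\quad & \min_{x} \; \sum_{i}\sum_{j} x_{ij}\,(T_{ijk} - \tau_{ik}) && \text{(18b)}\\
\text{P3:}\quad & \min_{x} \; \sum_{i}\sum_{j} x_{ij}\,(1 - p_j) && \text{(18c)}\\
\text{s.t.}\quad & \sum_{j} x_{ij} = 1 \quad \forall i && \text{(18d)}\\
& \sum_{i} x_{ij} \le N_j^{RBG} \quad \forall j && \text{(18e)}\\
& x_{ij} \in \{0, 1\} \quad \forall i, j && \text{(18f)}
\end{aligned}$$

while the continuity-prioritizing formulation is given by:

$$\begin{aligned}
\text{P1:}\quad & \min_{x} \; \sum_{i}\sum_{j} x_{ij}\,(T_{ijk} - \tau_{ik}) && \text{(19a)}\\
\text{P2:}\quad & \max_{x} \; \sum_{i}\sum_{j} x_{ij}\,\mathrm{PSNR}_{ij} && \text{(19b)}\\
\text{P3:}\quad & \min_{x} \; \sum_{i}\sum_{j} x_{ij}\,(1 - p_j) && \text{(19c)}\\
\text{s.t.}\quad & \sum_{j} x_{ij} = 1 \quad \forall i && \text{(19d)}\\
& \sum_{i} x_{ij} \le N_j^{RBG} \quad \forall j && \text{(19e)}\\
& x_{ij} \in \{0, 1\} \quad \forall i, j && \text{(19f)}
\end{aligned}$$

Users are allocated to each channel based on the solution to one of these optimization problems, which is carried out on a frame-by-frame basis. The weighted sum technique is a commonly used approach to solve a multiobjective optimization problem, where all the weighted objective functions are combined to form a single-objective optimization problem. Although the formulation is simple, the solution is weight dependent and it is difficult to assign the proper weight to each objective [36]. On the other hand, the lexicographic optimization approach is an attractive multiobjective optimization method in which the objective functions are optimized one at a time in decreasing order of priority [37]. After each step, the optimal value of the previous objective function is added as an equality constraint when optimizing the next objective function. In other words, when optimizing for one objective, only the solutions that would not degrade the higher-priority objectives are considered. Some solvers, such as the one adopted in this work, allow this constraint to be relaxed by specifying a tolerance value for each objective.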
This tolerance indicates the fraction of the optimal value of an objective by which the lower-priority objectives are allowed to degrade it. Although the lexicographic optimization approach is somewhat more complex than the weighted sum technique, it allows the problem to be solved sequentially with reasonable complexity [38]. The complexity of different multiobjective evolutionary algorithms (MOEAs) is presented in [39], and the computational complexity of the lexicographic approach is found to be $T_f G n k + G^2 n k - G n k$, where $n$ is the population size, $k$ is the number of optimization functions, and $G \cdot T_f$ represents the fitness computation time.
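To make the lexicographic procedure with relative tolerances concrete, the following is a minimal sketch on a toy instance. It is not the Gurobi-based implementation used in this work: it enumerates all feasible assignments by brute force, and every number (the PSNR, delay, WiFi-presence, and RBG values) as well as all function names are hypothetical and chosen only for illustration. The tolerance rule used here (keep every candidate within a fraction of the current optimum) is one simple reading of the relaxation described above and may differ in detail from a particular solver's semantics.

```python
from itertools import product

# Hypothetical predictions (rows: UEs, columns: channels). Illustrative only.
PSNR = [[36, 30], [35, 34], [33, 32]]    # predicted PSNR in dB
DELAY = [[40, 20], [45, 15], [50, 25]]   # predicted frame delay in ms
P_WIFI = [0.6, 0.1]                      # probability of WiFi presence per channel
N_RBG = [2, 2]                           # RBGs available per channel

def feasible_assignments():
    """Yield tuples a, where a[u] is the single channel given to UE u and
    no channel receives more UEs than it has RBGs."""
    n_ues, n_ch = len(PSNR), len(N_RBG)
    for a in product(range(n_ch), repeat=n_ues):
        if all(a.count(c) <= N_RBG[c] for c in range(n_ch)):
            yield a

# All objectives are written as quantities to minimize.
def neg_psnr(a):
    return -sum(PSNR[u][c] for u, c in enumerate(a))

def delay(a):
    return sum(DELAY[u][c] for u, c in enumerate(a))

def wifi(a):
    return sum(P_WIFI[c] for c in a)

def lexicographic(objectives, rel_tol):
    """Optimize objectives in priority order. After each step, keep only the
    candidates whose value is within rel_tol * |optimum| of the optimum."""
    pool = list(feasible_assignments())
    for f, tol in zip(objectives, rel_tol):
        best = min(f(a) for a in pool)
        pool = [a for a in pool if f(a) <= best + tol * abs(best) + 1e-9]
    return pool[0]

quality_first = lexicographic([neg_psnr, delay, wifi], [0.0, 0.0, 0.0])
continuity_first = lexicographic([delay, neg_psnr, wifi], [0.0, 0.0, 0.0])
quality_relaxed = lexicographic([neg_psnr, delay, wifi], [0.01, 0.0, 0.0])
print(quality_first, continuity_first, quality_relaxed)
```

On this toy instance the two priority orders select different assignments, and allowing a 1% degradation of the total PSNR lets the quality-prioritizing order reach the lower-delay assignment that the continuity-prioritizing order finds. Brute-force enumeration grows exponentially with the number of UEs, so it is suitable only for illustrating the idea.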
From an implementation point of view, the HeNB sends a reference signal to the UE over each of the channels. Based on the received signal, the UE calculates the SNR for each channel and sends it to the HeNB along with the average past throughput over each channel, the channel quality indicator (CQI) of each channel, and the buffer status report (BSR). This process is repeated in every subframe. For each video frame of each UE, based on the CQI and BSR information received from the UE, the scheduler in the HeNB decides the best allocation of resources to the UE over each channel. Then, using that along with the SNR, throughput, and video frame information, the HeNB calculates the expected PSNR and delay of the frame over each channel. The HeNB also calculates the probability of WiFi presence in each channel based on energy detection results. Using this information, the HeNB decides the best channel to transmit the frame over based on the solution to either (18a) or (19a).
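As a rough illustration of the energy-detection step, the sketch below estimates the probability of WiFi presence in a channel as the fraction of sensing samples whose energy exceeds a threshold. This empirical estimator, the function names, and the default threshold value are assumptions made for illustration; they are not taken from the procedure described in the earlier sections.

```python
def wifi_presence_probability(energy_dbm, threshold_dbm=-62.0):
    """Fraction of sensing samples at or above the detection threshold.
    The default threshold is a placeholder, not a value from this work."""
    if not energy_dbm:
        raise ValueError("at least one sensing sample is required")
    busy = sum(1 for e in energy_dbm if e >= threshold_dbm)
    return busy / len(energy_dbm)

def least_occupied_channel(samples_per_channel, threshold_dbm=-62.0):
    """Index of the channel with the lowest estimated WiFi presence."""
    probs = [wifi_presence_probability(s, threshold_dbm)
             for s in samples_per_channel]
    return min(range(len(probs)), key=probs.__getitem__)

# Hypothetical sensing results (dBm) for two unlicensed channels.
samples = [[-80.0, -60.0, -55.0, -90.0], [-85.0, -88.0, -61.0, -92.0]]
print([wifi_presence_probability(s) for s in samples])
print(least_occupied_channel(samples))
```

In the proposed scheme this probability is only one input to the optimization, alongside the predicted PSNR and delay, so the channel returned by the hypothetical `least_occupied_channel` helper is not necessarily the one finally selected.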
The details of the quality-prioritizing resource allocation algorithm using the lexicographic approach are given in Algorithm 1, with the relative tolerances set to $\delta_{r1}$ and $\delta_{r2}$.

V. PERFORMANCE EVALUATION
We use simulation to investigate the performance of the proposed channel allocation scheme and to study its impact on the QoE of LTE-U and WiFi users. The scheme is simulated using an integration of three software tools, namely NS-3 [40], EvalVid [41], and the Gurobi optimizer [42], as shown in Fig. 2. EvalVid is a video quality evaluation tool that is responsible for encoding the raw video into a compressed MP4 format, generating the hint track, which tells the video server how to packetize the frames for transmission, and generating a trace file that includes information about the video sequence. The trace files are then fed into NS-3, which is responsible for simulating the wireless network architecture, including the LTE HeNBs and UEs, the WiFi APs and STAs, the wireless channels, and the server-client interaction. An EvalVid module integrated within NS-3 reads the content of the trace files, generates video packets accordingly, and transmits them through the designated sockets. The Gurobi optimization software is also integrated within NS-3 as an external library that is called to solve the optimization problem formulated in Section IV-D. The simulation output consists of server and client dump files generated by NS-3, which are then fed to EvalVid to decode the video received by each user. Using the decoded video file, EvalVid then calculates the percentage of lost frames in the video sequence, the end-to-end delay of the frames, and their PSNR, among other parameters. As mentioned, two scenarios are considered by 3GPP for the deployment of LAA [10].
The scenario adopted here is based on the methodology described by 3GPP for indoor deployment, where users are dropped randomly within the coverage of the small cell in the unlicensed band and move at a speed of 3 km/h following any mobility model that represents the user's movement, such as random way-point [43], random walk, or constant speed mobility model [44]. This deployment for an indoor environment is also employed by the authors in [45], [46]. We consider the scenario presented in Fig. 3, where four LTE and WiFi small cells are allowed to co-exist in a single-floor 120 m × 50 m building. The small cells are centered along the shorter dimension of the building and they are spaced along the X-axis by the d and bs-space distances [14], where bs-space is set to 25 m. Each cell serves N users, which move randomly within the building at a speed of 3 km/h. A remote server sends a video sequence to each of the users through the LTE-U HeNB and the WiFi AP. LTE-U uses our proposed channel selection scheme described in Section IV-D. The scheduler used in the simulation is a proportional fair one and the propagation model used is the Rayleigh propagation model. The remaining simulation parameters are summarized in Table 1. The video file used is the "Foreman" test sequence, encoded using an MPEG-4 codec at a bit rate of 2 Mbps. The resolution of the sequence is 352 × 288 and it consists of 300 frames with a total duration of 10 seconds. To evaluate the effectiveness of our proposed scheme, a naive baseline scheme is also simulated. This baseline also employs p-persistent CSMA as the channel access mechanism in the unlicensed channel, but the frames are assigned to the channels in a random fashion.
The first set of results is shown in Fig. 4. Four performance metrics are used to evaluate and compare the performance of the two schemes, namely the average percentage of lost frames per user, the average user PSNR, the average user end-to-end delay, and the average cumulative jitter. These metrics are reported for different numbers of LTE-U UEs. The results show that, for all schemes, as the number of LTE-U UEs increases, the number of lost frames, delay, and jitter for both LTE-U and WiFi users increase, while the average PSNR of the received video decreases. This degradation in performance is expected, as UEs are assigned fewer resources and WiFi users experience more competition in the unlicensed spectrum. For small numbers of UEs, both the optimized and random assignment schemes perform similarly, as the resources available in the channels are enough to accommodate all users. However, as the number of UEs increases, we find that the optimized assignment schemes significantly outperform the random assignment scheme in terms of reducing the percentage of lost frames, average delay, and jitter, as well as improving the average PSNR. More specifically, for the quality-prioritizing assignment, the average PSNR of the received video improved by a maximum of 15  the overall quality of the UE videos improves. In addition, the UEs are assigned to the channels with the least probability of being occupied by a WiFi transmission; hence, the likelihood of collisions is significantly reduced and both systems experience better performance.
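For readers who wish to reproduce the metrics reported in Fig. 4, the sketch below shows one way they could be computed from per-frame records. It is not EvalVid's code: the record layout and function names are hypothetical, and the jitter definition (the sum of absolute delay changes between consecutive received frames) is an assumption that may differ from the definition used by the tool. Only the PSNR formula, $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$, is standard.

```python
import math

def psnr_db(ref, rec, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def stream_metrics(sent, received):
    """sent / received map frame id -> time in ms; frames missing from
    'received' are counted as lost. Returns (lost %, mean delay, jitter)."""
    lost_pct = 100.0 * (len(sent) - len(received)) / len(sent)
    delays = [received[f] - sent[f] for f in sorted(received)]
    mean_delay = sum(delays) / len(delays)
    # Assumed jitter definition: accumulated absolute change in delay
    # between consecutive received frames.
    jitter = sum(abs(d2 - d1) for d1, d2 in zip(delays, delays[1:]))
    return lost_pct, mean_delay, jitter

# Hypothetical records: four frames sent at 25 frames/s, frame 2 is lost.
sent = {0: 0, 1: 40, 2: 80, 3: 120}
received = {0: 30, 1: 75, 3: 160}
print(stream_metrics(sent, received))
print(round(psnr_db([100, 100, 100, 100], [100, 102, 98, 100]), 2))
```

The per-user values obtained this way would then be averaged over all users of each technology to obtain curves of the kind plotted against the number of LTE-U UEs.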
Comparing the two optimized assignment schemes, we find, as expected, that the quality-prioritizing assignment outperforms the continuity-prioritizing assignment in terms of improving the average UE PSNR. This is intuitive, since the PSNR factor has the highest priority in that problem formulation. Following the argument presented in Section IV-D, channels vary in their quality regardless of the number of WiFi devices occupying them. A channel can have a small number of WiFi devices using it, and hence a smaller transmission delay, while at the same time exhibiting poor quality. Hence, by prioritizing quality over delay in the formulation, we significantly improve the PSNR of the received video.
In addition to the video-specific performance metrics, we also evaluate the fairness of each of the channel access schemes by calculating Jain's fairness index between LTE-U and WiFi, which is defined in [47] in terms of $x^L_i$, the throughput of LTE-U UE $i$, and $x^W_j$, the throughput of WiFi STA $j$. We plot the fairness index as a function of the number of LTE-U UEs to test the robustness of each channel selection scheme as the competition in the unlicensed channels increases. The results are illustrated in Fig. 5. The results show that, for the most part, all the schemes achieve comparable performance. However, as the number of LTE-U UEs increases, the optimized assignment schemes outperform the random assignment scheme by a small margin. The comparable performance of all the schemes is attributed to the fact that all of them employ p-persistent CSMA, an access scheme that shares the channel in time with WiFi. So, in terms of the fair division of channel airtime between the two technologies, all schemes perform similarly. The simulation results thus confirm an important observation: for video streaming applications, time-sharing coexistence schemes are not sufficient on their own. Specifically, the random assignment scheme simulated also employs p-persistent CSMA; however, that alone is not enough to guarantee good performance for both technologies. Similarly, simply extending LTE-U operation to multiple unlicensed channels without a specific channel selection criterion is not effective on its own either, even with the use of an effective time-sharing coexistence mechanism. The best performance is achieved by coupling adaptive channel selection schemes with time-sharing channel access mechanisms.
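As a rough illustration of the fairness computation, the sketch below evaluates the standard form of Jain's index, $J = (\sum_k x_k)^2 / (n \sum_k x_k^2)$, over the pooled LTE-U and WiFi throughputs. Pooling the two sets of throughputs into a single list is an assumption made here for illustration; the exact expression used in the evaluation follows [47] and may be organized differently. The throughput values are hypothetical.

```python
def jain_index(throughputs):
    """Standard Jain's fairness index: 1.0 when all values are equal,
    1/n when a single user receives all of the throughput."""
    xs = list(throughputs)
    total = sum(xs)
    sum_sq = sum(x * x for x in xs)
    return total * total / (len(xs) * sum_sq) if sum_sq > 0 else 0.0

# Hypothetical per-user throughputs in Mbps.
lte_u = [2.0, 1.8, 2.1]   # x^L_i for the LTE-U UEs
wifi = [1.5, 1.7]         # x^W_j for the WiFi STAs
print(jain_index(lte_u + wifi))
```

An index close to 1 indicates that LTE-U UEs and WiFi STAs obtain similar throughput, which is the sense in which the schemes in Fig. 5 are compared.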
In order to quantify the gain obtained by LTE-U accessing the unlicensed channel using our proposed channel selection scheme, we simulate another scenario where the whole video is transmitted over the licensed channel only. The simulation results are illustrated in Fig. 6. The results show that a considerable performance improvement can be gained by extending LTE's operation to the unlicensed spectrum using our scheme. For the largest number of LTE-U UEs, the quality-prioritizing scheme achieves a 31.55% reduction in the percentage of lost frames and a 6.43 dB improvement in the average UE PSNR. The reason behind this improvement is that, by aggregating multiple unlicensed channels, the overall bandwidth available for LTE-U is significantly increased, allowing it to accommodate more users who are all requesting bandwidth-hungry applications at the same time. In addition to the increased aggregate bandwidth, LTE-U has more options in terms of the channels available for it to transmit over. Moreover, the licensed channel may not always exhibit the best quality. By calculating the expected PSNR and delay achieved by transmitting the video over each channel, LTE-U can choose the channel for each video frame that is expected to give the best results.

VI. CONCLUSION
In this article, we presented a dynamic channel selection scheme for video transmission over LTE-U that accounts for QoE requirements of the LTE-U UEs while simultaneously striving to minimize interference caused to WiFi. The proposed QoE-based channel selection scheme assigns user frames to different channels based on the solution to a multi-objective optimization problem that takes into account the predicted quality of the received video in each channel, the expected average frame delay, and the probability of causing interference to WiFi transmissions. We have presented a framework for predicting the average PSNR of a received video over a flat fading channel using the SNR measurements of the UE in that channel. We have also presented a framework for predicting the average frame delay over a channel based on the UE's past experiences using the channel.
Through extensive simulations, we have evaluated the performance of this channel selection scheme and compared it to another scheme that employs the same time-sharing channel access mechanism but randomly assigns frames to channels. The simulation results show that the proposed dynamic channel selection scheme achieves significantly better performance in terms of improving the received video quality and reducing the inter-frame delay for both LTE-U and WiFi users. In future work, we propose to integrate the current framework with Dynamic Adaptive Streaming over HTTP and scalable video coding.