Dynamic Resource Allocation for Scalable Video Streaming in OFDMA Wireless Networks

Mobile video streaming is a successful example of Cyber-Physical-Social Systems (CPSS). Scheduling network resources to provide better video streaming services for mobile users is therefore an important problem. Scalable video streaming is regarded as a promising technology in wireless networks where cognitive femtocells are overlaid within the coverage area of a macrocell network. In this paper, we study dynamic resource allocation for scalable video streaming over cache-enabled wireless networks with time-varying channel conditions. We formulate the scalable video streaming problem as a stochastic optimization problem that maximizes the time-averaged system utility subject to the time-averaged video cache constraint at the server and the cross-tier interference constraint on the primary user under a sparse deployment of femtocells. By employing Lyapunov optimization theory, we design a dynamic cache and resource allocation (DCRA) algorithm to solve this problem. The problem is decomposed into three subproblems: video layer selection, cache placement, and wireless resource allocation. By solving these subproblems, we derive the video layer selection and cache placement strategies, and a wireless resource allocation algorithm that manages the cross-tier interference. Simulation results demonstrate the advantages of the proposed DCRA for streaming scalable video over time-varying wireless networks.

spectrum. However, the cross-tier or cotier interference can seriously restrict the network performance. Hence, the concept of interference temperature is introduced to constrain the total allowable interference in a spectral band. In [8], a Fuzzy Logic System (FLS) was employed to estimate the instantaneous channel gain, and a power allocation scheme was proposed for the OFDMA-based femtocell network to maximize the total network capacity in dynamic communication environments. An interference graph was introduced to mitigate the severe inter-cell interference among densely deployed small base stations (SBSs) [9]. A joint power allocation and sensing time optimization algorithm, subject to constraints on the transmit power, the minimum rate requirements, and the interference between the primary user and secondary users, was proposed to maximize the achievable throughput of the CR network [10]. A novel joint spectrum and power management scheme was presented to maximize the total throughput of the full-duplex ultra-dense network (FDUDN) under cross-tier interference and quality-of-service constraints [11]. Lin et al. studied joint user association and spectrum allocation for multi-tier heterogeneous networks (HetNets) in the interference-limited regime [12].
With the rapid advancement of edge computing, a video server located close to mobile users can provide more flexible services for mobile users in CPSS. Scalable Video Coding (SVC) [13]-[16] has been widely used for flexible video streaming by adjusting the number of enhancement layers in response to time-varying wireless channel conditions. For instance, a femtocell equipment (FE) can receive a high-quality video stream containing more enhancement layers when the wireless channel is good, whereas it receives a low-quality video stream containing fewer enhancement layers when the channel is poor. Scalable video streaming in wireless networks has been investigated in many studies. Considering cache placement, video quality decision and wireless resource allocation, a dynamic cache algorithm was presented to maximize video quality and backhaul saving in single cache-enabled vehicular networks [17]. A dynamic resource allocation algorithm for streaming scalable videos over SDN-aided dense small-cell networks was presented in [18], which aims to maximize the time-averaged quality of experience (QoE). To address the problems caused by vehicle mobility and hard service deadline constraints, a deep reinforcement learning algorithm with a multi-timescale framework that jointly optimizes the caching and computing allocation strategy was proposed in [19]. Zhao et al. proposed a dynamic bitrate adaptation scheme to maximize the user's QoE over heterogeneous wireless networks by considering fundamental uncertainties of wireless networks (i.e., the stochastic throughput) [20]. An optimal resource allocation for a downlink non-orthogonal multiple access (NOMA) system was proposed in [21] to maximize a long-term network utility by jointly optimizing the data rate control and the power allocation among multiple users. However, none of these works considered interference management in spectrum-sharing femtocell deployments.
In this paper, we study video layer selection, cache placement, and wireless resource allocation in OFDMA-based wireless networks under the cross-tier interference temperature constraint and the network stability constraint. The key contributions of this paper can be summarized as follows:
• Since the femtocell is enabled with cognitive capabilities, the scheduling of video streaming in the femtocells should not affect the data transmission of primary macrocells. Thus, the cross-tier interference is considered in this paper.
• We formulate the scheduling of video streaming as a novel stochastic optimization problem to maximize the time-averaged system utility, which takes into account three factors of video streaming (video quality, quality switching, and cache capacity) under the constraints of the cross-tier interference limit and network stability. By exploiting the Lyapunov optimization technique, the formulated optimization problem is further decomposed into three distinct and tractable subproblems.
• Without requiring complete knowledge of the wireless channel statistics, we develop a dynamic cache and resource allocation (DCRA) algorithm to efficiently solve the three subproblems at each time slot, which target video layer selection, cache placement, and wireless resource allocation, subject to the interference temperature limit constraint and the network stability constraint.
• We theoretically analyze the performance of the proposed DCRA algorithm, and we conduct extensive simulation experiments to verify the theoretical analysis and the advantages of DCRA.
The remainder of the paper is outlined as follows. Section II elaborates the system model. In Section III, we present the mathematical problem formulation. Section IV is devoted to deriving our joint video layer selection, cache placement, and wireless resource allocation conceived for scalable video streaming. Section V presents our experimental results for characterizing the attainable performance of the proposed approach. Finally, Section VI concludes the paper.

II. THE SYSTEM MODEL
As shown in Fig. 1, we consider a downlink OFDMA-based cache-enabled wireless network, where I cochannel cognitive femtocells are overlaid randomly within the coverage area of a primary macrocell. There are E active macrocell equipment (MEs) in the macrocell and J FEs asking for video services in each femtocell. We assume that all femtocells operate in closed access, where only specified registered FEs can communicate with their femtocell base station (FBS), and MEs can only access their macrocell base station (MBS). A video server is collocated with the MBS to provide video storage and streaming services to FEs and MEs. All femtocells can communicate with the primary macrocell via optical fibers. It should be noted that we consider the sparse deployment scenario of femtocells within the primary macrocell [22], which results in a higher cross-tier interference between femtocells and the macrocell than the cotier interference between neighboring femtocells. Each FBS maintains a transmission queue containing the video data from the video server that should be sent to each preregistered FE. In addition, we assume the wireless network operates in a slotted structure, with time slots indexed by t ∈ {0, 1, 2, ...}.

A. SCALABLE VIDEO STREAMING MODEL
In this paper, the SVC technique is adopted to support adaptive video transmission due to its beneficial flexibility in adjusting the quality of video streams. Each video is encoded into a base layer containing the minimum quality representation and (L − 1) enhancement layers for additional quality.
The Mean Opinion Score (MOS) [18] and the Peak Signal to Noise Ratio (PSNR) [23] are commonly employed as metrics to measure the video quality. Without loss of generality, we utilize the MOS level as the QoE metric to quantify the perceived video quality in the problem formulation. Let l ij (t) and q ij (t) denote the number of video layers and the corresponding MOS level for FE j in femtocell i at time slot t, respectively. q ij (t) is a video quality function of l ij (t), as given in (1). Let s ij (t) denote the data packet size of the video slice containing the first l ij (t) video layers, as given in (2). Generally, more enhancement layers mean a higher MOS level and a larger volume of video data. For example, for a video with four enhancement layers, i.e., l ij (t) ∈ {1, 2, 3, 4, 5}, the MOS level can be expressed on the common 5-point scale (1: bad, 2: poor, 3: fair, 4: good, and 5: excellent) [24], and the data packet size can also be expressed in five levels (1: 30 kb/s, 2: 60 kb/s, 3: 120 kb/s, 4: 250 kb/s, and 5: 500 kb/s).
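The five-layer example above can be sketched as a pair of lookup tables. This is only an illustration: the function names are hypothetical, the identity mapping from layer count to MOS level is an assumption read off the example, and the rate-fitting helper is not part of the paper.

```python
# Values taken from the five-layer example in the text.
MOS_LABELS = {1: "bad", 2: "poor", 3: "fair", 4: "good", 5: "excellent"}
RATE_KBPS = {1: 30, 2: 60, 3: 120, 4: 250, 5: 500}


def mos_level(num_layers: int) -> int:
    """MOS level for a slice carrying the first `num_layers` layers.

    Assumption: the MOS level equals the layer count, as in the example.
    """
    if num_layers not in MOS_LABELS:
        raise ValueError(f"unsupported layer count: {num_layers}")
    return num_layers


def slice_rate_kbps(num_layers: int) -> int:
    """Data volume level (kb/s) of a slice with `num_layers` layers."""
    return RATE_KBPS[num_layers]


def best_layers_for_rate(available_kbps: float) -> int:
    """Hypothetical helper: largest layer count whose rate fits the budget.

    Falls back to the base layer when even 30 kb/s is not available.
    """
    feasible = [l for l, r in RATE_KBPS.items() if r <= available_kbps]
    return max(feasible) if feasible else 1


if __name__ == "__main__":
    layers = best_layers_for_rate(130)
    print(layers, MOS_LABELS[mos_level(layers)], slice_rate_kbps(layers))
```

The table makes the trade-off concrete: each extra layer raises the MOS level by one step but roughly doubles the data volume.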
The manager uses the prevailing network conditions to perform video layer selection, which adjusts the number of enhancement layers, and cache placement. However, the interval between video layer selection and cache placement decisions is much longer than the physical transmission time slot. This is because video layer switching and cache placement are usually performed once per video slice, whose duration is hundreds of milliseconds. In contrast, wireless resource allocation is usually conducted on the order of several milliseconds, which is the timescale on which wireless channel conditions change due to high user mobility. For simplicity, we assume that the duration of each video slice is constant and equals T times the physical transmission time slot length. Hence, both video layer selection and cache placement are performed every T physical transmission time slots.

B. CACHE PLACEMENT POLICY
Let x ij (t) denote the binary cache placement decision variable for FE j in femtocell i at time slot t, where x ij (t) = 1 if the selected video slice is cached in the video server and x ij (t) = 0 otherwise. Since the cache space of video servers is limited, a fixed number of cache partitions is maintained for each FE to cache the video data. Thus, the cache space constraint for each FE is given by (3), where η ij is the time-averaged value of the maximum number of cache partitions of the video server for FE j in femtocell i.

C. RADIO RESOURCE MODEL
To improve the transmission rate and alleviate cross-tier interference, each FBS optimizes the wireless resource allocation (e.g., transmit power allocation for each subchannel and subchannel assignment for all FEs) at the beginning of each physical transmission time slot. By employing the OFDMA technique, the channel is divided into K subchannels, each with a bandwidth of W. Based on the standard assumption in OFDMA networks [25], each subchannel k can be assigned to at most one FE at each time slot. The binary subchannel assignment variable for the access link is denoted as a ijk (t) ∈ {0, 1}: if subchannel k is assigned to FE j in femtocell i at time slot t, a ijk (t) = 1; otherwise, a ijk (t) = 0. This assumption yields the subchannel assignment constraint (4). Let p f ijk (t) denote the transmit power from FBS i to FE j on subchannel k at time slot t. We assume that the maximum transmit power of each FBS at each time slot t is limited by a predefined threshold P max , which yields the transmit power constraint (5). We employ the interference temperature limit to protect the data transmission of the primary macrocell by constraining the cross-tier interference from the cognitive femtocells to primary macrocell user e to a predefined interference level I th k on each subchannel. Let g f iek (t) denote the interference gain on subchannel k from FBS i to ME e in time slot t; the resulting cross-tier interference constraint is (6). Let g f ijk (t) denote the channel gain on subchannel k from FBS i to FE j in time slot t, and let g m ijk (t) denote the interference channel gain on subchannel k from the MBS to FE j in femtocell i in time slot t.
Then, the instantaneous received signal-to-interference-plus-noise ratio (SINR) from FBS i at FE j on subchannel k at time slot t, denoted by γ f ijk (t), is given by
$$\gamma^{f}_{ijk}(t) = \frac{p^{f}_{ijk}(t)\, g^{f}_{ijk}(t)}{p^{m}_{ek}\, g^{m}_{ijk}(t) + \sigma^{2}}, \tag{7}$$
where p m ek is the transmit power on subchannel k from the MBS to ME e at time slot t, p m ek g m ijk (t) is the interference power received at FE j in femtocell i from the MBS transmission to ME e on subchannel k at time slot t, and σ 2 represents the power spectral density of additive white Gaussian noise.
According to Shannon's capacity formula, the instantaneous received data rate from FBS i at FE j on subchannel k at time slot t is given by
$$C_{ijk}(t) = W \log_{2}\bigl(1 + \gamma^{f}_{ijk}(t)\bigr). \tag{8}$$
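The SINR and Shannon-rate relations described in this subsection can be evaluated numerically as in the sketch below. It assumes the usual form, with the femtocell signal power in the numerator and the macrocell interference plus noise in the denominator; the function and parameter names are hypothetical, and the numbers in the usage lines are arbitrary.

```python
import math


def sinr(p_f: float, g_f: float, p_m: float, g_m: float, noise: float) -> float:
    """SINR at an FE on one subchannel.

    p_f, g_f: FBS transmit power and FBS-to-FE channel gain (desired signal).
    p_m, g_m: MBS transmit power and MBS-to-FE gain (cross-tier interference).
    noise:    noise power term (sigma^2 in the text).
    """
    return (p_f * g_f) / (p_m * g_m + noise)


def subchannel_rate(bandwidth_hz: float, sinr_value: float) -> float:
    """Shannon rate in bit/s for a subchannel of the given bandwidth."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)


if __name__ == "__main__":
    gamma = sinr(p_f=0.1, g_f=1e-6, p_m=1.0, g_m=1e-9, noise=1e-9)
    print(gamma, subchannel_rate(1e6, gamma))
```

Summing `subchannel_rate` over the subchannels assigned to an FE gives that FE's achievable rate in a slot.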

D. DYNAMIC QUEUE SETUP
Based on the system model introduced above, the data queue Q ij (t) for FE j in femtocell i at time slot t evolves according to the queue dynamics in (9), where $C_{ij}(t) = \sum_{k=1}^{K} a_{ijk}(t)\, C_{ijk}(t)$ denotes the achievable data rate at FE j in femtocell i at time slot t, and $[x]^{+} \triangleq \max(x, 0)$. To ensure network stability, we must guarantee that the queues are strongly stable. Thus, we have the following definition.
Definition 1: An individual queue Q ij (t) is called strongly stable if the following condition holds:
$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{Q_{ij}(\tau)\} < \infty. \tag{10}$$
Accordingly, a network is stable if the time-averaged lengths of all queues in the network are finite [26].
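A minimal sketch of a data queue of this kind is given below. It assumes the generic Lyapunov-style update, in which the backlog is first drained by the service and then increased by the new arrivals; the exact update used in the paper is the one in (9), and all names here are hypothetical.

```python
from typing import List, Sequence


def update_queue(backlog: float, served: float, arrived: float) -> float:
    """One-slot update: drain by `served` (floored at zero), then add arrivals."""
    return max(backlog - served, 0.0) + arrived


def simulate_queue(arrivals: Sequence[float], services: Sequence[float]) -> List[float]:
    """Backlog after each slot, starting from an empty queue."""
    backlog = 0.0
    trace = []
    for arrived, served in zip(arrivals, services):
        backlog = update_queue(backlog, served, arrived)
        trace.append(backlog)
    return trace


def time_averaged_backlog(trace: Sequence[float]) -> float:
    """Empirical counterpart of the time average in the stability definition."""
    return sum(trace) / len(trace)


if __name__ == "__main__":
    trace = simulate_queue(arrivals=[5, 5, 5], services=[3, 10, 2])
    print(trace, time_averaged_backlog(trace))
```

Strong stability asks that this running average stay bounded as the horizon grows, which is what the control parameter introduced later trades off against utility.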

III. PROBLEM FORMULATION
In this section, we first formulate the system utility function by considering three time-averaged factors: video quality, quality variation, and cache capacity. The instantaneous system utility function is defined in (11), where w 1 , w 2 and w 3 are the weights of video quality, quality variation and cache capacity, respectively. Since our aim is to maximize the time-averaged system utility, the system utility maximization of our scalable video streaming problem in wireless networks is formulated as the stochastic optimization problem P in (12), whose decision variables are the vector of the numbers of enhancement layers for the selected video slices, the cache placement vector, the transmit power allocation vector, and the subchannel assignment vector of the wireless network. C1 and C2 represent the cache placement constraints. C3 is the video layer selection constraint. C4 and C5 are the transmit power allocation constraints. C6 and C7 are the subchannel assignment constraints due to the OFDMA assumption. C8 requires that the cross-tier interference power on subchannel k not exceed I th k . C9 is the network stability constraint. Theoretically, Problem P is a classical constrained stochastic optimization problem and can be solved using dynamic programming (DP)-based methods [28] or deep learning techniques [29], [30] if prior knowledge of the stochastic process governing the channel state information (CSI) is available. However, these solutions are computationally complex and suffer from the curse of dimensionality. Moreover, it is difficult to obtain prior knowledge of the CSI in real wireless networks. This motivates us to employ Lyapunov optimization theory [26], [27] to develop an online cache placement and wireless resource allocation algorithm, since an algorithm designed with the Lyapunov optimization technique does not rely on any prior statistical knowledge of the wireless network and also has low computational complexity.

IV. DYNAMIC CACHE AND RESOURCE ALLOCATION ALGORITHM
In this section, we design a dynamic cache and resource allocation (DCRA) algorithm by invoking Lyapunov optimization theory for transforming the time-averaged stochastic optimization problem into a static minimization problem at each time slot. Furthermore, the minimization problem is divided into three independent subproblems including video layer selection, cache placement, and wireless resource allocation.

A. LYAPUNOV STOCHASTIC OPTIMIZATION FORMULATION
Since the constraint C1 is based upon time-averaged values, a virtual queue Y ij (t) is established to transform the time-averaged constraint on the cache storage; its evolution is modeled in (13). Then, the time-averaged constraint C1 can be satisfied by maintaining the stability of the virtual cache queue Y ij according to the rate stability theorem [26].
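The way a virtual queue enforces a time-averaged constraint can be illustrated with a short sketch. The update below is the generic form from Lyapunov optimization (backlog grows by the per-slot usage and shrinks by the per-slot budget); it is an assumption for illustration and not necessarily the exact form of (13). All names are hypothetical.

```python
from typing import List, Sequence


def update_virtual_queue(y: float, usage: float, budget: float) -> float:
    """Generic virtual-queue update: accumulate usage in excess of the budget."""
    return max(y + usage - budget, 0.0)


def run_virtual_queue(usages: Sequence[float], budget: float) -> List[float]:
    """Virtual backlog after each slot, starting from zero."""
    y = 0.0
    trace = []
    for usage in usages:
        y = update_virtual_queue(y, usage, budget)
        trace.append(y)
    return trace


if __name__ == "__main__":
    usages = [1, 1, 0, 1]      # e.g. per-slot cache placement decisions
    budget = 0.5               # time-averaged budget (eta in the text)
    trace = run_virtual_queue(usages, budget)
    n = len(usages)
    # The average usage exceeds the budget by at most (final backlog) / n.
    print(trace, sum(usages) / n, budget + trace[-1] / n)
```

Because the average violation is bounded by the final backlog divided by the horizon, keeping the virtual queue from growing linearly is enough to meet the time-averaged constraint in the long run.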
By introducing the virtual cache queue Y ij , we can formulate the transformed problem P1 in (14). To solve problem P1, we define the quadratic Lyapunov function in (15), whose argument is a concatenated vector of the data backlog and the virtual cache queue length. Furthermore, the corresponding T -slot conditional Lyapunov drift Δ T (t) from time slot t to time slot (t + T ) is defined in (16). Since the objective of the joint video layer selection, cache placement, and wireless resource allocation is to maximize the system utility subject to the constraints, we combine the Lyapunov drift and the instantaneous system utility over T time slots, and define the T -slot Lyapunov drift-minus-reward term within the Lyapunov optimization framework as in (17), where β is a positive constant weight factor that tunes the tradeoff between the system utility and network stability in our control strategy. According to the design rule of the Lyapunov optimization technique, the video layer selection, cache placement, and wireless resource allocation decisions are obtained by minimizing the upper bound of the Lyapunov drift-minus-reward term at each time slot. The following theorem gives this bound.
Theorem 1: For any queue backlogs and actions, the drift-minus-reward term is upper bounded as in (18), where B is a positive constant defined in (19).
Proof: See Appendix A.
Exploiting Theorem 1, we reformulate P1 to minimize the Right-Hand Side (R.H.S) of (18) at each time slot, subject to the instantaneous constraints in P1.
We can observe that the queue backlogs, Q ij (t) and Y ij (t), impact the upper bound of the drift-minus-reward term. Moreover, Problem P depends on the queue state information (QSI) and the current network parameters, including the CSI, which is reported back to the manager at each FBS over a feedback channel without any delay or error. Then, each FBS performs video layer selection, cache placement and wireless resource allocation with the help of the MBS based on this information at the beginning of each time slot.
Theorem 2: For any positive value of β, the time-averaged system utility SU sub obtained by solving the Problem P satisfies
$$SU^{sub} \ge SU^{*} - \frac{B}{\beta T}, \tag{20}$$
where SU * is the optimal system utility of the Problem P.
Proof: See Appendix B. Theorem 2 implies that the sub-optimal time-averaged system utility is within O(1/β) of the optimal system utility SU * . VOLUME 8, 2020 That is, when β is sufficiently large, the sub-optimal system utility asymptotically approaches the optimum system utility. Meanwhile, the control parameter β controls the tradeoff between the system utility and queue stability. Specifically, a larger value of β increases the system utility but may degrade the stability of the data queue Q(t) and the virtual cache queue Y (t). Hence, an experiment is conducted in Section V to analyze the sensitivity of the performance to the tradeoff factor β.

B. ALGORITHM DESIGN
In this section, we design the proposed DCRA method to achieve the optimal video layer selection, cache placement, and wireless resource allocation.
The pseudo code of the DCRA algorithm is detailed in Algorithm 1, which performs the following three operations: (1) joint video layer selection and cache placement, which determines the quality of the video slice and the cache placement for each FE; (2) wireless resource allocation in each FBS, which performs transmit power allocation for each subchannel and subchannel assignment for its FEs; (3) queue updating for Q(t) and Y (t). Compute X(τ ) according to (22).

1) JOINT VIDEO LAYER SELECTION AND CACHE PLACEMENT
After decoupling the video layer selection l ij (t) and cache placement x ij (t) from the R.H.S of (18) and rearranging the objective function, the subproblem for joint video layer selection and cache placement is obtained as (21), which is a mixed combinatorial programming problem.
Following a technique similar to that in [17], the optimal cache placement solution is given by (22). Then, by substituting the optimal cache placement decision x ij (t) into (21), the subproblem can be recast as (23). Since a video sequence is encoded into a very limited number of layers, we can use a brute-force method to find the optimal number of video layers by enumerating all possible layers l ij (t) ∈ {1, 2, ..., L} at each time slot.
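The brute-force enumeration over layers can be sketched as follows. The per-layer score used here (a β-weighted quality reward minus a backlog-weighted data volume) is an illustrative assumption in the spirit of drift-minus-reward objectives, not the objective in (23); the function name and the sample tables are hypothetical.

```python
from typing import Dict


def select_layers(backlog: float, beta: float,
                  rates: Dict[int, float], quality: Dict[int, float]) -> int:
    """Enumerate every layer count and return the one with the best score.

    Assumed score: beta * quality(l) - backlog * rate(l).
    Ties are resolved in favour of the smaller layer count.
    """
    best_layer = None
    best_score = float("-inf")
    for layer in sorted(rates):
        score = beta * quality[layer] - backlog * rates[layer]
        if score > best_score:
            best_layer, best_score = layer, score
    return best_layer


if __name__ == "__main__":
    rates = {1: 30, 2: 60, 3: 120, 4: 250, 5: 500}   # kb/s levels from the text
    quality = {l: float(l) for l in rates}           # MOS level per layer count
    for q in (0.0, 10.0, 100.0):
        print(q, select_layers(q, beta=1000.0, rates=rates, quality=quality))
```

With only L candidates per FE, the enumeration costs O(L) per decision, and a growing backlog pushes the choice toward fewer layers.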

2) WIRELESS RESOURCE ALLOCATION
We tackle the wireless resource allocation subproblem embedded in (18), where the transmit power allocation P(t) and subchannel assignment A(t) are determined by solving the maximization over P(t) and A(t) in (24), subject to C4, C5, C6, C7, and C8. The optimization problem in (24) is a nonconvex mixed integer programming problem. To solve this problem, we first relax the subchannel assignment constraint C6 so that a ijk (t) takes values in the continuous interval [0, 1]. Then, the original problem (24) is transformed into a convex problem and can be solved by employing the Lagrangian dual decomposition method [22]. Finally, we obtain the optimal transmit power allocation and subchannel assignment in the following theorem.

Theorem 3 (Optimal Wireless Resource Allocation): The optimal power allocation and subchannel assignment decisions in (24) are given by (25) and (26), respectively, where λ i and θ k are the Lagrange multipliers associated with the instantaneous transmit power allocation constraint C5 and the cross-tier interference constraint C8, respectively, and the auxiliary term is defined in (27). We employ a subgradient method to update the Lagrange multipliers, as given in (28) and (29).
Algorithm 2 Wireless Resource Allocation (WRA)
1: Input: λ, θ, maximum iteration number F max , and convergence factors
2: Initialization: f ← 1
3: while not converged and f < F max do
4:   for i = 1 to I do
5:     for j = 1 to J do
6:       for k = 1 to K do
7:         each FBS computes p f * ijk (t) according to (25).
8:         each FBS computes a * ijk (t) according to (26).
9:         each FBS updates λ according to (28).
10:       end for
11:     end for
12:   end for
13:   Primary MBS updates θ according to (29), then sends these values to all FBSs.
14:   f ← f + 1
15: end while
Algorithm 2 describes the pseudo code to accomplish the wireless resource allocation.
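The dual-decomposition idea behind Algorithm 2 can be illustrated on a deliberately simplified case: one FBS, one FE per subchannel, and only the total power constraint. The closed-form power below is the classical water-filling solution for rates measured in nats, not the expression in (25); the interference multiplier θ and the subchannel assignment step are omitted, and all names and step sizes are assumptions.

```python
from typing import List, Sequence, Tuple


def water_filling_power(lam: float, gains: Sequence[float]) -> List[float]:
    """Per-subchannel power for multiplier `lam`: p_k = [1/lam - 1/h_k]^+.

    `gains` holds normalized gains h_k (channel gain over noise-plus-interference).
    """
    return [max(1.0 / lam - 1.0 / h, 0.0) for h in gains]


def allocate_power(gains: Sequence[float], p_max: float, step: float = 0.05,
                   max_iter: int = 500, tol: float = 1e-9,
                   lam: float = 1.0) -> Tuple[List[float], float]:
    """Subgradient iteration on the power-constraint multiplier.

    The multiplier is lowered while power is left unused and raised when the
    budget is exceeded, until the constraint is met within `tol`.
    """
    for _ in range(max_iter):
        slack = p_max - sum(water_filling_power(lam, gains))
        if abs(slack) < tol:
            break
        lam = max(lam - step * slack, 1e-12)  # projected subgradient step
    return water_filling_power(lam, gains), lam


if __name__ == "__main__":
    powers, lam = allocate_power(gains=[1.0, 0.5, 0.1], p_max=2.0)
    print(powers, lam)
```

In this toy instance the weakest subchannel receives no power, which mirrors how the full algorithm concentrates power on subchannels with good gains and low interference cost.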

V. SIMULATION RESULTS AND DISCUSSIONS
In this section, we evaluate the performance of the proposed DCRA algorithm. In the simulations, the radii of the macrocell and each femtocell are set to 500 m and 20 m, respectively. The femtocells are uniformly and randomly distributed within the area of the macrocell. The bandwidth of each subchannel is W = 10 MHz, p m ek = 1 W, and P max = 0.1 W. The number of femtocells is I = 3 and each femtocell has J = 4 FEs. The fading gain g(t) between an FE and the FBS/MBS is characterized by the path loss and shadow fading as $g(t) = 10^{(-P_d(d) + B)/10}$, where the path loss model is $P_d(d) = 38.4 + 20\log_{10}(d)$, d is the distance from an FE to an FBS/MBS, and B is a random variable with zero mean and a deviation of 6 dB. The maximum number of video layers is L = 5. The MOS-rate function of the videos is characterized by q = bl, where l ∈ {1, 2, . . . , L}. Fig. 2 depicts the data backlog, MOS level and system utility for β ∈ {50, 500, 1000, 1500, 2000, 2500}. As shown in Fig. 2, the time-averaged system utility achieved by the DCRA algorithm converges gradually to the optimal value as β increases, which validates Theorem 2. However, a larger value of β increases the upper bound of the data queue backlog, leading to network congestion. Therefore, the value of β adjusts the tradeoff between the system utility and the queue backlog. Fig. 3 shows the network performance for different cross-tier interference limits. With an increasing I th k , we observe that the queue backlog decreases, and both the MOS level and the system utility increase. Intuitively, this is because a larger value of I th k means that more transmit power can be allocated to the FEs for transmission, and thus the network performance improves. Fig. 4 shows the network performance when varying the video size parameter b. It can be seen that the queue backlog increases and the MOS level decreases with a larger value of b.
However, we also observe that the system utility increases rapidly with b when b < 15, and then decreases slowly when b > 15. The reason is that a larger value of b means a larger data packet size for each enhancement layer, so the queue backlog increases and the MOS level is reduced to maintain network stability. From the trend of the system utility, we can conclude that there exists a suitable b that maximizes the system utility. Fig. 5 plots the network performance for different values of the network bandwidth W . We can observe that the queue backlog decreases as W increases, and that the MOS level and system utility increase when W < 1.1 MHz, then grow more slowly and start to stabilize when W > 1.1 MHz. This is because more network bandwidth allows more video data to be transmitted to users within the same video slice time, so both the MOS level and the system utility increase. Fig. 6 shows the MOS level and system utility for different values of η. We can observe that both the MOS level and the system utility increase with η. This is because a large value of η denotes a large cache space for each FE. Fig. 7 shows the channel interference at time slot t = 500 for the K = 20 subchannels, with and without interference constraints. The cross-tier interference level is I th k = 1.5 × 10^−10. We can observe that the cross-tier interference is kept under the predefined level I th k , so the interference temperature limit policy can protect the MEs from severe cross-tier interference.
Finally, we compare the network performance of DCRA with two baselines in Fig. 8 and Fig. 9. Baseline 1 is the dynamic cache algorithm in [17], which maximizes the video quality and cache capacity without considering transmit power allocation and subchannel assignment. Baseline 2 is the video quality maximization algorithm without caching at each FBS. As shown in Fig. 8, the queue length of Baseline 1 is the largest among the three algorithms. This is because Baseline 1 adopts constant transmit power without considering subchannel assignment. Additionally, as shown in Fig. 9, Baseline 2 achieves the worst network performance among the three algorithms, since it ignores video caching at the video server. Finally, we can conclude that DCRA achieves the best network performance and has a small queue length.
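The channel model used in the simulation setup of this section can be sketched as below. The path loss and gain formulas follow the text; treating the shadowing term B as Gaussian in dB and the distance as being in meters are assumptions, since the text only gives a zero mean and a 6 dB deviation. Function names are hypothetical.

```python
import math
import random


def path_loss_db(distance_m: float) -> float:
    """Path loss P_d(d) = 38.4 + 20 log10(d) in dB (distance assumed in meters)."""
    return 38.4 + 20.0 * math.log10(distance_m)


def channel_gain(distance_m: float, shadowing_db: float = 0.0) -> float:
    """Linear gain g = 10^((-P_d(d) + B) / 10) for a given shadowing value B."""
    return 10.0 ** ((-path_loss_db(distance_m) + shadowing_db) / 10.0)


def sample_channel_gain(distance_m: float, rng: random.Random,
                        sigma_db: float = 6.0) -> float:
    """Draw one gain with zero-mean shadowing (Gaussian in dB is an assumption)."""
    return channel_gain(distance_m, rng.gauss(0.0, sigma_db))


if __name__ == "__main__":
    rng = random.Random(0)
    print(path_loss_db(10.0), channel_gain(10.0), sample_channel_gain(10.0, rng))
```

Under this model, doubling the distance costs about 6 dB, so an FE at the 20 m femtocell edge sees a far stronger FBS signal than an interferer hundreds of meters away.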

VI. CONCLUSION
In this paper, we have studied the dynamic resource allocation for streaming scalable videos over wireless networks with cross-tier interference. An interference temperature limit has been introduced to protect the macrocell users from cross-tier interference. The stochastic optimization problem has been formulated to maximize the time-averaged system utility subject to the constraints of video server cache storage limit and cross-tier interference temperature limit. We have designed a dynamic cache and resource allocation (DCRA) algorithm to solve the stochastic optimization problem. Simulation results have verified that the proposed DCRA is effective for streaming scalable videos over time-varying wireless networks.

APPENDIXES
APPENDIX A
PROOF OF THEOREM 1
Squaring both sides of (9) and (13) produces (30) and (31). Plugging (30) and (31) into the one-slot Lyapunov drift yields (32), where B 1 is a positive constant defined in (33). By summing (32) over [t, t + T − 1], we obtain the upper bound of the Lyapunov drift Δ T (t) in (34), where B 2 is given in (35). In (34), s ij (τ ) is the packet size of the video and R ij (τ ) is the cache capacity. Both depend on the video layer selection and cache placement decisions, which are taken every T time slots. Hence, we have s ij (τ ) = s ij (t) and R ij (τ ) = R ij (t) for all τ ∈ [t, t + T − 1]. By employing the fact in (36) and (37), which holds for any τ ∈ [t, t + T − 1], we obtain the bounds (38), (39), (40), and (41); the right-hand side of (38) contains the terms Q ij (t)s ij (t) and $\frac{1}{2} T(T-1) I J s_{\max}^{2}$. Substituting (38), (39), (40), and (41) into (34), we obtain the upper bound in (18), where B is the positive constant defined in Theorem 1.

APPENDIX B
According to [26], there exists a stationary optimal ω-only policy that achieves the optimal system utility SU * . Then, under any feasible decisions, which satisfy C1-C10, we have an upper bound on the drift-minus-reward term in which s * ij (τ ), R * ij (τ ), and C * ij (τ ) denote the packet size of the video slice, the cache capacity and the transmission rate for FE j in femtocell i under the optimal ω-only policy, respectively. Both the second and third terms of this bound are non-positive because of the queue stability constraint. Since SU sub is a suboptimal system utility, SU sub ≤ SU * holds, which simplifies the bound further. By taking iterated expectations of the resulting inequality and summing over t ∈ {0, T , 2T , . . . , NT }, dividing both sides by βNT , taking the limit as N → ∞, and rearranging and neglecting appropriate terms, we obtain $SU^{sub} \ge SU^{*} - \frac{B}{\beta T}$.