Decentralized Optimization for Multicast Adaptive Video Streaming in Edge Cache-Assisted Networks

Adaptive streaming based on DASH offers a personalized video experience and smooth playback by dynamically adjusting the video bitrate to variations in network conditions. This is especially important for current and future Internet video streaming applications, including emerging ones such as those based on virtual reality, because adaptive streaming plays a key role in providing a high-quality viewing experience in bandwidth-limited delivery environments. To enable this promising avenue in a 5G context, efforts are being made to consider it alongside multicast and edge caching as part of next-generation communication technology. In this paper, we model the adaptive streaming transmission problem in a mobile scenario as a multi-source multicast multi-rate problem (MMMP) whose linear relaxation is concave. We decompose the problem in terms of clients and propose the distributed delivery algorithm (DDA). The computational complexity, convergence and time-varying adaptation of DDA are theoretically analyzed. Additionally, to further reduce the computational complexity of the solution, a heuristic approximation method (H-DDA) based on the physical meaning of the problem is proposed, and numerical results show that H-DDA converges to the optimal value. Finally, we conduct a series of simulation tests to demonstrate the superiority of the proposed H-DDA in comparison with other state-of-the-art solutions.


I. INTRODUCTION
Adaptive streaming, such as Dynamic Adaptive Streaming over HTTP (DASH) [1], enables video delivery based on diverse representations and dynamic content adjustment to match network bandwidth variations and different user equipment characteristics. Recent increases in bandwidth demand and in viewing quality expectations make adaptive streaming necessary for various emerging video applications. For example, virtual reality (VR) [2], a type of omnidirectional video with very large bandwidth requirements, relies heavily on adaptive streaming to deliver high-definition content only within the viewer's field-of-view (FoV). Because the whole image is not delivered at high quality, the bandwidth that the rest of the image would otherwise consume is saved.
Most current DASH-based solutions have been proposed for conventional wired networks [3] and broadband wireless networks [4], [5]. By introducing ubiquitous edge caching and multicast support for wireless connections, current cache-assisted mobile networks enable large-scale, low-latency video services. Caching video content at the edge not only reduces the delivery latency but also, thanks to the multicast feature of wireless communications, simplifies the multicast design; integrating edge caching into the video system therefore facilitates multicast video delivery. With such advantages, there is a natural interest in proposing solutions for adaptive video streaming in such an environment [6], [7]. Yet, achieving optimal delivery over cache-assisted mobile networks is non-trivial. On the one hand, adaptive streaming refers to the dynamic selection of the video bitrate in order to both optimize user quality-of-experience (QoE) and maximize the utilization of bandwidth resources while avoiding network congestion [2], [13], [14]. Adaptive streaming schemes need to frequently revise the bitrate selection policy to adapt to the randomness of wireless communications, user preferences, etc. On the other hand, despite the benefits brought by multicast and ubiquitous caching [17], these features also make traditional rate adaptation methods [8], [9] impossible to use. An important issue is that the fully distributed nature of ubiquitous caching requires the overall user bitrate to be optimized by a decentralized method based on local information. In addition, conventional adaptive streaming serves video clients separately and adjusts each user's streaming rate individually [1], [14]. This one-to-one design paradigm is not appropriate in the context of one-to-many multicast delivery.
In this paper, we focus on proposing a decentralized method for adaptive multicast video streaming in a cache-assisted mobile network environment. First, we mathematically model the adaptive streaming problem and then propose an optimal decentralized delivery algorithm (DDA) which enables each video client to optimize its bitrate without coordinating with other clients. Moreover, we further propose a heuristic rate adaptation algorithm that significantly reduces the computation load while approximating the optimal value derived by DDA. We present a series of simulation tests which demonstrate the close-to-optimal performance of the proposed algorithm and show how it outperforms other state-of-the-art solutions. The main contributions of this paper are:
1) Multi-source Multicast Rate-Adaptive Problem: We mathematically formulate optimal adaptive video streaming in cache-assisted mobile networks as a multi-source multicast multi-rate problem (MMMP). We then introduce a linear relaxation of MMMP and prove its concavity, demonstrating that it has a unique optimal value.
2) Distributed Delivery Algorithm: We decompose MMMP in terms of the end users and prove the equivalence between the original MMMP and the decomposed problems. Furthermore, we propose DDA, a decentralized algorithm which achieves optimal rate adaptation. We also extend our algorithm to omnidirectional video applications such as VR.
3) Heuristic Distributed Delivery Algorithm: By observing the physical meaning of the problem, we further propose a heuristic rate adaptation algorithm (H-DDA) that achieves similar performance while dramatically reducing the computation load in comparison with the original DDA. Our algorithms are implemented and evaluated in simulations, whose results show that they approximate the theoretical optimum and outperform state-of-the-art solutions.

II. BACKGROUND AND RELATED WORKS
A. Caching-Assisted Mobile Adaptive Streaming Background
Fig. 1 illustrates the process of provisioning adaptive video streaming services in cache-assisted mobile scenarios. Video clients can retrieve the content from a nearby edge caching node instead of the far-end media server. Each client uses the adaptive streaming control module to determine the appropriate representation of the desired video based on its network conditions. This is especially important in VR applications [23], where the image is tiled and each tile has multiple representations. The adaptive streaming control module selects the bitrate of each tile according to both the network conditions and the user viewport. Adaptive streaming allows the video provider to deliver only the users' area of interest at high quality rather than the whole image, saving bandwidth without impairing the viewing experience.

B. Related Works
Numerous studies have focused on improving the performance of adaptive video streaming. For example, Yuan et al. [15] proposed an ensemble rate adaptation framework which aims to take advantage of the benefits of multiple rate adaptation methods. The proposed framework mainly consists of two modules: a method pool that stores the rate adaptation policies and a method controller that decides which policy to use. By constructing a two-layer network structure, a distributed joint optimization algorithm for adaptive video streaming which aims to maximize the total user demand rate is proposed in [16]. Furthermore, a modified algorithm with a practical caching strategy is also designed to support realistic implementation. Several recent studies attempt to apply machine learning to deal with highly dynamic network conditions when adaptively streaming video content. For example, in [18], a Q-learning model is applied to generate adaptive streaming schemes for 5G multimedia services with the aim of preserving both energy efficiency and user QoE. TCLiVi [19] applies deep reinforcement learning to control the bitrate selection for adaptive streaming. A major difference between our work and current reinforcement learning-based studies is that we take multicast into account instead of only considering end-to-end video delivery. Besides, reinforcement learning requires pre-training of a learning model, which can be time consuming and requires a priori knowledge of the network that can be difficult to obtain in practice.
Most studies on cache-assisted adaptive streaming focus on cache placement, which is outside the scope of this work. Zhang et al. [17] proposed VISCA, which integrates edge caching capacity to enhance streaming performance. VISCA also includes a novel Adaptive BitRate (ABR) algorithm that decides the bitrate and video chunk source by jointly considering network conditions, QoE objectives and edge resource availability, and it uses a super-resolution method to enhance low-quality data. Liu and Wei [20] designed a hop-by-hop adaptive streaming control, which sets a scheduling window at each switch to limit the data transmission rate according to the one-hop link capacity. Furthermore, a priority-based data delivery scheme is proposed to enable popular video content and the lowest representations to be delivered preferentially. However, this method only adapts the video rate at each node individually, and it is difficult to optimize the bitrate adaptation of all clients without coordination among the nodes. Eswara et al. [21] formulated the resource allocation problem for adaptive streaming as a stochastic optimization problem with the purpose of optimizing long-term QoE metrics. However, the formulated problem treats each flow individually and ignores the multicast feature of wireless communication, which decreases transmission performance.
In the context of the above discussion, a distributed method that not only supports multicast but also provides optimal rate adaptation is required for cache-assisted scenarios, and such a method is proposed in this paper.
Table I shows the notations used in this paper. We consider a network of N nodes, including cache carriers, switches, content sources and users, that communicate with each other over a given connected, undirected graph $G = (\mathcal{V}, \mathcal{L})$, where $\mathcal{V}$ and $\mathcal{L} \subseteq \mathcal{V} \times \mathcal{V}$ denote the set of nodes and the set of network links, respectively. Let $\mathcal{S} = \{1, \dots, S\} \subseteq \mathcal{V}$ be the set of video providers in the network and $\mathcal{U} = \{1, \dots, U\} \subseteq \mathcal{V}$ the set of end users. We define the path between client $j$ and its video provider $i$ as $p_j$, i.e., $p_j \triangleq \{l_{i,s_1}, l_{s_1,s_2}, l_{s_2,s_3}, \dots, l_{s_n,j}\}$, where $l_{x,y}$ indicates the link between nodes $x$ and $y$. Due to in-network caching, intermediate nodes can also be treated as providers; namely, $s_k \in \mathcal{S}$ for all $k = 1, 2, \dots, n$. In our adaptive model, scalable video coding (SVC) [22] is used to encode video content into a base layer and several enhancement layers. Video clients can decode the video either with the base layer only or with the base layer plus multiple enhancement layers. The more enhancement layers are decoded, the better the presented video quality. Let the videos in $G$ consist of $m$ enhancement layers, and let $b_1$ and $h_k$ be the bitrate of the base layer and of enhancement layer $k$, respectively. Accordingly, the possible requested video bitrates are $\mathcal{B} = \{b_1,\; b_1 + h_1,\; \dots,\; b_1 + \sum_{k=1}^{m} h_k\}$.

A. System Model
With the layered coding property of SVC, content providers can serve multiple requests for the same video at different bitrates by multicasting the highest requested bitrate, and each switch forwards to every client the layers of data that the client requested.
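As a minimal sketch of the layered model, the snippet below builds the cumulative set of requestable bitrates from a base-layer rate and per-layer enhancement rates, and computes the rate a provider must multicast to serve a group, which is the highest requested bitrate. The function names and the numeric layer rates are illustrative assumptions and are not taken from the paper.

```python
from itertools import accumulate


def bitrate_ladder(base_kbps, enhancement_kbps):
    """Cumulative SVC bitrates: base only, base + layer 1, base + layers 1..2, ..."""
    return list(accumulate(enhancement_kbps, initial=base_kbps))


def multicast_rate(requested_kbps):
    """A provider serves a group by multicasting the highest requested bitrate."""
    return max(requested_kbps)


# Hypothetical layer rates (kbps), chosen only for illustration.
ladder = bitrate_ladder(600, [400, 1000, 2000])
group_requests = [ladder[0], ladder[2], ladder[1]]
provider_rate = multicast_rate(group_requests)
```

Here three clients requesting different layers of the same video cost the provider a single stream at the top requested rate, rather than the sum of the three rates.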
In our solution, we attempt to optimize the rate adaptation and maximize user QoE. As we select the bitrate to optimize both network bandwidth utilization and user QoE, we use the QoE model $J(\cdot)$ proposed in [25], referred to as eq. (1).

B. Problem Formalization
The objective of the rate adaptation algorithm is to choose a rate adaptation strategy $x$ that maximizes the overall user QoE given the network capacity constraints. Let the strategy vector be $x = \{x_{1,1}, \dots, x_{i,j}, \dots, x_{S,U}\}$, where $x_{i,j}$ is the delivery rate of client $j$ receiving video from provider $i$. We represent the capacity of the links in $\mathcal{L}$ as a vector $c = \{c_l\}_{l \in \mathcal{L}}$. Considering the above objective, the rate adaptation streaming problem can be referred to as the following multi-source multicast multi-rate problem (MMMP), P1:
$$\text{P1:} \quad \max_{x} \sum_{i \in \mathcal{S}} \sum_{j \in s_i(u)} J(x_{i,j}) \tag{2}$$
$$\text{s.t.} \quad \sum_{i \in l(s)} \max_{j \in s_i(u)^l} x_{i,j} \le c_l, \quad \forall l \in \mathcal{L} \tag{3}$$
$$x_{i,j} \in \mathcal{B}, \quad \forall i, j \tag{4}$$
where $s_i(u)$ is the client set of provider $i$, $l(s)$ denotes the set of providers that use link $l$, and $s_i(u)^l$ denotes the set of clients using link $l$ to access videos from $i$. Accordingly, $\max_{j \in s_i(u)^l} x_{i,j}$ indicates the maximum bitrate over link $l$ among the users in the multicast tree rooted at $i$; we define this bitrate as the provider rate of $i$ over $l$. The constraints in eq. (3) ensure that, in a multicast scenario, the sum of provider rates over any link $l$ cannot exceed the capacity $c_l$. The constraints in eq. (4) indicate that each client selects the bitrate it requests from $\mathcal{B}$. If $J(x_{i,j})$ is given by eq. (1), then (2) and (3) are concave and convex [26], respectively. However, $\mathcal{B}$ is a discrete set, which makes P1 hard to solve. Instead, we consider the linear relaxation of the MMMP as follows:
$$\text{P2:} \quad \max_{x} \sum_{i \in \mathcal{S}} \sum_{j \in s_i(u)} J(x_{i,j}) \tag{5}$$
$$\text{s.t.} \quad \sum_{i \in l(s)} \max_{j \in s_i(u)^l} x_{i,j} \le c_l, \quad \forall l \in \mathcal{L} \tag{6}$$
$$x \in \mathcal{X}$$
where the last constraint indicates that the rate adaptation strategy can be chosen from a continuous U-dimensional closed space $\mathcal{X}$, which is the relaxation of the constraint in eq. (4).
Therefore, problem P2 is a concave optimization problem [26] whose maximum value is unique.
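To make the multicast capacity constraint concrete, the following sketch checks whether a candidate strategy is feasible when each provider is charged once per link, at the highest rate among its clients on that link. It is a hedged illustration only: the data layout, the function names and the use of a logarithm as a stand-in for the QoE function J (eq. (1) is not reproduced in this text) are all assumptions.

```python
import math


def provider_rates(x, clients_by_provider):
    """Provider rate over one link: the highest rate among that provider's clients."""
    return {
        provider: max(x[(provider, client)] for client in clients)
        for provider, clients in clients_by_provider.items()
        if clients
    }


def is_feasible(x, capacity, link_clients, tol=1e-9):
    """True if, on every link, the sum of provider rates stays within capacity."""
    return all(
        sum(provider_rates(x, link_clients[link]).values()) <= cap + tol
        for link, cap in capacity.items()
    )


def total_qoe(x, qoe=math.log):
    """Objective value: sum of per-client QoE (log is an assumed stand-in for J)."""
    return sum(qoe(rate) for rate in x.values())


# Hypothetical single-link example (rates and capacity in Mbps).
capacity = {"l1": 10.0}
link_clients = {"l1": {"s1": ["u1", "u2"], "s2": ["u3"]}}
x = {("s1", "u1"): 4.0, ("s1", "u2"): 6.0, ("s2", "u3"): 4.0}
feasible = is_feasible(x, capacity, link_clients)  # multicast load is 6 + 4
```

In this example the three client rates sum to 14 Mbps, yet the strategy is feasible on a 10 Mbps link because the two clients of `s1` share one multicast stream at 6 Mbps.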
Although it is formulated with multi-source and multicast features, P2 can be easily generalized to other scenarios with minor modifications. For example, to apply P2 in a scenario with a single provider concurrently delivering multiple videos, we can split provider $i$ with $n$ video flows into $n$ virtual source nodes. Virtual node $i_k$, corresponding to video $k$, is described by the link set used by $i_k$ and by $s_{i_k}(u)$, the group of users that access $k$ from $i$. For multipath delivery scenarios, assuming client $j$ accesses content via $m$ interfaces and the corresponding delivery rate of each interface $f_k$ is $x_{i,j}^{f_k}$, applying P2 only requires rephrasing the objective function of $j$ as $J\big(\sum_{k=1}^{m} x_{i,j}^{f_k}\big)$.

IV. ALGORITHM DESIGN
In this section, we first decompose MMMP in terms of video clients and consider its dual problem. Then, we propose DDA, a distributed rate adaptation algorithm which allows individual clients to determine their optimal video bitrate.

A. Problem Decomposition
Considering P2, the objective function in eq. (5) is separable across the video clients, yet coupled by the constraints in eq. (6). In addition, as the constraints in eq. (6) contain the maximum function, which is not differentiable, directly solving this problem is a nontrivial task. We introduce a new parameter $x_i^l$ and formulate the decomposed MMMP for each user $j$ as follows:
$$\text{U1:} \quad \max_{x_{i,j}} J(x_{i,j})$$
$$\text{s.t.} \quad x_{i,j} + \sum_{k \in l(s) \setminus i} x_k^l \le c_l, \quad \forall l \in p_j \tag{8}$$
$$x_{i,j} \le x_i^l, \quad \forall l \in p_j \tag{9}$$
where, for all $k \in \mathcal{S}$ and $l \in \mathcal{L}$, $x_k^l$ is defined as the provider rate of $k$ over $l$, such that $x_k^l \ge x_{k,j}$ for all $j \in s_k(u)^l$. Constraint (8) indicates that, for each link $l$ used by $j$, the bitrate of $j$ should not exceed the minimum residual link capacity, and eq. (9) states that the bitrate $x_{i,j}$ cannot exceed any $x_i^l$ over its delivery path. We introduce the following theorem.
Theorem 1: For each user $j \in \mathcal{U}$, the corresponding optimal value $x^*_{i,j}$ of P2 can equivalently be derived by solving problem U1. Namely, for all $i, j$, the $x^*_{i,j}$ of P2 and of U1 are equal.
Proof: See Appendix A.

B. Distributed Optimal Rate Adaptation Algorithm
To derive the optimal $x^*_{i,j}$ of U1 in a distributed manner, we consider the dual problem of U1, denoted U1:D, which is built from the Lagrangian of U1 and the corresponding Lagrangian dual function $D_u(\lambda_{p_j}, \upsilon_{p_j})$. Since the optimal values of the primal and dual problems are equal due to the strong duality property of U1, the primal optimal solution $x^*_{i,j}$ can be recovered from a dual optimal point $(\lambda^*_{p_j}, \upsilon^*_{p_j})$. If the inverse of $J(\cdot)$ exists, $x_{i,j}(p_j)$ can be derived from the Karush-Kuhn-Tucker conditions of U1:A [26]. As $D_u(\lambda_{p_j}, \upsilon_{p_j})$ is continuous and differentiable in $(\lambda_{p_j}, \upsilon_{p_j})$, its partial derivatives with respect to each $\lambda_l$ and $\upsilon_l$ can be computed. Therefore, based on eq. (13), (14) and (15), DDA solves for the $\lambda^*_l$ and $\upsilon^*_l$ of the dual problem U1:D by the gradient projection method [27] and updates $x_{i,j}(t)$ iteratively, as given in eq. (16), (17) and (18).
The above iterations suggest treating users and routers as processors in a distributed processing system, in which the optimal rate of each client can be derived by communicating only with the links over its delivery path, without coordination with other clients. Specifically, at each iteration $t$, client $j$ solves for $x_{i,j}(t)$ in eq. (16) by collecting $\lambda_l(t-1)$ and $\upsilon_l(t-1)$ from the links over its delivery path $p_j$ and communicates the newly derived $x_{i,j}(t)$ back to them. In parallel, client $j$ requests video with bitrate $\arg\min_{b \in \mathcal{B}} \|x(t) - b\|_2$. Link $l$ receives the $x_{i,j}(t)$ of all users that use $l$ and selects $\max_{j \in s_i(u)^l} x_{i,j}(t)$ as $x_i^l(t)$ for each source $i$ in $l(s)$. Then, $l$ uses all $x_k^l$, $k \in l(s) \setminus i$, and $x_{i,j}$ to compute $\lambda_l(t+1)$ and $\upsilon_l(t+1)$ by (17) and (18). The derived $\lambda_l(t+1)$ and $\upsilon_l(t+1)$ are delivered to user $j$ for computing the new $x_{i,j}(t+1)$ in the next iteration. The above process is repeated until the iteration criterion is met. The exchanged values are small enough to be piggybacked on Interest and Data packets and hence do not require extra communication resources. The pseudocode of DDA is shown in Algorithm 1.
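The equations of this derivation are not reproduced in the text above. As a hedged sketch, suppose U1 maximizes $J(x_{i,j})$ subject to a capacity constraint (8) and a coupling constraint (9) on every link of $p_j$, as described in Section IV-A, and that $J$ is differentiable and strictly concave. The standard dual construction would then read as follows; the step size $\gamma$, the signs and the grouping of terms are assumptions and not necessarily the authors' exact expressions.

```latex
\begin{align*}
L(x_{i,j}, \lambda_{p_j}, \upsilon_{p_j})
  &= J(x_{i,j})
   - \sum_{l \in p_j} \lambda_l \Big( x_{i,j} + \sum_{k \in l(s) \setminus i} x_k^l - c_l \Big)
   - \sum_{l \in p_j} \upsilon_l \big( x_{i,j} - x_i^l \big), \\
D_u(\lambda_{p_j}, \upsilon_{p_j})
  &= \max_{x_{i,j}} L(x_{i,j}, \lambda_{p_j}, \upsilon_{p_j}), \qquad
  \text{U1:D:} \ \min_{\lambda_{p_j} \ge 0,\ \upsilon_{p_j} \ge 0} D_u(\lambda_{p_j}, \upsilon_{p_j}), \\
J'(x_{i,j}) &= \sum_{l \in p_j} (\lambda_l + \upsilon_l)
  \;\Longrightarrow\;
  x_{i,j}(p_j) = (J')^{-1}\Big( \sum_{l \in p_j} (\lambda_l + \upsilon_l) \Big), \\
\frac{\partial D_u}{\partial \lambda_l}
  &= c_l - x_{i,j} - \sum_{k \in l(s) \setminus i} x_k^l, \qquad
\frac{\partial D_u}{\partial \upsilon_l} = x_i^l - x_{i,j}, \\
\lambda_l(t+1) &= \Big[ \lambda_l(t) - \gamma \, \frac{\partial D_u}{\partial \lambda_l} \Big]^+, \qquad
\upsilon_l(t+1) = \Big[ \upsilon_l(t) - \gamma \, \frac{\partial D_u}{\partial \upsilon_l} \Big]^+ .
\end{align*}
```

Under this reading, a link raises its price $\lambda_l$ when the load exceeds $c_l$ and lowers it otherwise, and the projection $[\cdot]^+$ keeps the multipliers non-negative. Note that the stationarity condition inverts the derivative $J'$ rather than $J$ itself.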
Convergence: Assuming that the initial $\lambda(0)$ and $\upsilon(0)$ are feasible, we have the following convergence result: the $(x^*, \lambda^*, \upsilon^*)$ generated by Algorithm 1 is dual optimal; namely, $x^*$ is the optimal adaptation rate for P2. Proof: See Appendix B.
Complexity: By observing Algorithm 1, the complexity of DDA at the link side is mainly determined by the gradient projection process. Let the gradient projection iterate $N$ times, and let the numbers of users and providers using link $l$ be $U_l$ and $S_l$, respectively. Thus, the complexity of the algorithm at the link side is $O(N(U_l + S_l))$. At the client side, the corresponding complexity is determined by the number of iterations of (16), which is $N$.
Time-varying adaptation: To extend DDA to time-varying scenarios, the objective function of P2 can be rewritten over time-dependent sets, namely the set of providers at time $t$ and $s_i(u, t)$, the user set of provider $i$ at time $t$. The $l(s)$ in constraint (6) is replaced by $l(s, t)$, the set of providers that use link $l$, which varies with $t$. Based on the above changes, each end user still executes the same user algorithm as in Algorithm 1, except that it uses $p_j(t)$ in place of $p_j$ in (16). Each link executes the same link algorithm as in Algorithm 1, with the minor change of replacing $l(s)$ in (17) with $l(s, t)$. Intuitively, if the link routings and providers change slowly relative to the convergence rate, the algorithm can still converge to the optimal rates $x^*$. We further illustrate this feature by experimental tests in Section VI.

V. HEURISTIC DISTRIBUTED RATE ADAPTATION ALGORITHM
The proposed DDA converges to the optimal value from any feasible initial condition, as proved in Section IV. However, the computational complexity of DDA at a link grows with the number of users passing through it, which may cause a scalability problem at bottleneck links. Therefore, in this section we propose a lightweight heuristic distributed delivery algorithm (H-DDA).
Observe the following special case of P2, which describes a unicast scenario where each source serves one user only. This problem can be easily solved by considering its Lagrangian and its dual as in [21], which, similar to eq. (13), yields the rate of provider $i$ in terms of the multipliers $\lambda_l$ along $p_i$, where $p_i$ denotes the path used by provider $i$. Intuitively, according to Little's law, the inverse of the bitrate is equal to the waiting delay of sending one unit of data. For instance, when the data rate over a link is 10 Mbps, the delay of sending 1 Mb of data is 0.1 s. Accordingly, $\sum_{l \in p_i} \lambda_l$ indicates the total delay of sending one unit of data over path $p_i$. Thus, the physical meaning of $\lambda_l$ is the waiting delay over link $l$. $\lambda_l$ can then be derived iteratively by the gradient projection method.
Next, consider problem P3, which adds the multicast feature. Similar to eq. (22), given that the sending delay of $l$ is only related to the load of the link, an analogous update of $\lambda_l$ holds for P3. Therefore, according to the physical meaning of $\lambda_l$, eq. (26) can be extended to a generalized $J(\cdot)$ that is strictly concave and continuous. Hence, at each iteration $t$ of H-DDA, each link $l$ in $G$ collects the bitrates $x_{i,j}$ of the clients over $l$ and determines $\lambda_l(t)$ by (25). Link $l$ communicates $\lambda_l(t)$ to all users that use $l$. User $j$ receives the $\lambda_l(t)$ of all $l$ in $p_j$ and calculates $x_{i,j}(t)$. Links and users repeat this process until the stopping criterion of the gradient method is satisfied, namely that for each $l$, at iteration $T$, $|\lambda_l(T) - \lambda_l(T-1)|$ is sufficiently small. The pseudocode of H-DDA is given in Algorithm 2.
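The link and user steps described above can be sketched as a simple price-update loop. This is an illustration in the spirit of H-DDA rather than the authors' exact update, since eq. (25) to (27) are not reproduced in this text. It assumes $J(x) = \log x$, so that a client's rate is the inverse of the sum of link prices on its path; it also assumes that a link's load counts each provider once at its highest client rate, and it uses a hypothetical step size and tolerance.

```python
def price_iteration(capacity, flows, step=0.01, tol=1e-9, max_iters=10000):
    """Iterate link prices and client rates until the prices stop changing.

    capacity: {link: capacity}
    flows: {(provider, client): [links on the delivery path]} (paths non-empty)
    Returns (rates, prices).
    """
    prices = {link: 1.0 for link in capacity}
    rates = {}
    for _ in range(max_iters):
        # User side: with J(x) = log(x), J'(x) = 1/x, so x = 1 / (sum of path prices).
        rates = {
            flow: 1.0 / sum(prices[link] for link in path)
            for flow, path in flows.items()
        }
        max_change = 0.0
        for link, cap in capacity.items():
            # Link side: each provider is counted once, at its highest client rate.
            per_provider = {}
            for (provider, client), path in flows.items():
                if link in path:
                    rate = rates[(provider, client)]
                    per_provider[provider] = max(per_provider.get(provider, 0.0), rate)
            load = sum(per_provider.values())
            # Projected gradient step: raise the price when overloaded, keep it positive.
            new_price = max(1e-6, prices[link] + step * (load - cap))
            max_change = max(max_change, abs(new_price - prices[link]))
            prices[link] = new_price
        if max_change < tol:  # stopping criterion on |lambda(T) - lambda(T-1)|
            break
    return rates, prices


# Hypothetical example: two providers share one 10 Mbps link; s1 multicasts to two clients.
capacity = {"l1": 10.0}
flows = {("s1", "u1"): ["l1"], ("s1", "u2"): ["l1"], ("s2", "u3"): ["l1"]}
rates, prices = price_iteration(capacity, flows)
```

In this example the two multicast streams settle at about 5 Mbps each, with a link price of about 0.2, matching the "inverse of bitrate" reading of $\lambda_l$. Note that each link only aggregates per-provider maxima, which is the source of the reduced link-side load.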
According to the pseudocode of Algorithm 2, the computational complexity at the link side is bounded by $O(N \cdot S_l)$. Because $S_l \ll U_l + S_l$ in a multicast scenario, Algorithm 2 can significantly reduce the computation load at the links.
Unlike the optimal convergence of DDA, which was proved in Section IV, the optimality of H-DDA is difficult to analyze theoretically. Instead, we use the physical meaning of $\lambda$ in P3 to explain how H-DDA approximates the optimal value. For each end user $j$, let $x^*_{i,j}$ be the corresponding optimal value derived by DDA; the inverse of $x^*_{i,j}$ is equal to the current waiting delay of the path, say $\lambda^*_p$. Moreover, $\lambda^*_p$ is the optimal delay of the path, which is equal to the sum of the $\lambda^*_l$ whose corresponding links $l$ are in $p_j$. Therefore, because $\lambda_l(t)$ converges to $\lambda^*_l$ in H-DDA, $x_{i,j}(t)$ in H-DDA converges to the $x^*_{i,j}$ of DDA. This explains the optimal approximation of H-DDA.

Algorithm 2: H-DDA
…
    compute the λ_l(t) according to (25);
    communicate the λ_l(t) to all users over l;
    t++;
end
λ* = λ(t), υ* = υ(t);
user j's algorithm:
…
    receives the sum of λ_l(t) from the links over its path;
    determines the next period delivery rate x(t) according to (27);
    communicates the x_{i,j}(t + 1) to links l ∈ p_j;
    request video bitrate by arg min_{b∈B} …
Furthermore, we also test the optimal approximation of H-DDA through numerical evaluation in MATLAB. We consider a tree-based network whose topology and link bandwidths are shown in Fig. 2. In this tree topology, four leaf nodes act as video clients that continuously send out DAS requests during the simulation. Fig. 3 illustrates the convergence of the user rates of U1-U4 derived by H-DDA to the solutions of DDA. We observe that H-DDA and DDA converge to the same results and differ only in convergence rate, which permits using H-DDA to achieve optimal rate adaptation.
Adaptation to VR Applications: Our algorithm provides an adaptive streaming scheme and can be easily applied to omnidirectional video such as VR. In VR, each image is split into multiple tiles and each tile is coded independently. When delivering adaptive VR streaming, viewport prediction is required to forecast the location of the user's area of interest. The viewport prediction problem is beyond the scope of this manuscript; in our previous work [23], we propose a viewport prediction method which can derive the probability of a tile being watched by a user. Let the probability that tile $v_i$ is watched by a viewer be $p_i$ and let the optimal transmission rate be $x^*$. The bandwidth allocated to each tile is then determined from $p_i$ and $x^*$, and the bitrate $b_i$ of $v_i$ is selected accordingly. The pseudocode is shown in Algorithm 3.
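The per-tile allocation formulas are not reproduced in this text. As a hedged sketch of one plausible reading, the snippet below splits the total rate $x^*$ across tiles in proportion to their viewing probabilities and then snaps each share to the nearest available bitrate. The proportional rule, the function name and all numeric values are assumptions made for illustration.

```python
def allocate_tiles(x_star, view_prob, ladder):
    """Split x_star across tiles by viewing probability, then pick the nearest bitrate.

    x_star: total transmission rate available to the client
    view_prob: viewing probability p_i of each tile (any positive scale)
    ladder: available bitrates for a tile
    Returns (shares, bitrates), one entry per tile.
    """
    total = sum(view_prob)
    shares = [x_star * p / total for p in view_prob]
    # Nearest-bitrate selection, mirroring the arg-min request rule used by clients.
    bitrates = [min(ladder, key=lambda b: abs(b - share)) for share in shares]
    return shares, bitrates


# Hypothetical example: 8000 kbps shared by three tiles.
shares, bitrates = allocate_tiles(8000, [0.5, 0.3, 0.2], [600, 1000, 2000, 4000])
```

Nearest-bitrate snapping can push the summed tile bitrates slightly above or below $x^*$; a stricter variant would round each share down to the highest bitrate that does not exceed it.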

VI. PERFORMANCE EVALUATION
To evaluate the performance of the proposed algorithms, we implemented DDA and H-DDA in MATLAB and NS-3, respectively. First, we present the simulation setup. Then, we analyze the convergence of DDA and H-DDA under time-varying conditions and compare our algorithm against two state-of-the-art solutions, HAVS [20] and DASH-BOLA [24].

Algorithm 3: Rate Control for Omnidirectional Video
Input: x*
Output: x*, λ*
foreach video chunk do
    invoke the viewport prediction algorithm to derive the viewing probability;
    foreach tile v_i do
        calculate the allocated bandwidth: …

A. Simulation Setup
We select BestRoute [8] as the request routing strategy, where routers maintain a routing table in order to discover replicas with minimum hop counts. As the caching strategy, we employ Leave Copy Everywhere (LCE) [8], which lets edge servers copy all passing content to their storage. The cache size is randomly set to 25 MB, 50 MB or 100 MB per router. For the test videos, we use MPEG-DASH multimedia streaming in SVC-encoded format. The DASH video set is from [22]; each segment is two seconds long, and the video set contains 8 movies of 120 s each. Each video is encoded into one base layer and four enhancement layers. The base layer $b_1$ has an average bitrate of 600 kbps, and enhancement layers 1, 2, 3 and 4 have 1600 kbps, 2600 kbps, 1940 kbps and 4440 kbps, respectively. To simulate the multicast scenario, a random number of users (from 1 to 5) is selected to request the same video within the same time window. The arrival of request groups follows a Poisson distribution with λ = 0.05. Each request group randomly selects a video to request according to a Zipf distribution with parameter 0.8. After determining the video to request, end users request its chunks in sequence and select a new video after requesting all chunks of the current one.
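The request workload described above can be sketched as follows. This is an illustrative generator, not the authors' simulation script: it assumes that λ = 0.05 is an arrival rate per second, that group sizes are uniform on 1 to 5, and that video popularity follows Zipf weights $1/\text{rank}^{0.8}$ over the 8 movies. The function names and the seed are hypothetical.

```python
import random


def zipf_weights(n, s):
    """Unnormalized Zipf popularity weights for ranks 1..n with exponent s."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def generate_request_groups(duration_s, rate=0.05, n_videos=8, zipf_s=0.8,
                            max_group=5, seed=0):
    """Return a list of (arrival_time, group_size, video_index) tuples."""
    rng = random.Random(seed)
    weights = zipf_weights(n_videos, zipf_s)
    t, groups = 0.0, []
    while True:
        # Exponential inter-arrival gaps correspond to Poisson arrivals.
        t += rng.expovariate(rate)
        if t > duration_s:
            break
        size = rng.randint(1, max_group)  # users requesting the same video together
        video = rng.choices(range(n_videos), weights=weights, k=1)[0]
        groups.append((t, size, video))
    return groups


groups = generate_request_groups(duration_s=10000)
```

With these assumptions a group arrives every 20 s on average, and lower-indexed videos are requested more often than higher-indexed ones.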

B. Experimental Results
To simulate a realistic environment, we build a forest-based topology, as shown in Fig. 4, which is widely used for Content Delivery Networks (CDN). The forest-based topology consists of 14 nodes and 13 nodes acting as video clients. To simulate the heterogeneous characteristics of an access network, the leaf routers act as access points (APs) with different communication technologies. For instance, AP1 and AP5 act as edge routers in wired networks, while AP2 and AP4 are wireless access points which use the 802.11a protocol with 10 Mbps and 5 Mbps of shared bandwidth, respectively. AP3 is an LTE base station that simulates a cellular network environment and provides 4 Mbps of access bandwidth to each end user.
1) User Rate Convergence Analysis: Fig. 5 shows the rate convergence of U4, U5 and U6 when accessing video from AP1 and AP2. The solid and dashed lines correspond to the rate adaptation provided by DDA and H-DDA, respectively. As the figure depicts, the simulation results of H-DDA converge well to the optimal value. Besides, the two algorithms also adapt quickly to variations in network conditions during the simulation. For example, when U6 begins to request video, the rate of U6 quickly decreases to the new optimal value of 5 Mbps, which shows the time-varying adaptation property of both DDA and H-DDA. Fig. 6 shows the convergence analysis of users accessing video from AP4, where all users share an access bandwidth of 5 Mbps. As expected, the rate of the users at AP4 also converges well to the theoretical results. We also observe an interesting result in Fig. 6: both the simulation and the theoretical values show that when U8 and U9 concurrently access DAS (at 22 s in the simulation), the WiFi access bandwidth is split into 2.5 Mbps for each user. When more flows join (at 35 s and 75 s), the bandwidth is further split equally into four, which reveals the fairness of our algorithm. The above observations indicate that our proposed scheme can accommodate dynamic network variations. Additionally, note that faster convergence can be achieved by relaxing the iteration criterion, but at the cost of a larger upper bound on the convergence error. This shows that there is a tradeoff between dynamic adaptation and better theoretical performance in DDA.
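The equal split observed in Fig. 6 is what the relaxed problem predicts when $n$ unicast flows with the same QoE function share a single bottleneck of capacity $c$. A short worked argument follows, under the assumptions that $J$ is increasing, differentiable and strictly concave and that the discrete bitrate set is ignored:

```latex
\begin{align*}
&\max_{x_1, \dots, x_n} \sum_{j=1}^{n} J(x_j)
  \quad \text{s.t.} \quad \sum_{j=1}^{n} x_j \le c, \\
&\text{KKT: } J'(x_j) = \lambda \ \ \forall j
  \;\Longrightarrow\; x_1 = \dots = x_n
  \;\Longrightarrow\; x_j^* = \frac{c}{n}, \\
&c = 5~\text{Mbps}: \quad n = 2 \Rightarrow x_j^* = 2.5~\text{Mbps}, \qquad
  n = 4 \Rightarrow x_j^* = 1.25~\text{Mbps}.
\end{align*}
```

The first implication holds because a strictly concave $J$ has a strictly decreasing derivative, so equal marginal QoE forces equal rates; the second holds because an increasing $J$ makes the capacity constraint tight.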
2) Average Bit Rate (ABR) Comparison: We define the ABR as the arithmetic mean of the average bitrates of all users. Fig. 7 shows the ABR comparison of H-DDA, HAVS and BOLA. As the figure shows, the ABR of the three solutions increases at the beginning. After 50 s, the ABR of all solutions decreases and then enters a periodic fluctuation phase. H-DDA (red line) achieves ABR gains of 37% and 41% over HAVS and BOLA, respectively. At the beginning, the network load is low and the links have enough bandwidth to support the requested high-bitrate video. However, after the total bitrate reaches the link capacity limits, the continuous increase in the number of video clients reduces the ABR. In H-DDA, the overall bitrate tracks the theoretical optimal bound, which provides the best performance among the three solutions. HAVS adjusts the data rate locally at each hop, which fails to optimize the user bitrate globally and results in a lower bitrate than H-DDA. Regardless of the link capacity, each client in BOLA greedily requests higher-bitrate video in order to maximize its own video quality, which may aggravate network congestion when the network is already under high load. Therefore, BOLA performs the worst.
3) Average Stalling Time (AST) Comparison: We define the stalling time as the time interval between a playback freeze and the subsequent restart. The shorter the stalling time, the smoother the playback experienced by the client. We measure the average stalling time when using H-DDA, HAVS and BOLA and show the results in Fig. 8. We observe that H-DDA (red curve) reduces the AST by 10% and 30% after 300 s compared with HAVS and BOLA, respectively. As mentioned, H-DDA uses a distributed rate adaptation method to make full use of the link bandwidth while avoiding network congestion by limiting the total delivery rate to the link capacity, which achieves smoother playback. HAVS also limits the data rate to the link capacity at each hop, and thus avoids network congestion and smooths playback to some extent. However, its hop-level transmission control results in sub-optimal rate control. BOLA uses a greedy method to request video content, which leads to a higher risk of playback freezes, and hence it performs the worst among the three solutions.

VII. CONCLUSION AND FUTURE WORKS
This paper proposes a distributed optimal rate configuration algorithm for dynamic adaptive streaming. First, the rate adaptation problem is formulated as MMMP, whose linear relaxation is concave. Then, MMMP is decomposed in terms of video clients and DDA is proposed to enable each user to determine its optimal rate. Furthermore, a heuristic method named H-DDA is introduced, which reduces the computational complexity in comparison with DDA while maintaining the optimal approximation. Simulation results show the convergence of the algorithms and illustrate how H-DDA outperforms other state-of-the-art solutions.
Although the theoretical proofs and simulation tests validate the performance of our proposed algorithms, several open issues remain. First, our work focuses on wireless communications, and it is necessary to consider the mobility of the nodes. Future work will study how to model user mobility behavior and embed it into the design of our algorithm. Second, for live streaming services, transcoding the video content into multiple representations consumes large computation resources. Future research will jointly optimize transmission and transcoding, which are both critical to high-performance 360-degree live streaming. Third, future work will consider deploying and testing the proposed algorithm in a real-life environment.

APPENDIX A PROOF OF THE THEOREM 1
Proof: By the definition of $x_i^l$, problem P2 can be rephrased as problem P2:A, whose capacity constraint reads
$$\text{s.t.} \quad \sum_{i \in l(s)} x_i^l \le c_l, \quad \forall l \in \mathcal{L}. \tag{30}$$
Aggregating U1 across all users yields problem U1:A. Assuming that $x^*$ and $\tilde{x}^*$ are the optimal solutions of P2:A and U1:A, respectively, Theorem 1 holds when $x^* = \tilde{x}^*$. Next, we show that $x^* = \tilde{x}^*$.
Let the Lagrangians of P2:A and U1:A be as in eq. (35) and (36). The dual optimal values of eq. (35) and (36) are defined as $(x^*, \lambda_p^*, \upsilon^*)$ and $(\tilde{x}^*, \tilde{\lambda}_p^*, \tilde{\upsilon}^*)$, respectively. The complementary slackness condition [26] is first applied to one problem, and the optimal solution of the other problem is then substituted into it. For the case of $\sum_{k \in l(s) \setminus i} x_k^{l*} + x^*_{i,j} < c_l$, the corresponding $\lambda^*_{ijl} = 0$. This can be proved by contradiction: assuming otherwise contradicts the assumption that $x^*_{i,j}$ is the maximum value.
In particular, because $C A_j(\xi) C^T$ is symmetric, we have $\|C A_j(\xi) C^T\|_\infty = \|C A_j(\xi) C^T\|_1$. Therefore, $\nabla D_u$ is Lipschitz. Because $J(\cdot)$ is continuous and a one-to-one mapping, $x_{i,j}(p_j)$ is continuous and therefore $\lim_{t \to \infty} x_{i,j}(t) = x^*_{i,j}$; hence, the theorem is proved.