Abstract:
Adaptive bitrate (ABR) assignment plays a crucial role in ensuring a satisfactory quality of experience (QoE) in video streaming applications. Recently, the authors of [1] proposed Pensieve, which uses asynchronous advantage actor-critic (A3C), an on-policy reinforcement learning (RL) method, to improve ABR algorithms. Pensieve has been shown to achieve a higher QoE than traditional ABR methods. However, it is sample inefficient and sensitive to the choice of random seed and hyperparameters. In this paper, we present soft actor-critic based deep reinforcement learning for adaptive bitrate streaming (SAC-ABR), an off-policy method that improves QoE over existing state-of-the-art ABR algorithms under a wide variety of network conditions. Based on the maximum entropy RL framework, SAC-ABR maximizes the entropy of the policy together with the expected reward, and thereby achieves a better exploration-exploitation tradeoff than on-policy ABR methods. We present the overall design of SAC-ABR together with its training and testing results, and evaluate its performance against other state-of-the-art ABR algorithms. Our results show that SAC-ABR provides up to 27.42% higher average QoE than Pensieve and a much higher QoE than traditional fixed-rule-based ABR algorithms.
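For reference, the maximum entropy objective mentioned above is commonly written as follows in the general soft actor-critic literature. This is a sketch of the standard formulation, not a formula taken from this paper; the symbols (policy \pi, state s_t, action a_t, reward r, temperature \alpha, state-action distribution \rho_\pi) are assumed notation.

```latex
% Standard maximum entropy RL objective (assumed notation, not from the paper):
% the policy is rewarded for expected return plus a temperature-weighted entropy bonus.
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big) = -\sum_{a} \pi(a \mid s_t) \log \pi(a \mid s_t)
```

In an ABR setting the action a_t would presumably be the bitrate chosen for the next video chunk, so the sum over a in the entropy term runs over the available bitrate levels; setting \alpha = 0 recovers the usual expected-reward objective.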
Date of Conference: 04-08 January 2022
Date Added to IEEE Xplore: 13 January 2022