In this paper, we address the problem of optimizing the cross-layer transmission policy for delay-sensitive video streaming over slowly varying flat-fading wireless channels on-line, at transmission time, when the environment dynamics are unknown. We first formulate the cross-layer optimization using a systematic layered Markov decision process (MDP) framework, which complies with the layered architecture of the OSI stack. Subsequently, considering the unknown dynamics of the video sources and the underlying wireless channels, we propose a layered real-time dynamic programming (LRTDP) algorithm, which requires no a priori knowledge of the source and network dynamics. LRTDP allows each layer to learn the dynamics on the fly and to adjust its policy autonomously, based on the dynamics it experiences and on limited message exchanges with other layers. Unlike existing cross-layer methods, LRTDP optimizes the cross-layer policy in a layered and on-line fashion, exhibits low computational complexity, requires only limited message exchanges among layers, and can adapt on the fly to the experienced environment dynamics. Finally, we prove that LRTDP converges asymptotically to the optimal cross-layer policy. Our numerical experiments show that LRTDP achieves performance comparable to that of idealized optimal cross-layer solutions based on complete knowledge of the dynamics.
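To make the idea of learning an MDP policy on-line without prior knowledge of the dynamics more concrete, the following is a minimal single-layer sketch in Python. It is not the LRTDP algorithm of this paper: it omits the layered decomposition and the inter-layer message exchanges, and simply estimates transition probabilities from observed counts while applying a Bellman backup to the state currently visited. The function `learn_online`, the toy queue model (`toy_step`, `toy_cost`, `SERVICE_PROB`), and all parameter values are hypothetical and chosen only for illustration.

```python
import random

def learn_online(n_states, n_actions, step, cost, gamma=0.9,
                 n_steps=5000, epsilon=0.1, seed=0):
    """Generic on-line value learning with an empirically estimated model.

    Assumption: costs are minimized; `step(s, a, rng)` samples the unknown
    environment and `cost(s, a)` is the immediate cost.
    """
    rng = random.Random(seed)
    # counts[s][a][s2] = number of observed transitions s --a--> s2
    counts = [[[0] * n_states for _ in range(n_actions)]
              for _ in range(n_states)]
    V = [0.0] * n_states

    def q_value(s, a):
        total = sum(counts[s][a])
        if total == 0:
            return cost(s, a)  # nothing observed yet: use immediate cost only
        expected_next = sum(c * V[s2]
                            for s2, c in enumerate(counts[s][a])) / total
        return cost(s, a) + gamma * expected_next

    s = 0
    for _ in range(n_steps):
        qs = [q_value(s, a) for a in range(n_actions)]
        V[s] = min(qs)                      # backup at the visited state only
        if rng.random() < epsilon:          # occasional exploration
            a = rng.randrange(n_actions)
        else:
            a = qs.index(min(qs))           # greedy w.r.t. current estimates
        s2 = step(s, a, rng)
        counts[s][a][s2] += 1               # update the empirical model
        s = s2

    policy = [min(range(n_actions), key=lambda a: q_value(s, a))
              for s in range(n_states)]
    return V, policy

# Hypothetical toy queue: state = packets waiting (0..2),
# action = 0 (slow transmission) or 1 (fast transmission).
SERVICE_PROB = (0.5, 0.9)

def toy_cost(s, a):
    return float(s) + 0.5 * a   # delay cost plus transmission-effort cost

def toy_step(s, a, rng):
    if s > 0 and rng.random() < SERVICE_PROB[a]:
        s -= 1                  # one packet served
    if rng.random() < 0.5:
        s = min(s + 1, 2)       # one packet arrives (buffer capped at 2)
    return s

V, policy = learn_online(3, 2, toy_step, toy_cost)
print(V, policy)
```

The sketch updates only the state that is actually visited at each step, which is what keeps the per-step computation small in real-time dynamic programming; a layered variant would, under the framework described above, split such updates across layers.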