In this paper, we propose a two-timescale delay-optimal dynamic clustering and power allocation design for downlink network MIMO systems. The dynamic clustering control is adaptive to the global queue state information (GQSI) only and is computed at the base station controller (BSC) over a longer time scale. The power allocations of all the base stations (BSs) in each cluster, on the other hand, are adaptive to both the intracluster channel state information (CCSI) and the intracluster queue state information (CQSI), and are computed at each cluster manager (CM) over a shorter time scale. We show that the two-timescale delay-optimal control can be formulated as an infinite-horizon average-cost constrained partially observed Markov decision process (CPOMDP). By exploiting the special problem structure, we derive an equivalent Bellman equation in terms of a pattern selection Q-factor to solve the CPOMDP. To address the distributed requirement and the computational complexity, we approximate the pattern selection Q-factor by the sum of per-cluster potential functions and propose a novel distributed online learning algorithm to estimate them. We show that the proposed distributed online learning algorithm converges almost surely. By exploiting the birth-death structure of the queue dynamics, we further decompose the per-cluster potential function into the sum of per-cluster per-user potential functions and formulate the instantaneous power allocation as a per-stage QSI-aware interference game played among all the CMs. The proposed QSI-aware simultaneous iterative water-filling algorithm (QSIWFA) is shown to achieve the Nash equilibrium (NE).
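As a rough illustration of the simultaneous iterative water-filling idea mentioned above, the following is a minimal sketch of the classic, queue-unaware form on parallel Gaussian interference channels. It is not the paper's QSIWFA: the queue-dependent weighting from the per-user potential functions is not reproduced, and the names `water_fill`, `best_response`, `simultaneous_iwf`, the gain tensor `H`, and the scalar `noise` are hypothetical choices made for this example.

```python
import numpy as np

def water_fill(gain_to_noise, p_total, n_bisect=100):
    """Single-player water-filling: maximize sum_k log(1 + g_k * p_k)
    subject to sum_k p_k <= p_total and p_k >= 0."""
    inv = 1.0 / np.asarray(gain_to_noise, dtype=float)
    # Bisect on the water level mu, where p_k = max(mu - 1/g_k, 0).
    lo, hi = 0.0, p_total + inv.max()
    for _ in range(n_bisect):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - inv, 0.0).sum() > p_total:
            hi = mu
        else:
            lo = mu
    return np.maximum(lo - inv, 0.0)

def best_response(H, noise, p_max, P):
    """One simultaneous update: every player water-fills against the
    interference produced by the previous power profile P.
    H[i, j, k] is the (assumed) power gain from transmitter j to
    receiver i on subchannel k."""
    n = H.shape[0]
    idx = np.arange(n)
    direct = H[idx, idx, :]                      # shape (n, K)
    total_rx = np.einsum("ijk,jk->ik", H, P)     # all received power
    interference = total_rx - direct * P         # remove own signal
    eff_gain = direct / (noise + interference)
    return np.stack([water_fill(eff_gain[i], p_max[i]) for i in range(n)])

def simultaneous_iwf(H, noise, p_max, iters=100):
    """Iterate the simultaneous best response from a zero power profile."""
    P = np.zeros((H.shape[0], H.shape[2]))
    for _ in range(iters):
        P = best_response(H, noise, p_max, P)
    return P
```

A power profile that the best-response map leaves unchanged is a Nash equilibrium of this simplified game. The iteration is only known to converge under conditions such as sufficiently weak cross-interference, so a fixed iteration count is a simplification.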
Date of Publication: March 2011