This is the TD3-MIX DRL architecture we use to centrally train the base station agents to find the best resource allocation policy in the multicell MIMO network.
Abstract:
Dynamic frequency allocation (DFA) with massive multiple-input multiple-output (MIMO) is a promising candidate for multicell communications, where massive MIMO is adopted to maximize the per-cell capacity while the inter-cell interference (ICI) is tackled by DFA. Realizing this approach in a distributed fashion is, however, very difficult because the base stations (BSs) lack global channel state at the cell level. We utilize a forward-looking game to automate reconciliation for DFA in a distributed manner between cells, while zero-forcing (ZF) is used at each cell to maximize the multiplexing gain. To maximize the network capacity, multi-agent deep reinforcement learning (DRL) with offline centralized training is leveraged to train the BSs to master their game-theoretic reconciliation strategies. The result is a trained neural network for each BS, giving it rich experience of reconciliation with other BSs so that the network converges to an efficient equilibrium. The online algorithm is distributed: the BSs compete as expert players and start the negotiation process from their trained actions. Simulation results show that the proposed synergized deep-learning game-theoretic algorithm significantly outperforms the DRL-only and game-theoretic-only methods, as well as other benchmarks for multicell MIMO.
Published in: IEEE Access (Volume: 9)