Abstract:
Millimeter-wave (mmWave) communication is a promising technology for future vehicular networks, in which large numbers of self-driving vehicles transmit massive volumes of sensing data to the edge-cloud platform for real-time processing to ensure driving safety. While beam selection has been widely investigated in the single mmWave base station (mmBS) scenario to maximize the throughput between a vehicle and the mmBS, it remains challenging to perform optimal beam selection in mmWave vehicular networks with multiple mmBSs. On the one hand, performing beam selection at a central controller with global network information is infeasible due to the exponentially increasing complexity. On the other hand, a distributed solution may suffer from interference between overlapping beams of neighboring mmBSs, which leads to severe throughput degradation. To fill this gap, in this paper we propose a Multi-Agent Reinforcement Learning based cooperative Beam Selection (MARL-BS) algorithm for mmWave vehicular networks. Specifically, we model the beam selection problem as a multi-agent multi-armed bandit problem and adopt Q-learning to coordinate the beam selection decisions. In the proposed approach, each mmBS acts as an agent and learns the Q-values of its own actions in conjunction with those of the other mmBSs. We further propose a modified combinatorial upper confidence bound (CUCB) algorithm that balances exploration and exploitation over all candidate beams to avoid falling into a local optimum. Finally, our simulations validate the proposed MARL-BS algorithm and confirm its superior performance compared with benchmark algorithms.
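For readers unfamiliar with the CUCB idea the abstract refers to, the sketch below illustrates index-based exploration over candidate beams for a single mmBS. It is a generic combinatorial-UCB outline under assumed names and a toy reward model, not the paper's MARL-BS implementation; the CUCBBeamSelector class, the top-k "oracle", and the synthetic channel rewards are all illustrative assumptions.

import math
import random

class CUCBBeamSelector:
    """Illustrative combinatorial UCB over candidate beams for one mmBS.

    Generic CUCB sketch, not the paper's exact MARL-BS algorithm: the
    beam indexing, reward scaling, and top-k oracle are assumptions.
    """

    def __init__(self, num_beams, beams_per_round=1):
        self.num_beams = num_beams
        self.k = beams_per_round          # how many beams may be active at once
        self.counts = [0] * num_beams     # T_i: times beam i has been played
        self.means = [0.0] * num_beams    # empirical mean (normalized) throughput of beam i
        self.t = 0                        # round counter

    def select(self):
        """Return the set of beams to activate this round."""
        self.t += 1
        # Play every beam at least once before trusting the UCB index.
        unplayed = [i for i in range(self.num_beams) if self.counts[i] == 0]
        if unplayed:
            return unplayed[: self.k]
        # Adjusted estimate: empirical mean plus exploration bonus.
        ucb = [
            self.means[i] + math.sqrt(3 * math.log(self.t) / (2 * self.counts[i]))
            for i in range(self.num_beams)
        ]
        # A simple top-k selection stands in for the combinatorial optimizer.
        return sorted(range(self.num_beams), key=lambda i: ucb[i], reverse=True)[: self.k]

    def update(self, beam, reward):
        """Incorporate the observed (normalized) throughput of a played beam."""
        self.counts[beam] += 1
        self.means[beam] += (reward - self.means[beam]) / self.counts[beam]


# Toy usage: one agent, 8 candidate beams, synthetic per-beam rewards.
if __name__ == "__main__":
    agent = CUCBBeamSelector(num_beams=8, beams_per_round=2)
    for _ in range(1000):
        chosen = agent.select()
        for b in chosen:
            agent.update(b, random.betavariate(1 + b, 8 - b))  # placeholder channel model

In the paper's multi-agent setting, each mmBS would run such an index-based learner while the Q-learning layer coordinates decisions across mmBSs to limit inter-beam interference; that coordination is not shown here.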
Date of Conference: 04-07 October 2021
Date Added to IEEE Xplore: 13 December 2021