The use of multiple antennas in mobile devices provides enhanced data rates at the cost of increased power consumption. The stochastic nature of the wireless propagation medium, together with random variations in the device's utilization and operating environment, makes it difficult to estimate and predict wireless channels and power consumption levels. We therefore investigate a robust antenna subset selection policy in which the power-normalized throughput is assumed to be drawn from an unknown distribution with unknown mean. At each time instant, the transceiver selects the active antenna subset based on observations of the outcomes of previous choices, with the objective of identifying the optimal antenna subset that maximizes the power-normalized throughput. In this work, we present a sequential learning scheme to achieve this based on the theory of multi-armed bandits. Simulations verify that the proposed novel method, which accounts for dependent arms, outperforms a naïve approach designed for independent arms in terms of regret.
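To make the problem setup concrete, the following is a minimal sketch of the naïve independent-arms baseline mentioned above: standard UCB1 applied to antenna subsets treated as unrelated arms, each yielding a stochastic power-normalized throughput reward. All numerical values (antenna counts, reward means, noise level) are hypothetical illustrations, not figures from the paper, and the paper's proposed dependent-arms method is not reproduced here.

```python
import itertools
import math
import random

def ucb1_antenna_selection(num_antennas=4, subset_size=2, horizon=2000, seed=0):
    """Naive UCB1 over antenna subsets treated as independent arms.

    Illustrative sketch only: the per-subset mean power-normalized
    throughputs are hypothetical and unknown to the learner.
    Returns the empirically best subset and the cumulative regret.
    """
    rng = random.Random(seed)
    # Each candidate arm is one subset of active antennas.
    arms = list(itertools.combinations(range(num_antennas), subset_size))
    means = [rng.uniform(0.2, 0.9) for _ in arms]  # hidden true means
    best_mean = max(means)

    counts = [0] * len(arms)   # times each arm was played
    sums = [0.0] * len(arms)   # total reward per arm
    cumulative_regret = 0.0

    for t in range(1, horizon + 1):
        if t <= len(arms):
            a = t - 1  # initialization: play every arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus
            a = max(range(len(arms)),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2.0 * math.log(t) / counts[i]))
        # Noisy power-normalized throughput observation, clipped to [0, 1].
        reward = min(1.0, max(0.0, rng.gauss(means[a], 0.1)))
        counts[a] += 1
        sums[a] += reward
        cumulative_regret += best_mean - means[a]

    best = max(range(len(arms)), key=lambda i: sums[i] / counts[i])
    return arms[best], cumulative_regret

subset, regret = ucb1_antenna_selection()
print("selected subset:", subset, "cumulative regret: %.2f" % regret)
```

The paper's contribution exploits the dependence between overlapping antenna subsets (a reward observed for one subset is informative about others sharing antennas), which this independent-arms baseline ignores; that is why its regret grows faster.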