A Fast Algorithm for the Real-Valued Combinatorial Pure Exploration of the Multi-Armed Bandit


Abstract:

We study the real-valued combinatorial pure exploration problem in the stochastic multi-armed bandit (R-CPE-MAB), focusing on the case where the size of the action set is polynomial in the number of arms. In this case, R-CPE-MAB can be seen as a special case of transductive linear bandits. We introduce the combinatorial gap-based exploration (CombGapE) algorithm, whose sample complexity upper bound matches the lower bound up to a problem-dependent constant factor. We show numerically that CombGapE significantly outperforms existing methods on both synthetic and real-world data sets.
Published in: Neural Computation (Volume: 37, Issue: 2, 21 January 2025)
Page(s): 294 - 310
Date of Publication: 21 January 2025
Print ISSN: 0899-7667
