Abstract:
Contextual multi-armed bandit algorithms are an effective technique for online sequential decision-making problems. Despite their popularity, off-the-shelf library support remains limited, in particular for the Python technology stack. To fill this gap, we present MABWISER, a system that provides context-free, parametric, and non-parametric contextual multi-armed bandit models. The available bandit policies accommodate both batch and online learning. MABWISER is implemented as an open-source Python library. Its design enables built-in parallelization to speed up the training and testing components for scalability while ensuring the reproducibility of results. We present a running example to highlight the user-friendly nature of the public interface and discuss the library's simulation capability for hyper-parameter tuning and rapid experimentation.
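To make the distinction between batch and online learning concrete, the following is a minimal sketch of a context-free epsilon-greedy bandit in plain Python. It is not the MABWISER API: the class name `EpsilonGreedyBandit`, its methods, and the `seed` parameter are hypothetical names chosen for illustration, and the seeded random generator is only one possible way to obtain reproducible results.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Hypothetical context-free epsilon-greedy bandit (illustration only)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        # A dedicated, seeded generator keeps exploration reproducible.
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # number of pulls per arm
        self.totals = defaultdict(float)  # summed reward per arm

    def fit(self, decisions, rewards):
        """Batch learning: discard prior state, then train on logged history."""
        self.counts.clear()
        self.totals.clear()
        self.partial_fit(decisions, rewards)

    def partial_fit(self, decisions, rewards):
        """Online learning: update the running statistics with new observations."""
        for arm, reward in zip(decisions, rewards):
            self.counts[arm] += 1
            self.totals[arm] += reward

    def expectation(self, arm):
        """Mean observed reward of an arm (0.0 if the arm was never pulled)."""
        pulls = self.counts[arm]
        return self.totals[arm] / pulls if pulls else 0.0

    def predict(self):
        """Explore with probability epsilon, otherwise pick the best arm so far."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=self.expectation)


# Usage: train on a logged batch, then continue learning online.
bandit = EpsilonGreedyBandit(["A", "B"], epsilon=0.0, seed=42)
bandit.fit(["A", "B", "A", "B"], [0, 1, 0, 1])
first_choice = bandit.predict()
bandit.partial_fit(["A", "A", "A", "B", "B"], [1, 1, 1, 0, 0])
second_choice = bandit.predict()
```

With `epsilon=0.0` the policy is purely greedy, so the prediction depends only on the observed mean rewards, which change as `partial_fit` folds in new observations. A contextual policy would additionally condition these estimates on a feature vector.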
Date of Conference: 04-06 November 2019
Date Added to IEEE Xplore: 13 February 2020