Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost

IEEE Journals & Magazine | IEEE Xplore