Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards | IEEE Journals & Magazine