On learning Whittle index policy for restless bandits with scalable regret | IEEE Journals & Magazine | IEEE Xplore