Reinforcement Learning-based Bandwidth Decision in Optical Access Networks: A Study of Exploration Strategy and Time with Confidence Guarantee | IEEE Journals & Magazine | IEEE Xplore