RTP-Q: a reinforcement learning system with an active exploration planning structure for enhancing the convergence rate | IEEE Conference Publication