Experience replay for least-squares policy iteration | IEEE Journals & Magazine | IEEE Xplore