Adaptive stepsize selection for online Q-learning in a non-stationary environment

IEEE Conference Publication (IEEE Xplore)