A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional Markov decision process with continuous state and action spaces | IEEE Conference Publication