On-line learning of a feedback controller for quasi-passive-dynamic walking by a stochastic policy gradient method | IEEE Conference Publication