Trajectory-based Probabilistic Policy Gradient for Learning Locomotion Behaviors | IEEE Conference Publication | IEEE Xplore