We propose a model-based reinforcement learning (RL) algorithm for biped walking in which the robot learns to appropriately modulate an observed walking pattern. Via-points are detected from the observed walking trajectories using the minimum jerk criterion. The learning algorithm controls the via-points based on a learned model of the Poincaré map of the periodic walking pattern. The model maps a state in the single support phase, together with the controlled via-points, to the state in the next single support phase. We applied this approach to both a simulated robot model and an actual biped robot, and we show that successful walking policies were acquired.
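
As an illustration of the minimum jerk criterion mentioned above, the following Python sketch evaluates the standard rest-to-rest minimum-jerk profile between two via-points and chains such segments together. It is a simplified sketch, not the method used in this work: it assumes zero velocity and acceleration at every via-point, whereas a via-point formulation generally lets the trajectory pass through intermediate points without stopping. The function names `minimum_jerk` and `trajectory_through`, and the uniform segment duration, are assumptions made for this example.

```python
def minimum_jerk(x0, xf, duration, t):
    """Position at time t on a rest-to-rest minimum-jerk segment from x0 to xf.

    Assumes zero velocity and acceleration at both ends (a simplification).
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Normalised time, clamped to [0, 1].
    tau = min(max(t / duration, 0.0), 1.0)
    # Quintic blend that minimises integrated squared jerk for these boundary conditions.
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * s


def trajectory_through(via_points, segment_duration, samples_per_segment):
    """Sample a trajectory that visits each via-point in order.

    Hypothetical helper: each pair of consecutive via-points is joined by an
    independent rest-to-rest segment of equal duration.
    """
    if len(via_points) < 2:
        raise ValueError("need at least two via-points")
    samples = []
    for start, end in zip(via_points, via_points[1:]):
        for k in range(samples_per_segment):
            t = k * segment_duration / samples_per_segment
            samples.append(minimum_jerk(start, end, segment_duration, t))
    samples.append(via_points[-1])  # close the final segment exactly
    return samples


if __name__ == "__main__":
    # Example joint-angle via-points in radians (made-up values).
    path = trajectory_through([0.0, 0.4, -0.2], segment_duration=0.5, samples_per_segment=5)
    print(path)
```

In this simplified picture, modulating the walking pattern corresponds to shifting the values in `via_points` and regenerating the trajectory.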