I. INTRODUCTION
The use of deep reinforcement learning (RL) for robotic control is on the rise, changing the way control policies are created for legged robots and other complex dynamic systems. In particular, model-free approaches have gained prominence and are increasingly replacing traditional optimization-based methods. This paradigm shift can be attributed to high-capacity neural network models, effective model-free algorithms that can solve complex problems, and efficient tools for data generation (i.e., simulation). As a result, the synthesis of locomotion policies for legged robots has become more straightforward and accessible, as evidenced by the growing number of RL-based controllers in the recent literature.