Constrained Dirichlet Distribution Policy: Guarantee Zero Constraint Violation Reinforcement Learning for Continuous Robotic Control | IEEE Journals & Magazine | IEEE Xplore