
Celebrating Robustness in Efficient Off-Policy Meta-Reinforcement Learning



Abstract:

Deep reinforcement learning algorithms can enable agents to learn policies for complex tasks without expert knowledge. However, the learned policies are typically specialized to one specific task and cannot generalize to new tasks. While meta-reinforcement learning (meta-RL) algorithms can enable agents to solve new tasks based on prior experience, most build on on-policy reinforcement learning algorithms that require large amounts of samples during meta-training and ignore task-specific features across different tasks, which makes it difficult to train a high-performing agent. To address these challenges, in this paper we propose an off-policy meta-RL algorithm, CRL (Celebrating Robustness Learning), that disentangles task-specific policy parameters from shared low-level parameters via an adapter network, learns a probabilistic latent space to extract universal information across different tasks, and performs temporally-extended exploration. Our approach outperforms baseline methods in both sample efficiency and asymptotic performance on several meta-RL benchmarks.
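The abstract describes a policy whose low-level parameters are shared across tasks while an adapter network, conditioned on a probabilistic latent task variable, supplies task-specific behavior. The sketch below is a minimal, hypothetical illustration of that kind of architecture (a PEARL-style Gaussian context encoder plus a gated adapter over shared layers); all class names, layer sizes, and the gating scheme are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch: shared low-level policy layers, a task-specific adapter,
# and a probabilistic latent context encoder. Names and sizes are illustrative.
import torch
import torch.nn as nn

class LatentContextEncoder(nn.Module):
    """Maps a batch of transitions from one task to a sampled latent z."""
    def __init__(self, transition_dim, latent_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # mean and log-variance
        )
        self.latent_dim = latent_dim

    def forward(self, transitions):                  # (N, transition_dim)
        stats = self.net(transitions).mean(dim=0)    # aggregate over transitions
        mu, log_var = stats[: self.latent_dim], stats[self.latent_dim:]
        std = torch.exp(0.5 * log_var)
        return mu + std * torch.randn_like(std)      # reparameterised sample

class AdapterPolicy(nn.Module):
    """Shared low-level layers plus a small adapter conditioned on the
    latent task variable z, which modulates the shared features per task."""
    def __init__(self, obs_dim, act_dim, latent_dim, hidden=256):
        super().__init__()
        self.shared = nn.Sequential(                 # shared across all tasks
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.adapter = nn.Sequential(                # task-specific modulation
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
        )
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, z):
        h = self.shared(obs)
        h = h * torch.sigmoid(self.adapter(z))       # gate shared features by task
        return torch.tanh(self.head(h))

# Usage: sample z once from a handful of transitions, then act with it held
# fixed for the episode, giving temporally-extended (per-episode) exploration.
obs_dim, act_dim, latent_dim = 11, 3, 5
transition_dim = 2 * obs_dim + act_dim + 1           # (s, a, r, s') tuples
encoder = LatentContextEncoder(transition_dim, latent_dim)
policy = AdapterPolicy(obs_dim, act_dim, latent_dim)
context = torch.randn(32, transition_dim)
z = encoder(context)
action = policy(torch.randn(obs_dim), z)
```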
Date of Conference: 17-22 July 2022
Date Added to IEEE Xplore: 05 September 2022
Conference Location: Guiyang, China

