ExAug: Robot-Conditioned Navigation Policies via Geometric Experience Augmentation



Abstract:

Machine learning techniques rely on large and diverse datasets for generalization. Computer vision, natural language processing, and other applications can often reuse public datasets to train many different models. However, due to differences in physical configurations, it is challenging to leverage public datasets for training robotic control policies on new robot platforms or for new tasks. In this work, we propose a novel framework, ExAug, to augment the experiences of different robot platforms from multiple datasets in diverse environments. ExAug leverages a simple principle: by extracting 3D information in the form of a point cloud, we can create much more complex and structured augmentations, using both synthetic image generation and geometry-aware penalization, to produce experiences that would have been suitable in the same situation for a different robot with a different size, turning radius, and camera placement. The trained policy is evaluated on two new robot platforms with three different cameras, in indoor and outdoor environments with obstacles.
Date of Conference: 29 May 2023 - 02 June 2023
Date Added to IEEE Xplore: 04 July 2023
Conference Location: London, United Kingdom

I. Introduction

Machine learning methods can be used to train effective models for visual perception [1]–[3], natural language processing [4], [5], and numerous other applications [6], [7]. However, broadly generalizable models typically rely on large and highly diverse datasets, which are usually collected once and then reused repeatedly for many different models and methods. In robotics, this presents a major challenge: every robot might have a different physical configuration, such that end-to-end learning of control policies usually requires specialized data collection for each robotic platform. This calls for developing techniques that can enable learning from experience collected across different robots and sensors.
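To make the geometric idea behind ExAug concrete, the sketch below illustrates the two ingredients named in the abstract: back-projecting an observation into a point cloud and re-rendering it for a hypothetical robot with a different camera placement and intrinsics, together with a simple clearance-style penalty conditioned on robot size. This is a minimal illustration under assumed conventions (a pinhole camera model, NumPy arrays, a circular robot footprint, and hypothetical function names such as backproject and render_virtual_view), not the authors' implementation.

import numpy as np

def backproject(depth, K):
    """Back-project a depth image (H, W) into camera-frame 3D points using pinhole intrinsics K."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))           # pixel coordinates
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)   # (H*W, 3) points

def render_virtual_view(points, colors, K_new, T_new, out_hw):
    """Splat colored 3D points into a virtual camera with intrinsics K_new and
    extrinsics T_new (4x4 transform from the source camera frame to the new camera frame)."""
    H, W = out_hw
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam = (T_new @ pts_h.T).T[:, :3]                 # points in the new camera frame
    keep = cam[:, 2] > 0.1                           # drop points behind or too close to the camera
    cam, colors = cam[keep], colors[keep]
    u = (K_new[0, 0] * cam[:, 0] / cam[:, 2] + K_new[0, 2]).astype(int)
    v = (K_new[1, 1] * cam[:, 1] / cam[:, 2] + K_new[1, 2]).astype(int)
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u, v, cam, colors = u[inside], v[inside], cam[inside], colors[inside]
    img = np.zeros((H, W, 3), dtype=np.uint8)
    zbuf = np.full((H, W), np.inf)
    for i in range(len(cam)):                        # nearest point wins at each pixel (crude z-buffer)
        if cam[i, 2] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = cam[i, 2]
            img[v[i], u[i]] = colors[i]
    return img

def clearance_penalty(points, robot_radius):
    """Geometry-aware penalty that grows as the nearest obstacle point
    enters the (assumed circular) footprint of the target robot."""
    d_min = np.linalg.norm(points[:, [0, 2]], axis=1).min()  # horizontal distance in camera frame
    return max(0.0, robot_radius - d_min)

Under these assumptions, an RGB-D observation recorded by one robot could be re-rendered for another robot's camera height, mounting pose, and intrinsics (via T_new and K_new), and its actions re-penalized with that robot's radius, which is the spirit of the experience augmentation summarized in the abstract.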

References
[1] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255.
[2] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[3] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, "NeRF: Representing scenes as neural radiance fields for view synthesis," Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021.
[4] A. Wang, A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman, "GLUE: A multi-task benchmark and analysis platform for natural language understanding," in 7th International Conference on Learning Representations (ICLR), 2019.
[5] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension," arXiv preprint arXiv:1910.13461, 2019.
[6] J. A. Sidey-Gibbons and C. J. Sidey-Gibbons, "Machine learning in medicine: A practical introduction," BMC Medical Research Methodology, vol. 19, no. 1, pp. 1–18, 2019.
[7] T. Davenport and R. Kalakota, "The potential for artificial intelligence in healthcare," Future Healthcare Journal, vol. 6, no. 2, p. 94, 2019.
[8] C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," Journal of Big Data, vol. 6, p. 60, 2019. [Online]. Available: https://doi.org/10.1186/s40537-019-0197-0
[9] S. Y. Feng, V. Gangal, J. Wei, S. Chandar, S. Vosoughi, T. Mitamura, and E. Hovy, "A survey of data augmentation approaches for NLP," in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021, pp. 968–988.
[10] S. Hutchinson, G. D. Hager, and P. I. Corke, "A tutorial on visual servo control," IEEE Transactions on Robotics and Automation, vol. 12, no. 5, pp. 651–670, 1996.
[11] F. Chaumette and S. Hutchinson, "Visual servo control. I. Basic approaches," IEEE Robotics & Automation Magazine, vol. 13, no. 4, pp. 82–90, 2006.
[12] F. Chaumette and S. Hutchinson, "Visual servo control. II. Advanced approaches [Tutorial]," IEEE Robotics & Automation Magazine, vol. 14, no. 1, pp. 109–118, 2007.
[13] Z. Li, C. Yang, C.-Y. Su, J. Deng, and W. Zhang, "Vision-based model predictive control for steering of a nonholonomic mobile robot," IEEE Transactions on Control Systems Technology, vol. 24, no. 2, pp. 553–564, 2015.
[14] M. Sauvée, P. Poignet, E. Dombre, and E. Courtial, "Image based visual servoing through nonlinear model predictive control," in Proceedings of the 45th IEEE Conference on Decision and Control. IEEE, 2006, pp. 1776–1781.
[15] R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras," IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255–1262, 2017.
[16] A. Kim and R. M. Eustice, "Perception-driven navigation: Active visual SLAM for robotic area coverage," in 2013 IEEE International Conference on Robotics and Automation. IEEE, 2013, pp. 3196–3203.
[17] G. Kahn, P. Abbeel, and S. Levine, "BADGR: An autonomous self-supervised learning-based navigation system," IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1312–1319, 2021.
[18] N. Savinov, A. Dosovitskiy, and V. Koltun, "Semi-parametric topological memory for navigation," in International Conference on Learning Representations, 2018.
[19] K. Chen, J. P. de Vicente, G. Sepulveda, F. Xia, A. Soto, M. Vázquez, and S. Savarese, "A behavioral approach to visual navigation with graph localization networks," arXiv preprint arXiv:1903.00445, 2019.
[20] D. Shah, B. Eysenbach, G. Kahn, N. Rhinehart, and S. Levine, "ViNG: Learning open-world navigation with visual goals," in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 13215–13222.
[21] N. Hirose, F. Xia, R. Martín-Martín, A. Sadeghian, and S. Savarese, "Deep visual MPC-policy learning for navigation," IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3184–3191, 2019.
[22] D. Pathak, P. Mahmoudieh, G. Luo, P. Agrawal, D. Chen, Y. Shentu, E. Shelhamer, J. Malik, A. A. Efros, and T. Darrell, "Zero-shot visual imitation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 2050–2053.
[23] Y. Zhu, R. Mottaghi, E. Kolve, J. J. Lim, A. Gupta, L. Fei-Fei, and A. Farhadi, "Target-driven visual navigation in indoor scenes using deep reinforcement learning," in 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, pp. 3357–3364.
[24] G. Kahn, A. Villaflor, B. Ding, P. Abbeel, and S. Levine, "Self-supervised deep reinforcement learning with generalized computation graphs for robot navigation," in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 5129–5136.
[25] N. Hirose, S. Taguchi, F. Xia, R. Martín-Martín, K. Tahara, M. Ishigaki, and S. Savarese, "Probabilistic visual navigation with bidirectional image prediction," in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 1539–1546.
[26] D. Shah and S. Levine, "ViKiNG: Vision-based kilometer-scale navigation with geographic hints," arXiv preprint arXiv:2202.11271, 2022.
[27] F. Xia, A. R. Zamir, Z. He, A. Sax, J. Malik, and S. Savarese, "Gibson Env: Real-world perception for embodied agents," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9068–9079.
[28] M. Savva, A. Kadian, O. Maksymets, Y. Zhao, E. Wijmans, B. Jain, J. Straub, J. Liu, V. Koltun, J. Malik, et al., "Habitat: A platform for embodied AI research," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9339–9347.
[29] A. Kadian, J. Truong, A. Gokaslan, A. Clegg, E. Wijmans, S. Lee, M. Savva, S. Chernova, and D. Batra, "Sim2Real predictivity: Does evaluation in simulation predict real-world performance?" IEEE Robotics and Automation Letters, 2020.
[30] H. Kataoka, K. Okayasu, A. Matsumoto, E. Yamagata, R. Yamada, N. Inoue, A. Nakamura, and Y. Satoh, "Pre-training without natural images," in Proceedings of the Asian Conference on Computer Vision, 2020.
