Abstract:
The ability to localize in the coordinate system of a 3D model presents an opportunity for safe trajectory planning. While SLAM-based approaches estimate incremental poses with respect to the first camera frame, they do not provide global localization. Leveraging mobile GPUs such as the Nvidia TX1, we present a novel, high-performance visual method for model-based robot localization. We propose to learn an environment representation with deep residual networks for localization in a known 3D model covering a real-world area of 25,000 sq. meters. We use modern GPUs and game engines to render training images from a photorealistic 3D model, mimicking a downward-looking, high-flying drone. These images drive the training loop of a 50-layer deep residual network that learns to predict camera position. We further apply data augmentation to accelerate training and to make the trained model robust for cross-domain generalization, which we verify experimentally. We evaluate the trained model on both synthetically generated data and real data captured from a downward-looking drone. Predicting a camera pose takes about 25 milliseconds of GPU processing. Unlike previous methods, the proposed approach performs no rendering at test time and predicts each pose independently from the learned environment representation.
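To make the described pipeline concrete, below is a minimal sketch of a 50-layer residual network repurposed for camera-position regression. The abstract does not specify a framework or output parameterization, so PyTorch/torchvision, the `PoseRegressor` class name, and the 3-DoF (x, y, z) output are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions: PyTorch + torchvision; 3-DoF position output).
# A ResNet-50 backbone with its ImageNet classifier replaced by a small
# regression head that maps image features to a camera position.
import torch
import torch.nn as nn
from torchvision import models

class PoseRegressor(nn.Module):
    def __init__(self, out_dim: int = 3):  # out_dim is an assumption
        super().__init__()
        backbone = models.resnet50(weights=None)   # 50-layer residual net
        in_features = backbone.fc.in_features      # 2048-d feature vector
        backbone.fc = nn.Identity()                # drop the classification head
        self.backbone = backbone
        self.head = nn.Linear(in_features, out_dim)  # regress camera position

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))

# Usage: a single forward pass on one downward-looking frame.
model = PoseRegressor().eval()
with torch.no_grad():
    frame = torch.randn(1, 3, 224, 224)   # dummy RGB image
    position = model(frame)               # predicted (x, y, z), shape (1, 3)
```

At test time this is a single feed-forward pass with no rendering, consistent with the abstract's claim of roughly 25 ms per prediction on a GPU.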
Date of Conference: 17-20 September 2017
Date Added to IEEE Xplore: 22 February 2018
Electronic ISSN: 2381-8549