Single-camera solutions, such as monocular visual odometry or monoSLAM approaches, have found wide resonance in the community. All monocular approaches, however, suffer from a lack of metric scale. In this paper, we present a solution to this issue that adds an inertial sensor equipped with a three-axis accelerometer and gyroscope. In contrast to previous approaches, our solution is independent of the underlying vision algorithm that estimates the camera poses. As a direct consequence, the algorithm presented here operates at constant computational complexity in real time. We treat the visual framework as a black box, so the approach is modular and widely applicable to existing monocular solutions. It can be used with any pose estimation algorithm, such as visual odometry, visual SLAM, monocular or stereo setups, or even GPS solutions with gravity and compass attitude estimation. We present the full development of the metric state estimation, which is based on an Extended Kalman Filter. Furthermore, even though we treat the visual framework as a black box, we show how to detect failures and estimate drifts in it. We implement our solution on a monocular vision pose estimation framework and show results both in simulation and on real data.
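To illustrate why an accelerometer can make the visual scale observable, the following is a minimal one-dimensional sketch and not the filter developed in this paper. It assumes a state of metric position, metric velocity, and a constant visual scale factor. A black-box vision module is assumed to report position in its own arbitrary units (scale times metric position), and the initial position and velocity are assumed known. The function name `estimate_scale`, the noise levels, and the sinusoidal excitation are hypothetical choices made only for this example.

```python
import math
import random


def mat_mul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def estimate_scale(true_scale=2.0, dt=0.01, steps=1000, seed=0):
    """Toy 1-D EKF: fuse accelerometer readings with unscaled vision positions."""
    rng = random.Random(seed)
    accel_sigma, vision_sigma = 0.01, 0.01  # assumed noise levels

    # Simulated ground truth in metric units (hypothetical trajectory).
    true_p, true_v = 0.0, 1.0

    # Filter state: metric position p, metric velocity v, visual scale lam.
    # Assumption: p and v start at their true values; lam starts at a wrong guess.
    p, v, lam = 0.0, 1.0, 1.0
    P = [[1e-6, 0.0, 0.0], [0.0, 1e-6, 0.0], [0.0, 0.0, 1.0]]

    F = [[1.0, dt, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    Ft = [list(col) for col in zip(*F)]
    g = [0.5 * dt * dt, dt, 0.0]  # how accelerometer noise enters the state
    Q = [[gi * gj * accel_sigma ** 2 for gj in g] for gi in g]
    R = vision_sigma ** 2

    for k in range(steps):
        # Simulate the true motion and the two noisy sensor readings.
        a_true = -math.sin(k * dt)
        true_p, true_v = (true_p + true_v * dt + 0.5 * a_true * dt * dt,
                          true_v + a_true * dt)
        a_meas = a_true + rng.gauss(0.0, accel_sigma)
        z = true_scale * true_p + rng.gauss(0.0, vision_sigma)

        # Prediction: integrate the measured acceleration; the scale is constant.
        p, v = p + v * dt + 0.5 * a_meas * dt * dt, v + a_meas * dt
        FPFt = mat_mul(mat_mul(F, P), Ft)
        P = [[FPFt[i][j] + Q[i][j] for j in range(3)] for i in range(3)]

        # Update: measurement model h(x) = lam * p, Jacobian H = [lam, 0, p].
        H = [lam, 0.0, p]
        PHt = [sum(P[i][j] * H[j] for j in range(3)) for i in range(3)]
        S = sum(H[i] * PHt[i] for i in range(3)) + R
        K = [x / S for x in PHt]
        innovation = z - lam * p
        p = p + K[0] * innovation
        v = v + K[1] * innovation
        lam = lam + K[2] * innovation
        P = [[P[i][j] - K[i] * PHt[j] for j in range(3)] for i in range(3)]

    return lam


if __name__ == "__main__":
    print(estimate_scale())
```

In this toy setup the scale estimate should settle near `true_scale` once the motion contains enough acceleration. With zero acceleration, the scale cannot be separated from an unknown initial velocity. The cost per step is fixed by the state dimension, which mirrors the constant-complexity property claimed above.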