Abstract:
This paper presents a study for RGB-D camera pose estimation using deep learning techniques. The proposed network architecture is composed of two components: the convolut...Show MoreMetadata
Abstract:
This paper presents a study for RGB-D camera pose estimation using deep learning techniques. The proposed network architecture is composed of two components: the convolution neural network (CNN) for exploiting the vision information, and the Long Short-Term Memory (LSTM) block for incorporating the temporal information. The CNN, more precisely a RGB-D variant of GoogLeNet, functionalizes as a feature-oriented camera pose estimator, while the LSTM works as a temporal filter to model the pose transition. A modified loss function is also proposed to help regulate the convergence of the pose parameters. Experimental results show that the combination of CNN and LSTM can achieve a higher pose estimation accuracy, while the pipeline structure defined in the network can also provide flexibility for handling different scenarios.
Date of Conference: 14-16 November 2017
Date Added to IEEE Xplore: 08 March 2018
ISBN Information: