Estimation of human motion has improved with recent advances in depth sensors such as the Microsoft Kinect. However, such sensors have a limited depth range, and a large number of them is needed to estimate motion over large areas. In this paper, we explore the possibility of estimating motion from monocular data using initial and intermittent 3D models provided by the depth sensor. We use motion segmentation to divide the scene into several rigidly moving components. The orientations of the individual components are estimated, and these reconstructions are synthesized into a coherent estimate of the scene. We demonstrate our algorithm on three real video sequences. Quantitative comparison with depth sensor reconstructions shows that the proposed method can accurately estimate motion even with a single 3D initialization.