Skip to Main Content
The purpose of this study is to investigate a variational method for joint multiregion three-dimensional (3-D) motion segmentation and 3-D interpretation of temporal sequences of monocular images. Interpretation consists of dense recovery of 3-D structure and motion from the image sequence spatiotemporal variations due to short-range image motion. The method is direct insomuch as it does not require prior computation of image motion. It allows movement of both viewing system and multiple independently moving objects. The problem is formulated following a variational statement with a functional containing three terms. One term measures the conformity of the interpretation within each region of 3-D motion segmentation to the image sequence spatiotemporal variations. The second term is of regularization of depth. The assumption that environmental objects are rigid accounts automatically for the regularity of 3-D motion within each region of segmentation. The third and last term is for the regularity of segmentation boundaries. Minimization of the functional follows the corresponding Euler-Lagrange equations. This results in iterated concurrent computation of 3-D motion segmentation by curve evolution, depth by gradient descent, and 3-D motion by least squares within each region of segmentation. Curve evolution is implemented via level sets for topology independence and numerical stability. This algorithm and its implementation are verified on synthetic and real image sequences. Viewers presented with anaglyphs of stereoscopic images constructed from the algorithm's output reported a strong perception of depth.