Estimating 3D information from an image sequence has long been a challenging problem, especially for dynamic scenes. In this paper, we present a novel semi-automatic 2D-to-3D conversion method that estimates disparity maps for regular 2D video shots. The method requires only a few user scribbles on a sparse set of key frames; the remaining frames of the video are then converted to 3D automatically. Multiple objects are first segmented using the input scribbles. An initial disparity map is then assigned to each key frame with the aid of preset disparity models, one chosen for each object. After this disparity assignment step, disparity maps for the other frames are obtained through a disparity propagation strategy that takes into account both color similarity and motion information. Finally, the 3D video is synthesized according to the type of 3D display device. We verify the method on several kinds of challenging sequences, including ones with occlusion, textureless regions, color ambiguity, and large-displacement motion. The experimental results show that our method outperforms state-of-the-art 2D-to-3D conversion systems.
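The propagation step described above can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: a dense motion field `flow` from the target frame to the key frame is taken as given, color similarity is modeled with a Gaussian weight, and the function name `propagate_disparity` and the parameters `radius` and `sigma_color` are hypothetical. Each target pixel looks up its motion-compensated position in the key frame and averages the key-frame disparities in a small window around it, weighted by how similar the colors are.

```python
import numpy as np

def propagate_disparity(key_frame, key_disp, target_frame, flow,
                        radius=1, sigma_color=10.0):
    """Propagate a key-frame disparity map to a target frame (illustrative only).

    key_frame, target_frame: (H, W, 3) color images.
    key_disp: (H, W) disparity map of the key frame.
    flow: (H, W, 2) assumed motion field; flow[y, x] = (dx, dy) maps a
          target pixel to its corresponding position in the key frame.
    """
    h, w = key_disp.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Motion cue: motion-compensated position of each target pixel in the key frame.
    base_y = ys + np.rint(flow[..., 1]).astype(int)
    base_x = xs + np.rint(flow[..., 0]).astype(int)

    key = key_frame.astype(np.float64)
    target = target_frame.astype(np.float64)
    num = np.zeros((h, w))
    den = np.zeros((h, w))

    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp window positions to the image borders.
            ky = np.clip(base_y + dy, 0, h - 1)
            kx = np.clip(base_x + dx, 0, w - 1)
            # Color cue: Gaussian weight on the squared color difference.
            diff = target - key[ky, kx]
            weight = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma_color ** 2))
            num += weight * key_disp[ky, kx]
            den += weight

    # Guard against all weights underflowing to zero.
    return num / np.maximum(den, 1e-12)
```

In this sketch the window lets a pixel recover from small motion errors, while the color weight keeps disparities from leaking across object boundaries. A real system would also need to handle occlusions and newly revealed regions, which this example ignores.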