We propose a content-based 3-D mosaic (CB3M) representation for long video sequences of 3-D and dynamic urban scenes captured by a camera on a mobile platform. In the first phase, a set of parallel-perspective (pushbroom) mosaics with varying viewing directions is generated to capture both the 3-D and dynamic aspects of the scene within the camera's coverage. In the second phase, a segmentation-based stereo matching algorithm is applied to extract parametric representations of the color, structure, and motion of the dynamic and/or 3-D objects in urban scenes, which contain many planar surfaces. Multiple pairs of stereo mosaics are used to facilitate reliable stereo matching, occlusion handling, accurate 3-D reconstruction, and robust moving-target detection. CB3M is a highly compressed visual representation of a dynamic 3-D scene whose object content carries both 3-D and motion information. Experimental results are given for various real video sequences of large-scale 3-D scenes.
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 22, Issue: 2)
Date of Publication: Feb. 2012
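
The first phase named in the abstract, building parallel-perspective (pushbroom) mosaics with different viewing directions, can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not detail. It assumes an idealized camera that translates along the image y axis by exactly one pixel per frame, so that stacking the same pixel row from every frame yields one mosaic, and choosing a different row yields a different oblique viewing direction. The function name `pushbroom_mosaic` and the parameter `slit_offset` are hypothetical.

```python
import numpy as np


def pushbroom_mosaic(frames, slit_offset):
    """Stack one pixel row ("slit") per frame into a pushbroom mosaic.

    Assumptions (not taken from the paper):
      * frames has shape (T, H, W) or (T, H, W, C);
      * the camera moves along the image y axis by one pixel per frame,
        so consecutive slits abut without gaps or overlap.

    slit_offset is the signed row offset from the image center; each
    offset corresponds to one viewing direction of the mosaic.
    """
    frames = np.asarray(frames)
    if frames.ndim < 3:
        raise ValueError("frames must have shape (T, H, W) or (T, H, W, C)")

    height = frames.shape[1]
    row = height // 2 + slit_offset
    if not 0 <= row < height:
        raise ValueError("slit_offset places the slit outside the image")

    # Row `row` of frame t becomes row t of the mosaic.
    return frames[:, row]


def multiview_mosaics(frames, slit_offsets):
    """Build one mosaic per slit offset, keyed by that offset."""
    return {d: pushbroom_mosaic(frames, d) for d in slit_offsets}
```

Two mosaics built with different offsets, for example `multiview_mosaics(frames, [-40, 40])`, form one stereo pair in this idealized setting. Real footage would first need motion estimation and rectification, which this sketch leaves out.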