Detailed geometric modeling from images is important, but it is also complex and computationally expensive. In this paper we present an algorithm for large-scale terrestrial geometric modeling of urban scenes from video. In the proposed approach, we classify and segment the contents of each image using prior knowledge about the scene. The segments of each image are then aligned to the similar segments in the consecutive image and warped accordingly. This alignment and warping provide an overall image-to-image matching, which allows us to compute a refined dense pixel matching more efficiently and reliably. In our experiment, we reconstruct a dense three-dimensional (3D) point cloud of a street and its buildings from video captured by a camera mounted on top of a vehicle. Our experimental results demonstrate that the proposed algorithm works effectively for difficult scenes, such as objects that lack texture or scenes captured under poor lighting conditions.
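The idea of aligning segments between consecutive frames to obtain a coarse image-to-image matching can be illustrated with a minimal sketch. The abstract does not specify the alignment or warping model, so the code below assumes the simplest possible one: each segment moves by a single integer translation, found by exhaustive search over a small window using the mean absolute difference. The function names `align_segment` and `coarse_flow`, the `radius` parameter, and the label-map input are hypothetical and are not taken from the paper.

```python
import numpy as np

def align_segment(img_a, img_b, mask, radius=4):
    """Return the integer (dy, dx) that best maps the masked pixels of
    img_a onto img_b, by exhaustive search within +/- radius.
    Assumption: a pure translation per segment (not the paper's model)."""
    img_a = img_a.astype(np.float64)
    img_b = img_b.astype(np.float64)
    ys, xs = np.nonzero(mask)
    h, w = img_b.shape
    best, best_cost = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ty, tx = ys + dy, xs + dx
            # Keep only segment pixels whose shifted position is inside img_b.
            valid = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
            if valid.sum() < 0.5 * len(ys):
                continue  # too little overlap to trust this shift
            diff = img_a[ys[valid], xs[valid]] - img_b[ty[valid], tx[valid]]
            cost = np.abs(diff).mean()
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def coarse_flow(img_a, img_b, labels, radius=4):
    """Build a per-pixel displacement field in which every pixel of a
    segment carries that segment's estimated translation."""
    flow = np.zeros(labels.shape + (2,), dtype=np.int64)
    for lab in np.unique(labels):
        mask = labels == lab
        flow[mask] = align_segment(img_a, img_b, mask, radius)
    return flow
```

Under these assumptions, the returned field could serve as a starting point for a dense matcher, which would then only need to search a small neighborhood around each pixel's coarse displacement rather than the whole image.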