We present a novel method to reconstruct the 3D shape of a scene from several calibrated images. Our motivation is that most existing multi-view stereovision approaches require some knowledge of the scene extent and often even of its approximate geometry (e.g. visual hull). This makes these approaches mainly suited to compact objects admitting a tight enclosing box, imaged on a simple or a known background. In contrast, our approach focuses on large-scale cluttered scenes under uncontrolled imaging conditions. It first generates a quasi-dense 3D point cloud of the scene by matching keypoints across images in a lenient manner, thus possibly retaining many false matches. Then it builds an adaptive tetrahedral decomposition of space by computing the 3D Delaunay triangulation of the 3D point set. Finally, it reconstructs the scene by labeling Delaunay tetrahedra as empty or occupied, thus generating a triangular mesh of the scene. A globally optimal label assignment, as regards photo-consistency of the output mesh and compatibility with the visibility of keypoints in input images, is efficiently found as a minimum cut solution in a graph.
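The labeling step can be illustrated on a toy problem. The sketch below is not the implementation described above: it replaces the Delaunay tetrahedra with a hypothetical chain of four cells along a single line of sight, and all costs are made-up integers standing in for the visibility and photo-consistency terms, whose exact form is not given here. It only shows how a binary empty/occupied labeling with per-cell costs and per-facet costs can be solved exactly as an s-t minimum cut, and how the surface is read off as the facets separating differently labeled cells. The function name `label_cells` and the data layout are assumptions made for this example.

```python
from collections import defaultdict, deque


def label_cells(unary, pairwise):
    """Label cells 'empty' or 'occupied' with an s-t minimum cut.

    unary[i]         = (cost_if_empty, cost_if_occupied) for cell i (hypothetical costs)
    pairwise[(i, j)] = cost paid when neighbouring cells i and j get different
                       labels, i.e. when their shared facet lies on the surface
    Returns (labels, energy), where energy is the value of the minimum cut.
    """
    n = len(unary)
    source, sink = n, n + 1  # source side = empty, sink side = occupied
    cap = defaultdict(lambda: defaultdict(int))
    for i, (cost_empty, cost_occupied) in enumerate(unary):
        cap[source][i] += cost_occupied  # cut when cell i ends up occupied
        cap[i][sink] += cost_empty       # cut when cell i ends up empty
    for (i, j), weight in pairwise.items():
        cap[i][j] += weight
        cap[j][i] += weight

    energy = 0
    while True:
        # Breadth-first search for an augmenting path (Edmonds-Karp).
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, residual in cap[u].items():
                if residual > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no path left: nodes in `parent` form the source side
        # Find the bottleneck capacity, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        energy += bottleneck

    labels = ["empty" if i in parent else "occupied" for i in range(n)]
    return labels, energy


# Toy chain of cells 0-1-2-3 along one camera ray (all numbers are illustrative).
unary = [
    (0, 100),  # cell 0 contains the camera: strongly preferred empty
    (0, 0),
    (0, 0),
    (100, 0),  # cell 3 lies behind the observed point: preferred occupied
]
# A lower facet cost stands for a more photo-consistent facet.
pairwise = {(0, 1): 5, (1, 2): 1, (2, 3): 4}

labels, energy = label_cells(unary, pairwise)
# The surface consists of the facets between differently labeled cells.
surface = [pair for pair in pairwise if labels[pair[0]] != labels[pair[1]]]
print(labels, surface, energy)
```

In a full pipeline the graph nodes would be the tetrahedra of a 3D Delaunay triangulation and the edges their shared triangular facets; the Edmonds-Karp search used here is chosen for brevity and would likely be replaced by a faster max-flow solver on large scenes.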