In this paper we analyze the ability of a computer vision system to derive properties of the three-dimensional (3-D) physical world from two-dimensional (2-D) images. We present a new approach based on model-based interpretation of a single perspective image. Linear image features and sets of such features are backprojected into 3-D space, and geometric models are then used to select among the possible solutions. The paper treats two situations: 1) interpretation of scenes with a simple geometric structure (orthogonality), in which case we seek the orientation of this structure relative to the viewer (three rotations), and 2) recognition of moderately complex objects whose shapes (geometrical and topological properties) are provided in advance. The recognition technique is limited to objects containing, among other features, straight edges and planar faces. In the first case, the computation can be carried out by a parallel algorithm that selects the solution receiving the largest number of votes in an accumulation space. In the second case, an object is uniquely assigned to a set of image features through a search strategy. As a by-product, the spatial position and orientation (six degrees of freedom) of each recognized object are determined as well. The method is valid over a wide range of perspective images and does not require perfect low-level image segmentation. It has been successfully implemented for recognizing a class of industrial parts.
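The backprojection and voting ideas mentioned above can be illustrated with a minimal sketch. It is not the algorithm of the paper. It assumes a pinhole camera with its optical centre at the origin and focal length `f`, and it recovers only one dominant 3-D direction rather than the three rotations of an orthogonal structure. All function names, the angular sampling grid, and the tolerance `tol` are hypothetical choices made for illustration. An image segment backprojects to a plane through the optical centre (its normal is the cross product of the two viewing rays), and every 3-D line that could have produced the segment lies in that plane. Each segment therefore votes for all sampled directions perpendicular to its plane normal.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0]/n, v[1]/n, v[2]/n)

def project(P, f):
    """Perspective projection of a 3-D point onto the image plane z = f."""
    return (f * P[0] / P[2], f * P[1] / P[2])

def backprojection_normal(p1, p2, f):
    """Unit normal of the plane spanned by the viewing rays of two image points."""
    return normalize(cross((p1[0], p1[1], f), (p2[0], p2[1], f)))

def vote_direction(normals, steps=90, tol=0.02):
    """Sample a hemisphere of unit directions (assumed grid: 1 degree steps).
    A plane votes for direction d when |n . d| < tol. Ties in the vote count
    are broken by the smaller summed residual of the voting planes."""
    best, best_score = None, None
    for i in range(steps + 1):
        theta = (math.pi / 2) * i / steps        # angle from the optical axis
        for j in range(4 * steps):
            phi = 2 * math.pi * j / (4 * steps)  # angle around the optical axis
            d = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            voting = [r for r in (abs(dot(n, d)) for n in normals) if r < tol]
            score = (len(voting), -sum(voting))
            if best_score is None or score > best_score:
                best, best_score = d, score
    return best, best_score[0]

# Synthetic scene (made-up data): three parallel 3-D edges plus one outlier.
f = 1.0
D = (1.0, 0.0, 1.0)                              # common edge direction
starts = [(0.0, 1.0, 5.0), (1.0, -1.0, 6.0), (-1.0, 2.0, 4.0)]
segments = [(P, (P[0] + D[0], P[1] + D[1], P[2] + D[2])) for P in starts]
segments.append(((0.0, 0.0, 5.0), (0.0, 1.0, 5.0)))  # outlier edge

normals = [backprojection_normal(project(A, f), project(B, f), f)
           for A, B in segments]
direction, votes = vote_direction(normals)
print(direction, votes)
```

In this synthetic case the three parallel edges agree on one direction while the outlier edge does not, which hints at why a voting scheme can tolerate imperfect segmentation. A finer grid or a smaller tolerance trades run time for angular resolution.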