Skip to Main Content
This paper describes an approach for recognizing instances of a 3D object in a single camera image and for determining their 3D poses. A hierarchical model is generated solely based on the geometry information of a 3D CAD model of the object. The approach does not rely on texture or reflectance information of the object's surface, making it useful for a wide range of industrial and robotic applications, e.g., bin-picking. A hierarchical view-based approach that addresses typical problems of previous methods is applied: It handles true perspective, is robust to noise, occlusions, and clutter to an extent that is sufficient for many practical applications, and is invariant to contrast changes. For the generation of this hierarchical model, a new model image generation technique by which scale-space effects can be taken into account is presented. The necessary object views are derived using a similarity-based aspect graph. The high robustness of an exhaustive search is combined with an efficient hierarchical search. The 3D pose is refined by using a least-squares adjustment that minimizes geometric distances in the image, yielding a position accuracy of up to 0.12 percent with respect to the object distance, and an orientation accuracy of up to 0.35 degree in our tests. The recognition time is largely independent of the complexity of the object, but depends mainly on the range of poses within which the object may appear in front of the camera. For efficiency reasons, the approach allows the restriction of the pose range depending on the application. Typical runtimes are in the range of a few hundred ms.