We study the problem of actively searching for an object in a three-dimensional (3-D) environment under a maximum search time constraint, using a visually guided humanoid robot with 26 degrees of freedom. We discuss the inherent intractability of the problem and employ a greedy strategy for selecting the best next viewpoint. We describe a target probability updating scheme that approximates the optimal solution to the problem and makes the selection of the best next viewpoint efficient. We employ a hierarchical recognition architecture, inspired by human vision, that uses contextual cues to attend to the view-tuned units at the proper intrinsic scales and to actively control the coordinate frame of the robotic platform's sensor. This active control also gives us control of the extrinsic image scale and achieves the proper sequence of pathognomonic views of the scene. The recognition model makes no particular assumptions on shape properties like texture and is trained by showing the object by hand to the robot. Our results demonstrate the feasibility of using state-of-the-art vision-based systems for efficient and reliable object localization in an indoor 3-D environment.
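The greedy next-viewpoint selection and target probability update mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method of this paper: the environment is discretized into cells with a prior target probability, each hypothetical `Viewpoint` has a time cost and a per-cell detection probability, the greedy score is detection probability per unit time, and the belief is renormalized with Bayes' rule after each unsuccessful view.

```python
from dataclasses import dataclass


@dataclass
class Viewpoint:
    """Hypothetical candidate view: a name, a time cost, and per-cell detection odds."""
    name: str
    cost: float   # time needed to reach the viewpoint and sense from it
    detect: dict  # cell index -> P(detect target | target is in that cell)


def detection_probability(belief, view):
    """Probability that this view finds the target under the current belief."""
    return sum(belief[cell] * d for cell, d in view.detect.items())


def update_after_miss(belief, view):
    """Bayes update of the per-cell target probabilities after a failed detection."""
    miss = 1.0 - detection_probability(belief, view)
    if miss <= 0.0:
        raise ValueError("a miss is impossible under this belief and view")
    return [p * (1.0 - view.detect.get(cell, 0.0)) / miss
            for cell, p in enumerate(belief)]


def greedy_search_plan(prior, views, max_time):
    """Pick views one at a time, assuming each chosen view fails to find the target.

    Returns the ordered list of view names and the final belief.
    """
    belief = list(prior)
    remaining = max_time
    plan = []
    while True:
        affordable = [v for v in views if v.cost <= remaining]
        if not affordable:
            break
        # Greedy criterion (an assumption): detection probability per unit time.
        best = max(affordable,
                   key=lambda v: detection_probability(belief, v) / v.cost)
        gain = detection_probability(belief, best)
        if gain <= 0.0:
            break  # no affordable view can see any probable cell
        plan.append(best.name)
        remaining -= best.cost
        if gain >= 1.0:
            break  # the target would be found with certainty
        belief = update_after_miss(belief, best)
    return plan, belief
```

For example, with a prior of `[0.5, 0.3, 0.2]`, a view `A` that sees cell 0 with detection probability 0.8, a view `B` that sees cells 1 and 2 with probability 0.5 each, unit costs, and a time budget of 2, the sketch selects `A` first and then `B`, because the failed look at cell 0 shifts probability mass toward the other cells.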