In this paper we present an approach to object segmentation and recognition that combines depth and color cues. We fuse information from color images with depth from a Time-of-Flight (ToF) camera to improve recognition performance under scale and viewpoint changes. First, we use depth and local surface orientation extracted from the ToF image to normalize color and depth image features with respect to scale and viewpoint. Second, we incorporate local 3D shape features into the classifier. A Random Forest classifier lets us combine depth and texture features in a single model and also yields an image segmentation through pixel-wise classification. We demonstrate our approach on a labeled dataset of seven object categories in table-top scenes and compare it with a vision-only approach.
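To illustrate the idea of scale normalization from depth, the following is a minimal sketch, not the paper's implementation. It assumes a pinhole camera model, under which a window of fixed physical size appears with a pixel size inversely proportional to its depth. The function name `patch_size_px`, the focal length of 500 pixels, and the 5 cm window are hypothetical values chosen only for the example.

```python
def patch_size_px(depth_m, physical_size_m=0.05, focal_px=500.0, min_px=3):
    """Side length in pixels of a window spanning `physical_size_m` at `depth_m`.

    Assumes a pinhole camera: pixel_size = focal_px * physical_size / depth.
    All default values are illustrative assumptions, not taken from the paper.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    size = focal_px * physical_size_m / depth_m
    # Clamp so that very distant pixels still get a usable window.
    return max(min_px, int(round(size)))


if __name__ == "__main__":
    # The same physical window shrinks in the image as the surface moves away.
    for depth in (0.5, 1.0, 2.5):
        print(depth, patch_size_px(depth))
```

Computing features over a window sized this way means an object covers roughly the same number of feature cells whether it is near or far, which is the kind of scale invariance the abstract describes.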