Abstract:
RGBD semantic segmentation is a popular task in computer vision with applications in autonomous vehicles and virtual reality. The problem is challenging because indoor scenes are cluttered, dense, and diverse. To counter the loss of context information in dense semantic scene segmentation, we propose a novel architecture built on a multi-scale feature representation that captures richer global and local context cues. The multi-scale features, generated by aggregating 3D region features and sparse-coded SIFT features extracted from multi-resolution RGB and depth images, are fed into a softmax classifier that labels each region produced by hierarchical segmentation with a predefined class; these labeled regions form the final semantic scene segmentation. In addition, instead of the four coarse categories originally derived from the 894 pixel categories in the NYUD2 dataset, we define 40 detailed pixel classes that cover the most common object categories and enable fine-grained semantic segmentation. Extensive experiments on the standard NYUD2 benchmark demonstrate the effectiveness of our method.
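The final classification stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each segmented region has already been reduced to a fixed-length feature vector (the concatenation of 3D region features and sparse-coded SIFT features), and that the linear softmax weights `W` and bias `b` come from a trained classifier; the function and variable names are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class-score axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_regions(multi_scale_feats, W, b):
    """Assign each segmented region one of the predefined classes.

    multi_scale_feats: (n_regions, d) array -- per-region concatenation of
        3D region features and sparse-coded SIFT features extracted from
        multi-resolution RGB and depth images.
    W: (d, n_classes) weights and b: (n_classes,) bias of a trained
        linear softmax classifier (hypothetical parameters).
    Returns the predicted class index for each region.
    """
    probs = softmax(multi_scale_feats @ W + b)
    return probs.argmax(axis=-1)

# Toy example: 5 regions, 8-dim features, 40 classes (matching the
# 40-class label set defined in the paper).
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
W, b = rng.normal(size=(8, 40)), np.zeros(40)
labels = classify_regions(feats, W, b)
```

Each predicted label is then painted back onto the pixels of its region, turning the per-region decisions into a dense segmentation map.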
Published in: 2018 Chinese Control And Decision Conference (CCDC)
Date of Conference: 09-11 June 2018
Date Added to IEEE Xplore: 09 July 2018
Electronic ISSN: 1948-9447