Abstract:
Deep neural networks have demonstrated excellent object detection and segmentation performance on RGB data. However, these models can recognize objects and predict segmentation masks accurately only when the RGB data contain sufficient information about the objects of interest. In this paper, we propose an intelligent, active perception system that can adjust its 3D position to improve signal acquisition. The proposed system substantially improves the segmentation score of cluttered scenes, which can in turn enhance grasp pose detection for a robotic manipulator. The ResNet-50 backbone of the perception system is initialized with pre-trained weights and extracts a latent state from an RGB image of the cluttered scene. A Reinforcement Learning (RL) agent uses these extracted states to reposition the visual perception system and thereby improve the underlying computer vision tasks, such as segmentation of the cluttered scene. Our trained RL agent can predict a better position for the visual perception system, which ensures enhanced signal recovery. The effectiveness of the proposed approach is tested in a PyBullet simulation environment.
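The perception-action loop described in the abstract can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the discrete action set, the `segmentation_score` stand-in, the reward defined as the change in score, and the greedy placeholder policy that substitutes for a trained RL agent. The `encode` function merely marks where the ResNet-50 latent state would be computed.

```python
# Hypothetical discrete camera moves (dx, dy, dz) in metres.
# The abstract does not specify the action space; this is an assumption.
ACTIONS = [
    (0.05, 0.0, 0.0), (-0.05, 0.0, 0.0),
    (0.0, 0.05, 0.0), (0.0, -0.05, 0.0),
    (0.0, 0.0, 0.05), (0.0, 0.0, -0.05),
]

BEST_VIEW = (0.3, 0.0, 0.6)  # toy "ideal" camera position (assumption)


def segmentation_score(position):
    """Toy stand-in for segmentation quality at a camera position.

    Assumption: the score peaks at BEST_VIEW and decays with distance.
    """
    dist2 = sum((p - b) ** 2 for p, b in zip(position, BEST_VIEW))
    return 1.0 / (1.0 + dist2)


def encode(position):
    """Placeholder for the latent state a ResNet-50 backbone would extract
    from the RGB image; here the state is simply the camera position."""
    return tuple(position)


def step(position, action):
    """Move the camera and return (new_position, reward).

    Assumption: reward is the change in segmentation score.
    """
    new_position = tuple(p + a for p, a in zip(position, action))
    reward = segmentation_score(new_position) - segmentation_score(position)
    return new_position, reward


def greedy_policy(state):
    """Placeholder for the trained RL agent: picks the move with the best
    one-step reward. A real agent would not have access to the score."""
    return max(ACTIONS, key=lambda a: step(state, a)[1])


def run_episode(start, max_steps=20):
    """Reposition the camera until no move improves the score."""
    position = tuple(start)
    for _ in range(max_steps):
        action = greedy_policy(encode(position))
        new_position, reward = step(position, action)
        if reward <= 0:
            break
        position = new_position
    return position


if __name__ == "__main__":
    final = run_episode((0.0, 0.0, 0.0))
    print(final, segmentation_score(final))
```

In a real setup, the toy score would be replaced by a segmentation metric computed on images rendered by the simulator, and the greedy placeholder by a learned policy.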
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023