Abstract:
Visual manipulation relationship detection helps robots perform grasping tasks safely, orderly, and efficiently. However, most existing algorithms model only object-level or relation-level dependencies individually and lack sufficient global information, making it difficult to handle different types of reasoning errors, especially in complex environments with multi-object stacking and occlusion. To address these problems, we propose Dual Graph Attention Networks (Dual-GAT) for visual manipulation relationship detection, with an object-level graph network that captures object-level dependencies and a relation-level graph network that captures interactions among relation triplets. The attention mechanism assigns different weights to different dependencies, yielding more accurate global context for reasoning and producing a manipulation relationship graph. In addition, we use multi-view feature fusion to enhance the features of occluded objects, thereby improving relationship detection performance in multi-object scenes. Finally, we deploy our method on a robot to build a multi-object grasping system that works well in stacking environments. Experimental results on the VMRD and REGRAD datasets show that our method significantly outperforms existing approaches.
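To make the attention-weighted dependency modeling concrete, below is a minimal PyTorch sketch (an illustrative assumption, not the authors' released implementation) of a single-head graph attention layer over detected object nodes: pairwise attention scores are computed between object features, normalized with softmax, and used to aggregate context from neighboring nodes, which is the general mechanism by which different weights are assigned to different dependencies before relationship reasoning.

```python
# Hypothetical sketch of graph attention over object nodes; layer names,
# feature sizes, and the fully connected adjacency are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over N object nodes."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out, bias=False)   # node feature projection
        self.attn = nn.Linear(2 * d_out, 1, bias=False)  # pairwise attention score

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, d_in) object features (e.g., pooled RoI features)
        # adj: (N, N) binary adjacency; 1 where a dependency edge exists
        h = self.proj(x)                                  # (N, d_out)
        n = h.size(0)
        # Concatenate every ordered pair (h_i, h_j) and score it.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)],
            dim=-1,
        )                                                 # (N, N, 2*d_out)
        scores = F.leaky_relu(self.attn(pairs).squeeze(-1))
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)             # per-node attention weights
        return F.elu(alpha @ h)                           # context-weighted aggregation

# Usage example: 4 detected objects with 256-d features, fully connected graph.
feats = torch.randn(4, 256)
adj = torch.ones(4, 4)
layer = GraphAttentionLayer(256, 128)
context = layer(feats, adj)   # global-context-enhanced object features
print(context.shape)          # torch.Size([4, 128])
```

In the paper's setting, one such graph operates over object nodes and another over relation triplets; the sketch above shows only the generic object-level case.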
Published in: IEEE Transactions on Automation Science and Engineering (Early Access)