
Dual Graph Attention Networks for Multi-View Visual Manipulation Relationship Detection and Robotic Grasping


Abstract:

Visual manipulation relationship detection enables robots to perform grasping tasks safely, orderly, and efficiently. However, most existing algorithms model only object-level or relation-level dependencies individually and lack sufficient global information, making it difficult to handle different types of reasoning errors, especially in complex environments with multi-object stacking and occlusion. To solve these problems, we propose Dual Graph Attention Networks (Dual-GAT) for visual manipulation relationship detection, with an object-level graph network that captures object-level dependencies and a relation-level graph network that captures interactions among relational triplets. An attention mechanism assigns different weights to different dependencies, yielding more accurate global context for reasoning and producing a manipulation relationship graph. In addition, we use multi-view feature fusion to enhance the features of occluded objects and thereby improve relationship detection in multi-object scenes. Finally, our method is deployed on a robot to build a multi-object grasping system that performs well in stacked environments. Experimental results on the VMRD and REGRAD datasets show that our method significantly outperforms other approaches.

Note to Practitioners—The motivation of this paper is to design efficient and accurate visual manipulation relationship reasoning methods for grasping tasks in complex stacked scenes with multiple objects. The problem requires judging the positional relationships between objects in a stacked scene and determining an appropriate grasping order. To grasp target objects accurately, this paper proposes a Dual Graph Attention Network for visual manipulation relationship detection, which exploits object-level and relation-level dependencies to obtain accurate global information for better performance. Meanwhile, multi-view feature fusion effectively improves the features of occluded objects and enhances relationship detection in multi-object scenes.
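The abstract does not give implementation details, but the core idea of weighting dependencies with graph attention can be illustrated with a minimal, self-contained sketch. The function below is a generic single-head graph-attention layer (GAT-style) in NumPy, not the paper's actual Dual-GAT architecture; the variable names (`X`, `A`, `W`, `a`) and the LeakyReLU slope are assumptions for illustration only.

```python
import numpy as np

def graph_attention(X, A, W, a, slope=0.2):
    """Minimal single-head graph-attention layer (illustrative only).

    X: (N, d) node features (e.g., per-object features)
    A: (N, N) adjacency matrix with self-loops (1 = dependency edge)
    W: (d, d_out) learned projection
    a: (2 * d_out,) learned attention vector
    Returns (output features, attention weights).
    """
    H = X @ W                      # project node features
    N = H.shape[0]
    # Attention logits e_ij = LeakyReLU(a^T [h_i || h_j])
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            z = np.concatenate([H[i], H[j]]) @ a
            e[i, j] = z if z > 0 else slope * z
    # Mask non-edges, then softmax over each node's neighborhood so
    # different dependencies receive different weights.
    e = np.where(A > 0, e, -1e9)
    e = e - e.max(axis=1, keepdims=True)
    att = np.exp(e)
    att = att / att.sum(axis=1, keepdims=True)
    return att @ H, att            # attention-weighted aggregation
```

In Dual-GAT terms, one such layer would operate on the object-level graph and another on the relation-level graph, with the learned attention weights providing the global context used for relationship reasoning.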
Page(s): 13694 - 13705
Date of Publication: 26 March 2025


