Dual Graph Attention Networks for Multi-View Visual Manipulation Relationship Detection and Robotic Grasping

Abstract:

Visual manipulation relationship detection helps robots perform grasping tasks safely, in order, and efficiently. However, most existing algorithms model only object-level or relational-level dependencies individually and therefore lack sufficient global information, which makes it difficult to handle different types of reasoning errors, especially in complex environments with multi-object stacking and occlusion. To solve these problems, we propose Dual Graph Attention Networks (Dual-GAT) for visual manipulation relationship detection, with an object-level graph network that captures object-level dependencies and a relational-level graph network that captures interactions at the level of relational triplets. The attention mechanism assigns different weights to different dependencies, yielding more accurate global context information for reasoning and producing a manipulation relationship graph. In addition, we use multi-view feature fusion to improve the features of occluded objects, which in turn enhances relationship detection performance in multi-object scenes. Finally, we deploy our method on a robot to build a multi-object grasping system that works well in stacking environments. Experimental results on the VMRD and REGRAD datasets show that our method significantly outperforms other methods.

Note to Practitioners: The motivation of this research is to design efficient and accurate visual manipulation relationship reasoning methods for grasping tasks in complex stacked scenes with multiple objects. The problem requires judging the positional relationships between objects in a stacked scene and determining the appropriate grasping order. To grasp the target objects accurately, this paper proposes a Dual Graph Attention Network for visual manipulation relationship detection, which uses object-level and relationship-level dependencies to obtain accurate global information for better performance.
Meanwhile, multi-view feature fusion effectively improve...

Published in: IEEE Transactions on Automation Science and Engineering ( Volume: 22)

Page(s): 13694 - 13705

Date of Publication: 26 March 2025

DOI: 10.1109/TASE.2025.3555206

I. Introduction

By perceiving and understanding complex environments, robots can detect and infer the relationships between objects. Pairs of objects and their relationships are usually represented as triples. In [1], the visual manipulation relationship is proposed for robot grasping tasks and is divided into three categories: parent, child, and no relationship. Most current robotic grippers follow the object detection frame for grasping, which makes it difficult to keep an object's attitude, center of gravity, and other properties unchanged during grasping. As shown in Fig. 1(a), a water cup sits on top of a book. If this positional relationship is ignored and the book is grasped directly, the water cup will fall or be damaged, and the robot cannot grasp safely. The visual manipulation relationship is therefore needed in scenes where multiple objects are stacked. As shown in Fig. 1(b), in a stacked scene with five objects, the robot needs to determine the relationship between each pair of objects, generate the correct manipulation relationship graph, and operate according to the grasping order. In this graph, a robot that is to grasp the notebook must first remove all other objects on it. With a generated manipulation relationship graph, the robot can grasp objects in order.

Fig. 1. (a) The importance of grasping order. (b) The generation process of the visual manipulation relationship graph, showing the correct manipulation relationship graph and the robot grasping order. Note that the order of the manipulation relationship graph is the reverse of the grasping order.
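The step from a manipulation relationship graph to a grasping order can be illustrated with a small sketch. It is not the paper's implementation. It assumes the graph is given as a mapping from each parent object to the child objects resting on it, and it uses a standard topological sort so that every child is grasped before its parent. The function name `grasp_order` and the object names in the example scene are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def grasp_order(children_of):
    """Return one safe grasping order for a stacked scene.

    children_of maps each parent object to the objects resting on it
    (an assumed encoding of the manipulation relationship graph).
    TopologicalSorter treats the mapped values as predecessors, so every
    child is emitted before its parent: the reverse of the graph order.
    Raises graphlib.CycleError if the relationships are contradictory.
    """
    return list(TopologicalSorter(children_of).static_order())


# Hypothetical scene: a cup and a box rest on a notebook, a pen rests on the box.
scene = {
    "notebook": ["cup", "box"],
    "box": ["pen"],
}

if __name__ == "__main__":
    print(grasp_order(scene))
```

Any order returned this way removes the objects on top of the notebook before the notebook itself. When several orders are valid, which one is produced is left to the sorter.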
