
RoG-SAM: A Language-Driven Framework for Instance-Level Robotic Grasping Detection


Abstract:

Robotic grasping is a crucial topic in robotics and computer vision, with broad applications in industrial production and intelligent manufacturing. Although some methods have begun addressing instance-level grasping, most remain limited to predefined instances and categories, lacking flexibility for open-vocabulary grasp prediction based on user-specified instructions. To address this, we propose RoG-SAM, a language-driven, instance-level grasp detection framework built on the Segment Anything Model (SAM). RoG-SAM utilizes open-vocabulary prompts for object localization and grasp pose prediction, adapting SAM through transfer learning with encoder adapters and multi-head decoders to extend its segmentation capabilities to grasp pose estimation. Experimental results show that RoG-SAM achieves competitive performance on single-object datasets (Cornell and Jacquard) and cluttered datasets (GraspNet-1Billion and OCID), with instance-level accuracies of 91.2% and 90.1%, respectively, while using only 28.3% of SAM's trainable parameters. The effectiveness of RoG-SAM was also validated in real-world environments. A demonstration video is available at https://www.youtube.com/playlist?list=PL7et4nGJAImLGytsJbglGbXl1hacA2dy_.
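
To make the adapter-plus-multi-head idea concrete, the following is a minimal, hypothetical PyTorch sketch, not the authors' released code: a bottleneck adapter of the kind that could be inserted after frozen SAM encoder blocks, and a multi-head decoder that predicts an instance mask together with per-pixel grasp quality, angle, and width maps. All class names, layer sizes, and the specific choice of output heads are illustrative assumptions.

# Hypothetical sketch (not the authors' implementation) of adapter-based
# transfer learning on a frozen SAM-style encoder with a multi-head decoder
# for mask and planar grasp prediction. Names and dimensions are assumptions.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter; in practice one would be interleaved with each
    frozen encoder block so that only adapter weights are trained."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update keeps the frozen encoder's features intact.
        return x + self.up(self.act(self.down(x)))


class MultiHeadGraspDecoder(nn.Module):
    """Shared convolutional trunk with separate heads for the instance mask
    and the grasp-pose maps (quality, rotation, gripper width)."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
        )
        self.mask_head = nn.Conv2d(dim, 1, 1)      # instance mask logits
        self.quality_head = nn.Conv2d(dim, 1, 1)   # per-pixel grasp quality
        self.angle_head = nn.Conv2d(dim, 2, 1)     # (cos 2*theta, sin 2*theta)
        self.width_head = nn.Conv2d(dim, 1, 1)     # gripper opening width

    def forward(self, feats: torch.Tensor) -> dict:
        h = self.trunk(feats)
        return {
            "mask": self.mask_head(h),
            "quality": self.quality_head(h),
            "angle": self.angle_head(h),
            "width": self.width_head(h),
        }

Under these assumptions, a grasp would be read out at the pixel of highest predicted quality inside the mask matched to the language prompt; the open-vocabulary localization step itself (matching a text prompt to an instance) is not shown here.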
Published in: IEEE Transactions on Multimedia (Early Access)
Page(s): 1 - 13
Date of Publication: 03 April 2025
