
From Decision to Action in Surgical Autonomy: Multi-Modal Large Language Models for Robot-Assisted Blood Suction


Abstract:


The rise of Large Language Models (LLMs) has impacted research in robotics and automation. While progress has been made in integrating LLMs into general robotics tasks, a noticeable void persists in their adoption in more specific domains such as surgery, where critical factors such as reasoning, explainability, and safety are paramount. Achieving autonomy in robotic surgery, which entails the ability to reason and adapt to changes in the environment, remains a significant challenge. In this work, we propose a multi-modal LLM integration in robot-assisted surgery for autonomous blood suction. The reasoning and prioritization are delegated to the higher-level task-planning LLM, and the motion planning and execution are handled by the lower-level deep reinforcement learning model, creating a distributed agency between the two components. As surgical operations are highly dynamic and may encounter unforeseen circumstances, blood clots and active bleeding were introduced to influence decision-making. Results showed that using a multi-modal LLM as a higher-level reasoning unit can account for these surgical complexities to achieve a level of reasoning previously unattainable in robot-assisted surgeries. These findings demonstrate the potential of multi-modal LLMs to significantly enhance contextual understanding and decision-making in robotic-assisted surgeries, marking a step toward autonomous surgical systems.
Published in: IEEE Robotics and Automation Letters ( Volume: 10, Issue: 3, March 2025)
Page(s): 2598 - 2605
Date of Publication: 27 January 2025

I. Introduction

Robot-assisted surgery (RAS) has profoundly changed the way many surgeons operate. Surgical robots can enhance accuracy and dexterity, provide better anatomical access, and reduce invasiveness, surgery time, and the need for revision surgery [1]. With the development of surgical robots and the da Vinci Research Kit (dVRK) [2], along with realistic surgical simulation environments [3], [4], [5], the automation of surgical sub-tasks such as tissue retraction [6], suturing [7], endoscopic camera control [8], cutting [9], and body fluid removal [10] has been an active area of research in recent years. These sub-tasks are the building blocks of surgery and form the foundation for bottom-up surgical autonomy [11], [12]; automating them provides the basic robot skills needed to reach a more advanced level of autonomy, including the ability to reason about and plan tasks.
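The distributed agency described in the abstract, in which a higher-level multi-modal LLM decides *what* to address and a lower-level deep RL policy decides *how* to move, can be sketched as below. This is a minimal illustrative stub, not the authors' implementation: all names (`SceneObservation`, `llm_task_planner`, `drl_motion_policy`) are assumptions, and the rule-based prioritization merely mimics the kind of ordering an LLM planner might produce (active bleeding first, then clots, then residual pools).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SceneObservation:
    """Hypothetical multi-modal scene summary (field names are illustrative)."""
    blood_pools: List[str]       # residual blood pools detected in the scene
    active_bleeding: List[str]   # sites with ongoing bleeding
    blood_clots: List[str]       # clots that complicate suction

def llm_task_planner(obs: SceneObservation) -> List[str]:
    """Stand-in for the higher-level task-planning LLM.

    A real system would prompt a multi-modal LLM with endoscopic images
    and text; this stub encodes one plausible prioritization: treat
    active bleeding first, then clots, then leftover pools."""
    return obs.active_bleeding + obs.blood_clots + obs.blood_pools

def drl_motion_policy(target: str) -> str:
    """Stand-in for the lower-level deep RL controller that plans and
    executes the suction motion toward a single target."""
    return f"suction -> {target}"

def autonomous_suction(obs: SceneObservation) -> List[str]:
    """Distributed agency: the LLM planner orders the targets, and the
    motion policy handles each one in turn."""
    plan = llm_task_planner(obs)
    return [drl_motion_policy(target) for target in plan]

obs = SceneObservation(
    blood_pools=["pool_A"],
    active_bleeding=["bleed_site_1"],
    blood_clots=["clot_near_vessel"],
)
actions = autonomous_suction(obs)
```

In this sketch the planner's output is a simple ordered list; the paper's actual interface between the LLM and the RL controller may differ.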
