Real-Time Surgical Tool Detection in Minimally Invasive Surgery Based on Attention-Guided Convolutional Neural Network


Abstract:

To improve surgeons' efficiency and patient safety, minimally invasive surgery (MIS) is widely used in a variety of clinical procedures. Real-time surgical tool detection plays an important role in MIS. However, most surgical tool detection methods may not achieve a good trade-off between detection speed and accuracy. We propose a real-time attention-guided convolutional neural network (CNN) for frame-by-frame detection of surgical tools in MIS videos, which comprises a coarse detection module (CDM) and a refined detection module (RDM). The CDM coarsely regresses the location parameters to obtain refined anchors and performs binary classification, which determines whether each anchor is a tool or background. The RDM incorporates an attention mechanism to generate accurate detection results from the refined anchors produced by the CDM. Finally, a light-head module is proposed for more efficient surgical tool detection. The proposed method is compared with eight state-of-the-art detection algorithms on two public datasets (EndoVis Challenge and ATLAS Dione) and a new dataset we introduce (Cholec80-locations), which extends the Cholec80 dataset with spatial annotations of surgical tools. Our approach runs in real time at 55.5 FPS and achieves 100%, 94.05%, and 91.65% mAP on these three datasets, respectively. Trained end to end on MIS videos, our method achieves accurate, fast, and robust detection results. The results demonstrate the effectiveness and superiority of our method over the eight state-of-the-art methods.
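The coarse-to-refined flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the box parameterization, the threshold, or the network outputs, so the center-size offset decoding (common in anchor-based detectors), the `tool_threshold` value, and all function and variable names below are assumptions chosen for illustration. The network predictions are passed in as plain lists, and the attention mechanism and light-head module are not modeled.

```python
import math


def decode(box, offsets):
    """Apply (dx, dy, dw, dh) offsets to a (cx, cy, w, h) box.

    Assumed parameterization: center shifts are scaled by the box size,
    width and height are scaled exponentially.
    """
    cx, cy, w, h = box
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


def two_step_detect(anchors, coarse_offsets, tool_scores,
                    refined_offsets, class_scores, tool_threshold=0.5):
    """Hypothetical two-step cascade over per-anchor predictions.

    Step 1 (coarse): drop anchors scored as background and shift the
    remaining anchors with the coarse offsets to get refined anchors.
    Step 2 (refined): regress again from each refined anchor and pick
    the highest-scoring tool class.
    """
    detections = []
    for anchor, c_off, tool_score, r_off, scores in zip(
            anchors, coarse_offsets, tool_scores,
            refined_offsets, class_scores):
        if tool_score < tool_threshold:
            continue  # treated as background by the coarse step
        refined_anchor = decode(anchor, c_off)
        box = decode(refined_anchor, r_off)
        label = max(range(len(scores)), key=scores.__getitem__)
        detections.append((box, label, scores[label]))
    return detections


if __name__ == "__main__":
    # Made-up numbers for two anchors; the second is rejected as background.
    result = two_step_detect(
        anchors=[(50.0, 50.0, 20.0, 20.0), (10.0, 10.0, 8.0, 8.0)],
        coarse_offsets=[(0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)],
        tool_scores=[0.9, 0.2],
        refined_offsets=[(0.0, 0.25, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)],
        class_scores=[[0.1, 0.9], [0.5, 0.5]],
    )
    print(result)
```

The point of the cascade in this sketch is that the second regression starts from an anchor that has already been moved toward the tool, and background anchors are discarded before the more expensive second step. As a separate note on the reported speed, 55.5 FPS corresponds to roughly 1000 / 55.5 ≈ 18 ms per frame.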
Published in: IEEE Access ( Volume: 8)
Page(s): 228853 - 228862
Date of Publication: 21 December 2020
Electronic ISSN: 2169-3536
