Real-Time Segmentation of Surgical Tools and Needle Using a Mobile-U-Net

Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review


This paper presents a CNN-based method for markerless segmentation of surgical tools and suture needles for surgical task automation. The proposed CNN, which we refer to as Mobile-U-Net, is based on the classic U-Net encoder-decoder architecture and uses a lightweight MobileNet encoder backbone. This enables the network to perform real-time segmentation at ∼36 and ∼90 frames per second on 672×1120 and 448×448 pixel images, respectively. The network is equipped with a multi-kernel output segmentation layer with three k×k×N softmax kernels, with k = 1, 3, and 5. On a proprietary dataset from a surgical robot laboratory setup, the multi-kernel Mobile-U-Net achieves intersection over union scores of 0.9495 for surgical tool shafts, 0.8631 for tool end-effectors, 0.9350 for the phantom tissue suture pad, 0.8531 for marked needle insertion points, and 0.7524 for suture needles. The method is validated on a second set of images, achieving intersection over union scores of 0.9515, 0.8225, and 0.5638 for tool shafts, end-effectors, and suture needles, respectively. Compared to the baseline of using a single 1×1×N filter, using multiple kernels improves needle segmentation by 3.11% and 7.00% on the two datasets, and improves overall mean intersection over union by 10.97% and 1.72%.
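The intersection over union scores quoted above are per-class ratios of overlapping pixels to the union of predicted and ground-truth pixels. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names `per_class_iou` and `mean_iou`, the integer class ids, and the choice to skip classes absent from both maps when averaging are all assumptions made for illustration.

```python
from itertools import chain


def per_class_iou(pred, target, num_classes):
    """Return one IoU value per class id in range(num_classes).

    pred and target are 2-D label maps (lists of rows of integer class ids).
    A class that appears in neither map gets None instead of a score.
    """
    p = list(chain.from_iterable(pred))
    t = list(chain.from_iterable(target))
    if len(p) != len(t):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either map (union).
        inter = sum(1 for a, b in zip(p, t) if a == c and b == c)
        union = sum(1 for a, b in zip(p, t) if a == c or b == c)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(ious):
    """Average the defined per-class scores (assumption: undefined classes are skipped)."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else None


# Toy 2x3 label maps with three hypothetical class ids (0, 1, 2).
pred = [[0, 0, 1],
        [2, 1, 1]]
target = [[0, 1, 1],
          [2, 2, 1]]

ious = per_class_iou(pred, target, num_classes=3)
print(ious)            # [0.5, 0.5, 0.5]
print(mean_iou(ious))  # 0.5
```

In practice the label maps would come from the argmax of the network's softmax output, and the same computation is usually vectorised with an array library; the pure-Python form here is only meant to make the definition explicit.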
Original language: English
Title of host publication: 2021 20th International Conference on Advanced Robotics (ICAR)
Publication date: Dec 2021
ISBN (Electronic): 978-1-6654-3684-7
Publication status: Published - Dec 2021
Event: 2021 20th International Conference on Advanced Robotics (ICAR) - Congress Centre Cankarjev dom, Ljubljana, Slovenia
Duration: 7 Dec 2021 to 10 Dec 2021
Conference number: 20


Conference: 2021 20th International Conference on Advanced Robotics (ICAR)
Location: Congress Centre Cankarjev dom

