TY - GEN
T1 - Real-Time Segmentation of Surgical Tools and Needle Using a Mobile-U-Net
AU - Andersen, Jakob Kristian Holm
AU - Schwaner, Kim Lindberg
AU - Savarimuthu, Thiusius Rajeeth
N1 - Conference code: 20
PY - 2021/12
Y1 - 2021/12
N2 - This paper presents a CNN-based method for markerless segmentation of surgical tools and suture needles for surgical task automation. The proposed CNN, which we refer to as Mobile-U-Net, is based on the classic U-Net encoder-decoder architecture and uses a lightweight MobileNet encoder backbone. This enables the network to perform real-time segmentation at ∼36 and ∼90 frames per second on 672×1120 and 448×448 pixel images, respectively. The network is equipped with a multi-kernel output segmentation layer with three k×k×N softmax kernels with k = 1, 3, and 5. On a proprietary dataset from a surgical robot laboratory setup, the multi-kernel Mobile-U-Net achieves intersection over union scores of 0.9495 for surgical tool shafts, 0.8631 for tool end-effectors, 0.9350 for the phantom tissue suture pad, 0.8531 for marked needle insertion points, and 0.7524 for suture needles. The method is validated on a second set of images, achieving intersection over union scores of 0.9515, 0.8225, and 0.5638 for tool shafts, end-effectors, and suture needles, respectively. Using multiple kernels improves needle segmentation by 3.11% and 7.00% on the two datasets compared to the baseline of a single 1×1×N filter, and improves overall mean intersection over union by 10.97% and 1.72%.
AB - This paper presents a CNN-based method for markerless segmentation of surgical tools and suture needles for surgical task automation. The proposed CNN, which we refer to as Mobile-U-Net, is based on the classic U-Net encoder-decoder architecture and uses a lightweight MobileNet encoder backbone. This enables the network to perform real-time segmentation at ∼36 and ∼90 frames per second on 672×1120 and 448×448 pixel images, respectively. The network is equipped with a multi-kernel output segmentation layer with three k×k×N softmax kernels with k = 1, 3, and 5. On a proprietary dataset from a surgical robot laboratory setup, the multi-kernel Mobile-U-Net achieves intersection over union scores of 0.9495 for surgical tool shafts, 0.8631 for tool end-effectors, 0.9350 for the phantom tissue suture pad, 0.8531 for marked needle insertion points, and 0.7524 for suture needles. The method is validated on a second set of images, achieving intersection over union scores of 0.9515, 0.8225, and 0.5638 for tool shafts, end-effectors, and suture needles, respectively. Using multiple kernels improves needle segmentation by 3.11% and 7.00% on the two datasets compared to the baseline of a single 1×1×N filter, and improves overall mean intersection over union by 10.97% and 1.72%.
U2 - 10.1109/ICAR53236.2021.9659326
DO - 10.1109/ICAR53236.2021.9659326
M3 - Article in proceedings
SP - 148
EP - 154
BT - 2021 20th International Conference on Advanced Robotics (ICAR)
PB - IEEE
T2 - 2021 20th International Conference on Advanced Robotics (ICAR)
Y2 - 7 December 2021 through 10 December 2021
ER -