
SD-Net: joint surgical gesture recognition and skill assessment

PURPOSE: Surgical gesture recognition is an essential task for providing intraoperative context-aware assistance and scheduling clinical resources. However, previous methods are limited in capturing long-range temporal information, and many of them require additional sensors. To address...


Bibliographic Details
Main authors: Zhang, Jinglu; Nie, Yinyu; Lyu, Yao; Yang, Xiaosong; Chang, Jian; Zhang, Jian Jun
Format: Online Article Text
Language: English
Published: Springer International Publishing 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8580939/
https://www.ncbi.nlm.nih.gov/pubmed/34655392
http://dx.doi.org/10.1007/s11548-021-02495-x
_version_ 1784596707976151040
author Zhang, Jinglu
Nie, Yinyu
Lyu, Yao
Yang, Xiaosong
Chang, Jian
Zhang, Jian Jun
collection PubMed
description PURPOSE: Surgical gesture recognition is an essential task for providing intraoperative context-aware assistance and scheduling clinical resources. However, previous methods are limited in capturing long-range temporal information, and many of them require additional sensors. To address these challenges, we propose a symmetric dilated network, SD-Net, that jointly recognizes surgical gestures and assesses surgical skill levels using only RGB surgical video sequences. METHODS: We use symmetric 1D temporal dilated convolution layers to hierarchically capture gesture cues under different receptive fields, so that features over different time spans can be aggregated. In addition, a self-attention network is placed in the middle to compute global frame-to-frame relations. RESULTS: We evaluate our method on the robotic suturing task from the JIGSAWS dataset. On gesture recognition, it outperforms the state of the art by up to about 6 points in frame-wise accuracy and about 8 points in F1@50 score. It also maintains 100% prediction accuracy on the skill assessment task under the leave-one-supertrial-out (LOSO) validation scheme. CONCLUSION: The results indicate that our architecture obtains representative surgical video features by jointly considering the spatial, temporal and relational context of the raw video input. Furthermore, the better performance under multi-task learning implies that surgical skill assessment has a complementary effect on the gesture recognition task.
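The METHODS summary above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the mirrored dilation schedule (1, 2, 4 then 4, 2, 1), the kernel size of 3, the ReLU and residual connections, the random weights, and the names `receptive_field`, `dilated_conv1d`, `self_attention` and `symmetric_dilated_sketch` are all assumptions chosen only to show how dilated temporal convolutions widen the receptive field and how a self-attention block in the middle relates every frame to every other frame.

```python
import numpy as np


def receptive_field(dilations, kernel_size=3):
    """Number of input frames seen by one output frame after stacking the layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(x, w, dilation):
    """Length-preserving 1D temporal convolution.

    x: (T, C_in) per-frame features, w: (K, C_in, C_out) with odd K.
    """
    k_size = w.shape[0]
    pad = dilation * (k_size - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # zero-pad along time only
    t = x.shape[0]
    out = np.zeros((t, w.shape[2]))
    for k in range(k_size):
        # Each kernel tap reads the sequence shifted by k * dilation frames.
        out += xp[k * dilation:k * dilation + t] @ w[k]
    return out


def self_attention(x):
    """Scaled dot-product self-attention over frames (no learned projections)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights


def symmetric_dilated_sketch(x, dilations=(1, 2, 4), seed=0):
    """Hypothetical mirrored stack: growing dilations, attention, shrinking dilations."""
    rng = np.random.default_rng(seed)
    c = x.shape[1]

    def layer(h, d):
        w = rng.normal(scale=0.1, size=(3, c, c))  # random stand-in for trained weights
        return np.maximum(dilated_conv1d(h, w, d), 0.0) + h  # ReLU plus residual (assumed)

    h = x
    for d in dilations:
        h = layer(h, d)
    h, weights = self_attention(h)  # global frame-to-frame weights, shape (T, T)
    for d in reversed(dilations):
        h = layer(h, d)
    return h, weights
```

Under these assumptions, `receptive_field((1, 2, 4, 4, 2, 1))` evaluates to 29 frames for the convolutional layers alone, while the attention block in the middle lets every frame draw on the whole sequence regardless of dilation.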
format Online
Article
Text
id pubmed-8580939
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-85809392021-11-15 SD-Net: joint surgical gesture recognition and skill assessment Zhang, Jinglu Nie, Yinyu Lyu, Yao Yang, Xiaosong Chang, Jian Zhang, Jian Jun Int J Comput Assist Radiol Surg Original Article
Springer International Publishing 2021-10-16 2021 /pmc/articles/PMC8580939/ /pubmed/34655392 http://dx.doi.org/10.1007/s11548-021-02495-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title SD-Net: joint surgical gesture recognition and skill assessment
topic Original Article