A vision transformer for decoding surgeon activity from surgical videos
The intraoperative activity of a surgeon has a substantial impact on postoperative outcomes. However, for most surgical procedures, the details of intraoperative surgical actions, which can vary widely, are not well understood. Here we report a machine learning system leveraging a vision transformer and supervised contrastive learning for the decoding of elements of intraoperative surgical activity from videos commonly collected during robotic surgeries.
Main Authors: | Kiyasseh, Dani; Ma, Runzhuo; Haque, Taseen F.; Miles, Brian J.; Wagner, Christian; Donoho, Daniel A.; Anandkumar, Animashree; Hung, Andrew J. |
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10307635/ https://www.ncbi.nlm.nih.gov/pubmed/36997732 http://dx.doi.org/10.1038/s41551-023-01010-8 |
_version_ | 1785066075944124416 |
author | Kiyasseh, Dani Ma, Runzhuo Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. |
author_facet | Kiyasseh, Dani Ma, Runzhuo Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. |
author_sort | Kiyasseh, Dani |
collection | PubMed |
description | The intraoperative activity of a surgeon has a substantial impact on postoperative outcomes. However, for most surgical procedures, the details of intraoperative surgical actions, which can vary widely, are not well understood. Here we report a machine learning system leveraging a vision transformer and supervised contrastive learning for the decoding of elements of intraoperative surgical activity from videos commonly collected during robotic surgeries. The system accurately identified surgical steps, actions performed by the surgeon, the quality of these actions and the relative contribution of individual video frames to the decoding of the actions. Through extensive testing on data from three different hospitals located in two different continents, we show that the system generalizes across videos, surgeons, hospitals and surgical procedures, and that it can provide information on surgical gestures and skills from unannotated videos. Decoding intraoperative activity via accurate machine learning systems could be used to provide surgeons with feedback on their operating skills, and may allow for the identification of optimal surgical behaviour and for the study of relationships between intraoperative factors and postoperative outcomes. |
format | Online Article Text |
id | pubmed-10307635 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-103076352023-06-30 A vision transformer for decoding surgeon activity from surgical videos Kiyasseh, Dani Ma, Runzhuo Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. Nat Biomed Eng Article The intraoperative activity of a surgeon has substantial impact on postoperative outcomes. However, for most surgical procedures, the details of intraoperative surgical actions, which can vary widely, are not well understood. Here we report a machine learning system leveraging a vision transformer and supervised contrastive learning for the decoding of elements of intraoperative surgical activity from videos commonly collected during robotic surgeries. The system accurately identified surgical steps, actions performed by the surgeon, the quality of these actions and the relative contribution of individual video frames to the decoding of the actions. Through extensive testing on data from three different hospitals located in two different continents, we show that the system generalizes across videos, surgeons, hospitals and surgical procedures, and that it can provide information on surgical gestures and skills from unannotated videos. Decoding intraoperative activity via accurate machine learning systems could be used to provide surgeons with feedback on their operating skills, and may allow for the identification of optimal surgical behaviour and for the study of relationships between intraoperative factors and postoperative outcomes. 
Nature Publishing Group UK 2023-03-30 2023 /pmc/articles/PMC10307635/ /pubmed/36997732 http://dx.doi.org/10.1038/s41551-023-01010-8 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Kiyasseh, Dani Ma, Runzhuo Haque, Taseen F. Miles, Brian J. Wagner, Christian Donoho, Daniel A. Anandkumar, Animashree Hung, Andrew J. A vision transformer for decoding surgeon activity from surgical videos |
title | A vision transformer for decoding surgeon activity from surgical videos |
title_full | A vision transformer for decoding surgeon activity from surgical videos |
title_fullStr | A vision transformer for decoding surgeon activity from surgical videos |
title_full_unstemmed | A vision transformer for decoding surgeon activity from surgical videos |
title_short | A vision transformer for decoding surgeon activity from surgical videos |
title_sort | vision transformer for decoding surgeon activity from surgical videos |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10307635/ https://www.ncbi.nlm.nih.gov/pubmed/36997732 http://dx.doi.org/10.1038/s41551-023-01010-8 |
work_keys_str_mv | AT kiyassehdani avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT marunzhuo avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT haquetaseenf avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT milesbrianj avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT wagnerchristian avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT donohodaniela avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT anandkumaranimashree avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT hungandrewj avisiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT kiyassehdani visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT marunzhuo visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT haquetaseenf visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT milesbrianj visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT wagnerchristian visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT donohodaniela visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT anandkumaranimashree visiontransformerfordecodingsurgeonactivityfromsurgicalvideos AT hungandrewj visiontransformerfordecodingsurgeonactivityfromsurgicalvideos |