
Data-centric multi-task surgical phase estimation with sparse scene segmentation

PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic...


Bibliographic Details
Main authors: Sanchez-Matilla, Ricardo; Robu, Maria; Grammatikopoulou, Maria; Luengo, Imanol; Stoyanov, Danail
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/
https://www.ncbi.nlm.nih.gov/pubmed/35505149
http://dx.doi.org/10.1007/s11548-022-02616-0
author Sanchez-Matilla, Ricardo
Robu, Maria
Grammatikopoulou, Maria
Luengo, Imanol
Stoyanov, Danail
collection PubMed
description PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. The two-stage split is motivated by computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on improving performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use the dense phase annotations available in Cholec80 and the sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k for less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimise them for accurate phase prediction. RESULTS AND CONCLUSION: We show that, with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development.
format Online
Article
Text
id pubmed-9110447
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9110447 2022-05-18 Data-centric multi-task surgical phase estimation with sparse scene segmentation Sanchez-Matilla, Ricardo; Robu, Maria; Grammatikopoulou, Maria; Luengo, Imanol; Stoyanov, Danail Int J Comput Assist Radiol Surg Original Article PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. The two-stage split is motivated by computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on improving performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use the dense phase annotations available in Cholec80 and the sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k for less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimise them for accurate phase prediction. RESULTS AND CONCLUSION: We show that, with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development. Springer International Publishing 2022-05-03 2022 /pmc/articles/PMC9110447/ /pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Data-centric multi-task surgical phase estimation with sparse scene segmentation
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/
https://www.ncbi.nlm.nih.gov/pubmed/35505149
http://dx.doi.org/10.1007/s11548-022-02616-0