Data-centric multi-task surgical phase estimation with sparse scene segmentation
PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic...
Main authors: | Sanchez-Matilla, Ricardo; Robu, Maria; Grammatikopoulou, Maria; Luengo, Imanol; Stoyanov, Danail |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/ https://www.ncbi.nlm.nih.gov/pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0 |
_version_ | 1784709106232197120 |
---|---|
author | Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail |
author_facet | Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail |
author_sort | Sanchez-Matilla, Ricardo |
collection | PubMed |
description | PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularities, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for performing automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. This approach is performed in two stages due to computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on pushing the performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use dense phase annotations available in Cholec80, and sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k in less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimises them for performing accurate phase prediction. RESULTS AND CONCLUSION: We show that with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art, more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development. |
format | Online Article Text |
id | pubmed-9110447 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-9110447 2022-05-18 Data-centric multi-task surgical phase estimation with sparse scene segmentation Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail Int J Comput Assist Radiol Surg Original Article Springer International Publishing 2022-05-03 2022 /pmc/articles/PMC9110447/ /pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Original Article Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title | Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title_full | Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title_fullStr | Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title_full_unstemmed | Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title_short | Data-centric multi-task surgical phase estimation with sparse scene segmentation |
title_sort | data-centric multi-task surgical phase estimation with sparse scene segmentation |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/ https://www.ncbi.nlm.nih.gov/pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0 |
work_keys_str_mv | AT sanchezmatillaricardo datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT robumaria datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT grammatikopouloumaria datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT luengoimanol datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT stoyanovdanail datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation |
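
The METHODS part of the description combines phase labels that exist for every frame with scene segmentation labels that exist for only a small share of frames. The sketch below shows one common way to train on such partially labelled data: a joint loss in which the segmentation term is computed only for annotated frames. It is an illustration under stated assumptions, not the authors' implementation. The function names, the `None` convention for unannotated frames, and the `seg_weight` balancing factor are all hypothetical, and the importance-based fusion of the two streams mentioned in the abstract is not modelled here.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one softmax prediction against an integer class label."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[target]


def multitask_loss(phase_logits, phase_labels, seg_logits, seg_labels, seg_weight=0.5):
    """Joint phase + segmentation loss over a batch of frames.

    seg_labels[i] is None when frame i has no segmentation annotation
    (hypothetical convention); otherwise it is a list of per-pixel class
    indices. seg_weight is an assumed balancing factor, not a value taken
    from the paper.
    """
    # Phase term: every frame carries a phase label (dense annotation).
    phase_terms = [cross_entropy(l, y) for l, y in zip(phase_logits, phase_labels)]
    phase_loss = sum(phase_terms) / len(phase_terms)

    # Segmentation term: only annotated frames contribute (sparse annotation).
    per_frame = []
    for pixel_logits, pixel_labels in zip(seg_logits, seg_labels):
        if pixel_labels is None:
            continue  # skip frames without a segmentation mask
        pixel_terms = [cross_entropy(l, y) for l, y in zip(pixel_logits, pixel_labels)]
        per_frame.append(sum(pixel_terms) / len(pixel_terms))
    seg_loss = sum(per_frame) / len(per_frame) if per_frame else 0.0

    return phase_loss + seg_weight * seg_loss, phase_loss, seg_loss


# Two frames, two phase classes; only the second frame has a (one-pixel) mask.
total, phase_loss, seg_loss = multitask_loss(
    phase_logits=[[0.0, 0.0], [0.0, 0.0]],
    phase_labels=[0, 1],
    seg_logits=[[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0]]],
    seg_labels=[None, [1]],
)
```

When a batch contains no annotated frame, the segmentation term falls back to zero, so the update is driven by the phase labels alone.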