Data-centric multi-task surgical phase estimation with sparse scene segmentation

PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic...

Bibliographic Details
Main authors:	Sanchez-Matilla, Ricardo, Robu, Maria, Grammatikopoulou, Maria, Luengo, Imanol, Stoyanov, Danail
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2022
Subjects:	Original Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/ https://www.ncbi.nlm.nih.gov/pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0

_version_	1784709106232197120
author	Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail
author_facet	Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail
author_sort	Sanchez-Matilla, Ricardo
collection	PubMed
description	PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for performing automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. This approach is performed in two stages due to computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on pushing the performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use dense phase annotations available in Cholec80, and sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k in less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimise them for performing accurate phase prediction. RESULTS AND CONCLUSION: We show that with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development.
format	Online Article Text
id	pubmed-9110447
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-9110447 2022-05-18 Data-centric multi-task surgical phase estimation with sparse scene segmentation Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail Int J Comput Assist Radiol Surg Original Article PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for performing automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. This approach is performed in two stages due to computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on pushing the performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use dense phase annotations available in Cholec80, and sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k in less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimise them for performing accurate phase prediction. RESULTS AND CONCLUSION: We show that with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art and more complex architectures when evaluated in similar settings. 
We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development. Springer International Publishing 2022-05-03 2022 /pmc/articles/PMC9110447/ /pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Original Article Sanchez-Matilla, Ricardo Robu, Maria Grammatikopoulou, Maria Luengo, Imanol Stoyanov, Danail Data-centric multi-task surgical phase estimation with sparse scene segmentation
title	Data-centric multi-task surgical phase estimation with sparse scene segmentation
title_full	Data-centric multi-task surgical phase estimation with sparse scene segmentation
title_fullStr	Data-centric multi-task surgical phase estimation with sparse scene segmentation
title_full_unstemmed	Data-centric multi-task surgical phase estimation with sparse scene segmentation
title_short	Data-centric multi-task surgical phase estimation with sparse scene segmentation
title_sort	data-centric multi-task surgical phase estimation with sparse scene segmentation
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9110447/ https://www.ncbi.nlm.nih.gov/pubmed/35505149 http://dx.doi.org/10.1007/s11548-022-02616-0
work_keys_str_mv	AT sanchezmatillaricardo datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT robumaria datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT grammatikopouloumaria datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT luengoimanol datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation AT stoyanovdanail datacentricmultitasksurgicalphaseestimationwithsparsescenesegmentation

Similar items