Global–local multi-stage temporal convolutional network for cataract surgery phase recognition

BACKGROUND: Surgical video phase recognition is an essential technique in computer-assisted surgical systems for monitoring surgical procedures, which can assist surgeons in standardizing procedures and enhancing postsurgical assessment and indexing. However, the high similarity between the phases a...

Bibliographic Details
Main Authors:	Fang, Lixin; Mou, Lei; Gu, Yuanyuan; Hu, Yan; Chen, Bang; Chen, Xu; Wang, Yang; Liu, Jiang; Zhao, Yitian
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710114/ https://www.ncbi.nlm.nih.gov/pubmed/36451164 http://dx.doi.org/10.1186/s12938-022-01048-w

author	Fang, Lixin Mou, Lei Gu, Yuanyuan Hu, Yan Chen, Bang Chen, Xu Wang, Yang Liu, Jiang Zhao, Yitian
author_sort	Fang, Lixin
collection	PubMed
description	BACKGROUND: Surgical video phase recognition is an essential technique in computer-assisted surgical systems for monitoring surgical procedures, which can assist surgeons in standardizing procedures and enhancing postsurgical assessment and indexing. However, the high similarity between the phases and temporal variations of cataract videos still poses the greatest challenge for video phase recognition. METHODS: In this paper, we introduce a global–local multi-stage temporal convolutional network (GL-MSTCN) to explore the subtle differences between high similarity surgical phases and mitigate the temporal variations of surgical videos. The presented work consists of a triple-stream network (i.e., pupil stream, instrument stream, and video frame stream) and a multi-stage temporal convolutional network. The triple-stream network first detects the pupil and surgical instruments regions in the frame separately and then obtains the fine-grained semantic features of the video frames. The proposed multi-stage temporal convolutional network improves the surgical phase recognition performance by capturing longer time series features through dilated convolutional layers with varying receptive fields. RESULTS: Our method is thoroughly validated on the CSVideo dataset with 32 cataract surgery videos and the public Cataract101 dataset with 101 cataract surgery videos, outperforming state-of-the-art approaches with 95.8% and 96.5% accuracy, respectively. CONCLUSIONS: The experimental results show that the use of global and local feature information can effectively enhance the model to explore fine-grained features and mitigate temporal and spatial variations, thus improving the surgical phase recognition performance of the proposed GL-MSTCN.
format	Online Article Text
id	pubmed-9710114
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9710114 2022-12-01 Global–local multi-stage temporal convolutional network for cataract surgery phase recognition Fang, Lixin Mou, Lei Gu, Yuanyuan Hu, Yan Chen, Bang Chen, Xu Wang, Yang Liu, Jiang Zhao, Yitian Biomed Eng Online Research BioMed Central 2022-11-30 /pmc/articles/PMC9710114/ /pubmed/36451164 http://dx.doi.org/10.1186/s12938-022-01048-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Fang, Lixin Mou, Lei Gu, Yuanyuan Hu, Yan Chen, Bang Chen, Xu Wang, Yang Liu, Jiang Zhao, Yitian Global–local multi-stage temporal convolutional network for cataract surgery phase recognition
title	Global–local multi-stage temporal convolutional network for cataract surgery phase recognition
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710114/ https://www.ncbi.nlm.nih.gov/pubmed/36451164 http://dx.doi.org/10.1186/s12938-022-01048-w
work_keys_str_mv	AT fanglixin globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT moulei globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT guyuanyuan globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT huyan globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT chenbang globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT chenxu globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT wangyang globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT liujiang globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition AT zhaoyitian globallocalmultistagetemporalconvolutionalnetworkforcataractsurgeryphaserecognition
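
The description field above says the temporal network captures longer time-series features through dilated convolutional layers with varying receptive fields. As a rough illustration of how dilation widens the temporal window, the sketch below computes the receptive field of a stack of dilated 1-D convolutions. The kernel size of 3 and the doubling dilation schedule are assumptions chosen because they are common in temporal convolutional networks; this record does not state the values used in the article, and the function name `receptive_field` is hypothetical.

```python
def receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Number of input frames seen by one output of a dilated conv stack.

    Assumption (not from the record): layer l (0-based) uses dilation 2**l,
    and every layer has stride 1 and the same kernel size.
    """
    field = 1  # a single frame before any convolution is applied
    for layer in range(num_layers):
        dilation = 2 ** layer
        # Each layer extends the window by (kernel_size - 1) * dilation frames.
        field += (kernel_size - 1) * dilation
    return field


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n} layers -> {receptive_field(n)} frames")
```

Under these assumptions the window grows exponentially with depth (3 frames for one layer, 63 for five, 2047 for ten), which is why a fairly shallow stack can cover long stretches of a surgical video.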
