PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos
PURPOSE: To build and evaluate deep learning models for recognizing cataract surgical steps from whole-length surgical videos with minimal preprocessing, including identification of routine and complex steps. METHODS: We collected 298 cataract surgical videos from 12 resident surgeons across 6 sites...
Main Authors: | Yeh, Hsu-Hang; Jain, Anjal M.; Fox, Olivia; Wang, Sophia Y. |
Format: | Online Article Text |
Language: | English |
Published: | The Association for Research in Vision and Ophthalmology, 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606857/ https://www.ncbi.nlm.nih.gov/pubmed/34784415 http://dx.doi.org/10.1167/tvst.10.13.23 |
_version_ | 1784602426777534464 |
author | Yeh, Hsu-Hang; Jain, Anjal M.; Fox, Olivia; Wang, Sophia Y. |
author_facet | Yeh, Hsu-Hang; Jain, Anjal M.; Fox, Olivia; Wang, Sophia Y. |
author_sort | Yeh, Hsu-Hang |
collection | PubMed |
description | PURPOSE: To build and evaluate deep learning models for recognizing cataract surgical steps from whole-length surgical videos with minimal preprocessing, including identification of routine and complex steps. METHODS: We collected 298 cataract surgical videos from 12 resident surgeons across 6 sites and excluded 30 incomplete, duplicated, and combination surgery videos. Videos were downsampled at 1 frame/second. Trained annotators labeled 13 steps of surgery: create wound, injection into the eye, capsulorrhexis, hydrodissection, phacoemulsification, irrigation/aspiration, place lens, remove viscoelastic, close wound, advanced technique/other, stain with trypan blue, manipulating iris, and subconjunctival injection. We trained two deep learning models, one based on the VGG16 architecture (VGG model) and the second using VGG16 followed by a long short-term memory network (convolutional neural network [CNN]–recurrent neural network [RNN] model). Class activation maps were visualized using Grad-CAM. RESULTS: Overall top-1 prediction accuracy was 76% for the VGG model (93% top-3 accuracy) and 84% for the CNN–RNN model (97% top-3 accuracy). The microaveraged area under the receiver operating characteristic curve was 0.97 for the VGG model and 0.99 for the CNN–RNN model. The microaveraged average precision score was 0.83 for the VGG model and 0.92 for the CNN–RNN model. Class activation maps revealed that the model appropriately focused on the instrumentation used in each step to identify which step was being performed. CONCLUSIONS: Deep learning models can classify cataract surgical activities on a frame-by-frame basis with remarkably high accuracy, especially for routine surgical steps. TRANSLATIONAL RELEVANCE: An automated system for recognition of cataract surgical steps could provide residents with automated feedback metrics, such as the length of time spent on each step. | (A hypothetical sketch of the CNN–RNN architecture described here appears after the record fields below.)
format | Online Article Text |
id | pubmed-8606857 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Association for Research in Vision and Ophthalmology |
record_format | MEDLINE/PubMed |
spelling | pubmed-8606857 2021-12-02 PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos Yeh, Hsu-Hang Jain, Anjal M. Fox, Olivia Wang, Sophia Y. Transl Vis Sci Technol Article PURPOSE: To build and evaluate deep learning models for recognizing cataract surgical steps from whole-length surgical videos with minimal preprocessing, including identification of routine and complex steps. METHODS: We collected 298 cataract surgical videos from 12 resident surgeons across 6 sites and excluded 30 incomplete, duplicated, and combination surgery videos. Videos were downsampled at 1 frame/second. Trained annotators labeled 13 steps of surgery: create wound, injection into the eye, capsulorrhexis, hydrodissection, phacoemulsification, irrigation/aspiration, place lens, remove viscoelastic, close wound, advanced technique/other, stain with trypan blue, manipulating iris, and subconjunctival injection. We trained two deep learning models, one based on the VGG16 architecture (VGG model) and the second using VGG16 followed by a long short-term memory network (convolutional neural network [CNN]–recurrent neural network [RNN] model). Class activation maps were visualized using Grad-CAM. RESULTS: Overall top-1 prediction accuracy was 76% for the VGG model (93% top-3 accuracy) and 84% for the CNN–RNN model (97% top-3 accuracy). The microaveraged area under the receiver operating characteristic curve was 0.97 for the VGG model and 0.99 for the CNN–RNN model. The microaveraged average precision score was 0.83 for the VGG model and 0.92 for the CNN–RNN model. Class activation maps revealed that the model appropriately focused on the instrumentation used in each step to identify which step was being performed. CONCLUSIONS: Deep learning models can classify cataract surgical activities on a frame-by-frame basis with remarkably high accuracy, especially for routine surgical steps. TRANSLATIONAL RELEVANCE: An automated system for recognition of cataract surgical steps could provide residents with automated feedback metrics, such as the length of time spent on each step. The Association for Research in Vision and Ophthalmology 2021-11-16 /pmc/articles/PMC8606857/ /pubmed/34784415 http://dx.doi.org/10.1167/tvst.10.13.23 Text en Copyright 2021 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. |
spellingShingle | Article Yeh, Hsu-Hang Jain, Anjal M. Fox, Olivia Wang, Sophia Y. PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title | PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title_full | PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title_fullStr | PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title_full_unstemmed | PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title_short | PhacoTrainer: A Multicenter Study of Deep Learning for Activity Recognition in Cataract Surgical Videos |
title_sort | phacotrainer: a multicenter study of deep learning for activity recognition in cataract surgical videos |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606857/ https://www.ncbi.nlm.nih.gov/pubmed/34784415 http://dx.doi.org/10.1167/tvst.10.13.23 |
work_keys_str_mv | AT yehhsuhang phacotraineramulticenterstudyofdeeplearningforactivityrecognitionincataractsurgicalvideos AT jainanjalm phacotraineramulticenterstudyofdeeplearningforactivityrecognitionincataractsurgicalvideos AT foxolivia phacotraineramulticenterstudyofdeeplearningforactivityrecognitionincataractsurgicalvideos AT wangsophiay phacotraineramulticenterstudyofdeeplearningforactivityrecognitionincataractsurgicalvideos |
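The description field above reports two models: a frame-level VGG16 classifier (VGG model) and a CNN–RNN model in which VGG16 features from frames downsampled to 1 frame/second are passed through a long short-term memory network before classification into the 13 surgical steps. The sketch below is a minimal, hypothetical Keras/TensorFlow reconstruction of such a CNN–RNN classifier; the sequence length, LSTM width, pooling, and training settings are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of a CNN-RNN surgical-step classifier (assumptions:
# Keras/TensorFlow, ImageNet-pretrained VGG16, 30-frame windows at 1 fps).
# This illustrates the architecture described in the abstract; it is not
# the authors' code.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_STEPS = 13               # 13 annotated surgical steps
SEQ_LEN = 30                 # assumed window of 1 fps frames fed to the LSTM
FRAME_SHAPE = (224, 224, 3)  # assumed VGG16 input size

# Frame-level feature extractor: VGG16 backbone with global average pooling.
backbone = tf.keras.applications.VGG16(
    include_top=False, weights="imagenet",
    input_shape=FRAME_SHAPE, pooling="avg",
)
backbone.trainable = False   # assumption: backbone kept frozen in this sketch

# CNN-RNN model: extract features from every frame in the window, let an LSTM
# aggregate temporal context, then predict a step label for each frame.
inputs = layers.Input(shape=(SEQ_LEN, *FRAME_SHAPE))
features = layers.TimeDistributed(backbone)(inputs)           # (batch, SEQ_LEN, 512)
context = layers.LSTM(256, return_sequences=True)(features)   # assumed hidden size
outputs = layers.TimeDistributed(
    layers.Dense(NUM_STEPS, activation="softmax")
)(context)                                                     # per-frame step probabilities

model = models.Model(inputs, outputs)
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy",
             tf.keras.metrics.SparseTopKCategoricalAccuracy(k=3, name="top3_acc")],
)
model.summary()
```

The VGG-only baseline described in the abstract would simply drop the LSTM and classify each frame independently from the pooled VGG16 features; per-frame top-1/top-3 accuracy, micro-averaged AUROC, and average precision can then be computed from the softmax outputs.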