Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches

Bibliographic Details
Main Authors: Jalal, Nour Aldeen; Alshirbaji, Tamer Abdulbaki; Docherty, Paul David; Arabian, Herag; Laufer, Bernhard; Krueger-Ziolek, Sabine; Neumuth, Thomas; Moeller, Knut
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9964851/
https://www.ncbi.nlm.nih.gov/pubmed/36850554
http://dx.doi.org/10.3390/s23041958
author Jalal, Nour Aldeen
Alshirbaji, Tamer Abdulbaki
Docherty, Paul David
Arabian, Herag
Laufer, Bernhard
Krueger-Ziolek, Sabine
Neumuth, Thomas
Moeller, Knut
collection PubMed
description Adapting intelligent context-aware systems (CAS) to future operating rooms (OR) aims to improve situational awareness and provide surgical decision support to medical teams. A CAS analyzes data streams from the devices available during surgery and communicates real-time knowledge to clinicians. Recent advances in computer vision and machine learning, particularly deep learning, have paved the way for extensive research on developing CAS. In this work, a deep learning approach was proposed for surgical phase recognition, tool classification, and weakly supervised tool localization in laparoscopic videos. The ResNet-50 convolutional neural network (CNN) architecture was adapted by adding attention modules and fusing features from multiple stages to generate better-focused, more general, and more representative features. Then, a multi-map convolutional layer followed by tool-wise and spatial pooling operations was used to perform tool localization and generate tool presence confidences. Finally, a long short-term memory (LSTM) network was employed to model temporal information and perform tool classification and phase recognition. The proposed approach was evaluated on the Cholec80 dataset. The experimental results (88.5% mean precision and 89.0% mean recall for phase recognition, 95.6% mean average precision for tool presence detection, and a 70.1% F1-score for tool localization) demonstrated the ability of the model to learn discriminative features for all tasks. These results indicate the importance of integrating attention modules and multi-stage feature fusion for more robust and precise detection of surgical phases and tools.
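The tool-wise and spatial pooling step named in the description can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the record does not say which pooling functions were used, so averaging across a tool's maps (tool-wise pooling) and taking the global maximum of the resulting map (spatial pooling) are assumptions made here for illustration, and all function names and the toy input are hypothetical.

```python
from typing import List, Tuple

Map2D = List[List[float]]


def tool_wise_pool(maps: List[Map2D]) -> Map2D:
    """Collapse the M maps of one tool into one H x W localization map.

    Assumption: element-wise mean across the M maps.
    """
    m = len(maps)
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(mp[r][c] for mp in maps) / m for c in range(w)] for r in range(h)]


def spatial_pool(loc_map: Map2D) -> float:
    """Reduce a localization map to one presence score.

    Assumption: global maximum over all spatial positions.
    """
    return max(max(row) for row in loc_map)


def peak_position(loc_map: Map2D) -> Tuple[int, int]:
    """Return (row, col) of the strongest response, a crude location estimate."""
    h, w = len(loc_map), len(loc_map[0])
    return max(((r, c) for r in range(h) for c in range(w)),
               key=lambda rc: loc_map[rc[0]][rc[1]])


def localize_and_score(multi_maps: List[List[Map2D]]):
    """multi_maps[t][m] is the m-th H x W map for tool t."""
    loc_maps = [tool_wise_pool(tool_maps) for tool_maps in multi_maps]
    scores = [spatial_pool(lm) for lm in loc_maps]
    return loc_maps, scores


# Toy input: 2 tools, 2 maps per tool, 2 x 2 spatial grid (values are made up).
example = [
    [[[0.0, 1.0], [0.0, 0.0]], [[0.0, 3.0], [0.0, 0.0]]],          # tool 0
    [[[-1.0, -1.0], [-1.0, -1.0]], [[-1.0, -1.0], [-1.0, -3.0]]],  # tool 1
]
loc_maps, scores = localize_and_score(example)
peak = peak_position(loc_maps[0])
```

In this sketch the per-tool localization map doubles as the weak localization output, while the pooled score acts as the presence confidence, so only image-level tool labels would be needed to train such a layer.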
format Online
Article
Text
id pubmed-9964851
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9964851 2023-02-26 Sensors (Basel) Article MDPI 2023-02-09 /pmc/articles/PMC9964851/ /pubmed/36850554 http://dx.doi.org/10.3390/s23041958 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches
topic Article