Automatic Pharyngeal Phase Recognition in Untrimmed Videofluoroscopic Swallowing Study Using Transfer Learning with Deep Convolutional Neural Networks

Background: Videofluoroscopic swallowing study (VFSS) is considered the gold standard diagnostic tool for evaluating dysphagia. However, it is time-consuming and labor-intensive for the clinician to manually search the long recorded video frame by frame to identify the instantaneous swallo...

Bibliographic Details
Main authors: Lee, Ki-Sun; Lee, Eunyoung; Choi, Bareun; Pyun, Sung-Bom
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7918932/
https://www.ncbi.nlm.nih.gov/pubmed/33668528
http://dx.doi.org/10.3390/diagnostics11020300
_version_ 1783658035272482816
author Lee, Ki-Sun
Lee, Eunyoung
Choi, Bareun
Pyun, Sung-Bom
collection PubMed
description Background: Videofluoroscopic swallowing study (VFSS) is considered the gold standard diagnostic tool for evaluating dysphagia. However, it is time-consuming and labor-intensive for the clinician to manually search the long recorded video frame by frame to identify instantaneous swallowing abnormalities in VFSS images. Therefore, this study presents a deep learning-based approach using transfer learning with a convolutional neural network (CNN) that automatically annotates pharyngeal phase frames in untrimmed VFSS videos so that frames need not be searched manually. Methods: To determine whether an image frame in a VFSS video is in the pharyngeal phase, a single-frame baseline architecture based on a deep CNN framework is used, and a transfer learning technique with fine-tuning is applied. Results: Among all experimental CNN models, the VGG-16 model fine-tuned with two blocks (VGG16-FT5) achieved the highest performance in recognizing pharyngeal phase frames, with an accuracy of 93.20 (±1.25)%, sensitivity of 84.57 (±5.19)%, specificity of 94.36 (±1.21)%, AUC of 0.8947 (±0.0269), and Kappa of 0.7093 (±0.0488). Conclusions: Using appropriate fine-tuning techniques and explainable deep learning techniques such as Grad-CAM, this study shows that the proposed deep CNN framework, based on a single-frame baseline architecture, can yield high performance in the full automation of VFSS video analysis.
format Online
Article
Text
id pubmed-7918932
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7918932 2021-03-02 Diagnostics (Basel) Article MDPI 2021-02-13 /pmc/articles/PMC7918932/ /pubmed/33668528 http://dx.doi.org/10.3390/diagnostics11020300 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Automatic Pharyngeal Phase Recognition in Untrimmed Videofluoroscopic Swallowing Study Using Transfer Learning with Deep Convolutional Neural Networks
topic Article
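
Note on the reported metrics: the Results in the description field give accuracy, sensitivity, specificity, and Cohen's kappa for per-frame pharyngeal phase recognition. As a minimal sketch of how such figures are conventionally derived from a binary confusion matrix, the function below uses the standard textbook definitions. The function name `binary_metrics` and the counts in the usage line are hypothetical illustrations, not values or code from the article, and the record does not state how the authors computed their metrics.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard per-frame metrics from binary confusion-matrix counts.

    Hypothetical helper for illustration only. Assumes both classes occur
    (tp + fn > 0 and tn + fp > 0) and that chance agreement is below 1.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the marginal label frequencies.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((tn + fn) / total) * ((tn + fp) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Made-up counts: 40 true positives, 10 false positives,
# 40 true negatives, 10 false negatives.
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

Pharyngeal phase frames are typically a small fraction of an untrimmed video, so accuracy alone can look high even when many positive frames are missed. Reporting sensitivity and kappa alongside it, as the abstract does, guards against that.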