
Video Analysis of Small Bowel Capsule Endoscopy Using a Transformer Network

Although wireless capsule endoscopy (WCE) detects small bowel diseases effectively, it has some limitations. For example, the reading process can be time consuming due to the numerous images generated per case, and lesion detection accuracy may depend on the operators' skills and experience. Hence, many researchers have recently developed deep-learning-based methods to address these limitations. However, they tend to select only a portion of the images from a given WCE video and analyze each image individually. In this study, we note that more information can be extracted from the unused frames and from the temporal relations of sequential frames. Specifically, to increase the accuracy of lesion detection without depending on experts' frame selection skills, we suggest using whole video frames as the input to the deep learning system. Thus, we propose a new Transformer-architecture-based neural encoder that takes the entire video as the input, exploiting the power of the Transformer architecture to extract long-term global correlation within and between the input frames. Subsequently, we can capture the temporal context of the input frames and the attentional features within a frame. Tests on benchmark datasets of four WCE videos showed 95.1% sensitivity and 83.4% specificity. These results may significantly advance automated lesion detection techniques for WCE images.
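The abstract describes the architecture only at a high level: per-frame features fed through a Transformer encoder whose self-attention spans the whole video. Below is a minimal PyTorch sketch of that kind of whole-video design, not the authors' implementation; the class name WholeVideoLesionEncoder, the tiny convolutional backbone, and all dimensions (feat_dim, n_heads, n_layers, max_frames) are illustrative assumptions.

import torch
import torch.nn as nn

class WholeVideoLesionEncoder(nn.Module):
    """Scores every frame of a capsule-endoscopy video for lesions in one pass."""

    def __init__(self, feat_dim=256, n_heads=8, n_layers=4, max_frames=20000):
        super().__init__()
        # Per-frame feature extractor (a tiny stand-in for a real CNN backbone).
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Learned positional embedding so self-attention sees frame order.
        self.pos = nn.Parameter(torch.zeros(1, max_frames, feat_dim))
        # Transformer encoder: self-attention across ALL frames captures the
        # long-range temporal context the abstract attributes to the encoder.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=n_heads,
                                           batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(feat_dim, 1)  # per-frame lesion logit

    def forward(self, video):                      # video: (T, 3, H, W)
        feats = self.frame_encoder(video)          # (T, feat_dim)
        x = feats.unsqueeze(0) + self.pos[:, :feats.size(0)]
        x = self.temporal_encoder(x)               # (1, T, feat_dim)
        return self.classifier(x).squeeze(-1)      # (1, T) frame-level logits

# Toy usage: probability of a lesion in each of 1000 frames.
model = WholeVideoLesionEncoder()
video = torch.randn(1000, 3, 64, 64)               # toy resolution
with torch.no_grad():
    probs = torch.sigmoid(model(video))[0]         # (1000,) per-frame scores

Note that full self-attention is quadratic in the number of frames, so a practical system for hour-long capsule videos would presumably chunk the sequence or use an efficient attention variant; the sketch ignores that concern.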

Bibliographic Details
Main Authors: Oh, SangYup; Oh, DongJun; Kim, Dongmin; Song, Woohyuk; Hwang, Youngbae; Cho, Namik; Lim, Yun Jeong
Format: Online Article Text
Language: English
Published: MDPI, 2023
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10572266/
https://www.ncbi.nlm.nih.gov/pubmed/37835876
http://dx.doi.org/10.3390/diagnostics13193133
Collection: PubMed
Record ID: pubmed-10572266
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Diagnostics (Basel)
Published Online: 2023-10-05
License: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).