
Classification of endoscopic image and video frames using distance metric-based learning with interpolated latent features

Conventional Endoscopy (CE) and Wireless Capsule Endoscopy (WCE) are well-known tools for diagnosing disorders of the gastrointestinal (GI) tract. Defining the anatomical location within the GI tract helps clinicians determine appropriate treatment options, which can reduce the need for repetitive...


Bibliographic Details
Main Authors:	Sedighipour Chafjiri, Fatemeh, Mohebbian, Mohammad Reza, Wahid, Khan A., Babyn, Paul
Format:	Online Article Text
Language:	English
Published:	Springer US 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10020761/ https://www.ncbi.nlm.nih.gov/pubmed/37362715 http://dx.doi.org/10.1007/s11042-023-14982-1

author	Sedighipour Chafjiri, Fatemeh Mohebbian, Mohammad Reza Wahid, Khan A. Babyn, Paul
collection	PubMed
description	Conventional Endoscopy (CE) and Wireless Capsule Endoscopy (WCE) are well-known tools for diagnosing disorders of the gastrointestinal (GI) tract. Defining the anatomical location within the GI tract helps clinicians determine appropriate treatment options, which can reduce the need for repetitive endoscopy. Limited research addresses the localization of the anatomical location of WCE and CE images using classification, mainly due to the difficulty in collecting annotated data. In this study, we present a few-shot learning method based on distance metric learning that combines transfer learning and manifold mixup schemes to localize and classify endoscopic images and video frames. The proposed method allows us to develop a pipeline for endoscopy video sequence localization that can be trained with only a few samples. The use of manifold mixup improves learning by increasing the number of training epochs while reducing overfitting and providing more accurate decision boundaries. A dataset is collected from 10 different anatomical positions of the human GI tract. Two models were trained using only 78 CE and 27 WCE annotated frames to predict the location of 25,700 and 1,825 video frames from CE and WCE, respectively. We performed a subjective evaluation with nine gastroenterologists to validate the need for such an automated system to localize endoscopic images and video frames. Our method achieved higher accuracy and a higher F1-score than the scores from the subjective evaluation. In addition, the results show improved performance, with lower cross-entropy loss, when compared with several existing methods trained on the same datasets. This indicates that the proposed method has the potential to be used in endoscopy image classification. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11042-023-14982-1.
format	Online Article Text
id	pubmed-10020761
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-10020761 2023-03-17 Multimed Tools Appl Article Springer US 2023-03-17 /pmc/articles/PMC10020761/ /pubmed/37362715 http://dx.doi.org/10.1007/s11042-023-14982-1 Text en © Crown 2023 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Classification of endoscopic image and video frames using distance metric-based learning with interpolated latent features
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10020761/ https://www.ncbi.nlm.nih.gov/pubmed/37362715 http://dx.doi.org/10.1007/s11042-023-14982-1
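
The description field mentions manifold mixup and distance metric learning without showing what either involves. The sketch below is a minimal illustration, not the authors' implementation: it assumes manifold mixup in its commonly described form (two latent feature vectors and their one-hot labels are interpolated with a coefficient drawn from a Beta distribution) and assumes a simple nearest-prototype rule with Euclidean distance as the metric-based classifier. The function names, the toy two-dimensional features, and the value of `alpha` are all hypothetical.

```python
import math
import random


def sample_lambda(alpha, rng):
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mix_latent(h_a, h_b, y_a, y_b, lam):
    """Interpolate two latent feature vectors and their one-hot labels."""
    h_mix = [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return h_mix, y_mix


def nearest_prototype(h, prototypes):
    """Return the index of the class prototype closest to h (Euclidean)."""
    distances = [math.dist(h, p) for p in prototypes]
    return distances.index(min(distances))


# Toy latent features for two classes (hypothetical values).
h_a, y_a = [0.0, 0.0], [1.0, 0.0]
h_b, y_b = [4.0, 2.0], [0.0, 1.0]

# A fixed coefficient keeps the example deterministic.
h_mix, y_mix = mix_latent(h_a, h_b, y_a, y_b, lam=0.75)

# In training, the coefficient would be sampled for each pair instead.
lam = sample_lambda(alpha=2.0, rng=random.Random(0))

# The mixed point lies closer to class 0, matching its larger label weight.
predicted = nearest_prototype(h_mix, [h_a, h_b])
```

The mixed label keeps the same proportions as the mixed features, so a model trained on such pairs is pushed toward smoother transitions between classes, which is the intuition behind the "more accurate decision boundaries" claim in the description.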
