
Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection

Recent advances in retinal disease detection have mainly focused on distinct feature extraction using either a convolutional neural network (CNN) or a transformer-based end-to-end deep learning (DL) model. The individual end-to-end DL models are capable of only processing texture or shape-ba...


Bibliographic Details
Main Authors:	Dutta, Pramit; Sathi, Khaleda Akther; Hossain, Md. Azad; Dewan, M. Ali Akber
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10381782/ https://www.ncbi.nlm.nih.gov/pubmed/37504817 http://dx.doi.org/10.3390/jimaging9070140

_version_	1785080529307041792
author	Dutta, Pramit; Sathi, Khaleda Akther; Hossain, Md. Azad; Dewan, M. Ali Akber
author_facet	Dutta, Pramit; Sathi, Khaleda Akther; Hossain, Md. Azad; Dewan, M. Ali Akber
author_sort	Dutta, Pramit
collection	PubMed
description	Recent advances in retinal disease detection have mainly focused on distinct feature extraction using either a convolutional neural network (CNN) or a transformer-based end-to-end deep learning (DL) model. The individual end-to-end DL models are capable of only processing texture or shape-based information for performing detection tasks. However, extraction of only texture- or shape-based features does not provide the model robustness needed to classify different types of retinal diseases. Therefore, considering both feature types, this paper developed a fusion model called ‘Conv-ViT’ to detect retinal diseases from foveal cut optical coherence tomography (OCT) images. The transfer learning-based CNN models, such as Inception-V3 and ResNet-50, are utilized to process texture information by calculating the correlation between nearby pixels. Additionally, the vision transformer model is fused to process shape-based features by determining the correlation between long-distance pixels. The hybridization of these three models results in shape-based texture feature learning during the classification of retinal diseases into four classes: choroidal neovascularization (CNV), diabetic macular edema (DME), DRUSEN, and NORMAL. The weighted average classification accuracy, precision, recall, and F1 score of the model are found to be approximately 94%. The results indicate that the fusion of both texture and shape features helped the proposed Conv-ViT model outperform the state-of-the-art retinal disease classification models.
format	Online Article Text
id	pubmed-10381782
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10381782 2023-07-29 Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection Dutta, Pramit Sathi, Khaleda Akther Hossain, Md. Azad Dewan, M. Ali Akber J Imaging Article The current advancement towards retinal disease detection mainly focused on distinct feature extraction using either a convolutional neural network (CNN) or a transformer-based end-to-end deep learning (DL) model. The individual end-to-end DL models are capable of only processing texture or shape-based information for performing detection tasks. However, extraction of only texture- or shape-based features does not provide the model robustness needed to classify different types of retinal diseases. Therefore, concerning these two features, this paper developed a fusion model called ‘Conv-ViT’ to detect retinal diseases from foveal cut optical coherence tomography (OCT) images. The transfer learning-based CNN models, such as Inception-V3 and ResNet-50, are utilized to process texture information by calculating the correlation of the nearby pixel. Additionally, the vision transformer model is fused to process shape-based features by determining the correlation between long-distance pixels. The hybridization of these three models results in shape-based texture feature learning during the classification of retinal diseases into its four classes, including choroidal neovascularization (CNV), diabetic macular edema (DME), DRUSEN, and NORMAL. The weighted average classification accuracy, precision, recall, and F1 score of the model are found to be approximately 94%. The results indicate that the fusion of both texture and shape features assisted the proposed Conv-ViT model to outperform the state-of-the-art retinal disease classification models. MDPI 2023-07-10 /pmc/articles/PMC10381782/ /pubmed/37504817 http://dx.doi.org/10.3390/jimaging9070140 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Dutta, Pramit Sathi, Khaleda Akther Hossain, Md. Azad Dewan, M. Ali Akber Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title	Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title_full	Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title_fullStr	Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title_full_unstemmed	Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title_short	Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection
title_sort	conv-vit: a convolution and vision transformer-based hybrid feature extraction method for retinal disease detection
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10381782/ https://www.ncbi.nlm.nih.gov/pubmed/37504817 http://dx.doi.org/10.3390/jimaging9070140
work_keys_str_mv	AT duttapramit convvitaconvolutionandvisiontransformerbasedhybridfeatureextractionmethodforretinaldiseasedetection AT sathikhaledaakther convvitaconvolutionandvisiontransformerbasedhybridfeatureextractionmethodforretinaldiseasedetection AT hossainmdazad convvitaconvolutionandvisiontransformerbasedhybridfeatureextractionmethodforretinaldiseasedetection AT dewanmaliakber convvitaconvolutionandvisiontransformerbasedhybridfeatureextractionmethodforretinaldiseasedetection


Similar Items