
Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs

Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus photography, a widely available and low-cost approach already adopted for automated screening of ophthalmic diseases such as diabetic retinopathy. Despite this, the lack of validated early screening approaches remains a...

Bibliographic Details
Main Authors: Hwang, Elizabeth E., Chen, Dake, Han, Ying, Jia, Lin, Shan, Jing
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10669064/
https://www.ncbi.nlm.nih.gov/pubmed/38002390
http://dx.doi.org/10.3390/bioengineering10111266
author Hwang, Elizabeth E.
Chen, Dake
Han, Ying
Jia, Lin
Shan, Jing
collection PubMed
description Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus photography, a widely available and low-cost approach already adopted for automated screening of ophthalmic diseases such as diabetic retinopathy. Despite this, the lack of validated early screening approaches remains a major obstacle in the prevention of glaucoma-related blindness. Deep learning models have gained significant interest as potential solutions because they offer objective and high-throughput methods for processing image-based medical data. While convolutional neural networks (CNNs) have been widely used for these purposes, more recent advances in the application of Transformer architectures have led to new models, including the Vision Transformer (ViT), that have shown promise in many domains of image analysis. However, previous studies have not sufficiently compared these two architectures side by side on more than a single dataset, making it unclear which model is more generalizable or performs better in different clinical contexts. Our purpose is to investigate comparable ViT and CNN models tasked with GON detection from fundus photos and to highlight their respective strengths and weaknesses. We train CNN and ViT models on six unrelated, publicly available databases and compare their performance using well-established statistics, including AUC, sensitivity, and specificity. Our results indicate that ViT models often show superior performance when compared with a similarly trained CNN model, particularly when non-glaucomatous images are over-represented in a given dataset. We discuss the clinical implications of these findings and suggest that ViT can further the development of accurate and scalable GON detection for this leading cause of irreversible blindness worldwide.
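The three statistics named in the description (AUC, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, with 1 standing for a glaucomatous image and 0 for a non-glaucomatous one.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given score threshold.

    The threshold of 0.5 is an assumption for this sketch.
    """
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)  # true positives
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)   # false negatives
    tn = sum(1 for t, s in pairs if t == 0 and s < threshold)   # true negatives
    fp = sum(1 for t, s in pairs if t == 0 and s >= threshold)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher;
    ties count as half a win."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): 3 glaucomatous and
# 5 non-glaucomatous images, mimicking an imbalanced dataset.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)
area = auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are usually reported together when class proportions differ between datasets.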
format Online
Article
Text
id pubmed-10669064
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10669064 2023-10-30 Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs Hwang, Elizabeth E. Chen, Dake Han, Ying Jia, Lin Shan, Jing Bioengineering (Basel) Brief Report MDPI 2023-10-30 /pmc/articles/PMC10669064/ /pubmed/38002390 http://dx.doi.org/10.3390/bioengineering10111266 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs
topic Brief Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10669064/
https://www.ncbi.nlm.nih.gov/pubmed/38002390
http://dx.doi.org/10.3390/bioengineering10111266