Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization

PURPOSE: To compare the diagnostic accuracy and explainability of a Vision Transformer deep learning technique, Data-efficient image Transformer (DeiT), and ResNet-50, trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG) and i...

Bibliographic Details
Main Authors:	Fan, Rui; Alipour, Kamran; Bowd, Christopher; Christopher, Mark; Brye, Nicole; Proudfoot, James A.; Goldbaum, Michael H.; Belghith, Akram; Girkin, Christopher A.; Fazio, Massimo A.; Liebmann, Jeffrey M.; Weinreb, Robert N.; Pazzani, Michael; Kriegman, David; Zangwill, Linda M.
Format:	Online Article Text
Language:	English
Published:	Elsevier 2022
Subjects:	Artificial Intelligence and Big Data
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762193/ https://www.ncbi.nlm.nih.gov/pubmed/36545260 http://dx.doi.org/10.1016/j.xops.2022.100233

_version_	1784852819581337600
author	Fan, Rui Alipour, Kamran Bowd, Christopher Christopher, Mark Brye, Nicole Proudfoot, James A. Goldbaum, Michael H. Belghith, Akram Girkin, Christopher A. Fazio, Massimo A. Liebmann, Jeffrey M. Weinreb, Robert N. Pazzani, Michael Kriegman, David Zangwill, Linda M.
author_facet	Fan, Rui Alipour, Kamran Bowd, Christopher Christopher, Mark Brye, Nicole Proudfoot, James A. Goldbaum, Michael H. Belghith, Akram Girkin, Christopher A. Fazio, Massimo A. Liebmann, Jeffrey M. Weinreb, Robert N. Pazzani, Michael Kriegman, David Zangwill, Linda M.
author_sort	Fan, Rui
collection	PubMed
description	PURPOSE: To compare the diagnostic accuracy and explainability of a Vision Transformer deep learning technique, Data-efficient image Transformer (DeiT), and ResNet-50, trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG) and identify the salient areas of the photographs most important for each model’s decision-making process. DESIGN: Evaluation of a diagnostic technology. SUBJECTS, PARTICIPANTS, AND CONTROLS: Overall 66 715 photographs from 1636 OHTS participants and an additional 5 external datasets of 16 137 photographs of healthy and glaucoma eyes. METHODS: Data-efficient image Transformer models were trained to detect 5 ground-truth OHTS POAG classifications: OHTS end point committee POAG determinations because of disc changes (model 1), visual field (VF) changes (model 2), or either disc or VF changes (model 3) and Reading Center determinations based on disc (model 4) and VFs (model 5). The best-performing DeiT models were compared with ResNet-50 models on OHTS and 5 external datasets. MAIN OUTCOME MEASURES: Diagnostic performance was compared using areas under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities. The explainability of the DeiT and ResNet-50 models was compared by evaluating the attention maps derived directly from DeiT to 3 gradient-weighted class activation map strategies. RESULTS: Compared with our best-performing ResNet-50 models, the DeiT models demonstrated similar performance on the OHTS test sets for all 5 ground-truth POAG labels; AUROC ranged from 0.82 (model 5) to 0.91 (model 1). Data-efficient image Transformer AUROC was consistently higher than ResNet-50 on the 5 external datasets. For example, AUROC for the main OHTS end point (model 3) was between 0.08 and 0.20 higher in the DeiT than ResNet-50 models. 
The saliency maps from the DeiT highlight localized areas of the neuroretinal rim, suggesting important rim features for classification. The same maps in the ResNet-50 models show a more diffuse, generalized distribution around the optic disc. CONCLUSIONS: Vision Transformers have the potential to improve generalizability and explainability in deep learning models, detecting eye disease and possibly other medical conditions that rely on imaging for clinical diagnosis and management.
format	Online Article Text
id	pubmed-9762193
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-97621932022-12-20 Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization Fan, Rui Alipour, Kamran Bowd, Christopher Christopher, Mark Brye, Nicole Proudfoot, James A. Goldbaum, Michael H. Belghith, Akram Girkin, Christopher A. Fazio, Massimo A. Liebmann, Jeffrey M. Weinreb, Robert N. Pazzani, Michael Kriegman, David Zangwill, Linda M. Ophthalmol Sci Artificial Intelligence and Big Data PURPOSE: To compare the diagnostic accuracy and explainability of a Vision Transformer deep learning technique, Data-efficient image Transformer (DeiT), and ResNet-50, trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG) and identify the salient areas of the photographs most important for each model’s decision-making process. DESIGN: Evaluation of a diagnostic technology. SUBJECTS, PARTICIPANTS, AND CONTROLS: Overall 66 715 photographs from 1636 OHTS participants and an additional 5 external datasets of 16 137 photographs of healthy and glaucoma eyes. METHODS: Data-efficient image Transformer models were trained to detect 5 ground-truth OHTS POAG classifications: OHTS end point committee POAG determinations because of disc changes (model 1), visual field (VF) changes (model 2), or either disc or VF changes (model 3) and Reading Center determinations based on disc (model 4) and VFs (model 5). The best-performing DeiT models were compared with ResNet-50 models on OHTS and 5 external datasets. MAIN OUTCOME MEASURES: Diagnostic performance was compared using areas under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities. The explainability of the DeiT and ResNet-50 models was compared by evaluating the attention maps derived directly from DeiT to 3 gradient-weighted class activation map strategies. 
RESULTS: Compared with our best-performing ResNet-50 models, the DeiT models demonstrated similar performance on the OHTS test sets for all 5 ground-truth POAG labels; AUROC ranged from 0.82 (model 5) to 0.91 (model 1). Data-efficient image Transformer AUROC was consistently higher than ResNet-50 on the 5 external datasets. For example, AUROC for the main OHTS end point (model 3) was between 0.08 and 0.20 higher in the DeiT than ResNet-50 models. The saliency maps from the DeiT highlight localized areas of the neuroretinal rim, suggesting important rim features for classification. The same maps in the ResNet-50 models show a more diffuse, generalized distribution around the optic disc. CONCLUSIONS: Vision Transformers have the potential to improve generalizability and explainability in deep learning models, detecting eye disease and possibly other medical conditions that rely on imaging for clinical diagnosis and management. Elsevier 2022-10-19 /pmc/articles/PMC9762193/ /pubmed/36545260 http://dx.doi.org/10.1016/j.xops.2022.100233 Text en © 2022 Published by Elsevier Inc. on behalf of American Academy of Ophthalmology. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle	Artificial Intelligence and Big Data Fan, Rui Alipour, Kamran Bowd, Christopher Christopher, Mark Brye, Nicole Proudfoot, James A. Goldbaum, Michael H. Belghith, Akram Girkin, Christopher A. Fazio, Massimo A. Liebmann, Jeffrey M. Weinreb, Robert N. Pazzani, Michael Kriegman, David Zangwill, Linda M. Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title	Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title_full	Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title_fullStr	Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title_full_unstemmed	Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title_short	Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization
title_sort	detecting glaucoma from fundus photographs using deep learning without convolutions: transformer for improved generalization
topic	Artificial Intelligence and Big Data
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762193/ https://www.ncbi.nlm.nih.gov/pubmed/36545260 http://dx.doi.org/10.1016/j.xops.2022.100233
