
Visual Field Prediction: Evaluating the Clinical Relevance of Deep Learning Models


Bibliographic Details
Main authors: Eslami, Mohammad; Kim, Julia A.; Zhang, Miao; Boland, Michael V.; Wang, Mengyu; Chang, Dolly S.; Elze, Tobias
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619031/
https://www.ncbi.nlm.nih.gov/pubmed/36325476
http://dx.doi.org/10.1016/j.xops.2022.100222
author Eslami, Mohammad
Kim, Julia A.
Zhang, Miao
Boland, Michael V.
Wang, Mengyu
Chang, Dolly S.
Elze, Tobias
collection PubMed
description PURPOSE: Two novel deep learning methods using a convolutional neural network (CNN) and a recurrent neural network (RNN) have recently been developed to forecast future visual fields (VFs). Although the original evaluations of these models focused on overall accuracy, it was not assessed whether they can accurately identify patients with progressive glaucomatous vision loss to aid clinicians in preventing further decline. We evaluated these 2 prediction models for potential biases in overestimating or underestimating VF changes over time.
DESIGN: Retrospective observational cohort study.
PARTICIPANTS: All available and reliable Swedish Interactive Thresholding Algorithm Standard 24-2 VFs from Massachusetts Eye and Ear Glaucoma Service collected between 1999 and 2020 were extracted. Because of the methods’ respective needs, the CNN data set included 54 373 samples from 7472 patients, and the RNN data set included 24 430 samples from 1809 patients.
METHODS: The CNN and RNN methods were reimplemented. A fivefold cross-validation procedure was performed on each model, and pointwise mean absolute error (PMAE) was used to measure prediction accuracy. Test data were stratified into categories based on the severity of VF progression to investigate the models’ performances on predicting worsening cases. The models were additionally compared with a no-change model that uses the baseline VF (for the CNN) and the last-observed VF (for the RNN) for its prediction.
MAIN OUTCOME MEASURES: PMAE in predictions.
RESULTS: The overall PMAE 95% confidence intervals were 2.21 to 2.24 decibels (dB) for the CNN and 2.56 to 2.61 dB for the RNN, which were close to the original studies’ reported values. However, both models exhibited large errors in identifying patients with worsening VFs and often failed to outperform the no-change model. Pointwise mean absolute error values were higher in patients with greater changes in mean sensitivity (for the CNN) and mean total deviation (for the RNN) between baseline and follow-up VFs.
CONCLUSIONS: Although our evaluation confirms the low overall PMAEs reported in the original studies, our findings also reveal that both models severely underpredict worsening of VF loss. Because the accurate detection and projection of glaucomatous VF decline is crucial in ophthalmic clinical practice, we recommend that this consideration is explicitly taken into account when developing and evaluating future deep learning models.
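The evaluation described in the abstract (PMAE, a no-change baseline, and stratification by severity of progression) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the bin edges of -3 dB and -1 dB, and the bin labels are hypothetical, chosen only for illustration, and the abstract does not state the categories actually used. Visual fields are assumed to be plain lists of sensitivities in dB, and the change in mean sensitivity is taken as follow-up minus baseline.

```python
from statistics import mean


def pmae(predicted, observed):
    """Pointwise mean absolute error (dB) between two visual fields.

    Each field is a list of per-location sensitivities of equal length.
    """
    if len(predicted) != len(observed):
        raise ValueError("visual fields must have the same number of test points")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def severity_bin(baseline, follow_up, edges=(-3.0, -1.0)):
    """Assign a sample to a progression category.

    The category is based on the change in mean sensitivity
    (follow-up minus baseline). The edges are hypothetical,
    not the thresholds used in the study.
    """
    change = mean(follow_up) - mean(baseline)
    if change <= edges[0]:
        return "marked worsening"
    if change <= edges[1]:
        return "mild worsening"
    return "stable or improved"


def compare_to_no_change(samples):
    """Compare a model against the no-change baseline within each category.

    samples: iterable of (baseline, model_prediction, follow_up) triples.
    Returns {category: (mean model PMAE, mean no-change PMAE)}.
    The no-change model simply predicts the baseline field again.
    """
    per_bin = {}
    for baseline, predicted, follow_up in samples:
        key = severity_bin(baseline, follow_up)
        per_bin.setdefault(key, []).append(
            (pmae(predicted, follow_up), pmae(baseline, follow_up))
        )
    return {
        key: (mean(m for m, _ in errs), mean(n for _, n in errs))
        for key, errs in per_bin.items()
    }
```

Reporting the two PMAE values side by side per category, rather than one pooled PMAE, is what exposes the pattern the abstract describes: a model can have a low overall error while doing no better than the no-change baseline on the worsening cases.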
format Online
Article
Text
id pubmed-9619031
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9619031 2022-11-01 Visual Field Prediction: Evaluating the Clinical Relevance of Deep Learning Models Eslami, Mohammad Kim, Julia A. Zhang, Miao Boland, Michael V. Wang, Mengyu Chang, Dolly S. Elze, Tobias Ophthalmol Sci Original Article Elsevier 2022-09-13 /pmc/articles/PMC9619031/ /pubmed/36325476 http://dx.doi.org/10.1016/j.xops.2022.100222 Text en © 2022 by the American Academy of Ophthalmology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Visual Field Prediction: Evaluating the Clinical Relevance of Deep Learning Models
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619031/
https://www.ncbi.nlm.nih.gov/pubmed/36325476
http://dx.doi.org/10.1016/j.xops.2022.100222