Cargando…

Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs

We have attempted to reproduce the results in Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, published in JAMA 2016; 316(22), using publicly available data sets. We re-implemented the main method in the original study sinc...

Descripción completa

Detalles Bibliográficos
Autores principales:	Voets, Mike, Møllersen, Kajsa, Bongo, Lars Ailo
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Public Library of Science 2019
Materias:	Research Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6553744/ https://www.ncbi.nlm.nih.gov/pubmed/31170223 http://dx.doi.org/10.1371/journal.pone.0217541

_version_	1783424867792584704
author	Voets, Mike Møllersen, Kajsa Bongo, Lars Ailo
author_facet	Voets, Mike Møllersen, Kajsa Bongo, Lars Ailo
author_sort	Voets, Mike
collection	PubMed
description	We have attempted to reproduce the results in Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, published in JAMA 2016; 316(22), using publicly available data sets. We re-implemented the main method in the original study since the source code is not available. The original study used non-public fundus images from EyePACS and three hospitals in India for training. We used a different EyePACS data set from Kaggle. The original study used the benchmark data set Messidor-2 to evaluate the algorithm’s performance. We used another distribution of the Messidor-2 data set, since the original data set is no longer available. In the original study, ophthalmologists re-graded all images for diabetic retinopathy, macular edema, and image gradability. We have one diabetic retinopathy grade per image for our data sets, and we assessed image gradability ourselves. We were not able to reproduce the original study’s results with publicly available data. Our algorithm’s area under the receiver operating characteristic curve (AUC) of 0.951 (95% CI, 0.947-0.956) on the Kaggle EyePACS test set and 0.853 (95% CI, 0.835-0.871) on Messidor-2 did not come close to the reported AUC of 0.99 on both test sets in the original study. This may be caused by the use of a single grade per image, or different data. This study shows the challenges of reproducing deep learning method results, and the need for more replication and reproduction studies to validate deep learning methods, especially for medical image analysis. Our source code and instructions are available at: https://github.com/mikevoets/jama16-retina-replication.
format	Online Article Text
id	pubmed-6553744
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-65537442019-06-17 Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs Voets, Mike Møllersen, Kajsa Bongo, Lars Ailo PLoS One Research Article We have attempted to reproduce the results in Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, published in JAMA 2016; 316(22), using publicly available data sets. We re-implemented the main method in the original study since the source code is not available. The original study used non-public fundus images from EyePACS and three hospitals in India for training. We used a different EyePACS data set from Kaggle. The original study used the benchmark data set Messidor-2 to evaluate the algorithm’s performance. We used another distribution of the Messidor-2 data set, since the original data set is no longer available. In the original study, ophthalmologists re-graded all images for diabetic retinopathy, macular edema, and image gradability. We have one diabetic retinopathy grade per image for our data sets, and we assessed image gradability ourselves. We were not able to reproduce the original study’s results with publicly available data. Our algorithm’s area under the receiver operating characteristic curve (AUC) of 0.951 (95% CI, 0.947-0.956) on the Kaggle EyePACS test set and 0.853 (95% CI, 0.835-0.871) on Messidor-2 did not come close to the reported AUC of 0.99 on both test sets in the original study. This may be caused by the use of a single grade per image, or different data. This study shows the challenges of reproducing deep learning method results, and the need for more replication and reproduction studies to validate deep learning methods, especially for medical image analysis. Our source code and instructions are available at: https://github.com/mikevoets/jama16-retina-replication. Public Library of Science 2019-06-06 /pmc/articles/PMC6553744/ /pubmed/31170223 http://dx.doi.org/10.1371/journal.pone.0217541 Text en © 2019 Voets et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Voets, Mike Møllersen, Kajsa Bongo, Lars Ailo Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title	Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title_full	Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title_fullStr	Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title_full_unstemmed	Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title_short	Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
title_sort	reproduction study using public data of: development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6553744/ https://www.ncbi.nlm.nih.gov/pubmed/31170223 http://dx.doi.org/10.1371/journal.pone.0217541
work_keys_str_mv	AT voetsmike reproductionstudyusingpublicdataofdevelopmentandvalidationofadeeplearningalgorithmfordetectionofdiabeticretinopathyinretinalfundusphotographs AT møllersenkajsa reproductionstudyusingpublicdataofdevelopmentandvalidationofadeeplearningalgorithmfordetectionofdiabeticretinopathyinretinalfundusphotographs AT bongolarsailo reproductionstudyusingpublicdataofdevelopmentandvalidationofadeeplearningalgorithmfordetectionofdiabeticretinopathyinretinalfundusphotographs

Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs

Ejemplares similares