
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images


Bibliographic Details
Main Authors: Mitry, Danny, Zutis, Kris, Dhillon, Baljean, Peto, Tunde, Hayat, Shabina, Khaw, Kay-Tee, Morgan, James E., Moncur, Wendy, Trucco, Emanuele, Foster, Paul J.
Format: Online Article Text
Language: English
Published: The Association for Research in Vision and Ophthalmology 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5032847/
https://www.ncbi.nlm.nih.gov/pubmed/27668130
http://dx.doi.org/10.1167/tvst.5.5.6
_version_ 1782455074215165952
author Mitry, Danny
Zutis, Kris
Dhillon, Baljean
Peto, Tunde
Hayat, Shabina
Khaw, Kay-Tee
Morgan, James E.
Moncur, Wendy
Trucco, Emanuele
Foster, Paul J.
author_facet Mitry, Danny
Zutis, Kris
Dhillon, Baljean
Peto, Tunde
Hayat, Shabina
Khaw, Kay-Tee
Morgan, James E.
Moncur, Wendy
Trucco, Emanuele
Foster, Paul J.
author_sort Mitry, Danny
collection PubMed
description PURPOSE: Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. METHODS: We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so that anonymous workers could perform a classification and annotation task on the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only, and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. RESULTS: In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. Across all classifications, specificity was 71% (95% confidence interval [CI], 69%–74%) and sensitivity was 87% (95% CI, 86%–88%). The AUC for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. CONCLUSIONS: This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation were achieved in the nonmasters with compulsory training group. TRANSLATIONAL RELEVANCE: Crowdsourcing as a technique for retinal image analysis may be comparable to expert grading and has the potential to deliver timely, accurate, and cost-effective image analysis.
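The abstract above reports two kinds of agreement with expert grading: a Dice coefficient for pixel-level annotations at a given consensus threshold, and sensitivity/specificity/AUC for image-level classifications. The following is a minimal illustrative sketch of how such measures can be computed; it is not the study's code, and the array names, example data, and use of NumPy/scikit-learn are assumptions.

# Illustrative sketch (not the authors' pipeline): Dice coefficient of a
# crowd-consensus annotation mask against an expert mask, and AUC of crowd
# image-level scores against expert labels. All inputs are hypothetical.
import numpy as np
from sklearn.metrics import roc_auc_score

def consensus_mask(worker_masks, threshold=0.25):
    """Pixels marked abnormal by at least `threshold` of workers form the consensus."""
    vote_fraction = np.mean(worker_masks, axis=0)  # per-pixel fraction of workers marking it
    return vote_fraction >= threshold

def dice(mask_a, mask_b):
    """Dice coefficient: 2*|A intersect B| / (|A| + |B|)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    total = mask_a.sum() + mask_b.sum()
    return 2.0 * intersection / total if total else 1.0

# Hypothetical data: 5 worker annotation masks and one expert mask for a 64x64 image.
rng = np.random.default_rng(0)
worker_masks = rng.random((5, 64, 64)) < 0.1
expert_mask = rng.random((64, 64)) < 0.1
print("Dice vs expert:", dice(consensus_mask(worker_masks, 0.25), expert_mask))

# Hypothetical image-level grades: expert label (0 = normal, 1 = abnormal) and
# the fraction of crowd workers who classified each image as abnormal.
expert_labels = np.array([0, 1, 1, 0, 1, 0, 1, 0])
crowd_scores = np.array([0.2, 0.9, 0.7, 0.4, 0.8, 0.1, 0.6, 0.3])
print("AUC vs expert:", roc_auc_score(expert_labels, crowd_scores))

Sweeping the consensus threshold (as the study does, finding a maximum Dice of roughly 0.6 at 0.25) would simply mean calling consensus_mask with a range of threshold values and recording the resulting Dice coefficients.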
format Online
Article
Text
id pubmed-5032847
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher The Association for Research in Vision and Ophthalmology
record_format MEDLINE/PubMed
spelling pubmed-5032847 2016-09-23 The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images Mitry, Danny Zutis, Kris Dhillon, Baljean Peto, Tunde Hayat, Shabina Khaw, Kay-Tee Morgan, James E. Moncur, Wendy Trucco, Emanuele Foster, Paul J. Transl Vis Sci Technol Articles PURPOSE: Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. METHODS: We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so that anonymous workers could perform a classification and annotation task on the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only, and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. RESULTS: In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. Across all classifications, specificity was 71% (95% confidence interval [CI], 69%–74%) and sensitivity was 87% (95% CI, 86%–88%). The AUC for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. CONCLUSIONS: This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation were achieved in the nonmasters with compulsory training group. TRANSLATIONAL RELEVANCE: Crowdsourcing as a technique for retinal image analysis may be comparable to expert grading and has the potential to deliver timely, accurate, and cost-effective image analysis. The Association for Research in Vision and Ophthalmology 2016-09-21 /pmc/articles/PMC5032847/ /pubmed/27668130 http://dx.doi.org/10.1167/tvst.5.5.6 Text en http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
spellingShingle Articles
Mitry, Danny
Zutis, Kris
Dhillon, Baljean
Peto, Tunde
Hayat, Shabina
Khaw, Kay-Tee
Morgan, James E.
Moncur, Wendy
Trucco, Emanuele
Foster, Paul J.
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title_full The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title_fullStr The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title_full_unstemmed The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title_short The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
title_sort accuracy and reliability of crowdsource annotations of digital retinal images
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5032847/
https://www.ncbi.nlm.nih.gov/pubmed/27668130
http://dx.doi.org/10.1167/tvst.5.5.6
work_keys_str_mv AT mitrydanny theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT zutiskris theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT dhillonbaljean theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT petotunde theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT hayatshabina theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT khawkaytee theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT morganjamese theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT moncurwendy theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT truccoemanuele theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT fosterpaulj theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT theaccuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT mitrydanny accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT zutiskris accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT dhillonbaljean accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT petotunde accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT hayatshabina accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT khawkaytee accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT morganjamese accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT moncurwendy accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT truccoemanuele accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT fosterpaulj accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages
AT accuracyandreliabilityofcrowdsourceannotationsofdigitalretinalimages