
A Virtual Reading Center Model Using Crowdsourcing to Grade Photographs for Trachoma: Validation Study

Bibliographic Details
Main Authors: Brady, Christopher J, Cockrell, R Chase, Aldrich, Lindsay R, Wolle, Meraf A, West, Sheila K
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132003/
https://www.ncbi.nlm.nih.gov/pubmed/37023420
http://dx.doi.org/10.2196/41233
author Brady, Christopher J
Cockrell, R Chase
Aldrich, Lindsay R
Wolle, Meraf A
West, Sheila K
author_sort Brady, Christopher J
collection PubMed
description BACKGROUND: As trachoma is eliminated, skilled field graders become less adept at correctly identifying active disease (trachomatous inflammation—follicular [TF]). Deciding if trachoma has been eliminated from a district or if treatment strategies need to be continued or reinstated is of critical public health importance. Telemedicine solutions require both connectivity, which can be poor in the resource-limited regions of the world in which trachoma occurs, and accurate grading of the images.
OBJECTIVE: Our purpose was to develop and validate a cloud-based “virtual reading center” (VRC) model using crowdsourcing for image interpretation.
METHODS: The Amazon Mechanical Turk (AMT) platform was used to recruit lay graders to interpret 2299 gradable images from a prior field trial of a smartphone-based camera system. Each image received 7 grades for US $0.05 per grade in this VRC. The resultant data set was divided into training and test sets to internally validate the VRC. In the training set, crowdsourcing scores were summed, and the optimal raw score cutoff was chosen to optimize kappa agreement and the resulting prevalence of TF. The best method was then applied to the test set, and the sensitivity, specificity, kappa, and TF prevalence were calculated.
RESULTS: In this trial, over 16,000 grades were rendered in just over 60 minutes for US $1098 including AMT fees. After choosing an AMT raw score cut point to optimize kappa near the World Health Organization (WHO)–endorsed level of 0.7 (with a simulated 40% prevalence of TF), crowdsourcing was 95% sensitive and 87% specific for TF in the training set, with a kappa of 0.797. All 196 crowdsourced-positive images received a skilled overread to mimic a tiered reading center; specificity improved to 99%, while sensitivity remained above 78%. Kappa for the entire sample improved from 0.162 to 0.685 with overreads, and the skilled grader burden was reduced by over 80%. This tiered VRC model was then applied to the test set and produced a sensitivity of 99% and a specificity of 76%, with a kappa of 0.775 in the entire set. The prevalence estimated by the VRC was 2.70% (95% CI 1.84%-3.80%) compared to the ground truth prevalence of 2.87% (95% CI 1.98%-4.01%).
CONCLUSIONS: A VRC model using crowdsourcing as a first pass, with skilled grading of positive images, was able to identify TF rapidly and accurately in a low prevalence setting. The findings from this study support further validation of a VRC and crowdsourcing for image grading and estimation of trachoma prevalence from field-acquired images, although further prospective field testing is required to determine if diagnostic characteristics are acceptable in real-world surveys with a low prevalence of the disease.
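The METHODS section above describes the core scoring rule: each image's seven binary lay grades are summed into a raw score (0-7), and a cutoff on that score is chosen to maximize Cohen's kappa against expert labels. The following is a minimal Python sketch of that cutoff search; the function names, the toy scores, and the expert labels are all hypothetical illustrations, not the authors' actual code or data.

```python
# Illustrative sketch (not the study's code): each image gets 7 binary
# lay grades; its raw score is their sum (0-7). We sweep every cutoff
# "positive if raw score >= c" and keep the one that maximizes Cohen's
# kappa against expert (ground truth) labels.

def cohens_kappa(truth, pred):
    """Cohen's kappa for two binary label lists of equal length."""
    n = len(truth)
    tp = sum(t and p for t, p in zip(truth, pred))
    tn = sum((not t) and (not p) for t, p in zip(truth, pred))
    po = (tp + tn) / n                                      # observed agreement
    pe = (sum(truth) / n) * (sum(pred) / n) \
         + (1 - sum(truth) / n) * (1 - sum(pred) / n)       # chance agreement
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

def best_cutoff(raw_scores, truth, n_graders=7):
    """Return (cutoff, kappa) maximizing kappa over all integer cutoffs."""
    best = None
    for c in range(1, n_graders + 1):
        pred = [s >= c for s in raw_scores]
        k = cohens_kappa(truth, pred)
        if best is None or k > best[1]:
            best = (c, k)
    return best

# Hypothetical toy data: raw crowd scores and expert TF labels per image.
scores = [7, 6, 1, 0, 5, 2, 0, 7, 3, 0]
truth = [True, True, False, False, True, False, False, True, False, False]
cutoff, kappa = best_cutoff(scores, truth)
```

In the study's tiered design, images called positive at the chosen cutoff would then be forwarded for a skilled overread, which is what drove specificity toward 99% while cutting the skilled grader burden.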
format Online Article Text
id pubmed-10132003
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-101320032023-04-27 A Virtual Reading Center Model Using Crowdsourcing to Grade Photographs for Trachoma: Validation Study Brady, Christopher J Cockrell, R Chase Aldrich, Lindsay R Wolle, Meraf A West, Sheila K J Med Internet Res Original Paper JMIR Publications 2023-04-06 /pmc/articles/PMC10132003/ /pubmed/37023420 http://dx.doi.org/10.2196/41233 Text en ©Christopher J Brady, R Chase Cockrell, Lindsay R Aldrich, Meraf A Wolle, Sheila K West. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 06.04.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title A Virtual Reading Center Model Using Crowdsourcing to Grade Photographs for Trachoma: Validation Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132003/
https://www.ncbi.nlm.nih.gov/pubmed/37023420
http://dx.doi.org/10.2196/41233