Scaling drug indication curation through crowdsourcing
Motivated by the high cost of human curation of biological databases, there is an increasing interest in using computational approaches to assist human curators and accelerate the manual curation process. Towards the goal of cataloging drug indications from FDA drug labels, we recently developed Lab...
Main Authors: Khare, Ritu; Burger, John D.; Aberdeen, John S.; Tresner-Kirsch, David W.; Corrales, Theodore J.; Hirschman, Lynette; Lu, Zhiyong
Format: Online Article Text
Language: English
Published: Oxford University Press, 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4369375/ https://www.ncbi.nlm.nih.gov/pubmed/25797061 http://dx.doi.org/10.1093/database/bav016
_version_ | 1782362761198567424 |
author | Khare, Ritu Burger, John D. Aberdeen, John S. Tresner-Kirsch, David W. Corrales, Theodore J. Hirschman, Lynette Lu, Zhiyong |
author_facet | Khare, Ritu Burger, John D. Aberdeen, John S. Tresner-Kirsch, David W. Corrales, Theodore J. Hirschman, Lynette Lu, Zhiyong |
author_sort | Khare, Ritu |
collection | PubMed |
description | Motivated by the high cost of human curation of biological databases, there is an increasing interest in using computational approaches to assist human curators and accelerate the manual curation process. Towards the goal of cataloging drug indications from FDA drug labels, we recently developed LabeledIn, a human-curated drug indication resource for 250 clinical drugs. Its development required over 40 h of human effort across 20 weeks, despite using well-defined annotation guidelines. In this study, we aim to investigate the feasibility of scaling drug indication annotation through a crowdsourcing technique where an unknown network of workers can be recruited through the technical environment of Amazon Mechanical Turk (MTurk). To translate the expert-curation task of cataloging indications into human intelligence tasks (HITs) suitable for the average workers on MTurk, we first simplify the complex task such that each HIT only involves a worker making a binary judgment of whether a highlighted disease, in context of a given drug label, is an indication. In addition, this study is novel in the crowdsourcing interface design where the annotation guidelines are encoded into user options. For evaluation, we assess the ability of our proposed method to achieve high-quality annotations in a time-efficient and cost-effective manner. We posted over 3000 HITs drawn from 706 drug labels on MTurk. Within 8 h of posting, we collected 18 775 judgments from 74 workers, and achieved an aggregated accuracy of 96% on 450 control HITs (where gold-standard answers are known), at a cost of $1.75 per drug label. On the basis of these results, we conclude that our crowdsourcing approach not only results in significant cost and time saving, but also leads to accuracy comparable to that of domain experts. Database URL: ftp://ftp.ncbi.nlm.nih.gov/pub/lu/LabeledIn/Crowdsourcing/. |
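The abstract above describes collecting many binary worker judgments per HIT, aggregating them, and scoring the aggregate against 450 control HITs with known gold-standard answers. The record gives no code and does not specify the aggregation rule, so the following is only a minimal illustrative sketch, assuming a simple majority vote and hypothetical names (aggregate_majority, control_accuracy, hit_id, worker_id, is_indication); it is not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): majority-vote aggregation of
# binary MTurk judgments per HIT, plus accuracy on control HITs with known answers.
# The majority rule and all names here are assumptions for illustration only.
from collections import Counter, defaultdict


def aggregate_majority(judgments):
    """judgments: iterable of (hit_id, worker_id, is_indication) -> {hit_id: bool}.

    Ties go to whichever label was seen first for that HIT.
    """
    votes = defaultdict(Counter)
    for hit_id, _worker_id, is_indication in judgments:
        votes[hit_id][is_indication] += 1
    return {hit_id: counts.most_common(1)[0][0] for hit_id, counts in votes.items()}


def control_accuracy(aggregated, gold):
    """Fraction of control HITs whose aggregated label matches the gold answer."""
    matched = sum(1 for hit_id, answer in gold.items() if aggregated.get(hit_id) == answer)
    return matched / len(gold)


if __name__ == "__main__":
    raw = [
        ("hit1", "w1", True), ("hit1", "w2", True), ("hit1", "w3", False),
        ("hit2", "w1", False), ("hit2", "w2", False), ("hit2", "w3", False),
    ]
    agg = aggregate_majority(raw)
    print(agg)                                                    # {'hit1': True, 'hit2': False}
    print(control_accuracy(agg, {"hit1": True, "hit2": False}))   # 1.0
```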
format | Online Article Text |
id | pubmed-4369375 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-43693752015-04-17 Scaling drug indication curation through crowdsourcing Khare, Ritu Burger, John D. Aberdeen, John S. Tresner-Kirsch, David W. Corrales, Theodore J. Hirschman, Lynette Lu, Zhiyong Database (Oxford) Original Article Motivated by the high cost of human curation of biological databases, there is an increasing interest in using computational approaches to assist human curators and accelerate the manual curation process. Towards the goal of cataloging drug indications from FDA drug labels, we recently developed LabeledIn, a human-curated drug indication resource for 250 clinical drugs. Its development required over 40 h of human effort across 20 weeks, despite using well-defined annotation guidelines. In this study, we aim to investigate the feasibility of scaling drug indication annotation through a crowdsourcing technique where an unknown network of workers can be recruited through the technical environment of Amazon Mechanical Turk (MTurk). To translate the expert-curation task of cataloging indications into human intelligence tasks (HITs) suitable for the average workers on MTurk, we first simplify the complex task such that each HIT only involves a worker making a binary judgment of whether a highlighted disease, in context of a given drug label, is an indication. In addition, this study is novel in the crowdsourcing interface design where the annotation guidelines are encoded into user options. For evaluation, we assess the ability of our proposed method to achieve high-quality annotations in a time-efficient and cost-effective manner. We posted over 3000 HITs drawn from 706 drug labels on MTurk. Within 8 h of posting, we collected 18 775 judgments from 74 workers, and achieved an aggregated accuracy of 96% on 450 control HITs (where gold-standard answers are known), at a cost of $1.75 per drug label. On the basis of these results, we conclude that our crowdsourcing approach not only results in significant cost and time saving, but also leads to accuracy comparable to that of domain experts. Database URL: ftp://ftp.ncbi.nlm.nih.gov/pub/lu/LabeledIn/Crowdsourcing/. Oxford University Press 2015-03-22 /pmc/articles/PMC4369375/ /pubmed/25797061 http://dx.doi.org/10.1093/database/bav016 Text en Published by Oxford University Press 2015. This work is written by US Government employees and is in the public domain in the US. |
spellingShingle | Original Article Khare, Ritu Burger, John D. Aberdeen, John S. Tresner-Kirsch, David W. Corrales, Theodore J. Hirschman, Lynette Lu, Zhiyong Scaling drug indication curation through crowdsourcing |
title | Scaling drug indication curation through crowdsourcing |
title_full | Scaling drug indication curation through crowdsourcing |
title_fullStr | Scaling drug indication curation through crowdsourcing |
title_full_unstemmed | Scaling drug indication curation through crowdsourcing |
title_short | Scaling drug indication curation through crowdsourcing |
title_sort | scaling drug indication curation through crowdsourcing |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4369375/ https://www.ncbi.nlm.nih.gov/pubmed/25797061 http://dx.doi.org/10.1093/database/bav016 |
work_keys_str_mv | AT khareritu scalingdrugindicationcurationthroughcrowdsourcing AT burgerjohnd scalingdrugindicationcurationthroughcrowdsourcing AT aberdeenjohns scalingdrugindicationcurationthroughcrowdsourcing AT tresnerkirschdavidw scalingdrugindicationcurationthroughcrowdsourcing AT corralestheodorej scalingdrugindicationcurationthroughcrowdsourcing AT hirchmanlynette scalingdrugindicationcurationthroughcrowdsourcing AT luzhiyong scalingdrugindicationcurationthroughcrowdsourcing |