How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications
Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it importan...
Main Authors: | Salk, Carl; Moltchanova, Elena; See, Linda; Sturn, Tobias; McCallum, Ian; Fritz, Steffen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9119495/ https://www.ncbi.nlm.nih.gov/pubmed/35587481 http://dx.doi.org/10.1371/journal.pone.0267114 |
_version_ | 1784710713836568576 |
---|---|
author | Salk, Carl Moltchanova, Elena See, Linda Sturn, Tobias McCallum, Ian Fritz, Steffen |
author_facet | Salk, Carl Moltchanova, Elena See, Linda Sturn, Tobias McCallum, Ian Fritz, Steffen |
author_sort | Salk, Carl |
collection | PubMed |
description | Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it important for researchers to use volunteer contributions as efficiently as possible. Using volunteer labor efficiently becomes complicated when individual tasks are assigned to multiple volunteers to increase confidence that the correct classification has been reached. In this paper, we develop a system to decide when enough information has been accumulated to confidently declare an image to be classified and remove it from circulation. We use a Bayesian approach to estimate the posterior distribution of the mean rating in a binary image classification task. Tasks are removed from circulation when user-defined certainty thresholds are reached. We demonstrate this process using a set of over 4.5 million unique classifications by 2783 volunteers of over 190,000 images assessed for the presence/absence of cropland. If the system outlined here had been implemented in the original data collection campaign, it would have eliminated the need for 59.4% of volunteer ratings. Had this effort been applied to new tasks, it would have allowed an estimated 2.46 times as many images to have been classified with the same amount of labor, demonstrating the power of this method to make more efficient use of limited volunteer contributions. To simplify implementation of this method by other investigators, we provide cutoff value combinations for one set of confidence levels. |
format | Online Article Text |
id | pubmed-9119495 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9119495 2022-05-20 How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications Salk, Carl Moltchanova, Elena See, Linda Sturn, Tobias McCallum, Ian Fritz, Steffen PLoS One Research Article Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it important for researchers to use volunteer contributions as efficiently as possible. Using volunteer labor efficiently becomes complicated when individual tasks are assigned to multiple volunteers to increase confidence that the correct classification has been reached. In this paper, we develop a system to decide when enough information has been accumulated to confidently declare an image to be classified and remove it from circulation. We use a Bayesian approach to estimate the posterior distribution of the mean rating in a binary image classification task. Tasks are removed from circulation when user-defined certainty thresholds are reached. We demonstrate this process using a set of over 4.5 million unique classifications by 2783 volunteers of over 190,000 images assessed for the presence/absence of cropland. If the system outlined here had been implemented in the original data collection campaign, it would have eliminated the need for 59.4% of volunteer ratings. Had this effort been applied to new tasks, it would have allowed an estimated 2.46 times as many images to have been classified with the same amount of labor, demonstrating the power of this method to make more efficient use of limited volunteer contributions. To simplify implementation of this method by other investigators, we provide cutoff value combinations for one set of confidence levels. 
Public Library of Science 2022-05-19 /pmc/articles/PMC9119495/ /pubmed/35587481 http://dx.doi.org/10.1371/journal.pone.0267114 Text en © 2022 Salk et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Salk, Carl Moltchanova, Elena See, Linda Sturn, Tobias McCallum, Ian Fritz, Steffen How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title | How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title_full | How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title_fullStr | How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title_full_unstemmed | How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title_short | How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications |
title_sort | how many people need to classify the same image? a method for optimizing volunteer contributions in binary geographical classifications |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9119495/ https://www.ncbi.nlm.nih.gov/pubmed/35587481 http://dx.doi.org/10.1371/journal.pone.0267114 |
work_keys_str_mv | AT salkcarl howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications AT moltchanovaelena howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications AT seelinda howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications AT sturntobias howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications AT mccallumian howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications AT fritzsteffen howmanypeopleneedtoclassifythesameimageamethodforoptimizingvolunteercontributionsinbinarygeographicalclassifications |
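
The description field above outlines a Bayesian stopping rule: estimate the posterior distribution of the mean rating for a binary task and retire the image once a user-defined certainty threshold is reached. The following is only a minimal sketch of that general idea, not the authors' published model. It assumes a uniform Beta(1, 1) prior on the probability p that a volunteer rates the image as cropland, and it retires an image when the posterior probability that p exceeds 0.5 is at least `threshold` (or at most `1 - threshold`). The function names, the 0.5 cut point, and the 0.95 default are all hypothetical choices made for illustration.

```python
from math import comb


def prob_majority_positive(k: int, n: int) -> float:
    """Posterior P(p > 0.5) after k positive ratings out of n.

    Assumes a uniform Beta(1, 1) prior (an illustrative assumption), so the
    posterior is Beta(k + 1, n - k + 1). For integer parameters the Beta
    tail probability equals a binomial sum:
    P(p > 0.5) = P(Binomial(n + 1, 0.5) <= k).
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    m = n + 1
    return sum(comb(m, j) for j in range(k + 1)) / 2 ** m


def decide(k: int, n: int, threshold: float = 0.95) -> str:
    """Return a label if the certainty threshold is met, else keep the task."""
    p_pos = prob_majority_positive(k, n)
    if p_pos >= threshold:
        return "positive"
    if p_pos <= 1 - threshold:
        return "negative"
    return "keep circulating"
```

Under these assumptions, `decide(4, 4)` returns `"positive"` (posterior probability 31/32), while `decide(3, 3)` keeps the image in circulation (15/16 falls short of 0.95). As a rough consistency check on the figures in the description, eliminating 59.4% of ratings leaves 40.6% of the labor per image, and 1 / 0.406 ≈ 2.46, which agrees with the reported 2.46-fold estimate.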