How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications

Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it importan...

Bibliographic Details
Main authors: Salk, Carl, Moltchanova, Elena, See, Linda, Sturn, Tobias, McCallum, Ian, Fritz, Steffen
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9119495/
https://www.ncbi.nlm.nih.gov/pubmed/35587481
http://dx.doi.org/10.1371/journal.pone.0267114
author Salk, Carl
Moltchanova, Elena
See, Linda
Sturn, Tobias
McCallum, Ian
Fritz, Steffen
collection PubMed
description Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it important for researchers to use volunteer contributions as efficiently as possible. Using volunteer labor efficiently becomes complicated when individual tasks are assigned to multiple volunteers to increase confidence that the correct classification has been reached. In this paper, we develop a system to decide when enough information has been accumulated to confidently declare an image to be classified and remove it from circulation. We use a Bayesian approach to estimate the posterior distribution of the mean rating in a binary image classification task. Tasks are removed from circulation when user-defined certainty thresholds are reached. We demonstrate this process using a set of over 4.5 million unique classifications by 2783 volunteers of over 190,000 images assessed for the presence/absence of cropland. If the system outlined here had been implemented in the original data collection campaign, it would have eliminated the need for 59.4% of volunteer ratings. Had this effort been applied to new tasks, it would have allowed an estimated 2.46 times as many images to have been classified with the same amount of labor, demonstrating the power of this method to make more efficient use of limited volunteer contributions. To simplify implementation of this method by other investigators, we provide cutoff value combinations for one set of confidence levels.
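The stopping rule described above can be illustrated with a minimal sketch. This is not the authors' model: it assumes a uniform Beta(1, 1) prior on the probability that a volunteer rates an image as cropland, a single symmetric certainty threshold of 0.95, and hypothetical function names. It also checks that eliminating 59.4% of ratings is consistent with the reported 2.46-fold gain, since 1 / (1 - 0.594) is about 2.46.

```python
from math import comb


def prob_majority_positive(yes: int, total: int) -> float:
    """Posterior P(theta > 0.5) after `yes` positive ratings out of `total`.

    Assumes a uniform Beta(1, 1) prior (an illustrative choice, not taken
    from the paper). The posterior is Beta(yes + 1, total - yes + 1); for
    integer parameters its tail above 0.5 equals a binomial sum:
    P(theta > 0.5) = P(Binomial(total + 1, 0.5) <= yes).
    """
    n = total + 1
    return sum(comb(n, j) for j in range(yes + 1)) / 2 ** n


def decide(yes: int, total: int, threshold: float = 0.95) -> str:
    """Retire an image once either class reaches the certainty threshold."""
    p = prob_majority_positive(yes, total)
    if p >= threshold:
        return "cropland"
    if 1 - p >= threshold:
        return "not cropland"
    return "keep circulating"


def labor_multiplier(saved_fraction: float) -> float:
    """How many times more images the same labor covers if a fraction is saved."""
    return 1 / (1 - saved_fraction)


if __name__ == "__main__":
    for yes, total in [(3, 3), (4, 4), (2, 4), (0, 5)]:
        print(yes, total, decide(yes, total))
    print(round(labor_multiplier(0.594), 2))
```

Under these assumptions, three unanimous ratings are not quite enough to retire an image at the 0.95 level, while four are. The paper's own cutoff values should be used in practice.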
format Online
Article
Text
id pubmed-9119495
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9119495 2022-05-20 PLoS One Research Article
Public Library of Science 2022-05-19 /pmc/articles/PMC9119495/ /pubmed/35587481 http://dx.doi.org/10.1371/journal.pone.0267114 Text en © 2022 Salk et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications
topic Research Article