
Using demographics toward efficient data classification in citizen science: a Bayesian approach

Public participation in scientific activities, often called citizen science, makes it possible to collect and analyze unprecedentedly large amounts of data. However, the diversity of volunteers makes it difficult to obtain accurate information when these data are aggregated. To overcome this problem...


Bibliographic Details
Main Authors: De Lellis, Pietro; Nakayama, Shinnosuke; Porfiri, Maurizio
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2019
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924415/
https://www.ncbi.nlm.nih.gov/pubmed/33816892
http://dx.doi.org/10.7717/peerj-cs.239
collection PubMed
description Public participation in scientific activities, often called citizen science, makes it possible to collect and analyze unprecedentedly large amounts of data. However, the diversity of volunteers makes it difficult to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses the diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is assigned to a distinct class based on a survey regarding either their level of education or their motivation for participating in citizen science. We obtained the behavior of each class from a training set, which was then used as prior information to estimate the performance of new volunteers. By applying this approach to an existing citizen science dataset in which images are classified into categories, we demonstrate improved data accuracy compared with traditional majority voting. Our algorithm offers a simple yet powerful way to improve data accuracy under limited volunteer effort by predicting the behavior of a class of individuals, rather than attempting a granular description of each of them.
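The idea summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the class names ("high", "low"), the category labels ("A", "B"), the confusion probabilities, and the function names are all hypothetical values chosen for illustration. The sketch assumes each volunteer class has a table of P(reported label | true category) estimated from a training set, and that volunteers label independently given the true category.

```python
from collections import Counter

def majority_vote(labels):
    """Baseline: return the most frequently reported label."""
    return Counter(labels).most_common(1)[0][0]

def bayes_classify(votes, confusion, prior):
    """Combine votes using class-level behavior as prior knowledge.

    votes:     list of (volunteer_class, reported_label) pairs
    confusion: confusion[volunteer_class][true_category][reported_label]
               = P(reported_label | true_category, volunteer_class)
    prior:     prior[true_category] = P(true_category)
    Returns (most probable category, normalized posterior dict).
    """
    posterior = dict(prior)
    for volunteer_class, label in votes:
        for category in posterior:
            posterior[category] *= confusion[volunteer_class][category][label]
    total = sum(posterior.values())
    posterior = {c: p / total for c, p in posterior.items()}
    return max(posterior, key=posterior.get), posterior

# Hypothetical class behavior: "high" volunteers are accurate, "low" ones barely
# better than chance. These numbers are invented, not taken from the article.
confusion = {
    "high": {"A": {"A": 0.90, "B": 0.10}, "B": {"A": 0.10, "B": 0.90}},
    "low":  {"A": {"A": 0.55, "B": 0.45}, "B": {"A": 0.45, "B": 0.55}},
}
prior = {"A": 0.5, "B": 0.5}

# One accurate volunteer says "A"; two near-random volunteers say "B".
votes = [("high", "A"), ("low", "B"), ("low", "B")]

mv_result = majority_vote([label for _, label in votes])
bayes_result, posterior = bayes_classify(votes, confusion, prior)
print(mv_result, bayes_result, round(posterior["A"], 3))
```

In this made-up case, majority voting follows the two near-random volunteers, while the Bayesian combination gives more weight to the single vote from the more reliable class.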
format Online
Article
Text
id pubmed-7924415
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7924415 2021-04-02 PeerJ Comput Sci PeerJ Inc. 2019-11-25 Text en
© 2019 De Lellis et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
topic Algorithms and Analysis of Algorithms