Using demographics toward efficient data classification in citizen science: a Bayesian approach
Public participation in scientific activities, often called citizen science, offers a possibility to collect and analyze an unprecedentedly large amount of data. However, diversity of volunteers poses a challenge to obtain accurate information when these data are aggregated. To overcome this problem...
Main Authors: | De Lellis, Pietro; Nakayama, Shinnosuke; Porfiri, Maurizio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2019 |
Subjects: | Algorithms and Analysis of Algorithms |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924415/ https://www.ncbi.nlm.nih.gov/pubmed/33816892 http://dx.doi.org/10.7717/peerj-cs.239 |
_version_ | 1783659084457705472 |
---|---|
author | De Lellis, Pietro Nakayama, Shinnosuke Porfiri, Maurizio |
author_facet | De Lellis, Pietro Nakayama, Shinnosuke Porfiri, Maurizio |
author_sort | De Lellis, Pietro |
collection | PubMed |
description | Public participation in scientific activities, often called citizen science, offers a possibility to collect and analyze an unprecedentedly large amount of data. However, diversity of volunteers poses a challenge to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is grouped into a distinct class based on a survey regarding either their level of education or motivation to citizen science. We obtained the behavior of each class through a training set, which was then used as a prior information to estimate performance of new volunteers. By applying this approach to an existing citizen science dataset to classify images into categories, we demonstrate improvement in data accuracy, compared to the traditional majority voting. Our algorithm offers a simple, yet powerful, way to improve data accuracy under limited effort of volunteers by predicting the behavior of a class of individuals, rather than attempting at a granular description of each of them. |
format | Online Article Text |
id | pubmed-7924415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7924415 2021-04-02 Using demographics toward efficient data classification in citizen science: a Bayesian approach De Lellis, Pietro Nakayama, Shinnosuke Porfiri, Maurizio PeerJ Comput Sci Algorithms and Analysis of Algorithms Public participation in scientific activities, often called citizen science, offers a possibility to collect and analyze an unprecedentedly large amount of data. However, diversity of volunteers poses a challenge to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is grouped into a distinct class based on a survey regarding either their level of education or motivation to citizen science. We obtained the behavior of each class through a training set, which was then used as a prior information to estimate performance of new volunteers. By applying this approach to an existing citizen science dataset to classify images into categories, we demonstrate improvement in data accuracy, compared to the traditional majority voting. Our algorithm offers a simple, yet powerful, way to improve data accuracy under limited effort of volunteers by predicting the behavior of a class of individuals, rather than attempting at a granular description of each of them. PeerJ Inc. 2019-11-25 /pmc/articles/PMC7924415/ /pubmed/33816892 http://dx.doi.org/10.7717/peerj-cs.239 Text en © 2019 De Lellis et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Algorithms and Analysis of Algorithms De Lellis, Pietro Nakayama, Shinnosuke Porfiri, Maurizio Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title | Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title_full | Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title_fullStr | Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title_full_unstemmed | Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title_short | Using demographics toward efficient data classification in citizen science: a Bayesian approach |
title_sort | using demographics toward efficient data classification in citizen science: a bayesian approach |
topic | Algorithms and Analysis of Algorithms |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924415/ https://www.ncbi.nlm.nih.gov/pubmed/33816892 http://dx.doi.org/10.7717/peerj-cs.239 |
work_keys_str_mv | AT delellispietro usingdemographicstowardefficientdataclassificationincitizenscienceabayesianapproach AT nakayamashinnosuke usingdemographicstowardefficientdataclassificationincitizenscienceabayesianapproach AT porfirimaurizio usingdemographicstowardefficientdataclassificationincitizenscienceabayesianapproach |
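
The description above contrasts Bayesian aggregation informed by volunteer classes with plain majority voting. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each volunteer belongs to a group whose labeling behavior has already been estimated from a training set, and it combines votes with a naive Bayes rule (votes treated as independent given the true label). The group names, accuracies, labels, and function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def symmetric_confusion(accuracy, labels):
    """Hypothetical helper: build P(reported | true) for one group,
    spreading the error mass evenly over the wrong labels."""
    wrong = (1.0 - accuracy) / (len(labels) - 1)
    return {t: {r: (accuracy if r == t else wrong) for r in labels}
            for t in labels}


def majority_vote(votes):
    """Baseline: most frequent reported label, ignoring volunteer groups."""
    counts = Counter(label for label, _group in votes)
    return counts.most_common(1)[0][0]


def bayes_classify(votes, confusion, prior):
    """Return (most probable label, posterior over labels).

    votes:     list of (reported_label, group) pairs
    confusion: confusion[group][true_label][reported_label]
               = P(reported | true, group), assumed learned from training data
    prior:     prior[true_label] = P(true_label)
    """
    log_post = {}
    for true_label, p in prior.items():
        lp = math.log(p)
        for reported, group in votes:
            lp += math.log(confusion[group][true_label][reported])
        log_post[true_label] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    posterior = {k: math.exp(v - m) / z for k, v in log_post.items()}
    return max(posterior, key=posterior.get), posterior


# Illustrative numbers only (not taken from the article).
labels = ["yes", "no"]
confusion = {
    "group_a": symmetric_confusion(0.90, labels),  # assumed reliable group
    "group_b": symmetric_confusion(0.55, labels),  # assumed near-chance group
}
prior = {"yes": 0.5, "no": 0.5}
votes = [("yes", "group_a"), ("no", "group_b"), ("no", "group_b")]

baseline = majority_vote(votes)
label, posterior = bayes_classify(votes, confusion, prior)
print(baseline, label, round(posterior["yes"], 3))
```

With these made-up numbers, the two near-chance votes outnumber the single reliable one, so majority voting and the group-weighted posterior disagree. This is the kind of case where class-level information can change the aggregated label.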