Recognizability bias in citizen science photographs

Citizen science and automated collection methods increasingly depend on image recognition to provide the amounts of observational data research and management needs. Recognition models, meanwhile, also require large amounts of data from these sources, creating a feedback loop between the methods and...

Bibliographic Details
Main authors: Koch, Wouter, Hogeweg, Laurens, Nilsen, Erlend B., O’Hara, Robert B., Finstad, Anders G.
Format: Online Article Text
Language: English
Published: The Royal Society 2023
Subjects: Ecology, Conservation and Global Change Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890120/
https://www.ncbi.nlm.nih.gov/pubmed/36756065
http://dx.doi.org/10.1098/rsos.221063
author Koch, Wouter
Hogeweg, Laurens
Nilsen, Erlend B.
O’Hara, Robert B.
Finstad, Anders G.
collection PubMed
description Citizen science and automated collection methods increasingly depend on image recognition to provide the amounts of observational data research and management needs. Recognition models, meanwhile, also require large amounts of data from these sources, creating a feedback loop between the methods and tools. Species that are harder to recognize, both for humans and machine learning algorithms, are likely to be under-reported, and thus be less prevalent in the training data. As a result, the feedback loop may hamper training mostly for species that already pose the greatest challenge. In this study, we trained recognition models for various taxa, and found evidence for a ‘recognizability bias’, where species that are more readily identified by humans and recognition models alike are more prevalent in the available image data. This pattern is present across multiple taxa, and does not appear to relate to differences in picture quality, biological traits or data collection metrics other than recognizability. This has implications for the expected performance of future models trained with more data, including such challenging species.
format Online
Article
Text
id pubmed-9890120
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-9890120 2023-02-07 Recognizability bias in citizen science photographs Koch, Wouter Hogeweg, Laurens Nilsen, Erlend B. O’Hara, Robert B. Finstad, Anders G. R Soc Open Sci Ecology, Conservation and Global Change Biology The Royal Society 2023-02-01 /pmc/articles/PMC9890120/ /pubmed/36756065 http://dx.doi.org/10.1098/rsos.221063 Text en © 2023 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited.
title Recognizability bias in citizen science photographs
topic Ecology, Conservation and Global Change Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890120/
https://www.ncbi.nlm.nih.gov/pubmed/36756065
http://dx.doi.org/10.1098/rsos.221063