
Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird

Credibly estimating social-ecological relationships requires data with broad coverage and fine geographic resolutions that are not typically available from standard ecological surveys. Open and unstructured data from crowdsourced platforms offer an opportunity for collecting large quantities of user...


Bibliographic Details
Main authors: Grade, Aaron M., Chan, Nathan W., Gajbhiye, Prashikdivya, Perkins, Deja J., Warren, Paige S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9645630/
https://www.ncbi.nlm.nih.gov/pubmed/36350898
http://dx.doi.org/10.1371/journal.pone.0277223
author Grade, Aaron M.
Chan, Nathan W.
Gajbhiye, Prashikdivya
Perkins, Deja J.
Warren, Paige S.
collection PubMed
description Credibly estimating social-ecological relationships requires data with broad coverage and fine geographic resolutions that are not typically available from standard ecological surveys. Open and unstructured data from crowdsourced platforms offer an opportunity for collecting large quantities of user-submitted ecological data. However, the representativeness of the areas sampled by these data portals is not well known. We investigate how data availability in eBird, one of the largest and most popular crowdsourced science platforms, correlates with race and income of census tracts in two cities: Boston, MA and Phoenix, AZ. We find that checklist submissions vary greatly across census tracts, with similar patterns within both metropolitan regions. In particular, census tracts with high income and high proportions of white residents are most likely to be represented in the data in both cities, which indicates selection bias in eBird coverage. Our results illustrate the non-representativeness of eBird data, and they also raise deeper questions about the validity of statistical inferences regarding disparities that can be drawn from such datasets. We discuss these challenges and illustrate how sample selection problems in unstructured or semi-structured crowdsourced data can lead to spurious conclusions regarding the relationships between race, income, and access to urban bird biodiversity. While crowdsourced data are indispensable and complementary to more traditional approaches for collecting ecological data, we conclude that unstructured or semi-structured data may not be well-suited for all lines of inquiry, particularly those requiring consistent data coverage, and should thus be handled with appropriate care.
format Online
Article
Text
id pubmed-9645630
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9645630 2022-11-15 Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird Grade, Aaron M. Chan, Nathan W. Gajbhiye, Prashikdivya Perkins, Deja J. Warren, Paige S. PLoS One Research Article Public Library of Science 2022-11-09 /pmc/articles/PMC9645630/ /pubmed/36350898 http://dx.doi.org/10.1371/journal.pone.0277223 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9645630/
https://www.ncbi.nlm.nih.gov/pubmed/36350898
http://dx.doi.org/10.1371/journal.pone.0277223