Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird
Credibly estimating social-ecological relationships requires data with broad coverage and fine geographic resolutions that are not typically available from standard ecological surveys. Open and unstructured data from crowdsourced platforms offer an opportunity for collecting large quantities of user...
Main Authors: | Grade, Aaron M.; Chan, Nathan W.; Gajbhiye, Prashikdivya; Perkins, Deja J.; Warren, Paige S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9645630/ https://www.ncbi.nlm.nih.gov/pubmed/36350898 http://dx.doi.org/10.1371/journal.pone.0277223 |
_version_ | 1784827003747172352 |
---|---|
author | Grade, Aaron M. Chan, Nathan W. Gajbhiye, Prashikdivya Perkins, Deja J. Warren, Paige S. |
author_facet | Grade, Aaron M. Chan, Nathan W. Gajbhiye, Prashikdivya Perkins, Deja J. Warren, Paige S. |
author_sort | Grade, Aaron M. |
collection | PubMed |
description | Credibly estimating social-ecological relationships requires data with broad coverage and fine geographic resolutions that are not typically available from standard ecological surveys. Open and unstructured data from crowdsourced platforms offer an opportunity for collecting large quantities of user-submitted ecological data. However, the representativeness of the areas sampled by these data portals is not well known. We investigate how data availability in eBird, one of the largest and most popular crowdsourced science platforms, correlates with race and income of census tracts in two cities: Boston, MA and Phoenix, AZ. We find that checklist submissions vary greatly across census tracts, with similar patterns within both metropolitan regions. In particular, census tracts with high income and high proportions of white residents are most likely to be represented in the data in both cities, which indicates selection bias in eBird coverage. Our results illustrate the non-representativeness of eBird data, and they also raise deeper questions about the validity of statistical inferences regarding disparities that can be drawn from such datasets. We discuss these challenges and illustrate how sample selection problems in unstructured or semi-structured crowdsourced data can lead to spurious conclusions regarding the relationships between race, income, and access to urban bird biodiversity. While crowdsourced data are indispensable and complementary to more traditional approaches for collecting ecological data, we conclude that unstructured or semi-structured data may not be well-suited for all lines of inquiry, particularly those requiring consistent data coverage, and should thus be handled with appropriate care. |
format | Online Article Text |
id | pubmed-9645630 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9645630 2022-11-15 Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird Grade, Aaron M. Chan, Nathan W. Gajbhiye, Prashikdivya Perkins, Deja J. Warren, Paige S. PLoS One Research Article Credibly estimating social-ecological relationships requires data with broad coverage and fine geographic resolutions that are not typically available from standard ecological surveys. Open and unstructured data from crowdsourced platforms offer an opportunity for collecting large quantities of user-submitted ecological data. However, the representativeness of the areas sampled by these data portals is not well known. We investigate how data availability in eBird, one of the largest and most popular crowdsourced science platforms, correlates with race and income of census tracts in two cities: Boston, MA and Phoenix, AZ. We find that checklist submissions vary greatly across census tracts, with similar patterns within both metropolitan regions. In particular, census tracts with high income and high proportions of white residents are most likely to be represented in the data in both cities, which indicates selection bias in eBird coverage. Our results illustrate the non-representativeness of eBird data, and they also raise deeper questions about the validity of statistical inferences regarding disparities that can be drawn from such datasets. We discuss these challenges and illustrate how sample selection problems in unstructured or semi-structured crowdsourced data can lead to spurious conclusions regarding the relationships between race, income, and access to urban bird biodiversity. While crowdsourced data are indispensable and complementary to more traditional approaches for collecting ecological data, we conclude that unstructured or semi-structured data may not be well-suited for all lines of inquiry, particularly those requiring consistent data coverage, and should thus be handled with appropriate care. Public Library of Science 2022-11-09 /pmc/articles/PMC9645630/ /pubmed/36350898 http://dx.doi.org/10.1371/journal.pone.0277223 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication. |
spellingShingle | Research Article Grade, Aaron M. Chan, Nathan W. Gajbhiye, Prashikdivya Perkins, Deja J. Warren, Paige S. Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title | Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title_full | Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title_fullStr | Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title_full_unstemmed | Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title_short | Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: A case study with eBird |
title_sort | evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: a case study with ebird |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9645630/ https://www.ncbi.nlm.nih.gov/pubmed/36350898 http://dx.doi.org/10.1371/journal.pone.0277223 |
work_keys_str_mv | AT gradeaaronm evaluatingtheuseofsemistructuredcrowdsourceddatatoquantifyinequitableaccesstourbanbiodiversityacasestudywithebird AT channathanw evaluatingtheuseofsemistructuredcrowdsourceddatatoquantifyinequitableaccesstourbanbiodiversityacasestudywithebird AT gajbhiyeprashikdivya evaluatingtheuseofsemistructuredcrowdsourceddatatoquantifyinequitableaccesstourbanbiodiversityacasestudywithebird AT perkinsdejaj evaluatingtheuseofsemistructuredcrowdsourceddatatoquantifyinequitableaccesstourbanbiodiversityacasestudywithebird AT warrenpaiges evaluatingtheuseofsemistructuredcrowdsourceddatatoquantifyinequitableaccesstourbanbiodiversityacasestudywithebird |