A globally synthesised and flagged bee occurrence dataset and cleaning workflow

Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee o...

Bibliographic Details
Main Authors: Dorey, James B., Fischer, Erica E., Chesshire, Paige R., Nava-Bolaños, Angela, O’Reilly, Robert L., Bossert, Silas, Collins, Shannon M., Lichtenberg, Elinor M., Tucker, Erika M., Smith-Pardo, Allan, Falcon-Brindis, Armando, Guevara, Diego A., Ribeiro, Bruno, de Pedro, Diego, Pickering, John, Hung, Keng-Lou James, Parys, Katherine A., McCabe, Lindsie M., Rogan, Matthew S., Minckley, Robert L., Velazco, Santiago J. E., Griswold, Terry, Zarrillo, Tracy A., Jetz, Walter, Sica, Yanina V., Orr, Michael C., Guzman, Laura Melissa, Ascher, John S., Hughes, Alice C., Cobb, Neil S.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622554/
https://www.ncbi.nlm.nih.gov/pubmed/37919303
http://dx.doi.org/10.1038/s41597-023-02626-w
_version_ 1785130564994465792
author Dorey, James B.
Fischer, Erica E.
Chesshire, Paige R.
Nava-Bolaños, Angela
O’Reilly, Robert L.
Bossert, Silas
Collins, Shannon M.
Lichtenberg, Elinor M.
Tucker, Erika M.
Smith-Pardo, Allan
Falcon-Brindis, Armando
Guevara, Diego A.
Ribeiro, Bruno
de Pedro, Diego
Pickering, John
Hung, Keng-Lou James
Parys, Katherine A.
McCabe, Lindsie M.
Rogan, Matthew S.
Minckley, Robert L.
Velazco, Santiago J. E.
Griswold, Terry
Zarrillo, Tracy A.
Jetz, Walter
Sica, Yanina V.
Orr, Michael C.
Guzman, Laura Melissa
Ascher, John S.
Hughes, Alice C.
Cobb, Neil S.
author_facet Dorey, James B.
Fischer, Erica E.
Chesshire, Paige R.
Nava-Bolaños, Angela
O’Reilly, Robert L.
Bossert, Silas
Collins, Shannon M.
Lichtenberg, Elinor M.
Tucker, Erika M.
Smith-Pardo, Allan
Falcon-Brindis, Armando
Guevara, Diego A.
Ribeiro, Bruno
de Pedro, Diego
Pickering, John
Hung, Keng-Lou James
Parys, Katherine A.
McCabe, Lindsie M.
Rogan, Matthew S.
Minckley, Robert L.
Velazco, Santiago J. E.
Griswold, Terry
Zarrillo, Tracy A.
Jetz, Walter
Sica, Yanina V.
Orr, Michael C.
Guzman, Laura Melissa
Ascher, John S.
Hughes, Alice C.
Cobb, Neil S.
author_sort Dorey, James B.
collection PubMed
description Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, “cleaned” and “flagged-but-uncleaned”. The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
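The flag-then-filter idea in the description above can be illustrated with a minimal sketch. This is not the BeeBDC package or its API (BeeBDC is an R package); it is a plain Python illustration under stated assumptions. The record fields, the flag names, and the helpers flag_records and clean are all hypothetical and chosen only for this example. Each check adds a boolean column (True means the record passes), the "flagged-but-uncleaned" form keeps every record, and the "cleaned" form keeps only records that pass the flags the user chooses to apply.

```python
from datetime import date

# Hypothetical occurrence records; field names are illustrative only.
RECORDS = [
    {"id": 1, "species": "Apis mellifera", "lat": 52.1, "lon": 4.3, "event_date": date(2019, 6, 1)},
    {"id": 2, "species": "Apis mellifera", "lat": 0.0, "lon": 0.0, "event_date": date(2020, 7, 4)},
    {"id": 3, "species": "", "lat": 40.7, "lon": -74.0, "event_date": date(2018, 5, 9)},
    {"id": 4, "species": "Bombus terrestris", "lat": 45.0, "lon": 10.0, "event_date": None},
    {"id": 5, "species": "Apis mellifera", "lat": 52.1, "lon": 4.3, "event_date": date(2019, 6, 1)},
]

# Hypothetical flag names (True = record passes the check).
ALL_FLAGS = [
    "flag_has_name",
    "flag_valid_coords",
    "flag_not_zero_coords",
    "flag_has_date",
    "flag_not_duplicate",
]


def flag_records(records):
    """Return copies of the records with one boolean column per quality check.

    No record is dropped here: this is the "flagged-but-uncleaned" form.
    """
    seen = set()
    flagged = []
    for rec in records:
        out = dict(rec)
        out["flag_has_name"] = bool(rec["species"].strip())
        out["flag_valid_coords"] = -90 <= rec["lat"] <= 90 and -180 <= rec["lon"] <= 180
        out["flag_not_zero_coords"] = not (rec["lat"] == 0 and rec["lon"] == 0)
        out["flag_has_date"] = rec["event_date"] is not None
        # A record is a duplicate if an earlier one shares species, place, and date.
        key = (rec["species"], rec["lat"], rec["lon"], rec["event_date"])
        out["flag_not_duplicate"] = key not in seen
        seen.add(key)
        flagged.append(out)
    return flagged


def clean(flagged, flags_to_apply):
    """Keep only records that pass every flag the user chose to apply."""
    return [rec for rec in flagged if all(rec[name] for name in flags_to_apply)]


flagged = flag_records(RECORDS)

# Strict cleaning: apply every flag.
cleaned = clean(flagged, ALL_FLAGS)

# Relaxed cleaning: a user who does not need collection dates skips that flag.
relaxed = clean(flagged, [name for name in ALL_FLAGS if name != "flag_has_date"])

print([rec["id"] for rec in cleaned])
print([rec["id"] for rec in relaxed])
```

Keeping the flags as columns rather than deleting records up front is what lets a user change the filtering later without re-running the checks, which is the design choice the description attributes to the two data formats.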
format Online
Article
Text
id pubmed-10622554
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10622554 2023-11-04 A globally synthesised and flagged bee occurrence dataset and cleaning workflow Dorey, James B. Fischer, Erica E. Chesshire, Paige R. Nava-Bolaños, Angela O’Reilly, Robert L. Bossert, Silas Collins, Shannon M. Lichtenberg, Elinor M. Tucker, Erika M. Smith-Pardo, Allan Falcon-Brindis, Armando Guevara, Diego A. Ribeiro, Bruno de Pedro, Diego Pickering, John Hung, Keng-Lou James Parys, Katherine A. McCabe, Lindsie M. Rogan, Matthew S. Minckley, Robert L. Velazco, Santiago J. E. Griswold, Terry Zarrillo, Tracy A. Jetz, Walter Sica, Yanina V. Orr, Michael C. Guzman, Laura Melissa Ascher, John S. Hughes, Alice C. Cobb, Neil S. Sci Data Data Descriptor Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, “cleaned” and “flagged-but-uncleaned”. The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
Nature Publishing Group UK 2023-11-02 /pmc/articles/PMC10622554/ /pubmed/37919303 http://dx.doi.org/10.1038/s41597-023-02626-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Data Descriptor
Dorey, James B.
Fischer, Erica E.
Chesshire, Paige R.
Nava-Bolaños, Angela
O’Reilly, Robert L.
Bossert, Silas
Collins, Shannon M.
Lichtenberg, Elinor M.
Tucker, Erika M.
Smith-Pardo, Allan
Falcon-Brindis, Armando
Guevara, Diego A.
Ribeiro, Bruno
de Pedro, Diego
Pickering, John
Hung, Keng-Lou James
Parys, Katherine A.
McCabe, Lindsie M.
Rogan, Matthew S.
Minckley, Robert L.
Velazco, Santiago J. E.
Griswold, Terry
Zarrillo, Tracy A.
Jetz, Walter
Sica, Yanina V.
Orr, Michael C.
Guzman, Laura Melissa
Ascher, John S.
Hughes, Alice C.
Cobb, Neil S.
A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title_full A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title_fullStr A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title_full_unstemmed A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title_short A globally synthesised and flagged bee occurrence dataset and cleaning workflow
title_sort globally synthesised and flagged bee occurrence dataset and cleaning workflow
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622554/
https://www.ncbi.nlm.nih.gov/pubmed/37919303
http://dx.doi.org/10.1038/s41597-023-02626-w