A resource for automated search and collation of geochemical datasets from journal supplements
This article presents a resource for automated search, extraction and collation of geochemical and geochronological data from the Figshare repository using web scraping code. To answer fundamental questions about the Earth’s evolution, such as spatial and temporal evolution and interrelationships be...
Main Authors: | Martin, Erin L.; Barrote, Vitor R.; Cawood, Peter A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | Data Descriptor |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9700723/ https://www.ncbi.nlm.nih.gov/pubmed/36433993 http://dx.doi.org/10.1038/s41597-022-01730-7 |
_version_ | 1784839375243182080 |
---|---|
author | Martin, Erin L.; Barrote, Vitor R.; Cawood, Peter A. |
author_facet | Martin, Erin L.; Barrote, Vitor R.; Cawood, Peter A. |
author_sort | Martin, Erin L. |
collection | PubMed |
description | This article presents a resource for automated search, extraction and collation of geochemical and geochronological data from the Figshare repository using web scraping code. To answer fundamental questions about the Earth’s evolution, such as spatial and temporal evolution and interrelationships between the planet’s solid and surficial reservoirs, researchers must utilize global geochemical datasets. Due to the volume of data being published, these datasets become quickly outdated. We present a resource that allows researchers to rapidly curate and update their own databases from existing published data. We use open-source Python code to web scrape the Figshare repository for journal supplementary files using the application programming interface, allowing for the collection and download of hundreds of supplementary files and metadata in minutes. Use of this web scraping tool is demonstrated here by collation of a zircon geochronology and chemistry database of >150,000 analyses. The database is consistent in reproducing trends in other published zircon compilations. Providing a resource for automated collection of Figshare data files will encourage data sharing and reuse. |
format | Online Article Text |
id | pubmed-9700723 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9700723 2022-11-27 A resource for automated search and collation of geochemical datasets from journal supplements Martin, Erin L. Barrote, Vitor R. Cawood, Peter A. Sci Data Data Descriptor This article presents a resource for automated search, extraction and collation of geochemical and geochronological data from the Figshare repository using web scraping code. To answer fundamental questions about the Earth’s evolution, such as spatial and temporal evolution and interrelationships between the planet’s solid and surficial reservoirs, researchers must utilize global geochemical datasets. Due to the volume of data being published, these datasets become quickly outdated. We present a resource that allows researchers to rapidly curate and update their own databases from existing published data. We use open-source Python code to web scrape the Figshare repository for journal supplementary files using the application programming interface, allowing for the collection and download of hundreds of supplementary files and metadata in minutes. Use of this web scraping tool is demonstrated here by collation of a zircon geochronology and chemistry database of >150,000 analyses. The database is consistent in reproducing trends in other published zircon compilations. Providing a resource for automated collection of Figshare data files will encourage data sharing and reuse. Nature Publishing Group UK 2022-11-25 /pmc/articles/PMC9700723/ /pubmed/36433993 http://dx.doi.org/10.1038/s41597-022-01730-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Data Descriptor Martin, Erin L. Barrote, Vitor R. Cawood, Peter A. A resource for automated search and collation of geochemical datasets from journal supplements |
title | A resource for automated search and collation of geochemical datasets from journal supplements |
title_full | A resource for automated search and collation of geochemical datasets from journal supplements |
title_fullStr | A resource for automated search and collation of geochemical datasets from journal supplements |
title_full_unstemmed | A resource for automated search and collation of geochemical datasets from journal supplements |
title_short | A resource for automated search and collation of geochemical datasets from journal supplements |
title_sort | resource for automated search and collation of geochemical datasets from journal supplements |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9700723/ https://www.ncbi.nlm.nih.gov/pubmed/36433993 http://dx.doi.org/10.1038/s41597-022-01730-7 |
work_keys_str_mv | AT martinerinl aresourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements AT barrotevitorr aresourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements AT cawoodpetera aresourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements AT martinerinl resourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements AT barrotevitorr resourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements AT cawoodpetera resourceforautomatedsearchandcollationofgeochemicaldatasetsfromjournalsupplements |
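
The description field in this record says that supplementary files are collected from Figshare through its application programming interface. The sketch below illustrates what such a search-and-list step might look like. It is not the authors' published code: the helper names (`build_search_request`, `files_url`, `pick_spreadsheets`, `collect`), the example query and the list of file extensions are all hypothetical, and it assumes the public Figshare v2 endpoints `POST /articles/search` and `GET /articles/{id}/files`.

```python
import json
import urllib.request

# Assumed base URL of the public Figshare API (v2).
FIGSHARE_API = "https://api.figshare.com/v2"


def build_search_request(query, page=1, page_size=100):
    """Build (but do not send) a POST request for the article search endpoint."""
    body = json.dumps({"search_for": query, "page": page, "page_size": page_size})
    return urllib.request.Request(
        f"{FIGSHARE_API}/articles/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def files_url(article_id):
    """URL that lists the files attached to one Figshare article."""
    return f"{FIGSHARE_API}/articles/{article_id}/files"


def pick_spreadsheets(files, extensions=(".xlsx", ".xls", ".csv")):
    """Keep only file records whose name ends in a tabular-data extension."""
    return [f for f in files if f.get("name", "").lower().endswith(extensions)]


def fetch_json(request_or_url):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request_or_url, timeout=30) as resp:
        return json.load(resp)


def collect(query, max_articles=5):
    """Search Figshare, then list candidate data tables for each hit."""
    hits = fetch_json(build_search_request(query, page_size=max_articles))
    found = []
    for article in hits:
        for f in pick_spreadsheets(fetch_json(files_url(article["id"]))):
            found.append((article["title"], f["name"], f["download_url"]))
    return found
```

Calling `collect("zircon U-Pb")` would contact the live API and return `(title, file name, download URL)` tuples. The request-building and filtering helpers are kept separate from the network calls so they can be checked offline.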