PUblications Metadata Augmentation (PUMA) pipeline

Cohort studies collect, generate and distribute data over long periods of time – often over the lifecourse of their participants. It is common for these studies to host a list of publications (which can number many thousands) on their website to demonstrate the impact of the study and facilitate the...

Bibliographic Details
Main authors: Butters, Oliver W., Wilson, Rebecca C., Garner, Hugh, Burton, Thomas W. Y.
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2021
Subjects: Software Tool Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8108552/
https://www.ncbi.nlm.nih.gov/pubmed/34026049
http://dx.doi.org/10.12688/f1000research.25484.2
author Butters, Oliver W.
Wilson, Rebecca C.
Garner, Hugh
Burton, Thomas W. Y.
collection PubMed
description Cohort studies collect, generate and distribute data over long periods of time – often over the lifecourse of their participants. It is common for these studies to host a list of publications (which can number many thousands) on their website to demonstrate the impact of the study and facilitate the search of existing research to which the study data has contributed. The ability to search and explore these publication lists varies greatly between studies. We believe a lack of rich search and exploration functionality of study publications is a barrier to entry for new or prospective users of a study’s data, since it may be difficult to find and evaluate previous work in a given area. These lists of publications are also typically manually curated, so they lack the rich metadata that bibliometric analysis requires. We present here a software pipeline that aggregates metadata from a variety of third-party providers to power a web-based search and exploration tool for lists of publications. Alongside core publication metadata (e.g. author lists and keywords), we include geocoding of first authors and citation counts in our pipeline. This allows a study to be characterised as a whole by, for example, the common locations of its authors, the frequency of keywords and its citation profile. This enriched publications metadata can be useful for generating study impact metrics and web-based graphics for public dissemination. In addition, the pipeline produces a research data set for bibliometric analysis or social studies of science. We use a previously published list of publications from a cohort study as an exemplar input data set to show the output and utility of the pipeline here.
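The study-level characterisation mentioned in the description (common author locations, keyword frequency, citation profile) can be illustrated with a minimal Python sketch. This is not the PUMA implementation: the record layout, the field names (`keywords`, `first_author_country`, `citations`), the `summarise` helper and the three sample records are all hypothetical, invented only to show the kind of aggregation that enriched metadata makes possible.

```python
from collections import Counter
from statistics import median

# Hypothetical enriched publication records. The field names and values are
# invented for illustration and are not taken from the PUMA pipeline.
publications = [
    {"title": "Paper A", "keywords": ["asthma", "genetics"],
     "first_author_country": "United Kingdom", "citations": 12},
    {"title": "Paper B", "keywords": ["genetics"],
     "first_author_country": "United Kingdom", "citations": 3},
    {"title": "Paper C", "keywords": ["diet", "asthma"],
     "first_author_country": "Norway", "citations": 40},
]


def summarise(records):
    """Aggregate per-publication metadata into study-level summary figures."""
    # Count each keyword across all records, ignoring case.
    keyword_counts = Counter(
        keyword.lower() for record in records for keyword in record["keywords"]
    )
    # Count how often each first-author country appears.
    country_counts = Counter(record["first_author_country"] for record in records)
    citations = [record["citations"] for record in records]
    return {
        "top_keywords": keyword_counts.most_common(3),
        "top_countries": country_counts.most_common(3),
        "total_citations": sum(citations),
        "median_citations": median(citations),
    }


summary = summarise(publications)
```

For a real publication list, records like these would have to be populated from whichever third-party metadata providers are available; the sketch only covers the final aggregation step.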
format Online
Article
Text
id pubmed-8108552
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-8108552 2021-05-21 PUblications Metadata Augmentation (PUMA) pipeline Butters, Oliver W. Wilson, Rebecca C. Garner, Hugh Burton, Thomas W. Y. F1000Res Software Tool Article
F1000 Research Limited 2021-04-12 /pmc/articles/PMC8108552/ /pubmed/34026049 http://dx.doi.org/10.12688/f1000research.25484.2 Text en Copyright: © 2021 Butters OW et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PUblications Metadata Augmentation (PUMA) pipeline
topic Software Tool Article