Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles

BACKGROUND: In this study, we present an analysis of data citation practices in full text research articles and their corresponding supplementary data files, made available in the Open Access set of articles from Europe PubMed Central. Our aim is to investigate whether supplementary data files shoul...

Bibliographic Details
Main Authors: Kafkas, Şenay, Kim, Jee-Hyub, Pi, Xingjun, McEntyre, Johanna R
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4363206/
https://www.ncbi.nlm.nih.gov/pubmed/25789152
http://dx.doi.org/10.1186/2041-1480-6-1
author Kafkas, Şenay
Kim, Jee-Hyub
Pi, Xingjun
McEntyre, Johanna R
collection PubMed
description BACKGROUND: In this study, we present an analysis of data citation practices in full text research articles and their corresponding supplementary data files, made available in the Open Access set of articles from Europe PubMed Central. Our aim is to investigate whether supplementary data files should be considered as a source of information for integrating the literature with biomolecular databases. RESULTS: Using text-mining methods to identify and extract a variety of core biological database accession numbers, we found that the supplemental data files contain many more database citations than the body of the article, and that those citations often take the form of a relatively small number of articles citing large collections of accession numbers in text-based files. Moreover, citation of value-added databases derived from submission databases (such as Pfam, UniProt or Ensembl) is common, demonstrating the reuse of these resources as datasets in themselves. All the database accession numbers extracted from the supplementary data are publicly accessible from http://dx.doi.org/10.5281/zenodo.11771. CONCLUSIONS: Our study suggests that supplementary data should be considered when linking articles with data, in curation pipelines, and in information retrieval tasks in order to make full use of the entire research article. These observations highlight the need to improve the management of supplemental data in general, in order to make this information more discoverable and useful.
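The text-mining step described in the RESULTS above can be illustrated with a minimal sketch. This record does not give the study's actual extraction rules, so everything here is an assumption made for illustration: the function name `extract_accessions` is hypothetical, and the regular expressions are simplified versions of the commonly documented accession formats for Pfam, UniProt and Ensembl, not the authors' patterns.

```python
import re

# Illustrative patterns only (assumed, not taken from the study).
ACCESSION_PATTERNS = {
    # Pfam family accessions: "PF" followed by five digits.
    "Pfam": re.compile(r"\bPF\d{5}\b"),
    # UniProt accessions, 6 or 10 characters.
    "UniProt": re.compile(
        r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
    ),
    # Ensembl stable IDs for genes (G), transcripts (T) and proteins (P).
    "Ensembl": re.compile(r"\bENS[A-Z]*[GTP]\d{11}\b"),
}


def extract_accessions(text):
    """Return {database: sorted unique accession strings found in text}."""
    found = {}
    for database, pattern in ACCESSION_PATTERNS.items():
        matches = sorted(set(pattern.findall(text)))
        if matches:
            found[database] = matches
    return found


sample = (
    "Domains PF00069 and PF00001 were found in P12345; "
    "see ENSG00000139618. PF00069 again."
)
result = extract_accessions(sample)
```

Pattern matching of this kind over-generates on real text (a six-character token such as a sample label can look like a UniProt accession), so a production pipeline would normally add context checks or validate each candidate against the target database.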
format Online
Article
Text
id pubmed-4363206
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4363206 2015-03-19 Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles Kafkas, Şenay Kim, Jee-Hyub Pi, Xingjun McEntyre, Johanna R J Biomed Semantics Research BioMed Central 2015-01-05 /pmc/articles/PMC4363206/ /pubmed/25789152 http://dx.doi.org/10.1186/2041-1480-6-1 Text en © Kafkas et al.; licensee BioMed Central. 2015 This article is published under license to BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles
topic Research