Dataset of the frequency patterns of publications annotated to human protein-coding genes, their protein products and genetic relevance

We present data on the distribution of scientific publications across human protein-coding genes, together with their protein products and genetic relevance. We annotated the gene2pubmed dataset (Maglott et al., 2007) provided by the National Center for Biotechnology Information (NCBI) with public...

Bibliographic Details
Main Authors: Zwick, Matthias; Kraemer, Oliver; Carter, Adrian J.
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects: Proteomics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6702404/
https://www.ncbi.nlm.nih.gov/pubmed/31453287
http://dx.doi.org/10.1016/j.dib.2019.104284
author Zwick, Matthias
Kraemer, Oliver
Carter, Adrian J.
collection PubMed
description We present data on the distribution of scientific publications across human protein-coding genes, together with their protein products and genetic relevance. Using the KNIME® Analytics Platform (Berthold et al., 2008), we annotated the gene2pubmed dataset (Maglott et al., 2007) provided by the National Center for Biotechnology Information (NCBI) with publication years, genetic metadata from the corresponding Online Mendelian Inheritance in Man (OMIM) entries (Hamosh et al., 2005), and the frequency with which each gene appears in Genome-Wide Association Studies (GWAS) (Buniello et al., 2019), as provided by the European Bioinformatics Institute (EBI). This data integration process yields two datasets: 1) a dataset covering all human protein-coding genes that can be used to analyse the number of scientific publications in the context of the potential disease relevance of individual genes; 2) a table with the annual and cumulative number of PubMed entries. For further interpretation of the data presented in this article, please see the research article ‘Target 2035 - probing the human proteome’ by Carter et al. (2019), https://doi.org/10.1016/j.drudis.2019.06.020.
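The two outputs named in the description (publication counts per gene, and annual plus cumulative PubMed entry counts) can be illustrated with a minimal sketch. The authors state that they used the KNIME Analytics Platform; the Python below is a swapped-in illustration only, not their workflow. The rows in `gene2pubmed` and `pub_year` are hypothetical toy values, and the function names are invented for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical toy rows in the spirit of gene2pubmed: (gene_id, pubmed_id).
gene2pubmed = [
    (7157, 101), (7157, 102), (7157, 103),
    (672, 102), (672, 104),
]

# Hypothetical publication years, keyed by PubMed ID.
pub_year = {101: 2001, 102: 2001, 103: 2003, 104: 2004}


def publications_per_gene(rows):
    """Count the distinct PubMed IDs annotated to each gene."""
    distinct_pairs = set(rows)  # ignore repeated (gene, pmid) pairs
    return Counter(gene for gene, _ in distinct_pairs)


def annual_and_cumulative(rows, years):
    """Return [(year, annual_count, cumulative_count)] over distinct PubMed IDs.

    Years without any publication are omitted rather than listed with zero.
    """
    pmids = {pmid for _, pmid in rows if pmid in years}
    annual = Counter(years[pmid] for pmid in pmids)
    ordered_years = sorted(annual)
    running_totals = accumulate(annual[year] for year in ordered_years)
    return [(year, annual[year], total)
            for year, total in zip(ordered_years, running_totals)]


per_gene = publications_per_gene(gene2pubmed)
timeline = annual_and_cumulative(gene2pubmed, pub_year)
```

Here `per_gene` maps each gene identifier to its publication count, and `timeline` lists one tuple per year. A paper annotated to several genes is counted once in `timeline`, which is one possible design choice; the record does not say how the published table treats this case.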
format Online
Article
Text
id pubmed-6702404
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6702404 2019-08-26 Dataset of the frequency patterns of publications annotated to human protein-coding genes, their protein products and genetic relevance. Zwick, Matthias; Kraemer, Oliver; Carter, Adrian J. Data Brief. Proteomics. Elsevier 2019-07-18 /pmc/articles/PMC6702404/ /pubmed/31453287 http://dx.doi.org/10.1016/j.dib.2019.104284 Text en © 2019 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Dataset of the frequency patterns of publications annotated to human protein-coding genes, their protein products and genetic relevance
topic Proteomics