Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000

The mentioning of gene names in the body of the scientific literature 1901–2017 and their fractional counting is used as a proxy to assess the level of biological function discovery. A literature score of one has been defined as full publication equivalent (FPE), the amount of literature necessary t...

Bibliographic Details
Main authors: Sinha, Swati; Eisenhaber, Birgit; Jensen, Lars Juhl; Kalbuaji, Bharata; Eisenhaber, Frank
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects: Research Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6282819/
https://www.ncbi.nlm.nih.gov/pubmed/30265449
http://dx.doi.org/10.1002/pmic.201800093
author Sinha, Swati
Eisenhaber, Birgit
Jensen, Lars Juhl
Kalbuaji, Bharata
Eisenhaber, Frank
collection PubMed
description The mentioning of gene names in the body of the scientific literature 1901–2017 and their fractional counting is used as a proxy to assess the level of biological function discovery. A literature score of one has been defined as full publication equivalent (FPE), the amount of literature necessary to achieve one publication solely dedicated to a gene. It has been found that less than 5000 human genes have each at least 100 FPEs in the available literature corpus. This group of elite genes (4817 protein‐coding genes, 119 non‐coding RNAs) attracts the overwhelming majority of the scientific literature about genes. Yet, thousands of proteins have never been mentioned at all, ≈2000 further proteins have not even one FPE of literature and, for ≈4600 additional proteins, the FPE count is below 10. The protein function discovery rate measured as numbers of proteins first mentioned or crossing a threshold of accumulated FPEs in a given year has grown until 2000 but is in decline thereafter. This drop is partially offset by function discoveries for non‐coding RNAs. The full human genome sequencing does not boost the function discovery rate. Since 2000, the fastest growing group in the literature is that with at least 500 FPEs per gene.
format Online
Article
Text
id pubmed-6282819
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6282819 2018-12-11 Proteomics Research Articles John Wiley and Sons Inc. 2018-10-30 2018-11 /pmc/articles/PMC6282819/ /pubmed/30265449 http://dx.doi.org/10.1002/pmic.201800093 Text en © 2018 Bioinformatics Institute. Proteomics Published by WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim.
This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Darkness in the Human Gene and Protein Function Space: Widely Modest or Absent Illumination by the Life Science Literature and the Trend for Fewer Protein Function Discoveries Since 2000
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6282819/
https://www.ncbi.nlm.nih.gov/pubmed/30265449
http://dx.doi.org/10.1002/pmic.201800093