
Human protein-coding genes and gene feature statistics in 2019

OBJECTIVE: A well-known limit of genome browsers is that the large amount of genome and gene data is not organized in the form of a searchable database, hampering full management of numerical data and free calculations. Due to the continuous increase of data deposited in genomic repositories, their...


Bibliographic Details
Main Authors: Piovesan, Allison; Antonaros, Francesca; Vitale, Lorenza; Strippoli, Pierluigi; Pelleri, Maria Chiara; Caracausi, Maria
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6549324/
https://www.ncbi.nlm.nih.gov/pubmed/31164174
http://dx.doi.org/10.1186/s13104-019-4343-8
_version_ 1783423981960822784
author Piovesan, Allison
Antonaros, Francesca
Vitale, Lorenza
Strippoli, Pierluigi
Pelleri, Maria Chiara
Caracausi, Maria
author_facet Piovesan, Allison
Antonaros, Francesca
Vitale, Lorenza
Strippoli, Pierluigi
Pelleri, Maria Chiara
Caracausi, Maria
author_sort Piovesan, Allison
collection PubMed
description OBJECTIVE: A well-known limit of genome browsers is that the large amount of genome and gene data is not organized in the form of a searchable database, hampering full management of numerical data and free calculations. Due to the continuous increase of data deposited in genomic repositories, their content revision and analysis is recommended. Using GeneBase, a software with a graphical interface able to import and elaborate National Center for Biotechnology Information (NCBI) Gene database entries, we provide tabulated spreadsheets updated to 2019 about human nuclear protein-coding gene data set ready to be used for any type of analysis about genes, transcripts and gene organization. RESULTS: Comparison with previous reports reveals substantial change in the number of known nuclear protein-coding genes (now 19,116), the protein-coding non-redundant transcriptome space [now 59,281,518 base pair (bp), 10.1% increase], the number of exons (now 562,164, 36.2% increase) due to a relevant increase of the RNA isoforms recorded. Other parameters such as gene, exon or intron mean and extreme length appear to have reached a stability that is unlikely to be substantially modified by human genome data updates, at least regarding protein-coding genes. Finally, we confirm that there are no human introns shorter than 30 bp.
format Online
Article
Text
id pubmed-6549324
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6549324 2019-06-06 Human protein-coding genes and gene feature statistics in 2019 Piovesan, Allison Antonaros, Francesca Vitale, Lorenza Strippoli, Pierluigi Pelleri, Maria Chiara Caracausi, Maria BMC Res Notes Research Note OBJECTIVE: A well-known limit of genome browsers is that the large amount of genome and gene data is not organized in the form of a searchable database, hampering full management of numerical data and free calculations. Due to the continuous increase of data deposited in genomic repositories, their content revision and analysis is recommended. Using GeneBase, a software with a graphical interface able to import and elaborate National Center for Biotechnology Information (NCBI) Gene database entries, we provide tabulated spreadsheets updated to 2019 about human nuclear protein-coding gene data set ready to be used for any type of analysis about genes, transcripts and gene organization. RESULTS: Comparison with previous reports reveals substantial change in the number of known nuclear protein-coding genes (now 19,116), the protein-coding non-redundant transcriptome space [now 59,281,518 base pair (bp), 10.1% increase], the number of exons (now 562,164, 36.2% increase) due to a relevant increase of the RNA isoforms recorded. Other parameters such as gene, exon or intron mean and extreme length appear to have reached a stability that is unlikely to be substantially modified by human genome data updates, at least regarding protein-coding genes. Finally, we confirm that there are no human introns shorter than 30 bp.
BioMed Central 2019-06-04 /pmc/articles/PMC6549324/ /pubmed/31164174 http://dx.doi.org/10.1186/s13104-019-4343-8 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Note
Piovesan, Allison
Antonaros, Francesca
Vitale, Lorenza
Strippoli, Pierluigi
Pelleri, Maria Chiara
Caracausi, Maria
Human protein-coding genes and gene feature statistics in 2019
title Human protein-coding genes and gene feature statistics in 2019
title_full Human protein-coding genes and gene feature statistics in 2019
title_fullStr Human protein-coding genes and gene feature statistics in 2019
title_full_unstemmed Human protein-coding genes and gene feature statistics in 2019
title_short Human protein-coding genes and gene feature statistics in 2019
title_sort human protein-coding genes and gene feature statistics in 2019
topic Research Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6549324/
https://www.ncbi.nlm.nih.gov/pubmed/31164174
http://dx.doi.org/10.1186/s13104-019-4343-8
work_keys_str_mv AT piovesanallison humanproteincodinggenesandgenefeaturestatisticsin2019
AT antonarosfrancesca humanproteincodinggenesandgenefeaturestatisticsin2019
AT vitalelorenza humanproteincodinggenesandgenefeaturestatisticsin2019
AT strippolipierluigi humanproteincodinggenesandgenefeaturestatisticsin2019
AT pellerimariachiara humanproteincodinggenesandgenefeaturestatisticsin2019
AT caracausimaria humanproteincodinggenesandgenefeaturestatisticsin2019