
GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics

We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization i...


Bibliographic Details
Main authors: Piovesan, Allison; Caracausi, Maria; Antonaros, Francesca; Pelleri, Maria Chiara; Vitale, Lorenza
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5199132/
https://www.ncbi.nlm.nih.gov/pubmed/28025344
http://dx.doi.org/10.1093/database/baw153
_version_ 1782488954455457792
author Piovesan, Allison
Caracausi, Maria
Antonaros, Francesca
Pelleri, Maria Chiara
Vitale, Lorenza
author_sort Piovesan, Allison
collection PubMed
description We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus offers the opportunity to perform analyses of the main gene structure parameters also following the search for any set of genes with the desired characteristics, allowing unique functionalities not provided by the NCBI Gene itself. In order to show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimations strongly underestimating the length of human genes, a ‘mean’ human protein-coding gene is 67 kbp long, has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features offering an updated reference source for human genome studies, data useful to set parameters for bioinformatic tools and interesting clues to the biomedical meaning of the gene features themselves. Database URL: http://apollo11.isto.unibo.it/software/
format Online
Article
Text
id pubmed-5199132
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5199132 2017-01-06 GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics Piovesan, Allison Caracausi, Maria Antonaros, Francesca Pelleri, Maria Chiara Vitale, Lorenza Database (Oxford) Original Article Oxford University Press 2016-12-26 /pmc/articles/PMC5199132/ /pubmed/28025344 http://dx.doi.org/10.1093/database/baw153 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5199132/
https://www.ncbi.nlm.nih.gov/pubmed/28025344
http://dx.doi.org/10.1093/database/baw153