
GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences


Bibliographic Details
Main authors: Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531167/
https://www.ncbi.nlm.nih.gov/pubmed/23161689
http://dx.doi.org/10.1093/nar/gks1062
author Antonov, Ivan
Baranov, Pavel
Borodovsky, Mark
collection PubMed
description Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) can be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events that evolved to control the expression of some genes. We previously developed GeneTack, an algorithm and software program for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position, and frameshift direction (−1, +1). The fs-genes can be retrieved through a web interface by similarity search against a query sequence or by browsing fs-gene clusters, among other options. Clusters of fs-genes are characterized by their likely origin, such as pseudogenization or phase variation. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).
format Online
Article
Text
id pubmed-3531167
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3531167 2013-03-07 GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences. Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark. Nucleic Acids Res. Articles. Oxford University Press 2013-01 2012-11-17 /pmc/articles/PMC3531167/ /pubmed/23161689 http://dx.doi.org/10.1093/nar/gks1062 Text en © The Author(s) 2012. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
title GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences
topic Articles