
CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

BACKGROUND: Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniq...


Bibliographic Details
Main authors: Mahadevan, Padmanabhan; King, John F; Seto, Donald
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2738686/
https://www.ncbi.nlm.nih.gov/pubmed/19706165
http://dx.doi.org/10.1186/1756-0500-2-168
author Mahadevan, Padmanabhan
King, John F
Seto, Donald
collection PubMed
description BACKGROUND: Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools for their analysis. "CoreGenesUniqueGenes" (CGUG) is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. FINDINGS: CGUG is available as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. CONCLUSION: CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small-genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins.
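The iterative comparison described in the abstract (one reference genome, up to four query genomes, a resulting table of common genes) can be illustrated with a minimal sketch. This is not the CGUG implementation: it assumes BLASTP has already been run and its hits reduced to `(reference protein, query protein, score)` tuples, and the names `core_and_unique`, `matched_reference_proteins`, and the `min_score` cutoff of 75.0 are hypothetical choices made for illustration only.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record: (reference protein ID, query protein ID, BLASTP score).
Hit = Tuple[str, str, float]

MAX_QUERY_GENOMES = 4  # the abstract states "up to four query genomes"


def matched_reference_proteins(
    hits: Iterable[Hit], candidates: Set[str], min_score: float
) -> Set[str]:
    """Return the candidate reference proteins with a hit at or above min_score."""
    return {ref for ref, _query, score in hits if ref in candidates and score >= min_score}


def core_and_unique(
    reference_proteins: Iterable[str],
    hits_per_query: List[List[Hit]],
    min_score: float = 75.0,  # assumed cutoff, not taken from the source
) -> Tuple[Set[str], Set[str]]:
    """Iteratively narrow the reference proteome to a core set.

    core   = reference proteins matched in every query genome
    unique = reference proteins matched in no query genome
    """
    if not 1 <= len(hits_per_query) <= MAX_QUERY_GENOMES:
        raise ValueError("expected between one and four query genomes")

    all_refs = set(reference_proteins)
    core = set(all_refs)
    matched_anywhere: Set[str] = set()

    for hits in hits_per_query:
        hits = list(hits)
        # Track every reference protein that matched at least one query genome.
        matched_anywhere |= matched_reference_proteins(hits, all_refs, min_score)
        # Carry only the surviving core set forward to the next genome.
        core = matched_reference_proteins(hits, core, min_score)

    unique = all_refs - matched_anywhere
    return core, unique
```

With a three-protein reference and two query genomes, a protein matched in both queries lands in the core set, a protein matched in only one lands in neither set, and a protein with no match is reported as unique to the reference.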
format Text
id pubmed-2738686
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2738686 2009-09-05 CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb Mahadevan, Padmanabhan; King, John F; Seto, Donald. BMC Res Notes. Technical Note. BioMed Central 2009-08-25 /pmc/articles/PMC2738686/ /pubmed/19706165 http://dx.doi.org/10.1186/1756-0500-2-168 Text en Copyright © 2009 Seto et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb
topic Technical Note