
A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

BACKGROUND: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence h...


Bibliographic Details
Main Authors: Podell, Sheila; Gaasterland, Terry; Allen, Eric E
Format: Text
Language: English
Published: BioMed Central 2008
Subjects: Database
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2573894/
https://www.ncbi.nlm.nih.gov/pubmed/18840280
http://dx.doi.org/10.1186/1471-2105-9-419
_version_ 1782160285479469056
author Podell, Sheila
Gaasterland, Terry
Allen, Eric E
collection PubMed
description BACKGROUND: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not.
DESCRIPTION: The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results organized into a web-searchable relational database, called the DarkHorse HGT Candidate Resource. Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether or not sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence.
CONCLUSION: The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.
format Text
id pubmed-2573894
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2573894 2008-10-28 A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm. Podell, Sheila; Gaasterland, Terry; Allen, Eric E. BMC Bioinformatics. Database. BioMed Central 2008-10-07 /pmc/articles/PMC2573894/ /pubmed/18840280 http://dx.doi.org/10.1186/1471-2105-9-419 Text en Copyright © 2008 Podell et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2573894/
https://www.ncbi.nlm.nih.gov/pubmed/18840280
http://dx.doi.org/10.1186/1471-2105-9-419