BranchClust: a phylogenetic algorithm for selecting gene families

BACKGROUND: Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The former are easy to automate, but they usually do not distinguish between paralogs and orthologs, or they restrict the number...

Bibliographic Details
Main authors: Poptsova, Maria S, Gogarten, J Peter
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1853112/
https://www.ncbi.nlm.nih.gov/pubmed/17425803
http://dx.doi.org/10.1186/1471-2105-8-120
author Poptsova, Maria S
Gogarten, J Peter
collection PubMed
description BACKGROUND: Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The former are easy to automate, but they usually do not distinguish between paralogs and orthologs, or they restrict the number of taxa. Phylogenetic methods are often based on reconciling a gene tree with a known rooted species tree; a limitation of this approach, especially in the case of prokaryotes, is that the species tree is often unknown and that the branching order between related organisms frequently remains unresolved in analyses of single gene families. RESULTS: Here we describe an algorithm for the automated selection of orthologous genes that recognizes orthologous genes from different species in a phylogenetic tree for any number of taxa. The algorithm can distinguish complete (containing all taxa) and incomplete (not containing all taxa) families and recognizes in- and outparalogs. The BranchClust algorithm is implemented in Perl, using the BioPerl module for parsing trees, and is freely available at . CONCLUSION: BranchClust selects more sets of putatively orthologous genes than the reciprocal best BLAST hit method. In the test cases examined, the correctness of the selected families and of the identified in- and outparalogs was confirmed by inspection of the pertinent phylogenetic trees.
format Text
id pubmed-1853112
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1853112 2007-04-20 BranchClust: a phylogenetic algorithm for selecting gene families Poptsova, Maria S Gogarten, J Peter BMC Bioinformatics Methodology Article BioMed Central 2007-04-10 /pmc/articles/PMC1853112/ /pubmed/17425803 http://dx.doi.org/10.1186/1471-2105-8-120 Text en Copyright © 2007 Poptsova and Gogarten; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title BranchClust: a phylogenetic algorithm for selecting gene families
topic Methodology Article