Shortest triplet clustering: reconstructing large phylogenies using representative sets

BACKGROUND: Understanding the evolutionary relationships among species based on their genetic information is one of the primary objectives in phylogenetic analysis. Reconstructing phylogenies for large data sets is still a challenging task in Bioinformatics. RESULTS: We propose a new distance-based...

Bibliographic Details
Main authors: Sy Vinh, Le; von Haeseler, Arndt
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1097715/
https://www.ncbi.nlm.nih.gov/pubmed/15819989
http://dx.doi.org/10.1186/1471-2105-6-92
author Sy Vinh, Le
von Haeseler, Arndt
collection PubMed
description BACKGROUND: Understanding the evolutionary relationships among species based on their genetic information is one of the primary objectives in phylogenetic analysis. Reconstructing phylogenies for large data sets is still a challenging task in bioinformatics. RESULTS: We propose a new distance-based clustering method, the shortest triplet clustering algorithm (STC), to reconstruct phylogenies. The main idea is the introduction of a natural definition of so-called k-representative sets. Based on k-representative sets, shortest triplets are reconstructed and serve as building blocks for the STC algorithm to agglomerate sequences for tree reconstruction in O(n^2) time for n sequences. Simulations show that STC gives better topological accuracy than the other tested methods that also build a first starting tree. STC appears to be a very good method for starting the tree reconstruction. However, all tested methods give similar results if balanced nearest neighbor interchange (BNNI) is applied as a post-processing step. BNNI leads to an improvement in all instances. The program is available at . CONCLUSION: The results demonstrate that the new approach efficiently reconstructs phylogenies for large data sets. We found that BNNI boosts the topological accuracy of all methods, including STC; therefore, one should use BNNI as a post-processing step to get better topological accuracy.
format Text
id pubmed-1097715
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1097715 2005-05-12 Shortest triplet clustering: reconstructing large phylogenies using representative sets Sy Vinh, Le von Haeseler, Arndt BMC Bioinformatics Research Article BioMed Central 2005-04-08 /pmc/articles/PMC1097715/ /pubmed/15819989 http://dx.doi.org/10.1186/1471-2105-6-92 Text en Copyright © 2005 Sy Vinh and von Haeseler; licensee BioMed Central Ltd.
title Shortest triplet clustering: reconstructing large phylogenies using representative sets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1097715/
https://www.ncbi.nlm.nih.gov/pubmed/15819989
http://dx.doi.org/10.1186/1471-2105-6-92