DACTAL: divide-and-conquer trees (almost) without alignments
Motivation: While phylogenetic analyses of datasets containing 1000–5000 sequences are challenging for existing methods, the estimation of substantially larger phylogenies poses a problem of much greater complexity and scale. Methods: We present DACTAL, a method for phylogeny estimation that produce...
Main Authors: | Nelesen, Serita; Liu, Kevin; Wang, Li-San; Linder, C. Randal; Warnow, Tandy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371850/ https://www.ncbi.nlm.nih.gov/pubmed/22689772 http://dx.doi.org/10.1093/bioinformatics/bts218 |
_version_ | 1782235270232408064 |
---|---|
author | Nelesen, Serita Liu, Kevin Wang, Li-San Linder, C. Randal Warnow, Tandy |
author_facet | Nelesen, Serita Liu, Kevin Wang, Li-San Linder, C. Randal Warnow, Tandy |
author_sort | Nelesen, Serita |
collection | PubMed |
description | Motivation: While phylogenetic analyses of datasets containing 1000–5000 sequences are challenging for existing methods, the estimation of substantially larger phylogenies poses a problem of much greater complexity and scale. Methods: We present DACTAL, a method for phylogeny estimation that produces trees from unaligned sequence datasets without ever needing to estimate an alignment on the entire dataset. DACTAL combines iteration with a novel divide-and-conquer approach, so that each iteration begins with a tree produced in the prior iteration, decomposes the taxon set into overlapping subsets, estimates trees on each subset, and then combines the smaller trees into a tree on the full taxon set using a new supertree method. We prove that DACTAL is guaranteed to produce the true tree under certain conditions. We compare DACTAL to SATé and maximum likelihood trees on estimated alignments using simulated and real datasets with 1000–27 643 taxa. Results: Our studies show that on average DACTAL yields more accurate trees than the two-phase methods we studied on very large datasets that are difficult to align, and has approximately the same accuracy on the easier datasets. The comparison to SATé shows that both have the same accuracy, but that DACTAL achieves this accuracy in a fraction of the time. Furthermore, DACTAL can analyze larger datasets than SATé, including a dataset with almost 28 000 sequences. Availability: DACTAL source code and results of dataset analyses are available at www.cs.utexas.edu/users/phylo/software/dactal. Contact: tandy@cs.utexas.edu |
format | Online Article Text |
id | pubmed-3371850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-33718502012-06-11 DACTAL: divide-and-conquer trees (almost) without alignments Nelesen, Serita Liu, Kevin Wang, Li-San Linder, C. Randal Warnow, Tandy Bioinformatics Ismb 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, Ca, Usa Motivation: While phylogenetic analyses of datasets containing 1000–5000 sequences are challenging for existing methods, the estimation of substantially larger phylogenies poses a problem of much greater complexity and scale. Methods: We present DACTAL, a method for phylogeny estimation that produces trees from unaligned sequence datasets without ever needing to estimate an alignment on the entire dataset. DACTAL combines iteration with a novel divide-and-conquer approach, so that each iteration begins with a tree produced in the prior iteration, decomposes the taxon set into overlapping subsets, estimates trees on each subset, and then combines the smaller trees into a tree on the full taxon set using a new supertree method. We prove that DACTAL is guaranteed to produce the true tree under certain conditions. We compare DACTAL to SATé and maximum likelihood trees on estimated alignments using simulated and real datasets with 1000–27 643 taxa. Results: Our studies show that on average DACTAL yields more accurate trees than the two-phase methods we studied on very large datasets that are difficult to align, and has approximately the same accuracy on the easier datasets. The comparison to SATé shows that both have the same accuracy, but that DACTAL achieves this accuracy in a fraction of the time. Furthermore, DACTAL can analyze larger datasets than SATé, including a dataset with almost 28 000 sequences. Availability: DACTAL source code and results of dataset analyses are available at www.cs.utexas.edu/users/phylo/software/dactal. 
Contact: tandy@cs.utexas.edu Oxford University Press 2012-06-15 2012-06-09 /pmc/articles/PMC3371850/ /pubmed/22689772 http://dx.doi.org/10.1093/bioinformatics/bts218 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Ismb 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, Ca, Usa Nelesen, Serita Liu, Kevin Wang, Li-San Linder, C. Randal Warnow, Tandy DACTAL: divide-and-conquer trees (almost) without alignments |
title | DACTAL: divide-and-conquer trees (almost) without alignments |
title_full | DACTAL: divide-and-conquer trees (almost) without alignments |
title_fullStr | DACTAL: divide-and-conquer trees (almost) without alignments |
title_full_unstemmed | DACTAL: divide-and-conquer trees (almost) without alignments |
title_short | DACTAL: divide-and-conquer trees (almost) without alignments |
title_sort | dactal: divide-and-conquer trees (almost) without alignments |
topic | Ismb 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, Ca, Usa |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371850/ https://www.ncbi.nlm.nih.gov/pubmed/22689772 http://dx.doi.org/10.1093/bioinformatics/bts218 |
work_keys_str_mv | AT nelesenserita dactaldivideandconquertreesalmostwithoutalignments AT liukevin dactaldivideandconquertreesalmostwithoutalignments AT wanglisan dactaldivideandconquertreesalmostwithoutalignments AT lindercrandal dactaldivideandconquertreesalmostwithoutalignments AT warnowtandy dactaldivideandconquertreesalmostwithoutalignments |