Cargando…

RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation

Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequenc...

Descripción completa

Detalles Bibliográficos
Autores principales: Liu, Kevin, Linder, C. Randal, Warnow, Tandy
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221724/
https://www.ncbi.nlm.nih.gov/pubmed/22132132
http://dx.doi.org/10.1371/journal.pone.0027731
_version_ 1782217130602659840
author Liu, Kevin
Linder, C. Randal
Warnow, Tandy
author_facet Liu, Kevin
Linder, C. Randal
Warnow, Tandy
author_sort Liu, Kevin
collection PubMed
description Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequences. Faster methods for ML estimation, among them FastTree, have also been developed, but their relative performance to RAxML is not yet fully understood. In this study, we explore the performance with respect to ML score, running time, and topological accuracy, of FastTree and RAxML on thousands of alignments (based on both simulated and biological nucleotide datasets) with up to 27,634 sequences. We find that when RAxML and FastTree are constrained to the same running time, FastTree produces topologically much more accurate trees in almost all cases. We also find that when RAxML is allowed to run to completion, it provides an advantage over FastTree in terms of the ML score, but does not produce substantially more accurate tree topologies. Interestingly, the relative accuracy of trees computed using FastTree and RAxML depends in part on the accuracy of the sequence alignment and dataset size, so that FastTree can be more accurate than RAxML on large datasets with relatively inaccurate alignments. Finally, the running times of RAxML and FastTree are dramatically different, so that when run to completion, RAxML can take several orders of magnitude longer than FastTree to complete. Thus, our study shows that very large phylogenies can be estimated very quickly using FastTree, with little (and in some cases no) degradation in tree accuracy, as compared to RAxML.
format Online
Article
Text
id pubmed-3221724
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-32217242011-11-30 RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation Liu, Kevin Linder, C. Randal Warnow, Tandy PLoS One Research Article Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequences. Faster methods for ML estimation, among them FastTree, have also been developed, but their relative performance to RAxML is not yet fully understood. In this study, we explore the performance with respect to ML score, running time, and topological accuracy, of FastTree and RAxML on thousands of alignments (based on both simulated and biological nucleotide datasets) with up to 27,634 sequences. We find that when RAxML and FastTree are constrained to the same running time, FastTree produces topologically much more accurate trees in almost all cases. We also find that when RAxML is allowed to run to completion, it provides an advantage over FastTree in terms of the ML score, but does not produce substantially more accurate tree topologies. Interestingly, the relative accuracy of trees computed using FastTree and RAxML depends in part on the accuracy of the sequence alignment and dataset size, so that FastTree can be more accurate than RAxML on large datasets with relatively inaccurate alignments. Finally, the running times of RAxML and FastTree are dramatically different, so that when run to completion, RAxML can take several orders of magnitude longer than FastTree to complete. Thus, our study shows that very large phylogenies can be estimated very quickly using FastTree, with little (and in some cases no) degradation in tree accuracy, as compared to RAxML. Public Library of Science 2011-11-21 /pmc/articles/PMC3221724/ /pubmed/22132132 http://dx.doi.org/10.1371/journal.pone.0027731 Text en Liu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Liu, Kevin
Linder, C. Randal
Warnow, Tandy
RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title_full RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title_fullStr RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title_full_unstemmed RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title_short RAxML and FastTree: Comparing Two Methods for Large-Scale Maximum Likelihood Phylogeny Estimation
title_sort raxml and fasttree: comparing two methods for large-scale maximum likelihood phylogeny estimation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221724/
https://www.ncbi.nlm.nih.gov/pubmed/22132132
http://dx.doi.org/10.1371/journal.pone.0027731
work_keys_str_mv AT liukevin raxmlandfasttreecomparingtwomethodsforlargescalemaximumlikelihoodphylogenyestimation
AT lindercrandal raxmlandfasttreecomparingtwomethodsforlargescalemaximumlikelihoodphylogenyestimation
AT warnowtandy raxmlandfasttreecomparingtwomethodsforlargescalemaximumlikelihoodphylogenyestimation