SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement
SUMMARY: Phylogenetic placement is the problem of placing ‘query’ sequences into an existing tree (called a ‘backbone tree’). One of the most accurate phylogenetic placement methods to date is the maximum likelihood-based method pplacer, using RAxML to estimate numeric parameters on the backbone tre...
Main Authors: | Chu, Gillian; Warnow, Tandy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2023
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933845/ https://www.ncbi.nlm.nih.gov/pubmed/36818728 http://dx.doi.org/10.1093/bioadv/vbad008 |
_version_ | 1784889758229463040 |
---|---|
author | Chu, Gillian Warnow, Tandy |
author_facet | Chu, Gillian Warnow, Tandy |
author_sort | Chu, Gillian |
collection | PubMed |
description | SUMMARY: Phylogenetic placement is the problem of placing ‘query’ sequences into an existing tree (called a ‘backbone tree’). One of the most accurate phylogenetic placement methods to date is the maximum likelihood-based method pplacer, using RAxML to estimate numeric parameters on the backbone tree and then adding the given query sequence to the edge that maximizes the probability that the resulting tree generates the query sequence. Unfortunately, this way of running pplacer fails to return valid outputs on many moderately large backbone trees and so is limited to backbone trees with at most ∼10 000 leaves. SCAMPP is a technique to enable pplacer to run on larger backbone trees, which operates by finding a small ‘placement subtree’ specific to each query sequence, within which the query sequence is placed using pplacer. That approach matched the scalability and accuracy of APPLES-2, the previous most scalable method. Here, we explore a different aspect of pplacer’s strategy: the technique used to estimate numeric parameters on the backbone tree. We confirm anecdotal evidence that using FastTree instead of RAxML to estimate numeric parameters on the backbone tree enables pplacer to scale to much larger backbone trees, almost (but not quite) matching the scalability of APPLES-2 and pplacer-SCAMPP. We then evaluate the combination of these two techniques, SCAMPP and the use of FastTree. We show that this combined approach, pplacer-SCAMPP-FastTree, has the same scalability as APPLES-2, improves on the scalability of pplacer-FastTree and achieves better accuracy than the comparably scalable methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/gillichu/PLUSplacer-taxtastic. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-9933845 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9933845 2023-02-17 SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement Chu, Gillian Warnow, Tandy Bioinform Adv Original Paper SUMMARY: Phylogenetic placement is the problem of placing ‘query’ sequences into an existing tree (called a ‘backbone tree’). One of the most accurate phylogenetic placement methods to date is the maximum likelihood-based method pplacer, using RAxML to estimate numeric parameters on the backbone tree and then adding the given query sequence to the edge that maximizes the probability that the resulting tree generates the query sequence. Unfortunately, this way of running pplacer fails to return valid outputs on many moderately large backbone trees and so is limited to backbone trees with at most ∼10 000 leaves. SCAMPP is a technique to enable pplacer to run on larger backbone trees, which operates by finding a small ‘placement subtree’ specific to each query sequence, within which the query sequence is placed using pplacer. That approach matched the scalability and accuracy of APPLES-2, the previous most scalable method. Here, we explore a different aspect of pplacer’s strategy: the technique used to estimate numeric parameters on the backbone tree. We confirm anecdotal evidence that using FastTree instead of RAxML to estimate numeric parameters on the backbone tree enables pplacer to scale to much larger backbone trees, almost (but not quite) matching the scalability of APPLES-2 and pplacer-SCAMPP. We then evaluate the combination of these two techniques, SCAMPP and the use of FastTree. We show that this combined approach, pplacer-SCAMPP-FastTree, has the same scalability as APPLES-2, improves on the scalability of pplacer-FastTree and achieves better accuracy than the comparably scalable methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/gillichu/PLUSplacer-taxtastic. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
Oxford University Press 2023-01-30 /pmc/articles/PMC9933845/ /pubmed/36818728 http://dx.doi.org/10.1093/bioadv/vbad008 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Chu, Gillian Warnow, Tandy SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title | SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title_full | SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title_fullStr | SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title_full_unstemmed | SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title_short | SCAMPP+FastTree: improving scalability for likelihood-based phylogenetic placement |
title_sort | scampp+fasttree: improving scalability for likelihood-based phylogenetic placement |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933845/ https://www.ncbi.nlm.nih.gov/pubmed/36818728 http://dx.doi.org/10.1093/bioadv/vbad008 |
work_keys_str_mv | AT chugillian scamppfasttreeimprovingscalabilityforlikelihoodbasedphylogeneticplacement AT warnowtandy scamppfasttreeimprovingscalabilityforlikelihoodbasedphylogeneticplacement |
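
The description says that SCAMPP finds a small 'placement subtree' specific to each query sequence, but this record does not say how that subtree is chosen. The following is a minimal Python sketch of one plausible selection rule, offered only as an illustration and not as the authors' implementation. It assumes (these are assumptions, not statements from the record) that the seed leaf is the backbone leaf with the smallest Hamming distance to the query, and that the subtree consists of the leaves closest to that seed by branch-length path distance. The function names, the toy tree, and the sequences are all hypothetical. Running pplacer on the selected subtree is outside the scope of the sketch.

```python
import heapq


def hamming(a, b):
    """Count mismatching positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def placement_subtree_leaves(adj, leaf_seqs, query, size):
    """Pick up to `size` leaves for a hypothetical placement subtree.

    adj       : dict mapping node -> list of (neighbor, branch_length) pairs
                describing an undirected tree
    leaf_seqs : dict mapping leaf name -> aligned sequence
    query     : aligned query sequence
    Returns (seed_leaf, leaves), with leaves ordered by path distance from the seed.
    """
    # Assumed step 1: the seed is the leaf most similar to the query.
    seed = min(sorted(leaf_seqs), key=lambda name: hamming(leaf_seqs[name], query))

    # Assumed step 2: Dijkstra's algorithm from the seed, collecting leaves
    # in order of increasing branch-length path distance.
    best = {seed: 0.0}
    heap = [(0.0, seed)]
    visited = set()
    chosen = []
    while heap and len(chosen) < size:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node in leaf_seqs:
            chosen.append(node)
        for nbr, length in adj[node]:
            nd = d + length
            if nbr not in visited and nd < best.get(nbr, float("inf")):
                best[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return seed, chosen


# Toy backbone tree (hypothetical): leaves A, B hang off internal node u,
# leaves C, D hang off internal node v, and u and v are joined by an edge.
adj = {
    "A": [("u", 0.1)],
    "B": [("u", 0.2)],
    "C": [("v", 0.1)],
    "D": [("v", 0.4)],
    "u": [("A", 0.1), ("B", 0.2), ("v", 0.5)],
    "v": [("C", 0.1), ("D", 0.4), ("u", 0.5)],
}
leaf_seqs = {"A": "ACGTAC", "B": "ACGTAA", "C": "TTGTCC", "D": "TTGACC"}

seed, leaves = placement_subtree_leaves(adj, leaf_seqs, "ACGTAA", size=3)
print(seed, leaves)
```

In this toy example the query is identical to leaf B, so B becomes the seed, and the three selected leaves are B, A and C. In a real workflow the backbone tree restricted to the selected leaves would then be handed to the placement method.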