A general species delimitation method with applications to phylogenetic placements
Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches either rely on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking...
Main authors: | Zhang, Jiajie; Kapli, Paschalia; Pavlidis, Pavlos; Stamatakis, Alexandros |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810850/ https://www.ncbi.nlm.nih.gov/pubmed/23990417 http://dx.doi.org/10.1093/bioinformatics/btt499 |
_version_ | 1782288861207986176 |
---|---|
author | Zhang, Jiajie Kapli, Paschalia Pavlidis, Pavlos Stamatakis, Alexandros |
author_facet | Zhang, Jiajie Kapli, Paschalia Pavlidis, Pavlos Stamatakis, Alexandros |
author_sort | Zhang, Jiajie |
collection | PubMed |
description | Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches rely either on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking methods scale well to large datasets, but the results are highly sensitive to the similarity threshold. Coalescent-based species delimitation approaches often rely on Bayesian statistics and Markov Chain Monte Carlo sampling, and can therefore only be applied to small datasets. Results: We introduce the Poisson tree processes (PTP) model to infer putative species boundaries on a given phylogenetic input tree. We also integrate PTP with our evolutionary placement algorithm (EPA-PTP) to count the number of species in phylogenetic placements. We compare our approaches with popular OTU-picking methods and the General Mixed Yule Coalescent (GMYC) model. For de novo species delimitation, the stand-alone PTP model generally outperforms GMYC as well as OTU-picking methods when evolutionary distances between species are small. PTP requires neither an ultrametric input tree nor a sequence similarity threshold as input. In the open reference species delimitation approach, EPA-PTP yields more accurate results than de novo species delimitation methods. Finally, EPA-PTP scales to large datasets because it relies on the parallel implementations of the EPA and RAxML, thereby enabling species delimitation in high-throughput sequencing data. Availability and implementation: The code is freely available at www.exelixis-lab.org/software.html. Contact: Alexandros.Stamatakis@h-its.org Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-3810850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3810850 2013-10-29 A general species delimitation method with applications to phylogenetic placements Zhang, Jiajie Kapli, Paschalia Pavlidis, Pavlos Stamatakis, Alexandros Bioinformatics Original Papers 
Oxford University Press 2013-11-15 2013-08-29 /pmc/articles/PMC3810850/ /pubmed/23990417 http://dx.doi.org/10.1093/bioinformatics/btt499 Text en © The Author 2013. Published by Oxford University Press. All rights reserved. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Zhang, Jiajie Kapli, Paschalia Pavlidis, Pavlos Stamatakis, Alexandros A general species delimitation method with applications to phylogenetic placements |
title | A general species delimitation method with applications to phylogenetic placements |
title_full | A general species delimitation method with applications to phylogenetic placements |
title_fullStr | A general species delimitation method with applications to phylogenetic placements |
title_full_unstemmed | A general species delimitation method with applications to phylogenetic placements |
title_short | A general species delimitation method with applications to phylogenetic placements |
title_sort | general species delimitation method with applications to phylogenetic placements |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810850/ https://www.ncbi.nlm.nih.gov/pubmed/23990417 http://dx.doi.org/10.1093/bioinformatics/btt499 |
work_keys_str_mv | AT zhangjiajie ageneralspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT kaplipaschalia ageneralspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT pavlidispavlos ageneralspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT stamatakisalexandros ageneralspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT zhangjiajie generalspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT kaplipaschalia generalspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT pavlidispavlos generalspeciesdelimitationmethodwithapplicationstophylogeneticplacements AT stamatakisalexandros generalspeciesdelimitationmethodwithapplicationstophylogeneticplacements |