A general species delimitation method with applications to phylogenetic placements

Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches either rely on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking...

Bibliographic Details
Main authors: Zhang, Jiajie; Kapli, Paschalia; Pavlidis, Pavlos; Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810850/
https://www.ncbi.nlm.nih.gov/pubmed/23990417
http://dx.doi.org/10.1093/bioinformatics/btt499
author Zhang, Jiajie
Kapli, Paschalia
Pavlidis, Pavlos
Stamatakis, Alexandros
collection PubMed
description Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches either rely on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking methods scale well to large datasets, but the results are highly sensitive to the similarity threshold. Coalescent-based species delimitation approaches often rely on Bayesian statistics and Markov Chain Monte Carlo sampling, and can therefore only be applied to small datasets. Results: We introduce the Poisson tree processes (PTP) model to infer putative species boundaries on a given phylogenetic input tree. We also integrate PTP with our evolutionary placement algorithm (EPA-PTP) to count the number of species in phylogenetic placements. We compare our approaches with popular OTU-picking methods and the General Mixed Yule Coalescent (GMYC) model. For de novo species delimitation, the stand-alone PTP model generally outperforms GMYC as well as OTU-picking methods when evolutionary distances between species are small. PTP requires neither an ultrametric input tree nor a sequence similarity threshold as input. In the open reference species delimitation approach, EPA-PTP yields more accurate results than de novo species delimitation methods. Finally, EPA-PTP scales to large datasets because it relies on the parallel implementations of the EPA and RAxML, thereby allowing species to be delimited in high-throughput sequencing data. Availability and implementation: The code is freely available at www.exelixis-lab.org/software.html. Contact: Alexandros.Stamatakis@h-its.org Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3810850
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3810850 2013-10-29 A general species delimitation method with applications to phylogenetic placements Zhang, Jiajie Kapli, Paschalia Pavlidis, Pavlos Stamatakis, Alexandros Bioinformatics Original Papers
Oxford University Press 2013-11-15 2013-08-29 /pmc/articles/PMC3810850/ /pubmed/23990417 http://dx.doi.org/10.1093/bioinformatics/btt499 Text en © The Author 2013. Published by Oxford University Press. All rights reserved. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A general species delimitation method with applications to phylogenetic placements
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810850/
https://www.ncbi.nlm.nih.gov/pubmed/23990417
http://dx.doi.org/10.1093/bioinformatics/btt499