Phylogenetic placement of metagenomic reads using the minimum evolution principle

BACKGROUND: A central problem of computational metagenomics is determining the correct placement into an existing phylogenetic tree of individual reads (nucleotide sequences of varying lengths, ranging from hundreds to thousands of bases) obtained using next-generation sequencing of DNA samples from...

Bibliographic Details
Main authors: Filipski, Alan, Tamura, Koichiro, Billing-Ross, Paul, Murillo, Oscar, Kumar, Sudhir
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4315155/
https://www.ncbi.nlm.nih.gov/pubmed/25923672
http://dx.doi.org/10.1186/1471-2164-16-S1-S13
_version_ 1782355436671860736
author Filipski, Alan
Tamura, Koichiro
Billing-Ross, Paul
Murillo, Oscar
Kumar, Sudhir
collection PubMed
description BACKGROUND: A central problem of computational metagenomics is determining the correct placement into an existing phylogenetic tree of individual reads (nucleotide sequences of varying lengths, ranging from hundreds to thousands of bases) obtained using next-generation sequencing of DNA samples from a mixture of known and unknown species. Correct placement allows us to easily identify or classify the sequences in the sample as to taxonomic position or function. RESULTS: Here we propose a novel method (PhyClass), based on the Minimum Evolution (ME) phylogenetic inference criterion, for determining the appropriate phylogenetic position of each read. Without using heuristics, the new approach efficiently finds the optimal placement of the unknown read in a reference phylogenetic tree given a sequence alignment for the taxa in the tree. In short, the total resulting branch length for the tree is computed for every possible placement of the unknown read and the placement that gives the smallest value for this total is the best (optimal) choice. By taking advantage of computational efficiencies and mathematical formulations, we are able to find the true optimal ME placement for each read in the phylogenetic tree. Using computer simulations, we assessed the accuracy of the new approach for different read lengths over a variety of data sets and phylogenetic trees. We found the accuracy of the new method to be good and comparable to existing Maximum Likelihood (ML) approaches. CONCLUSIONS: In particular, we found that the consensus assignments based on ME and ML approaches are more correct than either method individually. This is true even when the statistical support for read assignments was low, which is inevitable given that individual reads are often short and come from only one gene.
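The placement rule stated in the abstract (compute the total branch length for every possible placement of the read and keep the smallest) can be illustrated with a small brute-force sketch. This is not the PhyClass implementation: the abstract says the method relies on computational efficiencies that are not described in this record. The sketch assumes that pairwise distances between the read and the reference taxa are already available and that branch lengths are estimated by ordinary least squares. The names `place_read`, `tree_length`, `solve`, `path_edges`, and the node label `"attach"` are hypothetical.

```python
from collections import deque
from itertools import combinations


def path_edges(adj, start, goal):
    """Return the indices of the edges on the unique tree path start -> goal."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt, edge_idx in adj[node]:
            if nxt not in prev:
                prev[nxt] = (node, edge_idx)
                queue.append(nxt)
    used = set()
    node = goal
    while prev[node] is not None:
        node, edge_idx = prev[node]
        used.add(edge_idx)
    return used


def solve(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def tree_length(edges, leaves, dist):
    """Sum of ordinary-least-squares branch lengths for a fixed topology."""
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))
    rows, observed = [], []
    for a, b in combinations(leaves, 2):
        on_path = path_edges(adj, a, b)
        rows.append([1.0 if k in on_path else 0.0 for k in range(len(edges))])
        observed.append(dist[frozenset((a, b))])
    m = len(edges)
    # Normal equations (A^T A) x = A^T d for the path-incidence matrix A.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    atd = [sum(r[i] * d for r, d in zip(rows, observed)) for i in range(m)]
    return sum(solve(ata, atd))


def place_read(ref_edges, ref_leaves, query, dist):
    """Try the query on every reference edge; return (edge, tree length) of the minimum."""
    best = None
    for i, (u, v) in enumerate(ref_edges):
        # Split edge (u, v) with a new internal node and hang the query off it.
        trial = ref_edges[:i] + ref_edges[i + 1:] + [
            (u, "attach"), ("attach", v), ("attach", query)]
        length = tree_length(trial, ref_leaves + [query], dist)
        if best is None or length < best[1]:
            best = ((u, v), length)
    return best


# Toy reference tree ((A,B),(C,D)); integers label internal nodes.
ref_edges = [("A", 0), ("B", 0), (0, 1), ("C", 1), ("D", 1)]
ref_leaves = ["A", "B", "C", "D"]
# Made-up additive distances in which the read Q is the sister of A.
pairs = {("A", "Q"): 0.2, ("A", "B"): 0.4, ("Q", "B"): 0.4, ("A", "C"): 0.7,
         ("A", "D"): 0.7, ("Q", "C"): 0.7, ("Q", "D"): 0.7, ("B", "C"): 0.7,
         ("B", "D"): 0.7, ("C", "D"): 0.4}
dist = {frozenset(p): d for p, d in pairs.items()}
best_edge, best_length = place_read(ref_edges, ref_leaves, "Q", dist)
print(best_edge, round(best_length, 6))
```

In this toy example the distances are exactly additive on a tree in which Q joins the branch leading to A, so that edge gives the shortest tree. Re-solving the least-squares system for every edge is the naive approach and is shown only to make the criterion concrete.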
format Online
Article
Text
id pubmed-4315155
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-43151552015-02-09 Phylogenetic placement of metagenomic reads using the minimum evolution principle Filipski, Alan Tamura, Koichiro Billing-Ross, Paul Murillo, Oscar Kumar, Sudhir BMC Genomics Research BioMed Central 2015-01-15 /pmc/articles/PMC4315155/ /pubmed/25923672 http://dx.doi.org/10.1186/1471-2164-16-S1-S13 Text en Copyright © 2015 Filipski et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Phylogenetic placement of metagenomic reads using the minimum evolution principle
topic Research