scrapp: A tool to assess the diversity of microbial samples from phylogenetic placements

Microbial ecology research is currently driven by the continuously decreasing cost of DNA sequencing and the improving accuracy of data analysis methods. One such analysis method is phylogenetic placement, which establishes the phylogenetic identity of the anonymous environmental sequences in a samp...

Bibliographic Details
Main authors: Barbera, Pierre, Czech, Lucas, Lutteropp, Sarah, Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7756409/
https://www.ncbi.nlm.nih.gov/pubmed/32996237
http://dx.doi.org/10.1111/1755-0998.13255
author Barbera, Pierre
Czech, Lucas
Lutteropp, Sarah
Stamatakis, Alexandros
collection PubMed
description Microbial ecology research is currently driven by the continuously decreasing cost of DNA sequencing and the improving accuracy of data analysis methods. One such analysis method is phylogenetic placement, which establishes the phylogenetic identity of the anonymous environmental sequences in a sample by means of a given phylogenetic reference tree. However, assessing the diversity of a sample remains challenging, as traditional methods do not scale well with the increasing data volumes and/or do not leverage the phylogenetic placement information. Here, we present scrapp, a highly parallel and scalable tool that uses a molecular species delimitation algorithm to quantify the diversity distribution over the reference phylogeny for a given phylogenetic placement of the sample. scrapp employs a novel approach to cluster phylogenetic placements, called placement space clustering, to efficiently perform dimensionality reduction, so as to scale to large data volumes. Furthermore, it uses the phylogeny‐aware molecular species delimitation method mPTP to quantify diversity. We evaluated scrapp using both simulated and empirical data sets. We used the simulated data to verify our approach. Tests on an empirical data set show that scrapp‐derived metrics can classify samples by their diversity‐correlated features as well as or better than existing, commonly used approaches. scrapp is available at https://github.com/pbdas/scrapp.
format Online
Article
Text
id pubmed-7756409
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7756409 2020-12-28 scrapp: A tool to assess the diversity of microbial samples from phylogenetic placements. Barbera, Pierre; Czech, Lucas; Lutteropp, Sarah; Stamatakis, Alexandros. Mol Ecol Resour. RESOURCE ARTICLES. John Wiley and Sons Inc. 2020-10-09 2021-01 /pmc/articles/PMC7756409/ /pubmed/32996237 http://dx.doi.org/10.1111/1755-0998.13255 Text en © 2020 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title scrapp: A tool to assess the diversity of microbial samples from phylogenetic placements
topic RESOURCE ARTICLES
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7756409/
https://www.ncbi.nlm.nih.gov/pubmed/32996237
http://dx.doi.org/10.1111/1755-0998.13255