Richness estimation in microbiome data obtained from denoising pipelines


Bibliographic Details
Main authors: Kleine Bardenhorst, Sven, Vital, Marius, Karch, André, Rübsamen, Nicole
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762370/
https://www.ncbi.nlm.nih.gov/pubmed/35070172
http://dx.doi.org/10.1016/j.csbj.2021.12.036
Description
Summary: The quantification of richness within a sample, either measured as the number of observed species or approximated by estimation, is a common first step in microbiome studies and is known to be highly dependent on sequencing depth, which itself is highly variable between samples. Rarefaction curves serve as a tool to investigate this dependency, and it is often argued that after rarefying data (sub-sampling to an equal sequencing depth) richness estimates no longer depend on sequencing depth. However, the estimation of richness from data obtained by high-throughput sequencing methods and processed by current bioinformatics pipelines may be subject to various sources of variation related to sequencing depth, which may confound inference based on richness estimates and cannot be solved by sub-sampling. We investigated how pipeline settings in DADA2 and deblur affect estimates of richness and showed that the use of rarefaction and sub-sampling is inappropriate when default pipeline settings are applied. Furthermore, we showed how independent sample-wise processing established spurious correlations between sequencing depth and richness estimates in data produced by DADA2 and how this problem can be solved by a pooled processing approach.
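
To make the terms in the summary concrete, the following is a minimal Python sketch of rarefying (sub-sampling reads without replacement to an equal depth) and of observed richness (the number of features with at least one read). It is an illustration only, not the authors' code: the function names `rarefy` and `observed_richness` and the two count vectors are hypothetical, and the sketch does not model any DADA2 or deblur processing, which is where the summary locates the depth-related effects that sub-sampling cannot remove.

```python
import numpy as np


def observed_richness(counts):
    """Return the number of features (e.g. sequence variants) with >= 1 read."""
    return int(np.count_nonzero(np.asarray(counts)))


def rarefy(counts, depth, rng):
    """Sub-sample `depth` reads without replacement from a vector of counts."""
    counts = np.asarray(counts, dtype=np.int64)
    if depth > counts.sum():
        raise ValueError("depth exceeds the number of reads in the sample")
    # A multivariate hypergeometric draw is equivalent to picking `depth`
    # individual reads without replacement and tallying them per feature.
    return rng.multivariate_hypergeometric(counts, depth)


# Hypothetical toy samples: one deeply and one shallowly sequenced.
rng = np.random.default_rng(0)
deep = np.array([500, 300, 120, 50, 20, 5, 3, 1, 1])
shallow = np.array([60, 30, 8, 2, 0, 0, 0, 0, 0])

# Rarefy both samples to the depth of the shallowest one.
depth = int(min(deep.sum(), shallow.sum()))
deep_rarefied = rarefy(deep, depth, rng)
shallow_rarefied = rarefy(shallow, depth, rng)

print("raw richness:     ", observed_richness(deep), observed_richness(shallow))
print("rarefied richness:", observed_richness(deep_rarefied),
      observed_richness(shallow_rarefied))
```

Sub-sampling equalizes the number of reads per sample, but it operates on counts that the denoising pipeline has already produced, so any depth-dependent behavior of the pipeline itself is untouched by this step.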