
Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing

Bibliographic Details
Main authors: Schmidt, Philip J.; Cameron, Ellen S.; Müller, Kirsten M.; Emelko, Monica B.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Microbiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921663/
https://www.ncbi.nlm.nih.gov/pubmed/35300475
http://dx.doi.org/10.3389/fmicb.2022.728146
collection PubMed
description Diversity analysis of amplicon sequencing data has mainly been limited to plug-in estimates calculated using normalized data to obtain a single value of an alpha diversity metric or a single point on a beta diversity ordination plot for each sample. As recognized for count data generated using classical microbiological methods, amplicon sequence read counts obtained from a sample are random data linked to source properties (e.g., proportional composition) by a probabilistic process. Thus, diversity analysis has focused on diversity exhibited in (normalized) samples rather than probabilistic inference about source diversity. This study applies fundamentals of statistical analysis for quantitative microbiology (e.g., microscopy, plating, and most probable number methods) to sample collection and processing procedures of amplicon sequencing methods to facilitate inference reflecting the probabilistic nature of such data and evaluation of uncertainty in diversity metrics. Following description of types of random error, mechanisms such as clustering of microorganisms in the source, differential analytical recovery during sample processing, and amplification are found to invalidate a multinomial relative abundance model. The zeros often abounding in amplicon sequencing data and their implications are addressed, and Bayesian analysis is applied to estimate the source Shannon index given unnormalized data (both simulated and experimental). Inference about source diversity is found to require knowledge of the exact number of unique variants in the source, which is practically unknowable due to library size limitations and the inability to differentiate zeros corresponding to variants that are actually absent in the source from zeros corresponding to variants that were merely not detected. 
Given these problems with estimation of diversity in the source even when the basic multinomial model is valid, diversity analysis at the level of samples with normalized library sizes is discussed.
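
The contrast the abstract draws between a plug-in estimate and Bayesian inference about the source Shannon index can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the basic multinomial model holds, uses a symmetric Dirichlet prior chosen here only for convenience, and relies on NumPy. The function names and the `n_variants` and `prior_alpha` parameters are hypothetical. `n_variants` stands for the assumed number of unique variants in the source, the quantity the abstract describes as practically unknowable.

```python
import numpy as np


def shannon_index(p):
    """Shannon index H = -sum(p_i * ln p_i), treating 0 * ln(0) as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def plug_in_shannon(counts):
    """Plug-in estimate: substitute observed read proportions for source proportions."""
    counts = np.asarray(counts, dtype=float)
    return shannon_index(counts / counts.sum())


def posterior_shannon(counts, n_variants, prior_alpha=1.0, n_draws=5000, seed=0):
    """Posterior draws of the source Shannon index.

    Assumes a multinomial likelihood and a symmetric Dirichlet(prior_alpha)
    prior (an illustrative choice). Variants assumed present in the source
    but not detected are given zero counts.
    """
    counts = np.asarray(counts, dtype=float)
    if n_variants < counts.size:
        raise ValueError("n_variants must be at least the number of observed variants")
    # Pad with zeros for variants assumed present but undetected.
    padded = np.concatenate([counts, np.zeros(n_variants - counts.size)])
    rng = np.random.default_rng(seed)
    # The Dirichlet prior is conjugate to the multinomial likelihood.
    draws = rng.dirichlet(padded + prior_alpha, size=n_draws)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(draws > 0, draws * np.log(draws), 0.0)
    return -terms.sum(axis=1)


if __name__ == "__main__":
    reads = [50, 30, 15, 5]  # made-up read counts, for illustration only
    print("plug-in:", plug_in_shannon(reads))
    for k in (4, 10):
        h = posterior_shannon(reads, n_variants=k)
        print(k, "variants assumed:", np.percentile(h, [2.5, 50, 97.5]))
```

Running the sketch with different values of `n_variants` shifts the posterior even though the observed counts are unchanged, which illustrates why the assumed number of variants in the source matters for inference about source diversity.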
id pubmed-8921663
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8921663 2022-03-16 Front Microbiol Microbiology Frontiers Media S.A. 2022-03-01 /pmc/articles/PMC8921663/ /pubmed/35300475 http://dx.doi.org/10.3389/fmicb.2022.728146 Text en
Copyright © 2022 Schmidt, Cameron, Müller and Emelko. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Microbiology