Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing
Diversity analysis of amplicon sequencing data has mainly been limited to plug-in estimates calculated using normalized data to obtain a single value of an alpha diversity metric or a single point on a beta diversity ordination plot for each sample. As recognized for count data generated using class...
Main Authors: | Schmidt, Philip J., Cameron, Ellen S., Müller, Kirsten M., Emelko, Monica B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Microbiology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921663/ https://www.ncbi.nlm.nih.gov/pubmed/35300475 http://dx.doi.org/10.3389/fmicb.2022.728146 |
_version_ | 1784669367276929024 |
---|---|
author | Schmidt, Philip J. Cameron, Ellen S. Müller, Kirsten M. Emelko, Monica B. |
author_facet | Schmidt, Philip J. Cameron, Ellen S. Müller, Kirsten M. Emelko, Monica B. |
author_sort | Schmidt, Philip J. |
collection | PubMed |
description | Diversity analysis of amplicon sequencing data has mainly been limited to plug-in estimates calculated using normalized data to obtain a single value of an alpha diversity metric or a single point on a beta diversity ordination plot for each sample. As recognized for count data generated using classical microbiological methods, amplicon sequence read counts obtained from a sample are random data linked to source properties (e.g., proportional composition) by a probabilistic process. Thus, diversity analysis has focused on diversity exhibited in (normalized) samples rather than probabilistic inference about source diversity. This study applies fundamentals of statistical analysis for quantitative microbiology (e.g., microscopy, plating, and most probable number methods) to sample collection and processing procedures of amplicon sequencing methods to facilitate inference reflecting the probabilistic nature of such data and evaluation of uncertainty in diversity metrics. Following description of types of random error, mechanisms such as clustering of microorganisms in the source, differential analytical recovery during sample processing, and amplification are found to invalidate a multinomial relative abundance model. The zeros often abounding in amplicon sequencing data and their implications are addressed, and Bayesian analysis is applied to estimate the source Shannon index given unnormalized data (both simulated and experimental). Inference about source diversity is found to require knowledge of the exact number of unique variants in the source, which is practically unknowable due to library size limitations and the inability to differentiate zeros corresponding to variants that are actually absent in the source from zeros corresponding to variants that were merely not detected. Given these problems with estimation of diversity in the source even when the basic multinomial model is valid, diversity analysis at the level of samples with normalized library sizes is discussed. |
format | Online Article Text |
id | pubmed-8921663 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8921663 2022-03-16 Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing Schmidt, Philip J. Cameron, Ellen S. Müller, Kirsten M. Emelko, Monica B. Front Microbiol Microbiology Diversity analysis of amplicon sequencing data has mainly been limited to plug-in estimates calculated using normalized data to obtain a single value of an alpha diversity metric or a single point on a beta diversity ordination plot for each sample. As recognized for count data generated using classical microbiological methods, amplicon sequence read counts obtained from a sample are random data linked to source properties (e.g., proportional composition) by a probabilistic process. Thus, diversity analysis has focused on diversity exhibited in (normalized) samples rather than probabilistic inference about source diversity. This study applies fundamentals of statistical analysis for quantitative microbiology (e.g., microscopy, plating, and most probable number methods) to sample collection and processing procedures of amplicon sequencing methods to facilitate inference reflecting the probabilistic nature of such data and evaluation of uncertainty in diversity metrics. Following description of types of random error, mechanisms such as clustering of microorganisms in the source, differential analytical recovery during sample processing, and amplification are found to invalidate a multinomial relative abundance model. The zeros often abounding in amplicon sequencing data and their implications are addressed, and Bayesian analysis is applied to estimate the source Shannon index given unnormalized data (both simulated and experimental). Inference about source diversity is found to require knowledge of the exact number of unique variants in the source, which is practically unknowable due to library size limitations and the inability to differentiate zeros corresponding to variants that are actually absent in the source from zeros corresponding to variants that were merely not detected. Given these problems with estimation of diversity in the source even when the basic multinomial model is valid, diversity analysis at the level of samples with normalized library sizes is discussed. Frontiers Media S.A. 2022-03-01 /pmc/articles/PMC8921663/ /pubmed/35300475 http://dx.doi.org/10.3389/fmicb.2022.728146 Text en Copyright © 2022 Schmidt, Cameron, Müller and Emelko. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Schmidt, Philip J. Cameron, Ellen S. Müller, Kirsten M. Emelko, Monica B. Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title | Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title_full | Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title_fullStr | Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title_full_unstemmed | Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title_short | Ensuring That Fundamentals of Quantitative Microbiology Are Reflected in Microbial Diversity Analyses Based on Next-Generation Sequencing |
title_sort | ensuring that fundamentals of quantitative microbiology are reflected in microbial diversity analyses based on next-generation sequencing |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921663/ https://www.ncbi.nlm.nih.gov/pubmed/35300475 http://dx.doi.org/10.3389/fmicb.2022.728146 |
work_keys_str_mv | AT schmidtphilipj ensuringthatfundamentalsofquantitativemicrobiologyarereflectedinmicrobialdiversityanalysesbasedonnextgenerationsequencing AT cameronellens ensuringthatfundamentalsofquantitativemicrobiologyarereflectedinmicrobialdiversityanalysesbasedonnextgenerationsequencing AT mullerkirstenm ensuringthatfundamentalsofquantitativemicrobiologyarereflectedinmicrobialdiversityanalysesbasedonnextgenerationsequencing AT emelkomonicab ensuringthatfundamentalsofquantitativemicrobiologyarereflectedinmicrobialdiversityanalysesbasedonnextgenerationsequencing |
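
The description in this record contrasts plug-in estimates of the Shannon index with Bayesian inference about the source Shannon index under a multinomial model. The following is a minimal sketch of that contrast, not the authors' implementation. It assumes the basic multinomial model holds, a flat Dirichlet prior (the hypothetical parameter `prior_alpha`), and that the number of variants in the source equals the length of the count vector, which the description notes is practically unknowable. All function names and the example counts are illustrative.

```python
import math
import random


def shannon_index(proportions):
    """Shannon index H = -sum(p * ln p); zero proportions contribute nothing."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def plug_in_shannon(counts):
    """Plug-in estimate: treat observed sample proportions as source proportions."""
    total = sum(counts)
    return shannon_index([c / total for c in counts])


def posterior_shannon_samples(counts, prior_alpha=1.0, draws=2000, seed=0):
    """Draw source Shannon index values from a Dirichlet posterior.

    Assumption: multinomial counts with a symmetric Dirichlet(prior_alpha)
    prior, so the posterior is Dirichlet(counts + prior_alpha). A Dirichlet
    draw is obtained by normalizing independent gamma variates.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        gammas = [rng.gammavariate(c + prior_alpha, 1.0) for c in counts]
        total = sum(gammas)
        samples.append(shannon_index([g / total for g in gammas]))
    return samples


def credible_interval(samples, level=0.95):
    """Equal-tailed interval taken from the sorted posterior draws."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[int((1 - level) / 2 * last)]
    upper = ordered[int((1 + level) / 2 * last)]
    return lower, upper


if __name__ == "__main__":
    # Made-up read counts for five variants; the zero may be a true absence
    # or a non-detection, which this sketch cannot distinguish.
    counts = [50, 30, 15, 5, 0]
    print("plug-in H:", plug_in_shannon(counts))
    draws = posterior_shannon_samples(counts)
    print("95% credible interval:", credible_interval(draws))
```

The plug-in function returns a single value per sample, whereas the posterior draws give a distribution from which an interval can be read. Changing the assumed number of variants (for example, appending more zero counts) shifts the posterior, which illustrates why the description treats the unknown variant count as a limitation.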