
Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities

The site-to-site variability in species composition, known as β-diversity, is crucial to understanding spatiotemporal patterns of species diversity and the mechanisms controlling community composition and structure. However, quantifying β-diversity in microbial ecology using sequencing-based technol...


Bibliographic Details
Main authors: Zhou, Jizhong; Jiang, Yi-Huei; Deng, Ye; Shi, Zhou; Zhou, Benjamin Yamin; Xue, Kai; Wu, Liyou; He, Zhili; Yang, Yunfeng
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3684833/
https://www.ncbi.nlm.nih.gov/pubmed/23760464
http://dx.doi.org/10.1128/mBio.00324-13
author Zhou, Jizhong
Jiang, Yi-Huei
Deng, Ye
Shi, Zhou
Zhou, Benjamin Yamin
Xue, Kai
Wu, Liyou
He, Zhili
Yang, Yunfeng
collection PubMed
description The site-to-site variability in species composition, known as β-diversity, is crucial to understanding spatiotemporal patterns of species diversity and the mechanisms controlling community composition and structure. However, quantifying β-diversity in microbial ecology using sequencing-based technologies is a great challenge because of a high number of sequencing errors, bias, and poor reproducibility and quantification. Herein, based on general sampling theory, a mathematical framework is first developed for simulating the effects of random sampling processes on quantifying β-diversity when the community size is known or unknown. Also, using an analogous ball example under Poisson sampling with limited sampling efforts, the developed mathematical framework can exactly predict the low reproducibility among technically replicate samples from the same community of a certain species abundance distribution, which provides explicit evidences of random sampling processes as the main factor causing high percentages of technical variations. In addition, the predicted values under Poisson random sampling were highly consistent with the observed low percentages of operational taxonomic unit (OTU) overlap (<30% and <20% for two and three tags, respectively, based on both Jaccard and Bray-Curtis dissimilarity indexes), further supporting the hypothesis that the poor reproducibility among technical replicates is due to the artifacts associated with random sampling processes. Finally, a mathematical framework was developed for predicting sampling efforts to achieve a desired overlap among replicate samples. Our modeling simulations predict that several orders of magnitude more sequencing efforts are needed to achieve desired high technical reproducibility. These results suggest that great caution needs to be taken in quantifying and interpreting β-diversity for microbial community analysis using next-generation sequencing technologies.
format Online
Article
Text
id pubmed-3684833
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-36848332013-07-09 Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities Zhou, Jizhong Jiang, Yi-Huei Deng, Ye Shi, Zhou Zhou, Benjamin Yamin Xue, Kai Wu, Liyou He, Zhili Yang, Yunfeng mBio Research Article The site-to-site variability in species composition, known as β-diversity, is crucial to understanding spatiotemporal patterns of species diversity and the mechanisms controlling community composition and structure. However, quantifying β-diversity in microbial ecology using sequencing-based technologies is a great challenge because of a high number of sequencing errors, bias, and poor reproducibility and quantification. Herein, based on general sampling theory, a mathematical framework is first developed for simulating the effects of random sampling processes on quantifying β-diversity when the community size is known or unknown. Also, using an analogous ball example under Poisson sampling with limited sampling efforts, the developed mathematical framework can exactly predict the low reproducibility among technically replicate samples from the same community of a certain species abundance distribution, which provides explicit evidences of random sampling processes as the main factor causing high percentages of technical variations. In addition, the predicted values under Poisson random sampling were highly consistent with the observed low percentages of operational taxonomic unit (OTU) overlap (<30% and <20% for two and three tags, respectively, based on both Jaccard and Bray-Curtis dissimilarity indexes), further supporting the hypothesis that the poor reproducibility among technical replicates is due to the artifacts associated with random sampling processes. Finally, a mathematical framework was developed for predicting sampling efforts to achieve a desired overlap among replicate samples. 
Our modeling simulations predict that several orders of magnitude more sequencing efforts are needed to achieve desired high technical reproducibility. These results suggest that great caution needs to be taken in quantifying and interpreting β-diversity for microbial community analysis using next-generation sequencing technologies. American Society of Microbiology 2013-06-11 /pmc/articles/PMC3684833/ /pubmed/23760464 http://dx.doi.org/10.1128/mBio.00324-13 Text en Copyright © 2013 Zhou et al. http://creativecommons.org/licenses/by-nc-sa/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/) , which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3684833/
https://www.ncbi.nlm.nih.gov/pubmed/23760464
http://dx.doi.org/10.1128/mBio.00324-13