Quantification of variation and the impact of biomass in targeted 16S rRNA gene sequencing studies
BACKGROUND: Recent advances in sequencing technologies and bioinformatics tools have allowed for large-scale microbiome studies that are rapidly advancing medical research. However, small changes in technique or analysis can significantly alter the results and lead to conflicting findings. Quantifyi...
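Main authors: | Bender, Jeffrey M.; Li, Fan; Adisetiyo, Helty; Lee, David; Zabih, Sara; Hung, Long; Wilkinson, Thomas A.; Pannaraj, Pia S.; She, Rosemary C.; Bard, Jennifer Dien; Tobin, Nicole H.; Aldrovandi, Grace M. |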
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131952/ https://www.ncbi.nlm.nih.gov/pubmed/30201048 http://dx.doi.org/10.1186/s40168-018-0543-z |
_version_ | 1783354229317959680 |
---|---|
author | Bender, Jeffrey M. Li, Fan Adisetiyo, Helty Lee, David Zabih, Sara Hung, Long Wilkinson, Thomas A. Pannaraj, Pia S. She, Rosemary C. Bard, Jennifer Dien Tobin, Nicole H. Aldrovandi, Grace M. |
author_sort | Bender, Jeffrey M. |
collection | PubMed |
description | BACKGROUND: Recent advances in sequencing technologies and bioinformatics tools have allowed for large-scale microbiome studies that are rapidly advancing medical research. However, small changes in technique or analysis can significantly alter the results and lead to conflicting findings. Quantifying the technical versus biological variation expected in targeted 16S rRNA gene sequencing studies and how this variation changes with input biomass is critical to guide meaningful interpretation of the current literature and plan future research. RESULTS: Data were compiled from 469 sequencing libraries across 19 separate targeted 16S rRNA gene sequencing runs over a 2.5-year time period. Following removal of contaminant sequences identified from negative controls, 244 samples retained sufficient reads for further analysis. Coefficients of variation for intra- and inter-assay variation from repeated measurements of a bacterial mock community ranged from 8.7 to 37.6% (intra) and 15.6 to 80.5% (inter) for all but one genus of bacteria whose relative abundance was greater than 1%. Intra- versus inter-assay Bray-Curtis pairwise distances for a single stool sample were 0.11 versus 0.31, whereas intra-assay variation from repeat stool samples from the same donor was greater at 0.38 (Wilcoxon p = 0.001). A dilution series of the bacterial mock community was used to assess the effect of input biomass on variability. Pairwise distances increased with more dilute samples, and estimates of relative abundance became unreliable below approximately 100 copies of the 16S rRNA gene per microliter. Using this data, we created a prediction model to estimate the expected variation in microbiome measurements for given input biomass and relative abundance values. CONCLUSIONS: Well-controlled microbiome studies are sufficiently robust to capture small biological effects and can achieve levels of variability consistent with clinical assays. Relative abundance is negatively associated with measures of variability and has a stronger effect on variability than does absolute biomass, suggesting that it is feasible to detect differences in bacterial populations in very low-biomass samples. Further, by quantifying the effect of biomass and relative abundance on compositional variability, we developed a tool for defining the expected variance in a given microbiome study. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0543-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6131952 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6131952 2018-09-13 Microbiome Methodology BioMed Central 2018-09-10 /pmc/articles/PMC6131952/ /pubmed/30201048 http://dx.doi.org/10.1186/s40168-018-0543-z Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Quantification of variation and the impact of biomass in targeted 16S rRNA gene sequencing studies |
topic | Methodology |
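
The abstract reports two kinds of variability measures: coefficients of variation (CV) for genus-level relative abundances across repeated measurements, and Bray-Curtis pairwise distances between samples. The sketch below illustrates how such quantities are commonly computed. It is not the authors' pipeline: the function names and the input values are hypothetical, and the use of the sample standard deviation for the CV is an assumption, since the record does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV as a percentage: sample standard deviation / mean * 100.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def bray_curtis(u, v):
    """Return the Bray-Curtis dissimilarity between two abundance vectors.

    Defined as sum(|u_i - v_i|) / sum(u_i + v_i); 0 means identical
    composition and 1 means no shared taxa (for non-negative inputs).
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("Bray-Curtis is undefined for two empty samples")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical relative abundances (%) of one genus in three replicate runs.
replicates = [10.0, 12.0, 14.0]
cv = coefficient_of_variation(replicates)

# Hypothetical genus-level compositions of two samples (fractions sum to 1).
sample_a = [0.5, 0.3, 0.2]
sample_b = [0.4, 0.4, 0.2]
bc = bray_curtis(sample_a, sample_b)

print(f"CV = {cv:.1f}%")
print(f"Bray-Curtis = {bc:.2f}")
```

Computing the CV over replicates within one sequencing run gives an intra-assay estimate, and computing it over replicates from different runs gives an inter-assay estimate; the same grouping applies to the pairwise Bray-Curtis distances.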