Quantification of variation and the impact of biomass in targeted 16S rRNA gene sequencing studies

BACKGROUND: Recent advances in sequencing technologies and bioinformatics tools have allowed for large-scale microbiome studies that are rapidly advancing medical research. However, small changes in technique or analysis can significantly alter the results and lead to conflicting findings. Quantifyi...

Bibliographic Details
Main Authors: Bender, Jeffrey M., Li, Fan, Adisetiyo, Helty, Lee, David, Zabih, Sara, Hung, Long, Wilkinson, Thomas A., Pannaraj, Pia S., She, Rosemary C., Bard, Jennifer Dien, Tobin, Nicole H., Aldrovandi, Grace M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131952/
https://www.ncbi.nlm.nih.gov/pubmed/30201048
http://dx.doi.org/10.1186/s40168-018-0543-z
author Bender, Jeffrey M.
Li, Fan
Adisetiyo, Helty
Lee, David
Zabih, Sara
Hung, Long
Wilkinson, Thomas A.
Pannaraj, Pia S.
She, Rosemary C.
Bard, Jennifer Dien
Tobin, Nicole H.
Aldrovandi, Grace M.
author_sort Bender, Jeffrey M.
collection PubMed
description BACKGROUND: Recent advances in sequencing technologies and bioinformatics tools have allowed for large-scale microbiome studies that are rapidly advancing medical research. However, small changes in technique or analysis can significantly alter the results and lead to conflicting findings. Quantifying the technical versus biological variation expected in targeted 16S rRNA gene sequencing studies and how this variation changes with input biomass is critical to guide meaningful interpretation of the current literature and plan future research. RESULTS: Data were compiled from 469 sequencing libraries across 19 separate targeted 16S rRNA gene sequencing runs over a 2.5-year time period. Following removal of contaminant sequences identified from negative controls, 244 samples retained sufficient reads for further analysis. Coefficients of variation for intra- and inter-assay variation from repeated measurements of a bacterial mock community ranged from 8.7 to 37.6% (intra) and 15.6 to 80.5% (inter) for all but one genus of bacteria whose relative abundance was greater than 1%. Intra- versus inter-assay Bray-Curtis pairwise distances for a single stool sample were 0.11 versus 0.31, whereas intra-assay variation from repeat stool samples from the same donor was greater at 0.38 (Wilcoxon p = 0.001). A dilution series of the bacterial mock community was used to assess the effect of input biomass on variability. Pairwise distances increased with more dilute samples, and estimates of relative abundance became unreliable below approximately 100 copies of the 16S rRNA gene per microliter. Using this data, we created a prediction model to estimate the expected variation in microbiome measurements for given input biomass and relative abundance values. CONCLUSIONS: Well-controlled microbiome studies are sufficiently robust to capture small biological effects and can achieve levels of variability consistent with clinical assays. 
Relative abundance is negatively associated with measures of variability and has a stronger effect on variability than does absolute biomass, suggesting that it is feasible to detect differences in bacterial populations in very low-biomass samples. Further, by quantifying the effect of biomass and relative abundance on compositional variability, we developed a tool for defining the expected variance in a given microbiome study. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0543-z) contains supplementary material, which is available to authorized users.
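The two variability measures named in the abstract, the coefficient of variation (CV) and the Bray-Curtis pairwise distance, can be illustrated with a minimal sketch. The numbers below are hypothetical and are not taken from the study; the use of the sample standard deviation for the CV is an assumption, since the record does not say which estimator the authors used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return stdev(values) / mean(values) * 100.0


def bray_curtis(u, v):
    """Return the Bray-Curtis dissimilarity between two abundance vectors.

    Defined as sum(|u_i - v_i|) / sum(u_i + v_i); 0 means identical
    composition and 1 means no shared taxa.
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Hypothetical relative abundances of one genus across four replicate runs.
replicates = [0.10, 0.12, 0.11, 0.09]

# Hypothetical relative-abundance profiles (three genera) for two libraries.
sample_a = [0.5, 0.3, 0.2]
sample_b = [0.4, 0.4, 0.2]

cv = coefficient_of_variation(replicates)
bc = bray_curtis(sample_a, sample_b)

print(f"CV: {cv:.1f}%")
print(f"Bray-Curtis distance: {bc:.2f}")
```

Under these assumptions, the CV would be computed per genus across replicates of the mock community, and Bray-Curtis would be computed between each pair of libraries, within a run for intra-assay comparisons and between runs for inter-assay comparisons.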
format Online
Article
Text
id pubmed-6131952
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6131952 2018-09-13 Microbiome Methodology BioMed Central 2018-09-10 /pmc/articles/PMC6131952/ /pubmed/30201048 http://dx.doi.org/10.1186/s40168-018-0543-z Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Quantification of variation and the impact of biomass in targeted 16S rRNA gene sequencing studies
topic Methodology