Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers

BACKGROUND: Sequencing partial 16S rRNA genes is a cost-effective method for quantifying the microbial composition of an environment, such as the human gut. However, downstream analysis relies on binning reads into microbial groups by either considering each unique sequence as a different microbe, q...

Bibliographic Details
Main authors: Chrisman, Brianna S., Paskov, Kelley M., Stockham, Nate, Jung, Jae-Yoon, Varma, Maya, Washington, Peter Y., Tataru, Christine, Iwai, Shoko, DeSantis, Todd Z., David, Maude, Wall, Dennis P.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8527694/
https://www.ncbi.nlm.nih.gov/pubmed/34666677
http://dx.doi.org/10.1186/s12859-021-04427-7
author Chrisman, Brianna S.
Paskov, Kelley M.
Stockham, Nate
Jung, Jae-Yoon
Varma, Maya
Washington, Peter Y.
Tataru, Christine
Iwai, Shoko
DeSantis, Todd Z.
David, Maude
Wall, Dennis P.
collection PubMed
description BACKGROUND: Sequencing partial 16S rRNA genes is a cost-effective method for quantifying the microbial composition of an environment, such as the human gut. However, downstream analysis relies on binning reads into microbial groups by either considering each unique sequence as a different microbe, querying a database to get taxonomic labels from sequences, or clustering similar sequences together. These approaches do not fully capture evolutionary relationships between microbes, limiting the ability to identify differentially abundant groups of microbes between a diseased and control cohort. We present sequence-based biomarkers (SBBs), an aggregation method that groups and aggregates microbes using single variants and combinations of variants within their 16S sequences. We compare SBBs against other existing aggregation methods (OTU clustering and MicroPheno or DiTaxa features) in several benchmarking tasks: biomarker discovery via permutation test, biomarker discovery via linear discriminant analysis, and phenotype prediction power. We demonstrate that SBBs perform on par with or better than the state-of-the-art methods in biomarker discovery and phenotype prediction. RESULTS: On two independent datasets, SBBs identify differentially abundant groups of microbes with similar or higher statistical significance than existing methods, both in a permutation-test-based analysis and using linear discriminant analysis effect size. By grouping microbes by SBB, we can identify several differentially abundant microbial groups (FDR < 0.1) between children with autism and neurotypical controls in a set of 115 discordant siblings. Porphyromonadaceae, Ruminococcaceae, and an unnamed species of Blastocystis were significantly enriched in autism, while Veillonellaceae was significantly depleted. Likewise, aggregating microbes by SBB on a dataset of obese and lean twins, we find several significantly differentially abundant microbial groups (FDR < 0.1). We observed Megasphaera and Sutterellaceae highly enriched in obesity, and Phocaeicola significantly depleted. SBBs also perform on par with or better than existing aggregation methods as features in a phenotype prediction model, predicting the autism phenotype with an ROC-AUC score of 0.64 and the obesity phenotype with an ROC-AUC score of 0.84. CONCLUSIONS: SBBs provide a powerful method for aggregating microbes to perform differential abundance analysis as well as phenotype prediction. Our source code can be freely downloaded from http://github.com/briannachrisman/16s_biomarkers.
format Online
Article
Text
id pubmed-8527694
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8527694 2021-10-25 BMC Bioinformatics Methodology Article BioMed Central 2021-10-19 /pmc/articles/PMC8527694/ /pubmed/34666677 http://dx.doi.org/10.1186/s12859-021-04427-7 Text en © The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers
topic Methodology Article